NdisImPlatformBindingOptions and LWFs

(see next post)

Hey mods - no need to add that post I created; I resolved the issue after 2 days. [MODS: OK!]

It turned out to be an issue with my setup (WinDbg KD through VMware NAT). VMware NAT seems to repackage all TCP packets, which is why I wasn't seeing any in my filter.

The second issue, losing the DHCP-assigned IP when installing the ndislwf sample, was resolved by making it an optional filter:
HKR, Ndi,FilterRunType,0x00010001, 2

However, even though ipconfig showed I had a valid 192.168.235.x IP, I couldn't connect to the host. Issuing the following got things at least partially working:
HKR, Parameters, NdisImPlatformBindingOptions,0x00010001,3 ; This allows the TCP connection to work

I couldn't find much documentation on this (except the related https://docs.microsoft.com/en-us/windows-hardware/drivers/network/configuring-an-inf-file-for-a-modifying-filter-driver), so it was mostly guesswork. However, I still have one issue (Attach, Detach, Restart, and Pause are fine): if I add the extra handlers back in, i.e. ReceiveNetBufferListsHandler, I don't see the TCP packets, but the moment I NULL out the handler, the TCP packets flow again as seen in Wireshark. Hmmm? Does adding the handler affect the binding somehow?

I suggest moving this technical discussion (including my reply) to NTDEV [MODS: Done.]

NdisImPlatformBindingOptions controls how your filter interacts with some (not all!) IM drivers. Currently, there are 3 IM drivers that honor this setting: Server's classic LBFO teaming, the built-in network bridge, and an exotic feature that you'll probably never encounter. Notably, the built-in virtual switch does not currently honor this setting.

I put some of the specifics on what it does here: https://github.com/Microsoft/Windows-driver-samples/blob/master/network/ndis/filter/netlwf.inf

At a high level, though, here’s a cheat sheet:

  • NdisImPlatformBindingOptions=0: You want to bind closer to TCPIP. Example: a filter that wants to fiddle with IP addresses.

  • NdisImPlatformBindingOptions=3: You want to bind closer to the hardware. Example: a filter that injects (or blocks) specific hardware offloads.

  • NdisImPlatformBindingOptions=2: You want to bind everywhere. Example: a packet capture filter.
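For example, following the same INF syntax you used above, a packet capture filter would set:

HKR, Parameters, NdisImPlatformBindingOptions,0x00010001,2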

Regarding your 2nd issue. Your decision to put the filter into/out-of the datapath doesn’t affect bindings. In fact, registering datapath handlers is completely orthogonal to everything else that goes on with the NDIS stack. If inserting your filter into the receive path causes you to not see TCP packets, it’s likely that your receive handler has a bug that is breaking TCP and/or NDIS’s loopback.

Note that Wireshark uses a clumsy and outdated mechanism to monitor Tx traffic (layer2 loopback), so it’s not a great indicator of what’s actually happening to your filter. Admittedly, Wireshark has the best GUI & analytics, but it has the worst capture technology on Windows. Netmon, Message Analyzer, and “netsh trace start capture=yes” all collect packet captures using a filter driver, which means they have a much more accurate view of what specifically happens in the network stack. In particular, “netsh trace start” supports a multilayer capture, which can show you how packets get transformed by each of the other LWFs in the stack. You can use this to see if your LWF is allowing traffic to flow through it, and if so, whether it’s damaging the packets in any way.

A few common gotchas:

  • LWFs in the receive path must honor the NDIS_RECEIVE_FLAGS_RESOURCES flag; if you aren’t special-casing that flag, you have a bug (see the sketch after this list).
  • LWFs in the receive path that do anything nontrivial with the packets most likely will want to ignore NBLs with the NDIS_NBL_FLAGS_IS_LOOPBACK_PACKET flag set. (It’s basically only used when Wireshark is enabled, as NDIS kicks itself into a low-performance compatibility mode to get Wireshark’s outdated driver to limp along. So bugs with this flag are like the opposite of heisenbugs – they only repro when you’re looking.)
  • If you’re injecting novel packets of your own, you have to pull them back out of the completion path.
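Here's a minimal sketch of a receive handler that observes the first two gotchas. (Illustrative, not authoritative: PMS_FILTER is the context type from the ndislwf sample, and InspectNbl is a placeholder for your own read-only inspection logic.)

VOID
FilterReceiveNetBufferLists(
    NDIS_HANDLE FilterModuleContext,
    PNET_BUFFER_LIST NetBufferLists,
    NDIS_PORT_NUMBER PortNumber,
    ULONG NumberOfNetBufferLists,
    ULONG ReceiveFlags)
{
    PMS_FILTER filter = (PMS_FILTER)FilterModuleContext;
    PNET_BUFFER_LIST nbl;

    for (nbl = NetBufferLists; nbl != NULL; nbl = NET_BUFFER_LIST_NEXT_NBL(nbl)) {
        // Gotcha #2: loopback NBLs are NDIS talking to itself; don't
        // inspect them, but do leave them in the chain we indicate up.
        if (NdisTestNblFlag(nbl, NDIS_NBL_FLAGS_IS_LOOPBACK_PACKET)) {
            continue;
        }

        // Gotcha #1: if NDIS_RECEIVE_FLAGS_RESOURCES is set, these NBLs
        // are on loan from the miniport; inspect (or copy) synchronously
        // and never hold on to them past this call. Read-only inspection
        // like this placeholder is safe either way.
        InspectNbl(filter, nbl);
    }

    // Pass the whole chain up unmodified, preserving the flags.
    NdisFIndicateReceiveNetBufferLists(filter->FilterHandle,
                                       NetBufferLists,
                                       PortNumber,
                                       NumberOfNetBufferLists,
                                       ReceiveFlags);
}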

@“Jeffrey_Tippet_[MSFT]” Hey, thanks for that. Yeah, you’re right: after I resolved the VMware NAT issues described below, I could use any NdisImPlatformBindingOptions value and they all worked. Wireshark even showed exactly the same number of TCP packets (both 0-length and non-null packets) during a URL visit as my ndislwf did.

So, some notes related to the original post. First, all the issues I previously had with missing TCP packets were down to VMware NAT. Go figure, eh.

  • VMware NAT seems to repackage all TCP packets as UDP, which is why I could see no TCP traffic in Wireshark when browsing sites.
  • Setting my NDIS LWF to “Optional” instead of “Mandatory” did not cause the protocol stacks to unbind, so I didn’t lose my DHCP-assigned IPs, which were also causing issues with VMware NAT’s DHCP and WinDbg’s hijacked kernel-debugger Ethernet interface.
    * I didn’t see any TCP traffic until I registered myself under the “Diagnostic” FilterClass; using the default “Compression” class did not let me see TCP packets. VMware NAT caused this, doh.

So some Q’s:

  1. Is FilterDriverCharacteristics.SendNetBufferListsHandler for Tx/outbound operations (going from host to network), and FilterDriverCharacteristics.ReceiveNetBufferListsHandler for Rx/inbound operations (coming in from network to host)? So if I wanted to monitor only outbound traffic I would register only SendNetBufferListsHandler, and if I wanted to see responses they would show up in ReceiveNetBufferListsHandler?
  2. When I want to modify traffic for outbound operations, I would additionally need the outbound SendNetBufferListsCompleteHandler. Out of curiosity, what about modifying inbound traffic?
  3. Since a NetBuffer is essentially a packet/frame as seen in Wireshark, what does a NetBufferList represent? Is it just a Windows thing, relating to chaining NetBuffer MDLs for performance?

1. Yes, SendNetBufferListsHandler is Tx/outbound/host-to-network. ReceiveNetBufferListsHandler is Rx/inbound/network-to-host. (The semantics are rather different for filter drivers that serve as vSwitch extensions, but you’re not writing one of those, so don’t worry about it.)

And, also yes, if you want to monitor only outbound traffic, you only need SendNetBufferListsHandler + SendNetBufferListsCompleteHandler. You can optionally implement CancelSendNetBufferListsHandler too, but honestly nobody really cares about cancellation, so it’s unlikely to be worth it.

Likewise, if you only care about inbound traffic, you only need ReceiveNetBufferListsHandler + ReturnNetBufferListsHandler. (There’s no analogous cancellation for the Rx path.)

These two paths are completely orthogonal, in that you can choose to implement one or the other, or neither or both. With some fancy NdisFRestartFilter footwork, you can even change which paths you filter on-the-fly. But again that’s unlikely to be worth the trouble; most filters just unconditionally plug themselves into whichever of {Tx,Rx} they might ever need to look at.

2. SendNetBufferListsHandler must be paired with SendNetBufferListsCompleteHandler. ReceiveNetBufferListsHandler must be paired with ReturnNetBufferListsHandler.
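For concreteness, here's roughly what a Tx-only registration could look like in DriverEntry. (A sketch in the style of the filter sample; FilterSendNetBufferLists and FilterSendNetBufferListsComplete are your own routines, and the mandatory Attach/Detach/Restart/Pause handlers are elided.)

NDIS_FILTER_DRIVER_CHARACTERISTICS fc;

NdisZeroMemory(&fc, sizeof(fc));
fc.Header.Type = NDIS_OBJECT_TYPE_FILTER_DRIVER_CHARACTERISTICS;
fc.Header.Revision = NDIS_FILTER_CHARACTERISTICS_REVISION_2;
fc.Header.Size = NDIS_SIZEOF_FILTER_DRIVER_CHARACTERISTICS_REVISION_2;

// ... mandatory handlers (AttachHandler, DetachHandler, RestartHandler,
// PauseHandler, ...) go here ...

// Tx path: these two always come as a pair.
fc.SendNetBufferListsHandler = FilterSendNetBufferLists;
fc.SendNetBufferListsCompleteHandler = FilterSendNetBufferListsComplete;

// Optional, and (as noted above) rarely worth implementing.
fc.CancelSendNetBufferListsHandler = NULL;

// Rx path left NULL: NDIS keeps this filter out of the receive path.
fc.ReceiveNetBufferListsHandler = NULL;
fc.ReturnNetBufferListsHandler = NULL;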

3. Here are the semantics:

An MDL is a single buffer of memory that is contiguous in virtual address space. (Possibly fragmented in physical address space, but a LWF doesn’t care.)

A NET_BUFFER (aka NB) is a single network frame/packet. There’s a big asterisk: the LSO and RSC offloads will combine multiple packets-on-the-wire into a single “packet” in memory, simply for efficiency. You might need to know about this if you’re literally counting packets on the wire, or monitoring what happens on the other side of the network. But for most local processing, you can just pretend that one NB == one packet.

A NET_BUFFER_LIST (aka NBL) groups multiple “related” packets together, again for efficiency. The definition of “related” is a bit complex. You can think of it as “the same socket/stream”, although that’s an imprecise definition, since NDIS doesn’t have a concept of “socket”. For TCP/UDP traffic, it means the packets all have the same VLANs, source/destination MAC addresses, IP addresses, and TCP/UDP port numbers. The packets must all have the same metadata, like QOS priority.

The NIC cannot be reasonably expected to know whether two packets are “related”, so NDIS doesn’t allow multiple NBs per NBL on the receive path. In other words, your ReceiveNetBufferListsHandler will always see 1 NB per NBL.

On the transmit path, you will commonly see multiple NBs in a single NBL. E.g., if an application posts a 50,000-byte buffer to a TCP socket, the layer3 MTU is 1000 bytes, the TCP window sizes permit it, and LSO isn’t enabled, TCP might pack 50 NBs into a single NBL.

In practice, you can mostly ignore the difference between NBLs and NBs. Just write code to flatten the list like this:

PNET_BUFFER_LIST nbl;
PNET_BUFFER nb;

// Walk every NB in every NBL in the chain.
for (nbl = nblChain; nbl; nbl = nbl->Next) {
    for (nb = nbl->FirstNetBuffer; nb; nb = nb->Next) {
        ProcessOnePacket(nb);
    }
}

You’ll only need to pay attention to NBLs when you have to manage their completion lifetime. You can’t complete a single NB; you need to wait until you’re done with all the NBs in an NBL, then complete the entire NBL.
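A sketch of what that looks like on the Tx completion side (FinishOnePacket is a placeholder for whatever per-NB bookkeeping you do, and the NBL passed in is assumed to be unlinked from any chain):

VOID
FinishAndCompleteNbl(
    NDIS_HANDLE FilterHandle,
    PNET_BUFFER_LIST Nbl,        // a single, unlinked NBL
    ULONG SendCompleteFlags)
{
    PNET_BUFFER nb;

    // Finish per-NB work first...
    for (nb = NET_BUFFER_LIST_FIRST_NB(Nbl); nb != NULL; nb = NET_BUFFER_NEXT_NB(nb)) {
        FinishOnePacket(nb);
    }

    // ...and only then complete the NBL, as a single unit. There is no
    // per-NB completion API.
    NdisFSendNetBufferListsComplete(FilterHandle, Nbl, SendCompleteFlags);
}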

Optionally, you may be able to improve performance by using the NBL’s “related” property. For example, if you’re only interested in packets sent to a particular IP address, you can short-circuit processing the entire NB chain, by assuming that the first NB is representative of all NBs in the NBL:

PNET_BUFFER_LIST nbl;
PNET_BUFFER nb;

for (nbl = nblChain; nbl; nbl = nbl->Next) {
    // The first NB stands in for every NB in this NBL.
    if (!IsPacketDestinedToTargetIPAddress(nbl->FirstNetBuffer)) {
        continue;
    }
    for (nb = nbl->FirstNetBuffer; nb; nb = nb->Next) {
        ProcessOnePacket(nb);
    }
}

Finally, I’ll note that NDIS always tries to give you batches (i.e., linked lists) of NBLs. There is no relationship between the NBLs in a batch; you are free to split, splice, or combine lists of NBLs in any way. You’re technically allowed to reorder the list, although reordering packets on the Tx/Rx paths can degrade TCP performance. (There’s no performance penalty for reordering NBLs on the Tx/Rx completion paths.) You cannot assume any two NBLs in the same batch have any semantic relationship; two NBLs that are linked together might have completely different “sockets”, metadata, offloads, etc.
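For example, here's a sketch of splitting one incoming batch into two independent chains (IsInteresting is a placeholder predicate; each NBL is unlinked, then appended to one of two tail-tracked lists):

PNET_BUFFER_LIST nbl, next;
PNET_BUFFER_LIST keepHead = NULL, *keepTail = &keepHead;
PNET_BUFFER_LIST passHead = NULL, *passTail = &passHead;

for (nbl = nblChain; nbl != NULL; nbl = next) {
    next = NET_BUFFER_LIST_NEXT_NBL(nbl);
    NET_BUFFER_LIST_NEXT_NBL(nbl) = NULL;   // unlink from the batch

    if (IsInteresting(nbl)) {               // placeholder predicate
        *keepTail = nbl;
        keepTail = &NET_BUFFER_LIST_NEXT_NBL(nbl);
    } else {
        *passTail = nbl;
        passTail = &NET_BUFFER_LIST_NEXT_NBL(nbl);
    }
}
// keepHead and passHead are now two valid, independent NBL chains that can
// be indicated, queued, or completed separately.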
