WFP Callout Driver Layer2 Filtering

Hello,
I am starting a discussion on a Windows Filter Platform callout driver I have been working on that has two registered filters (INBOUND & OUBOUND MAC NATIVE). Each filter registered has an assigned thread that, in turn, has an assigned LIST ENTRY. The inbound filter does a quick check for specific packets types, and valid packets are added to a linked list (LIST ENTRY). The threading portion of this I already verified is working, no memory leaks or the like, and it has been verified working when using cloned packets with filters at the IPV4 DISCARD filter layer. However, I am encapsulating packets and needed the ability to be able to create NBL chains in order to improve performance when dealing with large file transfers and the like (i.e. typically for every 1 packet during an SMB file transfer one needs to generate at least 2 packets per 1 original packet because of MTU issues). Because of this and additional processing that occurs, it made sense to just do a full “deep copy” for all inbound and valid layer 2 (native) packets and then just do the typical packet absorbtion for the original packets. The driver can be compiled to use cloning or deep copy, and the code base works fine with cloned packets.

The problem I am running into is that, when compiled for deep copy, the copied packets have well-formed Ethernet and IPv4 headers (I have already done a memory watch and inspected the constructed NBL, the single NB, and the MDL assigned to the net buffer, and all looks good). However, when I inject them using FwpsInjectMacSendAsync0 (typically to the outbound filter path and on a separate NIC index), the call returns STATUS_SUCCESS and the completion callback reports STATUS_SUCCESS, but no packets are detected on the expected NIC (or any other NIC on the kernel debug machine), and it ~almost appears~ that after the TTL period expires for the first few injected packets there is a crash in WFP's lower LWF NDIS driver (it is breaking upon freeing an MDL).

The interesting part is that, as a sanity check, I added an additional OID request code to send pre-constructed packets invoked by a user-mode application. While the driver is started and running in the same "proxy" mode as when it crashes, I can send hundreds of packets using the same methods used to construct new packet copies for inbound traffic. Basically, I have verified that the construction of a new packet (self-crafted or copied) and the deletion of the allocated memory (context structure, NBL, NB, and MDL) all work fine (or appear to) when self-crafting and sending packets from the user-mode app.

Memory alignment of the copied packets (compiled for 8-byte alignment), the MAC and IP addresses, and proper network byte order of fields like the IPv4 TotalLength have already been verified as valid, so I have narrowed it down to one of two possibilities (that I can think of) as the problem (obviously something wrong on my side). Before posting a bunch of code and crash dumps, I was hoping someone could verify and/or provide additional insight in the following areas:

1.) When sending packets from a kernel-mode thread (constructed during driver start/initialization), other than typical thread-safe practices, is there anything "special" one must do in order to inject a packet? (i.e. can you pass an injection handle created during initialization to the thread in question, or does the handle need to be created by the thread itself?)

2.) The driver receives and queues from an inbound filter and then injects at the same layer ID but into a registered outbound filter (NATIVE MAC). Are there known issues with having multiple layer 2 filters (inbound/outbound) active at the same time?

I can post a crash dump (I realize that without code it could be hard to answer), but the crash occurs much later, after packets have been injected, returned no errors, and yet did not show up in Wireshark. Also, it seems I can self-craft and send packets on any NIC with Wireshark running. But when I send a packet to the debug box, the packet is received, unwrapped (it arrives already encapsulated), the inner packet (smaller than the MTU) is sent, and then there is a BSOD in some net buffer list clone routine, even though I am not cloning packets. This only happens if Wireshark is running and if I copy inbound packets, extract the encapsulated packet, and then inject the extracted packet outbound on a different NIC index (NDIS port 0/default); that gives a different BSOD, but again not in my code; it is elsewhere in the LWF NDIS layer. Of course, if I take the exact same internal packet, self-craft it, and just send it (via an OID code and data buffer to the driver), then it sends with no problem, Wireshark does not crash, and memory is cleaned up with no problems, all using the same code to construct a new packet (from the user-mode application's OID request data buffer), send it, and clean up once it is sent.

The only thing I haven't tried is sending a self-crafted packet as a queued item for a thread to process and send, as opposed to sending it under the OID invocation context (running that test tomorrow), to see if that changes the behavior.

For now I am just opening this up for anyone who might have run across a similar situation when dealing with layer 2 WFP callout filters and threading. Again, I can post a crash dump, but it doesn't crash immediately (as would be typical of a malformed MDL, NB, or NBL, or bad packet data); it requires that I send several encapsulated packets (from a client machine running the same driver in "NBL clone" mode) before it crashes. If Wireshark is running it crashes immediately, but not when I self-craft the packets via the OID path, and the crash is not in any MDL, NB, or NBL that I have created or inspected. That leads me to think that perhaps memory is being stomped, but then again, the same code that copies inbound traffic is used for the OID-invoked self-constructed packets, and using that path I can send thousands of packets for long durations with no crashes.

So, for now that is the summary…I might just ditch WFP and jump down to LWF NDIS.

Any ideas?

An update on this issue:

Yesterday I wrote a quick tool that takes hex values copied as text from the Visual Studio memory watch and converts them (if a proper memory block/range is copied) into a Wireshark-compatible hex dump file. Prior to sending the internal encapsulated IPv4 packet, I had Wireshark running on the client side and a breakpoint just before the inject send async (MAC) function call. Upon sending the packet, I exported the client-side encapsulated packet to a hex dump file, removed the encapsulating wrapper header, adjusted the memory offsets (Wireshark hex dump format) so Wireshark could load it properly, and then copied the un-encapsulated (extracted) packet's memory and generated a second hex dump. Finally, I merged the client-side hex dump with the gateway-side un-encapsulated hex dump and loaded the two into Wireshark, where I confirmed the internal/encapsulated packet being sent was indeed the same final packet data being sent on the gateway side. So, I know the packet is correct (other than the MAC address differences).
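For reference, the core of that conversion is small. Here is a sketch in plain C of emitting one hex dump line in the layout Wireshark's "Import from Hex Dump" (and text2pcap) accept: a hex offset followed by space-separated hex bytes. The function name is mine, not from the tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format one hex dump line into dst: a 6-digit hex offset, then the bytes
   as two-digit hex values, e.g. "000000  de ad be" (note the double space
   after the offset, since each byte is emitted with a leading space). */
static void HexDumpLine(char *dst, size_t offset,
                        const unsigned char *bytes, size_t count)
{
    int pos = sprintf(dst, "%06lx ", (unsigned long)offset);
    for (size_t i = 0; i < count; i++)
        pos += sprintf(dst + pos, " %02x", (unsigned)bytes[i]);
}
```

A full dump is just this in a loop, 16 bytes per line, with the offset advancing by 16; that is the offset adjustment mentioned above when splicing two captures together.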

Next, I disabled the outbound layer 2 filter so that only the single inbound layer 2 filter was being applied. This resulted in the same "DATA_NOT_ACCEPTED" error code, but there was no BSOD. I then re-enabled the layer 2 outbound filter, recompiled, and ran the same test again. With both inbound and outbound layer 2 filters I got the "DATA_NOT_ACCEPTED" error, but immediately after that I got the BSOD down in the LWF NDIS side of WFP, complaining about memory being freed that had potentially already been freed.

One more note about this: About 3 months ago I posted a comment on MSDN regarding the FwpsInjectMacSendAsync0 function and how its page was basically a copy and paste of the FwpsInjectMacReceiveAsync0 page (the title of the link was FwpsInjectMacSendAsync0, but the body text kept referring to FwpsInjectMacReceiveAsync0). The naming issue was fixed, but I think there could still be an error in the MSDN documentation.
It states, under the net buffer list parameter description, that:
"The NET_BUFFER_LIST structure must begin with a MAC header."

The individual who fixed the naming issue also stated that:
"these functions have identical parameters and are basically the same, except they're injecting in different directions and the NBL retreating aspect might be different."

So, I think there are two bugs (or just bad documentation) regarding this issue:
1.) The FwpsInjectMacSendAsync0 function could require that the NB be advanced past the Ethernet header so that the IPv4 header is the start of the NB based on its offset (i.e. it would not require MAC addresses, as the packet is sent to the top of the TCP filter stack and that is sorted out on the packet's way out by the system).
2.) There is either a bug with having two NATIVE MAC filters (inbound and outbound) active at the same time, =or= if a layer 2 outbound filter is active, the NBL, NB, and MDL must not be deleted until the injected packet has completed the outbound filter's classify callback (i.e. the outbound classify function references the NBL, and the system tries to automatically free it upon completion). In this latter scenario, it seems the injection send completion callback is invoked prior to the outbound classify function, but the outbound classify function still holds a reference to the NBL that was sent; this would then explain the BSOD.
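If the second theory is right, the lifetime rule can be illustrated with a minimal reference-count sketch in plain C. This is not the WFP API, just the ordering problem: the injector's completion callback and the outbound classify each hold a reference, and the MDL/NB/NBL may only be freed on the last release, regardless of which callback fires first.

```c
#include <assert.h>

/* Stand-in for an injected NBL and its backing NB/MDL allocations. */
typedef struct FAKE_NBL {
    int RefCount;
    int Freed;
} FAKE_NBL;

static void NblReference(FAKE_NBL *nbl)
{
    assert(!nbl->Freed);   /* referencing freed memory would be the bug */
    nbl->RefCount++;
}

static void NblDereference(FAKE_NBL *nbl)
{
    assert(!nbl->Freed);   /* a second free here is the suspected BSOD */
    if (--nbl->RefCount == 0)
        nbl->Freed = 1;    /* stand-in for freeing the MDL, NB, and NBL */
}
```

Freeing unconditionally in the send-completion callback (instead of dereferencing) is exactly the pattern that would produce a later double free inside wfplwfs once the outbound classify releases its own reference.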

Once I run a few more tests I will post the findings, but until then: does any of this sound familiar to anyone, or look like it could be the issue I am experiencing? A simpler question would be: "Do all WFP inject-send function calls always traverse from the top of the TCP filter stack, and if so, would the minimum layer required/expected (i.e. the NBL pointing at a layer header like IPv4) for network and MAC send injections be the IPv4 header?"

It looks like there might have been a bad driver for the NIC =or= an earlier version of Wireshark/WinPcap was part of the issue.
I removed Wireshark (and WinPcap), deleted all NICs, and had the system re-discover and install drivers. This got me past the BSOD in a completely separate thread/service (the bottom portion of WFP's NDIS LWF layer), and I am now seeing the packet make it out on the proper NIC.

Verified the documentation is correct for FwpsInjectMacSendAsync0 (the NBL must start with the Ethernet header).
Verified that a single filter (inbound or outbound) at the targeted layer ID works fine (i.e. an inbound native MAC filter still allows calling FwpsInjectMacSendAsync0 even though that injection happens on the outbound side of things).

It is odd that a simple deletion and re-install of the NIC drivers and a removal and re-install of Wireshark/WinPcap solved the issue, but for now it looks like it is "working as one would expect".

One last note: I noticed another BSOD (after removing Wireshark and WinPcap, prior to the NIC deletion and re-install) that occurred in KiSystemServiceShadow. After looking around, it would appear that the system service entry path was modified as part of the Meltdown/Spectre mitigations, and I was wondering if anyone knew what the heck KiSystemServiceShadow is (obviously part of the kernel).
It seems like an odd name, and it is odd that once I deleted the NICs and had them re-discovered that BSOD stopped. Does Windows have a "shadow version" of kernel components? The only thing I could think of is that it is doing some form of memory CRC comparison at specific points (during memory/MDL free, it would seem) and, if things don't match, it does a hard crash (BSOD).

If anyone knows what this is it would be good to know a bit more about.

As a follow-up for anyone who might have been "looking for answers" on why Wireshark seems to mysteriously cause BSODs when you are injecting network traffic (i.e. no clones, just out-of-band or uniquely crafted packets):

On all of your inbound registered callout filters, you should place this at the very start of your classify function callback:

PNET_BUFFER_LIST    nbl = (PNET_BUFFER_LIST)layerData;

/* Wireshark/WinPcap re-indicates captured traffic to itself as loopback.
   Re-injecting those NBLs leads to a double free down in wfplwfs. */
if (NdisTestNblFlag(nbl, NDIS_NBL_FLAGS_IS_LOOPBACK_PACKET))
{
	return AllowPacket(classifyOut);   /* pass loopback packets through untouched */
}

Wireshark clones the packet and sends it as a loopback packet to its own inbound filter. If you do not ignore this packet and pass it along, you will end up crashing somewhere around here if you are injecting at layer 2 (at any other layer the crash shows up in the tcpip.sys region with a similar-looking pattern):
nt!KeBugCheckEx
nt!ExFreePoolWithTag + 0x1413
NETIO!NetioFreeMdl + 0x1a380
fwpkclnt!FwppFreeDeepCloneNetBufferList + 0x22
wfplwfs!L2FreeNetBufferListContext + 0x9a
wfplwfs!L2pReturnOrCompleteNetBufferList + 0x3c
wfplwfs!L2DereferenceNetBufferListContext + 0x63
wfplwfs!L2pDeepCloneNetBufferListCompletionFn + 0x9
wfplwfs!L2CompleteInjectedNetBufferLists + 0x39
wfplwfs!LwfLowerReturnNetBufferLists + 0x28ec
NDIS!ndisInvokeNextReceiveCompleteHandler + 0x19b
NDIS!ndisReturnNetBufferListsInternal + 0x124
NDIS!ndisSortNetBufferLists + 0x23173
NDIS!ndisMDispatchReceiveNetBufferLists + 0x17d
NDIS!ndisMTopReceiveNetBufferLists + 0x26125
NDIS!ndisInvokeNextReceiveHandler + 0x4b
NDIS!ndisFilterIndicateReceiveNetBufferLists + 0x22cce
NDIS!NdisFIndicateReceiveNetBufferLists + 0x3f
wfplwfs!L2NdisFIndicateReceiveNetBufferLists + 0x7c
wfplwfs!LwfLowerRecvNetBufferLists + 0x2a7a
NDIS!ndisCallReceiveHandler + 0x47
NDIS!ndisDataPathExpandStackCallback + 0x3e
nt!KeExpandKernelStackAndCalloutInternal + 0x8a
nt!KeExpandKernelStackAndCalloutEx + 0x1d
NDIS!ndisInvokeNextReceiveHandler + 0x235
NDIS!ndisDoLoopbackNetBufferList + 0x2bf
NDIS!ndisMLoopbackNetBufferLists + 0xd0
NDIS!ndisMSendNBLToMiniportInternal + 0x20e4e
NDIS!ndisMSendNBLToMiniport + 0xe
NDIS!ndisInvokeNextSendHandler + 0x46
NDIS!NdisFSendNetBufferLists + 0x101
wfplwfs!L2NdisFSendNetBufferLists + 0x74
wfplwfs!L2InjectNetBufferLists + 0x3bf
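For clarity, the NdisTestNblFlag() call in the check above is just a bit test against the NBL's flags word. A portable mimic (the type, function, and flag value here are illustrative only; the real NDIS_NBL_FLAGS_IS_LOOPBACK_PACKET constant lives in the NDIS headers):

```c
#include <assert.h>

/* Illustrative flag value only, not the real NDIS constant. */
#define FAKE_LOOPBACK_FLAG 0x00000400u

/* Mimic of an NBL carrying a flags word. */
typedef struct MOCK_NBL {
    unsigned int Flags;
} MOCK_NBL;

/* Mimics NdisTestNblFlag(): nonzero if the given flag bit is set. */
static int TestNblFlag(const MOCK_NBL *nbl, unsigned int flag)
{
    return (nbl->Flags & flag) != 0;
}
```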

Anyway, this bug took me almost 18 hours of digging around to figure out why Wireshark caused my driver to barf… and there you have it… loopback…

Thank you, Mr. NILC, for the consistent follow-up and for letting us know your resolution. I’m sure your experience will help others in the future.

Peter