NDIS Filter: forwarding to ip

Samuel_Ghinet · October 13, 2013, 5:18pm

Hello,

I am making an NDIS filter driver and I need to forward certain packets to a different IP.

I don’t know which approach is better:
a) modify the packet itself and change the destination IP
b) use some API function to clone it (NdisAllocateCloneNetBufferList / NdisAllocateFragmentNetBufferList)
If I have the NBL list, I heard it’s safe to assume that the Ethernet header is in contiguous memory in the first MDL. What can be said for the other headers (IPv4 / IPv6, TCP)? Is it safe to look for them in every NET_BUFFER?
How do I modify a portion (here, a header) of the NET_BUFFER?
Is it OK to simply write into the buffer returned from NdisGetDataBuffer (given that it returns a contiguous block of data, and I do not ask it to allocate)?

Thanks!

Jeffrey_Tippet_MSFT · October 14, 2013, 3:59pm

The best architecture for forwarding IP packets is actually to use a WFP callout. WFP callouts integrate with the OS’s TCPIP stack, and can make forwarding/routing decisions. By integrating with the TCPIP stack, you get higher performance, automatic hardware offloads, firewall integration, support for IPsec-protected traffic, and path MTU discovery. WFP callouts are also easy to write, since you only have to provide a few callback functions, and you don’t have to worry about tracking hardware state.

If you really need to use an NDIS LWF, here is how to do it:

Technically you’re not supposed to modify the packet payload of an NBL. You can allocate a new MDL, fill it with your modified IP header, then chain it to the front of the NB. Make sure to save the old MDL, so you can restore the MDL when you’re done with the packet.

Because you’re not using WFP, you may also need to recalculate the IP checksum. If checksum offload is enabled, you should not defeat the offload by calculating the checksum. But if checksum offload is disabled or not supported, you should recalculate the IP checksum and write out the new value.
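For the non-offloaded case, the recalculation is the standard ones’-complement sum over the IPv4 header. Below is a minimal user-mode sketch in plain C, not driver code; `ipv4_header_checksum` and `ipv4_set_destination` are hypothetical helper names, and the header is assumed to be contiguous in memory with its length taken from the IHL field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the IPv4 header checksum over `len` bytes.
 * `len` is the header length from the IHL field (20..60, multiple of 4).
 * The checksum field itself (bytes 10 and 11) is treated as zero. */
static uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    assert(len >= 20 && len <= 60 && len % 4 == 0);

    for (size_t i = 0; i < len; i += 2) {
        if (i == 10)
            continue;                          /* skip the checksum field */
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    }
    while (sum >> 16)                          /* fold carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}

/* Hypothetical helper: rewrite the destination address (bytes 16..19)
 * and store the refreshed checksum in network byte order. */
static void ipv4_set_destination(uint8_t *hdr, size_t len, const uint8_t dst[4])
{
    uint16_t csum;

    hdr[16] = dst[0];
    hdr[17] = dst[1];
    hdr[18] = dst[2];
    hdr[19] = dst[3];

    csum = ipv4_header_checksum(hdr, len);
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xFF);
}
```

Keep in mind that TCP and UDP checksums cover a pseudo-header that includes the IP addresses, so rewriting the destination also invalidates those unless they are offloaded.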

The specific rules regarding contiguous buffers are documented for send [http://msdn.microsoft.com/en-us/library/windows/hardware/ff570756(v=vs.85).aspx] and receive [http://msdn.microsoft.com/en-us/library/windows/hardware/ff554851(v=vs.85).aspx]. Basically the Ethernet header cannot be split, and on the receive path, the remaining headers are contiguous up to the lookahead size.

Samuel_Ghinet · October 15, 2013, 6:40am

Thanks for the links!

Samuel_Ghinet · October 19, 2013, 4:39pm

I’ve got a question:

The msdn doc for NdisAllocateFragmentNetBufferList presents these args:

“DataOffsetDelta [in]
The additional amount of used data space that NDIS should make available in the new NET_BUFFER structures.
DataBackFill [in]
The amount of data space in addition to the value of the DataOffsetDelta parameter to allocate if allocation is necessary. If NDIS must allocate memory to supply the data space requested in DataOffsetDelta, it should also allocate the additional space that DataBackFill specifies.”

I don’t understand what’s with DataBackFill, and what the difference is between DataOffsetDelta and DataBackFill.

Can somebody please clarify the relationship between the two to me?
If DataOffsetDelta reduces from DataOffset (which I understand, it allocates memory if needed), then what does DataBackFill do?

Samuel_Ghinet · October 19, 2013, 6:04pm

And one more question…

I read from here:
http://msdn.microsoft.com/en-us/library/windows/hardware/hh582254(v=vs.85).aspx

“Once the cloned packet’s send or receive request has completed, the extension must complete the send or receive request of the original packet.
Note If the extension has duplicated a packet’s NET_BUFFER_LIST structure, it can complete the send or receive request of the original packet after it has been duplicated.”

So the steps are:
a) NdisFSendNetBufferLists(cloned_nbl)
b) NdisFSendNetBufferListsComplete(original_nbl).

Do I understand correctly?
The doc says “it can complete” - does this mean it is not necessary? What happens if you do / don’t?

David_R_Cattley · October 20, 2013, 10:49am

> Do I understand that correctly …

You must carefully understand the relationship between the NET_BUFFER_LIST,
NET_BUFFER, MDL, and the actual memory pages described by the MDL.

What this warning is telling you is that a clone NBL/NB refers directly to
the same memory that backed the parent NBL frame [data]. The only way to
ensure that that memory remains in the state that represents the original
NBL - and the cloned NBL - is to ensure that the clone send operation
completes *AND* the clone operation is reversed (free the clone) before
completing the original send.

The next part basically says that if you make a deep copy of the original
NBL and thus allocated new memory to hold the frame [data] and create an
entirely separate MDL chain describing that frame [data], the resulting
‘copy’ (not referred to as a clone at this point) will have no relationship
to the original.

By relationship it is meant that some resource associated with the original
NBL is needed to send the derived NBL.

If no such relationship is created in the derivation the original can be
completed independent of the derived.

> The doc says “it can complete” - does this mean it is not necessary?

You must still complete the original NBL. You are just less constrained on
exactly when it must complete relative to the completion of the derived NBL.

> What happens if you do / don’t?

If you do not complete them ever, at some point the originator of those NBLs
is going to run out of resources.

Good Luck,
Dave Cattley

David_R_Cattley · October 20, 2013, 11:18am

The fragment operation takes an existing NB and slices it into fragments,
each described by a separate NB. The resulting derived NBs reference the
original memory (there is no ‘copy’ operation here).

Depending on a myriad of edge cases about where buffer boundaries fall in
the MDL chain, you can get all sorts of interesting results but the only one
relevant to the backfill parameters is what to do about the beginning of the
MDL chain in each NB.

If you ask for backfill, the MDL chain must be such that it describes that
backfill with that backfill available if one retreats the NB.

For the second through n-th fragment NBs, there will not naturally be any
backfill space because those fragments are being sliced out of the original
buffer from the middle. What ‘precedes’ these fragments in the logical
buffer is the end of the last fragment. So NDIS is going to allocate a new
block of pool, describe it with an MDL, and attach it to the start of the MDL
chain, one for each fragment NB.

How big this block of memory will be is controlled by these parameters.

In the case of the first fragment, the original NB may well have had
backfill space available preceding it in the logical buffer. So the first
NB might be able to meet the DataOffsetDelta request without allocating any
memory. Or it might not. And so what happens with the MDL chain in the
first fragment NB can either be ‘nothing’ (no new MDL/buffer) or it can be
modified with the addition of a new buffer to meet the DataOffsetDelta
request just as the second through n-th NB MDL chains are modified.

Basically this boils down to instructions equivalent to:

Get me a cup of sugar from the cupboard. If you need to go to the store,
buy a 5lb bag of sugar.

DataBackFill is how big a bag of sugar to buy if you don’t have enough
already. But precisely it is how much ‘more’ not how much ‘total’.
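
The sugar analogy can be written down as a tiny user-mode model. This is only a sketch of the rule as the quoted documentation states it, not NDIS code: `toy_bytes_to_allocate` is a made-up name, and the assumption that a fresh allocation covers the full DataOffsetDelta plus DataBackFill is a reading of the docs, not something verified against NDIS internals.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (hypothetical): how many bytes get allocated when a NET_BUFFER
 * needs `data_offset_delta` more bytes in front of its data.
 * `available` is the unused space already described before the data. */
static size_t toy_bytes_to_allocate(size_t available,
                                    size_t data_offset_delta,
                                    size_t data_backfill)
{
    if (available >= data_offset_delta)
        return 0;                              /* enough sugar in the cupboard */

    /* Assumption: once an allocation is needed, it holds the requested
     * delta plus the extra backfill ("how much more", not "how much total"). */
    return data_offset_delta + data_backfill;
}
```

With 64 bytes already available, a 20-byte request allocates nothing regardless of DataBackFill; with none available, a 20-byte request with 32 bytes of backfill allocates 52 under this model.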

Good Luck,
Dave Cattley

Samuel_Ghinet · October 20, 2013, 12:28pm

“If you ask for backfill, the MDL chain must be such that it describes that
backfill with that backfill available if one retreats the NB.”
So the backfill is some kind of optimization for the case you need later to retreat, to have some unused space already available?

Regarding cloning (fragmenting) & duplication:
If I want to encapsulate NBs (add & modify protocols’ headers), which should I choose?
When is duplication preferable to cloning?

Also, for cloning, why do you have to allocate the forwarding context? Why is the original forwarding context not used for the clone?

And another thing I’m unsure of:
So:
a) there is no MDL associated for any part of the unused space, right?
b) any memory mapped from an MDL is sure to contain only used space.
c) any retreat operation (have enough unused space or not) does not modify any existing MDL. Instead, you obtain one more MDL linked to the list.
d) the packets that are transmitted over the cable are NET_BUFFERs. MDLs are where / how the device (PC) represents that memory.

I’m also curious what the reason is for grouping lists of NET_BUFFERs into NET_BUFFER_LISTs / how it helps.

Thanks!

David_R_Cattley · October 20, 2013, 1:16pm

> So the backfill is some kind of optimization for the case you need later
to retreat, to have some unused space already available?

Yes, but not so much an optimization but one very specifically ensuring that
the backfill area will be contiguous as well. When IPv4 datagrams are
fragmented into packets, each fragment gets an IPv4 header. That header
and the L2 header need to go somewhere and the IP header needs to be
contiguous.

The architecture of NBLs/NBs strives very hard to avoid unnecessary copying
and the cost of mapping memory associated with it. It is possible that a
send operation in UM can get all the way to the DMA operation at the NIC
without ever having mapped anything more than the headers into kernel
address space (they are generally allocated from NPPool anyway). In other
words, the UM buffer will never be mapped and read by any KM component and
the data can be sent via the NIC by DMA directly from the physical pages.

> If I want to encapsulate NBs (add & modify protocols’ headers), which
should I choose?
> When is duplication preferable to cloning?

I don’t think there is one answer here or even a rule-of-thumb. Cloning
avoids copying data but requires a strict completion ordering relationship
to avoid that copy. Avoiding the copy can be a good thing especially if
you are also avoiding the memory manager having to map the pages in the
first place.

> a) there is no MDL associated for any part of the unused space, right?

No, I don’t think that is correct. Either the CurrentMdl will describe the
backfill entirely or there will be an MDL preceding the CurrentMdl that
describes it.

> b) any memory mapped from an MDL is sure to contain only used space.

No. Unlike an NDIS_PACKET, a NET_BUFFER can have an MDL chain that
describes a logical frame buffer that contains the frame as an arbitrary
contiguous sub-section of the entire buffer. The MDL chain can describe
buffer space before and after the ‘used’ portion.

> c) any retreat operation (have enough unused space or not) does not modify
any existing MDL. Instead, you obtain one more MDL linked to the list.

I am pretty sure this is a correct statement. Retreat will not modify an
MDL in the MDL chain. It can result in prepending an additional MDL. By
modify I mean change fields in the MDL directly and not the ‘modification’
that might occur if the Memory Manager is asked to map the MDL to System
Address Space. This technically modifies the MDL by assigning it a
SystemVA but it does not change the ‘logical’ buffer segment extents
described by the MDL.

The only ‘adjustment’ to an MDL I know of that NDIS permits is to change the
logical length of the memory via NdisAdjustMdlLength(). This would have no
use in a retreat which is operating on the ‘other end’ (beginning) of the
MDL chain.

> d) the packets that are transmitted over the cable are NET_BUFFERs. MDLs
are where / how the device (PC) represents that memory.

Sure. That works mostly. Keep in mind that the details include that the
‘packet’ is described as a sub-section of the MDL chain by fields in the
NET_BUFFER. Specifically the DataOffset & DataLength fields. It is not
the case that one can divorce the MDL chain from the NET_BUFFER and know
what the logical packet is.
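
To make the DataOffset / DataLength point concrete, here is a user-mode toy model in plain C. `ToyMdl` and `toy_copy_used` are invented for illustration (a real driver would use NdisGetDataBuffer rather than walk the chain by hand); the model only shows that the used bytes are found by skipping DataOffset bytes from the start of the chain and then taking DataLength bytes, possibly across several MDLs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an MDL: one mapped segment in a singly linked chain. */
typedef struct ToyMdl {
    struct ToyMdl *next;
    const unsigned char *va;   /* mapped address of this segment */
    size_t byte_count;         /* length of this segment */
} ToyMdl;

/* Copy the used bytes [data_offset, data_offset + data_length) of the chain
 * into `out`. Returns the number of bytes copied, which is less than
 * data_length only if the chain is too short. */
static size_t toy_copy_used(const ToyMdl *chain, size_t data_offset,
                            size_t data_length, unsigned char *out)
{
    size_t copied = 0;

    /* Skip whole segments that lie entirely in the unused prefix. */
    while (chain != NULL && data_offset >= chain->byte_count) {
        data_offset -= chain->byte_count;
        chain = chain->next;
    }

    /* Copy from each remaining segment until the used length is consumed. */
    while (chain != NULL && copied < data_length) {
        size_t avail = chain->byte_count - data_offset;
        size_t want = data_length - copied;
        size_t n = avail < want ? avail : want;

        memcpy(out + copied, chain->va + data_offset, n);
        copied += n;
        data_offset = 0;       /* later segments are read from their start */
        chain = chain->next;
    }
    return copied;
}
```

The same chain yields different "packets" for different offset and length values, which is why the MDL chain alone does not identify the frame.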

> I’m also curious what the reason is for grouping lists of NET_BUFFERs into
NET_BUFFER_LISTs / how it helps.

Efficiency. In the send path, all of those NBs in an NBL will be related
to a single logical operation and share the same Out-Of-Band / context
information. Intervening filters and the Miniport can gain significant
efficiency in many operations by taking advantage of this.

Good Luck,
Dave Cattley

Samuel_Ghinet · October 20, 2013, 1:47pm

"> If I want to encapsulate NBs (add & modify protocols’ headers), which
should I choose?

When is duplication preferable to cloning?

I don’t think there is one answer here or even a rule-of-thumb. Cloning
avoids copying data but requires a strict completion ordering relationship
to avoid that copy. Avoiding the copy can be a good thing especially if
you are also avoiding the memory manager having to map the pages in the
first place."

I expected it to depend on things. I understand then, that the pro of cloning is that it’s more resource efficient & fast.

“> b) any memory mapped from an MDL is sure to contain only used space.
No. Unlike an NDIS_PACKET, a NET_BUFFER can have an MDL chain that
describes a logical frame buffer that contains the frame as an arbitrary
contiguous sub-section of the entire buffer. The MDL chain can describe
buffer space before and after the ‘used’ portion.”

What I’m interested to know is, if I want to write into the packet (e.g. retreat and then write, overwrite protocol headers), how do I know that I write to the correct places?
I found this function - MmGetSystemAddressForMdlSafe - for mapping the memory described by an MDL. Or is there another, better method of writing into the packet?

Thanks!

David_R_Cattley · October 20, 2013, 2:02pm

> What I’m interested to know is, if I want to write into the packet (e.g.
retreat and then write, overwrite protocol headers),
how do I know that I write to the correct places?

First a caveat: In general you are not permitted to write into a buffer
you did not allocate. So think about that carefully and how it applies to
the operations you intend to perform in the logical flow of send in your
driver.

> I found this function - MmGetSystemAddressForMdlSafe - for mapping the
memory described by an MDL.
> Or is there another, better method of writing into the packet?

NdisGetDataBuffer() is a way of getting a pointer into a contiguous
sub-section of a frame described by a NET_BUFFER.

I think the logical operation you are trying to do would be described by the
following (schematic) sequence:

1. Read the header data from the original NB.
2. Advance the original NB(L) to logically remove the header data.
3. Clone the original NBL.
4. Retreat the original NBL to restore its original state. Now or before
   completing it.
5. Retreat the clone by the amount you need to write your header.
6. Write your modified header to the area that retreat just exposed in the
   clone.
7. Send the clone.
8. When the clone completes, complete the original.

Or something like that.

I can’t say as I have done this to know it works perfectly. If you try it,
let us know!
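
The sequence above can be modeled in user mode to show why the original's memory is never written. Everything here is a toy: `ToyOriginal`, `ToyClone` and the `toy_*` functions are hypothetical, the clone's retreat is represented by a small header array owned by the clone (standing in for the MDL that a retreat would allocate), and no NDIS call is made.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_HEADER 64

/* Toy original NET_BUFFER: one flat buffer that we did not allocate. */
typedef struct {
    const unsigned char *buf;   /* read-only: owned by whoever sent it */
    size_t data_offset;
    size_t data_length;
} ToyOriginal;

/* Toy clone: shares the original's payload, owns only its new header. */
typedef struct {
    const unsigned char *payload;          /* points into the original */
    size_t payload_length;
    unsigned char header[TOY_MAX_HEADER];  /* space exposed by the retreat */
    size_t header_length;
} ToyClone;

static void toy_advance(ToyOriginal *nb, size_t n)
{
    assert(n <= nb->data_length);
    nb->data_offset += n;
    nb->data_length -= n;
}

static void toy_retreat(ToyOriginal *nb, size_t n)
{
    assert(n <= nb->data_offset);
    nb->data_offset -= n;
    nb->data_length += n;
}

/* Advance past the old header, clone, restore the original, then give the
 * clone its own header in memory that belongs to the clone. */
static ToyClone toy_clone_with_header(ToyOriginal *orig, size_t old_header_len,
                                      const unsigned char *new_header,
                                      size_t new_header_len)
{
    ToyClone clone;

    assert(new_header_len <= TOY_MAX_HEADER);
    memset(&clone, 0, sizeof clone);

    toy_advance(orig, old_header_len);               /* hide the old header */
    clone.payload = orig->buf + orig->data_offset;   /* clone shares the rest */
    clone.payload_length = orig->data_length;
    toy_retreat(orig, old_header_len);               /* original is as before */

    memcpy(clone.header, new_header, new_header_len);
    clone.header_length = new_header_len;
    return clone;
}

/* What the clone describes: the new header followed by the shared payload. */
static size_t toy_flatten(const ToyClone *clone, unsigned char *out)
{
    memcpy(out, clone->header, clone->header_length);
    memcpy(out + clone->header_length, clone->payload, clone->payload_length);
    return clone->header_length + clone->payload_length;
}
```

Because the clone's payload pointer aims into the original buffer, the original has to stay alive until the clone is done, which is the completion-ordering constraint discussed earlier in the thread.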

Good Luck,
Dave Cattley

Samuel_Ghinet · October 22, 2013, 7:10am

With NdisGetDataBuffer I have succeeded in handling the buffer properly.
I’m having the following problem, however: after I send the cloned NBL (NdisFSendNetBufferLists), when it exits the FilterSendNetBufferLists function, it crashes:

“DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0000000000000000, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000001, value 0 = read operation, 1 = write operation
Arg4: fffff880036e0e9d, address which referenced memory”

For a clone, must I allocate a forwarding context (AllocateNetBufferListForwardingContext), or is it optional? I see that if I don’t, the clone does have a non-null ptr to NET_BUFFER_LIST_CONTEXT.

Also, after the cloning, the parent of the clone is NULL. The doc says it should point to the original (parent). Why is it NULL for me?

Must I use CopyNetBufferListInfo for the clone?

Samuel_Ghinet · October 22, 2013, 8:24am

OK, now I learnt this one: NdisFSendNetBufferLists does not call FilterSendNetBufferListsComplete.
Now I know where to complete the original NBL and stuff.

The only thing that remains… in FilterSendNetBufferLists I receive a list of NBLs, which means… I’ll have to keep a global list of “original NBL” entries and a list of “cloned NBL” entries. uhhh.

Samuel_Ghinet · October 22, 2013, 2:07pm

Still does not work: it crashes somewhere after FilterSendNetBufferListsComplete exits (because of the clone).
I wonder what the hell I did wrong / missed.

I have done this:
in FilterSendNetBufferLists (in this order):

- advance by ethernet size (don’t free)
- clone using NdisAllocateCloneNetBufferList - 0 as clone flag.
- set the clone sourceHandle
- AllocateNetBufferListForwardingContext with pClonedNbl as arg, and CopyNetBufferListInfo from original to clone.
- add the data to the clone (retreat + write)
- NdisFSendNetBufferLists on the nbl
- add the original and clone nbl into a list, so that when FilterSendNetBufferLists is called, we will have both.

In FilterSendNetBufferLists (in this order) :

- if the list aforementioned is not empty:
- retrieve the struct to have the original & clone, for clone = NetBufferLists FilterSendNetBufferLists arg.
- NdisFSendNetBufferListsComplete on clone
- FreeNetBufferListForwardingContext on clone
- NdisFreeFragmentNetBufferList on clone
- NdisRetreatNetBufferListDataStart with header size.
- NdisFSendNetBufferListsComplete on original
- FreePoolWithTag on struct (with original and cloned nbl - it was allocated to be in the list).

Can anybody PLEASE tell me what I did wrong or what I forgot to do?
I also attempted to retreat the original right after making the clone.

The problem appears to be caused by the sending of the cloned nbl.
I’ll continue to study the msdn doc, but please, if you know / have an idea what could be, please tell me!

David_R_Cattley · October 22, 2013, 2:45pm

Ok, perhaps you can just post the code for your send and send complete routines.

Did you actually mean to describe FilterSendNetBufferLists() twice? I assumed the second one was really the completion path.

add the original and clone nbl into a list, so that when FilterSendNetBufferLists is called, we will have both.

The clone is going to be given back in the completion callback. The original is something you should find because the parent pointer in the clone ought to be pointing at the original. I just don’t recall if that is something that NdisCloneXxx() does automatically or if that is a feature of the WFP clone routines. But that is a minor point assuming your code handles the lists correctly.
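
A user-mode sketch of that completion path, under stated assumptions: `ToyNbl` and `toy_send_complete` are invented names, `is_ours` stands in for comparing the NBL's SourceHandle with the filter's handle, and `freed` stands in for releasing the clone. It shows three things a real handler would also need: walk the whole completed chain, read the parent pointer before the clone is released, and hand only the originals back up (a clone the filter originated was never sent by the layers above, so it is freed rather than completed to them).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a NET_BUFFER_LIST. */
typedef struct ToyNbl {
    struct ToyNbl *next;      /* next NBL in the chain handed to the callback */
    struct ToyNbl *parent;    /* original NBL, set only on our clones */
    int is_ours;              /* models SourceHandle == our filter handle */
    int freed;                /* set when the clone has been released */
} ToyNbl;

/* Walk a completed chain. Clones we originated are released and replaced by
 * their parents; everything else passes through unchanged. Returns the chain
 * that should be completed upward, in the original order. */
static ToyNbl *toy_send_complete(ToyNbl *completed)
{
    ToyNbl *up_head = NULL;
    ToyNbl **up_tail = &up_head;

    while (completed != NULL) {
        ToyNbl *next = completed->next;    /* read the link before freeing */
        ToyNbl *to_complete = completed;

        completed->next = NULL;
        if (completed->is_ours) {
            assert(completed->parent != NULL);
            to_complete = completed->parent;   /* save the parent first */
            completed->freed = 1;              /* only then release the clone */
        }

        to_complete->next = NULL;
        *up_tail = to_complete;                /* append to the upward chain */
        up_tail = &to_complete->next;

        completed = next;
    }
    return up_head;
}
```

With the parent pointer on the clone, no separate global list pairing originals with clones is needed in this model.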

And post the !analyze output from the crash. We can’t guess.

Good Luck,
Dave Cattley

Samuel_Ghinet · October 22, 2013, 3:30pm

Yeah, sorry,

“- add the original and clone nbl into a list, so that when
FilterSendNetBufferListsComplete is called, we will have both.”

"In FilterSendNetBufferListsComplete (in this order) :

if the list aforementioned is not empty:
retrieve the struct to have the original & clone, for clone = NetBufferLists
FilterSendNetBufferListsComplete arg."

“And post the !analyze output from the crash. We can’t guess.”

I reproduced it again now:
"kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY (fc)
An attempt was made to execute non-executable memory. The guilty driver
is on the stack trace (and is typically the current instruction pointer).
When possible, the guilty driver’s name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: fffffa8001eb01a0, Virtual address for the attempted execute.
Arg2: 800000003de009e3, PTE contents.
Arg3: fffff88006fa1710, (reserved)
Arg4: 0000000000000003, (reserved)

Debugging Details:

DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT

BUGCHECK_STR: 0xFC

PROCESS_NAME: svchost.exe

CURRENT_IRQL: 2

TRAP_FRAME: fffff88006fa1710 – (.trap 0xfffff88006fa1710)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=fffffa8001e90000
rdx=fffffa8002e11490 rsi=0000000000000000 rdi=0000000000000000
rip=fffffa8001eb01a0 rsp=fffff88006fa18a8 rbp=fffffa8002ea9010
r8=0000000000000005 r9=fffffa8002e11490 r10=0000000000000000
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl zr na po nc
fffffa8001eb01a0 1101 adc dword ptr [rcx],eax ds:fffffa8001e90000=a0a96c03
Resetting default scope

LAST_CONTROL_TRANSFER: from fffff80377f810ea to fffff80377e80930

STACK_TEXT:
fffff88006fa0d88 fffff80377f810ea : 0000000000000000 00000000000000fc fffff88006fa0ef0 fffff80377f054b8 : nt!DbgBreakPointWithStatus
fffff88006fa0d90 fffff80377f80742 : 0000000000000003 fffff88006fa0ef0 fffff80377f05e90 00000000000000fc : nt!KiBugCheckDebugBreak+0x12
fffff88006fa0df0 fffff80377e86144 : 0000000000000000 fffff88006fa17b0 0000000000000080 fffff88001bf3dc0 : nt!KeBugCheck2+0x79f
fffff88006fa1510 fffff80377ff4708 : 00000000000000fc fffffa8001eb01a0 800000003de009e3 fffff88006fa1710 : nt!KeBugCheckEx+0x104
fffff88006fa1550 fffff80377ff3528 : fffff88000000001 0000000000000001 0000000000000000 0000000000000000 : nt! ?? ::FNODOBFM::string'+0x335ae
fffff88006fa1590 fffff80377ec108f : 0000000000000005 fffff88006fa1640 0000000000000000 fffff88006fa1710 : nt! ?? ::FNODOBFM::string'+0x322fc
fffff88006fa15d0 fffff80377e83aee : 0000000000000008 0000000000000005 0000000000000000 fffff88006fa1710 : nt!MmAccessFault+0xa6f
fffff88006fa1710 fffffa8001eb01a0 : fffff8800145bc35 fffff88001bf0850 fffff80377f1c2dc fffff88006fa19a8 : nt!KiPageFault+0x16e
fffff88006fa18a8 fffff8800145bc35 : fffff88001bf0850 fffff80377f1c2dc fffff88006fa19a8 fffffa8002ea2400 : 0xfffffa8001eb01a0
fffff88006fa18b0 fffff8800145bf96 : fffffa8001eb01a0 fffffa8002e11490 fffff88000000005 fffffa80028ca002 : NDIS!ndisMSendCompleteNetBufferListsInternal+0x135
fffff88006fa1960 fffff8800145c0df : fffffa8002e11490 0000000000000000 0000000000000021 0000000000000065 : NDIS!ndisInvokeNextSendCompleteHandler+0x126
fffff88006fa1a00 fffff880036bd3ea : fffffa8001e90000 fffff88006fa1ac8 fffff88006fa1a90 0000000000000000 : NDIS!NdisMSendNetBufferListsComplete+0x9f
fffff88006fa1a60 fffff88003671178 : 0000000000000000 0000000000000000 0000000000000000 fffff88006fa1bc0 : vmswitch!VmsExtIoPacketRouted+0x4f0fa
fffff88006fa1af0 fffff8800366ba78 : fffffa8002e11490 fffff88006fa1bc0 fffffa8002e11490 0000ffffffffffff : vmswitch!VmsNblHelperRefCountDecrementMany+0x68
fffff88006fa1b30 fffff88003692ed3 : 0000000000000000 fffffa8001eb2000 fffffa8002e11490 fffff88000000021 : vmswitch!VmsExtMpRoutePackets+0x348
fffff88006fa1c90 fffff8800145b76a : 0000000000000000 fffff88006fa1da9 fffffa8002e11490 fffffa800001ff21 : vmswitch!VmsExtMpSendNetBufferLists+0x15b
fffff88006fa1cf0 fffff8800145b97b : fffffa8001eb01a0 0000000000000000 0000000000000002 fffffa8002e11490 : NDIS!ndisInvokeNextSendHandler+0x22a
fffff88006fa1e00 fffff880036bd987 : fffffa8001eb2002 fffffa8002962d20 0000000000000000 fffff88000000021 : NDIS!NdisSendNetBufferLists+0x12b
fffff88006fa1ee0 fffff880036695d1 : fffffa8001536000 0000000000000000 fffffa8001536000 fffff88001461800 : vmswitch!VmsExtPtRouteNetBufferLists+0x525a7
fffff88006fa1fb0 fffff880036694cf : 0000000000000000 0000000000000001 fffffa80015e3360 000000000001ff00 : vmswitch!VmsExtPtRouteNetBufferListsWithBwCap+0xb1
fffff88006fa2030 fffff8800145b76a : fffff88006fa2128 fffffa8002962d20 fffff88003660000 fffffa8000000001 : vmswitch!VmsMpNicSendNetBufferLists+0x12f
fffff88006fa20c0 fffff8800145b97b : fffffa8001f341a0 0000000000000000 0000000000000002 fffffa8002962d20 : NDIS!ndisInvokeNextSendHandler+0x22a
fffff88006fa21d0 fffff88001cccecd : 0000000000000002 0000000000000000 fffffa8000000000 fffffa8000000000 : NDIS!NdisSendNetBufferLists+0x12b
fffff88006fa22b0 fffff88001cd2924 : fffff88001e04b90 0000000000000000 fffffa80013f0000 fffffa8000000800 : tcpip!IppFragmentPackets+0x49d
fffff88006fa2410 fffff88001cd32c6 : fffff88001e04b90 fffffa8001d4e1e8 fffffa8001c58800 c09bcc28ecec9300 : tcpip!IppDispatchSendPacketHelper+0x94
fffff88006fa2530 fffff88001d017cb : 0000000000000000 fffffa80013f16a8 fffff88006fa2960 fffffa80013f15c8 : tcpip!IppPacketizeDatagrams+0x2b6
fffff88006fa2640 fffff88001cbb6df : 0000000000000004 fffff88001e04b90 fffffa80017c5550 0000000000000000 : tcpip!IppSendDatagramsCommon+0x6eb
fffff88006fa2800 fffff88001cba819 : fffffa80028c6240 fffffa8001622164 fffff88006fa3270 fffff88001e04b90 : tcpip!UdpSendMessagesOnPathCreation+0x90f
fffff88006fa2c20 fffff88001cbca88 : fffff88006fa3160 fffff80377ec89c0 fffff88006fa31d0 fffff88006fa3160 : tcpip!UdpSendMessages+0x259
fffff88006fa3030 fffff80377ec6df5 : fffff8a000020019 0000000000000000 0000000000000000 fffffa8000c99f6c : tcpip!UdpTlProviderSendMessagesCalloutRoutine+0x15
fffff88006fa3060 fffff80377ec7d85 : fffff88001cbca74 fffff88006fa31d0 0000000000000000 0000000000000000 : nt!KeExpandKernelStackAndCalloutInternal+0xe5
fffff88006fa3160 fffff88001cbcc70 : 0012008900000000 fffff80378095b90 000000000000000c fffff80300000000 : nt!KeExpandKernelStackAndCalloutEx+0x25
fffff88006fa31a0 fffff88001b3a1f6 : fffffa8001622160 fffff88006fa32a9 fffffa8002c37d3e 00000000000007ff : tcpip!UdpTlProviderSendMessages+0x70
fffff88006fa3220 fffff88001b3962f : 0000000000000000 fffffa8002d2f780 fffffa8002d2f898 0000000000000020 : tdx!TdxSendDatagramTransportAddress+0x2e6
fffff88006fa3310 fffff88001b69f20 : fffffa8002d2f780 fffffa8002d2f780 fffffa8002c37b50 fffffa8002d2f3d0 : tdx!TdxTdiDispatchInternalDeviceControl+0x7f
fffff88006fa3510 fffff88001b6be9f : fffffa8001d6ce30 fffffa8002dbb740 fffffa8002c3ed20 fffff88001b6abb0 : netbt!SendNameServiceRequest+0x5a4
fffff88006fa35c0 fffff88001b6b5ab : fffffa8001e51708 0000000000000000 0000000000000001 0000000000000040 : netbt!QueryNameOnNet+0x52f
fffff88006fa36c0 fffff88001ba8a3f : fffffa8001646b10 fffffa8001646b10 fffff88001b6b200 fffff80377ed6e00 : netbt!FindNameOrQuery+0x1fc
fffff88006fa3750 fffff88001ba808e : fffffa8001646b10 fffffa8002d0bbd0 fffffa8002e54230 fffffa8001646b10 : netbt!DispatchIoctls+0x23f
fffff88006fa3860 fffff8037826f42f : fffffa8002e54230 fffffa8002d0bbd0 fffff88006fa3b80 0000000000000001 : netbt!NbtDispatchDevCtrl+0x7e
fffff88006fa3890 fffff8037826fdb6 : 0000000000000000 fffff8a000000000 0000000000000000 00000063950833c0 : nt!IopXxxControlFile+0x7dd
fffff88006fa3a20 fffff80377e85053 : 00000000000006b0 0000000000000000 0000000000000000 0000000000000000 : nt!NtDeviceIoControlFile+0x56
fffff88006fa3a90 000007fa48172c1a : 000007fa449f6fb0 0000006395082f40 0000000000000000 0000000000000000 : nt!KiSystemServiceCopyEnd+0x13
00000063953be858 000007fa449f6fb0 : 0000006395082f40 0000000000000000 0000000000000000 00000063950669c0 : ntdll!NtDeviceIoControlFile+0xa
00000063953be860 0000006395082f40 : 0000000000000000 0000000000000000 00000063950669c0 00000063950833c0 : DNSAPI!Socket_CacheInit+0x16e0
00000063953be868 0000000000000000 : 0000000000000000 00000063950669c0 00000063950833c0 0000000000210096 : 0x0000006395082f40

STACK_COMMAND: kb

FOLLOWUP_IP:
vmswitch!VmsExtIoPacketRouted+4f0fa
fffff880`036bd3ea 90 nop

SYMBOL_STACK_INDEX: c

SYMBOL_NAME: vmswitch!VmsExtIoPacketRouted+4f0fa

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: vmswitch

IMAGE_NAME: vmswitch.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 5010aa4f

BUCKET_ID_FUNC_OFFSET: 4f0fa

FAILURE_BUCKET_ID: 0xFC_vmswitch!VmsExtIoPacketRouted

BUCKET_ID: 0xFC_vmswitch!VmsExtIoPacketRouted

Followup: MachineOwner

"

I’m a bit frustrated here that I don’t know how to investigate further given such info.

Samuel_Ghinet · October 22, 2013, 4:04pm

ok, here’s the code:

FilterSendNetBufferLists code

NET_BUFFER_LIST* CloneNblNormal(_In_ NET_BUFFER_LIST* pNbl, _In_ PSX_SWITCH_OBJECT vSwitch)
{
    NET_BUFFER_LIST* pClonedNbl = NULL;
    NDIS_STATUS status = 0;

    pClonedNbl = NdisAllocateCloneNetBufferList(pNbl, g_hNblPool, g_hNbPool, 0);
    if (!pClonedNbl) {
        DbgPrint("CloneNbl: NdisAllocateCloneNetBufferList failed");
        ASSERT(pClonedNbl);
        //do not dereference a NULL clone below
        return NULL;
    }

    pClonedNbl->SourceHandle = vSwitch->ndisFilterHandle;

    status = vSwitch->ndisSwitchHandlers.AllocateNetBufferListForwardingContext(vSwitch->ndisSwitchContext, pClonedNbl);
    if (status != NDIS_STATUS_SUCCESS) {
        ASSERT(0);
    }

    status = vSwitch->ndisSwitchHandlers.CopyNetBufferListInfo(vSwitch->ndisSwitchContext, pClonedNbl, pNbl, 0);
    if (status != NDIS_STATUS_SUCCESS) {
        ASSERT(0);
    }

    pClonedNbl->ParentNetBufferList = pNbl;

    return pClonedNbl;
}

static VOID _ProcessOneNblIngress(PSX_SWITCH_OBJECT vSwitch, ULONG sendFlags, NET_BUFFER_LIST* pNbl)
{
    BOOLEAN require_gre = FALSE;
    NET_BUFFER_LIST* pClonedNbl = NULL;
    VOID* pNetBuffer = NULL;
    ULONG ethHeaderSize = sizeof(ABC_ETHERNET_HEADER);

    //check if either of the NBs in the NBL requires GRE encapsulation.
    require_gre = CheckNbsRequireGre(pNbl, &pNetBuffer);

    if (require_gre)
    {
        NdisAdvanceNetBufferListDataStart(pNbl, ethHeaderSize, FALSE, NULL);

        //attempt cloning without fragmenting for the moment
        pClonedNbl = CloneNblNormal(pNbl, vSwitch);
        ASSERT(pClonedNbl);

        //write info from the allocated buffer pNetBuffer which contains all the NB buffer
        //retreats bytes: eth + ip + gre, used NdisRetreatNetBufferDataStart to retrieve contiguous memory (the newly allocated mdl created by retreat)
        Gre_EncapsulateNbs(pClonedNbl, (ABC_ETHERNET_HEADER*)pNetBuffer);
        ExFreePoolWithTag(pNetBuffer, g_frameTag);

        //it's the SxLibSendNetBufferListsIngress function from the sample
        _Nbls_SendIngress(vSwitch, pClonedNbl, sendFlags, 0);
    } else {
        //it's the SxLibSendNetBufferListsIngress function from the sample
        _Nbls_SendIngress(vSwitch, pNbl, sendFlags, 0);
    }
}

static VOID _ProcessNblsIngress(PSX_SWITCH_OBJECT vSwitch, PNET_BUFFER_LIST netBufferLists, ULONG sendFlags)
{
    NET_BUFFER_LIST* pNbl = netBufferLists;
    NET_BUFFER_LIST* pNextNbl = NULL;

    DbgPrintNblCount(netBufferLists);

    while (pNbl) {
        //we are allowed to break the links between NBLs. Each NBL will be sent separately.
        pNextNbl = NET_BUFFER_LIST_NEXT_NBL(pNbl);
        NET_BUFFER_LIST_NEXT_NBL(pNbl) = NULL;

        _ProcessOneNblIngress(vSwitch, sendFlags, pNbl);

        pNbl = pNextNbl;
    }
}

_Use_decl_annotations_
VOID Nbls_StartIngress(PSX_SWITCH_OBJECT vSwitch, NDIS_HANDLE extensionContext, PNET_BUFFER_LIST netBufferLists, ULONG sendFlags)
{
    UNREFERENCED_PARAMETER(extensionContext);

    if (vSwitch->dataFlowState != SxSwitchRunning) {
        //this will call complete; it's the SxLibSendNetBufferListsIngress function from the sample
        _Nbls_SendIngress(vSwitch, netBufferLists, sendFlags, 0);
        return;
    }

    _ProcessNblsIngress(vSwitch, netBufferLists, sendFlags);
}

_Use_decl_annotations_
VOID FilterSendNetBufferLists(NDIS_HANDLE filterModuleContext, PNET_BUFFER_LIST netBufferLists, NDIS_PORT_NUMBER portNumber, ULONG sendFlags)
{
    //this struct is from the forwarding sample
    PSX_SWITCH_OBJECT switchObject = (PSX_SWITCH_OBJECT)filterModuleContext;
    UNREFERENCED_PARAMETER(portNumber);

    Nbls_StartIngress(switchObject, switchObject->extensionContext, netBufferLists, sendFlags);
}

==============================================================

FilterSendNetBufferListsComplete code

//No ingress complete processing necessary.
_Use_decl_annotations_
VOID Nbls_CompleteIngress(PSX_SWITCH_OBJECT vSwitch, NDIS_HANDLE extensionContext, PNET_BUFFER_LIST netBufferLists, ULONG sendCompleteFlags)
{
    UNREFERENCED_PARAMETER(extensionContext);

    NdisFSendNetBufferListsComplete(vSwitch->ndisFilterHandle, netBufferLists, sendCompleteFlags);
}

VOID FreeClonedNblNormal(_In_ NET_BUFFER_LIST* pNbl)
{
    NdisFreeCloneNetBufferList(pNbl, 0);
}

VOID FilterSendNetBufferListsComplete(NDIS_HANDLE filterModuleContext, PNET_BUFFER_LIST netBufferLists, ULONG sendCompleteFlags)
{
    PSX_SWITCH_OBJECT switchObject = (PSX_SWITCH_OBJECT)filterModuleContext;

    if (netBufferLists->ParentNetBufferList)
    {
        Nbls_CompleteIngress(switchObject, switchObject->extensionContext, netBufferLists, sendCompleteFlags);
        switchObject->ndisSwitchHandlers.FreeNetBufferListForwardingContext(switchObject->ndisSwitchContext, netBufferLists);
        FreeClonedNblNormal(netBufferLists);
        NdisRetreatNetBufferListDataStart(netBufferLists->ParentNetBufferList, sizeof(ABC_ETHERNET_HEADER), 0, NULL, NULL);

        Nbls_CompleteIngress(switchObject, switchObject->extensionContext, netBufferLists->ParentNetBufferList, sendCompleteFlags);
    } else {
        Nbls_CompleteIngress(switchObject, switchObject->extensionContext, netBufferLists, sendCompleteFlags);
    }
}

=============================================================

I hope it helps.

Samuel_Ghinet · October 22, 2013, 4:57pm

Do I need to specify destination port, even though I don’t want to send the cloned nbl anywhere other than where the original NBL would have gone to?

Samuel_Ghinet · October 22, 2013, 6:14pm

I have tried with adding a destination port with AddNetBufferListDestination.
The same error happens (ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY), slightly different arg values:
Arg1: fffffa8001cbe1a0, Virtual address for the attempted execute.
Arg2: 800000003e0009e3, PTE contents.
Arg3: fffff801a20cc450, (reserved)
Arg4: 0000000000000003, (reserved)

still svchost.exe.

Samuel_Ghinet · October 23, 2013, 12:45pm

I have disabled the cloning functionality.
I tested whether it works when the Next pointer of each NBL is set to NULL and each NBL is sent separately.
It yields a first-chance exception:
“Assertion failure - code c0000420 (first chance)”

!analyze -v says:

"
Unknown bugcheck code (0)
Unknown bugcheck description
Arguments:
Arg1: 0000000000000000
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: 0000000000000000

Debugging Details:

PROCESS_NAME: System

DPC_TIMEOUT_TYPE: SINGLE_DPC_TIMEOUT_EXCEEDED

DPC_RUNTIME: 283

DPC_TIME_LIMIT: 282

FAULTING_IP:
nt!KeAccumulateTicks+575
fffff802`ec07b2e5 cd2c int 2Ch
"

I understood from here: http://msdn.microsoft.com/en-us/library/windows/hardware/ff570756(v=vs.85).aspx
“However, drivers are not required to restore the links between NET_BUFFER_LIST structures.”
that I can break the links without restoring them, and thus send each NBL separately.

How else can I separate each NBL?
(given that from this NBL list, only certain NBLs should be cloned)