NDIS: Communicating filter driver context size needs to underlying and overlying drivers

Hi,

Can you please help me find out how a filter driver can communicate its context size needs to underlying (e.g., other filter, miniport) and overlying (e.g., other filter, protocol) drivers, in order to prevent memory allocations in calls to NdisAllocateNetBufferListContext for NET_BUFFER_LISTs passed to FilterSendNetBufferLists and FilterReceiveNetBufferLists?

Besides, as my filter driver generates sends and receive indications of its own, including calls to NdisAllocateNetBufferListPool, NdisAllocateCloneNetBufferList and NdisAllocateNetBufferList, how can I determine the context needs of all underlying and overlying drivers in order to provide a big enough value for the ContextSize in parameters to those calls?

Documentation page NET_BUFFER_LIST_CONTEXT structure says “NDIS estimates the required context data space and, if necessary, adjusts the allocated data space to meet the requirements for the entire driver stack”, but I don’t know if I’m supposed to contribute some information to aid NDIS in making that estimate.

Thank you very much.
Regards.

I’ve realised that it’d be good if I could also influence the data backfill size (DataBackFillSize) used by overlying and underlying drivers, so that when the NET_BUFFER reaches my driver there is enough preallocated room to call NdisRetreatNetBufferDataStart without allocating.

Is there a way to do that, like intercepting and altering an OID set request?

There is OID_GEN_MINIPORT_RESTART_ATTRIBUTES, but it’s not allowed for query or set requests.

Thank you very much.
Regards.

Good question - you were unsuccessful in finding detailed information on how to do this, because there isn’t a complete way to do this.

For the receive path: the NIC driver (“miniport”) is completely unaware of what’s going on above it. It allocates a pool of NBLs to use for receive indications. At that time, the NBLs’ initial context size is determined. The miniport doesn’t know when your filter attaches, and it doesn’t know what context size your filter might want. So its NBLs most likely won’t have space dedicated to your filter.

For the transmit path: NDIS starts the stack from the bottom up, so NDIS actually can accumulate the context size requests of the miniport, then each filter driver, and finally give the total context to each of the protocols. You’ll see this in the NDIS_RESTART_GENERAL_ATTRIBUTES: you can increment the ContextBackFillSize in your filter driver, and protocol drivers will see the final total.
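To make the bottom-up accumulation concrete, here is a small user-mode sketch. It is not NDIS code: MOCK_RESTART_ATTRIBUTES and both functions are hypothetical stand-ins, and only the ContextBackFillSize field name comes from the real NDIS_RESTART_GENERAL_ATTRIBUTES structure. It just shows each layer adding its own need to a running total as the restart moves up the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the single field of
 * NDIS_RESTART_GENERAL_ATTRIBUTES discussed here; the real
 * structure has many more members. */
typedef struct MOCK_RESTART_ATTRIBUTES {
    unsigned long ContextBackFillSize;
} MOCK_RESTART_ATTRIBUTES;

/* One layer (miniport or filter) adds its own Tx context need
 * as the restart passes through it. */
static void mock_layer_restart(MOCK_RESTART_ATTRIBUTES *attrs,
                               unsigned long need)
{
    assert(attrs != NULL);
    attrs->ContextBackFillSize += need;
}

/* Walk the stack bottom-up (index 0 is the miniport) and return
 * the total that a protocol at the top would see. */
static unsigned long mock_restart_stack(const unsigned long *needs,
                                        size_t layers)
{
    MOCK_RESTART_ATTRIBUTES attrs = { 0 };
    for (size_t i = 0; i < layers; i++) {
        mock_layer_restart(&attrs, needs[i]);
    }
    return attrs.ContextBackFillSize;
}
```

With a miniport that asks for nothing and two filters that ask for 16 and 32 bytes, the total at the top of this model is 48 bytes.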

However, in practice, the built-in TCPIP driver preallocates all its NBLs from a global pool (not specific to a miniport stack), so it can’t size the NBLs’ context areas based on the miniport stack’s specific needs. (It does this because TCP connections can float across layer-2 interfaces: a connection is bound to a layer-3 address, not a layer-2 network interface.)

So even if you dutifully increment ContextBackFillSize, there’s a good chance that nobody will look at it, and it won’t be honored.

I’m not supposed to say bad things about NDIS in public, but I do have to concede that this whole feature is a little half-baked.

Because the ContextBackFillSize is typically not honored, Windows would normally allocate + free context slabs every time the NBL traverses the stack. This is obviously bad for performance, so around Windows 8, we added a heuristic. Starting in Windows 8, NDIS will monitor the context allocations that actually get used in practice, and NDIS will heuristically sometimes cache some of the context slabs with the NBL. So when you call NdisFreeNetBufferListContext, the context slab(s) may not go back to the general kernel pool; the slab(s) may just be hidden inside the NBL. Next time you call NdisAllocateNetBufferListContext on the same NBL, NDIS may restore the slab from the cache, rather than hitting the general kernel pool.
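As a rough illustration of why that caching helps, here is a toy user-mode model. Everything in it is hypothetical (TOY_NBL, TOY_POOL and the toy_ functions are made up for this sketch), and it assumes the cache always hits once it is enabled, which the real heuristic does not promise. It only counts how often a single recycled NBL would have to go back to the general pool.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an NBL: tracks only whether a context slab is
 * currently attached, or parked in the per-NBL cache. */
typedef struct TOY_NBL {
    bool slab_in_use;
    bool slab_cached;
} TOY_NBL;

typedef struct TOY_POOL {
    unsigned pool_allocs;  /* trips to the general pool */
    bool caching_enabled;  /* assumption: models the Windows 8+ behavior */
} TOY_POOL;

static void toy_alloc_context(TOY_POOL *pool, TOY_NBL *nbl)
{
    assert(!nbl->slab_in_use);
    if (nbl->slab_cached) {
        nbl->slab_cached = false;  /* reuse the slab hidden in the NBL */
    } else {
        pool->pool_allocs++;       /* no cached slab: hit the pool */
    }
    nbl->slab_in_use = true;
}

static void toy_free_context(const TOY_POOL *pool, TOY_NBL *nbl)
{
    assert(nbl->slab_in_use);
    nbl->slab_in_use = false;
    /* Toy simplification: always cache when caching is enabled. */
    nbl->slab_cached = pool->caching_enabled;
}

/* Send one NBL through the datapath `trips` times and report how
 * many pool allocations that cost. */
static unsigned toy_pool_allocs_for_trips(bool caching_enabled,
                                          unsigned trips)
{
    TOY_POOL pool = { 0, caching_enabled };
    TOY_NBL nbl = { false, false };
    for (unsigned i = 0; i < trips; i++) {
        toy_alloc_context(&pool, &nbl);
        toy_free_context(&pool, &nbl);
    }
    return pool.pool_allocs;
}
```

In this model, five trips cost five pool allocations without caching and one with it. That single allocation on the first trip is the warmup cost.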

The upshot for you is that you can optionally improve performance in some cases by doing these steps:

  • If you need N bytes of context on the Tx path, then in your FilterRestartHandler, do restartAttributes->ContextBackFillSize += N.
  • If you’re allocating NBLs for Tx, read the restartAttributes->ContextBackFillSize and put that into the 3rd param to NdisAllocateNetBufferListContext.
  • It’s better to have a few big slabs, packed with many small contexts, than many small slabs each with their own context. So it’s maybe a good idea to just toss in extra ContextBackFill in your context allocations. Use benchmarks to tune an ideal backfill size there, to minimize the number of slabs that get allocated in practice.
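The "few big slabs" point in the last bullet can be sketched with another toy user-mode model. It is not NDIS code, and toy_count_slabs is a made-up name. It assumes that a new slab holds the requested context plus the backfill, that later requests are carved out of the remaining backfill when they fit, and that every driver passes the same sizes.

```c
#include <assert.h>

/* Count how many slabs get allocated when `drivers` layers each
 * request `ctx_size` bytes of context on the same NBL and each
 * passes `backfill` extra bytes with its request. */
static unsigned toy_count_slabs(unsigned drivers, unsigned ctx_size,
                                unsigned backfill)
{
    unsigned slabs = 0;
    unsigned free_bytes = 0;  /* unused backfill left in the newest slab */

    assert(ctx_size > 0);
    for (unsigned i = 0; i < drivers; i++) {
        if (free_bytes >= ctx_size) {
            free_bytes -= ctx_size;  /* fits in the existing slab */
        } else {
            slabs++;                 /* new slab: ctx_size + backfill */
            free_bytes = backfill;
        }
    }
    return slabs;
}
```

For four drivers that each need 16 bytes, this model gives four slabs with no backfill, two slabs with 16 bytes of backfill, and one slab with 48 bytes. Real numbers depend on what the rest of the stack does, so benchmarks are still needed to pick the value.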

If you are benchmarking, keep in mind:

  • Windows 8 and later will perform differently (better!) than earlier versions.
  • Leave room for a bit of warmup, since it takes a short amount of time for NBLs to get cycled through the datapath and get their cached contexts attached.
  • NDIS has somewhat brittle heuristics around when to cache; seemingly-small changes to your filter’s behavior or configuration can kick you off the hot path and result in a measurable difference in calls to the pool.

Oh I missed your reply to this post. Let me explain OID_GEN_MINIPORT_RESTART_ATTRIBUTES. This isn’t a normal OID in the sense that you use it with FilterOidRequest or NdisFOidRequest. Instead, it’s delivered via your FilterRestartHandler. The sample driver illustrates how to pluck it out of the function inputs: https://github.com/Microsoft/Windows-driver-samples/blob/master/network/ndis/filter/filter.c#L544