Inherited an NDIS miniport driver. How does interrupt registration work?

Shane_Corbin · May 28, 2019, 11:43pm

It looks like interrupts and resources are owned by a parent driver, but the HLK tests want to exercise Receive Side Throttle (RST) under heavy load. The NDIS miniport driver doesn’t call NdisMRegisterInterruptEx to register the various “handler” callbacks for NDIS to call, so there’s no “handler” routine for RST to call less frequently. Instead, there’s a bus interface shared between the bus/parent driver and its children. The children provide callback functions to the bus driver, and when an interrupt occurs, the bus driver’s ISR iterates over all the callbacks.

my NDIS miniport driver | my other drivers | another of my drivers
----|     ^     |-----------|      ^    |---------|     ^     |----
----| interface |-----------| interface |---------| interface |----
----|     v     |-----------|      v    |---------|     v     |----
my bus driver
--------------
pci.sys
--------------
hardware
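
For readers who want the arrangement above in code form, here is a minimal user-mode sketch of the callback fan-out, not the actual bus interface. Every name in it (bus_interface, bus_register_child, bus_isr, count_interrupt) is hypothetical, and a real parent driver would also need locking and IRQL-aware code that is omitted here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this only models the shape of the design. */
#define MAX_CHILDREN 8

/* A child's callback returns true if the interrupt was meant for it. */
typedef bool (*child_isr_callback)(void *context);

typedef struct {
    child_isr_callback callback;
    void *context;
} child_registration;

typedef struct {
    child_registration children[MAX_CHILDREN];
    size_t count;
} bus_interface;

/* Called by a child driver (e.g. the miniport) to hook into the parent's ISR. */
static bool bus_register_child(bus_interface *bus, child_isr_callback cb, void *ctx)
{
    if (cb == NULL || bus->count == MAX_CHILDREN)
        return false;
    bus->children[bus->count].callback = cb;
    bus->children[bus->count].context = ctx;
    bus->count++;
    return true;
}

/* Stand-in for the parent's ISR: offers the interrupt to every registered
 * child and returns how many of them claimed it. */
static size_t bus_isr(bus_interface *bus)
{
    size_t claimed = 0;
    for (size_t i = 0; i < bus->count; i++) {
        if (bus->children[i].callback(bus->children[i].context))
            claimed++;
    }
    return claimed;
}

/* Example child callback: counts how many times it was invoked. */
static bool count_interrupt(void *context)
{
    int *counter = context;
    (*counter)++;
    return true;
}
```

The point of the sketch is that NDIS never sees the interrupt: the parent decides when each child's callback runs, which is why NDIS has nothing to throttle.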

What’s the appropriate way to register interrupts in such a scenario?

Jeffrey_Tippet_MSFT · May 29, 2019, 2:41am

In general, RST is largely tied to a specific notion of how “normal” PCI adapters work. It’s not general-purpose enough to be used in all types of miniport drivers. So in cases like this, you probably can’t use RST.

RST assumes that:

you have some packet queue in hardware;
you can very cheaply determine whether there are more packets in that queue;
there’s a small, O(n) cost for removing n packets from the queue; and
your hardware will do something reasonable (like 802.3x PAUSE frames) if the hardware queue starts to fill up.

(When I say “removing” from the queue, I don’t mean DMA. That’s a higher cost that’s paid asynchronously. I mean the small cost of reading the packet descriptor and copying it into an NBL. The point is that a typical PCI NIC can pay the cost on a per-packet basis. In contrast, a USB NIC doesn’t really work that way; the USB host controller pays most of those costs up front, and the USB device driver only gets the URBs later. So a USB device wouldn’t really benefit from this part of RST.)

The benefits of RST are:

1. If the computer is flooded with more traffic than it can handle, hardware can see the queues are filling up, and it has a chance to “do something reasonable”.
2. If the computer is flooded with more traffic than it can handle, we at least minimize the cost of copying hardware packet descriptors into NBLs for the packets that we were going to drop anyway.
3. Some shenanigans with DPCs vs threads running at DISPATCH_LEVEL, to minimize the appearance of having long-running DPCs.

These benefits are fairly modest, so you’re not losing out on a lot if you don’t implement RST. Of the three, benefit #3 is the most interesting, since arguably a flooded computer is a misconfigured computer, and we don’t really need to optimize for that case too much. But the DPC-vs-thread “shenanigans” are not tied to NDIS or ISRs at all; you could roll your own in your own driver. It basically amounts to having your DPC indicate up the first N packets, then queuing a work item to indicate the remaining ones.
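
As a rough illustration of that split, here is a user-mode simulation, not driver code. The names rx_state, rx_dpc and rx_work_item are hypothetical, packets are modeled as a plain counter, and "queuing a work item" is reduced to setting a flag. A real driver would queue an actual work item (for example with NdisQueueIoWorkItem) and would have to synchronize access to the queue.

```c
#include <stddef.h>

/* Hypothetical model: packets are just a counter, not real NBLs. */
typedef struct {
    size_t pending;          /* packets waiting in the simulated hardware queue */
    size_t indicated_dpc;    /* packets indicated from the DPC */
    size_t indicated_work;   /* packets indicated from the work item */
    int work_item_queued;    /* nonzero while a work item is outstanding */
} rx_state;

/* DPC stand-in: indicate at most dpc_budget packets, defer the rest. */
static void rx_dpc(rx_state *rx, size_t dpc_budget)
{
    size_t n = rx->pending < dpc_budget ? rx->pending : dpc_budget;
    rx->pending -= n;
    rx->indicated_dpc += n;
    if (rx->pending > 0)
        rx->work_item_queued = 1;   /* real code would queue a work item here */
}

/* Work item stand-in: runs later, outside the DPC, and drains what is left. */
static void rx_work_item(rx_state *rx)
{
    rx->indicated_work += rx->pending;
    rx->pending = 0;
    rx->work_item_queued = 0;
}
```

With 300 packets pending and a budget of 128, the DPC indicates 128 and leaves 172 for the work item, so the time spent inside the DPC is bounded by the budget rather than by the size of the burst.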

I don’t want to give the impression that it’s easy to do this in a way that is correct. NDIS’s own implementation has been patched several times as the NDIS team learns more about the kernel scheduler, and miniport drivers aren’t in a great position to have global visibility on what else is going on with the system. But at least one major IHV has implemented the scheduling side of RST in their NIC driver, and their implementation now has substantially more features than the NDIS equivalent.

FWIW, I have an item on my backlog to extract the scheduling “shenanigans” into a more generic library, so any miniport can take advantage of it, even if the miniport doesn’t use NDIS for ISR/DPC. But at the rate I’m getting things done, we’ll probably all have converted to NetAdapter before then, and NDIS & its RST will be obsolete.

Shane_Corbin · May 29, 2019, 4:26pm

@“Jeffrey_Tippet_[MSFT]” said:
… So in cases like this, you probably can’t use RST…

Thanks for the response Jeffrey. It was very informative and helpful.

We’re working our way through a number of HLK findings with this network driver. Many of the findings are performance related (GlitchFree, MiniStress, etc.). My understanding is that NDIS implements some logic to account for many of these performance related scenarios, but we’re unable to leverage those algorithms because they rely on NDIS calling our Interrupt/DPC handlers directly. RST being just one of these scenarios.

It’s certainly possible that registering our handlers with NDIS won’t magically resolve our HLK findings, but I’d like to understand how I can give NDIS control of our Interrupts/DPCs in this driver hierarchy where the parent/bus driver presently controls when those are called.

Jeffrey_Tippet_MSFT · May 29, 2019, 9:17pm

You can’t do this with the current version of NDIS. NDIS’s tricks only help when you can delegate your ISR to NDIS.

@“Shane_Corbin” said:
It’s certainly possible that registering our handlers with NDIS won’t magically resolve our HLK findings…

Yeah, NDIS RST isn’t magic. It does a bit of work to avoid hitting a DPC timeout (bugcheck 0x133), but it certainly doesn’t make the datapath run any faster. If your “HLK findings” are 133 bugchecks, then you’ll want to either use RST or replicate what it’s doing. If it’s something other than 133 bugchecks, then RST is unlikely to help, and you’ll need to investigate those findings.

If you want to replicate RST in your driver, because you’re seeing 133 bugchecks, then here’s a thumbnail sketch of what RST is actually doing.

The current implementation of NDIS has a trivial feedback loop that tries to keep the total time spent in a single DPC below some % of the DPC time limit. If the previous indication took too long, then NDIS lowers the NBL limit. If the previous indication came in well below time budget, then NDIS increases the NBL limit.
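
A feedback loop of this kind might look roughly like the sketch below. This is not NDIS's actual algorithm: the halve/double policy, the "well below budget" threshold of half the budget, the clamping bounds, and the names rst_feedback and rst_feedback_update are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical controller state; all values are illustrative assumptions. */
typedef struct {
    size_t nbl_limit;     /* current max NBLs to indicate per DPC */
    size_t min_limit;     /* never throttle below this */
    size_t max_limit;     /* never grow beyond this */
    unsigned budget_us;   /* target time for one indication, in microseconds */
} rst_feedback;

/* Adjust the NBL limit based on how long the previous indication took. */
static void rst_feedback_update(rst_feedback *fb, unsigned elapsed_us)
{
    if (elapsed_us > fb->budget_us) {
        /* Took too long: lower the limit (assumed policy: halve it). */
        fb->nbl_limit /= 2;
        if (fb->nbl_limit < fb->min_limit)
            fb->nbl_limit = fb->min_limit;
    } else if (elapsed_us < fb->budget_us / 2) {
        /* Well under budget: raise the limit (assumed policy: double it). */
        fb->nbl_limit *= 2;
        if (fb->nbl_limit > fb->max_limit)
            fb->nbl_limit = fb->max_limit;
    }
    /* Otherwise the previous indication was close to budget: leave it alone. */
}
```

In a driver, elapsed_us would come from timing the previous indication (for example with KeQueryPerformanceCounter), and the result would feed the batch size used by the next DPC.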

You’re not expected to implement this feedback loop yourself; many IHVs with homemade RST do just fine by hardcoding a constant maximum number of NBLs, like 128 or 1024. (Higher-end gear tends to use a higher NBL limit, with the expectation that if someone paid a lot for the NIC, they probably also paid a lot for the CPU, so the system can handle more packets in the same time interval.)

If there are more packets available than the limit, then queue another DPC to dispatch the next batch. This will get you past the “single DPC” watchdog, but it won’t help you with the “cumulative DPC” watchdog. For that, you need to ensure that there are small gaps between the DPCs. The easiest way to do this is to schedule a KTIMER that’s due to expire on the next tick. Unfortunately, that means that if the processor is otherwise idle, you’ll just waste cycles waiting for the timer. So you can race a low-priority passive thread against the timer: the thread ensures you get any spare CPU cycles, while the timer ensures you run within a small deadline (approximately half a clock interrupt interval).
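
The race between the timer and the thread can be sketched with a single "armed" flag that both contenders try to claim. The code below is a user-mode model using C11 atomics, not kernel code. NBL_LIMIT, rx_poller, rx_run_batch and rx_try_claim_and_run are hypothetical names, and a real driver would use a KTIMER, a system thread and interlocked operations instead, and would also cancel or re-arm whichever contender lost.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NBL_LIMIT 128   /* hardcoded batch size, as many IHVs reportedly do */

typedef struct {
    size_t pending;            /* packets waiting in the simulated queue */
    size_t batches_run;        /* how many batches have been indicated */
    atomic_bool batch_armed;   /* true while a deferred batch awaits a runner */
} rx_poller;

/* Indicate at most NBL_LIMIT packets. If more remain, arm the next batch so
 * that the timer or the thread can pick it up. Returns packets indicated. */
static size_t rx_run_batch(rx_poller *rx)
{
    size_t n = rx->pending < NBL_LIMIT ? rx->pending : NBL_LIMIT;
    rx->pending -= n;
    rx->batches_run++;
    atomic_store(&rx->batch_armed, rx->pending > 0);
    return n;
}

/* Called by BOTH the timer callback and the low-priority thread. Only the
 * caller that flips the flag from true to false runs the batch; the other
 * sees false and backs off. */
static bool rx_try_claim_and_run(rx_poller *rx)
{
    bool expected = true;
    if (!atomic_compare_exchange_strong(&rx->batch_armed, &expected, false))
        return false;   /* nothing armed, or the other contender got it */
    rx_run_batch(rx);
    return true;
}
```

Because the flag is only re-armed at the end of rx_run_batch, at most one contender is ever working on the queue, which is what keeps the plain (non-atomic) pending counter safe in this model.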

If you’re very familiar with the NT scheduler, you might wonder whether using a Threaded DPC would save you some trouble. Unfortunately, NDIS disallows receive indications on those, because our current implementation of TCPIP is not compatible with Threaded DPCs.

All this is a lot of complexity to build into a NIC driver, so I’m not really asking that you do it. Really, NDIS owes you a library API to do all this work for you. And NDIS can do a better job of scheduling than any individual miniport, since NDIS has more system-wide visibility. E.g., NDIS can determine that CPU4 is not used for anything else, so NDIS can disable the DPC watchdog on CPU4 and use the CPU exclusively for receive operations.