Prokash Sinha put together some notes about high-performance miniport/NIC
design in 2010. His notes might not provide all the answers you need, but
may be worth a read. See:
http://www.ndis.com/ndis-design/prokash/default.htm
Offhand I’d say that your first and simplest tweak should be to implement
some sort of interrupt moderation in the receive path. Make the moderation
interval configurable.
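As a rough illustration of what configurable moderation means, here is a minimal sketch in portable C rather than real miniport code. All names (rx_moderation_cfg, rx_moderation_on_frame, and so on) are hypothetical, and it assumes the interrupt is raised after either a frame count or a delay threshold is reached, whichever comes first. Real hardware would normally also use a timer so that a lone frame is not held indefinitely; this sketch only checks the delay when a frame arrives.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical moderation settings. In a real miniport these would be
 * read from configuration, not hard-coded. */
typedef struct {
    uint32_t max_frames;   /* raise an interrupt after this many frames */
    uint32_t max_delay_us; /* or once the oldest pending frame is this old */
} rx_moderation_cfg;

typedef struct {
    rx_moderation_cfg cfg;
    uint32_t pending_frames; /* frames received since the last interrupt */
    uint64_t first_frame_us; /* arrival time of the oldest pending frame */
} rx_moderation_state;

static void rx_moderation_init(rx_moderation_state *s, rx_moderation_cfg cfg)
{
    s->cfg = cfg;
    s->pending_frames = 0;
    s->first_frame_us = 0;
}

/* Record one received frame at time now_us.
 * Returns true when the (simulated) interrupt should fire. */
static bool rx_moderation_on_frame(rx_moderation_state *s, uint64_t now_us)
{
    if (s->pending_frames == 0)
        s->first_frame_us = now_us;
    s->pending_frames++;

    bool fire = s->pending_frames >= s->cfg.max_frames ||
                (now_us - s->first_frame_us) >= s->cfg.max_delay_us;
    if (fire)
        s->pending_frames = 0; /* start a new moderation window */
    return fire;
}
```

Setting max_frames to 1 effectively disables moderation, which makes it easy to compare behavior with and without it during stress tests.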
The problem of effectively using CPU resources in the receive path is fairly
difficult. Certainly out of my league. The NDIS “receive-side scaling” (RSS)
feature addresses this for TCP streams using message-signaled interrupts.
Here the NIC itself sorts packets for specific target CPUs before generating
the HW interrupt. The sorting criteria, based on header hash values, attempt to
steer all packets associated with a TCP stream to the same CPU.
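The idea of hashing header fields and looking up a target CPU can be sketched as follows. This is only an illustration under stated assumptions: it uses a simple FNV-1a hash as a stand-in for the Toeplitz hash that RSS specifies, and the names (flow_tuple, indirection_init, flow_to_cpu) and the table size of 128 are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define INDIRECTION_ENTRIES 128 /* illustrative size; must be a power of two */

typedef struct {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
} flow_tuple;

/* Mix the low `bytes` bytes of value into an FNV-1a hash.
 * Stand-in only: this is NOT the Toeplitz hash used by RSS. */
static uint32_t fnv1a_step(uint32_t h, uint32_t value, int bytes)
{
    for (int i = 0; i < bytes; i++) {
        h ^= (value >> (8 * i)) & 0xFFu;
        h *= 16777619u;
    }
    return h;
}

static uint32_t flow_hash(const flow_tuple *t)
{
    uint32_t h = 2166136261u; /* FNV-1a offset basis */
    h = fnv1a_step(h, t->src_ip, 4);
    h = fnv1a_step(h, t->dst_ip, 4);
    h = fnv1a_step(h, t->src_port, 2);
    h = fnv1a_step(h, t->dst_port, 2);
    return h;
}

/* Spread table entries round-robin over the CPUs.
 * cpu_count must be between 1 and 256. */
static void indirection_init(uint8_t table[INDIRECTION_ENTRIES],
                             unsigned cpu_count)
{
    for (size_t i = 0; i < INDIRECTION_ENTRIES; i++)
        table[i] = (uint8_t)(i % cpu_count);
}

/* The low bits of the hash index the table; the entry names the CPU. */
static unsigned flow_to_cpu(const uint8_t table[INDIRECTION_ENTRIES],
                            const flow_tuple *t)
{
    return table[flow_hash(t) & (INDIRECTION_ENTRIES - 1)];
}
```

Because the hash depends only on the header fields, every packet of a given stream lands on the same CPU, while different streams tend to spread across the table.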
If you know any unique characteristics of the network traffic that your
NIC/miniport will be handling, then you might be able to capitalize on the
traffic characteristics to target CPUs.
A messy problem.
FWIW,
Thomas F. Divine
http://www.pcausa.com
-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@gmail.com
Sent: Thursday, April 3, 2014 10:24 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] NDIS Multiport Device Receive Processing
We have a single function, multi-port device (4-port).
In our design, we have two drivers: a bus driver (KMDF) that enumerates 4
PDOs, and a miniport driver (NDIS 6.2) that attaches to each of the PDOs. The
bus driver handles the MII interrupt processing, and the miniport driver
handles the regular Rx and Tx processing.
The bus driver creates an interface that allows the miniport driver to
“register” for interrupts, basically passing the function for the bus driver
to call when an Rx or Tx interrupt occurs. The bus driver schedules a DPC for
each interrupt, which eventually calls the function passed by the miniport
driver.
This works fine, except that in the case of 4 ports, the DPCs are all
scheduled on the same processor, and thus, port 0 processes interrupts
first, then port 1, then port 2, then port 3.
Generally, this is fine, but when we stress test all 4 ports at the same
time, 1 processor gets flooded while the other processors are free and
available for processing.
I’ve seen this issue throughout the forum, and there has been mention of
“KeSetTargetProcessorDpc”. Is there a sane way to use this? By sane I mean,
is there a good way to arbitrate which processor a DPC should be scheduled
on?
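One simple arbitration policy is to spread the ports round-robin across the active processors. The sketch below shows only that policy in portable C; dpc_target_cpu is a hypothetical helper, not a WDK function. In a driver, the processor count would presumably come from KeQueryActiveProcessorCount and the result would be passed to KeSetTargetProcessorDpc once, when each port's DPC is initialized. That call addresses processors in a single group, so systems with more than 64 logical processors would need the group-aware variant.

```c
/* Choose a target CPU for the DPC that services a given port.
 * Ports are spread round-robin over the active processors, so with
 * 4 ports and 4 or more CPUs each port gets its own processor. */
static unsigned dpc_target_cpu(unsigned port_index, unsigned active_cpus)
{
    if (active_cpus == 0)
        return 0; /* defensive fallback; a running system has at least one CPU */
    return port_index % active_cpus;
}
```

For example, with 4 ports on a 2-CPU machine, ports 0 and 2 share CPU 0 and ports 1 and 3 share CPU 1. A static mapping like this ignores actual load, so it is a starting point rather than a complete answer.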
This would help solve the issue where Rx packets are dropped because buffers
fill up and processing of those buffers doesn’t occur until another port’s
processing is completed.
Would you suggest I set a maximum number of Rx packets to process per
interrupt, and schedule the remaining work to be done later in another DPC or
work item? Or should I really use KeSetTargetProcessorDpc?
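The per-interrupt limit described above can be sketched as a budgeted poll. This is a simplified model, not driver code: rx_ring and rx_poll are hypothetical names, and the ring is reduced to a counter of filled descriptors. The point is that the routine handles at most `budget` frames and reports whether work remains, so the caller can requeue the DPC instead of looping until the ring is empty.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified receive ring: just tracks how many frames are waiting. */
typedef struct {
    size_t head;  /* count of descriptors consumed so far */
    size_t ready; /* filled descriptors waiting to be processed */
} rx_ring;

/* Process at most `budget` frames from the ring.
 * Stores the number handled in *processed and returns true if frames
 * remain, in which case the caller should schedule another pass. */
static bool rx_poll(rx_ring *ring, size_t budget, size_t *processed)
{
    size_t n = ring->ready < budget ? ring->ready : budget;

    /* A real driver would indicate each of these n frames up the stack. */
    ring->head += n;
    ring->ready -= n;

    *processed = n;
    return ring->ready > 0;
}
```

With a budget in place, one busy port gives up the processor after a bounded amount of work, which lets the DPCs for the other ports run before their buffers overflow.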
Thanks in advance.