NDIS Multiport Device Receive Processing

Robert_Graham · April 3, 2014, 10:24pm

We have a single function, multi-port device (4-port).

In our design, we have two drivers: a bus driver (KMDF) that enumerates 4 PDOs, and a miniport driver (NDIS 6.2) that attaches to each of the PDOs. The bus driver handles the MII interrupt processing, and the miniport driver handles the regular Rx and Tx processing.

The bus driver creates an interface that allows the miniport driver to “register” for interrupts, basically passing the function for the bus driver to call when an Rx or Tx interrupt occurs. The bus driver schedules a DPC for each interrupt, which eventually calls the function passed by the miniport driver.

This works fine, except that in the case of 4 ports, the DPCs are all scheduled on the same processor, and thus, port 0 processes interrupts first, then port 1, then port 2, then port 3.

Generally, this is fine, but when we stress test all 4 ports at the same time, one processor gets flooded while the other processors are idle and available for processing.

I’ve seen this issue throughout the forum, and there has been mention of “KeSetTargetProcessorDpc”. Is there a sane way to use this? By sane I mean, is there a good way to arbitrate which processor a DPC should be scheduled on?
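One simple way to arbitrate is a static spread: give each port's DPC a different target processor once, at initialization. The sketch below shows only that policy in plain user-mode C. The helper name `pick_dpc_processor` and its parameters are hypothetical; in a driver, the returned number would be what gets passed to `KeSetTargetProcessorDpc` for that port's DPC, with the processor count obtained from the system rather than hard-coded.

```c
#include <assert.h>

/*
 * Hypothetical policy helper (not part of any Windows API).
 * Maps a port index to a processor number so that the ports' DPCs
 * are spread across the active processors instead of sharing one.
 *
 *   port_index  - 0-based port number on the device
 *   active_cpus - number of active processors (assumed to be queried
 *                 from the system by the caller)
 *   first_cpu   - processor to assign to port 0, which lets the ports
 *                 start away from a processor that is already busy
 */
unsigned pick_dpc_processor(unsigned port_index,
                            unsigned active_cpus,
                            unsigned first_cpu)
{
    assert(active_cpus > 0);
    return (first_cpu + port_index) % active_cpus;
}
```

With 4 ports and 4 or more processors, every port gets its own processor; with fewer processors, ports wrap around and share. Note that a targeted DPC is queued to its assigned processor even when that processor is busy, so a static spread like this helps most when the load is roughly even across ports.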

This would help solve the issue where Rx packets are dropped because buffers fill up and processing of those buffers doesn’t occur until another port’s processing is completed.

Would you suggest I set a maximum number of Rx packets to process per interrupt, and schedule the remaining work to be done later in another DPC or work item? Or really use KeSetTargetProcessorDpc?
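To make the "maximum per interrupt" idea concrete, here is a minimal user-mode simulation of budgeted receive processing, not driver code. Everything in it (`struct rx_port`, `rx_dpc`, `drain_all`, the budget value) is a hypothetical name chosen for illustration. Each simulated DPC handles at most `budget` frames and reports whether it needs to be queued again, so four ports sharing one processor take turns instead of port 0 draining completely before port 1 starts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port receive state for the simulation. */
struct rx_port {
    unsigned pending;    /* frames waiting in this port's Rx ring */
    unsigned processed;  /* frames handled so far */
};

/*
 * One simulated DPC invocation: handle at most `budget` frames.
 * Returns true when frames remain, meaning the caller should queue
 * the DPC again rather than keep looping here.
 */
bool rx_dpc(struct rx_port *port, unsigned budget)
{
    assert(budget > 0);
    unsigned n = port->pending < budget ? port->pending : budget;
    port->pending -= n;
    port->processed += n;
    return port->pending > 0;
}

/*
 * Simulates one processor's DPC queue: every port with work runs once
 * per pass, and ports with leftover frames are served again on the
 * next pass. Returns the number of passes needed to drain all ports.
 */
unsigned drain_all(struct rx_port *ports, size_t count, unsigned budget)
{
    unsigned passes = 0;
    bool more;
    do {
        more = false;
        for (size_t i = 0; i < count; i++) {
            if (ports[i].pending > 0 && rx_dpc(&ports[i], budget))
                more = true;
        }
        passes++;
    } while (more);
    return passes;
}
```

The budget bounds how long any one port can hold the processor, which also bounds how long the other ports' rings go unserviced. The cost is extra DPC scheduling overhead, so the budget is worth making configurable.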

Thanks in advance.

Thomas_Divine · April 4, 2014, 9:51am

Prokash Sinha put together some notes about high-performance miniport/NIC
design in 2010. His notes might not provide all the answers you need, but
may be worth a read. See:

http://www.ndis.com/ndis-design/prokash/default.htm

Offhand I’d say that your first and simplest tweak should be to implement
some sort of interrupt moderation in the receive path. Allow this value to
be configurable.

The problem of effectively using CPU resources in the receive path is fairly
difficult. Certainly out of my league. The NDIS “receive-side scaling” (RSS)
feature addresses this for TCP streams using message-signaled interrupts.
Here the NIC itself sorts packets for specific target CPUs before generating
the HW interrupt. The sorting criteria, based on header hash values, attempt to
target all packets associated with a TCP stream to the same CPU.
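As a rough illustration of the hash-then-steer idea, the sketch below computes a Toeplitz hash (the hash type RSS uses) over a flow's addresses and ports, then uses the low-order bits of the hash to index an indirection table of processor numbers. It is plain user-mode C, the names `toeplitz_hash`, `rss_target_cpu`, and `demo_key` are hypothetical, and the key is an arbitrary value for illustration only; a real NIC receives its key and indirection table from NDIS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary 40-byte key, for illustration only. */
const uint8_t demo_key[40] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
};

/*
 * Toeplitz hash: for every set bit of the input (most significant bit
 * first), XOR in the 32-bit window of the key that starts at that bit
 * position. The key must be at least data_len + 4 bytes long.
 */
uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                       const uint8_t *data, size_t data_len)
{
    assert(key_len >= data_len + 4);
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8)  |  (uint32_t)key[3];
    uint32_t hash = 0;

    for (size_t i = 0; i < data_len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit))
                hash ^= window;
            /* Slide the window one bit to the right along the key. */
            window <<= 1;
            if (key[i + 4] & (1u << bit))
                window |= 1u;
        }
    }
    return hash;
}

/*
 * Picks a processor from an indirection table using the low-order bits
 * of the hash. table_size must be a power of two.
 */
unsigned rss_target_cpu(uint32_t hash, const uint8_t *table, size_t table_size)
{
    assert(table_size > 0 && (table_size & (table_size - 1)) == 0);
    return table[hash & (table_size - 1)];
}
```

Because the hash depends only on the header fields fed into it, every packet of a given TCP stream yields the same hash and therefore the same processor, which keeps a stream's packets in order while different streams spread across processors.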

If you know any unique characteristics of the network traffic that your
NIC/miniport will be handling, then you might be able to capitalize on the
traffic characteristics to target CPUs.

A messy problem.

FWIW,

Thomas F. Divine
http://www.pcausa.com

NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Robert_Graham · April 4, 2014, 10:14am

Hi Thomas,

I read this article a long time ago. Thanks for the resource! RSS definitely solves the issue with MSI-X.

I’m also wondering if I should implement some kind of “polling” mode. I wrote the Linux version of this driver, and Linux has an API called “NAPI” that places the driver in “polling” mode. Instead of waiting for Rx or Tx interrupts, the driver polls on a periodic basis, and as long as the number of Rx and Tx packets processed during each poll exceeds some defined threshold, it stays in polling mode. Thus, the processing for each port isn’t tied to a particular processor, because the processor on which the software interrupt runs is undefined. This solves the issue where port 1 is waiting on port 0 to finish processing, port 2 is waiting on port 1, and port 3 is waiting on port 2. I’m wondering whether a similar design should be implemented on the Windows side.
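The mode switching described above can be sketched as a small state machine. This is a user-mode simulation under stated assumptions, not Linux or NDIS code: the names `rx_state`, `on_rx_interrupt`, and `on_poll` are hypothetical, and the "defined threshold" is assumed here to be the poll budget itself, so a poll that uses its whole budget stays in polling mode and a poll that finds less work falls back to interrupts.

```c
#include <assert.h>
#include <stdbool.h>

enum rx_mode { RX_MODE_INTERRUPT, RX_MODE_POLLING };

/* Hypothetical per-port state for the simulation. */
struct rx_state {
    enum rx_mode mode;
    bool irq_enabled;   /* whether the port's Rx interrupt is unmasked */
    unsigned pending;   /* frames waiting in the Rx ring */
};

/* Interrupt path: mask further Rx interrupts and switch to polling. */
void on_rx_interrupt(struct rx_state *s)
{
    s->irq_enabled = false;
    s->mode = RX_MODE_POLLING;
}

/*
 * Poll path: handle at most `budget` frames and return how many were
 * handled. If the ring ran dry before the budget was used up, traffic
 * has slowed, so unmask the interrupt and leave polling mode.
 */
unsigned on_poll(struct rx_state *s, unsigned budget)
{
    assert(budget > 0);
    if (s->mode != RX_MODE_POLLING)
        return 0;

    unsigned done = s->pending < budget ? s->pending : budget;
    s->pending -= done;

    if (done < budget) {
        s->mode = RX_MODE_INTERRUPT;
        s->irq_enabled = true;
    }
    return done;
}
```

Under sustained load the port takes one interrupt and then runs entirely from polls, so interrupt overhead drops exactly when the system is busiest. Whether the poll runs from a timer, a DPC, or a worker thread on Windows is a separate design choice that this sketch does not address.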

Thomas_Divine · April 4, 2014, 10:27am

Give thanks to “pro”. I just published his thoughts…

I can’t give authoritative advice on whether the Linux approach would work
on Windows.

But I would say that it’s time for experimentation and the idea seems simple
enough to try.

Good luck,

Thomas F. Divine
http://www.pcausa.com

Prokash_Sinha-1 · April 4, 2014, 3:15pm

Thos -

You have done a lot more (NT) community service (like others on this
list) than me, so no worries. Also, it is the real ether - it gets
broadcast all over, and whoever catches it is the …

-pro
