DPC limitations?

I’m wondering whether there are any additional limitations I need to be
aware of while processing within a ‘Custom’ DPC?

Current design has an ISR DPC potentially kicking off another ‘Custom’
DPC to not only handle NBL completions, but also to check for any queued
NBLs and send them off as well.

I thought about using a threaded DPC for this work, however since it
runs at PASSIVE_LEVEL, and a spin lock is acquired (which IIUC, will
raise to DISPATCH_LEVEL) it didn’t seem to make sense to use a Threaded
DPC.

The ‘send’ mechanism does have a throttling mechanism, so even during
high load, it will not be possible for the DPC thread to hold the CPU
hostage. The ‘send’ mechanism will not block, save for contention on a
spin lock (in which case it would likely find no work to do after
acquisition).

Does this sound reasonable? If not, what other mechanisms do I have?

Thx
-PWM

(caveat: I know nothing about NDIS, which is what I assume NBL relates to)

You don’t want to do this in your DpcForIsr… why?

Peter
OSR

Are you working on a physical NIC? Just curious: why do you want to write a NIC
miniport driver for Windows as a software company? Can’t you just use the driver
from the hardware vendor? A field-proven miniport driver is very difficult to
get right.

Calvin

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Peter W. Morreale
Sent: Wednesday, January 13, 2010 9:42 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] DPC limitations?


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

> Are you working on a physical NIC? Just curious, why you want to write a NIC
> miniport driver for Windows as a sw company? Can’t you just use the driver
> from hardware vendor? A field proven miniport driver is very difficult to
> get right.

Are you referring to Novell? They made the ‘original’ cheap network card :)

(okay, that was a long time ago, and anyone involved in that project
has probably moved on to different things or retired, but it’s still
strange to hear of Novell referred to as a software company)

James

On Wed, 2010-01-13 at 11:56 -0800, Calvin Guan wrote:

> Are you working on a physical NIC?

No, a software one. Hardware is for sissies.

http://developer.novell.com/wiki/index.php/AlacrityVM

> Just curious, why you want to write a NIC
> miniport driver for Windows as a sw company? Can’t you just use the driver
> from hardware vendor? A field proven miniport driver is very difficult to
> get right.

nod.

Like Barbie once said: “Math class is tough!”

;)

Best,
-PWM


On Wed, 2010-01-13 at 13:55 -0500, xxxxx@osr.com wrote:

> (caveat: I know nothing about NDIS, which is what I assume NBL relates to)

NET_BUFFER_LIST, which contains a list of NET_BUFFERs, each of which
contains a list of MDLs pointing to fragments of a frame.

A list of lists of lists.
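
To make the "list of lists of lists" shape concrete, here is a minimal C sketch. The struct and function names (`ModelMdl`, `ModelNetBuffer`, `ModelNetBufferList`, `model_total_bytes`) are hypothetical stand-ins invented for illustration; the real NDIS `NET_BUFFER_LIST`, `NET_BUFFER` and `MDL` structures carry many more fields and are walked with the NDIS accessor macros, not like this.

```c
#include <stddef.h>

/* Simplified stand-ins for the NDIS structures. All names are hypothetical;
   only the three-level nesting mirrors what the post describes. */
typedef struct ModelMdl {
    struct ModelMdl *next;       /* next fragment of the same frame */
    size_t byte_count;           /* bytes described by this fragment */
} ModelMdl;

typedef struct ModelNetBuffer {
    struct ModelNetBuffer *next; /* next frame in the same list */
    ModelMdl *mdl_chain;         /* fragments that make up this frame */
} ModelNetBuffer;

typedef struct ModelNetBufferList {
    struct ModelNetBufferList *next;  /* NBLs can themselves be chained */
    ModelNetBuffer *first_net_buffer; /* frames carried by this NBL */
} ModelNetBufferList;

/* Walk all three levels and total the bytes described by every fragment. */
size_t model_total_bytes(const ModelNetBufferList *nbl)
{
    size_t total = 0;
    for (; nbl != NULL; nbl = nbl->next)
        for (const ModelNetBuffer *nb = nbl->first_net_buffer; nb != NULL; nb = nb->next)
            for (const ModelMdl *mdl = nb->mdl_chain; mdl != NULL; mdl = mdl->next)
                total += mdl->byte_count;
    return total;
}
```

Completing a send therefore means touching every level: the NBL is what gets completed back to NDIS, while the ring elements being reclaimed correspond to the fragments at the bottom.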

> You don’t want to do this in your DpcForIsr… why?

Because the documentation for EvtInterruptDpc() states:

The system does not add the DPC object to the DPC queue if the object is
already queued. An EvtInterruptIsr callback function might be called
several times before the system calls the EvtInterruptDpc callback
function. Therefore, the EvtInterruptDpc callback function must be able
to process information from several interrupts, and it must process all
interrupts that have occurred since the last time it was called.

Since in my case I could be contending for a spin lock, I need a way to
dispose of the data passed in the interrupt that is asynchronous to the
interrupt handling itself.

Thanks,
-PWM


> Therefore, the EvtInterruptDpc callback function must be able to process information from several interrupts, and it must process all interrupts that have occurred since the last time it was called.

Right.
You need to poll buffer data in your DPC routine before exiting, in case you get another interrupt while the DPC is being queued or processed. You can’t issue one DPC for each DPC.

Igor Sharovar

>> Therefore, the EvtInterruptDpc callback function must be able to process information from several interrupts, and it must process all interrupts that have occurred since the last time it was called.
>
> Right. You need to poll buffer data in your DPC routine before exit in case if you get another interrupt during queuing or processing DPC.

… which is potentially bad, as there is a limit to how long you can spend in a DPC before you have to leave – polling buffer data can potentially exceed the 100us allowed for a DPC …

> You can’t issue one DPC for each DPC.

… actually, you need to be able to requeue a DPC after n poll loops inside of the DPC and exit to avoid losing data if there is still data pending after the 100us has elapsed. The OS won’t allow you to double-queue the DPC, true – but it will happily allow you to queue a new DPC while you are completing one …

> The OS won’t allow you to double-queue the DPC, true – but it will happily allow you to queue a new DPC while you are completing one …
Good point. It should be done in this way.

Igor Sharovar

On Wed, 2010-01-13 at 20:20 -0500, xxxxx@hotmail.com wrote:

>> The OS won’t allow you to double-queue the DPC, true – but it will happily allow you to queue a new DPC while you are completing one …
>
> Good point. It should be done in this way.
>
> Igor Sharovar

Which is precisely what I am doing.

Thanks for the sanity check.

-PWM



> Because the documentation for EvtInterruptDpc() states: The system does not add the DPC
> object to the DPC queue if the object is already queued. An EvtInterruptIsr callback function
> might be called several times before the system calls the EvtInterruptDpc callback function.
> Therefore, the EvtInterruptDpc callback function must be able to process information from several interrupts, and it must process all interrupts that have occurred since the last time it was called.
>
> Since in my case, I could be contending for a spin lock, I need an async (to the interrupt handling
> itself) method of the disposition of the data passed in the interrupt.

If you don’t mind, could you please expand on that a bit and explain what you want to do and why you think DpcForIsr is not going to work for you? Look: as long as a DPC is in the queue, it is due for execution, so when it runs it will be able to get, from the device registers, all the information about hardware events that occurred before it was dequeued. Therefore it makes no sense to invoke the same routine more than once, and the system will not queue a DPC that is already enqueued.

That is all the quote you provided says. It does not say that the DPC cannot be queued successfully until the DPC routine returns control, does it??? If your target hardware event occurs while the DPC routine is running, the ISR will be able to enqueue the DPC successfully, because the DPC is already dequeued by the time the DPC routine starts executing. While you are contending for a spinlock at DPC level, your code can be interrupted if your target hardware event occurs. The ISR will queue the DPC once again, so your DPC routine will do the delayed processing on its next invocation.

What is the problem??? Why do you think DpcForIsr is not going to work for you???

Anton Bassov
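
The queueing rule described in the post above can be modelled in a few lines of ordinary user-mode C. This is only a sketch under stated assumptions: `ModelDpc`, `model_insert_queue_dpc`, `model_drain` and `model_requeue_routine` are hypothetical names, the "queue" holds a single object, and nothing here calls the real kernel API. It just mimics the two behaviours under discussion: an insert is refused while the object is already queued, and the object is marked dequeued before its routine runs, so an insert made during the routine succeeds.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct ModelDpc ModelDpc;
typedef void (*ModelDpcRoutine)(ModelDpc *dpc);

struct ModelDpc {
    ModelDpcRoutine routine;
    bool queued;        /* true while the object sits in the queue */
    int run_count;      /* how many times the routine has been invoked */
    int requeues_left;  /* how many times the routine may requeue itself */
};

/* Refuse the insert if the object is already queued, as the quoted
   documentation describes; otherwise mark it queued. */
bool model_insert_queue_dpc(ModelDpc *dpc)
{
    if (dpc->queued)
        return false;
    dpc->queued = true;
    return true;
}

/* Single-slot "queue": the object is marked dequeued BEFORE its routine is
   called, so an insert made while the routine runs is accepted. */
void model_drain(ModelDpc *dpc)
{
    while (dpc->queued) {
        dpc->queued = false;
        dpc->run_count++;
        dpc->routine(dpc);
    }
}

/* Example routine: requeues itself while it still has "work" left. */
void model_requeue_routine(ModelDpc *dpc)
{
    if (dpc->requeues_left > 0) {
        dpc->requeues_left--;
        model_insert_queue_dpc(dpc);
    }
}
```

In this model, three back-to-back inserts before a drain lead to one invocation of the routine, while an insert made from inside the routine leads to a second invocation.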

Okay… but ??

The standard programming model that drivers use is that, from the ISR, the driver queues a DpcForIsr. Like Anton said (I can’t BELIEVE I just typed that), all the quote says is that requests from the ISR to queue the DpcForIsr aren’t guaranteed to be 1:1 – in cases where your device has multiple requests outstanding simultaneously, your ISR can be invoked multiple times and can thus request the DpcForIsr multiple times, and yet have your DpcForIsr be invoked only once.

Therefore, in the DpcForIsr, the classic programming pattern is to do what we call The Grand Loop. Within the DpcForIsr, you check the status of your transfer operations: Any sends done? If so, go process those send completions. Any receives done? If so, go process them. Any device service that needs to be done, new buffers to be supplied, errors to be processed? If so, go and take care of those. THEN, go back to the beginning of the loop and repeat, until the device has been completely serviced.

In cases where The Grand Loop can take “too long” (100us is one way to view “too long”), you simply need to limit the number of activities/passes you perform in that loop, and before “too many” such activities or passes are completed, you re-queue your DpcForIsr (KeInsertQueueDpc).

If that’s what you’re doing… cool. But I was just trying to point out that you don’t need a custom DPC to do this, and in fact the standard programming pattern is to USE the DpcForIsr for this work.

Peter
OSR
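
For readers who want to see the shape of a bounded Grand Loop, here is a minimal user-mode C sketch. Everything in it is an assumption made for illustration: `ModelDevice`, `MODEL_MAX_PASSES`, `MODEL_BATCH` and `model_grand_loop` are hypothetical, the "device" is just a pair of counters, and the pass and batch limits are arbitrary numbers, not tuned values. The function returns `true` when work remains after the budget is spent, which is the point where a real driver would requeue its DPC (for example with KeInsertQueueDpc, as mentioned above) and return.

```c
#include <stdbool.h>

/* Hypothetical device state: counts of events the device has posted. */
typedef struct ModelDevice {
    int sends_done;     /* completed transmits awaiting processing */
    int receives_done;  /* received frames awaiting processing */
    int handled;        /* total events handled so far */
} ModelDevice;

/* Arbitrary illustrative limits, not recommendations. */
enum { MODEL_MAX_PASSES = 4, MODEL_BATCH = 8 };

/* Consume up to `limit` pending events and report how many were taken. */
static int model_take(int *pending, int limit)
{
    int n = *pending < limit ? *pending : limit;
    *pending -= n;
    return n;
}

/* One DPC invocation. Returns true if the caller should requeue the DPC
   because work remains after the pass budget has been used up. */
bool model_grand_loop(ModelDevice *dev)
{
    for (int pass = 0; pass < MODEL_MAX_PASSES; pass++) {
        if (dev->sends_done == 0 && dev->receives_done == 0)
            return false;   /* device completely serviced */
        dev->handled += model_take(&dev->sends_done, MODEL_BATCH);
        dev->handled += model_take(&dev->receives_done, MODEL_BATCH);
    }
    return dev->sends_done != 0 || dev->receives_done != 0;
}
```

Bounding the loop by passes rather than by elapsed time keeps the sketch simple; how many passes fit inside an acceptable DPC duration is a tuning question the thread returns to later.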

On Thu, 2010-01-14 at 04:19 -0500, xxxxx@hotmail.com wrote:

> What is the problem??? Why do you think DpcForIsr is not going to work for you???

It does work for us and it is used.

Recall that this ‘NIC’ is a pure software device in virtualization
space. The actual ‘bus’ is a lockless shared memory (between guest and
host) ring and has a software signaling mechanism that is multiplexed
via the bus driver’s interrupt object. This bus allows us to bypass
the traditional hardware emulation used in para-virtual space and save a
significant amount of hardware emulation overhead (and thus gain
performance, and much lower latency).

The DpcForIsr in the bus driver is used to post a signal to a specific
instance of the ring. This signal indicates that the ring has work to
do. This signaling mechanism takes the place of the traditional ISR in
a purely hardware scenario.

There is a kernel software driver on the ‘other side’ of the ring that
communicates directly with the NIC.

In the case of tx, we need to validate the completion status of the sent
packets and set the completion of the associated NET_BUFFER_LISTs. In
addition, we need to reclaim the elements of the ring associated with
the packets on those NET_BUFFER_LISTs.

In the case of rx, we need to pull packets off the ring and forward them
up the stack, and reclaim the elements of the ring.

These rings operate completely independently of each other. It takes 3
rings for each device: an event ring for the device itself, plus a tx
ring and an rx ring. The event ring is used (in the NIC case) for things
such as link up/down state and hotplug. The tx and rx rings are
self-explanatory.
Right now, the bus DpcForIsr merely queues a custom dpc for each ring
that performs its associated work.

So while we could do all the work within the context of the bus
DpcForIsr, we would impact the performance of the other rings within the
system.
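
A rough user-mode C sketch of that split, under loudly stated assumptions: the post does not say how the bus DPC learns which rings were signaled, so the bitmask used here is invented for illustration, as are the names `ModelRing`, `MODEL_RING_COUNT` and `model_bus_dpc`. The `dpc_queued` flag merely stands in for each ring's own DPC object.

```c
#include <stdbool.h>
#include <stdint.h>

enum { MODEL_RING_COUNT = 3 };  /* event, tx and rx ring of one device */

typedef struct ModelRing {
    bool dpc_queued;   /* stands in for the ring's own DPC object */
    int times_queued;  /* how often this ring's DPC has been queued */
} ModelRing;

/* Bus-level DPC: does no ring work itself, it only hands each signaled ring
   to that ring's own DPC. The signaled_mask parameter is an assumption of
   this sketch. Returns the number of DPCs newly queued. */
int model_bus_dpc(uint32_t signaled_mask, ModelRing rings[MODEL_RING_COUNT])
{
    int queued = 0;
    for (int i = 0; i < MODEL_RING_COUNT; i++) {
        if ((signaled_mask & (1u << i)) == 0)
            continue;   /* this ring was not signaled */
        if (rings[i].dpc_queued)
            continue;   /* already pending; it will pick up the new work */
        rings[i].dpc_queued = true;
        rings[i].times_queued++;
        queued++;
    }
    return queued;
}
```

Because the bus-level routine only flags work and returns, a slow ring delays its own DPC rather than the servicing of the other rings.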

In the Linux guests, we use tasklets for this purpose, which are a very
lightweight async mechanism. It appears that Windows DPCs are a
similar mechanism.

Best,
-PWM


> Like Anton said (I can’t BELIEVE I just typed that),

Thanks A LOT for a compliment - if Mr.Viscarola somehow feels uncomfortable about referring to someone, the person in question, whoever he is, gets a feeling of being in the same company with Mr.Raymond and Mr.Torvalds (shit, it looks like I have a delusion of grandeur already)…

Anton Bassov

> In the Linux guests, we use tasklets for this purpose, which are a very
> lightweight async mechanism. It appears that Windows DPCs are the
> similar mechanism.

Aren’t Linux tasklets more like Windows workitems? I thought the Linux
“bottom half” interrupt handler was more equivalent to a Windows DPC?
There are enough differences between the two operating systems that the
comparison isn’t really useful though.

If you are approaching the maximum amount of work to be done in a Dpc
(based on how many packets you think you can process before too much
time elapses), queue another Dpc and exit. The system will put your Dpc
on the end of the Dpc queue (giving other Dpc’s a chance to run) and get
back to you in good time. I don’t think this is strictly permitted
within NDIS (queuing another Dpc), but I doubt it will be a problem if
you do it right.

It’s hard to tune though… how many packets is too many to process per
Dpc? What if the user decides that their system must process network
packets above all else and they don’t particularly care if Dpc
processing time takes more than it should? (eg it’s a server not a
workstation). Ideally the Dpc should run really quick, just getting
packets off the ring and giving them to NDIS is cpu and memory bandwidth
bound, but you have to make calls to NDIS and NDIS can take quite a
while to process the packets you give it.

James

> Therefore, in the DpcForIsr, the classic programming pattern is to do what we
> call The Grand Loop. Within the DpcForIsr, you check the status of your
> transfer operations: Any sends done? If so, go process those send
> completions. Any receives done? If so, go process them. Any device service
> need to be done, new buffers supplied, errors process? If so, go and take
> care of those. THEN, go back to the beginning of the loop and repeat, until
> the device has been completely serviced.
>
> In cases where The Grand Loop can take “too long” (100us is one way to view
> “too long”), you simply need to limit the number of activities/passes you
> perform in that loop, and before “too many” such activities or passes are
> completed, you re-queue your DpcForIsr (KeInsertQueueDpc).

What is NDIS for ‘KeInsertQueueDpc’? The only way I could find to do
this in my NDIS driver was to not use the NDIS Dpc infrastructure and
use the KeXxx calls as you suggest.

Remember that if you do use KeInsertQueueDpc, Windows will schedule
your Dpc onto any CPU unless you tell it not to
(KeSetTargetProcessorDpc). This is different behaviour to NDIS 5.x (and
6.x?) where the Dpc is always scheduled onto CPU 0. You probably know
that, but I didn’t when I stopped using NDIS Dpc routines and it caused
a few bugs :)

Also, you have to synchronise your calls to KeInsertQueueDpc, which
almost certainly also means synchronising to your Isr (again, you
probably know that, but I didn’t :) ).

James


> What is NDIS for ‘KeInsertQueueDpc’? The only way I could find to do
> this in my NDIS driver was to not use the NDIS Dpc infrastructure and
> use the KeXxx calls as you suggest.

So curiously, the NDIS ‘environment’ abstraction does not include a DPC.
The typical way of handling this is to use an NDIS Timer (Miniport Timer)
and schedule it ‘immediately’. It has the same effect in NT (the only
surviving NDIS platform).

Good Luck,
Dave Cattley



On Fri, 2010-01-15 at 08:23 +1100, James Harper wrote:

>> In the Linux guests, we use tasklets for this purpose, which are a very
>> lightweight async mechanism. It appears that Windows DPCs are the
>> similar mechanism.
>
> Aren’t Linux tasklets more like Windows workitems? I thought the Linux
> “bottom half” interrupt handler was more equivalent to a Windows DPC?
> There are enough differences between the two operating systems that the
> comparison isn’t really useful though.

Good question. The only documentation I can find on ‘workitems’ relates
to userspace apps.

Yes, from what I glean, a DPC is equivalent to ‘bottom-half’ processing,
although these days there is a push to move that to a kernel thread in
Linux. (That is certainly true in RT Linux systems like SLERT.)

> If you are approaching the maximum amount of work to be done in a Dpc
> (based on how many packets you think you can process before too much
> time elapses), queue another Dpc and exit. The system will put your Dpc
> on the end of the Dpc queue (giving other Dpc’s a chance to run) and get
> back to you in good time. I don’t think this is strictly permitted
> within NDIS (queuing another Dpc), but I doubt it will be a problem if
> you do it right.
>
> It’s hard to tune though… how many packets is too many to process per
> Dpc? What if the user decides that their system must process network
> packets above all else and they don’t particularly care if Dpc
> processing time takes more than it should? (eg it’s a server not a
> workstation). Ideally the Dpc should run really quick, just getting
> packets off the ring and giving them to NDIS is cpu and memory bandwidth
> bound, but you have to make calls to NDIS and NDIS can take quite a
> while to process the packets you give it.

There are some options (within my space) wrt tuning. And some knobs
will be exposed to help with that. Right now the driver is still under
development, so until that is completed and we get to play with system
loads…

Thanks,
-PWM


>> What is NDIS for ‘KeInsertQueueDpc’? The only way I could find to do
>> this in my NDIS driver was to not use the NDIS Dpc infrastructure and
>> use the KeXxx calls as you suggest.
>
> So curiously, the NDIS ‘environment’ abstraction does not include a DPC.
> The typical way of handling this is to use an NDIS Timer (Miniport Timer)
> and schedule it ‘immediately’. It has the same effect in NT (the only
> surviving NDIS platform).

9:08 in the morning and I already learnt something new :)

Thanks

James

James, you are almost always a day ahead of the rest of us :)

Cheers,
Dave Cattley
