Interrupts and DPC - race condition?

Mark_McDougall · March 28, 2007, 10:41pm

I’ve got a lock-up in my (KMDF) driver and my current theory is that it
is related to my interrupt processing.

However, thinking the problem through I’m stumped on what appears to
me to be a race condition without a solution…

My ISR clears the hardware interrupt source and sets a flag in volatile
memory that can be accessed by the InterruptDPC callback to indicate
what type of interrupt (there are several) has occurred.

The InterruptDPC callback then uses WdfInterruptAcquireLock to access
the flags in order to process each of the interrupt sources. Finally it
resets the flag and then releases the interrupt lock.

Sounds OK in theory.

However, a hardware interrupt can be raised at any time, including of
course whilst the DPC is running. Obviously whilst it holds the
interrupt lock, the DPC can’t be pre-empted. But that doesn’t stop the
interrupt being serviced immediately after the lock is released, but
before the DPC has exited and been de-queued.

Problem is, the new interrupt ISR attempts to queue the DPC and it
fails. The interrupt is cleared and the flag is set, but no DPC
executes. Due to the nature of the driver, further interrupts for this
device/driver are never generated (because the DPC didn’t execute) and
the system grinds to a halt.

No amount of looping/checking in the DPC is going to overcome this basic
problem - namely that there is a window during which the DPC has
released the interrupt lock but hasn’t been dequeued, in which an
interrupt may occur which attempts to queue another DPC.

Can anyone confirm that my thinking thru of this scenario is correct? If
this is the case, then how is this general condition avoided? It appears
to me to be a fundamental problem with the single-DPC-queued mechanism
that others must have encountered before?

Or am I missing something?

Regards,

–
Mark McDougall, Engineer
Virtual Logic Pty Ltd, http:
21-25 King St, Rockdale, 2216
Ph: +612-9599-3255 Fax: +612-9599-3266

Doron_Holan · March 28, 2007, 10:56pm

By the time the DPC executes, it has already been dequeued and if the ISR executes before the DPC runs, or while it is running and requests a DPC to be queued, it will be queued up again (and could run concurrently with the first DPC if you are on an MP machine). You most likely need to store history in your extension, not just the single previous state.

Since the DPC is already dequeued by the time the DPC is executing, if the ISR runs right after the DPC releases the interrupt lock, the queueing of the DPC should work just fine and the DPC should execute some time later.
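The ordering described here can be sketched with a small user-mode model. This is only an illustration under stated assumptions: `ModelDpc`, `model_queue_dpc` and `model_dispatch` are hypothetical names, not kernel APIs, and the model is single-threaded. It shows that because the "queued" state is cleared before the routine is called, a queue request made while the routine is running succeeds and leads to a second run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-mode model of a DPC object; not a kernel structure. */
typedef struct ModelDpc {
    bool queued;                          /* true while waiting in the queue */
    int  runs;                            /* how many times the routine ran  */
    void (*routine)(struct ModelDpc *);
} ModelDpc;

/* Models a queue request: it fails only if the DPC is already queued. */
static bool model_queue_dpc(ModelDpc *dpc)
{
    if (dpc->queued)
        return false;
    dpc->queued = true;
    return true;
}

/* Models the dispatcher: dequeue first, then call the routine. */
static void model_dispatch(ModelDpc *dpc)
{
    while (dpc->queued) {
        dpc->queued = false;              /* dequeued before the routine runs */
        dpc->routine(dpc);
    }
}

static bool requeue_result;               /* result of the mid-DPC request */

/* DPC routine that simulates an interrupt arriving while it is running. */
static void dpc_routine(ModelDpc *dpc)
{
    dpc->runs++;
    if (dpc->runs == 1)
        requeue_result = model_queue_dpc(dpc);   /* the "ISR" fires here */
}

/* Runs the scenario and returns the number of times the routine executed. */
static int run_scenario(void)
{
    ModelDpc dpc = { false, 0, dpc_routine };
    bool first = model_queue_dpc(&dpc);   /* first interrupt queues the DPC */
    bool dup   = model_queue_dpc(&dpc);   /* second request while still queued */
    assert(first && !dup);                /* only the duplicate is refused */
    model_dispatch(&dpc);
    return dpc.runs;
}
```

In this model the only request that is refused is the one made while the DPC is still waiting in the queue, and in that case a run is already pending, so no work is lost.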

d

OSR_Community_User · March 28, 2007, 11:19pm

You should set up your driver so that the information the ISR reads from the card gets merged into an “interrupt state” structure that you share with your DPC.

The ISR reads the interrupt, sets some flags in the interrupt state structure, maybe enters buffers it used to collect data or requests that completed into a queue in the shared structure, then requests the DPC, acknowledges the interrupt and returns. When another interrupt comes in it merges the new information into this same structure.

In your DPC you grab the interrupt lock, copy the accumulated interrupt state out to memory owned by the DPC, clear the interrupt state data, then drop the interrupt lock. Then you can process the previous set of accumulated interrupt state as the next set accrues.
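A minimal user-mode sketch of that merge-and-snapshot pattern follows. The structure layout, the field names and the `lock_acquire`/`lock_release` helpers are assumptions made for illustration; in a KMDF driver the DPC side would use WdfInterruptAcquireLock and WdfInterruptReleaseLock, and the ISR already runs with the interrupt lock held.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical shared state; the field names are illustrative only. */
typedef struct InterruptState {
    uint32_t flags;         /* OR of every interrupt cause seen so far  */
    uint32_t completions;   /* number of completed requests seen so far */
} InterruptState;

typedef struct DeviceContext {
    InterruptState pending; /* written by the ISR, drained by the DPC */
    int lock_depth;         /* stand-in for the interrupt lock        */
} DeviceContext;

/* Stand-ins for acquiring and releasing the interrupt lock. */
static void lock_acquire(DeviceContext *ctx)
{
    assert(ctx->lock_depth == 0);
    ctx->lock_depth++;
}

static void lock_release(DeviceContext *ctx)
{
    assert(ctx->lock_depth == 1);
    ctx->lock_depth--;
}

/* ISR side: merge new information, never overwrite what is pending.
 * No lock call here because the real ISR already holds the lock. */
static void isr_merge(DeviceContext *ctx, uint32_t cause, uint32_t done)
{
    ctx->pending.flags |= cause;
    ctx->pending.completions += done;
}

/* DPC side: copy the accumulated state out and clear it under the lock,
 * then return the private copy for processing outside the lock. */
static InterruptState dpc_snapshot(DeviceContext *ctx)
{
    InterruptState local;

    lock_acquire(ctx);
    local = ctx->pending;
    memset(&ctx->pending, 0, sizeof ctx->pending);
    lock_release(ctx);
    return local;
}
```

Two interrupts that arrive before the DPC runs are seen as one combined snapshot, and anything that arrives after the snapshot accumulates for the next one.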

-p

Mark_McDougall · March 29, 2007, 12:09am

xxxxx@Microsoft.com wrote:

> By the time the DPC executes, it has already been dequeued and if the
> ISR executes before the DPC runs, or while it is running and requests
> a DPC to be queued, it will be queued up again (and could run
> concurrently with the first DPC if you are on an MP machine). You
> most likely need to store history in your extension, not just the
> single previous state.
>
> Since the DPC is already dequeued by the time the DPC is executing,
> if the ISR runs right after the DPC releases the interrupt lock, the
> queueing of the DPC should work just fine and the DPC should execute
> some time later.

OK, thanks again Doron for clearing that up!

Regards,

–
Mark McDougall, Engineer
Virtual Logic Pty Ltd, http:
21-25 King St, Rockdale, 2216
Ph: +612-9599-3255 Fax: +612-9599-3266

Mark_McDougall · March 29, 2007, 12:53am

Peter Wieland wrote:

> You should setup your driver so that the information the ISR reads
> from the card gets merged into an “interrupt state” structure that
> you share with your DPC.
>
> The ISR reads the interrupt, sets some flags in the interrupt state
> structure, maybe enters buffers it used to collect data or requests
> that completed into a queue in the shared structure, then requests
> the DPC, acknowledges the interrupt and returns. When another
> interrupt comes in it merges the new information into this same
> structure.
>
> In your DPC you grab the interrupt lock, copy the accumulated
> interrupt state out to memory owned by the DPC, clear the interrupt
> state data, then drop the interrupt lock. Then you can process the
> previous set of accumulated interrupt state as the next set accrues.

That’s the essence of what I’m doing - it’s just that my (erroneous)
understanding of when DPCs were dequeued led me to believe that there
was still a race condition in this sequence of events.

But Doron has hit that nail on the head - and I need to look elsewhere
for my system freeze.

Thanks for your input.

Regards,

–
Mark McDougall, Engineer
Virtual Logic Pty Ltd, http:
21-25 King St, Rockdale, 2216
Ph: +612-9599-3255 Fax: +612-9599-3266

Maxim_S_Shatskih · March 29, 2007, 5:45am

> Problem is, the new interrupt ISR attempts to queue the DPC and it
> fails. The interrupt is cleared and the flag is set, but no DPC
> executes.

Re-check the flag before DPC exit.

Actually, create a loop in your DPC of:

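    for (;;)
    {
        AcquireInterruptLock();
        LocalFlag = DevExt->Flag;   // snapshot the pending flags
        DevExt->Flag = 0;           // clear them while still holding the lock
        ReleaseInterruptLock();
        if (LocalFlag == 0)
            return;                 // nothing new arrived, so we are done
        // Process according to LocalFlag bits
        …
    }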

> problem - namely that there is a window during which the DPC has
> released the interrupt lock but hasn’t been dequeued

The DPC is dequeued before it runs, so the above loop is safe. The ISR should just
a) query the hardware status register, b) update the hardware to dismiss the
interrupt, c) set DevExt->Flag, and d) call KeInsertQueueDpc, ignoring its return value.
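The four ISR steps and the drain loop can be exercised together in a small single-threaded simulation. Everything here is an assumption made for illustration: `SimDevice`, `sim_isr`, `sim_dpc`, `sim_queue_dpc` and the `inject` field are hypothetical, the hardware register is an ordinary variable, and the interrupt lock is only marked by comments. The `inject` field makes a second interrupt arrive while the DPC is in the middle of its first pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct SimDevice {
    uint32_t hw_status;   /* simulated hardware status register         */
    uint32_t flag;        /* plays the role of DevExt->Flag              */
    bool     dpc_queued;  /* simulated "DPC is in the queue" state       */
    uint32_t inject;      /* cause bits to raise during the next pass    */
    uint32_t processed;   /* every bit the DPC has handled, for checking */
    int      passes;      /* loop passes that found work                 */
    int      dpc_runs;    /* times the DPC routine was called            */
} SimDevice;

/* Simulated queue request: refused only if the DPC is already queued. */
static bool sim_queue_dpc(SimDevice *dev)
{
    if (dev->dpc_queued)
        return false;
    dev->dpc_queued = true;
    return true;
}

static void sim_isr(SimDevice *dev)
{
    uint32_t status = dev->hw_status;   /* a) query the status register */
    dev->hw_status = 0;                 /* b) dismiss the interrupt     */
    dev->flag |= status;                /* c) record the cause bits     */
    (void)sim_queue_dpc(dev);           /* d) queue, ignoring the result */
}

static void sim_dpc(SimDevice *dev)
{
    for (;;) {
        /* A real driver would hold the interrupt lock for these two lines. */
        uint32_t local = dev->flag;
        dev->flag = 0;

        if (local == 0)
            return;                     /* nothing pending */

        dev->processed |= local;        /* "process according to the bits" */
        dev->passes++;

        if (dev->inject != 0) {         /* simulated interrupt mid-DPC */
            dev->hw_status = dev->inject;
            dev->inject = 0;
            sim_isr(dev);
        }
    }
}

/* Simulated dispatcher: dequeue first, then run the routine. */
static void sim_run_queued_dpcs(SimDevice *dev)
{
    while (dev->dpc_queued) {
        dev->dpc_queued = false;
        dev->dpc_runs++;
        sim_dpc(dev);
    }
}
```

In this simulation the interrupt that arrives mid-DPC is picked up by the second pass of the loop, and the extra DPC it queued later finds the flag already zero and returns at once, which is why the result of the queue request does not need to be checked.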

–
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim_S_Shatskih · March 29, 2007, 5:56am

> By the time the DPC executes, it has already been dequeued and if the ISR
> executes before the DPC runs, or while it is running and requests a DPC to be
> queued, it will be queued up again (and could run concurrently with the first
> DPC if you are on an MP machine). You most likely need to store history in your
> extension, not just the single previous state.

Correct, the OP should care about the fact his DPC can run on 2 CPUs
simultaneously.

My pseudo-code in another post solves this issue too by setting DevExt->Flag to
zero under the interrupt lock.

–
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com