Hi Tim, appreciate your help but I am not sure if I can post the whole code. Apologies if the digest below doesn’t help. Perhaps I can send the whole thing via email?
Below is a shortened version and an explanation of why it’s done this way. Maybe you’ll spot a problem right away.
ISR (not shown here)
- ISR can get fired very often, in extreme cases it could be 20000x a second (e.g. at 20000 FPS).
- HOWEVER: In this particular case the rate would be much lower, around 10 ISRs a second.
- Since we had issues with skipped frames at very high rates we implemented an “ISR event queueing”, as described here.
- ISR is called on every frame written, i.e. the device interrupts: “hey, I just DMA’d a frame into your circ. buffer”
- ISR retrieves a preallocated event item (ExInterlockedPopEntrySList), fills the event with a simple frame counter and a timestamp from KeQueryPerformanceCounter.
- ISR queues the event into a custom ISR-DPC queue (ExInterlockedInsertTailList, max 64 items capacity).
- ISR then calls WdfInterruptQueueDpcForIsr (if the same interrupt arrives again before the DPC is executed, the WdfInterruptQueueDpcForIsr call should fail, but the ‘event’ is still queued before that).
- ISR routine leaves.
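To make the ISR side concrete without posting the driver code, here is a minimal user-space model of the ISR-DPC queue in plain C. It is only a sketch under assumptions: the names IrqEvent, IrqQueue, irq_queue_push and irq_queue_pop are hypothetical, a fixed ring stands in for the interlocked lists, and there is no locking at all, so it illustrates the capacity and ordering behaviour only, not the real synchronization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IRQ_EVENTS 64  /* queue capacity quoted in the post */

/* Hypothetical stand-in for the driver's event item. */
typedef struct IrqEvent {
    uint32_t eof_count;  /* simple frame counter */
    int64_t  timestamp;  /* raw performance-counter ticks */
} IrqEvent;

/* Fixed-size ring standing in for the ISR-to-DPC queue (no locking). */
typedef struct IrqQueue {
    IrqEvent items[MAX_IRQ_EVENTS];
    size_t   head;   /* index of the oldest queued event */
    size_t   count;  /* number of events currently queued */
} IrqQueue;

/* "ISR" side: record one frame event.
 * Returns 1 on success, 0 if the queue is full and the event is dropped. */
static int irq_queue_push(IrqQueue *q, uint32_t eof_count, int64_t ticks)
{
    assert(q != NULL);
    if (q->count == MAX_IRQ_EVENTS)
        return 0;
    size_t tail = (q->head + q->count) % MAX_IRQ_EVENTS;
    q->items[tail].eof_count = eof_count;
    q->items[tail].timestamp = ticks;
    q->count++;
    return 1;
}

/* "DPC" side: remove the oldest event.
 * Returns 1 and fills *out on success, 0 if the queue is empty. */
static int irq_queue_pop(IrqQueue *q, IrqEvent *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return 0;
    *out = q->items[q->head];
    q->head = (q->head + 1) % MAX_IRQ_EVENTS;
    q->count--;
    return 1;
}
```

In this model a 65th push without an intervening pop is rejected, which corresponds to a dropped frame notification once the event pool is exhausted.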
DPC (below)
- DPC iterates over the ‘ISR event queue’ (PmIrqEventDequeue → ExInterlockedRemoveHeadList).
- DPC checks the mask for the EOF event (there might be different flags used for other purposes).
- DPC queues a ‘notification’ event into a second, separate queue. That queue is processed by a thread at passive level, which sends notifications to user space.
- (In short: there is one queue between the ISR and the DPC, and a different queue between the DPC and the passive-level thread.)
- Meanwhile, the ISR may get fired again, pushing another event to the queue; that’s why we limit the DPC loop with MAX_IRQ_EVENTS.
So far this has been working fine, although I’m not an experienced kernel developer. If there is a problem with the approach itself, I’ll be happy to learn. I am also investigating the possibility that the division by Frequency really is the problem and that Frequency gets corrupted (it’s updated at the start of the acquisition). But again, this has been flawless for years and it still is on 99% of the computers we tested, even with very high frame rates. So far, only one computer at a customer’s site reports these weird bugchecks.
Function itself
VOID PmEvtInterruptDpc(IN WDFINTERRUPT WdfInterrupt, IN WDFOBJECT WdfDevice) {
    PDEVICE_EXTENSION DeviceExtension = GetDeviceContext(WdfDevice);
    ULONG ulProcessed = 0;
    PPM_IRQ_EVENT irqEvt = NULL;
    while ((irqEvt = PmIrqEventDequeue(DeviceExtension)) != NULL) {
        if (irqEvt->InterruptMask & PMPCIE_INT_MASK_EOF) {
            // We need to store the last timestamps because these are still used with status IOCTL
            DeviceExtension->TimeStamp = irqEvt->TimeStamp;
            DeviceExtension->TimeStamp.QuadPart -= DeviceExtension->AcquisitionStartTime.QuadPart;
            DeviceExtension->TimeStamp.QuadPart *= RES100US;
            DeviceExtension->TimeStamp.QuadPart /= DeviceExtension->Frequency.QuadPart; //< **** DIVISION HERE! ****
            PM_ACQ_METADATA acqMd;
            acqMd.EofCount = irqEvt->EofCount;
            acqMd.TimeStampEof = DeviceExtension->TimeStamp.QuadPart;
            PmQueueAcqEvent(DeviceExtension, PM_EVENT_DING_NOTIFICATIONS, (void *)NOTIFICATION_EOF, &acqMd); //< **** LINE 188 ****, bugcheck sometimes reported here, division by zero
            KeSetEvent(&DeviceExtension->AcqEvent, 0, FALSE);
            irqEvt->InterruptMask &= (ULONG)(~PMPCIE_INT_MASK_EOF); // Clear the flag
        }
        ulProcessed++;
        // Push the event object back for reuse
        PmIrqEventReturnForReuse(DeviceExtension, irqEvt);
        // Since we could theoretically spend infinite time in this loop (because ISR
        // can keep filling the queue, we should stop at some point and give up)
        if (ulProcessed > MAX_IRQ_EVENTS) // Max 64 items
        {
            DbgPrint("PMPCIE: Too many IRQs pending in DPC, quitting.");
            break;
        }
    }
    return;
}