IRQL Priority Problems

Hoping someone can help us with a problem we’ve encountered with a KMDF driver on Windows 7. Our driver was written to support some custom hardware used to communicate with avionics equipment. Everything works well and we’re almost at the end of the project. We have one target device (out of many) that is giving us some problems. It appears to be a timing problem where we have slightly longer than expected gaps in transmission between buffers (32-word buffer).

The problem happens because our driver is CPU dependent and has no DMA support. The transmitter raises its buffer empty interrupt right before putting the last word in the buffer on the line. Most of the time, the interrupt gets handled efficiently and the buffer is refilled during the time the last word is being transmitted. In this case, the time delta between words is consistent and there is no gap. On occasion, it appears another driver in the system is preempting ours. This results in a delay in refilling the buffer and we get a longer than expected delta between words.

This is not a new problem with this driver. I had to make some adjustments to threading and affinity to fix it before. It was really, really bad, but I got it to the point where it is pretty consistent and interrupts get handled almost in real time (obviously, Windows is not real-time). I researched and researched, looking for a way to increase the IRQL of our driver and give it a higher priority so it can’t be preempted, but came up empty-handed.

Can anyone tell us if there is a way we can increase the priority? Or does anyone know of any tools that will help us figure out what driver is preempting ours (without causing further latency)?

Any help would be greatly appreciated.

Can you fill the buffer partially? If you can, have your device interrupt as soon as the buffer is not full, and have the driver just dump in enough to top it off. That way, you have a really big margin for tolerating interrupt latency.

If you can’t fill the buffer partially, have your device interrupt when there are “a few” bytes left (for some definition of “a few”; I’d make it N-1, so I’d get approximately the same latency tolerance), and have the DPC queue a work item that will spin on the remaining count register until the buffer is empty, then dump in a full load again. If you know approximately how long each byte takes, then for large remainders you can delay instead of spinning for most of the wait.
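A minimal sketch of both variants, written as a plain Python model rather than driver code. The FIFO depth of 32 words comes from the original post; the function names, the per-word time, and the two-word spin margin are hypothetical values picked only for illustration.

```python
FIFO_DEPTH = 32  # words; the buffer size given in the original post


def words_to_top_off(current_level, pending, depth=FIFO_DEPTH):
    """How many words to write so the FIFO is as full as the pending data allows."""
    if not 0 <= current_level <= depth:
        raise ValueError("current_level out of range")
    free_slots = depth - current_level
    return min(free_slots, pending)


def split_wait(remaining_words, word_time_us, spin_margin_words=2):
    """Split the wait for the FIFO to drain into (sleep_us, spin_us).

    Sleep through most of the remainder, then spin for the last
    spin_margin_words so a late wake-up is less likely to miss the
    moment the FIFO goes empty.
    """
    spin_words = min(remaining_words, spin_margin_words)
    sleep_words = remaining_words - spin_words
    return sleep_words * word_time_us, spin_words * word_time_us
```

With a made-up word time of 10 us and 31 words still queued, `split_wait(31, 10.0)` sleeps for 290 us and spins for the final 20 us. How large the spin margin has to be depends on how precise the delay is on the actual system, which would have to be measured.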

Phil

Not speaking for LogRhythm
Phil Barila | Senior Software Engineer

Are you refilling the buffer directly from the ISR? Or do you have a thread
that runs to do the work after the interrupt has fired?

No, there’s no option that says “I’m the most important thing” and thus
there is always the risk of this behavior on a Windows system. You may be
able to craft a set of devices and drivers that are known to work within
some limits (discovered through testing and measuring), but the sort of
tuning you want doesn’t typically exist in a general purpose OS.

Xperf is your friend, though it can be challenging to get started with it.
Just wrote an article for the latest issue of The NT Insider that might be
of use to you:

https://www.osr.com/nt-insider/2015-issue1/happiness-xperf/

-scott
OSR
@OSRDrivers

No, the ISR sets up a DPC.

That’s what all the research I’ve done has led me to conclude. I was hoping that perhaps we could change what class of device it is (or something) and get Windows to prioritize it higher.

Thank you! If I can figure out what driver it is causing us latency and it’s not system critical I can deal with it instead. That’s the fun of kernel-mode driver development: it’s like quantum mechanics, you can’t examine the system without changing it.

Cheers,
Brad


NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

I will have to look into the capabilities of the chipset we’re using. Off the top of my head, I’m not sure it does this kind of interrupting. Originally, we took a similar approach but ran into a race condition if you ran things fast enough. I’ve also considered just pushing data into the buffer as quickly as possible using a non-interrupt driven approach. Oh, the buffer isn’t full? Well, here’s more data. After we figured out what was causing us so many problems before (hyperthreading), it did not look like such measures would be necessary.

I really, really don’t want to make such a major change to the driver at this point. Testing passes on the majority of target devices. A change such as this would force us to retest on every target. This could delay the release of the product by months and we’re under the gun right now. This issue with one silly device is holding up everything.

Brad

As noted in your response to SNoone, you have made it appear to work most of the time. And if you can identify the device that is pushing you out of your timing spec, you can probably mitigate that by disabling that device, or taking some other approach.

I do hope you can resolve the issue with your one “device”. I suspect it’s actually the host system. If you are selling this as a general purpose thing, without controlling the host environment, it’s going to occur again, no matter how hard you try to sort out the issue with the single thing you’re currently chasing.

I hate to say it, but as long as your architecture doesn’t fit your environment, you’re always going to be looking over your shoulder at this kind of issue.

Since you mention interrupts, I assume that means a PCI device? If so, you might be able to make it work just by moving the PCI devices around to get a different interrupt, or different enumeration order, so your ISR gets invoked before that of the problem device.

Phil

If I had my choice, I would have developed this on Linux like the majority of our products that use this communication standard. Not my choice; we’re replacing a legacy product and it’s Windows-based.

The good news, the environment of devices we need to communicate with is pretty much static and we know every single one of them. This protocol is old and it’s not being used anymore, at least not on newer devices. They just use Ethernet. This is avionics though, so the existing devices will never be retired. This one device just can’t seem to handle it if there is any latency between words… pretty silly. But yes, this problem originates with the host and our driver. Our problem being that the previous version of this product works with the device, so we’ll get a lot of blowback over this (despite the new version working with 99% of the devices in use).

And yes, this is PCI… you would be terrified but we have to go through a PCIe-to-PCI bridge and then a PCI-to-local bus bridge. That’s how old this stuff is, designed to work in a different age of computing.

Thanks much for the advice,
Brad

Linux doesn’t have latency guarantees, either.

Your better bet would be to refill the buffer straight in the ISR. Even if your IRQ has a higher priority, the DPC runs only after all pending ISRs have finished, and it can still be preempted by other ISRs.

I might try this, but I think we already did; I appreciate the suggestion. The problem is that the target chip is so slow it’s not even on the local bus; we need another custom bridge. If I put the refill in the ISR instead of a DPC, our driver performs too far outside of specifications (the ISR wouldn’t exit for 100+ ms). This could be exceptionally bad because we can have two transmitters and four receivers all sharing the same interrupt. We don’t have the luxury of masking the interrupt that long.

Linux is far more deterministic than Windows is. If needed, you can modify the kernel or the task scheduler… you can’t do that in Windows. Or you can just change the APIC vectors.

Cheers,
Brad


On 14-Apr-2015 00:20, Brad Aswegan wrote:

>I was hoping that perhaps we could change what class of device it is (or something) and get Windows to prioritize it higher.

This is unlikely.
( For a longer explanation, see this excellent reply of Jamie Hanrahan:
http://bit.ly/1Ex4Z0G )

>If I can figure out what driver it is causing us latency and it’s not system critical I can deal with it instead.

Before doing anything more complicated, try to tweak the global timer
resolution: increase it to 2ms, 1ms … this may shuffle things around.
A quick way to do this - run the media player. Or write a little program
that calls NtSetTimerResolution and leave it running.
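
A rough sketch of such a little program in Python with ctypes, assuming the commonly reported signature of the undocumented `NtSetTimerResolution` export from ntdll (desired resolution in 100 ns units, a BOOLEAN set/clear flag, and an out pointer for the current resolution). The helper names are hypothetical, and on anything other than Windows the call is simply skipped.

```python
import ctypes
import sys


def ms_to_100ns_units(ms):
    """NtSetTimerResolution expresses its interval in 100-nanosecond units."""
    return int(round(ms * 10_000))


def request_timer_resolution(ms):
    """Ask Windows for a finer global timer resolution.

    Returns the resolution now in effect (100 ns units), or None when not
    running on Windows. The request only lasts while this process is alive.
    """
    if sys.platform != "win32":
        return None
    ntdll = ctypes.WinDLL("ntdll")
    current = ctypes.c_ulong(0)
    status = ntdll.NtSetTimerResolution(
        ctypes.c_ulong(ms_to_100ns_units(ms)),  # DesiredResolution
        ctypes.c_ubyte(1),                      # SetResolution = TRUE
        ctypes.byref(current),                  # receives CurrentResolution
    )
    if status != 0:  # STATUS_SUCCESS is 0
        raise OSError(f"NtSetTimerResolution failed: NTSTATUS {status:#x}")
    return current.value
```

To try it, call `request_timer_resolution(1)` and then keep the process running (for example, blocked on `input()`), since the request is dropped when the process exits.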

Regards,
– pa

Mr Grig wrote:

The OP replied:

It doesn’t have to be either/or. Your goal is to prevent underruns (the device running out of data), right? So, you only need to stuff enough data into the FIFO in your ISR to allow you to get to your DPC. THEN you finish stuffing the FIFO in your DPC.

So, in your ISR you would stuff just a couple of words into the FIFO, queue your DpcForIsr, and then let the DpcForIsr do the rest. This way you stay under the target ISR latency.
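
How many words count as “a couple” can be estimated from two measurements. The sketch below is only a model of that arithmetic; the function names and every number in the usage note are hypothetical, and the real DPC latency and per-word times would have to be measured on the target system.

```python
import math


def words_needed_in_isr(dpc_latency_us, word_time_us):
    """Fewest words the ISR must write to keep the transmitter busy until the DPC runs."""
    if word_time_us <= 0:
        raise ValueError("word_time_us must be positive")
    return math.ceil(dpc_latency_us / word_time_us)


def split_refill(total_words, dpc_latency_us, word_time_us, io_time_per_word_us):
    """Return (isr_words, dpc_words, isr_time_us) for one refill of the FIFO."""
    isr_words = min(total_words, words_needed_in_isr(dpc_latency_us, word_time_us))
    dpc_words = total_words - isr_words
    # Time the ISR spends on device writes; this is what must stay small.
    return isr_words, dpc_words, isr_words * io_time_per_word_us
```

For example, with an assumed worst-case DPC latency of 50 us, 20 us per transmitted word, and 3 us per device write, `split_refill(32, 50, 20, 3)` puts 3 words in the ISR (about 9 us of I/O) and leaves 29 for the DPC.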

Peter
OSR
@OSRDrivers

>This could be exceptionally bad because we can have two transmitters and four receivers all sharing the same interrupt. We don’t have the luxury of masking the interrupt that long.

You’re screwed. I suppose you meant 100 us to stuff 32 words, not 100 ms+, because that would mean the hardware is very badly designed. Even 3 us per word is pretty bad.

If you have to service 6 channels on the same interrupt, can each of them withstand 500 us of service latency? Because that’s what they may see. You can use MSI-X and put them on different messages directed to different processors.
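
The 500 us figure appears to follow from five other channels each taking about 100 us ahead of yours. A small sketch of that arithmetic, with hypothetical function names and the assumption that channels are serviced strictly one after another:

```python
def worst_case_wait_us(channels, service_time_us):
    """Longest a channel can wait if every other channel is serviced first."""
    return (channels - 1) * service_time_us


def channel_survives(channels, service_time_us, tolerance_us):
    """True if a channel can tolerate being serviced last."""
    return worst_case_wait_us(channels, service_time_us) <= tolerance_us
```

`worst_case_wait_us(6, 100)` gives 500, so a channel whose FIFO only buys it one word time of slack would not survive being last in line.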

If the problem is not in ISR latency, but in DPC latency, you can use targeted DPCs (affinitized to a target CPU), and queue a bunch of them to different CPUs from your ISR. Whichever gets to execute first (use an interlocked operation to catch that) services the channel. This way, you can service different channels in parallel on different CPUs (and overcome DPC head of line blocking), even if you have a single ISR.
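
The first-to-run-wins idea can be modeled in user mode with threads standing in for DPCs. This is only a simulation: in a driver the flag would be a LONG updated with an interlocked compare-exchange, whereas here a lock emulates that atomic operation. The class and function names are hypothetical, and resetting the flag for the next interrupt is left out.

```python
import threading


class ClaimFlag:
    """Stand-in for a LONG that a driver would update with an interlocked compare-exchange."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def try_claim(self):
        """Atomically change 0 -> 1; returns True only for the caller that made the change."""
        with self._lock:
            if self._value == 0:
                self._value = 1
                return True
            return False


def run_racing_dpcs(cpu_count):
    """Queue one simulated DPC per CPU and return the CPUs that serviced the channel."""
    flag = ClaimFlag()
    serviced_by = []

    def dpc(cpu):
        if flag.try_claim():         # the first DPC to run wins the channel
            serviced_by.append(cpu)  # the FIFO refill would happen here
        # losing DPCs return without touching the hardware

    threads = [threading.Thread(target=dpc, args=(cpu,)) for cpu in range(cpu_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return serviced_by
```

However many DPCs are queued, the returned list always has exactly one entry, which is the property that keeps two CPUs from refilling the same channel at once.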

If I’m understanding your issue correctly the undesired behavior is that there is too much latency between subsequent transmit WORDs due to an underrun condition.

The sequence of events is:
(1) Receive a buffer empty interrupt.
(2) Run your DPC to put more data in the buffer.
The I/O with hardware in the DPC has very slow transactions, so the transmitter is able to transmit the entire contents of the buffer before you can provide more data to transmit. This introduces latency between subsequent WORDs.

It may seem that responding to the interrupt more quickly would be a suitable solution, but really the underlying issue is that you need to keep the buffer from emptying. One approach would be to look for an alternate interrupt condition that would notify you when the buffer is ‘almost’ empty. This would give you time to add more data to the buffer before the transmitter drains it.

What you really need to figure out is just how long your I/O transactions with the device are taking. Then you need to instrument how quickly your transmitter can drain the buffer. If you know these two things you can then figure out just how low you can allow the buffer to get before you are approaching your underrun condition. If you don’t know this information and need a shotgun approach, your best option is likely to abandon the interrupt-driven model and fill the buffer in a separate thread. You could give the thread an arbitrary wait interval, with its sole responsibility being to keep feeding that buffer.
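
Once both measurements exist, the “how low can the buffer get” question is simple arithmetic. A sketch under stated assumptions: the function names are hypothetical, the latency figure is whatever worst-case response time was measured, and the numbers in the usage note are invented.

```python
import math


def min_trigger_level(latency_us, io_time_per_word_us, tx_time_per_word_us):
    """Lowest FIFO level (in words) at which the refill interrupt can fire
    and the first new word still lands before the transmitter runs dry."""
    time_to_first_word = latency_us + io_time_per_word_us
    return math.ceil(time_to_first_word / tx_time_per_word_us)


def can_sustain(io_time_per_word_us, tx_time_per_word_us):
    """The writer has to put words in faster than the transmitter takes them out."""
    return io_time_per_word_us < tx_time_per_word_us
```

With an assumed 50 us response latency, 3 us per device write, and 20 us per transmitted word, `min_trigger_level(50, 3, 20)` is 3, meaning the interrupt would need to fire with at least three words still queued. If `can_sustain` is false, no trigger level helps and the FIFO will eventually drain regardless.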

I’m kind of in the same boat.

I’d like to know more about how he created the timer. Is this a WdfTimer object? What period did you choose for the timer to expire? What execution level is the timer running at? Passive? Did you try to adjust the global timer resolution too?

I *think* what I’ve gathered so far is that the FIFO depth is 32 words. If he fills the FIFO in the ISR rather than the DPC it was taking ~100us to fill the FIFO. I guess we can assume that it takes roughly 100us to fill the FIFO due to slow I/O.

We don’t have any idea how long it takes the transmitter to drain the FIFO. I have to assume it’s much faster, but I don’t know if there is any flow control mechanism that would pace the transmitter.

The *solution* was to set up a periodic timer to keep the FIFO from ever getting empty. That assumes you can get a callback and enqueue another word into the FIFO faster than your transmitter can drain the FIFO. If we knew how long it took the transmitter to drain the entire FIFO, AND knew how long it took for a single I/O transaction, we could easily determine how frequently you need to put more data in the FIFO. Just because we know that number doesn’t necessarily mean we can force Windows to wake on that periodic interval, though. It *sounds* like he chose an arbitrary number, said three Hail Marys, and let it run all day.
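
For what it’s worth, the calculation described above might look like the sketch below. It assumes each timer callback tops the FIFO back up to full, and every name and number is hypothetical; the drain time and the I/O time are exactly the two values nobody in the thread has measured yet.

```python
def max_timer_period_us(fifo_depth, tx_time_per_word_us, io_time_per_word_us, wake_jitter_us):
    """Longest timer period that still gets a new word into a full FIFO before it drains."""
    drain_time_us = fifo_depth * tx_time_per_word_us
    return drain_time_us - wake_jitter_us - io_time_per_word_us


def period_is_achievable(max_period_us, timer_resolution_us):
    """A period shorter than the system timer resolution cannot be relied on."""
    return max_period_us >= timer_resolution_us
```

With a 32-word FIFO, an assumed 20 us per transmitted word, 3 us per write, and 100 us of wake-up jitter, the period has to stay under 537 us. A timer resolution of 1 ms could not deliver that, which is the “can we force Windows to wake on that interval” problem in numbers.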