Hi Everyone,
I’ve written a driver for a PCI Express card, and everything works fine except when the device is present in the system during boot. When the device is present at startup, the ISR in my driver gets called continuously even though my device has not asserted an interrupt over the PCI Express bus. After Windows has started I can go into Device Manager, disable the device and re-enable it; the interrupt storm then stops and the device works perfectly, including DMA operations that make use of the interrupt.
From the Device Manager I can see that my device has been assigned IRQ 16, which is shared with the graphics card for the machine I’m testing on.
To give you some more info, the driver is using KMDF 1.7 and the machine is using Windows Vista SP1.
So to sum up the problem, the driver is getting an interrupt storm if it’s loaded when the OS boots, but if the driver is loaded afterwards it works fine. Has anyone ever seen a problem like this before or have any idea why this is happening? Let me know if you need to know any more details. Thanks.
-Jeff
What’s the interrupt rate that you are seeing?
Since your card shares the same interrupt with your graphics card, it’s
pretty normal that your ISR gets called even if your card is not generating
an interrupt. This is how interrupt sharing works. You just need to return
FALSE from your ISR to indicate that you didn’t service the interrupt
because it’s not yours.
Have a nice day
GV
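A minimal sketch of the pattern GV describes, written as portable C rather than real driver code. The register layout (`int_status`, `INT_PENDING`) and the function name `shared_isr` are assumptions made for illustration; an actual KMDF ISR would read the device's own status register through its mapped BAR and return a `BOOLEAN`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register layout: bit 0 of the status register is set
 * while this device has an interrupt pending. */
#define INT_PENDING 0x1u

typedef struct {
    volatile uint32_t int_status;   /* stands in for a memory-mapped register */
} device_regs;

/* Model of an ISR on a shared line. Returns true only when this device
 * raised the interrupt, mirroring the TRUE/FALSE contract of a Windows ISR. */
bool shared_isr(device_regs *regs)
{
    uint32_t status = regs->int_status;

    if ((status & INT_PENDING) == 0) {
        return false;   /* not ours: let the next ISR on the line run */
    }

    /* Acknowledge the interrupt so the device stops asserting the line. */
    regs->int_status = status & ~INT_PENDING;
    return true;        /* ours, and serviced */
}
```

Calling `shared_isr` on a device whose status reads 0 returns false without touching the hardware, which is the case Jeff reports seeing on every call.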
----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Wednesday, October 29, 2008 8:43 AM
Subject: [ntdev] PCIe Driver Getting Interrupt Storm
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
Define what YOU specifically mean by “interrupt storm”, please?
If you move your card to another PCIe slot (in an attempt to get a different IRQ assigned)… what happens?
Peter
OSR
Thanks for the replies.
Yes, I understand that my ISR could be called any time that another device that shares an interrupt vector with mine asserts an interrupt. Each time the ISR is called during the “interrupt storm” I can see the driver reading a value of 0 (meaning device is not the interrupt source), and in such a case my driver returns FALSE to allow an ISR to be called for the driver whose device DID cause the interrupt.
Peter, by interrupt storm I mean to say that my ISR is being called constantly from the time that it is first loaded. The machine runs extremely slowly, and Task Manager shows CPU usage of 100%. When I open DbgView I see the print statements from my ISR scrolling by constantly. By interrupt storm I DO NOT mean the device is constantly asserting interrupts, or that an interrupt is being asserted but never handled/deasserted. The PCI Express endpoint is on a Xilinx FPGA, and I have verified using Chipscope (FPGA signal capture tool) that our device is never asserting a single interrupt during this “interrupt storm”. I tried putting the card in my other PCI Express slot this morning, and it was still assigned to the same IRQ and I saw the same interrupt storm problem. Right now I’m testing with another machine. I’ll give an update if I learn something new.
-Jeff
Having a DbgPrint in your ISR can slow down your system A LOT. I would try
to remove any tracing from your ISR and see if you still experience slow
boot times.
Have a nice day
GV
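One way to follow GV's suggestion and still answer the earlier question about the interrupt rate is to count calls in the ISR and do the reporting elsewhere. The sketch below is a portable C model, not driver code; `isr_stats`, `counting_isr` and `calls_per_second` are hypothetical names, and a real driver would have to decide how the counters are read safely outside the ISR.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t calls;     /* every invocation of the ISR */
    uint32_t claimed;   /* invocations where our device was the source */
} isr_stats;

/* Hypothetical ISR body: no printing, only counter increments.
 * int_status is the value read from the device's status register. */
bool counting_isr(isr_stats *stats, uint32_t int_status)
{
    stats->calls++;
    if (int_status == 0) {
        return false;   /* not our interrupt */
    }
    stats->claimed++;
    return true;
}

/* ISR calls per second over a sampling window, computed outside the ISR
 * from two snapshots of the counters. */
uint32_t calls_per_second(const isr_stats *before, const isr_stats *after,
                          uint32_t elapsed_ms)
{
    if (elapsed_ms == 0) {
        return 0;
    }
    uint64_t delta = (uint32_t)(after->calls - before->calls);
    return (uint32_t)(delta * 1000u / elapsed_ms);
}
```

If the machine is still slow with the tracing removed and the rate stays very high, the print statements were not the main cost.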
It seems likely that this is a result of a bug in the video adapter, or more
likely the video BIOS. During boot, the video chip is being run by code
that is treating it as a VGA device, with no interrupt connected. If the
video chip is asserting an interrupt, that’s not a big deal unless somebody
unmasks that I/O APIC input. Your driver did just that by calling
IoConnectInterrupt[Ex]. Now the interrupt for video is unmasked and
unhandled, leading to the behavior that you’re seeing.
I agree with Peter. It would be interesting to see what happens with your
adapter in a different slot.
–
Jake Oshins
Hyper-V I/O Architect (former interrupt guy)
Windows Kernel Team
This post implies no warranties and confers no rights.
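Jake's explanation can be illustrated with a toy model of a level-triggered shared line. This is a simulation in portable C under simplifying assumptions (one shared line, each connected ISR clears only its own device, the input stays masked until some driver connects); `sim_device` and `run_line` are hypothetical names and nothing here touches a real APIC.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool asserting;   /* device is currently asserting the shared line */
    bool has_isr;     /* some driver has connected an ISR for this device */
} sim_device;

/* A level-triggered line stays asserted while any device asserts it. */
static bool line_asserted(const sim_device *devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i].asserting) {
            return true;
        }
    }
    return false;
}

/* Deliver the interrupt until the line is deasserted or max_rounds is hit.
 * Returns how many times the ISR chain was walked. */
size_t run_line(sim_device *devs, size_t n, size_t max_rounds)
{
    bool any_isr = false;
    for (size_t i = 0; i < n; i++) {
        if (devs[i].has_isr) {
            any_isr = true;
        }
    }
    if (!any_isr) {
        return 0;   /* nobody connected: input stays masked, nothing delivered */
    }

    size_t rounds = 0;
    while (rounds < max_rounds && line_asserted(devs, n)) {
        rounds++;
        for (size_t i = 0; i < n; i++) {
            if (devs[i].has_isr && devs[i].asserting) {
                devs[i].asserting = false;   /* this ISR services its own device */
            }
        }
    }
    return rounds;
}
```

With a video device that asserts but has no ISR, and a second device that has an ISR but never asserts, `run_line` runs until the cap: the chain is walked over and over and nobody can clear the line. Give the video device an ISR and the same setup finishes after one round.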
Thanks for the input Jake. Like I said in the last post, my machine only has one other PCI express slot, and when I tried that slot my device was still being assigned the same IRQ. If you know of any other way to work around or determine if this is the issue let me know. In the meantime I’ll be trying to get the device and driver installed on my other machine to see if the issue exists there as well.
-Jeff
Does your device support message signaled interrupts? If so, you
could try switching to them under Vista and see what happens.
Also, do you have reason to believe that the device’s interrupt
support is working correctly? I’ve run into cases very similar to
what you describe, and the ultimate cause was the device not properly
updating its interrupt pending register. Since the driver didn’t see
any pending interrupts it wouldn’t try and dismiss the interrupt. As
a result, the device never de-asserted the interrupt request and
Windows would be interrupted over and over again.
-John
According to Chipscope, his device isn’t causing ANY interrupts. It could still be his device that’s the problem, but this does reduce the likelihood.
Peter
OSR
Thanks for your help. I just changed my INF to add the registry entries to support MSI, and the problem is gone when using MSI. I still need line-based interrupts to work for when I’m running on Windows XP.
I think Jake was onto something with his idea about the video BIOS. My driver is based on WDF, so I’m never explicitly calling IoConnectInterrupt which would be unmasking the APIC input. I imagine that the unmasking would be taking place sometime during or shortly after my EvtPrepareHardware callback. Is there a way to delay connecting the interrupt until later?
-Jeff
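For readers who have not seen the INF change mentioned above, the registry entries that opt a device into MSI on Vista typically look like the fragment below. The section names `MyDevice_Install` and `MSI_Interrupts` are placeholders, not taken from Jeff's INF; only the `Interrupt Management` key path and the `MSISupported` value follow the documented layout.

```ini
; Hypothetical section names; adapt to the install section in your own INF.
[MyDevice_Install.NT.HW]
AddReg = MSI_Interrupts

[MSI_Interrupts]
; 0x00000010 creates the key only; 0x00010001 writes a REG_DWORD value.
HKR, Interrupt Management,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties, MSISupported, 0x00010001, 1
```

The device still needs an MSI capability in its PCI configuration space for this to have any effect.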
You can gain control of when IoConnectInterrupt(Ex) is called by not creating the WDFINTERRUPT and managing the interrupt’s state yourself. Here is what KMDF does on power up:
PrepareHw() (only on pnp start or rebalance)
D0Entry
EvtInterruptEnable
D0EntryPostInterruptsEnabled
[DMA callbacks]
[start power managed queues]
SelfManagedIoInit/Restart
This is all covered in the WDF book.
d
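The ordering in Doron's list can be written down as a small table, which makes it easy to see that the ISR cannot run during PrepareHw or D0Entry. This is only a model in portable C: the enum and function names are hypothetical, and it assumes the interrupt is connected at the EvtInterruptEnable step, as the list suggests.

```c
#include <stdbool.h>

/* Power-up stages in the order given in the list above. */
typedef enum {
    STAGE_PREPARE_HW,
    STAGE_D0_ENTRY,
    STAGE_INTERRUPT_ENABLE,
    STAGE_D0_ENTRY_POST_INTERRUPTS_ENABLED,
    STAGE_DMA_CALLBACKS,
    STAGE_START_QUEUES,
    STAGE_SELF_MANAGED_IO,
    STAGE_COUNT
} power_up_stage;

/* Human-readable name for a stage, or "unknown" if out of range. */
const char *stage_name(power_up_stage stage)
{
    static const char *const names[STAGE_COUNT] = {
        "PrepareHw",
        "D0Entry",
        "EvtInterruptEnable",
        "D0EntryPostInterruptsEnabled",
        "DMA callbacks",
        "start power managed queues",
        "SelfManagedIoInit/Restart"
    };
    return ((unsigned)stage < STAGE_COUNT) ? names[stage] : "unknown";
}

/* Assumption: the interrupt is connected at EvtInterruptEnable, so the
 * ISR may be invoked from that stage onward and not before. */
bool isr_may_run_during(power_up_stage stage)
{
    return stage >= STAGE_INTERRUPT_ENABLE && stage < STAGE_COUNT;
}
```

Under that assumption, the unmasking Jeff asks about happens after D0Entry rather than during EvtPrepareHardware.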
[begin quote]
by interrupt storm I mean to say that my ISR is being called constantly from the time that it is first loaded. The machine runs extremely slowly, and Task Manager shows CPU usage of 100%. When I open DbgView I see the print statements from my ISR scrolling by constantly. By interrupt storm I DO NOT mean the device is constantly asserting interrupts, or that an interrupt is being asserted but never handled/deasserted. The PCI Express endpoint is on a Xilinx FPGA, and I have verified using Chipscope (FPGA signal capture tool) that our device is never asserting a single interrupt during this “interrupt storm”. I tried putting the card in my other PCI Express slot this morning, and it was still assigned to the same IRQ and I saw the same interrupt storm problem. Right now I’m testing with another machine. I’ll give an update if I learn something new.
[end quote]
Actually, as long as the interrupt eventually gets handled by the driver that shares the IRQ with yours, so that the line gets deasserted, what you describe here (apart from the performance issues, of course) is absolutely normal behavior. The performance issues seem to be related to the things that you do in your ISR before you return FALSE. This fully explains why problems arise if you connect the interrupt at boot time but not when you do it at some later stage: it is just a question of your position in the call chain. As long as the video driver is first in the chain, it handles the interrupt raised by its device, so your ISR just does not get invoked. However, if your ISR is first in the chain, performance issues arise because of something that you do before returning FALSE… Please post your ISR code, and let’s see what can be done here…
Anton Bassov
Chipscope is fine as long as he is probing and looking at the right signals. It could be that the PCIe block asserts the interrupt but other logic forgets to update the interrupt CSR. Personally, I wouldn’t start with Chipscope in this case; looking at the PCIe bus interface will give the right answer, plain and simple.
–
Calvin Guan
Broadcom Corp.
Connecting Everything(r)
Doron answered your question about how to delay interrupt connection under
KMDF just a little bit. But I think his post kind of missed the point.
(I’m sure that Doron will poke me back about this… 8^))
In order to work around this problem, you’re going to have to delay
connecting an interrupt until the real driver for the video device loads and
clears the interrupting condition. That’s going to be really tough, and if
you find a solution, it probably won’t be in restructuring your driver. It
will be about changing your driver to load much later.
–
Jake Oshins
Hyper-V I/O Architect
Windows Kernel Team
This post implies no warranties and confers no rights.
Anton, I disagree, because I don’t think that the video driver is loaded and
in the chain. I think that the video is being driven by VGA code that
doesn’t use an interrupt, or at least thinks it doesn’t.
–
Jake Oshins
Hyper-V I/O Architect
Windows Kernel Team
This post implies no warranties and confers no rights.
No, you are right ;). I answered the micro question, not the macro question about the underlying issue.
d
I think before hacking up my driver code in hopes of connecting the interrupt later I should first try the device and driver in another machine and see if the same issue exists. I should mention that the graphics card is integrated on the motherboard of the machine I’ve been testing on, in case that makes any difference.
There’s one other thing I should make known. The PCI Express endpoint is on an FPGA, and when the machine powers on the FPGA bitstream gets programmed from a flash memory. This takes about 20 seconds while the computer is booting, before which the device cannot be enumerated. Obviously the device is being enumerated and the driver is being installed during the boot process, but I just thought since this is somewhat uncommon behavior for a PCI device (as far as I know) I should throw it out there. Oh and in my INF I’m using SERVICE_DEMAND_START for the start type.
The only other box I have available for testing is running 64-bit Vista, and I was trying to get my driver to install on there today. The device first shows up as a “Standard RAM Controller” (as it does on the other test machine) and then I use devcon update to have it use my driver. When I do this on the 64-bit Vista machine I get “devcon failed” and no other output at all. And nothing is added to setupapi.dev.log either. I used the LogControl program to change the registry keys for the setupapi logs and turned the device installation setting up to the most verbose, but still nothing. I saw the following in the devcon documentation:
“When users do not have the required permissions, Devcon displays a generic “devcon failed” message with no further explanation.”
Maybe for some reason I don’t have permission to install the driver. I used the boot option to allow installation of unsigned drivers, and I also test-signed the driver using MakeCert, SignTool, and Inf2Cat. Is there some other reason why devcon would not even try to install the driver because of account permission issues?
Ok thanks again for your help.
-Jeff
> Maybe for some reason I don’t have permission to install the driver.
Try using the setupapi logging.
–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com
xxxxx@gmail.com wrote:
The only other box I have available for testing is running 64-bit Vista, and I was trying to get my driver to install on there today. The device first shows up as a “Standard RAM Controller” (as it does on the other test machine) and then I use devcon update to have it use my driver. When I do this on the 64-bit Vista machine I get “devcon failed” and no other output at all. And nothing is added to setupapi.dev.log either.
My apologies for asking the obvious question: did you create a 64-bit
driver binary? You can’t install a 32-bit driver in Vista 64.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
Definitely a question worth asking in case I missed the obvious. I was actually a little confused about that myself at first because there’s AMD64 and IA64. The processor is an Intel x64 dual core, so I’m using a binary compiled under the AMD64 environment. Correct me if that’s a mistake.
-Jeff