Shared Interrupt Vector

Robert_Graham · April 30, 2014, 6:25pm

I have a PCIe device which shares an interrupt vector with other PCIe devices. We recently fixed a bug where we incorrectly returned TRUE from our "EvtInterruptIsr" function when our device did not have any interrupt events to handle. This caused an issue where other devices’ ISR functions were no longer being called.

My question is concerning shared interrupts. What happens if two devices both cause the interrupt, and the first ISR returns TRUE…does the second device’s ISR still get called?

Thanks guys. Cheers.

Tim_Roberts · April 30, 2014, 7:21pm

xxxxx@gmail.com wrote:

I have a PCIe device which shares an interrupt vector with other PCIe devices. We recently fixed a bug where we incorrectly returned TRUE from our "EvtInterruptIsr" function when our device did not have any interrupt events to handle. This caused an issue where other devices’ ISR functions were no longer being called.

My question is concerning shared interrupts. What happens if two devices both cause the interrupt, and the first ISR returns TRUE…does the second device’s ISR still get called?

It’s a good question. The hardware handles this. The second device’s
interrupt will remain pending until the first ISR completes, then the
interrupt will immediately fire again, and presumably the second ISR
will finally get it’s shot.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Tim_Roberts · April 30, 2014, 7:23pm

Tim Roberts wrote:

It’s a good question. The hardware handles this. The second device’s
interrupt will remain pending until the first ISR completes, then the
interrupt will immediately fire again, and presumably the second ISR
will finally get it’s shot.

I can’t believe I committed an apostrophe violation. It’s one of my pet
peeves. I’m embarrassed just reading that message.

ITS shot. ITS shot.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Peter_Wieland · April 30, 2014, 7:26pm

Unless the first ISR returns true all the time. Then the interrupt line is never de-asserted (because the second ISR won’t get a chance to de-assert it) and you have an interrupt storm. So generally it’s bad to claim an interrupt you don’t have some reason to believe you caused.

I don’t remember if Windows does any round-robining of ISRs on a given vector, which might allow the OS to clear that situation up.

-p


Peter_Viscarola_OSR · April 30, 2014, 10:00pm

We’re talking PCIe line-based interrupt emulation here?

For level-triggered interrupts, the ISRs are always called in the same order, starting from the beginning, whenever the line is asserted. So as soon as an ISR returns TRUE, if the line is still asserted, Windows calls the first ISR again.

For edge-triggered interrupts, all ISRs are always called, regardless of whether one or more return TRUE.

At least, this is how it worked when last I checked.
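
The two dispatch rules described above can be pictured with a small user-mode model. This is only a sketch in plain C, not Windows code: `toy_dev`, `toy_isr`, `dispatch_level` and `dispatch_edge` are hypothetical names, and "pending" simply stands for a device holding the shared line asserted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device: 'pending' stands for the device asserting its interrupt. */
struct toy_dev {
    bool pending;
    int  isr_calls;
};

/* Well-behaved ISR: claims only when its own device is pending,
 * and clears the condition when it does. */
bool toy_isr(struct toy_dev *d)
{
    d->isr_calls++;
    if (!d->pending)
        return false;
    d->pending = false;
    return true;
}

/* A shared level-triggered line is asserted while any device is pending. */
bool line_asserted(const struct toy_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].pending)
            return true;
    return false;
}

/* Level-triggered model: walk from the first ISR, stop at the first TRUE,
 * and start over while the line is still asserted. Returns the pass count. */
int dispatch_level(struct toy_dev *devs, size_t n)
{
    int passes = 0;
    assert(devs != NULL || n == 0);
    while (line_asserted(devs, n)) {
        passes++;
        for (size_t i = 0; i < n; i++)
            if (toy_isr(&devs[i]))
                break;
    }
    return passes;
}

/* Edge-triggered model: every ISR is called once, whatever each returns. */
void dispatch_edge(struct toy_dev *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        (void)toy_isr(&devs[i]);
}
```

With two devices pending at once, the level-triggered model takes two passes: the first ISR is called twice and the second once. The edge-triggered model calls each ISR exactly once.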

Peter
OSR
@OSRDrivers

Robert_Graham · April 30, 2014, 10:18pm

@Peter that answers my question completely and explains what was happening! For our NDIS driver, the hardware has level-triggered interrupts, and I was mistakenly returning TRUE all the time (forgot to mask off interrupts when reading the event register). Thus…its ISR starved out everyone else who would normally have been called after it.
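
One possible reading of that bug, sketched in plain C rather than as real KMDF or NDIS code: the ISR treated any latched event bit as "ours", even bits whose interrupts were not enabled. The register layout, the bit names and both function names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits; real hardware defines its own layout. */
#define EVT_RX_DONE   0x01u
#define EVT_TX_DONE   0x02u
#define EVT_LINK_CHG  0x04u

/* Buggy check: any latched event bit counts, even one whose interrupt
 * is masked off and so cannot be what is driving the line. */
bool isr_claims_buggy(uint32_t event_reg, uint32_t enable_mask)
{
    (void)enable_mask;
    return event_reg != 0;
}

/* Fixed check: claim only events that are both latched and enabled. */
bool isr_claims_fixed(uint32_t event_reg, uint32_t enable_mask)
{
    return (event_reg & enable_mask) != 0;
}

/* Usage: a latched but masked link-change event must not be claimed. */
void mask_check_demo(void)
{
    uint32_t enabled = EVT_RX_DONE | EVT_TX_DONE;

    assert(isr_claims_buggy(EVT_LINK_CHG, enabled));   /* wrongly TRUE  */
    assert(!isr_claims_fixed(EVT_LINK_CHG, enabled));  /* correctly FALSE */
    assert(isr_claims_fixed(EVT_RX_DONE, enabled));    /* genuinely ours */
}
```

In a real ISR the return value of the fixed check would decide whether the callback returns TRUE or FALSE.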

Thanks for the knowledge!

anton_bassov · May 1, 2014, 12:23am

> My question is concerning shared interrupts. What happens if two devices both cause the interrupt, and the first ISR returns TRUE…does the second device’s ISR still get called?

Since you are speaking about shared interrupts, we can safely assume they are the ones asserted via a pin, right? (Actually, for PCIe, pin-based interrupts are just emulated. The whole thing relies upon messages behind the scenes, but that does not really matter in this context.) Interrupts that PCI devices assert via a pin are required by the PCI spec to be level-triggered. Therefore, the ISR chain is invoked until someone returns TRUE. When that happens, the ISR handler stub can safely acknowledge the interrupt to the controller and return. If the line is still asserted, the interrupt fires again immediately after the stub acknowledges it. This approach relies upon the semantics of level-triggered interrupts, which stay active as long as the line is asserted, and assumes that everyone behaves properly. If the interrupt is caused by device B, and device A’s ISR, which gets invoked before device B’s, returns TRUE, the only possible outcome is an interrupt storm. In the context of your question, this is exactly what must have been happening.
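
The storm scenario can be modelled in a few lines of user-mode C. This is a sketch under stated assumptions, not kernel code: `shared_line`, `isr_a`, `isr_b` and `run_until_quiet` are hypothetical names, device A is assumed to sit ahead of device B in the chain, and `STORM_LIMIT` is an arbitrary cap so the model terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STORM_LIMIT 1000  /* arbitrary cap so the model terminates */

/* The line is asserted while device B's condition is uncleared. */
struct shared_line {
    bool b_pending;
    int  a_calls;
    int  b_calls;
};

/* Device A's ISR; 'always_claim' models the misbehaving driver.
 * Device A itself never has work in this model. */
bool isr_a(struct shared_line *l, bool always_claim)
{
    l->a_calls++;
    return always_claim;
}

/* Device B's ISR: services and clears its own condition. */
bool isr_b(struct shared_line *l)
{
    l->b_calls++;
    if (!l->b_pending)
        return false;
    l->b_pending = false;
    return true;
}

/* Re-dispatch while the line stays asserted. Returns how many times
 * the interrupt fired, capped at STORM_LIMIT. */
int run_until_quiet(struct shared_line *l, bool a_always_claims)
{
    int fires = 0;
    assert(l != NULL);
    while (l->b_pending && fires < STORM_LIMIT) {
        fires++;
        if (isr_a(l, a_always_claims))
            continue;          /* chain stops at the first TRUE */
        (void)isr_b(l);
    }
    return fires;
}
```

When A behaves, the interrupt fires once and B clears it. When A always claims, B's ISR is never reached, the line never drops, and the model runs into the cap.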

Another option is MSI (support for which, IIRC, is optional for PCI devices but mandatory for PCIe ones). An MSI is raised as a memory write, so it is a kind of edge-triggered interrupt in the sense that it is asserted only once. Although technically nothing prevents MSI interrupts from being shared, sharing them would defeat the very purpose of MSI (i.e. eliminating the need to access the device in order to find out whether it caused the interrupt) in the first place. Therefore, I assume you are not speaking about MSI.

Anton Bassov

Peter_Viscarola_OSR · May 1, 2014, 10:19am

Well… Yes. BUT to be clear: Having MSI does *not* mean that the driver does not need some method to determine if the device has interrupted. The driver might check an in-memory structure, for example. But the driver still has to check.

Now, if it doesn’t, no *real* harm done I suppose. If the MSI vector WERE to be shared (it’s possible, I believe, just not supported in Windows) all the attached ISRs would be called in any case (per my description of edge-triggered interrupts, above).
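
As an illustration of "check an in-memory structure", here is a minimal sketch in plain C. Everything in it is an assumption: the `status_block` layout, the idea that the device bumps a sequence number in host memory before raising its MSI, and the function name. A real driver would also need whatever memory barriers and DMA synchronization its platform requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status block the device would DMA into host memory
 * before raising its MSI. */
struct status_block {
    volatile uint32_t producer_seq;  /* bumped by the device per event batch */
};

struct drv_ctx {
    struct status_block *sb;
    uint32_t last_seen_seq;          /* last value the driver has handled */
};

/* Returns true only if the device has posted something new. */
bool msi_isr_has_work(struct drv_ctx *ctx)
{
    uint32_t seq;

    assert(ctx != NULL && ctx->sb != NULL);
    seq = ctx->sb->producer_seq;
    if (seq == ctx->last_seen_seq)
        return false;                /* nothing new: not ours, or spurious */
    ctx->last_seen_seq = seq;
    return true;
}
```

The check never touches a device register, which keeps the MSI benefit of avoiding a read across the bus while still letting the ISR decline an interrupt it did not cause.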

Peter
OSR
@OSRDrivers

anton_bassov · May 1, 2014, 12:23pm

> Well… Yes. BUT to be clear: Having MSI does *not* mean that the driver does not need some method to determine if the device has interrupted.

Sure - after all, the interrupt may be just spurious. For example, it may be raised as a write to the local APIC’s ICR by some “bold experimenter”…

> The driver might check an in-memory structure, for example.

…which is of zero use if this “bold experimenter” happens to be Alberto - IIRC, he was about to raise MSI interrupts as a write by the CPU to the memory that is supposed to be written to only by MSI-capable device (you remember this nonsense that he was about to do, don’t you)…

> If the MSI vector WERE to be shared (it’s possible, I believe, just not supported in Windows) all the attached ISRs would be called in any case (per my description of edge-triggered interrupts, above).

Well, it does not necessarily have to work this way. For example, consider the following approach in the interrupt handler stub for shared MSI interrupts (I assume interrupts have already been re-enabled on the CPU):

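    /* Entry: serialize against other instances of this stub. */
    cli();
    spin_lock(irq->lock);

    ack_interrupt();

    if (irq->flags & FLAG_ISR_ALREADY_RUNS) {
        /* Another instance is already running the chain: ask it to replay. */
        irq->flags |= FLAG_ISR_REPLAY;
        can_proceed = 0;
    } else {
        irq->flags |= FLAG_ISR_ALREADY_RUNS;
        can_proceed = 1;
    }

    while (1) {
        spin_unlock(irq->lock);
        sti();

        if (!can_proceed)
            break;

        invoke_shared_msi_isr_chain(irq);

        cli();
        spin_lock(irq->lock);

        if (irq->flags & FLAG_ISR_REPLAY) {
            /* An interrupt arrived while the chain ran: go around again. */
            irq->flags &= ~FLAG_ISR_REPLAY;
            can_proceed = 1;
        } else {
            /* Nothing new arrived: clear ALREADY_RUNS and leave. */
            irq->flags = 0;
            can_proceed = 0;
        }
    }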

As you can see, we are not going to lose interrupts even if invoke_shared_msi_isr_chain() returns immediately after one of the ISRs in the chain returns TRUE. To make this consistent with the existing model of interrupt spinlocks, invoke_shared_msi_isr_chain() would have to acquire a separate spinlock for each interrupt source and hold it while the corresponding ISR runs…

Anton Bassov

Tim_Roberts · May 1, 2014, 12:51pm

xxxxx@osr.com wrote:

But the driver still has to check.

Now, if it doesn’t, no *real* harm done I suppose. If the MSI vector WERE to be shared (it’s possible, I believe, just not supported in Windows) all the attached ISRs would be called in any case (per my description of edge-triggered interrupts, above).

PCIExpress throws an interesting wrinkle into all of this.

In PCI, for example, the interrupt wires literally are shared between
devices. If two devices both raise the same interrupt pin, the PCI
controller cannot know which device fired. It has to keep triggering an
ISR until all of the clients are satisfied.

In PCIExpress, that cannot happen. ALL interrupts, whether they are the
legacy wire simulation or an MSI interrupt, are sent as PCIExpress
packets. Because it is a point-to-point connection, the root complex
always knows exactly which device raised the interrupt. Theoretically,
assuming you had enough slots in the interrupt controller, you would
never have to share interrupts again.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Peter_Viscarola_OSR · May 1, 2014, 4:19pm

Bingo. Give that man a cigar!

You’re exceeding my knowledge here… but is it not the case that there can be “bridges and things” between the device and the root complex, that make this a lot more complicated than that?

Peter
OSR
@OSRDrivers