Why is my interrupt not getting serviced?

In my WDM driver, I have a service routine connected to an interrupt vector. I know that the hardware condition that triggers the interrupt is present, but the service routine is not getting called.

Here are some annotated excerpts from the driver code. It is in C++, and DeviceExt and DeviceInterrupt are classes.

I’m hoping somebody will see an error in this code.

struct DeviceExt
{
	bool sampCheck (void)
	{
		dbgPrint (session.debug, "Interrupt service");	// I never see this message!
		// This checks the hardware and if the interrupt condition is present, it calls the given lambda function.
		return ibs.sampCheck ([&] (void)
			{
				serviceDpc ();
			});
	}
	PKINTERRUPT		object;

};

NTSTATUS DeviceExt::start (IRP &, const Session & params) // This is a DPC.
{
	interrupt.alloc< &DeviceExt::sampCheck > (*this);
}

struct DeviceInterrupt
{
	
	template< bool (DeviceExt::* cb) (void) >
	bool alloc (DeviceExt & ext);

	PKINTERRUPT		object;				// Overrides base class object member.
};

template< bool (DeviceExt::* cb) (void) >
bool DeviceInterrupt::alloc (DeviceExt & ext)
{
	// the vector was obtained from Apic register 500h, which is the vector that the Ibs hardware
	// uses to signal an interrupt.
	// shared_ = true.  dirql = 3.
...
		auto svcCtx = &ext;
		using Ctx = decltype (svcCtx);
		auto svc = [] (PKINTERRUPT, void * ctx_) -> BOOLEAN {
			Ctx ctx = (Ctx) ctx_;
			return (BOOLEAN) (ctx->*cb) ();
		};
		status = IoConnectInterrupt (
			&object, svc, svcCtx, NULL,
			vector, dIrql, dIrql,
			(::_KINTERRUPT_MODE) LevelSensitive, shared_,
			1LL << session.processor.Number (), false);
		if ( status == STATUS_SUCCESS )
			dbgPrint (session.debug, "Interrupt connected, vector %2x.\n", vector);	// This message is issued.
}
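
For readers following the excerpt, the callback mechanism it relies on can be tried in isolation in user mode. This is a minimal sketch, not the poster's code: FakeInterrupt, ServiceRoutine, Device, makeTrampoline and simulateOneInterrupt are hypothetical stand-ins for PKINTERRUPT, the service routine type, and the driver classes. It only shows that a captureless lambda converts to a plain function pointer and can forward to a member function supplied as a template parameter.

```cpp
#include <cassert>

// Hypothetical user-mode stand-ins for the kernel types (assumptions, not WDK definitions).
using BOOLEAN = unsigned char;
struct FakeInterrupt {};
using ServiceRoutine = BOOLEAN (*) (FakeInterrupt *, void *);

struct Device
{
    int calls = 0;
    bool sampCheck () { ++calls; return true; }
};

// Returns a plain function pointer that forwards to the member function cb.
template< bool (Device::* cb) () >
ServiceRoutine makeTrampoline ()
{
    // A captureless lambda converts implicitly to a function pointer.
    // cb is a template parameter, so it does not need to be captured.
    return [] (FakeInterrupt *, void * ctx) -> BOOLEAN {
        assert (ctx != nullptr);
        auto * dev = static_cast< Device * > (ctx);
        return static_cast< BOOLEAN > ((dev->*cb) ());
    };
}

// Simulates one interrupt delivery and reports how often the member ran.
int simulateOneInterrupt ()
{
    Device dev;
    ServiceRoutine svc = makeTrampoline< &Device::sampCheck > ();
    svc (nullptr, &dev);   // what the kernel would do when the vector fires
    return dev.calls;
}
```

If the member function runs here, the trampoline itself is sound, and the place to look is whether the routine was ever registered.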

Is this your own class framework?

Is this an SBC? In a normal PC architecture, you would never extract an interrupt from the APIC. You’d always be handed the interrupt resource at startup time.

There’s a lot of code we’re not seeing, but as written, the comment in DeviceInterrupt is wrong, because there is no base class.

There’s a lot here not to like. Seriously. Where are you getting the IRQL? Why in the world are you setting the processor interrupt mask that way?

Sorry, but this looks entirely broken to me.

Peter

Here’s an explanation of what I’m working with and what I’m trying to do.
The AMD processors have a feature called IBS, which is a very good profiling tool. It samples an instruction periodically (while still executing it, of course) and collects information about it in several MSRs, and then it raises an interrupt on vector 35. It also sets a SampleAvail bit in an MSR. It won’t sample again as long as this bit is set.
A driver is supposed to collect the sample data from the MSRs and then clear the SampleAvail bit. The hardware will then start counting another interval before the next sample, and the process repeats.
This is NOT a PNP device. It is built in to the processor core.
The int vector number is put in APIC 500 by the BIOS at boot time, and that’s where I get the value 35. There is an interrupt mask bit also in APIC 500, which I make sure is cleared.
I recall long ago when I first started work on this, that I looked at what the OS had associated with vector 35, and there was nothing at all connected to that vector. (I assume that if the OS gets an interrupt on such a vector, it just ignores it.) So if I call IoConnectInterrupt, my service routine should get called when that vector raises an interrupt. Unfortunately, I know that the SampleAvail bit is getting set by the hardware, but my service routine is not getting as far as the dbgPrint in the sampCheck () method.
Good point about the IRQL. Since AMD has no documentation about what IRQL is required, or suggested, I have some code that tries the IoConnectInterrupt at various IRQL values starting at 3 and going up to 11. I wrote that code when I supposed that there would be something else already connected, and I would have to have an IRQL that is consistent with that other connection. I don’t understand how Windows would treat two service routines on the same vector, so I’d appreciate advice about this point. Perhaps I should try IRQL 11 first, and then work down to 3? Or perhaps I could try non-shared instead of shared?
I think I’ve shown all the relevant code. The IoConnectInterrupt supplies a C++ lambda function, which calls the DeviceExt::sampCheck () member function of the DeviceExt object provided as interrupt context.
Unfortunately, I’m not a serious driver programmer, I just want to be able to use this AMD IBS facility for my own purposes. I only have one computer, so I don’t have the option of running WinDbg from a separate host, which would give me more options than just dbgPrint.
Can you suggest anything else that would help me monitor the system? In particular, I want to see whether an interrupt on vector 35 is actually being generated by the hardware. Is there a WinDbg command that would put a watch on vector 35 and print something when an interrupt occurs? (Of course, I can’t do a break, but I could at least get a message.)
The alternative strategy I’ve been using up to now, and one that works fine, is to poll the hardware periodically using a timer DPC. It does in fact find that the SampleAvail bit is set, and retrieves the sample info from the MSRs and clears the SampleAvail bit – basically the same thing that my ISR would do, only it doesn’t run quite as often.
So it would be nice to get the ISR to work, but it’s not worth a great effort.
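
The sample cycle and the polling fallback described above can be modelled in user mode. The following is a rough sketch under stated assumptions: FakeIbs is a hypothetical software model, not the real MSR layout, and pollFor stands in for the timer DPC. It illustrates why polling less often than the sampling interval yields fewer samples, since the unit stalls while SampleAvail is set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software model of the sampling unit (assumption, not real hardware).
struct FakeIbs
{
    std::uint32_t interval;          // ticks between samples
    std::uint32_t countdown;         // ticks left until the next sample
    bool          sampleAvail = false;
    std::uint64_t sampleData  = 0;
    std::uint64_t nextValue   = 1;

    explicit FakeIbs (std::uint32_t n) : interval (n), countdown (n) { assert (n > 0); }

    // One "hardware" tick. Nothing is counted while sampleAvail is set.
    void tick ()
    {
        if ( sampleAvail ) return;
        if ( --countdown == 0 )
        {
            sampleData  = nextValue++;
            sampleAvail = true;
        }
    }

    // The driver's job: read the data, then clear the bit to re-arm the unit.
    bool collect (std::vector< std::uint64_t > & out)
    {
        if ( !sampleAvail ) return false;
        out.push_back (sampleData);
        sampleAvail = false;
        countdown   = interval;
        return true;
    }
};

// Polls every pollPeriod ticks for totalTicks ticks; returns the number of samples collected.
std::size_t pollFor (std::uint32_t interval, std::uint32_t pollPeriod, std::uint32_t totalTicks)
{
    assert (pollPeriod > 0);
    FakeIbs ibs (interval);
    std::vector< std::uint64_t > samples;
    for ( std::uint32_t t = 1; t <= totalTicks; ++t )
    {
        ibs.tick ();
        if ( t % pollPeriod == 0 ) ibs.collect (samples);
    }
    return samples.size ();
}
```

In this model, pollFor (10, 10, 100) collects 10 samples, while pollFor (10, 25, 100) collects only 4, because the unit sits idle between a sample becoming available and the next poll.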

Tim and Peter, I appreciate how you’ve both been so willing to answer queries over the years.

// the vector was obtained from Apic register 500h, which is the vector that the Ibs hardware
// uses to signal an interrupt.
// shared_ = true. dirql = 3.

I’m hoping somebody will see an error in this code.

Seems to be “a post of the month”, don’t you think, guys…

struct DeviceExt
{
	bool sampCheck (void)
	{
		dbgPrint (session.debug, "Interrupt service");	// I never see this message!
		// This checks the hardware and if the interrupt condition is present, it calls the given lambda function.
		return ibs.sampCheck ([&] (void)
			{
				serviceDpc ();
			});
	}
	PKINTERRUPT		object;

};

I don’t want to start any flame wars, but the very first thing that comes to mind when I see code like this is the following “pearl” from Mr. Torvalds:

[begin quote]

C++ is a horrible language. It’s made more horrible by the fact that a lot of substandard programmers use it, to the point where it’s much much easier to generate total and utter crap with it. Quite frankly, even if the choice of C were to do nothing but keep the C++ programmers out, that in itself would be a huge reason to use C.

[end quote]

Anton Bassov

I’m an idiot! My call to IoConnectInterrupt was contained in the larger block:

if ( 01 ) status = STATUS_SUCCESS;
else
	status = IoConnectInterrupt ( /* ... */ );

I put that bypass in the code a long time ago when the IoConnectInterrupt was leading to problems. That obviously explains why I’m not getting the ISR called.
I’m going to fix that, and I’ll post another comment to let you know if things are now working, or if I still need help.
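
To make the failure mode concrete, here is a small user-mode sketch. Status, fakeConnect, kSkipConnect and kNotConnected are hypothetical names, not part of the driver or the WDK. It shows that the bypass reports success without ever calling the connect routine, and one possible alternative in which a skipped connect returns a distinct status.

```cpp
#include <cassert>

using Status = int;                    // hypothetical stand-in for NTSTATUS
constexpr Status kSuccess      = 0;
constexpr Status kNotConnected = -1;   // hypothetical "connect was skipped" status

int g_connectCalls = 0;                // counts calls to the stand-in below

// Hypothetical stand-in for the real connect call.
Status fakeConnect () { ++g_connectCalls; return kSuccess; }

// Mirrors the bypass from the post: 01 is an octal literal equal to 1,
// so the condition is always true and the else branch is dead code.
Status startWithBypass ()
{
    Status status;
    if ( 01 ) status = kSuccess;
    else
        status = fakeConnect ();
    return status;
}

// One possible alternative: a named switch that cannot be mistaken for success.
constexpr bool kSkipConnect = false;

Status startWithNamedSwitch ()
{
    if ( kSkipConnect ) return kNotConnected;
    return fakeConnect ();
}

// Returns how many times the connect stand-in ran for the bypassed start.
int connectCallsWithBypass ()
{
    g_connectCalls = 0;
    Status s = startWithBypass ();
    assert (s == kSuccess);    // the caller sees success...
    return g_connectCalls;     // ...although nothing was connected
}
```

With the named switch, any later "connected" message depends on a status that only the real call can produce.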

Unfortunately, I’m not a serious driver programmer, I just want to be able to use this AMD IBS facility for my own purposes.

Forget it.

You cannot connect to arbitrary interrupts in Windows. OK… that’s a bit of a fib. Windows doesn’t want you to do it, and you need to fool around to even try.

You seem to misunderstand the concept of IRQL. IRQL does not equal IRQ. That’s the first thing.

So it would be nice to get the ISR to work, but it’s not worth a great effort.

Sorry to be the bearer of bad news: Discard your hope of implementing this. It’s pretty far out of the range of the ordinary types of things Windows will let you do.

I’m not saying it isn’t a worthy thing to do. I’m not even saying it’s a completely impossible thing to do. But what I am saying is that unless you’re an experienced Windows driver dev who’s willing to do a bit of experimentation and also “color outside the lines”… you have little chance of getting this to work. Heck, I’m not even sure that I could figure out how to get this to work.

That famous dev Mr. Alighieri said it best: “Abandon hope all ye who enter here…”

I only have one computer, so I don’t have the option of running WinDbg from a separate host

With all due respect, that’s just pure foolishness. I mean that in all humility, and understanding that different people are of different means. There’s a cost of entry to driver development for hardware. That cost of entry is the cost of a second system… which can be less than US$500. That is less than the fully burdened cost of 3 days of a software maintenance engineer’s time… in India.

Peter

OK, I tried fixing my code but I’m stuck in what looks like an impossible place.
When I said in the OP that vector 35 was unused, that was wrong; that’s just some faulty memory.
Here’s the interrupt object (_KINTERRUPT) for vector 35, the vector the IBS hardware signals on.
+0x000 Type : 0n22
+0x002 Size : 0n256
+0x008 InterruptListEntry : _LIST_ENTRY [ 0x00000000 00000000 - 0x00000000 00000000 ]
+0x018 ServiceRoutine : 0xfffff800 95d85280 unsigned char +fffff80095d85280
+0x020 MessageServiceRoutine : (null)
+0x028 MessageIndex : 0
+0x030 ServiceContext : (null)
+0x038 SpinLock : 0
+0x040 TickCount : 0
+0x048 ActualLock : 0xffffffff ffffffff -> ??
+0x050 DispatchAddress : 0xfffff800 955acbc0 void nt!KiInterruptDispatchNoLock+0
+0x058 Vector : 0x35
+0x05c Irql : 0x5 ''
+0x05d SynchronizeIrql : 0x5 ''
+0x05e FloatingSave : 0 ''
+0x05f Connected : 0x1 ''
+0x060 Number : 0
+0x064 ShareVector : 0 ''
+0x065 EmulateActiveBoth : 0 ''
+0x066 ActiveCount : 0
+0x068 InternalState : 0n0
+0x06c Mode : 1 ( Latched )
+0x070 Polarity : 0 ( InterruptPolarityUnknown )
+0x074 ServiceCount : 0
+0x078 DispatchCount : 0
+0x080 PassiveEvent : (null)
+0x088 TrapFrame : (null)
+0x090 DisconnectData : (null)
+0x098 ServiceThread : (null)
+0x0a0 ConnectionData : (null)
+0x0a8 IntTrackEntry : (null)
+0x0b0 IsrDpcStats : _ISRDPCSTATS
+0x0f0 RedirectObject : (null)
+0x0f8 Padding : [8] ""

Since the vector is unshared, IoConnectInterrupt fails.
Seems that the people who wrote the BIOS (or whatever sets up the vector) made it impossible to use the hardware.
I might try some other things like finding an unused vector, connecting to it, and setting its vector number in the APIC in place of the 35. Or perhaps ask around and try to find a way to “unbind” vector 35 or make it shared (there are apparently no ISRs associated with the vector). However, I have a decent fallback, which is to poll the IBS hardware instead of responding to interrupts, and this will get me the same profiling data although a bit slower. I wanted to get interrupts to work, but now I’ll give up on it.
This can be an example for future readers that it can be hazardous to try something outside of the normal realm.