Invalid Interrupt Vector?

I have two FPGA PCIe cards which are built on the same IP but differ in port count. The cards use MSI interrupts and each port has its own message. In this case one card has 4 ports and the other has 8 ports. Each card loads and functions fine on its own, but when I try to use both cards in the same system the second card never loads and Device Manager shows STATUS_DEVICE_POWER_FAILURE.

Looking at the WDF log I see the following failure (note this is from a different run than the logs below):

52: FxInterrupt::Connect - IoConnectInterrupt(Ex) Failed, SpinLock 0xFFFF830C55DD6AD0, Vector 0x93, IRQL 0x9, Synchronize IRQL 0x9, Mode 0x1, ShareVector False, ProcessorGroup 0, ProcessorEnableMask 0xfff, FloatingSave False, 0xc00000ef(STATUS_INVALID_PARAMETER_1)

I set up the interrupts in my EvtDevicePrepareHardware callback by calling the very simple function shown below with the resource descriptors:

_Use_decl_annotations_
NTSTATUS
InterruptInitialize(
    WDFDEVICE Device,
    PCM_PARTIAL_RESOURCE_DESCRIPTOR InterruptRaw,
    PCM_PARTIAL_RESOURCE_DESCRIPTOR InterruptTranslated
)
{
    WDF_INTERRUPT_CONFIG interruptConfig;
    NTSTATUS status = STATUS_SUCCESS;
    ULONG irq;

    PAGED_CODE();

    TraceEvents(TRACE_LEVEL_INFORMATION, TRACE_INTERRUPT, "%!FUNC! Entry");

    auto deviceContext = DeviceGetContext(Device);

    if (CM_RESOURCE_INTERRUPT_MESSAGE & InterruptTranslated->Flags) {
        TraceEvents(TRACE_LEVEL_INFORMATION, TRACE_INTERRUPT,
            "Message Interrupt level 0x%0x, Vector 0x%0x, MessageCount %u",
            InterruptTranslated->u.MessageInterrupt.Translated.Level,
            InterruptTranslated->u.MessageInterrupt.Translated.Vector,
            InterruptTranslated->u.MessageInterrupt.Raw.MessageCount);
    }
    else {
        TraceEvents(
            TRACE_LEVEL_ERROR, TRACE_INTERRUPT, "Legacy Interrupts Not Supported!");
        return STATUS_INVALID_DEVICE_REQUEST;
    }

    // First we need to determine how many interrupts we need
    auto numberOfPorts = READ_REGISTER_ULONG(deviceContext->MemoryBar[1].Buffer + kPortsPresent) & 0x1F;

    WDF_INTERRUPT_CONFIG_INIT(&interruptConfig, InterruptIsr, InterruptDpc);

    interruptConfig.InterruptRaw = InterruptRaw;
    interruptConfig.InterruptTranslated = InterruptTranslated;

    for (irq = 0; irq < numberOfPorts; irq++) {
        status = WdfInterruptCreate(Device, &interruptConfig, WDF_NO_OBJECT_ATTRIBUTES, &deviceContext->InterruptObject[irq]);
        if (NT_SUCCESS(status)) {
            WDF_INTERRUPT_INFO interruptInfo;
            WDF_INTERRUPT_INFO_INIT(&interruptInfo);

            // Log some info
            WdfInterruptGetInfo(deviceContext->InterruptObject[irq], &interruptInfo);
            TraceEvents(TRACE_LEVEL_INFORMATION, TRACE_INTERRUPT, "Created interrupt(%u): Signaled(%c), Num(%u), Vect(0x%0x)",
                        irq, interruptInfo.MessageSignaled ? 'Y' : 'n', interruptInfo.MessageNumber, interruptInfo.Vector);
        }
        else {
            TraceEvents(TRACE_LEVEL_ERROR, TRACE_INTERRUPT,
                "WdfInterruptCreate failed with status %!STATUS!", status);
            break;
        }
    }

    TraceEvents(TRACE_LEVEL_INFORMATION, TRACE_INTERRUPT, "Created %u MSI interrupts.", irq);
    TraceEvents(TRACE_LEVEL_INFORMATION, TRACE_INTERRUPT, "%!FUNC! Exit");

    return status;
}

When I added the WdfInterruptGetInfo and additional logging I noticed something interesting. In a good case interruptInfo.Vector increments with every call:

00000012	driver-windows	4	5420	3	12	11\22\2021-08:07:31:842	InterruptInitialize Entry
00000013	driver-windows	4	5420	3	13	11\22\2021-08:07:31:842	Message Interrupt level 0x4, Vector 0x40, MessageCount 0
00000014	driver-windows	4	5420	3	14	11\22\2021-08:07:31:842	Created interrupt(0): Signaled(Y), Num(0), Vect(0x40)
00000015	driver-windows	4	5420	3	15	11\22\2021-08:07:31:842	Created interrupt(1): Signaled(Y), Num(1), Vect(0x41)
00000016	driver-windows	4	5420	3	16	11\22\2021-08:07:31:842	Created interrupt(2): Signaled(Y), Num(2), Vect(0x42)
00000017	driver-windows	4	5420	3	17	11\22\2021-08:07:31:842	Created interrupt(3): Signaled(Y), Num(3), Vect(0x43)
00000018	driver-windows	4	5420	3	18	11\22\2021-08:07:31:842	Created interrupt(4): Signaled(Y), Num(4), Vect(0x44)
00000019	driver-windows	4	5420	3	19	11\22\2021-08:07:31:842	Created interrupt(5): Signaled(Y), Num(5), Vect(0x45)
00000020	driver-windows	4	5420	3	20	11\22\2021-08:07:31:842	Created interrupt(6): Signaled(Y), Num(6), Vect(0x46)
00000021	driver-windows	4	5420	3	21	11\22\2021-08:07:31:842	Created interrupt(7): Signaled(Y), Num(7), Vect(0x47)
00000022	driver-windows	4	5420	3	22	11\22\2021-08:07:31:842	Created 8 MSI interrupts.
00000023	driver-windows	4	5420	3	23	11\22\2021-08:07:31:842	InterruptInitialize Exit

In the bad case it does not increment:

00000042	driver-windows	4	2656	0	42	11\22\2021-08:07:37:339	InterruptInitialize Entry
00000043	driver-windows	4	2656	0	43	11\22\2021-08:07:37:339	Message Interrupt level 0x6, Vector 0x63, MessageCount 0
00000044	driver-windows	4	2656	0	44	11\22\2021-08:07:37:339	Created interrupt(0): Signaled(Y), Num(0), Vect(0x63)
00000045	driver-windows	4	2656	0	45	11\22\2021-08:07:37:339	Created interrupt(1): Signaled(Y), Num(1), Vect(0x63)
00000046	driver-windows	4	2656	0	46	11\22\2021-08:07:37:339	Created interrupt(2): Signaled(Y), Num(2), Vect(0x63)
00000047	driver-windows	4	2656	0	47	11\22\2021-08:07:37:339	Created interrupt(3): Signaled(Y), Num(3), Vect(0x63)
00000048	driver-windows	4	2656	0	48	11\22\2021-08:07:37:339	Created 4 MSI interrupts.
00000049	driver-windows	4	2656	0	49	11\22\2021-08:07:37:339	InterruptInitialize Exit

The vector is supplied by Windows and simply passed through. I do not believe I should be manipulating it. I return success from EvtDevicePrepareHardware, and the framework then immediately calls my EvtDeviceReleaseHardware callback because of the IoConnectInterrupt failure.

Any thoughts on what I am missing here? Any help is much appreciated.

-Tom

Hmmm… there’s a lot that’s really confusing in your code. At least to me.

First of all, we’re talking MSI here… NOT MSI-X, right? Please confirm. Cuz they’re handled differently.

Assuming you actually DO mean MSI and not MSI-X, then the code you’ve got here isn’t correct. Well, actually, the code you’ve got here isn’t correct for MSI or MSI-X… but, whatever.

You can’t just ASSUME you’re going to be granted the number of interrupts you’ve requested. The number of Interrupts you’ve been granted is defined by (a) the number of Interrupt Resources you receive, and (b) the “message count” provided for each of those resources.

In your code above, you can see that InterruptTranslated->u.MessageInterrupt.Raw.MessageCount is being passed to you as zero. That means you’ve been granted ZERO MSI (as part of this Interrupt Resource). You can not just “assume” you’re going to get one MSI per port. So, doing that little loop based on the number of ports you define is not valid.

You need to look at the Flags field of your Translated Interrupt Resource to determine if the CM_RESOURCE_INTERRUPT_MESSAGE bit is set. If it is, then you have an MSI or MSI-X. If not, you’ve got an LBI. Which is what I bet you’ve got here.

This code should work for MSI or MSI-X resources – This is an updated version of the code I posted a couple of weeks back which was only correct for MSI-X:

            case CmResourceTypeInterrupt: {

                DbgPrint("Resource %lu: Interrupt of total so far %u\n", i, totalInterruptsFound);

                if(totalInterruptsFound < MY_DRIVER_EXPECTED_NUMBER_OF_INTERRUPTS) {

                    resourceRaw = WdfCmResourceListGetDescriptor(Resources, i);

                    if(resourceTrans->Flags & CM_RESOURCE_INTERRUPT_MESSAGE) {

                        WDF_INTERRUPT_CONFIG interruptConfig;

                        DbgPrint("\t\tInt type: MSI/MSI-X\n");

                        DbgPrint("\t\tMessages this resource: %u\n", resourceRaw->u.MessageInterrupt.Raw.MessageCount);

                        for(ULONG intNumber = 0; intNumber < resourceRaw->u.MessageInterrupt.Raw.MessageCount; intNumber++) {

                            //
                            // Create WDFINTERRUPT object thereby connecting to the interrupt.
                            //
                            WDF_INTERRUPT_CONFIG_INIT(&interruptConfig,
                                                      MyDriverIsr,
                                                      MyDriverDpcForIsr);
     
                            interruptConfig.EvtInterruptEnable  = MyDriverInterruptEnable;
                            interruptConfig.EvtInterruptDisable = MyDriverInterruptDisable;
                            interruptConfig.InterruptTranslated = resourceTrans;
                            interruptConfig.InterruptRaw        = resourceRaw;

                            status = WdfInterruptCreate(devContext->WdfDevice,
                                                &interruptConfig,
                                                WDF_NO_OBJECT_ATTRIBUTES,
                                                &devContext->InterruptObjects[totalInterruptsFound]);

                            if(!NT_SUCCESS(status)) {

                                DbgPrint("ConnectInterrupt failed for interrupt # %u? Status = 0x%lx\n", intNumber, status);

                                goto done;
                            }

                            DbgPrint("Successfully connected interrupt %u\n", totalInterruptsFound);

                            totalInterruptsFound ++;
                        }

                    } else {

                        DbgPrint("UNEXPECTED Int type LBI found\n");

                        //
                        // Win32: ERROR_INVALID_PARAMETER
                        //
                        status = STATUS_DEVICE_CONFIGURATION_ERROR;

                        goto done;
                    }

BTW, architecturally you are “required” to handle the case where zero messages are granted and you have to fall back to using a single LBI. I don’t handle that above.
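
For anyone following along, here is a rough user-mode sketch of that fallback decision, kept separate from the driver code above so it can be compiled and tried without the WDK. It is only an illustration under stated assumptions: MockInterruptResource, InterruptPlan and PlanInterrupts are hypothetical names, not WDK types or anything from the code posted in this thread. The only value taken from the WDK headers is the CM_RESOURCE_INTERRUPT_MESSAGE flag (0x0004).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two fields of a raw interrupt descriptor
// that matter for this decision. This is NOT the WDK structure.
struct MockInterruptResource {
    std::uint16_t flags;         // mirrors CM_PARTIAL_RESOURCE_DESCRIPTOR::Flags
    std::uint16_t messageCount;  // mirrors u.MessageInterrupt.Raw.MessageCount
};

// Same numeric value as CM_RESOURCE_INTERRUPT_MESSAGE in the WDK headers.
constexpr std::uint16_t kInterruptMessageFlag = 0x0004;

// Hypothetical result type: how many message interrupts to create, or
// whether to fall back to a single line-based interrupt (LBI).
struct InterruptPlan {
    unsigned messageInterrupts;
    bool useLineBased;
};

InterruptPlan PlanInterrupts(const std::vector<MockInterruptResource>& resources,
                             unsigned maxSupported)
{
    InterruptPlan plan{0, false};
    bool sawLineBased = false;

    for (const auto& r : resources) {
        if (r.flags & kInterruptMessageFlag) {
            // MSI or MSI-X: the grant is whatever the raw descriptor reports.
            plan.messageInterrupts += r.messageCount;
        } else {
            sawLineBased = true;
        }
    }

    // Never create more interrupt objects than the driver has slots for.
    if (plan.messageInterrupts > maxSupported) {
        plan.messageInterrupts = maxSupported;
    }

    // Fall back to the LBI only when no messages were granted at all.
    plan.useLineBased = (plan.messageInterrupts == 0) && sawLineBased;
    return plan;
}
```

With one message resource reporting 16 messages and a driver limit of 8, this plans 8 message interrupts. With only a line-based resource present, it plans zero messages and sets useLineBased, which is the case a real driver would then have to wire up to a single shared ISR.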

I should probably write a blog post on this topic, because this question comes up periodically and the whole issue of handling MSI/MSI-X is not well handled in the WDK docs.

And don’t forget: You need to enable MSI/MSI-X support and indicate the max number of MSI/MSI-X interrupts you handle, via Registry entries that you make via your INF. See the WDK Docs on this, which are pretty clear.

Thanks for the reply, Peter. I honestly suspected my code was wrong, which is why I posted it. As you said, the documentation does not handle MSI/MSI-X well. This is MSI (not MSI-X). The documentation stated that I needed to call WdfInterruptCreate for each message, so I basically took the code I had for a card that only supported LBI, put a for loop around it, and gave it a try.

I do not believe it is giving me an LBI because you will notice the logging at the top of the function is checking for CM_RESOURCE_INTERRUPT_MESSAGE in InterruptTranslated->Flags. Also if I remove the registry directives to enable MSI then it falls into the failure case.

This is what is in my INF (name changed to generic “My”):

[My_Device.NT.HW]
AddReg=My_Device_AddMsi

[My_Device_AddMsi]
HKR,Interrupt Management,,0x00000010
HKR,Interrupt Management\MessageSignaledInterruptProperties,,0x00000010
HKR,Interrupt Management\MessageSignaledInterruptProperties,MSISupported,0x00010001,1

I tried adding MessageNumberLimit with a value of 16 (which is the hardware max) like so:

HKR,Interrupt Management\MessageSignaledInterruptProperties,MessageNumberLimit,0x00010001,16

But I still get a 0 for the MessageCount. I also only get one CmResourceTypeInterrupt. These cards work just fine in Linux so in theory the hardware is correct. Not really sure what Windows is doing here.

Hmmmm… Well, as we know from long experience, what works in one OS doesn’t necessarily work in any other OS.

“The documentation stated that I needed to call WdfInterruptCreate for each message”

That’s a fairly recent addition, too. It’s helpful “as far as it goes” but could still be a lot more useful.

Changing the Registry parameter to be larger than what your device is requesting isn’t going to change the number of MSIs granted to you… this is a function of what the device is requesting.

OK! Let’s not guess. Have a look at the MSI Capability registers on your device. If you don’t already have a favorite utility with which to do this on Windows (and I’m guessing you don’t) I’d recommend you grab the free one from Teledyne/Lecroy. It generally seems to work for me, and its provenance is known (unlike some other similar yet famous utilities that are kicking around the Internet).

Peter
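
For readers who end up staring at the raw config-space dump, here is a small sketch of how the Message Control word of the MSI capability (capability ID 0x05, at offset 2 from the start of the capability) breaks down. The bit positions are the standard PCI ones. The struct and function names are made up for illustration, and none of the values come from the hardware discussed in this thread.

```cpp
#include <cstdint>

// Hypothetical decoded view of the 16-bit MSI Message Control register.
struct MsiMessageControl {
    bool enabled;        // bit 0: MSI Enable
    unsigned requested;  // bits 3:1: Multiple Message Capable, decoded as 2^n
    unsigned allocated;  // bits 6:4: Multiple Message Enable, decoded as 2^n
    bool address64;      // bit 7: 64-bit address capable
};

MsiMessageControl DecodeMsiMessageControl(std::uint16_t raw)
{
    MsiMessageControl mc;
    mc.enabled = (raw & 0x0001) != 0;
    // Both fields hold a log2 value. Encodings 0..5 mean 1..32 messages;
    // 6 and 7 are reserved and are not special-cased in this sketch.
    mc.requested = 1u << ((raw >> 1) & 0x7);
    mc.allocated = 1u << ((raw >> 4) & 0x7);
    mc.address64 = (raw & 0x0080) != 0;
    return mc;
}
```

As a made-up example, a Message Control value of 0x0088 decodes to a device that asks for 16 messages, has been allocated 1, and does not have MSI enabled yet. The "requested" number is the part that comes from the device itself, which is why raising a registry limit above it cannot increase the grant.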

Well, I found the typo/error that caused some of my confusion. I was still looking at my original log statement, which printed InterruptTranslated->u.MessageInterrupt.Raw.MessageCount.

Reading the Raw member through the translated descriptor does not work.

When I look at InterruptRaw->u.MessageInterrupt.Raw.MessageCount instead, I get a message count of 1. After setting MessageNumberLimit to 16 in the registry, I get a message count of 16.
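
In case it helps anyone else who hits this, below is a user-mode sketch of why the bogus read came back as 0. In the WDK headers the Raw and Translated views of u.MessageInterrupt are members of one union, so Raw.MessageCount occupies the upper 16 bits of Translated.Level. The two structs here are hypothetical mirrors of those views for a 64-bit little-endian build (they are not the real WDK types), and memcpy stands in for the union overlay.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical mirror of the Raw view of u.MessageInterrupt.
struct MockRawView {
    std::uint16_t Group;
    std::uint16_t MessageCount;
    std::uint32_t Vector;
    std::uint64_t Affinity;
};

// Hypothetical mirror of the Translated view of u.MessageInterrupt.
struct MockTranslatedView {
    std::uint32_t Level;
    std::uint32_t Vector;
    std::uint64_t Affinity;
};

static_assert(sizeof(MockRawView) == sizeof(MockTranslatedView),
              "the two views must overlay exactly");

// Returns what Raw.MessageCount appears to hold when the storage was
// actually filled in as a translated descriptor (little-endian assumed).
std::uint16_t MessageCountSeenThroughTranslated(std::uint32_t level,
                                                std::uint32_t vector)
{
    MockTranslatedView translated{level, vector, 0};
    MockRawView raw;
    std::memcpy(&raw, &translated, sizeof(raw));
    return raw.MessageCount;  // upper 16 bits of Level
}
```

Feeding in the values from my logs (Level 0x4 with Vector 0x40, or Level 0x6 with Vector 0x63) gives 0 both times, which matches the "MessageCount 0" I kept seeing. The real count is only available from the raw descriptor.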

So now that I got that sorted out I think I can go back and rework my code to do things properly based on your above example.

Thanks for all the help Peter.

Awesome… you are on track now, I think.

Peter