PCI.sys driver crash

Hello Everyone.

I am developing a KMDF driver for a PCI Express based accelerator card.

One problem we observed is a crash/bugcheck in PCI.sys when the system is restarted or shut down.
Below are the bugcheck details and stack trace.

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common BugCheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff80205260df2, The address that the exception occurred at
Arg3: fffff48650b9ed88, Exception Record Address
Arg4: fffff48650b9e570, Context Record Address

[0x0] pci!ExpressSaveTphMsiXStEntries+0x6a 0xfffff48650b9efc0 0xfffff80205246400
[0x1] pci!ExpressSaveTphRequesterCapability+0x16c 0xfffff48650b9eff0 0xfffff80205238004
[0x2] pci!ExpressSavePortRegisters+0x544 0xfffff48650b9f0a0 0xfffff80205234af0
[0x3] pci!PciPowerDownDevice+0x124 0xfffff48650b9f1a0 0xfffff802052522a4
[0x4] pci!PciDevice_SetPower+0x304 0xfffff48650b9f210 0xfffff8020523da56
[0x5] pci!PciDispatchPnpPower+0x126 0xfffff48650b9f340 0xfffff802733d42db
[0x6] nt!IopPoHandleIrp+0x3b 0xfffff48650b9f3a0 0xfffff802732969b3
[0x7] nt!IofCallDriver+0xf3 0xfffff48650b9f3d0 0xfffff80273466b19
[0x8] nt!IoCallDriver+0x9 0xfffff48650b9f410 0xfffff8020514c0ec
[0x9] ACPI!ACPIFilterIrpSetPower+0xdc 0xfffff48650b9f440 0xfffff8020514b286
[0xa] ACPI!ACPIDispatchIrp+0x1d6 0xfffff48650b9f4a0 0xfffff802733d42db
[0xb] nt!IopPoHandleIrp+0x3b 0xfffff48650b9f520 0xfffff802732969b3
[0xc] nt!IofCallDriver+0xf3 0xfffff48650b9f550 0xfffff80273466b19
[0xd] nt!IoCallDriver+0x9 0xfffff48650b9f590 0xfffff80204e86699
[0xe] Wdf01000!FxIrp::PoCallDriver+0x16 (Inline Function) (Inline Function)
[0xf] Wdf01000!FxPkgFdo::_PowerPassDown+0x79 0xfffff48650b9f5c0 0xfffff80204e865f8
[0x10] Wdf01000!FxPkgFdo::PowerReleasePendingDeviceIrp+0x38 0xfffff48650b9f5f0 0xfffff80204e7a0f8
[0x11] Wdf01000!FxPkgPnp::PowerGotoDxIoStoppedCommon+0x174 0xfffff48650b9f620 0xfffff80204e79f6b
[0x12] Wdf01000!FxPkgPnp::PowerGotoDxIoStopped+0x7 (Inline Function) (Inline Function)
[0x13] Wdf01000!FxPkgPnp::PowerGotoDNotZeroIoStopped+0xb 0xfffff48650b9f690 0xfffff80204e7dd0a
[0x14] Wdf01000!FxPkgPnp::PowerEnterNewState+0x152 0xfffff48650b9f6c0 0xfffff80204e7d9d8
[0x15] Wdf01000!FxPkgPnp::PowerProcessEventInner+0xe0 0xfffff48650b9f810 0xfffff80204e7b569
[0x16] Wdf01000!FxPkgPnp::PowerProcessEvent+0x15d 0xfffff48650b9f890 0xfffff80204e7b374
[0x17] Wdf01000!FxPkgFdo::LowerDevicePower+0x34 (Inline Function) (Inline Function)
[0x18] Wdf01000!FxPkgFdo::DispatchDeviceSetPower+0x7c 0xfffff48650b9f930 0xfffff80204e79895
[0x19] Wdf01000!FxPkgFdo::_DispatchSetPower+0x25 0xfffff48650b9f980 0xfffff80204ea9553
[0x1a] Wdf01000!FxPkgPnp::Dispatch+0x103 0xfffff48650b9f9b0 0xfffff80204e9dcc2
[0x1b] Wdf01000!DispatchWorker+0xea (Inline Function) (Inline Function)
[0x1c] Wdf01000!FxDevice::Dispatch+0xf3 (Inline Function) (Inline Function)
[0x1d] Wdf01000!FxDevice::DispatchWithLock+0x232 0xfffff48650b9fa20 0xfffff802733d40de
[0x1e] nt!PopIrpWorker+0x2de 0xfffff48650b9fa80 0xfffff8027345904a
[0x1f] nt!PspSystemThreadStartup+0x5a 0xfffff48650b9fb30 0xfffff802736741c4
[0x20] nt!KiStartSystemThread+0x34 0xfffff48650b9fb80 0x0

Initially this PCIe device implemented both the MSI and MSI-X capabilities. The MSI implementation is fully functional, but the MSI-X implementation is not fully functional in hardware.

Since the Linux driver for this device can enable either interrupt type (MSI or MSI-X), it enables MSI only.

On Windows, the driver's EvtDevicePrepareHardware callback was receiving MSI-X interrupt resource(s) (Windows automatically selects MSI-X when a device supports both MSI and MSI-X).

Because MSI-X is not fully implemented on this device, we asked the HW team to disable the MSI-X capability in hardware. They did so by setting the "Next Cap Pointer" field to 0x0 in the preceding capability, which previously pointed to the MSI-X capability. With this change, the driver's EvtDevicePrepareHardware callback receives MSI interrupts, and the device works as expected.
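To illustrate why clearing the "Next Cap Pointer" hides MSI-X from software, here is a minimal user-mode sketch of the standard capability list walk over a 256-byte snapshot of configuration space. The function name find_capability and the snapshot buffer are hypothetical; only the register offsets and capability IDs come from the PCI specification.

```c
#include <stdint.h>

#define PCI_STATUS_OFFSET       0x06  /* low byte of the Status register */
#define PCI_STATUS_CAP_LIST     0x10  /* Status bit 4: capability list present */
#define PCI_CAP_POINTER_OFFSET  0x34  /* Capabilities Pointer register */
#define PCI_CAP_ID_MSI          0x05
#define PCI_CAP_ID_MSIX         0x11

/*
 * Hypothetical helper: walk the standard capability list in a 256-byte
 * config-space snapshot. Returns the offset of the capability with the
 * given ID, or 0 if it is not reachable through the linked list.
 */
uint8_t find_capability(const uint8_t *cfg, uint8_t cap_id)
{
    if (!(cfg[PCI_STATUS_OFFSET] & PCI_STATUS_CAP_LIST))
        return 0;

    /* The low two bits of every capability pointer are reserved. */
    uint8_t off = cfg[PCI_CAP_POINTER_OFFSET] & 0xFC;

    /* At most 48 four-byte capabilities fit between 0x40 and 0xFF;
       the guard also protects against a looping list. */
    for (int guard = 0; off >= 0x40 && guard < 48; guard++) {
        if (cfg[off] == cap_id)
            return off;
        off = cfg[off + 1] & 0xFC;  /* follow Next Cap Pointer */
    }
    return 0;  /* Next Cap Pointer of 0 terminates the list */
}
```

With the preceding capability's next pointer set to 0, a walk like this never reaches the MSI-X structure, even though the MSI-X registers may still physically exist in the device.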

During our testing we observed a crash/bugcheck in PCI.sys when the system is restarted or shut down.

Since the crash is in PCI.sys, I assume it is caused by the incorrect implementation of the MSI-X capability (I am not sure whether Windows still treats it as a valid capability) and/or the PCI_EXPRESS_TPH_REQUESTER_CAP_ID capability.
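One way to check this is to look at what the TPH Requester capability advertises. The frame name pci!ExpressSaveTphMsiXStEntries suggests that PCI.sys is saving steering tag entries held in the MSI-X table, which would be a problem if the TPH capability reports its ST table there while MSI-X has been unlinked. That is only a hypothesis based on the stack trace, not a confirmed cause. The sketch below walks the extended capability list in a 4 KB config-space snapshot and decodes the ST Table Location field; the function names and the snapshot buffer are hypothetical, and the field layout is from the PCIe specification.

```c
#include <stdint.h>

#define PCIE_EXT_CAP_START   0x100   /* first extended capability header */
#define PCIE_CFG_SIZE        0x1000  /* 4 KB extended config space */
#define PCIE_EXT_CAP_ID_TPH  0x0017  /* TPH Requester extended capability */

/* Read a little-endian 32-bit register from the snapshot. */
static uint32_t read32(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] |
           ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) |
           ((uint32_t)cfg[off + 3] << 24);
}

/*
 * Hypothetical helper: return the offset of the extended capability
 * with the given ID, or 0 if it is not in the list.
 */
uint16_t find_ext_capability(const uint8_t *cfg, uint16_t cap_id)
{
    uint16_t off = PCIE_EXT_CAP_START;

    for (int guard = 0;
         off >= PCIE_EXT_CAP_START && off <= PCIE_CFG_SIZE - 8 && guard < 960;
         guard++) {
        uint32_t hdr = read32(cfg, off);
        if (hdr == 0 || hdr == 0xFFFFFFFFu)
            break;                           /* no capability here */
        if ((hdr & 0xFFFF) == cap_id)        /* bits 15:0 = capability ID */
            return off;
        off = (uint16_t)((hdr >> 20) & 0xFFC); /* bits 31:20 = next offset */
    }
    return 0;
}

/*
 * Decode ST Table Location (bits 10:9 of the TPH Requester Capability
 * register at capability offset + 4):
 *   0 = no ST table, 1 = in the TPH capability structure,
 *   2 = in the MSI-X table, 3 = reserved.
 */
unsigned tph_st_table_location(const uint8_t *cfg, uint16_t tph_off)
{
    return (read32(cfg, (uint16_t)(tph_off + 4)) >> 9) & 0x3;
}
```

If the location decodes to 2 on a device whose MSI-X capability is no longer reachable, the two capabilities are inconsistent with each other, and that would be worth raising with the HW team.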

One observation: the crash happens only when our PCI Express accelerator device driver is enabled. If the driver is disabled or not loaded, there is no crash in PCI.sys.

Is there any way to disable these capabilities (both MSI-X and PCI_EXPRESS_TPH_REQUESTER_CAP_ID), or to instruct the OS not to use them for our PCIe device?

Please help me to resolve this issue.

One more question: can we read the PCI Express extended configuration space using BUS_INTERFACE_STANDARD.GetBusData, as below?

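BUS_INTERFACE_STANDARD pciBus = { 0 };
USHORT pciExtendedConfig = 0;
ULONG pciExtendedConfig_Offset = 0x100; /* start of extended configuration space */

/* Query the bus driver for the standard bus interface. */
NTSTATUS status = WdfFdoQueryForInterface(pdev_ctx->wdfDevice, &GUID_BUS_INTERFACE_STANDARD,
    (PINTERFACE)&pciBus, sizeof(BUS_INTERFACE_STANDARD), 1 /* Version */, NULL);
if (!NT_SUCCESS(status)) {
    return status;
}

/* Read sizeof(USHORT) bytes at the offset; the return value is the number of bytes read. */
ULONG bytesRead = pciBus.GetBusData(pciBus.Context, PCI_WHICHSPACE_CONFIG, &pciExtendedConfig,
    pciExtendedConfig_Offset, sizeof(USHORT));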

Thanks in advance.
Madhu

If you have known bad hardware, you should fix that. If you absolutely can't, then it might be possible, with a lot of effort, to work around this problem. It will be painful, and a forum like this won't be much help, as we can't spend the required time to look at the details. Think of months of work as a likely estimate.