What is wrong with this PCIe device?

I have a PCIe device and I am trying to get my KMDF driver loaded and running on it.

The KMDF driver’s DriverEntry and Add() get called, but PrepareHardware() never gets called. Instead, DeviceCleanup() followed by DriverCleanup() gets invoked, and that results in a yellow bang.

I have provided the !pci dump (2 different executions of the same command). It is an x1 PCIe device. At the vendor's request, I have replaced the Vendor ID, Device ID and subsystem vendor ID with invalid values, but the rest of it stays the same.

a) The command register has Bus Master, Memory Space Enable and I/O Space Enable cleared, so any host MMIO access will result in a master abort. This just means Windows has disabled the device, but why?

b) Is it because the device is requesting a 32-bit BAR (BAR0 is 32-bit prefetchable)?

c) If I write all 1s (0xFFFFFFFF) into BAR0 and BAR1, the length comes back as 0x2000, which is the correct size per the device vendor.
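The sizing step in (c) can be sketched in C as follows. This is only an illustration of the standard BAR sizing arithmetic, not code from the driver: the helper name mem_bar32_size is hypothetical, and the readback value 0xFFFFE008 used below is an assumed example for an 8 KB, 32-bit prefetchable BAR rather than a value taken from the dump.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the value read back from a 32-bit memory BAR
 * after writing 0xFFFFFFFF to it, return the size in bytes that the BAR
 * requests. Returns 0 if the BAR is unimplemented (reads back as 0).
 */
static uint32_t mem_bar32_size(uint32_t readback)
{
    /* Bits 3:0 of a memory BAR hold type/prefetch flags, not address bits. */
    uint32_t mask = readback & 0xFFFFFFF0u;

    if (mask == 0)
        return 0;

    /* The size is the two's complement of the writable address mask. */
    return (uint32_t)(~mask + 1u);
}
```

For the assumed readback of 0xFFFFE008, mem_bar32_size() returns 0x2000, which matches the size the vendor quotes.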

I was thinking Windows does not like the class code (it translates to a processor controller). They claim their Linux driver loads on it just fine.

At the bottom I have another example of the same device where !pci shows the BAR as MPF for BAR0. What exactly does this mean?
Any ideas on how to enable PCI bus driver tracing so I can figure out what is going on?

I have provided the !pci dump (both commands) as an attached file because of character count limitations.

Any help is much appreciated. I am just developing the driver for them. I do not have any PCI express protocol analyzer tools.

Thanks,
Rama

The only thing I don’t like is this:

WARNING PMC non-zero reserved fields 01c0

Those bits in PCI_PMC are indeed reserved.

Might wanna ask your IHV what they think they’re doing and if they can give you a board without these bits set (it may be a very simple matter to update the config ROM for you as a test).

Peter

Peter, thanks for the quick response. I just noticed that. I am not sure if this is the reason why Windows refuses to load the driver but I am going to ask them to fix it. Yes, there might be a way to update the FW locally here (They have a Linux based tool that relies on the driver to update the FW).

Thanks and appreciate it very much.
–RK

You say it’s a 32-bit BAR, and indeed the low bits say so, but it’s curious that it seems to be set up for three 64-bit BARs. Note the 0xC bits in the bottom of BAR2 and BAR4. Are you absolutely sure your device responds to BAR1 with 0xffffffff? If the FPGA is set up to handle a 64-bit BAR but the config space says it’s 32, that would cause confusion. Or if BAR2 and BAR4 responded oddly, that also would be a Bad Thing.
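To make the "low bits" point concrete, here is a minimal C sketch of how the bottom four bits of a BAR are interpreted per the PCI specification. The struct and function names are hypothetical, chosen only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the type bits in the low nibble of a BAR. */
typedef struct {
    bool is_io;        /* bit 0 set: I/O space BAR */
    bool is_64bit;     /* bits 2:1 == 10b: 64-bit memory BAR (uses two slots) */
    bool prefetchable; /* bit 3 set: prefetchable memory */
} bar_type;

static bar_type decode_bar_type(uint32_t bar)
{
    bar_type t = { false, false, false };

    if (bar & 0x1u) {
        /* I/O BARs have no 64-bit or prefetchable attributes. */
        t.is_io = true;
        return t;
    }

    t.is_64bit = ((bar >> 1) & 0x3u) == 0x2u;
    t.prefetchable = (bar & 0x8u) != 0;
    return t;
}
```

With this decoding, a low nibble of 0x8 is a 32-bit prefetchable memory BAR, while 0xC is a 64-bit prefetchable memory BAR, which is why the 0xC in BAR2 and BAR4 looks like a request for additional 64-bit BARs.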

(As an aside… I’ve seen SUCH funky PCIe devices lately. I thought the era where “everyone was a device designer” because they could program an FPGA was bad? What I’m seeing lately is even worse. Saw the first device in my life that had no valid contents (showed BAR not used) in BAR0 or BAR1… but had valid contents in BAR2/3 and BAR4/5.

There is also some kind of weird trend in which people are putting device registers in PCIe extended config space. I see this even on some Intel devices. I. Just. Don’t. Get. It. WTF is going on?)

Peter

Tim, yes you are right. I think this might also be a problem that I will ask the IHV to fix. I am not sure why this has to be a 32-bit BAR.
I did verify that the 2 64-bit BARs at BAR2 and BAR4 are 0 length BARs.

I tried an experiment.

a) I power cycled the machine (My driver was not loaded)
Once I booted into Windows, I ran the RW-Everything tool and noticed that the command register had 0x6, which means bus mastering and memory space decode are enabled. I was able to read the physical memory (the physical address is 0x00000000F0000000, which is the value in BAR0). The IHV has confirmed that the values seen at this address are the default values.

b) Then I loaded my driver; it failed to load and the device showed a yellow bang.
Then I saw the same command register set to 0 which makes sense because Windows PCI bus driver must have written this value to disable the device.
If I now use RW-Everything to write CMD register to 0x6, then I can decode the physical space behind BAR0 again and it shows the same values as in (a).

So this means the upstream PCIe root ports and internal routing logic have been programmed properly to route accesses to the physical address in BAR0 to the device. It is just that the device has been disabled.
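For reference, the command register values seen in this experiment decode as in the sketch below. The bit positions are the standard ones from the PCI specification; the macro and function names are hypothetical and only serve the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard PCI command register enable bits (names are illustrative). */
#define CMD_IO_SPACE_ENABLE     0x0001u /* bit 0 */
#define CMD_MEMORY_SPACE_ENABLE 0x0002u /* bit 1 */
#define CMD_BUS_MASTER_ENABLE   0x0004u /* bit 2 */

static bool cmd_io_enabled(uint16_t cmd)
{
    return (cmd & CMD_IO_SPACE_ENABLE) != 0;
}

static bool cmd_memory_enabled(uint16_t cmd)
{
    return (cmd & CMD_MEMORY_SPACE_ENABLE) != 0;
}

static bool cmd_bus_master_enabled(uint16_t cmd)
{
    return (cmd & CMD_BUS_MASTER_ENABLE) != 0;
}
```

A value of 0x6 therefore means memory decode and bus mastering are on with I/O decode off, and a value of 0 means the device will not claim any MMIO access to BAR0 even though the address is still routed to it.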

It could be because of the PCI PMC register that Peter identified, and also BAR0, which is marked 32-bit but takes up 2 slots (config offsets 0x10 and 0x14).

Thanks,
Rama

I did verify that the 2 64-bit BARs at BAR2 and BAR4 are 0 length BARs.

But that’s not right. An unused BAR is read-only and must be hardwired to 0. Whatever you write, it needs to return all 0 bits. BAR2 and BAR4 are not doing that.

That also applies to BAR1, which is going to be a separate BAR, since BAR0 is 32-bit. All of the fields from BAR1 to BAR5 should always return all 0.
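As a rough illustration of that rule, the following C sketch takes the six values read back after writing 0xFFFFFFFF to each BAR and reports which BARs returned anything other than 0. The function name and the sample readback values are hypothetical, not taken from the dump.

```c
#include <stdint.h>

#define PCI_NUM_BARS 6

/*
 * Hypothetical check: readback[i] is the value read from BAR i after
 * writing 0xFFFFFFFF to it. Bit i of the result is set if BAR i returned
 * any non-zero bit, i.e. it is not hardwired to 0.
 */
static unsigned responding_bars(const uint32_t readback[PCI_NUM_BARS])
{
    unsigned mask = 0;

    for (int i = 0; i < PCI_NUM_BARS; i++) {
        if (readback[i] != 0)
            mask |= 1u << i;
    }
    return mask;
}
```

For a device that implements only a single 32-bit BAR0, the expected result is 0x01. Any other bit being set, for example because BAR2 still returns its 0xC type bits, indicates a BAR that is not hardwired to 0.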

Tim, thanks. You are right. I just checked the PCI LB3 and PCIe specs. I was wrong there. I will tell the IHV to fix this.

@“Peter_Viscarola_(OSR)” said:
(As an aside… I’ve seen SUCH funky PCIe devices lately. I thought the era where “everyone was a device designer” because they could program an FPGA was bad? What I’m seeing lately is even worse. Saw the first device in my life that had no valid contents (showed BAR not used) in BAR0 or BAR1… but had valid contents in BAR2/3 and BAR4/5.

There is also some kind of weird trend in which people are putting device registers in PCIe extended config space. I see this even on some Intel devices. I. Just. Don’t. Get. It. WTF is going on?)

Peter

Hi Peter,
I have run into this situation as well, and talked to the developer of the silicon that did it. Their response was: we are not going to certify our PCIe peripheral, nor are we interested in following PCIe rules. If we can put registers there, we will, since it makes them easy to use. I.e., do what we want to keep things simple for a one-of-a-kind PCIe peripheral which will never be inserted into a PCIe slot on any PC or Mac, and will not even exist as a removable PCIe card.
Sergey

Thanks for sharing that story. I appreciate it. That makes some sense, actually.

But the devices I’m seeing are mainline, intended for mass production, “stick-em into any user’s PC” kind of devices. And, you know, in Linux PCIe Extended Config Space is memory mapped into the driver’s kernel virtual address space. So they read/write the space like any other “ space (what we would call “register” space in Windows).

Ugh… just terrible hardware engineering these days.

Peter

Hi Peter,
perhaps doing such bad things makes some sense for them. The mainline PCIe devices you mention are following the same hacky paradigm as the proprietary ones. My understanding of why even mass production PCIe device manufacturers do this, IMHO, comes from the following thought: we make some ABC chip and want it to work in many PCs (i.e. be sold as a popular product). All we want is a cheap way to transfer data between Windows or Linux and our chip. Sounds simple, right? But options are limited; for an internal card that has to be PCIe only, like it or not. They in fact hate it! Because they need to deal with PCIe. And so they choose to deal with PCIe any way which works for them, the simpler the better, cutting all corners. What PCIe rules??? We don’t care, just get us a way to exchange data and we make and sell our chip. So there is already some kB of PCIe memory for us, then use it! Put data in it, registers, whatever we need. That’s not enough? How about using subsystem_vendor_id as a chip register? Have seen even that.
Oh well. Modern hardware engineering.
Cheers,
Sergey