I’m an R&D driver engineer working at Avago/Broadcom on PLX-based products. (PLX was acquired by Avago 2 years ago; Avago acquired Broadcom this spring, and the name on the building is now Broadcom Limited, which is now classified as the 3rd largest semiconductor company. Avago had also acquired LSI Logic and Emulex a while back, and was originally a spinoff from HP’s semiconductor group.) I’ll send an email to make sure the PLX technical support engineers are aware of this bridge serial number requirement for Windows OSs. It might not be a requirement to be PCIe spec compliant or to work with Linux. From a card manufacturing point of view, ensuring each PCIe bridge has a unique serial number burned into the configuration EEPROM sounds like a requirement that could easily get missed (or ignored for board cost reduction).
My understanding is Windows would like every PCIe device to have a unique serial number, and this is not always the case.
Starting around Windows Server 2012, the OS PCIe bus driver tries to use PCIe serial numbers as a unique value for PnP instance id generation. In previous versions, a hash of the device’s location in the PCIe device tree (like the BDF, i.e. bus/device/function) was used. When software control over the PCIe hierarchy started happening (I think in the 2009 time frame), using a PnP id based on the location in the PCIe bus hierarchy became a problem. As an example, say you have a server with 2 partitionable NIC cards. You set the firmware configuration on the first NIC to partition it into 4 devices instead of 1. When the system is rebooted, the PnP instance ids of the second NIC have changed, because the first NIC is now using 4 PCIe bus ids instead of 1 and the second NIC is now at different BDFs. Nothing has been physically moved. When the NIC PnP instance id changes, the settings associated with that NIC are no longer found, which is especially catastrophic if the system was iSCSI booting from that second NIC.
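To make the NIC example concrete, here is a toy sketch in Python. It is purely illustrative and is not how the Windows PCI bus driver works: `assign_buses` and `location_id` are made-up helpers, and it assumes bus numbers are simply handed out in enumeration order.

```python
def assign_buses(cards, first_bus=1):
    """Hand out bus numbers in enumeration order.

    cards: list of (name, number_of_bus_ids_consumed) tuples.
    Returns a dict mapping each card name to its first bus number.
    """
    layout = {}
    next_bus = first_bus
    for name, buses in cards:
        layout[name] = next_bus
        next_bus += buses
    return layout


def location_id(bus, device=0, function=0):
    # Hypothetical location-derived id: just the BDF formatted as text.
    return f"{bus:02x}:{device:02x}.{function}"


# Both NICs unpartitioned: each consumes one bus id.
before = assign_buses([("nic1", 1), ("nic2", 1)])
# First NIC repartitioned into 4 devices: it now consumes four bus ids.
after = assign_buses([("nic1", 4), ("nic2", 1)])

id_before = location_id(before["nic2"])
id_after = location_id(after["nic2"])
print(id_before, id_after)
```

The second NIC was never touched, yet its location-derived id differs between the two layouts, so anything keyed on that id is lost.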
So, the solution to the changing PCIe BDFs was to make the PCIe bus driver use the device serial number as the PnP instance id. The bus hierarchy could change, but the serial number stayed the same. One side effect of this: your NIC goes bad, you plug in a replacement, the new card has a different serial number and so generates a different PnP instance id, and your iSCSI booted system no longer boots because it can’t find the NIC settings. I believe a workaround for this was also added (there was a Server 2012 hot fix, and I believe Server 2012 R2 had the fix out of the box), so that if a new PCIe serial number shows up, the PCIe bus driver attempts to match the BDF and VID/PID, and uses the old instance id.
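A rough Python sketch of that lookup order, as I understand it. The `Device` record, `resolve_instance_id`, the `SN-` id format and the VID/PID values are all invented for illustration; the real bus driver logic is not public and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    serial: int  # 64-bit PCIe device serial number
    bdf: str     # bus:device.function location
    vid: int     # vendor id
    pid: int     # device (product) id


def resolve_instance_id(dev, known):
    """known: dict of instance id -> Device as recorded on a previous boot."""
    # 1. Serial number seen before: keep the id even if the BDF moved.
    for instance_id, old in known.items():
        if old.serial == dev.serial:
            return instance_id
    # 2. Unknown serial, but same slot and same VID/PID: assume a
    #    replacement card and reuse the old id so its settings are found.
    for instance_id, old in known.items():
        if (old.bdf, old.vid, old.pid) == (dev.bdf, dev.vid, dev.pid):
            return instance_id
    # 3. Genuinely new device: derive a fresh id from its serial number.
    return f"SN-{dev.serial:016X}"


known = {"SN-00000000000000AA": Device(0xAA, "05:00.0", 0x14E4, 0x1657)}
replacement = Device(0xBB, "05:00.0", 0x14E4, 0x1657)
print(resolve_instance_id(replacement, known))
```

Step 2 is what keeps the iSCSI boot case working after a card swap; without it the replacement card falls through to step 3 and gets a new id.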
So the new PCIe instance id generation logic also has to deal with the case where a device has a serial number, but it’s a duplicate of another device’s serial number. My understanding is the PCIe bus driver has to calculate a repeatable unique value for the PnP instance id, which I believe is prefixed with a hash of the parent bus PnP instance ids. The result of this algorithm is that PCIe devices get a repeatable instance id, although if you uninstall the parent bridge drivers and let the OS rediscover them, the instance ids are not guaranteed to be the same as before. If your PCIe device has a unique serial number, it doesn’t matter what the instance ids of the parent bridges are. If your device does not have a serial number or has a duplicate number, the parent instance ids do matter to the generation of your device instance id. There is a KMDF WHQL test that removes the parent bridges, which can trigger this behavior.
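Here is a minimal Python sketch of that behavior, again only my understanding and not the actual algorithm. `make_instance_ids`, the use of SHA-1, the 8-character prefix and the `&` separator are all assumptions chosen for illustration.

```python
import hashlib
from collections import Counter


def make_instance_ids(devices):
    """devices: list of (parent_instance_id, location, serial_or_None).

    Returns one instance id per device, in the same order.
    """
    # Count how often each serial number appears across the whole system.
    counts = Counter(serial for _, _, serial in devices if serial is not None)
    ids = []
    for parent, location, serial in devices:
        if serial is not None and counts[serial] == 1:
            # Unique serial: the id does not depend on the parent at all.
            ids.append(f"{serial:016X}")
        else:
            # Missing or duplicate serial: fall back to a hash of the
            # parent's instance id plus the location under that parent.
            prefix = hashlib.sha1(parent.encode()).hexdigest()[:8]
            ids.append(f"{prefix}&{location}")
    return ids


devices = [
    ("PCI_ROOT_A", "01:00.0", 0x1234),  # unique serial
    ("PCI_ROOT_A", "02:00.0", 0xDEAD),  # default serial, duplicated below
    ("PCI_ROOT_B", "03:00.0", 0xDEAD),
    ("PCI_ROOT_B", "04:00.0", None),    # no serial number capability
]
print(make_instance_ids(devices))
```

The ids are repeatable as long as the parent ids stay the same. If a parent bridge is removed and rediscovered with a different instance id, only the devices on the fallback path get new ids.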
There has been ongoing evolution of PnP instance id generation, and I have not heard whether Win 10/Server 2016 have changed it. If anybody has corrections or additions, please do add to this thread.
This is also part of the reason Thunderbolt support is difficult. If you plug in a Thunderbolt device, it participates in the PCIe bus hierarchy enumeration, and can cause changes to the BDF and, as a result, the PnP instance ids of devices later in the tree. To correctly support Thunderbolt dynamically, the OS would need to do PCIe bus id rebalancing, which I fuzzily remember being partially supported in Win 10. A problem is that if a boot critical device needs to have its PCIe bus id reassigned, you’re stuck: you can’t disable it, so you can’t reassign bus ids, so you can’t open up bus id space for the hot plugged Thunderbolt device.
I assume NVMe storage devices may be introducing issues in PCIe enumeration, as it seems reasonable to want NVMe devices you can partition, just like you want NICs you can partition. You definitely want hot pluggable NVMe storage devices.
Some hardware engineers are oblivious about how OSs need to dynamically cope with hot plug devices, so are unaware of the need for PCIe serial numbers. Linux and Windows also handle device enumeration a little differently, so a product that works fine on Linux may not work fine on Windows.
Jan
From: on behalf of Frank van Eijkelenburg
Reply-To: Windows List
Date: Friday, June 17, 2016 at 6:55 AM
To: Windows List
Subject: Re: [ntdev] Re: Driver PCI returned invalid ID for a child device
Hi Peter,
My colleague investigated the issue and as far as I understand it was the PEX8624 device on our board which had the capability for a serial number and also had a valid (default) serial number present. However, this serial number must be unique within a single system.
In our case, we had multiple boards within a single system and for all PEX8624 devices the serial number was the same (the default manufacturer serial number).
You can override the default serial number by writing the EEPROM of the PEX8624 device.
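For anyone wanting to check their own boards for this, a small Python sketch that walks the PCIe extended capability list in a raw configuration-space dump, pulls out the Device Serial Number (extended capability ID 0x0003), and reports serial numbers shared by more than one device. How you obtain the 4 KB config-space bytes is left open; `read_serial` and `duplicate_serials` are illustrative names, not part of any existing tool.

```python
import struct

DSN_CAP_ID = 0x0003  # Device Serial Number extended capability id


def read_serial(config):
    """Return the 64-bit serial from a config-space dump, or None."""
    offset = 0x100  # extended capabilities start at offset 0x100
    seen = set()
    while offset and offset not in seen and offset + 12 <= len(config):
        seen.add(offset)  # guard against a looping capability list
        (header,) = struct.unpack_from("<I", config, offset)
        if header in (0, 0xFFFFFFFF):
            return None  # no extended capabilities present
        cap_id = header & 0xFFFF
        if cap_id == DSN_CAP_ID:
            # Lower dword at +4, upper dword at +8.
            low, high = struct.unpack_from("<II", config, offset + 4)
            return (high << 32) | low
        offset = (header >> 20) & 0xFFC  # next capability offset
    return None


def duplicate_serials(configs):
    """configs: dict of device name -> config-space bytes.

    Returns {serial: [names]} for serials used by more than one device.
    """
    by_serial = {}
    for name, config in configs.items():
        serial = read_serial(config)
        if serial is not None:
            by_serial.setdefault(serial, []).append(name)
    return {sn: names for sn, names in by_serial.items() if len(names) > 1}
```

Running `duplicate_serials` over the bridges of a multi-board system like ours would list every board still carrying the default serial number.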
Best regards,
Frank van Eijkelenburg
On Fri, Jun 17, 2016 at 3:28 PM, > wrote:
Thank you, Mr. van Eijkelenburg, for taking the time to post back here and report the resolution to this issue.
Can you please clarify one detail for us: Did your device not have a serial number capability, or did you have one that has the capability but with no serial number (either all zeros or all 0xFF)?
I’ll be very surprised to hear that the (optional) serial number capability is now somehow required. I’ve never met a board that implements it, and if something somewhere is now trying to ensure that capability is present…
Peter
OSR
@OSRDrivers
—
NTDEV is sponsored by OSR
Visit the list online at: http:
MONTHLY seminars on crash dump analysis, WDF, Windows internals and software drivers!
Details at http:
To unsubscribe, visit the List Server section of OSR Online at http: