PCIe over Thunderbolt, here are good guidelines

Eric_Wittmayer Member Posts: 47

This document is the best set of guidelines I've seen for supporting PCIe over Thunderbolt. It covers everything I learned while updating a PCIe driver and PCIe hardware for use on Thunderbolt.
Thunderbolt Device Driver Programming Guide
Yes, it is from Apple, so you can ignore the macOS-specific material, but most of the issues are the same on any OS.

These sections cover the topics I would emphasize to anyone adding support for Thunderbolt:

Tolerating PCI Latency: This is the area where we also made hardware changes to the PCIe device along with driver changes.

Using Hot Plug Operation with PCI Devices: Check for device gone everywhere.
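
As a rough illustration of "check for device gone everywhere", here is a minimal C sketch. It relies on the general PCIe behavior that a read from a device that is no longer present typically completes with all bits set. Every name in it (device_t, device_read32, REG_ID_OFFSET, and the fake_hw_t backend that simulates the hardware) is hypothetical and is not taken from the Apple guide or from any Windows API; a real driver would wrap its own register access routine in the same way.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value a read typically returns once the device has been unplugged. */
#define REG_ALL_ONES  0xFFFFFFFFu

/* Hypothetical offset of a register that never legitimately reads as
 * all ones (an ID-style register), used to confirm a suspected removal. */
#define REG_ID_OFFSET 0x00u

/* Hypothetical accessor type: stands in for the driver's real MMIO read. */
typedef uint32_t (*reg_read_fn)(void *ctx, uint32_t offset);

typedef struct {
    reg_read_fn read;  /* underlying register read routine */
    void       *ctx;   /* opaque context passed to the read routine */
    bool        gone;  /* sticky: set once removal has been detected */
} device_t;

/* Read a 32-bit register. Returns false, leaving *value untouched, if the
 * device has been (or was previously) detected as gone. */
bool device_read32(device_t *dev, uint32_t offset, uint32_t *value)
{
    if (dev->gone) {
        return false;
    }

    uint32_t v = dev->read(dev->ctx, offset);
    if (v == REG_ALL_ONES) {
        /* All ones may be valid data, so confirm with the ID register. */
        if (dev->read(dev->ctx, REG_ID_OFFSET) == REG_ALL_ONES) {
            dev->gone = true;
            return false;
        }
    }

    *value = v;
    return true;
}

/* Simulated hardware used only to exercise the sketch. */
typedef struct {
    bool     present;
    uint32_t regs[4];
} fake_hw_t;

uint32_t fake_read(void *ctx, uint32_t offset)
{
    fake_hw_t *hw = ctx;
    if (!hw->present) {
        return REG_ALL_ONES;  /* unplugged: reads complete as all ones */
    }
    return hw->regs[(offset / 4u) % 4u];
}
```

Callers would check the return value on every access (interrupt handlers, polling loops, completion paths) and abandon the operation as soon as it is false, rather than acting on the all-ones data.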

Hopefully someone finds this helpful.

Eric

Comments

  • Gabe_Jones Member Posts: 68

    Do you have any experience with the Kernel DMA Protection introduced in Windows 10 1803? I think we are encountering a failure due to this with a legacy WDM PCI driver. We are getting a DRIVER_VERIFIER_DMA_VIOLATION bugcheck with Arg1 0x26 (IOMMU detected DMA violation) when running the hardware over Thunderbolt. My suspicion is that this driver does not use the Windows DMA APIs and thus does correctly deal with DMA remapping. I found it interesting that it reports as a DRIVER_VERIFIER violation even though Driver Verifier is not running on this system. I have this vague nagging feeling that I've encountered one other bugcheck in the past that had this behavior (i.e., saying it was a DV bugcheck when DV was not running), but it still caught me by surprise.

  • Peter_Viscarola_(OSR) Administrator Posts: 7,688

    My suspicion is that this driver does not use the Windows DMA APIs and thus does correctly deal with DMA remapping

    That would, of course, cause the problem.

    Are you SURE verifier isn’t running at the time of the crash? You know Windows WILL automatically enable it following certain crashes.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Gabe_Jones Member Posts: 68

    @Peter_Viscarola_(OSR) said:

    My suspicion is that this driver does not use the Windows DMA APIs and thus does correctly deal with DMA remapping

    That would, of course, cause the problem.

    Yeah, I need to dig through the (maze of twisty passages) code and make sure of this, but from what I know about it it's not a bad hunch. (I just noticed that I put "does correctly" instead of "does not correctly" in the OP. :/)

    Are you SURE verifier isn’t running at the time of the crash? You know Windows WILL automatically enable it following certain crashes.

    I haven't had my hands on the system yet, but it is with a colleague that I trust and he said that DV was not enabled. I didn't realize that it would automatically be enabled following certain crashes. Are those documented somewhere, or is it just tribal knowledge?

  • Peter_Viscarola_(OSR) Administrator Posts: 7,688

    Are those documented somewhere, or is it just tribal knowledge?

    I dunno. I’m just telling you what I’ve experienced. You can check from the dump using !verifier.

    Peter

  • Gabe_Jones Member Posts: 68

    @Peter_Viscarola_(OSR) said:
    I dunno. I’m just telling you what I’ve experienced. You can check from the dump using !verifier.

    !verifier 0x1 doesn't show any drivers being verified, and the only flag set is (0x00000000) Automatic Checks. Notably, (0x00000080) DMA checking is not enabled.

  • Peter_Viscarola_(OSR) Administrator Posts: 7,688

    That is super interesting.

    Google says there are other folks seeing issues like the one you're reporting. One guy, like you, is very clear that Verifier is not running. In some cases, people have solved their problem by flashing the BIOS with the latest. Others have resulted from errors in the Dell Thunderbolt dock driver.

    So, in addition to your observation (which is definitive), there's additional evidence that the IOMMU checks are being done even when Verifier is not enabled. I guess this makes sense... they can figure out if the IOMMU isn't being used (properly) without having to go to the extreme of forcing data to be double-buffered (which is what the DMA Verification option does). Assuming this is the case, I can see how they might just use the Driver Verifier bugcheck code to indicate that the error results from checking on the activity of an errant driver (there aren't an unlimited number of bugcheck codes, after all). But what they've done is make things confusing to us devs... as you've pointed out.

    They can fix this in the documentation... and it'd be nice if they told us SOMEthing about what this check is and what a violation means.

    Peter

  • Diane Member - All Emails Posts: 16

    Last year we were debugging a crash with DRIVER_VERIFIER_DMA_VIOLATION, and I got verification from Microsoft folks that this is indeed poorly named - the check is done irrespective of Verifier settings.

    Diane

  • Gabe_Jones Member Posts: 68

    @Diane said:
    Last year we were debugging a crash with DRIVER_VERIFIER_DMA_VIOLATION, and I got verification from Microsoft folks that this is indeed poorly named - the check is done irrespective of Verifier settings.

    Diane,

    Thanks for the confirmation. Did they say that was true of all DRIVER_VERIFIER_DMA_VIOLATION subtypes, or just specific failure modes?

    Gabe

  • Diane Member - All Emails Posts: 16

    Hi Gabe,

    Sorry, I don't know about all subtypes. I was focused on my particular failure.

    Diane

  • Peter_Viscarola_(OSR) Administrator Posts: 7,688

    Thank you @Diane! That’s very helpful.

    Did they say that was true of all DRIVER_VERIFIER_DMA_VIOLATION

    I can tell you for sure that “ordinary” DMA verification is absolutely not enabled without Verifier. The overhead would be untenable.

    Peter

  • Scott_Noone_(OSR) Administrator Posts: 3,231
    edited January 25

    I did a little assembly searching for fun on whatever Win10 version of nt/hal I'm running. So, this shouldn't be taken as definitive, but it's something.

    From what I see, the HAL appears to generate two different DRIVER_VERIFIER_DMA_VIOLATION bugchecks, regardless of whether or not Verifier is enabled:

    1. From HalpDmaControllerFlushChannel with Parameter1 == 0x23 - "Cannot flush a channel that hasn't been completed or cancelled".
    2. From IvtHandleInterrupt with Parameter1 == 0x26 - "IOMMU detected DMA violation". This appears to be in response to certain interrupts generated by the IOMMU.

    The kernel is a bit more confusing... Most of the bugchecks come from DMA Verifier being enabled. However, if you're running on a system where DMA is not cache coherent (as controlled by the nt!KiSystemFullyCoherent global), you might see a DRIVER_VERIFIER_DMA_VIOLATION from KeFlushIoBuffers with a Parameter1 == 4 ("Driver has freed too many simultaneous adapter channels") or 5 ("Freed too many map registers") even without Verifier enabled.

    Soooo, yeah, they should have picked a different crash code. Probably seemed like a good idea at the time :)
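
For triage, the Parameter1 values mentioned in this post could be collected into a small lookup table. The sketch below records only what is stated above, as observed on one unspecified Win10 build, so it is neither complete nor authoritative. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One DRIVER_VERIFIER_DMA_VIOLATION case that, per the observations in
 * this thread, can fire even when Driver Verifier is not enabled. */
typedef struct {
    uint32_t    param1;   /* bugcheck Parameter1 */
    const char *origin;   /* routine seen raising the bugcheck */
    const char *meaning;  /* description quoted in the thread */
} dma_violation_info_t;

static const dma_violation_info_t k_non_verifier_cases[] = {
    /* Kernel, only on systems where DMA is not cache coherent. */
    { 0x04u, "KeFlushIoBuffers",
      "Driver has freed too many simultaneous adapter channels" },
    { 0x05u, "KeFlushIoBuffers",
      "Freed too many map registers" },
    /* HAL. */
    { 0x23u, "HalpDmaControllerFlushChannel",
      "Cannot flush a channel that hasn't been completed or cancelled" },
    { 0x26u, "IvtHandleInterrupt",
      "IOMMU detected DMA violation" },
};

/* Returns the matching entry, or NULL if Parameter1 is not one of the
 * cases listed in this thread (such codes most likely need Verifier). */
const dma_violation_info_t *lookup_dma_violation(uint32_t param1)
{
    size_t count = sizeof(k_non_verifier_cases) / sizeof(k_non_verifier_cases[0]);
    for (size_t i = 0; i < count; i++) {
        if (k_non_verifier_cases[i].param1 == param1) {
            return &k_non_verifier_cases[i];
        }
    }
    return NULL;
}
```

A NULL result only means the value was not mentioned here; it says nothing definitive about how Windows behaves for that code.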

    -scott
    OSR
