Tracking down who's holding a reference to our driver's USB FDO and preventing its removal.

I have a customer using our USB function driver on a Windows 7 SP1 (32-bit) platform, and their USB connection is experiencing a lot of disruptions. The URBs from our driver complete with errors, and finally we get a PnP IRP_MN_SURPRISE_REMOVAL for our USB FDO (call it FDO-A). That’s okay; the driver is designed to recover from USB disconnects and reconnects. Usually the surprise removal is followed immediately by IRP_MN_REMOVE_DEVICE. But in one particular situation, the driver unloads and reloads, another USB disconnect happens, and we receive the surprise removal, but instead of receiving IRP_MN_REMOVE_DEVICE next, we receive IRP_MN_START_DEVICE for the very same device. I thought you would always receive IRP_MN_REMOVE_DEVICE before ever getting another IRP_MN_START_DEVICE, but Microsoft tells me this can happen if there is still an outstanding reference to FDO-A. As it turns out, after several more USB disconnect/reconnect cycles we finally receive the IRP_MN_REMOVE_DEVICE for FDO-A some 12 minutes later. This causes big problems for our driver, which was not designed to handle that sequence: we end up referencing a stale data structure and blue-screening.

I’m hoping somebody can help me find a way to determine who or what is holding a reference to FDO-A. I have the crash dump. Is there any way I could track it down from that? Perhaps at that point it is too late. The customer has McAfee installed on their platform and I’m wondering if it could be the culprit. Needless to say, our driver eventually needs to be rewritten to handle this anomaly properly, but for the time being I’m hoping to track down who’s holding that reference so that it can perhaps be addressed as a workaround.
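To make the sequence concrete, here is a minimal user-mode sketch of the state tracking I think the rewrite would need. It is not driver code and not how our driver works today: the names (`PnpState`, `PnpEvent`, `DeviceCtx`, `ctx_handle_event`, `ctx_completion_is_current`) are all hypothetical, and the transition rules are a simplification of the real PnP rules. The idea is to bump a generation counter on every start, so that anything left over from before the surprise removal can be recognized as stale instead of being dereferenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states and events for illustration only;
 * these are not WDM or KMDF identifiers. */
typedef enum {
    PNP_NOT_STARTED,
    PNP_STARTED,
    PNP_SURPRISE_REMOVED,
    PNP_REMOVED
} PnpState;

typedef enum {
    EV_START_DEVICE,
    EV_SURPRISE_REMOVAL,
    EV_REMOVE_DEVICE
} PnpEvent;

typedef struct {
    PnpState state;
    unsigned generation;   /* incremented on every accepted start */
    bool resources_valid;  /* per-start resources are usable */
} DeviceCtx;

void ctx_init(DeviceCtx *ctx)
{
    assert(ctx != NULL);
    ctx->state = PNP_NOT_STARTED;
    ctx->generation = 0;
    ctx->resources_valid = false;
}

/* Returns true if the event is accepted in the current state. */
bool ctx_handle_event(DeviceCtx *ctx, PnpEvent ev)
{
    assert(ctx != NULL);
    if (ctx->state == PNP_REMOVED)
        return false;              /* nothing is valid after remove */

    switch (ev) {
    case EV_START_DEVICE:
        if (ctx->state == PNP_STARTED)
            return false;          /* simplification: no restart while started */
        /* Covers NOT_STARTED and the problem case from this post:
         * a start that arrives after surprise removal with no remove. */
        ctx->generation++;
        ctx->resources_valid = true;
        ctx->state = PNP_STARTED;
        return true;
    case EV_SURPRISE_REMOVAL:
        ctx->resources_valid = false;
        ctx->state = PNP_SURPRISE_REMOVED;
        return true;
    case EV_REMOVE_DEVICE:
        ctx->resources_valid = false;
        ctx->state = PNP_REMOVED;
        return true;
    }
    return false;
}

/* A request records ctx->generation when it is submitted. Its completion
 * may touch per-start state only if that generation is still current. */
bool ctx_completion_is_current(const DeviceCtx *ctx, unsigned submitted_gen)
{
    assert(ctx != NULL);
    return ctx->state == PNP_STARTED && submitted_gen == ctx->generation;
}
```

With this model, a request submitted before the surprise removal fails the `ctx_completion_is_current` check both during the removed window and after the unexpected restart, and the late IRP_MN_REMOVE_DEVICE is just another transition rather than a surprise.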

You’re not going to have much luck if the crash dump is at some point after the reference is dropped.

Some ways I can see to diagnose:

  1. During the 12 minutes, run Process Explorer and use the Find Handle or DLL option to search for your device name. If there’s an app with a HANDLE to your device this should find it.
  2. Get Object Reference Tracing set up on the client system and let it crash again. With the new dump you can then !obtrace your device object (see !obtrace docs for details). It’s not a guarantee that this will pinpoint the culprit but it might.
  3. Set a breakpoint or get a crash dump at the point where you see the start device without seeing a remove. At this point there should be a File Object that is pointing to your Device Object somewhere in the dump. Running !search on the Device Object address should find it. Once you have the address of the File Object you can run “!handle 0 3 0 File” to dump all HANDLEs to File Objects in every process. If you find the File Object in the output you’ll have your culprit. This is very roundabout but it’s the cheapest way to figure it out on Windows 7.