Kernel Mode driver error Code 38 - Failed Prior Unload.

cyberteen · October 29, 2021, 8:49am

Hi Everyone,

A new Windows developer here. Please excuse any mistakes!

We have a kernel-mode driver. After a sequence of hibernate stress tests, one of the driver's modules ended up repeatedly loading and unloading.
The resulting error code in Device Manager is 38, which according to MSDN means that a previous instance of the driver failed to unload.

From the WPP logs, all modules seem to clean up and unload properly. I am not sure how to debug this problem.

We also have a crash dump collected after the error appeared on the driver. In it I can see the device node that failed to unload (using the command !devnode 0 1 or !pnptriage).

Many other posts and articles suggest that this could point to a leaked reference on the driver object itself rather than a simple memory leak.

If turning on Object Reference Tracing is the way to go, how do I enable it for a particular driver? The examples I found only cover DLL files or executables.

Appreciate any help. Thanks

cyberteen · October 29, 2021, 11:40pm

I was able to gather the following on enabling Object Reference Tracing for device objects:

Enable it in Gflags, keeping the pool tag as “Devi”.
Ignore the image name so that tracing applies to the entire system.

So, when I reboot my PC, run the test, and get the Code 38 error again, can I just run !obtrace on the device object address, which I can get from !devnode?

Is this the right way to debug this kind of driver unload issue? Let me know if any other information is required or if better approaches are available.

Thanks
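
For what it's worth, reading a reference trace mostly comes down to pairing every reference with its matching dereference and seeing which owner is left over. Below is a toy user-mode C sketch of that bookkeeping. It is not the !obtrace output format and not kernel code: the `ref_event` record, the owner labels, and both helper functions are hypothetical, and in a real trace the owner would have to be inferred from the recorded call stacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trace record: which logical owner changed the reference
 * count, and in which direction. */
typedef struct {
    const char *owner; /* label standing in for a call stack */
    int delta;         /* +1 for a reference, -1 for a dereference */
} ref_event;

/* Net balance of all events recorded for one owner. */
int owner_balance(const ref_event *log, size_t n, const char *owner) {
    int balance = 0;
    assert(log != NULL && owner != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(log[i].owner, owner) == 0) {
            balance += log[i].delta;
        }
    }
    return balance;
}

/* Returns the first owner (in log order) that took more references than
 * it released, or NULL when every owner is balanced. */
const char *first_leaking_owner(const ref_event *log, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (owner_balance(log, n, log[i].owner) > 0) {
            return log[i].owner;
        }
    }
    return NULL;
}
```

Feeding it a log in which one owner has a +1 with no matching -1 makes `first_leaking_owner` return that owner's label, which is the same question one would ask of the real trace.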

Scott_Noone_OSR · November 3, 2021, 6:29pm

There’s possibly a reference on the driver object. That could be from a device object or it can be something else (e.g. failing to unregister with WPP). Does your driver pass Verifier?
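
To illustrate why a stray reference would produce this symptom, here is a toy user-mode C model. It is not kernel code, and every name in it (`model_driver`, `model_load`, and so on) is made up for illustration. It assumes a simplified rule: an instance stays resident until its last reference is dropped, and a load attempt fails while an earlier instance is still resident.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names are Windows kernel APIs. */
typedef struct {
    long refcount;  /* outstanding references on the "driver object" */
    bool in_memory; /* true while this instance is still resident */
} model_driver;

typedef enum {
    LOAD_OK = 0,
    LOAD_FAILED_PRIOR_UNLOAD = 38 /* mirrors the Device Manager code */
} load_status;

/* A load succeeds only if no earlier instance is still resident. */
load_status model_load(model_driver *d) {
    if (d->in_memory) {
        return LOAD_FAILED_PRIOR_UNLOAD;
    }
    d->in_memory = true;
    d->refcount = 1; /* the reference held on behalf of the loader */
    return LOAD_OK;
}

/* Some component takes an extra reference on the driver. */
void model_reference(model_driver *d) {
    assert(d->in_memory);
    d->refcount++;
}

/* Dropping the last reference is what actually frees the instance. */
void model_dereference(model_driver *d) {
    assert(d->refcount > 0);
    if (--d->refcount == 0) {
        d->in_memory = false;
    }
}

/* Unload drops only the loader's reference; leaked ones keep it resident. */
void model_unload(model_driver *d) {
    model_dereference(d);
}
```

In this model, a load/unload cycle in which every `model_reference` is paired with a `model_dereference` can be repeated indefinitely. A single unpaired `model_reference` leaves `in_memory` set after `model_unload`, so the next `model_load` returns `LOAD_FAILED_PRIOR_UNLOAD`, even though the unload path itself ran and logged normally.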

cyberteen · November 5, 2021, 7:01am

@“Scott_Noone_(OSR)” said:
There’s possibly a reference on the driver object. That could be from a device object or it can be something else (e.g. failing to unregister with WPP). Does your driver pass Verifier?

Yes. Driver Verifier is enabled, but nothing is triggered.

Scott_Noone_OSR · November 5, 2021, 3:57pm

Does this happen reliably? If yes, can you start removing functionality until the test case passes? Sometimes that’s the easiest way to narrow down the problem.

cyberteen · November 8, 2021, 5:05pm

@“Scott_Noone_(OSR)”, not reliably, I guess. It has been reported to happen after several hundred or even thousands of hibernate cycles.
Further, this is on a customer’s setup, so I don’t think I can keep reducing functionality until the problem is found. In the worst case, it is not feasible for the customer to run that many patches.

I am also trying to reproduce this on my internal setup, so far without success. I will consider your approach if the issue reproduces there.