Identify driver holding on to zombie process handle

Hi all, I’m fairly new to WinDbg, but I’ve got an issue: after a few days of use, my Windows 10 machine usually accumulates a few hundred thousand zombie processes and has to be rebooted to free up the memory.

I suspect that a kernel driver is holding on to handles, but I’m not sure how to identify which driver has them. Could someone suggest some steps I could take to track down the culprit?

Here is a quick example of a zombie process, which is owned by “System”:

lkd> !process 0 7 cgo.exe
PROCESS ffffb90d8255d580
SessionId: 1 Cid: 3f5c Peb: 00381000 ParentCid: 43e8
DirBase: 381102000 ObjectTable: 00000000 HandleCount: 0.
Image: cgo.exe
VadRoot 0000000000000000 Vads 0 Clone 0 Private 11. Modified 1. Locked 0.
DeviceMap ffff900d6d739d50
Token ffff900d77df5060
ElapsedTime 00:35:27.199
UserTime 00:00:00.015
KernelTime 00:00:00.015
QuotaPoolUsage[PagedPool] 0
QuotaPoolUsage[NonPagedPool] 0
Working Set Sizes (now,min,max) (12, 50, 345) (48KB, 200KB, 1380KB)
PeakWorkingSetSize 1290
VirtualSize 0 Mb
PeakVirtualSize 4177 Mb
PageFaultCount 1386
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 15

No active threads

lkd> !object ffffb90d8255d580
Object: ffffb90d8255d580 Type: (ffffb90d77a7a740) Process
ObjectHeader: ffffb90d8255d550 (new version)
HandleCount: 2 PointerCount: 65536

Look into the !obtrace functionality of WinDbg. You configure GFlags
to trace objects with a particular pool tag, so that a call stack is
captured whenever one of those objects is referenced or dereferenced.
Then use !obtrace to view the call stacks captured for a specific
object, and look for a reference that was never released.
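
As a rough sketch of that last step, assuming Object Reference Tracing
was already enabled in GFlags before the object was created, the
captured stacks for the zombie process object from the original post
could be dumped as follows. The address is the one from the !process
output above and will be different on every boot.

```text
$$ Show the reference/dereference call stacks recorded for this object.
$$ The address is the PROCESS value from the original post (illustrative only).
!obtrace ffffb90d8255d580
```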

Alan Adams
Client for Open Enterprise Server
Micro Focus
xxxxx@microfocus.com

Thanks for the information. Can you tell me how I determine which pool tag(s) to track?

Proc is the tag used for processes. However, I’d expect this to be VERY
noisy. Often it’s difficult to figure out what’s a “good” reference and
what’s a bad reference…

If you have a dump, you can do the following to try and find which process
has the handle(s) to an offending process:

!handle 0 3 0 Process
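
For reference, here is a rough breakdown of how the arguments in that
command are read, going by the kernel-mode !handle syntax (handle,
flags, process, type name). The comments are my reading of the
documented parameters, not anything specific to this machine.

```text
$$ !handle <Handle> <KMFlags> <Process> <TypeName>
$$   Handle   = 0        all handles, not one specific value
$$   KMFlags  = 3        basic handle info (0x1) plus object info (0x2)
$$   Process  = 0        every process, not just the current one
$$   TypeName = Process  only handles that refer to Process objects
!handle 0 3 0 Process
```

In the output, look for entries whose Object: address matches the
zombie’s PROCESS address (ffffb90d8255d580 in the original post). The
process that entry is listed under is the one holding the handle.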

If the handles are in the System process, enabling Driver Verifier
turns on handle tracing for that process. You can then dump the handle
trace buffer with !htrace.

-scott
OSR
@OSRDrivers

Thanks. That’s helpful. I don’t have a dump; will this still work for local debugging? The handles are definitely getting lost in the System process.

I guess my follow-up is: is this going to be so noisy or convoluted that it isn’t worth my time to track down, and I should just live with the problem? I don’t necessarily want to go down a huge rabbit hole with this.

Yes, should work with local debugging.

That means a kernel component (probably a driver) is leaking the handles.
This should be tractable. The basic idea is:

  1. Enable Driver Verifier’s “Miscellaneous Checks” option on any driver you
    like (it really doesn’t matter which; this just has the side effect of
    enabling handle tracing on the System process)

  2. Reproduce your situation where you have leaky handles

  3. Run !handle 0 3 0 Process and note a handle to the offending process

  4. Run !htrace on that handle

    This should give you a call stack for the handle and (potentially) point to
    the offending driver.
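
The steps above might look roughly like this in practice. Treat it as a
sketch under assumptions: "example.sys" is a placeholder for whatever
driver you pick, 0x800 is the Driver Verifier flag bit for Miscellaneous
Checks, and <handle> and <system-eprocess> stand for values you read
from your own output.

```text
$$ Step 1 runs in an elevated command prompt, not in the debugger.
$$ Reboot afterwards so the verifier setting takes effect.
$$     verifier /flags 0x800 /driver example.sys

$$ Steps 3 and 4 run in the kernel debugger once zombies have accumulated.
$$ Find the address of the System process:
!process 0 0 System

$$ List handles to Process objects and note a handle value, held by the
$$ System process, whose Object: address is a zombie PROCESS address:
!handle 0 3 0 Process

$$ Dump the recorded open/close stacks for that handle in the System process:
!htrace <handle> <system-eprocess>
```

An open stack with no matching close is the one to look at. The driver
module in that stack is the likely culprit.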



Oh, this could absolutely be a huge, massive, never-ending time suck :) If
the above goes according to plan it shouldn’t be too bad, but no plan
survives contact with the enemy…

But, in all seriousness, TRYING the above shouldn’t take very long. If it
doesn’t immediately point to the bad driver then you can at least say you
tried.

-scott
OSR
@OSRDrivers