Drivers are cleaned up on unload, yes?

Just to verify: when a driver is unloaded and its allocated resources have been properly deallocated (i.e. Verifier is happy), the OS ensures there is nothing “left” in kernel memory, like when a user-mode application is unloaded, yes?

I have a vexing slow nonpaged memory leak in a driver which is loaded/unloaded thousands of times a day in a stress test, and I can’t find any good reason for it to be happening. Over time, usually 12+ load/call/unload reps, the nonpaged memory usage is enough that the OS starts sputtering and convulsing …

Verifier is happy, there are only standard pool allocations made, the IOCTLs are METHOD_BUFFERED, nothing really stands out … and yet, like a tire you find flat in the morning, the leak in nonpaged memory is there. One recent change was that WPP logs are being accumulated. When a driver unloads, what happens to the associated WPP logs the OS is collecting … do they go away as well, or are they persistent until consumed?

If Verifier is happy, then your pool has all been returned.

What’s the pool tag of the leaking resource?

Peter

That’s part of the vexing thing: I haven’t been able to identify the actual resource that’s not being released … it’s just a slow accumulation of “something” in the nonpaged pool. Verifier is happy, and I’ve both used the custom pool in GFlags and instrumented the address and count of allocations and frees … it all looks fine and dandy, the stuff allocated is being freed. There aren’t any pending IRPs, the global object tables look the same, and I’m now backing up through versions to see when this effect first started happening and play “what’s different” …
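For readers who want to see what “instrumented the address and count of allocations and frees” can look like, here is a minimal user-mode sketch in C. It is not the poster’s code: `AllocStats`, `tracked_alloc`, `tracked_free` and `stats_balanced` are hypothetical names, and `malloc`/`free` stand in for whatever pool routines the driver actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Running totals for one module's allocations (hypothetical type). */
typedef struct {
    size_t allocs;      /* number of successful allocations */
    size_t frees;       /* number of frees of non-NULL pointers */
    size_t bytes_live;  /* bytes currently allocated and not yet freed */
} AllocStats;

/* Header stored in front of every block so the size is known at free time.
   The union keeps the caller's pointer suitably aligned for any type. */
typedef union {
    size_t size;
    max_align_t align;
} BlockHeader;

/* Allocate `size` bytes and record the allocation in `s`. */
void *tracked_alloc(AllocStats *s, size_t size)
{
    BlockHeader *h = malloc(sizeof *h + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    s->allocs++;
    s->bytes_live += size;
    return h + 1;  /* caller sees the memory just past the header */
}

/* Free a block obtained from tracked_alloc and record the free in `s`. */
void tracked_free(AllocStats *s, void *p)
{
    if (p == NULL) {
        return;
    }
    BlockHeader *h = (BlockHeader *)p - 1;
    assert(s->bytes_live >= h->size);
    s->frees++;
    s->bytes_live -= h->size;
    free(h);
}

/* Returns nonzero when every tracked allocation has been freed. */
int stats_balanced(const AllocStats *s)
{
    return s->allocs == s->frees && s->bytes_live == 0;
}
```

A check like `stats_balanced` at unload time only covers allocations that go through the wrappers. Anything the system allocates on the driver’s behalf is invisible to it, so balanced counters do not rule out growth under another component’s pool tag.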

I just want to be pedantic about the first sentence you wrote. The kernel does absolutely nothing to clean up after a driver when it unloads. It is not at all like cleaning up a user-mode process, because the kernel does not know what belonged to you. It won’t close handles, it won’t free memory, it won’t kill threads. Your DRIVER must do all of that before it unloads.

@Tim_Roberts … ah, so I misunderstood about kernel drivers; that means that something changed over the last time the long duration test was run and now that’s leaking memory, thanks for that correction!

In Task Manager, paged pool is definitely slowly increasing … the main difference between what “was” and what “is” is the inclusion of WPP tracing. With this new info that it’s up to me to clean up: when I’m doing a WPP trace and I call WPP_CLEANUP in DriverContextCleanup, does that release any pending trace buffers?

Update: the only change to the driver that makes the paged pool leak appear is the addition of WPP tracing. Comparing !poolused reports at the beginning and end of the run, the paged pool leak is in the AlMs pool tag, which is apparently for ALPC messages.

—snip—
Begin run:
AlMs 0 0 3168 1926720 ALPC message , Binary: nt!alpc

1000 reps:
AlMs 0 0 7828 5963296 ALPC message , Binary: nt!alpc
–snip–
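To put a number on the growth in the two AlMs rows above, here is a small C sketch of the arithmetic. It assumes the last two numeric columns of each row are the paged allocation count and paged bytes used, which matches the poster’s description of a paged pool leak. `PoolSample`, `pool_delta` and `growth_per_rep` are hypothetical names for illustration.

```c
#include <assert.h>

/* Paged-pool figures for one tag from one !poolused report
   (assumed column meaning: allocation count, bytes used). */
typedef struct {
    unsigned long long allocs;
    unsigned long long bytes;
} PoolSample;

/* Absolute growth of one counter between two reports. */
unsigned long long pool_delta(unsigned long long before,
                              unsigned long long after)
{
    assert(after >= before);
    return after - before;
}

/* Average growth of one counter per load/call/unload repetition. */
double growth_per_rep(unsigned long long before,
                      unsigned long long after,
                      unsigned reps)
{
    assert(reps > 0);
    return (double)pool_delta(before, after) / (double)reps;
}
```

Fed the values from the two rows (3168 and 7828 allocations, 1926720 and 5963296 bytes, 1000 reps), this gives 4660 extra allocations and 4036576 extra bytes, or about 4.66 allocations and roughly 4037 bytes per repetition.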

[Guessing mode] ALPC messages are (apparently, according to some GoogleFu) used by WPP for in-flight traffic to its internal buffers, so it’s possible that the WPP_CLEANUP code is being called while there is an in-flight ALPC message … the buffers are cleaned up, but the ALPC message remains pending and just sits there.

Testing ongoing … hmm …

If you disable/stop tracing before you unload your driver, do you still “lose” pool?

If you never enable tracing for your driver, do you lose pool?

I call WPP_CLEANUP in DriverContextCleanup, does that release any pending trace buffers?

Yes, that’s all you need to do.

Peter

Tying things up with a bow: the driver emitted a WPP trace message immediately before the WPP_CLEANUP call. Adding a short KeDelayExecutionThread delay before WPP_CLEANUP reduced (but did not eliminate) the AlMs paged pool leak.

Making this change dropped the leakage to an acceptable amount, so it’s off my plate … :slight_smile:

Tim says pedantic, but I’ll be harsher: this is foundational. There is no possible way that the OS could clean up in this way after a driver that lives in ring 0.