Weird poolmon/!poolused results

Hi,
Sometimes, during a test cycle - I observe that my WDF driver behaves very sluggishly with memory allocation/de-allocation (like 4-6x times slower). And when I check poolmon or !poolused, everything looks fine (no obvious leaks or memory overuse) BUT there are those weird entries, e.g.:

Tag Type Allocs Frees Diff Bytes Per Alloc
LBwi Paged 2301440590 5 2301440585 2150647928 0
LBwi Nonp 2337 2337 0 -8566365319305713018 -8566365319305713018

Or (from a memdump):
KSpt 0 -9209781557739388923 50330832 -4049585717204047226

The affected tag ,and the values seem to be random every time. Do you guys might have any idea:

  1. If this is could be related to the sluggishness?
  2. How do I debug it? Not really sure how to approach it.

Unfortunately, I cannot reproduce it with Driver Verifier, also I don’t have a stable method for getting to this state yet.

Hmmmm… Perhaps this tag is for Work Items?? (the"wi" in the tag??) I don’t know what the “LB” is for (as far as I can tell, it’s not publicly documented). I don’t see this tag in the WDF source code.

When the system gets *really* sluggish, check to see where the system is spending its time… that could provide a hint.

On the off-chance that this strikes you helpful: We had a really interesting case where we had a Framework object that required PASSIVE_LEVEL cleanup, in a parent-child relationship where we were tearing down the parent at IRQL DISPATCH_LEVEL. In our case, it was a WDFTIMER that we parented with a WDFREQUEST. After processing a few hundred thousand WDFREQUESTS, the system slowed to a crawl. Moral of the story: Don’t do that.

Peter
OSR
@OSRDrivers

I believe something like this can happen if you pass 0 as the tag when calling ExAllocatePoolWithTag. The verifier should catch this, however. Did you enable it for all drivers?

Checked kernel should catch this as well.

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@intel.com
Sent: Monday, November 23, 2015 9:34 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] Weird poolmon/!poolused results

Hi,
Sometimes, during a test cycle - I observe that my WDF driver behaves very sluggishly with memory allocation/de-allocation (like 4-6x times slower). And when I check poolmon or !poolused, everything looks fine (no obvious leaks or memory overuse) BUT there are those weird entries, e.g.:

Tag Type Allocs Frees Diff Bytes Per Alloc
LBwi Paged 2301440590 5 2301440585 2150647928 0
LBwi Nonp 2337 2337 0 -8566365319305713018 -8566365319305713018

Or (from a memdump):
KSpt 0 -9209781557739388923 50330832 -4049585717204047226

The affected tag ,and the values seem to be random every time. Do you guys might have any idea:
1) If this is could be related to the sluggishness?
2) How do I debug it? Not really sure how to approach it.

Unfortunately, I cannot reproduce it with Driver Verifier, also I don’t have a stable method for getting to this state yet.

Today I also noticed that !vm 1 returns a warning:

*** Virtual Memory Usage ***
Physical Memory: 2068635 ( 8274540 Kb)
Page File: ??\C:\pagefile.sys
Current: 8912896 Kb Free Space: 8900988 Kb
Minimum: 8912896 Kb Maximum: 10485756 Kb
Page File: ??\C:\swapfile.sys
Current: 16384 Kb Free Space: 16376 Kb
Minimum: 16384 Kb Maximum: 8388604 Kb
No Name for Paging File

Current: 18760296 Kb Free Space: 18716040 Kb
Minimum: 18760296 Kb Maximum: 18760296 Kb
Available Pages: 1062719 ( 4250876 Kb)
ResAvail Pages: 1395640 ( 5582560 Kb)
Locked IO Pages: 0 ( 0 Kb)
Free System PTEs: 4294988506 (17179954024 Kb)
Modified Pages: 13320 ( 53280 Kb)
Modified PF Pages: 13307 ( 53228 Kb)
Modified No Write Pages: 0 ( 0 Kb)
NonPagedPool Usage: 240 ( 960 Kb)
NonPagedPoolNx Usage: 52724 ( 210896 Kb)
NonPagedPool Max: 4294967296 (17179869184 Kb)
PagedPool 0 Usage: 42062 ( 168248 Kb)
PagedPool 1 Usage: 16311 ( 65244 Kb)
PagedPool 2 Usage: 5550 ( 22200 Kb)
PagedPool 3 Usage: 5565 ( 22260 Kb)
PagedPool 4 Usage: 5553 ( 22212 Kb)
PagedPool Usage: 75041 ( 300164 Kb)
PagedPool Maximum: 4160749568 (16642998272 Kb)
Session Commit: 11722 ( 46888 Kb)
Shared Commit: 31303 ( 125212 Kb)
Special Pool: 0 ( 0 Kb)
Shared Process: 7936 ( 31744 Kb)
Pages For MDLs: 3508 ( 14032 Kb)
Pages For AWE: 0 ( 0 Kb)
NonPagedPool Commit: 572527 ( 2290108 Kb)
PagedPool Commit: 75041 ( 300164 Kb)
Driver Commit: 57283 ( 229132 Kb)
Boot Commit: 25641 ( 102564 Kb)
System PageTables: 1601 ( 6404 Kb)
VAD/PageTable Bitmaps: 7026 ( 28104 Kb)
ProcessLockedFilePages: 0 ( 0 Kb)
Pagefile Hash Pages: 11 ( 44 Kb)
Sum System Commit: 793599 ( 3174396 Kb)
Total Private: 246389 ( 985556 Kb)

********** Sum of individual system commit + Process commit exceeds overall commit by 127904 Kb ? ********
Committed pages: 1008012 ( 4032048 Kb)
Commit limit: 4296859 ( 17187436 Kb)

Peter:
Not sure if the pooltag value is actually indicative of anything, it seems random.
Today I got:
FatR Paged 1066592245 1066592245 0 7274412237495401954 7274412237495401954
FSfc Nonp 37 8 29 -35182346444688 636093410047003687
FSfc Paged 4294959104 4294959104 0 -35182346446944 -35182346446944

The issue you mentioned, seems a bit similar - this one is also reproduced after many many hours.
I double checked parent-child relationships with regards to exec level, nothing fishy. Can I somehow verify that this occurred/didn’t occur? wdflogdump doesn’t help, wdfdriverinfo tree also looks clean (no leftovers or orphans).

Pavel:
I was running verifier with selected drivers only. Now I’m trying with all of them - so far nothing turned up.

Maciej

When the system gets *really* slow, you need to break in and see what the system is doing. If the system is slow, you need to figure out what it’s doing. No amount of looking at pool tags will tell you what the system is doing.

The Pools Tags you’re seeing aren’t random. Both FatR and FSfc are “real” pool tags. Now… whether the pool usage is related to your driver/system slowdown is certainly open to debate (and, in fact, doubtful).

Check your system’s activity,

Peter
OSR
@OSRDrivers