verifier BSOD in classpnp.sys

We're experiencing a BSOD on win8.1, so far only seen on 32-bit installs, and only after WHCK NVMe IO Test has been run. Our NVMe driver is a storport miniport, and we've had some successful WHCK runs on x64 systems, but the 32-bit always goes into a state where it will BSOD on every boot after this test is run.

If I do a clean OS install and replicate the verifier setup (flags 0x209bb on ournvme.sys and wdf01000.sys), there is no problem. We only see the problem after the WHCK test has run.

The bugcheck code is 0xD5 (DRIVER_PAGE_FAULT_IN_FREED_SPECIAL_POOL), and my driver hasn't done any Free operations when it fails.

I understand the concept: something has freed memory and then tried to use it, but having logged all my driver's alloc/free operations I don't *think* it's my fault.
Analysis as follows, but I'm at a loss where to look next. Has anyone seen this before?

BugCheck D5, {a4948f38, 0, 8206a1ad, 0}

Probably caused by : CLASSPNP.SYS ( CLASSPNP!SqmLogForwardProgress+101 )

Followup: MachineOwner

nt!RtlpBreakWithStatusInstruction:
815111f4 cc int 3
5: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

DRIVER_PAGE_FAULT_IN_FREED_SPECIAL_POOL (d5)
Memory was referenced after it was freed.
This cannot be protected by try-except.
When possible, the guilty driver's name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: a4948f38, memory referenced
Arg2: 00000000, value 0 = read operation, 1 = write operation
Arg3: 8206a1ad, if non-zero, the address which referenced memory.
Arg4: 00000000, (reserved)

Debugging Details:

READ_ADDRESS: a4948f38 Special pool

FAULTING_IP:
CLASSPNP!SqmLogForwardProgress+101
8206a1ad 8b4118 mov eax,dword ptr [ecx+18h]

MM_INTERNAL_CODE: 0

IMAGE_NAME: CLASSPNP.SYS

DEBUG_FLR_IMAGE_TIMESTAMP: 5215707d

MODULE_NAME: CLASSPNP

FAULTING_MODULE: 8202f000 CLASSPNP

DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT

BUGCHECK_STR: 0xD5

PROCESS_NAME: System

CURRENT_IRQL: 0

ANALYSIS_VERSION: 6.3.9600.17237 (debuggers(dbg).140716-0327) amd64fre

TRAP_FRAME: 88fe7ac0 -- (.trap 0xffffffff88fe7ac0)
ErrCode = 00000000
eax=00000000 ebx=897fa478 ecx=a4948f20 edx=00000000 esi=00000000 edi=897fa530
eip=8206a1ad esp=88fe7b34 ebp=88fe7ba0 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
CLASSPNP!SqmLogForwardProgress+0x101:
8206a1ad 8b4118 mov eax,dword ptr [ecx+18h] ds:0023:a4948f38=????????
Resetting default scope

LAST_CONTROL_TRANSFER: from 8158d59d to 815111f4

STACK_TEXT:
88fe74dc 8158d59d 00000003 4bc33844 00000065 nt!RtlpBreakWithStatusInstruction
88fe7530 8158d0b7 8643f340 88fe792c 88fe7998 nt!KiBugCheckDebugBreak+0x1f
88fe7900 8150fdc6 00000050 a4948f38 00000000 nt!KeBugCheck2+0x676
88fe7924 8150fcfd 00000050 a4948f38 00000000 nt!KiBugCheck2+0xc6
88fe7944 8153873b 00000050 a4948f38 00000000 nt!KeBugCheckEx+0x19
88fe7998 8148110c 88fe7ac0 00001607 88fe7a18 nt! ?? ::FNODOBFM::`string'+0xf095
88fe7a28 815242e5 00000000 a4948f38 00000000 nt!MmAccessFault+0x6bc
88fe7a28 8206a1ad 00000000 a4948f38 00000000 nt!KiTrap0E+0xf1
88fe7ba0 8145fc46 897fa478 a49fcfd0 8145fbde CLASSPNP!SqmLogForwardProgress+0x101
88fe7be4 8145f4c9 a49fcfd0 00000000 88d7fbc0 nt!IopProcessWorkItem+0x68
88fe7c30 814be6a8 00000000 4bc33104 00000000 nt!ExpWorkerThread+0xff
88fe7c70 81525701 8145f3ca 00000000 00000000 nt!PspSystemThreadStartup+0x58
88fe7c7c 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x15

STACK_COMMAND: kb

FOLLOWUP_IP:
CLASSPNP!SqmLogForwardProgress+101
8206a1ad 8b4118 mov eax,dword ptr [ecx+18h]

SYMBOL_STACK_INDEX: 8

SYMBOL_NAME: CLASSPNP!SqmLogForwardProgress+101

FOLLOWUP_NAME: MachineOwner

BUCKET_ID_FUNC_OFFSET: 101

FAILURE_BUCKET_ID: 0xD5_VRF_CLASSPNP!SqmLogForwardProgress

BUCKET_ID: 0xD5_VRF_CLASSPNP!SqmLogForwardProgress

ANALYSIS_SOURCE: KM

FAILURE_ID_HASH_STRING: km:0xd5_vrf_classpnp!sqmlogforwardprogress

FAILURE_ID_HASH: {4856134c-7868-20c6-1221-0cbd1784ad34}

Followup: MachineOwner

5: kd> !pte a4948f38
VA a4948f38
PDE at C0602920 PTE at C0524A40
contains 000000003381D863 contains 0000160700000000
pfn 3381d ---DA--KWEV not valid
PageFile: 0
Offset: 1607
Protect: 0

5: kd> !pte a4947fff
VA a4947fff
PDE at C0602920 PTE at C0524A38
contains 000000003381D863 contains 0000160700000000
pfn 3381d ---DA--KWEV not valid
PageFile: 0
Offset: 1607
Protect: 0

5: kd> !pool a4948f38
Pool page a4948f38 region is Special pool
Address a4948000 does not belong to any pool
a4948000: Unable to get contents of special pool block
5: kd> !pool a4948000
Pool page a4948000 region is Special pool
Address a4948000 does not belong to any pool
a4948000: Unable to get contents of special pool block
5: kd> !pool a4948000-1000
Pool page a4947000 region is Special pool
Address a4947000 does not belong to any pool
a4947000: Unable to get contents of special pool block
5: kd> !pool a4948000-2000
Pool page a4946000 region is Special pool
Address a4946000 does not belong to any pool
*a4946000 size: 24 data: a4946fd8 (NonPaged) *Io
Pooltag Io : general IO allocations, Binary : nt!io

Andy,

Do you manipulate work items by any means?

Quick disassembly of CLASSPNP!SqmLogForwardProgress shows that the only ‘freeing’ function is IoFreeWorkItem. Is it possible that you freed a work item twice? Or freed a work items that doesn’t belong to you?

> Do you manipulate work items by any means?

Not directly. It may be that storport does on my behalf, but my miniport doesn’t.

[I was wondering if SqmLogForwardProgress is a WHCK thing, because I can’t find it in the source to classpnp.sys (in win8.1 WDK samples)]

Does "!verifier 80 a4948f38" show anything?

-scott
OSR
@OSRDrivers

wrote in message news:xxxxx@ntdev...

We're experiencing a BSOD on win8.1, so far only seen on 32-bit installs,
and only after WHCK NVMe IO Test has been run. Our NVMe driver is a storport
miniport, and we've had some successful WHCK runs on x64 systems, but the
32-bit always goes into a state where it will BSOD on every boot after this
test is run.

If I do a clean OS install and replicate the verifier setup (flags 0x209bb
on ournvme.sys and wdf01000.sys), there is no problem. We only see the
problem after the WHCK test has run.

The bugcheck code is 0xD5 (DRIVER_PAGE_FAULT_IN_FREED_SPECIAL_POOL), and my
driver hasn't done any Free operations when it fails.

I understand the concept: something has freed memory and then tried to use
it, but having logged all my driver's alloc/free operations I don't *think*
it's my fault.
Analysis as follows, but I'm at a loss where to look next. Has anyone seen
this before?

BugCheck D5, {a4948f38, 0, 8206a1ad, 0}

Probably caused by : CLASSPNP.SYS ( CLASSPNP!SqmLogForwardProgress+101 )

Followup: MachineOwner

nt!RtlpBreakWithStatusInstruction:
815111f4 cc int 3
5: kd> !analyze -v
*******************************************************************************
*
*
* Bugcheck Analysis
*
*
*
*******************************************************************************

DRIVER_PAGE_FAULT_IN_FREED_SPECIAL_POOL (d5)
Memory was referenced after it was freed.
This cannot be protected by try-except.
When possible, the guilty driver's name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: a4948f38, memory referenced
Arg2: 00000000, value 0 = read operation, 1 = write operation
Arg3: 8206a1ad, if non-zero, the address which referenced memory.
Arg4: 00000000, (reserved)

Debugging Details:

READ_ADDRESS: a4948f38 Special pool

FAULTING_IP:
CLASSPNP!SqmLogForwardProgress+101
8206a1ad 8b4118 mov eax,dword ptr [ecx+18h]

MM_INTERNAL_CODE: 0

IMAGE_NAME: CLASSPNP.SYS

DEBUG_FLR_IMAGE_TIMESTAMP: 5215707d

MODULE_NAME: CLASSPNP

FAULTING_MODULE: 8202f000 CLASSPNP

DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT

BUGCHECK_STR: 0xD5

PROCESS_NAME: System

CURRENT_IRQL: 0

ANALYSIS_VERSION: 6.3.9600.17237 (debuggers(dbg).140716-0327) amd64fre

TRAP_FRAME: 88fe7ac0 -- (.trap 0xffffffff88fe7ac0)
ErrCode = 00000000
eax=00000000 ebx=897fa478 ecx=a4948f20 edx=00000000 esi=00000000
edi=897fa530
eip=8206a1ad esp=88fe7b34 ebp=88fe7ba0 iopl=0 nv up ei pl zr na pe
nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
efl=00010246
CLASSPNP!SqmLogForwardProgress+0x101:
8206a1ad 8b4118 mov eax,dword ptr [ecx+18h]
ds:0023:a4948f38=????????
Resetting default scope

LAST_CONTROL_TRANSFER: from 8158d59d to 815111f4

STACK_TEXT:
88fe74dc 8158d59d 00000003 4bc33844 00000065
nt!RtlpBreakWithStatusInstruction
88fe7530 8158d0b7 8643f340 88fe792c 88fe7998 nt!KiBugCheckDebugBreak+0x1f
88fe7900 8150fdc6 00000050 a4948f38 00000000 nt!KeBugCheck2+0x676
88fe7924 8150fcfd 00000050 a4948f38 00000000 nt!KiBugCheck2+0xc6
88fe7944 8153873b 00000050 a4948f38 00000000 nt!KeBugCheckEx+0x19
88fe7998 8148110c 88fe7ac0 00001607 88fe7a18 nt! ??
::FNODOBFM::`string'+0xf095
88fe7a28 815242e5 00000000 a4948f38 00000000 nt!MmAccessFault+0x6bc
88fe7a28 8206a1ad 00000000 a4948f38 00000000 nt!KiTrap0E+0xf1
88fe7ba0 8145fc46 897fa478 a49fcfd0 8145fbde
CLASSPNP!SqmLogForwardProgress+0x101
88fe7be4 8145f4c9 a49fcfd0 00000000 88d7fbc0 nt!IopProcessWorkItem+0x68
88fe7c30 814be6a8 00000000 4bc33104 00000000 nt!ExpWorkerThread+0xff
88fe7c70 81525701 8145f3ca 00000000 00000000 nt!PspSystemThreadStartup+0x58
88fe7c7c 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x15

STACK_COMMAND: kb

FOLLOWUP_IP:
CLASSPNP!SqmLogForwardProgress+101
8206a1ad 8b4118 mov eax,dword ptr [ecx+18h]

SYMBOL_STACK_INDEX: 8

SYMBOL_NAME: CLASSPNP!SqmLogForwardProgress+101

FOLLOWUP_NAME: MachineOwner

BUCKET_ID_FUNC_OFFSET: 101

FAILURE_BUCKET_ID: 0xD5_VRF_CLASSPNP!SqmLogForwardProgress

BUCKET_ID: 0xD5_VRF_CLASSPNP!SqmLogForwardProgress

ANALYSIS_SOURCE: KM

FAILURE_ID_HASH_STRING: km:0xd5_vrf_classpnp!sqmlogforwardprogress

FAILURE_ID_HASH: {4856134c-7868-20c6-1221-0cbd1784ad34}

Followup: MachineOwner

5: kd> !pte a4948f38
VA a4948f38
PDE at C0602920 PTE at C0524A40
contains 000000003381D863 contains 0000160700000000
pfn 3381d ---DA--KWEV not valid
PageFile: 0
Offset: 1607
Protect: 0

5: kd> !pte a4947fff
VA a4947fff
PDE at C0602920 PTE at C0524A38
contains 000000003381D863 contains 0000160700000000
pfn 3381d ---DA--KWEV not valid
PageFile: 0
Offset: 1607
Protect: 0

5: kd> !pool a4948f38
Pool page a4948f38 region is Special pool
Address a4948000 does not belong to any pool
a4948000: Unable to get contents of special pool block
5: kd> !pool a4948000
Pool page a4948000 region is Special pool
Address a4948000 does not belong to any pool
a4948000: Unable to get contents of special pool block
5: kd> !pool a4948000-1000
Pool page a4947000 region is Special pool
Address a4947000 does not belong to any pool
a4947000: Unable to get contents of special pool block
5: kd> !pool a4948000-2000
Pool page a4946000 region is Special pool
Address a4946000 does not belong to any pool
*a4946000 size: 24 data: a4946fd8 (NonPaged) *Io
Pooltag Io : general IO allocations, Binary : nt!io

The storage class driver sources in the WDK are stripped versions of the
real sources.

-scott
OSR
@OSRDrivers

wrote in message news:xxxxx@ntdev…

Do you manipulate work items by any means?

Not directly. It may be that storport does on my behalf, but my miniport
doesn’t.

[I was wondering if SqmLogForwardProgress is a WHCK thing, because I can’t find it in the source to classpnp.sys (in win8.1 WDK samples)]

> Does "!verifier 80 a4948f38" show anything?

Oooh... yes it does - output is below
(and here I'll be honest, I have no idea yet what this info means, I'll now go and read up on it, but pasted here anyway. Thanks for the starter!)

5: kd> !verifier 80 a4948f38

Log of recent kernel pool Allocate and Free operations:

There are up to 0x10000 entries in the log.

Parsing 0x00010000 log entries, searching for address 0xa4948f38.

======================================================================
Pool block a4948f20, Size 000000e0, Thread 88d7fbc0
818c6ae2 nt!VfIoFreeIrp+0x145
818be391 nt!IovFreeIrpPrivate+0x3d
81497ffd nt!IopCompleteRequest+0x25d
81497724 nt!IopfCompleteRequest+0x474
818bdc9f nt!IovCompleteRequest+0x123
82a80ab6 Wdf01000!FxRequest::CompleteInternal+0x1da
82a81599 Wdf01000!imp_WdfRequestCompleteWithInformation+0x7f
81c1b40d ScsiIrpTestFilter+0x240d
82a81233 Wdf01000!FxRequestBase::CompleteSubmitted+0x39f
82a82189 Wdf01000!FxIoTarget::_RequestCompletionRoutine+0x11f
814d84df nt!IopUnloadSafeCompletion+0x33
818be6da nt!IovpLocalCompletionRoutine+0x136
814975e8 nt!IopfCompleteRequest+0x338

Pool block a4948f20, Size 000000dc, Thread 88d7fbc0
818c6df4 nt!ViIrpAllocateLockedPacket+0x38
818c6913 nt!VfIoAllocateIrp1+0x18
818be025 nt!IovAllocateIrp+0x34
8144ed67 nt!IopBuildDeviceIoControlRequest+0x1f
8144ed3d nt!IoBuildDeviceIoControlRequest+0x2b
8206a175 CLASSPNP!SqmLogForwardProgress+0xc9
8145fc46 nt!IopProcessWorkItem+0x68
8145f4c9 nt!ExpWorkerThread+0xff
814be6a8 nt!PspSystemThreadStartup+0x58
81525701 nt!KiThreadStartup+0x15

Pool block a4948de0, Size 00000220, Thread a1fc57c0
818c6ae2 nt!VfIoFreeIrp+0x145
818be391 nt!IovFreeIrpPrivate+0x3d
81497684 nt!IopfCompleteRequest+0x3d4
818bdc9f nt!IovCompleteRequest+0x123
8043dff8 fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x453
8043db30 fltmgr!FltpDispatch+0xb1
818bdb4d nt!IovCallDriver+0x2cc
8149c772 nt!IofCallDriver+0x62
817434da nt!FsRtlSetFileSize+0xa0
816643fa nt!MmExtendSection+0x204
816bf4eb nt!MiCreateSection+0x5d7
816beeff nt!MmCreateSection+0x57
816bedb4 nt!NtCreateSection+0x106

Pool block a4948de0, Size 00000220, Thread a1fc57c0
818c6df4 nt!ViIrpAllocateLockedPacket+0x38
818c6913 nt!VfIoAllocateIrp1+0x18
818be025 nt!IovAllocateIrp+0x34
814c7060 nt!IoAllocateIrp+0x14
81743479 nt!FsRtlSetFileSize+0x3f
816643fa nt!MmExtendSection+0x204
816bf4eb nt!MiCreateSection+0x5d7
816beeff nt!MmCreateSection+0x57
816bedb4 nt!NtCreateSection+0x106
81520e17 nt!KiSystemServicePostCall+0
Parsed entry 00010000/00010000...
Finished parsing all pool tracking information.

One more idea.

The address a4948f38 looks very usermodish and not realistic. I don’t have disassembly of your version of SqmLogForwardProgress, so I might be very wrong.

From my version of SqmLogForwardProgress (Win 7, 32) it looks that this bad address came from _DEVICE_EXTENSION of a _DEVICE_OBJECT passed as 1st parameter to SqmLogForwardProgress.

So, setting hardware bp on write to _DEV_EXT field might help to reveal code that damaged the extension.

Very cool - Allocate and Free logs. Another tool in the pocket.

So, I’ve removed ScsiIrpTestFilter from the LowerFilters list, and the problem has gone - machine useable once more.

Based on that alloc/free log, I wonder if I’m completing an SRB more than once - I’ll go and review (again).

I think you have stumbled across a bug in ClassPnP. It also appears as
though this bug is still present in Win10 (just checked my x64 VM),
congratulations! You’re the “lucky” winner :slight_smile:

Based on the alloc/free log, the memory in question is the IRP:

nt!VfIoFreeIrp+0x145
nt!IovFreeIrpPrivate+0x3d
nt!IopCompleteRequest+0x25d
nt!IopfCompleteRequest+0x474
nt!IovCompleteRequest+0x123
Wdf01000!FxRequest::CompleteInternal+0x1da
Wdf01000!imp_WdfRequestCompleteWithInformation+0x7f
ScsiIrpTestFilter+0x240d

So, that would point to ClassPnP touching the IRP after it’s been freed.
Here’s the annotated disassembly from a Win7 system (it’s what I had for x86
and it’s a bit easier to follow, this bug has been present for a while I
guess)

// Build an IRP with IoBuildDeviceIoControlRequest. IMO this is an evil API
// that pretty much everyone calls wrong…
call dword ptr [CLASSPNP!_imp__IoBuildDeviceIoControlRequest]

// EDI holds the IRP allocated with IoBuildDeviceIoControlRequest
mov edi,eax
cmp edi,ebx

// Jump if the IRP is NULL, which isn’t true in this case
je CLASSPNP!SqmLogForwardProgress+0xff

CLASSPNP!SqmLogForwardProgress+0xca:

// Pass the IRP to IoCallDriver…
mov ecx,dword ptr [esi+8]
mov edx,edi
call dword ptr [CLASSPNP!_imp_IofCallDriver]

// Is it pending?
cmp eax,103h

// If the return is NOT pending, jump out somewhere else
// (and avoid the bug)
jne CLASSPNP!SqmLogForwardProgress+0xfb

CLASSPNP!SqmLogForwardProgress+0xdc:

// We’re in the pending case, so we’ll wait

call dword ptr [CLASSPNP!_imp__KeWaitForSingleObject]

// BUGBUG - Touch EDI! Oops…You’re not allowed to touch
// an IRP allocated with IoBuildDeviceIoControlRequest after
// it has been sent with IoCallDriver. It would cause exactly
// this problem as it has already been freed.
mov eax,dword ptr [edi+18h] <<<< BOOM

This crash would probably only be seen with Verifier enabled. You’d also
need to be down this path, and I’m not sure how you trigger it (I set a
breakpoint here during boot and didn’t hit it). The name of the function
makes me think that it has something to do with low resources, what does !vm
21 say at the time of the crash?

-scott
OSR
@OSRDrivers

wrote in message news:xxxxx@ntdev…

Very cool - Allocate and Free logs. Another tool in the pocket.

So, I’ve removed ScsiIrpTestFilter from the LowerFilters list, and the
problem has gone - machine useable once more.

Based on that alloc/free log, I wonder if I’m completing an SRB more than
once - I’ll go and review (again).

Scott, you’re my new favourite person

,-’ -._ |=========| ( ) | !! | | !! | | | | | =======’

I’ve rebooted the system and faffed with the registry, so can’t do the !vm command just now…
However I’m going on vacation in four minutes time, and you just made it more relaxing. Thanks.

I learned a lot from this thread as well, Thank Scott!
//dlcu

Date: Tue, 4 Aug 2015 11:27:58 -0400
From: xxxxx@thegreenho.me
To: xxxxx@lists.osr.com
Subject: RE:[ntdev] verifier BSOD in classpnp.sys

Scott, you’re my new favourite person

,-’ -._ |=========| ( ) | !! | | !! | | | | | =======’

I’ve rebooted the system and faffed with the registry, so can’t do the !vm command just now…
However I’m going on vacation in four minutes time, and you just made it more relaxing. Thanks.


NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Scott,
This is awesome…Is there anywhere where all these !verifier (and others) are documented… HOW DO YOU KNOW THIS :slight_smile: ???

Thanks,
Ajitabh

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of Scott Noone
Sent: Tuesday, August 04, 2015 8:46 PM
To: Windows System Software Devs Interest List
Subject: Re:[ntdev] verifier BSOD in classpnp.sys

I think you have stumbled across a bug in ClassPnP. It also appears as though this bug is still present in Win10 (just checked my x64 VM), congratulations! You’re the “lucky” winner :slight_smile:

Based on the alloc/free log, the memory in question is the IRP:

nt!VfIoFreeIrp+0x145
nt!IovFreeIrpPrivate+0x3d
nt!IopCompleteRequest+0x25d
nt!IopfCompleteRequest+0x474
nt!IovCompleteRequest+0x123
Wdf01000!FxRequest::CompleteInternal+0x1da
Wdf01000!imp_WdfRequestCompleteWithInformation+0x7f
ScsiIrpTestFilter+0x240d

So, that would point to ClassPnP touching the IRP after it’s been freed.
Here’s the annotated disassembly from a Win7 system (it’s what I had for x86 and it’s a bit easier to follow, this bug has been present for a while I
guess)

// Build an IRP with IoBuildDeviceIoControlRequest. IMO this is an evil API // that pretty much everyone calls wrong…
call dword ptr [CLASSPNP!_imp__IoBuildDeviceIoControlRequest]

// EDI holds the IRP allocated with IoBuildDeviceIoControlRequest
mov edi,eax
cmp edi,ebx

// Jump if the IRP is NULL, which isn’t true in this case
je CLASSPNP!SqmLogForwardProgress+0xff

CLASSPNP!SqmLogForwardProgress+0xca:

// Pass the IRP to IoCallDriver…
mov ecx,dword ptr [esi+8]
mov edx,edi
call dword ptr [CLASSPNP!_imp_IofCallDriver]

// Is it pending?
cmp eax,103h

// If the return is NOT pending, jump out somewhere else // (and avoid the bug)
jne CLASSPNP!SqmLogForwardProgress+0xfb

CLASSPNP!SqmLogForwardProgress+0xdc:

// We’re in the pending case, so we’ll wait …
call dword ptr [CLASSPNP!_imp__KeWaitForSingleObject]

// BUGBUG - Touch EDI! Oops…You’re not allowed to touch // an IRP allocated with IoBuildDeviceIoControlRequest after // it has been sent with IoCallDriver. It would cause exactly // this problem as it has already been freed.
mov eax,dword ptr [edi+18h] <<<< BOOM

This crash would probably only be seen with Verifier enabled. You’d also need to be down this path, and I’m not sure how you trigger it (I set a breakpoint here during boot and didn’t hit it). The name of the function makes me think that it has something to do with low resources, what does !vm
21 say at the time of the crash?

-scott
OSR
@OSRDrivers

wrote in message news:xxxxx@ntdev…

Very cool - Allocate and Free logs. Another tool in the pocket.

So, I’ve removed ScsiIrpTestFilter from the LowerFilters list, and the problem has gone - machine useable once more.

Based on that alloc/free log, I wonder if I’m completing an SRB more than once - I’ll go and review (again).


NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

The commands come to me in my dreams…

Also, this one is documented :slight_smile:

https://msdn.microsoft.com/en-us/library/windows/hardware/ff565591(v=vs.85).aspx

Discussed them a while ago here as well:

https://www.osronline.com/article.cfm?id=589

Enjoy!

-scott
OSR
@OSRDrivers

“Ajitabh Saxena (ajisaxen)” wrote in message news:xxxxx@ntdev…

Scott,
This is awesome…Is there anywhere where all these !verifier (and
others) are documented… HOW DO YOU KNOW THIS :slight_smile: ???

Thanks,
Ajitabh

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Scott Noone
Sent: Tuesday, August 04, 2015 8:46 PM
To: Windows System Software Devs Interest List
Subject: Re:[ntdev] verifier BSOD in classpnp.sys

I think you have stumbled across a bug in ClassPnP. It also appears as
though this bug is still present in Win10 (just checked my x64 VM),
congratulations! You’re the “lucky” winner :slight_smile:

Based on the alloc/free log, the memory in question is the IRP:

nt!VfIoFreeIrp+0x145
nt!IovFreeIrpPrivate+0x3d
nt!IopCompleteRequest+0x25d
nt!IopfCompleteRequest+0x474
nt!IovCompleteRequest+0x123
Wdf01000!FxRequest::CompleteInternal+0x1da
Wdf01000!imp_WdfRequestCompleteWithInformation+0x7f
ScsiIrpTestFilter+0x240d

So, that would point to ClassPnP touching the IRP after it’s been freed.
Here’s the annotated disassembly from a Win7 system (it’s what I had for x86
and it’s a bit easier to follow, this bug has been present for a while I
guess)

// Build an IRP with IoBuildDeviceIoControlRequest. IMO this is an evil API
// that pretty much everyone calls wrong…
call dword ptr [CLASSPNP!_imp__IoBuildDeviceIoControlRequest]

// EDI holds the IRP allocated with IoBuildDeviceIoControlRequest
mov edi,eax
cmp edi,ebx

// Jump if the IRP is NULL, which isn’t true in this case
je CLASSPNP!SqmLogForwardProgress+0xff

CLASSPNP!SqmLogForwardProgress+0xca:

// Pass the IRP to IoCallDriver…
mov ecx,dword ptr [esi+8]
mov edx,edi
call dword ptr [CLASSPNP!_imp_IofCallDriver]

// Is it pending?
cmp eax,103h

// If the return is NOT pending, jump out somewhere else // (and avoid the
bug)
jne CLASSPNP!SqmLogForwardProgress+0xfb

CLASSPNP!SqmLogForwardProgress+0xdc:

// We’re in the pending case, so we’ll wait …
call dword ptr [CLASSPNP!_imp__KeWaitForSingleObject]

// BUGBUG - Touch EDI! Oops…You’re not allowed to touch // an IRP
allocated with IoBuildDeviceIoControlRequest after // it has been sent with
IoCallDriver. It would cause exactly // this problem as it has already been
freed.
mov eax,dword ptr [edi+18h] <<<< BOOM

This crash would probably only be seen with Verifier enabled. You’d also
need to be down this path, and I’m not sure how you trigger it (I set a
breakpoint here during boot and didn’t hit it). The name of the function
makes me think that it has something to do with low resources, what does !vm
21 say at the time of the crash?

-scott
OSR
@OSRDrivers

wrote in message news:xxxxx@ntdev…

Very cool - Allocate and Free logs. Another tool in the pocket.

So, I’ve removed ScsiIrpTestFilter from the LowerFilters list, and the
problem has gone - machine useable once more.

Based on that alloc/free log, I wonder if I’m completing an SRB more than
once - I’ll go and review (again).


NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer