Which KMDF driver is bug checking?

I have a KMDF1.1 Filter over a KMDF1.1 function driver. Sometimes I get an
IRQL_NOT_LESS_OR_EQUAL bugcheck when unloading the drivers. The problem is
that, since none of my routines are visible in the trace, I can’t easily
see which driver is faulting.

Is there an easy way to tell from the trace, or from the attached debugger?

I don’t think there are any requests left in either of the drivers, so a hint
as to the cause of the bug check would also be appreciated.

The EvtDeviceSelfManagedIoSuspend routine of the filter is called. It cancels
4 requests, which are subsequently deleted by the completion routine once they
have been cancelled by the lower driver. The bug check occurs before the call
of the EvtCleanupCallback routine of the filter.

Don

BTW. why does the trace say “Wdf01000!FxIrpQueue”? I’d have expected
“Wdf01001!FxIrpQueue”

0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 00040100, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000001, value 0 = read operation, 1 = write operation
Arg4: f5ef097a, address which referenced memory

Debugging Details:

WRITE_ADDRESS: 00040100

CURRENT_IRQL: 2

FAULTING_IP:
Wdf01000!FxIrpQueue::RemoveNextIrpFromQueue+2d
f5ef097a 8916 mov [esi],edx

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xD1

LAST_CONTROL_TRANSFER: from f5ef0ab3 to f5ef097a

TRAP_FRAME: f78ea9bc – (.trap fffffffff78ea9bc)
ErrCode = 00000002
eax=85655008 ebx=00000000 ecx=85657894 edx=84c5f670 esi=00040100 edi=f5f2276c
eip=f5ef097a esp=f78eaa30 ebp=f78eaa34 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010286
Wdf01000!FxIrpQueue::RemoveNextIrpFromQueue+0x2d:
f5ef097a 8916 mov [esi],edx
Resetting default scope

STACK_TEXT:
f78eaa34 f5ef0ab3 00000000 f78eaa58 f78eaa50 Wdf01000!FxIrpQueue::RemoveNextIrpFromQueue+0x2d
f78eaa44 f5ee7584 f78eaa58 f78eaa70 f5eff012 Wdf01000!FxIrpQueue::GetNextRequest+0xf
f78eaa50 f5eff012 85657894 856579f4 85657820 Wdf01000!FxRequest::GetNextRequest+0x11
f78eaa70 f5eff2cc 00000001 00000001 00000000 Wdf01000!FxIoQueue::QueuePurge+0x351
f78eaa90 f5f003b2 00000003 00000008 856587f8 Wdf01000!FxIoQueue::StopProcessingForPower+0x30
f78eaab0 f5f0b0cb 00000003 00000008 856587f8 Wdf01000!FxPkgIo::StopProcessingForPower+0x88
f78eaac0 f5f09c9a 856587f8 f5f0adfa f78eaaf8 Wdf01000!FxPkgPnp::PnpEventRemovedCommonCode+0x16
f78eaac8 f5f0adfa f78eaaf8 f5f0ba8d 856587f8 Wdf01000!FxPkgFdo::PnpEventFdoRemovedOverload+0xa
f78eaad0 f5f0ba8d 856587f8 f5f23ab0 856587f8 Wdf01000!FxPkgPnp::PnpEventFdoRemoved+0xd
f78eaaf8 f5f0c53d 00000117 00000000 856587f8 Wdf01000!FxPkgPnp::PnpEnterNewState+0x13b
f78eab20 f5f0c871 f78eab38 00000000 856587f8 Wdf01000!FxPkgPnp::PnpProcessEventInner+0x1ad
f78eab40 f5f05bb7 00000200 84b08468 856587f8 Wdf01000!FxPkgPnp::PnpProcessEvent+0x177
f78eab68 f5f03a55 856587f8 f78eab84 84b0848c Wdf01000!FxPkgPnp::_PnpRemoveDevice+0x6e
f78eab88 f5eef272 84b083b0 f78eabb0 f5eef47a Wdf01000!FxPkgPnp::Dispatch+0x107
f78eab94 f5eef47a 86152628 84b083b0 84b0848c Wdf01000!FxDevice::Dispatch+0x7f
f78eabb0 804f0473 86152628 84b083b0 f78eac34 Wdf01000!FxDevice::DispatchWithLock+0x60
f78eabc0 805c9829 865c0910 865c0910 00000002 nt!IofCallDriver+0x3f
f78eabf0 8063b479 84b083c0 f78eac10 00000000 nt!IopSynchronousCall+0xc0
f78eac40 805408de 865c0910 00000002 00000000 nt!IopRemoveDevice+0x89
f78eac68 8063cfc2 e34953b8 00000015 e1812428 nt!IopRemoveLockedDeviceNode+0x15e
f78eac80 8063d03d 865c0210 00000002 e1812428 nt!IopDeleteLockedDeviceNode+0x32
f78eacb4 80642db2 865c0910 02812428 00000002 nt!IopDeleteLockedDeviceNodes+0x3d
f78ead40 80643094 f78ead7c 865a68a0 e13bc760 nt!PiProcessQueryRemoveAndEject+0x782
f78ead58 805c7918 f78ead7c 865a68a0 80582dbc nt!PiProcessTargetDeviceEvent+0x24
f78ead80 804eeabb 84b81b48 00000000 865a68a0 nt!PiWalkDeviceList+0x1b0
f78eadac 80596ffe 84b81b48 00000000 00000000 nt!ExpWorkerThread+0xe9
f78eaddc 805008c6 804ee9f0 00000001 00000000 nt!PspSystemThreadStartup+0x2e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

FOLLOWUP_IP:
Wdf01000!FxIrpQueue::RemoveNextIrpFromQueue+2d
f5ef097a 8916 mov [esi],edx

SYMBOL_STACK_INDEX: 0

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: Wdf01000!FxIrpQueue::RemoveNextIrpFromQueue+2d

MODULE_NAME: Wdf01000

IMAGE_NAME: Wdf01000.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 444738f2

STACK_COMMAND: .trap fffffffff78ea9bc ; kb

BUCKET_ID: 0xD1_W_Wdf01000!FxIrpQueue::RemoveNextIrpFromQueue+2d

Followup: MachineOwner

More information:

I’m fairly sure it’s my filter which is failing (although it would
still be nice to be able to see that easily). Given it was failing
whilst trying to drain the queues, I looked at the state of the
queue just before (I knew) it would crash if I removed the device.

My app has just one request inside the driver but the queue thinks
differently. It says the driver owns one request but has two cancellable.
And the list of requests is clearly broken.

I guess I’ve managed to corrupt the queue information somehow, although
I’m not too clear how.


0: kd> !wdfdevicequeues 0x79f4cfe8

Dumping queues of WDFDEVICE 0x79f4cfe8

Number of queues: 2

Queue: 1 (!wdfqueue 0x7a883bd8)
Manual, Not power-managed, PowerOn, Can accept, Can dispatch, ExecutionLevelDispatch, SynchronizationScopeNone
Number of driver owned requests: 0
Number of waiting requests: 0

This is WDF internal queue for create requests.

Queue: 2 (!wdfqueue 0x79ead970)
Parallel, Not power-managed, PowerOn, Can accept, Can dispatch, ExecutionLevelDispatch, SynchronizationScopeNone
Number of driver owned requests: 1
!WDFREQUEST 0x7a83fa50 !IRP 0x85773008
Number of waiting requests: 0

Number of driver owned cancelable requests: 2
!WDFREQUEST 0x7a83fa50 !IRP 0x85773008
!WDFREQUEST 0x7a83fa50 !IRP 0x85773008
!WDFREQUEST 0x7a83fa50 !IRP 0x85773008

… lots of identical lines removed

!WDFREQUEST 0x7a83fa50 !IRP 0x85773008
!WDFREQUEST 0x7a83fa50 !IRP 0x85773008
wdfirpqueue.c:DumpIrpList Debugger extension Count of 50 entries exceeded

When I Ctrl-C my app, which cancels the request (and appears to work) the queue
state is as follows. If I unload the driver it blue screens but, amazingly,
normal requests seem to work fine - I can make other requests into the driver,
start further applications with no indication of any problem.


0: kd> !wdfdevicequeues 0x79f4cfe8

Dumping queues of WDFDEVICE 0x79f4cfe8

Number of queues: 2

Queue: 1 (!wdfqueue 0x7a883bd8)
Manual, Not power-managed, PowerOn, Can accept, Can dispatch, ExecutionLevelDispatch, SynchronizationScopeNone
Number of driver owned requests: 0
Number of waiting requests: 0

This is WDF internal queue for create requests.

Queue: 2 (!wdfqueue 0x79ead970)
Parallel, Not power-managed, PowerOn, Can accept, Can dispatch, ExecutionLevelDispatch, SynchronizationScopeNone
Number of driver owned requests: 0
Number of waiting requests: 0

Number of driver owned cancelable requests: 1
!WDFREQUEST 0x00000034 !IRP 0x85773008
Failed to get a pointer @ 0xfffffff4: invalid address, or memory is paged out?
c0000005 Exception in H:\WINDDK\WDF\KMDF10\bin\x86\wdfkd.wdfdevicequeues debugger extension.
PC: 7c81eb33 VA: 00000000 R/W: 1003f Parameter: 00000000

I’ll be trying to get a better picture of when and how the queue gets into
bad shape after sending this, but in the meantime Any ideas?

Don

> More information:

I’m fairly sure it’s my filter which is failing (although it would
still be nice to be able to see that easily). Given it was failing
whilst trying to drain the queues, I looked at the state of the
queue just before (I knew) it would crash if I removed the device.

My app has just one request inside the driver but the queue thinks
differently. It says the driver owns one request but has two
cancellable.
And the list of requests is clearly broken.

I guess I’ve managed to corrupt the queue information
somehow, although I’m not too clear how.

The answer is that in one code path I was calling WdfRequestMarkCancelable
and not calling WdfRequestUnmarkCancelable before completing the request.

Is there some reason why WdfRequestComplete and friends cannot check for
this error?

Don

If you turn on the KMDF verifier, it will catch this. we can’t put a
ton of code in the completion path since it is a hot path. See
http://blogs.msdn.com/doronh/archive/2006/05/17/600335.aspx on turning
it on.

You don’t see wdf01001 because there is only one major version of KMDF
running on a system at once. All minor versions are supported by the
same major version library.

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Don Ward
Sent: Wednesday, July 12, 2006 7:34 AM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Which KMDF driver is bug checking?

More information:

I’m fairly sure it’s my filter which is failing (although it would
still be nice to be able to see that easily). Given it was failing
whilst trying to drain the queues, I looked at the state of the
queue just before (I knew) it would crash if I removed the device.

My app has just one request inside the driver but the queue thinks
differently. It says the driver owns one request but has two
cancellable.
And the list of requests is clearly broken.

I guess I’ve managed to corrupt the queue information
somehow, although I’m not too clear how.

The answer is that in one code path I was calling
WdfRequestMarkCancelable
and not calling WdfRequestUnmarkCancelable before completing the
request.

Is there some reason why WdfRequestComplete and friends cannot check for
this error?

Don


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer