BSOD after surprise removal

Hello,

I am writing a virtual USB bus driver. The bus driver
enumerates a composite device (Canon MP810 multifunction printer).

When the printer is switched off, my bus driver calls
IoInvalidateDeviceRelations for the type BusRelations, and in the following
IRP_MN_QUERY_DEVICE_RELATIONS the driver reports that no device
is present on the bus.

The upper drivers cancel all outstanding IRPs through the cancel routine
of the bus driver and/or issue URB_FUNCTION_ABORT_PIPE for every pipe.

At that point the IRP queue for the composite device is empty.

On IRP_MN_SURPRISE_REMOVAL my bus driver disables the
device interface, and on IRP_MN_REMOVE_DEVICE it deletes the device object.

The result is a BSOD.

Can somebody give me a hint?

Thank you very much for any tips.

Best Regards,

Stefan Witt

*** Fatal System Error: 0x0000000a
(0x00000004,0x00000002,0x00000000,0x805314A6)

Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

Connected to Windows XP 2600 x86 compatible target, ptr64 FALSE
Loading Kernel Symbols

Loading User Symbols

Loading unloaded module list

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck A, {4, 2, 0, 805314a6}

*** No owner thread found for resource 80558460
*** No owner thread found for resource 805584e0
*** No owner thread found for resource 80558460
*** No owner thread found for resource 805584e0
*** No owner thread found for resource 80558460
*** No owner thread found for resource 805584e0
Probably caused by : ntoskrnl.exe ( nt!PpDevNodeRemoveFromTree+26 )

Followup: MachineOwner

nt!RtlpBreakWithStatusInstruction:
804e3592 cc int 3
kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 00000004, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: 805314a6, address which referenced memory

Debugging Details:

*** No owner thread found for resource 80558460
*** No owner thread found for resource 805584e0
*** No owner thread found for resource 80558460
*** No owner thread found for resource 805584e0
*** No owner thread found for resource 80558460
*** No owner thread found for resource 805584e0

READ_ADDRESS: 00000004

CURRENT_IRQL: 2

FAULTING_IP:
nt!PpDevNodeRemoveFromTree+26
805314a6 3931 cmp dword ptr [ecx],esi

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xA

PROCESS_NAME: System

TRAP_FRAME: f88eeba4 -- (.trap 0xfffffffff88eeba4)
ErrCode = 00000000
eax=00000000 ebx=80558518 ecx=00000004 edx=804dc8c1 esi=82018b78 edi=80558080
eip=805314a6 esp=f88eec18 ebp=f88eec28 iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202
nt!PpDevNodeRemoveFromTree+0x26:
805314a6 3931 cmp dword ptr [ecx],esi ds:0023:00000004=???
Resetting default scope

LOCK_ADDRESS: 80558460 -- (!locks 80558460)

Resource @ nt!PiEngineLock (0x80558460) Exclusively owned
Contention Count = 4
Threads: 822dc3c8-01<*>
1 total locks, 1 locks currently held

PNP_TRIAGE:
Lock address : 0x80558460
Thread Count : 0
Thread address: 0x00000000
Thread wait : 0x0

LAST_CONTROL_TRANSFER: from 8053225b to 804e3592

STACK_TEXT:
f88ee758 8053225b 00000003 f88eeab4 00000000 nt!RtlpBreakWithStatusInstruction
f88ee7a4 80532d2e 00000003 00000004 805314a6 nt!KiBugCheckDebugBreak+0x19
f88eeb84 804e187f 0000000a 00000004 00000002 nt!KeBugCheck2+0x574
f88eeb84 805314a6 0000000a 00000004 00000002 nt!KiTrap0E+0x233
f88eec28 8061b5d3 82018b78 805584a0 81fabd20 nt!PpDevNodeRemoveFromTree+0x26
f88eec48 8061b90f 821b9618 e1a1a540 00000000 nt!IopUnlinkDeviceRemovalRelations+0x85
f88eec68 8061ba10 81fabd20 00000000 81ff10e8 nt!IopDelayedRemoveWorker+0x5c
f88eec80 80530651 821b9618 00000001 e1b82d60 nt!IopChainDereferenceComplete+0xd9
f88eecac 8061d893 81fc05f0 00000006 00000000 nt!IopNotifyPnpWhenChainDereferenced+0xa1
f88eed34 805ec65b f88eed70 806ed188 e100eb38 nt!PiProcessQueryRemoveAndEject+0x9e4
f88eed50 8059c423 f88eed70 82200140 8056147c nt!PiProcessTargetDeviceEvent+0x2a
f88eed74 804e426b 82200140 00000000 822dc3c8 nt!PiWalkDeviceList+0x122
f88eedac 8057be15 82200140 00000000 00000000 nt!ExpWorkerThread+0x100
f88eeddc 804fa4da 804e4196 00000001 00000000 nt!PspSystemThreadStartup+0x34
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

STACK_COMMAND: kb

FOLLOWUP_IP:
nt!PpDevNodeRemoveFromTree+26
805314a6 3931 cmp dword ptr [ecx],esi

SYMBOL_STACK_INDEX: 4

SYMBOL_NAME: nt!PpDevNodeRemoveFromTree+26

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: nt

IMAGE_NAME: ntoskrnl.exe

DEBUG_FLR_IMAGE_TIMESTAMP: 42250ff9

FAILURE_BUCKET_ID: 0xA_VRF_nt!PpDevNodeRemoveFromTree+26

BUCKET_ID: 0xA_VRF_nt!PpDevNodeRemoveFromTree+26

Followup: MachineOwner
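
For readers following the dump: the third bugcheck parameter is the access bitfield described in the !analyze output above (bit 0 = write, bit 3 = execute). A minimal sketch of decoding it, assuming only that description; `ACCESS_INFO` and `decode_access_bits` are made-up names for illustration, not part of any debugger or WDK API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the two flags named in the !analyze text. */
typedef struct ACCESS_INFO {
    bool is_write;    /* bit 0: 0 = read operation, 1 = write operation */
    bool is_execute;  /* bit 3: 1 = execute operation (where supported) */
} ACCESS_INFO;

/* Decode Arg3 of an IRQL_NOT_LESS_OR_EQUAL (0xA) bugcheck. */
static ACCESS_INFO decode_access_bits(uint32_t arg3)
{
    ACCESS_INFO info;
    info.is_write   = (arg3 & 0x1u) != 0;
    info.is_execute = (arg3 & 0x8u) != 0;
    return info;
}
```

With Arg3 = 0 as in this dump, both flags are clear, so the fault was a plain read of address 0x00000004 at IRQL 2.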

Your description of your remove processing sounds correct. As long as you
did not report the PDO in the previous QueryDeviceRelations, and you have
freed up all resources related to the PDO, you should be able to delete the
PDO on the Remove.

The BSOD is a classic null pointer dereference in the PNP remove device
path, indicating that PNP thinks the PDO still exists and is blindly walking
off a null pointer.
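
Why a faulting address of 0x00000004 reads as a null pointer: a read of a field at offset 4 through a NULL structure pointer touches address 4. The sketch below only shows that arithmetic. `MODEL_NODE` and its fields are invented for illustration and are not the real devnode layout; which field the kernel was actually reading cannot be told from the dump alone. The trap frame values eax=00000000 and ecx=00000004 are at least consistent with this pattern.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit layout, NOT the real PnP devnode structure. */
typedef struct MODEL_NODE {
    uint32_t Sibling;  /* offset 0: stands in for a 32-bit pointer field */
    uint32_t Child;    /* offset 4: stands in for a second pointer field */
} MODEL_NODE;

/* Address the CPU touches when reading a field located `field_offset`
   bytes into a structure whose base address is `base`. */
static uint32_t faulting_address(uint32_t base, size_t field_offset)
{
    return base + (uint32_t)field_offset;
}

/* Address touched when the second field is read through a NULL base. */
static uint32_t null_child_read_address(void)
{
    return faulting_address(0u, offsetof(MODEL_NODE, Child));
}
```

Here `null_child_read_address()` evaluates to 4, which matches READ_ADDRESS: 00000004 in the dump.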

Since your description of your code is correct, perhaps it is the code
itself that is the problem? (Post the code, not what you think the code
does!)

Driver verifier can help for debugging these situations as it may halt you
earlier in the disaster and closer to the cause of the failure.

On Nov 20, 2007 7:23 AM, wrote:

> --
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>


Mark Roddy

Hello Mark,

Thank you for your reply. My test system is Windows XP with SP2. Driver Verifier is always enabled with all options (except low resources simulation).

I will check my source code for IRP_MN_REMOVE_DEVICE.

Best regards,

Stefan Witt

Hello,

I stripped down my source code for IRP_MN_REMOVE_DEVICE.

    case IRP_MN_SURPRISE_REMOVAL:
        setNewDevicePnPState(DeviceData, PNP_SurpriseRemovePending);
        DeviceData->ReportedMissing = TRUE;
        IoSetDeviceInterfaceState(&DeviceData->InterfaceName, FALSE);
        status = STATUS_SUCCESS;
        Irp->IoStatus.Information = 0;
        break;
    case IRP_MN_REMOVE_DEVICE:
        :
        :
        wait4RemoveEvent4DeviceQueue(DeviceData);
        // this source includes only the part for the surprise removal
        if (DeviceData->ReportedMissing) {
            IoDeleteDevice(DeviceObject);
            status = STATUS_SUCCESS;
            Irp->IoStatus.Information = 0;
        }

    Irp->IoStatus.Status = status;
    IoCompleteRequest (Irp, IO_NO_INCREMENT);
    return status;

All code for cleanup and freeing memory has been removed.

But this stripped-down removal code also causes a BSOD.

My interpretation of the debugger output is that usbprint.sys wants to
cancel the IRP 826a8f00 while the driver usbccgp.sys is already unloaded. But I
never saw this IRP in the debug output of the virtual bus driver.

Best Regards,

Stefan Witt

BSOD:

Entering Devmode…
Query Err/Eat: No Status
PJLMonEndDocPort
PL_EndDocPort

*** Fatal System Error: 0x000000d1
(0xF8768DD0,0x00000002,0x00000000,0xF8768DD0)

Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

Connected to Windows XP 2600 x86 compatible target, ptr64 FALSE
Loading Kernel Symbols

Loading User Symbols

Loading unloaded module list

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck D1, {f8768dd0, 2, 0, f8768dd0}

Probably caused by : usbccgp.sys ( usbccgp+1dd0 )

Followup: MachineOwner

nt!RtlpBreakWithStatusInstruction:
804e3592 cc int 3
kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: f8768dd0, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: f8768dd0, address which referenced memory

Debugging Details:

READ_ADDRESS: f8768dd0

CURRENT_IRQL: 2

FAULTING_IP:
usbccgp+1dd0
f8768dd0 ?? ???

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xD1

PROCESS_NAME: spoolsv.exe

TRAP_FRAME: f5859bb8 -- (.trap 0xfffffffff5859bb8)
ErrCode = 00000000
eax=f8768dd0 ebx=806ed000 ecx=826a8fdc edx=00000007 esi=826a8f00 edi=8201e468
eip=f8768dd0 esp=f5859c2c ebp=f5859c44 iopl=0 nv up ei ng nz ac pe cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010297
<unloaded_usbccgp.sys>+0x1dd0:
f8768dd0 ?? ???
Resetting default scope

IP_MODULE_UNLOADED:
usbccgp+1dd0
f8768dd0 ?? ???

LAST_CONTROL_TRANSFER: from 8053225b to 804e3592

FAILED_INSTRUCTION_ADDRESS:
usbccgp+1dd0
f8768dd0 ?? ???

STACK_TEXT:
f585976c 8053225b 00000003 f5859ac8 00000000 nt!RtlpBreakWithStatusInstruction
f58597b8 80532d2e 00000003 f8768dd0 f8768dd0 nt!KiBugCheckDebugBreak+0x19
f5859b98 804e187f 0000000a f8768dd0 00000002 nt!KeBugCheck2+0x574
f5859b98 f8768dd0 0000000a f8768dd0 00000002 nt!KiTrap0E+0x233
WARNING: Frame IP not in any known module. Following frames may be wrong.
f5859c28 80504f1a 81f80e20 826a8f00 826a8f10 <unloaded_usbccgp.sys>+0x1dd0
f5859c44 80595532 826a8f00 82025948 8201e258 nt!IoCancelIrp+0x6f
f5859c6c 8057a2a3 8201e258 8201e258 8201e4a0 nt!IoCancelThreadIo+0x33
f5859d14 8057a46a 00000000 00000000 8201e258 nt!PspExitThread+0x442
f5859d34 8057aa43 8201e258 00000000 f5859d64 nt!PspTerminateThreadByPointer+0x52
f5859d54 804de7ec 00000000 00000000 00e6fd50 nt!NtTerminateThread+0x70
f5859d54 7c91eb94 00000000 00000000 00e6fd50 nt!KiFastCallEntry+0xf8
00e6fd0c 7c91e8af 7c80cd04 00000000 00000000 ntdll!KiFastSystemCallRet
00e6fd10 7c80cd04 00000000 00000000 00397b58 ntdll!NtTerminateThread+0xc
00e6fd50 75e6a84d 00000000 0000001c 00000002 kernel32!ExitThread+0x8b
00e6ffb4 7c80b50b 00397b58 0000001c 00000002 localspl!PortThread+0x56d
00e6ffec 00000000 75e6912e 00397b58 00000000 kernel32!BaseThreadStart+0x37

STACK_COMMAND: kb

FOLLOWUP_IP:
usbccgp+1dd0
f8768dd0 ?? ???

SYMBOL_STACK_INDEX: 4

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: usbccgp+1dd0

MODULE_NAME: usbccgp

IMAGE_NAME: usbccgp.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 0

FAILURE_BUCKET_ID: 0xD1_VRF_CODE_AV_BAD_IP_usbccgp+1dd0

BUCKET_ID: 0xD1_VRF_CODE_AV_BAD_IP_usbccgp+1dd0

Followup: MachineOwner
---------

kd> !irp 826a8f00
Irp is active with 4 stacks 4 is current (= 0x826a8fdc)
No Mdl: No System Buffer: Thread 8201e258: Irp stack trace.
cmd flg cl Device File Completion-Context
[0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000
[0, 0] 0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000
[0, 0] 0 10 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000
>[f, 0] 0 e0 81f80e20 00000000 f87fa1e0-81f95650 Success Error Cancel
<unloaded_usbprint.sys>
Args: 00000000 00000008 00220027 82229698

Quoting xxxxx@seh.de:

> Hello,
>
> I stripped down my source code for IRP_MN_REMOVE_DEVICE.
>
> case IRP_MN_SURPRISE_REMOVAL:
> setNewDevicePnPState(DeviceData, PNP_SurpriseRemovePending);
> DeviceData->ReportedMissing = TRUE;

No, it is not.
It is only reported missing when IRP_MJ_PNP / IRP_MN_QUERY_DEVICE_RELATIONS
(BusRelations) returns a list that does not include this PDO.

> IoSetDeviceInterfaceState(&DeviceData->InterfaceName, FALSE);
> status = STATUS_SUCCESS;
> Irp->IoStatus.Information = 0;
> break;

Hello Ian,

Do you mean that I need to move the line

DeviceData->ReportedMissing = TRUE;

to IRP_MN_QUERY_DEVICE_RELATIONS for type BusRelations?

In the debugger output I saw the IRP_MN_QUERY_DEVICE_RELATIONS for type BusRelations, and my virtual bus driver reported no PDO.

Best Regards,

Stefan Witt

If you look at the logic in the toaster bus sample you will see that
ReportedMissing should only be set from QDR when reporting the PDO missing.
This may not fix your problem, but at least it will get rid of an obvious
error.
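
For what it is worth, here is a rough user-mode model of that bookkeeping, written from memory of how the toaster bus sample separates "present" from "reported missing". It is a sketch, not the sample's code and not kernel code: `MODEL_PDO` and all `model_*` functions are invented names, and `Deleted` merely stands in for IoDeleteDevice having been called.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-PDO device extension. */
typedef struct MODEL_PDO {
    bool Present;           /* device is physically on the bus */
    bool ReportedMissing;   /* a BusRelations reply has omitted this PDO */
    bool InterfaceEnabled;  /* device interface state */
    bool Deleted;           /* stands in for IoDeleteDevice having run */
} MODEL_PDO;

/* Models the bus FDO answering IRP_MN_QUERY_DEVICE_RELATIONS (BusRelations).
   This is the only place where ReportedMissing gets set.
   Returns the number of PDOs that would be put in the relations list. */
static size_t model_query_bus_relations(MODEL_PDO *pdos, size_t count)
{
    size_t reported = 0;
    for (size_t i = 0; i < count; i++) {
        if (pdos[i].Deleted) {
            continue;                       /* already gone */
        }
        if (pdos[i].Present) {
            reported++;                     /* still on the bus: report it */
        } else {
            pdos[i].ReportedMissing = true; /* omitted from the list */
        }
    }
    return reported;
}

/* Models IRP_MN_SURPRISE_REMOVAL on the PDO: disable the interface,
   but leave ReportedMissing alone. */
static void model_surprise_removal(MODEL_PDO *pdo)
{
    pdo->InterfaceEnabled = false;
}

/* Models IRP_MN_REMOVE_DEVICE on the PDO: delete the device object only
   if PnP has already been told it is gone. Returns true if deleted. */
static bool model_remove_device(MODEL_PDO *pdo)
{
    if (pdo->ReportedMissing) {
        pdo->Deleted = true;
    }
    return pdo->Deleted;
}
```

In this model a remove that arrives before any BusRelations reply has omitted the PDO leaves the device object alone, and only a remove after such a reply deletes it.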

On Nov 24, 2007 7:05 AM, wrote:



Mark Roddy

Hello,

thank you very much for your help. I will correct this on Monday.

Best Regards,

Stefan Witt