Premature driver removal?!

I’m analyzing memory dump with BSOD DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce). I can’t believe what I see.

It occured during USB device surprise removal. Driver apparently received surprise remove and remove IRPs and blocked waiting until everything completes. In parallel or as a result of SR it received an IOCTL from user mode app which in turn lead to port reset which can take some times in this state. When done, IOCTL dispatch completed the IRP and decremented IO count which in turn lead to unblocked remove IRP handler finishing and removing the device. User mode app then probably closed a handle from different thread which caused driver unload. But IOCTL dispatch still had few instructions to execute and crashed because code wasn’t in the memory.

I just took Oney’s WDM book and he clearly states driver remains in the memory until it both completes IRP and returns from the dispatch routine. But it didn’t. Someting changed in the XP SP2? Or is it app responsibility to not close handles before synchronous functions called from other threads return? (seems impossible anyway)

Analysis below. I forced WinDbg to load symbols for my unloaded driver (tcusb.sys) so don’t be confused it is there. Originally there was just reference to unloaded driver. I examined handle 0x450 passed to NtDeviceIoControlFile() and it is already freed. Also my device extension indicates all handles were freed.

kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce)
A driver unloaded without cancelling timers, DPCs, worker threads, etc.
The broken driver’s name is displayed on the screen.
Arguments:
Arg1: f7972185, memory referenced
Arg2: 00000000, value 0 = read operation, 1 = write operation
Arg3: f7972185, If non-zero, the instruction address which referenced the bad memory
address.
Arg4: 00000000, Mm internal code.

Debugging Details:

READ_ADDRESS: f7972185

FAULTING_IP:
tcusb!BulkUsb_IoDecrement+61 [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xCE

PROCESS_NAME: sdk_test.exe

TRAP_FRAME: eea73a30 – (.trap ffffffffeea73a30)
ErrCode = 00000000
eax=80a16123 ebx=82d9c848 ecx=00000000 edx=00000000 esi=82d9c824 edi=8080cf16
eip=f7972185 esp=eea73aa4 ebp=eea73ab4 iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202
tcusb!BulkUsb_IoDecrement+0x61:
f7972185 ?? ???
Resetting default scope

LAST_CONTROL_TRANSFER: from 8084d6fb to 8085c9ae

FAILED_INSTRUCTION_ADDRESS:
tcusb!BulkUsb_IoDecrement+61 [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

STACK_TEXT:
eea739cc 8084d6fb 00000050 f7972185 00000000 nt!KeBugCheckEx+0x1b
eea73a18 8080aff1 00000000 f7972185 00000000 nt!MmAccessFault+0x6f5
eea73a18 f7972185 00000000 f7972185 00000000 nt!KiTrap0E+0xcc
eea73ab4 f7976b58 00d9c758 82de6fb8 831c5278 tcusb!BulkUsb_IoDecrement+0x61 [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
eea73c34 8080cd77 82d9c6a0 8324f008 80a16070 tcusb!DispatchDevCtrl+0xed6 [e:\build\tcdrv\build-0071-main\src\tcdrv\ioctl.c @ 1297]
eea73c44 808939ab 8324f0e4 832762e0 8324f008 nt!IopfCallDriver+0x31
eea73c58 808a69f7 82d9c6a0 8324f008 832762e0 nt!IopSynchronousServiceTail+0x60
eea73d00 808a8bfa 00000450 00000464 00000000 nt!IopXxxControlFile+0x611
eea73d34 8080806b 00000450 00000464 00000000 nt!NtDeviceIoControlFile+0x2a
eea73d34 7c90eb94 00000450 00000464 00000000 nt!KiFastCallEntry+0xf8
WARNING: Frame IP not in any known module. Following frames may be wrong.
00129a34 00000000 00000000 00000000 00000000 0x7c90eb94

STACK_COMMAND: kb

FOLLOWUP_IP:
tcusb!BulkUsb_IoDecrement+61 [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

FAULTING_SOURCE_CODE:
2792:
2793: KeReleaseSpinLock(&DeviceExtension->IOCountLock, oldIrql);
2794:
2795: TTRACE_LEAVE((“BulkUsb_IoDecrement() -> %x”, result));

2796: return(result);
2797: }
2798:
2799: NTSTATUS
2800: CanStopDevice(
2801: IN PDEVICE_OBJECT DeviceObject,

SYMBOL_STACK_INDEX: 3

SYMBOL_NAME: tcusb!BulkUsb_IoDecrement+61

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: tcusb

IMAGE_NAME: tcusb.sys

DEBUG_FLR_IMAGE_TIMESTAMP: fffffffe

FAILURE_BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61

BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61

Followup: MachineOwner

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

You will not get a remove until the app closes its handle, there is no
way the remove and the app’s closing of the handle can race. Are you
purging/canceling all i/o in IRP_MJ_CLEANUP on the PFILE_OBJECT for all
i/o sent on the PFILE_OBJECT before completing the cleanup irp?

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
Sent: Thursday, December 07, 2006 2:38 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Premature driver removal?!

I’m analyzing memory dump with BSOD
DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce). I can’t
believe what I see.

It occured during USB device surprise removal. Driver apparently
received surprise remove and remove IRPs and blocked waiting until
everything completes. In parallel or as a result of SR it received an
IOCTL from user mode app which in turn lead to port reset which can take
some times in this state. When done, IOCTL dispatch completed the IRP
and decremented IO count which in turn lead to unblocked remove IRP
handler finishing and removing the device. User mode app then probably
closed a handle from different thread which caused driver unload. But
IOCTL dispatch still had few instructions to execute and crashed because
code wasn’t in the memory.

I just took Oney’s WDM book and he clearly states driver remains in the
memory until it both completes IRP and returns from the dispatch
routine. But it didn’t. Someting changed in the XP SP2? Or is it app
responsibility to not close handles before synchronous functions called
from other threads return? (seems impossible anyway)

Analysis below. I forced WinDbg to load symbols for my unloaded driver
(tcusb.sys) so don’t be confused it is there. Originally there was just
reference to unloaded driver. I examined handle 0x450 passed to
NtDeviceIoControlFile() and it is already freed. Also my device
extension indicates all handles were freed.

kd> !analyze -v
************************************************************************
*******
*
*
* Bugcheck Analysis
*
*
*
************************************************************************
*******

DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce)
A driver unloaded without cancelling timers, DPCs, worker threads, etc.
The broken driver’s name is displayed on the screen.
Arguments:
Arg1: f7972185, memory referenced
Arg2: 00000000, value 0 = read operation, 1 = write operation
Arg3: f7972185, If non-zero, the instruction address which referenced
the bad memory
address.
Arg4: 00000000, Mm internal code.

Debugging Details:

READ_ADDRESS: f7972185

FAULTING_IP:
tcusb!BulkUsb_IoDecrement+61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xCE

PROCESS_NAME: sdk_test.exe

TRAP_FRAME: eea73a30 – (.trap ffffffffeea73a30)
ErrCode = 00000000
eax=80a16123 ebx=82d9c848 ecx=00000000 edx=00000000 esi=82d9c824
edi=8080cf16
eip=f7972185 esp=eea73aa4 ebp=eea73ab4 iopl=0 nv up ei pl nz na
po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
efl=00010202
tcusb!BulkUsb_IoDecrement+0x61:
f7972185 ?? ???
Resetting default scope

LAST_CONTROL_TRANSFER: from 8084d6fb to 8085c9ae

FAILED_INSTRUCTION_ADDRESS:
tcusb!BulkUsb_IoDecrement+61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

STACK_TEXT:
eea739cc 8084d6fb 00000050 f7972185 00000000 nt!KeBugCheckEx+0x1b
eea73a18 8080aff1 00000000 f7972185 00000000 nt!MmAccessFault+0x6f5
eea73a18 f7972185 00000000 f7972185 00000000 nt!KiTrap0E+0xcc
eea73ab4 f7976b58 00d9c758 82de6fb8 831c5278
tcusb!BulkUsb_IoDecrement+0x61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
eea73c34 8080cd77 82d9c6a0 8324f008 80a16070 tcusb!DispatchDevCtrl+0xed6
[e:\build\tcdrv\build-0071-main\src\tcdrv\ioctl.c @ 1297]
eea73c44 808939ab 8324f0e4 832762e0 8324f008 nt!IopfCallDriver+0x31
eea73c58 808a69f7 82d9c6a0 8324f008 832762e0
nt!IopSynchronousServiceTail+0x60
eea73d00 808a8bfa 00000450 00000464 00000000 nt!IopXxxControlFile+0x611
eea73d34 8080806b 00000450 00000464 00000000
nt!NtDeviceIoControlFile+0x2a
eea73d34 7c90eb94 00000450 00000464 00000000 nt!KiFastCallEntry+0xf8
WARNING: Frame IP not in any known module. Following frames may be
wrong.
00129a34 00000000 00000000 00000000 00000000 0x7c90eb94

STACK_COMMAND: kb

FOLLOWUP_IP:
tcusb!BulkUsb_IoDecrement+61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

FAULTING_SOURCE_CODE:
2792:
2793: KeReleaseSpinLock(&DeviceExtension->IOCountLock, oldIrql);
2794:
2795: TTRACE_LEAVE((“BulkUsb_IoDecrement() -> %x”, result));

2796: return(result);
2797: }
2798:
2799: NTSTATUS
2800: CanStopDevice(
2801: IN PDEVICE_OBJECT DeviceObject,

SYMBOL_STACK_INDEX: 3

SYMBOL_NAME: tcusb!BulkUsb_IoDecrement+61

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: tcusb

IMAGE_NAME: tcusb.sys

DEBUG_FLR_IMAGE_TIMESTAMP: fffffffe

FAILURE_BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61

BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61

Followup: MachineOwner

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Yes, but only queued IRPs. This one was synchronous. Sent to IOCTL dispatch handler which processes it, completes and doesn’t return pending status. Do you mean I should wait in cleanup dispatch until all synchronous IRPs are completed, too? I don’t see how to do it without races.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Doron Holan[SMTP:xxxxx@microsoft.com]
Reply To: Windows System Software Devs Interest List
Sent: Thursday, December 07, 2006 11:46 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

You will not get a remove until the app closes its handle, there is no
way the remove and the app’s closing of the handle can race. Are you
purging/canceling all i/o in IRP_MJ_CLEANUP on the PFILE_OBJECT for all
i/o sent on the PFILE_OBJECT before completing the cleanup irp?

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
Sent: Thursday, December 07, 2006 2:38 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Premature driver removal?!

I’m analyzing memory dump with BSOD
DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce). I can’t
believe what I see.

It occured during USB device surprise removal. Driver apparently
received surprise remove and remove IRPs and blocked waiting until
everything completes. In parallel or as a result of SR it received an
IOCTL from user mode app which in turn lead to port reset which can take
some times in this state. When done, IOCTL dispatch completed the IRP
and decremented IO count which in turn lead to unblocked remove IRP
handler finishing and removing the device. User mode app then probably
closed a handle from different thread which caused driver unload. But
IOCTL dispatch still had few instructions to execute and crashed because
code wasn’t in the memory.

I just took Oney’s WDM book and he clearly states driver remains in the
memory until it both completes IRP and returns from the dispatch
routine. But it didn’t. Someting changed in the XP SP2? Or is it app
responsibility to not close handles before synchronous functions called
from other threads return? (seems impossible anyway)

Analysis below. I forced WinDbg to load symbols for my unloaded driver
(tcusb.sys) so don’t be confused it is there. Originally there was just
reference to unloaded driver. I examined handle 0x450 passed to
NtDeviceIoControlFile() and it is already freed. Also my device
extension indicates all handles were freed.

kd> !analyze -v
************************************************************************
*******
*
*
* Bugcheck Analysis
*
*
*
************************************************************************
*******

DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce)
A driver unloaded without cancelling timers, DPCs, worker threads, etc.
The broken driver’s name is displayed on the screen.
Arguments:
Arg1: f7972185, memory referenced
Arg2: 00000000, value 0 = read operation, 1 = write operation
Arg3: f7972185, If non-zero, the instruction address which referenced
the bad memory
address.
Arg4: 00000000, Mm internal code.

Debugging Details:

READ_ADDRESS: f7972185

FAULTING_IP:
tcusb!BulkUsb_IoDecrement+61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xCE

PROCESS_NAME: sdk_test.exe

TRAP_FRAME: eea73a30 – (.trap ffffffffeea73a30)
ErrCode = 00000000
eax=80a16123 ebx=82d9c848 ecx=00000000 edx=00000000 esi=82d9c824>
edi=8080cf16
eip=f7972185 esp=eea73aa4 ebp=eea73ab4 iopl=0 nv up ei pl nz na
po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
efl=00010202
tcusb!BulkUsb_IoDecrement+0x61:
f7972185 ?? ???
Resetting default scope

LAST_CONTROL_TRANSFER: from 8084d6fb to 8085c9ae

FAILED_INSTRUCTION_ADDRESS:
tcusb!BulkUsb_IoDecrement+61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

STACK_TEXT:
eea739cc 8084d6fb 00000050 f7972185 00000000 nt!KeBugCheckEx+0x1b
eea73a18 8080aff1 00000000 f7972185 00000000 nt!MmAccessFault+0x6f5
eea73a18 f7972185 00000000 f7972185 00000000 nt!KiTrap0E+0xcc
eea73ab4 f7976b58 00d9c758 82de6fb8 831c5278
tcusb!BulkUsb_IoDecrement+0x61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
eea73c34 8080cd77 82d9c6a0 8324f008 80a16070 tcusb!DispatchDevCtrl+0xed6
[e:\build\tcdrv\build-0071-main\src\tcdrv\ioctl.c @ 1297]
eea73c44 808939ab 8324f0e4 832762e0 8324f008 nt!IopfCallDriver+0x31
eea73c58 808a69f7 82d9c6a0 8324f008 832762e0
nt!IopSynchronousServiceTail+0x60
eea73d00 808a8bfa 00000450 00000464 00000000 nt!IopXxxControlFile+0x611
eea73d34 8080806b 00000450 00000464 00000000
nt!NtDeviceIoControlFile+0x2a
eea73d34 7c90eb94 00000450 00000464 00000000 nt!KiFastCallEntry+0xf8
WARNING: Frame IP not in any known module. Following frames may be
wrong.
00129a34 00000000 00000000 00000000 00000000 0x7c90eb94

STACK_COMMAND: kb

FOLLOWUP_IP:
tcusb!BulkUsb_IoDecrement+61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

FAULTING_SOURCE_CODE:
2792:
2793: KeReleaseSpinLock(&DeviceExtension->IOCountLock, oldIrql);
2794:
2795: TTRACE_LEAVE((“BulkUsb_IoDecrement() -> %x”, result));
> 2796: return(result);
2797: }
2798:
2799: NTSTATUS
2800: CanStopDevice(
2801: IN PDEVICE_OBJECT DeviceObject,

SYMBOL_STACK_INDEX: 3

SYMBOL_NAME: tcusb!BulkUsb_IoDecrement+61

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: tcusb

IMAGE_NAME: tcusb.sys

DEBUG_FLR_IMAGE_TIMESTAMP: fffffffe

FAILURE_BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61

BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61

Followup: MachineOwner

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer


Questions? First check the Kernel Driver FAQ at http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

I am asking the i/o manager folks about this. Are you opening the
handle as overlapped? If so, are you passing a OVERLAPPED structure to
the DeviceIoControl call?

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
Sent: Thursday, December 07, 2006 3:01 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

Yes, but only queued IRPs. This one was synchronous. Sent to IOCTL
dispatch handler which processes it, completes and doesn’t return
pending status. Do you mean I should wait in cleanup dispatch until all
synchronous IRPs are completed, too? I don’t see how to do it without
races.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


From:
xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com
] on behalf of Doron Holan[SMTP:xxxxx@microsoft.com]
Reply To: Windows System Software Devs Interest List
Sent: Thursday, December 07, 2006 11:46 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

You will not get a remove until the app closes its handle, there is no
way the remove and the app’s closing of the handle can race. Are you
purging/canceling all i/o in IRP_MJ_CLEANUP on the PFILE_OBJECT for
all
i/o sent on the PFILE_OBJECT before completing the cleanup irp?

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
Sent: Thursday, December 07, 2006 2:38 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Premature driver removal?!

I’m analyzing memory dump with BSOD
DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce). I can’t
believe what I see.

It occured during USB device surprise removal. Driver apparently
received surprise remove and remove IRPs and blocked waiting until
everything completes. In parallel or as a result of SR it received an
IOCTL from user mode app which in turn lead to port reset which can
take
some times in this state. When done, IOCTL dispatch completed the IRP
and decremented IO count which in turn lead to unblocked remove IRP
handler finishing and removing the device. User mode app then probably
closed a handle from different thread which caused driver unload. But
IOCTL dispatch still had few instructions to execute and crashed
because
code wasn’t in the memory.

I just took Oney’s WDM book and he clearly states driver remains in
the
memory until it both completes IRP and returns from the dispatch
routine. But it didn’t. Someting changed in the XP SP2? Or is it app
responsibility to not close handles before synchronous functions
called
from other threads return? (seems impossible anyway)

Analysis below. I forced WinDbg to load symbols for my unloaded driver
(tcusb.sys) so don’t be confused it is there. Originally there was
just
reference to unloaded driver. I examined handle 0x450 passed to
NtDeviceIoControlFile() and it is already freed. Also my device
extension indicates all handles were freed.

kd> !analyze -v

************************************************************************

*******
*
*
* Bugcheck Analysis
*
*
*

************************************************************************

*******

DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce)
A driver unloaded without cancelling timers, DPCs, worker threads,
etc.
The broken driver’s name is displayed on the screen.
Arguments:
Arg1: f7972185, memory referenced
Arg2: 00000000, value 0 = read operation, 1 = write operation
Arg3: f7972185, If non-zero, the instruction address which referenced
the bad memory
address.
Arg4: 00000000, Mm internal code.

Debugging Details:

READ_ADDRESS: f7972185

FAULTING_IP:
tcusb!BulkUsb_IoDecrement+61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xCE

PROCESS_NAME: sdk_test.exe

TRAP_FRAME: eea73a30 – (.trap ffffffffeea73a30)
ErrCode = 00000000
eax=80a16123 ebx=82d9c848 ecx=00000000 edx=00000000 esi=82d9c824>
edi=8080cf16
eip=f7972185 esp=eea73aa4 ebp=eea73ab4 iopl=0 nv up ei pl nz
na
po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
efl=00010202
tcusb!BulkUsb_IoDecrement+0x61:
f7972185 ?? ???
Resetting default scope

LAST_CONTROL_TRANSFER: from 8084d6fb to 8085c9ae

FAILED_INSTRUCTION_ADDRESS:
tcusb!BulkUsb_IoDecrement+61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

STACK_TEXT:
eea739cc 8084d6fb 00000050 f7972185 00000000 nt!KeBugCheckEx+0x1b
eea73a18 8080aff1 00000000 f7972185 00000000 nt!MmAccessFault+0x6f5
eea73a18 f7972185 00000000 f7972185 00000000 nt!KiTrap0E+0xcc
eea73ab4 f7976b58 00d9c758 82de6fb8 831c5278
tcusb!BulkUsb_IoDecrement+0x61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
eea73c34 8080cd77 82d9c6a0 8324f008 80a16070
tcusb!DispatchDevCtrl+0xed6
[e:\build\tcdrv\build-0071-main\src\tcdrv\ioctl.c @ 1297]
eea73c44 808939ab 8324f0e4 832762e0 8324f008 nt!IopfCallDriver+0x31
eea73c58 808a69f7 82d9c6a0 8324f008 832762e0
nt!IopSynchronousServiceTail+0x60
eea73d00 808a8bfa 00000450 00000464 00000000
nt!IopXxxControlFile+0x611
eea73d34 8080806b 00000450 00000464 00000000
nt!NtDeviceIoControlFile+0x2a
eea73d34 7c90eb94 00000450 00000464 00000000 nt!KiFastCallEntry+0xf8
WARNING: Frame IP not in any known module. Following frames may be
wrong.
00129a34 00000000 00000000 00000000 00000000 0x7c90eb94

STACK_COMMAND: kb

FOLLOWUP_IP:
tcusb!BulkUsb_IoDecrement+61
[e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
f7972185 ?? ???

FAULTING_SOURCE_CODE:
2792:
2793: KeReleaseSpinLock(&DeviceExtension->IOCountLock, oldIrql);
2794:
2795: TTRACE_LEAVE((“BulkUsb_IoDecrement() -> %x”, result));
> 2796: return(result);
2797: }
2798:
2799: NTSTATUS
2800: CanStopDevice(
2801: IN PDEVICE_OBJECT DeviceObject,

SYMBOL_STACK_INDEX: 3

SYMBOL_NAME: tcusb!BulkUsb_IoDecrement+61

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: tcusb

IMAGE_NAME: tcusb.sys

DEBUG_FLR_IMAGE_TIMESTAMP: fffffffe

FAILURE_BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61

BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61

Followup: MachineOwner

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Yes, it is overlapped handle and OVERLAPPED structure with valid event handle should be always passed to DeviceIoControl() call.

I already asked to reproduce it with debug driver so I can see traces. This should verify the analysis but I’m almost sure about it. The communication library which app uses has a separate thread which closes handle when receives notification about device removal. It probably occured in parallel or just after the DeviceIoControl() call. IOCTL dispatch tried to get port status and reset port which can take 1 sec and more when device is surprise removed. Enough to close finish.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Doron Holan[SMTP:xxxxx@microsoft.com]
Reply To: Windows System Software Devs Interest List
Sent: Friday, December 08, 2006 12:12 AM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

I am asking the i/o manager folks about this. Are you opening the
handle as overlapped? If so, are you passing a OVERLAPPED structure to
the DeviceIoControl call?

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
Sent: Thursday, December 07, 2006 3:01 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

Yes, but only queued IRPs. This one was synchronous. Sent to IOCTL
dispatch handler which processes it, completes and doesn’t return
pending status. Do you mean I should wait in cleanup dispatch until all
synchronous IRPs are completed, too? I don’t see how to do it without
races.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

> ----------
> From:
xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com
] on behalf of Doron Holan[SMTP:xxxxx@microsoft.com]
> Reply To: Windows System Software Devs Interest List
> Sent: Thursday, December 07, 2006 11:46 PM
> To: Windows System Software Devs Interest List
> Subject: RE: [ntdev] Premature driver removal?!
>
> You will not get a remove until the app closes its handle, there is no
> way the remove and the app’s closing of the handle can race. Are you
> purging/canceling all i/o in IRP_MJ_CLEANUP on the PFILE_OBJECT for
all
> i/o sent on the PFILE_OBJECT before completing the cleanup irp?
>
> d
>
> -----Original Message-----
> From: xxxxx@lists.osr.com
> [mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
> Sent: Thursday, December 07, 2006 2:38 PM
> To: Windows System Software Devs Interest List
> Subject: [ntdev] Premature driver removal?!
>
> I’m analyzing memory dump with BSOD
> DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce). I can’t
> believe what I see.
>
> It occured during USB device surprise removal. Driver apparently
> received surprise remove and remove IRPs and blocked waiting until
> everything completes. In parallel or as a result of SR it received an
> IOCTL from user mode app which in turn lead to port reset which can
take
> some times in this state. When done, IOCTL dispatch completed the IRP
> and decremented IO count which in turn lead to unblocked remove IRP
> handler finishing and removing the device. User mode app then probably
> closed a handle from different thread which caused driver unload. But
> IOCTL dispatch still had few instructions to execute and crashed
because
> code wasn’t in the memory.
>
> I just took Oney’s WDM book and he clearly states driver remains in
the
> memory until it both completes IRP and returns from the dispatch>
> routine. But it didn’t. Someting changed in the XP SP2? Or is it app
> responsibility to not close handles before synchronous functions
called
> from other threads return? (seems impossible anyway)
>
> Analysis below. I forced WinDbg to load symbols for my unloaded driver
> (tcusb.sys) so don’t be confused it is there. Originally there was
just
> reference to unloaded driver. I examined handle 0x450 passed to
> NtDeviceIoControlFile() and it is already freed. Also my device
> extension indicates all handles were freed.
>
> kd> !analyze -v
>
************************************************************************
> *******
> *
> *
> * Bugcheck Analysis
> *
> *
> *
>
************************************************************************
> *******
>
> DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce)
> A driver unloaded without cancelling timers, DPCs, worker threads,
etc.
> The broken driver’s name is displayed on the screen.
> Arguments:
> Arg1: f7972185, memory referenced
> Arg2: 00000000, value 0 = read operation, 1 = write operation
> Arg3: f7972185, If non-zero, the instruction address which referenced
> the bad memory
> address.
> Arg4: 00000000, Mm internal code.
>
> Debugging Details:
> ------------------
>
>
> READ_ADDRESS: f7972185
>
> FAULTING_IP:
> tcusb!BulkUsb_IoDecrement+61
> [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> f7972185 ?? ???
>
> DEFAULT_BUCKET_ID: DRIVER_FAULT
>
> BUGCHECK_STR: 0xCE
>
> PROCESS_NAME: sdk_test.exe
>
> TRAP_FRAME: eea73a30 – (.trap ffffffffeea73a30)
> ErrCode = 00000000
> eax=80a16123 ebx=82d9c848 ecx=00000000 edx=00000000 esi=82d9c824>
> edi=8080cf16
> eip=f7972185 esp=eea73aa4 ebp=eea73ab4 iopl=0 nv up ei pl nz
na
> po nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> efl=00010202
> tcusb!BulkUsb_IoDecrement+0x61:
> f7972185 ?? ???
> Resetting default scope
>
> LAST_CONTROL_TRANSFER: from 8084d6fb to 8085c9ae
>
> FAILED_INSTRUCTION_ADDRESS:
> tcusb!BulkUsb_IoDecrement+61
> [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> f7972185 ?? ???
>
> STACK_TEXT:
> eea739cc 8084d6fb 00000050 f7972185 00000000 nt!KeBugCheckEx+0x1b
> eea73a18 8080aff1 00000000 f7972185 00000000 nt!MmAccessFault+0x6f5
> eea73a18 f7972185 00000000 f7972185 00000000 nt!KiTrap0E+0xcc
> eea73ab4 f7976b58 00d9c758 82de6fb8 831c5278
> tcusb!BulkUsb_IoDecrement+0x61
> [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> eea73c34 8080cd77 82d9c6a0 8324f008 80a16070
tcusb!DispatchDevCtrl+0xed6
> [e:\build\tcdrv\build-0071-main\src\tcdrv\ioctl.c @ 1297]
> eea73c44 808939ab 8324f0e4 832762e0 8324f008 nt!IopfCallDriver+0x31
> eea73c58 808a69f7 82d9c6a0 8324f008 832762e0
> nt!IopSynchronousServiceTail+0x60
> eea73d00 808a8bfa 00000450 00000464 00000000
nt!IopXxxControlFile+0x611
> eea73d34 8080806b 00000450 00000464 00000000
> nt!NtDeviceIoControlFile+0x2a
> eea73d34 7c90eb94 00000450 00000464 00000000 nt!KiFastCallEntry+0xf8
> WARNING: Frame IP not in any known module. Following frames may be
> wrong.
> 00129a34 00000000 00000000 00000000 00000000 0x7c90eb94
>
>
> STACK_COMMAND: kb
>
> FOLLOWUP_IP:
> tcusb!BulkUsb_IoDecrement+61
> [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> f7972185 ?? ???
>
> FAULTING_SOURCE_CODE:
> 2792:
> 2793: KeReleaseSpinLock(&DeviceExtension->IOCountLock, oldIrql);
> 2794:
> 2795: TTRACE_LEAVE((“BulkUsb_IoDecrement() -> %x”, result));
> > 2796: return(result);
> 2797: }
> 2798:
> 2799: NTSTATUS
> 2800: CanStopDevice(
> 2801: IN PDEVICE_OBJECT DeviceObject,
>
>
> SYMBOL_STACK_INDEX: 3
>
> SYMBOL_NAME: tcusb!BulkUsb_IoDecrement+61>
>
> FOLLOWUP_NAME: MachineOwner
>
> MODULE_NAME: tcusb
>
> IMAGE_NAME: tcusb.sys
>
> DEBUG_FLR_IMAGE_TIMESTAMP: fffffffe
>
> FAILURE_BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61
>
> BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61
>
> Followup: MachineOwner
> ---------
>
> Best regards,
>
> Michal Vodicka
> UPEK, Inc.
> [xxxxx@upek.com, http://www.upek.com]
>
>
> —
> Questions? First check the Kernel Driver FAQ at
> http://www.osronline.com/article.cfm?id=256
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>
> —
> Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256
>
> To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer
>


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer


Questions? First check the Kernel Driver FAQ at http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

The I/O manager folks said that is a bug in the I/O manager. It was
fixed in Vista, but previous versions will have this problem. If you
can alster the application so that all threads sending I/O back off and
drain sent i/o before you close the handle, that would work around the
problem.

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
Sent: Thursday, December 07, 2006 4:38 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

Yes, it is overlapped handle and OVERLAPPED structure with valid event
handle should be always passed to DeviceIoControl() call.

I already asked to reproduce it with debug driver so I can see traces.
This should verify the analysis but I’m almost sure about it. The
communication library which app uses has a separate thread which closes
handle when receives notification about device removal. It probably
occured in parallel or just after the DeviceIoControl() call. IOCTL
dispatch tried to get port status and reset port which can take 1 sec
and more when device is surprise removed. Enough to close finish.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


From:
xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com
] on behalf of Doron Holan[SMTP:xxxxx@microsoft.com]
Reply To: Windows System Software Devs Interest List
Sent: Friday, December 08, 2006 12:12 AM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

I am asking the i/o manager folks about this. Are you opening the
handle as overlapped? If so, are you passing a OVERLAPPED structure
to
the DeviceIoControl call?

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
Sent: Thursday, December 07, 2006 3:01 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

Yes, but only queued IRPs. This one was synchronous. Sent to IOCTL
dispatch handler which processes it, completes and doesn’t return
pending status. Do you mean I should wait in cleanup dispatch until
all
synchronous IRPs are completed, too? I don’t see how to do it without
races.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

> ----------
> From:

xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com

] on behalf of Doron Holan[SMTP:xxxxx@microsoft.com]
> Reply To: Windows System Software Devs Interest List
> Sent: Thursday, December 07, 2006 11:46 PM
> To: Windows System Software Devs Interest List
> Subject: RE: [ntdev] Premature driver removal?!
>
> You will not get a remove until the app closes its handle, there is
no
> way the remove and the app’s closing of the handle can race. Are
you
> purging/canceling all i/o in IRP_MJ_CLEANUP on the PFILE_OBJECT for
all
> i/o sent on the PFILE_OBJECT before completing the cleanup irp?
>
> d
>
> -----Original Message-----
> From: xxxxx@lists.osr.com
> [mailto:xxxxx@lists.osr.com] On Behalf Of Michal
Vodicka
> Sent: Thursday, December 07, 2006 2:38 PM
> To: Windows System Software Devs Interest List
> Subject: [ntdev] Premature driver removal?!
>
> I’m analyzing memory dump with BSOD
> DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce). I can’t
> believe what I see.
>
> It occured during USB device surprise removal. Driver apparently
> received surprise remove and remove IRPs and blocked waiting until
> everything completes. In parallel or as a result of SR it received
an
> IOCTL from user mode app which in turn lead to port reset which can
take
> some times in this state. When done, IOCTL dispatch completed the
IRP
> and decremented IO count which in turn lead to unblocked remove IRP
> handler finishing and removing the device. User mode app then
probably
> closed a handle from different thread which caused driver unload.
But
> IOCTL dispatch still had few instructions to execute and crashed
because
> code wasn’t in the memory.
>
> I just took Oney’s WDM book and he clearly states driver remains in
the
> memory until it both completes IRP and returns from the dispatch>
> routine. But it didn’t. Someting changed in the XP SP2? Or is it app
> responsibility to not close handles before synchronous functions
called
> from other threads return? (seems impossible anyway)
>
> Analysis below. I forced WinDbg to load symbols for my unloaded
driver
> (tcusb.sys) so don’t be confused it is there. Originally there was
just
> reference to unloaded driver. I examined handle 0x450 passed to
> NtDeviceIoControlFile() and it is already freed. Also my device
> extension indicates all handles were freed.
>
> kd> !analyze -v
>

************************************************************************

> *******
> *
> *
> * Bugcheck Analysis
> *
> *
> *
>

************************************************************************

> *******
>
> DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce)
> A driver unloaded without cancelling timers, DPCs, worker threads,
etc.
> The broken driver’s name is displayed on the screen.
> Arguments:
> Arg1: f7972185, memory referenced
> Arg2: 00000000, value 0 = read operation, 1 = write operation
> Arg3: f7972185, If non-zero, the instruction address which
referenced
> the bad memory
> address.
> Arg4: 00000000, Mm internal code.
>
> Debugging Details:
> ------------------
>
>
> READ_ADDRESS: f7972185
>
> FAULTING_IP:
> tcusb!BulkUsb_IoDecrement+61
> [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> f7972185 ?? ???
>
> DEFAULT_BUCKET_ID: DRIVER_FAULT
>
> BUGCHECK_STR: 0xCE
>
> PROCESS_NAME: sdk_test.exe
>
> TRAP_FRAME: eea73a30 – (.trap ffffffffeea73a30)
> ErrCode = 00000000
> eax=80a16123 ebx=82d9c848 ecx=00000000 edx=00000000 esi=82d9c824>
> edi=8080cf16
> eip=f7972185 esp=eea73aa4 ebp=eea73ab4 iopl=0 nv up ei pl nz
na
> po nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> efl=00010202
> tcusb!BulkUsb_IoDecrement+0x61:
> f7972185 ?? ???
> Resetting default scope
>
> LAST_CONTROL_TRANSFER: from 8084d6fb to 8085c9ae
>
> FAILED_INSTRUCTION_ADDRESS:
> tcusb!BulkUsb_IoDecrement+61
> [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> f7972185 ?? ???
>
> STACK_TEXT:
> eea739cc 8084d6fb 00000050 f7972185 00000000 nt!KeBugCheckEx+0x1b
> eea73a18 8080aff1 00000000 f7972185 00000000 nt!MmAccessFault+0x6f5
> eea73a18 f7972185 00000000 f7972185 00000000 nt!KiTrap0E+0xcc
> eea73ab4 f7976b58 00d9c758 82de6fb8 831c5278
> tcusb!BulkUsb_IoDecrement+0x61
> [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> eea73c34 8080cd77 82d9c6a0 8324f008 80a16070
tcusb!DispatchDevCtrl+0xed6
> [e:\build\tcdrv\build-0071-main\src\tcdrv\ioctl.c @ 1297]
> eea73c44 808939ab 8324f0e4 832762e0 8324f008 nt!IopfCallDriver+0x31
> eea73c58 808a69f7 82d9c6a0 8324f008 832762e0
> nt!IopSynchronousServiceTail+0x60
> eea73d00 808a8bfa 00000450 00000464 00000000
nt!IopXxxControlFile+0x611
> eea73d34 8080806b 00000450 00000464 00000000
> nt!NtDeviceIoControlFile+0x2a
> eea73d34 7c90eb94 00000450 00000464 00000000 nt!KiFastCallEntry+0xf8
> WARNING: Frame IP not in any known module. Following frames may be
> wrong.
> 00129a34 00000000 00000000 00000000 00000000 0x7c90eb94
>
>
> STACK_COMMAND: kb
>
> FOLLOWUP_IP:
> tcusb!BulkUsb_IoDecrement+61
> [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> f7972185 ?? ???
>
> FAULTING_SOURCE_CODE:
> 2792:
> 2793: KeReleaseSpinLock(&DeviceExtension->IOCountLock,
oldIrql);
> 2794:
> 2795: TTRACE_LEAVE((“BulkUsb_IoDecrement() -> %x”, result));
> > 2796: return(result);
> 2797: }
> 2798:
> 2799: NTSTATUS
> 2800: CanStopDevice(
> 2801: IN PDEVICE_OBJECT DeviceObject,
>
>
> SYMBOL_STACK_INDEX: 3
>
> SYMBOL_NAME: tcusb!BulkUsb_IoDecrement+61>
>
> FOLLOWUP_NAME: MachineOwner
>
> MODULE_NAME: tcusb
>
> IMAGE_NAME: tcusb.sys
>
> DEBUG_FLR_IMAGE_TIMESTAMP: fffffffe
>
> FAILURE_BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61
>
> BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61
>
> Followup: MachineOwner
> ---------
>
> Best regards,
>
> Michal Vodicka
> UPEK, Inc.
> [xxxxx@upek.com, http://www.upek.com]
>
>
> —
> Questions? First check the Kernel Driver FAQ at
> http://www.osronline.com/article.cfm?id=256
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>
> —
> Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256
>
> To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer
>


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Thanks, Doron. It is hard to believe there is a bug in the IO manager, which should be one of the best tested pieces of NT code.

If there any chance to back port the fix to XP as a hotfix? I’m affraid with faster and multi core CPUs the probability of this BSOD raises. I’d agree closing handle before other IO completes is non-standard but what if app crashes or is terminated from outside? If there is IO in progress there would be the same races. I can even imagine virus or trojan which invokes BSOD this way without admin rights.

Ironically, our app behaves this way because of MS advice. It is a workaround against yet another USB bug related to ESD recovery (if I remember correctly). In this case it was important to close app handle ASAP otherwise recovery didn’t work correctly. I already forgot details, it was probably related to USB reenumeration after ESD which failed if driver wasn’t unloaded fast enough and caused lockup or non functional device. We complained about this “solution” as driver and OS shouldn’t depend on app behaviour. They eventually agreed but weren’t able to do anything better so we ended with this workaround.

Oh well. I’m probably able to rewrite app to postpone close until all IO calls returns. Hopefully it’d fix this problem and still don’t break ESD recovery.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Doron Holan[SMTP:xxxxx@microsoft.com]
Reply To: Windows System Software Devs Interest List
Sent: Friday, December 08, 2006 7:50 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

The I/O manager folks said that is a bug in the I/O manager. It was
fixed in Vista, but previous versions will have this problem. If you
can alster the application so that all threads sending I/O back off and
drain sent i/o before you close the handle, that would work around the
problem.

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
Sent: Thursday, December 07, 2006 4:38 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Premature driver removal?!

Yes, it is overlapped handle and OVERLAPPED structure with valid event
handle should be always passed to DeviceIoControl() call.

I already asked to reproduce it with debug driver so I can see traces.
This should verify the analysis but I’m almost sure about it. The
communication library which app uses has a separate thread which closes
handle when receives notification about device removal. It probably
occured in parallel or just after the DeviceIoControl() call. IOCTL
dispatch tried to get port status and reset port which can take 1 sec
and more when device is surprise removed. Enough to close finish.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

> ----------
> From:
xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com
] on behalf of Doron Holan[SMTP:xxxxx@microsoft.com]
> Reply To: Windows System Software Devs Interest List
> Sent: Friday, December 08, 2006 12:12 AM
> To: Windows System Software Devs Interest List
> Subject: RE: [ntdev] Premature driver removal?!
>
> I am asking the i/o manager folks about this. Are you opening the
> handle as overlapped? If so, are you passing a OVERLAPPED structure
to
> the DeviceIoControl call?
>
> d
>
> -----Original Message-----
> From: xxxxx@lists.osr.com
> [mailto:xxxxx@lists.osr.com] On Behalf Of Michal Vodicka
> Sent: Thursday, December 07, 2006 3:01 PM
> To: Windows System Software Devs Interest List>
> Subject: RE: [ntdev] Premature driver removal?!
>
> Yes, but only queued IRPs. This one was synchronous. Sent to IOCTL
> dispatch handler which processes it, completes and doesn’t return
> pending status. Do you mean I should wait in cleanup dispatch until
all
> synchronous IRPs are completed, too? I don’t see how to do it without
> races.
>
> Best regards,
>
> Michal Vodicka
> UPEK, Inc.
> [xxxxx@upek.com, http://www.upek.com]
>
>
> > ----------
> > From:
>
xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com
> ] on behalf of Doron Holan[SMTP:xxxxx@microsoft.com]
> > Reply To: Windows System Software Devs Interest List
> > Sent: Thursday, December 07, 2006 11:46 PM
> > To: Windows System Software Devs Interest List
> > Subject: RE: [ntdev] Premature driver removal?!
> >
> > You will not get a remove until the app closes its handle, there is
no
> > way the remove and the app’s closing of the handle can race. Are
you
> > purging/canceling all i/o in IRP_MJ_CLEANUP on the PFILE_OBJECT for
> all
> > i/o sent on the PFILE_OBJECT before completing the cleanup irp?
> >
> > d
> >
> > -----Original Message-----
> > From: xxxxx@lists.osr.com
> > [mailto:xxxxx@lists.osr.com] On Behalf Of Michal
Vodicka
> > Sent: Thursday, December 07, 2006 2:38 PM
> > To: Windows System Software Devs Interest List
> > Subject: [ntdev] Premature driver removal?!
> >
> > I’m analyzing memory dump with BSOD
> > DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce). I can’t
> > believe what I see.
> >
> > It occured during USB device surprise removal. Driver apparently
> > received surprise remove and remove IRPs and blocked waiting until
> > everything completes. In parallel or as a result of SR it received
an
> > IOCTL from user mode app which in turn lead to port reset which can
> take
> > some times in this state. When done, IOCTL dispatch completed the
IRP
> > and decremented IO count which in turn lead to unblocked remove IRP
> > handler finishing and removing the device. User mode app then
probably
> > closed a handle from different thread which caused driver unload.
But
> > IOCTL dispatch still had few instructions to execute and crashed
> because
> > code wasn’t in the memory.
> >
> > I just took Oney’s WDM book and he clearly states driver remains in
> the
> > memory until it both completes IRP and returns from the dispatch>
> > routine. But it didn’t. Someting changed in the XP SP2? Or is it app
> > responsibility to not close handles before synchronous functions
> called
> > from other threads return? (seems impossible anyway)
> >
> > Analysis below. I forced WinDbg to load symbols for my unloaded
driver
> > (tcusb.sys) so don’t be confused it is there. Originally there was
> just
> > reference to unloaded driver. I examined handle 0x450 passed to
> > NtDeviceIoControlFile() and it is already freed. Also my device
> > extension indicates all handles were freed.
> >
> > kd> !analyze -v
> >
>
************************************************************************
> > *******
> > *
> > *
> > * Bugcheck Analysis
> > *
> > *
> > *
> >
>
************************************************************************
> > *******
> >
> > DRIVER_UNLOADED_WITHOUT_CANCELLING_PENDING_OPERATIONS (ce)
> > A driver unloaded without cancelling timers, DPCs, worker threads,
> etc.
> > The broken driver’s name is displayed on the screen.
> > Arguments:
> > Arg1: f7972185, memory referenced
> > Arg2: 00000000, value 0 = read operation, 1 = write operation
> > Arg3: f7972185, If non-zero, the instruction address which
referenced
> > the bad memory
> > address.
> > Arg4: 00000000, Mm internal code.>
> >
> > Debugging Details:
> > ------------------
> >
> >
> > READ_ADDRESS: f7972185
> >
> > FAULTING_IP:
> > tcusb!BulkUsb_IoDecrement+61
> > [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> > f7972185 ?? ???
> >
> > DEFAULT_BUCKET_ID: DRIVER_FAULT
> >
> > BUGCHECK_STR: 0xCE
> >
> > PROCESS_NAME: sdk_test.exe
> >
> > TRAP_FRAME: eea73a30 – (.trap ffffffffeea73a30)
> > ErrCode = 00000000
> > eax=80a16123 ebx=82d9c848 ecx=00000000 edx=00000000 esi=82d9c824>
> > edi=8080cf16
> > eip=f7972185 esp=eea73aa4 ebp=eea73ab4 iopl=0 nv up ei pl nz
> na
> > po nc
> > cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> > efl=00010202
> > tcusb!BulkUsb_IoDecrement+0x61:
> > f7972185 ?? ???
> > Resetting default scope
> >
> > LAST_CONTROL_TRANSFER: from 8084d6fb to 8085c9ae
> >
> > FAILED_INSTRUCTION_ADDRESS:
> > tcusb!BulkUsb_IoDecrement+61
> > [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> > f7972185 ?? ???
> >
> > STACK_TEXT:
> > eea739cc 8084d6fb 00000050 f7972185 00000000 nt!KeBugCheckEx+0x1b
> > eea73a18 8080aff1 00000000 f7972185 00000000 nt!MmAccessFault+0x6f5
> > eea73a18 f7972185 00000000 f7972185 00000000 nt!KiTrap0E+0xcc
> > eea73ab4 f7976b58 00d9c758 82de6fb8 831c5278
> > tcusb!BulkUsb_IoDecrement+0x61
> > [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> > eea73c34 8080cd77 82d9c6a0 8324f008 80a16070
> tcusb!DispatchDevCtrl+0xed6
> > [e:\build\tcdrv\build-0071-main\src\tcdrv\ioctl.c @ 1297]
> > eea73c44 808939ab 8324f0e4 832762e0 8324f008 nt!IopfCallDriver+0x31
> > eea73c58 808a69f7 82d9c6a0 8324f008 832762e0
> > nt!IopSynchronousServiceTail+0x60
> > eea73d00 808a8bfa 00000450 00000464 00000000
> nt!IopXxxControlFile+0x611
> > eea73d34 8080806b 00000450 00000464 00000000
> > nt!NtDeviceIoControlFile+0x2a
> > eea73d34 7c90eb94 00000450 00000464 00000000 nt!KiFastCallEntry+0xf8
> > WARNING: Frame IP not in any known module. Following frames may be
> > wrong.
> > 00129a34 00000000 00000000 00000000 00000000 0x7c90eb94
> >
> >
> > STACK_COMMAND: kb
> >
> > FOLLOWUP_IP:
> > tcusb!BulkUsb_IoDecrement+61
> > [e:\build\tcdrv\build-0071-main\src\tcdrv\bulkpnp.c @ 2796]
> > f7972185 ?? ???
> >
> > FAULTING_SOURCE_CODE:
> > 2792:
> > 2793: KeReleaseSpinLock(&DeviceExtension->IOCountLock,
oldIrql);
> > 2794:
> > 2795: TTRACE_LEAVE((“BulkUsb_IoDecrement() -> %x”, result));
> > > 2796: return(result);
> > 2797: }
> > 2798:
> > 2799: NTSTATUS
> > 2800: CanStopDevice(
> > 2801: IN PDEVICE_OBJECT DeviceObject,
> >
> >
> > SYMBOL_STACK_INDEX: 3
> >
> > SYMBOL_NAME: tcusb!BulkUsb_IoDecrement+61>
> >
> > FOLLOWUP_NAME: MachineOwner
> >
> > MODULE_NAME: tcusb
> >
> > IMAGE_NAME: tcusb.sys
> >
> > DEBUG_FLR_IMAGE_TIMESTAMP: fffffffe
> >
> > FAILURE_BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61
> >
> > BUCKET_ID: 0xCE_BAD_IP_tcusb!BulkUsb_IoDecrement+61
> >
> > Followup: MachineOwner
> > ---------
> >
> > Best regards,
> >
> > Michal Vodicka
> > UPEK, Inc.
> > [xxxxx@upek.com, http://www.upek.com]
> >
> >
> > —
> > Questions? First check the Kernel Driver FAQ at
> > http://www.osronline.com/article.cfm?id=256
> >
> > To unsubscribe, visit the List Server section of OSR Online at
> > http://www.osronline.com/page.cfm?name=ListServer
> >
> > —
> > Questions? First check the Kernel Driver FAQ at
> http://www.osronline.com/article.cfm?id=256
> >
> > To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
> >
>
> —
> Questions? First check the Kernel Driver FAQ at
> http://www.osronline.com/article.cfm?id=256\>
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>
> —
> Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256
>
> To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer
>


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer


Questions? First check the Kernel Driver FAQ at http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer