This is more a bug report than a question, but comments are appreciated. Memory dumps are available if anybody at MS is interested.
Today our QA reported two BSODs to me, on two different computers. Both occurred within a short time of each other, and both computers were connected to the same outlet, to which some kind of noise generator was also grounded. I presume the noise caused USB problems and surprise removals, probably repeated ones. However, that shouldn’t be a reason for the OS to crash. In both cases usbhub.sys used an invalid pointer. In one case only OS drivers were on the stack, because usbscan.sys was used as the USB function driver; in the other case my driver was also there, but I believe it is innocent. The driver is based on the BulkUsb sample and works similarly to UsbScan. The dump analysis is below.
In both dumps the failing code looks very similar:
push dword ptr [reg+0x20c]
call dword ptr [reg+0x22c]
and the indirect call causes the crash because the pointer is bogus or NULL. I suspect a race condition in which a memory block is freed too soon.
In the first case, my driver is involved. It just forwards the remove IRP down the stack. At the end I dumped the faulting call and the memory it references. The memory contains nonsense, and the pointer used probably points to an already deallocated and reused memory block:
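To make the "very similar" claim checkable, here is a minimal Python sketch that decodes the raw instruction bytes shown in the `u` output of each dump. It only handles the `FF /r` group with a `[reg+disp32]` operand, which is all that appears here; the helper name `decode_ff` is mine, not part of any tool.

```python
# Decode x86 "FF /r" group instructions of the form [reg+disp32].
# Only the encoding seen in the two dumps is handled (mod=10, no SIB byte).
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
FF_GROUP = {0: "inc", 1: "dec", 2: "call", 3: "call far",
            4: "jmp", 5: "jmp far", 6: "push"}


def decode_ff(hex_bytes):
    """Return (mnemonic, base register, displacement) for a 6-byte FF /r."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) != 6 or raw[0] != 0xFF:
        raise ValueError("expected a 6-byte instruction starting with FF")
    modrm = raw[1]
    mod, op, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100 or op not in FF_GROUP:
        raise ValueError("only [reg+disp32] without SIB is handled")
    disp = int.from_bytes(raw[2:6], "little")
    return FF_GROUP[op], REGS[rm], disp


if __name__ == "__main__":
    # Bytes copied from the two disassembly listings in this post.
    for code in ("ffb70c020000", "ff972c020000",   # first dump
                 "ffb30c020000", "ff932c020000"):  # second dump
        mnemonic, reg, disp = decode_ff(code)
        print(f"{code}  {mnemonic} dword ptr [{reg}+{disp:#x}]")
```

The two code paths differ only in the base register (edi in one, ebx in the other); the offsets 0x20c and 0x22c are the same.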
Microsoft (R) Windows Debugger Version 6.3.0017.0
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [T:\Michal\BSOD-T42\t42_1.dmp]
Kernel Summary Dump File: Only kernel address space is available
Symbol search path is: srv*e:\websymbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows XP Kernel Version 2600 (Service Pack 2) UP Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 2600.xpsp.050301-1521
Kernel base = 0x804d7000 PsLoadedModuleList = 0x8055a420
Debug session time: Tue Sep 20 21:38:00 2005
System Uptime: 1 days 1:50:58.956
Loading Kernel Symbols
…
Loading unloaded module list
…
Loading User Symbols
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck 7E, {c0000005, 49b70000, f7cc1ab8, f7cc17b4}
*** ERROR: Module load completed but symbols could not be loaded for tcusb.sys
Probably caused by : tcusb.sys ( tcusb+1c53 )
Followup: MachineOwner
kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 49b70000, The address that the exception occurred at
Arg3: f7cc1ab8, Exception Record Address
Arg4: f7cc17b4, Context Record Address
Debugging Details:
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at “0x%08lx” referenced memory at “0x%08lx”. The memory could not be “%s”.
FAULTING_IP:
+49b70000
49b70000 ?? ???
EXCEPTION_PARAMETER1: f7cc1ab8
CONTEXT: f7cc17b4 – (.cxr fffffffff7cc17b4)
eax=85062030 ebx=00000000 ecx=00000003 edx=8541aa74 esi=850620e8 edi=850d68e0
eip=49b70000 esp=f7cc1b80 ebp=f7cc1bac iopl=0 nv up ei ng nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010282
49b70000 ?? ???
Resetting default scope
DEFAULT_BUCKET_ID: DRIVER_FAULT
BUGCHECK_STR: 0x7E
LAST_CONTROL_TRANSFER: from f79066ff to 49b70000
STACK_TEXT:
WARNING: Frame IP not in any known module. Following frames may be wrong.
f7cc1b7c f79066ff 49b70000 68627375 70646f52 0x49b70000
f7cc1bac f790d661 85062030 850d68e0 8541a998 usbhub!USBH_PdoRemoveDevice+0x41
f7cc1bcc f7906952 850620e8 8541a998 00000002 usbhub!USBH_PdoPnP+0x5b
f7cc1bf0 f79041d8 010620e8 8541a998 854069b8 usbhub!USBH_PdoDispatch+0x5a
f7cc1c00 804e37f7 85062030 8541a998 85406720 usbhub!USBH_HubDispatch+0x48
f7cc1c10 f7bf3c53 8541aa74 8541a998 f7cc1c40 nt!IopfCallDriver+0x31
f7cc1c24 f7bf40ed 85406668 8541a998 8541aa98 tcusb!HandleRemoveDevice+0x149 [e:\build\tcdrv-build-0044\src\tcdrv\bulkpnp.c @ 1887]
f7cc1c40 804e37f7 85406668 85406720 f7cc1ccc tcusb!BulkUsb_DispatchPnP+0x83 [e:\build\tcdrv-build-0044\src\tcdrv\bulkpnp.c @ 189]
f7cc1c50 805d8a7d 85062030 85062030 00000002 nt!IopfCallDriver+0x31
f7cc1c7c 8061a6ba 85406668 f7cc1ca8 00000000 nt!IopSynchronousCall+0xb7
f7cc1cd0 805312c1 85062030 00000002 00000000 nt!IopRemoveDevice+0x93
f7cc1cf8 8061b82f e3432bc8 00000018 e3036198 nt!IopRemoveLockedDeviceNode+0x160
f7cc1d10 8061b89b 850a6300 00000002 e3036198 nt!IopDeleteLockedDeviceNode+0x34
f7cc1d44 8061b946 85062030 02036198 00000002 nt!IopDeleteLockedDeviceNodes+0x3f
f7cc1d74 804e426b 85493108 00000000 867b33c8 nt!IopDelayedRemoveWorker+0x4b
f7cc1dac 8057be15 85493108 00000000 00000000 nt!ExpWorkerThread+0x100
f7cc1ddc 804fa4da 804e4196 00000001 00000000 nt!PspSystemThreadStartup+0x34
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16
FOLLOWUP_IP:
tcusb!HandleRemoveDevice+149 [e:\build\tcdrv-build-0044\src\tcdrv\bulkpnp.c @ 1887]
f7bf3c53 6a18 push 0x18
SYMBOL_STACK_INDEX: 6
FOLLOWUP_NAME: MachineOwner
SYMBOL_NAME: tcusb!HandleRemoveDevice+149
MODULE_NAME: tcusb
IMAGE_NAME: tcusb.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 41412f2c
STACK_COMMAND: .cxr fffffffff7cc17b4 ; kb
BUCKET_ID: 0x7E_tcusb!HandleRemoveDevice+149
Followup: MachineOwner
kd> u f79066ff - 18
usbhub!USBH_PdoRemoveDevice+0x29:
f79066e7 51 push ecx
f79066e8 50 push eax
f79066e9 68526f6470 push 0x70646f52
f79066ee 6875736268 push 0x68627375
f79066f3 ffb70c020000 push dword ptr [edi+0x20c]
f79066f9 ff972c020000 call dword ptr [edi+0x22c]
f79066ff 83bfa800000001 cmp dword ptr [edi+0xa8],0x1
f7906706 740c jz usbhub!USBH_PdoRemoveDevice+0x56 (f7906714)
kd> dd 850d68e0 + 22c
850d6b0c 49b70000 0a020002 20736649 00000000
850d6b1c 49b70000 0a020002 20736649 00000000
850d6b2c 49b70000 0a020002 20736649 00000000
850d6b3c 49b70000 0a020002 20736649 00000000
850d6b4c 49b70000 0a020002 20736649 00000000
850d6b5c 49b70000 0a020002 20736649 00000000
850d6b6c 49b70000 0a020002 20736649 00000000
850d6b7c 49b70000 0a020002 20736649 00000000
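The dump above repeats every 16 bytes, and several of the values look like text when read in memory order. A rough Python sketch to decode them follows. The pool header split is an assumption on my side (the 32-bit XP-era layout with 9-bit sizes in 8-byte units), so treat that part as a hint rather than a proof; both helper names are made up for this post.

```python
import struct


def dword_as_text(value):
    """Return the four bytes of a DWORD in memory (little-endian) order."""
    return struct.pack("<I", value).decode("ascii", errors="replace")


def split_pool_header(first_dword):
    """Split the first DWORD of a pool header into its bit fields.

    ASSUMPTION: 32-bit XP-era header, i.e. two 16-bit words, each holding
    a 9-bit size (in 8-byte units) and a 7-bit index/type field.
    """
    low, high = first_dword & 0xFFFF, first_dword >> 16
    return {
        "previous_size_bytes": (low & 0x1FF) * 8,
        "pool_index": low >> 9,
        "block_size_bytes": (high & 0x1FF) * 8,
        "pool_type": high >> 9,
    }


if __name__ == "__main__":
    # Constants pushed before the faulting call, and the repeated DWORD
    # that follows 0a020002 in the dd output.
    for value in (0x68627375, 0x70646f52, 0x20736649):
        print(f"{value:08x} -> {dword_as_text(value)!r}")
    # The DWORD just before it, read as a pool header under the assumption above.
    print(split_pool_header(0x0a020002))
```

Under that assumption, `0a020002 20736649` reads as a 16-byte pool block tagged `Ifs `, preceded by another 16-byte block. This matches the 16-byte stride of the dump and would be consistent with the memory having been freed and reused for small allocations.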
In the second case, usbscan.sys is involved and usbhub crashes during a write. The indirect call in usbhub.sys goes through a NULL pointer.
(Yes, I noticed the wrong-symbols warning. It is bogus; it probably appears because this is a minidump and WinDbg can’t find the correct ntoskrnl.exe, since I analyzed it on a different computer.)
Microsoft (R) Windows Debugger Version 6.3.0017.0
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [T:\Michal\BSOD-PC013\Mini092005-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available
Symbol search path is: srv*e:\websymbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Unable to load image ntoskrnl.exe, Win32 error 2
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
*** ERROR: Module load completed but symbols could not be loaded for ntoskrnl.exe
Windows XP Kernel Version 2600 (Service Pack 2) UP Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Kernel base = 0x804d7000 PsLoadedModuleList = 0x8055a420
Debug session time: Tue Sep 20 11:41:49 2005
System Uptime: 5 days 0:54:20.002
Unable to load image ntoskrnl.exe, Win32 error 2
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
*** ERROR: Module load completed but symbols could not be loaded for ntoskrnl.exe
Loading Kernel Symbols
…
Loading unloaded module list
…
Loading User Symbols
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck 1000008E, {c0000005, 0, ee702b34, 0}
***** Kernel symbols are WRONG. Please fix symbols to do analysis.
Probably caused by : usbscan.sys ( usbscan!USWrite+7e )
Followup: MachineOwner
kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but …
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
An exception code of 0x80000002 (STATUS_DATATYPE_MISALIGNMENT) indicates
that an unaligned data reference was encountered. The trap frame will
supply additional information.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 00000000, The address that the exception occurred at
Arg3: ee702b34, Trap Frame
Arg4: 00000000
Debugging Details:
***** Kernel symbols are WRONG. Please fix symbols to do analysis.
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at “0x%08lx” referenced memory at “0x%08lx”. The memory could not be “%s”.
FAULTING_IP:
+0
00000000 ?? ???
TRAP_FRAME: ee702b34 – (.trap ffffffffee702b34)
ErrCode = 00000000
eax=00000002 ebx=82d53b30 ecx=00000000 edx=00220003 esi=820590e8 edi=82f0d008
eip=00000000 esp=ee702ba8 ebp=ee702bcc iopl=0 nv up ei pl zr na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
00000000 ?? ???
Resetting default scope
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT
BUGCHECK_STR: 0x8E
LAST_CONTROL_TRANSFER: from f877aae1 to 00000000
STACK_TEXT:
ee702ba4 f877aae1 00001000 68627375 64657621 0x0
ee702bcc f87781d8 82d52ee0 82f0d008 ee702c30 usbhub!USBH_PdoDispatch+0x1e9
ee702bdc 804e37f7 82059030 82f0d008 82f0d008 usbhub!USBH_HubDispatch+0x48
WARNING: Stack unwind information not available. Following frames may be wrong.
ee702c30 f8b14572 81fa28e0 82f0d008 00000001 nt+0xc7f7
ee702c60 804e37f7 81fa28e0 00000078 806ed070 usbscan!USWrite+0x7e
ee702c84 805784c0 81fa28e0 82f0d008 820a4a80 nt+0xc7f7
ee702d38 804de7ec 000000b0 00000208 00000000 nt+0xa14c0
ee702d64 7c90eb94 badb0d00 0012e8cc 00000000 nt+0x77ec
0012e924 00000000 00000000 00000000 00000000 0x7c90eb94
FOLLOWUP_IP:
usbscan!USWrite+7e
f8b14572 894508 mov [ebp+0x8],eax
SYMBOL_STACK_INDEX: 4
FOLLOWUP_NAME: MachineOwner
SYMBOL_NAME: usbscan!USWrite+7e
MODULE_NAME: usbscan
IMAGE_NAME: usbscan.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 41107b14
STACK_COMMAND: .trap ffffffffee702b34 ; kb
BUCKET_ID: WRONG_SYMBOLS
Followup: MachineOwner
kd> u f877aae1 - 18
usbhub!USBH_PdoDispatch+0x1d1:
f877aac9 6a00 push 0x0
f877aacb 6821766564 push 0x64657621
f877aad0 6875736268 push 0x68627375
f877aad5 ffb30c020000 push dword ptr [ebx+0x20c]
f877aadb ff932c020000 call dword ptr [ebx+0x22c]
f877aae1 8b4508 mov eax,[ebp+0x8]
f877aae4 834808ff or dword ptr [eax+0x8],0xffffffff
f877aae8 ff7348 push dword ptr [ebx+0x48]
kd> dd 82d53b30 + 22c
82d53d5c 00000000 03cc8000 00000000 00000000
82d53d6c 00001000 0000d9f0 00000000 1be7b000
82d53d7c 00000000 00000000 00001000 0000e9f0
82d53d8c 00000000 0fc6e000 00000000 00000000
82d53d9c 00000610 0000f9f0 00000000 00000000
82d53dac 00000000 00000000 00000000 00000000
82d53dbc 00000000 72743031 00000000 82d53b8c
82d53dcc 00000000 00001000 82d8bf70 00000000
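As a sanity check that both dumps tell the same story, the numbers can be cross-checked with a few lines of Python. All values are copied from the output above; the dictionary keys and the `check` helper are just names I picked for this sketch.

```python
CALL_LEN = 6        # FF /2 + ModRM + disp32, as in both listings
SLOT_OFFSET = 0x22c  # offset of the function pointer used by the call

# Values copied from the register dumps, disassembly and dd output.
DUMPS = {
    "t42_1.dmp": {
        "base": 0x850d68e0,      # edi in the context record
        "call_at": 0xf79066f9,   # address of the indirect call
        "return_to": 0xf79066ff, # LAST_CONTROL_TRANSFER "from" address
        "slot": 0x850d6b0c,      # first address printed by dd
        "slot_value": 0x49b70000,
        "fault_ip": 0x49b70000,  # Arg2 of the bugcheck
    },
    "Mini092005-01.dmp": {
        "base": 0x82d53b30,      # ebx in the trap frame
        "call_at": 0xf877aadb,
        "return_to": 0xf877aae1,
        "slot": 0x82d53d5c,
        "slot_value": 0x00000000,
        "fault_ip": 0x00000000,
    },
}


def check(d):
    """True if the call site, pointer slot and faulting address line up."""
    return (d["base"] + SLOT_OFFSET == d["slot"]
            and d["call_at"] + CALL_LEN == d["return_to"]
            and d["slot_value"] == d["fault_ip"])


if __name__ == "__main__":
    for name, d in DUMPS.items():
        print(name, "consistent" if check(d) else "MISMATCH")
```

In both dumps the faulting address is exactly the value stored at offset 0x22c from the base register, so the crash comes from the indirect call itself and not from something later on the stack.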
Best regards,
Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]