my 1394 driver BSODs someone else's

Hi Folks

I’ve been doing pretty well with my 1394 driver until I start sharing the
bus. As soon as I start pumping asynch data through mine it BSODs in another
party’s 1394 driver (fireface.sys). My next step is to single step my code
to see if I can identify the point it upsets the other one. In the meantime
I’d really appreciate any tips on any more info I can get out of the
crashdump (below). Incidentally my hardware also forces a bus reset after
it’s received its code update so this is another point I’ll be watching for.
Any tips / strategies appreciated…

Thanks, Mike.

Microsoft (R) Windows Debugger Version 6.6.0007.5
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\crashdumps\bootlm-ff-2.DMP]
Kernel Complete Dump File: Full address space is available

Symbol search path is:
srv*c:\localsymbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
C:\Development\lm1394\driver\lm1394\objchk_wxp_x86\i386
Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 2600.xpsp_sp2_rtm.040803-2158
Kernel base = 0x804d7000 PsLoadedModuleList = 0x805644a0
Debug session time: Sat Sep 2 15:23:56.812 2006 (GMT+1)
System Uptime: 0 days 0:25:49.530
Loading Kernel Symbols

Loading User Symbols

Loading unloaded module list

*******************************************************************************
*
*
* Bugcheck Analysis
*
*
*
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 8E, {c0000005, 804ebfeb, f6ee9644, 0}

*** ERROR: Module load completed but symbols could not be loaded for fireface.sys
*** WARNING: Unable to verify checksum for Nuendo2.exe
*** ERROR: Module load completed but symbols could not be loaded for Nuendo2.exe
Probably caused by : fireface.sys ( fireface+5aee )

Followup: MachineOwner

0: kd> !analyze -v
*******************************************************************************
*
*
* Bugcheck Analysis
*
*
*
*******************************************************************************

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but …
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 804ebfeb, The address that the exception occurred at
Arg3: f6ee9644, Trap Frame
Arg4: 00000000

Debugging Details:

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at “0x%08lx”
referenced memory at “0x%08lx”. The memory could not be “%s”.

FAULTING_IP:
nt!IopFreeIrp+a
804ebfeb 66833e06 cmp word ptr [esi],6

TRAP_FRAME: f6ee9644 – (.trap fffffffff6ee9644)
ErrCode = 00000000
eax=00000000 ebx=83a030e0 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=804ebfeb esp=f6ee96b8 ebp=f6ee96c0 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
nt!IopFreeIrp+0xa:
804ebfeb 66833e06 cmp word ptr [esi],6 ds:0023:00000000=???
Resetting default scope

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0x8E

PROCESS_NAME: Nuendo2.exe

LAST_CONTROL_TRANSFER: from 80522839 to 80537832

STACK_TEXT:
f6ee920c 80522839 0000008e c0000005 804ebfeb nt!KeBugCheckEx+0x1b
f6ee95d4 804de998 f6ee95f0 00000000 f6ee9644 nt!KiDispatchException+0x3b1
f6ee963c 804de944 f6ee96c0 804ebfeb badb0d00 nt!CommonDispatchException+0x4d
f6ee965c 804e1a57 00000000 00000000 00000000 nt!Kei386EoiHelper+0x18a
f6ee96c0 f6fc7aee 00000000 f6ee96d0 00000000 nt!KiInsertTimerTable+0x1b
WARNING: Stack unwind information not available. Following frames may be wrong.
f6ee96d8 f6fc9086 00000014 00000000 00000000 fireface+0x5aee
f6ee970c f6fc4267 83a030e0 00000000 00000001 fireface+0x7086
f6ee9748 804e19ee 83a03028 82bf9320 82bf9320 fireface+0x2267
f6ee9758 8057e648 82d82268 00000000 00000000 nt!IopfCallDriver+0x31
f6ee9790 8056e78f 00d82280 00000000 82d82268 nt!IopDeleteFile+0x132
f6ee97ac 804e1f77 82d82280 00000000 00000558 nt!ObpRemoveObjectRoutine+0xdf
f6ee97c4 80570dde 000000aa 00000558 e2c45ab0 nt!ObfDereferenceObject+0x4c
f6ee97dc 8058d3f4 e196a418 82d82280 00000558 nt!ObpCloseHandleTableEntry+0x155
f6ee97fc 8058cc2e e2c45ab0 00000558 f6ee983c nt!ObpCloseHandleProcedure+0x1f
f6ee981c 8058d3a0 e196a418 8058d3d5 f6ee983c nt!ExSweepHandleTable+0x3b
f6ee9848 8058cb23 82e712b8 82cbb228 00000001 nt!ObKillProcess+0x5c
f6ee98d0 8058d31f 00000001 f6ee992c 804e7c8d nt!PspExitThread+0x58a
f6ee98dc 804e7c8d 82cbb228 f6ee9928 f6ee991c nt!PsExitSpecialApc+0x22
f6ee992c 804ddf7a 00000001 00000000 f6ee9944 nt!KiDeliverApc+0x1af
f6ee992c 7c90ead0 00000001 00000000 f6ee9944 nt!KiServiceExit+0x59
0012f264 804e53b9 0115c0e8 00000000 00000000 ntdll!KiUserCallbackDispatcher
f6ee9c00 80570593 f6ee9c78 f6ee9c7c 000025ff nt!KiCallUserMode+0x4
f6ee9c5c bf861e3b 0000004d 00000000 00000000 nt!KeUserModeCallback+0x87
f6ee9c80 bf802d70 00000200 e46881d0 00000000 win32k!ClientDeliverUserApc+0x20
f6ee9ca8 bf801aa8 000025ff 00000000 00000001 win32k!xxxSleepThread+0x1e4
f6ee9cec bf80f106 f6ee9d18 000025ff 00000000 win32k!xxxRealInternalGetMessage+0x418
f6ee9d4c 804ddf0f 0115c0e8 00000000 00000000 win32k!NtUserGetMessage+0x27
f6ee9d4c 7c90eb94 0115c0e8 00000000 00000000 nt!KiFastCallEntry+0xfc
0012f23c 77d4919b 77d6ea85 0115c0e8 00000000 ntdll!KiFastSystemCallRet
0012f264 00a8d3a3 0115c0e8 00000000 00000000 USER32!NtUserGetMessage+0xc
0012f2e4 00a8d830 00000003 0235d938 023a3ed8 Nuendo2+0x68d3a3
0012f2fc 00af2acc 023a3ed8 00ab1cea 023962e0 Nuendo2+0x68d830
0012f324 0090627c 0012f754 00000000 00000000 Nuendo2+0x6f2acc
0012f918 00a8b779 472c4400 0004023e 00000001 Nuendo2+0x50627c
0012f950 00a8bf22 0012fd40 00a8b850 00000000 Nuendo2+0x68b779
0012fcd8 77d48709 0004023e 00000113 00000001 Nuendo2+0x68bf22
0012fd04 77d487eb 00a8b850 0004023e 00000113 USER32!InternalCallWinProc+0x28
0012fd6c 77d489a5 00000000 00a8b850 0004023e USER32!UserCallWinProcCheckWow+0x150
0012fdcc 77d4bccc 0115c0e8 00000001 0012fe50 USER32!DispatchMessageWorker+0x306
0012fddc 00a8d59d 0115c0e8 02174000 02174110 USER32!DispatchMessageA+0xf
0012fe50 00a8d9b9 000fe9bb 02174000 7ffdf000 Nuendo2+0x68d59d
0012fe64 00ae1d1c 02174000 7c80b529 02174000 Nuendo2+0x68d9b9
0012fe90 00401019 0012ffc0 00c61467 00400000 Nuendo2+0x6e1d1c
0012fe98 00c61467 00400000 00000000 00152342 Nuendo2+0x1019
0012ffc0 7c816d4f 0119df4c 00000018 7ffdf000 Nuendo2+0x861467
0012fff0 00000000 0139a5b8 00000000 78746341 kernel32!BaseProcessStart+0x23

STACK_COMMAND: kb

FOLLOWUP_IP:
fireface+5aee
f6fc7aee 897e14 mov dword ptr [esi+14h],edi

SYMBOL_STACK_INDEX: 5

SYMBOL_NAME: fireface+5aee

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: fireface

IMAGE_NAME: fireface.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 44d0d9a0

FAILURE_BUCKET_ID: 0x8E_fireface+5aee

BUCKET_ID: 0x8E_fireface+5aee

Followup: MachineOwner

Well, this is a basic null pointer dereference (notice ESI=0, which is
what causes the page fault, and this then throws the exception). It
shouldn’t be too difficult to start digging out the bits: start with the
trap frame, feed it into “.trap”, and start looking back to find out
where ESI originated. That generally gives a good sense of the
origin of the bug.
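
To make the faulting instruction a bit more concrete: `cmp word ptr [esi],6` at nt!IopFreeIrp+0xa looks like the check of the IRP's leading 16-bit Type field against IO_TYPE_IRP (6), so the dump is consistent with a NULL IRP pointer reaching the IRP free routine from somewhere under fireface's close handling. That is only a reading of the dump, not something confirmed from fireface's source. The user-mode C sketch below models that situation and one defensive pattern (check for NULL, then clear the stored pointer after freeing). Every `model_*` name is hypothetical, and the struct is a stand-in, not the real IRP layout.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's IO_TYPE_IRP value; the model only needs a tag. */
#define MODEL_IO_TYPE_IRP 6

/* Hypothetical model of an IRP: only the leading type/size fields. */
typedef struct model_irp {
    short type;            /* tag checked before freeing */
    unsigned short size;   /* size of this model structure */
} model_irp;

/* Allocate a model IRP and stamp its type tag. Returns NULL on failure. */
model_irp *model_alloc_irp(void)
{
    model_irp *irp = malloc(sizeof *irp);
    if (irp != NULL) {
        irp->type = MODEL_IO_TYPE_IRP;
        irp->size = (unsigned short)sizeof *irp;
    }
    return irp;
}

/*
 * Free the IRP held in *slot, if there is one.
 * Returns 1 if an IRP was freed, 0 if there was nothing to free,
 * and -1 if the object does not carry the expected type tag.
 * The slot is cleared after a successful free, so a cleanup path
 * that runs twice does not touch the pointer again.
 */
int model_free_irp_checked(model_irp **slot)
{
    if (slot == NULL || *slot == NULL) {
        return 0;   /* the unchecked version would read address 0 here */
    }
    if ((*slot)->type != MODEL_IO_TYPE_IRP) {
        return -1;  /* not an IRP: refuse to free it */
    }
    free(*slot);
    *slot = NULL;
    return 1;
}
```

In this model a second call on the same slot returns 0 instead of dereferencing a null pointer. Whether fireface's cleanup path actually runs twice, or the pointer is zeroed by something else (a bus reset, say), is exactly what walking back from the trap frame should show.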

Or send it to me and I’ll analyze it and use it for debug class. :wink:

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com



Look for buffer overruns. I’d bet a donut you are overrunning an allocated buffer or stack space. Stepping through ain’t a bad idea, and many times it’s the best way to catch this type of critter.

Gary


This might seem obvious to most of the people here but make sure you are running Verifier on your driver. If Gary is right, the Special Pool option is very effective at catching buffer overruns.
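
For anyone unsure what kind of bug is being hunted here, the following user-mode C sketch shows a buffer overrun being detected with a guard region placed after the allocation. This is not how Special Pool works internally: Special Pool puts the allocation against an inaccessible guard page so the overrun faults at the offending instruction, whereas this sketch only notices the damage when it is checked afterwards. All `guarded_*` names and the guard constants are made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8      /* bytes of guard pattern after the buffer */
#define GUARD_BYTE 0xA5   /* arbitrary fill value for the guard */

/* Hypothetical wrapper: a buffer followed by a guard pattern. */
typedef struct guarded_buf {
    unsigned char *data;  /* caller-visible bytes: data[0 .. len-1] */
    size_t len;           /* usable length, excluding the guard */
} guarded_buf;

/* Allocate len usable bytes plus a trailing guard. Returns 1 on success. */
int guarded_alloc(guarded_buf *b, size_t len)
{
    b->data = malloc(len + GUARD_LEN);
    if (b->data == NULL) {
        b->len = 0;
        return 0;
    }
    b->len = len;
    memset(b->data, 0, len);
    memset(b->data + len, GUARD_BYTE, GUARD_LEN);
    return 1;
}

/* Returns 1 if the guard is untouched, 0 if something wrote past the end. */
int guarded_is_intact(const guarded_buf *b)
{
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (b->data[b->len + i] != GUARD_BYTE) {
            return 0;
        }
    }
    return 1;
}

/* Release the buffer and clear the wrapper. */
void guarded_free(guarded_buf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
}
```

Writing exactly `len` bytes leaves the guard intact; writing one byte more makes `guarded_is_intact` return 0. With Verifier's Special Pool enabled on the real driver, the equivalent overrun should bugcheck at the write itself, which is much easier to trace than a crash in another driver later on.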