hal functions and crash dump in atheros wireless driver

hi

I’m trying to analyze a crash dump crash in the Atheros wireless driver.
The call stack (I pasted in the analyze -v output below, that includes a
stack) shows that a function call hal!KfLowerIrql is calling
hal!pCheckForSoftwareInterrupt, which appears to be calling into the atheros
driver.

Can somebody summarize what CheckForSoftware Interrupt does ?

I was suspicious that a call into the hal could end up into the atheros
driver – I thought that perhaps the debugger couldn’t unwind the stack
because the code isn’t following call conventions when pushing on the stack
— but when I try to unwind the stack myself by dumping and examining
stack memory and then looking for ebp and return addresses (and the using u

- 5 to look for a call instruction) I can't find any other valid
return address nearby. Perhaps I need to try again or examine the stack
farther down (in higher address locations).

It would help if I could understand generally what these Hal functions do. I
googled and couldn't find any documentation.

*******************************************************************************
*
*
* Bugcheck Analysis
*
*
*
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck D1, {0, 2, 0, 9b950bbc}

***ERROR: Module load completed but symbols could not be loaded for
ar5523.sys
*** ERROR: Symbol file could not be found. Defaulted to export symbols for
mfehidk.sys -
Probably caused by : ar5523.sys ( ar5523+44bbc )

Followup: MachineOwner
---------

kd> !analyze -v
*******************************************************************************
*
*
* Bugcheck Analysis
*
*
*
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at
an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 00000000, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 9b950bbc, address which referenced memory

Debugging Details:
------------------

READ_ADDRESS: 00000000

CURRENT_IRQL: 2

FAULTING_IP:
ar5523+44bbc
9b950bbc 8b0f mov ecx,dword ptr [edi] <----- edi is 0 (this
caused the crash)

DEFAULT_BUCKET_ID: VISTA_RC

BUGCHECK_STR: 0xD1

PROCESS_NAME: System

TRAP_FRAME: 9be3d484 -- (.trap ffffffff9be3d484)
ErrCode = 00000000
eax=9ca45000 ebx=00000018 ecx=00000000 edx=9ca45000 esi=9c94c6a0
edi=00000000
eip=9b950bbc esp=9be3d4f8 ebp=9ca45000 iopl=0 nv up ei pl zr na pe
nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
efl=00010246
ar5523+0x44bbc:
9b950bbc 8b0f mov ecx,dword ptr [edi]
ds:0023:00000000=????????
Resetting default scope

LAST_CONTROL_TRANSFER: from 9b950bbc to 81c494d4

STACK_TEXT:
9be3d484 9b950bbc badb0d00 9ca45000 00000001 nt!KiTrap0E+0x2ac
WARNING: Stack unwind information not available. Following frames may be
wrong.
9be3d4fc 9b942340 9ca45000 ffffffff ffffffff ar5523+0x44bbc
9be3d518 9b94336a 9c94c688 9ca4fa34 9ca45000 ar5523+0x36340
9be3d540 81f9b518 c02105cc be8f0000 9be3d570 ar5523+0x3736a
9be3d62c 81c6a820 9be3d790 9be3d788 00000001 hal!KfLowerIrql+0x64
9be3d660 81c6834e 9be3d700 83b6bce3 83b6bb08 nt!KiExitDispatcher+0x1a2
9be3d680 a21507ee 9be3d700 00000000 00000000 nt!KeSetEvent+0xcc
9be3d694 81c34b79 83ba55a0 83b6bb08 9be3d788
mfehidk!DEVICEDISPATCH::DispatchPassThrough+0xce0
9be3d6c8 83f6c3ef 9be3d6e8 83f6c43c 83b6bb08 nt!IopfCompleteRequest+0x12d
9be3d6d0 83f6c43c 83b6bb08 c0000034 00000000 fltmgr!FltpCompleteRequest+0x2d
9be3d6e8 83f6cb39 9c981d78 83b6bb08 00000000
fltmgr!FltpSynchronizeIoCleanup+0x44
9be3d710 83f7ea91 9be3d730 c0000034 00000000
fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x307
9be3d75c 81c67cc9 831968d0 83196460 83b6bd28 fltmgr!FltpCreate+0x2a1
9be3d79c 81f9b518 000001ff 81d28100 9be3d810 nt!IofCallDriver+0x63
83ba56a0 62766564 00000001 83ba55a0 83ba5658 hal!KfLowerIrql+0x64
83ba56a4 00000000 83ba55a0 83ba5658 a21577d0 0x62766564

STACK_COMMAND: kb

FOLLOWUP_IP:
ar5523+44bbc
9b950bbc 8b0f mov ecx,dword ptr [edi]

SYMBOL_STACK_INDEX: 1

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: ar5523

IMAGE_NAME: ar5523.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 42e85aea

SYMBOL_NAME: ar5523+44bbc

FAILURE_BUCKET_ID: 0xD1_ar5523+44bbc

BUCKET_ID: 0xD1_ar5523+44bbc

Followup: MachineOwner
---------

thanks

----- Original Message -----
From: “S. Drasnin”
To: “Windows System Software Devs Interest List”
Sent: Wednesday, January 31, 2007 11:51 AM
Subject: [ntdev] hal functions and crash dump in atheros wireless driver

> hi
>
> I’m trying to analyze a crash dump crash in the Atheros wireless driver.
> The call stack (I pasted in the analyze -v output below, that includes a
> stack) shows that a function call hal!KfLowerIrql is calling
> hal!pCheckForSoftwareInterrupt, which appears to be calling into the
> atheros driver.
>

Uhm, I might be completely wrong, but my suspect is that when the HAL needs
to perform a KfLowerIrql, before lowering the IRQL it needs to check that
there are no pending DPCs scheduled. A DPC is generally scheduled by an ISR
through a softint, so that when the ISR ends, the processor jumps back
executing the software ISR, executing the tasks in the DPC queue.

In your specific case, what happens is the other way around. Some code
running at PASSIVE_LEVEL raised the IRQL to DISPATCH, did something then it
called XXLowerIrql, thus causing the check for “pending software interrupts”
that needs to be serviced (e.g. DPCs) before the IRQL can actually go down
to PASSIVE_LEVEL.

If the stuff above is right, the code crashing in the atheros driver is
probably either a DPC routine or a Timer DPC routine.

Hope it helps
GV

> Can somebody summarize what CheckForSoftware Interrupt does ?
>
> I was suspicious that a call into the hal could end up into the atheros
> driver – I thought that perhaps the debugger couldn’t unwind the stack
> because the code isn’t following call conventions when pushing on the
> stack — but when I try to unwind the stack myself by dumping and
> examining stack memory and then looking for ebp and return addresses (and
> the using u - 5 to look for a call instruction) I can’t find
> any other valid return address nearby. Perhaps I need to try again or
> examine the stack farther down (in higher address locations).
>
> It would help if I could understand generally what these Hal functions do.
> I googled and couldn’t find any documentation.
>
> ***
> *
>
> * Bugcheck Analysis
>
> *
>
>

>
> Use !analyze -v to get detailed debugging information.
>
> BugCheck D1, {0, 2, 0, 9b950bbc}
>
> ERROR: Module load completed but symbols could not be loaded for
> ar5523.sys
>
ERROR: Symbol file could not be found. Defaulted to export symbols
> for mfehidk.sys -
> Probably caused by : ar5523.sys ( ar5523+44bbc )
>
> Followup: MachineOwner
> ---------
>
> kd> !analyze -v
> ***
> *
>
> * Bugcheck Analysis
>
> *
>
>

>
> DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
> An attempt was made to access a pageable (or completely invalid) address
> at an
> interrupt request level (IRQL) that is too high. This is usually
> caused by drivers using improper addresses.
> If kernel debugger is available get stack backtrace.
> Arguments:
> Arg1: 00000000, memory referenced
> Arg2: 00000002, IRQL
> Arg3: 00000000, value 0 = read operation, 1 = write operation
> Arg4: 9b950bbc, address which referenced memory
>
> Debugging Details:
> ------------------
>
>
> READ_ADDRESS: 00000000
>
> CURRENT_IRQL: 2
>
> FAULTING_IP:
> ar5523+44bbc
> 9b950bbc 8b0f mov ecx,dword ptr [edi] <----- edi is 0 (this
> caused the crash)
>
> DEFAULT_BUCKET_ID: VISTA_RC
>
> BUGCHECK_STR: 0xD1
>
> PROCESS_NAME: System
>
> TRAP_FRAME: 9be3d484 – (.trap ffffffff9be3d484)
> ErrCode = 00000000
> eax=9ca45000 ebx=00000018 ecx=00000000 edx=9ca45000 esi=9c94c6a0
> edi=00000000
> eip=9b950bbc esp=9be3d4f8 ebp=9ca45000 iopl=0 nv up ei pl zr na pe
> nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> efl=00010246
> ar5523+0x44bbc:
> 9b950bbc 8b0f mov ecx,dword ptr [edi]
> ds:0023:00000000=???
> Resetting default scope
>
> LAST_CONTROL_TRANSFER: from 9b950bbc to 81c494d4
>
> STACK_TEXT:
> 9be3d484 9b950bbc badb0d00 9ca45000 00000001 nt!KiTrap0E+0x2ac
> WARNING: Stack unwind information not available. Following frames may be
> wrong.
> 9be3d4fc 9b942340 9ca45000 ffffffff ffffffff ar5523+0x44bbc
> 9be3d518 9b94336a 9c94c688 9ca4fa34 9ca45000 ar5523+0x36340
> 9be3d540 81f9b518 c02105cc be8f0000 9be3d570 ar5523+0x3736a
> 9be3d62c 81c6a820 9be3d790 9be3d788 00000001 hal!KfLowerIrql+0x64
> 9be3d660 81c6834e 9be3d700 83b6bce3 83b6bb08 nt!KiExitDispatcher+0x1a2
> 9be3d680 a21507ee 9be3d700 00000000 00000000 nt!KeSetEvent+0xcc
> 9be3d694 81c34b79 83ba55a0 83b6bb08 9be3d788
> mfehidk!DEVICEDISPATCH::DispatchPassThrough+0xce0
> 9be3d6c8 83f6c3ef 9be3d6e8 83f6c43c 83b6bb08 nt!IopfCompleteRequest+0x12d
> 9be3d6d0 83f6c43c 83b6bb08 c0000034 00000000
> fltmgr!FltpCompleteRequest+0x2d
> 9be3d6e8 83f6cb39 9c981d78 83b6bb08 00000000
> fltmgr!FltpSynchronizeIoCleanup+0x44
> 9be3d710 83f7ea91 9be3d730 c0000034 00000000
> fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x307
> 9be3d75c 81c67cc9 831968d0 83196460 83b6bd28 fltmgr!FltpCreate+0x2a1
> 9be3d79c 81f9b518 000001ff 81d28100 9be3d810 nt!IofCallDriver+0x63
> 83ba56a0 62766564 00000001 83ba55a0 83ba5658 hal!KfLowerIrql+0x64
> 83ba56a4 00000000 83ba55a0 83ba5658 a21577d0 0x62766564
>
>
> STACK_COMMAND: kb
>
> FOLLOWUP_IP:
> ar5523+44bbc
> 9b950bbc 8b0f mov ecx,dword ptr [edi]
>
> SYMBOL_STACK_INDEX: 1
>
> FOLLOWUP_NAME: MachineOwner
>
> MODULE_NAME: ar5523
>
> IMAGE_NAME: ar5523.sys
>
> DEBUG_FLR_IMAGE_TIMESTAMP: 42e85aea
>
> SYMBOL_NAME: ar5523+44bbc
>
> FAILURE_BUCKET_ID: 0xD1_ar5523+44bbc
>
> BUCKET_ID: 0xD1_ar5523+44bbc
>
> Followup: MachineOwner
> ---------
>
>
> thanks
>
>
>
> —
> Questions? First check the Kernel Driver FAQ at
> http://www.osronline.com/article.cfm?id=256
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer

>Can somebody summarize what CheckForSoftware Interrupt does ?

If you look at the faulting IP, it resides in the ar5523 driver. Also,
whatever function is currently executing within that driver appears to have
dereferenced a NULL pointer which may have nothing to do with the
HAL(depending on whether the ptr was a parameter to the function or not).

Have you tried examining that function yet(via the disassembly window or
u(unassemble) command in windbg)?

From: “S. Drasnin”
>Reply-To: “Windows System Software Devs Interest List”
>
>To: “Windows System Software Devs Interest List”
>Subject: [ntdev] hal functions and crash dump in atheros wireless driver
>Date: Wed, 31 Jan 2007 11:51:03 -0800
>
>hi
>
>I’m trying to analyze a crash dump crash in the Atheros wireless driver.
>The call stack (I pasted in the analyze -v output below, that includes a
>stack) shows that a function call hal!KfLowerIrql is calling
>hal!pCheckForSoftwareInterrupt, which appears to be calling into the
>atheros driver.
>
>Can somebody summarize what CheckForSoftware Interrupt does ?
>
>I was suspicious that a call into the hal could end up into the atheros
>driver – I thought that perhaps the debugger couldn’t unwind the stack
>because the code isn’t following call conventions when pushing on the stack
>— but when I try to unwind the stack myself by dumping and examining
>stack memory and then looking for ebp and return addresses (and the using u
> - 5 to look for a call instruction) I can’t find any other valid
>return address nearby. Perhaps I need to try again or examine the stack
>farther down (in higher address locations).
>
>It would help if I could understand generally what these Hal functions do.
>I googled and couldn’t find any documentation.
>
>
>

>
>
Bugcheck Analysis
>
>

>
>

>
>Use !analyze -v to get detailed debugging information.
>
>BugCheck D1, {0, 2, 0, 9b950bbc}
>
> ERROR: Module load completed but symbols could not be loaded for
>ar5523.sys
>
ERROR: Symbol file could not be found. Defaulted to export symbols for
>mfehidk.sys -
>Probably caused by : ar5523.sys ( ar5523+44bbc )
>
>Followup: MachineOwner
>---------
>
>kd> !analyze -v
>
>

>
>
Bugcheck Analysis
>
>

>
>

>
>DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
>An attempt was made to access a pageable (or completely invalid) address at
>an
>interrupt request level (IRQL) that is too high. This is usually
>caused by drivers using improper addresses.
>If kernel debugger is available get stack backtrace.
>Arguments:
>Arg1: 00000000, memory referenced
>Arg2: 00000002, IRQL
>Arg3: 00000000, value 0 = read operation, 1 = write operation
>Arg4: 9b950bbc, address which referenced memory
>
>Debugging Details:
>------------------
>
>
>READ_ADDRESS: 00000000
>
>CURRENT_IRQL: 2
>
>FAULTING_IP:
>ar5523+44bbc
>9b950bbc 8b0f mov ecx,dword ptr [edi] <----- edi is 0 (this
>caused the crash)
>
>DEFAULT_BUCKET_ID: VISTA_RC
>
>BUGCHECK_STR: 0xD1
>
>PROCESS_NAME: System
>
>TRAP_FRAME: 9be3d484 – (.trap ffffffff9be3d484)
>ErrCode = 00000000
>eax=9ca45000 ebx=00000018 ecx=00000000 edx=9ca45000 esi=9c94c6a0
>edi=00000000
>eip=9b950bbc esp=9be3d4f8 ebp=9ca45000 iopl=0 nv up ei pl zr na pe
>nc
>cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
>efl=00010246
>ar5523+0x44bbc:
>9b950bbc 8b0f mov ecx,dword ptr [edi]
>ds:0023:00000000=???
>Resetting default scope
>
>LAST_CONTROL_TRANSFER: from 9b950bbc to 81c494d4
>
>STACK_TEXT:
>9be3d484 9b950bbc badb0d00 9ca45000 00000001 nt!KiTrap0E+0x2ac
>WARNING: Stack unwind information not available. Following frames may be
>wrong.
>9be3d4fc 9b942340 9ca45000 ffffffff ffffffff ar5523+0x44bbc
>9be3d518 9b94336a 9c94c688 9ca4fa34 9ca45000 ar5523+0x36340
>9be3d540 81f9b518 c02105cc be8f0000 9be3d570 ar5523+0x3736a
>9be3d62c 81c6a820 9be3d790 9be3d788 00000001 hal!KfLowerIrql+0x64
>9be3d660 81c6834e 9be3d700 83b6bce3 83b6bb08 nt!KiExitDispatcher+0x1a2
>9be3d680 a21507ee 9be3d700 00000000 00000000 nt!KeSetEvent+0xcc
>9be3d694 81c34b79 83ba55a0 83b6bb08 9be3d788
>mfehidk!DEVICEDISPATCH::DispatchPassThrough+0xce0
>9be3d6c8 83f6c3ef 9be3d6e8 83f6c43c 83b6bb08 nt!IopfCompleteRequest+0x12d
>9be3d6d0 83f6c43c 83b6bb08 c0000034 00000000
>fltmgr!FltpCompleteRequest+0x2d
>9be3d6e8 83f6cb39 9c981d78 83b6bb08 00000000
>fltmgr!FltpSynchronizeIoCleanup+0x44
>9be3d710 83f7ea91 9be3d730 c0000034 00000000
>fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x307
>9be3d75c 81c67cc9 831968d0 83196460 83b6bd28 fltmgr!FltpCreate+0x2a1
>9be3d79c 81f9b518 000001ff 81d28100 9be3d810 nt!IofCallDriver+0x63
>83ba56a0 62766564 00000001 83ba55a0 83ba5658 hal!KfLowerIrql+0x64
>83ba56a4 00000000 83ba55a0 83ba5658 a21577d0 0x62766564
>
>
>STACK_COMMAND: kb
>
>FOLLOWUP_IP:
>ar5523+44bbc
>9b950bbc 8b0f mov ecx,dword ptr [edi]
>
>SYMBOL_STACK_INDEX: 1
>
>FOLLOWUP_NAME: MachineOwner
>
>MODULE_NAME: ar5523
>
>IMAGE_NAME: ar5523.sys
>
>DEBUG_FLR_IMAGE_TIMESTAMP: 42e85aea
>
>SYMBOL_NAME: ar5523+44bbc
>
>FAILURE_BUCKET_ID: 0xD1_ar5523+44bbc
>
>BUCKET_ID: 0xD1_ar5523+44bbc
>
>Followup: MachineOwner
>---------
>
>
>thanks
>
>
>
>—
>Questions? First check the Kernel Driver FAQ at
>http://www.osronline.com/article.cfm?id=256
>
>To unsubscribe, visit the List Server section of OSR Online at
>http://www.osronline.com/page.cfm?name=ListServer

_________________________________________________________________
From predictions to trailers, check out the MSN Entertainment Guide to the
Academy Awards®
http://movies.msn.com/movies/oscars2007/?icid=ncoscartagline1

Looking at the preceding code is a great idea. Try the following:

“.trap f9be3d484;ub @eip

That will let you see the code leading up to the exception. From the
stack trace it’s deep inside this driver and it is likely associated
with a DPC function for the given driver that must now be called because
the IRQL is being lowered.

In general, invalid address references like this come from some
containing data structure. Using “!pool address derived>” or “ln derived” yields further information about what went wrong.

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

Thanks for all the helpful replies. I have a couple of questions below.

I neglected to mention that I did walk the frames and disassemble the code.
For frame 0 (I pasted in the stack below again if you want to see it) , edi
is getting loaded by fetching a pointer off the stack (argument 1, at
location esp+4) adding a offset to it (0xA19). that address is deferenced
and stuffed into edi - the value is 0 and when it gets used as pointer, the
crash occurs. I partially walked the frames backwards to see how that the
argument 1 got that way. However, I stopped doing this when I got to frame
4 because in the disassembly of Hal!KfLowerIrql showed that a call to
hal!HalCheckForSoftwareInterrupts was being made but that call isn’t on the
call stack.

I don’t see this call on the stack, so I assumed that the debugger might be
having trouble unwinding the stack and thus started to look at the stack by
hand looking for valid ebp/return address pairs.

hal!KfLowerIrql+0x54: <— this is for frame 4
81f9b508 8acb mov cl,bl
81f9b50a e8a9fdffff call hal!HalpLowerIrqlHardwareInterrupts
(81f9b2b8)
81f9b50f 33d2 xor edx,edx
81f9b511 8acb mov cl,bl
81f9b513 e8f8feffff call hal!HalpCheckForSoftwareInterrupt
(81f9b410) <---- this call was taken, but how does it get into the atheros
driver ??
81f9b518 5b pop ebx <— this is at the return address
specified in frame 3
81f9b519 c9 leave
81f9b51a c3 ret

My questions are:

  1. How come this call isn’t shown when I look at stack above the line for
    hal!KfLowerIrql before the atheros entries? (See the stack below with where
    I expected the check interrupts call to be).

All I can think of is there really isn’t a call going on, but I don’t know
enough about software interrupt processing and DPC handling to know. Oney’s
book doesn’t cover it – I will check the Windows Internals book. Or perhaps
one of you has a better suggestion to learn about this in better detail.

  1. How does the call to HalCheckForSoftwareinterrupts end up getting into
    the atheros driver?
    Right now, I don’t think it will help if I examine the frames’ disassembly
    code to see how I ended up with a NULL pointer deference in frame 0 if I
    don’t understand what’s happening between the time of the call to
    HalCheckForSoftwareInterrupts in frame 4 disassembly and the atheros driver
    call in frame 3 disassembly.

STACK_TEXT:
9be3d484 9b950bbc badb0d00 9ca45000 00000001 nt!KiTrap0E+0x2ac <– frame 0
WARNING: Stack unwind information not available. Following frames may be
wrong.
9be3d4fc 9b942340 9ca45000 ffffffff ffffffff ar5523+0x44bbc <- frame 1
9be3d518 9b94336a 9c94c688 9ca4fa34 9ca45000 ar5523+0x36340 <– frame 2
9be3d540 81f9b518 c02105cc be8f0000 9be3d570 ar5523+0x3736a <— frame 3
<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
< <<<<<<hal!HalpCheckForSoftwareInterrupts>>>>>>>>
<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
9be3d62c 81c6a820 9be3d790 9be3d788 00000001 hal!KfLowerIrql+0x64 <- frame 4

thanks all!

>
>Looking at the preceding code is a great idea. Try the following:
>
>“.trap f9be3d484;ub @eip
>
>That will let you see the code leading up to the exception. From the
>stack trace it’s deep inside this driver and it is likely associated
>with a DPC function for the given driver that must now be called because
>the IRQL is being lowered.
>
>In general, invalid address references like this come from some
>containing data structure. Using “!pool >address derived>” or “ln >derived” yields further information about what went wrong.
>
>Tony
>
>Tony Mason
>Consulting Partner
>OSR Open Systems Resources, Inc.
>http://www.osr.com
>
>

Out of curiosity, are you trying to just debug a crash on your machine, or
rather you have a driver in the stack that could cause such blue screen?

Regarding question 2), as I explained in my previous post it’s probably due
to a DPC routine (or DPC timer routine) being executed before the IRQL gets
lowered (to PASSIVE_LEVEL, most probably).

The Windows Internals book by Solomon and Russinovich is definitely *the*
source for this type of information.

Have a nice day
GV

“S. Drasnin” wrote in message news:xxxxx@ntdev…
> Thanks for all the helpful replies. I have a couple of questions below.
>
> I neglected to mention that I did walk the frames and disassemble the
> code.
> For frame 0 (I pasted in the stack below again if you want to see it) ,
> edi is getting loaded by fetching a pointer off the stack (argument 1, at
> location esp+4) adding a offset to it (0xA19). that address is deferenced
> and stuffed into edi - the value is 0 and when it gets used as pointer,
> the crash occurs. I partially walked the frames backwards to see how that
> the argument 1 got that way. However, I stopped doing this when I got to
> frame 4 because in the disassembly of Hal!KfLowerIrql showed that a call
> to hal!HalCheckForSoftwareInterrupts was being made but that call isn’t on
> the call stack.
>
> I don’t see this call on the stack, so I assumed that the debugger might
> be having trouble unwinding the stack and thus started to look at the
> stack by hand looking for valid ebp/return address pairs.
>
> hal!KfLowerIrql+0x54: <— this is for frame 4
> 81f9b508 8acb mov cl,bl
> 81f9b50a e8a9fdffff call hal!HalpLowerIrqlHardwareInterrupts
> (81f9b2b8)
> 81f9b50f 33d2 xor edx,edx
> 81f9b511 8acb mov cl,bl
> 81f9b513 e8f8feffff call hal!HalpCheckForSoftwareInterrupt
> (81f9b410) <---- this call was taken, but how does it get into the atheros
> driver ??
> 81f9b518 5b pop ebx <— this is at the return address
> specified in frame 3
> 81f9b519 c9 leave
> 81f9b51a c3 ret
>
> My questions are:
> 1) How come this call isn’t shown when I look at stack above the line for
> hal!KfLowerIrql before the atheros entries? (See the stack below with
> where I expected the check interrupts call to be).
>
> All I can think of is there really isn’t a call going on, but I don’t know
> enough about software interrupt processing and DPC handling to know.
> Oney’s book doesn’t cover it – I will check the Windows Internals book.
> Or perhaps one of you has a better suggestion to learn about this in
> better detail.
>
> 2) How does the call to HalCheckForSoftwareinterrupts end up getting into
> the atheros driver?
> Right now, I don’t think it will help if I examine the frames’ disassembly
> code to see how I ended up with a NULL pointer deference in frame 0 if I
> don’t understand what’s happening between the time of the call to
> HalCheckForSoftwareInterrupts in frame 4 disassembly and the atheros
> driver call in frame 3 disassembly.
>
>
> STACK_TEXT:
> 9be3d484 9b950bbc badb0d00 9ca45000 00000001 nt!KiTrap0E+0x2ac <– frame 0
> WARNING: Stack unwind information not available. Following frames may be
> wrong.
> 9be3d4fc 9b942340 9ca45000 ffffffff ffffffff ar5523+0x44bbc <- frame 1
> 9be3d518 9b94336a 9c94c688 9ca4fa34 9ca45000 ar5523+0x36340 <– frame 2
> 9be3d540 81f9b518 c02105cc be8f0000 9be3d570 ar5523+0x3736a <— frame 3
> <<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> < <<<<<<> hal!HalpCheckForSoftwareInterrupts>>>>>>>>
> <<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 9be3d62c 81c6a820 9be3d790 9be3d788 00000001 hal!KfLowerIrql+0x64 <- frame
> 4
>
> thanks all!
>
>
>>
>>Looking at the preceding code is a great idea. Try the following:
>>
>>“.trap f9be3d484;ub @eip
>>
>>That will let you see the code leading up to the exception. From the
>>stack trace it’s deep inside this driver and it is likely associated
>>with a DPC function for the given driver that must now be called because
>>the IRQL is being lowered.
>>
>>In general, invalid address references like this come from some
>>containing data structure. Using “!pool >>address derived>” or “ln >>derived” yields further information about what went wrong.
>>
>>Tony
>>
>>Tony Mason
>>Consulting Partner
>>OSR Open Systems Resources, Inc.
>>http://www.osr.com
>>
>>
>
>
>

S. Drasnin wrote:

I neglected to mention that I did walk the frames and disassemble the
code.
For frame 0 (I pasted in the stack below again if you want to see it)
, edi is getting loaded by fetching a pointer off the stack (argument
1, at location esp+4) adding a offset to it (0xA19). that address is
deferenced and stuffed into edi - the value is 0 and when it gets used
as pointer, the crash occurs. I partially walked the frames backwards
to see how that the argument 1 got that way. However, I stopped doing
this when I got to frame 4 because in the disassembly of
Hal!KfLowerIrql showed that a call to
hal!HalCheckForSoftwareInterrupts was being made but that call isn’t
on the call stack.

Do you really mean that it isn’t on the stack, or do you just mean that
it isn’t in the stack trace as shown by “kb”?

Remember that the “kb” command only works when functions use the
standard function prolog/epilog (push ebp / mov ebp, esp). Many of the
internal functions in the kernel do not do this, and so “kb” will not
see them.

  1. How does the call to HalCheckForSoftwareinterrupts end up getting
    into the atheros driver?
    Right now, I don’t think it will help if I examine the frames’
    disassembly code to see how I ended up with a NULL pointer deference
    in frame 0 if I don’t understand what’s happening between the time of
    the call to HalCheckForSoftwareInterrupts in frame 4 disassembly and
    the atheros driver call in frame 3 disassembly.

I disagree. For the most part, you can assume that the operating system
is handing your callbacks something reasonable. The disconnect is
something within your driver.

Is this a driver you are developing, or are you just chasing down a
random crash in your computer?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

>>1) How come this call isn’t shown when I look at stack above the line for

>hal!KfLowerIrql before >>the atheros entries? (See the stack below with
>where I expected the check interrupts call to >>be).

I think the debugging engine gets confused when walking a stack wherein two
conditions are present: (1)it cannot locate symbols for the next module in
the stack AND (2)one of the functions lower in the stack uses frame pointer
omission(FPO). Without symbols, the engine will try to walk the stack using
only the current EBP value. If it does not encounter any functions using
FPO, then the stack will be accurate; otherwise, functions will be missing
from the stack.

Have you tried examining the raw stack via the “dds esp L100” command? If
HalCheckForSoftwareInterrupts was called and made a subsequent call(s), you
should see its return address in the dds output. If so, it may lead you to
the missing pieces of the call stack.

Ron


Valentine’s Day – Shop for gifts that spell L-O-V-E at MSN Shopping
http://shopping.msn.com/content/shp/?ctId=8323,ptnrid=37,ptnrdata=24095&tcode=wlmtagline

> Remember that the “kb” command only works when functions use the

standard function prolog/epilog (push ebp / mov ebp, esp). Many of the
internal functions in the kernel do not do this, and so “kb” will not
see them.

If the symbols are OK - then “kb” will walk even this function using Frame
Pointer Omission symbol records.


Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

hi

Thanks to all the replies and for the suggestion. Will try once I have
access to a Windows PC.

In regard to all the questions, this was a crash dump is from a pc at my
previous job. It’s not a driver I was developing. I was analyzing it because
I wanted to practice my debugging skills as a learning exercise (not having
symbols makes much more difficult) to see if I could track down the problem.

thanks!

ok at stack above the

>>line for hal!KfLowerIrql before >>the atheros entries? (See the stack
>>below with where I expected the check interrupts call to >>be).

I think the debugging engine gets confused when walking a stack wherein two
conditions are present: (1)it cannot locate symbols for the next module in
the stack AND (2)one of the functions lower in the stack uses frame pointer
omission(FPO). Without symbols, the engine will try to walk the stack
using only the current EBP value. If it does not encounter any functions
using FPO, then the stack will be accurate; otherwise, functions will be
missing from the stack.

Have you tried examining the raw stack via the “dds esp L100” command? If
HalCheckForSoftwareInterrupts was called and made a subsequent call(s), you
should see its return address in the dds output. If so, it may lead you to
the missing pieces of the call stack.

Ron


Valentine’s Day – Shop for gifts that spell L-O-V-E at MSN Shopping
http://shopping.msn.com/content/shp/?ctId=8323,ptnrid=37,ptnrdata=24095&tcode=wlmtagline


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer