Hello all,
I’m analyzing a Bug Check 0xD1: DRIVER_IRQL_NOT_LESS_OR_EQUAL
(0x0061F904,0xFF,0x01,0x81DC7BF7) from a dump file (W2K SP2).
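For reference, the four bug check arguments can be labeled as follows. This is only a small sketch in Python; the function name and dictionary keys are made up for illustration, and the meanings follow the usual description of bug check 0xD1 (memory referenced, IRQL at the time, read/write, address that made the reference).

```python
# Illustrative helper (hypothetical name): label the four arguments of
# bug check 0xD1 DRIVER_IRQL_NOT_LESS_OR_EQUAL.
def decode_bugcheck_d1(p1, p2, p3, p4):
    access = {0: "read", 1: "write"}.get(p3, "other (0x%x)" % p3)
    return {
        "memory_referenced": "0x%08X" % p1,  # address that was touched
        "irql": "0x%02X" % p2,               # IRQL reported at the time
        "access": access,                    # 0 = read, 1 = write
        "faulting_ip": "0x%08X" % p4,        # instruction that touched it
    }

info = decode_bugcheck_d1(0x0061F904, 0xFF, 0x01, 0x81DC7BF7)
for key, value in info.items():
    print("%-18s %s" % (key, value))
```

Read this way, the dump reports a write to 0x0061F904 by the instruction at 0x81DC7BF7.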
ChildEBP RetAddr Args to Child
f241f1e0 8052dfb5 f241f1f4 ffdff13c f241fac4 nt!KdpCauseBugCheck+0x10 (FPO: [1,0,0])
f241f240 8052ea97 00000007 f241f618 f241f610 nt!KdpSendWaitContinue+0x1db (FPO: [Non-Fpo])
f241f620 80530ae7 f241fac4 ffdff13c 00000000 nt!KdpReportExceptionStateChange+0x5b (FPO: [Non-Fpo])
f241f6e0 8042eac9 f241fb18 00000000 f241fac4 nt!KdpTrap+0x385 (FPO: [Non-Fpo])
f241faa8 80466505 f241fac4 00000000 f241fb18 nt!KiDispatchException+0xaf (FPO: [Non-Fpo])
f241fb10 80466ae4 8042c002 000000e1 8201c120 nt!CommonDispatchException+0x4d (FPO: [0,20,0])
f241fb10 804559b5 8042c002 000000e1 8201c120 nt!KiTrap03+0x98 (FPO: [0,0] TrapFrame @ f241fb18)
f241fb88 8042a43b 00000003 f241fbd0 0061f904 nt!RtlpBreakWithStatusInstruction+0x1 (FPO: [1,0,0])
f241fbb8 8042aa2e 00000003 0061f904 81dc7bf7 nt!KiBugCheckDebugBreak+0x31 (FPO: [Non-Fpo])
f241ff44 80468b6c 00000000 0061f904 000000ff nt!KeBugCheckEx+0x390 (FPO: [Non-Fpo])
f241ff44 81dc7bf7 00000000 0061f904 000000ff nt!KiTrap0E+0x284 (FPO: [0,0] TrapFrame @ f241ff60)
WARNING: Frame IP not in any known module. Following frames may be wrong.
f241ffd0 80431b2f 00000008 00003246 80464a19 0x81dc7bf7
00013006 00000000 00000000 00000000 00000000 nt!KiQuantumEnd+0xaf
I examined the stack of the thread where the BSOD occurred, and below is what
I found.
Thread’s stack:
…
…
f241ff04 00000000 f241ff6c f2020f52 8042d009
f241ff14 81fac2b0 81fac324 ffffff01 0061f904
f241ff24 000000ff 00000001 81dc7bf7 f886873e
f241ff34 ffdff848 00000000 00000000 81fac368
f241ff44 f241ff60 80468b6c 00000000 0061f904
f241ff54 000000ff 00000001 81dc7bf7 804699ac
f241ff64 02000002 000000a3 f87994d8 81fb5d08
f241ff74 f1fa9e67 81fb5d08 81f66a00 81fb5008
f241ff84 00000000 f241ffa8 f1fa9c2d 00000000
f241ff94 81f60023 80060023 80482e60 80431b30
f241ffa4 ffdff800 80482e60 ffffffff ffdf0030
f241ffb4 7f3c5bc8 ffdff849 ffdff000 00000000
f241ffc4 00000002 81dc7bf7 00000008 00013006
f241ffd4 80431b30 00000008 00003246 80464a19 <- Return address of CALL ECX
f241ffe4 80482e60 00000000 0013ebf3 00000000
f241fff4 0000002f 80469e3b f1d90d44 ??? <- Bottom
The CPU had just executed the “CALL ECX” instruction (push EIP and load EIP
with ECX=0x80431b30) in nt!KiRetireDpcList,
804649ff ff7214 push dword ptr [edx+0x14]
80464a02 ff7210 push dword ptr [edx+0x10]
80464a05 52 push edx
80464a06 c7421c00000000 mov dword ptr [edx+0x1c],0x0
80464a0d ff8b08080000 dec dword ptr [ebx+0x808]
80464a13 c60600 mov byte ptr [esi],0x0
80464a16 fb sti
80464a17 ffd1 call ecx <- Pending IRQ occurred
80464a19 fa cli
80464a1a 3b6d00 cmp ebp,[ebp]
80464a1d 75c0 jnz nt!KiRetireDpcList+0xe (804649df)
80464a1f c7830c08000000000000 mov dword ptr [ebx+0x80c],0x0
80464a29 c783e007000000000000 mov dword ptr [ebx+0x7e0],0x0
0: kd> u 80431b30
nt!KiTimerExpiration:
80431b30 55 push ebp
80431b31 8bec mov ebp,esp
80431b33 83ec18 sub esp,0x18
80431b36 53 push ebx
80431b37 56 push esi
80431b38 57 push edi
80431b39 33c9 xor ecx,ecx
80431b3b ff150c064080 call dword ptr [nt!imp (8040060c)]
when a (pending) interrupt occurred.
The CPU saved EFL(00003246), CS(00000008) and EIP(80431b30) on the stack
and started executing the IRQ handler.
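To see what the saved EFL value says, its bits can be decoded as below. This is a minimal sketch; the table lists the standard x86 EFLAGS bit positions, and the helper name is only illustrative.

```python
# Standard x86 EFLAGS bit positions (IOPL occupies bits 12-13 and is
# handled separately).
EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF",
    9: "IF", 10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM",
}

def decode_eflags(value):
    """Return (list of set flag names, IOPL) for a 32-bit EFLAGS value."""
    flags = [name for bit, name in sorted(EFLAGS_BITS.items())
             if value & (1 << bit)]
    iopl = (value >> 12) & 0x3
    return flags, iopl

flags, iopl = decode_eflags(0x00003246)
print(flags, "IOPL=%d" % iopl)
```

IF is set in 0x00003246, which fits an interrupt being accepted right after the `sti` at 80464a16.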
Somewhere in the IRQ handler code, the CPU executed an instruction (ADD
[EAX+0x820104],AL) that triggered a page fault interrupt.
81dc7be7 e500 in eax,00
81dc7be9 0000 add [eax],al
81dc7beb 0000 add [eax],al
81dc7bed 0000 add [eax],al
81dc7bef 0001 add [ecx],al
81dc7bf1 0000 add [eax],al
81dc7bf3 0000 add [eax],al
81dc7bf5 0000 add [eax],al
81dc7bf7 008004018200 add [eax+0x820104],al <- PAGE FAULT
81dc7bfd 0800 or [eax],al
81dc7bff 40 inc eax
81dc7c00 c04e4880 ror byte ptr [esi+0x48],0x80
81dc7c04 0000 add [eax],al
81dc7c06 0000 add [eax],al
81dc7c08 0500700090 add eax,0x90007000
The page fault left the following trace on the stack:
EFL(00013006), CS(00000008) and EIP(81dc7bf7).
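The stack row at f241ffc4 can be read as the frame the CPU pushes for a page fault. The sketch below assumes that the dword just below the saved EIP (0x00000002) is the hardware page-fault error code, as the x86 architecture defines for this exception; that reading is an assumption, not something the debugger printed.

```python
# Row "f241ffc4 00000002 81dc7bf7 00000008 00013006" from the stack dump.
ROW_BASE = 0xF241FFC4
ROW = [0x00000002, 0x81DC7BF7, 0x00000008, 0x00013006]

def decode_pf_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # 0 = page not present
        "write": bool(code & 0x2),         # 1 = write access
        "user_mode": bool(code & 0x4),     # 0 = fault taken in supervisor mode
    }

# Assumed layout, lowest address first: error code, EIP, CS, EFLAGS.
error_code, eip, cs, eflags = ROW
fault = decode_pf_error(error_code)
interrupts_enabled = bool(eflags & 0x200)  # IF is bit 9

for offset, name in enumerate(["error code", "EIP", "CS", "EFLAGS"]):
    print("%08x  %-10s %08x" % (ROW_BASE + 4 * offset, name, ROW[offset]))
print(fault, "IF=%d" % interrupts_enabled)
```

Under that assumption the fault was a kernel-mode write to a not-present page, which agrees with the third bug check argument (0x01 = write), and IF is clear in the saved 0x00013006.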
While processing the page fault, the exception handler found that the address
[eax+0x820104]=0x0061F904 was “not valid”, and so it triggered the BSOD.
This is as far as I was able to get.
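The value of EAX at the fault can be recovered from the instruction bytes and the faulting address. A small sketch of the arithmetic, using only the bytes 008004018200 and the address 0x0061F904 shown above:

```python
MASK32 = 0xFFFFFFFF

# Bytes of the faulting instruction at 81dc7bf7: add [eax+0x820104],al
insn = bytes.fromhex("008004018200")
opcode, modrm = insn[0], insn[1]

# ModRM 0x80: mod=2 (disp32 follows), reg=0 (AL), rm=0 (EAX).
mod, reg, rm = modrm >> 6, (modrm >> 3) & 0x7, modrm & 0x7
disp32 = int.from_bytes(insn[2:6], "little")

fault_addr = 0x0061F904
# effective address = eax + disp32 (mod 2**32), so solve for eax.
eax = (fault_addr - disp32) & MASK32

print("mod=%d reg=%d rm=%d disp32=0x%08x" % (mod, reg, rm, disp32))
print("eax = 0x%08x" % eax)
```

This gives EAX = 0xffdff800, a value that also appears in the stack dump at f241ffa4, though that may be a coincidence.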
I would like to identify the offending IRQ handler (device driver).
I have tried the “!arbiter” extension, but I didn’t find any address range
matching 0x81dc7bf7, where the IRQ handler was executing.
There are no nested CALLs on the stack between the two interrupts, and the
code executed by the first interrupt handler (… 0x81dc7bf7) looks like
*garbage* to me.
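One way to check the “garbage” impression is to reinterpret the same bytes as little-endian dwords instead of instructions. The sketch below uses the bytes from the disassembly above, starting at the dword-aligned address 81dc7be8. Treating values at or above 0x80000000 as kernel-range assumes the default 2 GB/2 GB address split, which is an assumption about this system.

```python
import struct

# Bytes 81dc7be8..81dc7c0b, copied from the disassembly listing.
START = 0x81DC7BE8
RAW = bytes.fromhex(
    "00000000" "00000000" "01000000" "00000000"
    "80040182" "00080040" "c04e4880" "00000000"
    "05007000"
)

# Reinterpret the byte run as unsigned little-endian 32-bit values.
dwords = struct.unpack("<%dI" % (len(RAW) // 4), RAW)
for i, value in enumerate(dwords):
    print("%08x  %08x" % (START + 4 * i, value))

# Assumption: kernel addresses start at 0x80000000 (default split).
pointer_like = [v for v in dwords if v >= 0x80000000]
print("pointer-like:", ["%08x" % v for v in pointer_like])
```

Read this way, the region is mostly zeros plus values such as 82010480 and 80484ec0 that resemble kernel addresses, which looks more like a data structure than code.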
Any hope?
Any suggestions?
WBR Primoz