Help with invalid interrupt crash?

I got this crash today on a system that had been sitting idle for a
week, and I don’t know what to make of it. I’m guessing that it’s due
to an invalid interrupt. I can’t tell for certain which interrupt it
would be, but I think it’s the hard drive. This system has a fair
amount of our custom hardware in it, but it’s a product we make, so
that hardware is running in thousands of our other systems, and I
haven’t seen this type of crash before. One difference in this system
is a new single-board motherboard, so I’m thinking that there may be a
spurious device on it for which there’s no installed device driver, or
something along those lines.

Anyway, can anyone tell me what this crash REALLY means, or how to
proceed with debugging it, or what the magic crash address of 0x0128 is
trying to tell me?

I think that it’s saying that the 0000003e passed into
HalBeginSystemInterrupt is the interrupt vector, and thus the IRQ is
0x0e (14), which on that machine is attached to the primary IDE channel.
In that case the jump that crashes should have gone through
hal!HalpSpecialDismissTable(80a750a8)+0x0e*4, which holds the value
80a737b8, which is _HalBeginSystemInterrupt+0040. But it didn’t jump
there, it crashed. So why would it suddenly have a problem handling the
interrupt from the hard drive? Or am I dealing with a faulty memory
read? Or am I on the wrong track?
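
The arithmetic in the paragraph above can be checked with a small sketch. The vector, table address and fault address are copied from the dump; the vector base of 0x30 is an assumption about this HAL (it is not shown anywhere in the debugger output), and the helper names are made up for illustration.

```python
# Values copied from the debugger output in this post.
VECTOR = 0x3E               # second argument to HalBeginSystemInterrupt
DISMISS_TABLE = 0x80A750A8  # hal!HalpSpecialDismissTable
FAULT_ADDRESS = 0x00000128  # Arg1 of the bugcheck

# Assumption (not stated in the dump): this HAL maps IRQ 0 to vector 0x30.
PRIMARY_VECTOR_BASE = 0x30
ENTRY_SIZE = 4              # the jmp scales ecx by 4 (32-bit pointers)


def irq_from_vector(vector: int, base: int = PRIMARY_VECTOR_BASE) -> int:
    """Convert an interrupt vector to an IRQ number under the assumed base."""
    return vector - base


def table_slot(index: int, table: int = DISMISS_TABLE) -> int:
    """Address read by `jmp dword ptr [table + index*4]`, wrapped to 32 bits."""
    return (table + index * ENTRY_SIZE) & 0xFFFFFFFF


irq = irq_from_vector(VECTOR)
slot = table_slot(irq)
print(f"IRQ {irq} (0x{irq:02x}), table slot at 0x{slot:08x}")
print("slot matches fault address:", slot == FAULT_ADDRESS)
```

Under that assumption the slot for IRQ 14 is nowhere near 0x128, so a plain index of 0x0e would not by itself explain the address reported in the bugcheck.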

Thanks in advance for any assistance!

Microsoft (R) Windows Debugger Version 6.6.0003.5
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [c:\windows\MEMORY.DMP]
Kernel Complete Dump File: Full address space is available

Symbol search path is:
srv*c:\websymbols*http://msdl.microsoft.com/download/symbols;c:\windows\symbols
Executable search path is:
Windows Server 2003 Kernel Version 3790 (Service Pack 1) UP Free x86 compatible
Product: Server, suite: TerminalServer SingleUserTS
Built by: 3790.srv03_sp1_rtm.050324-1447
Kernel base = 0x80800000 PsLoadedModuleList = 0x808a8e48
Debug session time: Fri Jul 21 03:30:02.071 2006 (GMT-7)
System Uptime: 6 days 16:55:48.524
Loading Kernel Symbols


Loading User Symbols
Loading unloaded module list

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck A, {128, ff, 1, 80a73781}

Probably caused by : ntoskrnl.exe ( nt!KiTrap0E+2a1 )

Followup: MachineOwner

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 00000128, memory referenced
Arg2: 000000ff, IRQL
Arg3: 00000001, value 0 = read operation, 1 = write operation
Arg4: 80a73781, address which referenced memory

Debugging Details:

WRITE_ADDRESS: 00000128

CURRENT_IRQL: 2

FAULTING_IP:
hal!HalBeginSystemInterrupt+9
80a73781 ff248da850a780 jmp dword ptr [hal!HalpSpecialDismissTable (80a750a8)+ecx*4]

DEFAULT_BUCKET_ID: INTEL_CPU_MICROCODE_ZERO

BUGCHECK_STR: 0xA

LAST_CONTROL_TRANSFER: from 80a73781 to 80826493

STACK_TEXT:
8089d504 80a73781 badb0d00 80010031 857c3190 nt!KiTrap0E+0x2a1
8089d574 8081f14b 0001000d 0000003e 8089d588 hal!HalBeginSystemInterrupt+0x9
8089d574 80a7338a 0001000d 0000003e 8089d588 nt!KiInterruptDispatch+0x1b
8089d600 80820a45 00000000 0000000e 00000000 hal!HalProcessorIdle+0x2
8089d604 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0xa

STACK_COMMAND: kb

FOLLOWUP_IP:
nt!KiTrap0E+2a1
80826493 833dc0828a8000 cmp dword ptr [nt!KiFreezeFlag (808a82c0)],0x0

FAULTING_SOURCE_CODE:

SYMBOL_STACK_INDEX: 0

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: nt!KiTrap0E+2a1

MODULE_NAME: nt

IMAGE_NAME: ntoskrnl.exe

DEBUG_FLR_IMAGE_TIMESTAMP: 42435e33

FAILURE_BUCKET_ID: 0xA_W_nt!KiTrap0E+2a1

BUCKET_ID: 0xA_W_nt!KiTrap0E+2a1

Followup: MachineOwner
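
For anyone reading along, the four bugcheck arguments in the output above can be pulled apart with a short script. This is only a sketch: the field names are taken from the "Arguments:" text printed by !analyze -v for bugcheck 0xA, and parse_bugcheck is a hypothetical helper, not part of any debugger tooling. It decodes the raw values only and does not try to reconcile Arg2 (0xff) with the CURRENT_IRQL of 2 shown in the same output.

```python
import re

# Summary line copied from the debugger output.
BUGCHECK_LINE = "BugCheck A, {128, ff, 1, 80a73781}"

# Names follow the "Arguments:" section printed for bugcheck 0xA.
ARG_NAMES = ("memory_referenced", "irql", "operation", "faulting_ip")
OPERATIONS = {0: "read", 1: "write"}


def parse_bugcheck(line):
    """Split a 'BugCheck X, {a, b, c, d}' line into a code and named arguments."""
    match = re.fullmatch(r"BugCheck ([0-9A-Fa-f]+), \{([^}]*)\}", line.strip())
    if match is None:
        raise ValueError(f"not a bugcheck summary line: {line!r}")
    code = int(match.group(1), 16)
    values = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, dict(zip(ARG_NAMES, values))


code, args = parse_bugcheck(BUGCHECK_LINE)
print(f"bugcheck 0x{code:X}")
print(f"  memory referenced: 0x{args['memory_referenced']:08x}")
print(f"  IRQL:              0x{args['irql']:02x}")
print(f"  operation:         {OPERATIONS.get(args['operation'], 'unknown')}")
print(f"  faulting IP:       0x{args['faulting_ip']:08x}")
```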

Generally the 0x00000128 is somebody attempting to reference a structure field
0x128 bytes from a pointer that happens to be null, which results in an
access to the guard page at address 0, which in turn results in the
IRQL_NOT_LESS_OR_EQUAL bugcheck, which really ought to be named
NULL_POINTER_REFERENCE_BUGCHECK.
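
To illustrate the null-pointer-plus-offset pattern described above, here is a minimal sketch using ctypes. HypotheticalContext and its fields are invented purely for the example (nothing in the dump identifies a real structure); the point is only that a field at offset 0x128, reached through a null base pointer, lands at address 0x128.

```python
import ctypes


class HypotheticalContext(ctypes.Structure):
    """Made-up structure whose 'counter' field sits at offset 0x128."""
    _fields_ = [
        ("reserved", ctypes.c_ubyte * 0x128),  # filler up to the field of interest
        ("counter", ctypes.c_uint32),
    ]


def field_address(base, struct_type, field_name):
    """Address the CPU would touch for base->field_name."""
    return base + getattr(struct_type, field_name).offset


null_base = 0  # the pointer "that happens to be null"
addr = field_address(null_base, HypotheticalContext, "counter")
print(f"counter offset: 0x{HypotheticalContext.counter.offset:x}")
print(f"address touched through a null pointer: 0x{addr:08x}")
```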

=====================
Mark Roddy DDK MVP
Windows 2003/XP/2000 Consulting
Hollis Technology Solutions 603-321-1032
www.hollistech.com


From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Taed Wynnell
Sent: Friday, July 21, 2006 6:29 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Help with invalid interrupt crash?


> Generally the 0x00000128 is somebody attempting to reference a structure field
> 0x128 bytes from a pointer that happens to be null, which results in an
> access to the guard page at address 0, which in turn results in the
> IRQL_NOT_LESS_OR_EQUAL bugcheck, which really ought to be named
> NULL_POINTER_REFERENCE_BUGCHECK.

Of course. But that doesn’t seem at all relevant here considering where it crashed in the Windows code that handles interrupt dispatching:

FAULTING_IP:
hal!HalBeginSystemInterrupt+9
80a73781 ff248da850a780 jmp dword ptr [hal!HalpSpecialDismissTable (80a750a8)+ecx*4]

STACK_TEXT:
8089d504 80a73781 badb0d00 80010031 857c3190 nt!KiTrap0E+0x2a1
8089d574 8081f14b 0001000d 0000003e 8089d588 hal!HalBeginSystemInterrupt+0x9
8089d574 80a7338a 0001000d 0000003e 8089d588 nt!KiInterruptDispatch+0x1b
8089d600 80820a45 00000000 0000000e 00000000 hal!HalProcessorIdle+0x2
8089d604 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0xa