Unexplainable Stack in amd64 crash

At least one I can't explain. We have an amd64 system running Windows
2003 SP1 in our lab that crashes randomly, usually a couple of days
after starting some MS SQL Server tests. The offending instruction is
indicated as

dec qword ptr [r8]

and the registers show that r8 contains 0, so it is a null pointer
dereference.

However, the instruction just prior to that is:

mov r8,[r11+rax*8+0x1878d0]

I can't imagine how r8 can end up with 0, except if a context switch
restored a corrupt context. I don't see any other control paths to
dec qword ptr [r8]. Any ideas on how to debug or analyze this problem
are much appreciated. .trap does warn that some registers might be
zeroed, though; can I trust the values shown?

Thanks for any suggestions.

-Shyam

The stack is as follows:

fffffadfc6417f90 fffff800013e65ef : 0000000000000863 fffff6fd57d6c090 ffffffffffffffff 0000000000000001 : nt!MiRemovePageByColor+0xbf
fffffadfc6418030 fffff800010ad778 : 0000000000000128 000000004d524f56 fffffaaf00000000 fffff80000000000 : nt!MiAllocateSpecialPool+0x298
fffffadfc64180e0 fffff800013c844b : fffffadfc8373530 fffffadfcd538bf0 fffffaaf4d524f56 fffff8000105b910 : nt!ExAllocatePoolWithTagPriority+0x68
fffffadfc6418150 fffff800013c7ada : fffffadfc6419000 fffff800013d0fd3 fffffadf4d524f56 fffffadfc8373530 : nt!VeAllocatePoolWithTagPriority+0x2ec
fffffadfc64181d0 fffffadfc83260b7 : fffffadfc6419000 0000000000000000 fffffadfcd514bd0 fffffadfc8373700 : nt!VerifierExAllocatePoolWithTag+0x8d

… chomp …

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 0000000000000000, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000001, value 0 = read operation, 1 = write operation
Arg4: fffff80001059dc7, address which referenced memory

Debugging Details:

WRITE_ADDRESS: 0000000000000000

CURRENT_IRQL: 2

FAULTING_IP:
nt!MiRemovePageByColor+bf
fffff800`01059dc7 49ff08 dec qword ptr [r8]

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xA

LAST_CONTROL_TRANSFER: from fffff800013e65ef to fffff80001059dc7
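
As a cross-check that the faulting bytes 49ff08 really are a write through r8 and nothing else, here is a minimal Python sketch that splits them into REX prefix, opcode and ModRM fields. The helper name decode_rex_modrm is made up for illustration, and it only handles this three-byte shape (REX + one-byte opcode + ModRM with no SIB or displacement); it is not a general disassembler.

```python
def decode_rex_modrm(code: bytes) -> dict:
    """Split a 3-byte REX + opcode + ModRM encoding into its bit fields.

    Hypothetical helper for illustration only: it assumes exactly three
    bytes and no SIB byte or displacement.
    """
    rex, opcode, modrm = code
    if rex & 0xF0 != 0x40:
        raise ValueError("first byte is not a REX prefix")
    return {
        "rex_w": (rex >> 3) & 1,  # 1 = 64-bit operand size (qword)
        "rex_r": (rex >> 2) & 1,
        "rex_x": (rex >> 1) & 1,
        "rex_b": rex & 1,         # extends ModRM.rm to reach r8..r15
        "opcode": opcode,
        "mod": modrm >> 6,        # 0 = register-indirect, no displacement
        "reg": (modrm >> 3) & 7,  # opcode extension: FF /1 is DEC r/m
        "rm": modrm & 7,
    }


fields = decode_rex_modrm(bytes.fromhex("49ff08"))
# Register number of the memory operand's base: 8 means r8.
base_register = fields["rm"] + 8 * fields["rex_b"]
print(fields, base_register)
```

With REX.W set, opcode FF, reg field 1 and a base register number of 8, the bytes agree with the disassembly shown: a 64-bit decrement of the memory at [r8].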

1: kd> .trap fffffadfc6417e00 ; kb
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed.
rax=0000000000000006 rbx=0000000000000000 rcx=0000000000055edb
rdx=000000000000000b rsi=0000000000000000 rdi=0000000000000000
rip=fffff80001059dc7 rsp=fffffadfc6417f90 rbp=fffffadfc6417fc0
 r8=0000000000000000  r9=0000000000000000 r10=fffffadfcabfe500
r11=fffff80001000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl nz na po nc
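
On the question of whether the zeroed values can be trusted, one rough check is to sort the zero registers by whether they are volatile in the Microsoft x64 calling convention. The sketch below assumes the .trap note refers to nonvolatile registers, which x64 trap frames generally do not capture. That is an assumption on my part, not something the debugger output states. The VOLATILE set and the variable names are illustrative only.

```python
import re

# Register dump copied from the .trap output above.
DUMP = """
rax=0000000000000006 rbx=0000000000000000 rcx=0000000000055edb
rdx=000000000000000b rsi=0000000000000000 rdi=0000000000000000
rip=fffff80001059dc7 rsp=fffffadfc6417f90 rbp=fffffadfc6417fc0
 r8=0000000000000000  r9=0000000000000000 r10=fffffadfcabfe500
r11=fffff80001000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
"""

# Volatile (caller-saved) integer registers in the Microsoft x64 convention.
VOLATILE = {"rax", "rcx", "rdx", "r8", "r9", "r10", "r11"}

# Parse "name=16 hex digits" pairs into a name -> integer value mapping.
regs = {
    name: int(value, 16)
    for name, value in re.findall(r"(\w+)=([0-9a-f]{16})", DUMP)
}

zero_volatile = sorted(r for r, v in regs.items() if v == 0 and r in VOLATILE)
zero_nonvolatile = sorted(r for r, v in regs.items() if v == 0 and r not in VOLATILE)

print("zero, volatile:   ", zero_volatile)
print("zero, nonvolatile:", zero_nonvolatile)
```

Under that assumption, the zeros in rbx, rsi, rdi and r12 through r15 are the ones the note is warning about, while r8 and r9 are volatile and so their zeros are more likely to be real values.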

Forget my question. I just answered it myself. Been sleeping too much
over the weekend.

-Shyam


From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Nagaraj Shyam
Sent: Tuesday, September 06, 2005 1:45 PM
To: Windows File Systems Devs Interest List
Subject: [ntfsd] Unexplainable Stack in amd64 crash


Questions? First check the IFS FAQ at
https://www.osronline.com/article.cfm?id=17

You are currently subscribed to ntfsd as: unknown lmsubst tag argument:
‘’
To unsubscribe send a blank email to xxxxx@lists.osr.com

Ok, I can explain how it got to be zero. Someone (most likely our
driver) wrote zero to that address.

In the crash the address turns out to be fffff800`01187900. Our product
is an encryption and fine-grained access control filter driver based on
the older (sfilter-type) filtering model.
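
For anyone following along, the address can be reproduced from the mov operand in the original post (r11+rax*8+0x1878d0) and the r11 and rax values in the trap frame. A small Python sketch of the arithmetic, assuming r11 and rax still hold the values they had when the mov executed (the mov itself only writes r8):

```python
# Register values taken from the .trap output in the original post.
r11 = 0xFFFFF80001000000   # base register
rax = 0x6                  # index register
SCALE = 8                  # from rax*8 in the mov operand
DISPLACEMENT = 0x1878D0    # from the mov operand

# Effective address of the qword that was loaded into r8, wrapped to 64 bits.
effective_address = (r11 + rax * SCALE + DISPLACEMENT) & 0xFFFFFFFFFFFFFFFF

# Format the way the debugger prints 64-bit addresses.
formatted = f"{effective_address >> 32:08x}`{effective_address & 0xFFFFFFFF:08x}"
print(formatted)  # prints fffff800`01187900
```

So the zero in r8 was read from that location rather than produced by a bad register restore.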

Any ideas from the helpful list readers on how to catch this as it
happens?

Appreciate the help.

-Shyam


From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Nagaraj Shyam
Sent: Tuesday, September 06, 2005 2:01 PM
To: Windows File Systems Devs Interest List
Subject: RE: [ntfsd] Unexplainable Stack in amd64 crash
