My first dump analysis -- how to proceed???

Hello,

happy new year to you all!

My new year started with my first XP bluescreen. I have been running XP
pro retail for quite some time now without any problems or bluescreens.
On 31st December I installed Norton SystemWorks 2002.05 and BlackICE
Defender 2.9.cap, and one day later I got my first crash (have had two
more since).

I suspect Norton or BlackICE (or the combination of the two) to be the
cause of these crashes. However, I am an absolute beginner when it comes
to WinDbg and crash dump analysis, so I would like to hear your opinion
on how to proceed.

Here’s what I did so far (it’s quite long, sorry for that):

  • I had a full dump (512MB) generated when the crash occurred.
  • I loaded WinDbg 4.0.0018.0. The symbol search path was set to

d:\windows\symbols;srv*d:\windows\websymbols*http://msdl.microsoft.com/download/symbols

The executable search path was set to d:\windows\system32

  • I loaded the dump, and WinDbg immediately did an !analyze. It also
    reports that it can not find the symbols for mshtml.dll and is using
    its exports instead. I notice that during the symbol loading process,
    two directories are created under d:\windows\websymbols, named
    mshtml.pdb and mshtml.dbg, which are removed once the “*** ERROR:
    Symbol file could not be found” message appears in WinDbg.

QUESTION: is this proof that WinDbg went out to the symbol server but
could not find the proper symbols there?
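
For reference, the symbol path quoted above has two elements separated by a semicolon: a plain directory, and a symbol-server entry of the form srv*<local cache>*<server>. The minimal Python sketch below splits the path into those parts. The function name parse_symbol_path is made up for illustration and is not part of any debugger tool, and the sketch only handles the two element shapes that appear in this post. The "cache" part it prints is d:\windows\websymbols, the same directory in which the temporary mshtml.pdb and mshtml.dbg directories were seen.

```python
def parse_symbol_path(path: str) -> list:
    """Split a debugger symbol path into its elements (illustrative only)."""
    entries = []
    for element in path.split(";"):
        element = element.strip()
        if not element:
            continue
        parts = element.split("*")
        # srv*<local cache>*<server>: files fetched from <server> are
        # stored under <local cache> before the debugger loads them.
        if parts[0].lower() == "srv" and len(parts) == 3:
            entries.append({"kind": "symbol server",
                            "cache": parts[1],
                            "server": parts[2]})
        else:
            entries.append({"kind": "directory", "path": element})
    return entries


# The symbol path from the post, with the wrapped URL joined back together.
SYMBOL_PATH = (r"d:\windows\symbols;"
               r"srv*d:\windows\websymbols*http://msdl.microsoft.com/download/symbols")

for entry in parse_symbol_path(SYMBOL_PATH):
    print(entry)
```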

  • In order to find out why this symbol load failed, I did a
    !lmi mshtml. It gave me the following timestamp:
    3c0d735e Wed Dec 05 02:07:42 2001. When I go out to
    d:\windows\system32 and use Explorer to look at mshtml.dll’s
    properties there, I do not see a matching timestamp. When I compare
    kernel32.dll’s timestamps in this same manner, they do not match
    either. The mshtml.dll version is 6.0.2712.300 and I can not find
    this version in Microsoft’s online DLL version database…

QUESTION: is this the proper way to compare timestamps, to ascertain
which module version is loaded?
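
A note on the comparison above: as far as I know, the timestamp that !lmi prints is the TimeDateStamp field that the linker writes into the module's PE header, while Explorer's properties dialog shows file-system dates, so the two would not normally be expected to match. The Python sketch below reads that header field from the raw bytes of an image. The function names pe_timestamp and as_utc are hypothetical helpers, and the offsets used (e_lfanew at 0x3C, TimeDateStamp 8 bytes past the "PE\0\0" signature) follow the published PE/COFF layout.

```python
import struct
from datetime import datetime, timezone


def pe_timestamp(image: bytes) -> int:
    """Return the COFF TimeDateStamp from the raw bytes of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C in the DOS header) points at the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Signature (4 bytes) + Machine (2) + NumberOfSections (2) = 8 bytes.
    (stamp,) = struct.unpack_from("<I", image, e_lfanew + 8)
    return stamp


def as_utc(stamp: int) -> str:
    """Format a PE timestamp (seconds since 1970-01-01 UTC) as text."""
    moment = datetime.fromtimestamp(stamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# The value reported by !lmi mshtml in the post.
print(as_utc(0x3C0D735E))  # 2001-12-05 01:07:42
```

To check a file on disk, one could call pe_timestamp(open(r"d:\windows\system32\mshtml.dll", "rb").read()) and compare the result with 0x3c0d735e. The one-hour difference from the 02:07:42 shown by the debugger would fit a display in a UTC+1 local time zone, but that is an assumption about the machine's settings.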

  • Ignoring the missing mshtml symbols for now, I issue an !analyze -v
    command. (See below for the full analysis output) It tells me the
    bugcheck is IRQL_NOT_LESS_OR_EQUAL (a), and the IRQL was 2
    (DISPATCH_LEVEL). The fault occurred in KeWaitForSingleObject(), so I
    guess this must mean that this function was called with a non-zero
    Timeout parameter (cf. DDK docs).

  • Looking at the stack backtrace I see:

00135fb8 1a4027b7 000001f8 ffffffff 1a4591f8 kernel32!WaitForSingleObject+0xf

From this I deduce that the handle value to wait on was 0x1f8 and the
wait was to be INFINITE (0xffffffff). Issuing a !handle 1f8 tells me
the object is a named Mutant, with name “ZonesCounterMutex”.

QUESTION: are these deductions correct?
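
The reading of that stack line can be sketched in Python. In kb-style output the columns are the child EBP, the return address, and the first three DWORDs found where the callee's arguments would be. WaitForSingleObject takes only two parameters (handle and timeout in milliseconds), so the third value is just whatever sits next on the stack. The helper name parse_kb_line is made up for this illustration, and the sketch assumes the frame has been joined onto a single line.

```python
INFINITE = 0xFFFFFFFF  # Win32 timeout value meaning "wait forever"


def parse_kb_line(line: str) -> dict:
    """Split one kb-style stack line into its numeric columns and symbol."""
    fields = line.split()
    # First five columns are hex: ChildEBP, RetAddr, and three stack DWORDs.
    child_ebp, ret_addr, *args = (int(f, 16) for f in fields[:5])
    symbol = fields[5] if len(fields) > 5 else ""
    return {"child_ebp": child_ebp, "ret_addr": ret_addr,
            "args": args, "symbol": symbol}


frame = parse_kb_line(
    "00135fb8 1a4027b7 000001f8 ffffffff 1a4591f8 "
    "kernel32!WaitForSingleObject+0xf"
)
handle, timeout = frame["args"][0], frame["args"][1]
print(hex(handle), timeout == INFINITE, frame["symbol"])
```

Under these assumptions the handle comes out as 0x1f8 and the timeout equals INFINITE, which agrees with the deduction in the post.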

  • Furthermore, looking at the stack backtrace I see no drivers or
    anything unexpected in the trace. It looks like it’s just urlmon.dll
    calling WaitForSingleObject()

QUESTION: so how can the IRQL be too high when KeWaitForSingleObject()
is called inside ntoskrnl.exe, when the call is coming without detours
from urlmon.dll in usermode???

I would like to know what to do next. Of course I can uninstall Norton
SystemWorks and BlackICE Defender and, once the crashes stay away,
empirically “prove” that they must’ve been the culprits, but I would
like to determine this from the dump itself, if that is at all possible.

Thanks for any help,

Gert-Jan Bartelds

Here’s the full analysis output:

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pagable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 00000004, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000001, value 0 = read operation, 1 = write operation
Arg4: 804ebaed, address which referenced memory

Debugging Details:

WRITE_ADDRESS: 00000004 Unknown

CURRENT_IRQL: 2

FAULTING_IP:
nt!KeWaitForSingleObject+290
804ebaed 894204 mov [edx+0x4],eax

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: A

TRAP_FRAME: f405bc58 -- (.trap fffffffff405bc58)
ErrCode = 00000002
eax=818c1b08 ebx=818c1af8 ecx=81946370 edx=00000000 esi=81a16890 edi=81a16900
eip=804ebaed esp=f405bccc ebp=f405bcec iopl=0 nv up ei ng nz ac po cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010297
nt!KeWaitForSingleObject+290:
804ebaed 894204 mov [edx+0x4],eax
Resetting default context

LAST_CONTROL_TRANSFER: from 805718e8 to 804ebaed

STACK_TEXT:
f405bcec 805718e8 00000001 00000006 c0000001 nt!KeWaitForSingleObject+0x290
f405bd50 804d4e91 000001f8 00000000 00000000 nt!NtWaitForSingleObject+0x9a
f405bd50 7ffe0304 000001f8 00000000 00000000 nt!KiSystemService+0xc4
00135f40 77f7f4af 77e7788b 000001f8 00000000 SharedUserData!SystemCallStub+0x4
00135f44 77e7788b 000001f8 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
00135fa8 77e79d6a 000001f8 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xa8
00135fb8 1a4027b7 000001f8 ffffffff 1a4591f8 kernel32!WaitForSingleObject+0xf
00135ff8 1a40663e 0118e644 001362cc 001362e8 urlmon!InternetCreateSecurityManager+0x41
001362ec 63615b3a 0019b3a0 0118e644 00138344 urlmon!CoInternetParseUrl+0x1fe
WARNING: Stack unwind information not available. Following frames may be wrong.
00138320 636096b2 00138344 00138544 00000000 mshtml+0x95b3a
00138548 63609534 011606e0 00000000 00138584 mshtml+0x896b2
0013856c 6360a461 0118e760 0013861c 0259ce48 mshtml+0x89534
0013868c 00000000 0116d960 00200000 77e77f1e mshtml+0x8a461

FOLLOWUP_IP:
nt!KeWaitForSingleObject+290
804ebaed 894204 mov [edx+0x4],eax

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: nt!KeWaitForSingleObject+290

MODULE_NAME: nt

IMAGE_NAME: ntoskrnl

STACK_COMMAND: .trap fffffffff405bc58 ; kb

BUCKET_ID: 0xA_nt!KeWaitForSingleObject+290

Followup: MachineOwner
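
As a cross-check of the output above, the trap frame and the bugcheck arguments can be reconciled with a few lines of Python. The faulting instruction is mov [edx+0x4],eax with edx=00000000, so the write targets address 4, which is Arg1. ErrCode is decoded here using the x86 page-fault error-code bits (bit 0 = page present, bit 1 = write, bit 2 = user mode). The function name decode_page_fault_error and the dictionaries are illustrative only, with values copied from the dump.

```python
def decode_page_fault_error(err_code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(err_code & 0x1),  # 0 = page was not present
        "write": bool(err_code & 0x2),         # 1 = the access was a write
        "user_mode": bool(err_code & 0x4),     # 0 = fault came from kernel mode
    }


# Values copied from the TRAP_FRAME and Arguments sections of the dump.
regs = {"eax": 0x818C1B08, "edx": 0x00000000}
bugcheck_args = {"address": 0x00000004, "irql": 0x2,
                 "write": 0x1, "ip": 0x804EBAED}

# Effective address of "mov [edx+0x4],eax" on a 32-bit system.
target = (regs["edx"] + 0x4) & 0xFFFFFFFF
fault = decode_page_fault_error(0x00000002)

print(hex(target), fault)
```

The computed target equals Arg1 and the error code decodes to a kernel-mode write to a non-present page, which is what one would expect if the code followed a pointer that was NULL where a valid structure address should have been. That interpretation is an inference from the register values, not something the dump states directly.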


You are currently subscribed to windbg as: $subst(‘Recip.EmailAddr’)
To unsubscribe send a blank email to leave-windbg-$subst(‘Recip.MemberIDChar’)@lists.osr.com

If KeWaitForSingleObject crashed, it’s because either
A) someone passed bogus parameters (not the case here, since it’s called by the
Nt API, which does full parameter validation),
or
B) something corrupted memory and clobbered a random structure, in which case
KeWaitForSingleObject is the victim.

I would

  1. turn on Driver Verifier for all the third-party drivers on your machine
    first and see if you can catch something (you can turn it on for everything
    in the OS, but that requires LOTS of memory in the system to work well)
  2. if that does not find anything, start disabling some drivers to see if you
    can make the crash disappear; this can isolate it down to a bad driver.

If you do find the bad driver, I would *GREATLY* appreciate it if you let me
know which one it is, because tracking memory-corrupting drivers is pretty
hard, and knowing which drivers can cause memory corruption makes our
online crash analysis effort much more efficient.

-Andre



Andre,

thanks for replying.

In the time span between my original posting and your reply I have had 4
more crashes. Three of those four were IRQL_NOT_LESS_OR_EQUAL, and two of
those three had SYMEVENT in the stack. So I reckoned Norton SystemWorks
could indeed be the culprit, and I uninstalled it.

Some hours later I got another crash, this time
KERNEL_MODE_EXCEPTION_NOT_HANDLED, with nv4_mini.sys (NVidia’s Detonator
driver) high in the stack trace. The crash occurred while switching Windows
Media Player from full screen to windowed operation. So I upgraded the NVidia
drivers (version 5.13.01.2183) to the newest ones from their website
(6.13.10.2311).

I have been running for half a day now without any crashes.

However, I might for “education” purposes (my education, that is) roll
back to the original NVidia drivers and also reinstall SystemWorks, and then
turn on Driver Verifier to see what happens.

I’ll let you know,

Thanks again.

Gert-Jan

-----Original Message-----
From: xxxxx@microsoft.com
To: Kernel Debugging Interest List
Sent: 06/Jan/02 1:14 AM
Subject: [windbg] Re: My first dump analysis – how to proceed???

If KeWaitForSingleObject crashed, it’s because either
A) someone passed bogus parameters (not the case here, since it’s called by the
Nt API, which does full parameter validation),
or
B) something corrupted memory and clobbered a random structure, in which case
KeWaitForSingleObject is the victim.

I would

  1. turn on Driver Verifier for all the third-party drivers on your machine
    first and see if you can catch something (you can turn it on for everything
    in the OS, but that requires LOTS of memory in the system to work well)
  2. if that does not find anything, start disabling some drivers to see if you
    can make the crash disappear; this can isolate it down to a bad driver.

If you do find the bad driver, I would *GREATLY* appreciate it if you let me
know which one it is, because tracking memory-corrupting drivers is pretty
hard, and knowing which drivers can cause memory corruption makes our
online crash analysis effort much more efficient.

-Andre

