Hi all,
I have a problem with a system hanging, and I am trying to diagnose the
cause. I would appreciate any suggestions you could offer in my debug
analysis procedure. I am debugging an NT4.0 SP5 server. Since the system is
hung, I’ve connected a WinDbg 4.0.0018.0 debugger.
My first step was !teb, which returned (among other things) ClientId:
125.142
I also did a “!process -1” which showed me the current process and all the
threads. The base information it displayed was:
PROCESS 80e74300 Cid: 0125 Peb: 7ffdf000 ParentCid: 002d
DirBase: 15ef0000 ObjectTable: 81f24f08 TableSize: 606.
Image: AppNumberOne.ex
VadRoot 80e83408 Clone 0 Private 16753. Modified 1088783. Locked 0.
80E744BC MutantState Signalled OwningThread 0
Token e13fa030
ElapsedTime 22:16:34.0218
UserTime 0:05:18.0453
KernelTime 0:18:34.0203
QuotaPoolUsage[PagedPool] 67982
QuotaPoolUsage[NonPagedPool] 5399032
Working Set Sizes (now,min,max) (17389, 50, 345) (69556KB, 200KB, 1380KB)
PeakWorkingSetSize 18063
VirtualSize 157 Mb
PeakVirtualSize 164 Mb
PageFaultCount 26984367
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 17422
Only one thread was listed in a “running” state: 125.142, whose thread
address was 81e06020.
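To double-check that the ClientId from !teb really belongs to the process shown by “!process -1”, here is a minimal sketch of how I am reading it. It assumes the ClientId is printed as process.thread with both halves in hex; the helper name parse_client_id is just something I made up for illustration.

```python
from typing import Tuple


def parse_client_id(cid: str) -> Tuple[int, int]:
    """Split a 'process.thread' ClientId string into two integers.

    Assumption: both halves are hexadecimal, as printed by the debugger.
    """
    process_part, thread_part = cid.split(".")
    return int(process_part, 16), int(thread_part, 16)


pid, tid = parse_client_id("125.142")

# "!process -1" printed the process as "Cid: 0125"; compare the two values.
same_process = pid == int("0125", 16)
print(f"process 0x{pid:x}, thread 0x{tid:x}, matches Cid 0125: {same_process}")
```

Read this way, the process half of the ClientId agrees with the “Cid: 0125” line in the process dump.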
Next I did a .thread 81e06020
Then I typed a KB command which yielded some interesting results.
ChildEBP RetAddr
f52e16a0 80119594 hal!KeAcquireSpinLockRaiseToSynch+0x34
f52e16b0 80112b35 nt!KeInsertQueueApc+0x12
f52e16dc eb0e2130 nt!IofCompleteRequest+0x201
WARNING: Stack unwind information not available. Following frames may be wrong.
80b12034 0e1fb000 MyDriver+0x2130
0690b000 00000000 0xe1fb000
Note that “MyDriver” is NOT called by this application, but is called by
another application running in the system.
I also typed “!pcr”, which yielded “Irql: 00000000”.
So now that I see this thread spinning in “KeAcquireSpinLockRaiseToSynch”,
I typed “!locks” which displayed (among other things) the following:
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks…
Resource @ 0x81f2d4e8 Shared 2 owning threads
Threads: 81e06020-02 80f937c0-01
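Here is a small sketch of how I am reading the “Threads:” line above. I am assuming each entry is a thread address followed by a per-thread owner count (I am not certain of the count's radix, so treating it as hex is a guess), and parse_owner_threads is a hypothetical helper, not a debugger feature.

```python
import re


def parse_owner_threads(line: str) -> dict:
    """Map each owning thread address to its count from a '!locks' Threads line.

    Assumption: entries look like '<8 hex digits>-<count>'.
    """
    owners = {}
    for address, count in re.findall(r"([0-9a-fA-F]{8})-([0-9a-fA-F]+)", line):
        owners[address.lower()] = int(count, 16)
    return owners


owners = parse_owner_threads("Threads: 81e06020-02 80f937c0-01")

# The thread found spinning earlier; every other owner is worth a look.
running_thread = "81e06020"
other_owners = [t for t in owners if t != running_thread]
print(owners)
print("other owners to inspect:", other_owners)
```

That leaves 80f937c0 as the only other owner of the resource.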
Now I see that thread 81e06020 (hey, that is the same thread as above) shares
a resource with thread 80f937c0. So I did a “.thread 80f937c0” and
then a “kb” and this is what I got:
ChildEBP RetAddr
eb41f7ec 801185c2 nt!KiSwapThread+0x1b1
eb41f810 8027e5ac nt!KeWaitForSingleObject+0x1b8
eb41f82c 8027d9e7 Ntfs!NtfsWaitSync+0x18
eb41fa10 8027fcae Ntfs!NtfsNonCachedIo+0x1a5
eb41fc4c 8027eb8d Ntfs!NtfsCommonWrite_6695+0x36
eb41fcc0 801128af Ntfs!NtfsFsdWrite+0xcc
eb41fcd4 80113816 nt!IofCallDriver+0x37
eb41fcec 80125439 nt!IoSynchronousPageWrite+0xb2
eb41fdc8 8012507c nt!MiFlushSectionInternal+0x36f
eb41fe04 80104222 nt!MmFlushSection+0x128
eb41feb4 80103bca nt!CcFlushCache+0x3b6
eb41fef8 80108f87 nt!CcWriteBehind+0xf0
eb41ff34 8010bcdd nt!CcWorkerThread+0xc7
eb41ff4c 80139bde nt!ExpWorkerThread+0x73
eb41ff7c 8014563e nt!PspSystemThreadStartup+0x54
00000000 00000000 nt!KiThreadStartup+0x16
So… now what? Did I misinterpret or miss something along the way? What
did it mean when it said “MutantState Signalled OwningThread 0” in the
“!process -1” command’s output? Also, the “Working set sizes” seemed very
odd. What should I be asking myself that I am missing?
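On the working set numbers, here is the quick check I did to convince myself how the two parenthesized groups relate. It assumes the first group is in pages and that a page is 4 KB on this x86 box; the names below are mine, not anything the debugger prints.

```python
PAGE_SIZE_KB = 4  # assumption: 4 KB pages on x86


def pages_to_kb(pages: int) -> int:
    """Convert a page count to kilobytes."""
    return pages * PAGE_SIZE_KB


# Values copied from the "!process -1" output.
working_set_pages = {"now": 17389, "min": 50, "max": 345}
reported_kb = {"now": 69556, "min": 200, "max": 1380}

for name, pages in working_set_pages.items():
    agrees = pages_to_kb(pages) == reported_kb[name]
    print(f"{name}: {pages} pages = {pages_to_kb(pages)} KB, agrees: {agrees}")

# The current working set is far above the listed maximum.
over_max = working_set_pages["now"] > working_set_pages["max"]
print("now exceeds max:", over_max)
```

So the KB figures are just the page counts times four. The part that still looks odd to me is that the current size is far above the listed maximum.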
I know this seems like a lot of information (to me), but any suggestions
would be greatly appreciated.
Thanks,
Joe D.