Debugging a locked-up server

I have a server that is locked up so hard the mouse doesn’t even move. What conclusions can I draw from that? In particular, does it tell me anything about what IRQL things are stuck at?

!locks says:

**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks…

Resource @ 0xfffffa80013b1830 Shared 1 owning threads
Threads: fffffa8002479b50-01<*>
KD: Scanning for held locks…

Resource @ 0xfffffa8002038af8 Shared 1 owning threads
Threads: fffffa8000c9e660-01<*>
KD: Scanning for held locks…
5101 total locks, 2 locks currently held

!thread fffffa8002479b50
THREAD fffffa8002479b50 Cid 058c.0640 Teb: 000007fffffde000 Win32Thread: 0000000000000000 ???
IRP List:
fffffa800264f540: (0006,0310) Flags: 00060043 Mdl: fffffa8001aed670
Not impersonating
Owning Process fffffa80021a0b30 Image: smss.exe
Attached Process N/A Image: N/A
Wait Start TickCount 123972666 Ticks: 33426535 (6:00:50:57.288)
Context Switch Count 1
UserTime 00:00:00.000
KernelTime 00:00:00.000
Win32 Start Address 0x0000000047c77d9c
Stack Init fffff88004073db0 Current fffff88004073080
Base fffff88004074000 Limit fffff8800406e000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP RetAddr : Args to Child : Call Site
fffff880040730c0 fffff800014cfbd2 : 000000000000012f fffffa8002479b50 0000000004073200 0000000000000008 : nt!KiSwapContext+0x7a
fffff88004073200 fffff800014e0f8f : fffff880040736d0 fffffa8001393040 0000000000000000 fffff88004073430 : nt!KiCommitThreadWait+0x1d2
fffff88004073290 fffff8800142d3ff : fffffa8001382400 0000000000000000 fffff8a0001d0c00 0000000000000000 : nt!KeWaitForSingleObject+0x19f
fffff88004073330 fffff88001425fc6 : fffff880040736d0 fffffa800264f540 fffff8a0001d0c70 0000000000000000 : Ntfs!NtfsNonCachedIo+0x23f
fffff88004073500 fffff88001427a68 : fffff880040736d0 fffffa800264f540 fffff88004073801 fffffa80018dac01 : Ntfs!NtfsCommonRead+0x7a6
fffff880040736a0 fffff88001202bcf : fffffa800264f808 fffffa800264f540 fffffa80018dacc0 0000000000000000 : Ntfs!NtfsFsdRead+0x1b8
fffff880040738b0 fffff880012016df : fffffa8001382790 fffffa8002479b01 fffffa8001382700 fffffa800264f540 : fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x24f
fffff88004073940 fffff800015014f5 : fffffa800264f560 fffffa80013b7810 fffffa8001aed670 fffff8000164ce80 : fltmgr!FltpDispatch+0xcf
fffff880040739a0 fffff80001500fc9 : 0000000000000001 0000000000000001 fffffa8001aed5b0 0000000000000000 : nt!IoPageRead+0x255
fffff88004073a30 fffff800014e785a : 0000000000000000 0000000000000000 ffffffffffffffff fffff80000000000 : nt!MiIssueHardFault+0x255
fffff88004073ac0 fffff800014d82ee : 0000000000000000 000000007746228a 0000000000000001 000000000026f5c0 : nt!MmAccessFault+0x146a
fffff88004073c20 000000007735c35a : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KiPageFault+0x16e (TrapFrame @ fffff88004073c20)
000000000026f540 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : 0x7735c35a

!thread fffffa8000c9e660
THREAD fffffa8000c9e660 Cid 0004.0040 Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 0
IRP List:
fffffa800171f4d0: (0006,0310) Flags: 00060043 Mdl: fffffa8001394900
Not impersonating
DeviceMap fffff8a000008b30
Owning Process fffffa8000c85040 Image: System
Attached Process N/A Image: N/A
Wait Start TickCount 123972667 Ticks: 33426534 (6:00:50:57.272)
Context Switch Count 766156
UserTime 00:00:00.000
KernelTime 6 Days 00:50:58.411
Win32 Start Address nt!ExpWorkerThread (0xfffff800014e3730)
Stack Init fffff8800213ffb0 Current fffff88001f57380
Base fffff88002140000 Limit fffff8800213a000 Call 0
Priority 13 BasePriority 13 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP RetAddr : Args to Child : Call Site
0000000000000000 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : 0x0

But I don’t quite know where to go from here…

Thanks

James

“James Harper” wrote in message news:xxxxx@ntdev…

I have a server that is locked so hard the mouse doesn’t even move… what
conclusions can I draw from that, in particular does it tell me anything
about what IRQL things are stuck at?

Generally that means that you have a livelock and your processors are hung
at IRQL >= DISPATCH_LEVEL (thus preventing DPCs from running, which gives
you the hard hang behavior). !locks won’t help you much here as that is only
useful for ERESOURCE deadlocks, which by definition have to happen at IRQL <
DISPATCH_LEVEL (they’re wait locks, thus they are not acquirable at raised
IRQL).
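
One rough way to check this against the !thread output in the original post is to convert the tick counts into wall-clock time and compare them with the KernelTime of the thread that is RUNNING on processor 0. The sketch below is only illustrative. It assumes a clock tick of 156001 units of 100 ns (about 15.6 ms), which is a common value but is not stated anywhere in this thread, and the helper name ticks_to_timedelta is made up for the example.

```python
from datetime import timedelta

# Assumption: one clock tick is 156001 units of 100 ns (about 15.6 ms).
# This value is not given in the thread; verify it on the target system.
TICK_100NS = 156001


def ticks_to_timedelta(ticks, tick_100ns=TICK_100NS):
    """Convert a tick count from !thread into a timedelta (microsecond precision)."""
    return timedelta(microseconds=(ticks * tick_100ns) // 10)


# Values copied from the two !thread outputs in the original post.
waiting_ticks = 33426535   # smss.exe thread, waiting under Ntfs!NtfsNonCachedIo
running_ticks = 33426534   # System worker thread, RUNNING on processor 0
running_kernel_time = timedelta(days=6, minutes=50, seconds=58, milliseconds=411)

waiting_elapsed = ticks_to_timedelta(waiting_ticks)
running_elapsed = ticks_to_timedelta(running_ticks)

print("smss.exe thread, time spent waiting:  ", waiting_elapsed)
print("worker thread, time since wait start: ", running_elapsed)
print("worker thread, accumulated KernelTime:", running_kernel_time)
print("difference:", abs(running_kernel_time - running_elapsed))
```

Under that assumption both tick counts come out at a little over six days, and the worker thread's KernelTime is within a couple of seconds of its elapsed time. That would suggest the thread has been on the CPU for essentially the whole interval, which fits a processor spinning at raised IRQL better than a thread blocked on a wait lock.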

The key here will be to look at the state of the individual processors in
the system. You can do this in WinDBG with:

!running -ti

NB: Before doing this, make sure that you’re in the default trap context
(execute “.trap” with no parameters). !running has a bug where, if you
change the default context, it will show THAT context for all processors,
which can blow your mind if you’re not paying attention :)

-scott
