Finding information in a full memory dump of a hung machine

Gyani_Lal · July 28, 2015, 2:39pm

Given a full memory dump, how can I confirm that the machine was in a hung state? Sometimes this is very tricky to determine, and the !process 0 7 output is too large to review reliably.

raj_r · July 28, 2015, 4:22pm

There is a -hang switch available for !analyze.

Quoting the documentation in windbg.chm:

-hang
Generates !analyze hung-application output. Use this parameter when
the target has experienced a bug check or exception, but an analysis
of why an application hung is more relevant to your problem. In kernel
mode, !analyze -hang investigates locks that the system holds and then
scans the DPC queue chain. In user mode, !analyze -hang analyzes the
thread stack to determine whether any threads are blocking
other threads.
Before you run this extension in user mode, consider changing the
current thread to the thread that you think has stopped responding
(that is, hung), because the exception might have changed the current
thread to a different one.


Scott_Noone_OSR · July 29, 2015, 10:00am

A hang can mean so many different things that there’s no one thing you can
look for.

If the system is experiencing a “hard hang” (no keyboard, no mouse, can’t
ping, etc.) then you typically have something that is endlessly running and
starving the CPUs. You can look at the state of each CPU with !running -it.

If the system is still responsive in any way, you probably have a thread (or
many) that is sleeping waiting on some resource to become available (e.g. an
event, a lock, etc.). Unfortunately, most threads in a normal system are
idle and waiting for work to do. This makes it difficult to identify a
thread that is sleeping as a normal part of its operation versus one that is
part of a hang.

If you have ANY higher level description from the user at this point you can
use it to your advantage. For example, a coworker yesterday could no longer
interact with the shell on his test system (specifically, he couldn’t click
on the desktop or the taskbar). I know that these components are controlled
by Explorer, so I ran the following command in the debugger:

!process 0 1F explorer.exe

Scrolling through the output, I noticed that one thread had been waiting on a
disk I/O for 15 seconds, which is generally a sign of bad things.
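
If the output is too large to scroll through, one option is to save it to a log file and filter it offline. The following is only a rough sketch in Python, not part of any debugger tooling: the function name long_waits and the 10-second threshold are made up for illustration, and it assumes the log contains the usual "THREAD <address> ... WAIT:" header lines followed by a "Wait Start TickCount ... Ticks: N (d:hh:mm:ss.mmm)" line. Check the patterns against your own debugger output before relying on it.

```python
import re

# Assumed layout of the saved debugger output; adjust if your log differs.
THREAD_RE = re.compile(r"^\s*THREAD\s+([0-9a-fA-F`]+)")
WAIT_RE = re.compile(r"Ticks:\s*\d+\s*\((\d+):(\d+):(\d+):(\d+)\.(\d+)\)")


def long_waits(log_text, min_seconds=10.0):
    """Return (thread_address, seconds_waiting) pairs, longest wait first.

    Only threads whose header line reports a WAIT state are considered.
    """
    results = []
    current = None
    for line in log_text.splitlines():
        header = THREAD_RE.match(line)
        if header:
            # Remember the thread only if it is in a wait state.
            current = header.group(1) if "WAIT:" in line else None
            continue
        wait = WAIT_RE.search(line)
        if wait and current is not None:
            days, hours, minutes, seconds, frac = wait.groups()
            total = (int(days) * 86400 + int(hours) * 3600
                     + int(minutes) * 60 + int(seconds)
                     + float("0." + frac))
            if total >= min_seconds:
                results.append((current, total))
            current = None
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


# Illustrative text only; the addresses and tick counts are invented.
SAMPLE = """\
        THREAD ffffe00012345080  Cid 0a1c.0b2c  WAIT: (Executive) KernelMode Non-Alertable
        Wait Start TickCount      2349236        Ticks: 960 (0:00:00:15.000)
        THREAD ffffe00012346080  Cid 0a1c.0b30  WAIT: (UserRequest) UserMode Non-Alertable
        Wait Start TickCount      2350100        Ticks: 45 (0:00:00:00.703)
"""

if __name__ == "__main__":
    for address, seconds in long_waits(SAMPLE):
        print(address, seconds)
```

A long wait is not proof of a problem, since plenty of healthy threads sleep for a long time. The filter only narrows the list of threads worth looking at by hand.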

If you have NO help, you need to start looking through all threads for
something “interesting”. I typically use !stacks 2 for this. I also ported
the !uniqstack command to kernel mode for exactly these situations. It goes
through every thread in the system and only shows you threads that have
unique call chains. This is usually pretty good at only showing you the
threads that are different, which are usually the ones that are interesting
in analyzing a hang:

http://www.osronline.com/OsrDown.cfm/apexts.zip?name=apexts.zip&id=559
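
To make the "unique call chains" idea concrete, here is a minimal sketch in Python. It is not how the extension above is implemented; it only illustrates the grouping, assuming you have already extracted each thread's stack as a list of frame names. The function name group_by_call_chain, the thread labels, and the sample stacks are all hypothetical.

```python
from collections import defaultdict


def group_by_call_chain(stacks):
    """Group threads that share an identical call chain.

    stacks maps a thread label to its list of frames (top of stack first).
    Returns (call_chain, [threads]) pairs, rarest call chain first, since
    the unusual stacks are often the interesting ones in a hang.
    """
    groups = defaultdict(list)
    for thread, frames in stacks.items():
        groups[tuple(frames)].append(thread)
    return sorted(groups.items(), key=lambda item: (len(item[1]), item[0]))


# Illustrative data only, not taken from a real dump.
SAMPLE_STACKS = {
    "t1": ["nt!KiSwapContext", "nt!KeRemoveQueueEx", "nt!ExpWorkerThread"],
    "t2": ["nt!KiSwapContext", "nt!KeRemoveQueueEx", "nt!ExpWorkerThread"],
    "t3": ["nt!KiSwapContext", "nt!KeWaitForSingleObject",
           "nt!IopSynchronousServiceTail"],
}

if __name__ == "__main__":
    for chain, threads in group_by_call_chain(SAMPLE_STACKS):
        print(len(threads), "thread(s):", ", ".join(threads))
        for frame in chain:
            print("    " + frame)
```

Here the two idle worker threads collapse into a single entry, and the lone thread blocked in a synchronous I/O path is listed first.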

Lastly, sometimes a user-perceived hang can really just be the result of horrid
performance. If, after analyzing the state of the machine, it really doesn’t
appear to be a hang, you can try Xperf and WPA. Be sure that you don’t have a
hang first, though, as these tools can become a serious rat hole.

Good luck!

-scott
OSR
@OSRDrivers
