Debugging kernel hang/deadlock with crash dumps

Hello,
I have a crash dump file that was generated by manually initiating a crash
dump (hold right Ctrl and press ScrollLock twice). The dump was taken after
some applications hung on a disk I/O operation. We suspect a driver problem
(emcpBase.sys, ecmXXX.sys, etc.), which is why we dumped memory manually.
However, I do not know how to recover live process/thread information from a
dump file. Can anyone suggest a method? I have checked the system PTEs and
they seem to be OK. Any general advice on debugging hangs/deadlocks using
crash dump files would be very much appreciated.

Thanks!

Jicun Zhong

Start with a dump of the system threads (!stacks or !process 0 7). Try
to find the threads for the processes that were hung. Look at the
individual threads, try to identify what they are blocked on and then
look for what should happen in order for the thread to unblock.
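
One way to picture that last step is as a walk along a wait chain: thread A is blocked on something owned by thread B, which is itself blocked on something owned by thread C, and so on. The sketch below is only a toy model in C. The array layout and the names find_root_blocker, NO_THREAD and WAIT_CYCLE are assumptions made for illustration; in a real dump the "who owns what" information has to be read from the debugger output for each thread.

```c
#include <assert.h>
#include <stddef.h>

#define NO_THREAD  (-1)  /* thread is not waiting on another thread */
#define WAIT_CYCLE (-2)  /* the chain loops back on itself: a deadlock */

/*
 * waits_on[i] holds the index of the thread that owns whatever thread i
 * is blocked on, or NO_THREAD if thread i is not blocked on another
 * thread (for example, it is waiting for an I/O completion instead).
 * Every other entry is assumed to be a valid index below count.
 *
 * Returns the index of the thread at the end of the chain (the one to
 * look at first), or WAIT_CYCLE if the chain never ends.
 */
int find_root_blocker(const int *waits_on, size_t count, int start)
{
    assert(start >= 0 && (size_t)start < count);

    int current = start;
    /* A chain without a cycle visits at most count threads. */
    for (size_t steps = 0; steps < count; ++steps) {
        int next = waits_on[current];
        if (next == NO_THREAD)
            return current;
        current = next;
    }
    return WAIT_CYCLE;
}
```

If the walk ends at a thread that is waiting on something other than another thread, that thread's stack is usually the place to keep digging; if it reports a cycle, the threads on the chain are holding each other up.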

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

Looking forward to seeing you at the Next OSR File Systems Class October
18, 2004 in Silicon Valley!


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: xxxxx@osr.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

Stupid question: Can’t you debug this by using WinDBG connected to the
target machine?

It’s easier that way to actually figure out what’s going on.

Another hint: if you put a timeout on your waits, then when the code hangs
in a deadlock of some sort it ends up looping around in your own code rather
than “sitting in the OS”, which can be very helpful. Of course, the two
approaches can be combined, such as:

    if (!waitForSomething(10))
    {
        while (!waitForSomething(0))
            ; /* Do nothing */
    }

That way the wait gets 10 seconds to be satisfied by the event, and after
that we loop around trying to grab it.

This is naturally very bad for performance, but it’s much easier to find
which bit of code has hung that way than when all you can see is the OS
sitting idle, doing nothing.
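
A slightly fuller version of the same idea might look like the sketch below. It is a user-mode illustration, not driver code: the wait_fn callback, wait_then_spin, and the fake_event stand-in are all hypothetical names introduced here so the example can run on its own. In a driver the wait would presumably be something like KeWaitForSingleObject with a timeout, but that is an assumption about the code being debugged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical wait primitive: returns true if the event was signaled
 * within timeout_seconds. A timeout of 0 means "poll and return at once".
 */
typedef bool (*wait_fn)(void *ctx, unsigned timeout_seconds);

/*
 * Wait normally for timeout_seconds; if that times out, fall back to
 * polling so the stuck thread is visibly running in this function.
 * Returns the number of polls that failed before the event arrived
 * (0 if the timed wait succeeded).
 */
unsigned long wait_then_spin(wait_fn wait, void *ctx, unsigned timeout_seconds)
{
    unsigned long spins = 0;

    assert(wait != NULL);
    if (!wait(ctx, timeout_seconds)) {
        /* Timed out: burn CPU here so a stack trace points at this loop. */
        while (!wait(ctx, 0))
            ++spins;
    }
    return spins;
}

/* Toy event for demonstration: signaled after a fixed number of calls. */
struct fake_event {
    unsigned calls_until_signaled;
};

bool fake_wait(void *ctx, unsigned timeout_seconds)
{
    struct fake_event *ev = ctx;

    (void)timeout_seconds;  /* the toy event ignores the timeout */
    if (ev->calls_until_signaled == 0)
        return true;
    --ev->calls_until_signaled;
    return false;
}
```

With a real event that is never signaled, the inner loop never exits, which is the point: the thread shows up as busy inside wait_then_spin instead of parked in a wait.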


Mats


Thanks Tony!

Jicun


Hi Mats,
Yes, live debugging with WinDbg would be much better; however, I have no
access to the target machine, let alone the source code. Your hint is very
nice, but it does not help in my case since I cannot add such tracing.
Thanks for the hint!

Jicun
