HELP: How to debug "stuck IRPS" in TDI filter

Hi

Our product contains a TDI filter and a file system filter driver.

Lately, several applications have begun to hang for no known reason. I think
that one of our filters causes some IRPs to “get stuck”, so the user-mode
process never gets a response to its request and hangs too.

These are the symptoms:

-Two computers whose telnet application sometimes hangs when it tries to
disconnect.
-One computer where Excel hung. I don’t know whether Excel does any network
communication, so I don’t know if the problem is in the TDI filter or the
file system filter.
-One computer that runs AutoIt (a QA helper app). AutoIt hung, and
afterwards iexplore, msnm, and outlook all hung too (the application window
freezes right after I start it). The weird thing is that Mozilla Firefox
runs OK!

All the problems occurred after SEVERAL DAYS of uptime with our product
running.

Because most of the problems were in communication apps, we think it’s the
TDI filter.

Any idea how to debug such a problem?
Is there a way to get all the stuck IRPs for a process or a driver? (Maybe
by taking a manual dump and getting them from there?)

Thanks for any help.

Try starting with “!analyze -hang”; that will show you a certain class
of problems. From there, you can use “!stacks” or “!process 0 7” to get
summary or complete information on each thread. If a thread is
blocked, you’ll see what it is blocking on and can work back from there.
Other possibilities include code trying to perform certain operations at
APC level (try “!apc”) or work queue deadlocks (try “!exqueue”).

If you really are looking for outstanding IRPs, use “!irpfind” to locate
them all, but that’s slow over a serial connection.

If you know the specific app that’s hung, try “!process 0 0” to get a
list of processes, then choose the one that represents a known hung
application and do “!process ADDRESS 7”, where ADDRESS is that process’s
address from the list. That will cut down on the relative verbosity of
the information (“!process 0 7” can be intimidating the first time you
do that; typical systems these days have hundreds of threads).

Generally, hangs are one of the easiest classes of problems to track down,
since the system is excellent at keeping track of the dispatcher objects
on which each thread is waiting. Those dispatcher objects in turn often
have ownership information, which leads from the waiting thread to the
owning thread. The owning thread is then likely waiting for something
itself. Typically you find a loop in owning/waiting that establishes the
deadlock. If, however, the threads are using synchronization events
(KEVENT) or other dispatcher objects that do not have ownership
information, you’ll have to do quite a lot more digging to figure it out.
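
To make the owner/waiter walk concrete, here is a minimal user-mode sketch in C of following the chain. It is only a model: the Thread and DispatcherObject structs, the follow_chain function, and the result names are made up for illustration and are not kernel structures or debugger APIs. In practice you do the same walk by hand, looking at each owning thread in the debugger.

```c
#include <stddef.h>

#define NO_OWNER    (-1)  /* object records no owner, e.g. a synchronization event */
#define NOT_WAITING (-1)  /* thread is not blocked on anything */

/* Hypothetical model of a dispatcher object as seen in a thread's wait list. */
typedef struct {
    const char *name;
    int owner;            /* index of the owning thread, or NO_OWNER */
} DispatcherObject;

/* Hypothetical model of a thread and the single object it waits on. */
typedef struct {
    const char *name;
    int waiting_on;       /* index of the object waited on, or NOT_WAITING */
} Thread;

typedef enum {
    CHAIN_RUNNABLE,       /* chain ends at a thread that is not waiting */
    CHAIN_DEADLOCK,       /* chain loops back onto a thread already seen */
    CHAIN_BLACK_HOLE      /* chain ends at an object with no ownership info */
} ChainResult;

/*
 * Follow waiter -> object -> owner links starting at thread `start`.
 * If `last_thread` is non-NULL it receives the index of the last thread
 * examined, i.e. the one to look at next in the debugger.
 */
ChainResult follow_chain(const Thread *threads, int nthreads,
                         const DispatcherObject *objects,
                         int start, int *last_thread)
{
    int t = start;

    /* A chain without a loop can visit at most nthreads distinct threads,
     * so surviving nthreads hops means some thread was reached twice. */
    for (int hops = 0; hops < nthreads; hops++) {
        if (last_thread != NULL) {
            *last_thread = t;
        }
        int obj = threads[t].waiting_on;
        if (obj == NOT_WAITING) {
            return CHAIN_RUNNABLE;
        }
        int owner = objects[obj].owner;
        if (owner == NO_OWNER) {
            return CHAIN_BLACK_HOLE;
        }
        t = owner;
    }
    return CHAIN_DEADLOCK;
}
```

For example, with thread A waiting on a mutant owned by B and B waiting on a mutant owned by A, follow_chain reports CHAIN_DEADLOCK. If B is instead waiting on an event with no owner, it reports CHAIN_BLACK_HOLE and hands back B as the thread that needs the extra digging.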

Regards,

Tony

Tony Mason

Consulting Partner

OSR Open Systems Resources, Inc.

http://www.osr.com

________________________________

From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Zed y
Sent: Tuesday, June 20, 2006 6:23 AM
To: ntdev redirect
Subject: [ntdev] HELP: How to debug "stuck IRPS" in TDI filter

--- Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256 To unsubscribe, visit the
List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Hi

Thanks for your help.

It seems that several of the hung applications are stuck in
nt!KeWaitForSingleObject.
You said it will be complicated because a KEVENT does not contain ownership
information.
Can you still give me a starting point to go from here?

What does the “IRP List” represent?
Is it the list of IRPs currently pending for the thread? (And if not, how do
I get that?)

Thanks.

PROCESS 8555f5c0 SessionId: 0 Cid: 1404 Peb: 7ffde000 ParentCid: 0b68
DirBase: 06c41360 ObjectTable: e1929e30 HandleCount: 353.
Image: IEXPLORE.EXE
VadRoot 8570ac70 Vads 188 Clone 0 Private 2301. Modified 20. Locked 0.
DeviceMap e2da3888
Token e386b500
ElapsedTime 00:50:27.868
UserTime 00:00:00.312
KernelTime 00:00:00.890
QuotaPoolUsage[PagedPool] 134756
QuotaPoolUsage[NonPagedPool] 9296
Working Set Sizes (now,min,max) (5566, 50, 345) (22264KB, 200KB, 1380KB)
PeakWorkingSetSize 5575
VirtualSize 84 Mb
PeakVirtualSize 95 Mb
PageFaultCount 7374
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 2374

THREAD 85756a18 Cid 1404.1700 Teb: 7ffdd000 Win32Thread: e419be10 WAIT: (Executive) KernelMode Non-Alertable
f6cdceb8 Mutant - owning thread 85824a28
IRP List:
86109e00: (0006,01fc) Flags: 00000884 Mdl: 00000000
853aa2d8: (0006,01fc) Flags: 00000884 Mdl: 00000000
Not impersonating
DeviceMap e2da3888
Owning Process 8555f5c0 Image: IEXPLORE.EXE
Wait Start TickCount 18574888 Ticks: 137357 (0:00:35:46.203)
Context Switch Count 6490 LargeStack
UserTime 00:00:00.0281
KernelTime 00:00:00.0734
Start Address 0x7c810867
Win32 Start Address 0x00402451
Stack Init a9cac000 Current a9cab664 Base a9cac000 Limit a9ca6000 Call 0
Priority 10 BasePriority 8 PriorityDecrement 0 DecrementCount 16
ChildEBP RetAddr Args to Child
a9cab67c 80502b17 85756a88 85756a18 804fad6c nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
a9cab688 804fad6c 00000000 86109e00 86109e00 nt!KiSwapThread+0x6b (FPO: [0,0,0])
a9cab6b0 f6cdb396 00000000 00000000 00000000 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])
a9cab6c8 f6cdd233 86127630 e101c0e0 e327c5a8 sysaudio!GrabMutex+0x11 (FPO: [0,0,0])
a9cab6dc f6a71077 8603ff08 86109f48 86109e10 sysaudio!CFilterInstance::FilterDispatchCreate+0x46 (FPO: [Non-Fpo])
a9cab700 804eeeb1 8603ff08 00000000 86109e00 ks!DispatchCreate+0xc7 (FPO: [Non-Fpo])
a9cab710 80581eba 85b99018 8559cb8c a9cab8a8 nt!IopfCallDriver+0x31 (FPO: [0,0,0])
a9cab7f0 805bdd08 85b99030 00000000 8559cae8 nt!IopParseDevice+0xa58 (FPO: [Non-Fpo])
a9cab868 805ba390 00000000 a9cab8a8 00000040 nt!ObpLookupObjectName+0x53c (FPO: [Non-Fpo])
a9cab8bc 80574e37 00000000 00000000 00000600 nt!ObOpenObjectByName+0xea (FPO: [Non-Fpo])
a9cab938 805757ae a9caba34 c0100000 a9cab9d4 nt!IopCreateFile+0x407 (FPO: [Non-Fpo])
a9cab994 aa14881b a9caba34 c0100000 a9cab9d4 nt!IoCreateFile+0x8e (FPO: [Non-Fpo])
a9cab9fc aa148861 e464b1a8 a9caba34 853aa36c wdmaud!OpenDevice+0x56 (FPO: [Non-Fpo])
a9caba20 aa14d946 a9caba34 a9caba38 a9caba38 wdmaud!OpenSysAudio+0x36 (FPO: [Non-Fpo])
a9caba3c aa149de1 853aa2e8 853e6818 86083d30 wdmaud!kmxlOpenSysAudio+0x1d (FPO: [Non-Fpo])
a9caba5c 804eeeb1 860f86d0 853aa2d8 853aa2d8 wdmaud!SoundDispatchCreate+0x86 (FPO: [Non-Fpo])
a9caba6c 80581eba 8600b0c8 857f84dc a9cabc04 nt!IopfCallDriver+0x31 (FPO: [0,0,0])
a9cabb4c 805bdd08 8600b0e0 00000000 857f8438 nt!IopParseDevice+0xa58 (FPO: [Non-Fpo])
a9cabbc4 805ba390 00000000 a9cabc04 00000040 nt!ObpLookupObjectName+0x53c (FPO: [Non-Fpo])
a9cabc18 80574e37 00000000 00000000 65764501 nt!ObOpenObjectByName+0xea (FPO: [Non-Fpo])
a9cabc94 805757ae 00136f50 c0100080 00136ef0 nt!IopCreateFile+0x407 (FPO: [Non-Fpo])
a9cabcf0 80577e78 00136f50 c0100080 00136ef0 nt!IoCreateFile+0x8e (FPO: [Non-Fpo])
a9cabd30 8054060c 00136f50 c0100080 00136ef0 nt!NtCreateFile+0x30 (FPO: [Non-Fpo])
a9cabd30 7c90eb94 00136f50 c0100080 00136ef0 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ a9cabd64)


KeWaitForSingleObject works for all dispatcher objects, not just
KEVENTs. Further, your specific thread is waiting on a Mutant, which
means it has ownership info (see the “owning thread” information there?).
So this thread is waiting for a mutant owned by another thread; the
question now becomes why that owning thread is not making further
progress.

That’s the wonder of deadlock debugging: it’s a matter of following the
trail of breadcrumbs until you either find the cause (a thread that is
waiting for a resource owned by one of the other threads you’ve already
seen waiting) OR you find a black hole (a synchronization event or other
ownership-less structure).

The IRP list is the list of outstanding I/O operations charged to this
thread (see IoQueueThreadIrp for information; that’s the function that
adds an IRP to this list). Indeed, this list is why the OS has to
use an APC to get back to thread context: the queue is only manipulated
in thread context (so no spin lock is needed, since a thread cannot be
running simultaneously on two processors), and thus removing the IRP from
the list needs to be done in the original thread context.
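
As a rough illustration of what the debugger is walking when it prints “IRP List”, here is a small user-mode model in C of a per-thread list of outstanding requests. Everything here (ModelThread, ModelIrp, list_init, queue_thread_irp, dequeue_thread_irp, outstanding_irps) is a hypothetical name for the sketch; it only mimics the shape of an intrusive doubly linked list and is not the actual kernel implementation. Like the list described above, it takes no lock, on the assumption that only the owning thread ever touches it.

```c
#include <stddef.h>

/* Intrusive doubly linked list node, similar in shape to a LIST_ENTRY. */
typedef struct ListEntry {
    struct ListEntry *next;
    struct ListEntry *prev;
} ListEntry;

/* Hypothetical stand-in for a thread: it owns the head of its IRP list. */
typedef struct {
    int id;
    ListEntry irp_list;
} ModelThread;

/* Hypothetical stand-in for an IRP: the link must be the first member
 * only by convention here; the code never casts from link to IRP. */
typedef struct {
    int id;
    ModelThread *owner;       /* thread the request is charged to */
    ListEntry thread_entry;   /* link in the owner's irp_list */
} ModelIrp;

/* An empty list is a head that points at itself. */
void list_init(ListEntry *head)
{
    head->next = head;
    head->prev = head;
}

/* Charge a request to a thread. Assumed to run in that thread's context. */
void queue_thread_irp(ModelThread *t, ModelIrp *irp)
{
    irp->owner = t;
    irp->thread_entry.next = t->irp_list.next;
    irp->thread_entry.prev = &t->irp_list;
    t->irp_list.next->prev = &irp->thread_entry;
    t->irp_list.next = &irp->thread_entry;
}

/* Remove a completed request. Also assumed to run in the owner's context,
 * which is why completion has to get back to the original thread. */
void dequeue_thread_irp(ModelIrp *irp)
{
    irp->thread_entry.prev->next = irp->thread_entry.next;
    irp->thread_entry.next->prev = irp->thread_entry.prev;
    irp->thread_entry.next = &irp->thread_entry;
    irp->thread_entry.prev = &irp->thread_entry;
    irp->owner = NULL;
}

/* Count what is still outstanding, i.e. what an "IRP List" dump would show. */
size_t outstanding_irps(const ModelThread *t)
{
    size_t n = 0;
    for (const ListEntry *e = t->irp_list.next; e != &t->irp_list; e = e->next) {
        n++;
    }
    return n;
}
```

In this model, a thread that has queued two requests and completed neither shows two entries, much like the two entries listed for the waiting thread in the dump. An entry only disappears once dequeue_thread_irp runs for it.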

But the information you have provided so far means you are on the right
path. Now you need to look at the thread that owns the mutant
(85824a28) and figure out why it isn’t giving up the mutant (probably
because it is waiting for something else).

Regards,

Tony



From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Zed y
Sent: Tuesday, June 20, 2006 9:11 AM
To: ntdev redirect
Subject: Re: [ntdev] HELP: How to debug “stuck IRPS” in TDI filter

I see that the owning thread is waiting on an event… is there any way to
know who created that event?

But hey, this is all in user-mode space; it shouldn’t be able to hang an
application in a way that I can’t terminate it, should it?

0: kd> !thread 85824a28
THREAD 85824a28 Cid 0dc4.1760 Teb: 7ffd6000 Win32Thread: e32e5c50 WAIT: (Suspended) KernelMode Non-Alertable
863ff65c SynchronizationEvent
IRP List:
85870e70: (0006,0190) Flags: 00000034 Mdl: 00000000
85c1a970: (0006,0190) Flags: 00000404 Mdl: 00000000
860f6e70: (0006,0190) Flags: 00000070 Mdl: 00000000
Not impersonating
DeviceMap e2da3888
Owning Process 854516e8 Image: IEXPLORE.EXE
Wait Start TickCount 18261959 Ticks: 450286 (0:01:57:15.718)
Context Switch Count 353220 LargeStack
UserTime 00:00:15.0453
KernelTime 00:00:09.0140
Start Address 0x7c810856
Win32 Start Address 0x77833b19
Stack Init a9302000 Current a93017e8 Base a9302000 Limit a92fc000 Call 0
Priority 10 BasePriority 8 PriorityDecrement 0 DecrementCount 16
*** ERROR: Module load completed but symbols could not be loaded for MidiSyn.sys
ChildEBP RetAddr Args to Child
a9301800 80502b17 85824a98 85824a28 804fad6c nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
a930180c 804fad6c 00000000 863ff65c 00000001 nt!KiSwapThread+0x6b (FPO: [0,0,0])
a9301834 f6aa3811 00000000 00000005 00000000 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])
a9301854 f6aa4198 863ff5b8 00000000 00000001 portcls!CKsShellRequestor::SetDeviceState+0x68 (FPO: [Non-Fpo])
a9301888 f6aa4ace 8587e6c8 00000001 85870e70 portcls!CPortPinDMus::DistributeDeviceState+0x52 (FPO: [Non-Fpo])
a93018a4 f6a6ff4c 853585c4 860b3288 860b3280 portcls!CPortPinDMus::SetDeviceState+0x23f (FPO: [Non-Fpo])
a9301908 f6a6fec9 85870e70 00000003 e46bc828 ks!KspPropertyHandler+0x616 (FPO: [Non-Fpo])
a930192c f6a9a603 85870e70 00000003 e46bc828 ks!KsPropertyHandler+0x19 (FPO: [Non-Fpo])
a9301940 f6aa466d 85870e70 00000003 e46bc828 portcls!PcHandlePropertyWithTable+0x1b (FPO: [Non-Fpo])
