System hang on unload (XP Only)

Hello all
I’m trying to trace a problem where the system hangs on shutdown after our file system minifilter driver has been loaded. Note that the problem still persists even after our driver has been unloaded prior to shutdown but not if our driver was never installed.

I’ve traced down through all the ExitWindowsEx calls to the nt!NtShutdownSystem call and the thread appears to be waiting on a WaitForSingleObject call after calling NtSetSystemPowerState. Dumping the parameters for the call to KeWaitForSingleObject though it appears that they are all NULL except for the second which is set to 5. Can anbody suggest what might be going wrong here or has anybody seen something similar before ??

Cheers

Mark

0: kd> !thread 896bd220
THREAD 896bd220 Cid 0278.027c Teb: 7ffdf000 Win32Thread: e18984b0 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
896bd3bc Semaphore Limit 0x2
IRP List:
89af67c0: (0006,0094) Flags: 00000800 Mdl: 00000000
Not impersonating
DeviceMap e10092b0
Owning Process 0 Image:
Attached Process 896201c8 Image: winlogon.exe
Wait Start TickCount 654398 Ticks: 37333 (0:00:09:43.328)
Context Switch Count 9284 LargeStack
UserTime 00:00:00.578
KernelTime 00:00:08.718
Start Address 0x0103e5e1
Stack Init ba158000 Current ba1579bc Base ba158000 Limit ba154000 Call 0
Priority 15 BasePriority 15 PriorityDecrement 0 DecrementCount 0
ChildEBP RetAddr Args to Child
ba1579d4 80503850 896bd290 896bd220 804fb078 nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
ba1579e0 804fb078 896bd38c 896bd220 896bd254 nt!KiSwapThread+0x8a (FPO: [0,0,0])
ba157a08 80502f8c 00000000 00000005 00000000 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])
ba157a20 804ff8c4 00000000 00000000 00000000 nt!KiSuspendThread+0x18 (FPO: [3,0,0])
ba157a68 806e7ef2 00000000 00000000 ba157a80 nt!KiDeliverApc+0x124 (FPO: [Non-Fpo])
ba157a68 806e7adf 00000000 00000000 ba157a80 hal!HalpApcInterrupt+0xc6 (FPO: [0,2] TrapFrame @ ba157a80)
ba157af0 804fd5a1 00000000 001f0003 00000000 hal!KeReleaseQueuedSpinLock+0x37 (FPO: [0,0,0])
ba157b10 8065434a 896bd220 ba157bf8 ba157c7c nt!KeSuspendThread+0xa9 (FPO: [Non-Fpo])
ba157be4 8054167c 00000006 00000004 c0000004 nt!NtSetSystemPowerState+0x532 (FPO: [Non-Fpo])
ba157be4 80501021 00000006 00000004 c0000004 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ ba157bf8)
ba157c68 80653e7d 00000006 00000004 c0000004 nt!ZwSetSystemPowerState+0x11 (FPO: [3,0,0])
ba157d44 80612fc2 00000006 00000004 c0000004 nt!NtSetSystemPowerState+0x65 (FPO: [Non-Fpo])
ba157d58 8054167c 00000002 0006fc7c 7c90e514 nt!NtShutdownSystem+0x32 (FPO: [Non-Fpo])
ba157d58 7c90e514 00000002 0006fc7c 7c90e514 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ ba157d64)
0006fc7c 00000000 00000000 00000000 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])

Possible causes:

You did not register an Unload function.
If you handle IRP’s you still have an IRP open.
Your filter is attached to, or has a handle open against it.

Hey ? not to bad for someone under the influence of 3 Hydrocodone after oral surgery 5 hours ago ? of course I may have just confirmed everyone’s suspicion that I’m on something ? oh well

Gary Little
H (952) 223-1349
C (952) 454-4629
xxxxx@comcast.net

On Mar 19, 2012, at 5:52 PM, xxxxx@amberleigh.eclipse.co.uk wrote:

Hello all
I’m trying to trace a problem where the system hangs on shutdown after our file system minifilter driver has been loaded. Note that the problem still persists even after our driver has been unloaded prior to shutdown but not if our driver was never installed.

I’ve traced down through all the ExitWindowsEx calls to the nt!NtShutdownSystem call and the thread appears to be waiting on a WaitForSingleObject call after calling NtSetSystemPowerState. Dumping the parameters for the call to KeWaitForSingleObject though it appears that they are all NULL except for the second which is set to 5. Can anbody suggest what might be going wrong here or has anybody seen something similar before ??

Cheers

Mark

0: kd> !thread 896bd220
THREAD 896bd220 Cid 0278.027c Teb: 7ffdf000 Win32Thread: e18984b0 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
896bd3bc Semaphore Limit 0x2
IRP List:
89af67c0: (0006,0094) Flags: 00000800 Mdl: 00000000
Not impersonating
DeviceMap e10092b0
Owning Process 0 Image:
> Attached Process 896201c8 Image: winlogon.exe
> Wait Start TickCount 654398 Ticks: 37333 (0:00:09:43.328)
> Context Switch Count 9284 LargeStack
> UserTime 00:00:00.578
> KernelTime 00:00:08.718
> Start Address 0x0103e5e1
> Stack Init ba158000 Current ba1579bc Base ba158000 Limit ba154000 Call 0
> Priority 15 BasePriority 15 PriorityDecrement 0 DecrementCount 0
> ChildEBP RetAddr Args to Child
> ba1579d4 80503850 896bd290 896bd220 804fb078 nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
> ba1579e0 804fb078 896bd38c 896bd220 896bd254 nt!KiSwapThread+0x8a (FPO: [0,0,0])
> ba157a08 80502f8c 00000000 00000005 00000000 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])
> ba157a20 804ff8c4 00000000 00000000 00000000 nt!KiSuspendThread+0x18 (FPO: [3,0,0])
> ba157a68 806e7ef2 00000000 00000000 ba157a80 nt!KiDeliverApc+0x124 (FPO: [Non-Fpo])
> ba157a68 806e7adf 00000000 00000000 ba157a80 hal!HalpApcInterrupt+0xc6 (FPO: [0,2] TrapFrame @ ba157a80)
> ba157af0 804fd5a1 00000000 001f0003 00000000 hal!KeReleaseQueuedSpinLock+0x37 (FPO: [0,0,0])
> ba157b10 8065434a 896bd220 ba157bf8 ba157c7c nt!KeSuspendThread+0xa9 (FPO: [Non-Fpo])
> ba157be4 8054167c 00000006 00000004 c0000004 nt!NtSetSystemPowerState+0x532 (FPO: [Non-Fpo])
> ba157be4 80501021 00000006 00000004 c0000004 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ ba157bf8)
> ba157c68 80653e7d 00000006 00000004 c0000004 nt!ZwSetSystemPowerState+0x11 (FPO: [3,0,0])
> ba157d44 80612fc2 00000006 00000004 c0000004 nt!NtSetSystemPowerState+0x65 (FPO: [Non-Fpo])
> ba157d58 8054167c 00000002 0006fc7c 7c90e514 nt!NtShutdownSystem+0x32 (FPO: [Non-Fpo])
> ba157d58 7c90e514 00000002 0006fc7c 7c90e514 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ ba157d64)
> 0006fc7c 00000000 00000000 00000000 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Hi Gary
it isn’t that our driver isn’t being unloaded - on the contrary, we still see the problem even if I unload our driver manually and then shut down the system which is what makes it so odd. The system shutdown hangs during the call to NtSetSystemPowerState.

Cheers

Mark

On 20-Mar-2012 01:20, Gary Little wrote:

Possible causes:

You did not register an Unload function.
If you handle IRP’s you still have an IRP open.
Your filter is attached to, or has a handle open against it.

Then the filter won’t unload. The OP wrote that he could unload it.

Hey … not to bad for someone under the influence of 3 Hydrocodone after
oral surgery 5 hours ago … of course I may have just confirmed
everyone’s suspicion that I’m on something … oh well

Ah, yep. Hope you get well soon.

– pa

Gary Little
H (952) 223-1349
C (952) 454-4629
xxxxx@comcast.net mailto:xxxxx
>
>
> On Mar 19, 2012, at 5:52 PM, xxxxx@amberleigh.eclipse.co.uk
> mailto:xxxxx wrote:
>
>> Hello all
>> I’m trying to trace a problem where the system hangs on shutdown after
>> our file system minifilter driver has been loaded. Note that the
>> problem still persists even after our driver has been unloaded prior
>> to shutdown but not if our driver was never installed.
>>
>> I’ve traced down through all the ExitWindowsEx calls to the
>> nt!NtShutdownSystem call and the thread appears to be waiting on a
>> WaitForSingleObject call after calling NtSetSystemPowerState. Dumping
>> the parameters for the call to KeWaitForSingleObject though it appears
>> that they are all NULL except for the second which is set to 5. Can
>> anbody suggest what might be going wrong here or has anybody seen
>> something similar before ??
>>
…</mailto:xxxxx></mailto:xxxxx>

This thread you posted isn’t really that interesting in terms of tracking
down the hang, the real work is probably being done inside the System
process by a worker thread. Try the following to see if you see anything
interesting:

!process 0 f System

Assuming you’re a mini-filter, you can also try the following:

!stacks 2 fltmgr!

Which will show you any threads out there inside Filter Manager.

-scott


Scott Noone
Consulting Associate and Chief System Problem Analyst
OSR Open Systems Resources, Inc.
http://www.osronline.com

wrote in message news:xxxxx@ntdev…

Hi Gary
it isn’t that our driver isn’t being unloaded - on the contrary, we still
see the problem even if I unload our driver manually and then shut down the
system which is what makes it so odd. The system shutdown hangs during the
call to NtSetSystemPowerState.

Cheers

Mark

Thanks Scott - I’ve disassembled NtSetSystemPowerState and you are quite correct - for calls like this that don’t run in the context of the system process, the call to PopGracefulShutdown is queued off via a work item and the thread I posted is then suspended.

The work item is fired and PopGracefulShutdown ultimately, calls IoShutdownSystem where is hangs waiting for an eresource called nt!IopDatabaseResource. This resource is owned by a thread that is executing FltpExWaitForRundownProtectionRelease. Within this function it Initializes an event on the stack and then enters a (presumably) infinite wait since there’s no obvious way that a stack variable could be signalled and it doesn’t pass it elsewhere…

ChildEBP RetAddr Args to Child
ba4f39ec 80503850 89c40518 89c404a8 804fb078 nt!KiSwapContext+0x2f (FPO: [Uses EBP] [0,0,4])
ba4f39f8 804fb078 8956a780 8956a778 00000006 nt!KiSwapThread+0x8a (FPO: [0,0,0])
ba4f3a20 b9ed5aa4 00000000 00000000 00000000 nt!KeWaitForSingleObject+0x1c2 (FPO: [Non-Fpo])
ba4f3a60 b9ed3e3e ba4f3a84 b9ed50b4 8956a778 fltMgr!FltpExWaitForRundownProtectionRelease+0x72 (FPO: [Non-Fpo])
ba4f3a68 b9ed50b4 8956a778 8956aba8 8956aba8 fltMgr!FltpObjectRundownWait+0x10 (FPO: [Non-Fpo])
ba4f3a84 b9ec78f7 8956a7fc 00000008 805266ae fltMgr!FltpFreeVolume+0x9e (FPO: [Non-Fpo])
ba4f3a9c b9ec7a06 8956aba8 00000008 895b0870 fltMgr!FltpCleanupDeviceObject+0x61 (FPO: [Non-Fpo])
ba4f3ab8 b9ecc0f3 895b0870 8055ae50 00fe0030 fltMgr!FltpDetachFromFileSystemDevice+0x44 (FPO: [Non-Fpo])
ba4f3be0 b9ecc4d4 895b0870 00000000 ba4f3c0c fltMgr!FltpFsNotificationActual+0x4f (FPO: [Non-Fpo])
ba4f3bf0 80574e19 895b0870 00000000 895b0870 fltMgr!FltpFsNotification+0x1c (FPO: [Non-Fpo])
ba4f3c0c b1097475 895b0870 895b1950 895b0870 nt!IoUnregisterFileSystem+0x3f (FPO: [Non-Fpo])
ba4f3c24 b10873c0 895b1950 895b0870 b108df00 mrxdav!RxUnstart+0x31 (FPO: [Non-Fpo])
ba4f3c70 b108ece0 895b1950 895b1963 0014038c mrxdav!RxStopMinirdr+0xe6 (FPO: [Non-Fpo])
ba4f3cac b108f173 895b1950 00000000 895b1950 mrxdav!MRxDAVOuterStop+0xe6 (FPO: [Non-Fpo])
ba4f3ce4 b1097e54 005b1950 895b19dc 896be758 mrxdav!MRxDAVDevFcbXXXControlFile+0x105 (FPO: [Non-Fpo])
ba4f3cfc b1098092 895b1950 0014038c ba4f3d7c mrxdav!RxXXXControlFileCallthru+0x66 (FPO: [Non-Fpo])
ba4f3d0c b10a3087 895b1950 895b19dc 80564820 mrxdav!RxCommonDevFCBFsCtl+0x6a (FPO: [Non-Fpo])
ba4f3d7c 805387cb 895b1950 00000000 89c404a8 mrxdav!RxFspDispatch+0x12b (FPO: [Non-Fpo])
ba4f3dac 805cffa8 895b1950 00000000 00000000 nt!ExpWorkerThread+0xef (FPO: [Non-Fpo])
ba4f3ddc 8054615e 805386dc 00000000 00000000 nt!PspSystemThreadStartup+0x34 (FPO: [Non-Fpo])
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

I’m guessing that this is some obscure way of suspending the thread but why on earth would we suspend a thread that is still holding a global lock (nt!IopDatabaseResource)?

When the system is in a state where the shutdown works correctly (even with our driver loaded), the FltpExWaitForRundownProtectionRelease function is never called. Does anybody know what this function is used for and/or under what circumstances it might be called?

Cheers

Mark

OK, now we’re getting somewhere :slight_smile:

You’ve leaked some kind of mini-filter resource. Could be an object
reference, work item, name info structure, etc. Have you enabled Driver
Verifier on your driver? Not sure how well that works on XP, but on Vista
(where you can’t repro, of course) it should breakpoint and print out any
leaked references in this case. If Verifier doesn’t find anything, you might
need the checked build of fltmgr.sys to track this down.

-scott


Scott Noone
Consulting Associate and Chief System Problem Analyst
OSR Open Systems Resources, Inc.
http://www.osronline.com

wrote in message news:xxxxx@ntdev…

Thanks Scott - I’ve disassembled NtSetSystemPowerState and you are quite
correct - for calls like this that don’t run in the context of the system
process, the call to PopGracefulShutdown is queued off via a work item and
the thread I posted is then suspended.

The work item is fired and PopGracefulShutdown ultimately, calls
IoShutdownSystem where is hangs waiting for an eresource called
nt!IopDatabaseResource. This resource is owned by a thread that is executing
FltpExWaitForRundownProtectionRelease. Within this function it Initializes
an event on the stack and then enters a (presumably) infinite wait since
there’s no obvious way that a stack variable could be signalled and it
doesn’t pass it elsewhere…

ChildEBP RetAddr Args to Child
ba4f39ec 80503850 89c40518 89c404a8 804fb078 nt!KiSwapContext+0x2f (FPO:
[Uses EBP] [0,0,4])
ba4f39f8 804fb078 8956a780 8956a778 00000006 nt!KiSwapThread+0x8a (FPO:
[0,0,0])
ba4f3a20 b9ed5aa4 00000000 00000000 00000000 nt!KeWaitForSingleObject+0x1c2
(FPO: [Non-Fpo])
ba4f3a60 b9ed3e3e ba4f3a84 b9ed50b4 8956a778
fltMgr!FltpExWaitForRundownProtectionRelease+0x72 (FPO: [Non-Fpo])
ba4f3a68 b9ed50b4 8956a778 8956aba8 8956aba8
fltMgr!FltpObjectRundownWait+0x10 (FPO: [Non-Fpo])
ba4f3a84 b9ec78f7 8956a7fc 00000008 805266ae fltMgr!FltpFreeVolume+0x9e
(FPO: [Non-Fpo])
ba4f3a9c b9ec7a06 8956aba8 00000008 895b0870
fltMgr!FltpCleanupDeviceObject+0x61 (FPO: [Non-Fpo])
ba4f3ab8 b9ecc0f3 895b0870 8055ae50 00fe0030
fltMgr!FltpDetachFromFileSystemDevice+0x44 (FPO: [Non-Fpo])
ba4f3be0 b9ecc4d4 895b0870 00000000 ba4f3c0c
fltMgr!FltpFsNotificationActual+0x4f (FPO: [Non-Fpo])
ba4f3bf0 80574e19 895b0870 00000000 895b0870 fltMgr!FltpFsNotification+0x1c
(FPO: [Non-Fpo])
ba4f3c0c b1097475 895b0870 895b1950 895b0870 nt!IoUnregisterFileSystem+0x3f
(FPO: [Non-Fpo])
ba4f3c24 b10873c0 895b1950 895b0870 b108df00 mrxdav!RxUnstart+0x31 (FPO:
[Non-Fpo])
ba4f3c70 b108ece0 895b1950 895b1963 0014038c mrxdav!RxStopMinirdr+0xe6 (FPO:
[Non-Fpo])
ba4f3cac b108f173 895b1950 00000000 895b1950 mrxdav!MRxDAVOuterStop+0xe6
(FPO: [Non-Fpo])
ba4f3ce4 b1097e54 005b1950 895b19dc 896be758
mrxdav!MRxDAVDevFcbXXXControlFile+0x105 (FPO: [Non-Fpo])
ba4f3cfc b1098092 895b1950 0014038c ba4f3d7c
mrxdav!RxXXXControlFileCallthru+0x66 (FPO: [Non-Fpo])
ba4f3d0c b10a3087 895b1950 895b19dc 80564820 mrxdav!RxCommonDevFCBFsCtl+0x6a
(FPO: [Non-Fpo])
ba4f3d7c 805387cb 895b1950 00000000 89c404a8 mrxdav!RxFspDispatch+0x12b
(FPO: [Non-Fpo])
ba4f3dac 805cffa8 895b1950 00000000 00000000 nt!ExpWorkerThread+0xef (FPO:
[Non-Fpo])
ba4f3ddc 8054615e 805386dc 00000000 00000000 nt!PspSystemThreadStartup+0x34
(FPO: [Non-Fpo])
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

I’m guessing that this is some obscure way of suspending the thread but why
on earth would we suspend a thread that is still holding a global lock
(nt!IopDatabaseResource)?

When the system is in a state where the shutdown works correctly (even with
our driver loaded), the FltpExWaitForRundownProtectionRelease function is
never called. Does anybody know what this function is used for and/or under
what circumstances it might be called?

Cheers

Mark

Hi Scott
Many thanks for you assistance. Turned out to be an elevated rundown count on the FLT_VOLUME object which I have now traced to missing FltObjectDereference calls after calling FltEnumerateVolumes. At least it kept me busy for a couple of days.

Cheers

Mark

Woohoo!

Too bad OACR/PREfast didn’t catch this, it’s exactly the sort of thing you’d
want to find through static analysis.

-scott


Scott Noone
Consulting Associate and Chief System Problem Analyst
OSR Open Systems Resources, Inc.
http://www.osronline.com
wrote in message news:xxxxx@ntdev…

Hi Scott
Many thanks for you assistance. Turned out to be an elevated rundown count
on the FLT_VOLUME object which I have now traced to missing
FltObjectDereference calls after calling FltEnumerateVolumes. At least it
kept me busy for a couple of days.

Cheers

Mark