Debugging system shutdown hang

I’ve been asked to look at a W2K3 system that fails to shut down.

We’ve generated a kernel crash dump and it looked pretty uninteresting. None
of the thread stacks for any process seemed to be doing much (very shallow
call stacks), except waiting for some work to do. !irpfind found some active
IRPs, but none seemed to be in any non-standard system drivers. The system
disk must still be online, as NMI-initiated crash dumps work.

The system has a WHQL-signed NDIS IM VLAN driver, which may or may not have
anything to do with the issue, and a few custom user services. The app
developers tell me there is no chance they are cancelling the shutdown.

Are there any docs that might explain how the system shutdown works in
detail, and how I would tell if some user-mode app is doing something bad?

Is there any ETW tracing that might be useful? I know W2K3 has much more
limited tracing than W2K8 and later.

It’s also not easily reproducible in front of me, so I pretty much have to
send instructions off to a site and then get back the results.

Is this the kind of issue one could pay Microsoft support to debug and
expect an efficient analysis?

Jan

I’d try !poaction to see the state of shutdown power IRP processing.
Maybe you’d see something interesting there.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars
visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

!poaction was not very interesting, or maybe it actually says the kernel
thinks it’s not shutting down.

1: kd> !poaction
PopAction: fffff800011cc5c0
  State..........: 0 - Idle
  Updates........: 0
  Action.........: None
  Lightest State.: Unspecified
  Flags..........: 0
  Irp minor......: ??
  System State...: Unspecified
  Hiber Context..: 0000000000000000

No Device State present


Maybe it says that this is the normal state when the system is running. It
looks like the power IRP hasn’t been sent yet.
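
Since the results in this thread come back from a remote site as debugger
logs, a small script could do a first pass over each captured !poaction
listing. The following is only a sketch: the function names are
hypothetical, and the rule "State is Idle and Action is None means no power
action has started" is an assumption taken from the reading above, not a
documented contract of the debugger extension.

```python
import re

# Sample text copied from the !poaction output posted in this thread.
SAMPLE = """\
PopAction: fffff800011cc5c0
  State..........: 0 - Idle
  Updates........: 0
  Action.........: None
  Lightest State.: Unspecified
  Flags..........: 0
  Irp minor......: ??
  System State...: Unspecified
  Hiber Context..: 0000000000000000
"""

# A field line is a label, optional dot padding (or a mangled ellipsis
# character from mail clients), a colon, then the value.
FIELD_RE = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)[.\u2026]*:\s*(.*)$")


def parse_poaction(text):
    """Return a dict mapping each !poaction field label to its value."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def power_action_pending(fields):
    """Assumed heuristic: anything other than Idle/None counts as pending."""
    idle = fields.get("State", "").endswith("Idle")
    no_action = fields.get("Action") == "None"
    return not (idle and no_action)


if __name__ == "__main__":
    parsed = parse_poaction(SAMPLE)
    print("power action pending:", power_action_pending(parsed))
```

If the heuristic reports nothing pending, as it does for the listing above,
the hang is presumably earlier than power IRP dispatch, which would point
the investigation toward whatever runs before the kernel starts the power
action.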

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]



Jan, have you let it sit for very long? I see much the same thing on a Win7
64-bit installation, and it can be as frustrating as hell. If I let it sit
long enough, it finally crashes after about 40 minutes with a BSOD stating
that an IRP was held too long, and the fault bucket points to the NDIS
stack. That is all in a post-mortem dump.

Gary G. Little

H (952) 223-1349

C (952) 454-4629

xxxxx@comcast.net
