NDIS IM driver + dump analysis

Hi,

I have my NDIS Intermediate driver which is crashing when system shuts down or goest into hibernate mode.
Getting the following info on analyzing the dump:
4.000010 fffffa8018b9ab50 ffc8c579 Blocked nt!KiSwapContext+0x7a nt!KiCommitThreadWait+0x1d2
nt!KeWaitForSingleObject+0x19f
ndisPnpNotifyBinding
ndis!ndisPnPNotifyAllTransports+0x18f
ndis!NdisFNetPnPEvent+0xd2
ndis!NdisFNetPnPEvent+0xa0
ndis!NdisFNetPnPEvent+0xa0

The bugcheck is that IRP is waiting for too long and this happens when a shutdown or hibernate is attempted.

Thus the above means that pnp is waiting for some event to complete… And this event can be the completion of send and receive packets.

So, essentially it means that some send and receive operation is pending because of which system state change IRP is not getting completed .

Is my understanding correct here ? Request you to please share your suggestion.

Thanks
Anshul Makkar
www.justkernel.com

Try

!stacks 2 ndis!

and post the output of the stack back-traces.

Also show the output of !ndiskd.protocols

What does your IM driver do when notified of a PnPEvent? Are you returning
NDIS_STATUS_NOT_SUPPORTED and thus being asked to unbind?

The thread back-trace you show below is not showing us anything other than
that NDIS appears to be delivering a PnP event to bound protocols (which in
this case is more than likely your lower protocol edge of your IM driver)
but no other information.

Somewhat hard to guess where the system is stuck.

Oh, and !analyze -v is always welcome too but I suspect it is a 9F bugcheck
that identifies some NIC (adapter) DO as having taken way too long to
respond to a PnP IRP. The precise adapter (Device Object) the PoMGr is
complaining about is sometimes helpful since it lets you know where to start
looking (above it) for the broken behavior.

Good Luck,
Dave Cattley

Thanks David. Below are the details that you have asked for.

kd> !stacks 2 ndis
Proc.Thread .Thread Ticks ThreadState Blocker
[fffff80004060180 Idle]
*** ERROR: Module load completed but symbols could not be loaded for intelppm.sys
[fffffa8018b43040 System]
4.000010 fffffa8018b9ab50 ffc8c579 Blocked nt!KiSwapContext+0x7a
nt!KiCommitThreadWait+0x1d2
nt!KeWaitForSingleObject+0x19f
ndis! ?? ::LNCPHCLB::`string’+0x4f4b = ndisPnPNotifyBinding
ndis!ndisPnPNotifyAllTransports+0x18f
ndis!NdisFNetPnPEvent+0xd2
ndis!NdisFNetPnPEvent+0xa0
ndis!NdisFNetPnPEvent+0xa0
ndis!ndisDevicePnPEventNotifyFiltersAndAllTransports+0x128
ndis!ndisQueryPower+0x2b8
ndis!ndisPowerDispatch+0x19c
nt!PopIrpWorker+0x3c5
nt!PspSystemThreadStartup+0x5a
nt!KiStartSystemThread+0x16

Yes, its a 9F BugCheck.
0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

DRIVER_POWER_STATE_FAILURE (9f)
A driver has failed to complete a power IRP within a specific time (usually 10 minutes).
Arguments:
Arg1: 0000000000000003, A device object has been blocking an Irp for too long a time
Arg2: fffffa8018b874d0, Physical Device Object of the stack
Arg3: fffff80000b7e3d8, nt!TRIAGE_9F_POWER on Win7, otherwise the Functional Device Object of the stack
Arg4: fffffa80245e02d0, The blocked IRP

Debugging Details:

DRVPOWERSTATE_SUBCODE: 3

IMAGE_NAME: ntkrnlmp

DEBUG_FLR_IMAGE_TIMESTAMP: 0

MODULE_NAME: ntkrnlmp

FAULTING_MODULE: fffff880057aa000

DEVICE_OBJECT: fffffa801b317050

DRIVER_OBJECT: fffffa801bc5b360

IRP_ADDRESS: fffffa80245e02d0

Below is the analysis for Dump No.2 . Same crash, but reported from a different user.
kd> stack 2 ndis
nt!KiCommitThreadWait+0x23c
nt!KeWaitForSingleObject+0x1cf
nt!ExWaitForRundownProtectionReleaseCacheAware+0x9d
tcpip!FlPnpEvent+0x26b
ndis!ndisInvokeNetPnPEvent+0x42
ndis!ndisDeliverNetPnPEventSynchronously+0x47
ndis!ndisPnPNotifyBinding+0x84
ndis!ndisPnPNotifyAllTransports+0x3b0
ndis!ndisPauseMiniportStack+0x94
ndis!ndisPrepForLowPower+0x80
ndis!ndisSetSystemPower+0x131
ndis!ndisSetPower+0x99
ndis!ndisPowerDispatch+0x99
nt!PopIrpWorker+0x284
nt!PspSystemThreadStartup+0x59
nt!KiStartSystemThread+0x16

kd> !analyze -v


Bugcheck Analysis



DRIVER_POWER_STATE_FAILURE (9f)
A driver has failed to complete a power IRP within a specific time (usually 10 minutes).
Arguments:
Arg1: 0000000000000003, A device object has been blocking an Irp for too long a time
Arg2: fffffa80069e9060, Physical Device Object of the stack
Arg3: fffff803decdb7e0, nt!TRIAGE_9F_POWER on Win7, otherwise the Functional Device Object of the stack
Arg4: fffffa80095cecf0, The blocked IRP

Debugging Details:
------------------

Page 2183c9 not present in the dump file. Type “.hh dbgerr004” for details

USER_NAME: drspott

DRVPOWERSTATE_SUBCODE: 3

IMAGE_NAME: ntkrnlmp

DEBUG_FLR_IMAGE_TIMESTAMP: 0

MODULE_NAME: ntkrnlmp

FAULTING_MODULE: fffff88004077000

DEVICE_OBJECT: fffffa80095c3050

DRIVER_OBJECT: fffffa8009808e60

IRP_ADDRESS: fffffa80095cecf0

I believe the root cause is same in both the scenarios.

kd> !ndiskd.protocols
fffffa8010ce8740 - PACKETDRIVER

fffffa800ea60c50 - RSPNDR
fffffa8009b7f010 - Bluetooth Device (Personal Area Network)
fffffa80078a9010 - Microsoft Hosted Network Virtual Adapter
fffffa800b97bb40 - Intel(R) Centrino(R) Ultimate-N 6300 AGN - MyDriver Bridged Networking Driver Miniport
fffffa8007818c30 - Intel(R) 82579LM Gigabit Network Connection - MyDriver Bridged Networking Driver Miniport

fffffa800eb75010 - NDISUIO
fffffa800eb73530 - Bluetooth Device (Personal Area Network)
fffffa800b388c30 - Microsoft Hosted Network Virtual Adapter
fffffa8007574c30 - Intel(R) Centrino(R) Ultimate-N 6300 AGN - MyDriver Bridged Networking Driver Miniport
fffffa8009f4a810 - Intel(R) 82579LM Gigabit Network Connection - NyDriver Bridged Networking Driver Miniport

fffffa800eb863f0 - LLTDIO
fffffa800a196700 - Bluetooth Device (Personal Area Network)
fffffa800b6d4010 - Microsoft Hosted Network Virtual Adapter
fffffa800b314c30 - Intel(R) Centrino(R) Ultimate-N 6300 AGN - MyDriver Bridged Networking Driver Miniport
fffffa8007994820 - Intel(R) 82579LM Gigabit Network Connection - MyDriver Bridged Networking Driver Miniport

fffffa800eb5c360 - VMNETBRIDGE
fffffa800b42d630 - Bluetooth Device (Personal Area Network)
fffffa800b872590 - Microsoft Hosted Network Virtual Adapter
fffffa800a769010 - Intel(R) Centrino(R) Ultimate-N 6300 AGN - MyDriver Bridged Networking Driver Miniport
fffffa800b4a93e0 - Intel(R) 82579LM Gigabit Network Connection - MyDriver Bridged Networking Driver Miniport

fffffa8009bff4a0 - NDPROXY

fffffa8009b6a010 - RASPPPOE

fffffa800698d520 - MYDriverINT_FilterDriver
fffffa800b402c30 - Intel(R) Centrino(R) Ultimate-N 6300 AGN
fffffa800bcb4550 - Intel(R) 82579LM Gigabit Network Connection

fffffa80097c2010 - NDISWANLEGACY

fffffa80097d0c50 - NDISWAN

fffffa8006953b10 - WANARPV6
fffffa800dd206a0 - WAN Miniport (IPv6)

fffffa80097d9b30 - WANARP
fffffa800d0c47f0 - WAN Miniport (IP)

fffffa8009415880 - TCPIP6TUNNEL
fffffa800c454aa0 - Microsoft ISATAP Adapter #21
fffffa800c49e270 - Teredo Tunneling Pseudo-Interface

fffffa8009415c50 - TCPIPTUNNEL

fffffa8009417890 - TCPIP6
fffffa80074499c0 - Bluetooth Device (Personal Area Network)
fffffa8009aa3c30 - Microsoft Hosted Network Virtual Adapter
fffffa800b1d5930 - Intel(R) Centrino(R) Ultimate-N 6300 AGN - MyDriver Bridged Networking Driver Miniport
fffffa800b41a9c0 - Intel(R) 82579LM Gigabit Network Connection - MyDriver Bridged Networking Driver Miniport

fffffa8009417c50 - TCPIP
fffffa800a8c6010 - Bluetooth Device (Personal Area Network)
fffffa8009aeac30 - Microsoft Hosted Network Virtual Adapter
fffffa800a04d010 - Intel(R) Centrino(R) Ultimate-N 6300 AGN - MyDriver Bridged Networking Driver Miniport
fffffa8009aafc30 - Intel(R) 82579LM Gigabit Network Connection - MyDriver Bridged Networking Driver Miniport

Thanks
Anshul Makkar
www.justkernel.com

My next move would be to locate the precise binding (or what NDIS calls
‘opens’) that is preventing the shutdown and to use !ndiskd to tell you
everything it knows about why it might be stuck.

I would also make sure that your IM driver has statistics (counters) that
count every packet indicate, returned, sent, and completed across both edges
and for each interface. Do the same for NDIS requests. That way when
the bugcheck occurs you can look at your per-interface context and go figure
out what is ‘stuck’.

Good Luck,
Dave Cattley

David, please can you explain your first statement . What exactly you mean by “opens” and which ndiskd commands are you referring to. I have already experimented with pendingnbls, mopen, and minport and protocol commands to get the maximum amount of information. Are you looking for any specific piece of information that can be helpful here.

This is a bugcheck that is seen randomly by the users and they wont be ready to test out a test build with increased logging.(I have tried it) So what I have at my disposal is just a dump file based on which I need to fix the issue.

I hope you understand the complexity here and thus not able to figure out the exact reason for the crash.
Please also suggest, if I have missed any step or anything else I should try to get the reason for the crash.

Thanks a lot for replying.

Anshul Makkar
www.justkernel.com

Just to add to the information available, there are no pending OIDs but I could see pendingnbls (got this info from the dump)

> David, please can you explain your first statement . What exactly you mean by “opens” and which ndiskd commands are you referring to. I have already experimented with pendingnbls, mopen, and minport and protocol commands to get the maximum amount of information. Are you looking for any specific piece of information that can be helpful here.

“opens” are what NDIS calls an active binding between a protocol edge and an MAC device (miniport). It is the result of calling NdisOpenAdapter(), etc. This is slighly different than a binding in that a binding is a configuration thing and an open is a runtime thing. I was simply suggesting that you use NDISKD to help explore the runtime state at the time of the issue. > I hope you understand the complexity here and thus not able to figure out the exact reason for the crash. More than you can imagine. I live this nightmare everyday (as I have a few IM drivers I help keep alive). > Please also suggest, if I have missed any step or anything else I should try to get the reason for the crash. You need to assess why NDIS is waiting. From there you need ascertain what specifically it is waiting for. And then you need to figure out what is causing the waited for outcome to be delayed. Good Luck,Dave Cattley

Thanks David for suggestions. Yeah, the main root cause is to find out why NDIS is waiting… I have found that there are some pending nbls , though no OIDs… So it seems some send, receive or return/completed packets are pending with my IM driver because of which NDIS is waiting… But why they are pending, still I am not able to figure it out. I have compared my driver handing of packet with MS sample of passthru, but couldn’t find any difference there.

Planning to release a test build with counters for each send, receive , return/completed packet. Hopefully, I will find someone to test …

Anyhow, below is the details of blocked thread
4.000010 fffffa8006cc1b00 ffba3b2f Blocked nt!KiSwapContext+0x76
nt!KiCommitThreadWait+0x23c
nt!KeWaitForSingleObject+0x1cf
ndis!NdisMSleep+0x59
ndis!ndisDetachFilter+0x33b
ndis!ndisDetachFiltersOnMiniport+0x122
ndis!ndisPnPRemoveDevice+0x226
ndis!NdisIMDeInitializeDeviceInstance+0x72
MyDriver!MyDriverNetFltWinMpDeInitializeDeviceInstance+0x7d
MyDriverlt!MyDriverNetFltWinPtDoUnbinding+0xae
MyDrivert!MyDriverNetFltWinDetachFromInterface+0x51
MyDriver!MyDriverNetFltWinPtUnbindAdapter+0x16
ndis!ndisUnbindProtocol+0x7c9e
ndis!ndisPnPNotifyAllTransports+0xeddc
ndis!ndisPauseMiniportStack+0x94
ndis!ndisPrepForLowPower+0x80
ndis!ndisSetSystemPower+0x131
ndis!ndisSetPower+0x99
ndis!ndisPowerDispatch+0x99
nt!PopIrpWorker+0x284
nt!PspSystemThreadStartup+0x59
nt!KiStartSystemThread+0x16

It clearly shows that it is stuck during uninitialize as some of the operations/packets are pending …

Thanks
Anshul Makkar
www.justkernel.com