Win7x86 NDIS hang during shutdown

I have an interesting crash dump, manually generated following a hang at shutdown.
I know what the problem is, but I don’t know why.
The code in ‘brickwan’ is part of my miniport driver.
Obviously my code in brickwan::sendrecv.cpp::SendNetBuffers() is grabbing a spin lock and is still holding it when the OS later calls
brickwan::callmanager.cpp::BrickWanCmDeleteVc() to tear down the connection. Therefore BrickWanCmDeleteVc() cannot grab the same spin lock, and the system hangs.
One thing I do not understand is why the delete request comes in on the same procedural stack as the SendNetBuffers went out on. Am I violating some contract with NDIS by grabbing the spinlock in SendNetBuffers?

Any insight you could provide would be greatly appreciated.

I have a full manual crash dump; here is a snapshot of some output from !running:

0: kd> !running -ti
System Processors (0000000f)
Idle Processors (0000000d)
ChildEBP RetAddr
8ed30a74 8ae5c93c hal!KeAcquireSpinLockRaiseToSynch+0x3f
8ed30a80 8ae56bef Wdf01000!FxSpinLockAcquireLock+0x11
8ed30a98 95368b96 Wdf01000!imp_WdfSpinLockAcquire+0xb8
8ed30aa8 95369505 brickwan!WdfSpinLockAcquire+0x16 [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfsync.h @ 245]
8ed30ac4 8b3184e4 brickwan!BrickWanCmDeleteVc+0xf5 [cseleniawindows_cleanmainpcidriversbrickwancallmanager.cpp @ 294]
8ed30ae4 95487f52 ndis!NdisCoDeleteVc+0x1f2
8ed30b00 95484934 NDProxy!DoDerefVcWork+0x6a
8ed30b1c 8b31936c NDProxy!PxCmCloseCall+0xb2
8ed30b40 95319b65 ndis!NdisClCloseCall+0xba
8ed30b64 95315277 ndiswan!DerefVc+0x40
8ed30b7c 8b2ba0a5 ndiswan!ProtoCoSendComplete+0xcd
8ed30ba4 8b2ba008 ndis!ndisMCoSendNetBufferListsCompleteToNdisPackets+0x94
8ed30bb8 9536aa84 ndis!NdisMCoSendNetBufferListsComplete+0x15
8ed30c18 9536d416 brickwan!SendNetBuffers+0x444 [mainpcidriversbrickwansendrecv.cpp @ 396]
8ed30c50 83022056 brickwan!PciDeviceTxThread+0x166 [mainpcidriversbrickwandevice.cpp @ 867]
8ed30c90 82eca1a9 nt!PspSystemThreadStartup+0x9e
00000000 00000000 nt!KiThreadStartup+0x19

In general you shouldn’t hold a spinlock when calling NDIS (or probably any other subsystem) unless you know that the call won’t call you back somehow. NdisMIndicateReceivePacket -> MiniportReturnPacket is the obvious case, but it would appear you’ve found another one that happens during shutdown.

Can you simply release the lock before calling NdisMCoSendNetBufferListsComplete?
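
To make the re-entrancy concrete, here is a minimal user-mode C++ sketch, not driver code. It assumes a toy non-recursive lock (DemoSpinLock) whose try_acquire() reports failure instead of spinning, so the demo can terminate. DemoAdapter, DemoDeleteVc and DemoSendComplete are hypothetical stand-ins for the adapter context, BrickWanCmDeleteVc and NdisMCoSendNetBufferListsComplete. It compares calling out with the lock held against releasing it first.

```cpp
#include <atomic>
#include <cassert>

// Toy non-recursive lock (hypothetical, not a WDF or NDIS API).
// try_acquire() returns false where a real spin lock would spin forever.
class DemoSpinLock {
public:
    bool try_acquire() { return !held_.exchange(true, std::memory_order_acquire); }
    void release() { held_.store(false, std::memory_order_release); }
private:
    std::atomic<bool> held_{false};
};

struct DemoAdapter {
    DemoSpinLock lock;
    bool deleteVcBlocked = false;  // set when the callback could not get the lock
};

// Stand-in for BrickWanCmDeleteVc: it needs the adapter lock.
void DemoDeleteVc(DemoAdapter& a) {
    if (!a.lock.try_acquire()) {
        a.deleteVcBlocked = true;  // the real driver hangs at this point
        return;
    }
    a.lock.release();
}

// Stand-in for the send-complete call: the upper layers may drop their last
// reference and call back into the driver on the same stack.
void DemoSendComplete(DemoAdapter& a) { DemoDeleteVc(a); }

// Pattern from the crash dump: call out of the driver with the lock held.
bool SendHoldingLock(DemoAdapter& a) {
    a.deleteVcBlocked = false;
    bool acquired = a.lock.try_acquire();
    assert(acquired);
    (void)acquired;
    DemoSendComplete(a);           // re-enters DemoDeleteVc with the lock held
    a.lock.release();
    return a.deleteVcBlocked;
}

// Suggested pattern: finish the locked work, release, then call out.
bool SendAfterReleasingLock(DemoAdapter& a) {
    a.deleteVcBlocked = false;
    bool acquired = a.lock.try_acquire();
    assert(acquired);
    (void)acquired;
    // ... update shared state here ...
    a.lock.release();
    DemoSendComplete(a);           // the callback can now take the lock
    return a.deleteVcBlocked;
}
```

With a fresh DemoAdapter, SendHoldingLock() reports that the callback was blocked and SendAfterReleasingLock() reports that it was not. Whether releasing early is safe in the real driver depends on what the lock is protecting around the completion call.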

James

Yeah, don’t do that.

The call manager is calling you back and has no idea that you are holding a
non-recursive lock when you called it.

You should not call outside the boundary (scope) of your driver while
holding a spinlock. But in particular, you cannot call send on the lower
edge (or indicate to the upper edge).

Consider that on that call the lower edge component might complete the send,
choose to do darn near anything, indicate a status, etc. etc. etc. So by
you holding a lock and allowing the activity to egress your driver (scope)
with that lock held, you are [expletive deleted] up the entire works with
respect to the synchronization assumptions of the other ‘loosely’ coupled
components :)

If you need to synchronize the send path, use a queue. Protect the queue
with the spinlock. Dispatch sends out of the queue one at a time but
without a lock held.
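
The queue approach might look roughly like the user-mode C++ sketch below. It is an illustration under assumptions, not the driver's code: std::mutex stands in for the spin lock, packets are plain ints, and SendQueue, Enqueue and Drain are hypothetical names. The lock covers only the queue manipulation, and the dispatch callback runs with no lock held.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical send queue; std::mutex is a stand-in for the driver's spin lock.
class SendQueue {
public:
    void Enqueue(int packetId) {
        std::lock_guard<std::mutex> guard(lock_);
        pending_.push_back(packetId);
    }

    // Dispatches queued entries one at a time. The lock is held only while
    // popping, so the callback may re-enter Enqueue() or any other routine
    // that takes the same lock. Returns the number of entries dispatched.
    template <typename Dispatch>
    int Drain(Dispatch dispatch) {
        int sent = 0;
        int id = 0;
        while (PopFront(id)) {
            dispatch(id);  // no lock held here
            ++sent;
        }
        return sent;
    }

    std::size_t Size() {
        std::lock_guard<std::mutex> guard(lock_);
        return pending_.size();
    }

private:
    // Removes the oldest entry under the lock; returns false if the queue is empty.
    bool PopFront(int& out) {
        std::lock_guard<std::mutex> guard(lock_);
        if (pending_.empty()) {
            return false;
        }
        out = pending_.front();
        pending_.pop_front();
        return true;
    }

    std::mutex lock_;
    std::deque<int> pending_;
};
```

A callback that calls Enqueue() from inside Drain() does not deadlock, because the lock was dropped before the callback ran. If several threads could call Drain() at once, something extra (for example a "drain in progress" flag kept under the same lock) would be needed to preserve send ordering; that part is left out of the sketch.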

Many of the interlayer calls are permitted at DISPATCH_LEVEL but that does
not mean they can be called with a ‘restriction’ (lock held).

Good Luck,
Dave Cattley

(btw, Brick seems so humorously appropriate considering what the driver
temporarily turned your machine into! )

The question is: what are you doing within the spin-lock? A spin-lock
should be held for as short a time as possible, so if you are doing
something with a potentially unbounded delay from within the spin lock,
this would be a Bad Thing. So it would be useful to know what you are
doing that might cause such a delay. For example, it appears you are
calling SendNetBuffers, which is probably a bad thing to do from within a
spin lock. But that’s just a guess.
joe

NTDEV is sponsored by OSR
