KeAcquireSpinLockForDpc is preempted while waiting for the lock

I am getting strange behavior from the KeAcquireSpinLockForDpc() API. We, or rather the GitHub ZFS project, are making use of it outside of the threaded-DPC realm, since it seemed an easy way to take a lock quickly whether the thread was at DISPATCH_LEVEL or a lower IRQL. What I was not expecting is that, while a thread was waiting to acquire the lock, it could be preempted and reentered, ending up waiting again on the same lock (see stack excerpt #1 below). The thread that actually holds the lock is itself preempted (see stack #2), and the I/Os going through deadlock, with all the other interesting threads waiting to get the lock in PrintBuffer too. The I/Os come through a StorPort interface, and I see that the KeAcquireSpinLockForDpc() routine is tagged as HwStorPortProhibitedDDIs in the compliance rules. It would help me to understand the effect of KeAcquireSpinLockForDpc() on the thread reentrancy I see here, if it is indeed the cause.


As always, thanks for helping.
Thierry Laurent
DataCore Software Corp.


stack #1: thread being reentered while waiting for the lock (the lock acquisition is in the PrintBuffer routine).

4.001560 ffffd98de769b5c0 0003a1f RUNNING +0xfffff801d2959003
nt!KxWaitForSpinLockAndAcquire+0xf4f6d
nt!KeAcquireSpinLockAtDpcLevel+0x1f
nt!KeAcquireSpinLockForDpc+0x2b
ZFSin!printBuffer+0x65
ZFSin!wzvol_HwStartIo+0x21
storport!RaidAdapterPostScatterGatherExecute+0x2e7
storport!RaUnitStartIo+0x1b7
storport!RaidRestartIoQueue+0x9a
storport!RaidAdapterRestartQueues+0x81
storport!RaidPauseTimerDpcRoutine+0x1e
nt!KiRetireDpcList+0x731
nt!KxRetireDpcList+0x5
nt!KiDispatchInterruptContinue
nt!KiDpcInterruptBypass+0x25
nt!KiInterruptDispatchNoLockNoEtw+0xb1
nt!KxWaitForSpinLockAndAcquire+0x18
nt!KeAcquireSpinLockAtDpcLevel+0x1f
nt!KeAcquireSpinLockForDpc+0x2b
ZFSin!printBuffer+0x65
ZFSin!dispatcher+0xa8
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
storport!RaSendIrpSynchronous+0x4d
storport!RaidUnitResetUnit+0x16f
storport!RaidUnitHierarchicalReset+0x4a
storport!RaidHierarchicalResetWorkRoutine+0x66
nt!IopProcessWorkItem+0xf0
nt!ExpWorkerThread+0xe9
nt!PspSystemThreadStartup+0x41
nt!KiStartSystemThread+0x16

stack #2: thread being interrupted while doing the PrintBuffer job after acquiring the lock.

e2c.001240 ffffd98def9fe080 0004f03 READY nt!KxDispatchInterrupt+0x122
nt!KiDpcInterrupt+0x2a3
ZFSin!addbuffer+0x1dc
ZFSin!printBuffer+0x73
ZFSin!dispatcher+0xa8
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
CLASSPNP!SubmitTransferPacket+0x201
CLASSPNP!ServiceTransferRequest+0x29d
CLASSPNP!ClassReadWrite+0x164
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
DcsDf!DiskPerfReadWrite+0x1fe
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
DcsPMF!PMFReadWrite+0x23c
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
partmgr!PmGlobalDispatch+0xf7
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
partmgr!PartitionIo+0x177
partmgr!PmGlobalDispatch+0x63
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
volmgr!VmReadWrite+0x109
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
volsnap!VolSnapReadFilter+0x61
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
DcsPMF!PMFReadWrite+0x23c
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
nt!RawReadWriteDeviceControl+0x9e
nt!RawDispatch+0x78
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x1a6
FLTMGR!FltpDispatch+0xb6
nt!IovCallDriver+0x252
nt!IofCallDriver+0x72
nt!IopSynchronousServiceTail+0x1a0
nt!NtReadFile+0x670
nt!KiSystemServiceCopyEnd+0x13
ntdll!NtReadFile+0x14
KERNELBASE!ReadFile+0x74
DiskTest+0x2edb
DiskTest+0x6bff
DiskTest+0x8262
KERNEL32!BaseThreadInitThunk+0x14
ntdll!RtlUserThreadStart+0x21

If you acquire the spin lock at PASSIVE_LEVEL, then you can certainly be pre-empted. Your nested call is a DPC, so it’s going to run at DISPATCH_LEVEL. That call cannot be pre-empted.

There shouldn’t be a deadlock unless the thread that actually holds the spin lock isn’t releasing it. That would be a bug. If you hold a spin lock, the only way you can be pre-empted is by a hardware interrupt, and its ISR should run and return in a few microseconds.

It depends on what you’re protecting, of course, but it’s possible a spin lock is not the right synchronization primitive for you.

Just make sure you are not committing the classic bug of calling KeAcquireSpinLockForDpc at greater than DISPATCH_LEVEL, such as in a path from the ISR. I’ve seen this done more than once, including in some widely distributed open source.

Tim, when invoking KeAcquireSpinLock(), for example, even if the thread is at PASSIVE the level is raised to DISPATCH before the spin wait. So a thread waiting to acquire a spinlock should not be interrupted by a DPC routine.
The question now is: maybe KeAcquireSpinLockForDpc() does something different? I wish I had the source for that API …

the level is raised to DISPATCH before the spin wait

Yeah, but you could still be pre-empted, JUST before the IRQL gets raised, right?

maybe KeAcquireSpinLockForDpc() does something different?

It doesn’t. If you’re running in a threaded DPC then it just calls KeAcquireSpinLock. If you’re running in a regular DPC, it calls KeAcquireSpinLockAtDpcLevel.
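
Conceptually, it boils down to something like the sketch below. This is only an illustration of the documented behavior, not the real ntoskrnl source, and RunningInThreadedDpc() is a made-up placeholder for however the kernel actually checks its per-processor DPC state.

    #include <ntddk.h>

    extern BOOLEAN RunningInThreadedDpc(VOID);   // made-up placeholder, not a real API

    // Sketch only: KeAcquireSpinLockForDpc chooses its behavior based on whether
    // the caller is in a threaded DPC (PASSIVE_LEVEL) or an ordinary DPC
    // (DISPATCH_LEVEL).
    KIRQL SketchOfAcquireSpinLockForDpc(_Inout_ PKSPIN_LOCK SpinLock)
    {
        KIRQL oldIrql;

        if (RunningInThreadedDpc()) {                 // threaded DPC: PASSIVE_LEVEL caller
            KeAcquireSpinLock(SpinLock, &oldIrql);    // raises to DISPATCH_LEVEL
        } else {                                      // ordinary DPC: assumed DISPATCH_LEVEL
            oldIrql = KeGetCurrentIrql();
            KeAcquireSpinLockAtDpcLevel(SpinLock);    // no IRQL change at all
        }
        return oldIrql;   // handed back to the paired KeReleaseSpinLockForDpc
    }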

/editorial rant

KeAcquireSpinLockForDpc is, without a doubt, one of the most ridiculous functions I have ever seen in all my years of writing operating-system-level software. I would suggest nobody use it. Ever.

First of all, there’s the horrible potential for mis-typing and confusion among:

  • KeAcquireSpinLockForDpc
  • KeAcquireSpinLockAtDpcLevel
  • KeAcquireSpinLockRaiseToDpc

(Please multiply the above by “InStackQueued” and “TryToAcquire” as well)

Not to even mention the issue of being sure you’re actually calling the correct paired release function.

The whole premise of the “ForDpc” and “AtDpcLevel” functions is “if we’re already running at IRQL DISPATCH_LEVEL, we don’t want to go through the overhead of writing the TPR”… a case of premature optimization if I’ve ever seen one. On a typical system, the TPR is pounded about a zillion times a second. But, you know, for this particular spinlock acquisition we can save a whole TPR reference by using one of these very similarly named functions… at the cost of making maintainers’ lives miserable forever and ever, unto ages of ages.

This is a reflection of Windows’ long-standing ad hoc philosophy of “If there’s one way to do something, there should be six alternative ways to do the same thing, in case there’s some developer who might want to do it slightly differently.”

So, aside from bellowing loudly, here’s what I suggest: forget these other versions of KeAcquire exist. Just always call KeAcquireSpinLock and KeReleaseSpinLock… or the InStackQueued versions… and be done with it. I promise, nobody will think less of you for not calling KeAcquireSpinLockAtDpcLevel from within, say, your Timer Routine or your DpcForIsr. Nobody will even notice. Because your code will be nice and clear and easy to maintain, everyone will understand what you’re doing, and it won’t send all and sundry who read your code off to Google what this particular variant does and does not do, and when it should and should not be used.
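
To be concrete, the boring pattern being recommended here is just the plain acquire/release pair (the names below are placeholders):

    #include <ntddk.h>

    // The plain acquire/release pair. Correct whether the caller is at
    // PASSIVE_LEVEL or already at DISPATCH_LEVEL (DPC, StartIo path, etc.).
    VOID TouchSharedBuffer(_Inout_ PKSPIN_LOCK Lock)   // placeholder name
    {
        KIRQL oldIrql;

        KeAcquireSpinLock(Lock, &oldIrql);   // raises to DISPATCH_LEVEL if needed
        /* ... work on the shared data here ... */
        KeReleaseSpinLock(Lock, oldIrql);    // restores the caller's original IRQL
    }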

/end of editorial rant

Peter

Peter, thanks. In light of your answer I now see the core of my problem: the routine trying to use the ForDpc() version was running in neither a threaded DPC nor a regular DPC; it was just in an ordinary thread. Looking at the Git version of ntoskrnl, I observe that the Prcb’s DpcThreadActive will be FALSE in my case, so it will call the AtDpcLevel() version, and that screws everything up if the caller is below DISPATCH_LEVEL (Verifier did not catch it, which is surprising).
The DDK documentation for the ForDpc() routine makes one believe that, no matter the caller’s level, it will act accordingly, but that is only true if the caller happens to be invoked through the kernel DPC logic.
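
In other words, the failing sequence was roughly the following. This is a hypothetical sketch of the call site, not the actual ZFSin code; PrintLock is a made-up name.

    #include <ntddk.h>

    extern KSPIN_LOCK PrintLock;   // made-up name, initialized elsewhere

    // Hypothetical sketch of the failure mode, not the real ZFSin code.
    VOID BrokenPrintBuffer(VOID)   // called from an ordinary worker thread at PASSIVE_LEVEL
    {
        KIRQL irql;

        irql = KeAcquireSpinLockForDpc(&PrintLock);  // not in any DPC, so DpcThreadActive
                                                     // is FALSE and this takes the
                                                     // AtDpcLevel path: the lock is taken
                                                     // but IRQL stays at PASSIVE_LEVEL
        // With IRQL never raised, the holder can be preempted, and a DPC that wants the
        // same lock can run on top of a waiter, which is what stacks #1 and #2 show.
        KeReleaseSpinLockForDpc(&PrintLock, irql);
    }
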
As you said, saving those cycles is not worth the trouble.
Thanks everyone!
Thierry

Glad to know you got it working… and that my rant was not entirely wasted.

And thank you, Mr. Laurent, for proving my point that these ridiculous variants are as confusing as they are unnecessary.

Peter

Agree with Peter almost 100%. However, KeAcquireSpinLockAtDpcLevel can be useful if we want to do synchronization from code with disabled interrupts.

to do synchronization from code with disabled interrupts.

You mean, “at an IRQL other than IRQL DISPATCH_LEVEL”? Kinda “roll your own spin lock” type of thing?

That’s a bad practice, in general (though I admit to having done it once or twice myself).

To me, that’s another reason AtDpcLevel shouldn’t exist, or at the very least should bugcheck if not called at IRQL DISPATCH_LEVEL.

Peter

I mean a cleared interrupt flag: actually disabled interrupts, not IRQL manipulation. I used KeAcquireSpinLockAtDpcLevel in the vm-exit dispatcher in my hypervisor and it was super useful. Although, now that I think about it, and after I checked the KeAcquireSpinLockAtDpcLevel disassembly, I understand that I should have implemented my own spinlock using instructions with a lock prefix. So maybe I agree with you 100% after all :slight_smile:
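
For what it’s worth, a minimal version of that roll-your-own spinlock might look something like the sketch below. The names are made up, and a production version would want more thought about fairness and contention.

    #include <intrin.h>

    // Minimal test-and-set spinlock built on a lock-prefixed instruction, for
    // contexts where interrupts are disabled and the usual IRQL machinery is
    // off the table (e.g. a vm-exit handler). Sketch only.
    typedef struct _RAW_SPINLOCK { volatile long Locked; } RAW_SPINLOCK;

    static void RawSpinLockAcquire(RAW_SPINLOCK* Lock)
    {
        // _InterlockedExchange compiles to an XCHG with a memory operand,
        // which is implicitly locked on x86/x64.
        while (_InterlockedExchange(&Lock->Locked, 1) != 0) {
            _mm_pause();   // spin politely while the lock is held elsewhere
        }
    }

    static void RawSpinLockRelease(RAW_SPINLOCK* Lock)
    {
        _InterlockedExchange(&Lock->Locked, 0);   // full-barrier release
    }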


I am getting strange behavior from the KeAcquireSpinLockForDpc() API. We, or rather the GitHub ZFS project

I would highly recommend checking the following thread for more info on the subject.

https://community.osr.com/discussion/comment/294014

As you will see with your own eyes, in order to comply with IRQL constraints, the author of ZFS On Windows ARBITRARILY lowers the IRQL so that he can make a PASSIVE_LEVEL call while holding a lock. Therefore, the root of your problem became immediately obvious to me the moment I saw the name of the above-mentioned project in the context of your question. All you have to do is look for code that lowers the IRQL while holding a spinlock…
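
For readers who don’t follow the link, the anti-pattern being described looks, in caricature, something like this. This is hypothetical code, not a quote from the project; Lock and DoSomethingPassiveLevelOnly are placeholders.

    #include <ntddk.h>

    extern KSPIN_LOCK Lock;                        // placeholder
    extern VOID DoSomethingPassiveLevelOnly(VOID); // placeholder for a PASSIVE_LEVEL-only call

    // Caricature of the anti-pattern: do NOT do this.
    VOID LowerIrqlWhileHoldingLock(VOID)
    {
        KIRQL oldIrql, unused;

        KeAcquireSpinLock(&Lock, &oldIrql);     // IRQL raised to DISPATCH_LEVEL, lock held

        KeLowerIrql(PASSIVE_LEVEL);             // arbitrarily drop IRQL while still holding
        DoSomethingPassiveLevelOnly();          // the spinlock so the call "complies" with
                                                // IRQL checks; the lock no longer reliably
                                                // excludes anyone
        KeRaiseIrql(DISPATCH_LEVEL, &unused);   // put IRQL back and hope nobody noticed

        KeReleaseSpinLock(&Lock, oldIrql);
    }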

It looks like all the time that I spent on explaining IRQL basics to the guy was simply waisted on him - the only “progress” that he seems to have made since was extending his “engineering concept” from APC_LEVEL code to code that runs in an atomic context. In other words, you have just experimentally established what may happen when you do things like that.

Anton Bassov

…was simply waisted

OMG - it reminds me of the thread where I made a “hummer” instead of “hammer” typo.

Certainly, the typo that I made on this thread is not as funny as the above-mentioned one, but still…

Anton Bassov