I'm seeing some strange behaviour in a KMDF driver, where a non-power-managed parallel-dispatch WDF I/O queue does not call my EvtRequestCancel routine as soon as the WDM cancel routine is invoked. I know this because I replaced WDF's WDM cancel routine in the IRP with my own. That cancel routine prints a debug message and then calls WDF's WDM cancel routine. It's naughty, but it proves that the WDF I/O queue is deferring invocation of my EvtRequestCancel routine for some reason.
The sequence of events is:
1. Program A opens the device and initiates an operation, which I will call "DOSOMETHING". EvtIoDeviceControl marks the request cancelable with WdfRequestMarkCancelableEx and returns without completing it. I am deliberately preventing DOSOMETHING from completing so that I can test its cancellation.
2. Program B opens the same device and initiates an operation, which I will call "RESET". This is completely synchronous (i.e. EvtIoDeviceControl does not return until the operation is finished), and does the following:
(a) Sets a flag that causes all incoming DOSOMETHINGs to fail.
(b) Synchronously waits for all ongoing DOSOMETHING operations to finish before proceeding (so it is waiting on 1 operation at this point). It uses KeWaitForSingleObject on a KEVENT, KernelMode, non-alertable.
(c) Does some stuff, and completes the request.
3. I hit CTRL-C for program A. I see that the WDM cancel routine is invoked immediately, BUT the WDF I/O queue does not (yet) invoke my EvtRequestCancel for DOSOMETHING.
At this point, things are deadlocked. RESET is stuck at (b), waiting on 1 x DOSOMETHING to finish, but the WDF I/O queue has not invoked DOSOMETHING's EvtRequestCancel. There's really nothing else happening as far as the driver is concerned.
4. I then run program C, which opens the same device. At that moment, the WDF I/O queue decides to invoke EvtRequestCancel for DOSOMETHING, in the process context of C.
5. DOSOMETHING completes its request with STATUS_CANCELLED and decrements a counter to 0, which causes the KEVENT to become signalled. RESET's KeWaitForSingleObject finishes normally, so RESET completes its request normally too.
Programs A, B, and C then complete normally, and the expected Close and Cleanup events occur, and nothing bad happens. I'm currently running with driver verifier enabled, and it does not see any problems.
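The bookkeeping in steps 2(a), 2(b) and 5 can be modelled outside the kernel in plain C. This is only a single-threaded sketch with hypothetical names (model_state and the model_* functions are not part of the driver). In the driver itself the counter would need interlocked operations, and the "drained" flag is the KEVENT that RESET waits on with KeWaitForSingleObject.

```c
#include <stdbool.h>

/* Single-threaded, user-mode model of the driver's bookkeeping.
 * Every name here is hypothetical; none of this is the driver's code. */
typedef struct {
    int  in_flight;  /* DOSOMETHING requests accepted and not yet completed */
    bool resetting;  /* flag set by RESET in step 2(a) */
    bool drained;    /* stands in for the KEVENT that RESET waits on in 2(b) */
} model_state;

/* Initial state: nothing pending, no reset in progress, event signalled. */
#define MODEL_STATE_INIT { 0, false, true }

/* Step 1: accept a DOSOMETHING unless RESET has already started. */
static bool model_begin_dosomething(model_state *s)
{
    if (s->resetting) {
        return false;            /* step 2(a): new requests fail */
    }
    s->in_flight++;
    s->drained = false;          /* work is pending, so clear the event */
    return true;
}

/* Step 2(a): from now on, incoming DOSOMETHINGs are rejected. */
static void model_begin_reset(model_state *s)
{
    s->resetting = true;
}

/* Step 2(b): RESET may only continue once the event is signalled. */
static bool model_reset_may_proceed(const model_state *s)
{
    return s->drained;
}

/* Step 5: called when a DOSOMETHING completes (here, from the cancel path).
 * The decrement that reaches zero signals the event. */
static void model_finish_dosomething(model_state *s)
{
    if (--s->in_flight == 0) {
        s->drained = true;
    }
}
```

Between model_begin_reset and model_finish_dosomething, model_reset_may_proceed stays false. That window corresponds to the deadlock after step 3: in this scenario the only thing that ever calls the finish function is EvtRequestCancel, and the queue has not invoked it yet.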
I have not specified any SynchronizationScope or ExecutionLevel in the object attributes of any of the WDF driver, device, file or I/O queue objects. My reading of the docs makes me think that this should result in the minimum of automatic synchronization.
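For concreteness, a queue set up as described would look roughly like the fragment below. It is a sketch under assumptions, not the actual driver source: MyCreateDefaultQueue and MyEvtIoDeviceControl are hypothetical names, and whether this is the default queue is also assumed. The two attribute assignments are redundant, since WDF_OBJECT_ATTRIBUTES_INIT already selects the InheritFromParent values; they are spelled out only to show which members are meant.

```c
#include <ntddk.h>
#include <wdf.h>

/* Hypothetical callback name; the handler body is not shown here. */
EVT_WDF_IO_QUEUE_IO_DEVICE_CONTROL MyEvtIoDeviceControl;

/* Hypothetical helper, called from EvtDriverDeviceAdd after WdfDeviceCreate. */
NTSTATUS
MyCreateDefaultQueue(
    _In_ WDFDEVICE Device
    )
{
    WDF_IO_QUEUE_CONFIG   queueConfig;
    WDF_OBJECT_ATTRIBUTES queueAttributes;
    WDFQUEUE              queue;

    /* Parallel dispatch, not power-managed. */
    WDF_IO_QUEUE_CONFIG_INIT_DEFAULT_QUEUE(&queueConfig,
                                           WdfIoQueueDispatchParallel);
    queueConfig.PowerManaged       = WdfFalse;
    queueConfig.EvtIoDeviceControl = MyEvtIoDeviceControl;

    /* Same values that WDF_OBJECT_ATTRIBUTES_INIT sets; shown explicitly. */
    WDF_OBJECT_ATTRIBUTES_INIT(&queueAttributes);
    queueAttributes.SynchronizationScope = WdfSynchronizationScopeInheritFromParent;
    queueAttributes.ExecutionLevel       = WdfExecutionLevelInheritFromParent;

    return WdfIoQueueCreate(Device, &queueConfig, &queueAttributes, &queue);
}
```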
What I don't understand is why cancellation is being deferred by the WDF I/O queue, and, even weirder, why an incoming device open is sufficient to unblock the deadlock.
I took a look at the WDF source code. I couldn't really follow it properly, but the gist of what I saw is that invoking EvtRequestCancel immediately, in response to a WDM cancel routine being invoked, is conditional. There's some reasonably complex logic in the main dispatch routine of the I/O queue object, which decides what callback the I/O queue invokes next. I couldn't really figure out the rules that it plays by, though.
Can somebody spell out what needs to be done, if anything, to have no automatic synchronization in an I/O queue? Really, I want WDF to call my callbacks the way WDM would, for the most part. Is it sufficient to leave the synchronization scope and execution level of every object at their InheritFromParent defaults (WdfSynchronizationScopeInheritFromParent and WdfExecutionLevelInheritFromParent)?
Or is it simply not possible to turn off serialization of EvtIoDeviceControl with respect to EvtRequestCancel?
I wondered if the KeWaitForSingleObject is messing with IRQLs, given that the wait is non-alertable. Can this affect a WDF I/O queue's dispatching?
Given that I pretty much want WDM-style behaviour for interactions with user mode, would I be better off using WdfDeviceInitAssignWdmIrpPreprocessCallback and handling IRP_MJ_READ/WRITE/DEVICE_CONTROL/CREATE/CLOSE/CLEANUP myself? With that approach, will EvtDeviceSelfManagedIoSuspend etc. be sufficient to avoid problems due to no longer having the necessary automatic interactions between the WDF I/O queue's state and the WDF device object's PnP/power state?
OS is Windows 10 x64. KMDF version string is "Kernel Mode Driver Framework (verifier on) version 01.021.0".