Issues with SDV invalidreqaccess - seems to get into impossible situation of duplicate completion

The issue seems to be that SDV thinks that I am completing the same request twice. I’m looking for any advice/links on debugging similar SDV issues, any similar anecdotes to prove I’m not going crazy, any annotations to assure SDV the requests are valid, or for someone to tell me that the way I am using WdfRequestUnmarkCancelable is in fact not allowed, despite my confidence that the requests will still be valid if the field is not NULL. Googling “invalidreqaccess” does not return any results of similar issues.

The driver has several internal contexts that may be acted upon. The device can specify which one it wants to act on, as well as an operation to use, and some other data associated with the request. These messages from the device can cause the completion of requests that are pended in the context.

The SDV traces show that I am receiving two messages, both with operation == OP1. They return what appear to be different ‘context’ values from GetContextByMessageContextInfo. They both go into the part where they complete PendingWaitForOp1Request, and the second one causes SDV to claim the request is invalid. Looking back through the traces, the only time a request is set to invalid is in the previous loop iteration, where PendingWaitForOp1Request was completed on the first context. So SDV appears to think the same request is pending on two different contexts, even though a request can only ever be pended in a single context.

One example of the code triggering the issue looks something like this (but much longer, and I cannot share actual portions of it):

static EVT_WDF_DPC ReceiveMessageDpc;
_Use_decl_annotations_
static VOID
ReceiveMessageDpc(
    WDFDPC Dpc
    )
{
    PMESSAGE message;
    PCONTEXT context;
    PDEV_CTX devCtx = GetDpcCtx(Dpc)->DevCtx;

    while (NULL != (message = GetNextMessageFromDevice(devCtx))) {
        // Looks into a table stored in the driver for a context matching this info.
        context = GetContextByMessageContextInfo(message->contextInfo);
        if (!context) {
            RespondToDeviceAboutBadContextInfo(message->contextInfo);
            continue;
        }

        WdfSpinLockAcquire(context->lock);
        if (message->operation == OP1) {
            if (context->state == WaitingForOp1) {
                context->state = GotOp1;
                if (NULL != context->PendingWaitForOp1Request) {
                    // Request cancellation/completion is always guarded by a NULL check with context->lock held.
                    (void)WdfRequestUnmarkCancelable(context->PendingWaitForOp1Request);
                    WdfRequestComplete(context->PendingWaitForOp1Request, STATUS_SUCCESS);
                    context->PendingWaitForOp1Request = NULL;
                }
            } else {
                // Includes completing all pending requests with the same pattern used above for completion.
                PutContextInErrorState(context);
            }
        } else if (message->operation == OP2) {
            // All sorts of things can happen with OP2+, but SDV traces show OP1 being called twice.
        } // and so on.
        WdfSpinLockRelease(context->lock);
    }
}

Note that I was able to work around other similar failures in this test, where the requests being completed first and second were not even of the same type. In those cases, I am handling a request whose processing completes a different pending request, and SDV then thinks the original request I am currently working on became invalid once the pending one was completed. I worked around that by capturing the return value of WdfRequestUnmarkCancelable and using _Analysis_assume_(status == STATUS_SUCCESS) (that code has since been reverted). That experience has made me skeptical of the way SDV tracks which requests are which.
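For anyone who wants to experiment with the status-capturing variant outside the driver, here is a minimal user-mode sketch. It is not my driver code and it does not call WDF: the MODEL_* names, the placeholder status values, and the cancel_in_progress flag are all made up for illustration. As I understand the documentation, when WdfRequestUnmarkCancelable returns STATUS_CANCELLED the completion is supposed to be left to the cancel callback, and that is the only rule this sketch tries to model.

```c
#include <assert.h>
#include <stddef.h>

/* User-mode stand-ins for illustration only; none of these are real WDF types or values. */
typedef int MODEL_STATUS;
#define MODEL_STATUS_SUCCESS   0
#define MODEL_STATUS_CANCELLED 1

typedef struct MODEL_REQUEST {
    int cancel_in_progress; /* nonzero once cancellation has claimed the request */
    int completions;        /* number of times the request has been completed */
} MODEL_REQUEST;

typedef struct MODEL_CONTEXT {
    MODEL_REQUEST *pending; /* plays the role of PendingWaitForOp1Request */
} MODEL_CONTEXT;

/* Models WdfRequestUnmarkCancelable: reports whether cancellation already owns the request. */
static MODEL_STATUS model_unmark_cancelable(const MODEL_REQUEST *request)
{
    return request->cancel_in_progress ? MODEL_STATUS_CANCELLED : MODEL_STATUS_SUCCESS;
}

/* Models WdfRequestComplete by counting completions. */
static void model_complete(MODEL_REQUEST *request)
{
    request->completions++;
}

/* Message path; the caller is assumed to hold the context lock. */
static void complete_pending_on_op1(MODEL_CONTEXT *context)
{
    assert(context != NULL);
    if (context->pending != NULL) {
        MODEL_STATUS status = model_unmark_cancelable(context->pending);
        if (status != MODEL_STATUS_CANCELLED) {
            model_complete(context->pending);
            context->pending = NULL;
        }
        /* Otherwise leave the field set so the cancel path completes the request. */
    }
}

/* Cancel path; same NULL guard, caller assumed to hold the context lock. */
static void cancel_pending(MODEL_CONTEXT *context, MODEL_REQUEST *request)
{
    assert(context != NULL);
    if (request != NULL && context->pending == request) {
        model_complete(request);
        context->pending = NULL;
    }
}
```

In this model each request is completed exactly once no matter which path gets there first, which is the property I would expect SDV to be checking.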

In case anybody finds this topic later and is looking for resolution… there isn’t one. I’ve found that WHQL does not require a successful dvl.xml file in the Static Tools Logo Test when the device class is System.

I attempted to fix the issue using __sdv_save_request and __sdv_retrieve_request, as described at https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/using---sdv-save-request-and---sdv-retrieve-request-for-deferred-proce but it did not help: the requests were not being stashed in the SDV trace, they were just assumed to be valid all along.

If that hadn’t worked, my next steps would have been either to use some #pragma to disable analysis around WdfRequestComplete or, failing that, to add _Analysis_assume_(NULL == context->PendingWaitForOp1Request).