Bugcheck 0x1E (KMODE_EXCEPTION_NOT_HANDLED)

Hi. I have a driver that’s been working in production across many systems for a long time, but one particular machine has started to crash. I’ve spent days staring at this code, and the dump is not giving me enough specific information. All the parameters are zero! The source line it claims is responsible makes no sense. Maybe somebody can see something I don’t. Thanks in advance.

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

KMODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: 0000000000000000, The exception code that was not handled
Arg2: 0000000000000000, The address that the exception occurred at
Arg3: 0000000000000000, Parameter 0 of the exception
Arg4: 0000000000000000, Parameter 1 of the exception

Debugging Details:
------------------

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  WIN7_DRIVER_FAULT_SERVER

BUGCHECK_STR:  0x1E

PROCESS_NAME:  System

CURRENT_IRQL:  2

TAG_NOT_DEFINED_c000000f:  FFFFF800015A2FB0

LAST_CONTROL_TRANSFER:  from fffff800016f90ee to fffff800016f5e70

STACK_TEXT:  
fffff800`0159ba38 fffff800`016f90ee : 00000000`00000004 fffffa80`05c166b0 fffff800`0159c240 fffff800`016bdde0 : nt!KeBugCheck
fffff800`0159ba40 fffff800`016fd1fd : fffff800`018d8388 fffff800`0182b634 fffff800`01662000 fffff800`0159c1a0 : nt!KiKernelCalloutExceptionHandler+0xe
fffff800`0159ba70 fffff800`016bd125 : fffff800`0180bb94 fffff800`0159bae8 fffff800`0159c1a0 fffff800`01662000 : nt!RtlpExecuteHandlerForException+0xd
fffff800`0159baa0 fffff800`016bbdc2 : fffff800`0159c1a0 fffff800`0159c240 00000000`00000001 00000000`00000000 : nt!RtlDispatchException+0x415
fffff800`0159c180 fffff800`0171529c : fffff800`000850d6 fffff880`00000000 fffffa80`0411db50 fffffa80`0411db00 : nt!RtlRaiseStatus+0x4e
fffff800`0159c720 fffff880`016447cf : fffff800`0159c810 00000000`00000000 fffffa80`0424ca90 00000000`00000000 : nt! ?? ::FNODOBFM::`string'+0xc99c
fffff800`0159c7a0 fffff880`01640dbb : fffffa80`17aa4c60 00000000`00000034 fffffa80`0411c030 00000000`00000000 : DgRplct!DgRplctTrackQueue_CheckElement+0x9b [e:\sh-safehaven3\winonboardlra\lra_driver\dgrplct\dgrplcttrackqueue.c @ 916]
fffff800`0159c7e0 fffff800`01695681 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : DgRplct!DgRplct_ForwardIrpPOCF1_Completion+0x3f [e:\sh-safehaven3\winonboardlra\lra_driver\dgrplct\dgrplctlogiclayer.c @ 345]
fffff800`0159c810 fffff880`012aebce : 00000000`00000000 00000000`00000001 00000000`ffffffff 00000000`00000000 : nt!IopfCompleteRequest+0x341
fffff800`0159c900 fffff800`01695681 : 00000000`00000000 fffff800`01846180 fffffa80`1547ed70 00000000`00000c20 : CLASSPNP!TransferPktComplete+0x1ce
fffff800`0159c980 fffff880`013556c8 : fffff880`02d63010 fffffa80`0401cb01 fffffa80`063c4230 00000000`00000000 : nt!IopfCompleteRequest+0x341
fffff800`0159ca70 fffff880`01355443 : fffffa80`00000000 fffff880`000000eb 00000000`00000001 fffffa80`04014060 : storport!RaidUnitCompleteRequest+0x208
fffff800`0159cb50 fffff800`016a2cbc : fffff800`01846180 fffffa80`00000000 fffffa80`04014128 400000c2`400000c1 : storport!RaidpAdapterDpcRoutine+0x53
fffff800`0159cb90 fffff800`016f942a : fffff800`01846180 fffff800`018561c0 00000000`00000000 fffff880`013553f0 : nt!KiRetireDpcList+0x1bc
fffff800`0159cc40 00000000`00000000 : fffff800`0159d000 fffff800`01597000 fffff800`0159cc00 00000000`00000000 : nt!KiIdleLoop+0x5a

STACK_COMMAND:  kb

FOLLOWUP_IP: 
DgRplct!DgRplctTrackQueue_CheckElement+9b [e:\sh-safehaven3\winonboardlra\lra_driver\dgrplct\dgrplcttrackqueue.c @ 916]
fffff880`016447cf ??              ???

FAULTING_SOURCE_LINE:  e:\sh-safehaven3\winonboardlra\lra_driver\dgrplct\dgrplcttrackqueue.c

FAULTING_SOURCE_FILE:  e:\sh-safehaven3\winonboardlra\lra_driver\dgrplct\dgrplcttrackqueue.c

FAULTING_SOURCE_LINE_NUMBER:  916

FAULTING_SOURCE_CODE:  
   912:             1,
   913:             FALSE);
   914:     }
   915: 
>  916:     *pbRequestCompleted = bRequestCompleted;
   917: 
   918:     return bRet;
   919: }
   920: 
   921: //------------------------------------------------------------------------------

SYMBOL_STACK_INDEX:  6

SYMBOL_NAME:  DgRplct!DgRplctTrackQueue_CheckElement+9b

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: DgRplct

IMAGE_NAME:  DgRplct.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  5eef8292

FAILURE_BUCKET_ID:  X64_0x1E_DgRplct!DgRplctTrackQueue_CheckElement+9b

BUCKET_ID:  X64_0x1E_DgRplct!DgRplctTrackQueue_CheckElement+9b

Followup: MachineOwner
---------

Source:

NTSTATUS
DgRplct_ForwardIrpPOCF1_Completion(
    IN PDEVICE_OBJECT pDeviceObject,
    IN PIRP pIrp,
    IN PVOID pContext
    )
{ 
    PDEVICE_EXTENSION pDevExtension = (PDEVICE_EXTENSION) pDeviceObject->DeviceExtension;
    
    // 
    // Because the dispatch routine is returning the status of lower driver
    // as is, we must do the following (http://support.microsoft.com/kb/320275):
    if(pIrp->PendingReturned == TRUE) {
        IoMarkIrpPending(pIrp);
    }
    
    if(NULL != pContext) {
        PDGRPLCT_TRACKELMT pstTQElmt = (PDGRPLCT_TRACKELMT) pContext;
        BOOLEAN bRequestCompleted = FALSE;
        DgRplctTrackQueue_CheckElement(
            &pDevExtension->stTrackQueue, 
            pstTQElmt,
            &bRequestCompleted);
    }

    IoCompleteRequest(pIrp, IO_NO_INCREMENT);

    return STATUS_MORE_PROCESSING_REQUIRED;

} // end DgRplct_ForwardIrpPOCF1_Completion

BOOLEAN DgRplctTrackQueue_CheckElement
(
    IN PDGRPLCT_TRACKQUEUE pstTrackQueue,
    IN PDGRPLCT_TRACKELMT pstTQElmt,
    OUT PBOOLEAN pbRequestCompleted
)
{
    KIRQL OldIrql;
    BOOLEAN bRequestCompleted = FALSE;
    BOOLEAN bRet = FALSE;
    
    if((NULL == pstTrackQueue) || (NULL == pstTQElmt) ||(NULL == pbRequestCompleted)) {
        return FALSE;
    }
    
    *pbRequestCompleted = FALSE;

    DgRplctAcquireTrackQueueLock(&pstTrackQueue->QLock, &OldIrql);
    
    if(NULL != pstTrackQueue->pCircleQueue) {

        if( (DGRPLCT_TRACKQUEUE_IRP_STATUS_UNKNOWN != pstTQElmt->stPayload.emSrcIrpStatus) && 
            (DGRPLCT_TRACKQUEUE_IRP_STATUS_UNKNOWN != pstTQElmt->stPayload.emDstIrpStatus) &&
            (DGRPLCT_TRACKQUEUE_ELMT_STATUS_READY_TO_POPOUT != pstTQElmt->emElmtStatus) )
        {
            pstTQElmt->emElmtStatus = DGRPLCT_TRACKQUEUE_ELMT_STATUS_READY_TO_POPOUT; 
            bRequestCompleted = TRUE;
        }
        else {
            pstTQElmt->emElmtStatus = DGRPLCT_TRACKQUEUE_ELMT_STATUS_DIRTY;
        }

        bRet = TRUE;

    }
    /*  // bRet by default is set to FALSE
    else {
        bRet = FALSE;
    }
    */
    
    DgRplctReleaseTrackQueueLock(&pstTrackQueue->QLock, OldIrql);

    

    if(TRUE == bRequestCompleted) {
        (VOID)KeReleaseSemaphore(&pstTrackQueue->KeSmp_TrackQueue,
            IO_NO_INCREMENT,
            1,
            FALSE);
    }

    *pbRequestCompleted = bRequestCompleted;

    return bRet;
}

(I noticed that the parameter pbRequestCompleted, and for that matter bRet too, is never actually used, but I don’t see why that would cause a crash.)

pbRequestCompleted certainly is used. It is in the line of code that failed. The implication from the dump is that this function has been paged out. Is it possible that this is in a section of code marked with #pragma as being paged, AND that the target system here is being run with Driver Verifier? If so, that’s the problem. This code is called in a completion routine, which means it will run at DISPATCH_LEVEL (as the dump says) and cannot be paged.
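To make the rule in the reply above concrete, here is a minimal user-mode sketch of the condition that PAGED_CODE() checks. It is only a model: g_model_irql and model_paged_code_ok are hypothetical names invented for illustration, standing in for KeGetCurrentIrql() and the macro itself. The IRQL values are the standard ones, and DISPATCH_LEVEL matches the CURRENT_IRQL: 2 reported in the dump.

```c
#include <stdbool.h>

/* Standard IRQL values relevant here. */
enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

/* Hypothetical stand-in for KeGetCurrentIrql(); a real driver asks the kernel. */
int g_model_irql = PASSIVE_LEVEL;

/*
 * Models the condition PAGED_CODE() is meant to enforce: pageable code
 * may only run at IRQL <= APC_LEVEL, because a page fault cannot be
 * serviced at DISPATCH_LEVEL or above.
 */
bool model_paged_code_ok(void)
{
    return g_model_irql <= APC_LEVEL;
}
```

With g_model_irql set to DISPATCH_LEVEL, as in a completion routine called from a DPC, model_paged_code_ok() returns false, which is the situation the reply describes.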

Thanks Tim, I appreciate your response.

I meant that the value of *pbRequestCompleted is never read after it is assigned. Theoretically I could remove the parameter entirely.

I will check with the customer, but it seems unlikely that they are running Driver Verifier on their production system. There is no #pragma alloc_text (PAGE, DgRplctTrackQueue_CheckElement), nor one for DgRplct_ForwardIrpPOCF1_Completion.

However, this code was present in this function (I removed it for brevity):

#ifndef DGRPLCT_USE_SPINLOCK_FOR_TRACK_QUEUE
    PAGED_CODE();
    UNREFERENCED_PARAMETER(OldIrql);
#endif

Of course, DGRPLCT_USE_SPINLOCK_FOR_TRACK_QUEUE is defined in the header file.

#define DGRPLCT_USE_SPINLOCK_FOR_TRACK_QUEUE
///-#undef DGRPLCT_USE_SPINLOCK_FOR_TRACK_QUEUE

I can’t think of any way in which this could be mistakenly enabled.

Given that we use spinlocks in this function, I would imagine we’d have lots of other issues if this code was paged.

From the Microsoft documentation:
“Because the kernel stack might be paged out, please be cautious about passing stack-based buffers (i.e. local variables) to DMA or any routine that runs at DISPATCH_LEVEL or above.”

Does this mean it’s unsafe to pass down local variables in an I/O completion routine?

Local variables are fine (your stack isn’t pageable while you’re running). What’s the output of:

.frame /c 6
!pte @rip 

?

Ok, good to have a sanity check.

0: kd> .frame /c 6
06 fffff800`0159c7a0 fffff880`01640dbb DgRplct!DgRplctTrackQueue_CheckElement+0x9b [e:\sh-safehaven3\winonboardlra\lra_driver\dgrplct\dgrplcttrackqueue.c @ 916]
rax=fffff8000159c050 rbx=0000000000000001 rcx=000000000000001e
rdx=fffff8000159cc40 rsi=fffff8000159c810 rdi=fffffa800424ca90
rip=fffff880016447cf rsp=fffff8000159c7a0 rbp=0000000000000001
 r8=fffff8000159c240  r9=fffff8000159bb30 r10=fffff8000159cc70
r11=fffff8000159bae8 r12=fffffa800411b000 r13=0000000000000000
r14=fffffa800411dc08 r15=0000000000000001
iopl=0         nv up ei ng nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000282
DgRplct!DgRplctTrackQueue_CheckElement+0x9b:
fffff880`016447cf ??              ???
0: kd> !pte @rip
                                           VA fffff880016447cf
PXE at FFFFF6FB7DBEDF88    PPE at FFFFF6FB7DBF1000    PDE at FFFFF6FB7E200058    PTE at FFFFF6FC4000B220
Unable to get PXE FFFFF6FB7DBEDF88
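
As a cross-check on the !pte output above, the four table-entry addresses can be recomputed from the faulting VA alone. The sketch below assumes the fixed page-table self-map bases that x64 Windows used before the base was randomized in later Windows 10 releases; the function names are made up for illustration. The results agree with the PXE, PPE, PDE and PTE addresses printed by the debugger, so the addresses themselves are well-formed; "Unable to get PXE" only says that the dump does not contain that page-table memory.

```c
#include <stdint.h>

/* Assumed fixed self-map bases (x64 Windows before base randomization). */
#define PTE_BASE 0xFFFFF68000000000ULL
#define PDE_BASE 0xFFFFF6FB40000000ULL
#define PPE_BASE 0xFFFFF6FB7DA00000ULL
#define PXE_BASE 0xFFFFF6FB7DBED000ULL

/*
 * Hypothetical helper: keep the 48 translated bits of the VA, shift away
 * the bits covered by one entry at this level, and index into the table
 * (each entry is 8 bytes).
 */
uint64_t entry_address(uint64_t va, unsigned shift, uint64_t base)
{
    uint64_t index = (va & 0x0000FFFFFFFFFFFFULL) >> shift;
    return base + index * 8;
}

uint64_t pte_address(uint64_t va) { return entry_address(va, 12, PTE_BASE); }
uint64_t pde_address(uint64_t va) { return entry_address(va, 21, PDE_BASE); }
uint64_t ppe_address(uint64_t va) { return entry_address(va, 30, PPE_BASE); }
uint64_t pxe_address(uint64_t va) { return entry_address(va, 39, PXE_BASE); }
```

For example, pte_address(0xFFFFF880016447CFULL) evaluates to 0xFFFFF6FC4000B220, the same PTE address shown in the debugger output.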