Windows System Software -- Consulting, Training, Development -- Unique Expertise, Guaranteed Results

Before Posting...
Please check out the Community Guidelines in the Announcements and Administration Category.

RE: WdfRequestUnmarkCanc?elable results in BSOD (continued)

Doron_HolanDoron_Holan Member - All Emails Posts: 10,438
!wdfkd.wdflogdump riffaqq will tell you what is going wrong

d

Bent from my phone
________________________________
From: xxxxx@gmail.com
Sent: ?4/?1/?2015 5:43 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] WdfRequestUnmarkCanc?elable results in BSOD (continued)

I can't seem to locate the original thread I started last month. So I'll start a new one here.

I have an IOCTL based DMA driver that I wrote for interfacing with PCIe FPGAs. It's for Win7 and built using the KMDF APIs. It works well and is reliable until I add support for IO cancellation. Once I add the WdfRequestMarkCancelableEx, WdfRequestUnmarkCancelable calls I get the BSOD .

The Minidump log when I have driver verifier turned on (as well as KMDF VerifyOn set to 1) is below.

CUSTOMER_CRASH_COUNT: 1

DEFAULT_BUCKET_ID: VERIFIER_ENABLED_VISTA_MINIDUMP

BUGCHECK_STR: 0x1E

PROCESS_NAME: svchost.exe

CURRENT_IRQL: 2

TAG_NOT_DEFINED_c000000f: FFFFF880032FFFB0

EXCEPTION_RECORD: fffff880032ff838 -- (.exr 0xfffff880032ff838)
ExceptionAddress: fffff80002c7ccd0 (nt!DbgBreakPoint)
ExceptionCode: 80000003 (Break instruction exception)
ExceptionFlags: 00000000
NumberParameters: 1
Parameter[0]: 0000000000000000

TRAP_FRAME: fffff880032ff8e0 -- (.trap 0xfffff880032ff8e0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=fffff880032ffa78 rbx=0000000000000000 rcx=fffff98002aaafd0
rdx=fffff98002aaafd0 rsi=0000000000000000 rdi=0000000000000000
rip=fffff80002c7ccd1 rsp=fffff880032ffa78 rbp=fffff880032ffb02
r8=0000000000000000 r9=fffff88000ede610 r10=7efefefefefeff40
r11=8101010101010100 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl nz na pe nc
nt!DbgBreakPoint+0x1:
fffff800`02c7ccd1 c3 ret
Resetting default scope

LAST_CONTROL_TRANSFER: from fffff80002c7c8be to fffff80002c84e90

STACK_TEXT:
fffff880`032fe918 fffff800`02c7c8be : fffffa80`199bc580 fffff880`00c0bf6b fffff880`032ff090 fffff800`02cb0a00 : nt!KeBugCheck
fffff880`032fe920 fffff800`02cb06cd : fffff800`02e95398 fffff800`02dce22c fffff800`02c10000 fffff880`032ff838 : nt!KiKernelCalloutExceptionHandler+0xe
fffff880`032fe950 fffff800`02caf4a5 : fffff800`02dd62c4 fffff880`032fe9c8 fffff880`032ff838 fffff800`02c10000 : nt!RtlpExecuteHandlerForException+0xd
fffff880`032fe980 fffff800`02cc0431 : fffff880`032ff838 fffff880`032ff090 fffff880`00000000 00000000`c0000010 : nt!RtlDispatchException+0x415
fffff880`032ff060 fffff800`02c84542 : fffff880`032ff838 00000000`00000000 fffff880`032ff8e0 fffff980`02aaae40 : nt!KiDispatchException+0x135
fffff880`032ff700 fffff800`02c82374 : 00000000`00000000 fffff880`032ff960 00000000`c0000010 fffff980`02aaae40 : nt!KiExceptionDispatch+0xc2
fffff880`032ff8e0 fffff800`02c7ccd1 : fffff880`00efe0bd fffff980`02aaafd0 fffff980`02aaafd0 00000000`00000000 : nt!KiBreakpointTrap+0xf4
fffff880`032ffa78 fffff880`00efe0bd : fffff980`02aaafd0 fffff980`02aaafd0 00000000`00000000 fffff880`00ede610 : nt!DbgBreakPoint+0x1
fffff880`032ffa80 fffff880`00eff691 : 00000000`00000000 fffff980`02aaae40 fffff980`02aaae40 fffff980`0c722e60 : Wdf01000!FxRequest::Vf_VerifyRequestIsCancelable+0x79
fffff880`032ffad0 fffff880`00e5dbcc : 00000000`00000002 fffff980`0c722e60 fffff880`032ffb50 fffff880`00000008 : Wdf01000!FxIoQueue::Vf_VerifyRequestCancelable+0x99
fffff880`032ffb10 fffff880`00e68685 : ffff9d59`b1850b10 fffff980`0c722e60 0000067f`f38dd198 00000000`00000022 : Wdf01000!FxIoQueue::RequestCancelable+0x3c
fffff880`032ffb80 fffff880`061a821b : fffff980`0c722e60 0000067f`fd4db1b8 fffff980`02b24e40 fffffa80`1bf8049c : Wdf01000!imp_WdfRequestUnmarkCancelable+0x115
fffff880`032ffbe0 fffff880`061a6a22 : 0000067f`f38dd198 fffffa80`1bf8049c fffff880`061a9c70 fffffa80`1bf8030c : riffa!WdfRequestUnmarkCancelable+0x1b [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h @ 775]
fffff880`032ffc10 fffff880`061a7d6c : fffffa80`1bf802d0 fffff800`00000000 fffff880`00000000 fffffa80`1bf8030c : riffa!RiffaCompleteRequest+0xa2 [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1462]
fffff880`032ffc70 fffff880`061a5252 : fffffa80`1bf802d0 fffff880`00000000 fffff880`00000001 fffffa80`00000000 : riffa!RiffaTransactionComplete+0x10c [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1907]
fffff880`032ffcc0 fffff880`00ed96d3 : 0000067f`fd4db1b8 0000057f`e407ffd8 00000000`4e54241d 00000000`4559db6a : riffa!RiffaEvtInterruptDpc+0x622 [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 878]
fffff880`032ffed0 fffff800`02c907ac : fffff880`032d0180 fffff880`032d0180 fffff980`02b24ed8 00000000`00000042 : Wdf01000!FxInterrupt::DpcHandler+0xc3
fffff880`032fff00 fffff800`02c87925 : 00000000`00000000 fffffa80`19cdb300 00000000`00000000 fffff880`00ed9840 : nt!KiRetireDpcList+0x1bc
fffff880`032fffb0 fffff800`02c8773c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxRetireDpcList+0x5
fffff880`063657e0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiDispatchInterruptContinue


STACK_COMMAND: kb

FOLLOWUP_IP:
riffa!WdfRequestUnmarkCancelable+1b [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h @ 775]
fffff880`061a821b 4883c428 add rsp,28h

FAULTING_SOURCE_LINE: c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h

FAULTING_SOURCE_FILE: c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h

FAULTING_SOURCE_LINE_NUMBER: 775

FAULTING_SOURCE_CODE:
771: WDFREQUEST Request
772: )
773: {
774: return ((PFN_WDFREQUESTUNMARKCANCELABLE) WdfFunctions[WdfRequestUnmarkCancelableTableIndex])(WdfDriverGlobals, Request);
> 775: }
776:
777: //
778: // WDF Function: WdfRequestIsCanceled
779: //
780: typedef


SYMBOL_STACK_INDEX: c

SYMBOL_NAME: riffa!WdfRequestUnmarkCancelable+1b

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: riffa

IMAGE_NAME: riffa.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 54e4c333

FAILURE_BUCKET_ID: X64_0x1E_VRF_riffa!WdfRequestUnmarkCancelable+1b

BUCKET_ID: X64_0x1E_VRF_riffa!WdfRequestUnmarkCancelable+1b

Followup: MachineOwner

---------------------------------
It seems to suggest that the call to WdfRequestUnmarkCancelable in my RiffaTransactionComplete function is to blame. OK. But my understanding is that I need to call WdfRequestUnmarkCancelable on every WDFREQUEST that I call WdfRequestMarkCancelable. According to the WdfRequestUnmarkCancelable MSDN page (https://msdn.microsoft.com/en-us/library/windows/hardware/ff550035(v=vs.85).aspx), this must be done everywhere except the EvtRequestCancel callback.

The general flow of things in my driver:

-> RiffaEvtIoDeviceControl()
// This is the entry point
-> RiffaIoctlSend()
// Save the WDFREQUEST in a field on the PDEVICE_EXTENSION object.
// Start a DMA transfer.
// Call WdfRequestMarkCancelableEx() with the WDFREQUEST and own EvtRequestCancel callback called RiffaEvtRequestCancel.

The driver waits for an interrupt to signal the end of the DMA transfer.

-> RiffaEvtInterruptDpc()
// Interrupt handler
-> RiffaTransactionComplete()
// Ends the DMA transfer, releases the DMA transaction object.
-> RiffaCompleteRequest()
// Gets WDFREQUEST off the PDEVICE_EXTENSION object. Checks if it's null. If null the request was cancelled and this does nothing. If not null, sets the field on the PDEVICE_EXTENSION to null and calls WdfRequestUnmarkCancelable(), followed by WdfRequestCompleteWithInformation().

If ever invoked, the EvtRequestCancel callback, RiffaEvtRequestCancel, sets the WDFREQUEST field on the PDEVICE_EXTENSION to null and calls WdfRequestCompleteWithInformation().

It's a pretty straightforward/simple driver modeled after the PLX9x5x driver example. I have yet been able to determine what's causing this problem. The only thing I do know is that it only seems to happen when issuing a lot of back to back invocations. Stepping through invocations with a debugger seems to slow things down enough to prevent whatever's going on.

I'll post the relevant function bodies in a follow up post.

I'll add a few extra detail and suspicions. The suspicions about possible causes stem from other somewhat related posts I've found. But as of yet, I've been unable to identify these ideas as the culprit.

1) The driver runs flawlessly without WdfRequestMarkCancelable & WdfRequestUnmarkCancelable. With low frequency and high frequency back to back invocations. But I need to be able to cancel the IO request. So I'm stuck using WdfRequestMarkCancelable & WdfRequestUnmarkCancelable.

2) Someone had a similar problem and discovered that the WDFREQUEST objects were being reused behind the scenes and that person's driver was failing to call WdfRequestUnmarkCancelable on one of the code paths. His error was similar but was being raised when calling WdfRequestMarkCancelable on a WDFREQUEST that was presumably still marked cancelable. I don't think this is what's going on here, but knowing that the WDFREQUESTS are being used is useful.

3) In my initial post on this, someone vaguely suggested that the WDFREQUEST might be non-null on the PDEVICE_EXTENSION but that somehow the "underlying request" object" might have been completed. I don't see how this could happen as both code paths that complete this WDFREQUEST mark it as null before doing so. This suggestion was a bit vague, so I don't really know what would cause such a situation. If you do, please enlighten me.

Any ideas would be appreciated. Thanks.

Matt


---
NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer
d

Comments

  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    Thanks for the quick reply. That command didn't seem to do anything
    meaningful.

    10: kd> !wdfkd.wdflogdump riffaqq
    c0000005 Exception in wdfkd.wdflogdump debugger extension.
    PC: 00000000`6b9f942b VA: 00000000`00000000 R/W: 0 Parameter:
    00000000`0001003f

    I tried it with just the driver name "riffa", but it resulted in the same:

    10: kd> !wdfkd.wdflogdump riffa
    c0000005 Exception in wdfkd.wdflogdump debugger extension.
    PC: 00000000`6b9f942b VA: 00000000`00000000 R/W: 0 Parameter:
    00000000`0001003f


    On Wed, Apr 1, 2015 at 5:56 PM, Doron Holan
    wrote:

    > !wdfkd.wdflogdump riffaqq will tell you what is going wrong
    >
    > d
    >
    > Bent from my phone
    > ------------------------------
    > From: xxxxx@gmail.com
    > Sent: ‎4/‎1/‎2015 5:43 PM
    > To: Windows System Software Devs Interest List
    > Subject: [ntdev] WdfRequestUnmarkCanc?elable results in BSOD (continued)
    >
    > I can't seem to locate the original thread I started last month. So
    > I'll start a new one here.
    >
    > I have an IOCTL based DMA driver that I wrote for interfacing with PCIe
    > FPGAs. It's for Win7 and built using the KMDF APIs. It works well and is
    > reliable until I add support for IO cancellation. Once I add the
    > WdfRequestMarkCancelableEx, WdfRequestUnmarkCancelable calls I get the BSOD
    > .
    >
    > The Minidump log when I have driver verifier turned on (as well as KMDF
    > VerifyOn set to 1) is below.
    >
    > CUSTOMER_CRASH_COUNT: 1
    >
    > DEFAULT_BUCKET_ID: VERIFIER_ENABLED_VISTA_MINIDUMP
    >
    > BUGCHECK_STR: 0x1E
    >
    > PROCESS_NAME: svchost.exe
    >
    > CURRENT_IRQL: 2
    >
    > TAG_NOT_DEFINED_c000000f: FFFFF880032FFFB0
    >
    > EXCEPTION_RECORD: fffff880032ff838 -- (.exr 0xfffff880032ff838)
    > ExceptionAddress: fffff80002c7ccd0 (nt!DbgBreakPoint)
    > ExceptionCode: 80000003 (Break instruction exception)
    > ExceptionFlags: 00000000
    > NumberParameters: 1
    > Parameter[0]: 0000000000000000
    >
    > TRAP_FRAME: fffff880032ff8e0 -- (.trap 0xfffff880032ff8e0)
    > NOTE: The trap frame does not contain all registers.
    > Some register values may be zeroed or incorrect.
    > rax=fffff880032ffa78 rbx=0000000000000000 rcx=fffff98002aaafd0
    > rdx=fffff98002aaafd0 rsi=0000000000000000 rdi=0000000000000000
    > rip=fffff80002c7ccd1 rsp=fffff880032ffa78 rbp=fffff880032ffb02
    > r8=0000000000000000 r9=fffff88000ede610 r10=7efefefefefeff40
    > r11=8101010101010100 r12=0000000000000000 r13=0000000000000000
    > r14=0000000000000000 r15=0000000000000000
    > iopl=0 nv up ei pl nz na pe nc
    > nt!DbgBreakPoint+0x1:
    > fffff800`02c7ccd1 c3 ret
    > Resetting default scope
    >
    > LAST_CONTROL_TRANSFER: from fffff80002c7c8be to fffff80002c84e90
    >
    > STACK_TEXT:
    > fffff880`032fe918 fffff800`02c7c8be : fffffa80`199bc580 fffff880`00c0bf6b
    > fffff880`032ff090 fffff800`02cb0a00 : nt!KeBugCheck
    > fffff880`032fe920 fffff800`02cb06cd : fffff800`02e95398 fffff800`02dce22c
    > fffff800`02c10000 fffff880`032ff838 : nt!KiKernelCalloutExceptionHandler+0xe
    > fffff880`032fe950 fffff800`02caf4a5 : fffff800`02dd62c4 fffff880`032fe9c8
    > fffff880`032ff838 fffff800`02c10000 : nt!RtlpExecuteHandlerForException+0xd
    > fffff880`032fe980 fffff800`02cc0431 : fffff880`032ff838 fffff880`032ff090
    > fffff880`00000000 00000000`c0000010 : nt!RtlDispatchException+0x415
    > fffff880`032ff060 fffff800`02c84542 : fffff880`032ff838 00000000`00000000
    > fffff880`032ff8e0 fffff980`02aaae40 : nt!KiDispatchException+0x135
    > fffff880`032ff700 fffff800`02c82374 : 00000000`00000000 fffff880`032ff960
    > 00000000`c0000010 fffff980`02aaae40 : nt!KiExceptionDispatch+0xc2
    > fffff880`032ff8e0 fffff800`02c7ccd1 : fffff880`00efe0bd fffff980`02aaafd0
    > fffff980`02aaafd0 00000000`00000000 : nt!KiBreakpointTrap+0xf4
    > fffff880`032ffa78 fffff880`00efe0bd : fffff980`02aaafd0 fffff980`02aaafd0
    > 00000000`00000000 fffff880`00ede610 : nt!DbgBreakPoint+0x1
    > fffff880`032ffa80 fffff880`00eff691 : 00000000`00000000 fffff980`02aaae40
    > fffff980`02aaae40 fffff980`0c722e60 :
    > Wdf01000!FxRequest::Vf_VerifyRequestIsCancelable+0x79
    > fffff880`032ffad0 fffff880`00e5dbcc : 00000000`00000002 fffff980`0c722e60
    > fffff880`032ffb50 fffff880`00000008 :
    > Wdf01000!FxIoQueue::Vf_VerifyRequestCancelable+0x99
    > fffff880`032ffb10 fffff880`00e68685 : ffff9d59`b1850b10 fffff980`0c722e60
    > 0000067f`f38dd198 00000000`00000022 :
    > Wdf01000!FxIoQueue::RequestCancelable+0x3c
    > fffff880`032ffb80 fffff880`061a821b : fffff980`0c722e60 0000067f`fd4db1b8
    > fffff980`02b24e40 fffffa80`1bf8049c :
    > Wdf01000!imp_WdfRequestUnmarkCancelable+0x115
    > fffff880`032ffbe0 fffff880`061a6a22 : 0000067f`f38dd198 fffffa80`1bf8049c
    > fffff880`061a9c70 fffffa80`1bf8030c : riffa!WdfRequestUnmarkCancelable+0x1b
    > [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h @ 775]
    > fffff880`032ffc10 fffff880`061a7d6c : fffffa80`1bf802d0 fffff800`00000000
    > fffff880`00000000 fffffa80`1bf8030c : riffa!RiffaCompleteRequest+0xa2
    > [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1462]
    > fffff880`032ffc70 fffff880`061a5252 : fffffa80`1bf802d0 fffff880`00000000
    > fffff880`00000001 fffffa80`00000000 : riffa!RiffaTransactionComplete+0x10c
    > [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1907]
    > fffff880`032ffcc0 fffff880`00ed96d3 : 0000067f`fd4db1b8 0000057f`e407ffd8
    > 00000000`4e54241d 00000000`4559db6a : riffa!RiffaEvtInterruptDpc+0x622
    > [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 878]
    > fffff880`032ffed0 fffff800`02c907ac : fffff880`032d0180 fffff880`032d0180
    > fffff980`02b24ed8 00000000`00000042 : Wdf01000!FxInterrupt::DpcHandler+0xc3
    > fffff880`032fff00 fffff800`02c87925 : 00000000`00000000 fffffa80`19cdb300
    > 00000000`00000000 fffff880`00ed9840 : nt!KiRetireDpcList+0x1bc
    > fffff880`032fffb0 fffff800`02c8773c : 00000000`00000000 00000000`00000000
    > 00000000`00000000 00000000`00000000 : nt!KxRetireDpcList+0x5
    > fffff880`063657e0 00000000`00000000 : 00000000`00000000 00000000`00000000
    > 00000000`00000000 00000000`00000000 : nt!KiDispatchInterruptContinue
    >
    >
    > STACK_COMMAND: kb
    >
    > FOLLOWUP_IP:
    > riffa!WdfRequestUnmarkCancelable+1b
    > [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h @ 775]
    > fffff880`061a821b 4883c428 add rsp,28h
    >
    > FAULTING_SOURCE_LINE: c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h
    >
    > FAULTING_SOURCE_FILE: c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h
    >
    > FAULTING_SOURCE_LINE_NUMBER: 775
    >
    > FAULTING_SOURCE_CODE:
    > 771: WDFREQUEST Request
    > 772: )
    > 773: {
    > 774: return ((PFN_WDFREQUESTUNMARKCANCELABLE)
    > WdfFunctions[WdfRequestUnmarkCancelableTableIndex])(WdfDriverGlobals,
    > Request);
    > > 775: }
    > 776:
    > 777: //
    > 778: // WDF Function: WdfRequestIsCanceled
    > 779: //
    > 780: typedef
    >
    >
    > SYMBOL_STACK_INDEX: c
    >
    > SYMBOL_NAME: riffa!WdfRequestUnmarkCancelable+1b
    >
    > FOLLOWUP_NAME: MachineOwner
    >
    > MODULE_NAME: riffa
    >
    > IMAGE_NAME: riffa.sys
    >
    > DEBUG_FLR_IMAGE_TIMESTAMP: 54e4c333
    >
    > FAILURE_BUCKET_ID: X64_0x1E_VRF_riffa!WdfRequestUnmarkCancelable+1b
    >
    > BUCKET_ID: X64_0x1E_VRF_riffa!WdfRequestUnmarkCancelable+1b
    >
    > Followup: MachineOwner
    >
    > ---------------------------------
    > It seems to suggest that the call to WdfRequestUnmarkCancelable in my
    > RiffaTransactionComplete function is to blame. OK. But my understanding is
    > that I need to call WdfRequestUnmarkCancelable on every WDFREQUEST that I
    > call WdfRequestMarkCancelable. According to the WdfRequestUnmarkCancelable
    > MSDN page (
    > https://msdn.microsoft.com/en-us/library/windows/hardware/ff550035(v=vs.85).aspx),
    > this must be done everywhere except the EvtRequestCancel callback.
    >
    > The general flow of things in my driver:
    >
    > -> RiffaEvtIoDeviceControl()
    > // This is the entry point
    > -> RiffaIoctlSend()
    > // Save the WDFREQUEST in a field on the PDEVICE_EXTENSION object.
    > // Start a DMA transfer.
    > // Call WdfRequestMarkCancelableEx() with the WDFREQUEST and own
    > EvtRequestCancel callback called RiffaEvtRequestCancel.
    >
    > The driver waits for an interrupt to signal the end of the DMA transfer.
    >
    > -> RiffaEvtInterruptDpc()
    > // Interrupt handler
    > -> RiffaTransactionComplete()
    > // Ends the DMA transfer, releases the DMA transaction object.
    > -> RiffaCompleteRequest()
    > // Gets WDFREQUEST off the PDEVICE_EXTENSION object. Checks if it's
    > null. If null the request was cancelled and this does nothing. If not null,
    > sets the field on the PDEVICE_EXTENSION to null and calls
    > WdfRequestUnmarkCancelable(), followed by
    > WdfRequestCompleteWithInformation().
    >
    > If ever invoked, the EvtRequestCancel callback, RiffaEvtRequestCancel,
    > sets the WDFREQUEST field on the PDEVICE_EXTENSION to null and calls
    > WdfRequestCompleteWithInformation().
    >
    > It's a pretty straightforward/simple driver modeled after the PLX9x5x
    > driver example. I have yet been able to determine what's causing this
    > problem. The only thing I do know is that it only seems to happen when
    > issuing a lot of back to back invocations. Stepping through invocations
    > with a debugger seems to slow things down enough to prevent whatever's
    > going on.
    >
    > I'll post the relevant function bodies in a follow up post.
    >
    > I'll add a few extra detail and suspicions. The suspicions about possible
    > causes stem from other somewhat related posts I've found. But as of yet,
    > I've been unable to identify these ideas as the culprit.
    >
    > 1) The driver runs flawlessly without WdfRequestMarkCancelable &
    > WdfRequestUnmarkCancelable. With low frequency and high frequency back to
    > back invocations. But I need to be able to cancel the IO request. So I'm
    > stuck using WdfRequestMarkCancelable & WdfRequestUnmarkCancelable.
    >
    > 2) Someone had a similar problem and discovered that the WDFREQUEST
    > objects were being reused behind the scenes and that person's driver was
    > failing to call WdfRequestUnmarkCancelable on one of the code paths. His
    > error was similar but was being raised when calling
    > WdfRequestMarkCancelable on a WDFREQUEST that was presumably still marked
    > cancelable. I don't think this is what's going on here, but knowing that
    > the WDFREQUESTS are being used is useful.
    >
    > 3) In my initial post on this, someone vaguely suggested that the
    > WDFREQUEST might be non-null on the PDEVICE_EXTENSION but that somehow the
    > "underlying request" object" might have been completed. I don't see how
    > this could happen as both code paths that complete this WDFREQUEST mark it
    > as null before doing so. This suggestion was a bit vague, so I don't really
    > know what would cause such a situation. If you do, please enlighten me.
    >
    > Any ideas would be appreciated. Thanks.
    >
    > Matt
    >
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >
    > OSR is HIRING!! See http://www.osr.com/careers
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >
    > OSR is HIRING!! See http://www.osr.com/careers
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
    >
  • Doron_HolanDoron_Holan Member - All Emails Posts: 10,438
    What os are you running the debugger on? What version of the debugger are you running? Iirc there is a wdfkd.dll in the kit, you want to use that version.

    d

    Bent from my phone
    ________________________________
    From: Matt Jacobsen
    Sent: ?4/?1/?2015 8:01 PM
    To: Windows System Software Devs Interest List
    Subject: Re: [ntdev] WdfRequestUnmarkCanc?elable results in BSOD (continued)

    Thanks for the quick reply. That command didn't seem to do anything meaningful.

    10: kd> !wdfkd.wdflogdump riffaqq
    c0000005 Exception in wdfkd.wdflogdump debugger extension.
    PC: 00000000`6b9f942b VA: 00000000`00000000 R/W: 0 Parameter: 00000000`0001003f

    I tried it with just the driver name "riffa", but it resulted in the same:

    10: kd> !wdfkd.wdflogdump riffa
    c0000005 Exception in wdfkd.wdflogdump debugger extension.
    PC: 00000000`6b9f942b VA: 00000000`00000000 R/W: 0 Parameter: 00000000`0001003f


    On Wed, Apr 1, 2015 at 5:56 PM, Doron Holan > wrote:
    !wdfkd.wdflogdump riffaqq will tell you what is going wrong

    d

    Bent from my phone
    ________________________________
    From: xxxxx@gmail.com
    Sent: ?4/?1/?2015 5:43 PM
    To: Windows System Software Devs Interest List
    Subject: [ntdev] WdfRequestUnmarkCanc?elable results in BSOD (continued)

    I can't seem to locate the original thread I started last month. So I'll start a new one here.

    I have an IOCTL based DMA driver that I wrote for interfacing with PCIe FPGAs. It's for Win7 and built using the KMDF APIs. It works well and is reliable until I add support for IO cancellation. Once I add the WdfRequestMarkCancelableEx, WdfRequestUnmarkCancelable calls I get the BSOD .

    The Minidump log when I have driver verifier turned on (as well as KMDF VerifyOn set to 1) is below.

    CUSTOMER_CRASH_COUNT: 1

    DEFAULT_BUCKET_ID: VERIFIER_ENABLED_VISTA_MINIDUMP

    BUGCHECK_STR: 0x1E

    PROCESS_NAME: svchost.exe

    CURRENT_IRQL: 2

    TAG_NOT_DEFINED_c000000f: FFFFF880032FFFB0

    EXCEPTION_RECORD: fffff880032ff838 -- (.exr 0xfffff880032ff838)
    ExceptionAddress: fffff80002c7ccd0 (nt!DbgBreakPoint)
    ExceptionCode: 80000003 (Break instruction exception)
    ExceptionFlags: 00000000
    NumberParameters: 1
    Parameter[0]: 0000000000000000

    TRAP_FRAME: fffff880032ff8e0 -- (.trap 0xfffff880032ff8e0)
    NOTE: The trap frame does not contain all registers.
    Some register values may be zeroed or incorrect.
    rax=fffff880032ffa78 rbx=0000000000000000 rcx=fffff98002aaafd0
    rdx=fffff98002aaafd0 rsi=0000000000000000 rdi=0000000000000000
    rip=fffff80002c7ccd1 rsp=fffff880032ffa78 rbp=fffff880032ffb02
    r8=0000000000000000 r9=fffff88000ede610 r10=7efefefefefeff40
    r11=8101010101010100 r12=0000000000000000 r13=0000000000000000
    r14=0000000000000000 r15=0000000000000000
    iopl=0 nv up ei pl nz na pe nc
    nt!DbgBreakPoint+0x1:
    fffff800`02c7ccd1 c3 ret
    Resetting default scope

    LAST_CONTROL_TRANSFER: from fffff80002c7c8be to fffff80002c84e90

    STACK_TEXT:
    fffff880`032fe918 fffff800`02c7c8be : fffffa80`199bc580 fffff880`00c0bf6b fffff880`032ff090 fffff800`02cb0a00 : nt!KeBugCheck
    fffff880`032fe920 fffff800`02cb06cd : fffff800`02e95398 fffff800`02dce22c fffff800`02c10000 fffff880`032ff838 : nt!KiKernelCalloutExceptionHandler+0xe
    fffff880`032fe950 fffff800`02caf4a5 : fffff800`02dd62c4 fffff880`032fe9c8 fffff880`032ff838 fffff800`02c10000 : nt!RtlpExecuteHandlerForException+0xd
    fffff880`032fe980 fffff800`02cc0431 : fffff880`032ff838 fffff880`032ff090 fffff880`00000000 00000000`c0000010 : nt!RtlDispatchException+0x415
    fffff880`032ff060 fffff800`02c84542 : fffff880`032ff838 00000000`00000000 fffff880`032ff8e0 fffff980`02aaae40 : nt!KiDispatchException+0x135
    fffff880`032ff700 fffff800`02c82374 : 00000000`00000000 fffff880`032ff960 00000000`c0000010 fffff980`02aaae40 : nt!KiExceptionDispatch+0xc2
    fffff880`032ff8e0 fffff800`02c7ccd1 : fffff880`00efe0bd fffff980`02aaafd0 fffff980`02aaafd0 00000000`00000000 : nt!KiBreakpointTrap+0xf4
    fffff880`032ffa78 fffff880`00efe0bd : fffff980`02aaafd0 fffff980`02aaafd0 00000000`00000000 fffff880`00ede610 : nt!DbgBreakPoint+0x1
    fffff880`032ffa80 fffff880`00eff691 : 00000000`00000000 fffff980`02aaae40 fffff980`02aaae40 fffff980`0c722e60 : Wdf01000!FxRequest::Vf_VerifyRequestIsCancelable+0x79
    fffff880`032ffad0 fffff880`00e5dbcc : 00000000`00000002 fffff980`0c722e60 fffff880`032ffb50 fffff880`00000008 : Wdf01000!FxIoQueue::Vf_VerifyRequestCancelable+0x99
    fffff880`032ffb10 fffff880`00e68685 : ffff9d59`b1850b10 fffff980`0c722e60 0000067f`f38dd198 00000000`00000022 : Wdf01000!FxIoQueue::RequestCancelable+0x3c
    fffff880`032ffb80 fffff880`061a821b : fffff980`0c722e60 0000067f`fd4db1b8 fffff980`02b24e40 fffffa80`1bf8049c : Wdf01000!imp_WdfRequestUnmarkCancelable+0x115
    fffff880`032ffbe0 fffff880`061a6a22 : 0000067f`f38dd198 fffffa80`1bf8049c fffff880`061a9c70 fffffa80`1bf8030c : riffa!WdfRequestUnmarkCancelable+0x1b [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h @ 775]
    fffff880`032ffc10 fffff880`061a7d6c : fffffa80`1bf802d0 fffff800`00000000 fffff880`00000000 fffffa80`1bf8030c : riffa!RiffaCompleteRequest+0xa2 [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1462]
    fffff880`032ffc70 fffff880`061a5252 : fffffa80`1bf802d0 fffff880`00000000 fffff880`00000001 fffffa80`00000000 : riffa!RiffaTransactionComplete+0x10c [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1907]
    fffff880`032ffcc0 fffff880`00ed96d3 : 0000067f`fd4db1b8 0000057f`e407ffd8 00000000`4e54241d 00000000`4559db6a : riffa!RiffaEvtInterruptDpc+0x622 [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 878]
    fffff880`032ffed0 fffff800`02c907ac : fffff880`032d0180 fffff880`032d0180 fffff980`02b24ed8 00000000`00000042 : Wdf01000!FxInterrupt::DpcHandler+0xc3
    fffff880`032fff00 fffff800`02c87925 : 00000000`00000000 fffffa80`19cdb300 00000000`00000000 fffff880`00ed9840 : nt!KiRetireDpcList+0x1bc
    fffff880`032fffb0 fffff800`02c8773c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxRetireDpcList+0x5
    fffff880`063657e0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiDispatchInterruptContinue


    STACK_COMMAND: kb

    FOLLOWUP_IP:
    riffa!WdfRequestUnmarkCancelable+1b [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h @ 775]
    fffff880`061a821b 4883c428 add rsp,28h

    FAULTING_SOURCE_LINE: c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h

    FAULTING_SOURCE_FILE: c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h

    FAULTING_SOURCE_LINE_NUMBER: 775

    FAULTING_SOURCE_CODE:
    771: WDFREQUEST Request
    772: )
    773: {
    774: return ((PFN_WDFREQUESTUNMARKCANCELABLE) WdfFunctions[WdfRequestUnmarkCancelableTableIndex])(WdfDriverGlobals, Request);
    > 775: }
    776:
    777: //
    778: // WDF Function: WdfRequestIsCanceled
    779: //
    780: typedef


    SYMBOL_STACK_INDEX: c

    SYMBOL_NAME: riffa!WdfRequestUnmarkCancelable+1b

    FOLLOWUP_NAME: MachineOwner

    MODULE_NAME: riffa

    IMAGE_NAME: riffa.sys

    DEBUG_FLR_IMAGE_TIMESTAMP: 54e4c333

    FAILURE_BUCKET_ID: X64_0x1E_VRF_riffa!WdfRequestUnmarkCancelable+1b

    BUCKET_ID: X64_0x1E_VRF_riffa!WdfRequestUnmarkCancelable+1b

    Followup: MachineOwner

    ---------------------------------
    It seems to suggest that the call to WdfRequestUnmarkCancelable in my RiffaTransactionComplete function is to blame. OK. But my understanding is that I need to call WdfRequestUnmarkCancelable on every WDFREQUEST that I call WdfRequestMarkCancelable. According to the WdfRequestUnmarkCancelable MSDN page (https://msdn.microsoft.com/en-us/library/windows/hardware/ff550035(v=vs.85).aspx), this must be done everywhere except the EvtRequestCancel callback.

    The general flow of things in my driver:

    -> RiffaEvtIoDeviceControl()
    // This is the entry point
    -> RiffaIoctlSend()
    // Save the WDFREQUEST in a field on the PDEVICE_EXTENSION object.
    // Start a DMA transfer.
    // Call WdfRequestMarkCancelableEx() with the WDFREQUEST and own EvtRequestCancel callback called RiffaEvtRequestCancel.

    The driver waits for an interrupt to signal the end of the DMA transfer.

    -> RiffaEvtInterruptDpc()
    // Interrupt handler
    -> RiffaTransactionComplete()
    // Ends the DMA transfer, releases the DMA transaction object.
    -> RiffaCompleteRequest()
    // Gets WDFREQUEST off the PDEVICE_EXTENSION object. Checks if it's null. If null the request was cancelled and this does nothing. If not null, sets the field on the PDEVICE_EXTENSION to null and calls WdfRequestUnmarkCancelable(), followed by WdfRequestCompleteWithInformation().

    If ever invoked, the EvtRequestCancel callback, RiffaEvtRequestCancel, sets the WDFREQUEST field on the PDEVICE_EXTENSION to null and calls WdfRequestCompleteWithInformation().

    It's a pretty straightforward/simple driver modeled after the PLX9x5x driver example. I have yet been able to determine what's causing this problem. The only thing I do know is that it only seems to happen when issuing a lot of back to back invocations. Stepping through invocations with a debugger seems to slow things down enough to prevent whatever's going on.

    I'll post the relevant function bodies in a follow up post.

    I'll add a few extra detail and suspicions. The suspicions about possible causes stem from other somewhat related posts I've found. But as of yet, I've been unable to identify these ideas as the culprit.

    1) The driver runs flawlessly without WdfRequestMarkCancelable & WdfRequestUnmarkCancelable. With low frequency and high frequency back to back invocations. But I need to be able to cancel the IO request. So I'm stuck using WdfRequestMarkCancelable & WdfRequestUnmarkCancelable.

    2) Someone had a similar problem and discovered that the WDFREQUEST objects were being reused behind the scenes and that person's driver was failing to call WdfRequestUnmarkCancelable on one of the code paths. His error was similar but was being raised when calling WdfRequestMarkCancelable on a WDFREQUEST that was presumably still marked cancelable. I don't think this is what's going on here, but knowing that the WDFREQUESTS are being used is useful.

    3) In my initial post on this, someone vaguely suggested that the WDFREQUEST might be non-null on the PDEVICE_EXTENSION but that somehow the "underlying request" object" might have been completed. I don't see how this could happen as both code paths that complete this WDFREQUEST mark it as null before doing so. This suggestion was a bit vague, so I don't really know what would cause such a situation. If you do, please enlighten me.

    Any ideas would be appreciated. Thanks.

    Matt


    ---
    NTDEV is sponsored by OSR

    Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

    OSR is HIRING!! See http://www.osr.com/careers

    For our schedule of WDF, WDM, debugging and other seminars visit:
    http://www.osr.com/seminars

    To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

    ---
    NTDEV is sponsored by OSR

    Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

    OSR is HIRING!! See http://www.osr.com/careers

    For our schedule of WDF, WDM, debugging and other seminars visit:
    http://www.osr.com/seminars

    To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

    --- NTDEV is sponsored by OSR Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev OSR is HIRING!! See http://www.osr.com/careers For our schedule of WDF, WDM, debugging and other seminars visit: http://www.osr.com/seminars To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer
    d
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    The debugger is running on Win 7 Professional 32 bit. The version of the
    debugger is 6.2.9200.16384 x86. It's the debugger that came with the WDK
    7600.16385.1. Is there a dll that I need to load into the debugger to get
    the symbols or something? I am running the debugger on a minidump of the
    crash from a Win 7 64 bit machine that had verifier turned on and KMDF
    VerifierOn set to 1 (if that helps).

    Thanks,
    Matt

    On Wed, Apr 1, 2015 at 8:16 PM, Doron Holan
    wrote:

    > What os are you running the debugger on? What version of the debugger
    > are you running? Iirc there is a wdfkd.dll in the kit, you want to use that
    > version.
    >
    > d
    >
    > Bent from my phone
    > ------------------------------
    > From: Matt Jacobsen
    > Sent: ‎4/‎1/‎2015 8:01 PM
    > To: Windows System Software Devs Interest List
    > Subject: Re: [ntdev] WdfRequestUnmarkCanc?elable results in BSOD
    > (continued)
    >
    > Thanks for the quick reply. That command didn't seem to do anything
    > meaningful.
    >
    > 10: kd> !wdfkd.wdflogdump riffaqq
    > c0000005 Exception in wdfkd.wdflogdump debugger extension.
    > PC: 00000000`6b9f942b VA: 00000000`00000000 R/W: 0 Parameter:
    > 00000000`0001003f
    >
    > I tried it with just the driver name "riffa", but it resulted in the
    > same:
    >
    > 10: kd> !wdfkd.wdflogdump riffa
    > c0000005 Exception in wdfkd.wdflogdump debugger extension.
    > PC: 00000000`6b9f942b VA: 00000000`00000000 R/W: 0 Parameter:
    > 00000000`0001003f
    >
    >
    > On Wed, Apr 1, 2015 at 5:56 PM, Doron Holan
    > wrote:
    >
    >> !wdfkd.wdflogdump riffaqq will tell you what is going wrong
    >>
    >> d
    >>
    >> Bent from my phone
    >> ------------------------------
    >> From: xxxxx@gmail.com
    >> Sent: ‎4/‎1/‎2015 5:43 PM
    >> To: Windows System Software Devs Interest List
    >> Subject: [ntdev] WdfRequestUnmarkCanc?elable results in BSOD (continued)
    >>
    >> I can't seem to locate the original thread I started last month. So
    >> I'll start a new one here.
    >>
    >> I have an IOCTL based DMA driver that I wrote for interfacing with PCIe
    >> FPGAs. It's for Win7 and built using the KMDF APIs. It works well and is
    >> reliable until I add support for IO cancellation. Once I add the
    >> WdfRequestMarkCancelableEx, WdfRequestUnmarkCancelable calls I get the BSOD
    >> .
    >>
    >> The Minidump log when I have driver verifier turned on (as well as KMDF
    >> VerifyOn set to 1) is below.
    >>
    >> CUSTOMER_CRASH_COUNT: 1
    >>
    >> DEFAULT_BUCKET_ID: VERIFIER_ENABLED_VISTA_MINIDUMP
    >>
    >> BUGCHECK_STR: 0x1E
    >>
    >> PROCESS_NAME: svchost.exe
    >>
    >> CURRENT_IRQL: 2
    >>
    >> TAG_NOT_DEFINED_c000000f: FFFFF880032FFFB0
    >>
    >> EXCEPTION_RECORD: fffff880032ff838 -- (.exr 0xfffff880032ff838)
    >> ExceptionAddress: fffff80002c7ccd0 (nt!DbgBreakPoint)
    >> ExceptionCode: 80000003 (Break instruction exception)
    >> ExceptionFlags: 00000000
    >> NumberParameters: 1
    >> Parameter[0]: 0000000000000000
    >>
    >> TRAP_FRAME: fffff880032ff8e0 -- (.trap 0xfffff880032ff8e0)
    >> NOTE: The trap frame does not contain all registers.
    >> Some register values may be zeroed or incorrect.
    >> rax=fffff880032ffa78 rbx=0000000000000000 rcx=fffff98002aaafd0
    >> rdx=fffff98002aaafd0 rsi=0000000000000000 rdi=0000000000000000
    >> rip=fffff80002c7ccd1 rsp=fffff880032ffa78 rbp=fffff880032ffb02
    >> r8=0000000000000000 r9=fffff88000ede610 r10=7efefefefefeff40
    >> r11=8101010101010100 r12=0000000000000000 r13=0000000000000000
    >> r14=0000000000000000 r15=0000000000000000
    >> iopl=0 nv up ei pl nz na pe nc
    >> nt!DbgBreakPoint+0x1:
    >> fffff800`02c7ccd1 c3 ret
    >> Resetting default scope
    >>
    >> LAST_CONTROL_TRANSFER: from fffff80002c7c8be to fffff80002c84e90
    >>
    >> STACK_TEXT:
    >> fffff880`032fe918 fffff800`02c7c8be : fffffa80`199bc580 fffff880`00c0bf6b
    >> fffff880`032ff090 fffff800`02cb0a00 : nt!KeBugCheck
    >> fffff880`032fe920 fffff800`02cb06cd : fffff800`02e95398 fffff800`02dce22c
    >> fffff800`02c10000 fffff880`032ff838 : nt!KiKernelCalloutExceptionHandler+0xe
    >> fffff880`032fe950 fffff800`02caf4a5 : fffff800`02dd62c4 fffff880`032fe9c8
    >> fffff880`032ff838 fffff800`02c10000 : nt!RtlpExecuteHandlerForException+0xd
    >> fffff880`032fe980 fffff800`02cc0431 : fffff880`032ff838 fffff880`032ff090
    >> fffff880`00000000 00000000`c0000010 : nt!RtlDispatchException+0x415
    >> fffff880`032ff060 fffff800`02c84542 : fffff880`032ff838 00000000`00000000
    >> fffff880`032ff8e0 fffff980`02aaae40 : nt!KiDispatchException+0x135
    >> fffff880`032ff700 fffff800`02c82374 : 00000000`00000000 fffff880`032ff960
    >> 00000000`c0000010 fffff980`02aaae40 : nt!KiExceptionDispatch+0xc2
    >> fffff880`032ff8e0 fffff800`02c7ccd1 : fffff880`00efe0bd fffff980`02aaafd0
    >> fffff980`02aaafd0 00000000`00000000 : nt!KiBreakpointTrap+0xf4
    >> fffff880`032ffa78 fffff880`00efe0bd : fffff980`02aaafd0 fffff980`02aaafd0
    >> 00000000`00000000 fffff880`00ede610 : nt!DbgBreakPoint+0x1
    >> fffff880`032ffa80 fffff880`00eff691 : 00000000`00000000 fffff980`02aaae40
    >> fffff980`02aaae40 fffff980`0c722e60 :
    >> Wdf01000!FxRequest::Vf_VerifyRequestIsCancelable+0x79
    >> fffff880`032ffad0 fffff880`00e5dbcc : 00000000`00000002 fffff980`0c722e60
    >> fffff880`032ffb50 fffff880`00000008 :
    >> Wdf01000!FxIoQueue::Vf_VerifyRequestCancelable+0x99
    >> fffff880`032ffb10 fffff880`00e68685 : ffff9d59`b1850b10 fffff980`0c722e60
    >> 0000067f`f38dd198 00000000`00000022 :
    >> Wdf01000!FxIoQueue::RequestCancelable+0x3c
    >> fffff880`032ffb80 fffff880`061a821b : fffff980`0c722e60 0000067f`fd4db1b8
    >> fffff980`02b24e40 fffffa80`1bf8049c :
    >> Wdf01000!imp_WdfRequestUnmarkCancelable+0x115
    >> fffff880`032ffbe0 fffff880`061a6a22 : 0000067f`f38dd198 fffffa80`1bf8049c
    >> fffff880`061a9c70 fffffa80`1bf8030c : riffa!WdfRequestUnmarkCancelable+0x1b
    >> [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h @ 775]
    >> fffff880`032ffc10 fffff880`061a7d6c : fffffa80`1bf802d0 fffff800`00000000
    >> fffff880`00000000 fffffa80`1bf8030c : riffa!RiffaCompleteRequest+0xa2
    >> [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1462]
    >> fffff880`032ffc70 fffff880`061a5252 : fffffa80`1bf802d0 fffff880`00000000
    >> fffff880`00000001 fffffa80`00000000 : riffa!RiffaTransactionComplete+0x10c
    >> [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1907]
    >> fffff880`032ffcc0 fffff880`00ed96d3 : 0000067f`fd4db1b8 0000057f`e407ffd8
    >> 00000000`4e54241d 00000000`4559db6a : riffa!RiffaEvtInterruptDpc+0x622
    >> [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 878]
    >> fffff880`032ffed0 fffff800`02c907ac : fffff880`032d0180 fffff880`032d0180
    >> fffff980`02b24ed8 00000000`00000042 : Wdf01000!FxInterrupt::DpcHandler+0xc3
    >> fffff880`032fff00 fffff800`02c87925 : 00000000`00000000 fffffa80`19cdb300
    >> 00000000`00000000 fffff880`00ed9840 : nt!KiRetireDpcList+0x1bc
    >> fffff880`032fffb0 fffff800`02c8773c : 00000000`00000000 00000000`00000000
    >> 00000000`00000000 00000000`00000000 : nt!KxRetireDpcList+0x5
    >> fffff880`063657e0 00000000`00000000 : 00000000`00000000 00000000`00000000
    >> 00000000`00000000 00000000`00000000 : nt!KiDispatchInterruptContinue
    >>
    >>
    >> STACK_COMMAND: kb
    >>
    >> FOLLOWUP_IP:
    >> riffa!WdfRequestUnmarkCancelable+1b
    >> [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h @ 775]
    >> fffff880`061a821b 4883c428 add rsp,28h
    >>
    >> FAULTING_SOURCE_LINE:
    >> c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h
    >>
    >> FAULTING_SOURCE_FILE:
    >> c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfrequest.h
    >>
    >> FAULTING_SOURCE_LINE_NUMBER: 775
    >>
    >> FAULTING_SOURCE_CODE:
    >> 771: WDFREQUEST Request
    >> 772: )
    >> 773: {
    >> 774: return ((PFN_WDFREQUESTUNMARKCANCELABLE)
    >> WdfFunctions[WdfRequestUnmarkCancelableTableIndex])(WdfDriverGlobals,
    >> Request);
    >> > 775: }
    >> 776:
    >> 777: //
    >> 778: // WDF Function: WdfRequestIsCanceled
    >> 779: //
    >> 780: typedef
    >>
    >>
    >> SYMBOL_STACK_INDEX: c
    >>
    >> SYMBOL_NAME: riffa!WdfRequestUnmarkCancelable+1b
    >>
    >> FOLLOWUP_NAME: MachineOwner
    >>
    >> MODULE_NAME: riffa
    >>
    >> IMAGE_NAME: riffa.sys
    >>
    >> DEBUG_FLR_IMAGE_TIMESTAMP: 54e4c333
    >>
    >> FAILURE_BUCKET_ID: X64_0x1E_VRF_riffa!WdfRequestUnmarkCancelable+1b
    >>
    >> BUCKET_ID: X64_0x1E_VRF_riffa!WdfRequestUnmarkCancelable+1b
    >>
    >> Followup: MachineOwner
    >>
    >> ---------------------------------
    >> It seems to suggest that the call to WdfRequestUnmarkCancelable in my
    >> RiffaTransactionComplete function is to blame. OK. But my understanding is
    >> that I need to call WdfRequestUnmarkCancelable on every WDFREQUEST that I
    >> call WdfRequestMarkCancelable. According to the WdfRequestUnmarkCancelable
    >> MSDN page (
    >> https://msdn.microsoft.com/en-us/library/windows/hardware/ff550035(v=vs.85).aspx),
    >> this must be done everywhere except the EvtRequestCancel callback.
    >>
    >> The general flow of things in my driver:
    >>
    >> -> RiffaEvtIoDeviceControl()
    >> // This is the entry point
    >> -> RiffaIoctlSend()
    >> // Save the WDFREQUEST in a field on the PDEVICE_EXTENSION object.
    >> // Start a DMA transfer.
    >> // Call WdfRequestMarkCancelableEx() with the WDFREQUEST and own
    >> EvtRequestCancel callback called RiffaEvtRequestCancel.
    >>
    >> The driver waits for an interrupt to signal the end of the DMA transfer.
    >>
    >> -> RiffaEvtInterruptDpc()
    >> // Interrupt handler
    >> -> RiffaTransactionComplete()
    >> // Ends the DMA transfer, releases the DMA transaction object.
    >> -> RiffaCompleteRequest()
    >> // Gets WDFREQUEST off the PDEVICE_EXTENSION object. Checks if it's
    >> null. If null the request was cancelled and this does nothing. If not null,
    >> sets the field on the PDEVICE_EXTENSION to null and calls
    >> WdfRequestUnmarkCancelable(), followed by
    >> WdfRequestCompleteWithInformation().
    >>
    >> If ever invoked, the EvtRequestCancel callback, RiffaEvtRequestCancel,
    >> sets the WDFREQUEST field on the PDEVICE_EXTENSION to null and calls
    >> WdfRequestCompleteWithInformation().
    >>
    >> It's a pretty straightforward/simple driver modeled after the PLX9x5x
    >> driver example. I have yet been able to determine what's causing this
    >> problem. The only thing I do know is that it only seems to happen when
    >> issuing a lot of back to back invocations. Stepping through invocations
    >> with a debugger seems to slow things down enough to prevent whatever's
    >> going on.
    >>
    >> I'll post the relevant function bodies in a follow up post.
    >>
    >> I'll add a few extra detail and suspicions. The suspicions about possible
    >> causes stem from other somewhat related posts I've found. But as of yet,
    >> I've been unable to identify these ideas as the culprit.
    >>
    >> 1) The driver runs flawlessly without WdfRequestMarkCancelable &
    >> WdfRequestUnmarkCancelable. With low frequency and high frequency back to
    >> back invocations. But I need to be able to cancel the IO request. So I'm
    >> stuck using WdfRequestMarkCancelable & WdfRequestUnmarkCancelable.
    >>
    >> 2) Someone had a similar problem and discovered that the WDFREQUEST
    >> objects were being reused behind the scenes and that person's driver was
    >> failing to call WdfRequestUnmarkCancelable on one of the code paths. His
    >> error was similar but was being raised when calling
    >> WdfRequestMarkCancelable on a WDFREQUEST that was presumably still marked
    >> cancelable. I don't think this is what's going on here, but knowing that
    >> the WDFREQUESTS are being used is useful.
    >>
    >> 3) In my initial post on this, someone vaguely suggested that the
    >> WDFREQUEST might be non-null on the PDEVICE_EXTENSION but that somehow the
    >> "underlying request" object" might have been completed. I don't see how
    >> this could happen as both code paths that complete this WDFREQUEST mark it
    >> as null before doing so. This suggestion was a bit vague, so I don't really
    >> know what would cause such a situation. If you do, please enlighten me.
    >>
    >> Any ideas would be appreciated. Thanks.
    >>
    >> Matt
    >>
    >>
    >> ---
    >> NTDEV is sponsored by OSR
    >>
    >> Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >>
    >> OSR is HIRING!! See http://www.osr.com/careers
    >>
    >> For our schedule of WDF, WDM, debugging and other seminars visit:
    >> http://www.osr.com/seminars
    >>
    >> To unsubscribe, visit the List Server section of OSR Online at
    >> http://www.osronline.com/page.cfm?name=ListServer
    >>
    >> ---
    >> NTDEV is sponsored by OSR
    >>
    >> Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >>
    >> OSR is HIRING!! See http://www.osr.com/careers
    >>
    >> For our schedule of WDF, WDM, debugging and other seminars visit:
    >> http://www.osr.com/seminars
    >>
    >> To unsubscribe, visit the List Server section of OSR Online at
    >> http://www.osronline.com/page.cfm?name=ListServer
    >>
    >
    > --- NTDEV is sponsored by OSR Visit the list at:
    > http://www.osronline.com/showlists.cfm?list=ntdev OSR is HIRING!! See
    > http://www.osr.com/careers For our schedule of WDF, WDM, debugging and
    > other seminars visit: http://www.osr.com/seminars To unsubscribe, visit
    > the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >
    > OSR is HIRING!! See http://www.osr.com/careers
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
    >
  • Tim_RobertsTim_Roberts Member - All Emails Posts: 13,035
    On Apr 1, 2015, at 5:44 PM, xxxxx@gmail.com wrote:
    >
    > I have an IOCTL based DMA driver that I wrote for interfacing with PCIe FPGAs. It's for Win7 and built using the KMDF APIs. It works well and is reliable until I add support for IO cancellation. Once I add the WdfRequestMarkCancelableEx, WdfRequestUnmarkCancelable calls I get the BSOD .
    > ...
    > ---------------------------------
    > It seems to suggest that the call to WdfRequestUnmarkCancelable in my RiffaTransactionComplete function is to blame. OK. But my understanding is that I need to call WdfRequestUnmarkCancelable on every WDFREQUEST that I call WdfRequestMarkCancelable. According to the WdfRequestUnmarkCancelable MSDN page (https://msdn.microsoft.com/en-us/library/windows/hardware/ff550035(v=vs.85).aspx), this must be done everywhere except the EvtRequestCancel callback.

    Well, you need call it for every request that was marked. The implication from the stack dump is that you tried to Unmark a request that was not current marked as cancelable. Now, I can’t tell whether the request had been marked and then unmarked, or never marked at all.


    > The general flow of things in my driver:
    >
    > -> RiffaEvtIoDeviceControl()
    > // This is the entry point
    > -> RiffaIoctlSend()
    > // Save the WDFREQUEST in a field on the PDEVICE_EXTENSION object.
    > // Start a DMA transfer.
    > // Call WdfRequestMarkCancelableEx() with the WDFREQUEST and own EvtRequestCancel callback called RiffaEvtRequestCancel.

    Why don’t you just use a manual queue? That way, you don’t have to deal with the cancelable status at all.

    Tim Roberts, xxxxx@probo.com
    Providenza & Boekelheide, Inc.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • Tim_RobertsTim_Roberts Member - All Emails Posts: 13,035
    On Apr 1, 2015, at 5:57 PM, xxxxx@gmail.com wrote:
    >
    > // Complete the send request
    > InterlockedExchange(&DevExt->Chnl[Chnl].InUse, 0);
    > RiffaCompleteRequest(DevExt, Chnl, Status);

    You’re doing those in the wrong order. As soon as you set InUse to 0, your original ioctl handler is free to overwrite that channel’s request with a different one. If it happens to do so before you get into the completion routine, it will complete the wrong request. You should complete it, and then release InUse.

    Again, wouldn’t this all be easier if you had a manual I/O queue for each channel? Put all of your data in a context structure for the request. Every time a DMA completes, you can pop the next one, instead of just rejecting requests when one is already in progress.

    Tim Roberts, xxxxx@probo.com
    Providenza & Boekelheide, Inc.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    Tim, thanks for pointing this out. I removed the InUse entirely and just
    make sure there's no code after the WdfRequestComplete calls. Yes, there's
    no protection for multiple threads trying to enter simultaneously, but I
    can deal with that later after this bug is resolved.

    I'm not sure I fully understand how to use a manual queue. Any example I
    could look at? Your description sounds a lot like what I'm already doing.
    I'm saving the WDFREQUEST in the PDEVICE_EXTENSION upon entering the
    driver. Then when I receive the "DMA done" interrupt I retrieve the
    WDFREQUEST from the PDEVICE_EXTENSION and complete it.

    Lastly, after reorganizing the driver to perform the WdfRequestComplete as
    the absolute last operation from every code path, I'm encountering a
    slightly different error. From the minidump:

    CUSTOMER_CRASH_COUNT: 1
    DEFAULT_BUCKET_ID: VERIFIER_ENABLED_VISTA_MINIDUMP
    BUGCHECK_STR: 0x1E
    PROCESS_NAME: Project1.exe
    CURRENT_IRQL: 2
    TAG_NOT_DEFINED_c000000f: FFFFF880032FFFB0
    EXCEPTION_RECORD: fffff880032ff8b8 -- (.exr 0xfffff880032ff8b8)
    ExceptionAddress: fffff80002c79cd0 (nt!DbgBreakPoint)
    ExceptionCode: 80000003 (Break instruction exception)
    ExceptionFlags: 00000000
    NumberParameters: 1
    Parameter[0]: 0000000000000000
    TRAP_FRAME: fffff880032ff960 -- (.trap 0xfffff880032ff960)
    NOTE: The trap frame does not contain all registers.
    Some register values may be zeroed or incorrect.
    rax=fffff880032ffaf8 rbx=0000000000000000 rcx=fffff98002decfd0
    rdx=fffff98002decfd0 rsi=0000000000000000 rdi=0000000000000000
    rip=fffff80002c79cd1 rsp=fffff880032ffaf8 rbp=fffff880032ffb02
    r8=0000000000000000 r9=fffff88000f97610 r10=7efefefefefeff40
    r11=8101010101010100 r12=0000000000000000 r13=0000000000000000
    r14=0000000000000000 r15=0000000000000000
    iopl=0 nv up ei pl nz na pe nc
    nt!DbgBreakPoint+0x1:
    fffff800`02c79cd1 c3 ret
    Resetting default scope
    LAST_CONTROL_TRANSFER: from fffff80002c798be to fffff80002c81e90
    STACK_TEXT:
    fffff880`032fe998 fffff800`02c798be : 00000000`0000003c fffff880`00fd2c74
    fffff880`032ff110 fffff800`02cada00 : nt!KeBugCheck
    fffff880`032fe9a0 fffff800`02cad6cd : fffff800`02e92398 fffff800`02dcb22c
    fffff800`02c0d000 fffff880`032ff8b8 : nt!KiKernelCalloutExceptionHandler+0xe
    fffff880`032fe9d0 fffff800`02cac4a5 : fffff800`02dd32c4 fffff880`032fea48
    fffff880`032ff8b8 fffff800`02c0d000 : nt!RtlpExecuteHandlerForException+0xd
    fffff880`032fea00 fffff800`02cbd431 : fffff880`032ff8b8 fffff880`032ff110
    fffff880`00000000 00000000`c0000010 : nt!RtlDispatchException+0x415
    fffff880`032ff0e0 fffff800`02c81542 : fffff880`032ff8b8 00000000`00000000
    fffff880`032ff960 fffff980`02dece40 : nt!KiDispatchException+0x135
    fffff880`032ff780 fffff800`02c7f374 : fffffa80`20591470 fffff800`0311eb31
    00000000`00000000 fffff880`032ffb00 : nt!KiExceptionDispatch+0xc2
    fffff880`032ff960 fffff800`02c79cd1 : fffff880`00fb70bd fffff980`02decfd0
    fffff980`02decfd0 00000000`00000000 : nt!KiBreakpointTrap+0xf4
    fffff880`032ffaf8 fffff880`00fb70bd : fffff980`02decfd0 fffff980`02decfd0
    00000000`00000000 fffff880`00f97610 : nt!DbgBreakPoint+0x1
    fffff880`032ffb00 fffff880`00fb8691 : 00000000`00000000 fffff980`02dece40
    fffff980`02dece40 fffff980`09d90e60 :
    Wdf01000!FxRequest::Vf_VerifyRequestIsCancelable+0x79
    fffff880`032ffb50 fffff880`00f16bcc : 00000000`00000002 fffff980`09d90e60
    fffff880`032ffbd0 fffff880`00000008 :
    Wdf01000!FxIoQueue::Vf_VerifyRequestCancelable+0x99
    fffff880`032ffb90 fffff880`00f21685 : fffffa80`1c04d2d0 fffff980`09d90e60
    0000067f`f626f198 fffff800`02cbd7e8 :
    Wdf01000!FxIoQueue::RequestCancelable+0x3c
    fffff880`032ffc00 fffff880`061ba552 : fffff980`09d90e60 00000000`00000000
    00000000`00000000 0000067f`f626f198 :
    Wdf01000!imp_WdfRequestUnmarkCancelable+0x115
    fffff880`032ffc60 fffff880`00000001 : 00000000`00000000 00000000`00000034
    fffff880`032ffded fffff880`061b988b : riffa!RiffaIoctlSend+0x142
    [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1006]
    fffff880`032ffce0 00000000`00000000 : 00000000`00000034 fffff880`032ffded
    fffff880`061b988b 00000000`00000000 : 0xfffff880`00000001

    STACK_COMMAND: kb
    FOLLOWUP_IP:
    riffa!RiffaIoctlSend+142
    [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1006]
    fffff880`061ba552 ?? ???
    FAULTING_SOURCE_LINE:
    c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c
    FAULTING_SOURCE_FILE:
    c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c
    FAULTING_SOURCE_LINE_NUMBER: 1006
    FAULTING_SOURCE_CODE:
    1002: return;
    1003: }
    1004:
    1005: // Set the channel number in the request context for
    RiffaEvtRequestCancel.
    > 1006: reqExt = RiffaGetRequestContext(Request);
    1007: reqExt->Chnl = io->Chnl;
    1008:
    1009: // Start a send transaction.
    1010: if (length) {
    1011: // Start a DMA transaction for sending.

    SYMBOL_STACK_INDEX: c
    SYMBOL_NAME: riffa!RiffaIoctlSend+142
    FOLLOWUP_NAME: MachineOwner
    MODULE_NAME: riffa
    IMAGE_NAME: riffa.sys
    DEBUG_FLR_IMAGE_TIMESTAMP: 551f2b56
    FAILURE_BUCKET_ID: X64_0x1E_VRF_riffa!RiffaIoctlSend+142
    BUCKET_ID: X64_0x1E_VRF_riffa!RiffaIoctlSend+142
    Followup: MachineOwner
    ---------

    This would suggest that the PREQUEST_EXTENSION struct I'm using to pass
    information to the EvtRequestCancel callback is somehow calling
    WdfRequestUnmarkCancelable. Clearly this is going to fail because this is
    the entry point into the driver, I haven't marked the WDFREQUEST
    cancellable yet. In fact, I might complete the request before I do so (if
    there's an input validation problem). I don't know why on earth this call
    to get the PREQUEST_EXTENSION would decide to call
    WdfRequestUnmarkCancelable on my behalf. There's no code in this function
    that calls WdfRequestUnmarkCancelable . This is complete madness.

    Any insights would be appreciated.

    Thanks,
    Matt


    On Wed, Apr 1, 2015 at 11:04 PM, Tim Roberts wrote:

    > On Apr 1, 2015, at 5:57 PM, xxxxx@gmail.com wrote:
    > >
    > > // Complete the send request
    > > InterlockedExchange(&DevExt->Chnl[Chnl].InUse, 0);
    > > RiffaCompleteRequest(DevExt, Chnl, Status);
    >
    > You’re doing those in the wrong order. As soon as you set InUse to 0,
    > your original ioctl handler is free to overwrite that channel’s request
    > with a different one. If it happens to do so before you get into the
    > completion routine, it will complete the wrong request. You should
    > complete it, and then release InUse.
    >
    > Again, wouldn’t this all be easier if you had a manual I/O queue for each
    > channel? Put all of your data in a context structure for the request.
    > Every time a DMA completes, you can pop the next one, instead of just
    > rejecting requests when one is already in progress.
    > —
    > Tim Roberts, xxxxx@probo.com
    > Providenza & Boekelheide, Inc.
    >
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >
    > OSR is HIRING!! See http://www.osr.com/careers
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
    >
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    Any ideas on this? According to the stack trace from the minidump the
    failing instruction sequence is:

    riffa!RiffaIoctlSend --> Wdf01000!imp_WdfRequestUnmarkCancelable

    But this is _impossible_. There is _no_ instruction in RiffaIoctlSend that
    would jump to WdfRequestUnmarkCancelable. Moreover the line in source it
    says is the problem is a call to get the PREQUEST_EXTENSION. That's clearly
    not causing this. The only thing I can imagine is that the driver code is
    corrupt because this stack trace is clearly garbage. To test this, I moved
    some lines of code around without changing any behavior, recompiled the
    driver, and tested again. The same error and stack trace are encountered,
    but this time it says the offending line of code is an if statement
    checking if the value of an INT32 is 0 or not. Absolute rubbish!

    Is this typical of Windows driver development? I've never seen this in
    other OS development. Do I need to disable and re-enable the driver
    verifier every time I upgrade the driver code? Is there a special dance
    that needs to be done? Or sacrificial offering to the Windows driver god?

    Thanks,
    Matt


    On Fri, Apr 3, 2015 at 5:44 PM, Matt Jacobsen
    wrote:

    > Tim, thanks for pointing this out. I removed the InUse entirely and just
    > make sure there's no code after the WdfRequestComplete calls. Yes, there's
    > no protection for multiple threads trying to enter simultaneously, but I
    > can deal with that later after this bug is resolved.
    >
    > I'm not sure I fully understand how to use a manual queue. Any example I
    > could look at? Your description sounds a lot like what I'm already doing.
    > I'm saving the WDFREQUEST in the PDEVICE_EXTENSION upon entering the
    > driver. Then when I receive the "DMA done" interrupt I retrieve the
    > WDFREQUEST from the PDEVICE_EXTENSION and complete it.
    >
    > Lastly, after reorganizing the driver to perform the WdfRequestComplete as
    > the absolute last operation from every code path, I'm encountering a
    > slightly different error. From the minidump:
    >
    > CUSTOMER_CRASH_COUNT: 1
    > DEFAULT_BUCKET_ID: VERIFIER_ENABLED_VISTA_MINIDUMP
    > BUGCHECK_STR: 0x1E
    > PROCESS_NAME: Project1.exe
    > CURRENT_IRQL: 2
    > TAG_NOT_DEFINED_c000000f: FFFFF880032FFFB0
    > EXCEPTION_RECORD: fffff880032ff8b8 -- (.exr 0xfffff880032ff8b8)
    > ExceptionAddress: fffff80002c79cd0 (nt!DbgBreakPoint)
    > ExceptionCode: 80000003 (Break instruction exception)
    > ExceptionFlags: 00000000
    > NumberParameters: 1
    > Parameter[0]: 0000000000000000
    > TRAP_FRAME: fffff880032ff960 -- (.trap 0xfffff880032ff960)
    > NOTE: The trap frame does not contain all registers.
    > Some register values may be zeroed or incorrect.
    > rax=fffff880032ffaf8 rbx=0000000000000000 rcx=fffff98002decfd0
    > rdx=fffff98002decfd0 rsi=0000000000000000 rdi=0000000000000000
    > rip=fffff80002c79cd1 rsp=fffff880032ffaf8 rbp=fffff880032ffb02
    > r8=0000000000000000 r9=fffff88000f97610 r10=7efefefefefeff40
    > r11=8101010101010100 r12=0000000000000000 r13=0000000000000000
    > r14=0000000000000000 r15=0000000000000000
    > iopl=0 nv up ei pl nz na pe nc
    > nt!DbgBreakPoint+0x1:
    > fffff800`02c79cd1 c3 ret
    > Resetting default scope
    > LAST_CONTROL_TRANSFER: from fffff80002c798be to fffff80002c81e90
    > STACK_TEXT:
    > fffff880`032fe998 fffff800`02c798be : 00000000`0000003c fffff880`00fd2c74
    > fffff880`032ff110 fffff800`02cada00 : nt!KeBugCheck
    > fffff880`032fe9a0 fffff800`02cad6cd : fffff800`02e92398 fffff800`02dcb22c
    > fffff800`02c0d000 fffff880`032ff8b8 : nt!KiKernelCalloutExceptionHandler+0xe
    > fffff880`032fe9d0 fffff800`02cac4a5 : fffff800`02dd32c4 fffff880`032fea48
    > fffff880`032ff8b8 fffff800`02c0d000 : nt!RtlpExecuteHandlerForException+0xd
    > fffff880`032fea00 fffff800`02cbd431 : fffff880`032ff8b8 fffff880`032ff110
    > fffff880`00000000 00000000`c0000010 : nt!RtlDispatchException+0x415
    > fffff880`032ff0e0 fffff800`02c81542 : fffff880`032ff8b8 00000000`00000000
    > fffff880`032ff960 fffff980`02dece40 : nt!KiDispatchException+0x135
    > fffff880`032ff780 fffff800`02c7f374 : fffffa80`20591470 fffff800`0311eb31
    > 00000000`00000000 fffff880`032ffb00 : nt!KiExceptionDispatch+0xc2
    > fffff880`032ff960 fffff800`02c79cd1 : fffff880`00fb70bd fffff980`02decfd0
    > fffff980`02decfd0 00000000`00000000 : nt!KiBreakpointTrap+0xf4
    > fffff880`032ffaf8 fffff880`00fb70bd : fffff980`02decfd0 fffff980`02decfd0
    > 00000000`00000000 fffff880`00f97610 : nt!DbgBreakPoint+0x1
    > fffff880`032ffb00 fffff880`00fb8691 : 00000000`00000000 fffff980`02dece40
    > fffff980`02dece40 fffff980`09d90e60 :
    > Wdf01000!FxRequest::Vf_VerifyRequestIsCancelable+0x79
    > fffff880`032ffb50 fffff880`00f16bcc : 00000000`00000002 fffff980`09d90e60
    > fffff880`032ffbd0 fffff880`00000008 :
    > Wdf01000!FxIoQueue::Vf_VerifyRequestCancelable+0x99
    > fffff880`032ffb90 fffff880`00f21685 : fffffa80`1c04d2d0 fffff980`09d90e60
    > 0000067f`f626f198 fffff800`02cbd7e8 :
    > Wdf01000!FxIoQueue::RequestCancelable+0x3c
    > fffff880`032ffc00 fffff880`061ba552 : fffff980`09d90e60 00000000`00000000
    > 00000000`00000000 0000067f`f626f198 :
    > Wdf01000!imp_WdfRequestUnmarkCancelable+0x115
    > fffff880`032ffc60 fffff880`00000001 : 00000000`00000000 00000000`00000034
    > fffff880`032ffded fffff880`061b988b : riffa!RiffaIoctlSend+0x142
    > [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1006]
    > fffff880`032ffce0 00000000`00000000 : 00000000`00000034 fffff880`032ffded
    > fffff880`061b988b 00000000`00000000 : 0xfffff880`00000001
    >
    > STACK_COMMAND: kb
    > FOLLOWUP_IP:
    > riffa!RiffaIoctlSend+142
    > [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1006]
    > fffff880`061ba552 ?? ???
    > FAULTING_SOURCE_LINE:
    > c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c
    > FAULTING_SOURCE_FILE:
    > c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c
    > FAULTING_SOURCE_LINE_NUMBER: 1006
    > FAULTING_SOURCE_CODE:
    > 1002: return;
    > 1003: }
    > 1004:
    > 1005: // Set the channel number in the request context for
    > RiffaEvtRequestCancel.
    > > 1006: reqExt = RiffaGetRequestContext(Request);
    > 1007: reqExt->Chnl = io->Chnl;
    > 1008:
    > 1009: // Start a send transaction.
    > 1010: if (length) {
    > 1011: // Start a DMA transaction for sending.
    >
    > SYMBOL_STACK_INDEX: c
    > SYMBOL_NAME: riffa!RiffaIoctlSend+142
    > FOLLOWUP_NAME: MachineOwner
    > MODULE_NAME: riffa
    > IMAGE_NAME: riffa.sys
    > DEBUG_FLR_IMAGE_TIMESTAMP: 551f2b56
    > FAILURE_BUCKET_ID: X64_0x1E_VRF_riffa!RiffaIoctlSend+142
    > BUCKET_ID: X64_0x1E_VRF_riffa!RiffaIoctlSend+142
    > Followup: MachineOwner
    > ---------
    >
    > This would suggest that the PREQUEST_EXTENSION struct I'm using to pass
    > information to the EvtRequestCancel callback is somehow calling
    > WdfRequestUnmarkCancelable. Clearly this is going to fail because this is
    > the entry point into the driver, I haven't marked the WDFREQUEST
    > cancellable yet. In fact, I might complete the request before I do so (if
    > there's an input validation problem). I don't know why on earth this call
    > to get the PREQUEST_EXTENSION would decide to call
    > WdfRequestUnmarkCancelable on my behalf. There's no code in this function
    > that calls WdfRequestUnmarkCancelable . This is complete madness.
    >
    > Any insights would be appreciated.
    >
    > Thanks,
    > Matt
    >
    >
    > On Wed, Apr 1, 2015 at 11:04 PM, Tim Roberts wrote:
    >
    >> On Apr 1, 2015, at 5:57 PM, xxxxx@gmail.com wrote:
    >> >
    >> > // Complete the send request
    >> > InterlockedExchange(&DevExt->Chnl[Chnl].InUse, 0);
    >> > RiffaCompleteRequest(DevExt, Chnl, Status);
    >>
    >> You’re doing those in the wrong order. As soon as you set InUse to 0,
    >> your original ioctl handler is free to overwrite that channel’s request
    >> with a different one. If it happens to do so before you get into the
    >> completion routine, it will complete the wrong request. You should
    >> complete it, and then release InUse.
    >>
    >> Again, wouldn’t this all be easier if you had a manual I/O queue for each
    >> channel? Put all of your data in a context structure for the request.
    >> Every time a DMA completes, you can pop the next one, instead of just
    >> rejecting requests when one is already in progress.
    >> —
    >> Tim Roberts, xxxxx@probo.com
    >> Providenza & Boekelheide, Inc.
    >>
    >>
    >> ---
    >> NTDEV is sponsored by OSR
    >>
    >> Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >>
    >> OSR is HIRING!! See http://www.osr.com/careers
    >>
    >> For our schedule of WDF, WDM, debugging and other seminars visit:
    >> http://www.osr.com/seminars
    >>
    >> To unsubscribe, visit the List Server section of OSR Online at
    >> http://www.osronline.com/page.cfm?name=ListServer
    >>
    >
    >
  • Mark_RoddyMark_Roddy Member - All Emails Posts: 4,307
    https://github.com/Microsoft/Windows-driver-frameworks

    The source code for KMDF. That might be a better way to get an answer.

    Mark Roddy

    On Mon, Apr 6, 2015 at 11:15 AM, Matt Jacobsen
    wrote:

    > Any ideas on this? According to the stack trace from the minidump the
    > failing instruction sequence is:
    >
    > riffa!RiffaIoctlSend --> Wdf01000!imp_WdfRequestUnmarkCancelable
    >
    > But this is _impossible_. There is _no_ instruction in RiffaIoctlSend
    > that would jump to WdfRequestUnmarkCancelable. Moreover the line in
    > source it says is the problem is a call to get the PREQUEST_EXTENSION.
    > That's clearly not causing this. The only thing I can imagine is that the
    > driver code is corrupt because this stack trace is clearly garbage. To test
    > this, I moved some lines of code around without changing any behavior,
    > recompiled the driver, and tested again. The same error and stack trace are
    > encountered, but this time it says the offending line of code is an if
    > statement checking if the value of an INT32 is 0 or not. Absolute rubbish!
    >
    > Is this typical of Windows driver development? I've never seen this in
    > other OS development. Do I need to disable and re-enable the driver
    > verifier every time I upgrade the driver code? Is there a special dance
    > that needs to be done? Or sacrificial offering to the Windows driver god?
    >
    > Thanks,
    > Matt
    >
    >
    > On Fri, Apr 3, 2015 at 5:44 PM, Matt Jacobsen
    > wrote:
    >
    >> Tim, thanks for pointing this out. I removed the InUse entirely and just
    >> make sure there's no code after the WdfRequestComplete calls. Yes, there's
    >> no protection for multiple threads trying to enter simultaneously, but I
    >> can deal with that later after this bug is resolved.
    >>
    >> I'm not sure I fully understand how to use a manual queue. Any example I
    >> could look at? Your description sounds a lot like what I'm already doing.
    >> I'm saving the WDFREQUEST in the PDEVICE_EXTENSION upon entering the
    >> driver. Then when I receive the "DMA done" interrupt I retrieve the
    >> WDFREQUEST from the PDEVICE_EXTENSION and complete it.
    >>
    >> Lastly, after reorganizing the driver to perform the WdfRequestComplete
    >> as the absolute last operation from every code path, I'm encountering a
    >> slightly different error. From the minidump:
    >>
    >> CUSTOMER_CRASH_COUNT: 1
    >> DEFAULT_BUCKET_ID: VERIFIER_ENABLED_VISTA_MINIDUMP
    >> BUGCHECK_STR: 0x1E
    >> PROCESS_NAME: Project1.exe
    >> CURRENT_IRQL: 2
    >> TAG_NOT_DEFINED_c000000f: FFFFF880032FFFB0
    >> EXCEPTION_RECORD: fffff880032ff8b8 -- (.exr 0xfffff880032ff8b8)
    >> ExceptionAddress: fffff80002c79cd0 (nt!DbgBreakPoint)
    >> ExceptionCode: 80000003 (Break instruction exception)
    >> ExceptionFlags: 00000000
    >> NumberParameters: 1
    >> Parameter[0]: 0000000000000000
    >> TRAP_FRAME: fffff880032ff960 -- (.trap 0xfffff880032ff960)
    >> NOTE: The trap frame does not contain all registers.
    >> Some register values may be zeroed or incorrect.
    >> rax=fffff880032ffaf8 rbx=0000000000000000 rcx=fffff98002decfd0
    >> rdx=fffff98002decfd0 rsi=0000000000000000 rdi=0000000000000000
    >> rip=fffff80002c79cd1 rsp=fffff880032ffaf8 rbp=fffff880032ffb02
    >> r8=0000000000000000 r9=fffff88000f97610 r10=7efefefefefeff40
    >> r11=8101010101010100 r12=0000000000000000 r13=0000000000000000
    >> r14=0000000000000000 r15=0000000000000000
    >> iopl=0 nv up ei pl nz na pe nc
    >> nt!DbgBreakPoint+0x1:
    >> fffff800`02c79cd1 c3 ret
    >> Resetting default scope
    >> LAST_CONTROL_TRANSFER: from fffff80002c798be to fffff80002c81e90
    >> STACK_TEXT:
    >> fffff880`032fe998 fffff800`02c798be : 00000000`0000003c fffff880`00fd2c74
    >> fffff880`032ff110 fffff800`02cada00 : nt!KeBugCheck
    >> fffff880`032fe9a0 fffff800`02cad6cd : fffff800`02e92398 fffff800`02dcb22c
    >> fffff800`02c0d000 fffff880`032ff8b8 : nt!KiKernelCalloutExceptionHandler+0xe
    >> fffff880`032fe9d0 fffff800`02cac4a5 : fffff800`02dd32c4 fffff880`032fea48
    >> fffff880`032ff8b8 fffff800`02c0d000 : nt!RtlpExecuteHandlerForException+0xd
    >> fffff880`032fea00 fffff800`02cbd431 : fffff880`032ff8b8 fffff880`032ff110
    >> fffff880`00000000 00000000`c0000010 : nt!RtlDispatchException+0x415
    >> fffff880`032ff0e0 fffff800`02c81542 : fffff880`032ff8b8 00000000`00000000
    >> fffff880`032ff960 fffff980`02dece40 : nt!KiDispatchException+0x135
    >> fffff880`032ff780 fffff800`02c7f374 : fffffa80`20591470 fffff800`0311eb31
    >> 00000000`00000000 fffff880`032ffb00 : nt!KiExceptionDispatch+0xc2
    >> fffff880`032ff960 fffff800`02c79cd1 : fffff880`00fb70bd fffff980`02decfd0
    >> fffff980`02decfd0 00000000`00000000 : nt!KiBreakpointTrap+0xf4
    >> fffff880`032ffaf8 fffff880`00fb70bd : fffff980`02decfd0 fffff980`02decfd0
    >> 00000000`00000000 fffff880`00f97610 : nt!DbgBreakPoint+0x1
    >> fffff880`032ffb00 fffff880`00fb8691 : 00000000`00000000 fffff980`02dece40
    >> fffff980`02dece40 fffff980`09d90e60 :
    >> Wdf01000!FxRequest::Vf_VerifyRequestIsCancelable+0x79
    >> fffff880`032ffb50 fffff880`00f16bcc : 00000000`00000002 fffff980`09d90e60
    >> fffff880`032ffbd0 fffff880`00000008 :
    >> Wdf01000!FxIoQueue::Vf_VerifyRequestCancelable+0x99
    >> fffff880`032ffb90 fffff880`00f21685 : fffffa80`1c04d2d0 fffff980`09d90e60
    >> 0000067f`f626f198 fffff800`02cbd7e8 :
    >> Wdf01000!FxIoQueue::RequestCancelable+0x3c
    >> fffff880`032ffc00 fffff880`061ba552 : fffff980`09d90e60 00000000`00000000
    >> 00000000`00000000 0000067f`f626f198 :
    >> Wdf01000!imp_WdfRequestUnmarkCancelable+0x115
    >> fffff880`032ffc60 fffff880`00000001 : 00000000`00000000 00000000`00000034
    >> fffff880`032ffded fffff880`061b988b : riffa!RiffaIoctlSend+0x142
    >> [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1006]
    >> fffff880`032ffce0 00000000`00000000 : 00000000`00000034 fffff880`032ffded
    >> fffff880`061b988b 00000000`00000000 : 0xfffff880`00000001
    >>
    >> STACK_COMMAND: kb
    >> FOLLOWUP_IP:
    >> riffa!RiffaIoctlSend+142
    >> [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1006]
    >> fffff880`061ba552 ?? ???
    >> FAULTING_SOURCE_LINE:
    >> c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c
    >> FAULTING_SOURCE_FILE:
    >> c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c
    >> FAULTING_SOURCE_LINE_NUMBER: 1006
    >> FAULTING_SOURCE_CODE:
    >> 1002: return;
    >> 1003: }
    >> 1004:
    >> 1005: // Set the channel number in the request context for
    >> RiffaEvtRequestCancel.
    >> > 1006: reqExt = RiffaGetRequestContext(Request);
    >> 1007: reqExt->Chnl = io->Chnl;
    >> 1008:
    >> 1009: // Start a send transaction.
    >> 1010: if (length) {
    >> 1011: // Start a DMA transaction for sending.
    >>
    >> SYMBOL_STACK_INDEX: c
    >> SYMBOL_NAME: riffa!RiffaIoctlSend+142
    >> FOLLOWUP_NAME: MachineOwner
    >> MODULE_NAME: riffa
    >> IMAGE_NAME: riffa.sys
    >> DEBUG_FLR_IMAGE_TIMESTAMP: 551f2b56
    >> FAILURE_BUCKET_ID: X64_0x1E_VRF_riffa!RiffaIoctlSend+142
    >> BUCKET_ID: X64_0x1E_VRF_riffa!RiffaIoctlSend+142
    >> Followup: MachineOwner
    >> ---------
    >>
    >> This would suggest that the PREQUEST_EXTENSION struct I'm using to pass
    >> information to the EvtRequestCancel callback is somehow calling
    >> WdfRequestUnmarkCancelable. Clearly this is going to fail because this is
    >> the entry point into the driver, I haven't marked the WDFREQUEST
    >> cancellable yet. In fact, I might complete the request before I do so (if
    >> there's an input validation problem). I don't know why on earth this call
    >> to get the PREQUEST_EXTENSION would decide to call
    >> WdfRequestUnmarkCancelable on my behalf. There's no code in this function
    >> that calls WdfRequestUnmarkCancelable . This is complete madness.
    >>
    >> Any insights would be appreciated.
    >>
    >> Thanks,
    >> Matt
    >>
    >>
    >> On Wed, Apr 1, 2015 at 11:04 PM, Tim Roberts wrote:
    >>
    >>> On Apr 1, 2015, at 5:57 PM, xxxxx@gmail.com wrote:
    >>> >
    >>> > // Complete the send request
    >>> > InterlockedExchange(&DevExt->Chnl[Chnl].InUse, 0);
    >>> > RiffaCompleteRequest(DevExt, Chnl, Status);
    >>>
    >>> You’re doing those in the wrong order. As soon as you set InUse to 0,
    >>> your original ioctl handler is free to overwrite that channel’s request
    >>> with a different one. If it happens to do so before you get into the
    >>> completion routine, it will complete the wrong request. You should
    >>> complete it, and then release InUse.
    >>>
    >>> Again, wouldn’t this all be easier if you had a manual I/O queue for
    >>> each channel? Put all of your data in a context structure for the
    >>> request. Every time a DMA completes, you can pop the next one, instead of
    >>> just rejecting requests when one is already in progress.
    >>> —
    >>> Tim Roberts, xxxxx@probo.com
    >>> Providenza & Boekelheide, Inc.
    >>>
    >>>
    >>> ---
    >>> NTDEV is sponsored by OSR
    >>>
    >>> Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >>>
    >>> OSR is HIRING!! See http://www.osr.com/careers
    >>>
    >>> For our schedule of WDF, WDM, debugging and other seminars visit:
    >>> http://www.osr.com/seminars
    >>>
    >>> To unsubscribe, visit the List Server section of OSR Online at
    >>> http://www.osronline.com/page.cfm?name=ListServer
    >>>
    >>
    >>
    > --- NTDEV is sponsored by OSR Visit the list at:
    > http://www.osronline.com/showlists.cfm?list=ntdev OSR is HIRING!! See
    > http://www.osr.com/careers For our schedule of WDF, WDM, debugging and
    > other seminars visit: http://www.osr.com/seminars To unsubscribe, visit
    > the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
    >
  • Tim_RobertsTim_Roberts Member - All Emails Posts: 13,035
    Matt Jacobsen wrote:
    > Tim, thanks for pointing this out. I removed the InUse entirely and
    > just make sure there's no code after the WdfRequestComplete calls.
    > Yes, there's no protection for multiple threads trying to enter
    > simultaneously, but I can deal with that later after this bug is resolved.

    You are assuming this not what's happening right now. Can you be sure
    of that?


    > I'm not sure I fully understand how to use a manual queue. Any example
    > I could look at? Your description sounds a lot like what I'm already
    > doing. I'm saving the WDFREQUEST in the PDEVICE_EXTENSION upon
    > entering the driver. Then when I receive the "DMA done" interrupt I
    > retrieve the WDFREQUEST from the PDEVICE_EXTENSION and complete it.

    It's basically a wrapped implementation of the cancel-safe queues
    (IoCsqXxx) that were introduced in Windows XP. Instead of saving the
    request in a single cell that you have to monitor, you call
    WdfRequestForwardToIoQueue. When you have data to complete, you call
    WdfQueueRetrieveNextRequest. The framework will automatically handle
    cancellation. If you call WdfQueueRetrieveNextRequest and there's
    nothing there, then your request must have been canceled. You can
    register a callback if a request is canceled while on the queue.

    In general, I always try to minimize the time that I own a request. I
    want to hand it back to a queue so the framework does the management.


    > Lastly, after reorganizing the driver to perform the
    > WdfRequestComplete as the absolute last operation from every code
    > path, I'm encountering a slightly different error. From the minidump:

    This is showing a KMDF Verifier assertion failure. The !wdflogdump
    output should show you what the assertion was, but my guess is trying to
    unmark a request that has already been unmarked, or was never marked to
    begin with. You should be able to get the WDFREQUEST handle from the
    dump and explore its information.


    > STACK_COMMAND: kb
    > FOLLOWUP_IP:
    > riffa!RiffaIoctlSend+142
    > [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @
    > 1006]
    > fffff880`061ba552 ?? ???
    > FAULTING_SOURCE_LINE:
    > c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c
    > FAULTING_SOURCE_FILE:
    > c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c

    It's not a good practice to put your driver sources in the DDK tree.
    What happens when you move to the next WDK, or if you have to reinstall?


    > FAULTING_SOURCE_LINE_NUMBER: 1006
    > FAULTING_SOURCE_CODE:
    > 1002: return;
    > 1003: }
    > 1004:
    > 1005: // Set the channel number in the request context for
    > RiffaEvtRequestCancel.
    > > 1006: reqExt = RiffaGetRequestContext(Request);
    > 1007: reqExt->Chnl = io->Chnl;
    > 1008:
    > 1009: // Start a send transaction.
    > 1010: if (length) {
    > 1011: // Start a DMA transaction for sending.
    > ...
    >
    > This would suggest that the PREQUEST_EXTENSION struct I'm using to
    > pass information to the EvtRequestCancel callback is somehow calling
    > WdfRequestUnmarkCancelable. Clearly this is going to fail because this
    > is the entry point into the driver, I haven't marked the WDFREQUEST
    > cancellable yet. In fact, I might complete the request before I do so
    > (if there's an input validation problem). I don't know why on earth
    > this call to get the PREQUEST_EXTENSION would decide to call
    > WdfRequestUnmarkCancelable on my behalf. There's no code in this
    > function that calls WdfRequestUnmarkCancelable . This is complete madness.

    Is this a debug build or a release build? Compiler optimizations can
    really do a number on the debugger's ability to pinpoint locations. Are
    you sure the symbols you have match the binary used in the dump?

    --
    Tim Roberts, xxxxx@probo.com
    Providenza & Boekelheide, Inc.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • Tim_RobertsTim_Roberts Member - All Emails Posts: 13,035
    Matt Jacobsen wrote:
    >
    > Is this typical of Windows driver development? I've never seen this in
    > other OS development. Do I need to disable and re-enable the driver
    > verifier every time I upgrade the driver code? Is there a special
    > dance that needs to be done? Or sacrificial offering to the Windows
    > driver god?
    >

    Debugging is not a precise art. If you have edited your source code,
    for example, the debugger can't know that. It sees "line 936" in the
    debugger tables, so it shows you line 936, even if you have added 12
    lines of code since that build. Also, as I said, in the release build
    with optimizations on, things move around a lot. The debugger tries to
    keep up, but the information can be imprecise.

    You are simply doing post-mortem analysis at this point, right? You
    might need to hook up a live kernel debugger and grab some breakpoints,
    to make sure things are happening in the order you expect.

    --
    Tim Roberts, xxxxx@probo.com
    Providenza & Boekelheide, Inc.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • Mark_RoddyMark_Roddy Member - All Emails Posts: 4,307
    As I think I responded earlier, if you have cancelable requests on your
    hardware that require your driver to perform operations with your hardware
    to effect the cancelation then you have to handle the race condition
    between hardware completion and request cancelation, and the only way I
    have found to do that is to have a request extension that contains two
    state variables:

    struct FDO_REQUEST_CONTEXT
    {
    LONG CancelSet;
    LONG RequestCompleted;
    }

    that your code maintains, and you have to use WdfObjectDereference
    and WdfObjectReference appropriately to keep the request context valid
    until both CancelSet and RequestCompleted are zero.

    Your "put on hardware" routine has to add a reference to the request before
    it commits the request to hardware, and remove that reference on its
    completion handling. If you don't do this correctly you cannot examine the
    request context safely.

    Your hardware completion handling has to inspect RequestCompleted and
    CancelSet, ignoring the request if the RequestCompleted flag is set other
    than to remove its reference on the request,
    calling WdfRequestUnmarkCancelable if CancelSet is true and
    RequestCompleted is false, and handling the error return
    from WdfRequestUnmarkCancelable correctly by not processing the request
    other than to again call WdfObjectDereference.

    Your cancel routine can safely assume that it has exclusive access to the
    request, as the hardware completion side is testing the return value from
    WdfRequestUnmarkCancelable, and the CancelSet flag is not cleared by your
    cancel routine until it is completing the request.

    Mark Roddy

    On Mon, Apr 6, 2015 at 11:15 AM, Matt Jacobsen
    wrote:

    > Any ideas on this? According to the stack trace from the minidump the
    > failing instruction sequence is:
    >
    > riffa!RiffaIoctlSend --> Wdf01000!imp_WdfRequestUnmarkCancelable
    >
    > But this is _impossible_. There is _no_ instruction in RiffaIoctlSend
    > that would jump to WdfRequestUnmarkCancelable. Moreover the line in
    > source it says is the problem is a call to get the PREQUEST_EXTENSION.
    > That's clearly not causing this. The only thing I can imagine is that the
    > driver code is corrupt because this stack trace is clearly garbage. To test
    > this, I moved some lines of code around without changing any behavior,
    > recompiled the driver, and tested again. The same error and stack trace are
    > encountered, but this time it says the offending line of code is an if
    > statement checking if the value of an INT32 is 0 or not. Absolute rubbish!
    >
    > Is this typical of Windows driver development? I've never seen this in
    > other OS development. Do I need to disable and re-enable the driver
    > verifier every time I upgrade the driver code? Is there a special dance
    > that needs to be done? Or sacrificial offering to the Windows driver god?
    >
    > Thanks,
    > Matt
    >
    >
    > On Fri, Apr 3, 2015 at 5:44 PM, Matt Jacobsen
    > wrote:
    >
    >> Tim, thanks for pointing this out. I removed the InUse entirely and just
    >> make sure there's no code after the WdfRequestComplete calls. Yes, there's
    >> no protection for multiple threads trying to enter simultaneously, but I
    >> can deal with that later after this bug is resolved.
    >>
    >> I'm not sure I fully understand how to use a manual queue. Any example I
    >> could look at? Your description sounds a lot like what I'm already doing.
    >> I'm saving the WDFREQUEST in the PDEVICE_EXTENSION upon entering the
    >> driver. Then when I receive the "DMA done" interrupt I retrieve the
    >> WDFREQUEST from the PDEVICE_EXTENSION and complete it.
    >>
    >> Lastly, after reorganizing the driver to perform the WdfRequestComplete
    >> as the absolute last operation from every code path, I'm encountering a
    >> slightly different error. From the minidump:
    >>
    >> CUSTOMER_CRASH_COUNT: 1
    >> DEFAULT_BUCKET_ID: VERIFIER_ENABLED_VISTA_MINIDUMP
    >> BUGCHECK_STR: 0x1E
    >> PROCESS_NAME: Project1.exe
    >> CURRENT_IRQL: 2
    >> TAG_NOT_DEFINED_c000000f: FFFFF880032FFFB0
    >> EXCEPTION_RECORD: fffff880032ff8b8 -- (.exr 0xfffff880032ff8b8)
    >> ExceptionAddress: fffff80002c79cd0 (nt!DbgBreakPoint)
    >> ExceptionCode: 80000003 (Break instruction exception)
    >> ExceptionFlags: 00000000
    >> NumberParameters: 1
    >> Parameter[0]: 0000000000000000
    >> TRAP_FRAME: fffff880032ff960 -- (.trap 0xfffff880032ff960)
    >> NOTE: The trap frame does not contain all registers.
    >> Some register values may be zeroed or incorrect.
    >> rax=fffff880032ffaf8 rbx=0000000000000000 rcx=fffff98002decfd0
    >> rdx=fffff98002decfd0 rsi=0000000000000000 rdi=0000000000000000
    >> rip=fffff80002c79cd1 rsp=fffff880032ffaf8 rbp=fffff880032ffb02
    >> r8=0000000000000000 r9=fffff88000f97610 r10=7efefefefefeff40
    >> r11=8101010101010100 r12=0000000000000000 r13=0000000000000000
    >> r14=0000000000000000 r15=0000000000000000
    >> iopl=0 nv up ei pl nz na pe nc
    >> nt!DbgBreakPoint+0x1:
    >> fffff800`02c79cd1 c3 ret
    >> Resetting default scope
    >> LAST_CONTROL_TRANSFER: from fffff80002c798be to fffff80002c81e90
    >> STACK_TEXT:
    >> fffff880`032fe998 fffff800`02c798be : 00000000`0000003c fffff880`00fd2c74
    >> fffff880`032ff110 fffff800`02cada00 : nt!KeBugCheck
    >> fffff880`032fe9a0 fffff800`02cad6cd : fffff800`02e92398 fffff800`02dcb22c
    >> fffff800`02c0d000 fffff880`032ff8b8 : nt!KiKernelCalloutExceptionHandler+0xe
    >> fffff880`032fe9d0 fffff800`02cac4a5 : fffff800`02dd32c4 fffff880`032fea48
    >> fffff880`032ff8b8 fffff800`02c0d000 : nt!RtlpExecuteHandlerForException+0xd
    >> fffff880`032fea00 fffff800`02cbd431 : fffff880`032ff8b8 fffff880`032ff110
    >> fffff880`00000000 00000000`c0000010 : nt!RtlDispatchException+0x415
    >> fffff880`032ff0e0 fffff800`02c81542 : fffff880`032ff8b8 00000000`00000000
    >> fffff880`032ff960 fffff980`02dece40 : nt!KiDispatchException+0x135
    >> fffff880`032ff780 fffff800`02c7f374 : fffffa80`20591470 fffff800`0311eb31
    >> 00000000`00000000 fffff880`032ffb00 : nt!KiExceptionDispatch+0xc2
    >> fffff880`032ff960 fffff800`02c79cd1 : fffff880`00fb70bd fffff980`02decfd0
    >> fffff980`02decfd0 00000000`00000000 : nt!KiBreakpointTrap+0xf4
    >> fffff880`032ffaf8 fffff880`00fb70bd : fffff980`02decfd0 fffff980`02decfd0
    >> 00000000`00000000 fffff880`00f97610 : nt!DbgBreakPoint+0x1
    >> fffff880`032ffb00 fffff880`00fb8691 : 00000000`00000000 fffff980`02dece40
    >> fffff980`02dece40 fffff980`09d90e60 :
    >> Wdf01000!FxRequest::Vf_VerifyRequestIsCancelable+0x79
    >> fffff880`032ffb50 fffff880`00f16bcc : 00000000`00000002 fffff980`09d90e60
    >> fffff880`032ffbd0 fffff880`00000008 :
    >> Wdf01000!FxIoQueue::Vf_VerifyRequestCancelable+0x99
    >> fffff880`032ffb90 fffff880`00f21685 : fffffa80`1c04d2d0 fffff980`09d90e60
    >> 0000067f`f626f198 fffff800`02cbd7e8 :
    >> Wdf01000!FxIoQueue::RequestCancelable+0x3c
    >> fffff880`032ffc00 fffff880`061ba552 : fffff980`09d90e60 00000000`00000000
    >> 00000000`00000000 0000067f`f626f198 :
    >> Wdf01000!imp_WdfRequestUnmarkCancelable+0x115
    >> fffff880`032ffc60 fffff880`00000001 : 00000000`00000000 00000000`00000034
    >> fffff880`032ffded fffff880`061b988b : riffa!RiffaIoctlSend+0x142
    >> [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1006]
    >> fffff880`032ffce0 00000000`00000000 : 00000000`00000034 fffff880`032ffded
    >> fffff880`061b988b 00000000`00000000 : 0xfffff880`00000001
    >>
    >> STACK_COMMAND: kb
    >> FOLLOWUP_IP:
    >> riffa!RiffaIoctlSend+142
    >> [c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c @ 1006]
    >> fffff880`061ba552 ?? ???
    >> FAULTING_SOURCE_LINE:
    >> c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c
    >> FAULTING_SOURCE_FILE:
    >> c:\winddk\7600.16385.1\src\general\riffa\driver\windows\sys\riffa.c
    >> FAULTING_SOURCE_LINE_NUMBER: 1006
    >> FAULTING_SOURCE_CODE:
    >> 1002: return;
    >> 1003: }
    >> 1004:
    >> 1005: // Set the channel number in the request context for
    >> RiffaEvtRequestCancel.
    >> > 1006: reqExt = RiffaGetRequestContext(Request);
    >> 1007: reqExt->Chnl = io->Chnl;
    >> 1008:
    >> 1009: // Start a send transaction.
    >> 1010: if (length) {
    >> 1011: // Start a DMA transaction for sending.
    >>
    >> SYMBOL_STACK_INDEX: c
    >> SYMBOL_NAME: riffa!RiffaIoctlSend+142
    >> FOLLOWUP_NAME: MachineOwner
    >> MODULE_NAME: riffa
    >> IMAGE_NAME: riffa.sys
    >> DEBUG_FLR_IMAGE_TIMESTAMP: 551f2b56
    >> FAILURE_BUCKET_ID: X64_0x1E_VRF_riffa!RiffaIoctlSend+142
    >> BUCKET_ID: X64_0x1E_VRF_riffa!RiffaIoctlSend+142
    >> Followup: MachineOwner
    >> ---------
    >>
    >> This would suggest that the PREQUEST_EXTENSION struct I'm using to pass
    >> information to the EvtRequestCancel callback is somehow calling
    >> WdfRequestUnmarkCancelable. Clearly this is going to fail because this is
    >> the entry point into the driver, I haven't marked the WDFREQUEST
    >> cancellable yet. In fact, I might complete the request before I do so (if
    >> there's an input validation problem). I don't know why on earth this call
    >> to get the PREQUEST_EXTENSION would decide to call
    >> WdfRequestUnmarkCancelable on my behalf. There's no code in this function
    >> that calls WdfRequestUnmarkCancelable . This is complete madness.
    >>
    >> Any insights would be appreciated.
    >>
    >> Thanks,
    >> Matt
    >>
    >>
    >> On Wed, Apr 1, 2015 at 11:04 PM, Tim Roberts wrote:
    >>
    >>> On Apr 1, 2015, at 5:57 PM, xxxxx@gmail.com wrote:
    >>> >
    >>> > // Complete the send request
    >>> > InterlockedExchange(&DevExt->Chnl[Chnl].InUse, 0);
    >>> > RiffaCompleteRequest(DevExt, Chnl, Status);
    >>>
    >>> You’re doing those in the wrong order. As soon as you set InUse to 0,
    >>> your original ioctl handler is free to overwrite that channel’s request
    >>> with a different one. If it happens to do so before you get into the
    >>> completion routine, it will complete the wrong request. You should
    >>> complete it, and then release InUse.
    >>>
    >>> Again, wouldn’t this all be easier if you had a manual I/O queue for
    >>> each channel? Put all of your data in a context structure for the
    >>> request. Every time a DMA completes, you can pop the next one, instead of
    >>> just rejecting requests when one is already in progress.
    >>> —
    >>> Tim Roberts, xxxxx@probo.com
    >>> Providenza & Boekelheide, Inc.
    >>>
    >>>
    >>> ---
    >>> NTDEV is sponsored by OSR
    >>>
    >>> Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >>>
    >>> OSR is HIRING!! See http://www.osr.com/careers
    >>>
    >>> For our schedule of WDF, WDM, debugging and other seminars visit:
    >>> http://www.osr.com/seminars
    >>>
    >>> To unsubscribe, visit the List Server section of OSR Online at
    >>> http://www.osronline.com/page.cfm?name=ListServer
    >>>
    >>
    >>
    > --- NTDEV is sponsored by OSR Visit the list at:
    > http://www.osronline.com/showlists.cfm?list=ntdev OSR is HIRING!! See
    > http://www.osr.com/careers For our schedule of WDF, WDM, debugging and
    > other seminars visit: http://www.osr.com/seminars To unsubscribe, visit
    > the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
    >
Sign In or Register to comment.

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Upcoming OSR Seminars
Developing Minifilters 29 July 2019 OSR Seminar Space
Writing WDF Drivers 23 Sept 2019 OSR Seminar Space
Kernel Debugging 21 Oct 2019 OSR Seminar Space
Internals & Software Drivers 18 Nov 2019 Dulles, VA