Implement DMA in packet-mode in a PCI/VME Bridge

Thank you very much Tim.

I have another question about the end of DMA: my device triggers a PCI interrupt when the DMA ends normally or when an error occurs on either the PCI or the VME bus during a DMA transfer.
If the DMA ends due to an error, should I handle this error in my ISR or in my DPC, after calling WdfDmaTransactionDmaCompleted? In this case, I have to stop the DMA transaction at once and cancel the associated DMA I/O request before the framework schedules another DMA transfer.

You do it in your DPC.

Peter

Thank you Peter.

I get a WDF_VIOLATION bugcheck with Arg = 0x00000003: Windows Driver Framework Verifier has encountered a fatal error.
In particular, an I/O request has been completed, but a framework request object cannot be deleted because there are outstanding references to the input buffer or the output buffer, or both. The driver’s IFR will include details on the outstanding references.

I checked the log in the dump file, and the last lines related to the IOCTL buffers say:
35: FxRequest::CompleteInternal - WDFREQUEST 0x0000567C11539B38, PIRP 0xFFFF8F8FC58E6D80, Major Function 0xe, completed with outstanding references on WDFMEMORY 0x0000567C11539A61 or 0x0000567C11539A51 retrieved from this request
36: FxRequest::CompleteInternal - WDFMEMORY 0x0000567C11539A61, buffer FFFF8F8FC617AFE0, PMDL 0000000000000000, length 20 bytes
37: FxRequest::CompleteInternal - IOCTL output WDFMEMORY 0x0000567C11539A51, buffer FFFFE481CAE12010, PMDL FFFFA983EE8C6750, length 4096 bytes
---- end of log ----

I do not see any particular error, but I found in the code that the WDF method that triggers the bugcheck is WdfDmaTransactionExecute().
Do you have any suggestions to help me debug the driver? I am a newbie with the WinDbg debugger. Thanks.

Like the log says, a WDFMEMORY object has a reference when the Request is completed.

How are your input and output buffers for this Request created? Do you build them? Are you making a copy of either of those buffers using a WDFMEMORY object?

Peter

Hello Peter,
My user app creates both buffers: the first is a “control” buffer and the second is a “data” buffer.
The IOCTL is defined as METHOD_IN_DIRECT, FILE_WRITE_ACCESS because I would like to perform a DMA write:
DeviceIoControl(
    DriverHandle,
    IOCTL_DMA_WRITE,
    DmaIoctl,
    sizeof(DMA_IOCTL),
    Buffer,
    BufferSize,
    &BytesReturned,
    &Overlapped);

I do not use any WDFMEMORY object in my driver. I only use WdfRequestRetrieveInputBuffer for the “control” buffer and WdfRequestRetrieveOutputBuffer for the “data” buffer, and then I call WdfDmaTransactionInitializeUsingRequest(DmaTransaction, DmaRequest, EvtProgramDmaFunction, WdfDmaDirectionWriteToDevice);
The bugcheck occurs when I call WdfDmaTransactionExecute() just after that.

If this is a DMA write, then it’s really METHOD_OUT_DIRECT, but that doesn’t really matter very much. Where are you completing the request? Is it possible you complete the request before you call WdfDmaTransactionDmaCompleted?

Hello Tim,
Thank you very much, you have pointed out the issue: in the EvtDeviceIOControl() method, there is a code path that calls WdfRequestComplete() before WdfDmaTransactionDmaCompleted().
In my implementation, I complete the request only in the ISR/DPC: if WdfDmaTransactionDmaCompleted() returns TRUE, or if an error occurs before the end of the DMA, I call WdfObjectDelete() and then WdfRequestCompleteWithInformation().
Is this the right way to proceed?

Do you mean WdfObjectDelete on the transaction object? That’s not right; you need to call WdfDmaTransactionDmaCompletedFinal to tell the system how much did get done. That lets the framework clean up the resources it allocated. If that returns true, then you complete the request.

There’s a very nice article on completing DMA transactions here: https://docs.microsoft.com/en-us/windows-hardware/drivers/wdf/completing-a-dma-transfer

Hello Tim,
I have not succeeded in completing the DMA transaction in the DPC; I get a bugcheck with the following error:
DRIVER_VERIFIER_IOMANAGER_VIOLATION (c9)
The IO manager has caught a misbehaving driver.
Arguments:
Arg1: 000000000000000e, Irql > DPC at IoCompleteRequest
Arg2: 0000000000000009, the current Irql
Arg3: ffff9d077d39cd80, the IRP
Arg4: 0000000000000000

Reading the article, if the full length of the DMA is not reached, do I have to call WdfDmaTransactionDmaCompletedWithLength or WdfDmaTransactionDmaCompletedFinal? My hardware does not report any error, but I can see that the DMA length is not reached.

If you call WdfDmaTransactionDmaCompletedWithLength and the length is less than the full transfer, the system will submit another transfer to complete the rest. If you need to abort the transfer and can’t do any more, then you call WdfDmaTransactionDmaCompletedFinal.
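
The difference between the two calls can be illustrated with a small toy model. This is only a sketch of the behaviour described above, not framework code: the `model_*` names and the `model_transaction` structure are hypothetical stand-ins, and the real framework does considerably more (map registers, scatter/gather lists, status reporting).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a DMA transaction; NOT the real WDFDMATRANSACTION. */
typedef struct {
    size_t   total_length;     /* bytes requested by the I/O request      */
    size_t   max_transfer;     /* largest single transfer the device does */
    size_t   bytes_done;       /* bytes confirmed as transferred so far   */
    size_t   current_transfer; /* size of the transfer now in flight      */
    unsigned program_calls;    /* how often the "program DMA" step ran    */
} model_transaction;

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Models the framework invoking the driver's EvtProgramDma callback. */
static void model_program_next(model_transaction *t)
{
    t->current_transfer = min_size(t->max_transfer,
                                   t->total_length - t->bytes_done);
    t->program_calls++;
}

/* Models starting the transaction: the first transfer is programmed. */
void model_execute(model_transaction *t, size_t total, size_t max_transfer)
{
    assert(max_transfer > 0);
    t->total_length  = total;
    t->max_transfer  = max_transfer;
    t->bytes_done    = 0;
    t->program_calls = 0;
    model_program_next(t);
}

/* Models "completed with length": a short count means another transfer
 * is submitted for the remainder. Returns true only when all is done. */
bool model_completed_with_length(model_transaction *t, size_t transferred)
{
    assert(transferred <= t->current_transfer);
    t->bytes_done += transferred;
    if (t->bytes_done >= t->total_length) {
        t->current_transfer = 0;
        return true;
    }
    model_program_next(t);
    return false;
}

/* Models "completed final": records what got done and ends the
 * transaction; no further transfer is submitted. Always returns true. */
bool model_completed_final(model_transaction *t, size_t transferred)
{
    assert(transferred <= t->current_transfer);
    t->bytes_done += transferred;
    t->current_transfer = 0;
    return true;
}
```

In this model, a 10000-byte request with a 4096-byte maximum transfer keeps getting reprogrammed after each short "with length" completion, while a "final" completion stops it immediately and leaves `bytes_done` as the count to report when completing the request.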

The BSOD you got is because you tried to call IoCompleteRequest in your ISR. You need to defer almost all of your processing to your DPC. The ISR should do very little more than acknowledge the interrupt and trigger your DPC.
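
The ISR/DPC split can be sketched with a toy IRQL model. Everything here is an illustrative assumption: the `model_*` names are hypothetical, the device IRQL of 9 is simply the value reported in Arg2 of the bugcheck above, and the "completion" function only mimics the rule that request completion is not allowed above DISPATCH_LEVEL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IRQL values for the model; 9 is taken from the bugcheck's Arg2. */
enum {
    MODEL_PASSIVE_LEVEL  = 0,
    MODEL_DISPATCH_LEVEL = 2,
    MODEL_DEVICE_IRQL    = 9
};

typedef struct {
    int      current_irql;
    uint32_t saved_status;      /* hardware status captured by the ISR */
    bool     dpc_queued;
    bool     request_completed;
    bool     irql_violation;    /* stands in for the c9 bugcheck       */
} model_device;

/* Stand-in for request completion: legal only at or below DISPATCH. */
static void model_complete_request(model_device *d)
{
    if (d->current_irql > MODEL_DISPATCH_LEVEL) {
        d->irql_violation = true;
        return;
    }
    d->request_completed = true;
}

/* Correct ISR: acknowledge, save status, queue the DPC, nothing else. */
bool model_isr(model_device *d, uint32_t hw_status)
{
    int previous = d->current_irql;
    bool ours = (hw_status != 0);   /* 0 means "not our interrupt" */

    d->current_irql = MODEL_DEVICE_IRQL;
    if (ours) {
        d->saved_status = hw_status;
        d->dpc_queued = true;
    }
    d->current_irql = previous;
    return ours;
}

/* DPC: runs at DISPATCH_LEVEL, so completing the request is allowed. */
void model_dpc(model_device *d)
{
    int previous = d->current_irql;

    assert(d->dpc_queued);
    d->current_irql = MODEL_DISPATCH_LEVEL;
    d->dpc_queued = false;
    model_complete_request(d);
    d->current_irql = previous;
}

/* Buggy ISR: tries to complete the request at device IRQL. */
bool model_isr_buggy(model_device *d, uint32_t hw_status)
{
    int previous = d->current_irql;

    d->current_irql = MODEL_DEVICE_IRQL;
    d->saved_status = hw_status;
    model_complete_request(d);      /* flagged: IRQL 9 > DISPATCH_LEVEL */
    d->current_irql = previous;
    return true;
}
```

Calling `model_isr` followed by `model_dpc` completes the request cleanly, whereas `model_isr_buggy` sets `irql_violation`, which is the model's equivalent of the "Irql > DPC at IoCompleteRequest" report.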

Thank you very much Tim.
I have fixed my issue and my DMA is working now!
Thank you again for your great advice!

While debugging my DMA, I print the Address in Sglist.Element[0], and it is not incremented at each call of EvtProgramDma (after each transfer ends in the DPC). I do not understand why the buffer address is not incremented for each DMA transfer. Could you explain why?

Is your device limited to 32-bit physical addresses? If so, then the operating system has allocated a “bounce buffer” below the 4GB mark, and is copying the user’s buffer into that space. So, each call to EvtProgramDma is getting the same physical buffer with different data.
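
A minimal sketch of that bounce-buffer behaviour, as a plain C simulation rather than driver code. The `model_*` names and the 4096-byte bounce size are assumptions made for illustration; the real size and placement are chosen by the operating system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_BOUNCE_SIZE 4096u   /* assumed size, for illustration only */

/* Stands in for the buffer the OS allocated below the 4GB mark. */
typedef struct {
    uint8_t bounce[MODEL_BOUNCE_SIZE];
} model_bounce;

/* What the driver would see as the first scatter/gather element. */
typedef struct {
    const uint8_t *address;
    size_t         length;
} model_sg_element;

/* Stage the next fragment: copy a piece of the user's buffer into the
 * bounce buffer and describe the bounce buffer to the device. The
 * address handed out is the same every time; only the data changes. */
model_sg_element model_stage_fragment(model_bounce *b,
                                      const uint8_t *user,
                                      size_t user_len,
                                      size_t offset)
{
    model_sg_element e;
    size_t n;

    assert(offset <= user_len);
    n = user_len - offset;
    if (n > MODEL_BOUNCE_SIZE) {
        n = MODEL_BOUNCE_SIZE;
    }
    memcpy(b->bounce, user + offset, n);

    e.address = b->bounce;
    e.length  = n;
    return e;
}
```

Staging offsets 0, 4096 and 8192 of a 10000-byte user buffer returns the same `address` three times, with the contents behind it replaced on each call, which matches the constant Sglist.Element[0] address observed in the question.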

Yup. What Mr. Roberts said. Exactly.

Peter

Yes, my device is limited to 32-bit physical addresses. So I understand that the “bounce buffer” stays at the same address, and the data are copied into this buffer for each transfer.
Thank you for your reply.

There’s no excuse for any hardware design from the last 20 years to use 32-bit physical addressing. The hardware engineers should have their fingernails removed.

(Tangent: Mr. Roberts is well known for (quite rightly) deriding DMA designs that do not support scatter/gather. I had to laugh when I was watching a video of Darryl Havens doing a presentation on the I/O Subsystem from 1993 (the first PDC where the details of Windows NT were discussed) where he said “Windows NT basically assumes everything supports scatter/gather. More and more hardware has that capability these days.” I immediately thought of Mr. Roberts. Later in the same presentation, Mr. Havens also was remarking on devices that only supported 24-bit addressing…)

Supporting scatter/gather and requiring 64-bit addressing are not the same thing.
It is not wise (neither from a security POV nor from an electronic engineering POV) to give a device full access to the host address space.
The host should have I/O map registers (IOMMU or whatever) that map limited pieces of host memory for the device.
For many reasonable devices 24-bit sized fragments (even 16 bit) are enough.
When host address space increases beyond 64 bits, this approach will accommodate old 64-bit devices and 32-bit as well.
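
The mapping idea in this last post can be sketched as a toy translation step. This is an assumption-laden illustration, not how any particular IOMMU works: the `model_iommu_window` structure and `model_iommu_translate` function are hypothetical, and a single contiguous window stands in for a real set of map registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One mapped piece of host memory, as seen from the device side. */
typedef struct {
    uint64_t host_base;   /* host physical address of the mapped piece */
    uint32_t device_base; /* address the device uses for that piece    */
    uint32_t length;      /* size of the window in bytes               */
} model_iommu_window;

/* Translate a device-side address to a host address. Accesses outside
 * the window are rejected, so the device never sees the rest of host
 * memory, however wide the host address space is. */
bool model_iommu_translate(const model_iommu_window *w,
                           uint32_t device_addr,
                           uint64_t *host_addr)
{
    uint32_t offset;

    assert(w != 0 && host_addr != 0);
    if (device_addr < w->device_base) {
        return false;
    }
    offset = device_addr - w->device_base;
    if (offset >= w->length) {
        return false;
    }
    *host_addr = w->host_base + offset;
    return true;
}
```

With a 64 KB window, a device that can only generate small addresses still reaches a host page located above 4GB, and any address outside the window fails translation instead of touching host memory.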