Implement DMA in packet-mode in a PCI/VME Bridge

Hi all,
I have to implement DMA support for an old PCI-to-VME bridge device, and I wonder which design I should use. Using the KMDF framework, I have already done all the initialization and other work such as single transfers, which already work through IOCTLs. The requirements for DMA support are very limited, but what I do know is that the device hardware does not support a scatter/gather mechanism and only does DMA transfers in packet mode. To perform DMA, I have the registers commonly used for DMA: PCI start address, VME start address, transfer size, block size, and a register that holds miscellaneous flags used for the VME transfer, such as the direction (read/write). In the user application, I would like to use IOCTLs to perform all DMA read and write operations. For data transfer, I would use a buffer shared between the user application and the kernel driver.
My questions are:

  1. Can I use the WdfDma framework features for this design, given that the HW supports neither a scatter/gather mechanism nor descriptors as shown in the PLX9x5x WDF example?
  2. For the shared buffer, do I need to use a CommonBuffer allocation mechanism in the driver, or does another mechanism exist?

Thanks in advance for your answer.

As your device doesn’t support scatter gather you have to use a common buffer. You are going to do a copy from there into your user’s buffer, so there is no point in trying to share anything explicitly.

Thank you Mark for your quick answer.
Is it possible to do this copy from the common buffer into user’s buffer using an IOCTL call ?

https://docs.microsoft.com/en-us/windows-hardware/drivers/wdf/handling-i-o-requests-in-a-kmdf-driver-for-a-bus-master-dma-device

WDF is going to handle the copy details for you if your dma is connected to io requests… The general practice is to either initiate the DMA based on a user ioctl arriving at your driver, or to have the user pend a set of such ioctls to be used by your driver, with the driver independently initiating dma requests.

Is it possible to do this copy from the common buffer into user’s buffer using an IOCTL call ?

That is, in fact, the ONLY way to do it.

Thanks again.
What I do not understand in this model is the link between the PSCATTER_GATHER_LIST SgList in EvtProgramDma
and what I get from the IOCTL request in the call:
    WdfDmaTransactionInitializeUsingRequest( dmaTransaction,
                                             Request,
                                             EvtProgramDma,
                                             direction );
In DMA packet mode, I know that I have to use a single pair
    SgList->Elements[0].Address.LowPart;
    SgList->Elements[0].Length;
but I do not understand which address I get in this structure.
Thanks.

If you are using a common buffer, then you won’t be using WdfDmaTransactionInitializeUsingRequest. You have the physical address of the common buffer, and copying into the user’s request when the DMA is done is just a simple RtlMoveMemory call.

Thank you Tim for your answer. So I understand that I should not use WdfDmaTransactionInitializeUsingRequest; do I use the WdfDmaTransactionInitialize method instead?
Also, in my EvtProgramDma, do I use the SgList parameter?

This talk of “common buffers” has, in fact, confused you I think. Let me see if I can clear that up.

Use the standard KMDF DMA APIs. When you create your DMA Enabler, specify one of the packet-based profiles but NOT one that indicates that you support scatter/gather (because you don’t).

When your EvtProgramDma callback is called, the s/g list passed to you will have exactly one element. You use this as the base address and length to program your hardware. Done!
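
As a rough illustration of the step described above, here is a minimal sketch in plain C of what programming a packet-mode engine from a one-element list might look like. The `SgList` and `SgElement` types are simplified stand-ins for the WDK's SCATTER_GATHER_LIST, and the register block, the bit values, and the `program_dma` name are all hypothetical, since the bridge's real register layout is not given in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the WDK scatter/gather types (assumption). */
typedef struct {
    uint64_t Address;   /* device bus logical address of the fragment */
    uint32_t Length;    /* fragment length in bytes */
} SgElement;

typedef struct {
    uint32_t  NumberOfElements;
    SgElement Elements[1];
} SgList;

/* Hypothetical register block of the bridge; names are illustrative only. */
typedef struct {
    volatile uint32_t PciStartAddr;
    volatile uint32_t VmeStartAddr;
    volatile uint32_t TransferSize;
    volatile uint32_t Control;
} BridgeRegs;

#define CTRL_DIR_WRITE 0x1u   /* hypothetical bit: PCI -> VME direction */
#define CTRL_START     0x2u   /* hypothetical bit: start the transfer */

/* Plays the role of the body of EvtProgramDma.
 * Returns 0 on success, -1 if the list cannot be used by a
 * packet-mode engine with 32-bit address registers. */
int program_dma(BridgeRegs *regs, const SgList *sg,
                uint32_t vme_addr, int write_to_device)
{
    if (sg->NumberOfElements != 1)
        return -1;                      /* packet mode: exactly one fragment */
    if (sg->Elements[0].Address > UINT32_MAX)
        return -1;                      /* would not fit a 32-bit register */

    regs->PciStartAddr = (uint32_t)sg->Elements[0].Address;
    regs->VmeStartAddr = vme_addr;
    regs->TransferSize = sg->Elements[0].Length;
    /* Write the control register last so the engine starts fully configured. */
    regs->Control = (write_to_device ? CTRL_DIR_WRITE : 0u) | CTRL_START;
    return 0;
}
```

The two early checks are defensive: with a packet profile that does not advertise 64-bit support, the list handed to the callback is expected to satisfy both already.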

Ignore all this talk of Common Buffers… which could provide you an alternative way to do what you want, but does not really directly answer your question.

Peter

Thank you Peter for your reply. You are right that the common buffer discussion confused me. I will follow your suggestion about the s/g list and keep you informed if I have any further questions.

Just one point: the start address for starting DMA on both PCI bus and VME bus of the Device must be programmed by user app. For example, for a write access (PCI to VME direction), I have to program one register with the PCI address and one register with the VME address. How is the s/g list base address programmed in this case? In other words, what is the value of the first s/g list element?

Just one point: the start address for starting DMA on both PCI bus and VME bus of the Device must be programmed by user app.

I’m not sure I understand: The user app calls (for example) WriteFile, providing a pointer to the data buffer that they want to write, and the length of that buffer. You then create a DMA Transaction and initialize it using the Request (WdfDmaTransactionInitializeUsingRequest). Your EvtProgramDma Event Processing Callback gets called with the scatter/gather list (with exactly one element). You program your hardware.

Again… see this: https://docs.microsoft.com/en-us/windows-hardware/drivers/wdf/handling-i-o-requests-in-a-kmdf-driver-for-a-bus-master-dma-device which I think explains the process pretty clearly.

Peter

Just one point: the start address for starting DMA on both PCI bus and VME bus of the Device must be programmed by user app.

No. The programming of the DMA hardware must be entirely under the control of the driver. User mode code is naturally insecure; it’s way too easy for another app to interfere, and cause your hardware to do transfers for immoral purposes. The address of the user’s buffer is passed in the IRP, and is validated by the I/O subsystem. The address on the device side should be passed to the driver as a parameter, where the driver can validate it.
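
To make the validation point concrete, here is a minimal sketch in plain C of how a driver might check a device-side address it receives as a request parameter. The `VmeDmaParams` structure, the window base and size, and the alignment rule are assumptions made up for the example; the real limits depend on how the bridge maps VME space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical parameter block sent by the application with each request. */
typedef struct {
    uint32_t VmeAddress;   /* requested device-side start address */
    uint32_t Flags;        /* miscellaneous transfer flags */
} VmeDmaParams;

/* Hypothetical limits of the VME region the driver is willing to touch. */
#define VME_WINDOW_BASE 0x08000000u
#define VME_WINDOW_SIZE 0x01000000u   /* 16 MiB */
#define VME_ALIGNMENT   4u            /* assume 32-bit aligned transfers */

/* Returns true only if [VmeAddress, VmeAddress + length) lies entirely
 * inside the permitted window and respects the alignment rule. */
bool validate_dma_params(const VmeDmaParams *p, uint32_t length)
{
    if (length == 0 || length > VME_WINDOW_SIZE)
        return false;
    if (p->VmeAddress % VME_ALIGNMENT != 0 || length % VME_ALIGNMENT != 0)
        return false;
    if (p->VmeAddress < VME_WINDOW_BASE)
        return false;

    /* Compare the offset against the remaining room instead of adding
     * address and length, which could wrap around in 32 bits. */
    uint32_t offset = p->VmeAddress - VME_WINDOW_BASE;
    return offset <= VME_WINDOW_SIZE - length;
}
```

A request that fails this kind of check would be completed with an error before any register is touched.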

Thank you Peter and Tim for your reply.
I am trying to understand: the s/g list provides one element, sglist.Elements[0].Address, that I have to use to program my device in EvtProgramDma, but I do not see the link between this address and the address of the user’s buffer passed through the IRP.
Could you please explain which address this sglist.Elements[0].Address corresponds to?

Hmmmm… I’m not sure what you don’t understand.

The first byte of the buffer to which the s/g list points is the first byte of the user data buffer, as passed by the user in WriteFile or ReadFile.

Peter

sglist.Elements[0].Address is the physical address of the page whose virtual address is in the IRP. Are you familiar with virtual memory and physical memory? None of the addresses you use in your code, either in user mode or kernel mode, are actually the addresses of the bytes in memory. Those are virtual addresses, and they have to be passed through the page tables in order to get the address that goes out on the memory bus.

@Peter,
OK, so if I use WriteFile( handle, buf, bufsize, NULL, NULL ), or an IOCTL call like DeviceIoControl( handle, IOCTL_WRITE_DMA, NULL, 0, buf, bufsize, NULL, NULL ), the s/g list points to the first byte of “buf”, correct?
@Tim,
Yes, I have some knowledge of the virtual/physical memory address concept, and I recently read the Microsoft documentation about contiguous and non-contiguous blocks of memory associated with the s/g list concept because it was in fact new to me, you are right.
And so, if I have to program my 32-bit only device, I have to use sglist.Elements[0].Address.LowPart, which corresponds to the physical address translated by the framework, correct?

Yes.

It would probably be most helpful to not refer to the contents of sglist.Elements[0].Address as the “physical address” of the user’s data buffer. Because it may be or it may not be. It is the device bus logical address of the user’s data buffer, suitable for use with DMA. In the OP’s case, it almost certainly will not be the user data buffer physical address.

Peter

Thank you Peter for your reply.
Also, if I use an IOCTL like the one in my example with the Direct I/O type, is it possible to pass a structure instead of a buffer? In that case, if the device bus logical address is the first element of this structure, does it mean that sglist.Elements[0].Address points to this element?
In my case, I want to perform the DMA transfer using one IOCTL in the user app to program my device, and I wonder if this is a good method because all PCI examples use the WriteFile() method.

And so, if I have to program my 32-bit only device, I have to use sglist.Elements[0].Address.LowPart, which corresponds to the physical address translated by the framework, correct?

You’ve raised a couple of brand-new issues here. Is this a very old device? Because any modern hardware designer who creates a PCIe device that is limited to 32-bit addressing is guilty of malpractice. Windows has had 64-bit physical addresses since the very beginning, clear back in the 20th Century. ALL current PCIe IP blocks support 64-bit physical addresses. There’s no excuse.

When your DMA is limited to 32 bits, the system has to take an extra step. You can’t control where the user’s buffer lives, and since modern systems often have 16GB or 32GB or more of RAM, the average user-mode buffer these days is statistically going to be above the 4GB mark. In that case, the operating system has to allocate special space below the physical 4GB limit. These are called “bounce buffers”. When the user submits a request, the I/O system will copy his buffer into a “bounce buffer” before calling your EvtProgramDma callback. There are a limited number of “bounce buffers”, which means your user request might be chopped into several pieces, with each piece getting another call to EvtProgramDma. This is mostly done without your knowledge, but since you have to maintain a destination address, this may be something you need to know.
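
Here is a minimal sketch in plain C of the bookkeeping this implies: when one request is split into several fragments, each call that programs the hardware has to advance the device-side address by the number of bytes already transferred. The `TransferState` and `Fragment` types and both function names are hypothetical and only model the logic; in a real driver this state would live wherever the driver keeps per-transaction context.

```c
#include <assert.h>
#include <stdint.h>

/* Per-request state kept by the driver (hypothetical layout). */
typedef struct {
    uint32_t vme_base;     /* device-side start address supplied with the request */
    uint32_t bytes_done;   /* bytes completed by earlier fragments */
} TransferState;

/* What would be written to the hardware for one fragment. */
typedef struct {
    uint32_t pci_addr;     /* bus logical address from the one-element list */
    uint32_t vme_addr;     /* device-side address for this fragment */
    uint32_t length;       /* fragment length in bytes */
} Fragment;

/* Plays the role of EvtProgramDma: called once per fragment. */
Fragment program_fragment(const TransferState *st,
                          uint32_t bus_addr, uint32_t length)
{
    Fragment f;
    f.pci_addr = bus_addr;
    /* The destination continues where the previous fragment ended. */
    f.vme_addr = st->vme_base + st->bytes_done;
    f.length   = length;
    return f;
}

/* Called when the hardware reports that a fragment has finished. */
void fragment_completed(TransferState *st, uint32_t length)
{
    st->bytes_done += length;
}
```

Note that the host-side address of each fragment is unrelated to the previous one (it may point into a different bounce buffer), while the device-side address advances linearly.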

Peter is quite right to point out the difference between “physical” and “logical” addresses. A bus address is not necessarily the same as a physical address. I admit to being lax with this terminology, because in my career I have never encountered a system where the two were not identical.

In my case, I want to perform the DMA transfer using one IOCTL in the user app to program my device, and I wonder if this is a good method because all PCI examples use the WriteFile() method.

There’s almost no difference. It may not be obvious from above, but from a driver standpoint, ReadFile, WriteFile and DeviceIoControl are all virtually identical. The driver just gets an IRP, and the buffers are stored in the same places. In YOUR case, there is an additional consideration, because you need to specify a destination address. With DeviceIoControl, you have the opportunity to send two buffers. You can put the address in buffer 1, and the data in buffer 2. Without that, you have to invent some other scheme, and it doesn’t seem as natural.
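
As a sketch of the two-buffer idea, the following plain C shows how a pair of direct-I/O control codes and a small parameter structure might be defined. The `MY_` macros reproduce the bit layout and constant values of the WDK's CTL_CODE, FILE_DEVICE_UNKNOWN, METHOD_IN_DIRECT, METHOD_OUT_DIRECT and FILE_ANY_ACCESS under different names so the example builds without the Windows headers; the IOCTL names, the function numbers 0x800 and 0x801, and the `VmeDmaParams` layout are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the WDK constants, renamed to avoid clashing with them. */
#define MY_FILE_DEVICE_UNKNOWN 0x22u
#define MY_METHOD_IN_DIRECT    1u   /* second buffer is read by the driver */
#define MY_METHOD_OUT_DIRECT   2u   /* second buffer is written by the driver */
#define MY_FILE_ANY_ACCESS     0u

/* Same bit layout as the WDK CTL_CODE macro. */
#define MY_CTL_CODE(dev, func, method, access)            \
    (((uint32_t)(dev) << 16) | ((uint32_t)(access) << 14) | \
     ((uint32_t)(func) << 2) | (uint32_t)(method))

/* Hypothetical control codes: host memory to VME, and VME to host memory. */
#define IOCTL_VME_DMA_WRITE \
    MY_CTL_CODE(MY_FILE_DEVICE_UNKNOWN, 0x800, MY_METHOD_IN_DIRECT, MY_FILE_ANY_ACCESS)
#define IOCTL_VME_DMA_READ \
    MY_CTL_CODE(MY_FILE_DEVICE_UNKNOWN, 0x801, MY_METHOD_OUT_DIRECT, MY_FILE_ANY_ACCESS)

/* Hypothetical "buffer 1": small, carries the device-side address.
 * "Buffer 2" is the data buffer itself and needs no structure. */
typedef struct {
    uint32_t VmeAddress;
    uint32_t Flags;
} VmeDmaParams;
```

With definitions like these, the application would pass a `VmeDmaParams` as the first buffer of DeviceIoControl and the data buffer as the second, for example `DeviceIoControl(handle, IOCTL_VME_DMA_WRITE, &params, sizeof(params), buf, bufsize, &bytesReturned, NULL)`.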