DMA - Exposing the commonbuffer to user sw

Either I’m confused or you’re confused or… you know… we’re both confused. Which is always a possibility.

Your FPGA is moving data from HOST memory to a block of memory on the board (what we would call LOCAL memory), that it shares with this “legacy” device, right?

If the legacy device doesn’t understand S/G, then it’s the LOCAL memory that needs to be contiguous. The organization of the HOST memory (whether it’s contiguous or not) depends solely on the capabilities of the FPGA.
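To make the contiguity point concrete, here is a rough sketch in plain, portable C (not driver code) of what "understands S/G" means for the DMA engine. Everything in it is made up for illustration: the 4 KiB page size, the `sg_element` layout, and the names `build_sg_list` and `usable_without_sg` are my assumptions, not anything from Windows or from your hardware. It merges physically adjacent pages into fragments; an engine without S/G support can only cope with a buffer that collapses to a single fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size, for illustration only */

/* Hypothetical scatter/gather element: one physically contiguous fragment. */
typedef struct {
    uint64_t address;   /* start address of the fragment */
    uint32_t length;    /* fragment length in bytes */
} sg_element;

/*
 * Walk the page addresses backing a buffer and merge pages that are
 * physically adjacent into a single element.
 * Returns the number of elements written, or 0 if `out` is too small.
 */
size_t build_sg_list(const uint64_t *pages, size_t page_count,
                     sg_element *out, size_t out_capacity)
{
    size_t used = 0;

    assert(pages != NULL && out != NULL);

    for (size_t i = 0; i < page_count; i++) {
        if (used > 0 &&
            out[used - 1].address + out[used - 1].length == pages[i]) {
            /* This page directly follows the previous fragment: extend it. */
            out[used - 1].length += PAGE_SIZE_BYTES;
        } else {
            if (used == out_capacity) {
                return 0;   /* no room for another fragment */
            }
            out[used].address = pages[i];
            out[used].length = PAGE_SIZE_BYTES;
            used++;
        }
    }
    return used;
}

/* An engine with no S/G support needs the whole buffer in one fragment. */
int usable_without_sg(const uint64_t *pages, size_t page_count)
{
    sg_element list[64];

    assert(page_count <= 64);
    return build_sg_list(pages, page_count, list, 64) == 1;
}
```

With pages at 0x10000, 0x11000 and 0x12000 the list collapses to one 12 KiB fragment, so a non-S/G engine could use it. With pages at 0x10000, 0x30000 and 0x31000 you get two fragments, and only an S/G-capable engine can handle that.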

When I read about DMA, it all seems to be from the point of view that my SW wants to do a write-to or read-from. So SW sends a request to the driver, and the driver sets up a DMA Transaction to execute a write or read. But that is not what the SW is trying to do. SW needs to allocate a landing spot for the FPGA to write-to.

Yes. This is the distinction between what Windows calls “Packet Based” DMA (where one read or write Request in the host results in one DMA transaction) and “Continuous” DMA (where the host driver sets up a mapping once, more or less, and then the FPGA periodically does transfers on its own, again more or less, depending on the details). There are also “Hybrid” DMA schemes that combine the two approaches.

In Windows, we generally think of continuous DMA designs as being to/from common buffers… and packet based designs as being S/G because the data for each “packet” is coming directly from the user.
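If it helps to picture the continuous case, here is a small host-only simulation in plain C of a common buffer used as a ring. This is a sketch under assumptions, not how your FPGA necessarily works: the `common_buffer` layout, the slot sizes, and the functions `device_produce` and `host_consume` are all hypothetical, and `device_produce` merely stands in for the hardware writing on its own. It also ignores what a real driver has to care about (memory barriers, cache coherency, volatile access to device-written fields).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 4u   /* assumed slot count (power of two) */
#define SLOT_BYTES 8u   /* assumed slot size */

/* Hypothetical layout of a common buffer that is set up once and reused. */
typedef struct {
    uint8_t  data[RING_SLOTS][SLOT_BYTES];
    uint32_t write_index;   /* advanced only by the device */
    uint32_t read_index;    /* advanced only by the host */
} common_buffer;

/*
 * Stand-in for the device: deposits one payload without any host request.
 * Returns 1 on success, 0 if the ring is full.
 */
int device_produce(common_buffer *cb, const uint8_t *payload, size_t len)
{
    uint8_t *slot;

    assert(len <= SLOT_BYTES);
    if (cb->write_index - cb->read_index == RING_SLOTS) {
        return 0;   /* host has not drained the ring yet */
    }
    slot = cb->data[cb->write_index % RING_SLOTS];
    memset(slot, 0, SLOT_BYTES);
    memcpy(slot, payload, len);
    cb->write_index++;
    return 1;
}

/*
 * Host side: copies out the oldest unread slot.
 * Returns 1 if a slot was consumed, 0 if nothing is waiting.
 */
int host_consume(common_buffer *cb, uint8_t *out)
{
    if (cb->read_index == cb->write_index) {
        return 0;   /* nothing new from the device */
    }
    memcpy(out, cb->data[cb->read_index % RING_SLOTS], SLOT_BYTES);
    cb->read_index++;
    return 1;
}
```

The thing to notice is that nothing here is tied to a read or write Request: the buffer is set up once, the "device" fills it whenever it likes, and the host just picks up whatever has landed. That is the contrast with the packet based model.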

If you’re sharing your continuous DMA buffer with the user, this is a perfect application for the scheme I outlined a long time ago, in which the user sends an IOCTL with the OutBuffer describing their data buffer, and your driver keeps this in progress until the app is done with the device. Now, don’t get me wrong: There’s still complexity in that approach, but the major advantage is that once you get it to demonstrably work… it works. There are no hidden issues lurking. Contrast this with the “allocate some memory and map it back to the user application” approach, where getting it to work is pretty straightforward (as you’ve seen) but you’re really not halfway there at that point… there are still somewhat subtle problems lurking, the solutions to which can be hard to test.

Peter