Hello ntdev-
I am developing a WDF driver for a PCI-X data acquisition board, running on 64-bit Windows 7. My requirement is to acquire into a very large user-allocated buffer (5GB worst case) in a single DMA transfer. I am using packet DMA, with several small (64KB) packets strung together. I have 24GB of RAM installed.
First off, I was wondering if anyone out there has successfully been able to DMA 5GB or more using standard Windows APIs and practices. I seem to have hit a "ceiling" of 2GB. That is, when I pass a MaxTransferLength of > 2GB to WdfDmaEnablerCreate(), it returns a clipped value of 2GB (slightly less actually). That 2GB corresponds to 0x80000 (512K) map registers at 4KB per register (and, correspondingly, the same number of descriptor table entries). It appears that the OS has allotted a max of that many map registers for my device, or more precisely, for that DMA engine of my device. I have no idea how to ask for more map registers, or if that is even possible.
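To make the arithmetic above concrete, here is a small sketch. It assumes 4KB pages and one map register per page; `pages_spanned` is an illustrative helper, not a WDK call. The extra page needed by a buffer that does not start on a page boundary is one possible reason the clipped value comes out slightly under 2GB, but that is a guess, not something confirmed in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumption: 4KB pages, one map register each */

/* Number of pages touched by `length` bytes that start `offset` bytes
 * into a page. A buffer that is not page aligned can touch one page
 * more than length / PAGE_SIZE_BYTES suggests. */
static uint64_t pages_spanned(uint64_t offset, uint64_t length)
{
    assert(offset < PAGE_SIZE_BYTES);
    return (offset + length + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}
```

With these assumptions, a page-aligned 2GB transfer needs `pages_spanned(0, 0x80000000)` = 0x80000 registers, an unaligned one needs 0x80001, and a page-aligned 5GB transfer would need 0x140000.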
This is my first excursion into the WDF world; my former driver projects were many years ago, back in NT/2000-land. So, admittedly, I may be missing something stupid here.
Any insights into my problem would be much appreciated!
Jeff Larson
Do you need to post the whole 5GB buffer to the device at once? What are your latency/throughput requirements?
> That is, when I pass a MaxTransferLength of > 2GB to WdfDmaEnablerCreate(), it returns a clipped value of 2GB (slightly less actually).
It may be there is a default value for MaxTransferLength.
Did you try to call WdfDmaTransactionSetMaximumLength to increase this number?
Igor Sharovar
> Do you need to post the whole 5GB buffer to the device at once? What are your
> latency/throughput requirements?
I have a 48MB/sec data stream coming in via a serial link interface. No flow control so I have to keep up. I don’t believe I have to post the whole buffer at once (I assume you mean set up descriptors for the whole 5GB before starting the transfer?).
> It may be there is a default value for MaxTransferLength.
> Did you try to call WdfDmaTransactionSetMaximumLength to increase this number?
It appears WdfDmaTransactionSetMaximumLength clips values above the “default” value specified to WdfDmaEnablerCreate (according to the DDK documentation). So I don’t think that will help me.
Thanks!
> It appears that the OS has allotted a max of that many map registers for my device, or more precisely, for that DMA engine of my device.
What is the length of the mapped memory that you get from WDFCMRESLIST? Is it more than 2GB?
Igor Sharovar
xxxxx@neurologica.com wrote:
>> Do you need to post the whole 5GB buffer to the device at once? What are your
>> latency/throughput requirements?
> I have a 48MB/sec data stream coming in via a serial link interface. No flow control so I have to keep up. I don’t believe I have to post the whole buffer at once (I assume you mean set up descriptors for the whole 5GB before starting the transfer?).
What will you be doing with all this data? 48 MB/s is not all that
fast. You can handle that on the fly, a fraction of a second at a time,
assuming you can get interrupts while the processing is going on. USB
runs that fast, but it only locks down a millisecond’s worth of space at
a time.
--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
> What will you be doing with all this data? 48 MB/s is not all that
> fast. You can handle that on the fly, a fraction of a second at a time,
> assuming you can get interrupts while the processing is going on. USB
> runs that fast, but it only locks down a millisecond’s worth of space at
> a time.
This is a CT medical application. We are going to be crunching on the data as soon as it starts arriving. We get an interrupt at the end of every (64KB) packet, so I could see windowing the data buffer and using a descriptor ring…
But what about the map registers? Don’t I need them in place for the full DMA transfer size? Despite my gross abuse of allocating so many descriptors, and locking down the whole honkin buffer, my limitation presently seems to be the map registers.
Also, I just wanted to get a sanity check here: does WdfCommonBufferCreateWithConfig() return a locked-down buffer? I use that to allocate my descriptors, and I am getting a page fault while setting up descriptors for some reason (only for big (1.5GB+) transfer sizes…).
xxxxx@neurologica.com wrote:
> This is a CT medical application. We are going to be crunching on the data as soon as it starts arriving. We get interrupts end of every (64KB) packet, so I could see windowing the data buffer and using a descriptor ring…
That seems like a much better path to explore. Does your hardware allow
you to modify the descriptor chain while a transfer is in progress? As
long as you can keep some tens of milliseconds ahead, that’s all you
should need to have locked. Lock, program, wait, unlock.
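The "lock, program, wait, unlock" cycle can be sketched as plain bookkeeping, with no WDF calls at all. Everything here is hypothetical: the `ring_t` type, the function names, and the 256-slot ring size (16MB of 64KB packets) are assumptions chosen for illustration, and the comments mark where a real driver would lock pages and write hardware descriptors.

```c
#include <assert.h>
#include <stdint.h>

#define PACKET_BYTES (64u * 1024u)  /* one DMA packet, as described in the thread */
#define RING_SLOTS   256u           /* assumption: 256 slots = 16MB window */

typedef struct {
    uint64_t total_packets;   /* packets in the whole acquisition */
    uint64_t next_to_post;    /* next packet to lock and program */
    uint64_t next_to_retire;  /* oldest packet still owned by the hardware */
} ring_t;

static void ring_init(ring_t *r, uint64_t total_bytes)
{
    assert(r != 0 && total_bytes > 0);
    r->total_packets = (total_bytes + PACKET_BYTES - 1) / PACKET_BYTES;
    r->next_to_post = 0;
    r->next_to_retire = 0;
}

/* Lock + program: fill every free ring slot. Returns the number posted. */
static unsigned ring_fill(ring_t *r)
{
    unsigned posted = 0;
    while (r->next_to_post < r->total_packets &&
           r->next_to_post - r->next_to_retire < RING_SLOTS) {
        /* Real driver: lock this 64KB window of the user buffer and write
         * descriptor slot (next_to_post % RING_SLOTS). */
        r->next_to_post++;
        posted++;
    }
    return posted;
}

/* Wait + unlock: call once per packet-complete interrupt.
 * Returns 0 when nothing was outstanding. */
static int ring_complete_one(ring_t *r)
{
    if (r->next_to_retire == r->next_to_post)
        return 0;
    /* Real driver: unlock the pages that backed this packet. */
    r->next_to_retire++;
    return 1;
}

/* Bytes currently locked down on behalf of the hardware. */
static uint64_t ring_locked_bytes(const ring_t *r)
{
    return (r->next_to_post - r->next_to_retire) * PACKET_BYTES;
}
```

In this model a 5GB acquisition is 81920 packets, yet no more than 16MB is ever locked at once: each completion frees one slot and the next `ring_fill` call reposts it.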
> But what about the map registers? Don’t I need them in place for the full DMA transfer size? Despite my gross abuse of allocating so many descriptors, and locking down the whole honkin buffer, my limitation presently seems to be the map registers.
There’s a dirty little secret here. Except in rather extraordinary
circumstances, “map registers” are a purely imaginary concept. They do
not actually exist. What you get back from the DMA APIs are plain,
ordinary physical addresses. If your hardware cannot handle 64-bit
addressing, then the map registers play a critically important role, and
because of that, it’s important to pretend that map registers are a real
and limited resource. However, if that were the case, the limitation is
MUCH smaller than what you’ve been describing (well under a megabyte).
> Also, I wanted to just get a sanity check here, but does WdfCommonBufferCreateWithConfig() return a locked down buffer.
Absolutely. Locked down, and contiguous in physical space.
--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
You can have your application keep a number of 64KB buffers outstanding; the driver will lock each one, request an SGL, and post it to the device.
It’s not too much to expect latencies under 0.1 second if your post/completion thread is running with slightly elevated priority. So if you have 8 or 16 MB outstanding, that would be more than enough. No need for the world’s biggest buffer.
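A quick way to check that sizing, as a sketch: it assumes "48 MB/s" means 48,000,000 bytes per second and treats the 0.1 second figure as the worst-case stall. Both helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes that arrive while the host is unresponsive for latency_ms. */
static uint64_t bytes_in_flight(uint64_t rate_bytes_per_sec, uint64_t latency_ms)
{
    return rate_bytes_per_sec * latency_ms / 1000;
}

/* Number of fixed-size buffers needed to absorb that stall, rounded up. */
static uint64_t buffers_needed(uint64_t rate_bytes_per_sec,
                               uint64_t latency_ms,
                               uint64_t buffer_bytes)
{
    uint64_t bytes;

    assert(buffer_bytes > 0);
    bytes = bytes_in_flight(rate_bytes_per_sec, latency_ms);
    return (bytes + buffer_bytes - 1) / buffer_bytes;
}
```

Under those assumptions a 0.1 second stall at 48 MB/s is 4.8 MB, or 74 buffers of 64KB, so 8 MB (128 buffers) outstanding leaves a comfortable margin.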
48 MB/s is not a big deal; 3 years ago I was handling a 40 MB/s stream from a USB device. For another example, a modern SATA drive can sustain some 150 MB/s for sequential transfers from the media, and 50-100 MB/s for scattered ones.
Thanks so much for your help, guys! I am going to look into the feed-the-descriptor-ring approach. :)))