Giorgio Delmondo wrote:
> IF the user data buffer is virtually contiguous in user space, and
> how can it not be.
Unfortunately, it can be.
The API gives the user the following:
- ALLOCATE_BUFFER, where a virtually contiguous buffer is allocated.
It can be called as many times as the user requests. An array of
“pages” is returned.
- PUSH_DESCRIPTOR, where the USER composes the scatter/gather list,
referring to the physical_address/descriptor/… given by the previous
call
- START DMA/STOP DMA
- Wait dma block
Using the push descriptor, the user can potentially set up virtually
non-contiguous chains in blocks of PAGE_SIZE.
Yes, but do they actually DO that? Many APIs allow things that don’t
make sense, and because of that don’t get used. Any app writer with a
brain is just going to allocate a buffer, and use the pages in that
buffer, in order.
For the third time, I would like to point out that you could solve many
of your problems here by returning handles or virtual addresses instead
of physical addresses. That leaves the physical address manipulation
entirely in the driver’s control. You remember in your driver context
the list of allocated buffers and their sizes. Then, after you collect
all the descriptors, you say
    for each descriptor
        for each buffer
            if this descriptor is within this buffer
                pull the physical address from the MDL based on the
                offset within the buffer
If they pass an address that was not in one of the allocated buffers,
then they violated the contract.
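The lookup above might be sketched roughly as follows, in plain C so it can be tried outside the kernel. Everything named here is hypothetical: TRACKED_BUFFER, lookup_physical, and the page_phys array are stand-ins for whatever the driver context actually keeps, and in a real driver the per-page physical addresses would come from the MDL. The sketch also assumes a 4096-byte page and that each buffer starts on a page boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */

/* Hypothetical record kept per ALLOCATE_BUFFER call. */
typedef struct {
    uintptr_t       base;       /* virtual address (or handle) given to the user */
    size_t          size;       /* buffer length in bytes */
    const uint64_t *page_phys;  /* physical address of each page, in buffer order */
} TRACKED_BUFFER;

/*
 * Translate a user-supplied address into a physical address.
 * Returns 1 and fills *phys on success. Returns 0 if the address is
 * not inside any tracked buffer, i.e. the caller violated the contract.
 */
static int lookup_physical(const TRACKED_BUFFER *bufs, size_t count,
                           uintptr_t addr, uint64_t *phys)
{
    assert(phys != NULL);
    for (size_t i = 0; i < count; i++) {
        const TRACKED_BUFFER *b = &bufs[i];
        if (addr >= b->base && addr - b->base < b->size) {
            size_t offset = (size_t)(addr - b->base);
            /* Page index selects the physical page; remainder is the byte within it. */
            *phys = b->page_phys[offset / PAGE_SIZE_BYTES]
                  + (offset % PAGE_SIZE_BYTES);
            return 1;
        }
    }
    return 0;
}
```

The point of the design is that the user never sees or supplies a physical address; a bad descriptor simply fails the lookup.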
The other issue that is blocking me from using the proper DMA API is
that once the DMA is started, it may never complete (the chain loops
back on itself), and I get only end-of-block interrupts, where each
block is composed of multiple pages.
The Windows DMA API, AFAIK, wants DMA transfers that have an end. The
only way I have found to trick it is to issue a new transfer each time
a DPC finds an end-of-block interrupt. This means tracking the
scatter/gather list in the driver, which is a nightmare.
Not necessarily. Not every DMA scenario fits the KMDF abstraction. You
could use WdfDmaEnablerWdmGetDmaAdapter to fetch a DMA_ADAPTER, and call
BuildScatterGatherList by hand.
However, if you are going to need mapping registers (bounce buffers) to
compensate for your 32-bit address limitation, you have a problem.
First, those bounce buffers are a severely limited resource – typically
only about 64k bytes. That means your transaction has to be chopped up
into 64k increments, so that the data can be copied to/from the bounce
buffers.
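To make the chopping concrete, here is a small sketch in plain C of how one long transfer breaks into pieces. The 64k figure is only the "typical" number mentioned above, not a value queried from the system, and BOUNCE_LIMIT and split_transfer are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define BOUNCE_LIMIT (64u * 1024u)  /* assumed; the real limit depends on the system */

/*
 * Split `total` bytes into pieces no larger than BOUNCE_LIMIT.
 * Writes up to `max` piece lengths into `lengths` and returns how many
 * pieces the whole transfer needs.
 */
static size_t split_transfer(size_t total, size_t *lengths, size_t max)
{
    size_t n = 0;
    assert(lengths != NULL || max == 0);
    while (total > 0) {
        size_t piece = total < BOUNCE_LIMIT ? total : BOUNCE_LIMIT;
        if (n < max)
            lengths[n] = piece;   /* record only what fits in the caller's array */
        total -= piece;
        n++;
    }
    return n;
}
```

Under that assumption, a 1MB transfer becomes 16 separate pieces, each of which has to be copied through the bounce buffers.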
If you truly want uninterrupted long DMA transfers without worrying
about multiple transactions, then I don’t see that you have any choice
other than allocating a common buffer below 4GB and copying the data to
the user buffers. That allows you to do the looping you describe. If
you have to compensate for addresses above 4GB, then there is **ALWAYS**
going to be a copy involved. There is no other alternative. With the
abstraction, that copy is hidden in the DMA transaction. Without the
abstraction, you do it explicitly. The performance is the same.
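As a rough illustration of the explicit copy, here is a user-mode C sketch of a looping chain of blocks being drained into a user buffer, one block per end-of-block interrupt. COMMON_RING, drain_block, and both sizes are made up for the example; in a driver the ring would be the common buffer allocated below 4GB, and the copy would most likely run in the DPC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE  4096u  /* assumed bytes per end-of-block interrupt */
#define RING_BLOCKS 4u     /* assumed number of blocks in the looping chain */

/* Stand-in for the common buffer that the device fills in a loop. */
typedef struct {
    unsigned char data[RING_BLOCKS][BLOCK_SIZE];
    size_t        next;    /* next block expected to complete */
} COMMON_RING;

/*
 * Called once per end-of-block interrupt: copy the block that just
 * completed into the user's buffer at `dst`, then advance around the
 * ring. Returns the number of bytes copied.
 */
static size_t drain_block(COMMON_RING *ring, unsigned char *dst)
{
    assert(ring->next < RING_BLOCKS);
    memcpy(dst, ring->data[ring->next], BLOCK_SIZE);
    ring->next = (ring->next + 1) % RING_BLOCKS;  /* wrap: the chain never ends */
    return BLOCK_SIZE;
}
```

The device keeps looping over the same blocks, so the driver never has to restart a transaction; it only has to drain each block before the device comes back around to it.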
--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.