DMA implementation using KMDF

No, you do not need a common buffer. The DMA transaction object takes care of the mapping: it calls your EvtProgramDma callback and stages the transaction in cooperation with your calls to the WdfDmaTransactionDmaCompletedXxx functions. This is true for both scatter/gather and packet (single mapping at a time) devices.

As for “any buffer will do”, the answer is “usually”.

I think the exceptions depend upon your device. For instance, if your device must transfer a set amount of data and has limited scatter / gather capability [say it must transfer 64K, only allows 4 SG elements, and only interrupts when all 64K is transferred], it is possible pool allocations will be too fragmented for you to use them. My experience tells me that this is an unusual restriction [but experience varies, after all].
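
To make the fragmentation concern concrete, here is a minimal user-mode sketch (not driver code). It assumes 4 KB pages and a worst case in which no two pages of the buffer are physically adjacent, so every page needs its own scatter/gather element. The function names and the page-size constant are illustrative only; the page-span arithmetic is the same as the WDK's ADDRESS_AND_SIZE_TO_SPAN_PAGES macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumption: 4 KB pages */

/* Number of pages touched by a buffer that starts at virtual address va
   and is len bytes long. */
size_t span_pages(uintptr_t va, size_t len)
{
    size_t offset_in_page = (size_t)(va & (PAGE_SIZE_BYTES - 1u));
    return (offset_in_page + len + PAGE_SIZE_BYTES - 1u) / PAGE_SIZE_BYTES;
}

/* Worst case: every page is a separate scatter/gather element.
   Returns 1 if the buffer still fits within the device's element limit,
   0 otherwise. */
int fits_worst_case(uintptr_t va, size_t len, size_t max_sg_elements)
{
    assert(max_sg_elements > 0);
    return span_pages(va, len) <= max_sg_elements;
}
```

With the numbers from the example above, a page-aligned 64K buffer spans 16 pages, so a fully fragmented pool allocation could need 16 elements where the device only allows 4. A buffer that happens to be physically contiguous would need just one.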

Most of my DMA work has been audio and parallel port (ECP or FIFO), neither of which has such restrictions.

A common buffer does make the programming easy, however: there’s only a single entry to program. So it reduces complexity at the cost of making the buffer harder to get [contiguous memory is an additional allocation burden, particularly when the physical address range is constrained].

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@ddc-web.com
Sent: Monday, July 30, 2007 9:06 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] DMA implementation using KMDF

Igor,

I did not see your post… I apologize. Our implementations are very similar. My initial code looks almost exactly like your call sequence. I am starting to feel a lot better since you have this working. Right now I am encountering an error on IoAllocateMdl for the WDF common buffers I created. I am sure it is something very silly.

Bob,
Based on Igor’s call sequence (and mine), there is no need for the buffer source to be a WDFCOMMONBUFFER? Any buffer will do as long as I feed it to IoAllocateMdl and MmBuildMdlForNonPagedPool?

Well Bob … under the final Betas and the first release of KMDF, there was
a perfectly good reason for using WDM DMA in a KMDF driver … KMDF DMA was
not working. :-) As of now, we still use WDM DMA, but at some point in the
future we will be moving to a more consistent KMDF solution.


The personal opinion of
Gary G. Little

“Bob Kjelgaard” wrote in message
news:xxxxx@ntdev…
As far as I can tell, nobody actually answered your question- my apologies
if I missed a post somewhere.

You can use an MDL for a non-contiguous buffer in
WdfDmaTransactionInitialize. It would be a rather worthless DDI if you
could not ;->.

I can’t think, off-hand, of a reason to use WDM DMA in a KMDF driver- the
KMDF model is quite complete and even adds some good value [in aligning
buffers under unusual circumstances]. If you do have a case that requires
this, I’d like to understand it, if I may (you can email me directly).

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@ddc-web.com
Sent: Friday, July 27, 2007 12:29 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] DMA implementation using KMDF

The hardware I am using is a PLX 9056, which is essentially the same
hardware used in the PLX example of the DDK (9656). It does support
scatter/gather and I was hoping to use it. However, in order to use the DMA
API I still need a way to obtain an MDL that is contiguous. Or, do I? Does
the type of MDL I feed WdfDmaTransactionInitialize matter?


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Eric,

I can send a piece of working code, but I think you will find the problem without any help. You may be passing a physical address to IoAllocateMdl (it expects a virtual address), or IoAllocateMdl may be called at IRQL > DISPATCH_LEVEL, or something similar.
Apart from that it should work. In my case it works in parallel for 24 independent streams.

Fair enough, I suppose- but anyone doing new development should be using KMDF 1.5, so that shouldn’t matter.

Igor,

I must be missing something very basic that I still can’t track down. All PLX register value writes seem to be correct but I am not getting a DMA xfer to occur. Are you using a 9056?

I tried getting scatter/gather working for my implementation but I could not get anything going for the longest time. So, I then tried to back off and move to packet DMA. I have the register sequences I used in my Linux implementation so I have a good idea of what the sequence should be to get the hw to do an xfer, but I am getting nothing at all.

Here is the packet DMA implementation.

Here’s the sequence:
WdfDmaEnablerCreate
WdfCommonBufferCreate
IoAllocateMdl
MmBuildMdlForNonPagedPool
WdfDmaTransactionCreate
WdfDmaTransactionInitialize
WdfDmaTransactionExecute
EvtProgramReadDma

I initialize the enabler to use Packet DMA.

Here is the EvtProgramReadDma:

BOOLEAN
EvtProgramReadDma(
    IN WDFDMATRANSACTION Transaction,
    IN WDFDEVICE Device,
    IN WDFCONTEXT Context,
    IN WDF_DMA_DIRECTION Direction,
    IN PSCATTER_GATHER_LIST SgList
    )
/*++

Routine Description:

    The framework calls a driver's EvtProgramDma event callback function
    when the driver calls WdfDmaTransactionExecute and the system has
    enough map registers to do the transfer. The callback function must
    program the hardware to start the transfer. A single transaction
    initiated by calling WdfDmaTransactionExecute may result in multiple
    calls to this function if the buffer is too large and there aren't
    enough map registers to do the whole transfer.

Arguments:

Return Value:

    TRUE once the hardware has been programmed.

--*/
{
    PDEVICE_EXTENSION devExt;
    PMTI_DMA_TRANSACTION_CONTEXT transContext;

    UNREFERENCED_PARAMETER( Context );
    UNREFERENCED_PARAMETER( Direction );
    // Not used here: the PCI address is taken from the common buffer.
    UNREFERENCED_PARAMETER( SgList );

    //TraceEvents(TRACE_LEVEL_INFORMATION, DBG_READ,
    //            "--> PLxEvtProgramReadDma");
    KdPrint(("ddcEvtProgramReadDma: called!\n"));

    //
    // Initialize locals
    //
    devExt = ddcGetDeviceContext(Device);
    transContext = ddcGetDMATransactionContext( Transaction );

    WdfInterruptAcquireLock( devExt->Interrupt );

    // set PCI buffer address
    WritePlx32(devExt,BOARD_LVL_REG,REG_PLX_DMAPADR0,devExt->DMACommonBufferBaseLA[transContext->Channel].LowPart);

    // set local address SRC
    WritePlx32(devExt,BOARD_LVL_REG,REG_PLX_DMALADR0,transContext->StartAddr );

    // set xfer size
    WritePlx32(devExt,BOARD_LVL_REG,REG_PLX_DMASIZ0,transContext->DataLength);

    // set dma dir
    WritePlx32(devExt,BOARD_LVL_REG,REG_PLX_DMAPR0,0x8);

    // start xfer
    WritePlx32(devExt,BOARD_LVL_REG,REG_PLX_DMA0_CSR,0x1013);

    WdfInterruptReleaseLock( devExt->Interrupt );

    TraceEvents(TRACE_LEVEL_INFORMATION, DBG_READ,
                "<-- PLxEvtProgramReadDma");

    // The callback is declared BOOLEAN, so it must return a value.
    return TRUE;
}

Now, I know that once a 0x3 is written to REG_PLX_DMA0_CSR, the PLX should just try to do the xfer.

My device gives me an interrupt when the PLX finishes a transaction, but I never get that interrupt. If I did, I would finish the transaction object and delete it.

Hopefully you are using the same device and can spot something I am missing.

Ok, I think I am on to something here. It seems that the device I am writing this driver for does not explicitly declare itself as a Bus Master device. I am going to have to explicitly set this bit in the PCI Config header. Does Windows have a set of functions that assist in accessing the PCI header, or is it a manual affair?

Eric,

I have an easier case: a PCIe board with DMA implemented in an FPGA, so at any time we can attach a scope and check what is going on on the hw side.
I guess my board also doesn’t declare itself as a Bus Master; I don’t have a DMA controller in the resource list. But I can create a WDFDMAENABLER in AddDevice().

I’m using SgList->Elements->Address as the DMA address, but in my case it’s the same as the logical address (no scatter/gather).

You are writing twice to the same register: the 2nd and 3rd params are the same (BOARD_LVL_REG and REG_PLX_DMALADR0):
// set PCI buffer address
WritePlx32(devExt,BOARD_LVL_REG,REG_PLX_DMAPADR0,devExt->DMACommonBufferBaseLA[transContext->Channel].LowPart);
// set local address SRC
WritePlx32(devExt,BOARD_LVL_REG,REG_PLX_DMALADR0,transContext->StartAddr );

Maybe that is the reason?

xxxxx@ddc-web.com wrote:

Ok, I think I am on to something here. It seems that the device I am writing this driver for does not explicitly declare itself as a Bus Master device. I am going to have to explicitly set this bit in the PCI Config header. Does Windows have a set of functions that assist in accessing the PCI header, or is it a manual affair?

The old method used HalSetBusData, but using that today earns you
derision and scorn. ;-) The new method is to send
IRP_MN_QUERY_INTERFACE, IRP_MN_READ_CONFIG and IRP_MN_WRITE_CONFIG IRPs
to the PCI bus driver. Google should provide a couple of good code
clips on how to do this.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Igor,

BOARD_LVL_REG is an offset into the register space; I have multiple “channels” represented in the mapped register space.

REG_PLX_DMAPADR0 = 0x84 // Indicates from where in PCI Memory space DMA transfers (reads or writes) start. Value is a physical address.

REG_PLX_DMALADR0 = 0x88 // Address. Indicates from where in Local Memory space DMA transfers (reads or writes) start.

Here is the trace of PLX writes:
Wed Aug 1 13:45:06.712 2007 (GMT-4): ddcWritePlx32: Address : 0xf7f52e84 Value : 0x64f4000
Wed Aug 1 13:45:06.712 2007 (GMT-4):
Wed Aug 1 13:45:06.727 2007 (GMT-4): ddcWritePlx32: Address : 0xf7f52e88 Value : 0x0
Wed Aug 1 13:45:06.727 2007 (GMT-4):
Wed Aug 1 13:45:06.727 2007 (GMT-4): ddcWritePlx32: Address : 0xf7f52e8c Value : 0x190
Wed Aug 1 13:45:06.727 2007 (GMT-4):
Wed Aug 1 13:45:06.743 2007 (GMT-4): ddcWritePlx32: Address : 0xf7f52e90 Value : 0x8
Wed Aug 1 13:45:06.743 2007 (GMT-4):
Wed Aug 1 13:45:06.758 2007 (GMT-4): ddcWritePlx32: Address : 0xf7f52ea8 Value : 0x1013
Wed Aug 1 13:45:06.758 2007 (GMT-4):

This exact sequence works in Linux (I just tried it). I am stumped here.
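
As a reading aid for the trace above, here is a small user-mode sketch that maps each traced address back to a register name. Only the 0x84 and 0x88 offsets are stated in the post; the other three offsets are inferred from the order of the writes in EvtProgramReadDma, and the channel base of 0xf7f52e00 is inferred by subtracting 0x84 from the first traced address. The table and function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct plx_reg {
    uint32_t    offset;
    const char *name;
};

/* Offsets relative to the channel's register base. */
static const struct plx_reg plx_dma0_regs[] = {
    { 0x84u, "REG_PLX_DMAPADR0" },  /* stated in the post */
    { 0x88u, "REG_PLX_DMALADR0" },  /* stated in the post */
    { 0x8Cu, "REG_PLX_DMASIZ0"  },  /* inferred from the write order */
    { 0x90u, "REG_PLX_DMAPR0"   },  /* inferred from the write order */
    { 0xA8u, "REG_PLX_DMA0_CSR" },  /* inferred from the write order */
};

/* Translate a traced register address into a name, or "unknown". */
const char *plx_reg_name(uint32_t traced_addr, uint32_t channel_base)
{
    size_t i;
    uint32_t offset;

    assert(traced_addr >= channel_base);
    offset = traced_addr - channel_base;

    for (i = 0; i < sizeof(plx_dma0_regs) / sizeof(plx_dma0_regs[0]); i++) {
        if (plx_dma0_regs[i].offset == offset) {
            return plx_dma0_regs[i].name;
        }
    }
    return "unknown";
}
```

Read this way, the trace shows the five writes landing in the same order as the code issues them, and the 0x190 written to the size register is a 400-byte transfer.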

Tim,

It took a little while, but I finally found an example using WdfFdoQueryForInterface. I read out the PCI config space and confirmed that Bus Mastering was turned off. I flipped the bit and wrote it back, but it wasn’t the solution I had hoped for. I still do not get a DMA complete interrupt, nor does any data seem to have been placed in the buffer.
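
For readers following along, the bit in question is Bus Master Enable, bit 2 of the 16-bit PCI Command register at configuration offset 0x04. The sketch below is a user-mode model of the read-modify-write, operating on a plain byte array that stands in for a copy of the configuration header. In a real driver the bytes would be read and written through the bus interface obtained with WdfFdoQueryForInterface (for example the GetBusData and SetBusData members of BUS_INTERFACE_STANDARD); the helper names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_COMMAND_OFFSET     0x04u    /* 16-bit Command register */
#define PCI_COMMAND_BUS_MASTER 0x0004u  /* bit 2: Bus Master Enable */

/* Read the little-endian 16-bit Command register from a copy of the
   configuration header. */
uint16_t pci_read_command(const uint8_t *cfg)
{
    assert(cfg != 0);
    return (uint16_t)(cfg[PCI_COMMAND_OFFSET] |
                      (cfg[PCI_COMMAND_OFFSET + 1u] << 8));
}

/* Set Bus Master Enable with a read-modify-write so that the other
   command bits (I/O space, memory space, ...) keep their values. */
void pci_enable_bus_master(uint8_t *cfg)
{
    uint16_t cmd = (uint16_t)(pci_read_command(cfg) | PCI_COMMAND_BUS_MASTER);

    cfg[PCI_COMMAND_OFFSET]      = (uint8_t)(cmd & 0xFFu);
    cfg[PCI_COMMAND_OFFSET + 1u] = (uint8_t)(cmd >> 8);
}
```

Only the two Command register bytes need to be written back to the device; the rest of the header is left untouched.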

This is baffling. I know from my experience in Linux that whether or not you load the source, destination, length and direction registers in the PLX correctly, once you write that last location with both bits 0 and 1 on, the PLX will indiscriminately do whatever transfer is described by the state of these bits. I am not seeing that.

Thank you both for your help. I guess I need a PCI analyzer now to see a lower layer of activity here.

Am I perhaps not using the DMA API correctly?

Eric,

Sorry, my mistake. I did not notice that the ‘L’ and ‘P’ are there.
Just some guesses:

  • Did you check the destination memory content?
  • Do you have interrupts enabled?
  • Is there a way to connect a scope and check what’s going on in the hw?

From my experience it’s easy to get a blue screen after “DMA enabled” is written to the hw.

Thanks for your help, everyone. It turned out to be some PLX-specific bits that needed to be enabled in order for my hardware to report DMA interrupts. I have some new questions that are related to my DMA implementation, but they are really memory related. I will start a new thread for them.

Thanks Again

Gary’s rationale, while correct at one time, is incomplete. Unless I’ve
missed a WdfDmaEnablerYYY() or WdfDmaTransactionZZZ() function that does
this, the KMDF DMA stuff doesn’t allow you to pre-allocate SGL buffers and
construct the SGL in them, like you can with
DmaOps->BuildScatterGatherList().

For generic desktop use, that’s probably a plus, but for a high bandwidth
testing harness that’s being benchmarked against DOS tools, we can’t afford
any additional overhead, and the cost of the common buffer allocation
embedded in the WdfDmaTransactionInitialize() call is huge.

So for now, we’re using BuildScatterGatherList(), with an SGL buffer
(allocated by a call to DmaOps->AllocateCommonBuffer()) that is large enough
to describe a maximally fragmented user buffer, and then translating that
into a second pre-allocated common buffer that is large enough to hold
whatever form of SGL the HBA needs (in many cases, that is very different
from the DDK’s SCATTER_GATHER_LIST).
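
The translation step described above can be sketched in user-mode C as follows. Everything about the hardware side is hypothetical: the hba_desc layout and the HBA_DESC_LAST flag are invented for illustration, since real HBAs each define their own descriptor format. The sg_element struct is only a stand-in for the address and length fields of the DDK's SCATTER_GATHER_ELEMENT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the address/length pair in a SCATTER_GATHER_ELEMENT. */
struct sg_element {
    uint64_t address;  /* logical (bus) address of the fragment */
    uint32_t length;   /* fragment length in bytes */
};

/* HYPOTHETICAL hardware descriptor; real HBAs define their own layout. */
struct hba_desc {
    uint32_t addr_lo;
    uint32_t addr_hi;
    uint32_t byte_count;
    uint32_t flags;
};

#define HBA_DESC_LAST 0x1u  /* hypothetical "end of list" flag */

/* Translate count generic elements into HBA descriptors stored in a
   pre-allocated array. Returns the number of descriptors written, or 0
   if the list is empty or does not fit. */
size_t translate_sgl(const struct sg_element *src, size_t count,
                     struct hba_desc *dst, size_t dst_capacity)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    if (count == 0 || count > dst_capacity) {
        return 0;
    }
    for (i = 0; i < count; i++) {
        dst[i].addr_lo    = (uint32_t)(src[i].address & 0xFFFFFFFFu);
        dst[i].addr_hi    = (uint32_t)(src[i].address >> 32);
        dst[i].byte_count = src[i].length;
        dst[i].flags      = (i + 1 == count) ? HBA_DESC_LAST : 0u;
    }
    return count;
}
```

Sizing the destination array for a maximally fragmented buffer, as the post describes, means the capacity check never fails in the I/O path and no allocation is needed per transfer.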

In fact, we’re so sensitive to the overhead that we’re pondering whether to
drop back one more step to the MapTransfer() loop, so we can build our
HBA-defined SGLs directly into the second pre-allocated common buffer
(eliminating the one that we pass to BuildSGL entirely) instead of having the
translation step.

We’re also going to provide a helper driver that will allocate contiguous
nonpaged memory on behalf of the app, so we really don’t have to worry about
the SGL generation, as you end up with only one SGE in the SGL, and that’s
easy for (almost) every HBA to deal with. We also can use the large page
support (VirtualAlloc(MEM_LARGE_PAGES)) that is documented to be only
available in Vista. But since the general use case is to support the use of
an ordinary pageable buffer from app code on XP, we have what we think is a
compelling reason (performance) to continue to use the WDM DMA API.

Phil

Philip D. Barila
Seagate Technology LLC
(720) 684-1842
As if I need to say it: Not speaking for Seagate.
