PCIe Bus master Write DMA doesn't work

The device: PCIe Bus master & Device FW initiates DMA
OS: Win7
Driver: Suppose to allocate memory for Device to Host DMA & Device will write to allocated memory

The DMA is initialized as part of Prepare HW

The device will get Write Command with DATA and Read Command with Buffer address. Device is able to get command and data from host command and it decodes data and then when read command command with buffer recieved it will write data to address mentioned in Read Command. Device is sending data, that can be seen on PCIe protocol analyzer with correct address and correct TLP format. But data doesn’t reflect at the Host RAM.

The Drive code is

IO_ALLOCATION_ACTION
MyAdapterControl(
struct _DEVICE_OBJECT *DeviceObject,
struct _IRP *Irp,
PVOID MapRegisterBase,
PVOID Context
)
{
PDEVICE_CONTEXT pDevContext;
UNREFERENCED_PARAMETER(DeviceObject);
UNREFERENCED_PARAMETER(Irp);
pDevContext = (PDEVICE_CONTEXT) Context;
pDevContext->MapRegisterBase = MapRegisterBase;
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“MyAdapterControl :Existing”);
return DeallocateObjectKeepRegisters;
}

void myAdapterListControl
(
struct _DEVICE_OBJECT *DeviceObject,
struct _IRP *Irp,
PSCATTER_GATHER_LIST ScatterGather,
PVOID Context
)
{
PDEVICE_CONTEXT pDevContext;
UNREFERENCED_PARAMETER(DeviceObject);
UNREFERENCED_PARAMETER(Irp);

pDevContext = (PDEVICE_CONTEXT) Context;

pDevContext->ScatterGather = ScatterGather;

TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“myAdapterListControl :physical address Address of non paged mem = 0X%X\n”, pDevContext->ScatterGather->Elements[0].Address.LowPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“myAdapterListControl :physical address Address of non paged mem = 0X%X\n”, pDevContext->ScatterGather->Elements[0].Address.HighPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“myAdapterListControl :Length = 0X%X\n”, pDevContext->ScatterGather->Elements[0].Length);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“myAdapterListControl :Existing\n”);
}

void DirectReadWrite(PDEVICE_CONTEXT pDevContext)
{
PHYSICAL_ADDRESS phyAddr;
PUCHAR pMemVA = NULL;
ULONG i = 0x2fffffff;
PMDL Mdl = NULL;
ULONG LengthMapped = PAGE_SIZE;

UINT32 uchWriteCommmand[32] = {// some data removed due to confidential};
UINT32 uchReadCommmand[32] = {// some data removed due to confidential };

UINT32 uchWriteData[128] = { // some data removed due to confidential};

TraceEvents(TRACE_LEVEL_VERBOSE, DBG_IOCTLS,“–> Entered DirectReadWrite\n”);
pMemVA = (PUCHAR)pDevContext->WriteCommonBufferBase;
memset(pMemVA, 0 , 512);
phyAddr = MmGetPhysicalAddress( pMemVA );
TraceEvents( TRACE_LEVEL_VERBOSE, DBG_INIT, “DirectWrite : Data flow Host to Dev. Copy application data to non paged memory\n”);
memcpy(pMemVA, uchWriteData, 512);

TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectWrite :Virt Address of non paged mem = 0X%X\n”, pMemVA);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectWrite:logical Address of non paged mem = 0X%X\n”, pDevContext->WriteCommonBufferBaseLA.LowPart);

uchWriteCommmand[6] = (UINT32)(pDevContext->WriteCommonBufferBaseLA.LowPart);
PostCmdToDeviceMemory(pDevContext,CMD_TYPE_ADMIN,(UCHAR *)uchWriteCommmand, 0 );

while( i > 0 )
i–;

if(pDevContext->ReadAdapter->DmaOperations->AllocateCommonBuffer != NULL)
{
pMemVA = pDevContext->ReadAdapter->DmaOperations->AllocateCommonBuffer(pDevContext->ReadAdapter, PAGE_SIZE, &phyAddr,FALSE);
memset(pMemVA, 0xab ,PAGE_SIZE);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :Virt Address of non paged mem = 0X%X\n”, pMemVA);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address Address of non paged mem = 0X%X\n”, phyAddr.LowPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address Address of non paged mem = 0X%X\n”, phyAddr.HighPart);
if(pDevContext->ReadAdapter->DmaOperations->AllocateAdapterChannel != NULL)
{
pDevContext->ReadAdapter->DmaOperations->AllocateAdapterChannel(pDevContext->ReadAdapter, WdfDeviceWdmGetAttachedDevice(pDevContext->WdfDevice), pDevContext->mapReadRegistersAllocated,&MyAdapterControl,(PVOID)pDevContext);

} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“AllocateAdapterChannel = NULL \n”);
}
if(pDevContext->ReadAdapter->DmaOperations->GetScatterGatherList !=NULL)
{
Mdl = IoAllocateMdl(pMemVA, PAGE_SIZE , FALSE, FALSE, NULL);

if( Mdl == NULL ) {
TraceEvents( TRACE_LEVEL_VERBOSE, DBG_INIT, “DirectRead :Mdl forCommon buffer could not be created. Exiting\n”);
}

MmBuildMdlForNonPagedPool( Mdl );
KeFlushIoBuffers( Mdl,TRUE, TRUE);
TraceEvents( TRACE_LEVEL_VERBOSE, DBG_INIT, “DirectRead :KeFlushIoBuffers Flushing cache for data buffer\n”);
if( pDevContext->ReadAdapter->DmaOperations->GetScatterGatherList(pDevContext->ReadAdapter,WdfDeviceWdmGetAttachedDevice(pDevContext->WdfDevice), Mdl, pMemVA, PAGE_SIZE,myAdapterListControl, (PVOID)pDevContext, FALSE) != STATUS_SUCCESS ) {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead : GetScatterGatherList() Failed. Exiting\n”);
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead : GetScatterGatherList() passed \n”);
}

if(pDevContext->ReadAdapter->DmaOperations->MapTransfer !=NULL)
{
phyAddr = pDevContext->ReadAdapter->DmaOperations->MapTransfer( pDevContext->ReadAdapter, Mdl, pDevContext->MapRegisterBase , pMemVA, &LengthMapped, FALSE);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :MapTransfer:physical address Address of non paged mem = 0X%X\n”, phyAddr.LowPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :MapTransfer:physical address Address of non paged mem = 0X%X\n”, phyAddr.HighPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :MapTransfer:Mapped Length = 0X%X\n”, LengthMapped);
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“MapTransfer = NULL \n”);
}

} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“GetScatterGatherList = NULL \n”);
}
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“AllocateCommonBuffer = NULL \n”);
}

TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :Virt Address of non paged mem = 0X%X\n”, pMemVA);
//TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :logical Address of non paged mem = 0X%X\n”, pDevContext->ReadCommonBufferBaseLA.LowPart);
//TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :logical Address of non paged mem = 0X%X\n”, pDevContext->ReadCommonBufferBaseLA.HighPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address Address of non paged mem = 0X%X\n”, phyAddr.LowPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address Address of non paged mem = 0X%X\n”, phyAddr.HighPart);

uchReadCommmand[6] = (UINT32)(phyAddr.LowPart); //(UINT32)(pDevContext->ReadCommonBufferBaseLA.LowPart);//(//
PostCmdToDeviceMemory(pDevContext,CMD_TYPE_ADMIN,(UCHAR *)uchReadCommmand, 0);

i = 0x2fffffff;
while( i > 0 )
i–; //Delay just to wait till data written by device. usually data will be written in shorter time then this dealy ( code is to test DMA working)

if( pDevContext->ReadAdapter->DmaOperations->PutScatterGatherList != NULL )
{
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead : PutScatterGatherList \n” );
pDevContext->ReadAdapter->DmaOperations->PutScatterGatherList(pDevContext->ReadAdapter, pDevContext->ScatterGather,FALSE);
i = 0x4fffffff;
while( i > 0 )
i–;
if( pDevContext->ReadAdapter->DmaOperations->FlushAdapterBuffers != NULL )
{
pDevContext->ReadAdapter->DmaOperations->FlushAdapterBuffers(pDevContext->ReadAdapter, Mdl,pDevContext->MapRegisterBase, MmGetMdlVirtualAddress(Mdl), PAGE_SIZE, FALSE);
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“DirectRead : FlushAdapterBuffers done @ MapBaseAddress = 0x%x Logical Address = 0x%x\n”,(pDevContext->MapRegisterBase),MmGetMdlVirtualAddress(Mdl));

if(pDevContext->ReadAdapter->DmaOperations->FreeMapRegisters != NULL) {
pDevContext->ReadAdapter->DmaOperations->FreeMapRegisters(pDevContext->ReadAdapter, pDevContext->MapRegisterBase, pDevContext->mapReadRegistersAllocated);
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“DirectRead : FreeMapRegisters done @ MapBaseAddress = 0x%x \n”,(pDevContext->MapRegisterBase));
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“DirectRead : FreeMapRegisters = NULL \n”);
}
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“DirectRead : FlushAdapterBuffers = NULL \n”);
}
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead : 0x%x 0x%x 0x%x 0x%x\n”, *(pMemVA), *(pMemVA+5), *(pMemVA+10),*(pMemVA+15));
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead : Can’t Flush \n” );
}

}

Please let me know if something done wrongly?

How to debug such issue? Device functions correctly with Lecroy Exerciser but doesn’t work with Driver.

You don’t need to call MapTransfer for bus master (scatter gather DMA).

And you HAVE TO USE THE FULL 64 BIT PHYSICAL ADDRESS (LowPart AND HighPart).

Os: Win7 32 bit OS
DRAM:4GB
CPU: Xeon - x86 architecture

Always HighPart was zero and command format doesn’t allow 64 bit address programming.

DMA is 32 bit controller and command format has 4 bytes to post physical address.

Also tried without using MapTransfer, but same result.

Thanks.

Show how you initialize DEVICE_DESCRIPTION before IoGetDmaAdapter call.

xxxxx@gmail.com wrote:

The device will get Write Command with DATA and Read Command with Buffer address. Device is able to get command and data from host command and it decodes data and then when read command command with buffer recieved it will write data to address mentioned in Read Command. Device is sending data, that can be seen on PCIe protocol analyzer with correct address and correct TLP format. But data doesn’t reflect at the Host RAM.

If you have a KMDF driver, which it appears that you do, why are you
using the DMA_ADAPTER directly, instead of using the KMDF DMA abstraction?

void DirectReadWrite(PDEVICE_CONTEXT pDevContext)
{
PHYSICAL_ADDRESS phyAddr;
PUCHAR pMemVA = NULL;
ULONG i = 0x2fffffff;
PMDL Mdl = NULL;
ULONG LengthMapped = PAGE_SIZE;

UINT32 uchWriteCommmand[32] = {// some data removed due to confidential};
UINT32 uchReadCommmand[32] = {// some data removed due to confidential };
UINT32 uchWriteData[128] = { // some data removed due to confidential};

If you truly have it written that way, then the compiler has to generate
code on the fly to recreate those tables each and every time you enter
the function. That is, you will have code that does:
uchWriteData[0] = 0x11111111;
uchWriteData[1] = 0x22222222;

That’s terribly wasteful of both time and memory. If you have static
data like this, then declare it as static:

static UINT32 uchWriteCommand[32] = {…};
static UINT32 uchReadCommand[32] = {…};
static UINT32 uchWriteData[128] = {…};

I would also point out the foolishness of using a “uch” Hungarian prefix
on arrays of type “unsigned int”. “uch” is supposed to be a single
variable of type unsigned char. This is why Microsoft code no longer
uses these prefixes.

TraceEvents(TRACE_LEVEL_VERBOSE, DBG_IOCTLS,“–> Entered DirectReadWrite\n”);
pMemVA = (PUCHAR)pDevContext->WriteCommonBufferBase;
memset(pMemVA, 0 , 512);
phyAddr = MmGetPhysicalAddress( pMemVA );
TraceEvents( TRACE_LEVEL_VERBOSE, DBG_INIT, “DirectWrite : Data flow Host to Dev. Copy application data to non paged memory\n”);
memcpy(pMemVA, uchWriteData, 512);

It is absolutely silly to initialize that whole array to 0, and the
promptly overwrite the whole thing with data. It is also poor practice
to use “512” in this case. Use sizeof(uchWriteData) instead. Also, it
would be better to use RtlMoveMemory instead of memcpy. For now, it
happens to compile to the same thing, but RtlMoveMemory is documented
for use in drivers.

When you created your DMA_ADAPTER, did you declare that your device is
only capable of 32-bit addressing?

uchWriteCommmand[6] = (UINT32)(pDevContext->WriteCommonBufferBaseLA.LowPart);
PostCmdToDeviceMemory(pDevContext,CMD_TYPE_ADMIN,(UCHAR *)uchWriteCommmand, 0 );

while( i > 0 )
i–;

Bad, bad habit. I don’t care that this is for debugging. You need to
develop good habits, and good habits mean you use KeDelayExecutionThread
when you need to wait.

When you say “read” and “write” here, from whose point of view are you
speaking? You are calling this operation a “write”. Is that going from
host memory to the device? That would usually be referred to as a
“read”, because the device’s DMA is reading from host memory.

pMemVA = pDevContext->ReadAdapter->DmaOperations->AllocateCommonBuffer(pDevContext->ReadAdapter, PAGE_SIZE, &phyAddr,FALSE);
memset(pMemVA, 0xab ,PAGE_SIZE);

Here, you are setting up a “read”. Is this transferring from the device
to host memory? Again, this would usually be called a “write”, because
the device’s DMA engine is writing to host memory.

So, you allocate a common buffer, and it is always one page in length.
Do you understand that scatter/gather is completely unnecessary in this
case? A common buffer is always physically contiguous, but a single
page is always contiguous anyway. Have you checked that the physical
address you get from AllocateCommonBuffer is the same as the physical
address from MapTransfer?

Also, you never free this common buffer (in the code you showed is).
This will leak a page every time you go through here.

uchReadCommmand[6] = (UINT32)(phyAddr.LowPart); //(UINT32)(pDevContext->ReadCommonBufferBaseLA.LowPart);//(//
PostCmdToDeviceMemory(pDevContext,CMD_TYPE_ADMIN,(UCHAR *)uchReadCommmand, 0);

i = 0x2fffffff;
while( i > 0 )
i–; //Delay just to wait till data written by device. usually data will be written in shorter time then this dealy ( code is to test DMA working)

I just find it very odd that a “read” command is supposed to have “data
written by the device”. Perhaps you should take a step back and make
sure you have the “read” and “write” directions down. You might see the
results you are seeing if you were actually firing a host-to-device
transfer (“read”) when you thought you were doing a device-to-host
transfer (“write”).


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Tim thanks for your detailed explaination of good coding practice.

The actual code written different then one posted. Since command value comes from application to driver to dispatch routine file and multiple functions and then posted code to device Queue exist at BAR space.

Since it needs to step into multiple files and debug unwanted things, I asked some juniors to come up one function does both write and read command to test DMA. We can’t test only read DMA, since device FW needs certain write command and then read command to respond with data.

This code used to test DMA initialization and Read DMA operation functions well or not . Later it will be discarded.

In Real driver, we have MSI and command completion queue. FW will post command completion status and then generate MSI. In DPC, we will read command completion status based on MSI message data.

But FW not enabled MSI and posting command completion to completion queue. So we just used some delay before assuming data written into read buffer.

With this code, we wanted to know DMA related API are called correctly or not. Actual driver written in WDF and initial DMA code written using WDF API and things doesn’t worked we tried WDM DMA approach.

Read DMA adapater has Device-to-host transfer direction. In that respect no issues. About allocating paged size memory also for testing purpose. Two command doesn’t ave data more then 512. These are two command implemented in device FW. Actually other command transfers data upto 1 MB and hence DMA approach tried is scatter gather. DEVICE_DESCRIPTION has scattergather mode, 32 bit addressing and 64 bit addressing false.

I know we should have used packet based mode for small data. But later we need scatter gather mode for actual data based commands.

Have you checked that the physical
address you get from AllocateCommonBuffer is the same as the physical
address from MapTransfer?
Yes. All address same. Even we checked SGL element address same.

We enabled MapTransfer later assuming it may give LA address for DMA, but no use with and without MapTranfer function call.

Some of this code is weird, some is downright scary. Comments below.

The device: PCIe Bus master & Device FW initiates DMA
OS: Win7
Driver: Suppose to allocate memory for Device to Host DMA & Device will
write to allocated memory

The DMA is initialized as part of Prepare HW

The device will get Write Command with DATA and Read Command with Buffer
address. Device is able to get command and data from host command and it
decodes data and then when read command command with buffer recieved it
will write data to address mentioned in Read Command. Device is sending
data, that can be seen on PCIe protocol analyzer with correct address and
correct TLP format. But data doesn’t reflect at the Host RAM.

The Drive code is

IO_ALLOCATION_ACTION
MyAdapterControl(
struct _DEVICE_OBJECT *DeviceObject,
struct _IRP *Irp,
PVOID MapRegisterBase,
PVOID Context
)
{
PDEVICE_CONTEXT pDevContext;
UNREFERENCED_PARAMETER(DeviceObject);
UNREFERENCED_PARAMETER(Irp);
pDevContext = (PDEVICE_CONTEXT) Context;
pDevContext->MapRegisterBase = MapRegisterBase;
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“MyAdapterControl
:Existing”);
return DeallocateObjectKeepRegisters;
}

void myAdapterListControl
(
struct _DEVICE_OBJECT *DeviceObject,
struct _IRP *Irp,
PSCATTER_GATHER_LIST ScatterGather,
PVOID Context
)
{
PDEVICE_CONTEXT pDevContext;
UNREFERENCED_PARAMETER(DeviceObject);
UNREFERENCED_PARAMETER(Irp);

pDevContext = (PDEVICE_CONTEXT) Context;

pDevContext->ScatterGather = ScatterGather;

TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“myAdapterListControl
:physical address Address of non paged mem = 0X%X\n”,
pDevContext->ScatterGather->Elements[0].Address.LowPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“myAdapterListControl
:physical address Address of non paged mem = 0X%X\n”,
pDevContext->ScatterGather->Elements[0].Address.HighPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“myAdapterListControl :Length
= 0X%X\n”, pDevContext->ScatterGather->Elements[0].Length);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“myAdapterListControl
:Existing\n”);
}

void DirectReadWrite(PDEVICE_CONTEXT pDevContext)
{
PHYSICAL_ADDRESS phyAddr;
PUCHAR pMemVA = NULL;
ULONG i = 0x2fffffff;
*****
For example, what is the significance of this very weird number?
*****
PMDL Mdl = NULL;
ULONG LengthMapped = PAGE_SIZE;

UINT32 uchWriteCommmand[32] = {// some data removed due to confidential};
UINT32 uchReadCommmand[32] = {// some data removed due to confidential };

****
How do you know that 32 makes sense? Why is it not some symbolic value
like COMMAND_LENGTH. Is this information overwritten in any way? (static
const UINT32 would be better if it is readonly)
****

UINT32 uchWriteData[128] = { // some data removed due to confidential};
****
You are working in a limited stack environent, yet you allocate on the
stack (32+32+128)*4 bytes. Bad move. You should consider putting this
buffer in the device extension.
****

TraceEvents(TRACE_LEVEL_VERBOSE, DBG_IOCTLS,“–> Entered
DirectReadWrite\n”);
pMemVA = (PUCHAR)pDevContext->WriteCommonBufferBase;
memset(pMemVA, 0 , 512);
*****
Why are you zeroing out a buffer whose contents you are about to replace.
And WHERE did you get that bizarre number 512? Have you ever heard of
‘sizeof’?
*****
phyAddr = MmGetPhysicalAddress( pMemVA );
TraceEvents( TRACE_LEVEL_VERBOSE, DBG_INIT, “DirectWrite : Data flow Host
to Dev. Copy application data to non paged memory\n”);
memcpy(pMemVA, uchWriteData, 512);
*****
You just uselessly set it to all zeroes, and now set it to something else.
Where in the world did you get that bizarre value 512?
****

TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectWrite :Virt Address of
non paged mem = 0X%X\n”, pMemVA);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectWrite:logical Address of
non paged mem = 0X%X\n”, pDevContext->WriteCommonBufferBaseLA.LowPart);

uchWriteCommmand[6] =
(UINT32)(pDevContext->WriteCommonBufferBaseLA.LowPart);
****
And what did you do with the other 32 bits? If you want to believe they
are all 0, fine, but TEST them and if they are not 0, you will gracefully
terminate this operation with a suitable error code. Toss in some
ASSERTMSG as well, for debugging.

This does answer the question I had earlier about writeability; it
couldn’t be static or const.
****

PostCmdToDeviceMemory(pDevContext,CMD_TYPE_ADMIN,(UCHAR
*)uchWriteCommmand, 0 );

while( i > 0 )
i–;
*****
WHAT IN THE WORLD IS THIS SUPPOSED TO DO? There isn’t even a comment to
explain it! It looks like a classic “delay” loop, the kind that has not
made sense since the 8088. Has anyone ever said to you “pipelined
superscalar with instruction pipe, speculative execution, multilevel
cache, and optimizing compiler”?

Any ONE of those features would render the above code at best a joke, and
you are working in an environment where ALL of them are true! If you are
at passive level, put the thread to sleep; if you are at DPC level you
have fallen into the trap of thinking of sequentiality in terms of syntax,
which is deadly. If you are at DPC level, the code above isn’t even a
joke, it is out-and-out WRONG!

At DPC level, you would activate a timer, and in the timer’s DPC routine
you would put the rest of the code. Of course, if the hardware had a
remotely sane design, it would signal its completion with an interrupt!
*****

if(pDevContext->ReadAdapter->DmaOperations->AllocateCommonBuffer != NULL)
{
pMemVA =
****
I saw a bunch of things being done to pMemVA already, so why is it being
assigned here?

If you need pointers to different things, you declare variables with
completely different names. If the variable is supposed to mean the same
thing, there should be exactly one assignment statement for it, in exactly
one place.
****

pDevContext->ReadAdapter->DmaOperations->AllocateCommonBuffer(pDevContext->ReadAdapter,
PAGE_SIZE, &phyAddr,FALSE);
memset(pMemVA, 0xab ,PAGE_SIZE);
****
This is as bizarre as zeroing it out. Of course, if this were to help you
see if data is being written, it could make sense, but it would be written
as

#ifdef DBG
memset(…etc…); // load with known pattern to see if DMA worked
#endif

By failing to mark it as a debugging aid, you only confuse the reader.
*****

TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :Virt Address of
non paged mem = 0X%X\n”, pMemVA);
*****
That should be “0x%p”, since this code will fail horribly in a 64-bit
build. In general, most hex values which are not pointers should be
printed to the maximum precision of the data, which for a UINT32 means
0x%08x. But never use %x to print an address.
*****
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address
Address of non paged mem = 0X%X\n”, phyAddr.LowPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address
Address of non paged mem = 0X%X\n”, phyAddr.HighPart);
*****
TraceEvents(…DirectRead: physical address 0x%08x`%08x", …Highpart,
…LowPart);

(note that %p doesn’t work here, because it works for the native pointer
size, but PHYSICAL_ADDRESS is not the native pointer size for a 32-bit
machine)

if(pDevContext->ReadAdapter->DmaOperations->AllocateAdapterChannel !=
NULL)
{
pDevContext->ReadAdapter->DmaOperations->AllocateAdapterChannel(pDevContext->ReadAdapter,
WdfDeviceWdmGetAttachedDevice(pDevContext->WdfDevice),
pDevContext->mapReadRegistersAllocated,&MyAdapterControl,(PVOID)pDevContext);

} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“AllocateAdapterChannel = NULL
\n”);
}
****
It looks to me like if the adapter channel allocation fails, the next line
will BSOD.

Graceful recovery from errors is not optional, it is mandatory. This
looks like a driver written by an app programmer. And much of this would
not be good practice in an app, either.

I might be missing something because the nesting is next-to-unreadable and
I can’t follow it.
*****

if(pDevContext->ReadAdapter->DmaOperations->GetScatterGatherList
!=NULL)
{
Mdl = IoAllocateMdl(pMemVA, PAGE_SIZE , FALSE, FALSE, NULL);

if( Mdl == NULL ) {
TraceEvents( TRACE_LEVEL_VERBOSE, DBG_INIT, “DirectRead :Mdl
forCommon buffer could not be created. Exiting\n”);
}
****
The few lines above, however, are unambiguous as to what will happen. If
the MDL is NULL, a BSOD follows immediately. This is unbelievably poor
code, even in an app, and totally unacceptable in a driver!
*****

MmBuildMdlForNonPagedPool( Mdl );
KeFlushIoBuffers( Mdl,TRUE, TRUE);
TraceEvents( TRACE_LEVEL_VERBOSE, DBG_INIT, “DirectRead
:KeFlushIoBuffers Flushing cache for data buffer\n”);
if(
pDevContext->ReadAdapter->DmaOperations->GetScatterGatherList(pDevContext->ReadAdapter,WdfDeviceWdmGetAttachedDevice(pDevContext->WdfDevice),
Mdl, pMemVA, PAGE_SIZE,myAdapterListControl, (PVOID)pDevContext,
FALSE) != STATUS_SUCCESS ) {
*****
Why are you getting a scatter/gather list for a common buffer? It’s
already set up for a fifth-rate DMA design that requires contiguous
physical addresses!
*****
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead :
GetScatterGatherList() Failed. Exiting\n”);
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead :
GetScatterGatherList() passed \n”);
}

if(pDevContext->ReadAdapter->DmaOperations->MapTransfer !=NULL)
{
****
Having gotten a s/g list for the contiguous buffer, why are you calling
MapTransfer, which is duplicating the effort? In any case, neither of
these should be necessary because you have the physical address of the
common buffer.
*****
phyAddr = pDevContext->ReadAdapter->DmaOperations->MapTransfer(
pDevContext->ReadAdapter, Mdl, pDevContext->MapRegisterBase , pMemVA,
&LengthMapped, FALSE);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead
:MapTransfer:physical address Address of non paged mem = 0X%X\n”,
phyAddr.LowPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead
:MapTransfer:physical address Address of non paged mem = 0X%X\n”,
phyAddr.HighPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead
:MapTransfer:Mapped Length = 0X%X\n”, LengthMapped);
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“MapTransfer = NULL \n”);
}

} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“GetScatterGatherList = NULL
\n”);
}
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“AllocateCommonBuffer = NULL
\n”);
******
This is not an application. Emitting this message is insufficient. You
must gracefully complete the IRP with an informative error code. And what
happens if this occurs in the release version where the TraceEvents
probably has compiled to nothing, and the customer couldn’t see the
message even if it came out? Remember: tech support knows where you live!
****
}

TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :Virt Address of
non paged mem = 0X%X\n”, pMemVA);
//TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :logical Address
of non paged mem = 0X%X\n”, pDevContext->ReadCommonBufferBaseLA.LowPart);
//TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :logical Address
of non paged mem = 0X%X\n”,
pDevContext->ReadCommonBufferBaseLA.HighPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address
Address of non paged mem = 0X%X\n”, phyAddr.LowPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address
Address of non paged mem = 0X%X\n”, phyAddr.HighPart);

uchReadCommmand[6] = (UINT32)(phyAddr.LowPart);
//(UINT32)(pDevContext->ReadCommonBufferBaseLA.LowPart);//(//
PostCmdToDeviceMemory(pDevContext,CMD_TYPE_ADMIN,(UCHAR
*)uchReadCommmand, 0);

i = 0x2fffffff;
while( i > 0 )
i–; //Delay just to wait till data written by device. usually data will
be written in shorter time then this dealy ( code is to test DMA
working)

****
(a) this is the wrong way to delay (b) who designed the P.O.S. device so
that it doesn’t interrupt?
****

if( pDevContext->ReadAdapter->DmaOperations->PutScatterGatherList !=
NULL )
{
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead :
PutScatterGatherList \n” );
pDevContext->ReadAdapter->DmaOperations->PutScatterGatherList(pDevContext->ReadAdapter,
pDevContext->ScatterGather,FALSE);
i = 0x4fffffff;
while( i > 0 )
i–;
****
This doesn’t even make sense! A horrible delay, and for no discernable
purpose.
*****
if( pDevContext->ReadAdapter->DmaOperations->FlushAdapterBuffers !=
NULL )
{
pDevContext->ReadAdapter->DmaOperations->FlushAdapterBuffers(pDevContext->ReadAdapter,
Mdl,pDevContext->MapRegisterBase, MmGetMdlVirtualAddress(Mdl),
PAGE_SIZE, FALSE);
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“DirectRead :
FlushAdapterBuffers done @ MapBaseAddress = 0x%x Logical Address =
0x%x\n”,(pDevContext->MapRegisterBase),MmGetMdlVirtualAddress(Mdl));

if(pDevContext->ReadAdapter->DmaOperations->FreeMapRegisters != NULL) {
pDevContext->ReadAdapter->DmaOperations->FreeMapRegisters(pDevContext->ReadAdapter,
pDevContext->MapRegisterBase, pDevContext->mapReadRegistersAllocated);
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“DirectRead : FreeMapRegisters
done @ MapBaseAddress = 0x%x \n”,(pDevContext->MapRegisterBase));
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“DirectRead : FreeMapRegisters
= NULL \n”);
}
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT,“DirectRead :
FlushAdapterBuffers = NULL \n”);
}
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead : 0x%x 0x%x 0x%x
0x%x\n”, *(pMemVA), *(pMemVA+5), *(pMemVA+10),*(pMemVA+15));
} else {
TraceEvents(TRACE_LEVEL_ERROR, DBG_INIT, “DirectRead : Can’t Flush \n”
);
}

***
I didn’t try to follow all the complex nesting, but I’m concerned about
error messages being issued but no recovery code visible.
*****

}

Please let me know if something done wrongly?

How to debug such issue? Device functions correctly with Lecroy Exerciser
but doesn’t work with Driver.

*****
Your driver is doing something wrong. If the device is known to work in
some context, analyzing what the working context does compared to your
driver would be a good start. A PCI bus analyzer is a pretty big hammer
to use, but it may be what is ultimately required. Somehow, your driver
is not saying the right things to your device.

But the first thing I’d suggest is a lot of code cleanup, including
eliminating those awful counting loops!
*****


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

I didn’t understand even after saying it is written just to test DMA by some juniors. Anyways I will give below code removing all unwanted stuff.

here the code goes

IO_ALLOCATION_ACTION
MyAdapterControl(
struct _DEVICE_OBJECT *DeviceObject,
struct _IRP *Irp,
PVOID MapRegisterBase,
PVOID Context
)
{
PDEVICE_CONTEXT pDevContext;
UNREFERENCED_PARAMETER(DeviceObject);
UNREFERENCED_PARAMETER(Irp);

pDevContext = (PDEVICE_CONTEXT) Context;
pDevContext->MapRegisterBase = MapRegisterBase;
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“MyAdapterControl :Existing”);

return DeallocateObjectKeepRegisters;
}

void myAdapterListControl
(
struct _DEVICE_OBJECT *DeviceObject,
struct _IRP *Irp,
PSCATTER_GATHER_LIST ScatterGather,
PVOID Context
)
{
PDEVICE_CONTEXT pDevContext;
UNREFERENCED_PARAMETER(DeviceObject);
UNREFERENCED_PARAMETER(Irp);

pDevContext = (PDEVICE_CONTEXT) Context;

pDevContext->ScatterGather = ScatterGather;

TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“myAdapterListControl :Existing\n”);
}

UINT32 DirectReadWrite(PDEVICE_CONTEXT pDevContext)
{
PHYSICAL_ADDRESS phyAddr;
PUCHAR pMemVA = NULL;
PMDL Mdl = NULL;
ULONG LengthMapped = PAGE_SIZE;

if(pDevContext->ReadAdapter->DmaOperations->AllocateCommonBuffer != NULL)
{
pMemVA = pDevContext->ReadAdapter->DmaOperations->AllocateCommonBuffer(pDevContext->ReadAdapter, PAGE_SIZE, &phyAddr, FALSE);

TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address
Address of non paged mem = 0X%8X\n”, phyAddr.LowPart);
TraceEvents(TRACE_LEVEL_ERROR, DBG_IOCTLS,“DirectRead :physical address
Address of non paged mem = 0X%8X\n”, phyAddr.HighPart);
} else {
// Returns approapiate error . But code doesn’t hit this section
}

if(pDevContext->ReadAdapter->DmaOperations->AllocateAdapterChannel != NULL)
{
pDevContext->ReadAdapter->DmaOperations->AllocateAdapterChannel (pDevContext->ReadAdapter, WdfDeviceWdmGetAttachedDevice(pDevContext->WdfDevice), pDevContext->mapReadRegistersAllocated, &MyAdapterControl, (PVOID)pDevContext);

} else {
// Returns approapiate error . But code doesn’t hit this section
}

Mdl = IoAllocateMdl(pMemVA, PAGE_SIZE , FALSE, FALSE, NULL);
//MDL created to call KeFlushIOBuffer.
if( Mdl == NULL ){
// Returns approapiate error . But code doesn’t hit this section

}

MmBuildMdlForNonPagedPool( Mdl );
KeFlushIoBuffers( Mdl,TRUE, TRUE);

//Fill Address = (UINT32)(phyAddr.LowPart) into read command and Posted command to Device memory (BAR space) . This code works fine and not related to DMA so removed to look only DMA related code

// Wait for certain time Delay just to wait till data written by device. usually data will be
written in shorter time then this dealy.

if( pDevContext->ReadAdapter->DmaOperations->FlushAdapterBuffers != NULL )
{
pDevContext->ReadAdapter->DmaOperations->FlushAdapterBuffers (pDevContext->ReadAdapter, Mdl,pDevContext->MapRegisterBase, MmGetMdlVirtualAddress(Mdl), PAGE_SIZE, FALSE);

} else {
// Returns approapiate error . But code doesn’t hit this section

}

Return error;

}

Please comment on this code. This approach is also tested by removing scatter gather list creation and map transfer creation. But this still doesn’t work.

Thanks

xxxxx@gmail.com wrote:

I didn’t understand even after saying it is written just to test DMA by some juniors. Anyways I will give below code removing all unwanted stuff.

Please comment on this code. This approach is also tested by removing scatter gather list creation and map transfer creation. But this still doesn’t work.

This code never triggers the hardware DMA. Your code does this:
* AllocateCommonBuffer
* AllocateAdapterChannel
* IoAllocateMdl
* MmBuildMdlForNonPagedPool
* KeFlushIoBuffers
* FlushAdapterBuffers

None of those calls cause any transfers to occur. Were you just asking
about the philosophy? The philosophy should be irrelevant, if you are
actually seeing the bus master packets in your PCIExpress bus analyzer.
All of that mucking around is just to get legitimate physical
addresses. If you have physical addresses, then the code is correct.

If the bus master transfer packets really are coming across the bus,
then the data has to be going into memory. There’s just no way that
path can break. Is it possible you are misinterpreting the bus analyzer
output? Can you post some of the relevent packets here?


Another stylistic item. I find this odd:

if(pDevContext->ReadAdapter->DmaOperations->AllocateCommonBuffer != NULL)

It would not occur to me to test this in every call. Why wouldn’t you
validate the functions once when you fetch the adapter? Indeed, if
IoGetDmaAdapter doesn’t return an error, why would you validate these
function pointers at all?

I am also still wondering why you are using the WDM-style DMA
abstraction in a WDF driver.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

I would not assume the h/o 32 bits are zero. The failure mode if they are
not is catastrophic, so I’d chech them and fail the operation gracefully
if they were nonzero. If I’ve learned one thing about writing drivers,
it’s that you cannot assume anything to be in an “expected state” and
everything is double-checked, triple-checked, and all failures are
graceful (for exaple, theIRP is completed, the next one is dequeued, no
allocations are left unfreed, etc.).

I learned this from reading code examples, and then discovering that most
BSOD and other unpleasant behaviors were usually because someone failed to
check a pointer for non-NULL or failed to check a return value for
failure. A switch statement without a default: is usually a good warning
that the coding is wonky, and if-statements that test for consistency but
don’t have an else clause to handle recovery is another warning. If there
is any possible doubt, check and recover.
joe

Os: Win7 32 bit OS
DRAM:4GB
CPU: Xeon - x86 architecture

Always HighPart was zero and command format doesn’t allow 64 bit address
programming.

DMA is 32 bit controller and command format has 4 bytes to post physical
address.

Also tried without using MapTransfer, but same result.

Thanks.


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer