
PCIe dma not working as expected

DavidN Member Posts: 4

Hi all. I'm in the process of developing a KMDF PCIe driver and am having some trouble getting device->host/host->device DMA to work properly (it works perfectly fine when communicating with itself). As far as I can tell everything is being set up properly but the data is not being transferred from src to dst.

In the driver initialization I'm using WdfCommonBufferCreate and writing data to the virtual address obtained from WdfCommonBufferGetAlignedVirtualAddress.
I then have an application that initiates the DMA: I initialize the dst buffer, write the src address, dst address, and transfer size to the AXI CDMA registers, and then wait for the signal that the transfer has completed. The status and control registers indicate that the transfer should have worked; however, when reading from dst, nothing new has been written.

This is my first post and also first experience with driver development (I'm a Co-op) so I'd greatly appreciate as much detail as possible in answers & also apologize if I've not included enough information in this initial post.

Thanks in advance.

Comments

  • Peter_Viscarola_(OSR) Administrator Posts: 8,704

    Welcome to the community Mr. @DavidN -- We'll try to be gentle ;-)

    Before we dive down into any more detail, can you confirm for me please that the Common Buffer address that you're giving to the hardware as the target for your DMA operations is the address you get from WdfCommonBufferGetAlignedLogicalAddress?

    Are you using Simple DMA Mode or Scatter/Gather Mode for your transfers?

    Also, please verify that you have WinDbg set up... and that you're actually looking at the buffer in question via WinDbg. I assume that, setting up for a DMA write operation to memory (data coming FROM the device TO host memory), you initialize the buffer to something (just as a test, for example, you set it to all 0xFF or something), setup and do the DMA write to memory, then look at the memory buffer and see.. that it hasn't changed??
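A minimal sketch of the pattern test described above, written as plain user-mode C so the idea can be tried outside a driver. The function names `fill_test_pattern` and `count_changed_bytes` are hypothetical helpers, not WDK APIs; in a real driver the buffer would be the kernel virtual address of the common buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill the buffer with a known byte pattern before starting the DMA. */
void fill_test_pattern(uint8_t *buf, size_t len, uint8_t pattern)
{
    memset(buf, pattern, len);
}

/* Count the bytes that no longer hold the pattern.
 * A result of 0 after the transfer suggests the device wrote nothing here. */
size_t count_changed_bytes(const uint8_t *buf, size_t len, uint8_t pattern)
{
    size_t changed = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != pattern) {
            changed++;
        }
    }
    return changed;
}
```

One caveat with this kind of check: if the device happens to write bytes equal to the pattern, they count as unchanged, so it may be worth picking a pattern the device is unlikely to produce.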

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Gregory_G._Dyess Member - All Emails Posts: 381
    via Email
    A common DMA gotcha I've seen, especially on ARM platforms, is the need for a memory barrier between the write to the buffer and the write to trigger the DMA controller.  This bit me BIG TIME on a PCIe driver in the past.

    Most modern processors have "out of order" execution capabilities.  Without the memory barrier, the write to memory can sometimes be incomplete by the time the DMA operation is triggered.
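The ordering concern above can be sketched in portable C11. This is only a user-mode model under stated assumptions: `fake_dma_regs` and `start_dma_with_barrier` are hypothetical names, the "doorbell" is an ordinary variable rather than a device register, and C11 fences are formally specified in terms of atomic operations. A real Windows driver would use the kernel's own barrier and register-access routines instead.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device doorbell register. */
typedef struct {
    volatile uint32_t doorbell;
} fake_dma_regs;

/* Copy the payload into the DMA buffer, then ring the doorbell.
 * The full fence sits between the two so that, in practice, neither the
 * compiler nor the CPU moves the buffer stores past the doorbell store. */
void start_dma_with_barrier(uint8_t *dma_buf, const uint8_t *payload,
                            size_t len, fake_dma_regs *regs)
{
    memcpy(dma_buf, payload, len);              /* 1. fill the source buffer */
    atomic_thread_fence(memory_order_seq_cst);  /* 2. full memory fence      */
    regs->doorbell = 1u;                        /* 3. trigger the transfer   */
}
```

The point of the design is simply the position of the fence: after the last write to the buffer and before the write that starts the device.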

    Greg


  • Peter_Viscarola_(OSR) Administrator Posts: 8,704

    Great point. The docs are wildly remiss in not mentioning this for KMDF and Common Buffers. KeFlushIoBuffers is your friend.

    If you’re using the Packet Based interface (DMA Transactions) the flushing should be taken care of for you. Is that not your experience Mr. @Gregory_G._Dyess ?

    I’ve yet to do any bus master DMA on ARM. I’m looking forward to the chance, though.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Gregory_G._Dyess Member - All Emails Posts: 381
    via Email
    I have to admit I've not done a lot of driver development for desktop Windows in a LONG time.  My entire Windows kernel development now is limited to Windows Embedded and Linux on ARM processors.  Windows Embedded does things completely differently when it comes to drivers (except for NDIS).

    Greg


  • Ramakrishna_Saripalli Member Posts: 74

    I am running into a similar problem. With or without VT-d, I start DMA on my device. Apparently this is a Synopsys DMA engine. The DMA controller does increment the source and target addresses by the transfer size but the data does not show up (I am doing device to memory). I allocated an MDL for the common buffer (It is 2 pages long) and built the physical pages using MmBuildMdlForNonPagedPool during driver bringup.

    Before DMA start, I am calling KeFlushIoBuffers(MDL, TRUE, TRUE) and still data does not show up in my common buffer.

    I am wondering if I have to transition to a packet DMA model where I think the flushing of caches is done by the framework when I complete the transaction. Unfortunately, my DMA is not interrupt driven, it is just polling but that should be ok I think.

  • Peter_Viscarola_(OSR) Administrator Posts: 8,704

    @Ramakrishna_Saripalli …. Why are you building that MDL?

    You don’t need to change to packet mode unless that’s what your user interface needs/wants.

    Let’s start by you, also, answering the same questions I posed to the OP.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Ramakrishna_Saripalli Member Posts: 74
    @"Peter_Viscarola_(OSR)" The WDK says that in order to use KeFlushIoBuffers I need an MDL.

    Wrt your questions: yes, I am getting the logical address and writing the lower 32 and upper 32 bits into the DMA controller registers.

    In fact, after I start the DMA I can see the dest addr and src addr getting incremented by the transfer size. But the data does not show up.
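For readers following along, here is a small sketch of splitting a 64-bit logical address into the two 32-bit register values, plus a helper to rejoin them as a sanity check. The `dma_addr_regs` struct and both function names are hypothetical; real register names and layout come from the device's documentation.

```c
#include <stdint.h>

/* Hypothetical pair of 32-bit address registers. */
typedef struct {
    uint32_t addr_low;
    uint32_t addr_high;
} dma_addr_regs;

/* Split a 64-bit logical address into its lower and upper 32 bits. */
dma_addr_regs split_logical_address(uint64_t logical_address)
{
    dma_addr_regs r;
    r.addr_low  = (uint32_t)(logical_address & 0xFFFFFFFFu);
    r.addr_high = (uint32_t)(logical_address >> 32);
    return r;
}

/* Rebuild the 64-bit address from the two halves. */
uint64_t join_logical_address(dma_addr_regs r)
{
    return ((uint64_t)r.addr_high << 32) | r.addr_low;
}
```

Reading the registers back and rejoining them is a cheap way to confirm that the device really holds the full address that was intended, including the upper half.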
  • Ramakrishna_Saripalli Member Posts: 74

    Thought I would share my source code. This is one of the Synopsys DMA controllers (I do not have the specs for it). Instead I have the linux driver code so I am trying to make windows work from it. One of these days, I am going to boot Ubuntu 18.04, build the driver and see if it works. But the linux driver code was given to me so I am hoping it is the reference model.

    The following is done during initialization. I have removed the code for error checking (The driver has it though).
    WdfDeviceSetAlignmentRequirement(
        device_ctxt,
        FILE_BYTE_ALIGNMENT);

    WDF_DMA_ENABLER_CONFIG_INIT(
        &dmaConfig,
        WdfDmaProfilePacket64, /* device is capable of addressing all 64-bits but no scatter gather */
        8192);

    status = WdfDmaEnablerCreate(
        p_x1_device_ctxt->device,
        &dmaConfig,
        WDF_NO_OBJECT_ATTRIBUTES,
        &device_ctxt->DmaEnablerHandle
    );

    status = WdfCommonBufferCreate(
        device_ctxt->DmaEnablerHandle,
        p_x1_device_ctxt->common_dma_buffer_size, /* This is equal to 8192 */
        WDF_NO_OBJECT_ATTRIBUTES,
        &device_ctxt->h_common_dma_buffer
    );

    /* Kernel virtual address: used by the driver to touch the buffer. */
    device_ctxt->common_dma_buffer_kernel_va =
        WdfCommonBufferGetAlignedVirtualAddress(device_ctxt->h_common_dma_buffer);
    /* Device logical address: the value programmed into the DMA controller. */
    device_ctxt->common_dma_buffer_device_la =
        WdfCommonBufferGetAlignedLogicalAddress(device_ctxt->h_common_dma_buffer);

    RtlZeroMemory(....); /* zero out the common buffer */

    device_ctxt->common_buffer_mdl = IoAllocateMdl(device_ctxt->common_dma_buffer_kernel_va, 8192, FALSE, FALSE, NULL);

    

    Before the DMA starts, driver writes a pattern of 0xdeadbeef to the DMA buffer using another ioctl. I have verified this works.

    When the driver gets the ioctl to start the DMA (from device memory to system memory), it runs the logic below. The ioctl is METHOD_BUFFERED and provides the size of the DMA and other parameters.

    KeFlushIoBuffers(device_ctxt->common_buffer_mdl, TRUE, TRUE);

    < I even threw in a __wbinvd() here to flush out the whole cache hierarchy. Did not make a difference >


    regs->dest_addr_low = inputBuffer->internal_addr & 0xFFFFFFFF;
    regs->dest_addr_high = inputBuffer->internal_addr >> 32;
    regs->source_addr_low = device_ctxt->common_dma_buffer_device_la.LowPart;
    regs->source_addr_high = device_ctxt->common_dma_buffer_device_la.HighPart;

    regs->transfer_size = inputBuffer->num_dma_bytes;
    < another write to another register in the DMA for some control operation>.
    MemoryBarrier();
    < write to the doorbell register to start the DMA>

    After the above operation, I can see the dest_addr_low and source_addr_low being incremented by the num_dma_bytes.
    The transfer_size register turns to zero.

    But the data is not showing up in the common buffer DMA. I have a DbgPrintEx after the DMA for the first DWORD of the common buffer and it still shows 0xdeadbeef.

  • Gregory_G._Dyess Member - All Emails Posts: 381
    via Email
    This almost sounds like a stale cache on the destination side of the DMA transfer. 

    Try cleaning the cache on the destination buffer before starting the DMA transfer. 
    Then do a memory barrier after the DMA completes.

    Greg


  • Ramakrishna_Saripalli Member Posts: 74

    @Gregory_G._Dyess I agree but the KeFlushIoBuffers() call should do that and that is being done right before the DMA operation.
    This link https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/flushing-cached-data-during-dma-operations talks about KeFlushIoBuffers.

    I also have a MemoryBarrier() just before writing to the doorbell to start the DMA so I know that all the operations before that have been completed before the DMA operation starts.

    But you are recommending a MemoryBarrier() after the DMA operation completes. Is the KeFlushIoBuffers() not good enough? I do not have any read or write operations to the DMA buffer between the KeFlushIoBuffers and the DMA operation (which might cause the cache lines to be refilled again).

  • Gregory_G._Dyess Member - All Emails Posts: 381
    via Email
    I am suggesting the following sequence of operation:

    1. write data to source buffer
    2. flush buffers and caches of source buffer
    3. Memory barrier
    4. Invalidate cache of destination buffer
    5. perform DMA
    6. Memory barrier
    7. Enjoy your data

    Keep in mind, I am primarily a kernel developer on Arm architectures (even worked for Arm and taught software architecture classes while there).  I saw an issue with a PCIe driver that was very similar to this one on a Xilinx-based RFSoC (4x Cortex A53) and it was the out of order execution and caches that turned out to be the issue.  The sequence I described above is recommended to ensure there are no stale caches, no data left in the HW write buffers within the processor, and no out of order execution issues.

    Intel might be different.

    Greg
     
  • DavidN Member Posts: 4

    Thank you for the warm welcome,

    To answer your questions Peter;
    Yes I am giving the logical address from WdfCommonBufferGetAlignedLogicalAddress to the hardware.
    The configuration mode for the dma_enabler is WdfDmaProfileScatterGather64Duplex.
    I do have winDbg setup and checking the memory shows the same data I initialized it to both before and after the dma.

    I'll give it a shot using KeFlushIoBuffers + memory barrier and let you all know if anything interesting happens

    Thanks for the many responses so far.
    -David

  • Ramakrishna_Saripalli Member Posts: 74

    @DavidN I would love to see if you have any better luck with this than I have. Do you have an IOMMU enabled on your setup (assuming it supports one)? FYI, my results are the same whether VT-d is enabled or disabled.

  • Peter_Viscarola_(OSR) Administrator Posts: 8,704

    Hmmm… I’m not sayin’ Mr @Gregory_G._Dyess isn’t correct… but,

    1) When I hear hoof beats, I tend to think horses not zebras

    2) I have never seen a case where a properly constructed Windows driver needs to manually add a memory barrier into the code. This is supposed to be all handled by the Windows abstractions. Note we don’t code memory barriers around register reads and writes (though, in theory at least, you might have to call KeFlushIoBuffers).

    Which memory barrier, or fence, specifically are you recommending these guys add, Mr @Gregory_G._Dyess ?

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Ramakrishna_Saripalli Member Posts: 74

    @Peter_Viscarola_(OSR) that is what I thought too. Given that PCIe memory register reads and writes are mapped to uncached regions, I did not think memory barriers were needed. The processor (at least x64) does not reorder around uncached regions. At least that is my understanding.

    I can see why the barriers are needed if we access cached regions (normal memory).

    Thanks,
    RK

  • Gregory_G._Dyess Member - All Emails Posts: 381
    via Email
    If I were on a Zebra Farm instead of a cattle ranch.....

    If it were an Arm core, it would be a memory write barrier (DMB) after writing the data and before triggering the DMA.  After the DMA completes, a memory read barrier (DMB).  A Data Synchronization Barrier would be too heavy handed and, if used too often, would kill system performance.

    Again, my experience is with Arm A-Class cores, not Intel.  Principles are similar but execution might be different. 

    As I said, I've not written a desktop Windows driver in 20 years.  I write mostly Windows Embedded Compact (and Linux) kernel drivers now.  I'm not claiming the barriers are a fix-all.  Just the behavior being described sounded a lot like a stale cache and/or out of order execution.

    Greg


  • Gregory_G._Dyess Member - All Emails Posts: 381
    via Email
    Barriers are not for synchronizing caches.  That would be the cache maintenance instructions which, as Mr Viscarola pointed out, should be handled by the Windows-provided driver framework(s).  The barriers simply keep the processor from executing certain sequences of code out of order.

    Greg


  • DavidN Member Posts: 4

    Unfortunately no luck yet for me. IOMMU being enabled/disabled didn't influence anything on my end either, nor did KeFlushIoBuffers. My initialization code looks very similar to what you posted, RK.
    Though it's possible I'm doing something wrong, I appear to be following the documentation on the use of KeFlushIoBuffers. Just in case, I tried moving it around to different points in the code with different configurations, to no avail. To be honest I'm at a bit of a loss for what's even left to try, but I'll keep at it and let you know of any breakthroughs.

    Some extra info, if it helps at all: the destination addr is a block RAM on the FPGA, so unless I'm mistaken I don't think it should be subject to cache problems.

    Thanks,
    David


  • Ramakrishna_Saripalli Member Posts: 74

    @DavidN my design is using the Synopsys Designware cores pci express controller. I have not had much luck either although I have a feeling I might be programming the DMA controller incorrectly. You are right. Unless the design has a cache for the onboard RAM, it should not be subject to cache problems.

  • Peter_Viscarola_(OSR) Administrator Posts: 8,704

    Guys... there's GOT to be a simple explanation here. I can pretty much guarantee you that, whatever your problem is, it has nothing to do with cache, memory barriers, fences, neutrinos, left-spin vs right-spin, or anything else similarly esoteric.

    If you were experienced Windows devs, and all of a sudden, you were seeing a problem... or if you were seeing a problem SOMEtimes... or you were on ARM64, which probably hasn't been nearly as well tested (in terms of the Windows abstractions)... then MAYBE I'd buy that this is a memory barrier problem.

    Even worrying about KeFlushIoBuffers is a bit of a stretch. Until just a few years ago, this function was a noop on x86 and x64 architecture systems. From wdm.h:

    #if (NTDDI_VERSION >= NTDDI_WINTHRESHOLD)
    
    VOID
    KeFlushIoBuffers (
        _In_ PMDL Mdl,
        _In_ BOOLEAN ReadOperation,
        _In_ BOOLEAN DmaOperation
        );
    
    #else
    
    #define KeFlushIoBuffers(Mdl, ReadOperation, DmaOperation)
    
    #endif
    
    
    

    Soooo... isn't it much more likely that (a) there's a bug in your FPGA, or (b) you're making some simple error in your Windows API calls?

    I just did this with a FPGA dev, four weeks ago. He SWORE he was doing a DMA to the memory segment I provided... but on closer inspection of his code, he saw... oooops... wrong address. So he was doing a DMA to some random place in physical memory. Ooopsie!

    And presence or absence of an IOMMU has no bearing on anything. This is all cooked into calling GetAlignedLogicalAddress... the "Logical Address" is provided by the HAL and takes into account the IOMMU. This is why we don't call MmGetPhysicalAddress, but instead WdfCommonBufferGetAlignedLogicalAddress.

    SO, let's go back to first principles, shall we?

    • Let's be sure you're programming your registers with the LogicalAddress -- all 64 bits of it.
    • Let's be sure the rest of the registers are set up right... For the guy using AXI CDMA... did you ever tell me whether this was simple mode or not? Regardless, see if you can get things working first with simple mode... then if you need to worry about S/G and descriptors you can.
    • Let's be sure you're looking at the data after a device-to-host memory transfer in the debugger, and NOT from some program you've written (too many chances for errors)
    • Let's make sure that when you look at the data in the debugger, you try looking at it using the memory window, using first the kernel virtual address (that you get back from GetAlignedVirtualAddress) and then the "physical memory" address you get back from GetAlignedLogicalAddress
    • Set up ChipScope or SignalTap or whatever... and see if you can monitor that DMA operation (easy for ME to say, never having actually used either one of these tools... I'm a host-side software guy, not an FPGA guy... though I sometimes masquerade as one).

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Peter_Viscarola_(OSR) Administrator Posts: 8,704

    Just for fun, here's the code for KeFlushIoBuffers from Windows 20H1:

    nt!KeFlushIoBuffers:
    fffff804`3f52f7b0 48895c2410      mov     qword ptr [rsp+10h],rbx
    fffff804`3f52f7b5 48896c2418      mov     qword ptr [rsp+18h],rbp
    fffff804`3f52f7ba 56              push    rsi
    fffff804`3f52f7bb 57              push    rdi
    fffff804`3f52f7bc 4154            push    r12
    fffff804`3f52f7be 4156            push    r14
    fffff804`3f52f7c0 4157            push    r15
    fffff804`3f52f7c2 4883ec60        sub     rsp,60h
    fffff804`3f52f7c6 488b05f3c98e00  mov     rax,qword ptr [nt!_security_cookie (fffff804`3fe1c1c0)]
    fffff804`3f52f7cd 4833c4          xor     rax,rsp
    fffff804`3f52f7d0 4889442450      mov     qword ptr [rsp+50h],rax
    fffff804`3f52f7d5 8b055db98e00    mov     eax,dword ptr [nt!KiSystemFullyCoherent (fffff804`3fe1b138)]
    fffff804`3f52f7db 0f57c0          xorps   xmm0,xmm0
    fffff804`3f52f7de 418ae8          mov     bpl,r8b
    fffff804`3f52f7e1 448af2          mov     r14b,dl
    fffff804`3f52f7e4 488bf9          mov     rdi,rcx
    fffff804`3f52f7e7 0f11442430      movups  xmmword ptr [rsp+30h],xmm0
    fffff804`3f52f7ec 0f11442440      movups  xmmword ptr [rsp+40h],xmm0
    fffff804`3f52f7f1 85c0            test    eax,eax
    fffff804`3f52f7f3 0f84d37c1500    je      nt!KeFlushIoBuffers+0x157d1c (fffff804`3f6874cc)
    fffff804`3f52f7f9 488b4c2450      mov     rcx,qword ptr [rsp+50h]
    fffff804`3f52f7fe 4833cc          xor     rcx,rsp
    fffff804`3f52f801 e88a380900      call    nt!_security_check_cookie (fffff804`3f5c3090)
    fffff804`3f52f806 4c8d5c2460      lea     r11,[rsp+60h]
    fffff804`3f52f80b 498b5b38        mov     rbx,qword ptr [r11+38h]
    fffff804`3f52f80f 498b6b40        mov     rbp,qword ptr [r11+40h]
    fffff804`3f52f813 498be3          mov     rsp,r11
    fffff804`3f52f816 415f            pop     r15
    fffff804`3f52f818 415e            pop     r14
    fffff804`3f52f81a 415c            pop     r12
    fffff804`3f52f81c 5f              pop     rdi
    fffff804`3f52f81d 5e              pop     rsi
    fffff804`3f52f81e c3              ret
    

    and (on my X64 VM):

    0: kd> dd nt!KiSystemFullyCoherent
    fffff804`3fe1b138  00000001 000032c9 00000001 02000504
    
    

    So... put as many calls to KeFlushIoBuffers as you want, anywhere you want... ;-)

    Let me hasten to add that the above is strictly aimed at x86/x64 architecture machines. ARM... is a different story. And Mr @Gregory_G._Dyess in his comments hasn't been talking about cache flushing in any case, he's been talking about instruction re-ordering.
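Read as C, the control flow in the listing above looks roughly like the model below: the function tests the coherency flag and, when it is nonzero, falls straight through to the epilogue. This is only an interpretation of the disassembly shown. `model_flush_io_buffers`, `flush_caches_slow_path`, and `slow_path_flushes` are hypothetical names, and the body of the slow path is a placeholder because the listing does not show it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts how many times the placeholder slow path ran. */
unsigned slow_path_flushes = 0;

/* Placeholder for the cache-maintenance work behind the je target,
 * which is not part of the listing. */
void flush_caches_slow_path(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    slow_path_flushes++;
}

/* Model of the listing: "test eax,eax / je" branches to the slow path only
 * when the coherency flag is 0; otherwise the function simply returns. */
void model_flush_io_buffers(bool system_fully_coherent,
                            const void *buf, size_t len)
{
    if (system_fully_coherent) {
        return;                         /* nothing to flush */
    }
    flush_caches_slow_path(buf, len);   /* non-coherent systems only */
}
```

With the flag set to 1, as in the `dd` output above, every call takes the early return, which is why extra calls make no difference on that machine.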

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Ramakrishna_Saripalli Member Posts: 74

    @Peter_Viscarola_(OSR) thank you for your very detailed analysis. I have a feeling, at least in my case, I do not understand the Synopsys DMA controller programming model so there is a chance (pretty big) that I am not programming the DMA controller properly. I am still looking through the documentation. I suppose I should refrain from posting anything here until I get that sorted out.

  • DavidN Member Posts: 4

    We were able to resolve the problem!
    Many thanks to everyone who helped both in and out of this thread.

    The issue, in case it helps someone else, was that we are using a 32-bit address space on the Zynq FPGA. To map from the address space of the FPGA to the 64-bit address space of the host we use the Address Translation registers of the AXI Memory Mapped to PCI Express core that we are using ( https://www.xilinx.com/support/documentation/ip_documentation/axi_pcie/v2_8/pg055-axi-bridge-pcie.pdf ). We correctly set the AXI Base Address Translation Configuration Registers, however what we missed was that the lower address translation registers are limited by the address width of each AXI BAR, which is described on pg. 9 of XAPP1171 ( https://www.xilinx.com/support/documentation/application_notes/xapp1171-pcie-central-dma-subsystem.pdf ). In our case we have 64MB AXI BAR addresses, so had to call WdfDeviceSetAlignmentRequirement with 0x3FFFFFF. Hopefully this helps you too @Ramakrishna_Saripalli otherwise I wish you good luck getting to the bottom of your bug.
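To make the arithmetic behind that fix concrete, here is a small sketch in plain C. It assumes a simplified translation model in which the upper address bits come from the translation register and the lower bits (the width of the AXI BAR window) come from the AXI address; that model and the names `alignment_mask`, `is_aligned`, and `translate_axi_to_host` are illustrative assumptions, not the bridge's documented behavior in full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Alignment mask for a power-of-two window size, e.g. 64 MB -> 0x3FFFFFF. */
uint64_t alignment_mask(uint64_t window_size)
{
    assert(window_size != 0 && (window_size & (window_size - 1)) == 0);
    return window_size - 1;
}

/* True when the host address has no bits set below the window size. */
bool is_aligned(uint64_t host_addr, uint64_t window_size)
{
    return (host_addr & alignment_mask(window_size)) == 0;
}

/* Assumed model: upper bits from the translation register,
 * lower bits from the AXI-side address within the window. */
uint64_t translate_axi_to_host(uint64_t translation_base,
                               uint64_t axi_offset,
                               uint64_t window_size)
{
    uint64_t mask = alignment_mask(window_size);
    return (translation_base & ~mask) | (axi_offset & mask);
}
```

Under this model, a buffer at 0x12345000 with a 64 MB window would have its low 26 bits dropped, so offset 0 would land at 0x10000000 rather than at the buffer. Requesting 0x3FFFFFF alignment avoids that by making sure those bits are already zero in the buffer's base address.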

    Thanks again everyone and enjoy the weekend.
    -David

  • Peter_Viscarola_(OSR) Administrator Posts: 8,704

    Nice follow-up… thanks for telling us the ultimate solution to your problem.

    Yup… Horses.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers
