Setting up DMA - User Buffer, Handing I/O

OneKneeToe Member Posts: 42

Happy New Year OSR (it's still January, and it's my first post - do I get an OK? :smile: )

So, after having success with exposing a Kernel-Mode common buffer to user-space to facilitate an FPGA sending data (DMA) from FPGA Memory (simplified, I know) to host memory, I am ready to attempt the safer approach of using the "Hanging Direct I/O" technique.

The Idea:

  1. User SW allocates a 2GB buffer using VirtualAlloc (Large Pages).
  • FPGA will not perform S/G and will DMA assuming 64KB Pages at destination.
  • CommonBuffer was nice as it guarantees the logical addresses to be contiguous.
  • I do not believe this is the case with the Hanging Direct I/O technique, so the 2MB Large Pages are needed.
  2. User SW issues an IOCTL to the Driver, associating the 2GB buffer with the out_buffer.
  3. Driver uses the MDL from the out_buffer to obtain logical addresses (for DMA purposes) for each MDL entry.
  • Driver does not need to "Lock" the buffer as this has already been done by the I/O Manager.
  4. Driver keeps IOCTL pended.
  5. User SW sends another IOCTL to retrieve the logical addresses.
  6. User SW programs the FPGA with logical addresses.
  7. User SW tells FPGA to run (start sending data).
    ... good times ...
  8. User SW tells FPGA to stop running.
  9. User SW cancels the pended IOCTL.
  10. Driver, as part of cancellation, performs clean-up and completes the request.

Assuming I have the above correct:

  1. I think I can handle step 1 and step 2.
  2. I am having a little problem with Step 3.
  • I figure I would need to create a DMA enabler.
  • What I am not sure about is where to go from there.
  • DMA Transaction?
  • How do I get the logical addresses? - I haven't yet found an API or examples.

Thank you again for the continued help.

Regards,
Juan

Comments

  • Tim_Roberts Member - All Emails Posts: 13,282

    You should file an official ethics complaint against your hardware designers. It is inexcusable to have a PCIe FPGA with DMA these days that does not do scatter/gather.

    The user software should not program the FPGA, nor work with logical addresses, nor trigger the DMA. The user software sends the buffer to the driver. The user software sends an ioctl to start streaming, and in response the driver does the mappings, programs the hardware, and triggers streaming. The driver needs to be in control of that entire process, so it can clean up. User apps are way too ephemeral. They die uncleanly, and that leaves the driver not knowing the state of the data stream. In addition, user apps can be exploited. One simple rogue process to tweak an ioctl call, and your computer is hosed when the hardware trashes memory.

    You can certainly use WDF DMA transactions for this, by using WdfDmaTransactionInitializeUsingRequest or WdfDmaTransactionInitializeUsingOffset, or you can use WdfDmaEnablerWdmGetDmaAdapter and call the DMA_ADAPTER methods directly to do your mapping.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • OneKneeToe Member Posts: 42

    Hello Tim:

    Thanks for the response. I won't go into the reason for this approach, but I am looking into changing things around.

    Forgive my lack of Driver knowledge: I did look at both the Transaction and DMA_ADAPTER methods before I posted, but none of them mentioned being able to obtain the Logical Address - nothing as straightforward as "WdfCommonBufferGetAlignedLogicalAddress".

    I do see methods related to getting SG lists - is that the approach I should be using? Get the S/G List based on the Direct IO out_buffer MDL, then iterate through the SG List to get the Logical Addresses. I would need to know the corresponding virtual address as SW will use the virtual address to access the data for processing.

    Thank you, Tim.
    Juan

  • Tim_Roberts Member - All Emails Posts: 13,282

    Yes, S/G lists have logical addresses. MmGetSystemAddressForMdlSafe gets you the virtual address.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 570

    If the usermode app just calls VirtualAlloc, even with large pages, you'll need 1024 pages to cobble together a 2GB buffer. These pages will not be contiguous, so you'll need to juggle 1024 different logical addresses. That can be inconvenient and possibly slow things down.

    Getting a contiguous buffer makes this way easier, and possibly faster. The downside is that 2GB of contiguous pages is pretty unlikely to just happen to be available, except early at boot. You could have your driver reserve 2GB of contiguous memory at boot, then use it once the device & usermode app are ready. This approach might be better, if your end-users are okay with permanently dedicating 2GB of RAM to your hardware, and with possibly needing to reboot the first time the hardware is installed. (I worked on a project a few years ago that did exactly this, and the customer preferred paying for extra RAM up-front to get deterministic and low execution time, versus worrying about the variable performance of allocating that RAM in small pieces dynamically later.)

    Note that regardless of what approach you use, you must use a DMA API at some point. For contiguous memory, you'd ideally get it from AllocateCommonBuffer[Ex]; although if the device doesn't exist yet, you can use MmAllocateContiguousMemory and then later call BuildScatterGatherList[Ex] on it when the device is ready. If you're going with VirtualAlloc'd memory, use BuildScatterGatherList[Ex] on it. You can't do anything useful with MmGetPhysicalAddress or the PFN array in the MDL, (unless ADAPTER_INFO_API_BYPASS explicitly says you can).

    There's no reverse mapping from a SG list back to a virtual address, because HAL reserves the rights to bounce the buffer. That is, it might have silently allocated a whole new buffer and copied the data there. At that point, what's the virtual address? The original buffer (which you already know about) or the bounce buffer (which you shouldn't need to touch)?

  • OneKneeToe Member Posts: 42

    Thank you, Tim:

    My plan is:

    1. During EvtDeviceAdd, create the DMA Enabler.
    2. Add a method, "ConfigureFpgaForDma".
    3. Call ConfigureFpgaForDma via an EvtIoInCallerContext
    4. SW issues the IOCTL, providing info on the In_buffer and the VirtualAlloc buffer in the out_buffer.
    5. Inside the ConfigureFpgaForDma, call WdfRequestRetrieveOutputWdmMdl to get the MDL.
    6. Then call WdfDmaEnablerWdmGetDmaAdapter to get the DMA_ADAPTER (using WdfDmaDirectionReadFromDevice since the FPGA will write into the host buffer).
    7. Call BuildScatterGatherList.
    • The method requires an "Execution Routine", which I don't really need. So this will be an empty routine that enters and exits.
    • I also need to manually allocate a block of memory for the SG List itself.
    8. Now I have the SG List, so I will loop through the SG list and write each SG logical address to the appropriate FPGA addresses.
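
    The loop over the SG list in that last step can be sketched in plain user-mode C++. This is only an illustration: `SgElement` is a hypothetical stand-in that mirrors the address/length pair of a WDM scatter/gather element, `sgListFitsLut` is an invented helper name, and nothing here calls the real DMA API. The 64KB granularity is the FPGA constraint stated earlier in the thread; the check that every run is 64KB-aligned is an assumption about what the LUT needs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one scatter/gather element: a device-visible
// (logical) address plus the byte length of that contiguous run.
struct SgElement {
    std::uint64_t logicalAddress;
    std::uint32_t length;
};

constexpr std::uint32_t kLutEntryBytes = 64 * 1024; // 64KB per FPGA LUT entry

// Walk the elements in order. Returns true only if every run starts on a
// 64KB boundary and is a whole number of 64KB blocks, so it can be described
// by LUT entries. totalBytes receives the number of bytes examined so far.
bool sgListFitsLut(const std::vector<SgElement>& elements, std::uint64_t& totalBytes)
{
    totalBytes = 0;
    for (const SgElement& e : elements) {
        if (e.logicalAddress % kLutEntryBytes != 0) {
            return false; // run does not start on a 64KB boundary
        }
        if (e.length == 0 || e.length % kLutEntryBytes != 0) {
            return false; // run is not a whole number of 64KB blocks
        }
        totalBytes += e.length;
    }
    return true;
}
```

    A caller would compare `totalBytes` against the size of the user buffer before programming anything into the hardware.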

    I have a question about the following:

    The driver must call PutScatterGatherList (which flushes the list) before it can access the data in the list.

    Does the Driver need to do this before User SW can access the data via the original pointer?

    Regards,
    Juan

  • Tim_Roberts Member - All Emails Posts: 13,282

    Does the driver need to do this before User SW can access the data via the original pointer?

    No, PutScatterGatherList says "I'm done with this list, you can free your resources". However, you should call FlushAdapterBuffers when you know a transfer has completed. The need for that call depends on your CPU architecture; on x64 architectures, it often does nothing, but you're still supposed to call it.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • OneKneeToe Member Posts: 42

    Thank you for the response!

    If the usermode app just calls VirtualAlloc, even with large pages, you'll need 1024 pages to cobble together a 2GB buffer. These pages will not be contiguous, so you'll need to juggle 1024 different logical addresses.

    The FPGA has a LUT that is programmed (by SW) with all the logical addresses. The FPGA is legacy, so I don't have the flexibility to change how it works. It expects each LUT entry to point to a single contiguous 64KB block of memory. The reason for using Large Pages is that I can then split each 2MB page (using its logical address and some math) into 32 logical addresses, one per 64KB block.
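
    The "some math" for splitting one 2MB page could look roughly like the sketch below. It assumes the whole 2MB page is contiguous in logical address space and that its logical address is at least 64KB-aligned; `splitLargePage` and the constant names are made up for illustration and are not part of any driver API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint64_t kLargePageSize  = 2ull * 1024 * 1024;             // 2MB large page
constexpr std::uint64_t kLutEntrySize   = 64ull * 1024;                   // 64KB per LUT entry
constexpr std::size_t   kEntriesPerPage = kLargePageSize / kLutEntrySize; // 32

// Given the logical address of one 2MB page, produce the 32 logical
// addresses (one per 64KB block) that would go into the FPGA LUT.
std::array<std::uint64_t, kEntriesPerPage> splitLargePage(std::uint64_t pageLogicalAddress)
{
    // Assumption: the page's logical address is 64KB-aligned.
    assert(pageLogicalAddress % kLutEntrySize == 0);

    std::array<std::uint64_t, kEntriesPerPage> entries{};
    for (std::size_t i = 0; i < kEntriesPerPage; ++i) {
        entries[i] = pageLogicalAddress + i * kLutEntrySize;
    }
    return entries;
}
```

    For a 2GB buffer made of 1024 such pages, this would be called once per page, giving 32768 LUT entries in total.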

    Getting a contiguous buffer makes this way easier, and possibly faster.

    Indeed! And I have been successful, with a running system, allocating 2GB during boot via the kernel driver and a CommonBuffer. However, since I am exposing that kernel space memory to user SW, I was told that this is an unsafe practice; if one must go this way, one should take extra care to address the various side-effects and holes. The better approach, aside from using standard DMA practices, is to have user SW allocate the buffer and provide it to the driver.

    If you're going with VirtualAlloc'd memory, use BuildScatterGatherList[Ex] on it.

    I will be attempting this approach, given the advice I've gotten so far, for my specific application.

    There's no reverse mapping from a SG list back to a virtual address, because HAL reserves the rights to bounce the buffer.

    This may make things a little more tricky for me. My FPGA will be sending (DMA'ing) blocks of data. SW needs to know where that block of data resides within the 2GB buffer. Finding the virtual address is easy as it's contiguous. The tricky part is mapping the virtual address to its corresponding logical address.

    • The SG logical addresses are not guaranteed to be contiguous, unlike the CommonBuffer approach, where they are contiguous.
    • The good thing is that I only need to do the "mapping" once (at startup)...
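
    A one-time mapping table of this kind could be sketched as below, in user-mode C++ with stand-in types. It assumes the SG elements describe the user buffer in order starting at offset 0, and that the system did not bounce the buffer (which, per the warning quoted above, is not guaranteed). `SgElement`, `MapEntry`, `buildOffsetMap`, `offsetToLogical` and `logicalToOffset` are all hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one scatter/gather element.
struct SgElement {
    std::uint64_t logicalAddress;
    std::uint32_t length;
};

// One row per element: where that run starts inside the user buffer.
struct MapEntry {
    std::uint64_t bufferOffset;
    std::uint64_t logicalAddress;
    std::uint32_t length;
};

// Built once at startup: accumulate element lengths to get buffer offsets.
std::vector<MapEntry> buildOffsetMap(const std::vector<SgElement>& elements)
{
    std::vector<MapEntry> table;
    std::uint64_t offset = 0;
    for (const SgElement& e : elements) {
        table.push_back({offset, e.logicalAddress, e.length});
        offset += e.length;
    }
    return table;
}

// Buffer offset (virtual address minus buffer base) -> logical address.
bool offsetToLogical(const std::vector<MapEntry>& table, std::uint64_t offset, std::uint64_t& logical)
{
    for (const MapEntry& m : table) {
        if (offset >= m.bufferOffset && offset - m.bufferOffset < m.length) {
            logical = m.logicalAddress + (offset - m.bufferOffset);
            return true;
        }
    }
    return false;
}

// Logical address reported by the device -> offset inside the user buffer.
bool logicalToOffset(const std::vector<MapEntry>& table, std::uint64_t logical, std::uint64_t& offset)
{
    for (const MapEntry& m : table) {
        if (logical >= m.logicalAddress && logical - m.logicalAddress < m.length) {
            offset = m.bufferOffset + (logical - m.logicalAddress);
            return true;
        }
    }
    return false;
}
```

    The linear scan is fine for a table of about a thousand rows; a sorted table with binary search would be the obvious refinement if lookups happen per block.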

    Thanks again, Jeffrey.

    Regards,
    Juan

  • OneKneeToe Member Posts: 42

    So it's taken some time to get to this point, and I am stuck at the moment. :)

    My call to CalculateScatterGatherList is returning "STATUS_INVALID_PARAMETER" and I can't figure out which one it is and why it's invalid.

    Code snippet (checks and declarations omitted).
    I did trace all the parameters that I am passing into the call, and they seem OK, as best I understand the documentation.
    Instead of "StartVa" I tried using "MappedSystemVa" but had the same error.

    START

    // Get the output buffer passed in by the IOCTL.
    WdfRequestRetrieveOutputMemory( request, &outputBufferObj );
    
    // Get the MemoryBuffer as it's easier to get the MDL.
    PVOID outBuff_P = WdfMemoryGetBuffer( outputBufferObj, &outBuffByteSize );
    
    // Get the MDL
    status = WdfRequestRetrieveOutputWdmMdl( request, &bufferMdlP );
    
    // Get the DMA Adapter.
    PDMA_ADAPTER dmaAdapter = WdfDmaEnablerWdmGetDmaAdapter( myDmaEnabler, WdfDmaDirectionReadFromDevice );
    
    // Get the SG List size.
    CalculateScatterGatherList( dmaAdapter,
        bufferMdlP,
        bufferMdlP->StartVa,
        bufferMdlP->ByteCount,
        &sgByteSize,
        &numMapRegisters );
    
    // Get a memory buffer for the SG List
    WdfMemoryCreate( &buffAttributes,
        PagedPool,
        0,
        static_cast<size_t>( sgByteSize ),
        &scatterGatherMemoryObj,
        &scatterGatherListBuffP );
    
    // Build the SG List
    dmaAdapter->DmaOperations->BuildScatterGatherList( dmaAdapter,
        deviceObjectP,
        bufferMdlP,
        bufferMdlP->StartVa,
        static_cast<ULONG>( outBuffByteSize ),
        &(MyAdapterListControl),
        deviceContextP,
        FALSE,
        scatterGatherListBuffP,
        sgByteSize );
    

    END

    Thank you for any help!

    Regards,
    Juan

  • Tim_Roberts Member - All Emails Posts: 13,282

    Presumably you meant dmaAdapter->DmaOperations->CalculateScatterGatherListSize instead of CalculateScatterGatherList.

    Technically, you should be using MmGetSystemAddressForMdlSafe(bufferMdlP). It will fetch StartVa if it's already set up, and will map the pages into memory if not already set up. Did you dump StartVa in a debugger to see if it made sense?

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • OneKneeToe Member Posts: 42

    Hello Tim:

    Thank you for the response.

    Presumably you meant dmaAdapter->DmaOperations->CalculateScatterGatherListSize instead of CalculateScatterGatherList.

    Actually, the call CalculateScatterGatherList maps to PCALCULATE_SCATTER_GATHER_LIST_SIZE in wdm.h's _DMA_OPERATIONS struct.

    Technically, you should be using MmGetSystemAddressForMdlSafe(bufferMdlP).

    Thank you for that, I ended up using MmMapLockedPagesSpecifyCache since I don't want the memory to be cached.

    Unfortunately, I still get the same "STATUS_INVALID_PARAMETER" return code from "...->CalculateScatterGatherList".

    On the SW side, it is allocating a string buffer and passing it in via the outbuffer of the IO Control call (overlapped, generic_read).
    The IOCTL is OUT_DIRECT.
    The DMA Enabler: flag: WDF_DMA_ENABLER_CONFIG_REQUIRE_SINGLE_TRANSFER, Profile: WdfDmaProfilePacket64, Len: 128

    I will dump the KVA and see what I find.

    (Note: Abridged; Some declarations and checks omitted. EvtIoInCallerContext does check the IOControlCode and calls a helper to perform the SGList building shown below.)

    void EvtIoInCallerContext( IN WDFDEVICE  device, IN WDFREQUEST  request )
      // Get the outbuffer from the IOCTL request
      WDFMEMORY outputBufferObj{ nullptr };
      WdfRequestRetrieveOutputMemory( request, &outputBufferObj );
    
      // Get the outbuffer size. Note: outBuffP not actually needed, we just want the size.
      PVOID outBuffP = WdfMemoryGetBuffer( outputBufferObj, &outBuffByteSize );
    
      // Get the MDL from the outbuffer
      WdfRequestRetrieveOutputWdmMdl( request, &deviceContextP->hostMemoryMdlP );
    
      // Get the DMA adapter.
      PDMA_ADAPTER dmaAdapter = WdfDmaEnablerWdmGetDmaAdapter(
                                                          myDmaEnabler,
                                                          WdfDmaDirectionReadFromDevice );
    
      // Get the Kernel Virtual Address for the MDL
      PVOID hostMemoryKvaP = MmMapLockedPagesSpecifyCache(
                                                          hostMemoryMdlP, KernelMode, 
                                                          MmNonCached, NULL,
                                                          FALSE, HighPagePriority );
    
      // Get the SG List size
      dmaAdapter->DmaOperations->CalculateScatterGatherList( dmaAdapter,
                                                                        hostMemoryMdlP, hostMemoryKvaP,
                                                                        hostMemoryMdlP->ByteCount,
                                                                        &sgByteSize, &numMapRegisters );
      // Allocate memory for the SG List
      PVOID scatterGatherListBuffP{ nullptr };
      WdfMemoryCreate( &buffAttributes, PagedPool, 0,
                                           static_cast<size_t>( sgByteSize ),
                                           &scatterGatherMemoryObj, &scatterGatherListBuffP );
    
      // Get the Device Object
      PDEVICE_OBJECT deviceObjectP = WdfDeviceWdmGetDeviceObject( device );
    
      // Get the SG List
      dmaAdapter->DmaOperations->BuildScatterGatherList( dmaAdapter, deviceObjectP,
                                                                    hostMemoryMdlP, hostMemoryKvaP, 
                                                                    static_cast<ULONG>( outBuffByteSize ),
                                                                    &(MyAdapterListControl), deviceContextP, FALSE,
                                                                    scatterGatherListBuffP, sgByteSize );
    

    Thank you, again!
    Juan

  • Tim_Roberts Member - All Emails Posts: 13,282

    Thank you for that, I ended up using MmMapLockedPagesSpecifyCache since I don't want the memory to be cached

    You don't get to decide that. The memory has already been allocated, so its cache attribute has already been established. You are required to use the same attribute. Why do you think you want it to be uncached? Because your reason is almost certainly wrong.

    WDF_DMA_ENABLER_CONFIG_REQUIRE_SINGLE_TRANSFER requires that you set WdmDmaVersionOverride to 3, but once again, you almost certainly don't need that at all.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • OneKneeToe Member Posts: 42
    edited February 14

    @Tim_Roberts said:
    The memory has already been allocated, so its cache attribute has already been established. You are required to use the same attribute.

    • Understood.

    Why do you think you want it to be uncached? Because your reason is almost certainly wrong.

    • Since the FPGA will be streaming data continuously to the host memory, I did not want the OS to deal with caching.
    • That memory needs to stay locked down as it always needs to be available for the FPGA to DMA to and for user SW to access for processing.
    • Given your comment, I decided to remove the "NoCache" from VirtualAlloc and am now using MmGetSystemAddressForMdlSafe.

    WDF_DMA_ENABLER_CONFIG_REQUIRE_SINGLE_TRANSFER requires that you set WdmDmaVersionOverride to 3, but once again, you almost certainly don't need that at all.

    My thinking here is that I did not want the FPGA DMA transaction to be split into multiple transactions. But I suppose, since the FPGA is doing the DMA, that would be up to the FPGA??

    Current State

    • I am still getting the invalid parameter error.
    • Just to try something different, instead of passing in the System Address from MmGetSystemAddressForMdlSafe to CalculateScatterGatherList, I am passing in the StartVa from the MDL - this address is the same address as used by user sw.
    • With this change, I get an "Insufficient resources" error - Though I am only allocating 4MB of Large Page memory.

    Question

    • What parameters should I be passing into CalculateScatterGatherList?

    CalculateScatterGatherList( dmaAdapter, // Pointer to DMA Adapter I get from WdfDmaEnablerWdmGetDmaAdapter
                            hostMemoryMdlP, // Pointer to MDL I get from WdfRequestRetrieveOutputWdmMdl
                            hostMemoryKvaP, // Kernel Virtual Address I get from MmGetSystemAddressForMdlSafe
                            hostMemoryMdlP->ByteCount, // Size of Host Memory Buffer ( Size used when user sw called VirtualAlloc)
                            &sgByteSize, // Out param, Byte size of ScatterGather List. What is needed when calling WdfMemoryCreate
                            &numMapRegisters //Out param, Will not be used.
                            );
    

    DmaEnabler

    I create the DmaEnabler as follows:

        WDF_DMA_ENABLER_CONFIG dmaEnablerConfig;
        WDF_DMA_ENABLER_CONFIG_INIT( &dmaEnablerConfig, WdfDmaProfilePacket64, 128 );
    
        dmaEnablerConfig.EvtDmaEnablerFill = NULL;
        dmaEnablerConfig.EvtDmaEnablerFlush = NULL;
        dmaEnablerConfig.EvtDmaEnablerDisable = NULL;
        dmaEnablerConfig.EvtDmaEnablerEnable = NULL;
        dmaEnablerConfig.EvtDmaEnablerSelfManagedIoStart = NULL;
        dmaEnablerConfig.EvtDmaEnablerSelfManagedIoStop = NULL;
        dmaEnablerConfig.AddressWidthOverride = 0;
        dmaEnablerConfig.WdmDmaVersionOverride = 3;
        dmaEnablerConfig.Flags = WDF_DMA_ENABLER_CONFIG_REQUIRE_SINGLE_TRANSFER;
    
        NTSTATUS status{ WdfDmaEnablerCreate( wdfDevice, &dmaEnablerConfig, WDF_NO_OBJECT_ATTRIBUTES, &( myDmaEnabler ) ) };
        if( !NT_SUCCESS( status ) )
        {
            TraceEvents( TRACE_LEVEL_ERROR, DBG_INIT, "WdfDmaEnablerCreate() failed with status=[%!STATUS!]", status );
        }
    

    Quick Note

    As a quick test: I modified user SW to write 0xDEADBEEF0BADC0DE to the VirtualAlloc memory location. The driver, using outBuffP (obtained from WdfMemoryGetBuffer - see above code snippet) traced the correct value. The driver then modified the value by writing all 0xFs. User SW saw this change.

  • OneKneeToe Member Posts: 42

    In addition to my previous question (last post) I have another for you:

    Technically, you should be using MmGetSystemAddressForMdlSafe(bufferMdlP).

    • How do I unlock the pages? Since I got the MDL using WdfRequestRetrieveOutputWdmMdl, I do not know which method to use for unlocking the pages.
    • I get a BSOD, "PROCESS_HAS_LOCKED_PAGES" when shutting down, even though the request was completed due to the failed CalculateScatterGatherList call.

    Thank you..

  • Tim_Roberts Member - All Emails Posts: 13,282

    Since the FPGA will be streaming data continuously to the host memory, I did not want the OS to deal with caching.

    Caching is not an OS thing at all. It is a hardware thing. x86 hardware handles it for you.

    WDF_DMA_ENABLER_CONFIG_INIT( &dmaEnablerConfig, WdfDmaProfilePacket64, 128 );

    You misunderstand the "maximum transfer length" parameter. What you're giving here is the TLP size from PCIe config space. The operating system doesn't give a whack about that. You need to pass the maximum buffer size you can do in a single DMA transfer. With most devices that's at LEAST a page, and is often many megabytes.

    Don't zero out the callback fields in the dmaEnablerConfig structure. That's what the _INIT is there for. You just need to override the things that aren't the default.

    The "...REQUIRE_SINGLE_TRANSFER" thing is only needed if the system has to allocate bounce buffers for you. For example, if your device cannot handle 64-bit addressing, then when a DMA request comes in beyond the 4GB mark, the system will allocate buffers below the 4GB mark, and copy the data to and fro. Those buffers are small (like 64kB), so if you get a request for a megabyte, the system will do that as 16 requests of 64kB each.

    But that's all handled transparently on your behalf. I STRONGLY suggest you eliminate the REQUIRE_SINGLE_TRANSFER flag and leave the WdmDmaVersionOverride set to its default value.
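
    The splitting described above (a one-megabyte request serviced as 16 pieces of 64kB) comes down to the arithmetic in this sketch. It is only a user-mode illustration of how a large transfer decomposes into bounce-buffer-sized pieces, not how the HAL actually implements it; `Chunk` and `splitTransfer` are hypothetical names, and the 64kB figure is just the example size used above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One piece of a larger transfer: where it starts and how long it is.
struct Chunk {
    std::uint64_t offset;
    std::uint64_t length;
};

// Break totalBytes into consecutive pieces of at most maxChunkBytes each.
// The last piece is shorter when totalBytes is not an exact multiple.
std::vector<Chunk> splitTransfer(std::uint64_t totalBytes, std::uint64_t maxChunkBytes)
{
    assert(maxChunkBytes > 0);

    std::vector<Chunk> chunks;
    for (std::uint64_t offset = 0; offset < totalBytes; offset += maxChunkBytes) {
        chunks.push_back({offset, std::min(maxChunkBytes, totalBytes - offset)});
    }
    return chunks;
}
```

    Calling `splitTransfer(1024 * 1024, 64 * 1024)` yields 16 chunks, matching the example in the text.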

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • Tim_Roberts Member - All Emails Posts: 13,282

    How do I unlock the pages?

    You don't have to. For METHOD_xx_DIRECT ioctls, the I/O system locks the memory on the way in and unlocks it on the way out. Do you still have calls to MmMapLockedPagesEtc in your code? Those have to be mated with corresponding MmUnmapLockedPages.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • OneKneeToe Member Posts: 42

    @Tim_Roberts said:

    How do I unlock the pages?

    You don't have to. For METHOD_xx_DIRECT ioctls, the I/O system locks the memory on the way in and unlocks it on the way out.

    That was my understanding. But doesn't this mean that I don't need to make a call to MmGetSystemAddressForMdlSafe as it locks the pages?

    Do you still have calls to MmMapLockedPagesEtc in your code? Those have to be mated with corresponding MmUnmapLockedPages.

    No, I haven't introduced any.

    • The BSOD started happening when I introduced the call to MmGetSystemAddressForMdlSafe. If I take out the call, the BSOD doesn't happen.

    Thank you for all the great information! I will work on fixing the DmaEnabler creation. I know I'm stumbling through this; BIG Thanks for the continued patience and help.

  • Tim_Roberts Member - All Emails Posts: 13,282

    MmGetSystemAddressForMdlSafe does not lock pages. The buffer that an MDL describes has already been locked. Otherwise, you couldn't get physical page numbers, which is what the MDL contains.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • OneKneeToe Member Posts: 42

    @Tim_Roberts said:
    MmGetSystemAddressForMdlSafe does not lock pages. The buffer that an MDL describes has already been locked. Otherwise, you couldn't get physical page numbers, which is what the MDL contains.

    Yes, you're right Mr. Roberts. I had been googling and reading posts and Microsoft docs to try and figure the overall problem out; my brain is getting all jumbled.

    Current State:

    • I am using MmGetSystemAddressForMdlSafe
    • I changed the DMA Config (removed the zeroing out of the callbacks, updated the MaxTransferByteSize, removed the Flag and DmaVersionOverride).
    • Still, no cigar with a status of STATUS_INSUFFICIENT_RESOURCES (0xC000009A).
    • UserSW uses VirtualAlloc to allocate 4MB of Large Page memory.

    I'm at a loss at the moment...it doesn't seem like it should be complicated :smile:

  • Tim_Roberts Member - All Emails Posts: 13,282

    Where do you get the STATUS_INSUFFICIENT_RESOURCES?

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • OneKneeToe Member Posts: 42

    @Tim_Roberts said:
    Where do you get the STATUS_INSUFFICIENT_RESOURCES?

    The call to CalculateScatterGatherList

  • OneKneeToe Member Posts: 42

    To close off this thread:

    I haven't had any luck making this work. I continue to get STATUS_INSUFFICIENT_RESOURCES from CalculateScatterGatherList.

    • For now, I will set aside the "Hanging Direct I/O" approach and go back to exposing a Common_Buffer to UserSW, as that approach is working.
    • Still, it's on the back burner, so if you have any suggestions / comments, please post away - thanks in advance.

    Recap of my [failed] implementation (Note: Code snippets have been "simplified" for brevity):

    User SW:

        // SW is allocating a data buffer using Large Pages (2MB). For this, SW needs to be running as Admin + have SE Privilege.
        tp.Privileges[ 0 ].Attributes = SE_PRIVILEGE_ENABLED;
    
        // Enable the privilege.
        status = AdjustTokenPrivileges( hToken, FALSE, &tp, 0, ( PTOKEN_PRIVILEGES ) NULL, 0 );
    
        // For testing, allocate 4MB (2 pages) but will eventually need much more than this.
        pointerToMemory = VirtualAlloc( NULL,
                                        memorySize,
                                        MEM_COMMIT | MEM_RESERVE | MEM_LARGE_PAGES,
                                        PAGE_EXECUTE_READWRITE );
    
        // Pass the VirtualAlloc buffer to the driver as the IOCTL output buffer.
        DeviceIoControl( pDevice->hFile,
                         IOCTL_SETUP_HOST_MEMORY_DMA,
                         inDataHostMemoryInfoP, sizeof( SetupHostMemoryInputBufferType ),
                         pointerToMemory, memorySize,
                         &bytesReturned, NULL );
    

    Driver Device Add:

    WDF_DMA_ENABLER_CONFIG dmaEnablerConfig;
    WDF_DMA_ENABLER_CONFIG_INIT( &dmaEnablerConfig, WdfDmaProfileScatterGather64Duplex, 1024*4 );
    
    dmaEnablerConfig.AddressWidthOverride = 0;
    
    NTSTATUS status{ WdfDmaEnablerCreate( wdfDevice, &dmaEnablerConfig, WDF_NO_OBJECT_ATTRIBUTES, &( myDmaEnabler ) ) };
    if( !NT_SUCCESS( status ) )
    {
        TraceEvents( TRACE_LEVEL_ERROR, DBG_INIT, "WdfDmaEnablerCreate() failed with status=[%!STATUS!]", status );
    }
    

    Driver Device Process IOCTL:

    void EvtIoInCallerContext( IN WDFDEVICE  device, IN WDFREQUEST  request )
      // Get the outbuffer from the IOCTL request
      WDFMEMORY outputBufferObj{ nullptr };
      WdfRequestRetrieveOutputMemory( request, &outputBufferObj );
    
      // Get the outbuffer size. Note: outBuffP not actually needed, we just want the size.
      PVOID outBuffP = WdfMemoryGetBuffer( outputBufferObj, &outBuffByteSize );
    
      // Get the MDL from the outbuffer
      WdfRequestRetrieveOutputWdmMdl( request, &hostMemoryMdlP );
    
      // Get the DMA adapter.
      PDMA_ADAPTER dmaAdapter = WdfDmaEnablerWdmGetDmaAdapter( myDmaEnabler,
                                                               WdfDmaDirectionReadFromDevice );
    
      // Get the Kernel Virtual Address for the MDL
      hostMemoryKvaP = MmGetSystemAddressForMdlSafe( hostMemoryMdlP, 
                                                                     HighPagePriority );
    
      // Get the SG List size
      dmaAdapter->DmaOperations->CalculateScatterGatherList( dmaAdapter,
                                                             hostMemoryMdlP, 
                                                             hostMemoryKvaP,
                                                             hostMemoryMdlP->ByteCount,
                                                             &sgByteSize, 
                                                             &numMapRegisters );
      // Allocate memory for the SG List
      PVOID scatterGatherListBuffP{ nullptr };
      WdfMemoryCreate( &buffAttributes, PagedPool, 0,
                       static_cast<size_t>( sgByteSize ),
                       &scatterGatherMemoryObj, &scatterGatherListBuffP );
    
      // Get the Device Object
      PDEVICE_OBJECT deviceObjectP = WdfDeviceWdmGetDeviceObject( device );
    
      // Get the SG List
      dmaAdapter->DmaOperations->BuildScatterGatherList( dmaAdapter, deviceObjectP,
                                                                    hostMemoryMdlP, hostMemoryKvaP, 
                                                                    static_cast<ULONG>( outBuffByteSize ),
                                                                    &(MyAdapterListControl), deviceContextP, FALSE,
                                                                    scatterGatherListBuffP, sgByteSize );
    