
Implement DMA in packet-mode in a PCI/VME Bridge

croy_kfr Member Posts: 18

Hi all,
I have to implement DMA support for an old PCI-to-VME bridge device, and I wonder which design I should use. Using the KMDF framework, I have already done all the initialization and other things such as single transfers, which already work through IOCTLs. The requirements for DMA support are very thin, but what I do know is that the device hardware does not support a scatter/gather mechanism and only performs DMA transfers in packet mode. To perform DMA, I have the registers commonly used for DMA: PCI start address, VME start address, transfer size, block size, and a register that holds miscellaneous flags for the VME transfer, such as the direction (read/write). In the user application, I would like to use IOCTLs to perform all DMA read and write operations. For data transfer, I would use a buffer shared between the user application and the kernel driver.
My questions are:
1) Can I use the WdfDma framework features for this design, given that the hardware supports neither a scatter/gather mechanism nor descriptors as shown in the PLX9x5x WDF example?
2) For the shared buffer, do I need to use a common buffer allocation mechanism in the driver, or is there another mechanism?
Thanks in advance for your answer.


Comments

  • Mark_Roddy Member - All Emails Posts: 4,426

    As your device doesn't support scatter gather you have to use a common buffer. You are going to do a copy from there into your user's buffer, so there is no point in trying to share anything explicitly.

  • croy_kfr Member Posts: 18

    Thank you, Mark, for your quick answer.
    Is it possible to do this copy from the common buffer into the user's buffer using an IOCTL call?

  • Mark_Roddy Member - All Emails Posts: 4,426

    https://docs.microsoft.com/en-us/windows-hardware/drivers/wdf/handling-i-o-requests-in-a-kmdf-driver-for-a-bus-master-dma-device

    WDF is going to handle the copy details for you if your DMA is connected to I/O requests. The general practice is either to initiate the DMA when a user IOCTL arrives at your driver, or to have the user pend a set of such IOCTLs for your driver to consume, with the driver initiating DMA requests independently.

  • Tim_Roberts Member - All Emails Posts: 13,907

    Is it possible to do this copy from the common buffer into the user's buffer using an IOCTL call?

    That is, in fact, the ONLY way to do it.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • croy_kfr Member Posts: 18

    Thanks again.
    What I do not understand in this model is the link between the PSCATTER_GATHER_LIST SgList passed to EvtProgramDma and what I get from the IOCTL request in the call:
    WdfDmaTransactionInitializeUsingRequest(dmaTransaction, Request, EvtProgramDma, direction);
    In DMA packet mode, I know that I have to use a single pair:
    SgList->Elements[0].Address.LowPart;
    SgList->Elements[0].Length;
    but I do not understand which address I get in this structure.
    Thanks.

  • Tim_Roberts Member - All Emails Posts: 13,907

    If you are using a common buffer, then you won't be using WdfDmaTransactionInitializeUsingRequest. You have the physical address of the common buffer, and copying into the user's request when the DMA is done is just a simple RtlMoveMemory call.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • croy_kfr Member Posts: 18

    Thank you, Tim, for your answer. So I understand that I must not use WdfDmaTransactionInitializeUsingRequest. Do I use the WdfDmaTransactionInitialize method instead?
    Also, in my EvtProgramDma, do I use the SgList parameter?

  • Peter_Viscarola_(OSR) Administrator Posts: 8,399

    This talk of “common buffers” has, in fact, confused you I think. Let me see if I can clear that up.

    Use the standard KMDF DMA APIs. When you create your DMA Enabler, specify one of the packet-based profiles but NOT one that indicates that you support scatter/gather (because you don’t).

    When your EvtProgramDma callback is called, the s/g list passed to you will have exactly one element. You use this as the base address and length to program your hardware. Done!

    Ignore all this talk of Common Buffers... which could provide you an alternative way to do what you want, but does not really directly answer your question.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers
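
To make the advice above concrete, here is a minimal, portable C sketch of what an EvtProgramDma-style routine might do with a one-element list. It is not WDK code: `SgList`, `SgElement`, `BridgeDmaRegs`, the control bits, and `program_packet_dma` are hypothetical stand-ins invented for illustration, and the 32-bit address register is an assumption based on the original poster's use of `Address.LowPart`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the WDK scatter/gather types (assumption:
 * only the fields discussed in this thread are modeled). */
typedef struct {
    uint64_t address;  /* device bus logical address of the fragment */
    uint32_t length;   /* fragment length in bytes */
} SgElement;

typedef struct {
    uint32_t  number_of_elements;
    SgElement elements[1];
} SgList;

/* Hypothetical register layout for the bridge's DMA engine. */
typedef struct {
    uint32_t pci_start;
    uint32_t vme_start;
    uint32_t transfer_size;
    uint32_t control;
} BridgeDmaRegs;

/* Hypothetical control bits. */
#define DMA_CTRL_PCI_TO_VME 0x1u
#define DMA_CTRL_START      0x2u

/* Program one packet-mode transfer. Returns 0 on success, -1 if the
 * list is not something a single-fragment, 32-bit engine can accept. */
int program_packet_dma(volatile BridgeDmaRegs *regs, const SgList *sg,
                       uint32_t vme_addr, int write_to_device)
{
    if (regs == NULL || sg == NULL)
        return -1;

    /* A packet-mode (non scatter/gather) profile yields one element. */
    if (sg->number_of_elements != 1 || sg->elements[0].length == 0)
        return -1;

    uint64_t bus_addr = sg->elements[0].address;
    uint32_t len = sg->elements[0].length;

    /* Assumed 32-bit engine: the fragment must end below 4 GiB. */
    if (bus_addr > UINT32_MAX || (uint64_t)len - 1 > UINT32_MAX - bus_addr)
        return -1;

    regs->pci_start = (uint32_t)bus_addr;
    regs->vme_start = vme_addr;
    regs->transfer_size = len;

    /* Write the control register last so the engine starts only after
     * the addresses and the size are in place. */
    regs->control = (write_to_device ? DMA_CTRL_PCI_TO_VME : 0u) | DMA_CTRL_START;
    return 0;
}
```

In a real driver the register writes would go through the mapped BAR with the appropriate register-access routines, and the device-side address would come from the validated request parameters rather than a plain function argument.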

  • croy_kfr Member Posts: 18

    Thank you, Peter, for your reply. You are right that the common buffer discussion confused me. I will follow your suggestion about the s/g list and keep you informed if I have any further questions.

  • croy_kfr Member Posts: 18

    Just one point: the start address for starting DMA on both PCI bus and VME bus of the Device must be programmed by user app. For example, for a write access (PCI to VME direction), I have to program one register with the PCI address and one register with the VME address. How is the s/g list base address programmed in this case? In other words, what is the value of the first s/g list element?

  • Peter_Viscarola_(OSR) Administrator Posts: 8,399

    Just one point: the start address for starting DMA on both PCI bus and VME bus of the Device must be programmed by user app.

    I'm not sure I understand: The user app calls (for example) WriteFile, providing a pointer to the data buffer that they want to write, and the length of that buffer. You then create a DMA Transaction and initialize it using the Request (WdfDmaTransactionInitializeUsingRequest). Your EvtProgramDma Event Processing Callback gets called with the scatter/gather list (with exactly one element). You program your hardware.

    Again... see this: https://docs.microsoft.com/en-us/windows-hardware/drivers/wdf/handling-i-o-requests-in-a-kmdf-driver-for-a-bus-master-dma-device which I think explains the process pretty clearly.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Tim_Roberts Member - All Emails Posts: 13,907

    Just one point: the start address for starting DMA on both PCI bus and VME bus of the Device must be programmed by user app.

    No. The programming of the DMA hardware must be entirely under the control of the driver. User mode code is naturally insecure; it's way too easy for another app to interfere, and cause your hardware to do transfers for immoral purposes. The address of the user's buffer is passed in the IRP, and is validated by the I/O subsystem. The address on the device side should be passed to the driver as a parameter, where the driver can validate it.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.
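
As an illustration of validating the device-side address in the driver, the following portable C sketch checks a caller-supplied VME address and length against an allowed window before any register is touched. The window base, window size, and alignment are hypothetical values chosen for the example, and `vme_range_is_valid` is an invented helper, not a framework routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits for the VME window this driver exposes. */
#define VME_WINDOW_BASE 0x08000000u
#define VME_WINDOW_SIZE 0x01000000u  /* 16 MiB */
#define VME_ALIGNMENT   4u           /* assumed transfer granularity */

/* Returns true only if [vme_addr, vme_addr + length) lies entirely
 * inside the window and respects the alignment. The arithmetic is
 * arranged so that no intermediate value can wrap around. */
bool vme_range_is_valid(uint32_t vme_addr, uint32_t length)
{
    if (length == 0)
        return false;
    if ((vme_addr % VME_ALIGNMENT) != 0 || (length % VME_ALIGNMENT) != 0)
        return false;
    if (vme_addr < VME_WINDOW_BASE)
        return false;

    uint32_t offset = vme_addr - VME_WINDOW_BASE;
    if (offset >= VME_WINDOW_SIZE)
        return false;

    /* Compare against the remaining space instead of computing
     * vme_addr + length, which could overflow. */
    return length <= VME_WINDOW_SIZE - offset;
}
```

A driver would typically fail the request with an invalid-parameter status when such a check returns false, before creating or executing any DMA transaction.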

  • croy_kfr Member Posts: 18

    Thank you, Peter and Tim, for your replies.
    I am trying to understand: the s/g list provides one element, SgList->Elements[0].Address, that I have to use to program my device in EvtProgramDma, but I do not see the link between this address and the address of the user's buffer passed through the IRP.
    Could you please explain which address SgList->Elements[0].Address corresponds to?

  • Peter_Viscarola_(OSR) Administrator Posts: 8,399

    Hmmmm... I’m not sure what you don’t understand.

    The first byte of the buffer to which the s/g list points is the first byte of the user data buffer, as passed by the user in WriteFile or ReadFile.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • Tim_Roberts Member - All Emails Posts: 13,907

    SgList->Elements[0].Address is the physical address of the page whose virtual address is in the IRP. Are you familiar with virtual memory and physical memory? None of the addresses you use in your code, either in user mode or kernel mode, are actually the addresses of the bytes in memory. Those are virtual addresses, and they have to be passed through the page tables in order to get the address that goes out on the memory bus.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • croy_kfr Member Posts: 18

    @Peter,
    OK, so if I use WriteFile(handle, buf, bufsize, NULL, NULL), or an IOCTL call like DeviceIoControl(handle, IOCTL_WRITE_DMA, NULL, 0, buf, bufsize, NULL, NULL), the s/g list points to the first byte of "buf", correct?
    @Tim,
    Yes, I have some knowledge of the virtual/physical memory address concept, and I recently read the Microsoft documentation about contiguous and non-contiguous blocks of memory in relation to the s/g list concept, because it was in fact new to me. You are right.
    And so, if I have to program my 32-bit-only device, I have to use SgList->Elements[0].Address.LowPart, which corresponds to the physical address translated by the framework, correct?

  • Peter_Viscarola_(OSR) Administrator Posts: 8,399

    Yes.

    It would probably be most helpful not to refer to the contents of SgList->Elements[0].Address as the “physical address” of the user's data buffer, because it may or may not be. It is the device bus logical address of the user's data buffer, suitable for use with DMA. In the OP’s case, it almost certainly will not be the user data buffer's physical address.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • croy_kfr Member Posts: 18

    Thank you, Peter, for your reply.
    Next, if I use an IOCTL with the Direct I/O type as in my example, is it possible to pass a structure instead of a buffer? In that case, if the device bus logical address is the first element of this structure, does that mean that SgList->Elements[0].Address points to this element?
    In my case, I want to perform a DMA transfer using one IOCTL in the user app to program my device, and I wonder whether this is a good method, because all the PCI examples use the WriteFile() method.

  • Tim_Roberts Member - All Emails Posts: 13,907

    And so, if I have to program my 32-bit-only device, I have to use SgList->Elements[0].Address.LowPart, which corresponds to the physical address translated by the framework, correct?

    You've raised a couple of brand-new issues here. Is this a very old device? Because any modern hardware designer who creates a PCIe device that is limited to 32-bit addressing is guilty of malpractice. Windows has had 64-bit physical addresses since the very beginning, clear back in the 20th Century. ALL current PCIe IP blocks support 64-bit physical addresses. There's no excuse.

    When your DMA is limited to 32 bits, the system has to take an extra step. You can't control where the user's buffer lives, and since modern systems often have 16GB or 32GB or more of RAM, the average user-mode buffer these days is statistically going to be above the 4GB mark. In that case, the operating system has to allocate special space below the physical 4GB limit. These are called "bounce buffers". When the user submits a request, the I/O system will copy his buffer into a "bounce buffer" before calling your EvtProgramDma callback. There are a limited number of "bounce buffers", which means your user request might be chopped into several pieces, with each piece getting another call to EvtProgramDma. This is mostly done without your knowledge, but since you have to maintain a destination address, this may be something you need to know.

    Peter is quite right to point out the difference between "physical" and "logical" addresses. A bus address is not necessarily the same as a physical address. I admit to being lax with this terminology, because in my career I have never encountered a system where the two were not identical.

    In my OP's case, I want to perform DMA transfer using 1 IOCTL in User App for programming my device and I wonder if this is the good method because all PCI examples use WriteFile() method.

    There's almost no difference. It may not be obvious from above, but from a driver standpoint, ReadFile, WriteFile and DeviceIoControl are all virtually identical. The driver just gets an IRP, and the buffers are stored in the same places. In YOUR case, there is an additional consideration, because you need to specify a destination address. With DeviceIoControl, you have the opportunity to send two buffers. You can put the address in buffer 1, and the data in buffer 2. Without that, you have to invent some other scheme, and it doesn't seem as natural.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.
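
Because a request may be split into several pieces, the device-side address has to advance with each piece. The portable C sketch below models that bookkeeping; `DmaProgress`, `next_vme_address`, and `fragment_completed` are hypothetical names for illustration. In a KMDF driver the running byte count could be obtained from WdfDmaTransactionGetBytesTransferred instead of being tracked by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-transaction context kept by the driver. */
typedef struct {
    uint32_t vme_base;      /* device-side start address from the request */
    uint32_t total_length;  /* total bytes requested */
    uint32_t bytes_done;    /* bytes completed by earlier fragments */
} DmaProgress;

/* Device-side address to program for the next fragment. */
uint32_t next_vme_address(const DmaProgress *p)
{
    return p->vme_base + p->bytes_done;
}

/* Record that one fragment finished.
 * Returns 1 when the whole request is done, 0 when more fragments
 * remain, and -1 if the fragment would overrun the request. */
int fragment_completed(DmaProgress *p, uint32_t fragment_length)
{
    if (fragment_length == 0 ||
        fragment_length > p->total_length - p->bytes_done)
        return -1;

    p->bytes_done += fragment_length;
    return p->bytes_done == p->total_length ? 1 : 0;
}
```

With a 10 KiB request split into fragments of 4096, 4096, and 2048 bytes, the device-side address advances by exactly the size of each completed fragment, so the pieces land contiguously on the VME side.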

  • Tim_Roberts Member - All Emails Posts: 13,907

    And so on, if I use an IOCTL like in my example using Direct I/O type, is it possible to pass a structure instead of a buffer ?

    It's just a block of bytes. No one in the system knows or cares how it is interpreted. That's between the driver and its client.

    Having said that, let me offer a couple of just-in-case cautions. Do not pass a structure that contains pointers. It is tricky (although not impossible) to handle user-mode pointers in a kernel mode driver. As long as you follow the rules, the I/O system will make sure all addresses are kernel-mode addresses by the time the request gets to you, but as I said, it can't know what's inside your buffers.

    Secondly, remember to consider field sizes and packing. Remember that your 64-bit driver can be called by 32-bit and 64-bit applications, and the structure packing rules are different. It is a pain in the butt for a driver to have to translate structures coming in from a 32-bit app. The best plan is to design your ioctl structures so they are independent of the app's bitness.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.
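
One way to follow both cautions is to build the control structure only from fixed-width integer fields, ordered so that no implicit padding is needed, and to check the layout at compile time. The sketch below is an assumption-laden example: the structure name DMA_IOCTL appears later in the thread, but the fields shown here are hypothetical and are not the original poster's actual layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control block sent in the IOCTL input buffer.
 * No pointers, no size_t or long: every field has the same size in
 * 32-bit and 64-bit callers. The 8-byte field comes first so that
 * natural alignment never inserts padding. */
typedef struct {
    uint64_t vme_address;  /* device-side start address */
    uint32_t block_size;   /* VME block size in bytes */
    uint32_t flags;        /* direction and other transfer flags */
} DMA_IOCTL;

/* Compile-time layout checks shared by the driver and the application. */
static_assert(sizeof(DMA_IOCTL) == 16, "DMA_IOCTL must be 16 bytes");
static_assert(offsetof(DMA_IOCTL, vme_address) == 0, "unexpected offset");
static_assert(offsetof(DMA_IOCTL, block_size) == 8, "unexpected offset");
static_assert(offsetof(DMA_IOCTL, flags) == 12, "unexpected offset");
```

Placing these checks in the header that both sides include means a layout mismatch shows up as a build error rather than as corrupted parameters at run time.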

  • croy_kfr Member Posts: 18

    Thank you, Tim, for your reply and your explanations.
    I am just a little confused about the fact that I can put the address in buffer 1 and the data in buffer 2. Which buffer does the SgList parameter describe during DMA?

  • Tim_Roberts Member - All Emails Posts: 13,907

    You need to read more about IRPs. DeviceIoControl has two buffers. When you're using direct I/O, as you would here, the first buffer is copied into and/or out of kernel memory. The second buffer is mapped into kernel memory. That's the one you will pass to WdfDmaTransactionInitializeUsingRequest.

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.

  • croy_kfr Member Posts: 18

    Thank you very much Tim.

  • croy_kfr Member Posts: 18

    I have another question about the end of the DMA: my device triggers a PCI interrupt when the DMA ends normally or when an error occurs on either the PCI or the VME bus during a DMA transfer.
    If the DMA ends because of an error, should I deal with this error in my ISR or in my DPC routine after calling WdfDmaTransactionDmaCompleted? In this case, I have to stop the DMA transaction at once and cancel the associated DMA I/O request without the framework scheduling another DMA transfer.

  • Peter_Viscarola_(OSR) Administrator Posts: 8,399

    You do it in your DPC.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers
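
The decision the DPC has to make can be sketched as a small pure function in portable C. Everything here is hypothetical: the status bits, the `DpcAction` enum, and `decide_dpc_action` are invented for illustration. In a KMDF driver, the "more fragments" input would correspond to WdfDmaTransactionDmaCompleted returning FALSE, and the abort path might use WdfDmaTransactionDmaCompletedFinal so that the framework does not schedule further transfers before the request is completed with an error status.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status bits latched by the ISR from the bridge. */
#define DMA_STATUS_DONE      0x1u
#define DMA_STATUS_PCI_ERROR 0x2u
#define DMA_STATUS_VME_ERROR 0x4u

typedef enum {
    DPC_IGNORE,                 /* nothing to do for this status */
    DPC_WAIT_FOR_NEXT_FRAGMENT, /* framework will program another piece */
    DPC_COMPLETE_REQUEST,       /* whole transfer done: complete with success */
    DPC_ABORT_AND_FAIL          /* stop the transaction, fail the request */
} DpcAction;

/* Decide what the DPC should do with the status saved by the ISR.
 * more_fragments is nonzero when the transaction still has data left. */
DpcAction decide_dpc_action(uint32_t status, int more_fragments)
{
    /* Errors win over everything else, including the DONE bit. */
    if (status & (DMA_STATUS_PCI_ERROR | DMA_STATUS_VME_ERROR))
        return DPC_ABORT_AND_FAIL;

    if (!(status & DMA_STATUS_DONE))
        return DPC_IGNORE;

    return more_fragments ? DPC_WAIT_FOR_NEXT_FRAGMENT : DPC_COMPLETE_REQUEST;
}
```

Keeping the ISR limited to capturing and acknowledging the hardware status, and making this decision at DPC level, matches the advice given above.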

  • croy_kfr Member Posts: 18

    Thank you Peter.

  • croy_kfr Member Posts: 18

    I get a WDF_VIOLATION bugcheck with argument 0x00000003: the Windows Driver Framework Verifier has encountered a fatal error.
    In particular, an I/O request has been completed, but a framework request object cannot be deleted because there are outstanding references to the input buffer or the output buffer, or both. The driver's IFR will include details on the outstanding references.

    I checked the log in the dump file, and the last lines related to the IOCTL buffers say:
    35: FxRequest::CompleteInternal - WDFREQUEST 0x0000567C11539B38, PIRP 0xFFFF8F8FC58E6D80, Major Function 0xe, completed with outstanding references on WDFMEMORY 0x0000567C11539A61 or 0x0000567C11539A51 retrieved from this request
    36: FxRequest::CompleteInternal - WDFMEMORY 0x0000567C11539A61, buffer FFFF8F8FC617AFE0, PMDL 0000000000000000, length 20 bytes
    37: FxRequest::CompleteInternal - IOCTL output WDFMEMORY 0x0000567C11539A51, buffer FFFFE481CAE12010, PMDL FFFFA983EE8C6750, length 4096 bytes
    ---- end of log ----

    I do not see any particular error, but I have identified in the code that the WDF method that triggers the bugcheck is WdfDmaTransactionExecute().
    Do you have any suggestions to help me debug the driver? I am a newbie with the WinDbg debugger. Thanks.

  • Peter_Viscarola_(OSR) Administrator Posts: 8,399

    Like the log says, a WDFMEMORY object has a reference when the Request is completed.

    How are your input and output buffers for this Request created? Do you build them? Are you making a copy of either of those buffers using a WDFMEMORY object?

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • croy_kfr Member Posts: 18

    Hello Peter,
    My user app creates both buffers: the first is a "control" buffer and the second is a "data" buffer.
    The IOCTL is defined as METHOD_IN_DIRECT, FILE_WRITE_ACCESS because I would like to perform a DMA write:
    DeviceIoControl(
        DriverHandle,
        IOCTL_DMA_WRITE,
        DmaIoctl,
        sizeof(DMA_IOCTL),
        Buffer,
        BufferSize,
        &BytesReturned,
        &Overlapped);

    I do not use any WDFMEMORY object in my driver. I only use WdfRequestRetrieveInputBuffer for the "control" buffer and WdfRequestRetrieveOutputBuffer for the "data" buffer, and then I call WdfDmaTransactionInitializeUsingRequest(DmaTransaction, DmaRequest, EvtProgramDmaFunction, WdfDmaDirectionWriteToDevice);
    The bugcheck occurs when I call WdfDmaTransactionExecute() just afterwards.

  • Tim_Roberts Member - All Emails Posts: 13,907

    If this is a DMA write, then it's really METHOD_OUT_DIRECT, but that doesn't really matter very much. Where are you completing the request? Is it possible you complete the request before you call WdfDmaTransactionDmaCompleted?

    Tim Roberts, [email protected]
    Providenza & Boekelheide, Inc.
