Solution for multiple processes accessing the same DMA buffer

Hi All,

I am developing a PCI-e device driver and the corresponding user-mode API, which encapsulates a certain protocol and is used for communication. The API may be used by more than one process, and each of them may access the DMA buffer to read captured data. For example, two processes may exist simultaneously: one used for communication and the other used to capture frames.

The DMA buffer is contiguous physical memory, and the interface with the HW is as described in the following:

|-----------|------------|  |-----------|------------|     |-----------|------------|
| owner bit | frame data |  | owner bit | frame data | ... | owner bit | frame data |
|-----------|------------|  |-----------|------------|     |-----------|------------|
     frame buffer 0              frame buffer 1                  frame buffer n

owner bit:
1: the buffer is owned by the HW; the HW can write data to the buffer, and the owner bit changes to 0 when the HW has finished filling the frame
0: the buffer is owned by the SW; the SW can read data from the buffer, and the owner bit changes to 1 when the SW has finished reading
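For illustration, the owner-bit layout above could be modeled in C roughly as follows. All names are illustrative, not from our actual code, and a real driver would place this in DMA-able memory and add proper memory barriers around the owner-bit accesses:

```c
#include <stdint.h>
#include <string.h>

#define FRAME_DATA_SIZE 4096

enum owner { OWNER_SW = 0, OWNER_HW = 1 };

/* One slot in the shared DMA region: an owner flag followed by the payload. */
struct frame_buffer {
    volatile uint32_t owner;        /* 1: HW may fill; 0: SW may read */
    uint8_t data[FRAME_DATA_SIZE];  /* captured frame payload */
};

/* SW side: consume one frame if available, then hand the slot back to HW.
 * Returns 1 if a frame was copied out, 0 if the HW still owns the slot. */
static int sw_read_frame(struct frame_buffer *fb, uint8_t *out)
{
    if (fb->owner != OWNER_SW)
        return 0;                   /* HW is still filling this slot */
    memcpy(out, fb->data, FRAME_DATA_SIZE);
    fb->owner = OWNER_HW;           /* release the buffer back to HW */
    return 1;
}
```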

Are there better solutions to this problem?
By the way, the FPGA engineer is also on our team, so if there is a better solution we can modify the interface between the SW and HW.

Thanks in advance!


If you use an owner bit, you still need to synchronize access between the HW and SW components. Right now you don't have such synchronization.
One solution you could implement is to use FIFO queues. One queue contains pointers to free, empty frames; the SW writes released frames to this queue. The second queue contains full frames; the HW writes a pointer to a frame into this queue when the frame is ready for SW processing.
Both queues must be allocated in PCI memory because both HW and SW access them.
In addition, your FPGA could provide status bits, "the free queue is empty" and "the full queue contains at least one frame", and either status bit could trigger an interrupt.
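A rough C sketch of one such queue, with illustrative names only: a power-of-two ring of frame indices, where the producer side (HW for the full queue, SW for the free queue) advances `head` and the consumer side advances `tail`. A real implementation would also need memory barriers between writing an entry and publishing the index:

```c
#include <stdint.h>

#define QUEUE_SLOTS 16u              /* must be a power of two */

struct frame_queue {
    uint32_t entries[QUEUE_SLOTS];   /* frame indices (or bus addresses) */
    volatile uint32_t head;          /* written by the producer only */
    volatile uint32_t tail;          /* written by the consumer only */
};

/* Producer: enqueue one frame. Returns 0 if the queue is full. */
static int queue_push(struct frame_queue *q, uint32_t frame)
{
    if (q->head - q->tail == QUEUE_SLOTS)
        return 0;                                /* queue full */
    q->entries[q->head % QUEUE_SLOTS] = frame;
    q->head = q->head + 1;                       /* publish after the entry */
    return 1;
}

/* Consumer: dequeue one frame. Returns 0 if the queue is empty. */
static int queue_pop(struct frame_queue *q, uint32_t *frame)
{
    if (q->head == q->tail)
        return 0;                                /* queue empty */
    *frame = q->entries[q->tail % QUEUE_SLOTS];
    q->tail = q->tail + 1;
    return 1;
}
```

In this scheme the SW pushes released frames onto the free queue and pops ready frames from the full queue, while the HW does the opposite, so neither side ever writes the other's index.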

Igor Sharovar

I don’t see anything wrong with the approach you outlined – it’s the basis for several similar approaches that are commonly used. Plus, it has the advantage of being simple to implement.

I can recommend a couple of improvements:

  1. I generally recommend separating data and control information. So, you might consider using a vector of control structures in contiguous physical memory, where each control structure contains a "status" ULONG and a pointer to the actual data buffer.

  2. You can augment the approach above to include a head and a tail pointer (one updated by the hardware, the other by the software), and now you have a traditional "circular buffer" approach. STILL pretty easy to implement and very flexible.

  3. I generally recommend avoiding approaches that require the data buffer to be physically contiguous, if possible. If the buffers you need are small and will be allocated by the driver, this isn't a big deal. But if the buffers you need are large, and you want to be able to malloc them from one or more user-mode apps, supporting scatter/gather (AKA DMA chaining) can be *much* more efficient. This approach is not a lot more complex, on the host software side at least (thanks to the Windows functions provided to handle this), and gets you a lot of flexibility. So, now your control structure points not to the start of the data buffer but rather to the start of a physically contiguous vector of base-address and length pairs, one for each fragment of the data buffer.
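Putting those three suggestions together, a hypothetical descriptor layout might look like the sketch below: a head/tail-indexed ring of control descriptors, separate from the data, where each descriptor points to a scatter/gather list instead of one contiguous data buffer. All names are illustrative, and a real design would add barriers and doorbell/interrupt signaling:

```c
#include <stdint.h>
#include <stddef.h>

/* One fragment of a (possibly non-contiguous) data buffer. */
struct sg_entry {
    uint64_t bus_addr;         /* bus/physical address of the fragment */
    uint32_t length;           /* fragment length in bytes */
};

/* Control information only; the data lives elsewhere. */
struct descriptor {
    volatile uint32_t status;  /* e.g. 0 = HW-owned, 1 = frame complete */
    uint32_t sg_count;         /* number of fragments in the list */
    uint64_t sg_list_addr;     /* bus address of the sg_entry vector */
};

#define RING_SIZE 64u          /* must be a power of two */

struct descriptor_ring {
    struct descriptor desc[RING_SIZE];
    volatile uint32_t head;    /* advanced by HW as frames complete */
    volatile uint32_t tail;    /* advanced by SW as frames are consumed */
};

/* SW consumer: peek at the next completed descriptor, or NULL if none. */
static struct descriptor *ring_next_completed(struct descriptor_ring *r)
{
    if (r->tail == r->head)
        return NULL;                              /* ring empty */
    return &r->desc[r->tail % RING_SIZE];
}

/* SW consumer: hand the descriptor at the tail back to the HW. */
static void ring_release(struct descriptor_ring *r)
{
    r->desc[r->tail % RING_SIZE].status = 0;
    r->tail = r->tail + 1;
}
```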

Do you have specific concerns about the initial approach you described? What are your design constraints and goals?

Peter
OSR

xxxxx@gmail.com wrote:


My comments echo Igor’s. Although this architecture can be made to
work, there are synchronization issues. For example, you can’t step in
arbitrarily and know which is the next buffer to be read or written –
you have to have that information as state somewhere. If you keep a
FIFO of empty buffers and filled buffers, you don’t need that extra state.

Aren’t you going to want an interrupt at the end of each frame?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> For example, you can't step in arbitrarily and know which is the next buffer to be read or written -
> you have to have that information as state somewhere.

Exactly. PCI memory is a limited resource, and the OP needs to implement an algorithm for reusing frames/buffers. His current approach misses this.

Igor Sharovar

And considering the UM interaction:

Depending on the kind of driver this is, you must be prepared to handle
multiple overlapping requests from various client applications. If you tell
us what stack you are in, we can provide more specific information (or, if
this is a custom interface, what the nature of the device is).

“Tim Roberts” wrote in message news:xxxxx@ntdev…
