Common Buffer DMA Questions

PaoloC · December 16, 2004, 9:47am

Hi,
Why use the common buffer?
With the PLX bridge's scatter/gather support you can DMA directly into the user application's
buffers.
Just queue enough overlapped I/O requests from the application.
On the OSR site you can find a very interesting technical article titled
X-DMA - Extreme DMA for Performance

http://www.osronline.com/article.cfm?id=19

Regards,
PaoloC

– Original message –
From: “Scott M. Dunn”
>To: “Windows System Software Devs Interest List”
>Date: Thu, 16 Dec 2004 13:09:39 +0100
>Subject: [ntdev] Common Buffer DMA Questions
>Reply-To: “Windows System Software Devs Interest List”
>
>
>Hello everyone.
>
>I’m a new driver writer, and I have some questions about
>common buffer DMA. I’ve looked through the lists, but haven’t
>found what I’m looking for.
>
>I am writing a driver for a data acquisition card (FPGA, PLX
>PCI, Bus Master, Scatter/Gather). I have a Unix driver as a
>model.
>
>I have a ring buffer made out of 8 one-page common buffers. I
>started the DMA transfer and can read from 7 buffers, and the 8th
>is being written. The DMA transfer runs continually.
>
>The read process is causing me problems. In user mode, you
>just keep reading buffers in a loop. For now, it works like this: 7
>reads succeed, and then for the 8th (waiting for the next buffer to be
>completely written) I return 0 bytes transferred. This works, so I
>can read data, but of course this isn’t optimal.
>
>The Unix driver solves it like this. If the data is there, the read
>request is completed. If the data is not there, the read request
>sleeps until an interrupt comes. If the buffer that was completed
>isn’t the right one, the read request sleeps again until the next
>interrupt and so on. But I haven’t been able to find an equivalent
>for WDM drivers.
>
>I think I have to set up a queue, but I’m not sure how that would
>work. Until now, I have been doing all of the reading right in the
>DispatchRead routine because the data was there and it didn’t
>take long. So I could put the read IRP in a queue when the data
>isn’t there (STATUS_PENDING and everything), and then read from
>the queue when an interrupt occurs.
>
>But here is where I run into problems. What happens when two
>file handles are open, and two programs try to read two different
>blocks that haven’t been written yet? I start reading the queue,
>one request can be filled, and I would want to wait on the next one.
>Can I re-queue it?
>
>Is this a lousy design? Am I missing something? Are there any
>examples of this? I’ve found very few examples of common
>buffer DMA, and none with multiple read requests …
>
>I have the books from Walter Oney, Art Baker, and Chris Cant.
>
>Any help would be greatly appreciated, and let me know if I can
>clarify anything.
>
>Scott Dunn
>
>Brückner & Jarosch Ing.-GmbH
>Nonnengasse 5a
>99084 Erfurt
>Germany
>
>Tel: +49 (0)361 / 21 24 02 2
>Fax: +49 (0)361 / 21 24 01 9
>
>—
>Questions? First check the Kernel Driver FAQ at http://www.osronline.com/article.cfm?id=256
>
>You are currently subscribed to ntdev as: unknown lmsubst tag argument:
‘’
>To unsubscribe send a blank email to xxxxx@lists.osr.com

OSR_Community_User · December 16, 2004, 10:24am

You should never wait on IO completion in your dispatch routine.

You need to re-read Walter’s book, paying closer attention to the
asynchronous IO model. The Viscarola/Mason book covers this as well. In
addition, you should read the Microsoft KB articles on IRP processing here:
http://www.microsoft.com/whdc/driver/kernel/default.mspx
The latest edition of Inside Windows XXXXX by Russinovich & Solomon could
also be helpful.

The IO model in NT is quite well designed, but if your ‘mental model’ is
synchronous Unix-style IO you have a bit of learning to do.

Another poster suggested that you use MDL-based direct IO instead. That is
in general a good idea for DMA-based devices; however, given the small buffer
size you described, the common buffer approach is probably equally efficient.
Neither approach will by itself solve your design issue regarding sharing
access to the seven read slots on your device. You have to provide an
internal mechanism that implements that feature appropriately for the IO
characteristics and semantics of whatever your device and applications
are actually doing.

You need something like a ‘pending readers queue’ (IOs waiting for access
to a read slot) and an ‘in-progress read queue’ (read requests currently
using the read slots). When a read DMA completes, you need a token for
that read that correlates to the IRP waiting for its completion. A data
structure that maps slots to IRPs is appropriate. You then pull just
that IRP from the in-progress read queue and complete it (copying the data
first if you are using the common buffer design). You then select a pending
read based on whatever criteria you choose (e.g. FIFO) and start that
request on the available slot. Repeat as needed.
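
To make the queueing idea above a bit more concrete, here is a rough user-mode model in plain C. It is only a sketch under stated assumptions, not driver code. It simplifies the scheme to a single pending list in which each request records the slot it is waiting for, so that field doubles as the slot-to-request mapping. The names Request, Device, dispatch_read and on_dma_complete are all made up for illustration. In a real WDM driver the request would be an IRP, the queueing path would call IoMarkIrpPending and return STATUS_PENDING, completion would go through IoCompleteRequest (typically from the DPC), and the list would need a spin lock and cancel handling (for example a cancel-safe queue).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_COUNT 8     /* ring of 8 one-page buffers, as in the question */
#define SLOT_SIZE  4096

enum { STATUS_DONE = 0, STATUS_QUEUED = 1 };  /* hypothetical status codes */

/* Hypothetical stand-in for an IRP; not the real kernel structure. */
typedef struct Request {
    int             wanted_slot;  /* ring slot this reader is waiting for */
    char           *user_buffer;  /* destination, at least SLOT_SIZE bytes */
    size_t          bytes_done;
    int             completed;
    struct Request *next;         /* link in the pending FIFO */
} Request;

/* Hypothetical stand-in for the device extension. */
typedef struct Device {
    char     slots[SLOT_COUNT][SLOT_SIZE];  /* models the common buffers */
    int      slot_ready[SLOT_COUNT];        /* set when DMA into slot is done */
    Request *pending_head;
    Request *pending_tail;
} Device;

/* Copy the slot to the caller and mark the request finished. */
static void complete_request(Device *dev, Request *req)
{
    memcpy(req->user_buffer, dev->slots[req->wanted_slot], SLOT_SIZE);
    req->bytes_done = SLOT_SIZE;
    req->completed = 1;
}

/* Models DispatchRead: finish at once if data is there, otherwise queue
 * the request and return immediately. It never waits. */
int dispatch_read(Device *dev, Request *req)
{
    assert(req->wanted_slot >= 0 && req->wanted_slot < SLOT_COUNT);
    req->completed = 0;
    req->bytes_done = 0;
    req->next = NULL;

    if (dev->slot_ready[req->wanted_slot]) {
        complete_request(dev, req);
        return STATUS_DONE;
    }
    if (dev->pending_tail)
        dev->pending_tail->next = req;
    else
        dev->pending_head = req;
    dev->pending_tail = req;
    return STATUS_QUEUED;
}

/* Models the interrupt/DPC path: complete every queued request that was
 * waiting for this slot and leave all others queued in FIFO order.
 * Returns the number of requests completed. */
int on_dma_complete(Device *dev, int slot)
{
    int       done = 0;
    Request  *last_kept = NULL;
    Request **link = &dev->pending_head;

    assert(slot >= 0 && slot < SLOT_COUNT);
    dev->slot_ready[slot] = 1;

    while (*link) {
        Request *req = *link;
        if (req->wanted_slot == slot) {
            *link = req->next;        /* unlink the matching request */
            req->next = NULL;
            complete_request(dev, req);
            done++;
        } else {
            last_kept = req;          /* still waiting for another slot */
            link = &req->next;
        }
    }
    dev->pending_tail = last_kept;
    return done;
}
```

In this model a request that is waiting for a different block is simply left on the list when an interrupt arrives, so the "can I re-queue it?" problem does not come up: nothing is removed from the queue until its own slot has been filled.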

=====================
Mark Roddy


Maxim_S_Shatskih · December 16, 2004, 12:20pm

>But here is where I run into problems. What happens when two file-handles are
open, two

Not all kinds of drivers support opening two or more file handles. Yours may
be one of those that do not.

BTW - running DMA over the IRP’s MDLs is better than having a common buffer.
A common buffer is good only for control structures which the hardware will
interpret, like a hardware-defined scatter-gather list.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim_S_Shatskih · December 16, 2004, 12:29pm

It depends.

If I send IOCTL_DISK_IS_WRITABLE down, then yes, I will wait for it.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com


OSR_Community_User · December 16, 2004, 1:10pm

Well, yes. I was a bit strong on the ‘never’ part, but we were discussing
read/write IO in a hardware driver.

=====================
Mark Roddy
