DMA to user-mode buffers (locked down)

I’m not really sure I’m going about this right, so I figured perhaps I
should ask for some help…

I’m working on a satellite capture application that uses DMA to do
real-time capture of demodulated data (generally an MPEG stream, but as
far as I’m concerned, it’s just raw bytes). We’re using an FPGA to be
the PCI device which, among other things, has a scatter-gather PCI DMA
engine inside of it. I wrote much of the FPGA code, so I have the
ability to change things if I need to, but I don’t think that’s the problem.

Where I’m running into difficulty is in DMAing into user space.
Generally, this sounds like a bad idea, but I was hoping that generating
an MDL from an ioctl-passed virtual address (the ioctl executes in the
context of the calling application, no?) and locking down the pages with
MmProbeAndLockPages() with UserMode and IoModifyAccess as the parameters
would do the trick for giving me an MDL that suits the buffer.

However, when the DMA is actually performed, I get memory corruption
problems of various types, generally when the ISR fires at the end of
the transfer. I’ve printed out the page addresses of the scatter-gather
list, and I’m getting what I would consider good addresses (in the
0x1XXXXXXX range, and the appropriate number of them for the buffer size
I’m trying to do), but I get the feeling that perhaps my pages aren’t
staying locked down.
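
One quick sanity check on "the appropriate number of them" is to compute how many pages a buffer should span from its virtual address and length, and compare that against the number of scatter-gather entries actually printed. The following is a minimal sketch, assuming 4 KiB pages; the function name span_pages is hypothetical. It mirrors the arithmetic of the WDK macro ADDRESS_AND_SIZE_TO_SPAN_PAGES but is plain C, so it can be run outside the driver.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SHIFT_BITS 12u                      /* assumption: 4 KiB pages */
#define PAGE_SIZE_BYTES (1u << PAGE_SHIFT_BITS)

/*
 * Number of pages touched by a buffer that starts at virtual address `va`
 * and is `len` bytes long. A buffer that is not page-aligned can span one
 * more page than len / PAGE_SIZE_BYTES would suggest.
 */
size_t span_pages(uintptr_t va, size_t len)
{
    size_t offset = (size_t)(va & (PAGE_SIZE_BYTES - 1u)); /* offset into first page */
    return (offset + len + (PAGE_SIZE_BYTES - 1u)) >> PAGE_SHIFT_BITS;
}
```

For example, an 8192-byte buffer that starts 0x800 bytes into a page touches three pages, not two, so a scatter-gather list with only two entries for it would be short.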

It is, of course, possible that the DMA engine is sending bad addresses,
but the logic analyzer seems to think I’ve got them right, so unless
it’s a rather intermittent problem, I don’t think that’s it.

Any thoughts? It’s not impossible that I could just use the
common-buffer approach (which I tried before), but that involves a lot
more overhead than I’d really like, and the latency seems to make the
demodulator FIFO overflow a lot (though that may actually be fixed at
this point, because there were some priority issues internal to the FPGA
which caused it to hog the bus while in the midst of continuous transfer).

Thanks in advance,
Dave


David Riley
Hardware Engineer
Mantaro Networks
20410 Century Blvd
Germantown, MD 20874
(301)528-2244 x532

David Riley wrote:

> I’m working on a satellite capture application that uses DMA to do
> real-time capture of demodulated data (generally an MPEG stream, but as
> far as I’m concerned, it’s just raw bytes). We’re using an FPGA to be
> the PCI device which, among other things, has a scatter-gather PCI DMA
> engine inside of it. I wrote much of the FPGA code, so I have the
> ability to change things if I need to, but I don’t think that’s the problem.
>
> Where I’m running into difficulty is in DMAing into user space.
> Generally, this sounds like a bad idea, but I was hoping that generating
> an MDL from an ioctl-passed virtual address (the ioctl executes in the
> context of the calling application, no?) and locking down the pages with
> MmProbeAndLockPages() with UserMode and IoModifyAccess as the parameters
> would do the trick for giving me an MDL that suits the buffer.

That’s the right way to do things.

> However, when the DMA is actually performed, I get memory corruption
> problems of various types, generally when the ISR fires at the end of
> the transfer. I’ve printed out the page addresses of the scatter-gather
> list, and I’m getting what I would consider good addresses (in the
> 0x1XXXXXXX range, and the appropriate number of them for the buffer size
> I’m trying to do), but I get the feeling that perhaps my pages aren’t
> staying locked down.

Ooh, that’s a terrible kind of problem to chase down. I’m having a
similar issue with a project I’m working on now. When you see memory
corruption, are you able to identify it clearly? If it is a DMA bug,
it’s likely to be either writing past the end of a page, or reading too
many entries from the MDL. Is the physical address of the corruption
near any of your pages – like just after one of them?
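
The check suggested above (is the corrupted physical address inside one of the pages, or just past the end of one?) can be sketched in plain C. This is only an illustration under stated assumptions: the sg_entry layout, the function name classify_corruption, and the `slack` window are all hypothetical, and the entries would have to be filled in from whatever the driver prints for its scatter-gather list.

```c
#include <stdint.h>
#include <stddef.h>

/* One scatter-gather entry: physical start address and length in bytes. */
typedef struct {
    uint64_t phys;
    uint32_t len;
} sg_entry;

typedef enum { HIT_NONE, HIT_INSIDE, HIT_OVERRUN } hit_kind;

/*
 * Classify a corrupted physical address against the scatter-gather list.
 *   HIT_INSIDE  : addr lies within entry *which (an expected DMA target).
 *   HIT_OVERRUN : addr lies within `slack` bytes past the end of entry
 *                 *which, and inside no entry (a write past the page end).
 *   HIT_NONE    : addr is not near any entry; *which is left untouched.
 */
hit_kind classify_corruption(const sg_entry *sg, size_t n, uint64_t addr,
                             uint32_t slack, size_t *which)
{
    size_t i;

    /* First pass: an address inside any entry is not an overrun. */
    for (i = 0; i < n; i++) {
        if (addr >= sg[i].phys && addr - sg[i].phys < sg[i].len) {
            *which = i;
            return HIT_INSIDE;
        }
    }

    /* Second pass: look for an address just past the end of an entry. */
    for (i = 0; i < n; i++) {
        uint64_t end = sg[i].phys + sg[i].len;
        if (addr >= end && addr - end < slack) {
            *which = i;
            return HIT_OVERRUN;
        }
    }
    return HIT_NONE;
}
```

A HIT_OVERRUN result points toward the engine writing past the end of a page; corruption that is nowhere near any entry points more toward a bad address, such as one read from beyond the end of the list.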


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Problem solved; looks like I was wrong, it was in the FPGA. Turns out
the DMA engine wasn’t sending the end-of-burst code properly to the
internal PCI bridge, so the bridge would accept the read request on the
internal (Wishbone) side, but keep issuing retries on the PCI side until
the PCI bridge gave up.

What made this so difficult to diagnose was that when the PCI bridge
gives up on a PCI read done at PASSIVE_LEVEL or even DISPATCH_LEVEL,
the read simply returns all Fs. When it happens within the initial ISR
(while the interrupt flags register is being read out of the card), it
generates an unhandled hardware exception instead.

Let this be a note to everyone out there using the OpenCores PCI bridge
in their FPGA/ASIC: make sure your CTI always registers “end of burst”
at the end of a page or a transaction, or strange things result.
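
Since a failed read outside the ISR shows up as all Fs, one cheap defensive measure is to treat an all-ones value from the flags register as "the read failed" rather than "every interrupt is pending". The following is a minimal sketch, assuming a 32-bit flags register in which all ones is never a legitimate value; the names classify_flags and valid_mask are hypothetical. It would not have prevented the exception seen inside the ISR, but it makes the failure visible on the paths where the read does return.

```c
#include <stdint.h>

#define READ_FAILED_PATTERN 0xFFFFFFFFu  /* value seen when the read is abandoned */

typedef enum { FLAGS_READ_FAILED, FLAGS_NOT_OURS, FLAGS_PENDING } flags_result;

/*
 * Interpret a raw 32-bit value read from the interrupt flags register.
 * `valid_mask` selects the bits the device actually implements.
 * On FLAGS_PENDING, *pending receives the masked flag bits; otherwise
 * *pending is left untouched.
 */
flags_result classify_flags(uint32_t raw, uint32_t valid_mask, uint32_t *pending)
{
    if (raw == READ_FAILED_PATTERN) {
        /* The read never completed on the bus; do not trust any bit. */
        return FLAGS_READ_FAILED;
    }
    raw &= valid_mask;
    if (raw == 0) {
        return FLAGS_NOT_OURS;   /* no implemented flag is set */
    }
    *pending = raw;
    return FLAGS_PENDING;
}
```

Logging every FLAGS_READ_FAILED result would have shown the retry problem as a pattern of failed register reads rather than as seemingly random corruption.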

David Riley
Hardware Engineer
Mantaro Networks
20410 Century Blvd
Germantown, MD 20874
(301)528-2244 x532
