Re: Flushing DMA Buffer Allocated with AllocateCommonBuffer

Maybe someone who’s more familiar with the hardware can explain it to me ?
I’m having trouble visualizing how it works.

On a Xeon platform, because DMA doesn’t go through the front-side bus, the
processors cannot snoop DMA cycles, and hence the MESI protocol doesn’t take
those transfers into consideration. Unless of course the bridge has some
mechanism to warn processors, but then, that would defeat the point of doing
DMA, no? I mean, we would be generating snoop cycles on the FSB instead of
data cycles? Going out may be handled with write-through, but
going in may be a problem, how are the processors going to know that
physical memory has changed under their caches ? And when it does, the
processors will need to invalidate the caches, so, what’s the point of
having a cacheable region for the common buffer ? Furthermore, data that
goes out in a DMA transfer must be resident in physical memory, so, wouldn’t
it be the case that some sequencing instruction would be required to flush
the write buffers ?
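
The "going in" half of this question can be illustrated with a toy model. The sketch below is plain C, not driver code, and every name in it (ToySystem, cpu_read, dma_write_unsnooped, dma_write_snooped, cpu_invalidate) is hypothetical. It assumes a single cache line shadowing a single word of memory, and it only shows the two possibilities being discussed: a device write the processor never sees, and a device write that invalidates the line through a snoop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one word of "physical memory" and one cache line shadowing it. */
typedef struct {
    uint32_t memory;  /* what the device reads and writes */
    uint32_t cached;  /* the CPU's private copy of that word */
    bool     valid;   /* true if the cache line currently holds a copy */
} ToySystem;

/* CPU read: hits the cache if the line is valid, otherwise fills from memory. */
static uint32_t cpu_read(ToySystem *s)
{
    assert(s != NULL);
    if (!s->valid) {
        s->cached = s->memory;
        s->valid = true;
    }
    return s->cached;
}

/* Device write that the CPU cannot observe: the cache line is left untouched. */
static void dma_write_unsnooped(ToySystem *s, uint32_t value)
{
    s->memory = value;
}

/* Device write on a coherent platform: the snoop invalidates the cache line. */
static void dma_write_snooped(ToySystem *s, uint32_t value)
{
    s->memory = value;
    s->valid = false;
}

/* Explicit software invalidate, needed only when the hardware does not snoop. */
static void cpu_invalidate(ToySystem *s)
{
    s->valid = false;
}
```

In this model, if the line was already cached, a cpu_read() after dma_write_unsnooped() keeps returning the old value until cpu_invalidate() is called, whereas dma_write_snooped() needs no help from software. Which of the two behaviors a given chipset provides is the open question here.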

I’m a bit rusty on all that jazz, so, I may be wrong !

Alberto.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com]On Behalf Of Mark Roddy
Sent: Thursday, November 13, 2003 2:25 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
AllocateCommonBuffer

  1. the OP should consider why he needs a cached common buffer.

  2. on all current platforms running NT (i.e. not Alpha, MIPS or PowerPC),
    all memory is by default cache coherent, including memory written by
    peripheral devices, from the perspective of the processors on the
    system. The only possible issue is writes from the CPU to the common
    buffer where the common buffer is cached. I’m not convinced there is a
    problem here, but see point one above.

  3. common buffers are ‘special’. They are platform-specific and designed to
    be used for DMA. The documentation in the DDK does not require that you
    flush common buffers.

Note that on x86 platforms KeFlushIoBuffers is a NOP. Draw your own
conclusions.

===========================
Mark Roddy
Consultant, Microsoft DDK MVP
Hollis Technology Solutions
xxxxx@hollistech.com
www.hollistech.com
603-321-1032

-----Original Message-----
From: “Doug”
To: “Windows System Software Devs Interest List”
Date: Wed, 12 Nov 2003 11:46:17 -0500
Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
AllocateCommonBuffer

> Data written by the cpu to a device can exist in the cache (unless this is
> a write-through cache) and needs to get flushed to the actual physical
> memory so the device can bus master it.
>
> Data written by a device can exist in the memory but the cache lines may
> have old data in them so they need to be invalidated before the cpu can
> ‘read’ the data.
>
> Direct is not direct when caches are involved…
>
>
> “Moreira, Alberto” wrote in message
> news:xxxxx@ntdev…
> >
> > Think of it: it’s called “direct” memory access because it doesn’t
> > involve the processors. Caches belong in the processors. Ergo…
> >
> > Alberto.
> >
>
>
>
> —
> Questions? First check the Kernel Driver FAQ at
> http://www.osronline.com/article.cfm?id=256
>
> You are currently subscribed to ntdev as: xxxxx@hollistech.com
> To unsubscribe send a blank email to xxxxx@lists.osr.com



The contents of this e-mail are intended for the named addressee only. It
contains information that may be confidential. Unless you are the named
addressee or an authorized designee, you may not copy or use it, or disclose
it to anyone else. If you received it in error please notify us immediately
and then destroy it.

Look up the DDK docs for “DMA Verification”. This is a new feature of
Driver Verifier. Here’s its definition of common-buffer DMA:

Common-buffer DMA

Common-buffer DMA is performed when the system can allocate a single buffer
that is accessible by both the hardware and the software. The driver is
responsible for synchronizing accesses to the buffer. The memory is
not cached, making this synchronization easier for the driver. After setting
up a common buffer, both the driver and the hardware can write directly to
the addresses in the buffer without any intervention from the HAL.

Well, it specifically says, the memory is not cached to simplify
synchronization.
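
Since that definition leaves synchronization to the driver, one common convention is an ownership bit in a descriptor that lives in the common buffer. The following is a minimal sketch in portable C under stated assumptions: the descriptor layout, the DESC_OWNED_BY_DEVICE bit and the function names are all hypothetical, and the "device" is simulated by an ordinary function rather than real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_OWNED_BY_DEVICE 0x80000000u  /* hypothetical ownership bit */

/* Hypothetical descriptor shared by driver and device in a common buffer. */
typedef struct {
    uint32_t buffer_addr;      /* bus address of the payload */
    uint32_t length;           /* payload length in bytes */
    volatile uint32_t status;  /* ownership bit plus completion code */
} ToyDescriptor;

/* Driver side: fill in the fields first, hand the descriptor over last. */
static bool driver_post(ToyDescriptor *d, uint32_t addr, uint32_t len)
{
    assert(d != NULL);
    if (d->status & DESC_OWNED_BY_DEVICE) {
        return false;  /* the device is still working on this descriptor */
    }
    d->buffer_addr = addr;
    d->length = len;
    /* A real driver would need a memory barrier here so that the fields
       above become visible before the ownership bit does. */
    d->status = DESC_OWNED_BY_DEVICE;
    return true;
}

/* Simulated device: writes a completion code and releases ownership. */
static void device_complete(ToyDescriptor *d, uint32_t code)
{
    if (d->status & DESC_OWNED_BY_DEVICE) {
        d->status = code & ~DESC_OWNED_BY_DEVICE;
    }
}

/* Driver side: the result may be read only after the device has let go. */
static bool driver_poll(const ToyDescriptor *d, uint32_t *code)
{
    assert(d != NULL && code != NULL);
    if (d->status & DESC_OWNED_BY_DEVICE) {
        return false;
    }
    *code = d->status;
    return true;
}
```

With an uncached buffer, each of these reads and writes goes straight to memory, which is why the handshake stays this simple. With a cached buffer, the same handshake would additionally depend on the platform keeping the cache coherent with the device.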

Alberto.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com]On Behalf Of Calvin Guan
Sent: Friday, November 14, 2003 12:03 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
AllocateCommonBuffer

If you are writing NDIS drivers, you can use NdisMUpdateSharedMemory to
flush shared memory which is not described by any NDIS_BUFFER (MDL). Packet
descriptor structure accessed by both NIC and CPU falls into this category.

I couldn’t find the KeXxx counterpart of it. It’s defined as no-op for now
though.

Calvin Guan, Software Developer xxxxx@nospam.ati.com
SW2D-Radeon NT Core Drivers
ATI Technologies Inc.
1 Commerce Valley Drive East
Markham, Ontario, Canada L3T 7X6
Tel: (905) 882-2600 Ext. 8654
Find a driver: http://www.ati.com/support/driver.html

> -----Original Message-----
> From: Doug [mailto:xxxxx@hotmail.com]
> Sent: Friday, November 14, 2003 11:05 AM
> To: Windows System Software Devs Interest List
> Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
> AllocateCommonBuffer
>
>
> The documentation does require it. While the page in the DDK help labeled
> “Using Common-Buffer Bus-Master DMA” makes no mention of allocating an MDL
> and passing it to KeFlushIoBuffers, the page “Flushing Cached Data during
> DMA Operations” implies it if you mark the buffers as CacheEnabled.
>
> Problem in the DDK docs?
>
> “Mark Roddy” wrote in message news:xxxxx@ntdev
> 3) common buffers are ‘special’. They are platform-specific and designed to
> be used for DMA. The documentation in the DDK does not require that you
> flush common buffers.
>
> Note that on x86 platforms KeFlushIoBuffers is a NOP. Draw your own
> conclusions.
>



Well, the datasheet for the Intel 7505 states that not all AGP transfers are
coherent. It doesn’t say anything about hub transfers, but they’re done on a
separate bus. I still don’t see how a processor is to know that such a
transfer has taken place !

Alberto.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com]On Behalf Of Peter Wieland
Sent: Friday, November 14, 2003 4:14 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
AllocateCommonBuffer

I believe the memory controller takes care of ensuring consistency for
other things attempting to access it (like the PCI Controller)

-p

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Moreira, Alberto
Sent: Friday, November 14, 2003 8:44 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
AllocateCommonBuffer

I don’t see how it can work otherwise. For example, take a common-buffer
DMA where I’m writing a stream of commands to a graphics chip on the PCI
bus. If that buffer is writeback-cached, the processor is depositing the
data in its cache, and the cache lines are marked “modified” - however,
there’s no data actually written to physical memory, and it would take
another front-side bus master to access that memory to force the system
to dump the modified cache lines onto main memory. Now, the moment I
kick the DMA transfer alive, the DMA controller is going to go to memory
to fetch the stream of data, the only problem is, data isn’t there
because the buffer hasn’t been flushed from its cache lines. In fact,
some of that data may not even be in the cache yet, it may be lying in
the processor’s write buffers.
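
The scenario described above can be written down as a toy model. This is a sketch in plain C, not driver code: the ToyLine type and all function names are hypothetical, the model assumes the CPU already holds the cache line, and processor write buffers are left out. It contrasts a device read that bypasses the cache with one that is snooped, and shows what an explicit flush or a write-through policy would change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one memory word and the CPU cache line that shadows it. */
typedef struct {
    uint32_t memory;  /* contents of physical memory */
    uint32_t cached;  /* contents of the CPU cache line */
    bool     dirty;   /* true if the line is "modified" and memory is stale */
} ToyLine;

/* Writeback policy: the store lands in the cache only. */
static void cpu_write_writeback(ToyLine *l, uint32_t value)
{
    assert(l != NULL);
    l->cached = value;
    l->dirty = true;
}

/* Write-through policy: the store updates cache and memory together. */
static void cpu_write_writethrough(ToyLine *l, uint32_t value)
{
    assert(l != NULL);
    l->cached = value;
    l->memory = value;
    l->dirty = false;
}

/* Explicit flush: push a modified line out to memory. */
static void cpu_flush(ToyLine *l)
{
    if (l->dirty) {
        l->memory = l->cached;
        l->dirty = false;
    }
}

/* Device read that goes to memory without the CPU noticing. */
static uint32_t dma_read_unsnooped(const ToyLine *l)
{
    return l->memory;
}

/* Device read on a coherent platform: the snoop forces a writeback first. */
static uint32_t dma_read_snooped(ToyLine *l)
{
    cpu_flush(l);
    return l->memory;
}
```

In the model, dma_read_unsnooped() after cpu_write_writeback() returns the old memory contents unless cpu_flush() runs first, which is the failure described above. dma_read_snooped() returns the new data without any software flush, which is the behavior a cache-coherent platform would have to provide.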

But processor caches stay coherent through the MESI protocol, which
operates at hardware level and is independent of the OS. That protocol
operates through every processor snooping every transaction on the
front-side bus. But DMA controllers are masters on the PCI bus, not on the
front-side bus!
Hence, I see no way a DMA controller can be aware of the comings and
goings of the MESI protocol on the front-side bus.
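
For reference, the per-line state machine being discussed can be sketched as a small transition function. This is a simplified textbook version of MESI in plain C, not a description of any particular processor: the enum names, the mesi_next function and the others_share parameter are hypothetical, and details such as read-for-ownership transactions are folded into a single "snooped write" event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    MESI_INVALID, MESI_SHARED, MESI_EXCLUSIVE, MESI_MODIFIED
} MesiState;

typedef enum {
    EV_LOCAL_READ,   /* this CPU reads the line */
    EV_LOCAL_WRITE,  /* this CPU writes the line */
    EV_SNOOP_READ,   /* another bus master is seen reading the line */
    EV_SNOOP_WRITE   /* another bus master is seen writing the line */
} MesiEvent;

/* Returns the next state of one cache line. *writeback is set when the
   line holds modified data that must be written to memory first.
   others_share only matters for a local read of an invalid line. */
static MesiState mesi_next(MesiState s, MesiEvent e, bool others_share,
                           bool *writeback)
{
    assert(writeback != NULL);
    *writeback = false;
    switch (e) {
    case EV_LOCAL_READ:
        if (s == MESI_INVALID) {
            return others_share ? MESI_SHARED : MESI_EXCLUSIVE;
        }
        return s;
    case EV_LOCAL_WRITE:
        return MESI_MODIFIED;
    case EV_SNOOP_READ:
        if (s == MESI_MODIFIED) {
            *writeback = true;
        }
        return (s == MESI_INVALID) ? MESI_INVALID : MESI_SHARED;
    case EV_SNOOP_WRITE:
        if (s == MESI_MODIFIED) {
            *writeback = true;
        }
        return MESI_INVALID;
    }
    return s;
}
```

The two snoop events are the only way another master's access can change the line's state in this model. A transfer that never shows up as a snooped transaction leaves the state machine where it was, which is the concern raised above about masters sitting on the PCI bus.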

For example, take a look at the DDK documentation for
VideoPortStartDma(): it specifically says that the function flushes the
memory region in the host processor’s caches, then builds a
scatter-gather list and calls HwVidExecuteDma().

Alberto.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com]On Behalf Of Doug
Sent: Friday, November 14, 2003 11:05 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
AllocateCommonBuffer

The documentation does require it. While the page in the DDK help
labeled “Using Common-Buffer Bus-Master DMA” makes no mention of
allocating an MDL and passing it to KeFlushIoBuffers, the page “Flushing
Cached Data during DMA Operations” implies it if you mark the buffers as
CacheEnabled.

Problem in the DDK docs?

“Mark Roddy” wrote in message news:xxxxx@ntdev…
> 3) common buffers are ‘special’. They are platform-specific and designed
> to be used for DMA. The documentation in the DDK does not require that
> you flush common buffers.
>
> Note that on x86 platforms KeFlushIoBuffers is a NOP. Draw your own
> conclusions.
>



There’s a lot of functionality in a Miniport: it does most of the
non-time-critical functions of driving a graphics subsystem. Some people put
support for several different chips in the same piece of code, but even if
you only have one chip, your Miniport may end up being pretty big. Some of
the actual space is taken by tables. For example, every graphics driver
supports several resolutions and bit depths, and one must keep tables of
register settings that set up the chip for the corresponding video mode.
There are also tables with configuration and capability settings, and they
take space. You must handle initialization, capabilities, mode changes,
power management, multiple screens, resource management, you name it. You
must also manage the retrace interrupt. In WinXP there’s even new support
for DMA. BTW, Calvin, do you guys implement and use the new DMA calls that
WinXP added to the Miniport?

Alberto.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com]On Behalf Of Maxim S. Shatskih
Sent: Monday, November 17, 2003 10:14 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
AllocateCommonBuffer

Wow! Am I right that this huge amount of code is due to supporting all
videocard hardware models and maintaining the backward compatibility, so
that the newest binary can work with even the old hardware?

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

----- Original Message -----
From: Calvin Guan
To: Windows System Software Devs Interest List
Sent: Tuesday, November 18, 2003 4:02 AM
Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
AllocateCommonBuffer

Well, a video miniport is a lot of code -:).
Our Radeon x86 free build miniport (ati2mtag.sys) is more than 600k. The chk
build doesn’t fit on a floppy…

Calvin Guan, Software Developer xxxxx@nospam.ati.com
SW2D-Radeon NT Core Drivers
ATI Technologies Inc.
1 Commerce Valley Drive East
Markham, Ontario, Canada L3T 7X6
Tel: (905) 882-2600 Ext. 8654
Find a driver: http://www.ati.com/support/driver.html

> -----Original Message-----
> From: Maxim S. Shatskih [mailto:xxxxx@storagecraft.com]
> Sent: Monday, November 17, 2003 7:20 PM
> To: Windows System Software Devs Interest List
> Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
> AllocateCommonBuffer
>
> > Miniport. For example, look at the Permedia P3 sample in the DDK, the DMA
> > rendering is handled in the driver and not in the Miniport. There’s not
>
> Then why is nVidia’s miniport THIS huge (500KB or such)?
>
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> xxxxx@storagecraft.com
> http://www.storagecraft.com