RE: Looking for a WDM DMA Sample - platforms that translate DMA addresses

Below, you ask “Has anyone seen a real life example that a physical
address that is from MmGetPhysicalAddress of a contiguous system memory
different from return value of the MapTransfer?”

My answer is yes, both on x86 and other platforms. I’ve personally
written HALs that made Macintoshes run NT, back in the days when we
supported that sort of thing. All of those machines remap physical
address space both on the way to the bus and, differently, on the way
back again. The fundamental issue is that the PowerPC, along with
Alpha, MIPS, IA-64 and others, has no real I/O space. So these platforms
map bus I/O into memory on the processor side. This tends to jumble the
address space requirements enough that some platforms will also
translate physical addresses of memory when viewed from the bus. If
you’re really curious about this, you can find a justification of the
whole thing in the old PReP spec from IBM, Motorola and Apple. It got
even more complicated when they evolved PReP into CHRP.

This technique isn’t limited to dead processor architectures, either. A
few years ago, Corollary built a Pentium-based machine that had many
root PCI busses. In order to solve the limited I/O space issues that
come with that, they mapped each root bus’s 16-bit I/O space into
separate memory ranges. Along with this, they re-mapped some of the DMA
ranges. (I did some maintenance on that HAL, though I didn’t write it.)
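
As a rough illustration of mapping each root bus's 16-bit I/O space into
its own memory range, here is a toy C sketch. The window base, window
size, bus count and the function name io_port_to_cpu_address are all
invented for this example; they are not taken from the Corollary HAL or
any real machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: every root PCI bus gets its own 64 KB window in
 * processor memory space that stands in for that bus's 16-bit I/O space.
 * All three constants are assumptions made up for illustration. */
#define IO_WINDOW_BASE 0xF0000000u /* assumed start of the first window  */
#define IO_WINDOW_SIZE 0x00010000u /* 64 KB = one full 16-bit I/O space  */
#define ROOT_BUS_COUNT 8u          /* assumed number of root PCI busses  */

/* Translate (root bus, 16-bit port) into the processor-relative physical
 * address at which that port would appear in this toy layout. */
uint32_t io_port_to_cpu_address(uint32_t root_bus, uint16_t port)
{
    assert(root_bus < ROOT_BUS_COUNT);
    return IO_WINDOW_BASE + root_bus * IO_WINDOW_SIZE + port;
}
```

In this sketch the same port number on two different root busses lands
at two different processor addresses, which is the property that removes
the 64 KB limit shared by all busses.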

Next, look at IA-64 (also known as EPIC, or Itanic). There is no real
I/O space on that platform. The processor tries to fake it well,
providing I/O instructions and an internal register that tells the
processor where to find the “I/O” range in memory. But some high-end
IA-64 machines have many root PCI busses, overwhelming the processor’s
ability to fake the existence of I/O space. Again, along with that
comes remapping of the DMA ranges.

All of these made up the reasons why I added the _DMA object to ACPI
2.0, allowing platform firmware to specify exactly how main memory
addresses are translated into bus-relative address spaces.

Finally, just consider machines that have broken chipsets. There exist
chipsets that can’t address all of memory from a particular bus, even if
your device is capable of generating 64-bit addresses. Do you want to
understand the hacks for each chipset in every driver? Or do you want
to use the existing APIs and let the OS contain the hacks?
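
One kind of hack the OS can hide is bouncing a transfer through memory
the bus can actually reach. The following is a toy C model of that
decision only, under the assumption that the bus cannot address memory
at or above some limit. The type toy_dma_adapter and the function
toy_map_transfer are hypothetical names for this sketch; this is not the
Windows DMA API and not how any particular HAL is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Toy description of a bus with a limited address reach (all fields are
 * assumptions for illustration). */
typedef struct {
    uint64_t bus_limit;   /* first physical address the bus cannot reach */
    uint64_t bounce_base; /* physical address of a reserved low buffer   */
    uint64_t bounce_size; /* size of that reserved buffer in bytes       */
} toy_dma_adapter;

/* Return the address the device should be programmed with for a
 * transfer of `len` bytes at physical address `pa`. If the buffer is
 * not fully reachable, the low bounce buffer's address is returned
 * instead and *bounced is set, meaning the data has to be copied
 * through that buffer. */
uint64_t toy_map_transfer(const toy_dma_adapter *a, uint64_t pa,
                          uint64_t len, int *bounced)
{
    assert(len > 0 && len <= a->bounce_size);
    assert(a->bounce_base + a->bounce_size <= a->bus_limit);

    if (pa <= a->bus_limit - len) { /* whole buffer lies below the limit */
        *bounced = 0;
        return pa;
    }
    *bounced = 1;
    return a->bounce_base;
}
```

A driver that programs its device with the raw physical address would
get the second case wrong; a driver that asks the OS for the address to
use never has to know the limit exists.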

Jake Oshins
(Former HAL Guy)
Windows NT Kernel Group

P.S. Bi, you also misunderstand the role of the HAL. It’s not the
microkernel. It’s just something that was meant to abstract the
architecture of specific motherboards. (There is much misleading text
about the HAL in the world, some of it even written by well-meaning but
misguided Microsoft employees.) In fact, most of what used to be in the
HAL has been moved into the PnP subsystem and associated bus drivers.
This is why you should use the PnP interfaces, rather than the old HAL
APIs. The HAL APIs may be broken on some platforms.

This posting is provided “AS IS” with no warranties, and confers no
rights.

-----Original Message-----
Subject: Re: Looking for a WDM DMA Sample
From: Bi Chen
Date: Tue, 1 Oct 2002 19:17:48 -0700
X-Message-Number: 29

Hi, Guys.

I’m just putting this out there to check whether my understanding of
the issue is correct. I’d appreciate being corrected if anything I post
here is wrong.

Peter and others are right that in NT/2k/XP one should use the DMA
functions supported by the HAL/bus driver. The NT family OSes have a
sort of microkernel architecture, and the microkernel is the HAL (maybe
with some other parts). The microkernel has knowledge of the CPU
architecture, northbridge, bus, etc. and presents a
platform-independent view to the rest of the kernel. Nevertheless, I
don’t think the NT family OSes are pure microkernel OSes, since the VMM
must know about the CPU’s MMU, which is not in the HAL. I could be
wrong, since all I know of the HAL API is from the DDK.

Conceivably, some northbridge or its equivalent could add a
displacement to the physical address when an address generated on the
(front-side) system bus is destined for a peripheral bus, and subtract
the displacement when a R/W address is issued by a device. I have not
seen anything like that and wonder why anyone would want to do that.

A northbridge is essentially a crossbar router that routes R/W requests
based on the address generated by the CPU or a device. For example, if
the address falls into the window of the X-bus, the R/W request is
routed to the southbridge. If the address falls into the window of
system memory, ROM/flash, or AGP, it is routed correspondingly.

In the absence of a bus bridge that translates addresses coming from
upstream or downstream as described above, the entire system physical
address space is essentially flat. A system address can be used by a
device if it is a positive-decoding bus device such as a PCI device
(positive decoding means that if an address falls into a device’s
window, the device claims the transaction).

Serial buses are not included in this discussion, since the host
hub/bridge does the job of converting framed messages into proper
addresses destined for the northbridge.

I don’t think the issue has anything to do with the CPU core, where the
MMU resides.

Has anyone seen a real life example that a physical address that is
from MmGetPhysicalAddress of a contiguous system memory different from
return value of the MapTransfer (IoMapTransfer on NT 4.0) of the same
memory (I bet someone did)? On what kind of bus (or what platform)?

Thanks

Bi

The link between re-mapping I/O and MMI/O usually comes about because
the platform designer is trying to make everything (including I/O and
MMI/O) available below 32-bits, while also making RAM start at physical
address zero.

The thought process usually goes like this:

  1. I have to put RAM at address 0 in order to make legacy code work.
  2. I have to map in I/O space somewhere in processor memory address
    space - pick some high number.
  3. I have to map in MMI/O space somewhere in processor memory address
    space - pick some other high number.
  4. I have to support old ISA MMI/O devices which expect to live at low
    bus-relative addresses - so build a translation between
    processor-relative physical addresses and bus-relative physical
    addresses. This makes ISA devices appear to the processor at high
    physical addresses rather than the low bus-relative ones.
  5. Now, since the ISA devices are decoding the low parts of
    bus-relative physical address space using the same physical addresses at
    which the processor is decoding RAM, create a bus-relative translation
    that allows a busmaster to read and write low physical RAM at high
    bus-relative addresses.
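
The five steps above can be sketched as a toy address map in C. Every
number and name below is an assumption chosen for illustration (1 GB of
RAM, one window each for I/O, MMI/O and busmaster access to RAM); no
specific platform is being described.

```c
#include <assert.h>
#include <stdint.h>

/* Invented example layout, one constant per step in the list. */
#define RAM_SIZE         0x40000000u /* step 1: 1 GB of RAM at CPU address 0 */
#define CPU_IO_WINDOW    0x80000000u /* step 2: bus I/O space as the CPU sees it */
#define CPU_MMIO_WINDOW  0xC0000000u /* step 3: bus memory space as the CPU sees it */
#define MMIO_WINDOW_SIZE 0x40000000u /* size of that MMI/O window */
#define BUS_RAM_WINDOW   0x80000000u /* step 5: RAM as a busmaster sees it */

/* Step 2: a bus I/O port shows up at a high processor address. */
uint32_t bus_io_to_cpu(uint16_t port)
{
    return CPU_IO_WINDOW + port;
}

/* Step 4: a device decoding a low bus-relative memory address appears
 * to the processor at a high physical address. */
uint32_t bus_mmio_to_cpu(uint32_t bus_addr)
{
    assert(bus_addr < MMIO_WINDOW_SIZE);
    return CPU_MMIO_WINDOW + bus_addr;
}

/* Step 5: low physical RAM is reached by a busmaster at a high
 * bus-relative address. This is the value a device must be given. */
uint32_t cpu_ram_to_bus(uint32_t cpu_addr)
{
    assert(cpu_addr < RAM_SIZE);
    return BUS_RAM_WINDOW + cpu_addr;
}

/* Inverse of step 5: what RAM location a busmaster access lands on. */
uint32_t bus_to_cpu_ram(uint32_t bus_addr)
{
    assert(bus_addr >= BUS_RAM_WINDOW);
    assert(bus_addr - BUS_RAM_WINDOW < RAM_SIZE);
    return bus_addr - BUS_RAM_WINDOW;
}
```

In this layout a buffer at processor address 0x00100000 has to be handed
to the device as 0x80100000, so the processor-relative physical address
and the bus-relative address for the same memory differ.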

I’m not advocating this sort of scheme. I’m just saying that it exists.
And it has the interesting property that RAM can be addressed
contiguously by the processor. This seemed important to some designs,
so they chose it over the fantastically complex RAM-remapping scheme
that we know and love in today’s PCs. (The PC scheme creates lots of
holes in RAM, but it keeps the bus-relative physical addresses equal to
the processor-relative physical addresses.)

Jake Oshins
Windows NT Kernel Group


-----Original Message-----
Subject: RE: Looking for a WDM DMA Sample - platforms that translate
DMA addresses
From: Bi Chen
Date: Wed, 2 Oct 2002 10:31:36 -0700
X-Message-Number: 14

Thanks, Jake.

I am glad that I could get some insight into the NT OS from Microsoft
people.

One more question: if a platform has to emulate IOIO (port I/O) space,
and that leads to some bus address translation, I assume the
translation applies only to IOIO space. I can understand that. However,
as far as MMIO (memory-mapped I/O) space is concerned, is there a need
to translate bus addresses in the 32-bit-only case, barring chipsets
that set a fixed mapping window for IOIO space?

Thanks again.

Bi

-----Original Message-----
From: Jake Oshins [mailto:xxxxx@windows.microsoft.com]
Sent: Wednesday, October 02, 2002 10:00 AM
To: NT Developers Interest List
Subject: [ntdev] RE: Looking for a WDM DMA Sample - platforms that
translate DMA addresses
