XP MmUnmapIoSpace / MmMapIoSpace bug?

I have a driver that maps and unmaps memory above MAXMEM via MmMapIoSpace / MmUnmapIoSpace. On Intel processors I have not seen any problem, but on an AMD Sempron I have seen stale TLB entries (the virtual-to-physical translation the processor uses does not match the PTE). Using WinDbg I can poke around physical and virtual memory, and they are obviously not connected. Just as documented in the processor manuals, as soon as I rewrite an MTRR register or toggle the PGE bit in CR4, the virtual memory view updates and correctly reflects the physical memory.
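To make the symptom concrete, here is a minimal user-mode sketch of a TLB that caches translations independently of the page table. It is a toy model only: toy_mmu, toy_set_pte, toy_invlpg and toy_translate are hypothetical names invented for illustration, and nothing here reflects how any real processor or the XP memory manager is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGES 8u  /* size of the toy virtual address space, in pages */

/* Hypothetical model: one PTE and one cached TLB slot per virtual page. */
typedef struct {
    uint32_t pte[TOY_PAGES];        /* authoritative mapping: page -> frame */
    uint32_t tlb[TOY_PAGES];        /* cached copy of the translation       */
    bool     tlb_valid[TOY_PAGES];  /* true once the slot has been filled   */
} toy_mmu;

void toy_init(toy_mmu *m)
{
    for (unsigned i = 0; i < TOY_PAGES; i++) {
        m->pte[i] = 0;
        m->tlb[i] = 0;
        m->tlb_valid[i] = false;
    }
}

/* Rewrite the PTE WITHOUT touching the TLB, like a map/unmap path
 * that skips the invalidate. */
void toy_set_pte(toy_mmu *m, unsigned vpage, uint32_t frame)
{
    assert(vpage < TOY_PAGES);
    m->pte[vpage] = frame;
}

/* Model of a single-page invalidate (what invlpg does for one address). */
void toy_invlpg(toy_mmu *m, unsigned vpage)
{
    assert(vpage < TOY_PAGES);
    m->tlb_valid[vpage] = false;
}

/* A TLB hit returns the cached frame even if the PTE has changed since;
 * a miss walks the "page table" and fills the slot. */
uint32_t toy_translate(toy_mmu *m, unsigned vpage)
{
    assert(vpage < TOY_PAGES);
    if (!m->tlb_valid[vpage]) {
        m->tlb[vpage] = m->pte[vpage];
        m->tlb_valid[vpage] = true;
    }
    return m->tlb[vpage];
}
```

In this model, changing a PTE after the page has been touched leaves toy_translate returning the old frame until toy_invlpg is called for that page, which is the same shape as the mismatch described above.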

I did a lot of tracing through the XP implementations of MmMapIoSpace and MmUnmapIoSpace and found that when MmMapIoSpace is called with MmCached it does not invalidate the TLB entries. If it is called with MmUncached, it does invalidate the TLBs and caches via calls to KeFlushCurrentTb and KeInvalidateAllCaches.

My tracing through MmUnmapIoSpace showed that neither cached nor uncached mappings cause any kind of invalidate in that code path. I confirmed this with the code below. Without the block that explicitly invalidates the TLB entries, the access to *pLong after MmUnmapIoSpace does not fault; with the explicit invalidate, it does fault. The code is just an example, so the physical address is arbitrarily picked to be somewhere in DRAM (it is only read and its contents are unimportant).

{
    static unsigned long volatile *pLong;
    PHYSICAL_ADDRESS physicalAddress;
    unsigned long a, b;

    physicalAddress.LowPart = 0x00100000; // Arbitrary address somewhere in physical DRAM
    physicalAddress.HighPart = 0;

    KdPrint( ("LDUNLD: About to map physical memory\n") );
    pLong = (unsigned long volatile *) MmMapIoSpace(physicalAddress, 4096, MmCached);
    if (pLong != NULL)
    {
        KdPrint( ("LDUNLD: Memory successfully mapped\n") );
        a = *pLong; // Touch the page so a TLB entry is loaded
        b = a + 1;
        { // BLOCK OF CODE TO EXPLICITLY INVALIDATE TLB ENTRIES
            // Force the TLB entries to be invalidated before we use them
            // (32-bit x86 only: 4 KB pages, pointers fit in unsigned long)
            unsigned long page = (unsigned long) pLong & 0xfffff000;
            unsigned long endPage = (((unsigned long) pLong + (unsigned long) 4096) - 1) & 0xfffff000;

            for (; page <= endPage; page += 0x00001000)
            {
                __asm mov eax, page
                __asm invlpg [eax]
            }
        }
        MmUnmapIoSpace((PVOID)pLong, 4096);
        KdPrint( ("LDUNLD: Unmapped physical memory\n") );
        a = *pLong + 5; // Deliberate access after unmap: faults only if the TLB entry is gone
        b = a + 1;
    }
}
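The page-range arithmetic in the invalidate block can be checked on its own in user mode. The following is a minimal sketch, assuming 4 KB pages; for_each_page, count_page and PAGE_SIZE_BYTES are hypothetical names, and in a real driver the callback would be where the invlpg is issued. It also exits the loop by comparing against the last page rather than using `page <= endPage`, since the latter would wrap around and never terminate if the range ended in the topmost page of the address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumption: 4 KB pages */

/* Hypothetical callback type: invoked once per page-aligned address. */
typedef void (*page_fn)(uintptr_t page_va, void *ctx);

/* Visit every page overlapped by [va, va + length).
 * Returns the number of pages visited. */
size_t for_each_page(uintptr_t va, size_t length, page_fn fn, void *ctx)
{
    assert(length > 0);

    uintptr_t mask = ~(uintptr_t)(PAGE_SIZE_BYTES - 1);
    uintptr_t page = va & mask;                 /* first page in the range */
    uintptr_t last = (va + length - 1) & mask;  /* last page in the range  */
    size_t visited = 0;

    for (;;) {
        fn(page, ctx);
        visited++;
        if (page == last)   /* stop on equality so the top page cannot wrap */
            break;
        page += PAGE_SIZE_BYTES;
    }
    return visited;
}

/* Example callback: just counts how many pages were seen. */
void count_page(uintptr_t page_va, void *ctx)
{
    (void)page_va;
    ++*(size_t *)ctx;
}
```

A page-aligned 4096-byte mapping visits exactly one page, while the same length starting mid-page visits two, which matters if the wrapper is ever used with a base address that is not page aligned.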

I did “fix” my original problem by wrapping all MmMapIoSpace calls with code that explicitly invalidates the TLB just before the wrapper function returns. I’m not very happy with this because I was unable to root cause the problem and don’t understand why I have not seen this issue with Intel processors. My speculation at this point is that I have simply been lucky, with Intel processors naturally evicting TLB entries before they are reused (I’m still looking into it, but this speculation assumes that the AMD processor has a larger TLB). I suspect that most drivers use MmMapIoSpace / MmUnmapIoSpace only at load and unload time. I’m calling these very frequently to map virtual windows into a large area of physical memory on an as-needed basis (I don’t doubly map physical memory).

I would like to root cause this (or at least root cause why Intel processors don’t fail).

Has anyone run into a similar problem or know what the pertinent differences between Intel and AMD are?

I’m running a single-processor XP Embedded kernel with SP1. I searched here, on MSDN, and on Google and haven’t found anything.

wrote in message news:xxxxx@ntdev…
> I have a driver that maps and unmaps memory above MAXMEM via MmMapIoSpace / MmUnmapIoSpace. On Intel processors I have not
> seen any problem but with an AMD Sempron I have seen stale TLB entries (the virtual to physical mapping does not match the
> PTE).

The memory you’re mapping is not I/O space, strictly speaking.
So no wonder that MmMapIoSpace works incorrectly (the GIGO rule).

> Has anyone run into a similar problem or know what the pertinent differences between Intel and AMD are?

The first of them is simply better? :)
How much RAM do your systems have, and what is the value of MAXMEM?

–PA

> I did “fix” my original problem by wrapping all MmMapIoSpace calls with code
> that the explicit invalidates the TLB just before the return from the wrapper

From what I remember, MmUnmapIoSpace does not invalidate the TLB, but puts the
freed system PTEs on an “invalidate” list. This is for performance. When the
“invalidate” list gets too long, the whole TLB is invalidated and the PTEs
on that list are moved to the “free” list.

MmMapIoSpace only takes PTEs from the “free” list, which, according to the
above, only holds PTEs whose TLB entries have already been invalidated.
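
The scheme described above can be sketched as a small user-mode model. This is only an illustration of the recollection, not the actual XP memory manager: pte_pool, pool_map, pool_unmap, pool_flush_all and FLUSH_THRESHOLD are all hypothetical, and the fallback that forces a flush when no free PTE is left is an assumption added so the model is complete.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PTES        8  /* hypothetical pool size                        */
#define FLUSH_THRESHOLD 4  /* hypothetical "invalidate list too long" limit */

typedef enum { PTE_FREE, PTE_IN_USE, PTE_PENDING } pte_state;

typedef struct {
    pte_state state[NUM_PTES];
    bool      tlb_may_be_stale[NUM_PTES]; /* set on unmap, cleared by a flush */
    int       pending;                    /* length of the "invalidate" list  */
    int       full_flushes;               /* how many whole-TLB flushes ran   */
} pte_pool;

void pool_init(pte_pool *p)
{
    for (int i = 0; i < NUM_PTES; i++) {
        p->state[i] = PTE_FREE;
        p->tlb_may_be_stale[i] = false;
    }
    p->pending = 0;
    p->full_flushes = 0;
}

/* One whole-TLB flush: every pending PTE becomes safely reusable. */
void pool_flush_all(pte_pool *p)
{
    for (int i = 0; i < NUM_PTES; i++) {
        p->tlb_may_be_stale[i] = false;
        if (p->state[i] == PTE_PENDING)
            p->state[i] = PTE_FREE;
    }
    p->pending = 0;
    p->full_flushes++;
}

/* Map: take a PTE from the free list only. Returns its index, or -1. */
int pool_map(pte_pool *p)
{
    for (int pass = 0; pass < 2; pass++) {
        for (int i = 0; i < NUM_PTES; i++) {
            if (p->state[i] == PTE_FREE) {
                assert(!p->tlb_may_be_stale[i]); /* free implies flushed */
                p->state[i] = PTE_IN_USE;
                return i;
            }
        }
        if (p->pending == 0)
            break;
        pool_flush_all(p); /* assumption: flush early when the free list is empty */
    }
    return -1;
}

/* Unmap: defer the invalidate by parking the PTE on the pending list. */
void pool_unmap(pte_pool *p, int idx)
{
    assert(idx >= 0 && idx < NUM_PTES && p->state[idx] == PTE_IN_USE);
    p->state[idx] = PTE_PENDING;
    p->tlb_may_be_stale[idx] = true;
    if (++p->pending >= FLUSH_THRESHOLD)
        pool_flush_all(p);
}
```

In this model a just-unmapped PTE is never handed out again until a full flush has run, so a new mapping can never inherit a stale translation. The old virtual address, however, can still hit in the TLB until that flush happens, which would be consistent with the access after MmUnmapIoSpace not faulting in the test above.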

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com