On XP (pre SP2, if it matters), when I use MmAllocatePagesForMdl along with MmMapLockedPagesSpecifyCache to map a large (50MB) buffer into application space, I find that my system’s available physical memory takes two 50MB ‘hits’. Prior to mapping the buffer into application space, physical memory is reduced by 50MB. Later, when the buffer is mapped into user space, physical memory is again reduced by 50MB.
The design goal is a 50MB buffer that cannot be swapped out, is shared between two processes, and eliminates disk I/O after the data is initially read into it.
My call to MmAllocatePagesForMdl looks like:
pMdl = MmAllocatePagesForMdl(
           (PHYSICAL_ADDRESS)LowAddress,    // 0
           (PHYSICAL_ADDRESS)HighAddress,   // -1
           (PHYSICAL_ADDRESS)ByteSkip,      // 0
           (SIZE_T)RequestedBytes);         // 50MB
And my call to MmMapLockedPagesSpecifyCache looks like:
userAddr = MmMapLockedPagesSpecifyCache(
               pMdl,               // IN PMDL MemoryDescriptorList
               UserMode,           // IN KPROCESSOR_MODE AccessMode
               MmCached,           // IN MEMORY_CACHING_TYPE CacheType
               NULL,               // IN PVOID BaseAddress
               FALSE,              // IN ULONG BugCheckOnFailure
               HighPagePriority);  // IN MM_PAGE_PRIORITY Priority
Is there a better way to achieve my design goals? Can I tweak how these interfaces are being used to improve the system’s memory management of this buffer?
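For context, here is a minimal sketch of the whole driver-side sequence, including the cleanup calls the snippets above omit. The helper names and globals are mine, not from the original post, and the sketch assumes it runs at PASSIVE_LEVEL in the context of the process that should receive the mapping. Two documented details are worth noting: MmAllocatePagesForMdl can succeed while returning an MDL that describes fewer bytes than requested, and a UserMode call to MmMapLockedPagesSpecifyCache raises an exception on failure when BugCheckOnFailure is FALSE, so it must be wrapped in __try/__except. NormalPagePriority is used here per the suggestion in the reply.

```c
#include <ntddk.h>

#define BUFFER_BYTES (50 * 1024 * 1024)

static PMDL  g_Mdl    = NULL;
static PVOID g_UserVa = NULL;

/* Runs at PASSIVE_LEVEL, in the context of the process that should
   receive the mapping (e.g. in a device-control dispatch routine). */
NTSTATUS AllocateAndMapSharedBuffer(VOID)
{
    PHYSICAL_ADDRESS low, high, skip;

    low.QuadPart  = 0;
    high.QuadPart = (LONGLONG)-1;   /* any physical page */
    skip.QuadPart = 0;

    g_Mdl = MmAllocatePagesForMdl(low, high, skip, BUFFER_BYTES);
    if (g_Mdl == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    /* The call can succeed but describe fewer bytes than requested. */
    if (MmGetMdlByteCount(g_Mdl) < BUFFER_BYTES) {
        MmFreePagesFromMdl(g_Mdl);
        ExFreePool(g_Mdl);
        g_Mdl = NULL;
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    /* A UserMode mapping raises an exception on failure when
       BugCheckOnFailure is FALSE, so guard it. */
    __try {
        g_UserVa = MmMapLockedPagesSpecifyCache(g_Mdl, UserMode, MmCached,
                                                NULL, FALSE,
                                                NormalPagePriority);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        g_UserVa = NULL;
    }

    if (g_UserVa == NULL) {
        MmFreePagesFromMdl(g_Mdl);
        ExFreePool(g_Mdl);
        g_Mdl = NULL;
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    return STATUS_SUCCESS;
}

/* Must be called in the context of the process that owns the mapping. */
VOID UnmapAndFreeSharedBuffer(VOID)
{
    if (g_UserVa != NULL) {
        MmUnmapLockedPages(g_UserVa, g_Mdl);
        g_UserVa = NULL;
    }
    if (g_Mdl != NULL) {
        MmFreePagesFromMdl(g_Mdl);
        ExFreePool(g_Mdl);
        g_Mdl = NULL;
    }
}
```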
We have tried a memory mapped file allocated in user space, but locking it down did not actually prevent it from being swapped out in low resource conditions.
Your question’s not that clear… But it DOES look like you’re using these DDIs correctly (except for the HighPagePriority, which I’d personally rather see specified as NormalPagePriority, but that’s not relevant to the problem).
Are you implying that you believe MmMapLockedPagesSpecifyCache is allocating 50MB of memory?
Some first-level questions:
“makes two 50MB ‘hits’” as determined by what? What’s leading you to believe that 100MB of memory gets allocated, if that’s what you mean?
You’ve checked that the MDL contents are the same before and after the call to MmMapLockedPagesSpecifyCache, right? Assuming this is the case, it’s the same 50MB buffer…
Regarding:
It took me a while to understand what I *think* you’re saying here… You tried writing code in user mode that memory mapped a file, and you locked it with whatever the user-mode API is that supposedly locks memory, but you discovered that the memory was paged out in certain conditions… Is that what you’re saying? Cuz that IS the way it works. “Locking” pages from user mode merely locks the pages into your working set; it doesn’t prevent them from being paged if the process gets paged.
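To make the working-set point concrete, here is a hedged user-mode sketch; the sizes and the extra headroom are illustrative, not from the original post. VirtualLock is the user-mode “lock” in question, and it fails unless the process’s minimum working set is large enough to cover the locked region, which is why the working set is grown first:

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T bytes = 50 * 1024 * 1024;

    /* VirtualLock fails unless the minimum working set is large
       enough to hold the locked region, so grow it first.
       The headroom values here are illustrative. */
    if (!SetProcessWorkingSetSize(GetCurrentProcess(),
                                  bytes + 16 * 1024 * 1024,
                                  bytes + 32 * 1024 * 1024)) {
        fprintf(stderr, "SetProcessWorkingSetSize: %lu\n", GetLastError());
        return 1;
    }

    void *p = VirtualAlloc(NULL, bytes, MEM_COMMIT | MEM_RESERVE,
                           PAGE_READWRITE);
    if (p == NULL || !VirtualLock(p, bytes)) {
        fprintf(stderr, "alloc/lock failed: %lu\n", GetLastError());
        return 1;
    }

    /* "Locked" here only means resident while this process's working
       set is resident; if the whole process is outswapped under
       memory pressure, these pages can still leave RAM -- which is
       the behavior described in the thread. */

    VirtualUnlock(p, bytes);
    VirtualFree(p, 0, MEM_RELEASE);
    return 0;
}
```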
***
I’d suggest that what’s probably happening is that you’re misreading or misinterpreting some statistic that’s displayed somewhere. If that’s the case, don’t worry about it and go about your business…
Your question’s not that clear… But it DOES look like you’re using these
DDIs correctly (except for the HighPagePriority, which I’d personally rather
see specified as NormalPagePriority, but that’s not relevant to the problem).
Are you implying that you believe MmMapLockedPagesSpecifyCache is allocating
50MB of memory?
BRS - No. It appears to me that the end result of this call causes the OS to ‘reserve’ another 50MB of memory, so that it is no longer available to other processes.
Some first-level questions:
“makes two 50MB ‘hits’” as determined by what? What’s leading you to
believe that 100MB of memory gets allocated, if that’s what you mean?
BRS - As reported by Task Manager’s Physical Memory Available statistic and calls in user space to GlobalMemoryStatus().
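A hedged sketch of that measurement, using GlobalMemoryStatusEx rather than the older GlobalMemoryStatus (which caps its values at 4GB); where the sampling calls go relative to the driver operation is illustrative:

```c
#include <windows.h>
#include <stdio.h>

/* Sample available physical memory -- the same counter Task Manager
   shows as available physical memory. */
static DWORDLONG AvailPhysBytes(void)
{
    MEMORYSTATUSEX ms;
    ms.dwLength = sizeof(ms);
    GlobalMemoryStatusEx(&ms);
    return ms.ullAvailPhys;
}

int main(void)
{
    DWORDLONG before = AvailPhysBytes();

    /* ... trigger the driver operation under test here, e.g. the
       IOCTL that performs the MmMapLockedPagesSpecifyCache call ... */

    DWORDLONG after = AvailPhysBytes();
    printf("available physical memory dropped by %llu MB\n",
           (unsigned long long)(before - after) / (1024 * 1024));
    return 0;
}
```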
You’ve checked that the MDL contents are the same before and after the call to
MmMapLockedPagesSpecifyCache, right? Assuming this is the case, it’s the same
50MB buffer…
BRS - I will check this. I only allocate a single 50MB buffer, so I’m sure it’s the same buffer.
Regarding:
It took me a while to understand what I *think* you’re saying here… You tried
writing code in user mode that memory mapped a file, and you locked it with
whatever the user-mode API is that supposedly locks memory, but you discovered
that the memory was paged out in certain conditions… Is that what you’re
saying? Cuz that IS the way it works. “Locking” pages from user mode merely
locks the pages into your working set; it doesn’t prevent them from being paged
if the process gets paged.
BRS - Yes, this is what I was trying to say.
***
I’d suggest that what’s probably happening is that you’re misreading or
misinterpreting some statistic that’s displayed somewhere. If that’s the case,
don’t worry about it and go about your business…
BRS - The statistics that we see do correlate with the poor performance that we are seeing, but may not tell the whole story.
BRS - What we are finding is that the process that writes into the memory buffer sometimes takes much, much longer to complete its job.
BRS - If the system isn’t low on memory, this process usually completes in 4 to 5 seconds. When we are low on memory, this process can take upwards of minutes.
BRS - I am going to investigate whether the MmMapLockedPagesSpecifyCache call itself is what takes so long.
BRS - Perhaps the user code that operates on the buffer is getting swapped in and out.
One thing that’s not clear to me is if you’re consistently talking about physical memory or virtual memory space.
MmAllocatePagesForMdl will allocate the 50MB of physical memory. Assuming the call succeeds, these pages will be within the physical address range you’ve requested, non-pageable, zeroed when they’re handed to you, and described by the MDL. (Note that the call can also succeed while returning fewer pages than you asked for, so check the MDL’s byte count.)
They will not, however, be mapped into any virtual address space. At this point, you’ve just got the memory.
Calling MmMapLockedPagesSpecifyCache as you have called it will take the buffer comprising the physical pages described by the MDL you pass in, and map that buffer into the user virtual address of the current process. To do this, it *will* allocate storage for the data structures necessary to track this virtual address space, but that’s NOTHING even remotely like 50MB.
While I am sooooo not a user-mode guy, if you haven’t already done so you might want to ensure that you’re adjusting your working set size/quota appropriately. Depending on what else your app does, if there’s memory pressure and 50MB of the process working set is not pageable, you *could* certainly see a lot of working set trimming and paging. But I suspect that there would have to be a *lot* of memory pressure to cause the horrible performance you’ve described.
ANYhow, maybe that’ll give you some ideas. At the very least, you know you’re not making some gross error in calling the DDIs shown in your post…
Thank you for your input here. I believe that our use of .NET and our mix of managed and unmanaged code is somehow responsible for the extra 50MB memory allocation. I will refrain from additional speculation until I know more.
I ran a test, and when the memory driver mapped the 50MB buffer into a simple MFC application created with Dev Studio 6, the second 50MB memory allocation did not occur.