Implementing a growable region of memory.

In user mode I would normally implement this by reserving the desired maximum size of address space up front, then incrementally committing as necessary to grow, i.e.:

Base = VirtualAlloc(NULL, 0x200000 /* Max size */, MEM_RESERVE, PAGE_READWRITE);

VirtualAlloc((PUCHAR)Base + CurrentSize, 0x1000 /* Grow size */, MEM_COMMIT, PAGE_READWRITE);
CurrentSize += 0x1000;

However, I’m struggling to understand the correct way to accomplish this in kernel mode. My thinking is that it should be fundamentally similar: grab a big chunk of system address space, then, to grow, allocate some physical pages for backing and map them (similar to how one would do it with AWE).

The technique I’m currently using entails:

  • Allocating system address space via MmAllocateMappingAddress
  • Growing via MmAllocatePagesForMdlEx followed by MmMapLockedPagesWithReservedMapping
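
For concreteness, here is a simplified sketch of that technique as I’m using it (the sizes, pool tag, and variable names are illustrative, and error handling is elided):

    // One-time reservation of system address space (at driver init).
    PVOID Base = MmAllocateMappingAddress(0x200000 /* max size */, 'worG');
    SIZE_T CurrentSize = 0;

    // Grow path: allocate backing pages, then map them at the current end.
    PHYSICAL_ADDRESS Low = { 0 }, High, Skip = { 0 };
    High.QuadPart = MAXULONG64; // any physical page will do
    PMDL Mdl = MmAllocatePagesForMdlEx(Low, High, Skip,
                                       0x1000 /* grow size */, MmCached,
                                       MM_ALLOCATE_FULLY_REQUIRED);
    // Note: the docs say the first argument must be the base returned by
    // MmAllocateMappingAddress; passing an offset address works for me regardless.
    PVOID Va = MmMapLockedPagesWithReservedMapping(
                   (PUCHAR)Base + CurrentSize, 'worG', Mdl, MmCached);
    CurrentSize += 0x1000;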

This does work, however:

  • These APIs feel inappropriate for this use (based on both the interface and the docs), and more like what one would use to, e.g., allocate a long-lived pool of DMA buffers in an actual device driver. Despite trawling the reference docs, however, I couldn’t find anything else.
  • While researching, I came across an old thread here on ntdev in which the poster was trying to do the same thing. He noticed (as I did) that the documentation for MmMapLockedPagesWithReservedMapping specifies that the base address must be the base of the allocation (as returned by MmAllocateMappingAddress), which thwarted his attempt; he reported a bugcheck when passing an offset address. The problem is that mine works, which suggests I’m doing something wrong. Driver Verifier doesn’t report anything, and nothing is blowing up.
  • The biggest issue: from what I understand from the docs, each allocation made via MmAllocatePagesForMdlEx must be freed by passing the exact same MDL to MmFreePagesFromMdl, which in this case would mean holding onto a potentially very large number of MDLs. Would it instead be possible, at teardown time, to simply build one MDL describing the entire mapped region (since the PFN information previously held in the individual MDLs still exists in the PTEs) and free that?

notes:

  • The lifetime of this region matches that of the driver itself (it is effectively global).
  • Only grows; never shrinks.
  • Being virtually contiguous is a requirement; physical contiguity is not required.
  • Private to the driver; does not need to be shared.
  • Must be backed by nonpaged memory (due to IRQL requirements).
  • NDIS filter driver.

Thanks,

Matt.

The remarks section of the MmMapLockedPagesWithReservedMapping doc here: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-mmmaplockedpageswithreservedmapping#remarks seems to directly address your situation. Note the requirement to use IoAllocateMdl and MmProbeAndLockPages.

It does also mention (in the parameter description) that one can use MmAllocatePagesForMdlEx, and in fact this is what Peter recommends in the cited thread.

Actually, this technique does not work; I assumed that since mapping at an offset address worked, arbitrary mappings would be allowed. That does not appear to be the case: there is indeed a 1:1 correspondence between the reserved region and the supplied MDL.

This also appears to be the case for MmProbeAndLockPages, which was another approach I was considering: allocate pageable memory and incrementally lock it.

After some conferring with the NT5 source, Windows Internals, and some reverse engineering, I managed to find a way to accomplish this.

There’s a flag for MmAllocateMappingAddressEx (which at first I took to be undocumented) that causes the size of the region described by the supplied MDL, rather than the allocation size, to be used when interrogating the PTEs to determine whether they’re valid (i.e., whether an existing translation/mapping is present).

Later I discovered it’s just not documented on MSDN; it’s right there in the header, named MM_MAPPING_ADDRESS_DIVISIBLE.
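
In code, the only change from my earlier technique is the reservation call (tag and size are placeholders):

    // Reserve with MM_MAPPING_ADDRESS_DIVISIBLE so that sub-ranges of the
    // reservation can later be mapped individually at offset addresses.
    PVOID Base = MmAllocateMappingAddressEx(0x200000 /* max size */, 'worG',
                                            MM_MAPPING_ADDRESS_DIVISIBLE);

The grow path (MmAllocatePagesForMdlEx followed by MmMapLockedPagesWithReservedMapping at Base + CurrentSize) stays as before.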

Some trivia from looking into this: in NT5, the size and tag supplied by the caller are encoded within the page table itself, using two prefixed PTE entries: one for the size, one for the tag.

In recent kernels it appears to be stored externally, in a regular pool-allocated linked list.