Memory-mapped files with no handles

I’m currently looking into the way in which applications using memory-mapped files actually map data into memory. I’m using notepad as an example.

Here’s what I know. Normally, when a file is opened, there will be a handle within the process’s object table. However, notepad doesn’t create any handles in there for the file that it has open. This is confirmed by both the Sysinternals tools and some code that I’ve written.

However, there is a _FILE_OBJECT structure created for the file, which I’ve found by using the memory analysis software Volatility. In addition to the file object, there is also a _CONTROL_AREA structure for it, which I found using the WinDBG command ‘!ca 0’.

After opening the file and examining the VAD tree for notepad, I found the file was actually mapped into what the Sysinternals tool VMMap tells me is the heap.

So my question is this: without a handle in the object table, is it possible to locate the _FILE_OBJECT or _CONTROL_AREA without brute-force scanning of the pools (paged/nonpaged)? Is there some kind of kernel-controlled linked list of control areas?

I did suspect that something, maybe a section object, could be found in the kernel object database; however, I didn’t see anything in there that seemed helpful.

If anyone could offer any advice or insight into a possible solution, I’d appreciate it.

Thanks

The handle is created and then closed ASAP.

Just filter the create requests and maintain your own table of file objects.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

Thanks Maxim.

This is what I assumed was happening, and it explains why the file wasn’t appearing in the object table.

However, why does the _FILE_OBJECT still exist? And why does it have a _CONTROL_AREA still?

The handle has been closed, as the _FILE_OBJECT has zero open handles. However, it still has 16 reference/pointer counts, therefore it’s still being referenced by something. Is there any way of determining what those 16 somethings are?

Thanks

> The handle has been closed, as the _FILE_OBJECT has zero open handles. However, it still has 16 reference/pointer counts, therefore it’s still being referenced by something.

Correct. It is referenced by Mm/Cc internals, including the control area (pool tag MmCa).

It is fine and legal for the app to call CloseHandle immediately after MapViewOfFile; this does not kill the mapping.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com
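
A toy model of the two counters discussed above may help. It is plain Python, not kernel code: the class and method names (`KernelObject`, `open_handle`, and so on) are made up for illustration, and the real object manager is considerably more involved. The sketch only shows why an object can report zero handles and still stay alive while something (here, a stand-in for the control area) holds a pointer reference.

```python
class KernelObject:
    """Toy stand-in for an object header with two counters (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.handle_count = 0    # entries in process handle tables
        self.pointer_count = 0   # all references, handles included
        self.deleted = False

    def reference(self):
        self.pointer_count += 1

    def dereference(self):
        self.pointer_count -= 1
        if self.pointer_count == 0:
            self.deleted = True  # last reference gone: object is freed

    def open_handle(self):
        self.handle_count += 1
        self.reference()         # a handle also counts as a reference

    def close_handle(self):
        self.handle_count -= 1
        self.dereference()


file_obj = KernelObject("example.txt")
file_obj.open_handle()    # the application opens the file
file_obj.reference()      # the control area keeps its own reference
file_obj.close_handle()   # the application closes the handle after mapping

# Zero handles, one remaining reference, object still alive.
print(file_obj.handle_count, file_obj.pointer_count, file_obj.deleted)
```

In this model the object is only freed once the last pointer reference is dropped, which matches the observation of a _FILE_OBJECT with no handles but a non-zero pointer count.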

>It is fine and legal for the app to do CloseHandle immediately after MapViewOfFile, this does not kill the mapping.

The one case when it’s not a good idea is when a file is created with FILE_FLAG_DELETE_ON_CLOSE.

Also, I have a suspicion the handle being closed may affect page discard policy for the section. Those pages might be discarded more readily.

It’s kind of twisted, but this works just fine. Closing the handle deletes
the link, but the file system needs to know to not reclaim the disk space
until the last reference goes away.

Closing the handle has been the standard model in Windows for ages, so it
shouldn’t start discarding things more quickly. Note that, for example,
running processes don’t have any specific handles open to them.

-scott
OSR

>Closing the handle is the standard model in Windows for ages, so it shouldn’t start discarding things more quickly. Note that, for example, running processes don’t have any specific handles open to them.

Maybe that’s why a large file copy (used to) cause extreme thrashing of code pages? When those pages appear to be discarded almost instantly and the processes just stand still because of that.

>It’s kind of twisted, but this works just fine. Closing the handle deletes the link, but the file system needs to know to not reclaim the disk space until the last reference goes away.

There have been reports that for such a file section with a closed DELETE_ON_CLOSE handle, dirty pages may not be flushed to the file but quietly discarded, after which a page-in may bring back stale file contents. This may be fixed in the current OS version.

Regarding FILE_FLAG_DELETE_ON_CLOSE vs CreateFileMapping, there’s been a discussion in microsoft.public.win32.programmer.kernel in 2009.

Mr. Pavel Lebedinsky provided some guidance on that, which may or may not be current for the latest OS.

Basically at that point you’re at the mercy of the implementation of the
file system and how it decides to handle paging I/Os that arrive after the
cleanup. I could imagine that NTFS simply discards/zeroes them, seems
reasonable enough.

-scott
OSR

Not sure what caused that specific case. To the best of my knowledge, though,
the Mm doesn’t take into account the handle count on the stream backing the
section. The Mm really just knows about a single file object for the stream;
the only component that knows definitively how many total handles there are
to that stream across all file objects is the file system.

-scott
OSR

Thanks all for all the feedback. I’ve now got a better idea of how things are working, however I’ve still got a few questions.

I can understand that the file handle has been closed. However, as notepad.exe is using file-view mappings, I think I’m correct in saying that the contents of the file are actually mapped directly into the process address space.

After digging through the VAD tree for the process, I’ve identified the contents of the file, and they’re loaded within notepad’s heap allocation. I’ve looked through the object table, and there doesn’t appear to be a section object that corresponds to the original file data. Am I looking in the wrong place, or is there simply not going to be a section object?

Using WinDBG, I can see that there is still both a _FILE_OBJECT and a _CONTROL_AREA that correspond to the text file that was opened. I think that these are located within the paged pool.

Is it possible to access either of these objects from either the contents of the VAD tree, or through the object table for the process?

I can see all of the DLLs and the actual notepad.exe file mapped into the VADs as file mappings, with corresponding _FILE_OBJECTs, but the actual data read from the text file doesn’t seem to have any corresponding entry in those tables.

If you can provide any additional information, I’d appreciate it.

Thanks again.

Josh

I’m not sure Notepad ever uses file mapping for that. It sure opens the file with the most relaxed sharing option, and then probably just uses ReadFile to read it to heap-allocated memory. This is why it’s so slow to load large files.

Try dumping the VAD when notepad is loading a large text file. IIRC Notepad
memory maps the file, copies the data, and then unmaps the file, so the VAD
entry is transient.

You’ll still find the control area and file object in memory until there is
memory pressure, at which point the Mm will discard them. There is nothing in
this process that would definitively tie you back to the correct file object
or control area.

-scott
OSR

> Is it possible to access either of these objects from either the contents of the VAD tree

Why do you need this?


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

Thanks for the additional info.

Maxim: I want this information because I’m currently working on some introspection tools, and I’m trying to get a list of all files that are currently opened/accessed by a process.

Scott: If there is no way of getting to the file or section object from the process/VAD, will the objects be listed in the kernel’s object namespace? Is that its purpose?

I’ve noticed that the !ca command simply performs a scan of the pools looking for the tags. Is this the only mechanism for locating them? Are they not listed in the object namespace?

Thanks again.

Josh

If the process has a valid mapping, then there’s a VAD entry that points to
a Control Area. This Control Area in turn references a file object owned by
the file system. For example:

0: kd> !vad
VAD …
895a1df0 … Mapped Exe … \WINDOWS\system32\advapi32.dll

0: kd> dt nt!_mmvad 895a1df0 ControlArea
+0x018 ControlArea : 0x89ce5f28 _CONTROL_AREA

0: kd> !ca 0x89ce5f28

ControlArea @ 89ce5f28

File Object 89ce3a40 ModWriteCount 0 System Views 0
WritableRefs 0
Flags (a0) Image File

\WINDOWS\system32\advapi32.dll

If the process unmaps this, then there is no VAD entry and therefore there
is no mapping to the Control Area.

The “short as it can be” answer for what happens if the process wants to map
the file again, is really that the application has two choices:

  1. If it kept the Section handle around, it can simply call MapViewOfFile
    again. The Section handle is a handle to a Section Object, which in turn
    references the Control Area.

  2. If it didn’t keep the Section handle around, it needs to open the file to
    the File System again, which creates a new File Object. If the Control Area
    is still in memory, then the old File Object is still in memory, which means
    the File System structures for the file/stream won’t have been torn down yet
    (they’re referenced by the File Object). When the File System walks its
    internal tables, it locates the in-memory File Control Block (FCB) for the file.

The FCB has a structure called the SectionObjectPointers, which contains
pointers to the Control Areas (data and/or image) for the file. The File
System puts a pointer to this structure in
FileObject->SectionObjectPointers. The application now of course needs a
Section to map, so it calls CreateFileMapping again and specifies the File
handle. At this point the Mm can locate the Control Area
(FileObject->SectionObjectPointers->{DataSectionObject|ImageSectionObject})
and wire the application’s Section up to the correct data. Repeat Step 1.

-scott
OSR
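
The lookup described above can be sketched as a small model. This is illustrative Python, not kernel code: the class names mirror the structures named in the post, but every field, method and the `create_data_section` helper are assumptions made for the example, and FCB teardown, synchronization and image sections are not modelled.

```python
class ControlArea:
    def __init__(self, file_object):
        self.file_object = file_object      # the file object Mm holds on to


class SectionObjectPointers:
    """One per stream, owned by the FCB, shared by all its file objects."""

    def __init__(self):
        self.data_section_object = None
        self.image_section_object = None


class Fcb:
    def __init__(self, path):
        self.path = path
        self.section_object_pointers = SectionObjectPointers()


class FileObject:
    def __init__(self, fcb):
        # The file system points every file object at the FCB's structure.
        self.section_object_pointers = fcb.section_object_pointers


class FileSystem:
    def __init__(self):
        self.fcbs = {}                      # in-memory FCB table, by path

    def open(self, path):
        if path not in self.fcbs:
            self.fcbs[path] = Fcb(path)
        return FileObject(self.fcbs[path])


def create_data_section(file_object):
    """Return the stream's data control area, creating it on first use."""
    sop = file_object.section_object_pointers
    if sop.data_section_object is None:
        sop.data_section_object = ControlArea(file_object)
    return sop.data_section_object


fs = FileSystem()
first = fs.open(r"\example.txt")
ca1 = create_data_section(first)     # first CreateFileMapping
second = fs.open(r"\example.txt")    # file reopened later: new file object
ca2 = create_data_section(second)    # second CreateFileMapping

# Same control area, and it still references the original file object.
print(ca1 is ca2, ca2.file_object is first)
```

The point of the model is that nothing on the process side is needed to find the control area again: the path goes from the new file object through the shared SectionObjectPointers structure.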

You can’t safely access any of these structures (EPROCESS, the VAD tree, control areas, section objects etc.) from a driver. First, because you don’t have the necessary synchronization to do so. Second, because the layout of these structures can (and often does) change for each major release or service pack, and sometimes even for hotfixes.

In user mode you can obtain the names of all memory mapped files in a process using VirtualQueryEx + GetMappedFileName.

I don’t think there is a supported way to enumerate section or file handles in a process. If you are interested in a specific file, you can use the Restart Manager API to find out which processes are using it. That should detect both file/section handles and mapped views.
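
A rough sketch of the VirtualQueryEx + GetMappedFileName approach, written in Python with ctypes. The helper names (`file_backed`, `query_regions`, `mapped_name`) are made up for this example, the Windows-only part has not been exercised here and should be treated as an assumption, and a real tool would need error handling plus a process handle opened with at least PROCESS_QUERY_INFORMATION access. Note that GetMappedFileName returns device-form paths (for example \Device\HarddiskVolume…), not drive-letter paths.

```python
import ctypes
import sys

MEM_MAPPED = 0x40000      # region is a view of a data section
MEM_IMAGE = 0x1000000     # region is a view of an image section (EXE/DLL)


class MEMORY_BASIC_INFORMATION(ctypes.Structure):
    _fields_ = [
        ("BaseAddress", ctypes.c_void_p),
        ("AllocationBase", ctypes.c_void_p),
        ("AllocationProtect", ctypes.c_uint32),
        ("RegionSize", ctypes.c_size_t),
        ("State", ctypes.c_uint32),
        ("Protect", ctypes.c_uint32),
        ("Type", ctypes.c_uint32),
    ]


def file_backed(regions, name_of):
    """Keep (base, size, name) for regions that are section views with a name."""
    out = []
    for base, size, mem_type in regions:
        if mem_type in (MEM_MAPPED, MEM_IMAGE):
            name = name_of(base)
            if name:                      # pagefile-backed views have no name
                out.append((base, size, name))
    return out


def query_regions(process):
    """Walk a process address space with VirtualQueryEx (Windows only)."""
    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.VirtualQueryEx.restype = ctypes.c_size_t
    k32.VirtualQueryEx.argtypes = [
        ctypes.c_void_p, ctypes.c_void_p,
        ctypes.POINTER(MEMORY_BASIC_INFORMATION), ctypes.c_size_t]
    mbi = MEMORY_BASIC_INFORMATION()
    addr = 0
    while k32.VirtualQueryEx(process, addr, ctypes.byref(mbi),
                             ctypes.sizeof(mbi)):
        base = mbi.BaseAddress or 0
        yield (base, mbi.RegionSize, mbi.Type)
        addr = base + mbi.RegionSize      # continue at the next region


def mapped_name(process, base):
    """Return the backing file name for an address, or None (Windows only)."""
    psapi = ctypes.WinDLL("psapi", use_last_error=True)
    psapi.GetMappedFileNameW.restype = ctypes.c_uint32
    psapi.GetMappedFileNameW.argtypes = [
        ctypes.c_void_p, ctypes.c_void_p, ctypes.c_wchar_p, ctypes.c_uint32]
    buf = ctypes.create_unicode_buffer(1024)
    n = psapi.GetMappedFileNameW(process, base, buf, len(buf))
    return buf.value if n else None


if sys.platform == "win32":
    k32 = ctypes.WinDLL("kernel32")
    k32.GetCurrentProcess.restype = ctypes.c_void_p
    me = k32.GetCurrentProcess()          # pseudo-handle for this process
    regions = list(query_regions(me))
    for base, size, name in file_backed(regions, lambda b: mapped_name(me, b)):
        print(f"{base:#x} {size:#x} {name}")
```

This only reports views that are mapped at the moment of the walk. A transient mapping, such as the one described earlier for Notepad, will not show up once it has been unmapped.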

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@outlook.com
Sent: Wednesday, February 5, 2014 5:37 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Memory-mapped files with no handles

Thanks for the additional info.

Maxim: I want this information because I’m currently working on some introspection tools, and I’m trying to get a list of all files that are currently opened/accessed by a process.

Scott: If there is no way of getting to the file or section object from the process/VAD, will the objects be listed in the kernel’s object namespace? Is that its purpose?

I’ve noticed that the !ca command simply performs a scan of the pools looking for the tags. Is this the only mechanism for locating them? Are they not listed in the object namespace?

Can NtQuerySection also help?

“Pavel Lebedynskiy” wrote in message news:xxxxx@ntdev…
> In user mode you can obtain the names of all memory mapped files in a process using VirtualQueryEx + GetMappedFileName.

Thanks to everyone for your input!

Seems that my current approach isn’t going to work. I’ll have to have a rethink.