Reading image files as data

Hello,

If my mini-filter performs a cached read on an image file (PE executable) and the same image file is subsequently executed, will the two share the same pages in memory? I understand that the cache manager control data structures may be duplicated and that some pages may be copied on write while relocations are applied. But what about the rest of the pages? Will they share the same physical pages? BTW, I am not talking about the case where the image file has just been compiled and is immediately executed, in which case there may be two physical copies.

Other related questions.

  1. Does it depend on whether the image file is read as data first or executed first, i.e. on whether the DataSectionObject or the ImageSectionObject is created first?
  2. Is the behavior dependent on the Windows version, i.e. pre-Vista the pages may not be shared, but on Vista and above they will be shared?

Thanks.
-Prasad

When a binary is loaded as an executable, it will be mapped differently. It
is mapped according to the sections present in it.

See what SEC_IMAGE does.

> —
> NTFSD is sponsored by OSR
>
> OSR is hiring!! Info at http://www.osr.com/careers
>
> For our schedule of debugging and file system seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>

Hi Amritanshu,

Thanks for the response.

Yes, that part is understood. However, that’s just the virtual mapping. My question is more about physical page sharing. I mean, even if the image is mapped differently during execution than when it is read as data, it can still share the same physical memory as long as the pages have not been copied on write.

I got some data points from other forum threads indicating that this sharing is possible, and I wanted to validate that. Hence my questions.

Thanks.
-Prasad

This is not an authoritative answer, but I think it (sharing pages between
the data and image views) might not be possible.

- When you specify that a file be loaded as an IMAGE, each file section
(subsection in memory) must be loaded on a page-aligned boundary (0x1000),
but on disk the file sections sit on sector-aligned boundaries. Since both
mappings (data and image) are contiguous, I think the contents of the pages
will not all be the same.

- Most executable file formats involve a certain amount of relocation. If
the image misses its preferred load address, on XP and Server 2003 this
results in copy-on-write (COW) and reduces the number of pages shared across
processes. With ASLR on Vista/Win7, I guess Microsoft would have worked
around the COW aspect, but there too the IMAGE view of the file and the DATA
view of the file will not be the same.

Unless the OS actually keeps content-based track of pages, it is not
apparent to me how this could happen.
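
The alignment argument in the first point can be illustrated with a little arithmetic. The sketch below is only a model: the section table is made up (it assumes a file alignment of 0x200 and a section alignment of 0x1000), and the helper names `same_page_layout` and `data_view_page` are hypothetical. It compares where each section starts within a 4 KB page in a flat data mapping and in an image mapping.

```python
PAGE = 0x1000  # 4 KB page size

# Hypothetical section table: (name, raw file offset, raw size, image RVA).
# Assumes FileAlignment = 0x200 and SectionAlignment = 0x1000.
SECTIONS = [
    (".text",  0x0400, 0x1A00, 0x1000),
    (".rdata", 0x1E00, 0x0800, 0x3000),
    (".data",  0x2600, 0x0200, 0x4000),
]


def same_page_layout(raw_offset, rva):
    """True if the section starts at the same offset within a page in both views."""
    return raw_offset % PAGE == rva % PAGE


def data_view_page(raw_offset):
    """Index of the page holding this file offset in a flat (data) mapping."""
    return raw_offset // PAGE


for name, raw, size, rva in SECTIONS:
    print(
        f"{name:7} data view: page {data_view_page(raw)} +{raw % PAGE:#05x}   "
        f"image view: page {rva // PAGE} +{rva % PAGE:#05x}   "
        f"same layout: {same_page_layout(raw, rva)}"
    )
```

With these made-up values every section starts mid-page in the data view but at offset 0 of a page in the image view, so no image page holds the same bytes as any data page. This models only the layout arithmetic, not what the memory manager actually does.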

PS: you can quickly try it out too.

I understand the COW impact and mentioned it in my earlier post. However, I think it should still be possible to share at least some of the pages that have not been copied on write.

Even if the virtual memory layout of the mapped image file is not identical to its disk layout, we know which (offset, length) is backing which subsection. The cache manager is also doing virtual block caching based on which file regions are in the cache.

Does anybody on the list have an authoritative answer on this?

Thanks.
-Prasad

Haven’t we been here before?

http://www.osronline.com/showThread.CFM?link=211131

See Pavel’s response in message #36.

-scott
OSR

@Scott, yes, it’s my own thread, on which we had a long discussion and Pavel clarified the FloppyMedia bit, which was causing a side effect that made memory usage shoot up.

However, I wasn’t clear about the duplication aspect and why sharing is not possible at all in such cases. In the Windows Internals book by Mark, the duplication is discussed in the context of a compile-then-run sequence, but I wasn’t clear whether it is true in general. Hence I wanted to check.

So, basically, you are saying that after a (cached) read of the entire image file as data, even if the entire file is present in the cache, the physical pages backing the cache cannot be shared when the code is executed? And hence all the file regions accessed during execution will occupy separate physical pages, on all versions of Windows?

Thanks.
-Prasad

That is correct.

When you look at the SOP (the SECTION_OBJECT_POINTERS structure) it has two section pointers - one to the “image section object” and the other to the “data section object”. These are distinct sections and control distinct regions of physical memory.
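
A toy model may make this concrete. In the sketch below only the three field names (`DataSectionObject`, `SharedCacheMap`, `ImageSectionObject`) follow the SECTION_OBJECT_POINTERS declaration in wdm.h; the `ControlArea` and `ToyMemory` classes and the frame numbering are hypothetical stand-ins, not the memory manager's real bookkeeping.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ControlArea:
    """Hypothetical per-section bookkeeping: file page index -> physical frame."""
    kind: str
    frames: dict = field(default_factory=dict)


@dataclass
class SectionObjectPointers:
    """Field names follow SECTION_OBJECT_POINTERS; the rest is a toy model."""
    DataSectionObject: Optional[ControlArea] = None
    SharedCacheMap: Optional[object] = None
    ImageSectionObject: Optional[ControlArea] = None


class ToyMemory:
    """Hands out a new physical frame the first time a section touches a page."""

    def __init__(self):
        self.next_frame = 0

    def fault_in(self, area, page):
        if page not in area.frames:
            area.frames[page] = self.next_frame
            self.next_frame += 1
        return area.frames[page]


mem = ToyMemory()
sop = SectionObjectPointers()

# A cached read of the file as data populates the data section.
sop.DataSectionObject = ControlArea("data")
data_frame = mem.fault_in(sop.DataSectionObject, 0)

# Executing the same file populates the separate image section.
sop.ImageSectionObject = ControlArea("image")
image_frame = mem.fault_in(sop.ImageSectionObject, 0)

print("data frame:", data_frame, "image frame:", image_frame,
      "frames in use:", mem.next_frame)
```

Because each section keeps its own page table in this model, the same file page ends up in two different frames once it has been touched through both sections, which is the behavior described above.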

The same thing happens in TxFS when you have a memory mapped file inside a transaction - it owns its own SOP structure, so that its image of the file is distinct inside the transaction from outside the transaction (at least until the transaction commits, at which point the two views must be reconciled).

Tony
OSR

Thanks, Tony, for the authoritative confirmation!

Thanks.
-Prasad