PFN entry share count vs. reference count

Hi all,

I was looking at a piece of code running on Win7 that requests a memory chunk from nonpaged pool; the chunk returned is at 0x85497bf8:

kd> dt nt!_pool_header 85497bf0
+0x000 PreviousSize : 0y000010111 (0x17)
+0x000 PoolIndex : 0y0000000 (0)
+0x002 BlockSize : 0y000100011 (0x23)
+0x002 PoolType : 0y0000010 (0x2)
+0x000 Ulong1 : 0x4230017

kd> !pte 85497bf8
VA 85497bf8
PDE at C0300854 PTE at C021525C
contains 1FF05863 contains 1E497963
pfn 1ff05 ---DA--KWEV pfn 1e497 -G-DA--KWEV
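
For readers following along, the PDE and PTE addresses printed by !pte above can be reproduced by hand. This is only a sketch: it assumes a 32-bit non-PAE x86 target, where the page tables are self-mapped at 0xC0000000 and the page directory at 0xC0300000 with 4-byte entries. Those constants agree with the addresses in the output above, but they are an assumption about this particular configuration, and the function names are made up for illustration.

```python
# Assumed layout: 32-bit non-PAE x86 Windows (4-byte entries, 4 KB pages).
PTE_BASE = 0xC0000000   # assumed start of the self-mapped page tables
PDE_BASE = 0xC0300000   # assumed start of the self-mapped page directory
PAGE_SHIFT = 12         # 4 KB pages
PDE_SHIFT = 22          # each PDE covers 4 MB of virtual address space
ENTRY_SIZE = 4          # bytes per PDE/PTE without PAE


def pte_address(va: int) -> int:
    """Virtual address of the PTE that maps `va`."""
    return PTE_BASE + (va >> PAGE_SHIFT) * ENTRY_SIZE


def pde_address(va: int) -> int:
    """Virtual address of the PDE that maps `va`."""
    return PDE_BASE + (va >> PDE_SHIFT) * ENTRY_SIZE


def pfn_of(entry: int) -> int:
    """Page frame number stored in the upper 20 bits of a PDE/PTE value."""
    return entry >> PAGE_SHIFT


if __name__ == "__main__":
    va = 0x85497BF8
    print(f"PDE at {pde_address(va):08X}  PTE at {pte_address(va):08X}")
    print(f"pfn {pfn_of(0x1FF05863):x}  pfn {pfn_of(0x1E497963):x}")
```

Run against the pool address above, this gives PDE at C0300854, PTE at C021525C and PFNs 1ff05 and 1e497, matching the !pte output.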

Looking at the PFN entry for it:

kd> !pfn 1e497
PFN 0001E497 at address 83ED6E28
flink 00000000 blink / share count 00000001 pteaddress C021525C
reference count 0001 Cached color 0 Priority 0
restore pte 00000000 containing page 01FF05 Active

Both the share count and the reference count are 1.

What is the purpose of these fields? How does the system use and manage them?

Thanks.

Any little help? (Even though, so far, it is not related to a ‘real’ case I’m facing?)

Very roughly speaking (see the book for more info):

  • Reference count is the number of times the page has been “locked” in memory through APIs such as MmProbeAndLockPages, plus one reference for the page itself, obviously.

  • Share count is the number of other PTEs that point to this PFN, typically for shared pages such as those from section objects.


Best regards,
Alex Ionescu

Thanks Alex… I’ve read the relevant part of Windows Internals, 6th edition (Part 2), about it.

> Share count is the number of other PTEs that point to this PFN, typically for shared pages such as those from section objects.

As you said, multiple PTEs can point to the same PFN; some of them map the physical page into a ‘paged’ VA range, while others can map it into a ‘non-paged’ VA range.

My question is: does the share count account only for the (valid) ‘paged’ mappings (i.e. those belonging to a process or system working set), or for the total number of mappings?

Don’t quote me, but ShareCount should be references to the *PFN* proper. If the page is paged out or trimmed from the WS, this should drop the ShareCount – and even lead to the PFN moving to the standby list and thus available for repurposing if the PFN is not active in any Working Sets. When the first process re-touches the page, a new PFN can be chosen and the share count updated for any new processes that will re-page.

At least this is how I would’ve implemented it. Landy may have it done differently for purposes that are not immediately obvious.

You could, however, easily validate this. Load a shared memory region in two processes, get the PTE-PFN. Now force it to be trimmed out of one working set (RAMMap should be able to do this).


Best regards,
Alex Ionescu

> You could, however, easily validate this. Load a shared memory region in two processes, get the PTE-PFN. Now force it to be trimmed out of one working set

Yes, that works as expected… but consider the following case (driver code) in which the same physical page has two PTEs pointing to it. The first PTE maps it into a process’ user space, while the second maps it into system space (the system PTE area):

kd> dc 0x03564700
03564700 73696854 72745320 20676e69 66207369 This String is f
03564710 206d6f72 72657355 70704120 6163696c rom User Applica
03564720 6e6f6974 7375203b 20676e69 4854454d tion; using METH
03564730 4e5f444f 48544945 00005245 00000000 OD_NEITHER......
03564740 00000000 00000000 00000000 00000000 ................
03564750 00000000 00000000 00000000 00000000 ................
03564760 00000000 ffffffff ffffffff 00000002 ................
03564770 00000000 00000000 00000000 00000000 …

kd> dc 0x8c055700
8c055700 73696854 72745320 20676e69 66207369 This String is f
8c055710 206d6f72 72657355 70704120 6163696c rom User Applica
8c055720 6e6f6974 7375203b 20676e69 4854454d tion; using METH
8c055730 4e5f444f 48544945 00005245 00000000 OD_NEITHER......
8c055740 00000000 00000000 00000000 00000000 ................
8c055750 00000000 00000000 00000000 00000000 ................
8c055760 00000000 ffffffff ffffffff 00000002 ................
8c055770 00000000 00000000 00000000 00000000 …

kd> !sysptes 4
0x23a3 System PTEs allocated to mapping locked pages
VA MDL PageCount Caller/CallersCaller
8c055700 84d85970 1 SIoctl!SioctlDeviceControl+0x35a/nt!IofCallDriver+0x63


kd> !pte 0x03564700
VA 03564700
PDE at C0300034 PTE at C000D590
contains 02368867 contains 12928867
pfn 2368 ---DA--UWEV pfn 12928 ---DA--UWEV

kd> !pte 0x8c055700
VA 8c055700
PDE at C03008C0 PTE at C0230154
contains 1B3B1863 contains 12928963
pfn 1b3b1 ---DA--KWEV pfn 12928 -G-DA--KWEV
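
The two PTE values above (12928867 for the user VA, 12928963 for the system VA) can be compared bit by bit. The sketch below decodes only the architecturally defined low bits of a 32-bit non-PAE x86 PTE; bits 9 to 11 are left to the operating system and are deliberately not interpreted here. The function name and the flag labels are illustrative, not debugger output.

```python
# Architecturally defined bits of a 32-bit non-PAE x86 PTE.
# Bits 9-11 are available to the OS and are not decoded in this sketch.
HARDWARE_BITS = {
    0: "present",
    1: "writable",
    2: "user",        # clear means supervisor (kernel) only
    5: "accessed",
    6: "dirty",
    8: "global",
}


def decode_pte(entry: int):
    """Return (pfn, set of hardware flag names) for a raw PTE value."""
    pfn = entry >> 12
    flags = {name for bit, name in HARDWARE_BITS.items() if entry & (1 << bit)}
    return pfn, flags


if __name__ == "__main__":
    for label, raw in (("user VA", 0x12928867), ("system VA", 0x12928963)):
        pfn, flags = decode_pte(raw)
        print(f"{label}: pfn {pfn:x} flags {sorted(flags)}")
```

Both values decode to the same page frame; they differ only in the user bit (set for the user-space mapping) and the global bit (set for the system-space mapping).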

The PFN numbers are, obviously, the same, and the PFN database entry for the physical page has a reference count of 2 (the page had previously been locked by driver code calling MmProbeAndLockPages):

kd> !pfn 12928
PFN 00012928 at address 83DBDBC0
flink 0000010E blink / share count 00000001 pteaddress C000D590
reference count 0002 Cached color 0 Priority 5
restore pte 00000080 containing page 002368 Active M
Modified

Now there is a valid system PTE pointing to that physical page (PTE at C0230154). Nevertheless, the share count does not seem to take it into account (its value is 1, not 2).

> Don’t quote me, but ShareCount should be references to the *PFN* proper.

IIRC ShareCount is used to manage prototype PTEs: if ShareCount drops to zero, then the PPTE is marked Transition.

When a PTE is being marked Transition, the PFN is checked to see whether it is a shared page (i.e. has a prototype PTE).

If no - then the PTE itself is marked Transition.

If yes - then the PTE is set to a PpteReference, and ShareCount is decremented. If ShareCount drops to zero - then the PPTE is also set to Transition.

RefCount is what prevents the page from being paged out/discarded.

MDL mappings to system and even user space do not use PPTE or ShareCount.
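
The steps above can be written down as a small state model. This is a toy sketch of the description in this post only, not of the actual memory manager: the class names, the string states and the trim function are all hypothetical, and anything the post does not mention (for example what happens to ShareCount for a private page) is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProtoPte:
    """Stand-in for a prototype PTE (hypothetical model type)."""
    state: str = "Valid"


@dataclass
class Pfn:
    """Stand-in for a PFN entry; proto is None for a non-shared page."""
    share_count: int = 0
    proto: Optional[ProtoPte] = None


@dataclass
class Pte:
    """Stand-in for one hardware PTE mapping the page."""
    pfn: Pfn
    state: str = "Valid"


def trim(pte: Pte) -> None:
    """Mark one mapping as trimmed, following the steps described above."""
    pfn = pte.pfn
    if pfn.proto is None:
        # Not a shared page: the PTE itself becomes Transition.
        pte.state = "Transition"
        return
    # Shared page: the PTE now just refers to the prototype PTE.
    pte.state = "PpteReference"
    pfn.share_count -= 1
    if pfn.share_count == 0:
        # Last sharer gone: the prototype PTE becomes Transition too.
        pfn.proto.state = "Transition"


if __name__ == "__main__":
    proto = ProtoPte()
    page = Pfn(share_count=2, proto=proto)
    first, second = Pte(page), Pte(page)
    trim(first)
    print(page.share_count, proto.state)
    trim(second)
    print(page.share_count, proto.state)
```

In this model the prototype PTE stays Valid while at least one process still has the page in its working set, and only goes to Transition when the last mapping is trimmed.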


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim is correct in that only PPTEs increment the share count because MDL mappings automatically lock the page thus causing a jump in ReferenceCount.


Best regards,
Alex Ionescu

I guess with PPTE you mean ‘Prototype PTE’ and not a PTE pointing to it, right ?

Thanks.

>I guess with PPTE you mean ‘Prototype PTE’ and not a PTE pointing to it, right ?

Yes.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

The simple way to describe this is that the share count is the number of working set entries pointing to the page.

A non-zero share count adds one reference to the page. Additional references can be created using MmProbeAndLockPages.

Both counts have to be zero before a page can be removed from memory.

In a way this is similar to the handle count/reference count model used by the object manager (handles/share counts are for user processes, references are for the kernel/drivers).
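
A minimal model of the rules just stated may help when reading the !pfn output earlier in the thread. It is a sketch under the assumptions in this post (share count equals working set entries, a non-zero share count holds one reference, MmProbeAndLockPages adds a reference) plus the earlier remark that MDL mappings do not touch ShareCount. The class and method names are invented for illustration and do not correspond to real kernel routines.

```python
class PageCounts:
    """Toy model of the two counters in a PFN entry."""

    def __init__(self):
        self.share_count = 0
        self.reference_count = 0

    def add_working_set_entry(self):
        self.share_count += 1
        if self.share_count == 1:
            # A non-zero share count accounts for one reference.
            self.reference_count += 1

    def remove_working_set_entry(self):
        assert self.share_count > 0
        self.share_count -= 1
        if self.share_count == 0:
            self.reference_count -= 1

    def probe_and_lock(self):
        # Models MmProbeAndLockPages: one extra reference, no share count change.
        self.reference_count += 1

    def unlock(self):
        assert self.reference_count > 0
        self.reference_count -= 1

    def map_locked_pages(self):
        # An MDL mapping is not a working set entry, so nothing changes here.
        pass

    def removable(self):
        return self.share_count == 0 and self.reference_count == 0


def locked_and_mapped_example():
    """Replay the METHOD_NEITHER buffer case from earlier in the thread."""
    page = PageCounts()
    page.add_working_set_entry()   # page is valid in the process working set
    page.probe_and_lock()          # driver locks the user buffer
    page.map_locked_pages()        # driver maps the MDL into system space
    return page.share_count, page.reference_count


if __name__ == "__main__":
    print(locked_and_mapped_example())
```

Replaying the locked-buffer case gives a share count of 1 and a reference count of 2, the same values !pfn reported for page 12928. In the model the page only becomes removable after both the working set entry and the lock are gone.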

> Landy may have it done differently…

The basic model here hasn’t changed much since LouP’s original NT MM design back from 1990 or so.

Pavel, that was pretty much my simple explanation as well, but it doesn’t really seem to be accurate because non-PPTE based user-mapped pages of the same PFN (such as ProbeAndLock) increment RefCount, not ShareCount, as in Carlo’s example earlier. Or is it that those user-mapped pages are not in the WS?


Best regards,
Alex Ionescu

On 3/5/2013 5:48 PM, xxxxx@videotron.ca wrote:

> Pavel, that was pretty much my simple explanation as well, but it doesn’t really seem to be accurate because non-PPTE based user-mapped pages of the same PFN (such as ProbeAndLock) increment RefCount, not ShareCount, as in Carlo’s example earlier. Or is it that those user-mapped pages are not in the WS?

Locking the user pages brings them into the WS if they were not already there. (I am assuming that “WS” stands for “working set.”)

Locking user pages does bring them into the working set but that’s an implementation detail (and the pages can be trimmed again as soon as MmProbeAndLockPages returns). Remember that MmProbeAndLockPages only increments the reference count; it doesn’t place any restrictions on the share count.

Anyway, this is not relevant for the question Alex asked. The answer to that question is that MDL mappings (in user or system space) are not part of any working set, and therefore don’t affect the share count. Working sets keep track of pageable memory; MDL mappings are non-pageable so there is no point in adding them to the working set.

On 3/5/2013 6:18 PM, George M. Garner Jr. wrote:

> On 3/5/2013 8:23 PM, Pavel Lebedynskiy wrote:
> > but that’s an implementation detail
>
> Virtually everything being discussed in this thread (share count, reference count, WS) is an implementation detail, don’t you think?

Yes, and I think most people here understand that.

But “probe-and-lock will make the user VAs valid” is the kind of implementation detail that even a reasonable person could potentially misinterpret as part of the API contract. Which is why I thought it would be worth pointing that out.

> RefCount, not ShareCount, as in Carlo’s example earlier. Or is it that those user-mapped pages are not in the WS?

Correct, since they are never trimmed by the VM replacement policy.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

>The simple way to describe this is that the share count is the number of working set entries pointing to the page.

Pavel, just a doubt…

Here I’ve found a nonpaged pool page (by definition not part of any working set, process or system) for which the share count is 1 (the reference count is obviously 1 because the page is never paged out). Why?

kd> !pool 85684a50
Pool page 85684a50 region is Nonpaged pool
85684000 size: b8 previous size: 0 (Allocated) File (Protected)
856840b8 size: 18 previous size: b8 (Allocated) ReEv
856840d0 size: 68 previous size: 18 (Allocated) FMsl
85684138 size: 40 previous size: 68 (Allocated) Even (Protected)
85684178 size: 40 previous size: 40 (Allocated) Even (Protected)
856841b8 size: 130 previous size: 40 (Allocated) IPRT
856842e8 size: 50 previous size: 130 (Allocated) Vadm
85684338 size: 48 previous size: 50 (Allocated) CM44 Process: 85689b18
85684380 size: 40 previous size: 48 (Allocated) Even (Protected)
856843c0 size: 18 previous size: 40 (Allocated) ReEv
856843d8 size: 40 previous size: 18 (Allocated) Even (Protected)
85684418 size: 38 previous size: 40 (Allocated) AlIn
85684450 size: 48 previous size: 38 (Allocated) Sema (Protected)
85684498 size: 48 previous size: 48 (Allocated) Vad
856844e0 size: 30 previous size: 48 (Allocated) Icp
85684510 size: 180 previous size: 30 (Allocated) EtwG
85684690 size: 48 previous size: 180 (Allocated) Vad
856846d8 size: 48 previous size: 48 (Allocated) Vad
85684720 size: 68 previous size: 48 (Allocated) FMsl
85684788 size: f8 previous size: 68 (Allocated) MmCi
85684880 size: 10 previous size: f8 (Allocated) IoCc
85684890 size: 50 previous size: 10 (Allocated) Muta (Protected)
856848e0 size: b8 previous size: 50 (Allocated) File (Protected)
85684998 size: 40 previous size: b8 (Allocated) Even (Protected)
856849d8 size: 48 previous size: 40 (Allocated) Vad
85684a20 size: 28 previous size: 48 (Allocated) VadS
*85684a48 size: f8 previous size: 28 (Allocated) *MmCi
Pooltag MmCi : Mm control areas for images, Binary : nt!mm
85684b40 size: 48 previous size: f8 (Allocated) Vad
85684b88 size: 18 previous size: 48 (Allocated) MmSe
85684ba0 size: 68 previous size: 18 (Allocated) FMsl
85684c08 size: 48 previous size: 68 (Allocated) Vad
85684c50 size: 48 previous size: 48 (Allocated) Vad
85684c98 size: 8 previous size: 48 (Free) smMd
85684ca0 size: 30 previous size: 8 (Allocated) Icp
85684cd0 size: 48 previous size: 30 (Allocated) Vad
85684d18 size: 18 previous size: 48 (Allocated) ReEv
85684d30 size: 48 previous size: 18 (Allocated) Vad
85684d78 size: f8 previous size: 48 (Allocated) MmCi
85684e70 size: 28 previous size: f8 (Allocated) VadS
85684e98 size: 50 previous size: 28 (Allocated) Vadl
85684ee8 size: 48 previous size: 50 (Allocated) Vad
85684f30 size: 48 previous size: 48 (Allocated) Vad
85684f78 size: 8 previous size: 48 (Free) Vad
85684f80 size: 40 previous size: 8 (Allocated) Even (Protected)
85684fc0 size: 40 previous size: 40 (Allocated) SeTl

kd> !pte 85684a50
VA 85684a50
PDE at C0300854 PTE at C0215A10
contains 1FF05863 contains 1E684963
pfn 1ff05 ---DA--KWEV pfn 1e684 -G-DA--KWEV

kd> !pfn 1e684
PFN 0001E684 at address 83ED9C60
flink 00000000 blink / share count 00000001 pteaddress C0215A10
reference count 0001 Cached color 0 Priority 0
restore pte 00000000 containing page 01FF05 Active
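
As a side note, the "at address" values in the three !pfn outputs in this thread are consistent with a simple base-plus-index layout. The sketch below infers both constants from those outputs alone (a 32-bit Windows 7 target); they are not documented values and will differ on other builds and architectures. The function name is hypothetical.

```python
# Both constants are inferred from the !pfn outputs quoted in this thread;
# treat them as assumptions about this one target, not as documented values.
PFN_DATABASE_BASE = 0x83C00000  # assumed start of the PFN database
PFN_ENTRY_SIZE = 0x18           # assumed size in bytes of one PFN entry


def pfn_entry_address(pfn: int) -> int:
    """Address of the PFN database entry describing page frame `pfn`."""
    return PFN_DATABASE_BASE + pfn * PFN_ENTRY_SIZE


if __name__ == "__main__":
    for pfn in (0x1E497, 0x12928, 0x1E684):
        print(f"PFN {pfn:08X} at address {pfn_entry_address(pfn):08X}")
```

With these inferred constants the function reproduces 83ED6E28, 83DBDBC0 and 83ED9C60 for the three pages examined above.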

> Virtually everything being discussed in this thread (share count, reference count, WS) is an implementation detail, don’t you think?

The OP seems to have just an unhealthy obsession with MM’s internals. If you read this NG on a more or less regular basis you must have noticed it…

Anton Bassov

> The OP seems to have just an unhealthy obsession with MM’s internals

Anton, do you think this is not the right place to ask about Windows internals ?