I don’t know if it is the right section…anyway
I have been reading Windows Internals and do not clearly understand the following topic.
Consider a 32-bit Windows client running in /2G mode with PAE not enabled
(basically a virtual-to-physical translation scheme with 1024 page tables - the process’s page directory is itself a page table - with 4KB page frames).
According to the documentation, page tables exist in the virtual address range C0000000 - C03FFFFF (4MB), and with 4KB pages that gives, in turn, 1024 page tables,
which can map the entire 4GB virtual address space (2GB of process user address space + 2GB of system space).
Reading further, I learned about system PTEs (IIUC they map the system (kernel) address space), which exist in a different system (virtual) address range.
A question arises: why does Windows map these system PTEs in a “separate” address range when the space in C0000000 - C03FFFFF is available for mapping the entire 4GB address space?
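The figures in the question can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 4-byte page table entries (the non-PAE x86 format); the constant names are made up for illustration.

```python
PAGE_SIZE = 4 * 1024                        # 4KB page frames
PTE_SIZE = 4                                # assumption: 4-byte non-PAE entries
ENTRIES_PER_TABLE = PAGE_SIZE // PTE_SIZE   # entries held by one page table

PAGE_TABLES_BASE = 0xC0000000               # range quoted in the question
PAGE_TABLES_END = 0xC03FFFFF

# Size of the page-table window and how many 4KB page tables fit in it.
region_size = PAGE_TABLES_END - PAGE_TABLES_BASE + 1
page_table_count = region_size // PAGE_SIZE

# Each page table maps ENTRIES_PER_TABLE pages of PAGE_SIZE bytes.
mapped_bytes = page_table_count * ENTRIES_PER_TABLE * PAGE_SIZE

print(hex(region_size))            # 0x400000 (4MB)
print(page_table_count)            # 1024
print(mapped_bytes == 2 ** 32)     # True: the whole 4GB address space
```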
In order to answer your question, all you have to do is think about how virtual addresses are translated into physical ones…
If you write the physical address of the page directory into its 768th entry, you will be able to access the page directory at virtual address C0300000. The page table referenced by its 0th entry will be at C0000000,
the one referenced by its 1st entry at C0001000, etc. Think a bit about why it works this way - this is your homework…
Therefore, if you keep all entries in the upper halves of all page directories (apart from the 768th entries, of course)
the same, you will get the kernel address space - it is not “separate” at all…
Anton Bassov
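The self-map described above can be tried out on a toy model of the two-level walk. This is only a sketch: the physical frame addresses are hypothetical, flag bits are ignored, and "memory" is a plain dictionary rather than anything the hardware or Windows actually uses.

```python
# Toy model of non-PAE x86 paging. "Physical memory" maps a frame's physical
# address to its 1024 entries; each entry holds a frame address (no flags).
ENTRIES = 1024
SELF_MAP_INDEX = 768          # 0x300, the entry named in the reply above

PD_FRAME = 0x1000             # hypothetical physical addresses
PT0_FRAME = 0x5000
PT1_FRAME = 0x6000

memory = {
    PD_FRAME: [None] * ENTRIES,
    PT0_FRAME: [None] * ENTRIES,
    PT1_FRAME: [None] * ENTRIES,
}
memory[PD_FRAME][0] = PT0_FRAME               # PDE 0 -> page table 0
memory[PD_FRAME][1] = PT1_FRAME               # PDE 1 -> page table 1
memory[PD_FRAME][SELF_MAP_INDEX] = PD_FRAME   # PDE 768 -> the directory itself


def translate(cr3, va):
    """Walk the two levels the way the MMU does and return a physical address."""
    pde_index = (va >> 22) & 0x3FF    # top 10 bits select the PDE
    pte_index = (va >> 12) & 0x3FF    # next 10 bits select the PTE
    offset = va & 0xFFF               # low 12 bits are the byte offset
    table_frame = memory[cr3][pde_index]
    page_frame = memory[table_frame][pte_index]
    return page_frame + offset


print(hex(translate(PD_FRAME, 0xC0000000)))   # 0x5000: page table 0
print(hex(translate(PD_FRAME, 0xC0001000)))   # 0x6000: page table 1
print(hex(translate(PD_FRAME, 0xC0300000)))   # 0x1000: the page directory
```

Because PDE 768 points back at the directory, the first lookup lands on the directory again, so the second lookup reads a PDE as if it were a PTE. No special case is needed in the walk itself.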
The first part of my homework I think is clear:
768 = 0x300 (hex), and in order to access the 768th page directory entry I will use the virtual address C0000000 + 0x300*0x1000 = C0300000
Now the second part… What do you mean by “if you keep all entries in the upper halves of all page directories (apart from 768th entries, of course)”?
Which entries do you refer to?
thanks
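The arithmetic in this post can be double-checked by splitting the resulting address back into its translation fields. A small sketch, assuming the non-PAE 10/10/12 bit split and 4-byte entries; `split_va` is a helper name invented here.

```python
def split_va(va):
    """Return (PDE index, PTE index, byte offset) for a non-PAE x86 address."""
    return (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF


PAGE_TABLES_BASE = 0xC0000000

# Page table number 0x300 inside the window is the page directory itself.
page_directory_va = PAGE_TABLES_BASE + 0x300 * 0x1000

# Entry 0x300 inside that page, assuming 4-byte entries.
pde_768_va = page_directory_va + 0x300 * 4

print(hex(page_directory_va))        # 0xc0300000
print(split_va(page_directory_va))   # (768, 768, 0)
print(hex(pde_768_va))               # 0xc0300c00
```

Both indexes come out as 768, which is why the walk passes through the self-referencing entry twice and ends up at the page directory.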
xxxxx@alice.it wrote:
> Consider a 32-bit Windows client running in /2G mode with PAE not enabled
> (basically a virtual-to-physical translation scheme with 1024 page tables - the process’s page directory is itself a page table - with 4KB page frames)
> …
> Now reading further I learned about system PTEs (IIUC they map the system (kernel) address space), which exist in a different system (virtual) address range.
Where did you read that? As far as I know (and I haven’t delved as
deeply into this as some others), the “system PTEs” are just a subset of
the kernel address space that is reserved for non-paged pool and device
memory mapping. I don’t think these PTEs actually live at a separate
virtual address. As you say, there wouldn’t be much point to that.
> A question arises: why does Windows map these system PTEs in a “separate” address range when the space in C0000000 - C03FFFFF is available for mapping the entire 4GB address space?
Remember that the user page tables have to be changed every time there
is a process switch, but the kernel page tables remain the same for
everyone. There may be some optimizations to allow half of the page
tables to be swapped out.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
> Where did you read that?
For instance, “Windows Internals 4th ed” Fig 7-11 shows the x86 system space layout, in which the “system PTE area” is located above E1000000.
In the following section, “Address Translation”, Fig 7-18 shows system PTEs within the “system page tables” involved in system space translation…
From these descriptions I guessed that the system PTEs shown in Fig 7-18 were the same as those in Fig 7-11… Maybe I was wrong…
help ! thanks
xxxxx@alice.it wrote:
> > Where did you read that?
> For instance “Windows Internals 4th ed” Fig 7-11 shows x86 system space layout in which “system PTE area” is located above E1000000
I haven’t read that, but my guess is that they are saying the region at
E100,0000 has the pages that are mapped when system PTEs are needed, not
that they actually contain page table entries.
–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
> As far as I know (and I haven’t delved as deeply into this as some
> others), the “system PTEs” are just a subset of the kernel address
> space that is reserved for non-paged pool and device memory
> mapping.
System PTEs do not include paged/nonpaged pool memory. These are all
separate “types” that can be assigned to regions of kernel VA space.
On a Vista+ system you can do “dt nt!_MI_SYSTEM_VA_TYPE” to
see all types that exist on that particular version of Windows.
“System PTEs” are the type used for mapping MDLs into system space
(MmGetSystemAddressForMdlSafe), kernel stacks and a few other things.
> A question arises: why does Windows map these system PTEs in a
> “separate” address range when the space in C0000000 - C03FFFFF is
> available for mapping the entire 4GB address space?
How the system VA space is partitioned into different types varies
with Windows versions and architectures. 64 bit versions use a mostly
static scheme where each type gets a fixed VA range. Pre-Vista 32-bit
versions also used static partitioning, configurable at boot time via
various registry knobs (LargeSystemCache for the file cache,
SystemPages for system PTEs etc).
Vista+ 32-bit systems use dynamic partitioning where system space is
allocated on demand in PDE-sized chunks (2/4 MB). Skywing wrote an article
that goes into more details on this:
http://www.nynaeve.net/?p=261
“System PTEs” are used to satisfy various kernel-internal allocations that are broken up into page-size chunks (i.e. which have a whole number of PTEs assigned to map them). It’s best to think of them as backing a general allocator for PAGE_SIZE regions.
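To make the "general allocator for PAGE_SIZE regions" idea concrete, here is a toy first-fit allocator that hands out runs of page-sized slots from a fixed virtual address window. It is purely illustrative and is not how the Windows memory manager is implemented; the class name, its methods and the choice of E1000000 as the window base are assumptions made for the example.

```python
class PteRangeAllocator:
    """Toy first-fit allocator: one slot per page in a fixed VA window."""

    def __init__(self, base_va, page_count, page_size=0x1000):
        self.base_va = base_va
        self.page_size = page_size
        self.in_use = [False] * page_count

    def reserve(self, pages):
        """Return the VA of the first free run of `pages` slots, or None."""
        run = 0
        for i, used in enumerate(self.in_use):
            run = 0 if used else run + 1
            if run == pages:
                start = i - pages + 1
                for j in range(start, i + 1):
                    self.in_use[j] = True
                return self.base_va + start * self.page_size
        return None

    def release(self, va, pages):
        """Mark `pages` slots starting at `va` as free again."""
        start = (va - self.base_va) // self.page_size
        for j in range(start, start + pages):
            self.in_use[j] = False


allocator = PteRangeAllocator(0xE1000000, page_count=8)
a = allocator.reserve(3)       # slots 0-2
b = allocator.reserve(2)       # slots 3-4
allocator.release(a, 3)        # slots 0-2 free again
c = allocator.reserve(4)       # no run of 4 free slots is left
d = allocator.reserve(3)       # reuses slots 0-2

print(hex(a), hex(b), c, hex(d))   # 0xe1000000 0xe1003000 None 0xe1000000
```

Every request is a whole number of pages, so every allocation corresponds to a whole number of PTEs, which is the property the post above describes.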
> What do you mean with “if you keep all entries in the upper halves of all page directories
> (apart from 768th entries, of course)”? Which entries do you refer to?
I refer to entries 512-1023, excluding the 768th one…
> For instance “Windows Internals 4th ed” Fig 7-11 shows x86 system space layout in
> which “system PTE area” is located above E1000000
…which means page directory entries >=900 refer to page tables that describe pages that are used only as system page tables. Since these page tables are themselves ordinary end-level pages, they and the pages they map may refer to one another, as well as to themselves, the way the page directory does. I know the above explanation may seem a bit twisted and not so easy to parse, but this is how CPUs understand things (which is relatively easy for non-PAE 32-bit CPUs with just 2 levels of translation - imagine the complexity of parsing the explanation of relationships in 4-level x86_64 tables)…
Anton Bassov
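A small model may help with "keep entries 512-1023 the same, apart from the 768th". The sketch below builds two page directories that copy their upper half from a common set of kernel entries and differ only in the self-map slot. The frame values and helper names are hypothetical, and real page directory entries also carry flag bits that are left out here.

```python
ENTRIES = 1024
USER_ENTRIES = 512            # entries 0-511 cover the 2GB of user space
SELF_MAP_INDEX = 768


def pde_index(va):
    """Index of the page directory entry that covers a virtual address."""
    return (va >> 22) & 0x3FF


# Hypothetical frame values standing in for the shared kernel page tables.
kernel_entries = [0x100000 + i for i in range(ENTRIES)]


def new_page_directory(own_frame):
    """Build a per-process directory: empty user half, shared kernel half."""
    pd = [None] * ENTRIES
    pd[USER_ENTRIES:] = kernel_entries[USER_ENTRIES:]
    pd[SELF_MAP_INDEX] = own_frame     # the one per-process kernel entry
    return pd


pd_a = new_page_directory(0x2000)
pd_b = new_page_directory(0x3000)
differing = [i for i in range(USER_ENTRIES, ENTRIES) if pd_a[i] != pd_b[i]]

print(differing)                 # [768]
print(pde_index(0x80000000))     # 512
print(pde_index(0xE1000000))     # 900
```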
Great…! Trying to recap
For each process’s page directory, PDEs 512 <= n < 900 refer to page tables that describe pages in the system address space 80000000 - E0FFFFFF
However, for PDEs >= 900 there exists an extra level of translation: basically, the pages pointed to by the first 2 translation levels (PDE and PTE) are themselves used as system page tables
Does it make sense ?
Consider now a non-PAE system with 4KB page frames: the CPU’s MMU hardware performs “normal” virtual-to-physical translation using 2 levels (the page directory and the page tables)
How can it handle this different translation logic for the “system PTE” address range ?
thanks for your patience…
> How can it handle this different translation logic for “system PTE” address range ?
There is no different translation logic. System PTEs are just a range of addresses above 3GB, nothing else.
Note that the PD trees of all processes share the same PDs which describe this >3GB kernel space. Also, those PDEs and PTEs have the Global bit set, which prevents their TLB entries from being discarded on a process context (i.e. PD tree root) switch.
–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com
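On x86 the bit described above is the Global (G) flag, bit 8 of the entry, and it takes effect only when global pages are enabled in CR4. The sketch below models what a CR3 reload does to a TLB under that assumption; the TLB is just a dictionary and the addresses are made up.

```python
PTE_PRESENT = 0x001     # bit 0: the entry is valid
PTE_GLOBAL = 0x100      # bit 8: the translation survives a CR3 reload


def reload_cr3(tlb):
    """Return the TLB contents left after a CR3 reload (global pages enabled)."""
    return {va: pte for va, pte in tlb.items() if pte & PTE_GLOBAL}


# Hypothetical cached translations: one user page, one kernel page.
tlb = {
    0x00401000: 0x12345000 | PTE_PRESENT,
    0x80801000: 0x00ABC000 | PTE_PRESENT | PTE_GLOBAL,
}

survivors = reload_cr3(tlb)
print([hex(va) for va in survivors])   # ['0x80801000']
```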
> However, for PDE entries >=900, there exist an extra level of translation:
There is no “extra level of translation” whatsoever…
> pages pointed by the first 2 translation levels (PDE and PTE) are used themselves as system page tables
Exactly…
Anton Bassov
> The first part of my homework I think is clear: 768 = 0x300 (hex) and in order to access
> 768’th page directory entry I will use C0000000 + 0x300*0x1000 = C0300000 virtual address
This is the second part. Good enough. The only thing left is to understand why C0000000 is used as a base…
Anton Bassov
Well…
The page directory is itself a page table when its entries are interpreted as PTEs describing the page table pages themselves.
Now the first 512 page tables of a process describe the 2GB process user address space. The following 256 page tables describe the 3rd GB (the 1st GB of system address space with the /2G switch), and next we find the page table pages starting at C0000000 (the last GB of the address space).
Now I guess C0000000 is the base address because it exists in system address space even when the system is booted with the /3GB switch.
> Page Directory table is itself a page table when entries are interpreted as PTE’s describing page
> table pages themselves
Correct…
> Now first 512 process’s page tables describe 2GB process’s user address space.
> Following 256 page tables describe 3rd GB (1st GB in system address space with /2G switch)
> and next we find page table pages starting at C0000000 (last address space GB)
Correct…
> Now I guess C0000000 is the base address because it exists in system address space even
> when the system is booted with /3GB switch
Unfortunately, this part is wrong…
Just to give you a tip: if you put the address of a page directory into some entry other than the 768th, the base address will no longer be C0000000; the precise value depends on the exact entry number. Come on, you are not so far from a solution…
Anton Bassov
> A question arises: why does Windows map these system PTEs in a “separate” address
> range when the space in C0000000 - C03FFFFF is available for mapping the entire 4GB
> address space?
Page tables are per-process.
If the system PTEs were part of the per-process page tables, the kernel would have to update the page tables of each and every process in the system.
> Just to give you a tip, if you put an address of a page directory into some entry other than 768th the
> base address will be already not C0000000, with the precise value depending on the exact entry
> number…
Right… the main idea is that page directory entries have to be PTEs for the page table pages. Starting from the statement that the 768th entry describes the page directory page itself, it turns out that the first entry in the page directory (interpreted as a PTE for a page table page) has to describe the address 4K*1024*768 = 3145728K = 3145728*1024 = 3221225472 = C0000000 (hex).
If, for instance, you put the page directory address in the 512th entry, the base address would be 4K*1024*512 = 2147483648 = 80000000 (hex).
Does it make sense ?
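The rule worked out in this post generalises to any slot. As a sketch, assuming the non-PAE layout where one PDE covers 4MB and one PTE covers 4KB (the function names are invented for the example):

```python
PDE_SPAN = 4 * 1024 * 1024    # virtual address range covered by one PDE
PAGE_SIZE = 4 * 1024          # virtual address range covered by one PTE


def page_tables_base(self_map_index):
    """Base VA of the page-table window when the directory maps itself there."""
    return self_map_index * PDE_SPAN


def page_directory_va(self_map_index):
    """VA at which the page directory itself shows up inside that window."""
    return page_tables_base(self_map_index) + self_map_index * PAGE_SIZE


for index in (768, 512):
    print(index, hex(page_tables_base(index)), hex(page_directory_va(index)))
# 768 0xc0000000 0xc0300000
# 512 0x80000000 0x80200000
```

So C0000000 is not special in itself; it is simply what slot 768 multiplied by 4MB comes to.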
> If the System PTE’s were part of per-process page tables, the kernel will have to update
> page tables of each-n-every process in the system.
In fact, I’ve got to amend my statement concerning the system page tables at E1000000 a bit…
In order to be able to map 2G of kernel address space you need just 512 page tables, each of them describing a total of 4MB of virtual address space. The 4MB range from C0000000 up to C0400000 (i.e. the one described by the page directory’s 768th entry) is unusable because this entry is reserved for accessing the page directory itself.
Therefore, if you take a 2MB physical range (i.e. 512 physical pages), fill its 384th page with the physical addresses of all pages in this range (i.e. the 0th goes to the 0th entry, the 1st to the 1st…, the 384th to the 384th, … the 511th to the 511th), and write the address of this page into the page directory’s 900th entry, you will be able to access the system PTEs as virtual addresses in the E1000000 - E13FF000 range with a 4K hole at E1300000 (this entry will refer to the map itself, and will never be needed after the map has been filled, because the page directory’s 768th entry refers to the page directory itself, so the C0000000-C0400000 range is a “no-fly zone” anyway)…
Anton Bassov
Voila - finally you managed it…
Anton Bassov
> as the virtual addresses in E1000000 - E13FF000 range with 4K hole at E1300000
Oops, my mistake - the range is E1000000 - E11FF000, and the hole is at E1180000. There are only 512 and not 1024 entries in this range, so the MSB of the second-level 10-bit index is always clear…
Anton Bassov
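The corrected numbers can be reproduced directly. A short sketch, assuming 512 pages of 4KB mapped from E1000000 with page number 384 holding the map, as described in the posts above:

```python
BASE = 0xE1000000
PAGE_SIZE = 0x1000
PAGE_COUNT = 512      # a 2MB range is 512 pages of 4KB
MAP_PAGE = 384        # the page that holds the map itself

last_va = BASE + (PAGE_COUNT - 1) * PAGE_SIZE
hole_va = BASE + MAP_PAGE * PAGE_SIZE

# Second-level (PTE) index of every page in the range.
second_level = [((BASE + i * PAGE_SIZE) >> 12) & 0x3FF for i in range(PAGE_COUNT)]

print(hex(last_va))                               # 0xe11ff000
print(hex(hole_va))                               # 0xe1180000
print(max(second_level))                          # 511
print(all(i & 0x200 == 0 for i in second_level))  # True: top index bit clear
```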