When does the DTLB get loaded?

Hi all,

As far as I understand TLB loading on Intel processors, the DTLB or ITLB
gets loaded whenever a page lookup succeeds after a trip to the page
tables. To see this first-hand on Windows 2000 (Pentium 4, non-HT),
I put a breakpoint on the Windows page fault handler using WinDbg.

What I observed was this:

  1. When a page fault happens because of code execution (let's call it an
    instruction fault), control returns to the faulting instruction once the
    fault has been handled. If I let that one instruction execute, which
    populates the ITLB, then clear the valid bit in the PTE for that code page
    and execute the next instruction on the same page, it runs as expected
    without causing another page fault, since the ITLB has the PTE info
    cached.

  2. I did the same for a data fault: I let the fault be handled and then let
    the faulting instruction execute, which caused the DTLB to be loaded. It's
    fine up to this point. If I then clear the present bit in the PTE for that
    data page and re-execute the same instruction, which accesses the same
    data page, this time it causes a page-not-present fault. This would mean
    that the DTLB got updated as soon as I changed the PTE, even before any
    data from that page was accessed.
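
For anyone repeating the experiment, the PTE edits in the two steps above can be sketched in C. This is only an illustration under stated assumptions: 32-bit non-PAE paging with 4 KB pages, and the page tables self-mapped at 0xC0000000 as on 32-bit Windows. The names pte_address, pte_clear_present and pte_is_present are hypothetical helpers, not Windows APIs. In WinDbg, the !pte extension reports the PTE address for a given virtual address.

```c
#include <stdint.h>

/* Assumptions: 32-bit x86, non-PAE paging, 4 KB pages, and the
   self-mapped page tables at 0xC0000000 used by 32-bit Windows. */
#define PTE_BASE     0xC0000000u
#define PAGE_SHIFT   12
#define PTE_PRESENT  0x001u   /* bit 0: P (present/valid) */

/* Virtual address of the PTE that maps 'va' (one 4-byte PTE per page). */
uint32_t pte_address(uint32_t va)
{
    return PTE_BASE + ((va >> PAGE_SHIFT) << 2);
}

/* Returns the PTE value with the present bit cleared. */
uint32_t pte_clear_present(uint32_t pte)
{
    return pte & ~PTE_PRESENT;
}

/* Returns 1 if the present bit is set, 0 otherwise. */
int pte_is_present(uint32_t pte)
{
    return (pte & PTE_PRESENT) != 0;
}
```

Under these assumptions, the PTE for virtual address 0x00401000 sits at 0xC0001004, and clearing bit 0 of a PTE value such as 0x12345067 gives 0x12345066.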

The question is: how do we explain this behaviour?
Is there a way to disable DTLB use?
Does page-table caching affect this in any way?

Thanks in advance,
Pawan

Pawan Khatri wrote:

> The question is how do we explain this behaviour?

It’s the Heisenberg Principle. When you break into the debugger, you
are running an awful lot of code and touching an awful lot of data. The
likely explanation is that the machinations of the debugger have simply
caused that entry in the DTLB to flush, so your next data access has to
go back out to the PTE.
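
The effect described above can be pictured with a toy model. The sketch below is not how a Pentium 4 DTLB is organised; the 4-entry size, the round-robin replacement and all names (tlb_init, tlb_fill, tlb_lookup, TLB_ENTRIES) are assumptions chosen only to show how unrelated accesses can push an entry out, so that the next access has to walk the page tables again.

```c
#include <stdint.h>
#include <string.h>

#define TLB_ENTRIES 4   /* hypothetical size; real DTLBs are larger */

struct tlb_entry { uint32_t vpn; uint32_t pte; int valid; };
struct tlb { struct tlb_entry e[TLB_ENTRIES]; unsigned next; };

/* Start with every slot invalid. */
void tlb_init(struct tlb *t)
{
    memset(t, 0, sizeof *t);
}

/* Returns 1 and the cached PTE on a hit, 0 on a miss. */
int tlb_lookup(const struct tlb *t, uint32_t vpn, uint32_t *pte)
{
    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        if (t->e[i].valid && t->e[i].vpn == vpn) {
            *pte = t->e[i].pte;
            return 1;
        }
    }
    return 0;
}

/* Fill after a page walk; round-robin replacement overwrites the
   slot that was filled longest ago. */
void tlb_fill(struct tlb *t, uint32_t vpn, uint32_t pte)
{
    t->e[t->next] = (struct tlb_entry){ vpn, pte, 1 };
    t->next = (t->next + 1) % TLB_ENTRIES;
}
```

In this model, an entry filled for one page stays cached regardless of what happens to the PTE in memory, but it is gone after TLB_ENTRIES further fills for other pages, which is the role the debugger's own memory traffic would play.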

> Is there a way to disable DTLB use?

No, of course not. That would be silly. Why would you want to?

> Does page-table caching affect this in any way?

Do you mean “caching” as in “L1 cache” and “L2 cache”? I don’t think
so. The explanation for your specific question is probably rather simple.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi Tim,

Thanks for replying.

I did give this some thought, and to rule it out I tried setting the
global page bit in the PTE before the DTLB load. Once the DTLB was loaded,
I cleared the present bit and observed the same behaviour as before.

The chances of a global page being invalidated between two consecutive
instructions that reference data on the same page, even with the debugger
running in the background, are very small.

I don't dismiss your point entirely, but more thoughts are welcome.

Thanks,
Pawan


> As far as my understanding on TLB loading on Intel processors goes, the DTLB
> or ITLB get loaded whenever a page lookup happens successfully after a trip
> to page tables. To see this first-hand on Windows 2000 (Pentium 4, non-HT),

From what I remember, either INVLPG or a MOV to CR3 (as on an address space
switch) must be executed each time a present PTE is updated, and Windows
really does this.

Updates to a non-present PTE, including the update that makes the PTE
present, do not require TLB invalidation.
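
The rule above can be written down as a small predicate. This is a minimal sketch, assuming the x86 PTE layout with the present bit in bit 0; needs_tlb_invalidation is a hypothetical name, not a Windows routine. The invalidation itself (INVLPG or a CR3 reload) is privileged and is deliberately left out so the snippet stays runnable in user mode.

```c
#include <stdint.h>

#define PTE_PRESENT 0x001u  /* bit 0: P (present/valid) */

/* Returns 1 if changing a PTE from old_pte to new_pte calls for a TLB
   invalidation under the rule above: only a PTE that was present can
   have been cached, so only changes to a present PTE need one. */
int needs_tlb_invalidation(uint32_t old_pte, uint32_t new_pte)
{
    if (old_pte == new_pte)
        return 0;               /* nothing changed */
    return (old_pte & PTE_PRESENT) != 0;
}
```

Clearing the present bit of a valid PTE therefore returns 1, while making a non-present PTE present returns 0.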

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com