Context switching ...

Hi all!

I’d like to know exactly (to be sure): is it TRUE that during a
ring3 -> ring0 switch the thread’s context is not switched by the OS, even on
a multi-CPU platform?
So, if it’s true, I need not care about context switching while my driver
is running in the context of a particular process.
Many books say that it’s true, but how does the preemptive concept work in
this case?

Thanks!

Michael

It is true. The current process remains the same, and you can still
reference memory in that process’s address space in your driver.

You are currently subscribed to ntdev as: xxxxx@nryan.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

  • Nick Ryan (MVP for DDK)

Hello Nick,

Excuse my misunderstanding, but in this case how does the OS handle
processes running on a multi-CPU platform? They use some global
structures that have to be changed according to the current process, like
the PEB… Or does the OS create global structures for each processor? Would
you be so kind as to tell me where I can read about process handling on a
multi-CPU platform? I need to know what’s going on on a multi-CPU machine
with respect to the processes in the system, and how the OS manipulates
process “objects” there.

MA

Thursday, July 24, 2003, 10:40:23 AM, you wrote:

NR> It is true. The current process remains the same, and you can still
NR> reference memory in that process’s address space in your driver.

From what I understand, all page table directories for all processes
exist at different kernel-mode addresses simultaneously. Some register
(CR3 on IA32) on each CPU will point to the correct page table directory
for the process associated with the thread that that CPU is currently
executing. Also, the FS register on each CPU references a structure
called a TIB (Thread Information Block), off of which Windows hangs all
OS-specific information about that thread it wishes to know about.

So yes, process and thread context can be independent from CPU to CPU.

Where can you read about this stuff? Well, all over… ntddk.h, ntifs.h,
WinDbg OS symbols, Google on appropriate keywords, Nagar’s book, OSR’s
book, and so on. Skill at data mining is half (OK, a quarter) of what makes
a good Windows programmer. :)


  • Nick Ryan (MVP for DDK)

> I’d like to know exactly(to be sure), is it TRUE that during switching
> ring3 -> ring0 the thread’s context is not switched by OS even on
> multi-CPU platform?

It is exactly so. The threads in the OS are divided into the following classes:

  • system threads, which belong to the System process and cannot execute user-mode
    code. PsCreateSystemThread and Ex/IoQueueWorkItem result in execution in a
    system thread context. DriverEntry and the PnP/Power paths are also called in a
    system thread context.

  • user threads, which are created by Win32 CreateThread or CreateProcess,
    belong to a particular user process and can execute both user-mode and kernel-mode
    code. They enter kernel mode on a syscall, a page fault or some other
    CPU fault.

Crossing the user/kernel boundary - in any direction - does not alter the
notion of the “current thread”.

> So, if it’s true I may not care about context switching when my driver
> is inside a context of a particular process.

The NT kernel is preemptive. If you want to suspend preemption on the current
processor, raise IRQL to DISPATCH_LEVEL.

Max

> structures that have to be changed according to the current process, like
> PEB… Or the OS creates global structures for each processor?

Yes.

> going on on the multi-CPU machine respectively to processes in system

The short description is: everything is done in the simplest possible way, with
a per-CPU notion of “current thread” (and thus current process).
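
As a rough illustration of that per-CPU notion, here is a toy user-mode model in C. It is only a sketch under stated assumptions: the type and function names (cpu_t, context_switch, enter_kernel and so on) are hypothetical and are not NT kernel APIs; only the idea of loading the process's DirectoryTableBase into CR3 comes from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these types are hypothetical, not NT kernel structures. */
typedef enum { RING0 = 0, RING3 = 3 } ring_t;

typedef struct process {
    uint32_t directory_table_base;   /* value that would be loaded into CR3 */
} process_t;

typedef struct thread {
    process_t *process;              /* owning process */
} thread_t;

typedef struct cpu {
    thread_t *current_thread;        /* per-CPU "current thread" */
    uint32_t  cr3;                   /* per-CPU page directory base */
    ring_t    ring;                  /* current privilege level */
} cpu_t;

/* A context switch changes the current thread (and CR3) on ONE cpu only. */
static void context_switch(cpu_t *cpu, thread_t *next)
{
    assert(cpu != NULL && next != NULL && next->process != NULL);
    cpu->current_thread = next;
    cpu->cr3 = next->process->directory_table_base;
}

/* A ring transition changes the privilege level and nothing else. */
static void enter_kernel(cpu_t *cpu)   { cpu->ring = RING0; }
static void return_to_user(cpu_t *cpu) { cpu->ring = RING3; }

/* The current process is derived from the current thread of that cpu. */
static process_t *current_process(const cpu_t *cpu)
{
    return cpu->current_thread->process;
}
```

In this model, calling enter_kernel on one cpu_t leaves its current_thread and cr3 untouched, and has no effect at all on any other cpu_t, which is the point made above.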

Max

I believe page tables are themselves pageable. That suggests very strongly that
if their storage is stolen and they are paged back in, they will assume a
possibly different storage location.

Nick Ryan wrote:

From what I understand, all page table directories for all processes
exist at different kernel-mode addresses simultaneously.


If replying by e-mail, please remove “nospam.” from the address.

James Antognini
Windows DDK MVP

The transition from ring 3 to ring 0 has no effect on current process or
current thread. Amongst the things that do change:

  1. Full 4G (in 32-bit systems) addressability, versus only user-space
    addressability.
  2. CPL 0 is in force.
  3. Kernel stack is used, once the interrupt handler instates it. That is,
    this is purely the result of software.
  4. Libraries written for user-mode execution will not work. Again, this is a
    software issue. A consequence is that it is usually not practicable to call
    back into user space whilst remaining at CPL 0.

Things that do not change:

  1. Pageability, i.e., the ability to reference virtual storage that may not be
    backed.
  2. Thread priority.
  3. Preemptibility. I think this includes the time slice.

Off the top of my head, I cannot think of anything that would be different on
a single- or multiple-CPU system. It’s true that on a single-CPU system,
going to DISPATCH_LEVEL will not involve physical disablement of interrupts,
but that’s not something related to the current question.

Michael Alekseev wrote:

is it TRUE that during switching
ring3 -> ring0 the thread’s context is not switched by OS even on
multi-CPU platform?



James Antognini
Windows DDK MVP

Note that the ring transition is a per-processor pure-hardware thing. P1
could be in Ring 0 while P2 is in Ring 3 or vice versa. The hardware level
transition is well documented in the Intel manuals. Also note that things
such as DISPATCH_LEVEL don’t exist in the hardware, they’re OS
abstractions; the hardware does what’s written in the IDT entry.

Alberto.

Alberto, James, Max, et al.,

Thanks for providing all this information.

Has anyone tried using a disassembler/debugger together with the Intel
literature to trace the actual OS handler routines and the hardware behavior
for sysenter and sysexit? XP mainly relies on these; int 2e may be there just
for backward compatibility or service call extensibility. On XP the int 2e
route usually appears to be unused: I checked by installing a hook that prints
a message, and nothing comes out.

If there is any information available about the whole instruction sequence and
its mechanics, that would make a good doc! It might be that, by using MSRs
etc., the stack copying/swapping between user and kernel is faster than the
system call dispatcher of NT/2K, but any quantitative analysis would be great.
Then maybe factor in NUMA/UMA, hyperthreading, etc. (the last one may be
superficial).

I’m saving these dialogues.

Thanks,
-prokash

Page tables can be paged out, but page directories cannot be.

  • Nick Ryan (MVP for DDK)

Nick,

Is that the reason a double fault would occur if the directories
were paged out?

-prokash

Yes. The PT page can be paged out if there are no “present” PTEs in it. Each
present PTE holds 1 reference to a PT page.
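
The rule can be sketched in a few lines of user-mode C. This is a simplified model, not the memory manager's actual bookkeeping: it assumes non-PAE x86 (1024 four-byte PTEs per page table, bit 0 as the Present bit), and the helper names are hypothetical. A real implementation would presumably maintain the reference count incrementally rather than rescanning the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTES_PER_TABLE 1024u  /* non-PAE x86: 1024 four-byte PTEs per 4 KB table */
#define PTE_PRESENT    0x1u   /* bit 0 of an x86 PTE is the Present bit */

/* Hypothetical helper: each present PTE counts as one reference
   held on the page table page that contains it. */
static size_t count_present_ptes(const uint32_t *page_table)
{
    assert(page_table != NULL);
    size_t refs = 0;
    for (size_t i = 0; i < PTES_PER_TABLE; i++) {
        if (page_table[i] & PTE_PRESENT) {
            refs++;
        }
    }
    return refs;
}

/* Hypothetical helper: the page table page is a candidate for
   paging out only when no present PTE references it. */
static bool pt_page_is_pageable(const uint32_t *page_table)
{
    return count_present_ptes(page_table) == 0;
}
```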

Max

----- Original Message -----
From: “James Antognini”
Newsgroups: ntdev
To: “Windows System Software Developers Interest List”
Sent: Thursday, July 24, 2003 9:14 PM
Subject: [ntdev] Re: Context switching …

> I believe page tables are themselves pageable. That suggests very strongly that
> if their storage is stolen and they are paged back in, they will assume a
> possibly different storage location.

They can, but only when the process is completely outswapped.

----- Original Message -----
From: “Nick Ryan”
To: “Windows System Software Developers Interest List”
Sent: Friday, July 25, 2003 12:33 AM
Subject: [ntdev] Re: Context switching …

> Page tables can be paged out, but page directories cannot be.

Interesting… so the Balance Set Manager does this when it deems the
process to be inactive? If so, how does the swapback occur? I took a
quick look at KeStackAttachProcess and children, and in KiAttachProcess
the system just throws the value of the DirectoryTableBase field into
CR3 without any additional checking.

Maxim S. Shatskih wrote:

They can, but only when the process is completely outswapped.

- Nick Ryan (MVP for DDK)

> Interesting… so the Balance Set Manager does this when it deems the
> process to be inactive?

IIRC yes. At least the whole working set is trimmed, so that all process pages
can be outswapped.

> If so, how does the swapback occur?

If a thread of the outswapped process (or a thread with an outswapped kernel
stack) is awakened by the dispatcher, then the Balance Set Manager is invoked to
inswap the stuff back.

> quick look at KeStackAttachProcess and children, and in KiAttachProcess
> the system just throws the value of the DirectoryTableBase field into
> CR3 without any additional checking.

Maybe there are some means of preventing KeStackAttachProcess from
attaching to an outswapped process? Maybe it triggers an inswap and waits for it?

Max

Figured it out, mostly. There’s actually a big fat subroutine that
KiAttachProcess invokes if the process being attached to is swapped out
(darn compiler optimizer is splitting many of these functions into 20 or
more discontiguous pieces that don’t appear to be common code - is this
really optimization?). In it, it sticks the process on a list called
KiProcessInSwapListHead and signals an (apparently permanent) worker
thread called KiSwappingThread to wake up. This thread then awakens and
(among other things) swaps in all processes in KiProcessInSwapListHead.
I’m not sure then how KiAttachProcess waits for the swapin to complete,
but let’s assume it does.

And of course while swapping in the process (via MmInSwapProcess), the
physical page for the PD is restored. This is done by invoking
MiMakeOutswappedPageResident on the _EPROCESS structure’s captive PTE
for the page, data member PageDirectoryPte. MiMakeOutswappedPageResident
digs the page out of the appropriate page file with an IoPageRead.

I guess all this penny-pinching of pages is still necessary for
memory-sparse environments like Windows Embedded. (The PD is ONE page,
fer chrissake). If not, there’s probably a lot of logic in the NT kernel
that simply isn’t necessary in an age where we routinely have more CPU
and RAM than we know what to do with. :)

  • Nick Ryan (MVP for DDK)

> I guess all this penny-pinching of pages is still necessary for
> memory-sparse environments like Windows Embedded. (The PD is ONE page,

4 for PAE :)

Max

> The one page at 0xc0300000, the PD itself, is covered by that same one
> entry, but the PTs in 0xc0000000-0xc0300000 need PTEs of their own
> within the PD when mapped (which must be an infrequent and protected

These PTEs are PDEs themselves.

Yeah, I guess that works… take some page-aligned VA from
0xc0000000-0xc03FF000, pull out the middle 10 bits (bits 12-21) and shift
that index left by 22, and you get the base of the 4 MB address range that
the page table at this VA represents. No need to waste PDEs to map PTs.
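
The self-map arithmetic discussed here can be checked with a small user-mode C sketch. It assumes the non-PAE 32-bit x86 layout mentioned in this thread (page tables visible from 0xc0000000, the page directory at 0xc0300000); the macro and function names are made up for illustration and are not taken from any Windows header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed non-PAE x86 self-map layout, per the addresses in this thread. */
#define PTE_BASE 0xC0000000u  /* start of the 4 MB window of page tables */
#define PDE_BASE 0xC0300000u  /* the page directory, inside that window  */

/* VA of the PTE mapping `va`: one 4-byte entry per 4 KB page. */
static uint32_t pte_va_for(uint32_t va)
{
    return PTE_BASE + ((va >> 12) << 2);
}

/* VA of the PDE mapping `va`: one 4-byte entry per 4 MB region. */
static uint32_t pde_va_for(uint32_t va)
{
    return PDE_BASE + ((va >> 22) << 2);
}

/* Given the page-aligned VA of a page table inside the self-map,
   return the base of the 4 MB range that page table describes. */
static uint32_t range_base_for_pt(uint32_t pt_va)
{
    assert(pt_va >= PTE_BASE && pt_va <= 0xC03FF000u);
    assert((pt_va & 0xFFFu) == 0);
    uint32_t pd_index = (pt_va >> 12) & 0x3FFu;  /* middle 10 bits */
    return pd_index << 22;
}
```

For example, pte_va_for(0xC0000000) evaluates to 0xC0300000: the "PTEs" that map the page-table window are the entries of the page directory itself, which is why no extra PDEs are needed to map the PTs. Likewise, pte_va_for(0xC0300000) and pde_va_for(0xC0300000) both evaluate to 0xC0300C00, the single self-referencing entry.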

  • Nick Ryan (MVP for DDK)

> I guess all this penny-pinching of pages is still necessary for
> memory-sparse environments like Windows Embedded. (The PD is ONE page,
> fer chrissake). If not, there’s probably a lot of logic in the NT kernel
> that simply isn’t necessary in an age where we routinely have more CPU
> and RAM than we know what to do with. :)

Page tables are still a precious resource on any IA-32 machine, and get more
precious the more memory you add above 4G. For that matter, kernel space in
general is precious on a large machine.

Loren