Transition from User to Kernel

Hi All,

This may be a very basic question, but I need to know the Windows-specific answer. :)
How does the transition from user mode to kernel mode happen?
I know that when a user-mode app wants to access a device it has to call DeviceIoControl etc.; then the I/O manager comes into play and
generates an IRP and all.

I just want to know how a DeviceIoControl call or similar invokes the I/O manager. What interrupt or whatever
does this trick?

Pardon me if I really asked a very basic question.

Thanks.
./TuTen

> I just want to know how a deviceIoCall or similar will invoke IoMgr what interrupt or whatever??? do this trick.

The SYSENTER instruction on modern systems and INT 0x2E on older ones make this transition. If you are interested in the general topic of UM/KM and KM/UM transitions on x86 and x86_64 systems, I would suggest reading the Intel Developer’s Manuals (Volume 3 is of particular interest here).

> This may be a very basic question but need to know the windows specific answer.

If you want to learn about Windows-specific implementations I would suggest getting a copy of Windows Internals…

Anton Bassov

What Mr. Bassov said: SYSENTER or SYSCALL.

And, as he said, on *very* old Windows systems, it was INT 0x2E – which still works for the sake of compatibility.

The use of SYSCALL/SYSENTER is largely responsible for the vast reduction in overhead for changing privilege levels. Despite the fact that these mechanisms have been in place for years, we still hear people voice the old claim “ring transitions are expensive.” These days, not so much.

Peter
OSR
@OSRDrivers

xxxxx@gmail.com wrote:

> This may be a very basic question but need to know the windows specific answer. :)

I just wanted to add one note here. There isn’t a Windows-specific
answer, because this is not a Windows-specific mechanism. This is an
attribute of the x86 architecture. All virtual memory operating systems
on x86 processors use the same technique. The ARM has a similar
instruction for doing privilege mode transitions.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Well… Yes and no. There are many possible mechanisms to perform a ring transition on an x86 or x64 or ARM system. While I think we could all quickly agree on which are the “sensible” mechanisms, it’s not like a given OS *has* to use the one that we agree is most “sensible.”

Consider, “INT 2E” vs “SYSCALL”, for example.

Also, the actual DETAILS of the implementation will vary. Take ARM, for example. While the “sensible” way to enter Supervisor Mode is obviously the SWI/SVC instruction… what OPERAND (immediate value) is used? In Windows, it’s always “SVC #1”, if I’m not mistaken.

So, while the list of ways one COULD get into kernel-mode is finite, and the “sensible” way to do it is yet even more limited, the definitive answers *are* truly OS-specific.

Peter

> I just want to know how a deviceIoCall or similer will invoke IoMgr what interrupt or whatever???

This is called “syscall”, and usually each CPU has some defined way to raise its execution/permission level (user->kernel, ring 3->ring 0), while calling the predefined entry point of the more privileged code.

In older Windows, it was “int 2eh”, and then it became the new “sysenter” opcode.

The effect of these opcodes:

  • stack is switched to kernel stack
  • instruction pointer value is saved to kernel stack (for return)
  • CPU’s current execution level is raised to kmode
  • the pre-defined kmode entry point is called (only the kmode code can define it).


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

As of Threshold 2 (1511 November Update), x64 versions of Windows 10 will now
potentially do INT 2Eh again. Not so ancient anymore :)

Also, SYSENTER and SYSCALL don’t actually push anything to the stack, Max.

Best regards,
Alex Ionescu

> Also SYSENTER and SYSCALL don’t actually push anything to the stack Max.

Well, if we decide to be even more precise about the arch-specific details, the statement that “current execution level is raised to kmode” is totally wrong as well. There is simply no such thing as “current execution level” on x86 and x86_64 processors - the privilege level of currently executing code is defined by that of a code segment, rather than by the state of some CPU flag. This is why so-called flat memory model still relies upon the segmentation behind the scenes…
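To make the code-segment point concrete, here is a minimal Python sketch (not from the thread) that splits an x86 segment selector into its fields. For the selector loaded in CS, the low two bits are the current privilege level. The function names are hypothetical, and the sample values 0x08 and 0x1B are the selectors commonly reported for kernel and user CS on 32-bit Windows; they are used here only as inputs.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit x86 segment selector into its three architectural fields."""
    return {
        "index": selector >> 3,                      # descriptor index in the table
        "table": "LDT" if selector & 0x4 else "GDT",  # TI bit selects the table
        "rpl": selector & 0x3,                       # requested privilege level
    }


def current_privilege_level(cs_selector: int) -> int:
    """For the selector held in CS, the low two bits are the CPL (ring number)."""
    return cs_selector & 0x3


# Sample inputs (assumed values, see the note above).
KERNEL_CS = 0x08
USER_CS = 0x1B

print(current_privilege_level(KERNEL_CS))  # ring of the kernel selector
print(current_privilege_level(USER_CS))    # ring of the user selector
```

So "raising the execution level" amounts to loading CS with a selector whose descriptor carries a more privileged level, which is the simplification both posts are arguing about.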

Anton Bassov

> Well, if we decide to be even more precise about the arch-specific details, the statement that “current execution level is raised to kmode” is totally wrong as well.

It is not totally wrong.

It is - like classic Newton mechanics, compared to 4D/xyzt relativistic mechanics - simplified and mostly valid for a big picture :)

> privilege level of currently executing code is defined by that of a code segment,

Yes. A far jump to a CS selector with kmode Descriptor Privilege Level == 0 is what switches x86 to kmode. Surely there are lots of limitations of using JMP this way. Usually INT is used for this.


Maxim S. Shatskih

> While I think we could all quickly agree on which are the “sensible” mechanisms, it’s not like a given OS *has* to use the one that we agree is most “sensible.”

Up to this day, almost 25 years later, I don’t understand why “int 2eh” is more sensible than call via Call Gate.
But somehow the Windows architects and the architects of the majority of 386 UNIXen came to the same conclusion, so there should be a reason.

> It is - like classic Newton mechanics, compared to 4D/xyzt relativistic mechanics - simplified and mostly valid for a big picture :)

You seem to be one of “Professor Flounder”’s best students. IIRC, this is EXACTLY what he said when I told him to do some research before trying to “pontificate”. I cannot immediately recall which particular way he put his foot into his mouth on that particular occasion, but I think he was claiming that the IRET instruction acks the interrupt to the interrupt controller…

> Yes. Far jumps to CS selector with kmode Descriptor Privilege Level == 0 is what
> switches x86 to kmode. Surely there are lots of limitations of using JMP this way.
> Usually INT is used for this.

Some more erroneous statements, Max…

First, you cannot use far jumps for crossing the boundary between the segments that have different privilege levels. You have to use far calls and far rets for this purpose. Furthermore, in order to make a far call to a privileged segment you have to use a call gate. However, don’t forget that the call gate happens to be yet another feature that is unavailable in 64-bit mode…

Anton Bassov

> Up to this day, almost 25 years later, I don’t understand why “int 2eh” is more sensible than call via Call Gate.

Because no one wanted to use nonportable stuff.


Maxim S. Shatskih

>said when I told him to do some research before trying to “pontificate”.

Thanks Anton, now I know that the Russian slang word “pont” is from “pontificate” and not from the Greek word “sea” (“Pont Euxenia” for “Black Sea”).

:)

>> switches x86 to kmode. Surely there are lots of limitations of using JMP this way.
>> Usually INT is used for this.
>
> Some more erroneous statements, Max…
>
> First, you cannot use far jumps for crossing the boundary between the segments that have different

Cite from the above: “lots of limitations of using JMP this way”.


Maxim S. Shatskih

NTDLL is used to call into the operating system, which is (generally) in the address range 0x80000000-0xFFFFFFFF. The operating system addresses are not accessible in user mode; therefore a special protected mechanism (a CPU instruction, which is SYSENTER; earlier it used to be INT 2E) is used to control the transition from user mode to kernel mode. NTDLL loads the system service number into the EAX register, then copies the address of the processor-specific kernel-mode transition code on the Kernel-User shared page (0x7FFE0000 + 0x300) into the EDX register, then calls through the EDX register.

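MOV     EAX, ServiceNumber                          ; system service number
MOV     EDX, MM_SHARED_USER_DATA_VA + UsSystemCall  ; 0x7FFE0000 + 0x300, per the text above
CALL    EDX                                         ; enter the transition code
RET     n                                           ; n = bytes of stack arguments to pop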

The processor-specific kernel-mode transition code depends upon whether the CPU is Intel, AMD or Pentium2 and earlier (Win2K and earlier). INT 2E vectors through the IDT (entry number 0x2E), while SYSCALL and SYSENTER vector through model-specific registers that are initialized at system boot time.
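The difference in how the entry point is found can be sketched as a toy model in Python. This is an illustration only, not Windows code: `ToyCpu` and every address below are hypothetical. The MSR numbers are the architectural ones from the Intel and AMD manuals (IA32_SYSENTER_EIP, STAR, LSTAR).

```python
IA32_SYSENTER_EIP = 0x176   # SYSENTER target instruction pointer
MSR_STAR = 0xC0000081       # legacy-mode SYSCALL: target EIP in bits 31:0
MSR_LSTAR = 0xC0000082      # 64-bit-mode SYSCALL: target RIP


class ToyCpu:
    """Toy model: only the tables the kernel fills in at boot time."""

    def __init__(self):
        self.idt = {}   # interrupt vector -> handler address
        self.msr = {}   # MSR number -> 64-bit value

    def kernel_entry(self, instruction: str, long_mode: bool = False) -> int:
        """Return the address the CPU would start executing in kernel mode."""
        if instruction == "INT 2E":
            return self.idt[0x2E]                 # vectors through the IDT
        if instruction == "SYSENTER":
            return self.msr[IA32_SYSENTER_EIP]    # vectors through an MSR
        if instruction == "SYSCALL":
            if long_mode:
                return self.msr[MSR_LSTAR]
            return self.msr[MSR_STAR] & 0xFFFFFFFF
        raise ValueError("not a system-call instruction: " + instruction)


# Hypothetical boot-time setup; all addresses are made up for the example.
cpu = ToyCpu()
cpu.idt[0x2E] = 0x80401000
cpu.msr[IA32_SYSENTER_EIP] = 0x80402000
cpu.msr[MSR_STAR] = (0x001B0008 << 32) | 0x80403000  # upper bits: selector fields, ignored here
cpu.msr[MSR_LSTAR] = 0xFFFFF80000404000

print(hex(cpu.kernel_entry("INT 2E")))
print(hex(cpu.kernel_entry("SYSENTER")))
```

In all three cases user-mode code cannot choose the target: only kernel-mode code can write the IDT or the MSRs.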

Win2K and earlier:
LEA     EDX, [ESP+4]
INT     2Eh             ; Ends up calling KiSystemService
RET

WinXP and later (Intel):
MOV     EDX, ESP
SYSENTER                ; Ends up calling KiFastCallEntry, which then calls KiSystemService
RET

AMD K6 and later:
MOV     EDX, ESP
SYSCALL                 ; Ends up calling KiSystemCall, which then calls KiSystemService
RET

KiSystemService uses the system service number (in EAX) as an index into the system service dispatch table, which contains the address of the routine in the operating system to call. This prevents an application from calling any random address in the system; an application can only call those routines that are listed in the system service dispatch table.

During its initialization, NTOSKRNL creates a function table, hereafter referred to as the System Service Dispatch Table (SSDT), for the different services it provides. Each entry in the table contains the address of the function to be executed for a given service ID. The handler looks up this table based on the service ID passed in the EAX register and calls the corresponding system service. The code for each function resides in the kernel. Similarly, another table called the System Service Parameter Table (SSPT) provides the handler with the number of parameter bytes to expect from a particular service. The handler refers to the first entry in the Service Descriptor Table for service IDs less than 0x1000 and refers to the second entry of the table for service IDs greater than or equal to 0x1000. The handler checks the validity of service IDs. If a service ID is valid, the handler extracts the addresses of the SSDT and SSPT. The handler copies the number of bytes (equal to the total number of bytes of the parameter list) described by the SSPT for the service from the user-mode stack to the kernel-mode stack and then calls the function pointed to by the SSDT for that service.
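The dispatch steps described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the actual KiSystemService logic: `ServiceTable`, `dispatch`, and the two demo handlers are hypothetical names, parameters are treated as 32-bit little-endian values, and the low 12 bits of the service ID are assumed to be the index within the selected table.

```python
import struct
from typing import Callable, NamedTuple, Sequence


class ServiceTable(NamedTuple):
    ssdt: Sequence[Callable[..., int]]  # handler for each service index
    sspt: Sequence[int]                 # parameter bytes for each service index


def dispatch(descriptor_table: Sequence[ServiceTable],
             service_id: int, user_stack: bytes) -> int:
    """Select a table, validate the ID, copy the parameters, call the handler."""
    # First table for IDs below 0x1000, second table otherwise (as in the text).
    table_index = 0 if service_id < 0x1000 else 1
    index = service_id & 0xFFF          # assumption: low 12 bits are the index
    if table_index >= len(descriptor_table):
        raise LookupError("no such service table")
    table = descriptor_table[table_index]
    if index >= len(table.ssdt):
        raise LookupError("invalid service ID")
    # Copy exactly the number of parameter bytes the SSPT announces.
    nbytes = table.sspt[index]
    kernel_stack = bytes(user_stack[:nbytes])
    args = struct.unpack("<%dI" % (nbytes // 4), kernel_stack)
    return table.ssdt[index](*args)


# Hypothetical handlers standing in for kernel routines.
def demo_add(a: int, b: int) -> int:
    return a + b


def demo_double(a: int) -> int:
    return a * 2


tables = [
    ServiceTable(ssdt=[demo_add], sspt=[8]),     # service 0x0000: two parameters
    ServiceTable(ssdt=[demo_double], sspt=[4]),  # service 0x1000: one parameter
]
user_stack = struct.pack("<3I", 5, 7, 99)        # caller's stack contents

print(dispatch(tables, 0x0000, user_stack))
print(dispatch(tables, 0x1000, user_stack))
```

The caller supplies only a number, never an address, so the bounds check on the index is what keeps an application from reaching arbitrary kernel code.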

-Pravin S. Waghurde

> Because no one wanted to use nonportable stuff.

If you don’t mind, could you please “enlighten” us and explain why a combination of far call instruction with a specific entry in GDT is less “portable” than the one of INT instruction with a specific entry in IDT (I used quotation marks here because arch-specific code just cannot be portable by its very definition, and UM/KM transition happens to be a totally arch-specific feature)…

> First, you cannot use far jumps for crossing the boundary between the segments
> that have different

> Cite from the above: “lots of limitations of using JMP this way”.

This is not a “limitation of using JMP this way” - you simply cannot use JMP instruction at all for UM/KM and KM/UM transitions, because inter-privilege jumps are not allowed by the hardware…

Anton Bassov

>> Cite from the above: “lots of limitations of using JMP this way”.

> This is not a “limitation of using JMP this way” - you simply cannot use JMP instruction at all for
> UM/KM and KM/UM transitions

What about the Task Gate?


Maxim S. Shatskih

pravin waghurde wrote:

> NTDLL loads the system service number into the EAX
> register, then copies the address of the processor-specific
> kernel-mode transition code on the Kernel-User shared
> page (0x7FFE0000 + 0x300) into the EDX register, then
> calls through the EDX register.

Weren’t you just asking how to block speakers?

>What about the Task Gate?

IIRC, there are 3 types of far jumps:

  1. A far jump to a conforming or non-conforming code segment.
  2. A far jump through a call gate.
  3. A task switch.

However, IIRC, privilege levels of the source and destination code segments have to be the same - far JMP instruction cannot be used to perform inter-privilege-level far jumps, so that you need far call and retf instructions for this purpose…
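The rule for the first two cases can be written down as a small Python model of the protected-mode privilege checks, as I read them in the Intel manual (Volume 3, privilege checking for control transfers). This is a sketch under assumptions: the class and function names are hypothetical, task switches are not modeled at all, and faults are represented by a Python exception.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeSegment:
    dpl: int                  # descriptor privilege level (0 = most privileged)
    conforming: bool = False


@dataclass(frozen=True)
class CallGate:
    dpl: int                  # least privileged level allowed to use the gate
    target: CodeSegment


class GeneralProtectionFault(Exception):
    pass


def far_direct(cpl: int, rpl: int, seg: CodeSegment) -> int:
    """Far JMP or far CALL straight to a code segment. Returns the new CPL."""
    if seg.conforming:
        if seg.dpl > cpl:
            raise GeneralProtectionFault("target is less privileged")
    elif seg.dpl != cpl or rpl > cpl:
        raise GeneralProtectionFault("non-conforming target needs DPL == CPL")
    return cpl                # a direct transfer never changes the CPL


def far_via_gate(instr: str, cpl: int, rpl: int, gate: CallGate) -> int:
    """Far JMP or far CALL through a call gate. Returns the new CPL."""
    if instr not in ("JMP", "CALL"):
        raise ValueError(instr)
    if cpl > gate.dpl or rpl > gate.dpl:
        raise GeneralProtectionFault("gate not accessible")
    target = gate.target
    if target.conforming:
        if target.dpl > cpl:
            raise GeneralProtectionFault("target is less privileged")
        return cpl
    if instr == "CALL":
        if target.dpl > cpl:
            raise GeneralProtectionFault("target is less privileged")
        return target.dpl     # only CALL may move to a more privileged level
    if target.dpl != cpl:
        raise GeneralProtectionFault("JMP cannot change privilege level")
    return cpl


kernel_code = CodeSegment(dpl=0)
gate = CallGate(dpl=3, target=kernel_code)
print(far_via_gate("CALL", cpl=3, rpl=3, gate=gate))   # ring after the call
```

In this model a ring 3 far CALL through a DPL 3 gate ends up at ring 0, while the same far JMP faults.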

In any case, all this stuff is available only in 32-bit mode anyway…

Anton Bassov

> Weren’t you just asking how to block speakers?

Well, at least he is not the one who was asking how to capture a screenshot from a kernel-mode driver…

Anton Bassov

Hello Anton Sir,

Yes, I was asking about blocking speakers. Do you have any information about it? And what about capturing screenshots from a kernel-mode driver?