Saving registers in 64-bit ASM

I have a user-mode driver (DLL) with several exported functions. Some of
these functions are stubs that redirect to the appropriate function defined
in another module. Since I don’t know the function prototypes for all the
functions that I need stubs for, I’ve implemented the stubs using ASM so I
can more easily pass the arguments along.

A stub roughly goes like this:

* Stub gets called.

* Call a C function that loads the appropriate target module and returns the
function pointer to the desired function.

* Execute a jmp to the returned function pointer.

Doing it this way has the benefit that the real function in the target
module does the “ret” since it knows the actual number and size of the
function arguments.

All of this is easy to do for x86:

DrvSomeFunction proc
    INVOKE GetTargetFunc, ADDR DrvSomeFunctionName
    ; The address of the target function is in eax
    jmp eax
    ; The target function will take care of returning for us
    ;ret
DrvSomeFunction endp

Implementing the same functionality on x64 requires a little more work since
INVOKE isn’t supported by ml64. The code below takes care of saving off the
volatile registers and then calls the same function to resolve the target
function as in the x86 example.

DrvSomeFunction proc
    ; Save off registers potentially used as function arguments
    push rcx
    push rdx
    push r8
    push r9
    ; TODO: We also want to save off XMM0-3
    sub rsp, 20h ; Make room on the stack for “spilling” arguments
    lea rcx, DrvSomeFunctionName
    call GetTargetFunc
    add rsp, 20h
    pop r9
    pop r8
    pop rdx
    pop rcx
    ; The address of the target function is in rax
    jmp rax
    ; The target function will take care of returning for us
    ;ret
DrvSomeFunction endp

This also works as expected, but I’m stuck on saving off the XMM0-3
registers. I know ksamd64.inc has helper macros for pushing volatile
registers on the stack, such as .SAVEREG and .SAVEXMM128, but according to
MSDN they have to be called in the proc’s prologue
(http://msdn.microsoft.com/en-us/library/3cwzs27h.aspx). However, since my
stubs aren’t real frame functions with a ret instruction, I’m unsure if I’m
able to make use of these macros.

And if I can’t use them, how do I push the 128-bit XMM0-3 registers on the
stack? (I assume I need to split them up into two 64-bit parts, but I’m
unclear on the right way to do that.)

Any thoughts are much appreciated.

Thanks,

Soren

> This also works as expected, but I’m stuck on saving off the XMM0-3
> registers. I know ksamd64.inc has helper macros for pushing volatile
> registers on the stack, such as .SAVEREG and .SAVEXMM128, but according to
> MSDN they have to be called in the proc’s prologue
> (http://msdn.microsoft.com/en-us/library/3cwzs27h.aspx). However, since my
> stubs aren’t real frame functions with a ret instruction, I’m unsure if I’m
> able to make use of these macros.
>
> And if I can’t use them, how do I push the 128 bit XMM0-3 on the stack (I
> assume I need to split them up into two 64 bit parts but am unclear on the
> right way to do that).
>
> Any thoughts are much appreciated.

The “Using Floating Point or MMX in a WDM Driver” page in the help might
shed some light on it.

Maybe you could disassemble the KeSaveFloatingPointState function and
see what it does? Or just call it instead of doing it yourself.

James

> Maybe you could disassemble the KeSaveFloatingPointState function and
> see what it does? Or just call it instead of doing it yourself.

Disassembling KeSaveFloatingPointState would be my only choice here since
I’m operating in user-mode.

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:bounce-445681-
xxxxx@lists.osr.com] On Behalf Of James Harper
Sent: Saturday, March 19, 2011 5:35 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] Saving registers in 64-bit ASM


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

So, looking into this, I don’t think KeSaveFloatingPointState will be of
much use here. MSDN states that:

"In 64-bit versions of Windows, the operating system preserves the MMX/x87
registers across thread (and process) switches. However, there is no
explicit calling convention for the MMX/x87 registers. Code that is produced
by the 64-bit compiler for x64 processors does not use these registers and
does not preserve them across function calls.

The use of the MMX/x87 registers is strictly prohibited in 64-bit
kernel-mode code."

As expected, KeSaveFloatingPointState in the 64-bit ntoskrnl.exe is just a
no-op.

My issue at hand is how to save off the XMM registers in 64-bit user-mode
where those registers are a part of the x64 calling convention
(http://msdn.microsoft.com/en-us/library/ms235286(v=vs.80).aspx).


Soren Dreijer wrote:

> So, looking into this, I don’t think KeSaveFloatingPointState will be of
> much use here. MSDN states that:
> …
> My issue at hand is how to save off the XMM registers in 64-bit user-mode
> where those registers are a part of the x64 calling convention

Why are you worrying about this? The function you CALL will do this.
That’s part of the contract. You only need to save the XMM registers if
you are going to use the XMM registers within your function. You don’t
worry about the registers for the functions you call. That’s their job.

The x64 calling convention is non-obvious and complicated.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> Why are you worrying about this? The function you CALL will do this.
> That’s part of the contract. You only need to save the XMM registers if
> you are going to use the XMM registers within your function. You don’t
> worry about the registers for the functions you call. That’s their job.

Hmm, then I’m reading the MSDN docs wrong. They explicitly state that “The
registers RAX, RCX, RDX, R8, R9, R10, R11 are considered volatile and must
be considered destroyed on function calls”. Since I’m calling my C helper
function before I redirect to the actual target function, I have to save off
any registers that the C function will potentially use and overwrite.

Is that incorrect?

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:bounce-445877-
xxxxx@lists.osr.com] On Behalf Of Tim Roberts
Sent: Monday, March 21, 2011 11:58 AM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Saving registers in 64-bit ASM


Soren Dreijer wrote:

>> Why are you worrying about this? The function you CALL will do this.
>> That’s part of the contract. You only need to save the XMM registers if
>> you are going to use the XMM registers within your function. You don’t
>> worry about the registers for the functions you call. That’s their job.
> Hmm, then I’m reading the MSDN docs wrong. They explicitly state that “The
> registers RAX, RCX, RDX, R8, R9, R10, R11 are considered volatile and must
> be considered destroyed on function calls”. Since I’m calling my C helper
> function before I redirect to the actual target function, I have to save
> off any registers that the C function will potentially use and overwrite.
>
> Is that incorrect?

Well, we’re both right, from our own point of view, but I was thinking
about it a little upside down. Since the parameters to the original
function are in rcx, rdx, r8 and r9, you certainly need to save and
restore those. There isn’t going to be anything useful in r10 and r11
to begin with, so there’s no point in restoring those. And I don’t
think there’s going to be anything useful in the XMM registers, either.
So even if your helper function modifies them, it will restore the ones
that need restoring.

The function that called you will already have allocated room for you to
save the four parameter registers (rcx, rdx, r8, r9). You don’t have
to do that. However, you have to provide room for the NEXT function to
store them.

You don’t see the push and pop instructions very much in 64-bit code.
The idiom is more like this:

mov [rsp+8], rcx ; store in my register parameter area
mov [rsp+16], rdx
mov [rsp+24],r9
mov [rsp+32],r10
sub rsp, 40 ; make room for helper’s register area
call helper
add rsp, 40
mov rcx, [rsp+8]
mov rdx, [rsp+16]
mov r9, [rsp+24]
mov r10, [rsp+32]


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Tim Roberts wrote:

> You don’t see the push and pop instructions very much in 64-bit code.
> The idiom is more like this:
>
> mov [rsp+8], rcx ; store in my register parameter area
> mov [rsp+16], rdx
> mov [rsp+24],r9
> mov [rsp+32],r10

Whoops, make that r8 and r9 instead of r9 and r10.

> mov r9, [rsp+24]
> mov r10, [rsp+32]

Ditto here.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
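
With that correction applied, the whole stub might look like the sketch below. It is only an illustration: the two EXTERN declarations are assumptions about how the GetTargetFunc helper and the DrvSomeFunctionName string from the original post are defined, and the sketch only preserves the four integer argument registers.

```asm
EXTERN GetTargetFunc:PROC           ; assumed: the C helper from the original post
EXTERN DrvSomeFunctionName:BYTE     ; assumed: name string defined in another module

.code

DrvSomeFunction PROC
    ; Save the integer argument registers in the register parameter
    ; area that our caller allocated just above the return address.
    mov [rsp+8],  rcx
    mov [rsp+16], rdx
    mov [rsp+24], r8
    mov [rsp+32], r9

    sub rsp, 40                     ; register parameter area for the helper
    lea rcx, DrvSomeFunctionName    ; single argument to the helper
    call GetTargetFunc              ; target address comes back in rax
    add rsp, 40

    ; Restore the caller's arguments before forwarding the call.
    mov rcx, [rsp+8]
    mov rdx, [rsp+16]
    mov r8,  [rsp+24]
    mov r9,  [rsp+32]

    jmp rax                         ; the target function executes the ret
DrvSomeFunction ENDP

END
```

Note that the sketch declares no unwind information (PROC FRAME with .allocstack and .endprolog), which a function that adjusts rsp and calls another function would normally provide.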

> The function that called you will already have allocated room for you to
> save the four parameter registers (rcx, rdx, r8, r9). You don’t have to do
> that. However, you have to provide room for the NEXT function to store them.

Right, that’s what I did in my initial example:
sub rsp, 20h ; Make room on the stack for “spilling” arguments

> And I don’t think there’s going to be anything useful in the XMM
> registers, either. So even if your helper function modifies them, it will
> restore the ones that need restoring.

How do you know there won’t be anything useful in those registers? And more
importantly, why would the helper function restore them if the MSDN docs say
they’re volatile? See http://msdn.microsoft.com/en-us/library/9z1stfyw.aspx

> mov [rsp+8], rcx ; store in my register parameter area
> mov [rsp+16], rdx
> mov [rsp+24],r9
> mov [rsp+32],r10
> sub rsp, 40 ; make room for helper’s register area

Why do you change the stack pointer by 40 rather than 32?

> You don’t see the push and pop instructions very much in 64-bit code.

Good point. I rarely write 64-bit ASM by hand, so the push/pop “difference”
from 32-bit wasn’t something I was aware of.

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:bounce-445911-
xxxxx@lists.osr.com] On Behalf Of Tim Roberts
Sent: Monday, March 21, 2011 2:26 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Saving registers in 64-bit ASM


Soren Dreijer wrote:

>> And I don’t think there’s going to be anything useful in the XMM
>> registers, either. So even if your helper function modifies them, it
>> will restore the ones that need restoring.
> How do you know there won’t be anything useful in those registers? And more
> importantly, why would the helper function restore them if the MSDN docs say
> they’re volatile: http://msdn.microsoft.com/en-us/library/9z1stfyw.aspx

If you know the functions you are intercepting will use floating point
values, then yes, you will have to save and restore XMM0, 1, 2, and 3.

The other volatile registers (RAX, R10, R11, XMM4, XMM5) are never used
to pass information between functions. Therefore, the value they had at
the time of the call is irrelevant. Even if they get changed, no one
will have made any assumptions about their value.

> Why do you change the stack pointer by 40 rather than 32?

To maintain 16-byte alignment. The return address uses up the other 8
bytes.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> If you know the functions you are intercepting will use floating point
> values, then yes, you will have to save and restore XMM0, 1, 2, and 3.

Right, which is what I expected. I’m not 100% sure that floating point
values aren’t used in the helper function since it also calls into other
functions that could potentially use floating point values.

So, with that said, what’s the Right Way ™ to save off the XMM0-3
registers?

> To maintain 16-byte alignment. The return address uses up the other 8
> bytes.

Ah, right!

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:bounce-445939-
xxxxx@lists.osr.com] On Behalf Of Tim Roberts
Sent: Monday, March 21, 2011 3:14 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Saving registers in 64-bit ASM


Soren Dreijer wrote:

>> If you know the functions you are intercepting will use floating point
>> values then yes, you will have to save and restore XMM0, 1, 2, and 3.
> Right, which is what I expected. I’m not 100% sure that floating point
> values aren’t used in the helper function since it also calls into other
> functions that could potentially use floating point values.

The question is, does the function you are intercepting have any
floating point parameters? If it doesn’t, then you can forget about the
XMM registers altogether. Remember, the ONLY registers you need to save
and restore are the registers that the caller is trying to pass to the
original, unmolested callee.

> So, with that said, what’s the Right Way ™ to save off the XMM0-3
> registers?

Should be:
movaps [rsp+48], xmm0
movaps [rsp+64], xmm1
movaps [rsp+80], xmm2
movaps [rsp+96], xmm3


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
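
The movaps offsets above need stack space beyond the 40 bytes reserved in the earlier idiom, and movaps faults if its memory operand is not 16-byte aligned. One self-consistent layout is sketched below. The 104-byte frame size and the slot offsets are choices made for this illustration, not taken from the thread, and the EXTERN declarations are assumptions about how the helper and the name string are defined.

```asm
EXTERN GetTargetFunc:PROC           ; assumed: the C helper from the original post
EXTERN DrvSomeFunctionName:BYTE     ; assumed: name string defined in another module

.code

DrvSomeFunction PROC
    ; Integer arguments go into the caller-provided register parameter area.
    mov [rsp+8],  rcx
    mov [rsp+16], rdx
    mov [rsp+24], r8
    mov [rsp+32], r9

    ; 104 = 32 (helper's register area) + 64 (four XMM slots) + 8 (padding).
    ; rsp is 8 mod 16 on entry because of the return address, so after
    ; this subtraction rsp and every XMM slot below are 16-byte aligned.
    sub rsp, 104
    movaps xmmword ptr [rsp+32], xmm0
    movaps xmmword ptr [rsp+48], xmm1
    movaps xmmword ptr [rsp+64], xmm2
    movaps xmmword ptr [rsp+80], xmm3

    lea rcx, DrvSomeFunctionName
    call GetTargetFunc              ; target address comes back in rax

    movaps xmm0, xmmword ptr [rsp+32]
    movaps xmm1, xmmword ptr [rsp+48]
    movaps xmm2, xmmword ptr [rsp+64]
    movaps xmm3, xmmword ptr [rsp+80]
    add rsp, 104

    mov rcx, [rsp+8]
    mov rdx, [rsp+16]
    mov r8,  [rsp+24]
    mov r9,  [rsp+32]

    jmp rax                         ; the target function executes the ret
DrvSomeFunction ENDP

END
```

Because the stub cannot know which arguments are floating point, it saves all four XMM argument registers unconditionally. That costs a few extra stores and loads per call but keeps the stub independent of the target's signature.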

> The question is, does the function you are intercepting have any floating
> point parameters? If it doesn’t, then you can forget about the XMM
> registers altogether. Remember, the ONLY registers you need to save and
> restore are the registers that the caller is trying to pass to the
> original, unmolested callee.

Precisely, but the problem is that I’m making stubs for functions with
unknown signatures. That’s why I use the “jmp” instruction and also just
pass through whatever arguments the caller happened to pass along (and
consequently have to take great care not to mess them up).

> Should be:
> movaps [rsp+48], xmm0
> movaps [rsp+64], xmm1
> movaps [rsp+80], xmm2
> movaps [rsp+96], xmm3

Perfect!

I appreciate your help Tim. Many thanks!

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:bounce-445956-
xxxxx@lists.osr.com] On Behalf Of Tim Roberts
Sent: Monday, March 21, 2011 5:02 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Saving registers in 64-bit ASM
