Calling Conventions

Just out of curiosity, why is there such a myriad of calling conventions: stdcall, cdecl,
fastcall, thiscall? What purpose does each one serve? What would be the advantage of one
over the others? Where did they come from? Why is Windows exclusively stdcall? Lots of
questions, I know, but these are just a few I wanted clarified in my head…

asa

Each calling convention has its own advantage.

__fastcall is good when a function has one or two arguments, which are passed
via the ECX and EDX registers (Microsoft compilers). This convention is used
for frequently called functions with one or two arguments (many kernel
functions use it).

__cdecl is used especially when a function has a variable number of arguments,
because the caller cleans up the stack after the call.

__stdcall makes code “smaller” because the stack is cleaned up inside the
called function rather than at every call site.

__thiscall is something like __fastcall and is used for class member
functions: the first argument (the pointer to the class object, “this”) is
passed via the ECX register and the other arguments are passed on the stack.

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: unknown lmsubst tag argument: ‘’
To unsubscribe send a blank email to xxxxx@lists.osr.com

__cdecl is the worst one, but supports functions with variable number of
parameters like main() or printf(). It uses the “_name” mangling style.

Calling a __cdecl function:

call _myfunc
add esp, ArgSize

__stdcall (also named PASCAL for Win16 compatibility) uses the
“_name@ArgSize” mangling style. Calling a __stdcall function:

call _myfunc@8

it ends with:

ret 8

__fastcall is used to pass 2 parameters via registers (EBX and ECX IIRC).
Its mangling style is “@name@ArgSize”, and it is a good idea for small
functions with excessive use of register variables.

C++ also has __thiscall, which passes “this” via ECX. COM methods, by
contrast, are declared __stdcall, so “this” is pushed on the stack as the
first argument.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com


Well, let’s go:

[cdecl]
pushes the arguments from right to left onto the stack. PUSHing is done by
the caller but cleaning up the stack is done by the callee (!) which is
unusual. (still in use for some RTL functions - otherwise internally in C
programs)

[pascal]
pushes the arguments from left to right onto the stack. The caller has to
cleanup the stack. (was used on Win16)

[stdcall]
pushes the arguments from left to right onto the stack. PUSHing the
arguments is done by the caller as usual. And cleanup is done by the callee
as it already has a predetermined number of arguments.
It is a mix of pascal and cdecl somehow.

[fastcall]
passes arguments in registers (if not all fit, I believe the rest is pushed
onto the stack?!). All C-compilers handle this differently. Watcom does not
understand the fastcall directive anyway.

[register] (Delphi specific)
passes as many arguments as possible in registers (3 of the 4 general
purpose registers are used); the rest go onto the stack (like stdcall).

[thiscall]
passes the this pointer (referring to the current object) [usually] via a
register (MS uses ECX, Borland EAX …) and I believe the rest like the
cdecl (but I am not at all sure about that).

[naked] (MS specific)
naked skips prolog and epilog for the function. This was introduced for
(DOS?!!) drivers but found many friends. That’s why it still exists. The
programmer is responsible to provide a prolog and epilog (e.g. ENTER/LEAVE).

So from my comments you should have seen that, except for stdcall and cdecl,
the calling conventions can be quite compiler specific. Hence you would have
to say what compiler you are referring to or even give a specimen ;)

An excellent description of all this can be found in Kris Kaspersky’s
book “Hacker Disassembling Uncovered” (ISBN 1-931769-22-2). He discusses the
topic from different points of view and on several occasions throughout the
book.

On the other hand, the compiler docs should provide specific information for
each compiler.

Oliver

May the source be with you, stranger … ;)

Hi Asa,

To start with, we’ll begin with the origin of calling conventions…

Obviously, all compilers must have some sort of standard for how to pass
data from one function to another. For function foo() in foo.c to call a
function bar(int x, int y) in bar.c, the compiler needs to know where foo()
stores the arguments and how bar() picks them up.

There are many different ways that we could achieve this, for instance we
could store bar()'s arguments in a static location, essentially making x
and y global variables. This, however, doesn’t work well with recursion, as
we’d need to find a way to store away the old values of x and y, etc. I
believe early versions of Fortran did have this form of argument passing,
but modern Fortran certainly doesn’t.

So, a much better way is to store values on the calling stack. This is
conventionally how all Algol-related languages work: C, Pascal, Modula-2
and Modula-3.

One thing though: if the call is bar(1, 2), do we push 1 on the stack first,
then 2, or do we push 2 first, then 1? Well, there’s no real advantage of
one over the other, except for the fact that C has the “variable arguments”
issue. If we have a function that takes variable arguments, it becomes
very hard to push the first argument first, because the callee will have no
way to determine where the first argument is (unless of course we pass some
extra data to point to it, but that is inefficient and unnecessary if we
can just turn it the other way around). So the standard C calling
convention, for ease of implementing “variable arguments”, is to push the
last argument first, so that the first argument is on top of the stack
(right next to the return address) when arriving in the called function.

So why doesn’t everyone use this? Well, the pascal compilers tend(ed) to
use the opposite calling convention, where the first argument is passed
first. Don’t ask me why… Maybe Mr Wirth will be able to explain…

To the compiler writer, it doesn’t make a whole lot of difference, as the
code to push the arguments would be implemented in a recursive function,
and we can implement both in one function, by recursing either before or
after the “push this argument”. Let’s assume that we’ve parsed and
translated the arguments into a linked list, and either cMode or pascalMode
is true depending on what order we want:

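    struct ArgList
    {
        struct ArgList *next;
        /* ... whatever else describes the argument ... */
    };

    /* Emit the pushes for one call; cMode and pascalMode are set elsewhere. */
    void passArgs(struct ArgList *al)
    {
        if (al == NULL)
            return;
        if (pascalMode)
            pushThisArg(al);    /* first argument is pushed first */
        passArgs(al->next);
        if (cMode)
            pushThisArg(al);    /* last argument is pushed first */
    }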

There is no evidence that one of those is better than the other, aside
from C’s ability to handle variable arguments of course.

However, there is another ingredient here. Who’s responsible for cleaning
up the stack after the call: the callee or the caller? Since C supports
calling the same function with 1, 3, 17 or 465 arguments, the only place
where we know how many arguments have to be removed from the stack is
where the function was called. However, for a function that always takes a
known number of arguments, we could just as well remove the arguments in
the callee. There is an advantage to the latter method, and that is that we
reduce the size of the code, because we don’t have to add an extra
instruction after every call to the function to remove the arguments from
the stack. In a large system, this could easily make a big difference.

So, the origin of all this is the fact that some people wanted to
interface, for instance, Pascal or Fortran code with C.

Then we have optimisations. We can call functions with arguments passed in
registers. This saves the processor storing (pushing) the arguments on the
stack, and then picking them up from the stack again in the callee.

And finally, we may need to call C++ functions, and they have a “this”
pointer, which is the pointer to the actual object that the function is
tied to. Example:

    class X
    {
    public:
        void foo(void);
    };

    int main()
    {
        X a, *p;

        p = new X;

        a.foo();    // "this = &a"
        p->foo();   // "this = p"

        delete p;
        return 0;
    }

The thiscall calling convention is used for all C++ member functions that
do not take variable arguments and are not explicitly declared cdecl. It
passes “this” in ECX. With the C calling convention, “this” is instead
pushed as an extra “first argument” (i.e. pushed last of all items).

stdcall is used as the interface to the OS because it’s more
space-optimised than the “standard” C calling convention: it removes the
need for an “add esp, N” after every call to the Windows kernel. With the
rather large number of kernel calls in a standard distribution of Windows,
this is probably a very good thing.

cdecl is used for standard applications, simply because it’s the “standard
C” calling convention. The main reason for this is that it works with “K&R
C”, as it doesn’t require prototypes to work correctly. Consider:

    file1.c:

    int bar(int x, int y, int z)
    {
        if (x == 1)
            return 4;
        if (x == 2)
            return y * 4;
        if (x == 3)
            return y * z * 4;
        return 0;
    }

    file2.c:

    int main()
    {
        int a, b, c;

        a = bar(1);
        b = bar(2, 6);
        c = bar(3, 42, 18);
        return 0;
    }

This is indeed valid C code, but it requires the C calling convention (and
it would probably cause all sorts of warnings in an ANSI compiler, but it
should compile and operate correctly). With callee cleanup, bar() would
always remove three arguments from the stack, even when only one was pushed.

We don’t really expect code that calls into the kernel to be written
without prototypes, so not knowing the number of arguments isn’t really a
problem there.

fastcall is one where we call with arguments in registers (which by the
way, may well be the standard calling convention in some processor
architectures, but x86 having very few registers is one of those where it’s
an “optional extra”). It should be a little bit faster, particularly if you
have many functions that take few arguments and don’t do much in the
function. It should for most cases also make shorter code.

The all-important rule is that both the caller and callee need to be aware
of the calling convention. The compiler/linker helps a little bit by
“decorating” the function names, so for instance a cdecl function “foo”
will be called “_foo”, whilst a stdcall “foo” may be “_foo@4”. The @4 in
this case indicates that the function takes 4 bytes of arguments. This
ensures that we don’t link an old version of a function with a new
prototype that has a different number of arguments.

I’m sure I’ve missed some points here, but it’s a start…


Mats

Oliver,

A pretty good description, but I beg to differ (and I checked with the MS
VS.Net help before writing this):

cdecl says that the CALLER, not the CALLEE, should remove the arguments.
There is a very good reason for this, and that is because this is the only
calling convention that supports varargs. If you have variable arguments,
the number of arguments is only known by the caller; the callee has no
knowledge of how many args were passed. [Of course, it would be possible to
pass the argument count to the callee, but that’s not how it’s done in the
standard x86-32 implementations of C that I’m aware of (MS, Borland,
gcc/g++, High C)].

stdcall is not pushing left to right according to the MS docs, but right to
left. I’m not sure if this is correct or not, but I suspect it’s correct if
it’s documented that way, especially since the code generated by the
compiler matches this as well. I just wrote a little test-piece that calls
two different functions, one stdcall and one cdecl, and they both pass
arguments in the same order. However, as you state, stdcall requires the
callee to clean up the stack.

I’m not sure about pascal (and have no way to test it, as I don’t have a
compiler that supports it nowadays), but I’m pretty sure it’s callee that
cleans up the stack. But I could be wrong.


Mats

Oliver Schneider wrote:

> Well, let’s go:
>
> [stdcall]
> pushes the arguments from left to right onto the stack. PUSHing the

Wrong, it’s *right to left* like cdecl. That’s the biggest difference
from the 16-bit PASCAL convention.

> [naked] (MS specific)
> naked skips prolog and epilog for the function. This was introduced for
> (DOS?!!) drivers but found many friends. That’s why it still exists. The
> programmer is responsible to provide a prolog and epilog (e.g. ENTER/LEAVE).

Naked is just a function attribute. It doesn’t affect the calling
convention itself.

Maxim S. Shatskih wrote:
> __fastcall is used to pass 2 parameters via registers (EBX and ECX IIRC).

The registers are actually ECX and EDX.

I really recommend reading the series about calling conventions written by Raymond Chen in his blog:

http://blogs.msdn.com/oldnewthing/archive/2004/01/02/47184.aspx
http://blogs.msdn.com/oldnewthing/archive/2004/01/07/48303.aspx
http://blogs.msdn.com/oldnewthing/archive/2004/01/08/48616.aspx
http://blogs.msdn.com/oldnewthing/archive/2004/01/13/58199.aspx
http://blogs.msdn.com/oldnewthing/archive/2004/01/14/58579.aspx

- Filip

> Obviously, all compilers must have some sort of standard for how to pass
> data from one function to another.

These are OS’s standards and not compiler’s, since the OS APIs use these
standards.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

> I’m not sure about pascal (and have no way to test it, as I don’t have a
> compiler that supports it nowadays), but I’m pretty sure it’s callee that

In Win32, PASCAL is an alias to __stdcall.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim wrote:

> > Obviously, all compilers must have some sort of standard for how to
> > pass data from one function to another.
>
> These are OS’s standards and not compiler’s, since the OS APIs use these
> standards.

Ok, I should have made a clearer statement on that.

  • The compiler needs to have a standard for calling functions between
    modules.
  • The OS needs to have a standard for applications calling its (user to
    kernel) system calls.
  • The OS also needs a calling convention for kernel drivers calling
    (kernel to kernel) system calls.

All three of these may be the same, or they may be completely different, or
they may be selected on a function-by-function basis.

For a compiler to support direct system calls (i.e. that the code generated
can just call straight into the kernel), it needs to support the OS’s
calling convention. Some compilers I’ve worked with require a library
(written in assembler) to interface to the kernel calls, but most if not
all Windows C/C++ compilers (except perhaps GCC when using it on Windows)
support calling the Windows API directly.

As it happens, the default calling convention within applications is cdecl
(standard C calling convention). The “default” calling convention for
system calls is stdcall, because it reduces the size of the code calling
the functions. Individual calls into the kernel are then given a different
calling convention where applicable, such as fastcall (for functions with
1-2 arguments) or cdecl when a variable number of arguments is used
(typically “printf” type functions).

The default calling convention for a different OS may be different,
including how the registers are used (which ones need to be preserved
across a call and which can be destroyed), and what calling convention is
used for system calls.


Mats