Collecting a stacktrace

I want to be able to walk up the stack (in kernel mode) and identify callers by module!function_name. I know how to check which module/function an address belongs to. The problem is that on x64 there is no frame pointer. How do I identify return addresses in this case? Can .pdb information help with that?

RtlCaptureStackBackTrace. Use it on all architectures, as it’s more portable than fiddling with frame pointers in assembly.

Hi Jeffery,

That’s an interesting function, although it’s not so clear what the lifetime of the returned data is. The docs say “BackTrace [out]
An array of pointers captured from the current stack trace.” Are we supposed to free that memory, or does the OS magically free it at some point?

The docs are not exactly clear on the format of the stack frames. Is each array entry just a pointer to someplace on the stack, or a pointer to some sort of context record? For debugging, it’s usually a lot more useful to convert addresses to symbols, like function names. Unlike the old days when everything was on the stack, modern calling conventions are register-based, and raw stack dumps are less useful. I know the compiler sometimes allocates stack slots for register parameters to spill into. Is there a way to force the compiler to write registers to memory on debug builds, so a stack dump would contain the parameters?

I’m always looking for ways to create better tracing and debugging.

Jan

On 7/31/17, 2:25 PM, “xxxxx@lists.osr.com on behalf of Jeffrey Tippet” wrote:

RtlCaptureStackBackTrace. Use it on all architectures, as it’s more portable than fiddling with frame pointers in assembly.

Storage is caller-allocated (3rd parameter), so you have full control over its lifetime.

The result will be an array of return addresses, one address for each function on the stack. This tells you the *call* stack (what you’d get with “k” in windbg), but it does not tell you anything about the *data* on the stack. You originally asked about call stacks, but it sounds like you’re also interested in stack data too?
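
To illustrate the caller-allocated storage point, here is a minimal user-mode sketch. It assumes the CaptureStackBackTrace name from windows.h, which maps to RtlCaptureStackBackTrace. The names capture_callers and print_callers are hypothetical, and the non-Windows branch is only a placeholder so the sketch compiles elsewhere. In a driver you would call RtlCaptureStackBackTrace directly with the same four arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: fills a caller-owned array with return addresses and
// returns how many entries were written. Nothing is allocated, so there is
// nothing to free afterwards.
std::size_t capture_callers(void** frames, std::size_t capacity) {
    assert(frames != nullptr);
#ifdef _WIN32
    // FramesToSkip = 0, no hash requested (last argument is optional).
    return CaptureStackBackTrace(0, static_cast<DWORD>(capacity), frames, nullptr);
#else
    (void)capacity;
    return 0;  // placeholder: no capture outside Windows in this sketch
#endif
}

// Prints each captured return address; symbol lookup is left to the caller.
void print_callers() {
    void* frames[32] = {};  // storage lives on our own stack
    std::size_t count = capture_callers(frames, 32);
    for (std::size_t i = 0; i < count; ++i) {
        std::printf("%p\n", frames[i]);
    }
}
```

Each entry is a plain return address, so it can be fed to whatever module!function lookup you already have.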

Extracting the stack data is trickier, as you’ve noted, since the C compiler’s optimizer can do strange and interesting things with it. Indeed in the most general case, you might have written your code in a programming language that doesn’t store any data on a “stack” as such. (For example, C# doesn’t have the concept of stack data.) The OS does not generally know how to extract data off the stack. The OS only knows how to extract an arbitrary call stack, since that’s necessary for exception unwinding.

My inclination is to not try and reconstruct the C language from within processor-level code. E.g., I don’t really advise you to try and write code that generically dumps out all local variables of arbitrary frames on the stack using only processor registers, unwind codes, and the PDB. I think you’d have more success if you either (a) delegate all that work to the debugger (which can already do this), or (b) write explicit code to print out all your various bits of high-value state whenever tracing is enabled.
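
Option (b) could look something like the sketch below. It is only an illustration: g_trace_enabled, trace_state and handle_request are hypothetical names, and it writes to stderr so it can run in user mode. A driver would route the same call to its own logging facility (DbgPrintEx or WPP, for example).

```cpp
#include <cstdarg>
#include <cstdio>

// Hypothetical runtime switch; a driver might load this from configuration.
static bool g_trace_enabled = false;

// Prints one line of state when tracing is on. Returns the number of
// characters written, or 0 when tracing is off.
int trace_state(const char* fmt, ...) {
    if (!g_trace_enabled) {
        return 0;
    }
    va_list args;
    va_start(args, fmt);
    int written = std::vfprintf(stderr, fmt, args);
    va_end(args);
    return written;
}

// Example: a function dumps the state it considers high-value on entry.
int handle_request(int request_id, unsigned flags) {
    return trace_state("handle_request: id=%d flags=0x%x\n", request_id, flags);
}
```

The point is that the code names its own interesting values explicitly, so nothing has to be reconstructed from registers or stack slots later.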

My problem is that I run at IRQL>DISPATCH_LEVEL. Is it possible to copy parts of the stack and then somehow pass it to RtlCaptureStackBackTrace in a thread with lower irql?

How does RtlCaptureStackBackTrace work?

Ah I don’t believe that you can defer a stack capture to a different processor or thread. You can, however, formulate an official request to Microsoft to document RtlCaptureStackBackTrace as callable at any IRQL.

The implementation of RtlCaptureStackBackTrace is different on each architecture. However, to generalize, x86 uses a linked list of frame pointers, while all subsequent architectures store a set of unwind codes in a separate section of the driver. The kernel uses these unwind codes to determine how many bytes of stack are used by each function on the stack.
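
As a rough model of the x86 case (not the kernel's actual code), the "linked list of frame pointers" can be pictured like this. Frame and walk_frame_chain are hypothetical names. On real x86 the saved EBP sits at [ebp] and the return address at [ebp+4]; here both are modelled as struct fields.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified model of one x86 stack frame as seen through EBP.
struct Frame {
    const Frame* saved_fp;         // caller's frame pointer ([ebp])
    std::uintptr_t return_address; // where this function returns to ([ebp+4])
};

// Follows the chain from the innermost frame outwards, collecting return
// addresses into a caller-owned array. Returns the number collected.
std::size_t walk_frame_chain(const Frame* fp, std::uintptr_t* out,
                             std::size_t capacity) {
    std::size_t count = 0;
    while (fp != nullptr && count < capacity) {
        out[count++] = fp->return_address;
        fp = fp->saved_fp;  // step to the caller's frame
    }
    return count;
}
```

A real walker would also check that every frame pointer stays inside the thread's stack limits before dereferencing it. The model also shows why a function compiled without a frame pointer (FPO) breaks the chain.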

Decoding the unwind codes is nontrivial. You can find some documentation online, but I don’t recommend you get involved in this for mere diagnostics. (As far as I know, only a handful of people have gone through the trouble to understand & write decoders, and they tend to be in the anti-malware business.)

When I experimented with this API in user mode on x86, there were quite a few differences between what windbg resolves for k and the output this API provides.

There also appeared to be a few editable globals that influence the output, such as ntdll!RtlpFuzzyStackTracesEnabled.

Here is a small program that dumps the stack trace:

:\>dir /b
rtlcapstk.cpp

:\>type rtlcapstk.cpp
#include <windows.h>
#include <stdio.h>
#define SKIPSTACK 0
#define NUMFRAMES 0x20
// Pointer-to-function type matching RtlCaptureStackBackTrace.
typedef USHORT (WINAPI *MyRtlCaptureStackBackTrace)(ULONG, ULONG, PVOID *, PULONG);
int main(void)
{
    HMODULE hNtdll = LoadLibrary("ntdll.dll");
    if (hNtdll != NULL)
    {
        MyRtlCaptureStackBackTrace capstkbktrace =
            (MyRtlCaptureStackBackTrace)GetProcAddress(hNtdll, "RtlCaptureStackBackTrace");
        if (capstkbktrace != NULL)
        {
            PVOID backstk[NUMFRAMES] = {0};
            USHORT res = capstkbktrace(SKIPSTACK, NUMFRAMES, backstk, NULL);
            for (int i = 0; i < res; i++)
            {
                // x86 build: pointers fit in 32 bits, so print them as plain hex.
                printf("%lx\n", (unsigned long)(ULONG_PTR)backstk[i]);
            }
        }
        FreeLibrary(hNtdll);
    }
    return 0;
}
:\>cl /Zi /W4 /analyze /Ox /nologo rtlcapstk.cpp /link /nologo /release
rtlcapstk.cpp

:\>rtlcapstk.exe
8105b
77183c45
774d37eb
774d37be

:\>cdb -c "g rtlcapstk!main;pc 4;k;g;q" rtlcapstk.exe| grep ChildEBP -A 40

ChildEBP RetAddr
0016fc70 002612fe rtlcapstk!main+0x59
(Inline) -------- rtlcapstk!invoke_main+0x1d
0016fcb8 77183c45 rtlcapstk!__scrt_common_main_seh+0xf9
0016fcc4 774d37eb kernel32!BaseThreadInitThunk+0xe
0016fd04 774d37be ntdll!__RtlUserThreadStart+0x70
0016fd1c 00000000 ntdll!_RtlUserThreadStart+0x1b

26105b <---- return address appears to be different
(windbg also resolves inlined functions)
77183c45
774d37eb
774d37be
quit:

Also, here is the output with the undocumented global edited (in the first run it is false, in the second it is true):

:\>cdb -c "g rtlcapstk!main;pc 4;k;g;q" rtlcapstk.exe| grep ChildEBP -A 40

ChildEBP RetAddr
002ef968 003612fe rtlcapstk!main+0x59
(Inline) -------- rtlcapstk!invoke_main+0x1d
002ef9b0 77183c45 rtlcapstk!__scrt_common_main_seh+0xf9
002ef9bc 774d37eb kernel32!BaseThreadInitThunk+0xe
002ef9fc 774d37be ntdll!__RtlUserThreadStart+0x70
002efa14 00000000 ntdll!_RtlUserThreadStart+0x1b
36105b
77183c45
774d37eb
774d37be
quit:

:\>cdb -c "ed ntdll!RtlpFuzzyStackTracesEnabled 1;g rtlcapstk!main;pc 4;k;g;q" rtlcapstk.exe| grep ChildEBP -A 40

ChildEBP RetAddr
002afd80 013512fe rtlcapstk!main+0x59
(Inline) -------- rtlcapstk!invoke_main+0x1d
002afdc8 77183c45 rtlcapstk!__scrt_common_main_seh+0xf9
002afdd4 774d37eb kernel32!BaseThreadInitThunk+0xe
002afe14 774d37be ntdll!__RtlUserThreadStart+0x70
002afe2c 00000000 ntdll!_RtlUserThreadStart+0x1b

ffffffff <---- differs from windbg's k and appears to be some unknown address
77183c45
774d37eb
774d37be
13513af <---- seems to have found one more return address

quit:

On 8/2/17, Jeffrey Tippet
wrote:
> Ah I don’t believe that you can defer a stack capture to a different
> processor or thread. You can, however, formulate an official request to
> Microsoft to document RtlCaptureStackBackTrace as callable at any IRQL.
>
> The implementation of RtlCaptureStackBackTrace is different on each
> architecture. However, to generalize, x86 uses a linked list of frame
> pointers, while all subsequent architectures store a set of unwind codes in
> a separate section of the driver. The kernel uses these unwind codes to
> determine how many bytes of stack are used by each function on the stack.
>
> Decoding the unwind codes is nontrivial. You can find some documentation
> online, but I don’t recommend you get involved in this for mere diagnostics.
> (As far as I know, only a handful of people have gone through the trouble
> to understand & write decoders, and they tend to be in the anti-malware
> business.)
>
>
> —
> NTDEV is sponsored by OSR
>
> Visit the list online at:
> http:
>
> MONTHLY seminars on crash dump analysis, WDF, Windows internals and software
> drivers!
> Details at http:
>
> To unsubscribe, visit the List Server section of OSR Online at
> http:

Does “kv” show that any of the functions are using FPO?

On x86, stack unwind information for FPO frames is stored in the PDB, so
you need PDBs if you want a precise stack walk. This is the reason for the
infamous “stack unwind information unavailable, the following frames may be
missing or incorrect” message that you get in WinDbg if you’re on x86 and
there’s something on the stack that you don’t have PDBs for.

On x64, the stack unwind information is stored in the executable image
itself (in the exception directory referenced from the image header) and is
available at runtime. It is no longer necessary to use PDBs to get a correct
call stack.

-scott
OSR
@OSRDrivers
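
To make the x64 point concrete, here is a heavily simplified model of a table-driven walk. It is a sketch under stated assumptions, not the real unwinder: FunctionEntry, find_entry and walk_with_table are hypothetical, the stack is an array of words, and each function is assumed to use a fixed number of stack words. Real unwind data also records saved registers, frame registers and chained entries. On Windows the runtime lookups are done by RtlLookupFunctionEntry and RtlVirtualUnwind.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for one function's unwind record.
struct FunctionEntry {
    std::uintptr_t begin;     // first address of the function
    std::uintptr_t end;       // one past the last address
    std::size_t frame_words;  // stack words used below the return address
};

// Finds the record covering pc, or nullptr if no function contains it.
const FunctionEntry* find_entry(const FunctionEntry* table, std::size_t count,
                                std::uintptr_t pc) {
    for (std::size_t i = 0; i < count; ++i) {
        if (pc >= table[i].begin && pc < table[i].end) return &table[i];
    }
    return nullptr;
}

// Walks a modelled stack: for each frame, skip the function's own stack
// usage, read the return address, and continue from there.
std::size_t walk_with_table(const FunctionEntry* table, std::size_t count,
                            const std::uintptr_t* stack, std::size_t stack_words,
                            std::uintptr_t pc, std::size_t sp,
                            std::uintptr_t* out, std::size_t capacity) {
    std::size_t found = 0;
    while (found < capacity) {
        const FunctionEntry* entry = find_entry(table, count, pc);
        if (entry == nullptr) break;   // no unwind data: stop
        sp += entry->frame_words;      // pop locals and saved registers
        if (sp >= stack_words) break;  // ran off the modelled stack
        pc = stack[sp];                // return address into the caller
        sp += 1;                       // pop the return address itself
        out[found++] = pc;
    }
    return found;
}
```

No frame pointer is needed anywhere in the loop; the table alone says where each return address lives, which is why the walk works at runtime without a PDB.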
