RtlCaptureStackBackTrace produces invalid addresses and addresses that don't belong to a module

Regarding the addresses that fall outside all kernel modules:

lm a <address> does not report any kernel module for them, although they are kernel-space addresses (above 0x00007ffffffeffff).
uu <address> shows no symbols either, only valid assembly.

Regarding the invalid addresses:
RtlCaptureStackBackTrace also returns some invalid addresses in between the valid ones, such as this one:

1: kd> uu 0xffff9c921dab969d
ffff9c92`1dab969d ??              ???
ffff9c92`1dab969e ??              ???
ffff9c92`1dab969f ??              ???
ffff9c92`1dab96a0 ??              ???
ffff9c92`1dab96a1 ??              ???
ffff9c92`1dab96a2 ??              ???
ffff9c92`1dab96a3 ??              ???
ffff9c92`1dab96a4 ??              ???

In the same call to RtlCaptureStackBackTrace, the rest of the addresses are legitimate and correct.

  1. Is this normal behavior?
  2. How does an invalid address end up in the back trace?
  3. And what about the valid addresses that are outside all kernel modules?

Stack backtracing in x64 code is an inexact science. Not all stack users follow the rules exactly. That’s about all you can say.

Stack back traces in x64 are a far more exact science than in x86. Optimized code can do all sorts of things, of course.

I’m confused! Is it exact or inexact, and why? The CPU must return to somewhere in the end; it won’t just float around and execute at address 0x0.
Are there any studies or sources on why this is?
And is there a way to correct the stack back trace returned by RtlCaptureStackBackTrace?
Why does the documentation not mention any of that?
Could it be the debugger causing this?

Most software that gets released is compiled with optimizations enabled. Compiler optimization is a large topic and something of a black art, but the optimizations are usually worth the extra obscurity in the resulting assembly because they can speed up some code as much as a hundredfold (your mileage will vary). One class of optimizations works by replacing the normal call and return sequences with jump instructions. Another class works by repurposing standard registers to hold other values. And another works by creative arrangements of the stack. There are many others, but any one of these techniques (or some combination) can make it hard to discover the true call stack. It is not impossible, since the CPU must be able to find its way back, but it is too hard for any standard function. The x64 calling conventions dramatically restrict some of these optimizations and make the stack back trace far more reliable than on some other platforms like x86.
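On the question of correcting the trace: I don't know of a way to repair what RtlCaptureStackBackTrace returns, but the captured frames can at least be post-filtered so that suspicious entries are flagged rather than trusted. The following is only a minimal sketch in portable C, not driver code. ModuleRange, FrameKind, is_kernel_canonical and classify_frame are hypothetical names invented for illustration, the module list is assumed to have been gathered by some other means, and the kernel-half check assumes 48-bit virtual addresses. It cannot tell whether an address is actually mapped; in a driver something like MmIsAddressValid might be an additional check, but that is an assumption, not a recommendation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one loaded module's address range. */
typedef struct {
    uint64_t base;  /* first byte of the image */
    uint64_t size;  /* image size in bytes */
} ModuleRange;

typedef enum {
    FRAME_IN_MODULE,       /* inside one of the supplied module ranges */
    FRAME_OUTSIDE_MODULES, /* kernel-half address, but in no known module */
    FRAME_NOT_KERNEL       /* not in the upper canonical half */
} FrameKind;

/* Assumption: 48-bit virtual addresses, so the kernel half of the
   address space starts at 0xFFFF800000000000. */
static int is_kernel_canonical(uint64_t addr)
{
    return addr >= 0xFFFF800000000000ULL;
}

/* Classify one captured return address against a caller-supplied
   list of module ranges. This only checks ranges; it does not check
   whether the address is mapped or contains code. */
static FrameKind classify_frame(uint64_t addr,
                                const ModuleRange *modules,
                                size_t module_count)
{
    size_t i;

    assert(modules != NULL || module_count == 0);

    if (!is_kernel_canonical(addr)) {
        return FRAME_NOT_KERNEL;
    }
    for (i = 0; i < module_count; i++) {
        /* Written as a subtraction so base + size cannot overflow. */
        if (addr >= modules[i].base &&
            addr - modules[i].base < modules[i].size) {
            return FRAME_IN_MODULE;
        }
    }
    return FRAME_OUTSIDE_MODULES;
}
```

With a filter like this, an address such as 0xffff9c921dab969d from the output above would come back as FRAME_OUTSIDE_MODULES, which is a hint to treat that frame with suspicion rather than proof that it is wrong.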

Of course, a stack trace like this could also be the result of stack corruption.

Optimized code often does not correspond well to symbols. Usually some symbols are present for valid sequences of code, but they often refer to nonsense lines of code because of optimizations.
