Hi, everyone!
I have run into a strange problem: WinDBG does not display WOW64 stacks correctly for Windows 11 targets.
After switching into the WOW64 context with the "!wow64exts.sw" or ".process /r /P XXX;.thread /w YYY" command and then entering "k" (display stack backtrace), we see something like this:
00 0104e554 989067f8 0x99005974
01 0114ff14 76665d49 0x989067f8
02 0114ff24 77b9d6db kernel32!BaseThreadInitThunk+0x19
03 0114ff7c 77b9d661 ntdll_77b30000!__RtlUserThreadStart+0x2b
04 0114ff8c 00000000 ntdll_77b30000!_RtlUserThreadStart+0x1b
Instead of the expected:
00 0114fd50 77b86467 ntdll_77b30000!NtWaitForWorkViaWorkerFactory+0xc
01 0114ff14 76665d49 ntdll_77b30000!TppWorkerThread+0x347
02 0114ff24 77b9d6db kernel32!BaseThreadInitThunk+0x19
03 0114ff7c 77b9d661 ntdll_77b30000!__RtlUserThreadStart+0x2b
04 0114ff8c 00000000 ntdll_77b30000!_RtlUserThreadStart+0x1b
This can be easily reproduced when debugging a Windows 11 24H2+ target or when analyzing a crash dump created on such an OS, in either kernel or user mode.
…
As our research showed, Windows 11 24H2 adds a new value to ContextFlags of the WOW64 thread context before switching to 64-bit mode. The new value is the CONTEXT_XSTATE flag (0x40).
Then, when trying to copy this context into an internal buffer, WinDBG encounters an error, probably because the XSTATE context is larger than expected.
As a result, WinDBG is unable to switch into the WOW64 context and the stack trace is broken. An error message may also appear: "Effective machine and debuggee state conflict, disassembly not possible".
FYI: the WOW64 context is stored in the block referenced by TEB.TlsSlots[TLS_CPURESERVED (1)], immediately after two WORDs. Use "!wow64exts.info" to display the address of the TLS_CPURESERVED block.
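To illustrate the layout described above, here is a minimal Python sketch that decodes the first bytes of a TLS_CPURESERVED block dumped from the debugger. It assumes the block starts with two WORDs followed directly by the 32-bit CONTEXT, whose first DWORD is ContextFlags. The function names and the sample bytes are made up for illustration, not taken from a real dump.

```python
import struct

# Bit named in this post; assumed to be the low-word part of CONTEXT_XSTATE.
CONTEXT_XSTATE_BIT = 0x40


def parse_cpureserved_header(block: bytes):
    """Return (word0, word1, context_flags) read from the start of the block.

    Assumed layout: two little-endian WORDs, then the x86 CONTEXT,
    which begins with the DWORD ContextFlags.
    """
    return struct.unpack_from("<HHI", block, 0)


def has_xstate(context_flags: int) -> bool:
    """True if the CONTEXT_XSTATE bit is set in ContextFlags."""
    return bool(context_flags & CONTEXT_XSTATE_BIT)


# Hypothetical bytes standing in for memory read from the block address.
sample = struct.pack("<HHI", 0, 0x014C, 0x0001007F)
word0, word1, flags = parse_cpureserved_header(sample)
print(f"ContextFlags = 0x{flags:08x}, XSTATE set: {has_xstate(flags)}")
```

If the XSTATE bit shows up as set on a 24H2 target, that would be consistent with the behaviour described above.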
I found some possible workarounds:
- Use an old WinDBG version.
On Windows 10 and higher, WinDBG (dbgeng.dll) uses ntdll!RtlCopyContext, and this is the root of the problem.
The last WinDBG version that is not affected was shipped with the Windows 8.1 SDK, and it is still available for download.
- Run WinDBG in compatibility mode.
Still possible, but unfortunately the 'Compatibility' tab is hidden in the newest Windows versions (and the "__COMPAT_LAYER" environment variable also doesn't work as I expected).
- Patch dbgeng.dll, on disk or in memory.
For example, you may write 6.2 into 'g_DebuggerVersionInfo' (actually an OSVERSIONINFO) or even modify the version check in 'MachineInfo::XplatCopyWowContext', so that the call to RtlCopyContext never occurs. Note that PDB symbols for the WinDBG modules are available along with the others, thanks to MS.
- Patch ContextFlags of the corresponding thread context.
Just remove the CONTEXT_XSTATE bit and everything will work again.
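
As a rough sketch of the last workaround, the Python helper below computes the patched ContextFlags value and formats a WinDBG "ed" (enter DWORD) command for it. It assumes ContextFlags sits 4 bytes (two WORDs) past the TLS_CPURESERVED block address reported by "!wow64exts.info". The helper names and the example address are hypothetical, and the current ContextFlags value has to be read from the target first.

```python
# Bit named in this post; assumed to be the low-word part of CONTEXT_XSTATE.
CONTEXT_XSTATE_BIT = 0x40


def clear_xstate(context_flags: int) -> int:
    """Return ContextFlags with the CONTEXT_XSTATE bit removed."""
    return context_flags & ~CONTEXT_XSTATE_BIT & 0xFFFFFFFF


def context_flags_address(cpureserved_addr: int) -> int:
    """Assumed location of ContextFlags: right after the two leading WORDs."""
    return cpureserved_addr + 4


def build_ed_command(cpureserved_addr: int, current_flags: int) -> str:
    """Format an 'ed <address> <value>' command that writes the patched flags."""
    addr = context_flags_address(cpureserved_addr)
    return f"ed 0x{addr:x} 0x{clear_xstate(current_flags):08x}"


# Hypothetical block address and flags value, for illustration only.
print(build_ed_command(0x1000, 0x0001007F))
```

The resulting command would then be pasted into the debugger before switching into the WOW64 context. For a crash dump the write only affects the debugger's view of memory, if it is accepted at all.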
I hope that somebody from the WinDBG team will read this post and that an appropriate fix will eventually be released.