Why does the kernel preload not only ntdll.dll but also the WOW ntdll.dll into WOW processes?

Hi,

I am wondering why, when creating a process, the kernel prepares not only System32\ntdll.dll for it but, when it's a WOW64 process, also SysWOW64\ntdll.dll.
I mean, since execution starts in the native ntdll.dll, why can't it load SysWOW64\ntdll.dll on its own? Why does this have to be done by the kernel?

Cheers
David X.

The kernel already knew how to load both ntdlls. There was probably no good reason to move that capability. In what universe would this matter in any way?

Well, in a very particular use case it matters a lot…

Here is the use case:

x86 on ARM64 is implemented using a JIT transpiler that converts x86 code to ARM64 code that tries to mimic being 32 bit. This generated code is not that well optimized. So MSFT thought it would be a great idea to provide hybrid binaries that have x86 exports but internally transition very quickly to an ARM64 implementation. These are called CHPE binaries, and a few dozen of them reside in the Windows\SyChpe32\ directory. When a DLL is found there, it is used instead of the entirely x86 DLL located in Windows\SysWOW64\, which is a problem.
While MSFT intended those DLLs to be hookable, it seems that internal control flow, i.e. LoadLibraryW → something something something → NtCreateFile → etc., will not go through the x86 stub in the hybrid ntdll.dll (unlike x64 on ARM64 where, as far as I could see, everything is properly hookable).
So while I can hook invocations of LoadLibraryW and NtCreateFile from x86 code, I cannot hook the internal code path.
For my security solution this is a major problem.
So… I noticed that when I outright delete the SyChpe32 directory, or its contents, the Windows kernel will load the ntdll from SysWOW64 for me, which is well behaved and where I can hook anything.
But well, this is not really a workable solution…
So I’m wondering how I can force processes running under the supervision of my security software (Sandboxie-Plus) to always get the SysWOW64 ntdll, while other x86 programs running outside could still use the optimized CHPE binaries.
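
The hooking gap described earlier in this post can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions: the function names are hypothetical, a function pointer stands in for the patchable x86 export stub, and nothing here reflects the real layout of a CHPE binary. It merely shows why a hook placed on the stub sees calls from x86 code but not calls made internally by the native implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often the hook observed a call. */
static int g_hook_hits;

/* Stands in for the native (ARM64) body inside a hybrid DLL. Hypothetical. */
static int native_create_file(const char *name)
{
    assert(name != NULL);
    return 0; /* pretend success */
}

/* Stands in for the x86 export stub: the only entry point an x86-side
 * hook can redirect. Initially it just forwards to the native body. */
static int (*x86_stub_create_file)(const char *) = native_create_file;

/* x86 application code always enters through the stub. */
int app_create_file(const char *name)
{
    return x86_stub_create_file(name);
}

/* Internal caller inside the hybrid DLL (think LoadLibraryW's internals):
 * it calls the native body directly and never touches the stub. */
int load_library_internal(const char *name)
{
    return native_create_file(name);
}

/* The hook: record the call, then forward to the original. */
static int hooked_create_file(const char *name)
{
    g_hook_hits++;
    return native_create_file(name);
}

void install_hook(void)
{
    x86_stub_create_file = hooked_create_file;
}

int hook_hits(void)
{
    return g_hook_hits;
}
```

After `install_hook()`, a call to `app_create_file` increments the counter, while a call to `load_library_internal` leaves it unchanged. That is the behaviour observed with the hybrid ntdll.dll: the internal code path bypasses the hook.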

One solution that came to my mind is to hook the native ntdll.dll at the earliest stage of process startup and then unmap the undesired ntdll and load the one from SysWOW64 instead. But I wonder if that could even work: since the kernel loads this itself instead of leaving it to the native ntdll, it may do some special kernel magic to it that I may have a hard time emulating.

(I give you credit for sticking with it, Mr. Xanatos – Your perseverance is impressive)

As to your original question, I suspect that this is a performance optimization since the kernel ‘knows’ that nothing can run without ntdll. There are a lot of other DLLs that have to be loaded too, but MSFT have transitioned to something called ‘API sets’ and they all have ‘shims’ to modify the functionality of the basic APIs.

As to your implied question, hooking can’t ever be made to be reliable. Even images compiled with the hot patch option cannot be reliably hooked in arbitrary ways. And hooking can at best provide obscurity, but not security. This topic has been discussed many times here and elsewhere.

Well, in Sandboxie hooking is not used for security, only for virtualization of the filesystem and registry; the security part is handled by the driver.

Since you already have a driver, maybe add a combination of a process creation callback and a filesystem filter.
When your process is created, note its ID. In the filesystem filter, block this process's access to Windows\SyChpe32.
(This assumes that the SysWOW64 ntdll has not yet been loaded into the new process.)
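
The decision logic behind this suggestion can be sketched in portable C. This is a minimal sketch, not driver code: the table, the function names and the example path format are all hypothetical. In a real driver, `supervise_pid` would presumably be called from a routine registered with `PsSetCreateProcessNotifyRoutineEx`, and `should_block_open` from a minifilter pre-operation callback for `IRP_MJ_CREATE`, with proper locking and `UNICODE_STRING` handling in place of the plain C strings used here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUPERVISED 64

/* Hypothetical table of supervised process IDs; 0 marks a free slot. */
static unsigned long g_supervised[MAX_SUPERVISED];

static bool is_supervised(unsigned long pid)
{
    for (size_t i = 0; i < MAX_SUPERVISED; i++)
        if (g_supervised[i] == pid)
            return pid != 0;
    return false;
}

/* Process-creation side: remember the ID. Returns false if the table is full. */
bool supervise_pid(unsigned long pid)
{
    assert(pid != 0);
    if (is_supervised(pid))
        return true;
    for (size_t i = 0; i < MAX_SUPERVISED; i++) {
        if (g_supervised[i] == 0) {
            g_supervised[i] = pid;
            return true;
        }
    }
    return false;
}

/* Process-exit side: forget the ID again. */
void forget_pid(unsigned long pid)
{
    for (size_t i = 0; i < MAX_SUPERVISED; i++)
        if (g_supervised[i] == pid)
            g_supervised[i] = 0;
}

/* Case-insensitive "does s start with prefix?" */
static bool starts_with_icase(const char *s, const char *prefix)
{
    for (; *prefix; s++, prefix++)
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return false;
    return true;
}

/* True if the path names \Windows\SyChpe32 itself or anything below it. */
bool is_chpe_path(const char *path)
{
    static const char dir[] = "\\Windows\\SyChpe32";
    const size_t n = sizeof dir - 1;

    assert(path != NULL);
    for (const char *p = path; *p; p++)
        if (starts_with_icase(p, dir) && (p[n] == '\\' || p[n] == '\0'))
            return true;
    return false;
}

/* Filesystem-filter side: block only supervised processes. */
bool should_block_open(unsigned long pid, const char *path)
{
    return is_supervised(pid) && is_chpe_path(path);
}
```

Which status the filter should fail the open with is left open here. Since deleting the directory reportedly makes the kernel fall back to SysWOW64, a "not found" style failure might mimic that most closely, but this is untested.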