Kernel Thread Stack Paging

Hey,
For some reason, my driver is full of calls to “__alloca_probe” / “_chkstk”. We saw these calls during performance profiling, and I think they are completely useless in kernel mode. This is my logic, and I want to make sure I’m correct:

  1. The chkstk function is only useful for the case where stack pages are paged out and are paged in as the thread stack grows (the PAGE_GUARD thing in user mode). It exists to make sure that the thread does not skip a page (because of large buffer allocations) and crash with an access violation.
  2. The only case where kernel thread stacks are paged out is when the kernel thread enters a “UserMode” wait (plus some limitations apply, such as the thread priority level).
  3. When a thread runs code in kernel mode (after the context switch is finished), the kernel stack of this thread must be entirely paged in. This is because:
    • The kernel cannot handle page faults at DISPATCH_LEVEL. A thread may acquire a spinlock and continue reading from the stack, so the stack must be in memory.
    • A kernel thread may enter a filesystem stack context, and trying to page in the kernel stack there can cause deadlocks.
  4. The conclusion: enabling the stack checking compiler flag is completely useless. This compiler option is only relevant for user mode code. I can safely remove the chkstk calls from my driver.
  5. In case more stack space is needed for some reason (invoking functions deep down in the I/O stack that may overflow the stack), it’s possible to use KeExpandKernelStackAndCalloutEx.

My questions:

  1. Can someone confirm these statements?
  2. If a “UserMode” wait can trigger a page fault as a result of thread stack paging, is it unsafe to perform a ‘UserMode’ wait inside a filesystem stack? Or does the OS check whether FsRtlEnterFileSystem was invoked before paging out the thread’s stack?

Thank you!

Hey @“Peter_Viscarola_(OSR)”, I’m sorry for posting in the WinDbg section. Is it possible to move the thread to the NTDEV section?

@0xrepnz said:
Hey,
For some reason, my driver is full of calls to “__alloca_probe” / “_chkstk”. We saw these calls during performance profiling, and I think they are completely useless in kernel mode. This is my logic, and I want to make sure I’m correct:

I saw those when I allocated large items on the stack (unintentionally; once I found the offending code pattern, I resolved it by using a different pattern, and those call pairs were removed). Is it possible you are running into that?

The pattern that caused it (from memory, it’s been several months, and I no longer have code access) was an instance copy that created a temporary on the stack that was immediately destroyed after participating in the copy. I wish I could remember the detail there. Once I eliminated the temporary, no more _chkstk.

Actually, the reason for these calls is that one static library that I use is compiled with the /Gs0 flag (= force chkstk in all functions), and when /Gs0 is used together with /GL (Whole Program Optimization), the /Gs0 setting is applied to the entire linked binary… the entire driver in my case (potentially a bug in /GL?).

This static library is used for both kernel and user mode. I’m not sure why the developer used /Gs0 for the static library (I can only guess that he wanted to use /GS (= stack security cookie) but got confused? idk).

Anyhow, by default the chkstk calls are added if a function uses more than one page of stack variables. In this case, the function may “skip a page” of the stack. Imagine this case:

BYTE Buffer[4096]; // Assume this buffer resides on the PAGE_GUARD page that is used to detect that the stack grows
BYTE Buffer2[4096]; // Assume this buffer resides on the page after. This page is not marked with PAGE_GUARD and will cause an access violation if accessed

The stack grows by handling PAGE_GUARD exceptions. If a page is skipped by mistake, an access violation is raised. chkstk is designed to protect against that access violation: it goes page by page and touches each one, which causes the stack to grow one page at a time.
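To make the page-skipping scenario concrete, here is a toy user-mode model in C. It is only a sketch under my own assumptions, not the real CRT routine or the real memory manager: the `sim_stack` type, the page states, and the functions `touch_byte` and `probe_frame` are all hypothetical names invented for illustration. Page 0 is the top of the stack, and higher page indexes are deeper (lower addresses).

```c
#include <assert.h>
#include <stddef.h>

#define SIM_PAGE_SIZE 4096u
#define SIM_NUM_PAGES 8u

/* Hypothetical page states for the toy model. */
typedef enum { PG_COMMITTED, PG_GUARD, PG_RESERVED } page_state;
typedef enum { TOUCH_OK, TOUCH_GREW, TOUCH_ACCESS_VIOLATION } touch_result;

typedef struct {
    page_state pages[SIM_NUM_PAGES];
} sim_stack;

/* Initial layout: one committed page, then the guard page, rest reserved. */
void sim_init(sim_stack *s)
{
    s->pages[0] = PG_COMMITTED;
    s->pages[1] = PG_GUARD;
    for (size_t i = 2; i < SIM_NUM_PAGES; i++) {
        s->pages[i] = PG_RESERVED;
    }
}

/* Touch one byte at `offset` bytes below the top of the stack. */
touch_result touch_byte(sim_stack *s, size_t offset)
{
    size_t page = offset / SIM_PAGE_SIZE;
    assert(page < SIM_NUM_PAGES);

    switch (s->pages[page]) {
    case PG_COMMITTED:
        return TOUCH_OK;
    case PG_GUARD:
        /* Guard hit: commit this page and move the guard one page deeper. */
        s->pages[page] = PG_COMMITTED;
        if (page + 1 < SIM_NUM_PAGES) {
            s->pages[page + 1] = PG_GUARD;
        }
        return TOUCH_GREW;
    default:
        /* Reserved page touched without passing through the guard. */
        return TOUCH_ACCESS_VIOLATION;
    }
}

/*
 * What a chkstk-style probe does in this model: touch one byte in every
 * page covered by a frame of `frame` bytes that starts `used` bytes below
 * the top. Returns 1 if every page could be reached, 0 otherwise.
 */
int probe_frame(sim_stack *s, size_t used, size_t frame)
{
    assert(frame > 0);
    size_t first = used / SIM_PAGE_SIZE;
    size_t last = (used + frame - 1) / SIM_PAGE_SIZE;
    assert(last < SIM_NUM_PAGES);

    for (size_t page = first; page <= last; page++) {
        if (touch_byte(s, page * SIM_PAGE_SIZE) == TOUCH_ACCESS_VIOLATION) {
            return 0;
        }
    }
    return 1;
}
```

In this model, touching only the far end of an 8192-byte frame (100 bytes already in use) lands on a reserved page and reports an access violation, while calling `probe_frame` first moves the guard down page by page so the same access succeeds afterwards.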

However, I don’t see how this mechanism relates to kernel mode. In kernel mode the stack should always be in memory and it’s a bad practice to define non-small structures on the stack…

The chkstk calls are added to kernel drivers as well if a variable larger than 4096 bytes is defined on the stack, which is weird because the stack should always be in memory. I would expect a warning to be generated and the compilation to fail, because it’s such a bad practice and may cause a stack overflow BSOD. Perhaps they added the chkstk to cases where page-sized variables are used just to cause a BSOD immediately… (But then, why not just fail compilation…)

Unless I’m missing something, this is just a weird corner case and I can remove the /Gs0. Just wanted to see that more guys here agree…
Also, I’m wondering if someone has insights regarding the ‘UserMode’ wait within a filesystem stack problem I presented…

I discovered the behavior I reported because there was a large field in the object, large enough that it could cause a stack overflow. The embedded _chkstk() call issued a bugcheck reporting that the stack usage was too large, even though the stack hadn’t actually overflowed. I was actually quite glad it had.

I understand, but wouldn’t you be happier if the code did not compile? Allocating large buffers on the stack in kernel mode is a very bad practice that could very easily lead to a BSOD and should be forbidden… Obviously a BSOD won’t happen every time, but it’s still a bad practice.
In fact, there’s a code analysis warning, C6262, that can be enabled to detect these kinds of issues at compile time. If the stack size of a routine is larger than 1KB, the warning fires, and with warnings treated as errors the code does not compile and BSODs are avoided :smiley:
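As an illustration of the alternative to a large stack buffer, here is a minimal sketch in portable C. It is not driver code: `fill_and_sum` and `SCRATCH_SIZE` are hypothetical names, and `malloc`/`free` only stand in for the pool allocator a driver would use (for example `ExAllocatePool2` and `ExFreePool`, which is my assumption about the replacement, not something taken from the code discussed here).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 8192u

/*
 * Hypothetical routine that needs an 8 KB scratch buffer.
 * Declaring `unsigned char scratch[SCRATCH_SIZE];` as a local would put
 * two pages on the stack and make the compiler emit a stack probe.
 * Allocating from the heap keeps the frame small instead.
 * Returns 0 on success, -1 if the allocation fails.
 */
int fill_and_sum(unsigned char seed, unsigned long *sum_out)
{
    unsigned char *scratch = malloc(SCRATCH_SIZE);
    if (scratch == NULL) {
        return -1;
    }

    /* Fill the buffer with the seed and add up every byte. */
    memset(scratch, seed, SCRATCH_SIZE);
    unsigned long sum = 0;
    for (size_t i = 0; i < SCRATCH_SIZE; i++) {
        sum += scratch[i];
    }

    free(scratch);
    *sum_out = sum;
    return 0;
}
```

The trade-off is an allocation that can fail and must be freed on every path, in exchange for a frame that stays well under the page-size threshold.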

Sure, it’s better to catch it at compile time. I inherited the whole thing, added a buffer definition to an existing structure, got the bugcheck, investigated, found what I found, worked around it. If it was useful for you, cool. If it wasn’t useful, because it was irrelevant to your actual issue, sorry for the distraction. I was only saying that the compiler adding the chkstk call was a necessary step in detecting the issue with the toolchain of the day. Failing the compile would be a substantive improvement.


It wasn’t a distraction. Thanks for sharing your thoughts.