Thread suspended while in callback

One solution to this issue is to have a driver that is always loaded in memory and never gets updated. That driver can enter a critical region and invoke the second driver, then exit the critical region once the callback is done. In my opinion that’s a pretty “extreme” solution for a rare issue, so I’m not sure it’s worth it…

callback begins (in thread context) → I enter critical region → I wait for user mode → wait is satisfied, kernel execution continues
→ I exit critical region → APCs are delivered and kernel execution is suspended (exiting the critical region triggers delivery of pending APCs).
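
The sequence above can be pictured with a small portable C model. This is only a sketch under simplifying assumptions: one thread, one pending suspend request, and a single disable counter standing in for the kernel's bookkeeping. Every toy_ name is hypothetical and none of this is the real KTHREAD layout or kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one thread's APC state. All names are hypothetical. */
typedef struct {
    int  kernel_apc_disable;  /* 0 = normal kernel APCs deliverable, <0 = deferred */
    bool suspend_apc_pending; /* a suspend request has been queued */
    bool suspended;           /* the suspend APC ran and parked the thread */
} toy_thread;

/* Deliver the pending suspend APC only if delivery is currently allowed. */
static void toy_deliver_apcs(toy_thread *t)
{
    if (t->kernel_apc_disable == 0 && t->suspend_apc_pending) {
        t->suspend_apc_pending = false;
        t->suspended = true;
    }
}

static void toy_enter_critical_region(toy_thread *t)
{
    t->kernel_apc_disable--;  /* defer normal kernel APCs */
}

static void toy_leave_critical_region(toy_thread *t)
{
    assert(t->kernel_apc_disable < 0);
    t->kernel_apc_disable++;
    toy_deliver_apcs(t);      /* anything deferred is delivered here */
}

/* Another thread asks for this thread to be suspended. */
static void toy_request_suspend(toy_thread *t)
{
    t->suspend_apc_pending = true;
    toy_deliver_apcs(t);
}

/* Runs the callback sequence. Returns true if the thread was suspended
   while still inside the callback body, false otherwise. */
static bool toy_callback(toy_thread *t, bool use_critical_region)
{
    bool suspended_inside;

    if (use_critical_region)
        toy_enter_critical_region(t);

    /* The callback is waiting for user mode; the suspend request arrives now. */
    toy_request_suspend(t);
    suspended_inside = t->suspended;

    if (use_critical_region)
        toy_leave_critical_region(t);

    return suspended_inside;
}
```

In this model, calling toy_callback() with use_critical_region set returns false, but the thread ends up suspended as soon as the region is left, which is the behaviour described in the sequence. Without the region, the suspension lands in the middle of the callback body.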

I’m seeing this problem in practice, not in theory

Are you sure there is no diagnostic mistake here? To begin with, SuspendThread() is not the kind of call that you would normally see all over the place. It is normally used by debuggers, and is often used hand-in-hand with SetThreadContext() (this is apart from the documentation making it clear that only userland execution is affected by this call). Probably, this is something entirely different (like, for example, a kernel APC handler simply blocking on some event that subsequently gets signaled by some kernel component)? This scenario is, indeed, perfectly feasible (in fact, quite likely).

If this is the case…well, then it has absolutely nothing to do with suspending a thread…

Anton Bassov

Probably, this is something entirely different

OK maybe some other source can convince you. For example, @“Scott_Noone_(OSR)” wrote in this thread:

As long as the CombinedApcDisable field in the KTHREAD is zero the kernel
APCs can be delivered. I forced it to happen in a VM to get a call stack by
putting the thread into an infinite loop and then suspending it using
Process Explorer. Here are the steps and the resulting call stack:

kd> bp Nothing!NothingRead
kd> g
Nothing!NothingRead:
fffff801`66bf61b0 4889542410 mov qword ptr [rsp+10h],rdx
kd> ??@$thread->Tcb.CombinedApcDisable
unsigned long 0
kd> ew @$ip 0xFEEB
kd> g

Thread is now taking up 100% CPU. Now suspend the thread using Process
Explorer and check the call stack:

kd> k
*** Stack trace for last set context - .thread/.cxr resets it
# Child-SP RetAddr Call Site
00 ffffd001`8ebc64a0 fffff803`3d250d7e nt!KiSwapContext+0x76
01 ffffd001`8ebc65e0 fffff803`3d2507f9 nt!KiSwapThread+0x14e
02 ffffd001`8ebc6680 fffff803`3d2788d0 nt!KiCommitThreadWait+0x129
03 ffffd001`8ebc6700 fffff803`3d24d64c nt!KeWaitForSingleObject+0x2c0
04 ffffd001`8ebc6790 fffff803`3d24e279 nt!KiSchedulerApc+0x78
05 ffffd001`8ebc67f0 fffff803`3d374723 nt!KiDeliverApc+0x209
06 ffffd001`8ebc6870 fffff801`66bf61b0 nt!KiApcInterrupt+0xc3
07 ffffd001`8ebc6a08 fffff803`3d605788 Nothing!NothingRead
08 ffffd001`8ebc6a10 fffff803`3d603336 nt!IopSynchronousServiceTail+0x170
09 ffffd001`8ebc6ae0 fffff803`3d37b1b3 nt!NtReadFile+0x656
0a ffffd001`8ebc6bd0 00007ffc`38420caa nt!KiSystemServiceCopyEnd+0x13
0b 0000009d`fd31fa28 00007ffc`358f83a8 ntdll!NtReadFile+0xa

-scott
OSR
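
A note on the `ew @$ip 0xFEEB` step in the quoted session: on a little-endian x64 target the 16-bit value 0xFEEB is stored as the bytes EB FE, which decode as a short jmp with displacement -2, that is, a jump to itself. The sketch below just works through that arithmetic in portable C; the function names are made up for illustration and have nothing to do with the debugger's internals.

```c
#include <stdint.h>

/* Store a 16-bit value in little-endian byte order, as an x64 target would. */
static void store_word_le(uint8_t out[2], uint16_t w)
{
    out[0] = (uint8_t)(w & 0xFF);  /* low byte first */
    out[1] = (uint8_t)(w >> 8);    /* high byte second */
}

/* Decode a 2-byte short jmp (opcode 0xEB followed by a signed 8-bit
   displacement). On success, *target receives the jump target as an
   offset from the start of the instruction, and 1 is returned. */
static int short_jmp_target(const uint8_t code[2], int *target)
{
    if (code[0] != 0xEB)
        return 0;
    /* The displacement is relative to the end of the 2-byte instruction. */
    *target = 2 + (int8_t)code[1];
    return 1;
}
```

Feeding 0xFEEB through store_word_le() and then short_jmp_target() yields a target offset of 0, so the patched NothingRead spins on its first instruction until the thread is suspended.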

Moreover, YOU YOURSELF wrote here:

My question: is thread suspension implemented by a user-mode APC or a normal kernel-mode APC?

I would say by the kernel-mode one - this is the reason why APC delivery to a thread that holds a kernel-level synch construct that allows preemption should be disabled. Just a couple of days ago we discussed a scenario where a thread gets suspended by a UM app while running in KM, which may result in a deadlock if it holds a synch construct that allows preemption at that moment (I just could not miss my chance to criticize Windows yet another time, could I)…

If you cannot be convinced by this, not sure what will convince you… I personally think the most convincing one was this:

0: kd> kc
 # Call Site
00 nt!KiInsertQueueApc
01 nt!KiSuspendThread
02 nt!KeSuspendThread
03 nt!PsSuspendThread
04 nt!NtSuspendThread
05 nt!KiSystemServiceCopyEnd
06 ntdll!NtSuspendThread
07 KERNELBASE!SuspendThread
08 SuspendKernelApc!main
09 SuspendKernelApc!_NULL_IMPORT_DESCRIPTOR <PERF> (SuspendKernelApc+0x17400f)
0a 0x0
0: kd> dx ((nt!_KAPC*)@rcx)->ApcMode
((nt!_KAPC*)@rcx)->ApcMode : 0 [Type: char]
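
For anyone without the headers at hand: ApcMode holds a processor mode value, and in the MODE enumeration from wdm.h KernelMode is 0 and UserMode is 1, so the 0 shown above means a kernel-mode APC. A minimal sketch mirroring that ordering, with hypothetical toy_ names:

```c
/* Mirrors the ordering of the MODE enumeration in wdm.h.
   The toy_ names are illustrative only. */
typedef enum { TOY_KERNEL_MODE = 0, TOY_USER_MODE = 1 } toy_mode;

/* Translate a raw ApcMode value, as printed by the debugger, into a name. */
static const char *toy_apc_mode_name(char apc_mode)
{
    switch (apc_mode) {
    case TOY_KERNEL_MODE: return "KernelMode";
    case TOY_USER_MODE:   return "UserMode";
    default:              return "unknown";
    }
}
```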

If you cannot be convinced by this, not sure what will convince you…

I’ve got to admit that you are right - I just checked WRK, and found the following call in the KeInitThread() implementation:

    KeInitializeApc(&Thread->SuspendApc,
                    Thread,
                    OriginalApcEnvironment,
                    (PKKERNEL_ROUTINE)KiSuspendNop,
                    (PKRUNDOWN_ROUTINE)KiSuspendRundown,
                    KiSuspendThread,
                    KernelMode,
                    NULL);

As we can see with our own eyes, “SuspendApc” is, indeed, initialised as a KM one, which means that the official documentation is “way too optimistic”, so to say…

Anton Bassov