In order to realize that it cannot work this way, all you have to do is to consider the scenario when the target thread owns some kernel resource that can be owned at PASSIVE_LEVEL (for example, a “simple” mutex). Imagine what happens if this thread gets suspended by some external means from the userland, and the suspending thread eventually exits without having resumed the target one…
As far as I understand, to avoid the problem you’re describing, most (if not all) PASSIVE_LEVEL locks enforce one of the following:
1 - Normal kernel APCs are disabled before the lock is taken, either explicitly by the caller or by the acquire routine itself (for example: ERESOURCE, PUSH_LOCK, KMUTEX…)
2 - Acquiring the lock raises the IRQL to APC_LEVEL (FAST_MUTEX, KGUARDED_MUTEX…)
This article contains a good explanation about it: https://www.osr.com/nt-insider/2015-issue3/critical-regions/
Could you please expand on that a bit? To begin with, unless your code happens to be running in the context of the target thread, how can you possibly know what that thread is actually doing at the moment you try to suspend it?
Let me just send the code I used in my tests. I developed a driver that sleeps inside a thread creation callback of a specific thread, so I know the state of the creating thread when I suspend it… hope it will clarify:
VOID
CreateThreadNotifyRoutine(
    _In_ HANDLE ProcessId,
    _In_ HANDLE ThreadId,
    _In_ BOOLEAN Create
    )
{
    NTSTATUS Status;
    ULONG_PTR ThreadStartAddress = 0;
    LARGE_INTEGER Interval = { 0 };

    // Only thread creation is interesting; ignore thread exit.
    if (!Create)
        return;

    // QueryThreadStartAddress is a helper that is not shown here.
    Status = QueryThreadStartAddress(ProcessId, ThreadId, &ThreadStartAddress);
    if (!NT_SUCCESS(Status))
        return;

    // Stall the creating thread (non-alertable wait) inside the callback
    // when the new thread has the magic start address.
    // RELATIVE() and SECONDS() are the usual timeout macros (not shown).
    if (ThreadStartAddress == 0x13131313) {
        Interval.QuadPart = RELATIVE(SECONDS(20));
        KeDelayExecutionThread(KernelMode, FALSE, &Interval);
    }
}

VOID
DriverUnload(
    _In_ PDRIVER_OBJECT Driver
    )
{
    UNREFERENCED_PARAMETER(Driver);
    PsRemoveCreateThreadNotifyRoutine(CreateThreadNotifyRoutine);
}

NTSTATUS
DriverEntry(
    _In_ PDRIVER_OBJECT DriverObject,
    _In_ PUNICODE_STRING RegKey
    )
{
    UNREFERENCED_PARAMETER(RegKey);
    DriverObject->DriverUnload = DriverUnload;
    return PsSetCreateThreadNotifyRoutine(CreateThreadNotifyRoutine);
}
User program:
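#include <windows.h>
#include <stdio.h>

DWORD WINAPI Thread(LPVOID ThreadParam)
{
    DWORD ThreadId;
    HANDLE ThreadHandle;

    UNREFERENCED_PARAMETER(ThreadParam);

    // The start address is bogus on purpose: the driver matches on it and
    // sleeps in the notify callback, in the context of this creating thread.
    ThreadHandle = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)0x13131313, NULL, 0, &ThreadId);
    if (!ThreadHandle) {
        printf("CreateThread failed\n");
        return (DWORD)-1;
    }
    return 0;
}

int main(void)
{
    DWORD ThreadId;
    HANDLE ThreadHandle = CreateThread(NULL, 0, Thread, NULL, 0, &ThreadId);
    if (!ThreadHandle) {
        printf("CreateThread failed\n");
        return -1;
    }

    // Give the creating thread time to get stuck inside the driver callback.
    Sleep(3000);

    // SuspendThread returns (DWORD)-1 on failure.
    if (SuspendThread(ThreadHandle) == (DWORD)-1) {
        printf("SuspendThread failed\n");
        return -1;
    }

    printf("Waiting to unload...\n");
    getchar();
    ResumeThread(ThreadHandle);
    return 0;
}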
In this case, the suspend APC is delivered only after the thread notification callback has returned, because PspCreateThread enters a critical region before calling it. That is the callstack I sent above (I simply dumped the callstack of the suspended thread).
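To make the ordering concrete, here is a toy user-mode model of the behaviour described above. It is only a sketch under my own assumptions: ToyThread, ToyResult and all toy_* functions are invented for illustration, and the single counter stands in for "normal kernel APCs disabled"; none of this is how the kernel actually stores or delivers APCs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of one thread; NOT the real KTHREAD layout. */
typedef struct {
    int  apc_disable_count; /* > 0 while inside a "critical region" */
    bool suspend_pending;   /* a suspend APC is queued but not yet delivered */
    bool suspended;         /* the suspend APC has run */
} ToyThread;

typedef struct {
    bool suspended_in_callback; /* suspended while the callback was still running */
    bool suspended_at_end;      /* suspended once everything has unwound */
} ToyResult;

/* Deliver the queued suspend APC only if normal APCs are enabled. */
static void toy_try_deliver(ToyThread *t)
{
    if (t->apc_disable_count == 0 && t->suspend_pending) {
        t->suspend_pending = false;
        t->suspended = true;
    }
}

static void toy_enter_critical_region(ToyThread *t)
{
    t->apc_disable_count++;
}

static void toy_leave_critical_region(ToyThread *t)
{
    assert(t->apc_disable_count > 0);
    t->apc_disable_count--;
    toy_try_deliver(t); /* anything deferred gets delivered on the way out */
}

/* Another thread asks for a suspend: queue it, deliver now if allowed. */
static void toy_queue_suspend(ToyThread *t)
{
    t->suspend_pending = true;
    toy_try_deliver(t);
}

/* Simulate: (optionally) enter a critical region, run the callback,
   receive a suspend request in the middle of it, then unwind. */
static ToyResult toy_run(bool use_critical_region)
{
    ToyThread t = { 0 };
    ToyResult r;

    if (use_critical_region)
        toy_enter_critical_region(&t);

    /* ... the notify callback is running here ... */
    toy_queue_suspend(&t);
    r.suspended_in_callback = t.suspended;
    /* ... the notify callback returns here ... */

    if (use_critical_region)
        toy_leave_critical_region(&t);

    r.suspended_at_end = t.suspended;
    return r;
}
```

In this model toy_run(false) reports the thread as suspended while still inside the callback, whereas toy_run(true) reports it as suspended only after the region has been left, which matches what the callstack shows.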
Also note that if you put a breakpoint on nt!KiDeliverApc, you can see that the target thread has an APC waiting in its kernel-mode APC queue, so whoever initialized the APC passed KernelMode to KeInitializeApc - this can be seen by disassembling nt!KeInitThread as well…