Windows System Software -- Consulting, Training, Development -- Unique Expertise, Guaranteed Results

Thread suspended while in callback

kern Member Posts: 4

Hi,

My driver is registered to some kernel callback (threads, minifilter file system callbacks, etc.). When the driver unloads, it waits until all callbacks finish. The problem is that the callbacks are mostly called at PASSIVE_LEVEL, and thus the thread in which they run can be suspended, and in such case the driver cannot unload.

What can I do? Is there a way to block the suspension kernel APC throughout the entire callback? (I only control the code inside the callback, so taking a critical section doesn't help, because the APC will be delivered when I exit the critical section, and again the thread will become suspended before the callback is done)
Should I resume suspended threads in order to unload?

Thanks!

Comments

  • 0xrepnz Member Posts: 86
    edited August 6

    I don't think it's possible to really prevent such a situation..

    Should I resume suspended threads in order to unload?

    I think it's a bad idea to do so, for obvious reasons.. You risk corrupting user applications that expect those threads to stay suspended.. For example, the .NET garbage collector suspends .NET threads in order to scan their stacks for references.

    Why does it matter to you if the driver is stuck in memory?

    - Ori Damari
  • anton_bassov Member MODERATED Posts: 5,279

    The problem is that the callbacks are mostly called at PASSIVE_LEVEL, and thus the thread in which they run can be suspended,
    and in such case the driver cannot unload.

    Before you start worrying about all this nonsense I would recommend that you read about APCs a bit, so that you understand the difference between kernel and user APCs. Suspending and resuming userland threads is done by means of the latter, which has absolutely nothing to do with your driver callbacks. Search MSDN for a doc named something along the lines of "Do waiting threads receive APCs". This doc presents a table that lists all possible scenarios. At this point you will realize that, in actuality, you are thinking about a non-existent problem....

    Anton Bassov

  • 0xrepnz Member Posts: 86
    edited August 7

    Suspending and resuming userland threads is done by means of the latter, which has absolutely nothing to do with your driver callbacks

    That's incorrect, Thread suspension is implemented using normal kernel APC and can suspend code in kernel mode too..
    As a proof, I wrote code that suspends a thread from user mode while the thread is in kernel mode (The suspension APC runs when the thread is inside PspCreateThread):

    0: kd> kc
     # Call Site
    00 nt!KiSwapContext
    01 nt!KiSwapThread
    02 nt!KiCommitThreadWait
    03 nt!KeWaitForSingleObject
    04 nt!KiSchedulerApc
    05 nt!KiDeliverApc
    06 nt!KiCheckForKernelApcDelivery
    07 nt!KeLeaveCriticalRegionThread
    08 nt!PspCreateThread
    09 nt!NtCreateThreadEx
    0a nt!KiSystemServiceCopyEnd
    0b ntdll!NtCreateThreadEx
    

    Some driver callbacks are called inside a critical region (like process, thread and image callbacks) so the suspend APC will only be delivered after the callback has finished (so the driver will be able to unload) - That's why I was wondering which scenario the OP is talking about..

    - Ori Damari
  • anton_bassov Member MODERATED Posts: 5,279

    That's incorrect, Thread suspension is implemented using normal kernel APC and can suspend code in kernel mode too..

    In order to realize that it cannot work this way, all you have to do is to consider the scenario when the target thread owns some kernel resource that can be owned at PASSIVE_LEVEL (for example, a "simple" mutex). Imagine what happens if this thread gets suspended by some external means from the userland, and the suspending thread eventually exits without having resumed the target one.....

    As a proof, I wrote code that suspends a thread from user mode while the thread is in kernel mode

    Could you please expand it a bit. To begin with, unless your code happens to be running in context of the target thread, how can you possibly know what the said thread is actually doing at the moment you are trying to suspend it???

    Anton Bassov

  • 0xrepnz Member Posts: 86
    edited August 7

    In order to realize that it cannot work this way, all you have to do is to consider the scenario when the target thread owns some kernel resource that can be owned at PASSIVE_LEVEL (for example, a "simple" mutex). Imagine what happens if this thread gets suspended by some external means from the userland, and the suspending thread eventually exits without having resumed the target one.....

    So as far as I understand to solve the issue you're referring to most (if not all?) PASSIVE_LEVEL locks enforce one of the following:

    1 - You must disable normal kernel APC before being able to acquire the lock (for example: ERESOURCE, PUSH_LOCK, KMUTEX..)
    2 - Acquiring the lock causes the IRQL to be raised to APC_LEVEL (FAST_MUTEX, KGUARDED_MUTEX..)

    This article contains a good explanation about it: https://www.osr.com/nt-insider/2015-issue3/critical-regions/

    Could you please expand it a bit. To begin with, unless your code happens to be running in context of the target thread, how can you possibly know what the said thread is actually doing at the moment you are trying to suspend it???

    Let me just send the code I used in my tests. I developed a driver that sleeps inside a thread creation callback of a specific thread, so I know the state of the creating thread when I suspend it... hope it will clarify:

    VOID
    CreateThreadNotifyRoutine(
        _In_ HANDLE ProcessId,
        _In_ HANDLE ThreadId,
        _In_ BOOLEAN Create
        )
    {
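        NTSTATUS Status;
        ULONG_PTR ThreadStartAddress = 0;
        LARGE_INTEGER Interval = { 0 };
    
        // Only thread creation is interesting here, not thread exit.
        if (!Create)
            return;
    
        // QueryThreadStartAddress is a helper that is not shown in this post.
        Status = QueryThreadStartAddress(ProcessId, ThreadId, &ThreadStartAddress);
        if (!NT_SUCCESS(Status))
            return;
    
        // Stall only the creator of the thread whose start address is the marker value.
        if (ThreadStartAddress == 0x13131313) {
            Interval.QuadPart = RELATIVE(SECONDS(20));
            KeDelayExecutionThread(KernelMode, FALSE, &Interval);
        }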
    }
    
    VOID
    DriverUnload(
        __in PDRIVER_OBJECT Driver
        )
    {
        UNREFERENCED_PARAMETER(Driver);
    
        PsRemoveCreateThreadNotifyRoutine(CreateThreadNotifyRoutine);
    }
    
    NTSTATUS
    DriverEntry(
        __inout PDRIVER_OBJECT DriverObject,
        __in PUNICODE_STRING RegKey
        )
    {
        UNREFERENCED_PARAMETER(RegKey);
    
        DriverObject->DriverUnload = DriverUnload;
        return PsSetCreateThreadNotifyRoutine(CreateThreadNotifyRoutine);
    }
    

    User program:

    DWORD Thread(PVOID ThreadParam)
    {
        DWORD ThreadId;
    
        HANDLE ThreadHandle = CreateThread(NULL, 0, (PVOID)0x13131313, NULL, 0, &ThreadId);
        if (!ThreadHandle) {
            printf("CreateThread failed\n");
            return -1;
        }
    
        return 0;
    }
    
    int main()
    {
        DWORD ThreadId;
    
        HANDLE ThreadHandle = CreateThread(NULL, 0, Thread, NULL, 0, &ThreadId);
        if (!ThreadHandle) {
            printf("CreateThread failed\n");
            return -1;
        }
    
        Sleep(3000);
    
        if (SuspendThread(ThreadHandle) == -1) {
            printf("SuspendThread failed\n");
            return -1;
        }
    
        printf("Waiting to unload...\n");
        getchar();
        ResumeThread(ThreadHandle);
        return 0;
    }
    

    In this case, the APC will only trigger after the thread notification callback has finished because PspCreateThread enters a critical region.. And that's the callstack I sent above. (I simply dumped the callstack of the suspended thread)

    Also note that if you put a breakpoint on nt!KiDeliverApc you can see that the target thread has an APC waiting in the kernel mode queue of APCs, thus the caller of the APC used the KernelMode argument when calling KeInitializeApc - this can be seen by disassembling nt!KeInitThread as well..

    - Ori Damari
  • anton_bassov Member MODERATED Posts: 5,279

    Let me just send the code I used in my tests. I developed a driver that sleeps inside a thread creation callback of a specific thread,
    so I know the state of the creating thread when I suspend it... hope it will clarify:

    You seem to be totally confused about the whole thing.....

    First of all, once you have registered your callback with PsSetCreateThreadNotifyRoutine() and not with PsSetCreateThreadNotifyRoutineEx(),
    your callback gets invoked in context of a creating thread, right. Once you are calling KeDelayExecutionThread() in your callback, you
    simply put the creator thread to rest for the period you have specified. There is no relation to your userland call to SuspendThread() whatsoever. If you registered your callback with PsSetCreateThreadNotifyRoutineEx(), it would be invoked in context of the newly-created thread, so it would be the newly-created thread that went to sleep, and still there would be no relation to SuspendThread() call whatsoever.

    So as far as I understand to solve the issue you're referring to most (if not all?) PASSIVE_LEVEL locks enforce one of the following:

    1 - You must disable normal kernel APC before being able to acquire the lock (for example: ERESOURCE, PUSH_LOCK, KMUTEX..)
    2 - Acquiring the lock causes the IRQL to be raised to APC_LEVEL (FAST_MUTEX, KGUARDED_MUTEX..)

    You understand it all wrong......

    The reason why these calls disable APC delivery is because they deal with synch constructs that cannot be acquired recursively. If a thread
    that owns such a construct gets interrupted by an APC and tries to acquire a construct that it already owns in context of an APC callback, it is going to deadlock. This is why APC delivery to the owner thread has to be disabled. It has absolutely nothing to do with suspending the target thread by the external means......

    The most interesting thing here is that your code, in actuality, demonstrates why suspending a thread that runs in the kernel cannot be done.
    You don't seem to be calling ResumeThread() , do you. Therefore, if the newly-created thread had acquired some resource like semaphore
    (which, BTW, does not disable APC delivery) or had to signal an event and you suspended it before it had done this part, it would not simply have a chance to ever do it. As a result, the system would be left in inconsistent state....

    Anton Bassov

  • 0xrepnz Member Posts: 86

    There is no relation to your userland call to SuspendThread() whatsoever

    I suspended the CREATOR thread, the one that's sleeping in my driver - take a look at the code again... The APC is executed right after APC delivery is enabled after the callback..

    - Ori Damari
  • anton_bassov Member MODERATED Posts: 5,279

    I suspended the CREATOR thread, the one that's sleeping in my driver - take a look at the code again...

    Of course you did not suspend it - the only thing that you did was delaying its execution for a certain period of time by means of KeDelayExecutionThread() call. This function is the equivalent of userland Sleep() and not of SuspendThread(), which you are calling on newly-created thread.

    What makes the difference between SuspendThread() and Sleep(), WaitXXX() and all other functions that put the calling thread to rest is that SuspendThread() puts the target thread to rest by the external means (i.e. by means of APC delivery). In actuality, the former relies upon the latter behind the scenes - SuspendThread() queues a user APC to the target thread, and the APC handler that gets invoked in context of the target thread then calls some scheduler function that puts the calling thread to rest. For the reasons that I have mentioned in my previous posts, this part can be done only when the target thread is about to make a return to the userland.

    The APC is executed right after APC delivery is enabled after the callback..

    Once you put the currently running thread to sleep (which means you do it in context of the said thread), the APC is simply irrelevant here.

    Anton Bassov

  • 0xrepnz Member Posts: 86

    SuspendThread() queues a user APC to the target thread

    Of course you did not suspend it - the only thing that you did was delaying its execution for a certain period of time by means of KeDelayExecutionThread() call

    Did you actually read the code before telling me I'm all wrong?

    First of all, you cannot argue with the debugger - NtSuspendThread queues a normal Kernel APC to suspend the thread:

    0: kd> kc
     # Call Site
    00 nt!KiInsertQueueApc
    01 nt!KiSuspendThread
    02 nt!KeSuspendThread
    03 nt!PsSuspendThread
    04 nt!NtSuspendThread
    05 nt!KiSystemServiceCopyEnd
    06 ntdll!NtSuspendThread
    07 KERNELBASE!SuspendThread
    08 SuspendKernelApc!main
    09 SuspendKernelApc!_NULL_IMPORT_DESCRIPTOR <PERF> (SuspendKernelApc+0x17400f)
    0a 0x0
    0: kd> dx ((nt!_KAPC*)@rcx)->ApcMode
    ((nt!_KAPC*)@rcx)->ApcMode : 0 [Type: char]
    

    ApcMode == 0 == KernelMode.

    Secondly, about your claim that the thread is not suspended, let's see:

    0: kd> dx (void*)((nt!_KAPC*)@rcx)->Thread
    (void*)((nt!_KAPC*)@rcx)->Thread : 0xffffe78a0a52b080 [Type: void *]
    

    So the ETHREAD is 0xffffe78a0a52b080 - now we need to wait for the KeDelayExecutionThread to end (thread notification callbacks are called inside a critical region, so the sleep is inside a critical region now - The kernel APC runs after the thread notification callback finished)

    1: kd> !thread ffffe78a0a52b080
    THREAD ffffe78a0a52b080  Cid 1df4.0f58  Teb: 0000006d0146d000 Win32Thread: 0000000000000000 WAIT: (Suspended) KernelMode Non-Alertable
    SuspendCount 1
        ffffe78a0a52b360  NotificationEvent
    Not impersonating
    DeviceMap                 ffffd78f7278ded0
    Owning Process            ffffe78a0a425080       Image:         SuspendKernelApc.exe
    Attached Process          N/A            Image:         N/A
    Wait Start TickCount      131861         Ticks: 2325 (0:00:00:36.328)
    Context Switch Count      4              IdealProcessor: 0             
    UserTime                  00:00:00.000
    KernelTime                00:00:00.000
    *** WARNING: Unable to verify checksum for SuspendKernelApc.exe
    Win32 Start Address SuspendKernelApc!ILT+4940(Thread) (0x00007ff68bba4351)
    Stack Init ffff948a9fca7c90 Current ffff948a9fca6970
    Base ffff948a9fca8000 Limit ffff948a9fca2000 Call 0000000000000000
    Priority 8 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
    Child-SP          RetAddr           : Args to Child                                                           : Call Site
    ffff948a`9fca69b0 fffff807`1f02fda0 : ffff948a`00000008 00000000`ffffffff 00000000`00000000 ffffe78a`09f18158 : nt!KiSwapContext+0x76
    ffff948a`9fca6af0 fffff807`1f02f2cf : 00000011`000001f6 ffff948a`9fca7050 ffff948a`9fca6cb0 ffffe78a`09f182c0 : nt!KiSwapThread+0x500
    ffff948a`9fca6ba0 fffff807`1f02eb73 : 0000006d`00000000 00000000`00000000 00000001`9fca6d00 ffffe78a`0a52b1c0 : nt!KiCommitThreadWait+0x14f
    ffff948a`9fca6c40 fffff807`1f011ebd : ffffe78a`0a52b360 00000000`00000005 00000000`00000000 00001f80`01010000 : nt!KeWaitForSingleObject+0x233
    ffff948a`9fca6d30 fffff807`1f032369 : 00000000`00000000 ffffe78a`0a52b080 ffffe78a`0a52b308 ffffe78a`0a52b118 : nt!KiSchedulerApc+0x3bd
    ffff948a`9fca6e60 fffff807`1f10394b : 00000000`00000000 00000000`00000000 00000000`00000000 ffffdc9a`e57b7d86 : nt!KiDeliverApc+0x2e9
    ffff948a`9fca6f10 fffff807`1f02e7af : 00000000`00000000 ffffffff`ffffff01 00000000`00000000 00000000`00000000 : nt!KiCheckForKernelApcDelivery+0x2b
    ffff948a`9fca6f40 fffff807`1f465d12 : ffffe78a`09f18080 ffffe78a`0a425080 ffff948a`9fca77c0 ffff948a`9fca6fd4 : nt!KeLeaveCriticalRegionThread+0x2f
    ffff948a`9fca6f70 fffff807`1f46582a : ffff948a`9fca72a0 ffff948a`9fca77a0 00000000`00000000 00000000`00000001 : nt!PspCreateThread+0x2a6
    ffff948a`9fca7230 fffff807`1f1f7475 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!NtCreateThreadEx+0x23a
    ffff948a`9fca7a90 00007ff9`d154c5a4 : 00007ff9`cee11f0f 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x25 (TrapFrame @ ffff948a`9fca7b00)
    0000006d`016ff158 00007ff9`cee11f0f : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000001 : ntdll!NtCreateThreadEx+0x14
    0000006d`016ff160 00007ff9`cf84b57d : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNELBASE!CreateRemoteThreadEx+0x29f
    0000006d`016ff6e0 00007ff6`8bba95f9 : 00007ff6`8bc73088 00007ff6`8bba4351 00000000`00000000 00007ff6`8bba4351 : KERNEL32!CreateThreadStub+0x3d
    0000006d`016ff730 00007ff6`8bc73088 : 00007ff6`8bba4351 00000000`00000000 00007ff6`8bba4351 00000000`00000000 : SuspendKernelApc!Thread+0x79
    0000006d`016ff738 00007ff6`8bba4351 : 00000000`00000000 00007ff6`8bba4351 00000000`00000000 0000006d`016ff764 : SuspendKernelApc!`string'
    0000006d`016ff740 00000000`00000000 : 00007ff6`8bba4351 00000000`00000000 0000006d`016ff764 cccccccc`cccccccc : SuspendKernelApc!ILT+4940(Thread)
    

    As you can see, the thread was suspended in the middle of executing PspCreateThread. SuspendCount = 1, inside nt!KiDeliverApc.

    You cannot argue with the kernel debugger...

    - Ori Damari
  • Peter_Viscarola_(OSR) Administrator Posts: 8,663

    We'll let Mr. @anton_bassov and Mr. @0xrepnz continue their discussion (OK... I admit it.), while we try and figure out the OP's issue.

    @kern ... You wrote:

    callbacks are mostly called at PASSIVE_LEVEL, and thus the thread in which they run can be suspended

    Please clarify, exactly, what you mean by "suspended" and how your callback thread could get into this state?

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • anton_bassov Member MODERATED Posts: 5,279

    First of all, you cannot argue with the debugger

    Sure. However, you still need to give a proper interpretation to the results that it presents, and, as we will see it shortly, this is not really the case here.

    So the ETHREAD is 0xffffe78a0a52b080 - now we need to wait for the KeDelayExecutionThread to end (thread notification
    callbacks are called inside a critical region, so the sleep is inside a critical region now - The kernel APC runs after the thread notification
    callback finished)

    I really hope you are not going to argue that all this happens in context of the creating thread, are you. This thread may, indeed, be receiving the kernel APCs for the various reasons. Now the most "exciting" part comes in.....

    Did you actually read the code before telling me I'm all wrong?

    Actually, I did.....

    In case you still have any doubts about it, let's look at your code one more time

    HANDLE ThreadHandle = CreateThread(NULL, 0, Thread, NULL, 0, &ThreadId);

    I hope you are not going to argue that the "ThreadHandle" variable refers to the newly-created thread, are you. Now look how you use this handle

    if (SuspendThread(ThreadHandle) == -1)

    So which of them do you think you are suspending, judging from the above code alone - a creator thread or a newly-created one????

    Therefore, let me "redirect" your question. Did you actually read your OWN code before trying to give an interpretation to the debugger output (in fact, before starting this nonsensical argument, in the first place) ???

    Anton Bassov

  • 0xrepnz Member Posts: 86
    edited August 8
    Please read the code again..

    - MainThread creates SecondThread
    - SecondThread creates ThirdThread
    - SecondThread gets into a sleep inside the thread creation callback in the driver.
    - MainThread suspends SecondThread
    - After the sleep finishes, the suspend kernel APC executes (While the code is in PspCreateThread)
    - Ori Damari
  • anton_bassov Member MODERATED Posts: 5,279

    Oops, what I overlooked is the following part

    CreateThread(NULL, 0, (PVOID)0x13131313, NULL, 0, &ThreadId);

    Look what you are actually doing. The main thread creates a child thread and goes to sleep for 3 seconds, which is a sort of infinity, by the computer standards. After that, it tries to suspend the child.

    Meanwhile, its child creates a thread at some arbitrary address in the middle of nowhere. Assuming that this call succeeds, your app has all chances in the world to crash and burn long before main thread even has a chance to suspend its child, right.

    Anton Bassov

  • 0xrepnz Member Posts: 86

    your app has all chances in the world to crash and burn long before main thread even has a chance to suspend its child, right.

    No, the kernel driver puts the second thread into a sleep for 20 seconds so the suspend APC WILL EXECUTE. If you really want this app not to crash afterwards, You can do the following:

    VirtualAlloc((PVOID)0x13131313, 4096, MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    RtlCopyMemory((PVOID)0x13131313, "\xeb\xfe", 2);
    

    @anton_bassov I'm going to stop this discussion now. You have all the information to conclude that NtSuspendThread queues a Normal Kernel APC and that it's possible to suspend kernel code from user mode, while the kernel code is not inside a critical region... So the problem the OP describes does exist.. If you still think I'm wrong then show some evidence for that

    - Ori Damari
  • anton_bassov Member MODERATED Posts: 5,279

    No, the kernel driver puts the second thread into a sleep for 20 seconds so the suspend APC WILL EXECUTE.

    Your app is going to crash immediately after the third thread attempts its userland execution and returns to the user mode, because its userland start address is invalid. Your driver goes to sleep in context of a thread creation notification callback, which is not meant to control the newly-created thread's execution, so that it may start running before the callback returns control (in fact, even before your callback gets invoked). Therefore, your app may well crash while the second thread still sleeps. In fact, it does not really matter, as we will see shortly.

    If you really want this app not to crash afterwards, You can do the following:

    But you did not do anything, did you. Therefore, if your app did not crash, the only conclusion that can be made is that thread creation has failed, for some reason....

    You have all the information to conclude

    Well, it looks like I am getting more and more info to conclude that you give the wrong interpretation to the debugger output.....

    Anton Bassov

  • kern Member Posts: 4
    edited August 11

    Sorry for my delayed response, I thought my question was being ignored - apparently not. I appreciate all the replies!
    Actually @0xrepnz understood what I meant - his example is very similar to what I encounter. Let me provide more details:

    I register a callback to some file operation (mini filter). Inside the callback, I send data about the file operation to a user mode code and wait for it to respond (using KeWaitForMultipleObjects, not alertable) before letting the callback finish. I'm in the context of the thread which did the file operation (at least for some of the callbacks). During this wait, this thread gets a suspension APC.
    Because of the wait, the kernel APC for suspending the thread is delivered, the thread becomes suspended and my callback doesn't return.

    I've tried entering a critical section around the wait - as this blocks APCs, however, upon exiting the critical section the OS triggers delivery of APCs, and again the thread becomes suspended.

    Why is that a problem? because I sometimes need to unload the driver (upgrade, disabling of a feature). When I do that I have to make sure all the callbacks have finished because they reference my code and I cannot unload it.
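
    The unload constraint described above can be sketched with a small user-mode toy model in plain C (not driver code). The `ToyRundown` type and the `toy_rundown_*` names are invented for illustration; the WDK's rundown protection routines (`ExAcquireRundownProtection` and related calls) are built around a similar idea, but nothing in this thread says which mechanism the driver actually uses. The model is single-threaded and not thread-safe; it only shows why one callback that never returns keeps the unload from completing.

    ```c
    #include <assert.h>
    #include <stdbool.h>

    /* Toy model: counts callbacks that are currently executing driver code. */
    typedef struct ToyRundown {
        int  active;    /* callbacks currently inside the driver */
        bool draining;  /* unload has started; new entries are refused */
    } ToyRundown;

    /* Called on callback entry. Fails once unload has begun. */
    bool toy_rundown_acquire(ToyRundown *r) {
        if (r->draining)
            return false;
        r->active++;
        return true;
    }

    /* Called on callback exit. A suspended callback never reaches this point. */
    void toy_rundown_release(ToyRundown *r) {
        assert(r->active > 0);
        r->active--;
    }

    /* Non-blocking stand-in for the unload wait: marks the driver as draining
       and reports whether it would be safe to unload right now. */
    bool toy_rundown_try_unload(ToyRundown *r) {
        r->draining = true;
        return r->active == 0;
    }
    ```

    In this model, a callback that acquired a reference and was then suspended before releasing it makes `toy_rundown_try_unload` return false indefinitely, which is the situation described in this post.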

    @0xrepnz good point regarding .NET garbage collector.

    There was another thread here with a similar problem, but it was not resolved:
    https://community.osr.com/discussion/comment/293132#Comment_293132

    Thanks again.

  • anton_bassov Member MODERATED Posts: 5,279

    I register a callback to some file operation (mini filter). Inside the callback, I send data about the file operation to a user mode code
    and wait for it to respond (using KeWaitForMultipleObjects, not alertable) before letting the callback finish. I'm in the context
    of the thread which did the file operation (at least for some of the callbacks). During this wait, this thread gets a suspension APC.
    Because of the wait, the kernel APC for suspending the thread is delivered, the thread becomes suspended and my callback doesn't return.

    Well, I've got to admit that I wasted plenty of time on this thread on explaining (unfortunately, without any success, as I can see) something that the official documentation describes with a SINGLE sentence

    https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-suspendthread

    [begin quote]

    Suspending a thread causes the thread to stop executing user-mode (application) code

    [end quote]

    As you see it yourself, the problem (at least the way it has been presented) simply does not exist, because suspending a thread affects only its userland execution.

    Your problem arises not because of suspending a thread but because you make your driver code wait for the response from the userland, where everything, indeed, may happen. For example, your target thread may get externally suspended by some third-party app.

    Therefore, it presents yet another example of a security issue that may potentially arise if you make your driver directly depend on the userland app...

    Anton Bassov

  • Peter_Viscarola_(OSR) Administrator Posts: 8,663

    So... again, NOT reading the argument between Mr. @anton_bassov and Mr. @0xrepnz ... But just reading the comments above:

    IF the problem the OP has is that an APC is delivered that results in the user's thread being suspended... isn't that the purpose of Critical Regions? We explain that here.

    If the problem is something else... OK.

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • kern Member Posts: 4
    edited August 11

    @anton_bassov I see your point (and thank you for investing time in this thread - it's really appreciated!). However, I do see that kernel execution is being suspended as well, here is the flow:
    callback begins (in thread context) --> I enter critical section --> I wait for user mode --> wait is satisfied, kernel execution continues --> I exit critical section --> APCs are delivered and kernel execution is suspended (exiting critical region triggers delivery of APCs).
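
    The flow above can be mimicked with a user-mode toy model in plain C. This is only a sketch under simplifying assumptions: `ToyThread` and the `toy_*` functions are invented names, the single counter is not the real KTHREAD layout, and "suspended" is just a letter in a trace. What it does capture is the ordering: a suspend request queued while APCs are disabled is delivered at the moment the region is left, which is still inside the callback.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <string.h>

    /* Toy model, NOT the real KTHREAD: one disable counter and one pending
       suspend request per thread, plus a short event trace. */
    typedef struct ToyThread {
        int  apc_disable;      /* > 0 while inside a critical region */
        bool suspend_pending;  /* a suspend request has been queued */
        char log[8];           /* one letter per event */
        int  log_len;
    } ToyThread;

    static void toy_log(ToyThread *t, char ev) {
        assert(t->log_len < (int)sizeof(t->log) - 1);
        t->log[t->log_len++] = ev;
        t->log[t->log_len] = '\0';
    }

    /* Deliver the pending request if nothing blocks it. */
    static void toy_deliver(ToyThread *t) {
        if (t->suspend_pending && t->apc_disable == 0) {
            t->suspend_pending = false;
            toy_log(t, 'S');   /* S = thread becomes suspended here */
        }
    }

    static void toy_enter_region(ToyThread *t) { t->apc_disable++; }

    static void toy_leave_region(ToyThread *t) {
        assert(t->apc_disable > 0);
        t->apc_disable--;
        toy_deliver(t);        /* leaving the region is a delivery point */
    }

    static void toy_queue_suspend(ToyThread *t) {
        t->suspend_pending = true;
        toy_deliver(t);
    }

    /* Region only around the wait, all of it inside the callback. */
    const char *toy_callback_flow(ToyThread *t) {
        memset(t, 0, sizeof *t);
        toy_enter_region(t);
        toy_log(t, 'W');       /* W = waiting for user mode */
        toy_queue_suspend(t);  /* another thread requests suspension meanwhile */
        toy_leave_region(t);   /* the suspend lands here */
        toy_log(t, 'R');       /* R = callback returns (only after a resume) */
        return t->log;
    }
    ```

    The trace produced by `toy_callback_flow` is `"WSR"`: the suspension (`S`) happens after the wait but before the callback returns (`R`).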

    I'm seeing this problem in practice, not in theory. @Peter_Viscarola_(OSR) since I exit the critical region before the callback is done, I still get suspended while I'm inside the callback...

    Thanks for your replies!

  • 0xrepnz Member Posts: 86

    One solution to this issue is to have some driver that's always loaded in memory and this driver will not get updates - and the driver that's always in memory can enter a critical region and invoke the second driver, then when the callback is done you can exit the critical region - In my opinion that's pretty "extreme" solution for a rare issue so not sure it's worth it..
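
    A rough user-mode sketch of that two-driver idea, in plain C, might look like the following. Everything here is hypothetical: `StubThread`, `stub_dispatch` and `inner_callback` are made-up names, and the flags only model "APCs disabled" and "suspended". The resident stub keeps the region held for the whole call into the updatable code, so the suspension can only land in the stub, after the inner callback has returned.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Toy per-thread state; not a real kernel structure. */
    typedef struct StubThread {
        int  apc_disable;             /* > 0 while the stub holds the region */
        bool suspend_pending;         /* a suspend request has been queued */
        bool suspended;               /* the request has been delivered */
        bool inner_returned;          /* updatable callback has finished */
        bool suspended_inside_inner;  /* delivery happened in updatable code */
    } StubThread;

    typedef void (*InnerCallback)(StubThread *);

    /* Delivery point: acts only when nothing is blocking delivery. */
    static void stub_check_apc(StubThread *t) {
        if (t->suspend_pending && t->apc_disable == 0) {
            t->suspend_pending = false;
            t->suspended = true;
            if (!t->inner_returned)
                t->suspended_inside_inner = true;
        }
    }

    /* Hypothetical callback in the updatable driver: a suspend request
       arrives while it is running. */
    void inner_callback(StubThread *t) {
        t->suspend_pending = true;
        stub_check_apc(t);
    }

    /* Hypothetical resident stub: holds the region across the whole call. */
    void stub_dispatch(StubThread *t, InnerCallback inner) {
        t->apc_disable++;
        if (inner != NULL)
            inner(t);
        t->inner_returned = true;
        t->apc_disable--;
        stub_check_apc(t);  /* suspension lands in the stub, not in inner */
    }
    ```

    Calling `inner_callback` directly sets `suspended_inside_inner`, while going through `stub_dispatch` leaves it clear even though the thread still ends up suspended. The cost is the one mentioned above: the stub itself can never be unloaded while such a thread is parked in it.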

    - Ori Damari
  • anton_bassov Member MODERATED Posts: 5,279

    callback begins (in thread context) --> I enter critical section --> I wait for user mode --> wait is satisfied, kernel execution continues
    --> I exit critical section --> APCs are delivered and kernel execution is suspended (exiting critical region triggers delivery of APCs).

    I'm seeing this problem in practice, not in theory

    Are you sure there is no diagnostic mistake here? To begin with, SuspendThread() is not the kind of call that you would normally see all over the place. It is normally used by the debuggers, and is often used hand-in-hand with SetThreadContext() (this is apart from the documentation
    making it clear that only userland execution is affected by this call). Probably, this is something entirely different (like, for example, a kernel APC handler simply blocking on some event that subsequently gets signaled by some kernel component)? This scenario is, indeed, perfectly feasible (in fact, quite likely).

    If this is the case...well, then it has absolutely nothing to do with suspending a thread.....

    Anton Bassov

  • 0xrepnz Member Posts: 86

    Probably, this is something entirely different

    OK maybe some other source can convince you. For example, @Scott_Noone_(OSR) wrote in this thread:

    As long as the CombinedApcDisable field in the KTHREAD is zero the kernel
    APCs can be delivered. I forced it to happen in a VM to get a call stack by
    putting the thread into an infinite loop and then suspending it using
    Process Explorer. Here are the steps and the resulting call stack:

    kd> bp Nothing!NothingRead
    kd> g
    Nothing!NothingRead:
    fffff801`66bf61b0 4889542410 mov qword ptr [rsp+10h],rdx
    kd> ?? @$thread->Tcb.CombinedApcDisable
    unsigned long 0
    kd> ew @$ip 0xFEEB
    kd> g
    
    Thread is now taking up 100% CPU. Now suspend the thread using Process
    Explorer and check the call stack:
    
    kd> k
    *** Stack trace for last set context - .thread/.cxr resets it
    # Child-SP RetAddr Call Site
    00 ffffd001`8ebc64a0 fffff803`3d250d7e nt!KiSwapContext+0x76
    01 ffffd001`8ebc65e0 fffff803`3d2507f9 nt!KiSwapThread+0x14e
    02 ffffd001`8ebc6680 fffff803`3d2788d0 nt!KiCommitThreadWait+0x129
    03 ffffd001`8ebc6700 fffff803`3d24d64c nt!KeWaitForSingleObject+0x2c0
    04 ffffd001`8ebc6790 fffff803`3d24e279 nt!KiSchedulerApc+0x78
    05 ffffd001`8ebc67f0 fffff803`3d374723 nt!KiDeliverApc+0x209
    06 ffffd001`8ebc6870 fffff801`66bf61b0 nt!KiApcInterrupt+0xc3
    07 ffffd001`8ebc6a08 fffff803`3d605788 Nothing!NothingRead
    08 ffffd001`8ebc6a10 fffff803`3d603336 nt!IopSynchronousServiceTail+0x170
    09 ffffd001`8ebc6ae0 fffff803`3d37b1b3 nt!NtReadFile+0x656
    0a ffffd001`8ebc6bd0 00007ffc`38420caa nt!KiSystemServiceCopyEnd+0x13
    0b 0000009d`fd31fa28 00007ffc`358f83a8 ntdll!NtReadFile+0xa
    
    -scott
    OSR
    

    Moreover, YOU YOURSELF wrote here:

    My question , thread suspension is implemented by User mode APC or Normal Kernel mode APC.

    I would say by the kernel-mode one - this is the reason why APC delivery to a thread that holds kernel-level synch construct that allows preemption should be disabled. Just a couple days ago we discussed a scenario when thread gets suspended by UM app while running in the KM, which may result in a deadlock if it holds synch construct that allows preemption at the moment ( I just could not miss my chance to criticize Windows yet another time, could I)....

    If you cannot be convinced by this, not sure what will convince you... I personally think the most convincing one was this:

    0: kd> kc
     # Call Site
    00 nt!KiInsertQueueApc
    01 nt!KiSuspendThread
    02 nt!KeSuspendThread
    03 nt!PsSuspendThread
    04 nt!NtSuspendThread
    05 nt!KiSystemServiceCopyEnd
    06 ntdll!NtSuspendThread
    07 KERNELBASE!SuspendThread
    08 SuspendKernelApc!main
    09 SuspendKernelApc!_NULL_IMPORT_DESCRIPTOR <PERF> (SuspendKernelApc+0x17400f)
    0a 0x0
    0: kd> dx ((nt!_KAPC*)@rcx)->ApcMode
    ((nt!_KAPC*)@rcx)->ApcMode : 0 [Type: char]
    
    - Ori Damari
  • anton_bassov Member MODERATED Posts: 5,279

    If you cannot be convinced by this, not sure what will convince you...

    I've got to admit that you are right - I just checked WRK, and found the following line in KeInitThread() implementation

    KeInitializeApc(&Thread->SuspendApc,
                    Thread,
                    OriginalApcEnvironment,
                    (PKKERNEL_ROUTINE)KiSuspendNop,
                    (PKRUNDOWN_ROUTINE)KiSuspendRundown,
                    KiSuspendThread,
                    KernelMode,
                    NULL);

    As we can see with our own eyes, "SuspendApc" is, indeed, initialised as a KM one, which means that the official documentation is "way too optimistic", so to say......

    Anton Bassov
