Most stable way of finding the start address of every kernel thread?

henrik_meida Member Posts: 118

Hey everyone,

As part of a project, I want to write a driver that scans every kernel thread to find its start address, plus any other info I can gather, such as whether it's active or in a waiting state (if possible).

I need to do this in order to do two things:

  1. First, scan the content surrounding the thread start address to see if it matches any malicious signature.

  2. Detect hidden kernel code, such as the case where a bootkit (before my driver is even loaded) created a thread at the very beginning of system startup. The start address of that thread could then point to a kernel address which doesn't belong to any kernel module, which I assume is 100% malicious if that's the case (right?).

So my question is: what is the most stable/generic way of achieving this on Windows 7+? Which APIs do I need to use?

Note that if the solution involves using undocumented APIs/structs, that's fine by me, as this is not for production and is for detecting malicious drivers in a sandbox environment.

Comments

  • craig_howard Member Posts: 242

    Before you embark on your journey for items 1 and 2, you really need to do some GoogleFu on "shellcode", "gadgets" and "Heaven's Gate" (not the movie voted worst ever created, the kernel-mode one). Most (serious) malware operates by using these techniques, as ASLR takes care of item 2 and item 1 is only found with usermode threats ...

  • Tim_Roberts Member - All Emails Posts: 14,221

    You really need to understand how totally hopeless this is. Once you have introduced malicious code in the kernel, it's game over. Whatever you can do, they can undo, or prevent. The bad guys have way more experience than you do.

    The start address for a kernel thread is just that -- where it started. It doesn't mean that it's executing anywhere near there now. And let's say you get a kernel address. How will you determine that it's part of a driver?

    There certainly are legitimate kernel drivers that build code on the fly. You can't assume that any generated code is automatically malicious.

    Hopeless.

    Tim Roberts
    Providenza & Boekelheide, Inc.

  • MBond2 Member Posts: 417

    I think you need to do a lot more homework before you even begin a project like this. I will give you a hint though - effective detectors of malware do not run in the same context as the malware that they are designed to detect. And common hypervisors have telltales that are publicly too well known to be used directly.

  • henrik_meida Member Posts: 118
    edited January 16

    @craig_howard said:
    ASLR takes care of item 2 and item 1 is only found with usermode threats ...

    Hmm, not really.
    No offence, but I think you need to do a little GoogleFu on some of the recent bootkits (MBR and UEFI) and some of the recent manual mappers. So please explain to me how ASLR defeats all these bootkits that inject code into kernel memory during boot, and how ASLR defeats manual mappers. This doesn't even make any sense.
    Most of the ones that I have seen, especially these recent bootkits, will, apart from hooking, start a kernel thread to do some work, and those threads can be used to detect them.

  • henrik_meida Member Posts: 118
    edited January 16

    @Tim_Roberts said:
    You really need to understand how totally hopeless this is. Once you have introduced malicious code in the kernel, it's game over. Whatever you can do, they can undo, or prevent. The bad guys have way more experience than you do.

    The start address for a kernel thread is just that -- where it started. It doesn't mean that it's executing anywhere near there now. And let's say you get a kernel address. How will you determine that it's part of a driver?

    There certainly are legitimate kernel drivers that build code on the fly. You can't assume that any generated code is automatically malicious.

    Hopeless.

    Tim, I have seen you make this argument many times: "as long as the system is infected, then there is no hope and they can always bypass you, so just let it go!" If this was true, no EDR/AV would exist as of today.

    Sure, they can always bypass the detection techniques, but at least we can try to detect as many as we can.

  • henrik_meida Member Posts: 118
    edited January 16

    @MBond2 said:

    • effective detectors of malware do not run in the same context as the malware that they are designed to detect. And common hypervisors have tell tales that are publicly too well known to be used directly

    That is correct. Right now this is just for a sandbox, and using a hypervisor is indeed better in that case, but this project might be moved to production builds if it performs well, so I cannot rely on a hypervisor to monitor the rootkit, since many customers do not have systems that support it. So even though we are running in the same context as the rootkit and they can eventually bypass any detection technique, we still need to try to detect most of them.

  • henrik_meida Member Posts: 118

    Currently the best solution I have found is calling PsLookupThreadByThreadId with thread IDs starting from 4 and going up to something like 4000, and, if the thread exists, using PsIsSystemThread plus some stack walking to get most of the required information, such as the start address and recent addresses of execution. But this requires parsing some undocumented structs such as KTHREAD, so I was wondering if there is a better and more stable solution?
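
The sweep described above can be modelled in user mode to see how its result depends on the chosen upper bound. This is a Python sketch, not driver code: `FAKE_THREADS`, `lookup_thread`, and `sweep_system_threads` are hypothetical stand-ins for the thread table, `PsLookupThreadByThreadId`, and the driver loop. The step of 4 assumes thread IDs are multiples of four, which is an observed implementation detail rather than a documented guarantee. A real driver would also need to release each successfully looked-up thread with `ObDereferenceObject`.

```python
# Hypothetical thread table: thread ID -> True for a system thread.
FAKE_THREADS = {
    0x0008: True,
    0x02C4: True,
    0x1F3C: False,  # user-mode thread, filtered out by the sweep
    0x5A3C: True,   # system thread with an ID above 4000
}


def lookup_thread(thread_id):
    """Stand-in for PsLookupThreadByThreadId: None when the ID is unused."""
    return FAKE_THREADS.get(thread_id)


def sweep_system_threads(upper_bound, step=4):
    """Probe every candidate ID up to upper_bound and keep system threads."""
    found = []
    for thread_id in range(step, upper_bound + 1, step):
        is_system = lookup_thread(thread_id)
        if is_system:  # None (no such thread) and False (user thread) are skipped
            found.append(thread_id)
    return found


print([hex(t) for t in sweep_system_threads(4000)])     # bound from the post
print([hex(t) for t in sweep_system_threads(0x10000)])  # larger bound
```

With these made-up IDs, the first sweep reports `0x8` and `0x2c4` and misses `0x5a3c`, which only the larger bound picks up. Whatever bound is chosen, threads with higher IDs stay invisible to this method.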

  • henrik_meida Member Posts: 118
    edited January 16

    Also note that I have found three other methods for finding system threads:

    1. Using the ThreadListHead of the System process's EPROCESS

    2. Using PspCidTable

    3. Using DispatcherReadyListHead

    But all three of them rely on a lot of undocumented stuff, so the previously mentioned approach seemed like the most stable one. If there is any better approach, or if the mentioned approach has any serious flaws, please do let me know.

  • Tim_Roberts Member - All Emails Posts: 14,221

    If this was true, no EDR/AV would exist as of today.

    Antivirus products are largely detecting user-mode attack vectors, in the hope of stopping infections before they set in. Rootkit detection in a live system is, at best, hit and miss. You are, after all, pitting yourself against government-funded efforts by two of the largest nations on earth.

    Tim Roberts
    Providenza & Boekelheide, Inc.

  • MBond2 Member Posts: 417

    This thread has more evidence for my axiom that anti-virus software is worse than the malware it is designed to prevent.

    Thread IDs are not allocated sequentially and are not confined to 'small' values like 4000. Systems that run for a long time, have many cores, or many devices installed (and hence drivers loaded) may generate substantially more system threads than you expect.

    But think about your task a little more. Let's assume for a moment that the malware authors don't know that your software exists, and have taken no steps to bypass the checks that you are making. Let's further assume that you are somehow able to suspend all other threads in the system except yours so that there are no changes happening during your analysis. These are very unreasonable assumptions, but nevertheless:

    you want to detect the location in memory where a thread began from its current stack location and then 'scan' that module to see if it is malware of some kind. Assuming that you can do that, what action do you plan to take?

    next consider how the calling conventions work for x64. Code that does not intend to return, and wants to hide its origins, can easily overwrite its own stack to obfuscate the return addresses. Code that allocates executable memory, generates or copies instructions into it, then calls into that memory and erases the return address from its stack would then be untraceable using this method. This is an obvious technique that has been used for at least 20 years - x86 calling conventions can be used for this too.

    then consider that the start address might not even be in a binary that was loaded using the normal paging logic. Remember that the CPU only cares about page execute protection and the author of code that is not intended to work within the normal rules of the system can do all sorts of things that you might not expect.

    I'm not going to go into more details because these posts live forever and who knows who will read them and what purposes they have in mind.

  • Don_Burn Member - All Emails Posts: 1,755

    The firms I worked with in this area had all concluded:

    1. Stop the infection before it gets in if possible.
    2. Once infected consider providing a way to boot something that accesses the system disk to purge the infection.

    All of them gave up on the idea of fixing an infected system on the fly. You should probably save yourself the pain and do likewise.

  • henrik_meida Member Posts: 118
    edited January 17

    @Tim_Roberts said:

    If this was true, no EDR/AV would exist as of today.

    Antivirus products are largely detecting user-mode attack vectors

    Wow, I hope none of the poor driver developers at EDR companies see this statement. All the hard work they are doing, and even veteran driver developers such as yourself think that AVs/EDRs are mostly detecting user-mode threats! Have you ever spoken or worked with one of them?

    This is 100% incorrect. If you knew the amount of work many of them are doing to stop kernel-mode threats, you would have apologized right away for even saying such a thing, lol.
    For example, Kaspersky's driver developers were detecting hooks on the lowest-level drivers/devices on the disk stack (driver_object hooks, device_object hooks, inline hooks, device extension hooks, etc.) in 2006. So imagine how much more they are doing in 2022.

  • henrik_meida Member Posts: 118
    edited January 17

    @MBond2 said:
    This thread has more evidence for my axiom that anti-virus software is worse than the malware it is designed to prevent.

    Thread IDs are not allocated sequentially and are not confined to 'small' values like 4000. Systems that run for a long time, have many cores, or many devices installed (and hence drivers loaded) may generate substantially more system threads than you expect.

    Thanks for the detailed response, MBond2. Regarding the system thread IDs, I don't know why, but on every system that I checked, even those that ran many apps and were up for a long time, there was no SYSTEM thread ID higher than 0x3000. But that doesn't matter; I can go even as high as 0x10000.
    So compared to the other 3 methods that I mentioned (PspCidTable, etc.), which one do you think is the best approach to gather system threads? Or is there any better approach?

    Also, regarding stack walking and gathering recent points of execution for that thread, what would be your approach for solving this, assuming the threat doesn't obfuscate the stack and we want to find recent points of execution on both x86 and x64?

    you want to detect the location in memory where a thread began from its current stack location and then 'scan' that module to see if it is malware of some kind. Assuming that you can do that, what action do you plan to take?

    This is mostly for threat detection only, not disinfection. I just want to detect whether or not something malicious is happening on the system; the rest is manual analysis of the system (forcing a crash and analyzing the crash dump, etc.) or of the malicious driver itself.

    then consider that the start address might not even be in a binary that was loaded using the normal paging logic. Remember that the CPU only cares about page execute protection and the author of code that is not intended to work within the normal rules of the system can do all sorts of things that you might not expect.

    Again, this is only for threat detection, so if I find a system thread that is executing code that is not inside any kernel module, that is suspicious at least, if not 100% malicious, and therefore requires a further look. Although I would argue that we can safely say this is 100% malicious, and I doubt any benign kernel module would start a thread in such a way.
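
The check under discussion, whether an address falls inside any loaded kernel module, reduces to an interval lookup. The following is a minimal user-mode sketch in Python, not driver code: the module names, bases, and sizes are made-up values, and `find_module` and `classify_start_address` are hypothetical helpers. In a driver the list would have to come from the system, for example from `AuxKlibQueryModuleInformation`; that choice is an assumption and was not named in this thread.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Sequence


class Module(NamedTuple):
    name: str
    base: int
    size: int


# Hypothetical module list, sorted by base address (values are illustrative).
MODULES = [
    Module("ntoskrnl.exe", 0xFFFFF80000000000, 0x00A00000),
    Module("hal.dll", 0xFFFFF80000A00000, 0x00080000),
    Module("disk.sys", 0xFFFFF80001000000, 0x00020000),
]


def find_module(modules: Sequence[Module], address: int) -> Optional[Module]:
    """Return the module whose [base, base + size) range contains address."""
    bases = [m.base for m in modules]
    index = bisect_right(bases, address) - 1  # last module starting at or below address
    if index < 0:
        return None
    candidate = modules[index]
    if address < candidate.base + candidate.size:
        return candidate
    return None  # address falls in a gap between modules


def classify_start_address(modules: Sequence[Module], address: int) -> str:
    """Name the owning module, or report the address as not backed by any module."""
    owner = find_module(modules, address)
    return owner.name if owner else "unbacked"
```

An "unbacked" result is a reason to look closer rather than a verdict: as pointed out earlier in the thread, legitimate drivers can build code on the fly.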

  • henrik_meida Member Posts: 118
    edited January 17

    @Don_Burn said:
    The firms I worked with in this area had all concluded:

    1. Stop the infection before it gets in if possible.
    2. Once infected consider providing a way to boot something that accesses the system disk to purge the infection.

    All of them gave up on the idea of fixing an infected system on the fly. You should probably save yourself the pain and do likewise.

    This post is not about disinfecting the system. I just want to detect whether or not something malicious is happening on the system, AKA threat detection. The rest of it is not my worry.
    Currently I just want to find the most stable approach for scanning system threads and finding their recent points of execution (such as via stack walking) on both x64 and x86.

  • henrik_meida Member Posts: 118
    edited January 17

    @MBond2 said:
    next consider how the calling conventions work for x64. Code that does not intend to return, and wants to hide its origins, can easily overwrite its own stack to obfuscate the return addresses. Code that allocates executable memory, generates or copies instructions into it, then calls into that memory and erases the return address from its stack would then be untraceable using this method. This is an obvious technique that has been used for at least 20 years - x86 calling conventions can be used for this too.

    Also how about RtlCaptureStackBackTrace + RtlWalkFrameChain?

  • MBond2 Member Posts: 417

    This thread is getting old very quickly. To highlight your need to learn more before attempting a project of this complexity, consider your question about RtlCaptureStackBackTrace.

    First consider, why does this function exist? Today Windows runs primarily on x64 machines. Less so ARM and x86 machines, but formerly there were many more CPU architectures supported - MIPS, Alpha, even Itanium. Every single one of them has a different way in which compilers can set up stack frames and how functions 'call' and 'return'. Some, like x86, have multiple calling conventions even in unoptimized code. To some degree, the possible stack construction is OS specific, but generally not - it is generally a CPU architecture + compiler thing. So why is there an API (DDI) like this one and how could it work? This API exists to provide a platform-independent way of performing a stack capture (and subsequent stack walk or hash). Using this API one can write C or C++ code once, then recompile it for different architectures and expect it to work within the limits of the stack tracking ability of that platform.

    Next think about the stack tracing ability of the platforms of interest. x64 Windows is one of the best in this regard because there is a single calling convention and it does not allow arbitrary jumps - even in optimized code there is a specified structure. This design was informed by a lot of work on x86 and the extreme to impossible level of difficulty of stack walking on that platform. Yes, code that is not even trying to obfuscate its purpose can be impossible to collect a stack trace from. Is that code automatically malicious? Assuredly not.

    And that code is not even attempting to hide its purpose. Consider the very simple method of replacing the return address on the current thread stack with an alternate address which is the thread entry point of a well-known system thread. The thread has never executed any of those instructions, but if you follow the stack trace, you will locate a completely different module than the one where the thread originated. Much more sophisticated techniques exist.

    The first thing that you need to learn about is how the platform works, at least at the CPU instruction level. The Spectre / Meltdown guys tell us that we probably need to learn it at the block diagram electrical level too if we want to make serious security software.
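
The return-address replacement described above can be modelled with a toy stack. This Python sketch assumes a simple saved-frame-pointer chain (x86-style; x64 Windows unwinding relies on unwind metadata instead), and every address and name in it is invented for illustration. It shows only that a walker reports whatever the stack memory currently contains.

```python
WORD = 8  # bytes per stack slot in this toy model


def walk_frames(memory, frame_pointer, max_frames=16):
    """Follow the chain: memory[fp] is the caller's fp, memory[fp + WORD] the return address."""
    return_addresses = []
    while frame_pointer and len(return_addresses) < max_frames:
        return_address = memory.get(frame_pointer + WORD)
        if return_address is None:
            break
        return_addresses.append(return_address)
        frame_pointer = memory.get(frame_pointer, 0)
    return return_addresses


HIDDEN_CODE = 0xFFFFA00012340010         # hypothetical address outside any module
KNOWN_THREAD_ENTRY = 0xFFFFF80000123450  # hypothetical entry point inside a known module
INNER_RETURN = 0xFFFFF80000200000        # hypothetical return address of the innermost frame

# Two frames: the innermost at 0x1000, its caller at 0x1040 (end of the chain).
stack = {
    0x1000: 0x1040, 0x1000 + WORD: INNER_RETURN,
    0x1040: 0,      0x1040 + WORD: HIDDEN_CODE,
}

honest = walk_frames(stack, 0x1000)
stack[0x1040 + WORD] = KNOWN_THREAD_ENTRY  # the thread overwrites its own stack slot
spoofed = walk_frames(stack, 0x1000)
```

After the overwrite, `spoofed` no longer contains `HIDDEN_CODE`; the walk ends at `KNOWN_THREAD_ENTRY` even though the thread never ran from there.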

  • Peter_Viscarola_(OSR) Administrator Posts: 8,847

    If there ever was a thread on NTDEV that I was glad to have stayed out of…. This one is it.

    @henrik_meida: I’m really not sure what you expected to accomplish by posting your query here. If you need overall help with this sandbox project, I’d suggest you hire a kernel-mode architect or cybersecurity consultant or both. At least that way you can discuss details, tradeoffs, and overall goals interactively, and in depth. Seeking this level of wisdom from a forum, even a great one with highly experienced and very willing engineers like this one, seems to me to be unlikely to lead to wisdom, even if it does result in valid information.

    Engineering is more than “now, how do I do this next step?”

    Peter

    Peter Viscarola
    OSR
    @OSRDrivers

  • henrik_meida Member Posts: 118
    edited January 18

    For anyone from the future coming upon this thread: your solution is in the Process Hacker repository. I found all my answers there (from system thread scanning to stack walking/tracing), and the code is pretty well written and clean. I do suggest that, if you want to find malicious threads, you implement all 4 possible ways of scanning system threads, in case the malicious actor has hidden its thread from only one method (such as PspCidTable).

    I suggest this thread be locked at this point. Thanks, everyone, for your input.
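
The suggestion above to combine several enumeration methods amounts to a cross-view comparison: collect the thread IDs each method reports and flag IDs that some method does not see. A minimal Python sketch follows. The view names mirror methods listed in this thread, but the ID sets are made up and `cross_view_diff` is a hypothetical helper. Threads created or exited between two enumerations would show up as differences too, so a real comparison would need to account for that.

```python
def cross_view_diff(views):
    """Map each thread ID missing from at least one view to the views that lack it."""
    all_ids = set().union(*views.values())
    report = {}
    for thread_id in sorted(all_ids):
        missing_from = sorted(
            name for name, ids in views.items() if thread_id not in ids
        )
        if missing_from:
            report[thread_id] = missing_from
    return report


# Made-up results: thread 20 is linked in ThreadListHead but absent elsewhere.
views = {
    "PsLookupThreadByThreadId sweep": {8, 12, 16},
    "ThreadListHead": {8, 12, 16, 20},
    "PspCidTable": {8, 12, 16},
}

for thread_id, missing in cross_view_diff(views).items():
    print(f"thread {thread_id} not seen by: {', '.join(missing)}")
```

In this example only thread 20 is reported, as missing from the ID sweep and from the PspCidTable view.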
