BSOD and restart when locking process pages.

Greetings,

I’m registering an image-load notification routine via PsSetLoadImageNotifyRoutine(), and a process creation/destruction notification routine via PsSetCreateProcessNotifyRoutineEx().

I track specific processes. When a module is loaded into one of them and the image-load routine notifies me, I parse the PE and lock its code regions with MmProbeAndLockPages(), after switching to the virtual address space of the target process via:

KeStackAttachProcess()

MmProbeAndLockPages()

KeUnstackDetachProcess()
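
For readers following along, here is a minimal sketch of what that attach/lock/detach sequence might look like. It is not the poster's actual code: the helper name LockRegionInProcess and its parameters are hypothetical, IoReadAccess is an assumed choice for code pages, and it assumes the caller runs at IRQL <= APC_LEVEL and already holds a valid PEPROCESS.

```c
#include <ntifs.h>

/* Hypothetical helper (not from the original post): locks Length bytes
 * starting at UserVa inside Process and returns the locked MDL in *OutMdl. */
NTSTATUS LockRegionInProcess(PEPROCESS Process, PVOID UserVa,
                             ULONG Length, PMDL *OutMdl)
{
    KAPC_STATE apcState;
    NTSTATUS status = STATUS_SUCCESS;
    PMDL mdl;

    *OutMdl = NULL;

    /* Describe the user-mode range with an MDL; no IRP is associated. */
    mdl = IoAllocateMdl(UserVa, Length, FALSE, FALSE, NULL);
    if (mdl == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    /* Switch to the target's address space so UserVa resolves there. */
    KeStackAttachProcess(Process, &apcState);
    __try {
        /* Raises an exception on failure, so it must be guarded. */
        MmProbeAndLockPages(mdl, UserMode, IoReadAccess);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        status = GetExceptionCode();
    }
    KeUnstackDetachProcess(&apcState);

    if (!NT_SUCCESS(status)) {
        IoFreeMdl(mdl);   /* nothing was locked, just free the MDL */
        return status;
    }

    *OutMdl = mdl;        /* caller keeps this to unlock later */
    return STATUS_SUCCESS;
}
```

The returned MDL is what would later be passed to MmUnlockPages() and IoFreeMdl().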

When the process dies and I get notified by the routine registered via PsSetCreateProcessNotifyRoutineEx(), I switch to its virtual address space (just to be sure I’m running in the right VMA) and unlock every region that I locked.

This works perfectly on Windows 7, but on Windows 10, when one of the processes dies, I get a BSOD without any message and then an instant reboot…

I’m not really sure what the problem is here. When I remove the page locking, everything works. Are you guys familiar with the reason behind this? Am I doing something I shouldn’t do? I’m pretty sure all the code I described runs at either PASSIVE_LEVEL or APC_LEVEL, so it’s fine to lock/unlock pages.

Thanks.

>Am i doing something i shouldn’t do

Not to be a wise guy, but… “Yes. If the system is crashing, you’re doing something you shouldn’t do.” Sorry…

Did you run your code on the latest version of Windows with Driver Verifier enabled?

How about under the Checked Build of Windows?

Those are my “go to” next steps when I get strangeness such as what you’re reporting.

Peter
OSR
@OSRDrivers

I’m not running on the latest version of Windows 10. I cannot update to the latest one because of the product’s requirements.

I didn’t try Driver Verifier, and I didn’t check the Windows build. What does the Windows build have to do with this? Does this sound like a kernel bug to you?

>I didn’t try the driver verifier

That’s the first thing for you to try.

>i didn’t check the Windows build

Sorry… I mean “use the Windows CHECKED build” – which is the DEBUG build of the Windows OS. See this explanation:

https:

>Does this sound like a kernel bug to you?

No.

Peter
OSR
@OSRDrivers

Well, I don’t think you even need Verifier here - the reason for the crash seems (at least to me) to be plainly obvious, and this is very obviously YOUR bug, rather than that of the NT kernel…

When a process dies, its address space becomes invalid. When you “change to its virtual address space(just to be sure i’m running in the right VMA)”, you attach to an address space that is already invalid, which means entries on the page pointed to by CR3 may already be pointing to the middle of nowhere.

It does not necessarily mean this is going to be the case - everything depends on when the target page gets repurposed, so in some (in fact, in most) cases your code may indeed run without crashing. In other words, this is a perfect example of a random bug that does not necessarily reveal itself on every run of the code.

However, when it does, you are going to crash abruptly, rather than via “polite bugchecking” - IIRC, an invalid control register happens to be the kind of bug that resets the CPU right on the spot, without even giving it a chance to throw an exception.

Now look at the MmUnlockPages() function:

https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/wdm/nf-wdm-mmunlockpages.

What it does is unlock the pages (after unmapping them, if they are mapped into the system address space). What makes you believe that you have to be in the context of the target process in order to call this function? After all, don’t forget that a locked MDL may well outlive the target process and/or the userland buffer that it describes.

In other words, just drop this stupid attaching to the address space upon unlocking pages, and everything should work just fine…
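
As a rough illustration of that suggestion (a sketch only, assuming the MDL was allocated with IoAllocateMdl() and locked earlier with MmProbeAndLockPages(); the helper name UnlockRegion is hypothetical), the cleanup path would contain no KeStackAttachProcess() call at all:

```c
#include <ntifs.h>

/* Hypothetical helper: releases a region locked earlier. It operates on
 * the MDL only and does not attach to the target process. */
VOID UnlockRegion(PMDL Mdl)
{
    MmUnlockPages(Mdl);   /* unlock the physical pages the MDL describes */
    IoFreeMdl(Mdl);       /* free the MDL itself */
}
```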

Anton Bassov