Prepping self-modifying code with the Arm64 kernel

I need to execute some self-modifying code in the ARM64 kernel. This is just a test; it won't go into production.

I allocated memory, placed my test ARM64 assembly instructions there and marked that memory as PAGE_EXECUTE_READ.

I then also needed to flush the instruction cache. AFAIK there is no documented way of doing it, so I had to use the undocumented NtFlushInstructionCache function:

typedef NTSTATUS NTFLUSHINSTRUCTIONCACHE(HANDLE ProcessHandle, PVOID BaseAddress, SIZE_T Length);

Unfortunately, NtFlushInstructionCache is not exported from the NT kernel either, so I had to use ZwFlushInstructionCache, which comes with a small overhead but works. It returned 0 (STATUS_SUCCESS) in my experiment.

To execute my self-modifying code I wrote a small asm function, which is basically:

br      x0

That in C would be declared as:

extern void branch_to_code(PVOID CodeAddr);

But after I branch to my code, I get the following exception when I try to execute the first instruction there:
(I confirmed it by trying to step through it with a debugger.)

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
Arg1: ffffffffc0000005
Arg2: fffffd0c6531e8c0
Arg3: fffff8023b59ba90
Arg4: 0000000000000000

Note that Arg2 is supposed to be the address of the instruction that caused this exception.

When I try to see what's at fffffd0c6531e8c0:


db fffffd0c6531e8c0
fffffd0c6531e8c0  05 00 00 c0 00 00 00 00-00 00 00 00 00 00 00 00

It looks like the address of the exception record. I can confirm that with .exr fffffd0c6531e8c0, and .cxr fffff8023b59ba90 gives me the context with pc being ffffbdf600000000.

I can also confirm it with k: the stack shows ffffbdf600000000, the address of the beginning of my code, which is where the exception took place.

So it didn't even execute a single instruction there before bugchecking with status c0000005, which is "The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.".

I then tried to unassemble that memory:

u ffffbdf600000000 L1
ffffbdf600000000  d10023ff   sub     sp, sp, #8

It's a valid instruction and the memory is readable & executable. If we check sp, it is:

r sp
sp=ffffa2818d921cd0

But the value of sp shouldn't really matter for that instruction.
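To double-check the disassembly by hand, here is a minimal sketch in plain C (user mode, nothing kernel-specific) that pulls apart the word d10023ff using the A64 "SUB (immediate), 64-bit" field layout. The helper name decode_sub_imm64 and the struct are made up for illustration, and the sketch only recognizes the unshifted (sh = 0) form.

```c
#include <stdint.h>

/* Hypothetical helper type: fields of an A64 "SUB Xd|SP, Xn|SP, #imm12". */
typedef struct {
    int      is_sub_imm64; /* 1 if the word is a 64-bit SUB (immediate), sh = 0 */
    unsigned rd;           /* bits 4..0, register 31 means SP in this encoding */
    unsigned rn;           /* bits 9..5, register 31 means SP in this encoding */
    unsigned imm12;        /* bits 21..10, unsigned immediate */
} a64_sub_imm;

static a64_sub_imm decode_sub_imm64(uint32_t insn)
{
    a64_sub_imm d;
    /* sf=1, op=1 (SUB), S=0, fixed bits 100010, sh=0 -> top ten bits of 0xD1000000 */
    d.is_sub_imm64 = (insn & 0xFFC00000u) == 0xD1000000u;
    d.imm12 = (insn >> 10) & 0xFFFu;
    d.rn    = (insn >> 5) & 0x1Fu;
    d.rd    = insn & 0x1Fu;
    return d;
}
```

Feeding it 0xd10023ff gives rd = 31, rn = 31 and imm12 = 8, which agrees with the debugger's "sub sp, sp, #8".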

What is also interesting is the stack during the crash:

...
nt!KiSecureException+0x9a
nt!KiSyntheticException+0x6a
nt!HvlArm64SyntheticExceptionVectors+0x5a
ffffbdf600000000
my_driver!branch_to_code
...

So it seems like the hypervisor raised this exception. But why?

Or could it be that I'm not flushing the i-cache or d-cache correctly? Maybe someone can correct me.

Well, it's not bugcheck C0000005, it's bugcheck 0x7E. The C0000005 is the NTSTATUS value.

NTSTATUS 0xC0000005 is perhaps the most common one, STATUS_ACCESS_VIOLATION. Bugcheck 0x7E (SYSTEM_THREAD_EXCEPTION_NOT_HANDLED) indicates that a system thread generated an exception for which there was no handler. The underlying cause can be many things, not just an invalid address.
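To make the distinction between the bugcheck code and the status concrete, here is a small sketch in plain C that takes the 64-bit Arg1 from the dump (ffffffffc0000005, which is the 32-bit NTSTATUS sign-extended) and splits it into the standard NTSTATUS fields. The function and struct names are invented for illustration; only the bit layout comes from the public NTSTATUS definition.

```c
#include <stdint.h>

/* Hypothetical helper type: the fields of a 32-bit NTSTATUS value. */
typedef struct {
    unsigned severity; /* bits 31..30: 0 success, 1 informational, 2 warning, 3 error */
    unsigned customer; /* bit 29: set for customer-defined codes */
    unsigned facility; /* bits 27..16 */
    unsigned code;     /* bits 15..0 */
} ntstatus_fields;

static ntstatus_fields decode_bugcheck_arg1(uint64_t arg1)
{
    /* Arg1 is shown sign-extended to 64 bits; the status is the low 32 bits. */
    uint32_t status = (uint32_t)arg1;
    ntstatus_fields f;
    f.severity = (status >> 30) & 0x3u;
    f.customer = (status >> 29) & 0x1u;
    f.facility = (status >> 16) & 0xFFFu;
    f.code     = status & 0xFFFFu;
    return f;
}
```

For ffffffffc0000005 this yields severity 3 (error), facility 0 and code 5, i.e. 0xC0000005, while 7e remains the bugcheck code itself.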

Have you checked the page attributes in the debugger, to make sure the "execute" bit actually got set?

@Tim_Roberts yes, that memory is readable and executable. As I see it from the call stack (the last one in my original post), the hypervisor caught that exception, so it's something coming from EL2 at least. I'm not sure what else I need to do there, though.

For AMD64 I would say that this is Memory Integrity in action. But I don't know about ARM64.

@SweetLow you know, that driver also builds for x64, and I tried it on my Win10 VM and it worked without any issues. (Although for arm64 I'm trying it on the latest Win11 build.) So I'm wondering if Memory Integrity is something that they added in Win11?

Was the memory allocated with NonPagedPoolExecute?

It was added in Windows 10, where it is usually off by default, whereas in Windows 11 it is on by default. Either way, it can be toggled in the user interface on average hardware.
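If you would rather check this from code than from the UI, one option is to look at the CodeIntegrityOptions bitmask that NtQuerySystemInformation returns for SystemCodeIntegrityInformation. The sketch below only does the bit test, in plain C, so it can be tried anywhere. The constant values are the ones I recall from the SDK headers (CODEINTEGRITY_OPTION_HVCI_KMCI_ENABLED and its audit-mode sibling), redefined locally under made-up names; treat them as assumptions and verify against your own headers.

```c
#include <stdint.h>

/* Assumed values, mirroring CODEINTEGRITY_OPTION_HVCI_KMCI_* from the SDK. */
#define CI_OPT_HVCI_KMCI_ENABLED   0x400u
#define CI_OPT_HVCI_KMCI_AUDITMODE 0x800u

/* Returns 1 if the options word says HVCI kernel-mode code integrity is on. */
static int hvci_kmci_enabled(uint32_t code_integrity_options)
{
    return (code_integrity_options & CI_OPT_HVCI_KMCI_ENABLED) != 0;
}

/* Returns 1 if HVCI is on but only auditing (logging) rather than blocking. */
static int hvci_kmci_audit_only(uint32_t code_integrity_options)
{
    return hvci_kmci_enabled(code_integrity_options) &&
           (code_integrity_options & CI_OPT_HVCI_KMCI_AUDITMODE) != 0;
}
```

A driver could run this test before attempting to execute dynamically generated code and bail out early instead of bugchecking.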

@umann I didn't use an ExAllocatePool* function to allocate the memory. My assumption was that if I allocated with the POOL_FLAG_NON_PAGED_EXECUTE flag, the memory would be marked as read-write-execute, and I couldn't find a way to change that protection to read-execute.

So I went with allocating memory as such:

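PMDL pMdl = MmAllocatePagesForMdlEx(low, high, skip,
                                    memorySizeNeeded, MmCached,
                                    MM_DONT_ZERO_ALLOCATION |
                                    MM_ALLOCATE_FULLY_REQUIRED);
if (pMdl == NULL)
    return STATUS_INSUFFICIENT_RESOURCES;

PVOID pMem = MmMapLockedPagesSpecifyCache(pMdl, KernelMode,
                                          MmCached, NULL, FALSE,
                                          HighPagePriority);
if (pMem == NULL)
{
    MmFreePagesFromMdl(pMdl);
    ExFreePool(pMdl);
    return STATUS_INSUFFICIENT_RESOURCES;
}

// Copy the machine code into that memory while it is still writable
memcpy(pMem, &asm, sizeof(asm));

// Switch the mapping from read-write to read-execute
NTSTATUS status = MmProtectMdlSystemAddress(pMdl, PAGE_EXECUTE_READ);

// Flush caches (function pointer obtained via MmGetSystemRoutineAddress)
if (NT_SUCCESS(status))
    status = ZwFlushInstructionCache(NtCurrentProcess(), pMem, sizeof(asm));

// Then execute it
if (NT_SUCCESS(status))
    branch_to_code(pMem);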

Would you guys do it differently?

@SweetLow I just tested it, and yes it is "Memory Integrity". It works if I disable it. Thanks for pointing it out.

The question now is how to get my allocated memory green-lighted by it.

AFAIK it is impossible for mere mortals. And even some products from MS are restricted in this area.