MmProbeAndLockPages() reporting STATUS_ACCESS_VIOLATION

Hey All,
So I’ve been scratching my head for days trying to figure out this problem with no success. I’m writing a KMDF PCIe driver. I communicate with it through a bunch of custom DeviceIoControl() calls (since my device has quite a unique purpose and interface) and it uses buffered I/O. One of the custom I/O control codes is passed in a C struct (using buffered I/O) that includes a member variable that is a pointer to a user mode buffer that I’d like to make accessible and map into kernel space. This mapping needs to persist for longer than just the duration of this DeviceIoControl() call so I can’t use WdfRequestProbeAndLockUserBufferForRead(). I’m trying to call MmProbeAndLockPages() to map this memory and it consistently fails with STATUS_ACCESS_VIOLATION. Here is the relevant code:

mappedMemory->_mdl = IoAllocateMdl(parameters->_virtualAddress, (ULONG)parameters->_size, FALSE, FALSE, NULL);
if(mappedMemory->_mdl == NULL)
{
    PDEBUGERROR("IoAllocateMdl() failed in mapMemory()!");
    return(STATUS_NONE_MAPPED);
}

// MmProbeAndLockPages() raises an exception on failure instead of returning a status.
__try
{
    MmProbeAndLockPages(mappedMemory->_mdl, UserMode, (parameters->_usage == opFtxMappedMemoryFromDeviceUsage) ? IoWriteAccess : IoReadAccess);
}
__except(EXCEPTION_EXECUTE_HANDLER)
{
    // Print the exception code in hex so it can be matched against ntstatus.h.
    PDEBUGERROR("MmProbeAndLockPages() failed in mapMemory() with exception code 0x%08X!", (unsigned int)GetExceptionCode());
    IoFreeMdl(mappedMemory->_mdl);
    mappedMemory->_mdl = NULL;
    return(STATUS_NONE_MAPPED);
}

This code is called by an EVT_WDF_IO_IN_CALLER_CONTEXT callback so I should be in the correct process’s context. I’m not 100% sure how to check (if anybody knows how, please let me know), but I can’t see why I wouldn’t be the top of the driver stack and have a filter driver above me ruining my process context guarantee. I have checked that parameters->_virtualAddress and parameters->_size are correct and match what I have passed in from user space, so the input buffer’s buffered I/O mechanism is working. The code fails regardless of whether I use it with IoWriteAccess or IoReadAccess. I have tried both KernelMode and UserMode as the parameter to MmProbeAndLockPages(). On the user space side, I have tried this with memory allocated both via VirtualAlloc(NULL, size, MEM_COMMIT, PAGE_READWRITE) and from an array local variable directly on the call stack. I have tried with and without locking the memory from the user space side via VirtualLock(). Nothing seems to make a difference. Does anybody know what could potentially be the problem? What else should I check? Is there perhaps another approach to map user space memory into kernel space for extended periods of time, outside of the context of a single request/IRP?

Thanks in advance,
Omri

omri wrote:

> I’m writing a KMDF PCIe driver. I communicate with it through a bunch of custom DeviceIoControl() calls (since my device has quite a unique purpose and interface) and it uses buffered I/O. One of the custom I/O control codes is passed in a C struct (using buffered I/O) that includes a member variable that is a pointer to a user mode buffer that I’d like to make accessible and map into kernel space.

That’s a design flaw.  The correct design would be to have one
METHOD_xx_DIRECT ioctl where you pass the buffer, then keep that buffer
pending in a manual queue for the life of the need.  That way, the I/O
manager takes care of the mapping and validation. You can follow up with
a buffered ioctl to trigger whatever I/O operation needs to use the
buffer.  When you’re done, you complete the request.
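
As a rough illustration of the two-ioctl split described above, the control codes might be declared along these lines. The `IOCTL_FTX_*` names and the function numbers are hypothetical (nothing in this thread defines them); the `CTL_CODE` layout and `METHOD_*` values mirror the standard Windows headers and are repeated only so the sketch compiles on its own. With a direct method, the first (input) buffer is still copied as in buffered I/O, while the second buffer is probed, locked, and described by an MDL by the I/O manager.

```c
#include <assert.h>
#include <stdint.h>

/* Normally supplied by the Windows SDK/WDK headers; repeated here only so
   this sketch is self-contained. */
#ifndef CTL_CODE
#define FILE_DEVICE_UNKNOWN 0x00000022
#define METHOD_BUFFERED     0
#define METHOD_IN_DIRECT    1
#define METHOD_OUT_DIRECT   2
#define METHOD_NEITHER      3
#define FILE_ANY_ACCESS     0
#define CTL_CODE(DeviceType, Function, Method, Access) \
    (((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method))
#endif

/* Hypothetical: the long-lived request that carries the user buffer.
   METHOD_OUT_DIRECT asks the I/O manager to probe the buffer for write
   access, i.e. the device may write into it. */
#define IOCTL_FTX_MAP_BUFFER \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)

/* Hypothetical: a short buffered request that triggers work on the
   already-pended buffer. */
#define IOCTL_FTX_START_TRANSFER \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x801, METHOD_BUFFERED, FILE_ANY_ACCESS)

/* The transfer method occupies the two low bits of a control code. */
uint32_t ioctl_method(uint32_t code)
{
    return code & 3u;
}

/* Sanity check that each code really uses the intended transfer method. */
void check_ioctl_methods(void)
{
    assert(ioctl_method(IOCTL_FTX_MAP_BUFFER) == METHOD_OUT_DIRECT);
    assert(ioctl_method(IOCTL_FTX_START_TRANSFER) == METHOD_BUFFERED);
}
```

If the device only reads from the buffer, METHOD_IN_DIRECT would be the matching choice, since the I/O manager then probes for read access instead.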

> I’m trying to call MmProbeAndLockPages() to map this memory and it consistently fails with STATUS_ACCESS_VIOLATION. Here is the relevant code:

Unless you’re doing this in an InProcessContext callback, this is
probably failing because you’re no longer in that process’s context, so
the address space belongs to another process.

> One of the custom I/O control codes is passed in a C struct (using buffered I/O) that includes a member variable that is a pointer to a user mode buffer that I’d like to make accessible and map into kernel space.

As you likely know, Windows ioctls can pass two buffers. So, make the 1st buffer contain your “C struct” and the 2nd buffer be the other buffer.
The system will automatically verify and map it for you.
Let the driver manage the pointer, instead of passing it in the C struct.
Pend this ioctl and do not complete until done with the buffer (for example, the app can cancel it when done).
Bingo, done.

– pa

Hi Tim,
Thanks, unfortunately this is actually a port of a working Linux driver where the cross platform user space C++ code has the map(), doSomethingWithTheMappedMemory(), unmap() structure that will be quite a pain to refactor.
You mentioned InProcessContext callback. What is that? As I understood the documentation that’s what EVT_WDF_IO_IN_CALLER_CONTEXT is (providing you are at the top of the driver stack and don’t have a filter driver above you).
One alternative approach I’ve considered (which won’t require major refactoring to existing cross platform code) is to allocate paged memory in kernel space and map it into user space as opposed to vice versa. That way when the user space asks to map it into the kernel and allow the kernel to do something with it, all the kernel has to do is lock the pages. What I’m unsure about is that there could potentially be up to a gigabyte of these buffers around at any given time and I don’t know how much memory I can allocate on the kernel side.

> I can allocate paged memory in kernel space and map it into user space as opposed to vice versa.

That’s close to what I did recently in a similar situation (Linux driver that used mmap, etc). But I allocated non-paged memory using MmAllocatePagesForMdl, mapped it into user virtual address space with MmMapLockedPagesSpecifyCache, and returned the pointer to the user app. Just don’t forget to do the unmap before the user app is allowed to exit.

Peter

FYI, I just tried calling ProbeForRead() with the exact same buffer address and size right before I call MmProbeAndLockPages() and it succeeds just fine, yet MmProbeAndLockPages() still fails with STATUS_ACCESS_VIOLATION. What could that possibly mean?

Either the virtual address or the length must be wrong. You’ve validated that they’re correct, and match what is passed in user mode?

Peter

On May 23, 2019, at 3:53 PM, omri wrote:
>
> Thanks, unfortunately this is actually a port of a working Linux driver where the cross platform user space C++ code has the map(), doSomethingWithTheMappedMemory(), unmap() structure that will be quite a pain to refactor.

map() and unmap() are not cross-platform and never have been. However, it is certainly possible for you to write a map() wrapper that does exactly what I described.

> You mentioned InProcessContext callback. What is that? As I understood the documentation that’s what EVT_WDF_IO_IN_CALLER_CONTEXT is (providing you are at the top of the driver stack and don’t have a filter driver above you).

Yes. Is that what you’re using?

> …all the kernel has to do is lock the pages. What I’m unsure about is that there could potentially be up to a gigabyte of these buffers around at any given time and I don’t know how much memory I can allocate in the kernel side.

Paged pool is plenty large.

Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

> I can allocate paged memory in kernel space and map it into user space as opposed to vice versa.

This approach is generally frowned upon in the Windows world because of the extra complications that may arise. For example, consider what happens if your app terminates abnormally…

Anton Bassov

Peter_Viscarola_(OSR), yes I have verified that both the user space virtual address of the buffer and the size (passed in as members of a struct passed in via the lpInBuffer parameter) are exactly the same once they reach my kernel side driver as they were when I passed them in via DeviceIoControl().

Tim_Roberts, sorry, when I said map() and unmap(), I didn’t mean platform specific functions. I meant that is how the existing logic flow of the user space cross platform C++ code for my project interacts with the kernel side driver. First, you pass in a user space buffer address and size to the kernel as either an input buffer or output buffer for future operations (the map() step), and the kernel returns an ID to reference this buffer with from now on. Then there are multiple operations you can do with it, passing the ID to reference which buffer of data you are using as input or output. Once you are done, you pass the buffer ID to an IO control code telling it to unmap the buffer from kernel space and that the buffer ID is no longer valid and may later be reused to reference another buffer (the unmap() step).
Tim_Roberts, yes also I am calling MmProbeAndLockPages() from an EVT_WDF_IO_IN_CALLER_CONTEXT callback. Still very odd why it doesn’t succeed especially when ProbeForRead() works fine.
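
The map()/unmap() ID bookkeeping described above could be modeled roughly as follows. This is a plain C sketch of the logic only, not driver code: the type and function names, the fixed table size, and the `context` field (a stand-in for whatever the driver keeps per mapping, such as an MDL or a pended request) are all assumptions made for illustration. It also does no locking; a real driver would have to guard the table, since map and unmap can arrive from different threads.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MAPPED_BUFFERS 64      /* hypothetical limit */
#define INVALID_BUFFER_ID  (-1)

typedef struct {
    int    in_use;    /* nonzero while the slot holds a live mapping */
    void  *context;   /* stand-in for the per-mapping driver state */
    size_t size;      /* size of the mapped buffer in bytes */
} MappedBufferSlot;

typedef struct {
    MappedBufferSlot slots[MAX_MAPPED_BUFFERS];
} MappedBufferTable;

void buffer_table_init(MappedBufferTable *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
}

/* map(): claim the lowest free slot and hand its index back as the ID. */
int buffer_table_map(MappedBufferTable *t, void *context, size_t size)
{
    for (int id = 0; id < MAX_MAPPED_BUFFERS; ++id) {
        if (!t->slots[id].in_use) {
            t->slots[id].in_use = 1;
            t->slots[id].context = context;
            t->slots[id].size = size;
            return id;
        }
    }
    return INVALID_BUFFER_ID;      /* table full */
}

/* Resolve an ID passed in by a later operation; NULL if it is not live. */
MappedBufferSlot *buffer_table_lookup(MappedBufferTable *t, int id)
{
    if (id < 0 || id >= MAX_MAPPED_BUFFERS || !t->slots[id].in_use)
        return NULL;
    return &t->slots[id];
}

/* unmap(): release the slot so the ID can be reused. Returns 0 on
   success, -1 if the ID was not a live mapping. */
int buffer_table_unmap(MappedBufferTable *t, int id)
{
    MappedBufferSlot *slot = buffer_table_lookup(t, id);
    if (slot == NULL)
        return -1;
    memset(slot, 0, sizeof(*slot));
    return 0;
}
```

Because IDs are reused, every operation that takes an ID has to go through the lookup and handle a NULL result, otherwise a stale ID could silently refer to someone else's buffer.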

> Still very odd why it doesn’t succeed especially when ProbeForRead() works fine.

There is someone’s bug there: either yours or Microsoft’s. Guess which.
Mr. Roberts’ advice lets you skip all these mappings - the system will do everything. It will even tell you if the user process dies unexpectedly (by canceling the ioctl). And you don’t need to pass any pointer to the buffer in the first struct.
My advice was the same, just slightly rephrased.

– pa

It should go without saying that Mr. Roberts’ and Mr. Pavel_A’s advice represent the primary recommended design pattern for what you want to do. In the same driver that I pinned the buffers, as described earlier, I also used the design pattern described by Mr. Pavel_A.

Sure you can’t change your IOCTL implementation?

Peter

> … I’ve been scratching my head for days trying to figure out this problem with no success.

> I’m trying to call MmProbeAndLockPages() to map this memory and it consistently fails with STATUS_ACCESS_VIOLATION.

You seem to assume that the userland-provided parameters are correct, so that you don’t even question their validity. However, there is a good chance that this assumption is just way too bold. If you take the approach that Mr. Roberts suggests (I think our hosts call it “a big honking IRP” and promote it as a new “good coding practice” in their classes), this problem will be gone.

Anton Bassov

Yup. The “big Honkin’ Hangin’ Request” design pattern. We even teach it in our Advanced WDF seminar. Peter

Hey guys,
So I implemented this longstanding request pattern to keep a buffer mapped around in memory. It works for the most part.
At the end of my EVT_WDF_IO_QUEUE_IO_DEVICE_CONTROL callback, I call WdfRequestMarkCancelableEx(). On the user space application side, when I no longer need the mapped buffer, I call CancelIoEx() to cancel the request followed by WaitForSingleObject() on the event associated with the overlapped call to DeviceIoControl() to confirm cancellation of the original IOCTL. I may have multiple buffers mapped at once and they may be mapped (overlapped DeviceIoControl() call) and unmapped (CancelIoEx()/WaitForSingleObject()) from different threads.
The problem I’m having is that in these situations with a high amount of multithreading, the framework will sometimes, seemingly at random, cancel my longstanding request and call its EVT_WDF_REQUEST_CANCEL callback without the user space application issuing a CancelIoEx() call. This is obviously a problem as a buffer I need access to on the kernel side has been randomly unmapped. Does anybody know what would cause that? Is there some limit to the amount of concurrent outstanding IRPs? Is there some problem with calling the same IOCTL concurrently from multiple threads? (FYI, I have not explicitly done anything to enable automatic synchronization on my driver, device or I/O queue)
I tried as an experiment simply not calling WdfRequestMarkCancelableEx() at the end of my EVT_WDF_IO_QUEUE_IO_DEVICE_CONTROL and not having a cancel callback and just letting the WDFREQUEST linger. I made a separate synchronous IOCTL to unmap the memory which simply calls WdfRequestComplete() on the original outstanding “map memory” request. This fixed the issue, but caused another problem. The problem with this approach is the framework has no way of cancelling outstanding requests. If let’s say I kill the user space application using a debugger with an outstanding map memory request, the app cannot properly close and neither can the file handle via the EVT_WDF_FILE_CLOSE callback because of an unkillable outstanding IRP.
This 2nd experimental approach is obviously flawed in its design, but the first approach using WdfRequestMarkCancelableEx() isn’t terribly useful if the framework can randomly cancel my requests. Any help would be appreciated. Thanks!!!
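
If a cancelable request and a separate "unmap" IOCTL were ever combined, the two paths would race to complete the same pended request, and it must be completed exactly once. The sketch below is a portable C11 model of that arbitration using an atomic flag; it is not KMDF code, and every name in it is hypothetical. In a real KMDF driver, as far as I understand it, WdfRequestUnmarkCancelable() plays this role: when it returns STATUS_CANCELLED, the cancel callback has run or is about to, and the other path must leave completion to it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for NTSTATUS values; hypothetical, for this model only. */
enum { MODEL_STATUS_NONE = -1, MODEL_STATUS_SUCCESS = 0, MODEL_STATUS_CANCELLED = 1 };

typedef struct {
    atomic_bool claimed;      /* set by whichever path wins the race */
    int         completions;  /* how many times the request was completed */
    int         status;       /* status the request was completed with */
} PendingMapRequest;

void pending_init(PendingMapRequest *r)
{
    atomic_init(&r->claimed, false);
    r->completions = 0;
    r->status = MODEL_STATUS_NONE;
}

/* Returns true for the first caller only; every later caller gets false. */
static bool pending_claim(PendingMapRequest *r)
{
    return !atomic_exchange(&r->claimed, true);
}

static void pending_complete(PendingMapRequest *r, int status)
{
    assert(r->completions == 0);   /* completing twice would be a bug */
    r->completions++;
    r->status = status;
}

/* Path 1: the application asked to unmap the buffer. */
bool on_unmap_ioctl(PendingMapRequest *r)
{
    if (!pending_claim(r))
        return false;              /* cancel path already owns the request */
    pending_complete(r, MODEL_STATUS_SUCCESS);
    return true;
}

/* Path 2: the request was canceled (CancelIoEx, process exit, ...). */
bool on_cancel(PendingMapRequest *r)
{
    if (!pending_claim(r))
        return false;              /* unmap path already completed it */
    pending_complete(r, MODEL_STATUS_CANCELLED);
    return true;
}
```

Whichever path loses the claim simply returns without touching the request, so the buffer is released once regardless of the order in which the two events arrive.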

There are no miracles… if your requests get canceled, maybe the user process crashes? Bugs also happen sometimes.
Removing cancel support entirely is of course not acceptable.
Maybe you’ll have to debug KMDF to see where it calls EVT_WDF_REQUEST_CANCEL.

– pa