Can a kernel-mode driver hold onto a user-mode buffer after application is closed?

Matthew_Langille · December 13, 2012, 8:59pm

Is there any way for a kernel-mode driver to hold onto a user-mode buffer beyond the life of the user-mode application that created the buffer?

We are having issues with a USB device in the following case:

1. User-mode application creates a user-mode buffer and passes it to the driver as part of a read request
2. Kernel mode driver communicates with the device
3. Device starts transmitting data
4. User-mode application is closed before the data transfer completes

This always puts the device into a weird hanging state where it is no longer able to communicate on that endpoint. I have a third party test app and driver that can be installed on our device, and this same sequence of events does not put the device into a weird hanging state.

I have gotten a USBlyzer trace and a USB bus trace for both our app+driver and the third party app+driver, and I can see that no matter when I force-close the third party app, all mid-transfer requests always complete; this differs from the case of our application, where all the requests get cancelled as soon as the application is force-closed. It looks like the user mode buffers are just disappearing exactly as the app closes, and the device gets stuck mid-transfer.

Is there a way to force the driver to hold onto that user-mode buffer?

I can’t get a handle on what the third party application could possibly be doing that makes this work for them.

Any help appreciated. Thanks all!

–M

Don_Burn · December 13, 2012, 9:19pm

It sounds like you have a bug in your handling of cancel. If your app
is in mid-transfer you should not allow the user mode request that
handled the buffer to be canceled but instead let the transfer complete
and complete the request.

Don Burn
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

Doron_Holan · December 13, 2012, 10:29pm

The process cannot be torn down until all in-flight I/O completes, so the user-mode buffer is valid for the lifetime of the IRP if it is managed by the I/O manager. If you are using a type 3 buffer (METHOD_NEITHER) or an embedded pointer, however, that statement does not apply and you must manage the lifetime yourself. Do your app and the test app open the handle the same way with respect to overlapped I/O?

d

—
NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

OSR_Community_User · December 14, 2012, 12:45am

> Is there any way for a kernel-mode driver to hold onto a user-mode buffer
> beyond the life of the user-mode application that created the buffer?
>
> We are having issues with a USB device in the following case:
>
> 1. User-mode application creates a user-mode buffer and passes it to the
>    driver as part of a read request
> 2. Kernel mode driver communicates with the device
> 3. Device starts transmitting data
> 4. User-mode application is closed before the data transfer completes

This is not possible. If the IRP is active, the CloseHandle operation
will be blocked, which means the process does not terminate and therefore
the buffer is still in existence (note that it doesn’t matter if this is a
CloseHandle from the app or the implicit CloseHandle performed after the
process returns to the kernel). Therefore, the application simply cannot
close while the IRP is active.

Furthermore, your diagnosis is erroneous. If the device hangs, the most
common cause is that incoming requests are being queued, but not dequeued,
which means you have left the driver in an inconsistent state. When
the app closes, the existing IRP in process is usually left alone, and
pending IRPs in the queue are canceled. So the CloseHandle is delayed
until all IRPs are cleared, or a very long time, on the order of five
minutes, has elapsed. At that point, if you have not handled cancellation
correctly, your queues can be jammed up. It has nothing to do with the
existence of the user buffer or the process.

You would need to say more about the nature of the driver, queue
management, and other factors of how you manage user buffers.

> This always puts the device into a weird hanging state where it is no
> longer able to communicate on that endpoint. I have a third party test app
> and driver that can be installed on our device, and this same sequence of
> events does not put the device into a weird hanging state.
>
> I have gotten a USBlyzer trace and a USB bus trace for both our app+driver
> and the third party app+driver, and I can see that no matter when I
> force-close the third party app, all mid-transfer requests always
> complete; this differs from the case of our application, where all the
> requests get cancelled as soon as the application is force-closed. It
> looks like the user mode buffers are just disappearing exactly as the app
> closes, and the device gets stuck mid-transfer.

Note that if you have a bug in cancellation, you can end up in weird
states. One of the most common causes I’ve seen is the use of
asynchronous I/O. This stresses the driver state in a quite different
fashion than synchronous I/O. So if your test app uses synchronous I/O
and your actual app uses async I/O then you have uncovered a bug, most
likely in queue management. One case I found was when the I/O required
multiple USB transactions, and when the operation was canceled, the
Irp->Cancel flag was tested before initiating the next USB transfer, and
the IRP was completed as “cancelled”, but there was a race condition with
queue loading, so the app got an entry in the queue and thought the device
was busy, so didn’t dequeue it. So the device hung. It had absolutely
nothing to do with the existence of the user buffer. What you want to do
is see what happens to the next IRP that comes in to the device. My
suspicion, from your description, is you will see it go into a queue but
never come out. And when you locate the bug, it will be in the queue
management.

> Is there a way to force the driver to hold onto that user-mode buffer?

I see no reason this would arise, so this would not solve your problem.
Stop looking this direction.

> I can’t get a handle on what the third party application could possibly be
> doing that makes this work for them.

Without any knowledge of the apps, my first guess is going to be sync vs.
async I/O. Next guess is your app is uncovering a bug that the test app
doesn’t tickle.

anton_bassov · December 14, 2012, 5:45am

> Is there any way for a kernel-mode driver to hold onto a user-mode buffer beyond the life of
> the user-mode application that created the buffer?

Yes. All you have to do is create an MDL describing the target buffer, lock it in memory, and map it into the kernel address space. If you do it this way, the buffer (as it is known to the driver) will remain valid until the driver unmaps it, regardless of the state of the app. This is the only correct way of accessing UM buffers from kernel drivers (which, BTW, is not always a good idea in itself). Certainly, you can do it differently, but only if you want to write a driver that “sometimes works”.

However, as Joe has pointed out already, your diagnosis is erroneous - if the problem was somehow related to buffer invalidation you would get a bugcheck, rather than just a weird behavior of a driver. Furthermore, as long as IO is in progress, a client app has no chance to terminate, in the first place. You are more than likely to have a bug in IO cancellation, which has absolutely nothing to do with buffer invalidation…

Anton Bassov

Matthew_Langille · December 17, 2012, 2:13pm

Hello all,

Thank you for the pointers and advice. I see that there is a consensus that the cancel logic has a problem.

I’ll have to do some more reading about WDF queues.
What we do is quite simple. Basically we have an EVT_WDF_IO_QUEUE_IO_DEVICE_CONTROL OsrFxEvtIoDeviceControl routine that we register as a callback to receive IOCTL requests for a queue created as a WdfIoQueueDispatchParallel-type queue. Our app is passing in the user buffer with a METHOD_OUT_DIRECT IOCTL. A completion routine is registered and the request is sent asynchronously (WDF_NO_SEND_OPTIONS). That’s the whole process.

Are there any obvious places where hiccups could be happening? I thought this was fairly straightforward.

Doron:
I’m not actually sure what the third-party app does exactly. We certainly do use overlapped structures in our test app, and as you have surmised, the structure we are passing down has some data and an embedded pointer to the actual buffer to be filled. This is the buffer we are calling WdfRequestProbeAndLockUserBufferForWrite on.

I’ll look into documentation on managing the lifetime of the buffer, but I really have no idea right now. Can you elaborate on what you mean?

Thanks all

Doron_Holan · December 17, 2012, 2:42pm

Why are you embedding a pointer? By far the simpler choice is not to do so. Given your current design, WdfRequestProbeAndLockUserBufferForWrite is the appropriate API to call. It locks the buffer for the lifetime of the request; KMDF unlocks the buffer upon request completion.

In what callback are you calling WdfRequestProbeAndLockUserBufferForWrite? If you are calling it in OsrFxEvtIoDeviceControl, that is the incorrect callback, since there is no guarantee you are in the context of the calling process. You must call this API in an EvtIoInCallerContext callback. The correct design is to lock the buffer in EvtIoInCallerContext and then send the request back to KMDF for normal dispatching; the remainder of the processing happens in OsrFxEvtIoDeviceControl.

d

Matthew_Langille · December 18, 2012, 1:48pm

Hello,

I have reworked the code, and I still see the same behavior. Now, I do the following:

-Register an EVT_WDF_IO_IN_CALLER_CONTEXT callback in my AddDevice call

That callback does the following:

-Calls WdfRequestGetParameters and checks (params).Parameters.DeviceIoControl.IoControlCode
-If it is not the IO operation causing us issues, I pass the request to the framework with WdfDeviceEnqueueRequest and return from the callback
-If it is the problematic operation, do the following:
  -Call WdfRequestRetrieveOutputBuffer to get the output buffer and cast it to the structure with the embedded pointer
  -After checking that the pointer is valid, call WdfObjectAllocateContext to allocate a request context and then call WdfRequestProbeAndLockUserBufferForWrite with the embedded data pointer. The WDFMEMORY object that gets associated is part of the newly allocated request context structure
  -Call WdfDeviceEnqueueRequest to pass the request back to the framework

Obviously OsrFxEvtIoDeviceControl has been reworked so that it now gets the WDFMEMORY object out of the request context instead of calling WdfRequestProbeAndLockUserBufferForWrite itself.

The process above does complete the IO in normal conditions, as before, but it fails in the same cases (sudden application closure mid-transfer).

Does this process sound correct? Please let me know if I’m missing something.
Are developers generally able to use embedded pointers like this, or is this a problematic approach in general?

Peter_Viscarola_OSR · December 18, 2012, 7:01pm

It’s not unheard of, but it’s also considered to be a generally weak design. There are two buffers provided for IOCTL requests for a reason, right? It’s so you can pass control data AND payload data, and avoid (in most cases) the problem you’ve got now.

When you DO pass buffer references, it is FAR preferable to pass “offset to start of buffer” in the overall payload buffer than to pass “User Virtual Address of start of buffer” in an arbitrary location within the user address space.

Passing embedded pointers in IOCTL buffers is a common cause of security problems, bluescreens, and general problems. Like I said, it’s typically considered to be the sign of an overall weak design.

Peter
OSR
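To make the "offset to start of buffer" suggestion above concrete, here is a minimal user-mode C sketch of how a receiver might validate such a reference before touching the payload. The header layout and the names `xfer_header` and `xfer_payload` are hypothetical and are not taken from the driver discussed in this thread; a real driver would apply the same checks to the buffer and length it gets from the framework.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header at the start of the IOCTL buffer. The payload is
 * located by an offset from the buffer start, never by a user-mode
 * virtual address. */
typedef struct {
    uint32_t payload_offset;  /* bytes from start of buffer to payload */
    uint32_t payload_length;  /* bytes of payload requested by the app */
    uint32_t status;          /* filled in on completion */
    uint64_t timestamp;       /* filled in on completion */
} xfer_header;

/* Returns a pointer to the payload if the described region lies entirely
 * inside the buffer and does not overlap the header; NULL otherwise. */
uint8_t *xfer_payload(void *buffer, size_t buffer_length)
{
    xfer_header h;

    if (buffer == NULL || buffer_length < sizeof h)
        return NULL;

    /* Work on a private copy so the values cannot change between the
     * check and the use. */
    memcpy(&h, buffer, sizeof h);

    if (h.payload_offset < sizeof h)
        return NULL;                      /* would overlap the header */
    if (h.payload_offset > buffer_length)
        return NULL;                      /* starts past the end */
    if (h.payload_length > buffer_length - h.payload_offset)
        return NULL;                      /* runs past the end */

    return (uint8_t *)buffer + h.payload_offset;
}
```

Because every access is bounded by a length the caller already supplied, the region that gets locked down is the same region that gets validated, which is the property an embedded pointer does not give you.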

OSR_Community_User · December 18, 2012, 11:49pm

You seem to be fixated on two nonexistent problems. First, unless you
have compelling reasons (which you have not stated) you should use direct
mode I/O, which means your buffers will be locked down, so you don’t need
to do locking yourself. Second, you think the reason it fails if the process
terminates in the middle of a transfer is that the user buffer is being
freed. This suggests that you have a strange design, because when the
process terminates, outstanding IRPs will be canceled, and if you handle
that correctly, you will not have a problem. So, first and foremost, you
need to explain why you are using “mode neither” (note that there are many
legitimate reasons for doing this, but I think you need to explain to us
why you chose this mode, to make sure you have some set of legitimate
reasons) and second, you need to understand why a canceled IRP is leaving
a now-invalid address active. You can’t fix a bug if you identified the
wrong cause, and I suspect you have an error in the cancel logic. So
forget this idea of holding onto the user buffer until you have determined
this is the source of the problem. Otherwise you are going to spend a lot
of time building a convoluted and complex non-solution to your problem.
joe

Matthew_Langille · December 21, 2012, 1:12pm

Hello,

I’ll clear up a few of the comments and describe my current progress:

Firstly, please feel free to collectively roll your eyes if this is a really bad design, but I am not and was never using METHOD_NEITHER; I set up the IOCTL as METHOD_OUT_DIRECT and was calling the probe-and-lock function on the extracted embedded pointer. It was never my intent to use METHOD_NEITHER. I thought that we would still need to probe-and-lock the buffer pointed to by the embedded pointer since the output buffer for the IOCTL is a separate thing from this buffer and is not actually where the payload will be filled in.

The reason for this design is that the data structure we pass down with the embedded pointer is filled with members that describe the data both before and after the data transfer. The application specifies the payload buffer size in one field prior to sending the IOCTL. The driver fills the payload buffer with data the device observes, and then fills in other data like IO completion status, a timestamp, etc.

This design was in place prior to my involvement with the development of this driver; however, we are now trying to transition to a new protocol and decided to implement a similar design. I am certainly open to different design approaches and am willing to do whatever will make this work.

Joseph, thank you for your points. I just wanted to mention that I am not fixated on either of the two non-existent problems mentioned; my observation initially was that perhaps the cause of the problem was because of the user buffer being freed mid-transfer, but everyone chimed in and said that this should not happen, so I no longer believe this to be the case and did not push this theory. I keep mentioning the problem happening “when application is terminated mid-transfer” simply because it is the only reproduction case I currently have for this problem, and is exactly the case of what I am trying to fix. As for the reason for the design, it is described above.

So, I have changed the code around in the following way:

1. The application now passes in two buffers with the IOCTL, input and output. The input buffer contains a struct with fields describing the data size, etc., and the output buffer is the payload buffer as is; no embedded pointer.
2. My EvtIoInCallerContext routine intercepts requests and passes all requests on with WdfDeviceEnqueueRequest except for the problematic IOCTL.
3. For this IOCTL, the EvtIoInCallerContext routine calls WdfRequestRetrieveInputBuffer, checks the specified buffer size, and then calls WdfRequestRetrieveOutputBuffer with that buffer size (maybe I don’t need to do this in EvtIoInCallerContext since I’m no longer using ProbeAndLock?).
4. A request context is allocated and the buffer size is filled in.
5. WdfMemoryCreatePreallocated is called with the pointer to the previously retrieved output buffer. The WDFMEMORY object that gets created is a member of the request context.
6. WdfDeviceEnqueueRequest is called and the request is requeued.
7. The EvtIoDeviceControl routine receives the request, retrieves the request context, and calls WdfUsbTargetPipeFormatRequestForRead with the RequestContext->memoryObject passed as the third parameter. A completion routine is set and the request is sent.

Unfortunately, this design reproduces the exact same behavior as the other: everything works fine and I can’t make the device hang in most cases, but it does hang when I kill the application mid-transfer. Again, all in-flight requests do appear to ‘clean up’ immediately.

Unfortunately, I’m still not sure what the bus sees. We’re looking at getting a bus analyzer trace, but our USB3 analyzer seems to be malfunctioning right now. I’m looking at taking this step next.

Alex_Grig · December 21, 2012, 1:38pm

How do you start a USB operation on the buffer? Do you create a separate IRP, or use the original IRP?
If you create a separate IRP to perform the USB transfer, do you cancel it when your cancel routine gets called?
If you have to cancel the secondary IRP, do you complete your original IRP only when the secondary IRP has actually completed?

Matthew_Langille · December 21, 2012, 2:15pm

Hi Alex,

I continually re-use the same WDFREQUEST without creating another. I guess that handles all three questions.

OSR_Community_User · December 21, 2012, 2:44pm

> Hello,
>
> I’ll clear up a few of the comments and describe my current progress:
>
> Firstly, please feel free to collectively roll your eyes if this is a
> really bad design, but I am not and was never using METHOD_NEITHER; I set
> up the IOCTL as METHOD_OUT_DIRECT and was calling the probe-and-lock
> function on the extracted embedded pointer. It was never my intent to use
> METHOD_NEITHER. I thought that we would still need to probe-and-lock the
> buffer pointed to by the embedded pointer since the output buffer for the
> IOCTL is a separate thing from this buffer and is not actually where the
> payload will be filled in.

Yes, if you supplied a pointer to user address space in the IOCTL call, you
do have to lock the buffer down. For all practical purposes, this is
isomorphic to METHOD_NEITHER in all the possible bad ways. [drum roll,
no, eye roll] Why are you providing a pointer to a buffer in the payload
instead of just providing the buffer in the call itself?

> The reason for this design is that the data structure we pass down with
> the embedded pointer is filled with members that describe the data both
> before and after the data transfer. The application specifies the payload
> buffer size in one field prior to sending the IOCTL. The driver fills the
> payload buffer with data the device observes, and then fills in other data
> like IO completion status, a timestamp, etc.

And this differs from using the METHOD_OUT_DIRECT buffer how?

> This design was in place prior to my involvement with the development of
> this driver, however we are now trying to transition to a new protocol and
> decided to implement a similar design. I am certainly open to different
> design approaches and am willing to do whatever will make this work.

The design would be considered bizarre. There’s no good reason I see that
makes this design advantageous (there are reasons that “neither” mode can
make sense, but your explanation does not mention any of them).

So, unless there are factors you have not mentioned, my reaction [eye
roll] is that it is a good opportunity to redo the design.

> Joseph, thank you for your points. I just wanted to mention that I am not
> fixated on either of the two non-existent problems mentioned; my
> observation initially was that perhaps the cause of the problem was
> because of the user buffer being freed mid-transfer, but everyone chimed
> in and said that this should not happen, so I no longer believe this to be
> the case and did not push this theory. I keep mentioning the problem
> happening “when application is terminated mid-transfer” simply because it
> is the only reproduction case I currently have for this problem, and is
> exactly the case of what I am trying to fix. As for the reason for the
> design, it is described above.

I am quite willing to believe that there are problems caused by a
mid-transfer abort. The most common case is that, for an IRP that is “in
flight” (currently in communication with the device), it is possible,
for example, for a DMA transfer to continue, scribbling over random
storage, or for a programmed transfer to use a now-stale pointer. One way to
deal with this is to not allow an active IRP to be completed until the
transfer is finished. For a device with a bounded response time, this is
the best approach. If the device has a potentially unbounded response time,
recovery is more complex. The problem is that when you complete an IRP, a
process that is being shut down will shut down, and the buffers will
disappear. But if you have parts of the driver which don’t recognize this
(e.g. the ISR or DPC) they will continue to use now-stale pointers. So,
given your scenario, the first thing I’d look for is the consequences of
an overeager cancel; for example, the active IRP still has a cancel
routine set and the cancel routine doesn’t check to see if the IRP
is actually active.
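The "don't complete an active IRP until the transfer is finished" rule can be illustrated with a small state model in plain C. This is only a sketch under stated assumptions: the `request` type and the `req_*` functions are hypothetical names, there is no locking, and nothing here is a real WDM or KMDF API. It only shows the decision a cancel routine has to make.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { REQ_QUEUED, REQ_ACTIVE, REQ_COMPLETED } req_state;
typedef enum { ST_NONE, ST_SUCCESS, ST_CANCELLED } req_status;

/* Hypothetical stand-in for an IRP or WDFREQUEST. */
typedef struct {
    req_state  state;
    req_status status;
    bool       cancel_requested;
} request;

void req_init(request *r)
{
    r->state = REQ_QUEUED;
    r->status = ST_NONE;
    r->cancel_requested = false;
}

/* The hardware has started moving data into the request's buffer. */
void req_start(request *r)
{
    assert(r->state == REQ_QUEUED);
    r->state = REQ_ACTIVE;
}

/* Cancel routine. Returns true if the request was completed right away.
 * A request that is still queued can be completed immediately; an active
 * one is only marked, because the device may still write to its buffer. */
bool req_cancel(request *r)
{
    if (r->state == REQ_QUEUED) {
        r->status = ST_CANCELLED;
        r->state = REQ_COMPLETED;
        return true;
    }
    if (r->state == REQ_ACTIVE)
        r->cancel_requested = true;
    return false;
}

/* Called when the device reports that the transfer finished or aborted.
 * Only now is it safe to complete the request and release its buffer. */
void req_transfer_done(request *r)
{
    assert(r->state == REQ_ACTIVE);
    r->status = r->cancel_requested ? ST_CANCELLED : ST_SUCCESS;
    r->state = REQ_COMPLETED;
}
```

In a real driver the state check and the cancel flag would have to be protected against races with the completion path, which is exactly where the bugs described above tend to hide.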

> So, I have changed the code around in the following way:
>
> The application now passes in two buffers with the IOCTL, input and
> output. The input buffer contains a struct with fields describing the data
> size, etc and the output buffer is the payload buffer as is; no embedded
> pointer.

Yes, that sounds better

> My EvtIoInCallerContext routine intercepts requests and passes all
> requests on with WdfDeviceEnqueueRequest except for the problematic IOCTL
>
> For this IOCTL, the EvtIoInCallerContext routine calls
> WdfRequestRetrieveInputBuffer, checks the specified buffer size, and then
> calls WdfRequestRetrieveOutputBuffer with that buffer size (Maybe I don’t
> need to do this in EvtIoInCallerContext since I’m no longer using
> ProbeAndLock?)

I suspect that you are correct that this is not necessary, but I’m not a
KMDF expert.

> A request context is allocated and the buffer size is filled in.
>
> WdfMemoryCreatePreallocated is called with the pointer to the
> previously retrieved output buffer. The WDFMEMORY object that gets created
> is a member of the request context.
>
> WdfDeviceEnqueueRequest is called and the request is requeued.
>
> The EvtIoDeviceControl routine receives the request, retrieves the
> request context, calls WdfUsbTargetPipeFormatRequestForRead with the
> RequestContext->memoryObject passed as the third parameter. A completion
> routine is set and the request is sent.
>
> Unfortunately, this design reproduces the exact same behavior as the
> other: everything works fine and I can’t make the device hang in most
> cases, but it does hang when I kill the application mid-transfer. Again,
> all in-flight requests do appear to ‘clean up’ immediately.

Can you explain “hang”? If the manifestation is that future requests
arrive at the driver and get enqueued, but never dequeued, it could be
that the cancellation leaves the queues in a state where they think they
are busy. In a WDM driver, this would suggest that you called
IoCompleteRequest but failed to call the “dequeue next request” routine
(which, if it finds the pending queue is empty, resets the “device
currently processing an IRP” state). The fact that you see the same
problem after the change in buffer management suggests that the buffer
management was not the problem. So look for queue management problems.
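The "queue thinks it is busy" failure described above is easy to reproduce in a toy model. The following C sketch assumes a single-outstanding-request queue with a busy flag; `dev_queue` and the `q_*` functions are hypothetical names for illustration, not WDM routines. The buggy cancel path finishes the current request without starting the next one, so the busy flag is never cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_DEPTH 8

/* Hypothetical device queue: one request on the hardware at a time,
 * the rest wait in a small ring buffer. Requests are plain integer IDs. */
typedef struct {
    int    pending[QUEUE_DEPTH];
    size_t head;
    size_t count;
    bool   busy;      /* "device currently processing a request" */
    int    current;   /* ID on the hardware, or -1 */
} dev_queue;

void q_init(dev_queue *q)
{
    q->head = 0;
    q->count = 0;
    q->busy = false;
    q->current = -1;
}

/* Dequeue the next request, or clear the busy state if none is waiting. */
void q_start_next(dev_queue *q)
{
    if (q->count == 0) {
        q->busy = false;
        q->current = -1;
        return;
    }
    q->current = q->pending[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    q->busy = true;
}

/* Start the request immediately if idle, otherwise enqueue it. */
bool q_submit(dev_queue *q, int id)
{
    if (!q->busy) {
        q->current = id;
        q->busy = true;
        return true;
    }
    if (q->count == QUEUE_DEPTH)
        return false;
    q->pending[(q->head + q->count) % QUEUE_DEPTH] = id;
    q->count++;
    return true;
}

/* Correct completion path: always hands the device to the next request. */
void q_complete_current(dev_queue *q)
{
    assert(q->busy);
    q_start_next(q);
}

/* Buggy cancel path: completes the current request but never calls
 * q_start_next, so busy stays true and later requests wait forever. */
void q_cancel_current_buggy(dev_queue *q)
{
    assert(q->busy);
    q->current = -1;
}
```

After `q_cancel_current_buggy`, a new `q_submit` lands in `pending` and nothing ever pulls it out, which matches the "goes into a queue but never comes out" symptom; logging at submit, start-next, and completion is usually enough to spot it.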

> Unfortunately, I’m still not sure what the bus sees. We’re looking at
> getting a bus analyzer trace, but our USB3 analyzer seems to be
> malfunctioning right now. I’m looking at taking this step next.

If the hang is that the device is not responding, then the problem is not
in the queue management, but in the device itself. If you see a request
being sent to the device, but the device is not responding, it means that
the device is expecting some other request and refuses to respond.
Consider if the protocol is a sequence of packets ABC, and you abort the
transfer while B is running, then the device sees ABABC and is in a weird
state because it saw an A after B, instead of the expected C. This is
almost entirely guesswork on my part, but it is consistent with things I
have seen happen to other devices on pre-Windows systems.
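The ABC example can be written down as a tiny state machine. This is a purely hypothetical model of a device that insists on the packet order A, B, C; the names `dev_state`, `dev_step`, and `dev_run` are invented for the sketch, and whether the real device in this thread behaves this way is unknown.

```c
#include <assert.h>

/* Hypothetical device protocol: packets must arrive in the order A, B, C.
 * Any out-of-order packet wedges the device until it is reset. */
typedef enum { EXPECT_A, EXPECT_B, EXPECT_C, WEDGED } dev_state;

/* Advance the device by one received packet. */
dev_state dev_step(dev_state s, char packet)
{
    switch (s) {
    case EXPECT_A: return packet == 'A' ? EXPECT_B : WEDGED;
    case EXPECT_B: return packet == 'B' ? EXPECT_C : WEDGED;
    case EXPECT_C: return packet == 'C' ? EXPECT_A : WEDGED;
    default:       return WEDGED;  /* stays wedged */
    }
}

/* Feed a whole packet sequence to a freshly reset device. */
dev_state dev_run(const char *packets)
{
    dev_state s = EXPECT_A;
    for (; *packets != '\0'; ++packets)
        s = dev_step(s, *packets);
    return s;
}
```

Here `dev_run("ABC")` ends back in `EXPECT_A`, while `dev_run("ABABC")`, the sequence produced by aborting during B and starting over, ends in `WEDGED`. If the device works like this, the host side has to drive it back to a known state after an abort rather than simply issuing the next transfer.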

You should be able to infer this without a bus analyzer, just by adding
some debug printouts at enqueue, dequeue, and similar events.
joe

OSR_Community_User · December 21, 2012, 4:35pm

I’m working on a very similar situation with an old driver that I need to maintain.

The application passes pointers to buffers using IOCTLs, the driver locks the MDL, completes the IRP, and writes to the buffers at an arbitrary time later.

It works reasonably well as long as the application is running, but killing the application process results in a blue-screen which I’m trying to prevent:

DRIVER_LEFT_LOCKED_PAGES_IN_PROCESS (cb)
Caused by a driver not cleaning up completely after an I/O.
When possible, the guilty driver’s name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: 9797ee71, The calling address in the driver that locked the pages or if the
IO manager locked the pages this points to the dispatch routine of
the top driver on the stack to which the IRP was sent.
Arg2: 9797f35d, The caller of the calling address in the driver that locked the
pages. If the IO manager locked the pages this points to the device
object of the top driver on the stack to which the IRP was sent.
Arg3: 88bf6dc0, A pointer to the MDL containing the locked pages.
Arg4: 00000001, The number of locked pages.

STACK_TEXT:
98ec3bcc 8288c25f 000000cb 9797ee71 9797f35d nt!KeBugCheckEx+0x1e
98ec3bf0 82889267 88a92978 8a5c1d28 00000000 nt!MmDeleteProcessAddressSpace+0x50
98ec3c24 8283a6f4 8a5c1d40 8a5c1d40 8a5c1d28 nt!PspProcessDelete+0x15d
98ec3c3c 82681f60 00000000 15fc4903 99d14000 nt!ObpRemoveObjectRoutine+0x59
98ec3c50 82681ed0 8a5c1d40 826d805e 00000002 nt!ObfDereferenceObjectWithTag+0x88
98ec3c58 826d805e 00000002 98ec3cc0 82894688 nt!ObfDereferenceObject+0xd
98ec3c64 82894688 00000000 c000009a 8274d7c0 nt!MmFreeAccessPfnBuffer+0x27
98ec3cc0 8289433d 00000000 8a3205a0 00000000 nt!PfpFlushBuffers+0x2ba
98ec3d50 8282766d 8274d7c0 baca607d 00000000 nt!PfTLoggingWorker+0xaa
98ec3d90 826d90d9 8289428d 8274d7c0 00000000 nt!PspSystemThreadStartup+0x9e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x19

Instead of rewriting both the driver and application, I’m considering the following approach:

1. Add another IOCTL, which the application sends exactly once, using overlapped IO, when it initializes
2. The driver marks the IRP pending and defines a cancellation routine
3. Whenever the application dies, the cancellation routine executes, and cleans up / unlocks all the locked pages.
4. The application dies peacefully, without taking the server down.

I know it is far from perfect, but a major rewrite is not an option right now.

Does this look like a reasonable solution? Can I improve upon it in any way?

Doron_Holan · December 21, 2012, 4:48pm

You don’t need a separate pended IRP. Either

A) don’t complete the original IRP which had the embedded pointers until you are done with them

Or

B) clean up the locked buffers when you see an IRP_MJ_CLEANUP on the file handle.
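
As a rough illustration of the bookkeeping option B implies, here is a user-mode model in plain C. It is a sketch, not driver code: `LockTable`, `track_locked`, `release_for_owner` and `locked_count` are hypothetical names, the integer `owner` stands in for the file object a buffer arrived on, and the `void *` stands in for the locked MDL. In a real driver the release step is where the pages would be unlocked and the MDL freed (for example with MmUnlockPages and IoFreeMdl), under whatever lock also protects the code that writes into the buffers later.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_LOCKED 16

/* One locked user buffer, remembered together with the handle that owns it. */
typedef struct {
    bool  in_use;
    int   owner;   /* stands in for the file object the IOCTL arrived on */
    void *buffer;  /* stands in for the locked MDL */
} LockedBuffer;

typedef struct {
    LockedBuffer slots[MAX_LOCKED];
} LockTable;

void table_init(LockTable *t)
{
    for (size_t i = 0; i < MAX_LOCKED; i++) {
        t->slots[i].in_use = false;
        t->slots[i].owner = 0;
        t->slots[i].buffer = NULL;
    }
}

/* Record a buffer at the moment the driver locks it.
   Returns false if the table is full. */
bool track_locked(LockTable *t, int owner, void *buffer)
{
    for (size_t i = 0; i < MAX_LOCKED; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].in_use = true;
            t->slots[i].owner = owner;
            t->slots[i].buffer = buffer;
            return true;
        }
    }
    return false;
}

/* Cleanup path for one handle: release every buffer that handle owns.
   Returns the number of buffers released. */
size_t release_for_owner(LockTable *t, int owner)
{
    size_t released = 0;
    for (size_t i = 0; i < MAX_LOCKED; i++) {
        if (t->slots[i].in_use && t->slots[i].owner == owner) {
            /* A real driver would unlock the pages and free the MDL here. */
            t->slots[i].in_use = false;
            t->slots[i].buffer = NULL;
            released++;
        }
    }
    return released;
}

/* Number of buffers still locked across all handles. */
size_t locked_count(const LockTable *t)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_LOCKED; i++)
        if (t->slots[i].in_use)
            n++;
    return n;
}
```

The point of keying the table by owner is that cleanup for one handle releases only that handle's buffers and leaves other clients untouched.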

d

OSR_Community_User · December 21, 2012, 5:23pm

> I’m working on a very similar situation with an old driver that I need to
> maintain.
>
> The application passes pointers to buffers using IOCTLs, the driver locks
> the mdl, completes the IRP, and writes to the buffers at an arbitrary time
> later.

A singularly bad design. You can get the same effect using asynchronous
I/O. This then leads to the question of how the app knows what is
happening to the buffers. I believe that designs like this demonstrate
that the driver writer has no concept of how I/O APIs work, because a
solution like this is so obviously bad it could be created only if the
author wanted to build what appears to be an asynchronous solution using
synchronous I/O. Which is completely unnecessary.

This is one of those examples where I believe there is nothing wrong with
the design that could not be solved by a competent redesign.

> It works reasonably well as long as the application is running, but
> killing the application process results in a blue-screen which I’m trying
> to prevent:

When you get the IRP_MJ_CLEANUP you must free the buffers. It is
unfortunate that you have to go through such bizarre effort to simulate
what is an already-built-in and fully-supported interface.

> Instead of rewriting both the driver and application, I’m considering the
> following approach:
>
> 1. Add another IOCTL, which the application sends exactly once, using
> overlapped IO, when it initializes
>
> 2. The driver marks the IRP pending and defines a cancellation routine
>
> 3. Whenever the application dies, the cancellation routine executes, and
> cleans up / unlocks all the locked pages.
>
> 4. The application dies peacefully, without taking the server down.
>
> I know it is far from perfect, but a major rewrite is not an option right
> now.
>
> Does this look like a reasonable solution, can I improve upon it in any
> way?

Note that you must do all the I/O asynchronously, which is best done by
following each I/O request with a blocking GetOverlappedResult call (what
I call “pseudo-synchronous I/O”). It sounds as if your proposal would
work, although I remain concerned about what mechanism is used to notify
the app that data is ready or has been consumed.

Sometimes, rescuing a bad design by a horrendous kludge has short-term
appeal, but is a serious long-term mistake. I made this mistake several
memorable times in my career, and then had to deal with the consequences
for years.
joe

OSR_Community_User · December 21, 2012, 7:03pm

I plan to rewrite this, with an appropriate design, but I need to immediately prevent the servers from blue-screening.

This approach is expedient, by design, and I’m not going to dispute the crudeness of the existing design.

Thank you for your input. I think too that it’ll work, with relatively little effort.
