Intercepting interrupt reads/writes with a USB filter driver

Alex_Beavis · November 20, 2010, 3:18am

I lost a huge post I was typing up due to one of the very crashes I was going to ask about, so I will try to keep this shorter unless anyone has time to help and needs more information.

I am working on a USB kernel filter driver extending a closed-source device driver, using the WDF USB API to make my life easier. This has been especially tricky for me since I had to intercept the select configuration URB, configure the device myself, and so forth. I also intercept select interface commands but just pass back my current cached interface information to the top-level driver since the select configuration WDF command does not seem to fill in the URB after it is called, and this hacky workaround works since the device is just using alternate setting 0 for all devices.

At any rate, I am able to send and receive data via control requests, doing vendor-specific commands, but I am stuck on what is hopefully one of my final problems, that of handling interrupt endpoints. I have several endpoints on multiple interfaces, and I need to be able to send and receive data to them.

I have tried different things – WdfUsbTargetPipeWriteSynchronously, messing around with a continuous reader (which has its own issues I could ask about), but I was not able to send or receive data with any of them, as far as I could tell. Finally, I tried a plain WdfUsbTargetPipeReadSynchronously call, following the little MSDN example. This actually crashes my Windows Vista 64-bit system WITHOUT a blue screen. Some black lines/garbage flashes for a second, and then the system reboots, with no minidump or anything.

Does anyone have any idea what could be causing this? Even better, does anyone have any suggestions how I might implement filter driver functionality that lets me send and receive packets quickly and reliably to various endpoints, endpoints that the top-level driver may ordinarily read from (this is the key one, I think) and write to (not as common)? I would like to have a feature where if a certain driver mode is enabled via IOCTL, my driver will instantly start making the endpoint packet data available to userspace without the top-level driver even being aware of anything. The top-level driver has pending interrupt read requests, so I am not sure if there is a way to pre-empt those without completing or cancelling them. For starters, I need to figure out any way to successfully read the data, however.

Thanks very much for your time!

Alex_Beavis · November 21, 2010, 5:54am

Update:

I finally found one way to send read requests that does not crash the system (with or without a bluescreen…and a bluescreen that claims to write memory but does not generate a .DMP file is an annoying thing). I have used WdfUsbTargetPipeFormatRequestForUrb, and filled in a URB with UrbBulkOrInterruptTransfer data. I have partially implemented an optional pipe interception mechanism by cancelling top-level driver requests and saving those request handles in a frozen state. These are completed by my D0 exit function if the device is unplugged while they are uncompleted.

Any responses to my original post’s question(s) are welcome, but I have these additional questions:

When I submit my newly created request with the URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER URB in it, I have verified that the next IRP stack location’s Parameters.Others.Argument1 is set to the URB, as I would expect. When the completion routine for this URB gets called, however, the next stack location is the same address as before, but the Argument1 field is 0. What could be clearing it? When I forward a request from the top-level driver, the Argument1 field apparently keeps pointing to the URB. My workaround for now is to include the memory associated with the request as a buffer in my device context, and I have verified that the newly read data shows up there, but I feel as though I must be missing something and should be able to have the Argument1 data come through so that I can just use the URB_FROM_IRP macro.
If I cancel a pending interrupt read request from the top-level driver (originally received in my internal IOCTL handler), but then refrain from completing the request because I do not want the top-level driver to know it was cancelled, is there a safe way to reuse it? I am looking at the WdfRequestReuse method, but the documentation indicates I must reinitialize the request if I try to reuse it. What reinitialization would I have to do, and would it be possible in my case?

Any response is greatly appreciated!

Alex_Beavis · November 22, 2010, 11:38am

As is often the case, I have found some workarounds instead of direct solutions, though I do not know if they are safe.

For the URB completion, I still do not know why Argument 1 ends up zeroed, but examining the completion parameters showed me that my request came through as a PipeUrb (or some similar constant), meaning I was able to follow the structure down, get a WDFMEMORY object, and extract the URB from there. It is not as easy as URB_FROM_IRP and I do not understand all the details, but it seems to work.

For “reusing” an existing top-level driver request, my current method is to cancel the request, catch it in the completion routine, save the pointer instead of completing it, and then if I want to let the top-level driver read some data (or start continuously reading data again), I just fill in the URB’s receive buffer and complete that saved request with status SUCCESS. I have not yet fully tested to make sure the top-level driver gets the information appropriately, but it seems to work for me.

I am currently encountering a problem where my system hard-resets (no blue screen) after I reuse each of my own two custom pending read requests once. I can thus read a total of four messages before this happens. Is a hard-reset with no blue screen a symptom of a NULL pointer dereference in a kernel driver? I have fixed at least a couple of problems like that, but I am unsure why I can apparently reuse the requests once, but then have a crash the next time around. Reusing a request should not corrupt any existing URB memory, right?

Finally, if a driver hangs instead of unloading when I try to upgrade it via device manager, is a possible cause that there is an outstanding reference to a request/WDFMEMORY so it cannot be deleted? I am not even getting to my D0 exit handler sometimes, even though my completion routine is instantly reusing the appropriate read requests (but not sending them), so I would have thought that my references to the requests and/or memory would be released. My requests are children of my I/O target, so I don’t know if that might affect how everything gets deleted. I am manually freeing memory, but depending on the I/O target to delete my requests. Not sure if that’s a valid approach or not.

Thanks as always for reading…and maybe my experiences will help others.

Tim_Roberts · November 22, 2010, 1:39pm

xxxxx@gmail.com wrote:

I lost a huge post I was typing up due to one of the very crashes I was going to ask about, so I will try to keep this shorter unless anyone has time to help and needs more information.

That’s the danger of doing driver testing on your main development
machine…

…
At any rate, I am able to send and receive data via control requests, doing vendor-specific commands, but I am stuck on what is hopefully one of my final problems, that of handling interrupt endpoints. I have several endpoints on multiple interfaces, and I need to be able to send and receive data to them.

I have tried different things – WdfUsbTargetPipeWriteSynchronously, messing around with a continuous reader (which has its own issues I could ask about), but I was not able to send or receive data with any of them, as far as I could tell. Finally, I tried a plain WdfUsbTargetPipeReadSynchronously call, following the little MSDN example. This actually crashes my Windows Vista 64-bit system WITHOUT a blue screen. Some black lines/garbage flashes for a second, and then the system reboots, with no minidump or anything.

Does anyone have any idea what could be causing this? Even better, does anyone have any suggestions how I might implement filter driver functionality that lets me send and receive packets quickly and reliably to various endpoints, endpoints that the top-level driver may ordinarily read from (this is the key one, I think) and write to (not as common)? I would like to have a feature where if a certain driver mode is enabled via IOCTL, my driver will instantly start making the endpoint packet data available to userspace without the top-level driver even being aware of anything. The top-level driver has pending interrupt read requests, so I am not sure if there is a way to pre-empt those without completing or cancelling them. For starters, I need to figure out any way to successfully read the data, however.

You can’t mix your interrupt requests with the interrupt requests from
the driver above you. An interrupt request remains outstanding until
the request is completely satisfied. If you need to submit your own
interrupt pipe requests, then you will need to intercept and block the
interrupt requests from the driver above you and decide how to fake the
responses.

What kind of device is this? Usually, with an interrupt IN pipe, you
submit a couple of requests and keep them outstanding forever and ever.
If you do that, it would be easy for you to handle requests from above
by just copying data from the last response you got.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Alex_Beavis · November 22, 2010, 1:52pm

Thanks, Tim!

You can’t mix your interrupt requests with the interrupt requests from
the driver above you. An interrupt request remains outstanding until
the request is completely satisfied. If you need to submit your own
interrupt pipe requests, then you will need to intercept and block the
interrupt requests from the driver above you and decide how to fake the
responses.

That is what I eventually did. When I switch over to the “interception” mode, I cancel the most recent two interrupt requests from the driver above me, and submit my own requests. When I need the driver above me to know anything, I fill in the URB data buffer.

What kind of device is this? Usually, with an interrupt IN pipe, you
submit a couple of requests and keep them outstanding forever and ever.
If you do that, it would be easy for you to handle requests from above
by just copying data from the last response you got.

It is a game controller, and I basically need to keep the official driver for full functionality. I am adding an optional mode where some of the control buttons would do special-purpose actions instead of being reported normally to the driver, and that is why I was only intercepting data part of the time. I suppose you are right and I could do ALL of the reads for the endpoint I am interested in, and simply fill in the top-level URB data buffer and length, and complete the request without ever sending it to the lower-level stack. As long as that doesn’t cause problems, that could be feasible, and simpler than dealing with the cancelled requests I am doing now.
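For readers following along, here is a minimal user-space sketch of the "do all the reads yourself and fill in the upper request" idea. It is a model only, not driver code: `report_cache`, `cache_store`, `serve_from_cache`, and the 32-byte `REPORT_MAX` are hypothetical names and values chosen for illustration. The point it shows is that the filter copies at most what the upper buffer can hold and reports that count as the transfer length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REPORT_MAX 32   /* assumed upper bound on a report, illustration only */

/* Most recent data the filter read from the interrupt IN endpoint. */
typedef struct {
    unsigned char data[REPORT_MAX];
    size_t length;
} report_cache;

/* Remember the latest packet delivered by the filter's own read. */
static void cache_store(report_cache *c, const unsigned char *pkt, size_t n)
{
    assert(n <= REPORT_MAX);
    memcpy(c->data, pkt, n);
    c->length = n;
}

/* Fill the upper driver's buffer from the cache and return the number of
 * bytes written, which becomes the transfer length reported upward. */
static size_t serve_from_cache(const report_cache *c,
                               unsigned char *upper_buf, size_t upper_cap)
{
    size_t n = c->length < upper_cap ? c->length : upper_cap;
    memcpy(upper_buf, c->data, n);
    return n;
}
```

In a real filter the copy target would be the transfer buffer described by the upper driver's URB, and the returned count would go into that URB's length field before the parked request is completed.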

As an ultimate configurable option, I might eventually look into passing the control data (which consists of lots of small packets) up into user-space so a user program or driver could manipulate it before it goes into the top-level controller driver. I am afraid that might introduce latency or throughput problems, however, since I need everything to be as smooth and fast as possible.

Regardless of the solution, I need to figure out why my reused requests are crashing without a blue screen after a couple of packets. I will report back if I get more information.

Doron_Holan · November 22, 2010, 1:57pm

Stack locations are zeroed as the irp is completed up the stack, so in the completion routine, you will not see anything in the next stack location at all. You need to save the data in the context parameter or in your device extension.
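A small user-space model of this behaviour may make it easier to picture. Nothing here is real kernel code: `fake_irp`, `send_irp`, `complete_irp`, and `on_complete` are hypothetical stand-ins for an IRP, the send path, the completion path, and a completion routine. The only assumption is the one stated above, that the lower stack location is cleared before the completion routine runs, so the URB pointer has to travel in the context argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the stack location of the driver below. */
typedef struct {
    void *argument1;            /* models Parameters.Others.Argument1 */
} fake_stack_location;

typedef struct fake_irp fake_irp;
typedef void (*completion_fn)(fake_irp *irp, void *context);

struct fake_irp {
    fake_stack_location next;   /* location owned by the driver below */
    completion_fn completion;
    void *context;              /* survives completion untouched */
};

typedef struct {
    unsigned char data[8];
    size_t length;
} fake_urb;

/* What the filter records when its completion routine runs. */
typedef struct {
    fake_urb *urb;              /* pointer saved before sending */
    void *seen_in_stack;        /* what the stack location held at completion */
} completion_record;

static void on_complete(fake_irp *irp, void *context)
{
    completion_record *rec = context;
    rec->seen_in_stack = irp->next.argument1;   /* already cleared */
    /* rec->urb is still valid because it was saved in the context. */
}

/* Set up the next stack location and save the URB pointer in the context. */
static void send_irp(fake_irp *irp, fake_urb *urb, completion_record *rec)
{
    irp->next.argument1 = urb;
    rec->urb = urb;
    rec->seen_in_stack = urb;
    irp->completion = on_complete;
    irp->context = rec;
}

/* Models the completion path: clear the lower location, then call up. */
static void complete_irp(fake_irp *irp)
{
    assert(irp->completion != NULL);
    memset(&irp->next, 0, sizeof irp->next);
    irp->completion(irp, irp->context);
}
```

After `complete_irp` runs, the record shows a cleared stack location but an intact saved pointer, which matches what was observed with Argument1 in the completion routine.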

NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Alex_Beavis · November 22, 2010, 2:11pm

I do not quite understand how the URB_FROM_IRP could ever work in a completion routine, then. At any rate, am I safe as long as I use the completion routine’s completion params structure to get the PipeUrb WDFMEMORY pointer, and read data from there?

Thanks for all of your posts and articles, by the way, Doron. I know I am asking a lot of questions on this mailing list, but even without my questions, your answers and OSROnline have been a huge help for going from knowing nothing about Windows drivers to having a (partly) working kernel driver in a few weeks.

Alex_Beavis · November 23, 2010, 12:55am

Update: I finally realized that I needed to reinitialize my URB before I resend the request (because things like the buffer length get changed by the time the URB comes back up to me). I now am submitting my own read requests for this particular endpoint all the time, and my code is hopefully simpler because of it. No more dealing with ugly request cancellation! The optional data intercept works, and the top-level driver only gets its data part of the time.
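To illustrate why the URB has to be filled in again before each resend, here is a minimal user-space model. It is a sketch under assumptions: `fake_transfer` is a simplified stand-in for the bulk/interrupt transfer fields of a URB, `fake_device_read` is a hypothetical model of the lower stack, and the 32-byte `REPORT_SIZE` is arbitrary. The behaviour it models is the one described above: the length field goes down as a capacity and comes back as a byte count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REPORT_SIZE 32   /* assumed buffer size, for illustration only */

/* Simplified stand-in for the bulk/interrupt transfer part of a URB. */
typedef struct {
    unsigned char *transfer_buffer;
    size_t transfer_buffer_length;   /* in: capacity, out: bytes received */
    int status;                      /* 0 means success in this model */
} fake_transfer;

/* Fill in every field the lower stack may have modified. */
static void arm_transfer(fake_transfer *t, unsigned char *buf, size_t cap)
{
    memset(t, 0, sizeof *t);
    t->transfer_buffer = buf;
    t->transfer_buffer_length = cap;
}

/* Models the lower stack: copies at most the advertised capacity and
 * overwrites the length field with the number of bytes delivered. */
static size_t fake_device_read(fake_transfer *t,
                               const unsigned char *pkt, size_t n)
{
    size_t copied = n < t->transfer_buffer_length
                        ? n : t->transfer_buffer_length;
    assert(t->transfer_buffer != NULL);
    memcpy(t->transfer_buffer, pkt, copied);
    t->transfer_buffer_length = copied;
    return copied;
}
```

In this model, a 4-byte read followed by a resend without `arm_transfer` leaves the capacity at 4, so a later 16-byte packet is truncated. Calling `arm_transfer` before every send restores the full capacity.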

Question: In “non-intercept data” mode, I send a new read request each time I receive a read request from the top-level driver. In the “intercept data” mode, I am intending for new read requests to only be sent when the user actually requests an interrupt endpoint data read (presumably through an IOCTL call to my filter driver). Is there any potential issue there? I am unsure if the device or a lower-level driver could fill up internal buffers if I do not send read requests quickly enough.

Tim_Roberts · November 23, 2010, 11:48am

xxxxx@gmail.com wrote:

> Question: In “non-intercept data” mode, I send a new read request each time I receive a read request from the top-level driver. In the “intercept data” mode, I am intending for new read requests to only be sent when the user actually requests an interrupt endpoint data read (presumably through an IOCTL call to my filter driver). Is there any potential issue there? I am unsure if the device or a lower-level driver could fill up internal buffers if I do not send read requests quickly enough.

There is no buffering anywhere in the USB stack. If there is no outstanding read request ready to receive data, then the host controller will not send an IN token, and the device will not have an opportunity to transmit data.

My only concern with your scheme involves the request sizes. An interrupt read request remains outstanding until it is completely satisfied (or until there is a short packet). If your application submits a 2MB read and you pass it through, it might be a very long time before you have the opportunity to insert your own request. Only you can tell whether the design of your device allows for something like that.
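The completion rule described here can be written down as a tiny user-space model. This is only a sketch: `pending_read`, `feed_packet`, and the 8-byte `MAX_PACKET` are hypothetical and stand in for a read request and the endpoint's maximum packet size. A read stays outstanding until its buffer is completely filled or a packet shorter than the maximum arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PACKET 8   /* assumed endpoint max packet size, illustration only */

typedef struct {
    size_t requested;   /* size of the read request's buffer */
    size_t received;    /* bytes delivered so far */
} pending_read;

/* Feed one packet into the read; returns true once the request completes.
 * The model assumes the packet always fits in the remaining buffer. */
static bool feed_packet(pending_read *r, size_t packet_len)
{
    assert(packet_len <= MAX_PACKET);
    assert(r->received + packet_len <= r->requested);
    r->received += packet_len;
    if (packet_len < MAX_PACKET)          /* short packet ends the transfer */
        return true;
    return r->received == r->requested;   /* buffer completely satisfied */
}
```

With a 32-byte request, two full 8-byte packets leave the read pending, and a following 3-byte packet completes it with 19 bytes. A very large request fed only full-size packets stays pending for a long time, which is the concern raised above.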

Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Alex_Beavis · November 23, 2010, 1:48pm

> My only concern with your scheme involves the request sizes. An interrupt read request remains outstanding until it is completely satisfied (or until there is a short packet). If your application submits a 2MB read and you pass it through, it might be a very long time before you have the opportunity to insert your own request. Only you can tell whether the design of your device allows for something like that.

As always, thanks for the information. In this case, I think I should be fine, since I do not think I ever have a read request larger than 32 bytes, and short transfers are fine. In fact, now that I am using your suggested scheme, I am doing all of the requests on the endpoint of interest, so top-level read requests for that endpoint are never forwarded.

My final hurdles are getting data back to userspace (which I should be able to find examples for), and figuring out why my driver can cause the system to block when upgrading the driver version or shutting down. I encountered one situation where shutting down hung until I unplugged the device. As far as I can tell, the d0exit handler does not get called until all requests are cancelled, or something to that effect. Are there any requirements I might be missing involving WDFMEMORY and/or WDFREQUEST references, where things must be done in a certain place in order for driver shutdown to be robust?

My current best guess is that it is either a result of my own WDFREQUEST objects which I created and which are sometimes pending, or it is a result of the WDFREQUEST objects I have saved from the top-level driver so that I can return data through them. I can complete them with STATUS_CANCELLED on driver shutdown, but perhaps I am not doing that in the right handler?

Doron_Holan · November 23, 2010, 1:58pm

My guess is that you are not handling cancelation of the WDFREQUESTs you are queuing from hidusb above you. How are you storing the incoming WDFREQUESTs? You should be parking them in a manual queue which handles cancelation for you automatically. When you want to complete one of them, you pull it off the queue.

d

Alex_Beavis · November 23, 2010, 2:02pm

I am indeed just storing the raw handles in my device context. If you can point me in the direction of a queue that can handle cancellation or a WinDDK example that uses such a queue, I should be able to use that. Thanks a lot!

If it makes a difference, the WDFREQUESTs I am storing from above are from a general USB driver, not a HID USB driver.

Doron_Holan · November 23, 2010, 2:09pm

Create a manual WDFQUEUE (manual is a type of WDFQUEUE, you create it just like any other type of WDFQUEUE). When you receive the WDFREQUEST in your I/O event callback routine, you forward the WDFREQUEST to your manual queue. Once it is forwarded, the queue itself handles cancelation for you automatically, without any additional code. When you want to complete a parked request, you retrieve the request (I am guessing just the first request in the queue is fine, so the first one returned) and then complete it.
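In KMDF terms this is a queue configured with `WdfIoQueueDispatchManual`, fed with `WdfRequestForwardToIoQueue` and drained with `WdfIoQueueRetrieveNextRequest`. The block below is not that API. It is a user-space model with hypothetical names (`manual_queue`, `queue_park`, `queue_cancel`, `queue_retrieve_next`) that only illustrates the ownership rule: while a request is parked, the queue deals with cancellation, and once it is retrieved, the driver owns it again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 8   /* arbitrary capacity for the model */

typedef enum { REQ_PARKED, REQ_CANCELLED, REQ_COMPLETED } req_state;

typedef struct {
    int id;
    req_state state;
} request;

/* FIFO of parked requests; the queue owns a request until it is
 * retrieved or cancelled. */
typedef struct {
    request *slots[QUEUE_CAP];
    size_t count;
} manual_queue;

/* Park a request; returns false if the model's queue is full. */
static bool queue_park(manual_queue *q, request *r)
{
    assert(q != NULL && r != NULL);
    if (q->count == QUEUE_CAP)
        return false;
    r->state = REQ_PARKED;
    q->slots[q->count++] = r;
    return true;
}

/* Cancellation from above: the queue removes the request and marks it
 * cancelled itself, so the driver never holds a stale pointer. */
static bool queue_cancel(manual_queue *q, request *r)
{
    for (size_t i = 0; i < q->count; i++) {
        if (q->slots[i] == r) {
            for (size_t j = i + 1; j < q->count; j++)
                q->slots[j - 1] = q->slots[j];
            q->count--;
            r->state = REQ_CANCELLED;
            return true;
        }
    }
    return false;   /* not parked any more: the driver owns it now */
}

/* Pull the oldest parked request, or NULL if none is waiting. */
static request *queue_retrieve_next(manual_queue *q)
{
    if (q->count == 0)
        return NULL;
    request *r = q->slots[0];
    for (size_t j = 1; j < q->count; j++)
        q->slots[j - 1] = q->slots[j];
    q->count--;
    return r;
}
```

Compared with raw handles saved in a device context, the queue is the single place that knows which requests are still parked, so no separate lock or cancel bookkeeping is needed in the model.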

d

Tim_Roberts · November 23, 2010, 4:08pm

Doron Holan wrote:

Create a manual WDFQUEUE (manual is a type of WDFQUEUE, you create it just like any other type of WDFQUEUE).

As a sidebar, let me say that WDFQUEUEs are lightweight and extremely
useful. In my view, they are the most critical component in the KMDF
framework. Some people seem to have an aversion to creating extra
queues, and try to shoehorn some magic feature by exploiting another
queueing method. That’s a false economy. There is nothing wrong with a
driver that has a dozen WDFQUEUEs for specific purposes, if that makes
the design any clearer (and it often does). It certainly eliminates
most cancellation issues.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Gary_Little-3 · November 23, 2010, 5:27pm

I can “AMEN” that having used them in a WFP driver with an inverted callback. WDFQUEUE made the queue management logic painless.

Gary

Alex_Beavis · November 24, 2010, 12:27am

That was surprisingly painless, as far as such things go. Replacing the saved WDFREQUEST pointers also allowed me to get rid of the spinlock I had used to protect them, and things worked fine on the first try! …well, except for the fact that apparently my completion routine kept completing top-level requests even when my custom requests were cancelled, which made a machine-locking infinite loop. Hilarious.

Once I changed my completion routine to just return immediately if it received one of my requests completed with STATUS_CANCELLED, however, things worked great. I tried several combinations of not having custom requests pending, having custom requests pending, doing lots of reads, upgrading the driver, and unplugging the device, and I have not seen the driver freeze or prevent itself from unloading yet. Thanks a lot, Doron and Tim!

Now I just need to learn the proper way to make a blocking read request to transfer endpoint data to userland (it would be nice to do it with an IOCTL but I suppose that may not be feasible), maybe add one data-processing feature, and the kernel part of my driver should be done. As another aside, I tried it on Windows XP 32-bit last night, after doing all my development in Windows Vista 64. Aside from one minor difference probably due to the top-level (closed-source) driver version being older on that computer, it basically installed and worked without changes. Color me impressed with the WDF build system.