I'm looking for a good way to load-balance large and small write WDFREQUESTs. Due to the nature of the device, there is a limit on the number of pages that may be in flight at a given time. To ensure that small writes aren't stuck behind large writes (on two separate files), I'd like to process up to X pages per write request, then move on to the next request if/when the device Transmit queue is no longer full.
My original idea was to use a manual queue in which each file may place one request at a time. A Transmit DPC then does the following until the device Transmit queue is full or all requests have data in flight:
1. Pulls used descriptors off the device Transmit queue.
2. Finds and completes the WDFREQUEST if it was the last piece of data for the request.
3. Retrieves a request from the head of the manual queue.
4. Puts a new descriptor on the device Transmit queue.
5. Requeues the request at the tail of the manual queue.
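To make the intended scheduling concrete, here is a minimal user-mode sketch of steps 3 to 5 in plain C. It is a simulation under stated assumptions, not driver code: `sim_request`, `sim_queue`, and `sim_transmit_pass` are hypothetical names, the manual queue is modeled as a small ring buffer, and a request is dropped from the ring once all of its pages have been submitted (completion handling in steps 1 and 2 is left out).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REQUESTS 8

/* Hypothetical stand-in for a write WDFREQUEST: only page counts are tracked. */
typedef struct {
    int id;
    int pages_left; /* pages not yet handed to the device */
} sim_request;

/* Hypothetical stand-in for the manual queue: a fixed-size FIFO ring. */
typedef struct {
    sim_request items[MAX_REQUESTS];
    size_t head;
    size_t count;
} sim_queue;

static void sim_queue_push_tail(sim_queue *q, sim_request r)
{
    assert(q->count < MAX_REQUESTS);
    q->items[(q->head + q->count) % MAX_REQUESTS] = r;
    q->count++;
}

static sim_request sim_queue_pop_head(sim_queue *q)
{
    sim_request r;
    assert(q->count > 0);
    r = q->items[q->head];
    q->head = (q->head + 1) % MAX_REQUESTS;
    q->count--;
    return r;
}

/*
 * One pass of the transmit loop: while the device has free pages and the
 * queue is not empty, take the request at the head, submit at most `quantum`
 * pages for it, and put it back at the tail if it still has pages left.
 * Each submission appends the request id to order_out (up to order_cap).
 * Returns the total number of pages submitted in this pass.
 */
static int sim_transmit_pass(sim_queue *q, int free_pages, int quantum,
                             int *order_out, size_t order_cap,
                             size_t *order_len)
{
    int submitted = 0;
    assert(quantum > 0);
    while (free_pages > 0 && q->count > 0) {
        sim_request r = sim_queue_pop_head(q);
        int chunk = r.pages_left < quantum ? r.pages_left : quantum;
        if (chunk > free_pages) {
            chunk = free_pages;
        }
        r.pages_left -= chunk;
        free_pages -= chunk;
        submitted += chunk;
        if (*order_len < order_cap) {
            order_out[(*order_len)++] = r.id;
        }
        if (r.pages_left > 0) {
            sim_queue_push_tail(q, r); /* the tail requeue that WDF lacks */
        }
    }
    return submitted;
}
```

With a 10-page request and a 2-page request queued, a quantum of 4 pages, and 8 free pages on the device, the pass serves the large request, then the small one, then the large one again, so the small write is never stuck behind the whole large write.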
The issue is that there seems to be no way of directly requeueing a WDFREQUEST to the tail of a WDFQUEUE.
The WdfRequestRequeue function returns a WDFREQUEST to the head of the queue it was delivered from, as documented here.
WdfRequestForwardToIoQueue will fail with STATUS_INVALID_DEVICE_REQUEST if the target queue matches the source queue, as documented here.
There are three solutions I can think of to this issue.
1. Rather than using WdfIoQueueRetrieveNextRequest, use WdfIoQueueFindRequest and supply the previously used WDFREQUEST handle as FoundRequest in order to get the next request, and loop (specify FoundRequest as NULL) when I reach the end of the queue. This requires keeping track of some state (the previously used WDFREQUEST reference), but seems to be the best way to do things.
2. Create a second queue which simply forwards to the manual queue. Use WdfIoQueueRetrieveNextRequest on the manual queue, process to a point, forward to the forwarding queue, and the request should be placed back in the manual queue at the tail. In this case I would have to check both the forwarding queue and manual queue for the request when going to complete it.
3. Chop the original write request into small pieces, send one piece to the manual queue at a time, and ensure we place the entire request on the device at once, completing the original request when the last small request completes. I could see this being a large performance hit.
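The cursor idea in the first option (remember the last request served, ask for the one after it, and wrap to the head at the end) can be sketched in plain C as follows. This is only a user-mode model of the bookkeeping, not WDF code: `sim_entry`, `sim_find_after`, and `sim_next_round_robin` are hypothetical names, the queue is a simple linked list, and the sketch ignores the fact that in a real driver the remembered request can be cancelled or completed between calls, which would leave the cursor stale.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued request, linked from head to tail. */
typedef struct sim_entry {
    int id;
    struct sim_entry *next;
} sim_entry;

/*
 * Models a "find the entry after this one" lookup: passing prev == NULL
 * starts at the head, and NULL is returned once the tail has been passed.
 */
static sim_entry *sim_find_after(sim_entry *head, sim_entry *prev)
{
    return prev == NULL ? head : prev->next;
}

/*
 * Round-robin step: advance from *cursor, wrapping to the head when the end
 * of the queue is reached. Returns NULL only when the queue is empty.
 */
static sim_entry *sim_next_round_robin(sim_entry *head, sim_entry **cursor)
{
    sim_entry *found;
    assert(cursor != NULL);
    found = sim_find_after(head, *cursor);
    if (found == NULL) {
        found = sim_find_after(head, NULL); /* wrap around to the head */
    }
    *cursor = found;
    return found;
}
```

The only state carried between calls is the cursor, which matches the "previously used WDFREQUEST reference" mentioned in the first option.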
I believe solution 1 is the most correct and most performant solution, but I'd like a second opinion. Is there another solution I haven't thought of?