Fair load balancing of large Write WDFREQUESTs - Can't WdfRequestRequeue to tail of manual queue

I’m looking for a good way to load-balance large and small Write WDFREQUESTs. Due to the nature of the device, there is a limit on the number of pages which may be in flight at a given time. To ensure small writes aren’t stuck behind large writes (on two separate files), I’d like to process up to X pages per write request, then move on to the next request if/when the device Transmit queue is no longer full.

My original idea was to use a manual queue in which each file may place one request at a time. A Transmit DPC then does the following until the device Transmit queue is full or all requests have data in flight:

  1. Pulls used descriptors off the device Transmit queue.
  2. Finds and completes the WDFREQUEST if it was the last piece of data for the request.
  3. Retrieves a request from the head of the manual queue.
  4. Puts a new descriptor on the device Transmit queue.
  5. Requeues the request at the tail of the manual queue.
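
The intended behavior of steps 3 to 5 can be modeled outside the driver to check the fairness property. The following is a minimal sketch in plain C, not KMDF code: `Req`, `Fifo`, `run_round_robin`, and the `SLICE_PAGES` budget are hypothetical names, and the device is assumed to accept one slice per turn (the Transmit queue depth and the descriptor reclaim in step 1 are left out).

```c
#include <stddef.h>

#define MAX_REQS    8
#define SLICE_PAGES 4   /* hypothetical per-turn page budget ("X" in the text) */

typedef struct {
    int id;
    int pages_left;     /* pages not yet handed to the device */
} Req;

/* Fixed-size ring buffer standing in for the manual queue. */
typedef struct {
    Req *slot[MAX_REQS];
    size_t head, count;
} Fifo;

/* Caller must not push more than MAX_REQS entries. */
static void fifo_push_tail(Fifo *q, Req *r) {
    q->slot[(q->head + q->count) % MAX_REQS] = r;
    q->count++;
}

static Req *fifo_pop_head(Fifo *q) {
    if (q->count == 0) return NULL;
    Req *r = q->slot[q->head];
    q->head = (q->head + 1) % MAX_REQS;
    q->count--;
    return r;
}

/* Serve requests round-robin and record ids in completion order.
 * Returns the number of ids written to `order`. */
static size_t run_round_robin(Fifo *q, int *order, size_t order_cap) {
    size_t done = 0;
    Req *r;
    while ((r = fifo_pop_head(q)) != NULL) {          /* step 3 */
        int n = r->pages_left < SLICE_PAGES ? r->pages_left : SLICE_PAGES;
        r->pages_left -= n;                           /* step 4: one slice */
        if (r->pages_left > 0) {
            fifo_push_tail(q, r);                     /* step 5: back to tail */
        } else if (done < order_cap) {
            order[done++] = r->id;                    /* step 2: complete */
        }
    }
    return done;
}
```

In this model, a 10-page request queued ahead of a 3-page request does not delay it: the 3-page request completes first.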

The issue is that there seems to be no way of directly requeueing a WDFREQUEST to the tail of a WDFQUEUE.

The WdfRequestRequeue function requeues a WDFREQUEST to the head of the queue, as documented here.
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/wdfrequest/nf-wdfrequest-wdfrequestrequeue
https://community.osr.com/discussion/238910/wdfrequestrequeue

WdfRequestForwardToIoQueue will fail with STATUS_INVALID_DEVICE_REQUEST if the target queue matches the source queue, as documented here.
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/wdfrequest/nf-wdfrequest-wdfrequestforwardtoioqueue

There are three solutions I can think of to this issue.

  1. Rather than using WdfIoQueueRetrieveNextRequest, use WdfIoQueueFindRequest and supply the previously used WDFREQUEST handle as FoundRequest to get the next request, then wrap around (passing NULL as FoundRequest) when I reach the end of the queue. This requires keeping track of some state (the previously used WDFREQUEST reference), but seems to be the best way to do things.
    https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/wdfio/nf-wdfio-wdfioqueuefindrequest

  2. Create a second queue which simply forwards to the manual queue. Use WdfIoQueueRetrieveNextRequest on the manual queue, process to a point, forward to the forwarding queue, and the request should be placed back in the manual queue at the tail. In this case I would have to check both the forwarding queue and manual queue for the request when going to complete it.

  3. Chop the original write request into small pieces, send one piece to the manual queue at a time, and ensure we place the entire request on the device at once, completing the original request when the last small request completes. I could see this being a large performance hit.
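
The cursor-and-wrap logic of solution 1 can be sketched independently of the framework. This is a plain C model, not KMDF code: `find_after` is a hypothetical stand-in for the search described above (a NULL cursor starts at the head, and running off the end reports no more entries), and `next_round_robin` is a hypothetical wrapper that restarts from the head. The model also assumes that a cursor which has already left the queue is reported as not found and is handled by restarting from the head.

```c
#include <stddef.h>

typedef struct { int id; } Req;

typedef enum { FIND_OK, FIND_NO_MORE_ENTRIES, FIND_NOT_FOUND } FindStatus;

/* Hypothetical model of a queue that is searched but not dequeued. */
typedef struct {
    Req **items;
    size_t count;
} Queue;

/* Return the request after `cursor`, or the head when cursor is NULL. */
static FindStatus find_after(const Queue *q, const Req *cursor, Req **out) {
    size_t start = 0;
    *out = NULL;
    if (cursor != NULL) {
        size_t i = 0;
        while (i < q->count && q->items[i] != cursor) i++;
        if (i == q->count) return FIND_NOT_FOUND;   /* cursor left the queue */
        start = i + 1;
    }
    if (start >= q->count) return FIND_NO_MORE_ENTRIES;
    *out = q->items[start];
    return FIND_OK;
}

/* Advance the cursor round-robin; returns NULL only if the queue is empty. */
static Req *next_round_robin(const Queue *q, const Req *cursor) {
    Req *next;
    if (find_after(q, cursor, &next) == FIND_OK) return next;
    /* End of queue or stale cursor: restart from the head. */
    if (find_after(q, NULL, &next) == FIND_OK) return next;
    return NULL;
}
```

In a real driver, the handle returned by WdfIoQueueFindRequest carries an extra reference that the driver is expected to release with WdfObjectDereference; the documentation linked under solution 1 has the details.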

I believe solution 1 is the most correct and most performant solution, but I’d like a second opinion. Is there another solution I haven’t thought of?

Whatever works, and is most clear in your code. That’s what you should go for. Method 1 seems nice and makes sense to me.

Or don’t store the Requests in a Queue. Rather, store them in a Collection or queue them in a regular list by their Request Context.

Of course, you lose the automatic cancellation feature, but if you can live without it…

People sometimes get hung up on thinking they MUST store Requests in Queues. Remember, a Request is just another Object. The only real advantage of storing them in a Queue is the auto-cancel feature.
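
A driver-managed list along these lines might look like the following. It is a minimal sketch in portable C, assuming an intrusive doubly linked list in the style of LIST_ENTRY embedded in the request context; `Link`, `ReqCtx`, and the `list_*` and `move_to_tail` helpers are hypothetical names. A real driver would also need a lock around the list and its own cancellation handling, since the auto-cancel feature is gone.

```c
#include <stddef.h>

/* Intrusive list node, modeled on LIST_ENTRY. */
typedef struct Link { struct Link *next, *prev; } Link;

/* Hypothetical per-request context with an embedded link. */
typedef struct {
    Link link;      /* first member, so a Link* to it is also a ReqCtx* */
    int id;
} ReqCtx;

static void list_init(Link *head) { head->next = head->prev = head; }

static void list_insert_tail(Link *head, Link *n) {
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* O(1) removal from anywhere in the list, e.g. when a request completes. */
static void list_unlink(Link *n) {
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;
}

/* Peek at the head request without removing it; NULL if the list is empty. */
static ReqCtx *list_head(Link *head) {
    return head->next == head ? NULL : (ReqCtx *)head->next;
}

/* Requeue a request at the tail after it has had its turn. */
static void move_to_tail(Link *head, ReqCtx *r) {
    list_unlink(&r->link);
    list_insert_tail(head, &r->link);
}
```

With the link embedded in the context, finding the list entry for a completing request needs no search: the driver gets the context from the request and unlinks it directly.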

Peter