Question on WdfIoQueueReadyNotify

Hi, I have framework queues that are configured with DispatchManual and I call WdfIoQueueReadyNotify() because this is manual dispatch. Obviously, I provide a callback to WdfIoQueueReadyNotify.

a) In what context does the callback run?
Obviously it runs when the state of the Queue goes from empty to non-empty, but what would the context be?
Are the executions serialized?

Here is roughly what my callback looks like:

    VOID Foo(WDFQUEUE Queue, WDFCONTEXT Context)
    {
        WDFREQUEST request;
        NTSTATUS status;

        while (TRUE) {
            status = WdfIoQueueRetrieveNextRequest(Queue, &request);
            if (!NT_SUCCESS(status)) break;   // queue is empty
            ProcessReq(request);
        }
    }

Let us say that the framework has injected 3 requests into the Queue, A, B and C and the queue is initially empty.

a) Is it possible for multiple instances of the callback to run?
b) Does the framework use work items to dispatch the callbacks?
c) Is there a chance that requests A, B and C will be processed in an order different from how they are submitted by the framework?

Thanks,
RK

because this is manual dispatch. Obviously, I provide a callback to WdfIoQueueReadyNotify.

Well… not so “obvious.” In fact, I’ve written a few WDF drivers – including a couple with manual queuing – and I had never even HEARD of this callback until tonight.

By specifying this callback, aren’t you turning your manual queue into a non-manual queue? Why use a manual queue in this case? Why not just use Sequential or Parallel Dispatching? Your driver will be a whole lot easier to support.
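For reference, here is a minimal sketch of the kind of queue Peter is suggesting: a default queue created with Sequential dispatching, instead of a manual queue plus WdfIoQueueReadyNotify. The handler names (EvtIoRead, EvtIoWrite, EvtIoDeviceControl) and the Device handle are placeholders, not code from this thread:

```c
/*
 * Sketch only: a Sequential default queue as an alternative to
 * manual dispatch + WdfIoQueueReadyNotify. Handler names and the
 * Device handle are placeholders for the driver's own code.
 */
WDF_IO_QUEUE_CONFIG queueConfig;
WDFQUEUE queue;
NTSTATUS status;

WDF_IO_QUEUE_CONFIG_INIT_DEFAULT_QUEUE(&queueConfig,
                                       WdfIoQueueDispatchSequential);
queueConfig.EvtIoRead = EvtIoRead;
queueConfig.EvtIoWrite = EvtIoWrite;
queueConfig.EvtIoDeviceControl = EvtIoDeviceControl;

status = WdfIoQueueCreate(Device,
                          &queueConfig,
                          WDF_NO_OBJECT_ATTRIBUTES,
                          &queue);
```

With Sequential dispatching the framework delivers one request at a time, so the serialization questions above largely disappear.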

So, without looking at the WDF source code (which is on GitHub as you know), I’d GUESS:

a) Yes

b) Maybe… but that’s a matter of implementation – And what exactly would “work item” mean here, anyhow (again, not that it matters)?? The context in which the callback runs will be an arbitrary process/thread context (just like almost all KMDF callbacks).

c) “submitted by the framework”… do you mean “submitted TO the framework”?? Again, whatever… the answer is “yes” – due to the wonders of scheduling if for no other reason. Not that it matters, right?

Peter

Peter, thanks for the reply.

This is code I inherited. I have a good mind to change it into Parallel dispatch but that is a discussion for another time.

Basically, the Queue is set up for manual dispatch and the callback provided in WdfIoQueueReadyNotify is called whenever the framework sends Requests to this queue and transitions the queue from empty to non-empty. That is the description of this API.

For (c), yes, these are requests submitted to the framework by an upper-level application that we have.

Just curious, since you said you had used manual queuing: I assume your answer means you did not use WdfIoQueueReadyNotify for such queues. If not, did you instead just use the regular callbacks for Read, Write, Ioctl, …?

Thanks,
RK

For the (c) part of my question: the Requests must be processed in the order they are submitted to the framework, which is also the order in which the framework queues them to the Queue. I cannot violate that order.

Both, the question and the answer are surprising me a little bit…

Surprising question:
Why care about scheduling and parallelity of WdfIoQueueReadyNotify callback?
Docs say this callback is only invoked if the WdfQueue changes state from empty to non-empty.
This can only happen once, no matter how many WdfRequests are being queued at the same time.
After this the queue is not empty any more. Thus it cannot change state from empty to non-empty any more. Thus no more invocations of WdfIoQueueReadyNotify. UNLESS the driver extracts WdfRequests from the WdfQueue (which of course needs to be synchronized properly against this WdfIoQueueReadyNotify callback).

Surprising answer:
In my recollection, this WdfIoQueueReadyNotify callback was one of the first ones I saw being implemented by our customers when WDF took off more than a decade ago. Watching their implementation, I always considered it as an effective solution for one of the two simplest standard patterns in WDF driver development described below. Please correct me if I am wrong!

Standard pattern with WdfIoQueueReadyNotify:
“Manual WdfQueue with incoming Interrupts”

  • ISR writes data to nonpaged buffer guarded by ISR lock.
  • ISR queues DPC.
  • DPC completes all queued WdfRequests it can complete using the data from the nonpaged buffer.
  • WdfIoQueueReadyNotify completes WdfRequests arriving on the manual WdfQueue when bytes are available in the buffer, but no more interrupts are triggering WdfRequest completion by the DPC.
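The ready-notification side of the steps above can be sketched roughly as follows. This is a hedged outline, not Marcel's actual code: DEVICE_CONTEXT, DataLock, BufferedBytesAvailable() and CompleteRequestFromBuffer() are assumed/hypothetical names, and the DPC side is presumed to take the same lock:

```c
/*
 * Sketch of the WdfIoQueueReadyNotify callback in the pattern above.
 * DataLock, BufferedBytesAvailable() and CompleteRequestFromBuffer()
 * are hypothetical names; the DPC takes the same lock.
 */
VOID
EvtQueueReady(
    _In_ WDFQUEUE Queue,
    _In_ WDFCONTEXT Context
    )
{
    PDEVICE_CONTEXT devCtx = (PDEVICE_CONTEXT)Context;
    WDFREQUEST request;
    NTSTATUS status;

    WdfSpinLockAcquire(devCtx->DataLock);

    /*
     * Complete requests only while buffered data exists; otherwise
     * leave them queued for the DPC to complete on the next interrupt.
     */
    while (BufferedBytesAvailable(devCtx)) {
        status = WdfIoQueueRetrieveNextRequest(Queue, &request);
        if (!NT_SUCCESS(status)) {
            break;      /* queue drained */
        }
        CompleteRequestFromBuffer(devCtx, request);
    }

    WdfSpinLockRelease(devCtx->DataLock);
}
```

Taking the same lock in both the DPC and this callback is what provides the "synchronized properly" requirement mentioned earlier.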

Is there anything wrong with this pattern?
I know that the pattern above can cause excessive DPC load for devices with very high interrupt frequency. But on the other hand, this approach should scale best and offer maximum throughput with minimum latency on multiprocessors compared to e.g. one single completion thread.

Is there a more effective way to do this?
If using e.g. a sequential WdfQueue, the “Current WdfRequest” logic with its cancellability, its border conditions, etc. would introduce more complexity.

Marcel Ruedinger
datronicsoft

Marcel, thanks for the response.

Yes, in my driver, the callback does take requests from the WdfQueue and processes them. See my pseudocode above.

Consider this scenario here. I think this may end up getting the callback scheduled more than once concurrently or in parallel.

Let us say the following requests are being submitted to the framework on this queue. Initially the queue is empty.
R1, R2, R3, < some delay here >, R4, R5

a) Once R1 is submitted, the queue goes from empty to non-empty and the callback is invoked.
Let us say R2 and R3 are submitted to the framework and the callback's while loop is processing these.

b) Now we have a small delay here, but the callback is still processing R3. R3 is not in the queue anymore because the driver has extracted it from the queue.

c) Now R4 is submitted.
In my opinion, since the queue went from empty to non-empty, the framework might schedule another instance of the callback.

Now we have 2 instances of the callback running concurrently (if on same CPU) or in parallel (if multiple CPUs).
Am I analyzing this scenario correctly? Or is there something about the framework that I am missing?

I could look into the KMDF source code (and I have), but I do not want to depend on the implementation.

Thanks,
RK

@Marcel_Ruedinger: Funny. The design pattern of satisfying pending Requests with incoming data and buffering any remaining data, which is used to immediately satisfy subsequent Requests, is indeed a common design pattern. We even do a lab in our seminar that uses this pattern.

When I implement a driver for this pattern, I use a pair of Queues: One with Sequential Dispatching that checks for buffered data and if no data is available, the Request is then forwarded to a Queue with Manual Dispatching (the “pending Requests Queue”). If there IS buffered data, the Request is satisfied with the buffered data. When new data arrives (ISR then DpcForIsr), I attempt to pull a Request from the pending Requests Queue… if that succeeds, I use the arriving data to satisfy that Request. If that does not succeed (or there’s remaining arriving data), I buffer the data.
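A rough sketch of the read side of the two-queue pattern Peter describes, under stated assumptions: PendingQueue is a manual queue created elsewhere, and DeviceGetContext, DataIsBuffered() and GetBufferedData() are hypothetical context/helper names, not code from this thread:

```c
/*
 * Sketch of the Sequential-queue EvtIoRead in the two-queue pattern.
 * PendingQueue is a manual queue created elsewhere; DataIsBuffered()
 * and GetBufferedData() are hypothetical helpers.
 */
VOID
EvtIoRead(
    _In_ WDFQUEUE Queue,
    _In_ WDFREQUEST Request,
    _In_ size_t Length
    )
{
    PDEVICE_CONTEXT devCtx =
        DeviceGetContext(WdfIoQueueGetDevice(Queue));
    size_t bytesCopied;
    NTSTATUS status;

    if (DataIsBuffered(devCtx)) {
        /* Satisfy the request immediately from the buffer. */
        status = GetBufferedData(devCtx, Request, Length, &bytesCopied);
        WdfRequestCompleteWithInformation(Request, status, bytesCopied);
        return;
    }

    /* No data yet: park the request on the manual "pending" queue. */
    status = WdfRequestForwardToIoQueue(Request, devCtx->PendingQueue);
    if (!NT_SUCCESS(status)) {
        WdfRequestComplete(Request, status);
    }
}
```

The DpcForIsr side would then call WdfIoQueueRetrieveNextRequest on PendingQueue when data arrives, completing a parked request if one exists and buffering the data otherwise, as described above.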

the Requests must be processed in the order they are submitted to the framework, which is also the order in which the framework queues them to the Queue

I’m sorry @Ramakrishna_Saripalli … but you’ve hit one of my (many) hot buttons here. If Requests are coming from a single initiator, the only way those Requests are guaranteed to be processed in order is if the Requestor sends them synchronously. If the Requests are coming from multiple initiators, there’s no order anyway. As I so often tell people: let’s say you send an async read (for 4K, starting at offset 0 of a file) followed by an async write (for 4K, starting at offset 0 of the same file). Which one will finish first? Answer: You have no guarantee on Windows. Full stop.

I could look into the KMDF source code (and I have), but I do not want to depend on the implementation.

Then, your ONLY alternative is to assume that the callbacks can run in parallel… because there’s no guideline that says the callbacks will be serialized. If you want serial callbacks, you want Sequential Dispatching. That’s what it’s for.

Peter

Peter, thanks for the reply.

No, I cannot assume that they are coming only from one single initiator. By initiator I assume you mean a user thread that is dispatching Read(), Write(), Ioctl()…

Also, even if there is only one initiator, there is no guarantee that the Requests will be submitted synchronously. The initiator could be calling with OVERLAPPED (which I know is strictly a user-mode construct, but the Requests could be dispatched in parallel to the framework). I could have a whole bunch of filter drivers in the middle which could introduce all kinds of timing issues.

Yes, I have come to the conclusion that they could be dispatched in parallel. I am of course referring to the callback itself. I just have not been able to prove that this is happening or could happen but I guess I have to assume that to be the case.

Thanks,
RK

…and since there are multiple initiators of the I/O operations, you also have to conclude that there can be no guaranteed ordering of those requests. No matter what. Let’s say two Requests are presented to your driver strictly in the order that the Framework received them, but your Event Processing Callback is called in parallel, so two instances of your callback are running simultaneously. Then… before you can do anything useful… the code in ONE of your Event Processing Callbacks (running at IRQL PASSIVE_LEVEL) gets preempted.

Who’s to say which Request gets processed first? Answer: There’s no guarantee. There can’t be.

Peter

Peter, yes. That is precisely my thinking too. I could use the Framework’s built-in synchronization primitives designed for queues and change the queue to Parallel dispatch. I know my applications like to keep multiple requests pending with our driver in order to get more bandwidth, which means it is better for me to use Parallel.

Thanks again for all replies,
RK

There’s one other item to consider in your design. WdfIoQueueReadyNotify is essentially edge-triggered: it only fires on the transition from “empty” to “not empty”. If there are already 2 requests in the queue, the callback will not be called when a 3rd request comes in.

Tim, yes, I agree, which is why I am working on a Parallel dispatch type for these queues. I will have EvtIoRead, EvtIoWrite and a default handler. I am configuring them to use the synchronization across these callbacks provided by the Framework (so I do not have to deal with it). I think that should fix it.
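A minimal sketch of what RK describes here, assuming the framework's synchronization is requested via the queue's object attributes (handler names and the Device handle are placeholders). With WdfSynchronizationScopeQueue, the framework still dispatches requests in Parallel mode without waiting for earlier ones to complete, but it serializes the queue's I/O event callbacks against each other:

```c
/*
 * Sketch: Parallel default queue whose EvtIo* callbacks are
 * serialized by the framework (SynchronizationScope = Queue),
 * so the driver need not supply its own cross-callback locking.
 */
WDF_IO_QUEUE_CONFIG queueConfig;
WDF_OBJECT_ATTRIBUTES attributes;
WDFQUEUE queue;
NTSTATUS status;

WDF_IO_QUEUE_CONFIG_INIT_DEFAULT_QUEUE(&queueConfig,
                                       WdfIoQueueDispatchParallel);
queueConfig.EvtIoRead = EvtIoRead;
queueConfig.EvtIoWrite = EvtIoWrite;
queueConfig.EvtIoDefault = EvtIoDefault;

WDF_OBJECT_ATTRIBUTES_INIT(&attributes);
attributes.SynchronizationScope = WdfSynchronizationScopeQueue;

status = WdfIoQueueCreate(Device, &queueConfig, &attributes, &queue);
```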

@Tim_Roberts, talking to me or to the OP? Just in case of talking to me (otherwise please ignore): That’s exactly the purpose of the design which I pointed out above. This for at least two immediately obvious reasons: If there are already two WdfRequests in the manual WdfIoQueue,…

  1. … then there is no incoming data from the device available in the buffer. Then the next arriving WdfRequest needs to be queued without any WdfIoQueueReadyNotify callback action. WdfRequest processing will follow later in the DPC, after some data has arrived from the device and the ISR was triggered and queued a DPC.
  2. … then the two other WdfRequests need to be served first. So indeed, better NOT to have WdfIoQueueReadyNotify invoked on the new incoming WdfRequest!

Marcel Ruedinger
datronicsoft

Yes, Marcel, I replied before I had read your reply. Mine was superfluous.