Problem with WdfRequestCreate / WdfObjectDelete and WDFQUEUE

I have verified that if I push a request created by WdfRequestCreate through our queues and then dispose of it with WdfObjectDelete, a reference to the request stays stuck in the last queue the request was in. I verified this by pushing just one allocated request through our system, so no threading issues were involved. The requests we get from the IOCTL are completed by calling WdfRequestComplete, and in that case the reference is removed from the last queue it was in.

Normal flow of Request through driver:

! IOCTL Request → FrameBuf_New → FrameBuf_Active → QueueDMA → DpcDelayRequestComplete → WdfRequestComplete

When I allocate the request in the driver, new quad mode:

! Request Create → FrameBuf_New → FrameBuf_Active → QueueDMA → DpcDelayRequestComplete → WdfObjectDelete

To fix my big issue with memory leaking, I created a circular ring buffer of preallocated requests and just cycle through them. To get the driver to shut down on power down, I added an EvtIoStop() callback to my DMA queues. The callback is never called, but with it there, the driver shuts down on power down.
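A minimal sketch of the ring-of-preallocated-requests idea described above, written in portable C so it builds without the WDK. The names `PooledRequest`, `RequestRing`, `RING_SIZE` and the three functions are hypothetical and are not taken from the driver; in the real driver each slot would presumably hold a WDFREQUEST handle created once with WdfRequestCreate.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8 /* hypothetical pool depth, not from the driver */

/* Stand-in for one preallocated request (a WDFREQUEST in the driver). */
typedef struct PooledRequest {
    int  slot;    /* index of this entry in the ring */
    bool in_use;  /* true while the request is travelling through the pipeline */
} PooledRequest;

typedef struct RequestRing {
    PooledRequest slots[RING_SIZE];
    unsigned      next;  /* next slot to hand out */
} RequestRing;

void ring_init(RequestRing *ring)
{
    for (int i = 0; i < RING_SIZE; i++) {
        ring->slots[i].slot = i;
        ring->slots[i].in_use = false;
    }
    ring->next = 0;
}

/* Hand out the next slot in order, or NULL if it has not been released yet. */
PooledRequest *ring_acquire(RequestRing *ring)
{
    PooledRequest *req = &ring->slots[ring->next];
    if (req->in_use) {
        return NULL; /* the ring has wrapped onto a request still in flight */
    }
    req->in_use = true;
    ring->next = (ring->next + 1) % RING_SIZE;
    return req;
}

/* Mark a slot reusable; a driver would reinitialize the request at this point. */
void ring_release(PooledRequest *req)
{
    req->in_use = false;
}
```

The sketch only models the bookkeeping. It does not address the stuck queue reference itself, and a real driver would need a lock around `ring_acquire` and `ring_release` if they can run concurrently.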

But when we are releasing the hardware (EvtDeviceReleaseHardware) because the driver is going to be updated or changed, the queues with the hidden references in them cause the driver to hang with the inflight requests message. Since I have pulled all the requests from the queues, I do not know what else to do to clear them. The hang comes after returning from release hardware.

! Thread 0xFFFFB00F5063C040 is waiting for all inflight requests to be acknowledged on WDFQUEUE 0x00004FF0B0494BC8

This is the queue with the 1 reference in it:

0: kd> !wdfkd.wdfqueue 0x04FF0B0494BC8
Treating handle as a KMDF handle!

Dumping WDFQUEUE 0x00004ff0b0494bc8

Manual, Power-managed, PowerPurgeDriverNotified, Shut down, Cannot accept, Cannot dispatch, ExecutionLevelDispatch, SynchronizationScopeNone
Number of driver owned requests: 1
Power transition in progress
Number of waiting requests: 0

Number of requests notified about power change: 1
!wdfrequest 0x00004ff0b0775e98  !irp 0xffffb00f4eefa010
            (Request is marked cancelled, EvtIoStop may not have been called for this request)

EvtIoIdleComplete: (0xfffff8005152db90) NewTekHD
EvtIoPurgeComplete: (0xfffff80051524010) NewTekHD
EvtIoCanceledOnQueue: (0xfffff80051529770) NewTekHD
EvtIoStop: (0xfffff80051529650) NewTekHD

I tried to clear the queues using the following calls:

! WdfIoQueueDrain()
! WdfIoQueueStop()
! WdfIoQueueStopAndPurgeSynchronously()
! WdfIoQueuePurge()

I allocated the requests by calling:

WDFIOTARGET Target_ = WdfDeviceGetIoTarget( devContext->WdfDevice );
ntStatus_ = WdfRequestCreate( WDF_NO_OBJECT_ATTRIBUTES, Target_, &devContext->RequestQueue[ i ] );

What I need to know is whether there is a way to remove or complete the request so that it is removed from the last queue it was in, or whether there is a way to shut down the queue threads to stop the inflight messages. We need to be able to shut down the driver in order to replace it.

Did I misunderstand, or are you saying you are deleting a request that’s still in a queue? KMDF has the concept of request ownership. Once a request is in a queue, the framework owns it, not the driver. You don’t get to take action on it until you remove it from its queue.

Or did you mean something entirely different?

I am removing the request from the queue and moving it to a collection to be disposed of in the DPC callback. All requests are accounted for and pass through to be disposed of. The removal of the requests is based on the example code in:

https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdfio/nf-wdfio-wdfioqueuefindrequest

We have a stats subroutine to check the condition of our queues. The stats subroutine is based on the second example of the above URL and it simply counts the number of records in a queue. The stats report does not remove anything. The stats reports 0 requests in the queues I pass it on shut down.

Two situations. This behavior has not changed, and it is now based on the one-request test.
If I do not call WdfObjectDelete, I can use !wdfkd.wdfqueue and see the one record in the queue.
If I call WdfObjectDelete and use !wdfkd.wdfqueue, I see the one record in the queue, but the values reported by the debugger give me unexpected results.

Somehow the requests I allocate are remembered by the queue after I remove them. It seems to be the last queue they were in. I can follow the one request all the way through the driver, and the request does exactly what you would expect. Then, if I print the stats report, it says 0 records in the queues, but the driver still gives me the requests in flight message in release hardware. I changed the driver to use a ring buffer of preallocated requests to minimize the effect. I delete the requests on shut down, but the queues still remember the preallocated requests. When I had the huge memory leak, it was because references to the allocated and freed requests were stuck in the last queue they were in, and !wdfkd.wdfqueue reported unexpected results because the requests had been deleted. With the preallocated requests, they are simply remembered by the last queue they were in.

When an allocated request is waiting in a queue the report looks like this:

0: kd> !wdfkd.wdfqueue 0x0487C84A72BC8
Treating handle as a KMDF handle!

Dumping WDFQUEUE 0x0000487c84a72bc8

Manual, Power-managed, PowerOn, Can accept, Can dispatch, ExecutionLevelDispatch, SynchronizationScopeNone
Number of driver owned requests: 0
Number of waiting requests: 1
!wdfrequest 0x0000487c8407fc18 !irp 0xffffb7837c002cd0

That request I can read and remove from the queue. Somehow, when the request is removed from the queue, it is remembered as a driver-owned request. How do I stop that behavior?

I’m sorry to seem dense, but I’m still trying to figure out your flow. It’s not particularly normal to have a driver-created request in a queue at all, nor is it particularly normal to create a request that isn’t sent to a lower driver. I get that you have user-mode DMA requests and internally generated DMA requests. Are you creating a fake request just so you can use a single WDFQUEUE to handle both kinds? @Doron_Holan will have to comment on how the WDF queues play with internally generated requests. He knows that code better than anyone here.

There is absolutely nothing wrong with holding driver created Requests on a WDFQUEUE.

While it is always possible that there are unusual edge conditions, I would venture to guess that the OP has some sort of garden variety bug in their code.

I didn’t answer this post because, despite several attempts to do so, I couldn’t understand the OP’s issue.

Peter

The only way to insert a driver created WDFREQUEST into a WDFQUEUE is to set a completion routine and then send it to yourself so that it comes through the IO dispatch path. If you do that, you treat it like any other request in a queue and complete it when you are done. Then, in your completion routine, you can delete or reuse the request.

> The only way to insert a driver created WDFREQUEST into a WDFQUEUE

I stand corrected! You can only call WdfRequestForwardToIoQueue with a Queue Presented Request. According to the docs, you get back an error if you try it with anything else.

I obviously must mis-remember this, because I swear I’ve done this in a driver that had some hideously complex staged processing…

ETA: I did mis-remember. The stages were each represented by a WDFCOLLECTION… So, sorry, OP if I misled you.

Peter

Thank you Tim, Doron and Peter. The reason I am doing this is that one IOCTL request is changed into 4 requests and pushed through a legacy driver. Our hardware supports simultaneous DMA transfers on different channels. By doing this we don’t get the black frame dropouts we used to. The reason for the multiple steps is to reject duplicated frames and old frames, which happen when applications talk to the driver.

I re-read your threads and while there are snippets of each area of processing, the entire pipeline and where the significant state changes occur are not clear. I think a summary with code (not a written description) of what each stage of the pipeline would help focus the attention. I personally don’t need to see the DMA stuff as we are focused on the request being processed through a queue.

IOCTL Request → FrameBuf_New → FrameBuf_Active → QueueDMA → DpcDelayRequestComplete → WdfRequestComplete

From the IOCTL, 4 requests are created (1/4 of a frame each).
They are posted into a queue, FrameBuf_New.
The DPC callback handles the sorting of requests.
The DPC callback is a series of lists to be processed.
We guard the DPC callback with a spin lock on entry.
From FrameBuf_New the request is moved to a queue, FrameBuf_Active.
At the right time, the request in the FrameBuf_Active queue is moved into the appropriate QueueDMA queue.
When the DMA completes, the request is moved into a collection called DpcDelayRequestComplete.
The DpcDelayRequestComplete loop calls WdfRequestComplete or the reuse-request code.
If you need more, I will provide it tomorrow morning.

From your first post:
You said send it to yourself so that it comes through the IO dispatch path.
I am unclear on which calls I use to send the request to myself.

> I am unclear on which calls I use to send the request to myself.

Once you have the WDFREQUEST you created, you will either send it to a WDFIOTARGET that represents your device (or maybe the top of stack) or call IoCallDriver directly. The flow is clear, but the code behind it is not. There are snippets here and there but not everything in detail. Everywhere you call into a WDFQUEUE or complete the request is suspect.

I don’t think I would be using requests for that purpose. You don’t gain anything at all. It can’t help with cancellation. Just use a private structure with your own data in a collection or a LIST_ENTRY linked list. If one of the private structures happens to correspond to a request, you can include a structure member that points to the queue where it resides; that way you can pop the request and complete it.
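The private-structure approach suggested above can be sketched roughly as follows, in portable C so it builds without the WDK. Everything here is an assumption made for illustration: `ParentRequest` stands in for the one real IOCTL request, `DmaWorkItem` for a quarter-frame transfer, and `WorkList` is a hand-rolled doubly linked list playing the role of LIST_ENTRY. A real driver would hold its spin lock around the list operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUADS_PER_FRAME 4

/* Stand-in for the single framework-owned IOCTL request. */
typedef struct ParentRequest {
    int  outstanding; /* quarter-frame transfers not yet finished */
    bool completed;   /* set where the driver would call WdfRequestComplete */
} ParentRequest;

/* Private per-transfer bookkeeping; no WDFREQUEST is created for it. */
typedef struct DmaWorkItem {
    struct DmaWorkItem *next, *prev;
    ParentRequest      *parent;
    int                 quadrant;
} DmaWorkItem;

/* One pipeline stage: a circular list with a sentinel head. */
typedef struct WorkList {
    DmaWorkItem head;
    int         count;
} WorkList;

void work_list_init(WorkList *list)
{
    list->head.next = list->head.prev = &list->head;
    list->count = 0;
}

void work_list_push_tail(WorkList *list, DmaWorkItem *item)
{
    item->prev = list->head.prev;
    item->next = &list->head;
    list->head.prev->next = item;
    list->head.prev = item;
    list->count++;
}

DmaWorkItem *work_list_pop_head(WorkList *list)
{
    DmaWorkItem *item = list->head.next;
    if (item == &list->head) {
        return NULL; /* stage is empty */
    }
    list->head.next = item->next;
    item->next->prev = &list->head;
    item->next = item->prev = NULL;
    list->count--;
    return item;
}

/* Split one parent request into four work items on the first stage. */
void split_frame(ParentRequest *parent, DmaWorkItem items[QUADS_PER_FRAME],
                 WorkList *stage)
{
    parent->outstanding = QUADS_PER_FRAME;
    parent->completed = false;
    for (int q = 0; q < QUADS_PER_FRAME; q++) {
        items[q].parent = parent;
        items[q].quadrant = q;
        work_list_push_tail(stage, &items[q]);
    }
}

/* Move the oldest item to the next stage; false when nothing is left. */
bool advance_stage(WorkList *from, WorkList *to)
{
    DmaWorkItem *item = work_list_pop_head(from);
    if (item == NULL) {
        return false;
    }
    work_list_push_tail(to, item);
    return true;
}

/* Called when one transfer finishes; true when the parent is now complete. */
bool finish_item(DmaWorkItem *item)
{
    ParentRequest *parent = item->parent;
    assert(parent->outstanding > 0);
    if (--parent->outstanding == 0) {
        parent->completed = true;
        return true;
    }
    return false;
}
```

With this shape only the original IOCTL request ever belongs to the framework, so there is nothing driver-created for a WDFQUEUE to keep counting at shutdown.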


I have been trying out WdfIoTargetFormatRequestForIoctl() and the internal version.
I have been through the online examples of how to do that.
But the WdfRequestSend() only calls my CompletionRoutine().
I am assuming now I have the target wrong.
The target from WdfDeviceGetIoTarget() and the target I get from the IOCTL callback queue are the same.
The objective is to call myself back with an allocated request that will not stick as a driver-owned request in a queue,
mimicking sending a request to another driver.

The IOCTL code to receive just the request looks like:
#define FILE_DEVICE_VTHD 0x00008100
#define IOCTL_HDIO_DMA_FRAMEBUF_QUAD_REQUEST CTL_CODE(FILE_DEVICE_VTHD, 0xBA, METHOD_IN_DIRECT,FILE_ANY_ACCESS)
In case that makes a difference. I tried variations.
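For reference, the bit layout of that control code can be checked with a small portable sketch. The `SKETCH_` macros below are local re-definitions that mirror the layout of CTL_CODE in the Windows headers (device type in bits 16 to 31, access in bits 14 and 15, function in bits 2 to 13, method in bits 0 and 1) so the block builds without the WDK; the helper function names are hypothetical.

```c
#include <stdint.h>

/* Local mirror of the CTL_CODE bit layout; not the real header macro. */
#define SKETCH_CTL_CODE(DeviceType, Function, Method, Access) \
    (((uint32_t)(DeviceType) << 16) | ((uint32_t)(Access) << 14) | \
     ((uint32_t)(Function) << 2) | (uint32_t)(Method))

#define SKETCH_METHOD_IN_DIRECT 1u  /* value of METHOD_IN_DIRECT */
#define SKETCH_FILE_ANY_ACCESS  0u  /* value of FILE_ANY_ACCESS */
#define SKETCH_FILE_DEVICE_VTHD 0x00008100u

/* Same arguments as IOCTL_HDIO_DMA_FRAMEBUF_QUAD_REQUEST in the post. */
#define SKETCH_IOCTL_QUAD_REQUEST \
    SKETCH_CTL_CODE(SKETCH_FILE_DEVICE_VTHD, 0xBA, \
                    SKETCH_METHOD_IN_DIRECT, SKETCH_FILE_ANY_ACCESS)

/* Field extractors for a 32-bit I/O control code. */
uint32_t ioctl_device_type(uint32_t code) { return code >> 16; }
uint32_t ioctl_access(uint32_t code)      { return (code >> 14) & 0x3u; }
uint32_t ioctl_function(uint32_t code)    { return (code >> 2) & 0xFFFu; }
uint32_t ioctl_method(uint32_t code)      { return code & 0x3u; }
```

Under these definitions the code evaluates to 0x810002E9. The function number 0xBA is below 0x800, a range the CTL_CODE documentation reserves for Microsoft, although that is probably unrelated to the requests sticking in the queue.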

Can you tell me how to get the target I need or tell me where an article is on how to do that?
I am unfamiliar with working with a driver stack.
The issue is replacing the driver, after it has been run.

Thank you Doron, Tim and Peter

WdfDeviceGetIoTarget is going to return the device below you – the device to which you would ordinarily send requests. I believe you need to call WdfIoTargetCreate and WdfIoTargetOpen to create a target that aims at YOUR device, not your normal I/O sink. You do realize you’ll need to add cases to your IOCTL handler to process these loopbacked requests, yes?

However, I still assert that this is the wrong path. It’s a lot of trouble involving a lot of outside machinery to handle something that could easily be handled by services you control.

I want to thank Tim, Doron and Peter for the help on this bug. A filter driver was the way to go. Cleared up all the problems with the allocated requests getting stuck in the last queue they were in.