Pending IRPs and IoSkip: question to MS's people

In a recent thread, there was an argument between Peter Viscarola and me about whether it is “kosher” to use some coding pattern.

The issue occurs when there is a need to first pend the IRP till some later event will occur, and then, when the event will occur, to pass this same IRP down to the lower driver.

The suggested pattern is:

dispatch routine:
IoMarkIrpPending
put the IRP to some list
return STATUS_PENDING;

when the event occurs:
take the IRP off the list
IoSkipCurrentIrpStackLocation
(VOID)IoCallDriver(LowerDO, Irp);

Peter says this is not OK, since IoSkip propagates the SL_PENDING_RETURNED flag to the next driver’s SL, and, if the next driver will return non-pending status, this is a rules violation.

The contra-argument is: even if this will go that way, the return value of the lower driver just plain does not care, since the IO manager never sees it and sees STATUS_PENDING instead.

Probably, Peter is correct nevertheless: even if the IO manager will never see the return value from the lower driver (the (VOID)IoCallDriver call), Verifier’s instrumented IoCallDriver can possibly see it and complain.

But, for me, the only solution to this problem (if Peter is correct and this is a really problem) is:

dispatch routine:
IoCopyCurrentIrpStackLocationToNext // set up NextLoc without the pending flag
IoMarkIrpPending // set the pending flag in CurrentLoc and NOT in NextLoc
put the IRP to some list
return STATUS_PENDING;

when the event occurs:
take the IRP off the list
(VOID)IoCallDriver(LowerDO, Irp)

which looks very ugly, since IoSkip/IoCopy are logical companions of IoCallDriver and not of the “put IRP to list” path.

What about MS’s code? what pattern is it using?

Surely there are some MS’s filters like VolSnap.sys which have exactly the same pattern: first pend the IRP (write) till some activity (disk block redirections) will be done, and then pass this IRP down unchanged. What approach - of the 2 above - is used by VolSnap.sys? by other MS’s drivers?


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

I agree that your formulation “works” for the reasons you outline. My contention is that it’s architecturally incorrect and MIGHT cause problems.

Its always possible to return STATUS_PENDING as long as you set the pending bit in you IO stack location and discard the return value from the underlying driver… With the caveat that this will sometimes cause problems in things like file system filters (where the caller might take the STATUS_PENDING as a “grant” (think directory change notification or op locks).

Several years back I had a serious discussion with several MSFT devs about whether IoSkipCurrentIrpStackLocation should CLEAR the Flags field to deal with exactly this case. The outcome was to NOT make this change, because it was likely to break existing drivers, which use exactly the pattern you describe… Often as part of power management processing.

Peter
OSR

>exactly the pattern you describe… Often as part of power management processing.

Some considerations:

  1. VolSnap-style logic of “first do some async processing, and then, when it will be completed - pass the IRP down” is rather popular demand.
  2. Unless IoXxx and Verifier will change - my suggestion is safe (with the exclusion of some very special cases when STATUS_PENDING has a semantic meaning, not just “overlapped op in progress”).
  3. Looks like, in such gory details, Windows contracts are just not so well described :slight_smile: so, if MS-provided drivers use this - then maybe this is a green light to everybody.

To illustrate item 3: is it safe to call IoSetNextIrpStackLocation, then pass the IRP to another subsystem in your same driver (no IoCallDriver), and then call IoCompleteRequest just to return the IRP back to the first subsystem?

This is same kind of issue a) it works b) it will remain working unless IoCompleteRequest will change really a lot c) it is out of contract, i.e. the contract assumes that IoCompleteRequest is the “undo” for IoCallDriver, and not for such logic.

Doron once said (mid-Oct 2010) that this is OK. So, probably this means green light.

Verifier is surely OK with this, checked kernel of 2008 R2 - too.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

“Maxim S. Shatskih” wrote in message news:xxxxx@ntdev…

This isn’t necessary, you could do exactly what you did in the previous case
but replace IoSkip with IoCopy:

dispatch routine:
IoMarkIrpPending
put the IRP to some list
return STATUS_PENDING;

when the event occurs:
take the IRP off the list
IoCopyCurrentIrpStackLocationToNext
(VOID)IoCallDriver(LowerDO, Irp)

This works because IoCopy clears the Control field of the next stack
location after performing the copy. From wdm.h:

PIO_STACK_LOCATION irpSp;
PIO_STACK_LOCATION nextIrpSp;
irpSp = IoGetCurrentIrpStackLocation(Irp);
nextIrpSp = IoGetNextIrpStackLocation(Irp);
RtlCopyMemory( nextIrpSp, irpSp, FIELD_OFFSET(IO_STACK_LOCATION,
CompletionRoutine));
nextIrpSp->Control = 0;

Old DDK samples (e.g. diskperf) copied the stack location manually like so:

*ioNextStack = *ioStack;

Which is broken for this exact reason.

This also has an added benefit. By switching from IoSkip to IoCopy, you’re
no longer in a gray area of setting a bit in a stack location that you’re
relinquishing ownership of thus making this more architecturally kosher.

-scott


Scott Noone
Consulting Associate and Chief System Problem Analyst
OSR Open Systems Resources, Inc.
http://www.osronline.com

Hope to see you at the next OSR kernel debugging class February 14th in
Columbia, MD!

I’m enjoying this discussion immensely. Thanks for that, Max.

Well, I *definitely* agree with that, too.

When it comes right down to it, Windows components don’t have a formal “architecture” as much as they simply have a “design” – What’s “architecturally correct” can really only be inferred from the implementation of the component, what the docs say, and the current dev owner’s understanding and plans for their component. I, for one, have argued for a long time that this is a problem. We see it daily in the file system space – what’s right is defined by whatever the file system stack does. But I digress.

So, while again agreeing that your formation works (with the caveats we all agree on about grants) in the specific case you describe, my main issues with your formulation are:

  1. Architectural: *I* contend IoSkipCurrentIrpStackLocation means “I am not doing anything with this IRP, and I cede all control of the IRP to an underlying driver” – The implication here is that the actions you take have no lasting effect on the IRP (including its completion status). When you use your formulation, the stack location you modify is not yours, and the field you’re modifying in fact belongs to the underlying driver. If the original dev who wrote IoSkipCurrentIrpStackLocation had THOUGHT of it, don’t you think he would have cleared the stack location’s Flags field as part of that function? If Windows were written in C#, surely the Flags field of the stack location would be private to the owning driver… don’t you think? :slight_smile:

  2. Practical: I see no practical disadvantage to avoiding the use of IoSkipCurrentIrpStackLocation and instead using IoCopyCurrentIrpStackLocationToNext in the situation you describe. Using IoCopyCurrentIrpStackLocationToNext entirely eliminates the issue, and strikes ME (at least) as being easier to understand and maintain, and less likely to cause unintentional problems when changes are later made to driver code. Doing it the conventional way with IoCopyCurrentIrpStackLocation eliminates the risk of somebody seeing the code and saying… “hmmm… I think this might be broken” and then spending time to convince themselves one way or the other.

  3. Tutorial: Your formulation works in the narrow case you describe. But that narrow case requires a deeper understanding than most people have of Windows I/O Completion, and the discipline to use it only in that case. Having to account for this formulation makes it more difficult for newbs to learn “the rules.” Less experienced people are likely to misunderstand this formulation in code they see and use as a reference.

I love discussions like this…

Peter
OSR

> Old DDK samples (e.g. diskperf) copied the stack location manually like so:

*ioNextStack = *ioStack;

Which is broken for this exact reason.

This is mainly broken due to completion routine being copied, and, more so, IoCopy seems to zero the control field exactly to prevent invocation of such CR.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

>deeper understanding than most people have of Windows I/O Completion

Proper understanding is only achieved by having a rev.eng. code of IopSynchronousServiceTail, IoCompleteRequest and IopCompleteRequest.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

OK, we can reason all day long, but IoSkipCurrentIrpStackLocation documentation EXPLICITLY forbids modifying the current location in any way, and explicitly mentions IoMarkIrpPending as a call which is non-kosher.

Thank you, Mr. Grig… I didn’t even think to do the obvious thing and look at the docs.

From the docs for IoSkipCurrentIrpStackLocation:

The bit about possibly clearing the pending flag sort of detracts from the credibility of this documentation (I mean… how would you DO that), but otherwise it’s pretty clear.

Peter
OSR