This is a rare one: "IRP completion routine returned at changed IRQL"

Code 0xfa, the irql before calling IoCompleteRequest() is different to the irql after. Before is 2, after is 0.

Code is simple, a URB Irp is received, queued, pended, and completed later in a thread (thread runs at passive). (cancel routines set and cleared in line with Oney’s uber paranoid code)

I am thinking that because the Irp is received at DISPATCH, it needs completing at DISPATCH, instead of PASSIVE.

Is this something new in the latest Windows 10, because I am damn sure I have completed Irps at passive level in the past without issues. (Been away from Windows Kernel for a while, in the land of bare metal ARM and PPC, so memory’s a bit rusty on Windows).

(And yes, this is WDM code, yeah yeah yeah, before you say it, I know, I should use WDF, but this is an old driver I am maintaining)

Raising to DISPATCH, calling IoCompleteRequest() then lowering it to PASSIVE, fixes the issue by the way. Weird, very weird.

Oh, I should add that if the code doesn’t copy any data to the URB’s bulk transfer buffer, the issue doesn’t occur either. Which is weird too. Never had problems touching URB bulk buffers, ever.

Code 0xfa, the irql before calling IoCompleteRequest() is different to the irql after. Before is 2, after is 0.

…which means some higher-level driver’s IO completion routine has ARBITRARILY lowered IRQL from DISPATCH_LEVEL to PASSIVE_LEVEL…

Code is simple, a URB Irp is received, queued, pended, and completed later in a thread (thread runs at passive)

Raising to DISPATCH, calling IoCompleteRequest() then lowering it to PASSIVE, fixes the issue by the way. Weird, very weird.

…and this part already contradicts your original statement - judging from this description, the culprit is raising IRQL, rather than trying to lower it…

Anton Bassov

@Anton, no contradiction; I don’t think you get the picture.

Without the raise irql before io complete request, then the lower irql, verifier barfs.

Don’t forget, my code’s thread, which does the completion, is running at passive, and given this is a URB, it is being sent to my driver’s handler at dispatch. I think Verifier is working on some new rule where an Irp has to be completed at the same irql it was received at. I have never seen this kind of thing before, it is weird, very weird.

I think Verifier is working on some new rule where an Irp has to be completed at the same irql it was received at

Surely not…All hell would break loose if that was suddenly a rule.

Code 0xfa, the irql before calling IoCompleteRequest() is different to the irql after. Before is 2, after is 0.

Can you post the !analyze output? I’ve never seen that one and I can’t find it described anywhere (bugcheck code 0xFA is HTTP_DRIVER_CORRUPTED). Based on this description though it would mean that the “before IRQL” is DISPATCH_LEVEL and the “after IRQL” is PASSIVE_LEVEL. Presumably this means that you called IoCompleteRequest at DISPATCH_LEVEL and someone lowered it to PASSIVE_LEVEL in their completion routine. You raising the IRQL to DISPATCH_LEVEL before calling IoCompleteRequest shouldn’t do anything if the IRQL was already DISPATCH_LEVEL…

Without the raise irql before io complete request, then the lower irql, verifier barfs.

…which means IoCompleteRequest() invariably returns at DISPATCH_LEVEL, right? As long as a call to IoCompleteRequest() is made at DISPATCH_LEVEL everything works fine because there is no IRQL change behind the scenes, but if you do it at PASSIVE_LEVEL, IRQL gets elevated at some point, so that you get a complaint from Verifier.

However, your first statement claims that “Before is 2, after is 0”, i.e that it changes from DISPATCH_LEVEL to PASSIVE_LEVEL. This is the contradiction I was speaking about…

In either case, if this change occurs before IoCompleteRequest() returns control, it means that the culprit is some other driver’s IO completion routine that gets invoked behind the scenes by IoCompleteRequest().
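To make the two readings concrete, here is a small user-mode model of a before/after IRQL check wrapped around a completion routine. It is only a sketch: every name in it (`model_irql`, `model_complete`, the three routines) is hypothetical and none of it is WDK code or Verifier's actual implementation. It just assumes that the check compares the IRQL on entry to the routine with the IRQL on return.

```c
#include <assert.h>
#include <stddef.h>

/* User-mode model only: hypothetical stand-ins, not WDK APIs. */
typedef enum { MODEL_PASSIVE = 0, MODEL_APC = 1, MODEL_DISPATCH = 2 } model_irql;

/* IRQL of the single modelled CPU. */
static model_irql g_irql = MODEL_PASSIVE;

typedef void (*model_completion_routine)(void);

/* Well behaved: returns at the IRQL it was entered at. */
static void routine_ok(void) { }

/* Buggy: raises to DISPATCH (say, takes a spin lock) and never lowers. */
static void routine_leaks_raise(void) { g_irql = MODEL_DISPATCH; }

/* Buggy: forces the IRQL down to PASSIVE whatever the entry level was. */
static void routine_forces_passive(void) { g_irql = MODEL_PASSIVE; }

typedef struct {
    model_irql before;   /* IRQL when the routine was called */
    model_irql after;    /* IRQL when the routine returned   */
    int violation;       /* non-zero if the two differ       */
} model_report;

/* Models completing a request at start_irql with one completion routine
   registered above us, and checking the IRQL around that callback. */
static model_report model_complete(model_irql start_irql,
                                   model_completion_routine routine)
{
    model_report r;

    assert(routine != NULL);

    g_irql = start_irql;
    r.before = g_irql;
    routine();
    r.after = g_irql;
    r.violation = (r.before != r.after);
    return r;
}
```

In this model, a routine that leaks a raise is flagged when completion starts at PASSIVE (before 0, after 2) and goes unnoticed when completion starts at DISPATCH, which would fit the observation that raising first hides the problem. A reported "before 2, after 0" only comes out of the model when completion starts at DISPATCH and the routine forces the IRQL down.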

I think Verifier is working on some new rule where an Irp has to be completed at the same irql it was received at.

Please note that IRQL is a per-CPU concept that is totally unrelated to the one of IRP (which, BTW, may be received by the code running on CPU A and completed by the one running on the CPU B). Therefore, I just don’t see any logical reason for this hypothetical “new rule”, which sounds like a totally nonsensical suggestion to me. At the same time, I think it is entirely possible that some buggy higher-level driver has decided to introduce this ridiculous “new rule” on its own initiative, effectively screwing up your driver’s operations…

I have never seen this kind of thing before, it is weird, very weird.

Sounds pretty much like an inter-op to me…

Anton Bassov

@“Scott_Noone_(OSR)” said:

I think Verifier is working on some new rule where an Irp has to be completed at the same irql it was received at

Surely not…All hell would break loose if that was suddenly a rule.

Code 0xfa, the irql before calling IoCompleteRequest() is different to the irql after. Before is 2, after is 0.

Can you post the !analyze output? I’ve never seen that one and I can’t find it described anywhere (bugcheck code 0xFA is HTTP_DRIVER_CORRUPTED). Based on this description though it would mean that the “before IRQL” is DISPATCH_LEVEL and the “after IRQL” is PASSIVE_LEVEL. Presumably this means that you called IoCompleteRequest at DISPATCH_LEVEL and someone lowered it to PASSIVE_LEVEL in their completion routine. You raising the IRQL to DISPATCH_LEVEL before calling IoCompleteRequest shouldn’t do anything if the IRQL was already DISPATCH_LEVEL…

0xFA is the minor code (parameter) Scott, DRIVER_VERIFIER_DETECTED_VIOLATION is the major code.

The code calls IoCompleteRequest at PASSIVE level, Scott, in a thread. The Irp (URB) is received at DISPATCH, pended (IoMarkIrpPending, returning STATUS_PENDING) and put in a queue for later processing. That’s what the thread does. (Oh, there is cancellation too, handled according to Oney’s book.)

I have used this kind of design for ages, it is a very common way of doing things. Yet now Verifier is barfing.

The fix is to raise irql to DISPATCH, call IoCompleteRequest, then lower it.

If I go back to that driver (I changed the architecture so might not), or hit the same bug, I’ll post it. But yeah, it is weird, really weird.

@anton_bassov said:
it means that the culprit is some other driver’s IO completion routine that gets invoked behind the scenes by IoCompleteRequest().

Yeah, you are right I think.

At the same time, I think it is entirely possible that some buggy higher-level driver has decided to introduce this ridiculous “new rule” on its own initiative, effectively screwing up your driver’s operations…

OK, so why would verifier barf over my driver, you think it just cant tell the difference? Could be, I have seen verifier barf over memory allocated in one driver and freed in another, calling it a leak, so clearly it isnt that clever.

So it looks like the Xbox driver’s completion routine raises IRQL and doesn’t lower it. Yeah, OK, makes sense. And again, staggering that Microsoft can make errors like this; it is their kernel, they should be the absolute experts at this.

OK, so why would verifier barf over my driver, you think it just cant tell the difference?

Could be, I have seen verifier barf over memory allocated in one driver and freed in another, calling it a leak, so clearly it isnt that clever.

Oh, come on - you expect WAY too much from it…

If memory allocated by a driver X gets freed by a driver Y, this is very obviously a protocol/convention that both sides adhere to.
Verifier is just a tool that is supposed to identify the POTENTIAL bugs - it is not meant to analyse and dissect protocols, is it?

In fact, even human intelligence may not always be sufficient in situations like that. For example, imagine reading the code of a driver that allocates memory but never frees it. I don’t know about you, but in the absence of a comment stating that this is, in actuality, the intended behavior, I would also suspect a memory leak in such a situation…

Anton Bassov

Thank you for explaining things to me, Matt. I surely haven’t written enough drivers to know what’s a common pattern or, you know, how I/O processing works. I’m sure the workaround you don’t understand for the bug check you don’t understand is a good solution, Matt. Also, now that I know what bugcheck you’re talking about, the description in the docs doesn’t match your interpretation. Precision matters, Matt.

Maybe the !analyze output would have helped me help you, but who knows Matt? Good luck!


What a shocking confession Scott! I guess Scott, I ought not to have relied on your advice nearly so much!

:)

@“Scott_Noone_(OSR)” said:
Thank you for explaining things to me, Matt. I surely haven’t written enough drivers to know what’s a common pattern or, you know, how I/O processing works. I’m sure the workaround you don’t understand for the bug check you don’t understand is a good solution, Matt. Also, now that I know what bugcheck you’re talking about, the description in the docs doesn’t match your interpretation. Precision matters, Matt.

Maybe the !analyze output would have helped me help you, but who knows Matt? Good luck!

I did put the verifier text in the title of the post, and the Verifier bug number in the body; I thought that was precise enough. :) Anyway, as I said, I am not going to waste time on an architecture that exposes bugs in Microsoft drivers, hence no stack trace.

Not the first time either: many years back I had issues with the two CPU drivers shipped in the OS. Remember the famous ‘how to throttle CPU speed’ post that got so many people’s backs up? Yep, it was that driver. :)

(Oh yeah, I have also been writing drivers since the days of NT4 and know perfectly well how IRPs work.)