Description of problem if a filter sets the OriginalFileObject field

You have seen my recent comments that it is wrong for filters to set the OriginalFileObject field when generating their own IRP. Following is a description of a bug I tracked down last year which was caused by a 3rd party file system filter setting this field. Of course the filter involved will remain anonymous.

A customer was seeing 3-4 server crashes a day in their data center. They were running Server 2003 but this issue was not dependent on the OS version.

I analyzed several different crashes but the most telling was the following driver verifier failure:

DRIVER_VERIFIER_IOMANAGER_VIOLATION (c9)
The IO manager has caught a misbehaving driver.
Arguments:
Arg1: 0000000c, Invalid IOSB in IRP at APC IopCompleteRequest
(appears to be on a stack that was unwound)
Arg2: b380c5bc, the IOSB pointer, 3/4 - 0
Arg3: 00000000
Arg4: 00000000

The system was processing a create IRP when the failure occurred.

The bug ended up being an interaction between The FILTER, DFS, and how create IRPs are processed. This is the description of the bug:

  • The IOManager routine IopParseDevice processes all create operations. One of the things it does when it initializes a create IRP is set the address of a stack local IoStatusBlock structure into the UserIosb field of the IRP.

  • The create operation was then sent to MUP who forwarded it to The FILTER who forwarded it to MrxSmb, the SMB redirector.

  • The redirector completed the operation and returned.

  • The FILTER synchronized the completion of the create IRP back to their create dispatch routine so they could do their post operation processing in the context of the originating thread.

  • The FILTER generated a query file information IRP using the file object from the successful create and sent it down. This is a common operation that many filters do. What this FILTER did differently is that they used KsQueryInformationFile (from the kernel streaming library) to generate the IRP. Unfortunately this routine wrongly sets the OriginalFileObject field in the generated IRP.

  • The way the IOManager is designed, when an operation completes for a synchronous file object (FO_SYNCHRONOUS_IO flag is set) and the OriginalFileObject field is set, the inline event in the OriginalFileObject field is signaled. In this particular case since the OriginalFileObject field was set with the file object from the create operation, that file object had its inline event set to the signaled state.

  • The FILTER returned from their create dispatch routine which returned back to MUP.

  • MUP returned STATUS_PENDING for the create IRP and did additional processing asynchronously in a worker thread.

  • Since the create IRP was pended the thread was supposed to wait in IopParseDevice until a special kernel APC is executed which completes the processing of the Create IRP and signals the file object’s inline event.

  • When MUP returned back to the IOManager the IOManager did not wait (because the inline event was already signaled) and returned back to user mode.

  • The user mode application (excel in this case) immediately sent down another operation.

  • At some later point MUP completed the create IRP.

  • The thread processing the create IRP is supposed to be waiting in the IO Manager. It is not.

  • Completing the IRP queues a special kernel APC which among other things sets the correct error code and information values into what is pointed to by the UserIosb field in the IRP.

  • Unfortunately the UserIosb field points to a stack location that was in the frame of IopParseDevice which has since been unwound.

  • This stack location is in use processing some other operation and we end up writing data to the kernel stack, corrupting it.

  • It is this stack corruption that causes unusual random failures.
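The sequence above can be illustrated with a small user-mode toy model. This is only a sketch of the rule described in the bullets, not kernel code: every type and function name here (ToyFileObject, ToyIrp, toy_complete_irp and so on) is hypothetical, and the real IO Manager logic is considerably more involved. The model assumes only what the description states: completing an IRP whose OriginalFileObject is set signals that synchronous file object's inline event, and the creating thread waits on a pended create only if that event is not already signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-mode model. None of these are the real kernel structures. */
#define TOY_FO_SYNCHRONOUS_IO 0x2u

typedef struct ToyFileObject {
    unsigned flags;
    bool     event_signaled;  /* stands in for the file object's inline event */
} ToyFileObject;

typedef struct ToyIrp {
    ToyFileObject *original_file_object;  /* stands in for OriginalFileObject */
} ToyIrp;

/* Completion rule from the description: if OriginalFileObject is set and the
 * file object is synchronous, its inline event is signaled. */
static void toy_complete_irp(const ToyIrp *irp)
{
    ToyFileObject *fo = irp->original_file_object;
    if (fo != NULL && (fo->flags & TOY_FO_SYNCHRONOUS_IO)) {
        fo->event_signaled = true;
    }
}

/* The filter's post-create query on the newly created file object.
 * set_original selects the buggy (true) or the correct (false) behavior. */
static void toy_filter_post_create_query(ToyFileObject *created, bool set_original)
{
    assert(created != NULL);
    ToyIrp query = { set_original ? created : NULL };
    toy_complete_irp(&query);  /* the lower driver completes the query IRP */
}

/* Returns true if the creating thread would actually wait for the pended
 * create, false if it would fall straight through and unwind its stack. */
static bool toy_create_thread_waits(bool filter_sets_original)
{
    ToyFileObject fo = { TOY_FO_SYNCHRONOUS_IO, false };

    /* The filter issues its query before the create has really finished. */
    toy_filter_post_create_query(&fo, filter_sets_original);

    /* The create is then pended: the thread waits only if the inline event
     * is not already signaled. */
    return !fo.event_signaled;
}
```

In this model, toy_create_thread_waits(false) reports that the thread waits as intended, while toy_create_thread_waits(true) reports that it does not, which corresponds to the point where the stack local IoStatusBlock is left dangling.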

There are a couple of things that all of us should learn from this. They are:

  • A filter should never set the OriginalFileObject field when generating an IRP. Some of you may say that this failure only occurs under a specific set of circumstances, and you would be correct, but there is no beneficial reason for a filter to ever set this value. So the rule is: don’t set it.

  • A file system filter should not blindly use APIs just because they are documented in the IFSKit. The IFSKit is a superset of the DDK and there are things documented and available that filters shouldn’t use. We will try to make this clearer in the documentation in the future but in the meantime you should use the following rule:

In general you should only link your filter with the following IFSKit libraries:
ntoskrnl.lib
hal.lib
fltmgr.lib (only if you are a mini-filter)

If your filter links with other IFSKit libraries you need to evaluate if what you are doing is correct.

I know this was a long detailed explanation and I appreciate the patience of those who waded through it. I believe it helps when people understand the reasons for the rules that are defined.

Neal Christiansen
Microsoft File System Filter Group Lead
This posting is provided “AS IS” with no warranties, and confers no rights.

Thank you, Neal, for this very valuable information
that could have been very difficult to trace
when the problem occurred. I removed
setting of OriginalFileObject from all
points where my filter generates an IRP
immediately.

L.

P.S. To people from OSR - this might be an excellent
article in some future issue of The NT Insider :-)