This is longer than patience may allow, but the inter-
operability issues raised in this thread are of critical
importance to all of us.
The complexities of filtering file systems are substantial
enough that we sometimes forget we may not be
filtering the FS at all. We may be filtering another filter,
and/or being filtered in turn. The three greatest causes
of system instability I have seen in FS filters have all
been touched on in this thread: (1) blowing the call
stack; (2) deadlocks on reentering the driver stack;
(3) corrupting memory by mishandling rolled IRPs.
(1) is usually due to reentering the driver stack; (2)
always is; and (3) rolling IRPs is often done in an
attempt to avoid reentrancy – building IRPs and
passing them strictly down, rather than calling Zw
routines, is excellent practice, but it can be hard
to get right for e.g. Creates and sophisticated
buffering schemes.
1a) You can blow the call stack with a single filter,
since the FS reenters the driver stack for paging I/O.
In fact, it can reenter twice: A Create can cause a
page fault against a section not backed by the page
file. Handling that fault can trigger another, for code
in the page file. This results in three instances of
both the filter and the FS on the call stack. The FS
uses a lot of stack itself; if the filter does too, then
the FS will attempt to post one of the operations
to a worker thread to get a new stack. It may not
succeed before running out of stack space.
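A filter can take the same precaution the FS does. A rough sketch of that
post-to-a-worker-thread escape hatch follows; it assumes ntifs.h, and the
helper name and the 2 KB threshold are placeholders, not anything canonical:

#include <ntifs.h>

/* Hypothetical per-request context; names are illustrative only. */
typedef struct _POSTED_REQUEST {
    WORK_QUEUE_ITEM WorkItem;
    PDEVICE_OBJECT  DeviceObject;
    PIRP            Irp;
} POSTED_REQUEST, *PPOSTED_REQUEST;

NTSTATUS MyFilterProcessIrp(PDEVICE_OBJECT DeviceObject, PIRP Irp); /* normal path (assumed) */
VOID     MyFilterWorker(PVOID Context);

#define MYFILTER_MIN_STACK 2048   /* placeholder: tune to your own deepest path */

NTSTATUS
MyFilterDispatch(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
    /* If too little kernel stack remains, get a fresh stack by posting the
       request to a system worker thread instead of recursing any deeper. */
    if (IoGetRemainingStackSize() < MYFILTER_MIN_STACK) {
        PPOSTED_REQUEST posted = ExAllocatePoolWithTag(NonPagedPool,
                                                       sizeof(POSTED_REQUEST),
                                                       'wPsF');
        if (posted == NULL) {
            Irp->IoStatus.Status = STATUS_INSUFFICIENT_RESOURCES;
            Irp->IoStatus.Information = 0;
            IoCompleteRequest(Irp, IO_NO_INCREMENT);
            return STATUS_INSUFFICIENT_RESOURCES;
        }
        posted->DeviceObject = DeviceObject;
        posted->Irp = Irp;
        IoMarkIrpPending(Irp);
        ExInitializeWorkItem(&posted->WorkItem, MyFilterWorker, posted);
        ExQueueWorkItem(&posted->WorkItem, CriticalWorkQueue);
        return STATUS_PENDING;
    }
    return MyFilterProcessIrp(DeviceObject, Irp);
}

VOID
MyFilterWorker(PVOID Context)
{
    PPOSTED_REQUEST posted = Context;

    /* Note: this now runs in a system thread, which is exactly the change
       of thread context that breaks thread-based recursion detection. */
    MyFilterProcessIrp(posted->DeviceObject, posted->Irp);
    ExFreePool(posted);
}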
1b) If you don’t filter paging I/O (e.g., if you don’t
need to worry about memory mapped files) then you
probably won’t blow the stack until a customer installs
your filter along with one or more others, some of which
have excessive local variables and/or excessively
deep call chains. The solution is to minimize locals
and call chains, and hope others do too. I have even
gone so far as to dynamically allocate structures to
contain what would otherwise be locals; and to unroll
deep call chains into broad ones instead.
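In code, that pattern is nothing more exotic than the following; the
structure contents are hypothetical, and ntifs.h is assumed:

#include <ntifs.h>

/* Scratch area that would otherwise be several hundred bytes of locals. */
typedef struct _MYFILTER_SCRATCH {
    UNICODE_STRING          Name;
    WCHAR                   NameBuffer[260];
    FILE_BASIC_INFORMATION  BasicInfo;
    IO_STATUS_BLOCK         IoStatus;
} MYFILTER_SCRATCH, *PMYFILTER_SCRATCH;

NTSTATUS
MyFilterDoWork(VOID)
{
    /* One pointer on the stack instead of the whole structure. */
    PMYFILTER_SCRATCH scratch;
    NTSTATUS status = STATUS_SUCCESS;

    scratch = ExAllocatePoolWithTag(NonPagedPool, sizeof(MYFILTER_SCRATCH), 'csFM');
    if (scratch == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }
    RtlZeroMemory(scratch, sizeof(MYFILTER_SCRATCH));
    scratch->Name.Buffer = scratch->NameBuffer;
    scratch->Name.MaximumLength = (USHORT)sizeof(scratch->NameBuffer);

    /* ... do the real work through scratch-> instead of through locals ... */

    ExFreePool(scratch);
    return status;
}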
(2) Filters generally turn a single I/O operation into
multiple I/O operations. (E.g., an operation may
trigger a virus scanner to open a file, read its entire
content, and close it.) If Zw routines are used, a
good deal of reentrancy can result in the calling
thread, **while the original IRP is blocked**. Apply this
twice, for a system with two independent filters, and
you’ll see that the “original” IRP may not be the original
IRP at all: it may be a top filter’s nested operation.
What if the bottom filter then blocks that nested IRP
and issues its own, reentrantly? This can lead to two
places, both bad. Often, either the process is an
infinite regress, or one of the filters uses
synchronization objects to block further operations on
the file until its outstanding operation completes. The
infinite regress will either blow the stack or post
infinite work items to worker thread(s). The
synchronization will cause immediate deadlock if one of
the operations must be posted and thereby continue in a
new thread context (where it will be blocked).
It is important to note that both filters work perfectly
well by themselves, since they presumably have a
technique to recognize their own IRPs; but when combined,
each sees only the other filter’s IRPs. This is an
interoperability problem. It might be solved by shadow
devices, provided they do not themselves break
interoperability by bypassing lower filters that may be
required in accesses to the file. Shadow devices may be
a clever method for avoiding both reentrancy and the
need to roll IRPs.
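Tony describes the shadow-device idea further down; in rough outline
(names invented, error handling trimmed, ntifs.h assumed) it amounts to:

#include <ntifs.h>

PDEVICE_OBJECT gShadowDevice;       /* assumed globals, for brevity */
PDEVICE_OBJECT gAttachedToDevice;   /* the FSD (or filter) device below us */

/* At attach time: create a second, privately named device that filters nothing. */
NTSTATUS
MyFilterCreateShadowDevice(PDRIVER_OBJECT DriverObject, PDEVICE_OBJECT FsdDevice)
{
    UNICODE_STRING name;
    NTSTATUS status;

    RtlInitUnicodeString(&name, L"\\Device\\MyFilterShadow0001");
    status = IoCreateDevice(DriverObject, 0, &name,
                            FILE_DEVICE_DISK_FILE_SYSTEM, 0, FALSE,
                            &gShadowDevice);
    if (!NT_SUCCESS(status)) {
        return status;
    }
    /* Make room on the IRP stack for the stack we will forward into. */
    gShadowDevice->StackSize = FsdDevice->StackSize + 1;
    gShadowDevice->Flags &= ~DO_DEVICE_INITIALIZING;
    return STATUS_SUCCESS;
}

NTSTATUS
ShadowFilterDispatch(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
    if (DeviceObject == gShadowDevice) {
        /* Anything arriving here is, by construction, one of our own
           nested operations: pass it straight to the FSD with no
           filtering, so it never re-enters our normal path. */
        IoSkipCurrentIrpStackLocation(Irp);
        return IoCallDriver(gAttachedToDevice, Irp);
    }

    /* Normal filtering path. When we need a nested open ourselves, we
       issue ZwCreateFile against L"\\Device\\MyFilterShadow0001\\<path>"
       so that it comes back on the shadow device above, not here. */
    IoSkipCurrentIrpStackLocation(Irp);
    return IoCallDriver(gAttachedToDevice, Irp);
}

The catch, as noted above, is that anything opened through the shadow
device also bypasses every filter layered above us.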
While it is not legitimate to bypass lower filters (in the
general case), it seems logically necessary for there to
exist an ordering among filters that permits lower ones
to bypass higher ones. That is, reentrance must be
avoidable (and avoided) among filters, or the problems
above can result.
By sending IRPs strictly downward – either by rolling IRPs
for nested operations and calling the driver we are attached
to, or by sending them to shadow devices – reentrance is
avoided. Rolling IRPs seems good general practice, and it
has a significant further benefit: replacing Zw calls with
rolled IRPs in one driver resulted in an enormous performance
improvement, even in the absence of other filters. The Zw
paths are expensive, and recognizing your own IRPs adds
expense – it’s best to avoid both.
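To make “rolling an IRP” concrete, here is a much-simplified sketch of a
synchronous, non-cached read rolled by hand and sent to the device we are
attached to, instead of calling ZwReadFile. It assumes ntifs.h,
PASSIVE_LEVEL, an already-opened FileObject, and a nonpaged Buffer; names
are placeholders and most error handling (and all of the Create and
buffering subtleties mentioned above) is omitted:

#include <ntifs.h>

static NTSTATUS
RolledReadComplete(PDEVICE_OBJECT DeviceObject, PIRP Irp, PVOID Context)
{
    UNREFERENCED_PARAMETER(DeviceObject);
    UNREFERENCED_PARAMETER(Irp);
    /* We allocated the IRP, so we reclaim it: signal the waiter and stop
       the I/O manager's normal completion processing here. */
    KeSetEvent((PKEVENT)Context, IO_NO_INCREMENT, FALSE);
    return STATUS_MORE_PROCESSING_REQUIRED;
}

NTSTATUS
MyFilterRolledRead(PDEVICE_OBJECT AttachedToDevice,  /* FSD device below us */
                   PFILE_OBJECT FileObject,          /* already-opened file */
                   PVOID Buffer,                     /* nonpaged buffer     */
                   ULONG Length,
                   LARGE_INTEGER Offset,
                   PIO_STATUS_BLOCK IoStatus)
{
    KEVENT event;
    PIRP irp;
    PMDL mdl;
    PIO_STACK_LOCATION irpSp;

    irp = IoAllocateIrp(AttachedToDevice->StackSize, FALSE);
    if (irp == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    mdl = IoAllocateMdl(Buffer, Length, FALSE, FALSE, irp);
    if (mdl == NULL) {
        IoFreeIrp(irp);
        return STATUS_INSUFFICIENT_RESOURCES;
    }
    MmBuildMdlForNonPagedPool(mdl);          /* Buffer is assumed nonpaged */

    irp->UserBuffer = Buffer;
    irp->Flags = IRP_NOCACHE | IRP_READ_OPERATION | IRP_SYNCHRONOUS_API;
    irp->RequestorMode = KernelMode;
    irp->Tail.Overlay.Thread = PsGetCurrentThread();
    irp->Tail.Overlay.OriginalFileObject = FileObject;

    irpSp = IoGetNextIrpStackLocation(irp);
    irpSp->MajorFunction = IRP_MJ_READ;
    irpSp->FileObject = FileObject;
    irpSp->Parameters.Read.Length = Length;
    irpSp->Parameters.Read.ByteOffset = Offset;

    KeInitializeEvent(&event, NotificationEvent, FALSE);
    IoSetCompletionRoutine(irp, RolledReadComplete, &event, TRUE, TRUE, TRUE);

    (VOID) IoCallDriver(AttachedToDevice, irp);
    KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL);

    *IoStatus = irp->IoStatus;               /* status and bytes read */

    irp->MdlAddress = NULL;                  /* we clean up what we built */
    IoFreeMdl(mdl);
    IoFreeIrp(irp);
    return IoStatus->Status;
}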
There are alternatives, but they are best reserved for
special cases. Some drivers don’t block IRPs with a
kernel requestor mode. Some don’t need more than
one instance of filtering, regardless of the original caller,
so can pass through operations on a file that they are
already handling. Sometimes ZwCreateFile is wanted, e.g.
to handle STATUS_REPARSE.
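The first of those alternatives is simple to show; the device-extension
layout here is hypothetical:

#include <ntifs.h>

typedef struct _MYFILTER_DEVEXT {        /* hypothetical extension layout */
    PDEVICE_OBJECT AttachedToDevice;
} MYFILTER_DEVEXT, *PMYFILTER_DEVEXT;

NTSTATUS
MyFilterCreate(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
    PMYFILTER_DEVEXT devExt = DeviceObject->DeviceExtension;

    /* Don't filter kernel-mode creates at all: our own Zw calls (and other
       drivers') arrive with RequestorMode == KernelMode and simply pass
       through, so they cannot recurse into the filtering logic below. */
    if (Irp->RequestorMode == KernelMode) {
        IoSkipCurrentIrpStackLocation(Irp);
        return IoCallDriver(devExt->AttachedToDevice, Irp);
    }

    /* ... filtering logic for user-mode creates, then pass down as usual ... */
    IoSkipCurrentIrpStackLocation(Irp);
    return IoCallDriver(devExt->AttachedToDevice, Irp);
}

As said, this is only acceptable when missing every kernel-mode open,
including other drivers’, is acceptable.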
(3) For rolling FS IRPs, it helps to study the IFS Kit
FS sources, stay close to the techniques there, and
refer to them often. Nonetheless, there is still an
enormous gap between the samples needed and the
samples available. It can even be necessary to
disassemble some very gnarly OS routines, to get the
more sophisticated scenarios right. It is obvious that
several people on this list have done exactly that.
Geoff
-----Original Message-----
From: Jamey Kirby [mailto:xxxxx@storagecraft.com]
Sent: Saturday, July 15, 2000 10:01 AM
To: File Systems Developers
Subject: [ntfsd] RE: preventing recursive loop in create dispatch handler
Another possible solution (one that I have used) is to create your own file
object using IoCreateStreamFileObject() and then build your own IRP to send
to the target below you. The only problem with this is that you might bypass
some critical filter above you.
Jamey
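In outline, that approach looks something like the fragment below. Only the
file-object half is shown; the hand-built IRP_MJ_CREATE that must follow
(with its own ACCESS_STATE and security context) is the hard part and is
omitted, and the variable names are placeholders:

#include <ntifs.h>

/* gAttachedToDevice: the FSD (or lower filter) device we sit on top of. */
extern PDEVICE_OBJECT gAttachedToDevice;

VOID
SketchOfStreamFileOpen(VOID)
{
    PFILE_OBJECT fileObject;

    fileObject = IoCreateStreamFileObject(NULL, gAttachedToDevice);
    if (fileObject != NULL) {
        /* The new file object is not yet opened by the file system: a rolled
           IRP_MJ_CREATE still has to be built and sent to gAttachedToDevice
           (that is where the "bypass anything above us" effect comes from).
           When done with it, drop the reference. */
        ObDereferenceObject(fileObject);
    }
}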
-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Sara Abraham
Sent: Friday, July 14, 2000 3:40 PM
To: File Systems Developers
Subject: [ntfsd] RE: preventing recursive loop in create dispatch handler

Thanks Tony, yes this all makes sense.

I think that there is another way that might be simpler than the ones
mentioned:

If the filter issues a ZwCreateFile() that it wants to later recognize, it
first issues an IoAllocateIrp() (a dummy IRP which the filter will be able
to identify) followed by an IoEnqueueIrp() (which will queue the IRP to the
current thread). ZwCreateFile() will allocate an IRP and will also queue it
to the thread’s list (at the head, and still in the same thread). So no
matter what happens in between, other IRPs may be queued to the thread, but
those two will stay together. When the filter’s dispatch routine is invoked,
if the IRP is followed by the dummy IRP (Irp->ThreadListEntry), then it’s a
recursive call. This assumes that the IRP that the I/O manager created in
ZwCreateFile() is the one that the filter sees.

The ‘shadow device’ solution is fine too. It has the side effect that
filters that attached to the original device won’t see those
ZwCreateFile()-s. On the other hand, filters may notice the ‘shadow device’,
attach to it, and get confused.

I agree with your mutex deadlock examples; they demonstrate the rule that a
mutex should not be held across calls to other drivers. I am a bit unclear
about the HSM example and the “well known” thread technique. If we have a
filter driver on top of HSM that also posts requests to a worker thread,
then HSM won’t recognize its threads.

I agree with your TopLevelIrp comments. I think the mechanism could be
enhanced to detect recursion in filter drivers, similar to the way NTFS/FAT
use it. But, as you said, without any established rules and guidelines, it’s
impossible to implement.

Sara
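The save/restore that Tony mentions below, around a nested Zw call, looks
roughly like this. It is a sketch only, with hypothetical names; whether
NULL or some driver-defined marker is the right value to install is exactly
the kind of rule that is not yet established:

#include <ntifs.h>

NTSTATUS
MyFilterNestedOpen(POBJECT_ATTRIBUTES ObjectAttributes, PHANDLE Handle)
{
    /* Save whatever TopLevelIrp state is in place for this thread, clear it
       so the nested request looks top-level to the FSD, and put it back
       afterwards so the caller's state is not corrupted. */
    PIRP savedTopLevelIrp = IoGetTopLevelIrp();
    IO_STATUS_BLOCK ioStatus;
    NTSTATUS status;

    IoSetTopLevelIrp(NULL);

    status = ZwCreateFile(Handle,
                          FILE_READ_DATA | SYNCHRONIZE,
                          ObjectAttributes,
                          &ioStatus,
                          NULL,                        /* AllocationSize */
                          FILE_ATTRIBUTE_NORMAL,
                          FILE_SHARE_READ,
                          FILE_OPEN,
                          FILE_SYNCHRONOUS_IO_NONALERT | FILE_NON_DIRECTORY_FILE,
                          NULL,
                          0);

    IoSetTopLevelIrp(savedTopLevelIrp);
    return status;
}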
-----Original Message-----
From: Tony Mason
> To: File Systems Developers
> Date: Thursday, July 13, 2000 8:04 AM
> Subject: [ntfsd] RE: preventing recursive loop in create dispatch hand ler
>
>
> >Sara,
> >
> >The original discussion dealt with detecting recursion within a
> file system
> >filter driver for IRP_MJ_CREATE handling. Since a filter that
> is recursing
> >back into the storage stack has not yet reached the file system
> layer, the
> >NTFS/FAT mechanism wouldn’t be useful in detecting same.
> >
> >Even in a case where you were using the TopLevelIrp field, it
> does not work
> >correctly in the case where there are TWO threads involved in processing
> the
> >operation. This might occur, for instance, for an HSM filter where the
> file
> >must be migrated back from tape (or optical, or whatever) to
> disk. An HSM
> >filter would block the IRP_MJ_CREATE and then dispatch the operation to a
> >service. The service (using a separate thread, typically) would then
> >migrate the file back. It could do so by opening the file (but this will
> be
> >passed by the filter since it comes from the “well known” thread of this
> >separate service) and pouring the contents back into the file.
> >
> >This scheme works fine. The filter detects the “recursion” of the second
> >thread and passes it correctly.
> >
> >Now, imagine combining this with a second filter that uses the mutex
> >approach. In this case, the first thread acquires the mutex and
> passes the
> >call to the HSM filter. The HSM filter then dispatches to its service.
> The
> >service thread then attempts to open the file. When the filter
> attempts to
> >acquire the mutex, the operation will block because the mutex is owned by
> >the other thread.
> >
> >That was the scenario with which I was concerned.
> >
> >With respect to using the top level IRP field to detect recursion, if you
> >are constructing an FSD, you own that field and you can use it in any way
> >you wish (note that the network redirector uses it to point to a context
> >data structure, not an IRP.) If you are constructing a filter
> driver (or a
> >file system layered on top of another file system) you should
> not use that
> >field (you can if you are willing to save/restore that state information
> >around calls to/from the FSD, in which case it is typically just
> as easy to
> >maintain a nice CPU-cached hash table of thread local storage
> information.)
> >
> >I suppose one could use the mutex model where you made the mutex
> the thread
> >local storage. That only just occurred to me because using a
> mutex in such
> a
> >case would be quite heavyweight - after all, once you have thread local
> >storage, all you need to maintain is a counter (increment when you enter
> the
> >filter, decrement when you leave the filter.)
> >
> >Does this make sense now?
> >
> >Regards,
> >
> >Tony
> >
> >Tony Mason
> >Consulting Partner
> >OSR Open Systems Resources, Inc.
> >http://www.osr.com
> >
> >
> >-----Original Message-----
> >From: Sara Abraham [mailto:xxxxx@veritas.com]
> >Sent: Wednesday, July 12, 2000 5:55 PM
> >To: File Systems Developers
> >Subject: [ntfsd] RE: preventing recursive loop in create
> dispatch hand ler
> >
> >
> >Tony,
> >
> > I must be missing something. NTFS and FAT rely on the TopLevelIrp
> >mechanism. If I have a filter driver on top of NTFS/FAT that decides to
> post
> >a write request, for example, to a different thread, without updating the
> >new thread’s TopLevelIrp information (really copying the information from
> >the original thread), then NTFS/FAT are not going to work properly. So, I
> >think that whatever a filter driver decides to do, it cannot do anything
> >that will corrupt the TopLevelIrp info. To detect recursion in a filter
> >driver, we should be able to use the TopLevelIrp mechanism that
> FastFat is
> >using.
> >
> >sara
> >-----Original Message-----
> >From: Tony Mason
> >To: File Systems Developers
> >Date: Wednesday, July 12, 2000 1:05 PM
> >Subject: [ntfsd] RE: preventing recursive loop in create
> dispatch hand ler
> >
> >
> >>Sara,
> >>
> >>This works until you load a filter that does NOT access the file in the
> >same
> >>thread context. I’ve seen implementations which do this (for example,
> they
> >>open the file in a service to examine it, or to obtain other information
> >>about the file.) Those implementations work fine (because they
> track that
> >>it is a “special thread” accessing the file and hence do not block such
> >>access) and your implementation will work fine.
> >>
> >>When you put the two together (such as with layered filter drivers) they
> >>will not function properly. Thus, it isn’t that this solution doesn’t
> work
> >>by itself, it is that this solution does not work when combined with a
> >>filter that works differently. I’m not proposing a hypothetical case
> here,
> >>this is a problem I have actually seen more than once
> (interaction between
> >>multiple filters using differing techniques to detect thread recursion.)
> >>
> >>Another potential solution I’ve been considering is that for each device
> >>object you create that filters an existing FSD volume, you
> create a SECOND
> >>device object that does NOT filter anything. Then, when you
> need to issue
> >a
> >>ZwCreateFile, you issue it against your second device object
> >>(“\Device\MyFilterAlternativeDevice0063A” or whatever unique name you
> >decide
> >>to use.) Then, when you receive the IRP against the second
> device object,
> >>you send it down to the original FSD. That bypasses the requirement to
> >>build an IRP_MJ_CREATE, detects the recursion (since nobody else is
> calling
> >>your device object) and avoids the difficulty of having other filter
> >drivers
> >>involved (since it is unlikely that any other filter driver
> would attach
> >to
> >>your random named device object.)
> >>
> >>I hope this clarifies things (or perhaps it muddies the water
> even more.)
> >>This whole issue of detecting recursion is an ugly one, and I’ve seen at
> >>least a half-dozen approaches, each of which has some special drawback.
> No
> >>doubt, I’ll find a drawback in the scheme I just suggested one of these
> >days
> >>as well. I like to think of it as “job security.”
> >>
> >>Regards,
> >>
> >>Tony Mason
> >>Consulting Partner
> >>OSR Open Systems Resources, Inc.
> >>http://www.osr.com
> >>
> >>
> >>-----Original Message-----
> >>From: Sara Abraham [mailto:xxxxx@veritas.com]
> >>Sent: Wednesday, July 12, 2000 3:54 PM
> >>To: File Systems Developers
> >>Subject: [ntfsd] RE: preventing recursive loop in create
> dispatch hand ler
> >>
> >>
> >>Tony,
> >>
> >> I also cannot understand the problem that you see with Marc’s solution.
> >>Marc’s solution is equivalent to a simpler and nicer solution
> of using top
> >>level IRPs (IoGetTopLevelIrp()/IoSetTopLevelIrp) which will detect
> >recursion
> >>within a thread. We assume that when CreateDispatchHandler()
> is invoked a
> >>second time (on behalf of its own ZwCreateFile()), it will
> happen in the
> >>same thread context as the first CreateDispatchHandler(). Isn’t this
> always
> >>a true assumption ? We don’t worry about other threads
> accessing the same
> >>file.
> >>
> >>Sara
> >>
> >>-----Original Message-----
> >>From: Tony Mason
> >>To: File Systems Developers
> >>Date: Wednesday, July 12, 2000 12:04 PM
> >>Subject: [ntfsd] RE: preventing recursive loop in create
> dispatch hand ler
> >>
> >>
> >>>Marc,
> >>>
> >>>The point is that the original IRP_MJ_CREATE could be blocked while a
> >>>DIFFERENT thread accesses the (same) file. That doesn’t violate any
> rules
> >>>and is an implementation that I’ve seen numerous times.
> >>>
> >>>Of course, it is possible to process an IRP_MJ_CREATE in
> arbitrary thread
> >>>context, but it isn’t common, and it can be complex (and FSD dependent)
> to
> >>>get it right. But the problems typically involve security context
> >>>(credentials) and not memory (since everything has been
> captured when the
> >>>IRP was constructed.) So this wasn’t the case that concerned me very
> >much.
> >>>
> >>>As anyone who has attended the PlugFests at Microsoft can attest,
> >>>filter-to-filter interactions are a serious problem. IRP_MJ_CREATE
> >>handling
> >>>has been a source of a fair number of those problems.
> >>>
> >>>Regards,
> >>>
> >>>Tony
> >>>
> >>>Tony Mason
> >>>Consulting Partner
> >>>OSR Open Systems Resources, Inc.
> >>>http://www.osr.com
> >>>
> >>>-----Original Message-----
> >>>From: Marc Sherman [mailto:xxxxx@bionetrix.com]
> >>>Sent: Wednesday, July 12, 2000 2:52 PM
> >>>To: File Systems Developers
> >>>Subject: [ntfsd] RE: preventing recursive loop in create dispatch hand
> ler
> >>>
> >>>
> >>>
> >>>Tony,
> >>>
> >>>I thought fsd entry points must be called in the context of the
> >>requesting
> >>>thread. This implies that fs filters should not cause any change in
> thread
> >>>context. In this case, it should be the same thread that originally
> called
> >>>ZwCreateFile.
> >>>
> >>>Marc
> >>>
> >>>> -----Original Message-----
> >>>> From: Tony Mason [mailto:xxxxx@osr.com]
> >>>> Sent: Wednesday, July 12, 2000 2:36 PM
> >>>> To: File Systems Developers
> >>>> Subject: [ntfsd] RE: preventing recursive loop in create
> >>>> dispatch hand ler
> >>>>
> >>>>
> >>>> Marc,
> >>>>
> >>>> The general problem with such approaches is that they “work”
> >>>> until you begin
> >>>> to deal with interactions involving other filter drivers,
> >>>> when any change in
> >>>> thread context will cause you to deadlock (because the mutex is not
> >>>> available, the second thread blocks. The first thread waits
> >>>> for the second
> >>>> thread to finish.) This is a problem that can occur when
> >>>> someone else’s
> >>>> filter is using handles (and thus switches to a known thread
> >>>> context) or
> >>>> uses captive threads (in a user service, or even system threads.)
> >>>>
> >>>> Regards,
> >>>>
> >>>> Tony Mason
> >>>> Consulting Partner
> >>>> OSR Open Systems Resources, Inc.
> >>>> http://www.osr.com
> >>>>
> >>>>
> >>>> -----Original Message-----
> >>>> From: Marc Sherman [mailto:xxxxx@bionetrix.com]
> >>>> Sent: Wednesday, July 12, 2000 2:32 PM
> >>>> To: File Systems Developers
> >>>> Subject: [ntfsd] RE: preventing recursive loop in create
> >>>> dispatch hand ler
> >>>>
> >>>>
> >>>> Before calling ZwCreateFile, acquire a mutex, then check to see if
> >>>> you’ve acquired it recursively (mutex.Header.SignalState < 0). If this
> >>>> is true, you’re handling the IRP that resulted from your ZwCreateFile.
> >>>> Release the mutex and return. Remember to release your mutex after your
> >>>> ZwClose as well. This works for one of our filters.
> >>>>
> >>>> good luck,
> >>>> Marc
> >>>>
> >>>> -----Original Message-----
> >>>> From: Smith, Joel [mailto:xxxxx@ntpsoftware.com]
> >>>> Sent: Wednesday, July 12, 2000 2:21 PM
> >>>> To: File Systems Developers
> >>>> Subject: [ntfsd] RE: preventing recursive loop in create
> >>>> dispatch hand ler
> >>>>
> >>>>
> >>>>
> >>>> Thanks,
> >>>> In my case, though, I need to open the file before
> >>>> the file system
> >>>> because it is important that I know the file’s size (if the
> >>>> file exists),
> >>>> and, of course, a create might change the file’s size (overwrite,
> >>>> superseded, etc). I suppose I could back up to the parent
> >>>> directory and do a
> >>>> directory query for the file in question to determine its
> >>>> size. Then again,
> >>>> I wonder if FsRtlGetFileSize is smart enough to work with a
> >>>> FILE_OBJECT that
> >>>> has not been opened yet? Anyway, I’d still be interested in
> >>>> answers to my
> >>>> original question because just recognizing a recursive open
> >>>> may be simpler
> >>>> than solving the problem another way (like I mention above).
> >>>>
> >>>> Thanks,
> >>>> Joel
> >>>>
> >>>> -----Original Message-----
> >>>> From: Pavel Hrdina [mailto:xxxxx@sodatsw.cz]
> >>>> Sent: Wednesday, July 12, 2000 2:09 PM
> >>>> To: File Systems Developers
> >>>> Subject: [ntfsd] RE: preventing recursive loop in create
> >>>> dispatch handler
> >>>>
> >>>>
> >>>> The best solution is to let the FSD process the desired open first
> >>>> and then you can read from the file. You do not need to close it
> >>>> because you’re using the same file object as the successful requestor
> >>>> of this create request.
> >>>>
> >>>> Paul
> >>>>
> >>>> > -----Original Message-----
> >>>> > From: Smith, Joel [SMTP:xxxxx@ntpsoftware.com]
> >>>> > Sent: 12 July 2000 20:00
> >>>> > To: File Systems Developers
> >>>> > Subject: [ntfsd] preventing recursive loop in create
> >>>> dispatch handler
> >>>>
> >>>> >
> >>>> > Hello,
> >>>> > I know this has been asked before, and I apologize
> >>>> for asking
> >>>> > again, but…
> >>>> > Can anyone suggest a good way to prevent a
> >>>> recursive loop when
> >>>> > opening the destination file for a create in the create
> >>>> dispatch routine.
> >>>> >
> >>>> > for example:
> >>>> >
> >>>> > CreateDispatchHandler(device, irp)
> >>>> > {
> >>>> > -ZwCreateFile(target of create) - this will cause a
> >>>> > recursive loop, obviously
> >>>> >
> >>>> > -read some settings from file if it exists
> >>>> >
> >>>> > -ZwClose(target of create)
> >>>> > }
> >>>> >
> >>>> > I believe there is a ‘well known’ solution to this
> >>>> problem, but I
> >>>> > don’t know it.
> >>>> >
> >>>> > Thanks,
> >>>> > Joel
> >>>> >
> >>>>