Filespy filter driver

OSR_Community_User · July 26, 2002, 2:40am

Hello,

I’ve come across something curious in the XP version of filespy. I’ve
turned on logging for one of my drives, and then I search the drive
for a particular string (e.g. ‘abcde…’). Filespy logs all the I/O,
but it seems to quit logging names after a while. After a little
debugging and tracing, it seems that the gNamesAllocated variable
quickly reaches its max (500) when you’re searching through files.

Further examination shows me that there are significantly more
IRP_MJ_CREATE IRPs than any other, and each CREATE IRP seems to have a
unique FileObject. Since I want filespy to store filenames more
accurately, without arbitrarily increasing the value of
gMaxNamesToAllocate, is there a definitive way to determine when it’s safe
to remove a FileObject’s name? Also, why does filespy seem to have so
many active FileObjects?

Any help? Thanks

Mike

OSR_Community_User · July 26, 2002, 12:44pm

It depends when you need the file names. Filespy is designed to provide
filenames for as long as possible so that you can see what files are
being accessed for all file system operations, which means that it
cannot purge the names from its cache until it sees the IRP_MJ_CLOSE
operation on a given file object.

The cache manager will keep a reference on a fileobject as long as that
file is in the cache. Depending on the memory available in the system
and the pressure on the memory resources, this reference may stay around
for a long time after all the handles for a given file object have been
closed. This causes the names to stay in filespy’s hash table for a
longer period of time than you may have expected.
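
The lifetime described above can be sketched with a small user-mode
model. This is only an illustration, not filespy or kernel code: the
ModelFileObject type and the model_* functions are hypothetical names
made up for this sketch. It assumes the usual rule that IRP_MJ_CLEANUP
is sent when the last handle goes away and IRP_MJ_CLOSE only when the
last reference goes away.

```c
#include <assert.h>

/* Hypothetical user-mode model of a file object's lifetime. */
typedef struct {
    int handles;       /* open user handles */
    int references;    /* all references, including one per handle */
    int cleanup_sent;  /* set when the model would send IRP_MJ_CLEANUP */
    int close_sent;    /* set when the model would send IRP_MJ_CLOSE */
} ModelFileObject;

/* A successful create: one handle, which also holds one reference. */
void model_open(ModelFileObject *fo) {
    fo->handles = 1;
    fo->references = 1;
    fo->cleanup_sent = 0;
    fo->close_sent = 0;
}

/* An extra reference, e.g. one held while the file's data is cached. */
void model_add_reference(ModelFileObject *fo) {
    fo->references++;
}

/* Dropping the last reference is what triggers the close. */
void model_release_reference(ModelFileObject *fo) {
    assert(fo->references > 0);
    fo->references--;
    if (fo->references == 0)
        fo->close_sent = 1;
}

/* Closing the last handle triggers cleanup, then drops the handle's
 * own reference. Close follows only if nothing else holds one. */
void model_close_handle(ModelFileObject *fo) {
    assert(fo->handles > 0);
    fo->handles--;
    if (fo->handles == 0)
        fo->cleanup_sent = 1;
    model_release_reference(fo);
}
```

In this model, calling model_open, then model_add_reference, then
model_close_handle leaves cleanup_sent set and close_sent clear. The
close is only recorded after a later model_release_reference, which is
the window during which a name would stay in the hash table.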

If you are willing to give up names on IRP_MJ_CLOSE operations, you
could purge names from filespy’s hash table on IRP_MJ_CLEANUP
operations. This would prevent spaces in the hash table from being
filled by the fileobjects for which you are just waiting for the
IRP_MJ_CLOSE operation.
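
As a rough illustration of that trade-off, here is a user-mode sketch
of a bounded name table with a selectable purge policy. It is not
filespy's actual hash table: NameTable, PurgePolicy, table_on_op, and
the tiny MAX_NAMES value are all assumptions made for this example,
with MAX_NAMES standing in for gMaxNamesToAllocate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 4  /* tiny stand-in for gMaxNamesToAllocate */

typedef enum { PURGE_ON_CLOSE, PURGE_ON_CLEANUP } PurgePolicy;
typedef enum { OP_CREATE, OP_CLEANUP, OP_CLOSE } Op;

typedef struct {
    const void *file_object;  /* key: opaque file object address */
    char name[64];
    int in_use;
} NameEntry;

typedef struct {
    NameEntry entries[MAX_NAMES];
    size_t allocated;         /* plays the role of gNamesAllocated */
    PurgePolicy policy;
} NameTable;

void table_init(NameTable *t, PurgePolicy policy) {
    memset(t, 0, sizeof *t);
    t->policy = policy;
}

NameEntry *table_find(NameTable *t, const void *fo) {
    for (size_t i = 0; i < MAX_NAMES; i++)
        if (t->entries[i].in_use && t->entries[i].file_object == fo)
            return &t->entries[i];
    return NULL;
}

/* Returns 1 if the name was stored, 0 if the table is full. */
int table_insert(NameTable *t, const void *fo, const char *name) {
    for (size_t i = 0; i < MAX_NAMES; i++) {
        NameEntry *e = &t->entries[i];
        if (!e->in_use) {
            e->file_object = fo;
            strncpy(e->name, name, sizeof e->name - 1);
            e->name[sizeof e->name - 1] = '\0';
            e->in_use = 1;
            t->allocated++;
            return 1;
        }
    }
    return 0;
}

void table_remove(NameTable *t, NameEntry *e) {
    assert(e->in_use);
    e->in_use = 0;
    t->allocated--;
}

/* Handles one operation. Returns 1 if a name was available to log
 * for it, 0 otherwise. The name argument is only used for OP_CREATE. */
int table_on_op(NameTable *t, Op op, const void *fo, const char *name) {
    if (op == OP_CREATE)
        return table_insert(t, fo, name);
    NameEntry *e = table_find(t, fo);
    if (e == NULL)
        return 0;
    /* Close always purges; cleanup purges only under that policy. */
    if (op == OP_CLOSE || t->policy == PURGE_ON_CLEANUP)
        table_remove(t, e);
    return 1;
}
```

With PURGE_ON_CLOSE, four creates followed only by cleanups fill the
table, and a fifth create gets no name. With PURGE_ON_CLEANUP the same
sequence leaves the table empty, but a later close on any of those
file objects has no name to log.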

In the XP version of filespy, in addition to the (fileobject, filename)
hash table implementation, the alternative of storing the filename in
the filter context of each stream is also shown (mainly in fspyctx.c).
This design allows you to amortize the resources for storing a stream’s
name across all fileobjects that reference that stream. Also, this
design does not have an arbitrary limit on the number of names to
store, as the hash implementation does.
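
The amortization idea can be sketched in user mode as follows. This is
only a model of the concept and does not reproduce fspyctx.c or the
kernel's filter-context routines: StreamContext, FileObject, get_name,
and stream_teardown are hypothetical names for this example. The name
is built once per stream and then shared by every file object that
refers to that stream.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-stream context: one cached name per stream. */
typedef struct {
    char *name;        /* NULL until the first lookup stores it */
    int name_builds;   /* how many times the name had to be built */
} StreamContext;

/* Every open of the same stream shares the same context. */
typedef struct {
    StreamContext *stream;
} FileObject;

/* Returns the stream's cached name, building it on first use from
 * the name seen at create time. Returns NULL if allocation fails. */
const char *get_name(FileObject *fo, const char *name_from_create) {
    StreamContext *ctx = fo->stream;
    if (ctx->name == NULL) {
        size_t len = strlen(name_from_create) + 1;
        ctx->name = malloc(len);
        if (ctx->name == NULL)
            return NULL;
        memcpy(ctx->name, name_from_create, len);
        ctx->name_builds++;
    }
    return ctx->name;
}

/* Frees the cached name when the stream itself goes away. */
void stream_teardown(StreamContext *ctx) {
    free(ctx->name);
    ctx->name = NULL;
}
```

If three file objects point at one StreamContext, three get_name calls
build the name once and return the same pointer each time. The cost is
tied to the number of live streams rather than to a fixed table size.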

One final point to note is that the gMaxNamesToAllocate of 500 is a bit
arbitrary. If you are going to be monitoring operations to a large
number of files, it is perfectly legitimate to raise this limit.

Molly Brown
Microsoft Corporation

This posting is provided “AS IS” with no warranties and confers no
rights.
