algorithm help - sort of long and involved... sorry

Here’s a problem that sounds easier than it is:
Say you are writing a file system filter to enforce size limits on directories on a system. One of the things you would need to do is calculate the size of the directories that the driver manages. This is simple enough; just write a function that walks the directory tree and tallies up all files in all subdirectories of the directory.
Since this tally routine is implemented in a file system filter, it seems logical that you should be able to tally the size of the directory while simultaneously allowing IO (possibly IO that changes the logical size of the directory you are tallying) to pass through without screwing up the total. For example, say you receive a write request which will write data beyond the end of some file, thus extending it. Now say that the write is directed to the directory whose size you are tallying. My intuition tells me that this should be a tractable problem, but I am having trouble coming up with a solution.
I guess the main question that need answers at the time of the file size changing io is:
a) Will this IO change the logical size of the directory being tallied.
b) Has the file whose size is changing already been added to the tally by the tally routine.

The first question is easy, the second is a little trickier. If you can know that the file whose size is changing has already been looked at by the tally routine, than you simply adjust the running tally by the delta. If you know that the file size hasn’t been looked at by the tally routine, then you simply let the size change without intervention.
Given intimate knowledge of the tally algorithm, it would be trivial to determine if the file in question has been looked at by the tally routine by examining the path that is currently being examined and comparing it to the path of the file whose size is changing. This would be easy IF you could rely on a query directory request returning directory entries in a predictable order. However, you cannot assume ordering for all file systems (FAT for example, right?). So this wouldn’t work. You could recurse through the directories in a predictable order if you cached directory entries in a sorted data structure and recursed through the subdirectories in sorted order rather than the arbitrary order returned by query directory. This is expensive and painful, however and you’d eat up a lot of memory in very large and deep directory trees.
Another approach would be to accumulate the indexes of the file’s that have been examined by the tally routine in a sorted data structure. When deciding if a particular file has been examined yet, you would simply look into this buffer for the file in question’s index, which would tell you if the file has been added to the tally yet. Again, though, this approach would become prohibitively expensive for very large and deep directories.
Anyway, there are several additional approaches I can come up with, but intuition tells me that they are all overkill and there must be simple solution to this problem that I am not thinking of. Can anyone think of an efficient, elegant approach to solving this problem? Any suggestions would be appreciated.

If you really need to allow concurrent processing while generating the tally for a directory sub-tree, then assembling a sorted list of file IDs already processed may be the only way to manage it efficiently. This will also track access via alternate paths (hard links) that a simple path parse would not catch (though that will be a problem during subsequent processing unless you retain that list for as long as you want to enforce the limits).

You may also want to build a list of the directories in the sub-tree to catch effects of directory rename operations - which raises the question of whether you need to track per-sub-directory tallies within the tree for as long as you enforce limits (which suggests a tree structure to hold both IDs and sizes in addition to the simple ID list suggested above).

Such solutions get expensive for large sub-trees. Are you sure you can’t use W2K (or third-party) quota facilities to accomplish whatever your goal may be?

  • bill
    ----- Original Message -----
    From: Melissa Badenna
    To: File Systems Developers
    Sent: Thursday, April 27, 2000 11:55 PM
    Subject: [ntfsd] algorithm help - sort of long and involved… sorry

Here’s a problem that sounds easier than it is:
Say you are writing a file system filter to enforce size limits on directories on a system. One of the things you would need to do is calculate the size of the directories that the driver manages. This is simple enough; just write a function that walks the directory tree and tallies up all files in all subdirectories of the directory.
Since this tally routine is implemented in a file system filter, it seems logical that you should be able to tally the size of the directory while simultaneously allowing IO (possibly IO that changes the logical size of the directory you are tallying) to pass through without screwing up the total. For example, say you receive a write request which will write data beyond the end of some file, thus extending it. Now say that the write is directed to the directory whose size you are tallying. My intuition tells me that this should be a tractable problem, but I am having trouble coming up with a solution.
I guess the main question that need answers at the time of the file size changing io is:
a) Will this IO change the logical size of the directory being tallied.
b) Has the file whose size is changing already been added to the tally by the tally routine.

The first question is easy, the second is a little trickier. If you can know that the file whose size is changing has already been looked at by the tally routine, than you simply adjust the running tally by the delta. If you know that the file size hasn’t been looked at by the tally routine, then you simply let the size change without intervention.
Given intimate knowledge of the tally algorithm, it would be trivial to determine if the file in question has been looked at by the tally routine by examining the path that is currently being examined and comparing it to the path of the file whose size is changing. This would be easy IF you could rely on a query directory request returning directory entries in a predictable order. However, you cannot assume ordering for all file systems (FAT for example, right?). So this wouldn’t work. You could recurse through the directories in a predictable order if you cached directory entries in a sorted data structure and recursed through the subdirectories in sorted order rather than the arbitrary order returned by query directory. This is expensive and painful, however and you’d eat up a lot of memory in very large and deep directory trees.
Another approach would be to accumulate the indexes of the file’s that have been examined by the tally routine in a sorted data structure. When deciding if a particular file has been examined yet, you would simply look into this buffer for the file in question’s index, which would tell you if the file has been added to the tally yet. Again, though, this approach would become prohibitively expensive for very large and deep directories.
Anyway, there are several additional approaches I can come up with, but intuition tells me that they are all overkill and there must be simple solution to this problem that I am not thinking of. Can anyone think of an efficient, elegant approach to solving this problem? Any suggestions would be appreciated.

You might try the following:
When a file extension occurs, you add its current (base) length to a
lookaside list that is indexed by file id, if it’s not already there.
You assume that you have seen it before and add only the delta to the
tally. As you iterate through the file system you check each file id
against the lookaside list. If it’s not there you add its current
length to the tally. If it is in the lookaside list you should only add
the base length recorded in the list and then remove the object from the
list.

This algorithm only requires searches and memory based on current file
system write activity and is independent of the depth of the directory
structure. Just be sure to store the lookaside list in some easy to
search form like a tree.

-----Original Message-----
From: Melissa Badenna [mailto:xxxxx@mediaone.net]
Sent: Thursday, April 27, 2000 8:56 PM
To: File Systems Developers
Subject: [ntfsd] algorithm help - sort of long and involved… sorry

Here’s a problem that sounds easier than it is:
Say you are writing a file system filter to enforce size limits on
directories on a system. One of the things you would need to do is
calculate the size of the directories that the driver manages. This is
simple enough; just write a function that walks the directory tree and
tallies up all files in all subdirectories of the directory.
Since this tally routine is implemented in a file system filter, it
seems logical that you should be able to tally the size of the directory
while simultaneously allowing IO (possibly IO that changes the logical
size of the directory you are tallying) to pass through without screwing
up the total. For example, say you receive a write request which will
write data beyond the end of some file, thus extending it. Now say that
the write is directed to the directory whose size you are tallying. My
intuition tells me that this should be a tractable problem, but I am
having trouble coming up with a solution.
I guess the main question that need answers at the time of the file
size changing io is:
a) Will this IO change the logical size of the directory being tallied.
b) Has the file whose size is changing already been added to the tally
by the tally routine.

The first question is easy, the second is a little trickier. If you
can know that the file whose size is changing has already been looked at
by the tally routine, than you simply adjust the running tally by the
delta. If you know that the file size hasn’t been looked at by the
tally routine, then you simply let the size change without intervention.

Given intimate knowledge of the tally algorithm, it would be trivial
to determine if the file in question has been looked at by the tally
routine by examining the path that is currently being examined and
comparing it to the path of the file whose size is changing. This would
be easy IF you could rely on a query directory request returning
directory entries in a predictable order. However, you cannot assume
ordering for all file systems (FAT for example, right?). So this
wouldn’t work. You could recurse through the directories in a
predictable order if you cached directory entries in a sorted data
structure and recursed through the subdirectories in sorted order rather
than the arbitrary order returned by query directory. This is expensive
and painful, however and you’d eat up a lot of memory in very large and
deep directory trees.
Another approach would be to accumulate the indexes of the file’s
that have been examined by the tally routine in a sorted data structure.
When deciding if a particular file has been examined yet, you would
simply look into this buffer for the file in question’s index, which
would tell you if the file has been added to the tally yet. Again,
though, this approach would become prohibitively expensive for very
large and deep directories.
Anyway, there are several additional approaches I can come up with,
but intuition tells me that they are all overkill and there must be
simple solution to this problem that I am not thinking of. Can anyone
think of an efficient, elegant approach to solving this problem? Any
suggestions would be appreciated.

> Anyway, there are several additional approaches I can come up with, but

intuition tells me that they are all overkill and there must be simple
solution
to this problem that I am not thinking of. Can anyone think of an
efficient,
elegant approach to solving this problem? Any suggestions would be
appreciated.

Block all IRP_MJ_CREATE requests targeted to this directory on some lock
(like the event) for the duration of the tally routine.

For instance:

  • have a spinlock-protected section in your code.
  • declare 2 flags - “open files” (integer) and “tally running” (boolean).
  • declare 2 notification events - “create” event and “tally” even.
  • each CREATE going beyond the interesting dir (check the path) will take
    the spinlock, then - if “tally running” is set, release the lock and go to
    waiting
    for “create” event.
    Else - increment “open files”, reset the “tally” event if “open files” was
    zero,
    release the lock, install a completion routine and proceed.
  • the completion routine will - on success - add the file object to some
    table
    and return STATUS_SUCCESS.
  • each IRP_MJ_CLOSE (or IRP_MJ_CLEANUP is enough) request will
    check whether the file object is in the table, if yes-remove it, take the
    lock,
    decrement the “open files” count, set the “tally” event if it becomes zero,
    then
    unlock and proceed.
  • the tally routine must wait for the “tally” event, then take the lock,
    check
    whether “open files” is 0 (there is a small window between waiting and
    taking the lock when a CREATE could occur) - unlock and return to waiting if
    nonzero, otherwise set the “tally running” to true and reset the “create”
    event,
    unlock and proceed. At the end, take the lock, set “tally running” to false,
    set
    the “create” event and unlock.

This will guarantee a) no files are open when tally starts and b) no files
can
be open for the duration of tally.

Max