Implementing reparsing & HSM: pre or post create?

I’m developing a simple HSM solution in which files are made resident on access and, at client-initiated times, flushed from the client system to the HSM storage.

I’m considering doing this via:
-Inserting a reparse point in the stub files.
-In post-create, checking for STATUS_REPARSE and the reparse tag.
-If the tag is ours, making the file resident by sending a message to user mode via FltSendMessage().
-Completing the create by reissuing it via FltReissueSynchronousIo().
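
The post-create check in the steps above can be sketched as a small, portable C model. This is not driver code: `tag_data_t`, `hsm_action_t`, `hsm_classify_post_create` and the `HSM_REPARSE_TAG` value are hypothetical names made up for illustration. In a real minifilter the two inputs would come from `Data->IoStatus.Status` and `Data->TagData->FileTag` in the post-create callback.

```c
#include <stddef.h>
#include <stdint.h>

/* Same numeric value as STATUS_REPARSE in ntstatus.h. */
#define STATUS_REPARSE 0x00000104u

/* Hypothetical placeholder value, not a registered reparse tag. */
#define HSM_REPARSE_TAG 0x00000123u

/* Models only the part of FLT_TAG_DATA_BUFFER used here (its FileTag field). */
typedef struct {
    uint32_t file_tag;
} tag_data_t;

typedef enum {
    HSM_PASS_THROUGH,       /* not our stub: leave the create result alone */
    HSM_RECALL_AND_REISSUE  /* our stub: recall the data, then retry the create */
} hsm_action_t;

/* Decide what post-create should do with a completed create. */
hsm_action_t hsm_classify_post_create(uint32_t status, const tag_data_t *tag_data)
{
    if (status != STATUS_REPARSE) {
        return HSM_PASS_THROUGH;    /* create succeeded or failed for another reason */
    }
    if (tag_data == NULL) {
        return HSM_PASS_THROUGH;    /* a reparse without tag data is not a stub file */
    }
    if (tag_data->file_tag != HSM_REPARSE_TAG) {
        return HSM_PASS_THROUGH;    /* someone else's reparse point (symlink, etc.) */
    }
    return HSM_RECALL_AND_REISSUE;
}
```

In the filter itself, the `HSM_RECALL_AND_REISSUE` branch is where the FltSendMessage() round trip to the user-mode service and the FltReissueSynchronousIo() call would go; every other case simply returns from post-create without touching the result.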

So my questions: Is this a decent way to implement such a solution? What issues are commonly hit when expanding a simple HSM like this? The SimRep sample in the WDK implements reparsing in pre-create; why, when actual reparse points and STATUS_REPARSE handling in post-create exist?

This sounds like a good way to do it, but I’ve never actually worked on an HSM so I’m not sure what kind of problems you might run into. I’d suggest checking the archives for that.

SimRep demonstrates a different idea: namespace virtualization.

Reparse points in the file system are quite different from returning STATUS_REPARSE to the object manager. These are in a way different mechanisms:

  1. a mechanism for a file system or a filter to tell the object manager to open a different path instead of the one they’re trying to open (the SimRep way).
  2. a mechanism for a file system or a filter to fail a create in a way that indicates the file can’t be opened directly and that some additional information is available (the HSM way). In my opinion this could have used a different status (something like STATUS_ADDITIONAL_INFORMATION).

The main difference here is who uses the information. For the 1st mechanism, the information is intended for the object manager, which will retry the open along a different path. For the 2nd, it’s the IO manager, which will either consume the information (if it’s one of its special reparse points, for symlinks or mountpoints) or fail the open with STATUS_REPARSE_POINT_NOT_RESOLVED.

In my opinion the fact that the IO manager implements volume mountpoints, directory junctions and symlinks using both of these mechanisms is what created the confusion in the first place. What’s actually going on is that for symlinks and such the IO manager gets the information from the file system (which is the 2nd type of STATUS_REPARSE) and then retries the create using the 1st kind of reparse. It would have been much clearer if the 2nd kind of reparse had a different status code.
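
To make the distinction concrete, here is a toy C model of the two kinds of reparse and of the symlink case that uses both. It is only a sketch under stated assumptions: `create_reply_t`, `outcome_t` and `handle_create_reply` are hypothetical names, and the real kernel does not expose anything shaped like this. The two tag constants use the same values as the Windows headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Same values as the Windows SDK/WDK headers. */
#define IO_REPARSE_TAG_MOUNT_POINT 0xA0000003u
#define IO_REPARSE_TAG_SYMLINK     0xA000000Cu

typedef enum {
    REPLY_OPENED,        /* the create succeeded */
    REPLY_NAME_REPARSE,  /* 1st mechanism: "open this other path instead" */
    REPLY_TAG_REPARSE    /* 2nd mechanism: "can't open directly, here is tag data" */
} reply_kind_t;

typedef struct {
    reply_kind_t kind;
    uint32_t tag;        /* meaningful only for REPLY_TAG_REPARSE */
    const char *target;  /* new path, or link target carried in the tag data */
} create_reply_t;

typedef enum { OUTCOME_OPENED, OUTCOME_RETRY, OUTCOME_FAILED } outcome_t;

/* Models what happens above the file system once a create comes back. */
outcome_t handle_create_reply(const create_reply_t *reply, const char **retry_path)
{
    *retry_path = NULL;
    switch (reply->kind) {
    case REPLY_OPENED:
        return OUTCOME_OPENED;
    case REPLY_NAME_REPARSE:
        /* 1st kind: the new name is simply parsed again. */
        *retry_path = reply->target;
        return OUTCOME_RETRY;
    case REPLY_TAG_REPARSE:
        /* 2nd kind: only the built-in tags are consumed, and they are
           turned into a 1st-kind retry along the link target. */
        if (reply->tag == IO_REPARSE_TAG_SYMLINK ||
            reply->tag == IO_REPARSE_TAG_MOUNT_POINT) {
            *retry_path = reply->target;
            return OUTCOME_RETRY;
        }
        /* Any other tag (e.g. an HSM stub nobody handled) fails the open. */
        return OUTCOME_FAILED;
    }
    return OUTCOME_FAILED;
}
```

An HSM filter sits below this logic: it has to catch its own tag in post-create and resolve it there, because an unrecognized tag that reaches the last branch just fails the open.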

Does this make sense?

Thanks,
Alex.

Yes, that makes sense. It seemed to me that the “reparse” phrasing was being used in two different ways, but I thought I’d confirm.

On the HSM front, do you know of any samples or references?

Btw your blog http://fsfilters.blogspot.com/ has been super helpful as I’ve been ramping up on file system drivers. Keep up the awesome work!

-Thanks

I’m glad you like my blog and that it’s been useful.

I’m afraid I don’t know of any samples for HSM filters. Perhaps someone else on the group knows. Still, I think you have the right idea and I don’t think you’ll have too difficult a time getting started.

Thanks,
Alex.