Should I redirect IRP_MJ_CREATE requests on directories in minifilter?

Hello folks,

I’m developing a minifilter driver that provides write protection on a chosen directory by redirecting all ops from source dir to another dir that’s on a different volume.

To give a better picture, let’s assume c:\tmp1 is the directory I want to protect from any sort of modification and T:\tmp1 is the place where all operations are redirected.
I achieve this by returning STATUS_REPARSE from the pre-create callback if the file in question has c:\tmp1\ as a prefix and meets certain criteria.
Note that in pre-create I do not differentiate between files and directories; a request is reparsed if it meets the above conditions.
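
The redirection rule described above can be modelled in plain user-mode C. This is only a sketch of the decision logic, not driver code: `redirect_path`, `under_protected`, the hard-coded prefixes, and the case-insensitive comparison are assumptions made for illustration. In a real minifilter, the equivalent step would typically replace the file object name (for example with IoReplaceFileObjectName) and complete the pre-create with STATUS_REPARSE.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical protected prefix and redirect target, taken from the example. */
static const char *kProtected = "C:\\tmp1";
static const char *kTarget    = "T:\\tmp1";

/* Case-insensitive check that `path` is kProtected itself or lies below it.
 * The character after the prefix must be '\\' or end of string, so that
 * "C:\\tmp10" is not treated as being inside "C:\\tmp1". */
static int under_protected(const char *path)
{
    size_t n = strlen(kProtected);
    for (size_t i = 0; i < n; i++) {
        if (path[i] == '\0' ||
            tolower((unsigned char)path[i]) !=
            tolower((unsigned char)kProtected[i]))
            return 0;
    }
    return path[n] == '\0' || path[n] == '\\';
}

/* Returns 1 and writes the redirected path to `out` when the request should
 * be reparsed. Returns 0 when the path is outside the protected directory,
 * or when the result does not fit (in which case `out` may be truncated). */
int redirect_path(const char *path, char *out, size_t out_size)
{
    assert(path != NULL && out != NULL);
    if (!under_protected(path))
        return 0;
    const char *rest = path + strlen(kProtected);
    int written = snprintf(out, out_size, "%s%s", kTarget, rest);
    return written >= 0 && (size_t)written < out_size;
}
```

Note that the directory itself (C:\tmp1) matches as well as anything below it, which is exactly the behaviour that later causes the directory handle to land on T:.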

In addition to this, I also handle IRP_MJ_DIRECTORY_CONTROL (IRP_MN_QUERY_DIRECTORY) to merge the results from both locations and provide the virtualization.
(I handle other IRPs as well but those are irrelevant for this discussion).
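
As a rough illustration of the merge, here is a user-mode model in C. The `DirEntry` type, `merge_listing`, and the rule that the T: entry wins on a name clash are assumptions (the post does not spell out the merge policy), and names are compared case-sensitively only to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;            /* file name within the directory */
    char volume;                 /* 'C' or 'T': where the entry really lives */
    unsigned long long file_id;  /* only meaningful on that volume */
} DirEntry;

/* Returns 1 if `list` has an entry with exactly this name. */
static int contains(const DirEntry *list, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i].name, name) == 0)
            return 1;
    return 0;
}

/* Merge two listings: every redirected (T:) entry is kept, and a protected
 * (C:) entry is added only when no T: entry has the same name.
 * Returns the number of entries written to `out` (at most `cap`). */
size_t merge_listing(const DirEntry *real, size_t n_real,
                     const DirEntry *virt, size_t n_virt,
                     DirEntry *out, size_t cap)
{
    assert(out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < n_virt && n < cap; i++)
        out[n++] = virt[i];
    for (size_t i = 0; i < n_real && n < cap; i++)
        if (!contains(virt, n_virt, real[i].name))
            out[n++] = real[i];
    return n;
}
```

Each merged entry keeps the file ID of the volume it came from, so a single listing can contain IDs that are valid on two different volumes.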

This sort of a design has surfaced a problem when a file is opened with FILE_OPEN_BY_FILE_ID set.
Here are the steps in detail:

c:\tmp1 contains a single file t1.txt, whereas t:\tmp1 is an empty directory.

  1. hDir = CreateFileW("C:\tmp1", GENERIC_READ | GENERIC_WRITE,……); → Here app receives handle to t:\tmp1 in reality because of the re-direction from minifilter.
  2. FileId = NtQueryDirectoryFile(hDir, FileIdBothDirectoryInformation…); → A directory query (IRP_MJ_DIRECTORY_CONTROL / IRP_MN_QUERY_DIRECTORY) is triggered and returns info about C:\tmp1\t1.txt, since there is no such file in t:\tmp1.
  3. NtCreateFile(FILE_OPEN_BY_FILE_ID) → This call fails with STATUS_INVALID_PARAMETER.
    (Do not worry about the parameters much - the function calls work fine when my minifilter is not loaded).

What’s happening is that, because of the redirection in step 1, the user app receives a handle to a directory on T:\ (which has a different volume ID than C:), but the file ID that was returned really belongs to C:\tmp1\t1.txt. And because file IDs are volume-dependent, I think NtCreateFile catches this and doesn’t even send IRP_MJ_CREATE for the corresponding NtCreateFile call with file id set.

So this basically makes me wonder should I even be re-directing IRP_MJ_CREATE on directories?
If the approach followed makes sense to you, can you provide some insights as to how I can get around this?

Thanks

I think NtCreateFile catches this and doesn’t even send IRP_MJ_CREATE for the corresponding NtCreateFile call with file id set.

It sends the request to the volume derived from the handle passed to NtCreateFile in OBJECT_ATTRIBUTES. A directory handle passed via OBJECT_ATTRIBUTES is resolved to FILE_OBJECT->RelatedFileObject and usually becomes the starting (root) point for the file search by ID (though this depends on the file system driver implementation).

Thanks for your reply Slava_Imameev.

I think FltGetFileNameInformation handles such cases transparently, i.e. it produces the correct file name even though the file was opened with FILE_OPEN_BY_FILE_ID. So ideally, I should get an IRP_MJ_CREATE for file T:\tmp1\t1.txt with FILE_OPEN_BY_FILE_ID set, which never happens.

On second thoughts, I think it’s the IO Manager that does the comparison, and once it finds out that the returned file ID does not belong to the volume derived from the handle, it simply doesn’t send the request. This is much like a cross-volume move operation, where IRP_MJ_SET_INFORMATION is not even sent down by the IO Manager when it sees that two different volumes are involved.

Did you find a solution? It sounds like you’re between a rock and a hard place. If I understand the situation correctly, they should not be able to get a file id to the original file, because in order to do so, they need an open handle to it, which you would have redirected to the new volume/location. Thus they would have a file id to the T:\tmp1 copy. However, FILE_OPEN_BY_FILE_ID requires a handle to a parent directory that shows which volume the id is on. They could use a handle to "C:" thinking that their file id is for C:\tmp1\t1.txt, where it is really for the T:\ copy, so the create would fail.

On the other hand, if they do have the file id for the C:\ file, then they could open C:, missing your redirector, and access the file, or they could open C:\tmp1, in which case your code would replace it with T:\tmp1, and the C: file id would fail.

No, I haven’t found a solution yet.

I was thinking of stopping the reparsing of IRP_MJ_CREATE on directories completely, but realized that would not solve the problem.
Consider a scenario where c:\tmp1 and t:\tmp1 both have t1.txt, and the copy in t:\tmp1 is the latest one (since modifications were redirected to this file).
What my app would then do is:

  1. Obtain a handle to c:\tmp1\
  2. Obtain the file ID using NtQueryDirectoryFile(). Since I merge the results while handling IRP_MJ_DIRECTORY_CONTROL (IRP_MN_QUERY_DIRECTORY), the file ID received would belong to t:\tmp1\t1.txt
  3. Call NtCreateFile(). This would fail with STATUS_INVALID_PARAMETER, since the file ID is from a different volume (t:) than the volume derived from c:\tmp1.

So overall, FILE_OPEN_BY_FILE_ID in a reparsing minifilter driver seems kind of hard to get to work.
Any ideas how to get around this?

Reparsing across volumes always has weird nasty edges due to things above you seeing data from two different places.

I can back of the napkin some ways out of this, but I think there’s something missing in your analysis that’s bothering me:

On second thoughts, I think it’s the IO Manager that does the comparison, and once it finds out that the returned file ID does not belong to the volume derived from the handle, it simply doesn’t send the request.

That cannot be what’s happening. File IDs are only unique per-volume because that’s the way they are implemented by the file system. There’s nothing about the file ID itself that indicates which volume it came from.
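
To illustrate the point about per-volume IDs, here is a toy user-mode model in C. The `Volume` and `IdRecord` types and `open_by_id` are hypothetical and only mimic the idea that each file system keeps its own ID table; they are not how any real file system stores IDs.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned long long file_id;  /* a plain number, with no volume tag in it */
    const char *name;
} IdRecord;

typedef struct {
    char letter;                 /* drive letter, for readability only */
    const IdRecord *records;     /* this volume's private ID table */
    size_t count;
} Volume;

/* Look up an ID on ONE volume. The ID carries no volume information, so the
 * result depends entirely on which volume is asked. Returns the file name,
 * or NULL when this volume has no file with that ID. */
const char *open_by_id(const Volume *vol, unsigned long long file_id)
{
    assert(vol != NULL);
    for (size_t i = 0; i < vol->count; i++)
        if (vol->records[i].file_id == file_id)
            return vol->records[i].name;
    return NULL;
}
```

In this model, asking the T: table for an ID that was issued by C: either misses or, if the number happens to exist there, names an unrelated file. Nothing above the file system can tell the difference from the number alone.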

You need to figure out where the failure is coming from. Do you have a Process Monitor trace for the failing case?

Sorry Scott for getting back after such a long gap (I got busy with something else).
The reason I came to the above conclusion (i.e. the IO Manager detecting different volumes and thus not sending IRP_MJ_CREATE down the stack) was the output of the test below:

  1. Query the file ID using the sequence:
    • hRealDir = CreateFile("C:\tmp1", …)
    • NtQueryDirectoryFile(hRealDir, …, FileIdBothDirectoryInformation, "C:\tmp1\t1.txt"…)
  2. Open the file by file ID using the sequence:
    • hVirtualDir = CreateFile("T:\root") → This simulates the step taken by the minifilter (i.e. reparsing access to c:\tmp1 to t:\root).
    • NtCreateFile(hVirtualDir, …, FILE_OPEN_BY_FILE_ID | FILE_NON_DIRECTORY_FILE,…)

The call to NtCreateFile fails with STATUS_INVALID_PARAMETER and I do not see an IRP_MJ_CREATE on T:\ in FileSpy.

Thanks

I’m attaching minimal logs from FileSpy. As you can see, the IRP_MJ_CREATE with FILE_OPEN_BY_FILE_ID on T: is missing.

Interesting and not easy to solve.

According to your logic in the original post:

  1. The user-mode app gets a handle to T:\tmp1 but really “thinks” it’s C:\tmp1.

  2. The user-mode app gets the ID of C:\tmp1\t1.txt. This is not enough.
    In step 2 you have to:
    2.1) If the file exists in T:\tmp1, store/update an internal map of real to virtual file IDs. In this case, sure, you can return the file ID of C:\tmp1\t1.txt to the user, since the user-mode app assumes this is the file it has opened.
    2.2) If the file does not exist, you MUST create a placeholder file (sparse or whatever), after which you apply 2.1.

  3. When you receive a file open by ID, here is where it gets tricky.
    You must check whether the file being opened refers to one of your virtualised locations.
    If so, you will have to do this create yourself, because replacing the file name with the appropriate file ID in the callback data will not work: the ID is for a different volume, and there is no way for the reparse to “know” just from the ID that it should go to a different volume.
    What you do in this case is, from pre-create, open by ID on the proper volume with the proper ID (FltCreateFile…), fill in the original callback data appropriately, and complete the create yourself.
    After this create is completed, your user-mode application should receive a handle to T:\tmp1\t1.txt, which was opened by ID.

Things to remember: you must monitor all FS QUERY operations that could return file ids and update your internal map accordingly. Same goes for SETS.
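
The real-to-virtual ID map from step 2.1 could look roughly like the following user-mode sketch in C. The names `IdMap`, `idmap_put`, `idmap_lookup`, the fixed capacity, and the linear search are all assumptions chosen for brevity; a driver would need its own allocation strategy and synchronization around every access.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned long long real_id;     /* ID on the protected volume (C:) */
    unsigned long long virtual_id;  /* ID of the copy or placeholder on T: */
} IdMapping;

#define MAP_CAPACITY 64

typedef struct {
    IdMapping entries[MAP_CAPACITY];
    size_t count;
} IdMap;

/* Insert a mapping, or update it if real_id is already known (for example
 * when a later directory query reports a fresher virtual ID).
 * Returns 1 on success, 0 when the table is full. */
int idmap_put(IdMap *map, unsigned long long real_id,
              unsigned long long virtual_id)
{
    assert(map != NULL);
    for (size_t i = 0; i < map->count; i++) {
        if (map->entries[i].real_id == real_id) {
            map->entries[i].virtual_id = virtual_id;
            return 1;
        }
    }
    if (map->count == MAP_CAPACITY)
        return 0;
    map->entries[map->count].real_id = real_id;
    map->entries[map->count].virtual_id = virtual_id;
    map->count++;
    return 1;
}

/* Translate an ID handed out to the application into the ID to open on T:.
 * Returns 1 and sets *virtual_id on a hit, 0 when the ID is unknown. */
int idmap_lookup(const IdMap *map, unsigned long long real_id,
                 unsigned long long *virtual_id)
{
    assert(map != NULL && virtual_id != NULL);
    for (size_t i = 0; i < map->count; i++) {
        if (map->entries[i].real_id == real_id) {
            *virtual_id = map->entries[i].virtual_id;
            return 1;
        }
    }
    return 0;
}
```

A lookup miss corresponds to the cached-ID case: the filter has never seen that ID and has to find out for itself whether it belongs to a virtualised file.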

Fun stuff:
Keep in mind that a directory query could reveal many files which exist in C:\tmp1 and not exist in T:\tmp1 in one call. This should be an indication that your internal file id map should update also in case of a directory query which reveals unmapped or out of date file ids.
Even more fun is the case where an application has the IDs cached and you will only see the open by ID with no prior query or open before, or when your map has no history of that ID. In this case you must do an open yourself and check whether the ID refers to one of your files or not and update accordingly.

ALTERNATIVELY: You could simply not support file IDs in your product/filter (badum tss).
Because if you think this is messed up guess how oplocks and transactions are going to be. Oh and did I mention if you want your virtual to be on the network …

Anyway, hope this helps and hope I got the right picture about your scenario.

PS: Some further reading: http://fsfilters.blogspot.com/2011/05/statusreparse-and-fileopenbyfileid.html

Cheers

I edited my previous comment and it seems to have completely disappeared, so if anyone has it in email, could you repost it? It was pretty long.

Thanks Gabriel for your insights, appreciate it.
As you correctly pointed out,

Keep in mind that a directory query could reveal many files which exist in C:\tmp1 and not exist in T:\tmp1 in one call.
that’s exactly THE problem I’m trying to solve. If the file exists on both volumes then it’s not an issue for my driver.
As you hinted in your comment, I think the key to solving this problem lies in making sure the file (just a placeholder initially) exists on the virtual volume.
I’m thinking that while handling IRP_MJ_DIRECTORY_CONTROL (IRP_MN_QUERY_DIRECTORY), I can detect files which are present in C:\tmp1 but do not exist in T:\tmp1, create placeholder files for them on the virtual volume, and update the internal map of IDs. At this point I can probably create a file context for each placeholder and keep a state in the context saying it’s an empty file. Later, when I get a create with FILE_OPEN_BY_FILE_ID for the placeholder, I can fetch its context to detect this, copy the contents from the real file (thus converting the placeholder into a full replica of the real file), and return a handle to it.
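
The placeholder idea can be sketched as a small state machine. This is a user-mode C model under stated assumptions: `FileContext`, `on_open_by_id`, and `copy_from_real` are hypothetical names, and the copy is only simulated by a counter.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { FILE_PLACEHOLDER, FILE_HYDRATED } FileState;

/* Per-file state, standing in for what a file context might hold. */
typedef struct {
    FileState state;
    unsigned long long real_id;     /* source file on C: */
    unsigned long long virtual_id;  /* placeholder on T: */
    int copies_done;                /* how many times data was copied */
} FileContext;

/* Hypothetical stand-in for copying the contents from C: to T:. */
static int copy_from_real(FileContext *ctx)
{
    ctx->copies_done++;
    return 1;  /* pretend the copy succeeded */
}

/* Called when an open by ID targets this file. A placeholder is hydrated on
 * first use only; later opens reuse the already hydrated copy.
 * Returns 1 when the file is ready to be opened, 0 when the copy failed. */
int on_open_by_id(FileContext *ctx)
{
    assert(ctx != NULL);
    if (ctx->state == FILE_PLACEHOLDER) {
        if (!copy_from_real(ctx))
            return 0;  /* still a placeholder, so a later open can retry */
        ctx->state = FILE_HYDRATED;
    }
    return 1;
}
```

Changing the state only after a successful copy means a failed hydration does not leave a half-filled file marked as complete.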

Your thoughts?

Also, I’m on board with monitoring all FS QUERY operations that could return file IDs, but can you elaborate a bit more on what you mean by “same goes for SETS” below?

Things to remember: you must monitor all FS QUERY operations that could return file ids and update your internal map accordingly. Same goes for SETS.

Thanks

I edited my previous comment and it seems to have completely disappeared

https://community.osr.com/discussion/290733/did-a-comment-you-wrote-disappear-or-not-appear

Peter

Oh sorry Peter, I was not aware of that. Well, someone just reposted the comment anyway.
The edit, I believe, was a link to one of Alex Carp’s blog posts on file IDs and this flag in particular, which I found interesting and thought would be of help here.

Thanks

Your thoughts?
Regarding what you wrote above: yes, that was my idea. I think the mechanism should work in that case, since the placeholder file exists and you can attach a context to it and keep all of your relevant info there.
I think Microsoft themselves use this technique in quite a few of their products, like OneDrive or ProjFS (GVFS, or other names it may have had), where a file is only a virtual/placeholder/reparse point and is “hydrated” or “partially hydrated” only when needed.

Regarding the second part, what I really meant was just to be careful not to miss any “hidden” file ID queries, which may come from several types of callbacks, so that you can keep your internal data consistent. Nothing fancy.

Thanks Gabriel.
I’ll post updates regarding how I get on with it - might be useful for someone in the future.