Hardlinked dest file with different FileId than source file

Greetings NTFS experts. If anyone out there has any insight into this it would be much appreciated. I’ve already searched old posts on the NTFSD list but nothing discussed about this yet that I could find.

In my minifilter driver I’ve hooked IRP_MJ_SET_INFORMATION and when a hardlink is being created, my callback is called with FILE_INFORMATION_CLASS of FileLinkInformation. Almost always when at that point in that callback I lookup the FileId (http://msdn.microsoft.com/en-us/library/windows/hardware/ff540318(v=vs.85).aspx) on both the source file and the destination file, they are the same. This is to be expected as hardlink siblings will share the same FileId and VolumeId.

But I’ve run across a rare situation where the FileId and file size are different on the destination file than on the source file. After some time and no other callbacks on the dest file other than repeated File Open / File Close, the FileId and file size of the dest file become identical to that of the source file, as I would have expected to see when the hardlink was first detected in IRP_MJ_SET_INFORMATION. Again this is a rare occurrence whereas most of the time the dest and source file are identical all along from the time of the original IRP_MJ_SET_INFORMATION callback.

I’m confused as to how two files that are hardlinks could have a different FileId and different size. Anyone seen this before and have any insight? Thanks.

Does this happen on one or more specific types of files (exe, dll, sys,
txt, ???)

On Mon, Jan 27, 2014 at 3:21 PM, wrote:

> Greetings NTFS experts. If anyone out there has any insight into this it
> would be much appreciated. I’ve already searched old posts on the NTFSD
> list but nothing discussed about this yet that I could find.
>
> In my minifilter driver I’ve hooked IRP_MJ_SET_INFORMATION and when a
> hardlink is being created, my callback is called with
> FILE_INFORMATION_CLASS of FileLinkInformation. Almost always when at that
> point in that callback I lookup the FileId (
> http://msdn.microsoft.com/en-us/library/windows/hardware/ff540318(v=vs.85).aspx)
> on both the source file and the destination file, they are the same. This
> is to be expected as hardlink siblings will share the same FileId and
> VolumeId.
>
> But I’ve run across a rare situation where the FileId and file size are
> different on the destination file than on the source file. After some time
> and no other callbacks on the dest file other than repeated File Open /
> File Close, the FileId and file size of the dest file become identical to
> that of the source file, as I would have expected to see when the hardlink
> was first detected in IRP_MJ_SET_INFORMATION. Again this is a rare
> occurrence whereas most of the time the dest and source file are identical
> all along from the time of the original IRP_MJ_SET_INFORMATION callback.
>
> I’m confused as to how two files that are hardlinks could have a different
> FileId and different size. Anyone seen this before and have any insight?
> Thanks.
>
> —
> NTFSD is sponsored by OSR
>
> OSR is hiring!! Info at http://www.osr.com/careers
>
> For our schedule of debugging and file system seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>

I’ve only seen it happen on a .dll file. Specifically, it’s \Windows\System32\urlmon.dll being hardlinked to another urlmon.dll underneath \Windows\winsxs which is being done by a Windows Update process.

More information: The dest file (\Windows\System32\urlmon.dll) existed prior to this, then was deleted before the hardlink is set. So the rough steps are something like:

  1. \Windows\System32\urlmon.dll is hardlinked to \Windows\winsxs\dir1…\urlmon.dll
  2. \Windows\System32\urlmon.dll is deleted
  3. \Windows\System32\urlmon.dll is hardlinked to \Windows\winsxs\dir2…\urlmon.dll

For a number of seconds after, FileID and size of system32\urlmon.dll is the same as it was in step 1 and different from what it should be in step 3

It seems as if NTFS is erroneously caching the info about that file and it takes some time for it to get updated.

Are the “urlmon” files identical or different?
Probably not erroneously… as much as when the NTFS cache is invalidated.

On Mon, Jan 27, 2014 at 4:32 PM, wrote:

> More information: The dest file (\Windows\System32\urlmon.dll) existed
> prior to this, then was deleted before the hardlink is set. So the rough
> steps are something like:
>
> 1) \Windows\System32\urlmon.dll is hardlinked to
> \Windows\winsxs\dir1…\urlmon.dll
> 2) \Windows\System32\urlmon.dll is deleted
> 3) \Windows\System32\urlmon.dll is hardlinked to
> \Windows\winsxs\dir2…\urlmon.dll
>
> For a number of seconds after, FileID and size of system32\urlmon.dll is
> the same as it was in step 1 and different from what it should be in step 3
>
> It seems as if NTFS is erroneously caching the info about that file and it
> takes some time for it to get updated.
>
> —
> NTFSD is sponsored by OSR
>
> OSR is hiring!! Info at http://www.osr.com/careers
>
> For our schedule of debugging and file system seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>

This behavior happens during bootup when Windows is finishing applying updates before the login screen. So I can’t use any user space tools to analyze the files to determine if they are truly different. I only have what I know through the code that runs in my driver. And to my driver, the files are actually different. If I open a handle on the file and lookup the FileID and size, they are different between the two files, even though it’s in the middle of processing FileLinkInformation in the IRP_MJ_SET_INFORMATION preop, meaning they are now hardlinked. I created a thread to repeatedly open both files and print the FileID and size to the debug log and for about 4.5 minutes after the hardlink is created, they continue to be different. Then they suddenly become the same. The only callbacks on that file in between creating the hardlink and when they begin to appear to be the same file is repeated file open/close (mostly from my thread that’s continually checking it).

Can someone from Microsoft comment on this behavior ?

On Wed, Jan 29, 2014 at 8:40 AM, wrote:

> This behavior happens during bootup when Windows is finishing applying
> updates before the login screen. So I can’t use any user space tools to
> analyze the files to determine if they are truly different. I only have
> what I know through the code that runs in my driver. And to my driver, the
> files are actually different. If I open a handle on the file and lookup the
> FileID and size, they are different between the two files, even though it’s
> in the middle of processing FileLinkInformation in the
> IRP_MJ_SET_INFORMATION preop, meaning they are now hardlinked. I created a
> thread to repeatedly open both files and print the FileID and size to the
> debug log and for about 4.5 minutes after the hardlink is created, they
> continue to be different. Then they suddenly become the same. The only
> callbacks on that file in between creating the hardlink and when they begin
> to appear to be the same file is repeated file open/close (mostly from my
> thread that’s continually checking it).
>
> —
> NTFSD is sponsored by OSR
>
> OSR is hiring!! Info at http://www.osr.com/careers
>
> For our schedule of debugging and file system seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>

> This behavior happens during bootup when Windows is finishing applying updates

before the login screen. So I can’t use any user space tools to analyze the files
to determine if they are truly different.

If you can repeat the scenario, you can. Activate USN journal on the target machine and go reboot. When you log in, use FileSpy with USN Journal as “driver” (choose Options/Choose Driver). Because USN journal is active all the time, you can track what happened on the volume during the boot. However, the level of information is merely what changed (no reads are logged) - file writes, renames, stream changes, hardlink changes.

Also, you could configure FileSpy’s IRP driver (FSpy) to load at boot (You can do it with Process Monitor as well). It only requires to set a bigger memory reserved for Filespy’s driver, otherwise it’s not able to hold all the file names and IRP requests. Prepare to dig in a lots of lines, though. If you need some help with that, let me know (I haven used it in years).

Are you really pre-processing FileLinkInformation at this point? During
pre-processing the link hasn’t happened yet.

How do you know the link being deleted in Step 2? Note that the request to
delete isn’t processed instantaneously, the link won’t actually be removed
until sometime during/after cleanup processing of the last handle to the
file.

-scott
OSR

wrote in message news:xxxxx@ntfsd…

This behavior happens during bootup when Windows is finishing applying
updates before the login screen. So I can’t use any user space tools to
analyze the files to determine if they are truly different. I only have what
I know through the code that runs in my driver. And to my driver, the files
are actually different. If I open a handle on the file and lookup the FileID
and size, they are different between the two files, even though it’s in the
middle of processing FileLinkInformation in the IRP_MJ_SET_INFORMATION
preop, meaning they are now hardlinked. I created a thread to repeatedly
open both files and print the FileID and size to the debug log and for about
4.5 minutes after the hardlink is created, they continue to be different.
Then they suddenly become the same. The only callbacks on that file in
between creating the hardlink and when they begin to appear to be the same
file is repeated file open/close (mostly from my thread that’s continually
checking it).

would a short offline analysis of the files in question be out of the question?

From a personal standpoint I’d use that “L word” through a boot disk and go to each location and just grab whatever ntfs-3g spits out through ls -lah and then take an MD5 of both. At least we’d have a base to start with.

they are different or they are the same…

It occurs to me that if the orig file is deleted but final cleanup has not
occured (ie., FileObjects associated with that file still exist), then the
orig file still has references and NTFS still working on “cleaning
up/removing” the orig file…this can take some time especially during OS
boot file replacement.

On Wed, Jan 29, 2014 at 11:49 AM, Scott Noone wrote:

>


>
> Are you really pre-processing FileLinkInformation at this point? During
> pre-processing the link hasn’t happened yet.
>
> How do you know the link being deleted in Step 2? Note that the request to
> delete isn’t processed instantaneously, the link won’t actually be removed
> until sometime during/after cleanup processing of the last handle to the
> file.
>
> -scott
> OSR
>
>
> wrote in message news:xxxxx@ntfsd…
>
>
> This behavior happens during bootup when Windows is finishing applying
> updates before the login screen. So I can’t use any user space tools to
> analyze the files to determine if they are truly different. I only have
> what I know through the code that runs in my driver. And to my driver, the
> files are actually different. If I open a handle on the file and lookup the
> FileID and size, they are different between the two files, even though it’s
> in the middle of processing FileLinkInformation in the
> IRP_MJ_SET_INFORMATION preop, meaning they are now hardlinked. I created a
> thread to repeatedly open both files and print the FileID and size to the
> debug log and for about 4.5 minutes after the hardlink is created, they
> continue to be different. Then they suddenly become the same. The only
> callbacks on that file in between creating the hardlink and when they begin
> to appear to be the same file is repeated file open/close (mostly from my
> thread that’s continually checking it).
>
> —
> NTFSD is sponsored by OSR
>
> OSR is hiring!! Info at http://www.osr.com/careers
>
> For our schedule of debugging and file system seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>

Thanks for all of the help on this from everyone. I looked closer at what I can observe and I think what Scott @ OSR (and Duane also) suggested is what I’m observing. It looks like the delete of the file is taking a long time and the hardlink is not created until the delete is complete.

The delete is initiated about 80 ms before signaling the creation of the hardlink in IRP_MJ_SET_INFORMATION preop with FileLinkInformation. It’s not until about 90 seconds later (this can obviously vary a lot from run to run) that the source file of the hardlink indicates that it has hardlinked files (FileStandardInformation.NumberOfLinks). During all of that time the file that had the delete initiated was still there with the same FileId and size that it had before initiating the delete. So I would assume that the hardlink request is queued and not fulfilled until the delete is completed.

My driver code (which I inherited) seemed to assume that the hardlink is fully created at the time that FileLinkInformation is seen in the IRP_MJ_SET_INFORMATION preop. Now that I’m looking at this deeper I can see that’s not actually the case and I now know that it may be some time before the hardlink actually exists.

Preop? Why would someone think that the link operation would be done before the call has been passed to the FSD? That’s a bit optimistic.

This is a very complex sub-case: hard link with delete and its interaction with an in-use file. If that file had been opened the hard link would have failed (you can’t hard link with replace on a file that is open) but it gets murkier if the file is memory mapped (and in use) or still in the cache. There’s a lot of serialization that has to occur here and it seems quite likely that the link operation might stall (so the time between preop and postop is substantial).

Tony
OSR

Fair point Tony about the preop. It seems so obvious now that I was wrong to think that the preop signaled the completion of the hardlink action. I let myself misinterpret the actions of the code in my driver which I can now see is not doing anything on the assumption of timing of completion of the hardlink. It is merely marking that there will be a hardlink on that file. I was focusing on the wrong thing. It turns out the problem I’m trying to resolve is a failure of my driver’s code to properly track when the delete is fully complete (which there is code in place for handling) rather than when the hardlink is complete. I now have a much clearer picture of what is actually happening on the filesystem.

Don’t forget about the postOp callback, there you can actually look at what
happened with the operation.

Thanks,
Alex.

On Fri, Jan 31, 2014 at 4:51 AM, wrote:

> Fair point Tony about the preop. It seems so obvious now that I was wrong
> to think that the preop signaled the completion of the hardlink action. I
> let myself misinterpret the actions of the code in my driver which I can
> now see is not doing anything on the assumption of timing of completion of
> the hardlink. It is merely marking that there will be a hardlink on that
> file. I was focusing on the wrong thing. It turns out the problem I’m
> trying to resolve is a failure of my driver’s code to properly track when
> the delete is fully complete (which there is code in place for handling)
> rather than when the hardlink is complete. I now have a much clearer
> picture of what is actually happening on the filesystem.
>
> —
> NTFSD is sponsored by OSR
>
> OSR is hiring!! Info at http://www.osr.com/careers
>
> For our schedule of debugging and file system seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>