Snapshot recreation not working with Mini Filter

I am developing a file system mini-filter driver that tracks SQL Server database files (namely mdf and ndf files). The goal is to track all write operations on an mdf file, record the offset and length of each write (I call this pair an extent), extract the corresponding blocks of data from the latest snapshot, and finally recreate the latest snapshot by applying all the extents to the older one.

Initially I registered only IRP_MJ_WRITE in the callbacks array, to detect writes to the mdf file I want to track. But every time I apply the changed blocks to the older snapshot to create the newer one, the snapshots don’t match. The newer snapshot (say SN2) is 648 MB, while the file obtained by applying the extents to the older snapshot (say SN1) comes out at 631 MB. Also, the extents I get from the mini-filter differ on every run, yet merging them with the older snapshot always yields the same 631 MB mdf file. What could be the reason for that?

As an experiment, I also registered the other IRP operations that are present by default in Microsoft’s code, but that did not help either. The modified file is still 631 MB.

I believe the problem is something else, but I am unable to figure it out. Also, in Microsoft’s code, I found that they use the flag RECORD_TYPE_FLAG_EXCEED_MEMORY_ALLOWANCE in the mspyLog.c file. Could this point to a buffer overflow happening while retrieving the logs?

The base code is derived from Microsoft’s official repository -

I don’t have any experience with filter drivers and would appreciate any help. Thanks.

(Moved to NTFSD)

By “snapshot” I assume you mean VSS shadow copy? If yes, are you taking into account the fact that SQL might modify the database in the shadow copy after it’s created? If you’re not tracking these writes to the shadow copy it could throw off your tracking.

Hi @“Scott_Noone_(OSR)”
Thanks for your reply.

By Snapshot, I simply mean to take the database offline, copy it, and paste it to another directory. When I do this at time T0, I am calling it SN1 and when I do this at T1 (after all tables have been loaded), I am calling it SN2.

Also, to clarify more I am writing down the steps that I am following so far. The database that is being used is adventureWorks2019 DB (adv DB).
The steps are:

  1. Bring DB online. Create the table.
  2. Bring DB offline and copy the Adv Database. This becomes the SN1 snapshot.
  3. Bring DB online again.
  4. Start filter driver.
  5. Run SQL script to load the rows in the table. (Loading 20000 rows currently)
  6. Stop filter driver. Logs have been generated now in logs.txt file
  7. Take DB offline and copy the database. This is the SN2 snapshot.
  8. Now using a golang script, find out the blocks of data associated with every offset and length and store all those blocks in a separate folder. So if the filter driver gives me 100 logs (offsets and length), I will store all the 100 blocks which will be used in the next step.
  9. Now on SN1 snapshot apply those blocks using another script. By apply I mean simply merge the blocks at the corresponding offset.
  10. Compare the final SN2_recreated with the original SN2 snapshot to see if they match or not.
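
Steps 8 and 9 above can be sketched in user-space C as follows. This is only a minimal illustration, not the poster's golang script: the `Image` and `Extent` types and the `apply_extent` helper are hypothetical names, and the snapshots are modelled as in-memory buffers rather than files. It shows that a region of SN2 that no logged extent covers is never copied, so the recreated file can only match SN2 if the log accounts for every change, including changes in file size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a snapshot file. */
typedef struct {
    uint8_t *data;
    size_t   size;
} Image;

/* Hypothetical extent record: offset and length logged for one write. */
typedef struct {
    uint64_t offset;
    uint32_t length;
} Extent;

/* Copy one extent from `latest` (SN2) into `base` (SN1). If the extent ends
 * past the current end of `base`, grow `base` and zero-fill the gap.
 * Returns 0 on success, -1 if the extent is out of range or memory runs out. */
static int apply_extent(Image *base, const Image *latest, Extent e)
{
    assert(base != NULL && latest != NULL);

    uint64_t end = e.offset + e.length;
    if (end > latest->size)
        return -1;                      /* extent not present in SN2 */

    if (end > base->size) {
        uint8_t *grown = realloc(base->data, (size_t)end);
        if (grown == NULL)
            return -1;
        /* Bytes between the old end and this extent were never logged. */
        memset(grown + base->size, 0, (size_t)end - base->size);
        base->data = grown;
        base->size = (size_t)end;
    }

    memcpy(base->data + e.offset, latest->data + e.offset, e.length);
    return 0;
}
```

With this model, the recreated image ends at the highest offset touched by a logged extent, which may be smaller than the real SN2 if the file was extended by something other than a logged write.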

OK, the case is much simpler than I was thinking.

Have you diff’d the files to see which ranges are different? Are you tracking changes to ValidDataLength and EndOfFile via IRP_MJ_SET_INFORMATION?
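
One way to do the diff suggested here is sketched below in user-space C. The `DiffRange` type and `diff_ranges` function are hypothetical names introduced for illustration, and the two files are assumed to be already loaded into memory. The function reports the byte ranges that differ within the length common to both buffers; a difference in total size (such as 648 MB versus 631 MB) is a separate finding that has to be compared on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result record: one contiguous run of differing bytes. */
typedef struct {
    size_t offset;
    size_t length;
} DiffRange;

/* Scan the first `n` bytes of `a` and `b` and store up to `max` differing
 * ranges in `out`. Returns the number of ranges stored. */
static size_t diff_ranges(const uint8_t *a, const uint8_t *b, size_t n,
                          DiffRange *out, size_t max)
{
    assert(out != NULL || max == 0);

    size_t count = 0;
    size_t i = 0;
    while (i < n && count < max) {
        if (a[i] == b[i]) {
            i++;
            continue;
        }
        size_t start = i;               /* first differing byte of this run */
        while (i < n && a[i] != b[i])
            i++;
        out[count].offset = start;
        out[count].length = i - start;
        count++;
    }
    return count;
}
```

Comparing the reported ranges against the logged extents shows which changes the filter missed.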

No @“Scott_Noone_(OSR)”
I am not tracking that. Can you explain why we need to track it?

@“Scott_Noone_(OSR)” I did some investigations using the hint provided by you and encountered these 2 official docs.

I just want to ask how we can obtain the offset in the case of IRP_MJ_SET_INFORMATION.

What I mean is that it’s easy and straightforward to store the offset and length in the case of IRP_MJ_WRITE since the structure in fltkernel.h provides

    ULONG Length;             // Length of transfer
    LARGE_INTEGER ByteOffset; // Offset to read from

But in the case of IRP_MJ_SET_INFORMATION, we only have ULONG length with us and not the offset. How can we obtain the offset using other data types in IRP_MJ_SET_INFORMATION? The offset is needed along with length to get the exact block of data which will be used to recreate the snapshot in the last steps.

Set information is used for all sorts of things. You need to cast the InfoBuffer field to an operation-specific structure based on the FileInformationClass. So, something like:

    if (FileEndOfFileInformation == Data->Iopb->Parameters.SetFileInformation.FileInformationClass) {
        eof = (PFILE_END_OF_FILE_INFORMATION) Data->Iopb->Parameters.SetFileInformation.InfoBuffer;
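
For the replay side, note that an end-of-file change carries no offset and length pair: it is a single absolute file size. A minimal user-space sketch of how such records might be logged and replayed next to write extents is shown below. The `RecordKind`, `LogRecord`, and `replay_file_size` names are hypothetical and not part of the minispy sample, and the model is simplified in that it assumes every logged write ending past the current end of file extends the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record kinds a tracking log might contain. */
typedef enum { REC_WRITE, REC_SET_EOF } RecordKind;

typedef struct {
    RecordKind kind;
    uint64_t   offset;   /* REC_WRITE: start of the write; unused otherwise */
    uint32_t   length;   /* REC_WRITE: number of bytes written */
    uint64_t   new_eof;  /* REC_SET_EOF: absolute new file size in bytes */
} LogRecord;

/* Replay the log in order and return the resulting file size. */
static uint64_t replay_file_size(uint64_t initial_size,
                                 const LogRecord *log, size_t count)
{
    assert(log != NULL || count == 0);

    uint64_t size = initial_size;
    for (size_t i = 0; i < count; i++) {
        if (log[i].kind == REC_SET_EOF) {
            size = log[i].new_eof;          /* may grow or truncate the file */
        } else {
            uint64_t end = log[i].offset + log[i].length;
            if (end > size)
                size = end;                 /* write past EOF extends the file */
        }
    }
    return size;
}
```

Under these assumptions, the final size computed from the log can be checked against the actual size of SN2 before any block data is merged.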

    PFILE_END_OF_FILE_INFORMATION info = ((PFILE_END_OF_FILE_INFORMATION)Data->Iopb->Parameters.SetFileInformation.InfoBuffer);
    recordData->fileLen = info->EndOfFile.QuadPart;

Am I doing something wrong here when fetching the EndOfFile value?

I added these 3 lines of code to fetch EndOfFile (the absolute new end-of-file position as a byte offset from the start of the file), but Windows crashes entirely and I get a BSOD. What could be the reason?

    if (FileEndOfFileInformation == Data->Iopb->Parameters.SetFileInformation.FileInformationClass) {
        PFILE_END_OF_FILE_INFORMATION info = (PFILE_END_OF_FILE_INFORMATION)(Data->Iopb->Parameters.SetFileInformation.InfoBuffer);
        recordData->fileLen = info->EndOfFile.QuadPart;

Best I can say is that you have a bug. Hook up a debugger and see what !analyze -v says at the time of the crash.