Problem with IRP_MJ_WRITE

Hi!!
We are facing a weird problem: we have a user application that copies a file and our minifilter that filters the file being copied. Since the file being copied is tagged, the minifilter should redirect the destination file to another file (all files are stored on the same volume).

The problem is that it only works if the file size is small (less than around 700 bytes). If the file is bigger than this, the destination file is full of zeros. We also see the IRP_WRITE coming after the IRP_CLEANUP, and the length is right.

Does anybody know a possible explanation for this??

Thank you very much!
Santi

> The problem is that it only works if the file size is small (less than around 700 bytes).

Is this file resident in the MFT?


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Hi! The file is just a regular one (c:\temp\b.txt).

Being resident in the MFT has nothing to do with whether your file is “regular” or not. It is entirely a function of its size. If you create a small file (say, around 700 bytes or less) it won’t get any allocation on the disk because its data can fit entirely within the file’s file record. This makes it “resident” in the MFT. Once the file grows enough it will get allocation and become non-resident.

The reason you’re seeing writes come after cleanup for non-resident (i.e. larger than ~700 bytes) files is that you’re seeing the paging I/O for those files as their cached data is finally being written to disk.
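
For reference, here is a minimal sketch (names and structure are illustrative, not from any particular filter) of how a pre-write callback can tell those deferred paging writes apart from the cached writes issued in the application's thread context:

#include <fltKernel.h>

/* Sketch: flag the lazy-writer/paging writes that show up after
   IRP_MJ_CLEANUP for non-resident files. */
FLT_PREOP_CALLBACK_STATUS
SketchPreWrite(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID *CompletionContext
    )
{
    UNREFERENCED_PARAMETER(FltObjects);
    *CompletionContext = NULL;

    if ((Data->Iopb->IrpFlags &
         (IRP_PAGING_IO | IRP_SYNCHRONOUS_PAGING_IO)) != 0) {
        /* The cache/memory manager is writing dirty pages to disk; this
           typically arrives in the System process context, after cleanup. */
        DbgPrint("Paging write: offset %I64d, length %u\n",
                 Data->Iopb->Parameters.Write.ByteOffset.QuadPart,
                 Data->Iopb->Parameters.Write.Length);
    }

    return FLT_PREOP_SUCCESS_NO_CALLBACK;
}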

Christian [MSFT]
This posting is provided “AS IS” with no warranties, and confers no rights.

Thanks for the information (and sorry for my previous answer)! From what I have read, being resident in the MFT could be the difference between a successful and a failed file copy. But what would be the reason for that? And how can I tell whether a certain file is resident in the MFT?

I always see an IRP_WRITE after the IRP_CLEANUP (regardless of the file size) made in the context of system.exe, and the weird thing about it is that the buffer is not null and the write is successful (I see no errors in PostWrite). It is as if the write operation were lost after the minifilter processes it.

Any idea?
Thanks a lot!
Santi

Hi!

I don't think I have given you enough information about the context in which the problem happens.
I am writing a minifilter that redirects access to some files for a specific application. Briefly, if the application opens file A, we filter the create, produce an auxiliary file, say B, and reparse to that file.
Both A and B have a stream context, so we can identify further operations from the application that target these files.
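
For context, a reparse of this kind typically looks something like the following (simplified sketch in the style of the SimRep sample; the hard-coded path is only a placeholder for however B's name is really built):

#include <fltKernel.h>

FLT_PREOP_CALLBACK_STATUS
SketchPreCreate(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID *CompletionContext
    )
{
    static WCHAR RedirectedPath[] = L"\\Device\\HarddiskVolume1\\temp\\b.aux";
    NTSTATUS status;

    UNREFERENCED_PARAMETER(FltObjects);
    *CompletionContext = NULL;

    /* ... decide here whether the caller is the filtered application
       and the target is a tagged file A ... */

    /* Swap the name in the FILE_OBJECT and ask the I/O manager to
       re-issue the create against file B. */
    status = IoReplaceFileObjectName(Data->Iopb->TargetFileObject,
                                     RedirectedPath,
                                     (USHORT)(sizeof(RedirectedPath) - sizeof(WCHAR)));

    Data->IoStatus.Status = NT_SUCCESS(status) ? STATUS_REPARSE : status;
    Data->IoStatus.Information = NT_SUCCESS(status) ? IO_REPARSE : 0;
    return FLT_PREOP_COMPLETE;
}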

This particular problem is about the write IRPs, and it is weird because read IRPs are working fine. For both of them, we process those that are headed to disk; that is, if the application reads from B we check whether it is a disk read, and in that case we redirect the read to A, so the application reads the real data. We handle write IRPs the same way: writes heading to disk are redirected to file A, so the file content is consistent when the application finishes operating on the file.
The problem is that when the application writes the file (writes B), we filter and redirect the disk write, but the data is not written to file A. Only if the file content is small (around 500 bytes) is the data successfully written to A.

If I make the test application create the file with the WRITE_THROUGH flag, the redirected disk writes work fine, that is, the data is written to file A.
The problem occurs on Vista; if I run the same test on XP I don't see any problem.

I have been reading about the implications of the create flags, but I haven't found anything useful. Right now I am kind of lost and don't know how to continue.

Just one more thing: if in my test application I call FlushFileBuffers() after WriteFile(), everything works fine.
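
A minimal repro along the lines of my test app (the buffer size is made up; the commented flag and the flush call are the two variants that make it work):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    char buf[4096];
    DWORD written = 0;
    HANDLE h;

    memset(buf, 'x', sizeof(buf));      /* > ~700 bytes, so non-resident */

    h = CreateFileA("c:\\temp\\b.txt", GENERIC_WRITE, 0, NULL,
                    CREATE_ALWAYS,
                    FILE_ATTRIBUTE_NORMAL /* | FILE_FLAG_WRITE_THROUGH */,
                    NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    if (!WriteFile(h, buf, sizeof(buf), &written, NULL)) {
        printf("WriteFile failed: %lu\n", GetLastError());
    }

    /* With this flush (or WRITE_THROUGH above) the redirected copy is
       correct; without it, file A ends up zero-filled. */
    FlushFileBuffers(h);

    CloseHandle(h);
    return 0;
}
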
Thanks!

Could you please help me? Thanks!

Well, have you waited long enough ?

If I understand this correctly, you always reparse opens to fileA to FileB so the user never has a handle to fileA (even though they think they do). Then you redirect non-cached IO for fileB to fileA. Please correct me if I'm wrong.

The problem with this approach is that the cache manager might keep the data in memory for quite a while. This is not normally a problem because if you open the file again cached you will get the same cache structures and thus will see the same data even if CC never flushed the changes (for non-cached opens this can be more complicated but there are coherency flushes that are meant to address it). However, in your case when you check that the contents of fileA are valid I bet you open it without the reparse and so you don’t get the same cache region as fileB (because they are not the same file for CC) and so if the CC didn’t flush the data for fileB then you won’t see the changes in fileA.

In other words:

  1. user opens fileA and you reparse to fileB.
  2. the file system sees a request to open fileB and it does that and it initializes caching for fileB.
  3. the user writes something that gets put in the cache for fileB.
  4. the user closes fileB.
  5. you then want to check what got written to fileA and so you open fileA (no reparse this time).
  6. the file system sees a request for fileA and it initializes caching for fileA.
  7. you read the contents for fileA.

If there was no flush between step 4 and step 7 then the modifications are still in the cache for fileB and so naturally in step 7 you will not see them. If you somehow cause the modifications to be flushed (either by opening with WRITE_THROUGH or by actively flushing the file with FlushFileBuffers() or by causing memory pressure in the system or even by waiting long enough) then you will see the right contents in step 7.
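
For what it's worth, when checking fileA's contents in step 7 it helps to bypass the cache entirely and look at what is actually on disk, e.g. (user-mode sketch; the path is made up, and FILE_FLAG_NO_BUFFERING needs sector-aligned buffer, offset and length):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE h;
    void *buf;
    DWORD bytesRead = 0;

    h = CreateFileA("c:\\temp\\a.txt", GENERIC_READ, FILE_SHARE_READ, NULL,
                    OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    /* Sector-aligned buffer; 4096 covers the common sector sizes. */
    buf = VirtualAlloc(NULL, 4096, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (buf != NULL && ReadFile(h, buf, 4096, &bytesRead, NULL)) {
        printf("Read %lu bytes straight from disk\n", bytesRead);
    }

    if (buf != NULL) VirtualFree(buf, 0, MEM_RELEASE);
    CloseHandle(h);
    return 0;
}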

Does this make sense ?

Thanks,
Alex.

>application opens file A, we filter it creating an auxiliar file, say B, and perform a reparse to such file.

So, you create 2 caches for the same disk blocks - one for file A and one for file B (file B’s disk-level IO is redirected to A, so the 2 files use the same disk blocks, but Cc/FSD does not know this and thus creates 2 caches) - and have coherency issues between them.

This is a very strange architecture. Re-think it to only have 1 cache for 1 file.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Thanks a lot Alex and Maxim!!

Alex, your point is very clear, but I am afraid the problem does not seem to be that I am not waiting long enough, since I am catching (in WinDbg) the disk write operation after the process has exited. The problem is that when I get those writes to disk and the redirection to the other file is performed, the result is not valid. On the other hand, when the disk writes that result from a FlushFileBuffers operation are performed, the redirection works fine.

The final goal of our minifilter is to encrypt/decrypt files for just a given application, say notepad.exe. A normal scenario would be the following:

1- User opens file A with Notepad.exe.
2- Minifilter creates file B and reparses file A to file B; from now on all NON_CACHED reads from file B are redirected to file A and decrypted, and all NON_CACHED writes are encrypted and redirected to file A (in the PreWrite to file B we also clear the NON_CACHED flag and set the WRITE_THROUGH flag to update both file A's disk content and the cache content).

We keep only one view per file in the cache: a ciphered view for file A and a plain view for file B.

The way we redirect reads and writes is by changing Data->Iopb->TargetInstance/TargetFileObject between file A and B. If the user opens file A with any other application, our minifilter does nothing, so the original (encrypted) file A is read.
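
In code terms the write path looks roughly like this (a stripped-down sketch; MY_STREAM_CTX, GetStreamCtx and the field names are placeholders for our real stream-context plumbing, and the encryption step is elided):

#include <fltKernel.h>

/* Placeholders for our real per-stream context. */
typedef struct _MY_STREAM_CTX {
    BOOLEAN       IsAuxiliaryFileB;
    PFILE_OBJECT  FileAObject;      /* obtained at create time */
    PFLT_INSTANCE FileAInstance;
} MY_STREAM_CTX, *PMY_STREAM_CTX;

/* Placeholder: the real filter would use FltGetStreamContext here. */
static PMY_STREAM_CTX GetStreamCtx(PCFLT_RELATED_OBJECTS FltObjects)
{
    UNREFERENCED_PARAMETER(FltObjects);
    return NULL;
}

FLT_PREOP_CALLBACK_STATUS
SketchPreWriteRedirect(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID *CompletionContext
    )
{
    PMY_STREAM_CTX ctx;

    *CompletionContext = NULL;

    /* Only the writes heading to disk are of interest. */
    if ((Data->Iopb->IrpFlags & IRP_NOCACHE) == 0) {
        return FLT_PREOP_SUCCESS_NO_CALLBACK;
    }

    ctx = GetStreamCtx(FltObjects);
    if (ctx == NULL || !ctx->IsAuxiliaryFileB) {
        return FLT_PREOP_SUCCESS_NO_CALLBACK;
    }

    /* ... encrypt the buffer here ... */

    /* Convert the non-cached write into a write-through one, as described
       above, and retarget it at file A. */
    Data->Iopb->IrpFlags &= ~IRP_NOCACHE;
    Data->Iopb->OperationFlags |= SL_WRITE_THROUGH;
    Data->Iopb->TargetFileObject = ctx->FileAObject;
    Data->Iopb->TargetInstance   = ctx->FileAInstance;
    FltSetCallbackDataDirty(Data);

    return FLT_PREOP_SUCCESS_WITH_CALLBACK;
}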

As I mentioned, writing does not work unless the user application opens the file with WRITE_THROUGH, or the user calls FlushFileBuffers() after WriteFile(), or the file is less than 700 bytes. But I do see a NON_CACHED IRP_WRITE to file A that we redirect to file B, and it seems it is never written.

Please tell me if this is not the right approach.

Again, thanks a lot!!

Reading your explanation, I don't understand this phrase: “But I do see a NON_CACHED IRP_WRITE to file A that we redirect to file B …”. I thought you only redirected IO from fileB to fileA ?

I’m not sure that converting non-cached IO for FileB to cached IO for FileA is safe. For example if CC is trying to write to FileB because the cache is full and FileA isn’t in the cache then it’s possible to deadlock if CC can’t initialize caching for FileA to complete the write. Perhaps you’re doing something else to ensure that this can’t happen but personally I would try to avoid this… Also, what happens in the scenario you’ve described if you don’t convert the IO to cached ? Does FileA look right afterwards (make sure to open it for non-cached IO when checking its contents…) ?

Which FILE_OBJECT are you using as the TargetFileObject for FileA when redirecting the request from FileB to FileA? Who creates that FILE_OBJECT ?

Thanks,
Alex.

Hello Alex! First thank you very much for your help and sorry for the delay.

About the phrase you did not understand, I am sorry, it was wrong. What I meant to say is “But I do see a NON_CACHED IRP_WRITE to file B that we redirect to file A …”.

I do not think my problem is that I am not waiting long enough, because I see a NON_CACHED IRP_WRITE to file B that I redirect to file A. This operation is only successful if the user application opens file A with WRITE_THROUGH, or it calls FlushFileBuffers(), or the length to write is around 500 bytes.

As for the cache not being ready for a file, I do not think I have a problem with the file A cache. I keep file A open from the first create operation (until the user application is closed), so I think the file A cache is already initialized before any read/write is performed against file B. Is this a mistake? In fact, we have just found that if we call CcFlushCache() from PostWrite, sometimes it works and sometimes it causes a deadlock…

The file object we use to redirect operations to file A is one we obtain from a call to FltCreateFile() plus ObReferenceObjectByHandle() in PreCreate().
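
Roughly like this (sketch; error handling is trimmed and the exact access/create flags shown here are illustrative):

#include <fltKernel.h>

NTSTATUS
SketchOpenFileA(
    PFLT_FILTER Filter,
    PFLT_INSTANCE Instance,
    PUNICODE_STRING FileAName,
    PHANDLE FileAHandle,
    PFILE_OBJECT *FileAObject
    )
{
    OBJECT_ATTRIBUTES oa;
    IO_STATUS_BLOCK iosb;
    NTSTATUS status;

    InitializeObjectAttributes(&oa, FileAName,
                               OBJ_KERNEL_HANDLE | OBJ_CASE_INSENSITIVE,
                               NULL, NULL);

    status = FltCreateFile(Filter, Instance, FileAHandle,
                           FILE_READ_DATA | FILE_WRITE_DATA | SYNCHRONIZE,
                           &oa, &iosb, NULL,
                           FILE_ATTRIBUTE_NORMAL,
                           FILE_SHARE_READ | FILE_SHARE_WRITE,
                           FILE_OPEN,
                           FILE_NON_DIRECTORY_FILE | FILE_SYNCHRONOUS_IO_NONALERT,
                           NULL, 0, 0);
    if (!NT_SUCCESS(status)) {
        return status;
    }

    /* Get the FILE_OBJECT that will later be plugged into
       Data->Iopb->TargetFileObject. */
    status = ObReferenceObjectByHandle(*FileAHandle, 0, *IoFileObjectType,
                                       KernelMode, (PVOID *)FileAObject, NULL);
    if (!NT_SUCCESS(status)) {
        FltClose(*FileAHandle);
        *FileAHandle = NULL;
    }
    return status;
}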

Finally, regarding the test you mention about not converting the IO to cached: the result is the same, file A reads back full of zeros (using NO_BUFFERING).

Thanks a lot!
Santi

So when you say that “This operation is only successful if the user application opens file A with WRITE_THROUGH, or it calls FlushFileBuffers(), or the length to write is around 500 bytes”, do you mean that the operation fails (IoStatus.Status != STATUS_SUCCESS) or that the operation succeeds but you don’t see the results ?

As far as I know, caching usually isn’t initialized until the first cached IO is performed on the FILE_OBJECT, so the statement that “the file A cache is already initialized before any read/write is performed against file B” is only true if there has been at least one cached IO operation on file A… Just add an assert that FILE_OBJECT->PrivateCacheMap != NULL when you do your redirection and you’ll see…
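
i.e. something like this right where you do the redirection (FileAObject being whatever FILE_OBJECT you plug into TargetFileObject):

/* In the redirection path, before retargeting the write at file A: */
if (FileAObject->PrivateCacheMap == NULL) {
    /* No cached IO has gone through this FILE_OBJECT yet, so caching
       has not been initialized for file A. */
    DbgPrint("File A: PrivateCacheMap is NULL\n");
}
ASSERT(FileAObject->PrivateCacheMap != NULL);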

Also as you’ve seen from a previous post (http://www.osronline.com/showThread.CFM?link=209657) you probably shouldn’t call CcFlushCache if you don’t own the cache…

Thanks,
Alex.

Hello!
I have been trying to isolate the problem a bit more by performing the following tests:

#1: When the IRP_WRITE arrives, the file B cache is already initialized (PrivateCacheMap non-NULL), but the file A cache is not. In PostWrite the file A cache is still not initialized, the status is 0 (STATUS_SUCCESS) and Information is 0 (0 bytes written, AFAIK).
The result of this operation is file A full of zeros.

#2: When FlushFileBuffers() is called after WriteFile() from the user app, the PrivateCacheMap of file B is never NULL; for file A it is NULL in PreWrite and non-NULL in PostWrite, so the cache has been initialized.
The result of this operation is successful: file A has valid content.

#3: When I call FltWriteFile(FILE_B,…) (without FlushFileBuffers() from the user app) in PreWrite and complete the IRP, the file A cache is initialized and the result is successful: file A has valid content.

To complicate things even more, if I use a small file (512 bytes), test #1 succeeds and the file A cache is initialized.

What can FlushFileBuffers() change to make this work? And why does FltWriteFile() succeed, while the original IRP_WRITE fails if I just let it continue?

Thanks a lot!!

Sorry, I meant:
#3: when I call FltWriteFile(FILE_A,…) …
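
i.e. roughly this (sketch; g_FileAInstance / g_FileAObject stand in for the file A target we keep in the stream context):

#include <fltKernel.h>

/* Stand-ins for the file A target kept per stream. */
static PFLT_INSTANCE g_FileAInstance;
static PFILE_OBJECT  g_FileAObject;

FLT_PREOP_CALLBACK_STATUS
SketchPreWriteTest3(
    PFLT_CALLBACK_DATA Data,
    PCFLT_RELATED_OBJECTS FltObjects,
    PVOID *CompletionContext
    )
{
    NTSTATUS status;
    ULONG bytesWritten = 0;

    UNREFERENCED_PARAMETER(FltObjects);
    *CompletionContext = NULL;

    /* Write the caller's buffer to file A ourselves ... */
    status = FltWriteFile(g_FileAInstance,
                          g_FileAObject,
                          &Data->Iopb->Parameters.Write.ByteOffset,
                          Data->Iopb->Parameters.Write.Length,
                          Data->Iopb->Parameters.Write.WriteBuffer,
                          0,              /* Flags */
                          &bytesWritten,
                          NULL, NULL);    /* synchronous, no callback */

    /* ... and complete the original IRP so it never reaches file B. */
    Data->IoStatus.Status = status;
    Data->IoStatus.Information = bytesWritten;
    return FLT_PREOP_COMPLETE;
}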