Hello all,
I am working on a prototype of a data transformation driver based on a minifilter. My driver only needs to support local NTFS volumes, so I am using an NTFS alternate data stream, and therefore my driver doesn’t need to change the file size.
I am running into a strange problem handling MS Word. As you know, Word writes a temporary file and renames it to the destination document at the end. Therefore, in the post-rename operation, my driver transforms the whole file and generates the alternate data stream. The problem is that, after I save and close the file, I can’t open it any more. The failure doesn’t happen every time. When it does, the error popup says that the file can’t be opened because it is corrupted, and Word asks whether I want to recover the content. If I choose yes, the content really does come back. (In case you ask, I transform data back at IRP_MJ_READ with the paging I/O flag on.)
I compared this with opening a transformed Word document while my driver is not running. In that case, the content can’t be recovered at all.
I used FileSpy.exe to compare a transformed file with an untransformed one, but couldn’t find any obvious differences (i.e. no extra IRP failures and no extra IRP operations).
BTW, if a file can’t be opened after save and close, it can never be opened afterwards. It never happens that the same file opens sometimes and fails at other times. It seems that something goes wrong when the file is written.
What causes the document to fail to open? What should I do to debug the problem and find the cause?
Thank you very much in advance for any advice.
Heidi
Hello!
I am still stuck at this problem.
My driver also has the problem with Excel. Excel.exe complains that the file has been saved but can’t be reopened. I searched this forum but didn’t find any solution.
I ran an experiment in which my prototype driver did no data modification (that is, no bit flipping), and I didn’t see these problems.
Can you help? Thank you.
I am sure I have left out some details you need in order to help me. If so, please let me know.
Heidi
BTW, I didn’t see this problem on a VM. It surfaces when I test my driver on a real machine.
> (In case you ask, I transform data back at IRP_MJ_READ with the paging I/O flag on.)
You mentioned that you only transform data back when the paging I/O flag is on. That is not enough, because data can also be read from disk with the paging I/O flag off, by non-cached I/O that is not paging I/O.
Lijun
From: “xxxxx@yahoo.com”
To: Windows File Systems Devs Interest List
Sent: Tue, January 5, 2010 8:04:45 PM
Subject: RE:[ntfsd] Can’t open a transformed WORD document sometimes
—
NTFSD is sponsored by OSR
For our schedule of debugging and file system seminars
(including our new fs mini-filter seminar) visit:
http://www.osr.com/seminars
To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer
The non-cached flag is important for transformation. Paging I/Os are always non-cached; the paging flag tells you more about where the request comes from (the memory manager or cache manager, in an arbitrary context). But anyway, I don’t think that Office apps use non-cached I/O, so this advice is not very helpful. You didn’t mention the version of Office or the file format (compound document or zipped XML, e.g. doc or docx).
Bronislav Gabrhelik
Lijun & Bronislav,
Thank you for the replies.
Sorry, I made a mistake in my problem description. I do handle the non-cached flag at read. Here is the condition:
    if (!(iopb->IrpFlags &
          (IRP_NOCACHE | IRP_PAGING_IO | IRP_SYNCHRONOUS_PAGING_IO))) {
        // we only care about non-cached and paging I/O
        leave;
    }
I also use a per-stream context to track the FileObject. At read and write, I only handle a FileObject that has a per-stream context attached.
Therefore, a read IRP must meet both of the above conditions before I handle it.
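To make the two conditions concrete, here is a minimal user-mode sketch of the decision described above. It is not the driver code: `read_needs_transform` and its `has_stream_context` parameter are hypothetical names, and the flag values are copied from wdm.h only so that the sketch compiles outside the kernel. It shows that a paging read on a FileObject without a context is skipped, which is the kind of read worth checking after a rename.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values mirrored from wdm.h so this compiles in user mode. */
#define IRP_NOCACHE               0x00000001u
#define IRP_PAGING_IO             0x00000002u
#define IRP_SYNCHRONOUS_PAGING_IO 0x00000040u

/* Hypothetical helper modelling the two conditions from the post:
 * the read must bypass the cache (non-cached or paging I/O), and the
 * FileObject must have a per-stream context attached. */
bool read_needs_transform(uint32_t irp_flags, bool has_stream_context)
{
    const uint32_t from_disk_mask =
        IRP_NOCACHE | IRP_PAGING_IO | IRP_SYNCHRONOUS_PAGING_IO;

    if (!has_stream_context) {
        return false;   /* untracked FileObject: the read is ignored */
    }
    return (irp_flags & from_disk_mask) != 0;
}

/* Usage example: the last case is the gap to look for. */
void check_examples(void)
{
    assert(!read_needs_transform(0, true));             /* cached read */
    assert(read_needs_transform(IRP_PAGING_IO, true));  /* tracked paging read */
    assert(!read_needs_transform(IRP_PAGING_IO, false));/* paging read, no context */
}
```

If a FileObject loses or never gets its context (for example the one used for the rename), every read on it falls into the last case and the data comes back untransformed.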
I don’t attach a per-stream context to the FileObject after the post-rename operation. Should I do so?
The Office version is 2007, and the file types are docx for Word and xlsx for Excel.
For the Excel case, I used FileSpy to observe and compare the IRPs for a transformed file and an untransformed file. I found that, for the transformed file, Excel issues an IRP_MJ_CREATE with a Share value of 0 (i.e. exclusive access) after the rename IRP. There is no such IRP for the untransformed file.
38 EXCEL.EXE 3172 IRP IRP_MJ_CREATE 40000884 C:\Documents and Settings\Administrator\Desktop\heidi_test.xlsx STATUS_SUCCESS FILE_OPEN CreOpts: 00400060 Access: 00120089 Share: 0 Attrib: 0 Result: FILE_OPENED
In FileSpy, I didn’t see any failed create IRPs (I set FileSpy’s path filter to *heidi_test* and *tmp*).
I wonder whether this exclusive-access IRP is triggered because the data differs from what is stored in the cache. If so, how can I work around it?
Looking forward to hearing from anybody here soon.
Heidi
You need to narrow the issue down: find out whether the data is written to disk incorrectly when you transform it, or whether the error is in transforming the data back on read.
If I were you, I would write a user-mode tool to facilitate this troubleshooting; it can also be very useful for general testing.
This tool would read the data written to disk (with your driver disabled), transform the data back (using the same logic as in the kernel), and then compare the result with the same data written to disk without your driver. If they do not match, the data was written incorrectly or the transformation is bad (both directions are suspect), and you should focus there. Otherwise, the issue really lies with the read path.
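A rough sketch of such a comparison tool is below. It assumes the transformation is a plain bit flip of every byte, as in the experiment mentioned earlier in the thread; swap in whatever inverse the driver really uses. The names `untransform_byte`, `first_mismatch` and `compare_files` are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* ASSUMPTION: the driver flips every bit, so the inverse is the same
 * operation. Replace this with the driver's real inverse transform. */
unsigned char untransform_byte(unsigned char b)
{
    return (unsigned char)~b;
}

/* Undo the transform on `transformed` and compare it byte by byte with
 * `reference`. Returns -1 if the streams match, otherwise the offset of
 * the first difference (a length difference counts as a mismatch at the
 * end of the shorter stream). */
long first_mismatch(FILE *transformed, FILE *reference)
{
    assert(transformed != NULL && reference != NULL);

    long offset = 0;
    for (;;) {
        int t = fgetc(transformed);
        int r = fgetc(reference);

        if (t == EOF && r == EOF) {
            return -1;                      /* same length, same content */
        }
        if (t == EOF || r == EOF) {
            return offset;                  /* one file is shorter */
        }
        if (untransform_byte((unsigned char)t) != (unsigned char)r) {
            return offset;                  /* content differs here */
        }
        offset++;
    }
}

/* Path-based wrapper; returns -2 if either file cannot be opened. */
long compare_files(const char *transformed_path, const char *reference_path)
{
    FILE *t = fopen(transformed_path, "rb");
    FILE *r = fopen(reference_path, "rb");
    long result = -2;

    if (t != NULL && r != NULL) {
        result = first_mismatch(t, r);
    }
    if (t != NULL) fclose(t);
    if (r != NULL) fclose(r);
    return result;
}
```

Run it with the driver unloaded, passing the file saved through the driver and a copy of the same document saved without it. A result of -1 means the on-disk data is fine and the read path is the place to look.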
You can instrument your code to monitor the file reads and check whether you are missing any transformation back. The tracing should be inclusive: do not filter out events based on conditions you are not confident of.
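One way to use such an inclusive trace, sketched in user-mode C: record every read on the file, whether or not the driver acted on it, and then count the reads that came from disk but were not transformed back. The `read_trace` record and `count_missed_reads` are hypothetical; in the driver the records would come from whatever tracing you already have, and the flag values are copied from wdm.h only so the sketch compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values mirrored from wdm.h so this compiles in user mode. */
#define IRP_NOCACHE               0x00000001u
#define IRP_PAGING_IO             0x00000002u
#define IRP_SYNCHRONOUS_PAGING_IO 0x00000040u

/* Hypothetical trace record: one per read seen on the file, logged
 * before any filtering decision is made. */
typedef struct read_trace {
    uint32_t irp_flags;    /* IrpFlags of the read */
    uint64_t offset;       /* byte offset of the read */
    uint32_t length;       /* requested length */
    bool     transformed;  /* did the driver transform the data back? */
} read_trace;

/* Count reads that bypassed the cache (so they saw on-disk, transformed
 * bytes) but were NOT transformed back. Any non-zero result points at a
 * read the filter conditions wrongly excluded. */
size_t count_missed_reads(const read_trace *log, size_t count)
{
    const uint32_t from_disk_mask =
        IRP_NOCACHE | IRP_PAGING_IO | IRP_SYNCHRONOUS_PAGING_IO;

    assert(log != NULL || count == 0);

    size_t missed = 0;
    for (size_t i = 0; i < count; i++) {
        bool from_disk = (log[i].irp_flags & from_disk_mask) != 0;
        if (from_disk && !log[i].transformed) {
            missed++;
        }
    }
    return missed;
}
```

The point of logging first and filtering later is that a read dropped by a wrong condition still shows up in the trace, so it can be counted here.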
Lijun
From: “xxxxx@yahoo.com”
To: Windows File Systems Devs Interest List
Sent: Wed, January 6, 2010 1:40:15 PM
Subject: RE:[ntfsd] Can’t open a transformed WORD document sometimes
Lijun,
Thank you very much.
You are right! I was missing some transformations back after the rename operation (i.e. some read operations on the same FileObject on which the rename happened).
That is all for now.
Again, OSROnline rocks!
Heidi