FileSize is 0 during write on Windows Server 2008 x64

All,

My application is trying to write 38 bytes of data on Windows Server 2008 x64. When I receive the paging I/O call in my write dispatch routine, I get FileSize = 0. Here is what I see:

Allocation size = 0x28
*FileSize = 0*
ValidDataLength=0
Offset = 0
Length=0x1000

How do I find out that the actual data written is 0x26 bytes?

When I debugged this on Windows Server 2003 x64, I do get FileSize = 0x26.

Why is there this difference between the two OS versions? Do I need to handle setting the file size in my driver for this scenario, or is there some other way to get the file size on Windows Server 2008 during the write?

thanks in advance
Rajesh

I don’t really understand your question. Are you an FSD? A filter?

What kind of IO do you see prior to the paging IO?

Where are you getting the allocation size and file size from?

Sorry about the confusion. I forgot to mention that I am debugging a file system filter driver, and I am getting all this information from the FCB during the paging I/O write call.

The first I/O in my driver is the regular cached write (which is 0x26 bytes).

Immediately after this I/O, I see a paging I/O with the following information in the FCB:

Allocation size = 0x28
*FileSize = 0*
ValidDataLength=0
Offset = 0
Length=0x1000

So here is the problem I am facing: I need the actual data size in the paging I/O write.
In this example, the actual data size is 0x26 bytes.

On Windows 2003/XP, FCB->FileSize is already updated at the time of the first paging I/O, and I was using FCB->FileSize to get the actual data length. But on 2008/Vista, FCB->FileSize is 0 and is not updated in time.

What is another way of getting the actual data size during paging I/O?

thanks in advance.

Can you explain in more detail what you are trying to accomplish? I assume you are peeking into FsContext, which is technically opaque - I would advise against doing this.

I am trying to perform encryption in the write path. To perform the encryption I need the actual data size in the paging I/O.

The encryption algorithm uses the actual data size. So during paging I/O, I was passing the data size as 4K, which is the paging I/O length. Everything is aligned and encryption works fine.

But when a read comes in, it only tries to read 0x26 bytes, and decryption fails.

It seems the encryption/decryption code uses a different algorithm if the data is less than 512 bytes.

Because of this I see data corruption. So we have been using FCB->FileSize to get the actual data size on 2003 and XP.

In general you will need to do encryption/decryption on non-cached IO. You will need to pad the data to the proper encryption block granularity. This also means that you generally need a way of knowing the real size of the file out of band. I can tell you from personal experience that the only way to get this to work is to use a layered FSD. Either buy the OSR kit (my management didn’t want to spend the money on this) or prepare for the long haul.

I have a fully working driver which does encryption/decryption. We have a product out in the market, and it works with all kinds of real-time applications/DBs etc.

While porting to 2008/Vista, I ran into this issue. So it seems I will have to pad the data to the encryption block size, which is 512 bytes in my case. In other words, I cannot peek into the FsContext structure to get the file size, allocation size, valid data length, etc.

If I query FileStandardInformation during paging I/O, will that give me the updated file size? I mean to say:

  1. File is created
  2. First cached I/O on the file, which is 0x26 bytes
  3. First paging I/O as a result of the cached I/O (step 2). I will query the standard information. Will this give me the file size as 0x26 bytes?

If not, then when does the file size get set? I am not seeing any SetInformation to set the file size either.

> In other words, I cannot peek into the FsContext structure to get the file size, allocation size, valid data length, etc.

Of course you cannot. It is not your data structure. The FSD may choose to
share it with Cc, and it may choose to put something into FsContext, but I
don’t believe that that is in any “contract” that the FSD has. So any
assumptions about what someone else’s data structure looks like will always
cause you grief eventually… It sounds as though the piper has turned up
with the bill.

> If I query FileStandardInformation during paging I/O, will that give me the updated file size?

It may well give you a deadlock (if not in this release in some future one)

> If not, then when does the file size get set? I am not seeing any SetInformation to set the file size either.

Not even AdvanceOnly? What about extending writes?

It feels to me that if you have to know what the *eof* is, you are going to
have to try to keep a shadow copy of it. This of course is far from easy in a
filter environment…

> I have a fully working driver which does encryption/decryption. We have a product out in the market, and it works with all kinds of real-time applications/DBs etc.

Doesn’t sound that way to me. In the past you cheated, by looking into an
opaque structure.
My point is that because you are insisting that it used to work, you are
perhaps not looking at the fact that things have changed, and that you need
a new approach.


NTFSD is sponsored by OSR

For our schedule of debugging and file system seminars
(including our new fs mini-filter seminar) visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Thanks a lot everyone for the replies.

Sounds like I have been caught cheating :slight_smile: I think in this case the only solution I have is to pad the data up to the next encryption block.

thanks again everyone.

Hi Rod,

“Not even AdvanceOnly? What about extending writes?”

What do you mean by extending writes? Do you mean pad the data and increase the file size?

"It feels to me that if you have to know what the *eof* is you are going to hveto try to keep a shadow copy of it. This of course is far from easy in a filter environment… "

I already have a structure for every file in my driver. So during the regular I/O, I know that the application is trying to write 0x26 bytes. I can keep track of the size with the help of the offset. When the paging I/O for that file is issued, I can use the tracked file size to perform the encryption. Is that what you mean by a shadow copy?

As I said above, unless you are in full control of caching, knowing the real size of the file in a filter at any particular moment in time is basically impossible. The correct synchronization cannot be done.

Once you start padding data, you are changing the size of the file on disk, thus you need to hide this change from people above you. This also means that you have to persist the old size of the file somehow for later reads.

Although paging IO is done on block granularity, paging IO can never extend EOF and thus unless you extend EOF somehow your padding will get lost.

This is very complicated stuff. I would recommend studying the FAT code very carefully. Your current code works purely by luck. If your company can afford it I would really recommend talking to OSR about this - I wish I had been able to do that!!!

Thanks Matt for the comments.

Yes, that is what I understood. If I pad the data, it will definitely change the file size and add more complications.

Can we associate a paging I/O with a regular I/O? If I can find out that certain paging I/Os were the result of a given regular I/O, then I can track the actual length from the regular I/O. I am already tracking all the open files on the system with a small structure for other reasons, using the file context in Filter Manager. I was wondering if I can track the actual data size (which is the length) and the offset in the regular I/O and then somehow use them in the paging I/O. But if the data is more than a page (more than 4K), I am not sure whether the paging I/Os will be issued in order. If not, I wouldn’t know whether the first paging I/O has 4K of actual data or the second one. :frowning:

Also, Rod mentioned the “shadow copy” approach. Does he mean make a shadow copy and use that device object to find out the file size? Like we have been using shadow file objects for reading/writing the files to avoid loopback.

Also, does Filter Manager help in resolving this issue somehow?

I understand it’s getting complicated. Well, I am certain my company will not spend the money to talk to OSR :frowning: I wish I could do that as well.

I know these are tons of questions, and I really appreciate everyone’s quick responses and all the help.

xxxxx@gmail.com wrote:

> Thanks Matt for the comments.
>
> Yes, that is what I understood. If I pad the data, it will definitely change the file size and add more complications.
>
> Can we associate a paging I/O with a regular I/O? If I can find out that certain paging I/Os were the result of a given regular I/O, then I can track the actual length from the regular I/O. I am already tracking all the open files on the system with a small structure for other reasons, using the file context in Filter Manager. I was wondering if I can track the actual data size (which is the length) and the offset in the regular I/O and then somehow use them in the paging I/O. But if the data is more than a page (more than 4K), I am not sure whether the paging I/Os will be issued in order. If not, I wouldn’t know whether the first paging I/O has 4K of actual data or the second one. :frowning:

You cannot directly associate a given cached/non-cached I/O with a
paging I/O, since multiple processes could be modifying a file within the
same range and you would only see a single set of paging I/O requests.

What you can do, and this does get quite complicated, is track the
non-paging I/O requests and modify the file size to accommodate your
padding at that time. This requires extending the file correctly during
file-extending writes in the non-paging path. This will allow you to
have the correct file sizes during paging I/O processing. You also
need to deal with the cases of extension and truncation through the
SetFileInformation calls.

We have implemented many data-modification filter drivers, and getting
this right takes a lot of forethought and testing.

> Also, Rod mentioned the “shadow copy” approach. Does he mean make a shadow copy and use that device object to find out the file size? Like we have been using shadow file objects for reading/writing the files to avoid loopback.

Rod is referring to actually having a separate file that contains the
modified data and the actual sizes, and then having the file exposed to
the upper callers be a ‘stub’ file of the correct size, etc. You then
redirect I/O requests accordingly to the shadow file. Again, this is not
a huge step in making things less complicated, just shifting it around.
You will basically need a layered file system in this approach to manage
the shadow files.

> Also, does Filter Manager help in resolving this issue somehow?

No, Filter Manager does not make things easier in this regard.

Pete





Kernel Drivers
Windows File System and Device Driver Consulting
www.KernelDrivers.com
866.263.9295

Thanks a lot Peter for the info and detailed answers.

Another stupid question.

As the file is already created and I am in the paging write path, I should be able to get the file size (standard/basic file information) using the shadow device object. Because the regular I/O has already finished, the file is created and the file size will have been set during the regular I/O, so I should be able to query the file size using the shadow device object and IRP_MJ_QUERY_INFORMATION with FileStandardInformation. If I can get the current file size, I can tweak my encryption algorithm and it may work.

I will look into the other solutions and see which one will be quicker and better for us. I really appreciate all the help.

xxxxx@gmail.com wrote:

> Thanks a lot Peter for the info and detailed answers.
>
> Another stupid question.
>
> As the file is already created and I am in the paging write path, I should be able to get the file size (standard/basic file information) using the shadow device object. Because the regular I/O has already finished, the file is created and the file size will have been set during the regular I/O, so I should be able to query the file size using the shadow device object and IRP_MJ_QUERY_INFORMATION with FileStandardInformation. If I can get the current file size, I can tweak my encryption algorithm and it may work.
>
> I will look into the other solutions and see which one will be quicker and better for us. I really appreciate all the help.

Be careful how you query the file information in the paging path.
You will be running at APC_LEVEL, because resources will be held when you
are called.

Pete





> real size of the file out of band. I can tell you from personal experience that the only way to get this to work is to use a layered FSD.

If you’re meddling with the file size - then yes.

Another design is to keep the last CBC block of the file somewhere in a database of your own (or the registry), together with the file encryption key, and keep the file size the same as in the cleartext case.

In this case, any write which crosses into the last CBC block must save this “crossed” CBC block to the database, and any read should retrieve it from there.

With such design, the filter becomes much, much simpler.

I think that other “perverted CBC” approaches are possible for file encryption without meddling with the size and without keeping the last block in a database. I even think that several years ago one such approach was discussed here on the forums.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com