When does the ValidDataLength increase with the EndOfFile?

During my test, I found a strange event.
It is an IRP_MJ_SET_INFORMATION(FileEndOfFileInformation) with AdvanceOnly set to FALSE; the request tries to set the EOF of a file (AllocationSize = FileSize = ValidDataLength = 0) to 0x200. In the completion routine of this request, I found AllocationSize = FileSize = ValidDataLength = 0x200.
I thought only AdvanceOnly = TRUE would increase the ValidDataLength with the EOF, so how can this request increase the ValidDataLength with AdvanceOnly = FALSE?
Can anyone help me with this? Any help will be much appreciated!

What’s the file system below you?

Are you sure that a write has snuck in?

On the other hand, VDL belongs to the file system, and it can set it to
whatever it wants. It's just a cursor that allows some short-cutting in
the Cache Manager and the FSD's read path. I could imagine that a file system
might take an extension of 512 bytes and just zero the space out anyway. At
that stage it is quite OK to have a VDL of 512 (look at FAT; that is what FAT
does on a cleanup). Then when it got the write it wouldn't have to update
the VDL (which might be an issue if that involves taking a hot lock). It
might be a sensible way of behaving if the file system persisted the VDL.

It is on NTFS; in fact I have tested the same operation on FAT, and it runs quite well - the VDL remains the same, as I supposed.
But I cannot really treat it as a random file system operation, because every time I execute the previous operation, the VDL increases with the EOF.
Besides, I am nearly sure that no write operation occurred; in fact I'm writing an encryption filter, and my filter owns that file object, so I would know if there were another write operation.

I just wrote a very simple program to check this behavior.
By calling SetFilePointer with lDistance = 512 in an exe, it leads to an IRP_MJ_SET_INFORMATION(FileEndOfFileInformation) with AdvanceOnly == FALSE. This request increases AllocationSize = FileSize = ValidDataLength from 0 to 0x200. Can anyone explain to me why the VDL increased with the EOF? I really need some help; any help will be much appreciated!

I found that whether the VDL is increased or not depends on the value of the EOF: if the EOF is smaller than 0x2d0 (720) bytes, the VDL is increased with the EOF; otherwise, the VDL remains the same.
So I guess that maybe in NTFS the content (when less than 0x2d0 bytes) is stored in the MFT, and this is why the VDL increased with the EOF? Am I right?

> So I guess may be in NTFS, the content (for less than 0x2d0 bytes) will be stored in MFT

Looks like really so.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

NTFS supports a feature called “resident data”, where the file data for small files is stored inside the MFT record itself. What you are seeing is that the VDL is moved because the entire region has already been zeroed out, since it is internal to the NTFS metadata. FAT has no such concept.

FAT moves VDL during the last cleanup call. NTFS does not.

Thus, the individual file systems do not have uniform behavior.

“AdvanceOnly” is used by the VM system to ensure that set EOF calls do not truncate. This option has no impact on VDL at all.

The file system is free to move the VDL as it sees fit. For example, if you had an SSD drive and the file system knows it has just allocated space from the SSD that was previously trimmed, there would be no reason not to move the VDL out to the EOF - after all, such a file system would know the space has no useful data content (it should be zero filled but certainly won’t have useful data within it).

What is unfortunate is that there is no programmatic way for you to obtain the current VDL from the underlying file system, so you never know what might occur when you send down a write operation.

Tony
OSR

NTFS does not use a fixed size for the MFT record - it’s determined by the format utility at the time the original format is written. Further, there are other variable size attributes that can be stored inside the MFT record, so that number is true *for the scenario you tested* but is not a universal constant.

Tony
OSR

Thanks Tony, you do help me a lot.
In fact, I’m writing an encryption filter based on a minifilter with the shadow file object technique; each encrypted file has a header in it (the header is 512 bytes). Given what you have mentioned, how should I handle the VDL? Because, you know, I own the file object and its SCB, and I need to make sure the value in VDL is the right one (not the real VDL plus HEADER_SIZE).

As Tony and Rod have pointed out, VDL is used differently and set
differently based on the file system. That said, since you own the file
object and the scb then set the VDL to whatever you would like it to be,
<= EOF. Setting it == EOF is fine unless you want to perform some
internal tracking with it in your implementation.

Pete

On 11/13/2014 6:21 PM, xxxxx@serpurity.com wrote:



Kernel Drivers
Windows File System and Device Driver Consulting
www.KernelDrivers.com
866.263.9295



There are several ways you can deal with this. Here are two that come to mind:

(1) Don’t depend on the VDL of the underlying file system. This is really the best option - you track what your (logical) VDL is. Look at FastFAT - it actually tracks two VDLs as it is (one “in memory” VDL and one “on disk” VDL). Why should your VDL have anything to do with the VDL of the underlying file system?
(2) Push the file size up - NTFS will make the Data non-resident. Even when you truncate it back, the data will remain non-resident.

The real reason this becomes a problem is that if you do a (non-paging) write past VDL, the underlying file system will zero the region from its VDL to the offset of the write. I’ve had to deal with this in our own work, where our physical storage component sees the re-entrant paging I/O operations and then does the logical-to-physical mapping. This is tied into some complexities with the way that NTFS and FAT detect the VM background threads. We ensure that we always handle zeroing the data region so that NTFS/FAT don’t do it. The exception here is that NTFS persists VDL and FAT does not. During IRP_MJ_CLEANUP, FAT may zero from the “on disk VDL” to “EOF”. This is because FAT does not persist VDL.

Unfortunately, there’s no file system attribute bit indicating persistent VDL. Thus, we end up using unsavory techniques to figure out if it has persistent VDL - notably, we look to see if the name of the file system is NTFS. I loathe doing that, but this is a performance optimization (it’s not incorrect to zero VDL to EOF, it’s just slower than the native file system).

Tony
OSR

Thanks Tony.
As you have mentioned, these two ways can solve my problem, but I still have some questions.
For the 1st way:
If I track my own VDL (and do not depend on the VDL of the underlying file system), should I save this value to scb->Header.ValidDataLength (FSRTL_COMMON_FCB_HEADER) or to scb->ValidDataLength (a self-created member)? You know, I have added a header (512 bytes long) to each encrypted file, and in my previous design I subtracted 512 bytes from FileSize and ValidDataLength. Should I subtract 512 bytes from these values, or just leave the right value?
For the 2nd way:
I have thought of this way, but I wonder: what is the critical size for the Data attribute to become non-resident?

Store it wherever you wish to store it. The nice thing is that the common header field belongs to you, so if someone ELSE looks in it, they’re cheating, and if they break, it’s not a bug in your software. With that said, you are best off mimicking the behavior of the existing file systems.

As to what value, the question is “what value do you want another filter to see?” I’d say that this is not going to be the physical VDL, but rather your logical VDL (which you indicate differs by 512 bytes).



Heuristically, this has never been larger than 4KB. While this could break, it won’t break for anything that’s been shipped before. Currently, NTFS will use a 1KB MFT record, but prior to Windows 2000 it was 4KB. It’s possible it will change in the future, but you have time to adjust for that when it occurs.

Tony
OSR


NTFSD is sponsored by OSR

OSR is hiring!! Info at http://www.osr.com/careers

For our schedule of debugging and file system seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Thanks Tony, your advice is useful as usual ;D

> The exception here is that NTFS persists VDL and FAT does not.

Non-persistent VDL (like in FAT) was always a kind of mystery to me.

I can understand one of the possibilities for why it is needed - for cached file-growing CcCopyWrite, like AdvanceEOF + WriteToCache + AdvanceVDL - but I’m not sure my understanding is correct.

How is non-persistent VDL used, and what advantages does it provide? Why not just maintain it always == EOF?


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

For example, imagine one situation:
A program intends to write something into an empty file; it first sets the EOF to 0x100, but then writes only 0x50 bytes of content into the file. The remaining bytes (from 0x50 to 0x100) are supposed to be filled with zeros. To make sure they are, we either set the VDL to 0x50 to record that only 0x50 bytes of content were written to the file (persistent VDL), or set the VDL to 0x50 and later call CcZeroData (then set the VDL to EOF) for the content between 0x50 and 0x100 (non-persistent VDL).
By using non-persistent VDL, we do not need to set the VDL to EOF and call CcZeroData on every write; we just need to handle it in the cleanup routine.
This is my understanding of VDL; it may not be correct. Hope it is helpful ;D

Probably you’re correct.

What are the mandatory invariants of VDL? I know that a) the cache does not extend beyond EOF, b) EOF cannot be changed by paging I/O, and c) VDL cannot be changed by non-top-level I/O.

Is the range [VDL…EOF) cached? Will Cc substitute zeroes automatically for CcCopyRead calls to this range?

What about CcCopyWrite to such a range? Will Cc send SetInfo/VDL to the FSD itself?


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com


a) Is the range [VDL…EOF) cached?
I believe that whether the range [VDL…EOF) is cached or not depends on paging-I/O read requests; besides, if the file can be cached but is currently entirely out of the cache, how much of the content is read into the cache still depends on how the lower file system handles it.
b) Will Cc substitute zeroes automatically for CcCopyRead calls to this range?
Normally there won’t be a CcCopyRead trying to read this range, because drivers/filters check the VDL and only read below it. If CcCopyRead were called manually and forced to read this range, I think there would be random data in it. (But I’m not sure about this; I think it also depends on the internal implementation of the Cc APIs.)
c) What about CcCopyWrite to such a range?
CcCopyWrite does not need to send SetInfo/VDL to the FSD itself; Cc itself can extend the VDL and FileSize, can’t it?

One reason I’ve seen is that if you zero the region you can observe timeout situations. The first time I saw this was across the network on a 1GB file - zeroing it took so long at the time that the SMB operation timed out and the operation failed, even though it actually did complete on the server.

Another reason is that I might want to ensure that I have sufficient room to actually perform a given operation before I start that operation. So I set the EOF to the anticipated size and then I start writing data out to the file. If the file system zeros the region on disk, there is a 100% increase in unnecessary I/O. This also happens with memory-mapped files when the section is first created (recall the fast I/O function and then the filter callback for create section? They’re there because Mm will actually set the EOF of the file during section setup.)

This isn’t the only approach for solving this problem. In my Episode days (more than 20 years ago now) we used an old/new value log system and we would insert transaction records to zero the region of the disk as the *undo* operation, with no corresponding redo operation. When the user data was finally flushed to that region of the disk, we would commit the transaction (so on replay, we did nothing). It actually made dealing with disk scavenging issues surprisingly cheap to deal with. New value only log systems like NTFS and ReFS don’t have that sort of semantic, so they have to accommodate this differently.

But bottom line: eliminating unnecessary I/O to disk is the real driver here. The last thing a file system wants to do is I/O, and the worst I/O is synchronous I/O. Zeroing those regions on the drive is that worst possible I/O: synchronous and typically unnecessary.

Tony
OSR

We’ve spent a lot of time lately dealing with VDL. In fact, the VDL can be maintained and tracked at different layers within the system and they aren’t necessarily the same thing.

Paging I/O can move the VDL. The lazy writer should not. This is because the lazy writer’s sense of VDL comes from the FSD, so it would be peculiar if its sense of VDL were different from the FSD’s. Mm, on the other hand, has no such constraint and may move the VDL as it flushes pages back from the application program.

But the in-memory VDL may not match the on-disk VDL - and it is the on-disk VDL that really matters, because it is that value that protects against information exposure.

It may be cached, but there is no requirement that this be the case. If it is cached, then Cc will in fact zero that region. But if you use non-cached user I/O the VDL is still tracked. FAT will zero it on cleanup.

No, because it is the file system that is calling CcCopyWrite. When the VDL moves, the FSD reports this fact to the Cache Manager as well, either implicitly (CcCopyWrite) or explicitly (CcSetFileSizes). The one exception is the case in which the cache manager doesn’t know the file size and wants to flush data - in that instance, the cache manager sends a set information IRP with the updated EOF (AdvanceOnly = TRUE). Ironically, the file systems ignore this call.

Tony
OSR