paging i/o and ValidDataLength/FileSize/AllocationSize

Hi all,

I am trying to understand the relationship between paging I/O and the three
file sizes. I found this thread,
http://www.osronline.com/showThread.cfm?link=72094, where Alexei Jelvis says
this:

File system tracks in ValidDataLength the biggest offset at which data
were written to the file by user. It allows to implement optimisation of
both read and write requests.
For read requests file system can fill part of the output buffer that lies
beyond ValidDataLength with 0s without reading data from disk.
Cache manager generates write requests in page size granularity. A part of
the page that is beyond ValidDataLength doesn’t contain any useful data so
file system may truncate actual write request if the request is originated
by the Cache Manager. If the write request comes from user or from Modified
Page Writer then it contains user’s information and ValidDataLength must be
extended.
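
The read-side optimisation described in the quote can be sketched as plain
arithmetic. This is only a user-mode model, not file system code: the type
ReadSplit and the function split_read_at_vdl are hypothetical names, and the
sketch assumes the request has already been clipped to FileSize.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: how one read is split around ValidDataLength. */
typedef struct {
    uint64_t bytes_from_disk; /* portion below VDL, must come from the media */
    uint64_t bytes_zeroed;    /* portion at or above VDL, filled with zeros  */
} ReadSplit;

/* Split a read of `length` bytes starting at `offset`.
   Assumption: the caller already clipped the request to FileSize. */
static ReadSplit split_read_at_vdl(uint64_t offset, uint64_t length,
                                   uint64_t vdl)
{
    ReadSplit s;

    assert(length <= UINT64_MAX - offset); /* offset + length must not wrap */

    if (offset >= vdl) {
        s.bytes_from_disk = 0;             /* entirely beyond VDL */
    } else if (length <= vdl - offset) {
        s.bytes_from_disk = length;        /* entirely below VDL */
    } else {
        s.bytes_from_disk = vdl - offset;  /* straddles VDL */
    }
    s.bytes_zeroed = length - s.bytes_from_disk;
    return s;
}
```

For example, a 4096-byte read at offset 0 of a file whose VDL is 1000 would
take 1000 bytes from disk and zero the remaining 3096 in the buffer.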

As I understand it, then, if paging i/o comes from the lazy writer, it might
specify a byte range that goes beyond VDL or FileSize. However, the file
system should ignore anything beyond VDL. The file system should never
extend the file by making either VDL or FileSize bigger.

If the paging i/o comes from the MPW, it might specify a byte range that
goes beyond VDL. From what I have read elsewhere, MPW guarantees that it
will never specify a range beyond FileSize. If the MPW paging i/o write
does go beyond VDL, then the file system must assume that any information
beyond VDL is valid, and extend VDL, but only up to FileSize.

Finally, since AllocationSize is always at least as large as FileSize, and
since paging i/o can never extend FileSize, this means that paging i/o can
never extend AllocationSize either.
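
The rules summarised above can be written down as a small user-mode model.
It is a sketch of the understanding stated in this post, not what any real
file system does: FileSizes, PagingWriteOrigin and clamp_paging_write are
hypothetical names, and a real implementation would also have to zero any gap
between the old VDL and the start of the write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag for who issued the paging write. */
typedef enum { FROM_LAZY_WRITER, FROM_MAPPED_PAGE_WRITER } PagingWriteOrigin;

/* Hypothetical container for the three sizes discussed in the thread. */
typedef struct {
    uint64_t valid_data_length;
    uint64_t file_size;
    uint64_t allocation_size;
} FileSizes;

/* Returns how many bytes of the paging write should reach the disk.
   Lazy writer: truncate at VDL, change nothing.
   MPW: truncate at FileSize, advance VDL up to the end of the write.
   FileSize and AllocationSize are never modified here. */
static uint64_t clamp_paging_write(FileSizes *fs, PagingWriteOrigin origin,
                                   uint64_t offset, uint64_t length)
{
    uint64_t limit, end;

    assert(fs->valid_data_length <= fs->file_size);
    assert(fs->file_size <= fs->allocation_size);
    assert(length <= UINT64_MAX - offset); /* offset + length must not wrap */

    limit = (origin == FROM_LAZY_WRITER) ? fs->valid_data_length
                                         : fs->file_size;
    if (offset >= limit) {
        return 0;                          /* nothing to write */
    }

    end = offset + length;
    if (end > limit) {
        end = limit;                       /* drop the tail beyond the limit */
    }

    /* Simplification: a real file system must zero [old VDL, offset) first. */
    if (origin == FROM_MAPPED_PAGE_WRITER && end > fs->valid_data_length) {
        fs->valid_data_length = end;       /* never exceeds file_size */
    }
    return end - offset;
}
```

With VDL = 1000 and FileSize = 5000, a lazy-writer page at offset 0 is cut to
1000 bytes, while an MPW page at offset 4096 is cut to 904 bytes and moves VDL
to 5000.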

Do I understand correctly? Thanks.

=================================================
Roger Tawa
http://tawacentral.net/
[One thing about paradigms: shift happens.]
[When you stop, you’re done.]

> As I understand then, if paging i/o comes from the lazy writer, it might
> specify a byte range that goes beyond VDL or FileSize.

Paging IO can never extend EOF.

Paging IO from the Cache Manager can extend neither VDL nor EOF - I think Cc
is smart enough to never send down writes going beyond VDL.

> If the paging i/o comes from the MPW, it might specify a byte range that
> goes beyond VDL.

Yes. The FlushViewOfFile call can also do this.

> From what I have read elsewhere, MPW guarantees that it
> will never specify a range beyond FileSize.

Yes. You cannot memory-map a file beyond EOF, Cc also cannot.

> Finally, since AllocationSize is always at least as large as FileSize, and
> since paging i/o can never extend FileSize, this means that paging i/o can
> never extend AllocationSize either.

Yes.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

I’d actually make a subtle point here. Max says “paging I/O can never
extend EOF” but in fact I believe he means “paging I/O write can never
extend EOF”. The distinction is that there ARE length-modifying
operations (the cache manager sends down set EOF calls, for example) that
occur with the IRP_PAGING_IO bit set, but these are not write operations.

However, the point underlying this (paging I/O write does not extend
file sizes) is that if you are constructing a physical media file system
you’d have to have your block allocator in non-paged pool in order to
handle such extending writes. While this might not sound like much,
imagine your disk drive is a 4TB volume with a modest cluster size - the
entire volume bitmap wouldn’t fit into memory, and you can’t really go
page faulting it in while processing a page fault - that just doesn’t
work.
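
To put a rough number on the bitmap, here is a small arithmetic sketch. The
cluster sizes are assumptions chosen for illustration (the post only says
"modest"), the volume is taken as 4 TiB, and bitmap_bytes is a hypothetical
helper that assumes one bit per cluster.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for an allocation bitmap with one bit per cluster. */
static uint64_t bitmap_bytes(uint64_t volume_bytes, uint64_t cluster_bytes)
{
    uint64_t clusters;

    assert(cluster_bytes != 0);
    clusters = volume_bytes / cluster_bytes;
    return (clusters + 7) / 8;   /* round up to whole bytes */
}
```

Under these assumptions, bitmap_bytes(1ULL << 42, 4096) gives 134217728 bytes
(128 MiB) and bitmap_bytes(1ULL << 42, 512) gives 1073741824 bytes (1 GiB),
all of which would have to stay resident if the allocator could not take a
page fault.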

Not every file system has this restriction, but because it is a
restriction of physical file systems, it was cooked into the underlying
system - and this is one of those guarantees that would be difficult to
change at this point.

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

Looking forward to seeing you at the next OSR File Systems class in
Boston, MA April 24-27, 2006.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Maxim S. Shatskih
Sent: Saturday, January 14, 2006 8:58 PM
To: ntfsd redirect
Subject: Re: [ntfsd] paging i/o and
ValidDataLength/FileSize/AllocationSize


Thanks Max, Tony.

I would go further again than Tony and say “a paging I/O write
*should* never extend EOF”. The reason I replace “can” with “should”
is that fastfat, for example, actually validates the byte range of a
paging i/o write to make sure it does not go beyond VDL and/or
FileSize, and then does the right thing. From reading the code, this
seems to be more than just defensive programming; it seems to be
handling a case that happens under normal conditions.

“Doing the right thing” then depends on knowing whether the paging i/o
write comes from the lazy writer or the MPW. Fastfat does this by
remembering the thread that called FatAcquireFcbForLazyWrite() in the
FCB. I guess that is the recommended practice for doing so? One
could not, for example, set the top level IRP to a particular value
instead?

Thanks.

=================================================
Roger Tawa
http://tawacentral.net/
[One thing about paradigms: shift happens.]
[When you stop, you’re done.]

> write comes from the lazy writer or the MPW. Fastfat does this by
> remembering the thread that called FatAcquireFcbForLazyWrite() in the
> FCB. I guess that is the recommended practice for doing so? One
> could not, for example, set the top level IRP to a particular value
> instead?

NTFS, for example, does this by checking IoGetTopLevelIrp. So I assume
there is no recommended way to do this; you just have to do it correctly.

L.
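
The two ways of recognising the lazy writer mentioned in this thread can be
compared in a small user-mode model. Nothing here is kernel code: Fcb,
ThreadContext, the marker values and all four functions are hypothetical
stand-ins. The top_level_marker field only imitates the per-thread slot that
IoGetTopLevelIrp reads, and the value 2 is an arbitrary choice for this
sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t ThreadId;                 /* stand-in for a thread pointer */

#define TOP_LEVEL_NONE       ((uintptr_t)0)
#define TOP_LEVEL_LAZY_WRITE ((uintptr_t)2) /* arbitrary marker for the model */

typedef struct {
    ThreadId lazy_write_thread;  /* 0 when no lazy write is in progress */
} Fcb;

typedef struct {
    ThreadId  id;
    uintptr_t top_level_marker;  /* imitates the per-thread top-level slot */
} ThreadContext;

/* Models the acquire-for-lazy-write callback: leave both hints behind. */
static void acquire_for_lazy_write(Fcb *fcb, ThreadContext *t)
{
    assert(fcb->lazy_write_thread == 0);
    fcb->lazy_write_thread = t->id;
    t->top_level_marker = TOP_LEVEL_LAZY_WRITE;
}

/* Models the matching release callback: clear both hints. */
static void release_from_lazy_write(Fcb *fcb, ThreadContext *t)
{
    assert(fcb->lazy_write_thread == t->id);
    fcb->lazy_write_thread = 0;
    t->top_level_marker = TOP_LEVEL_NONE;
}

/* Approach 1: compare the current thread with the one stored in the FCB. */
static int is_lazy_writer_by_thread(const Fcb *fcb, const ThreadContext *t)
{
    return fcb->lazy_write_thread != 0 && fcb->lazy_write_thread == t->id;
}

/* Approach 2: look at the marker left in the thread's top-level slot. */
static int is_lazy_writer_by_top_level(const ThreadContext *t)
{
    return t->top_level_marker == TOP_LEVEL_LAZY_WRITE;
}
```

In this model both checks agree: they are true only for the thread that went
through acquire_for_lazy_write and false again after the release. The first
keeps the state per file, the second per thread.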