Cached / Non-Cached Coherency

Hi,

What is the guarantee that Windows provides regarding simultaneous cached / non-cached IO?

Here is the scenario:

  1. Open a file and do a bunch of cached writing.
  2. Open another handle to the file using direct IO.
  3. Do direct reads.

From my own testing, and from looking at the source, it looks like FAT handles this case by forcing a flush of cached data before allowing the direct read.
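The flush-before-direct-read behaviour described above can be pictured with a toy user-mode model. This is only a sketch: `ToyFile`, `toy_cached_write`, `toy_noncached_read` and the 16-byte file are hypothetical names invented for illustration, and nothing here is taken from the FAT or NTFS sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_FILE_SIZE 16

/* Toy model (hypothetical): one "disk" image and one cache image of a file. */
typedef struct {
    unsigned char disk[TOY_FILE_SIZE];
    unsigned char cache[TOY_FILE_SIZE];
    int dirty;        /* nonzero while the cache holds data not yet on disk */
    int flush_count;  /* number of flushes the model has performed */
} ToyFile;

static void toy_init(ToyFile *f) { memset(f, 0, sizeof *f); }

/* Write dirty cached data back to the disk image. */
static void toy_flush(ToyFile *f)
{
    if (f->dirty) {
        memcpy(f->disk, f->cache, TOY_FILE_SIZE);
        f->dirty = 0;
        f->flush_count++;
    }
}

/* Cached write: lands in the cache only and marks it dirty. */
static void toy_cached_write(ToyFile *f, size_t off, const void *src, size_t len)
{
    assert(off <= TOY_FILE_SIZE && len <= TOY_FILE_SIZE - off);
    memcpy(f->cache + off, src, len);
    f->dirty = 1;
}

/* Non-cached read: flush any dirty cached data first, then read the disk. */
static void toy_noncached_read(ToyFile *f, size_t off, void *dst, size_t len)
{
    assert(off <= TOY_FILE_SIZE && len <= TOY_FILE_SIZE - off);
    toy_flush(f);
    memcpy(dst, f->disk + off, len);
}
```

In this model a direct read issued right after a cached write always sees the new bytes, because the read path itself triggers the flush.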

However, the same test is failing on NTFS and I’m wondering if I am doing something wrong or if NTFS does not implement this type of flushing before non-cached IO.

Thanks,
Matt

You are doing something wrong; NTFS definitely follows the same model.


Don Burn (MVP, Windows DDK)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

Thanks, I see the problem. Working on fixing.

> 1) Open a file and do a bunch of cached writing.
> 2) Open another handle to the file using direct IO.
> 3) Do direct reads.

If the FCB was ever opened cached, then the noncached write to it is actually a cached write with immediate flush.
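As a rough illustration of that rule, here is a toy model in which a non-cached write to a file that has ever been cached goes through the cache and is flushed straight away. `ToyFcb`, `ever_cached` and `toy_noncached_write` are hypothetical names for this sketch, not fields or routines from any real FSD.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SIZE 16

/* Toy stand-in (hypothetical) for per-file state shared by all handles. */
typedef struct {
    unsigned char disk[TOY_SIZE];
    unsigned char cache[TOY_SIZE];
    int ever_cached;  /* set once any handle has done cached IO */
} ToyFcb;

/* Non-cached write as described above: if the file was ever cached, update
 * the cache and flush that range immediately; otherwise go straight to disk. */
static void toy_noncached_write(ToyFcb *fcb, size_t off,
                                const void *src, size_t len)
{
    assert(off <= TOY_SIZE && len <= TOY_SIZE - off);
    if (fcb->ever_cached) {
        memcpy(fcb->cache + off, src, len);              /* cached write   */
        memcpy(fcb->disk + off, fcb->cache + off, len);  /* immediate flush */
    } else {
        memcpy(fcb->disk + off, src, len);               /* cache untouched */
    }
}
```

The point of the detour is that the cache never ends up holding older data than the disk for that range.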


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

So, this issue that I am working on is somewhat complicated and I have a few questions if anyone cares to answer.

Part of my filter operation is to change the size of the underlying file transparently to the application. In order to make sure that paging IO always has a place to go (because it can’t extend EOF), I update EOF when I see a cached write that is beyond current EOF.

The issue I am seeing is that when the lazy writer comes through, it does not write up to EOF, but stops at VDL. From looking at the FAT source this seems to be a rule. Lazy writes cannot extend beyond VDL. (I am curious why this is the case but currently I am just accepting this as fact).

In FAT, if this situation occurs the FSD raises STATUS_FILE_LOCK_CONFLICT. Though it looks like in NTFS the write is just truncated.

Is there any good way of handling this situation? By extending EOF I have guaranteed that there is allocation for the paging IO to complete. Though when I do the offset to account for my header this goes beyond VDL.

Some ways I am thinking about:

  1. After setting EOF, also use IRP_MJ_SET_INFORMATION to set VDL. (I don’t think FAT supports this type of IRP from looking at source but I’m guessing NTFS does). I’m not sure what side effects this has. I will try this tomorrow.

  2. After a cached write that extends EOF, manually flush the file so that I get synchronous paging IO that can write up to EOF. Meanwhile fail any lazy write activity with lock conflict.

Thanks,
Matt

> stops at VDL. From looking at the FAT source this seems to be a rule. Lazy writes cannot extend
> beyond VDL. (I am curious why this is the case but currently I am just accepting this as fact).

I think this is just the definition of VDL. Paging writes cannot go beyond EOF, and lazy writes cannot go beyond VDL. The FSD needs to extend VDL manually to allow lazy writes of the tail of the grown file.
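The two limits can be written down as a small clamp function. This is a sketch under assumptions: `ToySizes`, `WriteKind`, `toy_permitted_length` and `toy_extend_vdl` are made-up names, and the model only tracks two numbers, not anything a real FSD keeps.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { WRITE_PAGING, WRITE_LAZY } WriteKind;

/* Hypothetical size bookkeeping: VDL never exceeds EOF. */
typedef struct {
    uint64_t eof;  /* end of file */
    uint64_t vdl;  /* valid data length */
} ToySizes;

/* How many bytes of [offset, offset + length) the write may cover:
 * paging writes are limited by EOF, lazy writes by VDL. */
static uint64_t toy_permitted_length(const ToySizes *s, WriteKind kind,
                                     uint64_t offset, uint64_t length)
{
    assert(s->vdl <= s->eof);
    uint64_t limit = (kind == WRITE_LAZY) ? s->vdl : s->eof;
    if (offset >= limit)
        return 0;
    uint64_t room = limit - offset;
    return length < room ? length : room;
}

/* The FSD extending VDL explicitly (never past EOF, never backwards). */
static void toy_extend_vdl(ToySizes *s, uint64_t new_vdl)
{
    if (new_vdl > s->eof)
        new_vdl = s->eof;
    if (new_vdl > s->vdl)
        s->vdl = new_vdl;
}
```

With EOF at 8192 and VDL at 4096, a lazy write of the second page gets zero bytes in this model until `toy_extend_vdl` has moved VDL forward.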


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

xxxxx@yahoo.com wrote:

> So, this issue that I am working on is somewhat complicated and I have a few questions if anyone cares to answer.
>
> Part of my filter operation is to change the size of the underlying file transparently to the application. In order to make sure that paging IO always has a place to go (because it can’t extend EOF), I update EOF when I see a cached write that is beyond current EOF.

You’re starting to get into the territory of “needs shadow file objects”
here. You are going to have many problems since file size information
is communicated between filesystems & the cache in a non-filterable way.

> The issue I am seeing is that when the lazy writer comes through, it does not write up to EOF, but stops at VDL. From looking at the FAT source this seems to be a rule. Lazy writes cannot extend beyond VDL. (I am curious why this is the case but currently I am just accepting this as fact).

Data written into the cache comes via the file system. The filesystem
has an opportunity to extend VDL at that time. It does not expect
writes beyond VDL to originate from the cache unless a writable mapped
section is present.

Under the contracts of the system, a lazy write means “write these pages
out to the file. Don’t do any work to extend on disk VDL, or zero data.
Just do the write.” After the writes are done, Cc calls the
filesystem to extend VDL (in one go, rather than on each write.) Cc can
only tell the filesystem about the VDL it understands (and it doesn’t
understand writes from writable mapped sections.)

If a writable mapped section is present, the filesystem then forces the
page(s) beyond VDL to be written via the mapped page writer. This has a
different contract: zeroing is performed, and on-disk VDL is updated.
So writes via mapped sections, which the file system has no knowledge of
until paging write time, still result in a VDL update via this mechanism.
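The two contracts described above can be contrasted in a toy model. This is a sketch only: `ToyDiskFile`, `toy_lazy_write`, `toy_cc_extend_vdl` and `toy_mapped_page_write` are hypothetical names, and the single `on_disk_vdl` field is a simplification of whatever the real filesystem tracks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_DISK_SIZE 32

/* Hypothetical on-disk file: bytes past on_disk_vdl may be stale garbage. */
typedef struct {
    unsigned char disk[TOY_DISK_SIZE];
    size_t on_disk_vdl;
} ToyDiskFile;

/* Lazy-write contract: just write the pages. No zeroing, no VDL update. */
static void toy_lazy_write(ToyDiskFile *f, size_t off,
                           const void *src, size_t len)
{
    assert(off <= TOY_DISK_SIZE && len <= TOY_DISK_SIZE - off);
    memcpy(f->disk + off, src, len);
}

/* After a batch of lazy writes, VDL is extended once, in one go. */
static void toy_cc_extend_vdl(ToyDiskFile *f, size_t new_vdl)
{
    assert(new_vdl <= TOY_DISK_SIZE);
    if (new_vdl > f->on_disk_vdl)
        f->on_disk_vdl = new_vdl;
}

/* Mapped-page-writer contract: zero the gap between the old VDL and the
 * write, do the write, and update on-disk VDL as part of the same call. */
static void toy_mapped_page_write(ToyDiskFile *f, size_t off,
                                  const void *src, size_t len)
{
    assert(off <= TOY_DISK_SIZE && len <= TOY_DISK_SIZE - off);
    if (off > f->on_disk_vdl)
        memset(f->disk + f->on_disk_vdl, 0, off - f->on_disk_vdl);
    memcpy(f->disk + off, src, len);
    if (off + len > f->on_disk_vdl)
        f->on_disk_vdl = off + len;
}
```

The model makes the division of labour visible: the lazy path leaves VDL alone until the batch is finished, while the mapped-page path does the zeroing and the VDL update itself.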

> In FAT, if this situation occurs the FSD raises STATUS_FILE_LOCK_CONFLICT. Though it looks like in NTFS the write is just truncated.

The difference you are seeing between FAT and NTFS is a cute NTFS
optimization that says, “don’t bother making the mapped page writer take
this unless we have a writable mapped section.” If no writable mapped
section is/was present, the condition should not happen, so the write is
truncated.
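The FAT/NTFS difference described above amounts to a small decision table. The sketch below encodes it with hypothetical names (`ToyFsd`, `ToyOutcome`, `toy_lazy_write_past_vdl`); it models only what this thread says and is not derived from either driver's code.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { TOY_FAT_STYLE, TOY_NTFS_STYLE } ToyFsd;

typedef enum {
    TOY_WRITE_OK,                    /* range lies entirely within VDL      */
    TOY_WRITE_TRUNCATED,             /* only the part up to VDL is written  */
    TOY_DEFER_TO_MAPPED_PAGE_WRITER  /* lazy write refused; per the thread,
                                        FAT signals this by raising
                                        STATUS_FILE_LOCK_CONFLICT           */
} ToyOutcome;

/* Outcome of a lazy write of [offset, offset + length) against a given VDL. */
static ToyOutcome toy_lazy_write_past_vdl(ToyFsd fsd,
                                          int writable_mapped_section,
                                          uint64_t vdl,
                                          uint64_t offset, uint64_t length)
{
    assert(length <= UINT64_MAX - offset);
    if (offset + length <= vdl)
        return TOY_WRITE_OK;
    if (fsd == TOY_FAT_STYLE)
        return TOY_DEFER_TO_MAPPED_PAGE_WRITER;
    /* NTFS-style optimization: only involve the mapped page writer when a
     * writable mapped section could explain data beyond VDL. */
    return writable_mapped_section ? TOY_DEFER_TO_MAPPED_PAGE_WRITER
                                   : TOY_WRITE_TRUNCATED;
}
```

A filter that moves EOF on its own lands in the "no writable mapped section, data beyond VDL" row, which is exactly the case the NTFS-style branch treats as impossible and truncates.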

> Is there any good way of handling this situation? By extending EOF I have guaranteed that there is allocation for the paging IO to complete. Though when I do the offset to account for my header this goes beyond VDL.

There are ways. They are not good.

> Some ways I am thinking about:
>
> 1. After setting EOF, also use IRP_MJ_SET_INFORMATION to set VDL. (I don’t think FAT supports this type of IRP from looking at source but I’m guessing NTFS does). I’m not sure what side effects this has. I will try this tomorrow.

Please, please don’t do this. It’s a security hole. NTFS guarantees
that a user creating a file should not see stale data from a previous
user (goodness only knows what that data contains.) It does this in
part with VDL. If you explicitly push out VDL, you are telling the
filesystem “I know what I’m doing and I promise no unprivileged user
will ever see this data. Please give me stale garbage.” If you expose
this file to a regular user, they can now browse one another’s trash.
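To make the risk concrete, here is a toy model of a file whose clusters still hold a previous owner's bytes. The names (`ToySecureFile`, `toy_read`) and the 16-byte "disk" are hypothetical; the only behaviour modelled is that reads at or beyond VDL come back as zeros.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_DISK_BYTES 16

/* Hypothetical file: disk[] may contain stale data from an earlier file. */
typedef struct {
    unsigned char disk[TOY_DISK_BYTES];
    size_t eof;  /* end of file */
    size_t vdl;  /* valid data length, always <= eof */
} ToySecureFile;

/* Read [off, off + len): bytes below VDL come from disk, the rest are zeros. */
static void toy_read(const ToySecureFile *f, size_t off,
                     unsigned char *dst, size_t len)
{
    assert(f->vdl <= f->eof && f->eof <= TOY_DISK_BYTES);
    assert(off <= f->eof && len <= f->eof - off);
    size_t valid = 0;
    if (off < f->vdl) {
        valid = f->vdl - off;
        if (valid > len)
            valid = len;
    }
    memcpy(dst, f->disk + off, valid);    /* real data below VDL        */
    memset(dst + valid, 0, len - valid);  /* zeros at or beyond VDL     */
}
```

With `vdl` at 0 the reader gets only zeros; assigning `vdl = eof` without writing anything makes the same call hand back whatever was left on the disk.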

> 2. After a cached write that extends EOF, manually flush the file so that I get synchronous paging IO that can write up to EOF. Meanwhile fail any lazy write activity with lock conflict.

That’s fine and all, but I don’t see how it fixes the problem. After
the flush Cc’s concept of VDL will still disagree with yours. The
problem is these two numbers need to be reconciled somehow. Doing this
reliably really requires shadow file objects.

Sorry,

- M


This posting is provided “AS IS” with no warranties, and confers no rights.