Hi, I am new to file systems and have a question on caching.
In Windows, if an app opens a file using
GENERIC_READ, FILE_SHARE_WRITE
and reads from it, caching is set up, assuming FILE_FLAG_NO_BUFFERING was not specified.
Now, if another app opens the same file using
GENERIC_WRITE, FILE_SHARE_READ, FILE_FLAG_NO_BUFFERING
and writes to the file, this write goes directly to the disk.
What happens if the write lands on the same region of the file for which caching was set up by the first app's read?
Does the cache get invalidated? Or does it become stale? How does NTFS handle this?
The file systems handle this situation - you can see how FAT does it in the WDK code.
I’ve seen applications that actually do that - within a single program. If a non-cached read occurs in a region that has dirty cached data, the cached data is flushed back to disk and then the non-cached read is satisfied from disk. If a non-cached write occurs, the cache is purged. Beginning with Windows 7, there is a page invalidation mechanism to ensure that the purge works almost all the time, even in the face of memory mapped files. Prior to Windows 7 it would purge the cache manager cache (if possible) but memory mapped file coherence was not guaranteed.
That’s interesting. So Win7+ guarantees that ReadFile will see the latest
changes made in a memory map of that file…and WriteFile will first write
the page out if it is dirty, and after the WriteFile, an attempt to access
that page will then read it in from the disk? If so, this is a great
improvement.
joe
Note that this call can fail; how the underlying FSD deals with this is of course dependent upon the file system - you can either refuse the non-cached write or you can allow the non-coherent cache. None of the samples use it.
Absolutely. I’m not sure why FAT hasn’t been modified to do so (maybe it has but the sample code has not) but this is definitely supported by “real” file systems.
I ask primarily because I have a few slides in my Advanced Systems
Programming course talking about the coherency problem, and although the
chances that I will ever be able to teach it again are slight-to-none, I
like to keep it current.
joe
While this would be one implementation, that’s actually not how they do it. The code is in FastFat. Here’s a comment (write.c) that describes what they are doing:
//
// If this is a noncached transfer and is not a paging I/O, and
// the file has been opened cached, then we will do a flush here
// to avoid stale data problems. Note that we must flush before
// acquiring the Fcb shared since the write may try to acquire
// it exclusive.
//
// The Purge following the flush will guarentee cache coherency.
//
So they actually FLUSH the data (in memory) and then perform the non-cached I/O to disk. This guarantees correct behavior.
It does mean that if someone tries to do a cached read of that region it will be re-fetched from disk.
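To make the ordering concrete, here is a toy user-mode model in C of the flush, purge, then non-cached I/O sequence described above. It is only a sketch: ToyFile and every toy_* function are hypothetical names invented for this illustration, the "cache" is tracked per byte rather than per page, and none of it is the FastFat or NTFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_FILE_SIZE 16

/* Hypothetical stand-in for a file plus its data cache; not a kernel structure. */
typedef struct {
    char disk[TOY_FILE_SIZE];   /* what is on the media */
    char cache[TOY_FILE_SIZE];  /* cached copy of the data */
    bool valid[TOY_FILE_SIZE];  /* byte is resident in the cache */
    bool dirty[TOY_FILE_SIZE];  /* cached byte is newer than the disk copy */
} ToyFile;

/* Start with an empty cache and a disk filled with one character. */
static void toy_init(ToyFile *f, char fill) {
    memset(f, 0, sizeof *f);
    memset(f->disk, fill, TOY_FILE_SIZE);
}

/* Cached read: pull missing bytes in from disk, then serve from the cache. */
static void toy_cached_read(ToyFile *f, int off, int len, char *out) {
    assert(off >= 0 && len >= 0 && off + len <= TOY_FILE_SIZE);
    for (int i = off; i < off + len; i++) {
        if (!f->valid[i]) {
            f->cache[i] = f->disk[i];
            f->valid[i] = true;
        }
        out[i - off] = f->cache[i];
    }
}

/* Cached write: update the cache only and mark the bytes dirty. */
static void toy_cached_write(ToyFile *f, int off, int len, const char *in) {
    assert(off >= 0 && len >= 0 && off + len <= TOY_FILE_SIZE);
    for (int i = off; i < off + len; i++) {
        f->cache[i] = in[i - off];
        f->valid[i] = true;
        f->dirty[i] = true;
    }
}

/* Flush: push dirty cached bytes in the range out to disk. */
static void toy_flush(ToyFile *f, int off, int len) {
    for (int i = off; i < off + len; i++) {
        if (f->valid[i] && f->dirty[i]) {
            f->disk[i] = f->cache[i];
            f->dirty[i] = false;
        }
    }
}

/* Purge: drop cached bytes in the range so later cached reads go to disk. */
static void toy_purge(ToyFile *f, int off, int len) {
    for (int i = off; i < off + len; i++) {
        f->valid[i] = false;
        f->dirty[i] = false;
    }
}

/* Non-cached write: flush, then purge, then write straight to disk. */
static void toy_noncached_write(ToyFile *f, int off, int len, const char *in) {
    assert(off >= 0 && len >= 0 && off + len <= TOY_FILE_SIZE);
    toy_flush(f, off, len);
    toy_purge(f, off, len);
    memcpy(f->disk + off, in, (size_t)len);
}

/* Non-cached read: flush dirty cached data first, then read from disk. */
static void toy_noncached_read(ToyFile *f, int off, int len, char *out) {
    assert(off >= 0 && len >= 0 && off + len <= TOY_FILE_SIZE);
    toy_flush(f, off, len);
    memcpy(out, f->disk + off, (size_t)len);
}
```

In this model, a cached read of bytes 0-3, followed by toy_noncached_write over the same bytes, followed by another cached read, returns the newly written bytes because the purge forced the second read back to disk. Likewise, toy_noncached_read over a range that has dirty cached data sees that data, because the flush wrote it out first.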
It’s been a while since I looked at how NTFS did it, but last I looked (Server 2008 timeframe) that is what they did as well. Since I have a Server 2012 VM running under the debugger I took a quick peek to confirm that it’s changed as I expected and indeed I found:
So NTFS is using CcCoherencyFlushAndPurgeCache (instead of the older flush/purge model shown in the FAT code), but it still does that before it issues the non-cached I/O.