CcSetFileSizes failure handling

I am trying to gain a better understanding of CcSetFileSizes and its failure modes. My intent is to devise a robust strategy for handling such failures in my FSD.

My understanding is that when CcSetFileSizes fails it may raise an exception. For example, CcSetFileSizes may raise if it is unable to extend the cache section. My question is: are there any other reasons that CcSetFileSizes may fail?

Even more important is how to handle such a failure. In my FSD CcSetFileSizes is called late in IRP processing (for example, after file metadata concerning size has been updated). Thus it is too late for the FSD to go back and undo the file metadata change. [In fact this is impossible in my FSD, because it interfaces with a user mode file system.]

Therefore I am looking for ways to make CcSetFileSizes failure benign. After all, the cache is an optimization, and I would like things to work when the cache fails, even if they are a bit slower.

Finally: what is the role of CcGetFileSizePointer in all of this? Can I use CcGetFileSizePointer as a fallback when CcSetFileSizes fails?

PS: I have not experienced any CcSetFileSizes failures in my FSD, but this is likely because I have not tried hard enough.

Sometimes the challenge with these functions is that they don’t raise directly but invoke other functions that raise. This doesn’t happen often, but that’s the point of raising: throw an exception so someone else can handle it gracefully.

CcSetFileSizes also does truncates, which could lead to purges, which could raise.

If you aren’t able to unwind after a failure, why don’t you perform the operation first? With serialization, you can prevent something else in your file system from using that size until you’ve finished doing whatever it is that you need to do to set the size.

Tony
OSR

> My understanding is that when CcSetFileSizes fails it may raise an
> exception. For example, CcSetFileSizes may raise if it is unable to extend
> the cache section. My question is: are there any other reasons that
> CcSetFileSizes may fail?

Only if the section extension fails.

> CcSetFileSizes also does truncates, which could lead to purges, which
> could raise.

Purges can’t raise.

> Even more important is how to handle such a failure. In my FSD
> CcSetFileSizes is called late in IRP processing (for example, after file
> metadata concerning size has been updated). Thus it is too late for the
> FSD to go back and undo the file metadata change. [In fact this is
> impossible in my FSD, because it interfaces with a user mode file system.]

Before updating the metadata, set the new file size; then update the
metadata. If the metadata update fails, truncate the file size back to the
previous value. This truncation won’t raise an exception. Lock the file
exclusively around this whole operation.
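This ordering can be sketched roughly as follows. This is a kernel-mode sketch, not a drop-in implementation: Fcb, FileObject, OldSizes/NewSizes and UpdateMetadata are placeholders for whatever your FSD actually uses, and the declarations are elided.

    ExAcquireResourceExclusiveLite(Fcb->Header.Resource, TRUE);

    /* Extend the cache section first; this is the only step that can raise. */
    __try {
        CcSetFileSizes(FileObject, &NewSizes);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        Status = GetExceptionCode();  /* nothing committed yet; just fail */
        goto exit;
    }

    /* Now commit the metadata; on failure truncate back (cannot raise). */
    Status = UpdateMetadata(Fcb, &NewSizes);
    if (!NT_SUCCESS(Status))
        CcSetFileSizes(FileObject, &OldSizes);

exit:
    ExReleaseResourceLite(Fcb->Header.Resource);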

> Finally: what is the role of CcGetFileSizePointer in all of this? Can I
> use CcGetFileSizePointer as a fallback when CcSetFileSizes fails?

The macro returns a pointer to the cache manager’s copy of the file size.
You can use it to update the cache manager’s file size directly when you
truncate. For example: you extended the file size, updated the metadata,
something went wrong and you have to clean up. You restore the previous
file sizes in your own structures, then use CcGetFileSizePointer to restore
the file size in the cache manager’s structure. If you truncate the file
size for some reason other than cleaning up after an extension, you should
use CcSetFileSizes instead.
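Roughly, as a sketch (FileObject is the cached file object and PreviousFileSize is a value the FSD saved before extending; both are placeholders):

    /* Roll the cache manager's view of the file size back after a failed
     * metadata update.  CcGetFileSizePointer returns a pointer into the
     * SharedCacheMap, so this is a plain store; it cannot fail. */
    *CcGetFileSizePointer(FileObject) = PreviousFileSize;

    /* Also restore the FSD's own view in its FCB header. */
    Fcb->Header.FileSize = PreviousFileSize;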

Tony Mason wrote:

> If you aren’t able to unwind after a failure, why don’t you perform the
> operation first? With serialization, you can prevent something else in
> your file system from using that size until you’ve finished doing
> whatever it is that you need to do to set the size.

I agree that the answer in these cases is usually to preacquire any resources (extend the section), perform the required operation and release the resources (pull back the cache size) if the operation fails. In fact my FSD does this in some cases where it seems mandatory (cached WRITEs, where the Cache Manager can fail if it tries to write past the EndOfFile).

However I face the following problem in my FSD. I want to allow the user mode file system to tell the FSD what the latest version of a file’s metadata looks like after the completion of an operation. For this reason, the user mode file system sends back to the FSD its own view of the file metadata after certain operations complete (Create/Open, QueryInfo, SetAllocationInfo, SetEndOfFileInfo, and NonCached/NonPagingIo Writes).

For example, after the completion of a WRITE, it may look to the FSD that the AllocationSize is 0x1000, but the user mode file system may know that the AllocationSize is actually 0x2000. The reason for this discrepancy is that there may be actors other than the FSD that are able to change the file system. The user mode file system may have a user mode API that allows file manipulation or the file system may have remote users on other computers that can change the files.

[Turns out that in those cases the user mode file system is supposed to disable the cache manager, so many of the reasons that CcSetFileSizes may fail do not exist. So the problem may be simpler than it looks.]

There is in fact a system component that must face similar problems and that is the redirector. Unfortunately there is no public source for it.

> CcSetFileSizes also does truncates, which could lead to purges, which could raise.

Interesting. Does that mean that CcPurgeCacheSection can raise? Or do you mean flush when you say purge?

Anatoly Mikhailov wrote:

> The macro returns a pointer to the cache manager’s copy of the file size.
> You can use it to update the cache manager’s file size directly when you
> truncate. For example: you extended the file size, updated the metadata,
> something went wrong and you have to clean up. You restore the previous
> file sizes in your own structures, then use CcGetFileSizePointer to
> restore the file size in the cache manager’s structure. If you truncate
> the file size for some reason other than cleaning up after an extension,
> you should use CcSetFileSizes instead.

If I understand you correctly, that is also my understanding re: CcGetFileSizePointer. You can use it to pull back the file size after extending, but you cannot use it to re-extend the file size after truncation.

This limitation may stem from the fact that for a file AllocationSize >= EndOfFile. But does that also mean that for a SharedCacheMap SectionSize >= FileSize? Or is FileSize allowed to be greater than SectionSize, in which case CcGetFileSizePointer may be used in all situations to reset the cache manager’s view of the file size?

If, as I suspect, you can only pull back size but cannot re-extend with CcGetFileSizePointer you now have a rather unfortunate case that you have to call CcSetFileSizes late in the IRP processing (where you cannot tolerate failures).

For example:

  • SetAllocationInfo arrives with new AllocationSize less than current (truncation).
  • FSD performs CcSetFileSizes without failure.
  • FSD sends request to the user mode file system.
  • At a later time FSD receives response from the user mode file system indicating failure.
  • Oops, we must re-extend the AllocationSize and we cannot use CcGetFileSizePointer, so we must use CcSetFileSizes. But this may fail to re-extend the section!


For what it’s worth, looking at the FastFat source, it seems to fall into this trap. When truncating, it first updates its on-disk data structures (in FatTruncateFileAllocation) and then calls CcSetFileSizes. Perhaps the FastFat authors know that this particular CcSetFileSizes call cannot fail (because it is a truncation?), but do we?

https://github.com/Microsoft/Windows-driver-samples/blob/master/filesys/fastfat/fileinfo.c#L4258

Bill

> Interesting. Does that mean that CcPurgeCacheSection can raise? Or do you
> mean flush when you say purge?

Yes. That means CcPurgeCacheSection can’t fail. It invalidates pages but
does not allocate them. A cleanup function must not fail by its nature.

> You can use it to pull back the file size after extending

Yes, though the extended section itself won’t be cleaned up. But that’s not
critical.

> but you cannot use it to re-extend the file size after truncation.

Of course, because the section must be extended first, i.e. all the needed
structures and memory must be allocated and initialized for it by the cache
manager.

> But does that also mean that for a SharedCacheMap SectionSize >= FileSize?

Yes. When the cache manager maps a view of the file, the memory manager
could BSOD if the section weren’t sized properly. There is also the cache
manager’s VACB array for the file.

> Or is FileSize allowed to be greater than SectionSize, in which case
> CcGetFileSizePointer may be used in all situations to reset the cache
> manager’s view of the file size?

If you used it in all situations, it could turn out that a wrong file size
directs the cache manager to use VACBs which do not exist.

> If, as I suspect, you can only pull back size but cannot re-extend with
> CcGetFileSizePointer

Yes. You can’t re-extend the section that way.

> For example:
>
>   • SetAllocationInfo arrives with new AllocationSize less than current
>     (truncation).
>   • FSD performs CcSetFileSizes without failure.
>   • FSD sends request to the user mode file system.
>   • At a later time FSD receives response from the user mode file system
>     indicating failure.
>   • Oops, we must re-extend the AllocationSize and we cannot use
>     CcGetFileSizePointer, so we must use CcSetFileSizes. But this may fail
>     to re-extend the section!

What about this ordering instead:

  • SetAllocationInfo arrives with new AllocationSize less than current
    (truncation).
  • FSD sends request to the user mode file system.
  • At a later time FSD receives response from the user mode file system
    indicating failure.
  • FSD doesn’t perform CcSetFileSizes because of failure.

Can this be a solution?

> Perhaps the FastFat authors know that this particular CcSetFileSizes call
> cannot fail (because it is a truncation?)

Truncation can’t fail.

Anatoly, thanks. You seem to confirm my understanding of CcSetFileSizes and CcGetFileSizePointer.

>> Perhaps the FastFat authors know that this particular CcSetFileSizes call
>> cannot fail (because it is a truncation?)
>
> Truncation can’t fail.

Ok, good to know. My understanding is that CcSetFileSizes calls MmFlushSection when truncating. Can that raise an exception?

>   • SetAllocationInfo arrives with new AllocationSize less than current
>     (truncation).
>   • FSD sends request to the user mode file system.
>   • At a later time FSD receives response from the user mode file system
>     indicating failure.
>   • FSD doesn’t perform CcSetFileSizes because of failure.

This is what my FSD does today.

It seems that the fully correct answer is:

  • If the allocation size is being increased, call CcSetFileSizes to extend the section and then interact with the user mode file system. If that fails, pull back the file sizes using CcGetFileSizePointer.
  • If the allocation size is being decreased, interact with the user mode file system first and then call CcSetFileSizes only if the user mode request succeeds. CcSetFileSizes is safe in this case because (as you say) it does not raise when truncating.

This looks to me rather complicated. Furthermore, as mentioned, my FSD does not really know the correct allocation size, only the user mode file system does. So despite what the FSD thinks when it first sees the requested allocation size, it may still need to extend the section after it has interacted with the user mode file system.

Now I could impose certain restrictions to what the user mode file system can or cannot do (at least when the cache is being used). And that may indeed be the answer to this conundrum. However I would like to explore alternative solutions before I do so.

So I am considering two different strategies:

  1. After a CcSetFileSizes failure “poison” the FILE_OBJECT (or maybe the
     FCB) and basically fail all I/O on it, except CLEANUP/CLOSE.
  2. After a CcSetFileSizes failure, do CcFlushCache and then call
     CcPurgeCacheSection with UninitializeCacheMaps==TRUE. If I understand
     the docs correctly, this is effectively like calling
     CcUninitializeCacheMap on all FILE_OBJECTs for a particular FCB. The
     next time an I/O comes for this FCB, the FSD will try to reinitialize
     the cache map (using CcInitializeCacheMap) and can fail safely there.

Strategy 2 (if it works) is by far my favorite strategy.

Bill

> Ok, good to know. My understanding is that CcSetFileSizes calls
> MmFlushSection when truncating. Can that raise an exception?

No. It never raises an exception.

> Furthermore, as mentioned, my FSD does not really know the correct
> allocation size, only the user mode file system does.

You can use the file size for the allocation size and the valid data
length while using the cache manager. Moreover, as far as I understood you,
there are clients of the user mode file system besides your FSD. Right? If
so, you can’t guarantee cached file consistency.

> So despite what the FSD thinks when it first sees the requested
> allocation size, it may still need to extend the section after it has
> interacted with the user mode file system.

Extending the section is only really needed when the file size is extended.
For the allocation size you can find a way to make it a no-op.

> Now I could impose certain restrictions to what the user mode file system
> can or cannot do (at least when the cache is being used).

That is a good approach when designing your FSD. You should make sure your
cached files are consistent with the user mode file system. Otherwise your
FSD would look like Linux EncFS, where you can delete encrypted files while
they are in use by the EncFS driver.

> 1. After a CcSetFileSizes failure “poison” the FILE_OBJECT (or maybe the
>    FCB) and basically fail all I/O on it, except CLEANUP/CLOSE.
> 2. After a CcSetFileSizes failure, do CcFlushCache and then call
>    CcPurgeCacheSection with UninitializeCacheMaps==TRUE. If I understand
>    the docs correctly, this is effectively like calling
>    CcUninitializeCacheMap on all FILE_OBJECTs for a particular FCB. The
>    next time an I/O comes for this FCB, the FSD will try to reinitialize
>    the cache map (using CcInitializeCacheMap) and can fail safely there.

Failure because of extending?

> CcPurgeCacheSection with UninitializeCacheMaps==TRUE. If I understand the
> docs correctly, this is effectively like calling CcUninitializeCacheMap
> on all FILE_OBJECTs for a particular FCB

That’s right.

> Moreover, as far as I understood you, there are clients of the user mode
> file system besides your FSD. Right? If so you can’t guarantee cached
> file consistency.

That is true and is why the cache is actually (supposed to be) disabled in those situations.

However dealing with user mode you must be prepared for the unexpected. For example, I must handle situations where the user mode file system crashes while servicing a request (IRP). Likewise I would like to handle a situation where the user mode file system has its own view of allocation size, regardless of what the FSD thinks.

> Now I could impose certain restrictions to what the user mode file system
> can or cannot do (at least when the cache is being used).

> That is a good approach when designing your FSD. You should make sure your
> cached files are consistent with the user mode file system.

In principle I agree.

One of the reasons I designed this protocol where the user mode file system reports file metadata to the FSD at the tail end of file system operations is because I wanted to support three modes of operation:

  1. A mode of operations where all file data/metadata is considered stale immediately after they are reported to the FSD and the original API caller. This mode does not use the cache manager at all and does not cache any file metadata. Practically every IRP results in a user mode file system request.

  2. A mode of operations where file data/metadata is considered valid for a small amount of time after they are received. For example, after I receive a CREATE IRP, I may cache the file metadata (outside the cache manager) for 1 second. This way if I receive a QUERY_INFORMATION IRP right after the CREATE I can immediately satisfy it from the metadata cache.

This mode does not use the cache manager and does not currently cache file data. I have had many thoughts on how to make the cache manager work in this case. Implementation would require a cache expiration scheme for the cache manager. For the time being I decided that the complexity does not justify the benefit for my v1.

  3. A mode of operations where file data/metadata remains valid indefinitely. In this case the FSD and the user mode file system usually agree on things like the allocation size and the cache manager can be used safely.

> 1. After a CcSetFileSizes failure “poison” the FILE_OBJECT (or maybe the
> FCB) and basically fail all I/O on it, except CLEANUP/CLOSE.
> 2. After a CcSetFileSizes failure, do CcFlushCache and then call
> CcPurgeCacheSection with UninitializeCacheMaps==TRUE. If I understand the
> docs correctly, this is effectively like calling CcUninitializeCacheMap
> on all FILE_OBJECTS for a particular FCB. The next time an I/O comes for
> this FCB, the FSD will try to reinitialize the cache map (using
> CcInitializeCacheMap) and can fail safely there.

Failure because of extending?

Yes.

So I have made some progress on this.

Firstly, I can now reliably trigger CcInitializeCacheMap and CcSetFileSizes failures. For this to work I use an AllocationSize of 0x1000000000000000. That is 2^60 (1024^6) bytes, i.e. one exbibyte.

Secondly, the technique of CcFlushCache and CcPurgeCacheSection with UninitializeCacheMaps==TRUE almost works. I say almost because I am encountering what may be a problem with CcInitializeCacheMap, which results later in a BSOD. Here is the full scenario.

  1. The user mode file system reports a huge allocation size of 1 exabyte.
  2. The FSD attempts to do CcSetFileSizes and fails. It does NOT update the SharedCacheMap FileSize or SectionSize (confirmed under WinDbg). The FSD catches the CcSetFileSizes exception and does CcFlushCache and CcPurgeCacheSection as described. The PrivateCacheMap’s for the file get NULL’ed (nice). The SharedCacheMap does not get NULL’ed at this point (likely not nice – see below).
  3. A new READ request comes in. The FSD calls CcInitializeCacheMap to reinitialize the cache for the file. It passes it the huge allocation size of 1 exabyte. In this case the CcInitializeCacheMap returns SUCCESS, but it does not update the FileSize or the SectionSize (as per the documentation it should update it if the new AllocationSize is greater than the existing section size).
  4. This later results in a BSOD when I try to do a CcCopyRead with the wrong FileSize in the SharedCacheMap.

IMO the problem is with CcInitializeCacheMap and may be related to the fact that I just uninitialized all cache maps but the SharedCacheMap is not NULL (uninit has not completed). I note here that CcInitializeCacheMap does fail when passed a huge allocation size of 1 exabyte and there is no SharedCacheMap already set up.

The solution probably is to wait for the SharedCacheMap to be NULL’ed. There does not seem to be a straightforward way to do so. CcUninitializeCacheMap takes an event to wait on, but there is no such event when doing CcPurgeCacheSection with UninitializeCacheMaps==TRUE.

Bill

I wrote:

> The solution probably is to wait for the SharedCacheMap to be NULL’ed.
> There does not seem to be a straightforward way to do so.
> CcUninitializeCacheMap takes an event to wait on, but there is no such
> event when doing CcPurgeCacheSection with UninitializeCacheMaps==TRUE.

I believe I have resolved this issue. For future reference here is the code:

{
    /*
     * CcSetFileSizes failed. This is a hard case to handle, because it is
     * usually late in IRP processing. So we have the following strategy.
     *
     * Our goal is to completely stop all caching for this FileNode. The idea
     * is that if some I/O arrives later for this FileNode, CcInitializeCacheMap
     * will be executed (and possibly fail safely there). In fact we may decide
     * later to make such CcInitializeCacheMap failures benign (by not using the
     * cache when we cannot).
     *
     * In order to completely stop caching for the FileNode we do the following:
     *
     * - We flush the cache using CcFlushCache.
     * - We purge the cache and uninitialize all PrivateCacheMap's using
     *   CcPurgeCacheSection with UninitializeCacheMaps==TRUE.
     * - If the SharedCacheMap is still around, we perform an additional
     *   CcUninitializeCacheMap with an UninitializeEvent. At this point
     *   CcUninitializeCacheMap should delete the SharedCacheMap and
     *   signal the UninitializeEvent.
     *
     * One potential gotcha is whether there is any possibility for another
     * system component to delay deletion of the SharedCacheMap and signaling
     * of the UninitializeEvent. This could result in a deadlock, because we
     * are already holding the FileNode exclusive and waiting for the
     * UninitializeEvent, but the thread that would signal our event would
     * have to first acquire our FileNode. Classic deadlock.
     *
     * I believe (but cannot prove) that this deadlock cannot happen. The
     * reason is that we have flushed and purged the cache and we have closed
     * all PrivateCacheMap's using this SharedCacheMap. There should be no
     * reason for any system component to keep the SharedCacheMap around
     * (famous last words).
     */

    IO_STATUS_BLOCK IoStatus;
    CACHE_UNINITIALIZE_EVENT UninitializeEvent;

    FspCcFlushCache(CcFileObject->SectionObjectPointer, 0, 0, &IoStatus);
    CcPurgeCacheSection(CcFileObject->SectionObjectPointer, 0, 0, TRUE);
    if (0 != CcFileObject->SectionObjectPointer->SharedCacheMap)
    {
        UninitializeEvent.Next = 0;
        KeInitializeEvent(&UninitializeEvent.Event, NotificationEvent, FALSE);
        BOOLEAN CacheStopped = CcUninitializeCacheMap(CcFileObject, 0, &UninitializeEvent);
        ASSERT(CacheStopped);
        KeWaitForSingleObject(&UninitializeEvent.Event, Executive, KernelMode, FALSE, 0);
    }
}

Bill

On 11/05/2016 02:20 PM, Anatoly Mikhailov wrote:

> Yes. That means CcPurgeCacheSection can’t fail. It invalidates pages but
> not allocates them. Cleanup function must not fail by its nature.

This is not correct.

Obviously in an ideal world cleanup functions couldn’t fail, but every
system is full of less than ideal situations, and in NT, this is one.

The original design of NT allowed virtual address mappings to reference
physical pages, with no reverse mapping allowing a physical page to
identify its virtual address users. The issue this created for purge is
that if a physical page is actively mapped, the physical page couldn’t
be discarded and repurposed because it’s in use, and there was no
mechanism to identify and clean up the user, so purge just fails.

In Win7 a new reverse mapping mechanism was added which allowed a given
section to identify user mappings and allow virtual address references
to be removed. This is not used via the regular CcPurgeCacheSection but
is available through CcCoherencyFlushAndPurgeCache (which CcSetFileSizes
doesn’t use.)

However, note that there’s still a synchronization problem: a page can
become in use by a user process touching a virtual address range,
something the file system can’t synchronize against. It can even map a
new virtual address range, which the file system receives no
notification for. So even with CcCoherencyFlushAndPurgeCache, there’s a
chance that a page reference can be trimmed and immediately recreated,
causing purge to fail; and it’s why CcCoherencyFlushAndPurgeCache is
almost always called in a loop. The loop is there to catch soft faults
resolving the virtual address to page mapping immediately. The FS also
needs synchronization around purge to prevent hard faults from bringing
data back from disk immediately.
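Such a loop looks roughly like the following sketch (RETRY_LIMIT and the surrounding locking are placeholders; the FSD is assumed to already hold whatever synchronization keeps hard faults from repopulating the range):

    IO_STATUS_BLOCK IoStatus;
    BOOLEAN Purged = FALSE;

    /* Each pass trims user-mapped page references and purges; a soft fault
     * can re-reference a page between the trim and the purge, which is why
     * a single call is not enough. */
    for (ULONG Retry = 0; Retry < RETRY_LIMIT && !Purged; Retry++) {
        CcCoherencyFlushAndPurgeCache(FileObject->SectionObjectPointer,
                                      NULL, 0, &IoStatus, 0);
        Purged = NT_SUCCESS(IoStatus.Status);
    }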

The original way NT was intending to make truncation work is via
MmCanFileBeTruncated. Noting that the file system receives no
notification for virtual address mappings, this API required the FS to
hold synchronization against section creation and return FALSE if an
existing user had a section handle. If that section handle exists, that
caller is free to map a virtual address range and cause purge to fail.
This is of course extremely problematic since it means truncation can
fail semi-randomly, or even, quite deterministically.

Because CcSetFileSizes is part of the original NT, which theorized that
purges couldn’t fail in this path due to MmCanFileBeTruncated,
CcSetFileSizes made no attempt to communicate purge failure. It won’t
raise, but it can just leave pages around unexpectedly.

So in Vista, CcSetFileSizesEx made its debut, which added a return value
indicating purge failure. Unfortunately this didn’t really improve
anything, because purge isn’t atomic - the memory manager will start
throwing away dirty data, then get to a page that’s mapped somewhere,
and fail - but the FS can’t abort the operation because data is already
gone. So this API turned out to not be that useful, and doesn’t appear
to be documented anywhere either.

Now all of that is on the record, going back to the original question…

The best I could really do was to ensure that CcSetFileSizes is called
early when extending so that on allocation failure the extend operation
could be failed. However, when truncating it’s important to perform the
truncation and commit it first because once purging starts it would be
invalid to roll back a truncation, so the same API gets called very
differently for extend vs. truncate. NTFS is still relying on
MmCanFileBeTruncated to reduce the risk of spurious purge failure.

Note the issue with purge failure on truncate isn’t immediate - it might
leave data beyond file size which can’t be read directly and can’t be
written to disk since there’s no allocation, but the real problem is
what happens when the file is extended again and those stale pages
become reachable. I can’t help but think the “best” way to handle this
today is do the purge as part of the extension (and check for failure)
to clean up after a previous truncate, and throw MmCanFileBeTruncated to
the curb.

Good luck,

- M


http://www.malsmith.net

> That is true and is why the cache is actually (supposed to be) disabled
> in those situations.

Do you mean when you open the file with the user mode file system, or are
you talking about the cache in your FSD? If you’re talking about the user
mode file system, then you can’t do otherwise; you’ll hang the cache
manager quickly. Also, this cache disabling doesn’t guarantee file
consistency.

> However dealing with user mode you must be prepared for the unexpected.
> For example, I must handle situations where the user mode file system
> crashes while servicing a request (IRP).

That’s true.

> Likewise I would like to handle a situation where the user mode file
> system has its own view of allocation size, regardless of what the FSD
> thinks.

You can ignore the allocation size while caching, for example. In my
opinion that’s the best you can do in this situation.

>   1. A mode of operations where all file data/metadata is considered stale
>      immediately after they are reported to the FSD and the original API
>      caller. This mode does not use the cache manager at all and does not
>      cache any file metadata. Practically every IRP results in a user mode
>      file system request.
>
>   2. A mode of operations where file data/metadata is considered valid for
>      a small amount of time after they are received. For example, after I
>      receive a CREATE IRP, I may cache the file metadata (outside the cache
>      manager) for 1 second. This way if I receive a QUERY_INFORMATION IRP
>      right after the CREATE I can immediately satisfy it from the metadata
>      cache.
>
>      This mode does not use the cache manager and does not currently cache
>      file data. I have had many thoughts on how to make the cache manager
>      work in this case. Implementation would require a cache expiration
>      scheme for the cache manager. For the time being I decided that the
>      complexity does not justify the benefit for my v1.
>
>   3. A mode of operations where file data/metadata remains valid
>      indefinitely. In this case the FSD and the user mode file system
>      usually agree on things like the allocation size and the cache manager
>      can be used safely.

Can you filter the user mode file system? If so, you can redirect all open
requests directed at it to your FSD. Your FSD, in turn, either does its own
special processing or forwards the access to the user mode file system.
This lets you control all requests directed at the user mode file system
and maintain file consistency.

>>> 1. After a CcSetFileSizes failure “poison” the FILE_OBJECT (or maybe the
>>>    FCB) and basically fail all I/O on it, except CLEANUP/CLOSE.
>>> 2. After a CcSetFileSizes failure, do CcFlushCache and then call
>>>    CcPurgeCacheSection with UninitializeCacheMaps==TRUE. If I understand
>>>    the docs correctly, this is effectively like calling
>>>    CcUninitializeCacheMap on all FILE_OBJECTs for a particular FCB. The
>>>    next time an I/O comes for this FCB, the FSD will try to reinitialize
>>>    the cache map (using CcInitializeCacheMap) and can fail safely there.
>>
>> Failure because of extending?
>
> Yes.
>
> So I have made some progress on this.
>
> Firstly, I can now reliably trigger CcInitializeCacheMap and
> CcSetFileSizes failures. For this to work I use an AllocationSize of
> 0x1000000000000000. That is 1024^6 or 1 exabyte.

Did I understand you correctly? You want to fail all I/O on the file
because of a file section size extension failure? Or is the true reason
that your FSD receives a failure from the user mode file system?
If it’s because of a user mode file system failure, then you should
invalidate your FILE_OBJECT and fail all further requests with
STATUS_FILE_INVALID. This status is a special one: it forces the memory
manager not to try to flush the data again.
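A minimal sketch of that poisoning, assuming a hypothetical Invalidated flag on the FSD’s per-file structure (FileNode, Irp and IrpSp are whatever your dispatch path already has in hand):

    /* At the top of the common IRP dispatch path. */
    if (FileNode->Invalidated &&
        IrpSp->MajorFunction != IRP_MJ_CLEANUP &&
        IrpSp->MajorFunction != IRP_MJ_CLOSE)
    {
        /* STATUS_FILE_INVALID also tells Mm to stop retrying flushes. */
        Irp->IoStatus.Status = STATUS_FILE_INVALID;
        Irp->IoStatus.Information = 0;
        IoCompleteRequest(Irp, IO_NO_INCREMENT);
        return STATUS_FILE_INVALID;
    }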

> The solution probably is to wait for the SharedCacheMap to be NULL’ed.
> There does not seem to be a straightforward way to do so.
> CcUninitializeCacheMap takes an event to wait on, but there is no such
> event when doing CcPurgeCacheSection with UninitializeCacheMaps==TRUE.

There is no means to wait. The best you can do is invalidate everything,
mark the FILE_OBJECT as invalid, and fail all further requests.

> The original design of NT allowed virtual address mappings to reference
> physical pages, with no reverse mapping allowing a physical page to
> identify its virtual address users. The issue this created for purge is
> that if a physical page is actively mapped, the physical page couldn’t be
> discarded and repurposed because it’s in use, and there was no mechanism
> to identify and clean up the user, so purge just fails.

We were talking about raising an exception, and I meant that
CcPurgeCacheSection will not raise one. When CcPurgeCacheSection can’t
purge mapped pages it just returns FALSE.

> However, when truncating it’s important to perform the truncation and
> commit it first because once purging starts it would be invalid to roll
> back a truncation, so the same API gets called very differently for
> extend vs. truncate.

We weren’t talking about rollback on truncation; that is quite problematic.
We were talking about the case where the user mode file system succeeded in
extending but our FSD fails because the file section extension failed.

“Anatoly Mikhailov” wrote in message news:xxxxx@ntfsd…

>> Ok, good to know. My understanding is that CcSetFileSizes calls
>> MmFlushSection when truncating. Can that raise an exception?
>
> No. It never raises an exception.

Speaking as one who has the scars… Never say never.

I can quite believe you in this case, but I have wasted too much time
fixing [my] bugs which happened because of exceptions from Mm and Cc that I
just didn’t expect, so I now bracket all Cc and Mm calls with a try as a
matter of course (like FAT does, but not in that way - I keep them
localized). For instance I am pretty sure that I have seen SetFileSizes
issue a flush (or maybe a pagefault), and I have seen that cause a Paged IO
to fail, and I have seen that come back as an exception.

At least one version of Windows would silently swallow the exception during
either CLEANUP or CLOSE handling, so if you called a Cc function in there
that was invoked with a lock held, and it threw an exception, then suddenly
everything would appear to work but your lock was left dangling.

> Speaking as one who has the scars… Never say never.

You just said never :). When CcSetFileSizes fails to allocate VACBs or
fails to extend the section it calls ExRaiseStatus explicitly. When you’re
truncating the section there is no section extension or VACB allocation. Am
I wrong?

> I can quite believe you in this case

So why did you just advise me to never say never :)?

> but I have wasted too much time fixing [my] bugs which have happened
> because of exceptions from Mm and Cc

These exceptions could occur because you pass wrong parameters to Mm or Cc
routines, for example. So what cases are you talking about?

> For instance I am pretty sure that I have seen SetFileSizes issue a flush
> (or maybe a pagefault) and I have seen that cause a Paged IO to fail and I
> have seen that come back as an exception.

This is simple to verify: just set a breakpoint at the IRP_MJ_WRITE handler
entry point and fail it there by patching variables. Make sure, of course,
that you see CcSetFileSizes in the call stack. You’ll see no exception, and
an IRP_MJ_WRITE retry for a local FSD and no retries for a network FSD.

> At least one version of windows would silently swallow the exception
> during either CLEANUP or CLOSE handling and so if you called a Cc Function
> in there which was called with a lock held and that threw an exception
> then suddenly everything would appear to work but your lock was left
> dangling.

What version of Windows are you talking about? I have never seen
IRP_MJ_CLEANUP/IRP_MJ_CLOSE wrapped with try/except by the I/O manager. The
swallowing could happen, for example, if there was a driver that wrapped
IoCallDriver with try/except. But in my practice that usually means somebody
called a lower driver, got a BSOD, and worked around it with try/except
instead of finding the real problem. Meanwhile the real issue was that this
somebody passed a wrong IRP to the lower-level driver.

One more thing: are you wrapping ExAllocatePool with try/except too?

> You just said never :). When CcSetFileSizes fails to allocate VACBs or
> fails to extend section it calls ExRaiseStatus explicitly. When you’re
> truncating section there is no section extending or VACBs allocating. Am i
> wrong?

I’m not doubting you for a second. I don’t know the insides as you do, but
I’m merely observing that wrapping every Cc and Mm call in a try/except
(with an ASSERT of course) costs maybe 2 hours at the start of a project,
and you can waste that much time debugging just one case where it throws.
Left-shift that to a customer who’s already pissed off; converting a
bluescreen into an application malfunction could keep you a customer.

Having said all that, I just checked a recent project and in that case I
didn’t wrap CcSetFileSizes. So much for practising what I preach. Apologies.

> These exceptions could occur because of you pass wrong parameters to Mm or
> Cc routines for example. So what cases you’re talking about?

I’m not talking about bad parameters, but edge cases or normal running. Of
course having an assert at the point of failure can help debug things faster
when you have blown it.

> At least one version of windows would silently swallow the exception
> during either CLEANUP or CLOSE handling and so if you called a Cc Function
> in there which was called with a lock held and that threw an exception
> then suddenly everything would appear to work but your lock was left
> dangling.

> About what version of Windows you’re talking?

Possibly XP (SP3), maybe Longhorn prior to the first SR. I’ve not seen it
recently, but then again I learnt my lesson…

> if there was a driver which wrapped IoCallDriver with try/except. But in
> my practice such way also means that somebody called lower driver, got
> BSOD, and workaround this by try/except instead of finding real problem.

Sadly true. We’ve all seen such code.

But in this case it would almost certainly be a clean system - and it was
long enough ago that most of the MS filters that we see these days weren’t
present.

> One more thing,
> Are you wrapping around ExAllocatePool with try/except too?

I’ve not experienced ExAllocatePool (or free) throwing (via a call to Mm I’d
guess?) but if I had (for genuine reasons, which don’t include flaky disks
or memory) I would have.

R

Malcolm Smith wrote:

> The original design of NT allowed virtual address mappings to reference
> physical pages, with no reverse mapping allowing a physical page to
> identify its virtual address users…

Malcolm, thank you for the great historical perspective and advice.

> So even with CcCoherencyFlushAndPurgeCache, there’s a
> chance that a page reference can be trimmed and immediately recreated,
> causing purge to fail; and it’s why CcCoherencyFlushAndPurgeCache is
> almost always called in a loop. The loop is there to catch soft faults
> resolving the virtual address to page mapping immediately.

This is some good advice re: CcCoherencyFlushAndPurgeCache. FastFat may call CcCoherencyFlushAndPurgeCache during WRITE, but it does not use a loop.

Do you mean that we should retry when we get STATUS_CACHE_PAGE_LOCKED? Should we also throw a KeDelayExecutionThread in there? Something along the lines of (not tested):

static const LONG Delays[] = { … };

NTSTATUS FspCcCoherencyFlushAndPurgeCache(…)
{
    IO_STATUS_BLOCK IoStatus;
    LARGE_INTEGER Delay;

    for (ULONG i = 0, n = sizeof Delays / sizeof Delays[0];; i++)
    {
        __try
        {
            CcCoherencyFlushAndPurgeCache(…, &IoStatus);
            if (STATUS_CACHE_PAGE_LOCKED != IoStatus.Status)
                return IoStatus.Status;
        }
        __except (EXCEPTION_EXECUTE_HANDLER)
        {
            return GetExceptionCode();
        }

        /* Delays[] entries should be negative (relative) 100ns intervals */
        Delay.QuadPart = i < n ? Delays[i] : Delays[n - 1];
        KeDelayExecutionThread(KernelMode, FALSE, &Delay);
    }
}

Also should we limit the retries?

> The original way NT was intending to make truncation work is via
> MmCanFileBeTruncated…
> This is of course extremely problematic since it means truncation can
> fail semi-randomly, or even, quite deterministically.

My biggest gripe with the Windows file system is that there are a couple of cases where its behavior seems to be almost non-deterministic. My favorite one being that DeleteFile reports success, but the file can still be there.

> The best I could really do was to ensure that CcSetFileSizes is called
> early when extending so that on allocation failure the extend operation
> could be failed. However, when truncating it’s important to perform the
> truncation and commit it first because once purging starts it would be
> invalid to roll back a truncation, so the same API gets called very
> differently for extend vs. truncate. NTFS is still relying on
> MmCanFileBeTruncated to reduce the risk of spurious purge failure.

This is some great advice, although I may not be able to apply it to my FSD for the reasons explained earlier.

> Note the issue with purge failure on truncate isn’t immediate - it might
> leave data beyond file size which can’t be read directly and can’t be
> written to disk since there’s no allocation, but the real problem is
> what happens when the file is extended again and those stale pages
> become reachable. I can’t help but think the “best” way to handle this
> today is do the purge as part of the extension (and check for failure)
> to clean up after a previous truncate, and throw MmCanFileBeTruncated to
> the curb.

I must admit that I am not completely following you here. Are you saying that we should be doing CcPurgeCacheSection from the old SectionSize to the newly extended SectionSize, and abort the extension operation if CcPurgeCacheSection fails?

Bill

Rod Widdowson wrote:

> … I have wasted too much time fixing
> [my] bugs which have happened because of exceptions from Mm and Cc which I
> just didn’t expect that I always bracket all Cc and Mm calls with a try as
> a matter of course (like FAT does, but not in that way - I keep them
> localized).

This is some great advice right there. I also think that wrapping Cc, Mm (and FsRtl) calls and localizing exception handling is mandatory.

<opinion.feelfreetoignore>
On a related aspect I am not a fan of communicating this kind of failure through exceptions in the kernel. I love it when I am using Python, not so much when I am writing FSDs.
</opinion.feelfreetoignore>

Bill

> I must admit that I am not completely following you here? Are you saying
> that we should be doing CcPurgeCacheSection from the old SectionSize to
> the newly extended SectionSize, and abort the extension operation if
> CcPurgeCacheSection fails?

Imagine you mapped a file which is 4 pages long, and you mapped all 4
pages. You can also read this file with the ReadFile function, and the
physical pages would be the same for both the cache and the mapped view.
Then you truncate the file to 2 pages. Reads and writes beyond the new file
size would now fail, but the mapped view remains accessible. From the
file’s point of view the last 2 pages are stale after the truncation and
won’t be written to disk. Now you extend the file again to 4 pages, and the
earlier truncated pages become part of the file again, with the same data
that was there before the truncation.

So before truncating a file you should call the MmCanFileBeTruncated
routine with the target file size to be sure there is no mapped view of
that part of the file. I think your FSD does this check.

> Imagine to yourself you mapped file which is 4 pages length and you mapped
> all 4 pages. You can read this file with ReadFile function also. But…

Anatoly, this is a good example, thanks. I was trying to see if I understand Malcolm’s advice on how to use CcPurgeCacheSection during extension rather than MmCanFileBeTruncated.

Bill