If you use a journal mechanism, every intermediate state of the disk
should result (after recovery) in a consistent state. This can hold even
in the face of storage-subsystem-level caching IF the storage subsystem
itself maintains strict write ordering for the operations. If the storage
subsystem reorders writes AND reports I/O as completed when it has not
(e.g., a write-back caching scheme), then it is not possible to guarantee
such semantics.
In the ancient past (when I was responsible for implementing the
transactional logging and recovery subsystem for a disk-based physical-media
file system) we tested our recovery subsystem by running automated tests
that would snapshot and recover the disk after EVERY successive sector-level
write operation. This tested exactly the premise above: that the log would
(after recovery) leave the disk in a completely consistent state. Using
automated tools, test scripts, etc., we ran the tests essentially
endlessly, automatically saving away any disk state that did NOT recover
properly so we could analyze the failure and repair it. Over time, we ended
up with a robust and efficient recovery system.
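The testing approach above can be sketched as a brute-force harness: build the disk image as it would look if power failed after each prefix of the sector-level write sequence, run recovery on that "crashed" image, and check a consistency predicate. The names and the toy predicate below are illustrative assumptions, not our actual tooling.

```python
# Exhaustive crash-point recovery test (illustrative sketch).
# write_sequence: ordered list of (sector, data) writes as issued.
# recover_fn:     runs journal recovery in place on a disk image (a dict).
# is_consistent:  predicate that checks file system invariants.
# Returns the crash points (prefix lengths) whose recovered image failed,
# so the failing states can be saved off and analyzed.

def exhaustive_recovery_test(write_sequence, recover_fn, is_consistent):
    failures = []
    for cut in range(len(write_sequence) + 1):
        disk = {}
        for sector, data in write_sequence[:cut]:
            disk[sector] = data          # state at the simulated power cut
        recover_fn(disk)                 # replay/recover the journal
        if not is_consistent(disk):
            failures.append(cut)         # keep this state for analysis
    return failures
```

Note this assumes the write sequence seen by the test is the order the writes actually reached the media; a caching, re-ordering storage subsystem invalidates that assumption, which is the whole point of the discussion above.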
In all fairness, it took FAR less time and effort to ensure that journal
recovery worked properly than it did to ensure that the equivalent of chkdsk
worked properly (since we assumed IT would run on disks where there might be
PHYSICAL damage to the media, and hence partial corruption of metadata that
was outside the scope of the journaling technique to repair).
If you have a storage subsystem that is caching and re-ordering writes, that
can lead to NTFS data corruption. If your storage subsystem does not cache
and re-order writes, then you would appear to have an NTFS issue, and you
should take it up with the NTFS development team.
Regards,
Tony
Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com
See you in Palo Alto at the next File Systems seminar in September!
-----Original Message-----
From: Jan Bottorff [mailto:xxxxx@pmatrix.com]
Sent: Wednesday, August 13, 2003 12:34 AM
To: Windows System Software Developers Interest List
Subject: [ntdev] Surprise removal causing corruption on NTFS
Perhaps some of you file system folks have some input on this…
It seems to be true that NTFS is a journaling file system. My
understanding is that a journaling file system will replay the journal at
volume mount time if the volume was unmounted by a surprise removal. I also
understand that replaying the journal may not restore every piece of user
data that was on the disk at the moment of failure, but the journal will
ensure everything is restored to a consistent state from the recent past
(like seconds in the past). This is especially true for the file system
metadata. I also believe data written to files with no buffering (i.e.,
direct physical I/O) will not be in any consistent state, but the metadata
for those files will be.
So here is my question: why would surprise removals of a storage device
with NTFS EVER cause chkdsk to report the disk is corrupt and metadata
is inconsistent? I believe NTFS allows reordering of certain writes, but
periodically has a form of write barrier where all I/O activity is
synchronized, assuring the integrity of journal recovery. Is this an
indication my device/controller is reordering writes or failing some
I/O’s out of order? Or am I just overly optimistic on the effectiveness
of NTFS’s journaling?
Thanks.