Surprise removal causing corruption on NTFS

Perhaps some of you file system folks have some input on this…

As I understand it, NTFS is a journaling file system, and a journaling
file system replays its journal at volume mount time if the volume was
last taken offline by a surprise removal. I also understand that
replaying the journal may not restore every piece of user data that was
on the disk at the moment of failure, but the journal should ensure
everything is restored to a consistent state from the recent past
(seconds in the past). This is especially true for the file system
metadata. I also believe data written to files with no buffering (i.e.
direct physical I/O) will not be in any consistent state, but the
metadata for those files will be.

So here is my question: why would surprise removal of a storage device
with NTFS EVER cause chkdsk to report that the disk is corrupt and the
metadata is inconsistent? I believe NTFS allows reordering of certain
writes, but periodically has a form of write barrier where all I/O
activity is synchronized, ensuring the integrity of journal recovery. Is
this an indication that my device/controller is reordering writes or
failing some I/Os out of order? Or am I just overly optimistic about the
effectiveness of NTFS’s journaling?

Thanks.

- Jan

Only the metadata is logged; user data is not. If there is write-behind
data in the cache and the disk is removed, you will get an error
indicating that the data in the cache never made it to the disk.

Jamey Kirby, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com]
On Behalf Of Jan Bottorff
Sent: Tuesday, August 12, 2003 9:34 PM
To: Windows System Software Developers Interest List
Subject: [ntdev] Surprise removal causing corruption on NTFS

If you utilize a journal mechanism, every intermediate state of the disk
should result (after recovery) in a consistent state. This can even occur
in the face of storage subsystem level caching IF the storage subsystem
itself maintains strict write ordering for the operations. If the storage
subsystem modifies the order AND reports I/O as completed when it has not
(e.g., utilizing a write-back caching scheme) then it is not possible to
guarantee such semantics.

In the ancient past (when I was responsible for implementing the
transactional logging and recovery subsystem for a disk based physical media
file system) we tested our recovery subsystem by running automated tests
that would snapshot and recover the disk after EVERY successive sector level
write operation. This tested exactly the above premise: that the log would
(after recovery) leave the disk in a completely consistent state. Using
automated tools, test scripts, etc. we would essentially run tests
endlessly, automatically saving away disk states that did NOT recover
properly so we could analyze and repair them. Over time, we ended up with a
robust and efficient recovery system.
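
A toy version of that test loop can be sketched in Python. This is only an
illustration under stated assumptions, not the tooling described above and
not how NTFS is implemented: the "disk" is a dict with four hypothetical
sectors (log, commit, bitmap, dir), each transaction is a redo record plus a
commit record followed by two in-place metadata writes, and "consistent"
simply means the two metadata sectors agree.

```python
def transaction_writes(txid, value):
    """Sector writes for one metadata update, in the order they are issued."""
    return [
        ("log", (txid, value)),   # redo record describing the update
        ("commit", txid),         # commit record for that redo record
        ("bitmap", value),        # first in-place metadata write
        ("dir", value),           # second in-place metadata write
    ]


def recover(disk):
    """Redo the logged update if, and only if, its commit record is on disk."""
    disk = dict(disk)
    log = disk["log"]
    if log is not None and disk["commit"] == log[0]:
        disk["bitmap"] = disk["dir"] = log[1]
    return disk


def is_consistent(disk):
    # Toy invariant: both metadata sectors must describe the same state.
    return disk["bitmap"] == disk["dir"]


def crash_after_every_write(writes):
    """Snapshot the disk after each write, recover it, and collect failures."""
    disk = {"log": None, "commit": None, "bitmap": 0, "dir": 0}
    failures = []
    for step, (sector, value) in enumerate(writes, start=1):
        disk[sector] = value
        if not is_consistent(recover(disk)):
            failures.append((step, dict(disk)))  # keep the state for analysis
    return failures


if __name__ == "__main__":
    writes = []
    for txid, value in enumerate([7, 8, 9], start=1):
        writes += transaction_writes(txid, value)
    print(crash_after_every_write(writes))
```

As long as the writes reach the disk strictly in issue order, the failure
list in this model comes back empty. Any state that does show up in it is
the kind of snapshot one would save away and analyze.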

In all fairness, it took FAR less time and effort to ensure the journal
recovery worked properly than it did to ensure the equivalent of chkdsk
worked properly (since we assumed IT would run on disks where there might be
PHYSICAL damage to the media and hence partial corruption of meta-data that
was outside the scope of the journaling technique to repair).

If you have a storage subsystem that is caching and re-ordering writes, it
will potentially lead to NTFS data corruption. If your storage subsystem
does not cache and re-order writes, then you would appear to have an NTFS
issue and you should take it up with the NTFS development team.
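
To make the reordering hazard concrete, here is a minimal Python sketch
under the same kind of toy assumptions (a hypothetical four-sector disk, a
redo record plus a commit record, two in-place metadata writes; none of this
is NTFS's actual on-disk format). It models a write-back cache that has
already reported all four writes as complete and may flush them in any
order, then checks which flush orders leave a state that recovery cannot
repair if power is lost partway through.

```python
from itertools import permutations

# One metadata update, listed in the order the file system issued it.
WRITES = [("log", (1, 7)), ("commit", 1), ("bitmap", 7), ("dir", 7)]


def recover(disk):
    """Redo the logged update if, and only if, its commit record is on disk."""
    disk = dict(disk)
    log = disk["log"]
    if log is not None and disk["commit"] == log[0]:
        disk["bitmap"] = disk["dir"] = log[1]
    return disk


def unrecoverable_orders(writes):
    """Flush orders for which some crash point defeats journal recovery."""
    bad = []
    for order in permutations(writes):
        disk = {"log": None, "commit": None, "bitmap": 0, "dir": 0}
        for sector, value in order:
            disk[sector] = value           # this write reached the media
            after = recover(disk)          # power is lost right here
            if after["bitmap"] != after["dir"]:
                bad.append(order)
                break
    return bad


if __name__ == "__main__":
    bad = unrecoverable_orders(WRITES)
    print(len(bad), "of 24 flush orders can leave inconsistent metadata")
```

In this model only the four orderings that put both log records on the
media before either in-place write survive every crash point. The issue
order itself is one of them, which is why a device that preserves ordering
is safe here and one that reorders behind an early completion is not.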

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

See you in Palo Alto at the next File Systems seminar in September!

> indication my device/controller is reordering writes or failing some
> I/O’s out of order? Or am I just overly optimistic on the effectiveness
> of NTFS’s journaling?

Reordering will not hinder journaling. Journaling relies only on the fact
that, when completion is called for a write, the data written is physically in
nonvolatile storage. That is the only requirement journaling places on the
storage stack, and even that must be enforced only for writes carrying the
SL_WRITE_THROUGH flag (if I remember the name correctly).
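
A small Python model of that claim, offered as a sketch rather than a
description of NTFS internals. The assumptions are mine: a hypothetical
four-sector disk, a redo record and a commit record, and a file system that
waits for each group of writes to complete before issuing the next. The
device is free to reorder writes that are in flight together, but it never
reports completion before the data is durable.

```python
from itertools import permutations

# Each inner list is issued only after the previous list has completed.
BATCHES = [
    [("log", (1, 7))],                 # redo record
    [("commit", 1)],                   # commit record
    [("bitmap", 7), ("dir", 7)],       # in-place writes, in flight together
]


def recover(disk):
    """Redo the logged update if, and only if, its commit record is on disk."""
    disk = dict(disk)
    log = disk["log"]
    if log is not None and disk["commit"] == log[0]:
        disk["bitmap"] = disk["dir"] = log[1]
    return disk


def reachable_states(batches):
    """Yield every on-disk state a crash could leave behind.

    Writes inside a batch may land in any order; earlier batches are
    already durable because their completion was reported honestly.
    """
    durable = {"log": None, "commit": None, "bitmap": 0, "dir": 0}
    for batch in batches:
        for order in permutations(batch):
            disk = dict(durable)
            yield dict(disk)
            for sector, value in order:
                disk[sector] = value
                yield dict(disk)
        durable.update(batch)


if __name__ == "__main__":
    ok = all(r["bitmap"] == r["dir"]
             for r in map(recover, reachable_states(BATCHES)))
    print("every crash state recovers consistently:", ok)
```

In this model the reordering inside the last batch is harmless, because by
the time those writes are issued the log and commit records are already on
the media and recovery can redo the update.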

As for CHKDSK, I don’t know when AUTOCHK forcibly runs it; for me, it occurred
very, very rarely (much more rarely than on FAT).

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com