This may be a bit off topic here, but on a related note, do any of you guys
know whether most disk devices today guarantee atomicity of sector writes
(i.e., a sector of data can never be half written; it’s either all written
or not written at all)?
Matt
-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Tony Mason
Sent: Thursday, January 20, 2005 12:17 PM
To: Windows File Systems Devs Interest List
Subject: RE: [ntfsd] NTFS fault tolerant?
The issue of reliability deals with the classes of failures against which
you are trying to protect. For example, transactional systems are
attempting to protect against a category of errors related to premature
termination of the system. File systems often also try to include
additional information to protect against physical damage to the underlying
storage (which transactional systems do NOT protect against!).
When you issue a cached I/O operation, the data is written to a location in
memory and asynchronously written back to disk - you have no guarantees as
to when it has been successfully written back.
When you issue a non-cached I/O operation, the data is written to a
persistent location - once that I/O has completed, you are guaranteed that
the data can be retrieved even if the system halts prematurely (such as a
spontaneous reboot). Thus, transactional systems generally use a log: they
first record the change about to be made, then make the change itself. If
the system crashes AFTER the log is written but BEFORE the data is written,
the data is rewritten from the log in order to recover from the premature
failure.
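To make the ordering concrete, here is a minimal sketch of that log-then-data
pattern using the Win32 API. It is illustrative only: the handles, offsets,
and record layout are assumptions, error handling and sector alignment are
omitted, and a real log record would also carry the checksum discussed below.

#include <windows.h>

/* Write at a given offset and make sure the data has reached stable
   storage before returning.  FlushFileBuffers() forces it out of the
   cache; opening the handle with FILE_FLAG_WRITE_THROUGH is a common
   additional measure. */
static BOOL DurableWrite(HANDLE h, LONGLONG offset, const void *buf, DWORD len)
{
    OVERLAPPED ov = {0};
    ov.Offset     = (DWORD)(offset & 0xFFFFFFFF);
    ov.OffsetHigh = (DWORD)(offset >> 32);

    DWORD written = 0;
    if (!WriteFile(h, buf, len, &written, &ov) || written != len)
        return FALSE;
    return FlushFileBuffers(h);
}

/* Log-then-data: (1) durably record the intended change in the log,
   (2) only then apply it to the data file.  A crash between the two
   steps is recoverable because the logged change can be replayed. */
BOOL TransactedUpdate(HANDLE hLog, HANDLE hData,
                      LONGLONG logOffset, LONGLONG dataOffset,
                      const void *record, DWORD recordLen)
{
    if (!DurableWrite(hLog, logOffset, record, recordLen))
        return FALSE;              /* nothing on disk has changed yet */
    return DurableWrite(hData, dataOffset, record, recordLen);
}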
In transactional systems I’ve written previously, we normally checksum the
records within the log to protect against partial-write errors (in other
words, where only part of the record was written, not the entire
record) that might occur when the system fails. Within the file system, we
also duplicated some critical information in order to make recovery from
media failures (bad sectors, etc.) simpler. In distributed systems, we would
often keep critical information stored in multiple replicas - so that even if
one computer “ceased to exist” we would still be able to recover from the
problem.
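As a rough illustration of such a checksummed log record, here is one
possible layout; the field sizes and the CRC-32 routine are made up for the
example and not taken from any particular product:

#include <windows.h>
#include <string.h>

/* One log record: a header carrying a sequence number, the payload length,
   and a checksum over the payload.  A record whose checksum does not
   verify is treated as torn (partially written) and ignored at recovery. */
typedef struct {
    ULONGLONG sequence;     /* monotonically increasing record number  */
    DWORD     payloadLen;   /* bytes of payload following the header   */
    DWORD     checksum;     /* CRC-32 of the payload                   */
} LOG_RECORD_HEADER;

/* Plain bitwise CRC-32 (any strong checksum would do). */
static DWORD Crc32(const BYTE *p, DWORD len)
{
    DWORD crc = 0xFFFFFFFF;
    for (DWORD i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320 : (crc >> 1);
    }
    return ~crc;
}

/* Build header + payload in 'out', ready for a single durable write. */
DWORD BuildLogRecord(ULONGLONG seq, const BYTE *payload, DWORD len, BYTE *out)
{
    LOG_RECORD_HEADER hdr;
    hdr.sequence   = seq;
    hdr.payloadLen = len;
    hdr.checksum   = Crc32(payload, len);
    memcpy(out, &hdr, sizeof(hdr));
    memcpy(out + sizeof(hdr), payload, len);
    return (DWORD)(sizeof(hdr) + len);
}

/* Recovery-side check: is this record intact? */
BOOL LogRecordIsValid(const LOG_RECORD_HEADER *hdr, const BYTE *payload)
{
    return Crc32(payload, hdr->payloadLen) == hdr->checksum;
}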
Thus, I suggest the first thing you do is try to identify the categories of
failures from which you need to protect yourself and then devise strategies
for protecting against those failures. For example you might have a list
like:
(1) Failure due to power failure
(2) Failure due to shutdown while work is in-progress
(3) Failure due to bugcheck
(4) Failure due to destruction of the data storage unit.
Then your solutions might be:
(1) A logging (transactional) system in which the integrity of the log is
protected using checksums.
(2) & (3) A logging system allowing recovery upon reboot (see the recovery
sketch after this list)
(4) Periodic off-site backups of the data (implemented by documenting this
for users)
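For (2) and (3), recovery on reboot can then be a simple scan of the log:
replay every record whose checksum verifies and stop at the first one that
does not. A sketch only, reusing the hypothetical record layout from the
previous example; ApplyRecord() stands in for whatever "apply this change to
the data file" means for the particular on-disk format:

/* Assumes LOG_RECORD_HEADER and LogRecordIsValid() from the earlier sketch. */
extern void ApplyRecord(HANDLE hData, const LOG_RECORD_HEADER *hdr,
                        const BYTE *payload);      /* format-specific */

void RecoverFromLog(HANDLE hLog, HANDLE hData)
{
    BYTE  payload[4096];
    DWORD offset = 0;

    for (;;) {
        LOG_RECORD_HEADER hdr;
        OVERLAPPED ov = {0};
        DWORD read = 0;

        ov.Offset = offset;
        if (!ReadFile(hLog, &hdr, sizeof(hdr), &read, &ov) ||
            read != sizeof(hdr))
            break;                          /* reached the end of the log  */
        if (hdr.payloadLen > sizeof(payload))
            break;                          /* implausible header          */

        ov.Offset = offset + (DWORD)sizeof(hdr);
        if (!ReadFile(hLog, payload, hdr.payloadLen, &read, &ov) ||
            read != hdr.payloadLen)
            break;                          /* log ends mid-record         */
        if (!LogRecordIsValid(&hdr, payload))
            break;                          /* torn (partial) write        */

        ApplyRecord(hData, &hdr, payload);  /* re-apply the logged change  */
        offset += (DWORD)(sizeof(hdr) + hdr.payloadLen);
    }
}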
I hope this makes sense. I can say that, in my experience, building
resilient systems capable of handling a broad range of failures is quite a
bit more difficult than implementing the basic functionality itself.
Regards,
Tony
Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com
Looking forward to seeing you at the Next OSR File Systems Class April 4,
2004 in Boston!
-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Alexey Logachyov
Sent: Thursday, January 20, 2005 5:17 AM
To: ntfsd redirect
Subject: Re:[ntfsd] NTFS fault tolerant?
Please explain how use of non-cached I/O will help. This is important for
us.
Second, we cannot change the format of the database. Our database is
specialized for our needs. It has nothing to do with tables, relations,
primary keys, and all that kind of thing. What we need is some advice on how
to store data so that we don’t lose or corrupt data on disk in case of an
unexpected termination.
–htfv
“Lyndon J Clarke” wrote in message
news:xxxxx@ntfsd…
> First of all, use non-cached I/O.
> Second, have a look at a database design 101 resource.
>
> Cheers
> Lyndon
>
> “Alexey Logachyov” wrote in message
> news:xxxxx@ntfsd…
>> Here are some questions for NTFS gurus. We store information in a
>> database of our own format. From time to time we need to update critical
>> structures, one at a time. The size of this structure is small - about
>> 200-300 bytes. We use the WriteFile routine followed by the
>> FlushFileBuffers routine (the file is opened for buffered I/O). We
>> really-really want to be tolerant to surprise shutdowns (hardware reset
>> and power losses). So, here are the questions:
>>
>> 1. Can we say that at any given time we have either the old version of
>> the information or the new information? Can it be updated partially? As
>> far as I understand, if the piece of data crosses page boundaries, we
>> cannot guarantee that.
>>
>> 2. Is it possible that data near the updated location becomes corrupted?
>> We noticed that sometimes data before and/or after the written block
>> becomes corrupted if power was disrupted.
>>
>> 3. Can anyone give some advice as to how we can achieve our goal? Does
>> anyone know how enterprise databases guarantee consistency?
>>
>> --htfv
>>
>>
>
>
>