> thanks, and sorry for not giving out detailed information.
> This time I will make up for it:
> The backup file is not on the same drive, but it may be on the same
> disk, such as in a different partition.
Still not much info.
If the file you are writing to is on the same disk (different partition),
then the strategy I would use would be to read lots of large buffers
(at least 64KB, but perhaps even 2MB at a time) from disk locations that
are as contiguous as possible into a huge memory buffer. When the memory
buffer is basically full, stop your reads and start writing pieces out
to the disk until you have written it all, then go back and do the
reads again. A sketch of that loop follows.
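Here is a minimal sketch of those alternating phases using synchronous
Win32 I/O; the handle names, the 64MB staging buffer, and the 2MB chunk
size are my assumptions, and error handling is trimmed:

/* Alternate a read phase and a write phase, assuming hSrc and hDst
 * were opened with CreateFile and hugeBuf is a large allocation. */
#include <windows.h>

#define CHUNK_SIZE    (2 * 1024 * 1024)   /* 2MB per I/O, per the text   */
#define HUGE_BUF_SIZE (64 * 1024 * 1024)  /* assumed staging buffer size */

void CopySameDisk(HANDLE hSrc, HANDLE hDst, BYTE *hugeBuf)
{
    for (;;) {
        /* Read phase: fill the staging buffer with large sequential reads. */
        DWORD filled = 0, got = 0;
        while (filled + CHUNK_SIZE <= HUGE_BUF_SIZE) {
            if (!ReadFile(hSrc, hugeBuf + filled, CHUNK_SIZE, &got, NULL) ||
                got == 0)
                break;
            filled += got;
        }
        if (filled == 0)
            break;                         /* source exhausted */

        /* Write phase: drain the whole staging buffer before reading
         * again, so the heads seek between source and destination only
         * once per buffer's worth of data rather than per I/O. */
        DWORD written = 0, put = 0;
        while (written < filled) {
            if (!WriteFile(hDst, hugeBuf + written, filled - written,
                           &put, NULL))
                return;
            written += put;
        }
        if (got == 0)
            break;                         /* hit EOF during the read phase */
    }
}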
The reason for this is that you don’t want to be moving the heads. If the
output is on the same disk, interleaving reads and writes can cause a lot
of head movement.
The best approach is to put the output file on a different disk and then
interleave reads and writes. The reason 64KB comes up as a buffer size
is that the SCSI standard only used to allow 64KB transfers at a time.
I don’t know if that is still true.
However, my own studies have shown that reading in 2MB chunks does move
data faster than 64KB. I suspect the reason is all the smarts in the OS
and disk controller/drive software below your I/Os.
If I wanted to make it REALLY fast, I would empirically determine a good
buffer size while the application was running, by testing different
buffer sizes during the early reads/writes, and once the software had
determined a good size, leave it alone. A sketch of that idea follows.
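For example, something along these lines (a hypothetical sketch; the
candidate sizes and the single-read-per-size measurement are my
assumptions, and a real version would average several reads per size):

/* Time one read at each candidate size during the early reads and keep
 * whichever size gives the best throughput for the rest of the run. */
#include <windows.h>

static const DWORD kCandidates[] =
    { 64*1024, 256*1024, 1024*1024, 2*1024*1024 };

DWORD PickChunkSize(HANDLE hSrc, BYTE *buf)
{
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);

    DWORD  best     = kCandidates[0];
    double bestRate = 0.0;

    for (size_t i = 0; i < sizeof(kCandidates)/sizeof(kCandidates[0]); i++) {
        DWORD size = kCandidates[i], got = 0;

        QueryPerformanceCounter(&t0);
        if (!ReadFile(hSrc, buf, size, &got, NULL) || got == 0)
            break;
        QueryPerformanceCounter(&t1);

        double secs = (double)(t1.QuadPart - t0.QuadPart)
                      / (double)freq.QuadPart;
        double rate = (double)got / secs;      /* bytes per second */
        if (rate > bestRate) {
            bestRate = rate;
            best     = size;
        }
    }
    return best;   /* use this for the remaining I/O, then leave it alone */
}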
Be sure to pre-allocate as much space in the output file as you know you
will need, and extend it by large amounts when you need to extend it.
This will tend to give you more contiguous extents and therefore
MUCH better performance (see the sketch after this paragraph). On the
read side, use a scatter/gather algorithm so that you can read data
that is as contiguous as possible. File system fragmentation WILL
definitely slow your performance. I know that years ago, Digital
Equipment Corporation got huge performance gains on their backup product
by doing read scatter/gather operations, reading from different files at
the same time so they could read large contiguous chunks of disk
space. Disks are faster now, but the problem is still the same.
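On Windows, pre-allocation can be done by setting the end-of-file marker
with SetFilePointerEx and SetEndOfFile. A minimal sketch, where the 256MB
extension step is my assumption; remember to restore the file pointer
before writing and to truncate back to the real length when the copy
finishes:

#include <windows.h>

#define EXTEND_STEP (256LL * 1024 * 1024)   /* assumed growth step */

/* Reserve the expected size up front so the file system can hand out
 * large contiguous extents. */
BOOL PreallocateOutput(HANDLE hDst, LONGLONG expectedBytes)
{
    LARGE_INTEGER size;
    size.QuadPart = expectedBytes;
    if (!SetFilePointerEx(hDst, size, NULL, FILE_BEGIN))
        return FALSE;
    return SetEndOfFile(hDst);
}

/* When the estimate runs out, grow by a large step, not per write. */
BOOL ExtendOutput(HANDLE hDst, LONGLONG currentEnd)
{
    LARGE_INTEGER size;
    size.QuadPart = currentEnd + EXTEND_STEP;
    if (!SetFilePointerEx(hDst, size, NULL, FILE_BEGIN))
        return FALSE;
    return SetEndOfFile(hDst);
}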
I would agree with the recommendations that you should probably queue
at least 2-4 asynchronous requests at a time (read first, then write,
unless on a different drive, in which case interleave 2-4 of each); a
sketch follows.
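A sketch of keeping several overlapped reads in flight, assuming the
source was opened with FILE_FLAG_OVERLAPPED. The queue depth of 4 and
the 2MB chunk size are my assumptions; error handling and cancellation
of the still-outstanding requests at EOF are trimmed, and on the
two-drive layout the writes would be issued the same way from their own
OVERLAPPED structures:

#include <windows.h>

#define DEPTH 4
#define CHUNK (2 * 1024 * 1024)

void ReadAhead(HANDLE hSrc, BYTE *bufs[DEPTH])
{
    OVERLAPPED ov[DEPTH] = {0};
    HANDLE events[DEPTH];
    LONGLONG offset = 0;

    /* Issue DEPTH reads back to back so the drive always has work queued.
     * With an overlapped handle, ReadFile normally returns FALSE with
     * GetLastError() == ERROR_IO_PENDING. */
    for (int i = 0; i < DEPTH; i++) {
        events[i]   = CreateEvent(NULL, TRUE, FALSE, NULL);
        ov[i].hEvent     = events[i];
        ov[i].Offset     = (DWORD)(offset & 0xFFFFFFFF);
        ov[i].OffsetHigh = (DWORD)(offset >> 32);
        ReadFile(hSrc, bufs[i], CHUNK, NULL, &ov[i]);
        offset += CHUNK;
    }

    for (;;) {
        /* Wait for any completed read, consume it, reissue it. */
        DWORD i = WaitForMultipleObjects(DEPTH, events, FALSE, INFINITE)
                  - WAIT_OBJECT_0;
        DWORD got = 0;
        if (!GetOverlappedResult(hSrc, &ov[i], &got, TRUE) || got == 0)
            break;                     /* EOF or error */

        /* ... hand bufs[i] to the write side here; note the write side
         * must preserve file order since completions can arrive out
         * of order ... */

        ResetEvent(events[i]);
        ov[i].Offset     = (DWORD)(offset & 0xFFFFFFFF);
        ov[i].OffsetHigh = (DWORD)(offset >> 32);
        ReadFile(hSrc, bufs[i], CHUNK, NULL, &ov[i]);
        offset += CHUNK;
    }
    for (int i = 0; i < DEPTH; i++)
        CloseHandle(events[i]);
}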
Be sure to heed the tip to use page-aligned buffers for the I/O, and
if you do a good job of managing the I/O, turning off buffering (caching)
with FILE_FLAG_NO_BUFFERING would also be a good idea. This is ESPECIALLY
true if you intend to read and write a LOT of data, since otherwise you
will just flush the cache for other uses and total system performance
will suffer. Besides, this kind of application isn’t as well suited to
caching as it is to managing the I/O yourself.
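The two tips go together: FILE_FLAG_NO_BUFFERING requires that buffers,
file offsets, and transfer sizes be aligned to the volume’s sector size,
and a page-aligned buffer from VirtualAlloc satisfies any common sector
size. A minimal sketch (the function names and read-only flags here are
placeholders):

#include <windows.h>

/* Open the source for unbuffered reads; the cache is bypassed entirely. */
HANDLE OpenUnbuffered(const wchar_t *path)
{
    return CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                       OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL);
}

/* VirtualAlloc returns page-aligned memory, which is sector-aligned. */
BYTE *AllocAlignedBuffer(SIZE_T bytes)
{
    return (BYTE *)VirtualAlloc(NULL, bytes, MEM_COMMIT | MEM_RESERVE,
                                PAGE_READWRITE);
}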
I can’t speak to the documentation pointed to by Peter because
I don’t have that email any longer. It wouldn’t hurt, when you
reference something like that, to give the details in your email.
Hope this helps.
Rick Cadruvi…