Disk to disk copy

Hello,
My question is about disk-to-disk copy. I know it's
not really a question for ntdev, but I thought I'd ask
the gurus of programming. I would appreciate any
thoughts.

What would be the best algorithm to copy data from one
disk to another? It basically involves reading from
one disk and writing to another, issuing multiple
outstanding IOs in parallel and keeping residual
counts, so that if the copy operation fails, the
caller gets a count of how much was transferred.
Multiple IOs mean out-of-order completions, which
means that if an out-of-order completion comes in, we
wait for the write completion of the last pending IO
before issuing another read. E.g. if I issue IO numbers
0, 1, 2 and 3, and 3 completes, I need to wait for 0, 1
and 2 to complete before I can reissue a read using IO
struct 3. Unfortunately, no piggyback completions are
available in SCSI, so it's pretty serial. I'm just
stumped on how to improve performance for a disk-to-disk
copy using multiple outstanding IOs as opposed to
a single IO. This is actually equivalent to
breaking up a larger IO into multiple smaller chunks,
issuing them in parallel, monitoring completions,
and issuing more IOs as previous ones complete.
Any pointers?

Mark


Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com

I was going to go into detail, but I dislike Yahoo addresses. The answer
is: it depends.

Yahoo addresses? I'm sorry, but I didn't get that.
I'm just asking for pointers to where I can go dig for
information.

Mark

David was commenting on using a bogus email address to hide your identity
from the other users of this list.

Jamey Kirby, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com]
On Behalf Of Mark Lobo
Sent: Tuesday, August 12, 2003 5:00 PM
To: Windows System Software Developers Interest List
Subject: [ntdev] Re: Disk to disk copy

Yahoo addresses? I'm sorry, but I didn't get that.
I'm just asking for pointers to where I can go dig for
information.

Mark

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: xxxxx@storagecraft.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

Jamey, David,
What bogus email address? I've been using this email
for quite some time. And isn't this a case of guilty
until proven innocent? FYI, I lost my job, and I'm
doing oddball jobs as a contractor. Right now, I'm just
trying to spruce up my resume, learning about things
that interest me and might get me a permanent job.
That's where my original question came from; if
more info is needed, it came from discussions with
someone at the place I'm contracting at right now.
It seemed like an interesting problem to solve, and it
got me thinking about how I would do it if I had to.
Since I'm doing small contract jobs, I unfortunately
don't have a permanent work email address like
everyone else does, so this is the email I use to
subscribe to all newsgroups. I am sure no one here
uses the same email address for personal emails and
newsgroup emails, unless they use their work email
address for everything. Or am I missing something else
here?

Mark

This is a moderated list, not a newsgroup. Addresses are fairly secure and
OSR does a fine job of keeping junk out of the list.

I was not complaining, just elaborating on what David said.

Jamey Kirby, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com]
On Behalf Of Mark Lobo
Sent: Tuesday, August 12, 2003 9:39 PM
To: Windows System Software Developers Interest List
Subject: [ntdev] Re: Disk to disk copy

Jamey, David,
What bogus email address? I've been using this email
for quite some time. And isn't this a case of guilty
until proven innocent? FYI, I lost my job, and I'm
doing oddball jobs as a contractor. Right now, I'm just
trying to spruce up my resume, learning about things
that interest me and might get me a permanent job.
That's where my original question came from; if
more info is needed, it came from discussions with
someone at the place I'm contracting at right now.
It seemed like an interesting problem to solve, and it
got me thinking about how I would do it if I had to.
Since I'm doing small contract jobs, I unfortunately
don't have a permanent work email address like
everyone else does, so this is the email I use to
subscribe to all newsgroups. I am sure no one here
uses the same email address for personal emails and
newsgroup emails, unless they use their work email
address for everything. Or am I missing something else
here?

Mark

I didn't say it was about junk, Jamey. It's just that I
like to keep my personal emails separate, because I
get quite a few emails from various newsgroups
every day, and I don't have a permanent work email
address with tons of space. The email address is
ntdeveloper2002 because that's what I did to start
with: write NT drivers. My name is still Mark, and
that's not bogus :)

And I understand that you were elaborating on what
David said; my previous email was just an explanation of
why it is the way it is. I didn't mean to offend anyone.

Mark

“Mark Lobo” wrote in message news:xxxxx@ntdev…
>
> Multiple IOs mean out-of-order completions, which
> means that if an out-of-order completion comes in, we
> wait for the write completion of the last pending IO
> before issuing another read. E.g. if I issue IO numbers
> 0, 1, 2 and 3, and 3 completes, I need to wait for 0, 1
> and 2 to complete before I can reissue a read using IO
> struct 3. Unfortunately, no piggyback completions are
> available in SCSI, so it's pretty serial.

No piggyback completions in SCSI, but out of order completions are possible
depending on how the SCSI disk controller reorders the I/O requests it gets.

But I don’t see your problem with having to wait for I/Os 0, 1, and 2 to
complete in order to start another one for completed I/O number 3. Let’s
say each of your I/Os is for 1 sector of a disk, both read and write. At
the start, you launch 4 I/O operations to read sector 0, sector 1, sector 2,
and sector 3 of the disk. If the read for 3 completes, you can immediately
launch a write operation to write sector 3 of the target disk, and launch a
read for sector 4 (or whatever the next highest sector is, based on a
pointer you keep.) Similarly, as each read for sector N completes, you
launch a write for sector N and launch a new read for sector NEXT.

As the writes complete, you can recycle the I/O structures you had set up
for them.

Oh, and use unbuffered I/O so you won’t thrash the Windows filesystem buffer
cache.
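
The scheme above can be sketched roughly as follows. This is only an illustration under loud assumptions: two in-memory buffers stand in for the source and target disks, a thread pool stands in for asynchronous disk I/O, and the names `copy_chunk`, `pipelined_copy`, `chunk_size` and `max_pending` are made up for the example. It shows the one idea that matters: any completion, in any order, frees a slot that is refilled at once from a "next chunk" pointer.

```python
import concurrent.futures as cf

def copy_chunk(src, dst, index, chunk_size):
    """Read chunk `index` from src and write it at the same offset in dst."""
    start = index * chunk_size
    data = src[start:start + chunk_size]     # stands in for the disk read
    dst[start:start + len(data)] = data      # stands in for the disk write
    return index

def pipelined_copy(src, dst, chunk_size=8, max_pending=4):
    """Copy src into dst with at most max_pending chunks in flight."""
    total_chunks = (len(src) + chunk_size - 1) // chunk_size
    next_index = 0                           # the NEXT pointer from the text
    pending = set()
    with cf.ThreadPoolExecutor(max_workers=max_pending) as pool:
        while next_index < total_chunks or pending:
            # Refill the window: each free slot gets the next unread chunk.
            while next_index < total_chunks and len(pending) < max_pending:
                pending.add(pool.submit(copy_chunk, src, dst,
                                        next_index, chunk_size))
                next_index += 1
            # Wake up on whichever chunk finishes first, in any order.
            done, pending = cf.wait(pending, return_when=cf.FIRST_COMPLETED)
            for fut in done:
                fut.result()                 # re-raise any I/O error
    return total_chunks

src = bytes(range(50))
dst = bytearray(len(src))
pipelined_copy(src, dst, chunk_size=8, max_pending=3)
print(bytes(dst) == src)  # True
```

Nothing here waits for chunks 0, 1 and 2 before reusing the slot that chunk 3 occupied; the window only caps how many chunks are in flight at once.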

Carl

But it's necessary to keep residual counts for the
operation, so that if it fails we can return an error to
the caller along with how much was transferred. If the
number of outstanding IOs is 10, and a completion
comes in out of order, say IO 3 completes before IO 0,
and we reissue IO 4 using IO 3's structure, we need to
maintain state on too many IOs and their completion
status. Say it looks like this: 0 has not completed, 1
has, 2 has not completed but 3 has, 4 has not
completed but 5 has, and all the completed IO structs
have been reused to issue new IOs. Since the
structures for the completed IOs have been
reused, we need some kind of bitmask that records the
completion status of each IO. For a 10-terabyte disk,
this bitmask can grow quite large if, say,
we are doing 256 KB IOs. If the residual count didn't
matter, i.e. the operation is simply failed back to the
caller when any IO fails, then yes, the IO objects could
just be reused right away without waiting. But in this
case, how does one keep track of residual counts
without keeping too much state AND while using as much
parallelism as possible?
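
To put rough numbers on the bitmask concern, here is a back-of-the-envelope calculation. It assumes binary units (10 TiB disk, 256 KiB per IO) and one bit per IO; the helper name `bitmap_size` is made up for the example.

```python
TIB = 1 << 40
KIB = 1 << 10

def bitmap_size(disk_bytes, io_bytes):
    """Return (number of IOs, bitmap size in bytes) at one bit per IO."""
    ios = -(-disk_bytes // io_bytes)   # ceiling division
    return ios, -(-ios // 8)

print(bitmap_size(10 * TIB, 256 * KIB))  # (41943040, 5242880)
```

That is about 42 million IOs and a 5 MiB bitmap if the status of every IO on the disk is tracked for the whole copy.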

Mark

I believe the problem can be solved by just keeping a list of pending IOs and a #define for the maximum number of pending IOs.
For example, if my list says
2 4 10 15
and my max pending IO count is 4, then:
All IOs except 2, 4, 10, 15, and those >15 are complete.
2, 4, 10, 15 are posted but not yet complete.
IOs >15 are not yet posted.

If, say, the list is only 2, 4, 10
and the max pending IO count is 4, then:
All IOs except 2, 4, 10 are complete.
2, 4, 10 are posted but not yet complete.
All IOs have been posted.

Wouldn't that be enough?
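
A minimal sketch of that bookkeeping, with one addition that is my assumption and not part of the description above: an explicit `next_to_post` counter, so "not yet posted" does not have to be inferred from the length of the list. The class and method names are hypothetical.

```python
class CopyTracker:
    """Track a windowed copy using only the set of pending IO indices."""

    def __init__(self, total_ios, max_pending):
        self.total_ios = total_ios
        self.max_pending = max_pending
        self.next_to_post = 0      # lowest IO index not yet posted
        self.pending = set()       # posted but not yet complete

    def post(self):
        """Post the next IO if the window has room; return its index or None."""
        if (self.next_to_post >= self.total_ios
                or len(self.pending) >= self.max_pending):
            return None
        index = self.next_to_post
        self.pending.add(index)
        self.next_to_post += 1
        return index

    def complete(self, index):
        """Record a completion; completions may arrive in any order."""
        self.pending.remove(index)

    def is_complete(self, index):
        return index < self.next_to_post and index not in self.pending

    def contiguous_done(self):
        """IOs completed with no gap from the start of the disk."""
        return min(self.pending) if self.pending else self.next_to_post
```

On a failure, `contiguous_done()` times the IO size gives a residual count the caller can rely on. The state kept is proportional to `max_pending`, not to the size of the disk.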

-Srin.

-----Original Message-----
From: Mark Lobo [mailto:xxxxx@yahoo.com]
Sent: Wednesday, August 13, 2003 9:46 AM
To: Windows System Software Developers Interest List
Subject: [ntdev] Re: Disk to disk copy

But it's necessary to keep residual counts for the
operation, so that if it fails we can return an error to
the caller along with how much was transferred. If the
number of outstanding IOs is 10, and a completion
comes in out of order, say IO 3 completes before IO 0,
and we reissue IO 4 using IO 3's structure, we need to
maintain state on too many IOs and their completion
status. Say it looks like this: 0 has not completed, 1
has, 2 has not completed but 3 has, 4 has not
completed but 5 has, and all the completed IO structs
have been reused to issue new IOs. Since the
structures for the completed IOs have been
reused, we need some kind of bitmask that records the
completion status of each IO. For a 10-terabyte disk,
this bitmask can grow quite large if, say,
we are doing 256 KB IOs. If the residual count didn't
matter, i.e. the operation is simply failed back to the
caller when any IO fails, then yes, the IO objects could
just be reused right away without waiting. But in this
case, how does one keep track of residual counts
without keeping too much state AND while using as much
parallelism as possible?

Mark

“Mark Lobo” wrote in message news:xxxxx@ntdev…
>

> For a 10-terabyte disk,
> this bitmask can grow quite large if, say,
> we are doing 256 KB IOs. If the residual count didn't
> matter, i.e. the operation is simply failed back to the
> caller when any IO fails, then yes, the IO objects could
> just be reused right away without waiting. But in this
> case, how does one keep track of residual counts
> without keeping too much state AND while using as much
> parallelism as possible?
>

Just limit how much outstanding I/O (parallelism) you’re willing to
tolerate, which will limit the number of I/O structures you have to maintain
to track I/O errors and residual counts. For example, current SCSI protocol
has a tagged queue limit of 256. That is, a single SCSI target won’t allow
more than 256 pending operations in its own queue, so it probably wins you
nothing to issue more I/O requests than that. Of course, you also have to
decide on a block size to use for copying.

Carl

It comes down to the ratio of IO transfer time to processor overhead per
IO. If (for argument's sake) it takes
1ms of processor time to handle an IO that will take 10ms, then there is no
point in having much more than 10 concurrent IOs, since all the rest will
finish before you can service them, or you will never get a chance to queue
that many.

Now real world numbers are nowhere near as simple as 1 and 10, and you have
to take into account number of processors available for servicing, range of
IO transaction times, variations in processing time with queue depth, and a
bunch of other things. But the basics still hold, even with all that.
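
As a worked version of that rule of thumb (a deliberately naive sketch: the function name is made up, and scaling linearly by processor count is my simplifying assumption, not a measured result):

```python
import math

def useful_queue_depth(io_time_ms, cpu_time_ms, processors=1):
    """Rough number of concurrent IOs beyond which extra depth stops helping."""
    return max(1, math.ceil(io_time_ms / cpu_time_ms * processors))

print(useful_queue_depth(10, 1))  # 10
```

With the 1ms and 10ms figures above this gives 10. Real numbers have to be measured, for the reasons already listed.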

Loren

----- Original Message -----
From: “Carl Appellof”
Newsgroups: ntdev
To: “Windows System Software Developers Interest List”
Sent: Thursday, August 14, 2003 10:07 AM
Subject: [ntdev] Re: Disk to disk copy

> Just limit how much outstanding I/O (parallelism) you’re willing to
> tolerate, which will limit the number of I/O structures you have to maintain
> to track I/O errors and residual counts. For example, current SCSI protocol
> has a tagged queue limit of 256. That is, a single SCSI target won’t allow
> more than 256 pending operations in its own queue, so it probably wins you
> nothing to issue more I/O requests than that. Of course, you also have to
> decide on a block size to use for copying.
>
> Carl