Er, not quite. Tagged queueing does not in general specify that
things complete in order. What the tags say is "complete other
outstanding operations before doing this one" (the ordered tag) or
"reorder this as you please" (the simple tag), for the common ones.
They may be rarely used in NT; VMS uses them when the device allows.
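Concretely, the SCSI-2 queue tag messages are a two-byte pair, the tag
type followed by the tag value; a minimal sketch:

    /* SCSI-2 queue tag messages: a two-byte message, the tag type
     * followed by the tag value that identifies the command. */
    #define MSG_SIMPLE_QUEUE_TAG   0x20  /* "reorder this as you please" */
    #define MSG_HEAD_OF_QUEUE_TAG  0x21  /* run ahead of other queued commands */
    #define MSG_ORDERED_QUEUE_TAG  0x22  /* finish outstanding commands first */

    static void build_tag_msg(unsigned char msg[2], unsigned char type,
                              unsigned char tag)
    {
        msg[0] = type;  /* one of the message codes above */
        msg[1] = tag;   /* identifies this command inside the device */
    }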
The value of tagged queueing shows up when you hit some error
condition or cluster state transition where it becomes necessary
to ensure that you don't send a new write to a device before all
write commands already queued inside the device have finished.
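As an illustration only (struct scsi_dev, alloc_tag and queue_scsi_command
are hypothetical helpers, not any real NT or VMS interface), sending the
post-transition write with an ordered tag gives you that guarantee on a
device that implements the tags correctly:

    struct scsi_dev;                     /* hypothetical per-device state */
    struct io_request;                   /* hypothetical pending write */
    extern unsigned char alloc_tag(struct scsi_dev *);
    extern int queue_scsi_command(struct scsi_dev *, struct io_request *,
                                  const unsigned char msg[2]);

    /* After an error or cluster state transition, make sure no new write
     * overtakes writes already queued inside the device: send the next
     * write with an ORDERED queue tag, which the device may not start
     * until every earlier queued command has completed. */
    int issue_barrier_write(struct scsi_dev *dev, struct io_request *req)
    {
        unsigned char msg[2];
        msg[0] = 0x22;            /* ORDERED QUEUE TAG message code */
        msg[1] = alloc_tag(dev);  /* tag value in the 0-255 range */
        return queue_scsi_command(dev, req, msg);
    }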
That said, it should also be mentioned that in practice firmware
writers often seem not to understand tagged queueing, and have been
known to write firmware in which all tags are handled the same way,
as "reorder at will". There has to be a tag value that identifies the
I/O operation inside the device; I believe the legal range is 0-255.
The tag modifiers are part of the same functional suite, however.
There is a bit in the standard INQUIRY data (the CmdQue bit) which
indicates whether the device supports tagged queueing or not, and a
bit in the IDENTIFY message specifies whether disconnect is permitted;
but if tagged queueing is not supported, the device can have only one
operation pending inside it at a time...no more. Unfortunately devices
often lie about whether they can in fact support tagged queueing.

Support for this function in driver software is non-trivial, since when
errors occur there are prescribed steps you must take to ensure the
device has finished the commands queued inside it. It makes error
handling more complex, makes read-after-write more complex if your OS
bothers supporting it, and makes cancelling I/O more complicated, since
some operations are in general inside the device itself, not just
waiting in nonpaged pool to be dispatched.

Where a device supports it, though, the feature can produce excellent
throughput and cover a multitude of sins. A device that has enough
commands queued internally to cover an entire track can, for example,
transfer the whole track in one rotation rather than handling the
commands in the order they arrived, if the tags allow (and there are
several more options I have not mentioned). The device can also schedule
operations from several initiators (processors) if your OS supports this.
That kind of shared-access support is needed in VMS, not NT, as part of
a shared-I/O clustering model. If your system does not share devices, but
just serves them (or, worse, uses reserve/release to bracket operations),
some of the potential TCQ (tagged command queueing) benefits will not
be available.
There is some benefit though. A device that CORRECTLY supports TCQ
can have many commands inside it at a time and can internally
reorder them to gain performance. A device that does not must handle
commands one at a time, in the order given. It is typically messy to
try to replicate the TCQ behavior in software layers above the disk,
though it could be done; a rough sketch follows.
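To make the point concrete: a software approximation would look
something like sorting the pending queue by starting block so requests
are issued in roughly ascending disk order. This is only a sketch with
a made-up request structure, not anything NT or VMS actually does:

    /* Sketch only: sort a software queue of pending requests by starting
     * LBA so they are issued in roughly ascending disk order.  This is a
     * crude stand-in for what a TCQ-capable device does internally. */
    #include <stdlib.h>

    struct pending_io {
        unsigned long lba;      /* starting logical block address */
        /* ... buffer, length, completion routine, etc. ... */
    };

    static int by_lba(const void *a, const void *b)
    {
        const struct pending_io *x = a, *y = b;
        return (x->lba > y->lba) - (x->lba < y->lba);
    }

    void sort_queue(struct pending_io *q, size_t n)
    {
        qsort(q, n, sizeof(*q), by_lba);
    }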
TCQ is mis-implemented so often that some OSs (BSD in particular) default
it to off and enable it only on devices where it is known to work right.
Unfortunately the CmdQue bit in the INQUIRY data cannot be relied upon.
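For reference, and bearing in mind the warning above that the bit cannot
be trusted, the capability bit lives in the standard INQUIRY data rather
than in the IDENTIFY message; a minimal sketch of both bits, assuming the
SCSI-2 layouts:

    /* SCSI-2: byte 7, bit 1 of the standard INQUIRY data is CmdQue,
     * the device's claim that it supports tagged command queueing.
     * Treat it as a hint only; plenty of devices lie. */
    #define INQ_CMDQUE  0x02

    int claims_tcq(const unsigned char *inq, unsigned len)
    {
        return len >= 8 && (inq[7] & INQ_CMDQUE) != 0;
    }

    /* The IDENTIFY message carries the disconnect-permitted bit:
     * bit 7 = Identify, bit 6 = DiscPriv, bits 2-0 = LUN. */
    unsigned char identify_msg(unsigned lun, int allow_disconnect)
    {
        return 0x80 | (allow_disconnect ? 0x40 : 0x00) | (lun & 0x07);
    }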
-----Original Message-----
From: Paul Bunn [mailto:xxxxx@UltraBac.com]
Sent: Monday, October 30, 2000 9:01 PM
To: NT Developers Interest List
Subject: [ntdev] RE: SCSI chips performance
There are “queued” commands and there are “tagged queued” commands.
Now, I strongly suggest double-checking anything I say here, coz I’m a
bit rusty on this stuff.
NT/SCSI makes use of queued commands with pretty much all SCSI
controllers (a prerequisite is that disconnections have to be enabled
for the relevant devices, of course).
Tagged-queued commands are a very rarely used feature where you can
group a bunch of queued commands so that they’re completed in a certain
order (like grouping sets of commands together). I think this can also
be done to prioritize one group of commands over another. Since it is
possible for a SCSI device to complete commands in a different order
than the one in which they were sent, I believe it modifies this
behavior. I don’t think any version of NT uses tagged-queued commands
at all. It can be confusing, though, since “tagged-queued” commands and
“queued” commands are used interchangeably, because a bunch of commands
all with the same “tag” is the same as no tag at all.
The LSI chipsets (LSI now owns the NCR/Symbios SCSI controller line)
are the nicest IMHO. They have very intelligent controllers able to
support “scripts” in hardware that the drivers download; they’re really
off-board CPUs in their own right. The 64-bit U3 controllers such as
those made by www.tekram.com are a good implementation of such a
chipset. It used to be common that a CPU was used to handle the SCSI
bus phases, and the SCSI controller chip was really just a set of
intelligent line drivers. I’ve written microcode to handle such phase
transitions on a Z80. Thank goodness SCSI controllers have come a long
way since then.
The example code in the DDK for SCSI miniport drivers is woefully
inadequate: most (all?) of the samples are for ISA/EISA implementations,
and therefore they are really not representative of typical SCSI drivers
in use today. Anyway, I think the overhead of a few extra interrupts per
SCSI I/O was negligible back in the day when sustaining 3MB/sec was
considered fast.
Regards,
Paul Bunn, UltraBac.com, 425-644-6000
Microsoft MVP - WindowsNT/2000
http://www.ultrabac.com
-----Original Message-----
From: Maxim S. Shatskih [mailto:xxxxx@storagecraft.com]
Sent: Monday, October 30, 2000 3:29 PM
To: NT Developers Interest List
Subject: [ntdev] SCSI chips performance
Hi all,
the DDK has 2 samples - AHA174x and NCR53C9X.
According to the source code:
- the Adaptec chip has a hardware FIFO of SCSI commands, drives the bus
in hardware, but does not support tagged queueing.
- the NCR/Emulex chip is more stupid, cannot drive the bus without the
aid of software (SCSI bus changes are reported via interrupts), but
supports tagged queueing.
Is there any data on what approach is better and what chip is faster?