KMDF high speed usb minimal latency

Well, I guess that one advantage of having a kernel driver is that I could force it to behave like an interrupt pipe, even if it’s bulk, by using that periodic request function, which would just put a read request at the start of every microframe if I set its frequency to the highest one.

Nope. You can’t. Tim already explained it before and said you should
study how URBs work. Please do.

You can resubmit the request immediately when the previous one completes
(and that is the standard technique), but you won’t gain anything over a
user-mode app, which can do the same.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@rajko.info
Sent: Thursday, June 24, 2010 4:26 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] KMDF high speed usb minimal latency


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online
at http://www.osronline.com/page.cfm?name=ListServer

wrote in message news:xxxxx@ntdev…
> Well, I guess that one advantage of having a kernel driver is that I could
> force it to behave like an interrupt pipe, even if it’s bulk, by using
> that periodic request function, which would just put a read request at the
> start of every microframe if I set its frequency to the highest one.
>

Well, one can write a custom kernel driver for the EHCI and do whatever
the Linux driver does, or whatever else. But one probably won’t want to go
that far.
- pa

I was talking about the KMDF continuous reader (WDF_USB_CONTINUOUS_READER_CONFIG).

xxxxx@rajko.info wrote:

> Well, I guess that one advantage of having a kernel driver is that I could
> force it to behave like an interrupt pipe, even if it’s bulk, by using
> that periodic request function, which would just put a read request at the
> start of every microframe if I set its frequency to the highest one.

I’m not sure what you mean by “periodic request function”. The FT2232H
uses bulk pipes (the wisdom of which can be debated elsewhere).
Assuming that it does not respond until there is actually data to be
transmitted, a bulk read will retry CONTINUOUSLY until the device
responds with data, or with a zero-length packet. The bulk retry rate
is significantly faster than once every microframe, and indeed this
characteristic of bulk pipes causes havoc with power management, and is
one of the things that is modified in USB 3.0.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Which works as I wrote before. When a request completes, the continuous
reader automatically submits a new one. It won’t help you achieve what
you want; it is just a tool to save some manual work. The performance
is the same, and it doesn’t matter whether you resubmit the request from
an app or from a driver.

No, the only way is to queue a request *before* the device has data
available. That way the device responds to the first request at the USB
level and you have the data immediately. However, if you can only send data
after reading and processing data from the device, you’re out of luck.
As was already stated, your protocol is not suitable for really fast USB
transfers.
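
To make the timing argument concrete, here is a minimal sketch in Python (a toy model, not driver code) of when the host sees the data, depending on whether the read was queued before or after the device had data. The function name and the `submit_overhead_us` parameter are assumptions for illustration; real values depend on the host stack and the controller.

```python
def read_completion_us(data_ready_us, read_submitted_us, submit_overhead_us):
    """Toy model: time at which a read completes on the host side."""
    # The read is only visible to the device once it has travelled down
    # the stack and reached the host controller.
    read_active_us = read_submitted_us + submit_overhead_us
    # Data moves as soon as both the data and an active read exist.
    return max(data_ready_us, read_active_us)


# Read queued well in advance: data arrives as soon as the device has it.
prequeued = read_completion_us(data_ready_us=100, read_submitted_us=0,
                               submit_overhead_us=50)
# Read submitted only once the data is already there: overhead is added.
late = read_completion_us(data_ready_us=100, read_submitted_us=100,
                          submit_overhead_us=50)
print(prequeued, late)
```

The model has no term for who resubmits the request, which mirrors the point above: what matters is when the read is queued.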

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@rajko.info
Sent: Thursday, June 24, 2010 6:30 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] KMDF high speed usb minimal latency


On bulk pipes the host issues requests even more than once per microframe,
so it should be possible to get lower latencies with bulk.
You must submit the buffers in advance, but you can copy the data into
these buffers later on if, for example, your USB device consumes the data
at a constant rate. In that case you know which buffer has been sent and
which buffers still remain in the queue. That way you get the lowest
possible latency.

/Uwe


Hi Uwe

If the USB device is consuming data from the host at a constant rate,
presumably it has to refuse writes with a NYET packet to throttle the host,
which tries to write one buffer per microframe. Does the host then wait until
the next 1 ms frame before resubmitting (which gets you back to the
isochronous latencies), or does it retry in a later microframe within the
frame?

Alternatively, I suppose the USB device could accept every packet and discard
those which are obviously “empty” of real data, preventing any retry delay
but using a lot of bandwidth. (Great if you can dedicate a whole USB bus.)

But if, say, a disk access starts monopolising the bus for a period, the host
will send packets more slowly. Is there a guaranteed way of knowing, to
microframe accuracy, which packets are just about to be sent? I have had
problems in the past getting info from the USB driver that is not 1 ms out of
date, meaning it is difficult to know exactly which packet is just about to
be sent. I have not looked at it recently, so this might now be easier. Any
thoughts?

thx Mike

>>>>>>>>----- Original Message -----
From: uwekirst
To: Windows System Software Devs Interest List
Sent: Friday, June 25, 2010 9:02 AM
Subject: Re: [ntdev] KMDF high speed usb minimal latency


Mike Kemp wrote:

> Hi Uwe
>
> If the USB device is consuming data from the host at a constant rate,
> presumably it has to refuse writes with a NYET packet to throttle the
> host, which tries to write one buffer per microframe. Does the host
> then wait until the next 1 ms frame before resubmitting (which gets you
> back to the isochronous latencies), or does it retry in a later
> microframe within the frame?

or NAK.

> Alternatively, I suppose the USB device could accept every packet and
> discard those which are obviously “empty” of real data, preventing any
> retry delay but using a lot of bandwidth. (Great if you can dedicate
> a whole USB bus.)

OK, you had better use interrupt transfers instead of bulk if you worry
about priorities.
For interrupt EPs I can definitively answer your question above: the retry
starts in the next microframe, so 125 us later (bInterval = 1).
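
As a rough sketch of that arithmetic in Python: for high-speed interrupt endpoints, the USB 2.0 specification defines the polling period as 2^(bInterval-1) microframes of 125 us each, with bInterval in the range 1 to 16. The function name is made up for illustration.

```python
MICROFRAME_US = 125  # length of one high-speed microframe


def hs_interrupt_period_us(b_interval):
    """Polling period of a high-speed interrupt endpoint, in microseconds."""
    if not 1 <= b_interval <= 16:
        raise ValueError("bInterval must be in 1..16 for high-speed endpoints")
    return 2 ** (b_interval - 1) * MICROFRAME_US


print(hs_interrupt_period_us(1))  # one microframe
print(hs_interrupt_period_us(4))  # eight microframes, i.e. one full frame
```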

> But if, say, a disk access starts monopolising the bus for a period, the
> host will send packets more slowly. Is there a guaranteed way of knowing,
> to microframe accuracy, which packets are just about to be sent? I have
> had problems in the past getting info from the USB driver that is not
> 1 ms out of date, meaning it is difficult to know exactly which packet
> is just about to be sent. I have not looked at it recently, so this
> might now be easier. Any thoughts?
>
> thx Mike

You could use an IN EP to signal which packets have been received by
the device.

thanks,
/Uwe

Mike Kemp wrote:

> If the USB device is consuming data from the host at a constant rate,
> presumably it has to refuse writes with a NYET packet to throttle the host,
> which tries to write one buffer per microframe. Does the host then wait
> until the next 1 ms frame before resubmitting (which gets you back to the
> isochronous latencies), or does it retry in a later microframe within the
> frame?

With isochronous and interrupt pipes, you get exactly one shot in each
interval, as defined by the endpoint descriptor, whether or not the
device actually accepts the data. If the device refuses that shot, the
host controller will not try again until the next interval. (Actually,
for isochronous pipes, the controller doesn’t retry at all. If a packet
is refused, it’s lost, and the controller moves on to the next packet in
the request.)

With bulk pipes, the host controller keeps retrying continuously. It’s
like an annoying little toddler saying “Mommy, mommy, mommy, mommy,
mommy, mommy…”
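
The difference can be sketched with a toy Python model, assuming a device that NAKs every attempt until its data is ready at `ready_us`. The function names and the `retry_gap_us` value are hypothetical; the real spacing of bulk retries depends on the host controller and on other bus traffic.

```python
def next_attempt_at_or_after(ready_us, period_us):
    """First attempt time (a multiple of period_us) not earlier than ready_us."""
    return -(-ready_us // period_us) * period_us  # integer ceiling division


def interrupt_completion_us(ready_us, interval_us=125):
    # One attempt per service interval; a refused attempt waits for the next.
    return next_attempt_at_or_after(ready_us, interval_us)


def bulk_completion_us(ready_us, retry_gap_us=10):
    # Retries follow each other closely; retry_gap_us is an assumed spacing.
    return next_attempt_at_or_after(ready_us, retry_gap_us)


print(interrupt_completion_us(130))  # waits for the next interval boundary
print(bulk_completion_us(130))       # picked up by one of the frequent retries
```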

> But if, say, a disk access starts monopolising the bus for a period, the
> host will send packets more slowly.

I’m not sure what you’re trying to say here. USB is an entirely
host-driven bus. Devices cannot do a single thing on their own. A
device cannot send until the host gives it a chance to send. So, there
is no such thing as “monopolizing the bus”. If a USB mass storage
driver starts sending a multimegabyte transfer on a bulk pipe, that has
absolutely zero impact on isochronous and interrupt pipes, because those
have bandwidth reserved for them. The HCD will not schedule bulk
transactions during those periods. If you have two devices doing huge
bulk transfers, the host controller will interleave the two to make sure
they’re each getting an equal chance.
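
A toy Python model of that behaviour, under loud assumptions: periodic traffic is simply subtracted from a per-microframe byte budget, and the remaining budget is handed out to bulk endpoints in round-robin order. This is not how a real host controller lays out its schedule; the function name, device names, and byte counts are all made up for illustration.

```python
from collections import deque


def bulk_order_in_microframe(capacity_bytes, periodic_bytes, bulk_pending,
                             packet_bytes=512):
    """Toy model: order in which bulk packets are sent in one microframe.

    bulk_pending maps a (hypothetical) device name to its queued packet count.
    """
    # Reserved periodic (isochronous/interrupt) traffic comes off the top.
    remaining = capacity_bytes - periodic_bytes
    pending = dict(bulk_pending)
    ring = deque(name for name, count in pending.items() if count > 0)
    order = []
    while ring and remaining >= packet_bytes:
        name = ring.popleft()
        order.append(name)
        pending[name] -= 1
        remaining -= packet_bytes
        if pending[name] > 0:
            ring.append(name)  # back of the line: round-robin between devices
    return order


order = bulk_order_in_microframe(
    capacity_bytes=4096, periodic_bytes=1024,
    bulk_pending={"disk": 10, "ftdi": 1},
)
print(order)
```

In this model the device with a single queued packet is served second, even though the other device has ten packets waiting, and the periodic reservation is untouched by either.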

Plus, the USB data rate is slow enough that even a fully saturated USB
bus doesn’t materially impact system operations.

> Is there a guaranteed way of knowing, to microframe accuracy, which
> packets are just about to be sent?

No. The hardware is processing that list of requests on its own,
without the involvement of the host.

Look, it sounds like you need a dedicated pair of wires, like an RS232
port. Instead of trying to squeeze USB to become something that it
isn’t, why don’t you just get yourself a PCI serial port controller?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

uwekirst wrote:

> On bulk pipes the host issues requests even more than once per microframe,
> so it should be possible to get lower latencies with bulk.

I don’t see how that’s relevant here.

> You must submit the buffers in advance, but you can copy the data into
> these buffers later on if, for example, your USB device consumes the data
> at a constant rate. In that case you know which buffer has been sent and
> which buffers still remain in the queue. That way you get the lowest
> possible latency.

Whether or not this is possible in practice (and color me dubious), this
is not the kind of latency he’s worried about. With his device, he has
to send a write command to generate data, then read the results, then
write a command to generate more data, then read the results, etc. I
don’t think your magic helps with this.
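
A back-of-the-envelope sketch in Python of why a strict command/response protocol is bounded by round-trip time rather than by bus speed. The 250 us round trip and the 64-byte payload are made-up numbers for illustration, not measurements of any device.

```python
def exchanges_per_second(round_trip_us):
    """Each write/read pair must finish before the next one can start."""
    return 1_000_000 / round_trip_us


def max_throughput_bytes_per_s(payload_bytes, round_trip_us):
    """Upper bound on payload throughput for a strictly serial protocol."""
    return payload_bytes * exchanges_per_second(round_trip_us)


rate = exchanges_per_second(250)
throughput = max_throughput_bytes_per_s(64, 250)
print(rate, throughput)
```

Queuing more buffers does not raise this bound, because the next command cannot be written until the previous result has been read.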


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi Tim

Thanks for your input on this.

> With isochronous and interrupt pipes, you get exactly one shot in each
> interval, as defined by the endpoint descriptor, whether or not the
> device actually accepts the data. If the device refuses that shot, the
> host controller will not try again until the next interval.

I think this is why Uwe suggested using multiple bulk transfers, queued so
that there is at least one per microframe.

> (Actually, for isochronous pipes, the controller doesn’t retry at all.
> If a packet is refused, it’s lost, and the controller moves on to the
> next packet in the request.)

And of course, there is no acknowledgement phase, so you can’t refuse it.

> I’m not sure what you’re trying to say here. USB is an entirely
> host-driven bus. Devices cannot do a single thing on their own. A
> device cannot send until the host gives it a chance to send. So, there
> is no such thing as “monopolizing the bus”. If a USB mass storage
> driver starts sending a multimegabyte transfer on a bulk pipe, that has
> absolutely zero impact on isochronous and interrupt pipes, because those
> have bandwidth reserved for them.

Yes, got that. My query was about how these competing transfers would affect
Uwe’s scheme of ditching isoch in favour of bulk for lower latency.

> If you have two devices doing huge bulk transfers, the host controller
> will interleave the two to make sure they’re each getting an equal chance.

This is good news as it suggests that the bulk method will not stall when
other traffic competes, especially if we are pretty sure of getting at least
one buffer per microframe.

I assume the bulk solution could ultimately fail when the bus gets near
saturation, but that’s no worse than any “pseudo” realtime application
running under Windows. What many of us (e.g. in audio editing) want is a
solution that works (a) as well as Windows permits, and (b) slightly better
than the competition!

> No. The hardware is processing that list of requests on its own,
> without the involvement of the host.

Pity that it doesn’t flag the buffer status in real time too, as the killer
in all this is that we are not told which buffers have been sent until the
slow (worse than 1 ms) update of the control flags. This means that we have
to resort to all sorts of tricks to estimate which buffer is just about
to be sent.

In fact, my post was an attempt to get this information from the USB stack
implementors: what info can be got from the interface to determine with
maximum accuracy which queued bulk buffer is about to be sent?

> Look, it sounds like you need a dedicated pair of wires, like an RS232
> port. Instead of trying to squeeze USB to become something that it
> isn’t, why don’t you just get yourself a PCI serial port controller?

Mainly because users want to use external buses and not have to
take the lid off (and RS232 is not fast enough, so a dedicated PCI device is
the alternative).

BTW, this may be slightly off topic from the OP, but it is a “low latency
over USB” issue that must be of interest. I’ve only mentioned tx, but this
also applies to rx data. Many of us are interested in very low latency
round trip times.

Best, Mike

>>>>>>>>>>>>>>>>>>>>
----- Original Message -----
From: Tim Roberts
To: Windows System Software Devs Interest List
Sent: Friday, June 25, 2010 6:07 PM
Subject: Re: [ntdev] KMDF high speed usb minimal latency
