Communication with USB3 device stops working when device sends more data than expected

We have a USB3 device that does asynchronous IO.
We provide different sized buffers in sequence (sizeA-sizeB-sizeC-sizeA-sizeB-sizeC-sizeA-sizeB-sizeC, etc).

Occasionally things get de-synchronized on the device, and an attempt is made to send a larger amount of data than the size of the buffer provided. When this happens, every single time, the data transfer fails and never works again unless the device is power cycled.

All subsequent attempts to retrieve data from the device stall indefinitely, and the host never sends anything across the USB bus.

My understanding is that in this case, the controller should report a babble error, and things should fail gracefully. We’ve examined the host controller (third party) event rings and we never see a babble error happen.

What responsibilities does a function driver have for handling this case? When this happens, the data retrieval request completion routine fires and the status code that comes back is C0000001, which is a generic failure. I would have thought that it would be the host controller driver’s responsibility to account for the babble error and keep things running smoothly.

Is there something we can do in the function driver to keep this bad state from happening?

xxxxx@gmail.com wrote:

We have a USB3 device that does asynchronous IO.
We provide different sized buffers in sequence (sizeA-sizeB-sizeC-sizeA-sizeB-sizeC-sizeA-sizeB-sizeC, etc).

Occasionally things get de-synchronized on the device, and an attempt is made to send a larger amount of data than the size of the buffer provided. When this happens, every single time, the data transfer fails and never works again unless the device is power cycled.

That’s possible. Depending on where the babble occurs, it can cause the
hub to lose track of the end of the frame, and the hub will shut down
the port.

What responsibilities does a function driver have for handling this case? When this happens, the data retrieval request completion routine fires and the status code that comes back is C0000001, which is a generic failure. I would have thought that it would be the host controller driver’s responsibility to account for the babble error and keep things running smoothly.

Is there something we can do in the function driver to keep this bad state from happening?

Absolutely, there is. When you make a read request, ALWAYS make your
buffer an exact multiple of the maximum packet size. Always. If you are
expecting 319 bytes, don’t ask for 319 bytes. Ask for 512 bytes. When
the device sends a packet less than the maximum size, the transfer is
completed.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Tim Roberts wrote:

Absolutely, there is. When you make a read request, ALWAYS make your
buffer an exact multiple of the maximum packet size. Always. If you are
expecting 319 bytes, don’t ask for 319 bytes. Ask for 512 bytes. When
the device sends a packet less than the maximum size, the transfer is
completed.

Whoops, you said USB 3, so the maximum bulk packet size would be 1024
bytes, not 512. Ask for 1024 bytes.
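
To make the rounding rule concrete, here is a minimal sketch in C. The helper name round_up_to_packet and its parameters are hypothetical, not part of any driver API; max_packet is assumed to be the endpoint's maximum packet size as read from its descriptor (1024 for a SuperSpeed bulk endpoint). It does not guard against size_t overflow for very large requests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round `expected` up to the next whole multiple of
 * `max_packet`, so the final packet slot in the buffer is always large
 * enough for a full-sized packet. */
static size_t round_up_to_packet(size_t expected, size_t max_packet)
{
    assert(max_packet != 0);

    if (expected == 0)
        return max_packet;          /* still leave room for one full packet */

    /* Number of packets needed to hold `expected` bytes, rounded up. */
    size_t packets = (expected + max_packet - 1) / max_packet;
    return packets * max_packet;
}
```

With this sketch, a 319-byte read on a 1024-byte endpoint would be issued with a 1024-byte buffer, and the driver would then look at the actual transferred length reported on completion to find out how much data arrived.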


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi Tim,

Thank you for your advice. We will certainly look into this.

I was wondering if you could elaborate on why this is the case? Why exactly is making our buffers a multiple of the max packet size going to help stop the host from losing track of the end of the frame?

Is this “limitation” described in a specification somewhere?

Thanks!

Richard

xxxxx@gmail.com wrote:

I was wondering if you could elaborate on why this is the case? Why exactly is making our buffers a multiple of the max packet size going to help stop the host from losing track of the end of the frame?

Certainly.

The key point is that a USB device is never told how many bytes to
send. That’s simply not part of the protocol. The device is merely
given a token that says “GO!”. It is legally entitled to send any
amount up to the maximum packet size for that endpoint.

But when the host controller driver lays out the schedule for the next
microframe, it trusts you. If you have a 1024-byte endpoint, but you
only hand it a 256-byte buffer, then the scheduler might very well
decide to squeeze in your transfer 256 bytes from the end of the frame.
The host controller hardware will send the “IN” token at exactly that
time. If your device decides to send 1024 bytes, your transfer will
overlap the end of the frame and the start of the next frame. Your
traffic will collide with the host controller’s attempt to send the
start-of-frame token to begin the next frame, and the protocol comes
crashing down in a melted pile of electrons.
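
The buffer-size side of this can be sketched as a toy model in C. It is only an illustration of the rule described above, not how any real host controller is implemented; every name in it (xfer_result, model_in_transfer) is hypothetical. The device picks each packet length on its own, up to max_packet, and the transfer ends on a short packet, on a full buffer, or on a packet that does not fit.

```c
#include <stddef.h>

/* Hypothetical outcomes for one modeled IN transfer. */
enum xfer_result {
    XFER_SHORT_PACKET,  /* device sent less than max_packet: normal end */
    XFER_BUFFER_FULL,   /* buffer filled exactly: normal end            */
    XFER_OVERRUN,       /* device sent more than the space remaining    */
    XFER_PENDING        /* ran out of packets before the transfer ended */
};

/* `packets` holds the lengths the device chooses to send, each assumed to
 * be at most `max_packet`. The host never tells the device how many bytes
 * to send, so the only protection is the space left in the buffer. */
static enum xfer_result model_in_transfer(const size_t *packets, size_t count,
                                          size_t max_packet, size_t buffer_len,
                                          size_t *received)
{
    size_t used = 0;
    enum xfer_result result = XFER_PENDING;

    for (size_t i = 0; i < count; i++) {
        size_t len = packets[i];

        if (len > buffer_len - used) {   /* packet does not fit */
            result = XFER_OVERRUN;
            break;
        }
        used += len;

        if (len < max_packet) {          /* short packet ends the transfer */
            result = XFER_SHORT_PACKET;
            break;
        }
        if (used == buffer_len) {        /* buffer exactly full */
            result = XFER_BUFFER_FULL;
            break;
        }
    }

    *received = used;
    return result;
}
```

In this model a 256-byte buffer on a 1024-byte endpoint overruns as soon as the device sends one full packet, while a buffer that is a whole multiple of max_packet can never overrun, because every packet slot has room for a maximum-sized packet.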

Is this “limitation” described in a specification somewhere?

Section 8.7.4 of the USB 2.0 spec talks about babble, and mentions that
the hub is required to shut down any port with a device that babbles.
The rest of it is just a side effect of the scheduled and frame-based
nature of USB.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Thank you for the great answer! We’re currently working this requirement into our system and have seen some preliminary success. I’ll update again when we complete testing.