Question on submit URB

Tim Roberts wrote:

> But I have run into another problem.
> When I start any heavy application, the host sometimes does not have time to
> read the data from the device. The device's internal buffer overflows,
> which is unacceptable. The device is an analog-to-digital converter.

Where are you getting the buffers now? Are you waiting for a user-mode
app to send you buffers?
Are you submitting one URB, then processing it and resubmitting?

I don’t submit just one URB. I submit 16 IRPs from my user-mode application, and the driver turns these 16 IRPs into 16 IRP-URBs. When an IRP completes, I submit the next IRP from the application. So I maintain a queue of length 16 between the application and the host controller driver.

I think I lose samples on a heavily loaded system because the driver and the application run in different (PASSIVE_LEVEL) threads. It can happen that the host controller driver has already processed the whole queue of 16 URBs while the application thread has not yet been scheduled. Result: the URB queue for the host controller driver is empty and no ISO transfer is scheduled for the next microframe -> the device buffer overflows.
No one is immune to this situation on a non-real-time OS.
If I change the design and submit the requests directly from the completion routine, will I avoid this situation? Or does this method still not give a 100% guarantee that samples will be read from the device on a very heavily loaded Windows system?

Why do you advise keeping a queue of 8 URBs to the host controller driver? Isn’t a queue of one or two URBs theoretically enough?

You need to “do the math” to match your device’s needs with the interval
for the endpoint. What is the continuous and the peak data rate? What
do you have the isochronous interval set to? If your peak data rate is
no more than 1 MB/second, for example, then an interval of once every 8
microframes should keep up, but you might set it to every 4 microframes
just in case. You almost have to run a simulation to figure out the
worst case. You get a shot to transmit one packet during your isoch
interval. After that, USB won’t talk to you AT ALL until your next
interval. If your FIFO is going to overflow by that time, then you need
to decrease the interval.

My device produces data at 1000*8184 = 8,184,000 bytes/sec.
I set the polling interval in the pipe descriptor to 1 (every microframe), with 1 ISO packet per microframe: 8000*1024 = 8,192,000 bytes/sec.

If the CRC is invalid in an isochronous packet, the packet will be
dropped. As I said, this never happens in real life.

What status will I receive if the CRC is invalid? What value will the “Length” field have if the CRC is invalid?

xxxxx@spiritdsp.com wrote:

I don’t submit just one URB. I submit 16 IRPs from my user-mode application, and the driver turns these 16 IRPs into 16 IRP-URBs. When an IRP completes, I submit the next IRP from the application. So I maintain a queue of length 16 between the application and the host controller driver.

I think I lose samples on a heavily loaded system because the driver and the application run in different (PASSIVE_LEVEL) threads. It can happen that the host controller driver has already processed the whole queue of 16 URBs while the application thread has not yet been scheduled. Result: the URB queue for the host controller driver is empty and no ISO transfer is scheduled for the next microframe -> the device buffer overflows.
No one is immune to this situation on a non-real-time OS.
If I change the design and submit the requests directly from the completion routine, will I avoid this situation? Or does this method still not give a 100% guarantee that samples will be read from the device on a very heavily loaded Windows system?

There are no 100% guarantees, but it’s true that relying on user-mode is
dangerous, for the very reasons you describe. Switching to a circular
buffer managed by the driver, with IRPs resubmitted in the completion
routine, will pretty much ensure that your device FIFO never overflows,
but you still have to worry about what to do when the application gets
behind. It just shifts the problem. In this case, the driver will
overrun its circular buffer. You can decide whether you toss old data,
or toss new data, whichever makes the most sense.
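
To make the "toss old data or toss new data" choice concrete, here is a minimal user-space sketch in Python rather than driver code. The class name `PacketRing` and the `drop_oldest` flag are hypothetical; a real driver would use its own buffer, locking, and IRQL rules, none of which are modelled here.

```python
from collections import deque

class PacketRing:
    """Fixed-capacity packet queue (hypothetical name, illustration only)."""

    def __init__(self, capacity: int, drop_oldest: bool):
        self.capacity = capacity
        self.drop_oldest = drop_oldest   # True: toss old data, False: toss new data
        self.packets = deque()
        self.dropped = 0

    def push(self, packet: bytes) -> None:
        """Called where the completion routine would store a finished packet."""
        if len(self.packets) == self.capacity:
            self.dropped += 1
            if not self.drop_oldest:
                return                   # buffer full: discard the incoming packet
            self.packets.popleft()       # buffer full: discard the oldest packet
        self.packets.append(packet)

    def pop(self):
        """Called when the application finally asks for data."""
        return self.packets.popleft() if self.packets else None

# The application falls behind: five packets arrive, only three fit.
keep_new = PacketRing(3, drop_oldest=True)
keep_old = PacketRing(3, drop_oldest=False)
for i in range(5):
    keep_new.push(bytes([i]))
    keep_old.push(bytes([i]))

print([p[0] for p in keep_new.packets], keep_new.dropped)
print([p[0] for p in keep_old.packets], keep_old.dropped)
```

Either way the same number of packets is lost; the policy only decides which end of the stream has the hole.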

Why do you advise keeping a queue of 8 URBs to the host controller driver? Isn’t a queue of one or two URBs theoretically enough?

One is definitely not enough. If you submit only one URB, then by the
time that URB completes and gets sent back to the driver, the next
microframe has already been scheduled, and you missed it. However, you
are correct that my recommendation of 8 is mostly superstition. Two
should be enough. You just need to guarantee that you always have at
least one URB in the queue at all times, ready to go.
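
The queue-depth argument can be checked with a toy model. This is a sketch under stated assumptions, not a model of the real host controller: time advances in whole microframes, every URB covers `packets_per_urb` microframes, and a completed URB needs `resubmit_delay` extra microframes before it is back in the queue. All names are hypothetical.

```python
def missed_microframes(queue_depth, packets_per_urb, resubmit_delay, total_uframes):
    """Count microframes in which the host controller finds no URB queued."""
    queued = [packets_per_urb] * queue_depth   # microframes left in each queued URB
    returning = []                             # microframes at which URBs re-enter the queue
    missed = 0
    for now in range(total_uframes):
        # Resubmitted URBs that have arrived join the back of the queue.
        arrived = sum(1 for t in returning if t <= now)
        returning = [t for t in returning if t > now]
        queued.extend([packets_per_urb] * arrived)
        if not queued:
            missed += 1                        # nothing scheduled: this slot is lost
            continue
        queued[0] -= 1                         # head URB services this microframe
        if queued[0] == 0:
            queued.pop(0)                      # URB complete, send it back around
            returning.append(now + 1 + resubmit_delay)
    return missed

one_urb = missed_microframes(1, 8, 1, 8000)
two_urbs = missed_microframes(2, 8, 1, 8000)
two_urbs_slow = missed_microframes(2, 8, 9, 8000)
print(one_urb, two_urbs, two_urbs_slow)
```

In this model a single URB misses a slot on every turnaround, two URBs miss nothing as long as the turnaround is shorter than one URB's duration, and two URBs start missing again once the turnaround exceeds that. A deeper queue buys tolerance for longer turnaround times.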

My device produces data at 1000*8184 = 8,184,000 bytes/sec.
I set the polling interval in the pipe descriptor to 1 (every microframe), with 1 ISO packet per microframe: 8000*1024 = 8,192,000 bytes/sec.

That’s pretty tight. Remember that you are not guaranteed that your
interval will occur in exactly the same spot in every microframe. It’s
usually pretty close, but there is no guarantee. As I said, you need to
do some math to figure out how much of a delay you can survive.
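
As a rough illustration of how tight this is, the arithmetic below uses only the figures quoted in the thread. The recovery model is an assumption: it supposes the device can fill every later packet to the full 1024 bytes until its backlog is gone.

```python
MICROFRAMES_PER_SEC = 8000        # one microframe every 125 us
PACKET_BYTES = 1024               # one isochronous packet per microframe
DEVICE_RATE = 1000 * 8184         # bytes/second produced by the ADC

bus_rate = MICROFRAMES_PER_SEC * PACKET_BYTES            # bytes/second the pipe can carry
produced_per_uframe = DEVICE_RATE / MICROFRAMES_PER_SEC  # bytes generated per microframe
spare_per_uframe = PACKET_BYTES - produced_per_uframe    # headroom per microframe

def recovery_time_ms(missed_uframes: int) -> float:
    """Time needed to drain the backlog left by missed microframes."""
    backlog = missed_uframes * produced_per_uframe
    uframes_to_drain = backlog / spare_per_uframe
    return uframes_to_drain * 0.125   # 125 us per microframe

print(bus_rate, produced_per_uframe, spare_per_uframe)
print(recovery_time_ms(1))
```

With 1023 bytes produced and 1024 bytes carried per microframe, the headroom is a single byte per microframe, so one missed slot takes 1023 microframes (roughly 128 ms) to work off, and the device FIFO has to hold the backlog for that whole time.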

What status will I receive if the CRC is invalid? What value will the “Length” field have if the CRC is invalid?

The length field will be zero. The error in the packet array will be
one of the USBD_STATUS errors. I couldn’t tell you which one is used
for CRC, because I’ve never seen one. There is an error code
USBD_STATUS_CRC. Remember that, as long as there is some good data in
the URB, the overall URB status will be success. You have to parse the
packet array to figure out which packets really have data in them, and
how much.
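
The packet-array walk described above might look like the following sketch. It is Python standing in for driver code: `IsoPacket` is a hypothetical stand-in that mirrors the `Offset`, `Length`, and `Status` fields of `USBD_ISO_PACKET_DESCRIPTOR`, and the numeric value given for `USBD_STATUS_CRC` is quoted from memory of `usb.h` and should be checked against your WDK headers.

```python
from dataclasses import dataclass

USBD_STATUS_SUCCESS = 0x00000000
USBD_STATUS_CRC = 0xC0000001   # assumed value; verify in usb.h

@dataclass
class IsoPacket:
    """Hypothetical mirror of one packet descriptor in the URB."""
    offset: int   # where this packet's data starts in the transfer buffer
    length: int   # bytes actually received for this packet
    status: int   # per-packet USBD_STATUS

def collect_good_data(transfer_buffer: bytes, packets):
    """Walk every descriptor; return the good payload and the bad packet indices."""
    payload = bytearray()
    bad = []
    for i, p in enumerate(packets):
        if p.status == USBD_STATUS_SUCCESS:
            payload += transfer_buffer[p.offset:p.offset + p.length]
        else:
            bad.append(i)          # length is zero here, nothing to copy
    return bytes(payload), bad

# Three 4-byte slots: a full packet, a CRC failure, and a short packet.
buf = b"AAAA" + b"\x00" * 4 + b"CC\x00\x00"
pkts = [
    IsoPacket(0, 4, USBD_STATUS_SUCCESS),
    IsoPacket(4, 0, USBD_STATUS_CRC),
    IsoPacket(8, 2, USBD_STATUS_SUCCESS),
]
payload, bad = collect_good_data(buf, pkts)
print(payload, bad)
```

The overall URB status is deliberately not consulted, because it reports success as long as some packet carried good data.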


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Tim Roberts wrote:

There are no 100% guarantees, but it’s true that relying on user-mode is
dangerous, for the very reasons you describe. Switching to a circular
buffer managed by the driver, with IRPs resubmitted in the completion
routine, will pretty much ensure that your device FIFO never overflows,
but you still have to worry about what to do when the application gets
behind.

OK, many thanks, Tim.
I have already implemented “submitting in the completion routine” in my driver, and it gives good results: samples are no longer lost even on a very heavily loaded system. It looks like a good solution.
Thanks!

But I am still interested in the theory: does “submitting in the completion routine” give a 100% guarantee, or only “pretty much” a guarantee?
My assumption: if the host controller driver calls our completion routine (when it calls IoCompleteRequest) at DISPATCH_LEVEL, we have a 100% guarantee. If it calls it at PASSIVE_LEVEL, we have only “pretty much” a guarantee :)

Tim Roberts wrote:

It just shifts the problem. In this case, the driver will
overrun its circular buffer. You can decide whether you toss old data,
or toss new data, whichever makes the most sense.

I know about this problem. For now I plan a “big hack”: allocate 10 megabytes of memory in the application, pass a pointer to this buffer to the driver, lock the buffer in physical memory, and share it with the driver. 10 MB is enough for me. If the buffer overflows, we will restart our application and start processing again.

Tim Roberts wrote:

As I said, you need to do some math to figure out how much of a delay you can survive.

OK, Tim, I will run a simulation.

Tim Roberts wrote:

That’s pretty tight. Remember that you are not guaranteed that your
interval will occur in exactly the same spot in every microframe. It’s
usually pretty close, but there is no guarantee.

Why? I thought the pipe guarantees that with interval=1 the device is polled every microframe. Where can I read about this in the documentation?

Tim Roberts wrote:

The length field will be zero. The error in the packet array will be
one of the USBD_STATUS errors. I couldn’t tell you which one is used
for CRC, because I’ve never seen one. There is an error code
USBD_STATUS_CRC. Remember that, as long as there is some good data in
the URB, the overall URB status will be success. You have to parse the
packet array to figure out which packets really have data in them, and
how much.

OK, I understand. But then how do I find out the number of samples in the bad packet? This is very important, because the current sample number serves as the “time for the device”.
Or is the only solution to add an option to my device: “return data only when we have a complete packet (size equal to 1024)”? Then, if we receive a status like USBD_STATUS_CRC, we will know precisely that the length of the packet was 1024.

xxxxx@spiritdsp.com wrote:

But I am still interested in the theory: does “submitting in the completion routine” give a 100% guarantee, or only “pretty much” a guarantee?
My assumption: if the host controller driver calls our completion routine (when it calls IoCompleteRequest) at DISPATCH_LEVEL, we have a 100% guarantee. If it calls it at PASSIVE_LEVEL, we have only “pretty much” a guarantee :)

There are no 100% guarantees. In the early days of Vista, I had a
machine with an audio driver that sat in a tight CPU loop for 10ms at a
time. When that happens, other interrupts are locked out.

If the rest of your drivers are well-behaved, then you won’t lose data
using this method.

Tim Roberts wrote:

> That’s pretty tight. Remember that you are not guaranteed that your
> interval will occur in exactly the same spot in every microframe. It’s
> usually pretty close, but there is no guarantee.
>

Why? I thought the pipe guarantees that with interval=1 the device is polled every microframe. Where can I read about this in the documentation?

Yes, you get interrogated every microframe, but a microframe is 125us
long. The AVERAGE time between slots is 125us, but if your slot occurs
at the beginning of frame X and the end of frame X+1 and the beginning
of frame X+2, then there might be more than 200us between the first two
slots, and only a couple of us between the next two:

                X          X+1         X+2
          0          125         250         375
          |           |           |           |
IDEAL:     ^^^         ^^^         ^^^
          |           |           |           |
POSSIBLE:  ^^^                 ^^^ ^^^
          |           |           |           |

We have not seen timing issues quite that dramatic, but we have
CERTAINLY seen our isochronous slot vary.
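
To put numbers on the slot jitter, the sketch below computes how many bytes the converter generates while it waits for its next slot. The data rate is the figure quoted earlier in the thread; the gap values are illustrative, not measurements.

```python
DEVICE_RATE = 1000 * 8184   # bytes/second, figure from the thread

def bytes_accumulated(gap_us: float) -> float:
    """Bytes the ADC produces while waiting gap_us for its next slot."""
    return DEVICE_RATE * gap_us / 1_000_000

# Ideal spacing, the 200 us case described above, and two full microframes.
for gap in (125, 200, 250):
    print(gap, bytes_accumulated(gap))
```

A 200us gap leaves more than one 1024-byte packet's worth of data waiting, so the device FIFO has to hold the excess until the following slot, which in the scenario above arrives only a few microseconds later.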

OK, I understand. But then how do I find out the number of samples in the bad packet? This is very important, because the current sample number serves as the “time for the device”.
Or is the only solution to add an option to my device: “return data only when we have a complete packet (size equal to 1024)”? Then, if we receive a status like USBD_STATUS_CRC, we will know precisely that the length of the packet was 1024.

Assuming your device always sends 1024 byte packets, then you will
either get 1024 bytes or 0 bytes. You don’t get a partial packet.
After all, the CRC applies to the whole packet. If the CRC fails, then
the entire packet is suspect, and is discarded.

However, that doesn’t keep you from keeping track of the timing. The
nice thing about isochronous is that you get essentially real-time
information. If an URB has 16 packets, then the result you get back
maps 1:1 to the last 16 intervals.
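
One way to use that 1:1 mapping is sketched below. It assumes one packet per microframe (interval 1) and that the caller already knows the microframe number of the URB's first packet; `first_uframe` and `interval_report` are hypothetical names, and how the starting microframe is obtained is outside this sketch.

```python
MICROFRAME_US = 125

def interval_report(first_uframe, statuses):
    """Pair every packet status with the microframe it belongs to.

    Returns (microframe number, start time in us, ok) for each packet.
    """
    return [
        (first_uframe + i, (first_uframe + i) * MICROFRAME_US, status == 0)
        for i, status in enumerate(statuses)
    ]

# A 4-packet URB starting at microframe 100, with the second packet bad.
report = interval_report(100, [0, 0xC0000001, 0, 0])
for uframe, start_us, ok in report:
    print(uframe, start_us, "ok" if ok else "LOST")
```

Because every packet slot corresponds to exactly one interval, a failed packet still advances the clock by one microframe, so the device time stays aligned even when the payload is gone.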


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.