Need Real transfer speed in USB BULK driver

Hi experts,
A few days ago I posted a thread describing my problem.
Here is the link:
http://www.osronline.com/showThread.cfm?link=114217

At that time I tried many different configurations without success.
I am using a USB 2.0 device and building a high-speed, modem-type device.

Here are my requirements:

  • The device sends two bytes to the host, which tell the host whether to WRITE or READ. Depending on that, the host sends 1070 bytes to the device or reads that many bytes from the device. So I cannot use the overlapped flag, because I cannot know what the next transaction will be.

Now, in this situation, I want 5 Mbps on every transaction. Overall I am getting 150 Mbps, but that is with continuous streaming of 64 KB packets.

My protocol design is different, and for it I want at least 5 Mbps. I mostly get the performance I want, but sometimes I see spikes of 20 ms, and those make me lose data.
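To put these numbers in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes the 5 Mbps target means 5 megabits per second, that one cycle moves the 2-byte command plus the 1070-byte payload, and it ignores all USB protocol overhead. The constant and function names are made up for illustration.

```python
# Rough per-cycle time budget for the protocol described above.
CMD_BYTES = 2                    # command the device sends first
DATA_BYTES = 1070                # payload that follows, in either direction
TARGET_BITS_PER_SEC = 5_000_000  # assumption: the target is 5 megabits/s
SPIKE_MS = 20.0                  # size of the reported delay


def cycle_budget_ms(target_bps=TARGET_BITS_PER_SEC):
    """Time one command+payload cycle may take to sustain target_bps."""
    bits_per_cycle = (CMD_BYTES + DATA_BYTES) * 8
    return bits_per_cycle / target_bps * 1000.0


def cycles_lost(spike_ms=SPIKE_MS):
    """How many cycle budgets fit into one spike."""
    return spike_ms / cycle_budget_ms()


print(f"budget per cycle: {cycle_budget_ms():.3f} ms")
print(f"one {SPIKE_MS:.0f} ms spike costs about {cycles_lost():.1f} cycles")
```

Under those assumptions each cycle has to finish in roughly 1.7 ms, so a single 20 ms gap is worth more than ten cycles.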

Tim and others explained the scheduling effects of the USB bus, and that is true; performance also depends on URB size. But if I want to achieve this goal, what other technique can I use?

I am in a bit of a hurry to solve this problem.
If anyone has questions about my issue, please ask and I will provide a detailed description.

Regards,
Tejas

Tejas Vaghela wrote:

>Here are my requirements: - Device will send two bytes to HOST. That will
>indicate HOST to do WRITE/READ. According to that, HOST will send 1070
>bytes to DEVICE or READ that many bytes from DEVICE. So I cannot use
>the overlapped flag, because I am not sure about the future transaction.

I don’t really comprehend what you’re saying here, but regardless of what bytes are going where (and when), you should just use a KMDF continuous reader for maximum performance, and do any necessary logical buffering inside your driver.

Your main problem seems to be that you’re tied to some synchronous protocol (“device will send 2 bytes, this means host should read 1070 bytes”) – that’s a pretty brain-dead design and is going to limit your performance.

Hi Chris,
I know that my design is very tightly coupled to the protocol design. But the hardware specification is such that I have to do it this way. I suggested changing it, but that was not possible.
Is there any other example you have in mind that I could follow? I agree that our design is somewhat brain-dead, and that is our limitation. But any suggestion is welcome.

For example, would adding an interrupt endpoint do a better job?

My requirement is that the host initially waits for 2 bytes from the device, and based on those the host performs a READ or WRITE transaction. Yes, I am very much tied to a synchronous protocol.

I have KMDF. Where can I find a KMDF continuous reader example? And if you are talking about overlapped I/O, it is not useful for me.

I am getting spikes because of some OS scheduling during USB communication. I just want to reduce that effect on the USB communication. I don’t know how to do that, but I need to solve it ASAP.

Chris,
I hope my issue is clearer to you now.

Regards,
Tejas

Tejas Vaghela wrote:

>My requirement is that the host initially waits for 2 bytes from the device,
>and based on those the host performs a READ or WRITE transaction.
>Yes, I am very much tied to a synchronous protocol.

This doesn’t mean that your driver can’t always be trying to read from the device in the form of a continuous reader.

>I have KMDF. Where can I find a KMDF continuous reader example?
>And if you are talking about overlapped I/O, it is not
>useful for me.

You can look in the OSRFX2USB sample, or others have mentioned some new “USBSAMP” sample, it is probably in there, too.

>I am getting spikes because of some OS scheduling during USB
>communication. I just want to reduce that effect on the USB
>communication. I don’t know how to do that, but I need to solve it ASAP.

“Spikes” is not a well-defined term in the context of USB so I have no idea what you’re talking about.

If you are saying (from your other post) that you are seeing a 20ms delay somewhere during your protocol, well, sorry – Windows is not a real time OS, so if this is breaking what you are trying to do, you will need to change the design.

xxxxx@slscorp.com wrote:

>I know that my design is very tightly coupled to the protocol design. But the hardware specification is such that I have to do it this way. I suggested changing it, but that was not possible.
>Is there any other example you have in mind that I could follow? I agree that our design is somewhat brain-dead, and that is our limitation. But any suggestion is welcome.
>
>For example, would adding an interrupt endpoint do a better job?

The endpoint type won’t change anything about this.

>My requirement is that the host initially waits for 2 bytes from the device, and based on those the host performs a READ or WRITE transaction. Yes, I am very much tied to a synchronous protocol.

And that’s a huge mistake for a USB device.

The solution depends on how the hardware is designed. You can certainly
submit many read requests at a time. That way, if your two-byte thing
says that read data is ready, there will already be a read request
queued up, which will get the data immediately. Remember that a bulk
read will block until data is available. As long as your hardware
doesn’t try to respond to every read request, but only when data is
available, this will work fine.
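A toy model can illustrate why having several reads queued matters. This is only a sketch with made-up numbers, not a measurement: it assumes the device always has data ready, that each transfer occupies the bus for a fixed time, and that an occasional slow resubmission delays only the one request involved. The function names and timing values are hypothetical.

```python
import heapq


def worst_bus_idle_us(depth, n_transfers, xfer_us, resubmit_delay_us):
    """Largest gap (microseconds) during which no read is pending.

    depth             -- number of read requests the host keeps queued
    xfer_us           -- bus time one transfer takes once it starts
    resubmit_delay_us -- function i -> software delay before the request
                         that served transfer i is queued again
    """
    pending = [0] * depth   # min-heap of times at which each read is queued
    bus_free = 0            # time at which the previous transfer finished
    worst = 0
    for i in range(n_transfers):
        queued_at = heapq.heappop(pending)
        start = max(bus_free, queued_at)   # nothing moves until a read is queued
        worst = max(worst, start - bus_free)
        bus_free = start + xfer_us
        heapq.heappush(pending, bus_free + resubmit_delay_us(i))
    return worst


def occasional_hiccup(i):
    # Hypothetical: 50 us normally, 20 ms on every 400th resubmission.
    return 20_000 if i % 400 == 399 else 50


for depth in (1, 2, 4):
    print(depth, worst_bus_idle_us(depth, 1000, 100, occasional_hiccup))
```

With these numbers the model shows the full 20 ms gap when only one read is ever outstanding, and only the ordinary turnaround gap (or none) once a second request is already waiting. A stall that freezes all host software at once would need a correspondingly deeper queue.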

>I am getting spikes because of some OS scheduling during USB communication. I just want to reduce that effect on the USB communication. I don’t know how to do that, but I need to solve it ASAP.

The spikes are not due to scheduling during communication. They happen
because you can’t turn around your requests quickly enough.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Thanks Tim and chris for your valuable inputs,

Tim,

>The spikes are not due to scheduling during communication. They happen
>because you can’t turn around your requests quickly enough.
If that is the case, then why do the delays come at a specific interval? For example, if I repeat my loop 10000 times, I get 22-22 delays of 20 to 22 ms, each after a certain number of transactions, like every 400th transaction or so.

I am using a NIOS II 32-bit processor on the device side. I have also optimized the code there.
You mentioned the overlapped/queue technique, but I cannot do that here. It is a limitation of the real hardware attached to the USB device.

Chris:

>“Spikes” is not a well-defined term in the context of USB so I have no idea what
>you’re talking about.
By spikes I mean the delay I see between consecutive cycles. Here one cycle is one complete read and write transaction. If I measure the timing of all the cycles and plot a graph, I see a 20 ms to 22 ms delay at a specific interval. USB is a scheduled bus, and I think I get the delay when the schedule moves from the USB bus to something else, or perhaps my controller runs dry at that time. But if it is the latter, why does it not happen on every transaction, only on certain ones? I hope my description gives you some idea.

If USB is not a solution, then I would like to use Ethernet or FireWire.
Let me know your suggestions on USB.
Thanks in advance.
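The measurement described here (timing every cycle and looking for the slow ones) could be automated along these lines. This is only a sketch: the 10 ms threshold, the function name, and the synthetic timings are assumptions for illustration, not data from the real device.

```python
def find_spikes(cycle_ms, threshold_ms=10.0):
    """Return the slow cycles and the spacing between them.

    cycle_ms     -- list of measured cycle durations in milliseconds
    threshold_ms -- anything slower than this counts as a spike
    """
    spikes = [(i, t) for i, t in enumerate(cycle_ms) if t > threshold_ms]
    # Number of cycles between one spike and the next.
    spacing = [b[0] - a[0] for a, b in zip(spikes, spikes[1:])]
    return spikes, spacing


# Synthetic example: 1.7 ms cycles with a 21 ms cycle every 400th iteration.
timings = [21.0 if i % 400 == 399 else 1.7 for i in range(1200)]
spikes, spacing = find_spikes(timings)
print(spikes)
print(spacing)
```

If the spacing turns out to be nearly constant in cycles or in wall-clock time, that points toward something periodic (a timer, a buffer filling or draining) rather than random jitter, which may help narrow down where to look.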
Regards,
Tejas

Tejas Vaghela wrote:

>You mentioned the overlapped/queue technique, but I cannot
>do that here. It is a limitation of the real hardware attached
>to the USB device.

Look, what are you talking about? Your hardware has no idea about the mechanism being used on the host side to pend read URBs at the host controller. It only sees IN tokens.

xxxxx@slscorp.com wrote:

>If that is the case, then why do the delays come at a specific interval? For example, if I repeat my loop 10000 times, I get 22-22 delays of 20 to 22 ms, each after a certain number of transactions, like every 400th transaction or so.

Are you writing to disk? Are there other things going on at the time?
Is it possible you’re draining the buffers out of your USB device, and
it takes time for them to refill? There’s certainly nothing inherent in
USB that would cause a 20ms delay.

>I am using a NIOS II 32-bit processor on the device side.

That doesn’t really say very much. Nios is a processor core for Altera
FPGAs; the clocking and buffering are still all up to your Verilog code.

>I have also optimized the code there.
>You mentioned the overlapped/queue technique, but I cannot do that here. It is a limitation of the real hardware attached to the USB device.

Like Chris, I’m unclear why you think your hardware design makes it
impractical to use overlapped I/O.

>By spikes I mean the delay I see between consecutive cycles. Here one
>cycle is one complete read and write transaction. If I measure the
>timing of all the cycles and plot a graph, I see a 20 ms to 22 ms
>delay at a specific interval. USB is a scheduled bus, and I think I
>get the delay when the schedule moves from the USB bus to something
>else, or perhaps my controller runs dry at that time. But if it is the
>latter, why does it not happen on every transaction, only on certain
>ones? I hope my description gives you some idea.

Are you running on an extremely underpowered processor for your task?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi Tim and chris,
Once again thanks for your inputs.

>Are you writing to disk? Are there other things going on at the time?
>Is it possible you’re draining the buffers out of your USB device, and
>it takes time for them to refill? There’s certainly nothing inherent in
>USB that would cause a 20ms delay.

As I said before, my hardware does not want the same data every time. For example, if the PC receives a write command with certain information from the device, it writes a 1070-byte message to the device. I have many different 1070-byte messages, so I cannot queue them up before analyzing the command. In the reverse situation, if the PC gets a read command from the device, it reads 1070 bytes of data from the device.
So in every cycle two transactions happen, with two possibilities:

a) Read from the device, decode, and if it is a write command, write to the device. That means

1 cycle = read device (2 bytes), write device (1070 bytes)

b) Read from the device, decode, and if it is a read command, read from the device. That means

1 cycle = read device (2 bytes), read device (1070 bytes)

I am not writing anything to disk.

>Like Chris, I’m unclear why you think your hardware design makes it
>impractical to use overlapped I/O.
Now I hope you understand why my hardware cannot use an overlapped design: the hardware architecture of defense missile guidance is built that way. As I said before, I am making a modem-type device, with commands and data passing through USB.

As Tim said, a NIOS 32 is not very much. I agree, but if that were the cause, I would see this issue on every transaction, not only after a certain number of transactions. It is possible that some clocking issue arises in the FPGA after a certain number of transactions and causes this, but that is not the typical case.

Anyway, I am investigating on the FPGA side too. If you have any ideas for a new protocol design for my requirements, or for test cases to find the root problem, they are most welcome.

Regards,
Tejas

On Sat, Oct 20, 2007 at 01:05:33AM -0400, xxxxx@slscorp.com wrote:

>>Like Chris, I’m unclear why you think your hardware design makes it
>>impractical to use overlapped I/O.

>Now I hope you understand why my hardware cannot use an
>overlapped design: the hardware architecture of defense missile
>guidance is built that way.

Sorry, but the misunderstanding is on YOUR part, not on ours. When you
submit a USB read request, that request will block until your device is
ready to send the data. Also remember that when your device sends a
packet shorter than the max packet size, the current transfer will be
ended. So, you can queue up a bunch of 1200 byte reads. In case (b)
above, you’ll get the 2-byte notice, and a read request will already be
queued up to receive the 1070 byte packet.

In case (a), you’ll get the 2-byte notice, and the queued up read request
will just wait until more data is available. In the meantime, you can
submit your 1070-byte write request.
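The two cases can be walked through with a small host-side model. It is only a sketch: it assumes the 512-byte bulk packet size of a USB 2.0 high-speed endpoint, and the two command encodings (`CMD_READ`, `CMD_WRITE`) are hypothetical, since the thread does not say what the two bytes contain.

```python
from collections import deque

MAX_PACKET = 512     # bulk max packet size at USB 2.0 high speed
READ_BUF = 1200      # size of each queued read, as suggested above
DATA_LEN = 1070
CMD_READ, CMD_WRITE = b"\x00\x01", b"\x00\x02"   # hypothetical encodings


def split_into_packets(payload):
    """Split one device payload into bus packets; the last one is short."""
    chunks = [payload[i:i + MAX_PACKET] for i in range(0, len(payload), MAX_PACKET)]
    if not chunks or len(chunks[-1]) == MAX_PACKET:
        chunks.append(b"")   # zero-length packet ends an exact multiple
    return chunks


def run_host(device_payloads, queue_depth=4):
    """Return the host's action for each payload the device sends."""
    pending = deque(bytearray() for _ in range(queue_depth))  # reads already queued
    actions = []
    for payload in device_payloads:
        buf = pending.popleft()            # the oldest queued read gets the data
        for pkt in split_into_packets(payload):
            buf += pkt
            assert len(buf) <= READ_BUF
            if len(pkt) < MAX_PACKET:      # a short packet ends this transfer
                break
        pending.append(bytearray())        # queue a fresh read right away
        if bytes(buf) == CMD_WRITE:
            actions.append(("write", DATA_LEN))        # case (a)
        elif bytes(buf) == CMD_READ:
            actions.append(("expect_data", DATA_LEN))  # case (b)
        else:
            actions.append(("data", len(buf)))
    return actions


print([len(p) for p in split_into_packets(bytes(DATA_LEN))])
print(run_host([CMD_READ, bytes(DATA_LEN), CMD_WRITE]))
```

In this model the 2-byte notice and the 1070-byte block each complete one of the already-queued 1200-byte reads, so the host never has to decide in advance which kind of transfer comes next.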

Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi Tim,
I changed my firmware design somewhat and added a cache for faster transactions, but the spike is still there. I debugged the USB line and noticed that during the spikes I am not getting any IN PID, so the hardware has to wait for the IN PID. Why are these delays coming from the host side? Maybe you are right that it is the effect of OS scheduling. Also, I am using blocking transfers, as that is my requirement.

Regards,
Tejas

Tejas Vaghela wrote:

>So the hardware has to wait for the IN PID. Why are these delays
>coming from the host side?

We already told you the answer. You are not turning around your read URBs fast enough, and this is because you don’t have multiple read URBs pended at the host controller.

This, in turn, is because you have a fundamental misunderstanding of how to separate your “requirements” from the actual implementation.

Frankly, I’m terrified that you have been tasked to do this for a missile guidance system. Which country’s missiles are you designing? Hopefully not my country’s.

xxxxx@slscorp.com wrote:

>I changed my firmware design somewhat and added a cache for faster transactions, but the spike is still there. I debugged the USB line and noticed that during the spikes I am not getting any IN PID, so the hardware has to wait for the IN PID. Why are these delays coming from the host side? Maybe you are right that it is the effect of OS scheduling. Also, I am using blocking transfers, as that is my requirement.

Lack of an IN token means that your driver doesn’t have a request queued
up. The host controller won’t send an IN token until it has a request
waiting.

I told you how to use non-blocking transfers with your architecture.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi Chris,
Once you have hardware with a specific requirement, you cannot do anything in the driver. The driver is not the heart of any hardware operation; it can only provide functionality. The overlapped flag is not a solution for me.

Anyway, thanks for your compliments. I found this IN PID behaviour on different hardware. Until now I had not seen this type of issue when debugging hardware with a bus analyzer, so I could not be sure it was an issue.

Tim,
Thanks for your help.

Regards,
Tejas