Multiple URBs in a single IRP in USB

Hi All,
We developed a USB 2.0 high-speed device controller and also wrote a custom driver for it based on the WINDDK. Everything is working fine and we are getting the expected results in every project.

Now I have a question about how to boost the speed of USB bulk communication. Currently I am getting 100 Mbps on BULK IN and BULK OUT in the worst case, and the maximum I have achieved is 190 Mbps in bulk mode. But all of this uses a single URB.

I mean that when I transfer a buffer from the host, I have to wait for it to complete before I start the next one. So in the driver only one URB is active at a time.

Now I am streaming 64 KB from the host 10000 times, and I got a very surprising result. At certain times I see spikes of some milliseconds throughout the stream. This suggests that while the OS scheduler is servicing the USB bus, execution is fine, but once it moves on to something else I get spikes of 50-60 ms.

As I said, only one URB is active at a time. So I think that if I create multiple URBs and submit them to the USB bus, I will get fewer spikes during OS scheduling than I do now, because the URBs are already prepared: the OS does not have to build them from the application's requests, and the driver can send them straight to the USB stack.

I tried both overlapped and non-overlapped I/O, but that did not help much either. So I came to this conclusion.

If you have looked at the WINDDK, in isochronous mode there is a facility to send multiple URBs for a single IRP. I want to do the same thing, which might solve my problem.

Please help me with this design. I would be very thankful.
Regards,
Tejas

Tejas Vaghela wrote:

Now I have a question about how to boost the speed of USB
bulk communication. Currently I am getting 100 Mbps on
BULK IN and BULK OUT in the worst case, and the maximum I
have achieved is 190 Mbps in bulk mode. But all of this uses
a single URB.

I mean that when I transfer a buffer from the host, I have
to wait for it to complete before I start the next one. So
in the driver only one URB is active at a time.

If you use a KMDF continuous reader, you get this functionality for free. You tell the framework how many outstanding URBs you want pended at any given time.

xxxxx@slscorp.com wrote:

Now I have a question about how to boost the speed of USB bulk communication. Currently I am getting 100 Mbps on BULK IN and BULK OUT in the worst case, and the maximum I have achieved is 190 Mbps in bulk mode. But all of this uses a single URB.

I mean that when I transfer a buffer from the host, I have to wait for it to complete before I start the next one. So in the driver only one URB is active at a time.

And that, in a nutshell, is the root of your performance problem.

As I said, only one URB is active at a time. So I think that if I create multiple URBs and submit them to the USB bus, I will get fewer spikes during OS scheduling than I do now, because the URBs are already prepared: the OS does not have to build them from the application's requests, and the driver can send them straight to the USB stack.

I tried both overlapped and non-overlapped I/O, but that did not help much either. So I came to this conclusion.

You have some decisions to make. The only way to get maximum USB
throughput is to have multiple URBs queued up. Allow me to take a
moment to explain why.

USB is a scheduled bus. Before a frame begins, the host controller
driver must have all of the transfers that will happen during that frame
configured and ready to send to the host controller. Once the frame
begins, it’s too late. Nothing can be done until the next frame
starts. So, if your driver misses the beginning of a frame, you have to
wait until the NEXT frame for an opportunity to transfer.

Let’s say you have a single URB, and the transfer succeeds. The host
controller driver gets notified at the end of the frame. It will
complete the URB and send it back to you, but by that time the next frame
has already started. And depending on the vagaries of Windows
scheduling, it might take several more frames before the URB can be
returned to your application and turned around again.
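
To put rough numbers on that turnaround cost, here is a minimal sketch of the arithmetic. It is a simplified model, not a description of any particular host controller: it assumes a 125 microsecond high-speed microframe as the scheduling unit and a hypothetical payload budget of 5120 bytes per microframe. Neither figure comes from the thread, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumed figures, not from the thread: a 125 us high-speed microframe and a
// hypothetical payload budget of 5120 bytes (ten 512-byte packets) per microframe.
const double kMicroframeSeconds = 125e-6;
const std::uint32_t kBytesPerMicroframe = 5120;

// Microframes the bus spends moving one URB of `urbBytes`.
std::uint32_t busyMicroframes(std::uint32_t urbBytes) {
    assert(urbBytes > 0);
    return (urbBytes + kBytesPerMicroframe - 1) / kBytesPerMicroframe;  // round up
}

// Effective rate in Mbit/s when `idleMicroframes` pass with nothing queued
// between one URB completing and the next one being submitted.
double effectiveMbps(std::uint32_t urbBytes, std::uint32_t idleMicroframes) {
    const double seconds =
        (busyMicroframes(urbBytes) + idleMicroframes) * kMicroframeSeconds;
    return urbBytes * 8.0 / seconds / 1e6;
}
```

With these assumed numbers, `effectiveMbps(65536, 0)` comes out a little above 320 Mbit/s, and losing 13 microframes on every turnaround halves it. A second URB already queued is what keeps the idle count at zero.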

So, you must have multiple URBs ready to go at all times. There are two
ways to do that. One way is to eliminate the tight coupling between the
driver and the application. It sounds like your driver converts each
read/write to a single URB and transmits it. That is very tight
coupling. Instead, it is possible to have the driver submit the URBs on
its own, feeding into or out of a circular buffer. The application’s
read and write requests go through this circular buffer. The advantage
is a simpler application. The disadvantage is the extra memory for the
circular buffer.
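
As a rough illustration of the circular buffer idea, the sketch below shows only the bookkeeping: a producer side that the URB completion path would call and a consumer side that the application's read would call. The class name `ByteRing` is hypothetical, and the locking and nonpaged allocation that a real driver needs are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical byte ring. A real driver would keep this in nonpaged memory
// and guard it with a lock; this sketch shows only the index arithmetic.
class ByteRing {
public:
    explicit ByteRing(std::size_t capacity) : buf_(capacity) { assert(capacity > 0); }

    std::size_t size() const { return count_; }                  // bytes stored
    std::size_t space() const { return buf_.size() - count_; }   // bytes free

    // Producer side (e.g. URB completion on an IN pipe): copies what fits.
    std::size_t write(const std::uint8_t* src, std::size_t n) {
        std::size_t written = 0;
        while (written < n && count_ < buf_.size()) {
            buf_[(head_ + count_) % buf_.size()] = src[written++];
            ++count_;
        }
        return written;
    }

    // Consumer side (the application's read request): copies what is available.
    std::size_t read(std::uint8_t* dst, std::size_t n) {
        std::size_t got = 0;
        while (got < n && count_ > 0) {
            dst[got++] = buf_[head_];
            head_ = (head_ + 1) % buf_.size();
            --count_;
        }
        return got;
    }

private:
    std::vector<std::uint8_t> buf_;
    std::size_t head_ = 0;   // index of the oldest unread byte
    std::size_t count_ = 0;  // bytes currently stored
};
```

The byte-at-a-time copy keeps the wraparound logic obvious; a production version would copy in at most two contiguous chunks instead.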

Another choice is overlapped I/O. Let the application queue up multiple
read requests and wait for them. As each read request finishes, it gets
resubmitted right away. You will have to have enough requests with
large enough buffers to make sure that the driver never runs dry, even
if the application gets delayed for many tens of milliseconds.

This approach does work. We are able to sustain 45 MB/s (360 Mbps) over
bulk pipes with this scheme. (Actually, with both schemes, although the
overlapped I/O approach ended up being simpler overall.)


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> ----------

From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Tim Roberts[SMTP:xxxxx@probo.com]
Reply To: Windows System Software Devs Interest List
Sent: Thursday, July 26, 2007 7:21 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Multiple URB’s in single IRP in USB

So, you must have multiple URBs ready to go at all times.

Definitely. In my experience with USB 1.1 devices, 4 is the minimum; USB 2.0 may need more.

There are two
ways to do that. One way is to eliminate the tight coupling between the
driver and the application. It sounds like your driver converts each
read/write to a single URB and transmits it. That is very tight
coupling. Instead, it is possible to have the driver submit the URBs on
its own, feeding into or out of a circular buffer. The application’s
read and write requests go through this circular buffer. The advantage
is a simpler application. The disadvantage is the extra memory for the
circular buffer.

Another disadvantage is more complicated driver code.

Another choice is overlapped I/O. Let the application queue up multiple
read requests and wait for them. As each read request finishes, it gets
resubmitted right away. You will have to have enough requests with
large enough buffers to make sure that the driver never runs dry, even
if the application gets delayed for many tens of milliseconds.

IMO it is the best choice. It leads to the simplest driver code and gives the application developer full freedom to optimize performance. It is possible to have one generic driver for several devices with quite different performance requirements; the decision is then up to the application. A simple app which doesn’t care about performance can use simple synchronous wrappers for overlapped calls. Another app can use several overlapped requests and reuse them on completion. It is also possible to implement an independent layer/library which uses the circular buffer approach and offers synchronous APIs to its clients. Internally it uses a thread which feeds the driver with overlapped requests.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

""Another choice is overlapped I/O. "

I used overlapped I/O, but there I have to wait for the event, and that is again like blocking. How can I make multiple requests from the application? If I set the WaitForSingleObject timeout to 0 ms the wait fails, and if I set it to 500 ms or so then I have to wait that long. How are these things handled automatically?

As Tim Roberts said, I can create a circular buffer to handle multiple URBs. I agree and I know that technique, but do I have to use the same mechanism that the WDM isochronous sample uses?

Or is there another way I can do this?
Thanks for your answer.
Regards,
Tejas
SLS, Inc
www.slscorp.com

xxxxx@slscorp.com wrote:

""Another choice is overlapped I/O. "

I used overlapped I/O, but there I have to wait for the event, and that is again like blocking. How can I make multiple requests from the application? If I set the WaitForSingleObject timeout to 0 ms the wait fails, and if I set it to 500 ms or so then I have to wait that long. How are these things handled automatically?

You don’t have to wait for the event immediately. One of the benefits
of using overlapped I/O is that you can submit multiple requests at
once. You can create 8 events, in 8 OVERLAPPED structures, submit 8
ReadFile requests, and then use WaitForMultipleObjects to wait for one
of them to finish. As soon as one finishes, you use a table look up to
figure out which request it was, and go resubmit it.

It’s not really all that difficult.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi,
I used the WaitForMultipleObjects function, but it was not much help to me. In some situations it gave worse results than the blocking version.

As I said in my first post, when I run 10000 transactions I get spikes at specific intervals. If I run my code at real-time priority, the spikes are very rare: 1 in 65536 transactions. But at normal process priority I get lots of spikes. The average spike lasts 20 ms, and if a spike is longer than 5 ms my data will be lost. I don’t know why scheduling takes that much time. I am using the asynchronous method now, so what is the issue?

I read that in the WinNT architecture the normal scheduling time is 40-60 ms. If the scheduler is the bottleneck, what is the remedy for my problem?

I cannot attach pictures to this post; otherwise I would send the graphs I took during my experiments.

In my WDM driver, I also use the read/write completion routine to finish the rest of the URBs. So I use the same IRP throughout the transaction, creating new URBs and submitting them down the stack. I don’t think I am wasting any time there.

Here is the code that I used in the application:

for (i = 0; i < INSTANCES; i++)
{
    // Create an event object for this instance.
    ov[i].Offset = 0;
    ov[i].OffsetHigh = 0;
    ov[i].hEvent = hEvents[i] = CreateEvent(
        NULL,   // default security attribute
        TRUE,   // manual-reset event
        TRUE,   // initial state = signaled
        NULL);  // unnamed event object

    if (ov[i].hEvent == NULL)
    {
        printf("CreateEvent failed with %d.\n", GetLastError());
        return 0;
    }
}
for (int j = 0; j < loop; j++)
{
    /*for (i = 0; i < INSTANCES; i++)
    {
        // Create an event object for this instance.
        ov[i].Offset = 0;
        ov[i].OffsetHigh = 0;

        ov[i].hEvent = hEvents[i] = CreateEvent(
            NULL,   // default security attribute
            TRUE,   // manual-reset event
            TRUE,   // initial state = signaled
            NULL);  // unnamed event object

        if (ov[i].hEvent == NULL)
        {
            printf("CreateEvent failed with %d.\n", GetLastError());
            return 0;
        }
    } */

    // Modification end from Sanjay.
    for (int ii = 0; ii < INSTANCES; ii++)
    {
        SLS_W32_WriteFile(hDevice, bWBuf1, iNoOfByteToWrite, &junk, &ov[ii]);
        //(SLS_ListDevices(&iDevNo,cBuffer,SLS_LIST_BY_INDEX|SLS_OPEN_BY_SERIAL_NUMBER)==SLS_OK);
    }

    dwEvent = WaitForMultipleObjects(INSTANCES, // number of objects in array
                                     hEvents,   // array of objects
                                     FALSE,     // wait for any
                                     INFINITE); // indefinite wait

    cout << "Loop" << j << endl;
}

I hope that someone can guide me better.
Thanks for all previous post.
Regards
Tejas
SLS Inc
www.slscorp.com

xxxxx@slscorp.com wrote:

I used the WaitForMultipleObjects function, but it was not much help to me. In some situations it gave worse results than the blocking version.

You are creating manual reset events, and creating them in the
“signaled” state. Does your SLS_W32_WriteFile routine reset the events
before submitting WriteFile? If not, then WFMO may be returning false
positives.

There must be more to your code than this, right? When WFMO tells you
that a request has finished, you need to go resubmit that request.
That’s the key – never let the queue go empty. As long as you have
that “spring”, you can survive the occasional scheduling burp.

while( thread_still_running )
{
dwEvent = WFMO( INSTANCES, hEvents, FALSE, INFINITE );
if( dwEvent < INSTANCES )
{
SLS_W32_WriteFile( hDevice, bWBuf1, iNoOfByteToWrite, &junk,
&ov[dwEvent] );
}
}

As I said in my first post, when I run 10000 transactions I get spikes at specific intervals. If I run my code at real-time priority, the spikes are very rare: 1 in 65536 transactions. But at normal process priority I get lots of spikes. The average spike lasts 20 ms, and if a spike is longer than 5 ms my data will be lost. I don’t know why scheduling takes that much time. I am using the asynchronous method now, so what is the issue?

Never let the queue go empty. Have enough requests in your list so that
you can survive a 50ms gap.
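
As a back-of-the-envelope way to size the queue, the sketch below counts how many buffers the bus can consume while the application is stalled. The function name is hypothetical, and the extra spare buffer is an assumption of this sketch rather than anything stated in the thread. The sample rate used afterwards is the 45 MB/s figure quoted earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>

// Number of requests to keep queued so the driver still has data to move
// while the application is stalled for `gapMs` milliseconds.
std::uint64_t requestsToSurviveGap(std::uint64_t bytesPerSecond,
                                   std::uint64_t gapMs,
                                   std::uint64_t bufferBytes) {
    assert(bufferBytes > 0);
    // Bytes the bus can consume during the stall, rounded up.
    const std::uint64_t bytesDuringGap = (bytesPerSecond * gapMs + 999) / 1000;
    // Whole buffers needed to cover those bytes, rounded up.
    const std::uint64_t buffers = (bytesDuringGap + bufferBytes - 1) / bufferBytes;
    // One spare (an assumption of this sketch): the request in flight when
    // the stall begins may already be nearly finished.
    return buffers + 1;
}
```

At 45,000,000 bytes per second and a 50 ms gap, this gives 36 requests of 64 KB, or 6 requests of 512 KB.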


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi All,
For clarification, I am using 8 instances.

Thanks to Tim for his posts. I think I am going in the right direction and am seeing some good behaviour too.
But I noticed some strange results that I want to share with Tim and the others.
As you mentioned resetting the events in your last post, I changed the CreateEvent call to auto-reset and non-signaled state. That was my first modification.

Now, following your suggestion below:
while( thread_still_running )
{
dwEvent = WFMO( INSTANCES, hEvents, FALSE, INFINITE );
if( dwEvent < INSTANCES )
{
SLS_W32_WriteFile( hDevice, bWBuf1, iNoOfByteToWrite, &junk,
&ov[dwEvent] );
}
}

I got values of “dwEvent” like 2, 3, 0, etc. If I get 3, what does it mean? Does it mean that instances 3, 4, 5, 6, 7 completed, or that 0, 1, 2, 3 completed? From the MSDN docs I understood that instances 3, 4, 5, 6, 7 completed. Your suggestion resubmits only the instance whose index WFMO returned. Does this reduce performance?

So I changed some logic and ran some tests. I used the code below for Test 1:

TEST1:

// Events created as auto-reset, non-signaled; loop = 10000, iNoOfByteToWrite = 4096 bytes

for (int ii = 0; ii < INSTANCES; ii++)
{
    SLS_W32_WriteFile(hDevice, bWBuf1, iNoOfByteToWrite, &junk, &ov[ii]);
}

int j = INSTANCES;
while (j < (loop))
{
    dwEvent = WaitForMultipleObjects(INSTANCES, // number of objects in array
                                     hEvents,   // array of objects
                                     FALSE,     // wait for any
                                     INFINITE); // indefinite wait
    if (dwEvent < INSTANCES)
    {
        for (int t = 0; t <= dwEvent; ++t)
        {
            SLS_W32_WriteFile(hDevice, bWBuf1, iNoOfByteToWrite, &junk, &ov[t]);
            j++;
        }
    }
}
But with this test I got two different results. In one case the device side does not receive all the transactions, even though the application runs the loop 10000 times. So I have to run the application once again and repeat the above procedure for another 4000-5000 transactions before I see 10000 completed transactions on the device side. But every transaction takes 0.5 to 0.6 ms, which surprised me very much. Actually this is the result I want, but I think there is some data loss in between.

In the second case all transactions completed, but I also got spikes of 8-9 ms after certain transactions. The overall spike count in 10000 transactions is 25-26.

TEST2 :

// Events created as auto-reset, non-signaled; loop = 10000, iNoOfByteToWrite = 4096 bytes

for (int ii = 0; ii < INSTANCES; ii++)
{
    SLS_W32_WriteFile(hDevice, bWBuf1, iNoOfByteToWrite, &junk, &ov[ii]);
}

int j = INSTANCES;
while (j < (loop))
{
    dwEvent = WaitForMultipleObjects(INSTANCES, // number of objects in array
                                     hEvents,   // array of objects
                                     FALSE,     // wait for any
                                     INFINITE); // indefinite wait
    if (dwEvent < INSTANCES)
    {
        for (int t = dwEvent; t <= INSTANCES; ++t)
        {
            SLS_W32_WriteFile(hDevice, bWBuf1, iNoOfByteToWrite, &junk, &ov[t]);
            j++;
        }
    }
}

But I got the same result as in TEST1.

I also tried another method. Please see my code below:

I created all events as manual-reset, non-signaled, with loop = 10000 and iNoOfByteToWrite = 4096 bytes.
int j = INSTANCES;
while (j < (loop))
{
    dwEvent = WaitForMultipleObjects(INSTANCES, // number of objects in array
                                     hEvents,   // array of objects
                                     FALSE,     // wait for any
                                     INFINITE); // indefinite wait

    if (dwEvent >= 0 && dwEvent < INSTANCES)
    {
        for (int t = 0; t < INSTANCES; t++)
        {
            int value = WaitForSingleObject(hEvents[t], 0);

            if (value == WAIT_OBJECT_0)
            {
                ResetEvent(ov[i].hEvent);
                SLS_W32_WriteFile(hDevice, bWBuf1, iNoOfByteToWrite, &junk, &ov[t]);
                j++;
            }
        }
    }
}

But with this I got the same results as in TEST1 and TEST2.

So I am confused about what WFMO returns. I think that I am very near to the result but am doing something wrong that makes it fail.

Please advise.
Tejas
SLS inc.
www.slscorp.com

xxxxx@slscorp.com wrote:

Thanks to Tim for his posts. I think I am going in the right direction and am seeing some good behaviour too.
But I noticed some strange results that I want to share with Tim and the others.
As you mentioned resetting the events in your last post, I changed the CreateEvent call to auto-reset and non-signaled state. That was my first modification.

Now, following your suggestion below:
while( thread_still_running )
{
dwEvent = WFMO( INSTANCES, hEvents, FALSE, INFINITE );
if( dwEvent < INSTANCES )
{
SLS_W32_WriteFile( hDevice, bWBuf1, iNoOfByteToWrite, &junk,
&ov[dwEvent] );
}
}

I got values of “dwEvent” like 2, 3, 0, etc. If I get 3, what does it mean? Does it mean that instances 3, 4, 5, 6, 7 completed, or that 0, 1, 2, 3 completed? From the MSDN docs I understood that instances 3, 4, 5, 6, 7 completed.

No. If you get 2, 3, 0, in that order, it means that it completed
buffers 2, 3, and 0. You don’t know anything about 1, 4, 5, 6, and 7.
They might be completed, or they might still be outstanding. If
multiple events have fired by the time you call WFMO, it makes no
guarantees about what order it will return. So, for example, if 0, 1, 2
and 3 had all finished before you checked, you might get them in any order.

If you really need these to be processed in order, then you shouldn’t
use WFMO. Instead, use WaitForSingleObject and cycle through them in order:

iNextHandleToFire = 0;
while( thread_still_running )
{
WaitForSingleObject( hEvents[iNextHandleToFire], INFINITE );
SLS_W32_WriteFile( hDevice, … );
iNextHandleToFire = (iNextHandleToFire + 1) % INSTANCES;
}

TEST1:

// Events created as auto-reset, non-signaled; loop = 10000, iNoOfByteToWrite = 4096 bytes

for (int ii = 0; ii < INSTANCES; ii++)
{
    SLS_W32_WriteFile(hDevice, bWBuf1, iNoOfByteToWrite, &junk, &ov[ii]);
}

int j = INSTANCES;
while (j < (loop))
{
    dwEvent = WaitForMultipleObjects(INSTANCES, // number of objects in array
                                     hEvents,   // array of objects
                                     FALSE,     // wait for any
                                     INFINITE); // indefinite wait
    if (dwEvent < INSTANCES)
    {
        for (int t = 0; t <= dwEvent; ++t)
        {
            SLS_W32_WriteFile(hDevice, bWBuf1, iNoOfByteToWrite, &junk, &ov[t]);
            j++;
        }
    }
}

That loop is not correct. WFMO returns exactly one value, telling you
that exactly one event fired. You can’t assume that everything up
through that event has fired. I’m not surprised that you dropped
transactions here.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> ----------

From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Tim Roberts[SMTP:xxxxx@probo.com]
Reply To: Windows System Software Devs Interest List
Sent: Thursday, August 02, 2007 7:45 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Multiple URB’s in single IRP in USB

If you really need these to be processed in order, then you shouldn’t
use WFMO. Instead, use WaitForSingleObject and cycle through them in order:

It is IMO the simplest and in many cases the best solution. Create a queue of buffers and always wait for the queue head. Initially, feed the driver with the optimal number of buffers, put them in the queue in the same order, and then start waiting. On completion, process the first buffer, then pass it back to the driver and queue it at the tail. If processing takes some time, it helps to create another queue of unprocessed buffers and use a different thread to process them. In this case more buffers have to be allocated at the beginning so the driver always has the optimal number of buffers pending. The optimal number can be found by experiment and can differ from machine to machine. In practice, it is enough to find the minimal number which is enough for most machines and double it.

When I solved a similar problem, I created 3 queues: empty for allocated buffers, ready for buffers containing read data waiting for processing, and in-use for buffers passed to the driver. Buffers are moved from queue to queue much as memory pages are moved between lists in the Windows kernel. This separates data reading from data processing and effectively implements an intermediate cache.
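
The three-queue scheme can be sketched as plain bookkeeping. The version below is only an illustration under stated assumptions: buffers are identified by index, the struct and method names are hypothetical, and the actual I/O calls and the locking needed between the reading and processing threads are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical bookkeeping for the three queues; buffers are identified by index.
struct BufferQueues {
    std::deque<std::size_t> empty;  // allocated, not in use
    std::deque<std::size_t> inUse;  // handed to the driver, oldest first
    std::deque<std::size_t> ready;  // filled with data, waiting for processing

    explicit BufferQueues(std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) empty.push_back(i);
    }

    // Hand one empty buffer to the driver; returns false if none is left.
    bool submitOne() {
        if (empty.empty()) return false;
        inUse.push_back(empty.front());
        empty.pop_front();
        return true;
    }

    // The oldest pending request completed: its buffer now holds data.
    void completeOldest() {
        assert(!inUse.empty());
        ready.push_back(inUse.front());
        inUse.pop_front();
    }

    // The processing thread finished with the oldest ready buffer.
    void recycleOldest() {
        assert(!ready.empty());
        empty.push_back(ready.front());
        ready.pop_front();
    }
};
```

A reader thread would call `completeOldest()` followed by `submitOne()` after each wait on the head of `inUse`, while a separate thread drains `ready` through `recycleOldest()`.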

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

Hi All,
Here I want to share my results.
First of all, I got the result perfectly and also found out how the scheduler affects the overall performance.

I did some tests.
I used the same logic that Tim suggested.
while( thread_still_running )
{
dwEvent = WFMO( INSTANCES, hEvents, FALSE, INFINITE );
if( dwEvent < INSTANCES )
{
SLS_W32_WriteFile( hDevice, bWBuf1, iNoOfByteToWrite, &junk,
&ov[dwEvent] );
}
}

I created INSTANCES = 64 with a buffer size of 4096 bytes, and repeated this 10000 times.
If I add a cout to show the loop count, and I click on any open window to minimize it, then I get spikes of 7-8 ms.
If I remove the cout, all my transactions are very smooth. Every transaction completes in under 0.5 ms.

So during the cout or a click event my host controller ran dry and I got spikes.

Now I know that to get better performance I need to send large URBs.

So I changed the buffer size to 65536 bytes and repeated this 10000 times with 64 INSTANCES. My driver’s BULKUSB_MAX_BUFFER_SIZE is 4096 bytes. In this case the transactions are smooth, and I can transfer 65536 bytes within 3.8 ms. If I click on any window or turn the cout on, it does not cause any problem. It means that my URBs are always ready for the controller and the controller never runs dry.

So in this case the spikes were removed. But my overall performance is 125 Mbit/s.
So Tim, how are you able to transfer above 300 Mbit/s?

Second, I cannot create more than 64 instances. Is this the maximum number of INSTANCES handled by the OS?

Michal, can you give me an idea on how to do IMO?

xxxxx@slscorp.com wrote:

I created INSTANCES = 64 with a buffer size of 4096 bytes, and repeated this 10000 times.
If I add a cout to show the loop count, and I click on any open window to minimize it, then I get spikes of 7-8 ms.
If I remove the cout, all my transactions are very smooth. Every transaction completes in under 0.5 ms.

Remember that, for a device that can suck data continuously, 4096 bytes
is just over half of a microframe. It will drain in 80 to 100
microseconds. When the request completes, there is an awful lot of
overhead in sending it back through the driver and up to you to get
resubmitted. You’ll be getting 10,000 completions per second. That’s a
LOT of overhead. You MUST use larger buffers to achieve maximum throughput.
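
The completion-rate arithmetic can be checked with a few lines of code. The helper names are hypothetical, and the sample rate of 40,960,000 bytes per second is an assumption picked so that a 4096-byte buffer drains in exactly 100 microseconds, the upper end of the range mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Time for one buffer to drain at a sustained rate, in microseconds.
std::uint64_t drainMicroseconds(std::uint64_t bufferBytes, std::uint64_t bytesPerSecond) {
    assert(bytesPerSecond > 0);
    return bufferBytes * 1000000ULL / bytesPerSecond;
}

// Request completions per second; each one is a full round trip through
// the driver and the application before the buffer can be resubmitted.
std::uint64_t completionsPerSecond(std::uint64_t bufferBytes, std::uint64_t bytesPerSecond) {
    assert(bufferBytes > 0);
    return bytesPerSecond / bufferBytes;
}
```

At that assumed rate, 4096-byte buffers produce 10,000 completions per second, while 512 KB buffers produce 78 per second and each lasts 12.8 ms on the bus.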

Now I know that to get better performance I need to send large URBs.

Yes.

So I changed the buffer size to 65536 bytes and repeated this 10000 times with 64 INSTANCES. My driver’s BULKUSB_MAX_BUFFER_SIZE is 4096 bytes.

What does that mean? Why is the limit so low? You can send several
megabytes in a single URB. There is no reason to have this limit be so low.

In this case the transactions are smooth, and I can transfer 65536 bytes within 3.8 ms. If I click on any window or turn the cout on, it does not cause any problem. It means that my URBs are always ready for the controller and the controller never runs dry.

So in this case the spikes were removed. But my overall performance is 125 Mbit/s.
So Tim, how are you able to transfer above 300 Mbit/s?

Are you sure your device can suck data at that rate? What kind of
device is it?

Second, I cannot create more than 64 instances. Is this the maximum number of INSTANCES handled by the OS?

WaitForMultipleObjects does have a limit of 64 handles. As several of
us said recently, WFMO is probably the wrong solution for you.

Also, more buffers is not necessarily better. You should only use as
many buffers as you need to avoid running dry. Using any more is just a
waste of resources. Instead of 64 buffers of 64k, why not use 8 buffers
of 512k?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> ----------

From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of xxxxx@slscorp.com[SMTP:xxxxx@slscorp.com]
Reply To: Windows System Software Devs Interest List
Sent: Friday, August 03, 2007 7:29 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Multiple URB’s in single IRP in USB

If I add a cout to show the loop count, and I click on any open window to minimize it, then I get spikes of 7-8 ms.

Oh sure. Console output can be the main bottleneck. Just don’t use it in time-critical code. Hint: try holding the window scrollbar for some time; console output is completely blocked.

So I changed the buffer size to 65536 bytes and repeated this 10000 times with 64 INSTANCES. My driver’s BULKUSB_MAX_BUFFER_SIZE is 4096 bytes.

There should be no need for it. If I remember correctly, a max buffer size was only necessary on obsolete OSes (w9x and maybe also w2k); there is no such limit on XP and above. Just change it to 256 kB to achieve good results, or remove it completely.

So in this case the spikes were removed. But my overall performance is 125 Mbit/s.
So Tim, how are you able to transfer above 300 Mbit/s?

With a 4 kB max buffer size, each of your 64 kB requests is handled in 16 stages in the driver if you’re using the BulkUsb code (BTW, bad idea, it is miserable code). The USB drivers can have enough requests queued but still handle only a small amount of data for every request, and performance is decreased.
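
The 16-stage figure follows directly from dividing the request size by the per-stage limit. The sketch below is not BulkUsb code; it is a hypothetical helper that only shows how a request would be carved into stages under such a limit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Stage {
    std::size_t offset;  // where this stage starts in the caller's buffer
    std::size_t length;  // bytes moved by this stage
};

// Split one request into stages of at most `maxStageBytes` each.
std::vector<Stage> splitIntoStages(std::size_t requestBytes, std::size_t maxStageBytes) {
    assert(maxStageBytes > 0);
    std::vector<Stage> stages;
    for (std::size_t offset = 0; offset < requestBytes; offset += maxStageBytes) {
        const std::size_t remaining = requestBytes - offset;
        stages.push_back({offset, remaining < maxStageBytes ? remaining : maxStageBytes});
    }
    return stages;
}
```

A 65536-byte request with a 4096-byte limit yields 16 stages, each of which has to complete before the next is sent; with a 256 kB limit the same request is a single stage.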

Second, I cannot create more than 64 instances. Is this the maximum number of INSTANCES handled by the OS?

You probably reached the maximum for WaitForMultipleObjects(). I’m sorry to say it, but it seems you don’t have enough experience in the Win32 area. What you’re trying to solve is a very simple problem which shouldn’t take more than a few hours to complete. You’ll encounter much more complicated problems, especially if you use the BulkUsb sample as the driver.

Michal, can you give me an idea on how to do IMO?

:) IMO means In My Opinion.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

Hi All,
Thanks for your great support, especially Tim and Michal.
Yes, I am using the BulkUsb sample driver as my USB driver.
Frankly speaking, I had a lot of Win32 experience, but for the last year I was busy developing USB 2.0 firmware (intellectual property) and its HAL driver for specific 32-bit CPUs (like NIOS II, ARM, HITACHI etc.), and just this July we got certified in the USB 2.0 compliance test. Now I have just started to work on the driver side. In the past I already worked on drivers and developed a USB 1.1 driver, but there speed was not that important because USB 1.1 runs at 12 Mbps. Now I have to do this because USB 2.0 runs at 480 Mbps and I want 300+ Mbps. As I said, for the last year my knowledge was on the device side, and now I have suddenly moved back to the Win32 and driver side, so it will definitely take some time to pick all this up again, at least until I move on to USB OTG. :) I have had lots of back-and-forth switching during programming. :)

Tim wrote >>Are you sure your device can suck data at that rate? What kind of
device is it?

Yes Tim, I am sure of that, because I checked in practice with a USB analyzer and measured the PID IN/OUT response. I am not going into depth on this, but I am sure of it.

Tim Wrote >>What does that mean? Why is the limit so low? You can sent several
megabytes in a single URB. There is no reason to have this limit be so low.

OK, I will change that in my driver, so that, as Michal said, for 64 KB my driver does not have to go through 16 stages.

Michal wrote >>You’ll encounter much more complicated problems especially if you use
BulkUsb sample as the driver.

Which driver sample can I use to develop the USB 2.0 driver?

Once again I am very thankful to both of you.
Regards,
Tejas

> ----------

From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of xxxxx@slscorp.com[SMTP:xxxxx@slscorp.com]
Reply To: Windows System Software Devs Interest List
Sent: Saturday, August 04, 2007 9:50 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Multiple URB’s in single IRP in USB

300+ Mbps. As I said, for the last year my knowledge was on the device side, and now I have suddenly moved back to the Win32 and driver side, so it will definitely take some time to pick all this up again, at least until I move on to USB OTG. :) I have had lots of back-and-forth switching during programming. :)

OK, sorry for underestimating your experience. I know very well what switching means (developing firmware, driver and user mode communication at once :).

Which driver sample can I use to develop the USB 2.0 driver?

I guess you should try KMDF. The buggy and incorrectly designed parts of BulkUsb should already be solved correctly by the framework, and getting a working driver equivalent to BulkUsb functionality is very easy. I tried it once to perform some testing and it took only a few hours until I could communicate with my device. I have my own reasons why I stay with a highly modified BulkUsb, but as a starting point it is a very bad choice now. I had to rewrite about half of the code and still suffer from design problems with it :-/

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

Hi Michal,
It is OK with me, whatever you think of me. I just wanted to clarify my position. It is definitely possible that you have great ideas on the driver side.

Anyway, I will start with WDF build 6000 and use the KMDF osrusbfx2 driver as the base for my device driver and try to develop from that.

If I run into any difficulties, I will let you guys know.
Once again, thanks for your guidance and suggestions.

Regards,
Tejas
SLS

Look at both usbsamp as well as osrusbfx2 before you choose the sample
you want to base your project on. Both have their own unique features,
if you want something like bulkusb, usbsamp is closer to it.

d
