USB performance

I’ve developed a WDM device driver that performs bulk Rx transfers from
a USB device, and I’m profiling its performance.
It looks like the CPU usage is pretty high, and that it is concentrated
entirely in the IRP interaction between my driver and the bus driver for
the bulk receives.

Should I expect the USB bus driver to be CPU inefficient, or should I
look for issues in my driver? Are there well-known ways to improve the
CPU usage of high-rate USB drivers?
Thanks in advance,

Loris

What % increase are you seeing? What OS? USB 2.0 or 1.1? Are you doing
any memcpys before sending the PIRP or on completion?

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Loris Degioanni
Sent: Thursday, September 07, 2006 10:06 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] USB performance

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

On Thu, Sep 07, 2006 at 10:05:34PM -0700, Loris Degioanni wrote:

> Should I expect the USB bus driver to be CPU inefficient, or should I
> look for issues in my driver? Are there well-known ways to improve the
> CPU usage of high-rate USB drivers?

How large are your transfers? We were able to sustain 30 MB/s over a
bulk pipe with only about 10% CPU usage, but you need to make sure you
are using largish buffers to reduce the traffic between you and the HCD.
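
A rough sketch of why buffer size matters here: every bulk transfer is one IRP round trip to the HCD, so for a fixed data rate the number of round trips per second falls as the buffer grows. The buffer sizes below are illustrative assumptions, not values from this thread, and 30 MB/s is taken as decimal megabytes.

```python
def transfers_per_second(throughput_bytes_per_s: float, buffer_bytes: int) -> float:
    """Bulk transfers (IRPs) needed per second to sustain the given throughput."""
    if buffer_bytes <= 0:
        raise ValueError("buffer_bytes must be positive")
    return throughput_bytes_per_s / buffer_bytes


THROUGHPUT = 30_000_000  # 30 MB/s, assuming 1 MB = 10**6 bytes

# Hypothetical buffer sizes, chosen only to show the trend.
for size in (512, 4096, 65536, 262144):
    rate = transfers_per_second(THROUGHPUT, size)
    print(f"{size:>7} bytes/transfer -> {rate:9.0f} transfers/s")
```

If the per-transfer overhead is roughly constant, the larger buffers cut the IRP traffic by the same factor as the size increase.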

Doron Holan wrote:

> What % increase are you seeing?

I’m seeing a CPU load of ~30% on a P4 3.2 GHz for ~2k bulk transfers per
second.

> What OS? USB 2.0 or 1.1? Are you doing

Windows XP, USB 2.0.

> any memcpys before sending the PIRP or on completion?

Yes, to move the data to a driver-allocated buffer. Their cost is close
to zero. The CPU load seems to be outside my driver, so I guess it’s in
the bus driver.

Loris

How are you measuring where the CPU load is? Memcpy is not cheap or free;
it will cost you. How big is each xfer? Tons of tiny xfers will not
have good perf.

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Loris Degioanni
Sent: Thursday, September 07, 2006 10:53 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] USB performance

TransferBufferLength in UsbBuildInterruptOrBulkTransferRequest() is 4096 bytes.

Loris

30 MB/s over a bulk pipe with 10% CPU usage still seems quite a lot to
me. With this performance, the theoretical full speed of the bus
(480 Mbps, right?) appears to be pretty hard to reach…

Loris

At a few megabytes per second, memcpy *is* negligible on a modern CPU.

I tried varying the size of the transfer from a few hundred bytes to a few
kilobytes, and the CPU load seems to be proportional to the USB transfer
frequency, while the transfer size seems to have minimal impact.

Loris
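
The figures reported in this thread (~30% of a 3.2 GHz CPU at ~2k transfers per second) can be turned into a rough per-transfer cost. The sketch below assumes the cost is a fixed amount per transfer regardless of size, which matches the observation above but is a simplification. It also assumes the 4096-byte TransferBufferLength applied to that measurement; the larger buffer sizes and the function names are hypothetical.

```python
def per_transfer_cost(cpu_fraction, cpu_hz, transfers_per_s):
    """Return (seconds, cycles) of CPU time consumed by one transfer."""
    seconds = cpu_fraction / transfers_per_s
    return seconds, seconds * cpu_hz


def predicted_load(bytes_per_s, buffer_bytes, seconds_per_transfer):
    """CPU fraction predicted when the same data moves in buffer_bytes chunks."""
    return (bytes_per_s / buffer_bytes) * seconds_per_transfer


# Figures reported in the thread: ~30% load, 3.2 GHz P4, ~2k transfers/s.
seconds, cycles = per_transfer_cost(0.30, 3.2e9, 2000)
print(f"per transfer: {seconds * 1e6:.0f} us, {cycles:,.0f} cycles")

# Assumption: 4096-byte transfers were in use for that measurement.
data_rate = 2000 * 4096  # bytes per second

for size in (4096, 16384, 65536):  # sizes above 4096 are hypothetical
    load = predicted_load(data_rate, size, seconds)
    print(f"{size:>6}-byte transfers -> {load:.1%} CPU")
```

Under this model the load depends only on how many transfers are issued per second, so moving the same data in fewer, larger transfers is the obvious thing to try. Whether the per-transfer cost really stays flat at larger sizes would need to be measured.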

Loris Degioanni wrote:

> 30 MB/s over a bulk pipe with 10% CPU usage still seems quite a lot to
> me. With this performance, the theoretical full speed of the bus
> (480 Mbps, right?) appears to be pretty hard to reach…

Why? 30 MB/s is fully half of the theoretical bandwidth. (You can’t
actually reach the full 60 MB/s because of bus protocol overhead.) We
were actually quite pleased with those results.

It’s possible we have a units problem here. MB/s is megabytes per
second, while Mbps is megabits per second.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
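
The unit conversion behind the reply above can be checked in a few lines. This is a minimal sketch assuming decimal prefixes (1 Mbps = 10^6 bits/s, 1 MB/s = 10^6 bytes/s); the function names are illustrative.

```python
BITS_PER_BYTE = 8


def mbps_to_mb_per_s(megabits_per_s):
    """Convert megabits per second (Mbps) to megabytes per second (MB/s)."""
    return megabits_per_s / BITS_PER_BYTE


def mb_per_s_to_mbps(megabytes_per_s):
    """Convert megabytes per second (MB/s) to megabits per second (Mbps)."""
    return megabytes_per_s * BITS_PER_BYTE


bus_mb_per_s = mbps_to_mb_per_s(480)   # raw USB 2.0 signalling rate in MB/s
achieved_mbps = mb_per_s_to_mbps(30)   # the 30 MB/s figure expressed in Mbps
share = 30 / bus_mb_per_s              # fraction of the raw rate achieved

print(f"480 Mbps = {bus_mb_per_s:.0f} MB/s")
print(f"30 MB/s = {achieved_mbps:.0f} Mbps ({share:.0%} of the raw rate)")
```

The raw figure ignores bus protocol overhead, so the achievable ceiling is lower than the value computed here.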