Bug in WdfUsb Continuous Reader?

Dear all,

Our hardware sends relatively short blocks of data (about 300 bytes) at a rate of 128 Hz (one block roughly every 8 ms). An interrupt endpoint is used.
The timing of the data is crucial for our application.

The WdfUsb continuous reader seemed to be the ideal solution, as data is received for as long as the device is attached. However, the following problem occurs:

  • The reader operates fine most of the time.
  • Every now and then (about 1 call in 1000) the reader seems to drop a call: the block of data is not read from the device.
  • The “lost” block of data is read in the next time slice.

Thus, the timing would be, for example:

time+0 ms    data
time+8 ms    data
time+16 ms   continuous reader does not call back
time+24 ms   data (from +16 ms) and current data

This delay is not acceptable for our application. Using a manual request mechanism (create, send and reuse a request to read continuously) works without this problem.
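The pattern described above (one interval of about 16 ms followed by two blocks almost back to back) can be counted from completion timestamps. The following is only a minimal sketch in plain C, not driver code: the names `gap_stats`, `analyze_completions` and `example_us`, the 1 ms tolerance, and the sample timestamps are all made up for illustration. In a real driver the timestamps would have to come from something like `KeQueryPerformanceCounter` in the read-complete callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: how many intervals were too long or too short. */
typedef struct {
    size_t late;          /* gap clearly longer than one period  */
    size_t back_to_back;  /* gap clearly shorter than one period */
} gap_stats;

/* Classify the gaps between consecutive completion timestamps (microseconds).
 * A gap counts as late if it exceeds period + tolerance, and as back-to-back
 * if it is below period - tolerance. */
gap_stats analyze_completions(const uint64_t *t_us, size_t n,
                              uint64_t period_us, uint64_t tol_us)
{
    gap_stats s = {0, 0};

    assert(tol_us < period_us);

    for (size_t i = 1; i < n; ++i) {
        assert(t_us[i] >= t_us[i - 1]);  /* timestamps must not go backwards */
        uint64_t delta = t_us[i] - t_us[i - 1];

        if (delta > period_us + tol_us)
            ++s.late;
        else if (delta + tol_us < period_us)
            ++s.back_to_back;
    }
    return s;
}

/* Invented sample: two on-time blocks, one skipped slot, then two blocks
 * delivered together. 7812 us approximates the 128 Hz period. */
static const uint64_t example_us[4] = {0, 7812, 23437, 23500};
```

With the sample data and a tolerance of 1000 µs, the function reports one late gap and one back-to-back gap, which is the signature of the problem described in this post.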

Do you have any similar experience?

Best regards,

Johannes

Have you considered changing your interrupt endpoint descriptor to
specify the period as 1ms instead of every 8ms? Unless your device
clock is synchronized to the USB clock (which it probably is not), you
always have the possibility that the timing will come in just a bit
off. How does your device react if the interrupt endpoint request comes
in a microsecond before data is ready? Do you reject it, or do you hold
off?

Have you hooked up a USB analyzer to see what the actual traffic looks like?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Tim Roberts wrote:

Have you considered changing your interrupt endpoint descriptor to
specify the period as 1ms instead of every 8ms? Unless your device
clock is synchronized to the USB clock (which it probably is not), you
always have the possibility that the timing will come in just a bit
off.

The bInterval in the endpoint descriptor is set to 1. This should cause the USB host controller to poll every single microframe (125 µs).
We use an FX2 chip in slave FIFO mode. The timing is controlled by our FPGA (and there is really no way our FPGA could get the timing wrong!). The FPGA continuously writes data to the endpoint and flushes the buffer every 8 ms (the FX2 processor isn’t doing anything). We used the same hardware with our non-WDF driver and had no problems at all. As I said, the timing even works if I use a manual continuous reader in WDF terms (request create, send, reuse).

Most programmers would not detect this problem, as throughput is the prime concern for the majority of applications. Since no data is lost in the end, it’s very hard to see this bug if you are not watching the timing carefully. On receiving data, we send new data to our hardware. The hardware watches the timing very strictly, which is how we detected the problem.
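For reference, the relation between bInterval and the polling period can be written down directly. This is a small sketch that assumes the device enumerates at high speed, where the USB 2.0 specification defines the interrupt polling period as 2^(bInterval-1) microframes of 125 µs each, with bInterval in the range 1 to 16. The function name `hs_interrupt_period_us` is invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Polling period in microseconds of a high-speed interrupt endpoint.
 * Assumption: high-speed operation, so the period is
 * 2^(bInterval - 1) microframes and one microframe lasts 125 us. */
uint32_t hs_interrupt_period_us(uint8_t bInterval)
{
    assert(bInterval >= 1 && bInterval <= 16);
    return 125u << (bInterval - 1);
}
```

Under that assumption, bInterval = 1 gives 125 µs, bInterval = 4 gives 1 ms, and bInterval = 7 gives 8 ms. At full speed the field is interpreted differently (a count of 1 ms frames), so this formula does not apply there.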

How does your device react if the interrupt endpoint request comes
in a microsecond before data is ready? Do you reject it, or do you hold
off?

Have you hooked up a USB analyzer to see what the actual traffic looks like?

Yes, I watched the timing using various USB analyzers. I am not exactly sure where the USB analyzer hooks in. My feeling is that it observes the requests between the MS framework and the host controller (rather than between the hardware and the host controller). The problem seemed to be the same, but I’ll have another go at this.

I would be less convinced that the bug is somewhere in the WDF continuous reader if the data just arrived a little too late (say 1 ms or so). However, the old data arrives exactly when the new data is due. The reader seems to lose the request on its way to the USB host controller.

Thanks a lot,

Johannes

Johannes Spinneken wrote:

Yes, I watched the timing using various USB analyzers. I am not
exactly sure where the USB analyzer hooks on. My feeling is it
observes the requests between the MS framework and the host
controller (rather than hardware to host controller). The problem
seemed to be the same but I’ll have another go at this.

I think Tim meant an actual protocol analyzer that sits on the wire, not a software USB sniffer (which will not help you with this kind of problem).

If you are getting two packets’ worth all at once after a missed frame, then I can’t imagine you would see anything but exactly that with a real protocol analyzer, immediately preceded by your device NAKing the IN tokens from the host during the previous frame instead of answering with one packet’s worth.

Let me try to address the original concern. KMDF does the following in
the continuous reader:

  1. Submit N reads (2 by default). This way there is always a
    transfer pending. This means that even though a request completed
    in one 8 ms period, there is still another one pending.

  2. Upon completion, a DPC is queued and the read is resent. Since
    there are N requests pending, this delay should not manifest itself as a
    missed packet.

I would suggest cranking up the number of pending reads to 4 or 5
(anything more is really overkill, even 4 is a bit overkill). IMHO, this
is a scheduler/firmware problem, not a client driver problem. You need
to verify with a bus analyzer that a transfer is pending at all times;
if one is not, that could point to a host controller driver problem. I
would also test against different host controllers (ehci, ohci, uhci) to
see if the bug reproduces. Just because you cannot repro with WDM does
not mean that it is not happening there; perhaps KMDF is making a
timing problem show up more often…but it smells like a timing problem
in the lower layers to me.
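The reasoning in the two steps above can be checked with a toy model. The sketch below is plain C and is not KMDF code; it only assumes that the device has a block ready every period, that a block completes immediately if a read is pending, and that each completed read becomes pending again after a resubmit delay. The names `count_late_packets`, `MAX_PENDING` and `example_delays_ms` and the delay values are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Invented sample: resubmitting normally takes 1 ms, but once it takes 20 ms. */
static const double example_delays_ms[6] = {1.0, 1.0, 20.0, 1.0, 1.0, 1.0};

/* Count blocks that could not be delivered when they became ready because
 * no read was pending at that moment.
 * Toy model: block k is ready at k * period_ms; resubmit_delay_ms[k] is the
 * time needed to put the read that carried block k back in the queue. */
size_t count_late_packets(size_t num_pending, double period_ms,
                          const double *resubmit_delay_ms, size_t num_packets)
{
    double ready_at[MAX_PENDING] = {0.0};  /* when each read is pending again */
    size_t late = 0;

    assert(num_pending >= 1 && num_pending <= MAX_PENDING);

    for (size_t k = 0; k < num_packets; ++k) {
        double data_ready = (double)k * period_ms;

        /* Use the read that has been (or will be) pending the earliest. */
        size_t best = 0;
        for (size_t i = 1; i < num_pending; ++i)
            if (ready_at[i] < ready_at[best])
                best = i;

        double completed = data_ready;
        if (ready_at[best] > data_ready) {
            /* No read pending: the block waits until one is resubmitted. */
            completed = ready_at[best];
            ++late;
        }
        ready_at[best] = completed + resubmit_delay_ms[k];
    }
    return late;
}
```

With the sample delays and an 8 ms period, one pending read yields two late blocks, while two pending reads yield none. In the real API the number of pending reads is set through the `NumPendingReads` member of `WDF_USB_CONTINUOUS_READER_CONFIG`. If blocks still arrive late with several reads pending, the model suggests the cause lies outside the resubmit path, which is the point being made above.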

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@gmail.com
Sent: Monday, April 30, 2007 7:53 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Bug in WdfUsb Continuous Reader?

Johannes Spinneken wrote:

Yes, I watched the timing using various USB analyzers. I am not
exactly sure where the USB analyzer hooks on. My feeling is it
observes the requests between the MS framework and the host
controller (rather than hardware to host controller). The problem
seemed to be the same but I’ll have another go at this.

I think Tim meant an actual protocol analyzer that sits on the wire, not
a software USB sniffer (which will not help you with this kind of
problem).

If you are getting two packets’ worth all at once after a missed frame,
then I can’t imagine you would see anything but exactly that with a real
protocol analyzer, immediately preceded by your device NAKing the IN
tokens from the host during the previous frame instead of
answering with one packet’s worth.