Driver compiled with DDK 3790 works, sometimes fails when compiled with 6000

Our USB HID minidriver sometimes fails on some hardware, usually laptops.

I have a URB reading an INT pipe on a blocking read. When the system fails, this URB stops receiving any data altogether, even though I can see the data coming in on a USB bus analyzer.
If I pull the device out, the URB fails with STATUS_UNSUCCESSFUL and the URB status is USBD_STATUS_DEV_NOT_RESPONDING, which is the normal behaviour. It is as if something below me in the stack is no longer passing data. The bus analyzer also shows a lot of CRC errors caused by a dodgy cable, even though they don’t show up in my URB (I never get the USBD_ERROR_CRC). At first I thought that could be the issue, but some customer feedback puzzled me:

What puzzles me is that customers report that the previous version of our driver does not do this. The only big difference is that I upgraded the DDK to 6000. Any other difference is minor and has to do with new byte packets sent by the firmware. The core of the driver (the reading / dispatching logic) is the same; it’s just a case of a few switch statements in the data processing.

Does anybody know (Doron!) if anything has changed between the two DDKs that would cause that?

One of our customers also says that turning off the “Allow the computer to turn this device off” option seems to fix it, but we cannot confirm this information is correct. I do not support selective suspend in the driver, and our device cannot wake up the hub.

Regards,

Pierre
PS: I am getting a faulty system next week for a day, so I can experiment, but any information before that would greatly help.

To make sure I understand the problem correctly: it is only the build environment that changed, correct? You are still running it on the same OS? If that is correct, I don’t see the build environment being the root cause of your issue, since it will not affect the underlying host controller driver. What OS are you seeing the problems on?

On which device are they clearing the “Allow the computer to turn this device off” checkbox? On the root hub, the host controller, or your device? Even though your device may not support USB selective suspend (SS), the underlying USB core does ;) and the checkboxes on the root hub and HC toggle this support in the USB core. Prior to Vista, USB SS support in the USB core did create some very weird issues (not that Vista is totally issue free, just that it is much better). I would definitely try to isolate (by toggling the checkbox state) whether USB SS is your root cause.

d

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@bleucanard.f2s.com
Sent: Friday, October 05, 2007 8:59 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] Driver compiled with DDk 3790 works, sometimes fails when compiled with 6000

NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

xxxxx@bleucanard.f2s.com wrote:

>The bus analyzer also shows me I have a lot of CRC errors caused by a dodgy cable, even though they don’t show up in my URB (I never get the USBD_ERROR_CRC).

Failed packets on an interrupt pipe are retried until they succeed.
That’s why you never see the CRC error. You’ll only see that with an
isochronous pipe.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Doron,

That is correct: same OS, same computer, same source; the only difference is the build environment.
The power management tab does not show on our device, only on the root hub, so I guess that is what our customer disabled, but I can’t guarantee it (unfortunately, our resellers in the US are not great at forwarding detailed information, and we don’t have direct access to the customer).

Tim,

Thanks for the clarification.

Pierre

Just rereading my original post, I feel I need to clarify something:

What I mean is that my URB stays in the blocking read; it doesn’t exit with an error status or anything, it just keeps waiting for data.

Pierre

Doh, forgot to mention the OS was XP SP2!

Pierre

From your original post it is also a bit unclear whether you can reproduce the problem yourself or not.

If it really depends on the build environment only, one possibility is a race condition. Race conditions depend on timing, which can be influenced by compiler optimization, and these two DDKs contain different compiler versions. Does it also depend on whether the driver is a debug or a release build?

What I’d do is turn on traces, compare the normal case with the failing one, and find where the code flows differ. Do the same with the USB analyser output, including SOFs and NAKs. If you see USB errors because of a bad cable, change it to avoid misleading data. On the other hand, the errors may be the cause of the problem. Normally there shouldn’t be any errors, and maybe it isn’t the cable but your device. Under boundary conditions, the USB analyser can report correct data while the HC fails to receive it.
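As a rough illustration of the trace comparison suggested above, here is a minimal user-mode sketch that finds the first point where a failing trace departs from a known-good one. The event IDs and the function name `first_divergence` are hypothetical; they are not taken from the driver under discussion, and a real driver would log whatever events are meaningful for it.

```c
#include <stddef.h>

/* Hypothetical event IDs; a real driver would define its own set. */
enum trace_event {
    EV_URB_SUBMIT = 1,   /* interrupt read URB sent down the stack */
    EV_URB_COMPLETE,     /* URB completion routine ran */
    EV_REPORT_QUEUED     /* report handed to the upper (HID) side */
};

/*
 * Returns the index of the first event that differs between the two
 * traces. If one trace is a strict prefix of the other, returns the
 * length of the shorter trace. Returns -1 if both traces are identical.
 */
static long first_divergence(const int *good, size_t good_len,
                             const int *bad, size_t bad_len)
{
    size_t common = good_len < bad_len ? good_len : bad_len;

    for (size_t i = 0; i < common; i++) {
        if (good[i] != bad[i]) {
            return (long)i;
        }
    }
    return good_len == bad_len ? -1 : (long)common;
}
```

The index returned points at the first event worth looking at in the failing log; everything before it matched the working case.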

The dependency on the root hub selective suspend setting seems strange for XP SP2 if you don’t support SS. The root hub can sleep only if all connected devices can sleep, too.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of xxxxx@bleucanard.f2s.com[SMTP:xxxxx@bleucanard.f2s.com]
Reply To: Windows System Software Devs Interest List
Sent: Friday, October 05, 2007 10:22 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Driver compiled with DDk 3790 works, sometimes fails when compiled with 6000

Michal,

The only system on which I can reproduce the problem here is a MacBook Pro running XP under an older version of Boot Camp (1.2).
We have had reports of the same failure on a Dell laptop, which will be available to me on Monday.
Our customer is seeing the problem on HP laptops.
All of these are running XP SP2.

The problem happens with a release build, compiled with the XP Free build environment under 6000, and with the Server 2003 environment under DDK 3790.

The problem happens about 2 or 3 times an hour, sometimes a few seconds after the driver loads up, sometimes 30 minutes later.

I will hopefully know more next week, when I get hold of the Dell laptop. I will make sure I share my findings.

Pierre

Pierre,

Does that mean the problem doesn’t occur with the debug version of the driver?

Well, at least you can reproduce the problem rather quickly. Some problems are reproducible only once per day or per week.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]


From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of xxxxx@bleucanard.f2s.com[SMTP:xxxxx@bleucanard.f2s.com]
Reply To: Windows System Software Devs Interest List
Sent: Friday, October 05, 2007 11:19 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Driver compiled with DDk 3790 works, sometimes fails when compiled with 6000

While you’re waiting, if you’re really convinced that this is due to the DDK/WDK version (I have my doubts), one thing you could do, in addition to what Michal suggested, is check the implementation of any DDK/WDK function you use that is actually a macro (KeGetCurrentIrql on some platforms, for example), since a macro can expand to different code in different DDK/WDK versions. I’m not suggesting that this will be fun or likely fruitful, but if you think the version is the only problem, it’s the best thing I can think of, at least until you get a system.
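One way to see what a header macro actually turns into in a given build is to stringify its expansion. The sketch below is only an illustration: `HYPOTHETICAL_KIT_VERSION`, `GET_LEVEL`, `g_level` and `query_level` are made-up stand-ins for a kit header, not real DDK/WDK definitions. In practice, running the compiler's preprocessor-only mode (for example `cl /P`) over the same source in both build environments and diffing the output gives the same information for every macro at once.

```c
/* Two-step stringification: the argument of EXPANSION_OF is fully
   macro-expanded before STRINGIFY turns it into a string literal. */
#define STRINGIFY(x) #x
#define EXPANSION_OF(x) STRINGIFY(x)

/* Hypothetical stand-in for a kit header whose macro body changed
   between versions. Not a real DDK/WDK definition. */
#ifndef HYPOTHETICAL_KIT_VERSION
#define HYPOTHETICAL_KIT_VERSION 6000
#endif

#if HYPOTHETICAL_KIT_VERSION < 6000
#define GET_LEVEL() (g_level)          /* older header: direct read */
#else
#define GET_LEVEL() (query_level())    /* newer header: function call */
#endif

static int g_level = 2;

static int query_level(void)
{
    return g_level;
}

/* Returns the source text that GET_LEVEL() expands to in this build. */
static const char *get_level_expansion(void)
{
    return EXPANSION_OF(GET_LEVEL());
}
```

The driver source that calls `GET_LEVEL()` is identical in both cases, yet the generated code differs, which is the kind of difference being described here.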

>does it mean the problem doesn’t occur with the debug version of driver?

This points to a race condition, or to a KdPrint or other debug macro switching
off some important code in a release build.
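A minimal sketch of the second case, assuming a trace macro that compiles to nothing in a free build. `MY_TRACE`, `MY_DEBUG_BUILD` and `submit_read` are hypothetical names used for illustration; this is not the real KdPrint definition, only the same general pattern.

```c
#include <stdio.h>

/* Hypothetical debug macro: prints in a debug build, expands to a
   no-op otherwise, so its arguments are never evaluated. */
#ifdef MY_DEBUG_BUILD
#define MY_TRACE(args) printf args
#else
#define MY_TRACE(args) ((void)0)
#endif

static int g_reads_submitted = 0;

/* Stand-in for work the driver must always do. */
static int submit_read(void)
{
    return ++g_reads_submitted;
}

static void buggy_step(void)
{
    /* BUG: the only call to submit_read() is inside the macro
       argument, so it disappears when MY_DEBUG_BUILD is not defined. */
    MY_TRACE(("submitted read #%d\n", submit_read()));
}

static void fixed_step(void)
{
    /* The side effect happens unconditionally; only the message
       depends on the build type. */
    int n = submit_read();
    MY_TRACE(("submitted read #%d\n", n));
    (void)n; /* avoid an unused-variable warning in release builds */
}
```

Built without `MY_DEBUG_BUILD`, `buggy_step()` leaves `g_reads_submitted` unchanged, while `fixed_step()` increments it in both build types.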


Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

I am not sure about the race condition: when the driver is in its non-functioning state, unplugging the device causes the driver to be unloaded (by design). I can see all active threads closing down and all pending IRPs completing properly. My reading URB catches the fact that the board was unplugged.
I would expect the driver to fail to unload if there was a race condition. My reading URB would also fail earlier and not catch the unplug event.

I should know more today when I get my hands on the system.

Pierre

Well, it isn’t a DDK problem after all. The reseller hadn’t sent us the whole debug log, and missed some important information at the end of it that would have ruled it out.
It seems to be a problem at the HID end (sending data to the HID driver on top), rather than a problem with the lower end (i.e. from the USB device).
It looks like I stop receiving IOCTL_HID_READ_REPORT after a while, but that could be because of some packets I send (some code has changed in this part of my driver).
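A symptom like this can be made visible with a few counters. The following is a user-mode model only, not WDM code, and every name in it (`hid_read_stats`, `on_read_request`, `on_device_report`) is hypothetical: it counts read-report requests from the upper driver against reports arriving from the device, and records reports that arrive while no request is pending.

```c
/* Counters a minidriver could keep to spot a stalled upper edge. */
struct hid_read_stats {
    unsigned pending_reads;      /* read-report requests not yet completed */
    unsigned reports_delivered;  /* reports matched to a pending request */
    unsigned reports_unclaimed;  /* reports that found no pending request */
};

/* Call when a read-report request arrives from the upper driver. */
static void on_read_request(struct hid_read_stats *s)
{
    s->pending_reads++;
}

/* Call when the device delivers a report on the interrupt pipe. */
static void on_device_report(struct hid_read_stats *s)
{
    if (s->pending_reads > 0) {
        s->pending_reads--;
        s->reports_delivered++;
    } else {
        /* Data is coming up from the device, but nobody above asked. */
        s->reports_unclaimed++;
    }
}
```

A steadily growing `reports_unclaimed` with `pending_reads` stuck at zero would point at the upper edge rather than the USB side.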

Thanks for all the replies and suggestions,

Pierre