I am currently using logman to trace NDIS commands, but it does not give enough information about why, after several hours, RNDIS decides to send a USB STALL, then a clear endpoint, and then reset the USB.
I have enabled NDIS debug messages via the Registry. How can I see or log these messages from NDIS?
I'm not sure what you are trying to trace. NDIS 'commands' is not standard terminology; at least in my mind.
IIRC Wireshark uses a protocol driver to capture packets, which normally gives you exactly the information you need, though there are exceptions. But I suspect that for whatever problem you are looking at, the actual network traffic isn't an important part. Is it probably something between NDIS and a USB network adapter of some kind?
Host applications do not SEND a STALL. A USB stall happens when a device misbehaves, by timeout or protocol violation, and it's up to the host to CLEAR the STALL.
Agreed, I'm trying to figure out why the HOST/WIN10 is sending a STALL. A USB expert reviewed the Beagle 480 USB trace and stated that the Host is sending the STALL. Looking at several of the previous transactions, nothing stood out. All transactions look valid. When this occurs is very random: anywhere from 15 minutes to 10+ hours of operation without issues, then a STALL.
If the same device is connected to WIN7, no issues.
The STALL occurs on a Control Transfer (Endpoint 0) during the data phase of an OUT. So, there is no data being sent by the device, just the Host. The time between the two requests is in the microseconds, so it is not a timeout.
The main issue is how to trace these transactions within RNDIS driver or NDIS to see what WIN10 is having issues with.
The current method is to use an external USB traffic sniffer. Beagle 480 made by TotalPhase.
All USB transactions are captured with time stamp.
The USB transaction that has the issue is a Control Transfer (OUT). The data phase of the Control Transfer is getting a STALL from the Host/WIN10. Several people have reviewed the trace and say it is WIN10 generating the STALL.
The Control Transfer (setup/data) is an RNDIS KEEP_ALIVE message. It's the data phase that has the random STALL.
For example, RNDIS connects and the user can ping the IP address of the device. Everything works, etc. But sometimes, after maybe 10+ hours or in less than 15 minutes, a random STALL is generated.
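For anyone comparing their capture against the expected bytes, here is a minimal sketch of what the keepalive control transfer should carry. It assumes the standard RNDIS layout (a 12-byte little-endian REMOTE_NDIS_KEEPALIVE_MSG sent via the CDC SEND_ENCAPSULATED_COMMAND class request) and assumes the RNDIS control interface is interface 0. The function names are made up for illustration and are not from any driver.

```python
import struct

# Message type values as defined for RNDIS.
RNDIS_MSG_KEEPALIVE = 0x00000008
RNDIS_MSG_KEEPALIVE_CMPLT = 0x80000008

# CDC class request used to carry RNDIS control messages to the device.
SEND_ENCAPSULATED_COMMAND = 0x00


def build_keepalive(request_id):
    """Return the 12-byte keepalive message: MessageType, MessageLength, RequestId."""
    return struct.pack("<III", RNDIS_MSG_KEEPALIVE, 12, request_id)


def build_setup_packet(interface, length):
    """Return the 8-byte SETUP packet for SEND_ENCAPSULATED_COMMAND.

    bmRequestType 0x21 = host-to-device, class request, interface recipient.
    """
    return struct.pack("<BBHHH", 0x21, SEND_ENCAPSULATED_COMMAND, 0, interface, length)


def parse_header(data):
    """Decode the first three little-endian fields of an RNDIS control message."""
    msg_type, msg_len, request_id = struct.unpack_from("<III", data)
    return {"type": msg_type, "length": msg_len, "request_id": request_id}


if __name__ == "__main__":
    msg = build_keepalive(request_id=1)
    setup = build_setup_packet(interface=0, length=len(msg))  # interface 0 is an assumption
    print("SETUP:", setup.hex(" "))
    print("DATA :", msg.hex(" "))
```

If the SETUP and DATA bytes in the analyzer match this layout for both the good and the failing transfers, the difference is more likely in device state than in what the host sent.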
Several test setups:
Do not generate any IP traffic. Goal is to just test Control Transfers. Maybe after 1K Control Transfers issuing RNDIS_KEEP_ALIVE a STALL is generated.
Generate IP traffic. STALLs still occur randomly.
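With captures that run for hours, it can help to pull out only the transactions around each STALL. Below is a rough sketch of that idea. The record format (a dict with `t_us`, `ep` and `pid`) is hypothetical and is not the Beagle export format, so the capture would have to be converted first. The sample records are invented purely to show the output shape.

```python
from collections import deque


def stall_contexts(records, context=3):
    """Yield (stall, preceding, gap_us) for every STALL handshake.

    records: iterable of dicts with keys 't_us' (timestamp in microseconds),
             'ep' (endpoint number) and 'pid' (packet ID name).
             This layout is an assumption for illustration only.
    """
    recent = deque(maxlen=context)  # sliding window of the last few packets
    for rec in records:
        if rec["pid"] == "STALL":
            # Time since the packet immediately before the STALL, if any.
            gap_us = rec["t_us"] - recent[-1]["t_us"] if recent else None
            yield rec, list(recent), gap_us
        recent.append(rec)


# Invented sample: a control OUT data stage on endpoint 0 ending in a STALL.
sample = [
    {"t_us": 0, "ep": 0, "pid": "SETUP"},
    {"t_us": 5, "ep": 0, "pid": "ACK"},
    {"t_us": 12, "ep": 0, "pid": "OUT"},
    {"t_us": 14, "ep": 0, "pid": "DATA1"},
    {"t_us": 16, "ep": 0, "pid": "STALL"},
]

if __name__ == "__main__":
    for stall, before, gap in stall_contexts(sample):
        print("STALL at", stall["t_us"], "us on EP", stall["ep"], "gap", gap, "us")
        for rec in before:
            print("  before:", rec["t_us"], rec["pid"])
```

Running this over both test setups and comparing the packets that precede each STALL is one way to check whether the failures share a pattern.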
XP/WIN7/Linux have been used for years without issue. WIN10 has this issue.
The goal is to figure out why WIN10 generates the STALL. I have used logman/TraceView and can see NDIS commands and USB messages, but nothing stands out when this issue occurs. What is a better trace tool for Windows?
You are misreading the transcript. The host never issues a STALL. Quoting from Section 8.4.5 of the USB 2.0 Specification:
The host is not permitted to return a STALL under any condition.
You said "the data phase of the Control Transfer is getting a STALL from the Host". No, what's happening here is that the host sent a Control Transfer and the device returned a STALL. One of the places the device is allowed to STALL is after the data phase of an OUT transfer. It means the device was busy or does not support the request.
However, a STALL on the Control Endpoint is just a standalone event that does not change the state. Nothing is required in order to clear a Control Endpoint stall (unlike with the other endpoint types), so things should just continue to work.
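To make the direction rule concrete, here is a small sketch that encodes which side may send each handshake, as I read Section 8.4.5. The names are illustrative only. The point is that a STALL seen on the wire can only have come from the device, whatever the analyzer display seems to suggest.

```python
from enum import Enum


class Handshake(Enum):
    ACK = "ACK"
    NAK = "NAK"
    STALL = "STALL"
    NYET = "NYET"


# The host only ever returns ACK (in reply to IN data); it never NAKs or STALLs.
HOST_HANDSHAKES = {Handshake.ACK}
# The device may return any of these (NYET applies to high-speed only).
DEVICE_HANDSHAKES = {Handshake.ACK, Handshake.NAK, Handshake.STALL, Handshake.NYET}


def possible_senders(handshake):
    """Return which side(s) of the bus are allowed to send this handshake."""
    senders = []
    if handshake in HOST_HANDSHAKES:
        senders.append("host")
    if handshake in DEVICE_HANDSHAKES:
        senders.append("device")
    return senders


def data_stage_handshake_sender(direction):
    """Return who handshakes a data packet: the receiver of that data."""
    if direction == "OUT":  # host sent the data, so the device replies
        return "device"
    if direction == "IN":   # device sent the data, so the host replies
        return "host"
    raise ValueError("direction must be 'IN' or 'OUT'")


if __name__ == "__main__":
    print(possible_senders(Handshake.STALL))
    print(data_stage_handshake_sender("OUT"))
```

For the control OUT keepalive discussed here, both checks point at the device: it is the receiver of the data stage, and it is the only side allowed to STALL.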
Fixed the STALL issue. WIN10 has been working for the past month.
Testing on WIN7, I now get a random USB reset that takes several days to occur. Looking at the RNDIS traffic, there are no outstanding packets. All the RNDIS packets look OK. WIN7 does not send constant REMOTE_NDIS_KEEPALIVE_MSG requests.
There is nothing in the windows event log. What is the best way to trace this issue?