RE: USB interrupt routine never gets called.

Hi,

> When you say “NO Interrupt”, do you mean that your completion routine
> is not called? The IRP just hangs forever?

That’s exactly what happens. The bigger question is: why does the IRP
come back if the receive buffer is small, but hang when I provide a much
bigger buffer?
I am using a USB sniffer, which acts as a filter driver and analyzes the
IRPs passed between my device driver and the lower layer. But after the
host controller has received the last packet of the INT-IN transfer, it
does not return the IRP to my driver.

> Can you post the code that prepares and submits the URB…

I am working on a USB library, so I will try to copy and paste the most
relevant parts.
Here you go:

=========== SNIP ============
NTSTATUS OnInterrupt(PDEVICE_OBJECT junk,
                     PIRP irp,
                     PVOID pdx);

NTSTATUS StartInterrupt(PDEVICE_OBJECT fdo)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;

    /*
    This stuff was done sometime before:
    pdx->irp = IoAllocateIrp(pdx->nextFdo->StackSize, FALSE);
    pdx->urb = (PURB)ExAllocatePool( NonPagedPool,
                   sizeof(struct _URB_BULK_OR_INTERRUPT_TRANSFER));
    pdx->inBuffer = ExAllocatePool(…) // sufficient space
    */

    UsbBuildInterruptOrBulkTransferRequest(
        pdx->urb,
        sizeof(struct _URB_BULK_OR_INTERRUPT_TRANSFER),
        pdx->handle,
        pdx->inBuffer,
        NULL,
        pdx->inBufLen,
        USBD_TRANSFER_DIRECTION_IN | USBD_SHORT_TRANSFER_OK,
        NULL);

    IoSetCompletionRoutine( pdx->irp,
                            (PIO_COMPLETION_ROUTINE) OnInterrupt,
                            pdx,
                            TRUE,
                            TRUE,
                            TRUE);

    PIO_STACK_LOCATION nextStack = IoGetNextIrpStackLocation(irp);

    nextStack->MajorFunction = IRP_MJ_INTERNAL_DEVICE_CONTROL;
    nextStack->Parameters.DeviceIoControl.IoControlCode =
        IOCTL_INTERNAL_USB_SUBMIT_URB;
    nextStack->Parameters.Others.Argument1 = (PVOID)urb;

    return IoCallDriver(pdx->nextFdo, irp);
}

=========== SNAP ============

If you need further information, please let me know.

Best regards
Emanuel Eick

Eick, Emanuel wrote:

> I am using a USB sniffer, which acts as a filter driver and
> analyzes the IRPs passed between my device driver and the lower layer.

Do you get the same behavior without the sniffer? The filter driver USB
sniffers are supposed to be invisible, but I know from painful
experience that it isn’t always the case. USB MIDI devices, for
example, simply do not work properly if a filter is present.

> I am working on a USB library, so I will try to copy and paste the most
> relevant parts.
> Here you go:


> If you need further information, please let me know.

Well, this isn’t really very helpful, because it isn’t the real code.
You initialize pdx->irp and pdx->urb, but then you switch to using irp
and urb, and you didn’t show the completion code at all.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Hi,

Now it’s getting really odd! I replaced my USB driver with the generic one provided by Cypress (cyusb.sys) for their EZ-USB kit. You can use this driver by changing the VID/PID to a Cypress one.

I then set up an INT-IN transfer with a TransferBuffer of 0x2000 bytes and triggered the device to send 0x1101 bytes --> the IRP comes back. Next I sent 0x10FF bytes, and the IRP hangs. That is exactly the same behavior I get with my own implementation.

And yes, I ran my tests both with and without a USB analyzer.
I have the feeling that there is some kind of bug in one of the lower drivers. Next I will try the same experiments on another computer to see whether it reacts the same way.
Maybe I will write a small extra driver to test exactly that behavior.

For now, thank you to everyone who has tried to help so far…

Best regards
Emanuel Eick

OK, I searched a little more and came to this conclusion:
The buffer size specified in UsbBuildInterruptOrBulkTransferRequest() is managed in 4K blocks. The host controller only returns the IRP if the last provided 4K block contains data (1 byte is sufficient). This has the following consequences:

Buffer b = 4K:
Device sending x <= 4K --> the IRP returns.

Buffer 4K < b <= 8K:
Device sending x <= 4K --> the IRP hangs forever.

Buffer 4K < b <= 8K:
Device sending 4K < x <= 8K --> the IRP returns.

Maybe someone can confirm this behavior or correct me.

Interestingly, Vista does not have this restriction. There you can set up an INT-IN transfer with 8K, for example, and the IRP still returns if the device provides only 1 byte.

Best regards
Emanuel Eick

xxxxx@siemens.com wrote:

> OK, I searched a little more and came to this conclusion:
> The buffer size specified in UsbBuildInterruptOrBulkTransferRequest() is managed in 4K blocks. The host controller only returns the IRP if the last provided 4K block contains data (1 byte is sufficient). This has the following consequences:
>
> Buffer b = 4K:
> Device sending x <= 4K --> the IRP returns.
>
> Buffer 4K < b <= 8K:
> Device sending x <= 4K --> the IRP hangs forever.
>
> Buffer 4K < b <= 8K:
> Device sending 4K < x <= 8K --> the IRP returns.
>
> Maybe someone can confirm this behavior or correct me.
>
> Interestingly, Vista does not have this restriction. There you can set up an INT-IN transfer with 8K, for example, and the IRP still returns if the device provides only 1 byte.

Well, I can assure you that I have never seen this effect. I rarely use
interrupt pipes, but bulk pipes use the same code, and I’ve used them
extensively. What interval are you specifying in your endpoint descriptor?

In the last message, you revealed for the first time that you are using
an FX2. How are you sending the data? Are you using slave mode, or are
you generating it in the firmware? The FX2 FIFOs are only 4k bytes
long, so it takes special handling to send more than that. How are you
triggering the transmission of the short packet?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> The host controller only returns the IRP if the last provided 4K block
> contains data (1 byte is sufficient). This has the following
> consequences:
>
> Buffer b = 4K:
> Device sending x <= 4K --> the IRP returns.
>
> Buffer 4K < b <= 8K:
> Device sending x <= 4K --> the IRP hangs forever.
>
> Buffer 4K < b <= 8K:
> Device sending 4K < x <= 8K --> the IRP returns.
>
> Interestingly, Vista does not have this restriction.

If on Windows Vista your driver receives exactly as much data as your device firmware sends in all of these cases (and this is confirmed by a hardware bus trace), I would assume for now that a problem with the device is not likely to be the issue here. (If there were a problem with the device, my first guess would be to make sure the data toggles are all correct. An incorrect data toggle will result in the host ACKing the packet from the device but silently discarding it.)

Is the device operating in high-speed mode or full-speed mode when you are testing this? I tried to duplicate what you are reporting on Windows XP SP2 with a full-speed device attached to a UHCI controller and did not see what you are reporting. I have not yet tried this on Windows XP SP2 with a high-speed device.

> The host controller only returns the IRP if the last provided 4K block
> contains data (1 byte is sufficient). This has the following
> consequences:
>
> Buffer b = 4K:
> Device sending x <= 4K --> the IRP returns.
>
> Buffer 4K < b <= 8K:
> Device sending x <= 4K --> the IRP hangs forever.
>
> Buffer 4K < b <= 8K:
> Device sending 4K < x <= 8K --> the IRP returns.
>
> Maybe someone can confirm this behavior or correct me.
>
> Interestingly, Vista does not have this restriction.

I can confirm that I can also observe this behavior with a high-speed Interrupt IN endpoint on Windows XP SP2 but not on Windows Vista.

This is related to the Windows XP SP2 EHCI host controller logic which splits Interrupt transfers into 4KB segments and queues multiple segments on the hardware at the same time. If a short packet terminates one of the 4KB segments that is not the last 4KB segment then the remaining segments of the transfer remain queued on the host controller hardware and are not removed and not completed (unless sufficient additional data arrives that logically belongs to the next transfer).

The Windows Vista EHCI host controller logic intentionally differs in this area in such a way as to avoid this problem.

Hi everyone,

I think your answer (@Glen Slick) matches my problem exactly. Therefore I would say “case closed”.

@Tim Roberts: you are right. We used the FX2 board during our USB chipset evaluation, but we have since switched to an ATMEL ARM9. Anyway, you can use the Cypress generic driver for any device, as long as you change its VID/PID to a Cypress one. Using this “trick”, I replaced our own driver with cyusb.sys and examined its INT-IN transfer behavior.

But now that I know that the last provided 4K segment always has to be filled (at least partly), I am thinking about a solution (… I never thought that I would wish everyone used Vista …).
By design, our device can provide anything from a few bytes up to 4K+X bytes. This data needs to be delivered with low latency under all circumstances. For this reason we decided to use INT-IN transfers and provide a “big” receive buffer (for instance 32K). After thinking for a few minutes about the newly arisen problem, I came up with this:

  1. Let the device pad all INT-IN transfers to fill the full buffer. Not good: we would block the endpoint all the time sending null data. What if we already have new data…
  2. Change the concept to always provide 4K receive buffers and resubmit them continuously. In this case I have to keep track of which data belongs to which interrupt. OK, but not absolutely low latency.
  3. Force our customers to use Vista (maybe a little too early; we may talk about this again in a few years :-) )

Does anyone have other ideas? Workarounds are welcome.

I cannot say it often enough: thank you for all the responses.

Best regards
Emanuel Eick

xxxxx@siemens.com wrote:

> But now that I know that the last provided 4K segment always has to be filled (at least partly), I am thinking about a solution (… I never thought that I would wish everyone used Vista …).
> By design, our device can provide anything from a few bytes up to 4K+X bytes. This data needs to be delivered with low latency under all circumstances. For this reason we decided to use INT-IN transfers and provide a “big” receive buffer (for instance 32K).

The lowest latency is provided by bulk pipes. That way you don’t tie up
bandwidth that you aren’t going to be using, and you avoid this 4k
problem altogether.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.