I’m currently developing a driver for an XMC device that operates via a carrier in a PCIe slot. I’ve encountered a performance issue with DMA writes (device to RAM) when two devices are installed, even though only one device is running.
Setup Details:
Each device uses a common buffer for DMA operations: the driver writes a scatter-gather list of descriptors to the common buffer, then tells the hardware where to find it. The hardware reads through the descriptors and transfers the data.
Each device has its own set of common buffers (one for reading, one for writing), which are created during driver initialization.
Both devices share the same interrupt (IRQ number is identical for both).
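To make the descriptor setup concrete, here is a minimal user-space sketch of filling a common buffer from a scatter-gather list. The descriptor layout (`dma_desc`), the `DESC_FLAG_LAST` bit, and the function name `build_desc_chain` are all hypothetical stand-ins for illustration; the real hardware format and driver code differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout; the real hardware format will differ. */
typedef struct {
    uint64_t host_addr;  /* bus address of the data fragment in RAM */
    uint32_t length;     /* bytes to transfer for this fragment */
    uint32_t flags;      /* DESC_FLAG_LAST marks the end of the chain */
} dma_desc;

#define DESC_FLAG_LAST 0x1u  /* assumed end-of-chain marker */

/* One scatter-gather fragment as handed to the driver by the OS. */
typedef struct {
    uint64_t addr;
    uint32_t length;
} sg_fragment;

/*
 * Write one descriptor per fragment into the common buffer.
 * Returns the number of descriptors written, or 0 if the list is
 * empty or does not fit in the buffer.
 */
size_t build_desc_chain(dma_desc *common_buf, size_t capacity,
                        const sg_fragment *frags, size_t count)
{
    assert(common_buf != NULL && frags != NULL);
    if (count == 0 || count > capacity)
        return 0;
    for (size_t i = 0; i < count; i++) {
        common_buf[i].host_addr = frags[i].addr;
        common_buf[i].length = frags[i].length;
        /* Only the final descriptor carries the end-of-chain flag. */
        common_buf[i].flags = (i + 1 == count) ? DESC_FLAG_LAST : 0;
    }
    return count;
}
```

After a chain like this is built, the driver hands the common buffer's starting address to the hardware, which is the point where the timing in question begins.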
Issue Observed:
The slowdown occurs between providing the hardware with the common buffer’s starting address and receiving the interrupt indicating transfer completion.
The Interrupt Service Routine (ISR) promptly checks which device triggered the interrupt and exits immediately if the interrupt did not come from its own device, which should rule out extra ISR calls as the cause.
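The early-exit check can be sketched roughly as follows. This is a user-space simulation, not the actual driver code: the `irq_status` field stands in for a memory-mapped status register, and `IRQ_STATUS_DMA_DONE`, `device_ctx`, and `shared_isr` are hypothetical names. Real hardware may also acknowledge interrupts differently (for example, write-1-to-clear).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IRQ_STATUS_DMA_DONE 0x1u  /* hypothetical "DMA complete" bit */

typedef struct {
    volatile uint32_t irq_status;  /* stand-in for a device status register */
    unsigned claimed;              /* interrupts this device handled */
    unsigned passed;               /* ISR calls that were not for this device */
} device_ctx;

/*
 * Handler for a shared interrupt line. Returns true if this device
 * raised the interrupt, false so the next handler on the line can run.
 */
bool shared_isr(device_ctx *dev)
{
    assert(dev != NULL);
    uint32_t status = dev->irq_status;
    if ((status & IRQ_STATUS_DMA_DONE) == 0) {
        dev->passed++;   /* not ours: exit without touching the device */
        return false;
    }
    /* Acknowledge by clearing the bit (assumed behavior for this sketch). */
    dev->irq_status = status & ~IRQ_STATUS_DMA_DONE;
    dev->claimed++;
    return true;
}
```

The `claimed` and `passed` counters are not part of the original design; they are one inexpensive way to confirm how often each device's ISR is entered for an interrupt that belongs to the other device.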
The performance disparity is significant: a normal transfer takes about 0.0006 seconds (0.6 ms), whereas with two devices installed it takes about 0.01 seconds (10 ms), roughly 17 times longer. This affects only one device; the other runs fine. When we swap the devices, the slowness follows the slot, not the device, and we have seen this on multiple machines of different types.
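For reference, the figures above come from timestamping the start of the transfer and the completion interrupt. A minimal sketch of that arithmetic is below, written as user-space C with `struct timespec`; in a kernel driver the timestamps would come from the platform's high-resolution counter instead. The function names `elapsed_seconds` and `slowdown_factor` are hypothetical.

```c
#include <assert.h>
#include <time.h>

/* Seconds between two timestamps, handling nanosecond borrow. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* How many times slower the observed transfer is than the baseline. */
double slowdown_factor(double baseline_s, double observed_s)
{
    assert(baseline_s > 0.0);
    return observed_s / baseline_s;
}
```

With the measured values, `slowdown_factor(0.0006, 0.01)` comes out a little under 17.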
Additional Context:
This issue hasn’t arisen with our other devices, which typically follow a “base” and “channel” design because they have multiple channels. In this case, however, there’s only one channel, so the base and channel designs were merged.
I’m at a loss as to what could be causing this slowdown for one device but not the other, especially since there doesn’t appear to be any additional driver code executing during this time. Any insights or advice on this matter would be greatly appreciated!