Hello everyone,
I’m currently developing a Windows kernel driver that leverages Windows Filtering Platform (WFP) callouts to inspect and manage network traffic. The driver is built using the latest WDK and Visual Studio toolchain, and registration at the appropriate layers is working as expected.
As part of ongoing optimization, I’m experimenting with different work-queue handling to keep packet processing efficient under sustained throughput. In one test setup, the system is connected to storage and networking hardware over Fibre Channel (FC) fiber links, and I want to make sure my queuing and context management hold up on high-speed data paths.
I’d appreciate any guidance from the community on best practices for managing per-flow contexts, CPU affinity considerations, or recommended patterns for scaling WFP callouts across multiple cores. If needed, I can share trace data or simplified code samples.
Thanks for your time and insights.