Queuing packets in NDIS IM driver receive path for optimal performance

Hi, I would like to understand the best way to optimize performance on the receive path of an NDIS IM MUX driver.

The NDIS IM driver sits between the NIC’s miniport driver and TCP/IP and processes the packets that pass through it. On the receive path, a single call to ProtocolReceiveXXX() can deliver a chain of several NBLs. The protocol edge has to hand the received packets to my packet processing logic before it indicates the frames to the upper layers, and each packet may be processed differently. How should such a chain be handled for best performance? Should each packet be indicated on its own as soon as it has been processed, or should the processed packets be linked back together and indicated as one list? And how can this processing be made asynchronous with respect to the ProtocolReceiveXXX() call?

Thanks!

Does receive throttling apply to NDIS IM drivers too? https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ndis/ns-ndis-_ndis_receive_throttle_parameters

In a MUX IM driver, what is the correct way to queue packets in the ProtocolReceiveXXX callback routine? The protocol edge looks up the MAC address in each NBL and forwards the NBL to the appropriate miniport for indication. It is possible that only a subset of the received NBLs needs to be indicated after processing, so I would like to queue the NBLs for a bounded time. Is there a documented upper bound on how long an IM driver may hold them?
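
To make the question concrete, here is roughly the demultiplexing step I have in mind, written as plain user-mode C rather than real driver code. Everything in it is a made-up stand-in: `Nbl` mimics only the chain link of a NET_BUFFER_LIST, and `Vnic`, `NblChain`, `lookup_vnic`, and `demux_chain` are hypothetical names. It walks the received chain once and relinks each NBL onto a per-miniport chain (or an "unmatched" chain), preserving arrival order.

```c
#include <stddef.h>
#include <string.h>

#define MAX_VNICS 4   /* hypothetical upper bound on virtual miniports */

/* Stand-in for NET_BUFFER_LIST: only the link and the field this sketch needs. */
typedef struct Nbl {
    struct Nbl *next;          /* plays the role of NET_BUFFER_LIST_NEXT_NBL() */
    unsigned char dst_mac[6];  /* assumed to be already parsed from the frame */
} Nbl;

/* Hypothetical per-virtual-miniport record. */
typedef struct { unsigned char mac[6]; } Vnic;

/* Head/tail pair so that appending is O(1) and keeps arrival order. */
typedef struct { Nbl *head; Nbl *tail; size_t count; } NblChain;

static void chain_append(NblChain *c, Nbl *n)
{
    n->next = NULL;
    if (c->tail) c->tail->next = n; else c->head = n;
    c->tail = n;
    c->count++;
}

/* Linear MAC lookup; returns the index of the owning vnic or -1 if none matches. */
static int lookup_vnic(const Vnic *vnics, int n_vnics, const unsigned char *mac)
{
    for (int i = 0; i < n_vnics; i++)
        if (memcmp(vnics[i].mac, mac, 6) == 0) return i;
    return -1;
}

/* Split one received chain into per-vnic chains plus a chain of unmatched NBLs. */
static void demux_chain(Nbl *rx, const Vnic *vnics, int n_vnics,
                        NblChain per_vnic[], NblChain *unmatched)
{
    while (rx) {
        Nbl *next = rx->next;  /* save the link before chain_append clears it */
        int idx = lookup_vnic(vnics, n_vnics, rx->dst_mac);
        chain_append(idx >= 0 ? &per_vnic[idx] : unmatched, rx);
        rx = next;
    }
}
```

In the driver, my assumption is that each non-empty per-miniport chain would go up in a single NdisMIndicateReceiveNetBufferLists call and the unmatched chain would be handed back with NdisReturnNetBufferLists, and that none of this holding or relinking is allowed when the lower miniport indicated with NDIS_RECEIVE_FLAGS_RESOURCES set. Please correct me if that is wrong.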