Sending Chained NBLs from a WFP Callout Works Once, Then FwpsInjectMacSendAsync Returns STATUS_FWP_TCPIP_NOT_READY

Hello, I have run into another “odd” issue when sending chained net buffer lists (NBLs) on the FWPM_LAYER_INBOUND_MAC_FRAME_NATIVE or FWPM_LAYER_OUTBOUND_MAC_FRAME_NATIVE layers. The first call to FwpsInjectMacSendAsync sends the NBL chain just fine: the packets are verified correct and there are no memory issues (e.g. with freeing memory and the like). However, every subsequent attempt to send another NBL chain makes FwpsInjectMacSendAsync return STATUS_FWP_TCPIP_NOT_READY until the machine is rebooted.
I have verified that this behavior is the same on several different systems, including virtual machines. Without posting a ton of code, the general concept of what is being implemented is as follows:

1.) A set of programmatically constructed test packets (raw data blocks), each already verified individually, is sent to the driver via an IRP as a queue of packets.
2.) For each packet in the queue: a new block of non-paged pool memory is allocated, an MDL is built over that block, an NBL is allocated, and the packet data is copied into the net buffer of the NBL.
3.) Starting from the first NBL, each additional NBL is linked to the previous NBL’s next pointer (i.e. PreviousNBLptr->next = CurrentNBL).
4.) As each NBL is returned to the completion function, that NBL, its MDL, and the associated pool memory are released/freed.
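
For readers who want steps 2 to 4 in concrete form, here is a minimal user-mode sketch of the same build/link/release pattern. It is only a model under stated assumptions: MockNbl, mock_nbl_create, build_chain and complete_chain are hypothetical names, malloc/free stand in for the non-paged pool allocations, and no NDIS or WFP call is made. It is not the actual driver code, and it assumes the completion side may be handed a chain rather than a single NBL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-mode stand-in for NET_BUFFER_LIST; not the NDIS type. */
typedef struct MockNbl {
    struct MockNbl *next;   /* plays the role of the NBL "next" link        */
    unsigned char  *data;   /* plays the role of the pool block behind MDL  */
    size_t          length; /* number of payload bytes copied in            */
} MockNbl;

/* Step 4: walk whatever chain is handed back and free every element.
 * The next pointer is saved BEFORE the current element is freed, and the
 * link is cleared so no freed element still points into the chain. */
static size_t complete_chain(MockNbl *chain)
{
    size_t freed = 0;
    while (chain != NULL) {
        MockNbl *next = chain->next;
        chain->next = NULL;
        free(chain->data);
        free(chain);
        chain = next;
        ++freed;
    }
    return freed;
}

/* Step 2: one allocation per packet, payload copied into it. */
static MockNbl *mock_nbl_create(const unsigned char *packet, size_t length)
{
    MockNbl *nbl = malloc(sizeof *nbl);
    if (nbl == NULL)
        return NULL;
    nbl->data = malloc(length ? length : 1);
    if (nbl->data == NULL) {
        free(nbl);
        return NULL;
    }
    memcpy(nbl->data, packet, length);
    nbl->length = length;
    nbl->next = NULL;
    return nbl;
}

/* Step 3: link each new element to the previous element's next pointer. */
static MockNbl *build_chain(const unsigned char *const *packets,
                            const size_t *lengths, size_t count)
{
    MockNbl *head = NULL, *tail = NULL;
    assert(packets != NULL && lengths != NULL);
    for (size_t i = 0; i < count; ++i) {
        MockNbl *nbl = mock_nbl_create(packets[i], lengths[i]);
        if (nbl == NULL) {
            complete_chain(head);   /* unwind the partial chain */
            return NULL;
        }
        if (tail != NULL)
            tail->next = nbl;
        else
            head = nbl;
        tail = nbl;
    }
    return head;
}
```

The point of the sketch is the ordering in complete_chain: each element's link is read before that element is released, so the walk never touches freed memory regardless of how many elements arrive in one completion call.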

The first time the chained NBLs are sent, during a kernel debug session, there are no errors, the packets are sent, and all associated memory gets released properly.
On every attempt after the first successful send, FwpsInjectMacSendAsync returns STATUS_FWP_TCPIP_NOT_READY.

The callouts used (on FWPM_LAYER_INBOUND_MAC_FRAME_NATIVE and FWPM_LAYER_OUTBOUND_MAC_FRAME_NATIVE) are both initialized with the FWP_CALLOUT_FLAG_ALLOW_L2_BATCH_CLASSIFY flag set.

Each NBL has one net buffer and one MDL assigned to it: [NBL]->[NB]->[MDL]->[npp memblk]
Currently each packet is assigned its own non-paged pool memory block and MDL. Should I instead allocate one contiguous block of non-paged pool memory and then assign segments of it to each NB’s MDL?
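
To illustrate what the contiguous-block alternative would involve, here is a small user-mode sketch that packs all packets back to back into one allocation and records the offset and length of each packet's slice. Segment and pack_contiguous are hypothetical names, malloc stands in for a single non-paged pool allocation, and the step of describing each slice with its own MDL is deliberately left out. Nothing here suggests that this layout is required, or that it affects the STATUS_FWP_TCPIP_NOT_READY result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor for one packet's slice of the shared block. */
typedef struct Segment {
    size_t offset;  /* byte offset of the packet inside the block */
    size_t length;  /* packet length in bytes                     */
} Segment;

/* Copies the packets back to back into one allocation and fills segs[i]
 * with the location of packet i. Returns the block, or NULL on failure.
 * The caller frees the block exactly once, after every slice is done. */
static unsigned char *pack_contiguous(const unsigned char *const *packets,
                                      const size_t *lengths, size_t count,
                                      Segment *segs)
{
    size_t total = 0;
    assert(packets != NULL && lengths != NULL && segs != NULL);
    for (size_t i = 0; i < count; ++i)
        total += lengths[i];

    unsigned char *block = malloc(total ? total : 1);
    if (block == NULL)
        return NULL;

    size_t offset = 0;
    for (size_t i = 0; i < count; ++i) {
        memcpy(block + offset, packets[i], lengths[i]);
        segs[i].offset = offset;
        segs[i].length = lengths[i];
        offset += lengths[i];
    }
    return block;
}
```

One consequence of this design is that the shared block cannot be freed until the last slice has completed, so the completion path would need some form of outstanding-slice count. The per-packet allocation described above avoids that bookkeeping, since each NBL owns its memory outright.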
Since there is not much information on sending chained NBLs (or really any guidelines as it pertains not to NDIS but to WFP), I am assuming that one only needs to “link” NBLs through the “next” pointer and that is it (the ParentNetBufferList pointer, as I understand it, is used for cloned NBLs and such).

Has anyone run across this issue, or does anyone have any additional thoughts as to why the TCPIP portion of WFP just stops working after sending one NBL chain?
Since I am seeing no errors prior to the first send or during the completion callbacks, I am sort of at a loss as to what could be wrong.
Are there any additional steps, flags, or the like that one must set on the NBLs, or a function call that must be made to signal completion of the send?

Any input is welcome.