Problem with the completion port?

I have extended the scanner sample (I don’t do any scanning in user mode right now; I just send queries to user mode and wait for the answers) and I have hit a particularly nasty problem. When there is little IRP traffic (non-intensive I/O), there are no problems. When there is a burst of I/O activity on what I monitor, all of those I/O operations freeze, and so does the user mode application. Things only go back to normal when I both disable the driver and try to close the user mode application; doing either one alone doesn’t work.
There aren’t any problems if I don’t query user mode.

I have also noticed that the program terminates prematurely, without any error, when I spawn a large number of threads at startup (more than about 20, whereas the scanner sample works with 2-64 threads). This leads me to think that too many threads trying to connect to the I/O completion port overloads something in the program.

Am I blocking a particular program or service that the I/O completion mechanism depends on? (I am making sure not to block the user mode threads.) My routines that work with the completion port barely differ from those in the sample’s user mode application.
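
For readers who don't have the sample at hand, here is a minimal sketch of the kind of user mode worker loop being discussed, modeled on the scanner sample. It is not my exact code. The SKETCH_* names, SketchWorker and the always-allow reply are hypothetical placeholders, and it assumes the port was already opened with FilterConnectCommunicationPort, associated with the completion port through CreateIoCompletionPort, and that an initial FilterGetMessage was posted for each message buffer. It builds only on Windows and links against fltLib.lib.

```c
#include <windows.h>
#include <fltUser.h>   /* link with fltLib.lib */

/* Hypothetical payloads; they must match what the driver sends and expects. */
typedef struct _SKETCH_NOTIFICATION {
    ULONG BytesToScan;
    UCHAR Contents[1024];
} SKETCH_NOTIFICATION;

typedef struct _SKETCH_REPLY {
    BOOLEAN SafeToOpen;
} SKETCH_REPLY;

/* One receive buffer: header first, then the payload, then the OVERLAPPED
   that FilterGetMessage uses for the asynchronous receive. */
typedef struct _SKETCH_MESSAGE {
    FILTER_MESSAGE_HEADER MessageHeader;
    SKETCH_NOTIFICATION   Notification;
    OVERLAPPED            Ovlp;
} SKETCH_MESSAGE;

typedef struct _SKETCH_REPLY_MESSAGE {
    FILTER_REPLY_HEADER ReplyHeader;
    SKETCH_REPLY        Reply;
} SKETCH_REPLY_MESSAGE;

typedef struct _SKETCH_THREAD_CONTEXT {
    HANDLE Port;        /* from FilterConnectCommunicationPort */
    HANDLE Completion;  /* from CreateIoCompletionPort */
} SKETCH_THREAD_CONTEXT;

DWORD WINAPI SketchWorker(LPVOID Param)
{
    SKETCH_THREAD_CONTEXT *ctx = (SKETCH_THREAD_CONTEXT *)Param;

    for (;;) {
        DWORD bytes = 0;
        ULONG_PTR key = 0;
        LPOVERLAPPED ovlp = NULL;
        SKETCH_MESSAGE *msg;
        SKETCH_REPLY_MESSAGE reply;
        HRESULT hr;

        /* Wait for one of the posted FilterGetMessage requests to complete. */
        if (!GetQueuedCompletionStatus(ctx->Completion, &bytes, &key,
                                       &ovlp, INFINITE)) {
            return GetLastError();
        }
        msg = CONTAINING_RECORD(ovlp, SKETCH_MESSAGE, Ovlp);

        /* Answer the driver; MessageId ties the reply to the pending send. */
        ZeroMemory(&reply, sizeof(reply));
        reply.ReplyHeader.Status = 0;
        reply.ReplyHeader.MessageId = msg->MessageHeader.MessageId;
        reply.Reply.SafeToOpen = TRUE;   /* placeholder decision */

        /* Size is header + payload rather than sizeof(reply), so that
           structure padding is not counted as reply data. */
        hr = FilterReplyMessage(ctx->Port, &reply.ReplyHeader,
                                sizeof(FILTER_REPLY_HEADER) + sizeof(SKETCH_REPLY));
        if (FAILED(hr)) {
            return (DWORD)hr;
        }

        /* Re-post the receive so the next message has a buffer to land in. */
        ZeroMemory(&msg->Ovlp, sizeof(OVERLAPPED));
        hr = FilterGetMessage(ctx->Port, &msg->MessageHeader,
                              FIELD_OFFSET(SKETCH_MESSAGE, Ovlp), &msg->Ovlp);
        if (hr != HRESULT_FROM_WIN32(ERROR_IO_PENDING)) {
            return (DWORD)hr;
        }
    }
}
```

Each pass through the loop replies to one message and then posts exactly one new receive, so the number of outstanding receives stays constant no matter how many threads run it.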

Thanks for all the help.

Update:
I have narrowed it down to this:
The user mode threads are working now, but the kernel mode code freezes in the FltSendMessage call. To be exact, the call works correctly exactly once and then starts freezing.

I allocate some memory using FltAllocatePoolWithTag and use it for both the notification and the response (just like the scanner sample).
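
A minimal sketch of the send path described above, modeled on the scanner sample rather than copied from my driver. The names SketchQueryUserMode, SKETCH_NOTIFICATION, SKETCH_REPLY and SKETCH_POOL_TAG are hypothetical, ExAllocatePoolWithTag stands in for the allocation routine, and the 5-second timeout is an assumption added for illustration: the scanner sample passes NULL, in which case FltSendMessage waits indefinitely for user mode to receive the message and reply. The fragment is meant to be compiled inside a WDK minifilter project and called at IRQL <= APC_LEVEL.

```c
#include <fltKernel.h>

#define SKETCH_POOL_TAG 'tkSx'   /* hypothetical pool tag */

/* Hypothetical payloads shared with the user mode application. */
typedef struct _SKETCH_NOTIFICATION {
    ULONG BytesToScan;
    UCHAR Contents[1024];
} SKETCH_NOTIFICATION, *PSKETCH_NOTIFICATION;

typedef struct _SKETCH_REPLY {
    BOOLEAN SafeToOpen;
} SKETCH_REPLY, *PSKETCH_REPLY;

NTSTATUS
SketchQueryUserMode (
    _In_ PFLT_FILTER Filter,
    _In_ PFLT_PORT *ClientPort,
    _Out_ PBOOLEAN SafeToOpen
    )
{
    PSKETCH_NOTIFICATION notification;
    ULONG replyLength = sizeof(SKETCH_REPLY);
    LARGE_INTEGER timeout;
    NTSTATUS status;

    *SafeToOpen = TRUE;   /* default if user mode never answers */

    /* One buffer serves as both the outgoing message and the reply,
       as in the scanner sample. */
    notification = ExAllocatePoolWithTag(NonPagedPool,
                                         sizeof(SKETCH_NOTIFICATION),
                                         SKETCH_POOL_TAG);
    if (notification == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }
    RtlZeroMemory(notification, sizeof(SKETCH_NOTIFICATION));

    /* Relative timeout: negative value in 100 ns units, here 5 seconds
       (assumed value, not taken from the sample). */
    timeout.QuadPart = -5LL * 10 * 1000 * 1000;

    status = FltSendMessage(Filter,
                            ClientPort,
                            notification,
                            sizeof(SKETCH_NOTIFICATION),
                            notification,
                            &replyLength,
                            &timeout);

    /* STATUS_TIMEOUT is a success code for NT_SUCCESS(), so compare
       against STATUS_SUCCESS explicitly before reading the reply. */
    if (status == STATUS_SUCCESS) {
        *SafeToOpen = ((PSKETCH_REPLY)notification)->SafeToOpen;
    }

    ExFreePoolWithTag(notification, SKETCH_POOL_TAG);
    return status;
}
```

With a finite timeout, a missing reply shows up as STATUS_TIMEOUT in the return value instead of a thread that never comes back, which makes it easier to tell on which side the message is getting stuck.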