Hi,
I am writing a filter driver that fetches the actual data of files under a filtered directory from a remote server. Recently we moved our communication module from kernel mode, using TDI, to user mode, using Winsock2. The communication channel between our filter driver and the UM application consists of a shared event (created in kernel mode and opened in user mode). When the filter driver wants to request data from the server, we build a proprietary message, put it in a queue and signal the event. The UM application receives this message from the queue using an IOCTL, requests the data from the server, and sends the data down to the driver using another IOCTL.
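To make the message flow concrete, here is a rough single-threaded model of it in portable C. This is not our actual code: the names (Channel, BlockRequest, driver_post_request, um_fetch_request, um_complete_request), the message fields and the sizes are all made up for illustration. The event_signaled flag stands in for the shared event, and the two um_ functions stand in for the two IOCTLs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_DEPTH 8   /* hypothetical queue size */
#define BLOCK_SIZE  16  /* hypothetical maximum block size */

/* Hypothetical request message; the real proprietary format is not shown. */
typedef struct {
    uint32_t request_id;
    uint64_t offset;
    uint32_t length;
} BlockRequest;

typedef struct {
    BlockRequest slots[QUEUE_DEPTH];
    size_t head, count;
    int event_signaled;              /* stands in for the shared event */
    uint32_t next_id;
    uint32_t completed_id;           /* last request answered by user mode */
    unsigned char data[BLOCK_SIZE];  /* data handed back to the driver */
    uint32_t data_len;
} Channel;

void channel_init(Channel *ch)
{
    assert(ch != NULL);
    memset(ch, 0, sizeof *ch);
    ch->next_id = 1;
}

/* Driver side: queue a request and signal the event.
   Returns the request id, or 0 if the request cannot be queued. */
uint32_t driver_post_request(Channel *ch, uint64_t offset, uint32_t length)
{
    if (ch->count == QUEUE_DEPTH || length > BLOCK_SIZE)
        return 0;
    size_t tail = (ch->head + ch->count) % QUEUE_DEPTH;
    BlockRequest *r = &ch->slots[tail];
    r->request_id = ch->next_id++;
    r->offset = offset;
    r->length = length;
    ch->count++;
    ch->event_signaled = 1;
    return r->request_id;
}

/* User-mode side: models the IOCTL that takes a message from the queue.
   Returns 1 if a request was copied to *out, 0 if the queue was empty. */
int um_fetch_request(Channel *ch, BlockRequest *out)
{
    if (ch->count == 0) {
        ch->event_signaled = 0;
        return 0;
    }
    *out = ch->slots[ch->head];
    ch->head = (ch->head + 1) % QUEUE_DEPTH;
    ch->count--;
    if (ch->count == 0)
        ch->event_signaled = 0;  /* nothing left to pick up */
    return 1;
}

/* User-mode side: models the IOCTL that sends the fetched data down.
   Returns 1 on success, 0 if the length does not match the request. */
int um_complete_request(Channel *ch, const BlockRequest *req,
                        const unsigned char *data, uint32_t len)
{
    if (len != req->length || len > BLOCK_SIZE)
        return 0;
    memcpy(ch->data, data, len);
    ch->data_len = len;
    ch->completed_id = req->request_id;
    return 1;
}
```

In the real system the driver thread that posted the request blocks until the matching completion arrives or a timeout fires, so the whole scheme depends on the UM thread that services the event being able to run.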
We now have a strange problem when we try to run any executable from this filtered directory. A user clicks on a link, and the UM application (the same one, but a different module in it) calls CreateProcess with the parameters. During CreateProcess we see a “regular” IRP_MJ_READ (not the result of a page fault) for the first block of the file; we receive the block from the server and return it to the caller. Then a page fault (MmAccessFault) occurs, requesting data for the new executable. Since the new process has not been “created” yet, the page fault occurs in the context of our UM application and not in the context of the new process.
From this point on we cannot get any more blocks delivered to the filter driver. We tried to put a breakpoint in the module that waits for the event, but when the breakpoint is hit and KiTrap03 is called, everything seems to get stuck: the debugger doesn’t wake up and the system hangs for a long while (until the filter driver wakes up when the timeout for the data fires). I think this is related to the fact that the CreateProcess thread in the same process is in a non-alertable wait in kernel mode, waiting for the page fault to be satisfied. But it only happens when a page fault occurs, and I don’t understand why (as I said, we can easily debug the first IRP_MJ_READ).
If we don’t put any breakpoints in the path, we see that the thread that is supposed to bring the data from the server is stuck after calling nt!NtRaiseException:
f22e7450 804e7d36 nt!KiSwapContext+0x2e (FPO: [EBP 0xf22e7484] [0,0,4])
f22e745c 804e8950 nt!KiSwapThread+0x44 (FPO: [0,0,2])
f22e7484 805b90fd nt!KeWaitForSingleObject+0x1c0 (FPO: [Non-Fpo])
f22e7564 805b837a nt!DbgkpQueueMessage+0x176 (FPO: [5,46,3])
f22e7584 805b8505 nt!DbgkpSendApiMessage+0x43 (FPO: [2,0,2])
f22e7610 804fab8a nt!DbgkForwardException+0x8d (FPO: [Non-Fpo])
f22e79c4 804e0c01 nt!KiDispatchException+0x150 (FPO: [Non-Fpo])
f22e7d34 804d95c4 nt!KiRaiseException+0x11e (FPO: [Non-Fpo])
f22e7d50 804d6140 nt!NtRaiseException+0x31
f22e7d50 77e73887 nt!KiSystemService+0xc4 (FPO: [0,0] TrapFrame @ f22e7d64)
If we bring the data first and only then call CreateProcess, so that there are no page faults in the context of our UM application, then everything works. Any suggestions? What is wrong with this design?
G.