Here is a problem I hit with our FSFD (file system filter driver) on a
dual-CPU W2K Server SP4 system; maybe somebody out there knows the reason.
In the FSFD, when I process a delete event (IRP_MJ_SET_INFORMATION,
FileDispositionInformation) for a file, I need to read some data about its
parent directories. I have to do these reads (IRP_MJ_QUERY_INFORMATION,
IRP_MJ_QUERY_EA) before sending the delete IRP down the chain.
So I block the originating thread there and, as a good kernel citizen :),
post a work item (call it Level_1) to be executed by the system worker
threads. Inside the Level_1 work item I then post further work items (call
them Level_2), again to the system worker threads, but only to read data,
e.g. one to read FileAllInformation of the parent directory and another to
read the parent’s EA.
What I have found is that sometimes (e.g. when deleting a whole deep tree of
directories, which generates a lot of delete events) a posted Level_2 work
item never gets executed. If I break in with WinDbg and run “!exqueue”, I
see my Level_2 work item in the PENDING state, listed under
“Delayed WorkQueue( current = 1 maximum = 2 )”.
It remains there forever: increasing the timeout in the originating thread
doesn’t help, and neither does switching the work queue type to Critical.
At the same time CPU utilization is 0%, so it does not look like a “system
busy” problem or a leak of some other resource. I tried both the new
(IoXxxWorkItem) and the old (ExXxxWorkItem) APIs, but the behavior is the
same. I have Driver Verifier enabled, and it reports no errors.
I worked around the problem by doing the entire job of reading the info
inside the Level_1 work item itself. Done that way, there is no problem.
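
To make the difference between the two designs concrete, here is a small
user-mode toy model in C (not driver code). Everything in it is hypothetical:
the names run_nested and run_flat, the fixed pool size with no dynamic thread
creation, the single FIFO queue, and above all the assumption that a Level_1
item keeps its worker thread blocked until both of its Level_2 reads have run.
Under those assumptions the nested design can leave Level_2 items queued
forever once every worker holds a waiting Level_1 item, while the flat design
cannot. Whether the real system worker queues behave like this model is
exactly what I am not sure about.

```c
#include <assert.h>

#define MAX_DELETES      64
#define READS_PER_DELETE 2   /* parent FileAllInformation + parent EA */

typedef struct {
    int completed_deletes;   /* Level_1 items that ran to completion */
    int pending_level2;      /* Level_2 items still queued when progress stops */
} Outcome;

typedef struct {
    int level;               /* 1 = Level_1 item, 2 = Level_2 item */
    int parent;              /* index of the delete this item belongs to */
} Item;

/* Nested design (assumed): a Level_1 item occupies a worker, posts its
 * Level_2 reads to the same queue, and blocks until all of them have run. */
Outcome run_nested(int workers, int deletes)
{
    Item queue[MAX_DELETES * (1 + READS_PER_DELETE)];
    int remaining[MAX_DELETES];          /* outstanding reads per delete */
    int head = 0, tail = 0, free_workers = workers;
    Outcome out = {0, 0};

    assert(workers > 0 && deletes >= 0 && deletes <= MAX_DELETES);

    /* A burst of deletes: all Level_1 items are queued up front. */
    for (int i = 0; i < deletes; i++) {
        queue[tail].level = 1;
        queue[tail].parent = i;
        tail++;
    }

    /* Items can only be picked up while at least one worker is free. */
    while (free_workers > 0 && head < tail) {
        Item it = queue[head++];
        if (it.level == 1) {
            free_workers--;              /* worker is now blocked, waiting */
            remaining[it.parent] = READS_PER_DELETE;
            for (int r = 0; r < READS_PER_DELETE; r++) {
                queue[tail].level = 2;
                queue[tail].parent = it.parent;
                tail++;
            }
        } else {
            /* A free worker runs the read and is immediately free again. */
            if (--remaining[it.parent] == 0) {
                free_workers++;          /* the waiting Level_1 item finishes */
                out.completed_deletes++;
            }
        }
    }

    /* If we stopped with items left, every worker is blocked in the model. */
    for (int i = head; i < tail; i++) {
        if (queue[i].level == 2) {
            out.pending_level2++;
        }
    }
    return out;
}

/* Flat design (the workaround): the Level_1 item does both reads itself. */
Outcome run_flat(int workers, int deletes)
{
    Outcome out = {0, 0};
    int free_workers = workers;

    assert(workers > 0 && deletes >= 0);

    for (int i = 0; i < deletes; i++) {
        free_workers--;                  /* item takes a worker */
        /* both parent reads happen here, inline, with no further posting */
        free_workers++;                  /* worker returns to the pool */
        out.completed_deletes++;
    }
    assert(free_workers == workers);
    return out;
}
```

In this model run_nested(2, 1) finishes the single delete, run_nested(2, 3)
finishes none and leaves four Level_2 items queued, and run_flat(2, 3)
finishes all three.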
Any idea why the Level_2 work items remain unexecuted in the system worker
queues?
WBR Primoz