Forever PENDING system work item

Here is a problem I faced on a W2K Server SP4, 2-CPU system with our file system filter driver (FSFD); maybe somebody out there knows the reason.

When the FSFD has to process a delete event (IRP_MJ_SET_INFORMATION with FileDispositionInformation) for a file, I need to read some data from the parent directories. The filter has to do these reads (IRP_MJ_QUERY_INFORMATION, IRP_MJ_QUERY_EA) before sending the delete IRP down the chain. So I block the originating thread there and, as a good kernel citizen :), post a work item (call it a Level_1 work item) to be executed by the system worker threads.

In the Level_1 work item, I then post further work items (call them Level_2), again to the system worker threads, always to read data, e.g. one to read the parent directory’s FileAllInformation and another to read the parent’s EAs.
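To make the shape of this concrete, here is a minimal sketch of the pattern just described (all Fsfd* names are hypothetical, and allocation-failure handling is omitted):

#include <ntddk.h>

/* Level_2 routine: would build and send the query IRP
   (IRP_MJ_QUERY_INFORMATION / IRP_MJ_QUERY_EA), then signal the waiter. */
VOID FsfdLevel2ReadRoutine(PDEVICE_OBJECT DeviceObject, PVOID Context)
{
    UNREFERENCED_PARAMETER(DeviceObject);
    /* ... perform the parent-directory query here ... */
    KeSetEvent((PKEVENT)Context, IO_NO_INCREMENT, FALSE);
}

/* Level_1 routine: is itself already running in a system worker thread. */
VOID FsfdLevel1WorkRoutine(PDEVICE_OBJECT DeviceObject, PVOID Context)
{
    KEVENT Done;
    PIO_WORKITEM Level2Item = IoAllocateWorkItem(DeviceObject);

    UNREFERENCED_PARAMETER(Context);
    KeInitializeEvent(&Done, NotificationEvent, FALSE);
    IoQueueWorkItem(Level2Item, FsfdLevel2ReadRoutine, DelayedWorkQueue, &Done);

    /* Note: this wait parks a system worker thread until the Level_2
       item has been dequeued and executed. */
    KeWaitForSingleObject(&Done, Executive, KernelMode, FALSE, NULL);
    IoFreeWorkItem(Level2Item);
}

The wait at the end of the Level_1 routine is the crux of what follows in this thread.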
What I have found is that sometimes (e.g. when deleting a whole deep tree of directories, hence a lot of delete events), a Level_2 work item never gets executed. If I break in with WinDbg and run “!exqueue”, I see my Level_2 work item in the PENDING state, listed under “Delayed WorkQueue( current = 1 maximum = 2 )”.
It stays there forever: increasing the timeout in the originating thread doesn’t help, and neither does switching the work item to the Critical queue. At the same time CPU utilization is 0%, so there is no “system busy” problem and no leak of other resources. I tried both the new (IoXxxWorkItem) and the old (ExXxxWorkItem) APIs; the behavior is the same. I have Driver Verifier enabled, with no error reports.
I worked around the problem by doing the entire info-reading job inside the Level_1 work item itself. Done that way, there is no problem.

Any idea why the Level_2 work items remain unexecuted in the system worker queues?

WBR Primoz


-----------------------------------------------
Primoz Beltram
e-mail: xxxxx@hermes.si
www: http://www.hermes.si
HERMES SoftLab, Office Nova Gorica
Erjavceva 2, 5000 Nova Gorica, Slovenia
phone: (++386 5) 33 30 510, (++386 1) 58 65 710
fax: (++386 5) 33 32 656, (++386 1) 58 65 270
-----------------------------------------------

I’m not sure whether this is your case, but I’ll take a guess. The system has only a limited number of worker threads, so when too many work items are queued in a short period of time, the OS leaves some of them pending until some of the previously queued work items complete. In your case you have “dependent” work items, i.e. work-item-1 cannot complete until work-item-2 does, but work-item-2 cannot start until work-item-1 is out of the way. If this really is the case, you’re going to have to write your own worker-thread manager (sounds great, doesn’t it :). Hope this helps.

–htfv (Alexey Logachyov)
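
For concreteness, a minimal sketch of such a private worker-thread manager (all My* names are hypothetical): one dedicated system thread drains a driver-owned queue, so it can never be starved by the shared system worker pool.

#include <ntddk.h>

typedef struct _MY_WORK_ITEM {
    LIST_ENTRY Entry;
    VOID (*Routine)(PVOID Context);
    PVOID Context;
} MY_WORK_ITEM, *PMY_WORK_ITEM;

static LIST_ENTRY g_WorkList;
static KSPIN_LOCK g_WorkLock;
static KSEMAPHORE g_WorkReady;   /* counts queued items */

VOID MyQueueWorkItem(PMY_WORK_ITEM Item)
{
    ExInterlockedInsertTailList(&g_WorkList, &Item->Entry, &g_WorkLock);
    KeReleaseSemaphore(&g_WorkReady, IO_NO_INCREMENT, 1, FALSE);
}

VOID MyWorkerThread(PVOID StartContext)
{
    UNREFERENCED_PARAMETER(StartContext);
    for (;;) {
        PLIST_ENTRY Entry;
        PMY_WORK_ITEM Item;

        KeWaitForSingleObject(&g_WorkReady, Executive, KernelMode,
                              FALSE, NULL);
        Entry = ExInterlockedRemoveHeadList(&g_WorkList, &g_WorkLock);
        if (Entry == NULL) {
            break;  /* semaphore released with an empty list => shutdown */
        }
        Item = CONTAINING_RECORD(Entry, MY_WORK_ITEM, Entry);
        Item->Routine(Item->Context);  /* may block without starving others */
    }
    PsTerminateSystemThread(STATUS_SUCCESS);
}

NTSTATUS MyStartWorker(VOID)
{
    HANDLE Thread;
    NTSTATUS Status;

    InitializeListHead(&g_WorkList);
    KeInitializeSpinLock(&g_WorkLock);
    KeInitializeSemaphore(&g_WorkReady, 0, MAXLONG);
    Status = PsCreateSystemThread(&Thread, THREAD_ALL_ACCESS, NULL,
                                  NULL, NULL, MyWorkerThread, NULL);
    if (NT_SUCCESS(Status)) {
        ZwClose(Thread);
    }
    return Status;
}

Note that if a routine queued here posts another MY_WORK_ITEM and waits for it, the same deadlock returns with N = 1, so the no-wait rule discussed later in this thread applies to private queues too.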


Yes, that was my case. I ended up with “dependent” work item posting because, during work-item-1 processing, I triggered an IRP_MJ_CREATE, in whose handling I was posting further work items (work-item-2).
My (wrong) assumption about system work items was that once I get a successful call to IoAllocateWorkItem (IoQueueWorkItem returns void), the item will sooner or later be executed (I’m willing to wait half an hour), just as getting a non-NULL pointer from ExAllocatePool means I really got a memory chunk from the OS. But I was wrong. I will underline it again: increasing the timeout wait in work-item-1 (e.g. from 5 seconds to 10 minutes) didn’t help, and during the wait CPU usage was 0% while work-item-2 remained PENDING in “!exqueue”.
It sounds like a new rule: posting new work items, and waiting for their execution, from an already running work item is not safe.
WBR Primoz
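
One way to honor that rule, sketched below with the same hypothetical Fsfd* naming as above: work-item-1 posts work-item-2 and returns instead of waiting, and work-item-2 signals the originating thread, which is then the only thread that blocks and is not a system worker thread.

typedef struct _FSFD_CTX {
    PIO_WORKITEM Level2Item;
    KEVENT DataReady;       /* waited on by the ORIGINATING thread only */
    /* ... buffers for the queried parent-directory info ... */
} FSFD_CTX, *PFSFD_CTX;

VOID FsfdLevel2Read(PDEVICE_OBJECT DeviceObject, PVOID Context)
{
    PFSFD_CTX Ctx = (PFSFD_CTX)Context;
    UNREFERENCED_PARAMETER(DeviceObject);
    /* ... perform the parent-directory queries into Ctx ... */
    KeSetEvent(&Ctx->DataReady, IO_NO_INCREMENT, FALSE);
}

VOID FsfdLevel1NoWait(PDEVICE_OBJECT DeviceObject, PVOID Context)
{
    PFSFD_CTX Ctx = (PFSFD_CTX)Context;
    UNREFERENCED_PARAMETER(DeviceObject);
    /* ... Level_1 processing that decides it needs parent info ... */
    IoQueueWorkItem(Ctx->Level2Item, FsfdLevel2Read, DelayedWorkQueue, Ctx);
    /* Return without waiting: this worker thread is immediately free
       to dequeue other items, including Level_2 ones. */
}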


Making one workitem wait on another is a bad idea, but I’m not sure
you’ve proven that’s what happened here. The reason it is a bad idea is
that there are only a fixed number of threads which will be spun up to
process a work queue. If you then wind up with all N threads processing
workitems which are all waiting on queued items, you are dead.

You should look at what is actually running in the work queue at the
moment, and then again over a little bit of time. I suspect things will
become clearer. Workitems are processed in serial order, so if your
workitem isn’t being processed for 10 minutes, nothing else is either.
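
One way to do that from the debugger (an illustrative sequence; taking nt!ExpWorkerThread as the worker threads’ start routine, and hence as the !stacks filter string, is an assumption about this OS vintage):

kd> !exqueue
    (dump the worker queues and their pending workitems)
kd> !stacks 2 ExpWorkerThread
    (kernel stacks of threads whose stacks mention the worker start routine)
kd> !thread <worker-thread-address>
    (see exactly what one worker thread is blocked on)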

As a side note, the current/maximum that !exqueue displays indicates how many RUNNING/READY threads the system tries to keep processing the queue. If current < maximum and there are pending workitems, then: if there is an idle thread, the KQUEUE will be signaled and a workitem will be taken off and processed; if there isn’t an idle thread, some heuristics start deciding when to possibly create a new worker thread.

So, if current >= maximum, no new workitems will be scheduled. All
threads are busy. Maximum is usually the number of processors in the
system. Doing computationally expensive work in workitems is also
discouraged …

Dan Lovinger
Microsoft Corporation

This posting is provided “AS IS” with no warranties and confers no
rights.


Thanks for the reply. What you wrote is exactly what I did wrong: I tied up all N system worker threads of the Delayed queue, so the one (PENDING) work item that would have unblocked them never got a chance to execute, since N is constant.
