Producer at DPC,Consumer at helper thread,puzzled

I just looked at both code example (after putting it on NTTALK). Just to see what could be problems as an exercise. After couple false alarm, what I found is this …

max’s code works fine. Consumers does the job of Reseting, hence put them in waiting mode if there is no work. The behavior is - All the consumers can be runnable even though just one work item is in the list. Then they will go on blocking, since the consumer that gets the item will reset the event… __Essentially it is a minor point, but if someone has say 64 consumer threads, worst it can see is all 64 runnable, scheduled, then go to sleep for lack of work.

anton’s code: tried counters and flags, but lack of a Reset can make every one runnable after initial starts, they would spin for work… so it isn’t there for a semaphore type behavior yet.

-pro

On Dec 13, 2010, at 6:01 AM, xxxxx@fastmail.fm wrote:

Windows is not Unix, Windows semaphores are not posix semaphores.

– pa


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Which is why I suggested N lists. Having N threads competing for one
list seems like a bottleneck by design.

Mark Roddy

On Mon, Dec 13, 2010 at 10:42 AM, Prokash Sinha wrote:
> I just looked at both code example (after putting it on NTTALK). Just to see what could be problems as an exercise. After couple false alarm, what I found is this …
>
> max’s code works fine. Consumers does the job of Reseting, hence put them in waiting mode if there is no work. The behavior is - All the consumers can be runnable even though just one work item is in the list. Then they will go on blocking, since the consumer that gets the item will reset the event… __Essentially it is a minor point, but if someone has say 64 consumer threads, worst it can see is all 64 runnable, scheduled, then go to sleep for lack of work.
>
> anton’s code: tried counters and flags, but lack of a Reset can make every one runnable after initial starts, they would spin for work… so it isn’t there for a semaphore type behavior yet.
>
>
> -pro
>
> On Dec 13, 2010, at 6:01 AM, xxxxx@fastmail.fm wrote:
>
>> Windows is not Unix, Windows semaphores are not posix semaphores.
>>
>> – pa
>>
>> —
>> NTDEV is sponsored by OSR
>>
>> For our schedule of WDF, WDM, debugging and other seminars visit:
>> http://www.osr.com/seminars
>>
>> To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer
>

Yeah, I remember that. That would simplify stuff. All the producer need is
to do a round robin among N consumers.

-pro

Which is why I suggested N lists. Having N threads competing for one
list seems like a bottleneck by design.

Mark Roddy

On Mon, Dec 13, 2010 at 10:42 AM, Prokash Sinha
> wrote:
>> I just looked at both code example (after putting it on NTTALK). Just to
>> see what could be problems as an exercise. After couple false alarm,
>> what I found is this …
>>
>> max’s code works fine. Consumers does the job of Reseting, hence put
>> them in waiting mode if there is no work. The behavior is - All the
>> consumers can be runnable even though just one work item is in the list.
>> Then they will go on blocking, since the consumer that gets the item
>> will reset the event… __Essentially it is a minor point, but if
>> someone has say 64 consumer threads, worst it can see is all 64
>> runnable, scheduled, then go to sleep for lack of work.
>>
>> anton’s code: tried counters and flags, but lack of a Reset can make
>> every one runnable after initial starts, they would spin for work… so
>> it isn’t there for a semaphore type behavior yet.
>>
>>
>> -pro
>>
>> On Dec 13, 2010, at 6:01 AM, xxxxx@fastmail.fm wrote:
>>
>>> Windows is not Unix, Windows semaphores are not posix semaphores.
>>>
>>> – pa
>>>
>>> —
>>> NTDEV is sponsored by OSR
>>>
>>> For our schedule of WDF, WDM, debugging and other seminars visit:
>>> http://www.osr.com/seminars
>>>
>>> To unsubscribe, visit the List Server section of OSR Online at
>>> http://www.osronline.com/page.cfm?name=ListServer
>>
>>
>> —
>> NTDEV is sponsored by OSR
>>
>> For our schedule of WDF, WDM, debugging and other seminars visit:
>> http://www.osr.com/seminars
>>
>> To unsubscribe, visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer
>>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>

>Which is why I suggested N lists. Having N threads competing for one

list seems like a bottleneck by design.

This is how IoQueueWorkItem is implemented.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

“This is how IoQueueWorkItem is implemented.”

I still suspect IQWI keeps a separate list per thread. An unfortunate consequence of that is that if a workitem thread gets stuck, all workitems that are in its list are stuck.

For example, for a while I used IQWI for handling some PASSIVE_LEVEL things in a virtual storminiport at system resume. That didn’t go well if the worker thread got its stack paged out (YES, those threads wait on UserMode), and the miniport was in the paging path.

> anton’s code: tried counters and flags, but lack of a Reset can make every one runnable after

initial starts, they would spin for work…

As I told you already on NTTALK (as well as on this thread), this is a synchronization event, rather than notification one, so that only one thread may get released per call to KeSetEvent(). There is no need to manually reset anything - dispatcher takes care of it…

Anton Bassov

My bad, okay I will look at it again - using the event as a synch event.

-pro

> anton’s code: tried counters and flags, but lack of a Reset can make
> every one runnable after
> initial starts, they would spin for work…

As I told you already on NTTALK (as well as on this thread), this is a
synchronization event, rather than notification one, so that only one
thread may get released per call to KeSetEvent(). There is no need to
manually reset anything - dispatcher takes care of it…

Anton Bassov


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

> One more thing to add: regardless of the semaphore implementation (which is bad in Windows

only because Limit is mandatory, limit-less semaphores could be useful) you will need to guard
the list itself with some lock.

Well, this is understandable…

In this particular case the only available option is spinlock, because producer runs in atomic context.
Otherwise you could get away an extra synch event and KeWaitForMultipleObjects() in consumers and KeWaitFoSingleObject() in producer (certainly, if Windows handled semaphores the way everyone else does)…

Anton Bassov

Nope. One list per Work Queue Type, with multiple threads for each type. The queue is guarded exactly as Max said. A quick walk-in with the debugger will demonstrate this.

Peter
OSR

>

>problem is that Windows happens to be the system that is…

One more thing to add: regardless of the semaphore implementation
(which is
bad in Windows only because Limit is mandatory, limit-less semaphores
could be
useful) you will need to guard the list itself with some lock.

So, you have 2 locks - 1 dispatcher lock in the semaphore
implementation, 2 to
guard the list.

The only solution to avoid this is to use the dispatcher lock to guard
the
queue itself, I expect KeInsert/RemoveQueue to do this.

Using lock free structures can remove a lot of the synchronisation
requirements. Have a look at http://www.lfds.org - there is a lock free
queue api there. Access the queue at any IRQL.

You still have to solve the notification problems of course.

James

Somehow, that URL isn’t getting me anything related to data structures. Rather, I’m getting The Stonewall Family guild site. Unfortunately, I’m not a gamer and I’m not on that team.

Peter
OSR

>

[quote]
Have a look at http://www.lfds.org - there is a lock free queue api
there.
Access the queue at any IRQL.
[/quote]

Somehow, that URL isn’t getting me anything related to data
structures.
Rather, I’m getting The Stonewall Family guild site. Unfortunately,
I’m not a
gamer and I’m not on that team.

Oh no! sorry wrong link - http://www.liblfds.org/

Please delete my original post from the forum if you can (and I assume
that if anyone can, you can :slight_smile:

James

Oh, I thought I was the only one getting redirected to that gaming site:)

-pro

Somehow, that URL isn’t getting me anything related to data structures.
Rather, I’m getting The Stonewall Family guild site. Unfortunately, I’m
not a gamer and I’m not on that team.

Peter
OSR


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Okay, I’m not going for the semaphore pattern on this… That is later part !

Assumption/Thinking –

  1. I can not trust estimating time to launch all the consumer threads. They can be active a tad bit later than when the producer started and queuing work items.

  2. Producer just produced (NumberofConsumerThreads - 1) work items. So every time it singnals the same event. The synchronization event is in signalled state.

  3. Now that 64 threads starts on 64 processors ( or any big number).

  4. The first consumer thread breaks the barrier by calling the WaitforSingleObject, we make it non-signalled.

  5. Producer does not have any more traffic.

(6) The only consumer thread will find itself that the number of work items are less than numberofConsumerThread, and will get out after processing one of the item.

(7) Now everyone is waiting. And the queue still has N - 1 work item.

The DPC happens to be the DPC of a NIC, depending on protocol setting, NIC setting, what will happen???

Here it so happened that there is no other Rcv pkts, no more traffic. Just one sender managed to send 64 or 63 work item. Now what happens?

-pro

On Dec 13, 2010, at 12:24 PM, xxxxx@hotmail.com wrote:

> anton’s code: tried counters and flags, but lack of a Reset can make every one runnable after
> initial starts, they would spin for work…

As I told you already on NTTALK (as well as on this thread), this is a synchronization event, rather than notification one, so that only one thread may get released per call to KeSetEvent(). There is no need to manually reset anything - dispatcher takes care of it…

Anton Bassov


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

> Yeah, I remember that. That would simplify stuff. All the producer need is

to do a round robin among N consumers.

And?

The producer will need to make a decision on - to what list to put the request.

The decision will require some global data.

Which must be protected by the lock.

And yes, this lock will be the same bottleneck.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

> My EventReceiveDatagramHandler is called at DPC LEVEL,and my worker thread is in PASSIVE

BTW - why not just use Ex/IoQueueWorkItem and avoid all these gory details?

Obsolete ExQueueWorkItem is good since it does not require separate allocation of the work item, which can be a part of any larger structure.

I think starting from Vista (?) the IoQueueWorkItem family also allow such operation.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

> 1) I can not trust estimating time to launch all the consumer threads. They can be active a tad bit

later than when the producer started and queuing work items.

Well, they all have to get started at the initialization time before DPC gets even registered, don’t you think…

  1. Producer just produced (NumberofConsumerThreads - 1) work items. So every time it singnals
    the same event. The synchronization event is in signalled state

In fact, our event is NEVER in signaled state. When producer signals it one of the waiting threads gets released, and, at this point, it gets automatically reset. This is the reason why we use synch event,
rather than notification one, in the first place…

Anton Bassov

I knew the arguments will come thru, but it would not be difficult to cook up a sequence ( actually taking out those assumption i put forward), and still there would be a situation where after lifting those assumption you will leave few work item on the producer’s queue… Again the crux is that these events don’t have a counter. And even if it has a counter, wrapping the counter will come back and shoot us down!

-pro

On Dec 14, 2010, at 7:30 AM, xxxxx@hotmail.com wrote:

> 1) I can not trust estimating time to launch all the consumer threads. They can be active a tad bit
> later than when the producer started and queuing work items.

Well, they all have to get started at the initialization time before DPC gets even registered, don’t you think…

> 2) Producer just produced (NumberofConsumerThreads - 1) work items. So every time it singnals
> the same event. The synchronization event is in signalled state

In fact, our event is NEVER in signaled state. When producer signals it one of the waiting threads gets released, and, at this point, it gets automatically reset. This is the reason why we use synch event,
rather than notification one, in the first place…

Anton Bassov


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

> and still there would be a situation where after lifting those assumption you will leave few work item

on the producer’s queue…

Could you please provide an example of such scenario…

Again the crux is that these events don’t have a counter.

What do you think ‘count’ global variable is for???

Anton Bassov

Well the counter and the event are not coupled.

Your assumption that as soon as you produce an item, and signal, it would be picked up a consumer - Extreemly strong assumption. I can not dictate the OS to do just that. It is possible within the number of NbOfConsumerThreads that more than one work item queued, before anyone get a chance to serve, meaning consumer serves right away.

The underlying assumption is one to one between production and consumption ---- Strong, strong assumption.

-pro

On Dec 14, 2010, at 7:54 AM, xxxxx@hotmail.com wrote:

> and still there would be a situation where after lifting those assumption you will leave few work item
> on the producer’s queue…

Could you please provide an example of such scenario…

> Again the crux is that these events don’t have a counter.

What do you think ‘count’ global variable is for???

Anton Bassov


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer