Synchronizing Access to a Linked List: complete picture

Thanks for your answers. Here is the whole picture.

The linked list I am using stores data in non-paged pool space.
The writer always runs at PASSIVE level. It allocates new nodes if necessary
and fills in the fields of each node.
Writing and allocating new nodes is rare: at most, it may happen 10 times
per minute on an averagely loaded machine.

Reading is another story. Readers are arbitrary threads that run at APC or
DPC level; they never run at PASSIVE level.
The actual IRQL depends on the point from which my code is called, so I
assume DPC level all the time.

Reading may be performed many times, on the order of hundreds of times
per second.
The reader performs each read in two steps: first it looks for the right
node, and, if it exists in the list, it reads all the fields of the node.

Both writing and reading are synchronized using an ordinary (not in-stack
queued) spin lock.

While reading, the following tasks are performed (a code sketch follows
the list):

1-Acquire Spinlock
2-Check the first field on each node, to look for a certain value
3-If the right value is found, read the other fields of that node into
local variables.
4-Release Spinlock
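
For reference, a minimal sketch of that read path as described; the node
layout and names are hypothetical, and since readers run at DPC level the
AtDpcLevel spin lock variants apply:

typedef struct _NODE {
    LIST_ENTRY Link;
    ULONG      Key;      // the "first field" that is searched
    ULONG      Data[4];  // the other fields copied into locals
} NODE, *PNODE;

BOOLEAN LookupNode(PLIST_ENTRY Head, PKSPIN_LOCK Lock, ULONG Key, ULONG *Out)
{
    BOOLEAN found = FALSE;
    PLIST_ENTRY e;

    KeAcquireSpinLockAtDpcLevel(Lock);                    // 1 - acquire
    for (e = Head->Flink; e != Head; e = e->Flink) {
        PNODE n = CONTAINING_RECORD(e, NODE, Link);
        if (n->Key == Key) {                              // 2 - check first field
            RtlCopyMemory(Out, n->Data, sizeof(n->Data)); // 3 - copy out
            found = TRUE;
            break;
        }
    }
    KeReleaseSpinLockFromDpcLevel(Lock);                  // 4 - release
    return found;
}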

There may be many nodes, but I have checked that the performance problem
occurs even with only a couple of nodes in the list.
I have also checked that the node lookup is performed correctly.

There are no calls to other functions while the spin lock is held, and the
data is always located in non-paged pool.

Removing synchronization completely (just for testing) improves speed by
about 30% when the code is running on multiprocessor machines.
Of course this is not a solution, but it means that synchronization is
taking too much time.

I know there are other ways to synchronize, but I need one available at DPC
level. I have not tried the new version of queued spin locks.

Thanks for your comments.
Inaki.

I think implementing synchronization in a “multiple reader, single writer”
scheme will increase performance. Something like an ERESOURCE, but one
which may work at DPC level. How often do you read your list at raised
IRQL? Perhaps, if it does not happen often, you could do the lookup in a
worker thread. That would make it possible to use an ERESOURCE lock.

-htfv

“Iñaki Castillo” wrote in message news:xxxxx@ntdev…
<snip: original message quoted in full>

Consider using a hash function based on the first field you mention to
split the list into multiple lists, each with an associated spin lock. This
should improve performance both in the single-processor case, where the
list is long, and in the multiprocessor case, by reducing lock contention.
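
A minimal sketch of that idea (the bucket count and hash are hypothetical);
each bucket carries its own list and lock, so lookups for different keys no
longer contend on a single spin lock:

#define NBUCKETS 16   // a power of two keeps the hash a cheap mask

typedef struct _BUCKET {
    KSPIN_LOCK Lock;
    LIST_ENTRY Head;
} BUCKET;

BUCKET Buckets[NBUCKETS];

BUCKET *BucketFor(ULONG Key)
{
    return &Buckets[Key & (NBUCKETS - 1)];
}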

Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting

----- Original Message -----
From: Iñaki Castillo
To: Windows System Software Devs Interest List
Sent: Wednesday, December 24, 2003 7:12 AM
Subject: [ntdev] Synchronizing Access to a Linked List: complete picture

<snip: original message quoted in full>

This isn’t quite the complete picture. Are nodes only added, or are they
deleted as well? Similarly, are the nodes modified after writing, or are
they static? No synchronization should be needed at all unless nodes are
deleted or modified. Is the list doubly or singly linked? Are nodes
added/deleted only at the head/tail of the list, or can they be added
anywhere? (If it is singly linked and only touched at the head/tail, then
only one pointer needs to be updated, which allows
InterlockedExchangePointer to be used.)

Note that, since the writer runs at passive level and the readers at a
higher level, the reads are done atomically with respect to the writes.

In some situations, this can allow for other safe synchronization
methods that don’t require heavyweight synchronization primitives at
all. InterlockedExchange can work miracles if used carefully, doesn’t
use a spin lock, and is SMP-safe with respect to the other
InterlockedXxx functions (these functions use the LOCK prefix of the
x86 chip to do their dirty work).
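
A hypothetical sketch of the head-only, insert-only case described above:
the writer fully initializes a node before publishing it with a single
interlocked update of the head pointer, so readers need no lock at all:

typedef struct _SNODE {
    struct _SNODE *Next;
    ULONG          Key;
} SNODE, *PSNODE;

PSNODE volatile ListHead = NULL;

VOID PushNode(PSNODE n)   // single writer, PASSIVE level
{
    PSNODE old;
    do {
        old = ListHead;
        n->Next = old;    // node is complete before it becomes visible
    } while (InterlockedCompareExchangePointer((PVOID volatile *)&ListHead,
                                               n, old) != old);
}

With a single writer the compare-exchange never actually fails; the loop
just keeps the sketch safe if a second writer is ever added.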

Iñaki Castillo wrote:

<snip: original message quoted in full>


…/ray..

As has been mentioned, you could win big by implementing a
multiple-reader/single-writer lock akin to an ERESOURCE but that is
designed for use in an SMP situation at dispatch level. You can also
investigate implementations of ‘lock-free’ or ‘wait-free’ data
structures; lock-free ones are of medium complexity, and you win by
reducing the scope of mutual exclusion to single CAS instructions (i.e.
the InterlockedXxx family). Here are some overviews of the technique:

http://perso.wanadoo.fr/gmem/evenements/jim2002/articles/L17_Fober.pdf
http://kos.enix.org/pub/nolocks.ps

Iñaki Castillo wrote:

<snip: original message quoted in full>


Nick Ryan (MVP for DDK)

Thanks, Nick. Those are interesting docs for both beginners and experienced
developers.

-htfv

“Nick Ryan” wrote in message news:xxxxx@ntdev…
> <snip: Nick’s message quoted in full>

Yeah… as far as I can tell, knowledge and application of lock- and
wait-free techniques are relatively new to the field of applied computer
science (i.e. what we’re all doing). I certainly hadn’t heard of them
until I saw them mentioned in a newsgroup posting. As SMP becomes more
widespread, so will the use of these techniques (which are most
applicable to achieving maximum performance in SMP environments). I’m
curious whether similar optimizations have gone or are going into the
Windows kernel.

Alexey Logachyov wrote:

Thanks, Nick. Those are interesting docs for both beginners and
experienced developers.

-htfv

<snip: earlier messages quoted in full>

Nick Ryan (MVP for DDK)

Well,

Think of yourself here as living at the end of the dark ages. Back When
Computers Were Big and Memory Was Expensive, we had all this stuff in
multiprocessor systems. The plague of PCs caused all these techniques
to be hidden away in a few isolated monasteries.

I second Don’s suggestion. As I said before, your biggest problem is that
you need better data-management algorithms. An AVL tree might be simpler
than the hash table, and it might solve your performance problems.
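
One way to get an AVL tree in a driver without writing the rotations
yourself is the kernel’s generic AVL table; a minimal sketch, with a
hypothetical entry layout and pool tag, allocating from non-paged pool so
lookups at DPC level stay legal:

typedef struct _ENTRY { ULONG Key; ULONG Data; } ENTRY;

RTL_GENERIC_COMPARE_RESULTS
CompareEntry(PRTL_AVL_TABLE Table, PVOID A, PVOID B)
{
    ULONG ka = ((ENTRY *)A)->Key, kb = ((ENTRY *)B)->Key;
    return ka < kb ? GenericLessThan
         : ka > kb ? GenericGreaterThan : GenericEqual;
}

PVOID AllocEntry(PRTL_AVL_TABLE Table, CLONG Size)
{
    return ExAllocatePoolWithTag(NonPagedPool, Size, 'TlvA');
}

VOID FreeEntry(PRTL_AVL_TABLE Table, PVOID P)
{
    ExFreePoolWithTag(P, 'TlvA');
}

RTL_AVL_TABLE Table;
// RtlInitializeGenericTableAvl(&Table, CompareEntry, AllocEntry,
//                              FreeEntry, NULL);
// then RtlInsertElementGenericTableAvl / RtlLookupElementGenericTableAvl,
// still under the existing spin lock.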


Jake Oshins
Windows Base Kernel Team

This posting is provided “AS IS” with no warranties, and confers no rights.

“Don Burn” wrote in message news:xxxxx@ntdev…

<snip: Don’s message and the original post quoted in full>

There isn’t enough data to say anything real about his problem, but there’s
several things that may be going wrong, besides the obvious spinlock
implementation issue.

He seems to be searching for a node value inside a critical section; that’s
not a good idea. Now, there’s several things that might be causing the
slowdown; here’s some things to try.

A hash table may help if collision chains are kept short enough, but that
may be only part of the issue. I would for example do a test run forcing
retrieval to always access the first key, to see if performance is being
lost because of synchronization or because of the search. If I implemented
the hash table, I would do a few test runs on sample data to make sure my
collision chains are short enough.

Note that it’s a sledgehammer approach to just acquire a spinlock every
time one reads or writes to the queue. If he can have multiple readers
concurrent on the queue, but only one writer, he can try this:

Reader( )
{
acquire(rSpin);
nreaders++;
if (nreaders==1) acquire(wSpin);
release(rSpin);
dequeue( );
acquire(rSpin);
nreaders--;
if (nreaders==0) release(wSpin);
release(rSpin);
}

Writer( )
{
acquire(wSpin);
enqueue( );
release(wSpin);
}

The way it works is, once the first reader grabs the wSpin lock, subsequent
readers don’t need to spin waiting. Remember, only writes must be
synchronized! This however gives priority to the readers; use with care.
And of course, if he needs to update the queue at every read, that’s a
different question. A variation of this technique is the following
coarse-grained solution:

Reader( )
{
acquire(rSpin);
while (nWriters>0);
nReaders++;
release(rSpin);
dequeue( );
lock{nReaders--};
}

Writer( )
{
acquire(rSpin);
while (nReaders>0 || nWriters>0);
nWriters++;
release(rSpin);
enqueue( );
lock{nWriters--};
}

This is a bit lighter on the writers. As I originally said in my other
post, Readers and Writers is a well-studied problem. I recommend going to a
good textbook; there’s plenty of suggestions on how to make it run better.

Another point is, if he’s looking up the list, using a linked list may be a
bad idea for a number of reasons, if nothing else because pointer chasing
is slow and taxes the cache. One way to do it is to structure the queue as
an array of pointers; this will help keep cache locality and it might even
allow one to use a rep scas to search for that value.

The problem with an AVL tree in this context is that removals require
writing to the data structure, which would negate most of the benefits of
the Reader/Writer synchronization logic. But, in that direction, I would
prefer to structure the data as a priority queue using his search argument
as the priority; that will guarantee log-time access with no need to
implement AVL rotations, which may be tough to debug. A priority queue can
be simply implemented as an array, because the children of element queue[p]
are to be found at queue[2*p] and queue[2*p+1]. This may avoid searching,
which is particularly advisable if done inside a spinlock!
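
A small sketch of the array-backed heap layout Alberto describes; note the
formulas assume 1-based indexing (slot 0 is unused), and the macro names
are hypothetical:

#define HEAP_PARENT(p) ((p) / 2)
#define HEAP_LEFT(p)   (2 * (p))
#define HEAP_RIGHT(p)  (2 * (p) + 1)

// Example: the children of queue[3] live at queue[6] and queue[7],
// so lookup walks a contiguous array instead of chasing list pointers.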

Last but not least, he should make sure he’s not allocating memory inside
a critical section.

Alberto.

On 25 Dec 2003 at 22:03, Jake Oshins wrote:

I second Don’s suggestion. As I said before, your biggest problem is
that you need better data-management algorithms.

<snip: the rest of Jake’s message and earlier quotes>

> Removing synchronization completely (just for testing) improves speed
> about 30% when code is running on multiprocessor machines.

Try the queued spinlock; it is definitely good for your case.
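
A sketch of what that looks like with the in-stack queued spin lock API
(available on XP and later); since the readers here are already at DPC
level, the AtDpcLevel variants apply:

KSPIN_LOCK ListLock;   // initialized once with KeInitializeSpinLock

VOID ReadUnderQueuedLock(VOID)
{
    KLOCK_QUEUE_HANDLE q;
    KeAcquireInStackQueuedSpinLockAtDpcLevel(&ListLock, &q);
    // ... walk the list ...
    KeReleaseInStackQueuedSpinLockFromDpcLevel(&q);
}

Queued spin locks hand the lock off in FIFO order and each waiter spins on
its own cache line, which reduces the bus traffic an ordinary spin lock
generates under contention.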

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

Read-write ERESOURCE-like spinlocks are possible.

One of the possible implementations uses the following (a sketch follows
the list):

  • read acquire: loop/check the counter to be >= 0, then
    InterlockedIncrement.
  • write acquire: loop/check the counter to be zero, then InterlockedAdd
    with a huge negative number.

At least this is stable as long as the number of read contenders is less
than this “huge negative number”, and, in the case of spinlocks, the
number of contenders is always <= the CPU count (since preemption is
disabled by raising to DISPATCH_LEVEL before doing the interlocked
operations).
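
A minimal sketch of that scheme; the names are hypothetical, and
InterlockedExchangeAdd stands in for the “InterlockedAdd with a huge
negative number” step:

#define WRITER_BIAS 0x40000000L

LONG volatile RwLock = 0;   // >= 0: reader count; very negative: writer

VOID ReadAcquire(LONG volatile *l)
{
    for (;;) {
        while (*l < 0)
            ;                        // a writer holds the lock
        if (InterlockedIncrement(l) > 0)
            return;                  // counted in, no writer present
        InterlockedDecrement(l);     // a writer won the race; undo, retry
    }
}

VOID ReadRelease(LONG volatile *l)  { InterlockedDecrement(l); }

VOID WriteAcquire(LONG volatile *l)
{
    for (;;) {
        while (*l != 0)
            ;                        // spin until it looks free
        if (InterlockedExchangeAdd(l, -WRITER_BIAS) == 0)
            return;                  // acquired with no readers inside
        InterlockedExchangeAdd(l, WRITER_BIAS);  // lost the race; undo
    }
}

VOID WriteRelease(LONG volatile *l) { InterlockedExchangeAdd(l, WRITER_BIAS); }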

IIRC Linux has this implementation. I also once re-implemented the PnP
remove lock in a similar manner.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Friday, December 26, 2003 5:11 PM
Subject: [ntdev] Re: Synchronizing Access to a Linked List: complete picture

> <snip: Alberto’s message quoted in full>

Hi Alberto,

Maybe I’m barking up the wrong tree, but I suspect your second algorithm
will deadlock if readers and writers run at different IRQLs in the Windows
NT kernel.

Let’s imagine the situation:
There is a TDI client driver which implements some proprietary packet
protocol, where send requests must be answered with corresponding
responses.

  1. There is a send request (PASSIVE_LEVEL) which should be registered in
     our request queue (the Writer). So it acquires rSpin (DISPATCH),
     passes the while condition (DISPATCH), increments nWriters (DISPATCH),
     and releases rSpin (PASSIVE).

  2. At that moment we get a DPC from a network receive interrupt which
     provides our driver with TdiEventReceive data (DISPATCH). We need to
     check our request queue to find the request to which the response is
     directed. We acquire rSpin (DISPATCH) and simply deadlock on
     while (nWriters>0), because step 1 has incremented the counter. We
     are at DISPATCH at this time, so the system is deadlocked, yes?

Maybe my understanding of the case is not right?

Best regards,
Valeriy Glushkov

Just KeRaiseIrql/KeLowerIrql to DISPATCH_LEVEL around the contents of
those two functions and the issue is made moot.
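
That is, a sketch (hypothetical names) where both entry points run their
bodies at DISPATCH_LEVEL, so neither can be preempted by the other on the
same processor:

VOID Writer(VOID)
{
    KIRQL old;
    KeRaiseIrql(DISPATCH_LEVEL, &old);
    // acquire(wSpin); enqueue(); release(wSpin);
    KeLowerIrql(old);
}
// Reader() gets the same KeRaiseIrql/KeLowerIrql bracket around its body.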

Valeriy Glushkov wrote:

<snip: Valeriy’s message quoted in full>


Nick Ryan (MVP for DDK)

Hi, Valeriy,

You’re right, the system will deadlock if the Reader is able to
preempt the Writer as you stated and prevent it from proceeding:
this will happen if both Reader and Writer end up running on the
same processor. Running both Reader and Writer at the same IRQL
might prevent it from happening. Or, if your driver accommodates it,
you could run the Writer at a higher IRQL than the Reader. A safer
solution would use split semaphores, but the code would get a bit
messier.

Alberto.

On 27 Dec 2003 at 13:55, Valeriy Glushkov wrote:

<snip: Valeriy’s message and the quoted algorithm>

Max,

Don’t you need to interlock the read loop/check too? Otherwise,
you loop/check for >=0, then some other processor write-acquires,
then your processor does the InterlockedIncrement, and you have
two processors in the critical section. Or did I misunderstand you?

Alberto.

============

On 27 Dec 2003 at 2:33, Maxim S. Shatskih wrote:

<snip: Maxim’s message and earlier quotes, quoted in full>

Hi, Alberto,

Thank you for the explanation.

So we need to ensure that Readers and Writers are running at the same IRQL.

As Nick suggested, KeRaiseIrql(DISPATCH_LEVEL) and KeLowerIrql(original
level) at the beginning and end of the Reader and Writer will solve the
preemption problem.

Thank you for the really useful samples.

Best regards,
Valeriy Glushkov

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Sunday, December 28, 2003 2:44 AM
Subject: [ntdev] Re: Synchronizing Access to a Linked List: complete picture

> <snip: message quoted above>

Loops must not do interlocked operations; that is slow.

The reader will InterlockedIncrement, see a negative value, then
InterlockedDecrement back, and continue looping.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Sunday, December 28, 2003 3:48 AM
Subject: [ntdev] Re: Synchronizing Access to a Linked List: complete picture

> <snip: Alberto’s question and earlier messages quoted in full>