Making multiple Interlocked operations atomic

Suresh_Patil · September 19, 2016, 1:44pm

Hi,

I understand that an Interlocked is atomic with any other Interlocked operating on the same variable, but is there any way to have two different Interlocked operations done in an atomic way?

e.g. I want to convert this:

AcquireSpinlock
Array[x++] = value;
ReleaseSpinlock

into something like this:

InterlockedExchange(Array, value)
InterlockedIncrement(x)

Will it work (i.e. be atomic) if I nest them into one another? i.e. InterlockedExchange(Array[InterlockedIncrement(x)], value)

Regards,
Suresh

anton_bassov · September 19, 2016, 2:04pm

> Will it work (i.e. be atomic) if I nest them into one another?

Only if you know a way of nesting one bus transaction into another…

i.e. InterlockedExchange(Array[InterlockedIncrement(x)], value)

I guess no more questions about your “understanding” are needed here…

Anton Bassov

Slava_Imameev · September 19, 2016, 2:05pm

[quote]
is there any way to have two different Interlocked
operations done in an atomic way

[quote]

From a CPU point of view ( x86, amd64 ) this is not possible. These CPUs are unable to lock two consecutive memory operations. Though there is a command that allows to compare-exchange 64 bits words on some 32 bit CPUs and 128 bits on some 64 bit CPUs.

This will not make them one atomic operation. This is two operations. It has nothing to do with drivers or OS design. This is a compiler job. The compiler will scheduler the increment first and then the array access as two distinct operations.

anton_bassov · September 19, 2016, 2:17pm

> From a CPU point of view ( x86, amd64 ) this is not possible. These CPUs are unable to lock two >consecutive memory operations.

Don’t you find it disturbing, so to say, that one has to explain something as obvious as that to an aspiring system-level programmer??? OTOH, some of them need some assistance with renaming files in the Explorer…

Anton Bassov

Slava_Imameev · September 19, 2016, 2:34pm

Everyone was a beginner once.

Alex_Grig · September 19, 2016, 3:33pm

Take a look at XACQUIRE and XRELEASE instruction prefixes.

anton_bassov · September 19, 2016, 4:34pm

> Take a look at XACQUIRE and XRELEASE instruction prefixes.

…which may work only on the newer Intel processors - IIRC, AMD does not use these extensions
(that are, basically, just the alternative implementations of the existing REPNE / REPE prefix opcodes in context of being used with LOCK one).Its competing proposed technology relies upon SPECULATE, COMMIT, ABORT and RELEASE instructions that have not been yet implemented in any hardware that has been released up to this point.

Anton Bassov

Suresh_Patil · September 19, 2016, 10:49pm

Thanks for the explanation and wow, that was one tight slap! But probably I deserve it and sometimes that’s how you learn.

I was actually contemplating to use InterlockedPush/PopEntrySList to achieve this (I understand that it works as LIFO and I am okay with it), but was concerned that SList may add extra overhead vs locking only the index using InterlockedCompareExchange.
I will then fall back to Interlocked SList operations. Do let me know if there is any better way though.

Simeon_Zlatarev · September 20, 2016, 2:28am

I am experienced as other list members, so take my advise with a grain of salt. Are you contemplating non-blocking implementation of circular queue?

They are blocking structures in the sense that producers may have to block or abort on overflow. There are lock-free circular FIFO implementations on the web, but non-blocking structures (including queues) are generally not fast, especially under contention. The upshot is that they have progress guarantee, and are deadlock-free (not so useful here).

A more practical approach is to have an array of consumer stacks. When you enqueue, you choose one array element at random and push on that stack. The consumers either have individual array element or choose one at random and swap out the entire stack. Then they process it locally. This eliminates most of the consumer contention. You can use atomic exchange and CAS here. No Slist is needed, because you don’t need to pop individual elements. This will not give you strict FIFO ordering, but will be tentatively fair (most of the time). Having “random” generator is a steep requirement, and you may have to optimize around it.

Simeon_Zlatarev · September 20, 2016, 2:29am

Sorry, meant to say “not as experienced”.

Tim_Roberts · September 20, 2016, 12:15pm

xxxxx@yahoo.com wrote:

I was actually contemplating to use InterlockedPush/PopEntrySList to achieve this (I understand that it works as LIFO and I am okay with it), but was concerned that SList may add extra overhead vs locking only the index using InterlockedCompareExchange.
I will then fall back to Interlocked SList operations. Do let me know if there is any better way though.

The KEY purpose of locking is to ensure that you never leave your data
in an inconsistent state. If some dangerous thing can happen when your
operation gets interrupted between the increment and the swap, then you
must wrap the whole sequence in some kind of interlock.

However, this is something where you have to take a close look. If
nothing gets damaged by an interruption between those two operations,
then you don’t need to protect the whole sequence.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.