Inventing Spinlocks [was Synchronizing ...]

I’m very reluctant to get involved in this, but I feel like I can’t stop
myself. I’ll probably regret it later…

First, very few spinlock-related performance problems are frontside bus
traffic issues. You get to the frontside bus bottleneck only after
you’ve already refactored your algorithms such that they are nearly
optimal in all other respects. It’s far more likely that he’s holding a
lock when he doesn’t need to be, and that’s his performance problem. No
amount of lock algorithm manipulation will fix that.

Second, you give a solution that is intended to drop the amount of
frontside bus traffic when using a spinlock. And while your
implementation may be better than KeAcquireSpinLock, it’s far, far worse
than KeAcquireInStackQueuedSpinLock, which is another OS function that
is already available.
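
To illustrate why a queued lock generates less bus traffic, here is a
minimal user-mode sketch of an MCS-style queued lock in C11. It is not
the kernel's code: the names queued_lock, qnode, queued_acquire and
queued_release are hypothetical, and I am only assuming that the
general idea (each waiter spins on its own node, which the caller can
keep on its stack, and the releaser hands the lock to exactly one
successor) carries over to the in-stack queued spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-waiter node; the caller can allocate it on its stack. */
typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool locked;             /* true while this waiter must spin */
} qnode;

/* Hypothetical lock: just a pointer to the last waiter in the queue. */
typedef struct { _Atomic(qnode *) tail; } queued_lock;

void queued_init(queued_lock *l)
{
    atomic_init(&l->tail, (qnode *)NULL);
}

void queued_acquire(queued_lock *l, qnode *me)
{
    atomic_store_explicit(&me->next, (qnode *)NULL, memory_order_relaxed);
    atomic_store_explicit(&me->locked, true, memory_order_relaxed);

    /* One atomic swap on the shared word to join the queue. */
    qnode *prev = atomic_exchange_explicit(&l->tail, me, memory_order_acq_rel);
    if (prev == NULL)
        return;                     /* queue was empty: lock is ours */

    /* Link behind the previous waiter, then spin on OUR OWN flag only. */
    atomic_store_explicit(&prev->next, me, memory_order_release);
    while (atomic_load_explicit(&me->locked, memory_order_acquire))
        ;
}

void queued_release(queued_lock *l, qnode *me)
{
    /* Sanity check: a held lock always has a non-empty queue. */
    assert(atomic_load_explicit(&l->tail, memory_order_relaxed) != NULL);

    qnode *succ = atomic_load_explicit(&me->next, memory_order_acquire);
    if (succ == NULL) {
        /* No visible successor: try to mark the queue empty. */
        qnode *expected = me;
        if (atomic_compare_exchange_strong_explicit(
                &l->tail, &expected, (qnode *)NULL,
                memory_order_acq_rel, memory_order_acquire))
            return;
        /* Someone swapped in behind us; wait until they finish linking. */
        while ((succ = atomic_load_explicit(&me->next,
                                            memory_order_acquire)) == NULL)
            ;
    }
    /* Hand the lock to exactly one waiter by clearing its private flag. */
    atomic_store_explicit(&succ->locked, false, memory_order_release);
}
```

The shared word is touched once on acquire and at most once on
release; all the spinning happens on a flag that only one other
processor will ever write.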

Jake Oshins
Windows Base Kernel Team

This posting is provided “AS IS” with no warranties, and confers no
rights.

-----Original Message-----
Subject: Re: Synchronizing access to a linked list
From: “Moreira, Alberto”
Date: Mon, 22 Dec 2003 12:10:17 -0500
X-Message-Number: 18


I talk for myself, of course.

And I have no problem shoving the OS out of my way when I feel compelled
to do so.

As far as that spinlock thing went, I had a reason; let me try to
explain. The guy said he had a performance problem with his
implementation. My reaction was, well, if you have a high volume of
traffic on one spinlock, you may be seeing contention at the front-side
bus level. Now, I suspected that the implementation of
KeAcquireSpinLock( ) did a plain vanilla bts-based test-and-set, and
although I could not say for sure, it might be the case that a lot of
traffic on a single bts generates a lot of bus traffic and slows his SMP
implementation. It is a simple test to replace it with a better
spinlock, a test-and-test-and-set: it entails putting a while loop in
front of the bts loop. That takes what, replacing two lines of code in
his critical section with two while loops and one release. It can be
done and tested in less than five minutes. If it increases the
performance, he would know one of his bottlenecks; if not, just delete
it and go to the next stage.
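
A minimal user-mode sketch of the test-and-test-and-set idea, in C11.
The names ttas_lock, ttas_acquire, ttas_try_acquire and ttas_release
are hypothetical, and atomic_exchange stands in for the bts
instruction; this is an illustration under those assumptions, not the
code of KeAcquireSpinLock( ).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock type: 0 = free, 1 = held. Not the kernel's KSPIN_LOCK. */
typedef struct { atomic_int held; } ttas_lock;

void ttas_init(ttas_lock *l)
{
    atomic_init(&l->held, 0);
}

/* Single attempt: returns 1 if the lock was taken, 0 if it was busy. */
int ttas_try_acquire(ttas_lock *l)
{
    /* Cheap read first; skip the locked operation if the lock looks busy. */
    if (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
        return 0;
    /* The test-and-set proper: one atomic read-modify-write. */
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

void ttas_acquire(ttas_lock *l)
{
    for (;;) {
        /* Outer while loop: spin on plain reads, served from the local
           cache, until the lock looks free. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
            ;
        /* Inner attempt: the atomic exchange, tried only when the lock
           looked free a moment ago. */
        if (atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0)
            return;
    }
}

void ttas_release(ttas_lock *l)
{
    /* Sanity check: releasing a lock that is not held is a caller bug. */
    assert(atomic_load_explicit(&l->held, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

The only difference from a plain test-and-set is the read-only loop in
front of the exchange, so waiters do not issue locked bus operations
while the lock is held.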

However, you see, it is not quite possible to convert the Windows
implementation of KeAcquireSpinLock( ) from test-and-set to
test-and-test-and-set, unless one goes down one notch and operates
below the OS. If one were to disassemble the code, one might find out
that all that’s needed is to insert the one while loop in front of the
call to KeAcquireSpinLock( ), but that would require disassembly, etc.,
which takes time and energy, when all that’s needed is a
three-line-of-code change.

So, what do you suggest, that we don’t try it? Neglect five minutes of
work, and for what, to toe the party line? I don’t think you’re going
to convince me to suggest that kind of thing.