Stalling a processor has a negligible effect on performance provided the stall is short. Once the processor is in the loop - and I hope that’s what you mean by “lock acquisition” - other processors are prevented from entering the critical section, but that’s the same with any other control construct.
The act of looping with interrupts disabled may put pressure on the frontside bus, but there are lowest-level OS-independent methods to handle that: test-and-test-and-set, and a locked acquisition of the spinlock using an extra variable, as Anton already explained and as the author of that paper suggested. And may I remind you that a barebones spinlock is subarchitectural to the OS and hence the very same piece of code will work just as well under Linux, Solaris or MacOS. In my case at least, using that kind of construct saves code, and hence money!
I do not see what kind of global management operations would be affected if a processor is stalled for a brief moment. The TLB is inside the processor, and an access is an access: looping or not, the translation for that memory location is going to end up in the TLB. Looping on the memory location will not add any further entries to the TLB, and if interrupts are disabled, there’s not going to be any other activity affecting the TLB anyway.
The only thing in a spinlock that can affect other processors, again, is the issue of hogging the bus with locked read-write cycles. This is addressed by test-and-test-and-set spinlocks, and by the spinlock acquisition technique shown by Anton in his recent post. Both techniques operate within that one processor, and they alleviate pressure on the frontside bus. Still, note that even with a regular test-and-set spinlock, if critical sections are short enough, the potential flood of locked read-write cycles on the frontside bus will be so short that in the end there may not be much difference between test-and-set and test-and-test-and-set spinlocks - and that maybe explains what that guy found in that spinlock paper.
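For concreteness, here is a minimal sketch of the two variants using C11 atomics. The names (spin_t, spin_try, tas_lock, ttas_lock, spin_unlock) are made up for illustration and do not come from any OS or from the paper, and interrupt masking is deliberately left out because that part is platform-specific. Treat it as a sketch under those assumptions, not as anybody's production lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names throughout; nothing here is taken from a real kernel. */
typedef struct {
    atomic_bool locked;   /* true while some processor holds the lock */
} spin_t;

static void spin_init(spin_t *s)
{
    atomic_init(&s->locked, false);
}

/* One atomic read-modify-write; returns true if the caller got the lock. */
static bool spin_try(spin_t *s)
{
    return !atomic_exchange_explicit(&s->locked, true, memory_order_acquire);
}

/* Plain test-and-set: every loop iteration is an atomic exchange. */
static void tas_lock(spin_t *s)
{
    while (!spin_try(s))
        ;   /* spin */
}

/* Test-and-test-and-set: spin on an ordinary load, which can usually be
 * served from the local cache, and attempt the exchange only once the
 * lock looks free. */
static void ttas_lock(spin_t *s)
{
    for (;;) {
        while (atomic_load_explicit(&s->locked, memory_order_relaxed))
            ;   /* read-only spin */
        if (spin_try(s))
            return;
    }
}

/* Releasing is a single store, as in the barebones version. */
static void spin_unlock(spin_t *s)
{
    atomic_store_explicit(&s->locked, false, memory_order_release);
}
```

Both lock functions end with the same atomic exchange; the only difference is what the processor does while it waits. With short critical sections the waiting phase is brief either way, which is why the two may end up measuring about the same.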
I don’t know about 20 years ago, but 40 years ago when I was a rookie we had the Exec 8 operating system, designed to operate on the Univac 1108, which had three central processors and two peripheral processors. Lowest level test-and-set spinlocks were widely used to protect critical sections at processor level, and it worked marvellously. They don’t make them like that anymore!
But if you think a bare spinlock slows down the whole system, hey, follow the lead of that paper writer: write a few programs, run them, measure performance in a comprehensive way, and publish. Inquiring minds want to know, and they’re going to be glad you did the legwork!
Alberto.
----- Original Message -----
From: Mark Roddy
To: Windows System Software Devs Interest List
Sent: Saturday, December 29, 2007 8:38 AM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
As I mentioned earlier, a naive implementation of spinlocks could of course simply mask all interrupts before attempting to obtain the lock. Such an implementation would have a rather unpleasant performance impact on MP systems as it will not only stall the acquiring processor, but it will effectively stall all processors when lock acquisition blocks timer and tlb and other global processor management operations. Your superb ultrafast heavy metal spinlock has managed to transform itself into a global bottleneck!
Whatever happens on other processors is not irrelevant. What is nearly irrelevant is this thread, as we are simply rehashing spinlock design issues that were well understood and resolved in the industry 20 years ago.
On Dec 29, 2007 12:55 AM, Alberto Moreira wrote:
I disagree. If an interrupt routine runs unpreempted, whatever happens in other processors is irrelevant. Just get in with interrupts disabled, do the job fast, and get out.
But hey, if you allow unlimited preemption by interrupt routines, virtualization stuff going on inside subarchitectural loops, and what not, then obviously your “very fast” is up for grabs. Yet, blame it on the OS design, not on the concept!
Look at it this way. If I’m running a spinlock A which is interrupted by isr B which is interrupted by isr C, if you allow unlimited preemption what you’re really doing is queuing A behind B and B behind C. Given that we’re talking about one processor here (even in an SMP, eh ?), this preemption generates overhead in terms of saving and restoring registers, switching context, and so on, which could be avoided: the place to queue interrupts is in the APIC, independent of the code being executed by the processor.
That’s what I mean when I say “do it fast”. Do not preempt - it’s unnecessary, wastes time burning overhead cycles, and generates complexity and deadlock/livelock dangers that we could do without.
Alberto.
----- Original Message -----
From: Mark Roddy
To: Windows System Software Devs Interest List
Sent: Friday, December 28, 2007 11:09 AM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
On an MP system there simply is no guarantee of that “very, very fast” quality. The spinlock busy wait can be deferred indefinitely by interrupts, unless of course the spinlock busy wait is executed at a high enough interrupt level to prevent such indeterminacy.
On Dec 28, 2007 10:40 AM, Alberto Moreira wrote:
Preemption here is hardware. An interrupt happens, generates the equivalent
of a forced call to an isr. The isr gets control, does whatever it needs to
do very, very fast, and irets back to the caller, who keeps on spinning. And
the whole point of spinning is exactly that you know that you need to hog
the processor on that spinning loop, because it isn’t safe to proceed
otherwise.
And if you use the opportunity to wrestle processor control away from the
spin loop, you’re engaging in dangerous behavior that can easily lead into a
deadlock or livelock. That’s when that “contract” thing becomes necessary,
and all hell breaks loose because people want to embellish a barebones
construct into something it was never meant to be.
The way I see it, and I may be wrong, even when a spinlock loop can be
preempted by the hardware, it should not be interfered with by the OS.
Spinlocks should be subarchitectural to the OS: the lowest peel of the
onion!
Alberto.
----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Thursday, December 27, 2007 6:27 PM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
> Preemption is allowed, but not suspension
>
> What really is the motive for coming up with such a synch* primitive? Why
> not mutex? Why not event? Why not semaphore with count == 1 ?
>
> The motive is not to trap the HW in a useless state. If preemption is not
> allowed, how would the system handle other higher-priority problems
> and tasks (clocks, instruction opcode problems, and other stuff)?
>
> Context switching is one of the culprits (heavy overhead) of those
> suspendable synch primitives. A suspendable synch primitive is one where
> the acquiring thread could be suspended, and rescheduled, and again
> suspended …
>
> It is definitely this suspension that motivated the spinlock. Now if
> spinlocks are also not preemptible, then there is a real problem.
>
> It is altogether a different story if someone tries to optimize the bus
> traffic by not polling the flag/lock too often, but that is still spinning
> and not suspending or going to a wait state where scheduler/dispatcher
> intervention is needed to bring the said thread back into execution.
>
> -pro
>> A spinlock is supposed to stall a processor. It’s a strong tool to be
>> used as an arbiter of last resort. It works like this:
>>
>> top: atomic_test_and_set location,true
>> jump_if_old_value_is_true top
>>
>> Or, in C-like prose,
>>
>> do
>> {
>>     old_value = atomic_test_and_set (flag, true);
>> }
>> while (old_value);
>>
>> To clear a spinlock is even simpler:
>>
>> flag = false;
>>
>> There’s no yielding of any sort here. Spinlocks don’t block, don’t yield,
>> and they’re also not aware of processes, threads, or whatever. It may be
>> the case that an interrupt causes a context switch, but once control comes
>> back, the loop carries on. And I’m definitely not sold on the wisdom of
>> allowing certain kinds of interrupts and locks to be preempted.
>>
>> You can dress up a spinlock with all sorts of extra baggage, but the more
>> you add, the less it looks like a spinlock!
>>
>>
>> Alberto.
>>
>>
>>
>>
>> ----- Original Message -----
>> From: Mark Roddy
>> To: Windows System Software Devs Interest List
>> Sent: Wednesday, December 26, 2007 10:57 AM
>> Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
>>
>>
>>
>>
>>
>> On Dec 26, 2007 9:22 AM, Alberto Moreira < xxxxx@ieee.org> wrote:
>>
>> I’m sorry, people, I’m going to put on my computer science hat now and
>> say that whatever this is, it ain’t a spinlock. The whole point of a
>> spinlock is to spin without yielding execution!
>>
>> At least provide an accurate definition. “The whole point of a spinlock
>> is to spin without yielding execution!” No, that isn’t the whole
>> point. That is a distinct feature. And everything we have discussed
>> here, with the exception of the AIX style ‘spin a while, then sleep’
>> implementation, doesn’t ‘yield execution’. What we have discussed is
>> that on NT, and most other modern multiprocessor general purpose
>> operating systems, spinlocks are interruptible. The currently executing
>> thread does not yield execution, it is interrupted, runs some isr, and
>> then returns to spinning. Implementations of spinlocks could block all
>> interrupts and never allow the processor to do anything other than busy
>> wait for the lock. Such implementations are generally considered
>> deficient as they block management of timers and TLBs and other critical
>> global OS resources while not providing any added benefit. You are of
>> course free to implement this sort of deficient simplistic private
>> spinlock, but I would suggest not deploying this spinlock of yours on
>> general purpose multiprocessor systems.
>>
>>
>>
>>
>>
>> –
>> Mark Roddy
>> — NTDEV is sponsored by OSR For our schedule of WDF, WDM, debugging
>> and other seminars visit: http://www.osr.com/seminars To unsubscribe,
>> visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer