Are callbacks guaranteed to run consecutively?

Ah, now you are speaking :-).

Yes I also agree that OS should not interfare in the sense of taking it out
of execution, put thru the grinding mills of whatever ( round robin,
priority based round robin … ) types of scheduling. But as you said the
interruptiblity should be allowed. There has to have a mechnism to let the
other high priority work to keep that processor’s sanity.

But as mark said, AIX has that yielding ( and I assume it pump the thread to
the process/thd table for it tobe scheduled later). Well it is not really a
spinlock in true spirit. And that is why I mentioned that the whole purpose
of coming up with spinlock is to reduce scheduling overhead ( as acquiring
thd finds it is wokenup and goes back to sleep, … ).

But I never said it is not “Art of computer programming”. It is very much an
art :slight_smile:

-pro

----- Original Message -----
From: “Alberto Moreira”
To: “Windows System Software Devs Interest List”
Sent: Friday, December 28, 2007 7:40 AM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?

> Preemption here is hardware. An interrupt happens, generates the
> equivalent of a forced call to an isr. The isr gets control, does whatever
> it needs to do very, very fast, and irets back to the caller, who keeps on
> spinning. And the whole point of spinning is exactly that you know that
> you need to hog the processor on that spinning loop, because it isn’t safe
> to proceed otherwise.
>
> And if you use the opportunity to wrestle processor control away from the
> spin loop, you’re engaging in dangerous behavior that can easily lead into
> a deadlock or livelock. That’s when that “contract” thing becomes
> necessary, and all hell breaks loose because people want to embellish a
> barebones construct into something it was never meant to be.
>
> The way I see it, and I may be wrong, even when a spinlock loop can be
> preempted by the hardware, it should not be interfered with by the OS.
> Spinlocks should be subarchitectural to the OS: the lowest peel of the
> onion!
>
> Alberto.
>
>
>
> ----- Original Message -----
> From:
> To: “Windows System Software Devs Interest List”
> Sent: Thursday, December 27, 2007 6:27 PM
> Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
>
>
>> Preemption is allowed, but not suspension :slight_smile:
>>
>> What really is the motive for coming up with such a synch* primitive? Why
>> not mutex? Why not event? Why not semaphore with count == 1 ?
>>
>> The motive is not to trap the HW in an useless state. If preemption is
>> not
>> allowed, how would the system maintain other higher priority problem(s)
>> and tasks ( clocks, instruction OP code problems, and other stuff ).
>>
>> The context switching is one of the culprit ( heavy over head ) of those
>> suspendable synch primitive. A suspendable synch primitive is one where
>> the acquiring thread could be suspended, and rescheduled, and again
>> suspended …
>>
>> It is definitely this suspension, that motivated spinlock. Now if they
>> are also not preemptible then there is a real problem though.
>>
>> It is altogether a different story, if someone tries to optimize the bus
>> traffic by not polling the flag/lock too often, but that is still
>> spinning
>> and not suspending or going to wait state where scheduler/dispatcher -
>> intervention is needed to bring the said thread into execution.
>>
>> -pro
>>> A spinlock is supposed to stall a processor. It’s a strong tool to be
>>> used
>>> as an arbiter of last resort. It works like this:
>>>
>>> top: atomic_test_and_set location,true
>>> jump_if_old_value_is_true top
>>>
>>> Or, in C-like prose,
>>>
>>> do
>>> {
>>> old_value = atomic_test_and_set (flag,true);
>>> }
>>> while (old_value);
>>>
>>> To clear a spinlock is even simpler:
>>>
>>> flag = false;
>>>
>>> There’s no yielding of any sort here. Spinlocks don’t block, don’t
>>> yield,
>>> and they’re also not aware of processes, threads, or whatever. It may be
>>> the case that an interrupt causes a context switch, but once control
>>> comes
>>> back, the loop carries on. And I’m definitely not sold on the wisdom of
>>> allowing certain kinds of interrupts and locks to be preempted.
>>>
>>> You can dress up a spinlock with all sorts of extra baggage, but the
>>> more
>>> you add, the less it looks like a spinlock!
>>>
>>>
>>> Alberto.
>>>
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Mark Roddy
>>> To: Windows System Software Devs Interest List
>>> Sent: Wednesday, December 26, 2007 10:57 AM
>>> Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
>>>
>>>
>>>
>>>
>>>
>>> On Dec 26, 2007 9:22 AM, Alberto Moreira wrote:
>>>
>>> I’m sorry, people, I’m going to put on my computer science hat now
>>> and
>>> say that whatever this is, it ain’t a spinlock. The whole point of a
>>> spinlock is to spin without yielding execution!
>>>
>>> At least provide an accurate definition. " The whole point of a
>>> spinlock
>>> is to spin without yielding execution! " No that isn’t the whole
>>> point. That is a distinct feature. And everything we have discussed
>>> here, with the exception of the AIX style ‘spin a while, then sleep’
>>> implementation, doesn’t ‘yield execution’. What we have discussed is
>>> that on NT, and most other modern multiprocessor general purpose
>>> operating systems, spinlocks are interruptible. The currently executing
>>> thread does not yield execution, it is interrupted, runs some isr, and
>>> then returns to spinning. Implementations of spinlocks could block all
>>> interrupts and never allow the processor to do anything other than busy
>>> wait for the lock. Such implementations are generally considered
>>> deficient as they block management of timers and TLBs and other critical
>>> global OS resources while not providing any added benefit. You are of
>>> course free to implement this sort of deficient simplistic private
>>> spinlock, but I would suggest not deploying this spinlock of yours on
>>> general purpose multiprocessor systems.
>>>
>>>
>>>
>>>
>>>
>>> –
>>> Mark Roddy
>>> — NTDEV is sponsored by OSR For our schedule of WDF, WDM, debugging
>>> and other seminars visit: http://www.osr.com/seminars To unsubscribe,
>>> visit the List Server section of OSR Online at
>>> http://www.osronline.com/page.cfm?name=ListServer
>>> —
>>> NTDEV is sponsored by OSR
>>>
>>> For our schedule of WDF, WDM, debugging and other seminars visit:
>>> http://www.osr.com/seminars
>>>
>>> To unsubscribe, visit the List Server section of OSR Online at
>>> http://www.osronline.com/page.cfm?name=ListServer
>>
>>
>>
>> —
>> NTDEV is sponsored by OSR
>>
>> For our schedule of WDF, WDM, debugging and other seminars visit:
>> http://www.osr.com/seminars
>>
>> To unsubscribe, visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer

On an MP system there simply is no guarantee of that “very, very fast”
quality. The spinlock busy wait can be deferred indefinately by
interrupts, unless of course the spinlock busy wait is executed at a high
enough interrupt level to prevent such indeterminacy.

On Dec 28, 2007 10:40 AM, Alberto Moreira wrote:

> Preemption here is hardware. An interrupt happens, generates the
> equivalent
> of a forced call to an isr. The isr gets control, does whatever it needs
> to
> do very, very fast, and irets back to the caller, who keeps on spinning.
> And
> the whole point of spinning is exactly that you know that you need to hog
> the processor on that spinning loop, because it isn’t safe to proceed
> otherwise.
>
> And if you use the opportunity to wrestle processor control away from the
> spin loop, you’re engaging in dangerous behavior that can easily lead into
> a
> deadlock or livelock. That’s when that “contract” thing becomes necessary,
> and all hell breaks loose because people want to embellish a barebones
> construct into something it was never meant to be.
>
> The way I see it, and I may be wrong, even when a spinlock loop can be
> preempted by the hardware, it should not be interfered with by the OS.
> Spinlocks should be subarchitectural to the OS: the lowest peel of the
> onion!
>
> Alberto.
>
>
>
> ----- Original Message -----
> From:
> To: “Windows System Software Devs Interest List”
> Sent: Thursday, December 27, 2007 6:27 PM
> Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
>
>
> > Preemption is allowed, but not suspension :slight_smile:
> >
> > What really is the motive for coming up with such a synch* primitive?
> Why
> > not mutex? Why not event? Why not semaphore with count == 1 ?
> >
> > The motive is not to trap the HW in an useless state. If preemption is
> not
> > allowed, how would the system maintain other higher priority problem(s)
> > and tasks ( clocks, instruction OP code problems, and other stuff ).
> >
> > The context switching is one of the culprit ( heavy over head ) of those
> > suspendable synch primitive. A suspendable synch primitive is one where
> > the acquiring thread could be suspended, and rescheduled, and again
> > suspended …
> >
> > It is definitely this suspension, that motivated spinlock. Now if they
> > are also not preemptible then there is a real problem though.
> >
> > It is altogether a different story, if someone tries to optimize the bus
> > traffic by not polling the flag/lock too often, but that is still
> spinning
> > and not suspending or going to wait state where scheduler/dispatcher -
> > intervention is needed to bring the said thread into execution.
> >
> > -pro
> >> A spinlock is supposed to stall a processor. It’s a strong tool to be
> >> used
> >> as an arbiter of last resort. It works like this:
> >>
> >> top: atomic_test_and_set location,true
> >> jump_if_old_value_is_true top
> >>
> >> Or, in C-like prose,
> >>
> >> do
> >> {
> >> old_value = atomic_test_and_set (flag,true);
> >> }
> >> while (old_value);
> >>
> >> To clear a spinlock is even simpler:
> >>
> >> flag = false;
> >>
> >> There’s no yielding of any sort here. Spinlocks don’t block, don’t
> yield,
> >> and they’re also not aware of processes, threads, or whatever. It may
> be
> >> the case that an interrupt causes a context switch, but once control
> >> comes
> >> back, the loop carries on. And I’m definitely not sold on the wisdom of
> >> allowing certain kinds of interrupts and locks to be preempted.
> >>
> >> You can dress up a spinlock with all sorts of extra baggage, but the
> more
> >> you add, the less it looks like a spinlock!
> >>
> >>
> >> Alberto.
> >>
> >>
> >>
> >>
> >> ----- Original Message -----
> >> From: Mark Roddy
> >> To: Windows System Software Devs Interest List
> >> Sent: Wednesday, December 26, 2007 10:57 AM
> >> Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
> >>
> >>
> >>
> >>
> >>
> >> On Dec 26, 2007 9:22 AM, Alberto Moreira wrote:
> >>
> >> I’m sorry, people, I’m going to put on my computer science hat now
> >> and
> >> say that whatever this is, it ain’t a spinlock. The whole point of a
> >> spinlock is to spin without yielding execution!
> >>
> >> At least provide an accurate definition. " The whole point of a
> >> spinlock
> >> is to spin without yielding execution! " No that isn’t the whole
> >> point. That is a distinct feature. And everything we have discussed
> >> here, with the exception of the AIX style ‘spin a while, then sleep’
> >> implementation, doesn’t ‘yield execution’. What we have discussed is
> >> that on NT, and most other modern multiprocessor general purpose
> >> operating systems, spinlocks are interruptible. The currently executing
> >> thread does not yield execution, it is interrupted, runs some isr, and
> >> then returns to spinning. Implementations of spinlocks could block all
> >> interrupts and never allow the processor to do anything other than busy
> >> wait for the lock. Such implementations are generally considered
> >> deficient as they block management of timers and TLBs and other
> critical
> >> global OS resources while not providing any added benefit. You are of
> >> course free to implement this sort of deficient simplistic private
> >> spinlock, but I would suggest not deploying this spinlock of yours on
> >> general purpose multiprocessor systems.
> >>
> >>
> >>
> >>
> >>
> >> –
> >> Mark Roddy
> >> — NTDEV is sponsored by OSR For our schedule of WDF, WDM, debugging
> >> and other seminars visit: http://www.osr.com/seminars To unsubscribe,
> >> visit the List Server section of OSR Online at
> >> http://www.osronline.com/page.cfm?name=ListServer
> >> —
> >> NTDEV is sponsored by OSR
> >>
> >> For our schedule of WDF, WDM, debugging and other seminars visit:
> >> http://www.osr.com/seminars
> >>
> >> To unsubscribe, visit the List Server section of OSR Online at
> >> http://www.osronline.com/page.cfm?name=ListServer
> >
> >
> >
> > —
> > NTDEV is sponsored by OSR
> >
> > For our schedule of WDF, WDM, debugging and other seminars visit:
> > http://www.osr.com/seminars
> >
> > To unsubscribe, visit the List Server section of OSR Online at
> > http://www.osronline.com/page.cfm?name=ListServer
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>


Mark Roddy

> What you propose - which is also proposed in the paper, by the way…

Actually, I don’t propose anything - I just showed what the paper is *essentially* about, and essentially it is all about a thing that can get done in less than 10 assembly instructions (plus few lines of comments). However, they somehow managed to write few -dozens- pages- size document with “scientific proposals” out of it. This is one of the reasons why I take all these “scientists” and academicians rather ironically. What they are really good at is taking a simple concept , adding various bullshit that is expressed in “sophisticated” terms to it (academicians just cannot live without trying to sound “clever”), and, as a result, arriving to a huge document with “scientific proposals”. However, ask them to write a piece of code that is meant to work *in real life*, and you will see what they are actually worth - the only thing they are good at is speaking…

…is already higher level than the bare bones spinlock

I just wonder what “bare bones spinlock” is …

It uses a hardware lock.

Well, all spinlock do - otherwise, it already would not be a construct that is able to provide MP safety. More on it below…

What if your processors don’t support a bus level hardware lock ?

I am afraid the whole concept of a spinlock is plainly infeasible on such system. There are numerous ways to implement a spinlock - it can be set-and-test; it can be exchange-and-add;
it can be decrementing zero or incrementing 0xFFFFFFFF so that you are able to check overflow flag ;etc;etc;etc. However, all these strategies involve more than one action at the hardware level (i.e. changing variable and setting flag in EFLAGS; adding source and destination and exchanging them; etc). In order to make the very concept of a spinlock workable, all these actions have to be performed atomically by the hardware, which means they require bus locking - you just cannot ensure MP safety if some other bus agent is able to sneak in and modify the target memory location before your operation has been accomplished. This is why all spinlocks use LOCK prefix…

The author tried to analyze different hardware approaches to the problem of
efficiently supporting a spinlock construct in hardware.

… on the system that does not even provide a bus-level hardware lock ??? I wish him good luck…

Anton Bassov

Furthermore, it’s almost a guarantee on a virtualized system that
you’ll end up some day with your virtual processor unscheduled right
in the middle of something that you thought was “very, very fast.”

  • Jake Oshins

“Mark Roddy” wrote in message
news:xxxxx@ntdev…
On an MP system there simply is no guarantee of that “very, very fast”
quality. The spinlock busy wait can be deferred indefinately by
interrupts, unless of course the spinlock busy wait is executed at a
high enough interrupt level to prevent such indeterminacy.

On Dec 28, 2007 10:40 AM, Alberto Moreira wrote:

Preemption here is hardware. An interrupt happens, generates the
equivalent
of a forced call to an isr. The isr gets control, does whatever it
needs to
do very, very fast, and irets back to the caller, who keeps on
spinning. And
the whole point of spinning is exactly that you know that you need to
hog
the processor on that spinning loop, because it isn’t safe to proceed
otherwise.

And if you use the opportunity to wrestle processor control away from
the
spin loop, you’re engaging in dangerous behavior that can easily lead
into a
deadlock or livelock. That’s when that “contract” thing becomes
necessary,
and all hell breaks loose because people want to embellish a barebones
construct into something it was never meant to be.

The way I see it, and I may be wrong, even when a spinlock loop can be
preempted by the hardware, it should not be interfered with by the OS.
Spinlocks should be subarchitectural to the OS: the lowest peel of the
onion!

Alberto.

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Thursday, December 27, 2007 6:27 PM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?

> Preemption is allowed, but not suspension :slight_smile:
>
> What really is the motive for coming up with such a synch*
> primitive? Why
> not mutex? Why not event? Why not semaphore with count == 1 ?
>
> The motive is not to trap the HW in an useless state. If preemption
> is not
> allowed, how would the system maintain other higher priority
> problem(s)
> and tasks ( clocks, instruction OP code problems, and other stuff ).
>
> The context switching is one of the culprit ( heavy over head ) of
> those
> suspendable synch primitive. A suspendable synch primitive is one
> where
> the acquiring thread could be suspended, and rescheduled, and again
> suspended …
>
> It is definitely this suspension, that motivated spinlock. Now if
> they
> are also not preemptible then there is a real problem though.
>
> It is altogether a different story, if someone tries to optimize the
> bus
> traffic by not polling the flag/lock too often, but that is still
> spinning
> and not suspending or going to wait state where
> scheduler/dispatcher -
> intervention is needed to bring the said thread into execution.
>
> -pro
>> A spinlock is supposed to stall a processor. It’s a strong tool to
>> be
>> used
>> as an arbiter of last resort. It works like this:
>>
>> top: atomic_test_and_set location,true
>> jump_if_old_value_is_true top
>>
>> Or, in C-like prose,
>>
>> do
>> {
>> old_value = atomic_test_and_set (flag,true);
>> }
>> while (old_value);
>>
>> To clear a spinlock is even simpler:
>>
>> flag = false;
>>
>> There’s no yielding of any sort here. Spinlocks don’t block, don’t
>> yield,
>> and they’re also not aware of processes, threads, or whatever. It
>> may be
>> the case that an interrupt causes a context switch, but once
>> control
>> comes
>> back, the loop carries on. And I’m definitely not sold on the
>> wisdom of
>> allowing certain kinds of interrupts and locks to be preempted.
>>
>> You can dress up a spinlock with all sorts of extra baggage, but
>> the more
>> you add, the less it looks like a spinlock!
>>
>>
>> Alberto.
>>
>>
>>
>>
>> ----- Original Message -----
>> From: Mark Roddy
>> To: Windows System Software Devs Interest List
>> Sent: Wednesday, December 26, 2007 10:57 AM
>> Subject: Re: [ntdev] Are callbacks guaranteed to run
>> consecutively?
>>
>>
>>
>>
>>
>> On Dec 26, 2007 9:22 AM, Alberto Moreira < xxxxx@ieee.org>
>> wrote:
>>
>> I’m sorry, people, I’m going to put on my computer science hat
>> now
>> and
>> say that whatever this is, it ain’t a spinlock. The whole point of
>> a
>> spinlock is to spin without yielding execution!
>>
>> At least provide an accurate definition. " The whole point of a
>> spinlock
>> is to spin without yielding execution! " No that isn’t the whole
>> point. That is a distinct feature. And everything we have discussed
>> here, with the exception of the AIX style ‘spin a while, then
>> sleep’
>> implementation, doesn’t ‘yield execution’. What we have discussed
>> is
>> that on NT, and most other modern multiprocessor general purpose
>> operating systems, spinlocks are interruptible. The currently
>> executing
>> thread does not yield execution, it is interrupted, runs some isr,
>> and
>> then returns to spinning. Implementations of spinlocks could block
>> all
>> interrupts and never allow the processor to do anything other than
>> busy
>> wait for the lock. Such implementations are generally considered
>> deficient as they block management of timers and TLBs and other
>> critical
>> global OS resources while not providing any added benefit. You are
>> of
>> course free to implement this sort of deficient simplistic private
>> spinlock, but I would suggest not deploying this spinlock of yours
>> on
>> general purpose multiprocessor systems.
>>
>>
>>
>>
>>
>> –
>> Mark Roddy
>> — NTDEV is sponsored by OSR For our schedule of WDF, WDM,
>> debugging
>> and other seminars visit: http://www.osr.com/seminars To
>> unsubscribe,
>> visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer
>> —
>> NTDEV is sponsored by OSR
>>
>> For our schedule of WDF, WDM, debugging and other seminars visit:
>> http://www.osr.com/seminars
>>
>> To unsubscribe, visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer
>
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer


Mark Roddy

> On an MP system there simply is no guarantee of that “very, very fast” quality.

The spinlock busy wait can be deferred indefinately by interrupts, unless of course
the spinlock busy wait is executed at a high enough interrupt level to prevent
such indeterminacy.

Well, you can disable all maskable interrupts, but even then your spinning can get interrupted by SMI that is non-maskable and is completely transparent to the OS software…

Anton Bassov

> A standard spinlock requires one bit. That’s it.

Wrong!!! In addition to changing the target bit a "standard "spinlock requires also setting the flag in EFLAGS that tells you whether the state of the bit got changed by your set-and-test operation.

Therefore, spinlock acquisition requires changing at least two operands, and these operands have to be changed by the CPU as an atomic operation that cannot be disrupted by any other bus agent. This is why, as I have already said in my previous post. there is no way to write a spinlock that ensures MP safety if the target hardware does not support bus locking…

Anton Bassov

The flag is automatically maintained by the processor and saved/restored by
the hardware when an interrupt occurs. There is no need whatsoever that the
test and the conditional jump are done in one atomic operation, unless of
course an interrupt is ill-behaved enough that it doesn’t return back to the
interrupted code with an iret and an intact stack. The test-and-set
instruction must be atomic, and the hardware guarantees it.

The flag has nothing to do with being multiprocessor safe. There is no way
on earth that another processor or any bus agent can change the zero flag on
the looping processor. What can be interfered by another processor or bus
agent is the read-modify-write cycle that’s required to do a test-and-set on
shared memory, because another processor or another bus agent may insert one
or more cycles between the read and the write that are part of the execution
of the test and set operation.
The code

top: test-and-set bit, true
jump-if-not-zero top

is one hundred per cent multiprocessor safe on a bus-based i386 SMP. Mind
you, the memory location being referenced doesn’t even need to be
noncacheable, because the MESI protocol takes care of it for you.

Alberto.

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Friday, December 28, 2007 7:06 PM
Subject: RE:[ntdev] Are callbacks guaranteed to run consecutively?

>> A standard spinlock requires one bit. That’s it.
>
> Wrong!!! In addition to changing the target bit a "standard "spinlock
> requires also setting the flag in EFLAGS that tells you whether the state
> of the bit got changed by your set-and-test operation.
>
> Therefore, spinlock acquisition requires changing at least two operands,
> and these operands have to be changed by the CPU as an atomic operation
> that cannot be disrupted by any other bus agent. This is why, as I have
> already said in my previous post. there is no way to write a spinlock that
> ensures MP safety if the target hardware does not support bus locking…
>
> Anton Bassov
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer

I disagree. If an interrupt routine runs unpreempted, whatever happens in other processors is irrelevant. Just get in with interrupts disabled, do the job fast, and get out.

But hey, if you allow unlimited preemption by interrupt routines, virtualization stuff going on inside subarchitectural loops, and what not, then obviously your “very fast” is up for grabs. Yet, blame it on the OS design, not on the concept!

Look at it this way. If I’m running a spinlock A which is interrupted by isr B which is interrupted by isr C, if you allow unlimited preemption what you’re really doing is queuing A behind B and B behind C. Given that we’re talking about one processor here (even in an SMP, eh ?), this preemption generates overhead in terms of saving and restoring registers, switching context, and so on, which could be avoided: the place to queue interrupts is in the APIC, independent of the code being executed by the processor.

That’s what I mean when I say “do it fast”. Do not preempt - it’s unnecessary, wastes time burning overhead cycles, and generates complexity and deadlock/livelock dangers that we could do without.

Alberto.

----- Original Message -----
From: Mark Roddy
To: Windows System Software Devs Interest List
Sent: Friday, December 28, 2007 11:09 AM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?

On an MP system there simply is no guarantee of that “very, very fast” quality. The spinlock busy wait can be deferred indefinately by interrupts, unless of course the spinlock busy wait is executed at a high enough interrupt level to prevent such indeterminacy.

On Dec 28, 2007 10:40 AM, Alberto Moreira wrote:

Preemption here is hardware. An interrupt happens, generates the equivalent
of a forced call to an isr. The isr gets control, does whatever it needs to
do very, very fast, and irets back to the caller, who keeps on spinning. And
the whole point of spinning is exactly that you know that you need to hog
the processor on that spinning loop, because it isn’t safe to proceed
otherwise.

And if you use the opportunity to wrestle processor control away from the
spin loop, you’re engaging in dangerous behavior that can easily lead into a
deadlock or livelock. That’s when that “contract” thing becomes necessary,
and all hell breaks loose because people want to embellish a barebones
construct into something it was never meant to be.

The way I see it, and I may be wrong, even when a spinlock loop can be
preempted by the hardware, it should not be interfered with by the OS.
Spinlocks should be subarchitectural to the OS: the lowest peel of the
onion!

Alberto.

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Thursday, December 27, 2007 6:27 PM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?

> Preemption is allowed, but not suspension :slight_smile:
>
> What really is the motive for coming up with such a synch* primitive? Why
> not mutex? Why not event? Why not semaphore with count == 1 ?
>
> The motive is not to trap the HW in an useless state. If preemption is not
> allowed, how would the system maintain other higher priority problem(s)
> and tasks ( clocks, instruction OP code problems, and other stuff ).
>
> The context switching is one of the culprit ( heavy over head ) of those
> suspendable synch primitive. A suspendable synch primitive is one where
> the acquiring thread could be suspended, and rescheduled, and again
> suspended …
>
> It is definitely this suspension, that motivated spinlock. Now if they
> are also not preemptible then there is a real problem though.
>
> It is altogether a different story, if someone tries to optimize the bus
> traffic by not polling the flag/lock too often, but that is still spinning
> and not suspending or going to wait state where scheduler/dispatcher -
> intervention is needed to bring the said thread into execution.
>
> -pro
>> A spinlock is supposed to stall a processor. It’s a strong tool to be
>> used
>> as an arbiter of last resort. It works like this:
>>
>> top: atomic_test_and_set location,true
>> jump_if_old_value_is_true top
>>
>> Or, in C-like prose,
>>
>> do
>> {
>> old_value = atomic_test_and_set (flag,true);
>> }
>> while (old_value);
>>
>> To clear a spinlock is even simpler:
>>
>> flag = false;
>>
>> There’s no yielding of any sort here. Spinlocks don’t block, don’t yield,
>> and they’re also not aware of processes, threads, or whatever. It may be
>> the case that an interrupt causes a context switch, but once control
>> comes
>> back, the loop carries on. And I’m definitely not sold on the wisdom of
>> allowing certain kinds of interrupts and locks to be preempted.
>>
>> You can dress up a spinlock with all sorts of extra baggage, but the more
>> you add, the less it looks like a spinlock!
>>
>>
>> Alberto.
>>
>>
>>
>>
>> ----- Original Message -----
>> From: Mark Roddy
>> To: Windows System Software Devs Interest List
>> Sent: Wednesday, December 26, 2007 10:57 AM
>> Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
>>
>>
>>
>>
>>
>> On Dec 26, 2007 9:22 AM, Alberto Moreira < xxxxx@ieee.org> wrote:
>>
>> I’m sorry, people, I’m going to put on my computer science hat now
>> and
>> say that whatever this is, it ain’t a spinlock. The whole point of a
>> spinlock is to spin without yielding execution!
>>
>> At least provide an accurate definition. " The whole point of a
>> spinlock
>> is to spin without yielding execution! " No that isn’t the whole
>> point. That is a distinct feature. And everything we have discussed
>> here, with the exception of the AIX style ‘spin a while, then sleep’
>> implementation, doesn’t ‘yield execution’. What we have discussed is
>> that on NT, and most other modern multiprocessor general purpose
>> operating systems, spinlocks are interruptible. The currently executing
>> thread does not yield execution, it is interrupted, runs some isr, and
>> then returns to spinning. Implementations of spinlocks could block all
>> interrupts and never allow the processor to do anything other than busy
>> wait for the lock. Such implementations are generally considered
>> deficient as they block management of timers and TLBs and other critical
>> global OS resources while not providing any added benefit. You are of
>> course free to implement this sort of deficient simplistic private
>> spinlock, but I would suggest not deploying this spinlock of yours on
>> general purpose multiprocessor systems.
>>
>>
>>
>>
>>
>> –
>> Mark Roddy
>> — NTDEV is sponsored by OSR For our schedule of WDF, WDM, debugging
>> and other seminars visit: http://www.osr.com/seminars To unsubscribe,
>> visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer
>> —
>> NTDEV is sponsored by OSR
>>
>> For our schedule of WDF, WDM, debugging and other seminars visit:
>> http://www.osr.com/seminars
>>
>> To unsubscribe, visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer
>
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer


Mark Roddy — NTDEV is sponsored by OSR For our schedule of WDF, WDM, debugging and other seminars visit: http://www.osr.com/seminars To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Alberto,

I am afraid all your statements on the subject are just bound to be wrong, because they are based on the totally wrong assumption that is quoted below:

[begin quote]

The test-and-set instruction must be atomic, and the hardware guarantees it.

[end quote]

This is the root of all your errors. Hardware does * NOT* guarantee atomicity of test-and-set in itself - it guarantees atomicity if and only if test-and-set is used with LOCK prefix. As I tried to explain to you already, at the hardware level test-and-set involves modification of 2 operands, i.e. of the target memory location and EFLAGS register. It involves 2 steps:

  1. CF flag in EFLAGS = target bit of the target variable;
  2. Target bit of the target variable=1;

Now imagine what happens if CPU X does step 1, then CPU Y sneaks in, and then CPU X does step 2 - if spinlock is not originally held by anyone, each CPU will believe that it has exclusively acquired a spinlock, so that a spinlock will have 2 holders at once . This scenario may occur if you do test-and-set without LOCK prefix. This is the reason why spinlock acquisition always has to be done as *locked* test-and-set - if the above mention scenario occurs, spinlock is already not an MP-safe construct, is it???

The code
top: test-and-set bit, true
jump-if-not-zero top

is one hundred per cent multiprocessor safe on a bus-based i386 SMP.

I hope by now you already understand that it is NOT MP-safe at all, because you do test-and-set without LOCK prefix…

Anton Bassov

>: the place to queue interrupts is in the APIC

This is EXACTLY what Windows does - it raises IRQL to the appropriate level (i.e. writes the appropriate value to local APIC’s TPR) before invoking ISR, so that ISR may get interrupted only interrupts of priority, higher than the current one, and the ones of lower priority will get processed after ISR returns. However, it has nothing to do with a “conventional” spinlock that is being acquired and held at DPC level - no matter in which order ISRs get processed, they will all get processed before execution returns to the code that acquires/holds a spinlock…

independent of the code being executed by the processor.

This is not the question of code - this is the question of IRQL, and interrupts just cannot be queued in APIC independently of current IRQL. Instead, everything depends on how interrupt’s priority fares against the IRQL that CPU currently runs at…

That’s what I mean when I say “do it fast”. Do not preempt - it’s unnecessary

In other words, what you are saying is “assign a code that acquires/holds the spinlock the highest possible interrupt priority (or just clear IF flag), no matter what and no matter how”. Not really wise approach, don’t you think???

Anton Bassov

You are correct that Intel’s BST requires the LOCK prefix to insure
atomicity. However, the XCHG can be used even though it does not set the
flags. When XCHG is used, then the register can be tested to see if someone
already owned the spinlock, mutex, etc. If already owned and a constant of
‘1’ is always used, the loop must start again with the XCHG to see if it
finally zero - unclaimed. Most processors have instructions that are
atomic, and there is nothing bad about using the LOCK prefix as long as the
number of prefix bytes don’t exceed some processor specific limit that might
cause some prefixes to be dropped if an instruction must be restarted.

wrote in message news:xxxxx@ntdev…
> Alberto,
>
> I am afraid all your statements on the subject are just bound to be wrong,
> because they are based on the totally wrong assumption that is quoted
> below:
>
> [begin quote]
>
> The test-and-set instruction must be atomic, and the hardware guarantees
> it.
>
> [end quote]
>
> This is the root of all your errors. Hardware does * NOT* guarantee
> atomicity of test-and-set in itself - it guarantees atomicity if and only
> if test-and-set is used with LOCK prefix. As I tried to explain to you
> already, at the hardware level test-and-set involves modification of 2
> operands, i.e. of the target memory location and EFLAGS register. It
> involves 2 steps:
>
> 1. CF flag in EFLAGS = target bit of the target variable;
> 2. Target bit of the target variable=1;
>
> Now imagine what happens if CPU X does step 1, then CPU Y sneaks in, and
> then CPU X does step 2 - if spinlock is not originally held by anyone,
> each CPU will believe that it has exclusively acquired a spinlock, so that
> a spinlock will have 2 holders at once . This scenario may occur if you do
> test-and-set without LOCK prefix. This is the reason why spinlock
> acquisition always has to be done as locked test-and-set - if the
> above mention scenario occurs, spinlock is already not an MP-safe
> construct, is it???
>
>> The code
>> top: test-and-set bit, true
>> jump-if-not-zero top
>
>> is one hundred per cent multiprocessor safe on a bus-based i386 SMP.
>
> I hope by now you already understand that it is NOT MP-safe at all,
> because you do test-and-set without LOCK prefix…
>
> Anton Bassov
>
>

> You are correct that Intel’s BST requires the LOCK prefix to insure atomicity.

However, the XCHG can be used even though it does not set the flags.

XCHG will surely do the job - this instruction implicitly asserts #LOCK signal even when it is not prefixed by LOCK, and, hence, is atomic. However, what we are speaking about is not even LOCK prefix but bus locking in itself, i.e. #LOCK signal - Alberto claims that it is possible to implement a spinlock on some (unspecified) hardware architecture that does not support bus-level locking. What I am trying to explain to him that the very concept of MP-safe spinlock on such system is just infeasible in itself…

Anton Bassov

As I mentioned earlier, a naive implementation of spinlocks could of course
simply mask all interrupts before attempting to obtain the lock. Such an
implementation would have a rather unpleasant performance impact on MP
systems as it will not only stall the acquiring processor, but it will
effectively stall all processors when lock acquisition blocks timer and tlb
and other global processor management operations. Your superb ultrafast
heavy metal spinlock has managed to transform itself into a global
bottleneck!

Whatever happens on other processors is not irrelevant. What is nearly
irrelevant is this thread, as we are simply rehasing spinlock design issues
that were well understood and resolved in the industry 20 years ago.

On Dec 29, 2007 12:55 AM, Alberto Moreira wrote:

> I disagree. If an interrupt routine runs unpreempted, whatever happens in
> other processors is irrelevant. Just get in with interrupts disabled, do the
> job fast, and get out.
>
> But hey, if you allow unlimited preemption by interrupt routines,
> virtualization stuff going on inside subarchitectural loops, and what not,
> then obviously your “very fast” is up for grabs. Yet, blame it on the OS
> design, not on the concept!
>
> Look at it this way. If I’m running a spinlock A which is interrupted
> by isr B which is interrupted by isr C, if you allow unlimited preemption
> what you’re really doing is queuing A behind B and B behind C. Given that
> we’re talking about one processor here (even in an SMP, eh ?), this
> preemption generates overhead in terms of saving and restoring registers,
> switching context, and so on, which could be avoided: the place to queue
> interrupts is in the APIC, independent of the code being executed by the
> processor.
>
> That’s what I mean when I say “do it fast”. Do not preempt - it’s
> unnecessary, wastes time burning overhead cycles, and generates complexity
> and deadlock/livelock dangers that we could do without.
>
> Alberto.
>
>
> ----- Original Message -----
> From: Mark Roddy
> To: Windows System Software Devs Interest List
> Sent: Friday, December 28, 2007 11:09 AM
> Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
>
> On an MP system there simply is no guarantee of that “very, very fast”
> quality. The spinlock busy wait can be deferred indefinately by
> interrupts, unless of course the spinlock busy wait is executed at a high
> enough interrupt level to prevent such indeterminacy.
>
> On Dec 28, 2007 10:40 AM, Alberto Moreira wrote:
>
> > Preemption here is hardware. An interrupt happens, generates the
> > equivalent
> > of a forced call to an isr. The isr gets control, does whatever it needs
> > to
> > do very, very fast, and irets back to the caller, who keeps on spinning.
> > And
> > the whole point of spinning is exactly that you know that you need to
> > hog
> > the processor on that spinning loop, because it isn’t safe to proceed
> > otherwise.
> >
> > And if you use the opportunity to wrestle processor control away from
> > the
> > spin loop, you’re engaging in dangerous behavior that can easily lead
> > into a
> > deadlock or livelock. That’s when that “contract” thing becomes
> > necessary,
> > and all hell breaks loose because people want to embellish a barebones
> > construct into something it was never meant to be.
> >
> > The way I see it, and I may be wrong, even when a spinlock loop can be
> > preempted by the hardware, it should not be interfered with by the OS.
> > Spinlocks should be subarchitectural to the OS: the lowest peel of the
> > onion!
> >
> > Alberto.
> >
> >
> >
> > ----- Original Message -----
> > From:
> > To: “Windows System Software Devs Interest List”
> > Sent: Thursday, December 27, 2007 6:27 PM
> > Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
> >
> >
> > > Preemption is allowed, but not suspension :slight_smile:
> > >
> > > What really is the motive for coming up with such a synch* primitive?
> > Why
> > > not mutex? Why not event? Why not semaphore with count == 1 ?
> > >
> > > The motive is not to trap the HW in an useless state. If preemption is
> > not
> > > allowed, how would the system maintain other higher priority
> > problem(s)
> > > and tasks ( clocks, instruction OP code problems, and other stuff ).
> > >
> > > The context switching is one of the culprit ( heavy over head ) of
> > those
> > > suspendable synch primitive. A suspendable synch primitive is one
> > where
> > > the acquiring thread could be suspended, and rescheduled, and again
> > > suspended …
> > >
> > > It is definitely this suspension, that motivated spinlock. Now if
> > they
> > > are also not preemptible then there is a real problem though.
> > >
> > > It is altogether a different story, if someone tries to optimize the
> > bus
> > > traffic by not polling the flag/lock too often, but that is still
> > spinning
> > > and not suspending or going to wait state where scheduler/dispatcher -
> > > intervention is needed to bring the said thread into execution.
> > >
> > > -pro
> > >> A spinlock is supposed to stall a processor. It’s a strong tool to be
> > >> used
> > >> as an arbiter of last resort. It works like this:
> > >>
> > >> top: atomic_test_and_set location,true
> > >> jump_if_old_value_is_true top
> > >>
> > >> Or, in C-like prose,
> > >>
> > >> do
> > >> {
> > >> old_value = atomic_test_and_set (flag,true);
> > >> }
> > >> while (old_value);
> > >>
> > >> To clear a spinlock is even simpler:
> > >>
> > >> flag = false;
> > >>
> > >> There’s no yielding of any sort here. Spinlocks don’t block, don’t
> > yield,
> > >> and they’re also not aware of processes, threads, or whatever. It may
> > be
> > >> the case that an interrupt causes a context switch, but once control
> > >> comes
> > >> back, the loop carries on. And I’m definitely not sold on the wisdom
> > of
> > >> allowing certain kinds of interrupts and locks to be preempted.
> > >>
> > >> You can dress up a spinlock with all sorts of extra baggage, but the
> > more
> > >> you add, the less it looks like a spinlock!
> > >>
> > >>
> > >> Alberto.
> > >>
> > >>
> > >>
> > >>
> > >> ----- Original Message -----
> > >> From: Mark Roddy
> > >> To: Windows System Software Devs Interest List
> > >> Sent: Wednesday, December 26, 2007 10:57 AM
> > >> Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> On Dec 26, 2007 9:22 AM, Alberto Moreira < xxxxx@ieee.org>
> > wrote:
> > >>
> > >> I’m sorry, people, I’m going to put on my computer science hat
> > now
> > >> and
> > >> say that whatever this is, it ain’t a spinlock. The whole point of a
> > >> spinlock is to spin without yielding execution!
> > >>
> > >> At least provide an accurate definition. " The whole point of a
> > >> spinlock
> > >> is to spin without yielding execution! " No that isn’t the whole
> > >> point. That is a distinct feature. And everything we have discussed
> > >> here, with the exception of the AIX style ‘spin a while, then sleep’
> > >> implementation, doesn’t ‘yield execution’. What we have discussed is
> >
> > >> that on NT, and most other modern multiprocessor general purpose
> > >> operating systems, spinlocks are interruptible. The currently
> > executing
> > >> thread does not yield execution, it is interrupted, runs some isr,
> > and
> > >> then returns to spinning. Implementations of spinlocks could block
> > all
> > >> interrupts and never allow the processor to do anything other than
> > busy
> > >> wait for the lock. Such implementations are generally considered
> > >> deficient as they block management of timers and TLBs and other
> > critical
> > >> global OS resources while not providing any added benefit. You are of
> > >> course free to implement this sort of deficient simplistic private
> > >> spinlock, but I would suggest not deploying this spinlock of yours on
> > >> general purpose multiprocessor systems.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> –
> > >> Mark Roddy
> > >> — NTDEV is sponsored by OSR For our schedule of WDF, WDM,
> > debugging
> > >> and other seminars visit: http://www.osr.com/seminars To unsubscribe,
> >
> > >> visit the List Server section of OSR Online at
> > >> http://www.osronline.com/page.cfm?name=ListServer
> > >> —
> > >> NTDEV is sponsored by OSR
> > >>
> > >> For our schedule of WDF, WDM, debugging and other seminars visit:
> > >> http://www.osr.com/seminars
> > >>
> > >> To unsubscribe, visit the List Server section of OSR Online at
> > >> http://www.osronline.com/page.cfm?name=ListServer
> > >
> > >
> > >
> > > —
> > > NTDEV is sponsored by OSR
> > >
> > > For our schedule of WDF, WDM, debugging and other seminars visit:
> > > http://www.osr.com/seminars
> > >
> > > To unsubscribe, visit the List Server section of OSR Online at
> > > http://www.osronline.com/page.cfm?name=ListServer
> >
> >
> > —
> > NTDEV is sponsored by OSR
> >
> > For our schedule of WDF, WDM, debugging and other seminars visit:
> > http://www.osr.com/seminars
> >
> > To unsubscribe, visit the List Server section of OSR Online at
> > http://www.osronline.com/page.cfm?name=ListServer
> >
>
>
>
> –
> Mark Roddy — NTDEV is sponsored by OSR For our schedule of WDF, WDM,
> debugging and other seminars visit: http://www.osr.com/seminars To
> unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>


Mark Roddy

Stalling a processor for a short while has negligible effect on performance provided the duration of the stall is small. Once the processor is in the loop - and I hope that’s what you mean by “lock acquisition” - other processors are prevented from entering the critical section, but that’s the same with any other control construct.

The act of looping with interrupts disabled may put pressure on the frontside bus, but there are lowest-level OS-independent methods to handle that: test-and-test-and-set, and a locked acquisition of the spinlock using an extra variable, as Anton already explained and as the author of that paper suggested. And may I remind you that a barebones spinlock is subarchitectural to the OS and hence the very same piece of code will work just as well under Linux, Solaris or MacOS. In my case at least, using that kind of construct saves code, and hence money!

I do not see what kind of global management operations would be affected if a processor is stalled for a brief moment. The tlb is inside the processor, and an access is an access, looping or not looping that memory location is going to end up in the tlb. Looping on the memory location will not add any entries to the tlb, and if interrupts are disabled, there’s not going to be any other activity affecting the tlb anyway.

The only thing in a spinlock that can affect other processors, again, is the issue of hogging the bus with locked read-write cycles. This is addressed by test-and-test-and-set spinlocks, and by the spinlock acquisition technique shown by Anton in his recent post. Both techniques operate within that one processor, and they alleviate pressure on the frontside bus. Still, note that even with a regular test-and-set spinlock, if critical sections are short enough, the potential flood of locked read-write cycles on the frontside bus will be so short that in the end there may not be much difference between test-and-set and test-and-test-and-set spinlocks - and that maybe explains what that guy found in that spinlock paper.

I don’t know about 20 years ago, but 40 years ago when I was a rookie we had the Exec 8 operating system, designed to operate on the Univac 1108, which had three central processors and two peripheral processors. Lowest level test-and-set spinlocks were widely used to protect critical sections at processor level, and it worked marvellously. They don’t make them like that anymore!

But if you think a bare spinlock slows down the whole system, hey, follow the lead of that paper writer: write a few programs, run them, measure performance in a comprehensive way, and publish. Inquiring minds want to know, and they’re going to be glad you did the legwork!

Alberto.

----- Original Message -----
From: Mark Roddy
To: Windows System Software Devs Interest List
Sent: Saturday, December 29, 2007 8:38 AM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?

As I mentioned earlier, a naive implementation of spinlocks could of course simply mask all interrupts before attempting to obtain the lock. Such an implementation would have a rather unpleasant performance impact on MP systems as it will not only stall the acquiring processor, but it will effectively stall all processors when lock acquisition blocks timer and tlb and other global processor management operations. Your superb ultrafast heavy metal spinlock has managed to transform itself into a global bottleneck!

Whatever happens on other processors is not irrelevant. What is nearly irrelevant is this thread, as we are simply rehasing spinlock design issues that were well understood and resolved in the industry 20 years ago.

On Dec 29, 2007 12:55 AM, Alberto Moreira wrote:

I disagree. If an interrupt routine runs unpreempted, whatever happens in other processors is irrelevant. Just get in with interrupts disabled, do the job fast, and get out.

But hey, if you allow unlimited preemption by interrupt routines, virtualization stuff going on inside subarchitectural loops, and what not, then obviously your “very fast” is up for grabs. Yet, blame it on the OS design, not on the concept!

Look at it this way. If I’m running a spinlock A which is interrupted by isr B which is interrupted by isr C, if you allow unlimited preemption what you’re really doing is queuing A behind B and B behind C. Given that we’re talking about one processor here (even in an SMP, eh ?), this preemption generates overhead in terms of saving and restoring registers, switching context, and so on, which could be avoided: the place to queue interrupts is in the APIC, independent of the code being executed by the processor.

That’s what I mean when I say “do it fast”. Do not preempt - it’s unnecessary, wastes time burning overhead cycles, and generates complexity and deadlock/livelock dangers that we could do without.

Alberto.

----- Original Message -----
From: Mark Roddy
To: Windows System Software Devs Interest List
Sent: Friday, December 28, 2007 11:09 AM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?

On an MP system there simply is no guarantee of that “very, very fast” quality. The spinlock busy wait can be deferred indefinately by interrupts, unless of course the spinlock busy wait is executed at a high enough interrupt level to prevent such indeterminacy.

On Dec 28, 2007 10:40 AM, Alberto Moreira wrote:

Preemption here is hardware. An interrupt happens, generates the equivalent
of a forced call to an isr. The isr gets control, does whatever it needs to
do very, very fast, and irets back to the caller, who keeps on spinning. And
the whole point of spinning is exactly that you know that you need to hog
the processor on that spinning loop, because it isn’t safe to proceed
otherwise.

And if you use the opportunity to wrestle processor control away from the
spin loop, you’re engaging in dangerous behavior that can easily lead into a
deadlock or livelock. That’s when that “contract” thing becomes necessary,
and all hell breaks loose because people want to embellish a barebones
construct into something it was never meant to be.

The way I see it, and I may be wrong, even when a spinlock loop can be
preempted by the hardware, it should not be interfered with by the OS.
Spinlocks should be subarchitectural to the OS: the lowest peel of the
onion!

Alberto.

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Thursday, December 27, 2007 6:27 PM
Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?

> Preemption is allowed, but not suspension :slight_smile:
>
> What really is the motive for coming up with such a synch* primitive? Why
> not mutex? Why not event? Why not semaphore with count == 1 ?
>
> The motive is not to trap the HW in an useless state. If preemption is not
> allowed, how would the system maintain other higher priority problem(s)
> and tasks ( clocks, instruction OP code problems, and other stuff ).
>
> The context switching is one of the culprit ( heavy over head ) of those
> suspendable synch primitive. A suspendable synch primitive is one where
> the acquiring thread could be suspended, and rescheduled, and again
> suspended …
>
> It is definitely this suspension, that motivated spinlock. Now if they
> are also not preemptible then there is a real problem though.
>
> It is altogether a different story, if someone tries to optimize the bus
> traffic by not polling the flag/lock too often, but that is still spinning
> and not suspending or going to wait state where scheduler/dispatcher -
> intervention is needed to bring the said thread into execution.
>
> -pro
>> A spinlock is supposed to stall a processor. It’s a strong tool to be
>> used
>> as an arbiter of last resort. It works like this:
>>
>> top: atomic_test_and_set location,true
>> jump_if_old_value_is_true top
>>
>> Or, in C-like prose,
>>
>> do
>> {
>> old_value = atomic_test_and_set (flag,true);
>> }
>> while (old_value);
>>
>> To clear a spinlock is even simpler:
>>
>> flag = false;
>>
>> There’s no yielding of any sort here. Spinlocks don’t block, don’t yield,
>> and they’re also not aware of processes, threads, or whatever. It may be
>> the case that an interrupt causes a context switch, but once control
>> comes
>> back, the loop carries on. And I’m definitely not sold on the wisdom of
>> allowing certain kinds of interrupts and locks to be preempted.
>>
>> You can dress up a spinlock with all sorts of extra baggage, but the more
>> you add, the less it looks like a spinlock!
>>
>>
>> Alberto.
>>
>>
>>
>>
>> ----- Original Message -----
>> From: Mark Roddy
>> To: Windows System Software Devs Interest List
>> Sent: Wednesday, December 26, 2007 10:57 AM
>> Subject: Re: [ntdev] Are callbacks guaranteed to run consecutively?
>>
>>
>>
>>
>>
>> On Dec 26, 2007 9:22 AM, Alberto Moreira < xxxxx@ieee.org> wrote:
>>
>> I’m sorry, people, I’m going to put on my computer science hat now
>> and
>> say that whatever this is, it ain’t a spinlock. The whole point of a
>> spinlock is to spin without yielding execution!
>>
>> At least provide an accurate definition. " The whole point of a
>> spinlock
>> is to spin without yielding execution! " No that isn’t the whole
>> point. That is a distinct feature. And everything we have discussed
>> here, with the exception of the AIX style ‘spin a while, then sleep’
>> implementation, doesn’t ‘yield execution’. What we have discussed is
>> that on NT, and most other modern multiprocessor general purpose
>> operating systems, spinlocks are interruptible. The currently executing
>> thread does not yield execution, it is interrupted, runs some isr, and
>> then returns to spinning. Implementations of spinlocks could block all
>> interrupts and never allow the processor to do anything other than busy
>> wait for the lock. Such implementations are generally considered
>> deficient as they block management of timers and TLBs and other critical
>> global OS resources while not providing any added benefit. You are of
>> course free to implement this sort of deficient simplistic private
>> spinlock, but I would suggest not deploying this spinlock of yours on
>> general purpose multiprocessor systems.
>>
>>
>>
>>
>>
>> –
>> Mark Roddy
>> — NTDEV is sponsored by OSR For our schedule of WDF, WDM, debugging
>> and other seminars visit: http://www.osr.com/seminars To unsubscribe,
>> visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer
>> —
>> NTDEV is sponsored by OSR
>>
>> For our schedule of WDF, WDM, debugging and other seminars visit:
>> http://www.osr.com/seminars
>>
>> To unsubscribe, visit the List Server section of OSR Online at
>> http://www.osronline.com/page.cfm?name=ListServer
>
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer


Mark Roddy — NTDEV is sponsored by OSR For our schedule of WDF, WDM, debugging and other seminars visit: http://www.osr.com/seminars To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer


Mark Roddy — NTDEV is sponsored by OSR For our schedule of WDF, WDM, debugging and other seminars visit: http://www.osr.com/seminars To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

You can use xchg to implement a spinlock, even though it doesn’t set the
flags. You need to use a byte for your spin variable, and use the convention
that the lock is on if the variable equals 2, open if the variable equals 1.
Then this code does the trick:

top:
mov ecx,2
xchg ecx,variable
loop top

Doesn’t look vary elegant, but it works. And it doesn’t even use the flags!

Alberto.

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Saturday, December 29, 2007 4:17 AM
Subject: RE:[ntdev] Are callbacks guaranteed to run consecutively?

>> You are correct that Intel’s BST requires the LOCK prefix to insure
>> atomicity.
>> However, the XCHG can be used even though it does not set the flags.
>
> XCHG will surely do the job - this instruction implicitly asserts #LOCK
> signal even when it is not prefixed by LOCK, and, hence, is atomic.
> However, what we are speaking about is not even LOCK prefix but bus
> locking in itself, i.e. #LOCK signal - Alberto claims that it is possible
> to implement a spinlock on some (unspecified) hardware architecture that
> does not support bus-level locking. What I am trying to explain to him
> that the very concept of MP-safe spinlock on such system is just
> infeasible in itself…
>
> Anton Bassov
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer

If the interrupt is queued at the apic, irqls become a hardware-only
concern. No need whatsoever to bring the concept into the OS: the interrupt
prioritization is naturally ordered by the hardware, isr’s are blissfully
ignorant of what other isr’s are doing, and life goes on.

Alberto.

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Saturday, December 29, 2007 3:17 AM
Subject: RE:[ntdev] Are callbacks guaranteed to run consecutively?

>
>>: the place to queue interrupts is in the APIC
>
> This is EXACTLY what Windows does - it raises IRQL to the appropriate
> level (i.e. writes the appropriate value to local APIC’s TPR) before
> invoking ISR, so that ISR may get interrupted only interrupts of priority,
> higher than the current one, and the ones of lower priority will get
> processed after ISR returns. However, it has nothing to do with a
> “conventional” spinlock that is being acquired and held at DPC level - no
> matter in which order ISRs get processed, they will all get processed
> before execution returns to the code that acquires/holds a spinlock…
>
>> independent of the code being executed by the processor.
>
> This is not the question of code - this is the question of IRQL, and
> interrupts just cannot be queued in APIC independently of current IRQL.
> Instead, everything depends on how interrupt’s priority fares against the
> IRQL that CPU currently runs at…
>
>> That’s what I mean when I say “do it fast”. Do not preempt - it’s
>> unnecessary
>
> In other words, what you are saying is “assign a code that acquires/holds
> the spinlock the highest possible interrupt priority (or just clear IF
> flag), no matter what and no matter how”. Not really wise approach, don’t
> you think???
>
> Anton Bassov
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer

> You can use xchg to implement a spinlock, even though it doesn’t set the flags.

Sure - after all, this instruction locks the bus (you don’t have to even prefix it with LOCK - if will assert #LOCK signal anyway). However, the whole thing is, again, based upon bus-level locking at the hardware level, and, according to you, everything that relies upon bus-level locking is already not a “bare-bone spinlock”. Therefore, the question is still open - what is a “bare-bone spinlock” in your understanding???

Anton Bassov

> If the interrupt is queued at the apic, irqls become a hardware-only concern.

No need whatsoever to bring the concept into the OS: the interrupt
prioritization is naturally ordered by the hardware,

Sorry, but how on Earth is local APIC going to know whether it should queue an interrupt or raise it straight away if the OS gives up the responsibility of changing IRQL by writing to the local APIC’s TPR??? Are you sure you understand what you are talking about???

Anton Bassov

Mark,

What is nearly irrelevant is this thread, as we are simply rehasing spinlock design
issues that were well understood and resolved in the industry 20 years ago.

Sad enough, but this is true, and,unfortunately, not only for the spinlocks…

Earlier on this thread Don provided a quotation saying that “OS research is dead”. Indeed, it looks like everything that can be possibly invented in this field has been already invented- no matter what you do, you are always going to get a standard reply “It has been done…”. You may have some seemingly brilliant idea, but after doing some research you are more than likely to discover that this “brilliant” idea had been already tested around 30 years ago and got discarded because it turned out worthless…

Anton Bassov

From my own memory - which admittedly is going after all these years - I can
remember two examples of queuing interrupts at hardware level: the Univac
418 III and the Sperry DCP 20/40. I also heard that the Univac 494, a great
realtimer of its age, did not interrupt, but I’m totally ignorant of the
mechanisms they used. I remember however that airlines such as SAS used a
three-494 system for many years to do their reservations, configured in a
failsafe configuration of two redundant live systems and a third acting as a
hot standby in case either of the other two malfunctioned.

I do not remember the details of the 418 III any longer, this was back in
the 70’s and I didn’t personally work with that system. The machine was a
realtimer, designed to handle high volume transaction applications such as
airline reservations, and it queued interrupts at its peripheral processor.

I did however work my fair share with the DCP machines. The system’s I/O
processors handled interrupts totally in hardware. The programmer would
write the equivalent of an IBM/360 channel program, which directed the
specific processor in its bare i/o function. The hardware would queue
interrupts to hardware queues, created and maintained in memory.

The DCP/40 had two processors, which also recognized the hardware queues at
architectural level. Each hardware queue had what we called a “Software
Attention Item” on its header, and also a “Target Queue”. A processor could
“arm” the queue: this meant that whenever an item was added to the queue,
the Software Attention Item would be queued by the hardware into the Target
Queue, also pre-specified in the queue’s header. That queue could also be
armed with its own attention item, so, an interrupt could trigger a whole
tree of queue updates, all by the hardware and transparent to the OS.

For example, I could have multiple terminal queues feeding a central service
queue through its attention item. I could maintain a tree of queues to
properly route messages. I could have a Uniscope terminal’s queue to point
its Software Attention Item to the OS’s dispatch queue: when the interrupt
came, the I/O processor would queue it into the armed queue, which would
direct the processor to enqueue the SAI into a dispatch queue, which had the
effect of scheduling the equivalent of a DPC, totally by hardware,
transparent to the programmer and subarchitectural to the OS. The programmer
didn’t have to write an ISR, except for the few lines of code to direct the
I/O processor; device drivers basically didn’t exist, it was child’s play to
write an IOP program, a few lines would do; and DPCs were dispatched by
hardware.

The processor had instructions to enqueue and dequeue items, and queue
descriptors were kept in tables not unlike today’s GDT, LDT and IDT. The
processor had a CQM instruction - check queue multiple - which could extract
the highest priority item from an array of queues. This was hw-implemented
priority queues, which we used in the OS dispatcher. At one point of time a
group of people in our lab came up with a new OS - it was called “Monitor” -
and its core OS designer, a personal friend of mine, was proud that the path
through his OS dispatcher consisted of no more than 15 machine instructions.

That’s what I mean when I say that interrupts are better queued - and
serialized - at the APIC. The current APIC cannot do it, but that’s the
direction I believe its development should head.

By the way, the DCP/40 first came out in 1978 or 1979, if I remember. That’s
30 years ago. By comparison, the i386 architecture is some 10 years younger.

Alberto.

----- Original Message -----
From:
To: “Windows System Software Devs Interest List”
Sent: Saturday, December 29, 2007 2:28 PM
Subject: RE:[ntdev] Are callbacks guaranteed to run consecutively?

>
>> If the interrupt is queued at the apic, irqls become a hardware-only
>> concern.
>> No need whatsoever to bring the concept into the OS: the interrupt
>> prioritization is naturally ordered by the hardware,
>
> Sorry, but how on Earth is local APIC going to know whether it should
> queue an interrupt or raise it straight away if the OS gives up the
> responsibility of changing IRQL by writing to the local APIC’s TPR??? Are
> you sure you understand what you are talking about???
>
> Anton Bassov
>
>
>
>
> —
> NTDEV is sponsored by OSR
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer