> ----------
From: xxxxx@compuware.com [SMTP:xxxxx@compuware.com]
Reply To: xxxxx@lists.osr.com
Sent: Tuesday, September 30, 2003 4:42 PM
To: xxxxx@lists.osr.com
Subject: [ntdev] Re: Problem using queued spinlocks on single cpu computer
I actually meant pushf/cli/n=KeGetCurrentProcessorNumber/popf/KeAcquireInStackQueuedSpinLock(&lk, &ql[n]), but now I see that the window may not disappear. Yet what burns more cycles, a quick spin or a context switch? I don't know if the increase in I/O reactivity, which might be the natural consequence of allowing that kind of preemption, pays off against the additional aggravation and complexity in managing the I/O complex. The way to speed up an interrupt system isn't to make it more reactive, but to lower the number of interrupts and the latency of handling them.
I don't understand. The necessity to protect against a context switch comes from preemptive multitasking; I don't see the relation to I/O reactivity. Well, the probability of being preempted at a bad moment changes, but it has to be handled the same way even if the probability is very low. Am I missing something?
I don't know, Michal, I don't know. What I find horrible is the fact that I can't trust my state even while I'm doing kernel work. Is that good, safe, efficient? What do I gain by allowing an OS service to be preempted arbitrarily, to the point that I can't even trust my processor number?
Normally, you shouldn't need to know the processor number. Code running at PASSIVE_LEVEL is preemptible, and I see that as an advantage (kernel threads). What is the alternative? Cooperative multitasking in the kernel? You can always raise IRQL to DISPATCH_LEVEL to avoid preemption.
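As a kernel-mode fragment, the raise-IRQL idiom looks roughly like this (WDK headers; this is my illustration of the documented KeRaiseIrql/KeLowerIrql pattern, not code from this thread, and it won't compile in user mode):

```c
#include <ntddk.h>

/* Raising IRQL to DISPATCH_LEVEL disables thread preemption on the current
   processor, so the processor number stays stable until IRQL is lowered. */
VOID ExampleStableProcessorNumber(VOID)
{
    KIRQL oldIrql;
    ULONG cpu;

    KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);
    cpu = KeGetCurrentProcessorNumber();
    /* ... per-processor work here ... */
    KeLowerIrql(oldIrql);
}
```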
The difference is at what time we do memory management. I tried to design it in a way that all the maintenance is done at construct time, so that the Acquire() call ends up generating inline code that does little more than loop on the lock.
OK, generally that is better, but in this case it causes unnecessary complications. Acquirer-class construction plus lock acquire is very simple and should also generate inline code.
The original problem was to achieve a queued spinlock. With one CPU, you call Acquire(), it bumps the tail; you then get into the critical section because now head==tail; then you call Release(), which bumps the head. And you could always bypass that code in the one-CPU case. If I remember, I staggered the head and the tail to take that possibility into account, but again, I'm losing my ability to handle detail. As for IRQL, fine, raise the IRQL first, but then I still believe we'd be better off if we just turned off interrupts on that processor during the spin. I don't at all mind a solution that says pushf/cli/spin/popf; it'd save a lot of aggravation.
Disable interrupts just because some code waits for a lock? That affects the whole system because of local conditions. Do you really think it is a good idea?
> Memory management necessary is one instruction, i.e. sub esp,8. And the
> whole thing requires one line, i.e. the acquired-variable declaration,
> contrary to two lines with your spinlock (lock, unlock).
I'm not sure I follow you here: you have to Acquire and you have to Release. Two lines.
I mean an acquirer class which releases automatically. Yes, the generated code would be very similar, but the C++ source is simpler.
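A minimal sketch of such an acquirer class, assuming any lock type that exposes Acquire()/Release() (the names and the example lock are mine, for illustration):

```cpp
#include <mutex>  // std::mutex only backs the example lock below

// Acquirer: the constructor takes the lock, the destructor releases it, so
// the lock is released automatically when the object goes out of scope,
// even on an early return. Copying is disabled so ownership stays in one scope.
template <class Lock>
class Acquirer {
    Lock& lk_;
public:
    explicit Acquirer(Lock& lk) : lk_(lk) { lk_.Acquire(); }
    ~Acquirer() { lk_.Release(); }
    Acquirer(const Acquirer&) = delete;
    Acquirer& operator=(const Acquirer&) = delete;
};

// Example lock using the Acquire/Release vocabulary from this thread.
class Lock {
    std::mutex m_;
public:
    void Acquire() { m_.lock(); }
    void Release() { m_.unlock(); }
};
```

One declaration at the top of the scope, `Acquirer<Lock> held(lk);`, then replaces the explicit Acquire/Release pair, which is the one-line-versus-two point above.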
A strong point of using OO is that we're now able to code in ways that are not accessible to plain vanilla block-structured languages!
That doesn't mean it must be used in cases where it causes unnecessary complications. The goal of OOP is to simplify programming and make code more readable, not to give developers headaches in the name of purity.
I don't see why one well-encapsulated class is any more error-prone or more complicated than anything else; the complication, if any, comes from the semantics of the call itself and of the OS, not from the inherent problem!
Yes, that is the case here. The OS enforces some rules, and the class design has to fulfil them even if you don't like it.
I also have a problem with implicit actions such as the combination of release-on-destruct and destruct-on-exit, which leads to an implicit release-on-exit: this may be convenient and easy to code, but I believe spinlocks are important enough that we should release them explicitly, if nothing else in consideration of the people who will eventually inherit the code.
This depends on personal preferences. Automatic locks are widely used and shouldn't confuse experienced C++ developers. On the contrary, the necessity to unlock manually can be confusing to somebody who routinely uses autolocks.
I also don't like having to use two objects here, and I find the limitation that spinlocks must be acquired and released within the same stack frame, well, a limitation.
You may not like it, but it is the reality. They were designed this way, which means using them any other way is complicated.
My approach has to do with the way I see spinlocks: a hardware-level mechanism that should sit at the very bottom of the OS layering and should not have to know about any other OS mechanism, including memory management and exception throwing. A mechanism that should be used in a tight setting: loop on the lock, do your job, release the lock, and make sure your job is as small as possible. A fine-grain tool, not a coarse-grain one!
I'd agree. With these requirements it makes sense to use the stack frame, doesn't it? Hmm, guess why the queued spinlock acquire/release routines have ...InStack... in their names. Doesn't that imply the intended usage?
Best regards,
Michal Vodicka
STMicroelectronics Design and Application s.r.o.
[michal.vodicka@st.com, http://www.st.com]