pushfd
cli
…
popfd
You would not want to enable interrupts if they were already disabled.
Jamey Kirby, Windows DDK MVP
StorageCraft Inc.
xxxxx@storagecraft.com
http://www.storagecraft.com
-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Moreira, Alberto
Sent: Monday, September 29, 2003 12:06 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Re: Problem using queued spinlocks on single cpu computer
I did indeed ignore a few implementation details, but I feel the IRQL level
race can be handled; if worst comes to worst, a cli/sti pair does wonders to
prevent preemption. As far as the one handle per acquire, I’m not sure I
like the idea of having more than one acquire per CPU, although, to be fair,
that restriction would break my producer-consumer example ! In my code,
MaxNumberOfProcessors is an upper limit on the size of your SMP - say, 16,
32, 64. It’s a constant, and the space wasted isn’t that bad if your data
structure is small enough. Or you can ask the OS to get it for you and set
your data structure up accordingly at construction time.
But I was thinking, this whole thing is a bit academic. How about this for a
barebones queued spinlock,
class qlock
{
private:
    unsigned int curr, next;
public:
    qlock() { curr = 0; next = 0; }
    // take your number, wait till you’re called
    void Acquire() { unsigned int n = xAdd(next,1); while (n != curr); }
    // one more customer happily served, next!
    void Release() { xAdd(curr,1); }
};
The n=xAdd(a,1) construct is equivalent to a hardware-level locked, indivisible
fetch-and-add: n receives the old value of a, and a is then incremented by 1.
This code is a straightforward application of the Bakery Algorithm,
the queue only exists in our minds, and it wraps around at 0xffffffff. You
can tweak it to serve whatever you need ! And, what’s an advantage to me, it
works even if the OS’s down; besides, it takes less code to implement than
other solutions that use the OS call. And look, ma, no handles !
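
For readers who want to try the idea outside the kernel, here is a minimal user-mode sketch of the same take-a-number lock. It assumes that std::atomic's fetch_add is an acceptable stand-in for the xAdd primitive above; the names TicketLock and RunTicketDemo are made up for illustration, and the yield() in the wait loop is a user-mode courtesy that a real kernel spinlock would not have.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// User-mode sketch of the "take a number" lock described above.
// std::atomic::fetch_add stands in for xAdd (assumption).
class TicketLock {
    std::atomic<unsigned int> next{0};  // next ticket to hand out
    std::atomic<unsigned int> curr{0};  // ticket currently being served
public:
    void Acquire() {
        // Take a number; fetch_add returns the value before the increment.
        const unsigned int n = next.fetch_add(1);
        // Wait until our number is called.
        while (curr.load(std::memory_order_acquire) != n) {
            std::this_thread::yield();
        }
    }
    void Release() {
        // Call the next number; unsigned arithmetic wraps at 0xffffffff.
        curr.fetch_add(1, std::memory_order_release);
    }
};

// Hypothetical demo: several threads bump a plain (non-atomic) counter
// that is protected only by the lock, and the final count is returned.
inline long RunTicketDemo(int threads, int itersPerThread) {
    TicketLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < itersPerThread; ++i) {
                lock.Acquire();
                ++counter;
                lock.Release();
            }
        });
    }
    for (auto& th : pool) th.join();
    assert(counter >= 0);
    return counter;
}
```

If the lock works, RunTicketDemo(4, 1000) should come back with exactly 4000, since no increment can be lost while the lock is held.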
As for the producer-consumer, if you have a spinlock class, it’s a simple
matter to add a counter to it. I was just using the producer-consumer as an
example of something you may want to control without necessarily using an
on-stack alloc and release.
One point in the code I’ve written this last time was to make sure that
everything that requires object or memory management is done at init time.
Again, it keeps all the spinlock maintenance in the same place, and it
avoids allocations and deallocations while running.
As for exceptions, I believe there’s a difference between spinlocks and
other synchronization objects, in that in a spinlock a processor is looping
and if we don’t handle things well, that processor’s going to hang. It may
be a question of personal preference, but I don’t like to do anything inside
a spinlock but the bare minimum I can get away with. I like code structured
in layers, like an onion, and I like to see spinlock protected code at the
very bottom !
Alberto.
-----Original Message-----
From: Michal Vodicka [mailto:xxxxx@veridicom.cz.nospam]
Sent: Monday, September 29, 2003 2:01 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Re: Problem using queued spinlocks on single cpu computer
Alberto,
> Sorry if I annoyed you, my apologies.
>
OK, I wasn’t too annoyed; just felt you don’t want to hear what me and
others try to say.
> I now understand that each processor must have its own lock structure, but
> that’s easily handled. You could try for example something like
>
> struct queuelocks
> {
> protected:
> static KLOCK_QUEUE_HANDLE ql[MaxNumberOfProcessors];
> public:
> queuelocks() { }
> };
>
> struct spinlock : public queuelocks
> {
> protected:
> KSPIN_LOCK lk;
> public:
> spinlock() : queuelocks() { }
> void Acquire( ) {
> KeAcquireInStackQueuedSpinLock(&lk,&ql[KeGetCurrentProcessorNumber()]); }
> void Release( ) {
> KeReleaseInStackQueuedSpinLock(&ql[KeGetCurrentProcessorNumber()]); }
> };
>
There are race conditions: a small window between
KeGetCurrentProcessorNumber() and the real spinlock acquire. If the above code is
called at PASSIVE_LEVEL, the thread can be rescheduled on another CPU in the
meantime. As Max pointed out before, you’d have to raise IRQL to
DISPATCH_LEVEL to make it safe.
Next complaints: it depends on current implementation; from docs it seems
there should be one handle per acquire and not per CPU. What is
MaxNumberOfProcessors? If you set it to for example 64, you waste space most
times and still it is possible the code would fail in the future. You can
allocate it dynamically depending on real conditions which makes code more
complicated. Compare it with a stack acquirer object; the only necessary
thing is to decrease ESP and in most cases the instruction is already there
and only constant changes. Elegant and efficient.
It is of course possible to use similar code as above but it is always more
complicated, less efficient and error prone. And for what? OOP purity? BTW,
how do you like, for example, STL functors?
> Using the spinlock is as simple as s.Acquire() and s.Release(), all the rest
> of the work is done once at init time, and the code is localized in one place:
>
> s.Acquire();
> {
> // critical section goes here…
> }
> s.Release();
>
Exception unsafe…
> This code also gets rid of those dynamic allocations that people objected
> to. And if you don’t like that static variable, you can remove it at the
> cost of grabbing additional memory for each spinlock, or you can compromise
> by having an array of PKLOCK_QUEUE_HANDLEs pointing to structures created by
> the constructor or even allocated at DriverEntry time.
>
Yes, but an on-stack object per acquire still consumes less memory.
> Another advantage is
> that you can use it in multithreaded situations, for example, a
> producer-consumer:
>
> Thread A()
> {
> e.Acquire();
> {
> Produce();
> }
> f.Release();
> }
>
> Thread B()
> {
> f.Acquire();
> {
> Consume();
> }
> e.Release();
> }
>
I believe only semaphores can be used the above way. Imagine what happens if
the producer runs before the consumer and spinlock f, which wasn’t acquired, is
released. I don’t see how a pure spinlock (without an additional counter) could
be used for a producer-consumer solution.
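
To make the point concrete, here is a rough user-mode sketch of the quoted e/f pattern written with counting semaphores instead of spinlocks. The Semaphore class and RunProducerConsumer are hypothetical names; the semaphore is built from std::mutex and std::condition_variable purely for illustration and is not a kernel object. The counter is what makes it legal for the producer to release f before anybody has acquired it.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counting semaphore; the count is the "additional counter"
// a pure spinlock lacks.
class Semaphore {
    std::mutex m;
    std::condition_variable cv;
    unsigned count;
public:
    explicit Semaphore(unsigned initial) : count(initial) {}
    void Acquire() {
        std::unique_lock<std::mutex> g(m);
        cv.wait(g, [this] { return count > 0; });  // block until a unit is available
        --count;
    }
    void Release() {
        {
            std::lock_guard<std::mutex> g(m);
            ++count;  // legal even if this thread never called Acquire()
        }
        cv.notify_one();
    }
};

// Same shape as the quoted Thread A / Thread B example, with a one-item slot.
inline std::vector<int> RunProducerConsumer(int items) {
    Semaphore e(1);  // "empty": the slot starts out free
    Semaphore f(0);  // "full": nothing produced yet
    int slot = 0;
    std::vector<int> consumed;
    std::thread producer([&] {
        for (int i = 1; i <= items; ++i) {
            e.Acquire();
            slot = i;      // Produce()
            f.Release();
        }
    });
    std::thread consumer([&] {
        for (int i = 0; i < items; ++i) {
            f.Acquire();
            consumed.push_back(slot);  // Consume()
            e.Release();
        }
    });
    producer.join();
    consumer.join();
    return consumed;
}
```

With e starting at 1 and f at 0, the consumer should see every produced value exactly once and in order, regardless of which thread is scheduled first.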
> The above code has another advantage: I can quickly replace Acquire( ) and
> Release( ) with machine code, or I can derive such an object, for example,
>
> struct hardlock : public spinlock
> {
> Acquire ( ) { TestAndSetAssemblerMacroGoesHere( ); }
> Release ( ) { SetSpinLockVariableToZeroByHand( ); }
> }
>
> which may be required at debug time when the OS is out to lunch. Moreover, I
> could even move Acquire( ) and Release( ) to the superclass, for example,
>
> struct queuelocks
> {
> protected:
> static KLOCK_QUEUE_HANDLE ql[MaxNumberOfProcessors];
> KSPIN_LOCK s;
> public:
> queuelocks() { }
> virtual void Acquire() {
> KeAcquireInStackQueuedSpinLock(&s,&ql[KeGetCurrentProcessorNumber()]); }
> virtual void Release() {
> KeReleaseInStackQueuedSpinLock(&ql[KeGetCurrentProcessorNumber()]); }
> };
>
> And now I have the option to override the Acquire and Release methods with
> my own code (for example, a test, test and set), or with calls to plain
> vanilla KeAcquireSpinLock( ) and to KeReleaseSpinLock( ): I can derive
> multiple kinds of spinlocks from that one superclass. I can add common
> debugging code to the queuelocks class too.
>
I don’t see a difference from the acquirer class. You can change it in any
necessary way, too, without changing the code which uses it.
> There are umpteen variations on this theme. But of course, caveat emptor, I
> didn’t try the code in a real machine, so I don’t know if the compiler will
> generate good kernel-side code. The assumption of course is that you only
> need one KLOCK_QUEUE_HANDLE per processor, meaning, a processor will never
> be waiting on more than one spinlock.
>
> Maybe I reacted too strongly, my apologies, but I don’t like things such as
> throwing exceptions from inside a critical section, I believe one should
> release the spinlock first and then go deal with exception conditions.
>
Let’s forget about spinlocks for a while and think about any synchronization
object, such as an event, critical section, mutex, or semaphore. There are two
ways to handle error conditions in C++: classic status codes and C++
exceptions. If you decide on exceptions, any function, including operators,
can throw an exception, and you can’t assume code inside a critical section
behaves differently. Throwing an exception directly can seem silly, but
generally you can’t avoid an indirect throw from a called function or operator.
Actually, it is perfectly reasonable under normal circumstances (a category
into which spinlocks may not fall). It is necessary to be prepared for this
situation, and that’s what autolocking classes were designed for. I don’t want
to discuss C++ exceptions’ pros and cons; the list has a different purpose and I
don’t feel quite competent for it. They just enforce some rules which may
not be ignored; exception-safe class design is one. Spinlocks enforce other
rules, and the two may not mix well. Personally, I like C++ exceptions in user
mode but don’t think they are appropriate for kernel use (no flames, please
:). Every environment has specific requirements, and it is necessary to
choose tools and design code accordingly.
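
A small user-mode sketch of what an autolocking class buys you when something throws inside the critical section. All names here (SimpleLock, AutoLock, MayThrow and the two helper functions) are invented for the example, and the lock is a plain test-and-set flag rather than a kernel spinlock; it is only meant to show the unwinding behaviour, not to suggest using exceptions in kernel mode.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>

// Hypothetical test-and-set lock that can report whether it is held.
class SimpleLock {
    std::atomic<bool> locked{false};
public:
    void Acquire() { while (locked.exchange(true, std::memory_order_acquire)) {} }
    void Release() { locked.store(false, std::memory_order_release); }
    bool IsHeld() const { return locked.load(); }
};

// Autolocking class: acquires in the constructor, releases in the
// destructor, so stack unwinding releases the lock automatically.
template <class Lock>
class AutoLock {
    Lock& lock;
public:
    explicit AutoLock(Lock& l) : lock(l) { lock.Acquire(); }
    ~AutoLock() { lock.Release(); }
    AutoLock(const AutoLock&) = delete;
    AutoLock& operator=(const AutoLock&) = delete;
};

// Stands in for any function or operator that may throw indirectly.
inline void MayThrow(bool fail) {
    if (fail) throw std::runtime_error("failure inside critical section");
}

// Manual Acquire()/Release() pair: the throw skips Release().
inline bool HeldAfterThrowManual() {
    SimpleLock s;
    try {
        s.Acquire();
        MayThrow(true);
        s.Release();  // skipped when MayThrow throws
    } catch (const std::exception&) {
    }
    return s.IsHeld();
}

// Same critical section guarded by AutoLock: the destructor runs
// during unwinding, before the catch block is entered.
inline bool HeldAfterThrowAuto() {
    SimpleLock s;
    try {
        AutoLock<SimpleLock> guard(s);
        MayThrow(true);
    } catch (const std::exception&) {
    }
    return s.IsHeld();
}
```

HeldAfterThrowManual() leaves the lock held after the exception, while HeldAfterThrowAuto() does not; with a real spinlock the first case would mean the next acquirer spins forever.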
Best regards,
Michal Vodicka
STMicroelectronics Design and Application s.r.o.
[michal.vodicka@st.com, http://www.st.com]
—
Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256
You are currently subscribed to ntdev as: xxxxx@compuware.com
To unsubscribe send a blank email to xxxxx@lists.osr.com