CPU cache, memory and multiple CPUs

Hi All,

Suppose I have the following code:

KeAcquireSpinLock(&mylock, &oldirql);
if (DevExt->bDoStuff == 1)
{
    DoStuff();
    DevExt->bDoStuff = 0;
}
KeReleaseSpinLock(&mylock, oldirql);

It can be executed by multiple CPUs. Is it guaranteed (and, most importantly,
by whom) that the next CPU to enter the protected section of code will see 0
and not the old value in bDoStuff? My concern is that the first CPU might
modify bDoStuff only in its cache and not yet have written it to main memory
by the time the spinlock is released.

Thank you


Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

Personally, I would use InterlockedExchange() to set the value rather than a
plain assignment. It is guaranteed to handle all those caching issues on
whatever hardware you execute on. Let the OS do its job and you do yours, and
you will have far fewer headaches.
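
To make the interlocked suggestion concrete, here is a minimal user-mode sketch. It uses C11 atomic_exchange as a stand-in for InterlockedExchange (an assumption made so the sketch compiles outside a driver; the kernel routine itself is not called), and the names g_do_stuff, do_stuff, and claim_and_do_stuff are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for DevExt->bDoStuff and DoStuff(). */
static atomic_long g_do_stuff = 1;
static int g_times_run = 0;

static void do_stuff(void) { g_times_run++; }

/* Atomically store 0 and look at the value that was there before.
 * Only the caller that observes the old value 1 performs the work, so
 * no other CPU can slip in between the test and the clear. */
static int claim_and_do_stuff(void)
{
    long old = atomic_exchange(&g_do_stuff, 0);
    if (old == 1) {
        do_stuff();
        return 1;   /* this caller did the work */
    }
    return 0;       /* someone else already claimed it */
}
```

Note that this sketch clears the flag before the work runs, whereas the original code clears it afterwards while still holding the spinlock. Whether that difference matters depends on what DoStuff() does.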

Greg

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com]On Behalf Of Ntdev Reader
Sent: Tuesday, January 28, 2003 8:54 AM
To: NT Developers Interest List
Subject: [ntdev] CPU cache, memory and multiple CPUs

You are currently subscribed to ntdev as: xxxxx@pdq.net
To unsubscribe send a blank email to xxxxx@lists.osr.com

I think Peter V’s standard response applies here (i.e., it depends).

You don’t show the logic that sets bDoStuff. If it is set by your interrupt handler, then the answer is definitely no.

If the routine DoStuff() drops and reacquires the spinlock, then no.

Otherwise, the answer is generally yes on x86 machines, because the processors’ caches are kept coherent with one another.

Duane.

-----Original Message-----
From: Gregory G. Dyess [mailto:xxxxx@pdq.net]
Sent: Tuesday, January 28, 2003 10:08 AM
To: NT Developers Interest List
Subject: [ntdev] RE: CPU cache, memory and multiple CPUs

Independent of caching issues, the value of DevExt->bDoStuff can’t be known
outside the critical section unless it is never set except inside the
critical section. Even if this value is set at init time, one must watch out
for race conditions.

Alberto.

The contents of this e-mail are intended for the named addressee only. It
contains information that may be confidential. Unless you are the named
addressee or an authorized designee, you may not copy or use it, or disclose
it to anyone else. If you received it in error please notify us immediately
and then destroy it.

I’m having trouble understanding your point. Let’s assume that bDoStuff
is properly aligned, e.g., is a DWORD (4 bytes) on a 4-byte boundary on an
x86 machine (I restrict myself to x86 because that’s what I am familiar
with); and let’s assume that InterlockedExchange() is being used to
perform the update. And finally, let’s take the original inquirer at his
word, namely, that updating occurs only through the code shown.

Where’s the uncertainty?


If replying by e-mail, please remove “nospam.” from the address.

James Antognini

Who sets bDoStuff to 1 to begin with? Because the flag lives inside the
device extension, it cannot be statically initialized to 1 by the compiler,
so it has to be set initially outside that critical section, or else the
critical section itself is dead code.

Alberto.

-----Original Message-----
From: James Antognini [mailto:xxxxx@mindspring.nospan.com]
Sent: Tuesday, January 28, 2003 11:43 AM
To: NT Developers Interest List
Subject: [ntdev] Re: CPU cache, memory and multiple CPUs


Since some questions have arisen, let me add more details and make the code
more sensible. Of course, this code is not from a real driver; it is merely
used to illustrate my question.

Suppose I have the following code:

KeAcquireSpinLock(&DevExt->mylock, &oldirql);
if (DevExt->bDoStuff == 1)
{
    DoStuff();
    DevExt->bDoStuff = 0;
}
KeReleaseSpinLock(&DevExt->mylock, oldirql);

It can be executed by multiple CPUs. Is it guaranteed (and, most importantly,
by whom) that the next CPU to enter the protected section of code will see 0
and not the old value in bDoStuff? My concern is that the first CPU might
modify bDoStuff only in its cache and not yet have written it to main memory
by the time the spinlock is released.

More specifically, must I use one of the “Interlocked” functions to assign 0
to bDoStuff in this case? Or does KeReleaseSpinLock do a cache flush?

An answer such as “x86, as of today, automatically maintains cache coherency
between CPUs, so you don’t need “Interlocked” on x86”, or any similar answer
that rests on the implementation details of a particular CPU, does not answer
this question, because I’m looking for the right, Microsoft-guaranteed way to
do this very basic thing.



> when the first CPU modified bDoStuff it did it in its cache and has not
> written it to the main memory yet at the time the spinlock was released.

The x86 SMP platform guarantees cache coherency in hardware.

Max

[Sound of hand hitting forehead] Got it. Thanks.



James Antognini

As long as bDoStuff is properly aligned (e.g., a DWORD on a 4-byte boundary),
you’re safe. Microsoft APIs like the Interlocked* routines depend on the same
hardware guarantee. Vendors of other OSes running on other hardware rely
on a similar guarantee. The hardware guarantee is an essential
assumption in the OS concerned.

On the other hand, if you do something weird, like a DWORD aligned on an
odd boundary, I’ve no idea. But unless you force the DDK compiler’s hand
(and here I’m not sure that’s possible), you don’t need to worry about
alignment for such a variable.
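
For anyone who wants to verify the alignment assumption rather than trust it, here is a small sketch in standard C11. The structure DEVICE_EXTENSION_SKETCH and the helper is_naturally_aligned are hypothetical names made up for illustration; only bDoStuff comes from the thread, and uint32_t stands in for a DWORD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device extension layout for illustration only. */
typedef struct DEVICE_EXTENSION_SKETCH {
    uint8_t  SomeByte;   /* odd-sized member placed before the flag */
    uint32_t bDoStuff;   /* compiler pads so this lands on its natural boundary */
} DEVICE_EXTENSION_SKETCH;

/* Compile-time checks: the build fails if the flag is not naturally aligned. */
_Static_assert(offsetof(DEVICE_EXTENSION_SKETCH, bDoStuff) % sizeof(uint32_t) == 0,
               "bDoStuff must sit on a 4-byte boundary within the structure");
_Static_assert(_Alignof(DEVICE_EXTENSION_SKETCH) >= _Alignof(uint32_t),
               "structure alignment must cover its most strictly aligned member");

/* Run-time check for one particular address, e.g. a field inside a raw buffer. */
static int is_naturally_aligned(const void *p, size_t size)
{
    return ((uintptr_t)p % size) == 0;
}
```

With default packing the compile-time checks should pass on mainstream compilers. They become useful once a packing pragma or a hand-computed offset enters the picture.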



James Antognini

On P2, P3 and P4, the MESI protocol should take care of the cache coherency
issue. Take a look at MindShare’s books.

Alberto.

-----Original Message-----
From: Ntdev Reader [mailto:xxxxx@yahoo.com]
Sent: Tuesday, January 28, 2003 12:57 PM
To: NT Developers Interest List
Subject: [ntdev] Re: CPU cache, memory and multiple CPUs


The alignment should not matter. What matters is the semantics of
KeReleaseSpinLock(). On architectures and/or processors that can reorder
a processor’s writes, those semantics involve a memory barrier. This
ensures that the previous store(s), regardless of alignment, complete
before the write that releases the spinlock.

Cache coherency is a “given”. Without it, you don’t work at all (unless
you start doing DCache flushes, etc.).
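
The release semantics described above can be sketched in user-mode C11. This is an analogy only, not how KeAcquireSpinLock or KeReleaseSpinLock are implemented: a hypothetical lock built on atomic_flag, where the acquire and release memory orders play the role of the barrier. All names here (g_lock, sketch_acquire, sketch_release, request_stuff, do_stuff_if_requested) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-mode stand-in for a spinlock. */
static atomic_flag g_lock = ATOMIC_FLAG_INIT;

/* Plain, non-atomic data: protected only by holding g_lock. */
static int g_do_stuff = 0;
static int g_times_run = 0;

static void sketch_acquire(void)
{
    /* Acquire ordering: accesses inside the critical section cannot be
     * moved ahead of the point where the lock was obtained. */
    while (atomic_flag_test_and_set_explicit(&g_lock, memory_order_acquire)) {
        /* spin until the current holder releases the lock */
    }
}

static void sketch_release(void)
{
    /* Release ordering: every store made while holding the lock becomes
     * visible before the lock itself is seen as free by the next owner. */
    atomic_flag_clear_explicit(&g_lock, memory_order_release);
}

static void do_stuff(void) { g_times_run++; }

/* The flag is set under the same lock, never outside it. */
static void request_stuff(void)
{
    sketch_acquire();
    g_do_stuff = 1;
    sketch_release();
}

/* Same shape as the code in the question: plain assignment, no interlocked op. */
static void do_stuff_if_requested(void)
{
    sketch_acquire();
    if (g_do_stuff == 1) {
        do_stuff();
        g_do_stuff = 0;   /* ordered before the release store below */
    }
    sketch_release();
}
```

In this sketch the plain assignment to g_do_stuff needs no atomic operation of its own, because the release in sketch_release() pairs with the acquire in the next caller's sketch_acquire(). That holds only while every reader and writer of the flag takes the same lock.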

JohnR

-----Original Message-----
From: James Antognini [mailto:xxxxx@mindspring.nospan.com]
Sent: Tuesday, January 28, 2003 10:39 AM
To: NT Developers Interest List
Subject: [ntdev] Re: CPU cache, memory and multiple CPUs


I suspected that that is the case, but since the DDK doesn’t say so (at
least under KeReleaseSpinLock(); I haven’t searched the DDK for related
material), I didn’t want to make that strong a statement myself.



James Antognini

I generally won’t answer a question from someone who won’t give their real
name. There are several things to consider with your code…
