Jon Morrison from the Windows Reliability team put up a pretty interesting post on his new blog that relates to this. It covers memory operation reordering and what barriers do for you.
http://blogs.msdn.com/itgoestoeleven/
-p
-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@hotmail.com
Sent: Friday, March 07, 2008 9:42 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] atomic reads/writes on 32/64 bit processor
> Nothing nonsense about what I wrote, you can read everything
> here http://msdn2.microsoft.com/en-us/library/ms686355.aspx
The link that you have provided speaks about memory ordering issues that don’t really apply here (more on it below). They arise when you want to ensure that CPU Y sees variables updated in exactly the same order that CPU X actually updates them. Consider the following scenario. CPU X executes the following lines:
a=b;
b=c;
Due to speculative reads and writes, there is no guarantee that CPU Y will see a updated before b. Therefore, if you want to make sure CPU Y always sees these updates in the right order, you have to insert an SFENCE instruction between these two lines so that the first update gets committed to memory before the second one takes place (I assume x86 and x86_64 here - IA64, i.e. Itanium, offers instructions that provide acquire-only and release-only semantics as well). This is what the article that you linked to is about. However, we are talking about something different here…
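To make the two-store scenario a bit more concrete, here is a minimal sketch in portable C11 rather than WDK code. The names `data`, `ready`, `publish` and `try_consume` are made up for illustration, and `atomic_thread_fence` is only a stand-in for an explicit fence instruction or KeMemoryBarrier(); whether it turns into an actual instruction depends on the target CPU and compiler.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared variables: 'data' must become visible before 'ready'. */
static int data;
static atomic_bool ready;

/* Writer (CPU X): two stores that must be observed in program order. */
void publish(int value)
{
    data = value;                               /* first update */
    atomic_thread_fence(memory_order_release);  /* keep the first store ahead of the second */
    atomic_store_explicit(&ready, true, memory_order_relaxed);  /* second update */
}

/* Reader (CPU Y): returns true and fills *out only if it has observed 'ready'. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);  /* pairs with the writer's release fence */
    *out = data;
    return true;
}
```

Without the two fences, nothing in the language stops a reader from seeing `ready` set while still reading a stale `data`.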
> Bottom line remains he needs to protect his read operations as well to prevent
> race conditions because of memory ordering issues.
As I already told you, *in this context* he does not have to care about memory-ordering issues - he has a single writer, a single variable and multiple readers here. His situation is similar to the one where a spinlock is held by CPU A while CPUs B, C and D try to acquire it, spinning in an inner loop until the spinlock gets released so that they can attempt test-and-set. This kind of thing is implemented by a simple MOV instruction, without either SFENCE or a LOCK prefix.
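The spinlock comparison can be sketched roughly as follows, again in portable C11 and not the way the kernel actually implements its spinlocks. `demo_spinlock` and the `demo_*` functions are hypothetical names. The point is that only the atomic exchange needs a locked operation; the inner wait loop and the release are ordinary loads and stores, which an x86 compiler would typically emit as plain MOV.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } demo_spinlock;

/* One attempt to take the lock. The atomic exchange is the only
   step here that needs a locked read-modify-write. */
bool demo_try_acquire(demo_spinlock *l)
{
    return atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0;
}

void demo_acquire(demo_spinlock *l)
{
    while (!demo_try_acquire(l)) {
        /* Inner loop: readers just keep loading the variable until
           the owner clears it, then retry the exchange. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
            ;
    }
}

/* Single writer: releasing is an ordinary store. */
void demo_release(demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```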
> Using the volatile keyword as you say is one of the possible solutions
The ‘volatile’ keyword applies to optimizations that are made by the *compiler*, while SFENCE applies to the ones made by the CPU. These are different things (although they could get combined - for example, the compiler could generate SFENCE prior to MOV if the target variable is declared with the ‘volatile’ modifier). However, apparently, it does not do that - otherwise, there would be no need for the KeMemoryBarrier() macro. You need the ‘volatile’ modifier here not because of memory reordering that may be done by the CPU but because of the reordering that may be done by the compiler…
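The compiler-versus-CPU distinction can also be shown with the two fence flavours from C11, used here as a portable illustration and not as a description of what the WDK macros expand to. `atomic_signal_fence` restricts only the compiler and emits no fence instruction, while `atomic_thread_fence` restricts the CPU as well. The variable and function names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical pair of variables written one after the other. */
int first_value;
int second_value;

/* Compiler-only barrier: the compiler may not move the stores across it,
   but no hardware fence instruction is generated. */
void store_with_compiler_barrier(int x, int y)
{
    first_value = x;
    atomic_signal_fence(memory_order_seq_cst);
    second_value = y;
}

/* Full barrier: constrains both the compiler and the CPU. */
void store_with_cpu_barrier(int x, int y)
{
    first_value = x;
    atomic_thread_fence(memory_order_seq_cst);
    second_value = y;
}
```

Both functions produce the same result on a single thread; the difference only shows up in the generated code and in what another CPU is allowed to observe.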
> but as far as I know there is nowhere written exactly how the WDK compiler will
> actually treat it.
AFAIK, it is part of the language specification - the compiler has to ensure that it actually reads the memory location declared as ‘volatile’ every time the code accesses it (otherwise, it could cache its contents in a register in advance)…
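A small sketch of what that guarantee buys you, with a hypothetical flag `stop_requested` and made-up function names. Because the flag is declared ‘volatile’, the compiler has to re-read it on every loop iteration and cannot hoist the load into a register; this says nothing about what the CPU does with the access.

```c
#include <stdbool.h>

/* Hypothetical flag, written by one thread and polled by others. */
static volatile bool stop_requested;

void request_stop(void)
{
    stop_requested = true;
}

/* Polls the flag at most max_polls times and returns how many polls
   completed without seeing it set. */
unsigned poll_until_stopped(unsigned max_polls)
{
    unsigned polls = 0;
    while (!stop_requested && polls < max_polls)
        ++polls;  /* stop_requested is loaded from memory on each pass */
    return polls;
}
```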
Anton Bassov