Some years ago, when we were doing a multiprocessor system, one of the
problems we were running was a partial-differential-equation solver. We
had 16 “cores” (actually, PDP-11 systems), so the obvious thing to do was
to partition the input matrix into a 4x4 grid of submatrices, and have each core
work on one partition. But here’s the problem: at the boundaries between
the submatrices, the convolution algorithm required the value in an
adjacent cell. But that cell was in the partition being worked on by
another core. So, of course, we had to synchronize. Any time a border
cell was being computed, its access was locked with a mutex. The
computation crawled. Really crawled. Until one of our researchers
realized what was happening, and threw the synchronization away. So
sometimes the computation was made with the value of the previous
iteration, and sometimes the computation was made with the value of the
current iteration. Since the algorithm converges, it didn’t matter which
of the two values was used. The algorithm experienced a
greater-than-order-of-magnitude improvement. This is because the
“correctness” of the algorithm did not depend upon whether the value was
the latest, or the latest-minus-one; the result of the computation simply
was not affected. Not quite the same as, say, managing accurate reference
counts in shared multicore memory.
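For the curious, the unlocked version looked roughly like the sketch below.
This is not the original PDP-11 code, just an illustration of the structure
in Win32 terms; the grid size, thread count, iteration count, and boundary
condition are all made up, and the deliberate absence of any locking at the
partition boundaries is the whole point.

#include <windows.h>
#include <stdio.h>

#define N        256     /* grid dimension (illustrative) */
#define WORKERS  4       /* one band of rows per worker   */
#define ITERS    500

/* Shared grid, deliberately unlocked.  (Assumes aligned 8-byte loads and
   stores don't tear, as on x64.) */
static volatile double grid[N][N];

static DWORD WINAPI relax_band(LPVOID arg)
{
    int band  = (int)(INT_PTR)arg;
    int first = 1 + band * (N - 2) / WORKERS;          /* first interior row */
    int last  = 1 + (band + 1) * (N - 2) / WORKERS;    /* exclusive          */

    for (int it = 0; it < ITERS; it++) {
        for (int i = first; i < last; i++) {
            for (int j = 1; j < N - 1; j++) {
                /* Rows owned by a neighboring worker may hold either the
                   previous or the current iteration's value; no mutex. */
                grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j] +
                                     grid[i][j - 1] + grid[i][j + 1]);
            }
        }
    }
    return 0;
}

int main(void)
{
    HANDLE h[WORKERS];

    for (int j = 0; j < N; j++)        /* a fixed boundary along one edge */
        grid[0][j] = 100.0;

    for (int w = 0; w < WORKERS; w++)
        h[w] = CreateThread(NULL, 0, relax_band, (LPVOID)(INT_PTR)w, 0, NULL);
    WaitForMultipleObjects(WORKERS, h, TRUE, INFINITE);

    printf("center cell after relaxation: %f\n", grid[N / 2][N / 2]);
    return 0;
}

Each worker reads the boundary rows owned by its neighbors with plain loads,
so it sometimes picks up the previous iteration's value and sometimes the
current one, and the relaxation converges either way.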
To demonstrate the reference-count sort of problem, I once created a
multithreaded app that did selective
sleep operations (this was in app space) and did operations like
Fetch(); Sleep(); Increment(); Sleep(); Store(); Sleep();
One sequence ran with a couple threads doing this; another did the same,
but did a
WaitForSingleObject(mutex); …above sequence…; ReleaseMutex();
It is fun to watch the two counters, which I display on progress bars.
One progress bar goes up linearly; the other moves sporadically and keeps
falling further behind. The idea of the Sleep()
calls is to simulate intercore timing effects at the
hundreds-of-milliseconds granularity instead of the
hundreds-of-picoseconds granularity usually seen at the instruction
level.
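A rough console version of that experiment looks like the sketch below. This
is a reconstruction from the description, not the original app; the original
displayed progress bars, and the loop count and sleep durations here are
arbitrary.

#include <windows.h>
#include <stdio.h>

#define LOOPS 100

static volatile LONG g_racy    = 0;    /* updated with no synchronization */
static volatile LONG g_guarded = 0;    /* updated while holding the mutex */
static HANDLE        g_mutex;

static DWORD WINAPI racy_thread(LPVOID arg)
{
    UNREFERENCED_PARAMETER(arg);
    for (int i = 0; i < LOOPS; i++) {
        LONG v = g_racy;   Sleep(1);   /* Fetch     */
        v = v + 1;         Sleep(1);   /* Increment */
        g_racy = v;        Sleep(1);   /* Store     */
    }
    return 0;
}

static DWORD WINAPI guarded_thread(LPVOID arg)
{
    UNREFERENCED_PARAMETER(arg);
    for (int i = 0; i < LOOPS; i++) {
        WaitForSingleObject(g_mutex, INFINITE);
        LONG v = g_guarded; Sleep(1);
        v = v + 1;          Sleep(1);
        g_guarded = v;      Sleep(1);
        ReleaseMutex(g_mutex);
    }
    return 0;
}

int main(void)
{
    HANDLE h[4];
    g_mutex = CreateMutex(NULL, FALSE, NULL);

    h[0] = CreateThread(NULL, 0, racy_thread,    NULL, 0, NULL);
    h[1] = CreateThread(NULL, 0, racy_thread,    NULL, 0, NULL);
    h[2] = CreateThread(NULL, 0, guarded_thread, NULL, 0, NULL);
    h[3] = CreateThread(NULL, 0, guarded_thread, NULL, 0, NULL);
    WaitForMultipleObjects(4, h, TRUE, INFINITE);

    printf("unprotected counter: %ld (expected %d)\n", g_racy,    2 * LOOPS);
    printf("mutex counter:       %ld (expected %d)\n", g_guarded, 2 * LOOPS);
    return 0;
}

The mutex-protected counter always lands on exactly 2 * LOOPS; the
unprotected one comes up short, because increments get lost between the
Fetch and the Store.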
My observation was that the code illustrated as the “problem” exhibits the
nondeterminism that you can get with or without InterlockedRead/Write, and
the OP does not understand that if a deterministic result is required, the
code is simply wrong. It can only be correct if a deterministic result is
not required.
joe
OK, OK… Slow down everyone. Mr. Raymond can be, well, trying with some
of these conceptual challenges (he and I have butted heads before). But
in this case, he has a point… albeit a small one.
The docs DO say: “This function is atomic with respect to calls to other
interlocked functions.”
He merely wants to know “OK then… Is it guaranteed to be atomic with
respect to anything else?”
STRICTLY speaking, according to the docs, we don’t know if the
read/increment/write operation could be interleaved with… something. A
load, for example. Or a store.
My answer: You’re reading too much into the docs, putting too much faith in
the specifics of the wording and the care with which that wording is sometimes
crafted. I’ll bet dollars to donuts that the writer was simply trying to
make the point that doing interlocked increment doesn’t do anything other
than… you know… interlock the increment operation.
But, in any case, think through the consequences, regardless of whether the
interlocked increment is atomic with respect to anything else:
Let’s say read/inc/write *can* be interleaved with a read.
So, you start with a value of 1 and it’s interlocked incremented to 2.
In all cases, your read will either get a value of 1 or 2. Either is, in
fact, perfectly acceptable… because your code COULD have read the value
just BEFORE the read/inc/write or just AFTER the read/inc/write.
So, “locking” or “being atomic” against the interlocked operation doesn’t
get you anything.
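If you want to see what that means in code, here is a little sketch of my
own (nothing from the docs, and the loop count is arbitrary): one thread
interlocked-increments a shared, aligned LONG while another takes plain
reads of it.

#include <windows.h>
#include <stdio.h>

static volatile LONG g_value = 1;
static volatile LONG g_done  = 0;

static DWORD WINAPI incrementer(LPVOID arg)
{
    UNREFERENCED_PARAMETER(arg);
    for (int i = 0; i < 1000000; i++)
        InterlockedIncrement(&g_value);
    g_done = 1;
    return 0;
}

int main(void)
{
    HANDLE h    = CreateThread(NULL, 0, incrementer, NULL, 0, NULL);
    LONG   last = 0;

    while (!g_done) {
        LONG snap = g_value;   /* plain read of an aligned 32-bit value */
        if (snap < last)       /* never fires: each snapshot is a value
                                  from before or after some increment   */
            printf("impossible: %ld after %ld\n", snap, last);
        last = snap;
    }
    WaitForSingleObject(h, INFINITE);
    printf("final value: %ld\n", g_value);
    return 0;
}

Every snapshot the reader gets is simply the value from before or after some
particular increment, which is all that being “atomic against” the
InterlockedIncrement could ever have bought it.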
Write… the same way. You either update the value BEFORE the
read/inc/write operation or after it. In either case, the result has got
to be acceptable to you.
And in all cases, you can’t simply ++ the value… because that operation
doesn’t represent a single read or write.
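To spell that last point out with a made-up shared_refs variable (just an
illustrative fragment):

#include <windows.h>

static volatile LONG shared_refs = 0;

/* What "shared_refs++" really amounts to: a read, a modify, and a write,
   three separate steps another processor can slip in between. */
void bump_broken(void)
{
    LONG tmp = shared_refs;    /* read   */
    tmp = tmp + 1;             /* modify */
    shared_refs = tmp;         /* write: may clobber someone else's bump */
}

/* The whole read-modify-write has to be a single interlocked operation. */
void bump_correct(void)
{
    InterlockedIncrement(&shared_refs);
}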
Peter
OSR
@OSRDrivers