How to ensure atomic execution of driver code in Multiprocessor environment?

Hi all,
Is using a kernel-mode spinlock enough to ensure atomic execution of a
routine that accesses a hardware register (perhaps via a kernel-mode
API)?
Here is the scenario: I’m planning to develop a driver for a PCI FPGA
card. How am I supposed to access a register within that device
“atomically”? Is there perhaps an article that might help?

Regards,
Darmawan

First of all, you seem to confuse atomicity with synchronization - once your code is interruptible, the very term “atomic” does not really apply to anything that involves more than one instruction. Therefore, the only thing that can be truly atomic is an interlocked operation.
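To make "interlocked operation" concrete, here is a minimal sketch using C11 `<stdatomic.h>` as a user-mode stand-in for the kernel's Interlocked routines. The variable and function names (`pending_requests`, `status_flags`, `owner`, `request_started`, `set_status_bits`, `try_claim`) are hypothetical and only illustrate the idea: each call is one indivisible read-modify-write on a single variable, with no lock involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared driver state (illustration only). */
static _Atomic uint32_t pending_requests = 0;
static _Atomic uint32_t status_flags = 0;
static _Atomic uint32_t owner = 0;          /* 0 means "nobody owns it" */

/* Atomic increment; returns the value after the increment. */
uint32_t request_started(void)
{
    return atomic_fetch_add(&pending_requests, 1u) + 1u;
}

/* Atomically ORs bits into the flag word; returns the previous value. */
uint32_t set_status_bits(uint32_t bits)
{
    return atomic_fetch_or(&status_flags, bits);
}

/* Compare-and-swap: claim ownership only if nobody holds it yet. */
bool try_claim(uint32_t my_id)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong(&owner, &expected, my_id);
}
```

Each function touches exactly one variable. As soon as an operation spans two variables, or several steps on one device register, an interlocked primitive alone no longer covers it and some form of lock is needed.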

Concerning spinlocks: although they allow your operations to be synchronized, they still cannot guarantee exclusive access to the shared resource. Consider the scenario where someone other than your code accesses the given register without bothering to acquire the spinlock. Certainly, under normal circumstances no one should touch the resources that have been allocated to your driver (and you should not touch other drivers’ resources either), and the OS itself is not going to do it. However, there is always at least a theoretical possibility of a third-party driver that does not want to play by the rules - unfortunately, you just cannot protect yourself against it…

Anton Bassov

On 8/20/07, xxxxx@hotmail.com wrote:
>
> First of all, you seem to confuse atomicity with synchronization - once
> your code is interruptible, the very term “atomic” does not really apply to
> anything that involves more than one instruction. Therefore, the only thing
> that can be truly atomic is an interlocked operation.
>
Aha, yes, you’re right. ;-) Anyway, does this mean the code needs to run
at elevated IRQL? (I imagine something higher than DISPATCH_LEVEL.)

> Concerning spinlocks, although they allow your operations to be
> synchronized, they still cannot guarantee exclusive access to the shared
> resource. Consider the scenario where someone other than your code accesses
> the given register without bothering to acquire the spinlock. Certainly,
> under normal circumstances no one should touch the resources that have been
> allocated to your driver (and you should not touch other drivers’ resources
> either), and the OS itself is not going to do it. However, there is always
> at least a theoretical possibility of a third-party driver that does not
> want to play by the rules - unfortunately, you just cannot protect yourself
> against it…

I see.

Regards,

Darmawan Salihun
--------------------------------------------------------------------
-= Human knowledge belongs to the world =-

Yes, sure, other bork’d kernel-mode code can screw up the system,
including your correctly written driver.

To answer the OP’s question: use either a standard spinlock or an
interrupt spinlock. Use a standard spinlock if you only need to protect
your own DISPATCH_LEVEL code paths from concurrent access to the
register or registers in question. Use an interrupt spinlock if your ISR
also needs to access the register or registers.
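As a rough illustration of why a multi-step register access needs a lock at all, here is a user-mode C11 model, not kernel code. It assumes a hypothetical FPGA with an index register that selects one of four internal registers and a data access that hits the selected one. The names `fpga_t`, `spin_acquire`, `spin_release`, `fpga_write_indexed` and `fpga_read_indexed` are invented for this sketch, and the `atomic_flag` only stands in for a driver spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical FPGA model: index_reg selects one of four internal registers. */
typedef struct {
    atomic_flag lock;        /* stands in for the driver's spinlock */
    uint32_t    index_reg;
    uint32_t    bank[4];
} fpga_t;

void fpga_init(fpga_t *dev)
{
    atomic_flag_clear(&dev->lock);
    dev->index_reg = 0;
    for (int i = 0; i < 4; i++)
        dev->bank[i] = 0;
}

static void spin_acquire(atomic_flag *l)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire)) {
        /* spin */
    }
}

static void spin_release(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Select-then-write is two steps, so both happen under the lock. */
void fpga_write_indexed(fpga_t *dev, uint32_t index, uint32_t value)
{
    spin_acquire(&dev->lock);
    dev->index_reg = index & 3u;          /* step 1: select the register */
    dev->bank[dev->index_reg] = value;    /* step 2: write the selected one */
    spin_release(&dev->lock);
}

/* Select-then-read is likewise kept together under the same lock. */
uint32_t fpga_read_indexed(fpga_t *dev, uint32_t index)
{
    spin_acquire(&dev->lock);
    dev->index_reg = index & 3u;
    uint32_t value = dev->bank[dev->index_reg];
    spin_release(&dev->lock);
    return value;
}
```

Without the lock, a second caller could change `index_reg` between step 1 and step 2, and the write would land in the wrong internal register. If an ISR also used the index register, the same structure would apply, but with the interrupt spinlock rather than an ordinary one.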


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

> Anyway, does this mean the code needs to run
> at elevated IRQL? (I imagine something higher than DISPATCH_LEVEL)

If your target machine has more than one CPU, it does not really solve the problem - you just cannot ensure true atomicity without bus locking, and you cannot lock the bus for longer than the duration of a single instruction’s execution.

If your target machine has just one CPU (these days it is quite unwise to make this assumption), then, indeed, you can achieve atomicity by raising IRQL to the highest possible level or just disabling interrupts on the CPU with the CLI instruction…

Anton Bassov

> Yes sure other bork’d kernel mode code can screw up the system,
> including your correctly written driver.

And to make things truly annoying, the BSOD message may say that the crash occurred in your PROPERLY-WRITTEN driver. The only thing that can save your reputation in this case is Driver Verifier, so it makes sense to make sure that your driver passes all of Driver Verifier’s tests…

Anton Bassov

Hmmmm… Maybe the OP can define for us what he, specifically, means by “atomic”. Atomic with regards to what?

This is the question Mr. Bassov and Mr. Roddy are trying to get at with their replies.

Peter
OSR

On 8/21/07, xxxxx@osr.com wrote:
>
> Hmmmm… Maybe the OP can define for us what he, specifically, means by
> “atomic”. Atomic with regards to what?
>
> This is the question Mr. Bassov and Mr. Roddy are trying to get at with
> their replies.
>
Atomic in this context means “the only one executing without being
interrupted”, including assuring that the other processors in a
multiprocessor system do not interfere with the execution of the code. Much
like what Anton has described.

Regards,

Darmawan Salihun
--------------------------------------------------------------------
-= Human knowledge belongs to the world =-

If the register is yours, one of the “InterlockedBlahBlah” macros may be enough - AFAIK they generate a “lock” prefix on a machine instruction, which is resolved at compile time, doesn’t involve the OS, and makes the operation indivisible at the front-side bus level. You can use these macros even inside critical sections without a problem. If, however, you’re accessing a PCI configuration register, the only legit way I know is to go through the Microsoft PCI bus driver.

Yet your concept of “atomic” doesn’t jibe. The only way to have only one processor executing code is to halt the other processors. What you can achieve is mutual exclusion - that is, you can impose atomicity on an operation or a sequence of operations that act on a piece of data, meaning that the sequence executes as if it were the only instruction stream operating on that piece of data.

You must be very aware that atomicity protects data, not code. You can protect a stretch of code with an Interlock, Spinlock or some other synchronization construct, yet that will achieve nothing if some other piece of code somewhere else acts on the same piece of data and doesn’t use the same lock. That’s why PCI configuration accesses in Windows are exposed to race conditions: the OS doesn’t supply a synchronization construct that “atomicizes” third party accesses to PCI registers, hence even though you set up a lock or a semaphore to protect your accesses to them, neither the OS nor some other program will go through that semaphore.
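The "protects data, not code" point can be replayed by hand. The sketch below is a single-threaded user-mode C11 model, not kernel code, and every name in it (`device_t`, `try_lock`, `unlock`, `rogue_set_bits`, `polite_set_bits`, `replay_interleaving`) is hypothetical. It steps through one unlucky interleaving: the lock owner reads the register, another code path modifies it "in the middle", and the owner then writes back its stale value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    atomic_flag lock;
    uint32_t    reg;     /* the shared data the lock is meant to protect */
} device_t;

static bool try_lock(device_t *d) { return !atomic_flag_test_and_set(&d->lock); }
static void unlock(device_t *d)   { atomic_flag_clear(&d->lock); }

/* Rogue path: modifies the register without taking the lock. */
static void rogue_set_bits(device_t *d, uint32_t bits)
{
    d->reg |= bits;
}

/* Cooperating path: touches the register only while holding the lock. */
static bool polite_set_bits(device_t *d, uint32_t bits)
{
    if (!try_lock(d))
        return false;            /* someone else is in the critical section */
    d->reg |= bits;
    unlock(d);
    return true;
}

/* Replays one interleaving by hand and returns the final register value. */
uint32_t replay_interleaving(bool intruder_uses_lock)
{
    device_t d = { ATOMIC_FLAG_INIT, 0x01 };

    bool got = try_lock(&d);     /* owner enters its critical section */
    assert(got);
    uint32_t snapshot = d.reg;   /* owner reads 0x01 */

    /* Another code path runs between the owner's read and write. */
    bool intruder_done = false;
    if (intruder_uses_lock)
        intruder_done = polite_set_bits(&d, 0x80);   /* refused: lock is held */
    else
        rogue_set_bits(&d, 0x80);                    /* slips straight through */

    d.reg = snapshot | 0x02;     /* owner writes back its (possibly stale) value */
    unlock(&d);

    /* A cooperating intruder simply retries once the lock is free. */
    if (intruder_uses_lock && !intruder_done)
        polite_set_bits(&d, 0x80);

    return d.reg;
}
```

With the rogue path the result is 0x03: the 0x80 bit is lost even though the owner held its lock the whole time. With the cooperating path the result is 0x83. The lock only works because every path that touches the data agrees to use it.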

Try reading Gregory Andrews’s excellent “Foundations of Multithreaded, Parallel and Distributed Programs”. It’s all there, and much more!

Alberto.


Still unclear.
Do you need to be sure that:

  1. while poking your hardware nothing else pokes the hardware?

  2. #1 + a minimal delay while poking your hardware?

Given that you mention an MP system I think you just want #1 (#2 wouldn’t apply between processors at any significant level). That’s easy to achieve. Since only your code will be poking your hardware, just write your code such that it doesn’t attempt to poke the hardware in this way from more than one place at a time.

Spinlocks, mutexes, interrupt locks - there are plenty of ways to add exclusion into your code in WDM or WDF.

-p


On 8/21/07, Peter Wieland wrote:
>
> Still unclear.
>
> Do you need to be sure that:
>
> 1. while poking your hardware nothing else pokes the hardware?
>
> 2. #1 + a minimal delay while poking your hardware?
>
> Given that you mention an MP system I think you just want #1 (#2 wouldn’t
> apply between processors at any significant level). That’s easy to
> achieve. Since only your code will be poking your hardware, just write your
> code such that it doesn’t attempt to poke the hardware in this way from more
> than one place at a time.
>
Exactly, I just want #1

Regards,

Darmawan Salihun
--------------------------------------------------------------------
-= Human knowledge belongs to the world =-

Okay. This should be pretty easy for your driver to manage. You just need to be sure that you properly serialize your driver’s access to your device. No one else should be poking at your device’s registers, so this is managed entirely by your driver.

If you need to synchronize this code with your interrupt service routine you should look at KeSynchronizeExecution or KeAcquireInterruptSpinLock - these will allow you to block out your own ISR (on all processors) while you run a particular piece of code. Of course that code runs at device interrupt level (DIRQL), so you’re limited in what you can do. But you’re just poking registers so that shouldn’t be too bad.
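The shape of the KeSynchronizeExecution pattern is "hand the system a callback and a context pointer, and it runs the callback with your ISR locked out". The following is a user-mode C11 model of that shape only. It does not use the real kernel API, and `device_t`, `sync_routine_t`, `synchronize_execution`, `isr_try_run`, `set_bits_ctx_t` and `set_control_bits` are all hypothetical names. The modeled ISR simply gives up if the lock is held, where a real one would spin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    atomic_flag interrupt_lock;  /* models the lock tied to the interrupt */
    uint32_t    control_reg;     /* hypothetical device register */
    unsigned    isr_runs;        /* how many times the modeled ISR ran */
} device_t;

typedef bool (*sync_routine_t)(void *context);

/* Modeled ISR: runs only if it can take the interrupt lock. */
bool isr_try_run(device_t *dev)
{
    if (atomic_flag_test_and_set_explicit(&dev->interrupt_lock,
                                          memory_order_acquire))
        return false;            /* locked out by synchronized code */
    dev->isr_runs++;
    atomic_flag_clear_explicit(&dev->interrupt_lock, memory_order_release);
    return true;
}

/* Runs routine(context) while holding the interrupt lock. */
bool synchronize_execution(device_t *dev, sync_routine_t routine, void *context)
{
    while (atomic_flag_test_and_set_explicit(&dev->interrupt_lock,
                                             memory_order_acquire)) {
        /* spin until the ISR (or another caller) releases the lock */
    }
    bool result = routine(context);
    atomic_flag_clear_explicit(&dev->interrupt_lock, memory_order_release);
    return result;
}

/* Context passed through the void pointer to the synchronized routine. */
typedef struct {
    device_t *dev;
    uint32_t  bits;
    bool      isr_got_in;        /* records whether the ISR slipped in */
} set_bits_ctx_t;

/* Read-modify-write of the control register, meant to run under the lock. */
bool set_control_bits(void *context)
{
    set_bits_ctx_t *ctx = context;
    uint32_t value = ctx->dev->control_reg;        /* read */
    ctx->isr_got_in = isr_try_run(ctx->dev);       /* simulated interrupt here */
    ctx->dev->control_reg = value | ctx->bits;     /* modify and write back */
    return true;
}
```

Calling `synchronize_execution(&dev, set_control_bits, &ctx)` performs the whole read-modify-write with the modeled ISR kept out. The simulated interrupt in the middle is refused, and the ISR can run again once the call returns.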

If you need to synchronize hardware access between various dispatch routines there are two options. You can look at the basic synchronization primitives - events, mutexes, spinlocks - and apply these around the “atomic” blocks of hardware access. Or you can structure your driver to serialize the requests as they come in so that you don’t ever worry about processing more than one at a time. WDF makes this very easy with its sequential queues.
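The second option, serializing requests so that only one is ever in flight, can be sketched as a small FIFO. This is a single-threaded user-mode model and not WDF code. `request_t`, `seq_queue_t`, `queue_submit` and `queue_drain` are hypothetical names, and in a real driver the enqueue side would itself need protection (the framework's sequential queues take care of that for you).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_DEPTH 8

/* Hypothetical request: write 'value' to device register 'reg_offset'. */
typedef struct {
    uint32_t reg_offset;
    uint32_t value;
} request_t;

typedef struct {
    request_t slots[QUEUE_DEPTH];
    size_t    head;              /* index of the oldest queued request */
    size_t    count;             /* number of queued requests */
    bool      busy;              /* true while one request is in flight */
    uint32_t  regs[4];           /* hypothetical device registers */
} seq_queue_t;

/* Appends a request; returns false if the queue is full. */
bool queue_submit(seq_queue_t *q, request_t r)
{
    if (q->count == QUEUE_DEPTH)
        return false;
    q->slots[(q->head + q->count) % QUEUE_DEPTH] = r;
    q->count++;
    return true;
}

/* Processes requests strictly one at a time, in arrival order.
   Returns the number of requests processed. */
size_t queue_drain(seq_queue_t *q)
{
    size_t done = 0;
    while (q->count > 0) {
        assert(!q->busy);        /* never more than one request in flight */
        q->busy = true;

        request_t r = q->slots[q->head];
        q->head = (q->head + 1) % QUEUE_DEPTH;
        q->count--;

        q->regs[r.reg_offset % 4] = r.value;   /* the only place hardware is poked */

        q->busy = false;
        done++;
    }
    return done;
}
```

Because the register write happens in exactly one place and only for the request at the head of the queue, the hardware access needs no lock of its own. The trade-off is that requests cannot overlap, which is usually fine for a device that is only having its registers poked.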

There are other synchronization points - for example you might want to program your device from your DPC and a dispatch routine. But in general the same two techniques apply - let the two activities race and use a lock to synchronize them, or attempt to structure your driver such that only one of the two operations will happen at a time. Depending on what you’re synchronizing you might need to use both techniques.

Good luck
-p
