Locks and Deadlocks

OSR_Community_User · October 28, 2004, 1:44pm

I’m having really really hard time to justify "What would be the case
for which I need two spin locks "

Aquire(A)
Do work_1;

Aquire(B)

Do work_2;

Release(B);

Do Work_3;

Release (A)

If it is justified then only we could talk about even getting more spin
locks and potential danger of reverse ordering, well of course I assume
this thread would/could be invoked multiple times in overlapped fashion.
Key point is ( IRQL gets to DISPATCH for this locking ) and then what
happens …

Another question is “who was the first person coined entropy” into
computer related stuff ?. Hint::The person was testing how random is
random. More Hint? No way.

-pro

Peter_Viscarola_OSR · October 28, 2004, 2:20pm

Sinha, Prokash wrote:

I’m having really really hard time to justify "What would be the case
for which I need two spin locks "

Aquire(A)
Do work_1;

Aquire(B)

Do work_2;

Release(B);

Do Work_3;

Release (A)

I guess I’m having trouble understanding what it is that you don’t
understand. It makes me think I’m missing some incredibly subtle point,
or that I’m about to fall into a “gotcha”. But, heck, I’m willing…

In your example above, SpinLock A guards data structure(s) DS1.
SpinLock B guards data structure(s) DS2.

In our driver, SOMEtimes we need to atomically access DS1, SOMEtimes we
need to atomically access DS2, and SOMEtimes (though ideally rarely) we
need to access both… perhaps even using some information one of these
structures to update the other.

This could, of course, be replaced by a single spinlock that guards both
datastructures DS1 and DS2. But if we assume that we want to be able to
update these in parallel, that the times we need to update both together
is not common, that’d be a bad choice.

Peter
OSR

Ray_Trent · October 28, 2004, 3:51pm

And don’t forget this famous (thanks Peter) one: some spinlocks are
actually controlled by the OS.

For example, if you want to guard some operation on your own queue
against re-entrancy, you have to worry about what you do in your cancel
routine, because if you grab spinlocks in the wrong order, you can get a
deadlock with the OS’s cancel spinlock.

PeterGV wrote:

Sinha, Prokash wrote:

> I’m having really really hard time to justify "What would be the case
> for which I need two spin locks "
>
> Aquire(A)
> Do work_1;
>
> Aquire(B)
>
> Do work_2;
>
> Release(B);
>
> Do Work_3;
>
> Release (A)
>

I guess I’m having trouble understanding what it is that you don’t
understand. It makes me think I’m missing some incredibly subtle point,
or that I’m about to fall into a “gotcha”. But, heck, I’m willing…

In your example above, SpinLock A guards data structure(s) DS1. SpinLock
B guards data structure(s) DS2.

In our driver, SOMEtimes we need to atomically access DS1, SOMEtimes we
need to atomically access DS2, and SOMEtimes (though ideally rarely) we
need to access both… perhaps even using some information one of these
structures to update the other.

This could, of course, be replaced by a single spinlock that guards both
datastructures DS1 and DS2. But if we assume that we want to be able to
update these in parallel, that the times we need to update both together
is not common, that’d be a bad choice.

Peter
OSR

–
…/ray..

Please remove “.spamblock” from my email address if you need to contact
me outside the newsgroup.

Prokash_Sinha-1 · October 28, 2004, 4:31pm

Yeah, I need a laser gun to find a whole in yourlogic or trick you Peter :). So no there is no subtle trick here, it is just w/o looking thru those detail doc you and others put together, trying to justify if we need two spin locks or not, when there is no other piece of code accessing those two DSs, as you explained. Only assumption is that MP(HT) machines, and only this piece of code …

The point I was trying to get at is that do we need a 2nd spin lock or just any lightweight lock that can be accessed at that level.

THE SCENARIO I HAVE IN MIND IS EXACTLY WHAT YOU EXPLAINED …

Next time I will sure try to get a laser gun

-pro

Don_Burn_1 · October 28, 2004, 4:39pm

Well a spinlock is in many ways lightweight, except for the interlocked
instructions it has the lowest overhead to acquire and release. If you
aren’t assuming that your code is going to run on an MP system you are nuts,
since this is becoming the norm.

–
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
Remove StopSpam from the email to reply

“Programmers Society Prokash Sinha” wrote in message
news:xxxxx@ntdev…
> Yeah, I need a laser gun to find a whole in yourlogic or trick you Peter
:). So no there is no subtle trick here, it is just w/o looking thru those
detail doc you and others put together, trying to justify if we need two
spin locks or not, when there is no other piece of code accessing those two
DSs, as you explained. Only assumption is that MP(HT) machines, and only
this piece of code …
>
> The point I was trying to get at is that do we need a 2nd spin lock or
just any lightweight lock that can be accessed at that level.
>
> THE SCENARIO I HAVE IN MIND IS EXACTLY WHAT YOU EXPLAINED …
>
> Next time I will sure try to get a laser gun
>
> -pro
>

Prokash_Sinha-1 · October 28, 2004, 4:45pm

Also the original question I asked was flawed ( it is my mental deadlock ), yes
two locks are required for the DS1 DS2 …

-pro

Prokash_Sinha-1 · October 28, 2004, 4:59pm

I will defintely have to buy guns for Don, anyone else need one let me know :-). I would appreciate if you shoot me once, the pain would be once, if I survive !.

IT is HT(MP) as mentioned in that reply…

Also the original question was flawed, since I need two locks, it is the question about the second lock.

-pro

Peter_Viscarola_OSR · October 28, 2004, 5:52pm

Pro…

If your question now becomes: When do I need a spinlock and when do I
need a mutex (or other dispatcher object), the answer is that you only
need a spinlock if you’re serializing access to data structures that can
be accessed (in any code thread) at IRQL DISPATCH_LEVEL or above. This
then requires you to use the appropriate spinlock.

If the data structures being protected are only accessed at IRQL
PASSIVE_LEVEL, for example, no need for spin locks anywhere.

P
OSR

Prokash_Sinha-1 · October 28, 2004, 7:32pm

Thanks Peter, for the last breif stmt. It’s not my day today, and brain termite ( that is not doing driver work for 3 mo. ) eating it fast. I do have all the docs, read before… But I am not doing anything with the cooked up example I just threw …

I did it because, I saw there is a question by Gerish, and the example he gave look like lots of wholes in there ( he mentioned worker thread, very likely it would be executed in PASSIVE_LEVEL ). Then there might be other issues too !

I could not resist, so I dropped two questions, and aswered back in a genric way. Maybe some of you did not get into it, because it was being shooted by others, so it was right for me to ask for clarity from who has all these in their fingure tips… IT IS AGAIN A REFRESHER FOR ME, AND HOPE the other OP will get thehe answer …

-pro

Rod_Widdowson-1 · October 29, 2004, 9:23am

OK, I’ll bite, I never could avoid a good rat-hole and this is a mink-hole.

So I have some code in a single thread which goes:

while (1)
{
lock(a)
manipulate(A); // A is protected by A
lock(b);
manipulate(B);
manipulate_some_more(A); // A still locked for good reasons
unlock(B);
unlock(A);
// time passes
lock(b);
mung(B);
lock(a);
mung(A, B); // both locked
unlock(a);
unlock(b);
}

OK, so this may trigger some warnings but there’s just one thread and all is
OK.

But

we are locking a and locking b. So, there are other threads of execution
which can manipulate A and B
there are also good reasons for holding both locks (say we put A into a
transitional state while we play with B)

This is still OK, but what happens if the other threads of execution ALSO
need to hold both locks - say because they want to do the same manipulation
as above. There is no safe order for them to acquire their locks (not
unless they are incredibly deeply involved in the logic of the thread
above).

OK, so what have I got wrong?

OSR_Community_User · October 29, 2004, 9:49am

Hi Rod,

You haven’t got anything wrong as such. You just have to make sure that
locks are done in a structured, hierarchical, consistant order.

So, in this particular case, you always need to lock A first, then B if you
need both. Of course, if you ONLY need A, or ONLY need B, you can lock only
one of them.

The problem occurs when you have two places that lock both A and B, and one
of them locks A then B, the other locks B then A. If you have the right
amount of (un)luck, this will deadlock, because Thread 1 locks A, Thread 2
locks B and then Thread 1 tries to lock B and waits forever, whilst Thread
2 is waiting for A to be released.

There are of course other solutions:

Not having infiinite wait for the lock to be available and if not
successful in the first X microseconds, release any current lock and start
over.
Use only one lock for all A + B data structures.

But any solution which locks two (or more) different locks in different
order is very likely to cause deadlocks if it’s not guaranteed that at all
times, no other thread will EVER take hold of more than one lock at any
given time.

The BIG problem with this, is that you may well be running the software in
the lab for DAYS or WEEKS, and never find a problem. Only once you ship it,
and some customer with exactly the right setup to cause a deadlock will
find the “perfect” timing to cause a deadlock every so often. And guess
what: This is almost certain to be an important customer who doesn’t like
to see his machine(s) lock up every now and then…

And by the way, as someone pointed out, you need to make sure that you
understand how the OS deals with things too. We’ve had this happen to the
displaydriver… Windows is supposed to only allow one thread at any given
time to access the display driver (except when the PDEV is disabled, which
is a side point). Unfortunately, this is broken. So the simple approach is
to just add a lock on the entry into the driver. NBD. Except that if we
call back into Windows (EngXXXX calls), we have to let go of the lock,
because Windows may well call back into our driver, which means that we’re
currently holding the lock and trying to acquire it at the same time…
Deadlock

–
Mats

xxxxx@lists.osr.com wrote on 10/29/2004 02:21:23 PM:

OK, I’ll bite, I never could avoid a good rat-hole and this is a
mink-hole.

So I have some code in a single thread which goes:

while (1)
{
lock(a)
manipulate(A); // A is protected by A
lock(b);
manipulate(B);
manipulate_some_more(A); // A still locked for good reasons
unlock(B);
unlock(A);
// time passes
lock(b);
mung(B);
lock(a);
mung(A, B); // both locked
unlock(a);
unlock(b);
}

OK, so this may trigger some warnings but there’s just one thread and all
is
OK.

But

we are locking a and locking b. So, there are other threads of
execution
which can manipulate A and B

there are also good reasons for holding both locks (say we put A into
a
transitional state while we play with B)

This is still OK, but what happens if the other threads of execution ALSO
need to hold both locks - say because they want to do the same
manipulation
as above. There is no safe order for them to acquire their locks (not
unless they are incredibly deeply involved in the logic of the thread
above).

OK, so what have I got wrong?

Questions? First check the Kernel Driver FAQ at http://www.
osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: xxxxx@3dlabs.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

ForwardSourceID:NT00006606

Loren_Wilton · October 29, 2004, 10:02am

> as above. There is no safe order for them to acquire their locks (not

unless they are incredibly deeply involved in the logic of the thread
above).

Doesn’t matter how deeply they are involved in the routiine you posted. Run
that routine on two processors and it will deadlock within microseconds.
One processor will be in the first part trying to lock B, and the second
will be in the second part trying to lock A. Since they both hold a lock
the other needs, deadlock.

Loren

Prokash_Sinha-1 · October 29, 2004, 10:09am

Yes we know any interleaving would create a deadlock :). As I said, I’VE
BRAIN TERMITES, specially when I hang outs with bunch of rootkitters and
blackhatters(no they are actually white hatters), that always carries me in
random directions :-).

That is why I came to say, If anyone gurantees that there will be one
execution goes top to bottom, before anyother interleaving there is no dead
lock, the whole thread is ATOMIC…

My other point was, no matter how we think about locks it is always good to
sit back and read those exceptionally well written guides ( from NT insider
and recent docs from MS), but, but , but sometime tossing different twist(s)
to it brings out the flavor, when and where and what type of locks we really
need. IF ALL WERE IN ONE EXECUTION LEVEL (PASSIVE OR DISPATCH etc. etc )
things becomes bit simpler, but with different execution level , it is
another level of complexity, and tossing and arguing is the only ( and that
is tinkering ) way to figure out what is/are really needed.

May be some day Peter GV will have another article "Bulet proofing your
locking mechanism … "

-pro

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com]On Behalf Of Rod Widdowson
Sent: Friday, October 29, 2004 6:21 AM
To: Windows System Software Devs Interest List
Subject: Re:[ntdev] Locks and Deadlocks

OK, I’ll bite, I never could avoid a good rat-hole and this is a mink-hole.

So I have some code in a single thread which goes:

while (1)
{
lock(a)
manipulate(A); // A is protected by A
lock(b);
manipulate(B);
manipulate_some_more(A); // A still locked for good reasons
unlock(B);
unlock(A);
// time passes
lock(b);
mung(B);
lock(a);
mung(A, B); // both locked
unlock(a);
unlock(b);
}

OK, so this may trigger some warnings but there’s just one thread and all is
OK.

But

we are locking a and locking b. So, there are other threads of execution
which can manipulate A and B
there are also good reasons for holding both locks (say we put A into a
transitional state while we play with B)

This is still OK, but what happens if the other threads of execution ALSO
need to hold both locks - say because they want to do the same manipulation
as above. There is no safe order for them to acquire their locks (not
unless they are incredibly deeply involved in the logic of the thread
above).

OK, so what have I got wrong?

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: xxxxx@garlic.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

OSR_Community_User · October 29, 2004, 10:30am

Actually, deadlocks are a well-known CS topic. There are (in essence)
TWO ways of handling deadlocks: detection or avoidance.

In a deadlock detection system, the ability to detect a deadlock is an
inherent part of the system. When a deadlock is detected a corrective
action is taken to break the deadlock. For example, in a transactional
system, one transaction is forced to abort and this allows the other to
continue. Such systems are really essential when we have large, complex
interaction schemes where there is no agreement upon cooperation. This
is a very common solution in the database world, for instance, where
random new applications can be written to access various records of
various tables and lock them.

In a deadlock avoidance system, the order of lock acquisition is modeled
in a hierarchy (or more formally a directed acyclic graph). Since there
is a canonical order for lock acquisition, there is no possibility of
deadlock (assuming correct implementation).

In my experience, OS developers tend to prefer the latter implementation
(deadlock avoidance) because the OS seldom has an embedded transactional
system to ease the pain of handling runtime abort (in other words, our
“runtime abort” is a reboot). However, there are some interesting new
transactional technologies appearing within the Windows OS so perhaps
this will change in the future.

But, for the time being, driver writers (device or file system) will
stick with the deadlock avoidance system.

Of course what is great fun is when you combine this with a reentrant
OS!

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

Looking forward to seeing you at the Next OSR File Systems Class October
18, 2004 in Silicon Valley!

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Mats PETERSSON
Sent: Friday, October 29, 2004 9:45 AM
To: ntdev redirect
Subject: Re:[ntdev] Locks and Deadlocks

Hi Rod,

You haven’t got anything wrong as such. You just have to make sure that
locks are done in a structured, hierarchical, consistant order.

So, in this particular case, you always need to lock A first, then B if
you
need both. Of course, if you ONLY need A, or ONLY need B, you can lock
only
one of them.

The problem occurs when you have two places that lock both A and B, and
one
of them locks A then B, the other locks B then A. If you have the right
amount of (un)luck, this will deadlock, because Thread 1 locks A, Thread
2
locks B and then Thread 1 tries to lock B and waits forever, whilst
Thread
2 is waiting for A to be released.

There are of course other solutions:

Not having infiinite wait for the lock to be available and if not
successful in the first X microseconds, release any current lock and
start
over.
Use only one lock for all A + B data structures.

But any solution which locks two (or more) different locks in different
order is very likely to cause deadlocks if it’s not guaranteed that at
all
times, no other thread will EVER take hold of more than one lock at
any
given time.

The BIG problem with this, is that you may well be running the software
in
the lab for DAYS or WEEKS, and never find a problem. Only once you ship
it,
and some customer with exactly the right setup to cause a deadlock will
find the “perfect” timing to cause a deadlock every so often. And guess
what: This is almost certain to be an important customer who doesn’t
like
to see his machine(s) lock up every now and then…

And by the way, as someone pointed out, you need to make sure that you
understand how the OS deals with things too. We’ve had this happen to
the
displaydriver… Windows is supposed to only allow one thread at any
given
time to access the display driver (except when the PDEV is disabled,
which
is a side point). Unfortunately, this is broken. So the simple approach
is
to just add a lock on the entry into the driver. NBD. Except that if we
call back into Windows (EngXXXX calls), we have to let go of the lock,
because Windows may well call back into our driver, which means that
we’re
currently holding the lock and trying to acquire it at the same time…
Deadlock

–
Mats

xxxxx@lists.osr.com wrote on 10/29/2004 02:21:23 PM:

OK, I’ll bite, I never could avoid a good rat-hole and this is a
mink-hole.

So I have some code in a single thread which goes:

while (1)
{
lock(a)
manipulate(A); // A is protected by A
lock(b);
manipulate(B);
manipulate_some_more(A); // A still locked for good reasons
unlock(B);
unlock(A);
// time passes
lock(b);
mung(B);
lock(a);
mung(A, B); // both locked
unlock(a);
unlock(b);
}

OK, so this may trigger some warnings but there’s just one thread and
all
is
OK.

But

we are locking a and locking b. So, there are other threads of
execution
which can manipulate A and B

there are also good reasons for holding both locks (say we put A
into
a
transitional state while we play with B)

This is still OK, but what happens if the other threads of execution
ALSO
need to hold both locks - say because they want to do the same
manipulation
as above. There is no safe order for them to acquire their locks (not
unless they are incredibly deeply involved in the logic of the thread
above).

OK, so what have I got wrong?

Questions? First check the Kernel Driver FAQ at http://www.
osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: xxxxx@3dlabs.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

ForwardSourceID:NT00006606

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: xxxxx@osr.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

Rod_Widdowson-1 · October 29, 2004, 11:42am

> Of course what is great fun is when you combine this with a

reentrant OS!

and then clusters.

Jake_Oshins · October 29, 2004, 12:50pm

I’d suggest starting your reading here:

http://www.microsoft.com/whdc/driver/kernel/locks.mspx

–
Jake Oshins
Windows Kernel Group

This posting is provided “AS IS” with no warranties, and confers no rights.

“Sinha, Prokash” wrote in message
news:xxxxx@ntdev…
I’m having really really hard time to justify "What would be the case
for which I need two spin locks "

Aquire(A)
Do work_1;

Aquire(B)

Do work_2;

Release(B);

Do Work_3;

Release (A)

If it is justified then only we could talk about even getting more spin
locks and potential danger of reverse ordering, well of course I assume
this thread would/could be invoked multiple times in overlapped fashion.
Key point is ( IRQL gets to DISPATCH for this locking ) and then what
happens …

Another question is “who was the first person coined entropy” into
computer related stuff ?. Hint::The person was testing how random is
random. More Hint? No way.

-pro