Please disprove this Double-Checked Locking "fix"

 
 
jl_post@hotmail.com
04-26-2011
Hi,

Recently I've been reading up on "Double-Checked Locking" in C++
and how it's often implemented imperfectly. The Article "C++ and the
Perils of Double-Checked Locking" by Scott Meyers and Andrei
Alexandrescu ( http://www.aristeia.com/Papers/DDJ_J...04_revised.pdf
) provides a good overview of how it's usually done and why it's often
inadequate.

I won't go into details here, but the article states how this code:

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            pInstance = new Singleton;
        }
    }
    return pInstance;
}

and its variants aren't completely thread safe.

Now, I've been thinking about how to make the code thread-safe, and
a few days ago I came up with a couple of solutions. As I implemented
them, I realized that these variants of mine also had problems.

One of the "solutions" I came up with was this:

// Assign the singleton object to a temp pointer:
Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            Singleton* temp = new Singleton; // initialize to temp
            pInstance = temp;                // assign temp to pInstance
        }
    }
    return pInstance;
}

Now, this approach isn't new (and is in fact covered in Meyers' and
Alexandrescu's article). The immediate problem is that the compiler
can optimize away the temporary variable, so there's no guarantee that
pInstance won't be assigned the memory before the object is properly
initialized.

So I thought about making sure pInstance was unset in the
constructor, like this:

Singleton::Singleton() {
    // initializing member data
    assert( pInstance == 0 );
}

This way, the assert() statement ensures that pInstance will stay NULL
until the constructor is finished.

BUT! I realized that this solution was flawed, because the
compiler can inline the constructor into the ::instance() method, like
this:

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            Singleton* temp = static_cast<Singleton*>(
                operator new(sizeof(Singleton)));   // initialize to temp
            new (temp) Singleton;   // constructor inlined:
                // initializing member data
                assert( pInstance == 0 );
            pInstance = temp;   // assign temp to pInstance
        }
    }
    return pInstance;
}

and further rearrange the lines like this:


Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            Singleton* temp = static_cast<Singleton*>(
                operator new(sizeof(Singleton)));   // initialize to temp
            new (temp) Singleton;   // constructor inlined:
                assert( pInstance == 0 );
            pInstance = temp;   // assign temp to pInstance
                // initializing member data
        }
    }
    return pInstance;
}

(The compiler is allowed to do this, I think, because the compiler can
reorder code as long as it's not detectable in a single-threaded
program.)

So I realized that pInstance could still be assigned the memory
before the object is properly initialized.

Then I thought: What if I included a second lock to make sure that
the constructor is finished before pInstance is set? So I tried this:

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            Singleton* temp = new Singleton; // initialize to temp
            secondLock.lock();
            pInstance = temp;                // assign temp to pInstance
            secondLock.unlock();
        }
    }
    return pInstance;
}

Singleton::Singleton() {
    secondLock.lock();
    // initializing member data
    secondLock.unlock();
}

But eventually I realized that the order in which the two critical
sections run is not guaranteed, so pInstance could still be assigned
before the constructor is even entered!

So I shot down my own two solutions. However, I kept thinking of
variants, and I started to wonder: What if I combined the two
variants? In other words, I'd take advantage of both assert() and a
second lock, like this:


Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            Singleton* temp = new Singleton; // initialize to temp
            secondLock.lock();
            pInstance = temp;                // assign temp to pInstance
            secondLock.unlock();
        }
    }
    return pInstance;
}

Singleton::Singleton() {
    secondLock.lock();
    // initializing member data
    assert( pInstance == 0 );
    secondLock.unlock();
}

In this way, the second lock guarantees that pInstance is not assigned
while the constructor body is executing, and the assert() statement
ensures that the constructor code executes before the code that
assigns to pInstance.

This looks good as far as I can tell, but I'm thinking it's too
good to be true.

Since I'm a bit skeptical about this last solution, could someone
poke holes in it? (I'm eager to see if I really did find a solid
solution, or if it's just another pipe dream.)

Thanks!
 
Joshua Maurice
04-26-2011
On Apr 26, 9:58 am, (E-Mail Removed) wrote:
> Hi,
>
> Recently I've been reading up on "Double-Checked Locking" in C++
> and how it's often implemented imperfectly. The Article "C++ and the
> Perils of Double-Checked Locking" by Scott Meyers and Andrei
> Alexandrescu ( http://www.aristeia.com/Papers/DDJ_J...04_revised.pdf
> ) provides a good overview of how it's usually done and why it's often
> inadequate.

[Snipped discussion of double checked locking]

> Since I'm a bit skeptical about this last solution, could someone
> poke holes in it? (I'm eager to see if I really did find a solid
> solution, or if it's just another pipe dream.)


I'm sorry for being so gruff in the following (not really), but it's
evident that you have not even fully read the paper which you cited,
the only paper, and that is highly irritating.

Look, follow this, and understand just how royally screwed your
approach is. Given the initial conditions:

    int x = 0;
    int y = 0;

thread 1 executing:

    x = 1;
    y = 2;

and threads 2-5 each executing the following, concurrently with
thread 1 and with each other:

    cout << x << ' ' << y << endl;

On non-obscure hardware and compilers, that can print, /on a single
execution/, all 4 combinations:

    0 0
    0 2
    1 0
    1 2
Repeat, /on a single execution/. Different threads can see some writes
done by different threads in different orders! Thread 2 could see the
write to x without seeing the write to y, and thread 3 can see the
write to y without seeing the write to x.
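
(If you want to see this for yourself, here is a rough sketch of that
litmus test using C++11 <atomic> and <thread> -- facilities C++03
doesn't have, which is part of the point. Relaxed atomics keep the
sketch itself free of undefined behaviour while still permitting the
reorderings described above.)

#include <atomic>
#include <iostream>
#include <thread>

std::atomic<int> x(0), y(0);

void writer() {
    x.store(1, std::memory_order_relaxed);
    y.store(2, std::memory_order_relaxed);
}

void reader() {
    int xv = x.load(std::memory_order_relaxed);
    int yv = y.load(std::memory_order_relaxed);
    std::cout << xv << ' ' << yv << '\n';   // any of "0 0", "0 2", "1 0", "1 2"
}

int main() {
    std::thread t1(writer), t2(reader), t3(reader), t4(reader), t5(reader);
    t1.join(); t2.join(); t3.join(); t4.join(); t5.join();
}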

Technically, the standard guarantees even worse - undefined behavior,
but the above is an example that /really happens/ on /real/ hardware
and /real/ compilers. This is largely due to hardware "reorderings",
but let me quote the paper for the relevant bit:

Quote:
Nothing you do can alter the fundamental problem: you need to be able
to specify a constraint on instruction ordering, and your language
gives you no way to do it.
To emphasize, it might be the compiler reordering it, it might be the
hardware reordering it, and it could even be some new kind of thing
which hasn't been invented yet! In practice, the guarantees that
compiler writers and hardware makers give for C++03 mean that your
code is fundamentally broken. There is no standard or vendor guarantee
of any kind that code written without proper synchronization will
work. None. Give your code to any compiler writer and they will say
"won't work": it might incidentally work today, but tomorrow they
might put in a new optimization (software or hardware), and your code
breaks. **This is the fundamental problem which you cannot dodge!!**

Here's how you can break your code. Note again the general problem:
there are no guarantees that could even make it work, so focus on the
basic problem above and don't spend too much time on this; I present
it for completeness. For example:

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            Singleton* temp = new Singleton; // initialize to temp
            secondLock.lock();
            pInstance = temp;                // assign temp to pInstance
            secondLock.unlock();
        }
    }
    return pInstance;
}

A sufficiently smart compiler is allowed to move things from before a
lock to after the lock, and from after an unlock to before a lock. It
can transform the above to:

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            secondLock.lock();
            Singleton* temp = new Singleton; // initialize to temp
            pInstance = temp;                // assign temp to pInstance
            secondLock.unlock();
        }
    }
    return pInstance;
}

And once we get that, it's trivial to change it to:

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            secondLock.lock();
            pInstance = new Singleton;
            secondLock.unlock();
        }
    }
    return pInstance;
}

Which means we're back to screwed for the reasons known to you. I
didn't even need to resort to the wonderful DEC Alpha and its split
cache, but I could have.

You need to reread the paper which you cited. Here's the link again
for your benefit.
http://www.aristeia.com/Papers/DDJ_J...04_revised.pdf
And pay attention this time.
 
Joshua Maurice
04-26-2011
On Apr 26, 11:16 am, Joshua Maurice <(E-Mail Removed)> wrote:
> A sufficiently smart compiler is allowed to move things from before a
> lock to after the lock, and from after an unlock to before a lock.


Ack, I meant:

> A sufficiently smart compiler is allowed to move things from before a
> lock to after the lock, and from after an unlock to before
> **an unlock**.


Hopefully it's clear from context. My mistake. My apologies.
 
Joshua Maurice
04-26-2011
On Apr 26, 11:30 am, Pete Becker <(E-Mail Removed)> wrote:
> On 2011-04-26 14:16:33 -0400, Joshua Maurice said:
>
> > This is largely due to hardware "reorderings",
> > but let me quote the paper for the relevant bit:
> >
> > Quote:
> > > Nothing you do can alter the fundamental problem: you need to be able
> > > to specify a constraint on instruction ordering, and your language
> > > gives you no way to do it.
> >
> > To emphasize, it might be the compiler reordering it, it might be the
> > hardware reordering it,
>
> Let me underscore the problem, as the quotation above misses part of
> the issue. Even when the instructions are executed in exactly the order
> that you want them to be executed, different threads can see results in
> a different order from the order in which the thread doing the stores
> actually did them.
>
> When data is shared between threads and at least one thread is writing
> that data, all accesses to that data must be synchronized.
>
> C++0x says that the behavior of a program that does not synchronize
> such accesses is undefined.

I fully agree. I also stated that /very explicitly/ just before the
snipped quote with my 1+4 thread example. I did intend to clearly
state exactly that. At least I think I did. For future reference, was
this unclear? Or did you merely mean to emphasize this point? That is
a good point to emphasize.

I'm just a little confused.
 
James Kanze
04-30-2011
On Apr 26, 5:58 pm, (E-Mail Removed) wrote:

> Recently I've been reading up on "Double-Checked Locking" in
> C++ and how it's often implemented imperfectly.


You mean, how it is impossible to implement in standard C++
(03).

> The Article "C++ and the Perils of Double-Checked Locking" by
> Scott Meyers and Andrei Alexandrescu
> (http://www.aristeia.com/Papers/DDJ_J...04_revised.pdf)
> provides a good overview of how it's usually done and why it's
> often inadequate.


The standard solution for implementing a singleton works just
fine:

Singleton* Singleton::instance()
{
    ScopedLock lock(mutex);
    if ( pInstance == NULL )
        pInstance = new Singleton;
    return pInstance;
}

If the lock is a performance bottleneck (a case I've yet to
encounter), then you can simply ensure that the singleton is
constructed before threads are started, and skip the lock.
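
Something like this, for instance (rough sketch; startThreads() is
just a stand-in for whatever actually launches your threads):

Singleton* Singleton::instance()
{
    if ( pInstance == NULL )
        pInstance = new Singleton;   // only ever runs single-threaded
    return pInstance;
}

int main()
{
    Singleton::instance();   // force construction before threading
    startThreads();          // hypothetical: spawn the worker threads
    // ...
}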

Anything else is simply added complexity, for no reason.

--
James Kanze
 
Pavel
05-01-2011
Leigh Johnston wrote:
> On 30/04/2011 23:54, James Kanze wrote:
>> On Apr 26, 5:58 pm, (E-Mail Removed) wrote:
>>
>>> Recently I've been reading up on "Double-Checked Locking" in
>>> C++ and how it's often implemented imperfectly.
>>
>> You mean, how it is impossible to implement in standard C++
>> (03).
>>
>>> The Article "C++ and the Perils of Double-Checked Locking" by
>>> Scott Meyers and Andrei Alexandrescu
>>> (http://www.aristeia.com/Papers/DDJ_J...04_revised.pdf)
>>> provides a good overview of how it's usually done and why it's
>>> often inadequate.
>>
>> The standard solution for implementing a singleton works just
>> fine:
>>
>> Singleton* Singleton::instance()
>> {
>>     ScopedLock lock(mutex);
>>     if ( pInstance == NULL )
>>         pInstance = new Singleton;
>>     return pInstance;
>> }
>
> How does this prevent CPU reordering of stores to the pInstance pointer
> and the object pInstance points to?

Any system-dependent mutex has to behave like a double-sided memory barrier
(that follows from the mutual exclusion requirement).

This does not prevent unnecessary overhead, though: grabbing a mutex in a
highly contended case can be expensive, and it is unnecessary when all the
contenders want to do is read a pointer rather than change it -- which is the
case after the Singleton has been initialized.


>
> /Leigh


-Pavel
 
James Kanze
05-01-2011
On May 1, 9:49 pm, Leigh Johnston <(E-Mail Removed)> wrote:
> On 30/04/2011 23:54, James Kanze wrote:
> > On Apr 26, 5:58 pm, (E-Mail Removed) wrote:
> >
> >> Recently I've been reading up on "Double-Checked Locking" in
> >> C++ and how it's often implemented imperfectly.
> >
> > You mean, how it is impossible to implement in standard C++
> > (03).
> >
> >> The Article "C++ and the Perils of Double-Checked Locking" by
> >> Scott Meyers and Andrei Alexandrescu
> >> (http://www.aristeia.com/Papers/DDJ_J...04_revised.pdf)
> >> provides a good overview of how it's usually done and why it's
> >> often inadequate.
> >
> > The standard solution for implementing a singleton works just
> > fine:
> >
> > Singleton* Singleton::instance()
> > {
> >     ScopedLock lock(mutex);
> >     if ( pInstance == NULL )
> >         pInstance = new Singleton;
> >     return pInstance;
> > }
>
> How does this prevent CPU reordering of stores to the
> pInstance pointer and the object pInstance points to?


Compiler magic.

Presumably, somewhere in the constructor of ScopedLock, you end
up calling pthread_mutex_lock, or something similar. And if the
compiler is Posix compliant, it knows that it cannot move code
across a call to pthread_mutex_lock. (And of course, the code
in pthread_mutex_lock contains the necessary machine instructions
to ensure that the hardware respects the order. Posix requires
it.) (Replace Posix with Windows, and pthread_mutex_lock with the
corresponding Windows function, if that's your platform.)
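
(For the record, ScopedLock above is just the usual RAII wrapper; a
minimal sketch along Posix lines, with error handling omitted, might
look like this:)

#include <pthread.h>

class ScopedLock
{
public:
    explicit ScopedLock(pthread_mutex_t& m) : myMutex(m)
    {
        pthread_mutex_lock(&myMutex);     // provides the ordering guarantee
    }
    ~ScopedLock()
    {
        pthread_mutex_unlock(&myMutex);
    }
private:
    ScopedLock(const ScopedLock&);              // not copyable (C++03 idiom)
    ScopedLock& operator=(const ScopedLock&);
    pthread_mutex_t& myMutex;
};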

Of course, if the compiler doesn't support multithreading (or
you didn't specify the options it requires to support
multithreading), then there is no solution; you can't use
multithreading, period.

--
James Kanze
 
James Kanze
05-01-2011
On May 1, 10:26 pm, Pavel
<(E-Mail Removed)> wrote:

[...]
> This does not prevent unnecessary overhead, though: grabbing a mutex
> in a highly contended case can be expensive, and it is unnecessary
> when all the contenders want to do is read a pointer rather than
> change it -- which is the case after the Singleton has been
> initialized.


Contention becomes relevant when two conditions are met: many
threads want to acquire the mutex, and having acquired the
mutex, a thread holds it for a certain time. In the case of the
singleton, once the object has been created, the thread holds
the mutex the time it takes for a comparison with null. That
is, a very, very short time. I have yet to see a case where
there was actual contention for the mutex in a singleton once
the singleton has been created. And if you do end up in such a
rare case, as I pointed out, calling Singleton::instance() once
before threading starts is sufficient to ensure that the mutex
isn't needed at all. Double check locking is a solution in
search of a problem, and to top it off, a solution which doesn't
work (at least in standard C++).

--
James Kanze
 
Pavel
05-01-2011
James Kanze wrote:
> On May 1, 10:26 pm, Pavel
> <(E-Mail Removed)> wrote:
>
> [...]
>> This does not prevent unnecessary overhead, though: grabbing a mutex
>> in a highly contended case can be expensive, and it is unnecessary
>> when all the contenders want to do is read a pointer rather than
>> change it -- which is the case after the Singleton has been
>> initialized.

>
> Contention becomes relevant when two conditions are met: many
> threads want to acquire the mutex, and having acquired the
> mutex, a thread holds it for a certain time. In the case of the
> singleton, once the object has been created, the thread holds
> the mutex the time it takes for a comparison with null. That
> is, a very, very short time. I have yet to see a case where
> there was actual contention for the mutex in a singleton once
> the singleton has been created.

I agree that most of the time it does not happen. However, I have met
situations where several threads start calling a short method on a single
instance (like sending/reading a simple but frequent message to/from a
singleton queue or similar) and, after they clash once, the situation called a
"lock convoy" emerges, in which the threads spend most of their time switching
from runnable to blocked state.

In some old mutex implementations this condition was made worse by bursts of
CPU cache traffic occurring when the holding thread released the mutex -- all
blocked threads were woken to see which one grabs it first. Some current
implementations have this aggravating issue solved (e.g. the Linux 2.6 kernel
solves it via its futex-based implementation).

Regardless, as soon as a lock convoy forms, it is somewhat self-sustaining
until all contending threads but one happen to receive a relatively long
message to process -- all at the same time -- and you might have to wait a
while for that.

> And if you do end up in such a
> rare case, as I pointed out, calling Singleton::instance() once
> before threading starts is sufficient to ensure that the mutex
> isn't needed at all. Double check locking is a solution in
> search of a problem, and to top it off, a solution which doesn't
> work (at least in standard C++).

I have never defended DCL. In fact, I have never defended the Singleton
pattern. IMHO it is the most overused, the least understood, and the most
vaguely defined of all the GoF patterns; and few people seem to have read the
end of the "Consequences" sub-section of the "Singleton" section of their book
(applying Consequences #4 and #5 there may render the whole discussion of DCL
for Singleton irrelevant).

I can't recall when I last used it -- most of the time you are much better off
without it. When I have to use one written by someone else, I try to acquire
it before starting any threads and cache the pointer, as per your advice or
per the article referred to by the OP.
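
By "cache" I mean something along these lines (a rough sketch; the
worker function and the process() member are made up for illustration):

Singleton* theInstance = 0;   // cached pointer, set before threads start

void workerLoop()
{
    // workers use the cached pointer directly: no instance() call, no lock
    theInstance->process();   // process() is a hypothetical member function
}

int main()
{
    theInstance = Singleton::instance();   // acquire and cache up front
    startThreads();                        // hypothetical: spawn the workers
}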



 
Scott Meyers
05-02-2011
On 4/26/2011 10:50 AM, Pete Becker wrote:
> pinstance points to. The C++0x solution is:
>
> #include <atomic>
> std::atomic<Singleton*> pinstance;
>
> if (pinstance == 0) {
>     Lock lock;
>     if (pinstance == 0)
>         pinstance = new Singleton;
> }
> return pinstance;
>
> Making pinstance an atomic variable ensures that all writes that "happen
> before" the assignment to pinstance in one thread are seen before the
> result of that assignment is seen in any other thread.


True, but that's not enough to make this code work. You also need to
know that assignment to pinstance can't take place until the Singleton
constructor has run to completion. You have that guarantee in C++0x,
too, so the code is correct, but it's not just the use of atomics that
gives you that correctness. You also need the additional guarantees
that C++0x offers regarding the behavior of new expressions (compared to
the weaker guarantees that C++03 gives you).
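
For readers who want the complete picture, here is a fuller sketch of
the C++0x version with the publish/acquire ordering spelled out
explicitly (an illustration, not code from the paper):

#include <atomic>
#include <mutex>

class Singleton {
public:
    static Singleton* instance();
private:
    Singleton() {}                        // construct members here
    static std::atomic<Singleton*> pInstance;
    static std::mutex mtx;
};

std::atomic<Singleton*> Singleton::pInstance(nullptr);
std::mutex Singleton::mtx;

Singleton* Singleton::instance() {
    // Acquire load: if we observe a non-null pointer we also observe
    // every write the constructing thread made before its release store.
    Singleton* p = pInstance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(mtx);
        p = pInstance.load(std::memory_order_relaxed);      // re-check under the lock
        if (p == nullptr) {
            p = new Singleton;                               // fully construct first...
            pInstance.store(p, std::memory_order_release);   // ...then publish
        }
    }
    return p;
}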

Scott

--
* C++ and Beyond: Meyers, Sutter, & Alexandrescu, Aug 7-10 in Banff
(http://cppandbeyond.com/)
 