On May 13, 3:56 pm, Gerhard Fiedler <geli...@gmail.com> wrote:
> Joshua Maurice wrote:
> > However, a sufficiently smart compiler could notice your clever ruse,
> > optimize away the assert as always true, see a lock and unlock pair
> > guarding nothing, optimize that away, and then move the assignment to
> > temp past the mutex acquire, as demonstrated above.
>
> Regarding the compiler optimizing away a lock/unlock pair guarding
> "nothing": AIUI both lock and unlock need to provide certain fences.
> Therefore, again AIUI, they can't be optimized away by the compiler even
> if there's nothing in between, because that would remove the fences and
> alter the behavior.
>
> Am I missing something here?
That may be how they're commonly implemented, but that's not the
guaranteed semantics. Two different mutexes may as a matter of fact on
a given implementation give "happens-before" effects between the two
different mutexes, but there's nothing guaranteed about it.
I was ambiguously describing the situation, possibly to the extent of
being wrong. Allow me to correct myself.
Remember the code:
Singleton* Singleton::instance() {
if (pInstance == 0) {
Lock lock;
if (pInstance == 0) {
Singleton* temp = new Singleton; // initialize to temp
secondLock.lock();
pInstance = temp; // assign temp to pInstance
secondLock.unlock();
}
}
return pInstance;
}
Singleton::Singleton() {
secondLock.lock();
assert( pInstance == 0 );
secondLock.unlock();
}
The compiler can expand inline as follows (pseudo-code):
Singleton* Singleton::instance() {
if (pInstance == 0) {
Lock lock;
if (pInstance == 0) {
Singleton* temp = (Singleton*) :

perator
new(sizeof(Singleton));
secondLock.lock();
assert( pInstance == 0 );
secondLock.unlock();
secondLock.lock();
pInstance = temp; // assign temp to pInstance
secondLock.unlock();
}
}
return pInstance;
}
With that, it sees:
someMutex.lock();
<blah1>
someMutex.unlock();
<blah2>
someMutex.lock();
<blah3>
someMutex.unlock();
The compiler sees a lock, unlock, lock, unlock, in straightline code,
without branching (or exceptions, or volatile (to keep signal handlers
correct)). The compiler is totally free to replace that with:
someMutex.lock();
<blah1>
<blah2>
<blah3>
someMutex.unlock();
It may be bad from a QoI to do that. It depends. It depends heavily on
the situation. I see it as quite reasonable that the compiler could
employ some heuristics to deduce when it's a savings to combine
critical sections. In this case, <blah1> and <blah2> are actually
empty, no-ops, so it's always a savings to combine the two adjacent
critical sections as such.
So, it can transform it to:
Singleton* Singleton::instance() {
if (pInstance == 0) {
Lock lock;
if (pInstance == 0) {
Singleton* temp = (Singleton*) :

perator
new(sizeof(Singleton));
secondLock.lock();
assert( pInstance == 0 );
pInstance = temp; // assign temp to pInstance
secondLock.unlock();
}
}
return pInstance;
}
With some simple data flow analysis, and allowed motions past locks,
it can transform it to:
Singleton* Singleton::instance() {
if (pInstance == 0) {
Lock lock;
if (pInstance == 0) {
secondLock.lock();
pInstance = (Singleton*) :

perator new(sizeof(Singleton));
secondLock.unlock();
}
}
return pInstance;
}
----
Now, back to my original much more controversial statement - can a
compiler simply remove a lock unlock pair? Ex:
mutex.lock();
mutex.unlock();
Maybe. I mentioned "clever ruse" with whole program optimization in
mind. (However, upon thinking about it, I just showed that you don't
even need whole program optimization.) Without whole program
optimization, I think no. Could someone please more educated weigh
in?
Thus far, after 10 minutes of attempts just now to write a conforming
race-free program where you could tell the difference if a compiler
simply removed an "empty" mutex acquire release pair, the only
programs I can find are ones that would deadlock before the change,
and not deadlock after the change. A deadlock is observable behavior,
so a compiler cannot remove it for that reason.