What in C++11 prohibits mutex operations from being reordered?

 
 
Michael Podolsky
 
      04-10-2013
On Apr 5, 10:51 am, Tiib <(E-Mail Removed)> wrote:
> On Friday, 5 April 2013 15:27:12 UTC+3, Michael Podolsky wrote:
> > On Apr 5, 4:26 am, Tiib <(E-Mail Removed)> wrote:
> > > On Friday, 5 April 2013 05:41:56 UTC+3, Michael Podolsky wrote:
> > > > On Apr 4, 12:04 pm, James Kanze <(E-Mail Removed)> wrote:

>
> > > > > On the other hand, I've never heard of a compiler capable of
> > > > > proving much of anything involving mutexes, and especially not
> > > > > that changing their order doesn't affect observable behavior
> > > > > (especially as it usually does).

>
> > > > Why is affecting observable behavior problematic?

>
> > > Because observable behavior is sacred. It can't be changed unless
> > > there is undefined behavior. All programming languages are for
> > > describing observable behavior and so are one step above
> > > assembly languages (which describe machine instructions). C++ is
> > > no exception there.

>
> > There is no THE "observable behavior". There is a "SET of allowable
> > behaviors" (I am citing 1.9p3).

>
> That is why I snipped it as irrelevant. In cases of unspecified
> behavior (something may or may not happen) the standard tends to state
> the limits of allowable behaviors (not so in the case of undefined behavior).
>
> > There is no problem with changing the
> > observable behavior of your program if you remain inside this set.

>
> Is that a citation of 1.9/3? My copy says: "Certain other aspects
> and operations of the abstract machine are described in this
> International Standard as unspecified (for example, order of evaluation
> of arguments to a function). Where possible, this International Standard
> defines a set of allowable behaviors. These define the
> nondeterministic aspects of the abstract machine. An instance of the
> abstract machine can thus have more than one possible execution for a
> given program and a given input."


I think it's quite the same. My idea was that changing observable
behavior does not by itself prove any violation of the C++
standard, as the abstract machine may have a set of allowed behaviors,
specifically because of unspecified aspects of the standard.


>
> > And btw this has nothing to do with "undefined behavior". This whole
> > set of allowable behaviors is not undefined, i.e. it is well-defined by
> > the standard.

>
> What set of allowable behaviors for what unspecified behavior are you
> talking about? Which paragraph?


I generally meant that lock and unlock have acquire and release
semantics respectively, and that the constraints on acquire and release
operations still leave room for reordering a release operation with a
following acquire operation. I think I could prove, by reference to
particular paragraphs, that a release operation may be reordered with a
following acquire without violating the standard; to me it seems
generally obvious (though it may still not be obvious, or not true at
all, for lock-unlock).
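
A minimal sketch of the release/acquire reordering being discussed, with
two unrelated atomics x and y standing in for the mutexes (the names and
thread setup here are assumed for illustration, not from the original post):

#include <atomic>
#include <thread>

std::atomic<int> x{0}, y{0};
int r1 = 0, r2 = 0;

void t1() {
    x.store(1, std::memory_order_release);   // "unlock-like" release
    r1 = y.load(std::memory_order_acquire);  // "lock-like" acquire
}

void t2() {
    y.store(1, std::memory_order_release);
    r2 = x.load(std::memory_order_acquire);
}

int main() {
    std::thread a(t1), b(t2);
    a.join();
    b.join();
    // r1 == 0 && r2 == 0 is an allowed outcome with release/acquire,
    // because neither acquire load is obliged to see the other thread's
    // release store; under sequential consistency it would be impossible.
    // This is the sense in which a release followed by an acquire on a
    // different object may appear reordered.
}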

But... I have now come to the conclusion (with the help of this
forum) that I was not right from the start and had some
misconceptions about the C++ execution model. So what I am saying in
this post is just in the context of the thread, while I now generally
agree that an unlock and a following lock can't be reordered (not
completely, at least) under the C++11 rules.

Regards, Michael
 
 
 
 
 
Michael Podolsky
 
      04-10-2013
On Apr 5, 4:44 am, Alexander Terekhov <(E-Mail Removed)> wrote:
> Michael Podolsky wrote:
>
> [...]
>
> > That is what you see in my last example. Ordered execution does not
> > deadlock. Reordered execution deadlocks. Both executions are allowed
> > for observable behavior of abstract machine.

>
> > Now if you agree with this, then probably your argument is not enough
> > to explain why mutex operations cannot be reordered.

>
> With mutexes in the absence of trylock() the program is guaranteed to
> behave SC (and obviously without injected deadlocks).
>
> With trylock() in the game, reordering m1.unlock() -> m2.lock()
> (overlapping of subsequent-as-coded critical regions) can be observed as
> far as transition of mutex states is concerned but deadlocks still won't
> be injected.
>
> What is so hard to understand here?


That a program with mutexes behaves SC is not an axiom or rule of C++11
- it is a theorem which has to be proven from the C++ memory-model rules:
the acquire-release semantics of the respective lock-unlock, the total
order of operations on a single mutex, etc. My feeling (and claim) was
that a program protected by "acquire-release" mutexes is not SC because
an x.unlock() may be reordered with a y.lock().

I finally think that I was not right here. But at least I think that
making this claim by itself is not completely insane and has some
basis (though it arose in the context of a misconception I had about
the C++ execution model).

Thanks, Michael
 
 
 
 
 
Michael Podolsky
 
      04-10-2013
On Apr 5, 6:58 am, Bart van Ingen Schenau
<(E-Mail Removed)> wrote:
> On Thu, 04 Apr 2013 19:34:11 -0700, Michael Podolsky wrote:
> > On Apr 4, 1:53 pm, Bart van Ingen Schenau
> >> That would be 1.9/1.

>
> > It would be if you prove that, according to the standard, the C++11 abstract
> > machine cannot deadlock, say, on two threads executing

>
> > thread 1:
> > m1.lock();
> > m1.unlock();
> > m2.lock();
> > m2.unlock();

>
> > thread 2:
> > m2.lock();
> > m2.unlock();
> > m1.lock();
> > m1.unlock();

>
> > Yes, I know, if this code is executed "sequentially" it cannot deadlock,
> > but I mean the C++11 abstract machine

>
> The C++ abstract machine executes the code faithfully as written without
> any form of optimization being applied. This means that the abstract
> machine can't ever deadlock on the code above because at no time does a
> single thread hold both locks at the same time.


This is a very important point in the context of what drove my
understanding of this topic,
and it looks like you are right here and I was wrong.
Thank you very much for pointing out my mistake.



> >> The lock and unlock MAY be reordered, but only if the compiler can
> >> prove that the execution of the reordered code yields the same
> >> observable behavior as the non-reordered code would yield in the
> >> (concurrent) abstract machine.

>
> > I am not completely sure you are right here, saying that non-reordered
> > code execution is what defines the execution of the abstract machine. Here
> > is an example of a program which cannot deadlock if executed in an "ordered"
> > way, but might deadlock if executed with reordering. And I suppose you
> > should agree that BOTH executions are what the standard allows, i.e.
> > they should both conform to the abstract machine.

>
> > thread 1:
> > // atomic1 and atomic2 are initially 0 (zero)
> > atomic1.store(1, memory_order_relaxed);
> > atomic2.store(2, memory_order_relaxed);

>
> > thread2:
> > int a2 = atomic2.load(memory_order_seq_cst);
> > int a1 = atomic1.load(memory_order_seq_cst);
> > if(a2==2 && a1==0)
> >    GO_AND_DEADLOCK_OUR_PROGRAM(); // never returns,
> >                                   // blocks all threads ))

>
> > thread2 will execute GO_AND_DEADLOCK_OUR_PROGRAM() only if the assignments
> > to atomic1 and atomic2 in thread1 were reordered by the compiler (hardware
> > reordering aside). And such an execution is allowed by the standard!

>
> You are wrong here. It does NOT take a reordering of the two stores in
> thread 1 to have thread 2 deadlock. It is sufficient that the results
> from those writes become visible to thread 2 out of order. This is a
> possibility, even for the abstract machine.
> Clause 1.10/5 notes that relaxed atomic operations do not provide
> synchronization and clause 1.10/6 notes that modifications on different
> atomic objects can become visible to other threads in inconsistent
> orders. Together, this allows the deadlock to occur in the abstract
> machine even without reordering the stores.


I did not mean that a different order of visibility may happen ONLY as a
result of reordering.

But as for

> > I am not completely sure you are right here, saying that non-reordered
> > code execution is what defines the execution of the abstract machine.


I am now nearly sure that you are right about this point: the
abstract machine executes its code
without reordering.
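
For reference, a compilable sketch of the atomic example from this
sub-thread (the GO_AND_DEADLOCK_OUR_PROGRAM() call is replaced by a
comment, and the thread setup is assumed, not from the original post):

#include <atomic>
#include <thread>

std::atomic<int> atomic1{0}, atomic2{0};

void thread1() {
    atomic1.store(1, std::memory_order_relaxed);
    atomic2.store(2, std::memory_order_relaxed);
}

void thread2() {
    int a2 = atomic2.load(std::memory_order_seq_cst);
    int a1 = atomic1.load(std::memory_order_seq_cst);
    if (a2 == 2 && a1 == 0) {
        // Reachable even in the abstract machine: the relaxed stores in
        // thread1 may become visible to thread2 in either order, so no
        // compiler reordering is needed to get here (cf. 1.10/5, 1.10/6
        // as cited above).
    }
}

int main() {
    std::thread t1(thread1), t2(thread2);
    t1.join();
    t2.join();
}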

>
>
> >> If the reordering introduces a deadlock that is not present in the
> >> original program, then the observable behavior would change (non-
> >> termination versus termination of the program).

>
> > That is what you see in my last example. Ordered execution does not
> > deadlock. Reordered execution deadlocks. Both executions are allowed for
> > observable behavior of abstract machine.

>
> > Now if you agree with this, then probably your argument is not enough to
> > explain why mutex operations cannot be reordered.

>
> I don't agree here, because non-reordered execution can also deadlock.


You are right. Again.

Many thanks,
Michael
 
 
Bart van Ingen Schenau
 
      04-10-2013
On Tue, 09 Apr 2013 04:22:23 -0700, Öö Tiib wrote:

> On Tuesday, 9 April 2013 12:04:28 UTC+3, Bart van Ingen Schenau wrote:
>>
>> That is right, but the unspecified order in which threads do their work
>> means that even with synchronization by mutexes there are multiple
>> possible orders in which the observable side effects can occur.

>
> Yes, that mostly relies on the fact that a non-atomic side effect in one
> thread is not observable in another thread without a data race (undefined
> behavior) unless the side effect "happens before" the observation.
> Synchronization objects like mutexes are meant to achieve that
> "happens before", i.e. synchronization.
>
> I understand that if such "happens before" is achieved then accessing
> atomic objects or acquiring and releasing ownership of other mutexes
> (not the one used for achieving "happens before" above) are also
> observable side effects (there cannot even be data races). That is why
> I asked what unspecified behavior the OP meant.


You make it sound as if, without data races, there must be one single
total order in which observable side-effects happen. This is not true.

For example, if two threads access different (unrelated) volatile objects
without any form of synchronization, you have two unsequenced observable
side effects (access to a volatile object) without there being a data
race or other kind of undefined behavior.
Even within the confines of the abstract machine, these accesses can
occur in either order or even simultaneously and that can be different
from execution to execution.
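
A minimal sketch of that situation (the volatile objects and thread
bodies here are assumed for illustration):

#include <thread>

volatile int v1 = 0;   // two unrelated volatile objects
volatile int v2 = 0;

int main() {
    // Each thread touches only its own volatile object, so there is no
    // data race; yet the two observable side effects (the volatile
    // accesses) are unsequenced relative to each other and may occur in
    // either order, or simultaneously, varying from run to run.
    std::thread a([] { v1 = 1; });
    std::thread b([] { v2 = 1; });
    a.join();
    b.join();
}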

Bart v Ingen Schenau
 
 
Tiib
 
      04-10-2013
On Wednesday, 10 April 2013 17:27:18 UTC+3, Bart van Ingen Schenau wrote:
> On Tue, 09 Apr 2013 04:22:23 -0700, Tiib wrote:
> > On Tuesday, 9 April 2013 12:04:28 UTC+3, Bart van Ingen Schenau wrote:
> >>
> >> That is right, but the unspecified order in which threads do their work
> >> means that even with synchronization by mutexes there are multiple
> >> possible orders in which the observable side effects can occur.

> >
> > Yes, that mostly relies on the fact that a non-atomic side effect in one
> > thread is not observable in another thread without a data race (undefined
> > behavior) unless the side effect "happens before" the observation.
> > Synchronization objects like mutexes are meant to achieve that
> > "happens before", i.e. synchronization.
> >
> > I understand that if such "happens before" is achieved then accessing
> > atomic objects or acquiring and releasing ownership of other mutexes
> > (not the one used for achieving "happens before" above) are also
> > observable side effects (there cannot even be data races). That is why
> > I asked what unspecified behavior the OP meant.

>
> You make it sound as if, without data races, there must be one single
> total order in which observable side-effects happen. This is not true.


I meant that possible data races are a major factor limiting the
conditions under which side effects can be observed from another thread
even if there is synchronization by mutexes. That leaves a lot of the
optimizations that were available without threads still available
with threads; the synchronization with mutexes only adds barriers.
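
A small sketch of the "happens before" that mutex synchronization adds,
with an assumed non-atomic variable protected by a mutex (names and
setup are illustrative, not from the original posts):

#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
int shared_value = 0;   // non-atomic, protected by m

int main() {
    std::thread writer([] {
        std::lock_guard<std::mutex> lk(m);
        shared_value = 42;   // side effect inside the critical section
    });
    std::thread reader([] {
        std::lock_guard<std::mutex> lk(m);
        // If this critical section runs second, the writer's unlock
        // "happens before" this lock, so reading the non-atomic value
        // here is race-free; it prints either 0 or 42 depending on the
        // unspecified order in which the threads take the mutex.
        std::cout << shared_value << '\n';
    });
    writer.join();
    reader.join();
}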

> For example, if two threads access different (unrelated) volatile objects
> without any form of synchronization, you have two unsequenced observable
> side effects (access to a volatile object) without there being a data
> race or other kind of undefined behavior.
>
> Even within the confines of the abstract machine, these accesses can
> occur in either order or even simultaneously and that can be different
> from execution to execution.


Certainly, but doesn't this go a bit outside of your "even with
synchronization by mutexes" above? Mutexes and volatile together do
limit the freedom of the compiler quite severely.

 
 
Bart van Ingen Schenau
 
      04-11-2013
On Wed, 10 Apr 2013 11:31:52 -0700, Öö Tiib wrote:

> I meant that possible data races are a major factor limiting the
> conditions under which side effects can be observed from another thread
> even if there is synchronization by mutexes. That leaves a lot of the
> optimizations that were available without threads still available with
> threads; the synchronization with mutexes only adds barriers.


OK.

> I wrote:
>> For example, if two threads access different (unrelated) volatile
>> objects without any form of synchronization, you have two unsequenced
>> observable side effects (access to a volatile object) without there
>> being a data race or other kind of undefined behavior.
>>
>> Even within the confines of the abstract machine, these accesses can
>> occur in either order or even simultaneously and that can be different
>> from execution to execution.

>
> Certainly, but doesn't this go a bit outside of your "even with
> synchronization by mutexes" above? Mutexes and volatile together do
> limit the freedom of the compiler quite severely.


Here is another example of what I mean, now with the use of a mutex.
Consider there are two threads and one mutex m.
If the threads execute the following code:

//thread 1:
m.lock();
std::cout << "hello ";
m.unlock();

//thread 2:
m.lock();
std::cout << "world!";
m.unlock();

then the abstract machine allows the standard output stream to contain
either "hello world!" or "world!hello ", but it is up to the exact order
in which the threads reach the critical section that determines which
output you will get.
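
A self-contained version of that snippet, for anyone who wants to try it
(the thread setup is assumed; the lock/unlock calls are as above):

#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;

int main() {
    std::thread t1([] { m.lock(); std::cout << "hello "; m.unlock(); });
    std::thread t2([] { m.lock(); std::cout << "world!"; m.unlock(); });
    t1.join();
    t2.join();
    // Prints either "hello world!" or "world!hello " depending on which
    // thread reaches its critical section first; both are allowed
    // executions of the abstract machine.
}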

But perhaps we are talking at cross purposes. All I am trying to show is
that mutexes don't magically force a single global order on the
observable side-effects of a program.

Bart v Ingen Schenau
 
 
Alexander Terekhov
 
      04-11-2013

Michael Podolsky wrote:
[...]
> Could they not do two flavors of trylock - one for CS and one (sane)
> just for the case of people who want a real trylock and not something
> nondeterministic?


I suspect they didn't want to teach the world that with sane
trylock/timedlock the following

T1 (owns m1 before T3 and T4 are launched): m1.unlock();
T2 (owns m2 before T3 and T4 are launched): m2.unlock();
T3: a = m1.trylock(), b = m2.trylock(); // or with timeouts
T4: c = m2.trylock(), d = m1.trylock(); // or with timeouts

may result in a = true, b = false, c = true, d = false, thus exposing
non-SC behaviour even if the program is apparently "fully locked". (And
it cannot be made SC under the C++ model even with an SC fence added to T3 and
T4.)

So I gather that the solution (in the spirit of Dr. Evil, so to speak)
was to relax the specification of trylock/timedlock beyond sanity and
allow the program to yield a non-SC result even under the SC model due to
the C++-invented spuriousness of trylock/timedlock failures (in direct
contradiction to POSIX, which makes it even more grotesque).

regards,
alexander.
 
 
Alexander Terekhov
 
      04-11-2013
(corrections)

Alexander Terekhov wrote:
>
> Michael Podolsky wrote:
> [...]
> > Could they not do two flavors of trylock - one for CS and one (sane)
> > just for the case of people who want a real trylock and not something
> > nondeterministic?

>
> I suspect they didn't want to teach the world that with sane
> trylock/timedlock the following
>
> T1 (owns m1 before T3 and T4 are launched): m1.unlock();
> T2 (owns m2 before T3 and T4 are launched): m2.unlock();
> T3: a = m1.trylock(), b = m2.trylock(); // or with timeouts
> T4: c = m2.trylock(), d = m1.trylock(); // or with timeouts


Err, bad example (due to modification by a successful trylock). Here's
another one with even fewer threads:

T1: (initially nobody owns m1): m1.lock(); // w/o subsequent unlock
T2: (owns m2 before T3 is launched): if (!m1.trylock()) m2.unlock();
T3: if (m2.trylock()) assert(!m1.trylock());

>
> may result in a = true, b = false, c = true, d = false, thus exposing


may result in abort thus exposing

> non-SC behaviour even if the program is apparently "fully locked". (And
> it cannot be made SC under the C++ model even with an SC fence added to T3 and
> T4.)


T2 and T3

>
> So I gather that the solution (in the spirit of Dr. Evil, so to speak)
> was to relax the specification of trylock/timedlock beyond sanity and
> allow the program to yield a non-SC result even under the SC model due to
> the C++-invented spuriousness of trylock/timedlock failures (in direct
> contradiction to POSIX, which makes it even more grotesque).


No err here.

regards,
alexander.
 
 
Joshua Maurice
 
      05-10-2013
On Apr 11, 6:00 am, Alexander Terekhov <(E-Mail Removed)> wrote:
> C++-invented spuriousness of trylock/timedlock failures (in direct
> contradiction to POSIX, which makes it even more grotesque).


I'd suggest googling this. This is not something that C++ just
invented; it is a straight-up bug in the POSIX standard. Almost no
implementation of POSIX actually obeys the formal specification of
POSIX trylock. The naive, intuitive implementation of lock, unlock, and
trylock does not obey the formal specification, and obeying the POSIX
trylock specification requires an implementation to add overhead to
all regular locking and unlocking (IIRC). All C++11 did was
standardize existing practice.
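
The practical upshot for C++11 code is that a failed try_lock() cannot be
used to infer that some other thread currently owns the mutex. A minimal
sketch of the defensive pattern (the function name and structure are
assumptions, not from the thread):

#include <mutex>

std::mutex m;

bool try_do_work() {
    // std::mutex::try_lock() is allowed to fail spuriously in C++11, i.e.
    // even when no other thread holds the mutex. Treat a false return only
    // as "could not acquire right now"; retry or fall back to m.lock().
    if (!m.try_lock())
        return false;
    // ... critical section ...
    m.unlock();
    return true;
}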
 