What in C++11 prohibits mutex operations from being reordered?

 
 
Alexander Terekhov
Guest
Posts: n/a
 
      04-03-2013

Öö Tiib wrote:
>
> On Wednesday, 3 April 2013 20:10:02 UTC+3, Alexander Terekhov wrote:
> > Öö Tiib wrote:
> >
> > [...]
> > > Yeah, sorry for being unclear. I do not get what exact part of
> > > puzzle it is then that you miss? "sequenced before" it was not
> > > maybe "synchronizes with" or "happens before"? Mutex access can
> > > not be reordered.

> >
> > I always thought that e.g.
> >
> > a.unlock() "sequenced before" b.lock()
> >
> > can be reordered in deadlock-free manner as seen by another thread using
> > trylock().

>
> Did you mean try_lock() failing? That function is explicitly and specially
> allowed to fail by standard (see §30.4.1.2/16) even when the lock is not
> held by any other thread.


I did not mean spurious failures of C++ try_lock()... actually I never
understood why one would need that because to me trylock() is basically
timedlock(yesterday) and POSIX clearly states

http://pubs.opengroup.org/onlinepubs...timedlock.html

"Under no circumstance shall the function fail with a timeout if the
mutex can be locked immediately."

whereas C++ draft (N3242 which I have) rather confusingly elaborates
that

"An implementation should ensure that try_lock() does not consistently
return false in the absence of contending mutex acquisitions."

(No idea what "consistently" is supposed to mean here... well, I suspect
that they mean lost reservations due to contention (or a context switch)
without a retry even if the mutex is seen to be unlocked, but I'd rather
have the retry in that case.)
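
For illustration, the "trylock() is timedlock(yesterday)" idea can be spelled
out in C++ terms roughly as below. This is a hypothetical sketch, not how the
standard specifies try_lock() (and in C++11 even try_lock_until() is allowed
to fail spuriously):

#include <mutex>
#include <chrono>

std::timed_mutex m;

// Attempt the lock with a deadline that has already passed, so the call
// never actually waits -- the POSIX-style "timedlock(yesterday)" reading.
bool posix_style_trylock()
{
    return m.try_lock_until(std::chrono::steady_clock::now()
                            - std::chrono::hours(24));
}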

>
> > That's the cornerstone of efficient locking as far as memory model is
> > concerned.

>
> It is just for to have cheap 'try_lock()' implementation. Makes sense.
> It is good to have both cheap and sometimes failing tools in mix with
> expensive and reliable tools.


No, it's about 'release' store for mutex unlock() and 'acquire' RMW for
lock()/trylock() without expensive store-load fencing for locks.
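
A minimal sketch of the kind of lock being described (hypothetical names, not
std::mutex itself; real implementations differ): unlock() is a plain release
store and lock()/try_lock() are acquire RMWs, so no store-load (seq_cst-style)
fence is issued anywhere:

#include <atomic>

struct cheap_lock {
    std::atomic<int> state{0};   // 0 = unlocked, 1 = locked

    void lock()
    {
        // acquire RMW; spin until the previous value was 0 (unlocked)
        while (state.exchange(1, std::memory_order_acquire) != 0) {}
    }

    bool try_lock()
    {
        // acquire RMW; succeeds only if the lock was previously free
        return state.exchange(1, std::memory_order_acquire) == 0;
    }

    void unlock()
    {
        // release store; no full fence
        state.store(0, std::memory_order_release);
    }
};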

regards,
alexander.
 
 
 
 
 
Öö Tiib
Guest
Posts: n/a
 
      04-03-2013
On Wednesday, 3 April 2013 23:55:16 UTC+3, Alexander Terekhov wrote:
> Öö Tiib wrote:
> >
> > On Wednesday, 3 April 2013 20:10:02 UTC+3, Alexander Terekhov wrote:
> > > Öö Tiib wrote:
> > >
> > > [...]
> > > > Yeah, sorry for being unclear. I do not get what exact part of
> > > > puzzle it is then that you miss? "sequenced before" it was not
> > > > maybe "synchronizes with" or "happens before"? Mutex access can
> > > > not be reordered.
> > >
> > > I always thought that e.g.
> > >
> > > a.unlock() "sequenced before" b.lock()
> > >
> > > can be reordered in deadlock-free manner as seen by another thread using
> > > trylock().

> >
> > Did you mean try_lock() failing? That function is explicitly and specially
> > allowed to fail by standard (see §30.4.1.2/16) even when the lock is not
> > held by any other thread.

>
> I did not mean spurious failures of C++ try_lock()... actually I never
> understood why one would need that because to me trylock() is basically
> timedlock(yesterday) and POSIX clearly states


Oh. I did not understand that you meant 'pthread_mutex_trylock()'
by 'trylock()'. m.try_lock() is perhaps *not* meant to be implemented in
terms of m.try_lock_until(yesterday). I do not see that the C++ thread
support library is so tightly bound to the POSIX pthread design.

> [...]
> whereas C++ draft (N3242 which I have) rather confusingly elaborates
> that
>
> "An implementation should ensure that try_lock() does not consistently
> return false in the absence of contending mutex acquisitions."
>
> (No idea what "consistently" is supposed to mean here... well, I suspect
> that they mean lost reservations due to contention (or a context switch)
> without a retry even if the mutex is seen to be unlocked, but I'd rather
> have the retry in that case.)


Probably some naive "C++0x thread lib" implementation had a defect where,
when several threads were busily calling m.try_lock(), all of them kept
failing, and so the committee decided to forbid such "consistency".

> > > That's the cornerstone of efficient locking as far as memory model is
> > > concerned.

> >
> > It is just for to have cheap 'try_lock()' implementation. Makes sense.
> > It is good to have both cheap and sometimes failing tools in mix with
> > expensive and reliable tools.

>
> No, it's about 'release' store for mutex unlock() and 'acquire' RMW for
> lock()/trylock() without expensive store-load fencing for locks.


Ah, whatever the exact optimization may be. It is anyway not true
that 'a.unlock()' "sequenced before" 'b.lock()' may be observably
reordered. a.try_lock() may spuriously fail, so it is a bad observation
tool, and there is no a.trylock() in C++, so how would you observe that
reordering?
 
 
 
 
 
Michael Podolsky
Guest
Posts: n/a
 
      04-04-2013
On Apr 3, 12:53 pm, Alexander Terekhov <(E-Mail Removed)> wrote:
> Michael Podolsky wrote:


> > 2. I then should ask the question: which part of the standard prevents
> > lock() and unlock() operations on DIFFERENT mutexes to be reordered,
> > i.e. what prevents the following sequence:

>
> > m1.lock();
> > m1.unlock();
> > m2.lock();
> > m2.unlock();

>
> > to be compiled into:

>
> > m1.lock();
> > m2.lock();
> > m1.unlock();
> > m2.unlock();

>
> > which obviously must not be allowed because of potential deadlocks.

>
> Injecting deadlocks is a no-no but I hope the intent is that the
> following
>
> T1:
>
> m1.lock();
> ... start T2 ...
> m1.unlock();
> m2.lock();
>
> T2:
>
> if (m2.trylock() == failure) assert(m1.trylock() == success);
>
> may still abort (in effect allowing deadlock-free reordering above).


The "intent" was merely to understand better and in details the memory
model of C++11 standard.

trylock() in C++11 may spuriously fail, we all know about that; yet
pretending it does not, i.e. supposing your testing assertions are
meaningful, I wonder how to compile this code (hardware reordering
optimizations apart) so that it releases m1 late and still does not
deadlock. Do you mean something in the following style:


m1.lock();
// m1 critical section
bool succeeded = m2.try_lock();
m1.unlock();

if(!succeeded)
    m2.lock();
// m2 critical section
m2.unlock();

BTW, do you have by chance any idea about my original question? Say,
about reordering in:

atomic1.store(0, memory_order_release);

while( atomic2.exchange(1, memory_order_acquire) ){;}

I currently have a feeling that it is merely the 'while' operator which
prevents the two lines above from being reordered by the compiler, so
that the following two lines MAY be reordered:

atomic1.store(0, memory_order_release);

if( atomic2.exchange(1, memory_order_acquire) ){printf("Hello\n");}

And I also have a feeling that C++11 might be a little vague about
this moment -- of course, I may be completely wrong here, no intention
to cast any doubt on C++11 standard.

Regards, Michael
 
 
Michael Podolsky
Guest
Posts: n/a
 
      04-04-2013
On Apr 3, 12:44 pm, "christian.bau" <(E-Mail Removed)>
wrote:
> On Apr 3, 3:30 am, Michael Podolsky <(E-Mail Removed)>
> wrote:
>
> > Even a simple assignment has a side effect. §1.9/12 "modifying an
> > object". So if you are right, the following two assignment to integer
> > variables cannot be reordered.

>
> > i1=5; // has side effect, so
> > i2=7; // can't be reordered???

>
> "As if" rule: The compiler can do anything it likes as long as a
> conforming program cannot figure out the difference.


It is the difference between the actual execution and the observable
behavior of the abstract machine (1.9p1), not between the actual execution
and a sequential execution of the code.

> the program can behave differently depending on the order of
> mutexes, so it's not allowed to reorder them.



I am really serious, but what I am going to say may sound offensive or
like nonsense.

What in the C++ standard defines that different mutexes should be taken
in order, i.e. why do we think that

mutex1.unlock();
mutex2.lock();

are defined by the rules of C++ to be executed in this sequence?

I'll now elaborate my argument. Consider Ex1:

thread 1:
// atomic1 and atomic2 are initially 0 (zero)
atomic1.store(1, memory_order_relaxed);
atomic2.store(2, memory_order_relaxed);

thread2:

int a2 = atomic2.load(memory_order_seq_cst);
int a1 = atomic1.load(memory_order_seq_cst);

if(a2==2 && a1==0) printf("Stores were reordered!");

The behavior of this program may be DIFFERENT (it may print or it may
not), and yet we tolerate that as both behaviors are in compliance
with the standard.

Now consider Ex2 (note the change of ordering arguments in the first
thread):

thread 1:
// atomic1 and atomic2 are initially 0 (zero)
atomic1.store(1, memory_order_release);
atomic2.store(2, memory_order_acquire);

thread2:

int a2 = atomic2.load(memory_order_seq_cst);
int a1 = atomic1.load(memory_order_seq_cst);
if(a2==2 && a1==0) printf("Stores were reordered!");

Again, the behaviors may be different, yet we tolerate that.

Now consider Ex3:

mutex1.unlock(); // has release semantics
mutex2.lock(); // has acquire semantics

These operations have release and acquire semantics exactly as in Ex2,
so what exactly prevents them from being reordered, or how exactly would
their reordering violate the spec of the abstract C++11 machine? What
formally matters is not that the execution is different, nor the danger
of deadlock as such; it must be a difference from the abstract C++11
machine's observable behavior.
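
To make concrete why the answer matters, here is a hypothetical arrangement
(illustrative names, not something taken from the standard) in which sinking
an unlock() below the following lock() would introduce a deadlock that the
original program cannot exhibit:

#include <mutex>

std::mutex m1, m2;

void thread_A()
{
    m1.lock();
    // ... work under m1 ...
    m1.unlock();
    m2.lock();
    // ... work under m2 ...
    m2.unlock();
}

void thread_B()
{
    m2.lock();
    // ... work under m2 ...
    m2.unlock();
    m1.lock();
    // ... work under m1 ...
    m1.unlock();
}

// If "m1.unlock(); m2.lock();" in thread_A could be transformed into
// "m2.lock(); m1.unlock();" (and symmetrically in thread_B), each thread
// could end up holding one mutex while blocked on the other -- a deadlock
// the source program, as written, can never exhibit.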

Regards, Michael
 
 
Michael Podolsky
Guest
Posts: n/a
 
      04-04-2013
I should correct a mistake.

Ex2 should rather be something like:

thread 1:
// atomic1 and atomic2 are initially 0 (zero)
atomic1.store(1, memory_order_release);
atomic2.exchange(2, memory_order_acquire); // correction!


thread2:
int a2 = atomic2.load(memory_order_seq_cst);
int a1 = atomic1.load(memory_order_seq_cst);
if(a2==2 && a1==0) printf("Stores were reordered!");

as there cannot be an acquire barrier on a simple store operation


Regards, Michael
 
 
Michael Podolsky
Guest
Posts: n/a
 
      04-04-2013
On Apr 3, 4:55 pm, Alexander Terekhov <(E-Mail Removed)> wrote:
> Öö Tiib wrote:


> > Did you mean try_lock() failing? That function is explicitly and specially
> > allowed to fail by standard (see §30.4.1.2/16) even when the lock is not
> > held by any other thread.


> actually I never
> understood why one would need that because to me trylock() is basically
> timedlock(yesterday) and POSIX clearly states
>
> http://pubs.opengroup.org/onlinepubs...s/pthread_mute...
>
> "Under no circumstance shall the function fail with a timeout if the
> mutex can be locked immediately."


No way. It's C++11, not POSIX. In C++11 both

* mutex::try_lock() and
* timed_mutex::try_lock_for(const chrono::duration<Rep, Period>&
rel_time)

are allowed to fail even if the mutex is immediately available. This is
all the price of introducing ultimate SC semantics into C++, as far as I
could understand. The details on this subject with try_lock are in Hans
Boehm, "Reordering Constraints for Pthread-Style Locks", p. 7.

Regards, Michael


 
 
Michael Podolsky
Guest
Posts: n/a
 
      04-04-2013
On Apr 3, 12:59 pm, Öö Tiib <(E-Mail Removed)> wrote:

> Yeah, sorry for being unclear. I do not get what exact part of
> puzzle it is then that you miss? "sequenced before" it was not
> maybe "synchronizes with" or "happens before"?


Inside one thread it is "sequenced before" and the ordering constraints
of atomics that should mainly govern compilation. I think that "happens
before" and "synchronizes with" are not related here, as they govern
inter-thread synchronization (inside one thread "happens before" is
mostly trivially "sequenced before"). Anyway, I do not see you making
any clear argument here.

> Mutex access can not be reordered.


I hope so. Yet this is not written in the standard.


>
> > > For example §1.10/5 elaborates the difference.

>
> > It is not clear for me still what in this paragraph prevents mutex
> > operations from being reordered in my example.

>
> Yes, §1.10/5 only tells that atomic operations, mutexes and fences
> as synchronization operations and that relaxed atomic operations *are*
> *not* synchronization operations.
>
> Maybe you should first find out how synchronization operations work?
> It is tricky logic sprinkled all over the standard ... I think most
> of the §1.10, §29.3 and §29.8 are relevant. I try to collect *short*
> and *simple* and *logical* explanation ... : If A is sequenced before
> B on the same thread, and B synchronizes with C on another thread,
> and C is sequenced before D on that second thread, then A happens before
> D, and A and D can access the same data even if they are non-atomic
> operations. So ... even ordinary 'int' access like 'i = 42' can not be
> ordered to other side of m2.lock() nothing to talk of m1.unlock() or
> some other synchronization operation like that.


Yes, this is clear. I spent some time studying the standard and some
articles around it before making this post. I just do not see how the
scenario you present is related to my example.

>
> > Besides take into
> > account that in my original example (start of the thread) I have an
> > implementation of spin-locks with atomic variables and ask the same
> > question - what prevents these atomic operations from being reordered
> > by compiler.

>
> I did not understand what your example does, sorry.


while( atomic1.exchange(1, memory_order_acquire) ){;}
// critical section protected by atomic1
atomic1.store(0, memory_order_release);

this was just a very simple (and inefficient) implementation of a
spinlock. The first line grabs the lock, the last releases it. And my
question is then what prevents the following lines from being reordered:

atomic1.store(0, memory_order_release);                // unlocking previous critical section
while( atomic2.exchange(1, memory_order_acquire) ){;}  // locking next critical section



> Let me try to write
> example that achieves some synchronization ...
>
>   // bool that tells that thread 1 is done with data. It is
>   // atomic just because standard does not guarantee that accesses to
>   // bool are atomic.
>   std::atomic<bool> ready(false);
>   // the ordinary data
>   int data=0;
>
>   void thread_1()
>   {
>       // these lines can't be reordered in any way
>       data=42;
>       std::atomic_thread_fence(std::memory_order_release);
>       ready.store(true,std::memory_order_relaxed);
>   }
>
>   void thread_2()
>   {
>       if(ready.load(std::memory_order_relaxed))
>       {
>           // these lines can't be reordered
>           std::atomic_thread_fence(std::memory_order_acquire);
>           std::cout<<"data="<<data<<std::endl;
>       }
>   }
>
> The effect is that thread 2 does not touch data if thread 1
> is not ready with it. So access to non-atomic "data" is safe, synchronized
> and no data race is possible.


Yep, this is very basic stuff as far as memory ordering in C++ goes.
Sorry, I do not see a relation to the discussed problem.

Regards, Michael
 
 
Luca Risolia
Guest
Posts: n/a
 
      04-04-2013
On 04/04/2013 03:49, Michael Podolsky wrote:
> On Apr 3, 12:53 pm, Alexander Terekhov <(E-Mail Removed)> wrote:
>> Michael Podolsky wrote:

>
>>> 2. I then should ask the question: which part of the standard prevents
>>> lock() and unlock() operations on DIFFERENT mutexes to be reordered,
>>> i.e. what prevents the following sequence:

>>
>>> m1.lock();
>>> m1.unlock();
>>> m2.lock();
>>> m2.unlock();

>>
>>> to be compiled into:

>>
>>> m1.lock();
>>> m2.lock();
>>> m1.unlock();
>>> m2.unlock();


> trylock() in C++11 may spuriously fail, we all know about that; yet
> pretending it does not, i.e. supposing your testing assertions are
> meaningful, I wonder how to compile this code (hardware reordering
> optimizations apart) so that it releases m1 late and still does not
> deadlock. Do you mean something in the following style:
>
>
> m1.lock();
> // m1 critical section
> bool succeeded = m2.try_lock();
> m1.unlock();
>
> if(!succeeded)
>     m2.lock();
> // m2 critical section
> m2.unlock();
>
> BTW, do you have by chance any idea about my original question? Say,
> about reordering in:
>
> atomic1.store(0, memory_order_release);
>
> while( atomic2.exchange(1, memory_order_acquire) ){;}


The statement after the store can be placed before the store itself and
the statement before the while loop can be placed after the loop:
basically, the compiler is allowed to swap the statements. The only
memory order which guarantees sequential consistency is
std::memory_order_seq_cst, which is the default for atomic types.

> I currently have a feeling that it is merely the 'while' operator which
> prevents the two lines above from being reordered by the compiler, so
> that the following two lines MAY be reordered:


No. A while loop by itself does not prevent any statement
reordering.

> And I also have a feeling that C++11 might be a little vague about
> this moment -- of course, I may be completely wrong here, no intention
> to cast any doubt on C++11 standard.


To avoid deadlocks when acquiring two or more locks together, the
standard offers std::lock(). If for some reason you cannot acquire
(nested) locks at the same time, it's enough to always acquire them in a
fixed order.
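
For completeness, a minimal sketch of the std::lock() approach mentioned
above (C++11; the function name is just illustrative):

#include <mutex>

std::mutex m1, m2;

void with_both()
{
    // Lock both mutexes using a deadlock-avoidance algorithm, regardless
    // of the order other code acquires them in.
    std::lock(m1, m2);
    // Adopt the already-held locks so they are released on scope exit.
    std::lock_guard<std::mutex> g1(m1, std::adopt_lock);
    std::lock_guard<std::mutex> g2(m2, std::adopt_lock);
    // ... work that needs both m1 and m2 ...
}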

 
 
Michael Podolsky
Guest
Posts: n/a
 
      04-04-2013
On Apr 3, 11:12 pm, Luca Risolia <(E-Mail Removed)> wrote:

> The statement after the store can be placed before the store itself and
> the statement before the while loop can be placed after the loop:
> basically, the compiler is allowed to swap the statements. The only
> memory order which guarantees sequential consistency is
> std::memory_order_seq_cst, which is the default for atomic types.
>
> > I currently have a feeling that it is merely the 'while' operator which
> > prevents the two lines above from being reordered by the compiler, so
> > that the following two lines MAY be reordered:

>
> No. Merely considering a while loop does not prevent any statement
> reordering.


So you agree that my implementation of spin-locks may lead to
deadlocks just because the compiler may completely reorder my
implementations of unlock() and a following lock(). I personally think
that my implementation is pretty cool ))) and correct, and may not be
reordered by a C++11 compiler so as to lead to deadlocks, but I may be
wrong here.

Then what about the original unlock() and lock() on standard C++11
mutexes? I believe they (mutexes) are implemented without
std::memory_order_seq_cst, but with the same release (for unlock) and
acquire (for lock) ordering I used for my spinlocks. Nowhere in the
C++11 standard does it say that operations on mutexes are all
std::memory_order_seq_cst.

>
> > And I also have a feeling that C++11 might be a little vague about
> > this moment -- of course, I may be completely wrong here, no intention
> > to cast any doubt on C++11 standard.

>
> To avoid deadlocks when acquiring two or more locks together the
> standard offers std::lock(). If you cannot acquire (nested) locks at the
> same time for some reasons, it's enough to acquire them in a fixed order.


Well, I did not mean to acquire the mutexes together; it is just the
compiler that transformed my code into doing so )) Seriously, I do not
see that std::lock() makes much sense for multithreaded programming;
deadlocks usually happen because you grab different mutexes in
different places, not all in one place.

Regards, Michael
 
 
Luca Risolia
Guest
Posts: n/a
 
      04-04-2013
On 04/04/2013 05:29, Michael Podolsky wrote:
> On Apr 3, 11:12 pm, Luca Risolia <(E-Mail Removed)> wrote:
>
>> The statement after the store can be placed before the store itself and
>> the statement before the while loop can be placed after the loop:
>> basically, the compiler is allowed to swap the statements. The only
>> memory order which guarantees sequential consistency is
>> std::memory_order_seq_cst, which is the default for atomic types.


> So you agree that my implementation of spin-locks may lead to
> deadlocks just because the compiler may completely reorder my
> implementations of unlock() and a following lock(). I personally think
> that my implementation is pretty cool ))) and correct, and may not be
> reordered by a C++11 compiler so as to lead to deadlocks, but I may be
> wrong here.


Probably your implementation is not correct. I think you should pass
std::memory_order_acq_rel to exchange() (which is a read-modify-write
operation) instead of just std::memory_order_acquire. Also, a simple
std::atomic_flag would be more appropriate for implementing a "spinlock".
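
For reference, a minimal std::atomic_flag spinlock along those lines (a
sketch using acquire on the RMW and release on the clear, as in the common
formulation; whether acq_rel is needed is the question raised above):

#include <atomic>

class spinlock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock()
    {
        // acquire RMW: spin until the flag was previously clear
        while (flag.test_and_set(std::memory_order_acquire)) {}
    }
    void unlock()
    {
        // release store
        flag.clear(std::memory_order_release);
    }
};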
 