Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > sequence points in subexpressions

Reply
Thread Tools

sequence points in subexpressions

 
 
Kaz Kylheku
Guest
Posts: n/a
 
      12-14-2009
On 2009-12-14, pete <(E-Mail Removed)> wrote:
> No.
> The side effects of the increment operator
> are not needed for the dataflow dependency.


That is true in general, but not in this specific expression.

The key here is that the comma operator imposes the side effect barrier.

> The value of the right operand of the assignment operator
> can be calculated without any side effects taking place;
> It will be equal to (i + 2).


You are gapingly overlooking the fact that algebraic evaluation
shortcuts are optimizations which are only permitted if they don't break
the abstract semantics.

> Sequence points are only defined by
> when the side effects of an evaluation take place,
> not the data calculations.


That is false. A sequence point establishes not only that prior side
effects have settled, but that the next evaluation has not yet started.

I.e. if the prior evaluation has the side effect of modifying X, and the
next evaluation accesses X (a data calculation), then this means that
the side effect of X is settled, and the next evaluation will use the
new, updated, stable value of X.

> (p = p -> next = q)
>
> means the exact same thing as
>
> (p = (p -> next = q))
>
> But the side effect of evaluating (p -> next = q)
> does not have to take place before the update of p.


This is different because no operator is used in this expression which
has sequencing properties.

Since it does not contain any, this example you have given can be used
to neither support nor refute any of your claims about the properties of
sequence points.
 
Reply With Quote
 
 
 
 
Kaz Kylheku
Guest
Posts: n/a
 
      12-14-2009
On 2009-12-14, pete <(E-Mail Removed)> wrote:
> Beej Jorgensen wrote:
>
>> o For the value computation of (i,i++,i) to be complete, i++'s side
>> effects must be complete.

>
> I disagree.
> I know that the value of (i,i++,i) is one greater
> than the original value of (i).
> I computed that without accomplishing any side effects.


What you are describing is the existence of an algebraic shortcut,
which is an optimization.

Remember, that the actual machine can take arbitrary optimizations
in computing the visible behavior of the program, but the results have
to be like what the abstract machine would have computed.

The abstract machine requires adherence to sequence points.

Optimizations can discard sequence points only when it makes no
difference to the computed value, or externally visible behavior.
 
Reply With Quote
 
 
 
 
Keith Thompson
Guest
Posts: n/a
 
      12-14-2009
pete <(E-Mail Removed)> writes:
> Beej Jorgensen wrote:
>> On 12/14/2009 12:19 AM, pete wrote:
>>>Beej Jorgensen wrote:
>>>> o For the value computation of (i,i++,i) to be complete, i++'s side
>>>> effects must be complete.
>>>
>>>I disagree.
>>>I know that the value of (i,i++,i) is one greater
>>>than the original value of (i).
>>>I computed that without accomplishing any side effects.

>>
>> Then I think you skipped over a sequence point without performing every
>> value computation and side effect associated with subexpression i++, in
>> violation of 5.1.2.3p3:
>>
>> # The presence of a sequence point between the evaluation of expressions
>> # A and B implies that every value computation and side effect
>> # associated with A is sequenced before every value computation and side
>> # effect associated with B.
>>
>> But you're arguing the side effects don't necessarily take place at the
>> sequence point, is that right?

>
> No.
> I'm saying that the word "evaluation" includes side effects,
> and I'm saying that value of an expression can be determined and used
> prior to the evaluation of that expression being completed.


Consider:

int i = 0;
int j = 0;
i ++;
j = i;

I can determine the value of the expression ``i'' on line 4
(it's 1) without completing the evaluation of ``i ++'' on line 3.
Nevertheless, the side effect of the "++" must occur before the
side effect of the "=" on line 4.

A compiler may generate code that performs the operations in a
different order, or that omits some operations altogether, but any
such optimization cannot produce visible behavior outside the range
of permitted behaviors of the abstract machine. In particular,
optimization cannot introduce undefined behavior when the behavior
was well defined in the first place.

I think you'll agree that the behavior of my 4-line snippet
above is well defined. I argue that the behavior of

i = (i, i++, i) + 1;

is well defined for the same reason: the sequence points impose
requirements on when the side effects take place, and limit the
optimizer's options to rearrange operations.

[A request: when replying in this thread, please include the original
statement we're discussion.]

--
Keith Thompson (The_Other_Keith) http://www.velocityreviews.com/forums/(E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Reply With Quote
 
Antoninus Twink
Guest
Posts: n/a
 
      12-14-2009
On 13 Dec 2009 at 14:30, Nick wrote:
> If it's that borderline and you'd never need it, why does it actually
> matter, other than as a sort of C language sudoku.


*ding*

It seems a light has come on.

 
Reply With Quote
 
Chad
Guest
Posts: n/a
 
      12-14-2009
On Dec 14, 10:13*am, Keith Thompson <(E-Mail Removed)> wrote:
> pete <(E-Mail Removed)> writes:
> > Beej Jorgensen wrote:
> >> On 12/14/2009 12:19 AM, pete wrote:
> >>>Beej Jorgensen wrote:
> >>>> o *For the value computation of (i,i++,i) to be complete, i++'s side
> >>>> * *effects must be complete.

>
> >>>I disagree.
> >>>I know that the value of (i,i++,i) is one greater
> >>>than the original value of (i).
> >>>I computed that without accomplishing any side effects.

>
> >> Then I think you skipped over a sequence point without performing every
> >> value computation and side effect associated with subexpression i++, in
> >> violation of 5.1.2.3p3:

>
> >> # The presence of a sequence point between the evaluation of expressions
> >> # A and B implies that every value computation and side effect
> >> # associated with A is sequenced before every value computation and side
> >> # effect associated with B.

>
> >> But you're arguing the side effects don't necessarily take place at the
> >> sequence point, is that right?

>
> > No.
> > I'm saying that the word "evaluation" includes side effects,
> > and I'm saying that value of an expression can be determined and used
> > prior to the evaluation of that expression being completed.

>
> Consider:
>
> * * int i = 0;
> * * int j = 0;
> * * i ++;
> * * j = i;
>
> I can determine the value of the expression ``i'' on line 4
> (it's 1) without completing the evaluation of ``i ++'' on line 3.
> Nevertheless, the side effect of the "++" must occur before the
> side effect of the "=" on line 4.
>


How can you determine the value of the expression 'i' on line 4
without completing the evaluation of 'i++' on line 3?
 
Reply With Quote
 
Beej Jorgensen
Guest
Posts: n/a
 
      12-14-2009
On 12/14/2009 10:43 AM, Chad wrote:
> On Dec 14, 10:13 am, Keith Thompson <(E-Mail Removed)> wrote:
>> I can determine the value of the expression ``i'' on line 4
>> (it's 1) without completing the evaluation of ``i ++'' on line 3.
>> Nevertheless, the side effect of the "++" must occur before the
>> side effect of the "=" on line 4.

>
> How can you determine the value of the expression 'i' on line 4
> without completing the evaluation of 'i++' on line 3?


It depends on if by "completing the evaluation of", you mean that side
effects have been resolved, or if they are still pending and merely the
value calculation has been performed.

Because maybe the side effect of storing the result of i++ in i hasn't
been done yet, even though the answer is known. The side effect of
storing the result must merely happen before the next sequence point.

Here's some example fake "assembly" of (i,i++,i) starting with i = 3490
that does this:

; I think this example violates the
; Standard by ignoring a sequence point

i = 3490
i_inc = i + 1
result = i_inc ; we've calculated the result before storing it
i++ ; now we store it

Where I'm differing with Pete is that I think there's a sequence point after
i++, and therefore the side effect must take place by then:

i = 3490
;; == sequence point ==
i_inc = i + 1
i++ ; now we store it because of the seq point
;; == sequence point ==
result = i_inc
;; == sequence point ==

Take special note of that last line there. It could just as well have
been:

result = i

and had it work. This is the bit Pete is saying that I do agree with.

-Beej

 
Reply With Quote
 
Keith Thompson
Guest
Posts: n/a
 
      12-14-2009
Chad <(E-Mail Removed)> writes:
> On Dec 14, 10:13*am, Keith Thompson <(E-Mail Removed)> wrote:

[...]
>> Consider:
>>
>> * * int i = 0;
>> * * int j = 0;
>> * * i ++;
>> * * j = i;
>>
>> I can determine the value of the expression ``i'' on line 4
>> (it's 1) without completing the evaluation of ``i ++'' on line 3.
>> Nevertheless, the side effect of the "++" must occur before the
>> side effect of the "=" on line 4.

>
> How can you determine the value of the expression 'i' on line 4
> without completing the evaluation of 'i++' on line 3?


By analyzing the code. We know that the value that will be stored in
j is 1; if we can figure it out, so can the compiler.

Consider this complete program:

include <stdio.h>
int main(void)
{
int i = 0;
int j = 0;
i ++;
j = i;
printf("i = %d, j = %d\n", i, j);
return 0;
}

The compiler is free to replace the assignment ``j = i;'' with the
equivalent of ``j = 1;'', or even to eliminate j altogether and
replace the printf call with ``puts("i = 1, j = 1")''. What it
*can't* do is generate code that produces output other than
i = 1, j = 1

In particular, if it generates code for "j = i;" that actually reads i
and updates j, it can't postpone the side efect of the "++" operator
so it occurs after the assignment.

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Reply With Quote
 
Beej Jorgensen
Guest
Posts: n/a
 
      12-14-2009
On 12/14/2009 05:00 AM, Eric Sosman wrote:
> The value of the parenthesized sub-expression is the
> value of `i' after incrementation, yes. But where is it
> written that the sub-expression's value must be determined
> by actually reading it from `i'?


I don't think that's written--I think it can get the calculated value of
i++ from wherever it wants, but I think it is written that the side
effect of i++ (i.e. the modification of i) must occur before the
rightmost i is evaluated in the subexpression i,i++:

[C99 5.1.2.3p2]
# At certain specified points in the execution sequence called sequence
# points, all side effects of previous evaluations shall be complete and
# no side effects of subsequent evaluations shall have taken place.

[C99 6.5.1.7p2]
# The left operand of a comma operator is evaluated as a void
# expression; there is a sequence point after its evaluation.

I see how it's theoretically possible to perform this calculation:

i = (i, i++, i) + 1

without applying the side effect of storing i+1 in i, but I don't see
how it would be *legal* to do so under the Standard.

> If an optimizing compiler knew that `i' was 42 before the line in
> question, could it not replace the assignment with `i=44', with the
> `i++' happening at some undetermined moment?


In a world without sequence points, I'd totally allow i++ to store the
result in i at some undetermined moment, but the sequence point at the
comma forces the side effect to take place at that particular moment.
(Perhaps the side effect hasn't taken place in "machine code reality",
but it must have taken place in "C code reality".)

Of course, if an optimizing compiler can determine that none of the side
effects or calculations will be visible, it is free to optimize the
entire expression away (spelled out in C99 5.1.2.3p3). But I figure
that since we're discussing it, this particular optimization has not
occurred in this case.

-Beej

 
Reply With Quote
 
Johannes Schaub (litb)
Guest
Posts: n/a
 
      12-14-2009
(E-Mail Removed) wrote:

> Does the statement given below invoke undefined behavior?
> i = (i, i++, i) + 1;
>
> I am almost convinced that it does not because of the following
> reasons
>
> 1> the RHS must be evaluated before a value can be stored in i
> 2> evaluation of RHS does not invoke UB due to the sequence points
> introduced by comma operator
>


Hello there! I'm the evil guy that stated it is UB in C. After reading
analysis of all you, i agree that this is not undefined behavior in C1x
(great work @ Beej Jorgensen !)

But i still think it is UB in C99. The additional requirements about value
computations are missing from C99 and so the "Between the previous and next
sequence point an object shall have its stored value modified at most once
by the evaluation of an expression." seems to render behavior UB.

In C99 it doesn't matter whether the assignment to i needs to compute a
value first. Merely the missing sequence point between the assignment and
the "i++" is enough to render behavior UB by the above quote.

Of course, my analysis might be too simple and miss some important points.
 
Reply With Quote
 
Beej Jorgensen
Guest
Posts: n/a
 
      12-14-2009
On 12/14/2009 02:07 PM, Johannes Schaub (litb) wrote:
> But i still think it is UB in C99. The additional requirements about value
> computations are missing from C99 and so the "Between the previous and next
> sequence point an object shall have its stored value modified at most once
> by the evaluation of an expression." seems to render behavior UB.


I'm still not sure because I'm not sure where the "next sequence point"
is in C99.

A B C D
| | | |
i = (i, i++, i) + 1

i is modified twice between A and D, which causes UB by your above cite.
But what of sequence point C? Does it not count?

I can't help but feel that C99 is missing a needed dimension in the
model, and 201x is addressing this. (Though it might have been a
necessary addition due to the threading stuff [5.1.2.4 "Multi-threaded
executions and data races"] which heavily relies on sequencing side
effects.)

-Beej

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Re: subexpressions (OT: math) Steve Howell Python 23 06-04-2007 11:49 AM
subexpressions Sergey Dorofeev Python 20 06-04-2007 12:35 AM
Re: subexpressions Steve Howell Python 2 06-02-2007 02:39 PM
Examples of using "reluctant" subexpressions in regexps? david.karr@wamu.net Java 4 04-27-2005 07:46 PM
pre_match and post_match for subexpressions trans. (T. Onoma) Ruby 0 10-24-2004 03:41 AM



Advertisments