# sequence points in subexpressions

Tim Rentsch
Guest
Posts: n/a

 01-12-2010
pete <(E-Mail Removed)> writes:

> Richard wrote:
>> pete <(E-Mail Removed)> writes:
>>
>>
>>>(E-Mail Removed) wrote:
>>>
>>>>Does the statement given below invoke undefined behavior?
>>>>i = (i, i++, i) + 1;
>>>>
>>>>I am almost convinced that it does not

>
> It does.
>
>> because of the following
>>>>reasons
>>>>
>>>>1> the RHS must be evaluated before a value can be stored in i

>
> That's wrong.
>
>>>
>>>You would think so but,
>>>the evaluation of an expression also includes side effects,
>>>and the side effects of the evaluation of the right operand
>>>do not have to occur
>>>before the assignment operation on the left operand.

>>
>>
>> What side effects would you expect from the right hand side, keeping in
>> mind the sequence points?
>>

>
> The side effect from the increment operator.
>
> The sequence points from the comma operator are not relevant
> because there is no sequence point between the evaluation
> of the right and left operands of the assignment operator.
>
>
> If the right operand of the assignment operator is evaluated first,
> then there shouldn't be any problem with
> i = i++;
> but there is a problem.
> Assignment is not a sequence point.

Pete,

Ask yourself what happens in the abstract machine. In the
abstract machine, the expression

(i, i++, i)

must be evaluated to get the value of 'i' (the last 'i'),
so that the assignment may be performed. To get the
value of the last 'i', the 'i++' must have been already
evaluated and all side effects completed, because evaluating
the comma operator gives a sequence point after the left
operand, and it is the comma operator that yields the value
of the final 'i'.

In the abstract machine, operands are always evaluated before the
operations they are operands for, and evaluating a comma operator
means any side effects of the left hand side are complete before
evaluating the right hand side, and so also before the comma
operation of yielding the right hand side value. So by the
time we get to starting the assignment operation, any side
effects due to 'i++' have completed.

Tim Rentsch

 01-12-2010
James Dow Allen <(E-Mail Removed)> writes:

> On Dec 14, 2:20 am, Flash Gordon <(E-Mail Removed)> wrote:
>> Seebs wrote:
>> > I think it is UB.

>> I think it isn't UB.

>
> I'm not sure. In the simple case:
> i = (1, i++, i) + 1;
> It may be hard to imagine how the C system
> could go wrong, but one might be able to imagine
> some cache-speeding trick that assumes it
> won't encounter this code (or can do what it wants
> with it, if marked UB in The Standard).
>
> For those who think commas are permitted, what about:
> *(p += i, ++i, p += i) = j++, ++j, j;
> No problem right?

Presumably you mean
*(p += i, ++i, p += i) = (j++, ++j, j);

> The commas at left separate left-side sequence points,
> and commas at right separate (order) a different
> set of sequence points. We end up, in effect with
> i += 1, *(p += i+i-1) = j += 2;

Yes (ignoring possible overflows).

> What do we know about *which* sequence points are
> reached first, right-side vs left-side, or can they
> be interleaved?

Look at the abstract syntax tree for a single expression. The
end-of-full-expression puts a sequence point after all the
subexpressions in the tree (and the previous end-of-full-expression
puts a sequence point before all the subexpressions in the tree).
A sequencing operator (such as comma, ||, &&, or ?:) puts a sequence
point _after_ all subexpressions of the leftmost operand, and
_before_ all subexpressions of the other operands
and also _before_ all the _operations_ that are parents (or
grandparents, etc.) of the sequencing operator subexpression.
Operands or operations not covered by one of these cases for
a particular sequence point may be either before or after
that sequence point.

> *(p += i, ++i, p += i) = i++, ++i, i;
> Definitely UB-lookingish.

Again I presume you mean

*(p += i, ++i, p += i) = (i++, ++i, i);

but yes, both are equally UB. The 'i++' is
after the SP (sequence point) of the last FE (full expression),
before the SP of the next FE, before both of the RHS commas,
and before the '=' assignment (which isn't a
sequence point, but the 'i++' is still before it), but
is neither definitely before nor definitely
after any of the sequence points of the commas
in the LHS

(p += i, ++i, p += i)

So any of the accesses to 'i' are in conflict with
the 'i++' on the RHS (among other conflicts).

I think what you're getting at is, the notions of "previous
sequence point" and "next sequence point" are well-defined for a
single access but not really well-defined for a pair of accesses.
Unless an access is _definitely after_ the next sequence point of
another access, or _definitely before_ the previous sequence
point of that other access, the two accesses may occur between
the previous sequence point and the next sequence point of the
second access (which terms are now well-defined with respect
to that single access).

Tim Rentsch

 01-12-2010
Eric Sosman <(E-Mail Removed)> writes:

> On 12/13/2009 6:20 PM, Flash Gordon wrote:
>> James Dow Allen wrote:
>>> On Dec 14, 2:20 am, Flash Gordon <(E-Mail Removed)> wrote:
>>>> Seebs wrote:
>>>>> I think it is UB.
>>>> I think it isn't UB.
>>>
>>> I'm not sure. In the simple case:
>>> i = (1, i++, i) + 1;
>>> It may be hard to imagine how the C system
>>> could go wrong, but one might be able to imagine
>>> some cache-speeding trick that assumes it
>>> won't encounter this code (or can do what it wants
>>> with it, if marked UB in The Standard).

>>
>> Certainly if it is UB such assumptions can be made, but is it?
>> There is a sequence point between the evaluation of i++ and the
>> evaluation of i to its right, and it is the result of that i which is
>> yielded by the comma operator and then has 1 added to it before being
>> assigned to i. So, the sequence point of the comma operator is before
>> the assignment side effect of the equals operator.

>
> The value of the parenthesized sub-expression is the
> value of `i' after incrementation, yes. But where is it
> written that the sub-expression's value must be determined
> by actually reading it from `i'? If an optimizing compiler
> knew that `i' was 42 before the line in question, could it
> not replace the assignment with `i=44', with the `i++'
> happening at some undetermined moment?

In terms of the abstract machine, it seems clear from 6.3.2.1
paragraphs 1 and 2 that the value of the final 'i' is the value
stored in the object. What an optimizing compiler might do
isn't relevant, because whatever it does must be faithful
to what the abstract machine does.

Tim Rentsch

 01-12-2010
"Johannes Schaub (litb)" <(E-Mail Removed)> writes:

> Kaz Kylheku wrote:
>
>> On 2009-12-14, pete <(E-Mail Removed)> wrote:
>>> Beej Jorgensen wrote:
>>>
>>>> o For the value computation of (i,i++,i) to be complete, i++'s side
>>>> effects must be complete.
>>>
>>> I disagree.
>>> I know that the value of (i,i++,i) is one greater
>>> than the original value of (i).
>>> I computed that without accomplishing any side effects.

>>
>> You /must/ be trolling.
>>
>> Computing the value of (i, i++, i) is /mathematically/ impossible without
>> modelling the side effect of i++, and reconciliation thereof by the comma
>> operator's sequence point.
>>
>> Without the help of that sequence point, the final rightmost i does not
>> reliably refer to the incremented value.
>>
>> You cannot reduce the expression to a stable value without taking into
>> account the sequence points and modeling them faithfully in your
>> evaluation.
>>
>> In your evaluation model, you must instantiate a representation of the
>> object i, and then perform the side effect upon it. There are no magic
>> shortcuts.

>
> In C99, it's totally irrelevant that there is a sequence point after "i++",
> because the evaluation of the assignment expression ("i = (i, i++, i)") is
> not after that sequence point, but it's *around* that sequence point: The
> sequence point after "i++" is a part of the assignment expression (namely,
> it appears in evaluation of its right side here).
>
> As such, the Standard does not enforce that the assignment side effect does
> not appear between the same pair of sequence points (and it doesn't even
> guarantee the side effect is complete before any of the sequence points within
> any of its operands). It only has to be complete at the full expression
> sequence point. So, since we now have the value of the scalar changed
> between the previous and the next sequence point more than one time
> potentially (depending on how the implementation schedules the side
> effects), we are running into UB.

I think you're confused for reasons similar to those explained in my
response to James Dow Allen. Here's an easy way to look at it. Any
change to an object is the result of an operation. Consider each
object-changing operation in turn. This operation has a well-defined
previous sequence point and next sequence point. Now consider all
other accesses (whether reading or writing) to that object. If any of
those accesses are not _definitely before_ the previous sequence point
and not _definitely after_ the next sequence point (of the operation
we started on), then there is undefined behavior. But if all the
other accesses are either definitely before the previous SP, or
definitely after the next SP, and that's true for all object-changing
operations, then there is no undefined behavior.

Tim Rentsch

 01-12-2010
pete <(E-Mail Removed)> writes:

> Kenneth Brody wrote:
>> pete wrote:
>>
>>> Beej Jorgensen wrote:
>>>
>>>> o For the value computation of (i,i++,i) to be complete, i++'s side
>>>> effects must be complete.
>>>
>>>
>>> I disagree.
>>> I know that the value of (i,i++,i) is one greater
>>> than the original value of (i).
>>> I computed that without accomplishing any side effects.

>>
>>
>> But, in order for the result to be "one greater than the original
>> value of (i)", the side effect of incrementing "i" must have taken
>> place before evaluating the lone "i" which follows.
>>
>> You just disproved yourself.
>>
>> QED.

>
> I'm not evaluating the lone (i) which follows,
> because I'm not evaluating the expression which contains it.
>
> The meaning of the phrase "evaluate an expression" is at issue here.
>
> If an expression has side effects,
> then merely calculating its value,
> is not an evaluation of the expression.
>
> If you had to evaluate the right operand of every assignment operator
> before making the assignment,
> then assignment would be a sequence point, but it isn't.

You do have to evaluate the RHS of an assignment (and the LHS for that
matter) before making the assignment, but that doesn't make assignment
a sequence point. 6.3.2.1 paragraphs 1 and 2 make it clear that accessing a
variable implies evaluating the expression containing the variable
reference (or any other lvalue object reference). Evaluation is
necessary to get the value stored in the abstract machine.