# Re: Pointer Arithmetic & UB

Shao Miller

 12-21-2012
On 12/21/2012 16:16, Tim Rentsch wrote:
> Shao Miller <(E-Mail Removed)> writes:
>
>> On 12/20/2012 11:00, Ken Brody wrote:
>>>
>>> Condensed version of the discussion so far, in this subthread:
>>>
>>> =====
>>>
>>> Given:
>>>
>>> extern volatile int x;
>>> int i = x + x;
>>>
>>> And citing several C&V from different standards regarding the fact
>>> merely accessing a volatile is a "side effect".
>>>
>>> Does the above invoke UB? (No sequence point between the two "side
>>> effects" of accessing "x".)
>>>
>>> =====
>>>
>>> The Standard also requires (5.1.2.3p2) that all side effects be
>>> "complete" at the next sequence point.
>>>
>>> Given that whatever side effect the access of "x" may have is outside
>>> the control of the abstract machine, I fail to see how the sequence
>>> point requirement applies to the side effect of accessing a volatile.
>>>
>>> Consider, for example, a memory-mapped I/O system, where reading from a
>>> given address causes the printer to start printing whatever is in its
>>> buffer. How can C enforce the "shall be complete" requirement of
>>> 5.1.2.3p2? How is "i=x;i+=x;" any better than "i=x+x;"?
>>>

>>
>> Yes, it invokes undefined behaviour. If a read of 'x' is a side
>> effect, then two reads of 'x' are two side effects that could conflict
>> if they occur simultaneously. That is, since an implementation can
>> say that a read of 'x' increments a foo counter elsewhere, then two
>> simultaneous reads can result in two simultaneous increments of the
>> foo counter, which blows up the computer.

>
> IMO this conclusion is wrong. The consequences of volatile access (ie,
> the extra-linguistic side effects) are outside the domain of 6.5p2,
> because it is concerned only with program expressions, not other
> unknown memory changes. This view is explained in more detail in my
> response to Ben Bacarisse in this thread.
>

I read the other response. And what about if 'x' and "foo counter" are
the same scalar? Regardless of that or any other example, N1570's
informative Annex I, point 2 includes:

"An ‘‘unordered’’ binary operator (not comma, &&, or ||) contains a
side effect to an lvalue in one operand, and a side effect to, or an
access to the value of, the identical lvalue in the other operand (6.5)."

Informative Annex J "Portability issues", J.2 "Undefined behavior",
point 1 includes:

"A side effect on a scalar object is unsequenced relative to either a
different side effect on the same scalar object or a value computation
using the value of the same scalar object (6.5)."

Why wouldn't the Standard discuss "stored value" or "modification"
instead of using the looser "side effect"? 5.1.2.3p2:

"Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects,12) which are changes in the state of the execution environment.
Evaluation of an expression in general includes both value computations
and initiation of side effects. Value computation for an lvalue
expression includes determining the identity of the designated object."

There is no need for any extra-linguistic side effect, since "accessing
a volatile object" is a side effect by its own right. I think examples
can still be helpful, though.

6.7.3p7:

"An object that has volatile-qualified type may be modified in ways
unknown to the implementation or have other unknown side effects.
Therefore any expression referring to such an object shall be evaluated
strictly according to the rules of the abstract machine, as described in
5.1.2.3. Furthermore, at every sequence point the value last stored in
the object shall agree with that prescribed by the abstract machine,
except as modified by the unknown factors mentioned previously.134) What
constitutes an access to an object that has volatile-qualified type is
implementation-defined."

If the last sentence here means that an implementation has license to
redefine that one or both of a read or a store is _not_ an access to a
volatile-qualified type, then it cannot abide by the second sentence,
which involves the semantics of reading and modifying. If one accepts
that the final sentence is not allowing for a redefinition of "access,"
then it must be allowing for a definition of what [else] _constitutes_
an access, such as:

- A clock tick increments the stored value of a tick-counter
- A read of a usage-counter increments the stored value of the usage-counter
- A read of the value of an object used for obtaining random values
causes the object's stored value to change to a new random value
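
A hosted sketch of the second bullet, under stated assumptions: standard C cannot express a location whose read changes it, so the hypothetical accessor `read_usage_counter` simulates the hardware by reading and then bumping an ordinary volatile object. Wrapping the read in a function also changes the sequencing question, since two calls in one expression are indeterminately sequenced rather than unsequenced.

```c
#include <assert.h>

/* Assumption: an ordinary volatile object stands in for a hardware
   usage-counter whose value changes whenever it is read. */
static volatile unsigned usage_counter = 0;

/* Simulated "read with a side effect": returns the value seen, then
   stores the incremented value back. The two accesses are in separate
   full expressions, so they are sequenced. */
unsigned read_usage_counter(void)
{
    unsigned seen = usage_counter;   /* volatile read */
    usage_counter = seen + 1u;       /* volatile store */
    return seen;
}
```

With this accessor, `read_usage_counter() + read_usage_counter()` evaluates the two calls in some order, one after the other, so the sum does not depend on which call runs first.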

"Undefined" by the Standard can obviously be defined by the
implementation, but need not be, which is why I'd suggest that it's
undefined. (The implementation doesn't have to know, document, or be

- Shao Miller

glen herrmannsfeldt

 12-21-2012
Shao Miller <(E-Mail Removed)> wrote:

> On 12/21/2012 16:16, Tim Rentsch wrote:

(snip)
>> IMO this conclusion is wrong. The consequences of volatile access (ie,
>> the extra-linguistic side effects) are outside the domain of 6.5p2,
>> because it is concerned only with program expressions, not other
>> unknown memory changes. This view is explained in more detail in my
>> response to Ben Bacarisse in this thread.

> I read the other response. And what about if 'x' and "foo counter" are
> the same scalar? Regardless of that or any other example, N1570's
> informative Annex I, point 2 includes:

> "An ‘‘unordered’’ binary operator (not comma, &&, or ||) contains a
> side effect to an lvalue in one operand, and a side effect to, or an
> access to the value of, the identical lvalue in the other operand (6.5)."

This seems to be for expressions like (x++)+(x).

Also, it seems to indicate that access isn't a side effect.

> Informative Annex J "Portability issues", J.2 "Undefined behavior",
> point 1 includes:

> "A side effect on a scalar object is unsequenced relative to either a
> different side effect on the same scalar object or a value computation
> using the value of the same scalar object (6.5)."

> Why wouldn't the Standard discuss "stored value" or "modification"
> instead of using the looser "side effect"? 5.1.2.3p2:

> "Accessing a volatile object, modifying an object, modifying a file,
> or calling a function that does any of those operations are all side
> effects,12) which are changes in the state of the execution environment.
> Evaluation of an expression in general includes both value computations
> and initiation of side effects. Value computation for an lvalue
> expression includes determining the identity of the designated object."

Is there any indication in the standard on what "side effect" means
for volatile data? I/O registers have been mentioned in the discussion,
but is that in the standard?

There have been systems with self-incrementing or self-decrementing
memory locations. If one was used for a volatile variable, then
there is an obvious side effect.

> There is no need for any extra-linguistic side effect, since "accessing
> a volatile object" is a side effect by its own right. I think examples
> can still be helpful, though.

> 6.7.3p7:

> "An object that has volatile-qualified type may be modified in ways
> unknown to the implementation or have other unknown side effects.
> Therefore any expression referring to such an object shall be evaluated
> strictly according to the rules of the abstract machine, as described in
> 5.1.2.3. Furthermore, at every sequence point the value last stored in
> the object shall agree with that prescribed by the abstract machine,
> except as modified by the unknown factors mentioned previously.134) What
> constitutes an access to an object that has volatile-qualified type is
> implementation-defined."

Sometimes "access" means fetch but not store. I am not so sure
in this case either way.

> If the last sentence here means that an implementation has license to
> redefine that one or both of a read or a store is _not_ an access to a
> volatile-qualified type, then it cannot abide by the second sentence,
> which involves the semantics of reading and modifying. If one accepts
> that the final sentence is not allowing for a redefinition of "access,"
> then it must be allowing for a definition of what [else] _constitutes_
> an access, such as:

> - A clock tick increments the stored value of a tick-counter
> - A read of a usage-counter increments the stored value of the usage-counter
> - A read of the value of an object used for obtaining random values
> causes the object's stored value to change to a new random value

Seems to me that these all have to be implementation dependent
(or implementation defined), in which case the effects and meaning
of "volatile" should also be so defined.

> "Undefined" by the Standard can obviously be defined by the
> implementation, but need not be, which is why I'd suggest that it's
> undefined. (The implementation doesn't have to know, document, or be

-- glen

Shao Miller

 12-22-2012
On 12/21/2012 18:52, glen herrmannsfeldt wrote:
> Shao Miller <(E-Mail Removed)> wrote:
>
>> On 12/21/2012 16:16, Tim Rentsch wrote:

> (snip)
>>> IMO this conclusion is wrong. The consequences of volatile access (ie,
>>> the extra-linguistic side effects) are outside the domain of 6.5p2,
>>> because it is concerned only with program expressions, not other
>>> unknown memory changes. This view is explained in more detail in my
>>> response to Ben Bacarisse in this thread.

>
>> I read the other response. And what about if 'x' and "foo counter" are
>> the same scalar? Regardless of that or any other example, N1570's
>> informative Annex I, point 2 includes:

>
>> "An ‘‘unordered’’ binary operator (not comma, &&, or ||) contains a
>> side effect to an lvalue in one operand, and a side effect to, or an
>> access to the value of, the identical lvalue in the other operand (6.5)."

>
> This seems to be for expressions like (x++)+(x).
>
> Also, it seems to indicate that access isn't a side effect.
>

Down below in 5.1.2.3p2, we see that for volatile-qualified types of
objects, it really is. The upthread question about volatile 'x' in 'x +
x' has been snipped.
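
For reference, a minimal sketch of the snipped example next to the sequenced alternative from the condensed summary (`i=x;i+=x;`). The definition and initial value of `x` and the function name `sum_two_reads` are assumptions for illustration; the disputed form appears only in a comment and is not executed.

```c
#include <assert.h>

/* Assumption: x is defined here so the sketch links on its own; the
   thread only declares it as `extern volatile int x;`. */
volatile int x = 21;

/* The disputed form from the thread is:
       int i = x + x;
   The two reads of x are operands of the same + and are unsequenced
   relative to each other. */

/* Sequenced form: each read of x sits in its own full expression, so
   a sequence point separates the two volatile accesses. */
int sum_two_reads(void)
{
    int i = x;   /* first read of x */
    i += x;      /* second read of x, sequenced after the first */
    return i;
}
```

Whether the commented-out form is undefined is exactly what this subthread disputes; the sequenced form avoids the question.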

>> Informative Annex J "Portability issues", J.2 "Undefined behavior",
>> point 1 includes:

>
>> "A side effect on a scalar object is unsequenced relative to either a
>> different side effect on the same scalar object or a value computation
>> using the value of the same scalar object (6.5)."

>
>> Why wouldn't the Standard discuss "stored value" or "modification"
>> instead of using the looser "side effect"? 5.1.2.3p2:

>
>> "Accessing a volatile object, modifying an object, modifying a file,
>> or calling a function that does any of those operations are all side
>> effects,12) which are changes in the state of the execution environment.
>> Evaluation of an expression in general includes both value computations
>> and initiation of side effects. Value computation for an lvalue
>> expression includes determining the identity of the designated object."

>
> Is there any indication in the standard on what "side effect" means
> for volatile data? I/O registers have been mentioned in the discussion,
> but is that in the standard?
>

Yes. [Albeit non-normative] Footnote 134 in 6.7.3p7:

"A volatile declaration may be used to describe an object
corresponding to a memory-mapped input/output port or an object accessed
by an asynchronously interrupting function. Actions on objects so
declared shall not be ‘‘optimized out’’ by an implementation or
reordered except as permitted by the rules for evaluating expressions."
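
A hosted sketch of the first case in the footnote, assuming a simulated port: the volatile object `fake_port` stands in for a device register, and `port` and `poll_port` are hypothetical names. On real hardware the pointer would instead hold a fixed address supplied by the platform.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a volatile object in ordinary memory stands in for the
   memory-mapped device register. */
static volatile uint32_t fake_port = 0x5Au;

/* Pointer through which the "register" is accessed. */
static volatile uint32_t *const port = &fake_port;

/* Read the port n times, one access per full expression, and return
   the last value seen (0 if n is 0). Because the object is volatile,
   the reads may not be merged or dropped. */
uint32_t poll_port(unsigned n)
{
    uint32_t last = 0;
    for (unsigned k = 0; k < n; ++k) {
        last = *port;   /* one volatile read per iteration */
    }
    return last;
}
```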

> There have been systems with self-incrementing or self-decrementing
> memory locations. If one was used for a volatile variable, then
> there is an obvious side effect.
>

I believe that there are _two_ side effects; one by 5.1.2.3p2 and one by
either "other unknown side effects" or due to an
"implementation-defined", additional "access". (Both of these latter
from 6.7.3p7.)

>> There is no need for any extra-linguistic side effect, since "accessing
>> a volatile object" is a side effect by its own right. I think examples
>> can still be helpful, though.

>
>> 6.7.3p7:

>
>> "An object that has volatile-qualified type may be modified in ways
>> unknown to the implementation or have other unknown side effects.
>> Therefore any expression referring to such an object shall be evaluated
>> strictly according to the rules of the abstract machine, as described in
>> 5.1.2.3. Furthermore, at every sequence point the value last stored in
>> the object shall agree with that prescribed by the abstract machine,
>> except as modified by the unknown factors mentioned previously.134) What
>> constitutes an access to an object that has volatile-qualified type is
>> implementation-defined."

>
> Sometimes "access" means fetch but not store. I am not so sure
> in this case either way.
>

"Access" is the very first definition under section 3 "Terms,
definitions, and symbols". 3.1p1:

"access
〈execution-time action〉 to read or modify the value of an object"

"NOTE 1 Where only one of these two actions is meant, ‘‘read’’ or
‘‘modify’’ is used."

>> If the last sentence here means that an implementation has license to
>> redefine that one or both of a read or a store is _not_ an access to a
>> volatile-qualified type, then it cannot abide by the second sentence,
>> which involves the semantics of reading and modifying. If one accepts
>> that the final sentence is not allowing for a redefinition of "access,"
>> then it must be allowing for a definition of what [else] _constitutes_
>> an access, such as:

>
>> - A clock tick increments the stored value of a tick-counter
>> - A read of a usage-counter increments the stored value of the usage-counter
>> - A read of the value of an object used for obtaining random values
>> causes the object's stored value to change to a new random value

>
> Seems to me that these all have to be implementation dependent
> (or implementation defined), in which case the effects and meaning
> of "volatile" should also be so defined.
>

Why? 3.4.1p1:

"implementation-defined behavior
unspecified behavior where each implementation documents how the choice
is made"

Once again, 6.7.3p7 includes:

"An object that has volatile-qualified type may be modified in ways
unknown to the implementation or have other unknown side effects. ..."

So each of the three examples could belong to either of these two
categories. Or, they could be implementation-defined, additional accesses.

>> "Undefined" by the Standard can obviously be defined by the
>> implementation, but need not be, which is why I'd suggest that it's
>> undefined. (The implementation doesn't have to know, document, or be

>

- Shao Miller

Ken Brody

 12-22-2012
On 12/21/2012 5:40 PM, glen herrmannsfeldt wrote:
[...]
> There is much discussion on comp.lang.fortran on what compilers
> might do when optimizing function calls. Seems like in Fortran,
> a compiler is allowed to optimize out the call with very little
> reason, consider:
>
> x=0*fclose(out);

However, because the compiler cannot know if the function itself will have
side effects, the fact that x will always be zero is irrelevant to a C
compiler -- the function call cannot be removed. If Fortran allows it to be
removed, I would consider that part of the language to be "broken".

[...]
> OK, but there is no "magic" keyword to apply to functions.
> So, should compilers always call functions?

Yes, because that function may have side effects. And, in C at least, the
function must therefore be called. This is no different than:

(void)fclose(out);
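
A small sketch of the point, assuming a hypothetical `close_stub` in place of `fclose` so the effect can be observed without a real stream. The counter `calls` is likewise an illustration, not anything from the thread.

```c
#include <assert.h>

/* Counts how many times close_stub() has run. */
int calls = 0;

/* Hypothetical stand-in for fclose(): it has a side effect (the
   counter) and returns 0, as fclose() does on success. */
int close_stub(void)
{
    ++calls;
    return 0;
}

/* The product is always 0, but the call is still evaluated in the
   abstract machine, so the counter must end up incremented. */
int zero_times_call(void)
{
    int x = 0 * close_stub();
    return x;
}
```

After one call to `zero_times_call()`, the result is 0 and `calls` is 1, the same effect `(void)close_stub();` would have on the counter.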

[...]

glen herrmannsfeldt

 12-22-2012
Ken Brody <(E-Mail Removed)> wrote:

(snip, I wrote)
>> There is much discussion on comp.lang.fortran on what compilers
>> might do when optimizing function calls. Seems like in Fortran,
>> a compiler is allowed to optimize out the call with very little
>> reason, consider:

>> x=0*fclose(out);

> However, because the compiler cannot know if the function itself will have
> side effects, the fact that x will always be zero is irrelevant to a C
> compiler -- the function call cannot be removed. If Fortran allows it to be
> removed, I would consider that part of the language to be "broken".

Well, comp.lang.fortran people might consider lots of parts of C
broken, but there is some debate as to what Fortran says on this.

There is also the PURE attribute for functions which don't have
side effects (and mostly the compiler can check for that).

Otherwise, it is usual to use SUBROUTINEs when you actually
want side effects.

>> OK, but there is no "magic" keyword to apply to functions.
>> So, should compilers always call functions?

> Yes, because that function may have side effects. And, in C at least, the
> function must therefore be called. This is no different than:

> (void)fclose(out);

-- glen

Tim Rentsch

 12-22-2012
Shao Miller <(E-Mail Removed)> writes:

> On 12/21/2012 16:16, Tim Rentsch wrote:
>> Shao Miller <(E-Mail Removed)> writes:
>>
>>> On 12/20/2012 11:00, Ken Brody wrote:
>>>>
>>>> Condensed version of the discussion so far, in this subthread:
>>>>
>>>> =====
>>>>
>>>> Given:
>>>>
>>>> extern volatile int x;
>>>> int i = x + x;
>>>>
>>>> And citing several C&V from different standards regarding the fact
>>>> merely accessing a volatile is a "side effect".
>>>>
>>>> Does the above invoke UB? (No sequence point between the two "side
>>>> effects" of accessing "x".)
>>>>
>>>> =====
>>>>
>>>> The Standard also requires (5.1.2.3p2) that all side effects be
>>>> "complete" at the next sequence point.
>>>>
>>>> Given that whatever side effect the access of "x" may have is outside
>>>> the control of the abstract machine, I fail to see how the sequence
>>>> point requirement applies to the side effect of accessing a volatile.
>>>>
>>>> Consider, for example, a memory-mapped I/O system, where reading from a
>>>> given address causes the printer to start printing whatever is in its
>>>> buffer. How can C enforce the "shall be complete" requirement of
>>>> 5.1.2.3p2? How is "i=x;i+=x;" any better than "i=x+x;"?
>>>>
>>>
>>> Yes, it invokes undefined behaviour. If a read of 'x' is a side
>>> effect, then two reads of 'x' are two side effects that could conflict
>>> if they occur simultaneously. That is, since an implementation can
>>> say that a read of 'x' increments a foo counter elsewhere, then two
>>> simultaneous reads can result in two simultaneous increments of the
>>> foo counter, which blows up the computer.

>>
>> IMO this conclusion is wrong. The consequences of volatile access (ie,
>> the extra-linguistic side effects) are outside the domain of 6.5p2,
>> because it is concerned only with program expressions, not other
>> unknown memory changes. This view is explained in more detail in my
>> response to Ben Bacarisse in this thread.

>
> I read the other response. [...snip...]

Then apparently you didn't understand it.

Tim Rentsch

 12-22-2012
glen herrmannsfeldt <(E-Mail Removed)> writes:

> Tim Rentsch <(E-Mail Removed)> wrote:
>> Ben Bacarisse <(E-Mail Removed)> writes:
>>> Tim Rentsch <(E-Mail Removed)> writes:

>
> (snip)
>>>> The answer is no. Arbitrary behavior may arise as a result, but
>>>> implementations are obliged to behave as they would if the side
>>>> effects were ordinary, non-interfering side effects. The term
>>>> "undefined behavior" is just a shorthand for saying something
>>>> about what an implementation may do. The actual behavior for
>>>> this example (and indeed for any volatile-qualified access at
>>>> all) is _unconstrained_, but the behavior is not _undefined_ in
>>>> the sense that the Standard uses the term 'undefined behavior'.

>
>>> I'm having trouble squaring this with the wording of 6.5 p2:

>
>>> "If a side effect on a scalar object is unsequenced relative to either
>>> a different side effect on the same scalar object or a value
>>> computation using the value of the same scalar object, the behavior is
>>> undefined."

>
>>> Or maybe I'm having trouble understanding the point you are making.

>
> Before there was "volatile", there was PL/I and the ABNORMAL attribute.
>
> Now, PL/I had multitasking pretty much from the beginning, so there
> was always a way that a variable could change at unexpected times.
>
> A compiler wasn't supposed to optimize, for example, A+A as 2*A,
> as A might change.

I don't have a PL/I specification readily available, so
I won't try to compare ABNORMAL to volatile.

> I don't know C11 well at all, has multitasking, or multithreading,

Yes but volatile has been in standard C since before multithreading
was included, so it isn't really affected by that.

> Is there a way, within a C program (not counting I/O registers
> and such) for a variable to change within a statement, other
> than as side effects of that statement?

Not in a way that's defined by the Standard, no. (And ignoring
threading, which doesn't bear on the current discussion.)

> If so, then the compiler has to allow for that.
>
> Otherwise, it seems to me, that "volatile" has to be
> implementation defined.

It might seem that way but the Standard is very clear that this
isn't so. The consequences, or even potential consequences, of
any volatile access are unknown to the implementation, and this
is explicit in the Standard.

> As a door that an implementation can use in implementation specific
> ways. If an implementation allows for variables to be I/O
> registers, then the compiler has to compile as appropriate for
> that case.

The only rule is that access to an object through a volatile-qualified
type must be done "naively", ie, according to straightforward rules
of expression evaluation and not optimized out. (The Standard has
a more precise definition, but this is the gist.) Anything beyond
that must be managed by the programmer, not the implementation.
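
One common illustration of "not optimized out" is a flag written by a signal handler, sketched below under stated assumptions: the names `got_signal`, `on_signal` and `wait_for_flag` are hypothetical, and the signal is raised synchronously with `raise` so the sketch terminates on a hosted system.

```c
#include <assert.h>
#include <signal.h>

/* Flag stored to by the handler. volatile sig_atomic_t is the type
   the Standard allows a signal handler to assign to. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT, then loops until the flag is
   seen. Returns 1 on success, -1 if setup fails. Because the flag is
   volatile, every loop test must re-read it. */
int wait_for_flag(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    while (!got_signal) {
        /* nothing to do; the condition re-reads got_signal */
    }
    return (int)got_signal;
}
```

Without the `volatile` qualifier, a compiler could read the flag once and treat the loop condition as constant; with it, each read has to be performed as written.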

>> The difference is subtle, so let me take another run at explaining
>> it.

>
>> Suppose C had a provision for "magic functions". A magic function
>> is declared and defined just the same way that ordinary C functions
>> are (perhaps with an additional 'magic' keyword), and are called
>> the same way as ordinary functions. However, outside of the
>> language, including both the Standard itself and also any aspects
>> known to any implementation, there is a way of setting magic
>> functions so that they activate a logic control wire whose purpose
>> (and consequences) are unknown as far as the Standard is concerned
>> (again including both portable behavior and implementation-defined
>> behavior).

>
> Stretching this farther than I probably should, consider a function
> in the same file as its call, and that the function doesn't do
> anything, as the compiler can plainly see. Now, consider that
> one might use a linkage editor (the OS/360 linker can do this)
> to later replace that function with a different one.

A similar idea. I don't think the exact mechanism is important,
as long as it is outside the domain of what the implementation
(eg, compiler) knows.

>> This activation takes place whenever a magic function
>> is called. Even though the Standard doesn't know what will happen,
>> it wants to guarantee that the associated logic control wires are
>> activated, so it stipulates that calling a magic function is always
>> required, even if, eg, the function body is empty, and if that is
>> known at every point of call.

>
> There is much discussion on comp.lang.fortran on what compilers
> might do when optimizing function calls. Seems like in Fortran,
> a compiler is allowed to optimize out the call with very little
> reason, consider:
>
> x=0*fclose(out);

The C Standard apparently is more demanding; optimizations
are allowed only when they don't change the results of what
the program naively does (again the Standard defines this
condition more precisely).

>> Under these hypothetical conditions, calling a magic function has,
>> at least potentially, the same consequences as undefined behavior
>> does. However, the act of calling a magic function does not give
>> an implementation any license to ignore or violate requirements.
>> This is true because, even though calling a magic function might
>> do something horrible, _it also might not_, and implementations
>> must proceed just as they would if nothing bad has happened,
>> because as far as they know that could be true.

>
> OK, but there is no "magic" keyword to apply to functions.
> So, should compilers always call functions?

In the hypothetical example language there is a special keyword
for magic functions, so this question doesn't apply.

>> It's important to remember what 'undefined behavior' means, which
>> is a statement about how implementations may behave. Calling a
>> magic function has unlimited consequences in how an /execution/ may
>> behave, but that doesn't eliminate any requirements for how the
>> /implementation/ must behave. Statements in the Standard are
>> really about what implementations (ie, mostly compilers) do, not
>> about what happens during execution. Even though calling a magic
>> function has potentially unlimited consequences during execution,
>> it doesn't change what an implementation is obliged to do to
>> conform to the Standard's requirements. That is the key point.

>
> OK, but magic functions aren't defined (last I knew) in the
> standard.

Did you miss the word "Suppose" in what I wrote before? Magic
functions are a hypothetical construct, defined as described
above and putatively added to C, for the purpose of illustration.

> The "volatile" attribute is, but, as I understand, not well enough
> to say what it actually does.

The point of volatile is to impose additional requirements, or
limitations really, on how a C program may be compiled. Beyond these
requirements, no semantics are defined (beyond those of the access
itself) either by the Standard or by the implementation. Indeed, the
Standard says that volatile objects "may be modified in ways unknown
to the implementation or have other unknown side effects." What
happens upon accessing such objects is very explicitly outside the
domain both of the Standard and of the implementation.

>> To get back to volatile, the consequences of accessing a volatile
>> are just the same as calling a hypothetical magic function. It is
>> true that performing a volatile-qualified access counts as a side
>> effect, but not necessarily a side effect on the object being
>> accessed. The consequences of accessing a volatile object are
>> potentially horrible, and as a result something really bad might
>> happen, but here again _it also might not_. Because it is unknown,
>> and indeed unknowable, by fiat in the Standard, what the
>> consequences of a volatile-qualified access will be, implementations
>> must behave just as they would if the accesses in question did
>> nothing more than what an ordinary access would do. Ergo the
>> term 'undefined behavior' does not apply.

>
> OK, but it seems to me that the standard, separate from
> implementations, should only cover what standard conforming
> programs can do.

I understand this reaction, but it isn't really right. The
Standard is a specification for how implementations must
behave. How programs will behave is a consequence of what
the implementation must do (and also on being run on a
data-processing system capable of supporting a conforming
implementation, but this also is outside the scope of the
Standard).

> An implementation may allow variables to be I/O ports, and use
> the "volatile" keyword, but the standard does not have any such
> wording.

The point of declaring something 'volatile' is that how it
behaves when accessed is outside the domain of what the
implementation knows, and hence the implementation must
treat it in a particular way. In a sense, the word 'volatile'
says to the implementation, "You don't know what's going on
here, so don't imagine that you do."

>> Finally, about 6.5 p2. What's being referred to here are side
>> effects on scalar objects that occur because of language-defined
>> program actions. Accessing a volatile object is a side effect,
>> but it is not, for the purpose of 6.5 p2, a side effect on any
>> particular scalar object. Otherwise, the mere _declaration_ of
>> a volatile object would potentially provoke undefined behavior,
>> since such objects may be modified at any time, and nothing in
>> the Standard defines their sequencing.

>
> Yes. My feeling is that the keyword is there to allow for
> implementation defined behavior. [snip]

>> Also it may be good to
>> remember that the notion of sequencing is defined only for
>> evaluations done as part of defined C semantics (per 5.1.2.3 p3).
>> Any consequences of volatile-access-induced side effects fall
>> outside the domain of the sequencing rules, because those rules
>> pertain only to evaluations of program expressions (and then
>> only those in a single thread). Doing two read accesses of
>> a single volatile-qualified object might produce horrible
>> consequences (then again, so might only a single read access),
>> but even so there is no 'undefined behavior', in the sense
>> that the Standard uses the term, ie, about what is further
>> required of the implementation: the execution may go completely
>> askew, but that doesn't let, eg, the compiler off the hook for

>
> It does seem that the compiler should follow the code as
> written. If there is one reference to a variable, it should be
> referenced once, for the implementation dependent definition of
> reference.
>
> volatile int x;
> y=2*x;
> z=x+x;
>
> In this case, y should always be even, z has the possibility
> of not being even, and the compiler should allow for that.

Not even that. Even ignoring the possible undefined behavior
because of overflow, after the second statement is done
y could have any value at all, because accessing 'x' in
'z = x + x;' might have the side effect of storing into
y, and the compiler isn't allowed to know that or assume
that it doesn't happen. Any use of y after the second
assignment statement must refetch y, for just this reason.

Shao Miller

 12-22-2012
On 12/22/2012 01:54, Tim Rentsch wrote:
> Shao Miller <(E-Mail Removed)> writes:
>
>> On 12/21/2012 16:16, Tim Rentsch wrote:
>>> Shao Miller <(E-Mail Removed)> writes:
>>>
>>>> On 12/20/2012 11:00, Ken Brody wrote:
>>>>>
>>>>> Condensed version of the discussion so far, in this subthread:
>>>>>
>>>>> =====
>>>>>
>>>>> Given:
>>>>>
>>>>> extern volatile int x;
>>>>> int i = x + x;
>>>>>
>>>>> And citing several C&V from different standards regarding the fact
>>>>> merely accessing a volatile is a "side effect".
>>>>>
>>>>> Does the above invoke UB? (No sequence point between the two "side
>>>>> effects" of accessing "x".)
>>>>>
>>>>> =====
>>>>>
>>>>> The Standard also requires (5.1.2.3p2) that all side effects be
>>>>> "complete" at the next sequence point.
>>>>>
>>>>> Given that whatever side effect the access of "x" may have is outside
>>>>> the control of the abstract machine, I fail to see how the sequence
>>>>> point requirement applies to the side effect of accessing a volatile.
>>>>>
>>>>> Consider, for example, a memory-mapped I/O system, where reading from a
>>>>> given address causes the printer to start printing whatever is in its
>>>>> buffer. How can C enforce the "shall be complete" requirement of
>>>>> 5.1.2.3p2? How is "i=x;i+=x;" any better than "i=x+x;"?
>>>>>
>>>>
>>>> Yes, it invokes undefined behaviour. If a read of 'x' is a side
>>>> effect, then two reads of 'x' are two side effects that could conflict
>>>> if they occur simultaneously. That is, since an implementation can
>>>> say that a read of 'x' increments a foo counter elsewhere, then two
>>>> simultaneous reads can result in two simultaneous increments of the
>>>> foo counter, which blows up the computer.
>>>
>>> IMO this conclusion is wrong. The consequences of volatile access (ie,
>>> the extra-linguistic side effects) are outside the domain of 6.5p2,
>>> because it is concerned only with program expressions, not other
>>> unknown memory changes. This view is explained in more detail in my
>>> response to Ben Bacarisse in this thread.

>>
>> I read the other response. [...snip...]

>
> Then apparently you didn't understand it.
>

Well in that case, I'd really like to enhance that understanding, if
possible.

On 12/21/2012 16:09, Tim Rentsch wrote:
> Suppose C had a provision for "magic functions". A magic function
> is declared and defined just the same way that ordinary C functions
> are (perhaps with an additional 'magic' keyword), and are called
> the same way as ordinary functions. However, outside of the
> language, including both the Standard itself and also any aspects
> known to any implementation, there is a way of setting magic
> functions so that they activate a logic control wire whose purpose
> (and consequences) are unknown as far as the Standard is concerned
> (again including both portable behavior and implementation-defined
> behavior). This activation takes place whenever a magic function
> is called. Even though the Standard doesn't know what will happen,
> it wants to guarantee that the associated logic control wires are
> activated, so it stipulates that calling a magic function is always
> required, even if, eg, the function body is empty, and if that is
> known at every point of call.

This seems pretty clear. It also seems analogous to the semantics of
'volatile'; helping to ensure that any non-standard side effects are
correctly "wired," regardless of their existence.

> Under these hypothetical conditions, calling a magic function has,
> at least potentially, the same consequences as undefined behavior
> does. However, the act of calling a magic function does not give
> an implementation any license to ignore or violate requirements.
> This is true because, even though calling a magic function might
> do something horrible, _it also might not_, and implementations
> must proceed just as they would if nothing bad has happened,
> because as far as they know that could be true.

As in, the computer could melt, but that possibility is entirely outside
of the scope of C, and we must discuss and behave as though it won't.
The phrase "ignore or violate requirements" is interesting. One apparent
requirement from the definitions is that a read or store be considered
an access. If that's a mistaken interpretation, so be it.

> It's important to remember what 'undefined behavior' means, which
> is a statement about how implementations may behave. Calling a
> magic function has unlimited consequences in how an /execution/ may
> behave, but that doesn't eliminate any requirements for how the
> /implementation/ must behave. Statements in the Standard are
> really about what implementations (ie, mostly compilers) do, not
> about what happens during execution. Even though calling a magic
> function has potentially unlimited consequences during execution,
> it doesn't change what an implementation is obliged to do to
> conform to the Standard's requirements. That is the key point.

I remember there was a toy that was a box and a switch. When you
flipped the switch, a false hand would come slowly creeping out of the
box, with a finger aimed at the switch. At some point it'd flip the
switch and a release would cause the hand to be instantly retracted back
into the box. If this toy was programmed in C, we might hope that the
implementation was conforming even though the Nth write to the
"keep_going" volatile scalar was always mysteriously tied to a
power-loss event.

> To get back to volatile, the consequences of accessing a volatile
> are just the same as calling a hypothetical magic function. It is
> true that performing a volatile-qualified access counts as a side
> effect, but not necessarily a side effect on the object being
> accessed.

This would seem to me to be more of a key point than the last one. If
one accepts this, then it is easy to agree that there _might_ not be any
"trouble" with the Standard's discussion of unsequenced side effects on
the same scalar.

But why not? Is the suggestion that the wording of N1570's 5.1.2.3p2
does not match the authors' intentions?

"Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects,12) which are changes in the state of the execution environment.
Evaluation of an expression in general includes both value computations
and initiation of side effects. Value computation for an lvalue
expression includes determining the identity of the designated object."

Maybe something more like:

"Modifying an object, modifying a file, or calling a function that
does any of those operations are all side effects,12) which are changes
in the state of the execution environment. In addition, accessing a
volatile object may cause side effects that may result in
implementation-defined or undefined behavior, but such side effects are
either implementation-defined or outside of the scope of this standard.
Evaluation of an expression in general includes both value computations
and initiation of side effects. Value computation for an lvalue
expression includes determining the identity of the designated object."

> The consequences of accessing a volatile object are
> potentially horrible, and as a result something really bad might
> happen, but here again _it also might not_. Because it is unknown,
> and indeed unknowable, by fiat in the Standard, what the
> consequences of a volatile-qualified access will be, implementations
> must behave just as they would if the accesses in question did
> nothing more than what an ordinary access would do. Ergo the
> term 'undefined behavior' does not apply.

interpreted literally) enjoy the fantastic new sequencing semantics,
such as the discussions of side effects having been completed before a
sequence point?

If a volatile _read_ was not a side effect, but might possibly initiate
one that we shouldn't worry about, then perhaps given:

volatile int x;
x = 13;
int y = x + x;
x = 42;

we could actually get away with:

1. set value of x to 13
2. side effect for 1
3. sequence point
4. read value of x
5. side effect for 4
6. read value of x and add the value from 4 to compute initial value
for y
7. set value of y
8. sequence point
9. side effect that was skipped for 6 (hey, we don't know!)
10. set value of x to 42
11. side effect for 10

Now the 9 and 11 might happen simultaneously. They might have nothing
to do with the value of the scalar 'x' itself, but they are able to
collide. Same potential for collision if a read _is_ always a side
effect (even if the value of scalar 'x' still isn't influenced by that
side effect):

1. set value of x to 13
2. side effect for 1
3. sequence point
4. read value of x
5. side effect for 4
6. read value of x and add the value from 4 to compute initial value
for y
7. side effect for 6
8. set value of y
9. sequence point
10. set value of x to 42
11. side effect for 10

Here, 5 and 7 might conflict, except now we can point one finger at the
programmer and another to the Standard and with raised voice and
eyebrows low, tell the programmer they're engaging in undefined
behaviour. By keeping side effects nice and separate, don't we enjoy a
little more safety, or else take our chances?

I'm hoping that understanding and agreement are two different things,
here. It's a lengthy post, but please do clarify, if and only if you
have an opportunity. Thank you for your time.

- Shao Miller

glen herrmannsfeldt
Guest
Posts: n/a

 12-22-2012
Tim Rentsch <(E-Mail Removed)> wrote:

(snip, I wrote)
>> Before there was "volatile", there was PL/I and the ABNORMAL attribute.

>> Now, PL/I had multitasking pretty much from the beginning, so there
>> was always a way that a variable could change at unexpected times.

>> A compiler wasn't supposed to optimize, for example, A+A as 2*A,
>> as A might change.

> I don't have a PL/I specification readily available, so
> I won't try to compare ABNORMAL to volatile.

http://bitsavers.trailing-edge.com/p...ions_Jul65.pdf

This is, as I understand it, how PL/I is supposed to work, separate from
any specific implementation. Separate manuals describe it as
implemented.

Among others that I had forgotten, it applies to procedures and to
variables, or at least did in 1965.

"Rules for abnormality in procedures:

1. Abnormality is a property of both external and internal
procedures. Blocks invoking procedures that are abnormal must be
within the scope of an ABNORMAL, USES, or SETS declaration for the
invoked entry name. However, the invocation of an abnormal procedure
does not make the invoking procedure itself abnormal. These
attributes enable program optimization to be performed.

2. An external procedure is abnormal if it or any procedure invoked
by it:

a. Access, modify, allocate, or free external data.
b. Modify, allocate, or free their arguments.
c. Return inconsistent function values for the same argument values.
d. Maintain any kind of history.
e. Perform input/output operations.
f. Return control from the procedure by means of a GOTO statement.
3. An internal procedure is abnormal:
a. Under any condition listed above for external procedures.
b. If it, or any procedure called by it, access, modify, allocate,
or free variables declared in an outer block.
4. Abnormal external procedures invoked as functions must be declared
with at least one of the attributes, ABNORMAL, USES, or SETS. The
scope of this declaration must include the invoking block.
5. ABNORMAL used alone specifies that all possible types of abnormality
should be assumed. It is unnecessary to specify ABNORMAL for the
built-in functions, TIME and DATE.
6. The NORMAL attribute specifies that the entry name is for a
procedure that is not abnormal."

That part is pretty interesting. I know that many programs did those
things without the ABNORMAL attribute, but then maybe it is the default
for procedures.

Onto variables:

"Rules for abnormal data:

1. The ABNORMAL attribute may be declared for any variable.
2. The ABNORMAL attribute specifies that a variable may be altered or
otherwise accessed at an unpredictable time during execution of a
program. The situation might occur, for example, during the
execution of an ON-unit as described in "The ON Statement," in
Chapter 8.
3. Every time ABNORMAL data is referred to, its associated storage
contains its current value."

Much simpler than for procedures. Anyway:

"Default for abnormality of procedures:

If an external entry name appears only as a function reference, the
entry name is assumed to have the NORMAL attribute; otherwise, the
entry name is assumed to be ABNORMAL. Entry names of all internal
procedures and entry names of external procedures invoked in a CALL
statement are assumed to have the ABNORMAL attribute.

Default for abnormality of data:

Variables are assumed to be NORMAL, except structures containing
ABNORMAL elements; such structures may not be declared to be NORMAL."

>> I don't know C11 well at all, has multitasking, or multithreading,

> Yes but volatile has been in standard C since before multithreading
> was included, so it isn't really affected by that.

>> Is there a way, within a C program (not counting I/O registers
>> and such) for a variable to change within a statement, other
>> than as side effects of that statement?

> Not in a way that's defined by the Standard, no. (And ignoring
> threading, which doesn't bear on the current discussion.)

But it is convenient that multithreading does allow variables to change
at unexpected (in the statement where they might be used) times.

>> If so, then the compiler has to allow for that.

>> Otherwise, it seems to me, that "volatile" has to be
>> implementation defined.

> It might seem that way but the Standard is very clear that this
> isn't so. The consequences, or even potential consequences, of
> any volatile access are unknown to the implementation, and this
> is explicit in the Standard.

In that case, compilers should just give up.

To make the discussion more interesting, are "volatile" variables
allowed to be partially modified? Consider one that is more than one
byte long, and the bytes are not written with any interlock. An
interrupt could occur while one is only partly updated.

S/370 has CAS, Compare and Swap, for interlocked updating of memory.
Other architectures have similar ways of updating storage. Maybe
compilers should use that?

>> As a door that an implementation can use in implementation-specific
>> ways. If an implementation allows for variables to be I/O
>> registers, then the compiler has to compile as appropriate for
>> that case.

> The only rule is that access to an object through a volatile-qualified
> type must be done "naively", ie, according to straightforward rules
> of expression evaluation and not optimized out. (The Standard has
> a more precise definition, but this is the gist.) Anything beyond
> that must be managed by the programmer, not the implementation.

But if you can't say which ways data might be modified, then it is
pretty hard to expect compilers to account for those ways.

(snip, I wrote)
>> Stretching this farther than I probably should, consider a function
>> in the same file as its call, and that the function doesn't do
>> anything, as the compiler can plainly see. Now, consider that
>> one might use a linkage editor (the OS/360 linker can do this)
>> to later replace that function with a different one.

> A similar idea. I don't think the exact mechanism is important,
> as long as it is outside the domain of what the implementation
> (eg, compiler) knows.

OK, sounds good to me.

(snip)

>> x=0*fclose(out);

> The C Standard apparently is more demanding; optimizations
> are allowed only when they don't change the results of what
> the program naively does (again the Standard defines this
> condition more precisely).

Would be interesting to have "volatile" and "nonvolatile" attribute
for functions.

(snip)

> Did you miss the word "Suppose" in what I wrote before? Magic
> functions are a hypothetical construct, defined as described
> above and putatively added to C, for the purpose of illustration.

Maybe.

>> The "volatile" attribute is, but, as I understand, not well enough
>> to say what it actually does.

> The point of volatile is to impose additional requirements, or
> limitations really, on how a C program may be compiled. Beyond these
> requirements, no semantics are defined (beyond those of the access
> itself) either by the Standard or by the implementation. Indeed, the
> Standard says that volatile objects "may be modified in ways unknown
> to the implementation or have other unknown side effects." What
> happens upon accessing such objects is very explicitly outside the
> domain both of the Standard and of the implementation.

(snip)

>> An implementation may allow variables to be I/O ports, and use
>> the "volatile" keyword, but the standard does not have any such
>> wording.

> The point of declaring something 'volatile' is that how it
> behaves when accessed is outside the domain of what the
> implementation knows, and hence the implementation must
> treat it in a particular way. In a sense, the word 'volatile'
> says to the implementation, "You don't know what's going on
> here, so don't imagine that you do."

But the implementation has to make some assumptions.

or that other accesses are atomic.

(snip)

>> Yes. My feeling is that the keyword is there to allow for
>> implementation defined behavior. [snip]

> The Standard contradicts this idea.

(snip)

>> It does seem that the compiler should follow the code as
>> written. If there is one reference to a variable, it should be
>> referenced once, for the implementation dependent definition of
>> reference.

>> volatile int x;
>> y=2*x;
>> z=x+x;

>> In this case, y should always be even, z has the possibility
>> of not being even, and the compiler should allow for that.

> Not even that. Even ignoring the possible undefined behavior
> because of overflow, after the second statement is done
> y could have any value at all, because accessing 'x' in
> 'z = x + x;' might have the side effect of storing into
> y, and the compiler isn't allowed to know that or assume
> that it doesn't happen. Any use of y after the second
> assignment statement must refetch y, for just this reason.

Even if y isn't volatile?

(I specifically only made x volatile.)

-- glen

Tim Rentsch
Guest
Posts: n/a

 12-22-2012
Shao Miller <(E-Mail Removed)> writes:

> On 12/22/2012 01:54, Tim Rentsch wrote:
>> Shao Miller <(E-Mail Removed)> writes:
>>
>>>> [..snip..snip..snip..]
>>>
>>> I read the other response. [...snip...]

>>
>> Then apparently you didn't understand it.

>
> Well in that case, I'd really like to enhance that understanding,
> if possible. [snip]

My suggestions are: read more carefully; think more deeply; try
to organize your thoughts more systematically; and make an effort
in your writing to express ideas more clearly and more concisely.