Is the aliasing rule symmetric?

 
 
Johannes Schaub (litb)
 
      01-25-2011
Joshua Maurice wrote:

> On Jan 24, 8:10 pm, Ben Bacarisse <(E-Mail Removed)> wrote:
>> Joshua Maurice <(E-Mail Removed)> writes:
>> > On Jan 24, 7:15 pm, Ben Bacarisse <(E-Mail Removed)> wrote:
>> >> > What about the following?

>>
>> >> > #include <cstdlib>
>> >> > using namespace std;
>> >> > int main()
>> >> > {
>> >> > void* p = malloc(sizeof(int) + sizeof(float));
>> >> > int*x = (int*) p;
>> >> > *x = 1;
>> >> > float* y = (float*) p;
>> >> > *y = 1;
>> >> > }

>>
>> >> Again, in C, this is well-defined due to the definition of effective
>> >> type.

>>
>> >> > I've had a thread up on comp.std.c++ for a while now about these
>> >> > issues, and I've gotten 0 replies. It's quite frustrating.

>>
>> >> > In short, I would argue that both of the above programs have no UB
>> >> > in C ++, nor their equivalent program in C.

>>
>> >> <snip much more C++ specific questions>

>>
>> > I'd like to think that the rest of the questions applied to C as well
>> > as C++.

>>
>> At that point you quoted some passages from the C++ standard and started
>> using phrases like "reused" which seems to be key to the C++ behaviour
>> but does not crop up in C. I worried that C-specific answers beyond
>> that point might just confuse matters. I got the feeling your real
>> worries were about whether C++ defined the code you were posting about.
>>
>> > Surely simple things like:
>> > #include <stdlib.h>
>> > int main()
>> > {
>> > void* p = malloc(sizeof(int) + sizeof(float));
>> > * ( (float*) p ) = 1;
>> > * ( (int*) p ) = 1;
>> > return * ( (int*) p );
>> > }

>>
>> That's fine in C. It is not really different from the example I did
>> comment on -- switching to a cast expression from an initialised
>> variable does not alter the meaning.
>>
>> > either work in C and C++, or work in neither. The above is a well
>> > formed C program and a well formed C++ program. I would be slightly
>> > surprised if it had UB in one and not UB in the other.

>>
>> So would I, but the C++ standard uses different language from the C
>> standard about the validity of such accesses so a difference (even an
>> unintended one) is possible.
>>
>> > I would also like to hear some of the actual people on the committees
>> > weigh in on these particular questions. I've been asking questions
>> > like these for a while now, and I have yet to hear compelling answers
>> > from the people on the committees or those who would know, and/or the
>> > actual compiler writers.

>>
>> Have you got some reason to suspect that there is a problem with any of
>> these programs in C? The C standard seems quite clear on these specific
>> questions.

>
> Yes. I've been getting various replies when I tweak the above program
> just slightly.
>
> #include <stdlib.h>
> void foo(int* a, float* b)
> {
> *a = 1;
> *b = 1;
> }
> int main()
> {
> void* p = malloc(sizeof(int) + sizeof(float));
> foo((int*)p, (float*)p);
> }
>
> I asked if this had UB in comp.lang.c a while ago. I received various
> replies, with little follow up discussion.
>
> One reply was that a piece of memory may have at most one effective
> type between calls to malloc and free.
>


No piece of memory (or piece of storage, for that matter) ever has a
specific type, so that point of view is wrong. The spec says when defining
"object":

"NOTE When referenced, an object may be interpreted as having a particular
type; see 6.3.2.1."

And

"The effective type of an object for an access to its stored value is the
declared type of the object, if any. [...]".

So all you have is a type that an access to an object may have, a type
that an lvalue referring to an object may have, and a type that you may have
declared an object to have. But the object will not have a type at runtime.

The C Rationale document is clear on that too:

"The definition of object does not employ the notion of type. Thus an
object has no type in and of itself. However, since an object may only be
designated by an lvalue (see §3.2.2.1), the phrase ``the type of an object''
is taken to mean, here and in the Standard, ``the type of the lvalue
designating this object,'' and ``the value of an object'' means ``the
contents of the object interpreted as a value of the type of the lvalue
designating the object.''"

> Another reply was that the above program has perfectly well defined
> behavior, but the following has undefined behavior:
> #include <stdlib.h>
> int foo(int* a, float* b)
> {
> *a = 1;
> *b = 1;
> return *a;
> }
> int main()
> {
> void* p = malloc(sizeof(int) + sizeof(float));
> foo((int*)p, (float*)p);
> }
> Specifically, this example explains how the compiler might use
> aliasing analysis for optimization purposes. A conforming compiler may
> not simply assume that an int* and a float* do not alias. However, if
> analysis shows that aliasing would result in UB (as it would in the
> above program when "return *a;" reads a float object through an int
> lvalue) then the compiler is free to do whatever it wants in the face
> of the UB, including assume that they don't alias.
>


Regardless of what compilers do, I think the last reply is the correct one.
The initial example has well-defined behavior, but this example has
undefined behavior, for exactly the reasons you give.
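
For reference, the C reading of the two cases can be condensed into one sketch. This is only an illustration of the effective type rules (C99 6.5p6-7) as I understand them; the function name effective_type_demo is made up for the example, and it says nothing about the C++ side of the question.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if every read saw the value last stored. */
static int effective_type_demo(void)
{
    void *p = malloc(sizeof(int) + sizeof(float));
    if (p == NULL)
        return 0;

    *(int *)p = 1;            /* store: effective type of the storage is now int */
    assert(*(int *)p == 1);   /* read through int: matches the effective type */

    *(float *)p = 2.0f;       /* store: effective type is now float */
    float f = *(float *)p;    /* read through float: matches the effective type */

    /* Reading *(int *)p at this point would access a float object through
       an int lvalue, which is the undefined case from the second example. */

    free(p);
    return f == 2.0f;
}
```

Each read here uses the type of the most recent store, which is what separates it from the version of foo that returns *a after writing *b.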

 
 
 
 
 
Joshua Maurice
 
      01-25-2011
On Jan 24, 7:15 pm, Ben Bacarisse <(E-Mail Removed)> wrote:
> <snip much more C++ specific questions>


Let me phrase it as a C /and/ C++ question then. Consider the
following program and questions, in the context of C and C++, because
I don't know the answer in the context of either language.

#include <stdlib.h>

typedef struct T1 { int x; int y; } T1;
typedef struct T2 { int x; int y; } T2;

int main()
{
void* p = 0;
T1 * t1 = 0;
T2 * t2 = 0;
int * x = 0;

if (sizeof(T1) != sizeof(T2))
return 1;
if ( (char*)(& t1->y) - (char*) (t1) != (char*)(& t2->y) -
(char*) (t2) )
return 1;

p = malloc(sizeof(T1));
/* Do we have a T1 object here? Presumably no. Otherwise we also
have a T2 object here, and we definitely don't want to start talking
about two distinct complete objects occupying the same storage at the
same time. */

t1 = (T1*) p;
/* T1 object yet? Presumably the answer hasn't changed since the
above comment. */

x = & t1->x;
/* T1 object yet? */

*x = 1;
/* Do we have a T1 object here? Maybe. I just see a write through
an int lvalue. I see no writes nor reads through a T1 lvalue. I see
nothing that favors T1 over T2, besides some sort of data dependency
analysis through the member-of operator. However, there isn't even a
hint of data dependency analysis in either standard with regards to
object lifetime rules. */

x = & t1->y;
*x = 2;
/* Do we have a T1 object here? The answer must be yes, or we'll
never have a T1 object. However, again, I see nothing to favor having
a T1 object over a T2 object besides data dependency analysis through
the member-of operator. */

t2 = (T2*) p;
return t2->y; /* Is this UB? Why? Why is reading "t1->y" not UB, but
reading "t2->y" is UB? In other words, why do we have a T1 object, but
not a T2 object? */
}

Also, what if we used offsetof hackery to initialize both int members
of the T1 object without using a member-of operator on a T1 lvalue?

As far as I can tell, gcc doesn't even bother doing aliasing analysis
on anything besides primitive types, for exactly the reasons outlined
above. They must not have seen a sensible way to differentiate between
T1 and T2, just as I cannot.
 
 
 
 
 
Joshua Maurice
 
      01-25-2011
On Jan 24, 10:04 pm, "Johannes Schaub (litb)" <(E-Mail Removed)>
wrote:
> Joshua Maurice wrote:
> > On Jan 24, 8:33 pm, "Johannes Schaub (litb)" <(E-Mail Removed)>
> >> int a;
> >> *(float*)&a = 0;

>
> >> In C, this violates the aliasing rules and is undefined behavior,

>
> > It does? I must be confused then. Ignoring alignment and size issues,
> > the write ends the lifetime of the int object, and starts the lifetime
> > of a float object - the effective type rules.

>
> > Any further read of 'a' would be a read of a float object through an
> > int lvalue, which would be UB.

>
> The spec says that only of objects that have no declared type. In this case,
> we declared it to be an int, so the effective type doesn't change.
>
> "The effective type of an object for an access to its stored value is the
> declared type of the object, if any. If a value is stored into an object
> having no declared type [...]".


Mmm. I see. Thank you.

Not that I would actually write this code, but I'm curious what I have
to work with when sorting through both standards trying to come up with
reasonable interpretations.
 
 
Johannes Schaub (litb)
 
      01-25-2011
Joshua Maurice wrote:

> On Jan 24, 8:33 pm, "Johannes Schaub (litb)" <(E-Mail Removed)>
> wrote:
>> C++ doesn't have the concept of effective types, and the following is my
>> personal perception of the issue. Rather, in C++ the objects themselves
>> have a type, while in C types are merely an attribute of the access to
>> objects. So an object cannot exist without a type in C++; it wouldn't
>> make sense with the current model. The behavior of the above code is not
>> clearly defined in C++. Not even that other DR we talked about in
>> comp.std.c++ fixes it, because we are not really copying an object
>> representation over in this case.

>
> So, can you write a memory allocator which reuses memory in pure
> conforming C++ on top of new and delete, or on top of malloc and free?
> I'd like to think that you can, and I don't think that you can unless
> the following program has no UB.
>
> #include <stdlib.h>
> int main()
> {
> void* p = malloc(sizeof(int) + sizeof(float));
> * ( (float*) p ) = 1;
> * ( (int*) p ) = 1;
> return * ( (int*) p );
> }
>


As per my assumptions this is fine in C++. To be sure, you can use placement
new, because 1.8p1 explicitly mentions it as one of the ways to create a new
object.

To be clear: I don't think the spec accurately describes the behavior when
you use plain writes. As you pointed out on comp.std.c++, it currently says
that merely having storage that fits in size and alignment is enough to
start lifetime. This just makes no sense.

On the other hand, 3.8 can be interpreted the following way:

- An object of type T exists that was created in one of the ways described
by 1.8p1, but its lifetime has not necessarily started yet.
- If T is a non-class type or a type with a trivial ctor, its lifetime started
at that point. Otherwise, the ctor must first be called and completed.

Now this makes perfect sense for declared objects and objects created by
new-expressions or by temporaries. But it doesn't make sense if you try to
apply this model to malloc/free, because 1.8p1 doesn't define what type such
an object has. See this one:
https://groups.google.com/group/comp...82227634f15895

>> In the end here's another interesting difference in C and C++. In C and
>> C++ the following have different meanings, if you go with the above
>> explanation:
>>
>> int a;
>> *(float*)&a = 0;
>>
>> In C, this violates the aliasing rules and is undefined behavior,

>
> It does? I must be confused then. Ignoring alignment and size issues,
> the write ends the lifetime of the int object, and starts the lifetime
> of a float object - the effective type rules.
>
> Any further read of 'a' would be a read of a float object through an
> int lvalue, which would be UB.
>


The spec says that only of objects that have no declared type. In this case,
we declared it to be an int, so the effective type doesn't change.

"The effective type of an object for an access to its stored value is the
declared type of the object, if any. If a value is stored into an object
having no declared type [...]".

 
 
James Kanze
 
      01-25-2011
On Jan 24, 11:44 pm, Joshua Maurice <(E-Mail Removed)> wrote:
> On Jan 24, 2:49 am, James Kanze <(E-Mail Removed)> wrote:


> > On Jan 22, 10:29 pm, Joshua Maurice <(E-Mail Removed)> wrote:
> > > Let's consider this function though:
> > > int foo(int* x, short* y)
> > > {
> > > *x = 1;
> > > *y = 2;
> > > return 1;
> > > }
> > > int bar(int* x, short* y)
> > > {
> > > *x = 1;
> > > *y = 2;
> > > return *x;
> > > }
> > > Let's consider functions foo and bar. Let's suppose that x and
> > > y alias in both. For function foo, there is no undefined
> > > behavior even though both alias (at least according to what
> > > appears to be the prominent interpretation of these rules).


> > I'm not sure about C here, but in C++, there is definitely
> > undefined behavior in foo if x and y alias. In fact, there
> > would be undefined behavior even if foo were simply:


> > void foo(const int* x, const short* y)
> > {
> > printf("%d, %d\n", *x, *y);
> > }


> > If the two pointers point to the same physical address, there is
> > no way that the memory they point to can be both an int and
> > a short. And the C++ standard clearly says:
> > If a program attempts to access the stored value of an
> > object through an lvalue of other than one of the following
> > types the behavior is undefined:
> > [...]
> > and short for int or vice versa isn't in the list. And I'm
> > certain that the intent in C is the same: C definitely allows
> > trapping representations for integer values, and reading part of
> > an int as a short could conceivably result in a trapping
> > representation for a short. (Think of a one's complement
> > machine which traps on -0.)


> > The problem becomes more interesting if we replace short with
> > unsigned char. In that case, my version is legal and defined
> > behavior: accessing a stored value through an lvalue of char or
> > unsigned char type is in the list after the cited paragraph.
> > (IIRC, in C, this exception only applies to unsigned char; for
> > some reason, C++ added plain char to the list.) But what about
> > the original version, which modifies? Is modifying an int
> > through an unsigned char* undefined behavior? What if it
> > results in a trapping representation in the int? Or is it just
> > undefined behavior if you access the int? And of course,
> > modifying all of the bytes in the int, from a bytewise copy of
> > another int, has to be fully defined behavior.


> Is the following a well-formed C++ program without UB?


> #include <cstdlib>
> using namespace std;
> int main()
> {
> void* p = malloc(sizeof(int) + sizeof(float));
> int*x = (int*) p;
> *x = 1;
> }


Let's hope so.

Seriously, in C++ at least, a POD "exists" as soon as the memory
for it is allocated. I think the standard could be clearer, but
I'm pretty sure that the intent is that memory allocated with
malloc (or with the operator new function) is potentially an
object of any POD type which fits, and becomes an object of
a specific POD type when it is used as such. The assignment to
*x means that the memory between p and p + sizeof(int) contains
an int object (and that using it as any other type of object is
undefined behavior).

Again, an interesting case is something like:

void f(float v)
{
void* p = malloc(sizeof(float));
memcpy(p, &v, sizeof(float));
float* pf = (float*)p;
printf("%.5f\n", *pf);
}

What is the type of the object starting at p? IIRC, the
specification of memcpy says that it copies "as if" through
unsigned char*, so we've effectively used the object as an
unsigned char[]. So using it as a float would seem to violate
3.10/15 (in the C++ standard). IMHO, the above *should* be
legal and well defined; i.e. if I call f(3.14159), the above
should display "3.14159". But I'm not sure that this is the
case as the standard is currently written. 3.8 is very clear
that the lifetime of an object of type T begins when storage
with the proper alignment and size is obtained, but with malloc
or the operator new function (both of which return a void*, and
guarantee alignment sufficient for any type of object), what is
the type of the object whose lifetime has just begun?
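
On the C side, my understanding is that 6.5p6 covers this case: when memcpy copies into an object with no declared type, the destination takes on the effective type of the source for subsequent reads. A minimal sketch under that assumption follows. The name roundtrip_via_memcpy is invented here, and whether the C++ wording gives the same guarantee is exactly the open question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies v into malloc'd storage and reads it back. */
static float roundtrip_via_memcpy(float v)
{
    float out = v;                 /* fallback value if allocation fails */
    void *p = malloc(sizeof v);
    if (p != NULL) {
        memcpy(p, &v, sizeof v);   /* C: storage takes effective type float */
        out = *(float *)p;         /* read through a float lvalue */
        free(p);
    }
    /* The bytes must survive the round trip unchanged. */
    assert(memcmp(&out, &v, sizeof v) == 0);
    return out;
}
```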

> What about the following?


> #include <cstdlib>
> using namespace std;
> int main()
> {
> void* p = malloc(sizeof(int) + sizeof(float));
> int*x = (int*) p;
> *x = 1;
> float* y = (float*) p;
> *y = 1;
> }


I think this is covered by the last bullet in 3.8: "The
lifetime of an object of type T ends when: [...]-- the storage
which the object occupies is reused or released." Although it's
not nearly as clear as it should be, I would consider the last
assignment as "reusing" the storage as a float, so that the int
object at p ceases to exist, and a new object with float type
comes into existence. In the case of memory obtained by means
of malloc or the operator new function, I think the standard
*should* say (but doesn't) that an object of type T only begins
to exist when the raw memory is initialized as a type T, and
that reinitializing it with a different type causes a new object
to begin to exist. In sum, such memory behaves very much like
a union of all types which fit, and you can only access the last
assigned value. (But you also need special wording for cases
where the "object" is initialized using memcpy or something
similar---writing through an unsigned char*. While the current
wording seems defective, it's hard to find adequate wording.)

> I've had a thread up on comp.std.c++ for a while now about these
> issues, and I've gotten 0 replies. It's quite frustrating.


> In short, I would argue that both of the above programs have no UB in C
> ++, nor their equivalent program in C. You need both programs above to
> have no UB in order to have user-space memory allocators in standard
> conforming C++. I think that the standard's intent is not to forbid
> user-space C++ standard conforming pooling memory allocators.


> Let's look at "3.8 Object lifetime / 1, 2, 4, 5, 6, and 7". Each of
> those sections makes reference to "reusing storage", something which is
> distinct from "releasing the storage". "Reusing the storage" of an
> object ends that object's lifetime. What else can this mean besides
> the following?


> void* p = malloc(sizeof(int) + sizeof(float));
> int*x = (int*) p;
> *x = 1;
> float* y = (float*) p;
> *y = 1; /* reuse of storage, the int object's lifetime ends, and
> the float object's lifetime begins */


Agreed. The real questions are: what is the type of the object
between the malloc and the *x = 1 statement (since an object
lifetime has apparently begun, according to 3., and (more
importantly) what about the case where you initialize using
something like memcpy?

> Furthermore, let's look at the rules in "3.8 Object Lifetime". "3.8
> Object Lifetime / 1" is actually nonsensical as written. Consider:
> void* p = malloc(sizeof(char));
> Well, we've allocated storage with proper alignment and size for an
> arbitrarily large number of types, and if those types have a trivial
> constructor, such as:
> struct T1 {};
> struct T2 {};
> struct T3 {};
> //etc.
> then an object of each of those types exists at that location. So, an
> arbitrarily large number of distinct complete objects coexist in "*p"
> according to that reading of the rules, which is entirely
> nonsensical.


I see we're thinking along the same lines.

> Unfortunately, as I've expounded at length in the thread on
> comp.std.c++, the sensible way forward isn't clear. However,
> some of the proposed changes to C++0x in 3.10 / 15 are taking
> the language in quite the wrong direction IMO.


> We need to solve a couple of basic problems. The most important and
> basic is: when does the lifetime of a POD class even begin?


That's a good question: do PODs have lifetime? I'd argue yes,
but it's not the lifetime defined in 3.8. Accessing an
uninitialized POD is undefined behavior, and if you can't access
an object, how can you say it exists?

> Consider:


> #include <cstdlib>
> using namespace std;


> struct T1 { int x; int y; };
> struct T2 { int x; int y; };


Just a note: the answers in the following may differ between
C and C++. I don't have a copy of the C standard handy to
verify what it says, but it does use a subtly different
definition of type than C++, with terms like "compatible types".
(If memory serves me correctly, I think that if two structs both
have a tag, and the tag is different, then the types are not
compatible, and so the effect is the same here. But I'm far
from sure.)

> int main()
> {
> void* p = 0;
> T1 * t1 = 0;
> T2 * t2 = 0;
> int * x = 0;


> if (sizeof(T1) != sizeof(T2))
> return 1;
> if ( (char*)(& t1->y) - (char*) (t1) != (char*)(& t2->y) -
> (char*) (t2) )
> return 1;


> p = malloc(sizeof(T1));
> /* Do we have a T1 object here? Presumably no. Otherwise we also
> have a T2 object here, and we definitely don't want to start talking
> about two distinct complete objects occupying the same storage at the
> same time. */


> t1 = (T1*) p;
> /* T1 object yet? Presumably the answer hasn't changed since the
> above comment. */


> x = & t1->x;
> /* T1 object yet? */


> *x = 1;
> /* Do we have a T1 object here? Maybe. I just see a write through
> an int lvalue. I see no writes nor reads through a T1 lvalue. I see
> nothing that favors T1 over T2, besides some sort of data dependency
> analysis through the member-of operator. However, there isn't even a
> hint of data dependency analysis in the standard with regards to
> object lifetime rules. */


> x = & t1->y;
> *x = 2;
> /* Do we have a T1 object here? The answer must be yes, or we'll
> never have a T1 object. However, again, I see nothing to favor having
> a T1 object over a T2 object besides data dependency analysis through
> the member-of operator. */


> t2 = (T2*) p;
> return t2->y; /* UB? Why? Why is reading "t1->y" not UB, but
> reading "t2->y" is UB? In other words, why do we have a T1 object, but
> not a T2 object? */
> }


> Also, what if we used offsetof hackery to initialize both int members
> of the T1 object without using a member-of operator on a T1 lvalue?


> As far as I can tell, gcc doesn't even bother doing aliasing analysis
> on anything besides primitive types, for exactly the reasons outlined
> above. They must not have seen a sensible way to differentiate between
> T1 and T2, just as I cannot.


Gcc may be basing its decision on the meaning of "compatible
type" in C. Again, purely from memory (perhaps someone from the
C group could confirm), I think that given:
typedef struct { int i; } T1;
typedef struct { int i; } T2;
, in C, T1 and T2 are "compatible types", and behave more or
less as if they were the same type. (In C++, they are two
distinct types.)

--
James Kanze
 
 
James Kanze
 
      01-25-2011
On Jan 25, 3:15 am, Ben Bacarisse <(E-Mail Removed)> wrote:
> This is cross posted and I don't think the C side of the questions have
> been properly answered. My answer is about C only.


Hopefully, C++ says the same thing. This is one point where
I don't think the languages should differ.

> Joshua Maurice <(E-Mail Removed)> writes:
> > On Jan 24, 2:49 am, James Kanze <(E-Mail Removed)> wrote:
> >> On Jan 22, 10:29 pm, Joshua Maurice <(E-Mail Removed)> wrote:
> >> > Let's consider this function though:
> >> > int foo(int* x, short* y)
> >> > {
> >> > *x = 1;
> >> > *y = 2;
> >> > return 1;
> >> > }
> >> > int bar(int* x, short* y)
> >> > {
> >> > *x = 1;
> >> > *y = 2;
> >> > return *x;
> >> > }
> >> > Let's consider functions foo and bar. Let's suppose that x and
> >> > y alias in both. For function foo, there is no undefined
> >> > behavior even though both alias (at least according to what
> >> > appears to be the prominent interpretation of these rules).


> >> I'm not sure about C here, but in C++, there is definitely
> >> undefined behavior in foo if x and y alias.


> If x and y both point to the same allocated object, then neither
> function is undefined. The assignments set the "effective type" of the
> allocated object.


Gcc treats it as undefined behavior if there is aliasing, and
may reorder the assignments. (IMHO, this is an error, but from
what I understand, it is an important optimization in certain
cases.)

> >> In fact, there
> >> would be undefined behavior even if foo were simply:


> >> void foo(const int* x, const short* y)
> >> {
> >> printf("%d, %d\n", *x, *y);
> >> }


> Again, this is not always UB when the object being aliased is allocated
> rather than declared.


I'm not sure I understand. Supposing that x and y point to the
same address, which was obtained by malloc. If the memory is
uninitialized, there is undefined behavior. If the memory was
initialized as an int, then accessing it as a short is undefined
behavior, and if it was initialized as a short, accessing it as
an int has undefined behavior. And there's no way for its
"effective type" to be both short and int; it's one or the
other (or none of the above), but it can't be both.

> When the aliased object is allocated, whether the
> accesses are defined or not depends on the effective type of the aliased
> allocated object. To be certain of UB when the pointers point to the
> same allocated object you need something like this:


> void foo(int *x, short *y)
> {
> *y = 1;
> printf("%d\n", *x);
> }


> The assignment ensures that the effective type of the allocated object
> is short, so the second access (the read through x) is undefined.


> >> If the two pointers point to the same physical address, there is
> >> no way that the memory they point to can be both an int and
> >> a short.


> In C it can be if the storage is allocated and only stores are done (as
> in the first foo and bar above).


In C (and C++), when the memory is allocated, it is
uninitialized. I don't know what type, if any, it is assumed to
have, but regardless of the type, you simply cannot access
uninitialized memory (except through an unsigned char*). And
once you initialize it, you've fixed the type (until the next
"initialization", at least).

> >> And the C++ standard clearly says:
> >> If a program attempts to access the stored value of an
> >> object through an lvalue of other than one of the following
> >> types the behavior is undefined:
> >> [...]
> >> and short for int or vice versa isn't in the list. And I'm
> >> certain that the intent in C is the same: C definitely allows
> >> trapping representations for integer values, and reading part of
> >> an int as a short could conceivably result in a trapping
> >> representation for a short. (Think of a one's complement
> >> machine which traps on -0.)


> >> The problem becomes more interesting if we replace short with
> >> unsigned char. In that case, my version is legal and defined
> >> behavior: accessing a stored value through an lvalue of char or
> >> unsigned char type is in the list after the cited paragraph.
> >> (IIRC, in C, this exception only applies to unsigned char; for
> >> some reason, C++ added plain char to the list.)


> In C, the wording is "a character type" which covers char and both
> signed and unsigned char. As you say, it is odd (at least at first
> glance -- I am not a C++ expert) that C++ added char but not signed
> char to the list.


It's especially odd that signed char is allowed, since copying
a signed char cannot necessarily be made to preserve the raw
bits or avoid trapping. (Again, a machine with 1's complement
which either converts all 0's to the positive representation when
it sees them, or traps. On such machines, for plain char to
work, it would have to be unsigned---in fact, on the two
machines I know which don't use 2's complement, plain char is
unsigned.)

Just an idea: for purposes of demonstration, it might be better
to use int and float, rather than int and short, because reading
an int as a float can trap on most common machines; we don't
have to introduce such exotics as 1's complement to cause
issues.
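
The character-type exception mentioned above is what makes a bytewise copy legal. A small sketch in C, with copy_int_bytewise as a made-up name, assuming nothing beyond the rule that any object may be accessed through unsigned char lvalues:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies an int one byte at a time. */
static int copy_int_bytewise(int src)
{
    int dst = 0;
    const unsigned char *from = (const unsigned char *)&src;
    unsigned char *to = (unsigned char *)&dst;

    for (size_t i = 0; i < sizeof src; ++i)
        to[i] = from[i];           /* every access uses an unsigned char lvalue */

    /* After copying all bytes, the two representations are identical. */
    assert(memcmp(&dst, &src, sizeof src) == 0);
    return dst;
}
```

Because every byte of the representation is copied, the result is read back as an int holding the same value as the source.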

--
James Kanze
 
 
James Kanze
 
      01-25-2011
On Jan 25, 3:33 am, Joshua Maurice <(E-Mail Removed)> wrote:

[...]
> I'd like to think that the rest of the questions applied to C as well
> as C++.


One would like to think so.

I believe that this was the intent in C++98, but the committee,
for whatever reasons, reworded everything, and there are clearly
cases where the resulting wording actually differs in meaning
from that in the C standard. Since C++98, the C committee has
also reworded a lot (things like the representation of integral
types come to mind). And I'm not sure how seriously the
current C++ committee (and the current drafts) take
C compatibility into account. IMHO: if C++ differs from C where
basic types like int, short and float are involved, it is
a defect in the C++ standard, but I'm not sure that all of the
committee members agree with me.

[...]
> I would also like to hear some of the actual people on the committees
> weigh in on these particular questions.


Formally, I think I'm still a "technical expert" for AFNOR, but
for various personal reasons, I've not been active lately.

--
James Kanze
 
 
James Kanze
 
      01-25-2011
On Jan 25, 5:26 am, Joshua Maurice <(E-Mail Removed)> wrote:
> On Jan 24, 8:33 pm, "Johannes Schaub (litb)" <(E-Mail Removed)>
> wrote:


[...]
> So, can you write a memory allocator which reuses memory in pure
> conforming C++ on top of new and delete, or on top of malloc and free?


If you're actually trying to write an allocator, you also have
to take into account what actual compilers do, and not just the
standard. I seem to recall something along the following lines:

float f(float const* in, bool* out)
{
float result = *in;
*out = true;
return result;
}

failing with g++ when called with:

union U { float f; bool b; };
U u;
u.f = 3.14159;
float g = f(&u.f, &u.b);

According to both C and C++, the union guarantees that this
should work, but g++ rearranged the read and the write in f.

(I also seem to recall---albeit vaguely---the C committee saying
that it wasn't the intent to make this work; that they only
meant for it to be guaranteed if e.g. f was passed a pointer to
the union. But it's all very vague---I didn't have time to
follow up at the time.)

At any rate, most of this discussion seems to turn around the
same issues, without the union.
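
For comparison, here is a sketch of the variant where the function receives a pointer to the union itself, which (if the recollection above is right) is the form that was meant to be guaranteed. It is written as C with <stdbool.h>, and f_via_union is a hypothetical name rather than anything from the original report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

union U { float f; bool b; };

/* Hypothetical variant of f: both accesses go through the union object. */
static float f_via_union(union U *u)
{
    assert(u != NULL);
    float result = u->f;   /* read the float member first */
    u->b = true;           /* then overwrite the storage via the bool member */
    return result;
}
```

Since the compiler can see that both members belong to the same union object, it has no licence to assume the read and the write are unrelated.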

--
James Kanze
 
 
Joshua Maurice
 
      01-25-2011
On Jan 25, 2:22 am, James Kanze <(E-Mail Removed)> wrote:
> If you're actually trying to write an allocator, you also have
> to take into account what actual compilers do, and not just the
> standard. I seem to recall something along the following lines:
>
>     float f(float const* in, bool* out)
>     {
>         float result = *in;
>         *out = true;
>         return result;
>     }
>
> failing with g++ when called with:
>
>     union U { float f; bool b; };
>     U u;
>     u.f = 3.14159;
>     float g = f(&u.f, &u.b);
>
> According to both C and C++, the union guarantees that this
> should work, but g++ rearranged the read and the write in f.
>
> (I also seem to recall---albeit vaguely---the C committee saying
> that it wasn't the intent to make this work; that they only
> meant for it to be guaranteed if e.g. f was passed a pointer to
> the union. But it's all very vague---I didn't have time to
> follow up at the time.)
>
> At any rate, most of this discussion seems to turn around the
> same issues, without the union.


Indeed. It's all very related to the union DR. So, the C standard
committee never intended for the following program to have defined
behavior? Interesting.

void foo(int* x, float* y)
{ *x = 1;
*y = 1;
}
int main()
{ union { int x; float y; } u;
foo(&u.x, &u.y);
return u.y;
}

AFAIK, the only way that this could make sense is if you introduce
some formalisms with data dependency analysis.

Let me take another whack at trying to formalize it.

Quote:
For a single function, the compiler may assume that at any particular
point of execution, any accessible pointer value or named variable
does not alias another accessible pointer value or named variable of a
sufficiently different type (see existing strict aliasing rules),
unless the two pointers or named variables have a data dependency
between them (ala the rules for restrict, or maybe the C++0x rules for
std::memory_order_consume). Programs which violate this assumption
have undefined behavior.

Ex:
void foo(int* x, float* y)
{ *x = 1;
  *y = 1;
}
int main()
{ union { int x; float y; } u;
  foo(&u.x, &u.y);
  return u.y;
}
The function foo has a spot during its execution where there are two
accessible pointer values (its parameters x and y) of sufficiently
different types which alias, and there is no data dependency between
them in the scope of the body of foo. Thus the assumption is violated,
and the program has undefined behavior.

Ex:
#include <stdlib.h>
int main()
{ int* x = (int*) malloc(sizeof(int));
  *x = 1;
  free(x);
  float* y = (float*) malloc(sizeof(float));
  *y = 1;
  free(y);
}
In the above program, malloc may return the same piece of memory
twice, once for x, and once for y. However, at no point of execution
are both pointers "live" and pointing to the same piece of memory.
Thus the assumption is not violated, and this program has no undefined
behavior.
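
A related sketch that makes the reuse explicit instead of relying on malloc handing back the same address: one allocated block is used first as an int and then as a float. This is the C case that was described earlier in the thread as well-defined because of effective type. Whether the C++ wording says the same is the open question. The function name reuse_block and its out-parameters are my own invention for the example.

```c
#include <stdlib.h>

/* Hypothetical helper: uses one allocated block first as an int, then as a
   float. Returns 0 on success and -1 if the allocation fails. */
int reuse_block(int *int_out, float *float_out)
{
    size_t size = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *p = malloc(size);
    if (p == NULL)
        return -1;

    int *x = p;
    *x = 1;             /* the store gives the block effective type int */
    *int_out = *x;      /* read as int while it holds an int */

    float *y = p;
    *y = 2.5f;          /* the new store changes the effective type to float */
    *float_out = *y;    /* read as float; the old int value is never read again */

    free(p);
    return 0;
}
```

Note that each read uses the type of the most recent store. Reading *x after the store through y would be a different matter.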

Ex:
int main()
{ int x;
  float* y;

  y = (float*) &x;
  x = 1;
  *y = 2;
  x = 3;
}
The above program has a named variable which aliases a pointer value.
However, there exists a data dependency between them, so the program
has no undefined behavior.

I think this is the best I've gotten to formalizing the intent. I'm
deferring to the 'restrict' rules, mostly because I think they would
probably best capture all of the nuances which I need. Perhaps I could
instead use the C++0x data dependency rules ala
std::memory_order_consume. I'm not intimately familiar with those
either.

I'm not sure what I have written thus far is anywhere near sufficient
or correct, but hopefully it captures what I'm aiming for.

The important thing is, AFAIK, nothing like this is in any of the C
standards nor any of the C++ standards.
 
 
Joshua Maurice
Guest
Posts: n/a
 
      01-25-2011
On Jan 25, 2:06 am, James Kanze <(E-Mail Removed)> wrote:
> On Jan 25, 3:33 am, Joshua Maurice <(E-Mail Removed)> wrote:
> > I'd like to think that the rest of the questions applied to C as well
> > as C++.

>
> One would like to think so.
>
> I believe that this was the intent in C++98, but the committee,
> for whatever reasons, reworded everything, and there are clearly
> cases where the resulting wording actually differs in meaning
> from that in the C standard. Since C++98, the C committee has
> also reworded a lot (things like the representation of integral
> types comes to mind). And I'm not sure how seriously the
> current C++ committee (and the current drafts) take
> C compatibility into account. IMHO: if C++ differs from C where
> basic types like int, short and float are involved, it is
> a defect in the C++ standard, but I'm not sure that all of the
> committee members agree with me.


That's good to hear. It would be a shame if C++ diverged from C so
drastically. I agree that most C should continue to be legal C++ code
(once you fix up the minor details like the includes, namespaces,
implicit void pointer casts, etc.). Glad you're on the committee.

> > I would also like to hear some of the actual people on the committees
> > weigh in on these particular questions.

>
> Formally, I think I'm still a "technical expert" for AFNOR, but
> for various personal reasons, I've not been active lately.


I never meant to disparage you. I know you are, but up until the last
couple of posts, you didn't really address my interesting questions.
(You have now though, here and else-thread, and thank you again.)
 