
# =operator for structs

Christian Christmann

 03-18-2006
Hi,

I was wondering how the =operator works for
struct.

When I for example define a struct as follows:

struct point {
int a;
char *c;
};

and create the first struct

struct point p1 = { 10, "Hallo" };

and then create another struct and assign it struct p1

struct point p2;
p2 = p1;

it seems that all elements are copied properly, i.e.
a new variable is created for 'a' but also, what is more interesting,
an independent string char* 'c' is generated since a
modification to p1.c does not affect p2.c.

Does this mean that structs can be assigned by '=' without any
problems even if they contain (pointer to) nested structs as
elements?

Thank you.
Chris

Peter Nilsson

 03-18-2006
Christian Christmann wrote:
> Hi,
>
> I was wondering how the =operator works for
> struct.

It does a shallow copy.

> When I for example define a struct as follows:
>
> struct point {
> int a;
> char *c;
> };
>
> and create the first struct
>
> struct point p1 = { 10, "Hallo" };

Note that "Hallo" is a string literal. Your p1.c will point to a
string, but modifying that string invokes undefined behaviour.

> and then create another struct and assign it struct p1
>
> struct point p2;
> p2 = p1;
>
> it seems that all elements are copied properly, i.e.
> a new variable is created for 'a' but also, what is more interesting,
> an independent string char* 'c' is generated since a
> modification to p1.c does not affect p2.c.

This is an example of undefined behaviour in action.

> Does this mean that structs can be assigned by '=' without any
> problems

Generally yes. There isn't a problem with the assignment; there
is a problem with your subsequent use of the struct copy.

> even if they contain (pointer to) nested structs as
> elements?

Like I say, assignment only does a shallow copy.

--
Peter

Default User

 03-18-2006
Christian Christmann wrote:

> Hi,
>
> I was wondering how the =operator works for
> struct.
>
> When I for example define a struct as follows:
>
> struct point {
> int a;
> char *c;
> };
>
> and create the first struct
>
> struct point p1 = { 10, "Hallo" };
>
> and then create another struct and assign it struct p1
>
> struct point p2;
> p2 = p1;
>
> it seems that all elements are copied properly, i.e.
> a new variable is created for 'a' but also, what is more interesting,
> an independent string char* 'c' is generated since a
> modification to p1.c does not affect p2.c.

What do you mean by "modification to p1.c"?

If you did something like this:

p1.c[0] = 'q';

Then you did a very bad thing, that's undefined behavior.

If you meant this:

p1.c = "a different string";

Then there's no problem.

> Does this mean that structs can be assigned by '=' without any
> problems even if they contain (pointer to) nested structs as
> elements?

Unlikely. The pointers are copied exactly, so each struct initially points
to the same item (assuming the pointers were set to a valid object's
address).

You use terminology very loosely. Examples would help.

Brian

Keith Thompson

 03-18-2006
Christian Christmann <(E-Mail Removed)> writes:
> I was wondering how the =operator works for
> struct.

By copying the values of all the members. It can either copy them one
at a time or, more likely, do the equivalent of a memcpy() on
the entire structure.

> When I for example define a struct as follows:
>
> struct point {
> int a;
> char *c;
> };
>
> and create the first struct
>
> struct point p1 = { 10, "Hallo" };
>
> and then create another struct and assign it struct p1
>
> struct point p2;
> p2 = p1;
>
> it seems that all elements are copied properly, i.e.
> a new variable is created for 'a' but also, what is more interesting,
> an independent string char* 'c' is generated since a
> modification to p1.c does not affect p2.c.
>
> Does this mean that structs can be assigned by '=' without any
> problems even if they contain (pointer to) nested structs as
> elements?

No, it doesn't. If a structure contains pointers, copying it by
assignment to another structure object just copies the pointers; both
pointers will point to the same external object. To use the jargon,
struct assignment does a "shallow copy", not a "deep copy".

p1.c, a pointer, is part of the structure, and is copied by the
assignment. The string that p1.c points to is not part of the
structure, and is not copied by the assignment.

Given the code above, you can modify p2.c without affecting p1 (just
as you can modify p2.a without affecting p1), but you can't modify
what p2.c points to without affecting p1 (or rather, affecting what
p1.c points to).

And in this case, since p1.c and p2.c both point to a string literal,
you can't legally modify it at all (attempting to do so invokes
undefined behavior).

Here's a program that illustrates what happens. Note that I've
initialized p1.c to point to a (non-const) array object rather than to
a string literal, so modifying the string is allowed.

================================
#include <stdio.h>
int main(void)
{
    char hello[] = "hello";

    struct point {
        int a;
        char *c;
    };

    struct point p1 = { 10, hello };
    struct point p2;
    p2 = p1;

    printf("p1 = { %d, %p --> \"%s\" }\n", p1.a, (void*)p1.c, p1.c);
    printf("p2 = { %d, %p --> \"%s\" }\n", p2.a, (void*)p2.c, p2.c);

    printf("Modifying p2.c[0]\n");
    p2.c[0] = 'J';

    printf("p1 = { %d, %p --> \"%s\" }\n", p1.a, (void*)p1.c, p1.c);
    printf("p2 = { %d, %p --> \"%s\" }\n", p2.a, (void*)p2.c, p2.c);

    printf("Modifying p2.c\n");
    p2.c = "Good-bye";

    printf("p1 = { %d, %p --> \"%s\" }\n", p1.a, (void*)p1.c, p1.c);
    printf("p2 = { %d, %p --> \"%s\" }\n", p2.a, (void*)p2.c, p2.c);

    return 0;
}
================================

The output is:

p1 = { 10, 0x22eeb0 --> "hello" }
p2 = { 10, 0x22eeb0 --> "hello" }
Modifying p2.c[0]
p1 = { 10, 0x22eeb0 --> "Jello" }
p2 = { 10, 0x22eeb0 --> "Jello" }
Modifying p2.c
p1 = { 10, 0x22eeb0 --> "Jello" }
p2 = { 10, 0x40205d --> "Good-bye" }

Keep in mind that the printf with a "%p" format prints its argument (a
pointer), while printf with a "%s" format prints what its argument
points to (a string).

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Simon Biber

 03-18-2006
Christian Christmann wrote:
> struct point {
> int a;
> char *c;
> };
>
> and create the first struct
>
> struct point p1 = { 10, "Hallo" };
>
> and then create another struct and assign it struct p1
>
> struct point p2;
> p2 = p1;

The previous line is equivalent to:
p2.a = p1.a; /* copy integer value */
p2.c = p1.c; /* copy pointer value */

> it seems that all elements are copied properly, i.e.
> a new variable is created for 'a' but also, what is more interesting,
> an independent string char* 'c' is generated since a
> modification to p1.c does not affect p2.c.

p1.c and p2.c were always independent objects. A modification to p1.c
can never affect p2.c! Each of them holds a pointer value, and either
pointer value can be modified at any time.

However, no independent string is generated. Both p1.c and p2.c point to
the same string literal. The string literal, as always, is not
modifiable. If, however, it were a modifiable object, then you could see
that modifying it would result in the modifications being visible from
both p1.c and p2.c.

> Does this mean that structs can be assigned by '=' without any
> problems even if they contain (pointer to) nested structs as
> elements?

If they contain nested structs as elements, then the nested structs will
be copied correctly by the '=' operator.

If they contain _pointers to_ nested structs as elements, then only the
pointer values will be copied. You will then have two pointers that
point to the same object. Modifying the underlying object will affect
access through either pointer.

--
Simon.

Barry Schwarz

 03-18-2006
On 17 Mar 2006 16:20:12 -0800, "Peter Nilsson" <(E-Mail Removed)>
wrote:

>Christian Christmann wrote:
>> Hi,
>>
>> I was wondering how the =operator works for
>> struct.

>
>It does a shallow copy.
>
>> When I for example define a struct as follows:
>>
>> struct point {
>> int a;
>> char *c;
>> };
>>
>> and create the first struct
>>
>> struct point p1 = { 10, "Hallo" };

>
>Note that "Hallo" is a string literal. Your p1.c will point to a
>string, but
>modifying that string invokes undefined behaviour.
>
>> and then create another struct and assign it struct p1
>>
>> struct point p2;
>> p2 = p1;
>>
>> it seems that all elements are copied properly, i.e.
>> a new variable is created for 'a' but also, what is more interesting,
>> an independent string char* 'c' is generated since a
>> modification to p1.c does not affect p2.c.

>
>This is an example of undefined behaviour in action.

No it isn't. It is perfectly legal to modify p1.c. What would invoke
undefined behavior would be modifying what p1.c points to while it
still points to a string literal.

>
>> Does this mean that structs can be assigned by '=' without any
>> problems

>
>Generally yes. There isn't a problem with the assignment; there
>is a problem with your subsequent use of the struct copy.

What problem are you referring to?

>
>> even if they contain (pointer to) nested structs as
>> elements?

>
>Like I say, assignment only does a shallow copy.

Remove del for email

Christian Christmann

 03-21-2006
On Sat, 18 Mar 2006 00:24:29 +0000, Default User wrote:

>>
>> struct point p1 = { 10, "Hallo" };
>>
>> and then create another struct and assign it struct p1
>>
>> struct point p2;
>> p2 = p1;
>>

>
> What do you mean by "modification to p1.c"?
>
> If you did something like this:
>
> p1.c[0] = 'q';
>
> Then you did a very bad thing, that's undefined behavior.

Why do I get an undefined behavior when modifying the string
p1.c points to? Isn't "Hello" a char array somewhere in the
memory that is referenced by p1.c and thus modification to single
char elements like p1.c[0] should be allowed?

Richard Bos

 03-21-2006
Christian Christmann <(E-Mail Removed)> wrote:

> On Sat, 18 Mar 2006 00:24:29 +0000, Default User wrote:
>
> >> struct point p1 = { 10, "Hallo" };

> > What do you mean by "modification to p1.c"?
> >
> > If you did something like this:
> >
> > p1.c[0] = 'q';
> >
> > Then you did a very bad thing, that's undefined behavior.

>
> Why do I get an undefined behavior when modifying the string
> p1.c points to? Isn't "Hello" a char array somewhere in the
> memory that is referenced by p1.c and thus modification to single
> char elements like p1.c[0] should be allowed?

No. p1.c is a pointer to char; whether writing through this pointer
invokes UB depends on what it points at. In this case, it points at
"Hallo", which is a string literal; and string literals are translated
into arrays of char in memory _which may be in unwritable memory_. For
example, an embedded system long on ROM and short on RAM could put all
literal strings in ROM.

Richard

Flash Gordon

 03-21-2006
Christian Christmann wrote:
> On Sat, 18 Mar 2006 00:24:29 +0000, Default User wrote:
>
>
>>> struct point p1 = { 10, "Hallo" };
>>>
>>> and then create another struct and assign it struct p1
>>>
>>> struct point p2;
>>> p2 = p1;
>>>

>> What do you mean by "modification to p1.c"?
>>
>> If you did something like this:
>>
>> p1.c[0] = 'q';
>>
>> Then you did a very bad thing, that's undefined behavior.

>
> Why do I get an undefined behavior when modifying the string
> p1.c points to? Isn't "Hello" a char array somewhere in the
> memory that is referenced by p1.c and thus modification to single
> char elements like p1.c[0] should be allowed?

I'm assuming the definition of the struct was something like:
struct point {
int a;
char *c;
};

So, in other words, p1.c is a pointer to a string literal.

You are correct that "Hello" will be an array in memory somewhere, and
it will obviously have a '\0' after it. However, the standard explicitly
states that you are not allowed to modify string literals, so modifying
it is undefined behaviour.

The reason the C language has this restriction is to allow the compiler
to put the string literal in read only memory (e.g. keep it in ROM on an
embedded system, or just in a page marked as read only on a hosted
system) and/or combine string literals, including combining the strings
"Let me say Hello", "Hello" and "lo".

So, depending on what the compiler does, some possible results would be
modifying all string literals that end in "Hello", causing the OS to
raise some form of access violation signal or error, causing an attempt
to write to memory that is physically read only (probably resulting in
nothing happening) or anything else.
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc