
# Declaration vs definition of array

Noob

 03-27-2013
Hello everyone,

I was playing around with arrays, when I noticed something
I don't understand.

Consider the following code:

extern int u[10];
int v[10];
int x[10] = { };
int y[10] = { 0 };
int z[10] = { 42 };

y and z are proper array definitions, while u is merely
a declaration. But what about v and x?

On my platform, objdump says:

00000000 g O .data 00000028 _z
00000000 g O .bss 00000028 _y
00000028 g O .bss 00000028 _x
00000028 O *COM* 00000004 _v

No mention of u, which is expected for a declaration.
y in bss, expected since it's all-0.
z in data, expected because it's not all-0.

Apparently, x is equivalent to y.
So int x[10] = { }; is a proper definition?
(Can you cite C&V allowing this syntax?)

I'm puzzled with v. It seems to have been considered
a pointer? (Since the size is 4.) Did my compiler
consider this a tentative definition?

I would have expected v to be equivalent to x and y.
I suppose I was wrong, given the output?

So I have to use x or y syntax to define an empty array?

http://stackoverflow.com/questions/2...implementation
http://david.tribble.com/text/cdiffs.htm#C99-odr

Regards.

James Kuyper

 03-27-2013
On 03/27/2013 01:27 PM, Noob wrote:
> Hello everyone,
>
> I was playing around with arrays, when I noticed something
> I don't understand.
>
> Consider the following code:
>
> extern int u[10];
> int v[10];
> int x[10] = { };
> int y[10] = { 0 };
> int z[10] = { 42 };
>
> y and z are proper array definitions, while u is merely
> a declaration. But what about v and x?

The array 'v' is a tentative definition, since no initializers are
provided. It is covered by 6.9.2p2: "... If a translation unit contains
one or more tentative definitions for an identifier, and the translation
unit contains no external definition for that identifier, then the
behavior is exactly as if the translation unit contains a file scope
declaration of that identifier, with the composite type as of the end of
the translation unit, with an initializer equal to 0."

It has been pointed out that, technically, this wording implies that the
equivalent declaration should be:

int v[10] = 0;

Which would be a constraint violation (6.7.9p16) because there should be
braces around that '0'. However, the intent of the committee was almost
certainly that the equivalent declaration be:

int v[10] = {0};

Which would result in the entire array being zero-initialized, and that
is in fact the way that most (all?) real-world conforming C compilers
interpret it.

You might think that x is a similar case, since no initializers are
explicitly provided there, either. However, a brace-enclosed initializer
list is itself an initializer - but an initializer consisting of braces
that don't enclose an initializer list is a syntax error (6.7.9p1), and
should therefore have triggered a diagnostic message. If your compiler
didn't provide one, you may need to set the warning level higher. gcc
requires the '-ansi -pedantic' options in order to properly diagnose
this issue. Your compiler may have similar requirements.
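
A minimal sketch of the point about v, assuming a hosted C99 compiler: the tentative definition reserves room for ten ints and zero-initializes them, exactly like y. The helper names all_zero and check_tentative are made up for illustration and are not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

int v[10];          /* tentative definition: no initializer, no extern */
int y[10] = { 0 };  /* external definition: explicit initializer */

/* Hypothetical helper: returns 1 if the first n ints of a are all zero. */
int all_zero(const int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != 0)
            return 0;
    return 1;
}

/* Hypothetical check: v is a complete array of 10 zeroed ints, same as y. */
void check_tentative(void)
{
    assert(sizeof v == 10 * sizeof(int));  /* an array, not a pointer */
    assert(sizeof v == sizeof y);
    assert(all_zero(v, 10));
    assert(all_zero(y, 10));
}
```

Calling check_tentative() from main should pass silently on a conforming implementation, whichever section the object file places v in.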

> On my platform, objdump says:
>
> 00000000 g O .data 00000028 _z
> 00000000 g O .bss 00000028 _y
> 00000028 g O .bss 00000028 _x
> 00000028 O *COM* 00000004 _v
>
> No mention of u, which is expected for a declaration.
> y in bss, expected since it's all-0.
> z in data, expected because it's not all-0.
>
> Apparently, x is equivalent to y.
> So int x[10] = { }; is a proper definition?
> (Can you cite C&V allowing this syntax?)

No.

> I'm puzzled with v. It seems to have been considered
> a pointer? (Since the size is 4.) Did my compiler
> consider this a tentative definition?

It should have set aside enough room for 10 ints, zero-initialized. I'm
not sufficiently familiar with objdump to be sure how to interpret those
results; perhaps it works correctly when linked with other code to make
a complete program?

> I would have expected v to be equivalent to x and y.
> I suppose I was wrong, given the output?
>
> So I have to use x or y syntax to define an empty array?

The syntax you used for v should be sufficient, but y will work as well.
The syntax you used for x should not have worked.

Noob

 03-28-2013
James Kuyper wrote:

> Noob wrote:
>
>> extern int u[10];
>> int v[10];
>> int x[10] = { };
>> int y[10] = { 0 };
>> int z[10] = { 42 };
>>
>> y and z are proper array definitions, while u is merely
>> a declaration. But what about v and x?

>
> The array 'v' is a tentative definition, since no initializers are
> provided. It is covered by 6.9.2p2: "... If a translation unit contains
> one or more tentative definitions for an identifier, and the translation
> unit contains no external definition for that identifier, then the

What's an external definition?

The following is just a declaration, right?
extern int v[10];

(For the record, adding that declaration to my source file doesn't
change the compiler's output.)

> behavior is exactly as if the translation unit contains a file scope
> declaration of that identifier, with the composite type as of the end of
> the translation unit, with an initializer equal to 0."
>
> It has been pointed out that, technically, this wording implies that the
> equivalent declaration should be:
>
> int v[10] = 0;
>
> Which would be a constraint violation (6.7.9p16) because there should be
> braces around that '0'. However, the intent of the committee was almost
> certainly that the equivalent declaration be:
>
> int v[10] = {0};
>
> Which would result in the entire array being zero-initialized, and that
> is in fact the way that most (all?) real-world conforming C compilers
> interpret it.
>
> You might think that x is a similar case, since no initializers are
> explicitly provided there, either. However, a brace-enclosed initializer
> list is itself an initializer - but an initializer consisting of braces
> that don't enclose an initializer list is a syntax error (6.7.9p1), and
> should therefore have triggered a diagnostic message. If your compiler
> didn't provide one, you may need to set the warning level higher. gcc
> requires the '-ansi -pedantic' options in order to properly diagnose
> this issue. Your compiler may have similar requirements.

Indeed! So this is a gcc extension. Good to know.

$ sh4gcc -std=c99 -pedantic -Wall -Wextra -c -O3 -Wall arr.c
arr.c:3:13: warning: ISO C forbids empty initializer braces [-pedantic]

>> On my platform, objdump says:
>>
>> 00000000 g O .data 00000028 _z
>> 00000000 g O .bss 00000028 _y
>> 00000028 g O .bss 00000028 _x
>> 00000028 O *COM* 00000004 _v
>>
>> No mention of u, which is expected for a declaration.
>> y in bss, expected since it's all-0.
>> z in data, expected because it's not all-0.
>>
>> Apparently, x is equivalent to y.
>> So int x[10] = { }; is a proper definition?
>> (Can you cite C&V allowing this syntax?)

>
> No.

No C&V because it is an extension. (Thanks for pointing it out.)

>> I'm puzzled with v. It seems to have been considered
>> a pointer? (Since the size is 4.) Did my compiler
>> consider this a tentative definition?

>
> It should have set aside enough room for 10 ints, zero-initialized. I'm
> not sufficiently familiar with objdump to be sure how to interpret those
> results; perhaps it works correctly when linked with other code to make
> a complete program?

There probably is something special about the *COM* section.

This paragraph sounds very relevant:

3.18 Options for Code Generation Conventions
-fno-common
In C code, controls the placement of uninitialized global variables.
Unix C compilers have traditionally permitted multiple definitions of
such variables in different compilation units by placing the
variables in a common block. This is the behavior specified by
-fcommon, and is the default for GCC on most targets. On the other
hand, this behavior is not required by ISO C, and on some targets may
carry a speed or code size penalty on variable references. The
-fno-common option specifies that the compiler should place
uninitialized global variables in the data section of the object
file, rather than generating them as common blocks. This has the
effect that if the same variable is declared (without extern) in two
different compilations, you get a multiple-definition error when you
link them. In this case, you must compile with -fcommon instead.
Compiling with -fno-common is useful on targets for which it provides
better performance, or if you wish to verify that the program will
work on other systems that always treat uninitialized variable
declarations this way.

>> I would have expected v to be equivalent to x and y.
>> I suppose I was wrong, given the output?
>> So I have to use x or y syntax to define an empty array?

>
> The syntax you used for v should be sufficient, but y will work as well.
> The syntax you used for x should not have worked.

Thanks.

Barry Schwarz

 03-28-2013
On Thu, 28 Mar 2013 11:08:00 +0100, Noob <root@127.0.0.1> wrote:

>James Kuyper wrote:
>
>> Noob wrote:
>>
>>> extern int u[10];
>>> int v[10];
>>> int x[10] = { };
>>> int y[10] = { 0 };
>>> int z[10] = { 42 };
>>>
>>> y and z are proper array definitions, while u is merely
>>> a declaration. But what about v and x?

>>
>> The array 'v' is a tentative definition, since no initializers are
>> provided. It is covered by 6.9.2p2: "... If a translation unit contains
>> one or more tentative definitions for an identifier, and the translation
>> unit contains no external definition for that identifier, then the

>
>What's an external definition?

The phrase "external definition" is defined in the paragraph preceding
the one James quoted.

>The following is just a declaration, right?
>extern int v[10];

Non-sequitur. The storage class "extern" is not related to the phrase
"external definition".

>(For the record, adding that declaration to my source file doesn't
>change the compiler's output.)

The actions of one compiler need not reflect the intent of the
standard.

--
Remove del for email

James Kuyper

 03-28-2013
On 03/28/2013 06:08 AM, Noob wrote:
> James Kuyper wrote:
>
>> Noob wrote:
>>
>>> extern int u[10];
>>> int v[10];
>>> int x[10] = { };
>>> int y[10] = { 0 };
>>> int z[10] = { 42 };
>>>
>>> y and z are proper array definitions, while u is merely
>>> a declaration. But what about v and x?

>>
>> The array 'v' is a tentative definition, since no initializers are
>> provided. It is covered by 6.9.2p2: "... If a translation unit contains
>> one or more tentative definitions for an identifier, and the translation
>> unit contains no external definition for that identifier, then the

>
> What's an external definition?

6.9p4: "... the unit of program text after preprocessing is a
translation unit, which consists of a sequence of external declarations.
These are described as 'external' because they appear outside any
function (and hence have file scope). ..."

6.9p5: "An external definition is an external declaration that is also a
definition of a function (other than an inline definition) or an object.
...."

Providing a body for a function or an initializer for an object converts
the declaration into a definition.
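
A small sketch of how those rules classify file-scope declarations within one translation unit, assuming a C99 compiler. The names w and check_definitions are hypothetical and only serve the illustration.

```c
#include <assert.h>

extern int u[10];       /* declaration only: extern and no initializer */
int u[10] = { 1, 2 };   /* external definition: the initializer makes it one */

int w;                  /* tentative definition */
int w;                  /* a second tentative definition is allowed in the same file */
int w = 5;              /* external definition; the tentative ones now act as plain declarations */

/* Hypothetical check: each identifier names a single object. */
void check_definitions(void)
{
    assert(u[0] == 1 && u[1] == 2);
    assert(u[9] == 0);  /* elements without an initializer are zeroed */
    assert(w == 5);
}
```

Note that the extern keyword on the first line of u does not prevent the later line from being the external definition of the same object.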
--
James Kuyper

Tim Rentsch

 03-30-2013
Noob <root@127.0.0.1> writes:

>> [snip]

>
> There probably is something special about the *COM* section.
>
> This paragraph sounds very relevant:
>
> 3.18 Options for Code Generation Conventions
> -fno-common
> In C code, controls the placement of uninitialized global
> variables. Unix C compilers have traditionally permitted
> multiple definitions of such variables in different compilation
> units by placing the variables in a common block. This is the
> behavior specified by -fcommon, and is the default for GCC on
> most targets. On the other hand, this behavior is not required
> by ISO C, and on some targets may carry a speed or code size
> penalty on variable references. The -fno-common option specifies
> that the compiler should place uninitialized global variables in
> the data section of the object file, rather than generating them
> as common blocks. This has the effect that if the same variable
> is declared (without extern) in two different compilations, you
> get a multiple-definition error when you link them. In this
> case, you must compile with -fcommon instead. Compiling with
> -fno-common is useful on targets for which it provides better
> performance, or if you wish to verify that the program will work
> on other systems that always treat uninitialized variable
> declarations this way.

In case it isn't clear what this means: before C was standardized,
there was no notion of a 'tentative definition', and declaring a
variable without initializing it (and without using 'extern') made
the variable be a "shared global" (just a made-up term) without
strongly defining it in any one file. Thus, if we have in two
separate .c files, x.c and y.c, code like:

/* x.c */

int foo;

...

/* y.c */

int foo;

...

then both x.c and y.c can use the "shared global" variable 'foo',
which exists in one place in memory, not two. This usage was not
an error but the usual way things were done, and well-defined in
the sense that the compilers that existed at the time all did the
same thing.

When C was standardized, the language definition was changed so
that a declaration like 'int foo;' will define the variable in any
translation unit where such a declaration appears, and if there is
more than one then the result is undefined behavior (which is also
the case for any kind of multiple definition, not just ones that
don't have an initializer).

Note the significance of how the Standard addresses the situation.
Because (insofar as the Standard is concerned) the behavior is
undefined, an implementation is free to define the "shared global"
kind of declaration so it works just like the good old days. And
that's the reason for -fcommon, why your no-initializer variable
ended up in the COM area, etc.
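
For contrast, here is a sketch of the arrangement that does not depend on common-block behavior: one extern declaration that every file can see, and exactly one definition. It is shown as a single file so that it compiles on its own. The split into foo.h and separate .c files, and the names bump_foo, read_foo and check_foo, are assumptions made for illustration.

```c
#include <assert.h>

/* ---- would live in a shared header, e.g. a hypothetical foo.h ---- */
extern int foo;     /* declaration: may be repeated in every translation unit */

/* ---- would live in exactly one .c file ---- */
int foo = 0;        /* the single external definition */

/* ---- code that could live in any file including the header ---- */
void bump_foo(void) { foo++; }
int read_foo(void) { return foo; }

/* Hypothetical check: every user sees the same object. */
void check_foo(void)
{
    int before = read_foo();
    bump_foo();
    assert(read_foo() == before + 1);
}
```

With this layout the program should link the same way under -fcommon and -fno-common, since no file relies on a bare 'int foo;' being merged with another one.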

Lowell Gilbert

 03-30-2013
Tim Rentsch writes:

> When C was standardized, the language definition was changed so
> that a declaration like 'int foo;' will define the variable in any
> translation unit where such a declaration appears, and if there is
> more than one then the result is undefined behavior (which is also
> the case for any kind of multiple definition, not just ones that
> don't have an initializer).

If this is an impersonation, it's a pretty good one.
I think somebody with a rather strange sense of humor has actually
gotten Tim Rentsch's account.

--
Lowell Gilbert, embedded/networking software engineer
http://be-well.ilk.org/~lowell/

Tim Rentsch

 03-30-2013
Lowell Gilbert writes:

> Tim Rentsch writes:
>
>> When C was standardized, the language definition was changed so
>> that a declaration like 'int foo;' will define the variable in any
>> translation unit where such a declaration appears, and if there is
>> more than one then the result is undefined behavior (which is also
>> the case for any kind of multiple definition, not just ones that
>> don't have an initializer).

>
> If this is an impersonation, it's a pretty good one.
> I think somebody with a rather strange sense of humor has actually
> gotten Tim Rentsch's account.

Yes, I have to admit, when I read it myself,
I'm amazed by how good the impersonation is.

glen herrmannsfeldt

 03-31-2013
Tim Rentsch wrote:

(snip)

> In case it isn't clear what this means: before C was standardized,
> there was no notion of a 'tentative definition', and declaring a
> variable without initializing it (and without using 'extern') made
> the variable be a "shared global" (just a made-up term) without
> strongly defining it in any one file.

Well, before there was C, there was Fortran. Starting,
I believe, in Fortran II, there was COMMON such that variables
could be shared between subroutines. In the beginning there was
only blank (unnamed) COMMON, but not so much later, named COMMON.

Once the idea of separate compilation came along, and linkage editors
to actually do it, the mechanism was there. As other languages came
along, they could use the existing mechanism.

> Thus, if we have in two
> separate .c files, x.c and y.c, code like:

> /* x.c */

> int foo;

> ...

> /* y.c */

> int foo;

> ...

> then both x.c and y.c can use the "shared global" variable 'foo',
> which exists in one place in memory, not two. This usage was not
> an error but the usual way things were done, and well-defined in
> the sense that the compilers that existed at the time all did the
> same thing.

And reasonably likely, at least on systems also supporting Fortran,
the same as a named COMMON block named foo. (Or sometimes _foo
or foo_.)

> When C was standardized, the language definition was changed so
> that a declaration like 'int foo;' will define the variable in any
> translation unit where such a declaration appears, and if there is
> more than one then the result is undefined behavior (which is also
> the case for any kind of multiple definition, not just ones that
> don't have an initializer).

In addition to COMMON, though I believe added later, Fortran
has BLOCK DATA. Ordinarily, variables in Fortran COMMON are not
initialized to any specific value. BLOCK DATA allows one to
initialize variables in COMMON, but only in one place.

But C requires, I believe even back to K&R, that static variables,
including "shared global" be initialized, either to zero or another
specified value. Some linkers will allow multiple instances of
an initialized global (COMMON), others won't.

For the OS/360 linkage editor, the first initializing (not COM)
seen is used, and others are ignored. (That is one way to do actual
editing with the linkage editor. You can replace a CSECT in an existing
module by loading the new one first.)

> Note the significance of how the Standard addresses the situation.
> Because (insofar as the Standard is concerned) the behavior is
> undefined, an implementation is free to define the "shared global"
> kind of declaration so it works just like the good old days. And
> that's the reason for -fcommon, why your no-initializer variable
> ended up in the COM area, etc.

-- glen

Noob

 03-31-2013
Lowell Gilbert wrote:

> Tim Rentsch writes:
>
>> When C was standardized, the language definition was changed so
>> that a declaration like 'int foo;' will define the variable in any
>> translation unit where such a declaration appears, and if there is
>> more than one then the result is undefined behavior (which is also
>> the case for any kind of multiple definition, not just ones that
>> don't have an initializer).

>
> If this is an impersonation, it's a pretty good one.
> I think somebody with a rather strange sense of humor has actually
> gotten Tim Rentsch's account.

Why do you think that?