Velocity Reviews - Computer Hardware Reviews

use pointer as array, Weird

 
 
Shao Miller
      01-29-2013
On 1/29/2013 06:57, wy wrote:
> On 01/29/2013 05:45 PM, glen herrmannsfeldt wrote:
>
>> The two are compiled separately and, separately, are
>> legal C. The linker doesn't test for the error for
>> mostly historical reasons.
>>
>> It isn't legal C, but the compiler can't tell.
>>

>
> I see.
> It's the linker's fault.
> The compiler tests(tries to compile) source code,
> and the linker links object code while not testing source or object.
> Thank you.


It is not the linker's fault because the code lies. Whatever particular
linker you're using happens to believe your code. Imagine how much
worse it could be... You could define 'int x = 13;' in one file and
declare 'extern void x(void);' in another file and try to call it. What
would happen then?

It's likely that you're getting the same results as you would with a union:

char data_[3];
union {
    char * cp;
    char ca[3];
} data = { data_ };

Here, 'data.ca != data.cp', because pointers and arrays are different
things, as mentioned before.

--
- Shao Miller
--
"Thank you for the kind words; those are the kind of words I like to hear.

Cheerily," -- Richard Harter
 
Eric Sosman
      01-29-2013
On 1/29/2013 10:53 AM, wy wrote:
>> Finally, this was a deliberate decision by the C committee to allow
>> implementation of C even on systems with fairly simple-minded linkers.

>
> It's surprising!
> After I change the declaration of data in test.c to
> 'extern short data;',
> still no errors and warning occur.
> And result is
> '
> _data_ address in data.c: 0x6009c0
> data address in data.c: 0x6009c0
> data address in test.c: 0x9c0
> ',
> as expected.


There's no reason (in C) to have any particular expectation
of the consequences of undefined behavior. That's what "undefined"
means! If you expect some specific outcome when undefined behavior
is at work, you are kidding yourself.

That said, it is entirely possible that a given implementation
of C may go beyond what the Standard requires and define behaviors
that the Standard does not. An implementation may advertise that
its linker will detect certain kinds of clashes, and will react in
thus-and-such a way when they occur. If your expectation is based
on beyond-the-language documentation of this kind, then it may be
well-founded -- but you cannot hold that same expectation across
other C implementations, because they are under no obligation to
behave in the same way, nor even to document how they will behave.

(NOBODY expects Undefined Behavior! Our chief weapon is the
crash... crashes and bugs, bugs and crashes... Our two weapons
are crashes and bugs and subtle oddities... Our *three* weapons
are crashes, bugs, subtle oddities, and meeting unjustified
expectations... Our *four* -- no -- *Amongst* our weapons...
our weaponry... are such elements as crashes, bugs... I'll
come in again.)

--
Eric Sosman
 
James Kuyper
      01-29-2013
On 01/29/2013 10:49 AM, Keith Thompson wrote:
> James Kuyper <(E-Mail Removed)> writes:
>> One side issue:
>> On 01/29/2013 02:27 AM, wy wrote:
>>> This code can't be compiled without errors.
>>> /* test.c */
>>> #include <stdio.h>
>>> typedef char data_str;
>>>
>>> data_str _data_[3], *data = _data_;

>>
>> "-- All identifiers that begin with an underscore and either an
>> uppercase letter or another underscore are always reserved for any use.
>> -- All identifiers that begin with an underscore are always reserved for
>> use as identifiers with file scope in both the ordinary and tag name
>> spaces." (7.1.3p1).
>>
>> _data_ has file scope, and is in the ordinary name space.
>>
>> "If the program declares or defines an identifier in a
>> context in which it is reserved (other than as allowed by 7.1.4), or
>> defines a reserved identifier as a macro name, the behavior is
>> undefined." (7.1.3p2)
>>
>> Therefore, your program has undefined behavior. Don't use such names.

>
> Yes, but it's very unlikely that the use of a reserved identifier causes
> the symptoms. (James, you know that, but the OP might not.)


That's why I called it a side issue. Still, you're right - I should have
mentioned that fact explicitly.


 
James Kuyper
      01-29-2013
On 01/29/2013 01:02 PM, Geoff wrote:
> On Tue, 29 Jan 2013 15:27:50 +0800, wy <(E-Mail Removed)> wrote:

....
> You have declared the same 'data' object of file scope, twice. Delete the second
> declaration (extern data_str data[3];) and the program will compile and behave
> correctly.


[After Wy split the two different definitions into separate source files:]
....
>> Here we see, it's compiled with no errors and warnings,
>> and the data addresses in data.c and test.c are different.
>> Why are they different and
>> why doesn't the C compiler prohibit this kind of use?

>
> They are different because the translator creates two different 'data' objects
> that the linker must treat separately.


There's no "must" about it. As far as the C standard is concerned, the
behavior is simply undefined; the linker can do anything it wants. As
far as reality goes, there have been real linkers that would place both
'data' objects in the same location; in C terms they would behave as if
they were members of an unnamed union of the two different
declarations. The compiler doesn't know that the two 'data's share the
same memory, and presumably neither does the author of the code, so the
typical result using such a linker would be disaster.

As if that's not bad enough, it's not necessarily the case that the
linker would set aside enough space for the larger of the two
declarations; it might set aside enough space for whichever one it
processed first, which could be the smaller one. There's a reason,
soundly rooted in the behavior of commonly used linkers at the time the
C standard was first written, why the standard was written to give code
like this undefined behavior.

 
Geoff
      01-29-2013
On Tue, 29 Jan 2013 15:27:50 +0800, wy <(E-Mail Removed)> wrote:

>This code can't be compiled without errors.
>/* test.c */
>#include <stdio.h>
>typedef char data_str;
>
>data_str _data_[3], *data = _data_;
>void data_pointer_print(){
> printf("_data_ address 1st: %p\n", _data_);
> printf(" data address 2nd: %p\n", data );
>}
>
>extern data_str data[3];
>int main(){
> data_pointer_print();
> printf(" data address 3rd: %p\n", data );
>}
>
>I understand the error "conflicting types for ‘data’".


OK, if you think you understand the error, then why do you apply the wrong fix?

You have declared the same 'data' object of file scope, twice. Delete the second
declaration (extern data_str data[3];) and the program will compile and behave
correctly.

The first declaration defines and instantiates the object and reserves the space
for it. The second declaration declares some object, 'data', as external,
meaning the translator expects to encounter the instantiation at link time, but
your code doesn't do that.

>But after splitting the code into several files,
>it is compiled successfully, and have an undesired result.
>


Yes, because you applied a "fix" (below) that doesn't eliminate the cause of the
error.

>/* data.h */
>#include <stdio.h>
>typedef char data_str;
>extern void datastr_pointer_print();
>
>/* data.c */
>#include "data.h"
>data_str _data_[3], *data = _data_;
>void data_pointer_print(){
> printf("_data_ address in data.c: %p\n", _data_);
> printf(" data address in data.c: %p\n", data );
>}
>
>/* test.c */
>#include "data.h"
>extern data_str data[3];
>int main(){
> data_pointer_print();
> printf(" data address in test.c: %p\n", data );
>}
>
>And this is the result:
>$ gcc test.c data.c
>$ ./a.out
>_data_ address in data.c: 0x6009b8
> data address in data.c: 0x6009b8
> data address in test.c: 0x6009a0
>
>Here we see, it's compiled with no errors and warnings,
>and the data addresses in data.c and test.c are different.
>Why are they different and
>why doesn't the C compiler prohibit this kind of use?


They are different because the translator creates two different 'data' objects
that the linker must treat separately.

 
BartC
      01-29-2013


"wy" <(E-Mail Removed)> wrote in message
news:ke8dgc$uc4$(E-Mail Removed)...
> On 01/29/2013 05:45 PM, glen herrmannsfeldt wrote:
>
>> The two are compiled separately and, separately, are
>> legal C. The linker doesn't test for the error for
>> mostly historical reasons.
>>
>> It isn't legal C, but the compiler can't tell.
>>

>
> I see.
> It's the linker's fault.


Not really. The linker only matches the values of symbols. 'data' in data.c
and 'data' in test.c are resolved by the linker to the same address. It
doesn't know about types.

But data.c treats its 'data' as a pointer (and prints the value of the
pointer: the contents of that address), while test.c treats 'data' as an
array (and prints the address of element 0 of the array, which is that
address itself).

To avoid problems, make sure the two files share the same declaration of
'data'.

--
Bartc

 
Andrey Tarasevich
      01-29-2013
On 1/28/2013 11:27 PM, wy wrote:
>
> Here we see, it's compiled with no errors and warnings,
> and the data addresses in data.c and test.c are different.
> Why are they different and
> why doesn't the C compiler prohibit this kind of use?
>


Historically, C compilers are built on the principle of "independent
translation". The compiler proper sees and compiles each translation
unit independently, without any knowledge of any other translation units.

For this reason, the compiler proper cannot detect errors that are
caused by inconsistencies between different translation units. The
language specification explicitly gives compilers the freedom to
ignore such errors, i.e. it allows them to generate invalid code without
issuing any diagnostic messages.

The only part of the compiler that actually sees the program in its
entirety (in a typical implementation) is called the linker. So the linker
is actually in a position to detect such errors. But in a typical
implementation, by the time the program gets to the linking stage, the
additional information needed for error detection is already lost
irreversibly.

Hence the responsibility to observe the inter-translation-unit
relationships rests entirely with the user. If you fail to observe them,
you'll typically end up with a program whose behavior is undefined.

It doesn't have to be that way. It is really a quality-of-implementation
issue. Some compilers might decide to go the extra mile and take extra
steps in order to detect such errors. However, most compilers take
the liberty provided by the language specification and leave such errors
undiagnosed.

--
Best regards,
Andrey Tarasevich
 
Shao Miller
      01-29-2013
On 1/29/2013 12:53, James Kuyper wrote:
> There's a reason,
> soundly rooted in the behavior of commonly used linkers at the time the
> C standard was first written, why the standard was written to give code
> like this undefined behavior.
>


There might be a reason, but there definitely isn't a citation, here.

--
- Shao Miller
--
"Thank you for the kind words; those are the kind of words I like to hear.

Cheerily," -- Richard Harter
 
James Kuyper
      01-29-2013
On 01/29/2013 04:21 PM, glen herrmannsfeldt wrote:
....
> Not that I really know "What they thought when they did it" but
> it seems to me that we are stuck with what Fortran could do 50 years
> ago, on much smaller computers. That, and that people don't write a
> new linker for each new language.


We're not completely stuck. Nothing prevents implementations of C from
taking advantage of the capabilities of more modern linkers, and
diagnosing such problems. If linkers with the required capabilities
become common enough, and enough implementations of C choose to take
advantage of that fact, the standard might eventually be modified to
make diagnosis of such problems mandatory. However, don't expect
anything like that to happen any time soon (possibly not for several
decades).
 
Geoff
      01-29-2013
On Tue, 29 Jan 2013 12:53:01 -0500, James Kuyper <(E-Mail Removed)>
wrote:

>On 01/29/2013 01:02 PM, Geoff wrote:
>> On Tue, 29 Jan 2013 15:27:50 +0800, wy <(E-Mail Removed)> wrote:

>...
>> You have declared the same 'data' object of file scope, twice. Delete the second
>> declaration (extern data_str data[3];) and the program will compile and behave
>> correctly.

>
>[After Wy split the two different definitions into separate source files:]
>...
>>> Here we see, it's compiled with no errors and warnings,
>>> and the data addresses in data.c and test.c are different.
>>> Why are they different and
>>> why doesn't the C compiler prohibit this kind of use?

>>
>> They are different because the translator creates two different 'data' objects
>> that the linker must treat separately.

>
>There's no "must" about it. As far as the C standard is concerned, the
>behavior is simply undefined; the linker can do anything it wants. As
>far as reality goes, there have been real linkers that would place both
>'data' objects in the same location; in C terms they would behave as if
>they were members of an unnamed union of the two different
>declarations. The compiler doesn't know that the two 'data's share the
>same memory, and presumably neither does the author of the code, so the
>typical result using such a linker would be disaster.


Good point. There is nothing in the C standard that defines what must happen in
the multiple-translation-unit redefinition case. I should have said:

They are different because the translator creates two different 'data' objects
that your linker treats separately. This is UB in C and can cause unexpected
behavior at run time.

>
>As if that's not bad enough, it's not necessarily the case that the
>linker would set aside enough space for the larger of the two
>declarations; it might set aside enough space for whichever one it
>processed first, which could be the smaller one. There's a reason,
>soundly rooted in the behavior of commonly used linkers at the time the
>C standard was first written, why the standard was written to give code
>like this undefined behavior.


His initial case, the single-file multiple definition, was a constraint
violation and the compiler correctly indicated the error. The OP eliminated the
diagnostic by splitting the file into separate translation units, but this
introduced UB.
 