Re: Who owns the variable in my header file ?

 
 
Ben Bacarisse
      10-04-2012
lipska the kat <> writes:
<snip>
> Given a program written in C, how does one determine that it is
> 'correct' if complying with requirements and returning the same output
> from the same input is not enough.


There are a few tools that can help. For example, there's valgrind (and
other similar things) that can check all of your memory accesses as you
run your tests. But there are many other things that can be wrong but
which appear to work. One general tool is to get into the habit of
reasoning about your programs.

Testing is very helpful of course, but I'd venture to say that the
balance between treating programming as a formal mathematical activity
and treating it like engineering has tended, in recent years, to
downplay the mathematical side to the detriment of the field.

--
Ben.
 
Keith Thompson
      10-04-2012
lipska the kat <> writes:
> On 04/10/12 09:30, Keith Thompson wrote:
>> lipska the kat<> writes:
>> [...]

>
> [snip]
>
>> C, as the saying goes, gives you enough rope to shoot yourself in the
>> foot. I'll show you a concrete example:

>
> [snip]
>
> gcc example.c
> example.c: In function ‘write_array’:
> example.c:4:9: error: ‘for’ loop initial declarations are only allowed
> in C99 mode
> example.c:4:9: note: use option -std=c99 or -std=gnu99 to compile your code
> example.c: In function ‘read_array’:
> example.c:10:9: error: ‘for’ loop initial declarations are only allowed
> in C99 mode
> make: *** [example] Error 1
>
> gcc -ansi example.c
> ditto above


Right, I used a C99-specific feature, and gcc with no arguments, or with
"-ansi", doesn't implement C99. You can avoid that by using "-std=c99",
or by changing

for (int i = 0; i <= 5; i ++) {
/* ... */
}

to:

int i;
for (i = 0; i <= 5; i ++) {
/* ... */
}

Note that the "int i;" declaration has to be at the top of the block,
before any statements (a C90 restriction that C99 removed).
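
A minimal sketch of the C90-compatible form in a complete function. The helper name fill_array is hypothetical and is not taken from the original example.c; the point is only that the counter is declared at the top of the block, so the code should compile with "gcc -ansi" as well as with "-std=c99".

```c
#include <stddef.h>

/* Hypothetical helper, not part of the original example.c.
   Stores 0, 1, ..., n-1 into a[0..n-1] and returns the number of
   elements written. */
size_t fill_array(int *a, size_t n)
{
    size_t i;   /* C90: declarations precede any statement in the block */

    for (i = 0; i < n; i++) {
        a[i] = (int)i;
    }
    return n;
}
```

Calling fill_array(a, 6) on an array declared as "int a[6];" stays within bounds because the loop condition is i < n rather than i <= n.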


> gcc -std=c99 -Wall example.c
> example.c: In function ‘main’:
> example.c:19:13: warning: unused variable ‘z’ [-Wunused-variable]
> example.c:17:13: warning: unused variable ‘x’ [-Wunused-variable]
>
> gcc -std=c99 -O1 -Wall example.c
> ditto above
>
> gcc -std=c99 -O2 -Wall example.c
> example.c: In function ‘main’:
> example.c:19:13: warning: unused variable ‘z’ [-Wunused-variable]
> example.c:17:13: warning: unused variable ‘x’ [-Wunused-variable]
> example.c:5:20: warning: array subscript is above array bounds
> [-Warray-bounds]
>
> gcc -std=c99 -O3 -Wall example.c
> example.c: In function ‘main’:
> example.c:19:13: warning: unused variable ‘z’ [-Wunused-variable]
> example.c:17:13: warning: unused variable ‘x’ [-Wunused-variable]
> example.c:5:20: warning: array subscript is above array bounds
> [-Warray-bounds]
> example.c:11:19: warning: array subscript is above array bounds
> [-Warray-bounds]


Yes, all those warnings are valid. An "unused variable" warning
doesn't mean that your program is wrong, it just means that you've
probably made a logical error.

The "array subscript is above array bounds" is more serious. As I
said, I deliberately wrote a program whose behavior is undefined;
this absolutely was *not* an example of what you should do.

The program attempts to store values outside the bounds of an array.
I added extra array declarations to try to give those accesses
somewhere to go.

> 0 1 2 3 4 5 every time
>
> Now I'm really confused


The program's behavior is undefined. Printing 0 1 2 3 4 5 every
time is therefore perfectly valid, since nothing in the standard
says it *shouldn't* behave that way.

If you go beyond what the standard actually says, there are reasons
why it behaves the way it does. The arrays x, y, and z probably
happen to be stored next to each other in memory. Writing past
the end of y probably clobbers the beginning of either x or z.
Since x, y, and z are all in memory that you "own", the program is
able to do that with no apparent problem.

A more stringent compiler might have caused the program to crash
when it tried to store past the end of y. Most compilers don't
do that because it requires explicit checking, which is expensive
(it would catch incorrect programs, but slow down correct programs).
A cynic might say that C compilers are designed to let you get your
wrong answers as quickly as possible.
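
To make the cost concrete, here is a rough sketch of what explicit checking looks like when written by hand. The function name checked_store and its return convention are assumptions made up for illustration; no compiler is claimed to generate exactly this.

```c
#include <stddef.h>

/* Hypothetical checked store: returns 0 on success, or -1 if idx is
   out of range.  The comparison executed on every access is the
   overhead being discussed. */
int checked_store(int *a, size_t len, size_t idx, int value)
{
    if (idx >= len) {
        return -1;      /* refuse instead of writing past the end */
    }
    a[idx] = value;
    return 0;
}
```

A correct program pays for the comparison on every store even though it never fails, which is the tradeoff described above.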

> Maybe I should be reading the C99 spec %-}


It's not easy reading, and there's not really anything in it that
explains the way this program behaves. As far as the standard is
concerned, running that program could make demons fly out of your
nose (obviously that won't really happen, but nothing in the C
standard forbids it).

It's entirely possible that my example was more complex than it
should have been. If you don't understand it, don't worry about it
too much for now. Perhaps you should concentrate more on writing
correct code than on understanding incorrect code.

--
Keith Thompson (The_Other_Keith) kst- <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Nick Keighley
      10-06-2012
On Oct 4, 1:30 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> On 10/04/2012 04:02 AM, lipska the kat wrote:
> > On 04/10/12 05:34, James Kuyper wrote:



> >> That's a bad assumption. One of the most common ways in which code with
> >> undefined behavior actually behaves is to produce exactly the same
> >> result that you incorrectly assume that it's required to produce. That's
> >> because your assumptions happen to match decisions made by the
> >> implementors of the version of C that you're testing with. Other
> >> implementors of C are free to make different decisions, ones that are
> >> incompatible with your incorrect assumptions.

>
> > Er ... wow, OK, that is a bit of a head****
> > Do you mean to say that even if I test my program to destruction and as
> > far as I can tell it's 'correct', that is it complies with requirements
> > and behaves as expected it could still be incorrect when compiled with
> > a different compiler ???

>
> Certainly. That's not just because of undefined behavior, either.
> There's also behavior that is merely unspecified: the standard provides
> (explicitly or, more commonly, implicitly) a list of possible behaviors,
> and each implementation gets to choose from that list - in some cases,
> it can even make a different choice each time a given piece of code is
> executed. Some unspecified behavior is "implementation-defined" which
> means that an implementation is required to document which choice it has
> made, but there's also a lot of cases where there's no such requirement.
>
> > Surely there is some 'base' implementation of C that is used to test
> > compilers ..

>
> No, there is not. Even if there were, the base implementation would have
> to make specific choices in every case where the C standard leaves the
> behavior unspecified or undefined, and other fully-conforming
> implementations of C would not be required to make the same choices,
> which greatly reduces the usefulness of having a base implementation.
> That may be one reason why there isn't one.
>
> > ... or is it a free for all ...

>
> It's not a free-for-all - the standard does impose a great many specific
> requirements. However, the things that it does not specify are what
> gives implementors sufficient freedom to create a conforming
> implementation of C on almost every platform. That is the reason why C
> is one of the most widely implemented of all computer languages.
>
> > ... to me this implies that there can
> > be more than one 'correct' implementation of the C language,

>
> Correct - the set of possible fully-conforming implementations of the C
> language is infinite. The set of actual fully-conforming implementations
> is much smaller, but still large enough that it's not feasible to test
> any given program on all of them. It's also sufficiently varied that
> testing on only a few dozen of them is insufficient to prove that your
> code will work on all of the untested ones.


<snip>

As someone remarked, this business with "undefined behaviour" is true
of pretty much all programming languages (I'm not convinced Godel has
anything to contribute to this). To some extent C stresses it more;
this is partly because C runs nearly everywhere and has huge numbers
of implementations.

Languages like Perl and Python have less trouble with this as there
are actually very few implementations. Java sidesteps it by running
on a virtual machine. In a sense Java is utterly non-portable as it
only runs on one platform (the JVM)! Java also nails down many things
that C doesn't, such as order of expression evaluation and size of
fundamental types. Some languages such as Ada had extensive test
suites to validate compilers, but such things are very expensive to
maintain.
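
A small sketch of the evaluation-order point, with made-up names (note, sum_in_some_order, first_call). In C the order in which the two operands of + are evaluated is unspecified, so either call below may run first; the sum is the same either way, but the recorded order may differ between implementations.

```c
#include <stddef.h>

static int call_log[2];     /* records the order in which the calls ran */
static size_t call_count;

static int note(int id)
{
    call_log[call_count++] = id;
    return id;
}

/* The operands of + may be evaluated in either order, so note(1) or
   note(2) may run first; the sum is 3 in both cases. */
int sum_in_some_order(void)
{
    call_count = 0;
    return note(1) + note(2);
}

/* Returns the id of the call that ran first in the most recent sum. */
int first_call(void)
{
    return call_log[0];
}
```

A program that depends on first_call() returning 1 can pass every test on one compiler and still be wrong on another.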
 
Les Cargill
      10-06-2012
Nick Keighley wrote:
> On Oct 4, 1:30 pm, James Kuyper <jameskuy...@verizon.net> wrote:
>> On 10/04/2012 04:02 AM, lipska the kat wrote:
>>> On 04/10/12 05:34, James Kuyper wrote:

>
>
>>>> That's a bad assumption. One of the most common ways in which code with
>>>> undefined behavior actually behaves is to produce exactly the same
>>>> result that you incorrectly assume that it's required to produce. That's
>>>> because your assumptions happen to match decisions made by the
>>>> implementors of the version of C that you're testing with. Other
>>>> implementors of C are free to make different decisions, ones that are
>>>> incompatible with your incorrect assumptions.

>>
>>> Er ... wow, OK, that is a bit of a head****
>>> Do you mean to say that even if I test my program to destruction and as
>>> far as I can tell it's 'correct', that is it complies with requirements
>>> and behaves as expected it could still be incorrect when compiled with
>>> a different compiler ???

>>
>> Certainly. That's not just because of undefined behavior, either.
>> There's also behavior that is merely unspecified: the standard provides
>> (explicitly or, more commonly, implicitly) a list of possible behaviors,
>> and each implementation gets to choose from that list - in some cases,
>> it can even make a different choice each time a given piece of code is
>> executed. Some unspecified behavior is "implementation-defined" which
>> means that an implementation is required to document which choice it has
>> made, but there's also a lot of cases where there's no such requirement.
>>
>>> Surely there is some 'base' implementation of C that is used to test
>>> compilers ..

>>
>> No, there is not. Even if there were, the base implementation would have
>> to make specific choices in every case where the C standard leaves the
>> behavior unspecified or undefined, and other fully-conforming
>> implementations of C would not be required to make the same choices,
>> which greatly reduces the usefulness of having a base implementation.
>> That may be one reason why there isn't one.
>>
>>> ... or is it a free for all ...

>>
>> It's not a free-for-all - the standard does impose a great many specific
>> requirements. However, the things that it does not specify are what
>> gives implementors sufficient freedom to create a conforming
>> implementation of C on almost every platform. That is the reason why C
>> is one of the most widely implemented of all computer languages.
>>
>>> ... to me this implies that there can
>>> be more than one 'correct' implementation of the C language,

>>
>> Correct - the set of possible fully-conforming implementations of the C
>> language is infinite. The set of actual fully-conforming implementations
>> is much smaller, but still large enough that it's not feasible to test
>> any given program on all of them. It's also sufficiently varied that
>> testing on only a few dozen of them is insufficient to prove that your
>> code will work on all of the untested ones.

>
> <snip>
>
> As someone remarked this business with "undefined behaviour" is true
> of pretty much all programming languages (I'm not convinced Godel has
> anything to contribute to this). To some extent C stresses it more,
> this is partly because C runs nearly everywhere and has huge numbers
> of implementations.
>
> Languages like Perl and Python have less trouble with this as there
> are actually very few implementations. Java sidesteps it by running
> on a virtual machine.


Perl and Python, being interpreted, also have a "virtual machine"
each.

> In a sense Java is utterly non-portable as it
> only runs on one platform (the JVM)! Java also nails down many things
> that C doesn't, such as order of expression evaluation and size of
> fundamental types. Some languages such as Ada had extensive test
> suites to validate compilers; but such things are very expensive to
> maintain.
>


--
Les Cargill
 
Richard Damon
      10-07-2012
On 10/6/12 5:30 AM, Nick Keighley wrote:

> As someone remarked this business with "undefined behaviour" is true
> of pretty much all programming languages (I'm not convinced Godel has
> anything to contribute to this). To some extent C stresses it more,
> this is partly because C runs nearly everywhere and has huge numbers
> of implementations.
>
> Languages like Perl and Python have less trouble with this as there
> are actually very few implementations. Java sidesteps it by running
> on a virtual machine. In a sense Java is utterly non-portable as it
> only runs on one platform (the JVM)! Java also nails down many things
> that C doesn't, such as order of expression evaluation and size of
> fundamental types. Some languages such as Ada had extensive test
> suites to validate compilers; but such things are very expensive to
> maintain.
>


Undefined behavior is allowed in C to provide for (significantly)
improved efficiency in some operations. Take, for example, accessing an
array beyond its bounds. If we removed pointers into arrays (and passing
arrays with unspecified bounds), then the compiler could easily add code
to check the subscripts to the array and trap on error conditions. If we
want to support pointers into arrays, then these pointers could also be
made "fatter" to include the bounds of the object they point to (and for
multidimensional arrays, the bounds for each of the larger arrays the
array is part of). This adds significant overhead to the pointer and the
operations. Since the design goal of C was to favor creating efficient
code, to make it a reasonable replacement for assembly code, the
tradeoffs tend to be made in favor of efficiency over catching "bad"
code. Many other languages have chosen to limit the realm of undefined
behavior by defining what is supposed to happen, forcing the compiler
to possibly generate less efficient (but more predictable) code.
 
Ian Collins
      10-07-2012
On 10/07/12 14:19, Gordon Burditt wrote:
> It is very easy to write a program in C that deliberately crashes


Are you replying to someone or posting random musings?

--
Ian Collins
 
BartC
      10-07-2012


"Richard Damon" <> wrote in message
news:k4qm0b$jr0$...
> On 10/6/12 5:30 AM, Nick Keighley wrote:
>
>> As someone remarked this business with "undefined behaviour" is true
>> of pretty much all programming languages (I'm not convinced Godel has
>> anything to contribute to this). To some extent C stresses it more,
>> this is partly because C runs nearly everywhere and has huge numbers
>> of implementations.


> If we removed pointers into arrays (and passing
> arrays with unspecified bounds), then the compiler could easily add code
> to check the subscripts to the array and trap on error conditions. If we
> want to support pointers into arrays, then these pointers could also be
> made "fatter" to include the bounds of the object they point to (and for
> multidimensional arrays, the bounds for each of the larger arrays the
> array is part of).


Arrays can have any number of dimensions, so it would be highly
impractical for each of a thousand possible pointers into an array to
duplicate its half-dozen or dozen dimensions. You would likely also need
different pointers for each of the sub-dimensions.

And for an array whose dimensions are not realised until runtime, or for
'ragged' arrays where the bounds vary through the array, how would
such a pointer be initialised? Other languages would tend to build the
bounds into the arrays themselves.

In any case, C allows pointers into all sorts of objects, including
non-arrays, or a single element of that multi-dimensional array, or to cast
one type of pointer into another; you wouldn't then be able to step or do
arithmetic on such a pointer, without by-passing the bounds checking.

So 'undefined behaviour', if it's as simple as having the wrong value in a
pointer, is built-in to the language!

(For single-dimensional arrays, a 'fat' pointer containing exactly one
bound, could work, provided they are a new explicit type in addition to
regular pointers. Then an array allocator could return such a pointer, which
can be passed to functions and would carry its length for use by programs,
and could optionally be used for bounds checking by internal code. But for
multi-dimensions, it gets complicated...)
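
A rough sketch of the single-bound idea, written as an ordinary struct since C has no such built-in type. Everything here (struct int_span, span_alloc, span_at, span_free) is a hypothetical name invented for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "fat" pointer: the data pointer plus its single bound. */
struct int_span {
    int   *data;
    size_t len;
};

/* Allocates len zero-initialised ints; on failure data is NULL and
   len is 0. */
struct int_span span_alloc(size_t len)
{
    struct int_span s;

    s.data = calloc(len, sizeof *s.data);
    s.len = (s.data != NULL) ? len : 0;
    return s;
}

/* Optional bounds check: returns a pointer to element idx, or NULL
   if idx is out of range. */
int *span_at(struct int_span s, size_t idx)
{
    return (idx < s.len) ? &s.data[idx] : NULL;
}

/* Releases the storage and resets the span to empty. */
void span_free(struct int_span *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

Code that trusts its indices can still use s.data[i] directly; the length travels with the pointer either way, so the check in span_at stays optional.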

> This adds significant overhead to the pointer and the
> operations.


Not if the alternative is to have to always pass the length of the array
together with a pointer to the array. Having bounds-checking code inserted
would be an extra overhead, but that can be optional.

--
Bartc

 
James Kuyper
      10-07-2012
On 10/07/2012 06:40 AM, BartC wrote:
>
>
> "Richard Damon" <> wrote in message
> news:k4qm0b$jr0$...

....
>> If we removed pointers into arrays (and passing
>> arrays with unspecified bounds), then the compiler could easily add code
>> to check the subscripts to the array and trap on error conditions. If we
>> want to support pointers into arrays, then these pointers could also be
>> made "fatter" to include the bounds of the object they point to (and for
>> multidimensional arrays, the bounds for each of the larger arrays the
>> array is part of).

>
> Arrays can have any number of dimensions, so it would be highly
> impractical for each of a thousand possible pointers into an array to
> duplicate its half-dozen or dozen dimensions. You would likely also need
> different pointers for each of the sub-dimensions.


None of that matters; only one range is needed at any given time, and it
can be modified whenever changing levels in the multidimensional array.
Whenever an lvalue of array type gets converted to a pointer to its
element type, that pointer can be given a range corresponding to the
beginning and ending of the array. It doesn't matter whether the element
type is itself an array type; that only comes into play when an lvalue
of the element type is converted to a pointer to its first element, at
which point the same rule applies, giving the pointer a different range.

> And for an array whose dimensions are not realised until runtime, or for
> 'ragged' arrays where the bounds vary through the array, how would
> such a pointer be initialised?


In C, ragged arrays can only be implemented by allocating each row from
a larger memory space. If the allocation is handled by malloc(), then
the bounds can be inserted at the time malloc() is called. If the user
code allocates one large array, and then fills in an array of pointers
to irregularly-sized pieces of that array, there's no way for the C
compiler to know what the bounds are; it will necessarily use only the
bounds of the big array.

> Other languages would tend to build the
> bounds into the arrays themselves.
>
> In any case, C allows pointers into all sorts of objects, including
> non-arrays,


That poses no problems - the C standard specifies that a pointer to a
non-array object can be treated as a pointer to the first and only
element of a 1-element array of the object's type.
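
A tiny sketch of that rule, using a made-up function name. For pointer arithmetic a single object behaves like a 1-element array, so p + 1 is a valid one-past-the-end pointer that may be computed, compared and subtracted, though not dereferenced.

```c
#include <stddef.h>

/* Hypothetical helper: p points to a single int object.  p + 1 is the
   one-past-the-end pointer of the implied 1-element array, so the
   difference is well defined.  Dereferencing end would not be. */
ptrdiff_t elements_in_object(const int *p)
{
    const int *end = p + 1;

    return end - p;
}
```

A bounds-carrying pointer to a plain "int x;" could therefore use &x and &x + 1 as its range.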

> ... or a single element of that multi-dimensional array,


That poses no problem, either; the bounds for the pointer to the single
element are the bounds for the array from which it was selected. If the
programmer wants to restrict the permitted range more tightly than that,
the C language currently provides no mechanism for doing so, though
*(element_type (*)[n])element_pointer seems a plausible mechanism that
could be used to tell the compiler to treat it as though it came from an
n-element array (I do NOT claim that the current standard endorses any
such use of this construct).

This construct could also be used to tell the compiler what bounds to
use when filling in a ragged array from a single large array.
--
James Kuyper
 
Chicken McNuggets
      10-07-2012
On 05/10/2012 08:44, lipska the kat wrote:
>
> I understand it perfectly well, I just think if someone makes the effort
> to reply to my question I should make the effort to respond.
> One thing I learned from this and other posts is that C99 is probably a
> better choice for me than earlier standards. I added the option you
> suggested to my gcc commands along with -Wall, ran make on all my current
> code and waited for the explosion (of demons perhaps) ... but nothing
> really of note appeared. There was one warning but that was about it.
> Most gratifying.
>
> Thanks for taking the time to reply
>
> lipska
>


When compiling with GCC you probably want to add the -Wextra and
-pedantic flags as well to your compilation command.
 
James Kuyper
      10-07-2012
On 10/07/2012 01:22 PM, lipska the kat wrote:
> On 07/10/12 02:19, Gordon Burditt wrote:
>> It is very easy to write a program in C that deliberately crashes
>> (here this means: calls abort()) under conditions which you

>
> [snip]
>
>> - Crashes only when calling asctime() and the year is greater
>> than 9,999 (Y10K bug in the *definition* of asctime()).

>
> Well if I have any code running > 9999 then I'll consider it a bit of a


Well, the issue is also relevant to code that computes future times. I
admit, the need to determine calendar dates that far in the future is
quite small - but it's not non-existent.

The problem with asctime() is that it's the only C standard library
function whose behavior is defined entirely by example code (7.27.3.1p2)
showing how it could be implemented. asctime() provides a prime example
of why that's a bad idea. It can be deduced from that example code that
asctime() has undefined behavior if:

timeptr->tm_wday < 0 || timeptr->tm_wday > 6 ||
timeptr->tm_mon < 0 || timeptr->tm_mon > 11 ||
timeptr->tm_year < -2899 || timeptr->tm_year > 8099

The limits on tm_wday and tm_mon are due to their use as array indices;
the limit on tm_year is imposed by the fact that the call to sprintf()
will overflow the provided buffer. Even assuming that the date being
represented is between year 1000 and year 9999, you'll still get a
buffer overflow if

timeptr->tm_mday < -9 || timeptr->tm_mday > 99 ||
timeptr->tm_hour < -9 || timeptr->tm_hour > 99 ||
timeptr->tm_sec < -9 || timeptr->tm_sec > 99

However, until C2011, it was nowhere explicitly stated that this is the
case. In C2011, 7.27.3.1p3 was added, which says that the behavior is
undefined if (in effect) timeptr->tm_year < -900 || timeptr->tm_year >
8099, or any of the other fields are outside their normal range, as
defined in 7.27.1p4; this is more restrictive than the constraints I
deduced above.

asctime() doesn't have to be unsafe - the example code is only an
example. Undefined behavior allows, as one possibility, that asctime()
is implemented more safely than in the example code. It could return a
null pointer when tm_wday or tm_mon are out of range, or it could choose
a special month/day name (such as "INV"). It could also return a null
pointer instead of producing a buffer overflow, or it could use a
buffer large enough to avoid any possibility of overflow.
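
As a sketch of the "return a null pointer" option, here is a wrapper that rejects out-of-range fields before formatting. The name checked_asctime and its buffer-passing interface are assumptions for illustration, not a standard function; the range checks follow the normal ranges and the 1000..9999 year limit described above, and snprintf() requires C99 or later.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical wrapper, not a standard function.  Writes the asctime()
   format into buf and returns buf.  Returns NULL if any field is
   outside its normal range, if the year is outside 1000..9999, or if
   buf is too small (26 bytes are needed, including the terminator). */
char *checked_asctime(const struct tm *tp, char *buf, size_t size)
{
    static const char wday_name[7][4] = {
        "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
    };
    static const char mon_name[12][4] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    int n;

    if (tp->tm_wday < 0 || tp->tm_wday > 6  ||
        tp->tm_mon  < 0 || tp->tm_mon  > 11 ||
        tp->tm_mday < 1 || tp->tm_mday > 31 ||
        tp->tm_hour < 0 || tp->tm_hour > 23 ||
        tp->tm_min  < 0 || tp->tm_min  > 59 ||
        tp->tm_sec  < 0 || tp->tm_sec  > 60 ||
        tp->tm_year < -900 || tp->tm_year > 8099) {
        return NULL;
    }
    /* snprintf never writes more than size bytes, so a short buffer is
       reported through the return value instead of overflowing. */
    n = snprintf(buf, size, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
                 wday_name[tp->tm_wday], mon_name[tp->tm_mon],
                 tp->tm_mday, tp->tm_hour, tp->tm_min, tp->tm_sec,
                 1900 + tp->tm_year);
    return (n < 0 || (size_t)n >= size) ? NULL : buf;
}
```

The caller supplies the buffer here, which also avoids the shared static buffer of the example implementation; that is a design choice of this sketch, not something the standard requires.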
--
James Kuyper
 