Velocity Reviews - Computer Hardware Reviews

-2147483648 and gcc optimisation, all sorts of different results

 
 
tom_usenet@optusnet.com.au
 
      03-11-2010
I'm surprised at the different results I can get from code with and
without optimisation where overflow is involved.

I suspect this was "done to death" about 20 years ago, but
I can't find anything in comp.lang.c matching this. Is this
the "if you overflow the compiler can do whatever it likes"
clause?

The original problem I was trying to solve is why a simple
embedded "printf("%ld")" was printing random garbage when
handed -2147483648.

Here are the results for the following simple program with different
optimisation levels, in all cases "-O1" gives sane results and "-O2"
gives remarkably creative results.

The problems are:

1 - With "-O2" the second test in the following is optimised out,
as it "can't be true". Unless num == -2147483648, in which
case it IS true, and is what I'm trying to correct for in
the function I was trying to fix.

if (num < 0L) num = -num;
if (num < 0L) stillNeg = 1;

2 - "-2147483647 + 1 == -2147483647" ???

3 - "-2147483647 == --2147483647" ???

This is the "sane" one without optimisation.

$ gcc --version
gcc (GCC) 4.3.4 20090804 (release) 1

$ gcc -Wall -O1 -o ox2 ox2.c
$ ./ox2
Function calls
2147483646
2147483647
-2147483648 Still negative
2147483647
2147483646
For loop
num = 2147483646 vnum = 2147483646
num = 2147483647 vnum = 2147483647
num = -2147483648 neg vnum = -2147483648 neg
num = 2147483647 vnum = 2147483647
num = 2147483646 vnum = 2147483646

This is the insane one with optimisation.

$ gcc -Wall -O2 -o ox2 ox2.c
$ ./ox2
Function calls
2147483646
2147483647
-2147483648 *** Missed the second test ***
2147483647
2147483646
For loop
num = 2147483646 vnum = 2147483646
num = 2147483647 vnum = 2147483647
num = 2147483647 vnum = -2147483648
num = 2147483647 vnum = -2147483647
num = 2147483647 vnum = -2147483646

^^ That got stuck ^^ ^^ That is negating when it shouldn't ^^

Here's the test code. Apologies for the "crammed style", I don't
usually write code that looks this bad:

#include <stdio.h>

void tneg(long num);
void tneg(long num)
{
    int stillNeg = 0;
    if (num < 0L) num = -num;
    if (num < 0L) stillNeg = 1;
    printf("%ld %s\n", num, (stillNeg) ? " Still negative" : "");
}

int main(int argc, char **argv)
{
    long count;
    long test, vtest, num, vnum;

    printf("Function calls\n");
    tneg((long)0x7ffffffe);
    tneg((long)0x7fffffff);
    tneg((long)0x80000000);
    tneg((long)0x80000001);
    tneg((long)0x80000002);

    printf("For loop\n");
    vnum = (argc == 5) ? 5 : 0x7ffffffe;
    num = 0x7ffffffe;
    for (count = 0; count < 5; count++)
    {
        int stillNeg = 0, vstillNeg = 0;

        test = num; vtest = vnum;
        if (test < 0) test = -test;
        if (test < 0) stillNeg = 1;
        if (vtest < 0) vtest = -vtest;
        if (vtest < 0) vstillNeg = 1;
        printf("num = %ld %s ", test, (stillNeg) ? " neg" : " ");
        printf("vnum = %ld %s\n", vtest, (vstillNeg) ? " neg" : "");
        num += 1; vnum += 1;
    }
    return 0;
}
 
 
 
 
 
Ben Bacarisse
 
      03-11-2010
"(E-Mail Removed)" <(E-Mail Removed)> writes:

> I'm surprised at the different results I can get from code with and
> without optimisation where overflow is involved.
>
> I suspect this was "done to death" about 20 years ago, but
> I can't find anything in comp.lang.c matching this. Is this
> the "if you overflow the compiler can do whatever it likes"
> clause?


Looks like it, yes.

> The original problem I was trying to solve is why a simple
> embedded "printf("%ld")" was printing random garbage when
> handed -2147483648.


That sounds quite different. For the "long int" that you seem to be
using it would be a library bug for printf("%ld", x) to print anything
but -2147483648. Did you mean that printf is being handed something
apparently random when you expected it to be handed -2147483648?

> Here are the results for the following simple program with different
> optimisation levels, in all cases "-O1" gives sane results and "-O2"
> gives remarkably creative results.
>
> The problems are:
>
> 1 - With "-O2" the second test in the following is optimised out:
> as it "can't be true". Unless (num == -214748364 in which
> case it IS true, and is what I'm trying to correct for in
> the function I was trying to fix.
>
> if (num < 0L) num = -num;
> if (num < 0L) stillNeg = 1;


From the point of view of the C language it's not quite that "it can't
be true" -- it's more a case of "either it's true or undefined
behaviour has occurred". The net effect is the same, in that the
compiler is using this undefined behaviour as permission to conclude
that the second test is redundant.

I hope that does not sound like too much splitting of hairs. It's
useful to distinguish between what the C standard says and what a
compiler decides to do as a result.

> 2 - "-2147483647 + 1 == -2147483647" ???


I don't see where this happens in your code. If -2147483647 is
representable in your long int type (and it is from your example below)
then the above would be a bug. -2147483647 + 1 must be -2147483646.

[Guessing here: did you mean "2147483647 + 1 == 2147483647"? If so,
the compiler can do pretty much what it likes since 2147483647 + 1 is
undefined with the types you are using.]
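(A standard way to make such a check survive the optimiser is to compare against LONG_MAX before adding, so the test is on in-range values and never relies on overflow having happened. A sketch; checked_inc is a made-up name:)

```c
#include <limits.h>

/* Increment *num only if it won't overflow.  The comparison involves
   no overflow on any path, so the compiler has no licence to argue
   it away as "can't happen". */
static int checked_inc(long *num)
{
    if (*num == LONG_MAX)
        return 0;               /* refuse: *num is left unchanged */
    *num += 1;
    return 1;
}
```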

> 3 - "-2147483647 == --2147483647" ???


Due to C's parsing rules, -- is not the same as - - but I know what you
are saying here.

With normal 32-bit integers, if you see that -(-2147483647) !=
2147483647 then you would have a bug but, again, I don't see that in
your code.

> This is the "sane" one without optimisation.
>
> $ gcc --version
> gcc (GCC) 4.3.4 20090804 (release) 1
>
> $ gcc -Wall -O1 -o ox2 ox2.c
> $ ./ox2
> Function calls
> 2147483646
> 2147483647
> -2147483648 Still negative
> 2147483647
> 2147483646
> For loop
> num = 2147483646 vnum = 2147483646
> num = 2147483647 vnum = 2147483647
> num = -2147483648 neg vnum = -2147483648 neg
> num = 2147483647 vnum = 2147483647
> num = 2147483646 vnum = 2147483646
>
> This is the insane one with optimisation.
>
> $ gcc -Wall -O2 -o ox2 ox2.c
> $ ./ox2
> Function calls
> 2147483646
> 2147483647
> -2147483648 *** Missed the second test ***
> 2147483647
> 2147483646
> For loop
> num = 2147483646 vnum = 2147483646
> num = 2147483647 vnum = 2147483647
> num = 2147483647 vnum = -2147483648
> num = 2147483647 vnum = -2147483647
> num = 2147483647 vnum = -2147483646
>
> ^^ That got stuck ^^ ^^ That is negating when it shouldn't **


The compiler is probably unrolling the loop[1] and can thus tell that num
overflows. It is permitted to make num += 1 whatever it likes. It
can't tell that vnum overflows because you (deliberately, I am sure)
made it depend on argc but it can (and, I think, does) assume that
there is no point in testing for vtest < 0 (vtest being a copy of
vnum) since it starts positive and is only incremented.

[1] It only needs to unroll one loop body to see that num hits its
maximum value and the optimiser will always try to unroll one loop to
put the test at the bottom. Change the initial value so that it is
one less and you will see that num and vnum now mirror each other.

> Here's the test code. Apologies for the "crammed style", I don't
> usually write code that looks this bad:
>
> #include <stdio.h>
>
> void tneg(long num);
> void tneg(long num)
> {
> int stillNeg = 0;
> if (num < 0L) num = -num;
> if (num < 0L) stillNeg = 1;
> printf("%ld %s\n", num, (stillNeg)? " Still negative" : "");
> }
>
> int main(int argc, char ** argv)
> {
> long count;
> long test, vtest, num, vnum;
>
> printf("Function calls\n");
> tneg((long)0x7ffffffe);
> tneg((long)0x7fffffff);
> tneg((long)0x80000000);
> tneg((long)0x80000001);
> tneg((long)0x80000002);


FYI: these last three are implementation defined conversions (i.e. the
C language does not say exactly what happens). 0x80000000 is a
positive number that can't be represented in your long type.

> printf("For loop\n");
> vnum = (argc == 5) ? 5 : 0x7ffffffe;
> num = 0x7ffffffe;
> for (count = 0; count < 5; count++)
> {
> int stillNeg = 0, vstillNeg = 0;
>
> test = num; vtest = vnum;
> if (test < 0) test = - test;
> if (test < 0) stillNeg = 1;
> if (vtest < 0) vtest = - vtest;
> if (vtest < 0) vstillNeg = 1;
> printf("num = %ld %s ", test, (stillNeg)? " neg" : " ");
> printf("vnum = %ld %s\n", vtest, (vstillNeg)? " neg" : "");
> num += 1; vnum += 1;
> }
> return 0;
> }


--
Ben.
 
 
 
 
 
Eric Sosman
 
      03-11-2010
On 3/11/2010 9:59 AM, Ben Bacarisse wrote:
> "(E-Mail Removed)"<(E-Mail Removed) om.au> writes:
>
>> I'm surprised at the different results I can get from code with and
>> without optimisation where overflow is involved.
>>
>> I suspect this was "done to death" about 20 years ago, but
>> I can't find anything in comp.lang.c matching this. Is this
>> the "if you overflow the compiler can do whatever it likes"
>> clause?

>
> Looks like it, yes.
>
>> The original problem I was trying to solve is why a simple
>> embedded "printf("%ld")" was printing random garbage when
>> handed -2147483648.

>
> That sounds quite different. For the "long int" that you seem to be
> using it would be a library bug for printf("%ld", x) to print anything
> but -2147483648. Did you mean that printf is being handed something
> apparently random when you expected it to be handed -2147483648?


The way the value is "handed" to printf() may make a
difference, and so may the applicable version of the Standard.
Note that the value 2147483648 is too large for a 32-bit long,
so the operand of the `-' operator will be of a different type.
The chosen type depends on the Standard version: Under C90 rules
you'll get an unsigned long, C99 gives (signed) long long. The
unary `-' operator is then applied; under C90 you wind up with
the unsigned long 2147483648, C99 gives the negative -2147483648
as a long long.

Passing either of these to "%ld" is undefined behavior, because
"%ld" wants a (signed) long, period. Under C90 you're very likely
to get away with it and see the negative output you were expecting
all along, but under C99 you'll be passing a (probably) 64-bit
value where a 32-bit value was expected. This could easily throw
things off and generate the garbage the O.P. encountered.

In short, under C99

printf ("%ld\n", -2147483648); // passes LL

may plausibly generate different output than

long num = -2147483648; // LL converts to L
printf ("%ld\n", num);

... because of the type mismatch. (There's also the potential for
conversion issues in the second fragment, but that's unlikely to
be the source of the trouble.)
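(One quick way to see the type difference Eric describes is to ask sizeof. Exact numbers depend on the platform, but on common targets the unsuffixed 2147483648 is wider than int; show_constant_sizes is a made-up name:)

```c
#include <stdio.h>

/* 2147483647 fits in int; 2147483648 does not, so under C99 it takes
   the first wider type that holds it (long or long long, depending
   on the platform's long width) and unary minus is applied to that. */
static void show_constant_sizes(void)
{
    printf("sizeof 2147483647  = %zu\n", sizeof 2147483647);
    printf("sizeof -2147483648 = %zu\n", sizeof -2147483648);
}
```

On a system with 32-bit long compiled as C99 this typically prints 4 and 8; a C90 compile would instead make 2147483648 an unsigned long.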

--
Eric Sosman
 
 
Ben Bacarisse
 
      03-11-2010
Eric Sosman <(E-Mail Removed)> writes:

> On 3/11/2010 9:59 AM, Ben Bacarisse wrote:
>> "(E-Mail Removed)"<(E-Mail Removed) om.au> writes:
>>
>>> I'm surprised at the different results I can get from code with and
>>> without optimisation where overflow is involved.
>>>
>>> I suspect this was "done to death" about 20 years ago, but
>>> I can't find anything in comp.lang.c matching this. Is this
>>> the "if you overflow the compiler can do whatever it likes"
>>> clause?

>>
>> Looks like it, yes.
>>
>>> The original problem I was trying to solve is why a simple
>>> embedded "printf("%ld")" was printing random garbage when
>>> handed -2147483648.

>>
>> That sounds quite different. For the "long int" that you seem to be
>> using it would be a library bug for printf("%ld", x) to print anything
>> but -2147483648. Did you mean that printf is being handed something
>> apparently random when you expected it to be handed -2147483648?

>
> The way the value is "handed" to printf() may make a
> difference, and so may the applicable version of the Standard.
> Note that the value 2147483648 is too large for a 32-bit long,
> so the operand of the `-' operator will be of a different type.
> The chosen type depends on the Standard version: Under C90 rules
> you'll get an unsigned long, C99 gives (signed) long long. The
> unary `-' operator is then applied; under C90 you wind up with
> the unsigned long 2147483648, C99 gives the negative -2147483648
> as a long long.


I was unclear in a way that is depressingly common (not just by me,
though I do it quite often): I meant the mathematical value -2147483648
not the C expression. Given that, I think I am right that printf must
print "-2147483648" with the 32 bit longs being used by the OP.

> Passing either of these to "%ld" is undefined behavior, because
> "%ld" wants a (signed) long, period. Under C90 you're very likely
> to get away with it and see the negative output you were expecting
> all along, but under C99 you'll be passing a (probably) 64-bit
> value where a 32-bit value was expected. This could easily throw
> things off and generate the garbage the O.P. encountered.


That's a good point, but it seems unlikely that the original case the
OP is describing is one where the C constant expression -2147483648 is
the actual argument of printf. Why would anyone write that?

<snip>
--
Ben.
 
 
tom_usenet@optusnet.com.au
 
      03-11-2010
On Mar 12, 1:59 am, Ben Bacarisse <(E-Mail Removed)> wrote:
> "(E-Mail Removed)" <(E-Mail Removed)> writes:

...
> > The original problem I was trying to solve is why a simple
> > embedded "printf("%ld")" was printing random garbage when
> > handed -2147483648.

>
> That sounds quite different. For the "long int" that you seem to be
> using it would be a library bug for printf("%ld", x) to print anything
> but -2147483648. Did you mean that printf is being handed something
> apparently random when you expected it to be handed -2147483648?


We're not using gcc's libc, at least not the stdio one. We have our
own printf() code.

The function that prints "a number in any base" is being handed
"32 bits with the top bit set" best represented as "0x80000000", and
starts:

static void outnum( long num, const long base, struct PRINTF_CTX *ctx )
{
    charptr cp;
    int negative;
    char outbuf[32];
    const char digits[] = "0123456789ABCDEF";

    /* Check if number is negative */
    if (num < 0L) {
        negative = 1;
        num = -num;
    }
    else
        negative = 0;

    /* Build number (backwards) in outbuf */
    cp = outbuf;
    do {
        *cp++ = digits[(int)(num % base)];
    } while ((num /= base) > 0);
    if (negative)
        *cp++ = '-';
    *cp-- = 0;

And "*cp++ = digits[(int)(num % base)];" indexes backwards when
"num" is negative.

I added another "if (num < 0L)" after the first one to handle this,
and the compiler removed it. That was the start of this.
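(Spelling out the failure mode: with C99's truncation-toward-zero division the remainder takes the sign of the dividend, so for a negative num the expression digits[(int)(num % base)] indexes before the start of the array. A minimal illustration; low_digit_remainder is a made-up name:)

```c
/* Under C99, (a/b)*b + a%b == a with truncation toward zero, so the
   remainder of a negative dividend is non-positive.  For example,
   -2147483648 % 10 is -8, and digits[-8] is out of bounds. */
static long low_digit_remainder(long num, long base)
{
    return num % base;
}
```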

> > 2 - "-2147483647 + 1 == -2147483647" ???

>
> I don't see where this happens in your code.
>
> [Guessing here: did you mean "2147483647 + 1 == 2147483647"?


That's the one.

> > 3 - "-2147483647 == --2147483647" ???

>
> Due to C's parsing rules, -- is not the same as - - but
> I know what you are saying here.


That was a typo. I meant to say "2147483647 == -2147483647".

The unoptimised case prints:

vnum = 2147483646, 2147483647, -2147483648, 2147483647, 2147483646.

The optimised case prints:

vnum = 2147483646, 2147483647, -2147483648, -2147483647, -2147483646.

The code is "if the number is negative, make it positive and print it",
but for the last two numbers the "meant to be positive" numbers aren't.

> > ^^ That got stuck ^^ ^^ That is negating when it shouldn't ^^

>
> The compiler is probably unrolling the loop[1] and can thus tell that num
> overflows. It is permitted to make num += 1 whatever it likes.


So it is doing "INT_MAX + 1 = INT_MAX" when it knows about the
overflow and "INT_MAX + 1 is OK if we assume it is now an unsigned
int" when it doesn't.

> That's a good point, but it seems unlikely that the original
> case the OP is describing is one where the C constant
> expression -2147483648 is the actual argument of printf.
> Why would anyone write that?


I didn't. I was doing "long num = f" where "f" is a float that had
gone to infinity due to a divide-by-zero. "num = f" results in
"INT_MAX" when "f" is too big to represent, but strangely "INT_MIN"
when infinity. Strange conversion, probably legal. I was trying to
print "num" to see what was going on and then hit the bug
in our print code.
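(For what it's worth, that conversion is not merely strange: converting a float whose value is outside the range of long, infinity included, is undefined behaviour in C, which is why the result flips around. A defensive clamp before converting avoids it. A sketch with a made-up name, assuming IEEE floats and clamping to the 32-bit range in use here:)

```c
#include <math.h>

/* Clamp-then-convert.  2147483648.0f (2^31) is exactly representable
   as a float, so the range checks below are themselves exact; NaN
   maps to 0 as an arbitrary choice. */
static long float_to_long(float f)
{
    if (isnan(f))
        return 0L;
    if (f >= 2147483648.0f)         /* catches +infinity too */
        return 2147483647L;
    if (f < -2147483648.0f)         /* catches -infinity too */
        return -2147483647L - 1;
    return (long)f;                 /* now guaranteed in range */
}
```

float_to_long(1.0f/0.0f) then comes back as 2147483647 instead of whatever the undefined conversion happened to produce.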

> Why would anyone write that?


When writing test cases to find out what the compiler is doing.

tom_usenet
 
 
Ben Bacarisse
 
      03-11-2010
"(E-Mail Removed)" <(E-Mail Removed)> writes:

> On Mar 12, 1:59 am, Ben Bacarisse <(E-Mail Removed)> wrote:
>> "(E-Mail Removed)" <(E-Mail Removed)> writes:

> ...
>> > The original problem I was trying to solve is why a simple
>> > embedded "printf("%ld")" was printing random garbage when
>> > handed -2147483648.

>>
>> That sounds quite different. For the "long int" that you seem to be
>> using it would be a library bug for printf("%ld", x) to print anything
>> but -2147483648. Did you mean that printf is being handed something
>> apparently random when you expected it to be handed -2147483648?

>
> We're not using gcc's libc, at least not the stdio one. We have our
> own printf() code.
>
> The function that prints "a number in any base" is being handed "32
> bits with the top bit set" best represented as "0x80000000",


In this case I think it is simpler to say LONG_MIN. Everything else
is up for misinterpretation.

In fact, I understood you fine the first time. You pass the
mathematical value -2147483648 and get an odd result. (You could also
say that you pass LONG_MIN.) The confusion comes when someone
interprets -2147483648 as a C expression. If you use C99 it is
possible that this is an integer constant expression of type "long
long int".

> and starts:
>
> static void outnum( long num, const long base, struct PRINTF_CTX
> *ctx )
> {
> charptr cp;
> int negative;
> char outbuf[32];
> const char digits[] = "0123456789ABCDEF";
>
> /* Check if number is negative */
> if (num < 0L) {
> negative = 1;
> num = -num;
> }
> else
> negative = 0;
>
> /* Build number (backwards) in outbuf */
> cp = outbuf;
> do {
> *cp++ = digits[(int)(num % base)];
> } while ((num /= base) > 0);
> if (negative)
> *cp++ = '-';
> *cp-- = 0;
>
> And "*cp++ = digits[(int)(num % base)];" indexes backwards when
> "num" is negative.
>
> I added another "if (num < 0L)" after the first one to handle this,
> and the compiler removed it. That was the start of this.


Your best bet is probably to handle num == LONG_MIN as a special case.
There are various other tricks that you can do, like handling the
first digit before negating the number (so the most negative number
you try to make positive is num/base) but I don't think any are
significantly better than a simple special case.
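(For the record, the "first digit before negating" variant looks like this. A sketch with a made-up name, assuming C99's truncation-toward-zero division and 2 <= base <= 16:)

```c
/* One digit is taken while num may still be the most negative value,
   and only the quotient is negated, which always fits in range.
   Digits are built backwards in buf, then reversed into out. */
static void num_to_str(long num, long base, char *out)
{
    const char digits[] = "0123456789ABCDEF";
    char buf[72];                 /* enough for a 64-bit long in base 2 */
    char *cp = buf;
    int negative = (num < 0L);

    if (negative) {
        *cp++ = digits[-(int)(num % base)];   /* safe even for LONG_MIN */
        num = -(num / base);                  /* quotient fits in range */
    }
    while (num > 0) {
        *cp++ = digits[(int)(num % base)];
        num /= base;
    }
    if (cp == buf)                /* the input was 0 */
        *cp++ = '0';
    if (negative)
        *cp++ = '-';
    while (cp > buf)              /* digits were built backwards */
        *out++ = *--cp;
    *out = '\0';
}
```

num_to_str(-2147483647L - 1, 10L, buf) yields "-2147483648" without ever negating the most negative long.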

<snip>
>> The compiler is probably unrolling the loop[1] and can thus tell that num
>> overflows. It is permitted to make num += 1 whatever it likes.

>
> So it is doing "INT_MAX + 1 = INT_MAX" when it knows about the
> overflow


(s/INT/LONG/ because you are using long)

Yes. Though that is an arbitrary choice it made. The loop unrolling
spotted that the next value is LONG_MAX and that it did not need to
increment any further because after an overflow, anything will do.
Thus it replaced the increment with an assignment of LONG_MAX. As I
suggested, if you start with num one smaller, the compiler does not
spot the overflow (because it only unrolls the loop to put the test at
the bottom) and you get an implementation defined increment. I point
this out only because I had fun investigating. I don't think the
details matter.

> and "INT_MAX + 1 is OK if we assume it is now an unsigned int" when it
> doesn't.


I don't follow this bit but I am not sure there is any point in trying
really hard to understand what the compiler did once you entered the
realms of undefined behaviour.

<snip>
--
Ben.
 
 
Tim Rentsch
 
      03-23-2010
Ben Bacarisse <(E-Mail Removed)> writes:

> "(E-Mail Removed)" <(E-Mail Removed)> writes:

[snip]
>>
>> static void outnum( long num, const long base, struct PRINTF_CTX
>> *ctx )
>> {
>> charptr cp;
>> int negative;
>> char outbuf[32];
>> const char digits[] = "0123456789ABCDEF";
>>
>> /* Check if number is negative */
>> if (num < 0L) {
>> negative = 1;
>> num = -num;
>> }
>> else
>> negative = 0;
>>
>> /* Build number (backwards) in outbuf */
>> cp = outbuf;
>> do {
>> *cp++ = digits[(int)(num % base)];
>> } while ((num /= base) > 0);
>> if (negative)
>> *cp++ = '-';
>> *cp-- = 0;
>>
>> And "*cp++ = digits[(int)(num % base)];" indexes backwards when
>> "num" is negative.
>>
>> I added another "if (num < 0L)" after the first one to handle and
>> the compiler removed it. That was the start of this.

>
> Your best bet is probably to handle num == LONG_MIN as a special case.
> [snip elaboration]


I second this recommendation, except a better test is 'num < -LONG_MAX';
writing the test this way more directly expresses the essential
characteristic of the condition that needs to be tested. (Other
obvious comments and advice left as an exercise.)
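(The point being that on a two's complement machine num < -LONG_MAX is true for exactly one value, LONG_MIN, while on a machine where LONG_MIN == -LONG_MAX it is never true and the special case correctly vanishes by itself. A small check; needs_special_case is a made-up name:)

```c
#include <limits.h>

/* Tim's test: true only for a value whose magnitude exceeds
   LONG_MAX, i.e. the one value that -num cannot represent. */
static int needs_special_case(long num)
{
    return num < -LONG_MAX;
}
```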
 