Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Behavior of the code


Behavior of the code

 
 
Army1987
 
      12-29-2007
Harald van Dijk wrote:
[...] 7.18.1.1 now reads
>
> "These types are optional. However, if an implementation provides integer
> types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for
> the signed types) that have a two's complement representation, it shall
> define the corresponding typedef names."

[...]
> -- but, if CHAR_BIT == 8, unsigned char is still an unsigned integer type
> with a width of 8 bits and no padding, meaning uint8_t is required to be
> provided. And when uint8_t is provided, int8_t is required as well.

IOW the standard unintentionally requires signed char to use 2's
complement whenever CHAR_BIT is 8. Or does it?
(An implementation *can* have CHAR_BIT == 8 and signed char using e.g.
sign and magnitude, but it would need to have another 1-byte signed type
for int8_t to be typedef'd to. Since this was clearly not the intent, the
requirement to define uint..._t if and only if int..._t is defined ought
to be dropped.)

--
Army1987 (Replace "NOSPAM" with "email")
 
 
Army1987
 
      12-29-2007
santosh wrote:

> somenath wrote:


>> Now when I print using %x it should print the hexadecimal
>> representation of 240, i.e. F0. Why is it printing fffffff0?

>
> Because of format specifier mismatch. You are supplying a char and
> telling it to look for an int.

Isn't that char supposed to be promoted to int (or, in case CHAR_MAX >
INT_MAX, to unsigned int) according to the integer promotion?
--
Army1987 (Replace "NOSPAM" with "email")
 
 
Army1987
 
      12-29-2007
jacob navia wrote:

> Joe Wright wrote:
>>> #include <stdio.h>
>>> int main(void)
>>> {
>>> int x = 0x7fff;
>>> signed char y;
>>> y =(signed char) x;
>>> printf("%hhx\n", y);
>>> return 0;
>>>
>>>
>>> }

>> Do you think 2.96 is broken? I think ffff is right and you think ff is
>> right. I ask you again to explain why.
>>

>
> The C standard fprintf function.
> <quote>
> hh
> Specifies that a following d, i, o, u, x, or X conversion specifier
> applies to a signed char or unsigned char argument (the argument will
> have been promoted according to the integer promotions, but its value
> shall be converted to signed char or unsigned char before printing); or
> that a following n conversion specifier applies to a pointer to a signed
> char argument.
> <end quote>
>
> Since you specify the "x" format, the value is
> interpreted as unsigned, and it is converted to an
> unsigned char.

And -1, which is the most likely value for (signed char)0x7fff (though
that's implementation defined) should become 0xff when converted to
unsigned char. Implementations which don't support %hhx and treat it as
%hx will print ffff.

--
Army1987 (Replace "NOSPAM" with "email")
 
 
jacob navia
 
      12-29-2007
Army1987 wrote:
> jacob navia wrote:
>
>> Joe Wright wrote:
>>>> #include <stdio.h>
>>>> int main(void)
>>>> {
>>>> int x = 0x7fff;
>>>> signed char y;
>>>> y =(signed char) x;
>>>> printf("%hhx\n", y);
>>>> return 0;
>>>>
>>>>
>>>> }
>>> Do you think 2.96 is broken? I think ffff is right and you think ff is
>>> right. I ask you again to explain why.
>>>

>> The C standard fprintf function.
>> <quote>
>> hh
>> Specifies that a following d, i, o, u, x, or X conversion specifier
>> applies to a signed char or unsigned char argument (the argument will
>> have been promoted according to the integer promotions, but its value
>> shall be converted to signed char or unsigned char before printing); or
>> that a following n conversion specifier applies to a pointer to a signed
>> char argument.
>> <end quote>
>>
>> Since you specify the "x" format, the value is
>> interpreted as unsigned, and it is converted to an
>> unsigned char.

> And -1, which is the most likely value for (signed char)0x7fff (though
> that's implementation defined) should become 0xff when converted to
> unsigned char. Implementations which don't support %hhx and treat it as
> %hx will print ffff.
>


That was the whole problem: hhx is defined only in C99.
What is surprising is that gcc somehow recognizes hh in printf
format strings, even if the library doesn't support it. Strange.

Maybe we should try to write hhhhhhhhh to see what happens!




--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
 
 
Harald van Dijk
 
      12-29-2007
On Sat, 29 Dec 2007 00:32:11 +0000, Army1987 wrote:
> Harald van Dijk wrote:
> [...] 7.18.1.1 now reads
>>
>> "These types are optional. However, if an implementation provides
>> integer
>> types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for
>> the signed types) that have a two's complement representation, it
>> shall define the corresponding typedef names."

> [...]
>> -- but, if CHAR_BIT == 8, unsigned char is still an unsigned integer
>> type with a width of 8 bits and no padding, meaning uint8_t is required
>> to be provided. And when uint8_t is provided, int8_t is required as
>> well.

> IOW the standard unintentionally requires signed char to use 2's
> complement whenever CHAR_BIT is 8. Or does it? (An implementation *can*
> have CHAR_BIT == 8 and signed char using e.g. sign and magnitude, but it
> would need to have another 1-byte signed type for int8_t to be typedef'd
> to.


Exactly.

> Since this was clearly not the intent, the requirement to define
> uint..._t if and only if int..._t is defined ought to be dropped.)


Either that, or the requirement to define uint8_t should exist only if a
type matching the description of int8_t also exists. This would allow
CHAR_BIT==8 without two's complement by simply not defining (u)int8_t,
even though a type matching uint8_t's requirements exists.
 
 
Chris Torek
 
      12-29-2007
In article <fl46kj$3qg$(E-Mail Removed)> jacob navia <(E-Mail Removed)> wrote:
[code using "%hhx" format in printf()]
>That was the whole problem. hhx is defined in C99 only.
>What is surprising is that the gcc printf recognizes somehow hh, even
>if it doesn't support it. Strange.


It is not really surprising, or at least, *should* not be, if you
think about it. The GNU compiler collection contains a "C compiler"
(of sorts), but not a complete *implementation* of C, because it
uses whatever libraries are provided by the underlying system.

The compiler front-end, which reads the source code and turns
syntax and semantics into instruction sequences, is all part of
this "compiler collection". While this part does not actually
implement C99, it comes close enough to "understand" %hhx. So
the part of the compiler that emits diagnostics will "read" the
formatting directives to printf, see "%hhx", and check that the
argument has the correct type, all while *assuming* that the
system-provided library will "do the right thing" with it.

Later, at link time, when you combine the "compiled" code (object
files and/or libraries) with the system-provided library to get
the final executable -- Translation Phase 8 in the C99 standard
(TP7 in C89, if I remember right) -- you get the system's actual
implementation of printf(), which is often "less C99-ish" (as it
were) than even GCC.

It is not practical for GCC to provide the implementation of
printf() itself: The bottom levels of stdio are full of
"implementation-specific magic" (such as handling all the RMS
formats on VMS, or the multiple kinds of file formats on IBM
mainframes or DOS/Windows systems) that varies too much from one
implementation to the next. You can, of course, just choose to
use a system whose system-provided C library supports %hhx.

(The C Standards really address only complete systems, not divided-up
half-implementations[%] like the GNU Compiler Collection. So one
cannot even say that gcc implements C89, much less that it implements
C99. If you combine gcc with a particular set of libraries, you
can get a conformant C89 implementation, but gcc is approaching
C99 in the front-end only asymptotically. The biggest sticking
point appears to be various GNU extensions that are incompatible
with C99: making the front end C99-conformant would break those.
Given that the GNU folks have broken their own extensions before,
I am not sure how "sticky" a sticking point this really is, but it
is definitely "sticky". )

[% Actually, I tend to think of gcc as a "3/4ths or so" implementation.
Compilers are usually divided, in compiler circles at least, into
three parts. Only two of these are thought of as "the compiler":
the "front end", which reads syntax and "understands" semantics,
and the "back end", which does code-generation and final (peephole)
optimization. The main body of optimization is either part of the
"front end" or, in many cases now, a "middle end" that -- like the
back end -- is shared between multiple languages. That is, one
might have a front end for Ada, another for C, a third for C++, a
fourth for Fortran, and a fifth for Pascal; these would all produce
some sort of internal tree or list representation that feeds through
a shared optimizer and shared back-end. The "middle end" optimization
needed tends to vary quite a bit from one language to the next,
though, so it can be more efficient, in some senses, to paste
different "middle ends" onto the various front-end parts. Back
when 8 megabytes was a lot of RAM, this kind of efficiency was
more important; nowadays the gigantic shared middle end, that uses
a gigabyte of RAM to run, seems to be in vogue.

In any case, after the final code comes out of the "back end"
of the compiler -- and in some cases, is turned into linkable
object code by a separate "assembler" -- the object code and
libraries are handled by a piece usually called a "linker". The
"linker" is normally completely separate from the compiler, and
tends to be used on its own now and then, e.g., to build
libraries.

(There are also "globally optimizing" compilers that defer at least
some of the optimization and code-generation phases. In this case,
instead of generating object code and linking that, the front end
simply saves a "digested form" of the code in the "object" files
-- which are no longer object files at all -- and the optimization,
code generation, and final stages are all run when you "link" the
pieces together. This gives the optimizer a view of the entire
program, so that it can do a much better job. The drawback is that
the "final link phase", which is usually pretty fast, now contains
most of the real work, and can take hours or even days for large
programs.)

The GNU Compiler Collection provides front and back ends, but
not the linker. There *is* a GNU linker, and using it buys you
some advantages, especially in languages other than C, but gcc
can be built for systems that use the native non-GNU linker,
using an auxiliary step they call "collect2".]
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
 
 
Army1987
 
      12-30-2007
Chris Torek wrote:

> In article <fl46kj$3qg$(E-Mail Removed)> jacob navia <(E-Mail Removed)> wrote:
> [code using "%hhx" format in printf()]
>>That was the whole problem. hhx is defined in C99 only.
>>What is surprising is that the gcc printf recognizes somehow hh, even
>>if it doesn't support it. Strange.

>
> It is not really surprising, or at least, *should* not be, if you
> think about it. The GNU compiler collection contains a "C compiler"
> (of sorts), but not a complete *implementation* of C, because it
> uses whatever libraries are provided by the underlying system.
>
> The compiler front-end, which reads the source code and turns
> syntax and semantics into instruction sequences, is all part of
> this "compiler collection". While this part does not actually
> implement C99, it comes close enough to "understand" %hhx. So
> the part of the compiler that emits diagnostics will "read" the
> formatting directives to printf, see "%hhx", and check that the
> argument has the correct type, all while *assuming* that the
> system-provided library will "do the right thing" with it.
>
> Later, at link time, when you combine the "compiled" code (object
> files and/or libraries) with the system-provided library to get
> the final executable -- Translation Phase 8 in the C99 standard
> (TP7 in C89, if I remember right) -- you get the system's actual
> implementation of printf(), which is often "less C99-ish" (as it
> were) than even GCC.

Support for hh was added in glibc 2.1.
With gcc 4.1.2 and glibc 2.5,
#include <stdio.h>
int main(void)
{
    printf("%hhx %hx\n", -1, -1);
    return 0;
}
prints "ff ffff".
--
Army1987 (Replace "NOSPAM" with "email")
 
 
Dan Henry
 
      01-03-2008
On 25 Dec 2007 23:58:34 GMT, Chris Torek <(E-Mail Removed)> wrote:

[...]
> I believe it is better to think not in terms of "machine
>byte order" -- which is something you can only control by picking
>which machines you use -- but rather to think in terms of values
>and representations. As a C programmer, you have a great deal of
>control of values, and if you use "unsigned" types, you have complete
>control of representations. For instance, you can read a 10-bit
>two's complement value from a stdio stream, with the first input
>char giving the uppermost 2 bits, using "unsigned int" this way:
>
> /*
> * Read one 2-bit value and one 8-bit value from the given stream,
> * and compose a signed 10-bit value (in the range [-512..+511])
> * from those bits.
> */
> int get_signed_10_bit_value(FILE *fp) {
> int c0, c1;
> unsigned int val;
>
> c0 = getc(fp);
> if (c0 == EOF) ... handle error ...
> c1 = getc(fp);
> if (c1 == EOF) ... handle error ...
> val = ((c0 & 0x03) << 8) | (c1 & 0xff);
> return (val ^ 0x200) - 0x200;
> }
>
>(Note that when you go to read more than 15 bits, you need to be
>careful with intermediate values, since plain int may have as few
>as 15 non-sign "value bits", and unsigned int may have as few as
>16. You will need to convert values to "unsigned long", using
>temporary variables, casts, or both.)
>
>This xor-and-subtract trick works on all implementations, including
>ones' complement machine like the Univac. Its only real limitation
>is that the final (signed) value has to fit in the types available:
>a 16-bit two's complement machine has a -32768 but a 16-bit ones'
>complement machine bottoms out at -32767. (As it happens, though,
>anything other than two's complement is rare today, so you probably
>need not worry about this very much.)


I wonder if the holidays have left me the victim of some form of
limited thinking. I have been admonished occasionally here to (in so
many words) think less about representations (i.e., bit patterns) and
more about values. Thinking about values with the sign extension code
above, with a negative signed 10-bit value (e.g., 'val' is 0x202
before the return), I'd have thought that (assuming 16-bit int and
unsigned int) the return expression would yield the unsigned *value*
0xFE02, which is also the two's complement *representation* of the expected
int return value. However, my new and improved, value-oriented self
is now confused. What allows the conversion, which I thought was
value-to-value, of an unsigned value that a 16-bit signed int can't
have? There has got to be a clause in the standard that I am
overlooking.

Remember, I already said that I suffer from limited thinking. Would
someone kindly point me where I should be reading?

--
Dan Henry
 
 
Chris Torek
 
      01-03-2008
>On 25 Dec 2007 23:58:34 GMT, Chris Torek <(E-Mail Removed)> wrote:
[where "val" has type "unsigned int" and c0 and c1 are ordinary "int"]
>> val = ((c0 & 0x03) << 8) | (c1 & 0xff);
>> return (val ^ 0x200) - 0x200;
>>(Note that when you go to read more than 15 bits, you need to be
>>careful with intermediate values, ...


In article <(E-Mail Removed)>,
Dan Henry <(E-Mail Removed)> wrote:
>I wonder if the holidays have left me the victim of some form of
>limited thinking. I have been admonished occasionally here to (in so
>many words) think less about representations (i.e., bit patterns) and
>more about values.


Well, in this case, we have a specification that talks about
representations: c0 and c1 (read from a stdio stream) hold bits
for a 10-bit signed two's complement representation, and our goal
in this code fragment is to turn those into whatever *our* machine
uses to represent the numbers extracted from the stream. So for
this particular part of the problem, we do have to care about
representations.

>Thinking about values with the sign extension code
>above, with a negative signed 10-bit value (e.g., 'val' is 0x202
>before the return), I'd have thought that (assuming 16-bit int and
>unsigned int) the return expression would yield the unsigned *value*
>0xFE02


Indeed. The problem is, I goofed. As you say, if val is 0x202 (and
has type "unsigned int"), we have:

(val ^ 0x200) - 0x200

which is:

(0x202U ^ 0x200) - 0x200

which mixes unsigned and signed. In pre-ANSI C, it was easy to
predict the result: mix unsigned with signed, you got unsigned.
In C89 and C99, what you get depends on type_MAX vs Utype_MAX,
where the <type>s are the types of the signed and unsigned values
involved. In this case, <type> is int (each time), and in general
we have UINT_MAX > INT_MAX, so the xor is done by converting 0x200
to unsigned (i.e., 512U or 0x200U). Since 0x200U ^ 0x202U is 2U,
the result is 2U. Then we have the same unsigned-vs-signed problem,
and again 0x200 is converted to 0x200U, so the value is in fact
(2U - 512U), which is indeed 0xfe02 or 0xfffffe02 on common
implementations.

The code *should* have read:

(int)(val ^ 0x200U) - 512

(I used a decimal constant for clarity this time, although it may
actually be less clear. When I wrote the text in ">>" above I
actually tried putting in 512, but switched back to 0x200.)

By converting to int after xor-ing with 0x200, we would now get the
(signed int) value 2, and (2 - 512) is -510, which is what we wanted.
So:

>> return (val ^ 0x200) - 0x200;


should read, instead, something more like:

return (int)(val ^ 0x200U) - (int)0x200;
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
 
 
Dan Henry
 
      01-04-2008
On 3 Jan 2008 08:13:02 GMT, Chris Torek <(E-Mail Removed)> wrote:

>>On 25 Dec 2007 23:58:34 GMT, Chris Torek <(E-Mail Removed)> wrote:

>[where "val" has type "unsigned int" and c0 and c1 are ordinary "int"]
>>> val = ((c0 & 0x03) << 8) | (c1 & 0xff);
>>> return (val ^ 0x200) - 0x200;
>>>(Note that when you go to read more than 15 bits, you need to be
>>>careful with intermediate values, ...

>
>In article <(E-Mail Removed)>,
>Dan Henry <(E-Mail Removed)> wrote:
>>I wonder if the holidays have left me the victim of some form of
>>limited thinking. I have been admonished occasionally here to (in so
>>many words) think less about representations (i.e., bit patterns) and
>>more about values.

>
>Well, in this case, we have a specification that talks about
>representations: c0 and c1 (read from a stdio stream) hold bits
>for a 10-bit signed two's complement representation, and our goal
>in this code fragment is to turn those into whatever *our* machine
>uses to represent the numbers extracted from the stream. So for
>this particular part of the problem, we do have to care about
>representations.


Chris,

Thank you for your reply. I had no problem or confusion with the
representation aspects of everything above the 'return' line. My
issue was entirely regarding the 'return' expression and its
conversion to the returned value.

>>Thinking about values with the sign extension code
>>above, with a negative signed 10-bit value (e.g., 'val' is 0x202
>>before the return), I'd have thought that (assuming 16-bit int and
>>unsigned int) the return expression would yield the unsigned *value*
>>0xFE02

>
>Indeed. The problem is, I goofed. As you say, if val is 0x202 (and
>has type "unsigned int"), we have:
>
> (val ^ 0x200) - 0x200
>
>which is:
>
> (0x202U ^ 0x200) - 0x200
>
>which mixes unsigned and signed. In pre-ANSI C, it was easy to
>predict the result: mix unsigned with signed, you got unsigned.
>In C89 and C99, what you get depends on type_MAX vs Utype_MAX,
>where the <type>s are the types of the signed and unsigned values
>involved. In this case, <type> is int (each time), and in general
>we have UINT_MAX > INT_MAX, so the xor is done by converting 0x200
>to unsigned (i.e., 512U or 0x200U). Since 0x200U ^ 0x202U is 2U,
>the result is 2U. Then we have the same unsigned-vs-signed problem,
>and again 0x200 is converted to 0x200U, so the value is in fact
>(2U - 512U), which is indeed 0xfe02 or 0xfffffe02 on common
>implementations.
>
>The code *should* have read:
>
> (int)(val ^ 0x200U) - 512
>
>(I used a decimal constant for clarity this time, although it may
>actually be less clear. When I wrote the text in ">>" above I
>actually tried putting in 512, but switched back to 0x200.)


0x200 seems just fine to me.

>By converting to int after xor-ing with 0x200, we would now get the
>(signed int) value 2, and (2 - 512) is -510, which is what we wanted.
>So:
>
>>> return (val ^ 0x200) - 0x200;

>
>should read, instead, something more like:
>
> return (int)(val ^ 0x200U) - (int)0x200;


It is exactly the coercion of (val ^ 0x200) to an int that I thought
would be necessary.

Thanks again.

--
Dan Henry
 