# Endian Independence

Kelly B

 07-27-2008
#include<stdio.h>

#define LITTLE_ENDIAN 0
#define BIG_ENDIAN 1

int endian() {
int i = 1;
char *p = (char *)&i;

if (p[0] == 1)
return LITTLE_ENDIAN;
else
return BIG_ENDIAN;
}

int reverseInt (int i) {
unsigned char c1, c2, c3, c4;

if ( endian() == BIG_ENDIAN ) {
return i;
} else {
c1 = i & 255;
c2 = (i >> 8) & 255;
c3 = (i >> 16) & 255;
c4 = (i >> 24) & 255;

return ((int)c1 << 24) + ((int)c2 << 16) + ((int)c3 << 8) + c4;
}
}

int main(void)
{
if(endian())
puts("Big Endian Machine");
else
puts("Small Endian Machine");
printf("%d",reverseInt(5));
return 0;

}

I tested it on my PC (On Pentium 4) and this is the output:

Small Endian Machine
83886080.

I am baffled, as I was expecting 5 to be printed. Or is it that I am
missing something completely?
Probably I have completely misunderstood the idea of endianness.

Any help is appreciated.

Thank You

Antoninus Twink

 07-27-2008
On 27 Jul 2008 at 15:41, Kelly B wrote:
> int reverseInt (int i) {
> unsigned char c1, c2, c3, c4;
>
> if ( endian() == BIG_ENDIAN ) {
> return i;
> } else {
> c1 = i & 255;
> c2 = (i >> 8) & 255;
> c3 = (i >> 16) & 255;
> c4 = (i >> 24) & 255;
>
> return ((int)c1 << 24) + ((int)c2 << 16) + ((int)c3 << 8) + c4;
> }
> }
>
> int main(void)
> {
> if(endian())
> puts("Big Endian Machine");
> else
> puts("Small Endian Machine");
> printf("%d",reverseInt(5));
> return 0;
> }
>
> I tested it on my PC (On Pentium 4) and this is the output:
>
> Small Endian Machine
> 83886080.
>
> I am baffled as I was expecting 5 to be printed or is it that I am
> missing something completely ?

Your (somewhat poorly named) reverseInt function takes an integer i and
returns 32 bits that give i when interpreted as a big-endian integer.
Since your machine is little-endian, the printf() function interprets its
arguments as if they were little-endian, so when you pass
reverseInt(5) to printf as an argument, it interprets this as 0x05000000.
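
To see where 83886080 comes from, here is a minimal sketch of the same byte reversal written with a fixed-width unsigned type. The name `swap32` is hypothetical (it is not from the original post), and the sketch assumes a 32-bit value.

```c
#include <stdint.h>

/* Reverse the order of the four bytes in a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |   /* lowest byte becomes highest */
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);    /* highest byte becomes lowest */
}
```

With this sketch, `swap32(5)` is 0x05000000, which is 5 * 2^24 = 83886080, the value printed above. Applying the swap a second time gives 5 again.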

Lew Pitcher

 07-27-2008
In comp.lang.c, Kelly B wrote:

> #include<stdio.h>
>
> #define LITTLE_ENDIAN 0
> #define BIG_ENDIAN 1
>
> int endian() {
> int i = 1;
> char *p = (char *)&i;
>
> if (p[0] == 1)
> return LITTLE_ENDIAN;
> else
> return BIG_ENDIAN;
> }

In the above code, you assume that there are only two possible "endian"
values.

> int reverseInt (int i) {
> unsigned char c1, c2, c3, c4;
>
> if ( endian() == BIG_ENDIAN ) {
> return i;
> } else {
> c1 = i & 255;
> c2 = (i >> 8) & 255;
> c3 = (i >> 16) & 255;
> c4 = (i >> 24) & 255;
>
> return ((int)c1 << 24) + ((int)c2 << 16) + ((int)c3 << 8) + c4;
> }
> }

In the above code, you assume that an int has a sizeof 4 characters.

> int main(void)
> {
> if(endian())
> puts("Big Endian Machine");
> else
> puts("Small Endian Machine");
> printf("%d",reverseInt(5));
> return 0;
>
> }

"Endianness" usually refers to how elements are ordered when storing
multi-element entities, and is usually referenced with respect to the order
that "bytes" are used to store "integers". For instance, if it takes
2 "bytes" (A and B) to store an "integer", there are two ways that these
bytes can be ordered in memory as they are stored:
A followed by B
and
B followed by A

If an integer takes four bytes (A, B, C, and D), then there are (4 * 3 * 2 *
1) or 24 ways to order these bytes in memory:
A followed by B followed by C followed by D
A followed by B followed by D followed by C
A followed by D followed by B followed by C
...
D followed by C followed by B followed by A

In the simple 2-element case, when the byte containing the most-significant
portion of the compound value is stored first, the order is called "Big
Endian". When the least-significant portion is stored first, the order is
called "Little Endian".

In the more complex n-element cases, "Little Endian" and "Big Endian" are
two of the many possible orders.
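
Whether a given machine uses one of the two named orders or some other layout can be checked by looking at the individual bytes of a known value. The following is only a sketch: it assumes 8-bit bytes and that `uint32_t` is available, and the names `classify_bytes` and `native_order` are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_OTHER };

/* Classify a 4-byte memory image of the value 0x01020304. */
enum byte_order classify_bytes(const unsigned char b[4])
{
    if (b[0] == 4 && b[1] == 3 && b[2] == 2 && b[3] == 1)
        return ORDER_LITTLE;   /* least-significant byte stored first */
    if (b[0] == 1 && b[1] == 2 && b[2] == 3 && b[3] == 4)
        return ORDER_BIG;      /* most-significant byte stored first */
    return ORDER_OTHER;        /* any of the remaining permutations */
}

/* Classify the order this machine uses for a 32-bit integer
   (assumption: sizeof(uint32_t) == 4, i.e. 8-bit bytes). */
enum byte_order native_order(void)
{
    uint32_t probe = 0x01020304u;
    unsigned char b[4];

    memcpy(b, &probe, sizeof b);   /* copy the object representation */
    return classify_bytes(b);
}
```

Unlike a test of the first byte alone, this reports a third result when the bytes are in neither of the two named orders.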

In the endian() function, you return one of two values, based on whether the
least-significant portion of the value is stored first or not. This binary
return (LITTLE_ENDIAN or BIG_ENDIAN) can only be valid if an int takes the
same space as two (and only two) char elements. If an int took more space
(say, 4 char elements), then endian() would have to return one of many more
possible ordering names (one of 24 names, for a 4 char int, for instance).
Clearly, your endian() function assumes that sizeof(int) == 2.

But, your reverseInt() function clearly assumes that sizeof(int) == 4, at
least for LITTLE_ENDIAN values.

You should know that your compiler may actually use a different value for
sizeof(int).

In summary, your code is flawed.

> I tested it on my PC (On Pentium 4) and this is the output:
>
> Small Endian Machine
> 83886080.
>
> I am baffled as I was expecting 5 to be printed or is it that I am
> missing something completely ?
> Probably i have completely misunderstood the idea of endianness

Yes.

> Any help is appreciated.

First off, as far as the C language is concerned, there is no need to
determine the "endianness" of stored values. Where it is important (i.e.
when transferring binary values through a file), your compiler's
documentation should tell you the exact format. Otherwise, you don't need
to know.

Secondly, you should first determine how "wide" your compiler makes
integers. Look for the sizeof(int) value, as this will give you the number
of "bytes" that make up an integer.
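
A minimal sketch of that check follows. The function names are hypothetical; `sizeof` and `CHAR_BIT` are standard C.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bytes (chars) an int occupies on this implementation. */
size_t int_size_bytes(void)
{
    return sizeof(int);
}

/* Number of bits of storage an int occupies: bytes times bits per byte.
   (Storage bits; an implementation may in principle have padding bits.) */
size_t int_storage_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}
```

Code that shifts by 8, 16 and 24, as reverseInt() does, is only meaningful when `int_storage_bits()` is at least 32.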

Thirdly, if sizeof(int) is greater than 2, your compiler may choose some
order other than "Little Endian" /or/ "Big Endian" to store the values
in. "Little" and "Big" endian only name two of the possible orders.

--
Lew Pitcher

Master Codewright & JOAT-in-training | Registered Linux User #112576
http://pitcher.digitalfreehold.ca/ | GPG public key available by request
---------- Slackware - Because I know what I'm doing. ------

Kelly B

 07-27-2008
Lew Pitcher wrote:
> In comp.lang.c, Kelly B wrote:
>
>> #include<stdio.h>
>>
>> #define LITTLE_ENDIAN 0
>> #define BIG_ENDIAN 1
>>
>> int endian() {
>> int i = 1;
>> char *p = (char *)&i;
>>
>> if (p[0] == 1)
>> return LITTLE_ENDIAN;
>> else
>> return BIG_ENDIAN;
>> }

>
> In the above code, you assume that there are only two possible "endian"
> values.
>
>> int reverseInt (int i) {
>> unsigned char c1, c2, c3, c4;
>>
>> if ( endian() == BIG_ENDIAN ) {
>> return i;
>> } else {
>> c1 = i & 255;
>> c2 = (i >> 8) & 255;
>> c3 = (i >> 16) & 255;
>> c4 = (i >> 24) & 255;
>>
>> return ((int)c1 << 24) + ((int)c2 << 16) + ((int)c3 << 8) + c4;
>> }
>> }

....Long Snip...

>
> First off, as far as the C language is concerned, there is no need to
> determine the "endianness" of stored values. Where it is important (i.e.
> when transferring binary values through a file), your compiler's
> documentation should tell you the exact format. Otherwise, you don't need
> to know.
>
> Secondly, you should first determine how "wide" your compiler makes
> integers. Look for the sizeof(int) value, as this will give you the number
> of "bytes" that make up an integer.
>
> Thirdly, if sizeof(int) is greater than 2, your compiler may choose some
> order other than "Little Endian" /or/ "Big Endian" to store the values
> in. "Little" and "Big" endian only name two of the possible orders.
>

Thanks Antoninus and Lew!
This is what bugged me:

http://www.ibm.com/developerworks/ai...ry/au-endianc/

I thought the article was correct and wanted to quickly test it on my
PC. I guess I will have to write my own function(s).

Antoninus Twink

 07-27-2008
On 27 Jul 2008 at 16:12, Lew Pitcher wrote:
> First off, as far as the C language is concerned, there is no need to
> determine the "endianness" of stored values. Where it is important (i.e.
> when transferring binary values through a file), your compiler's
> documentation should tell you the exact format. Otherwise, you don't need
> to know.

Rubbish. Suppose he wants to read a binary file produced by someone else
on a different machine. Then knowing the endianness of both machines is
crucial.

> Secondly, you should first determine how "wide" your compiler makes
> integers. Look for the sizeof(int) value, as this will give you the number
> of "bytes" that make up an integer.

Nonsense. There is nothing to suggest that the OP isn't perfectly well
aware that his compiler uses 32-bit ints.

> Thirdly, if sizeof(int) is greater than 2, your compiler may choose some
> order other than "Little Endian" /or/ "Big Endian" to store the values
> in. "Little" and "Big" endian only name two of the possible orders.

Eyewash. The OP said explicitly that he's using a Pentium 4, which is a
littleendian architecture.

Ben Bacarisse

 07-27-2008
Kelly B writes:

<snip>
> This is what bugged me
>
> http://www.ibm.com/developerworks/ai...ry/au-endianc/
>
> I thought the article was correct and wanted to quickly test it on my
> PC.

Well, it is not a good explanation, but it is not exactly wrong
either. The main part you missed is that you don't need to worry
unless your program "exports" multi-byte values. The vast majority of
C programs can be entirely portable without any need to worry about
the endianness of the hardware.

It is not surprising. That article has a section "When endianness
affects code" which has 6 paragraphs. 5 of these are about when it does
*not* affect the code! Only the last short paragraph starts to explain
when it does matter.

>I guess i will have to write my own function(s).

If you are writing network code (the most common reason to export
multi-byte values) then you can use POSIX functions like htons and
htonl etc. Only write your own if you don't have these available or
you need to do something more outlandish.
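
As a rough illustration of the POSIX route, the sketch below stores a 32-bit value in network (big-endian) byte order and reads it back. `htonl` and `ntohl` come from `<arpa/inet.h>`; the wrapper names `put_net32` and `get_net32` are hypothetical.

```c
#include <arpa/inet.h>  /* POSIX: htonl, ntohl */
#include <stdint.h>
#include <string.h>

/* Store v into out[0..3] in network (big-endian) byte order. */
void put_net32(unsigned char out[4], uint32_t v)
{
    uint32_t net = htonl(v);        /* host order -> network order */
    memcpy(out, &net, sizeof net);
}

/* Read a network-order 32-bit value from in[0..3] into host order. */
uint32_t get_net32(const unsigned char in[4])
{
    uint32_t net;

    memcpy(&net, in, sizeof net);
    return ntohl(net);              /* network order -> host order */
}
```

The calling code never asks which endianness the host has; on a big-endian host the two conversions simply do nothing.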

--
Ben.

Richard Tobin

 07-27-2008
Antoninus Twink wrote:

>Rubbish. Suppose he wants to read a binary file produced by someone else
>on a different machine. Then knowing the endianness of both machines is
>crucial.

While this is true, I recommend where possible standardising on a
fixed byte order for files that may be shared between architectures.
You also need to consider the size and padding of items. Writing a
program to deal with all the possibilities of format is more tedious
than writing it to produce a uniform format.
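
One way to get such a uniform format is to write each field byte by byte with shifts, so that neither the host's byte order nor struct padding reaches the file. This is a sketch under assumptions: the `struct record` layout, the 6-byte on-disk size and the function names are all invented for illustration.

```c
#include <stdint.h>

/* Hypothetical record; in memory the compiler may insert padding. */
struct record {
    uint16_t id;
    uint32_t value;
};

enum { RECORD_WIRE_SIZE = 6 };  /* 2 + 4 bytes, no padding in the file */

/* Serialize field by field, most-significant byte first. */
void record_pack(unsigned char out[RECORD_WIRE_SIZE], const struct record *r)
{
    out[0] = (unsigned char)(r->id >> 8);
    out[1] = (unsigned char)(r->id & 0xFFu);
    out[2] = (unsigned char)(r->value >> 24);
    out[3] = (unsigned char)((r->value >> 16) & 0xFFu);
    out[4] = (unsigned char)((r->value >> 8) & 0xFFu);
    out[5] = (unsigned char)(r->value & 0xFFu);
}

/* Rebuild the record from its fixed big-endian file image. */
void record_unpack(struct record *r, const unsigned char in[RECORD_WIRE_SIZE])
{
    r->id = (uint16_t)(((unsigned)in[0] << 8) | in[1]);
    r->value = ((uint32_t)in[2] << 24) | ((uint32_t)in[3] << 16) |
               ((uint32_t)in[4] << 8)  |  (uint32_t)in[5];
}
```

Because the shifts operate on values rather than on memory, the same source produces the same file bytes on any host.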

*are identical.

-- Richard
--
Please remember to mention me / in tapes you leave behind.

Kelly B

 07-27-2008
Kelly B wrote:
> #include<stdio.h>
>
> #define LITTLE_ENDIAN 0
> #define BIG_ENDIAN 1
>
> int endian() {
> int i = 1;
> char *p = (char *)&i;
>
> if (p[0] == 1)
> return LITTLE_ENDIAN;
> else
> return BIG_ENDIAN;
> }

...snip..

Just one more thing. What is the right way to convert a *signed* int from
one endianness to another (more specifically from big-endian to little-endian
or vice versa)? How do I preserve the *sign* bit?
Swapping the bytes cannot be an option, unless I somehow
preserve the sign and treat the number as an unsigned int. Or am I
way off again?

santosh

 07-27-2008
Kelly B wrote:

> Kelly B wrote:
>> #include<stdio.h>
>>
>> #define LITTLE_ENDIAN 0
>> #define BIG_ENDIAN 1
>>
>> int endian() {
>> int i = 1;
>> char *p = (char *)&i;
>>
>> if (p[0] == 1)
>> return LITTLE_ENDIAN;
>> else
>> return BIG_ENDIAN;
>> }

>
> ..snip..
>
> Just one more thing.What is the right way to convert a *signed* int
> from one endianness to another ( more specifically from big-endian to
> small or vice versa). How do i preserve the *sign* bit.
> Swapping the bytes cannot be an option, unless i probably somehow
> preserve the sign and treat the number as an unsigned int or am i
> way-off again ?

You'll need to know the manner in which signed values are represented on your
machine, whether twos-complement, sign-and-magnitude or
ones-complement, the three formats that C recognises. Such code will
not be portable.

But most systems already provide their own routines for
endian conversion wherever it's likely to matter. Unix systems provide
htonl/htons and ntohl/ntohs.
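
One possible approach, sketched below, is to fix the external format rather than depend on the machine's own representation: here the external format is assumed to be a 32-bit twos-complement value stored most-significant byte first. The bytes are handled as unsigned values, and the sign is recovered arithmetically. The function names are hypothetical.

```c
#include <stdint.h>

/* Decode 4 big-endian bytes holding a twos-complement value. */
int32_t load_be_i32(const unsigned char in[4])
{
    uint32_t u = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];

    if (u <= 0x7FFFFFFFu)
        return (int32_t)u;          /* non-negative: fits as-is */

    /* Negative: UINT32_MAX - u is the magnitude minus one, which fits
       in int32_t, so no out-of-range conversion is needed. */
    return -(int32_t)(UINT32_MAX - u) - 1;
}

/* Encode v as 4 big-endian bytes in twos complement. */
void store_be_i32(unsigned char out[4], int32_t v)
{
    uint32_t u = (uint32_t)v;       /* well-defined: reduced modulo 2^32 */

    out[0] = (unsigned char)(u >> 24);
    out[1] = (unsigned char)((u >> 16) & 0xFFu);
    out[2] = (unsigned char)((u >> 8) & 0xFFu);
    out[3] = (unsigned char)(u & 0xFFu);
}
```

The sign bit needs no special treatment here; it travels as the top bit of the most-significant byte.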

Jean-Marc Bourguet

 07-27-2008
Antoninus Twink writes:

> On 27 Jul 2008 at 16:12, Lew Pitcher wrote:
>> First off, as far as the C language is concerned, there is no need to
>> determine the "endianness" of stored values. Where it is important (i.e.
>> when transferring binary values through a file), your compiler's
>> documentation should tell you the exact format. Otherwise, you don't need
>> to know.

>
> Rubbish. Suppose he wants to read a binary file produced by someone else
> on a different machine. Then knowing the endianness of both machines is
> crucial.

Why?

1/ You can write binary files which have a defined binary format without
knowing the endianness of either the writer or the reader (obviously
knowing that the file is in the same endianness as you allows optimization).

2/ Even if your binary file is a memory dump which depends on the endianness
of the writer, knowing which one that is will reduce it to case 1 for the
reader.

3/ With an adequate file format (i.e. a signature which is byte-order
dependent), you can detect at run time that the file is not in the same
order as your native one, without knowing which.
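
Point 3 could look roughly like the sketch below. The signature value `FILE_MAGIC` and all names are invented for illustration; the writer is assumed to store the signature in its own native order, and the reader is assumed to load the first four bytes of the file straight into a `uint32_t`.

```c
#include <stdint.h>

#define FILE_MAGIC 0x1A2B3C4Du   /* hypothetical signature value */

enum file_order { SAME_ORDER, SWAPPED_ORDER, NOT_RECOGNISED };

/* Reverse the four bytes of a 32-bit value. */
static uint32_t bswap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

/* magic_as_read: the file's first 4 bytes, loaded as a native uint32_t. */
enum file_order check_magic(uint32_t magic_as_read)
{
    if (magic_as_read == FILE_MAGIC)
        return SAME_ORDER;        /* writer used the reader's byte order */
    if (magic_as_read == bswap32(FILE_MAGIC))
        return SWAPPED_ORDER;     /* every multi-byte field needs swapping */
    return NOT_RECOGNISED;        /* not this format, or an unusual order */
}
```

The reader never learns whether it is itself big- or little-endian; it only learns whether the file agrees with it. The signature must not read the same in both directions, or the test cannot tell the two cases apart.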

And naturally, when you are trying to do binary IO manipulating data
wider than a byte in the same format as the one they are represented in
memory, you'd better ensure that all other things which may apply (size and
alignment are the most common ones) are the same.

--
Jean-Marc