discuss portability of this ENDIAN testing code

 
 
G Patel
      08-18-2006
Code in question (assuming CHAR_BIT = 8 system):

union
{
    unsigned int whole;
    unsigned char bytes[sizeof(int)];
} var;

var.whole = 0xFF;

if( var.bytes[0] == 0xFF )
    printf("\nLITTLE ENDIAN\n");
else if( var.bytes[sizeof(int)-1] == 0xFF )
    printf("\nBIG ENDIAN\n");
else
    printf("\nHUH???\n");




I'm wondering about the portability of the above ENDIAN tester code
(don't worry about CHAR_BIT != 8 as an issue).

I've read some really knowledgeable posts on clc before that kept
emphasizing that C puts a thin abstraction layer over
memory/hardware, and that two contiguous bytes in C's layer are not
necessarily contiguous in RAM (or the process memory space), or that
the order of the bytes in C's layer is not necessarily the same as in
hardware. With this in mind, can the above code be made more
portable?

Thanks

Gaya

 
Keith Thompson
      08-18-2006
"G Patel" writes:
> Code in question (assuming CHAR_BIT = 8 system):
>
> union
> {
> unsigned int whole;
> unsigned char bytes[sizeof(int)];
> } var;
>
> var.whole = 0xFF;
>
> if( var.bytes[0] == 0xFF )
> printf("\nLITTLE ENDIAN\n");
> else if( var.bytes[sizeof(int)-1] == 0xFF )
> printf("\nBIG ENDIAN\n");
> else
> printf("\nHUH???\n");
>
> I'm wondering about the portability of the above ENDIAN tester code
> (don't worry about CHAR_BIT != 8 as an issue).
>
> I've read some really knowledgeable posts on clc before that kept
> emphasizing the fact that C pastes a small abstraction layer over
> memory/hardware. And that 2 contiguous bytes in C's layer is not
> necessarily 2 contiguous in RAM (or process memory space) -or- the
> order of the bytes in C's layer is not necessarily the same as
> hardware. So with this in mind, can the above code be made more
> portable?


I think you're talking about the distinction between physical memory
and virtual memory. On systems with virtual memory, that's all you
can see; it's not possible to access physical memory directly (except
*maybe* by some horribly system-specific low-level technique).

Within a C object, memory is contiguous, and addresses of successive
bytes are adjacent. (There are no guarantees across distinct objects,
except that their addresses are unique; any attempt to apply a
relational operator to the addresses of two objects, such as
"&obj1 < &obj2", invokes undefined behavior.)

I'd use "sizeof(unsigned int)" everywhere you used "sizeof(int)",
since you declared the "whole" member as unsigned int. It happens
that int and unsigned int are guaranteed to be the same size, but I
just now had to check the standard to confirm that; if you use
"unsigned int" consistently, the question doesn't arise.

You said not to worry about CHAR_BIT != 8, but I'll still mention that
the code will report LITTLE_ENDIAN if sizeof(unsigned int) == 1 (which
can only happen if CHAR_BIT >= 16). If an int is a single byte, then
it has no meaningful byte ordering.

Apart from that, I think the code will work reliably as long as
unsigned int has no padding bits. If it does have padding bits,
there's a possibility that the 0xFF won't land in either the
high-order or the low-order byte. In that case, printing "HUH???" is
probably good enough.

--
Keith Thompson (The_Other_Keith) kst- <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
 
lovecreatesbeauty
      08-18-2006

Keith Thompson wrote:
> "G Patel" writes:
> > Code in question (assuming CHAR_BIT = 8 system):
> >
> > union
> > {
> > unsigned int whole;
> > unsigned char bytes[sizeof(int)];
> > } var;
> >
> > var.whole = 0xFF;
> >
> > if( var.bytes[0] == 0xFF )
> > printf("\nLITTLE ENDIAN\n");
> > else if( var.bytes[sizeof(int)-1] == 0xFF )
> > printf("\nBIG ENDIAN\n");
> > else
> > printf("\nHUH???\n");
> >
> > I'm wondering about the portability of the above ENDIAN tester code
> > (don't worry about CHAR_BIT != 8 as an issue).
> >
> > I've read some really knowledgeable posts on clc before that kept
> > emphasizing the fact that C pastes a small abstraction layer over
> > memory/hardware. And that 2 contiguous bytes in C's layer is not
> > necessarily 2 contiguous in RAM (or process memory space) -or- the
> > order of the bytes in C's layer is not necessarily the same as
> > hardware. So with this in mind, can the above code be made more
> > portable?


> You said not to worry about CHAR_BIT != 8, but I'll still mention that
> the code will report LITTLE_ENDIAN if sizeof(unsigned int) == 1 (which
> can only happen if CHAR_BIT >= 16). If an int is a single byte, then
> it has no meaningful byte ordering.


I used unsigned short with G Patel's code and got LITTLE ENDIAN on
Linux + i386 and BIG ENDIAN on an HP 9000/800/rp3410.
CHAR_BIT != 8 is unrelated to sizeof(unsigned int) != 4. Byte order
is meaningful wherever a type is represented by more than one byte.

 
Frederick Gotham
      08-18-2006
G Patel posted:

> union
> {
> unsigned int whole;
> unsigned char bytes[sizeof(int)];
> } var;



This has been discussed several times.

Watch out for:

(1) Padding inside an int.
(2) Trap values for an int.

I posted some C++ code a while back which does this; give me a minute and
I'll convert it to C... *time passes*. Here's my best shot at a C-ification.

(I wasn't sure which format specifier would print a size_t.)

#include <limits.h>
#include <stddef.h>
#include <stdio.h>

typedef unsigned UType;

typedef struct ByteIndexes {
    size_t indexes[sizeof(UType)];
} ByteIndexes;

ByteIndexes DetermineIndexes(void)
{
    ByteIndexes bi;
    size_t *pindex = bi.indexes;

    UType guinea_pig = 0;
    char unsigned const *p = (char unsigned const*)&guinea_pig;
    char unsigned const *const pover = (char unsigned const*)(&guinea_pig + 1);

    UType byte_number = 1;

    /* Store the value k in the byte of significance k (byte 0 stays 0). */
    do guinea_pig |= byte_number << CHAR_BIT * byte_number;
    while (++byte_number != sizeof guinea_pig);

    /* Read the bytes back in memory order. */
    do *pindex++ = *p++;
    while(p != pover);

    return bi;
}

size_t LSBIndexToByteIndex(size_t const i)
{
    ByteIndexes static bi;

    int static first_time = 1;

    /* Compute the table once, on the first call. */
    if(first_time) first_time = 0, bi = DetermineIndexes();

    return bi.indexes[i];
}

int main(void)
{
    size_t i;

    printf("============================\n"
           "|| Byte Order ||\n"
           "============================\n\n"
           "LSB: Byte 0 -- Memory Address ");

    for(i = 0; ; )
    {
        printf("%lu\n",LSBIndexToByteIndex(i++));

        if (sizeof(UType) == i) break;

        printf(
            sizeof(UType) == i + 1 ?
            "MSB: Byte %lu -- Memory Address "
            : " Byte %lu -- Memory Address ",i);
    }

    return 0;
}

--

Frederick Gotham
 
Clark S. Cox III
      08-18-2006
Frederick Gotham wrote:
> G Patel posted:
>
>> union
>> {
>> unsigned int whole;
>> unsigned char bytes[sizeof(int)];
>> } var;

>
>
> This has been discussed several times.
>
> Watch out for:
>
> (1) Padding inside an int.
> (2) Trap values for an int.
>
> I posted some C++ code a while back which does this; give me a minute and
> I'll convert it to C... *time passes*. Here's my best shot at a C-ification.
>
> (I wasn't sure which format specifier would print a size_t.)


FYI: As of C99, %zu will print a size_t. Before C99, there wasn't one,
and the best one could do was to cast the size_t to an unsigned long and
then use %lu.

--
Clark S. Cox III

 
Keith Thompson
      08-18-2006
Frederick Gotham writes:
[...]
> I posted some C++ code a while back which does this; give me a minute and
> I'll convert it to C... *time passes*. Here's my best shot at a C-ification.
>
> (I wasn't sure which format specifier would print a size_t.)

[...]
> size_t LSBIndexToByteIndex(size_t const i)

[...]
> printf("%lu\n",LSBIndexToByteIndex(i++));


That's not it. "%lu" expects an unsigned long, which is often
compatible with size_t (in fact I don't think I've ever seen a system
where it isn't), but it's not guaranteed.

C99 has "%zu", but a lot of *printf() implementations don't support
that.

The most portable solution is to use "%lu" and cast the size_t
argument to unsigned long:

printf("%lu\n", (unsigned long)LSBIndexToByteIndex(i++));

This can fail if size_t is bigger than unsigned long (not possible in
C90 and explicitly discouraged in C99) *and* if the actual value
exceeds ULONG_MAX; in that case, the printed result will be reduced
modulo ULONG_MAX+1.

 
Gordon Burditt
      08-19-2006
>Code in question (assuming CHAR_BIT = 8 system):

There are 24 possible byte orders if sizeof(int) = 4, and 40,320
possible byte orders if sizeof(int) = 8. (Note that this does NOT
depend on CHAR_BIT = 8, but it still works if it is).

Your code mis-identifies some of these as big-endian, some as little-endian,
and identifies some of these (correctly, I guess) under the collective
name HUH??? .


>
> union
> {
> unsigned int whole;
> unsigned char bytes[sizeof(int)];
> } var;
>
> var.whole = 0xFF;
>
> if( var.bytes[0] == 0xFF )
> printf("\nLITTLE ENDIAN\n");
> else if( var.bytes[sizeof(int)-1] == 0xFF )
> printf("\nBIG ENDIAN\n");
> else
> printf("\nHUH???\n");
>
>
>
>
>I'm wondering about the portability of the above ENDIAN tester code
>(don't worry about CHAR_BIT != 8 as an issue).
>
>I've read some really knowledgeable posts on clc before that kept
>emphasizing the fact that C pastes a small abstraction layer over
>memory/hardware. And that 2 contiguous bytes in C's layer is not
>necessarily 2 contiguous in RAM (or process memory space) -or- the
>order of the bytes in C's layer is not necessarily the same as
>hardware. So with this in mind, can the above code be made more
>portable?
>
>Thanks
>
>Gaya
>



 