Typecast clarification

 
 
syuga2012@gmail.com
 
      02-12-2009
Hi Folks,

To determine if a machine is little endian / big endian the foll. code
snippet is used...

int num = 1;

if( * (char *)&num == 1)
printf ("\n Little Endian");

else
printf("\n Big endian");

I needed a few clarifications regarding this.

1. Can we use void * instead of char * ?
2. When do we use void * and when char * ?
3. Does the above typecast convert an integer to a char (1 byte) in
memory?
For e.g if I used a variable ch, to store the result of the above
typecast

4. In general, when can we safely do typecasts ? Are such code
portable ?

Thanks a lot for your help. Appreciate it.

syuga
 
 
 
 
 
WANG Cong
 
      02-12-2009
syuga2012@gmail.com wrote:

> Hi Folks,
>
> To determine if a machine is little endian / big endian the foll. code
> snippet is used...
>
> int num = 1;
>
> if( * (char *)&num == 1)
> printf ("\n Little Endian");
>
> else
> printf("\n Big endian");
>
> I needed a few clarifications regarding this.
>
> 1. Can we use void * instead of char * ?


Here? No, you can not dereference a void* pointer.

> 2. When do we use void * and when char * ?


void* is used for generic pointers; malloc(), for example, returns
void*, which lets you pass pointers around without casts.

Here, in your case, char * is used because the code only wants to
fetch one byte from a 4-byte int.
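
For example, a minimal sketch of the two uses side by side (added here
for illustration, not part of the original reply): malloc()'s void *
converts implicitly, while a character pointer is what you dereference
to look at individual bytes.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *p = malloc(10 * sizeof *p);   /* void * converts to int * with no cast */
    if (p == NULL)
        return 1;
    p[0] = 1;

    unsigned char *bytes = (unsigned char *)p;   /* view the bytes of p[0] */
    printf("first byte of p[0]: %u\n", (unsigned)bytes[0]);

    free(p);
    return 0;
}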

> 3. Does the above typecast convert an integer to a char (1 byte) in
> memory?
> For e.g if I used a variable ch, to store the result of the above
> typecast


No, it casts an int pointer to char pointer.

>
> 4. In general, when can we safely do typecasts ? Are such code
> portable ?


When you understand what you are doing.


 
 
 
 
 
nick_keighley_nospam@hotmail.com
 
      02-12-2009
Subject: "Typecast clarification"

technically, what you are doing is "casting" not "typecasting"
[prepare for flamewar]


On 12 Feb, 09:49, "(E-Mail Removed)" <(E-Mail Removed)> wrote:
> Hi Folks,
>
> To determine if a machine is little endian / big endian the foll. code
> snippet is used...
>
> int num = 1;
>
> if( * (char *)&num == 1)
> printf ("\n Little Endian");
> else
> printf("\n Big endian");


that takes the address of num, casts it to a pointer to char
(unsigned char might be slightly safer), then dereferences
it to give a char. Assuming 8-bit chars and 32-bit ints, the
number will be stored like this if the machine is little endian:

lo       hi
01 00 00 00

and like this if it is big endian:

lo       hi
00 00 00 01

so the code will do what you expect. Note there
are more than 2 ways to order 4 objects...
(and some of them *have* been used)
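
A quick way to see the order for yourself (an added sketch, not part of
the original reply) is to dump the bytes of an int whose bytes all
differ:

#include <stdio.h>

int main(void)
{
    unsigned int num = 0x01020304u;   /* four distinct byte values; assumes
                                         unsigned int is at least 32 bits  */
    unsigned char *p = (unsigned char *)&num;

    /* prints the bytes in memory order: e.g. 04 03 02 01 on a
       little-endian machine, 01 02 03 04 on a big-endian one */
    for (size_t i = 0; i < sizeof num; i++)
        printf("%02X ", (unsigned)p[i]);
    putchar('\n');
    return 0;
}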

> I needed a few clarifications regarding this.
>
> 1. Can we use void * instead of char * ?


no. You cannot dereference a (void*)
(some compilers allow this but they are not compliant with the
standard)

> 2. When do we use void * and when char * ?


(void*) for anonymous or unknown type. (char*) for
pointer to characters (eg strings). (unsigned char*)
for getting at the representation. It is safe to cast
a pointer to any object to (unsigned char*).

> 3. Does the above typecast convert an integer to a char (1 byte) in
> memory?


it doesn't actually modify the value in memory, only
how the program looks at it.

> For e.g if I used a variable ch, to store the result of the above
> typecast


sorry, lost me. Could you post code?


> 4. In general, when can we safely do typecasts ?


when necessary. There's no short answer to this one.

> Are such code portable ?


sometimes. More often than not, though, no.


> Thanks a lot for your help. Appreciate it.


happy coding


--
Nick Keighley

"Half-assed programming was a time-filler that, like knitting,
must date to the beginning of human experience."
"A Fire Upon The Deep" by Verne Vinge
 
 
James Kuyper
 
      02-12-2009
(E-Mail Removed) wrote:
> Hi Folks,
>
> To determine if a machine is little endian / big endian the foll. code
> snippet is used...
>
> int num = 1;
>
> if( * (char *)&num == 1)
> printf ("\n Little Endian");
>
> else
> printf("\n Big endian");


Note: this code assumes that there are only two possible
representations. That's a good approximation to reality, but it's not
the exact truth. If 'int' is a four-byte type (which it is on many
compilers), there's 24 different byte orders theoretically possible, 6
of which would be identified as Little Endian by this code, 5 of them
incorrectly. 18 of them would be identified as Big Endian, 17 of them
incorrectly.

This would all be pure pedantry, if it weren't for one thing: of those
24 possible byte orders, something like 8 to 11 of them (I can't
remember the exact number) are in actual use on real world machines.
Even that would be relatively unimportant if big-endian and little-endian
were overwhelmingly the most popular choices, but that's not even the
case: the byte orders 2134 and 3412 have both been used in some fairly
common machines.

The really pedantic issue is that the standard doesn't even guarantee
that 'char' and 'int' number the bits in the same order. A conforming
implementation of C could use the same bit that is used by an 'int'
object to store a value of '1' as the sign bit when the byte containing
that bit is interpreted as a char.

> I needed a few clarifications regarding this.
>
> 1. Can we use void * instead of char * ?


No, because you cannot dereference a pointer to void.

> 2. When do we use void * and when char * ?


The key differences between char* and void* are that
a) you cannot dereference or perform pointer arithmetic on void*
b) there are implicit conversions between void* and any other pointer to
object type.

The general rule is that you should use void* whenever the implicit
conversions are sufficiently important. The standard library's mem*()
functions are a good example where void* is appropriate, because they
are frequently used on pointers to types other than char. You should use
char* whenever you're actually accessing the object as an array of
characters, which requires pointer arithmetic and dereferencing. You
should use unsigned char* when accessing the object as an array of
uninterpreted bytes.
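
As an illustration of that rule of thumb (an added sketch; fill_bytes is
a hypothetical name, not a library function), a mem*()-style interface
takes void * so that any object pointer converts implicitly, while the
implementation walks the object's bytes through unsigned char *:

#include <stddef.h>

void *fill_bytes(void *dest, unsigned char value, size_t n)
{
    unsigned char *p = dest;          /* implicit void * -> unsigned char * */
    for (size_t i = 0; i < n; i++)
        p[i] = value;
    return dest;
}

/* usage: any object pointer converts to void * without a cast, e.g.
 *     int x;
 *     fill_bytes(&x, 0, sizeof x);
 */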

> 3. Does the above typecast convert an integer to a char (1 byte) in
> memory?


There's no such thing as a typecast in C. There is a type conversion,
which can occur either implicitly, or explicitly. Explicit conversions
occur as a result of cast expressions.

The (char*) cast does not convert an integer into a char. It converts a
pointer to an int into a pointer to a char. The char object it points at
is the first byte of 'num'. The * operator interprets that byte as a char.

> For e.g if I used a variable ch, to store the result of the above
> typecast


The result of the cast expression is a pointer to char; it can be
converted into a char and stored into a char variable, but the result of
that conversion is probably meaningless unless sizeof(intptr_t) == 1,
which is pretty unlikely. It would NOT, in general, have anything to do
with the value stored in the first byte of "num".

You could write:

char c = *(char*)&num;

> 4. In general, when can we safely do typecasts ? Are such code
> portable ?


The only type conversions that are reasonably safe in portable code are
the ones which occur implicitly, without the use of a cast, and even
those have dangers. Any use of a cast should be treated as a danger
sign. The pattern *(T*), where T is an arbitrary type, is called type
punning. In general, this is one of the most dangerous uses of a cast.
In the case where T is "char", it happens to be relatively safe.
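
To make that concrete (an added sketch, not part of the original reply):
the character-type form of the pun is the relatively safe one; punning
through a non-character type is where the aliasing rules bite, and
memcpy() is the usual portable alternative.

#include <stdio.h>
#include <string.h>

int main(void)
{
    float f = 1.0f;

    /* relatively safe: character types may alias any object */
    unsigned char *bytes = (unsigned char *)&f;
    printf("first byte: %u\n", (unsigned)bytes[0]);

    /* *(unsigned *)&f would be a pun through a non-character type and can
       fall foul of 6.5p7; copying the bytes avoids that problem entirely */
    unsigned u;
    if (sizeof u == sizeof f) {
        memcpy(&u, &f, sizeof u);
        printf("bits: %08X\n", u);
    }
    return 0;
}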

The best answer to your question is to read section 6.3 of the standard.
However, it may be hard for someone unfamiliar with standardese to
translate what section 6.3 says into "safe" or "unsafe", "portable" or
"unportable". Here's my quick attempt at a translation:

* Any value may be converted to void; there's nothing that you can do
with the result. The only use for such a cast would be to shut up the
diagnostics that some compilers generate when you fail to do anything
with the value returned by a function. However, it is perfectly safe.

* Converting any numeric value to a type that is capable of storing that
value is safe. If the value is currently of a type which has a range
which is guaranteed to be a subset of the range of the target type,
safety is automatic - for instance, when converting "signed char" to
"int". Otherwise, it's up to your program to make sure that the value is
within the valid range.

* Converting a value to a signed or floating point type that is outside
of the valid range for that type is not safe.

* Converting a numeric value to an unsigned type that is outside the
valid range is safe, in the sense that your program will continue
running; but the resulting value will be different from the original by
a multiple of the number that is one more than the maximum value which
can be stored in that type. If that change in value is desired and
expected (D&E), that's a good thing, otherwise it's bad.

* Converting a floating point value to an integer type will lose the
fractional part of that value. If this is D&E, good, otherwise, bad.

* Converting a floating point value to a type with lower precision will
generally lose precision. If this is acceptable and expected, good -
otherwise, bad.

* Converting a _Complex value to a real type will cause the imaginary
part of the value to be discarded. Converting it to an _Imaginary type
will cause the real part of the value to be discarded. Converting
between real and _Imaginary types will always result in a value of 0. In
each of these cases, if the change in value is D&E, good - otherwise, bad.

* Converting a null pointer constant to a pointer type results in a null
pointer of that type. Converting a null pointer to a different pointer
type results in a null pointer of that target type. Both conversions are
safe.

* Converting a pointer to an integer type is safe, but unless the target
type is either an intptr_t or a uintptr_t, the result is
implementation-defined, rendering it pretty much useless, at least in
portable code. If the target type is intptr_t or uintptr_t, the result
may be safely converted back to the original pointer type, and the
result of that conversion will compare equal to the original pointer.
You can safely treat that integer value just like any other integer
value, but conversion back to the original pointer type is the only
meaningful thing that can be done with it.

* Except as described above, converting an integer value into a pointer
type is always dangerous. Note: an integer constant expression with a
value of 0 qualifies as a null pointer constant. Therefore, it qualifies
as one of the cases "described above".

* Any pointer to a function type may be safely converted into a pointer
to a different function type. The result may be converted back to the
original pointer type, in which case it will compare equal to the
original pointer. However, you can only safely call through a function
pointer if it points at a function whose actual type is compatible with
the type that the pointer points at.

* Conversions which add a qualifier to a pointer type (such as int* =>
const int*) are safe.

* Conversions which remove a qualifier from a pointer type (such as
volatile double * => double *) are safe in themselves, but are
invariably needed only to perform operations that can be dangerous
unless you know precisely what the relevant rules are.

* A pointer to any object can be safely converted into a pointer to a
character type. The result points at the first byte of that object.

* Conversion of a pointer to an object or incomplete type into a pointer
to a different object or incomplete type is safe, but only if it is
correctly aligned for that type. There are only a few cases where you
can be portably certain that the alignment is correct, which limits the
usefulness of this case.

Except as indicated above, the standard says absolutely nothing about
WHERE the resulting pointer points at, which in principle even more
seriously restricts the usefulness of the result of such a conversion.
However, in practice, on most real systems the resulting pointer will
point at the same location in memory as the original pointer.

However, it is only safe to dereference such a pointer if you do so in a
way that conforms to the anti-aliasing rules (6.5p7). And that is what
makes type punning so dangerous.
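
Two of the cases above (conversion to an unsigned type, and the pointer
round trip through uintptr_t) lend themselves to a short demonstration.
This is an added sketch, not part of the original reply; it assumes
8-bit chars for the commented arithmetic, and that the implementation
provides the optional uintptr_t type.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* conversion to an unsigned type changes the value by a multiple of
       (maximum value + 1); it is never undefined behaviour */
    unsigned char uc = 300;   /* with 8-bit chars: 300 - 256 = 44 */
    unsigned int  ui = -1;    /* wraps to UINT_MAX: -1 + (UINT_MAX + 1) */
    printf("%u %u\n", (unsigned)uc, ui);

    /* pointer -> uintptr_t -> pointer round trip */
    int x = 42;
    uintptr_t bits = (uintptr_t)(void *)&x;
    int *p = (void *)bits;
    printf("%d\n", *p);       /* p compares equal to &x */

    return 0;
}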
 
 
Boon
 
      02-12-2009
syuga wrote:

> To determine if a machine is little endian or big endian, the
> following code snippet is used...
>
> int num = 1;
>
> if( * (char *)&num == 1)
> printf ("\n Little Endian");
>
> else
> printf("\n Big endian");


You don't need casts if you use memcmp.

$ cat endian.c
#include <stdint.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    uint32_t i = 0x12345678;
    uint8_t msb_first[4] = { 0x12, 0x34, 0x56, 0x78 };
    uint8_t lsb_first[4] = { 0x78, 0x56, 0x34, 0x12 };

    if (memcmp(&i, msb_first, 4) == 0)
        puts("BIG ENDIAN");
    else if (memcmp(&i, lsb_first, 4) == 0)
        puts("LITTLE ENDIAN");
    else
        puts("SOMETHING ELSE");

    return 0;
}
 
 
Bruce Cook
 
      02-12-2009
James Kuyper wrote:

> (E-Mail Removed) wrote:
>> Hi Folks,
>>
>> To determine if a machine is little endian / big endian the foll. code
>> snippet is used...
>>
>> int num = 1;
>>
>> if( * (char *)&num == 1)
>> printf ("\n Little Endian");
>>
>> else
>> printf("\n Big endian");

>
> Note: this code assumes that there are only two possible
> representations. That's a good approximation to reality, but it's not
> the exact truth. If 'int' is a four-byte type (which it is on many
> compilers), there's 24 different byte orders theoretically possible, 6
> of which would be identified as Little Endian by this code, 5 of them
> incorrectly. 18 of them would be identified as Big Endian, 17 of them
> incorrectly.
>
> This would all be pure pedantry, if it weren't for one thing: of those
> 24 possible byte orders, something like 8 to 11 of them (I can't
> remember the exact number) are in actual use on real world machines.
> Even that would be relatively unimportant if bigendian and littlendian
> were overwhelmingly the most popular choices, but that's not even the
> case: the byte orders 2134 and 3412 have both been used in some fairly
> common machines.


And there are arguments as to whether 2143, 3412 or 4321 is the "real" big-
endian; once word size jumped from 16 bits to 32 bits, endianness became a bit
complicated. Its original intent was to enable short-word and word fetches
to fetch the same value, assuming the word contained a small value. This
came about because processors often had octet as well as word instructions.

Once 32 bits came about and instructions had 8-, 16- and 32-bit operand
sizes, the question was whether to optimize for 8-bit or 16-bit fetches.
Different processor designers came up with different solutions to this,
which led to all the differing endians.

Then when you get to 64-bit native machines such as the Alpha, there are even
more combinations (8 octets per word instead of just 4).

The Alpha is interesting because its endianness is controllable, although in
practice you'd have it fixed for a particular operating system, so testing
for it would still be valid.

[...]

Bruce


 
 
Keith Thompson
 
      02-12-2009
James Kuyper <(E-Mail Removed)> writes:
[...]
> Note: this code assumes that there are only two possible
> representations. That's a good approximation to reality, but it's not
> the exact truth. If 'int' is a four-byte type (which it is on many
> compilers), there's 24 different byte orders theoretically possible, 6
> of which would be identified as Little Endian by this code, 5 of them
> incorrectly. 18 of them would be identified as Big Endian, 17 of them
> incorrectly.
>
> This would all be pure pedantry, if it weren't for one thing: of those
> 24 possible byte orders, something like 8 to 11 of them (I can't
> remember the exact number) are in actual use on real world
> machines. Even that would be relatively unimportant if bigendian and
> littlendian were overwhelmingly the most popular choices, but that's
> not even the case: the byte orders 2134 and 3412 have both been used
> in some fairly common machines.


Really? I've only heard of 1234, 4321, 2143, and 3412 being used in
real life. In fact, I've only heard of one of the last two (whichever
one the PDP-11 used). What other orders have been used, and *why*?

[...]

> * Converting a numeric value to an unsigned type that is outside the
> valid range is safe, in the sense that your program will continue
> running; but the resulting value will be different from the original
> by a multiple of the number that is one more than the maximum value
> which can be stored in that type. If that change in value is desired
> and expected (D&E), that's a good thing, otherwise it's bad.


Almost. Converting a *signed or unsigned* value to an unsigned type
is safe, as you describe. Converting a floating-point value to
unsigned, if the value is outside the range of the unsigned type,
invokes undefined behavior.
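
For example (an added illustration of the distinction, not from the
original post):

#include <stdio.h>

int main(void)
{
    double d = 3.75;
    unsigned int ok = d;            /* fine: truncates to 3, which is in range */
    /* unsigned int bad = -1.5; */  /* undefined behavior: the truncated value
                                       -1 is outside the range of unsigned int */
    printf("%u\n", ok);
    return 0;
}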

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
 
 
jameskuyper
 
      02-12-2009
Keith Thompson wrote:
> James Kuyper <(E-Mail Removed)> writes:
> [...]
> > Note: this code assumes that there are only two possible
> > representations. That's a good approximation to reality, but it's not
> > the exact truth. If 'int' is a four-byte type (which it is on many
> > compilers), there's 24 different byte orders theoretically possible, 6
> > of which would be identified as Little Endian by this code, 5 of them
> > incorrectly. 18 of them would be identified as Big Endian, 17 of them
> > incorrectly.
> >
> > This would all be pure pedantry, if it weren't for one thing: of those
> > 24 possible byte orders, something like 8 to 11 of them (I can't
> > remember the exact number) are in actual use on real world
> > machines. Even that would be relatively unimportant if bigendian and
> > littlendian were overwhelmingly the most popular choices, but that's
> > not even the case: the byte orders 2134 and 3412 have both been used
> > in some fairly common machines.

>
> Really? I've only heard of 1234, 4321, 2143, and 3412 being used in


My reference to 2134 was a typo - I meant 2143.

> real life. In fact, I've only heard of one of the last two (whichever
> one the PDP-11 used). What other orders have been used, and *why*?


I remember seeing a web site that listed a large number of
orders in current use, and cited specific machines for each byte
order. Unfortunately, I did not save the URL, so I can't cite it.
Sorry!
However it is sufficient for my purposes that 2143 and 3412 are in
use, and all you have to do to verify that is to do a web search for
"middle endian".

> > * Converting a numeric value to an unsigned type that is outside the
> > valid range is safe, in the sense that your program will continue
> > running; but the resulting value will be different from the original
> > by a multiple of the number that is one more than the maximum value
> > which can be stored in that type. If that change in value is desired
> > and expected (D&E), that's a good thing, otherwise it's bad.

>
> Almost. Converting a *signed or unsigned* value to an unsigned type
> is safe, as you describe. Converting a floating-point value to
> unsigned, if the value is outside the range of the unsigned type,
> invokes undefined behavior.


You're right. It's not an issue I've had to worry about very often,
and I remembered it incorrectly. I did the first 7 items on my list
straight from memory, and I should have double-checked them against
the standard before posting.

 
 
LL
 
      02-12-2009

"WANG Cong" <(E-Mail Removed)> wrote in message
news:gn125p$14h$(E-Mail Removed)99.com...
> (E-Mail Removed) wrote:
>
>> Hi Folks,
>>
>> To determine if a machine is little endian / big endian the foll. code
>> snippet is used...
>>
>> int num = 1;
>>
>> if( * (char *)&num == 1)
>> printf ("\n Little Endian");

I'm a novice at C too, but this makes no sense to me.
Refer to the C precedence table
(http://isthe.com/chongo/tech/comp/c/c-precedence.html). Here unary * has
the highest precedence, then comes ==, then &. So what's this supposed to
mean? Dereferencing what?

>>
>> else
>> printf("\n Big endian");
>>
>> I needed a few clarifications regarding this.
>>
>> 1. Can we use void * instead of char * ?

>
> Here? No, you can not dereference a void* pointer.
>
>> 2. When do we use void * and when char * ?

>
> void* is used for generic pointers; malloc(), for example, returns
> void*, which lets you pass pointers around without casts.
>
> Here, in your case, char * is used because the code only wants to
> fetch one byte from a 4-byte int.
>
>> 3. Does the above typecast convert an integer to a char (1 byte) in
>> memory?
>> For e.g if I used a variable ch, to store the result of the above
>> typecast

>
> No, it casts an int pointer to char pointer.
>
>>
>> 4. In general, when can we safely do typecasts ? Are such code
>> portable ?

>
> When you understand what you are doing.

Could someone tell me how this tests for endianness?

 
 
jameskuyper
 
      02-12-2009
LL wrote:
> "WANG Cong" <(E-Mail Removed)> wrote in message
> news:gn125p$14h$(E-Mail Removed)99.com...
> > (E-Mail Removed) wrote:

.....
> >> int num = 1;
> >>
> >> if( * (char *)&num == 1)
> >> printf ("\n Little Endian");

> I'm a novice on C too but here makes no sense.
> Refer to C Precedence Table
> (http://isthe.com/chongo/tech/comp/c/c-precedence.html). Here unary * has
> the highest precedence, then comes == then &. So what's this supposed to
> mean? Dereferencing what?


It's a mistake to pay too much attention to precedence tables. The C
standard defines things in terms of grammar, not in terms of
precedence, and the relevant grammar rule is 6.5.3p1:

"unary-expression:
...
unary-operator cast-expression

unary-operator: one of
& * + - ~ !
"

Thus, & and * have the same "precedence"; the key issue is whether or
not the thing to the right of the operator can be parsed as a cast-
expression. You can't parse anything to the right of the '*' operator
as a cast-expression that is shorter than "(char*)&num". Therefore the
&num has to be evaluated first, giving a pointer to 'num'. Then
(char*) is applied to that pointer, converting it to a pointer to
char. Finally, the unary '*' is evaluated, returning the value of the
byte at that location, interpreted as a char. That value is then
compared to 1 for equality.
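
In other words (an added restatement, not part of the original reply),
the test parses as if it had been written with full parentheses:

#include <stdio.h>

int main(void)
{
    int num = 1;

    int a = *(char *)&num == 1;          /* the expression as written        */
    int b = (*((char *)(&num))) == 1;    /* fully parenthesized equivalent   */

    printf("%d %d\n", a, b);             /* both comparisons give the same result */
    return 0;
}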

> Could someone tell me how does this test for endianness?


If 'int' is a little-endian type, the bit that will be set is in the
first byte; if it's a big-endian type, the bit that will be set is in
the last byte. If those were the only two possibilities, this would be
a good way to find out which one it is.
 