Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   C++ (http://www.velocityreviews.com/forums/f39-c.html)

 junw2000@gmail.com 01-25-2007 08:45 AM

Hi,

For the code below:

char *c = "0113";
unsigned short *p;
p = (unsigned short*)c; //LINE1
std::cout<<"*p: "<<*p<<'\n';
std::cout<<"*c: "<<c<<'\n';

The output is:
*p: 12592
*c: 0113

Why?
How does LINE1 work? For the string "0113", there is an implicit '\0' at
the end. How does LINE1 handle it?

Thanks

Jack

 Ian Collins 01-25-2007 08:52 AM

junw2000@gmail.com wrote:
> Hi,
>
> For the code below:
>
> char *c = "0113";

should be const char*.

> unsigned short *p;
> p = (unsigned short*)c; //LINE1
> std::cout<<"*p: "<<*p<<'\n';
> std::cout<<"*c: "<<c<<'\n';
>
> The output is:
> *p: 12592
> *c: 0113
>
> Why?

What else would you expect?

> How does LINE1 work? For the string "0113", there is an implicit '\0' at
> the end. How does LINE1 handle it?
>

It assigns the value of c to p. Assuming sizeof unsigned short to be 2,
*p is the first two bytes of the string literal pointed to by c.

Convert 12592 to hex and check the ASCII values for '0' and '1'.

--
Ian Collins.
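
The check suggested above can be sketched as follows. This is only an illustration, assuming 8-bit bytes and an ASCII character set; low_byte, high_byte and show_decomposition are hypothetical helper names that do not appear in the thread.

```cpp
#include <iostream>

// Split a 16-bit value into its low-order and high-order bytes.
// Both helper names are hypothetical, chosen for this illustration.
unsigned low_byte(unsigned value)  { return value & 0xFFu; }
unsigned high_byte(unsigned value) { return (value >> 8) & 0xFFu; }

// Print 12592 and its two bytes in hex, then the character codes of
// '0' and '1' in decimal, so the two can be compared by eye.
void show_decomposition() {
    const unsigned value = 12592;
    std::cout << std::hex << std::showbase
              << "value:     " << value << '\n'
              << "low byte:  " << low_byte(value) << '\n'
              << "high byte: " << high_byte(value) << '\n'
              << std::dec << std::noshowbase;
    std::cout << "'0' is " << static_cast<unsigned>('0')
              << ", '1' is " << static_cast<unsigned>('1') << '\n';
}
```

12592 is 0x3130, so its low-order byte is 0x30 (ASCII '0') and its high-order byte is 0x31 (ASCII '1').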

 Alf P. Steinbach 01-25-2007 08:56 AM

* junw2000@gmail.com:
> Hi,
>
> For the code below:
>
> char *c = "0113";
> unsigned short *p;
> p = (unsigned short*)c; //LINE1
> std::cout<<"*p: "<<*p<<'\n';
> std::cout<<"*c: "<<c<<'\n';
>
> The output is:
> *p: 12592
> *c: 0113
>
> Why?

Why not? What did you expect?

> How does LINE1 work?

It uses a C-style cast, which is interpreted as a reinterpret_cast.

Look up reinterpret_cast.

Then remember in the future to not use C-style casts, and remember that
while you're still a novice every occurrence of reinterpret_cast in
your code means you have a bug.

> For the string "0113", there is an implicit '\0' at
> the end. How does LINE1 handle it?

It doesn't.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

 john_andronicus@hotmail.com 01-25-2007 12:35 PM

On 25 Jan, 08:45, junw2...@gmail.com wrote:
> Hi,
>
> For the code below:
>
> char *c = "0113";
> unsigned short *p;
> p = (unsigned short*)c; //LINE1
> std::cout<<"*p: "<<*p<<'\n';
> std::cout<<"*c: "<<c<<'\n';
>
> The output is:
> *p: 12592
> *c: 0113
>
> Why?

Because casts like this have undefined behaviour. There is no why;
undefined behaviour means anything is allowed to happen.

> How does LINE1 work? For the string "0113", there is an implicit '\0' at
> the end. How does LINE1 handle it?

Since LINE1 is undefined behaviour any further questions are
meaningless.

 Kirit Sælensminde 01-26-2007 04:00 AM

On Jan 25, 3:45 pm, junw2...@gmail.com wrote:
> For the code below:
>
> char *c = "0113";
> unsigned short *p;
> p = (unsigned short*)c; //LINE1
> std::cout<<"*p: "<<*p<<'\n';
> std::cout<<"*c: "<<c<<'\n';
>
> The output is:
> *p: 12592
> *c: 0113
>
> Why?

It looks like the program is trying to show both the memory location
where the string is located (p) and the string itself. It is showing that if
you send a char * (should be const char *) to an IO stream it will
print as a character string. However, if you send a pointer to any
other type to an IO stream it will display the memory location.

The cast in //LINE1 is a nasty hack and you should never do anything
like it. I don't know if the cast itself is UB, but I'm pretty sure
that dereferencing p would be (if I had to guess I would say that the
code was written on a platform where sizeof( unsigned short ) ==
sizeof( char ) == 1). If p were a const void * then it would be
correct, but still not a good thing to do unless forced (normally to
interact with C style code).

> How does LINE1 work? For the string "0113", there is an implicit '\0' at
> the end. How does LINE1 handle it?

It doesn't. c is really a pointer to the memory location that the
string is stored at. The line is simply a way of getting that memory
location pointer into something that can be displayed by the IO stream
as a pointer rather than a string.
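
The stream behaviour described above can be sketched like this. It is only an illustration; stream_as_text and stream_as_address are hypothetical names. Note that the snippet in the original question streams *p (an unsigned short value) rather than p itself, which is why it prints a number and not an address.

```cpp
#include <sstream>
#include <string>

// Returns what operator<< writes for a character pointer:
// the characters of the null-terminated string it points to.
std::string stream_as_text(const char* s) {
    std::ostringstream out;
    out << s;
    return out.str();
}

// Returns what operator<< writes for the same pointer once it has been
// converted to const void*: an implementation-defined rendering of the
// address, not the characters.
std::string stream_as_address(const char* s) {
    std::ostringstream out;
    out << static_cast<const void*>(s);
    return out.str();
}
```

For example, stream_as_text("0113") yields "0113", while the exact text returned by stream_as_address depends on the platform.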

The first three lines are reasonable C, but unreasonable C++.

K

 junw2000@gmail.com 01-26-2007 08:02 AM

On Jan 25, 12:52 am, Ian Collins <ian-n...@hotmail.com> wrote:
> junw2...@gmail.com wrote:
> > Hi,
> >
> > For the code below:
> >
> > char *c = "0113";
>
> should be const char*.
>
> > unsigned short *p;
> > p = (unsigned short*)c; //LINE1
> > std::cout<<"*p: "<<*p<<'\n';
> > std::cout<<"*c: "<<c<<'\n';
> >
> > The output is:
> > *p: 12592
> > *c: 0113
> >
> > Why?
>
> What else would you expect?

Maybe I should do this:
p = static_cast<unsigned short*>c;
Is it right?
I need to compute a checksum of a string. The function is like this: checksum(
unsigned short *p, int count).
So I have to convert char* to unsigned short*. Is there any better way to do
it?

>
> > How does LINE1 work? For the string "0113", there is an implicit '\0' at
> > the end. How does LINE1 handle it?
>
> It assigns the value of c to p. Assuming sizeof unsigned short to be 2,
> *p is the first two bytes of the string literal pointed to by c.
>
> Convert 12592 to hex and check the ASCII values for '0' and '1'.

The binary of 12592 is 11000100110000. The binary of '0' is 110000. The
binary of '1' is 110001.
After the cast, why does it become '10' rather than '01'?

Thanks.

Jack

 Ian Collins 01-26-2007 08:11 AM

junw2000@gmail.com wrote:
>
> On Jan 25, 12:52 am, Ian Collins <ian-n...@hotmail.com> wrote:
>
> Maybe I should do this:
> p = static_cast<unsigned short*>c;

p = reinterpret_cast<unsigned short*>(c);

>
>>>How does LINE1 work? For the string "0113", there is an implicit '\0' at
>>>the end. How does LINE1 handle it?
>>
>>It assigns the value of c to p. Assuming sizeof unsigned short to be 2,
>>*p is the first two bytes of the string literal pointed to by c.
>>
>>Convert 12592 to hex and check the ASCII values for '0' and '1'.
>
> The binary of 12592 is 11000100110000. The binary of '0' is 110000. The
> binary of '1' is 110001.
> After the cast, why does it become '10' rather than '01'?
>

--
Ian Collins.

 Kai-Uwe Bux 01-26-2007 08:31 AM

junw2000@gmail.com wrote:

>
>
> On Jan 25, 12:52 am, Ian Collins <ian-n...@hotmail.com> wrote:
>> junw2...@gmail.com wrote:
>> > Hi,
>> >
>> > For the code below:
>> >
>> > char *c = "0113";
>>
>> should be const char*.
>>
>> > unsigned short *p;
>> > p = (unsigned short*)c; //LINE1
>> > std::cout<<"*p: "<<*p<<'\n';
>> > std::cout<<"*c: "<<c<<'\n';
>> >
>> > The output is:
>> > *p: 12592
>> > *c: 0113
>> >
>> > Why?
>>
>> What else would you expect?

>
> Maybe I should do this:
> p = static_cast<unsigned short*>c;
> Is it right?
> I need to compute a checksum of a string. The function is like this: checksum(
> unsigned short *p, int count).
> So I have to convert char* to unsigned short*. Is there any better way to do
> it?

There may be no way to solve the underlying problem by casting pointer types
around. E.g., what happens if the char* points to a place not suitably
aligned for short? What happens if the string contains a number of
characters that is not a multiple of sizeof(short)? Besides, very likely
you have undefined behavior anyway, depending on what checksum() does
internally.

Best

Kai-Uwe Bux
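
One way around the alignment and odd-length concerns raised above is to keep the data as bytes and copy each word into a properly aligned object. The following is only a sketch: the thread never shows what checksum() computes, so the plain sum of 16-bit words is an assumption, and checksum_bytes is a hypothetical name.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical byte-oriented checksum. It sums the data as native-order
// unsigned short words without ever forming an unsigned short* that
// points into the character buffer. The summing rule is an assumption.
unsigned long checksum_bytes(const char* data, std::size_t length) {
    unsigned long sum = 0;
    std::size_t i = 0;

    // Whole words: copy each group of bytes into an aligned object.
    for (; i + sizeof(unsigned short) <= length; i += sizeof(unsigned short)) {
        unsigned short word = 0;
        std::memcpy(&word, data + i, sizeof word);
        sum += word;
    }

    // Trailing bytes that do not fill a whole word: the remaining bytes
    // of the word are left as zero.
    if (i < length) {
        unsigned short word = 0;
        std::memcpy(&word, data + i, length - i);
        sum += word;
    }
    return sum;
}
```

The result still depends on the platform's byte order and on sizeof(unsigned short), so it matches another machine's checksum only if both agree on those.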

 NagelBagel@gmail.com 01-27-2007 01:06 PM

There is a standard way to do this, though it involves a bit of
implementation-defined behavior (such as endianness). There are two
errors in your code: first, the reinterpret_cast is not valid because
unsigned short could have stricter alignment requirements than char;
second, you attempt to access a string literal as an unsigned short. The
latter is wrong for two reasons: one, because the alignment
requirements of short could be stricter than char, and two, because the
standard disallows accessing objects as different types, so the
compiler could optimize it away.

char *c = "0113" is valid, because old C code used that idiom a lot,
so it was included for backwards compatibility. Its use is deprecated
though.

The second problem can be avoided by copying the array into an
unsigned short. The first can be avoided by first casting to void *,
then char * or unsigned char *. You could also use std::memcpy or
std::memmove, since the standard appears to make special consideration
for them (it uses them in examples). This sort of copying is only
allowed for POD types.

Technically the standard only allows for copying of this sort from one
object to another of the same type because types are allowed to have
padding bits and trap bits, but as long as (type(1) <<
type(sizeof(type)) * type(CHAR_BIT)) - 1 is equal to
std::numeric_limits<type>::max(), for unsigned types at least, the
copying will be valid. In practice, I doubt you'll find too many
implementations that go into these peculiarities.

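Here's a valid implementation (it needs <algorithm>, <climits>, <cstddef>,
<iostream> and <limits>):

// Bail out if unsigned short has padding bits. This assumes unsigned long
// is wider than unsigned short, so the shift itself is well defined.
if ((1UL << (sizeof(unsigned short) * CHAR_BIT)) - 1 !=
    std::numeric_limits<unsigned short>::max()) return;
unsigned short s = 0;
const char *c = "0113";
// Copy no more bytes than fit in s; 5 is the size of the literal with its '\0'.
const std::size_t n = std::min<std::size_t>(sizeof(s), 5);
for (std::size_t i = 0; i < n; ++i)
    static_cast<char *>(static_cast<void *>(&s))[i] = c[i];
std::cout<<"s: "<<s<<'\n';
std::cout<<"c: "<<c<<'\n';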

The actual value of s is implementation-defined due to several factors
including the size of s, the representation of unsigned shorts and the
values '0' '1' and '3' map to. The vast majority of platforms will
represent an unsigned short as a two's complement integer of two
bytes, with the bit order the same as a char. The only thing that will
differ normally is endianness, whether the '0' or the '1' will make up
the first byte.
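
The effect of endianness on the value can be sketched as follows. This is only an illustration, assuming 8-bit bytes, a 2-byte unsigned short and an ASCII character set; load_native, load_little_endian and load_big_endian are hypothetical names.

```cpp
#include <cstring>

// Reads the first sizeof(unsigned short) bytes of a buffer in the
// platform's native byte order by copying them, which avoids the
// alignment and aliasing problems of a pointer cast.
// The buffer must hold at least sizeof(unsigned short) bytes.
unsigned short load_native(const char* bytes) {
    unsigned short value = 0;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// Builds a 16-bit value arithmetically with the first byte as the
// low-order byte (little-endian order).
unsigned load_little_endian(const char* bytes) {
    const unsigned b0 = static_cast<unsigned char>(bytes[0]);
    const unsigned b1 = static_cast<unsigned char>(bytes[1]);
    return b0 | (b1 << 8);
}

// Builds a 16-bit value arithmetically with the first byte as the
// high-order byte (big-endian order).
unsigned load_big_endian(const char* bytes) {
    const unsigned b0 = static_cast<unsigned char>(bytes[0]);
    const unsigned b1 = static_cast<unsigned char>(bytes[1]);
    return (b0 << 8) | b1;
}
```

For "0113" the little-endian reading is 0x3130 (12592), which matches the output in the original question, and the big-endian reading is 0x3031 (12337). Under the assumptions above, load_native returns whichever of the two the platform uses.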
