Velocity Reviews


Spiros Bousbouras 03-17-2011 06:43 PM

Re: A portable code to create a 4-bytes Big Endian twos complement
 
On Thu, 17 Mar 2011 19:25:40 +0100
pozz <pozzugno@gmail.com> wrote:
> On 17/03/2011 18:16, pozz wrote:
> > On page 179 of the book "C Unleashed" (by Heathfield, Kirby et al.), there
> > is this code for a similar task (ifp is an input stream):
> > [...]

>
> There is another thing in that book, on a similar topic, that I couldn't
> understand.
>
> On page 178 it explains how to write a two-byte integer value to a
> portable data file and read it back (writing to and reading from a file
> is similar to sending to and receiving from a network).


[...]

> How is it possible that the book is wrong?


Books can have mistakes. But in this case I don't know if it has made a
mistake because I don't know what "two-bytes integer value" means.
Could you provide a precise definition?

> If I want to fix this, how can I do it?


What exactly do you want to achieve?

Spiros Bousbouras 03-17-2011 08:15 PM

Re: A portable code to create a 4-bytes Big Endian twos complement
 
On Thu, 17 Mar 2011 21:03:19 +0100
pozz <pozzugno@gmail.com> wrote:
> On 17/03/2011 19:43, Spiros Bousbouras wrote:
> > Books can have mistakes. But in this case I don't know if it has made a
> > mistake because I don't know what "two-bytes integer value" means.
> > Could you provide a precise definition ?

>
> The book was discussing how to store an integer value in a file in a
> portable way.
> "Suppose we decide that the int as represented in the data file will be
> two bytes in little-endian order [...]".
> After the writing (putc) instructions, the author says:
> "Key point number two is that we're not concerned with how big an int
> happens to be on this machine; [...]"
>
> So I think the author has presented code that works on every C
> implementation (16-, 32- or 64-bit int).


I don't have the book, but I don't think that a 32- or 64-bit int is
meant to count as 2 bytes. My guess is that when it says 2 bytes it
means that the whole bit pattern which represents the number is stored
in 2 bytes.

Ben Bacarisse 03-18-2011 02:23 AM

Re: A portable code to create a 4-bytes Big Endian twos complement
 
pozz <pozzugno@gmail.com> writes:

> On 17/03/2011 21:15, Spiros Bousbouras wrote:
>>> So I think that the author has presented a code that works on every C
>>> implementation (16-, 32-, 64-bits sized int).

>>
>> I don't have the book but I don't think that 32 or 64 bits sized int is
>> meant to count as 2 bytes. My guess is that when it says 2 bytes it
>> means that the whole bit pattern which represents the number is stored
>> in 2 bytes.

>
> The size of an int variable in memory depends on the implementation (16,
> 32 or 64 bits, or maybe other values), but the value stored in the file
> is arbitrarily fixed at 2 bytes, little-endian.
> So the author proposes the code:
>   putc(i & 0xff, ofp);
>   putc((i >> 8) & 0xff, ofp);
> Indeed, i (an int variable) could be 16, 32 or 64 bits, but the 2
> bytes written to ofp will always be the same (of course, the value of
> i must be between -32767 and +32767).
>
> For example, suppose i contains the value 600. It could be represented
> in memory as:
> 0258 (16-bits big-endian)
> 00000258 (32-bits big-endian)
> 5802 (16-bits little-endian)
> 58020000 (32-bits little-endian)
> With the above two lines of code, I'll always write the same two bytes
> to the file ofp: 0x58 (the first) and 0x02 (the second).
>
> Now I want to write a function that returns the same value in an int
> variable, starting from the bytes 0x58 and 0x02 read from the file. The
> code should be portable to implementations with 16-, 32- and maybe
> 64-bit int.
> The value stored in the file could also be negative.
>
> The goal is to have code that writes and reads a *portable data file*,
> that is, a file that can be created by an application on one machine
> and read by another application on another machine.
> The same applies to a data packet sent by an application running on one
> machine and received by an application running on a different machine.


If you can, avoid using signed binary numbers as a portable
representation; if you can't, you can read the two bytes and pack
them into an unsigned int:

  int a = fgetc(fp);
  int b = fgetc(fp);
  unsigned int ux = (((unsigned)b) << 8) | a;

(Obviously, check for EOF and errors in real code.) Then you need to
convert this to an int. One portable way is like this:

  int x;
  if (ux >= 0x8000u) {
      x = 0xffffu - ux;
      x = -x - 1;
  }
  else x = ux; /* ux is in range for int, so the conversion is safe */
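
Put together with the EOF checks mentioned above, the two steps could be sketched roughly as follows. The name read_s16le and its return convention are made up for illustration; the sketch assumes the file holds 8-bit bytes in little-endian order and that the stored value is representable in int on the reading machine.

```c
#include <stdio.h>

/* Hypothetical helper: reads a 16-bit little-endian two's complement
   value from fp and stores it in *out.
   Returns 0 on success, -1 on EOF or read error. */
int read_s16le(FILE *fp, int *out)
{
    int a = fgetc(fp);                  /* low byte is stored first */
    if (a == EOF)
        return -1;
    int b = fgetc(fp);                  /* high byte follows */
    if (b == EOF)
        return -1;

    /* Pack the two bytes into an unsigned value in 0..0xffff. */
    unsigned int ux = (((unsigned)b & 0xffu) << 8) | ((unsigned)a & 0xffu);

    if (ux >= 0x8000u) {
        /* 0xffffu - ux lies in 0..0x7fff, so it always fits in an int. */
        int x = (int)(0xffffu - ux);
        *out = -x - 1;
    } else {
        *out = (int)ux;                 /* already in range for int */
    }
    return 0;
}
```

A caller would test the return value before using *out, for example `if (read_s16le(fp, &x) != 0) { /* handle short read */ }`.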

This came up here some time ago (July 2010) and Tim Rentsch came up
with:

  (int)(ux & 32767) - (int)(ux/2 & 16384) - (int)(ux/2 & 16384);

which has several advantages over my suggestion.
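
To see that the expression gives the same results as the branching version, here is a small sketch that compares the two over every 16-bit pattern. The function names are invented for this illustration. Note that every partial result of the subtraction stays between -32768 and 32767, so nothing out of range for int is ever produced on a two's complement implementation.

```c
/* The arithmetic expression quoted above, wrapped in a function.
   ux is expected to hold a 16-bit pattern (0..0xffff). */
int s16_from_bits_arith(unsigned int ux)
{
    return (int)(ux & 32767) - (int)(ux / 2 & 16384) - (int)(ux / 2 & 16384);
}

/* The branching conversion from the earlier snippet, for comparison. */
int s16_from_bits_branch(unsigned int ux)
{
    if (ux >= 0x8000u) {
        int x = (int)(0xffffu - ux);
        return -x - 1;
    }
    return (int)ux;
}

/* Returns 1 if both functions agree on all 65536 patterns, else 0. */
int s16_conversions_agree(void)
{
    unsigned long u;    /* unsigned long so the loop ends even if int is 16 bits */
    for (u = 0; u <= 0xfffful; u++) {
        if (s16_from_bits_arith((unsigned)u) != s16_from_bits_branch((unsigned)u))
            return 0;
    }
    return 1;
}
```

For the pattern 0xffff the expression works out as 32767 - 16384 - 16384, which is -1.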

--
Ben.

pozz 03-18-2011 09:38 AM

Re: A portable code to create a 4-bytes Big Endian twos complement
 
On 18 Mar, 03:23, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> If you can, avoid using signed binary numbers as a portable
> representation


Unfortunately I can't avoid it, because the data packet format
was not defined by me.


> but if you can't you can read the two bytes and pack
> them into an unsigned int:


So you agree with me that the two reading instructions can't work with
signed values on 32-bit machines.

I found the same code in Question 12.42 of the comp.lang.c FAQ ("How can I
write code to conform to these old, binary data file formats?"). There
the following struct is defined:
  struct mystruct {
    char c;
    long int i32;
    int i16;
  } s;
and the following code is used to read the 16-bit value:
  s.i16 = getc(fp) << 8;
  s.i16 |= getc(fp);

If we assume the values stored in the file are unsigned (0-65535), the
member i16 should have been defined as unsigned int (not signed int),
otherwise the value 40000 can't be received correctly on 16-bit
machines.

If we assume the values stored in the file are signed (-32767 to
+32767), the code to read them doesn't work for negative values on
implementations where int is wider than 16 bits (the value -1 is
written to the file as 0xFF 0xFF, but is read back as 65535 on 32-bit
platforms).

In both cases, the code doesn't work and should be fixed.
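
The 65535 result can be reproduced with a small sketch. It assumes int is wider than 16 bits (with a 16-bit int the shift itself would overflow), and both function names are hypothetical, not taken from the FAQ. The second function shows one possible sign-aware variant that does the arithmetic in long.

```c
/* Hypothetical helper mirroring the two FAQ-style statements:
   the high byte is shifted up and OR-ed with the low byte.
   Assumes int is wider than 16 bits. */
int naive_combine(int hi, int lo)
{
    return (hi << 8) | lo;
}

/* Same two bytes, but bit 15 is treated as the sign of a 16-bit
   two's complement value. long is at least 32 bits, so 0..65535
   and the subtraction below are always in range. */
int sign_aware_combine(int hi, int lo)
{
    long v = ((long)hi << 8) | lo;
    return (int)(v >= 32768L ? v - 65536L : v);
}
```

With a 32-bit int, naive_combine(0xff, 0xff) gives 65535, while sign_aware_combine(0xff, 0xff) gives -1.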


>   int a = fgetc(fp);
>   int b = fgetc(fp);
>   unsigned int ux = (((unsigned)b) << 8) | a;
> [...]
>   int x;
>   if (ux >= 0x8000u) {
>       x = 0xffffu - ux;
>       x = -x - 1;
>   }
>   else x = ux; /* conversion is in range possible */


OK, that could be a solution. Another solution would be (if I assume the
presence of int16_t):
  int i = (int16_t)(((unsigned)b) << 8) | a;


> This came up here some time ago (July 2010) and Tim Rentsch came up
> with:
>
>   (int)(ux & 32767) - (int)(ux/2 & 16384) - (int)(ux/2 & 16384);
>
> which has several advantages over my suggestion.


I found the _very long_ thread. I'll try to work through it on my own, but
it looks difficult given my limited knowledge of C.

Ben Bacarisse 03-18-2011 12:43 PM

Re: A portable code to create a 4-bytes Big Endian twos complement
 
pozz <pozzugno@gmail.com> writes:

> On 18 Mar, 03:23, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>> [...] you can read the two bytes and pack
>> them into an unsigned int:

>
> So you agree with me that the two reading instructions can't work with
> signed values on 32 bits machines.


I am not exactly sure what you think I am agreeing with. You showed
code that looked wrong but without context it's hard to know what the
code was supposed to do. For example, this:

> I found the same code in Question 12.42 of the comp.lang.c FAQ ("How can I
> write code to conform to these old, binary data file formats?"). There
> the following struct is defined:
>   struct mystruct {
>     char c;
>     long int i32;
>     int i16;
>   } s;
> and the following code is used to read the 16-bit value:
>   s.i16 = getc(fp) << 8;
>   s.i16 |= getc(fp);
>
> If we assume the values stored in the file are unsigned (0-65535), the
> member i16 should have been defined as unsigned int (not signed int),
> otherwise the value 40000 can't be received correctly on 16-bit
> machines.


is not as bad as you think since the context makes it clear that the
purpose is to read signed 16-bit ints. 40000 is not an option.

> If we assume the values stored in the file are signed (-32767 -
> +32767), the code to read them doesn't work for negative values on
> implementations with int size greater than 16 bits (the value -1 is
> written in the file as 0xFF 0xFF, but is read back as 65535 on 32-bits
> platforms).


That's not a good assumption. The context is clear: ints are 16 bits.
I agree that it could be made more explicit, and it should certainly
mention the reliance on an implementation-defined conversion, but the
code is not intended to work with all int sizes. (I think the FAQ dates
from before C99 so there is no possibility of a signal being raised.)

> In both cases, the code doesn't work and should be fixed.


Have you told the people concerned?

>>   int a = fgetc(fp);
>>   int b = fgetc(fp);
>>   unsigned int ux = (((unsigned)b) << 8) | a;
>> [...]
>>   int x;
>>   if (ux >= 0x8000u) {
>>       x = 0xffffu - ux;
>>       x = -x - 1;
>>   }
>>   else x = ux; /* conversion is in range possible */

>
> Ok, it could be a solution. Another solution would be (if I assume the
> presence of int16_t):
> int i = (int16_t)(((unsigned)b) << 8) | a;


No, that does not address the question -- the implementation-defined
conversion when an unsigned int value is out of range for int. You may
be happy with assuming that it works as you expect, but you seemed to
want a 100% portable solution.

>> This came up here some time ago (July 2010) and Tim Rentsch came up
>> with:
>>
>>   (int)(ux & 32767) - (int)(ux/2 & 16384) - (int)(ux/2 & 16384);
>>
>> which has several advantages over my suggestion.

>
> I found the _very long_ thread. I'll try to understand it alone, but
> it seems difficult for my small knowledge of C language.


If I recall, after solutions were posted a lot of the thread was about
readability and the ability of compilers to optimise the resulting code.

--
Ben.

pozz 03-18-2011 01:54 PM

Re: A portable code to create a 4-bytes Big Endian twos complement
 
On 18 Mar, 13:43, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> > I found the same code on Question 12.42 of comp.lang.c FAQ ("How can I
> > write code to conform to these old, binary data file formats?"). There
> > the following struct is defined:
> >   struct mystruct {
> >     char c;
> >     long int i32;
> >     int i16;
> >   } s;
> > and the following code is used to read the 16 bits value:
> >   s.i16 = getc(fp) << 8;
> >   s.i16 |= getc(fp);

>
> > If we assume the values stored in the file are unsigned (0-65535), the
> > member i16 should have been defined as unsigned int (not signed int),
> > otherwise the value 40000 can't be received correctly on 16-bit
> > machines.

>
> is not as bad as you think since the context makes it clear that the
> purpose is to read signed 16-bit ints. 40000 is not an option.


Ok, for me it wasn't so clear...


> > If we assume the values stored in the file are signed (-32767 -
> > +32767), the code to read them doesn't work for negative values on
> > implementations with int size greater than 16 bits (the value -1 is
> > written in the file as 0xFF 0xFF, but is read back as 65535 on 32-bits
> > platforms).

>
> That's not a good assumption. The context is clear: ints are 16 bits.


OK, in this case too I assumed that the code also worked on 32-bit
machines.

Anyway, you agree with me when I say that the code in the FAQ doesn't
work for 32-bit integers, don't you?


> > Ok, it could be a solution. Another solution would be (if I assume the
> > presence of int16_t):
> >   int i = (int16_t)(((unsigned)b) << 8) | a;

>
> No, that does not address the question -- the implementation-defined
> conversion when an unsigned int value is out of range for int. You may
> be happy with assuming that it works as you expect, but you seemed to
> want a 100% portable solution.


You are right. Now I understand that unsigned->int conversion is
bad :-)

Ben Bacarisse 03-18-2011 03:14 PM

Re: A portable code to create a 4-bytes Big Endian twos complement
 
pozz <pozzugno@gmail.com> writes:

> On 18 Mar, 13:43, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>> > I found the same code on Question 12.42 of comp.lang.c FAQ ("How can I
>> > write code to conform to these old, binary data file formats?"). There
>> > the following struct is defined:
>> >   struct mystruct {
>> >     char c;
>> >     long int i32;
>> >     int i16;
>> >   } s;
>> > and the following code is used to read the 16 bits value:
>> >   s.i16 = getc(fp) << 8;
>> >   s.i16 |= getc(fp);

<snip>
>> > If we assume the values stored in the file are signed (-32767 -
>> > +32767), the code to read them doesn't work for negative values on
>> > implementations with int size greater than 16 bits (the value -1 is
>> > written in the file as 0xFF 0xFF, but is read back as 65535 on 32-bits
>> > platforms).

>>
>> That's not a good assumption. The context is clear: ints are 16 bits.

>
> Ok, even in this case I made the assumption the code worked also for
> 32 bits machines.
>
> Anyway you agree with me when I say that the code in the FAQ doesn't
> work for 32 bits integers, don't you?


Yes, nor for 18-bit ints or 64-bit ones. The code has a purpose and it
does not do anything else. If your point is "I'd rather the FAQ had an
example of reading a 16-bit signed 2's complement integer into an int of
any permitted size" then I agree that might be more interesting, but I
don't think it makes the current code wrong.
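
For what it is worth, a round trip through the two-byte little-endian format that does not depend on the size of int might look like the sketch below. The names put_s16le and get_s16le are made up for illustration, the values are assumed to lie between -32767 and +32767, and the buffers stand in for the file or packet. The writer converts to unsigned before shifting because right-shifting a negative int is implementation-defined, whereas int-to-unsigned conversion is fully defined.

```c
/* Hypothetical writer: stores i as 16-bit little-endian two's complement.
   Converting to unsigned reduces i modulo UINT_MAX + 1, so the low
   16 bits are the two's complement pattern on any implementation. */
void put_s16le(unsigned char buf[2], int i)
{
    unsigned int u = (unsigned int)i;
    buf[0] = (unsigned char)(u & 0xffu);          /* low byte first */
    buf[1] = (unsigned char)((u >> 8) & 0xffu);   /* then high byte */
}

/* Hypothetical reader: rebuilds the int without ever converting an
   out-of-range unsigned value to int. */
int get_s16le(const unsigned char buf[2])
{
    unsigned int ux = (((unsigned int)buf[1] & 0xffu) << 8)
                    | ((unsigned int)buf[0] & 0xffu);
    if (ux >= 0x8000u) {
        int x = (int)(0xffffu - ux);              /* 0..0x7fff, fits in int */
        return -x - 1;
    }
    return (int)ux;
}
```

Writing 600 produces the bytes 0x58 0x02, and writing -1 produces 0xff 0xff; reading either pair back returns the original value.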

<snip>
--
Ben.


All times are GMT.