Structure size and binary format

 
 
gamehack
Guest
Posts: n/a
 
      12-31-2005
Hi all,

I've been wondering when I write a structure like:

struct {
int a;
unsigned int b;
float c;
} mystruct;

And then I'm using this as a record for a binary file. The problem is
that the size of the types is different on different
platforms(win/lin/osx) so if a file was copied on another platform and
attempted to be read then the first say 16 bytes could be regarded as
the integer a but it could have been created on system where integer
was 32 bytes. Is there a portable solution to this? Moreover, I've been
looking for some resource on designing your own binary format and I
couldn't find anything apart from short tutorials how to read binary
files. Are there any good resources?

Thanks a lot

 
 
 
 
 
Mark McIntyre
Guest
Posts: n/a
 
      12-31-2005
On 30 Dec 2005 16:05:03 -0800, in comp.lang.c , "gamehack"
<(E-Mail Removed)> wrote:

>Hi all,
>
>I've been wondering when I write a structure like:
>
>struct {
>int a;
>unsigned int b;
>float c;
>} mystruct;
>
>And then I'm using this as a record for a binary file. The problem is
>that the size of the types is different on different
>platforms(win/lin/osx) so if a file was copied on another platform and
>attempted to be read then the first say 16 bytes could be regarded as
>the integer a but it could have been created on system where integer
>was 32 bytes. Is there a portable solution to this?


The simplest is to store the data as text, not binary data. Other
methods might involve using fixed-width data types (if your platforms
support them), or writing custom load/save functions for each platform
which still store in binary but do it element by element and take into
account the differing sizes of types on each platform.
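As a rough sketch of the fixed-width, element-by-element idea: the `struct record` type, `RECORD_BYTES`, and the `record_*` helper names below are invented for illustration, and the code assumes a C99 `<stdint.h>` and a 4-byte `float`. It removes the size and padding differences only; the bytes are still in the host's byte order, so it is not a complete answer on its own.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record built from fixed-width types (C99 <stdint.h>). */
struct record {
    int32_t  a;
    uint32_t b;
    float    c;   /* assumed to be 4 bytes; checked at run time below */
};

enum { RECORD_BYTES = 12 };   /* 4 + 4 + 4: the on-disk size, no padding */

/* Copy the fields one at a time so struct padding never reaches the file.
   Byte order is still the host's. Returns 0 on success, -1 on failure. */
int record_pack(const struct record *r, unsigned char buf[RECORD_BYTES])
{
    if (sizeof r->c != 4)
        return -1;
    memcpy(buf,     &r->a, 4);
    memcpy(buf + 4, &r->b, 4);
    memcpy(buf + 8, &r->c, 4);
    return 0;
}

int record_unpack(struct record *r, const unsigned char buf[RECORD_BYTES])
{
    if (sizeof r->c != 4)
        return -1;
    memcpy(&r->a, buf,     4);
    memcpy(&r->b, buf + 4, 4);
    memcpy(&r->c, buf + 8, 4);
    return 0;
}

/* Write or read exactly one record; the stream should be opened in
   binary mode ("wb" / "rb"). Both return 0 on success. */
int record_save(const struct record *r, FILE *fp)
{
    unsigned char buf[RECORD_BYTES];
    if (record_pack(r, buf) != 0)
        return -1;
    return fwrite(buf, 1, RECORD_BYTES, fp) == RECORD_BYTES ? 0 : -1;
}

int record_load(struct record *r, FILE *fp)
{
    unsigned char buf[RECORD_BYTES];
    if (fread(buf, 1, RECORD_BYTES, fp) != RECORD_BYTES)
        return -1;
    return record_unpack(r, buf);
}
```

Every record then occupies exactly 12 bytes in the file, whatever `sizeof(struct record)` happens to be on a given compiler.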


Mark McIntyre
 
 
 
 
 
Chuck F.
Guest
Posts: n/a
 
      12-31-2005
gamehack wrote:
>
> I've been wondering when I write a structure like:
>
> struct {
> int a;
> unsigned int b;
> float c;
> } mystruct;
>
> And then I'm using this as a record for a binary file. The
> problem is that the size of the types is different on different
> platforms(win/lin/osx) so if a file was copied on another
> platform and attempted to be read then the first say 16 bytes
> could be regarded as the integer a but it could have been
> created on system where integer was 32 bytes.


Good. You recognize the existence of a problem. The answer is
"Don't do that". Binary representations are, in general, not
portable. You can convert things into a sequence of bytes and
write/read those to a file, but that means you also have to write
the conversion mechanisms. Now such things as byte sex can bite you.

Far and away the most portable transportation mechanism is pure
text. You already have conversion routines in the standard
library, and all you need to do is use them. Anybody and their dog
can read the files.
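A minimal sketch of the text approach, using only `snprintf` and `sscanf` from the standard library. The one-line "a b c" layout and the `record_to_text` / `record_from_text` names are assumptions made up for this example. `%.9g` is used because nine significant digits are enough to round-trip an IEEE single-precision value; integer values are still limited to whatever range the reading platform's `int` can hold.

```c
#include <stdio.h>

struct record {
    int          a;
    unsigned int b;
    float        c;
};

/* Format one record as a single line of text: "a b c\n".
   Returns the number of characters written, or -1 if out is too small. */
int record_to_text(const struct record *r, char *out, size_t outsz)
{
    int n = snprintf(out, outsz, "%d %u %.9g\n", r->a, r->b, (double)r->c);
    return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}

/* Parse a line produced by record_to_text().
   Returns 0 on success, -1 if the line does not hold three fields. */
int record_from_text(struct record *r, const char *line)
{
    return sscanf(line, "%d %u %f", &r->a, &r->b, &r->c) == 3 ? 0 : -1;
}
```

In a real program the line would go to a file with `fputs` and come back with `fgets`; the conversions stay the same.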

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
 
 
Malcolm
Guest
Posts: n/a
 
      12-31-2005

"gamehack" <(E-Mail Removed)> wrote
>
> I've been wondering when I write a structure like:
>
> struct {
> int a;
> unsigned int b;
> float c;
> } mystruct;
>
> And then I'm using this as a record for a binary file. The problem is
> that the size of the types is different on different
> platforms(win/lin/osx) so if a file was copied on another platform and
> attempted to be read then the first say 16 bytes could be regarded as
> the integer a but it could have been created on system where integer
> was 32 bytes. Is there a portable solution to this? Moreover, I've been
> looking for some resource on designing your own binary format and I
> couldn't find anything apart from short tutorials how to read binary
> files. Are there any good resources?
>

Integers are easy. Just use the AND and OR operators, together with the
bitshifts ( >> <<) to break up an integer into 8-bit chunks, and store it,
big-endian, in a file.

It is necessary to use the big-endian format because otherwise those
little-endians might take over the world, and force us all to store our
bytes at the little end, and we don't want that happening.
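The shift-and-mask approach for integers might look like the sketch below. The function names are invented here, and 32-bit fields are an assumption; the same pattern extends to 16 or 64 bits. Because the code works on values rather than on memory, it gives the same bytes on any host, whatever its native byte order.

```c
#include <stdint.h>

/* Store a 32-bit unsigned value as four bytes, most significant first. */
void put_u32_be(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >>  8) & 0xFF);
    out[3] = (unsigned char)( v        & 0xFF);
}

/* Rebuild the value from four big-endian bytes. */
uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] <<  8) |  (uint32_t)in[3];
}

/* Signed values go through the unsigned type: conversion to uint32_t is
   well defined (modulo 2^32), which yields a two's complement file image. */
void put_i32_be(unsigned char out[4], int32_t v)
{
    put_u32_be(out, (uint32_t)v);
}

int32_t get_i32_be(const unsigned char in[4])
{
    uint32_t u = get_u32_be(in);
    if (u <= 0x7FFFFFFFu)
        return (int32_t)u;
    /* Map the upper half back to negatives without relying on an
       implementation-defined unsigned-to-signed conversion. */
    return (int32_t)(u - 0x80000000u) + INT32_MIN;
}
```

The four bytes can then be written with `fwrite` or `putc` to a stream opened in binary mode.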

The float is a bit more tricky. Floating-point numbers have their own
internal format. The good news is that virtually all are 32-bit IEEE format
(sign, exponent, mantissa). You can probably get away with a binary dump,
making sure of the endianness. However, to be really portable, you do need to
break the number up into its constituents, and then rebuild it, using the
frexp() and ldexp() functions.
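One way to do the break-up and rebuild with frexp() and ldexp() is sketched below. The `float_parts` struct and the two function names are made up for the example. It assumes a binary `float` with at most 24 significant bits (true of IEEE single precision) and handles finite values only; infinities and NaNs would need their own encoding.

```c
#include <math.h>
#include <stdint.h>

/* Portable pieces of a finite float:
   value = (-1)^sign * mantissa * 2^(exponent - 24). */
struct float_parts {
    uint8_t  sign;      /* 1 if the value is negative (including -0.0) */
    int16_t  exponent;  /* binary exponent as reported by frexp() */
    uint32_t mantissa;  /* fraction scaled up to a 24-bit integer */
};

/* Split a finite float into integer pieces. */
struct float_parts float_split(float f)
{
    struct float_parts p;
    int e;
    double m = frexp(fabs((double)f), &e);  /* m is 0 or in [0.5, 1) */

    p.sign     = signbit(f) ? 1 : 0;
    p.exponent = (int16_t)e;
    p.mantissa = (uint32_t)ldexp(m, 24);    /* exact for <= 24-bit fractions */
    return p;
}

/* Rebuild the float from its pieces. */
float float_join(struct float_parts p)
{
    double v = ldexp((double)p.mantissa, p.exponent - 24);
    return (float)(p.sign ? -v : v);
}
```

The three integer fields can then be written to the file byte by byte in a fixed order, the same way as any other integer.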


 
 
Eric Sosman
Guest
Posts: n/a
 
      12-31-2005
Chuck F. wrote:

> gamehack wrote:
>
>>
>> I've been wondering when I write a structure like:
>>
>> struct {
>> int a;
>> unsigned int b;
>> float c;
>> } mystruct;
>>
>> And then I'm using this as a record for a binary file. The
>> problem is that the size of the types is different on different
>> platforms(win/lin/osx) so if a file was copied on another
>> platform and attempted to be read then the first say 16 bytes
>> could be regarded as the integer a but it could have been
>> created on system where integer was 32 bytes.

>
>
> Good. You recognize the existence of a problem. The answer is "Don't
> do that". Binary representations are, in general, not portable. You
> can convert things into a sequence of bytes and write/read those to a
> file, but that means you also have to write the conversion mechanisms.
> Now such things as byte sex can bite you.
>
> Far and away the most portable transportation mechanism is pure text.
> You already have conversion routines in the standard library, and all
> you need to do is use them. Anybody and their dog can read the files.
>

 
 
gamehack
Guest
Posts: n/a
 
      12-31-2005
Thanks a lot guys.

 
 
Eric Sosman
Guest
Posts: n/a
 
      12-31-2005

(Please excuse the vacuous reply that I fat-fingered
a moment ago.)

Chuck F. wrote:

> gamehack wrote:
>
>>
>> I've been wondering when I write a structure like:
>>
>> struct {
>> int a;
>> unsigned int b;
>> float c;
>> } mystruct;
>>
>> And then I'm using this as a record for a binary file. The
>> problem is that the size of the types is different on different
>> platforms(win/lin/osx) so if a file was copied on another
>> platform and attempted to be read then the first say 16 bytes
>> could be regarded as the integer a but it could have been
>> created on system where integer was 32 bytes.

>
>
> Good. You recognize the existence of a problem. The answer is "Don't
> do that". Binary representations are, in general, not portable. You
> can convert things into a sequence of bytes and write/read those to a
> file, but that means you also have to write the conversion mechanisms.
> Now such things as byte sex can bite you.


"Don't do that" needs a little qualification, I think.
If "that" means "just read and write the struct in whatever
form the compiler happens to choose," the advice is sound.
But the claim that binary representations are not portable
(I'm not sure what "in general" means here) doesn't hold up.
Who has not transported a ZIP or GIF or JPEG file between
dissimilar systems? At a lower level, who has not exchanged
IP packets with other systems? Portability is a matter of
agreed-upon standards, not of the underlying representations
chosen.

> Far and away the most portable transportation mechanism is pure text.
> You already have conversion routines in the standard library, and all
> you need to do is use them. Anybody and their dog can read the files.


Text has a few pitfalls of its own. Even without appealing
to the multitude of character encoding schemes, some difficulties
are apparent. For example, it is no simple matter to devise a
portable text representation for arbitrary `double' values. A
value encoded as text, sent to another machine and decoded, then
re-encoded and sent back again may not decode to the same value
that was originally transmitted. It requires as much care to
make this work for text as for binary representations. (And I've
got the war stories from a PPOE to prove it, too ...)
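A small illustration of the round-trip problem: the hypothetical helper below prints a `double` with a chosen number of significant digits, parses the text back, and reports whether the original value survived. The comments about 6 and 17 digits assume IEEE 754 doubles and correctly rounded library conversions on both ends, which is exactly the kind of agreement that cannot be taken for granted between machines.

```c
#include <stdio.h>
#include <stdlib.h>

/* Print x with `digits` significant digits, parse the text back, and
   return 1 if the result equals x exactly, 0 otherwise. */
int survives_text(double x, int digits)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%.*g", digits, x);
    return strtod(buf, NULL) == x;
}

/* With IEEE 754 doubles, survives_text(1.0 / 3.0, 6) is 0 because six
   digits lose information, while 17 significant digits are enough for
   any finite double to come back unchanged. */
```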

--
Eric Sosman
(E-Mail Removed)lid

 
 
Keith Thompson
Guest
Posts: n/a
 
      12-31-2005
Eric Sosman <(E-Mail Removed)> writes:
[...]
> Text has a few pitfalls of its own. Even without appealing
> to the multitude of character encoding schemes, some difficulties
> are apparent. For example, it is no simple matter to devise a
> portable text representation for arbitrary `double' values. A
> value encoded as text, sent to another machine and decoded, then
> re-encoded and sent back again may not decode to the same value
> that was originally transmitted. It requires as much care to
> make this work for text as for binary representations. (And I've
> got the war stories from a PPOE to prove it, too ...)


A hexadecimal floating-point representation (supported in C99,
implementable in C90) should avoid at least some of the problems.
With enough digits, you can have an exact textual representation of a
floating-point value.
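In C99 the hexadecimal form is available through the `%a` conversion of printf and is accepted by strtod. A minimal sketch, with helper names invented for the example, assuming a binary floating-point implementation (FLT_RADIX a power of 2) so that `%a` without a precision prints an exact representation:

```c
#include <stdio.h>
#include <stdlib.h>

/* Write x into out in hexadecimal floating-point notation.
   Returns 0 on success, -1 if out is too small. */
int double_to_hex_text(double x, char *out, size_t outsz)
{
    int n = snprintf(out, outsz, "%a", x);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}

/* Parse hexadecimal (or decimal) floating-point text. */
double double_from_hex_text(const char *s)
{
    return strtod(s, NULL);   /* C99 strtod accepts the 0x...p... form */
}
```

The exact spelling of the output may differ between libraries, but any conforming C99 strtod should read it back to the same value.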

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
 
 
gamehack
Guest
Posts: n/a
 
      12-31-2005
Thank you. That's why I wondered how to design a format, like .zip, .jpg,
etc. Do you basically say that each 33 bytes would be one pixel, and
the value of red would be the first 11 bytes, green the next 11 bytes, and
then the last 11 bytes are going to be blue? And probably some fixed-size
headers at the end of the file (or probably using some sequence of bytes to
mark the end of fields in the header). The problem is that I haven't seen
_any_ good resources about designing file formats. Any pointers?

Regards,
gamehack

 
 
Eric Sosman
Guest
Posts: n/a
 
      12-31-2005
gamehack wrote:
> Thank you. That's why I wondered how to design a format, like .zip, .jpg,
> etc. Do you basically say that each 33 bytes would be one pixel, and
> the value of red would be the first 11 bytes, green the next 11 bytes, and
> then the last 11 bytes are going to be blue? And probably some fixed-size
> headers at the end of the file (or probably using some sequence of bytes to
> mark the end of fields in the header). The problem is that I haven't seen
> _any_ good resources about designing file formats. Any pointers?


<OT>

Visit http://www.wotsit.org/ to find descriptions of
many file formats. Some are binary, some are textual. Some
are designed for portability, some are not. In any event, a
review of what's already been done should give you some ideas.
Perhaps you'll even find an existing format that meets your
needs; if so, adopting it might make available whole suites of
helpful tools for dealing with it.

</OT>
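To make the idea of a fixed layout concrete, here is a toy header for an image file. Everything about it is hypothetical and invented for this example: the "GHIM" magic bytes, the version number, and the field sizes. Many real formats follow a similar pattern: a magic number to identify the file, a version so the layout can change later, and every multi-byte field written in one stated byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout:
     bytes 0-3  magic "GHIM"
     byte  4    format version (currently 1)
     bytes 5-6  width,  big-endian
     bytes 7-8  height, big-endian
   followed by width * height pixels of 3 bytes each: red, green, blue. */
enum { HEADER_BYTES = 9 };

struct image_header {
    uint16_t width;
    uint16_t height;
};

/* Fill out[] with the on-disk form of the header. */
void header_encode(unsigned char out[HEADER_BYTES],
                   const struct image_header *h)
{
    memcpy(out, "GHIM", 4);
    out[4] = 1;
    out[5] = (unsigned char)(h->width  >> 8);
    out[6] = (unsigned char)(h->width  & 0xFF);
    out[7] = (unsigned char)(h->height >> 8);
    out[8] = (unsigned char)(h->height & 0xFF);
}

/* Returns 0 on success, -1 if the magic or version is not recognised. */
int header_decode(struct image_header *h,
                  const unsigned char in[HEADER_BYTES])
{
    if (memcmp(in, "GHIM", 4) != 0 || in[4] != 1)
        return -1;
    h->width  = (uint16_t)((in[5] << 8) | in[6]);
    h->height = (uint16_t)((in[7] << 8) | in[8]);
    return 0;
}
```

A reader that checks the magic and version first can reject a foreign or newer file cleanly instead of misreading it.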

--
Eric Sosman
(E-Mail Removed)lid
 