Storing data of different byte sizes

 
 
magnus.moraberg@gmail.com
 
      05-23-2009
Hi,

I wish to read a wave file header which uses different amounts of
bytes to store different pieces of information. For example -

two bytes for the number of channels
four bytes for the length of the raw data.

But since the sizes of the basic types in C++ are system dependent, I'm
unsure how to store these. This is currently how I read the data length -

waveFile.seekg(40);
waveFile.read(reinterpret_cast<char*>(&dataLength) , 4);

where dataLength is an unsigned int. This is all well and good while
an int is four bytes, but how would you guys do this?

Am I right in saying that a char is always one byte?

Also, the actual data samples can themselves have different byte
sizes. So let's say I store 10 samples in memory, each 4 bytes in size.
I would use this code to point to a particular sample -

const int byteSize = 4;
char* sampleBufferPtr = populate(/**/);
char* samplePtr = sampleBufferPtr + byteSize * sampleIndex;

but how would I convert the sample to a float?

Thanks for your help,

Barry.
 
 
 
 
 
Ron AF Greve
 
      05-23-2009
Hi,

This is how I read the 'SubChunkSize':

UInt32 SubChunk2Size; // == NumSamples * NumChannels * BitsPerSample/8
                      // (is the rest of this file)

I just read sizeof(UInt32) bytes with Stream.read() (UInt32 is my own type,
which is always a 4-byte unsigned int).

I just read the whole header as one struct though (including that last
field) and only handle one type of Wav file.

Input.read( reinterpret_cast<char*>( &WavHeader.WavHeader ), sizeof( WavHeader.WavHeader ) );

Off the top of my head, I think the samples are integers (8 or 16 bit,
mono or stereo), and I think most other libraries also use ints (like
OpenAL, speex etc.).
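
For the integer-sample case, converting to float can be sketched roughly as below. This assumes 16-bit signed little-endian PCM (the usual layout for 16-bit WAV data); the function name pcm16_to_float is made up for illustration, and the scaling to [-1.0, 1.0) is one common convention, not the only one.

```cpp
#include <cstdint>

// Convert one 16-bit little-endian signed PCM sample, given as two raw
// bytes, to a float in the range [-1.0, 1.0).
// Assumption: the sample is stored in two's complement, low byte first.
inline float pcm16_to_float(const unsigned char* p)
{
    // Build the 16-bit pattern byte by byte, independent of host byte order.
    const std::uint32_t raw = static_cast<std::uint32_t>(p[0])
                            | (static_cast<std::uint32_t>(p[1]) << 8);

    // Interpret the pattern as two's complement without relying on an
    // implementation-defined narrowing cast.
    const std::int32_t value = (raw & 0x8000u)
        ? static_cast<std::int32_t>(raw) - 0x10000
        : static_cast<std::int32_t>(raw);

    return static_cast<float>(value) / 32768.0f;
}
```

Given a char buffer of samples, the bytes of sample i would start at offset 2 * i (times the channel count for interleaved stereo).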

Regards, Ron AF Greve

http://informationsuperhighway.eu



 
 
 
 
 
Ian Collins
 
      05-23-2009
(E-Mail Removed) wrote:
> Hi,
>
> I wish to read a wave file header which uses different amounts of
> bytes to store different pieces of information. For example -
>
> two bytes for the number of channels
> four bytes for the length of the raw data.
>
> But since the sizes of the basic types in C++ are system dependent, I'm
> unsure how to store these. This is currently how I read the data length -
>
> waveFile.seekg(40);
> waveFile.read(reinterpret_cast<char*>(&dataLength) , 4);
>
> where dataLength is an unsigned int. This is all well and good while
> an int is four bytes, but how would you guys do this?


The only truly portable solution is to read the data in bytes and build
up whatever bigger types are required, depending on size and byte order.

If the system byte order matches the file byte order, you can use the
widely available C fixed width types ([u]intN_t).
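
A minimal sketch of the byte-by-byte approach, assuming little-endian fields (which is what RIFF/WAV files use) and a compiler that provides <cstdint>. The helper names read_le16 and read_le32 are made up for illustration.

```cpp
#include <cstdint>

// Assemble a 16-bit value from two bytes stored low byte first.
inline std::uint16_t read_le16(const unsigned char* p)
{
    return static_cast<std::uint16_t>(
        static_cast<unsigned>(p[0]) | (static_cast<unsigned>(p[1]) << 8));
}

// Assemble a 32-bit value from four bytes stored low byte first.
inline std::uint32_t read_le32(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Because the result is built with shifts rather than by overlaying memory, the same code gives the same value on big-endian and little-endian hosts.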

> Am I right in saying that a char is always one byte?


Yes.

> Also, the actual data samples can themselves have different byte
> sizes. So let's say I store 10 samples in memory, each 4 bytes in size.
> I would use this code to point to a particular sample -
>
> const int byteSize = 4;
> char* sampleBufferPtr = populate(/**/);
> char* samplePtr = sampleBufferPtr + byteSize * sampleIndex;
>
> but how would I convert the sample to a float?


You would have to know the floating point format used. If the file
format matches your host, you can get away with a cast:

float* fp = reinterpret_cast<float*>(sampleBufferPtr);

--
Ian Collins
 
 
James Kanze
 
      05-24-2009
On May 24, 1:45 am, Ian Collins <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote:


> > I wish to read a wave file header which uses different
> > amounts of bytes to store different pieces of
> > information. For example -


> > two bytes for the number of channels
> > four bytes for the length of the raw data.


> > But since the sizes of the basic types in C++ are system
> > dependent, I'm unsure how to store these. This is
> > currently how I read the data length -

> > waveFile.seekg(40);
> > waveFile.read(reinterpret_cast<char*>(&dataLength) , 4);


> > where dataLength is an unsigned int. This is all well
> > and good while an int is four bytes,


It doesn't necessarily work even when int is four bytes.

> > but how would you guys do this?


> The only truly portable solution is to read the data in
> bytes and build up whatever bigger types are required,
> depending on size and byte order.


> If the system byte order matches the file byte order, you
> can use the widely available C fixed width types
> ([u]intN_t).


Only if the file format uses 2's complement (usually the
case).

> > Am I right in saying that a char is always one byte?


> Yes.


Yes, but. A byte isn't necessarily 8 bits.
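
Code that does depend on 8-bit bytes can at least say so at compile time. A minimal sketch, assuming a C++11 or later compiler for static_assert:

```cpp
#include <climits>

// sizeof(char) is 1 by definition, but the number of bits in that byte is
// CHAR_BIT, which the standard only requires to be at least 8.
static_assert(sizeof(char) == 1, "always true: char is the unit of sizeof");
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
```

On a platform with wider bytes the build then stops with a clear message rather than misreading the file.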

> > Also, the actual data samples can themselves have
> > different byte sizes. So let's say I store 10 samples in
> > memory, each 4 bytes in size. I would use this code to
> > point to a particular sample -


> > const int byteSize = 4;
> > char* sampleBufferPtr = populate(/**/);
> > char* samplePtr = sampleBufferPtr + byteSize * sampleIndex;


> > but how would I convert the sample to a float?


> You would have to know the floating point format used. If
> the file format matches your host, you can get away with a
> cast:


> float* fp = reinterpret_cast<float*>(sampleBufferPtr);


Maybe. Not with g++. It's definitely undefined behavior,
and although IMHO, the intent of the standard was more or
less for this to work, there are various reasons that it
doesn't always. (I'm supposing here that sampleBufferPtr
has type uint32_t*, and points to a valid uint32_t. If it
is just a pointer into your buffer, the code will core dump
on most processors, because of alignment considerations, and
there's not much the compiler can do about it.)

If the floating point format is the same in the file and in
your machine, you can use memcpy to copy a uint32_t into a
float. Otherwise, you've got to extract the fields, and use
functions like ldexp to create the actual value. (The code
to do so is actually fairly simple---until you add all of
the necessary error handling.)
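
Both routes can be sketched as follows. This is an illustration under stated assumptions, not a complete implementation: the file is assumed to store IEEE 754 single-precision values, the 32-bit pattern has already been assembled in the right byte order, the function names are invented, and the hand decoder deliberately leaves out infinities and NaNs (exponent field 255), which is part of the error handling mentioned above.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Route 1: the host float is also IEEE 754 single precision.
// memcpy copies the object representation without aliasing or alignment problems.
inline float float_from_bits_memcpy(std::uint32_t bits)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this route needs a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Route 2: extract sign, exponent and mantissa, then rebuild the value
// with ldexp. Works whatever the host's floating point format is.
// Not handled: exponent field 255 (infinity and NaN).
inline float float_from_bits_ldexp(std::uint32_t bits)
{
    const bool negative = (bits >> 31) != 0;
    const int exponent = static_cast<int>((bits >> 23) & 0xFF);
    std::uint32_t mantissa = bits & 0x7FFFFFu;

    double value;
    if (exponent == 0) {
        // Zero or subnormal: no implicit leading 1, fixed scale of 2^-149.
        value = std::ldexp(static_cast<double>(mantissa), -149);
    } else {
        // Normal number: restore the implicit leading 1 (bit 23), then
        // scale by 2^(exponent - 127 - 23).
        mantissa |= 0x800000u;
        value = std::ldexp(static_cast<double>(mantissa), exponent - 150);
    }
    return static_cast<float>(negative ? -value : value);
}
```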

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

 
 
James Kanze
 
      05-24-2009
On May 24, 1:33 am, "Ron AF Greve" <me@localhost> wrote:

> This is how I read the 'SubChunkSize':


> UInt32 SubChunk2Size; // == NumSamples * NumChannels * BitsPerSample/8
>                       // (is the rest of this file)


> I just read in with Stream.read() the size of the UInt32
> (my own type which is always 4 bytes unsigned int).


> I just read the whole header as one struct though
> (including that last field) and only handle one type of
> Wav file.


Which, if it works, is only by sheer luck. It's not
guaranteed, and it doesn't work most of the time.
(Depending on the file format, it will fail on a Sparc, or
on an Intel machine---for most network formats, it will fail
on the Intel.)

 
 
Martin Eisenberg
 
      05-24-2009
(E-Mail Removed) wrote:

> I wish to read a wave file header which uses different amounts
> of bytes to store different pieces of information.


If possible, use libsndfile. If not, you can still study it.
http://www.mega-nerd.com/libsndfile/


Martin

--
Quidquid latine scriptum est, altum videtur.
 
 
Ron AF Greve
 
      05-24-2009
Hi,


Well, actually the only problem I can imagine is when I move to a 128 byte
system (then I would probably have to read field by field). However, given
the structure of the header, reading it as one struct was just a bit of a
timesaver compared with writing everything out.

All types are my own fixed-size types, and it certainly does work on Intel,
since that is where I use it. Since I read the file and can listen to the
contents using OpenAL, I think I can safely assume it works.

But you are right that on systems with a different endianness some byte
swapping needs to be done, and for systems larger than 64 bits, resetting
the file pointer might be necessary.

Regards, Ron AF Greve

http://informationsuperhighway.eu


 
 
James Kanze
 
      05-25-2009
On May 24, 1:46 pm, "Ron AF Greve" <me@localhost> wrote:

> Well, actually the only problem I can imagine is when I move
> to a 128 byte system (then I would probably have to read field
> by field). However, given the structure of the header, reading
> it as one struct was just a bit of a timesaver compared with
> writing everything out.


> All types are my own fixed-size types, and it certainly does
> work on Intel, since that is where I use it. Since I read the
> file and can listen to the contents using OpenAL, I think I can
> safely assume it works.


> But you are right that on systems with a different endianness
> some byte swapping needs to be done, and for systems larger
> than 64 bits, resetting the file pointer might be necessary.


And on systems with different integral representations you'll
need other adjustments, and with compilers which use different
padding, you'll need other adjustments, and on systems where
bytes aren't 8 bits, you'll need other adjustments.

FWIW: I've seen byte order of a 32 bit integer change from one
version of the compiler to the next, from 2301 to 0123. Padding
often changes according to compiler options. And most systems
(not Intel) have alignment restrictions, which means that if the
data in the buffer isn't aligned, you get a core dump.

Byte order is just the tip of the iceberg. In practice, you
need to define the format (or use an already defined format),
and implement the correct formatting.

 
 
 
 