cpptutor2000@yahoo.com
Guest
Posts: n/a

 03-30-2008
data on a PC. I am recording the sound in PCM format at 2000 Hz, 16
bit, little endian and signed. The resulting data is put in an array
of bytes. I am generating tones at 1000.00 Hz, with a tone generator.
However, when I convert the bytes to float values, I do not see the
periodic sinusoidal data, as expected, (sample output below)
18770.0
38724.0
16727.0
28006.0
16.0
1.0
2000.0
4000.0
2.0
24932.0
38688.0
0.0
0.0
0.0
0.0

I understand that with 16 bit resolution, I can get numbers in the
range -2^16 - 1 to 2^16 - 1.

I believe that I am not converting the data correctly. To achieve the
conversion, I am taking 4 bytes at a time, and converting them. That
is, first bytes 0 - 3, then bytes 4 - 7 and so on. Is this correct ?

Any hints, suggestions would be greatly appreciated. Thanks in advance

Mark Space

 03-30-2008
(E-Mail Removed) wrote:
> data on a PC. I am recording the sound in PCM format at 2000 Hz, 16
> bit, little endian and signed. The resulting data is put in an array

> I believe that I am not converting the data correctly. To achieve the
> conversion, I am taking 4 bytes at a time, and converting them. That
> is, first bytes 0 - 3, then bytes 4 - 7 and so on. Is this correct ?

4 bytes = 32 bits. Why are you taking your data four bytes at a time if
your data is two bytes?

Roedy Green

 03-30-2008
On Sat, 29 Mar 2008 17:37:56 -0700 (PDT), "(E-Mail Removed)"
<(E-Mail Removed)> wrote, quoted or indirectly quoted someone
who said :

>Any hints, suggestions would be greatly appreciated. Thanks in advance

see endian.html

If the data are not IEEE, you will have to find out the format and do
some fancy bit fiddling.
--

The Java Glossary
http://mindprod.com

Patricia Shanahan

 03-30-2008
Mark Space wrote:
> (E-Mail Removed) wrote:
>> data on a PC. I am recording the sound in PCM format at 2000 Hz, 16
>> bit, little endian and signed. The resulting data is put in an array

>
>> I believe that I am not converting the data correctly. To achieve the
>> conversion, I am taking 4 bytes at a time, and converting them. That
>> is, first bytes 0 - 3, then bytes 4 - 7 and so on. Is this correct ?

>
> 4 bytes = 32 bits. Why are you taking your data four bytes at a time if
> your data is two bytes?

Also make sure the conversion deals with the data being little-endian.

Rather than going straight to float, I suggest first turning the data
into shorts, and making sure that is working. It may be easier to check.

If the first wave of ideas does not solve the problem, try posting a
sample of the input data in hex.
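
One way to produce such a hex sample is sketched below. This is only an illustration: the class name HexDump and the method toHex are made up for this example and are not from the original poster's code.

```java
public class HexDump {
    // Formats up to 'count' bytes starting at 'offset' as two-digit hex values
    // separated by spaces.
    public static String toHex(byte[] data, int offset, int count) {
        StringBuilder sb = new StringBuilder();
        int end = Math.min(data.length, offset + count);
        for (int i = offset; i < end; i++) {
            if (i > offset) {
                sb.append(' ');
            }
            // Mask with 0xFF so negative bytes print as 80..ff, not ffffff80.
            sb.append(String.format("%02x", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] demo = { 0x34, 0x12, (byte) 0xFE, (byte) 0xFF };
        System.out.println(toHex(demo, 0, demo.length)); // prints "34 12 fe ff"
    }
}
```

Dumping the first 32 or 64 bytes of the recorded array this way makes it easy to check by hand what each 2-byte group should decode to.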

Patricia

Logan Shaw

 03-30-2008
(E-Mail Removed) wrote:
> data on a PC. I am recording the sound in PCM format at 2000 Hz, 16
> bit, little endian and signed. The resulting data is put in an array
> of bytes. I am generating tones at 1000.00 Hz, with a tone generator.

You *must* set your tone generator to a lower frequency! At
that frequency, even if your software is perfect, you're still
going to see garbage data!

Nyquist's sampling theorem says that when you sample at 2000 Hz,
the highest possible frequency you can represent at *all* (without
completely mangling it) is 1000 Hz. There must be at least two
samples per wavelength.

And that 1000 Hz is in an ideal world. A real-world A-to-D
converter has a low-pass filter that will filter out everything
above the Nyquist frequency (in this case 1000 Hz), and the slope
of that filter is usually sharp, but it is not infinite. That
means in practice the highest frequency that the A-to-D converter
will even see is something less than 1000 Hz.

I would try setting your frequency generator to something like
100 Hz, or set your sampling rate higher.
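
To see why exactly two samples per wavelength is not enough, here is a small sketch that samples an ideal, unit-amplitude sine in software. It assumes a perfect mathematical tone and ignores any filtering in real hardware; the class and method names are invented for this illustration.

```java
public class NyquistDemo {
    // Samples sin(2*pi*toneHz*t + phase) at t = n / sampleRate for n = 0..count-1.
    public static double[] sample(double toneHz, double sampleRate, double phase, int count) {
        double[] out = new double[count];
        for (int n = 0; n < count; n++) {
            out[n] = Math.sin(2.0 * Math.PI * toneHz * n / sampleRate + phase);
        }
        return out;
    }

    public static void main(String[] args) {
        // 1000 Hz tone at 2000 samples/s: the argument advances by pi per sample,
        // so every sample has the same magnitude |sin(phase)|. With phase 0 that
        // is numerically almost zero, so the tone is invisible.
        for (double v : sample(1000.0, 2000.0, 0.0, 6)) {
            System.out.printf("%.3f ", v);
        }
        System.out.println();

        // 100 Hz tone at the same rate: 20 samples per cycle, so the sine shape
        // is clearly traced out.
        for (double v : sample(100.0, 2000.0, 0.0, 6)) {
            System.out.printf("%.3f ", v);
        }
        System.out.println();
    }
}
```

At exactly half the sampling rate the samples only alternate in sign with a magnitude set by the phase, so the amplitude of the original tone cannot be recovered from them.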

> However, when I convert the bytes to float values, I do not see the
> periodic sinusoidal data, as expected, (sample output below)
> 18770.0
> 38724.0
> 16727.0
> 28006.0
> 16.0
> 1.0
> 2000.0
> 4000.0
> 2.0
> 24932.0
> 38688.0
> 0.0
> 0.0
> 0.0
> 0.0
>
> I understand that with 16 bit resolution, I can get numbers in the
> range -2^16 - 1 to 2^16 - 1.

No, that would be a total of 2^17 + 1 distinct values. With a
16-bit number, you can only have 2^16 distinct values.

The usual format for signed numbers is two's complement. In
that format, the values range from -2^15 to 2^15-1, which is
another way of saying from -32768 to +32767.
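
A quick check of those limits in Java, as a sketch (the class name ShortRange is made up for this example). It also shows that a value such as 38724 from the sample output above cannot be a signed 16-bit sample, because it does not fit in a short.

```java
public class ShortRange {
    // Number of distinct values a signed 16-bit short can hold.
    public static int distinctValues() {
        return Short.MAX_VALUE - Short.MIN_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(Short.MIN_VALUE);   // prints -32768, i.e. -2^15
        System.out.println(Short.MAX_VALUE);   // prints 32767, i.e. 2^15 - 1
        System.out.println(distinctValues());  // prints 65536, i.e. 2^16

        // 38724 is larger than 32767, so narrowing it to a short wraps around.
        System.out.println((short) 38724);     // prints -26812
    }
}
```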

> I believe that I am not converting the data correctly. To achieve the
> conversion, I am taking 4 bytes at a time, and converting them. That
> is, first bytes 0 - 3, then bytes 4 - 7 and so on. Is this correct ?

Well, you haven't said whether the data in your input file is
monophonic, stereophonic, or something else. If it's stereo,
you're going to have pairs of samples. Since each sample is
16 bits, which is 2 bytes, each pair of samples will be 4 bytes.
But I would avoid that at the early stages and try to start with
an input file that is monophonic in order to keep things simple.

Assuming you have a monophonic input file, you need to read
only 2 bytes per sample.
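
If the data did turn out to be stereo, the already-decoded shorts would have to be split into two channels. A minimal sketch follows, assuming the samples are interleaved as left, right, left, right (an assumption about the input, not something stated in the original post); the names are hypothetical.

```java
import java.util.Arrays;

public class Deinterleave {
    // Splits interleaved stereo samples (assumed order: left, right, left, right, ...)
    // into result[0] = left channel and result[1] = right channel.
    public static short[][] split(short[] interleaved) {
        if (interleaved.length % 2 != 0) {
            throw new IllegalArgumentException("Expected an even number of samples");
        }
        int frames = interleaved.length / 2;
        short[][] channels = new short[2][frames];
        for (int i = 0; i < frames; i++) {
            channels[0][i] = interleaved[2 * i];
            channels[1][i] = interleaved[2 * i + 1];
        }
        return channels;
    }

    public static void main(String[] args) {
        short[][] ch = split(new short[] { 1, -1, 2, -2, 3, -3 });
        System.out.println(Arrays.toString(ch[0])); // prints [1, 2, 3]
        System.out.println(Arrays.toString(ch[1])); // prints [-1, -2, -3]
    }
}
```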

> Any hints, suggestions would be greatly appreciated. Thanks in advance

Let's assume you have read some bytes of the input file into
some array. Converting that into samples is going to look
something like this:

byte[] rawBytes = getBlockOfSamples();

if (rawBytes.length % 2 != 0) {
    throw new IllegalArgumentException("Can't handle samples spanning blocks");
}

short[] samples = new short[rawBytes.length / 2];
int inputOffset = 0;
int outputOffset = 0;

while (inputOffset < rawBytes.length) {
    // read in both bytes of the current sample;
    // put them in 16-bit types since they'll
    // be converted to that size soon anyway.
    short lowOrder = rawBytes[inputOffset];
    short highOrder = rawBytes[inputOffset+1];
    inputOffset += 2;

    // the low-order byte is meant to be
    // unsigned since the sign bit is in the
    // high-order byte. But the java type
    // wraps around after 127, so some of
    // our positive numbers will have gotten
    // converted to negatives. so fix that.
    // since we have already converted to short,
    // we can already handle the larger range.
    if (lowOrder < 0) {
        lowOrder += 256;
    }

    // shift the high-order byte into position
    // and combine them. the result of | is an
    // int, so cast it back down to short.
    samples[outputOffset] = (short) (lowOrder | (highOrder << 8));
    outputOffset++;
}

There is probably some tricky way to avoid that conditional I
used to correct for the negative values, but let's forget about
performance for now.
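
One common way to avoid the conditional is to mask the low-order byte with 0xFF, which strips the sign extension in a single step. The following is a sketch under the same assumptions as above (signed 16-bit little-endian mono data); the class and method names are made up for this example.

```java
import java.util.Arrays;

public class LittleEndianDecode {
    // Decodes signed 16-bit little-endian PCM bytes into shorts.
    public static short[] decode(byte[] rawBytes) {
        if (rawBytes.length % 2 != 0) {
            throw new IllegalArgumentException("Odd number of bytes");
        }
        short[] samples = new short[rawBytes.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int low = rawBytes[2 * i] & 0xFF;  // 0..255, sign extension masked off
            int high = rawBytes[2 * i + 1];    // keeps its sign: it carries the sign bit
            samples[i] = (short) ((high << 8) | low);
        }
        return samples;
    }

    public static void main(String[] args) {
        byte[] raw = { 0x34, 0x12, (byte) 0xFE, (byte) 0xFF, 0x00, (byte) 0x80 };
        System.out.println(Arrays.toString(decode(raw))); // prints [4660, -2, -32768]
    }
}
```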

- Logan

Lew

 03-30-2008
(E-Mail Removed) wrote:
>> data on a PC. I am recording the sound in PCM format at 2000 Hz, 16
>> bit, little endian and signed. The resulting data is put in an array
>> of bytes. I am generating tones at 1000.00 Hz, with a tone generator.

Logan Shaw wrote:
> You *must* set your tone generator to a lower frequency! At
> that frequency, even if your software is perfect, you're still
> going to see garbage data!
>
> Nyquist's sampling theorem says that when you sample at 2000 Hz,
> the highest possible frequency you can represent at *all* (without
> completely mangling it) is 1000 Hz. There must be at least two
> samples per wavelength.

Doesn't that apply to analog sampling? Digital sampling reduces the accuracy
of the reproduction still further, doesn't it?

I have always wondered if the Nyquist frequency really applied to digital
sampling. Every time I've looked it up the formulas use real numbers, not
floating-point approximations. Do you have insight on this?

> I would try setting your frequency generator to something like
> 100 Hz, or set your sampling rate higher.

> The usual format for signed numbers is two's complement. In
> that format, the values range from -2^15 to 2^15-1, which is
> another way of saying from -32768 to +32767.

This is Java, where this is the only format for signed numbers. However, Java
does not have a signed 16-bit integral type.

Endianness should be much easier to handle with the built-in facilities of
java.nio.ByteBuffer.
<http://java.sun.com/javase/6/docs/api/java/nio/ByteBuffer.html>
>> Primitive values are translated to (or from) sequences of bytes according to
>> the buffer's current byte order, which may be retrieved and modified via the order methods.
>> Specific byte orders are represented by instances of the ByteOrder class.

saving you all that looping through
> short lowOrder = rawBytes[inputOffset];
> short highOrder = rawBytes[inputOffset+1];

etc.
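
A minimal sketch of the ByteBuffer approach, assuming the whole recording is already in a byte array of signed 16-bit little-endian samples (the class and method names here are invented for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class ByteBufferDecode {
    // Views the byte array as little-endian 16-bit values and copies them out.
    // A trailing odd byte, if any, is ignored.
    public static short[] decode(byte[] rawBytes) {
        short[] samples = new short[rawBytes.length / 2];
        ByteBuffer.wrap(rawBytes)
                  .order(ByteOrder.LITTLE_ENDIAN)
                  .asShortBuffer()
                  .get(samples);
        return samples;
    }

    public static void main(String[] args) {
        byte[] raw = { 0x34, 0x12, (byte) 0xFE, (byte) 0xFF };
        System.out.println(Arrays.toString(decode(raw))); // prints [4660, -2]
    }
}
```

A freshly wrapped ByteBuffer defaults to big-endian, so the order(ByteOrder.LITTLE_ENDIAN) call is the part that matters for this data.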

--
Lew

Lew

 03-30-2008
Lew wrote:
> This is Java, where this is the only format for signed numbers.
> However, Java does not have a signed 16-bit integral type.

Other than short.

--
Lew

Patricia Shanahan

 03-30-2008
Lew wrote:
> Lew wrote:
>> This is Java, where this is the only format for signed numbers.
>> However, Java does not have a signed 16-bit integral type.

>
> Other than short.
>

Given the data so far, suggesting signed integers in little-endian
format, I would indeed try reading the data as shorts from an
LEDataStream - http://mindprod.com/jgloss/ledatastream.html.

Patricia

Mark Space

 03-30-2008
Logan Shaw wrote:

> Nyquist's sampling theorem says that when you sample at 2000 Hz,
> the highest possible frequency you can represent at *all* (without
> completely mangling it) is 1000 Hz. There must be at least two

> I would try setting your frequency generator to something like
> 100 Hz, or set your sampling rate higher.

Good point, I completely missed that in the OP's post. It never occurred
to me that he was actually using an external tone generator. I thought
he was talking about something in software.

Considering that humans can hear up to 15kHz to 20kHz or so, should he
be using at least 150k samples per second? That's the general rule I
remember -- 10x oversample or risk distortion.

cpptutor2000@yahoo.com

 03-30-2008
Thank you very much for your very helpful hints and insight into the
problem. Initially, I had set the sampling frequency at 8000 Hz, with
PCM at 16 bits, signed, little-endian, channel mono. However, with
this sampling frequency, I started getting Java OutOfMemoryError.
So, I shifted to 2000 Hz. Also, I am using a software tone generator.
