# float algorithm is slow

Lawrence Kirby
Guest
Posts: n/a

 07-06-2005
On Tue, 05 Jul 2005 21:15:21 +0000, Anonymous 7843 wrote:

> In article <(E-Mail Removed) .com>,
> Wenfei <(E-Mail Removed)> wrote:
>>
>>
>> float percentage;
>>
>> for (j = 0; j < 10000000; j++)
>> {
>> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );

You may gain a little by precalculating the loop invariant
frequency * 2 * 3.14159 / sampleFreq outside the loop.

>> buffer[totalBytes] =ceilf(volume * percentage) + volume;
>> totalBytes++;
>> }
>>
>> Because the float variable, the above loop take 2 seconds in c or c++
>> on Linux machine. Does anybody has a solution to reduce the time?

>
> * Use a small table of sine values instead of sinf() function.
> It appears that you're downsampling to the width of a char
> anyway, so table of somewhat more than 256 sine values probably
> won't make things much worse.

One of the problems is that the type of buffer is not specified.

> * Or just map out one complete waveform and just repeatedly copy that
> throughout the rest of the buffer.
>
> * Or Just make one copy of the waveform and index into that as
> appropriate when you need the results.

These only work if the cycle length corresponds to an integral number of
positions in buffer. There's nothing to suggest that this is the case.

Lawrence

Anonymous 7843
Guest
Posts: n/a

 07-06-2005
In article <(E-Mail Removed)> ,
Lawrence Kirby <(E-Mail Removed)> wrote:
>
> On Tue, 05 Jul 2005 21:15:21 +0000, Anonymous 7843 wrote:
>
> > In article <(E-Mail Removed) .com>,
> > Wenfei <(E-Mail Removed)> wrote:
> >>
> >> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );

>
> > * Or Just make one copy of the waveform and index into that as
> > appropriate when you need the results.

>
> These only work if the cycle length corresponds to an integral number of
> positions in buffer. There's nothing to suggest that this is the case.

Given the choice to use float, the rounding of pi at the 6th digit, and
the (likely) stuffing of results into a char, it didn't seem like
fidelity was foremost.

One hopes that this project was just the first exercise in a DSP course
and that our intrepid correspondent Wenfei soon outlearns us.
--
7842++

Wenfei
Guest
Posts: n/a

 07-08-2005
I am converting Notes format to PCM format to play sounds on a cell
phone. It runs Linux and has no FPU, so it is slow.

But in the end I used a sinf lookup table, and the sound now plays
instantly.

Thanks, guys, for all your ideas.

Wenfei

websnarf@gmail.com
Guest
Posts: n/a

 07-10-2005
Wenfei wrote:
> float percentage;
> for (j = 0; j < 10000000; j++) {
> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
> buffer[totalBytes] =ceilf(volume * percentage) + volume;
> totalBytes++;
> }

sinf() appears to be a Microsoft VC++ extension that computes sin() but
uses only 32-bit floats. Interesting that the zealots in this
newsgroup didn't call you on that.

Ok, anyhow remembering our trigonometry:

sin(a + b) = sin(a)*cos(b) + sin(b)cos(a)
cos(a + b) = cos(a)*cos(b) - sin(b)sin(a)

Now we can strength reduce the argument to sinf:

arg += (frequency*2*3.14159/sampleFreq);

But that value is a constant:

arg += deltaAngle; /* deltaAngle is precomputed as:
frequency*2*3.14159/sampleFreq */

So we can simplify this to:

s1 = percentage*cos(deltaAngle) + c*sin(deltaAngle);
c = c*cos(deltaAngle) + percentage*sin(deltaAngle);
percentage = s1;

And of course we can replace the cos(deltaAngle) and sin(deltaAngle)
with some precomputed values, and we initialize c to 1, and percentage
to 0.

This removes all the trigonometric functions altogether.

The problem is that it will have "accumulating accuracy" problems.
These problems are not trivial, because you will lose the sin^2+cos^2=1
property, which will start scaling your results (either upward or
downward) globally, which could get bad.

A simple way to mitigate this is to split the loop into groups of, say,
100 or 1000 at a time, and reset the parameters with their true
mathematical value with the raw sin/cos functions:

percentage = sinf (frequency * j * 2 * 3.14159 / sampleFreq);
c = cosf (frequency * j * 2 * 3.14159 / sampleFreq);

> Because the float variable, the above loop take 2 seconds in c or c++
> on Linux machine. Does anybody has a solution to reduce the time?

This takes time because you are calling sinf(), not because it's
floating point.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

CBFalconer
Guest
Posts: n/a

 07-10-2005
(E-Mail Removed) wrote:
> Wenfei wrote:
>
>> float percentage;
>> for (j = 0; j < 10000000; j++) {
>> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
>> buffer[totalBytes] =ceilf(volume * percentage) + volume;
>> totalBytes++;
>> }

>
> sinf() appears to be a Microsoft VC++ extension that computes
> sin() but uses only 32-bit floats. Interesting that the zealots
> in this newsgroup didn't call you on that.

The usage is proper. The foolish comment is not. From N869:

7.12.4.6 The sin functions

Synopsis

[#1]
#include <math.h>
double sin(double x);
float sinf(float x);
long double sinl(long double x);

Description

[#2] The sin functions compute the sine of x (measured in
radians).

Returns

[#3] The sin functions return the sine value.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the

Netocrat
Guest
Posts: n/a

 07-10-2005
On Sat, 09 Jul 2005 17:35:27 -0700, websnarf wrote:

> Wenfei wrote:
>> float percentage;
>> for (j = 0; j < 10000000; j++) {
>> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
>> buffer[totalBytes] =ceilf(volume * percentage) + volume;
>> totalBytes++;
>> }

> sinf() appears to be a Microsoft VC++ extension that computes sin() but
> uses only 32-bit floats. Interesting that the zealots in this newsgroup
> didn't call you on that.
>
> Ok, anyhow remembering our trigonometry:
>
> sin(a + b) = sin(a)*cos(b) + sin(b)cos(a)
> cos(a + b) = cos(a)*cos(b) - sin(b)sin(a)
>
> Now we can strength reduce the argument to sinf:
>
> arg += (freqency*2*3.14159/sampleFreq);
>
> But that value is a constant:
>
> arg += deltaAngle; /* deltaAngle is precomputed as:
> freqency*2*3.14159/sampleFreq */
>
> So we can simplify this to:
>
> s1 = percentage*cos(deltaAngle) + c*sin(deltaAngle);
> c = c*cos(deltaAngle) + percentage*sin(deltaAngle);

You meant to subtract there, not add, I believe.

> percentage = s1;
>
> And of course we can replace the cos(deltaAngle) and sin(deltaAngle) with
> some precomputed values, and we initialize c to 1, and percentage to 0.
>
> This removes all the trigonometric functions altogether.

Good suggestion. That would seem to be the major contributor to the
calculation time, although you'd have to compare the approaches to be
sure. It's theoretically possible that on a specially optimised machine
sin(j*deltaAngle) could be faster than your approach, but I very much
doubt that this is the case on the OP's cell phone...

> The problem is that it will have "accumulating accuracy" problems. These
> problems are not trivial, because you will lose the sin^2+cos^2=1
> property, which will start scaling your results (either upward or
> downward) globally, which could get bad.
>
> A simple way to mitigate this is to split the loop into groups of, say,
> 100 or 1000 at a time, and reset the parameters with their true
> mathematical value with the raw sin/cos functions:
>
> percentage = sinf (frequency * j * 2 * 3.14159 / sampleFreq);
> c = cosf (frequency * j * 2 * 3.14159 / sampleFreq);

Yes and you could tune the loop grouping number beforehand by looking at
how many cycles it takes before the values grow too inaccurate for

<snip>

Christian Bau
Guest
Posts: n/a

 07-10-2005
In article <(E-Mail Removed). com>,
(E-Mail Removed) wrote:

> Wenfei wrote:
> > float percentage;
> > for (j = 0; j < 10000000; j++) {
> > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
> > buffer[totalBytes] =ceilf(volume * percentage) + volume;
> > totalBytes++;
> > }

>
> sinf() appears to be a Microsoft VC++ extension that computes sin() but
> uses only 32-bit floats. Interesting that the zealots in this
> newsgroup didn't call you on that.

That's no surprise because it is part of C99.

P.J. Plauger
Guest
Posts: n/a

 07-10-2005
"Christian Bau" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...

> In article <(E-Mail Removed). com>,
> (E-Mail Removed) wrote:
>
>> Wenfei wrote:
>> > float percentage;
>> > for (j = 0; j < 10000000; j++) {
>> > percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
>> > buffer[totalBytes] =ceilf(volume * percentage) + volume;
>> > totalBytes++;
>> > }

>>
>> sinf() appears to be a Microsoft VC++ extension that computes sin() but
>> uses only 32-bit floats. Interesting that the zealots in this
>> newsgroup didn't call you on that.

>
> That's no surprise because it is part of C99.

It's even defined, but optional, in C89.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Tydr Schnubbis
Guest
Posts: n/a

 07-12-2005
E. Robert Tisdale wrote:
> Wenfei wrote:
>
>> float percentage;
>>
>> for (j = 0; j < 10000000; j++) {
>> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
>> buffer[totalBytes] =ceilf(volume * percentage) + volume;
>> totalBytes++;
>> }
>>
>> Because the float variable, the above loop take 2 seconds in c or c++
>> on Linux machine. Does anybody has a solution to reduce the time?

>
> > cat main.c

> #include <stdlib.h>
> #include <math.h>
>
> int main(int argc, char* argv[]) {
>
> const
> size_t n = 10000000;
> float buffer[n];
> size_t totalBytes = 0;
> const
> float_t frequency = 1.0;
> const
> float_t sampleFreq = 1.0;
> const
> float_t pi = 3.14159265358979323846;
> const
> float_t volume = 1.0;
>
> for (size_t j = 0; j < n; ++j) {
> float_t percentage = sinf(frequency*j*2*pi/sampleFreq);
> buffer[totalBytes] =ceilf(volume*percentage) + volume;
> totalBytes++;
> }
>
> return 0;
> }
>
> > gcc -Wall -std=c99 -pedantic -O2 -o main main.c -lm
> > time ./main

> 3.694u 0.258s 0:03.92 100.5% 0+0k 0+0io 0pf+0w

If you are posting a version of someone's example that looks different
but really does the exact same thing in the exact same way, it's best to
say so. Don't make people read your code and compare it with the original
just to find out that you neither changed the logic nor optimized it at
all. But sure, your version is better style.

-Tydr