# float algorithm is slow

Wenfei
Guest
Posts: n/a

 07-05-2005
float percentage;

for (j = 0; j < 10000000; j++)
{
    percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq);
    buffer[totalBytes] = ceilf(volume * percentage) + volume;
    totalBytes++;
}

Because of the float variable, the above loop takes 2 seconds in C or C++
on a Linux machine. Does anybody have a solution to reduce the time?

Thanks,

Wenfei

E. Robert Tisdale
Guest
Posts: n/a

 07-05-2005
Wenfei wrote:

> float percentage;
>
> for (j = 0; j < 10000000; j++) {
> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
> buffer[totalBytes] =ceilf(volume * percentage) + volume;
> totalBytes++;
> }
>
> Because the float variable, the above loop take 2 seconds in c or c++
> on Linux machine. Does anybody has a solution to reduce the time?

> cat main.c

#include <stdlib.h>
#include <math.h>

int main(int argc, char* argv[]) {

    const size_t n = 10000000;
    float buffer[n];
    size_t totalBytes = 0;
    const float_t frequency = 1.0;
    const float_t sampleFreq = 1.0;
    const float_t pi = 3.14159265358979323846;
    const float_t volume = 1.0;

    for (size_t j = 0; j < n; ++j) {
        float_t percentage = sinf(frequency*j*2*pi/sampleFreq);
        buffer[totalBytes] = ceilf(volume*percentage) + volume;
        totalBytes++;
    }

    return 0;
}

> gcc -Wall -std=c99 -pedantic -O2 -o main main.c -lm
> time ./main

3.694u 0.258s 0:03.92 100.5% 0+0k 0+0io 0pf+0w

Eric Sosman
Guest
Posts: n/a

 07-05-2005
Wenfei wrote:
> float percentage;
>
> for (j = 0; j < 10000000; j++)
> {
> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
> buffer[totalBytes] =ceilf(volume * percentage) + volume;
> totalBytes++;
> }
>
> Because the float variable, the above loop take 2 seconds in c or c++
> on Linux machine. Does anybody has a solution to reduce the time?

Yes: Change the iteration count from 10000000 to 0, and
the code will almost certainly run faster.

In other words, micro-benchmarks of this sort are not
very informative. What are you really trying to do?

--
Eric Sosman

Gordon Burditt
Guest
Posts: n/a

 07-05-2005
>float percentage;
>
>for (j = 0; j < 10000000; j++)
>{
> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
> buffer[totalBytes] =ceilf(volume * percentage) + volume;
> totalBytes++;
>}
>
>Because the float variable,

My guess is that if you GOT RID OF the float variable, it would

>for (j = 0; j < 10000000; j++)

{
buffer[totalBytes] =ceilf(volume * sinf(frequency * j * 2 * 3.14159 / sampleFreq )) + volume;
totalBytes++;
}

>the above loop take 2 seconds in c or c++
>on Linux machine. Does anybody has a solution to reduce the time?

You haven't demonstrated why taking two seconds is a problem yet.
Cut down the number of iterations? Get a faster machine?
Doing the calculation in double might make it faster (although on
Intel *86 it probably won't).

Gordon L. Burditt

Anonymous 7843
Guest
Posts: n/a

 07-05-2005
Wenfei wrote:
>
>
> float percentage;
>
> for (j = 0; j < 10000000; j++)
> {
> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
> buffer[totalBytes] =ceilf(volume * percentage) + volume;
> totalBytes++;
> }
>
> Because the float variable, the above loop take 2 seconds in c or c++
> on Linux machine. Does anybody has a solution to reduce the time?

* Use a small table of sine values instead of the sinf() function.
It appears that you're downsampling to the width of a char
anyway, so a table of somewhat more than 256 sine values probably
won't make things much worse.

* Or map out one complete waveform and repeatedly copy it
throughout the rest of the buffer.

* Or make just one copy of the waveform and index into it as
appropriate when you need the results.
--
7842++

Robert Gamble
Guest
Posts: n/a

 07-05-2005
Wenfei wrote:
> float percentage;
>
> for (j = 0; j < 10000000; j++)
> {
> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
> buffer[totalBytes] =ceilf(volume * percentage) + volume;
> totalBytes++;
> }
>
> Because the float variable, the above loop take 2 seconds in c or c++
> on Linux machine. Does anybody has a solution to reduce the time?

<OT>
It is unlikely that using double will be any faster.

On my system, 66% of the time is spent in the first statement inside
the loop, 27% on the second. The sinf and ceilf function calls are by
far the most intensive parts of these statements. If your system is
similar, you might try replacing the sinf call with a lookup table of
precalculated values, trading accuracy and memory usage for speed. You
could also use a macro or inline function in place of ceilf.
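
As a rough sketch of the inline-function idea (the name ceil_to_int is hypothetical, and this is not a drop-in replacement for ceilf): when the argument is known to fit in an int, the ceiling can be computed from the truncating conversion plus a comparison. It does not handle NaN, infinities, or values outside the range of int.

```c
/* Ceiling of x as an int, assuming x is finite and fits in an int. */
static inline int ceil_to_int(float x)
{
    int i = (int)x;              /* conversion truncates toward zero */
    return i + (x > (float)i);   /* add 1 only if x lies above the truncated value */
}
```

In the loop from the question, volume * percentage stays within [-volume, volume], so the range assumption holds for any sensible volume. Whether this is actually faster than ceilf depends on the compiler and the machine, so it needs measuring.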

You didn't specify how much faster you need it, what tradeoffs are
feasible, what you have tried already, or even what exactly you are
trying to accomplish. Perhaps the algorithm itself could be improved
but it is difficult to attempt to do so without knowing what the
specifications are.
</OT>

Robert Gamble

Erik de Castro Lopo
Guest
Posts: n/a

 07-05-2005
Wenfei wrote:
>
> float percentage;
>
> for (j = 0; j < 10000000; j++)
> {
> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
> buffer[totalBytes] =ceilf(volume * percentage) + volume;
> totalBytes++;
> }
>
> Because the float variable, the above loop take 2 seconds in c or c++
> on Linux machine. Does anybody has a solution to reduce the time?

I assume your Linux machine has an x86-based processor. If that's the
case then your problem is probably the ceilf() function. The implementation
of that function requires the FPU's control word to be modified at
the start of ceilf() and restored afterwards. Every time the FPU control
word is modified, it causes a stall in the FPU pipeline.

You might try replacing ceilf() with the C99 function lrintf().
Unfortunately lrintf() is a round function, not a ceil function so you
might need to replace

ceilf (x)

with

lrintf (x + 0.5).

Also have a look at this paper:

http://www.mega-nerd.com/FPcast/

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo (E-Mail Removed) (Yes it's valid)
+-----------------------------------------------------------+
"It has been discovered that C++ provides a remarkable facility
for concealing the trivial details of a program -- such as where
its bugs are." -- David Keppel

Anonymous 7843
Guest
Posts: n/a

 07-05-2005
Wenfei wrote:
>
> buffer[totalBytes] =ceilf(volume * percentage) + volume;

Sorry for posting twice to the same thread, but ceilf() may not be
strictly necessary. If buffer is an array of some kind of
integer type then the assignment is going to do a conversion from float
to int anyway. Unless you have some mathematically rigorous standard
to uphold, you might get a "close enough" result by adding 0.5 and letting
the conversion truncate, which rounds to nearest for non-negative values. E.g.

buffer[totalBytes] = (volume * percentage + 0.5) + volume;
--
7842++

Christian Bau
Guest
Posts: n/a

 07-05-2005
Wenfei wrote:

> float percentage;
>
> for (j = 0; j < 10000000; j++)
> {
> percentage = sinf(frequency * j * 2 * 3.14159 / sampleFreq );
> buffer[totalBytes] =ceilf(volume * percentage) + volume;
> totalBytes++;
> }
>
> Because the float variable, the above loop take 2 seconds in c or c++
> on Linux machine. Does anybody has a solution to reduce the time?

anyway.

When j has the maximum value 1e7 - 1, what is the difference between j *
2 * 3.14159 and j * 2 * pi?
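
The arithmetic behind that question can be worked out with a few lines of C. This is only a sketch; phase_error is a hypothetical helper, and it works in double so that the truncated constant is the only source of error being measured.

```c
/* Difference in radians between j * 2 * pi and j * 2 * approx_pi. */
static double phase_error(double j, double approx_pi)
{
    const double pi = 3.14159265358979323846;
    return j * 2.0 * (pi - approx_pi);
}
```

For j = 1e7 - 1 and approx_pi = 3.14159 this comes to roughly 53.07 radians, which is more than eight full periods of the sine. Before the scaling by frequency / sampleFreq, the argument near the end of the loop is therefore off by several whole cycles.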

Once you've fixed this, just a hint (mathematics is fun): Consecutive
terms of any function of the form

f (n) = a * sin (x * n + b) + c * cos (x * n + d)

can be calculated using one single multiplication and one single

Lawrence Kirby
Guest
Posts: n/a

 07-06-2005
On Tue, 05 Jul 2005 22:57:23 +0100, Christian Bau wrote:

....

> Once you've fixed this, just a hint (mathematics is fun): Consecutive
> terms of any function of the form
>
> f (n) = a * sin (x * n + b) + c * cos (x * n + d)
>
> can be calculated using one single multiplication and one single

But you need to be careful about accumulation of errors.

Lawrence