Velocity Reviews > C++ > Writing efficient normal distribution function (for learning purposes).

# Writing efficient normal distribution function (for learning purposes).

Luca Cerone
Guest
Posts: n/a

 02-17-2012
Dear all, I'm a beginner in C++ so I'm sorry if my question might seem silly.
I'm also new to the group so I just say hello to all the people in it.

I've written a function that computes the normal distribution, but when I compare the time to create 1000 vectors
of 1000 points each, I get the same performance as the analogous Matlab built-in function.
I'd like to improve my code for speed.
I'm doing this mainly to learn how to write efficient C++ code, so please don't just tell me
that libraries X and Y do this faster than my code; help me understand why and how to improve
my code.

I need both a single value and a vector version of this function,
so this is how I have defined them.

#include <cmath> // sqrt, exp, pow; note M_PI is not standard C++ and may need _USE_MATH_DEFINES on MSVC

// this is the single point version of the function:
double normdist(const double& px, const double& mu, const double& csigma) {
  // evaluating the formula taken from http://en.wikipedia.org/wiki/Normal_distribution
  double val = sqrt(2*M_PI)*csigma;
  val = 1/val;
  val = val*exp(-pow(px-mu,2)/(2*pow(csigma,2)));
  return val;
}

//this is the vector version. It takes a vector of doubles as input and gives the vector of corresponding values as output:

#include <vector>
using std::vector;

vector<double> normdist(const vector<double>& vx, const double& mu, const double& csigma) {
  vector<double> gv(vx.size()); // allocates the space for the vector gv
  for (vector<double>::size_type i = 0; i < vx.size(); ++i) { // size_type avoids a signed/unsigned mismatch
    gv[i] = normdist(vx[i], mu, csigma);
  }
  return gv;
}

As I said, I'm a beginner, so I guess there is plenty of room for improving the speed and the quality of the code
(just writing to you I've realized that the initialization of gv might not be optimal, since I create values I don't need).

Thank you very much to all of you!
Luca

Asger Joergensen
Guest
Posts: n/a

 02-17-2012
Hi Luca

If you can live with fixed precision, your code will run much faster if you use
ints or __int64s and then multiply/divide as much as you need to get the
precision you need.

Best regards
Asger-P

lucacerone
Guest
Posts: n/a

 02-17-2012
Let me check whether I've understood: by fixed precision you mean numbers
with a fixed number of digits? If so, yes, I can live with that as long as
I have 8 exact decimal digits.

> If you can live with fixed precision, your code will run much faster if you use
> ints or __int64s and then multiply/divide as much as you need to get the
> precision you need.

I'm sorry, but I don't understand. How would you suggest using ints?
What type is __int64?

Thanks a lot for your help in any case,
Luca

Asger Joergensen
Guest
Posts: n/a

 02-17-2012
Hi lucacerone

lucacerone wrote:

> Let me check whether I've understood: by fixed precision you mean numbers
> with a fixed number of digits? If so, yes, I can live with that as long as
> I have 8 exact decimal digits.

int64_t val = DoubleVal * 100000000;

Do the rest of the work with the int, then divide by the same number
when you need the result.

> > If you can live with fixed precision, your code will run much faster if you use
> > ints or __int64s and then multiply/divide as much as you need to get the
> > precision you need.

>
> I'm sorry, but I don't understand. How would you suggest using ints?
> What type is __int64?

It's a 64-bit integer. My Borland compiler calls it __int64, but I think int64_t is the
cross-compiler way to write it.

I'm not into math, so I don't know whether you can use this method for your project.

I do a lot of gradient graphics, and when I calculate color steps I use this
method; it is a lot faster than using floating-point values like float and
double.

Best regards
Asger-P

Juha Nieminen
Guest
Posts: n/a

 02-17-2012
Asger Joergensen <(E-Mail Removed)> wrote:
> If you can live with fixed precision, your code will run much faster if you use
> ints or __int64s and then multiply/divide as much as you need to get the
> precision you need.

This isn't the early 90's. Processors have got a bit faster at
calculating with floating point values in the meantime.

Juha Nieminen
Guest
Posts: n/a

 02-17-2012
Luca Cerone <(E-Mail Removed)> wrote:
> I'd like to improve my code for speed.

First of all, are you sure you are compiling with optimizations / in
release mode? It's a surprisingly common mistake not to do that.

> double val = sqrt(2*M_PI)*csigma;

A compiler might or might not be able to calculate that "sqrt(2*M_PI)"
at compile time (compilers oftentimes do not perform floating-point
calculations at compile time for a myriad of reasons; of course it
depends on the compiler and its version). If you want to shave a few
clock cycles off, it might be a safer bet to write the value directly.

> val = val*exp(-pow(px-mu,2)/(2*pow(csigma,2)));

Likewise, a compiler might or might not be able to optimize a
"pow(x,2)" into "x*x". It may be a safer bet to write the multiplication
explicitly. (std::pow is significantly slower than multiplication.)

> vector<double> normdist(const vector<double>& vx,const double& mu,const double& csigma){
> vector<double> gv(vx.size()) ; //allocates the space for the vector gv.

How many times is this function called in your program? Are you
measuring its speed by calling this function thousands/millions of
times and seeing how long it takes overall?

If yes, then there's your major problem. Memory allocation is
(unfortunately) a slow operation in C++ (not really C++'s fault,
but that's another story). You might want to pass the return value
as a reference parameter to the function and have it write there
instead of allocating a new vector each time. Then you can reuse the
same vector object for each call, avoiding the memory allocation
(except for the very first time it's called, of course).

Victor Bazarov
Guest
Posts: n/a

 02-17-2012
On 2/17/2012 10:41 AM, Luca Cerone wrote:
> Dear all, I'm a beginner in C++ so I'm sorry if my question might seem silly.
> I'm also new to the group so I just say hello to all the people in it.
>
> I've written a function that computes the normal distribution, but when I compare the time to create 1000 vectors
> of 1000 points each, I get the same performance as the analogous Matlab built-in function.
> I'd like to improve my code for speed.
> I'm doing this mainly to learn how to write efficient C++ code, so please just don't tell me
> that library X and Y do this faster than my code, but help me understanding why and how to improve
> my code.
>
> I need both a single value and a vector version of this function,
> so this is how I have defined them.
>
> // this is the single point version of the function:
> double normdist(const double& px,const double& mu,const double& csigma){
> // evaluating the formula taken from http://en.wikipedia.org/wiki/Normal_distribution
> double val = sqrt(2*M_PI)*csigma;
> val = 1/val;
> val = val*exp(-pow(px-mu,2)/(2*pow(csigma,2)));
> return val ;

Some compilers are actually good at parsing expressions and converting
them into efficient machine code. You could try simply writing

return exp( -pow(px-mu,2)/2/pow(csigma,2) ) / sqrt(2*M_PI) / csigma;

And now you can see that there are two subexpressions that don't have to
be calculated on every call. pow(csigma,2) could be calculated once for
the entire loop and passed into 'normdist' as a const double&. The same
goes for sqrt(2*M_PI), which is just a constant you could define at
file scope and use here.

> }
>
> //this is the vector version. It takes as input a vector of double and gives the vector of corresponding values as output:
>
> vector<double> normdist(const vector<double>& vx,const double& mu,const double& csigma){
> vector<double> gv(vx.size()) ; //allocates the space for the vector gv.
> for (int i=0 ; i<vx.size() ; i++){

While it might not matter in your particular case, get into the habit of
not calling any function in your 'for' condition if you know its result is
not going to change. I have yet to see a compiler that knows to eliminate
it. Try this:

for (int i=0, s=vx.size(); i<s; ++i)

> gv[i]=normdist(vx[i],mu,csigma);
> }

Most modern compilers are good with indexing, but operator[] is a function
call for the vector, so you might want to try to eliminate it as well by
using pointers:

double* pgv = &gv[0];
const double* pvx = &vx[0];

for (...
{
*pgv++ = normdist(*pvx++, mu, csigma);
}

> return gv ;
> }
>
> As I told I'm a beginner so I guess there is plenty of room for improving the speed and the quality of the code
> (just writing to you I have realized that the initialization of gv might not be optimal, since I create values I don't need).

Good luck!

V
--

Asger Joergensen
Guest
Posts: n/a

 02-17-2012
Hi Juha

Juha Nieminen wrote:

> Asger Joergensen <(E-Mail Removed)> wrote:
> > If you can live with fixed precision, your code will run much faster if you use
> > ints or __int64s and then multiply/divide as much as you need to get the
> > precision you need.

>
> This isn't the early 90's. Processors have got a bit faster at
> calculating with floating point values in the meantime.

Not nearly as fast as they can do it with ints, though, in my experience.

Best regards
Asger-P

Joshua Maurice
Guest
Posts: n/a

 02-17-2012
On Feb 17, 7:41 am, Luca Cerone <(E-Mail Removed)> wrote:
> Dear all, I'm a beginner in C++ so I'm sorry if my question might seem silly.
> I'm also new to the group so I just say hello to all the people in it.
>
> I've written a function that computes the Normal Distribution, but when comparing the time to create 1000 vectors
> each with 1000 points I've the same performance than using the analogous Matlab built-in function.
> I'd like to improve my code for speed.
> I'm doing this mainly to learn how to write efficient C++ code, so please just don't tell me
> that library X and Y do this faster than my code, but help me understanding why and how to improve
> my code.
>
> I need both a single value and a vector version of this function,
> so this is how I have defined them.
>
> // this is the single point version of the function:
> double normdist(const double& px,const double& mu,const double& csigma){
> // evaluating the formula taken from http://en.wikipedia.org/wiki/Normal_distribution
> double val = sqrt(2*M_PI)*csigma;
> val = 1/val;
> val = val*exp(-pow(px-mu,2)/(2*pow(csigma,2)));
> return val ;
>
> }
>
> //this is the vector version. It takes as input a vector of double and gives the vector of corresponding values as output:
>
> vector<double> normdist(const vector<double>& vx,const double& mu,const double& csigma){
> vector<double> gv(vx.size()) ; //allocates the space for the vector gv.
> for (int i=0 ; i<vx.size() ; i++){
> gv[i]=normdist(vx[i],mu,csigma);
> }
> return gv ;
>
> }
>
> As I told I'm a beginner so I guess there is plenty of room for improving the speed and the quality of the code
> (just writing to you I have realized that the initialization of gv might not be optimal, since I create values I don't need).
>
> Thank you very much to all of you!
> Luca

Are you compiling with all of the proper optimization flags? Are you using
MSVC, and have you disabled checked iterators?
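For reference, typical release-style invocations look something like this; flags vary by toolchain and version, so treat these as illustrative rather than definitive:

```shell
# GCC/Clang: optimize and disable assert overhead
g++ -O2 -DNDEBUG normdist.cpp -o normdist

# MSVC (of this era): optimize; _SECURE_SCL=0 disables checked iterators
# cl /O2 /EHsc /D_SECURE_SCL=0 normdist.cpp
```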

Victor Bazarov
Guest
Posts: n/a

 02-18-2012
On 2/17/2012 6:01 PM, Joshua Maurice wrote:
> On Feb 17, 7:41 am, Luca Cerone<(E-Mail Removed)> wrote:
>> [..]
>> I've written a function that computes the Normal Distribution, but when comparing the time to create 1000 vectors
>> each with 1000 points I've the same performance than using the analogous Matlab built-in function.
>> [.. code using vector and indexing snipped ..]

>
> Are you compiling with all of the proper optimization flags? Are you using
> MSVC, and have you disabled checked iterators?

There seem to be no iterators in that code.

If the code's speed is the same as Matlab's, there is probably not much
hope of significantly improving it. I doubt that the Matlab people release
their library/application with sub-par performance.

V
--