Velocity Reviews - Computer Hardware Reviews

slow complex<double>'s

 
 
Greg Buchholz
      03-05-2006
/*
While writing a C++ version of the Mandelbrot benchmark over at the
"The Great Computer Language Shootout"...


http://shootout.alioth.debian.org/gp...lbrot&lang=all

...I've come across the issue that complex<double>'s seem quite slow
unless compiled with -ffast-math. Of course doing that results in
incorrect answers because of rounding issues. The speed difference for
the program below is between 5x-8x depending on the version of g++. It
is also about 5 times slower than the corresponding gcc version at...

http://shootout.alioth.debian.org/gp...&lang=gcc&id=2

...I'd be interested in learning the reason for the speed difference.
Sure, the C version is slightly more optimized, but I was thinking that
the C++ code should only be 20-50% slower, not 750% slower like I get
with g++-4.1.0pre021006 (g++ 3.4.2 is a factor of 5 slower when
compiling with "-O3" vs. "-O3 -ffast-math"). Does it have something to
do with temporaries not being optimized away, or somesuch? A
limitation of the x87 instruction set? Is it inherent in the way the
C++ Standard requires complex<double>'s to be calculated? My bad coding
style? Limitations imposed by g++?

Curious,

Greg Buchholz
*/

// Takes an integer argument "n" on the command line and generates a
// PBM bitmap of the Mandelbrot set on stdout.
// see also: ( http://sleepingsquirrel.org/cpp/mandelbrot.cpp.html )

#include <iostream>
#include <complex>
#include <cstdlib>   // atoi

int main (int argc, char **argv)
{
  char bit_num = 0, byte_acc = 0;
  const int iter = 50;
  const double limit_sqr = 2.0 * 2.0;

  std::ios_base::sync_with_stdio(false);
  int n = atoi(argv[1]);

  std::cout << "P4\n" << n << " " << n << std::endl;

  for(int y=0; y<n; ++y)
    for(int x=0; x<n; ++x)
    {
      std::complex<double> Z(0.0,0.0);
      std::complex<double> C(2*(double)x/n - 1.5, 2*(double)y/n - 1.0);

      for (int i=0; i<iter and norm(Z) <= limit_sqr; ++i) Z = Z*Z + C;

      // one bit per pixel: 1 if the point has not escaped
      byte_acc = (byte_acc << 1) | ((norm(Z) > limit_sqr) ? 0x00:0x01);

      // flush every 8 bits, and pad the last byte of each row
      if(++bit_num == 8) { std::cout << byte_acc; bit_num = byte_acc = 0; }
      else if(x == n-1) { byte_acc <<= (8-n%8);
                          std::cout << byte_acc;
                          bit_num = byte_acc = 0; }
    }
}

 
Jerry Coffin
      03-05-2006
In article <1141588925.900212.137230@z34g2000cwc.googlegroups.com>,
Greg Buchholz says...

[ ... ]

> ...I'd be interested in learning the reason for the speed difference.
> Sure, the C version is slightly more optimized, but I was thinking that
> the C++ code should only be 20-50% slower, not 750% slower like I get
> with g++-4.1.0pre021006 (g++ 3.4.2 is a factor of 5 slower when
> compiling with "-O3" vs. "-O3 -ffast-math"). Does it have something to
> do with temporaries not being optimized away, or somesuch? A
> limitation of the x87 instruction set? Is it inherent in the way the
> C++ Standard requires complex<double>'s to be calculated? My bad coding
> style? Limitations imposed by g++?


[ ... ]

> for (int i=0; i<iter and norm(Z) <= limit_sqr; ++i) Z = Z*Z + C;


Hmm...try this:

for (int i=0; i<iter and norm(Z) <= limit_sqr; ++i) {
Z *= Z; Z += C; }

No guarantee, but I think it's worth a shot.

--
Later,
Jerry.

The universe is a figment of its own imagination.
 
Greg Buchholz
      03-05-2006

Jerry Coffin wrote:
> Hmm...try this:
>
> for (int i=0; i<iter and norm(Z) <= limit_sqr; ++i) {
> Z *= Z; Z += C; }
>
> No guarantee, but I think it's worth a shot.


Tried it. No speed improvement on gcc-3.4.2 or gcc-4.1.0pre021006.

Greg Buchholz

 
Fei Liu
      03-06-2006

Greg Buchholz wrote:
> Jerry Coffin wrote:
> > Hmm...try this:
> >
> > for (int i=0; i<iter and norm(Z) <= limit_sqr; ++i) {
> > Z *= Z; Z += C; }
> >
> > No guarantee, but I think it's worth a shot.

>
> Tried it. No speed improvement on gcc-3.4.2 or gcc-4.1.0pre021006.
>
> Greg Buchholz


Profile your code and find out what's causing the slowdown.

 
peter koch
      03-06-2006

Greg Buchholz wrote:
> /*
> While writing a C++ version of the Mandelbrot benchmark over at the
> "The Great Computer Language Shootout"...
>
>
> http://shootout.alioth.debian.org/gp...lbrot&lang=all
>
> ...I've come across the issue that complex<double>'s seem quite slow
> unless compiled with -ffast-math. Of course doing that results in
> incorrect answers because of rounding issues. The speed difference for
> the program below is between 5x-8x depending on the version of g++. It
> is also about 5 times slower than the corresponding gcc version at...
>
> http://shootout.alioth.debian.org/gp...&lang=gcc&id=2
>
> ...I'd be interested in learning the reason for the speed difference.
> Sure, the C version is slightly more optimized, but I was thinking that
> the C++ code should only be 20-50% slower, not 750% slower like I get
> with g++-4.1.0pre021006 (g++ 3.4.2 is a factor of 5 slower when
> compiling with "-O3" vs. "-O3 -ffast-math"). Does it have something to
> do with temporaries not being optimized away, or somesuch? A
> limitation of the x87 instruction set? Is it inherent in the way the
> C++ Standard requires complex<double>'s to be calculated? My bad coding
> style? Limitations imposed by g++?
>
> Curious,
>
> Greg Buchholz
> */

[snip]
>
> for (int i=0; i<iter and norm(Z) <= limit_sqr; ++i) Z = Z*Z + C;
>
> byte_acc = (byte_acc << 1) | ((norm(Z) > limit_sqr) ? 0x00:0x01);

[snip]

Could it be that a large share of the time is spent calculating "norm"? I
doubt that the C version does so, and it should not be necessary. Some
simple test should fit the bill (or at least avoid calling norm all the
time).

/Peter
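
One way to act on this suggestion, sketched under assumptions: keep the squares of the real and imaginary parts in plain doubles so that the escape test and the update share them. The function name `escape_iterations` and its parameters are hypothetical, and the code mirrors the general style of hand-written C versions rather than any particular shootout entry.

```cpp
// Hypothetical helper: count the steps of Z = Z*Z + C taken before |Z|^2
// exceeds limit_sqr, reusing zr*zr and zi*zi for both the test and the update.
int escape_iterations(double cr, double ci, int max_iter,
                      double limit_sqr = 4.0)
{
    double zr = 0.0, zi = 0.0;   // Z starts at 0
    double zr2 = 0.0, zi2 = 0.0; // cached squares of the two parts
    int i = 0;
    while (i < max_iter && zr2 + zi2 <= limit_sqr) {
        zi = 2.0 * zr * zi + ci; // Im(Z*Z + C), computed from the old zr
        zr = zr2 - zi2 + cr;     // Re(Z*Z + C), computed from the old squares
        zr2 = zr * zr;
        zi2 = zi * zi;
        ++i;
    }
    return i;
}
```

A return value equal to `max_iter` means the point never left the limit, which corresponds to a set bit in the PBM output of the original program.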

 
Jerry Coffin
      03-06-2006
In article <1141602729.936509.269620@e56g2000cwe.googlegroups.com>,
Greg Buchholz says...
>
> Jerry Coffin wrote:
> > Hmm...try this:
> >
> > for (int i=0; i<iter and norm(Z) <= limit_sqr; ++i) {
> > Z *= Z; Z += C; }
> >
> > No guarantee, but I think it's worth a shot.

>
> Tried it. No speed improvement on gcc-3.4.2 or gcc-4.1.0pre021006.


Well, that takes out the possibility that seemed most
obvious to me. Depending on your bent, your next step
would be either a profiler or examining the code the
compiler's producing. For large chunks of code, the
former works well, but for small amounts that you want to
examine in maximum detail the latter can be useful as
well.

--
Later,
Jerry.

The universe is a figment of its own imagination.
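
For the measuring route, a rough timing harness can already narrow things down before reaching for a full profiler. The sketch below assumes a C++11 compiler; `time_ms` and `count_inside` are hypothetical names, and `count_inside` simply wraps the inner loop of the original program so that its result can be checked and cannot be discarded by the optimizer.

```cpp
#include <chrono>
#include <complex>

// Hypothetical helper: run f() once and return the elapsed wall-clock
// time in milliseconds.
template <typename F>
double time_ms(F f)
{
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// The inner loop of the original program applied to an n x n grid.
// Returns how many points are still within the limit after `iter` steps.
long count_inside(int n, int iter = 50, double limit_sqr = 4.0)
{
    long inside = 0;
    for (int y = 0; y < n; ++y)
        for (int x = 0; x < n; ++x) {
            std::complex<double> Z(0.0, 0.0);
            std::complex<double> C(2.0 * x / n - 1.5, 2.0 * y / n - 1.0);
            for (int i = 0; i < iter && std::norm(Z) <= limit_sqr; ++i)
                Z = Z * Z + C;
            if (std::norm(Z) <= limit_sqr) ++inside;
        }
    return inside;
}
```

A call such as `long r = 0; double ms = time_ms([&] { r = count_inside(400); });` times one run; printing `r` afterwards keeps the work observable. For the other route, `g++ -O3 -S` writes the generated assembly to a file for inspection.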
 
Greg Buchholz
      03-06-2006
Greg Buchholz wrote:
> ...I've come across the issue that complex<double>'s seem quite slow
> unless compiled with -ffast-math. Of course doing that results in
> incorrect answers because of rounding issues. The speed difference for
> the program below is between 5x-8x depending on the version of g++.


Looks like the problem can be solved by manually inlining the
definition of "norm"...

//manually inlining "norm" results in a 5x-7x speedup on g++
for(int i=0; i<iter and
(Z.real()*Z.real() + Z.imag()*Z.imag()) <= limit_sqr; ++i)
Z = Z*Z + C;

...For some reason g++ must not have been able to inline it (or does so
only after common subexpression elimination or some such).

Greg Buchholz
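
The same change can be packaged as a small helper so the loop stays readable. This is only a sketch: `norm_sq` and `iterate` are hypothetical names, and whether it reproduces the speedup reported above will depend on the compiler and standard library version.

```cpp
#include <complex>

// Hypothetical helper: squared magnitude written out by hand, so the
// compiler sees two multiplies and an add with no library call involved.
inline double norm_sq(const std::complex<double>& z)
{
    return z.real() * z.real() + z.imag() * z.imag();
}

// The loop from the post with the hand-written test; returns the number
// of steps taken before the limit was exceeded (or iter if it never was).
int iterate(std::complex<double> C, int iter = 50, double limit_sqr = 4.0)
{
    std::complex<double> Z(0.0, 0.0);
    int i = 0;
    for (; i < iter && norm_sq(Z) <= limit_sqr; ++i)
        Z = Z * Z + C;
    return i;
}
```

Comparing this against a version that calls `std::norm` with the same compiler flags is the fair test, since the two differ only in the escape check.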

 
Bill Shortall
      03-06-2006

"Greg Buchholz" wrote:
> /*
> While writing a C++ version of the Mandelbrot benchmark over at the
> "The Great Computer Language Shootout"...
>
> http://shootout.alioth.debian.org/gp...lbrot&lang=all
>
> ...I've come across the issue that complex<double>'s seem quite slow
> unless compiled with -ffast-math. Of course doing that results in
> incorrect answers because of rounding issues. The speed difference for
> the program below is between 5x-8x depending on the version of g++. It
> is also about 5 times slower than the corresponding gcc version at...
>
> http://shootout.alioth.debian.org/gp...lbrot&lang=gcc&id=2
>
> ...I'd be interested in learning the reason for the speed difference.
> Sure, the C version is slightly more optimized, but I was thinking that
> the C++ code should only be 20-50% slower, not 750% slower like I get
> with g++-4.1.0pre021006 (g++ 3.4.2 is a factor of 5 slower when
> compiling with "-O3" vs. "-O3 -ffast-math"). Does it have something to
> do with temporaries not being optimized away, or somesuch? A
> limitation of the x87 instruction set? Is it inherent in the way the
> C++ Standard requires complex<double>'s to be calculated? My bad coding
> style? Limitations imposed by g++?
>
> Curious,
>
> Greg Buchholz
> */
>
> // Takes an integer argument "n" on the command line and generates a
> // PBM bitmap of the Mandelbrot set on stdout.
> // see also: ( http://sleepingsquirrel.org/cpp/mandelbrot.cpp.html )
>
> #include<iostream>
> #include<complex>
>
> int main (int argc, char **argv)
> {
> char bit_num = 0, byte_acc = 0;
> const int iter = 50;
> const double limit_sqr = 2.0 * 2.0;
>
> std::ios_base::sync_with_stdio(false);
> int n = atoi(argv[1]);
>
> std::cout << "P4\n" << n << " " << n << std::endl;
>
> for(int y=0; y<n; ++y)
> for(int x=0; x<n; ++x)
> {
> std::complex<double> Z(0.0,0.0);
> std::complex<double> C(2*(double)x/n - 1.5, 2*(double)y/n - 1.0);
>
> for (int i=0; i<iter and norm(Z) <= limit_sqr; ++i) Z = Z*Z + C;
>
> byte_acc = (byte_acc << 1) | ((norm(Z) > limit_sqr) ? 0x00:0x01);
>
> if(++bit_num == 8) { std::cout << byte_acc; bit_num = byte_acc = 0; }
> else if(x == n-1) { byte_acc <<= (8-n%8);
> std::cout << byte_acc;
> bit_num = byte_acc = 0; }
> }
> }
>


------------------------------------------------
Hi Greg,
I have had similar problems when
using the <complex> library for
Microsoft VC6. It ran at about half the
expected speed. After looking through
the header file I saw that the C++
structure was somewhat involved, with
a base class and several derived classes.
I ended up writing my own very simple
complex class, looking like this:

namespace std
{
template <class Tc>
class ppcomplex
{
public:
    Tc re;
    Tc im;

    ppcomplex() { re = 0; im = 0; }
    ppcomplex(const Tc& r, const Tc& i) : re(r), im(i) {}
    ppcomplex(const Tc& r) : re(r), im((Tc)0) {}

    Tc real() const { return re; }
    Tc imag() const { return im; }

    // setters return the new value
    Tc real(const Tc& x) { return (re = x); }
    Tc imag(const Tc& x) { return (im = x); }

    // the usual copy constructor and assignment operators
    ppcomplex(const ppcomplex<Tc>& z)
        { this->re = z.re; this->im = z.im; }
    ppcomplex<Tc>& operator =(const ppcomplex<Tc>& y) {
        if(this != &y)
            { this->re = y.re; this->im = y.im; }
        return *this; }
    ppcomplex<Tc>& operator =(const Tc& r)
        { this->re = r; this->im = (Tc)0; return *this; }

    // etc --- etc --- etc more stuff here

    // updating by a real constant
    ppcomplex<Tc>& operator +=(const Tc& y)
        { re += y; return *this; }

    // more stuff

}; // end of class ppcomplex
} // end of namespace std

This ran twice as fast! So maybe you have the same problem, i.e. your
complex class is just too complicated?

Regards....Bill
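
For readers who want to try the idea, here is a compilable reduction that fills in only the operations the Mandelbrot loop needs. It is a sketch under assumptions, not Bill's actual class: the namespace `pp` and the free functions are hypothetical, and the type lives outside `std`.

```cpp
namespace pp { // hypothetical namespace, chosen for this sketch

template <class T>
struct complex {
    T re;
    T im;

    complex(const T& r = T(), const T& i = T()) : re(r), im(i) {}

    T real() const { return re; }
    T imag() const { return im; }

    complex& operator+=(const complex& y)
    {
        re += y.re;
        im += y.im;
        return *this;
    }

    complex& operator*=(const complex& y)
    {
        // Compute the new real part first and store it last, so that
        // Z *= Z (where y aliases *this) still reads the old values.
        const T r = re * y.re - im * y.im;
        im = re * y.im + im * y.re;
        re = r;
        return *this;
    }
};

template <class T>
complex<T> operator+(complex<T> a, const complex<T>& b) { return a += b; }

template <class T>
complex<T> operator*(complex<T> a, const complex<T>& b) { return a *= b; }

// Squared magnitude, the quantity the loop compares against limit_sqr.
template <class T>
T norm(const complex<T>& z) { return z.re * z.re + z.im * z.im; }

} // namespace pp
```

With this in place, replacing `std::complex<double>` by `pp::complex<double>` in the original program leaves the loop `Z = Z*Z + C` unchanged, since `norm` is found through argument-dependent lookup.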





 
Marcus Kwok
      03-06-2006
Bill Shortall wrote:
> namespace std
> {
> template <class Tc>
> class ppcomplex
> {


You are not allowed to introduce your own names to namespace std. IIRC,
you are only allowed to add specializations of the standard template
classes, when specializing on user-defined classes.

--
Marcus Kwok
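
To illustrate the distinction, the sketch below puts a user-defined type in its own namespace and then adds the kind of specialization that is permitted. It assumes C++11 or later, because `std::hash` did not exist when this thread was written; `app` and `Point` are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

namespace app { // hypothetical user namespace: new names belong here

struct Point {
    int x;
    int y;
    bool operator==(const Point& o) const { return x == o.x && y == o.y; }
};

} // namespace app

namespace std {

// Permitted: a specialization of a standard class template that depends
// on a user-defined type. No new name is introduced into std.
template <>
struct hash<app::Point> {
    size_t operator()(const app::Point& p) const
    {
        return hash<int>()(p.x) * 31u + hash<int>()(p.y);
    }
};

} // namespace std
```

With the specialization visible, `std::unordered_set<app::Point>` works without passing a custom hasher.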
 
Bill Shortall
      03-07-2006

"Marcus Kwok" wrote:
> Bill Shortall wrote:
> > namespace std
> > {
> > template <class Tc>
> > class ppcomplex
> > {

>
> You are not allowed to introduce your own names to namespace std. IIRC,
> you are only allowed to add specializations of the standard template
> classes, when specializing on user-defined classes.
>
> --
> Marcus Kwok


OK -- change ppcomplex to complex,
call its header file <complex>, and
remove the old one.


 