wrote:
> I'm working on a scientific computing application. I need a class
> called Element which is no more than a collection of integers, or
> "nodes" and has only one method int getNode(int i).
>
> I would like to implement it in the most efficient way possible. So I
> summoned up my programming intellect and asked myself: Do I want to
> have members such as int a, b, c, d, e or a single member such as int
> a[5]. So I wrote the following snippet and compiled it with a -O3 flag:
>
> int main(int argc, char *argv[]) {
>
> /*
> int a, b, c, d, e;
> for (int i = 0; i < 1000000000; i++) {
> a = 1;
> b = 2;
> c = 3;
> d = 4;
> e = 5;
> }
>
> */
>
>
> int a[5];
> for (int i = 0; i < 1000000000; i++) {
> a[0] = 1;
> a[1] = 2;
> a[2] = 3;
> a[3] = 4;
> a[4] = 5;
> }
>
> return 0;
> }
>
> The first (commented out) version ran twice as fast. (For doubles
> instead of ints, it was a factor of 4). So the simpleton part of me
> thinks that that answers my question. The remaining part tells me that
> it is never that simple. Finally, the cynical part of me thinks that it
> all doesn't matter and other parts of the program are bound to be far
> more time consuming.
It is more than likely that the compiler rearranged your code into
something like this, hoisting the stores out of the loop:
int a, b, c, d, e;
a = 1;
b = 2;
c = 3;
d = 4;
e = 5;
for (int i = 0; i < 1000000000; i++) {
}
Or, perhaps the compiler placed the values in registers.
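One way to check whether the optimizer is simply removing or hoisting the stores is to make every store observable. The following is only a minimal sketch, assuming that volatile writes are acceptable for a micro-benchmark; the names store_members and store_array are made up for illustration. Note that volatile also keeps the values out of registers, so this measures stores to memory rather than what the optimizer would normally produce.

```cpp
#include <cassert>

// Hypothetical helpers: each performs `iterations` rounds of five stores.
// The volatile qualifier obliges the compiler to emit every store, so the
// assignments cannot be hoisted out of the loop or deleted as dead code.
int store_members(long iterations)
{
    volatile int a = 0, b = 0, c = 0, d = 0, e = 0;
    for (long i = 0; i < iterations; ++i) {
        a = 1; b = 2; c = 3; d = 4; e = 5;
    }
    return a + b + c + d + e;   // read back so the result can be checked
}

int store_array(long iterations)
{
    volatile int a[5] = {0, 0, 0, 0, 0};
    for (long i = 0; i < iterations; ++i) {
        a[0] = 1; a[1] = 2; a[2] = 3; a[3] = 4; a[4] = 5;
    }
    return a[0] + a[1] + a[2] + a[3] + a[4];
}
```

Timing these two functions with the same iteration count should show whether the factor of two survives once the compiler is no longer free to drop the work.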
There is a deeper design question for you.
Are these values really related? Do you do operations on them in
tandem? Would you ever think that it might be interesting to write a
template function with a "pointer to member" of one of these values?
I would go with the 5 separate values if they are truly separate. That
way it will be harder to run into other problems like going past array
bounds or issues with using the wrong index etc.
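To make the "pointer to member" idea concrete, here is a minimal sketch, assuming a struct with five separate int members like the one in the question. The helper names set_member and sum_member are hypothetical and only illustrate the technique.

```cpp
#include <cassert>

struct A
{
    int a, b, c, d, e;
};

// Generic setter: the member to modify is selected by a pointer to member,
// so there is no runtime index that could go out of range.
template <typename T, typename M>
void set_member(T & obj, M T::* member, const M & value)
{
    obj.*member = value;
}

// Hypothetical helper: sum one chosen member over a range of objects.
template <typename T, typename M>
M sum_member(const T * first, const T * last, M T::* member)
{
    M total = M();
    for (; first != last; ++first)
        total += first->*member;
    return total;
}
```

A call such as set_member(x, &A::c, 3) names the member at compile time, so a typo is a compile error rather than a silent write to the wrong slot.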
Anyhow, below is an example where the compiler can't (easily) make the
optimization above. The timings for the two layouts are essentially
identical with:
gcc version 4.0.0 20050102 (experimental)
#include <ostream>
#include <iostream>
struct X
{
virtual ~X() {} // virtual destructor so derived objects can be deleted via X*
virtual void F() = 0; // hard for compiler to optimize this
};
struct A
{
int a, b, c, d, e;
};
struct B
{
int a[5];
};
struct Av
: A, X
{
Av()
: A()
{
}
virtual void F()
{
a = 1;
b = 2;
c = 3;
d = 4;
e = 5;
}
};
struct Bv
: B, X
{
Bv()
: B()
{
}
virtual void F()
{
a[0] = 1;
a[1] = 2;
a[2] = 3;
a[3] = 4;
a[4] = 5; // last valid index; a[5] would write past the end of int a[5]
}
};
int main( int argc, char ** argv )
{
X * x;
if ( argc >= 2 )
{
std::cout << "Making an A\n";
x = new Av;
}
else
{
std::cout << "Making a B\n";
x = new Bv;
}
for (int i = 0; i < 1000000000; i++)
{
x->F();
}
}
$ g++ -O3 -o ./optimal_array_or_members ./optimal_array_or_members.cpp
$ time ./optimal_array_or_members
Making a B
6.900u 0.000s 0:06.92 99.7% 0+0k 0+0io 216pf+0w
$ time ./optimal_array_or_members d
Making an A
6.770u 0.000s 0:06.78 99.8% 0+0k 0+0io 216pf+0w
$ time ./optimal_array_or_members
Making a B
6.960u 0.010s 0:06.96 100.1% 0+0k 0+0io 216pf+0w
$ time ./optimal_array_or_members s
Making an A
6.920u 0.000s 0:06.92 100.0% 0+0k 0+0io 216pf+0w
$ time ./optimal_array_or_members
Making a B
7.010u 0.000s 0:07.00 100.1% 0+0k 0+0io 216pf+0w
$ time ./optimal_array_or_members s
Making an A
6.950u 0.000s 0:06.95 100.0% 0+0k 0+0io 216pf+0w
$ time ./optimal_array_or_members
Making a B
6.770u 0.000s 0:06.76 100.1% 0+0k 0+0io 216pf+0w
$ time ./optimal_array_or_members s
Making an A
6.850u 0.000s 0:06.84 100.1% 0+0k 0+0io 216pf+0w
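The numbers above come from the shell's time command. As an alternative, the loop can be timed from inside the program, which lets both variants be measured in a single run. This is only a sketch using std::clock from the standard library; Task, Counter and time_calls are hypothetical names, with Task standing in for the X interface above.

```cpp
#include <cassert>
#include <ctime>

// Stand-in for the X interface in the example above.
struct Task
{
    virtual ~Task() {}
    virtual void F() = 0;
};

// Trivial implementation that counts calls, used to check the harness.
struct Counter : Task
{
    long calls;
    Counter() : calls(0) {}
    virtual void F() { ++calls; }
};

// Hypothetical helper: CPU seconds spent calling F() `iterations` times.
double time_calls(Task & t, long iterations)
{
    std::clock_t start = std::clock();
    for (long i = 0; i < iterations; ++i)
        t.F();
    std::clock_t stop = std::clock();
    return double(stop - start) / CLOCKS_PER_SEC;
}
```

std::clock reports processor time, which is roughly comparable to the user time printed by time, though its resolution depends on the platform.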