Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > Slow manual 'garbage collection' in C++ ??

Reply
Thread Tools

Slow manual 'garbage collection' in C++ ??

 
 
brey_maastricht@hotmail.com
Guest
Posts: n/a
 
      06-17-2008
Hello guys,

I really need your help. I'm a Java programmer and right now I'm
experiencing a huge problem with C++.

In a scientific computation I need to store a lot of Point objects
(which have an array of doubles and 2 ints as fields).
To quickly search among all these points I have designed a Hashtable
to store them; a bucket/vector of buckets/vectors.
The hashtable is filled with Points objects in a while loop.

This whole procedure again is repeated a lot of times in a for loop.
At the end of every run of this for loop the Hashtable has to be
emptied in order to prevent for memroy leaks: this means looping
through every vector in the main vector of the hashtable and delete
every Point object manually.
Later on, this C++ computation will be compiled under Matlab to use it
over there.

But now the problem is:

My C++ programm is very fast without emptying the hashtable. But then
I have a huge memory leak; and my swapfile grows enormouly.
With emptying the hashtable my C++ programm becomes dramatically
slow.....even slower than my Java programm.

So the question is: What to do now ?? It cannot be the case that
preventing a memory leak, slows down the computations ??

Kind regards,
Brey
 
Reply With Quote
 
 
 
 
Mirco Wahab
Guest
Posts: n/a
 
      06-17-2008
http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> In a scientific computation I need to store a lot of Point objects
> (which have an array of doubles and 2 ints as fields).
> To quickly search among all these points I have designed a Hashtable
> to store them; a bucket/vector of buckets/vectors.
> The hashtable is filled with Points objects in a while loop.
> ...
> My C++ programm is very fast without emptying the hashtable. But then
> I have a huge memory leak; and my swapfile grows enormouly.
> With emptying the hashtable my C++ programm becomes dramatically
> slow.....even slower than my Java programm.
> So the question is: What to do now ?? It cannot be the case that
> preventing a memory leak, slows down the computations ??


You could redesign the stuff for speed by using
e.g. continous vectors - and hash separate
integer indexes which point into these:

// [PSEUDO]

vector<int>indexarray; // <= 0 .. NUMPOINTS-1

struct coordinates {
vector<double>x; // <= 0 .. NUMPOINTS-1
vector<double>y; //
vector<double>z; //
vector<int>data1;
vector<int>data2;
};

or use the more traditional approach
with the same index array:

vector<int>indexarray; // <= 0 .. NUMPOINTS-1

struct coordinates { double x,y,z; int data1,data2; };
vector<coordinates>v; // <= 0 .. NUMPOINTS-1

Why did you design a hash algorithm instead
using a std::map<whatever, whatelse> (whats wrong with it?).
What exactly is hashed, what has to be looked up fast?

Regards

Mirco

 
Reply With Quote
 
 
 
 
iu2
Guest
Posts: n/a
 
      06-17-2008
On Jun 17, 9:46 pm, (E-Mail Removed) wrote:
> Hello guys,
>
> I really need your help. I'm a Java programmer and right now I'm
> experiencing a huge problem with C++.
>
> In a scientific computation I need to store a lot of Point objects
> (which have an array of doubles and 2 ints as fields).
> To quickly search among all these points I have designed a Hashtable
> to store them; a bucket/vector of buckets/vectors.
> The hashtable is filled with Points objects in a while loop.
>
> This whole procedure again is repeated a lot of times in a for loop.
> At the end of every run of this for loop the Hashtable has to be
> emptied in order to prevent for memroy leaks: this means looping
> through every vector in the main vector of the hashtable and delete
> every Point object manually.
> Later on, this C++ computation will be compiled under Matlab to use it
> over there.
>
> But now the problem is:
>
> My C++ programm is very fast without emptying the hashtable. But then
> I have a huge memory leak; and my swapfile grows enormouly.
> With emptying the hashtable my C++ programm becomes dramatically
> slow.....even slower than my Java programm.
>
> So the question is: What to do now ?? It cannot be the case that
> preventing a memory leak, slows down the computations ??
>
> Kind regards,
> Brey


You don't need to write your own hash.
Try using STL. I think an STL 'multiset' can store objects in an RB-
tree fashion. When the multiset goes out of scope it automatically
frees all the memory.
You will need to overload the '<' operator to allow the multiset
comare the nodes in the RB tree.

Here is an example:

#include <set>
#include <iostream>

using namespace std;

// define the Point struct
struct Point {
double ar[10];
int a, b;

Point(int x, int y) {
a = x;
b = y;
}
};

// define the '<' operator, this definition relates to the 'a' field
bool operator < (const Point &p1, const Point &p2) {
return p1.a < p2.a;
}


int main()
{
multiset<Point> points;

// insert some points
for (int i = 0; i < 10; i++) {
cout << "Inserting point with a = " << 10 - i << endl;
points.insert(Point(10 - i, 0));
points.insert(Point(10 - i, 0)); // add each item twice..., just to
see this works
}

cout << endl;

// now scan the set
multiset<Point>::iterator it;
for (it = points.begin(); it != points.end(); it++) {
cout << "Reading point: a = " << it->a << ", b = " << it->b << endl;
}

return 0;
}


The output is:

Inserting point with a = 10
Inserting point with a = 9
Inserting point with a = 8
Inserting point with a = 7
Inserting point with a = 6
Inserting point with a = 5
Inserting point with a = 4
Inserting point with a = 3
Inserting point with a = 2
Inserting point with a = 1

Reading point: a = 1, b = 0
Reading point: a = 1, b = 0
Reading point: a = 2, b = 0
Reading point: a = 2, b = 0
Reading point: a = 3, b = 0
Reading point: a = 3, b = 0
Reading point: a = 4, b = 0
Reading point: a = 4, b = 0
Reading point: a = 5, b = 0
Reading point: a = 5, b = 0
Reading point: a = 6, b = 0
Reading point: a = 6, b = 0
Reading point: a = 7, b = 0
Reading point: a = 7, b = 0
Reading point: a = 8, b = 0
Reading point: a = 8, b = 0
Reading point: a = 9, b = 0
Reading point: a = 9, b = 0
Reading point: a = 10, b = 0
Reading point: a = 10, b = 0

See how the numbers are ordered from least to most, although inserted
from most to least.
 
Reply With Quote
 
Erik Wikström
Guest
Posts: n/a
 
      06-17-2008
On 2008-06-17 20:46, (E-Mail Removed) wrote:
> Hello guys,
>
> I really need your help. I'm a Java programmer and right now I'm
> experiencing a huge problem with C++.
>
> In a scientific computation I need to store a lot of Point objects
> (which have an array of doubles and 2 ints as fields).
> To quickly search among all these points I have designed a Hashtable
> to store them; a bucket/vector of buckets/vectors.
> The hashtable is filled with Points objects in a while loop.
>
> This whole procedure again is repeated a lot of times in a for loop.
> At the end of every run of this for loop the Hashtable has to be
> emptied in order to prevent for memroy leaks: this means looping
> through every vector in the main vector of the hashtable and delete
> every Point object manually.
> Later on, this C++ computation will be compiled under Matlab to use it
> over there.
>
> But now the problem is:
>
> My C++ programm is very fast without emptying the hashtable. But then
> I have a huge memory leak; and my swapfile grows enormouly.
> With emptying the hashtable my C++ programm becomes dramatically
> slow.....even slower than my Java programm.
>
> So the question is: What to do now ?? It cannot be the case that
> preventing a memory leak, slows down the computations ??


Since you are allocating/deallocating a lot of objects of the same type
a specialised allocator such as a pool allocator might be useful.

You could also put all the point objects in a pre-resized vector and
store pointers to them in the hash-table. This will get rid of the high
number of deallocations since the vector manages the memory for the points.

--
Erik Wikström
 
Reply With Quote
 
jl_post@hotmail.com
Guest
Posts: n/a
 
      06-17-2008
On Jun 17, 12:46 pm, (E-Mail Removed) wrote:
>
> This whole procedure again is repeated a lot of times in a for loop.
> At the end of every run of this for loop the Hashtable has to be
> emptied in order to prevent for memroy leaks: this means looping
> through every vector in the main vector of the hashtable and delete
> every Point object manually.



Dear Brey,

Maybe I misunderstood something in your post, but it sounds like
you are using a vector of vectors of Point pointers (that is,
vector<vector<Point *> >).

If that's the case, why are you using pointers? If you just use
Point objects (as opposed to Point pointers), there would be no need
to free/delete the memory; the vectors and Points would just clean
themselves up when they go out of scope.

I find that a lot of C and Java programmers who move to C++ have a
habit of thinking in pointers; for some reason, they like to allocate
memory and free/delete it even when there's no advantage to doing so.

However, C++ eliminates the need to use pointers in many, many
places. For example, do you need a function to modify a parameter
itself, instead of a copy of the parameter? Use a C++ reference (no
pointer needed). Do you need a variable-length array? Use a
std::vector (again, no pointers needed). Do you need a variable-
length string? Use a std::string. Do you need to create an object
that outlasts its scope? Have the caller function declare it, and
pass it in as a reference.

In my experience about 90% of crashes in C++ programs are due to
pointer errors. Thus, if a programmer were to write pointer-free
code, 9 out 10 potential crashes could be avoided.

But some programmers have a hard time wrapping their heads around
pointer-free code. They can't conceive how it's possible to write
good code without pointers, and will sometimes argue that fact with
someone who can show them that it is in fact possible (I'm speaking
from experience here).

I'm not saying that it's necessarily bad to program with pointers;
what I'm trying to say is that most uses of pointers are unnecessary,
and that by using self-cleaning code (which C++ affords the use of),
programmers can use and manipulate objects without ever having to
worry about memory allocation & deletion. Avoid pointers, and avoid
most crashes and memory leaks.

(Memory allocation and deletion are also rarely ever mentioned in
pseudo-code (probably because it's too environment-specific), so if
you believe that written code should map to well-written pseudo-code
(like I do), it would do you well to avoid malloc/new and free/delete
calls wherever possible.)

I apologize if I misunderstood your post, but I still recommend
avoid using pointers wherever possible.

Have a great day!

-- Jean-Luc
 
Reply With Quote
 
brey_maastricht@hotmail.com
Guest
Posts: n/a
 
      06-17-2008
Thank you all for your replys !

Indeed I am using a vector<vector<nDimensionalPoint *> > as Hashtable.
My requirement is to lookup the value of a point of n dimensions very
fast for given coordinates.

I'm going to try an STL Map or Multiset. Maybe that works faster and
more memory efficient.
I have never heard before of a pool allocator but will Google for it.

But when I dont use pointers in my vector of vectors but only
vector<vector<nDimensionalPoint> >, the destructor of the objects
nDimensionalPoint are called, I suppose ?

Best regards !
Brey
The Netherlands

 
Reply With Quote
 
Bo Persson
Guest
Posts: n/a
 
      06-17-2008
(E-Mail Removed) wrote:
> Thank you all for your replys !
>
> Indeed I am using a vector<vector<nDimensionalPoint *> > as
> Hashtable. My requirement is to lookup the value of a point of n
> dimensions very fast for given coordinates.
>
> I'm going to try an STL Map or Multiset. Maybe that works faster and
> more memory efficient.
> I have never heard before of a pool allocator but will Google for
> it.
>
> But when I dont use pointers in my vector of vectors but only
> vector<vector<nDimensionalPoint> >, the destructor of the objects
> nDimensionalPoint are called, I suppose ?
>


Yes, but if the objects just contain doubles and ints, the destructor
doesn't have to do anything. Can't be much faster than that!


Bo Persson


 
Reply With Quote
 
brey_maastricht@hotmail.com
Guest
Posts: n/a
 
      06-17-2008
On 17 jun, 22:45, "Bo Persson" <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote:
> > Thank you all for your replys !

>
> > Indeed I am using a vector<vector<nDimensionalPoint *> > as
> > Hashtable. My requirement is to lookup the value of a point of n
> > dimensions very fast for given coordinates.

>
> > I'm going to try an STL Map or Multiset. Maybe that works faster and
> > more memory efficient.
> > I have never heard before of a pool allocator but will Google for
> > it.

>
> > But when I dont use pointers in my vector of vectors but only
> > vector<vector<nDimensionalPoint> >, the destructor of the objects
> > nDimensionalPoint are called, I suppose ?

>
> Yes, but if the objects just contain doubles and ints, the destructor
> doesn't have to do anything. Can't be much faster than that!
>
> Bo Persson


Well, the objects contain 2 values and one array:
its real value, its imaginary value and the coordinates in an integer
array (because the number of coordinates have to be variable, can be 2
dimensional up till 12 dimensional).
So I need a real destructor which delete[] the int array.
 
Reply With Quote
 
jl_post@hotmail.com
Guest
Posts: n/a
 
      06-17-2008
On Jun 17, 3:18 pm, (E-Mail Removed) wrote:
>
> Well, the objects contain 2 values and one array:
> its real value, its imaginary value and the coordinates in an integer
> array (because the number of coordinates have to be variable, can be 2
> dimensional up till 12 dimensional).
> So I need a real destructor which delete[] the int array.



I think I said this in my previous post, but I'll say it again: If
you ever need to use a variable-length array, just use a vector, like
this:

class nDimensionalPoint
{
public:
double realValue;
double imaginaryValue;
std::vector<int> coordinates;
};

This way there's no need do define a destructor, since there was no
memory allocated!

I hope this helps, Brey.

-- Jean-Luc

 
Reply With Quote
 
jl_post@hotmail.com
Guest
Posts: n/a
 
      06-17-2008
On Jun 17, 2:30 pm, (E-Mail Removed) wrote:
>
> But when I dont use pointers in my vector of vectors but only
> vector<vector<nDimensionalPoint> >, the destructor of the objects
> nDimensionalPoint are called, I suppose ?



Yes. When an object is declared "on the stack" (as opposed to "on
the heap" as pointer memory usually is), its destructor automatically
gets called when the object goes out of scope. Therefore, there is no
need to "garbage collect" it.

Take a look at this class:

class A
{
A() { std::cout << "In A's constructor.\n"; }
~A() { std::cout << "In A's destructor.\n"; }
};

Both of the following lines will call A's constructor:

A a; // calls A::A()
A *aPtr = new A; // also calls A::A()

But when do the destructors get called? Well, for aPtr, it gets
called when you delete() it, like this:

delete(aPtr); // invokes A::~A()

As for the object/variable named "a", its destructor gets called as
soon as it goes out of scope. If you return or break out of the scope
prematurely, its destructor still gets called! In other words, a's
destructor is guaranteed to get called once (and exactly once),
whereas aPtr's destructor gets called only when the "delete(aPtr);"
line is encountered. (If it's never encountered, aPtr's destructor
never gets called, and if it gets encountered more than once, a double-
delete() will happen, likely causing your program to crash.)

So you have to make sure that if you call new() on a pointer to
call delete() on it once and exactly once. But if you just declare it
on the stack (that is, as a non-pointer), you don't have to remember
to call anything for it -- it will clean itself up as soon as it goes
out of scope!

This aspect of C++ (that allows out-of-scope objects to clean
themselves up by calling their destructor) is one of the reasons C++
doesn't have a garbage collector. There is just no need for one if
you write your objects to be self-cleaning. (And you don't need to
define a destructor for a class if all the objects that class are
already self-cleaning.)

In fact, C++'s approach in cleaning objects when they go out of
scope is often much faster than many other languages' approach --
namely, to use a garbage collector that activates only when a system
process decides to.

So my advice is to avoid using pointers as well as calls to malloc/
new and free/delete. Doing so will prevent many memory leaks and
crashes, and in many cases make your code easier to read.

I hope this helps, Brey.

-- Jean-Luc
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Re: slow slow slow! Expert lino fitter Computer Support 5 12-12-2008 04:00 PM
Re: slow slow slow! General Patron Computer Support 0 12-11-2008 11:01 PM
Re: slow slow slow! chuckcar Computer Support 0 12-10-2008 11:25 PM
Re: slow slow slow! Beauregard T. Shagnasty Computer Support 2 12-10-2008 09:03 PM
Re: slow slow slow! Expert lino fitter Computer Support 0 12-10-2008 02:33 PM



Advertisments