Velocity Reviews > C++ > Priority Queue - very slow

# Priority Queue - very slow

chenbo09@gmail.com
Guest
Posts: n/a

 12-20-2010
Hello! I am trying to use priority queue to store my nodes, and I need
to insert and retrieve elements very frequently. The size of the queue
increases dramatically at the beginning and starts to reduce after the
queue reaches its maximum size in the process.

The queue contains objects with three data members, i.e., row number,
column number, elevation. The object with the smallest elevation will
be the first element in the queue.

I am trying to implement an algorithm proposed by a published research
paper, the authored wrote "The algorithm could process a gird with 500
* 1000 cells in 2 seconds", while it seems that it took my code 2
hours to finish the task. I observed the intermediate output and found
that the maximum size of my queue was about 20,000.

************************************************** ******
while( !theQueue.empty())
{
tempNode = theQueue.top();
row = tempNode.row;
col = tempNode.col;
theQueue.pop();

z = DEMFill.get_cellValue(row, col);
for(i = 0; i < 8; i++)
{
irow = Get_rowTo(i, row);
icol = Get_colTo(i, col);
if(DEM.is_inGrid(irow, icol) && DEM.is_InternalCell(irow, icol) &&
DEMFill.is_nodataCell(irow, icol))
{
double iz = DEM.get_cellValue(irow, icol);
if( iz < z) { iz = z; }
DEMFill.set_cellValue(irow, icol, iz);
progress++;

tempNode.row = irow;
tempNode.col = icol;
tempNode.spill = iz;
//theQueue.push( tempNode );
}
}

if ((progress % 5000) == 0)
{
cout << theQueue.size() << " " << progress << " " <<
DEMFill.get_nValidCells() << endl;
}
if (progress == DEMFill.get_nValidCells())
{
cout << "Finished!" << endl;
break;
}
}

Jorgen Grahn
Guest
Posts: n/a

 12-20-2010
On Mon, 2010-12-20, http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:
> Hello! I am trying to use priority queue to store my nodes, and I need
> to insert and retrieve elements very frequently. The size of the queue
> increases dramatically at the beginning and starts to reduce after the
> queue reaches its maximum size in the process.
>
> The queue contains objects with three data members, i.e., row number,
> column number, elevation. The object with the smallest elevation will
> be the first element in the queue.
>
> I am trying to implement an algorithm proposed by a published research
> paper, the authored wrote "The algorithm could process a gird with 500
> * 1000 cells in 2 seconds", while it seems that it took my code 2
> hours to finish the task.

A factor 3600 -- that's a HUGE difference. My gut feeling is that
there's a difference in algorithm, not just overhead from the priority
queue, a badly implemented operator< () etc.

I haven't read the code. Try to run your code through a profiler
(gprof if you're running GCC) and it will tell you where the algorithm
spends its time. The results from profiling often surprise you.

[snip code]

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Joshua Maurice
Guest
Posts: n/a

 12-21-2010
On Dec 20, 9:21*am, "(E-Mail Removed)" <(E-Mail Removed)> wrote:
> Hello! I am trying to use priority queue to store my nodes, and I need
> to insert and retrieve elements very frequently. The size of the queue
> increases dramatically at the beginning and starts to reduce after the
> queue reaches its maximum size in the process.
>
> The queue contains objects with three data members, i.e., row number,
> column number, elevation. The object with the smallest elevation will
> be the first element in the queue.
>
> I am trying to implement an algorithm proposed by a published research
> paper, the authored wrote "The algorithm could process a gird with 500
> * 1000 cells in 2 seconds", while it seems that it took my code 2
> hours to finish the task. I observed the intermediate output and found
> that the maximum size of my queue was about 20,000.
>
>
> ************************************************** ******
> * * * * while( !theQueue.empty())
> * * * * {
> * * * * * * * * tempNode = theQueue.top();
> * * * * * * * * row = tempNode.row;
> * * * * * * * * col = tempNode.col;
> * * * * * * * * theQueue.pop();
>
> * * * * * * * * z = DEMFill.get_cellValue(row, col);
> * * * * * * * * for(i = 0; i < 8; i++)
> * * * * * * * * {
> * * * * * * * * * * * * irow = Get_rowTo(i, row);
> * * * * * * * * * * * * icol = Get_colTo(i, col);
> * * * * * * * * * * * * if(DEM.is_inGrid(irow, icol) && DEM.is_InternalCell(irow, icol) &&
> DEMFill.is_nodataCell(irow, icol))
> * * * * * * * * * * * * {
> * * * * * * * * * * * * * * * * double iz = DEM.get_cellValue(irow, icol);
> * * * * * * * * * * * * * * * * if( iz < z) *{ iz = z; }
> * * * * * * * * * * * * * * * * DEMFill.set_cellValue(irow, icol, iz);
> * * * * * * * * * * * * * * * * progress++;
>
> * * * * * * * * * * * * * * * * tempNode.row = irow;
> * * * * * * * * * * * * * * * * tempNode.col = icol;
> * * * * * * * * * * * * * * * * tempNode.spill = iz;
> * * * * * * * * * * * * * * * * //theQueue.push( tempNode );
> * * * * * * * * * * * * }
> * * * * * * * * }
>
> * * * * * * * * if ((progress % 5000) == 0)
> * * * * * * * * {
> * * * * * * * * * * * * cout << theQueue.size() << " * " << progress << " * * * *" <<
> DEMFill.get_nValidCells() << endl;
> * * * * * * * * }
> * * * * * * * * if (progress == DEMFill.get_nValidCells())
> * * * * * * * * {
> * * * * * * * * * * * * cout << "Finished!" << endl;
> * * * * * * * * * * * * break;
> * * * * * * * * }
> * * * * }

Compiler options? Did you turn optimization on? Also, always please
post a complete compileable / linkable / executable code sample.

chenbo09@gmail.com
Guest
Posts: n/a

 12-21-2010
On Dec 20, 4:40*pm, Jorgen Grahn <(E-Mail Removed)> wrote:
> On Mon, 2010-12-20, (E-Mail Removed) wrote:
> > Hello! I am trying to use priority queue to store my nodes, and I need
> > to insert and retrieve elements very frequently. The size of the queue
> > increases dramatically at the beginning and starts to reduce after the
> > queue reaches its maximum size in the process.

>
> > The queue contains objects with three data members, i.e., row number,
> > column number, elevation. The object with the smallest elevation will
> > be the first element in the queue.

>
> > I am trying to implement an algorithm proposed by a published research
> > paper, the authored wrote "The algorithm could process a gird with 500
> > * 1000 cells in 2 seconds", while it seems that it took my code 2
> > hours to finish the task.

>
> A factor 3600 -- that's a HUGE difference. My gut feeling is that
> there's a difference in algorithm, not just overhead from the priority
> queue, a badly implemented operator< () etc.
>
> I haven't read the code. Try to run your code through a profiler
> (gprof if you're running GCC) and it will tell you where the algorithm
> spends its time. *The results from profiling often surprise you.
>
> [snip code]
>
> /Jorgen
>
> --
> * // Jorgen Grahn <grahn@ *Oo *o. * . *.
> \X/ * * snipabacken.se> * O *o * .

Hello Jorgen & Joshua,

I am new to C++, so I don't know what a profiler is! But for your

I defined a Grid class, among the data members, two dynamic 2D
vectors of size nrows * ncols are defined. one is used to contain the
elevation data, and the other is an identifier matrix indicating if a
cell is on the border, is an internal cell, or is no data value. I
think these two might consume a lot of memory. I am wondering if there
are ways to handle huge arrays.

Regarding the implementation of the operator <, here is my code:
class SpillNode
{
public:
unsigned row;
unsigned col;
double spill;

struct Greater : public binary_function <SpillNode, SpillNode, bool>
{
bool operator ()(const SpillNode& lhs, SpillNode& rhs) const
{
return lhs.spill > rhs.spill;
}
};
};

Joshua, can you tell me how to turn on optimization? Yes, I would like
to post my code here, but it is a relatively long multiple-file
project.

BTW, the code finished in a second if I common out the line
"theQueue.push( tempNode );". While if I don't then it is time
consuming, and this is the reason why I think it might be because of
the slow speed a priority queue. Probably my guess is wrong, and I am
not trying to offend C++ lovers.

Jorgen Grahn
Guest
Posts: n/a

 12-29-2010
On Tue, 2010-12-21, (E-Mail Removed) wrote:
> On Dec 20, 4:40*pm, Jorgen Grahn <(E-Mail Removed)> wrote:
>> On Mon, 2010-12-20, (E-Mail Removed) wrote:
>> > Hello! I am trying to use priority queue to store my nodes, and I need
>> > to insert and retrieve elements very frequently. The size of the queue
>> > increases dramatically at the beginning and starts to reduce after the
>> > queue reaches its maximum size in the process.

>>
>> > The queue contains objects with three data members, i.e., row number,
>> > column number, elevation. The object with the smallest elevation will
>> > be the first element in the queue.

>>
>> > I am trying to implement an algorithm proposed by a published research
>> > paper, the authored wrote "The algorithm could process a gird with 500
>> > * 1000 cells in 2 seconds", while it seems that it took my code 2
>> > hours to finish the task.

>>
>> A factor 3600 -- that's a HUGE difference. My gut feeling is that
>> there's a difference in algorithm, not just overhead from the priority
>> queue, a badly implemented operator< () etc.
>>
>> I haven't read the code. Try to run your code through a profiler
>> (gprof if you're running GCC) and it will tell you where the algorithm
>> spends its time. *The results from profiling often surprise you.
>>
>> [snip code]

....

> I am new to C++, so I don't know what a profiler is!

<http://en.wikipedia.org/wiki/Profiling_(computer_programming)>

Compile your program with extra profiling flags, run it, and get a
listing of how much time was spent in the different functions. That's
how gprof works, but I assume your compiler environment (whatever it
is) works in more or less the same way. The documentation will tell
you the details.

....
> Joshua, can you tell me how to turn on optimization? Yes, I would like
> to post my code here, but it is a relatively long multiple-file
> project.

No need to post the code. Just read the documentation for your compiler.
It will surely describe how to turn on optimization.

> BTW, the code finished in a second if I common out the line
> "theQueue.push( tempNode );". While if I don't then it is time
> consuming, and this is the reason why I think it might be because of
> the slow speed a priority queue.

does something completely different when you don't populate theQueue.
You can do no work, on an empty queue, really fast ...

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .