Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > vector::push_back performance

Reply
Thread Tools

vector::push_back performance

 
 
Uwe Schnitker
Guest
Posts: n/a
 
      09-30-2004
"Andrew Koenig" <(E-Mail Removed)> wrote in message news:<dop6d.644518$(E-Mail Removed)>...
> "Antonios Christofides" <(E-Mail Removed)> wrote in message
> news:dc5.4159da31.d0edf@voltaire...
>
> > As I read in the archives, the performance problem caused by memory
> > reallocations during vector:ush_back is a common one. My first C++
> > program is suffering from it: 300 thousand push_backs result,
> > according to the profiler, in 20 reallocations; these 20 reallocations
> > account for 3.6 seconds (Celeron 1.3G), which is 40% of total
> > execution time.

>
> I'm skeptical.
>
> Here's a little program:
>
> #include <vector>
>
> int main()
> {
> std::vector<int> v;
> for (std::vector<int>::size_type i = 0; i != 1000000; ++i)
> v.push_back(i);
> return 0;
> }
>
> When I run this program on my machine (admittedly faster than 1.3G, but no
> more than twice as fast), it runs in three *hundredths* of a second. And it
> calls push_back a million times, not 300,000 times.


Just for the records: Are you sure it _does_ call push_back at all?

Since your program doesn't produce any observable effect, the compiler
has licence to transfer it into something like

int main()
{
}

which should probably run quite fast, shouldn't it?

Have fun,

Uwe
 
Reply With Quote
 
 
 
 
Peter van Merkerk
Guest
Posts: n/a
 
      09-30-2004
Uwe Schnitker wrote:

> "Andrew Koenig" <(E-Mail Removed)> wrote in message news:<dop6d.644518$(E-Mail Removed)>...
>
>>"Antonios Christofides" <(E-Mail Removed)> wrote in message
>>news:dc5.4159da31.d0edf@voltaire...
>>
>>
>>>As I read in the archives, the performance problem caused by memory
>>>reallocations during vector:ush_back is a common one. My first C++
>>>program is suffering from it: 300 thousand push_backs result,
>>>according to the profiler, in 20 reallocations; these 20 reallocations
>>>account for 3.6 seconds (Celeron 1.3G), which is 40% of total
>>>execution time.

>>
>>I'm skeptical.
>>
>>Here's a little program:
>>
>>#include <vector>
>>
>>int main()
>>{
>> std::vector<int> v;
>> for (std::vector<int>::size_type i = 0; i != 1000000; ++i)
>> v.push_back(i);
>> return 0;
>>}
>>
>>When I run this program on my machine (admittedly faster than 1.3G, but no
>>more than twice as fast), it runs in three *hundredths* of a second. And it
>>calls push_back a million times, not 300,000 times.

>
>
> Just for the records: Are you sure it _does_ call push_back at all?
>
> Since your program doesn't produce any observable effect, the compiler
> has licence to transfer it into something like
>
> int main()
> {
> }
>
> which should probably run quite fast, shouldn't it?


Good point, one should always be carefull with drawing conclusions from
test like this.

I compiled Andrew program on MSVC 6.0 with full optimization and checked
the assembly output. On this compiler the loop isn't optimized away.
push_back() is inlined, but the function that inserts elements into the
vector is still called one million times. Nevertheless this program
completed in 0.10 seconds on a Pentium III @ 700Mhz.

But I can imagine when the objects in the vector are expensive to copy
the story changes quite a bit.

--
Peter van Merkerk
peter.van.merkerk(at)dse.nl
 
Reply With Quote
 
 
 
 
Andrew Koenig
Guest
Posts: n/a
 
      09-30-2004

"Uwe Schnitker" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) om...
> Just for the records: Are you sure it _does_ call push_back at all?
>
> Since your program doesn't produce any observable effect, the compiler
> has licence to transfer it into something like
>
> int main()
> {
> }
>
> which should probably run quite fast, shouldn't it?


If I insert the statement

std::cout << v.size() << std::endl;

in the appropriate place, it prints 1000000 and still runs in 0.03 seconds.


 
Reply With Quote
 
algorithm
Guest
Posts: n/a
 
      10-01-2004
Uwe - do you know who yoe are questioning?
I guess not...
Have fun yourself

 
Reply With Quote
 
Antonios Christofides
Guest
Posts: n/a
 
      10-03-2004
Hi again,

after several experiments, I see that, indeed, when vector:ush_back
needs to reallocate memory, it does construct a copy of the object and
destruct the old object. When I recompiled my program without
optimization, I saw that the reallocation routines don't actually take
much time, but the constructor of the contained class does; with
optimization, it is apparently inlined and the profiler indicates that
the time is spent in the reallocation routines. If I call
vector::reserve prior to performing the 306 thousand push_backs, the
constructor is called 306 thousand times; if I don't, it is called 830
thousand times (likewise for the destructor).

Why don't just memmove the objects? I understand that in the general
case this is not possible; for example, if the constructor registers
the object's address in some global vector object. But this would not
be a problem for my class, which does not do such things. Furthermore,
my examination with the debugger indicates that when I copy an
instance of it, the copy is identical, byte by byte; even a string
member points to the same memory location as the original. So is it
just that the compiler/stl can't know, and we can't hint it, that a
memmove would suffice?



Here's the class, for your reference:

struct Record {
Record(const Date& t, bool n, double v, string f):
timestamp(t), null(n), value(v), flags(f) { }
Date timestamp;
bool null;
double value;
string flags;
};

("Date" is a user class internally represented as a struct tm.)
 
Reply With Quote
 
Uwe Schnitker
Guest
Posts: n/a
 
      10-04-2004
"algorithm" <(E-Mail Removed)> wrote in message news:<(E-Mail Removed) alkaboutprogramming.com>...
> Uwe - do you know who yoe are questioning?
> I guess not...
> Have fun yourself


I'm not sure when I first read something either about or by AK, but it
must have been in the previous century ...

Questioning (well-earned) authority is not a matter of refusal to
learn from it. Quite the contrary.
 
Reply With Quote
 
Victor Bazarov
Guest
Posts: n/a
 
      10-04-2004
Antonios Christofides wrote:
> [...]
> Why don't just memmove the objects? I understand that in the general
> case this is not possible; for example, if the constructor registers
> the object's address in some global vector object. But this would not
> be a problem for my class, which does not do such things. Furthermore,
> my examination with the debugger indicates that when I copy an
> instance of it, the copy is identical, byte by byte; even a string
> member points to the same memory location as the original. So is it
> just that the compiler/stl can't know, and we can't hint it, that a
> memmove would suffice?
> [...]


Try implementing the copy c-tor for your class in terms of 'memmove'.
If you succeed, remember that it's not portable. If you don't succeed,
read up on "move semantics" for return values and temporaries (IIRC),
the discussion was in comp.lang.c++.moderated somewhere in the past
couple of years.

Victor
 
Reply With Quote
 
Tom Widmer
Guest
Posts: n/a
 
      10-04-2004
On Wed, 29 Sep 2004 07:37:21 +0000 (UTC), Antonios Christofides
<(E-Mail Removed)> wrote:

>> > All the texts I have read state that, when a dynamic array needs
>> > to be extended, it is better/faster to 'realloc' instead of
>> > creating a new array and copying the elements manually.
>> >
>> > So why is 'realloc' more efficient?

>
>Except for what was already said, I believe that even in the cases
>when realloc needs to allocate a new memory block rather than extend
>the existing one, it uses memmove or some similar operation which will
>be faster than manually copying the elements if its implementation
>uses specialized mass-byte-copying CPU instructions.


std::vector will also typically use memmove or memcpy for built in
types like int. The easiest way to see this is to trace into the
push_back calls in a debugger - working out the code path can be
difficult otherwise.

For object types with copy constructors, the language currently
contains no better way of moving them that copying them and then
destroying the original. However, move constructors are a future
solution to this problem.

Tom
 
Reply With Quote
 
Tom Widmer
Guest
Posts: n/a
 
      10-04-2004
On Sun, 03 Oct 2004 18:52:16 -0000, Antonios Christofides
<(E-Mail Removed)> wrote:

>Hi again,
>
>after several experiments, I see that, indeed, when vector:ush_back
>needs to reallocate memory, it does construct a copy of the object and
>destruct the old object. When I recompiled my program without
>optimization, I saw that the reallocation routines don't actually take
>much time, but the constructor of the contained class does; with
>optimization, it is apparently inlined and the profiler indicates that
>the time is spent in the reallocation routines. If I call
>vector::reserve prior to performing the 306 thousand push_backs, the
>constructor is called 306 thousand times; if I don't, it is called 830
>thousand times (likewise for the destructor).


Right. Unfortunately the std::allocator interface doesn't include any
kind of try_realloc method, so it always has to allocate a new block
and copy, even when there is free space after the current storage.

>Why don't just memmove the objects? I understand that in the general
>case this is not possible; for example, if the constructor registers
>the object's address in some global vector object. But this would not
>be a problem for my class, which does not do such things. Furthermore,
>my examination with the debugger indicates that when I copy an
>instance of it, the copy is identical, byte by byte; even a string
>member points to the same memory location as the original. So is it
>just that the compiler/stl can't know, and we can't hint it, that a
>memmove would suffice?


Yes, exactly. Strictly speaking, you can't use memmove portably on any
non-POD type, but in practice it works well with many types.

>Here's the class, for your reference:
>
> struct Record {
> Record(const Date& t, bool n, double v, string f):
> timestamp(t), null(n), value(v), flags(f) { }
> Date timestamp;
> bool null;
> double value;
> string flags;
> };
>
> ("Date" is a user class internally represented as a struct tm.)


Well, I can think of reasonable implementations of std::string that
wouldn't be memmoveable (e.g. using a COW based lock-free pointer XOR
reference linked list implementation), so the copy-destroy cycle is
the best you can do until this comes to be:
http://www.open-std.org/jtc1/sc22/wg...ve%20Semantics

Tom
 
Reply With Quote
 
David Harmon
Guest
Posts: n/a
 
      10-06-2004
On Thu, 30 Sep 2004 14:03:10 GMT in comp.lang.c++, "Andrew Koenig"
<(E-Mail Removed)> wrote,
>
>If I insert the statement
>
> std::cout << v.size() << std::endl;
>
>in the appropriate place, it prints 1000000 and still runs in 0.03 seconds.


But perhaps the compiler is determining statically what the size of the
vector would be at that point and encoding only that string in the
executable. I think you have to read the initial 1000000 from a file
that is unknown at compile time.

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Performance Tutorials Services - Boosting Performance by DisablingUnnecessary Services on Windows XP Home Edition Software Engineer Javascript 0 06-10-2011 02:18 AM
OCZ Rally High Performance USB2 (Dual Channel) Flash Memory Silverstrand Front Page News 0 09-20-2005 03:35 AM
How to quickly improve your computer's performance @ Bona... Silverstrand Front Page News 0 08-24-2005 01:26 PM
HEXUS.review: Rockdirect XTI 3.8 Performance Silverstrand Front Page News 0 08-08-2005 02:37 PM
Web Form Performance Versus Single File Performance jm ASP .Net 1 12-12-2003 11:14 PM



Advertisments