![]() |
vector::push_back performance
Hi,
As I read in the archives, the performance problem caused by memory reallocations during vector::push_back is a common one. My first C++ program is suffering from it: 300 thousand push_backs result, according to the profiler, in 20 reallocations; these 20 reallocations account for 3.6 seconds (Celeron 1.3G), which is 40% of total execution time. What I don't understand: why is the reallocation code so complex? I studied the library source and I have a hard time understanding it, but it seems to be copying the vector item by item in each reallocation. Why wouldn't a "realloc" suffice? And, given that I don't know the vector size beforehand, is there anything else I can do other than trying deqeue or a guessed vector::reserve? In case it matters, I'm using gcc 3.3 with its standard c++ library on a Debian sarge, but portability is also an issue. Thanks! |
Re: vector::push_back performance
Antonios Christofides wrote:
> Hi, > > As I read in the archives, the performance problem caused by memory > reallocations during vector::push_back is a common one. My first C++ > program is suffering from it: 300 thousand push_backs result, > according to the profiler, in 20 reallocations; these 20 reallocations > account for 3.6 seconds (Celeron 1.3G), which is 40% of total > execution time. > > What I don't understand: why is the reallocation code so complex? I > studied the library source and I have a hard time understanding it, > but it seems to be copying the vector item by item in each > reallocation. Why wouldn't a "realloc" suffice? That's what a "realloc" does, too. You usually can't easily make an already allocated memory block bigger (what would you do with data after it?), so a new block must be allocated and the data be copied over to it, then the old one destroyed. > And, given that I don't know the vector size beforehand, is there > anything else I can do other than trying deqeue or a guessed > vector::reserve? Not much. > In case it matters, I'm using gcc 3.3 with its standard c++ library on > a Debian sarge, but portability is also an issue. |
Re: vector::push_back performance
Antonios Christofides wrote:
> As I read in the archives, the performance problem caused by memory > reallocations during vector::push_back is a common one. My first C++ > program is suffering from it: 300 thousand push_backs result, > according to the profiler, in 20 reallocations; these 20 reallocations > account for 3.6 seconds (Celeron 1.3G), which is 40% of total > execution time. Three hundred thousand push_backs into a vector without reserve? Seems unjustified. > What I don't understand: why is the reallocation code so complex? I > studied the library source and I have a hard time understanding it, > but it seems to be copying the vector item by item in each > reallocation. Why wouldn't a "realloc" suffice? What would 'realloc' do? Place the objects each at a different memory location without letting the object know? That's not right. Objects may need to know where they have been constructed. They might want to let other classes or objects know of their location, etc. > And, given that I don't know the vector size beforehand, is there > anything else I can do other than trying deqeue or a guessed > vector::reserve? Nope. Using standard containers requires acknowledging the trade-offs. If you need fast push_back and you don't know the size, you should probably use 'std::list' or 'std::deque'. If you need random access afterwards, don't push-back without reserving. I bet any decent book on Standard containers talks about how to pick the container well-suited for your task. Of course, that assumes that you know what your task is. > In case it matters, I'm using gcc 3.3 with its standard c++ library on > a Debian sarge, but portability is also an issue. It doesn't matter, at least not here. V |
Re: vector::push_back performance
Antonios Christofides wrote:
> As I read in the archives, the performance problem caused by memory > reallocations during vector::push_back is a common one. My first C++ > program is suffering from it: 300 thousand push_backs result, > according to the profiler, in 20 reallocations; these 20 reallocations > account for 3.6 seconds (Celeron 1.3G), which is 40% of total > execution time. Do you think you could go to an algorithm where you push less back? > What I don't understand: why is the reallocation code so complex? I > studied the library source and I have a hard time understanding it, > but it seems to be copying the vector item by item in each > reallocation. Why wouldn't a "realloc" suffice? Read Herb Sutter's way-cool GOTW series, and his books /Exceptional C++/. He impugns the container class of choice should usually be std::deque<>, not std::vector<>. It frags not memory like std::list<>, and it's optimal to push things to both the beginning and end. -- Phlip http://industrialxp.org/community/bi...UserInterfaces |
Re: vector::push_back performance
"Antonios Christofides" <anthony@itia.ntua.gr> wrote in message
news:dc5.4159da31.d0edf@voltaire... > As I read in the archives, the performance problem caused by memory > reallocations during vector::push_back is a common one. My first C++ > program is suffering from it: 300 thousand push_backs result, > according to the profiler, in 20 reallocations; these 20 reallocations > account for 3.6 seconds (Celeron 1.3G), which is 40% of total > execution time. I'm skeptical. Here's a little program: #include <vector> int main() { std::vector<int> v; for (std::vector<int>::size_type i = 0; i != 1000000; ++i) v.push_back(i); return 0; } When I run this program on my machine (admittedly faster than 1.3G, but no more than twice as fast), it runs in three *hundredths* of a second. And it calls push_back a million times, not 300,000 times. This behavior suggests to me that your vector must contain objects of a class that is much more expensive to copy than int. So I think we need to see more information about your program in order to understand the source of the performance problem. |
Re: vector::push_back performance
> That's what a "realloc" does, too. You usually can't easily make an
already > allocated memory block bigger (what would you do with data after it?), so a > new block must be allocated and the data be copied over to it, then the old > one destroyed. > All the texts I have read state that, when a dynamic array needs to be extended, it is better/faster to 'realloc' instead of creating a new array and copying the elements manually. So why is 'realloc' more efficient? |
Re: vector::push_back performance
"Method Man" <a@b.c> wrote...
>> That's what a "realloc" does, too. You usually can't easily make an > already >> allocated memory block bigger (what would you do with data after it?), so > a >> new block must be allocated and the data be copied over to it, then the > old >> one destroyed. >> > > All the texts I have read state that, when a dynamic array needs to be > extended, it is better/faster to 'realloc' instead of creating a new array > and copying the elements manually. You've been reading too many C texts, haven't you? > So why is 'realloc' more efficient? I am not sure how to answer this question. Imagine you have a tree which has grown too far up and needs trimming. You decide to trim it a foot off the ground because it's too inefficient to climb up and trim every branch that could use a trimming. You lose the tree. Why is cutting it down more efficient than doing it right? There is no answer. Another analogy: a TV set and a need to deliver it from the 7th floor to the truck parked outside. It can be done on an elevator, it could be done on the stairs. Or, somebody might decide that it's more efficient to lower it down through the window. Without ropes. Hey, a couple of seconds and it's down on the ground, no? Why is throwing it down more efficient than using the elevator (or stairs)? For POD you can do realloc. For non-POD classes (general case) realloc will simply not work. Efficient or not. V |
Re: vector::push_back performance
"Method Man" <a@b.c> wrote in message news:oJq6d.4788
> All the texts I have read state that, when a dynamic array needs to be > extended, it is better/faster to 'realloc' instead of creating a new array > and copying the elements manually. > > So why is 'realloc' more efficient? Because if more space exists at the end of the existing array to yield a larger array, then realloc will just claim that space. Say you do malloc(512u) and the program reserves bytes 0x101 to 0x300 for your array. Say bytes 0x301 to 0x0500 are free for anyone to use. If no other objects request this space and you realloc the array to 1024u bytes, then the system may just mark bytes 0x0101 to 0x0500 as in use by your array (so no-one else can claim it). There's no garuantee that space is available, but if it is, we save copying lots of bytes. |
Re: vector::push_back performance
"Antonios Christofides" <anthony@itia.ntua.gr> wrote in message
> What I don't understand: why is the reallocation code so complex? I > studied the library source and I have a hard time understanding it, > but it seems to be copying the vector item by item in each > reallocation. Why wouldn't a "realloc" suffice? For user types, especially those managing dynamic memory like std::string or std::deque, we have to call the overloaded copy constructor or operator= to the copy. > And, given that I don't know the vector size beforehand, is there > anything else I can do other than trying deqeue or a guessed > vector::reserve? What about std::list? What is wrong with std::deque? If you have a large object, it might be a good idea to make it reference counted. You can use boost::shared_ptr or the like. |
Re: vector::push_back performance
"Victor Bazarov" <v.Abazarov@comAcast.net> wrote in message news:tEr6d.172009$3l3.160912@attbi_s03... > "Method Man" <a@b.c> wrote... > >> That's what a "realloc" does, too. You usually can't easily make an > > already > >> allocated memory block bigger (what would you do with data after it?), so > > a > >> new block must be allocated and the data be copied over to it, then the > > old > >> one destroyed. > >> > > > > All the texts I have read state that, when a dynamic array needs to be > > extended, it is better/faster to 'realloc' instead of creating a new array > > and copying the elements manually. > > You've been reading too many C texts, haven't you? > Well yea. I was talking about arrays of POD types, but I didn't make that clear in my post. Of course non-POD types can have constructors and overloaded assignment operators, so my question wouldn't make sense in that case. > > So why is 'realloc' more efficient? > > I am not sure how to answer this question. Imagine you have a tree which > has grown too far up and needs trimming. You decide to trim it a foot off > the ground because it's too inefficient to climb up and trim every branch > that could use a trimming. You lose the tree. Why is cutting it down more > efficient than doing it right? There is no answer. Another analogy: a TV > set and a need to deliver it from the 7th floor to the truck parked outside. > It can be done on an elevator, it could be done on the stairs. Or, somebody > might decide that it's more efficient to lower it down through the window. > Without ropes. Hey, a couple of seconds and it's down on the ground, no? > Why is throwing it down more efficient than using the elevator (or stairs)? > > For POD you can do realloc. For non-POD classes (general case) realloc will > simply not work. Efficient or not. > Your analogies didn't really help in my understanding of realloc. I was looking for something like -- 'realloc' is never/sometimes/always more efficient than malloc'ing a new array and manually copying from the old array (of PODs). Then justify the choice. |
| All times are GMT. The time now is 12:38 PM. |
Powered by vBulletin®. Copyright ©2000 - 2013, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.