Velocity Reviews - Computer Hardware Reviews

reserve of vector

 
 
ab2305@gmail.com
      10-14-2008
Does the standard mandate that a reserve() call initialize a
container's values to their defaults? That is, after vector<int> intVec;
intVec.reserve(100); would intVec[0] == intVec[1] == ... == intVec[99] == 0?

Thanks
 
Kai-Uwe Bux
      10-14-2008
(E-Mail Removed) wrote:

> Does the standard mandate that a reserve() call initialize a
> container's values to their defaults? That is, after vector<int> intVec;
> intVec.reserve(100); would intVec[0] == intVec[1] == ... == intVec[99] == 0?


No. In fact, size() remains unchanged so that referencing those elements is
undefined behavior.

The resize() method, by contrast, will grow the vector and
value-initialize the missing elements.


Best

Kai-Uwe Bux
 
Stephen Horne
      10-14-2008
On Mon, 13 Oct 2008 23:08:12 -0400, Kai-Uwe Bux <(E-Mail Removed)>
wrote:

>(E-Mail Removed) wrote:
>
>> Does the standard mandate that a reserve() call initialize a
>> container's values to their defaults? That is, after vector<int> intVec;
>> intVec.reserve(100); would intVec[0] == intVec[1] == ... == intVec[99] == 0?

>
>No. In fact, size() remains unchanged so that referencing those elements is
>undefined behavior.


Just to expand on that, reserve is for managing memory. If you know
that you're about to add thousands of items to a vector, using a
reserve to pre-grow the memory allocation can make it more efficient,
since the memory allocation only gets grown once - not potentially
dozens or hundreds of times.

IIRC sequentially adding n items to the end of a vector takes O(n^2)
time. This is because the memory gets grown O(n) times - every k
items, say, where k is a constant - and each growing can involve
copying the whole vector which is on average O(n) again.

If you can reserve the required space first, sequentially adding n
items is O(n). There is only one memory reallocation, and only one
copying - at most - so it's the actual appending of values that
determines the time taken, not repeated memory reallocation and
copying.

There's also a scheme where the reserved memory is doubled on each
reallocation, and this also gives O(n), though it's still slower than
full preallocation.

 
acehreli@gmail.com
      10-14-2008
On Oct 13, 9:44 pm, Stephen Horne <(E-Mail Removed)> wrote:

> Just to expand on that, reserve is for managing memory. If you know
> that you're about to add thousands of items to a vector, using a
> reserve to pre-grow the memory allocation can make it more efficient,


I've read that reserve() doesn't help much. (I remember reading from
Bjarne Stroustrup himself that he stopped calling reserve() at some
point; but cannot find any references for that.)

> IIRC sequentially adding n items to the end of a vector takes O(n^2)
> time. This is because the memory gets grown O(n) times - every k
> items, say, where k is a constant - and each growing can involve
> copying the whole vector which is on average O(n) again.


The standard requires that push_back has amortized constant time; so
the implementers cannot use the complexity that you describe.

> There's also a scheme where the reserved memory is doubled on each
> reallocation, and this also gives O(n), though it's still slower than
> full preallocation.


Nowadays the implementers use a growth factor of 50%, which reportedly
works better with pool allocators.

Ali
 
acehreli@gmail.com
      10-14-2008
On Oct 13, 9:51 pm, (E-Mail Removed) wrote:

> I've read that reserve() doesn't help much. (I remember reading from
> Bjarne Stroustrup himself that he stopped calling reserve() at some
> point; but cannot find any references for that.)


I found it:

http://www.research.att.com/~bs/bs_f...low-containers

There, Stroustrup says:

<quote>
People sometimes worry about the cost of std::vector growing
incrementally. I used to worry about that and used reserve() to
optimize the growth. After measuring my code and repeatedly having
trouble finding the performance benefits of reserve() in real
programs, I stopped using it except where it is needed to avoid
iterator invalidation (a rare case in my code). Again: measure before
you optimize.
</quote>

Ali
 
Kai-Uwe Bux
      10-14-2008
Stephen Horne wrote:

> On Mon, 13 Oct 2008 23:08:12 -0400, Kai-Uwe Bux <(E-Mail Removed)>
> wrote:
>
>>(E-Mail Removed) wrote:
>>
>>> Does the standard mandate that a reserve() call initialize a
>>> container's values to their defaults? That is, after vector<int> intVec;
>>> intVec.reserve(100); would intVec[0] == intVec[1] == ... == intVec[99] == 0?

>>
>>No. In fact, size() remains unchanged so that referencing those elements
>>is undefined behavior.

>
> Just to expand on that, reserve is for managing memory. If you know
> that you're about to add thousands of items to a vector, using a
> reserve to pre-grow the memory allocation can make it more efficient,
> since the memory allocation only gets grown once - not potentially
> dozens or hundreds of times.
>
> IIRC sequentially adding n items to the end of a vector takes O(n^2)
> time. This is because the memory gets grown O(n) times - every k
> items, say, where k is a constant - and each growing can involve
> copying the whole vector which is on average O(n) again.


No. Appending to a vector is amortized constant time. The vector only
moves when size() and capacity() coincide. At those points, the capacity
jumps by a certain factor and not by an additive constant (your k).

BTW: this is not a quality of implementation issue. The standard makes this
requirement in clause [23.1.1/12].


> If you can reserve the required space first, sequentially adding n
> items is O(n). There is only one memory reallocation, and only one
> copying - at most - so it's the actual appending of values that
> determines the time taken, not repeated memory reallocation and
> copying.
>
> There's also a scheme where the reserved memory is doubled on each
> reallocation, and this also gives O(n), though it's still slower than
> full preallocation.


What matters is not the precise factor but that the constant is
multiplicative rather than additive. I think many implementations out
there use the golden ratio or some other constant smaller than 2.

Nonetheless, preallocation will beat naive sequential push_back()
performance wise.


Best

Kai-Uwe Bux
 
Stephen Horne
      10-14-2008
On Mon, 13 Oct 2008 21:55:04 -0700 (PDT), (E-Mail Removed) wrote:

>On Oct 13, 9:51 pm, (E-Mail Removed) wrote:
>
>> I've read that reserve() doesn't help much. (I remember reading from
>> Bjarne Stroustrup himself that he stopped calling reserve() at some
>> point; but cannot find any references for that.)

>
>I found it:


I agree - I didn't say anyone should use it.

Personally, I don't have a single use of it in my current working copy
- I just checked. Plenty of resizes, but no reserves. If I need a big
enough collection that it's an issue, I normally end up using a
different kind of container anyway, though not specifically to avoid
this.

I have my own vector ADT based on a multiway tree, though only because
it was convenient given the map, set etc ADTs based on the same
underlying code - they all support log n subscripting. Writing the one
extra wrapper seemed prudent at the time, but I don't think I've ever
used it. If I wanted a huge vector (perhaps a very large audio file) I
might use this since it supports efficient inserts anywhere in the
container - though I'd have to add some bulk ops first.

For video, I wouldn't use it - I'd use a normal vector of pointers,
either to frames or GOPs. An hours video is only in the order of
100,000 frames anyway. And I might just use an vector of pointers to
chunks even for the huge audio file case.

The word "thousands" in my earlier post was out by several orders of
magnitude, I guess.

I needed a 64 million item 3/4 GB array recently. But it was
fixed-size, so I used a fixed size array, not a vector. If I had used
a vector, I would probably have used reserve.

Strictly I did use a vector with millions of items at one stage -
underlying a priority queue that held up to approx. two million items.
Copy overheads *were* an issue here, but not so much from memory
allocation as from the items flowing through the queue. I solved it by
using a more specialised queuing method, putting items into the
location in the main array where they'd be needed (with some collision
handling).

But even that was for a programming challenge thing that I was just
doing for fun.

http://projecteuler.net/

Problem 211 - it turns out I solved it the hard way, but what the
hell.

Wish I could justify more time for these problems. I've had a plan for
problem 210 (but no code yet) for about a week. This time doing it the
easy way.

 
Stephen Horne
      10-14-2008
On Tue, 14 Oct 2008 01:23:56 -0400, Kai-Uwe Bux <(E-Mail Removed)>
wrote:

>No. Appending to a vector is amortized constant time. The vector only
>moves when size() and capacity() coincide. At those points, the capacity
>jumps by a certain factor and not by an additive constant (your k).


That's good to know. Thanks.

I could swear that there used to be a fixed increment - even that the
interface allowed the increment to be changed - but there you go.

I didn't think they'd impose that because of the 50% worst-case memory
overhead (often 75% when you allow for deletes), but it's no bad thing.
A memory overhead on that scale isn't usually a big deal these days.

>What matters is not the precise factor but that the constant is
>multiplicative rather than additive. I think many implementations out
>there use the golden ratio or some other constant smaller than 2.


Reducing the worst-case memory overhead. Makes sense.

 
James Kanze
      10-14-2008
On Oct 14, 6:44 am, Stephen Horne <(E-Mail Removed)> wrote:
> On Mon, 13 Oct 2008 23:08:12 -0400, Kai-Uwe Bux <(E-Mail Removed)>
> wrote:


> >(E-Mail Removed) wrote:


> >> Does the standard mandate that a reserve() call
> >> initialize a container's values to their defaults? That is,
> >> after vector<int> intVec; intVec.reserve(100); would
> >> intVec[0] == intVec[1] == ... == intVec[99] == 0?


> >No. In fact, size() remains unchanged so that referencing
> >those elements is undefined behavior.


> Just to expand on that, reserve is for managing memory. If you
> know that you're about to add thousands of items to a vector,
> using a reserve to pre-grow the memory allocation can make it
> more efficient, since the memory allocation only gets grown
> once - not potentially dozens or hundreds of times.


It's rarely necessary for optimization reasons (although it
doesn't hurt). The main reason for using reserve() is to avoid
invalidating iterators, pointers and references into the vector.
(Appending to a vector will not invalidate iterators unless the
capacity() increases.) Another reason, particularly with very
large vectors, is to reduce total memory consumption.

> IIRC sequentially adding n items to the end of a vector takes
> O(n^2) time. This is because the memory gets grown O(n) times
> - every k items, say, where k is a constant - and each growing
> can involve copying the whole vector which is on average O(n)
> again.


The standard requires that appending to a vector be "amortized
constant time". In practice, this means that memory growth must
be exponential---older implementations generally doubled the
size each time, but I think the general consensus today is that
1.5 times is preferable.

> If you can reserve the required space first, sequentially adding n
> items is O(n).


Creating a vector of n objects using push_back is required by
the standard to be O(n). All reserve() does is reduce the
constant factor.

> There is only one memory reallocation, and only one copying -
> at most - so it's the actual appending of values that
> determines the time taken, not repeated memory reallocation
> and copying.


> There's also a scheme where the reserved memory is doubled on
> each reallocation, and this also gives O(n), though it's still
> slower than full preallocation.


The real problem with it is memory consumption. You can easily
end up consuming three times more memory than you are actually
using.

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
 
James Kanze
      10-14-2008
On Oct 14, 7:23 am, Kai-Uwe Bux <(E-Mail Removed)> wrote:
> Stephen Horne wrote:


[...]
> What matters is not the precise factor but that the constant
> is multiplicative rather than additive. I think, many
> implementations out there use the golden ratio or some other
> constant smaller than 2.


I'd be surprised if they used exactly the golden ratio, but 1.5
is an easy-to-calculate approximation.

The motivation for this is simple: if you double each time, the
sum of the memory you've previously freed is always less than
the amount of memory you're requesting. A little bit of
mathematical analysis shows that this is the case any time the
multiplier is greater than the golden ratio.
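That analysis can be sketched as follows (writing r for the growth factor and c for the initial block size; this is the standard amortized-growth argument, spelled out rather than taken from the thread). Just before the k-th reallocation, the previously freed blocks have sizes c, cr, ..., cr^{k-2} (the block of size cr^{k-1} is still live while its contents are copied), and the new request is cr^k:

```latex
% The new block fits into previously freed memory only if
\sum_{i=0}^{k-2} c\,r^i \;=\; c\,\frac{r^{k-1}-1}{r-1} \;\ge\; c\,r^k .
% Dividing both sides by c\,r^{k-1} and letting k \to \infty:
\frac{1}{r-1} \;\ge\; r
\quad\Longleftrightarrow\quad
r^2 - r - 1 \le 0
\quad\Longleftrightarrow\quad
r \;\le\; \frac{1+\sqrt{5}}{2} \;\approx\; 1.618 .
```

So with any factor above the golden ratio (doubling included), the freed memory can never be reused for the next block, which is the motivation for factors like 1.5.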

> Nonetheless, preallocation will beat naive sequential
> push_back() performance wise.


Slightly.

I have a couple of cases (maybe two in the last ten years) where
I've used reserve(). Always to avoid invalidating iterators,
however. (Most of the time, however, if the size of the vector
is changing, I'll save the index, not an iterator.)

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
 
