Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   C++ (http://www.velocityreviews.com/forums/f39-c.html)
-   -   Optimal buffer growth (http://www.velocityreviews.com/forums/t621045-optimal-buffer-growth.html)

 Angel Tsankov 06-18-2008 09:45 PM

Optimal buffer growth

Hi,

I remember reading in a book (or in an article) that the optmial buffer
growth factor is about 1.6. Now I need to find this book but I can't
remember its title. Can someone help me with this?

 acehreli@gmail.com 06-18-2008 09:54 PM

Re: Optimal buffer growth

On Jun 18, 2:45 pm, "Angel Tsankov" <fn42...@fmi.uni-sofia.bg> wrote:
> Hi,
>
> I remember reading in a book (or in an article) that the optmial buffer
> growth factor is about 1.6. Now I need to find this book but I can't
> remember its title. Can someone help me with this?

I think you are thinking Andrew Koenig.

Search for the thread titled "vector growth factor of 1.5" at
groups.google.com for a discussion of it.

Ali

 Kai-Uwe Bux 06-18-2008 11:16 PM

Re: Optimal buffer growth

Angel Tsankov wrote:

> I remember reading in a book (or in an article) that the optmial buffer
> growth factor is about 1.6. Now I need to find this book but I can't
> remember its title. Can someone help me with this?

What do you mean by "optimal"? It's a space-time tradeoff. If you increase
the multiplication factor, you decrease the expected number of move or copy
operations but you increase the expected memory overhead.

Assuming that the capacity multiplies by a factor of d > 1 at each
reallocation, I get (using Benford's law):

a) The expected ratio size() / capacity() is:

d - 1
--------- (for d = 2, approx = 0.72)
d ln(d)

b) Filling the vector by a series of push_back() operations will involve on
average

1
------- (for d = 2, approx = 1.44)
ln(d)

moves per object arising from reallocating the vector (i.e., not including
initial copy construction of the object into the vector).

For d = 1.6, that would be an expected memory usage of 80% and each object
would be reallocated 2.13 times on average.

Best

Kai-Uwe Bux

 Angel Tsankov 06-19-2008 06:25 AM

Re: Optimal buffer growth

>> Hi,
>>
>> I remember reading in a book (or in an article) that the optmial buffer
>> growth factor is about 1.6. Now I need to find this book but I can't
>> remember its title. Can someone help me with this?

>
> I think you are thinking Andrew Koenig.
>
> Search for the thread titled "vector growth factor of 1.5" at
> groups.google.com for a discussion of it.

Thanks a lot for the reference; I'll take a look. However, I still need to
find out that book...

 James Kanze 06-19-2008 08:53 AM

Re: Optimal buffer growth

On Jun 19, 1:16 am, Kai-Uwe Bux <jkherci...@gmx.net> wrote:
> Angel Tsankov wrote:
> > I remember reading in a book (or in an article) that the
> > optmial buffer growth factor is about 1.6. Now I need to
> > find this book but I can't remember its title. Can someone
> > help me with this?

> What do you mean by "optimal"? It's a space-time tradeoff. If
> you increase the multiplication factor, you decrease the
> expected number of move or copy operations but you increase

The issue was that if the factor is larger than (1+sqrt(5))/2
(roughly 1.618), the memory freed after a reallocation could
never be reused by the vector; if you double the size at each
allocation, the total memory freed until that point will always
be less than the size requested for the new allocation. If the
factor is smaller, you can hope that sooner or later, the
underlying allocator will be able to merge previously filled
blocks, and fulfill the request from them.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

 Kai-Uwe Bux 06-19-2008 01:14 PM

Re: Optimal buffer growth

James Kanze wrote:

> On Jun 19, 1:16 am, Kai-Uwe Bux <jkherci...@gmx.net> wrote:
>> Angel Tsankov wrote:
>> > I remember reading in a book (or in an article) that the
>> > optmial buffer growth factor is about 1.6. Now I need to
>> > find this book but I can't remember its title. Can someone
>> > help me with this?

>
>> What do you mean by "optimal"? It's a space-time tradeoff. If
>> you increase the multiplication factor, you decrease the
>> expected number of move or copy operations but you increase

>
> The issue was that if the factor is larger than (1+sqrt(5))/2
> (roughly 1.618), the memory freed after a reallocation could
> never be reused by the vector; if you double the size at each
> allocation, the total memory freed until that point will always
> be less than the size requested for the new allocation. If the
> factor is smaller, you can hope that sooner or later, the
> underlying allocator will be able to merge previously filled
> blocks, and fulfill the request from them.

Cool, the golden ratio strikes again.

I have to wonder, though, whether this has a measurable impact (a) on modern
architectures where memory is organized in pages or (b) in typical programs
where one probably has more than one dynamic data structure growing at a
time anyway.

Best

Kai-Uwe Bux

 James Kanze 06-19-2008 03:35 PM

Re: Optimal buffer growth

On Jun 19, 3:14 pm, Kai-Uwe Bux <jkherci...@gmx.net> wrote:
> James Kanze wrote:
> > On Jun 19, 1:16 am, Kai-Uwe Bux <jkherci...@gmx.net> wrote:
> >> Angel Tsankov wrote:
> >> > I remember reading in a book (or in an article) that the
> >> > optmial buffer growth factor is about 1.6. Now I need to
> >> > find this book but I can't remember its title. Can someone
> >> > help me with this?

> >> What do you mean by "optimal"? It's a space-time tradeoff. If
> >> you increase the multiplication factor, you decrease the
> >> expected number of move or copy operations but you increase
> >> the expected memory overhead.

> > The issue was that if the factor is larger than (1+sqrt(5))/2
> > (roughly 1.618), the memory freed after a reallocation could
> > never be reused by the vector; if you double the size at each
> > allocation, the total memory freed until that point will always
> > be less than the size requested for the new allocation. If the
> > factor is smaller, you can hope that sooner or later, the
> > underlying allocator will be able to merge previously filled
> > blocks, and fulfill the request from them.

> Cool, the golden ratio strikes again.

> I have to wonder, though, whether this has a measurable impact
> (a) on modern architectures where memory is organized in pages
> or (b) in typical programs where one probably has more than
> one dynamic data structure growing at a time anyway.

I don't really know. Nominally, with any of the "classical"
allocation algorithms, if you have one vector which just grows
and grows, it eventually migrates to the end of the free space
arena (because it becomes bigger than any of the holes), and
this factor could possibly affect just how large you could make
it grow. Except, of course, that you'll likely bring the
machine to its knees through paging before that. And that there
are enough additional factors involved that it's not certain
that the rule really changes anything.

Just out of curiousity, during my lunch hour, I wrote a simple
allocator and tested the principles. Using some simple
multipliers, I get the following output:
For 1.10: max size = 389582583 (39.0%)
For 1.20: max size = 389586745 (39.0%)
For 1.30: max size = 513088587 (51.3%)
For 1.40: max size = 477760691 (47.8%)
For 1.50: max size = 419279977 (41.9%)
For 1.60: max size = 432051256 (43.2%)
For 1.70: max size = 360273482 (36.0%)
For 1.80: max size = 307547665 (30.8%)
For 1.90: max size = 328691801 (32.9%)
For 2.00: max size = 268435456 (26.8%)
Change just about any of the parmeters, however, and you get
something different: using an initial size of 500 (rather than
128) results in:
For 1.10: max size = 374691238 (37.5%)
For 1.20: max size = 519586870 (52.0%)
For 1.30: max size = 420105341 (42.0%)
For 1.40: max size = 488743519 (48.9%)
For 1.50: max size = 323374783 (32.3%)
For 1.60: max size = 259493561 (25.9%)
For 1.70: max size = 288397093 (28.8%)
For 1.80: max size = 371637543 (37.2%)
For 1.90: max size = 357024500 (35.7%)
For 2.00: max size = 262144000 (26.2%)
Change the size of the arena from 1000000000 to 500000000, and
you get:
For 1.10: max size = 219909214 (44.0%)
For 1.20: max size = 225455293 (45.1%)
For 1.30: max size = 233540550 (46.7%)
For 1.40: max size = 174111040 (34.8%)
For 1.50: max size = 279519985 (55.9%)
For 1.60: max size = 168770022 (33.8%)
For 1.70: max size = 124662105 (24.9%)
For 1.80: max size = 170859814 (34.2%)
For 1.90: max size = 172995685 (34.6%)
For 2.00: max size = 134217728 (26.8%)
For the moment, I'm not sure what one can really conclude:-).

Anyhow, for those interested, here's the code. It uses my
library, but only a few simple things from it, which can easily
be replaced. Also, it's entirely possible that I've got an
error somewhere in it (it was written very quickly), which could
explain the randomness of the results as well.

-------------- fill.cc ----------------
#include <cstdlib>
#include <iostream>
#include <iomanip>
#include <new>
#include "gb/FFmt.hh"
#include "gb/CommandLine.hh"
#include "gb/NumericOption.hh"

Gabi::BoundNumericOption
arenaSize( 'a', 1000000000 ) ;
Gabi::BoundNumericOption
startSize( 's', 128 ) ;
Gabi::BoundNumericOption
intervalCount( 'i', 10 ) ;

class Pool
{
public:
explicit Pool( size_t size = arenaSize ) ;
~Pool() ;
void* allocate( size_t n ) ;
void free( void* p ) ;

private:
{
bool isFree ;
} ;
void* base ;

inline void* add( void* p, size_t n )
{
return static_cast< char* >( p ) + n ;
}
inline size_t diff( void* p1, void* p2 )
{
return static_cast< char* >( p2 ) - static_cast< char*
>( p1 ) ;

}
} ;

void
testPool(
double ratio )
{
Pool p ;
size_t s = startSize ;
void* v = p.allocate( s ) ;
while ( v != NULL ) {
size_t s2 = (size_t)( ratio * s ) ;
void* v2 = p.allocate( s2 ) ;
p.free( v ) ;
v = v2 ;
if ( v != NULL ) {
s = s2 ;
}
}
std::cout << "For " << Gabi::FFmt( 4, 2 ) << ratio
<< ": max size = " << std::setw( 9 ) << s
<< " (" << Gabi::FFmt( 4, 1 ) << 100.0 * s / arenaSize
<< "%)"
<< std::endl ;
}

int
main( int argc, char** argv )
{
Gabi::CommandLine::instance().parse( argc, argv ) ;
for ( int i = 1 ; i <= intervalCount ; ++ i ) {
testPool( 1.0 + i / static_cast< double
>( intervalCount.value() ) ) ;

}
return 0 ;
}

Pool::Pool(
size_t size )
{
base = std::malloc( size ) ;
if ( base == NULL ) {
}
first = static_cast< BlockHeader* >( base ) ;
first->next = last ;
first->isFree = true ;
last->next = NULL ;
last->isFree = false ;
}

Pool::~Pool()
{
std::free( base ) ;
}

void*
Pool::allocate(
size_t n )
{
n += sizeof( BlockHeader ) ;
n = (n + 7) & (static_cast< size_t >( -1 ) << 3) ;
while ( result == NULL && b != NULL ) {
if ( b->isFree ) {
while ( b->next->isFree ) {
b->next = b->next->next ;
}
if ( diff( b, b->next ) > n ) {
result = b ;
}
}
b = b->next ;
}
if ( result != NULL ) {
if ( diff( result, result->next ) > n +
n ) ) ;
newNext->next = result->next ;
newNext->isFree = true ;
result->next = newNext ;
}
result->isFree = false ;
++ result ;
}
return result ;
}

void
Pool::free(
void* p )
{
if ( p != NULL ) {
b->isFree = true ;
}
}
---------------- ------- ----------------

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

 All times are GMT. The time now is 04:18 PM.