Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > How much overhead for a new templated class?

Reply
Thread Tools

How much overhead for a new templated class?

 
 
jacob navia
Guest
Posts: n/a
 
      05-21-2012
I am writing a containers library in C. The generic list container
adds for a new type:

code: 2495 bytes
data: 488 bytes

for a list container containing around 60 entry points.

This means that you will pay 2983 bytes for each instatiation
of the list template (one of the biggest around)

I would be interested to know the corresponding vaue of the
C++ stl. How many bytes of code it will cost you to instantiate
a list template for a new type?

Thanks in advance.



 
Reply With Quote
 
 
 
 
Ian Collins
Guest
Posts: n/a
 
      05-21-2012
On 05/22/12 08:37 AM, jacob navia wrote:
> I am writing a containers library in C. The generic list container
> adds for a new type:
>
> code: 2495 bytes
> data: 488 bytes
>
> for a list container containing around 60 entry points.
>
> This means that you will pay 2983 bytes for each instatiation
> of the list template (one of the biggest around)
>
> I would be interested to know the corresponding vaue of the
> C++ stl. How many bytes of code it will cost you to instantiate
> a list template for a new type?\


I'm sure you can check that for yourself, but with g++, about 1K to
instantiate a list<int> and push_back a value.

--
Ian Collins
 
Reply With Quote
 
 
 
 
Dombo
Guest
Posts: n/a
 
      05-21-2012
Op 21-May-12 22:37, jacob navia schreef:
> I am writing a containers library in C. The generic list container
> adds for a new type:
>
> code: 2495 bytes
> data: 488 bytes
>
> for a list container containing around 60 entry points.
>
> This means that you will pay 2983 bytes for each instatiation
> of the list template (one of the biggest around)
>
> I would be interested to know the corresponding vaue of the
> C++ stl. How many bytes of code it will cost you to instantiate
> a list template for a new type?


That depends on the compiler, compiler settings, the type for which the
list template is instantiated and the other types are instantiated.

For example:

#include <list>
#include <string>

class A
{
public:
A(): foo(0) {}
long foo;
};

class B
{
public:
B() {}
std::string bar;
};

int main()
{
std::list<void*> lpv;
std::list<A*> lpa;
std::list<B*> lpb;
std::list<A> la;
std::list<B> lb;

lpv.push_back(new A);
lpa.push_back(new A);
lpb.push_back(new B);
la.push_back(A());
lb.push_back(B());

return 0;
}


If you compile this code Visual C++ 2010 with optimizations enabled all
push_back() method calls result in the same code being called regardless
of the list type (in other words the code is not replicated for each
template instantiation). The code for the constructors is also shared
between std::list<void*>, std::list<A*>, std::list<B*> and std::list<A>,
but not for std::list<B>; std::list<B> has its own constructor code.

My observation is that template instantiation for pointer types often
share code with template instantiations for other pointer types. I.e.
instantiating a list template with another pointer type in itself does
not add code. With non-pointer types only some parts, if any, are shared
between template instantiations. Of course YMMV because of all the
variables involved that can affect the outcome.
 
Reply With Quote
 
Ian Collins
Guest
Posts: n/a
 
      05-21-2012
On 05/22/12 08:37 AM, jacob navia wrote:
> I am writing a containers library in C. The generic list container
> adds for a new type:
>
> code: 2495 bytes
> data: 488 bytes
>
> for a list container containing around 60 entry points.
>
> This means that you will pay 2983 bytes for each instatiation
> of the list template (one of the biggest around)
>
> I would be interested to know the corresponding vaue of the
> C++ stl. How many bytes of code it will cost you to instantiate
> a list template for a new type?


Rather than fussing over a few bytes, how about a performance comparison?

Here's a simple (related to some work I have been doing recently) benchmark:

Generate a list of 10,000,000 random longs (call it random)
use that list to instantiate another list (call it list)
sort list
sum and empty list by removing the first entry until empty.

What would be your containers library solution and how does it compare
to this C++ solution?

#include <list>
#include <iostream>
#include <stdint.h>
#include <stdlib.h>

// For system hi-res timer, add your own here.
//
int64_t hiresTime();

const unsigned entries = 10000000;

typedef std::list<long> List;

void fill( List& list )
{
srand48(42);
for( unsigned n = 0; n < entries; ++n )
list.push_back(lrand48());
}

uint64_t get( List& list )
{
uint64_t result(0);
while( !list.empty() )
{
result += list.front();
list.pop_front();
}
return result;
}

double delta( int64_t start, int64_t now )
{
return (now-start) / 1000000000.0;
}

int main()
{
using std::cout;
using std::endl;

List random;

int64_t start = hiresTime();
fill( random );
int64_t now = hiresTime();

cout << "To generate " << delta(start, now) << 'S' << endl;

start = now;
List list(random);
now = hiresTime();

cout << "To create " << delta(start, now) << 'S' << endl;

start = now;
list.sort();
now = hiresTime();

cout << "To sort " << delta(start, now) << 'S' << endl;

start = now;
uint64_t n = get( list );
now = hiresTime();

cout << "To empty " << delta(start, now) << 'S' << endl;

cout << n << endl;
}

--
Ian Collins
 
Reply With Quote
 
Ian Collins
Guest
Posts: n/a
 
      05-21-2012
On 05/22/12 11:28 AM, Ian Collins wrote:
> On 05/22/12 08:37 AM, jacob navia wrote:
>> I am writing a containers library in C. The generic list container
>> adds for a new type:
>>
>> code: 2495 bytes
>> data: 488 bytes
>>
>> for a list container containing around 60 entry points.
>>
>> This means that you will pay 2983 bytes for each instatiation
>> of the list template (one of the biggest around)
>>
>> I would be interested to know the corresponding vaue of the
>> C++ stl. How many bytes of code it will cost you to instantiate
>> a list template for a new type?

>
> Rather than fussing over a few bytes, how about a performance comparison?
>
> Here's a simple (related to some work I have been doing recently) benchmark:
>
> Generate a list of 10,000,000 random longs (call it random)
> use that list to instantiate another list (call it list)
> sort list
> sum and empty list by removing the first entry until empty.


I forget to add include error checking for allocation failure.

--
Ian Collins
 
Reply With Quote
 
Juha Nieminen
Guest
Posts: n/a
 
      05-22-2012
jacob navia <> wrote:
> I would be interested to know the corresponding vaue of the
> C++ stl. How many bytes of code it will cost you to instantiate
> a list template for a new type?


It depends on which member functions of the templated class you are
calling. The compiler will instantiate only those functions that get
called and skip the rest.

(This is, in fact, not just an optimization, but a *necessity*. The compiler
*must* instantiate only what is being called because some templated classes
actually depend on this behavior.)
 
Reply With Quote
 
Juha Nieminen
Guest
Posts: n/a
 
      05-22-2012
Ian Collins <ian-> wrote:
> Rather than fussing over a few bytes, how about a performance comparison?


A performance comparison would be quite moot because the biggest bottleneck
in the whole thing is memory allocation. By far. A performance comparison
would only make sense if one version uses more memory allocations than the
other (for example, I have seen "generic" linked lists in C that actually
make two allocations per element rather than the one that std::list makes,
hence basically making the C version twice as slow as std::list).

It can be quite surprising how much time the program spends in memory
allocation alone. For instance, if you instantiate a std::set and add
some tens of millions of elements to it, it will take many seconds on
a typical computer. You'd think that the majority of the time is spent
re-balancing the tree after each insertion. You'd be wrong. Something
like 80-90% of the time is spent allocating memory!

Using a very fast allocator with std::set or std::list can speed it up
quite considerably, if many insertions and deletions are performed.
(The great thing about the C++ standard data containers is that they
support user-created memory allocators, something that's way more
difficult to do in a "generic" C container; as everything else.)
 
Reply With Quote
 
jacob navia
Guest
Posts: n/a
 
      05-22-2012
Le 22/05/12 08:05, Juha Nieminen a écrit :
> Ian Collins<ian-> wrote:
>> Rather than fussing over a few bytes, how about a performance comparison?

>
> A performance comparison would be quite moot because the biggest bottleneck
> in the whole thing is memory allocation. By far.


Yes. With a fast allocator, my software runs at the same
speed than C++. This due to bottleneck is in:

Memory I/O
Allocation

> A performance comparison
> would only make sense if one version uses more memory allocations than the
> other (for example, I have seen "generic" linked lists in C that actually
> make two allocations per element rather than the one that std::list makes,
> hence basically making the C version twice as slow as std::list).
>


My software makes one allocation per element.

> It can be quite surprising how much time the program spends in memory
> allocation alone. For instance, if you instantiate a std::set and add
> some tens of millions of elements to it, it will take many seconds on
> a typical computer. You'd think that the majority of the time is spent
> re-balancing the tree after each insertion. You'd be wrong. Something
> like 80-90% of the time is spent allocating memory!
>


That number is correct, that's why I decided to use (for single linked
lists)

struct element {
struct element *Next;
char data[1];
};

> Using a very fast allocator with std::set or std::list can speed it up
> quite considerably, if many insertions and deletions are performed.
> (The great thing about the C++ standard data containers is that they
> support user-created memory allocators, something that's way more
> difficult to do in a "generic" C container; as everything else.)


My software supports user supplied allocators. The default allocator
object is:

typedef struct tagAllocator {
void *(*malloc)(size_t); /* Function to allocate memory */
void (*free)(void *); /* Function to release it */
void *(*realloc)(void *,size_t);/* Function to resize a block of
memory */
void *(*calloc)(size_t,size_t);
} ContainerAllocator;

You can replace it with your own allocator. For instance you can
supply a GC (Boehm's for instance) allocator, or a custom one,
as you wish.
 
Reply With Quote
 
Ian Collins
Guest
Posts: n/a
 
      05-22-2012
On 05/22/12 06:05 PM, Juha Nieminen wrote:
> Ian Collins<ian-> wrote:
>> Rather than fussing over a few bytes, how about a performance comparison?

>
> A performance comparison would be quite moot because the biggest bottleneck
> in the whole thing is memory allocation. By far. A performance comparison
> would only make sense if one version uses more memory allocations than the
> other (for example, I have seen "generic" linked lists in C that actually
> make two allocations per element rather than the one that std::list makes,
> hence basically making the C version twice as slow as std::list).


Hence my simple use of std::list, I just wanted some idea how the two
compare (particularly with regard to the sort) before any optimisations
are added.

--
Ian Collins
 
Reply With Quote
 
Ian Collins
Guest
Posts: n/a
 
      05-22-2012
On 05/22/12 06:45 PM, jacob navia wrote:
> Le 22/05/12 08:05, Juha Nieminen a écrit :
>> Ian Collins<ian-> wrote:
>>> Rather than fussing over a few bytes, how about a performance comparison?

>>
>> A performance comparison would be quite moot because the biggest bottleneck
>> in the whole thing is memory allocation. By far.

>
> Yes. With a fast allocator, my software runs at the same
> speed than C++. This due to bottleneck is in:


Can you provide a simple, unoptimised example? I'm genuinely curios how
the two compare both before and after allocator optimisation.

--
Ian Collins
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Re: How include a large array? Edward A. Falk C Programming 1 04-04-2013 08:07 PM
templated function as parameter of another templated function Amadeus W. M. C++ 2 07-04-2006 09:59 PM
ASP.NET Templated User Controls - Limit child controls allowable within a templated control JohnyStyles@gmail.com ASP .Net 0 05-29-2006 06:00 PM
Subtypes of templated types (in templated functions) Marijn C++ 5 02-13-2004 09:50 AM
implementing a templated struct within a templated struct RA Scheltema C++ 3 01-06-2004 11:25 AM



Advertisments