
# Breakthrough

MelissA
Guest
Posts: n/a

 06-04-2012
On Mon, 04 Jun 2012 22:42:28 +0200
jacob navia <(E-Mail Removed)> wrote:

> Confirmed.
>
> real 0m5.112s
> user 0m4.600s
> sys 0m0.487s
>
> That's faster than C by a small amount...
>
>

This is with my in-place qsort using bidirectional
iterators:

bmaxa@maxa:~/jacob/ccl$ time ./cpplist1
sort time 1.480
Sum = 10738138201479754

real 0m2.105s
user 0m1.936s
sys 0m0.168s
bmaxa@maxa:~/jacob/ccl$ time ./testlist
sort time 3.130
Sum = 10738138201479754

real 0m4.265s
user 0m4.128s
sys 0m0.136s
bmaxa@maxa:~/jacob/ccl$ cat QSort.h
#include <algorithm>
#include <functional> // std::less
#include <iterator>   // std::iterator_traits

namespace VLib{
using std::swap;

template <typename T,typename F>
void qsort(T begin, T end, unsigned size, F f)
{
    if(begin == end)return;

    T high = end;
    if(!size)
    {
        // size unknown: walk the range once to count it
        // and position high on the last element
        T tmp = begin;
        while(tmp!=end){ high=tmp++;++size; }
    }
    else
    {
        --high;
    }

    if(size == 1)return;

    T low = begin;
    ++low;

    unsigned counthigh = 0,countlow = 0;
    do
    {
        while(high != low && f(*begin,*high)){ --high;++counthigh; }
        while(low != high && f(*low,*begin)){ ++low;++countlow; }

        // skip over elements equal to the pivot
        if(low != high && !f(*low,*begin) && !f(*begin,*low) && !f(*begin,*high) && !f(*high,*begin))
        {
            while(high != low && !f(*high,*low) && !f(*low,*high)){ --high;++counthigh; }
        }

        swap(*low,*high);
    }while(low != high);
    swap(*low,*begin);
    T i = low;

    // elements equal to the pivot need not be sorted again
    while(++i != end && !f(*i,*low) && !f(*low,*i) )--counthigh;

    VLib::qsort(begin,low,countlow,f);
    VLib::qsort(i,end,counthigh,f);
}

template <typename T>
inline void qsort(T begin, T end, unsigned size = 0)
{
    VLib::qsort(begin,end,size,std::less<typename std::iterator_traits<T>::value_type>());
}

}
bmaxa@maxa:~/jacob/ccl$ cat cpplist1.cpp
#include <list>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include "QSort.h"

#define N 10000000

int main()
{
    std::list<int> L1;
    long long sum=0,newSum=0;
    for(int i = 0; i < N; ++i)
    {
        int d = rand();
        sum+=d;
        L1.push_back(d);
    }

    clock_t t = clock();
    VLib::qsort(L1.begin(),L1.end());
    clock_t et = clock();
    printf("sort time %.3f\n",float(et - t)/CLOCKS_PER_SEC);

    for(auto it : L1)
    {
        newSum+= it;
    }

    if (sum != newSum)
        printf("Failed\n");
    else printf("Sum = %lld\n",sum);
}

Ian Collins

 06-04-2012
On 06/5/12 04:03 AM, jacob navia wrote:
> On 04/06/12 17:44, Ike Naar wrote:
>>
>> On the machine that I tried, most of the time was spent on the
>> implicit destruction of the list at program termination.

>
> Yes, I see a noticeable delay between the printing of the
> result sum and the end of the program.
>
> Apparently cleaning up is quite expensive since destructors must be
> called 100 million times.
>
> In the CCL you can ADD a destructor if you want but they are NOT
> required as in C++.

So not leaking memory is optional

--
Ian Collins

jacob navia

 06-04-2012
C: 6.6 s
C++: 14.76 s

See the analysis after this session transcript. Please try this on your machine.

~/stl/test $ cat listexample.cpp
#include <list>
#include <cstdio>
#include <cstdlib>

#define N 10000000

int main()
{
    std::list<int> L1;
    long long sum=0,newSum=0;
    int i;
    for( i = 0; i < N; ++i)
    {
        int d = rand();
        sum+=d;
        L1.push_back(d);
    }

    L1.sort();

    //for(auto it : L1)
    std::list<int>::iterator it;
    for(it=L1.begin(); it != L1.end(); ++it)
    {
        newSum+= *it;
    }

    if (sum != newSum)
        printf("Failed\n");
    else printf("Sum = %lld\n",sum);
}

~/stl/test $ g++ -O2 -o listexample-cpp listexample.cpp
~/stl/test $ time ./listexample-cpp
Sum = 10737818730605039

real 0m14.760s
user 0m14.438s
sys 0m0.320s
~/stl/test $ cat tintlist.c
#include <stdio.h>   /* printf (restored; missing from the posted transcript) */
#include <stdlib.h>
#include "containers.h"
#include "intlist.h"
#define N 10000000
int main(void)
{
    intList *L1;
    size_t i;
    int d;
    long long sum=0,newSum=0;
    Iterator *it;
    int *pint;

    L1 = iintList.Create(sizeof(int));
    iintList.UseHeap(L1,NULL); // Use specialized Heap
    // Add N random numbers to the integer list
    for (i=0; i<N;i++) {
        d = rand();
        sum += d;
        iintList.Add(L1,d); /* restored: this Add call was evidently lost in the posted transcript */
    }
    // Sort it
    iintList.Sort(L1);
    // Go through the sorted list using an iterator
    it = iintList.NewIterator(L1);
    i=0;
    for (pint = it->GetFirst(it); pint; pint = it->GetNext(it)) {
        newSum += *pint;
    }
    // Verify that both sums are identical
    if (sum != newSum)
        printf("Failed\n");
    else printf("Sum = %lld\n",sum);
    // Destroy the list
    iintList.Finalize(L1);
}
~/stl/test $ gcc -O2 -I.. -o listexample-c tintlist.c ../libccl.a
~/stl/test $ time ./listexample-c
Sum = 10737818730605039

real 0m6.584s
user 0m6.319s
sys 0m0.258s
~/stl/test $

Analysis:
---------
The main problems in the C++ code are:

1) malloc overhead. There are several solutions to that, but none are
standard.
2) Apparently the algorithm C++ uses for sorting a list differs from the
one I use: I copy the list into a vector, sort the vector, then copy the
sorted result back into the list.

For example:
~/stl/test $ cat tintlist1.cpp
#include <list>
#include <vector>
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <ctime>

#define N 10000000

int main()
{
    std::list<int> L1;
    long long sum=0,newSum=0;
    for(int i = 0; i < N; ++i)
    {
        int d = rand();
        sum+=d;
        L1.push_back(d);
    }
    std::vector<int> tmp(L1.begin(),L1.end());
    clock_t t = clock();
    std::sort(tmp.begin(),tmp.end());
    clock_t et = clock();
    L1.clear();
    L1.insert(L1.begin(),tmp.begin(),tmp.end());
    printf("sort time %.3f\n",float(et - t)/CLOCKS_PER_SEC);

    // for(auto it : L1)
    std::list<int>::iterator it;
    for(it=L1.begin(); it != L1.end(); ++it)
    {
        newSum+= *it;
    }

    if (sum != newSum)
        printf("Failed\n");
    else printf("Sum = %lld\n",sum);
}
This code takes only:
~/stl/test $ time ./tintlist1-cpp
sort time 1.138
Sum = 10737818730605039

real 0m5.137s
user 0m4.656s
sys 0m0.476s

That is nearly a 3x improvement (14.76 s down to 5.14 s).

Of course this is no longer fair to C, since I could have done a vector sort
on the C side too and been faster than C++ anyway.

Conclusion:

Probably C and C++ go at the same speed.

jacob navia

 06-04-2012
On 05/06/12 00:23, Ian Collins wrote:
>> In the CCL you can ADD a destructor if you want but they are NOT
>> required as in C++.

>
> So not leaking memory is optional

No. You do a

iintList.Finalize(list);

The advantage is that you free ALL memory in a single call to a function
that destroys the heap, not element by element but in blocks of 1000
list elements at a time, you see?

C++ calls destructors 100 million times. That takes time.

MelissA

 06-04-2012
On Tue, 5 Jun 2012 00:16:36 +0200
MelissA <(E-Mail Removed)> wrote:

>
> unsigned counthigh = 0,countlow = 0;

My error:
countlow should be 1, not 0.

BartC

 06-04-2012
"jacob navia" <(E-Mail Removed)> wrote in message
news:jqj017\$vai\$(E-Mail Removed)...
> On 04/06/12 20:18, BartC wrote:

>> hung up on errors in wchar.h (first on line 521, then on line 509).
>>

>
> Ahhh you are using the distribution that comes with the compiler?

No. I updated lcc in case it was out-of-date and causing the problem. The
CCL stuff is separate.

In any case, this project doesn't need a makefile as it's straightforward
(now I know what's needed); I will leave that to the experts.

I wanted to test the speed of container lists versus ordinary arrays (I know
containers are more sophisticated but your test is simple so I'm looking at
just that).

The results of your test program weren't so interesting, because 95% of the
runtime seemed to be in sorting. So I got rid of that, and just tested
allocating and iterating over 100M (not 10M) int objects. This took about 8
seconds and seemed to use about 1.3GB of memory (is there a per-element
overhead?).

Using an ordinary allocated array, it took only 1.7 seconds, and used the
expected 0.4GB of memory.

However this was *preallocated*, a possible advantage. (I ran the same test
on a dynamic language, which took 16 seconds, whether the int-array was
preallocated or not. Interestingly (need to verify this), using a
preallocated generic list only took 10 seconds, not far off the CCL timing
for a non-generic list).

It seems the conclusion, if my results are valid, is that when the
requirements are very simple, ordinary C array handling should be
considered if speed and memory are important. Maybe you might want to put
forward a more elaborate test that isn't so easy to rewrite in standard C...

--
Bartc

Luca Risolia

 06-04-2012
On 05/06/2012 00:31, jacob navia wrote:
> On 05/06/12 00:23, Ian Collins wrote:
>>> In the CCL you can ADD a destructor if you want but they are NOT
>>> required as in C++.

>>
>> So not leaking memory is optional

>
> No. You do a
>
> iintList.Finalize(list);
>
> The advantage is that you free ALL memory in a single call to a function
> that destroys the heap not individually (list element by list element)
> but in blocks of 1000 list elements each you see?
>
> C++ calls destructors 100 million times. That takes time.

Not necessarily. Memory deallocation and object destruction are two
separate things usually controlled by an allocator. With regard to the
standard containers an allocator is always an (optional) template
parameter. If you really want, you can provide your preferred
deallocation strategy for a given type T to std::list, so that it frees
the memory in blocks of 1000 and doesn't destroy any object.

Martin Shobe

 06-05-2012
Luca Risolia wrote:

> On 05/06/2012 00:31, jacob navia wrote:
>> On 05/06/12 00:23, Ian Collins wrote:
>>>> In the CCL you can ADD a destructor if you want but they are NOT
>>>> required as in C++.
>>>
>>> So not leaking memory is optional

>>
>> No. You do a
>>
>> iintList.Finalize(list);
>>
>> The advantage is that you free ALL memory in a single call to a function
>> that destroys the heap not individually (list element by list element)
>> but in blocks of 1000 list elements each you see?
>>
>> C++ calls destructors 100 million times. That takes time.

>
> Not necessarily. Memory deallocation and object destruction are two
> separate things usually controlled by an allocator. With regard to the
> standard containers an allocator is always an (optional) template
> parameter. If you really want, you can provide your preferred
> deallocation strategy for a given type T to std::list, so that it frees
> the memory in blocks of 1000 and doesn't destroy any object.
>

In the code that led to this, the list was a list of ints. They don't
even have destructors to call, so the time certainly wasn't spent
calling them.

Martin Shobe

MelissA

 06-05-2012
Final QSort.h (corrected errors; this one is a little bit slower):

#include <algorithm>
#include <functional> // std::less
#include <iterator>   // std::iterator_traits

namespace VLib{
using std::swap;

template <typename T,typename F>
void qsort(T begin, T end, unsigned size, F f)
{
    if(begin == end)return;

    T high = end;
    if(!size)
    {
        // size unknown: walk the range once to count it
        // and position high on the last element
        T tmp = begin;
        while(tmp!=end){ high=tmp++;++size; }
    }
    else
    {
        --high;
    }

    if(size == 1)return;

    // move the middle element to the front to serve as the pivot
    T low = begin;
    unsigned count = 0;
    while(++count<size/2)++low;
    // printf("size %u\n",size);
    swap(*low,*begin);
    low = begin;
    ++low;

    unsigned counthigh = 0,countlow = 1;
    do
    {
        while(high != low && f(*begin,*high)){ --high;++counthigh; }
        while(low != high && f(*low,*begin)){ ++low;++countlow; }

        // skip over elements equal to the pivot
        if(low != high && !f(*low,*begin) && !f(*begin,*low) && !f(*begin,*high) && !f(*high,*begin))
        {
            while(high != low && !f(*high,*low) && !f(*low,*high)){ --high;++counthigh; }
        }

        swap(*low,*high);
    }while(low != high);
    swap(*low,*begin);
    T i = low;

    // elements equal to the pivot need not be sorted again
    while(++i != end && !f(*i,*low) && !f(*low,*i) )--counthigh;

    VLib::qsort(begin,low,countlow,f);
    VLib::qsort(i,end,counthigh,f);
}

template <typename T>
inline void qsort(T begin, T end, unsigned size = 0)
{
    VLib::qsort(begin,end,size,std::less<typename std::iterator_traits<T>::value_type>());
}

}

Juha Nieminen

 06-05-2012
In comp.lang.c++ jacob navia <(E-Mail Removed)> wrote:
>> If you want to test sorting speed and nothing else, you have to discard
>> the time spent in memory allocation.

>
> Yes, but what I want to test is the speed to do ALL tasks mentioned above

Then you will be comparing C++'s default memory allocator speed to whatever
that C list was using.