Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > Re: Breakthrough

Reply
Thread Tools

Re: Breakthrough

 
 
Paul
Guest
Posts: n/a
 
      06-04-2012
Juha Nieminen wrote:
> In comp.lang.c++ MelissA <(E-Mail Removed)> wrote:
>> #define N 10000000
>>
>> int main()
>> {
>> std::list<int> L1;
>> long long sum=0,newSum=0;
>> for(int i = 0; i < N; ++i)
>> {
>> int d = rand();
>> sum+=d;
>> L1.push_back(d);

>
> Your test is basically useless as a measurement of sorting speed because
> the vast majority of the time will be spent on those push_back() calls.
>
> If you want to test sorting speed and nothing else, you have to discard
> the time spent in memory allocation.


If you're going to collect timing information, it should be
collected right around the working part of the test, and
inside the code itself. Rather than recording the runtime of
the entire executable, and all those rand() calls.

generate_random_numbers
...
get_time_before
do_the_sort
get_time_after
...
print ( get_time_after - get_time_before )

Paul
 
Reply With Quote
 
 
 
 
Zoltan Juhasz
Guest
Posts: n/a
 
      06-04-2012
On Monday, 4 June 2012 12:01:16 UTC-4, jacob navia wrote:
> C++ is a bit more complicated in this respect, but I would
> like to use a common library. Loki needs to be downloaded,
> compiled, and it is lacking some examples...
>
> In this page
> http://sourceforge.net/projects/loki.../topic/3828398
>
> people say it runs slower than malloc...


Yes, it was designed as a portable, book-example allocator,
not necessarily an industry strength, fully tuned one.

It might be slower than malloc to allocate the elements,
but from the test perspective allocation speed is irrelevant;
What you are after with Loki is a more compact representation
of your data.

> > As far a feasibility is concerned, I am moderately sure that
> > there are off-the-shelf, STL compatible small-object
> > allocators available; providing an additional template argument
> > s hould not be a problem.
> >

>
> Well, besides Loki what others would you propose?


Well, Google can only answer that , but as I said, Loki should
be fine, also NSTL seems to provide its own SmallObjAlloc.

> Not horribly complex for you but for me it looks like a big deal...


Oh I agree, it is not a trivial exercise, but keep in mind
that you are talking about a specialized use-case.

Back to the original topic: if the edge is provided by your
sorting algorithm, the tests have to be based on close-to
identical environment, and your sort has to be measured, not
something else.

On the other hand, I also hear your argument that the test should
be based on a close-to identical environment that could be achieved
with resonably similar effort...

-- Zoltan
 
Reply With Quote
 
 
 
 
MelissA
Guest
Posts: n/a
 
      06-04-2012
On Mon, 4 Jun 2012 15:32:34 +0000 (UTC)
Juha Nieminen <(E-Mail Removed)> wrote:

> In comp.lang.c++ MelissA <(E-Mail Removed)> wrote:
> > #define N 10000000
> >
> > int main()
> > {
> > std::list<int> L1;
> > long long sum=0,newSum=0;
> > for(int i = 0; i < N; ++i)
> > {
> > int d = rand();
> > sum+=d;
> > L1.push_back(d);

>
> Your test is basically useless as a measurement of sorting speed
> because the vast majority of the time will be spent on those
> push_back() calls.
>
> If you want to test sorting speed and nothing else, you have to
> discard the time spent in memory allocation.

Here is just sorting time, C version is significantly faster:

bmaxa@maxa:~/jacob/ccl$ ./testlist
sort time 3.140
Sum = 10738138201479754
bmaxa@maxa:~/jacob/ccl$ ./cpplist
sort time 5.350
Sum = 10738138201479754
bmaxa@maxa:~/jacob/ccl$

 
Reply With Quote
 
MelissA
Guest
Posts: n/a
 
      06-04-2012
On Mon, 4 Jun 2012 21:44:30 +0200
MelissA <(E-Mail Removed)> wrote:

> On Mon, 4 Jun 2012 15:32:34 +0000 (UTC)
> Juha Nieminen <(E-Mail Removed)> wrote:
>
> > In comp.lang.c++ MelissA <(E-Mail Removed)> wrote:
> > > #define N 10000000
> > >
> > > int main()
> > > {
> > > std::list<int> L1;
> > > long long sum=0,newSum=0;
> > > for(int i = 0; i < N; ++i)
> > > {
> > > int d = rand();
> > > sum+=d;
> > > L1.push_back(d);

> >
> > Your test is basically useless as a measurement of sorting speed
> > because the vast majority of the time will be spent on those
> > push_back() calls.
> >
> > If you want to test sorting speed and nothing else, you have to
> > discard the time spent in memory allocation.

> Here is just sorting time, C version is significantly faster:
>
> bmaxa@maxa:~/jacob/ccl$ ./testlist
> sort time 3.140
> Sum = 10738138201479754
> bmaxa@maxa:~/jacob/ccl$ ./cpplist
> sort time 5.350
> Sum = 10738138201479754
> bmaxa@maxa:~/jacob/ccl$
>


But watch out this:
It is faster to populate vector from list, then
to sort vector , then to clear list, then
to populate list from vector then C version!

bmaxa@maxa:~/jacob/ccl$ time ./cpplist
sort time 0.750
Sum = 10738138201479754

real 0m1.813s
user 0m1.660s
sys 0m0.152s
bmaxa@maxa:~/jacob/ccl$ time ./testlist
sort time 3.140
Sum = 10738138201479754

real 0m4.302s
user 0m4.208s
sys 0m0.096s
bmaxa@maxa:~/jacob/ccl$ cat cpplist.cpp
#include <list>
#include <vector>
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <ctime>

#define N 10000000

int main()
{
std::list<int> L1;
long long sum=0,newSum=0;
for(int i = 0; i < N; ++i)
{
int d = rand();
sum+=d;
L1.push_back(d);
}
std::vector<int> tmp(L1.begin(),L1.end());
clock_t t = clock();
std::sort(tmp.begin(),tmp.end());
clock_t et = clock();
L1.clear();
L1.insert(L1.begin(),tmp.begin(),tmp.end());
printf("sort time %.3f\n",float(et - t)/CLOCKS_PER_SEC);

for(auto it : L1)
{
newSum+= it;
}

if (sum != newSum)
printf("Failed\n");
else printf("Sum = %lld\n",sum);

}

 
Reply With Quote
 
jacob navia
Guest
Posts: n/a
 
      06-04-2012
Confirmed.

real 0m5.112s
user 0m4.600s
sys 0m0.487s

That's faster than C by a small amount...


 
Reply With Quote
 
MelissA
Guest
Posts: n/a
 
      06-04-2012
On Mon, 04 Jun 2012 22:42:28 +0200
jacob navia <(E-Mail Removed)> wrote:

> Confirmed.
>
> real 0m5.112s
> user 0m4.600s
> sys 0m0.487s
>
> That's faster than C by a small amount...
>
>


This is with my inplace qsort with bidirectional
iterators:


bmaxa@maxa:~/jacob/ccl$ time ./cpplist1
sort time 1.480
Sum = 10738138201479754

real 0m2.105s
user 0m1.936s
sys 0m0.168s
bmaxa@maxa:~/jacob/ccl$ time ./testlist
sort time 3.130
Sum = 10738138201479754

real 0m4.265s
user 0m4.128s
sys 0m0.136s
bmaxa@maxa:~/jacob/ccl$ cat QSort.h
#include <algorithm>

namespace VLib{
using std::swap;

template <typename T,typename F>
void qsort(T begin, T end, unsigned size, F f)
{
if(begin == end)return;

T high = end;
if(!size)
{
T tmp = begin;
while(tmp!=end){ high=tmp++;++size; }
}
else
{
--high;
}

if(size == 1)return;

T low = begin;
++low;

unsigned counthigh = 0,countlow = 0;
do
{
while(high != low && f(*begin,*high)){ --high;++counthigh; }
while(low != high && f(*low,*begin)){ ++low;++countlow; }

if(low != high && !f(*low,*begin) && !f(*begin,*low) && !f(*begin,*high) && !f(*high,*begin))
{
while(high != low && !f(*high,*low) && !f(*low,*high)){ --high;++counthigh; }
}

swap(*low,*high);
}while(low != high);
swap(*low,*begin);
T i = low;

while(++i != end && !f(*i,*low) && !f(*low,*i) )--counthigh;

VLib::qsort(begin,low,countlow,f);
VLib::qsort(i,end,counthigh,f);
}

template <typename T>
inline void qsort(T begin, T end, unsigned size = 0)
{
VLib::qsort(begin,end,size,std::less<typename std::iterator_traits<T>::value_type>());
}

}
bmaxa@maxa:~/jacob/ccl$ cat cpplist1.cpp
#include <list>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include "QSort.h"

#define N 10000000

int main()
{
std::list<int> L1;
long long sum=0,newSum=0;
for(int i = 0; i < N; ++i)
{
int d = rand();
sum+=d;
L1.push_back(d);
}

clock_t t = clock();
VLib::qsort(L1.begin(),L1.end());
clock_t et = clock();
printf("sort time %.3f\n",float(et - t)/CLOCKS_PER_SEC);

for(auto it : L1)
{
newSum+= it;
}

if (sum != newSum)
printf("Failed\n");
else printf("Sum = %lld\n",sum);

}

 
Reply With Quote
 
Ian Collins
Guest
Posts: n/a
 
      06-04-2012
On 06/ 5/12 04:03 AM, jacob navia wrote:
> Le 04/06/12 17:44, Ike Naar a écrit :
>>
>> On the machine that I tried, most of the time was spent on the
>> implicit destruction of the list at program termination.

>
> Yes, I see a noticeable delay between the printing of the
> result sum and the end of the program.
>
> Apparently cleaning up is quit e expensive since destructors must be
> called 100 million times.
>
> In the CCL you can ADD a destructor if you want but they are NOT
> required as in C++.


So not leaking memory is optional

--
Ian Collins
 
Reply With Quote
 
jacob navia
Guest
Posts: n/a
 
      06-04-2012
Le 05/06/12 00:23, Ian Collins a écrit :
>> In the CCL you can ADD a destructor if you want but they are NOT
>> required as in C++.

>
> So not leaking memory is optional


No. You do a

iintList.Finalize(list);

The advantage is that you free ALL memory in a single call to a function
that destroys the heap not individually (list element by list element)
but in blocks of 1000 list elements each you see?

C++ calls destructors 100 million times. That takes time.

 
Reply With Quote
 
MelissA
Guest
Posts: n/a
 
      06-04-2012
On Tue, 5 Jun 2012 00:16:36 +0200
MelissA <(E-Mail Removed)> wrote:

>
> unsigned counthigh = 0,countlow = 0;

mine error
countlow should be 1, not 0

 
Reply With Quote
 
Luca Risolia
Guest
Posts: n/a
 
      06-04-2012
On 05/06/2012 00:31, jacob navia wrote:
> Le 05/06/12 00:23, Ian Collins a écrit :
>>> In the CCL you can ADD a destructor if you want but they are NOT
>>> required as in C++.

>>
>> So not leaking memory is optional

>
> No. You do a
>
> iintList.Finalize(list);
>
> The advantage is that you free ALL memory in a single call to a function
> that destroys the heap not individually (list element by list element)
> but in blocks of 1000 list elements each you see?
>
> C++ calls destructors 100 million times. That takes time.


Not necessarily. Memory deallocation and object destruction are two
separate things usually controlled by an allocator. With regard to the
standard containers an allocator is always an (optional) template
parameter. If you really want, you can provide your preferred
deallocation strategy for a given type T to std::list, so that it frees
the memory in blocks of 1000 and doesn't destroy any object.



 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Amazing breakthrough in digital photography Lionel Lauer Digital Photography 1 06-29-2004 06:38 PM
Scientific breakthrough: magnetic lenses  @ .  Digital Photography 0 04-05-2004 01:59 AM
A breakthrough in lens technology: must read REED BOXIN Digital Photography 3 03-07-2004 08:40 PM
A remarkable breakthrough: Sony DSC-F828 Jill Digital Photography 24 10-28-2003 05:52 AM
Re: Whitley Strieber's New Breakthrough Baron Maximillian von Schtuldeworfshiseundurheimhoppen Computer Support 0 10-19-2003 01:09 AM



Advertisments