- **C++**
(*http://www.velocityreviews.com/forums/f39-c.html*)

- - **bench: binary trees**
(*http://www.velocityreviews.com/forums/t948664-bench-binary-trees.html*)

bench: binary treesI have played a little tree with benchmark from computer language game
site: http://shootout.alioth.debian.org/u6...st=binarytrees Interested in gc vs manual I have been very surprised with Java performance ;) Seems that Java gc is blazingly fast in comparison with standard new/delete. I think this is because gc deallocates in chunks rather piece by piece, perhaps allocates from pool. Also there are no destructors calls. fastest C++ program on this site uses boost pool which calls destructors and can deallocate, therefore, is slower then C variant which uses apache's pool, which just deallocate. So I have coded simple custom pool that does not keeps track of allocated objects and doesn't call destructors and result is that it is slightly faster than C ;) What is clear is that Java garbage collector cannot be bitten in this case if manual deletion of objects is required. (as seen here) http://shootout.alioth.debian.org/u6...st=binarytrees Timings for cpp with my custom alloc bmaxa@maxa:~/shootout/trees$ time ./binarytreescpp 20 stretch tree of depth 21 check: -1 2097152 trees of depth 4 check: -2097152 524288 trees of depth 6 check: -524288 131072 trees of depth 8 check: -131072 32768 trees of depth 10 check: -32768 8192 trees of depth 12 check: -8192 2048 trees of depth 14 check: -2048 512 trees of depth 16 check: -512 128 trees of depth 18 check: -128 32 trees of depth 20 check: -32 long lived tree of depth 20 check: -1 real 0m4.188s user 0m7.036s sys 0m0.112s bmaxa@maxa:~/shootout/trees$ time taskset -c 0 ./binarytreescpp 20 stretch tree of depth 21 check: -1 2097152 trees of depth 4 check: -2097152 524288 trees of depth 6 check: -524288 131072 trees of depth 8 check: -131072 32768 trees of depth 10 check: -32768 8192 trees of depth 12 check: -8192 2048 trees of depth 14 check: -2048 512 trees of depth 16 check: -512 128 trees of depth 18 check: -128 32 trees of depth 20 check: -32 long lived tree of depth 20 check: -1 real 0m7.077s user 0m6.976s sys 0m0.084s Timings for C with apache pool: bmaxa@maxa:~/shootout/trees$ time ./binarytreesc 20 stretch tree of depth 21 check: -1 2097152 trees of depth 4 check: -2097152 524288 trees of depth 6 check: -524288 131072 trees of depth 8 check: -131072 32768 trees of depth 10 check: -32768 8192 trees of depth 12 check: -8192 2048 trees of depth 14 check: -2048 512 trees of depth 16 check: -512 128 trees of depth 18 check: -128 32 trees of depth 20 check: -32 long lived tree of depth 20 check: -1 real 0m4.625s user 0m8.085s sys 0m0.140s bmaxa@maxa:~/shootout/trees$ time taskset -c 0 ./binarytreesc 20 stretch tree of depth 21 check: -1 2097152 trees of depth 4 check: -2097152 524288 trees of depth 6 check: -524288 131072 trees of depth 8 check: -131072 32768 trees of depth 10 check: -32768 8192 trees of depth 12 check: -8192 2048 trees of depth 14 check: -2048 512 trees of depth 16 check: -512 128 trees of depth 18 check: -128 32 trees of depth 20 check: -32 long lived tree of depth 20 check: -1 real 0m8.009s user 0m7.888s sys 0m0.116s code follows if someone is interested: /* The Computer Language Benchmarks Game * http://shootout.alioth.debian.org/ * * contributed by Jon Harrop * modified by Alex Mizrahi * modified by Andreas SchÃ¤fer * very minor omp tweak by The Anh Tran * custom pool Branimir Maksimovic */ #include <iostream> #include <stdlib.h> #include <stdio.h> #include <vector> #include <new> #include <pthread.h> #include <boost/pool/object_pool.hpp> const size_t LINE_SIZE = 64; template <class T> class Pool{ static const unsigned NELEM = 10000*sizeof(T); public: Pool() : pos_(0) { pool_.push_back(alloc()); } Pool(const Pool&)=delete; Pool& operator=(const Pool&)=delete; ~Pool() { for(auto i : pool_) { free(i); } } template <class ...T1> T* allocate(T1... args) { if(NELEM <= pos_) { pool_.push_back(alloc()); pos_=0; } void* p = &pool_.back()[pos_]; pos_+=sizeof(T); return new(p)T(args...); } private: static char* alloc() { Inner* next = (Inner*)pthread_getspecific(k.key); if(next == nullptr) { next = (Inner*)::operator new(NELEM); next->next = nullptr; } char* p = (char*)next; next = next->next; pthread_setspecific(k.key,next); return p; } static void free(void* p) { Inner* next = (Inner*)pthread_getspecific(k.key); Inner* tmp = (Inner*)p; tmp->next = next; next = tmp; pthread_setspecific(k.key,next); } std::vector<char*> pool_; unsigned pos_; struct Inner{ Inner* next; }; static struct Key{ Key() { pthread_key_create(&key,destroy); } ~Key() { pthread_key_delete(key); } static void destroy(void* p) { Inner* next = (Inner*)p; while(next) { Inner* tmp = next->next; ::operator delete(next); next = tmp; } } pthread_key_t key; }k; }; template <class T> typename Pool<T>::Key Pool<T>::k; struct Node { Node *l, *r; int i; Node(int i2) :l(0),r(0), i(i2) {} Node(Node *l2, int i2, Node *r2) : l(l2), r(r2), i(i2) {} Node(const Node&)=delete; Node& operator=(const Node&)=delete; int check() const { if (l) return l->check() + i - r->check(); else return i; } }; typedef boost::object_pool<Node> NodePool; Node *make(int i, int d, Pool<Node>& p) { if (d > 0) return p.allocate( make(2*i-1, d-1,p), i, make(2*i, d-1,p)); return p.allocate(i); } Node *make(int i, int d, NodePool& p) { if (d > 0) return p.construct( make(2*i-1, d-1,p), i, make(2*i, d-1,p)); return p.construct(i); } int main(int argc, char *argv[]) { int min_depth = 4; int max_depth = std::max(min_depth+2, (argc == 2 ? atoi(argv[1]) : 10)); int stretch_depth = max_depth+1; // Alloc then dealloc stretchdepth tree { Pool<Node> c_pool; //NodePool c_pool; Node* c (make(0, stretch_depth,c_pool)); std::cout << "stretch tree of depth " << stretch_depth << "\t " << "check: " << c->check() << '\n'; } Pool<Node> long_lived_tree_pool; //NodePool long_lived_tree_pool; Node* long_lived_tree(make(0, max_depth,long_lived_tree_pool)); char *outputstr = (char*)malloc(LINE_SIZE * (max_depth +1) * sizeof(char)); #pragma omp parallel for for (int d = min_depth; d <= max_depth; d += 2) { int iterations = 1 << (max_depth - d + min_depth); int c = 0; for (int i = 1; i <= iterations; ++i) { Pool<Node> pool; //NodePool pool; Node *a(make(i, d,pool)), *b(make(-i, d,pool)); c += a->check() + b->check(); } sprintf(outputstr + LINE_SIZE * d, "%d\t trees of depth %d\t check: %d\n", (2 * iterations), d, c); } for (int d = min_depth; d <= max_depth; d += 2) printf("%s", outputstr + (d * LINE_SIZE) ); free(outputstr); std::cout << "long lived tree of depth " << max_depth << "\t " << "check: " << (long_lived_tree->check()) << '\n'; return 0; } |

Re: bench: binary treesOn Saturday, July 28, 2012 7:20:04 PM UTC-4, Melzzzzz wrote:
> I have played a little tree with benchmark from computer language game > > site: > > http://shootout.alioth.debian.org/u6...st=binarytrees > > > > Interested in gc vs manual I have been very surprised with Java > > performance ;) > > Seems that Java gc is blazingly fast in comparison with standard > > new/delete. Is it possible that the benchmark never causes GC to run? Is so, that would also mean any time it took to run destructors was also ignored in the Java case. > > I think this is because gc deallocates in chunks rather piece by > > piece, perhaps allocates from pool. Also there are no destructors > > calls. > > fastest C++ program on this site uses boost pool which calls > > destructors and can deallocate, therefore, is slower then C variant > > which uses apache's pool, which just deallocate. > > > > So I have coded simple custom pool that does not keeps track of > > allocated objects and doesn't call destructors and result is > > that it is slightly faster than C ;) > > > > What is clear is that Java garbage collector cannot be bitten > > in this case if manual deletion of objects is required. > > (as seen here) > > http://shootout.alioth.debian.org/u6...st=binarytrees > > > > Timings for cpp with my custom alloc > > bmaxa@maxa:~/shootout/trees$ time ./binarytreescpp 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m4.188s > > user 0m7.036s > > sys 0m0.112s > > bmaxa@maxa:~/shootout/trees$ time taskset -c 0 ./binarytreescpp 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m7.077s > > user 0m6.976s > > sys 0m0.084s > > > > Timings for C with apache pool: > > bmaxa@maxa:~/shootout/trees$ time ./binarytreesc 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m4.625s > > user 0m8.085s > > sys 0m0.140s > > bmaxa@maxa:~/shootout/trees$ time taskset -c 0 ./binarytreesc 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m8.009s > > user 0m7.888s > > sys 0m0.116s > > > > code follows if someone is interested: > > /* The Computer Language Benchmarks Game > > * http://shootout.alioth.debian.org/ > > * > > * contributed by Jon Harrop > > * modified by Alex Mizrahi > > * modified by Andreas Schäfer > > * very minor omp tweak by The Anh Tran > > * custom pool Branimir Maksimovic > > */ > > > > #include <iostream> > > #include <stdlib.h> > > #include <stdio.h> > > #include <vector> > > #include <new> > > #include <pthread.h> > > #include <boost/pool/object_pool.hpp> > > > > const size_t LINE_SIZE = 64; > > > > template <class T> > > class Pool{ > > static const unsigned NELEM = 10000*sizeof(T); > > public: > > Pool() > > : pos_(0) > > { > > pool_.push_back(alloc()); > > } > > Pool(const Pool&)=delete; > > Pool& operator=(const Pool&)=delete; > > ~Pool() > > { > > for(auto i : pool_) > > { > > free(i); > > } > > } > > template <class ...T1> > > T* allocate(T1... args) > > { > > if(NELEM <= pos_) > > { > > pool_.push_back(alloc()); > > pos_=0; > > } > > void* p = &pool_.back()[pos_]; > > pos_+=sizeof(T); > > return new(p)T(args...); > > } > > private: > > static char* alloc() > > { > > Inner* next = (Inner*)pthread_getspecific(k.key); > > if(next == nullptr) > > { > > next = (Inner*)::operator new(NELEM); > > next->next = nullptr; > > } > > char* p = (char*)next; > > next = next->next; > > pthread_setspecific(k.key,next); > > return p; > > } > > static void free(void* p) > > { > > Inner* next = (Inner*)pthread_getspecific(k.key); > > Inner* tmp = (Inner*)p; > > tmp->next = next; > > next = tmp; > > pthread_setspecific(k.key,next); > > } > > std::vector<char*> pool_; > > unsigned pos_; > > struct Inner{ > > Inner* next; > > }; > > static struct Key{ > > Key() > > { > > pthread_key_create(&key,destroy); > > } > > ~Key() > > { > > pthread_key_delete(key); > > } > > static void destroy(void* p) > > { > > Inner* next = (Inner*)p; > > while(next) > > { > > Inner* tmp = next->next; > > ::operator delete(next); > > next = tmp; > > } > > } > > pthread_key_t key; > > }k; > > }; > > > > template <class T> > > typename Pool<T>::Key Pool<T>::k; > > > > struct Node > > { > > Node *l, *r; > > int i; > > > > Node(int i2) :l(0),r(0), i(i2) > > {} > > Node(Node *l2, int i2, Node *r2) : l(l2), r(r2), i(i2) > > {} > > Node(const Node&)=delete; > > Node& operator=(const Node&)=delete; > > int check() const > > { > > if (l) > > return l->check() + i - r->check(); > > else return i; > > } > > }; > > > > typedef boost::object_pool<Node> NodePool; > > > > Node *make(int i, int d, Pool<Node>& p) > > { > > if (d > 0) > > return p.allocate( make(2*i-1, d-1,p), > > i, > > make(2*i, d-1,p)); > > return p.allocate(i); > > } > > > > Node *make(int i, int d, NodePool& p) > > { > > if (d > 0) > > return p.construct( make(2*i-1, d-1,p), > > i, > > make(2*i, d-1,p)); > > return p.construct(i); > > } > > > > int main(int argc, char *argv[]) > > { > > int min_depth = 4; > > int max_depth = std::max(min_depth+2, > > (argc == 2 ? atoi(argv[1]) : 10)); > > int stretch_depth = max_depth+1; > > > > // Alloc then dealloc stretchdepth tree > > { > > Pool<Node> c_pool; > > //NodePool c_pool; > > Node* c (make(0, stretch_depth,c_pool)); > > std::cout << "stretch tree of depth " << stretch_depth << "\t " > > << "check: " << c->check() << '\n'; > > } > > > > Pool<Node> long_lived_tree_pool; > > //NodePool long_lived_tree_pool; > > Node* long_lived_tree(make(0, max_depth,long_lived_tree_pool)); > > > > char *outputstr = (char*)malloc(LINE_SIZE * (max_depth +1) * sizeof(char)); > > > > #pragma omp parallel for > > for (int d = min_depth; d <= max_depth; d += 2) > > { > > int iterations = 1 << (max_depth - d + min_depth); > > int c = 0; > > > > for (int i = 1; i <= iterations; ++i) > > { > > Pool<Node> pool; > > //NodePool pool; > > Node *a(make(i, d,pool)), *b(make(-i, d,pool)); > > c += a->check() + b->check(); > > } > > > > sprintf(outputstr + LINE_SIZE * d, "%d\t trees of depth %d\t check: %d\n", (2 * iterations), d, c); > > } > > > > for (int d = min_depth; d <= max_depth; d += 2) > > printf("%s", outputstr + (d * LINE_SIZE) ); > > free(outputstr); > > > > std::cout << "long lived tree of depth " << max_depth << "\t " > > << "check: " << (long_lived_tree->check()) << '\n'; > > > > return 0; > > } On Saturday, July 28, 2012 7:20:04 PM UTC-4, Melzzzzz wrote: > I have played a little tree with benchmark from computer language game > > site: > > http://shootout.alioth.debian.org/u6...st=binarytrees > > > > Interested in gc vs manual I have been very surprised with Java > > performance ;) > > Seems that Java gc is blazingly fast in comparison with standard > > new/delete. > > I think this is because gc deallocates in chunks rather piece by > > piece, perhaps allocates from pool. Also there are no destructors > > calls. > > fastest C++ program on this site uses boost pool which calls > > destructors and can deallocate, therefore, is slower then C variant > > which uses apache's pool, which just deallocate. > > > > So I have coded simple custom pool that does not keeps track of > > allocated objects and doesn't call destructors and result is > > that it is slightly faster than C ;) > > > > What is clear is that Java garbage collector cannot be bitten > > in this case if manual deletion of objects is required. > > (as seen here) > > http://shootout.alioth.debian.org/u6...st=binarytrees > > > > Timings for cpp with my custom alloc > > bmaxa@maxa:~/shootout/trees$ time ./binarytreescpp 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m4.188s > > user 0m7.036s > > sys 0m0.112s > > bmaxa@maxa:~/shootout/trees$ time taskset -c 0 ./binarytreescpp 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m7.077s > > user 0m6.976s > > sys 0m0.084s > > > > Timings for C with apache pool: > > bmaxa@maxa:~/shootout/trees$ time ./binarytreesc 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m4.625s > > user 0m8.085s > > sys 0m0.140s > > bmaxa@maxa:~/shootout/trees$ time taskset -c 0 ./binarytreesc 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m8.009s > > user 0m7.888s > > sys 0m0.116s > > > > code follows if someone is interested: > > /* The Computer Language Benchmarks Game > > * http://shootout.alioth.debian.org/ > > * > > * contributed by Jon Harrop > > * modified by Alex Mizrahi > > * modified by Andreas Schäfer > > * very minor omp tweak by The Anh Tran > > * custom pool Branimir Maksimovic > > */ > > > > #include <iostream> > > #include <stdlib.h> > > #include <stdio.h> > > #include <vector> > > #include <new> > > #include <pthread.h> > > #include <boost/pool/object_pool.hpp> > > > > const size_t LINE_SIZE = 64; > > > > template <class T> > > class Pool{ > > static const unsigned NELEM = 10000*sizeof(T); > > public: > > Pool() > > : pos_(0) > > { > > pool_.push_back(alloc()); > > } > > Pool(const Pool&)=delete; > > Pool& operator=(const Pool&)=delete; > > ~Pool() > > { > > for(auto i : pool_) > > { > > free(i); > > } > > } > > template <class ...T1> > > T* allocate(T1... args) > > { > > if(NELEM <= pos_) > > { > > pool_.push_back(alloc()); > > pos_=0; > > } > > void* p = &pool_.back()[pos_]; > > pos_+=sizeof(T); > > return new(p)T(args...); > > } > > private: > > static char* alloc() > > { > > Inner* next = (Inner*)pthread_getspecific(k.key); > > if(next == nullptr) > > { > > next = (Inner*)::operator new(NELEM); > > next->next = nullptr; > > } > > char* p = (char*)next; > > next = next->next; > > pthread_setspecific(k.key,next); > > return p; > > } > > static void free(void* p) > > { > > Inner* next = (Inner*)pthread_getspecific(k.key); > > Inner* tmp = (Inner*)p; > > tmp->next = next; > > next = tmp; > > pthread_setspecific(k.key,next); > > } > > std::vector<char*> pool_; > > unsigned pos_; > > struct Inner{ > > Inner* next; > > }; > > static struct Key{ > > Key() > > { > > pthread_key_create(&key,destroy); > > } > > ~Key() > > { > > pthread_key_delete(key); > > } > > static void destroy(void* p) > > { > > Inner* next = (Inner*)p; > > while(next) > > { > > Inner* tmp = next->next; > > ::operator delete(next); > > next = tmp; > > } > > } > > pthread_key_t key; > > }k; > > }; > > > > template <class T> > > typename Pool<T>::Key Pool<T>::k; > > > > struct Node > > { > > Node *l, *r; > > int i; > > > > Node(int i2) :l(0),r(0), i(i2) > > {} > > Node(Node *l2, int i2, Node *r2) : l(l2), r(r2), i(i2) > > {} > > Node(const Node&)=delete; > > Node& operator=(const Node&)=delete; > > int check() const > > { > > if (l) > > return l->check() + i - r->check(); > > else return i; > > } > > }; > > > > typedef boost::object_pool<Node> NodePool; > > > > Node *make(int i, int d, Pool<Node>& p) > > { > > if (d > 0) > > return p.allocate( make(2*i-1, d-1,p), > > i, > > make(2*i, d-1,p)); > > return p.allocate(i); > > } > > > > Node *make(int i, int d, NodePool& p) > > { > > if (d > 0) > > return p.construct( make(2*i-1, d-1,p), > > i, > > make(2*i, d-1,p)); > > return p.construct(i); > > } > > > > int main(int argc, char *argv[]) > > { > > int min_depth = 4; > > int max_depth = std::max(min_depth+2, > > (argc == 2 ? atoi(argv[1]) : 10)); > > int stretch_depth = max_depth+1; > > > > // Alloc then dealloc stretchdepth tree > > { > > Pool<Node> c_pool; > > //NodePool c_pool; > > Node* c (make(0, stretch_depth,c_pool)); > > std::cout << "stretch tree of depth " << stretch_depth << "\t " > > << "check: " << c->check() << '\n'; > > } > > > > Pool<Node> long_lived_tree_pool; > > //NodePool long_lived_tree_pool; > > Node* long_lived_tree(make(0, max_depth,long_lived_tree_pool)); > > > > char *outputstr = (char*)malloc(LINE_SIZE * (max_depth +1) * sizeof(char)); > > > > #pragma omp parallel for > > for (int d = min_depth; d <= max_depth; d += 2) > > { > > int iterations = 1 << (max_depth - d + min_depth); > > int c = 0; > > > > for (int i = 1; i <= iterations; ++i) > > { > > Pool<Node> pool; > > //NodePool pool; > > Node *a(make(i, d,pool)), *b(make(-i, d,pool)); > > c += a->check() + b->check(); > > } > > > > sprintf(outputstr + LINE_SIZE * d, "%d\t trees of depth %d\t check: %d\n", (2 * iterations), d, c); > > } > > > > for (int d = min_depth; d <= max_depth; d += 2) > > printf("%s", outputstr + (d * LINE_SIZE) ); > > free(outputstr); > > > > std::cout << "long lived tree of depth " << max_depth << "\t " > > << "check: " << (long_lived_tree->check()) << '\n'; > > > > return 0; > > } On Saturday, July 28, 2012 7:20:04 PM UTC-4, Melzzzzz wrote: > I have played a little tree with benchmark from computer language game > > site: > > http://shootout.alioth.debian.org/u6...st=binarytrees > > > > Interested in gc vs manual I have been very surprised with Java > > performance ;) > > Seems that Java gc is blazingly fast in comparison with standard > > new/delete. > > I think this is because gc deallocates in chunks rather piece by > > piece, perhaps allocates from pool. Also there are no destructors > > calls. > > fastest C++ program on this site uses boost pool which calls > > destructors and can deallocate, therefore, is slower then C variant > > which uses apache's pool, which just deallocate. > > > > So I have coded simple custom pool that does not keeps track of > > allocated objects and doesn't call destructors and result is > > that it is slightly faster than C ;) > > > > What is clear is that Java garbage collector cannot be bitten > > in this case if manual deletion of objects is required. > > (as seen here) > > http://shootout.alioth.debian.org/u6...st=binarytrees > > > > Timings for cpp with my custom alloc > > bmaxa@maxa:~/shootout/trees$ time ./binarytreescpp 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m4.188s > > user 0m7.036s > > sys 0m0.112s > > bmaxa@maxa:~/shootout/trees$ time taskset -c 0 ./binarytreescpp 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m7.077s > > user 0m6.976s > > sys 0m0.084s > > > > Timings for C with apache pool: > > bmaxa@maxa:~/shootout/trees$ time ./binarytreesc 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m4.625s > > user 0m8.085s > > sys 0m0.140s > > bmaxa@maxa:~/shootout/trees$ time taskset -c 0 ./binarytreesc 20 > > stretch tree of depth 21 check: -1 > > 2097152 trees of depth 4 check: -2097152 > > 524288 trees of depth 6 check: -524288 > > 131072 trees of depth 8 check: -131072 > > 32768 trees of depth 10 check: -32768 > > 8192 trees of depth 12 check: -8192 > > 2048 trees of depth 14 check: -2048 > > 512 trees of depth 16 check: -512 > > 128 trees of depth 18 check: -128 > > 32 trees of depth 20 check: -32 > > long lived tree of depth 20 check: -1 > > > > real 0m8.009s > > user 0m7.888s > > sys 0m0.116s > > > > code follows if someone is interested: > > /* The Computer Language Benchmarks Game > > * http://shootout.alioth.debian.org/ > > * > > * contributed by Jon Harrop > > * modified by Alex Mizrahi > > * modified by Andreas Schäfer > > * very minor omp tweak by The Anh Tran > > * custom pool Branimir Maksimovic > > */ > > > > #include <iostream> > > #include <stdlib.h> > > #include <stdio.h> > > #include <vector> > > #include <new> > > #include <pthread.h> > > #include <boost/pool/object_pool.hpp> > > > > const size_t LINE_SIZE = 64; > > > > template <class T> > > class Pool{ > > static const unsigned NELEM = 10000*sizeof(T); > > public: > > Pool() > > : pos_(0) > > { > > pool_.push_back(alloc()); > > } > > Pool(const Pool&)=delete; > > Pool& operator=(const Pool&)=delete; > > ~Pool() > > { > > for(auto i : pool_) > > { > > free(i); > > } > > } > > template <class ...T1> > > T* allocate(T1... args) > > { > > if(NELEM <= pos_) > > { > > pool_.push_back(alloc()); > > pos_=0; > > } > > void* p = &pool_.back()[pos_]; > > pos_+=sizeof(T); > > return new(p)T(args...); > > } > > private: > > static char* alloc() > > { > > Inner* next = (Inner*)pthread_getspecific(k.key); > > if(next == nullptr) > > { > > next = (Inner*)::operator new(NELEM); > > next->next = nullptr; > > } > > char* p = (char*)next; > > next = next->next; > > pthread_setspecific(k.key,next); > > return p; > > } > > static void free(void* p) > > { > > Inner* next = (Inner*)pthread_getspecific(k.key); > > Inner* tmp = (Inner*)p; > > tmp->next = next; > > next = tmp; > > pthread_setspecific(k.key,next); > > } > > std::vector<char*> pool_; > > unsigned pos_; > > struct Inner{ > > Inner* next; > > }; > > static struct Key{ > > Key() > > { > > pthread_key_create(&key,destroy); > > } > > ~Key() > > { > > pthread_key_delete(key); > > } > > static void destroy(void* p) > > { > > Inner* next = (Inner*)p; > > while(next) > > { > > Inner* tmp = next->next; > > ::operator delete(next); > > next = tmp; > > } > > } > > pthread_key_t key; > > }k; > > }; > > > > template <class T> > > typename Pool<T>::Key Pool<T>::k; > > > > struct Node > > { > > Node *l, *r; > > int i; > > > > Node(int i2) :l(0),r(0), i(i2) > > {} > > Node(Node *l2, int i2, Node *r2) : l(l2), r(r2), i(i2) > > {} > > Node(const Node&)=delete; > > Node& operator=(const Node&)=delete; > > int check() const > > { > > if (l) > > return l->check() + i - r->check(); > > else return i; > > } > > }; > > > > typedef boost::object_pool<Node> NodePool; > > > > Node *make(int i, int d, Pool<Node>& p) > > { > > if (d > 0) > > return p.allocate( make(2*i-1, d-1,p), > > i, > > make(2*i, d-1,p)); > > return p.allocate(i); > > } > > > > Node *make(int i, int d, NodePool& p) > > { > > if (d > 0) > > return p.construct( make(2*i-1, d-1,p), > > i, > > make(2*i, d-1,p)); > > return p.construct(i); > > } > > > > int main(int argc, char *argv[]) > > { > > int min_depth = 4; > > int max_depth = std::max(min_depth+2, > > (argc == 2 ? atoi(argv[1]) : 10)); > > int stretch_depth = max_depth+1; > > > > // Alloc then dealloc stretchdepth tree > > { > > Pool<Node> c_pool; > > //NodePool c_pool; > > Node* c (make(0, stretch_depth,c_pool)); > > std::cout << "stretch tree of depth " << stretch_depth << "\t " > > << "check: " << c->check() << '\n'; > > } > > > > Pool<Node> long_lived_tree_pool; > > //NodePool long_lived_tree_pool; > > Node* long_lived_tree(make(0, max_depth,long_lived_tree_pool)); > > > > char *outputstr = (char*)malloc(LINE_SIZE * (max_depth +1) * sizeof(char)); > > > > #pragma omp parallel for > > for (int d = min_depth; d <= max_depth; d += 2) > > { > > int iterations = 1 << (max_depth - d + min_depth); > > int c = 0; > > > > for (int i = 1; i <= iterations; ++i) > > { > > Pool<Node> pool; > > //NodePool pool; > > Node *a(make(i, d,pool)), *b(make(-i, d,pool)); > > c += a->check() + b->check(); > > } > > > > sprintf(outputstr + LINE_SIZE * d, "%d\t trees of depth %d\t check: %d\n", (2 * iterations), d, c); > > } > > > > for (int d = min_depth; d <= max_depth; d += 2) > > printf("%s", outputstr + (d * LINE_SIZE) ); > > free(outputstr); > > > > std::cout << "long lived tree of depth " << max_depth << "\t " > > << "check: " << (long_lived_tree->check()) << '\n'; > > > > return 0; > > } |

Re: bench: binary treesOn Sat, 28 Jul 2012 16:31:01 -0700 (PDT)
W Karas <wkaras@yahoo.com> wrote: > On Saturday, July 28, 2012 7:20:04 PM UTC-4, Melzzzzz wrote: > > I have played a little tree with benchmark from computer language > > game > > > > site: > > > > http://shootout.alioth.debian.org/u6...st=binarytrees > > > > > > > > Interested in gc vs manual I have been very surprised with Java > > > > performance ;) > > > > Seems that Java gc is blazingly fast in comparison with standard > > > > new/delete. > > Is it possible that the benchmark never causes GC to run? Is so, > that would also mean any time it took to run destructors was also > ignored in the Java case. > GC runs heavily in this case as memory will be exhausted pretty quickly if not. Take for example comparison between C on single CPU and Java (non threaded version, as threaded version suffers some on single CPU). C uses fast pool from apache while Java uses GC! What is also interesting is that with openmp it is actually easier to parallelize with C/C++ than Java ;) bmaxa@maxa:~/shootout/trees$ time java binarytrees 20 stretch tree of depth 21 check: -1 2097152 trees of depth 4 check: -2097152 524288 trees of depth 6 check: -524288 131072 trees of depth 8 check: -131072 32768 trees of depth 10 check: -32768 8192 trees of depth 12 check: -8192 2048 trees of depth 14 check: -2048 512 trees of depth 16 check: -512 128 trees of depth 18 check: -128 32 trees of depth 20 check: -32 long lived tree of depth 20 check: -1 real 0m8.537s user 0m8.949s sys 0m0.236s bmaxa@maxa:~/shootout/trees$ time taskset -c 0 ./binarytreesc 20 stretch tree of depth 21 check: -1 2097152 trees of depth 4 check: -2097152 524288 trees of depth 6 check: -524288 131072 trees of depth 8 check: -131072 32768 trees of depth 10 check: -32768 8192 trees of depth 12 check: -8192 2048 trees of depth 14 check: -2048 512 trees of depth 16 check: -512 128 trees of depth 18 check: -128 32 trees of depth 20 check: -32 long lived tree of depth 20 check: -1 real 0m8.054s user 0m7.936s sys 0m0.112s |

Re: bench: binary treesMelzzzzz <mel@zzzzz.com> wrote:
> Seems that Java gc is blazingly fast in comparison with standard > new/delete. Yes, new/delete in C++ is very slow (as well as malloc()/free() in C). That's the reason why a savvy C++ programmer will avoid doing tons of them if it's possible to avoid them. Depending on the algorith, this is, in fact, a relatively frequent occurrence (ie. that you can avoid using tons of news and deletes). |

Re: bench: binary treesOn Sunday, July 29, 2012 12:20:04 AM UTC+1, Melzzzzz wrote:
> I have played a little tree with benchmark from computer language game > site: > http://shootout.alioth.debian.org/u6...st=binarytrees > Interested in gc vs manual I have been very surprised with Java > performance ;) Beware of any benchmark you haven't falsified yourself. > Seems that Java gc is blazingly fast in comparison with standard > new/delete. Typically, the classical algorithms for garbage collection will be faster than manual allocation in some applications, slower than manual in other---the classical gc algorithms shine when there are a lot of small, short lived objects, for example. So people interested in showing garbage collection in its best light write benchmarks which make use of a lot of small, short-lived objects: trees and graphs are very popular for this. If your application makes extensive use of trees or graphs with a large number of very small nodes, you'd benefit, performance-wise, by using garbage collection in C++. (The Boehm collector works very well, for example.) Of course, the real argument for garbage collection isn't performance; it's safety. There's no way to get a dangling pointer (to dynamically allocated memory) with garbage collection. So when you destruct an object, you can mark it as destructed, and be sure that it will stay marked until there are no pointers to it. > I think this is because gc deallocates in chunks rather piece by > piece, perhaps allocates from pool. Also there are no destructors > calls. > fastest C++ program on this site uses boost pool which calls > destructors and can deallocate, therefore, is slower then C variant > which uses apache's pool, which just deallocate. > So I have coded simple custom pool that does not keeps track of > allocated objects and doesn't call destructors and result is > that it is slightly faster than C ;) > What is clear is that Java garbage collector cannot be bitten > in this case if manual deletion of objects is required. I'm not sure what you mean by "manual deletion". If the objects have a non-trivial destructor even in the absense of memory management issues, then it must be called. Regardless of the language. (Not calling it is a frequent error in Java code.) -- James |

Re: bench: binary treesJames Kanze <james.kanze@gmail.com> wrote:
> Of course, the real argument for garbage collection isn't > performance; it's safety. There's no way to get a dangling > pointer (to dynamically allocated memory) with garbage > collection. So when you destruct an object, you can mark it as > destructed, and be sure that it will stay marked until there are > no pointers to it. Depends on your definition of "safety". Garbage collection makes the execution of destructors ("finalizers") non-deterministic and, in fact, they might not be executed *at all* in Java. Thus destructors are basically useless. AFAIK Java does not offer any alternative (other than C-style manual resource management, other than memory). |

Re: bench: binary treesOn Sunday, July 29, 2012 6:48:28 PM UTC+1, Juha Nieminen wrote:
> James Kanze <james.kanze@gmail.com> wrote: > > Of course, the real argument for garbage collection isn't > > performance; it's safety. There's no way to get a dangling > > pointer (to dynamically allocated memory) with garbage > > collection. So when you destruct an object, you can mark it as > > destructed, and be sure that it will stay marked until there are > > no pointers to it. > Depends on your definition of "safety". Garbage collection makes the > execution of destructors ("finalizers") non-deterministic and, in fact, > they might not be executed *at all* in Java. Thus destructors are > basically useless. AFAIK Java does not offer any alternative (other > than C-style manual resource management, other than memory). I don't think you understand how garbage collection works. Finalizers have nothing to do with destructors; destructors work exactly as they always have, even with garbage collection. Garbage collection has no impact with regards to destructors. Note that the safety aspect doesn't mean that you don't "delete" objects. What it means, above all, is that you don't reuse "deleted" memory until there are no more pointers to it. So if you somehow mark the memory as "deleted", you can detect any attempt to use it---that mark won't be overwritten by a later allocation. And that if you forget too delete the object before the last pointer to it disappears, the finalize method can tell you. There's also a convenience aspect, that you don't need to destruct objects with trivial destructors, and that most destructors do become trivial once they don't have to worry about memory management. But this is secondary to the safety issue. -- James |

Re: bench: binary treesJames Kanze <james.kanze@gmail.com> wrote:
>> Depends on your definition of "safety". Garbage collection makes the >> execution of destructors ("finalizers") non-deterministic and, in fact, >> they might not be executed *at all* in Java. Thus destructors are >> basically useless. AFAIK Java does not offer any alternative (other >> than C-style manual resource management, other than memory). > > I don't think you understand how garbage collection works. > Finalizers have nothing to do with destructors; destructors work > exactly as they always have, even with garbage collection. > Garbage collection has no impact with regards to destructors. You are right. I don't understand what you are talking about. I don't even understand what you mean by "finalizer" and "destructor" here. Java classes have destructors, but due to GC they get called some time in the future, maybe not at all. That makes them practically useless. You better not rely on them to free important non-memory resources. And Java doesn't really provide any alternative either. |

Re: bench: binary treesOn Tuesday, July 31, 2012 6:21:35 AM UTC+1, Juha Nieminen wrote:
> James Kanze <james.kanze@gmail.com> wrote: > >> Depends on your definition of "safety". Garbage collection makes the > >> execution of destructors ("finalizers") non-deterministic and, in fact, > >> they might not be executed *at all* in Java. Thus destructors are > >> basically useless. AFAIK Java does not offer any alternative (other > >> than C-style manual resource management, other than memory). > > I don't think you understand how garbage collection works. > > Finalizers have nothing to do with destructors; destructors work > > exactly as they always have, even with garbage collection. > > Garbage collection has no impact with regards to destructors. > You are right. I don't understand what you are talking about. I don't > even understand what you mean by "finalizer" and "destructor" here. > Java classes have destructors, Java classes don't have destructors. At least not according to the Java language specification. In practice, "destructors" (in the largest sense of the word) are essential for some classes. There is a relatively widespread (but far from universal) convention in Java to call them "dispose": if your class has a "dispose" function, clients are expected to call it whenever the lifetime of the object ends (logically, of course). Typically using a try ... finally, since the language has no mechanism for automating calling it. > but due to GC they get called some time in the future, maybe > not at all. That makes them practically useless. Those aren't destructors, but finalizer methods. And they can be useful: if the object requires destruction (most don't), you assert that the destructor has been called. > You better not rely on them to free important non-memory resources. Of course not. That's not what they're there for, > And Java doesn't really provide any alternative either. The "alternative" is try ... finally. In my opinion, not a particularly good alternative, since it leaves the responsibility in the hands of the client, rather than having the author of the class enforce it. But the argument try...finally vs. destructors is orthogonaly to arguments concerning garbage collection. -- James |

Re: bench: binary treesJames Kanze <james.kanze@gmail.com> wrote:
> The "alternative" is try ... finally. In my opinion, not a > particularly good alternative, since it leaves the > responsibility in the hands of the client, rather than having > the author of the class enforce it. Besides, that only works for objects that do not outlive the scope where they are created, nor are shared. (In other words, it works only for strictly scope-bound, non-shared objects.) Since it's next to impossible to implement any kind of reference counting in Java (because you can't overload a reference assignment nor destruction), and it's overall difficult to know if an object is being currently shared or not, it's likewise very difficult to implement any kind of automatic "when the last reference to this object dies, execute this function". |

All times are GMT. The time now is 05:04 PM. |

Powered by vBulletin®. Copyright ©2000 - 2014, vBulletin Solutions, Inc.

SEO by vBSEO ©2010, Crawlability, Inc.