Velocity Reviews > Java > Re: Shootout: the benchmarks game.

# Re: Shootout: the benchmarks game.

Juha Nieminen
Guest
Posts: n/a

 04-21-2008
I'm curious to know how fast this program would be when translated to
Java. In my computer (3.4GHz Pentium4) using gcc 4.1.2 (with the options
-O3 -march=pentium4 -fomit-frame-pointer) it takes 1 min and 17 seconds,
while taking 252 MB of memory (according to 'top').

The task the program solves is in the comment. (Note that I'm not
really asking how fast the *problem* can be solved in Java, but how fast
it can be solved by using this same algorithm/technique.)

/*
Find the largest n for which:

1) n is a prime.
2) The nth prime is smaller than 4200000000.

As the answer, print both n and the nth prime.
And while you are at it, also print the amount of
primes smaller than 4200000000.
*/

#include <bitset>
#include <iostream>

namespace
{
const unsigned MAX_VALUE = 4200000000U;
std::bitset<MAX_VALUE/2> prime;

// http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
void init()
{
prime.set();
for(unsigned n = 3; n*n < MAX_VALUE; n += 2)
if(prime.test(n/2))
for(unsigned index = n*n; index < MAX_VALUE; index += n)
if(index % 2)
prime.reset(index/2);
}
}

#include <ctime>

int main()
{
const clock_t t1 = std::clock();
init();

unsigned count = prime.count(), p = MAX_VALUE+1;

std::cout << "There are " << count << " primes smaller than "
<< MAX_VALUE << std::endl;

while(true)
{
do p -= 2; while(!prime.test(p/2));
if(count%2 && prime.test(count/2)) break;
--count;
}

const int seconds = (std::clock() - t1)/CLOCKS_PER_SEC;
std::cout << count << "\n" << p << std::endl
<< seconds/60 << " mins " << seconds%60 << " s\n";
}

Juha Nieminen
Guest
Posts: n/a

 04-21-2008
Juha Nieminen wrote:
> for(unsigned index = n*n; index < MAX_VALUE; index += n)
> if(index % 2)
> prime.reset(index/2);

Actually that can be replaced with:

for(unsigned index = n*n; index < MAX_VALUE; index += n*2)
prime.reset(index/2);

OTOH, the speedup is rather small. The runtime was reduced to 1 min
and 13 seconds.

Juha Nieminen
Guest
Posts: n/a

 04-21-2008
Razii wrote:
> I am going to use MAX_VALUE that stays within the bounds of int (that
> also means the index shouldn't go negative also in this loop)

I think you can safely use a MAX_VALUE of 2000000000 (and in fact a
bit higher) without running into problems.

Steve Wampler
Guest
Posts: n/a

 04-21-2008
Razii wrote:
> Another problem is that this benchmark really depends on how the
> Bitsets class is implemented in java. Perhaps C++ library implements
> it in more efficient way.

Definitely. A Java BitSet instance is not fixed in size and can
grow and shrink - and there's quite a bit of code that results from
this. (Clearing a bit, for example, results in a recalculation of the
number of words required to hold the remaining bits - something that
only pays off if you're clearing the rightmost 'on' bits in the set)!
The emphasis of the implementation appears to be more on efficient
space use over the cost of bitwise operations.

Given that this benchmark is really only using a bitset as a (sorta)
space efficient mapping of numbers to primality and makes very
little use of the other bitset features in either C++ or Java, it would
make sense to implement your own more primitive mapping facility
("FixedSizeBitMap", perhaps).

--
Steve Wampler -- http://www.velocityreviews.com/forums/(E-Mail Removed)
The gods that smiled on your birth are now laughing out loud.

Juha Nieminen
Guest
Posts: n/a

 04-21-2008
Steve Wampler wrote:
> Given that this benchmark is really only using a bitset as a (sorta)
> space efficient mapping of numbers to primality and makes very
> little use of the other bitset features in either C++ or Java

There's one additional thing where a bitset feature is used
effetively: To count the number of set bits.

I measured how long it takes for the gcc implementation to do so. The
total amount of bits is 2100000000 (ie. half of the max value), from
which 198996103 are 1-bits (ie. the amount of primes smaller than
4200000000). That is, approximately 10% of (quite random) bits are on.

It takes my computer approximately 0.6 seconds to count the number of
1-bits in that bitset. That's approximately one bit test per clock cycle
in average. I'd call that efficient.

Juha Nieminen
Guest
Posts: n/a

 04-22-2008
Razii wrote:
> On Mon, 21 Apr 2008 22:56:24 +0300, Juha Nieminen
> <(E-Mail Removed)> wrote:
>
>> It takes my computer approximately 0.6 seconds to count the number of
>> 1-bits in that bitset. That's approximately one bit test per clock cycle
>> in average. I'd call that efficient.

>
> Are you on 32-bit OS or 64-bit? If 64-bit, can you compare java vs C++
> version? BitSet class uses long arrays of "words". Just make sure to
> use -server and enough Xmx value that it doesn't throw out of memory
> exception.

Pentium4 is a 32-bit system. Unfortunately I don't have javac
installed here.

(Btw, if you are curious to know how it's even possible for that
function to be so fast, AFAIK it uses this algorithm:
http://en.wikipedia.org/wiki/Hamming_weight )

Steve Wampler
Guest
Posts: n/a

 04-23-2008
Juha Nieminen wrote:
> (Btw, if you are curious to know how it's even possible for that
> function to be so fast, AFAIK it uses this algorithm:
> http://en.wikipedia.org/wiki/Hamming_weight )

That's most likely the same as in Java's Long/Integer.bitCount()
methods, I think.

The BitSet in Java suffers from the use of longs when run in 32bit
mode, for operations on individual bits. Long operations such as
bit shifting are double-word operations in a 32bit world.

It would be more efficient in this case to use a (simplified) bit
set class that uses ints internally instead of longs. Perhaps
considerably so for the current benchmark.

--
Steve Wampler -- (E-Mail Removed)
The gods that smiled on your birth are now laughing out loud.