Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   C Programming (http://www.velocityreviews.com/forums/f42-c-programming.html)
-   -   Good binary PRNG (http://www.velocityreviews.com/forums/t442765-good-binary-prng.html)

 pinkfloydhomer@gmail.com 05-15-2006 09:34 AM

Good binary PRNG

Using rand() in and old version og gcc, and using Tausworth's method, I
calculated the frequency of 0 or 1 in the first digit like this:

int hist[2] = {0,0};

for (i = 0; i < 100000; ++i)
{
++hist[prng() % 2];
}

For some reason, in both cases, 0 occurs more often than 1, no matter
what seed I use to initialize the prng with. Something like 50150 vs
49850 in the above code.

Does anybody know of a good "binary" PRNG, that is, one that in the
above case would make something closer to 50000 vs 50000?

/David

 Flash Gordon 05-15-2006 11:04 AM

Re: Good binary PRNG

pinkfloydhomer@gmail.com wrote:
> Using rand() in and old version og gcc, and using Tausworth's method, I
> calculated the frequency of 0 or 1 in the first digit like this:
>
> int hist[2] = {0,0};
>
> for (i = 0; i < 100000; ++i)
> {
> ++hist[prng() % 2];
> }
>
> For some reason, in both cases, 0 occurs more often than 1, no matter
> what seed I use to initialize the prng with. Something like 50150 vs
> 49850 in the above code.
>
> Does anybody know of a good "binary" PRNG, that is, one that in the
> above case would make something closer to 50000 vs 50000?

I suggest you start with the comp.lang.c FAQ which as question 13.15 has
"I need a random number generator". You can find the FAQ at
http://c-faq.com/
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc

 Rod Pemberton 05-18-2006 12:07 AM

Re: Good binary PRNG

<pinkfloydhomer@gmail.com> wrote in message
> Using rand() in and old version og gcc, and using Tausworth's method, I
> calculated the frequency of 0 or 1 in the first digit like this:
>
> int hist[2] = {0,0};
>
> for (i = 0; i < 100000; ++i)
> {
> ++hist[prng() % 2];
> }
>
> For some reason, in both cases, 0 occurs more often than 1, no matter
> what seed I use to initialize the prng with. Something like 50150 vs
> 49850 in the above code.
>
> Does anybody know of a good "binary" PRNG, that is, one that in the
> above case would make something closer to 50000 vs 50000?
>

Many PRNG's (Mersienne Twister, ISO C, etc.) seem to hit a binary "sweet
spot" after calling rand() about 7 million times.

Try the following code with your compiler or PRNG:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

unsigned long hist[2] = {0,0};

int main(void)
{
unsigned long i, value;

srand(NULL);

for (i = 0; i < 7000000; ++i)
{
rand();
}
for (i = 0; i < 100000; ++i)
{
value=rand();
++hist[value%2];
}
printf("%ld %ld\n",hist[0],hist[1]);

return(EXIT_SUCCESS);
}

Rod Pemberton

 Keith Thompson 05-18-2006 12:46 AM

Re: Good binary PRNG

[...]
> srand(NULL);

srand(time(NULL));

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

 pete 05-18-2006 12:55 AM

Re: Good binary PRNG

Keith Thompson wrote:
>
> [...]
> > srand(NULL);

>
> srand(time(NULL));

A cast would supress warnings that might be generated
if time_t is higher ranking than unsigned.

srand((unsigned)time(NULL));

--
pete

 Ben Pfaff 05-18-2006 01:46 AM

Re: Good binary PRNG

Keith Thompson <kst-u@mib.org> writes:

> [...]
>> srand(NULL);

>
> srand(time(NULL));

This used to come up fairly often, so I have a stock answer on
it:

By default, the C random number generator produces the same
sequence every time the program is run. In order to generate
different sequences, it has to be "seeded" using srand() with a
unique value. The function below to do this is carefully
designed. It uses time() to obtain the current time; the
alternative clock() is a poor choice because it measures CPU time
used, which is often more or less constant among runs. The
actual value of a time_t is not portable, so it computes a "hash"
of the bytes in it using a multiply-and-add technique. The
factor used for multiplication normally comes out as 257, a prime
and therefore a good candidate.

References: Knuth, _The Art of Computer Programming, Vol. 2:
Seminumerical Algorithms_, section 3.2.1; Aho, Sethi, and Ullman,
_Compilers: Principles, Techniques, and Tools_, section 7.6.

#include <limits.h>
#include <stdlib.h>
#include <time.h>

/* Choose and return an initial random seed based on the current time.
Based on code by Lawrence Kirby <fred@genesis.demon.co.uk>.
Usage: srand (time_seed ()); */
unsigned
time_seed (void)
{
time_t timeval; /* Current time. */
unsigned char *ptr; /* Type punned pointed into timeval. */
unsigned seed; /* Generated seed. */
size_t i;

timeval = time (NULL);
ptr = (unsigned char *) &timeval;

seed = 0;
for (i = 0; i < sizeof timeval; i++)
seed = seed * (UCHAR_MAX + 2u) + ptr[i];

return seed;
}

--
"Large amounts of money tend to quench any scruples I might be having."
-- Stephan Wilms

 Eric Sosman 05-18-2006 03:26 AM

Re: Good binary PRNG

pinkfloydhomer@gmail.com wrote:
> Using rand() in and old version og gcc, and using Tausworth's method, I
> calculated the frequency of 0 or 1 in the first digit like this:
>
> int hist[2] = {0,0};
>
> for (i = 0; i < 100000; ++i)
> {
> ++hist[prng() % 2];
> }
>
> For some reason, in both cases, 0 occurs more often than 1, no matter
> what seed I use to initialize the prng with. Something like 50150 vs
> 49850 in the above code.
>
> Does anybody know of a good "binary" PRNG, that is, one that in the
> above case would make something closer to 50000 vs 50000?

Note that the 50150 you mention is less than one-sixth
of one percent greater than a perfect fifty-fifty split. How
much "closer" do you think you need to be? How much "closer"
do you think a random sequence *ought* to be?

Let's see: If the outcomes (1 or 0) are equiprobable and
you make 100000 independent trials, the probability of getting
exactly 50000 of each outcome is the binomial probability

100000!
--------------- * 2^(-100000)
50000! * 50000!

If I'm not mistaken (I might be; these numbers are BIG), this
comes to 0.0025231262141967+; you can only expect a "perfect"
split about once in four hundred trials of a hundred thousand
numbers each. How many times have you run your experiment?

Let's see, take two: The expected number of 1's is 50000.
The variance is 100000*0.5*0.5 = 25000, so the standard
deviation is a little more than 158. You report a discrepancy
of 150, which is less than one standard deviation away from
a "perfect" split. I ask again: How much closer do you think
a random sequence can be while still being "random?"

Flip a coin twice. If get something other than heads once
and tails once, do you conclude that the coin is biased?

--
Eric Sosman
esosman@acm-dot-org.invalid

 Keith Thompson 05-18-2006 04:17 AM

Re: Good binary PRNG

Eric Sosman <esosman@acm-dot-org.invalid> writes:
> pinkfloydhomer@gmail.com wrote:
>> Using rand() in and old version og gcc, and using Tausworth's method, I
>> calculated the frequency of 0 or 1 in the first digit like this:
>> int hist[2] = {0,0};
>> for (i = 0; i < 100000; ++i)
>> {
>> ++hist[prng() % 2];
>> }
>> For some reason, in both cases, 0 occurs more often than 1, no matter
>> what seed I use to initialize the prng with. Something like 50150 vs
>> 49850 in the above code.
>> Does anybody know of a good "binary" PRNG, that is, one that in the
>> above case would make something closer to 50000 vs 50000?

>
> Note that the 50150 you mention is less than one-sixth
> of one percent greater than a perfect fifty-fifty split. How
> much "closer" do you think you need to be? How much "closer"
> do you think a random sequence *ought* to be?

The difference very likely isn't significant, but it might be.

If a single run of 100000 trials differs from a 50/50 split by a
fraction of a percent, it's not just harmless, it's expected. But if
N runs of 100000 are *consistently* over 50%, there's probably a
systematic error. (I'm deliberately leaving the value of N vague.)

I suspect that the OP hasn't done enough runs to demonstrate an actual
bias, but I'm only guessing.

[...]

> Flip a coin twice. If get something other than heads once
> and tails once, do you conclude that the coin is biased?

No, but if I consistently get more heads than tails, it probably is
biased.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

 pete 05-18-2006 01:17 PM

Re: Good binary PRNG

Keith Thompson wrote:

> If a single run of 100000 trials differs from a 50/50 split by a
> fraction of a percent, it's not just harmless, it's expected. But if
> N runs of 100000 are *consistently* over 50%, there's probably a
> systematic error. (I'm deliberately leaving the value of N vague.)
>
> I suspect that the OP hasn't done enough runs to demonstrate an actual
> bias, but I'm only guessing.
>
> [...]
>
> > Flip a coin twice. If get something other than heads once
> > and tails once, do you conclude that the coin is biased?

>
> No, but if I consistently get more heads than tails, it probably is
> biased.

If you consistently get heads and tails alternating,
that's not very good either.

1, 2300471611
0, 3253914372
1, 1858424313
0, 116628778
1, 2369750119
0, 3657615680
1, 2176380933
0, 894631110
1, 3876978771
0, 721092924

/* BEGIN new.c */

#include <stdio.h>

#define LU_RAND(S) ((S) * 69069 + 362437 & 0XFFFFFFFFLU)

int main(void)
{
long unsigned lu_seed = 12345678LU;
int x;

for (x = 0; x != 10; ++x) {
lu_seed = LU_RAND(lu_seed);
printf("%lu, %lu\n", lu_seed % 2, lu_seed);
}
return 0;
}

/* END new.c */

--
pete

 CBFalconer 05-18-2006 01:48 PM

Re: Good binary PRNG

Keith Thompson wrote:
> Eric Sosman <esosman@acm-dot-org.invalid> writes:
>

.... snip ...
>
>> Flip a coin twice. If get something other than heads once
>> and tails once, do you conclude that the coin is biased?

>
> No, but if I consistently get more heads than tails, it probably
> is biased.

Look up the chi-square test.

--