Velocity Reviews > C++ > Problem with rand() % range+1

# Problem with rand() % range+1

Rafael Cunha de Almeida
Guest
Posts: n/a

 08-25-2008
Hi,

I've found several sites on google telling me that I shouldn't use

rand() % range+1

and I should, instead, use something like:

lowest+int(range*rand()/(RAND_MAX + 1.0))

They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?

peter koch
Guest
Posts: n/a

 08-25-2008
On 25 Aug., 22:35, Rafael Cunha de Almeida <(E-Mail Removed)>
wrote:
> Hi,
>
> I've found several sites on google telling me that I shouldn't use
>
> * * * * rand() % range+1
>
> and I should, instead, use something like:
>
> * * * * lowest+int(range*rand()/(RAND_MAX + 1.0))
>
> They fail to make it clear why. There seems to be less randomness in the
> lower bits or something like that. I don't know what less randomness
> would mean. Would it not be normally distributed or something like that?

Take an example of a random generator delivering numbers in the range
0..99. If you want numbers in the range 0..79, your solution would
give an expectancy of numbers 0..19 twice as high as the range 20..79.
So your method is bad if you require an even distribution (but might
be ok if you just want somewhat random distribution as eg for a game).
The other method is better, but not perfect.

/Peter

peter koch
Guest
Posts: n/a

 08-25-2008
On 25 Aug., 23:15, Pete Becker <(E-Mail Removed)> wrote:
> On 2008-08-25 17:00:48 -0400, peter koch <(E-Mail Removed)> said:
>
>
>
>
>
> > On 25 Aug., 22:35, Rafael Cunha de Almeida <(E-Mail Removed)>
> > wrote:
> >> Hi,

>
> >> I've found several sites on google telling me that I shouldn't use

>
> >> * * * * rand() % range+1

>
> >> and I should, instead, use something like:

>
> >> * * * * lowest+int(range*rand()/(RAND_MAX + 1.0))

>
> >> They fail to make it clear why. There seems to be less randomness in the
> >> lower bits or something like that. I don't know what less randomness
> >> would mean. Would it not be normally distributed or something like that?

>
> > Take an example of a random generator delivering numbers in the range
> > 0..99. If you want numbers in the range 0..79, your solution would
> > give an expectancy of numbers 0..19 twice as high as the range 20..79.
> > So your method is bad if you require an even distribution (but might
> > be ok if you just want somewhat random distribution as eg for a game).
> > The other method is better, but not perfect.

>
> Well, the other method is better in that it makes the non-randomness
> less obvious. <g> But with the same sample ranges it still produces
> twenty values twice as often as the rest; it's just that they're not
> the twenty lowest ones any more.
>

You are right - the other method also produces twenty numbers twice as
frequent as the first one - just spread evenly. To my embarassment, I
must admit that I did not realise this when I replied. So much for
perfection!

/Peter

acehreli@gmail.com
Guest
Posts: n/a

 08-25-2008
On Aug 25, 2:05 pm, Pete Becker <(E-Mail Removed)> wrote:

> The number of values that you start with has to be an exact multiple of
> the number of values that you want to get. The way to do that is to
> throw away some values from rand(), namely, those that exceed the
> maximum multiple of range that's less than RAND_MAX+1. Like this:
>
> int mult = (RAND_MAX + 1) / range;

There is a problem if RAND_MAX equals INT_MAX. Adding one would wrap
the value and mult would become zero. So RAND_MAX could be casted to a
suitable type and I think 'unsigned int' works. Or an unsigned 1 would
work:

int mult = (RAND_MAX + 1u) / range;

Andrew Koenig on clc++m posts.

> int max = range * mult;
> int temp = rand();
> while (max < temp)
> temp = rand();
> return temp % range;

Ali

zaimoni@zaimoni.com
Guest
Posts: n/a

 08-25-2008
On Aug 25, 3:35 pm, Rafael Cunha de Almeida <(E-Mail Removed)>
wrote:
> Hi,
>
> I've found several sites on google telling me that I shouldn't use
>
> rand() % range+1
>
> and I should, instead, use something like:
>
> lowest+int(range*rand()/(RAND_MAX + 1.0))

Or function doing roughly the equivalent using integer math, which
might be useful if range is large (to avoid signed integer overflow).

> They fail to make it clear why. There seems to be less randomness in the
> lower bits or something like that. I don't know what less randomness
> would mean. Would it not be normally distributed or something like that?

If rand() is implemented with a linear congruential generator f(x)=ax
+b mod m, then it is a relatively easy automated check that the lower
bits are in general highly correlated (given the first, the second is
somewhat predictable). Directly verifying algebraically is harder,
but a standard homework exercise in the field. The lowest separation
required to demonstrate correlation is usually two through six
invocations apart.

More sophisticated algorithms are harder to verify, but reputedly have
similar problems unless explicitly designed otherwise. [As an extreme
case, the Mersenne Twister has been formally verified to not to be
correlated across 637 consecutive invocations.]

It's moderately unusual to use a high-quality RNG to implement rand().

The overall results would still be as approximately uniformly
distributed as rand(), except for unavoidable discretization errors as
mentioned in earlier replies (which wouldn't be that large for small
moduli since RAND_MAX should be at least 32767).

zaimoni@zaimoni.com
Guest
Posts: n/a

 08-26-2008
On Aug 25, 4:47 pm, (E-Mail Removed) wrote:

> More sophisticated algorithms are harder to verify, but reputedly have
> similar problems unless explicitly designed otherwise. [As an extreme
> case, the Mersenne Twister has been formally verified to not to be
> correlated across 637 consecutive invocations.]

Wrong: 623 consecutive invocations. Please accept my apologies.

Jorgen Grahn
Guest
Posts: n/a

 09-08-2008
On Mon, 25 Aug 2008 14:47:03 -0700 (PDT), http://www.velocityreviews.com/forums/(E-Mail Removed) <(E-Mail Removed)> wrote:
> On Aug 25, 3:35 pm, Rafael Cunha de Almeida <(E-Mail Removed)>
> wrote:
>> Hi,
>>
>> I've found several sites on google telling me that I shouldn't use
>>
>> rand() % range+1

....
>> They fail to make it clear why. There seems to be less randomness in the
>> lower bits or something like that. I don't know what less randomness
>> would mean. Would it not be normally distributed or something like that?

>
> If rand() is implemented with a linear congruential generator f(x)=ax
> +b mod m, then it is a relatively easy automated check that the lower
> bits are in general highly correlated (given the first, the second is
> somewhat predictable).

....
> It's moderately unusual to use a high-quality RNG to implement rand().

"High-quality" is a bit too vague here. Are you saying that rand() is
still broken in the low-order bits in many libraries? (Yeah, "broken"
is also vague, but you know what I mean.)

The Linux rand(3) manual says this:

The versions of rand() and srand() in the Linux C Library use
the same random number generator as random() and srandom(), so
the lower-order bits should be as random as the higher-order
bits. However, on older rand() implementations, and on current
implementations on different systems, the lower-order bits are
much less random than the higher- order bits. Do not use this
function in applications intended to be portable when good
randomness is needed.

but I have never seen anyone explicitly list over current
implementation with this problem. I kind of assumed they were gone by
now. Or at least, I wouldn't be surprised if they were, and what used
to be a fact ended up as urban legent.

<(E-Mail Removed)> back in 2003.

/Jorgen

--
// Jorgen Grahn <grahn@ Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se> R'lyeh wgah'nagl fhtagn!

Jorgen Grahn
Guest
Posts: n/a

 09-08-2008
On Mon, 25 Aug 2008 14:47:03 -0700 (PDT), (E-Mail Removed) <(E-Mail Removed)> wrote:
> On Aug 25, 3:35 pm, Rafael Cunha de Almeida <(E-Mail Removed)>
> wrote:
>> Hi,
>>
>> I've found several sites on google telling me that I shouldn't use
>>
>> rand() % range+1

....
>> They fail to make it clear why. There seems to be less randomness in the
>> lower bits or something like that. I don't know what less randomness
>> would mean. Would it not be normally distributed or something like that?

>
> If rand() is implemented with a linear congruential generator f(x)=ax
> +b mod m, then it is a relatively easy automated check that the lower
> bits are in general highly correlated (given the first, the second is
> somewhat predictable).

....
> It's moderately unusual to use a high-quality RNG to implement rand().

"High-quality" is a bit too vague here. Are you saying that rand() is
still broken in the low-order bits in many libraries? (Yeah, "broken"
is also vague, but you know what I mean.)

The Linux rand(3) manual says this:

The versions of rand() and srand() in the Linux C Library use
the same random number generator as random() and srandom(), so
the lower-order bits should be as random as the higher-order
bits. However, on older rand() implementations, and on current
implementations on different systems, the lower-order bits are
much less random than the higher- order bits. Do not use this
function in applications intended to be portable when good
randomness is needed.

but I have never seen anyone explicitly list over current
implementation with this problem. I kind of assumed they were gone by
now. Or at least, I wouldn't be surprised if they were, and what used
to be a fact ended up as urban legent.

<(E-Mail Removed)> back in 2003.

/Jorgen

--
// Jorgen Grahn <grahn@ Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se> R'lyeh wgah'nagl fhtagn!