Velocity Reviews > Perl > Using rand(), how do I avoid repeats?

# Using rand(), how do I avoid repeats?

Scott Stark
Guest
Posts: n/a

 07-21-2004
I'm using rand() to generate a series of random numbers. Is there any
way to prevent the same number from being used more than once? All I
can think of is keeping a list of used numbers, and having some kind
of "if used, next" statement, which seems lame. (Theoretically that
could keep going forever, if it randomly kept selecting the same
numbers.) Any thoughts?

thanks
Scott

thundergnat
Guest
Posts: n/a

 07-21-2004
Scott Stark wrote:

> I'm using rand() to generate a series of random numbers. Is there any
> way to prevent the same number from being used more than once? All I
> can think of is keeping a list of used numbers, and having some kind
> of "if used, next" statement, which seems lame. (Theoretically that
> could keep going forever, if it randomly kept selecting the same
> numbers.) Any thoughts?
>
> thanks
> Scott

Sounds like you want some variation of a shuffle. Take a look at
Algorithm::Numerical::Shuffle on CPAN.

Generate an array of numbers, shuffle it, then shift off how many items
you need.
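A minimal sketch of that approach, assuming List::Util's shuffle (in core since Perl 5.8) in place of the CPAN module named above; the range 1..10 and the count of 4 are just placeholders:

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

my @deck   = shuffle(1 .. 10);        # placeholder range of candidates
my @picked = splice(@deck, 0, 4);     # take the first 4; all guaranteed unique

print "@picked\n";
```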

Randal L. Schwartz
Guest
Posts: n/a

 07-21-2004
>>>>> "Scott" == Scott Stark <(E-Mail Removed)> writes:

Scott> I'm using rand() to generate a series of random numbers. Is
Scott> there any way to prevent the same number from being used more
Scott> than once? All I can think of is keeping a list of used
Scott> numbers, and having some kind of "if used, next" statement,
Scott> which seems lame. (Theoretically that could keep going forever,
Scott> if it randomly kept selecting the same numbers.) Any thoughts?

You probably need to deal from a deck then.

use List::Util qw(shuffle); # might need this from CPAN
my @shuffled;

while (time passes) {
    @shuffled = shuffle(1..100) unless @shuffled;
    my $number = shift @shuffled;

    etc
}

print "Just another Perl hacker,"; # the original

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<(E-Mail Removed)> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

NY Tennessean
Guest
Posts: n/a

 07-22-2004
(E-Mail Removed) (Scott Stark) wrote in message news:<(E-Mail Removed). com>...
> I'm using rand() to generate a series of random numbers. Is there any
> way to prevent the same number from being used more than once? All I
> can think of is keeping a list of used numbers, and having some kind
> of "if used, next" statement, which seems lame. (Theoretically that
> could keep going forever, if it randomly kept selecting the same
> numbers.) Any thoughts?

Keeping a hash of 'used' numbers could indeed be lame in that
performance would degrade as the percentage of numbers used became
higher and higher.

A better approach might be to start with an array (hash?) with an
entry for every number in the acceptable range of random numbers,
where the key and the value are identical to begin with. Here is an
example representation after this array has been created:

$random[1] = 1;
$random[2] = 2;
$random[3] = 3;
# etcetera
$random[16] = 16; # range max.

The first time, you choose from a range of 1 to 16; let's say 3 is the
random number. Then you could splice together two array slices, so
that the 'higher' half moves down into the spot you just eliminated:

$random[1] = 1;
$random[2] = 2;

#$random[3] = 3;

$random[3] = 4;
$random[4] = 5;
# etcetera
$random[15] = 16;

The next time you will choose a random number from 1 to 15: the new
maximum. If it just happens to return 15, for instance, you would
then get the value specified by that array index: 16.

There are a few gotchas, such as being sure not to try to create more
random numbers than are in the original range, and managing the array
minimum and maximum values as you continue to slice and splice.
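The slice-and-splice idea can also be written with Perl's built-in splice, which removes the chosen element and closes the gap in one call; this sketch follows the 1..16 example above and draws the whole range:

```perl
use strict;
use warnings;

my @random = (1 .. 16);                   # value at each slot; range max. is 16
my @picks;
while (@random) {
    my $i = int rand @random;             # index into what is left
    push @picks, splice(@random, $i, 1);  # remove it; higher entries shift down
}
print "@picks\n";
```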

Happy coding!

NY Tennessean
Guest
Posts: n/a

 07-22-2004
(E-Mail Removed) (Scott Stark) wrote in message news:<(E-Mail Removed). com>...
> I'm using rand() to generate a series of random numbers. Is there any
> way to prevent the same number from being used more than once? All I
> can think of is keeping a list of used numbers, and having some kind
> of "if used, next" statement, which seems lame. (Theoretically that
> could keep going forever, if it randomly kept selecting the same
> numbers.) Any thoughts?

You've got a couple of answers suggesting you use modules that allow
you to "shuffle" an array, and I can't disagree: modules are usually
the way to go, as they have been used and tested more than any
home-grown solution.

But I still wanted to share with you the home-grown code I've rolled
since my original suggestion. Beware, it's a little clunky. It could
stand some polishing, but it "works" for me, at least for the dozen or
so times I tested it:

sub uniq_random_int_in ($$$) {
    # based on random_int_in in perlfaq4, data manipulation
    my($min, $max, $qty) = @_;
    # Assumes that the three arguments are integers themselves!
    # Assumes $min < $max
    # Assumes $qty <= $max - $min + 1
    my @set, @uniq;
    for (my $i=0; $i <= $max - $min; $i++) {
        $set[$i] = $i;
    }
    for (my $i=0; $i < $qty; $i++) {
        my $rand = int rand($max - $min - $i + 1);
        $uniq[$i] = $set[$rand] + $min;
        @set = (@set[0..$rand-1], @set[$rand+1..scalar(@set)-1]);
        print $uniq[$i], "\n"; # display only, can comment out
    }
    return @uniq;
}

@uniq_rands = uniq_random_int_in(11,20,5); # for testing

ctcgag@hotmail.com
Guest
Posts: n/a

 07-22-2004
(E-Mail Removed) (Scott Stark) wrote:
> I'm using rand() to generate a series of random numbers. Is there any
> way to prevent the same number from being used more than once?

A lot of ways.

> All I can think of is keeping a list of used numbers, and having some
> kind of "if used, next" statement, which seems lame.

Well, I'd keep a hash of used numbers. And "redo" would likely make
more sense than "next". If your numbers are floats, you may have some
problems with the numeric and string representations being
nonidentical, but I highly doubt that will be an issue. It is not lame
at all, unless you are sampling very densely among all possible
values. (If you are sampling very densely, i.e. you will need a high
fraction of all possible values before you are done, then I would just
shuffle the list of all of those possible values.)
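A minimal sketch of that hash-plus-redo loop, with a placeholder range of 1..1000 and a sparse draw of 20 numbers; because the sampling is sparse, retries are rare:

```perl
use strict;
use warnings;

my %used;
my @sample;
for (1 .. 20) {
    my $n = 1 + int rand 1000;   # sparse: 20 draws out of 1000 values
    redo if $used{$n}++;         # seen before: retry this iteration
    push @sample, $n;
}
print "@sample\n";
```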

> (Theoretically that
> could keep going forever, if it randomly kept selecting the same
> numbers.) Any thoughts?

If your random number generator degenerates thus, then in that case you are
pretty much screwed regardless.

Xho

--
Usenet Newsgroup Service $9.95/Month 30GB

Juha Laiho
Guest
Posts: n/a

 07-22-2004
(E-Mail Removed) said:
>(E-Mail Removed) (Scott Stark) wrote:
>> All I can think of is keeping a list of used numbers, and having some
>> kind of "if used, next" statement, which seems lame.

....
>> (Theoretically that could keep going forever, if it randomly kept
>> selecting the same numbers.) Any thoughts?

>
>If your random number generator degenerates thus, then in that case you
>are pretty much screwed regardless.

Not so -- think of a situation where your number space is, for example,
1000 numbers, and you don't want repetitions. You'll end up with a
significant and increasing chance of retries for each new number.
For the last number, there's only a 1/1000 chance that the random
algorithm comes up with the desired number. I made a small script
to test this; even with a range of 3 (create a randomised list of the
numbers 1 to 3), short testing produced 12 "false guesses" while
generating the list. With larger ranges the number of false guesses got
significantly worse -- with a range of just 1000, the worst results I got
had over 3000000 false guesses. Results with fewer than 1000000 false
guesses seemed to be rare.
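The test script itself was not posted; a rough reconstruction of the idea, counting the wasted calls to rand() while building a full non-repeating list, might look like this (the exact counts vary from run to run):

```perl
use strict;
use warnings;

my $range = 1000;
my (%used, @list);
my $false_guesses = 0;
while (@list < $range) {
    my $n = 1 + int rand $range;
    if ($used{$n}++) { $false_guesses++ }   # already used: a wasted draw
    else             { push @list, $n }
}
print "false guesses for range $range: $false_guesses\n";
```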

So, for a list of random numbers without repetitions, the shuffled list
algorithms are definitely better.
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++\$ P++@ L+++ E- W+\$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)

Anno Siegel
Guest
Posts: n/a

 07-22-2004
NY Tennessean <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> (E-Mail Removed) (Scott Stark) wrote in message
> news:<(E-Mail Removed). com>...
> > I'm using rand() to generate a series of random numbers. Is there any
> > way to prevent the same number from being used more than once? All I
> > can think of is keeping a list of used numbers, and having some kind
> > of "if used, next" statement, which seems lame. (Theoretically that
> > could keep going forever, if it randomly kept selecting the same
> > numbers.) Any thoughts?

>
> You've got a couple of answers suggesting you use modules that allow
> you to "shuffle" an array, and I can't disagree: modules are usually
> the way to go as they have been used and tested more than any home
> grown solution.
>
> But still I wanted to share with you the home-grown code I've rolled
> since my original suggestion. Beware, its a little clunky. It could
> stand some perfecting but "works" for me, at least for the dozen or so
> times I tested it:

Yes, it does what it should, and yes, it *is* a little clunky.

> sub uniq_random_int_in ($$$) {

^^^^^
Don't give a function a prototype unless there is a distinct advantage.
Normally, perl subs are not prototyped.

> #based on random_int_in in perlfaq4, data manipulation
> my($min, $max, $qty) = @_;
> # Assumes that the three arguments are integers themselves!
> # Assumes $min < $max
> # Assumes $qty <= $max - $min + 1
> my @set, @uniq;

This doesn't declare @uniq. "my (@set, @uniq)" is what you need. You
didn't run this under strict, did you?

> for (my $i=0; $i <= $max - $min; $i++) {
> $set[$i] = $i;
> }

That could be more simply written

@set = 0 .. $max - $min;

But what you really want in @set is the set of possible results, so
you can make that

@set = $min .. $max;

> for (my $i=0; $i < $qty; $i++) {
> my $rand = int rand($max - $min - $i + 1);

The number "$max - $min - $i + 1" is exactly the number of elements
left in @set in the $i-th step. You start with $max - $min + 1 elements
in the 0-th step, and you lose one element each time $i increases. So

my $rand = int rand @set;

does the same thing. The selection process is now even clearer:
Select a random number between 0 and the number of elements left
(exclusive). Pick that element.

> $uniq[$i] = $set[$rand] + $min;

When @set contains the numbers $min .. $max directly, you don't need
the addition. $set[$rand] is the wanted random number.

> @set = (@set[0..$rand-1], @set[$rand+1..scalar(@set)-1]);

^^^^^^
"@set" is already in scalar context, no need for "scalar".

But Perl's splice() function is what you really want here:

splice( @set, $rand, 1);

This returns the value(s) that are spliced out, so you can find
the random index and move the indicated element from @set to @uniq
in one step:

push @uniq, splice( @set, rand @set, 1);

> print $uniq[$i], "\n"; # display only, can comment out
> }
> return @uniq;
> }
>
> @uniq_rands = uniq_random_int_in(11,20,5); # for testing

Putting it all together, this is how I would write it:

sub uniq_random_int_in {
    my($min, $max, $qty) = @_;
    my @set = ($min .. $max);
    map splice( @set, rand @set, 1), 1 .. $qty;
}
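For reference, a quick strict-mode check of that final version, using the same test call as the earlier post (the sub body is reproduced so the snippet runs on its own):

```perl
use strict;
use warnings;

sub uniq_random_int_in {
    my($min, $max, $qty) = @_;
    my @set = ($min .. $max);
    # splice returns the removed element, so each iteration of map
    # draws one value without replacement
    map splice(@set, rand @set, 1), 1 .. $qty;
}

my @uniq_rands = uniq_random_int_in(11, 20, 5);
print "@uniq_rands\n";   # five distinct integers between 11 and 20
```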

Anno

NY Tennessean
Guest
Posts: n/a

 07-22-2004
(E-Mail Removed) wrote in message news:<20040721233346.458\$(E-Mail Removed)>...
> (E-Mail Removed) (Scott Stark) wrote:
> > I'm using rand() to generate a series of random numbers. Is there any
> > way to prevent the same number from being used more than once?

>
> A lot of ways.
>
> > All I can think of is keeping a list of used numbers, and having some
> > kind of "if used, next" statement, which seems lame.

>
> Well, I'd keep a hash of used numbers.

There may be more than one way to do it, but this one is definitely
worse than:

1) the above-stated ideas of using a module to shuffle an array
2) the "idea" behind my home-rolled solution (the implementation could
definitely be improved)

As the hash of used numbers grows, the chances increase that rand()
will needlessly be called more than once, maybe even several times, to
generate each random number. I've experienced this. I wrote a similar
program at my first programming job, for a bingo supply company.
Random bingo grids, and as I recall the rows and/or columns had to add
up to the same number?! Well, I kept an array of used combinations,
and the darn thing took longer to run than it did to write! Never
again would I do that. Maybe that's why I took such an interest in
this problem.

Ilmari Karonen
Guest
Posts: n/a

 07-22-2004
On 2004-07-21, Scott Stark <(E-Mail Removed)> wrote:
> I'm using rand() to generate a series of random numbers. Is there any
> way to prevent the same number from being used more than once? All I
> can think of is keeping a list of used numbers, and having some kind
> of "if used, next" statement, which seems lame. (Theoretically that
> could keep going forever, if it randomly kept selecting the same
> numbers.) Any thoughts?

Several solutions have been presented already. Here's one I haven't
seen posted in this thread yet: a "lazy sparse shuffle".

my %int_pool;
my $drawn = 0;   # explicit draw counter; counting hash keys can stall
                 # when a previously assigned index is picked again
use constant MAX_UNIQUE_INT => 2**32-1;

sub get_unique_int () {
    my $n = $drawn++;
    my $r = $n + int rand(MAX_UNIQUE_INT+1 - $n);
    my $o = (exists $int_pool{$r} ? $int_pool{$r} : $r);
    $int_pool{$r} = (exists $int_pool{$n} ? $int_pool{$n} : $n);
    delete $int_pool{$n} if $n != $r;   # slot $n is never examined again
    return $o;
}

This method works much like the hash solutions already posted, except
that it has a guaranteed upper bound on execution time, even if the
number space does get densely sampled. On the other hand, it only
works for sample spaces that can be easily enumerated.

--
Ilmari Karonen