Velocity Reviews > Ruby > [SOLUTION] [QUIZ] Sampling (#39)

# [SOLUTION] [QUIZ] Sampling (#39)

hp gobelli
Guest
Posts: n/a

 07-18-2005
Hi,

my solution uses a constant amount of memory and is fairly fast, however
understanding the algorithm is left as an exercise to the reader

These are the default tests:

\$ time ./sample.rb 5_000_000 1_000_000_000 > big_sample.txt
real 0m52.744s
user 0m47.006s
sys 0m2.092s

2
261
445
677
917
1101
1213
1521
1677
1926

\$ tail big_sample.txt
999998055
999998267
999998407
999998703
999998970
999999110
999999221
999999439
999999637
999999818

\$ ./sample.rb 3 10
1
4
8

\$ ./sample.rb 3 10
3
4
7

\$ ./sample.rb 3 10
1
3
9

\$ ./sample.rb 9 10
0
1
2
3
4
5
6
7
8

\$ ./sample.rb 20 400
1
25
46
68
96
109
134
145
173
180
206
228
258
271
298
303
338
348
372
386

....and here is the algorithm:

#! /usr/bin/env ruby

require 'set'

if ARGV.length != 2
puts "Usage: sample MEMBERS LIMIT"
exit
end

members = ARGV[0].to_f
limit = ARGV[1].to_f
raise "Bad parameters" if members > limit

i = -1
range = min_val = 0
sub_limit = limit / members
steps = (limit / sub_limit).ceil.to_i
(0...steps).each do |step|
min_val = (sub_limit * step).floor
range = sub_limit
if i >= min_val
min_val = i + 1
range = sub_limit * step - i
end
i = (rand(1000.0 * range) / 1000.0 + min_val).floor
puts i
end

Regards,

Paolo.

Yohanes Santoso
Guest
Posts: n/a

 07-18-2005
hp gobelli <(E-Mail Removed)> writes:

> i = -1
> range = min_val = 0
> sub_limit = limit / members
> steps = (limit / sub_limit).ceil.to_i
> (0...steps).each do |step|
> min_val = (sub_limit * step).floor
> range = sub_limit
> if i >= min_val
> min_val = i + 1
> range = sub_limit * step - i
> end
> i = (rand(1000.0 * range) / 1000.0 + min_val).floor
> puts i
> end

> Regards,
> Paolo.

seems similar to:

from, to, segment_amt = 0, ARGV[1].to_i, ARGV[0].to_i
segment_size = (to - from) / segment_amt
segment_amt.times {
next_random = from + rand(segment_size)
from += segment_size
puts next_random
}

but the people in the mailing list has 'condemned' algorithm producing
non-uniformly distributed random numbers

YS.