Velocity Reviews > Ruby > [QUIZ] Sampling (#39)

# [QUIZ] Sampling (#39)

Ruby Quiz
Guest
Posts: n/a

 07-15-2005
The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

This week's Ruby quiz is a classic sampling problem that I've seen many
interesting solutions for in the past.

The challenge is to implement a program called "sample" that takes exactly two
integer inputs. The first of those should be the number of members you would
like included in the sample. The second is the upper boundary (exclusive) of
the range of integers members can be selected from. The lower boundary is zero
(inclusive).

Your program should output exactly the requested number of members from the
defined range to STDOUT, one number per line. Each member must be unique and
they should appear in sorted order.

Here are some sample runs of my solution to illustrate the task:

\$ ./sample
Usage: sample MEMBERS LIMIT

MEMBERS: The number of members you want in the sample. (Integer)
LIMIT: The upper limit for the sample range. All (Integer)
members will between 0 and LIMIT - 1.

Note that MEMBERS must be less than LIMIT.
\$ ./sample 3 10
0
1
5
\$ ./sample 3 10
1
2
8
\$ ./sample 3 10
2
3
9
\$ ./sample 9 10
0
2
3
4
5
6
7
8
9
\$ ./sample 20 400
1
3
4
25
32
36
42
56
93
111
137
139
153
213
222
226
263
289
293
314

Now for all the algorithm junkies out there that enjoy a little friendly
competition, here's the time to beat:

\$ time ./sample 5_000_000 1_000_000_000 > big_sample.txt

real 30m10.593s
user 29m54.544s
sys 0m4.736s
\$ ls -l big_sample.txt
-rw-r--r-- 1 james staff 49011436 Jul 11 15:26 big_sample.txt
112
221
450
655
900
1216
1399
1469
1494
1628
\$ tail big_sample.txt
999995325
999996330
999996552
999996682
999997426
999997676
999998253
999999153
999999658
999999693

Jim Menard
Guest
Posts: n/a

 07-15-2005
Ruby Quiz wrote:

> The challenge is to implement a program called "sample" that takes exactly two
> integer inputs. The first of those should be the number of members you would
> like included in the sample. The second is the upper boundary (exclusive) of
> the range of integers members can be selected from. The lower boundary is zero
> (inclusive).
>
> Your program should output exactly the requested number of members from the
> defined range to STDOUT, one number per line. Each member must be unique and
> they should appear in sorted order.

The description does not say that the output needs to be randomly selected
from the input range. Without that, printing the first <members> numbers would
be sufficient---and really fast.

(0...members).each { | i | puts i }

Should the output be randomly selected from the range of input values?

Jim
--
Jim Menard, http://www.velocityreviews.com/forums/(E-Mail Removed), http://www.io.com/~jimm
"Ask not a question of USENET, for it will answer both 'Yea.' and 'Nay.'
and 'Ask in another group.'" -- Simon Slavin

James Edward Gray II
Guest
Posts: n/a

 07-15-2005
On Jul 15, 2005, at 8:00 AM, Ruby Quiz wrote:

> Now for all the algorithm junkies out there that enjoy a little
> friendly
> competition, here's the time to beat:
>
> \$ time ./sample 5_000_000 1_000_000_000 > big_sample.txt
>
> real 30m10.593s
> user 29m54.544s
> sys 0m4.736s

P.S. Taunting each other with fast time readouts is not a spoiler.
Feel free!

James Edward Gray II

James Edward Gray II
Guest
Posts: n/a

 07-15-2005
On Jul 15, 2005, at 8:10 AM, Jim Menard wrote:

> The description does not say that the output needs to be randomly
> selected from the input range. Without that, printing the first
> <members> numbers would be sufficient---and really fast.
>
> (0...members).each { | i | puts i }
>
> Should the output be randomly selected from the range of input values?

Egad, yes. Massive oversight on my part. Sorry!

James Edward Gray II

Wybo Dekker
Guest
Posts: n/a

 07-15-2005
On Fri, 15 Jul 2005, James Edward Gray II wrote:

> On Jul 15, 2005, at 8:00 AM, Ruby Quiz wrote:
>
> > Now for all the algorithm junkies out there that enjoy a little friendly
> > competition, here's the time to beat:
> >
> > \$ time ./sample 5_000_000 1_000_000_000 > big_sample.txt
> >
> > real 30m10.593s
> > user 29m54.544s
> > sys 0m4.736s

>

\$ time ruby -e '5_000_000.times do puts rand(1_000_000_000) end' >big_sample.txt
elap 16.612
user 16.421
syst 0.177
CPU 99.91%

\$ wc big_sample.txt
5000000 5000000 49446636 big_sample.txt

or do I miss something??
--
Wybo

Belorion
Guest
Posts: n/a

 07-15-2005
On 7/15/05, Wybo Dekker <(E-Mail Removed)> wrote:
> \$ time ruby -e '5_000_000.times do puts rand(1_000_000_000) end' >big_sam=

ple.txt
> elap 16.612
> user 16.421
> syst 0.177
> CPU 99.91%
>=20
> \$ wc big_sample.txt
> 5000000 5000000 49446636 big_sample.txt
>=20
> or do I miss something??

The above example will result in duplicate members. Every number in
the list must be unique.

Matt

jason r tibbetts
Guest
Posts: n/a

 07-15-2005
Belorion wrote:
> On 7/15/05, Wybo Dekker <(E-Mail Removed)> wrote:
>
>>\$ time ruby -e '5_000_000.times do puts rand(1_000_000_000) end' >big_sample.txt
>> elap 16.612
>> user 16.421
>> syst 0.177
>> CPU 99.91%
>>
>>\$ wc big_sample.txt
>> 5000000 5000000 49446636 big_sample.txt
>>
>>or do I miss something??

>
>
> The above example will result in duplicate members. Every number in
> the list must be unique.

And sorted.

jason r tibbetts
Guest
Posts: n/a

 07-15-2005
James,

What are your machine's specs, just for comparison's sake? I don't want
to get too discouraged by my 5+ year-old Sun beater workstation.

I've also found that the printing of the results (in the form of an
array) is taking several minutes in and of itself. Since optimizing the
printout probably isn't the point of the quiz, can anyone recommend
something faster than arr.each{ | elem | p elem } ?

Ruby Quiz wrote:
> The three rules of Ruby Quiz:
>
> 1. Please do not post any solutions or spoiler discussion for this quiz until
> 48 hours have passed from the time on this message.
>
> 2. Support Ruby Quiz by submitting ideas as often as you can:
>
> http://www.rubyquiz.com/
>
> 3. Enjoy!
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>
> This week's Ruby quiz is a classic sampling problem that I've seen many
> interesting solutions for in the past.
>
> The challenge is to implement a program called "sample" that takes exactly two
> integer inputs. The first of those should be the number of members you would
> like included in the sample. The second is the upper boundary (exclusive) of
> the range of integers members can be selected from. The lower boundary is zero
> (inclusive).
>
> Your program should output exactly the requested number of members from the
> defined range to STDOUT, one number per line. Each member must be unique and
> they should appear in sorted order.
>
> Here are some sample runs of my solution to illustrate the task:
>
> \$ ./sample
> Usage: sample MEMBERS LIMIT
>
> MEMBERS: The number of members you want in the sample. (Integer)
> LIMIT: The upper limit for the sample range. All (Integer)
> members will between 0 and LIMIT - 1.
>
> Note that MEMBERS must be less than LIMIT.
> \$ ./sample 3 10
> 0
> 1
> 5
> \$ ./sample 3 10
> 1
> 2
> 8
> \$ ./sample 3 10
> 2
> 3
> 9
> \$ ./sample 9 10
> 0
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> \$ ./sample 20 400
> 1
> 3
> 4
> 25
> 32
> 36
> 42
> 56
> 93
> 111
> 137
> 139
> 153
> 213
> 222
> 226
> 263
> 289
> 293
> 314
>
> Now for all the algorithm junkies out there that enjoy a little friendly
> competition, here's the time to beat:
>
> \$ time ./sample 5_000_000 1_000_000_000 > big_sample.txt
>
> real 30m10.593s
> user 29m54.544s
> sys 0m4.736s
> \$ ls -l big_sample.txt
> -rw-r--r-- 1 james staff 49011436 Jul 11 15:26 big_sample.txt
> 112
> 221
> 450
> 655
> 900
> 1216
> 1399
> 1469
> 1494
> 1628
> \$ tail big_sample.txt
> 999995325
> 999996330
> 999996552
> 999996682
> 999997426
> 999997676
> 999998253
> 999999153
> 999999658
> 999999693
>
>
>

Stefan Lang
Guest
Posts: n/a

 07-15-2005
On Friday 15 July 2005 18:38, jason r tibbetts wrote:
> James,
>
> What are your machine's specs, just for comparison's sake? I don't want
> to get too discouraged by my 5+ year-old Sun beater workstation.
>
> I've also found that the printing of the results (in the form of an
> array) is taking several minutes in and of itself. Since optimizing the
> printout probably isn't the point of the quiz, can anyone recommend
> something faster than arr.each{ | elem | p elem } ?

I guess "puts arr" is faster.

--
Stefan

jason r tibbetts
Guest
Posts: n/a

 07-15-2005
jason r tibbetts wrote:
> James,
>
> What are your machine's specs, just for comparison's sake? I don't want
> to get too discouraged by my 5+ year-old Sun beater workstation.
>
> I've also found that the printing of the results (in the form of an
> array) is taking several minutes in and of itself. Since optimizing the
> printout probably isn't the point of the quiz, can anyone recommend
> something faster than arr.each{ | elem | p elem } ?

With the aforementioned old Sun box (SunOS dakota 5.8 Generic_108528-27
sun4u sparc SUNW,Ultra-5_10, 440MHz):

Attempt #1:
time ./sample.rb 5_000_000 1_000_000_000 > big_sample.txt
955.00u 46.42s 18:41.51 89.2%

Attempt #2:
time ./sample-logn2.rb 5_000_000 1_000_000_000 > big_sample.txt
388.37u 50.92s 9:40.11 75.7%

tail big_sample.txt
999998043
999998053
999998441
999998442
999998587
999999146
999999158
999999418
999999439
999999517

> Ruby Quiz wrote:
>
>> The three rules of Ruby Quiz:
>>
>> 1. Please do not post any solutions or spoiler discussion for this
>> quiz until
>> 48 hours have passed from the time on this message.
>>
>> 2. Support Ruby Quiz by submitting ideas as often as you can:
>>
>> http://www.rubyquiz.com/
>>
>> 3. Enjoy!
>>
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>>
>>
>> This week's Ruby quiz is a classic sampling problem that I've seen many
>> interesting solutions for in the past.
>>
>> The challenge is to implement a program called "sample" that takes
>> exactly two
>> integer inputs. The first of those should be the number of members
>> you would
>> like included in the sample. The second is the upper boundary
>> (exclusive) of
>> the range of integers members can be selected from. The lower
>> boundary is zero
>> (inclusive).
>>
>> Your program should output exactly the requested number of members
>> from the
>> defined range to STDOUT, one number per line. Each member must be
>> unique and
>> they should appear in sorted order.
>>
>> Here are some sample runs of my solution to illustrate the task:
>>
>> \$ ./sample Usage: sample MEMBERS LIMIT
>>
>> MEMBERS: The number of members you want in the sample. (Integer)
>> LIMIT: The upper limit for the sample range. All (Integer)
>> members will between 0 and LIMIT - 1.
>>
>> Note that MEMBERS must be less than LIMIT.
>> \$ ./sample 3 10
>> 0
>> 1
>> 5
>> \$ ./sample 3 10
>> 1
>> 2
>> 8
>> \$ ./sample 3 10
>> 2
>> 3
>> 9
>> \$ ./sample 9 10
>> 0
>> 2
>> 3
>> 4
>> 5
>> 6
>> 7
>> 8
>> 9
>> \$ ./sample 20 400
>> 1
>> 3
>> 4
>> 25
>> 32
>> 36
>> 42
>> 56
>> 93
>> 111
>> 137
>> 139
>> 153
>> 213
>> 222
>> 226
>> 263
>> 289
>> 293
>> 314
>>
>> Now for all the algorithm junkies out there that enjoy a little friendly
>> competition, here's the time to beat:
>>
>> \$ time ./sample 5_000_000 1_000_000_000 > big_sample.txt
>>
>> real 30m10.593s
>> user 29m54.544s
>> sys 0m4.736s
>> \$ ls -l big_sample.txt -rw-r--r-- 1 james staff 49011436
>> Jul 11 15:26 big_sample.txt
>> 221
>> 450
>> 655
>> 900
>> 1216
>> 1399
>> 1469
>> 1494
>> 1628
>> \$ tail big_sample.txt 999995325
>> 999996330
>> 999996552
>> 999996682
>> 999997426
>> 999997676
>> 999998253
>> 999999153
>> 999999658
>> 999999693
>>
>>
>>

>
>
>
>