Velocity Reviews > Ruby > [QUIZ] Distinct Sets (#225)

# [QUIZ] Distinct Sets (#225)

Daniel Moore
Guest
Posts: n/a

 11-21-2009
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this
quiz until 48 hours have elapsed from the time this message was
sent.

2. Support Ruby Quiz by submitting ideas and responses
as often as you can.

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem
the original quiz message, if you can.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Suggestions?: http://rubyquiz.strd6.com/suggestions

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

## Distinct Sets (#225)

Aloha Rubyists,

This week's quiz comes from Ruby Quiz Suggestions MVP Martin DeMello[1].

[based on a surprisingly tricky stackoverflow problem]

You have an list of sets, which you want to transform by the following
method: if any two sets have a common element, merge them into a
single set. You will be left with a reduced list partitioning all the
elements into sets where every set is disjoint from every other.

For example:

Start: 0:[D E G]
1:[C J K M]
2:[K M]
3:[H]
4:[D H L R]
5:[G L]

merging 1 and 2 since they have K and M in common:
=> [D E G] [C J K M] [H] [D H L R] [G L]

merging 2 and 3 since they have H in common:
=> [D E G] [C J K M] [D H L R] [G L]

merging 0 and 2 (D)
=> [D E G H L R] [C J K M] [G L]

merging 0 and 2 (G, L)
=> [D E G H L R] [C J K M]

The tricky bit is to do it as efficiently as possible (in an
algorithmic sense; in actual ruby code the efficiency depends a lot on
which methods run in ruby and which in C), but even without that it's
a fun problem to solve.

Here are a few input/output pairs to help test your program:

[["G", "J", "N"], ["D", "F", "G", "K"], ["E", "H"], ["B", "C", "J",
"L", "Q"], ["C", "M"]]
=> [["B", "C", "D", "F", "G", "J", "K", "L", "M", "N", "Q"], ["E", "H"]]

[["A", "C", "E", "G", "H"], ["B", "I", "M"], ["E", "M", "O"]]
=> [["A", "B", "C", "E", "G", "H", "I", "M", "O"]]

[["D", "E", "J", "L"], ["F", "K"], ["L", "M"], ["I", "K"], ["I", "K"]]
=> [["D", "E", "J", "L", "M"], ["F", "I", "K"]]

[["B", "E", "L", "M"], ["B", "I", "L", "O", "P"], ["A", "J", "O",
"P"], ["A", "D", "F", "L"]]
=> [["A", "B", "D", "E", "F", "I", "J", "L", "M", "O", "P"]]

[["E", "G", "K"], ["A", "C", "I", "J", "N"], ["C", "J", "M", "N"]]
=> [["E", "G", "K"], ["A", "C", "I", "J", "M", "N"]]

[["A", "D", "E", "H"], ["D", "N", "P"], ["D", "I", "L", "P"]]
=> [["A", "D", "E", "H", "I", "L", "N", "P"]]

[["E", "F", "K", "N", "O"], ["A", "B", "C", "J", "P"]]
=> [["E", "F", "K", "N", "O"], ["A", "B", "C", "J", "P"]]

[["C", "H", "M"], ["D", "F", "L"], ["A", "E", "J", "O"], ["C", "H"],
["J", "K", "M"], ["A", "N", "Q", "T"]]
=> [["A", "C", "E", "H", "J", "K", "M", "N", "O", "Q", "T"], ["D", "F", "L"]]

Have fun!

[1]: http://zem.novylen.net

P.S. I'm pretty swamped right now which is causing delay on
summarizing the recent quizzes. If anyone would like to write a guest
summary I would much appreciate it! Sometimes you never know where
help will come from until you ask for it. Also, special thanks to
Martin and everyone else who helps by submitting ideas and quizzes!

--
-Daniel
http://rubyquiz.strd6.com

brabuhr@gmail.com
Guest
Posts: n/a

 11-21-2009
On Fri, Nov 20, 2009 at 9:23 PM, Daniel Moore <(E-Mail Removed)> wrote:
>
> ## Distinct Sets (#225)
>
> Aloha Rubyists,
>
> This week's quiz comes from Ruby Quiz Suggestions MVP Martin DeMello[1].
>
> [based on a surprisingly tricky stackoverflow problem]
>
> You have an list of sets, which you want to transform by the following
> method: if any two sets have a common element, merge them into a
> single set. You will be left with a reduced list partitioning all the
> elements into sets where every set is disjoint from every other.

\$ ruby -v 225.rb
ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0]
Started

Rob Biedenharn
Guest
Posts: n/a

 11-22-2009

On Nov 21, 2009, at 11:50 AM, http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:

> On Fri, Nov 20, 2009 at 9:23 PM, Daniel Moore <(E-Mail Removed)>
> wrote:
>>
>> ## Distinct Sets (#225)
>>
>> Aloha Rubyists,
>>
>> This week's quiz comes from Ruby Quiz Suggestions MVP Martin
>> DeMello[1].
>>
>> [based on a surprisingly tricky stackoverflow problem]
>>
>> You have an list of sets, which you want to transform by the
>> following
>> method: if any two sets have a common element, merge them into a
>> single set. You will be left with a reduced list partitioning all the
>> elements into sets where every set is disjoint from every other.

>
> \$ ruby -v 225.rb
> ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0]
> Started
> .
> Finished in 0.059557 seconds.
>
> 1 tests, 10 assertions, 0 failures, 0 errors
>
> But, I cheated a bit in my assertions; sorting the expected and actual
> values before asserting them equal:

I'm sure that there's a better way than mine, but it seems to work well.

ruby -v -rubygems distinct_sets_test.rb
ruby 1.8.6 (2008-08-11 patchlevel 287) [universal-darwin9.0]
/Library/Ruby/Gems/1.8/gems/thoughtbot-shoulda-2.9.1/lib/shoulda/
context.rb:4: warning: method redefined; discarding old contexts
Started
....................
Finished in 0.018326 seconds.

20 tests, 148 assertions, 0 failures, 0 errors

This includes all the tests given in the original quiz description and
all the tests from (E-Mail Removed), but using Shoulda and splitting
the one "tests" method with 10 assertions into 4 separate methods with
one assertion each and eliminating the duplicates.

http://gist.github.com/240457

I'm guessing the speed difference is due more to 1.8.2 v. 1.8.6 and
the actual machines than any significant difference in our algorithms.

-Rob

Rob Biedenharn http://agileconsultingllc.com
(E-Mail Removed)

lith
Guest
Posts: n/a

 11-22-2009
> http://gist.github.com/240457

Both of your tests use rather small input sets. It would be
interesting to know how the solutions deal with input that contains
many (10, 50, 100, ....) sets and/or many different signs (not just
letters).

Rob Biedenharn
Guest
Posts: n/a

 11-23-2009
On Nov 22, 2009, at 1:51 AM, lith wrote:

>> http://gist.github.com/240457

>
> Both of your tests use rather small input sets. It would be
> interesting to know how the solutions deal with input that contains
> many (10, 50, 100, ....) sets and/or many different signs (not just
> letters).

I accept your challenge! The gist has been updated with sets that use
numbers and symbols as well as strings. There are also some tests of
large sets (which worked fine, but getting the test setup by hand was
nasty).

ruby -rubygems distinct_sets_test.rb
Started
..........................
Finished in 1.237836 seconds.

26 tests, 202 assertions, 0 failures, 0 errors

The only change that I has to make was in how I sorted the final array
to account for symbols or mixed contents: Numerics compare "naturally"
with <=> and non-numeric or mixed are compared using the #to_s
representation.

-Rob

P.S. I could add my solution to the gist, too, but I'll give everyone
a chance to try the new tests first.

Rob Biedenharn http://agileconsultingllc.com
(E-Mail Removed)

brabuhr@gmail.com
Guest
Posts: n/a

 11-23-2009
On Mon, Nov 23, 2009 at 11:10 AM, Rob Biedenharn
<(E-Mail Removed)> wrote:
> On Nov 22, 2009, at 1:51 AM, lith wrote:
>>> http://gist.github.com/240457

>>
>> Both of your tests use rather small input sets. It would be
>> interesting to know how the solutions deal with input that contains
>> many (10, 50, 100, ....) sets and/or many different signs (not just
>> letters).

>
> I accept your challenge! =A0The gist has been updated with sets that use
> numbers and symbols as well as strings. =A0There are also some tests of l=

arge
> sets (which worked fine, but getting the test setup by hand was nasty).

Thanks.

> ruby1.8 -v -rubygems distinct_sets_test.rb

ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
/var/lib/gems/1.8/gems/shoulda-2.10.2/lib/shoulda/context.rb:4:
warning: method redefined; discarding old contexts
Started
..................EEE.....
Finished in 3.424045 seconds.

1) Error:
test: non-uniform contents should handle matching on symbols.
(DistinctSetsTest):
ArgumentError: comparison of String with :bill failed
./distinct_sets.rb:7:in `sort'

2) Error:
test: non-uniform contents should handle mix of strings and symbols
(matching on string). (DistinctSetsTest):
ArgumentError: comparison of String with :bill failed
./distinct_sets.rb:7:in `sort'

3) Error:
test: non-uniform contents should handle mix of strings, numbers, and
symbols. (DistinctSetsTest):
ArgumentError: comparison of Fixnum with :emergency failed
./distinct_sets.rb:7:in `sort'

26 tests, 175 assertions, 0 failures, 3 errors

> The only change that I has to make was in how I sorted the final array to
> account for symbols or mixed contents: Numerics compare "naturally" with =

<=3D>
> and non-numeric or mixed are compared using the #to_s representation.

Simply not sorting:

26 tests, 202 assertions, 17 failures, 0 errors

Simply sort_by{to_s}:

26 tests, 202 assertions, 3 failures, 0 errors

More complex sort{}:

26 tests, 202 assertions, 0 failures, 0 errors

Benoit Daloze
Guest
Posts: n/a

 11-23-2009
[Note: parts of this message were removed to make it a legal post.]

Hey Rubyists!

I think I got quite a fast solution, using 1.9.

Here is my test file : http://pastie.org/711737
You just need to change METHOD and require your own file

So here is my best result for the 19tests:

Finished in 0.222688 seconds.
1 tests, 19 assertions, 0 failures, 0 errors, 0 skips

Without sorting(I just had to change the order of some tests of Rob)

Maye I should also say I'm using a 64bit ruby 1.9.2 ? Anyway I think this
method is far faster than my others(about 100 times) and probably some of
yours.

Enjoy the quiz,
Benoit

2009/11/23 <(E-Mail Removed)>

> On Mon, Nov 23, 2009 at 11:10 AM, Rob Biedenharn
> <(E-Mail Removed)> wrote:
> > On Nov 22, 2009, at 1:51 AM, lith wrote:
> >>> http://gist.github.com/240457
> >>
> >> Both of your tests use rather small input sets. It would be
> >> interesting to know how the solutions deal with input that contains
> >> many (10, 50, 100, ....) sets and/or many different signs (not just
> >> letters).

> >
> > I accept your challenge! The gist has been updated with sets that use
> > numbers and symbols as well as strings. There are also some tests of

> large
> > sets (which worked fine, but getting the test setup by hand was nasty).

>
> Thanks.
>
> > ruby1.8 -v -rubygems distinct_sets_test.rb

> ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
> /var/lib/gems/1.8/gems/shoulda-2.10.2/lib/shoulda/context.rb:4:
> warning: method redefined; discarding old contexts
> Started
> ..................EEE.....
> Finished in 3.424045 seconds.
>
> 1) Error:
> test: non-uniform contents should handle matching on symbols.
> (DistinctSetsTest):
> ArgumentError: comparison of String with :bill failed
> ./distinct_sets.rb:7:in `sort'
>
> 2) Error:
> test: non-uniform contents should handle mix of strings and symbols
> (matching on string). (DistinctSetsTest):
> ArgumentError: comparison of String with :bill failed
> ./distinct_sets.rb:7:in `sort'
>
> 3) Error:
> test: non-uniform contents should handle mix of strings, numbers, and
> symbols. (DistinctSetsTest):
> ArgumentError: comparison of Fixnum with :emergency failed
> ./distinct_sets.rb:7:in `sort'
>
> 26 tests, 175 assertions, 0 failures, 3 errors
>
> > The only change that I has to make was in how I sorted the final array to
> > account for symbols or mixed contents: Numerics compare "naturally" with

> <=>
> > and non-numeric or mixed are compared using the #to_s representation.

>
> Simply not sorting:
>
> 26 tests, 202 assertions, 17 failures, 0 errors
>
> Simply sort_by{to_s}:
>
> 26 tests, 202 assertions, 3 failures, 0 errors
>
> More complex sort{}:
>
> 26 tests, 202 assertions, 0 failures, 0 errors
>
>

brabuhr@gmail.com
Guest
Posts: n/a

 11-23-2009
Here's my not-fast solution:

require 'set'

class Set
def intersect?(other)
other.each { |o| return true if include?(o) }
false
end
end

def distinct_sets(array_of_arrays)
set_of_sets = array_of_arrays.map{|a|
a.to_set
}.to_set

set_of_sets.divide{|i, j|
i.intersect?(j)
}.map{|s|
s.flatten.to_a
}
end

Adding the intersect? method to Set was primarily motivated by
readability, but also provided a noticeable speed improvement over my
original alternative (intersection.size.>).

Benoit Daloze
Guest
Posts: n/a

 11-23-2009
[Note: parts of this message were removed to make it a legal post.]

I must admit is a very elegant solution.
And it's not so slow, 0.36s with my tests and adding .map { |s|
s.flatten.to_a*.sort*
}, it pass all the tests without sorting.
Awesome for a Ruby-based class! A nice exemple of using forgot methods like
divide.

2009/11/23 <(E-Mail Removed)>

> Here's my not-fast solution:
>
> require 'set'
>
> class Set
> def intersect?(other)
> other.each { |o| return true if include?(o) }
> false
> end
> end
>
> def distinct_sets(array_of_arrays)
> set_of_sets = array_of_arrays.map{|a|
> a.to_set
> }.to_set
>
> set_of_sets.divide{|i, j|
> i.intersect?(j)
> }.map{|s|
> s.flatten.to_a
> }
> end
>
> Adding the intersect? method to Set was primarily motivated by
> readability, but also provided a noticeable speed improvement over my
> original alternative (intersection.size.>).
>
>

lith
Guest
Posts: n/a

 11-24-2009
> And it's not so slow,

I wrote a small approximative benchmark for that runs the script with
different set configurations.
http://pastie.org/711915

It might be interesting to compare the runtime behaviour of your
script with other solutions.