Velocity Reviews > Java > Computer Science 101 Question - extracting subsets of a collection

# Computer Science 101 Question - extracting subsets of a collection

Roger Varley
Guest
Posts: n/a

 08-10-2005
Hi

Can someone point me to an efficient algorythm/method that will allow
me to efficiently extract subsets of a collection such that if I have a
collection of n objects, I need to extract all combinations of (n-1)
objects, (n-2) objects etc. So if I have a series of 5 numbers
1,2,3,4,5 then I need all combinations of 4 numbers (1234, 1235 etc),
all combinations of 3 numbers (123, 234, 345 etc)

Regards
Roger

jan V
Guest
Posts: n/a

 08-10-2005
> Can someone point me to an efficient algorythm/method that will allow
> me to efficiently extract subsets of a collection such that if I have a
> collection of n objects, I need to extract all combinations of (n-1)
> objects, (n-2) objects etc.

Think recursion.

Roger Varley
Guest
Posts: n/a

 08-10-2005
Thanks Jan, but I think I'm going to need a bit more of a nudge in the
right direction than that. I should've also pointed out that I just
used the numbers as an example, I actually have a collection of objects
and I am not concerned about the order in which the objects appear
within the subset - so I don't mind wether I get "123", "321" or "213"
as long as I get one of them, but not the others.

Regards
Roger

jan V
Guest
Posts: n/a

 08-10-2005
> Thanks Jan, but I think I'm going to need a bit more of a nudge in the
> right direction than that.

Analyse the problem for the simplest cases: (a) and (a,b). Then you can
build on that by looking at (a,b,c)... and so on. That should allow you to
write a recursive routine which produces the required combinations. [If you
ask me for any more hints, you'll force me to write it all out in code, and
you don't want that, do you ? ]

> I should've also pointed out that I just
> used the numbers as an example, I actually have a collection of objects

That will not make any difference to the algorithm, only its detailed
implementation.

> and I am not concerned about the order in which the objects appear
> within the subset - so I don't mind wether I get "123", "321" or "213"
> as long as I get one of them, but not the others.

Then you'll need to involve the services of one of the Set classes.

Steve Wampler
Guest
Posts: n/a

 08-10-2005
On Wed, 10 Aug 2005 15:23:14 +0000, jan V wrote:
>...
>> and I am not concerned about the order in which the objects appear
>> within the subset - so I don't mind wether I get "123", "321" or "213"
>> as long as I get one of them, but not the others.

>
> Then you'll need to involve the services of one of the Set classes.

Hmmm, the Set classes aren't needed unless he's worried about
duplicating the subsets at different parts of the tree (that is,
the subset "23" can be generated from both "123" and "234"...)
It's not clear from the original posting whether that's a problem
or not. Perhaps the original poster can clarify?

Roger Varley
Guest
Posts: n/a

 08-10-2005
you guys are taking me to CS109 pretty quickly here I don't think
I'll duplicate subsets in the way Steve is talking about. I will always
be starting from the root of n objects and generating the combinations
of (n-1), (n-2) ... from the original n objects. I won't be trying to
generate my (n-2) groups from my (n-1) groups - unless Jan's recursive
method (when I've worked it out) takes me down that route.

Regards & thanks
Roger

Steve Wampler
Guest
Posts: n/a

 08-10-2005
On Wed, 10 Aug 2005 09:16:07 -0700, Roger Varley wrote:

> you guys are taking me to CS109 pretty quickly here I don't think
> I'll duplicate subsets in the way Steve is talking about. I will always
> be starting from the root of n objects and generating the combinations
> of (n-1), (n-2) ... from the original n objects. I won't be trying to
> generate my (n-2) groups from my (n-1) groups - unless Jan's recursive
> method (when I've worked it out) takes me down that route.

Oh, it will, it will...

Unless you do some special bookkeeping (using the Set classes Jan
mentioned is one way), you'll get lots of duplicates with a
recursive solution. Here's an example output, without the special
checking:
------------------------
abcd
abc
ab
a
b
ac
a
c
bc
b
c
abd
ab
a
b
a
d
bd
b
d
acd
ac
a
c
a
d
cd
c
d
bcd
bc
b
c
bd
b
d
cd
c
d
---------------------------

where what I think you want is:

----------------------------
abcd
abc
abd
acd
bcd
ab
ac
bc
bd
cd
a
b
c
d
------------------

Directly generating the 2nd solution is more interesting, which is
why I suspect that's what the assignment is.

This must be near the end of the semester in the 101 class...

Eric Sosman
Guest
Posts: n/a

 08-10-2005

jan V wrote:
>>Can someone point me to an efficient algorythm/method that will allow
>>me to efficiently extract subsets of a collection such that if I have a
>>collection of n objects, I need to extract all combinations of (n-1)
>>objects, (n-2) objects etc.

>
>
> Think recursion.

Even easier: Think binary numbers. Hint: What happens
to the bits of a binary number as you increment it from 0 to
(2^n)-1?

--
http://www.velocityreviews.com/forums/(E-Mail Removed)

jan V
Guest
Posts: n/a

 08-10-2005
> > Think recursion.
>
> Even easier: Think binary numbers. Hint: What happens
> to the bits of a binary number as you increment it from 0 to
> (2^n)-1?

Now I'm stumped. I can easily picture the "animating dance" of bits
switching on and off as you increment from 0 to whatever, but I don't see
the connection with the OP's problem.. can you give us an extra hint?

Raymond DeCampo
Guest
Posts: n/a

 08-10-2005
jan V wrote:
>>>Think recursion.

>>
>> Even easier: Think binary numbers. Hint: What happens
>>to the bits of a binary number as you increment it from 0 to
>>(2^n)-1?

>
>
> Now I'm stumped. I can easily picture the "animating dance" of bits
> switching on and off as you increment from 0 to whatever, but I don't see
> the connection with the OP's problem.. can you give us an extra hint?
>
>

abcd
0001
----
d

abcd
0010
----
c

abcd
0011
----
cd

abcd
0100
----
b

abcd
0101
----
b d

etc.

Ray

--
XML is the programmer's duct tape.