Velocity Reviews > Java > iterating the difference of two collections

# iterating the difference of two collections

Andreas Leitgeb
Guest
Posts: n/a

 01-06-2007
Given two Collection parameters a,b that I'm *not* supposed to modify,
what's the best way to iterate over the elements in a but not in b?

Create a new collection d with elements of a, then d.removeAll(b),
then iterate over d.
or
for(ElemType e:a) { if (b.contains(e)) continue; ... }

Actually it happens that Collection b is actually a Set,
so my preference would be for the second, except that it
feels hackish. Is this feeling right or wrong?

Flo 'Irian' Schaetz
Guest
Posts: n/a

 01-06-2007
And thus spoke Andreas Leitgeb...

> Given two Collection parameters a,b that I'm *not* supposed to modify,
> what's the best way to iterate over the elements in a but not in b?

I would say, that depends...

.... how many elements do the collections have?
.... how complex are the actions of get and contains?

> Create a new collection d with elements of a, then d.removeAll(b),
> then iterate over d.
> or
> for(ElemType e:a) { if (b.contains(e)) continue; ... }
>
> Actually it happens that Collection b is actually a Set,
> so my preference would be for the second, except that it
> feels hackish. Is this feeling right or wrong?

Depends on the type of the set, imho: It it's a hashset, b.contains(e)
would be aprox. constant, so I would prefer it, too.

Flo

Daniel Pitts
Guest
Posts: n/a

 01-06-2007

Andreas Leitgeb wrote:
> Given two Collection parameters a,b that I'm *not* supposed to modify,
> what's the best way to iterate over the elements in a but not in b?
>
> Create a new collection d with elements of a, then d.removeAll(b),
> then iterate over d.
> or
> for(ElemType e:a) { if (b.contains(e)) continue; ... }
>
> Actually it happens that Collection b is actually a Set,
> so my preference would be for the second, except that it
> feels hackish. Is this feeling right or wrong?

Either way works. If you have the answer, why ask the question?

If you wanted to get really tricky, you would create a custom
collections called "Difference" which defines a view into the
collection (a - b);

Patricia Shanahan
Guest
Posts: n/a

 01-06-2007
Flo 'Irian' Schaetz wrote:
> And thus spoke Andreas Leitgeb...
>
>> Given two Collection parameters a,b that I'm *not* supposed to modify,
>> what's the best way to iterate over the elements in a but not in b?

>
> I would say, that depends...
>
> ... how many elements do the collections have?
> ... how complex are the actions of get and contains?
>
>> Create a new collection d with elements of a, then d.removeAll(b),
>> then iterate over d.
>> or
>> for(ElemType e:a) { if (b.contains(e)) continue; ... }
>>
>> Actually it happens that Collection b is actually a Set,
>> so my preference would be for the second, except that it
>> feels hackish. Is this feeling right or wrong?

>
> Depends on the type of the set, imho: It it's a hashset, b.contains(e)
> would be aprox. constant, so I would prefer it, too.
>
> Flo

You can also get total O(|a|+|b|) time, regardless of the original
collection implementations, the first way, using a HashSet as the
working collection.

Patricia

Flo 'Irian' Schaetz
Guest
Posts: n/a

 01-06-2007
And thus spoke Patricia Shanahan...

> You can also get total O(|a|+|b|) time, regardless of the original
> collection implementations, the first way, using a HashSet as the
> working collection.

Good idea, didn't think of that. If a or b is a hashset, the 2nd way
would perhaps be better (O(a) or O(b), if I'm not mistaken), but
otherwise this sounds like a good idea.

Flo

Andreas Leitgeb
Guest
Posts: n/a

 01-07-2007
Flo 'Irian' Schaetz <(E-Mail Removed)> wrote:
> And thus spoke Patricia Shanahan...
>> You can also get total O(|a|+|b|) time, regardless of the original
>> collection implementations, the first way, using a HashSet as the
>> working collection.

>
> Good idea, didn't think of that. If a or b is a hashset, the 2nd way
> would perhaps be better (O(a) or O(b), if I'm not mistaken), but
> otherwise this sounds like a good idea.

The sizes are small enough that probably the overhead of creating
another HashSet outweighs the actual iteration.

Nevertheless I had to use a newly created temporary collection
anyway, because I found that the originally used collection "a"
could be replaced by some already maintained subset "a1" of "a",
which contained only the relevant elements, but is unfortunately
modified inside the loop (some of the iterated elements are being
removed).
Using an iterator (and iterator's remove) is a bit complicated,
because the remove actually happens inside another method...

Thanks however for all responses!

Mike Schilling
Guest
Posts: n/a

 01-07-2007

"Daniel Pitts" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) oups.com...
>
> Andreas Leitgeb wrote:
>> Given two Collection parameters a,b that I'm *not* supposed to modify,
>> what's the best way to iterate over the elements in a but not in b?
>>
>> Create a new collection d with elements of a, then d.removeAll(b),
>> then iterate over d.
>> or
>> for(ElemType e:a) { if (b.contains(e)) continue; ... }
>>
>> Actually it happens that Collection b is actually a Set,
>> so my preference would be for the second, except that it
>> feels hackish. Is this feeling right or wrong?

>
> Either way works. If you have the answer, why ask the question?
>
> If you wanted to get really tricky, you would create a custom
> collections called "Difference" which defines a view into the
> collection (a - b);

Or create a Subset class, constructed from a Set and an expression
determining which members of another Set are in the subset: (none of this
compiled or tested)

public class Subset extends AbstractSet implements Set {
public Subset(Set superset, Criterion criterion){ ... }
...
public interface Criterion {
public boolean isInSubset(Object member);
}
}

and implement a Difference criterion

public class Difference implements Subset.Criterion {
private Set toExclude;
public Difference(Set toExclude) {
this.toExclude = toExclude;
}
public boolean isInSubset(Object member) {
return !toExclude.contains(member);
}
}

Andreas Leitgeb
Guest
Posts: n/a

 01-07-2007
Mike Schilling <(E-Mail Removed)> wrote:
> "Daniel Pitts" <(E-Mail Removed)> wrote:
>> Andreas Leitgeb wrote:
>>> Given two Collection parameters a,b that I'm *not* supposed to modify,
>>> what's the best way to iterate over the elements in a but not in b?

>> Either way works. If you have the answer, why ask the question?

My question was for the choice among those. While I had a
tendency, I didn't consider that an answer, myself.

> Or create a Subset class, constructed from a Set and an expression
> determining which members of another Set are in the subset: (none of this
> compiled or tested)

I'd still need to implement an iterator to make it all work.
Depending on how I did that, it would be equivalent to either
of my own suggestions, but either way it would be much more "OO"
than mine

> public class Subset extends AbstractSet implements Set {

Btw., is there a reason for "implements Set", when AbstractSet

> public class Difference implements Subset.Criterion {
> [...]
> }

A truely elegant wrapping of my "if (c.contains(e)) continue;"

Mike Schilling
Guest
Posts: n/a

 01-07-2007

"Andreas Leitgeb" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed).. .
> Mike Schilling <(E-Mail Removed)> wrote:

>
>> public class Subset extends AbstractSet implements Set {

>
> Btw., is there a reason for "implements Set", when AbstractSet

Documentation.

John Ersatznom
Guest
Posts: n/a

 01-08-2007
Mike Schilling wrote:
> "Andreas Leitgeb" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed).. .
>
>>Mike Schilling <(E-Mail Removed)> wrote:

>
>
>>> public class Subset extends AbstractSet implements Set {

>>
>>Btw., is there a reason for "implements Set", when AbstractSet

>
> Documentation.

The class name contains "set". It extends a class whose name contains
"Set". "All implemented interfaces" will include "Set". What more
documentation do you want?

The really interesting thing here is whether to make Subset modifiable.
One supposes removal should remove the element from the backing Set, and
addition should add to the backing Set and throw something if the
criterion is violated (the new element would not actually now appear in
the Subset). Anyone who wants an unmodifiable one has
Collections.unmodifiableSet available to them, and anyone who wants a
Subset to diverge from its parent Set and just initially contain the
appropriate Subset, then be modifiable at will (and not reflect changes
in the backing Set), can use Set foo = new HashSet();
foo.addAll(mySubset); to copy the Subset into a normal-behaving HashSet.