Velocity Reviews > Perl > New FAQ: How do I compute the difference of two arrays?

# New FAQ: How do I compute the difference of two arrays?

The Poor
Guest
Posts: n/a

 09-26-2003
Here is my new code changed from the old code in FAQ.
Please add it to the perldoc for the next release of perl.

OLD:
+
How do I compute the difference of two arrays? How do I compute the
intersection of two arrays?

Use a hash. Here's code to do both and more. It assumes that each
element is unique in a given array:

@union = @intersection = @difference = ();
%count = ();
foreach \$element (@array1, @array2) { \$count{\$element}++ }
foreach \$element (keys %count) {
push @union, \$element;
push @{ \$count{\$element} > 1 ? \@intersection :
\@difference }, \$element;
}

Note that this is the *symmetric difference*, that is, all
elements in
either A or in B but not in both. Think of it as an xor operation.

-
New:
+
How do I compute the difference of two arrays? How do I compute the
intersection of two arrays?

Use a hash. Here's code to do both and more. It assumes that each
element is unique in a given array:

@union = @intersection = @in1notin2 = @in2notin1();
%count = ();
foreach \$element (@array1) { \$count{\$element}++ }
foreach \$element (@array2) { \$count{\$element}-- }
foreach \$element (keys %count) {
push @union, \$element;
push (@intersection, \$element) if \$count{\$element} == 0;
push (@in1notin2, \$element) if \$count{\$element} == 1;
push (@in2notin1, \$element) if \$count{\$element} == -1;
}
-

2 Questions:
how to fix it when the array is not uniqu?
how to simplfy the last 3 lines using case, ?: or something else?

push @intersection, \$element if \$count{\$element} == 0;
push @in1notin2, \$element if \$count{\$element} == 1;
push @in2notin1, \$element if \$count{\$element} == -1;

http://edealseek.com

Steve Grazzini
Guest
Posts: n/a

 09-26-2003
The Poor <(E-Mail Removed).2y.net> wrote:
> Here is my new code changed from the old code in FAQ.
> Please add it to the perldoc for the next release of perl.

This is a discussion group. FAQ patches (preferably in the form
of unified or context diffs) can be mailed to perlfaq-workers at
perl.org or posted from nntp.perl.org.

You should also include some rationale -- what's wrong with the
I'm not sure that it is an improvement, but you'll have to make
a case for it. (And fix the syntax error, etc.)

--
Steve

Jay Tilton
Guest
Posts: n/a

 09-26-2003
Steve Grazzini <(E-Mail Removed)> wrote:

: The Poor <(E-Mail Removed).2y.net> wrote:
: > Here is my new code changed from the old code in FAQ.
: > Please add it to the perldoc for the next release of perl.
:
: This is a discussion group. FAQ patches (preferably in the form
: of unified or context diffs) can be mailed to perlfaq-workers at
: perl.org or posted from nntp.perl.org.
:
: You should also include some rationale -- what's wrong with the
: I'm not sure that it is an improvement, but you'll have to make
: a case for it. (And fix the syntax error, etc.)

As well, when making that case it is good to keep in mind that the FAQ
answers are not intended to form a cookbook of pre-fab do-anything
solutions. That's more the domain of CPAN, or the actual Perl Cookbook.

Recipe 4.7 in Perl Cookbook (1st ed.), by the way, directly addresses
the problem of "Finding Elements in One Array but Not Another."

A more useful (and easier) change to the FAQ would be to add mention of
a module on CPAN that handles permutations of the question that the FAQ
module appears to be especially capable.

Bob Walton
Guest
Posts: n/a

 09-27-2003
The Poor wrote:

....
> 2 Questions:
> how to fix it when the array is not uniqu?

Make it unique:

{my %h or @array1_unique=(@h{@array1}=1 and keys %h)}

> how to simplfy the last 3 lines using case, ?: or something else?
>
> push @intersection, \$element if \$count{\$element} == 0;
> push @in1notin2, \$element if \$count{\$element} == 1;
> push @in2notin1, \$element if \$count{\$element} == -1;

One way:
push @{\$result[\$count{\$element}+1]},\$element;
#@{\$result[0][...]} is in2notin1
#@{\$result[1][...]} is intersection
#@{\$result[2][...]} is in1notin2

--
Bob Walton

Randal L. Schwartz
Guest
Posts: n/a

 09-27-2003
>>>>> "The" == The Poor <(E-Mail Removed).2y.net> writes:

The> -
The> New:
The> +
The> How do I compute the difference of two arrays? How do I compute the
The> intersection of two arrays?

The> Use a hash. Here's code to do both and more. It assumes that each
The> element is unique in a given array:

Here's code that doesn't require that uniqueness, and would be a better
candidate for the FAQ:

my %tally;
\$tally{\$_} .= "a" for @array_a;
\$tally{\$_} .= "b" for @array_b;
my @union = keys %tally;
my @intersection = grep \$tally{\$_} =~ /ab/, @union;
my @a_not_b = grep \$tally{\$_} =~ /a\$/, @union;
my @b_not_a = grep \$tally{\$_} =~ /^b/, @union;

print "Just another Perl hacker,";

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<(E-Mail Removed)> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Gregory Toomey
Guest
Posts: n/a

 09-27-2003
It was a dark and stormy night, and Randal L. Schwartz managed to scribble:

>>>>>> "The" == The Poor <(E-Mail Removed).2y.net> writes:

>
> The> -
> The> New:
> The> +
> The> How do I compute the difference of two arrays? How do I compute
> the The> intersection of two arrays?
>
> The> Use a hash. Here's code to do both and more. It assumes that each
> The> element is unique in a given array:
>
> Here's code that doesn't require that uniqueness, and would be a better
> candidate for the FAQ:
>
> my %tally;
> \$tally{\$_} .= "a" for @array_a;
> \$tally{\$_} .= "b" for @array_b;
> my @union = keys %tally;
> my @intersection = grep \$tally{\$_} =~ /ab/, @union;
> my @a_not_b = grep \$tally{\$_} =~ /a\$/, @union;
> my @b_not_a = grep \$tally{\$_} =~ /^b/, @union;
>
> print "Just another Perl hacker,";

Having coded this problem myself, your solution is minimal & very elegant (as expected).

gtoomey

Anno Siegel
Guest
Posts: n/a

 09-29-2003
Gregory Toomey <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> It was a dark and stormy night, and Randal L. Schwartz managed to scribble:
>
> >>>>>> "The" == The Poor <(E-Mail Removed).2y.net> writes:

> >
> > The> -
> > The> New:
> > The> +
> > The> How do I compute the difference of two arrays? How do I compute
> > the The> intersection of two arrays?
> >
> > The> Use a hash. Here's code to do both and more. It assumes that each
> > The> element is unique in a given array:
> >
> > Here's code that doesn't require that uniqueness, and would be a better
> > candidate for the FAQ:
> >
> > my %tally;
> > \$tally{\$_} .= "a" for @array_a;
> > \$tally{\$_} .= "b" for @array_b;
> > my @union = keys %tally;
> > my @intersection = grep \$tally{\$_} =~ /ab/, @union;
> > my @a_not_b = grep \$tally{\$_} =~ /a\$/, @union;
> > my @b_not_a = grep \$tally{\$_} =~ /^b/, @union;
> >
> > print "Just another Perl hacker,";

>
> Having coded this problem myself, your solution is minimal & very
> elegant (as expected).

....a variation on the theme:

my %tally;
\$tally{\$_} += 1 for @array_a; # or |=
\$tally{\$_} += 2 for @array_b;
# use more powers of two for more arrays
my @union = keys %tally;
my @intersection = grep \$tally{\$_} == 3, @union;
my \$a_not_b = grep \$tally{\$_} == 1, @union;
my \$b_not_a = grep \$tally{\$_} == 2, @union;
# etc.

Anno

David K. Wall
Guest
Posts: n/a

 09-29-2003
Anno Siegel <(E-Mail Removed)-berlin.de> wrote:

> Gregory Toomey <(E-Mail Removed)> wrote in comp.lang.perl.misc:
>> It was a dark and stormy night, and Randal L. Schwartz managed to
>> scribble:
>>
>> >>>>>> "The" == The Poor <(E-Mail Removed).2y.net> writes:
>> >
>> > The> -
>> > The> New:
>> > The> +
>> > The> How do I compute the difference of two arrays? How do I
>> > compute the The> intersection of two arrays?
>> >
>> > The> Use a hash. Here's code to do both and more. It
>> > assumes that each The> element is unique in a given array:
>> >
>> > Here's code that doesn't require that uniqueness, and would be
>> > a better candidate for the FAQ:
>> >
>> > my %tally;
>> > \$tally{\$_} .= "a" for @array_a;
>> > \$tally{\$_} .= "b" for @array_b;
>> > my @union = keys %tally;
>> > my @intersection = grep \$tally{\$_} =~ /ab/, @union;
>> > my @a_not_b = grep \$tally{\$_} =~ /a\$/, @union;
>> > my @b_not_a = grep \$tally{\$_} =~ /^b/, @union;
>> >
>> > print "Just another Perl hacker,";

>>
>> Having coded this problem myself, your solution is minimal & very
>> elegant (as expected).

>
> ...a variation on the theme:
>
> my %tally;
> \$tally{\$_} += 1 for @array_a; # or |=
> \$tally{\$_} += 2 for @array_b;
> # use more powers of two for more arrays
> my @union = keys %tally;
> my @intersection = grep \$tally{\$_} == 3, @union;

This assumes there are no duplicated values in either array, right?

> my \$a_not_b = grep \$tally{\$_} == 1, @union;

I think you mean @a_not_b.

> my \$b_not_a = grep \$tally{\$_} == 2, @union;

^
> # etc.

Randal's solution allowed for non-unique values in @array_a and
@array_b. The above doesn't produce correct results for, say,

my @array_a = qw(a a a y z);
my @array_b = qw(b b b x y z);

my @array_c = qw(c c c x z);

my %tally;
\$tally{\$_} .= "a" for @array_a;
\$tally{\$_} .= "b" for @array_b;
\$tally{\$_} .= "c" for @array_c;
my @union = keys %tally;
my @intersection = grep \$tally{\$_} =~ /a+b+c/, @union;
my @a_only = grep \$tally{\$_} =~ /^a+\$/, @union;
my @b_only = grep \$tally{\$_} =~ /^b+\$/, @union;
my @c_only = grep \$tally{\$_} =~ /^c+\$/, @union;
my @a_and_b = grep \$tally{\$_} =~ /a+b/, @union;
my @a_b_not_c = grep \$tally{\$_} =~ /a+b+(?!c)/, @union;

Anno Siegel
Guest
Posts: n/a

 09-29-2003
David K. Wall <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> Anno Siegel <(E-Mail Removed)-berlin.de> wrote:

[union, intersection of arrays]

> > ...a variation on the theme:
> >
> > my %tally;
> > \$tally{\$_} += 1 for @array_a; # or |=
> > \$tally{\$_} += 2 for @array_b;
> > # use more powers of two for more arrays
> > my @union = keys %tally;
> > my @intersection = grep \$tally{\$_} == 3, @union;

>
> This assumes there are no duplicated values in either array, right?

Yes, if you use +=, as in the code above. Not if you use |= instead,
as indicated in a comment. I must have been thinking of sets, not lists.

> > my \$a_not_b = grep \$tally{\$_} == 1, @union;

>
> I think you mean @a_not_b.

More sloppiness.

Thanks. The code should definitely use "\$tally{\$_} |= ...".

Anno

David K. Wall
Guest
Posts: n/a

 09-29-2003
Anno Siegel <(E-Mail Removed)-berlin.de> wrote:

> David K. Wall <(E-Mail Removed)> wrote in
> comp.lang.perl.misc:
>> Anno Siegel <(E-Mail Removed)-berlin.de> wrote:

>
> [union, intersection of arrays]
>
>> > ...a variation on the theme:
>> >
>> > my %tally;
>> > \$tally{\$_} += 1 for @array_a; # or |=
>> > \$tally{\$_} += 2 for @array_b;
>> > # use more powers of two for more arrays
>> > my @union = keys %tally;
>> > my @intersection = grep \$tally{\$_} == 3, @union;

>>
>> This assumes there are no duplicated values in either array,
>> right?

>
> Yes, if you use +=, as in the code above. Not if you use |=
> instead, as indicated in a comment. I must have been thinking of
> sets, not lists.

Aha. I didn't notice the comment. It makes sense now.

As an exercise for myself I found a number of subsets of three sets
this way. Although I didn't do any benchmarking I'm pretty sure this
is faster than the string method, but I must say I can think more
sequences. A trade-off as usual, I guess.

--
David Wall