Velocity Reviews > Re: best way to compare contents of 2 lists?

# Re: best way to compare contents of 2 lists?

Esmail
Guest
Posts: n/a

 04-24-2009
David Robinow wrote:
> On Thu, Apr 23, 2009 at 9:31 PM, Esmail <(E-Mail Removed)> wrote:
>> What is the best way to compare the *contents* of two different
>> lists regardless of their respective order? The lists will have
>> the same number of items, and be of the same type.
>>
>> E.g. a trivial example (my lists will be larger),
>>
>> a=[1, 2, 3]
>>
>> b=[2, 3, 1]
>>
>> should yield true if a==b
>>
>> I suppose I could sort them and then compare them. I.e.,
>>
>> sorted(a)==sorted(b)
>>
>>
>> I am wondering if there is a more efficient/preferred way to do so.
>>
>> Thanks,
>> Esmail
>>
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>>

>
> set(a) == set(b) # test if a and b have the same elements
>
> # check that each list has the same number of each element
> # i.e. [1,2,1,2] == [1,1,2,2], but [1,2,2,2] != [1,1,1,2]
> for elem in set(a):
> a.count(elem) == b.count(elem)

Ah .. this part would take care of different number of duplicates
in the lists. Cool.

thanks,
Esmail

> --
> http://mail.python.org/mailman/listinfo/python-list
>

John Yeung
Guest
Posts: n/a

 04-24-2009
Esmail <(E-Mail Removed)> wrote:
> What is the best way to compare the *contents* of two different
> lists regardless of their respective order? The lists will have
> the same number of items, and be of the same type.

"Best" can mean different things. Fastest? Shortest code? Most

> David Robinow wrote:
> > set(a) == set(b) * *# test if a and b have the same elements

>
> > # check that each list has the same number of each element
> > # i.e. * *[1,2,1,2] == [1,1,2,2], but [1,2,2,2] != [1,1,1,2]
> > for elem in set(a):
> > * a.count(elem) == b.count(elem)

>
> Ah .. this part would take care of different number of duplicates
> in the lists. Cool.

It takes care of the duplicates, but so does your initial solution,
which I like best:

> sorted(a)==sorted(b)

This is concise, clear, and in my opinion, the most Pythonic. It may
well even be the fastest. (If you didn't have to match up the numbers
of duplicates, the set solution would be most Pythonic and probably
fastest.)

John

Piet van Oostrum
Guest
Posts: n/a

 04-24-2009
>>>>> John Yeung <(E-Mail Removed)> (JY) wrote:

>JY> It takes care of the duplicates, but so does your initial solution,
>JY> which I like best:

>>> sorted(a)==sorted(b)

>JY> This is concise, clear, and in my opinion, the most Pythonic. It may
>JY> well even be the fastest. (If you didn't have to match up the numbers
>JY> of duplicates, the set solution would be most Pythonic and probably
>JY> fastest.)

But it may fail if the list cannot be sorted because it contains
incomparable elements:

>>> a = [1, 2, 3+4j]
>>> b = [2, 3+4j, 1]
>>> set(a) == set(b)

True
>>> sorted(a)==sorted(b)

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers

In Python 3 there are even more incomparable objects pairs.
--
Piet van Oostrum <(E-Mail Removed)>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: http://www.velocityreviews.com/forums/(E-Mail Removed)

Steven D'Aprano
Guest
Posts: n/a

 04-24-2009
On Thu, 23 Apr 2009 21:51:42 -0400, Esmail wrote:

>> set(a) == set(b) # test if a and b have the same elements
>>
>> # check that each list has the same number of each element # i.e.
>> [1,2,1,2] == [1,1,2,2], but [1,2,2,2] != [1,1,1,2] for elem in set(a):
>> a.count(elem) == b.count(elem)

>
> Ah .. this part would take care of different number of duplicates in the
> lists. Cool.

At significant cost of extra work.

Counting the number of times a single element occurs in the list is O(N).
Counting the number of times every element occurs in the list is O(N**2).
Sorting is O(N*log N), so for large lists, sorting will probably be much
cheaper.

--
Steven

Esmail
Guest
Posts: n/a

 04-24-2009
Piet van Oostrum wrote:
>>>>>> John Yeung <(E-Mail Removed)> (JY) wrote:

>
>> JY> It takes care of the duplicates, but so does your initial solution,
>> JY> which I like best:

>
>>>> sorted(a)==sorted(b)

>
>> JY> This is concise, clear, and in my opinion, the most Pythonic. It may
>> JY> well even be the fastest. (If you didn't have to match up the numbers
>> JY> of duplicates, the set solution would be most Pythonic and probably
>> JY> fastest.)

>
> But it may fail if the list cannot be sorted because it contains
> incomparable elements:

Ah .. good point .. something I had not considered. Lucky for me in
this case it's not an issue, but something I'll keep in mind for the
future.

Esmail
Guest
Posts: n/a

 04-24-2009
John Yeung wrote:
> so does your initial solution,
> which I like best:
>
>> sorted(a)==sorted(b)

>
> This is concise, clear, and in my opinion, the most Pythonic. It may
> well even be the fastest.

Great .. I can live with that

Terry Reedy
Guest
Posts: n/a

 04-24-2009
Steven D'Aprano wrote:
> On Thu, 23 Apr 2009 21:51:42 -0400, Esmail wrote:
>
>>> set(a) == set(b) # test if a and b have the same elements
>>>
>>> # check that each list has the same number of each element # i.e.
>>> [1,2,1,2] == [1,1,2,2], but [1,2,2,2] != [1,1,1,2] for elem in set(a):
>>> a.count(elem) == b.count(elem)

>> Ah .. this part would take care of different number of duplicates in the
>> lists. Cool.

>
> At significant cost of extra work.
>
> Counting the number of times a single element occurs in the list is O(N).
> Counting the number of times every element occurs in the list is O(N**2).

A frequency dict should be O(n) also, and hence faster than sorting.

> Sorting is O(N*log N), so for large lists, sorting will probably be much
> cheaper.
>
>

 Posting Rules You may not post new threads You may not post replies You may not post attachments You may not edit your posts BB code is On Smilies are On [IMG] code is On HTML code is OffTrackbacks are On Pingbacks are On Refbacks are Off Forum Rules

 Similar Threads Thread Thread Starter Forum Replies Last Post Zachary Dziura Python 12 06-13-2011 09:33 PM norseman Python 3 04-24-2009 10:09 PM Esmail Python 3 04-24-2009 11:20 AM Esmail Python 0 04-24-2009 01:31 AM Robin Siebler Python 6 09-15-2004 04:12 AM