Is there a unique method in python to unique a list?

 
 
Token Type
      09-09-2012
Is there a unique method in python to unique a list? thanks
 
Chris Angelico
      09-09-2012
On Sun, Sep 9, 2012 at 3:43 PM, Token Type <(E-Mail Removed)> wrote:
> Is there a unique method in python to unique a list? thanks


I don't believe there's a method for that, but if you don't care about
order, try turning your list into a set and then back into a list.
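
A minimal sketch of that round trip, using a made-up list `items` for illustration. The set() version may return the elements in any order. The dict.fromkeys() variant is an alternative not mentioned above; it keeps first-seen order, but only on Python 3.7+, where dicts preserve insertion order.

```python
# Hypothetical sample data; any list of hashable items works.
items = ["b", "a", "b", "c", "a"]

# Round trip through a set: duplicates vanish, order is not guaranteed.
unique_unordered = list(set(items))

# Alternative that keeps first-seen order (assumes Python 3.7+ dict ordering).
unique_ordered = list(dict.fromkeys(items))

print(sorted(unique_unordered))
print(unique_ordered)
```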

ChrisA
 
Chris Angelico
      09-09-2012
On Sun, Sep 9, 2012 at 4:29 PM, John H. Li <(E-Mail Removed)> wrote:
> However, if I don't put list(set(lemma_list)) to a variable name, it works
> much faster.


Try backdenting that statement. You're currently doing it at every
iteration of the loop - that's why it's so much slower.

But you'll probably find it better to work with the set directly,
instead of uniquifying a list as a separate operation.
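
To illustrate both points with made-up data (the names `rows`, `slow_unique_count`, and `fast_unique_count` are hypothetical): the first function rebuilds the deduplicated list on every pass through the loop, while the second accumulates into a set directly and never builds an intermediate list.

```python
# Hypothetical sample data standing in for the per-synset lemma lists.
rows = [["a", "b"], ["b", "c"], ["c", "d"]]

def slow_unique_count(rows):
    names = []
    for row in rows:
        names.extend(row)
        # Indented too far: the whole list is deduplicated on every iteration.
        unique = list(set(names))
    return len(unique)

def fast_unique_count(rows):
    unique = set()
    for row in rows:
        # The set silently ignores items it already holds.
        unique.update(row)
    return len(unique)

print(slow_unique_count(rows), fast_unique_count(rows))
```

Both give the same count; the difference is only in how much work is repeated inside the loop.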

ChrisA
 
Token Type
      09-09-2012

> Try backdenting that statement. You're currently doing it at every
> iteration of the loop - that's why it's so much slower.


Thanks. It works now.

>>> def average_polysemy(pos):
        synset_list = list(wn.all_synsets(pos))
        sense_number = 0
        lemma_list = []
        for synset in synset_list:
            lemma_list.extend(synset.lemma_names)
        for lemma in list(set(lemma_list)):
            sense_number_new = len(wn.synsets(lemma, pos))
            sense_number = sense_number + sense_number_new
        return sense_number/len(set(lemma_list))

>>> average_polysemy('n')
1


> But you'll probably find it better to work with the set directly,
> instead of uniquifying a list as a separate operation.


Yes, the second method below still runs faster if I don't assign list(set(lemma_list)) to a separate variable. Why does this happen?

>>> def average_polysemy(pos):
        synset_list = list(wn.all_synsets(pos))
        sense_number = 0
        lemma_list = []
        for synset in synset_list:
            lemma_list.extend(synset.lemma_names)
        for lemma in list(set(lemma_list)):
            sense_number_new = len(wn.synsets(lemma, pos))
            sense_number = sense_number + sense_number_new
        return sense_number/len(set(lemma_list))

>>> average_polysemy('n')
1
 
Serhiy Storchaka
      09-09-2012
On 09.09.12 08:47, Donald Stufft wrote:
> If you don't need to retain order you can just use a set,


Only if elements are hashable.
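
For example, a list of lists cannot go through set() because lists are unhashable. One fallback, sketched here with a hypothetical helper `unique_unhashable`, compares by equality instead; it keeps order but takes quadratic time.

```python
def unique_unhashable(items):
    """Remove duplicates using == only; works for unhashable items, O(n**2)."""
    result = []
    for item in items:
        # Linear scan of what has been kept so far; no hashing involved.
        if item not in result:
            result.append(item)
    return result

# Hypothetical sample data: the inner lists are unhashable.
pairs = [[1, 2], [3, 4], [1, 2]]

try:
    set(pairs)
except TypeError as exc:
    print("set() failed:", exc)

print(unique_unhashable(pairs))
```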


 
Paul Rubin
      09-09-2012
Token Type <(E-Mail Removed)> writes:
> >>> def average_polysemy(pos):
>         synset_list = list(wn.all_synsets(pos))
>         sense_number = 0
>         lemma_list = []
>         for synset in synset_list:
>             lemma_list.extend(synset.lemma_names)
>         for lemma in list(set(lemma_list)):
>             sense_number_new = len(wn.synsets(lemma, pos))
>             sense_number = sense_number + sense_number_new
>         return sense_number/len(set(lemma_list))


I think you mean (untested):

synsets = wn.all_synsets(pos)
sense_number = 0
lemma_set = set()
for synset in synsets:
    lemma_set.add(synset.lemma_names)
for lemma in lemma_set:
    sense_number += len(wn.synsets(lemma, pos))
return sense_number / len(lemma_set)
 
Paul Rubin
      09-09-2012
Paul Rubin <(E-Mail Removed)> writes:
> I think you mean (untested):
>
> synsets = wn.all_synsets(pos)
> sense_number = 0
> lemma_set = set()
> for synset in synsets:
>     lemma_set.add(synset.lemma_names)
> for lemma in lemma_set:
>     sense_number += len(wn.synsets(lemma, pos))
> return sense_number / len(lemma_set)


Or even:

lemma_set = set(lemma for synset in wn.all_synsets(pos) for lemma in synset.lemma_names)
sense_number = sum(len(wn.synsets(lemma, pos)) for lemma in lemma_set)
return sense_number / len(lemma_set)
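
For readers without NLTK at hand, here is a sketch of the same computation against a tiny stand-in for WordNet. The dictionary `FAKE_SYNSETS` and its contents are invented for illustration and are not real WordNet data; real code would call `wn.all_synsets` and `wn.synsets` instead. It also converts to float before dividing, since under Python 2 dividing one int by another truncates, which would explain a result of exactly `1`.

```python
# Invented stand-in data: synset name -> lemma names (NOT real WordNet content).
FAKE_SYNSETS = {
    "bank.n.01": ["bank", "riverbank"],
    "bank.n.02": ["bank", "depository"],
    "dog.n.01": ["dog"],
}

def average_polysemy(synsets):
    # Collect every distinct lemma across all synsets.
    lemma_set = set()
    for lemmas in synsets.values():
        lemma_set.update(lemmas)
    # A lemma's sense count is the number of synsets that list it.
    sense_number = sum(
        sum(1 for lemmas in synsets.values() if lemma in lemmas)
        for lemma in lemma_set
    )
    # float() avoids integer truncation under Python 2.
    return float(sense_number) / len(lemma_set)

print(average_polysemy(FAKE_SYNSETS))
```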
 
Token Type
      09-09-2012
Thanks. I tried to use set() as you suggested, but without success. Please see:
>>> synsets = list(wn.all_synsets('n'))
>>> synsets[:5]
[Synset('entity.n.01'), Synset('physical_entity.n.01'), Synset('abstraction.n.06'), Synset('thing.n.12'), Synset('object.n.01')]
>>> lemma_set = set()
>>> for synset in synsets:
        lemma_set.add(synset.lemma_names)

Traceback (most recent call last):
  File "<pyshell#43>", line 2, in <module>
    lemma_set.add(synset.lemma_names)
TypeError: unhashable type: 'list'
>>> for synset in synsets:
        lemma_set.add(set(synset.lemma_names))

Traceback (most recent call last):
  File "<pyshell#45>", line 2, in <module>
    lemma_set.add(set(synset.lemma_names))
TypeError: unhashable type: 'set'


 
Chris Angelico
      09-09-2012
On Sun, Sep 9, 2012 at 11:44 PM, Token Type <(E-Mail Removed)> wrote:
> lemma_set.add(synset.lemma_names)


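That tries to add the whole list as a single object, which doesn't
work because lists are mutable and therefore unhashable, so they can't
be set elements (the same goes for sets, as your second attempt shows).
There are two solutions, depending on what you want to do.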

1) If you want each addition to remain discrete, make a tuple instead:
    lemma_set.add(tuple(synset.lemma_names))

2) If you want to add the elements of that list individually into the
set, use update:
    lemma_set.update(synset.lemma_names)

I'm thinking you probably want option 2 here.
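
A small sketch of the difference between the two options, using a made-up list `names` in place of `synset.lemma_names`:

```python
# Hypothetical stand-in for one synset's lemma names.
names = ["dog", "domestic_dog"]

# Option 1: the whole group goes in as a single hashable tuple.
as_tuples = set()
as_tuples.add(tuple(names))

# Option 2: each string becomes its own element of the set.
as_items = set()
as_items.update(names)

print(len(as_tuples), len(as_items))
```

With option 1 the set holds one element per synset; with option 2 it holds one element per distinct lemma, which is what a per-lemma count needs.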

ChrisA
 