generating unique set of dicts from a list of dicts

 
 
bruce
Guest
Posts: n/a
 
      01-10-2012
I'm trying to figure out how to generate a unique set of dicts from a
JSON list of dicts.

Initial list:
[{"pStart1a": {"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
{"pStart1":""},
{"pStart1a":{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
{"pStart1":""}]




I'm trying to get the following list of unique dicts, so that there
aren't any duplicate dicts.
I've searched various sites and SO and still have a mental block.

[
{"pStart1a":
{"termVal":"1122","termMenu":"CLASS_SRCH_WRK2_STRM","instVal":"OSUSI",
"instMenu":"CLASS_SRCH_WRK2_INSTITUTION","goBtn":"CLASS_SRCH_WRK2_SSR_PB_SRCH",
"pagechk":"CLASS_SRCH_WRK2_SSR_PB_SRCH","nPage":"CLASS_SRCH_WRK2_SSR_PB_CLASS_SRCH"},
"pSearch1a":
{"chk":"CLASS_SRCH_WRK2_MON","srchbtn":"DERIVED_CLSRCH_SSR_EXPAND_COLLAPS"}},
{"pStart1":""}]

I was considering iterating through the initial list, copying each
dict into a new list, and doing a basic comparison, adding the next
dict only if it's not already in the new list. Is there another/better way?
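
For reference, the iterate-and-compare idea described above could look roughly like this. It is only a minimal sketch: the function name dedupe_dicts and the shortened test_list are made up for illustration and are not from the original data. It relies on the fact that Python compares dicts by value, including nested dicts, so a plain "not in" test is enough.

```python
def dedupe_dicts(dicts):
    """Return the dicts with duplicates removed, keeping first-seen order."""
    seen = []
    for d in dicts:
        # dict equality is deep, so nested dicts are compared by value
        if d not in seen:
            seen.append(d)
    return seen


# Shortened, hypothetical stand-in for the test list (values abbreviated)
test_list = [
    {"pStart1a": {"termVal": "1122", "instVal": "OSUSI"},
     "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"}},
    {"pStart1": ""},
    {"pStart1a": {"termVal": "1122", "instVal": "OSUSI"},
     "pSearch1a": {"chk": "CLASS_SRCH_WRK2_MON"}},
    {"pStart1": ""},
]

result = dedupe_dicts(test_list)
print(len(result))
```

Each membership test scans the whole "seen" list, so the cost grows quadratically with the number of dicts. For a short list like this one that is unlikely to matter.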

I posted this to StackOverflow as well:
http://stackoverflow.com/questions/8...que-dict-items

There was a potential solution there that I couldn't understand:


-------------------------
The simplest approach -- using list(set(your_list_of_dicts)) -- won't
work because Python dictionaries are mutable and not hashable (that
is, they don't implement __hash__). This is because Python can't
guarantee that the hash of a dictionary won't change after you insert
it into a set or dict.
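
To see the failure described above, here is a small sketch. The two-element list_of_dicts is a made-up stand-in, and the exact wording of the error message varies between Python versions, so it is printed rather than assumed.

```python
list_of_dicts = [{"pStart1": ""}, {"pStart1": ""}]

error_message = None
try:
    # set() has to hash every element, and dicts cannot be hashed
    unique = list(set(list_of_dicts))
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The TypeError is raised as soon as set() tries to hash the first dict, so nothing is deduplicated at all.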

However, in your case, since you don't seem to be modifying the data
at all, you can compute your own hash and use it as the key of a
dictionary. That makes it relatively easy to find the unique JSON
objects without having to do a full recursive comparison of each
dictionary to the others.

First, we need a function to compute a hash of the dictionary. Rather
than trying to build our own hash function, let's use one of the
built-in ones from hashlib:

    import hashlib

    def dict_hash(d):
        # Build an MD5 digest from the dict's keys and values.
        out = hashlib.md5()
        # Sort the items so that equal dicts are always hashed in the same order.
        for key, value in sorted(d.items()):
            out.update(str(key).encode("utf-8"))
            out.update(str(value).encode("utf-8"))
        return out.hexdigest()

(Note that this relies on str(...) for each of your values
returning something unique -- if you have custom classes in the
dictionaries whose __str__ returns something like "MyClass
instance", this will fail or will require modification. Also, in your
example, your dictionaries are flat, but I'll leave it as an exercise
to the reader how to expand this solution to work with dictionaries
that contain other dicts or lists.)
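
One possible way to handle the nested case left as an exercise above is to use a canonical JSON string as the key instead of a hand-built hash. This is a different technique from the one in the quoted answer, and it is only a sketch: the names canonical_key and unique_by_key and the sample data are made up for illustration, and it assumes every value is JSON-serializable (which should hold for data that was loaded from JSON in the first place).

```python
import json


def canonical_key(d):
    # sort_keys=True orders keys at every nesting level, so equal dicts
    # serialize to the same string regardless of insertion order
    return json.dumps(d, sort_keys=True)


def unique_by_key(dicts):
    uniques_map = {}
    for d in dicts:
        # setdefault keeps the first dict seen for each canonical key
        uniques_map.setdefault(canonical_key(d), d)
    return list(uniques_map.values())


# Hypothetical sample data with nested dicts and lists
nested = [
    {"a": {"x": 1, "y": [1, 2]}, "b": ""},
    {"b": "", "a": {"y": [1, 2], "x": 1}},  # same content, different key order
    {"b": ""},
]

result = unique_by_key(nested)
print(len(result))
```

Because the key is an ordinary string, it can be used directly as a dictionary key, and the lookup cost per dict no longer depends on how many unique dicts have already been collected.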

Since dict_hash returns a string, which is immutable, you can now use
a dictionary to find the unique elements:

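    uniques_map = {}
    for d in list_of_dicts:
        # Posted as "uniques[dict_hash(d)] = d"; "uniques" is never defined
        # and appears to be a typo for uniques_map.
        uniques_map[dict_hash(d)] = d
    unique_dicts = list(uniques_map.values())

>>>> Not sure what "uniques" is, or what/how it should be defined.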



Thoughts/comments are welcome.

Thanks