toy list processing problem: collect similar terms

 
 
Xah Lee
      09-26-2010
Here's an interesting toy list processing problem.

I have a list of lists, where each sublist is labelled by
a number. I need to collect together the contents of all sublists
sharing the same label. So if I have the list

((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t))

where the first element of each sublist is the label, I need to
produce:

output:
((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))

A Mathematica solution is here:
http://xahlee.org/UnixResource_dir/w...tions_mma.html

R5RS Scheme lisp solution:
http://xahlee.org/UnixResource_dir/w...work_gmail.scm
by Sourav Mukherjee

Also, a Common Lisp solution can be found here:
http://groups.google.com/group/comp....ded8824bc750b?

Anyone care to give a solution in Python, Perl, JavaScript, or another
language? I'm guessing the Scheme solution can be much improved, perhaps
by using some library, but that seems to show Scheme is pretty weak if
the library is non-standard.

Xah ∑ xahlee.org ☄
 
Gary Herron
      09-26-2010
On 09/25/2010 09:05 PM, Xah Lee wrote:
> here's a interesting toy list processing problem.
>
> I have a list of lists, where each sublist is labelled by
> a number. I need to collect together the contents of all sublists
> sharing
> the same label. So if I have the list
>
> ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
> r) (5 s t))
>
> where the first element of each sublist is the label, I need to
> produce:
>
> output:
> ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
>
> a Mathematica solution is here:
> http://xahlee.org/UnixResource_dir/w...tions_mma.html
>
> R5RS Scheme lisp solution:
> http://xahlee.org/UnixResource_dir/w...work_gmail.scm
> by Sourav Mukherjee
>
> also, a Common Lisp solution can be found here:
> http://groups.google.com/group/comp....ded8824bc750b?
>
> anyone care to give a solution in Python, Perl, javascript, or other
> lang? am guessing the scheme solution can be much improved... perhaps
> using some lib but that seems to show scheme is pretty weak if the lib
> is non-standard.
>
> Xah ∑ xahlee.org ☄
>



Python 3: (I have not tried to match the exact format of your output,
but I get the right things in the right order.)

data = ((0,'a','b'), (1,'c','d'), (2,'e','f'), (3,'g','h'),
        (1,'i','j'), (2,'k','l'), (4,'m','n'), (2,'o','p'),
        (4,'q','r'), (5,'s','t'))

from collections import OrderedDict
r = OrderedDict()  # remembers the order in which labels first appear
for label,*rest in data:
    r.setdefault(label, []).extend(rest)
print(list(r.values()))

produces:

[['a', 'b'], ['c', 'd', 'i', 'j'], ['e', 'f', 'k', 'l', 'o', 'p'], ['g',
'h'], ['m', 'n', 'q', 'r'], ['s', 't']]


--
Gary Herron, PhD.
Department of Computer Science
DigiPen Institute of Technology
(425) 895-4418


 
Alexander Burger
      09-26-2010
In PicoLisp:

(mapcar
   '((X) (apply conc (cdr X)))
   (group List) )

Cheers,
- Alex
 
Paul Rubin
      09-26-2010
Python solution follows. Removed all crossposts since massive
crossposting is a standard trolling tactic.

from collections import defaultdict

def collect(xss):
    d = defaultdict(list)
    for xs in xss:
        d[xs[0]].extend(xs[1:])
    # note: this sorts the collected lists themselves, not the labels
    return sorted(v for k,v in d.items())

y = [['0','a','b'], ['1','c','d'], ['2','e','f'], ['3','g','h'],
     ['1','i','j'], ['2','k','l'], ['4','m','n'], ['2','o','p'],
     ['4','q','r'], ['5','s','t']]

print(collect(y))
 
Paul Rubin
      09-26-2010
Python solution follows (earlier one with an error cancelled). All
crossposting removed since crossposting is a standard trolling tactic.

from collections import defaultdict

def collect(xss):
    d = defaultdict(list)
    for xs in xss:
        d[xs[0]].extend(xs[1:])
    # sort the (label, items) pairs by label, then keep only the items
    return list(v for k,v in sorted(d.items()))

y = [[0,'a','b'], [1,'c','d'], [2,'e','f'], [3,'g','h'], [1,'i','j'],
     [2,'k','l'], [4,'m','n'], [2,'o','p'], [4,'q','r'], [5,'s','t']]

print(collect(y))
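
The solutions so far order the result in two different ways: by sorted label (as here) or by the order in which labels first appear (as in the OrderedDict version). The sample input cannot tell them apart, since its labels first appear in ascending order. A rough sketch of the difference follows; the function names and the shuffled input are made up for illustration, and the second function assumes Python 3.7 or later, where a plain dict keeps insertion order.

```python
from collections import defaultdict

def collect_sorted(rows):
    """Group items by label; result ordered by sorted label."""
    d = defaultdict(list)
    for row in rows:
        d[row[0]].extend(row[1:])
    return [items for _, items in sorted(d.items())]

def collect_first_seen(rows):
    """Group items by label; result ordered by first appearance.

    Assumes Python 3.7+, where dict preserves insertion order.
    """
    d = {}
    for row in rows:
        d.setdefault(row[0], []).extend(row[1:])
    return list(d.values())

# Hypothetical input whose labels do not first appear in ascending order.
shuffled = [[2, 'e', 'f'], [0, 'a', 'b'], [2, 'k', 'l'], [1, 'c', 'd']]

by_label = collect_sorted(shuffled)
by_first_seen = collect_first_seen(shuffled)
print(by_label)       # [['a', 'b'], ['c', 'd'], ['e', 'f', 'k', 'l']]
print(by_first_seen)  # [['e', 'f', 'k', 'l'], ['a', 'b'], ['c', 'd']]
```

Which ordering is wanted is not stated in the original problem, so either reading seems defensible.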
 
livibetter
      09-26-2010
Here is mine for Python:

l = [[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'],
     [1, 'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'],
     [4, 'q', 'r'], [5, 's', 't']]
d = {}
for idx, items in [(e[0], e[1:]) for e in l]:
    # append to an existing group, or start a new one for this label
    d[idx] = d[idx] + items if idx in d else items
print(list(d.values()))

Output:
[['a', 'b'], ['c', 'd', 'i', 'j'], ['e', 'f', 'k', 'l', 'o', 'p'],
['g', 'h'], ['m', 'n', 'q', 'r'], ['s', 't']]
 
Arnaud Delobelle
      09-26-2010
On 26 Sep, 08:47, livibetter wrote:
> Here is mine for Python:
>
> l = [[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'], [1,
> 'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'], [4, 'q', 'r'],
> [5, 's', 't']]
> d = {}
> for idx, items in [(e[0], e[1:]) for e in l]: d[idx] = d[idx] + items
> if idx in d else items
> print d.values()
>
> Output:
> [['a', 'b'], ['c', 'd', 'i', 'j'], ['e', 'f', 'k', 'l', 'o', 'p'],
> ['g', 'h'], ['m', 'n', 'q', 'r'], ['s', 't']]


from itertools import groupby
from operator import itemgetter

l = [[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'],
     [1, 'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'],
     [4, 'q', 'r'], [5, 's', 't']]

[
    [x for g in gs for x in g[1:]]
    for _, gs in groupby(sorted(l), itemgetter(0))
]
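
One caveat with the groupby approach: sorted(l) compares whole sublists, so when two sublists share a label their items decide the order. Passing key=itemgetter(0) to sorted compares labels only, and because Python's sort is stable, sublists with equal labels stay in input order. A small sketch of the difference, using a made-up input and a hypothetical function name:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input: the items under label 1 are not in sorted order.
rows = [[1, 'z', 'y'], [0, 'a', 'b'], [1, 'c', 'd']]

def collect_groupby(rows):
    """Group by label, keeping input order within each label."""
    # Sort on the label only; the stable sort keeps equal labels in input order.
    ordered = sorted(rows, key=itemgetter(0))
    return [[x for row in group for x in row[1:]]
            for _, group in groupby(ordered, key=itemgetter(0))]

stable = collect_groupby(rows)

# Sorting whole rows instead lets the items break ties between equal labels.
whole_row = [[x for row in group for x in row[1:]]
             for _, group in groupby(sorted(rows), key=itemgetter(0))]

print(stable)     # [['a', 'b'], ['z', 'y', 'c', 'd']]
print(whole_row)  # [['a', 'b'], ['c', 'd', 'z', 'y']]
```

On the sample data from the original post both variants give the same result, so the distinction only matters for other inputs.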

--
Arnaud
 
Dr.Ruud
      09-26-2010
On 2010-09-26 06:05, Xah Lee wrote:

> I have a list of lists, where each sublist is labelled by
> a number. I need to collect together the contents of all sublists
> sharing the same label. So if I have the list
>
> ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t))
>
> where the first element of each sublist is the label, I need to
> produce:
>
> output:
> ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))


The input is a string on STDIN,
and the output is a string on STDOUT?


Use a hash:

perl -MData::Dumper -wle '$Data::Dumper::Sortkeys = 1;
my $t = "((0 a b) (1 c d) (2 e f) (3 g h) (1 i j)"
. " (2 k l) (4 m n) (2 o p) (4 q r) (5 s t))";

push @{ $h{ $1 } }, $2 while $t =~ /(\w+)([^)]*)/g; # gist

print Dumper \%h;
'

or an array:

perl -wle '
my $t = "((0 a b) (1 c d) (2 e f) (3 g h) (1 i j)"
. " (2 k l) (4 m n) (2 o p) (4 q r) (5 s t))";

push @{$a[$1]},$2 while $t =~ /(\w+)\s+([^)]*)/g; # gist.1
print "((".join(") (",map join(" ",@$_),@a )."))"; # gist.2
'


Or if the list is not just a string, but a real data structure in the
script:

perl -wle'
my $t = [ [qw/0 a b/], [qw/1 c d/], [qw/2 e f/], [qw/3 g h/],
[qw/1 i j/], [qw/2 k l/], [qw/4 m n/], [qw/2 o p/],
[qw/4 q r/], [qw/5 s t/] ];

push @{ $a[ $_->[0] ] }, [ @$_[ 1, 2 ] ] for @$t; # AoAoA

printf "((%s))\n", join ") (",
map join( " ",
map join( " ", @$_ ), @$_
), @a;
'

Etc.
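
For comparison with the string-based Perl one-liners above, here is a rough Python sketch of the same idea: treat the input as a string, pull out each parenthesised group with a regular expression, and print the result in the original Lisp-like format. The function name is made up, and the sketch assumes integer labels and no empty or nested groups.

```python
import re

text = ("((0 a b) (1 c d) (2 e f) (3 g h) (1 i j)"
        " (2 k l) (4 m n) (2 o p) (4 q r) (5 s t))")

def collect_from_string(s):
    """Parse '((label item ...) ...)' and return the grouped items as a string.

    Assumes integer labels and non-empty, non-nested inner groups.
    """
    groups = {}
    # Each match is the text inside one innermost "( ... )" pair.
    for inner in re.findall(r"\(([^()]*)\)", s):
        label, *items = inner.split()
        groups.setdefault(int(label), []).extend(items)
    # Emit the groups in ascending label order.
    body = ") (".join(" ".join(groups[k]) for k in sorted(groups))
    return "((" + body + "))"

output = collect_from_string(text)
print(output)  # ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
```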

--
Ruud

 
Jürgen Exner
      09-26-2010
Alexander Burger wrote:
>In PicoLisp:


What the f*** does PicoLisp have to do with Perl?

jue
 
Jürgen Exner
      09-26-2010
livibetter wrote:
>Here is mine for Python:


What the f*** does Python have to do with Perl?

jue
 