Velocity Reviews

Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   Python (http://www.velocityreviews.com/forums/f43-python.html)
-   -   toy list processing problem: collect similar terms (http://www.velocityreviews.com/forums/t734133-toy-list-processing-problem-collect-similar-terms.html)

Xah Lee 09-26-2010 04:05 AM

toy list processing problem: collect similar terms
 
here's a interesting toy list processing problem.

I have a list of lists, where each sublist is labelled by
a number. I need to collect together the contents of all sublists
sharing
the same label. So if I have the list

((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
r) (5 s t))

where the first element of each sublist is the label, I need to
produce:

output:
((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))

a Mathematica solution is here:
http://xahlee.org/UnixResource_dir/w...tions_mma.html

R5RS Scheme lisp solution:
http://xahlee.org/UnixResource_dir/w...work_gmail.scm
by Sourav Mukherjee

also, a Common Lisp solution can be found here:
http://groups.google.com/group/comp....ded8824bc750b?

anyone care to give a solution in Python, Perl, javascript, or other
lang? am guessing the scheme solution can be much improved... perhaps
using some lib but that seems to show scheme is pretty weak if the lib
is non-standard.

Xah ∑ xahlee.org ☄

Gary Herron 09-26-2010 05:21 AM

Re: toy list processing problem: collect similar terms
 
On 09/25/2010 09:05 PM, Xah Lee wrote:
> here's a interesting toy list processing problem.
>
> I have a list of lists, where each sublist is labelled by
> a number. I need to collect together the contents of all sublists
> sharing
> the same label. So if I have the list
>
> ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
> r) (5 s t))
>
> where the first element of each sublist is the label, I need to
> produce:
>
> output:
> ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
>
> a Mathematica solution is here:
> http://xahlee.org/UnixResource_dir/w...tions_mma.html
>
> R5RS Scheme lisp solution:
> http://xahlee.org/UnixResource_dir/w...work_gmail.scm
> by Sourav Mukherjee
>
> also, a Common Lisp solution can be found here:
> http://groups.google.com/group/comp....ded8824bc750b?
>
> anyone care to give a solution in Python, Perl, javascript, or other
> lang? am guessing the scheme solution can be much improved... perhaps
> using some lib but that seems to show scheme is pretty weak if the lib
> is non-standard.
>
> Xah ∑ xahlee.org ☄
>



Python 3: (I have not tried to match the exact format of your output,
but I get the right things is the right order.)

data = ((0,'a','b'), (1,'c','d'), (2,'e','f'), (3,'g','h'),
(1,'i','j'), (2,'k','l'), (4,'m','n'), (2,'o','p'),
(4,'q','r'), (5,'s','t'))

from collections import OrderedDict
r = OrderedDict()
for label,*rest in data:
r.setdefault(label, []).extend(rest)
print(list(r.values()))

produces:

(['a', 'b'], ['c', 'd', 'i', 'j'], ['e', 'f', 'k', 'l', 'o', 'p'], ['g',
'h'], ['m', 'n', 'q', 'r'], ['s', 't'])


--
Gary Herron, PhD.
Department of Computer Science
DigiPen Institute of Technology
(425) 895-4418



Alexander Burger 09-26-2010 05:29 AM

Re: toy list processing problem: collect similar terms
 
In PicoLisp:

(mapcar
'((X) (apply conc (cdr X)))
(group List) )

Cheers,
- Alex

Paul Rubin 09-26-2010 06:11 AM

Re: toy list processing problem: collect similar terms
 
Python solution follows. Removed all crossposts since massive
crossposting is a standard trolling tactic.

from collections import defaultdict

def collect(xss):
d = defaultdict(list)
for xs in xss:
d[xs[0]].extend(xs[1:])
return sorted(v for k,v in d.items())

y = [['0','a','b'], ['1','c','d'], ['2','e','f'], ['3','g','h'],
['1','i','j'], ['2','k','l'], ['4','m','n'], ['2','o','p'],
['4','q','r'], ['5','s','t']]

print collect(y)

Paul Rubin 09-26-2010 06:17 AM

Re: toy list processing problem: collect similar terms
 
Python solution follows (earlier one with an error cancelled). All
crossposting removed since crossposting is a standard trolling tactic.

from collections import defaultdict

def collect(xss):
d = defaultdict(list)
for xs in xss:
d[xs[0]].extend(xs[1:])
return list(v for k,v in sorted(d.items()))

y = [[0,'a','b'], [1,'c','d'], [2,'e','f'], [3,'g','h'], [1,'i','j'],
[2,'k','l'], [4,'m','n'], [2,'o','p'], [4,'q','r'], [5,'s','t']]

print collect(y)

livibetter 09-26-2010 07:47 AM

Re: toy list processing problem: collect similar terms
 
Here is mine for Python:

l = [[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'], [1,
'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'], [4, 'q', 'r'],
[5, 's', 't']]
d = {}
for idx, items in [(e[0], e[1:]) for e in l]: d[idx] = d[idx] + items
if idx in d else items
print d.values()

Output:
[['a', 'b'], ['c', 'd', 'i', 'j'], ['e', 'f', 'k', 'l', 'o', 'p'],
['g', 'h'], ['m', 'n', 'q', 'r'], ['s', 't']]

Arnaud Delobelle 09-26-2010 08:29 AM

Re: toy list processing problem: collect similar terms
 
On 26 Sep, 08:47, livibetter <livibet...@gmail.com> wrote:
> Here is mine for Python:
>
> l = [[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'], [1,
> 'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'], [4, 'q', 'r'],
> [5, 's', 't']]
> d = {}
> for idx, items in [(e[0], e[1:]) for e in l]: d[idx] = d[idx] + items
> if idx in d else items
> print d.values()
>
> Output:
> [['a', 'b'], ['c', 'd', 'i', 'j'], ['e', 'f', 'k', 'l', 'o', 'p'],
> ['g', 'h'], ['m', 'n', 'q', 'r'], ['s', 't']]


from itertools import groupby
from operator import itemgetter

l = [[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'], [1,
'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'], [4, 'q',
'r'],
[5, 's', 't']]

[
[x for g in gs for x in g[1:]]
for _, gs in groupby(sorted(l), itemgetter(0))
]

--
Arnaud

Dr.Ruud 09-26-2010 09:41 AM

Re: toy list processing problem: collect similar terms
 
On 2010-09-26 06:05, Xah Lee wrote:

> I have a list of lists, where each sublist is labelled by
> a number. I need to collect together the contents of all sublists
> sharing the same label. So if I have the list
>
> ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t))
>
> where the first element of each sublist is the label, I need to
> produce:
>
> output:
> ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))


The input is a string on STDIN,
and the output is a string on STDOUT?


Use a hash:

perl -MData::Dumper -wle '$Data::Dumper::Sortkeys = 1;
my $t = "((0 a b) (1 c d) (2 e f) (3 g h) (1 i j)"
. " (2 k l) (4 m n) (2 o p) (4 q r) (5 s t))";

push @{ $h{ $1 } }, $2 while $t =~ /(\w+)([^)]*)/g; # gist

print Dumper \%h;
'

or an array:

perl -wle '
my $t = "((0 a b) (1 c d) (2 e f) (3 g h) (1 i j)"
. " (2 k l) (4 m n) (2 o p) (4 q r) (5 s t))";

push @{$a[$1]},$2 while $t =~ /(\w+)\s+([^)]*)/g; # gist.1
print "((".join(") (",map join(" ",@$_),@a )."))"; # gist.2
'


Or if the list is not just a string, but a real data structure in the
script:

perl -wle'
my $t = [ [qw/0 a b/], [qw/1 c d/], [qw/2 e f/], [qw/3 g h/],
[qw/1 i j/], [qw/2 k l/], [qw/4 m n/], [qw/2 o p/],
[qw/4 q r/], [qw/5 s t/] ];

push @{ $a[ $_->[0] ] }, [ @$_[ 1, 2 ] ] for @$t; # AoAoA

printf "((%s))\n", join ") (",
map join( " ",
map join( " ", @$_ ), @$_
), @a;
'

Etc.

--
Ruud


Jrgen Exner 09-26-2010 01:53 PM

Re: toy list processing problem: collect similar terms
 
Alexander Burger <abu@software-lab.de> wrote:
>In PicoLisp:


What the f**** does PicoLisp have to with Perl?

jue

Jrgen Exner 09-26-2010 01:54 PM

Re: toy list processing problem: collect similar terms
 
livibetter <livibetter@gmail.com> wrote:
>Here is mine for Python:


What the f*** does Python have to do with Perl?

jue


All times are GMT. The time now is 10:43 AM.

Powered by vBulletin®. Copyright ©2000 - 2014, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.