toy list processing problem: collect similar terms

 
 
John Bokma
 
      09-28-2010
Xah Lee <(E-Mail Removed)> writes:

> can you stop this?


Can you stop crossposting? And if there is really, really a need to
crosspost, can you please set the follow-up to?

> doesn't seems fruitful to keep on this.
>
> if you don't like my posts, ignore them? i don't post in
> comp.lang.python or comp.lang.perl.misc that often... maybe have done
> so 5 times this year.


Which is enough to disrupt those groups for days.

> i visited your home page
> http://johnbokma.com/mexit/2010/08/15/
> and there are really a lot beautiful photos.


Thanks Xah. Like I wrote, your site /does/ have good information, it's
so sad that you somehow think it's necessary to spam Usenet to get
visitors. Or maybe you've another reason, don't know. But it /is/ Usenet
abuse.

> this isn't bribery or something like that. I've been annoyed by you,
> of course, but it's not fruitful to keep going on this.


Well, you annoy me, I annoy you. It's in your hands to make it stop.

My advice is:

1) remove all the excessive swearing from your site. If you have a
point, you don't need it. Your argument(s) without the swearing
should speak for themselves

2) Stop abusing Usenet. Instead focus on writing more good stuff on
your site.

1) & 2) will keep me from linking to your site, ever. And I am sure I am
not alone in this.

--
John Bokma j3b

Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl & Python Development: http://castleamber.com/
 
 
 
 
 
Paul Rubin
 
      09-28-2010
John Bokma <(E-Mail Removed)> writes:
> Xah Lee <(E-Mail Removed)> writes: ...
> Can you stop crossposting?


John, can you ALSO stop crossposting?
 
 
 
 
 
John Bokma
 
      09-28-2010
Paul Rubin <(E-Mail Removed)> writes:

> John Bokma <(E-Mail Removed)> writes:
>> Xah Lee <(E-Mail Removed)> writes: ...
>> Can you stop crossposting?

>
> John, can you ALSO stop crossposting?


Since the issue is on-topic in all groups: no. I did set a follow-up
header, which you ignored and on top of that redirected the thing to
comp.lang.python. So:

Paul, can you ALSO stop acting like a complete ass?

--
John Bokma j3b

Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl & Python Development: http://castleamber.com/
 
 
w_a_x_man
 
      09-29-2010
On Sep 26, 9:24 am, (E-Mail Removed) (Pascal J. Bourguignon)
wrote:
> Xah Lee <(E-Mail Removed)> writes:
> > here's a interesting toy list processing problem.

>
> > I have a list of lists, where each sublist is labelled by
> > a number. I need to collect together the contents of all sublists
> > sharing
> > the same label. So if I have the list

>
> > ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
> > r) (5 s t))

>
> > where the first element of each sublist is the label, I need to
> > produce:

>
> > output:
> > ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))

>
> > a Mathematica solution is here:
> >http://xahlee.org/UnixResource_dir/w...tions_mma.html

>
> > R5RS Scheme lisp solution:
> >http://xahlee.org/UnixResource_dir/w...e_sourav.work_...
> > by Sourav Mukherjee

>
> > also, a Common Lisp solution can be found here:
> >http://groups.google.com/group/comp....m/thread/5d1de...

>
> It's too complex. Just write:
>
> (let ((list '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
>               (2 o p) (4 q r) (5 s t))))
>
>   (mapcar (lambda (class) (reduce (function append) class :key (function rest)))
>            (com.informatimago.common-lisp.list:equivalence-classes list :key (function first)))
>
>    )
>
> --> ((S T) (Q R M N) (G H) (O P K L E F) (I J C D) (A B))
>
> --
> __Pascal Bourguignon__                     http://www.informatimago.com/


Ruby:

[[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'], [1,
'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'], [4, 'q', 'r'],
[5, 's', 't']].
group_by{|x| x.first}.values.map{|x| x.map{|y| y[1..-1]}.flatten}

==>[["s", "t"], ["a", "b"], ["c", "d", "i", "j"],
["e", "f", "k", "l", "o", "p"],
["g", "h"], ["m", "n", "q", "r"]]
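Not from the thread: for comparison, a minimal Python sketch of the same group-by idea. It assumes Python 3.7 or later, where a plain dict keeps insertion order, so the groups come out in first-seen label order, matching the output asked for in the problem statement. The name collect_similar is made up for illustration.

```python
def collect_similar(rows):
    """Group sublists by their first element, keeping first-seen label order."""
    groups = {}  # plain dicts preserve insertion order in Python 3.7+
    for label, *items in rows:
        # Create the group on first sight of a label, then append the rest.
        groups.setdefault(label, []).extend(items)
    return list(groups.values())


data = [[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'],
        [1, 'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'],
        [4, 'q', 'r'], [5, 's', 't']]

print(collect_similar(data))
# [['a', 'b'], ['c', 'd', 'i', 'j'], ['e', 'f', 'k', 'l', 'o', 'p'],
#  ['g', 'h'], ['m', 'n', 'q', 'r'], ['s', 't']]
```

Unlike a hash with unspecified iteration order, this does not need a separate sort or an explicit list of keys to get the groups back in input order.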
 
 
namekuseijin
 
      09-30-2010
On 29 set, 11:04, w_a_x_man <(E-Mail Removed)> wrote:
> On Sep 26, 9:24 am, (E-Mail Removed) (Pascal J. Bourguignon)
> wrote:
>
>
>
> > Xah Lee <(E-Mail Removed)> writes:
> > > here's a interesting toy list processing problem.

>
> > > I have a list of lists, where each sublist is labelled by
> > > a number. I need to collect together the contents of all sublists
> > > sharing
> > > the same label. So if I have the list

>
> > > ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
> > > r) (5 s t))

>
> > > where the first element of each sublist is the label, I need to
> > > produce:

>
> > > output:
> > > ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))

>
> > > a Mathematica solution is here:
> > >http://xahlee.org/UnixResource_dir/w...tions_mma.html

>
> > > R5RS Scheme lisp solution:
> > >http://xahlee.org/UnixResource_dir/w...e_sourav.work_....
> > > by Sourav Mukherjee

>
> > > also, a Common Lisp solution can be found here:
> > >http://groups.google.com/group/comp....m/thread/5d1de....

>
> > It's too complex. Just write:

>
> > (let ((list '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
> >               (2 o p) (4 q r) (5 s t))))
> >
> >   (mapcar (lambda (class) (reduce (function append) class :key (function rest)))
> >            (com.informatimago.common-lisp.list:equivalence-classes list :key (function first)))
> >
> >    )
> >
> > --> ((S T) (Q R M N) (G H) (O P K L E F) (I J C D) (A B))
> >
> > --
> > __Pascal Bourguignon__                     http://www.informatimago.com/

>
> Ruby:
>
> [[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'], [1,
> 'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'], [4, 'q', 'r'],
> [5, 's', 't']].
> group_by{|x| x.first}.values.map{|x| x.map{|y| y[1..-1]}.flatten}
>
>     ==>[["s", "t"], ["a", "b"], ["c", "d", "i", "j"],
>  ["e", "f", "k", "l", "o", "p"],
>  ["g", "h"], ["m", "n", "q", "r"]]


cool, it comes with order all ****ed up. This is something I was
criticized for before, though not all that important to most
functional processing. Not the case here, though.

here's a scheme version that is hopefully better than the given one:

(define (dig in)

(if (null? in) '()

(let* ((n (first-of-first in))

(all-n (filter in (lambda (x) (eq? n (first x)))))

(all-but-n (filter in (lambda (x) (not (eq? n (first x)))))))

(pair

(fold all-n
(lambda (i o) (pair (second i) (pair (third i) o))))

(dig all-but-n)))))


; given these aliases to lisp n00bs

(define pair cons)

(define first car)

(define rest cdr)

(define first-of-first caar)

(define second cadr)

(define third caddr)


; and these well-known functions (non-tail-recursive for benefit of n00bs)
(define (fold ls f) ; AKA reduce

(if (null? ls) '()

(f (first ls) (fold (rest ls) f))))


(define (filter ls f)

(fold ls (lambda (i o) (if (f i) (pair i o) o))))



; testing
(let ((in '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
(2 o p) (4 q r) (5 s t))))
(display (dig in))
(newline))

 
 
namekuseijin
 
      09-30-2010
On 30 set, 09:35, namekuseijin <(E-Mail Removed)> wrote:
> On 29 set, 11:04, w_a_x_man <(E-Mail Removed)> wrote:
>
> > [snip]
>
> cool, it comes with order all ****ed up. This is something I was
> criticized for before, though not all that important to most
> functional processing. Not the case here, though.
>
> here's a scheme version that is hopefully better than the given one:


(define (dig in)
  (if (null? in) '()
    (let* ((n         (first-of-first in))
           (all-n     (filter in (lambda (x)      (eq? n (first x)))))
           (all-but-n (filter in (lambda (x) (not (eq? n (first x)))))))
       (pair
          (fold all-n
             (lambda (i o) (pair (second i) (pair (third i) o))))
          (dig all-but-n)))))

; given these aliases to lisp n00bs
(define pair cons)
(define first car)
(define rest cdr)
(define first-of-first caar)
(define second cadr)
(define third caddr)

; and these well-known functions (non-tail-recursive for benefit of n00bs)
(define (fold ls f) ; AKA reduce
  (if (null? ls) '()
      (f (first ls) (fold (rest ls) f))))
(define (filter ls f)
  (fold ls (lambda (i o) (if (f i) (pair i o) o))))

; testing
(let ((in '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
            (2 o p) (4 q r) (5 s t))))
  (display (dig in))
  (newline))

;frakkin text editor...
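
Not from the thread: a rough Python sketch of the same recursive idea as the Scheme dig above, for readers who don't read Lisp. It takes the label of the first sublist, splits the input into the sublists that carry that label and those that don't, merges the first set, and recurses on the rest. The Scheme version picks out exactly a second and third element; the sketch assumes instead that everything after the label should be kept.

```python
def dig(rows):
    """Recursive sketch: peel off every sublist sharing the first label."""
    if not rows:
        return []
    label = rows[0][0]
    same = [r for r in rows if r[0] == label]      # plays the role of all-n
    others = [r for r in rows if r[0] != label]    # plays the role of all-but-n
    merged = [item for r in same for item in r[1:]]
    return [merged] + dig(others)


rows = [(0, 'a', 'b'), (1, 'c', 'd'), (2, 'e', 'f'), (3, 'g', 'h'),
        (1, 'i', 'j'), (2, 'k', 'l'), (4, 'm', 'n'), (2, 'o', 'p'),
        (4, 'q', 'r'), (5, 's', 't')]

print(dig(rows))
```

Each level of recursion rescans the remaining input, so the work grows with (number of sublists) x (number of distinct labels). That is fine for a toy list, but a single pass with a dictionary scales better.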
 
 
sln@netherlands.com
 
      10-06-2010
On Sat, 25 Sep 2010 21:05:13 -0700 (PDT), Xah Lee <(E-Mail Removed)> wrote:

>here's a interesting toy list processing problem.
>
>I have a list of lists, where each sublist is labelled by
>a number. I need to collect together the contents of all sublists
>sharing
>the same label. So if I have the list
>
>((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
>r) (5 s t))
>
>where the first element of each sublist is the label, I need to
>produce:
>
>output:
>((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
>

[snip]

>anyone care to give a solution in Python, Perl, javascript, or other
>lang? am guessing the scheme solution can be much improved... perhaps
>using some lib but that seems to show scheme is pretty weak if the lib
>is non-standard.
>


Crossposting to Lisp, Python and Perl because the weird list of lists looks
like Lisp or something else, and you mention other languages so I'm throwing
this out for Perl.

It appears this string you have there is actually list syntax in another language.
If it is, it's the job of that language to parse the data out. Why then do you
want to put it into another language form? At runtime, once the data is in variables,
as dictated by the syntax, you can do whatever data manipulation you want
(combining arrays, etc.).

So, in the spirit of a preprocessor, given that the text is balanced, with proper closure,
i.e.: ( (data) (data) ) is ok.
( data (data) ) is not ok.

the below does simple text manipulation, joining like-labeled sublists, without going into
the runtime guts of internalizing the data itself. Internally, this is too simple.

-sln
-----------------
Alternate input:
(
(
(0 a b) (1 c d) (2 e f )
)
(3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t)
)
------------------
use strict;
use warnings;

my $input = <<EOI;
((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r)
(5 s t))
EOI
my $output = $input;

# One or more adjacent "( ... )" groups; $2 keeps the trailing whitespace.
my $regxout = qr/
    ( (?: \( \s* [^()]+ \s* \) (\s*) )+ )
/x;


$output =~
s{ $regxout }
{
    my ( $list, $format ) = ( $1, $2 );
    my ( %hseen,
         @order,
         $replace
       );
    # Walk the sublists: $1 is the label, $2 is the rest of the sublist.
    while ($list =~ /\(\s* (\S+) \s* (.+?) \s*\)/xsg) {
        if ( exists $hseen{$1} ) {
            $hseen{$1} .= " $2";
            next;
        }
        push @order, $1;    # remember first-seen order of labels
        $hseen{$1} = $2;
    }
    for my $id (@order) {
        $replace .= "($hseen{$id}) ";
    }
    $replace =~ s/ $//;
    $replace . $format
}xeg;

print "Input -\n$input\n";
print "Output -\n$output";

__END__

Input -
((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r)
(5 s t))

Output -
((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
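
Not from the thread: a simplified Python sketch of the same text-level idea, using the standard re module. It assumes one flat list of "(label item ...)" groups and does not try to reproduce the handling of the nested alternate input shown above. The function name merge_text is made up for illustration.

```python
import re


def merge_text(text):
    """Merge like-labelled "(label item ...)" groups in a flat list string."""
    merged = {}  # label -> list of items, in first-seen label order
    # Each match is an innermost group: a label, whitespace, then the items.
    for label, body in re.findall(r'\(\s*([^\s()]+)\s+([^()]*?)\s*\)', text):
        merged.setdefault(label, []).extend(body.split())
    groups = ('(' + ' '.join(items) + ')' for items in merged.values())
    return '(' + ' '.join(groups) + ')'


text = """((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r)
(5 s t))"""

print(merge_text(text))
```

Because the items are split on whitespace and rejoined with single spaces, the original line breaks and spacing are not preserved.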


 
 
sln@netherlands.com
 
      10-08-2010
On Wed, 06 Oct 2010 10:52:19 -0700, (E-Mail Removed) wrote:

>On Sat, 25 Sep 2010 21:05:13 -0700 (PDT), Xah Lee <(E-Mail Removed)> wrote:
>
>>here's a interesting toy list processing problem.
>>
>>I have a list of lists, where each sublist is labelled by
>>a number. I need to collect together the contents of all sublists
>>sharing
>>the same label. So if I have the list
>>
>>((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
>>r) (5 s t))
>>
>>where the first element of each sublist is the label, I need to
>>produce:
>>
>>output:
>>((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
>>

>[snip]
>
>>anyone care to give a solution in Python, Perl, javascript, or other
>>lang? am guessing the scheme solution can be much improved... perhaps
>>using some lib but that seems to show scheme is pretty weak if the lib
>>is non-standard.
>>

>
>Crossposting to Lisp, Python and Perl because the weird list of lists looks
>like Lisp or something else, and you mention other languages so I'm throwing
>this out for Perl.
>
>It appears this string you have there is actually list syntax in another language.
>If it is, its the job of the language to parse the data out. Why then do you
>want to put it into another language form? At runtime, once the data is in variables,
>dictated by the syntax, you can do whatever data manipulation you want
>(combining arrays, etc..).
>
>So, in the spirit of a preprocessor, given that the text is balanced, with proper closure,
>ie: ( (data) (data) ) is ok.
> ( data (data) ) is not ok.
>
>the below does simple text manipulation, joining like labeled sublists, without going into
>the runtime guts of internalizing the data itself. Internally, this is too simple.
>


If not preprocessor, then ...
The too simple, order independent, id independent, Perl approach.

-sln
-----------------

use strict;
use warnings;
use Data::Dump 'dump';

my @inp = ([0,'a','b'],[1,'c','d'],[2,'e','f'],[3,'g','h'],
           [1,'i','j'],[2,'k','l'],[4,'m','n'],[2,'o','p'],
           [4,'q','r'],[5,'s','t']);

my ($cnt, @outp, %hs) = (0);

for my $ref (@inp) {
    # Test with exists, not truthiness: the first label is stored at slot 0,
    # which is false, so a repeat of that label would otherwise get a new slot.
    exists $hs{ $$ref[0] } or $hs{ $$ref[0] } = $cnt++;
    push @{$outp[ $hs{ $$ref[0] } ] }, @{$ref}[ 1 .. $#{$ref} ];
}

dump @outp;

__END__

(
["a", "b"],
["c", "d", "i", "j"],
["e", "f", "k", "l", "o", "p"],
["g", "h"],
["m", "n", "q", "r"],
["s", "t"],
)
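
Not from the thread: a minimal Python sketch of the same label-to-slot bookkeeping, for comparison. A dictionary maps each label to its index in the output list, and the output list itself carries the first-seen order. The membership test (`not in`) matters here: the first slot index is 0, which is falsy, so a truthiness check on the stored index would misfire when the first label repeats. The name collect_by_slot is made up for illustration.

```python
def collect_by_slot(rows):
    """Collect items per label using a label -> output-index map."""
    slot_of = {}  # label -> index into out
    out = []
    for label, *items in rows:
        if label not in slot_of:  # membership test, since slot 0 is falsy
            slot_of[label] = len(out)
            out.append([])
        out[slot_of[label]].extend(items)
    return out


inp = [[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'],
       [1, 'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'],
       [4, 'q', 'r'], [5, 's', 't']]

print(collect_by_slot(inp))
```

Labels can be any hashable value (strings, tuples), and they do not need to be consecutive or sorted.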

 
 
Xah Lee
 
      10-15-2010

On Sep 25, 9:05 pm, Xah Lee <(E-Mail Removed)> wrote:
> here's a interesting toy list processing problem.
>
> I have a list of lists, where each sublist is labelled by
> a number. I need to collect together the contents of all sublists
> sharing
> the same label. So if I have the list
>
> ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
> r) (5 s t))
>
> where the first element of each sublist is the label, I need to
> produce:
>
> output:
> ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
> ...


thanks all for many interesting solutions. I've been so busy in past
month on other computing issues and writing and never got around to
look at this thread. I think eventually i will, but for now just made
a link on my page to point to here.

now we have solutions in perl, python, ruby, common lisp, scheme lisp,
mathematica. I myself would also be interested in javascript; perhaps
i'll write one soon. If someone would go thru all these solutions and
make a good summary with consistent format/names of each solution...
that'd be very useful i think. (and will learn a lot, which is how i
find this interesting)

PS here's a good site that does very useful comparisons for those
learning multiple langs.

* 〈Lisp: Common Lisp, Scheme, Clojure, Emacs Lisp〉
http://hyperpolyglot.wikidot.com/lisp
* 〈Scripting Languages: PHP, Perl, Python, Ruby, Smalltalk〉
http://hyperpolyglot.wikidot.com/scripting
* 〈Scripting Languages: Bash, Tcl, Lua, JavaScript, Io〉
http://hyperpolyglot.wikidot.com/small
* 〈Platform Languages: C, C++, Objective C, Java, C#〉
http://hyperpolyglot.wikidot.com/c
* 〈ML: Standard ML, OCaml, F#, Scala, Haskell〉 http://hyperpolyglot.wikidot.com/ml

Xah ∑ http://xahlee.org/ ☄
 
 
WJ
 
      02-10-2011
Pascal J. Bourguignon wrote:

> Xah Lee <(E-Mail Removed)> writes:
>
>
> > here's a interesting toy list processing problem.
> >
> > I have a list of lists, where each sublist is labelled by
> > a number. I need to collect together the contents of all sublists
> > sharing
> > the same label. So if I have the list
> >
> > ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4
> > q r) (5 s t))
> >
> > where the first element of each sublist is the label, I need to
> > produce:
> >
> > output:
> > ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
> >
> > a Mathematica solution is here:
> > http://xahlee.org/UnixResource_dir/w...tions_mma.html
> >
> > R5RS Scheme lisp solution:
> > http://xahlee.org/UnixResource_dir/w...ee_sourav.work
> > _gmail.scm by Sourav Mukherjee
> >
> > also, a Common Lisp solution can be found here:
> > http://groups.google.com/group/comp....rm/thread/5d1d
> > ed8824bc750b?

>
> It's too complex. Just write:
>
> (let ((list '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
> (2 o p) (4 q r) (5 s t))))
>
> (mapcar (lambda (class) (reduce (function append) class :key
> (function rest)))
> (com.informatimago.common-lisp.list:equivalence-classes list :key
> (function first)))
>
> )
>
> --> ((S T) (Q R M N) (G H) (O P K L E F) (I J C D) (A B))


Clojure:

(def groups '((0 a b)(1 c d)(2 e f)(3 g h)(1 i j)(2 k l)(4 m n)
(2 o p)(4 q r) (5 s t)))

Using group-by:

(map (fn[[k v]](flatten (map rest v))) (group-by first groups))

Using reduce:

(map #(flatten(rest %)) (reduce (fn[h [k & v]]
(merge-with concat h {k v})) {} groups))
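
Not from the thread: a loose Python analogue of the reduce-based version, as a sketch only. Each step folds one sublist into an accumulator dictionary and returns a new dictionary rather than mutating the old one, which is roughly the role merge-with concat plays above. It relies on Python 3.7+ dicts keeping insertion order; merge_row and collect_reduce are made-up names.

```python
from functools import reduce


def merge_row(acc, row):
    """Fold one (label, item, ...) row into the accumulator dict."""
    label, *items = row
    # Build a new dict; an existing label keeps its original position.
    return {**acc, label: acc.get(label, []) + items}


def collect_reduce(rows):
    return list(reduce(merge_row, rows, {}).values())


groups = [(0, 'a', 'b'), (1, 'c', 'd'), (2, 'e', 'f'), (3, 'g', 'h'),
          (1, 'i', 'j'), (2, 'k', 'l'), (4, 'm', 'n'), (2, 'o', 'p'),
          (4, 'q', 'r'), (5, 's', 't')]

print(collect_reduce(groups))
```

Copying the dictionary at every step keeps the fold free of side effects at the cost of extra allocation. For large inputs, a loop that updates one dictionary in place does the same job with less work.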

 
 
 
 