Velocity Reviews > Perl > Circular lists

# Circular lists

xhoster@gmail.com
Guest
Posts: n/a

 01-11-2009
Jürgen Exner <(E-Mail Removed)> wrote:
> gamo <(E-Mail Removed)> wrote:
> >
> >If a,b,c is a list
> >b,c,a is the same list rotated b,c,a,b,c,a (note the a,b,c)
> >c,a,b is the same too, because c,a,b,c,a,b (note the a,b,c)
> >
> >a,c,b is another list because a,c,b,a,c,b is new
> >
> >I expect this clarifies the problem.

>
> Are you confirming Red's understanding of the problem or are you
> correcting his understanding? To me both versions, his and yours, seem
> to be the same.
>
> If they are, then generating all your "circular lists" is trivial, as is
> computing their number.
>
> Obviously there are exactly as many circular lists as there are elements
> in the original list, because each element can become the first element
> in a result list.

ababab only has two linear representations,not 6.

I think your claim is true if and only if the counts of the occurrences of
distinct letters in the input @set have a lcf of 1.

Xho

--
The costs of publication of this article were defrayed in part by the
this fact.

gamo
Guest
Posts: n/a

 01-11-2009
On Sun, 11 Jan 2009, http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:

> > I don't want to inspect every permutation because the number of
> > permutations is n! =3D n*(n-1)*(n-2)...*1 and a problem of 20! is
> > intractable.

>
> There only that many permutations if your "set" has only unique letters,
> which in your example it does not. Anyway, may gut feeling is that to get

And how can I exploit that property reducing iterations?

> an accurate count based on stochastic sampling, your will need to be about
> as large as the underlying domain, anyway. But I could be wrong.
>

But I can have a lower bound for the number and the lists I can't have
with exaustive enumeration.

>
> > Suppose that I have the huge list of permutations in memory.
> > Wich are the circular rotations of another?=20

>
> canonicalize! Or solve the problem analytically. If your set is such that
> no permutation can have a self-rotation, then every circular list will
> have exactly N linear representations.
>

Only the number of circular permutations could be calculated. I want to
know what are that lists.

Best regards,

PS: If you look at my corrected code, it works. I tought that it was
inefficient but I can't see where. A posible patch could be to use
only \$p[] and don't compare with the past, because \$p[] contains the
(number of circular perm * lenght of the original list) possibilities.

>
> Xho
>
> --
> The costs of publication of this article were defrayed in part by the
> payment of page charges. This article must therefore be hereby marked
> advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
> this fact.
>

--
http://www.telecable.es/personales/gamo/
"Was it a car or a cat I saw?"
perl -E 'say 111_111_111**2;'

xhoster@gmail.com
Guest
Posts: n/a

 01-11-2009
gamo <(E-Mail Removed)> wrote:
> On Sun, 11 Jan 2009, (E-Mail Removed) wrote:
>
> > > I don't want to inspect every permutation because the number of
> > > permutations is n! =3D n*(n-1)*(n-2)...*1 and a problem of 20! is
> > > intractable.

> >
> > There only that many permutations if your "set" has only unique
> > letters, which in your example it does not. Anyway, may gut feeling is
> > that to get

>
> And how can I exploit that property reducing iterations?

There may already be a module for that on CPAN, but if so finding it
seems as hard as making one. So I whipped up something easy to implement
(but not to use)

To make a good module out of it, recursion should be removed and
reimplemented with explicit state and as an iterator rather than a
callback.

With your example, it returns 5880, as it just generates, filters, and
counts the permutations, and doesn't look at circularity.

#efficiently make all distinct permutations

use strict;
use warnings;

sub dpermute {
my (\$set, \$prefix, \$code)=@_;
my \$count;
foreach (keys %\$set) {
next unless \$set->{\$_};
\$count++;
my %set = %\$set;
\$set{\$_}--;
my @prefix = (@\$prefix, \$_);
dpermute(\%set, \@prefix, \$code);
};
\$code->(@\$prefix) unless \$count;
};

my %set;
\$set{\$_}++ foreach qw/a a a a a r r g g n/;
my \$x;
dpermute(\%set,[],sub {my \$y=join "", @_; \$y.=\$y; \$x++ unless \$y=~/gg/});
warn \$x;

> > an accurate count based on stochastic sampling, your will need to be
> > about as large as the underlying domain, anyway. But I could be wrong.
> >

>
> But I can have a lower bound for the number and the lists I can't have
> with exaustive enumeration.

You can use a deterministic enumeration and just not let it run to
completion. I think it might start out less efficient than random, but
at some point will become more efficient as the random starts reproducing
old results. Or do it both and then take the max of the two.

>
> PS: If you look at my corrected code, it works.

Oh, I didn't see the correction (well, I saw it, but thought it was a
botched posting as Ｉdidn't notice the changes just a copy of the
original). I'll take a look later and see if I have any suggestions.

Xho

--
The costs of publication of this article were defrayed in part by the
this fact.

gamo
Guest
Posts: n/a

 01-11-2009
On Sun, 11 Jan 2009, (E-Mail Removed) wrote:

> gamo <(E-Mail Removed)> wrote:
> >
> > Thanks to all the responders, but I think now that the problem is
> > unsolvable.

>
> For some input sizes and structures, clearly it is intractable. But the
> same is true for nearly all problems.
>

Certainly. Thanks for remember me that.

> > Since you need to compare a new candidate with those of
> > the past, the problem is more or less O(n^2).

>
> Where n is what, the size of @set, or the factorial of that size?
>

More related with size, but there are counter-examples like
(a a a a a a) or (a a a a a b). I will try it with the case
(a a b b b c c c c d d d d d e e e e e e) for try to learn a
relation.

Best regards,

> Xho
>
> --
> The costs of publication of this article were defrayed in part by the
> payment of page charges. This article must therefore be hereby marked
> advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
> this fact.
>

--
http://www.telecable.es/personales/gamo/
"Was it a car or a cat I saw?"
perl -E 'say 111_111_111**2;'

Jürgen Exner
Guest
Posts: n/a

 01-12-2009
(E-Mail Removed) wrote:
>Jürgen Exner <(E-Mail Removed)> wrote:
>> gamo <(E-Mail Removed)> wrote:
>> >
>> >If a,b,c is a list
>> >b,c,a is the same list rotated b,c,a,b,c,a (note the a,b,c)
>> >c,a,b is the same too, because c,a,b,c,a,b (note the a,b,c)
>> >
>> >a,c,b is another list because a,c,b,a,c,b is new
>> >
>> >I expect this clarifies the problem.

>>
>> Are you confirming Red's understanding of the problem or are you
>> correcting his understanding? To me both versions, his and yours, seem
>> to be the same.
>>
>> If they are, then generating all your "circular lists" is trivial, as is
>> computing their number.
>>
>> Obviously there are exactly as many circular lists as there are elements
>> in the original list, because each element can become the first element
>> in a result list.

>
>
>ababab only has two linear representations,not 6.

Good catch, you are right.

>I think your claim is true if and only if the counts of the occurrences of
>distinct letters in the input @set have a lcf of 1.

What's an lcf? http://en.wikipedia.org/wiki/LCF has numerous
suggestions, but none that seems to fit in this context.

jue

Jürgen Exner
Guest
Posts: n/a

 01-12-2009
(E-Mail Removed) wrote:
>gamo <(E-Mail Removed)> wrote:
>> > >
>> > > I have a set of @set =3D qw(a a b b b c c c c);
>> > >
>> > > In a loop...
>> > >
>> > > @list =3D shuffle(@set);
>> >=20
>> > Is this the shuffle from List::Util? If so, why use a random method
>> > as part of determining a non-random result? If you want to inspect
>> > every permutation, don't do it randomly.

>>
>> I don't want to inspect every permutation because the number of
>> permutations is n! =3D n*(n-1)*(n-2)...*1 and a problem of 20! is
>> intractable.

>
>There only that many permutations if your "set" has only unique letters,
>which in your example it does not.

Well, I wonder what George is actually talking about. He is saying set,
but as you pointed out he is using a multi-set (or maybe a list?), and
then later he is talking about lists.

>Anyway, may gut feeling is that to get
>an accurate count based on stochastic sampling, your will need to be about
>as large as the underlying domain, anyway. But I could be wrong.

Unless George manages to explain his problem precisely in terms that
everyone can understand and doesn't keep changing his story I have
little hope that we can even agree about the problem to be solved, let
alone agree on a solution.

A formal spec would be nice, but I guess that is too much to ask.

jue

xhoster@gmail.com
Guest
Posts: n/a

 01-12-2009
> On Fri, 9 Jan 2009, gamo wrote:
>

> for \$j (1..\$precounter){
> if (\$s eq \$p[\$j]){
> \$ok=0;
> last;
> }
> };

This would be better written as a hash look up. To do so, the easiest way
is move it above the hash population, and do away with @p altogether,
giving:

for (1..10_000_000){
my @set = shuffle(@a);
my \$s = join '',@set;
my \$two = \$s . \$s;
next if (\$two =~ /gg/);
unless (exists \$hash{\$s}){
\$exito++;
print "\$0 \$exito\n";
}
for my \$i (0..9){
my \$r = substr \$two,\$i,10;
\$hash{\$r}++;
}
}

This should be a lot faster. (In addition to other things mentioned
previously, the hard-coding of 9 and 10 in the above is not a good idea)

If RAM becomes a premium, you could put only one canonical sequence
in the hash, rather than every rotation:

for (1..10_000_000){
my @set = shuffle(@a);
my \$s = join '',@set;
my \$two = \$s . \$s;
next if (\$two =~ /gg/);
my (\$canon) = sort map {substr \$two,\$_,length \$s} 0..length(\$s) -1;
unless (exists \$hash{\$canon}){
\$hash{\$canon}++;
\$exito++;
print "\$0 \$exito\n";
}
}
print "El número de permutaciones circulares es \$exito\n";

Xho

--
The costs of publication of this article were defrayed in part by the
this fact.

xhoster@gmail.com
Guest
Posts: n/a

 01-12-2009
Jürgen Exner <(E-Mail Removed)> wrote:
> (E-Mail Removed) wrote:
> >I think your claim is true if and only if the counts of the occurrences
> >of distinct letters in the input @set have a lcf of 1.

>
> What's an lcf? http://en.wikipedia.org/wiki/LCF has numerous
> suggestions, but none that seems to fit in this context.

Oops. I meant gcf, greatest common factor. (lcf, least common factor, is
pretty trivial, which is probably why there is no entry for it.)

Xho

--
The costs of publication of this article were defrayed in part by the
this fact.

xhoster@gmail.com
Guest
Posts: n/a

 01-12-2009
Previously, I wrote about getting only distinct permutations:

> There may already be a module for that on CPAN, but if so finding it
> seems as hard as making one. So I whipped up something easy to implement
> (but not to use)
>
> To make a good module out of it, recursion should be removed and
> reimplemented with explicit state and as an iterator rather than a
> callback.

I've done the above mentioned conversion:

my \$iter=Distinct:ermute->new(qw/a a a a a r r g g n/);
while (my @x=\$iter->next) {
print "@x\n";
};

package Distinct:ermute;
sub new {
"Distinct:ermute" eq shift @_ or die "Not subclassable";
my %h;
foreach (@_) {
\$h{\$_}++;
};
my @alpha=keys %h;
my @counts=values %h;
my @first;
push @first, ((\$_) x \$counts[\$_]) foreach 0..\$#alpha;
return bless {alpha => \@alpha, counts=> \@counts, now => \@first};
};

sub next {
my \$self=shift;
return if exists \$self->{done};
my @return = map \$self->{alpha}[\$_], @{\$self->{now}};

## climb back off the length until we have backed off far enough that
## the next iteration doesn't need to change anything to the left of us.
my \$climb=\$#{\$self->{now}};
my @left = (0)x @{\$self->{alpha}};
my \$max_left = -1; # highest digit with any left to choose
while (\$climb >=0) {
last if \$self->{now}[\$climb] < \$max_left;
\$max_left = \$self->{now}[\$climb] if \$self->{now}[\$climb] > \$max_left;
\$left[ \$self->{now}[\$climb] ] ++;
\$climb--;
};
if (\$climb<0) {
\$self->{done}=1;
return @return;
};

# roll over the digit that needs it:
\$left[ \$self->{now}[\$climb] ]++;
do {\$self->{now}[\$climb]++} until \$left[ \$self->{now}[\$climb] ];
\$left[ \$self->{now}[\$climb] ]--;

# fill out rest
foreach my \$digit (0..\$#left) {
foreach (1..\$left[\$digit]) {
\$self->{now}[++\$climb]=\$digit;
};
};

return @return

};

--
The costs of publication of this article were defrayed in part by the
this fact.

gamo
Guest
Posts: n/a

 01-12-2009
On Mon, 12 Jan 2009, (E-Mail Removed) wrote:

> > On Fri, 9 Jan 2009, gamo wrote:
> >

>
> > for \$j (1..\$precounter){
> > if (\$s eq \$p[\$j]){
> > \$ok=0;
> > last;
> > }
> > };

>
> This would be better written as a hash look up. To do so, the easiest way
> is move it above the hash population, and do away with @p altogether,
> giving:
>
> for (1..10_000_000){
> my @set = shuffle(@a);
> my \$s = join '',@set;
> my \$two = \$s . \$s;
> next if (\$two =~ /gg/);
> unless (exists \$hash{\$s}){
> \$exito++;
> print "\$0 \$exito\n";
> }
> for my \$i (0..9){
> my \$r = substr \$two,\$i,10;
> \$hash{\$r}++;
> }
> }
>
> This should be a lot faster. (In addition to other things mentioned
> previously, the hard-coding of 9 and 10 in the above is not a good idea)
>
> If RAM becomes a premium, you could put only one canonical sequence
> in the hash, rather than every rotation:
>
> for (1..10_000_000){
> my @set = shuffle(@a);
> my \$s = join '',@set;
> my \$two = \$s . \$s;
> next if (\$two =~ /gg/);
> my (\$canon) = sort map {substr \$two,\$_,length \$s} 0..length(\$s) -1;
> unless (exists \$hash{\$canon}){
> \$hash{\$canon}++;
> \$exito++;
> print "\$0 \$exito\n";
> }
> }
> print "El número de permutaciones circulares es \$exito\n";
>

I don't agree with your simplifications. It needs two hashes:
one for the objective, the circular unique lists, and another
for all the rotations. Each candidate must be checked against
both.

Anyway the problem with lenght 20,
(a a b b b c c c c d d d d d e e e e e e)
is intractable in terms of memory to store the results (using
a 16GB RAM computer). I halt the program with 15,4 GB consumed.
The counter of circular lists marks 5,3 millons and running up
at devils speed.

I run your fast routine dpermute to figure out the dimension
of the problem, but the program consume many time in the calculation.
Still waiting for coment it.

I am hopeless about tryng a real problem of this kind.

Thanks for your efforts, best regards

>
> Xho
>
> --
> The costs of publication of this article were defrayed in part by the
> payment of page charges. This article must therefore be hereby marked
> advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
> this fact.
>

--
http://www.telecable.es/personales/gamo/
"Was it a car or a cat I saw?"
perl -E 'say 111_111_111**2;'