Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > fastest count of instances in string?

Reply
Thread Tools

fastest count of instances in string?

 
 
Uri Guttman
Guest
Posts: n/a
 
      10-03-2003
>>>>> "A" == AlV <> writes:

A> Uri Guttman wrote:
>>>>>>> "A" == AlV <> writes:

A> $n=tr/$char/$char/; }'
>> that is broken code. try it out and see what happens.


A> You're, of course, right (from perldoc -f perlop):

A> But that would give the very slow:

A> # time perl -e 'for my $i (1 .. 1000000) { $char="a"; $_="cacababaA";
A> $n= eval "tr/$char/$char/"; } print $n, "\n"; '
A> perl -e 31.11s user 0.06s system 99% cpu 31.259 total

remove the eval from the loop. create a sub with that eval tr// in it
and call that in the loop. then you should make all the benchmark
entries call subs to equalize them. tr will be much faster than the
others.

A> So, s//g would be a much faster way to count the occurences of
A> $char... or am I still stupid? }

not stupid, just not familiar with how to isolate the thing you really
want to benchmark. and there are many benchmark techniques and also ways
to speed up code like tr// with intrpolation (caching code refs
generated by eval, etc.) so you don't call eval each time. eval is the
killer in that loop.

uri

--
Uri Guttman ------ -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 
Reply With Quote
 
 
 
 
AlV
Guest
Posts: n/a
 
      10-03-2003
Uri Guttman wrote:
>>>>>>"A" == AlV <> writes:

>
> A> But that would give the very slow:
>
> A> # time perl -e 'for my $i (1 .. 1000000) { $char="a"; $_="cacababaA";
> A> $n= eval "tr/$char/$char/"; } print $n, "\n"; '
> A> perl -e 31.11s user 0.06s system 99% cpu 31.259 total
>
> remove the eval from the loop. create a sub with that eval tr// in it
> and call that in the loop. then you should make all the benchmark
> entries call subs to equalize them. tr will be much faster than the
> others.


Huh... O

I am certainly dense, but I can't figure how putting the tr// in a sub
and calling that sub would mean anything other than a longer execution
time...
(I tried, but certainly not the way you expected ;o)

In the first question of this thread, it was asked whether a faster
method existed for counting character occurences in a string (other than
splitting the string and looking through the array).

For me, timing the speed of a given method is timing everything that is
involved, and for the sake of reading a meaningful value, looping around
the piece of code that is involved. For example, if it takes, 30ms to
complete and my expected accuracy is about a second, I need to perform
100 to 1000 loops before I can expect a "reasonably accurate" value for
the execution time.

Down to this particular problem, if using tr/// means using an eval
around it in order to get interpolation, so be it...

Suppose the content of the string or the char (that we are trying to
count) come to change, eval should be called each time: IMHO, this
overhead should be part of the timing (some sort of worst case).

But feel free to contradict me )

> A> So, s//g would be a much faster way to count the occurences of
> A> $char... or am I still stupid? }
>
> not stupid, just not familiar with how to isolate the thing you really
> want to benchmark. and there are many benchmark techniques and also ways
> to speed up code like tr// with intrpolation (caching code refs
> generated by eval, etc.) so you don't call eval each time.


I fail to see how we could cache "code refs generated by eval" (whatever
that could be, I am no Perl Guru ;o) if the string (in which we are
looking for the character) and the character itself change...

> eval is the killer in that loop.


That, I had guessed... )

 
Reply With Quote
 
 
 
 
Uri Guttman
Guest
Posts: n/a
 
      10-03-2003
>>>>> "A" == AlV <> writes:

A> I fail to see how we could cache "code refs generated by eval"
A> (whatever that could be, I am no Perl Guru ;o) if the string (in which
A> we are looking for the character) and the character itself change...

simple:

<untested>

my %tr_subs ;

sub tr_chars {

my ( $tr_chars ) = @_

$tr_subs{ $tr_chars } ||= eval "sub { tr/$tr_chars// }" ;

return $tr_subs{ $tr_chars }->() ;
}

this could also be done using the memoize module i would guess but it is
so short i didn't bother. also it does the tr on $_ and it could easily
be changed to work with an argument or defaulting to $_.

uri

--
Uri Guttman ------ -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 
Reply With Quote
 
Roy Johnson
Guest
Posts: n/a
 
      10-03-2003
AlV <> wrote in message news:<blk2v6$ub8$>...
> # time perl -e 'for my $i (1 .. 1000000) { $char="a"; $_="cacababaA";
> $n=tr/$char/$char/; }'


You'll get a smidge faster if you don't bother to replace anything.
tr/// doesn't delete unless you specify the /d modifier, so you can
just do
$n=tr/$char//;
and it is non-destructive.
 
Reply With Quote
 
Uri Guttman
Guest
Posts: n/a
 
      10-03-2003
>>>>> "RJ" == Roy Johnson <> writes:

RJ> AlV <> wrote in message news:<blk2v6$ub8$>...
>> # time perl -e 'for my $i (1 .. 1000000) { $char="a"; $_="cacababaA";
>> $n=tr/$char/$char/; }'


RJ> You'll get a smidge faster if you don't bother to replace anything.
RJ> tr/// doesn't delete unless you specify the /d modifier, so you can
RJ> just do
RJ> $n=tr/$char//;
RJ> and it is non-destructive.

first, you have the same problem that tr/// doesn't intrpolate.

secondly, replacing each char by itself is never destuctive. and your
notion that not using the replacement string is different then you need
to rtfm more carefully:

If the REPLACEMENTLIST is empty, the SEARCHLIST is replicated.
This latter is useful for counting characters in a class or for
squashing character sequences in a class.

so not passing in the replacement string is just syntactic sugar and
nothing special semantically.

uri

--
Uri Guttman ------ -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 
Reply With Quote
 
Roy Johnson
Guest
Posts: n/a
 
      10-03-2003
> Within this newsgroup, you are expected to provide valid
> Perl code


You did not include any code in your post. Or is it not required in
posts where you chew someone out?

The code I posted IS valid Perl code.

> and you are expected to test your code


It wasn't "my" code, it was code I modified ever so slightly, and I
did test it, and compared to the other code, it was a smidge faster.

Some code for you. Please try it out.

delete $butt->{'stick'};
 
Reply With Quote
 
Roy Johnson
Guest
Posts: n/a
 
      10-04-2003
Purl Gurl <> wrote in message news:<>...

> That is not valid Perl code unless is it your
> intent to set $n equal to a bare word which
> returns a value of zero. There exists an
> additional reason why your code is invalid.


Validity of code is not about what one intends it to do. It is whether
it compiles and runs. My code did.

> You did not test your code. You are lying.


No, you are stupid. I did test my code. The test was how it did
against the originally offered code, not what output it generated.

> Ignorant troll


You should make that your signature.

> You certainly will not survive me.


Yeah, keep stroking yourself.
 
Reply With Quote
 
Roy Johnson
Guest
Posts: n/a
 
      10-04-2003
Uri Guttman <> wrote in message news:<>...

> first, you have the same problem that tr/// doesn't intrpolate.


That's actually not my problem. Perhaps I wasn't clear enough in
communicating that I was only interested in the relative merits of
specifying a replacement list vs. not.

> secondly, replacing each char by itself is never destuctive.


I didn't say that it was. Again, I may have been unclear: the reason I
mentioned destructiveness is that having a blank replacement set may
*appear* to be destructive. That is the only reason I can think of for
specifying that a character set be replaced with itself.

> so not passing in the replacement string is just syntactic sugar and
> nothing special semantically.


In much the same way that
"$stupid"
is the same as
$stupid
, my point is that
tr/stupid/stupid/
is the same as
tr/stupid//

And for whatever reason, I got the barest improvement in speed when I
benchmarked. The second version is less prone to error (say,
transposing characters in one set) and is easier to grok at a glance:
"oh, nothing is being replaced."
 
Reply With Quote
 
John W. Krahn
Guest
Posts: n/a
 
      10-04-2003
Roy Johnson wrote:
>
> In much the same way that
> "$stupid"
> is the same as
> $stupid


It is?

$ perl -le'
$stupid = \55;
print "<", $stupid + 0, "> <", "$stupid" + 0, ">";
$stupid = [ 1,2,3 ];
print "<", @{$stupid}, "> <", @{"$stupid"}, ">";
'
<135267140> <0>
<123> <>



John
--
use Perl;
program
fulfillment
 
Reply With Quote
 
AlV
Guest
Posts: n/a
 
      10-04-2003
John W. Krahn wrote:
> Roy Johnson wrote:
>
>>In much the same way that
>> "$stupid"
>>is the same as
>> $stupid

>
>
> It is?
>
> $ perl -le'
> $stupid = \55;
> print "<", $stupid + 0, "> <", "$stupid" + 0, ">";
> $stupid = [ 1,2,3 ];
> print "<", @{$stupid}, "> <", @{"$stupid"}, ">";
> '
> <135267140> <0>
> <123> <>


Eh, eh, eh, nice shots! D

But, I am not sure that bill (the person who posted the first question
of this thread) is using Perl anymore...

He most certainly is hard at work learning Java or Python or
WhateverLanguageThatIsNotPerl while mumbling about "a bunch of irascible
loonies" ;o)

This last remark is not for you, as you certainly understood, John )

Have a nice day,

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
dicts,instances,containers, slotted instances, et cetera. ocschwar@gmail.com Python 8 01-29-2009 09:52 AM
Count running instances and user privileges on windows 2k3 Cedric Fontaine C++ 1 12-11-2007 10:51 PM
Fastest 5 mp Digital Camera ? Fastest 4 mp Digital Camera? photoguysept102004@yahoo.com Digital Photography 6 10-28-2004 11:33 AM
list of class instances within a list of a class instances John Wohlbier Python 2 02-22-2004 08:41 AM
Newbie here... getting a count of repeated instances in a list. Amy G Python 9 11-24-2003 10:56 PM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57