Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > fastest count of instances in string?

Reply
Thread Tools

fastest count of instances in string?

 
 
Roy Johnson
Guest
Posts: n/a
 
      10-06-2003
"John W. Krahn" <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> If the value in $stupid is already a string why would you need to quote
> the variable?


Exactly. And if tr/stupid// doesn't change any values, why would you
need to specify the values not to change?
 
Reply With Quote
 
 
 
 
Roy Johnson
Guest
Posts: n/a
 
      10-06-2003
Uri Guttman <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> then there is another
> bug. if $tr_chars has more than 1 char then s/// not delete the
> individual chars. he needs a char class for that.


That's not a bug, it's a spec. The question was about counting how
many times an individual character appeared in a string.

I think there was a goof in the mtr sub, though: I left "a" instead of
"$tr_chars".
 
Reply With Quote
 
 
 
 
Iain Truskett
Guest
Posts: n/a
 
      10-06-2003
* Roy Johnson <(E-Mail Removed)>:
> * Uri Guttman <(E-Mail Removed)> wrote in message
> > then there is another bug. if $tr_chars has more than 1 char then
> > s/// not delete the individual chars. he needs a char class for
> > that.


> That's not a bug, it's a spec. The question was about counting how
> many times an individual character appeared in a string.


So why is it called $tr_chars instead of the singular $tr_char? Or even
just $char?


cheers,
--
Iain.
 
Reply With Quote
 
John W. Krahn
Guest
Posts: n/a
 
      10-06-2003
Roy Johnson wrote:
>
> "John W. Krahn" <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> > If the value in $stupid is already a string why would you need to quote
> > the variable?

>
> Exactly. And if tr/stupid// doesn't change any values, why would you
> need to specify the values not to change?


Ahh, but it does change values. It just changes them to the same thing
that they were before.


John
--
use Perl;
program
fulfillment
 
Reply With Quote
 
Roy Johnson
Guest
Posts: n/a
 
      10-06-2003
"John W. Krahn" <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> Ahh, but it does change values. It just changes them to the same thing
> that they were before.


You have an unusual notion of the definition of "change".
Nevertheless, the behavior is exactly the same, regardless of whether
you specify the replacement list, so why specify it?
 
Reply With Quote
 
Anno Siegel
Guest
Posts: n/a
 
      10-07-2003
Eric J. Roode <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> bill <(E-Mail Removed)> wrote in news:blk164$cd5$(E-Mail Removed):
>
> >
> >
> > What's the fastest way to count how many times a specific character
> > appears in a string? Anything faster than something like
> >
> > for (split //, $string) { ++$count if $_ eq $char }

>
> Probably the fastest way would be to write a C function to do the counting,
> then link to it via XS (or Inline::C).


I don't think a C function can be much faster than a pre-compiled tr///.
In fact, I was courious enough to do a quick benchmark. (Except they're
never really quick.) The Inline code *was* faster -- by 4 %.

Anno

 
Reply With Quote
 
Roy Johnson
Guest
Posts: n/a
 
      10-07-2003
Better benchmark code produces more satisfying benchmarks. Note the
addition of a method using "index", which is not too shabby.

I noticed that adding \Q to the patterns (so that I matched strings
rather than re's) slowed them down, so I cached the pattern matches.
They had been running about 4x as slow as cached tr///, but reduced
the difference to less than a factor of 2.

Results:
scc : uses s///g: 8.5 sec
mcc : uses m//g with scalar map to count results: 10.8 sec
mcc2: uses m//g with an explicit loop and counter: 11.9 sec
trcc: cached subs of eval tr///: 6.4 sec
tacc: same, but using arrays instead of hashes to cache: 6.0 sec
incc: uses index and explicit counter: 7.9 sec

Notes:
The spec is to count occurrences of single characters. Most of these
methods generalize:
tacc will only do individual characters
trcc will do character classes
the others will do substrings (though the patterns could be
rewritten to match classes instead)
Using index is surprisingly fast, and requires no caching
scc returns empty string instead of zero

Code:
#!perl
use strict;
use warnings;

my $str = 'abccdeageabbcab';
my @chars = qw(a b c d e f g);
my $count;
my $sub = \&mcc;
my %cache = ();
my @cache = ();
for (1..100000) {
$count = &$sub($_, $str) for (@chars);
}
for (@chars) {
$count = &$sub($_, $str);
print "There are $count ${_}s in $str\n";
}

sub mcc {
my ($c, $s) = @_;
$cache{$c} ||= qr/\Q$c/;
scalar map(/$cache{$c}/g, $s);
}

sub mcc2 {
my ($c, $s) = @_;
my $count = 0;
$cache{$c} ||= qr/\Q$c/;
++$count while $s=~/$cache{$c}/g;
$count;
}

sub trcc {
my ($c, $s) = @_;
$cache{$c} ||= eval "sub { \$_[0] =~ tr/$c// }";
$cache{$c}->($s);
}

sub tacc {
my ($c, $s) = @_;
$cache[ord $c] ||= eval "sub { \$_[0] =~ tr/$c// }";
$cache[ord $c]->($s);
}

sub scc {
my ($c, $s) = @_;
$cache{$c} ||= qr/\Q$c/;
scalar $s=~s/$cache{$c}//g;
}

sub incc {
my ($c, $s) = @_;
my $count = 0;
my $pos = index($s, $c);
while ($pos >= 0) {
++$count;
$pos = index($s, $c, $pos+1);
}
$count;
}
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
dicts,instances,containers, slotted instances, et cetera. ocschwar@gmail.com Python 8 01-29-2009 09:52 AM
Count running instances and user privileges on windows 2k3 Cedric Fontaine C++ 1 12-11-2007 10:51 PM
Fastest 5 mp Digital Camera ? Fastest 4 mp Digital Camera? photoguysept102004@yahoo.com Digital Photography 6 10-28-2004 11:33 AM
list of class instances within a list of a class instances John Wohlbier Python 2 02-22-2004 08:41 AM
Newbie here... getting a count of repeated instances in a list. Amy G Python 9 11-24-2003 10:56 PM



Advertisments