Velocity Reviews


Bryan Krone 11-16-2004 11:57 AM

perl efficiency -- fastest grepping?
 
I have a stream of data coming off a serial port at 19200. I am wondering
what is the most efficient way to grep through the data in realtime? I have
20 or so different strings I need to find, all of which are ~15 characters
or less. Currently I'm using code that looks like this:

forever loop
{
    sysread the serial buffer into $newdata

    if( defined $newdata )
    {
        $inString =~ s/^.*(.{32})$/$1/o;
        $inString .= $newdata;
    }

    if( $inString =~ /.*ResetPF.*/o || $inString =~ /.*[gG][oO].*/o ||
        $inString =~ /.*reset.*/o || $inString =~ /.*sysinit.*/o )
    {
        set some flag;
    }
}
Is there a more efficient way to grep for the strings to set some flag? This
works pretty well but this is only 4 strings. I would like to add a lot
more but the program slows down after 10 or more strings. Any ideas would
be greatly appreciated.

Thanks

Peter Wyzl 11-16-2004 12:27 PM

Re: perl efficiency -- fastest grepping?
 
"Bryan Krone" <bryankrone@comcast.net> wrote in message
news:Y92dnaeOkvA3dwTcRVn-2w@comcast.com...
> I have a stream of data coming off a serial port at 19200. I am wondering
> what is the most efficient way to grep through the data in realtime? I
> have 20 or so different strings I need to find. All of which are ~15
> characters or less. Currently I'm using code that looks like this:


No doubt others more qualified than I will comment as well, but a couple of
things...

> forever loop
> {
> sysread the serial buffer into $newdata
>
> if( defined $newdata )
> {
> $inString =~ s/^.*(.{32})$/$1/o;


Why are you using the 'o' switch to the regex? You have no variable being
interpolated.

> $inString .= $newdata;



Anyway, I believe you will find substr to be significantly faster for this
operation, simply discarding everything except the last 32 characters in a
string.

$inString = substr( $inString, -32) . $newdata;

Read about that in perlfunc
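
A minimal sketch of the substr approach, assuming the 32-character window from the original post. The helper name append_chunk and the sample chunks are made up for illustration; the real data would come from sysread on the serial port.

```perl
use strict;
use warnings;

# Keep at most $keep trailing characters of the old buffer, then append
# the newly read chunk. append_chunk is a hypothetical helper name.
sub append_chunk {
    my ($buffer, $chunk, $keep) = @_;
    # With a negative offset that reaches back past the start of the
    # string, substr returns the whole string, so short buffers survive.
    return substr($buffer, -$keep) . $chunk;
}

my $inString = '';
# Sample chunks standing in for successive sysread results.
for my $chunk ('abcdefghij' x 5, 'ResetPF', ' tail') {
    $inString = append_chunk($inString, $chunk, 32);
}
print length($inString), "\n";
```

Unlike the s/^.*(.{32})$/$1/ version, this does not depend on "." matching every character, so embedded newlines in the serial data should not prevent trimming.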


> }
>
>
>
> if( $inString =~ /.*ResetPF.*/o || $inString =~ /.*[gG][oO].*/o ||
>     $inString =~ /.*reset.*/o || $inString =~ /.*sysinit.*/o )


Ooo!! Your regexen will be VERY inefficient because of the .* causing huge
amounts of backtracking (especially at both ends). Since you are only
looking to match the string, you can discard both sets of .* for a BIG
performance boost (particularly across multiple regexen). Again, you have
the unnecessary 'o' switches, and that second regex can be written using the
'i' switch (case insensitive).

Yielding:

if( $inString =~ /ResetPF/ || $inString =~ /go/i || $inString =~ /reset/ ||
$inString =~ /sysinit/ ){

I think you need to read up a bit more on regexes, particularly switches and
how the regex engine works.
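
One way to check this on your own machine is sketched below. It first confirms that the padded and bare patterns agree on whether the buffer matches, then times them with the core Benchmark module. The sample buffer is invented, and the timings will depend on the Perl version and the real data, so treat it as a starting point rather than a result.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample buffer; real serial data will differ.
my $inString = 'x' x 30 . 'sysinit' . 'y' x 10;

my @words = qw(ResetPF reset sysinit);

# Count how many words give the same yes/no answer with and without
# the leading and trailing .*
my $agree = 0;
for my $word (@words) {
    my $padded = $inString =~ /.*\Q$word\E.*/ ? 1 : 0;
    my $bare   = $inString =~ /\Q$word\E/     ? 1 : 0;
    $agree++ if $padded == $bare;
}
print "patterns agree on $agree of ", scalar(@words), " words\n";

# Run each variant for at least one CPU second and print a comparison.
cmpthese(-1, {
    padded => sub { my $hit = $inString =~ /.*sysinit.*/ },
    bare   => sub { my $hit = $inString =~ /sysinit/ },
});
```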

HTH
--
Wyzelli
#!/usr/bin/perl -w
use strict;
eval reverse ';"n\rekcaH lreP rehtona tsuJ" tnirp';



James Willmore 11-16-2004 01:33 PM

Re: perl efficiency -- fastest grepping?
 
On Tue, 16 Nov 2004 05:57:25 -0600, Bryan Krone wrote:

<snip>
> if( $inString =~ /.*ResetPF.*/o || $inString =~ /.*[gG][oO].*/o ||
>     $inString =~ /.*reset.*/o || $inString =~ /.*sysinit.*/o ) {
>     set some flag;
> }
> }
> Is there a more efficient way to grep for the strings to set some flag?
> This works pretty well but this is only 4 strings. I would like to add a
> lot more but the program slows down after 10 or more strings. Any ideas
> would be greatly appreciated.


First, if you can do without the regular expressions, do so. You can use
either 'unpack' or 'split' and place the results into an array. Then you
can use 'grep' to find what you need.
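
A rough sketch of the split-and-grep idea, assuming the commands arrive as whitespace-separated tokens (the original post does not say how the data is framed). The sub name find_tokens and the sample input are invented for illustration.

```perl
use strict;
use warnings;

# Strings of interest, stored as hash keys for constant-time lookup.
my %wanted = map { $_ => 1 } qw(ResetPF reset sysinit);

# Assumption: tokens are separated by whitespace (spaces, CR, LF).
sub find_tokens {
    my ($data) = @_;
    my @tokens = split ' ', $data;
    # Exact, case-sensitive comparison of each whole token.
    return grep { exists $wanted{$_} } @tokens;
}

my @hits = find_tokens("boot ok\r\nsysinit done reset\r\n");
print "matched: @hits\n";
```

This only finds a string when it appears as a whole token, so it is not a drop-in replacement for a substring search.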

Second, I'm going to throw this out here and see what happens.

If you can't get away from using regular expressions ... and because there
are *specific* matches to be performed ... and with each match there might
be a specific flag to be set (or action to be performed based upon the
match), I'd (maybe) use a lookup table. This method may or may not be any
better than the way you're doing it now. I haven't benchmarked it and ...
my benchmarks would be useless against what you're trying to do.

For example:

#!/usr/bin/perl

use strict;
use warnings;

my $inString = 'reset the switch now please';

# Note: hash keys are always strings, so each qr// object is stringified
# here and may be recompiled when used as a pattern in the loop below.
my %lookup = (
    qr{ResetPF} => \&do_resetpf,
    qr{go}i     => \&do_go,
    qr{reset}   => \&do_reset,
    qr{sysinit} => \&do_sysinit,
);

# Try every pattern and call the handler for each one that matches.
while( my($key,$value) = each %lookup ) {
    if( $inString =~ $key) {
        $value->();
    }
}

sub do_resetpf {
    print "ResetPF matched\n";
}

sub do_go {
    print "GgOo matched\n";
}

sub do_reset {
    print "reset matched\n";
}

sub do_sysinit {
    print "sysinit matched\n";
}

HTH

Jim


Tad McClellan 11-16-2004 02:37 PM

Re: perl efficiency -- fastest grepping?
 
Peter Wyzl <wyzelli@yahoo.com> wrote:


> I think you need to read up a bit more on regexes, particularly switches and
> how the regex engine works.



See also:

"How Regexes Work"

http://perl.plover.com/Regex/


--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas

Matija Papec 11-16-2004 05:41 PM

Re: perl efficiency -- fastest grepping?
 
X-Ftn-To: Bryan Krone

Bryan Krone <bryankrone@comcast.net> wrote:
>if( $inString =~ /.*ResetPF.*/o || $inString =~ /.*[gG][oO].*/o || $inString
>=~ /.*reset.*/o || $inString =~ /.*sysinit.*/o )
>{
>    set some flag;
>}
>}
>Is there a more efficient way to grep for the strings to set some flag? This


If you're checking against plain strings (ResetPF, reset..) you can speed
things up with index (see perldoc -f index),

if (1+index($inString, "ResetPF") or ..) {}
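
Spelled out for a list of strings, the index approach might look like the sketch below. The names first_match and has_go are hypothetical, and the needle list is only the three plain strings from the original post.

```perl
use strict;
use warnings;

my @needles = qw(ResetPF reset sysinit);

# Return the first needle found in $haystack, or undef if none is present.
sub first_match {
    my ($haystack) = @_;
    for my $needle (@needles) {
        # index returns -1 when the substring is absent.
        return $needle if index($haystack, $needle) >= 0;
    }
    return;
}

# Case-insensitive check for "go": lowercase the buffer first.
sub has_go {
    my ($haystack) = @_;
    return index(lc $haystack, 'go') >= 0;
}

my $found = first_match('xx sysinit yy');
print defined $found ? "found $found\n" : "no match\n";
```

Comparing against -1 (or >= 0) says the same thing as the 1+index(...) trick, just more explicitly.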



--
Matija

Uri Guttman 11-16-2004 06:39 PM

Re: perl efficiency -- fastest grepping?
 
>>>>> "DD" == Darren Dunham <ddunham@redwood.taos.com> writes:

DD> Peter Wyzl <wyzelli@yahoo.com> wrote:

>> if( $inString =~ /ResetPF/ || $inString =~ /go/i || $inString =~
>> /reset/ || $inString =~ /sysinit/ ){


DD> And while those may be replacable with index, if you can't do so in the
DD> general case, moving all the matches into a single regex can be
DD> significantly faster...

DD> if( $inString =~ /ResetPF|(?i:go)|reset|sysinit/ )

and alternation of lots of strings in a regex can be very slow as well.
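
For reference, a sketch of building that single alternation from a list instead of typing it by hand. The word list is just the strings from the original post and the sample buffers are invented. As far as I know, Perl 5.10 and later optimize alternations of literal strings internally, so how slow this is depends on the Perl version; it is worth benchmarking against real data.

```perl
use strict;
use warnings;

my @words = qw(ResetPF reset sysinit);

# quotemeta escapes any regex metacharacters in the literal strings.
my $alternation = join '|', map { quotemeta($_) } @words;

# Compile once, outside the read loop. (?i:go) keeps the
# case-insensitive matching local to that one branch.
my $any_command = qr/$alternation|(?i:go)/;

my @flagged;
for my $buffer ('...sysinit...', 'ready to Go', 'idle') {
    if ($buffer =~ $any_command) {
        push @flagged, $buffer;
        print "flag set for '$buffer'\n";
    }
}
```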

the OP didn't give a proper spec for the problem IMO. if the string in
question has a token in a known place, the fastest way to check for it is
to grab it with a simple regex and then look it up in a hash. so the
data read from the serial line needs to be properly specified with some
way to define where this match string is located. then extraction should
be easy and a hash can be made of the desired strings.
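
a sketch of that extract-then-look-up idea is below. the framing is pure assumption (each line starts with the command word), since the real serial format was never given, and the sub name classify_line and the action labels are made up.

```perl
use strict;
use warnings;

# Map each known token to whatever should happen for it.
# The labels are placeholders for real flags or handlers.
my %action = (
    ResetPF => 'power-fail reset',
    reset   => 'reset',
    sysinit => 'system init',
);

# Assumption: the token of interest is the first word on the line.
sub classify_line {
    my ($line) = @_;
    # Grab the leading word; give up if the line has none.
    my ($token) = $line =~ /^(\w+)/ or return;
    # One hash lookup, regardless of how many tokens are known.
    return $action{$token};
}

my $result = classify_line("sysinit 0x10\r\n");
print defined $result ? "action: $result\n" : "no action\n";
```

the cost of the lookup does not grow with the number of strings, which is the point of using a hash here.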

uri

