laziest / fastest way to match last characters of a string

 
 
hofer
 
      09-11-2008
Hi,
Let's look at the following example:

$text = "Today is a nice day";
$end = "day";

print "text ends with $end" if $text =~ /$end$/;

Would the regular expression be efficient for long strings?

The alternative is a little more awkward to type

print "text ends with $end" if substr($text,-length($end)) eq $end; # I didn't try this line, but it should work I think

Is there any core module containing something like
print "text ends with $end" if endswith($text,$end);


thanks and bye


H
 
 
J. Gleixner
 
      09-11-2008
hofer wrote:
> Hi,
> Let's look at the following example:
>
> $text = "Today is a nice day";
> $end = "day";
>
> print "text ends with $end" if $text =~ /$end$/;
>
> Would the regular expression be efficient for long strings?


Why not benchmark some different alternatives to see? Your 'long
strings' might not be all that long.
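A minimal sketch of such a benchmark, using the core Benchmark module. The test string, the candidate names and the one-second run time are assumptions chosen for illustration; substitute data that looks like your real input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: a long string that ends with the suffix.
my $end  = "day";
my $text = ("Today is a nice day. " x 10_000) . "Today is a nice day";

# Each candidate returns 1 if $text ends with $end, otherwise 0.
my %candidates = (
    regex  => sub { $text =~ /$end$/ ? 1 : 0 },
    substr => sub { substr($text, -length($end)) eq $end ? 1 : 0 },
    index  => sub {
        my $start = length($text) - length($end);
        ($start >= 0 && index($text, $end, $start) == $start) ? 1 : 0;
    },
);

# Sanity check: every candidate must give the right answer before timing.
for my $name (sort keys %candidates) {
    die "$name gave the wrong answer" unless $candidates{$name}->();
}

# Run each candidate for about one CPU second and print a comparison table.
cmpthese(-1, \%candidates);
```

The numbers depend on the perl version, the machine and the data, so treat the table as a rough guide only.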

>
> The alternative is a little more awkward to type
>
> print "text ends with $end" if substr($text,-length($end)) eq $end; # I didn't try this line, but it should work I think
>
> Is there any core module containing something like
> print "text ends with $end" if endswith($text,$end);


I don't know if it'll be faster, but using length and index would be one
alternative; another would be substr.

perldoc -f index
perldoc -f length
perldoc -f substr
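A rough sketch of both alternatives. The helper names ends_with_index and ends_with_substr are made up for this example; they are not built-ins or part of any core module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text ends with the literal string $end.
sub ends_with_index {
    my ($text, $end) = @_;
    my $start = length($text) - length($end);  # where the suffix would have to begin
    return 0 if $start < 0;                     # suffix is longer than the text
    return index($text, $end, $start) == $start ? 1 : 0;
}

# The same test with substr; the length checks cover the empty and
# too-long suffix cases explicitly.
sub ends_with_substr {
    my ($text, $end) = @_;
    my $len = length $end;
    return 1 if $len == 0;
    return 0 if $len > length $text;
    return substr($text, -$len) eq $end ? 1 : 0;
}

print "text ends with day\n" if ends_with_index("Today is a nice day", "day");
print "text ends with day\n" if ends_with_substr("Today is a nice day", "day");
```

Both compare literally, so characters such as dots or backslashes in the suffix have no special meaning.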
 
 
Ben Morrow
 
      09-11-2008

Quoth hofer:
>
> $text = "Today is a nice day";
> $end = "day";
>
> print "text ends with $end" if $text =~ /$end$/;
>
> Would the regular expression be efficient for long strings?


~% perl -Mre=debug -e'$end="day"; "Today is a nice day" =~ /$end$/'
Freeing REx: `","'
Compiling REx `day$'
size 4 Got 36 bytes for offset annotations.
first at 1
1: EXACT <day>(3)
3: EOL(4)
4: END(0)
anchored "day"$ at 0 (checking anchored isall) minlen 3
Offsets: [4]
1[3] 0[0] 4[1] 5[0]
Guessing start of match, REx "day$" against "Today is a nice day"...
Found anchored substr "day"$ at offset 16...
Starting position does not contradict /^/m...
Guessed: match at offset 16
Freeing REx: `"day$"'

The first thing it tries is a direct match against the last three
characters, which is as fast as it gets.

Ben

--
Outside of a dog, a book is a man's best friend.
Inside of a dog, it's too dark to read.
Groucho Marx
 
 
Jürgen Exner
 
      09-11-2008
hofer wrote:
>$text = "Today is a nice day";
>$end = "day";
>print "text ends with $end" if $text =~ /$end$/;
>
>Would the regular expression be efficient for long strings?
>
>The alternative is a little more awkward to type
>
>print "text ends with $end" if substr($text,-length($end)) eq $end; # I didn't try this line, but it should work I think


These two versions do very different things. If you need REs, then the
second version won't do you any good.
If you want textual comparison without RE-behaviour then the first
version is wrong unless you have a very limited set of possible data.

Use the one that matches your needs. Usually correct is more important
than fast.
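A small illustration of the difference, using a made-up suffix that contains a regex metacharacter. The sample strings and the %seen hash are only there for the demonstration.

```perl
use strict;
use warnings;

my $end = "a.c";    # the dot is a regex metacharacter: it matches any character

my %seen;
for my $text ("xyz a.c", "xyz abc") {
    my $as_regex   = ($text =~ /$end$/)                      ? 1 : 0;
    my $as_literal = (substr($text, -length($end)) eq $end) ? 1 : 0;
    $seen{$text} = [$as_regex, $as_literal];
    print "'$text' regex=$as_regex literal=$as_literal\n";
}
```

For "xyz abc" the RE version reports a match and the substr version does not, because the dot in the interpolated pattern matches the "b".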

jue
 
 
hofer
 
      09-11-2008
On Sep 11, 8:51 pm, Jürgen Exner wrote:

> >print "text ends with $end" if $text =~ /$end$/;

>
> >print "text ends with $end" if substr($text,-length($end)) eq $end; # I

>
> These two versions do very different things. If you need REs, then the
> second version won't do you any good.
> If you want textual comparison without RE-behaviour then the first
> version is wrong unless you have a very limited set of possible data.
>
> Use the one that matches your needs. Usually correct is more important
> than fast.
>

Hi Juergen,

In fact I don't need REs, and the ending strings won't contain
backslashes, dots or other characters that could be taken as RE syntax.

So in my special case the two are interchangeable.

For me the RE is visually more intuitive than the substr with the
-length(), and the substr version needs the string to be searched
entered twice if it is a constant and not a variable.

I just wondered if perl has a built-in string_ends_with() function or
whether REs would be much slower.

As Ben pointed out, the first thing the RE search does is check
at the end of the string, so I guess I'll stick with REs.


bye


N

 
 
Ben Morrow
 
      09-12-2008

Quoth hofer:
> On Sep 11, 8:51 pm, Jürgen Exner wrote:
>
> > >print "text ends with $end" if $text =~ /$end$/;

> >
> > >print "text ends with $end" if substr($text,-length($end)) eq $end; # I

> >
> > These two versions do very different things. If you need REs, then the
> > second version won't do you any good.
> > If you want textual comparison without RE-behaviour then the first
> > version is wrong unless you have a very limited set of possible data.
> >
> > Use the one that matches your needs. Usually correct is more important
> > than fast.
> >

> Hi Juergen,
>
> In fact I don't need REs, and the ending strings won't contain
> backslashes, dots or other characters that could be taken as RE syntax.
>
> So in my special case the two are interchangeable.


Be aware that /$/ has rather odd semantics: it will match before a
newline at the end of the string, in a somewhat misguided attempt to
handle reading from a filehandle without chomping. If this is an issue
(if your string might contain newlines, and you *don't* want to match
them like this), use /\z/ instead.
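A short demonstration of the difference, assuming a line that still carries its trailing newline. The variable names are made up for this example.

```perl
use strict;
use warnings;

my $end  = "day";
my $line = "Today is a nice day\n";   # e.g. a line read from a file, not chomped

my $with_dollar = ($line =~ /$end$/)  ? 1 : 0;  # $ also matches just before a final newline
my $with_z      = ($line =~ /$end\z/) ? 1 : 0;  # \z matches only at the very end

print "with \$: $with_dollar, with \\z: $with_z\n";
```

The /$/ version matches here and the /\z/ version does not, since the string really ends with a newline rather than with "day".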

Also, it's always worth interpolating a variable that's meant to be
taken literally like this:

/\Q$end\E$/

just in case.
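Putting the two suggestions together, a sketch of a small wrapper. The name ends_with is hypothetical; it is not a Perl built-in and does not come from any core module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text ends with the literal string $end.
# \Q...\E quotes any regex metacharacters in $end, and \z anchors the
# match at the very end of the string.
sub ends_with {
    my ($text, $end) = @_;
    return $text =~ /\Q$end\E\z/ ? 1 : 0;
}

print "text ends with day\n" if ends_with("Today is a nice day", "day");
print "a.c is taken literally\n" unless ends_with("xyz abc", "a.c");
```

If you do want the match-before-newline behaviour, swap \z for $.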

> For me the RE is visually more intuitive than the substr with the
> -length(), and the substr version needs the string to be searched
> entered twice if it is a constant and not a variable.


The second is a nonissue. Allowing you to type things only once is what
variables are *for*.

> I just wondered if perl has a built-in string_ends_with() function or
> whether REs would be much slower.


Well, yes; it's called a regex.

Ben

--
All persons, living or dead, are entirely coincidental.
Kurt Vonnegut
 