Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > search and replace

Reply
Thread Tools

search and replace

 
 
yeti349@yahoo.com
Guest
Posts: n/a
 
      09-02-2005
Hi. The following snippet seems to be doing what I want, but I'd like
to get suggestions for refinement or improving the code. Thank you.

#!/usr/bin/perl -w

use strict;

my $phrase = "MARKEM 1ST HOLDINGS INC. & MONEY MARKET LLC";

my @terms = terms();

foreach my $term (@terms)
{
$phrase =~ s/\W/ /ig; #remove nonword characters
$phrase =~ s/\S*\d\S*/ /ig; #remove numeric strings
$phrase =~ s/\b($term)\b//ig; #remove business terms
}

#new phrase: "MARKEM MONEY MARKET"
print "$phrase\n";

sub terms
{
my @terms = qw(holdings inc llc);
}

 
Reply With Quote
 
 
 
 
Paul Lalli
Guest
Posts: n/a
 
      09-02-2005
wrote:
> Hi. The following snippet seems to be doing what I want, but I'd like
> to get suggestions for refinement or improving the code. Thank you.
>
> #!/usr/bin/perl -w


You enabled Warnings. Very good. However, these days, it is
preferable to write
use warnings;
rather than the -w option on the shebang.

> use strict;


Excellent!

> my $phrase = "MARKEM 1ST HOLDINGS INC. & MONEY MARKET LLC";


My personal preference is to not use double quotes unless you actually
need interpolation.

>
> my @terms = terms();


I get nervous when using the same identifier for multiple pieces of
information. There's nothing *wrong* with this, of course, and Perl
will readily do exactly what you want. It just seems needlessly
confusing to the reader. Consider something like `my @terms =
get_terms();` instead.

>
> foreach my $term (@terms)
> {
> $phrase =~ s/\W/ /ig; #remove nonword characters


Your comment doesn't match what the code actually does. You're not
removing non-word characters. You're replacing each non-word character
with a single space character.

There is no reason for the /i modifier in that regexp.

You're specifying the /g modifer, which means this one single statement
removes *all* the non-word characters. So why is this line in a loop?
This code does not in any way depend on $term, and it should be moved
outside the loop so that it's only executed once.

> $phrase =~ s/\S*\d\S*/ /ig; #remove numeric strings


Pretty much the same critiques for this one. The comment doesn't match
what's happening, the /i modifier is pointless, and it should only be
executed once.

> $phrase =~ s/\b($term)\b//ig; #remove business terms


Okay. *This* one should absolutely remain in the loop, and should use
the /i modifier, and it is in fact actually removing the term.

My only critique here is that there's no reason to use (), as you're
not using $1 anywhere.

> }
>
> #new phrase: "MARKEM MONEY MARKET"
> print "$phrase\n";


Again, your comment doesn't match. That is not the new phrase. The
new phrase is: 'MARKEM MONEY MARKET '

> sub terms
> {
> my @terms = qw(holdings inc llc);
> }


If this subroutine has no point other than returning a list of values,
why are you bothering to create a new variable at all?


Putting all my critiques together, we're left with:
#!/usr/bin/perl
use strict;
use warnings;

my $phrase = 'MARKEM 1ST HOLDINGS INC. & MONEY MARKET LLC';

my @terms = get_terms();

$phrase =~ s/\W/ /ig; #replace non-word chars with a space
$phrase =~ s/\S*\d\S*/ /ig; #replace strings containing a digit
#with a space

for my $term (@terms) {
$phrase =~ s/\b$term\b//ig; #remove business terms
}

#new phrase: 'MARKEM MONEY MARKET '
print "$phrase\n";

sub get_terms {
qw/holdings inc llc/;
}

__END__

 
Reply With Quote
 
 
 
 
yeti349@yahoo.com
Guest
Posts: n/a
 
      09-02-2005
Excellent repsonse Paul! Thank you for your help.

 
Reply With Quote
 
Anno Siegel
Guest
Posts: n/a
 
      09-02-2005
Paul Lalli <> wrote in comp.lang.perl.misc:
> wrote:
> > Hi. The following snippet seems to be doing what I want, but I'd like
> > to get suggestions for refinement or improving the code. Thank you.


[most of excellent code revision snipped]

> > #!/usr/bin/perl -w
> > use strict;
> >
> > my $phrase = "MARKEM 1ST HOLDINGS INC. & MONEY MARKET LLC";
> >
> > my @terms = terms();

>
> I get nervous when using the same identifier for multiple pieces of
> information. There's nothing *wrong* with this, of course, and Perl
> will readily do exactly what you want. It just seems needlessly
> confusing to the reader. Consider something like `my @terms =
> get_terms();` instead.


I don't get nervous when the like-named things represent essentially
the same information. As this is the case here, I might have used
"terms" in both senses without compunctions.

However, in this case the issue can be dodged. The variable @terms is only
used once, so it can go (okay, there are exceptions). Just call terms()
directly in its place.

> > foreach my $term (@terms)


foreach my $term ( terms() ) {

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Losing $1 and $2 variables in my search and replace expression laredotornado@zipmail.com Perl Misc 1 06-21-2012 04:43 PM
| SEO , Search Engine Optimizer, SEARCH OPtiMIzAtIoN with SeaRch OPtiMizer optimizer.seo@gmail.com Digital Photography 0 04-22-2007 04:20 AM
quick and dirty registry search and replace? VB? Dave - Dave.net.nz NZ Computing 14 11-08-2004 07:44 PM
search within a search within a search - looking for better way...my script times out Abby Lee ASP General 5 08-02-2004 04:01 PM
Search and replace with NIO and Regex? Mark McKay Java 3 01-21-2004 05:29 PM



Advertisments