Help with regexp. Can you do better

 
 
papaDoc
 
      09-27-2005
Hi,

I'm trying to parse the output of CVS loginfo to update the script
activitymail and I have a problem.

I'm able to parse the line but I don't like the way I'm doing it.
Can you help me get rid of the $delim variable which I need in my
current algo ?

I want to get a list like this
@dest[0] = "excavator_resources.h 1.129 1.130.23"
@dest[0] = "gfxGround.cpp 1.12 1.13"
etc

#!C:/DevTools/mks/mksnt/perl.exe
#

$src = "excavator_resources.h 1.129 1.130.23 gfxGround.cpp 1.12 1.13
mgrDemo.cpp 1.72 1.73 objExcavator.cpp 1.42 1.43 pedModule_Digging.cpp
1.25 1.26 pedModule_DumpTrench.cpp 1.18 1.19 pedModule_DumpTruck.cpp
1.27 1.28 pedModule_TargetSearch.cpp 1.17 1.18 pedTerrainDef.cpp 1.6
1.7 pedTrenchDef.cpp 1.18 1.19 pedTrial_DumpTruck.cpp 1.28 1.29
playback.cpp 1.1 1.2 .configrc 1.2 1.3 ~configrc 1.2 1.3";

$delim =
"This-Is-A-Simlog-Delimiter-And-No-Filenames-Should-Be-Called-This";

$src =~ s/([\d\.]+\s[\d\.]+)\s(\S)/\1$delim\2/g;
@dest = split( /$delim/ , $src);

print "\n";
foreach $d (@dest)
{
print "($d)\n";
}
print "\n";


Remi


 
 
 
 
 
Paul Lalli
 
      09-27-2005
papaDoc wrote:
> Hi,
>
> I'm trying to parse the output of CVS loginfo to update the script
> activitymail and I have a problem.
>
> I'm able to parse the line but I don't like the way I'm doing it.
> Can you help me get rid of the $delim variable which I need in my
> current algo ?
>
> I want to get a list like this
> @dest[0] = "excavator_resources.h 1.129 1.130.23"
> @dest[0] = "gfxGround.cpp 1.12 1.13"


I have no idea what a "list like that" would be. I assume you meant
$dest[0] for the first line, and $dest[1] for the second.

perldoc -q "difference" (scroll down a bit)

> $src = "excavator_resources.h 1.129 1.130.23 gfxGround.cpp 1.12 1.13
> mgrDemo.cpp 1.72 1.73 objExcavator.cpp 1.42 1.43 pedModule_Digging.cpp
> 1.25 1.26 pedModule_DumpTrench.cpp 1.18 1.19 pedModule_DumpTruck.cpp
> 1.27 1.28 pedModule_TargetSearch.cpp 1.17 1.18 pedTerrainDef.cpp 1.6
> 1.7 pedTrenchDef.cpp 1.18 1.19 pedTrial_DumpTruck.cpp 1.28 1.29
> playback.cpp 1.1 1.2 .configrc 1.2 1.3 ~configrc 1.2 1.3";
>
> $delim =
> "This-Is-A-Simlog-Delimiter-And-No-Filenames-Should-Be-Called-This";
>
> $src =~ s/([\d\.]+\s[\d\.]+)\s(\S)/\1$delim\2/g;


use warnings; would have told you to use $1 and $2 rather than \1 and
\2 there.
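
A minimal sketch of what that warning is asking for, assuming a short hypothetical delimiter `<SEP>` and a two-file sample string (neither comes from the original post). Inside the pattern a backreference is written `\1`; on the replacement side the captured text is read from `$1` and `$2`:

```perl
use strict;
use warnings;

my $src   = "a.h 1.1 1.2 b.c 2.1 2.2";   # made-up sample input
my $delim = "<SEP>";                     # hypothetical delimiter, illustration only

# Copy $src, then substitute on the copy using $1/$2 in the replacement.
(my $marked = $src) =~ s/([\d.]+\s[\d.]+)\s(\S)/$1$delim$2/g;

print "$marked\n";   # a.h 1.1 1.2<SEP>b.c 2.1 2.2
```
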

> @dest = split( /$delim/ , $src);


I have no idea why you're jumping through these hoops. Are you aware
that a pattern match in list context returns a list of the matches?

my @dest = $src =~ /(\S+\s[\d\.]+\s[\d\.]+)/g;
Match all instances of: one or more non-whitespace characters, a single
whitespace, one or more (decimal point or digit), a whitespace, and another
one or more (decimal point or digit).
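
As a rough, self-contained illustration of the list-context match described above (the input is a shortened sample, and single spaces between fields are an assumption):

```perl
use strict;
use warnings;

# Shortened sample input; fields are assumed to be separated by single spaces.
my $src = "excavator_resources.h 1.129 1.130.23 gfxGround.cpp 1.12 1.13 .configrc 1.2 1.3";

# In list context, m//g returns every substring captured by the parentheses,
# so no delimiter or split is needed.
my @dest = $src =~ /(\S+\s[\d.]+\s[\d.]+)/g;

print "($_)\n" for @dest;
```

Each element of @dest then holds one "name old-revision new-revision" record.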

Paul Lalli

 
 
 
 
 
Dave Weaver
 
      09-27-2005
On 27 Sep 2005 05:24:48 -0700, papaDoc wrote:
> Hi,
>
> I'm trying to parse the output of CVS loginfo to update the script
> activitymail and I have a problem.
>
> I'm able to parse the line but I don't like the way I'm doing it.
> Can you help me get rid of the $delim variable which I need in my
> current algo ?
>
> I want to get a list like this
> @dest[0] = "excavator_resources.h 1.129 1.130.23"
> @dest[0] = "gfxGround.cpp 1.12 1.13"
> etc
>


How about:

#!/usr/bin/perl
use warnings;
use strict;

my $src = "excavator_resources.h 1.129 1.130.23 gfxGround.cpp 1.12 1.13
mgrDemo.cpp 1.72 1.73 objExcavator.cpp 1.42 1.43 pedModule_Digging.cpp
1.25 1.26 pedModule_DumpTrench.cpp 1.18 1.19 pedModule_DumpTruck.cpp
1.27 1.28 pedModule_TargetSearch.cpp 1.17 1.18 pedTerrainDef.cpp 1.6
1.7 pedTrenchDef.cpp 1.18 1.19 pedTrial_DumpTruck.cpp 1.28 1.29
playback.cpp 1.1 1.2 .configrc 1.2 1.3 ~configrc 1.2 1.3";

my @dest = $src =~ /(.*?\s[\d\.]+\s[\d\.]+)\s?/g;

use Data::Dumper;
print Dumper \@dest;

 
 
ced@carios2.ca.boeing.com
 
      09-27-2005

papaDoc wrote:
> Hi,
>
> I'm trying to parse the output of CVS loginfo to update the script
> activitymail and I have a problem.
>
> I'm able to parse the line but I don't like the way I'm doing it.
> Can you help me get rid of the $delim variable which I need in my
> current algo ?
>
> I want to get a list like this
> @dest[0] = "excavator_resources.h 1.129 1.130.23"
> @dest[0] = "gfxGround.cpp 1.12 1.13"
> etc


Nitpicky, but @dest[0] is better written as $dest[0] (until Perl 6).

>
> #!C:/DevTools/mks/mksnt/perl.exe
> #
>
> $src = "excavator_resources.h 1.129 1.130.23 gfxGround.cpp 1.12 1.13
> mgrDemo.cpp 1.72 1.73 objExcavator.cpp 1.42 1.43 pedModule_Digging.cpp
> 1.25 1.26 pedModule_DumpTrench.cpp 1.18 1.19 pedModule_DumpTruck.cpp
> 1.27 1.28 pedModule_TargetSearch.cpp 1.17 1.18 pedTerrainDef.cpp 1.6
> 1.7 pedTrenchDef.cpp 1.18 1.19 pedTrial_DumpTruck.cpp 1.28 1.29
> playback.cpp 1.1 1.2 .configrc 1.2 1.3 ~configrc 1.2 1.3";
>
> $delim =
> "This-Is-A-Simlog-Delimiter-And-No-Filenames-Should-Be-Called-This";
>
> $src =~ s/([\d\.]+\s[\d\.]+)\s(\S)/\1$delim\2/g;
> @dest = split( /$delim/ , $src);
>
> print "\n";
> foreach $d (@dest)
> {
> print "($d)\n";
> }
> print "\n";
>
>


Here're a couple:

@dest = $src =~ /(\S+\s+[\d.?]+\s+[\d.?]+\s*)/g;

the [\d.] doesn't force order so this might be slightly preferable
although not totally right:

@dest = $src =~ /(\S+   # group starting with non-whitespace
\s+                     # followed by whitespace
(?:\d\.?){1,}           # non-capturing: digit, optional period (1 or more times)
\s+                     # followed by whitespace
(?:\d\.?){1,}           # non-capturing: digit, optional period (1 or more times)
\s*                     # whitespace (0 or more, since none at end)
)                       # end grouping
/xg;

Output:

excavator_resources.h 1.129 1.130.23
gfxGround.cpp 1.12 1.13
mgrDemo.cpp 1.72 1.73
objExcavator.cpp 1.42 1.43
pedModule_Digging.cpp 1.25 1.26
pedModule_DumpTrench.cpp 1.18 1.19
pedModule_DumpTruck.cpp 1.27 1.28
pedModule_TargetSearch.cpp 1.17 1.18
pedTerrainDef.cpp 1.6 1.7
pedTrenchDef.cpp 1.18 1.19
pedTrial_DumpTruck.cpp 1.28 1.29
playback.cpp 1.1 1.2
.configrc 1.2 1.3
~configrc 1.2 1.3

hth,
--
Charles DeRykus

 
 
Paul Lalli
 
      09-27-2005
ced@carios2.ca.boeing.com wrote:
> @dest = $src =~ /(\S+\s+[\d.?]+\s+[\d.?]+\s*)/g;

                          ^^^^^^    ^^^^^^

This doesn't mean what you think it means. ? is not special in a
character class. Each of those is searching for one or more digits,
periods, or question marks.
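
A quick way to check this, sketched with a made-up test string "1.2?" (not from the thread): inside [...] the ? is just another literal character, so the class accepts it.

```perl
use strict;
use warnings;

# Inside a character class, ? is a literal character, not a quantifier.
my $class_ok = ("1.2?" =~ /^[\d.?]+$/) ? 1 : 0;   # class includes ?, so this matches
my $plain_ok = ("1.2?" =~ /^[\d.]+$/)  ? 1 : 0;   # class without ?, so this fails

print "with ? in class: $class_ok, without: $plain_ok\n";
```
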

Paul Lalli

 
 
ced@carios2.ca.boeing.com
Guest
Posts: n/a
 
      09-27-2005

Paul Lalli wrote:
> ced@carios2.ca.boeing.com wrote:
> > @dest = $src =~ /(\S+\s+[\d.?]+\s+[\d.?]+\s*)/g;

>                           ^^^^^^    ^^^^^^
>
> This doesn't mean what you think it means. ? is not special in a
> character class. Each of those is searching for one or more digits,
> periods, or question marks.


Right, I must've been thinking ahead to the class-less alternative
I suggested.

--
Charles DeRykus

 
 
William James
 
      09-28-2005

Paul Lalli wrote:

> my @dest = $src =~ /(\S+\s[\d\.]+\s[\d\.]+)/g;
> Match all instances of: one or more non-whitespace characters, a single
> whitespace, one or more (decimal point or digit), a whitespace, and another
> one or more (decimal point or digit).


It's not necessary to escape . in a character class:

my @dest = $src =~ /(\S+\s[\d.]+\s[\d.]+)/g;

 
 
Anno Siegel
 
      09-28-2005
papaDoc wrote in comp.lang.perl.misc:
> Hi,
>
> I'm trying to parse the output of CVS loginfo to update the script
> activitymail and I have a problem.
>
> I'm able to parse the line but I don't like the way I'm doing it.
> Can you help me get rid of the $delim variable which I need in my
> current algo ?
>
> I want to get a list like this
> @dest[0] = "excavator_resources.h 1.129 1.130.23"
> @dest[0] = "gfxGround.cpp 1.12 1.13"
> etc
>
> #!C:/DevTools/mks/mksnt/perl.exe
> #
>
> $src = "excavator_resources.h 1.129 1.130.23 gfxGround.cpp 1.12 1.13
> mgrDemo.cpp 1.72 1.73 objExcavator.cpp 1.42 1.43 pedModule_Digging.cpp
> 1.25 1.26 pedModule_DumpTrench.cpp 1.18 1.19 pedModule_DumpTruck.cpp
> 1.27 1.28 pedModule_TargetSearch.cpp 1.17 1.18 pedTerrainDef.cpp 1.6
> 1.7 pedTrenchDef.cpp 1.18 1.19 pedTrial_DumpTruck.cpp 1.28 1.29
> playback.cpp 1.1 1.2 .configrc 1.2 1.3 ~configrc 1.2 1.3";
>
> $delim =
> "This-Is-A-Simlog-Delimiter-And-No-Filenames-Should-Be-Called-This";
>
> $src =~ s/([\d\.]+\s[\d\.]+)\s(\S)/\1$delim\2/g;
> @dest = split( /$delim/ , $src);
>
> print "\n";
> foreach $d (@dest)
> {
> print "($d)\n";
> }
> print "\n";


Split on blanks that are followed by a non-digit:

my @dest = split / (?=\D)/, $src;
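
A small runnable sketch of the same split, under two assumptions that the one-liner relies on: fields are separated by single spaces, and no file name begins with a digit. The input is a shortened sample.

```perl
use strict;
use warnings;

# Shortened sample; assumes single-space separators and that no file name
# starts with a digit (such a name would not trigger a split).
my $src = "gfxGround.cpp 1.12 1.13 playback.cpp 1.1 1.2 ~configrc 1.2 1.3";

# The lookahead (?=\D) tests the next character without consuming it,
# so only the space itself is removed and each file name stays intact.
my @dest = split / (?=\D)/, $src;

print "($_)\n" for @dest;
```

If the real input is wrapped over several lines, the literal space in the pattern will not match the line breaks; the pattern would probably need \s+ in place of the space, which is untested here.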

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.
 