Velocity Reviews - Computer Hardware Reviews


Novice - help with pattern matching needed

 
 
Robert Day
Guest
Posts: n/a
 
      02-07-2004
Hi

I am using a very basic Perl script to parse a file and extract just
the elements I need but one aspect is causing me trouble and I am sure
the answer is probably quite simple. Below are examples of two of the
lines (watch wrapping) - the value I seek is that between the date on
the left and the "UV Port" on the right.

Enter bookmobile session location code (or NONE) : NONE 06 FEB 2004
March Mobile A UV Port 51
Circulation
06 FEB 2004 Papworth Library
UV Port 50

The section of code dealing with this is currently

if(/UV/) {
    $library = $`;
    $library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;
    $library =~ s/- CAMBOOK//g;
    $library =~ s/(\w+)/\u\L$1/g;
    print "$library\n";
}

The 2nd and 3rd pattern matches deal with other lines in the data (not
shown) in which the value I seek is all CAPS or has "- CAMBOOK"
appended. This code works fine on line 2 of the sample data given
above but I don't know how to get rid of "Enter bookmobile session
location code (or NONE) : NONE" when it appears (as it does on a few
entries). I have tried various patterns and I am sure the solution is
simple but it eludes me at present. Can anyone help?

Robert
 
 
 
 
 
gnari
Guest
Posts: n/a
 
      02-07-2004
"Robert Day" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) om...
> ... the value I seek is that between the date on
> the left and the "UV Port" on the right.
>
> Enter bookmobile session location code (or NONE) : NONE 06 FEB 2004
> March Mobile A UV Port 51
> Circulation
> 06 FEB 2004 Papworth Library
> UV Port 50
>
> The section of code dealing with this is currently
>
> if(/UV/) {
> $library = $`;
> $library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;


are you sure about the '^' here?

> $library =~ s/- CAMBOOK//g;
> $library =~ s/(\w+)/\u\L$1/g;
> print "$library\n";
> }


I would just do something like:
if ( ($library)=/\d\d \w\w\w \d{4} (.*?)(- CAMBOOK)? UV/ ) {
    print "$library\n";
}
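
A minimal sketch of how the list-context match above behaves on the two sample lines. The single spaces in the sample data are an assumption (the real spacing was lost when the post was wrapped), and extract_library() is a hypothetical helper name, not part of the original script.

```perl
use strict;
use warnings;

# Sample lines from the original post; single spaces are an assumption.
my @lines = (
    'Enter bookmobile session location code (or NONE) : NONE 06 FEB 2004 March Mobile A UV Port 51',
    '06 FEB 2004 Papworth Library UV Port 50',
);

# extract_library() is a hypothetical helper wrapping the pattern above.
sub extract_library {
    my ($line) = @_;
    # In list context a successful match returns the captured groups,
    # so $library receives the first group. A failed match returns an
    # empty list and leaves $library undefined.
    my ($library) = $line =~ /\d\d \w\w\w \d{4} (.*?)(- CAMBOOK)? UV/;
    return $library;
}

for my $line (@lines) {
    my $library = extract_library($line);
    print "$library\n" if defined $library;
}
```

Because the pattern is not anchored at the start of the line, whatever precedes the date (such as the "Enter bookmobile session..." prompt) is simply skipped.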

gnari




 
 
 
 
 
Gunnar Hjalmarsson
Guest
Posts: n/a
 
      02-08-2004
Robert Day wrote:
> I am using a very basic Perl script to parse a file and extract
> just the elements I need ...


<snip>

> I don't know how to get rid of "Enter bookmobile session location
> code (or NONE) : NONE" when it appears (as it does on a few
> entries). I have tried various patterns and I am sure the solution
> is simple but it eludes me at present. Can anyone help?


As regards the approach I have to ask: If you want to extract
something, why do you not write code that does just that rather than
deleting everything that you do not want to keep?

$library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;
--------------^
What are your considerations behind beginning the pattern with the ^
metacharacter?

perldoc perlvar points out that the $` variable "anywhere in a program
imposes a considerable performance penalty on all regular expression
matches". There appears not to be any reason to use it here.
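
If the text before the match is still wanted, perldoc perlvar notes that $` is equivalent to substr($var, 0, $-[0]). A small sketch of that alternative follows; text_before_match() is a hypothetical helper name and the spacing of the sample line is assumed.

```perl
use strict;
use warnings;

# text_before_match() is a hypothetical helper: it returns what $` would
# hold after matching $pattern against $line, or undef when nothing matches.
sub text_before_match {
    my ($line, $pattern) = @_;
    return unless $line =~ $pattern;
    # $-[0] is the offset at which the whole match started, so the text
    # before the match is the first $-[0] characters of the string.
    return substr($line, 0, $-[0]);
}

my $line   = '06 FEB 2004 Papworth Library UV Port 50';   # spacing assumed
my $before = text_before_match($line, qr/UV/);
print "[$before]\n" if defined $before;
```

This gives the same string the original code took from $`, so the later substitutions would still be needed; capturing the wanted part directly avoids them.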

> $library = $`;
> $library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;
> $library =~ s/- CAMBOOK//g;


You may want to replace those three lines with:

my ($library) = /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/;
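
For readers who want to see what each part of that pattern does, here is the same regex spread out with the /x modifier and commented. This is only a sketch: library_from_line() is a hypothetical helper, the "- CAMBOOK" sample line is made up for illustration, and the final whitespace trim is an addition, not part of the suggestion above.

```perl
use strict;
use warnings;

# The pattern above, rewritten with /x so each piece can carry a comment.
# Under /x plain spaces are ignored, so literal spaces are written as [ ].
my $library_re = qr{
    \d{2} [ ] \w{3} [ ] \d{4}   # the date, such as "06 FEB 2004"
    \s+                         # whitespace after the date
    (.+?)                       # the value wanted, as short as possible
    (?: - [ ] CAMBOOK )?        # optional "- CAMBOOK" suffix, not captured
    \s+ UV                      # whitespace, then the "UV" of "UV Port"
}x;

# library_from_line() is a hypothetical helper name.
sub library_from_line {
    my ($line) = @_;
    my ($library) = $line =~ $library_re;
    return unless defined $library;
    # When "- CAMBOOK" is present, the lazy group also picks up the
    # space in front of the hyphen, so trim trailing whitespace.
    $library =~ s/\s+$//;
    return $library;
}

# Made-up sample line with the suffix; spacing is assumed.
print library_from_line('06 FEB 2004 Papworth Library - CAMBOOK UV Port 50'), "\n";
```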

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

 
 
Robert
Guest
Posts: n/a
 
      02-08-2004

"Gunnar Hjalmarsson" <(E-Mail Removed)> wrote in message
news:c03u4u$107mv3$(E-Mail Removed)-berlin.de...
>
> As regards the approach I have to ask: If you want to extract
> something, why do you not write code that does just that rather than
> deleting everything that you do not want to keep?


It seemed simpler because there is consistency in the stuff to remove but
the value I want to keep could be one of 70 different values, with a variety
of different formats.

>
> $library =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;
> --------------^
> What are your considerations behind beginning the pattern with the ^
> metacharacter?
>


This is a leftover from the way the code worked before the introduction of
entries with the "Enter bookmobile....." line. At that time the dates were
always the leftmost item so always matched the ^ metacharacter.
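
A short sketch of why the anchored substitution stops working once the prompt text appears in front of the date. The sample strings stand for the text before "UV"; their exact spacing, including the leading whitespace the original pattern expects, is an assumption, and strip_through_date() is a hypothetical variant rather than anything proposed in this thread.

```perl
use strict;
use warnings;

# Text before "UV" for the two sample lines; spacing is assumed.
my $plain  = '  06 FEB 2004 Papworth Library ';
my $prompt = 'Enter bookmobile session location code (or NONE) : NONE 06 FEB 2004 March Mobile A ';

# The original substitution: the date must follow leading whitespace
# at the very start of the string.
sub strip_anchored {
    my ($text) = @_;
    $text =~ s/^\s+\d{2}\s\w{3}\s\d{4}\s+//;
    return $text;
}

# Hypothetical variant: discard everything up to and including the date.
sub strip_through_date {
    my ($text) = @_;
    $text =~ s/^.*?\d{2}\s\w{3}\s\d{4}\s+//;
    return $text;
}

print "[", strip_anchored($plain), "]\n";       # date removed
print "[", strip_anchored($prompt), "]\n";      # unchanged: ^ cannot match
print "[", strip_through_date($prompt), "]\n";  # prompt and date removed
```

Simply dropping the ^ would not be enough, since the prompt text would then stay in front of the library name.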

> You may want to replace those three lines with:
>
> my ($library) = /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/;
>


Thanks. I'll give it a go (and then try to understand exactly what it is
doing!)
Robert


 
 
Gunnar Hjalmarsson
Guest
Posts: n/a
 
      02-08-2004
Robert wrote:
> "Gunnar Hjalmarsson" <(E-Mail Removed)> wrote in message
> news:c03u4u$107mv3$(E-Mail Removed)-berlin.de...
>> As regards the approach I have to ask: If you want to extract
>> something, why do you not write code that does just that rather
>> than deleting everything that you do not want to keep?

>
> It seemed simpler because there is consistency in the stuff to
> remove but the value I want to keep could be one of 70 different
> values, with a variety of different formats.


Okay. As you can see from both my and gnari's examples, that should
not prevent you from capturing rather than removing stuff.

>> You may want to replace those three lines with:
>>
>> my ($library) = /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/;

>
> Thanks. I'll give it a go (and then try to understand exactly what
> it is doing!)


It can also be written:

my $library;
if ( /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/ ) {
    $library = $1;
}

Please study perldoc perlre about capturing, the meaning of the $1
variable, etc.
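
As a rough illustration of $1 in use, the sketch below combines the capture with the title-casing step from the original script. library_name() is a hypothetical helper, the all-caps sample line is made up, and the spacing of all sample data is assumed.

```perl
use strict;
use warnings;

# library_name() is a hypothetical helper combining the capture with the
# title-casing substitution from the original script.
sub library_name {
    my ($line) = @_;
    # $1 is only meaningful after a successful match, so test the match first.
    if ( $line =~ /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/ ) {
        # Copy $1 at once: the substitutions below run their own matches
        # and set $1 again.
        my $library = $1;
        $library =~ s/\s+$//;            # drop trailing whitespace
        $library =~ s/(\w+)/\u\L$1/g;    # "PAPWORTH" becomes "Papworth"
        return $library;
    }
    return;
}

my @lines = (
    'Enter bookmobile session location code (or NONE) : NONE 06 FEB 2004 March Mobile A UV Port 51',
    '06 FEB 2004 PAPWORTH LIBRARY - CAMBOOK UV Port 50',   # made-up sample
    'Circulation',
);

for my $line (@lines) {
    my $library = library_name($line);
    print "$library\n" if defined $library;
}
```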

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

 
 
R Day
Guest
Posts: n/a
 
      02-08-2004

"Gunnar Hjalmarsson" <(E-Mail Removed)> wrote in message
news:c05n13$13j4u9$(E-Mail Removed)-berlin.de...
> It can also be written:
>
> my $library;
> if ( /\d{2} \w{3} \d{4}\s+(.+?)(?:- CAMBOOK)?\s+UV/ ) {
> $library = $1;
> }


Thanks. This works as required.

> Please study perldoc perlre about capturing, the meaning of the $1
> variable, etc.


I will do.

Robert


 
 
 
 