Need help with an advanced? regular expression

 
 
Martin Gill
      02-18-2005
Hi,

I'm trying to write a regular expression which parses the following string:

blah blah items 1234, 4567, 4345, and 3245 blah blah blah

I want to be able to pick up the numbers following the "items" label.

I thought the following might work, but it doesn't seem to:

/ORs (\b(\d+)\b)+/

I want it to match:
1234
4567
4345
3245

Any help is greatly appreciated.


--
Martin Gill
 
Martin Gill
      02-18-2005
Thanks for the quick reply,

Bernard El-Hagin wrote:
> Martin Gill <(E-Mail Removed)> wrote:
>
>
>>Hi,
>>
>>I'm trying to write a regular expression which parses the
>>following string:
>>
>>blah blah items 1234, 4567, 4345, and 3245 blah blah blah
>>
>>I want to be able to pick up the numbers following the "items"
>>label.
>>
>>I thought the following might work, but it doesn't seem to
>>
>>/ORs (\b(\d+)\b)+/

>
> ^^^
>


Replace "ORs" with "items". I'm trying to use the regex in different places,
and I pasted the other variant.

>
> What is that supposed to do?
>
>
>
>>i want it to match:
>>1234
>>4567
>>4345
>>3245

>
>
>
> With the input and specification you've provided this will work for
> you:
>
>
> print "$_\n" for m/(\d+)/g;
>
>


The problem I have is that the target string could be something like this:

Over the next 10 days i'll deliver 4 items 1234, 1234, 5321 and 2345.

I want to use "items" as the key phrase to identify the list of items.
The example you gave will also find 10 and 4, which I don't want.

In English, the regex I need is: find all numbers after "items".


--
Martin Gill
 
Gunnar Hjalmarsson
      02-18-2005
Martin Gill wrote:
> The problem I have is that the target string could be something like this:
>
> Over the next 10 days i'll deliver 4 items 1234, 1234, 5321 and 2345.
>
> I want to use items as the key phrase to identify the list of times.
> The example you gave will also find 10 and 4 which i don't want.
>
> In english, the regex i need is: Find all numbers after "items".


You don't necessarily need a pure regex, do you?

print "$_\n" for substr($_, index $_, 'items') =~ /\d+/g;
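One caveat with the one-liner above: index returns -1 when "items" is absent, and substr with a negative offset then yields the last character of the string, so a trailing digit could be reported by mistake. A minimal sketch of a guarded variant follows. The sub name numbers_after is made up for illustration and is not from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): return every run
# of digits that appears after the first occurrence of $label in $text.
sub numbers_after {
    my ($text, $label) = @_;
    my $pos = index $text, $label;
    return () if $pos < 0;    # label absent: report nothing
    # Skip past the label itself, then collect all digit runs.
    return substr($text, $pos + length $label) =~ /\d+/g;
}

print "$_\n" for numbers_after(
    "Over the next 10 days i'll deliver 4 items 1234, 1234, 5321 and 2345.",
    'items',
);
```

Call it in list context, as above; in scalar context the match would return only a true/false value.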

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 
Arndt Jonasson
      02-18-2005

Martin Gill <(E-Mail Removed)> writes:
> The problem I have is that the target string could be something like this:
>
> Over the next 10 days i'll deliver 4 items 1234, 1234, 5321 and 2345.
>
> I want to use items as the key phrase to identify the list of times.
> The example you gave will also find 10 and 4 which i don't want.
>
> In english, the regex i need is: Find all numbers after "items".


I would first extract the substring beginning with "items" and then
apply the regexp to find the numbers.

Maybe it can be done in one single regexp (I don't think it can), but
even if so, would it be worth the effort?
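For what it's worth, a single pattern does seem possible with the \G anchor, which matches where the previous /g match left off. This is only a sketch under the assumption that "all numbers after items" means every digit run to the end of the string; whether it is more readable than the two-step approach is debatable.

```perl
use strict;
use warnings;

my $text = "Over the next 10 days i'll deliver 4 items 1234, 1234, 5321 and 2345.";

# First alternative: start at the word "items".
# Second alternative: \G resumes exactly where the previous match ended;
# (?!\A) stops \G from matching at the very start of the string, where
# no "items" has been seen yet.
# \D* then skips separators such as ", " or " and " before each number.
my @numbers = $text =~ /(?:\bitems\b|\G(?!\A))\D*(\d+)/g;

print "$_\n" for @numbers;
```

The 10 and the 4 are skipped because neither alternative can start a match before "items" has been found.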
 
jl_post@hotmail.com
      02-18-2005
Martin Gill wrote:
>
> I'm trying to write a regular expression which parses the following
> string:
>
> blah blah items 1234, 4567, 4345, and 3245 blah blah blah
>
> I want to be able to pick up the numbers following the "items" label.
>
> I thought the following might work, but it doesn't seem to
>
> /ORs (\b(\d+)\b)+/
>
> i want it to match:
> 1234
> 4567
> 4345
> 3245



Well, for one thing, you want to pick up the numbers following the
"items" string, but in your regular expression you are searching for
"ORs" instead (which doesn't appear in your string at all).
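Even with "items" substituted for "ORs", the pattern would pick up only the first number: after 1234 the repeated group needs a digit straight away and finds a comma instead, so it repeats just once. A small sketch to illustrate (the variable $captured is introduced here only for the demonstration):

```perl
use strict;
use warnings;

my $string = "blah blah items 1234, 4567, 4345, and 3245 blah blah blah";

my $captured;
if ($string =~ /items (\b(\d+)\b)+/) {
    # The group gets through only one repetition: the character after
    # "1234" is ",", so a second \b(\d+)\b cannot match there.
    $captured = $2;
}

print "captured: $captured\n" if defined $captured;
```

A quantified group keeps only its last repetition in any case, so collecting several numbers calls for /g in list context rather than a + around the group.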

If you have a $string variable:

$string = "blah blah items 1234, 4567, 4345, and 3245 blah blah blah";

you can print out all the numbers by first matching "items" and then by
matching all the numbers in the postmatch (the $' variable), like this:

if ($string =~ /items/)
{
    # Everything after "items" (the postmatch) is now in $'
    # so extract all the numbers in $' :
    print "$_\n" foreach $' =~ m/\d+/g;
}

But be warned! Use of $' carries a performance penalty, which makes many
Perl programmers avoid it. If this performance penalty bothers you, you
can avoid it with the following similar code:

if ($string =~ /items(.*)/)
{
    # Everything after "items" is now in $1
    # so extract all the numbers in $1 :
    print "$_\n" foreach $1 =~ m/\d+/g;
}

If this is the only regular expression in your program, or if the
other regular expressions operate on relatively small strings, then
using $' should be nothing to worry about. In such cases, I think you
should use whatever method is more readable to you.
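A third option, assuming Perl 5.10 or later is available: the /p modifier makes the postmatch available in ${^POSTMATCH} for that one match, without the program-wide cost associated with $'. A minimal sketch (the $rest and @numbers variables are just for illustration):

```perl
use strict;
use warnings;

my $string = "blah blah items 1234, 4567, 4345, and 3245 blah blah blah";

my @numbers;
if ($string =~ /items/p) {
    # With /p, ${^POSTMATCH} holds what $' would hold for this match.
    # Copy it first so the second match works on an ordinary variable.
    my $rest = ${^POSTMATCH};
    @numbers = $rest =~ /\d+/g;
}

print "$_\n" for @numbers;
```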

I hope this helps.

-- Jean-Luc

 