Simple Regex Doubt

 
 
Donato Azevedo
      07-16-2009
Hi everyone,

I've got a simple problem that I have not been able to solve so far:

I have these regexes which I want to convert into a single one:

if ( $raw_content =~ /Doc1(?:=rev)?:(?<document1>.*?)\r\n
                      Doc2(?:=rev)?:(?<document2>.*?)\r\n
                      Item:(?<item>.*?)\r\n
                      Data\s+doc1:(?<data1>.*?)\r\n
                      Data\s+doc2:(?<data2>.*?)\r\n
                      Obs:(?<observation>.*?)\r\n
                      Critic:(?<criticality>.*?)\r\n
                      Comments:(?<comments>.*)
                     /isx ||
     $raw_content =~ /Doc1(?:=rev)?:(?<document1>.*?)\r\n
                      Doc2(?:=rev)?:(?<document2>.*?)\r\n
                      Item:(?<item>.*?)\r\n
                      Data\s+doc1:(?<data1>.*?)\r\n
                      Data\s+doc2:(?<data2>.*?)\r\n
                      Obs:(?<observation>.*?)\r\n
                      Critic:(?<criticality>.*)
                     /isx ) {

This is meant to match text that ends in either:

Critic:foobartext

or

Critic:foo
Comments:bar

The problem seems to be the greediness of the last captures: I tried
doing

Critic:(?<criticality>.*?)(\r\nComments:(?<comments>.*))?

and

Critic:(?<criticality>.*)(\r\nComments:(?<comments>.*))?

but I must be missing something... It must be something quite simple
I'd say.

Well, any ideas?
 
C.DeRykus
      07-16-2009
On Jul 16, 9:17 am, Donato Azevedo <donat...@gmail.com> wrote:
<snip>



You might want to post a simple, minimal example to
demo what is/isn't working. The following worked
for me:

$_ = <<'END';
one line
another line
Critic: foobartext
Comments: bunches of comments
END
# /x ignores the literal spaces in the pattern, so each capture
# starts right after the colon; the Comments group is optional.
my $regex = qr/.*? Critic: (?<criticality>.*?)\n
               (?:Comments: (?<comments>.*))?
              /isx;
if ( /$regex/ ) {
    print "criticality: $+{criticality}", "\n",
          "comments: $+{comments}";
}
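
Applied to the CRLF-separated tail from the original post, the same idea might look like the sketch below. Both attempts in the question fail for a related reason: with the lazy `.*?` and nothing after the optional group, the regex is satisfied as soon as criticality has matched the empty string; with the greedy `.*` under /s, criticality swallows the Comments line too. Anchoring the end with `\z` forces the lazy capture to grow until the optional group (or the end of the text) fits. The sample strings and the helper name parse_tail are made up for illustration, and the sketch assumes the record ends right after the last field, with no trailing line break.

```perl
use 5.010;
use strict;
use warnings;

# Optional Comments line, with the end of the text anchored by \z.
my $tail_regex = qr/
    Critic:(?<criticality>.*?)            # lazy: grows only as far as needed
    (?: \r\n Comments:(?<comments>.*) )?  # optional, non-capturing wrapper
    \z                                    # anchor that makes the lazy capture expand
/isx;

# Hypothetical helper: returns a hash ref with both fields, or nothing
# if the text does not match. comments is undef when the line is absent.
sub parse_tail {
    my ($text) = @_;
    return unless $text =~ $tail_regex;
    # Copy the values out of %+ right away; the next successful match resets it.
    return { criticality => $+{criticality}, comments => $+{comments} };
}

# Made-up sample records in the two shapes described in the question.
for my $text ( "Obs:none\r\nCritic:foobartext",
               "Obs:none\r\nCritic:foo\r\nComments:bar" ) {
    my $fields = parse_tail($text) or next;
    printf "criticality=%s comments=%s\n",
        $fields->{criticality}, $fields->{comments} // '(none)';
}
```

If the real data can end with a trailing line break, the anchor would need to allow for it, for example `\s*\z` together with a lazy comments capture.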

--
Charles DeRykus
 
sln@netherlands.com
      07-17-2009
On Thu, 16 Jul 2009 09:17:59 -0700 (PDT), Donato Azevedo wrote:

<snip>


Wow, looks complicated, but isn't. Yes, as DeRykus says,
you need a '?' quantifier (0 or 1) around a non-capturing group
of --> Critic:(?<criticality>.*) in the first regex.

This will at least assign $+{criticality} a '' if there is no 'Critic:'
data (.*), and will assign undef (just like the $n vars, I think) if there is no 'Critic:'.

I haven't checked 5.10 much, but $+{criticality} may not even exist if the '?'
for the group matches 0 times. Regex satisfied, but who knows how the %+ hash is reset.
Probably exists, but set to undef, like its unnamed capture counterpart.
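
A quick way to settle this is to probe %+ directly. As far as perlvar documents it, the keys of %+ list only the names of groups that actually captured, while %- has a key for every name in the pattern. The sketch below checks that for the optional Comments group from this thread; the helper name capture_state and the sample strings are made up for illustration.

```perl
use 5.010;
use strict;
use warnings;

my $regex = qr/Critic:(?<criticality>.*?)(?:\r\nComments:(?<comments>.*))?\z/is;

# Hypothetical helper: reports how the optional "comments" group shows up
# in %+ and %- after a successful match.
sub capture_state {
    my ($text) = @_;
    return unless $text =~ $regex;
    return {
        exists_in_plus  => ( exists( $+{comments} )  ? 1 : 0 ),
        defined_in_plus => ( defined( $+{comments} ) ? 1 : 0 ),
        exists_in_minus => ( exists( $-{comments} )  ? 1 : 0 ),
    };
}

# Made-up inputs: one without and one with the Comments line.
for my $text ( "Critic:foobartext", "Critic:foo\r\nComments:bar" ) {
    my $state = capture_state($text) or next;
    say join ', ', map { "$_=$state->{$_}" } sort keys %$state;
}
```

So a plain defined() test on $+{comments} is enough to tell whether the optional line was present.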

Btw, what's this bizz: /(.*?)\r\n/s ??

-sln





 
sln@netherlands.com
      07-17-2009
On Fri, 17 Jul 2009 12:46:17 -0700, wrote:

>On Thu, 16 Jul 2009 09:17:59 -0700 (PDT), Donato Azevedo wrote:


<snip>



^^
Oh, I'm sorry, s/comments/criticality/g in the above reply-post.

-sln
 
sln@netherlands.com
      07-17-2009
On Fri, 17 Jul 2009 12:52:22 -0700, wrote:

>On Fri, 17 Jul 2009 12:46:17 -0700, wrote:
>
>>On Thu, 16 Jul 2009 09:17:59 -0700 (PDT), Donato Azevedo <> wrote:

>
><snip>
>


Warning!! ignore that man behind the curtain..
The saga continues, s/criticality/comments/g
Dyslexia is a terrible thing to waste.

-sln
 