Velocity Reviews - Computer Hardware Reviews

REGEX: capturing on optional groups which fail

Charles Shannon Hendrix

I have been writing some code to parse log files, and I used regular
expressions to build arrays of fields. Those arrays were inserted
verbatim into an SQL insert command.

I assign the results of the regex to an array, like this:

@array = $line =~ /$rex_extract/x;
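
For readers less familiar with this idiom, here is a minimal sketch of what the assignment does. The pattern and the sample lines are hypothetical stand-ins (the real $rex_extract is not shown in the post); the point is only that a match in list context returns the captures, and a failed match returns the empty list.

```perl
use strict;
use warnings;

# Hypothetical pattern and sample line, for illustration only.
my $rex_extract = qr/ ^ (\w+) \s+ "([0-9]+)" \s+ (\w+) \s* $ /x;
my $line = 'backup "3" warnings';

# In list context a successful match returns the captures ($1, $2, ...).
my @array = $line =~ /$rex_extract/x;

# A failed match returns the empty list, so the array doubles as a success test.
my @failed = 'no digits here' =~ /$rex_extract/x;

printf "%d fields: %s\n", scalar @array, join('|', @array);
printf "%d fields on failure\n", scalar @failed;
```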

Then I found that some lines had a variable ending. There were three
possible endings:

"N" warnings
"N" errors
"N" errors, error code = "N"

At the same time, I want a regex failure on lines like this:

"N" warnings, "N" errors
"N" warnings, "N" errors, error code = "N"
"N" errors, "N" warnings
"N" errors, "N" warnings, error code = "N"

I found the following regex works and keeps my array in order so I don't
have to do ugly array parsing later:

<expressions for first N non-variable fields snipped>
"([0-9]+)" # number of...
warnings # warnings
"([0-9]+)" # number of...
errors # errors
(?: # error code
\s*$' # end of line


Do captures in failing non-capturing expressions always generate an
empty array position? I want to make sure I'm not depending on an
unreliable side effect.
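
As far as I can tell this is defined behaviour rather than a side effect: perlop describes m// in list context (without /g) as returning ($1, $2, $3...), one slot per capture group in the pattern, and a group that did not take part in the match is undef. The sketch below is only a guess at the shape of the line-ending part of the pattern; the name $rex_tail, the helper parse_tail, the exact whitespace handling, and the sample lines are all assumptions, not the original code.

```perl
use strict;
use warnings;

# Assumed shape of the line ending only; the real pattern has more leading fields.
my $rex_tail = qr/
    ^
    (?:
        "([0-9]+)" \s+ warnings       # capture 1: number of warnings
      |
        "([0-9]+)" \s+ errors         # capture 2: number of errors
        (?: , \s* error \s+ code \s* = \s* "([0-9]+)" )?   # capture 3: error code
    )
    \s* $
/x;

# Hypothetical helper: returns three slots on success, the empty list on failure.
sub parse_tail {
    my ($line) = @_;
    return $line =~ $rex_tail;
}

for my $line ('"3" warnings',
              '"2" errors',
              '"2" errors, error code = "7"',
              '"3" warnings, "2" errors') {
    my @fields = parse_tail($line);
    my $shown = @fields
        ? join(', ', map { defined $_ ? $_ : 'undef' } @fields)
        : 'no match';
    print "$line => $shown\n";
}
```

Every successful match yields exactly three positions, with undef in the slots whose group did not participate, so the array order stays fixed no matter which ending was found. The mixed "warnings, errors" line is rejected because neither alternative can reach the end of the line.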

The reason I like this is that it preserves the order in my array, so I
don't have to parse it to see which line ending was found.

I'm interested in seeing better ways of doing this.
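
One possible alternative, sketched here under the same assumptions about the line format, is to use named captures (Perl 5.10 or later) so that the fields are looked up by name rather than by position. The group names and the helper tail_fields are invented for this example.

```perl
use strict;
use warnings;

# Same assumed line endings as in the question, with named groups.
my $rex_named = qr/
    ^
    (?:
        "(?<warnings>[0-9]+)" \s+ warnings
      |
        "(?<errors>[0-9]+)" \s+ errors
        (?: , \s* error \s+ code \s* = \s* "(?<code>[0-9]+)" )?
    )
    \s* $
/x;

# Hypothetical helper: returns name => value pairs, or the empty list on failure.
sub tail_fields {
    my ($line) = @_;
    return unless $line =~ $rex_named;
    # Groups that did not participate read as undef through %+.
    return map { $_ => $+{$_} } qw(warnings errors code);
}

my %f = tail_fields('"2" errors, error code = "7"');

# A hash slice puts the values back into a fixed column order.
my @row = @f{qw(warnings errors code)};
print join(', ', map { defined $_ ? $_ : 'undef' } @row), "\n";
```

The trade-off is a little more code in exchange for not having to count parentheses when the pattern changes; the positional version is just as well defined.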

I would also like a pointer to where this behavior is documented. I've
not been able to find an explicit mention.

shannon "AT" -- [governorrhea: a contagious disease that
spreads from the governor of a state downward through other offices and his
corporate sponsors]