Regexp for variable length tags

 
 
Jon Burroughs
      07-18-2005
I am processing some data that has up to three key-value pairs
concatenated together. The keys can be "ADD", "REM", or "EQD". Values are
variable length.

There will always be an "ADD" section, followed by 0 to 1 "REM"
sections, followed by 0 to 1 "EQD" sections. For example:
ADDxxxxxxxxREMyyyyyEQDzzzzz

I'm trying to find a regular expression that will split this apart into
separate sections in one step.

So far, I have this:

$rec =~ /(ADD.+)(REM.+)(EQD.+)/;

But, this only works if I know the record has all three tokens.

This gobbles too much:
$rec =~ /(ADD.+)(REM.+)?(EQD.+)?/;

Any ideas?

-Jon
 
 
John W. Krahn
      07-18-2005
Jon Burroughs wrote:
> I am processing some data that has up to three key-value pairs
> concatenated together. The keys can be "ADD", "REM", or "EQD". Values are
> variable length.
>
> There will always be an "ADD" section, followed by 0 to 1 "REM"
> sections, followed by 0 to 1 "EQD" sections. For example:
> ADDxxxxxxxxREMyyyyyEQDzzzzz
>
> I'm trying to find a regular expression that will split this apart into
> separate sections in one step.
>
> So far, I have this:
>
> $rec =~ /(ADD.+)(REM.+)(EQD.+)/;
>
> But, this only works if I know the record has all three tokens.
>
> This gobbles too much:
> $rec =~ /(ADD.+)(REM.+)?(EQD.+)?/;
>
> Any ideas?


Try using non-greedy quantifiers.

perldoc perlre
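
In the second pattern above, the greedy .+ in the ADD group runs to the end of the string, and the two optional groups are then free to match nothing. A minimal sketch of the non-greedy approach follows. It assumes the values never contain the literal strings "REM" or "EQD", and split_rec is a hypothetical helper name, not something from the original post:

```perl
use strict;
use warnings;

# Hypothetical helper: split a record into its ADD/REM/EQD sections.
# Assumption: values never contain the literal strings "REM" or "EQD".
sub split_rec {
    my ($rec) = @_;
    # Lazy .+? stops as early as it can; the ^ and $ anchors force the
    # match to consume the whole record, so each lazy group extends
    # only up to the next key (or the end of the string).
    my @parts = $rec =~ /^(ADD.+?)(REM.+?)?(EQD.+)?$/
        or return;
    # Optional groups that did not take part in the match are undef.
    return grep { defined $_ } @parts;
}

for my $rec (qw/ADDxxxxxxxxREMyyyyyEQDzzzzz ADD2222REM666666 ADD7777777EQD8888/) {
    print join(' | ', split_rec($rec)), "\n";
}
```

Without the anchors, the lazy groups would be satisfied after a single character each, so the anchors are what make this work.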


John
 
 
Gunnar Hjalmarsson
      07-18-2005
Jon Burroughs wrote:
> There will always be an "ADD" section, followed by 0 to 1 "REM"
> sections, followed by 0 to 1 "EQD" sections. For example:
> ADDxxxxxxxxREMyyyyyEQDzzzzz
>
> I'm trying to find a regular expression that will split this apart into
> separate sections in one step.


Why regex?

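use strict;
use warnings;
use Data::Dumper;

my @rec;
while (<DATA>) {
    chomp;
    # Work backwards from the last key so each value runs to the end of the string.
    for my $key ( qw/EQD REM ADD/ ) {
        if ( (my $pos = index $_, $key) >= 0 ) {
            $rec[$.-1]{$key} = substr $_, $pos+3;   # value follows the 3-letter key
            substr($_, $pos) = '';                  # drop this section before the next key
        }
    }
}
print Dumper \@rec;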

__DATA__
ADDxxxxxxREMyyyyyEQDzzzzz
ADD2222REM666666
ADD7777777EQD8888

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 