I'm struggling with an EZ way to do this regex

 
 
advice please wireless 802.11 on RH8
 
      09-11-2008
I'm pretty good at regexes, at least for most common uses. But
although I can brute-force a solution here, I'm not happy with it!

Let's say we have an array like

my @a = qw(10 20 22 23 25);

and some text like

'44,33,4.44.64.10,32,25,88,20,6,55'

and I want a regex that replaces any number in the string with, say,
'XX', as long as that number is not in the array @a, yielding:

$_ = 'XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX'

The most *elegant* approach I've dreamed up is to join the array with
OR (|), then somehow use that to compare in the text. But I'm not sure
how to negatively compare.

my $a = join '|',@a;
s/(something)($a)/XX/g;

I think this may be one of those oddball assertions that I never
mastered.

My other idea was to @t = split /,/
then iterate over each element, testing it against the array with

grep /^$element$/, @a

but that ain't so pretty either.



Can someone give me a nudge in the right direction to do this in a
single, simple, elegant regex with no array conversions or looping? I
can usually dream one up, but not this time!
 
Peter Scott
 
      09-11-2008
On Thu, 11 Sep 2008 06:43:04 -0700, advice please wireless 802.11 on RH8
wrote:
> Lets say we have an array like
>
> my @a = qw(10 20 22 23 25);
>
> and some text like
>
> '44,33,4.44.64.10,32,25,88,20,6,55'
>
> and I want a regex that replaces any number in the string with say
> 'XX', as long as that number is not in the array @a, yielding:
>
> $_ = 'XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX'


my @a = qw(10 20 22 23 25);
$_ = '44,33,4.44.64.10,32,25,88,20,6,55';
my %keep = map { $_, 1 } @a;
s/(\d+)/$keep{$1} ? $1 : 'XX'/ge;
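
For anyone who wants to check this against the expected output from the original post, here is the same idea wrapped in a small sub. This is only a sketch: the name mask_unlisted is made up for illustration, and it assumes that "number" means a run of digits.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every run of digits that is not listed
# in @$keep_list with 'XX'.
sub mask_unlisted {
    my ($text, $keep_list) = @_;
    my %keep = map { $_ => 1 } @$keep_list;    # lookup table of numbers to keep
    # /e evaluates the replacement as Perl code; /g applies it to every digit run
    $text =~ s/(\d+)/$keep{$1} ? $1 : 'XX'/ge;
    return $text;
}

print mask_unlisted('44,33,4.44.64.10,32,25,88,20,6,55', [qw(10 20 22 23 25)]), "\n";
# prints XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX
```

Because \d+ is greedy, each lookup sees a whole digit run, so '100' is never confused with a listed '10'.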

--
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/

 
Ben Morrow
 
      09-11-2008

Quoth "advice please wireless 802.11 on RH8" <>:
> I'm pretty good at regexes- at least for most common uses. But
> although I can brute force a solution here I'm not happy with it!
>
> Lets say we have an array like
>
> my @a = qw(10 20 22 23 25);
>
> and some text like
>
> '44,33,4.44.64.10,32,25,88,20,6,55'
>
> and I want a regex that replaces any number in the string with say
> 'XX', as long as that number is not in the array @a, yielding:
>
> $_ = 'XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX'
>
> The most *elegant* approach I've dreamed up is to join the array with
> OR (|), then somehow use that to compare in the text. But I'm not sure
> how to negatively compare.
>
> my $a = join '|',@a;
> s/(something)($a)/XXX/g;
>
> I think this may be one of those oddball assertions that I never
> mastered.


Something like

s/ (?! $a ) \d+ /XX/gx

is what you want, but that hits lots of nasty corner cases like '1'
being in the array and '12' in the string. I *think*

s/ (^|\D) (?! (?: $a) (?: \D|$) ) \d+ /$1XX/gx

works correctly, but that's hardly pretty. With 5.10 you can remove the
nasty $1 capture using \K:

s/ (?: ^|\D) \K (?! (?: $a) (?: \D|$) ) \d+ /XX/gx

but it's not much of an improvement.
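
To make the corner cases concrete, here is a rough side-by-side of the unanchored lookahead and the anchored one. The sub names are invented for this sketch, \z is used in place of $ purely to keep the pattern unambiguous, and ${1} is written with braces so the capture is clearly separated from the literal XX.

```perl
use strict;
use warnings;

# Both subs are hypothetical; each takes the text, then the numbers to keep.

# Unanchored: the lookahead is also tried in the middle of a digit run.
sub mask_naive {
    my ($text, @keep) = @_;
    my $alt = join '|', @keep;
    $text =~ s/ (?! $alt ) \d+ /XX/gx;
    return $text;
}

# Anchored: a number must start after a non-digit (or at the start of the
# string), and a listed number only counts if a non-digit or the end follows.
sub mask_anchored {
    my ($text, @keep) = @_;
    my $alt = join '|', @keep;
    $text =~ s/ (^|\D) (?! (?: $alt ) (?: \D | \z ) ) \d+ /${1}XX/gx;
    return $text;
}

print mask_naive('10,32', 10, 20), "\n";       # prints 1XX,XX
print mask_anchored('10,32', 10, 20), "\n";    # prints 10,XX
print mask_naive('1,12', 1), "\n";             # prints 1,1XX
print mask_anchored('1,12', 1), "\n";          # prints 1,XX
```

The unanchored version fails even on a listed number: after the lookahead rejects the position before "10", the engine simply retries one character later and replaces the "0".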

I would put the numbers to be matched in a hash:

my %ok;
@ok{@a} = (1) x @a;    # one true value per key; a lone 1 would only reach the first key

and then split the string and match against the hash:

my @split = split /\D/;
for (@split) {
    $_ = "XX" unless $ok{$_};
}
$_ = join ",", @split;
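
A side note on filling the lookup hash through a slice: a slice assignment is a list assignment, so the values on the right are paired with the keys one by one. The sketch below (helper names are invented for illustration) compares assigning a single 1 with repeating the value once per key.

```perl
use strict;
use warnings;

# Hypothetical helpers; each returns a hash reference built from its arguments.
sub slice_single {
    my %h;
    @h{@_} = 1;              # only the first key receives 1, the rest get undef
    return \%h;
}

sub slice_repeat {
    my %h;
    @h{@_} = (1) x @_;       # the list (1) repeated once per key
    return \%h;
}

# Count how many keys hold a true value.
sub count_true {
    my ($hash) = @_;
    return scalar grep { $_ } values %$hash;
}

my @a = qw(10 20 22 23 25);
printf "single 1:  %d of %d keys are true\n", count_true(slice_single(@a)), scalar @a;
printf "(1) x \@_:  %d of %d keys are true\n", count_true(slice_repeat(@a)), scalar @a;
```

With a test like `unless $ok{$_}`, only the repeated form treats every listed number as present. Assigning an empty list and testing with exists is another common option.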

Ben

--
I have two words that are going to make all your troubles go away.
"Miniature". "Golf".
[]
 
 
Peter Scott
 
      09-11-2008
On Thu, 11 Sep 2008 15:16:08 +0100, Ben Morrow wrote:
> I would put the numbers to be matched in a hash:
>
> my %ok;
> @ok{@a} = (1) x @a;
>
> and then split the string and match against the hash:
>
> my @split = split /\D/;
> for (@split) {
> $_ = "XX" unless $ok{$_};
> }
> $_ = join ",", @split;


Not all of the inter-digit characters in the input string were commas.

--
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/

 
 
Ben Morrow
 
      09-11-2008

Quoth Peter Scott <>:
> On Thu, 11 Sep 2008 15:16:08 +0100, Ben Morrow wrote:
> > I would put the numbers to be matched in a hash:
> >
> > my %ok;
> > @ok{@a} = (1) x @a;
> >
> > and then split the string and match against the hash:
> >
> > my @split = split /\D/;
> > for (@split) {
> > $_ = "XX" unless $ok{$_};
> > }
> > $_ = join ",", @split;

>
> Not all of the inter-digit characters in the input string were commas.


I noticed that, but the OP mentioned split /,/ so I presumed they were
typos. If not, something like

my @split = split /(\D+)/;
for (@split) {
    /\D/ and next;
    $_ = "XX" unless $ok{$_};
}
$_ = join "", @split;

should do.
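
A self-contained version of the split approach, as a sketch only: the sub name mask_by_split is hypothetical. It relies on split returning the captured separators as extra fields, so joining with the empty string restores the original punctuation. Unlike the loop above, it also skips the empty leading field that split produces when the string starts with a separator.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the text, then the numbers to keep.
sub mask_by_split {
    my ($text, @keep) = @_;
    my %ok = map { $_ => 1 } @keep;
    # The capturing group makes split return the separators as fields too.
    my @fields = split /(\D+)/, $text;
    for my $field (@fields) {
        next unless $field =~ /\A\d+\z/;    # skip separators and empty fields
        $field = 'XX' unless $ok{$field};   # $field aliases the array element
    }
    return join '', @fields;
}

print mask_by_split('44,33,4.44.64.10,32,25,88,20,6,55', qw(10 20 22 23 25)), "\n";
# prints XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX
```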

Ben

--
If you put all the prophets, | You'd have so much more reason
Mystics and saints | Than ever was born
In one room together, | Out of all of the conflicts of time.
The Levellers, 'Believers'
 
 
Ilya Zakharevich
 
      09-11-2008
[A complimentary Cc of this posting was NOT [per weedlist] sent to
Ben Morrow
<>], who wrote in article <8qlnp5->:
> > '44,33,4.44.64.10,32,25,88,20,6,55'
> >
> > and I want a regex that replaces any number in the string with say 'XX',


I do not know what a "number" is here. I assume you mean "a sequence of digits".

> Something like
>
> s/ (?! $a ) \d+ /XX/gx


s/ \b (?! (?: $a ) \b ) \d+ /XX/gx
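
One detail to watch when a joined list like '10|20|22|23|25' is interpolated into this pattern: alternation has the lowest precedence, so a trailing \b only attaches to every alternative if the list is wrapped in a group. The following sketch (sub names are made up) shows the difference on an input where a listed number is a prefix of a longer one.

```perl
use strict;
use warnings;

# Hypothetical helpers; each takes the text, then the numbers to keep.

# Without a group, the pattern reads as "10, or 20 followed by \b".
sub mask_ungrouped {
    my ($text, @keep) = @_;
    my $alt = join '|', @keep;
    $text =~ s/ \b (?! $alt \b ) \d+ /XX/gx;
    return $text;
}

# With (?: ... ), the \b applies to whichever alternative matched.
sub mask_grouped {
    my ($text, @keep) = @_;
    my $alt = join '|', @keep;
    $text =~ s/ \b (?! (?: $alt ) \b ) \d+ /XX/gx;
    return $text;
}

print mask_ungrouped('100,10', 10, 20), "\n";    # prints 100,10
print mask_grouped('100,10', 10, 20), "\n";      # prints XX,10
```

On the sample string from the original post both forms give the same result, because none of the unlisted numbers starts with a listed one.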

Hope this helps,
Ilya
 
 
Ben Morrow
 
      09-12-2008

Quoth Ilya Zakharevich <nospam->:
> [A complimentary Cc of this posting was NOT [per weedlist] sent to
> Ben Morrow
> <>], who wrote in article
> <8qlnp5->:
> > > '44,33,4.44.64.10,32,25,88,20,6,55'
> > >
> > > and I want a regex that replaces any number in the string with say 'XX',

>
> I do not know what is a "number". I assume you mean "a sequence of digits".
>
> > Something like
> >
> > s/ (?! $a ) \d+ /XX/gx

>
> s/ \b (?! (?: $a ) \b ) \d+ /XX/gx


Duh! I was thinking I needed a \d\D boundary, but of course for the
string given a \w\W boundary works just as well.
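
For input where digits can sit next to letters or underscores, the two kinds of boundary stop being interchangeable, since \w covers letters, digits and underscore. Here is a sketch of both, with invented sub names; the digit-boundary version uses lookarounds, which is one possible way to write it and not something posted earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helpers; each takes the text, then the numbers to keep.

# \b is a \w\W boundary, so digits glued to letters are not seen as numbers.
sub mask_word_boundary {
    my ($text, @keep) = @_;
    my $alt = join '|', @keep;
    $text =~ s/ \b (?! (?: $alt ) \b ) \d+ /XX/gx;
    return $text;
}

# A \d\D boundary: not preceded by a digit, and a listed number only
# counts if no further digit follows it.
sub mask_digit_boundary {
    my ($text, @keep) = @_;
    my $alt = join '|', @keep;
    $text =~ s/ (?<! \d ) (?! (?: $alt ) (?! \d ) ) \d+ /XX/gx;
    return $text;
}

print mask_word_boundary('id44,x10y,10', 10), "\n";     # prints id44,x10y,10
print mask_digit_boundary('id44,x10y,10', 10), "\n";    # prints idXX,x10y,10
```

Which behaviour is right depends on what counts as a number in the data; for a string made only of digits, commas and dots, the two agree.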

Thanks

Ben

--
"If a book is worth reading when you are six, *
it is worth reading when you are sixty." [C.S.Lewis]
 