Velocity Reviews - Computer Hardware Reviews

Setting backreference inside of a string

 
 
Jason C
Guest
Posts: n/a
 
      09-10-2012
I'm doing a replace, like this:

$text = "Yes dear!";
$pattern = "(D|d)ear";
$replace = "$1eer";

$text =~ s/$pattern/$replace/gi;

That's just an example, of course; the real $pattern and $replace come from a database list, and $text comes from form data.

The problem I'm having is that the replace is replacing with a literal "$1eer", instead of setting the $1 to (D|d). Meaning, instead of printing:

Yes deer!

I'm printing:

Yes $1eer!

Any suggestions on how to make $1 in $replace refer to the first group in $pattern?
 
 
 
 
 
Peter Makholm
Guest
Posts: n/a
 
      09-10-2012
Jason C <> writes:

> $text = "Yes dear!"; $pattern = "(D|d)ear"; $replace = "$1eer";
>
> $text =~ s/$pattern/$replace/gi;


Using this code I get "Yes eer!" in $text...
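A minimal sketch of why the posted code and the live code can behave differently, assuming the database hands back the replacement as plain characters (which the single-quoted string imitates here). The names $interpolated and $literal are made up for the illustration:

```perl
use strict;
use warnings;

my $interpolated;
{
    no warnings 'uninitialized';   # $1 is still undef here; that is the point
    $interpolated = "$1eer";       # double quotes: $1 is expanded right now
}
my $literal = '$1eer';             # single quotes: the characters $ 1 e e r

(my $t1 = "Yes dear!") =~ s/(D|d)ear/$interpolated/gi;
(my $t2 = "Yes dear!") =~ s/(D|d)ear/$literal/gi;

print "$t1\n";   # Yes eer!
print "$t2\n";   # Yes $1eer!
```

In both cases the replacement side of s/// interpolates the variable only once, so a $1 that arrives inside the variable's value is never looked at again.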

> Any suggestions on how to make $1 in $replace refer to the first group
> in $pattern?


You need to look at the /e modifier to your substitution.

//Makholm
 
 
 
 
 
Jason C
Guest
Posts: n/a
 
      09-10-2012
On Monday, September 10, 2012 3:57:47 AM UTC-4, Peter Makholm wrote:
> > $text = "Yes dear!"; $pattern = "(D|d)ear"; $replace = "$1eer";
> > $text =~ s/$pattern/$replace/gi;
>
> Using this code I get "Yes eer!" in $text...


Could be a minor variation in what I posted vs. my actual code. I didn't post the whole thing because I thought it was unnecessarily complicated, but it's technically:

my $sth = $dbh->prepare("SELECT * FROM table");
$sth->execute();

while (($pattern, $replace) = $sth->fetchrow_array()) {
$text =~ s/(\b*)$pattern(er|in|ing|s|ed|y|\b)/$1$replace$+/gi;
}
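
One thing worth checking in a loop shaped like this: capture groups are numbered by their opening parenthesis across the whole assembled regex, so a group inside $pattern is no longer group 1 once another group is placed in front of it. A small sketch, assuming a plain (\b) in place of the leading group and a made-up input string:

```perl
use strict;
use warnings;

my $text    = "Yes dears!";
my $pattern = '(D|d)ear';

# Same shape as the loop above: one group before $pattern, one after it.
my @groups = $text =~ /(\b)$pattern(er|in|ing|s|ed|y|\b)/i;

printf "group %d = [%s]\n", $_ + 1, $groups[$_] for 0 .. $#groups;
# group 1 = []
# group 2 = [d]
# group 3 = [s]
```

So a '$1' stored alongside '(D|d)ear' would need to mean group 2 here, even once the interpolation problem is solved.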


> > Any suggestions on how to make $1 in $replace refer to the first group
> > in $pattern?
>
> You need to look at the /e modifier to your substitution.


Thanks for the tip. I've read a bit on the 'e' modifier now, but I'm not quite understanding how to use it for this application.

In retrospect, what I think is happening is that the while() loop is treating $replace as if it is in a single quote instead of double. So instead of it reading like:

$pattern = "(D|d)ear";
$replace = "$1eer";

it's reading:

$pattern = '(D|d)ear';
$replace = '$1eer';

So the question may really be, how do I get the $1 inside $replace interpolated?
 
 
Wolf Behrenhoff
Guest
Posts: n/a
 
      09-10-2012
Am 10.09.2012 11:36, schrieb Jason C:
>>> Any suggestions on how to make $1 in $replace refer to the first group
>>> in $pattern?
>>
>> You need to look at the /e modifier to your substitution.
>
> Thanks for the tip. I've read a bit on the 'e' modifier now, but I'm not quite understanding how to use it for this application.


For example like this:

$ perl -E '$r=q("${1}eer");($_="hello")=~s/(ll)/$r/ee; say'
helleero

- Wolf

 
 
Peter Makholm
Guest
Posts: n/a
 
      09-11-2012
Wolf Behrenhoff <> writes:

> For example like this:
>
> $ perl -E '$r=q("${1}eer");($_="hello")=~s/(ll)/$r/ee; say'
> helleero


So, after matching 'll' and assigning it to $1, the match is replaced by

eval( eval '$r' )

Start by computing the inner eval we get

eval ( '"$1eer"')

Remembering that $1 was "ll" this evaluates to

"lleer"
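
The two evaluation steps can be traced by hand. This is only a sketch of the idea, with $first and $second as made-up names for the intermediate values:

```perl
use strict;
use warnings;

my $r = q("${1}eer");          # the characters "${1}eer", quotes included

my ($first, $second);
if ("hello" =~ /(ll)/) {
    $first  = $r;              # first e: the expression $r yields that string
    $second = eval $first;     # second e: the string is run as Perl code
}
print "$first\n";              # "${1}eer"
print "$second\n";             # lleer

# The same thing done by the substitution itself:
(my $s = "hello") =~ s/(ll)/$r/ee;
print "$s\n";                  # helleero
```

Note that the second step runs whatever code the string contains, which is why /ee is risky when the replacement text comes from a database or any other source that is not fully trusted.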

//Makholm

 
 
C.DeRykus
Guest
Posts: n/a
 
      09-11-2012
On Monday, September 10, 2012 2:36:22 AM UTC-7, Jason C wrote:
> On Monday, September 10, 2012 3:57:47 AM UTC-4, Peter Makholm wrote:
> > > $text = "Yes dear!"; $pattern = "(D|d)ear"; $replace = "$1eer";
> > > $text =~ s/$pattern/$replace/gi;
> >
> > Using this code I get "Yes eer!" in $text...
>
> Could be a minor variation in what I posted vs. my actual code. I didn't post the whole thing because I thought it was unnecessarily complicated, but it's technically:
>
> my $sth = $dbh->prepare("SELECT * FROM table");
> $sth->execute();
>
> while (($pattern, $replace) = $sth->fetchrow_array()) {
> $text =~ s/(\b*)$pattern(er|in|ing|s|ed|y|\b)/$1$replace$+/gi;
> }
>
> > > Any suggestions on how to make $1 in $replace refer to the first group
> > > in $pattern?
> >
> > You need to look at the /e modifier to your substitution.
>
> Thanks for the tip. I've read a bit on the 'e' modifier now, but I'm not quite understanding how to use it for this application.
>
> In retrospect, what I think is happening is that the while() loop is treating $replace as if it is in a single quote instead of double. So instead of it reading like:
>
> $pattern = "(D|d)ear";
> $replace = "$1eer";
>
> it's reading:
>
> $pattern = '(D|d)ear';
> $replace = '$1eer';
>
> So the question may really be, how do I get the $1 inside $replace interpolated?


One way to avoid an 'ee' solution's drawbacks
is just pull the backref out of the pattern:

my $pattern = '(D|d)ear';
my $replace = 'eer';

$text =~ s/$pattern/$1$replace/gi;

--
Charles DeRykus


 
 
Jason C
Guest
Posts: n/a
 
      09-12-2012
On Tuesday, September 11, 2012 4:50:20 PM UTC-4, C.DeRykus wrote:

> One way to avoid an 'ee' solution's drawbacks
> is just pull the backref out of the pattern:
>
> my $pattern = '(D|d)ear';
> my $replace = 'eer';
>
> $text =~ s/$pattern/$1$replace/gi;


That was my original thought, too, but I also have rows where the () isn't at the beginning. Eg:

$pattern = 'smart(\s)*ass';
$replace = 'smart$1butt';

I really would like to avoid using /ee, though, for the security reasons mentioned earlier.

Maybe something like:

$text = "Yes dear!";
$pattern = '(D|d)ear';
$replace = '$1eer';

# if $pattern doesn't contain a backreference
# create an empty one
if ($pattern !~ /\(.*?\)/g) {
$pattern = "()*?" . $pattern;
}

$replace =~ s/\$1/<marker>/g;
# now, $replace = '<marker>eer';

while ($text =~ /$pattern/g) {
$replace =~ s/<marker>/$1/g;
$text =~ s/$pattern/$replace/gi;
}


I haven't tested that, I'm just spit-balling the logic. Thoughts?
 
 
C.DeRykus
Guest
Posts: n/a
 
      09-12-2012
On Tuesday, September 11, 2012 7:55:15 PM UTC-7, Jason C wrote:
> On Tuesday, September 11, 2012 4:50:20 PM UTC-4, C.DeRykus wrote:
>
> > One way to avoid an 'ee' solution's drawbacks
> > is just pull the backref out of the pattern:
> >
> > my $pattern = '(D|d)ear';
> > my $replace = 'eer';
> >
> > $text =~ s/$pattern/$1$replace/gi;
>
> That was my original thought, too, but I also have rows where the () isn't at the beginning. Eg:
>
> $pattern = 'smart(\s)*ass';
> $replace = 'smart$1butt';
>
> I really would like to avoid using /ee, though, for the security reasons mentioned earlier.
>
> Maybe something like:
>
> $text = "Yes dear!";
> $pattern = '(D|d)ear';
> $replace = '$1eer';
>
> # if $pattern doesn't contain a backreference
> # create an empty one
> if ($pattern !~ /\(.*?\)/g) {
> $pattern = "()*?" . $pattern;
> }
>
> $replace =~ s/\$1/<marker>/g;
> # now, $replace = '<marker>eer';
>
> while ($text =~ /$pattern/g) {
> $replace =~ s/<marker>/$1/g;
> $text =~ s/$pattern/$replace/gi;
> }
>
> I haven't tested that, I'm just spit-balling the logic. Thoughts?


I'm not sure I follow entirely, but IMO separate regexes would be much easier and more maintainable than trying to do this in a single regex.

I would only bother trying to refactor if there were a huge bottleneck.

--
Charles DeRykus
 
 
Jason C
Guest
Posts: n/a
 
      09-12-2012
On Wednesday, September 12, 2012 12:08:32 AM UTC-4, C.DeRykus wrote:
> I'm not sure I follow entirely, but IMO separate regexes would be much easier and more maintainable than trying to do this in a single regex.
>
> I would only bother trying to refactor if there were a huge bottleneck.


You might have missed it before, but on the live site, $pattern and $replace are coming from a database. Like so:

my $sth = $dbh->prepare("SELECT * FROM table");
$sth->execute();

while (($pattern, $replace) = $sth->fetchrow_array()) {
$text =~ s/(\b*)$pattern(er|in|ing|s|ed|y|\b)/$1$replace$+/gi;
}

The first group in $pattern can actually be anywhere in the string, so one row might be:

(D|d)ear

while the next might be:

smart(\s*)ass

The issue is that the '$1' stored in the database is a literal string that never gets interpolated, and I'm not sure how to get it interpolated in the replacement.

The while() loop that I presented in the last post is an attempt to replace the literal '$1' with '<marker>', then replace '<marker>' with the interpolated "$1".
 
 
Willem
Guest
Posts: n/a
 
      09-12-2012
Jason C wrote:
) On Tuesday, September 11, 2012 4:50:20 PM UTC-4, C.DeRykus wrote:
)
)> One way to avoid an 'ee' solution's drawbacks
)> is just pull the backref out of the pattern:
)>
)> my $pattern = '(D|d)ear';
)> my $replace = 'eer';
)>
)> $text =~ s/$pattern/$1$replace/gi;
)
) That was my original thought, too, but I also have rows where the () isn't at the beginning. Eg:
)
) $pattern = 'smart(\s)*ass';
) $replace = 'smart$1butt';
)
) I really would like to avoid using /ee, though, for the security reasons mentioned earlier.
)
) Maybe something like:
)
) $text = "Yes dear!";
) $pattern = '(D|d)ear';
) $replace = '$1eer';
)
) # if $pattern doesn't contain a backreference
) # create an empty one
) if ($pattern !~ /\(.*?\)/g) {
) $pattern = "()*?" . $pattern;
) }
)
) $replace =~ s/\$1/<marker>/g;
) # now, $replace = '<marker>eer';
)
) while ($text =~ /$pattern/g) {
) $replace =~ s/<marker>/$1/g;
) $text =~ s/$pattern/$replace/gi;
) }

It would be easier to do the whole thing in a /e expression: not by evaluating the database string as code, but by adding your own code around it.

Like this:

$text =~ s/$pattern/my $s1 = $1; (my $t = $replace) =~ s|\$1|$s1|g; $t/ge;

That should work.
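
A runnable sketch of that substitution, assuming the two sample rows mentioned earlier in the thread stand in for the database result set (the @rows list and the input sentence are invented for the example):

```perl
use strict;
use warnings;

# Stand-in for the rows fetched from the database: [pattern, replacement].
my @rows = (
    [ '(D|d)ear',      '$1eer'       ],
    [ 'smart(\s*)ass', 'smart$1butt' ],
);

my $text = "Yes Dear, you smart ass!";

for my $row (@rows) {
    my ($pattern, $replace) = @$row;
    # Save $1 first, because the inner s||| performs its own match.
    $text =~ s/$pattern/my $s1 = $1; (my $t = $replace) =~ s|\$1|$s1|g; $t/ge;
}

print "$text\n";   # Yes Deer, you smart butt!
```

The single /e only runs the code written in the program; the stored replacement is treated purely as data, so nothing from the database is ever evaluated.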

If you want more than just $1, you need a slightly more complicated
expression, probably involving @- and @+.

(I've always wondered why there is no regex-match array perlvar...)
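
For more than one group, one possible shape is sketched below. It rebuilds the captures from @- and @+ and fills in $1, $2, ... placeholders by hand. The helper name apply_rule and the sample inputs are hypothetical, and placeholders that name a missing group are simply replaced with an empty string. (Newer Perls, 5.26 onward if I remember right, also provide @{^CAPTURE}, which would shorten the capture-collecting step.)

```perl
use strict;
use warnings;

# Apply one pattern/replacement pair to $text, expanding $1, $2, ...
# in the replacement as plain data (nothing is eval'ed).
sub apply_rule {
    my ($text, $pattern, $replace) = @_;
    my ($out, $last) = ('', 0);

    while ($text =~ /$pattern/g) {
        # Copy offsets and captures now; the inner s/// below resets @- and @+.
        my ($start, $end) = ($-[0], $+[0]);
        my @caps = map {
            defined $-[$_] ? substr($text, $-[$_], $+[$_] - $-[$_]) : ''
        } 1 .. $#+;

        (my $filled = $replace) =~
            s/\$([1-9]\d*)/defined $caps[$1 - 1] ? $caps[$1 - 1] : ''/ge;

        $out .= substr($text, $last, $start - $last) . $filled;
        $last = $end;
    }
    return $out . substr($text, $last);
}

print apply_rule("Yes dear!", '(D|d)ear', '$1eer'), "\n";            # Yes deer!
print apply_rule("10-20 and 30-40", '(\d+)-(\d+)', '$2-$1'), "\n";   # 20-10 and 40-30
```

Flags such as /i are left out here; they could be added to the match in the while condition, or stored with each row as (?i) inside the pattern.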


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 