Regexp riddle; escaping escapes

 
 
Phlip
 
      08-17-2007
Rubies:

Someone didn't escape their & in their HTML correctly. Let's fix it.

This regexp correctly does not escape &dude, because we only want to escape
raw & markers:

p "yo &dude".gsub(/&([^a-z])/i, '&amp;\1')

That passed "yo &dude" thru unchanged. (I am aware "dude" has no ; on the
end; we are leaving that optional, for whatever reason...)

Now escape & followed by a non-alphabetic character:

p "yo & dude".gsub(/&([^a-z])/i, '&amp;\1')

That correctly provides: "yo &amp; dude"

Now how to escape "yo && dude"? Note that the ([^a-z]) consumes the second
&, leading to this incorrect output:

"yo &amp;& dude"
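
A minimal sketch of why that happens, using only the pattern above. The variable names are illustrative, not part of the original code. It inspects the single match the pattern finds in "yo && dude": the capture group swallows the second &, so the scan resumes after it and never sees it as the start of a new match.

```ruby
text    = "yo && dude"
pattern = /&([^a-z])/i

# Look at the first (and only) match in detail.
m = pattern.match(text)
whole_match = m[0]       # the full matched text, both ampersands
captured    = m[1]       # what the group consumed: the second &
next_start  = m.end(0)   # index where the next search would begin

# scan returns one entry per non-overlapping match.
match_count = text.scan(pattern).length

p whole_match   # => "&&"
p captured      # => "&"
p next_start    # => 5
p match_count   # => 1
```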

The only workaround I can think of is to run the Regexp twice:

x = "yo && dude"
2.times{ x.gsub!(/&([^a-z])/i, '&amp;\1') }
p x

Can someone help my feeb Regexp skills and get a "yo &amp;&amp; dude" in one
line?

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
^ assert_xpath
http://tinyurl.com/23tlu5 <-- assert_raise_message
 
 
 
 
 
Tim Pease
 
      08-17-2007
On 8/17/07, Phlip wrote:
> Rubies:
>
> Someone didn't escape their & in their HTML correctly. Let's fix it.
>
> This regexp correctly does not escape &dude, because we only want to escape
> raw & markers:
>
> p "yo &dude".gsub(/&([^a-z])/i, '&amp;\1')
>
> That passed "yo &dude" thru unchanged. (I am aware "dude" has no ; on the
> end; we are leaving that optional, for whatever reason...)
>
> Now escape & followed by a non-alphabetic character:
>
> p "yo & dude".gsub(/&([^a-z])/i, '&amp;\1')
>
> That correctly provides: "yo &amp; dude"
>
> Now how to escape "yo && dude"? Note that the ([^a-z]) consumes the second
> &, leading to this incorrect output:
>
> "yo &amp;& dude"
>
> The only workaround I can think of is to run the Regexp twice:
>
> x = "yo && dude"
> 2.times{ x.gsub!(/&([^a-z])/i, '&amp;\1') }
> p x
>
> Can someone help my feeb Regexp skills and get a "yo &amp;&amp; dude" in one
> line?
>


str = "yo && dude"
str.gsub!( %r/&(?=[^a-z])/i, '&amp;')
p str
=> "yo &amp;&amp; dude"


The regular expression trick here is the (?=re) construct, called a
"zero-width positive lookahead". It requires re to match at that point,
but it does not consume any of the string, so the gsub! replaces only
the characters matched outside (?=re).
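
A short sketch of the lookahead in action, assuming the same inputs used earlier in the thread. The method names are hypothetical, and the negative-lookahead variant (?![a-z]) is an addition not discussed in the thread: it differs only in that it also matches an & at the very end of the string, where the positive lookahead finds no following character to test.

```ruby
# Pattern from the thread: & followed by a non-letter, which is not consumed.
POSITIVE_LOOKAHEAD = /&(?=[^a-z])/i

# Assumption, not from the thread: & NOT followed by a letter.
# Unlike the pattern above, this also matches a trailing &.
NEGATIVE_LOOKAHEAD = /&(?![a-z])/i

def escape_lookahead(text)
  text.gsub(POSITIVE_LOOKAHEAD, '&amp;')
end

def escape_negative(text)
  text.gsub(NEGATIVE_LOOKAHEAD, '&amp;')
end

p escape_lookahead("yo && dude")  # => "yo &amp;&amp; dude"
p escape_lookahead("yo &dude")    # => "yo &dude"
p escape_lookahead("dude &")      # => "dude &"
p escape_negative("dude &")       # => "dude &amp;"
```

Note that both patterns would also escape the & in a numeric entity such as &#38;, since # is not a letter.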

Blessings,
TwP

 
 
 
 
 
Phlip
 
      08-17-2007
Tim Pease wrote:

> str.gsub!( %r/&(?=[^a-z])/i, '&amp;')


Thanks!

> "zero-width positive lookahead"


Man, that was right there, but I was blocking on it. (-;

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
^ assert_xpath
http://tinyurl.com/yrc77g <-- assert_latest Model

 
 
Tim Pease
 
      08-17-2007
On 8/17/07, Phlip wrote:
> Tim Pease wrote:
>
> > str.gsub!( %r/&(?=[^a-z])/i, '&amp;')
>
> Thanks!
>
> > "zero-width positive lookahead"
>
> Man, that was right there, but I was blocking on it. (-;
>


I had to pull my Pickaxe off the shelf and look it up, too. It's on page 327
of the second edition, if you're interested in reading about it. It's
in the first edition, too, which is available online.

Blessings,
TwP

 
 
 
 