Escaping single quotes in XPath query with REXML

 
 
Francis Hwang
      10-21-2004
Has anybody tried to use XPath in REXML with a string containing a single
quote, only to find that quote escaping in XPath is apparently not accounted
for? If this were in the context of XSLT, I'd be able to assign some
annoying temp variable like $apos, but it's not, so I can't.

irb(main):001:0> require 'rexml/document'
=> true
irb(main):002:0> include REXML
=> Object
irb(main):003:0> xml = "<rss version='2.0'><channel><item><title>John's Doe</title></item></channel></rss>"
=> "<rss version='2.0'><channel><item><title>John's Doe</title></item></channel></rss>"
irb(main):004:0> xmldoc = Document.new xml
=> <UNDEFINED> ... </>
irb(main):005:0> XPath.first( xmldoc, "/rss/channel/item/title" ).to_s
=> "<title>John's Doe</title>"
irb(main):006:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John's Doe']" ).to_s
NoMethodError: undefined method `node_type' for "John":String
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:124:in `internal_parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `each'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `internal_parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:402:in `Predicate'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:346:in `Predicate'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:204:in `internal_parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in `times'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in `internal_parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:34:in `parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath.rb:28:in `first'
        from (irb):6
irb(main):007:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John\'s Doe']" ).to_s
NoMethodError: undefined method `node_type' for "John":String
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:124:in `internal_parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `each'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `internal_parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:402:in `Predicate'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:346:in `Predicate'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:204:in `internal_parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in `times'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in `internal_parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
        from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:34:in `parse'
        from /usr/local/lib/ruby/1.8/rexml/xpath.rb:28:in `first'
        from (irb):7



 
Brian Candler
      10-21-2004
> irb(main):006:0> XPath.first( xmldoc,
> "/rss/channel/item/title[text()='John's Doe']" ).to_s


I'm no expert in XPath, but that looks like a broken XPath query because of
the three single quotes.

> irb(main):007:0> XPath.first( xmldoc,
> "/rss/channel/item/title[text()='John\'s Doe']" ).to_s


That's identical, as you'll see if you try this:

irb(main):001:0> a="text()='John\'s Doe'"
=> "text()='John's Doe'"

You haven't inserted a backslash into the string; you just escaped the quote,
and Ruby removed the escape when it parsed the literal. You need two
backslashes to insert a single backslash into the string:

irb(main):002:0> a="text()='John\\'s Doe'"
=> "text()='John\\'s Doe'"

(Despite how it looks, there is only a single backslash in there; irb shows it
as two because it prints the value as a double-quoted string literal, to make
it valid Ruby.)

irb(main):003:0> a.each_byte { |c| print c.chr," " }
t e x t ( ) = ' J o h n \ ' s D o e ' => "text()='John\\'s Doe'"
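
Another way to check the same point, as a small sketch: compare the three literals directly rather than reading irb's inspect output. The variable names here are only for illustration.

```ruby
escaped = "text()='John\'s Doe'"   # \' inside double quotes is just a quote
plain   = "text()='John's Doe'"    # no backslash at all
doubled = "text()='John\\'s Doe'"  # \\ is one literal backslash

puts escaped == plain               # true
puts doubled.length - plain.length  # 1
puts doubled.chars.count("\\")      # 1
```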

However, I've just had a quick scan through the XPath-1.0 spec, and I don't
think that's how you do it. You can include single quotes inside a
double-quoted string, and vice versa. But probably what you want for the
general case is XML character entities: &#39; or &apos;

Try passing your string through this before constructing your XPath query:

require 'rexml/text'
a = "John's Doe"
b = REXML::Text::normalize(a)
#=> "John&apos;s Doe"

HTH,

Brian.


 
Brian Candler
      10-21-2004
On Thu, Oct 21, 2004 at 09:28:51AM +0100, Brian Candler wrote:
> Try passing your string through this before constructing your XPath query:
>
> require 'rexml/text'
> a = "John's Doe"
> b = REXML::Text::normalize(a)
> #=> "John&apos;s Doe"


Hmm, that doesn't work.

irb(main):007:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John&apos;s Doe']" ).to_s
=> ""
irb(main):008:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John&#39;s Doe']" ).to_s
=> ""
irb(main):009:0> XPath.first( xmldoc, "/rss/channel/item/title[text()=\"John's Doe\"]" ).to_s
=> "<title>John's Doe</title>"

You might want to raise that with the REXML author. In the meantime, if you
know the string contains only single quotes and no double quotes, then you can
surround it with double quotes in the XPath query, as in the third query above.
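
For strings that might contain both kinds of quote, the usual XPath 1.0 approach is to pick the delimiter the string doesn't use and, failing that, to build the value with the core concat() function. The following is only a sketch: xpath_literal is a hypothetical helper, not part of REXML, and whether a given REXML version evaluates concat() correctly inside a predicate is an assumption worth checking before relying on it.

```ruby
# Hypothetical helper (not part of REXML): turn an arbitrary Ruby string
# into an XPath 1.0 string literal. XPath 1.0 has no escape character,
# so the only choice is which quote delimits the literal.
def xpath_literal(str)
  # No single quote inside: single-quote delimiters are safe.
  return "'#{str}'" unless str.include?("'")
  # Has single quotes but no double quotes: use double-quote delimiters.
  return "\"#{str}\"" unless str.include?('"')
  # Both kinds present: split on ' and glue the pieces back together
  # with a double-quoted "'" between them, using XPath's concat().
  parts = str.split("'", -1).map { |part| "'#{part}'" }
  "concat(#{parts.join(%q{, "'", })})"
end

title = "John's Doe"
query = "/rss/channel/item/title[text()=#{xpath_literal(title)}]"
puts query  # /rss/channel/item/title[text()="John's Doe"]
```

The resulting query string can then be passed to XPath.first as in the third query above; for this title it is the same double-quoted form that already returned a match.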

Regards,

Brian.


 