Velocity Reviews - Computer Hardware Reviews

XPath, XMLHttpRequest and parsing DOM

 
 
Xandor Leahte
      08-08-2011
Hey there,

I'd like to describe a problem I've run into while working with JavaScript
and XPath.

Let r be an XMLHttpRequest object. I make a request to a web page inside
my own domain (so no security problem), and r gives me r.responseText
and r.responseXML. Sometimes I can't use responseXML because the
document's syntax is not valid, so I have to use responseText and create
the DOM document like this:

var doc = new DOMParser().parseFromString(r.responseText, "text/xml");

Then I try to evaluate an XPath expression on doc, like:

doc.evaluate(query, doc, null, 0, null)

where query is a valid XPath expression. Here's the problem: a query
like "//*[@id='foo']" or "//*" works perfectly, but a query like
"/html/body" or "/ol/li/a", or anything else without the wildcard *,
makes the evaluate function return null. I can't understand why the
query fails when I don't use the wildcard (note: the same query works
when I evaluate it in a "document" context, e.g. in the Firebug/JS
console where my page is the "document" object).

I think it's a problem with how the response is parsed, but I don't know
another way to do it; maybe I could use a hidden iframe, but that's not
very elegant. Do you know anything about this problem? Maybe it's
related to DOM parsing or something like that...

Thanks for all your replies, and sorry for my English. I hope you can
forgive me!

Sincerely,
X.
 
Martin Honnen
      08-08-2011
Xandor Leahte wrote:
> Hey there,
>
> I wish to introduce you to a problem that i get working on Javascript
> and XPath.
>
> Be r an XMLHttpRequest object; i want to make a request through a
> webpage inside my domain (so no security problem); with r i can handle
> r.responseText and r.responseXML: sometimes i can't use responseXML
> cause of no valid syntax of the document, so I've to use responseText.
> So, creating the DOM document like this way:
>
> var doc = new DOMParser().parseFromString(r.responseText, "text/xml")


I don't see why parseFromString on responseText would work when
responseXML could not be built.

> Then I can try to evaluate a XPath expression on doc, like:
>
> doc.evaluate(query, doc, null, 0, null)
>
> where query is a valid XPath expression. There's the problem: if I
> make a query like "//*[@id='foo']" or "//*" it works perfectly;
> otherwise if i make a query like "/html/body" or "/ol/li/a" or
> something without wildcard * included, the evaluate function returns
> null. I can't understand why if i dont use the wildcard query doesn't
> work (see: query works if I try to evaluate it in a "document" contest
> like in firebug/js console where my page is the "document" object).


Post a sample of the XML markup you parse with DOMParser. I suspect it
is a namespace problem, i.e. you have
  <html xmlns="http://www.w3.org/1999/xhtml">...</html>
in your responseText. When you parse that with DOMParser, an XML DOM
document is built with all the elements belonging to the XHTML
namespace. In XPath 1.0 an unprefixed name such as html only matches
elements in no namespace, while * matches elements in any namespace,
which is why your wildcard queries work. In that case you need to pass a
namespace resolver to doc.evaluate and use a prefix in the path, e.g.
  doc.evaluate('xhtml:html/xhtml:body', doc, function (prefix) {
    if (prefix === 'xhtml') return 'http://www.w3.org/1999/xhtml';
    else return null;
  }, 0, null);
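
The resolver idea above can be sketched a little more generally. This is only an illustration: makeNSResolver, selectNodes and xhtmlResolver are hypothetical names that do not appear in the thread, and the sketch assumes a DOM Level 3 XPath implementation (doc.evaluate) is available wherever selectNodes is actually called.

```javascript
// Hypothetical helper (not from the thread): builds a namespace resolver
// from a plain prefix -> namespace URI map. Unknown prefixes resolve to null.
function makeNSResolver(map) {
  return function (prefix) {
    return Object.prototype.hasOwnProperty.call(map, prefix) ? map[prefix] : null;
  };
}

// Hypothetical helper: evaluates `query` against `doc` and returns the
// matching nodes as a plain array, using a snapshot result.
function selectNodes(doc, query, resolver) {
  var ORDERED_NODE_SNAPSHOT_TYPE = 7; // value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  var result = doc.evaluate(query, doc, resolver, ORDERED_NODE_SNAPSHOT_TYPE, null);
  var nodes = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}

var xhtmlResolver = makeNSResolver({ xhtml: 'http://www.w3.org/1999/xhtml' });

// In a browser, with `doc` parsed by DOMParser as in the original post:
//   var bodies = selectNodes(doc, '/xhtml:html/xhtml:body', xhtmlResolver);
```

Asking for a snapshot result instead of type 0 (ANY_TYPE) means the caller always gets a list of nodes back, which tends to be easier to work with than an iterator.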



--

Martin Honnen --- MVP Data Platform Development
http://msmvps.com/blogs/martin_honnen/
 
Thomas 'PointedEars' Lahn
      08-08-2011
Martin Honnen wrote:

> Xandor Leahte wrote:
>> Be r an XMLHttpRequest object; i want to make a request through a
>> webpage inside my domain (so no security problem); with r i can handle
>> r.responseText and r.responseXML: sometimes i can't use responseXML
>> cause of no valid syntax of the document, so I've to use responseText.
>> So, creating the DOM document like this way:
>>
>> var doc = new DOMParser().parseFromString(r.responseText, "text/xml")

>
> I don't see why parseFromString on responseText would work when
> responseXML could not be built.


ACK

>> Then I can try to evaluate a XPath expression on doc, like:
>>
>> doc.evaluate(query, doc, null, 0, null)
>>
>> where query is a valid XPath expression. There's the problem: if I
>> make a query like "//*[@id='foo']" or "//*" it works perfectly;
>> otherwise if i make a query like "/html/body" or "/ol/li/a" or
>> something without wildcard * included, the evaluate function returns
>> null. I can't understand why if i dont use the wildcard query doesn't
>> work (see: query works if I try to evaluate it in a "document" contest
>> like in firebug/js console where my page is the "document" object).

>
> Post a sample of the XML markup you parse with DOMParser. I suspect it
> is a namespace problem i.e. you have
> <html xmlns="http://www.w3.org/1999/xhtml">...</html>
> in your responseText and then you parse that with DOMParser a XML DOM
> document is built with the elements all belonging to the XHTML
> namespace. In that case with doc.evaluate you need to pass in a
> namespace resolver and use a prefix e.g.
> doc.evaluate('xhtml:html/xhtml:body', doc, function(prefix) { if
> (prefix === 'xhtml') return 'http://www.w3.org/1999/xhtml'; else return
> null; }, 0, null);


FYI: The (experimental) jsx.xpath object makes this easier and the
programming more flexible. For example, the above code can be written as

  jsx.xpath.evaluate('_xhtml:html/_xhtml:body', doc,
    jsx.xpath.createCustomNSResolver({
      _xhtml: 'http://www.w3.org/1999/xhtml'
    }));

(where you might want to alias jsx.xpath or the used methods, or
jsx._import(jsx.xpath, …) them in order to increase runtime efficiency.)

This should work with implementations of DOM Level 3 XPath and MSXML alike.
The only dependency for xpath.js, which defines that object, is object.js.

<http://PointedEars.de/websvn/filedetails.php?repname=JSX&path=%2Ftrunk%2Fxpath.js>


PointedEars
--
Danny Goodman's books are out of date and teach practices that are
positively harmful for cross-browser scripting.
-- Richard Cornford, cljs, <cife6q$253$1$> (2004)
 
Xandor Leahte
      08-10-2011
On Aug 8, 6:08 pm, Martin Honnen <mahotr...@yahoo.de> wrote:
> I don't see why parseFromString on responseText would work when
> responseXML could not be built.


Hey there! Thanks for the reply! Sometimes responseXML cannot be built
because of the Content-Type returned for the request; right now I'm
trying to make XMLHttpRequest ask for a specific content type
(using .setRequestHeader()).

>
> Post a sample of the XML markup you parse with DOMParser. I suspect it
> is a namespace problem i.e. you have
>   <html xmlns="http://www.w3.org/1999/xhtml">...</html>
> in your responseText and then you parse that with DOMParser a XML DOM
> document is built with the elements all belonging to the XHTML
> namespace. In that case with doc.evaluate you need to pass in a
> namespace resolver and use a prefix e.g.
>   doc.evaluate('xhtml:html/xhtml:body', doc, function(prefix) { if
> (prefix === 'xhtml') return 'http://www.w3.org/1999/xhtml'; else return
> null; }, 0, null);
>


This is a sample of the page that I have to parse: http://pastebin.com/njtdvcLH
I'm working on an information extraction module and I have to process
the page using XPath.
I tried the queries in a shell like Firebug, where the document is the
Document object itself, and there the XPath queries work. A sample of my
code is here: http://pastebin.com/QdXzhDba

Thanks a lot for the reply!

> --
>
> Martin Honnen --- MVP Data Platform Development
> http://msmvps.com/blogs/martin_honnen/


 
Martin Honnen
      08-12-2011
Xandor Leahte wrote:
> On Aug 8, 6:08 pm, Martin Honnen <mahotr...@yahoo.de> wrote:
>> I don't see why parseFromString on responseText would work when
>> responseXML could not be built.

>
> Hey there! Thanks for reply! Sometimes responseXML cannot be build
> cause of content/type of request; im handling right now to force
> XMLHttpRequest to ask a defined content/type
> (using .setRequestHeader()).


In Firefox/Mozilla you can handle that case by calling
overrideMimeType("application/xml") on the request before send():
https://developer.mozilla.org/en/xml...9_Non-standard
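
A minimal sketch of that suggestion, under some assumptions: requestXml is a hypothetical helper name, the optional third parameter exists only so that the function can be exercised without a browser, and the feature test is there because not every XMLHttpRequest implementation of the time offered overrideMimeType.

```javascript
// Hypothetical helper (not from the thread): requests `url` and asks the
// browser to treat the response as XML regardless of the Content-Type the
// server sends. `XhrCtor` is an assumption added for testing; in a browser
// it can be omitted and the global XMLHttpRequest is used.
function requestXml(url, onLoad, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var r = new Ctor();
  r.open('GET', url, true);

  // Must be called before send(); skipped where the method is unavailable.
  if (typeof r.overrideMimeType === 'function') {
    r.overrideMimeType('application/xml');
  }

  r.onreadystatechange = function () {
    if (r.readyState === 4) {
      // responseXML may still be null if the markup is not well-formed XML.
      onLoad(r.responseXML, r);
    }
  };
  r.send(null);
  return r;
}
```

If responseXML is still empty after this, the markup itself is probably not well-formed, and parsing responseText as "text/xml" would be expected to fail for the same reason.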


> This is a sample of the page that i've to parse: http://pastebin.com/njtdvcLH
> Im just working on a information extraction module and i've to handle
> the page using XPath.
> I just tried it on a shell like Firebug when the document is the
> Document object itself and XPath queries work. A sample of my code is
> here: http://pastebin.com/QdXzhDba


Well, if you have an XML DOM document and want to run XPath against XHTML
where the elements are in the XHTML default namespace, then
createNSResolver(doc.documentElement) does not help: XPath 1.0 has no
notion of a default namespace, so an unprefixed name in a path never
matches those elements. You will need to implement your own namespace
resolver, which is as easy as using a function expression:

  function (prefix) {
    if (prefix === 'x') {
      return 'http://www.w3.org/1999/xhtml';
    }
    else {
      return null;
    }
  }

Then you have to use the chosen prefix (e.g. 'x') in your path
expressions (as in /x:html/x:body//x:a/@href).
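
If the existing queries were written without prefixes, adding the prefix by hand can get tedious. The following is a rough sketch only: prefixSteps is a hypothetical helper that is not part of any library mentioned here, and it deliberately handles just simple location paths. Predicates that themselves contain a slash, explicit axes such as child::a, and names nested inside predicates are left untouched.

```javascript
// Hypothetical helper: adds `prefix:` to every step of a simple XPath
// location path that is a bare element name, optionally followed by a
// predicate. Attributes (@href), wildcards (*), node tests such as text(),
// and steps that already carry a prefix or an axis are left as they are.
function prefixSteps(path, prefix) {
  return path.split('/').map(function (step) {
    var isBareElementName = /^[A-Za-z_][\w.-]*(\[.*\])?$/.test(step);
    return isBareElementName ? prefix + ':' + step : step;
  }).join('/');
}

// Resolver matching the prefix used above, as in the function expression
// from the post.
function xResolver(prefix) {
  return prefix === 'x' ? 'http://www.w3.org/1999/xhtml' : null;
}

// In a browser, with `doc` parsed by DOMParser:
//   doc.evaluate(prefixSteps('/html/body//a/@href', 'x'), doc, xResolver, 0, null);
```

For example, prefixSteps('/html/body//a/@href', 'x') yields '/x:html/x:body//x:a/@href', which is the form of the expression given above.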


--

Martin Honnen --- MVP Data Platform Development
http://msmvps.com/blogs/martin_honnen/
 