Velocity Reviews - Computer Hardware Reviews

Re: more XPath struggles (tDOM)

 
 
Joseph J. Kesselman
      05-02-2008
Mikhail Teterin wrote:
> $xml selectNodes {//mp:date}
>
> does not find any, but
>
> $xml selectNodes {//*[name()="mp:date"]}
>
> works. What's the reason for the difference?


XPath is namespace-aware. To select a namespaced node, you must use a
prefix in your path (which you did) and tell your XPath evaluator what
the prefix is bound to (which you didn't). Look at the user's manual for
your tool.
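
To illustrate the point outside tDOM, here is a minimal sketch using Python's standard xml.etree.ElementTree (an assumption on my part; the thread itself uses Tcl). The sample document and the helper `find_dates` are hypothetical; only the prefix `mp` and the namespace name `MarketParameters` come from the thread. A prefixed path is rejected until the caller says what the prefix means:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the prefix and namespace name are from the thread.
DOC = """<mp:root xmlns:mp="MarketParameters">
  <mp:date>2008-05-02</mp:date>
</mp:root>"""

root = ET.fromstring(DOC)


def find_dates(prefix_map=None):
    """Return the text of every mp:date element, or None if 'mp' is unbound."""
    try:
        return [e.text for e in root.findall(".//mp:date", prefix_map)]
    except SyntaxError:
        # ElementTree refuses a prefixed name when no binding for it is supplied.
        return None


unbound = find_dates()                          # no prefix map given
bound = find_dates({"mp": "MarketParameters"})  # prefix bound by the caller
print(unbound, bound)
```

Other evaluators react differently to an unbound prefix (an error, or simply no match), which is why the manual for your own tool is the thing to check.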

(comp.lang.xml doesn't exist on my server, so I can't crosspost there.)
 
Donal K. Fellows
      05-02-2008
Mikhail Teterin wrote:
> But the namespaces are defined in the document itself
> -- for example:
> <xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
> why do I still need to specify them? It certainly works with xml_grep... Is
> there a bug in the package (tDOM), or is the above element not sufficient
> to define a namespace?


Basically, namespaces and XPath don't sit too well together (when the
XPath expression is located in an XML document) because the document
namespace context at the point in the document where the XPath is
located is shrouded from the expression (because it is formally an
xsd:string, which is not namespace-aware according to the namespaces
spec). This means that if you're embedding an XPath expression in an
XML document, you also need to embed a way to tell it which
namespace context to evaluate in. (This is stupid, but it's the way it
is, and it isn't a Tcl problem at all.)

If the XPath expression is not contained in some XML context, then it
is even more obvious that the namespace context needs to be given.

Donal.
 
Joseph J. Kesselman
      05-02-2008
Mikhail Teterin wrote:
> I got some progress... But the namespaces are defined in the document itself
> -- for example:
> <xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
> why do I still need to specify them?


Because they could be set differently at different places in the
document, and/or whatever generated your XPath might have different
prefixes bound to those namespaces or vice versa. You need to provide a
context so the system knows what you meant.

Some processors will let you specify a context node and will pick up the
namespaces defined there. Again, check your docs.

> It certainly works with xml_grep


I don't know xml_grep, so I can't advise. It may be assuming the root
node as the context if not told otherwise. Or it may be flat-out broken
and not processing namespaces correctly.

Say what you mean. The system can't read your mind, and shouldn't try.
 
Joseph J. Kesselman
      05-02-2008
> Basically, namespaces and XPath don't sit too well together

It works fine when you understand how to use it properly.

The only real problem is that XPath relies on prefixes retrieved from
some unspecified environment (depending on the context/tool in which the
XPath is being executed). That's a bit less verbose than using an
"expanded qualified name" like {http://my_namespace}foo, or requiring
that the namespace bindings be specified via some syntax in the XPath
string. But it does mean that an XPath is partly defined by that
context. (Then again, XPaths which use variables also need a context, as
do those which use some of the functions, so this is just the most
obvious -- and most unnecessary -- instance thereof.)

It is possible to write a portable namespace-aware XPath that doesn't
rely on prefixes (via some ugly predicate hacks)... but it really should
be easier to do so. Oh well. 20/20 hindsight; maybe XPath 3.0 will
finally reconsider that point.
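
As a rough illustration of the prefix-free forms, here is a sketch in Python's standard xml.etree.ElementTree, which happens to accept the expanded-name notation directly in its path language (this is ElementTree's behaviour, not something XPath 1.0 itself offers). The sample document is hypothetical and the namespace name is the one from the thread. The second query filters on the expanded name by hand, which is the spirit of the predicate approach:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with one namespaced and one plain <date>.
DOC = """<mp:root xmlns:mp="MarketParameters">
  <mp:date>2008-05-02</mp:date>
  <date>not namespaced</date>
</mp:root>"""

root = ET.fromstring(DOC)

# "{namespace}local-name" names the element directly; no prefix binding is needed.
expanded = [e.text for e in root.findall(".//{MarketParameters}date")]

# Prefix-free alternative: walk the tree and test the expanded name explicitly.
filtered = [e.text for e in root.iter() if e.tag == "{MarketParameters}date"]

print(expanded, filtered)
```

Neither query depends on which prefix the document author picked, so both keep working if the prefix changes.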

By the way: The namespaces shown in the original example are not
considered acceptable by today's standards. Namespace names should be
fully-qualified ("absolute") URI References. Yes, the original namespace
spec was fuzzy about that, and many tools won't enforce this... but
after much painful debate, the W3C agreed that the concept of a
"relative namespace" really didn't make any sense no matter how you
sliced it. Tim Berners-Lee reserves the right to reintroduce that idea
if and when the Semantic Web effort comes up with a way to make those
meaningful... but until then, you really should make sure all your
namespace names follow the official absolute-URI-reference syntax.
 
Richard Tobin
      05-03-2008
Mikhail Teterin wrote:

>I don't want it to read my mind, I want it to read the document. The
>namespaces are set there with an xmlns-attribute of containing elements. In
>fact, when I ask for the node's name [$node nodeName], I get the
>fully-qualified foo.bar.woof.meow.
>
>It KNOWS the namespace-mapping, but it wants me to repeat it (f means foo, b
>means bar, w means woof, etc.). That's gratuitous...


Suppose you try to use the same XPath expressions with a document
that uses different prefixes. How's that going to work? Your XPath
expressions will all be wrong.

The choice of prefixes is supposed to be arbitrary. You can't rely on
f meaning foo. Even within a single document, you can use the same
prefix for different namespaces and different prefixes for the
same namespace.
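
A small sketch of that situation, using Python's standard xml.etree.ElementTree rather than tDOM (my choice for illustration). The document and the example.com namespace names are hypothetical: the prefix `f` means two different namespaces in two places, while `f` and `g` both mean the same one.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "f" is bound to foo under <a> and to bar under <b>;
# "g" under <c> is bound to foo as well.
DOC = """<root>
  <a xmlns:f="http://example.com/foo"><f:item/></a>
  <b xmlns:f="http://example.com/bar"><f:item/></b>
  <c xmlns:g="http://example.com/foo"><g:item/></c>
</root>"""

root = ET.fromstring(DOC)

# ElementTree reports expanded names of the form {namespace}local-name,
# so the prefixes written in the document do not appear at all.
item_tags = [child[0].tag for child in root]
print(item_tags)

# Binding our own prefix "x" to the foo namespace matches the items under
# <a> and <c>, whichever prefix the document happened to use there.
foo_items = root.findall(".//x:item", {"x": "http://example.com/foo"})
print(len(foo_items))
```

A query written against the document's `f` prefix alone could not tell the first two items apart; a query written against the namespace can.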

-- Richard

--
:wq
 
Donal K. Fellows
      05-03-2008
Joseph J. Kesselman wrote:
> It is possible to write a portable namespace-aware XPath that doesn't
> rely on prefixes (via some ugly predicate hacks)... but it really should
> be easier to do so. Oh well. 20/20 hindsight; maybe XPath 3.0 will
> finally reconsider that point.


It'd be OK if there was a type "like xs:string, but understands the
current namespace context" but there isn't. (Of course, once you
extract the XPath from its context document you then need to remember
to explicitly get the NS context from somewhere, which is almost
certainly the root of the problem in the message that started this
thread.)

Donal.
 
Rolf Ade
      05-06-2008
Mikhail Teterin wrote:
>Joseph J. Kesselman wrote:
>> XPath is namespace-aware. To select a namespaced node, you must use a
>> prefix in your path (which you did) and tell your XPath evaluator what
>> the prefix is bound to (which you didn't). Look at the user's manual for
>> your tool.

>
>Thanks. After I explicitly set:
>
> $xml selectNodesNamespaces {mp MarketParameters xc XmlCache}


That's one right way to bind prefixes to namespaces. You can
always use the -namespaces option of the selectNodes method, but
setting things up with one selectNodesNamespaces call for the rest of
the lifetime of the document seems more convenient to me.

>I got some progress... But the namespaces are defined in the document itself
>-- for example:
>
> <xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
>
>why do I still need to specify them? It certainly works with xml_grep... Is
>there a bug in the package (tDOM), or is the above element not sufficient
>to define a namespace?


No, it's not a bug. As long as neither a selectNodesNamespaces setting
nor the -namespaces option is given, tDOM does respect the XML namespace
declarations of the document. The context node of your XPath
expression is the node on which you call selectNodes. If all the
prefixes you use in your XPath expression are in scope at that node,
you don't have to do anything; namespace resolution will work as
you expect. Since you had trouble with this, I'd bet that not all of the
XML namespace declarations you use are in scope at your context node.

But, as others have already pointed out, it is _dangerous_ to bank on
the prefixes in the document. Prefixes don't matter; it's the
namespaces that matter.

From the XML viewpoint,

<a:doc xmlns:a="http://foo.bar.com">
<a:elem>data</a:elem>
</a:doc>

and

<b:doc xmlns:b="http://foo.bar.com">
<b:elem>data</b:elem>
</b:doc>

are, in some sense, the 'same' document.

You can't just say [$someNode selectNodes a:elem] in your code and
expect that to work reliably. If the document provider uses another
prefix (bound to the same namespace), your code will fail.

The clean way out is to tell the XPath engine which namespace you
mean by which prefix, e.g. with selectNodesNamespaces.
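
The same idea, sketched with Python's standard xml.etree.ElementTree instead of tDOM (an assumption made purely for illustration). The two documents are the a:/b: pair discussed in this post; the prefix `p` and the helper `elem_texts` are hypothetical names of my own:

```python
import xml.etree.ElementTree as ET

DOC_A = """<a:doc xmlns:a="http://foo.bar.com">
<a:elem>data</a:elem>
</a:doc>"""

DOC_B = """<b:doc xmlns:b="http://foo.bar.com">
<b:elem>data</b:elem>
</b:doc>"""

# Our own binding; "p" is arbitrary and appears in neither document.
NS = {"p": "http://foo.bar.com"}


def elem_texts(xml_text):
    """Return the text of each 'elem' child in the http://foo.bar.com namespace."""
    root = ET.fromstring(xml_text)
    return [e.text for e in root.findall("p:elem", NS)]


results = [elem_texts(DOC_A), elem_texts(DOC_B)]
print(results)
```

Because the query names the namespace through its own prefix map, it treats both documents alike, whatever prefix each provider chose.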

rolf
 