Velocity Reviews - Computer Hardware Reviews

Returning "nearest in document" matches using XPath

 
 
Nick Leverton
      12-05-2008
I have an application which attempts to describe a tree of TCP subnets,
which in essence are not fully accessible from each other. I have a
description of the network in XML as shown in the excerpt below.

The application is actually trying to optimise delivery of large files
to multiple destinations over expensive links, so it's not just a matter
of opening up firewalls and adding a bit of NATting. To avoid duplicated
transfers I need to know what is the nearest machine which leads onto
the ultimate destination for the file I am currently handling.

So for instance files destined for units 26 and 27 are first delivered
to node V9990 which then delivers them to V9991, to which 26 and 27
are directly attached. The distinction between nodes and units isn't
important for this part of the task. The ID attribute defines the
ultimate destination which I am trying to reach and each ID is unique,
so there is only one "nearest" IP address corresponding to each ID.

<?xml version="1.0"?>
<nodes>
<node id="V9990" ip="1.1.1.1">
<unit id="23" ip="10.10.10.10"/>
<unit id="24" ip="10.10.10.11"/>
<node id="V9991" ip="10.10.10.12">
<unit id="26" ip="192.168.0.1"/>
<unit id="27" ip="192.168.0.2"/>
</node>
</node>
<node id="V9992" ip="2.2.2.2">
<node id="V9993" ip="10.10.10.10">
<unit id="21"/>
<unit id="22"/>
</node>
</node>
</nodes>

To simplify network maintenance I would like to use the same config file
on all the "nodes", and to modify the XPath query with extra terms on
the sub-nodes. In other words, on the "root" machine a query for id=26
will return ip=1.1.1.1, but on node V9990 a query for id=26 will return
ip=10.10.10.12.

In summary, what I want to do is to retrieve the ip attribute of the
element nearest the root of the document which has the element with a
given id attribute as a descendant (or is that element itself). I am
currently using the following XPath:

Querying from the root:
descendant-or-self::*[@ip and descendant-or-self::*[@id="26"]][last()]/@ip

I used descendant-or-self as the first term here rather than //*
because I don't want XPath to descend the doc and return all matches,
only the node which matches nearest the root of the XML document.

Querying from a sub-node:
//*[@id="V9990"]/*[@ip and descendant-or-self::*[@id="26"]][last()]/@ip

Here I establish a context node first and then work on that with
predicates.
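
The intent behind both queries can be sketched outside XPath as a plain tree walk. This is only an illustration using Python's standard xml.etree.ElementTree, not the Perl XML::XPath code used in the application; the helper names path_to_id and nearest_ip are hypothetical, and the XML is the excerpt above with the XML declaration left out.

```python
import xml.etree.ElementTree as ET

# The network excerpt from the post (declaration omitted, indentation added).
NETWORK_XML = """<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>"""


def path_to_id(start, target_id):
    """Return the elements from `start` down to the one whose id is target_id, or None."""
    if start.get("id") == target_id:
        return [start]
    for child in start:
        tail = path_to_id(child, target_id)
        if tail is not None:
            return [start] + tail
    return None


def nearest_ip(start, target_id):
    """Return the first ip found strictly below `start` on the way to target_id."""
    path = path_to_id(start, target_id)
    if path is None:
        return None
    for element in path[1:]:  # skip the starting element itself
        if element.get("ip") is not None:
            return element.get("ip")
    return None


root = ET.fromstring(NETWORK_XML)
v9990 = root.find(".//*[@id='V9990']")

print(nearest_ip(root, "26"))   # view from the "root" machine
print(nearest_ip(v9990, "26"))  # view from node V9990
```

With the excerpt above this prints 1.1.1.1 and then 10.10.10.12, matching the results described for the root machine and for node V9990.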

First question - these two work, but are probably not ideal since I'm
not yet very familiar with XPath. In particular I don't understand
why I need to use the [last()] predicate rather than [1], as I thought the
descendant axis should work downwards in document order, not upwards.
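
On the [1] versus [last()] point: XPath 1.0 defines descendant-or-self as a forward axis, so positions inside a predicate on that step are counted in document order, and [1] would be expected to pick the match nearest the root. The sketch below lists the candidate elements in document order using Python's standard xml.etree.ElementTree (an illustration only, not XML::XPath; the helper name candidates is hypothetical). Whether the [last()] behaviour described above is particular to XML::XPath has not been checked here.

```python
import xml.etree.ElementTree as ET

# The network excerpt from the post (declaration omitted, indentation added).
NETWORK_XML = """<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>"""


def candidates(start, target_id):
    """Elements at or below `start`, in document order, that carry an ip
    attribute and contain (or are) the element with the given id."""
    return [
        element
        for element in start.iter()  # iter() walks in document order
        if element.get("ip") is not None
        and any(d.get("id") == target_id for d in element.iter())
    ]


root = ET.fromstring(NETWORK_XML)
matches = candidates(root, "26")

print([m.get("id") for m in matches])
print(matches[0].get("ip"), matches[-1].get("ip"))
```

Here the first candidate in document order is V9990 (1.1.1.1) and the last is unit 26 itself (192.168.0.1).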

Secondly, I now have a requirement to retrieve all the "nearest" ip
attributes for polling/reporting purposes. In other words, querying
from the root I would want to return 1.1.1.1 and 2.2.2.2. Or querying
from node V9990 I would want to return 10.10.10.10, 10.10.10.11 and
10.10.10.12. I don't mind about getting multiple instances of the same
attribute back as de-duping is simple. But I cannot figure out how to
arrange the predicates so as to return the "topmost" ip attribute only,
either for the root case or for the sub-context case.
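
One way to picture the "topmost ip only" requirement is a walk that stops descending as soon as it meets an element carrying an ip attribute. The sketch below does that with Python's standard xml.etree.ElementTree rather than in XPath; topmost_ips is a hypothetical helper name. The XPath expressions in the comments follow XPath 1.0 semantics but are untested against XML::XPath, so treat them as a possible starting point only.

```python
import xml.etree.ElementTree as ET

# The network excerpt from the post (declaration omitted, indentation added).
NETWORK_XML = """<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>"""

# Possible XPath 1.0 equivalents (assumptions, not verified with XML::XPath):
#   from the root:
#     //*[@ip][not(ancestor::*[@ip])]/@ip
#   below a chosen node such as V9990:
#     //*[@id="V9990"]//*[@ip][not(ancestor::*[@ip][ancestor::*[@id="V9990"]])]/@ip


def topmost_ips(start):
    """Collect the ip of each element strictly below `start` that has no
    ip-bearing element between it and `start`."""
    found = []
    for child in start:
        ip = child.get("ip")
        if ip is not None:
            found.append(ip)  # topmost on this branch, so do not descend further
        else:
            found.extend(topmost_ips(child))
    return found


root = ET.fromstring(NETWORK_XML)
v9990 = root.find(".//*[@id='V9990']")

print(topmost_ips(root))
print(topmost_ips(v9990))
```

For the excerpt this gives 1.1.1.1 and 2.2.2.2 from the root, and 10.10.10.10, 10.10.10.11 and 10.10.10.12 from node V9990.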

Am I bending XPath a step too far here? I was hoping not to have to
introduce an extra processing step, but I am thinking maybe the sub-nodes
need to extract their "local" view of the network and work only
on that. Any advice would be very helpful.

I'm working in Perl with XML::XPath, in case it makes a difference.

Thank you

Nick
--
Serendipity: http://www.leverton.org/blosxom (last update 19th September 200
"The Internet, a sort of ersatz counterfeit of real life"
-- Janet Street-Porter, BBC2, 19th March 1996
 
Dimitre Novatchev
      12-05-2008
What do you mean by "nearest"? Is this the geographical distance between two
nodes? I don't see this reflected in the XML document.

Cheers,
Dimitre Novatchev
 
 
 
 
Nick Leverton
      12-05-2008
In article <4939431f$0$17068$>,
Dimitre Novatchev <> wrote:
>What do you mean by "nearest"? Is this the geographical distance b/n two
>nodes? I dont see this reflected in the XML document.


No, sorry for being unclear. I mean that from the set of ip attributes
on the axis which contains both the root and the required id attribute:

/ ... @ip ... @ip ... @ip ... @id

I want to find the left-most one in the above diagram, nearest to the root
(or to another selected starting node in between the root and the required @id).

I can do this for single ids with the XPath query I posted, although I
don't fully understand the ordering I am getting. I can't figure out
how to make a satisfactory query which will return the set of leftmost
@ip for all the ids in the XML document.

Thanks for your interest, if I'm still not explaining clearly please let
me know. I'm quite new to XML/XPath and don't always know the correct
way to describe what I want to do.

Nick
--
Serendipity: http://www.leverton.org/blosxom (last update 19th September 200
"The Internet, a sort of ersatz counterfeit of real life"
-- Janet Street-Porter, BBC2, 19th March 1996
 