Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > programmnig advise needed

Reply
Thread Tools

programmnig advise needed

 
 
mitsura@skynet.be
Guest
Posts: n/a
 
      06-07-2005
Hi,

I am writing a Python program that needs to read XML files and contruct
a tree object from the XML file (using wxTree).
The XML however is not an hiearchical XML file. It contains <elements>
and <association> tags. The <assiociation> tags link the elements
together.
Example:
<element1>
... element 1 attributes
</element1>
<element2>
... element 2 attributes
</element2>
<element3>
... element 3 attributes
</element3>
<association>
<Name>element1</Name>
<Parent>element3</Parent>
</association>
<association>
<Name>element2</Name>
<Parent>element3</Parent>
</association>

In this case <element3> is the parent of the two other elements.
The XML files I receive often contain thousands of elements so the tree
structure can be very large. Futhermore, the XML files are not
'sorted', like in the example above, the 'root' element object entry
might be somewhere in the middle of the XML file.

I don't really now to code this so any suggestions are welcome.

With kind regards,

Kris

 
Reply With Quote
 
 
 
 
Andrea Griffini
Guest
Posts: n/a
 
      06-07-2005
On 7 Jun 2005 12:14:45 -0700, http://www.velocityreviews.com/forums/(E-Mail Removed) wrote:

>I am writing a Python program that needs to read XML files and contruct
>a tree object from the XML file (using wxTree).


Supposing your XML file has a single top level node (so that it's a
legal XML file) then the following code should be able to read it...

#########
from elementtree import ElementTree as ET

class Node:
def __init__(self, xmlnode):
self.xmlnode = xmlnode
self.children = []

def loadtree(f):
root = ET.parse(f).getroot()
nodes = {}
for x in root:
if x.tag.startswith("element"):
nodes[x.tag] = Node(x)
for x in root.findall("association"):
name = x.find("Name").text
parent = x.find("Parent").text
nodes[parent].children.append(nodes.pop(name))
assert len(nodes) == 1
return nodes.popitem()[1]
##########

The idea is to create a node with an empty list of
logical children and then for every association
removing the child node from the global pool (a dict
indexed by node name) to place it in its parent.
I assumed that the nodes are in random order, but
that the associations are sorted bottom-up in
the tree. If this is not the case then you should
keep TWO dicts, removing from only one of them
the children when you process an association,
and looking up the parent in the *other* dict that
is not changed during processing of associations.
The dict from who you are removing the children
will allow you to detect logical errors in the file
(i.e. a node having two parents - you won't find
that node in the dict the second time - and absence
of a single root - you won't end up with a single
element in the dict after processing all
associations -).

HTH
Andrea
 
Reply With Quote
 
 
 
 
mitsura@skynet.be
Guest
Posts: n/a
 
      06-08-2005
Thanks for the feedback! It is a very good idea to use dictionaries. I
will try to implemented it this way.

Thanks,

Kris


Andrea Griffini schreef:
> On 7 Jun 2005 12:14:45 -0700, (E-Mail Removed) wrote:
>
> >I am writing a Python program that needs to read XML files and contruct
> >a tree object from the XML file (using wxTree).

>
> Supposing your XML file has a single top level node (so that it's a
> legal XML file) then the following code should be able to read it...
>
> #########
> from elementtree import ElementTree as ET
>
> class Node:
> def __init__(self, xmlnode):
> self.xmlnode = xmlnode
> self.children = []
>
> def loadtree(f):
> root = ET.parse(f).getroot()
> nodes = {}
> for x in root:
> if x.tag.startswith("element"):
> nodes[x.tag] = Node(x)
> for x in root.findall("association"):
> name = x.find("Name").text
> parent = x.find("Parent").text
> nodes[parent].children.append(nodes.pop(name))
> assert len(nodes) == 1
> return nodes.popitem()[1]
> ##########
>
> The idea is to create a node with an empty list of
> logical children and then for every association
> removing the child node from the global pool (a dict
> indexed by node name) to place it in its parent.
> I assumed that the nodes are in random order, but
> that the associations are sorted bottom-up in
> the tree. If this is not the case then you should
> keep TWO dicts, removing from only one of them
> the children when you process an association,
> and looking up the parent in the *other* dict that
> is not changed during processing of associations.
> The dict from who you are removing the children
> will allow you to detect logical errors in the file
> (i.e. a node having two parents - you won't find
> that node in the dict the second time - and absence
> of a single root - you won't end up with a single
> element in the dict after processing all
> associations -).
>
> HTH
> Andrea


 
Reply With Quote
 
Magnus Lycka
Guest
Posts: n/a
 
      06-10-2005
(E-Mail Removed) wrote:
> Hi,
>
> I am writing a Python program that needs to read XML files and contruct
> a tree object from the XML file (using wxTree).
> The XML however is not an hiearchical XML file. It contains <elements>
> and <association> tags. The <assiociation> tags link the elements
> together.


Are you sure that you get a tree? As I understand your description,
the XML format could well be used to describe arbitrary directed
graphs, even cyclic ones. This might not be a problem unless someone
tries to expand the entire tree...but you should probably have
some kind of cycle check, or disallow "expand all" in the GUI and
make sure you don't create the nodes before they are expanded.

If you really just permit trees, it's fairly simple, because a
node should just appear as a <Name> in an association once. You
can put those in a set, and check that they aren't already in the
set before you accept the content in a new association. Something
like the following semi-pseudo-code:

nodes = set()

for name, parent in associations:
if name in nodes:
raise KeyError, ("%s is already in the tree. "
"Can't put it under %s." % (name, parent)
nodes.add(name)
whatever...

If you want to allow arbitrary directed acyclic graphs (DAGs),
you have to work a little more. E.g. if you want to allow things
like:

A
..B
...C
....D
....E
..F
...C
....D
....E

(I know that you said a tree, but I've certainly met clever and
well educated people before who called DAGs trees...)

For DAGs, you might traverse the tree to make sure there are
no cycles, but there are probably better solutions as well. As
I said, if it's really trees you want, this is not a problem.

Actually, another risk is that your document describes several
disjunct structures... Oh well, that's another issue, and you
can always argue how defensive your code needs to be. It obviously
depends on how well you trust your input and the implications
of problems. Anyway, assuming cycle checks, it's easy to find
roots: They are the keys in the dictionary that don't occur as
values. If there's just one, you just have one tree.

To summarize: If you want to describe a tree structure in XML,
it's probably a good idea to use the natural hierarchy of XML
tags to do this. That way the file can't convey a non-tree
structure. If you need to use a construct like you describe
in your XML, this suggests that you need to convey non-tree
structures, and then you probably need to handle those as well.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Easiest Programmnig & MS Office iboon.us Java 0 02-07-2008 03:28 PM
Wireless Network help/advise needed please =?Utf-8?B?cGFmb3M=?= Wireless Networking 1 01-21-2006 05:30 PM
Advise needed re Olympus Camedia C-3030 Zoom Camera (driver needed) Arawak Computer Support 2 11-18-2004 03:03 PM
advise needed. brian Computer Support 31 05-16-2004 07:42 PM
Advise needed Ling Chung Shum Computer Support 5 06-21-2003 11:36 PM



Advertisments