xml element tree to html problem

 
 
Ron Adam
      04-04-2006

I'm new to element tree and haven't been able to find a simple solution
to this problem yet. So maybe someone can point me in the right direction.

I have an element tree structure of nested elements that I want to
convert to html as nested definition and unordered lists in the
following way.

<object>
<name>ball</name>
<desc>
<color>red</color>
<size>large</size>
</desc>
</object>


To...

<dl class='object'>
<dt class='name'>ball</dt>
<dd class='desc'>
<ul>
<li class='color'>red</li>
<li class='size'>large</li>
</ul>
</dd>
</dl>


Each XML tag has a predefined relationship to either a definition-list
or an unordered-list HTML tag: 'object' is always mapped to <dl
class='object'>, 'name' is always mapped to <dt class='name'>, etc.

So I will probably have a dictionary to look them up. The problem I
have is finding a relatively painless way to do the actual translation.

Thanks in advance for any advice.

Cheers,
Ron

 
Fredrik Lundh
      04-04-2006
Ron Adam wrote:

> I have an element tree structure of nested elements that I want to
> convert to html as nested definition and unordered lists in the
> following way.
>
> <object>
> <name>ball</name>
> <desc>
> <color>red</color>
> <size>large</size>
> </desc>
> </object>
>
>
> To...
>
> <dl class='object'>
> <dt class='name'>ball</dt>
> <dd class='desc'>
> <ul>
> <li class='color'>red</li>
> <li class='size'>large</li>
> </ul>
> </dd>
> </dl>
>
>
> Where each xml tag has a predefined relationship to either definition
> list or an unordered list html tag. 'object' is always mapped to <dl
> class='object'>, 'name' is always mapped to <dt class='name'>. etc...
>
> So I will probably have a dictionary to look them up. The problem I
> have is finding a relatively painless way to do the actual translation.


here's one way to do it:

import cElementTree as ET

tree = ET.XML("""
<object>
<name>ball</name>
<desc>
<color>red</color>
<size>large</size>
</desc>
</object>
""")

MAP = {
    "object": ("dl", "object"),
    "name": ("dt", "name"),
    "desc": ("ul", None),
    "color": ("li", "color"),
    "size": ("li", "size"),
}

for elem in tree.getiterator():
    elem.tag, klass = MAP[elem.tag]
    if klass:
        elem.set("class", klass)

print ET.tostring(tree)

this prints:

<dl class="object">
<dt class="name">ball</dt>
<ul>
<li class="color">red</li>
<li class="size">large</li>
</ul>
</dl>
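
For readers on a current Python 3, here is a minimal sketch of the same loop. It assumes the standard-library xml.etree.ElementTree module, since the separate cElementTree package and Element.getiterator() are no longer available in recent Python releases. The names mirror the code above; only the import, iter(), and the encoding argument differ.

```python
import xml.etree.ElementTree as ET

tree = ET.XML("""
<object>
<name>ball</name>
<desc>
<color>red</color>
<size>large</size>
</desc>
</object>
""")

# source tag -> (html tag, class attribute or None)
MAP = {
    "object": ("dl", "object"),
    "name": ("dt", "name"),
    "desc": ("ul", None),
    "color": ("li", "color"),
    "size": ("li", "size"),
}

# iter() walks the element and all of its descendants in document order
for elem in tree.iter():
    elem.tag, klass = MAP[elem.tag]
    if klass:
        elem.set("class", klass)

# encoding="unicode" makes tostring() return a str instead of bytes
html = ET.tostring(tree, encoding="unicode")
print(html)
```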


here's a somewhat simpler (but less general) version:

MAP = dict(object="dl", name="dt", desc="ul", color="li", size="li")

for elem in tree.getiterator():
    if elem.tag != "desc":
        elem.set("class", elem.tag)
    elem.tag = MAP[elem.tag]
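
Note that both loops rename 'desc' straight to <ul>, so the <dd class='desc'> wrapper from the original question is not produced. One possible way to add it is sketched below for Python 3 with the standard-library xml.etree.ElementTree. The helper name wrap_children_in_ul is hypothetical, and the input is written without whitespace to keep the output compact.

```python
import xml.etree.ElementTree as ET

def wrap_children_in_ul(elem):
    """Move all children of elem into a new <ul> child (hypothetical helper)."""
    ul = ET.Element("ul")
    for child in list(elem):   # copy the child list before removing from it
        elem.remove(child)
        ul.append(child)
    elem.append(ul)

tree = ET.XML(
    "<object><name>ball</name>"
    "<desc><color>red</color><size>large</size></desc></object>"
)

MAP = {"object": "dl", "name": "dt", "desc": "dd", "color": "li", "size": "li"}

# list() takes a snapshot first, because the loop restructures the tree;
# the newly created <ul> is not in the snapshot, so it gets no class attribute
for elem in list(tree.iter()):
    elem.set("class", elem.tag)
    if elem.tag == "desc":
        wrap_children_in_ul(elem)
    elem.tag = MAP[elem.tag]

html = ET.tostring(tree, encoding="unicode")
print(html)
```

The design choice here is to map 'desc' to <dd> and insert the <ul> as an extra level, rather than trying to express a two-element mapping in the lookup table.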

hope this helps!

</F>



 
akameswaran@gmail.com
      04-04-2006
Are you using PyXML?
If this is a simple one-to-one relationship with dependence on order,
I'd forgo PyXML entirely, read the file line by line, and replace tags
as appropriate. You may have to do some simple checking, in case there
is an <object> <name>object</name>.


Otherwise, have fun with the parsers - nothing is painless in SAX or
DOM.

As far as simple translation goes, once you have tokenized or used PyXML
to get the elements, something like this?

for tag in listofTags:
    if tag in tagDictionary:
        doSomething(tagDictionary[tag])
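
As a rough illustration of the text-substitution idea, here is a sketch that renames tags with a regular expression and a lookup dictionary. It assumes simple input with no attributes, comments, or CDATA sections; TAG_MAP, TAG_RE, and rename_tags are hypothetical names, not part of any library.

```python
import re

# Hypothetical lookup table: source tag name -> replacement tag name
TAG_MAP = {"object": "dl", "name": "dt", "desc": "ul", "color": "li", "size": "li"}

# Matches <tag> or </tag> only; element text such as "object" is left alone
# because it is not enclosed in angle brackets
TAG_RE = re.compile(r"<(/?)(\w+)>")

def rename_tags(text):
    def repl(match):
        slash, tag = match.groups()
        # Tags missing from the table are passed through unchanged
        return "<%s%s>" % (slash, TAG_MAP.get(tag, tag))
    return TAG_RE.sub(repl, text)

source = "<object><name>object</name><desc><color>red</color></desc></object>"
result = rename_tags(source)
print(result)
```

Matching on the angle brackets is what keeps the text "object" inside <name> from being rewritten, which is the kind of check mentioned above.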


IMHO - PyXML is a great tool, but for something like simple
substitution, it's serious overkill. For simple processing tasks
like this I almost always just treat it as a text file. I resort to
PyXML when things like hierarchy are important. Hope this helps.

 
akameswaran@gmail.com
      04-04-2006
Fredrik,
I didn't know about cElementTree before. Wow, this is a lot nicer than
PyXML, and a whole lot faster. It almost makes my comments about
dealing with the XML as text completely pointless.

 
Ron Adam
      04-05-2006
Fredrik Lundh wrote:

> here's one way to do it:
>
> import cElementTree as ET
>
> tree = ET.XML("""
> <object>
> <name>ball</name>
> <desc>
> <color>red</color>
> <size>large</size>
> </desc>
> </object>
> """)
>
> MAP = {
>     "object": ("dl", "object"),
>     "name": ("dt", "name"),
>     "desc": ("ul", None),
>     "color": ("li", "color"),
>     "size": ("li", "size"),
> }
>
> for elem in tree.getiterator():
>     elem.tag, klass = MAP[elem.tag]
>     if klass:
>         elem.set("class", klass)
>
> print ET.tostring(tree)



Thanks a *LOT!*

This is what I needed. Now I can play with finding the best data
structure along with what elements to translate each tag to.

This is for a rewrite of PyDoc.py. I'm hoping it will be as easy to
write to other formats from the XML as it is to html.

Cheers,
Ron
 