Ignoring XML Namespaces with ElementTree

 
 
Pete
 
      12-03-2009
Is there any way to configure ElementTree to ignore the XML namespace?
For the past couple of months, I've been using minidom to parse an XML
file that is generated by a unit within my organization that can't
stick with a standard. This hasn't been a problem until recently, when
the script was given a 30MB file that, once parsed, increased the
Python memory footprint by 1.0GB, and now I'm running into
MemoryErrors. Based on Google searches and testing, it looks like
ElementTree is much more memory-efficient and I'd like to switch.
However, I'd like to be able to ignore the namespaces. These XML files
tend to switch namespaces for no apparent reason, and ignoring the
namespaces would help the script adapt to the changes. Any help on
this would be greatly appreciated. I'm having a hard time finding the
answer.

Additionally, does anyone know how ElementTree handles XML elements that
include Unicode?
 
 
Stefan Behnel
 
      12-03-2009
Pete, 03.12.2009 19:21:
> Is there any way to configure ElementTree to ignore the XML namespace?
> For the past couple of months, I've been using minidom to parse an XML
> file that is generated by a unit within my organization that can't
> stick with a standard. This hasn't been a problem until recently, when
> the script was given a 30MB file that, once parsed, increased the
> Python memory footprint by 1.0GB, and now I'm running into
> MemoryErrors. Based on Google searches and testing, it looks like
> ElementTree is much more memory-efficient and I'd like to switch,


Make sure you use cElementTree; then that's certainly the right choice to make.


> however I'd
> like to be able to ignore the namespaces. These XML files tend to
> randomly switch the namespace for no reason and ignoring these
> namespaces would help the script adapt to the changes. Any help on
> this would be greatly appreciated. I'm having a hard time finding the
> answer.


ET stores the namespace URI as part of the tag name, in the form
"{namespace-uri}localname". So if you want to ignore namespaces, just
strip the leading "{...}" (if any) from the tag and work with the rest
(the so-called "local name").
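
A minimal sketch of that stripping step follows. The helper names local_name and strip_namespaces and the sample document are made up for illustration. It imports xml.etree.ElementTree rather than cElementTree, on the assumption of a current Python 3, where the C accelerator is used automatically and the separate cElementTree module was removed in 3.9.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag without a leading '{namespace-uri}' part, if any."""
    # Comments and processing instructions have a non-string tag; leave them alone.
    if isinstance(tag, str) and tag.startswith("{"):
        return tag.split("}", 1)[1]
    return tag


def strip_namespaces(root):
    """Rewrite every element tag in place so only the local name remains."""
    for elem in root.iter():
        elem.tag = local_name(elem.tag)
    return root


# Hypothetical input: a default namespace that might change between files.
SAMPLE = """<report xmlns="http://example.com/ns/v1">
  <item id="1">first</item>
  <item id="2">second</item>
</report>"""

root = strip_namespaces(ET.fromstring(SAMPLE))

# After stripping, plain local names work in find()/findall().
items = [item.text for item in root.findall("item")]
print(root.tag, items)
```

This only touches element tags. Namespaced attribute names use the same "{...}" form and would need the same treatment on the keys of elem.attrib if they matter.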


> Additionally, does anyone know how ElementTree handles XML elements that
> include Unicode?


It's an XML parser, so the answer is: without any difficulties.
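
As a quick illustration, assuming Python 3 and a UTF-8 encoded input (the sample data is invented), the parser decodes the bytes according to the XML declaration and hands back ordinary text strings:

```python
import xml.etree.ElementTree as ET

# Hypothetical UTF-8 input, as it would arrive from a file opened in binary mode.
DATA = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<names><name>Zoë</name><name>東京</name></names>"
).encode("utf-8")

root = ET.fromstring(DATA)

# Element text comes back as str, already decoded.
names = [n.text for n in root.findall("name")]
print(names)

# Serializing with an explicit encoding yields bytes again.
serialized = ET.tostring(root, encoding="utf-8")
```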

Stefan
 
 
Pete
 
      12-03-2009
Perfect... I can work with that. Thanks.
 