high level, fast XML package for Python?

 
 
Gleb Rybkin
      09-15-2006
I searched online, but couldn't really find a standard package for
working with Python and XML -- everybody seems to suggest different
ones.

Is there a standard xml package for Python? Preferably high-level, fast
and that can parse in-file, not in-memory since I have to deal with
potentially MBs of data.

Thanks.

 
Diez B. Roggisch
      09-15-2006
Gleb Rybkin wrote:

> I searched online, but couldn't really find a standard package for
> working with Python and XML -- everybody seems to suggest different
> ones.
>
> Is there a standard xml package for Python? Preferably high-level, fast
> and that can parse in-file, not in-memory since I have to deal with
> potentially MBs of data.


cElementTree and lxml (which is API-compatible with the former). cElementTree
has an incremental parser, which allows larger-than-memory files to be
processed.

Diez
 
Steven Bethard
      09-15-2006
Diez B. Roggisch wrote:
> Gleb Rybkin wrote:
>
>> I searched online, but couldn't really find a standard package for
>> working with Python and XML -- everybody seems to suggest different
>> ones.
>>
>> Is there a standard xml package for Python? Preferably high-level, fast
>> and that can parse in-file, not in-memory since I have to deal with
>> potentially MBs of data.

>
> cElementTree and lxml (which is API-compatible with the former). cElementTree
> has an incremental parser, which allows larger-than-memory files to be
> processed.


In Python 2.5, cElementTree and ElementTree will be available in the
standard library as xml.etree.cElementTree and xml.etree.ElementTree.
So learning them now is a great idea.
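
A minimal sketch of what using the standard-library copy looks like. The sample document and its tag names are made up for illustration. Note that on current Python 3 releases the C accelerator is picked up automatically by xml.etree.ElementTree, and the separate xml.etree.cElementTree name was removed in Python 3.9, so only the plain import is shown:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the tag and attribute names are invented.
SAMPLE = "<library><book id='1'>Dive In</book><book id='2'>Cookbook</book></library>"

# fromstring() builds the whole tree in memory, which is fine for small inputs.
root = ET.fromstring(SAMPLE)

# Collect (id, title) pairs from the direct <book> children of the root.
titles = [(book.get("id"), book.text) for book in root.findall("book")]
print(titles)
```

This is the in-memory style of use; for large files an incremental approach is the better fit.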

STeVe
 
Gleb Rybkin
      09-15-2006
Okay, thanks!

Steven Bethard wrote:
> Diez B. Roggisch wrote:
> > Gleb Rybkin wrote:
> >
> >> I searched online, but couldn't really find a standard package for
> >> working with Python and XML -- everybody seems to suggest different
> >> ones.
> >>
> >> Is there a standard xml package for Python? Preferably high-level, fast
> >> and that can parse in-file, not in-memory since I have to deal with
> >> potentially MBs of data.

> >
> > cElementTree and lxml (which is API-compatible with the former). cElementTree
> > has an incremental parser, which allows larger-than-memory files to be
> > processed.

>
> In Python 2.5, cElementTree and ElementTree will be available in the
> standard library as xml.etree.cElementTree and xml.etree.ElementTree.
> So learning them now is a great idea.
>
> STeVe


 
Tim N. van der Leeuw
      09-15-2006
Hi Gleb,

Gleb Rybkin wrote:
> I searched online, but couldn't really find a standard package for
> working with Python and XML -- everybody seems to suggest different
> ones.
>
> Is there a standard xml package for Python? Preferably high-level, fast
> and that can parse in-file, not in-memory since I have to deal with
> potentially MBs of data.
>
> Thanks.


Another option is Amara; also quite high-level and also allows for
incremental parsing. I would say Amara is somewhat higher level than
ElementTree since it allows you to access your XML nodes as Python
objects (with some extra attributes and some minor warts), as well as
giving you XPath expressions on the object tree.

URL:

http://uche.ogbuji.net/tech/4suite/amara/

The best version currently available is 1.1.7.

It does work together with py2exe on Windows, if the need ever arises
for you, but you have to fiddle with it a bit (ask for details on this
list if you ever need to do that).

Cheers,

--Tim

 
Stefan Behnel
      09-16-2006
Tim N. van der Leeuw wrote:
> Another option is Amara; also quite high-level and also allows for
> incremental parsing. I would say Amara is somewhat higher level than
> ElementTree since it allows you to access your XML nodes as Python
> objects (with some extra attributes and some minor warts), as well as
> giving you XPath expressions on the object tree.


Then you should definitely give lxml.objectify a try. It combines the ET API
with the lxml set of features (XPath, RelaxNG, XSLT, ...) and hides the actual
XML behind a Python object interface. That gives you everything at the same time.

http://codespeak.net/lxml/objectify.html

It's part of the lxml distribution:
http://codespeak.net/lxml/

Stefan
 
John J. Lee
      09-17-2006
Steven Bethard writes:
[...]
> In Python 2.5, cElementTree and ElementTree will be available in the
> standard library as xml.etree.cElementTree and
> xml.etree.ElementTree. So learning them now is a great idea.


Only some of the original ElementTree software is going into 2.5,
apparently. So you can get more on the effbot.org site than you get
from just downloading Python 2.5. Probably future Python releases
will add more of Fredrik's XML code.


John
 
Martin v. Löwis
      09-17-2006
Gleb Rybkin wrote:
> I searched online, but couldn't really find a standard package for
> working with Python and XML -- everybody seems to suggest different
> ones.
>
> Is there a standard xml package for Python? Preferably high-level, fast
> and that can parse in-file, not in-memory since I have to deal with
> potentially MBs of data.


It seems that everybody is proposing libraries that use in-memory
representations. There is a standard xml package for Python, it's
called "xml" (and comes with the standard library). It contains a
SAX interface, xml.sax, which can parse files incrementally.
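
A rough sketch of what incremental SAX parsing can look like, feeding the parser one chunk at a time so the whole file never has to be in memory. The TagCounter class, the count_tags helper, the chunk size and the sample data are all invented for illustration; only xml.sax.make_parser, ContentHandler and the parser's feed()/close() methods come from the standard library:

```python
import io
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts start tags and stores nothing else."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(stream, chunk_size=64 * 1024):
    """Read the stream chunk by chunk and feed each chunk to the parser."""
    handler = TagCounter()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
    parser.close()  # signals end of input to the parser
    return handler.counts


# A tiny in-memory stand-in for a large file opened in binary mode.
sample = io.BytesIO(b"<log><entry/><entry/><entry/></log>")
counts = count_tags(sample, chunk_size=8)
print(counts)
```

With a real file you would pass an object opened with open(path, "rb") instead of the BytesIO stand-in.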

Regards,
Martin
 
Steven Bethard
      09-17-2006
Martin v. Löwis wrote:
> Gleb Rybkin wrote:
>> I searched online, but couldn't really find a standard package for
>> working with Python and XML -- everybody seems to suggest different
>> ones.
>>
>> Is there a standard xml package for Python? Preferably high-level, fast
>> and that can parse in-file, not in-memory since I have to deal with
>> potentially MBs of data.

>
> It seems that everybody is proposing libraries that use in-memory
> representations. There is a standard xml package for Python, it's
> called "xml" (and comes with the standard library). It contains a
> SAX interface, xml.sax, which can parse files incrementally.


To use ElementTree and keep your memory consumption down, consider using
the iterparse function:

http://effbot.org/zone/element-iterparse.htm

Then you can get more SAX-like memory consumption while still enjoying
the high-level interface of ElementTree.
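
Something along these lines, as a sketch only. The sum_amounts helper, the "record" tag and the "amount" attribute are made-up names for the example; iterparse and Element.clear() are the real ElementTree calls:

```python
import io
import xml.etree.ElementTree as ET


def sum_amounts(stream):
    """Add up the 'amount' attribute of every <record>, one element at a time."""
    total = 0
    # "end" events fire once an element has been read completely.
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "record":
            total += int(elem.get("amount"))
            # Drop the element's attributes and children now that we are done with it.
            elem.clear()
    return total


# A tiny in-memory stand-in for a large file opened in binary mode.
sample = io.BytesIO(
    b"<records><record amount='3'/><record amount='4'/><record amount='5'/></records>"
)
total = sum_amounts(sample)
print(total)
```

The cleared elements still hang off the root as empty shells, so for really huge files you may also want to grab the root element from a "start" event and clear it periodically.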

STeVe
 
Paul Boddie
      09-19-2006
Martin v. Löwis wrote:
>
> It seems that everybody is proposing libraries that use in-memory
> representations. There is a standard xml package for Python, it's
> called "xml" (and comes with the standard library). It contains a
> SAX interface, xml.sax, which can parse files incrementally.


What about xml.dom.pulldom? It quite possibly resembles ElementTree's
iterparse, or at least promotes event-style handling of XML information
using some kind of mainloop...

import xml.dom.pulldom

# s is a string containing the XML document
for etype, node in xml.dom.pulldom.parseString(s):
    if etype == xml.dom.pulldom.START_ELEMENT:
        print(node.nodeName, node.attributes)

...instead of callbacks (as happens with SAX):

import xml.sax

class CH(xml.sax.ContentHandler):
    def startElement(self, name, attrs):
        print(name, attrs)

xml.sax.parseString(s, CH())
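
To make the resemblance to iterparse a bit more concrete, here is a sketch that uses pulldom's expandNode to turn just the element of interest into a small DOM subtree while the rest of the document streams past. The item_titles helper and the feed/item/title document are invented for the example:

```python
import xml.dom.pulldom

# Hypothetical sample document; the tag and attribute names are invented.
SAMPLE = (
    "<feed>"
    "<item id='a'><title>First</title></item>"
    "<item id='b'><title>Second</title></item>"
    "</feed>"
)


def item_titles(xml_text):
    """Return (id, title) pairs, building a DOM subtree only for each <item>."""
    events = xml.dom.pulldom.parseString(xml_text)
    titles = []
    for etype, node in events:
        if etype == xml.dom.pulldom.START_ELEMENT and node.nodeName == "item":
            # Pull the remaining events for this <item> into a DOM subtree.
            events.expandNode(node)
            title = node.getElementsByTagName("title")[0]
            titles.append((node.getAttribute("id"), title.firstChild.data))
    return titles


result = item_titles(SAMPLE)
print(result)
```

For a file on disk, xml.dom.pulldom.parse() takes a filename or file object in place of parseString().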

Paul

 