[ANN] lxml 1.0 released

Stefan Behnel
Hallo everyone,

I have the honour to announce the availability of lxml 1.0.

It's downloadable from cheeseshop:

lxml is a Pythonic binding for the libxml2 and libxslt libraries. It provides
safe and convenient access to these libraries using the ElementTree API. It
extends the ElementTree API significantly to offer support for XPath, RelaxNG,
XML Schema, XSLT, C14N and much, much more.
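As a rough illustration of what "ElementTree API plus XPath" means in practice, here is a minimal sketch. The sample document and variable names are made up for this example, and the calls are written against a current lxml release rather than 1.0 itself.

```python
from lxml import etree

# Hypothetical sample document, only for illustration.
root = etree.fromstring(
    "<library>"
    "<book id='1'><title>Dune</title></book>"
    "<book id='2'><title>Emma</title></book>"
    "</library>"
)

# ElementTree-compatible access: find() with a simple path.
first_title = root.find("book/title").text

# lxml extension: a full XPath expression with a predicate.
second_title = root.xpath("//book[@id='2']/title/text()")

print(first_title)
print(second_title)
```

The find() call would also work with the plain ElementTree library, while xpath() is one of the lxml additions mentioned above.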

Its goals are:

* Pythonic API.
* Documented.
* Use Python unicode strings in API.
* Safe (no segfaults).
* No manual memory management!
(as opposed to the official libxml2 Python bindings)

While the list of features added since the last beta version (1.0.beta) is
rather small, this version contains a large number of fixes for bugs found by
various users and testers. Thank you all for your help!


Features added since 0.9.2:

* Element.getiterator() and the findall() methods support finding
arbitrary elements from a namespace (pattern {namespace}*)
* Another speedup in tree iteration code
* General speedup of Python Element object creation and deallocation
* Writing C14N no longer serializes in memory (reduced memory footprint)
* PyErrorLog for error logging through the Python logging module
* element.getroottree() returns an ElementTree for the root node of the
document that contains the element.
* ElementTree.getpath(element) returns a simple, absolute XPath expression
to find the element in the tree structure
* Error logs have a last_error attribute for convenience
* Comment texts can be changed through the API
* Formatted output via pretty_print keyword to serialization functions
* XSLT can block access to file system and network via XSLTAccessControl
* ElementTree.write() no longer serializes in memory (reduced memory
footprint)
* Speedup of Element.findall(tag) and Element.getiterator(tag)
* Support for writing the XML representation of Elements and ElementTrees
to Python unicode strings via etree.tounicode()
* Support for writing XSLT results to Python unicode strings via unicode()
* Parsing a unicode string no longer copies the string (reduced memory
footprint)
* Parsing file-like objects now reads chunks rather than the whole file
(reduced memory footprint)
* Parsing StringIO objects from the start avoids copying the string
(reduced memory footprint)
* Read-only 'docinfo' attribute in ElementTree class holds DOCTYPE
information, original encoding and XML version as seen by the parser
* etree module can be compiled without libxslt by commenting out the line
include "xslt.pxi" near the end of the etree.pyx source file
* Better error messages in parser exceptions
* Error reporting now also works in XSLT
* Support for custom document loaders (URI resolvers) in parsers and XSLT,
resolvers are registered at parser level
* Implementation of exslt:regexp for XSLT based on the Python 're' module,
enabled by default, can be switched off with 'regexp=False' keyword
* Support for exslt extensions (libexslt) and libxslt extra functions
(node-set, document, write, output)
* Substantial speedup in XPath.evaluate()
* HTMLParser for parsing (broken) HTML
* XMLDTDID function parses XML into tuple (root node, ID dict) based on
xml:id implementation of libxml2 (as opposed to ET compatible XMLID)
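A few of the features listed above can be tried out with a short sketch like the following. It assumes a current lxml release: iter() is used as today's spelling of getiterator(), and tostring(..., encoding="unicode") stands in for etree.tounicode(). The documents and the urn:a namespace are invented for the example.

```python
from lxml import etree

# Namespace wildcard: "{namespace}*" matches any element in that namespace.
ns_doc = etree.fromstring(
    '<root xmlns:a="urn:a"><a:x/><plain><a:y/></plain></root>'
)
direct = [el.tag for el in ns_doc.findall("{urn:a}*")]  # children only
nested = [el.tag for el in ns_doc.iter("{urn:a}*")]     # whole subtree

# getroottree() and getpath(): go from an element to its tree,
# then ask the tree for an absolute XPath expression to that element.
doc = etree.fromstring("<root><item/><item><leaf/></item></root>")
leaf = doc[1][0]
tree = leaf.getroottree()
path = tree.getpath(leaf)
found = tree.xpath(path)

# pretty_print keyword on the serialization functions.
pretty = etree.tostring(doc, pretty_print=True, encoding="unicode")

print(direct)
print(nested)
print(path)
print(pretty)
```

Evaluating the expression returned by getpath() against the same tree should lead back to the original element, which makes it handy for error messages and debugging output.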

Bugs fixed since 0.9.2:

* Memory leak in Element.__setitem__
* Memory leak in Element.attrib.items() and Element.attrib.values()
* Memory leak in XPath extension functions
* Memory leak in unicode related setup code
* Element now raises ValueError on empty tag names
* Namespace fixing after moving elements between documents could fail if
the source document was freed too early
* Setting namespace-less tag names on namespaced elements ('{ns}t' -> 't')
didn't reset the namespace
* Unknown constants from newer libxml2 versions could raise exceptions in
the error handlers
* lxml.etree compiles much faster
* On libxml2 <= 2.6.22, parsing strings with encoding declaration could
fail in certain cases
* Document reference in ElementTree objects was not updated when the root
element was moved to a different document
* Running absolute XPath expressions on an Element now evaluates against
the root tree
* Evaluating absolute XPath expressions (/*) on an ElementTree could fail
* Crashes when calling XSLT, RelaxNG, etc. with uninitialized ElementTree
* Memory leak when using iconv encoders in tostring/write
* Deep copying Elements and ElementTrees maintains the document information
* Serialization functions raise LookupError for unknown encodings
* Memory deallocation crash resulting from deep copying elements
* Some ElementTree methods could crash if the root node was not
initialized (neither file nor element passed to the constructor)
* Element/SubElement failed to set attribute namespaces from passed attrib
* tostring() now adds an XML declaration for non-ASCII encodings
* tostring() failed to serialize encodings that contain 0-bytes
* ElementTree.xpath() and XPathDocumentEvaluator were not using the
ElementTree root node as reference point
* Calling document('') in XSLT failed to return the stylesheet
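Some of the behaviour changes in this list are easy to check from Python. The sketch below covers three of them: empty tag names, unknown encodings, and absolute XPath expressions run on an Element. It was written against a current lxml release, on the assumption that these behaviours are unchanged since 1.0. The raises() helper is a hypothetical name introduced only for this example.

```python
from lxml import etree

def raises(exc_type, func, *args, **kwargs):
    """Return True if calling func(*args, **kwargs) raises exc_type."""
    try:
        func(*args, **kwargs)
    except exc_type:
        return True
    return False

root = etree.fromstring("<root><child/></root>")
child = root[0]

# An absolute expression run on a child is evaluated from the document root.
hits = [el.tag for el in child.xpath("/root")]

# Empty tag names are rejected with ValueError.
empty_tag_rejected = raises(ValueError, etree.Element, "")

# Unknown encodings are rejected with LookupError during serialization.
bad_encoding_rejected = raises(
    LookupError, etree.tostring, root, encoding="no-such-encoding"
)

print(hits, empty_tag_rejected, bad_encoding_rejected)
```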
Kent Johnson
Stefan Behnel wrote:
> Hallo everyone,
> I have the honour to announce the availability of lxml 1.0.
> It's downloadable from cheeseshop:

Are there any plans to offer a Windows installer?

Stefan Behnel
Kent Johnson wrote:
> Stefan Behnel wrote:
>> Hallo everyone,
>> I have the honour to announce the availability of lxml 1.0.
>> It's downloadable from cheeseshop:

> Are there any plans to offer a Windows installer?

Already there.

It just takes a minute longer sometimes, but Windows users are not forgotten.
