[ANNOUNCEMENT]: VTD-XML released under GPL

 
 
Jimmy zhang
 
      06-29-2004
I am pleased to announce that version 0.5 of VTD-XML -- a new,
non-extractive, Java-based XML processing API licensed under the GPL
-- is now freely available on sourceforge.net. For source code,
documentation, a detailed description of the API, and code examples,
please visit

http://vtd-xml.sf.net

Capable of random access, VTD-XML attempts to be both memory
efficient and high performance. The starting point of this project is
the observation that, for XML documents that don't declare entities
in a DTD, tokenization can be done by recording only the starting
offset and length of each token. A discussion of this subject appeared
in a recent article on xml.com
(http://www.xml.com/pub/a/2004/05/19/parsing.html).
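
To make the offset-and-length idea concrete, here is a minimal sketch. It is not taken from the VTD-XML source; the class name NonExtractiveTokens and its methods are hypothetical, and the token positions are hard-coded instead of being produced by a real parser. It only shows that a token can be described by two integers pointing into the original bytes, with a String built only on demand.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of the VTD-XML API.
final class NonExtractiveTokens {

    // Materialize the token text only when a caller actually asks for it.
    static String text(byte[] doc, int offset, int length) {
        return new String(doc, offset, length, StandardCharsets.UTF_8);
    }

    // Compare a token against an expected name directly on the document
    // bytes, without building a String for the token itself.
    static boolean matches(byte[] doc, int offset, int length, String expected) {
        byte[] e = expected.getBytes(StandardCharsets.UTF_8);
        if (e.length != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (doc[offset + i] != e[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] doc = "<order><id>42</id></order>".getBytes(StandardCharsets.UTF_8);
        // Offsets are hard-coded here; a real tokenizer would compute them.
        System.out.println(text(doc, 1, 5));          // element name at offset 1
        System.out.println(text(doc, 11, 2));         // text content at offset 11
        System.out.println(matches(doc, 8, 2, "id")); // name check on raw bytes
    }
}
```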

The core technology of VTD-XML is a binary format specification
called Virtual Token Descriptor (VTD). A VTD record is a 64-bit integer
that encodes the starting offset, length, type and nesting depth of a
token in an XML document. Because VTD records don't contain the actual
token content, they work alongside the original XML document, which
the processing model keeps intact in memory.
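
As a rough illustration of how four fields can share one 64-bit integer, the sketch below packs and unpacks them with shifts and masks. The field widths (4 bits for type, 8 for depth, 20 for length, 32 for offset) and the class name VtdRecordSketch are assumptions made for this example only; please consult the VTD specification on the project site for the real layout.

```java
// Illustrative only: the bit layout below is an assumption for this sketch,
// not the layout defined by the VTD specification.
final class VtdRecordSketch {
    static final int TYPE_SHIFT = 60;   // top 4 bits: token type
    static final int DEPTH_SHIFT = 52;  // next 8 bits: nesting depth
    static final int LENGTH_SHIFT = 32; // next 20 bits: token length
    // lowest 32 bits: starting offset

    static long pack(int type, int depth, int length, long offset) {
        if (type < 0 || type > 0xF || depth < 0 || depth > 0xFF
                || length < 0 || length > 0xFFFFF
                || offset < 0 || offset > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("field out of range");
        }
        return ((long) type << TYPE_SHIFT)
                | ((long) depth << DEPTH_SHIFT)
                | ((long) length << LENGTH_SHIFT)
                | offset;
    }

    // Unsigned shift so a type with its top bit set is not sign-extended.
    static int type(long rec)    { return (int) (rec >>> TYPE_SHIFT); }
    static int depth(long rec)   { return (int) ((rec >>> DEPTH_SHIFT) & 0xFF); }
    static int length(long rec)  { return (int) ((rec >>> LENGTH_SHIFT) & 0xFFFFF); }
    static long offset(long rec) { return rec & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        // Hypothetical token: type 2, depth 3, 5 bytes long, starting at byte 1.
        long rec = pack(2, 3, 5, 1);
        System.out.println(type(rec) + " " + depth(rec) + " "
                + length(rec) + " " + offset(rec));
    }
}
```

Because the record is a primitive long, it can live in a plain long[] with no per-record object header.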

VTD's memory-conserving features can be summarized as follows:

* Avoiding per-object overhead -- In many VM-based object-oriented
programming languages, each object allocation incurs a small amount
of memory overhead. A VTD record avoids this overhead because
it is not an object.
* Bulk allocation of storage -- Being fixed in length, VTD records can be
stored in large memory blocks, which are more efficient to allocate
and garbage-collect. By allocating one large array for 4096 VTD records,
one incurs the per-array overhead (16 bytes in JDK 1.4) only once across
4096 records, reducing the per-record overhead to a small fraction of a byte.
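
The bulk-allocation point can be sketched as a growable buffer that stores records in fixed-size long[] blocks. This is not the VTD-XML implementation; the class name ChunkedLongBuffer and its methods are hypothetical. With the 16-byte array overhead quoted above, each block of 4096 records costs 16 / 4096, or roughly 0.004 bytes, of overhead per record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of block-wise storage for fixed-length records.
final class ChunkedLongBuffer {
    static final int CHUNK_SIZE = 4096; // records per block, as in the text

    private final List<long[]> chunks = new ArrayList<>();
    private int size = 0;

    // Append a record, allocating a new block only when the last one is full.
    void append(long record) {
        if (size == chunks.size() * CHUNK_SIZE) {
            chunks.add(new long[CHUNK_SIZE]);
        }
        chunks.get(size / CHUNK_SIZE)[size % CHUNK_SIZE] = record;
        size++;
    }

    // Random access: locate the block, then the slot inside it.
    long get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chunks.get(index / CHUNK_SIZE)[index % CHUNK_SIZE];
    }

    int size() { return size; }

    int chunkCount() { return chunks.size(); }

    public static void main(String[] args) {
        ChunkedLongBuffer buf = new ChunkedLongBuffer();
        for (long i = 0; i < 5000; i++) {
            buf.append(i);
        }
        // 5000 records need two blocks of 4096.
        System.out.println(buf.size() + " records in " + buf.chunkCount() + " blocks");
    }
}
```

Growing by whole blocks also means existing records are never copied when the buffer expands, unlike a single array that has to be reallocated.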

Our benchmarks indicate that VTD-XML processes XML at a performance
level similar to (and often better than) SAX with a null content handler.
Total memory usage is typically 1.3 to 1.6 times the size of the
document, a figure that includes the document itself.

Other features included in this release are:

* Incremental update -- VTD-XML allows one to modify the content of an XML
document without touching the unaffected parts of the document.
* Content extraction -- VTD-XML also allows one to pull an element
out of an XML document in its serialized form. This can be an important
feature for partial signing/encryption of SOAP payloads for
WS-Security.
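
Both features follow naturally once tokens are known by offset and length. The sketch below is only an assumption about how such operations could look on a raw byte array; the class ByteRangeEdits and its methods are hypothetical and are not the VTD-XML API. Extraction copies an element's bytes verbatim, and an update writes the unchanged prefix and suffix around the new content.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helpers; the offsets would come from token records in practice.
final class ByteRangeEdits {

    // Content extraction: copy the element's serialized bytes as they are.
    static byte[] extract(byte[] doc, int offset, int length) {
        return Arrays.copyOfRange(doc, offset, offset + length);
    }

    // Incremental update: the bytes before and after the range are reused
    // unchanged; only the range itself is replaced.
    static byte[] replace(byte[] doc, int offset, int length, byte[] replacement) {
        ByteArrayOutputStream out =
                new ByteArrayOutputStream(doc.length - length + replacement.length);
        out.write(doc, 0, offset);
        out.write(replacement, 0, replacement.length);
        out.write(doc, offset + length, doc.length - offset - length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] doc = "<a><b>old</b><c/></a>".getBytes(StandardCharsets.UTF_8);
        // The element <b>old</b> occupies bytes 3..12; its text occupies 6..8.
        byte[] element = extract(doc, 3, 10);
        byte[] updated = replace(doc, 6, 3, "new!".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(element, StandardCharsets.UTF_8));
        System.out.println(new String(updated, StandardCharsets.UTF_8));
    }
}
```

Since the extracted bytes are exactly those of the original document, a signature computed over them refers to the element as it was actually serialized.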

In upcoming releases, we plan to add persistence support so
that one can save/load VTD records to/from disk along with the XML documents
to avoid repetitive parsing in read-only situations. XPath support is
also on the development roadmap. However, we would like to collect as
many suggestions and bug reports as possible before taking the next step.

Your input and suggestions are very important in making VTD-XML a truly
useful XML processor.

Thanks,

Jimmy Zhang


 