Problem: XSLT on a large XML using Java results in OutOfMemory error

 
 
Lenny Wintfeld
      05-17-2006
Hi

I'm attempting additions/changes to a Java program that (among other
things) uses XSLT to transform a large (96 MB) XML file. It runs fine on
small XML files but generates OutOfMemory exceptions with large XML
files. I tried a simple punt of -Xmx512MB but that didn't work. In the
future, the input XML file may become considerably bigger than 96 MB, so
even if it did work, it probably would be putting off the inevitable to
some later date.
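
Before ruling out a larger heap, it may be worth confirming that the setting actually reached the JVM. As far as I know, the heap flag takes a single-letter unit suffix (for example -Xmx512m rather than -Xmx512MB). A minimal sketch, where the class name HeapCheck is only an illustration and not part of the original program:

```java
class HeapCheck {
    // Returns the maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with the same -Xmx flag as the real program and compare.
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

If the printed value is far below the requested size, the flag was probably not picked up by the launcher that starts the program.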

I'm using JavaSE 1.4.2_11 and the XSL/XML libraries that come with it.
The conversion is from and to an xml file. The code I inherited looks a
lot like most of the example code you can find on the net for doing an
XSLT transformation. The relevant part is:

TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer(xsltSource);
transformer.transform(new StreamSource(new StringReader(x)), xsltDest);

where xsltSource is XSLT in the form of a string, generated by code
immediately above the snip shown, and the "x" is the input xml to be
transformed.
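
For reference, here is a self-contained sketch of the same pattern that can be run on a tiny input. The stylesheet, the sample XML and the class name StringTransformSketch are made up for illustration; only the JAXP calls mirror the snippet above. Note that with this pattern the whole input document already sits in memory as a String (two bytes per character) before the transformation even starts.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StringTransformSketch {
    // Hypothetical stylesheet: reports how many <item> elements the input has.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>"
      + "<count><xsl:value-of select='count(//item)'/></count>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms an XML string with an XSLT string and returns the result.
    static String transform(String xslt, String xml) throws TransformerException {
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer transformer =
            tf.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(XSLT, "<list><item/><item/><item/></list>"));
    }
}
```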

Things I tried:

1. I modified the above code to use a file instead of a String as the
XML to be transformed and a file for the XSLT that specifies the
transformation. It works fine with small XML input files but not with
large ones. I assume this code is using the DOM parser, and there is
simply not enough room in memory to house the input XML file.

2. Based on some old (years old) newsgroup posts I found, I tried using
a SAX equivalent of the above code, assuming that SAX takes in, parses
and transforms the input XML file either piecemeal (maybe element by
element?) or that SAX uses the complete virtual memory of the computer.
But this code also results in successful runs on small input XML files
and OutOfMemory errors on large ones. Here is a snip of the SAX code
(adapted from a chapter of Burke's "XSLT and Java" at the O'Reilly
website):

FileInputStream brXSLT = new FileInputStream(
    "C:/Documents and Settings/Lenny/Desktop/OCCxsl.xsl");

// Set up the transformer
TransformerFactory transFact = TransformerFactory.newInstance();
SAXTransformerFactory saxTransFact = (SAXTransformerFactory) transFact;
Source xsltSource = new StreamSource(brXSLT);
TransformerHandler transHand =
    saxTransFact.newTransformerHandler(xsltSource);

// Set up input source
InputSource inxml = new InputSource(inXML);
SAXSource saxSource = new SAXSource(inxml);

// Set the destination for the XSLT transformation
transHand.setResult(new StreamResult(outXML));

// attach the XSLT processor to the XMLReader
String parserClass = "org.apache.crimson.parser.XMLReaderImpl";
XMLReader reader = XMLReaderFactory.createXMLReader(parserClass);

//parse the input file to an output file
reader.setContentHandler(transHand);
reader.parse(inxml);


I'm considering making a custom parser of the input XML file which
basically identifies elements of the input XML file and treats each
element as if it were a complete document, e.g. send the content handler
ch.startDocument()
ch.startElement(..) // pass through the original element
ch.characters(..) // "
ch.endElement(..) // "
ch.endDocument()
for each element in the input XML file.
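
The idea above can be sketched roughly as follows. This is only an illustration under several assumptions: the input is a root element holding many independent child "records", the stylesheet only looks inside one record at a time (no counting, sorting or lookups across records), and namespaces declared on the root are ignored. The class name RecordSplitter and the run helper are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class RecordSplitter extends DefaultHandler {
    private final SAXTransformerFactory factory;
    private final Templates templates;   // compiled stylesheet, reused per record
    private final Writer out;
    private TransformerHandler current;  // non-null only while inside a record
    private int depth = 0;               // 0 = outside root, 1 = inside root

    RecordSplitter(SAXTransformerFactory factory, Templates templates, Writer out) {
        this.factory = factory;
        this.templates = templates;
        this.out = out;
    }

    public void startElement(String uri, String local, String qName,
                             Attributes atts) throws SAXException {
        if (depth == 1) {
            // A direct child of the root starts: treat it as a new document.
            try {
                current = factory.newTransformerHandler(templates);
            } catch (TransformerConfigurationException e) {
                throw new SAXException(e);
            }
            current.setResult(new StreamResult(out));
            current.startDocument();
        }
        if (current != null) current.startElement(uri, local, qName, atts);
        depth++;
    }

    public void characters(char[] ch, int start, int length) throws SAXException {
        if (current != null) current.characters(ch, start, length);
    }

    public void endElement(String uri, String local, String qName)
            throws SAXException {
        depth--;
        if (current != null) current.endElement(uri, local, qName);
        if (depth == 1 && current != null) {
            // The record is complete: run the transformation, then drop it.
            current.endDocument();
            current = null;
        }
    }

    // Parses xml and transforms each child of the root separately.
    static String run(String xslt, String xml) throws Exception {
        SAXTransformerFactory stf =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        Templates templates =
            stf.newTemplates(new StreamSource(new StringReader(xslt)));
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        StringWriter out = new StringWriter();
        reader.setContentHandler(new RecordSplitter(stf, templates, out));
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }
}
```

Only one record is held by the XSLT processor at a time, so memory use should depend on the largest record rather than on the whole file. The output is a sequence of fragments, so a real program would still need to write a wrapping root element around them.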

But being a newbie to XSLT, I don't know if this is worth pursuing, or
even if it would work; I'm hoping there are simpler, more straightforward
ways of accomplishing the same thing and at a higher level. It does seem
pretty clumsy, even if it would work.

I found a reply on the web to someone who had a similar problem, to the
effect that a "SAX pipeline" should be used. But there was no further
elaboration, and so far, I haven't figured out what a SAX Pipeline is or
how it would help.
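
As far as I can tell, a "SAX pipeline" in JAXP usually means chaining handlers so that the SAX events produced by one stage feed the next stage directly, without writing an intermediate file. A minimal sketch with two XSLT stages follows; the class name SaxPipelineSketch and the two stylesheets passed in are hypothetical. One caveat: each XSLT stage will most likely still build a full tree of its own input, so a pipeline of XSLT stages alone may not lower memory use. The gain would come from putting a plain SAX stage in front that trims or splits the data.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class SaxPipelineSketch {
    // Runs xml through xslt1, then feeds that result into xslt2.
    static String run(String xslt1, String xslt2, String xml) throws Exception {
        SAXTransformerFactory stf =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler first =
            stf.newTransformerHandler(new StreamSource(new StringReader(xslt1)));
        TransformerHandler second =
            stf.newTransformerHandler(new StreamSource(new StringReader(xslt2)));

        // Wire the stages: parser -> first -> second -> output writer.
        StringWriter out = new StringWriter();
        first.setResult(new SAXResult(second));
        second.setResult(new StreamResult(out));

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(first);
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }
}
```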

Any advice, or references to examples, or actual examples would be
greatly appreciated.

Also, this problem seems to fall in a crack between comp.text.xml and
comp.lang.java.programmer. Do you think it's better addressed at the
other group?

Non-procedural programming is taking quite a bit of effort to
understand!

Thanks in advance for your help.

Lenny Wintfeld



 