Velocity Reviews - Computer Hardware Reviews

Using Xerces SAX to parse just part of an input stream?

 
 
Nobody
      05-09-2006
I'm trying to put together code to handle a SOAP with Attachments
response, and I'd like to process the response in a single pass. A
SOAP with Attachments response carries its XML inside a MIME multipart
message, so it looks like this:

--4389012.48390
Content-Type: text/xml

<?xml version="1.0" encoding="UTF-8"?>
<soap-env:Envelope
xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/">
....snip...
</soap-env:Envelope>
--4389012.48390
Content-Type: text/xml
Content-Id: RootNode

<?xml version="1.0" encoding="UTF-8"?><RootNode>
... snip ...
</RootNode>
--4389012.48390--

So what I'd LIKE to be able to do is to parse the incoming input stream
up to the <?xml> declaration, hand the input stream over to a SAX
parser, let it parse to the end of the document, and then have it
return at the end so I can continue parsing the same input stream.

The problem is that "SAXParser.parse( new InputSource( inputStream ),
handler );" appears to want to consume the input stream until it
reaches EOF on the input stream (which, when given the input stream
above, fails with the error message "Content is not allowed in trailing
section."). Is this something I can work around in Xerces, or is there
a better SAX implementation that will let me tell the parser to stop
when it reaches the last element?
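
The behavior described above can be reproduced with a minimal sketch like the one below. It assumes the JAXP parser bundled with the JDK (which is derived from Xerces); the class and method names are hypothetical and exist only for illustration. The input is a shortened version of the second MIME part, followed by the closing boundary line.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Xerces or of the original post.
final class TrailingContentDemo {

    /** Returns true if the parser rejects the text that follows the root element. */
    static boolean failsOnTrailingBoundary() throws Exception {
        String data = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><RootNode>\n"
                + "</RootNode>\n"
                + "--4389012.48390--\n"; // MIME boundary after the document
        InputStream in = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(in), new DefaultHandler());
            return false; // parser accepted the trailing text
        } catch (SAXParseException e) {
            return true;  // parser reported a fatal error for the trailing text
        }
    }
}
```

The parser keeps reading past </RootNode> because it has no way to know the stream contains anything other than one XML document.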

 
Joe Kesselman
      05-09-2006
Nobody wrote:
> The problem is that "SAXParser.parse( new InputSource( inputStream ),
> handler );" appears to want to consume the input stream until it
> reaches EOF on the input stream (which, when given the input stream
> above, fails with the error message "Content is not allowed in trailing
> section.").


Unfortunately, the XML specification does say that nothing other than
comments, processing instructions, and whitespace may follow the
document element.

Possible solution: Create a stream filter that you give the boundary
string ("--4389012.48390") seen at the start of the enclosed message,
and that delivers characters only until it sees the corresponding
"--4389012.48390" mark at the end, returning EOF thereafter. Run the
parser from that filter stream rather than directly from your original
input stream.

In other words, sweep the issue under the carpet so the parser doesn't
have to see it.
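
A minimal sketch of such a filter stream follows. The class name BoundaryInputStream is hypothetical, and the sketch makes some simplifying assumptions: the boundary always begins a line (as in the example message), no line of the XML itself begins with the boundary string, and the underlying stream is buffered, since it is read one byte at a time. It extends InputStream rather than FilterInputStream so that close() is a no-op; some parsers close the stream they are given, and the shared stream must stay open for the next MIME part.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: exposes one MIME part and reports EOF at the boundary line.
final class BoundaryInputStream extends InputStream {
    private final InputStream in;       // shared stream; never closed by this class
    private final byte[] marker;        // e.g. "--4389012.48390"
    private byte[] line = new byte[0];  // current line taken from the shared stream
    private int pos = 0;                // next unread byte within 'line'
    private boolean done = false;       // set once the boundary or real EOF is seen

    BoundaryInputStream(InputStream in, String boundary) {
        this.in = in;
        this.marker = boundary.getBytes(StandardCharsets.US_ASCII);
    }

    /** Loads the next line (including its '\n'); returns false at boundary or EOF. */
    private boolean fill() throws IOException {
        if (done) return false;
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            buf.write(b);
            if (b == '\n') break;
        }
        line = buf.toByteArray();
        pos = 0;
        if (line.length == 0 || startsWithMarker(line)) {
            done = true;
            line = new byte[0]; // the boundary line itself is swallowed
            return false;
        }
        return true;
    }

    private boolean startsWithMarker(byte[] data) {
        if (data.length < marker.length) return false;
        for (int i = 0; i < marker.length; i++) {
            if (data[i] != marker[i]) return false;
        }
        return true;
    }

    @Override
    public int read() throws IOException {
        if (pos >= line.length && !fill()) return -1;
        return line[pos++] & 0xFF;
    }

    @Override
    public int read(byte[] dst, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (pos >= line.length && !fill()) return -1;
        int n = Math.min(len, line.length - pos); // hand out at most one line per call
        System.arraycopy(line, pos, dst, off, n);
        pos += n;
        return n;
    }
}
```

After consuming the part headers yourself, you would call something like parser.parse(new InputSource(new BoundaryInputStream(in, "--4389012.48390")), handler). When parse() returns, the original stream is positioned just after the boundary line, so the next part's headers can be read from it. The sketch does not distinguish the closing "--4389012.48390--" line from an ordinary boundary; the caller would need to track that separately.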
 
Nobody
      05-09-2006
Thanks - that was pretty much what I've come up with, although I was
hoping for something simpler. Of course, it doesn't look like writing
a SAX parser is all THAT hard...

 
Joseph Kesselman
      05-09-2006
Nobody wrote:
> Thanks - that was pretty much what I've come up with, although I was
> hoping for something simpler. Of course, it doesn't look like writing
> a SAX parser is all THAT hard...


XML 1.0 was designed so that writing a parser would be about the right
size for a student project.

Of course, that was before namespaces, schemas, and other things were
added to the mix.

Experience has shown that this is very much a 90/10 problem. You can get
90% of the behavior for 10% of the effort; the other 10% takes the other
90% (or more) of the effort. And making it perform well can add yet
another 90%...


--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
 
