Tree splitting/merging

 
 
William Ahern
Guest
Posts: n/a
 
      11-04-2003
I'm looking for resources on splitting and merging XML trees. Specifically,
on methods to pare large XML documents into smaller documents which can be
merged later.

Off the top of my head, I can envision unions of node sets and unions of
node text. But I know there's much more to the subject than that, if not
in alternative approaches then in greater technical detail.

TIA,

Bill
 
 
 
 
 
sylvain.loiseau
Guest
Posts: n/a
 
      11-04-2003
> I'm looking for resources on splitting and merging XML trees. Specifically,
> on methods to pare large XML documents into smaller documents which can be
> merged later.


I have something for a problem (perhaps) close to yours: I need to perform
XSLT transformations on very large documents which don't fit in memory. I
use a SAX parser with three XMLFilters (concretely, subclasses of
org.xml.sax.helpers.XMLFilterImpl). The first filter "splits" the stream (i.e.
it fires "start document" and "end document" events) when it encounters a
specific startElement and endElement. So the next filter receives several
(smaller) documents, one at a time. This second filter is a TransformerHandler,
which performs the transformation. It then passes the events to a last filter,
a "merger", which discards the startDocument and endDocument events except the
very first and the very last ones.
I was inspired by a Perl module by Barrie Slaymaker.
(Incidentally, I noticed that there is nothing for Java as convenient as
the XML::SAX::Pipeline Perl module.)
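
The split and merge stages described above might be sketched roughly as follows. This is only an illustration under assumptions, not the actual code from the post: the class names SplitMergeSketch, Splitter and Merger, the wrapper element "merged", and the explicit finish() call are all hypothetical, and the TransformerHandler stage in the middle is left out so the example stays self-contained. It also only handles elements and character data and matches on the qualified element name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class SplitMergeSketch {

    /** Re-emits each element named splitName (with its subtree) as its own small document. */
    static class Splitter extends XMLFilterImpl {
        private final String splitName;
        private int depth = 0; // greater than 0 while inside a split element

        Splitter(XMLReader parent, String splitName) {
            super(parent);
            this.splitName = splitName;
        }

        @Override public void startDocument() { /* swallow the outer document start */ }
        @Override public void endDocument() { /* swallow the outer document end */ }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (depth == 0 && !qName.equals(splitName)) return; // outside any chunk: drop
            if (depth == 0) super.startDocument();              // a new chunk begins
            depth++;
            super.startElement(uri, local, qName, atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (depth == 0) return;                             // outside any chunk: drop
            super.endElement(uri, local, qName);
            if (--depth == 0) super.endDocument();              // the chunk is complete
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (depth > 0) super.characters(ch, start, length);
        }
    }

    /** Forwards only the first startDocument; finish() emits the single endDocument. */
    static class Merger extends XMLFilterImpl {
        private boolean started = false;

        Merger(XMLReader parent) { super(parent); }

        @Override
        public void startDocument() throws SAXException {
            if (started) return;
            started = true;
            super.startDocument();
            // Assumption: wrap the chunks in one root so the merged result is well-formed.
            super.startElement("", "merged", "merged", new AttributesImpl());
        }

        @Override public void endDocument() { /* dropped; finish() closes the document */ }

        void finish() throws SAXException {
            if (!started) return;
            super.endElement("", "merged", "merged");
            super.endDocument();
        }
    }

    /** Runs the pipeline and returns a textual trace of the events reaching the end. */
    public static String trace(String xml, String splitName, boolean merge) throws Exception {
        StringBuilder out = new StringBuilder();
        DefaultHandler recorder = new DefaultHandler() {
            @Override public void startDocument() { out.append("[doc]"); }
            @Override public void endDocument() { out.append("[/doc]"); }
            @Override public void startElement(String u, String l, String q, Attributes a) {
                out.append('<').append(q).append('>');
            }
            @Override public void endElement(String u, String l, String q) {
                out.append("</").append(q).append('>');
            }
            @Override public void characters(char[] ch, int s, int n) { out.append(ch, s, n); }
        };
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        Splitter splitter = new Splitter(parser, splitName);
        InputSource input = new InputSource(new StringReader(xml));
        if (merge) {
            Merger merger = new Merger(splitter);
            merger.setContentHandler(recorder);
            merger.parse(input);
            merger.finish();
        } else {
            splitter.setContentHandler(recorder);
            splitter.parse(input);
        }
        return out.toString();
    }
}
```

Calling trace("<root><rec>a</rec><rec>b</rec></root>", "rec", false) shows each rec element arriving as a separate document, while passing true shows a single document again. In a fuller pipeline the transforming stage would sit between the two filters, and the merger would need some other way to learn that the last chunk has passed.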

In fact, I came to this list with a question close to this one; it's in
a new thread...

> Off of the top of my head, I can envision unions of node sets, and unions of
> node text. But I know there's much more to the subject than that, if not
> more alternatives than greater technical detail.


Which level of well-formedness does your merging problem involve, i.e. do you
only want to add nodes to existing nodes in a DOM fashion (for which you just
need the standard methods of the Node interface), or do you want to insert
mixed content while checking for well-formedness, tag nesting, etc.?
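
For the simpler DOM case, a minimal sketch using only standard org.w3c.dom calls (importNode and appendChild) could look like this. The class name DomMergeSketch and its helper methods are hypothetical, and the sketch assumes both documents share a compatible root element and that plain appending, with no duplicate detection, is what is wanted.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomMergeSketch {

    static Document parse(DocumentBuilder builder, String xml) throws Exception {
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Appends copies of every child of source's root element to target's root element. */
    public static void mergeInto(Document target, Document source) {
        Node targetRoot = target.getDocumentElement();
        for (Node child = source.getDocumentElement().getFirstChild();
             child != null;
             child = child.getNextSibling()) {
            // importNode makes a deep copy owned by the target document;
            // the source document is left untouched.
            targetRoot.appendChild(target.importNode(child, true));
        }
    }

    /** Parses both strings, merges b into a, and returns the merged root's child count. */
    public static int mergedChildCount(String a, String b) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document target = parse(builder, a);
        mergeInto(target, parse(builder, b));
        return target.getDocumentElement().getChildNodes().getLength();
    }
}
```

The import step matters because a DOM node belongs to the document that created it; appending a node from another document directly is expected to fail with a WRONG_DOCUMENT_ERR.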





 
 
 
 
 
William Ahern
Guest
Posts: n/a
 
      11-04-2003
sylvain.loiseau <(E-Mail Removed)> wrote:
>> I'm looking for resources on splitting and merging XML trees. Specifically,
>> on methods to pare large XML documents into smaller documents which can be
>> merged later.
>
> I have something for a problem (perhaps) close to yours: I need to perform
> XSLT transformations on very large documents which don't fit in memory. I
> use a SAX parser with three XMLFilters (concretely, subclasses of
> org.xml.sax.helpers.XMLFilterImpl). The first filter "splits" the stream (i.e.
> it fires "start document" and "end document" events) when it encounters a
> specific startElement and endElement. So the next filter receives several
> (smaller) documents, one at a time. This second filter is a TransformerHandler,
> which performs the transformation. It then passes the events to a last filter,
> a "merger", which discards the startDocument and endDocument events except the
> very first and the very last ones.
> I was inspired by a Perl module by Barrie Slaymaker.
> (Incidentally, I noticed that there is nothing for Java as convenient as
> the XML::SAX::Pipeline Perl module.)


Right after posting I tripped over the XPipe project (http://xpipe.sf.net/).
XPipe associates this with the scatter/gather pattern, and they seem to have
put a lot of thought into the issues. Specifically, they elaborate on the
notion of "fulcra", or the node depth, I suppose you could call it, at which a
document can be split. You've probably already thought this through, but
maybe you can find more info on that site. They have code and list
discussions you can wade through.

- Bill
 
 
sylvain.loiseau
Guest
Posts: n/a
 
      11-04-2003
Thanks, it looks very interesting.

Sylvain
 