Cut up XML

 
 
Adam
 
      04-05-2004
Hi

I have some large XML files and need to produce a website from them.
To do that, I need to cut them up into smaller sections and produce
navigation between all of the sections.

For example:

doc1.xml needs to be cut up into:

doc1_a.xml
doc1_b.xml
doc1_c.xml
doc1_d.xml

The XML is simple; there are only 12 tags. What I am after is a way
to count characters up to, say, 500, find the closest <aheader> tag,
cut just above it, and write out an XML file, then start counting
again from that <aheader> tag and repeat.

i.e.
doc1.xml =

<root>
<aheader>Blar…Blarr…</aheader>
<bheader>Blar…Blarr…</bheader>
<bodytext>Blar…Blarr…</bodytext>
<bodytext>Blar…Blarr…</bodytext>
<bodytext>Blar…Blarr…</bodytext>
<quote>Blar…Blarr…</quote>
<bodytext>Blar…Blarr…</bodytext>

<!-- Cut here (this line is not in the XML) -->

<aheader>Blar…Blarr…</aheader>
<bheader>Blar…Blarr…</bheader>
<bodytext>Blar…Blarr…</bodytext>
<bodytext>Blar…Blarr…</bodytext>
</root>
-------------------------------------------------

and produce 2 files like this:

doc1_a.xml=

<root>
<aheader>Blar…Blarr…</aheader>
<bheader>Blar…Blarr…</bheader>
<bodytext>Blar…Blarr…</bodytext>
<bodytext>Blar…Blarr…</bodytext>
<bodytext>Blar…Blarr…</bodytext>
<quote>Blar…Blarr…</quote>
<bodytext>Blar…Blarr…</bodytext>
</root>


doc1_b.xml=

<root>
<aheader>Blar…Blarr…</aheader>
<bheader>Blar…Blarr…</bheader>
<bodytext>Blar…Blarr…</bodytext>
<bodytext>Blar…Blarr…</bodytext>
</root>


Can this be done, and if so, how? I know a bit of XSL. Is there a
program that does this already?

Also, when this is done, I need a navigation page that reflects the
structure of my files.

I have a friend who says this can be done in Microsoft C sharp, but I
thought that was music (joke).

Thanks for any help
 
 
 
 
 
Gadrin77
 
      04-06-2004
Adam wrote:

> doc1_a.xml=
>
> <root>
> <aheader>Blar…Blarr…</aheader>
> <bheader>Blar…Blarr…</bheader>
> <bodytext>Blar…Blarr…</bodytext>
> <bodytext>Blar…Blarr…</bodytext>
> <bodytext>Blar…Blarr…</bodytext>
> <quote>Blar…Blarr…</quote>
> <bodytext>Blar…Blarr…</bodytext>
> </root>
>
>
> doc1_b.xml=
>
> <root>
> <aheader>Blar…Blarr…</aheader>
> <bheader>Blar…Blarr…</bheader>
> <bodytext>Blar…Blarr…</bodytext>
> <bodytext>Blar…Blarr…</bodytext>
> </root>
>
>
> Can this be done, and if so, how? I know a bit of XSL. Is there a
> program that does this already?
>
> Also, when this is done, I need a navigation page that reflects the
> structure of my files.



Using XMLDOM might be easiest, or you could treat the file like a
.txt file and read it line by line. Concatenate each line into a
string variable and keep track of its length. As long as your
documents look like your examples (all the children of the root are
on the same level), it should be relatively easy. You just have to
know what the ROOT tag of each file is; when you reach your magic
number, place the ROOT tags around the string, then write it to a
file. VBScript or VBA should do it pretty easily. I use Winbatch,
which is somewhat similar to JScript.

You'll have to skip the lines containing the original ROOT tags,
since you add your own around each chunk.
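
To make the line-by-line idea concrete, here is a rough sketch in Python rather than VBScript or Winbatch. It assumes one element per line, as in the examples, and it cuts above the first <aheader> line seen once the running length has reached the limit, which is a simplification of "find the closest <aheader>". The function name and its parameters are made up for illustration.

```python
def split_by_lines(text, root="root", marker="<aheader>", limit=500):
    """Split flat, one-element-per-line XML text into smaller documents.

    Assumes every child of the root sits on its own line. Cuts are only
    made directly above a line that starts with `marker`.
    """
    open_tag, close_tag = f"<{root}>", f"</{root}>"
    chunks, current, size = [], [], 0
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and the original ROOT tags; each chunk gets its own.
        if not stripped or stripped in (open_tag, close_tag):
            continue
        # Once the running length has reached the limit, cut above the header.
        if stripped.startswith(marker) and size >= limit and current:
            chunks.append(current)
            current, size = [], 0
        current.append(stripped)
        size += len(stripped)
    if current:
        chunks.append(current)
    # Wrap every chunk in its own ROOT tags.
    return ["\n".join([open_tag, *chunk, close_tag]) + "\n" for chunk in chunks]
```

Each returned string is a complete small document, so it can be written straight to doc1_a.xml, doc1_b.xml, and so on. A chunk can run well past the limit if no <aheader> turns up for a while.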

Anyway, whenever you write out a subfile, add its name to a list or
array; you can then build a list of links from it.
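
Keeping the names as you go might look something like this in Python. It is only a sketch: write_parts and build_nav are hypothetical helper names, the _a, _b, ... naming follows the doc1_a.xml pattern from the question, and the navigation page is assumed to be a plain HTML list of links.

```python
import string
from html import escape
from pathlib import Path


def write_parts(parts, stem, out_dir="."):
    """Write each chunk to <stem>_a.xml, <stem>_b.xml, ... and return the names."""
    if len(parts) > len(string.ascii_lowercase):
        raise ValueError("this sketch only names parts a to z")
    names = []
    for letter, part in zip(string.ascii_lowercase, parts):
        name = f"{stem}_{letter}.xml"
        Path(out_dir, name).write_text(part, encoding="utf-8")
        names.append(name)  # remember the name for the navigation page
    return names


def build_nav(names, title="Contents"):
    """Return a minimal HTML page with one link per written file."""
    items = "\n".join(
        f'  <li><a href="{escape(name, quote=True)}">{escape(name)}</a></li>'
        for name in names
    )
    return (
        f"<html><body>\n<h1>{escape(title)}</h1>\n"
        f"<ul>\n{items}\n</ul>\n</body></html>\n"
    )
```

The string from build_nav can be saved as something like index.html next to the subfiles. Links to raw .xml files are only a placeholder here; if the pieces get transformed to HTML later, the names in the list would change to match.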

If your ROOT's children aren't all on the same level, it gets more
complex, since a cut might leave off a closing tag. In that case I'd
use the XMLDOM and step through the children, checking the size of
each one's inner XML, then write them out.
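
For the DOM route, a rough equivalent using Python's standard xml.etree.ElementTree in place of the Microsoft XMLDOM could look like the sketch below. It measures each child by its serialized length and only cuts between top-level children, so nested tags stay intact. The function name, the marker tag, and the limit are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def split_with_dom(xml_text, marker="aheader", limit=500):
    """Split a document between top-level children, cutting above `marker`."""
    root = ET.fromstring(xml_text)
    parts, current, size = [], ET.Element(root.tag), 0
    for child in root:
        # Serialized length of this child, including anything nested in it.
        child_size = len(ET.tostring(child, encoding="unicode"))
        # Start a new part at a header once the running size reaches the limit.
        if child.tag == marker and size >= limit and len(current):
            parts.append(current)
            current, size = ET.Element(root.tag), 0
        current.append(child)
        size += child_size
    if len(current):
        parts.append(current)
    # Each part is a new root element holding a run of the original children.
    return [ET.tostring(part, encoding="unicode") for part in parts]
```

Because whole elements are moved, a cut can never land inside a child and drop its closing tag. Attributes on the original root element are not copied in this sketch.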

First step: back up the original files.

I'd also do the first 3 or 4 files by hand and see what you come up
with. Then write your script, test it, and see how close it comes to
what you did by hand. Then decide whether you need more coding or
it's time to go.

Don't forget: back up!
 