XSLT to transform a "flat" XML file into a structured text file

 
 
R. P.
 
      06-21-2006
Subject: XSLT to transform a flat XML file into a structured text file

I have an XML file that lists the PDF file segment names and titles of a
larger document and looks something like this:

<DOCUMENT>
......
...... some lead elements
......
<SEGMENT_LIST>
<SEGMENT FILE="fwd.pdf">Foreword</SEGMENT>
<SEGMENT FILE="chap1.pdf">Chapter 1</SEGMENT>
<SEGMENT FILE="chap2.pdf">Chapter 2</SEGMENT>
<SEGMENT FILE="chap3.pdf">Chapter 3</SEGMENT>
<SEGMENT FILE="v1fwd.pdf" VOLUME="Volume 1">Foreword</SEGMENT>
<SEGMENT FILE="v1defs.pdf" VOLUME="Volume 1">Definitions</SEGMENT>
<SEGMENT FILE="v1meth.pdf" VOLUME="Volume 1">Methodology</SEGMENT>
<SEGMENT FILE="v1sachap1.pdf" VOLUME="Volume 1" GROUP="Section A">Chapter 1A</SEGMENT>
<SEGMENT FILE="v1sachap2.pdf" VOLUME="Volume 1" GROUP="Section A">Chapter 2A</SEGMENT>
<SEGMENT FILE="v1sachap3.pdf" VOLUME="Volume 1" GROUP="Section A">Chapter 3A</SEGMENT>
<SEGMENT FILE="v1sbchap1.pdf" VOLUME="Volume 1" GROUP="Section B">Chapter 1B</SEGMENT>
<SEGMENT FILE="v1sbchap2.pdf" VOLUME="Volume 1" GROUP="Section B">Chapter 2B</SEGMENT>
<SEGMENT FILE="v1sbchap3.pdf" VOLUME="Volume 1" GROUP="Section B">Chapter 3B</SEGMENT>
<SEGMENT FILE="appa.pdf" GROUP="Appendices">Appendix A</SEGMENT>
<SEGMENT FILE="appb.pdf" GROUP="Appendices">Appendix B</SEGMENT>
<SEGMENT FILE="appc.pdf" GROUP="Appendices">Appendix C</SEGMENT>
</SEGMENT_LIST>
</DOCUMENT>

I need to transform the SEGMENT_LIST elements into a structured text
file that another application will use to construct the table of
contents (TOC). The file would be a vertical-bar (|) separated list of
PDF segment file names and their titles, with a single-digit TOC
indentation level indicator in the first position, like so:

1|fwd.pdf|Foreword
1|chap1.pdf|Chapter 1
1|chap2.pdf|Chapter 2
1|chap3.pdf|Chapter 3
1||Volume 1
2|v1fwd.pdf|Foreword
2|v1defs.pdf|Definitions
2|v1meth.pdf|Methodology
2||Section A
3|v1sachap1.pdf|Chapter 1A
3|v1sachap2.pdf|Chapter 2A
3|v1sachap3.pdf|Chapter 3A
2||Section B
3|v1sbchap1.pdf|Chapter 1B
3|v1sbchap2.pdf|Chapter 2B
3|v1sbchap3.pdf|Chapter 3B
1||Appendices
2|appa.pdf|Appendix A
2|appb.pdf|Appendix B
2|appc.pdf|Appendix C

I think you can imagine from the transformed file what the TOC would
look like:

Foreword
Chapter 1
Chapter 2
Chapter 3
Volume 1
  Foreword
  Definitions
  Methodology
  Section A
    Chapter 1A
    Chapter 2A
    Chapter 3A
  Section B
    Chapter 1B
    Chapter 2B
    Chapter 3B
Appendices
  Appendix A
  Appendix B
  Appendix C

I find it easy to write an XSLT stylesheet that creates the first 4
lines of the output file, where the source XML has neither of the
optional VOLUME and GROUP attributes:

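<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:output method="text"/>

  <!-- Visit only the children of SEGMENT_LIST, skipping the lead elements -->
  <xsl:template match="/">
    <xsl:apply-templates select="/DOCUMENT/SEGMENT_LIST/*" />
  </xsl:template>

  <!-- One output line per segment: level|file|title (level fixed at 1) -->
  <xsl:template match="SEGMENT">
    <xsl:text>1|</xsl:text>
    <xsl:value-of select="@FILE"/>
    <xsl:text>|</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>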

However, I have no idea how to transform the rest of the XML, because I
don't know how to turn those attributes into Volume, Section and
Appendices header lines in the output file, one for each run of segments
sharing the same attribute value, with the proper indent level numbers.

Any suggestion would be greatly appreciated.

Rudy

 
Joris Gillis
 
      06-21-2006
On Wed, 21 Jun 2006 04:07:44 +0200, R. P. <(E-Mail Removed)> wrote:

> Subject: XSLT to transform a flat XML file into a structured text file


You should probably look for a solution involving "multi-level
grouping", possibly using the Muenchian method (xsl:key plus
generate-id())...

In the meantime, you could try this quick-and-dirty solution
(I wouldn't use it in a production environment):

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="text"/>

<xsl:template match="SEGMENT">
<xsl:variable name="this" select="@*[not(name()='FILE')]"/>
<xsl:variable name="that"
select="preceding-sibling::SEGMENT[1]/@*[not(name()='FILE')]"/>

<xsl:if test="$this[not(.=$that)] or count($this)!=count($that)">
<xsl:value-of select="count($this)"/>||<xsl:value-of
select="$this[not(.=$that)]"/>
<xsl:text>&#10;</xsl:text>
</xsl:if>

<xsl:value-of select="count($this) + 1"/>|<xsl:value-of select="@FILE"/>
<xsl:text>|</xsl:text>
<xsl:value-of select="."/>
<xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
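
For comparison, here is a minimal sketch of what a key-based (Muenchian-style) version might look like in XSLT 1.0. It is only an illustration under several assumptions: the sample's mismatched quotes and the GROU2P typo are corrected, segments that share a VOLUME or GROUP value are adjacent in document order, and there are only two header levels (VOLUME, then GROUP). The key names by-volume and by-group are made up for this example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:output method="text"/>

  <!-- Hypothetical key names: index segments by volume, and by volume + group -->
  <xsl:key name="by-volume" match="SEGMENT[@VOLUME]" use="@VOLUME"/>
  <xsl:key name="by-group" match="SEGMENT[@GROUP]"
      use="concat(@VOLUME, '|', @GROUP)"/>

  <!-- Visit the segments only, so the lead elements produce no output -->
  <xsl:template match="/">
    <xsl:apply-templates select="DOCUMENT/SEGMENT_LIST/SEGMENT"/>
  </xsl:template>

  <xsl:template match="SEGMENT">
    <!-- Volume header: emitted by the first segment found under this VOLUME key -->
    <xsl:if test="@VOLUME and
        generate-id() = generate-id(key('by-volume', @VOLUME)[1])">
      <xsl:text>1||</xsl:text>
      <xsl:value-of select="@VOLUME"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>

    <!-- Group header: first segment of this VOLUME + GROUP combination;
         its level is 2 inside a volume, 1 otherwise -->
    <xsl:if test="@GROUP and generate-id() =
        generate-id(key('by-group', concat(@VOLUME, '|', @GROUP))[1])">
      <xsl:value-of select="count(@VOLUME) + 1"/>
      <xsl:text>||</xsl:text>
      <xsl:value-of select="@GROUP"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>

    <!-- Segment line: one level deeper than the headers above it -->
    <xsl:value-of select="count(@VOLUME | @GROUP) + 1"/>
    <xsl:text>|</xsl:text>
    <xsl:value-of select="@FILE"/>
    <xsl:text>|</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Because the segments are walked in document order, the TOC order of the source is preserved; the keys are used only to decide which segment is the first of its volume or group and therefore prints the header. If segments of one group were scattered through the list, a header would still appear only once, at the first occurrence, which may or may not be what the consuming application expects.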


regards,
--
Joris Gillis (http://users.telenet.be/root-jg/me.html)
Gaudiam omnibus traderat W3C, nec vana fides
 
R. P.
 
      06-22-2006
"Joris Gillis" <(E-Mail Removed)> wrote:
>
> You should probably look for a solution involving "multi-level
> grouping", possibly using the Muenchian method...
>
> In the meantime, you could try this quick-and-dirty solution
> (I wouldn't use it in a production environment):


Thanks Joris, I wouldn't use it in production either, if for no other
reason than that it did not produce the desired results on my first
attempt. However, you pointed me in the right direction, especially by
giving me the term that describes my problem: "multi-level grouping."
I didn't know there was a name for it.

Regards,
Rudy

 
 
Joe Kesselman
 
      06-22-2006
In case you aren't aware of it: Check the XSLT FAQ website's grouping
and indexing pages; some of the techniques there are quite useful but
not at all obvious.

http://www.dpawson.co.uk/xsl/sect2/sect21.html
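
As one illustration of a grouping technique that needs no keys (a sketch written for this thread's sample, not copied from the FAQ), each segment can be compared with its immediate predecessor, and a header emitted whenever the VOLUME or GROUP value changes. It assumes the input is already in TOC order, that the attribute names are exactly VOLUME and GROUP (with the GROU2P typo fixed), and that there are only two header levels. The variable names prev and volume-changed are invented for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:output method="text"/>

  <!-- Visit the segments only, so the lead elements produce no output -->
  <xsl:template match="/">
    <xsl:apply-templates select="DOCUMENT/SEGMENT_LIST/SEGMENT"/>
  </xsl:template>

  <xsl:template match="SEGMENT">
    <!-- The segment just before this one (empty for the first segment) -->
    <xsl:variable name="prev" select="preceding-sibling::SEGMENT[1]"/>
    <!-- A missing attribute compares as the empty string -->
    <xsl:variable name="volume-changed"
        select="string(@VOLUME) != string($prev/@VOLUME)"/>

    <!-- Volume header when a new volume starts -->
    <xsl:if test="@VOLUME and $volume-changed">
      <xsl:text>1||</xsl:text>
      <xsl:value-of select="@VOLUME"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>

    <!-- Group header when the group changes, or when the volume around it changes -->
    <xsl:if test="@GROUP and ($volume-changed or
        string(@GROUP) != string($prev/@GROUP))">
      <xsl:value-of select="count(@VOLUME) + 1"/>
      <xsl:text>||</xsl:text>
      <xsl:value-of select="@GROUP"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>

    <!-- Segment line: level is 1 plus the number of headers above it -->
    <xsl:value-of select="count(@VOLUME | @GROUP) + 1"/>
    <xsl:text>|</xsl:text>
    <xsl:value-of select="@FILE"/>
    <xsl:text>|</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Unlike a key-based test, this variant repeats a header if the same volume or group reappears later in the list after an interruption, so it suits input where document order is the authoritative TOC order.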

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
 