Q: using xslt to "congeal" adjacent sections

 
 
Malcolm Dew-Jones
 
      02-27-2008
(First - my terminology may well be bogus, I hope you understand me
though.)

I have some ms word documents that will be used as the input for a
different purpose in a database. To ease this process I want to take
certain adjacent sections of the documents and "congeal" them into single
sections.

The format of the XML output from MS Word is simple to see, but I haven't
played with xslt for a while (and not much at that), so I am looking for
examples or suggestions of how to do the following.

For example, I have the following two "sections"


<w:r wsp:rsidR="00A5105E" wsp:rsidRPr="00EC0118">
<w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
<wx:font wx:val="Arial" />
<w:highlight w:val="yellow" />
</w:rPr>
<w:t>FIRST PART </w:t>
</w:r>
<w:r wsp:rsidR="00A61057">
<w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
<wx:font wx:val="Arial" />
<w:highlight w:val="yellow" />
</w:rPr>
<w:t>AND THE SECOND PART</w:t>
</w:r>

I want to end up with a single section that has the FIRST PART AND THE
SECOND PART combined. I don't think I need to care about the id numbers,
but even if I do I will worry about that later. The result would then look
like this

<w:r wsp:rsidR="*"> (the value of * doesn't matter to me yet)
<w:rPr>
<w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
<wx:font wx:val="Arial" />
<w:highlight w:val="yellow" />
</w:rPr>
<w:t>FIRST PART AND THE SECOND PART</w:t>
</w:r>

The thing that makes each section the same is that the <w:rPr>...</w:rPr>
are the same in the adjacent sections, and I only care about the sections
that have that exact formatting shown above (i.e. w:ascii="Arial" etc.) so
a transform could have those values hard coded if it makes it easier.

Anyway, as I said, examples or suggestions for setting up an xslt to do
this would be appreciated.

Thanks

 
Joseph Kesselman
 
      02-27-2008
Off-the-cuff answer:


The usual approach for this sort of thing is to write two templates to
handle the two distinct cases.

Start by figuring out a match pattern that selects all the elements
you're interested in.

Modify that to create two match patterns: one that matches the first
such instance (one with no immediately preceding matching sibling) and
one that matches all the others. (Or the last and all-the-rest; either way.)

Make a template fired by the first pattern that gathers the contents of
it and its adjacent matching siblings.

Make a template fired by the second pattern which discards the elements
which match it, since they were handled by the other template.

Plug those two into a stylesheet which handles the rest of the document,
typically the identity transformation.

Done.
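
A minimal, unverified sketch of those steps in XSLT 1.0, with the formatting
values hard coded as the question allows. Several things here are
assumptions rather than anything stated in the thread: the w: namespace URI
(taken to be the Word 2003 one; check it against the root element of your
document), the choice to test only w:ascii="Arial" and the yellow highlight,
the "gather" mode name, and the idea that each run holds nothing you need
besides <w:rPr> and <w:t>. It walks the sibling chain instead of using keys.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <!-- The w: URI above is an assumption; copy the one your document declares. -->

  <!-- Identity transformation: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Case 1: a matching run. Because case 2 has a higher priority, this
       fires only for the first run of each adjacent group. It keeps the
       first run's attributes and <w:rPr>, then gathers the text of the
       whole group into one <w:t>. Other children of the run are dropped. -->
  <xsl:template match="w:r[w:rPr/w:rFonts/@w:ascii='Arial'
                           and w:rPr/w:highlight/@w:val='yellow']">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:copy-of select="w:rPr"/>
      <w:t>
        <xsl:apply-templates select="." mode="gather"/>
      </w:t>
    </xsl:copy>
  </xsl:template>

  <!-- Case 2: a matching run whose nearest preceding element sibling is
       also a matching run. Its text was already gathered, so emit nothing. -->
  <xsl:template priority="2"
      match="w:r[w:rPr/w:rFonts/@w:ascii='Arial'
                 and w:rPr/w:highlight/@w:val='yellow']
                [preceding-sibling::*[1][self::w:r]
                   [w:rPr/w:rFonts/@w:ascii='Arial'
                    and w:rPr/w:highlight/@w:val='yellow']]"/>

  <!-- Gathering: output this run's text, then continue with the next
       element sibling only if it is another matching run. -->
  <xsl:template match="w:r" mode="gather">
    <xsl:for-each select="w:t">
      <xsl:value-of select="."/>
    </xsl:for-each>
    <xsl:apply-templates mode="gather"
        select="following-sibling::*[1][self::w:r]
                  [w:rPr/w:rFonts/@w:ascii='Arial'
                   and w:rPr/w:highlight/@w:val='yellow']"/>
  </xsl:template>

</xsl:stylesheet>
```

Any XSLT 1.0 processor should be able to run it, for example
"xsltproc congeal.xsl input.xml" (both file names are made up). Adjacency is
judged on element siblings only, so whitespace between runs does not break a
group, but any other element sitting between two runs does. Comparing the
complete <w:rPr> content instead of two hard-coded attributes is where keys
would likely become worthwhile.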


The XSLT FAQ website should have some examples. I suspect that, given the
complexity of what you're matching on, you'll want to take advantage of
keys.

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
 