
Literal &#10; (not newline)

 
 
will@thestranathans.com
 
      09-05-2006
I have an XML input that includes things like:

<foo>line of text&#10;another line of text&#10;yet another</foo>

And I want the entities PRESERVED (not translated) on the result,
so:

<bar>line of text&#10;another line of text&#10;yet another</bar>

I've tried <xsl:copy-of select="foo/text()" />, I've tried
<xsl:value-of select="foo" disable-output-escaping="yes" />, I've tried
<xsl:text disable-output-escaping="yes"><xsl:copy-of
select="foo/text()" /></xsl:text>, and it seems nothing works.

Strangely, &lt; will (with certain incantations of the above) be
preserved properly, but it seems that perhaps the PARSER is translating
the entities, not copying them. i.e., no matter what I try, the &#10;
from the input become newlines in the output.

I'm using Xerces J (a couple of different versions, with the same result).

Thanks smart people.

 
Joseph Kesselman
 
      09-05-2006
Per the XML spec, newlines are normalized as they are read in, and you
can't distinguish one representation from another. You may be able to
tell your serializer that you want all newlines output as &#10; ... but
it won't be able to tell those from other line breaks in your source file.

I'd recommend using semantic markup, such as an <lf/> element, to
represent this case, and postprocessing it to yield the desired
character. Or fixing whatever downstream tool is forcing you to worry
about the exact representation of line-break.

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
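
A minimal sketch of the serializer idea above, in Python rather than the Xerces J setup from the original question. The helper name serialize_with_newline_refs and the private-use placeholder character are made up for illustration, and the sketch assumes that character never occurs in the real data. It also shows the caveat: every newline in non-blank text comes out as &#10;, whether it was a reference or a literal line break in the source.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder: a private-use character assumed absent from the data.
PLACEHOLDER = "\ue000"


def serialize_with_newline_refs(root):
    """Serialize root, writing newlines inside non-blank text as &#10;.

    Note: this modifies the tree in place and only handles element text,
    not tail text that follows a child element.
    """
    for element in root.iter():
        # Skip whitespace-only text so indentation between elements is untouched.
        if element.text and element.text.strip():
            element.text = element.text.replace("\n", PLACEHOLDER)
    serialized = ET.tostring(root, encoding="unicode")
    # The placeholder passes through the serializer unescaped; swap it afterwards.
    return serialized.replace(PLACEHOLDER, "&#10;")


cell = ET.fromstring("<Cell><Data>Lenore&#10;Loved her job</Data></Cell>")
print(serialize_with_newline_refs(cell))
```

Swapping the placeholder after serialization sidesteps the serializer's own escaping, which would otherwise turn a literal "&#10;" in the text into "&amp;#10;".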
 
chiaman
 
      09-05-2006
Joseph Kesselman wrote:
> Per the XML spec, newlines are normalized as they are read in, and you
> can't distinguish one representation from another.

I was kinda' afraid of that.

> I'd recommend using semantic markup, such as an <lf/> element, to
> represent this case, and postprocessing it to yield the desired
> character. Or fixing whatever downstream tool is forcing you to worry
> about the exact representation of line-break.

Another option I'd love to be able to deal with - however, I'm neither
in control of the input format (a vendor tool that saves reports in XML
format) nor the desired output format (M$ Excel).

Thanks for the help.

 
Richard Tobin
 
      09-05-2006
In article <(E-Mail Removed) .com>,
chiaman <(E-Mail Removed)> wrote:
>Another option I'd love to be able to deal with - however, I'm neither
>in control of the input format (a vendor tool that saves reports in XML
>format) nor the desired output format (M$ Excel).


You haven't told us *why* you need the newlines as character
references (incidentally, they're not entities). When you say the
output format is Excel, what do you mean? An XML document that Excel
can process? If so, it shouldn't care about whether you use literal
newlines or a reference.

-- Richard
 
chiaman
 
      09-05-2006
Because the data that includes the embedded references is formatted.
So I want the references included in the excel so that the newlines
appear in the cell data when displayed in excel. (I know, pick a
better tool than excel).

For example, given the following:

<poem>
<lines>A unix salesperson, Lenore&#10;Loved her job, but loved the
beach more.&#10;She devised such a way&#10;to combine work and
play:&#10;She sells C-shells by the seashore</lines>
<author>Unknown</author>
</poem>

Translated into Excel:
<Cell><Data ss:Type="String">A unix salesperson, Lenore
Loved her job, but loved the beach more.
She devised such a way
to combine work and play:
She sells C-shells by the seashore</Data><Cell>

when actually opened in Excel renders as

A unix salesperson, Lenore Loved her job, but loved the beach more. She
devised such a way to combine work and play: She sells C-shells by the
seashore

but if the newlines in the Excel XML include actual references:

<Cell><Data ss:Type="String">A unix salesperson, Lenore&#10;
Loved her job, but loved the beach more.&#10;
She devised such a way&#10;
to combine work and play:&#10;
She sells C-shells by the seashore</Data><Cell>

Will render properly in the Excel as

A unix salesperson, Lenore
Loved her job, but loved the beach more.
She devised such a way
to combine work and play:
She sells C-shells by the seashore

So the references are in the source because they're actually important.
I want them retained when I translate it to excel because they remain
important.

Richard Tobin wrote:
> In article <(E-Mail Removed) .com>,
> chiaman <(E-Mail Removed)> wrote:
> >Another option I'd love to be able to deal with - however, I'm neither
> >in control of the input format (a vendor tool that saves reports in XML
> >format) nor the desired output format (M$ Excel).

>
> You haven't told us *why* you need the newlines as character
> references (incidentally, they're not entities). When you say the
> output format is Excel, what do you mean? An XML document that Excel
> can process? If so, it shouldn't care about whether you use literal
> newlines or a reference.
>
> -- Richard


 
Johannes Koch
 
      09-05-2006
chiaman wrote:
> but if the newlines in the Excel XML include actual references:
>
> <Cell><Data ss:Type="String">A unix salesperson, Lenore&#10;
> Loved her job, but loved the beach more.&#10;
> She devised such a way&#10;
> to combine work and play:&#10;
> She sells C-shells by the seashore</Data><Cell>
>
> Will render properly in the Excel as
>
> A unix salesperson, Lenore
> Loved her job, but loved the beach more.
> She devised such a way
> to combine work and play:
> She sells C-shells by the seashore


What does Excel render if it is

<Cell><Data ss:Type="String">A unix salesperson, Lenore&#10;Loved her
job, but loved the beach more.&#10;She devised such a way&#10;to combine
work and play:&#10;She sells C-shells by the seashore</Data><Cell>

instead?
--
Johannes Koch
Spem in alium nunquam habui praeter in te, Deus Israel.
(Thomas Tallis, 40-part motet)
 
chiaman
 
      09-05-2006
As I said earlier - if the actual references are included, when viewing
the file in Excel, the line breaks show in the correct places - this
is, of course, assuming the last <Cell> is actually </Cell>. When
you view this in Excel, you would see:

A unix salesperson, Lenore
Loved her job, but loved the beach more.
She devised such a way
to combine work and play:
She sells C-shells by the seashore

For actual line breaks to appear in Excel, they have to be included in
the XML as references, otherwise, they're just parsed as whitespace and
render as a single space within Excel.

Johannes Koch wrote:
> What does Excel render if it is
>
> <Cell><Data ss:Type="String">A unix salesperson, Lenore&#10;Loved her
> job, but loved the beach more.&#10;She devised such a way&#10;to combine
> work and play:&#10;She sells C-shells by the seashore</Data><Cell>
>
> instead?


 
Johannes Koch
 
      09-05-2006
chiaman wrote:
> As I said earlier


No. You provided two examples:

1. Newline characters, no character references
2. Newline characters followed by character references

for which you added the renderings in Excel.

I asked for a third:
No newline characters, but character references

Maybe, in the end it's an issue of various line break character(s) on
different systems (u000A/u000D vs. u000A vs. u000D).
--
Johannes Koch
Spem in alium nunquam habui praeter in te, Deus Israel.
(Thomas Tallis, 40-part motet)
 
Richard Tobin
 
      09-05-2006
In article <(E-Mail Removed) .com>,
chiaman <(E-Mail Removed)> wrote:

>For actual line breaks to appear in Excel, they have to be included in
>the XML as references, otherwise, they're just parsed as whitespace and
>render as a single space within Excel.


I'm afraid that all I can suggest is that you complain to Microsoft,
because XML applications should not treat &#10; in text any
differently from a newline character (a conforming XML parser will
return the character in both cases).

-- Richard
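
A quick way to see the point about conforming parsers, sketched in Python with the standard library's ElementTree (the thread itself uses Xerces J, so this only illustrates the behaviour the XML spec requires, not that particular setup). The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Same content, once with a character reference and once with a literal newline.
with_reference = ET.fromstring("<foo>line of text&#10;another line</foo>")
with_literal = ET.fromstring("<foo>line of text\nanother line</foo>")

# The parser hands back the same character data in both cases, so nothing
# downstream (XSLT included) can tell which form the source used.
print(repr(with_reference.text))
print(with_reference.text == with_literal.text)
```

Both documents yield the text 'line of text\nanother line', which is why no choice of xsl:copy-of, xsl:value-of or disable-output-escaping can bring the original reference back.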
 
Peter Flynn
 
      09-06-2006
chiaman wrote:
[...]
> So the references are in the source because they're actually important.
> I want them retained when I translate it to excel because they remain
> important.


OK. Yes, picking a better system than Excel would be nice, but...

If you're not in control of the input format, then just run the
file through a filter and turn the numeric references into some
dummy empty element which you can transform back to after.
An <lb/> element, along the lines Joseph suggested, would be conventional, e.g.

$ sed -e "s+&#10;+<lb/>+g" original.file >new.file

///Peter
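
The whole round trip can be sketched in Python as below. This is only an illustration under assumptions: the foo-to-bar rename stands in for whatever the real stylesheet does, the sample input is the one from the first post, and only the decimal form &#10; is handled (not &#xA;).

```python
import re
import xml.etree.ElementTree as ET

raw = "<foo>line of text&#10;another line of text&#10;yet another</foo>"

# Step 1 (the sed filter): protect the references before any parser sees them.
protected = raw.replace("&#10;", "<lb/>")

# Step 2: stand-in for the real transformation; here just rename foo to bar.
root = ET.fromstring(protected)
root.tag = "bar"
transformed = ET.tostring(root, encoding="unicode")

# Step 3: turn the marker elements back into character references.
# ElementTree writes empty elements as "<lb />", so allow optional whitespace.
result = re.sub(r"<lb\s*/>", "&#10;", transformed)
print(result)
```

The marker survives parsing and transformation because it is an element, not character data, so the final substitution puts a reference back exactly where the source had one and nowhere else.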
 