XSL for recursive transformation

 
 
Peter Flynn
 
      02-16-2006
Joe Kesselman wrote:
> Peter Flynn wrote:
>> Why not just use entity declarations?
>
> Parsed entities are pretty much dying as XML Schema replaces DTDs.


I think you'll find them alive and kicking in many places. Reports
of the death of DTDs are greatly exaggerated.

> Schemas don't have any equivalent.


QED

> XInclude/XLink were supposed to take over that role.


Oooh look, flying pigs

///Peter

 
 
 
 
 
Joe Kesselman
 
      02-17-2006
>> Parsed entities are pretty much dying as XML Schema replaces DTDs.
>
> I think you'll find them alive and kicking in many places. Reports
> of the death of DTDs are greatly exaggerated.


Uhm. I agree that schemas are taking longer to find their way in than
might have been expected, partly because they're a syntax only a
database expert or computer science geek could love. (Though frankly the
DTD syntax is also pretty hideous.)

However, entities are definitely on the way out. The problem is that
they really aren't all that useful unless there's a fragment that will
appear in a huge number of instances of this kind of document, and even
then they're only a significant advantage when producing the document by
hand; it is a significant pain for software to recognize that the
opportunity exists to take advantage of a parsed entity, and there
usually isn't much to be gained by doing so.
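
For readers who have not met a parsed entity, a minimal sketch may help. It assumes Python's standard-library ElementTree parser, and the contract document, the entity name `warranty` and its text are all invented for illustration. The fragment is declared once in the internal DTD subset and the parser expands each reference in place:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity name and its replacement text are
# invented for this example, not taken from any real DTD.
DOC = """<!DOCTYPE contract [
  <!ENTITY warranty "Provided as is, without warranty of any kind.">
]>
<contract>
  <clause id="c1">&warranty;</clause>
  <clause id="c2">Fees are due monthly. &warranty;</clause>
</contract>"""

# The parser replaces each &warranty; reference with the declared text,
# so the application only ever sees the expanded character data.
root = ET.fromstring(DOC)
clauses = [clause.text for clause in root.iter("clause")]

for text in clauses:
    print(text)
```

Note that the expansion is invisible to the application after parsing, which is one reason software that writes XML rarely bothers to reintroduce entity references.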

Entities had value when most docs were produced by humans pounding on
raw XML text; they really aren't useful for docs produced by smarter
editors. Most of the things you might still want to use them for can be
handled better by an appropriate tool -- an editor that lets you see and
enter the actual characters rather than their named equivalents, for
example, or a syntax that's actually defined in the document rather than
in a non-tag-language secondary file. Among other things, that permits
different documents to reference different resources rather than having
only a single set, hard-wired into the DTD, that they can name.

>> XInclude/XLink were supposed to take over that role.

> Oooh look, flying pigs


I did put it in the imperfect tense... Part of the problem is that we're
finding that the need for a portable syntax for documents referencing
other documents isn't as universal as we expected. Or at least isn't so
right now.
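
As a rough illustration of the include-style syntax under discussion, here is a sketch using the `xml.etree.ElementInclude` module from Python's standard library. The file names and chapter contents are hypothetical, and the loader reads from an in-memory dictionary rather than from disk so that the example is self-contained:

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical chapter files, held in memory instead of on disk.
CHAPTERS = {
    "chapter1.xml": "<chapter><title>Introduction</title></chapter>",
    "chapter2.xml": "<chapter><title>Entities</title></chapter>",
}

def load_from_memory(href, parse, encoding=None):
    """Loader hook: return an Element for parse="xml", raw text otherwise."""
    source = CHAPTERS[href]
    return ET.fromstring(source) if parse == "xml" else source

book = ET.fromstring(
    '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="chapter1.xml"/>'
    '<xi:include href="chapter2.xml"/>'
    "</book>"
)

# Replace each xi:include element in place with the element it points to.
ElementInclude.include(book, loader=load_from_memory)
titles = [title.text for title in book.iter("title")]
print(titles)
```

Unlike an external parsed entity, the reference here is ordinary element markup inside the document, so each document can name its own resources without touching a DTD.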

If we'd designed XML completely before releasing it to the public, we
would have started with the infoset (including namespaces and schemas
and includes and links), then designed the syntax and APIs from that.
Instead the W3C started with the syntax and a known-inadequate schema
language (DTDs), and has built everything out from there. The upside is
that folks had a chance to start using XML much earlier, and we've
gotten some benefit from seeing which directions everyone has gone with
it. The downside is that there have been some warts and hiccups and
direction changes along the way, and tools have not always been quick to
catch up -- and even when they have, folks who have working solutions
using the old stopgaps are often reluctant to make the effort to move
over. Which leaves all of us with the job of supporting multiple ways of
doing things and trying to gently push folks toward the ones that will
make their life -- and ours -- easier in the long run.

Oh well. The cutting edge usually has a few nicks in it.



--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
 
 
 
 
 
Peter Flynn
 
      02-17-2006
Joe Kesselman wrote:
>>> Parsed entities are pretty much dying as XML Schema replaces DTDs.
>>
>> I think you'll find them alive and kicking in many places. Reports
>> of the death of DTDs are greatly exaggerated.
>
> Uhm. I agree that schemas are taking longer to find their way in than
> might have been expected, partly because they're a syntax only a
> database expert or computer science geek could love. (Though frankly the
> DTD syntax is also pretty hideous.)


Only a syntax geek would love it, but it has the advantage of being very
terse, and once learned, quite expressive. RelaxNG seems to be the way
forward, but I still feel we did the community a disservice by not
properly investigating the possibility of adding datatyping to DTDs
before running amok with W3C Schemas. Ah well. Another time.

> However, entities are definitely on the way out. The problem is that
> they really aren't all that useful unless there's a fragment that will
> appear in a huge number of instances of this kind of document, and even
> then they're only a significant advantage when producing the document by
> hand;


Actually there is rather a lot of stuff out there that does this.

> it is a significant pain for software to recognize that the
> opportunity exists to take advantage of a parsed entity, and there
> usually isn't much to be gained by doing so.


For parsed entities, yes. Legal boilerplate, tech doc, and chapter
files for long documents are the only real candidates.

Parameter entities are a different matter.

> Entities had value when most docs were produced by humans pounding on
> raw XML text; they really aren't useful for docs produced by smarter
> editors. Most of the things you might still want to use them for can be
> handled better by an appropriate tool -- an editor that lets you see and
> enter the actual characters rather than their named equivalents, for


This refers to character entities. Sadly, editors are still in their
infancy when it comes to the interface (hence my thesis topic), and
there are still a gazillion so-called plaintext editors (non-XML) out
there that XML beginners use, which seriously screws up their chances
when they start editing UTF-8. For this reason, several companies and
projects I have been dealing with have made it policy for the moment
to create ISO-8859-1 files only, and ALL other characters go in as
character entity references or numeric references (fortunately for them
they deal only with western languages in Latin scripts).
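
A policy like that can be sketched in a few lines, assuming Python and its standard `xmlcharrefreplace` codec error handler; the sample string is invented for the example. Markup characters are escaped first, then anything outside ISO-8859-1 is written as a numeric character reference:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Invented sample text: some characters fit in ISO-8859-1, some do not.
text = "Łódź & Kraków: 5 €"

# Escape &, < and > first, then let the codec error handler turn every
# character that ISO-8859-1 cannot represent into a &#NNN; reference.
encoded = escape(text).encode("iso-8859-1", errors="xmlcharrefreplace")
print(encoded)

# Wrap the bytes in a small document that declares the encoding, and
# parse it back to check that nothing was lost on the way.
document = b'<?xml version="1.0" encoding="ISO-8859-1"?><p>' + encoded + b"</p>"
round_trip = ET.fromstring(document).text
print(round_trip == text)
```

The resulting file can be opened safely in an editor that knows nothing about UTF-8, at the cost of being harder to read wherever the references pile up.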

> example, or a syntax that's actually defined in the document rather than
> in a non-tag-language secondary file. Among other things, that permits
> different documents to reference different resources rather than having
> only a single set, hard-wired into the DTD, that they can name.
>
>>> XInclude/XLink were supposed to take over that role.
>>
>> Oooh look, flying pigs
>
> I did put it in the imperfect tense...


Sorry, I was being deliberately provocative.

> Part of the problem is that we're
> finding that the need for a portable syntax for documents referencing
> other documents isn't as universal as we expected. Or at least isn't so
> right now.


Ahead of the curve as usual, although the demand for a syntax to
refer from one document to another is slowly approaching FAQ-level.
It's just embarrassing that we had multi-way bidirectional 3rd-party
linking in the Panorama plugin a decade ago, and still nothing to
replace it.

> If we'd designed XML completely before releasing it to the public,


We'd still be discussing it.

> would have started with the infoset (including namespaces and schemas
> and includes and links), then designed the syntax and APIs from that.
> Instead the W3C started with the syntax and a known-inadequate schema
> language (DTDs), and has built everything out from there. The upside is
> that folks had a chance to start using XML much earlier, and we've
> gotten some benefit from seeing which directions everyone has gone with


I like the description, although I disagree about the infoset. Coming
from the tech doc background, I would have preferred to see some of the
useful SGML features retained and more attention paid to the usability
of markup. Pretending that a document is a tree when it's not (it's a
document!) was a mistake we are still paying for. Starting with the
syntax was OK, IMHO, and pretty much 99% of what we did was right. But
schemas were a later development, a bolt-on which only came when the
XML-Data folks saw the market for the syntax (and that's something else
we'll end up paying for -- I see way too many slabs of data being done
into XML when CSV would be much more sensible).

> it. The downside is that there have been some warts and hiccups and
> direction changes along the way, and tools have not always been quick to
> catch up -- and even when they have, folks who have working solutions
> using the old stopgaps are often reluctant to make the effort to move
> over.


This is going to be the interesting bit. New tools -- *really good* new
tools -- are few and far between. And there are too many good old tools
which have become unavailable just at the point when they were most
needed, because of corporate buyouts resulting in technically-unaware
people dropping the ball.

> Which leaves all of us with the job of supporting multiple ways of
> doing things and trying to gently push folks toward the ones that will
> make their life -- and ours -- easier in the long run.


It does work eventually. I've only had one breakage so far, and that was
due to sabotage.

> Oh well. The cutting edge usually has a few nicks in it.


Mind that axe, Eugene.

///Peter


 
 
Harrie
 
      02-17-2006
Peter Flynn said the following on 2/17/2006 22:52 +0200:

> Mind that axe, Eugene.


Actually, it's "Careful with that Axe, Eugene"

http://www.pink-floyd-lyrics.com/htm...ma-lyrics.html

--
Regards
Harrie
 
 
Joe Kesselman
 
      02-22-2006
>> If we'd designed XML completely before releasing it to the public,
> We'd still be discussing it.


Which is why they went the other way around. Unfortunately that left us
with some warts where the afterthoughts were tacked on (including some
that could have been avoided, but... oh well; too much water over the
dam at this point).

> I like the description, although I disagree about the infoset. Coming
> from the tech doc background, I would have preferred to see some of the
> useful SGML features retained


Trimming away everything that wasn't absolutely required is what made
implementing XML easy. If you've ever written an SGML processor, you
know getting it right is messy at best. XML was deliberately restricted
to the point where the parser is implementable by an average student in
a week or less.

> This is going to be the interesting bit. New tools -- *really good* new
> tools -- are few and far between.


They're starting to appear, though. If you see a market not being
adequately served, think of it as a marketing opportunity. That's what
got us started on Xerces and Xalan...<grin/>

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
 
 
Peter Flynn
 
      02-23-2006
Joe Kesselman wrote:
> Trimming away everything that wasn't absolutely required is what made
> implementing XML easy. If you've ever written an SGML processor, you
> know getting it right is messy at best. XML was deliberately restricted
> to the point where the parser is implementable by an average student in
> a week or less.


I think Tim Bray's comment was "implementable in 'just a few' 30-hour
Perl hacking sessions"

> They're starting to appear, though. If you see a market not being
> adequately served, think of it as a marketing opportunity.


Oh I am, believe me.

///Peter
 
 
Joseph Kesselman
 
      02-24-2006
Peter Flynn wrote:
> I think Tim Bray's comment was "implementable in 'just a few' 30-hour
> Perl hacking sessions"


The concept of the DPH -- Desperate Perl Hacker -- has been invoked a
number of times as an argument for why everything should be kept as
simple as possible. (But not simpler.)



--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
 