how to stuff HTML into RSS??

 
 
lkrubner@geocities.com
 
      12-02-2004
Me and some friends are working on some PHP based templates for web
pages. We've templates that look like this (simplified):

<html>
<head>
<title>
The green and blue design for carpentry companies
</title>
</head>
<body>
<?php showMainContent(); ?>
<div style="width:200px; float:right">
<?php showLinkArea(3); ?>
</div>
</body>
</html>


I'd like to publish all the templates in our database in an RSS feed so
it will be easier to import them on other sites. Does it screw things
up if I stuff HTML into the DESCRIPTION tag on an RSS .91 feed?

 
Andy Dingley
 
      12-02-2004
On 2 Dec 2004 01:03:34 -0800, (E-Mail Removed) wrote:

>Does it screw things
>up if I stuff HTML into the DESCRIPTION tag on an RSS .91 feed?


It's not what you stuff, it's how you stuff it.

You should encode HTML, so that

<description><p>Some <b>HTML</b> in RSS</p></description>

becomes this

<description>&lt;p&gt;Some &lt;b&gt;HTML&lt;/b&gt; in
RSS&lt;/p&gt;</description>

Watch out as well for & (becomes &amp;) and for &eacute; etc. (turn
them into the equivalent numeric entity).
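
As a rough illustration of the encoding described above, here is a minimal Python sketch (the original poster works in PHP, where htmlspecialchars() plays a similar role). The function names named_to_numeric and encode_for_description are made up for this example. It first rewrites HTML named entities such as &eacute; as numeric references, then escapes &, < and > so the fragment can sit inside <description> as plain text.

```python
import re
from html.entities import name2codepoint
from xml.sax.saxutils import escape

# The five entities that XML itself predefines; these are left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(html_fragment):
    """Rewrite HTML-only named entities (e.g. &eacute;) as numeric ones."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave unknown or XML-safe names untouched
        return "&#%d;" % name2codepoint[name]
    return _NAMED_ENTITY.sub(replace, html_fragment)


def encode_for_description(html_fragment):
    """Escape an HTML fragment so it can be the text of <description>."""
    return escape(named_to_numeric(html_fragment))


print(encode_for_description("<p>Some <b>HTML</b> in RSS</p>"))
# prints: &lt;p&gt;Some &lt;b&gt;HTML&lt;/b&gt; in RSS&lt;/p&gt;
```

A consumer that XML-decodes the description gets the original markup back, with &eacute; arriving as &#233;.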

I'd also suggest that you make your HTML fragments into well-formed,
balanced XHTML fragments before you embed them (lower case element
names, close open elements). Although this isn't required, it can make
life easier with XML toolsets.
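
One way to check whether a fragment is already balanced is to wrap it in a single root element and hand it to an XML parser. The sketch below does that in Python; is_balanced_fragment is a hypothetical helper name, and the <div> wrapper is only there because a fragment may have several top-level elements.

```python
import xml.etree.ElementTree as ET


def is_balanced_fragment(fragment):
    """Return True if the fragment parses as XML once wrapped in one root."""
    try:
        ET.fromstring("<div>%s</div>" % fragment)
    except ET.ParseError:
        return False
    return True


for sample in ("<p>Some <b>HTML</b> in RSS</p>",   # balanced
               "<p>Some <b>HTML</b> in RSS",       # <p> never closed
               "line one<br>line two",             # HTML-style empty element
               "line one<br />line two"):          # XHTML-style empty element
    print(is_balanced_fragment(sample), sample)
```

This only tests well-formedness; it says nothing about whether the markup is valid XHTML.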

This stuff isn't hard to do, but it's very poorly documented. There
are many RSS versions, and few of them describe it fully. This is a
useful read
http://diveintomark.org/archives/200...compatible-rss


I'd also avoid the obsolete RSS 0.91 in favour of RSS 1.0 (far
better), or you might prefer the more popular RSS 2.0

--
Smert' spamionam
 
Joris Gillis
 
      12-02-2004
> You should encode HTML, so that
>
> <description><p>Some <b>HTML</b> in RSS</p></description>
>
> becomes this
>
> <description>&lt;p&gt;Some &lt;b&gt;HTML&lt;/b&gt; in
> RSS&lt;/p&gt;</description>
>

Hi,

I don't know anything about RSS, but wouldn't it be easier and more logical to insert the XHTML as elements using namespaces? And if that isn't possible yet, shouldn't it become possible?

regards,
--
Joris Gillis (http://www.ticalc.org/cgi-bin/acct-v...i?userid=38041)
Ceterum censeo XML omnibus esse utendum
 
Andy Dingley
 
      12-02-2004
On Thu, 02 Dec 2004 15:41:32 GMT, "Joris Gillis" <(E-Mail Removed)>
wrote:

>I don't know anything about RSS,


I suggest you read the Dive Into Mark article. It gives a good
explanation of some of the background to this.
http://diveintomark.org/archives/200...compatible-rss

RSS has suffered because of too many standards, and especially because
these standards have generally been poorly specified. In particular
there is no clear guidance on how to embed HTML content within an RSS
item.

A problem with RSS, and all such protocols that try to become an open
publication medium, is that many creators will make content and many
consumers will try to read it. Where the spec isn't exhaustive on how
it _must_ be done, then a situation soon develops of de facto
behaviour for how it _is_ done. Readers become dependent on this, and
you diverge from it at your peril.

> but wouldn't it be easier and more logical to insert the XHTML as elements using namespaces?


That's an attractive option. However it's not a viable one.
There are several reasons:

Namespacing relies on using XHTML, and you may wish to include HTML
_as_HTML_, not XHTML. Some consumers may be confused if they receive
XHTML.

Namespacing relies on including a balanced fragment (i.e. one that can
be well-formed as an XML fragment). This wasn't a requirement on the
original RSS/HTML enclosure, so it is hard to re-impose in some
cases (<a name="..." > is one of the more awkward cases to deal
with).

RSS is not an XML protocol. Successive versions of badly-written specs
have clouded this. There are all sorts of references to "ASCII" when
it should really be CDATA. It's commonplace to include HTML entities,
even when these aren't valid outside the HTML DTD. Reliable parsing
of RSS from external sources is a mess, and it often relies on
knife-and-fork parsing with non-XML tools. It's not reliable to
assume good support for standard XML features if you're working with
external feeds, even though you "should" be able to do this.

> And if that wouldn't be possible yet, shouldn't it become possible?


RSS is old. It's post-XML, but pre-XHTML and (arguably)
pre-namespacing. So even if a namespaced approach became widespread,
consumers should (strongly) keep supporting the old way if they still
want to accept content supplied that way.

I use namespaced content for internal RSS feeds within my projects,
where I always use RSS 1.0. For external work though, I encode plain
HTML. I use balanced fragments, so I close elements like <p>...</p>,
but I don't use the <br /> form for <br>

--
Smert' spamionam
 
Joris Gillis
 
      12-03-2004

Now that's what I call a valuable reply.
Thank you very much.

--
Joris Gillis (http://www.ticalc.org/cgi-bin/acct-v...i?userid=38041)
Ceterum censeo XML omnibus esse utendum
 
Peter Flynn
 
      12-04-2004
(E-Mail Removed) wrote:

> I'd like to publish all the templates in our database in an RSS feed so
> it will be easier to import them on other sites. Does it screw things
> up if I stuff HTML into the DESCRIPTION tag on an RSS .91 feed?


Yes. Implementations of RSS readers are almost all hopelessly broken and
non-conformant, and the RSS "spec" -- such as it is -- has been so kicked
about and bastardised as to be virtually worthless except as a carrier
format like HTML. There were plans to make a newer, better version, but
like HTML it has now become so fossilised that it's not worth changing.

///Peter
--
"The cat in the box is both a wave and a particle"
-- Terry Pratchett, introducing quantum physics in _The Authentic Cat_
 
lkrubner@geocities.com
 
      12-08-2004
Thank you for your in-depth reply. I've already read Mark's article and
one thing I got from it was that it didn't matter much which version of
RSS you used, they were all broken.

For now I'm in the lucky position of being the consumer of my own
output. We have some HTML templates we'd like to publish, but we are
publishing them for people who have our software, so we control the
source and the point of consumption. I'd love to eventually use a richer
RSS but I'm short on time this month and so I'd like to reuse what PHP
code we already have written and tested. The code we have puts out
valid RSS .91.

To publish an HTML template in the description tag of RSS, should I
just wrap it in a CDATA section? Or escape it, as someone above suggested?
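
For comparison, here is a minimal Python sketch of both options (the helper names as_cdata and as_escaped are made up for this example). To a conforming XML parser the two forms carry the same text; whether a given feed reader treats them the same is a separate question that this sketch does not settle. The one catch with CDATA is that the content must not contain the sequence ]]>, so the sketch splits it across two sections.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def as_cdata(text):
    """Wrap text in a CDATA section, splitting any embedded ']]>'."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def as_escaped(text):
    """Escape &, < and > so the text is safe as XML character data."""
    return escape(text)


# A shortened version of the template from the first post.
template = (
    "<body>\n"
    "<?php showMainContent(); ?>\n"
    '<div style="width:200px; float:right">\n'
    "<?php showLinkArea(3); ?>\n"
    "</div>\n"
    "</body>"
)

for encode in (as_cdata, as_escaped):
    xml_text = "<description>%s</description>" % encode(template)
    decoded = ET.fromstring(xml_text).text
    print(encode.__name__, "round-trips:", decoded == template)
```

Since the poster controls both ends, either form can be decoded with an ordinary XML parser on the consuming side.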

 