RSS feed clarification?

 
 
Ed Flecko
      05-14-2007
Hi folks,
I'm trying to figure out this whole RSS feed thing.

I've created my .xml file to use for my feed, and my browsers
"recognize" that I have an RSS feed, and you can subscribe, etc., etc.

Here's why I "think" I want to use an RSS feed, and what I'm confused
about.

I have one file (and one file only) on my web site that changes
frequently (weekly), but the file name is always the same. I want to
alert people who subscribe to the feed that this file has changed.

Here are my questions:

1.) Will an RSS feed "work" (automatically notify the subscribers) for
a single file whose name is always the same (although the body content
of the file changes)?
2.) I don't understand how RSS feeds actually work from the client's
perspective, i.e., how does the subscriber's RSS client (Internet
Explorer, Firefox, etc.) actually know that the RSS feed has changed
and download it, etc.? Is it simply a scheduled task, and the
client checks the feed automatically on a schedule?
3.) Since my feed isn't "news", per se, I don't need to bother with
"syndicating" my feed, do I? Or would this somehow benefit me?

Thank you,
Ed

 
Andy Dingley
      05-14-2007
On 14 May, 15:55, Ed Flecko wrote:

> I've created my .xml file to use for my feed, and my browsers
> "recognize" that I have an RSS feed, and you can subscribe, etc., etc.


It would help if you told us the URL.

It's also better practice (if you can arrange this with your hosting)
to give your RSS file a ".rss" extension and, most importantly, to serve
it with a correct content-type for RSS, not just for XML. RSS is
robust against not doing this (most publishers can't get it right), but
it's still good practice if you're hosted on Apache.


> I have one file (and one file only) on my web site that changes
> frequently (weekly), but the file name is always the same. I want to
> alert people who subscribe to the feed that this file has changed.


> 1.) Will an RSS feed "work" (automatically notify the subscribers) for
> a single file whose name is always the same (although the body content
> of the file changes)?


Yes. RSS is all about embedding metadata, and that includes the update
timestamps. The names themselves are just one piece of the metadata --
so long as _something_ reflects the change, you can make it all
work.

You might not be able to use the "permalink" feature of some RSS
versions. This is useful, so if you can, you should use it. Whether
it's relevant depends on your particular application, which we don't
know yet.

It's good practice to offer a "permalink" as a URL that will always
retrieve a particular version of the content, even some time after
this was first served. "Last week's news" is still interesting to many
consumers. You can purge these over time if you wish, but it's still
good practice to make the URL namespace a consumed resource that isn't
re-used.

If you can, keep "last week's news" and "this week's news" stored
as separate files on the web server and make the web server respond to
specific requests for each one appropriately. Just giving each file a
name with a datestamp in it can be enough to do this. For the "latest
news" URL, send a 302 redirect to the URL for the current file. This
redirect's target will need to be changed as each new file is uploaded.
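
A minimal Python sketch of that arrangement, in case it helps to see it
spelled out. The path names, the CURRENT_ISSUE constant and the port are
all assumptions made up for illustration; on Apache you would normally do
the same thing with a redirect rule rather than code.

```python
from datetime import date
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical values: bump CURRENT_ISSUE each time a new file is uploaded.
CURRENT_ISSUE = date(2007, 5, 15)
LATEST_PATH = "/documents/latest"


def dated_path(issue: date) -> str:
    """Return the stable, never-reused URL path for one week's file."""
    return f"/documents/rates-{issue.isoformat()}.pdf"


class LatestRedirect(BaseHTTPRequestHandler):
    """Answer the 'latest' URL with a 302 to the current dated file."""

    def do_GET(self):
        if self.path == LATEST_PATH:
            self.send_response(302)
            self.send_header("Location", dated_path(CURRENT_ISSUE))
            self.end_headers()
        else:
            # Serving the dated files themselves is left out of this sketch.
            self.send_error(404)


def serve(port: int = 8000) -> None:
    """Run the redirector until interrupted (not called automatically)."""
    HTTPServer(("", port), LatestRedirect).serve_forever()
```

The dated path is what would go into the feed; the "latest" path is only
a convenience for people who bookmark it.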

Possibly it's just not appropriate to serve "last week's news" in your
application (I don't, and can't, know). If so, then just have the
simple one file, one filename, one URL situation. However, make sure
that any URL you publish in RSS is _not_ labelled incorrectly as a
"permalink".


> 2.) I don't understand how RSS feeds actually work, from the client's
> perspective,


Largely you don't and can't know this. You publish the stuff, what
happens next is up to whoever uses it. Don't try to pre-judge what
they can and (especially) what they can't do with it.


> how does the subscriber's RSS client (Internet
> Explorer, Firefox, etc.) actually know that the RSS feed has changed
> and download it, etc.?


They'll usually poll it regularly to see (i.e., the client decides).
HTTP polling should be efficient, i.e. a GET or HEAD request should
quickly return a suitable HTTP 304 Not Modified if need be, or at
least an HTTP 200 with appropriate timestamps. Good clients adjust their
polling interval so as not to be a nuisance, to respect any hints you
embed in the syndication information you include inside the RSS
document, and to fine-tune it on the basis of how often you
actually make changes to the content.
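
As a rough illustration of what a polite polling client does, here is a
Python sketch of a conditional GET. The function names are invented for
this example; the only real API used is the standard urllib module. The
client remembers the ETag and Last-Modified values from its previous
fetch and sends them back, so an unchanged feed costs the server only a
304 response.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the request headers that make a GET conditional."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def poll(url, etag=None, last_modified=None):
    """Fetch the feed only if it changed since the remembered validators.

    Returns (body, etag, last_modified); body is None when the server
    answered 304 Not Modified.
    """
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return None, etag, last_modified
        raise
```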

It's important that an RSS server can efficiently serve polling
clients when the content _hasn't_ changed, otherwise it can soon be
overloaded, even when it's not serving any content. This is a real
problem for dumb-coded servers with database-generated content. If
your RSS content is coming from static files, then Apache will get it
right automatically. If you're generating it dynamically, then make
sure your "last updated" timestamps are calculated and returned
quickly, and also that they represent the "last change" not the "last
request" timestamps.
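
For the dynamically generated case, the decision the server has to make
is small. The sketch below (Python, standard library only; the function
names are made up here) assumes you can cheaply look up the time of the
last real change as a timezone-aware datetime, and answers 304 whenever
the client's copy is at least that recent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def last_modified_header(last_change: datetime) -> str:
    """Format the last *change* time (not the request time) for HTTP."""
    return format_datetime(last_change.astimezone(timezone.utc), usegmt=True)


def status_for(if_modified_since, last_change: datetime) -> int:
    """Return 304 if the client's copy is still current, else 200."""
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall back to a full response
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution.
            if last_change.replace(microsecond=0) <= since:
                return 304
    return 200
```

The point is that status_for needs only a timestamp, so the 304 path can
skip the database query that builds the feed body.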


> 3.) Since my feed isn't "news", per se, I don't need to bother with
> "syndicating" my feed, do I?


You don't really ever syndicate your own feed; you offer it up for
syndication, and some aggregator might decide to syndicate it elsewhere
if it wishes to. Or it might not; I can't say which.



 
Ed Flecko
      05-14-2007
Hi Andy,
Hey, thanks for the reply. I'll take all the suggestions and help I
can get!

O.K., I've changed the name of my basic RSS file so it has an .rss
extension.

The site is: www.fivestarbank.com, and the specific file is our CD
rates that I know customers would like to keep current on...that's why
I think the RSS feed would be a smart idea.

Comments? Further suggestions?

Thank you!

 
Andy Dingley
      05-15-2007
On 14 May, 18:14, Ed Flecko wrote:

> O.K., I've changed the name of my basic RSS file so it has an .rss
> extension.


It's now served under a content-type of text/plain when it ought to
be application/rss+xml. Fix that if you can (Apache and .htaccess),
otherwise it _might_ be better as .xml and at least served as text/xml
or application/xml. Don't sweat this though: it's good practice, but
RSS is deliberately robust against it being mis-configured.
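
On Apache the fix is usually a single "AddType application/rss+xml .rss"
line in .htaccess. To check what you are actually sending, something
like the following Python sketch classifies a Content-Type header value
along the lines described above. The function names and the three labels
are invented for illustration.

```python
from email.message import Message


def media_type(header_value: str) -> str:
    """Extract the bare media type, dropping parameters such as charset."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type()


def assess(header_value: str) -> str:
    """Classify a Content-Type header for an RSS 2.0 feed."""
    kind = media_type(header_value)
    if kind == "application/rss+xml":
        return "preferred"
    if kind in ("application/xml", "text/xml"):
        return "acceptable"
    return "fix if you can"
```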

Also validate it with the Feed Validator:
http://feedvalidator.org/check.cgi?u....com%2Ffsb.rss

As it stands, it's valid but still needs a couple of tweaks.

You're using RSS 2.0, which is probably the best choice for you,
although the spec is unfortunately badly written and ambiguous. Worth
reading anyway though:
<http://cyber.law.harvard.edu/rss/rss.html>


Line by line:

<title>Welcome to the Five Star Bank RSS feed</title>
Don't welcome people, tell them what it is. It's not a web site, it's
an RSS feed. They don't "visit" this, they have it delivered to them.
Remember that they might be reading this on their fridge screen
display, along with the morning's news and last night's baseball
result.


<link>http://www.fivestarbank.com</link>
Good. This should point to the human-readable website, not to any part
of the feed.


<description>Where Excellence Exceeds Expectations</description>
Lose the marketing flannel. Put some content here. Try "Five Star Bank
CD rates at 15th May 2007, valid for the next 5 days" or similar.


<item>
One item. It's all you need. Not common practice, but entirely valid
in your application.

<title>Current CD Rates</title>
Be careful with words like "current" in any syndicatable protocol (it
might not still be current when your reader gets to see it). Only use
them with items that are clearly timestamped, otherwise you will
confuse users.

<link>http://www.fivestarbank.com/documents/Current_Rates.pdf</link>

I would still be happier if this pointed to a series of files called
"rates at 2007-05-15" etc. Delete them as soon as they're obsolete if
you wish, but at least it avoids the confusion of mapping an old
"current" onto a new file with a changed rate. If you don't do this
then you are losing most of the advantages of RSS.

You can still make "current" 302 redirect to this week's file.

There's a separate commercial decision to be made as to whether you
want to have your rate history visible so easily (by
leaving the old files available). It's your call (but if you ever
make this information publicly visible, even temporarily, someone
will make a business out of recording it and selling histories of it).
Obviously a single filename kills this anyway.

<description></description>
Put something in there. Probably (for this one-item case) a
restatement of the channel's description.

There are several elements missing from <channel>. Some are important.

<pubDate>
This is vital, because it's how an aggregator identifies the channel /
item as having been updated. If you don't have it, and you don't
change the item link URL, then most correct aggregators will simply
see your content as stale and unchanging, even if the PDF contents
themselves are changing. Put this on both channel and item -- channel
is just the latest pubDate across all <item>s, so in your case they're
currently the same.
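
RSS 2.0 dates use the RFC 822 style, which Python's standard library can
produce directly. A small sketch, with helper names invented for this
example, that formats an item's pubDate and derives the channel's
pubDate as the latest of its items:

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def rss_date(moment: datetime) -> str:
    """Format a timezone-aware datetime for a <pubDate> element."""
    return format_datetime(moment)


def channel_pub_date(item_dates) -> str:
    """The channel's pubDate is the latest pubDate across its items."""
    return rss_date(max(item_dates))


published = datetime(2007, 5, 15, 8, 0, tzinfo=timezone.utc)
print(rss_date(published))  # Tue, 15 May 2007 08:00:00 +0000
```

With one item, as here, the two values come out identical.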


<skipHours> & <ttl>
This is poorly done in RSS 2.0, but you should still use it. These
elements are how a channel hints at its update schedule. Personally
I'd use the RSS 1.0 syndication module instead, or as well.
<http://web.resource.org/rss/1.0/modules/syndication/>
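
To make the hints concrete, here is how a well-behaved client might
interpret them, as a Python sketch. It assumes the usual RSS 2.0
reading: <ttl> is a number of minutes the channel may be cached, and
<skipHours> lists hours (0 to 23, GMT) during which polling can be
skipped. The function name and arguments are invented for illustration,
and real aggregators differ in how strictly they honour either hint.

```python
from datetime import datetime, timedelta, timezone


def should_poll(now, last_fetch, ttl_minutes, skip_hours):
    """Decide whether to poll, honouring <ttl> and <skipHours> hints.

    now and last_fetch are timezone-aware datetimes; skip_hours is a
    collection of GMT hours taken from the channel's <skipHours>.
    """
    if now.astimezone(timezone.utc).hour in skip_hours:
        return False  # publisher says nothing changes in this hour
    return now - last_fetch >= timedelta(minutes=ttl_minutes)
```

For a weekly file, a long ttl plus skipHours covering the night would
already remove most pointless requests.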

<copyright>
This can be important, particularly if you wish to indicate that
financial information brokers can't republish your content. I suggest
reading the Creative Commons site for advice on indicating this.

<managingEditor>
It's now a legal requirement for UK commercial feeds to include this
(with some wiggle room for the technical details of "how"), so as to
identify the legal entity publishing this business communication. I'm
sure US retail banking laws have similar requirements.


There are also elements missing from <item>. Some are described
above; some are important.

Remember that many syndication / aggregation environments syndicate
_items_, not _channels_. They'll strip out the items they want from
several source channels, then republish them as an aggregation. If
you want to swim in this world, make sure that your <item>s carry the
appropriate metadata; don't just stick it once on the overall channel
and hope.

<guid>
This is essential if you expect any syndication to work. It's how they
recognise <item>s that are different or (in conjunction with pubDate)
have been updated. Don't use isPermaLink=true though unless you've
disambiguated between each week's set of rates (as I suggest anyway).
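
A Python sketch of building such an item with the standard ElementTree
module. The guid format ("cd-rates-" plus the date) and the function
name are assumptions chosen for this example; any string that changes
with each week's rates would do.

```python
import xml.etree.ElementTree as ET
from datetime import date


def build_item(issue: date, link: str, permanent: bool) -> ET.Element:
    """Build an <item> whose <guid> changes with every new set of rates.

    permanent should be True only when link is a dated URL that will
    never be reused for different content.
    """
    item = ET.Element("item")
    ET.SubElement(item, "title").text = f"CD rates at {issue.isoformat()}"
    ET.SubElement(item, "link").text = link
    guid = ET.SubElement(
        item, "guid", isPermaLink="true" if permanent else "false"
    )
    # Hypothetical identifier scheme for the single-filename case.
    guid.text = link if permanent else f"cd-rates-{issue.isoformat()}"
    return item
```

With the single Current_Rates.pdf filename you would pass
permanent=False, so the guid still changes weekly even though the link
does not.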

<enclosure>
Your linked content is a PDF, so it's unclear as to whether it ought
to be addressed via a <link> or via <enclosure>. It's possible to use
either. It's better to not use a PDF at all, but to use HTML (with my
Semantic Web pointy hat on). In that case you'd clearly use a <link>
and we'd all start building a world of automatically machine-readable
smart content, intelligent agents and all the rest of it.

However, you probably have a corporate brand manager who forces you to
use a PDF so that they can control the exact choice of corporate
typeface. This is a Bad and Wrong policy and the sooner these
dinosaurs are put out to grass the better, but I appreciate that it
happens. So is a PDF a piece of "web content" (use <link>) or is it a
monstrous great piece of opaque brochureware that's only fit to be
downloaded and printed, with no hope of ever being automatically read
and used by agents (use <enclosure>)?
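
Whichever way you decide, the markup difference is small. In RSS 2.0 an
<enclosure> carries three attributes (url, length in bytes, and MIME
type), whereas <link> is just a URL. A Python sketch, with the function
name and arguments invented for illustration:

```python
import xml.etree.ElementTree as ET


def attach_pdf(item: ET.Element, url: str, size_bytes: int,
               as_enclosure: bool) -> ET.Element:
    """Reference a PDF from an <item> as an <enclosure> or a <link>."""
    if as_enclosure:
        ET.SubElement(
            item,
            "enclosure",
            url=url,
            length=str(size_bytes),  # size of the file in bytes
            type="application/pdf",
        )
    else:
        ET.SubElement(item, "link").text = url
    return item
```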


 
Ed Flecko
      05-17-2007
Thank you, Andy.

I'll try your suggestions!



 
Joseph Kesselman
      05-17-2007
Quick reminder, not directed only at Ed: Please remember to trim quotes!
Reposting a hundred lines of text just to add seven words of thanks is
not a very good use of Internet resources (or of readers' time).

In general, your new text should be larger than what you're quoting,
with a *bit* of leeway allowed when the quote itself is also short.
 