Atom & the Standard Library RSS Module

 
 
grant
      04-04-2008
I need to parse RSS 1.0, 2.0 and ATOM feeds. I upgraded my RSS module
to the latest version (0.2.4) to get ATOM support. The strange thing
is that the data structure returned for ATOM feeds is ugly and wildly
inconsistent with the nice, clean one that is returned for RSS feeds.

I've noticed there are a couple of competing Ruby ports of Mark
Pilgrim's Universal Feedparser. The 'rfeedparser' one looks to be the
best and FeedTools looks interesting, but I haven't actually tried
them yet. (I really like the Universal Feedparser for Python.)

Does anyone have any suggestions on which direction to take?
 
Kouhei Sutou
      04-05-2008
Hi,

In "Atom & the Standard Library RSS Module" on Sat, 5 Apr 2008 06:25:05 +0900,
grant wrote:

> I need to parse RSS 1.0, 2.0 and ATOM feeds. I upgraded my RSS module
> to the latest version (0.2.4) to get ATOM support. The strange thing
> is that the data structure returned for ATOM feeds is ugly and wildly
> inconsistent with the nice, clean one that is returned for RSS feeds.


Please show an example.

--
kou

 
grant
      04-05-2008
On 5 Apr, 01:29, Kouhei Sutou wrote:
> Please show an example.


Sorry, I got tangled up in my own wishes. My problem with the library
was just in my head. I suppose I was just hoping for a more consistent
representation of a feed than I've had to deal with in the past. I
know that what is generated by RSS/ATOM parsing libraries is a
reflection of the feeds themselves.

Anyway, a few examples of what bugs me, demonstrated in irb:

require 'rss'

rss = 'http://www.giftedslacker.com/feed/'
atom = 'http://oblivionation.blogspot.com/feeds/posts/default'

rssfeed = RSS::Parser.parse(rss)
atomfeed = RSS::Parser.parse(atom)

#print the content of the most recent post
puts rssfeed.items[0].content_encoded
puts atomfeed.items[0].content.content

#print the titles of the posts in the feed
rssfeed.items.each {|item| puts item.title}
atomfeed.items.each {|item| puts item.title.content}

#print the author of the most recent post
rssfeed.items[0].dc_creator
atomfeed.entries[0].author.name.content

---

What I'd like is a consistent interface to the commonly used elements
in ATOM and RSS feeds, regardless of version. It must be more
difficult than it seems to me, because I'm not aware that anyone does
it. I might give it a try just for fun. But, being new to Ruby, I'm
not sure where to begin.

-grant
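
One way to approach the consistent interface described above is a thin duck-typing adapter. The following is a minimal sketch, not part of the rss library: `FeedText` and its methods are hypothetical names, and the Struct classes are stand-ins that only mimic the shapes shown in the irb session (a plain String for RSS titles, an object with `#content` for Atom titles, `dc_creator` versus `author.name.content` for authors). It has not been checked against real parsed feeds.

```ruby
# Hypothetical adapter; FeedText is not part of the rss library.
module FeedText
  module_function

  # Unwrap an element object that keeps its text in #content (the Atom
  # shape); return anything else, such as a String, unchanged.
  def text(value)
    value.respond_to?(:content) ? value.content : value
  end

  # Title as a plain String for either kind of item.
  def title(item)
    text(item.title)
  end

  # Atom shape: item.author.name.content. RSS shape: item.dc_creator.
  def author(item)
    person = item.respond_to?(:author) ? item.author : nil
    return text(person.name) if person.respond_to?(:name)
    item.dc_creator if item.respond_to?(:dc_creator)
  end
end

# Stand-in objects (assumptions) that mimic the two shapes from the session.
Text      = Struct.new(:content)
Person    = Struct.new(:name)
RssItem   = Struct.new(:title, :dc_creator)
AtomEntry = Struct.new(:title, :author)

rss_item   = RssItem.new("Hello from RSS", "grant")
atom_entry = AtomEntry.new(Text.new("Hello from Atom"),
                           Person.new(Text.new("kou")))

# The same two calls now work for both kinds of item.
[rss_item, atom_entry].each do |item|
  puts "#{FeedText.title(item)} by #{FeedText.author(item)}"
end
```

The adapter checks what each object responds to rather than its class, so it would not need to know the feed version in advance.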
 
Phillip Gawlowski
      04-05-2008

grant wrote:
| On 5 Apr, 01:29, Kouhei Sutou wrote:

| What I'd like is a consistent interface to the commonly used elements
| in ATOM and RSS feeds, regardless of version. It must be more
| difficult than it seems to me, because I'm not aware that anyone does
| it. I might give it a try just for fun. But, being new to Ruby, I'm
| not sure where to begin.

You could build upon/look at/submit patches to Simple RSS:
http://simple-rss.rubyforge.org/

I've used it, and while it is simple to use, that simplicity comes at the
cost of limited functionality, as far as I could see. (I only used it to
grab NetBeans Ruby IDE builds off the web when the build server updated
its RSS feed, so take my comments with a grain of salt.)

-- Phillip Gawlowski

 
Kouhei Sutou
      04-06-2008
Hi,

In "Re: Atom & the Standard Library RSS Module" on Sat, 5 Apr 2008 20:45:06 +0900,
grant wrote:

> On 5 Apr, 01:29, Kouhei Sutou wrote:
> > Please show an example.

>
> Sorry, I got tangled up in my own wishes. My problem with the library
> was just in my head. I suppose I was just hoping for a more consistent
> representation of a feed than I've had to deal with in the past. I
> know that what is generated by RSS/ATOM parsing libraries is a
> reflection of the feeds themselves.
>
> Anyway, a few examples of what bugs me, demonstrated in irb:
>
> require 'rss'
>
> rss = 'http://www.giftedslacker.com/feed/'
> atom = 'http://oblivionation.blogspot.com/feeds/posts/default'
>
> rssfeed = RSS::Parser.parse(rss)
> atomfeed = RSS::Parser.parse(atom)
>

You can normalize a parsed feed with to_rss, to_atom or
to_xml. For example:

rss10feed = atomfeed.to_rss("1.0")

You may need to set default values for elements that the target
format requires but the source feed lacks:

rss10feed = atomfeed.to_rss("1.0") do |maker|
  maker.channel.about ||= maker.channel.link
  maker.channel.description ||= "No description"
  maker.items.each do |item|
    item.title ||= "UNKNOWN"
    item.link ||= "UNKNOWN"
  end
end

> #print the content of the most recent post
> puts rssfeed.items[0].content_encoded
> puts atomfeed.items[0].content.content
>
> #print the titles of the posts in the feed
> rssfeed.items.each {|item| puts item.title}
> atomfeed.items.each {|item| puts item.title.content}
>
> #print the author of the most recent post
> rssfeed.items[0].dc_creator
> atomfeed.entries[0].author.name.content


The Atom specification says that any Atom element may have
xml:base and xml:lang attributes. If
RSS::Atom::Entry::Author#name returned a String, we couldn't
get those attribute values. This is why
RSS::Atom::Entry::Author#name returns an
RSS::Atom::Entry::Author::Name, not a String.

BTW, what about the following API?

atomfeed.entries[0].author.name # => a String
atomfeed.entries[0].author.name do |name|
  # name: a RSS::Atom::Entry::Author::Name
  name.content # => a String
end # => the last evaluated value (name.content)


Thanks,
--
kou
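
The block-or-String accessor proposed above can be illustrated in plain Ruby. This is only a sketch of the calling convention, assuming nothing about how the rss library would implement it: `Name` and `Author` here are hypothetical stand-in classes, and the "ja" language value is made-up sample data.

```ruby
# Hypothetical stand-ins; these are not the rss library's classes.
# Name mimics an element that carries text plus xml:lang / xml:base values.
Name = Struct.new(:content, :lang, :base)

class Author
  def initialize(name_element)
    @name = name_element
  end

  # Without a block: return the text as a plain String.
  # With a block: yield the full element and return the block's value.
  def name
    return @name.content unless block_given?
    yield @name
  end
end

author = Author.new(Name.new("kou", "ja", nil))

puts author.name                        # the common case: just the text
puts author.name { |name| name.lang }   # the element is still reachable
```

With this shape, callers who only want the text get a String directly, while the attribute values remain available to callers who pass a block.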

 