RSS reading and writing
Really Simple Syndication (RSS) is a family of formats that describe ‘feeds,’ specially constructed XML documents that allow an interested person to subscribe and receive updates from a particular web service. This portion of the standard library provides tooling to read and create these feeds.
The standard library supports RSS 0.91, 1.0, 2.0, and Atom, a related format. Here are some links to the standards documents for these formats:
Consuming RSS
If you’d like to read someone’s RSS feed with your Ruby code, you’ve come to the right place. It’s really easy to do this, but we’ll need the help of open-uri:
  require 'rss'
  require 'open-uri'

  url = 'http://www.ruby-lang.org/en/feeds/news.rss'
  # Kernel#open no longer opens URLs as of Ruby 3.0; use URI.open from open-uri.
  URI.open(url) do |rss|
    feed = RSS::Parser.parse(rss)
    puts "Title: #{feed.channel.title}"
    feed.items.each do |item|
      puts "Item: #{item.title}"
    end
  end
As you can see, the workhorse is RSS::Parser.parse, which takes the source of the feed and, optionally, a second parameter that controls whether the feed is validated (validation is on by default). We get back an object that has all of the data from our feed, accessible through methods. This example shows getting the title out of the channel element, and looping through the list of items.
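To try the parser without a network connection, here is a minimal sketch that parses a small inline RSS 2.0 document. The feed contents and the example.com URLs are made up for illustration, and passing false as the second argument is meant to skip validation.

```ruby
require 'rss'

# A small, hand-written RSS 2.0 document used in place of a network fetch.
# The titles and URLs are placeholders, not a real feed.
xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example Channel</title>
      <link>http://example.com/</link>
      <description>A feed used only for illustration</description>
      <item>
        <title>First post</title>
        <link>http://example.com/posts/1</link>
      </item>
      <item>
        <title>Second post</title>
        <link>http://example.com/posts/2</link>
      </item>
    </channel>
  </rss>
XML

# The second argument controls validation; false skips it.
feed = RSS::Parser.parse(xml, false)

channel_title = feed.channel.title
item_titles = feed.items.map(&:title)

puts "Title: #{channel_title}"
item_titles.each { |title| puts "Item: #{title}" }
```

Skipping validation can be handy for feeds in the wild that are slightly malformed, at the cost of getting back objects that may be missing required elements.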
Producing RSS
Producing our own RSS feeds is easy as well. Let’s make a very basic feed:
  require "rss"

  rss = RSS::Maker.make("atom") do |maker|
    maker.channel.author = "matz"
    maker.channel.updated = Time.now.to_s
    maker.channel.about = "http://www.ruby-lang.org/en/feeds/news.rss"
    maker.channel.title = "Example Feed"

    maker.items.new_item do |item|
      item.link = "http://www.ruby-lang.org/en/news/2010/12/25/ruby-1-9-2-p136-is-released/"
      item.title = "Ruby 1.9.2-p136 is released"
      item.updated = Time.now.to_s
    end
  end

  puts rss
As you can see, this is a very Builder-like DSL. This code will spit out an Atom feed with one item. If we needed a second item, we’d make another block with maker.items.new_item and build a second one.
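As a rough sketch of the multi-item case, the following calls maker.items.new_item once per entry while looping over a list. The posts array, its titles, and the example.com URLs are hypothetical placeholders; the maker calls are the same ones used above.

```ruby
require 'rss'

# Hypothetical entries; the titles and example.com links are placeholders.
posts = [
  { title: "First item",  link: "http://example.com/posts/1" },
  { title: "Second item", link: "http://example.com/posts/2" },
]

rss = RSS::Maker.make("atom") do |maker|
  maker.channel.author = "matz"
  maker.channel.updated = Time.now.to_s
  maker.channel.about = "http://example.com/feed.atom"
  maker.channel.title = "Example Feed"

  # One new_item block per entry in the list.
  posts.each do |post|
    maker.items.new_item do |item|
      item.link = post[:link]
      item.title = post[:title]
      item.updated = Time.now.to_s
    end
  end
end

# Atom titles are text constructs, so the string lives in #content.
entry_titles = rss.entries.map { |entry| entry.title.content }
puts rss
```

Driving new_item from a collection keeps the feed definition short when the entries come from somewhere else, such as a database query or a directory of files.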
Copyright
Copyright © 2003-2007 Kouhei Sutou <kou@cozmixng.org>
You can redistribute it and/or modify it under the same terms as Ruby.
There is an additional tutorial by the author of RSS at: http://www.cozmixng.org/~rwiki/?cmd=view;name=RSS+Parser%3A%3ATutorial.en
Constants
CONTENT_PREFIX = 'content'
CONTENT_URI = "http://purl.org/rss/1.0/modules/content/"
IMAGE_PREFIX = 'image'
IMAGE_URI = 'http://purl.org/rss/1.0/modules/image/'
IMAGE_ELEMENTS = []
TRACKBACK_PREFIX = 'trackback'
TRACKBACK_URI = 'http://madskills.com/public/xml/rss/module/trackback/'
AVAILABLE_PARSER_LIBRARIES = [ ["rss/xmlparser", :XMLParserParser], ["rss/xmlscanner", :XMLScanParser], ["rss/rexmlparser", :REXMLParser], ]
AVAILABLE_PARSERS = []
VERSION = "0.2.7"
URI = "http://purl.org/rss/1.0/"
DEBUG = false
NotExceptedTagError = NotExpectedTagError
UnknownConvertMethod = UnknownConversionMethodError
ITUNES_PREFIX = 'itunes'
ITUNES_URI = 'http://www.itunes.com/dtds/podcast-1.0.dtd'
SY_PREFIX = 'sy'
SY_URI = "http://purl.org/rss/1.0/modules/syndication/"
TAXO_PREFIX = "taxo"
TAXO_URI = "http://purl.org/rss/1.0/modules/taxonomy/"
TAXO_ELEMENTS = []
SLASH_PREFIX = 'slash'
SLASH_URI = "http://purl.org/rss/1.0/modules/slash/"
DC_PREFIX = 'dc'
DC_URI = "http://purl.org/dc/elements/1.1/"
DublincoreModel = DublinCoreModel
Undocumented pile of ruby
> If you’d like to read someone’s RSS feed with your Ruby code, you’ve come to the right place
No, you’ve definitely come to the wrong place. RSS is one of the worst-documented libraries I’ve ever seen for Ruby. It’s as confusing and misleading as it can get.
RSS feeds in Rails
Fetching RSS feeds in the request/response cycle inside a Rails application is probably not the very best approach, as it will make your application as slow as the server serving RSS feeds. Another option is to do it asynchronously using a worker or a queue, but this can also become quite complex and hard to maintain over time.
Another solution is to use an API like superfeedr.com and its Rails Engine (http://blog.superfeedr.com/consuming-rss-feeds-rails/). All the polling and parsing is done on Superfeedr’s side, and your application is notified through a webhook in real time as soon as the resources are updated.
What you really want is
The official Ruby documentation for this library is available at http://ruby-doc.org/stdlib-2.2.3/libdoc/rss/rdoc/RSS.html
But the library that will be most helpful to you is called Feedjira: http://feedjira.com/