RSS and Syndication

CHAPTER 9

In this chapter, we describe how portlets can aggregate links to content on external web sites using the group of standards known as RSS. We also discuss how the content of your own portal could be syndicated for convenient inclusion in external sites using the same mechanism.

Overview of RSS

RSS is not a single standard. It is several standards, some closely related and others more loosely so. The most commonly used versions of RSS are 0.9 and 0.91, both of which were released by Netscape to allow content from external web sites to be aggregated into its My Netscape portal.

Since 0.91, two groups have produced new versions of RSS with varying degrees of backward compatibility. UserLand Software carried out early development of RSS for Netscape and has subsequently released versions 0.92, 0.93, 0.94, and 2.0. The RSS-DEV working group (an independent group of developers) released the 1.0 version of RSS, stemming from the 0.91 version.

NOTE Some but not all of these versions are based on the Resource Description Framework (RDF) format. This rather more consistently managed standard from the World Wide Web Consortium (W3C) provides a standard for presenting metadata. A syndication feed is a set of metadata: it does not (generally) provide the articles themselves, but it does provide their titles, some associated links, and abstracts of the articles. RDF is ideal in this respect; however, it is quite a complex standard. RSS pragmatically provides a reasonable subset of this information oriented specifically toward syndication, at the cost of a somewhat fragmented standard.

Even the naming of the standard reflects the version confusion.
Correctly or otherwise, you may see any of these versions referred to as “Really Simple Syndication,” “Rich Site Summary,” or “RDF Site Summary.” In practice, it is simplest to refer to RSS by its acronym alone, and to use a version number if you feel the need to be specific.

The good news is that amid this riot of colorful standards for RSS, the RSS Portlet that we use to acquire and present syndicated content is quite agnostic. You can import an RSS feed in any format from 0.90 through 2.0. The only thing that you cannot import is invalid XML.

RSS is not the only game in town: there are various other standards for making this type of meta-information available and for syndicating content. Although we won’t be covering them any further, you should be familiar with RDF (which we’ve already mentioned) and the up-and-coming “Atom” standard (in development at www.atomenabled.org), which aims to be a more “standard” standard!

Walking Through an Example RSS File

Let’s now take a look at some concrete examples of RSS feeds in the most commonly encountered 0.9 and 0.91 formats.
Both formats provide a number of optional elements, but for the most part we will ignore these in favor of those most commonly encountered “in the wild.”

Version 0.9

The following is a correct RSS 0.9 feed describing the authors’ web site, including the compulsory elements and some of the optional ones:

    <?xml version="1.0"?>
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns="http://my.netscape.com/rdf/simple/0.9/">
      <channel>
        <title>PortalBook Technical Notes</title>
        <link>http://portalbook.com/</link>
        <description>
          Discourse and exposition on Java and developing Portlets
        </description>
      </channel>
      <item>
        <title>New version of Jetspeed released</title>
        <link>http://portalbook.com/notes/005.html</link>
      </item>
      <item>
        <title>Collections and iterations</title>
        <link>http://portalbook.com/notes/004.html</link>
      </item>
      <item>
        <title>Deprecated techniques</title>
        <link>http://portalbook.com/notes/003.html</link>
      </item>
    </rdf:RDF>

The format is so simple it barely needs explanation, which is indubitably one of the reasons for the enthusiastic early take-up.

The first version of RSS was a valid RDF document. As such it fell within the RDF namespace defined by the W3C. The simple elements required by Netscape’s format are specified in the default namespace:

    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns="http://my.netscape.com/rdf/simple/0.9/">

The <channel> element contains the metadata for the feed: its title, the site from which it can be obtained, and a human-readable description of its content.
One of the deficiencies of the 0.9 format compared with later versions is that it is restricted to a single channel, so a web site offering diverse subject matter must provide multiple distinct feeds rather than a single RSS feed with multiple channels:

    <channel>
      <title>PortalBook Technical Notes</title>
      <link>http://portalbook.com/</link>
      <description>
        Discourse and exposition on Java and developing Portlets
      </description>
    </channel>

The <item> element repeats, once for each article or item of interest that is being publicized in the feed. There is a hard limit of 15 items in the channel. Each item includes a title describing the data to be propagated and a link to the data in question. This extremely sparse information is all that is permitted:

    <item>
      <title>Deprecated techniques</title>
      <link>http://portalbook.com/notes/003.html</link>
    </item>

Version 0.91

The following is a correct RSS 0.91 feed describing the authors’ web site, including the compulsory elements and some of the optional ones:

    <?xml version="1.0"?>
    <rss version="0.91">
      <channel>
        <title>PortalBook Technical Notes</title>
        <link>http://portalbook.com/</link>
        <description>
          Discourse and exposition on Java and developing Portlets
        </description>
        <language>en-us</language>
        <copyright>
          Copyright: (C) 2003 Dave Minter and Jeff Linwood
        </copyright>
        <item>
          <title>New version of Jetspeed released</title>
          <link>http://portalbook.com/notes/005.html</link>
          <description>
            We let you know the latest changes and improvements to the
            Jetspeed portlet server in the new version.
          </description>
        </item>
        <item>
          <title>Collections and iterations</title>
          <link>http://portalbook.com/notes/004.html</link>
          <description>
            Misuse of Collections can result in hidden nested iterations
            that rapidly become a serious performance drag. We discuss
            how to avoid this and similar pitfalls.
          </description>
        </item>
        <item>
          <title>Deprecated techniques</title>
          <link>http://portalbook.com/notes/003.html</link>
          <description>
            Bad habits die hard. We discuss some of the techniques that
            were legitimate in older versions of Jetspeed and the
            approaches that should replace them.
          </description>
        </item>
      </channel>
    </rss>

This format is not quite as simple as that of version 0.9 but does contain some compensatory features. The feed states its RSS version explicitly, making it a little easier to keep track of what data is incoming:

    <rss version="0.91">

Again only one channel is permitted by the standard. In this version, however, the <channel> element encompasses all of the subsequent items along with the channel’s metadata:

    <channel>

Rather more information about the channel is available in a 0.91 feed. As well as the title, link, and description, we are provided with an associated language and copyright information:

    <title>PortalBook Technical Notes</title>
    <link>http://portalbook.com/</link>
    <description>
      Discourse and exposition on Java and developing Portlets
    </description>
    <language>en-us</language>
    <copyright>
      Copyright: (C) 2003 Dave Minter and Jeff Linwood
    </copyright>

The <item> elements are also rather better equipped. In addition to the <title> and <link> elements, we have a description. This is usually populated with an abstract of the content that is to be covered in the associated link:

    <item>
      <title>New version of Jetspeed released</title>
      <link>http://portalbook.com/notes/005.html</link>
      <description>
        We let you know the latest changes and improvements to the
        Jetspeed portlet server in the new version.
      </description>
    </item>

This version of the standard is not limited to 15 item elements, but enough software assumes that limit that we consider it safer to observe it.
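To make the differences between the 0.9 and 0.91 layouts concrete, here is a minimal sketch of reading either kind of feed with the standard JAXP DOM parser. It is not code from the RSS Portlet: the class name FeedReader, its method names, and the MAX_ITEMS constant are our own illustrative choices, and the version check is deliberately simplistic (it looks only at the root element and its attributes).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of any RSS library.
public class FeedReader {
    // Conservative cap on items, mirroring the 15-item limit discussed above.
    public static final int MAX_ITEMS = 15;
    // Default namespace used by Netscape's RSS 0.9 format.
    public static final String RSS_09_NS = "http://my.netscape.com/rdf/simple/0.9/";

    // Parses the feed text; anything that is not well-formed XML is rejected.
    public static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    // An <rss> root carries its version attribute; an RDF root that declares
    // the Netscape default namespace is treated as 0.9.
    public static String detectVersion(Document doc) {
        Element root = doc.getDocumentElement();
        if ("rss".equals(root.getTagName())) {
            return root.getAttribute("version");
        }
        if (RSS_09_NS.equals(root.getAttribute("xmlns"))) {
            return "0.9";
        }
        return "unknown";
    }

    // Collects item titles wherever the <item> elements sit (beside the
    // channel in 0.9, inside it in 0.91), stopping at MAX_ITEMS.
    public static List<String> itemTitles(Document doc) {
        NodeList items = doc.getElementsByTagName("item");
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < items.getLength() && titles.size() < MAX_ITEMS; i++) {
            Element item = (Element) items.item(i);
            NodeList title = item.getElementsByTagName("title");
            if (title.getLength() > 0) {
                titles.add(title.item(0).getTextContent().trim());
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        String feed =
            "<?xml version=\"1.0\"?>"
          + "<rss version=\"0.91\"><channel>"
          + "<title>PortalBook Technical Notes</title>"
          + "<link>http://portalbook.com/</link>"
          + "<description>Discourse and exposition</description>"
          + "<item><title>Deprecated techniques</title>"
          + "<link>http://portalbook.com/notes/003.html</link></item>"
          + "</channel></rss>";
        Document doc = parse(feed);
        System.out.println("RSS version: " + detectVersion(doc));
        for (String title : itemTitles(doc)) {
            System.out.println(" - " + title);
        }
    }
}
```

Because the lookup is by element name rather than by position, the same itemTitles() call works for both layouts. A production reader would also need to decide how to treat feeds that carry a DOCTYPE declaration, which this sketch simply leaves to the parser defaults.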
Version 2.0

The following is a correct RSS 2.0 feed describing the authors’ web site, including the compulsory elements and some of the optional ones:

    <?xml version="1.0"?>
    <rss version="2.0">
      <channel>
        <title>PortalBook Technical Notes</title>
        <link>http://portalbook.com/</link>
        <description>
          Discourse and exposition on Java and developing Portlets
        </description>
        <language>en-us</language>
        <copyright>
          Copyright: (C) 2003 Dave Minter and Jeff Linwood
        </copyright>
        <item>
          <title>New version of Jetspeed released</title>
          <link>http://portalbook.com/notes/005.html</link>
          <description>
            We let you know the latest changes and improvements to the
            Jetspeed portlet server in the new version.
          </description>
        </item>
        <item>
          <title>Collections and iterations</title>
          <link>http://portalbook.com/notes/004.html</link>
          <description>
            Misuse of Collections can result in hidden nested iterations
            that rapidly become a serious performance drag. We discuss
            how to avoid this and similar pitfalls.
          </description>
        </item>
        <item>
          <title>Deprecated techniques</title>
          <link>http://portalbook.com/notes/003.html</link>
          <description>
            Bad habits die hard. We discuss some of the techniques that
            were legitimate in older versions of Jetspeed and the
            approaches that should replace them.
          </description>
        </item>
      </channel>
    </rss>

If you compare this feed with the 0.91 feed shown earlier, you’ll see a striking similarity. In fact, they’re identical aside from the version number. So what’s the point?

Figure 9-1. NetNewsWire Lite presenting a set of RSS feeds

RSS 2.0 provides a much larger set of optional elements that can be included in your feed. However, the later the version of RSS that you select for your implementation, the less likely it is that client software will provide compatibility for it.
Therefore, you need to weigh this disadvantage against the richer variety of optional metadata that you can include in a 2.0 feed (publication dates, unique identifiers, and so forth; for the full list, see the current specification for RSS 2.0 at http://blogs.law.harvard.edu/tech/rss).

RSS Browsers

As we have discussed, the original purpose of RSS was to allow headlines to be imported into other web pages. However, a number of specialized browsers have appeared that provide a convenient user interface for browsing through these content summaries.

The example shown in Figure 9-1 is NetNewsWire Lite running on a Macintosh, and it illustrates the basic functionality you can expect to see in an RSS browser. The browser provides a list of sites from which you may choose an RSS feed. Selecting a site lists the article titles available on the site. Selecting a title displays the abstract of the article. A link is provided that will launch a web browser with the article in question.

Most of the remaining screen shots in this chapter are taken from a Java Swing-based RSS browser called RSS Viewer, which you can download from http://sourceforge.net/projects/rssview/. RSS Viewer lacks some of the finesse of NetNewsWire Lite, but being Java-based, it will run on any platform.

A list of other RSS resources, including RSS browsers for various platforms, is available at www.lights.com/weblogs/rss.html.

Displaying Syndicated Information in Portlets

It is possible that your portal will supply a portlet for displaying RSS streams, but failing that, a number of third-party portlets already exist that provide this service. We will discuss a portlet available from the Portlet Open Source Trading (POST) site at http://portlet-opensrc.sourceforge.net/.
NOTE The Portlet Open Source Trading (POST) site provides a set of open source portlets that conform to the Java portlet API or the Web Services for Remote Portlets (WSRP) standard. As of this writing, it has also released a Google portlet, an e-mail portlet, a wizard portlet, and an upload portlet.

The portlet application we are using is called RSS Portlet and is provided as a WAR file to be deployed in your portal. The open source license for RSS Portlet is a BSD-style license, so you can use it for free as-is, or make any changes you like to it (although under this license, if you do so, you’re not allowed to call your derivation “RSS Portlet”).

The RSS Portlet uses XSL files to translate the incoming RSS feeds into HTML. A style sheet called html.xsl converts 0.9x RSS feeds, and a style sheet called Rss20.xsl converts 2.0 RSS feeds. Both of these files are stored in the WEB-INF directory of the portlet.

TIP At the time of writing, there is a bug in the html.xsl file. If your portlet fails to load and leaves errors like the following in your log files:

    "Can not resolve namespace prefix: im"

you will need to remove the “im” and “rss-sample” entries from the line beginning “exclude-result-prefixes” so that it now reads:

    exclude-result-prefixes="rdf dc dcterms rss content annotate admin image cc reqv"

The removed entries refer to XML namespaces that have not been declared. Earlier versions of the XML parser tolerated this error.

Figure 9-2. Browsing 0.9-style RSS feeds

This portlet makes use of the Xalan XSLT processor to read and translate the RSS streams. Although the J2SE 1.4 runtime is provided with a version of Xalan, it lacks some of the more up-to-date features required by the RSS Portlet. You may, therefore, need to install the latest Xalan JAR files in your portal.
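The style sheet approach described above can be sketched with the standard JAXP transformation API (javax.xml.transform). The sketch makes several assumptions: the class name FeedToHtml is hypothetical, the embedded style sheet is a drastically reduced stand-in of our own rather than the html.xsl or Rss20.xsl shipped with the RSS Portlet, and it uses whichever XSLT processor the Java runtime supplies.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; not taken from the RSS Portlet source.
public class FeedToHtml {
    // Illustrative style sheet: turns each 0.91/2.0 item into a linked
    // list entry. It does not handle RDF-based (0.9 or 1.0) feeds.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/\">"
      + "<ul><xsl:for-each select=\"rss/channel/item\">"
      + "<li><a href=\"{normalize-space(link)}\">"
      + "<xsl:value-of select=\"normalize-space(title)\"/>"
      + "</a></li>"
      + "</xsl:for-each></ul>"
      + "</xsl:template></xsl:stylesheet>";

    // Applies the style sheet to the feed text and returns an HTML fragment.
    public static String toHtml(String rssXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(rssXml)),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not transform feed", e);
        }
    }

    public static void main(String[] args) {
        String feed =
            "<rss version=\"0.91\"><channel>"
          + "<title>PortalBook Technical Notes</title>"
          + "<item><title>Deprecated techniques</title>"
          + "<link>http://portalbook.com/notes/003.html</link></item>"
          + "</channel></rss>";
        System.out.println(toHtml(feed));
    }
}
```

A portlet would typically create the Transformer once from a style sheet file in WEB-INF and write the result to the render response, rather than rebuilding it from a string on every call as this sketch does.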
TIP If you are using a portlet server based on the Tomcat application server, such as Pluto or Jetspeed, and you are using the 1.4 version of the J2SDK, you will need to take additional measures to have your new Xalan JAR files override the JAR files provided with the SDK. To do this, place the JAR files in the common/endorsed/ directory. An error message like “The output format must have a '{http://xml.apache.org/xslt}content-handler' property!” is indicative of this particular problem.

The RSS feeds that will appear in the portlet are configured from the “RssXml” preference. The default set of preferences configured in portlet.xml is as follows:

    <name>RssXml</name>
    <value>http://www.theserverside.com/rss/theserverside-0.9.rdf</value>
    <value>http://rss.com.com/2547-12-0-20.xml</value>
    <value>http://headlines.internet.com/internetnews/top-news/news.rss</value>
    <value>http://headlines.internet.com/internetnews/fina-news/news.rss</value>
    <value>http://www.sciencedaily.com/newsfeed.xml</value>

Naturally you will want to customize the available feeds to suit the audience of your portal. The default list includes some 0.9-style feeds, as shown in Figure 9-2, along with some 2.0-style feeds, as shown in Figure 9-3.

Figure 9-3. Browsing 2.0-style RSS feeds

The RSS Portlet discussed here is not yet a polished piece of work suitable for all situations, but it illustrates the techniques required to display an RSS feed in a portlet. Since it is open source software, it provides an excellent basis for developing a custom RSS portlet for your own purposes.

Syndicating [...] application available as an RSS stream that third parties may subscribe to. For example, an information portal could represent the names of the available portlets as an RSS feed for interested parties. They would then be aware when new functionality was made available to them.

Probably the simplest way to create a valid RSS stream is to write a JavaServer Pages (JSP) page. You cannot make an RSS stream available [...] definition must be displayed as part of a valid HTML page, whereas an RSS stream is an XML document. The RSS stream must be made visible through a servlet (either created directly or as a JSP page). Although we normally use JSP to generate HTML, there is nothing to stop us from using it to generate other document types. A simple example of an RSS 2.0-compliant JSP page follows:

    [...]
    </rss>

While this is not an especially attractive piece of code, it clearly demonstrates the ease of generating an RSS feed. Figure 9-4 shows the generated feed in the RSS Viewer application.

Figure 9-4. Browsing our JSP-generated feed

Alternatively, we can generate our RSS directly from code within [...] the XML page directly. Finally, you could use a library specifically oriented to RSS, such as RSS4J, available from www.churchillobjects.com/c/13005.html. RSS4J is yet another open source library, and allows you to model an RSS document rather than the rather more abstract XML document. The library permits you to build an RSS 0.9, 0.91, or 1.0 document, but does not currently allow you to build a 2.0-compliant [...]

    [...] javax.servlet.http.*;
    import churchillobjects.rss4j.*;
    import churchillobjects.rss4j.generator.*;

    public class RSSExample extends HttpServlet {

        public RSSExample() {
        }

        public void init() {
            document = new RssDocument();
            document.setVersion(RssDocument.VERSION_91);

            RssChannel channel = new RssChannel();
            channel.setChannelLanguage("en");
            [...] Society");
            channel.setChannelLink("http://example.com");
            channel.setChannelDescription(
                "Discourse on the subject, object, and practice of examples");
            channel.setChannelUri("http://example.com/rss/");
            document.addChannel(channel);

            for (int i = 0; i < 10; i++) {
                RssChannelItem item = new RssChannelItem();
                item.setItemTitle("Article number " + i);
                item.setItemLink("http://example.com/" + i);
                item.setItemDescription( [...]

                [...]
                response.setContentType("text/xml");
                OutputStream out = response.getOutputStream();
                RssGenerator.generateRss(document, out);
                out.flush();
            } catch (RssGenerationException e) {
                throw new ServletException("Cannot generate RSS feed", e);
            }
        }

        private RssDocument document;
    }

In our example, an RssDocument is built in the init() method (when the servlet [...] rendered every time that the doGet() method is invoked. Figure 9-5 shows the generated feed in the RSS Viewer application.

Figure 9-5. Browsing our servlet-generated feed

The major objects used here (RssDocument, RssChannel, and RssChannelItem) correspond exactly with the elements that are required in the RSS feed. If RSS4J does not meet your needs, it would be fairly easy to extend. Alternatively, the open source [...]

[...] we discussed the origins of RSS and the various available flavors. We introduced you to an RSS browser application and demonstrated how you can incorporate an RSS feed into your portal. Finally, we showed how you can create an RSS feed from your own applications using the RSS4J library. In the next chapter, we will show you how to incorporate search tools into your portlets and applications.
