Running a Weblog from IRC


WebReview.com: August 24, 2001: Running a Weblog from IRC

Related Links

What is IRC?

ircdhelp.org, for advice on running an IRC server.

Python

XSL Transformations (WebReview.com)

PHP's Newest Weapon: The Sablotron XSL Processor (WebReview.com)

• Some blogs that are produced from IRC: Daily Churn, Daily Chump, RDF Interest Group.


Internet Relay Chat (IRC) has been around on the Internet for years. Like NNTP newsgroups, IRC was in popular use long before the rise of the pretty-pretty messaging clients that most of us are used to today. Like many pre-Web technologies, IRC has been overlooked by many in favor of less powerful tools with more accessible interfaces.

This shift to desktop tools with graphical user interfaces is entirely understandable. Log on to an IRC server on a popular network and your first impressions are likely to be of a somewhat unfriendly, incomprehensible jungle of cryptic messages, contributed by individuals with highly illegible nicknames. If you survive the initial confusion, then there's always the barrage of private messages offering you warez and other forbidden (and probably virus-ridden) fruit. Your entirely reasonable reaction is to switch back to your favorite instant messaging client where you know who your friends are.

Despite its harsh exterior, IRC is in fact a great tool for collaboration and communication. Its great advantage is that an IRC channel is like a room: you can "hear" whoever's in it. Private communication is possible too, of course, but the fact that the default chatting mode is public is the key thing about a channel. IRC is also an open technology, meaning that anyone can run a server, and software for IRC is widely available and often free. The Open Projects IRC network was created specifically to aid the development of open-source efforts.

Availability and openness are big advantages, especially when geographically distributed folks need to talk. Many collaborative software projects such as the GNOME desktop or the Mozilla browser now run their own IRC servers. Different teams (user interface and testing, for example) can have different IRC channels, and a user can be on as many of these as he or she wants.

Bots

Many IRC channels have bots on them. Bots are programs, sometimes quite complex ones, that appear in channels like regular users and perform tasks. One of the most popular uses of a bot is to protect an IRC channel from unscrupulous interlopers who attempt to take over or otherwise disrupt a channel. Another popular use, more commonly found in channels that support development projects, is to have the bot report useful information; for instance, to report the latest Slashdot news, the last time each person said anything on the channel, or when the latest build of the software is available.

Bots are often written to make life easier for a group of collaborators. One of the most celebrated of these is the Infobot, a multipurpose Perl bot whose primary purpose is to remember facts. It has been extended to allow users to find the weather, send text messages to cellphones, retrieve headlines from Web sites, and many more useful functions.

Bots for Blogs

On many occasions, conversation on IRC involves the cutting and pasting of URLs from interesting Web sites. Whether discussing work in progress, the location of a download, or just a funny page, it's surprising how much conversation, whether on IRC or in email, involves the exchange of URLs. This activity alone accounts for much of the popularity of Weblogs.

If some IRC channels are full of traded links, and if Weblogs tend to be a lot of the same, then it seems a logical conclusion to try and publish a Weblog from IRC—to capture all the banter and ideas generated by a conversation and publish it in Weblog format.

The first implementation of the IRC-to-Weblog idea was done by Bijan Parsia and friends at Monkeyfist.com. Bijan's bot drives the Monkeyfist Daily Churn Weblog. He called his program "DiaWebLogBot," and wrote it in Squeak, a version of the Smalltalk programming language.

Having played with the bot at Monkeyfist, I wanted to run one of my own. However, not being a Squeaker, I preferred a version in a language like Python or Perl. So I challenged a friend, Matt Biddulph, to implement Bijan's bot in Python, and make it output the blog as an XML document. As a result, The Daily Chump, named in honor of the Daily Churn, was born.

Using the bot is pretty simple. First, you need to enter a URL into the IRC channel where the bot is running. The bot then spits back a prompt for a label for that URL. You can then title the URL by preceding your label text with a pipe symbol (|). The screenshot below shows this in action on the Chump bot's "home" Weblog, The Daily Chump.

Figure 1: Entering a URL into an IRC window

When formatted onto a Web page, the output from the bot will look something like the following. Note that the word "Webreview" at the top of the log entry is the label we provided for the http://www.webreview.com URL.

Figure 2: Resulting Web page

It's basically as simple as that. The resultant Weblog ends up looking like a more exclusive version of MetaFilter.

Getting Fancy

Not content with merely posting URLs and annotating them? The DiaWebLogBot also provides inline markup features so that you can annotate text and add other hyperlinks to your labels. The markup code is based on that used by WikiWikiWeb. (If you've not seen a Wiki before, I encourage you to check it out. It's a great tool for rapid Web authoring.) To add hyperlinks to your label, you use square brackets and the pipe symbol, like so:

Figure 3: Adding an inline link to a comment

When rendered, this is converted into the familiar <a href="..."></a> of HTML:

Figure 4: Inline links as rendered to a browser

Perhaps my favorite feature is the ability to include images, too. This is done exactly like an inline link, but with a plus sign (+) in front of the square brackets.

Figure 5: Inserting an image into the comments

The bot converts the markup in the IRC channel into the appropriate <img /> tag on the resulting Weblog.

Figure 6: An image rendered in the comments
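
As a rough illustration of the link and image markup, here is a minimal Python sketch of the conversion. It is not the bot's own parser. It assumes the label comes before the pipe and the address after it, as in [label|url], an ordering that only the screenshots show, and the name render_markup is made up for this example.

```python
import re
from html import escape

# Assumed syntax (not confirmed by the article text): [label|url] for links,
# +[label|url] for images. The optional leading "+" selects the image form.
MARKUP = re.compile(r"(\+?)\[([^\]|]+)\|([^\]|]+)\]")


def render_markup(comment):
    """Turn wiki-style link and image markup in a comment into HTML."""
    parts = []
    pos = 0
    for match in MARKUP.finditer(comment):
        # Escape the plain text that sits between two pieces of markup
        parts.append(escape(comment[pos:match.start()]))
        plus, label, url = match.groups()
        label = escape(label.strip())
        url = escape(url.strip())
        if plus:
            parts.append('<img src="%s" alt="%s" />' % (url, label))
        else:
            parts.append('<a href="%s">%s</a>' % (url, label))
        pos = match.end()
    parts.append(escape(comment[pos:]))
    return "".join(parts)


print(render_markup("[XML.com|http://xml.com] also has an article on SVG"))
```

Escaping everything outside the markup keeps a stray angle bracket typed in the channel from breaking the resulting page.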

Not all conversation you want to record starts with a URL, of course. Perhaps you're collaboratively working on some ideas for a new site, or just wanting to vent. DiaWebLogBots have a handy feature to facilitate this, called "blurbing." To start a blurb, you have to settle on a title for your rant, and then you can annotate in the same way as you can with a URL.

Figure 7: Starting a blurb

When rendered into a Web page, the only difference between a blurb and a URL is that the title isn't linked to any Web page.

Figure 8: A blurb rendered on a Web page

That about covers the basic commands. For more detailed information, see the online manual.

Setting Up A Bot

Although it's relatively straightforward to actually set up the bot so that it's running in an IRC channel, you'll need some familiarity with programming to do it. The prerequisites for running the bot are a working Python installation (Python is available for many platforms), and access to an IRC server. Note that many IRC administrators don't like unauthorized bots running, so you ought to ask permission before setting it up.

First, download the code for the bot and save it on a machine that has Python installed and can reach your IRC server. Next, pick a directory where the bot will write its XML document (the resulting Weblog). Finally, start the bot with one of the following commands (each command should be typed all on one line).

For a Unix system, replace the server name, nickname, channel, and output directory with those of your choice:

python dailychumpbot.py -s irc.yourserver.com -n chumpbot -c "#mychannel" -d /var/www/ircweblog/

For a Windows system, assuming Python is installed as C:\Program Files\Python\Python.exe, use this command:

C:\chump\src>c:\progra~1\python\python.exe dailychumpbot.py -s irc.yourserver.com -n chumpbot -c "#mychannel" -d c:\docs\

After invoking your bot, you should see it appear as a user in your IRC channel. It will also have created a near-empty XML document in the directory you specified. Try posting a link into the channel, as described above, and adding some annotation. Then examine the index.xml file in your output directory. The file should look something like this:

<!DOCTYPE churn>
<churn>
<last-updated value="994948386.374412">2001-07-12
   14:33</last-updated>
<itemcount value="1" />
<link>
<time value="994947258.765120">2001-07-12 14:14</time>
<url>http://webreview.com/</url>
<nick>edd</nick>
<title>webreview</title>
<comment nick="edd">featuring SVG this week. seems
    everyone's getting excited about it!</comment>
<comment nick="edd"><a href="http://xml.com">XML.com</a>
    also has an article on SVG, by Kip Hampton</comment>
<comment nick="edd">Kip creates this neat image:</comment>
<comment nick="edd"><img
    src="http://www.xml.com/2001/07/11/graphics/japh.png"
    /></comment>
</link>
</churn>
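
A file in this shape is easy to read back with Python's standard library. The following is a small sketch, not part of the Chump distribution: the function name read_links and the shortened sample document are made up for illustration, and it only handles the link elements shown in the listing.

```python
import xml.etree.ElementTree as ET


def read_links(xml_text):
    """Return one dict per <link> element in a Chump-style XML document."""
    root = ET.fromstring(xml_text)
    links = []
    for link in root.findall("link"):
        links.append({
            "url": link.findtext("url"),
            "title": link.findtext("title"),
            "nick": link.findtext("nick"),
            # itertext() flattens any inline <a> or <img> markup to plain text
            "comments": ["".join(c.itertext()).strip()
                         for c in link.findall("comment")],
        })
    return links


# Shortened copy of the sample output, used here only as test input
SAMPLE = """<!DOCTYPE churn>
<churn>
<itemcount value="1" />
<link>
<time value="994947258.765120">2001-07-12 14:14</time>
<url>http://webreview.com/</url>
<nick>edd</nick>
<title>webreview</title>
<comment nick="edd">featuring SVG this week.</comment>
</link>
</churn>"""

for entry in read_links(SAMPLE):
    print(entry["title"], entry["url"], len(entry["comments"]))
```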

The Chump bot archives its content daily, where the day is currently deemed to start at midnight UTC. It puts each day's file into a directory named for that day's date, in yyyy/mm/dd format, and names the file yyyy-mm-dd.xml.
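
To make the archive layout concrete, this sketch maps a timestamp to the path described above. It rests on two assumptions: that the value attributes in the XML are Unix timestamps (seconds since 1970, UTC), which matches the dates printed beside them, and that yyyy/mm/dd means nested directories. The function name archive_path is hypothetical.

```python
from datetime import datetime, timezone


def archive_path(timestamp):
    """Map a Unix timestamp to yyyy/mm/dd/yyyy-mm-dd.xml, using the UTC date."""
    # Using UTC explicitly means the day rolls over at midnight UTC,
    # whatever the local time zone of the machine running the code.
    day = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return day.strftime("%Y/%m/%d/%Y-%m-%d.xml")


# The <time> value from the sample entry above
print(archive_path(994947258.765120))  # prints 2001/07/12/2001-07-12.xml
```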

Getting the XML into HTML and RSS

All that remains is to convert the XML into HTML format so that the Weblog can be displayed in a Web browser. Happily, this is one of those tasks which can be accomplished in a multitude of ways. Unhappily (for you) this means that the Chump program itself doesn't come with any predefined way to do this. Here are some suggestions—if you don't understand these or don't think you're capable of implementing any of them, the easiest way is to find somebody who does know and to ask very nicely.

  • Write a PHP, ASP, or JSP (or other favorite scripting environment) program to read the XML file on the fly and send HTML back to the browser.
  • Use mod_xslt, or another XSLT-aware Web serving environment, such as Cocoon, to transform the XML dynamically from the Web server. This is the method the Daily Chump uses, and sample XSLT sheets are included in the Chump program distribution.
  • Develop a Microsoft Internet Explorer XSLT stylesheet for the XML document, and specify it using the Chump's -e option. Note that IE5 uses an old version of XSLT, so this will only work if people are browsing your Weblog with IE.
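
The first suggestion, a script that reads the XML and emits HTML, might look something like the following in Python. This is a bare-bones sketch rather than anything shipped with the Chump: the function name churn_to_html and the page layout are made up, and inline links and images inside comments are flattened to plain text to keep it short.

```python
import xml.etree.ElementTree as ET
from html import escape


def churn_to_html(xml_text):
    """Render each <link> as a linked heading followed by its comments."""
    root = ET.fromstring(xml_text)
    out = ["<html><body>"]
    for link in root.findall("link"):
        url = escape(link.findtext("url", default=""))
        # Fall back to the URL itself when nobody supplied a title
        title = escape(link.findtext("title", default="")) or url
        out.append('<h3><a href="%s">%s</a></h3>' % (url, title))
        out.append("<ul>")
        for comment in link.findall("comment"):
            nick = escape(comment.get("nick", ""))
            # Simplification: inline <a> and <img> children become plain text
            text = escape("".join(comment.itertext()).strip())
            out.append("<li>%s: %s</li>" % (nick, text))
        out.append("</ul>")
    out.append("</body></html>")
    return "\n".join(out)


# Shortened sample document, used here only as test input
SAMPLE = """<churn>
<link>
<url>http://webreview.com/</url>
<nick>edd</nick>
<title>webreview</title>
<comment nick="edd">featuring SVG this week.</comment>
</link>
</churn>"""

print(churn_to_html(SAMPLE))
```

A fuller version would copy the inline markup through instead of flattening it, which is where an XSLT stylesheet starts to look attractive.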

The Chump distribution includes another Python program to create indexes for the archive, so navigation like that provided on the Daily Chump Weblog can be built.

One of the benefits of starting with a base XML document, rather than having the bot directly output HTML, is that XML can fairly easily be transformed into other formats. Aside from the obvious HTML, RSS is one of the most useful transformations one can make. This allows your Weblog to be syndicated to places like Meerkat and Headline Viewer, among others. You can find an example XSLT stylesheet to create an RSS 1.0 translation from your Weblog in the Chump distribution. It is working in practice on the W3C RDF Interest Group Scratchpad, which uses the Chump software. To view the RSS, put ".rss" instead of ".html" on the end of a URL. Here's an example: 2001-07-11.rss. Another possibility XML opens up is the prospect of a WAP version of your Weblog.
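
For readers who would rather not write XSLT, the same idea can be sketched in Python. This is not the stylesheet from the Chump distribution. It builds a minimal RSS 1.0 document from the link elements, and the function name churn_to_rss, the site address, and the site title are placeholders chosen for the example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("", RSS)  # serialize RSS elements without a prefix


def churn_to_rss(xml_text, site_url, site_title):
    """Build a minimal RSS 1.0 feed with one item per <link> element."""
    root = ET.fromstring(xml_text)
    rdf = ET.Element("{%s}RDF" % RDF)
    channel = ET.SubElement(rdf, "{%s}channel" % RSS,
                            {"{%s}about" % RDF: site_url})
    ET.SubElement(channel, "{%s}title" % RSS).text = site_title
    ET.SubElement(channel, "{%s}link" % RSS).text = site_url
    ET.SubElement(channel, "{%s}description" % RSS).text = site_title
    items = ET.SubElement(channel, "{%s}items" % RSS)
    seq = ET.SubElement(items, "{%s}Seq" % RDF)
    for link in root.findall("link"):
        url = link.findtext("url")
        # The channel lists every item; the item itself follows the channel
        ET.SubElement(seq, "{%s}li" % RDF, {"{%s}resource" % RDF: url})
        item = ET.SubElement(rdf, "{%s}item" % RSS, {"{%s}about" % RDF: url})
        ET.SubElement(item, "{%s}title" % RSS).text = link.findtext("title") or url
        ET.SubElement(item, "{%s}link" % RSS).text = url
    return ET.tostring(rdf, encoding="unicode")


# Shortened sample document and a placeholder site address
SAMPLE = """<churn>
<link>
<url>http://webreview.com/</url>
<title>webreview</title>
</link>
</churn>"""

print(churn_to_rss(SAMPLE, "http://example.org/weblog/", "My IRC Weblog"))
```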

Time to Have Fun!

Creating a Weblog from IRC is not only entertaining, but it can be a very useful tool for keeping and publishing a record of IRC collaboration. Whether you decide to make the results public, or just keep them between your friends, the resultant Weblog proves an excellent memory for your discussions. If you do decide to make your Weblog public, send mail to the Chump's maintainers so it can be linked from the Chump home page.


Edd Dumbill is Managing Editor of XML.com, publisher and editor of XMLhack and WriteTheWeb. He is also chair of the XTech 2001 and XML Europe 2002 conferences. In his spare time he hangs out saying silly things on IRC.

