An Introduction to Compassionate Screen Scraping

August 10, 2008 One of the most common quickie projects on the web is to screenscrape a website and play around with its data. These projects are a lot of fun, and can allow for inventive mashups, but often the screenscraping scripts cause unnecessary load on the site's servers due to inconsiderate technique. This is an introduction to the art of compassionate screenscraping.
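
As a rough illustration of what considerate technique can mean in practice, here is a minimal sketch that caches pages it has already fetched and leaves a pause between requests. The class name `PoliteFetcher`, the `fetch` callable, and the two-second default are all assumptions made for this example, not code from the post itself.

```python
import time


class PoliteFetcher:
    """Caches responses and spaces out requests (illustrative sketch only)."""

    def __init__(self, fetch, min_interval=2.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch                # hypothetical callable: url -> body
        self._min_interval = min_interval  # seconds to leave between requests
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_request = None

    def get(self, url):
        # Never ask the server twice for a page we already have.
        if url in self._cache:
            return self._cache[url]
        # Wait out whatever remains of the minimum interval.
        if self._last_request is not None:
            wait = self._min_interval - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = body
        return body
```

The `fetch` argument could be something like `lambda url: urllib.request.urlopen(url).read()`; keeping it injectable makes the throttling and caching easy to try out without touching a real site.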

A Syntax Coloring Template Filter for Django

August 9, 2008 I spent a bit of time this evening writing a template filter for Django that accepts a string of code (and optionally the name of the Pygments lexer to use for highlighting) and returns the code nicely syntax colored. A simple but slightly helpful addition to your templating arsenal.

Python Content Scraper for

August 8, 2008 I spent a while today writing a fairly kind content scraper for, which shows how to use Python's httplib2 and BeautifulSoup to scrape data with a flexible API and minimal HTTP connections.

BossArray for list-like Yahoo search results

July 28, 2008 I recently put together BossArray, which is a simple wrapper around the Yahoo BOSS search results (relying on the Yahoo BOSS Mashup Framework for the heavy lifting). It provides a dirt simple interface mimicking a normal Python list for most interaction.
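
To give a feel for what a list-like wrapper over search results might look like, here is a small sketch. It is not BossArray's actual code and does not use the BOSS Mashup Framework; `LazyResults` and the `fetch_page` callable are hypothetical names, and the page-at-a-time fetching is an assumption made for the example.

```python
from collections.abc import Sequence


class LazyResults(Sequence):
    """Read-only, list-like view over paged search results (hypothetical)."""

    def __init__(self, fetch_page, total, page_size=10):
        self._fetch_page = fetch_page  # hypothetical: (start, count) -> list
        self._total = total
        self._page_size = page_size
        self._pages = {}               # page number -> list of results

    def __len__(self):
        return self._total

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self[i] for i in range(*index.indices(self._total))]
        if index < 0:
            index += self._total
        if not 0 <= index < self._total:
            raise IndexError("result index out of range")
        page, offset = divmod(index, self._page_size)
        # Fetch a page only the first time one of its results is needed.
        if page not in self._pages:
            start = page * self._page_size
            self._pages[page] = self._fetch_page(start, self._page_size)
        return self._pages[page][offset]
```

Because it subclasses `Sequence`, indexing, slicing, `len()`, iteration, and `in` all behave as they would on a normal list, while the underlying search is only queried for the pages that are actually touched.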

Stripping Reddit From HackerNews With BOSS Mashup

July 12, 2008 This tutorial looks at using the Yahoo BOSS Mashup Framework (a simple Python library) to retrieve the RSS feeds for HackerNews and Reddit Programming and strip from HackerNews the stories that also appear on Reddit Programming, returning HackerNews to an earlier era.
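
The core of that idea is removing from one feed every story that also shows up in another. The sketch below does this with plain Python dictionaries rather than the BOSS Mashup Framework; the `strip_shared` helper, the `link` key, and the sample entries are all made up for illustration.

```python
def strip_shared(primary, other, key="link"):
    """Return entries of `primary` whose `key` value is absent from `other`."""
    seen = {entry[key] for entry in other}
    return [entry for entry in primary if entry[key] not in seen]


# Made-up sample data standing in for parsed RSS entries.
hackernews = [
    {"title": "Story A", "link": "http://example.com/a"},
    {"title": "Story B", "link": "http://example.com/b"},
    {"title": "Story C", "link": "http://example.com/c"},
]
reddit = [
    {"title": "Story B (repost)", "link": "http://example.com/b"},
]

filtered = strip_shared(hackernews, reddit)
```

Matching on the link rather than the title is a design choice for this sketch, since the same story is often submitted under different titles on different sites.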