Pages Similar to An Introduction to Compassionate Screen Scraping

Deploying Django with Fabric

A look at how to manage deployment complexity with Django using Fabric. Something of a continuation on the post from yesterday.

An Epic Introduction to PyObjC and Cocoa

PyObjC is one of the most helpful projects I have ever used, but a number of people have had trouble getting started with PyObjC on Leopard because the documentation is in a bit of disarray. In particular, there didn't seem to be a comprehensive, completely up-to-date tutorial that could introduce a newcomer to all the important aspects of PyObjC. Here is my attempt to fill that void. With a vengeance.

Huge CSV and XML Files in Python


A quick walkthrough of my code for converting a very large CSV file into a very large XML file using the Python standard library. Despite a few issues along the way, it was a very pleasant experience.
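
To give a flavor of the idea, here is a minimal sketch of a streaming CSV-to-XML conversion that never holds the whole file in memory. It is not the code from the post: the function name `csv_to_xml`, the `rows`/`row` tag names, and the choice of `xml.sax.saxutils.XMLGenerator` are all assumptions made for illustration, and it assumes every CSV column header is already a valid XML element name.

```python
import csv
import io
from xml.sax.saxutils import XMLGenerator


def csv_to_xml(csv_file, xml_file, root_tag="rows", row_tag="row"):
    """Stream csv_file into xml_file one row at a time; return the row count.

    Hypothetical helper: tag names are illustrative, and column headers
    are assumed to be valid XML element names.
    """
    reader = csv.DictReader(csv_file)
    writer = XMLGenerator(xml_file, encoding="utf-8")
    writer.startDocument()
    writer.startElement(root_tag, {})
    count = 0
    for record in reader:
        writer.startElement(row_tag, {})
        for column, value in record.items():
            if column is None:
                # Extra fields beyond the header row have no name; skip them.
                continue
            writer.startElement(column, {})
            # Short rows yield None for missing fields; write them as empty.
            writer.characters(value or "")
            writer.endElement(column)
        writer.endElement(row_tag)
        count += 1
    writer.endElement(root_tag)
    writer.endDocument()
    return count


if __name__ == "__main__":
    source = io.StringIO("name,qty\nwidget,3\ngadget,5\n")
    target = io.StringIO()
    csv_to_xml(source, target)
    print(target.getvalue())
```

Because both the reader and the writer work incrementally, memory use stays roughly constant regardless of file size; with real files you would pass open file objects instead of the `StringIO` stand-ins.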

Python Content Scraper for

I spent a while today writing a fairly kind content scraper for, which shows how to use Python's httplib2 and BeautifulSoup to scrape data with a flexible API and minimal HTTP connections.

Using Optional Parameters in Django Urls


A simple but helpful trick for using optional parameters in Django views, allowing one view to serve multiple URLs with varying parameters.

Using Threadpools in Python


This article takes a look at creating a threadpool in Python. Specifically, it takes a stab at iteratively processing CSV and XML files and farming out the parsed data for processing by a threadpool. The Python logging, csv, and ElementTree modules make cameo appearances.
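
As a rough illustration of the pattern, here is a small sketch that parses CSV rows in the main thread and hands each one to a pool of workers. It is an assumption-laden stand-in rather than the article's code: it uses the standard library's `concurrent.futures.ThreadPoolExecutor` instead of a hand-built pool, and `handle_row` and `process_csv` are hypothetical names with placeholder work (summing the integer fields of a row).

```python
import csv
import io
import logging
from concurrent.futures import ThreadPoolExecutor

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pool-sketch")


def handle_row(row):
    """Hypothetical per-row work: total the integer fields of one row."""
    total = sum(int(value) for value in row.values())
    log.info("processed %r -> %d", row, total)
    return total


def process_csv(csv_file, workers=4):
    """Parse rows in the calling thread and process them on a thread pool."""
    reader = csv.DictReader(csv_file)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() reads and submits every row up front, then yields results
        # in input order as the workers finish them.
        return list(pool.map(handle_row, reader))


if __name__ == "__main__":
    data = io.StringIO("a,b\n1,2\n3,4\n")
    print(process_csv(data, workers=2))
```

Note that `map()` queues all rows immediately, so for a truly huge file you would probably want to submit work in bounded batches instead.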

Python Datastructures Backed by Redis


Part of my day's experiment was to play with implementing Python data structures on top of Redis. Here we take a look at dictionaries and lists, but it should be straightforward to extend this idea to sets as well.

Storing Bounded Timeboxes in Redis


If you're doing analytics or reports, or working within memory constraints in Redis, you're probably keeping two sorted sets mutually consistent. This article also takes a look at using MULTI/EXEC to keep it fresh.

War Card Game in Python


A simple implementation of the war card game in Python, made for an interview some time back.

Using PyFacebook without the Facebook middleware

I've been working on a Facebook application with a couple of friends recently. We decided to use the PyFacebook library, but there was a brief period of intense confusion on my part about how to use it without the included middleware. I worked through it, though, and this article has some advice on how you can do the same.

All Rights Reserved, Will Larson 2007 - 2015.