One of the most common quickie projects on the web is to screenscrape a website and play around with its data. These projects are a lot of fun and can allow for inventive mashups, but screenscraping scripts often put unnecessary load on the site's servers through inconsiderate technique. This is an introduction to the art of compassionate screenscraping.
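As a rough illustration of what "compassionate" might mean in practice, here is a minimal sketch that assumes two techniques: caching pages already fetched, and pausing between requests. Both are my assumptions about considerate scraping rather than a summary of the article, and `PoliteFetcher`, `min_delay`, and the injectable `fetch`, `sleep`, and `clock` parameters are hypothetical names made up for this example.

```python
import time
import urllib.request


def _default_fetch(url):
    # Plain standard-library download; returns the response body as bytes.
    with urllib.request.urlopen(url) as response:
        return response.read()


class PoliteFetcher:
    """Hypothetical helper: caches pages and spaces out real requests."""

    def __init__(self, min_delay=2.0, fetch=_default_fetch,
                 sleep=time.sleep, clock=time.monotonic):
        self._min_delay = min_delay  # assumed minimum gap between requests, in seconds
        self._fetch = fetch
        self._sleep = sleep
        self._clock = clock
        self._cache = {}
        self._last_request = None

    def get(self, url):
        # Never hit the server twice for the same URL.
        if url in self._cache:
            return self._cache[url]
        # Wait until at least min_delay has passed since the last real request.
        if self._last_request is not None:
            wait = self._min_delay - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = body
        return body
```

Calling `PoliteFetcher().get(url)` twice with the same URL touches the network only once. The cache here lives in memory, so a script that runs repeatedly would likely want to persist it to disk instead.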
A look at how to manage deployment complexity with Django using Fabric. Something of a continuation of yesterday's post.
PyObjC is one of the most helpful projects I have ever used, but a number of people have had trouble getting started with PyObjC on Leopard because the documentation is in a bit of disarray. In particular, there didn't seem to be a comprehensive, completely up-to-date tutorial that could introduce a newcomer to all the important aspects of PyObjC. Here is my attempt to fill that void. With a vengeance.
Quick walkthrough of my code for converting a very large CSV file into a very large XML file using the Python standard library. Despite a few issues along the way, it was a very pleasant experience.
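For a flavor of the approach, here is a minimal sketch of a streaming CSV-to-XML conversion using only the standard library. It is not the code from the article: the function name `csv_to_xml` and the `rows`/`row` tag names are assumptions, and it assumes every CSV header is a valid XML element name and that no row has more fields than the header.

```python
import csv
import io
from xml.sax.saxutils import XMLGenerator


def csv_to_xml(csv_file, xml_file, root_tag="rows", row_tag="row"):
    """Stream rows from csv_file into xml_file without loading either fully."""
    reader = csv.DictReader(csv_file)
    writer = XMLGenerator(xml_file, encoding="utf-8")
    writer.startDocument()
    writer.startElement(root_tag, {})
    for record in reader:
        writer.startElement(row_tag, {})
        for column, value in record.items():
            # Each column becomes a child element named after its header.
            writer.startElement(column, {})
            # characters() escapes &, < and > for us; short rows yield None.
            writer.characters(value or "")
            writer.endElement(column)
        writer.endElement(row_tag)
    writer.endElement(root_tag)
    writer.endDocument()


# Usage with in-memory files; real code would pass open file objects.
source = io.StringIO("name,qty\nwidget,3\nnut & bolt,10\n")
target = io.StringIO()
csv_to_xml(source, target)
```

Because both the reader and the writer work one row at a time, memory use stays flat no matter how large the input file is.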
A simple but helpful trick for using optional parameters in Django views to allow one view to serve multiple urls with varying parameters.
This article takes a look at creating a threadpool in Python. Specifically, it takes a stab at iteratively processing CSV and XML files and farming out the parsed data to a threadpool for processing. The Python logging, csv and ElementTree modules make cameo appearances.
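The general shape of that idea can be sketched with `threading` and `queue`. This is an illustration under my own assumptions, not the article's implementation: `process_rows`, `handler`, and `num_workers` are hypothetical names, and only the CSV side is shown.

```python
import csv
import io
import logging
import queue
import threading


def process_rows(csv_file, handler, num_workers=4):
    """Read CSV rows one at a time and hand each to a pool of worker threads."""
    tasks = queue.Queue(maxsize=100)  # bounded, so the reader cannot race far ahead
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            row = tasks.get()
            if row is None:  # sentinel: no more work for this thread
                break
            try:
                value = handler(row)
                with results_lock:
                    results.append(value)
            except Exception:
                # Log and carry on so one bad row does not kill the worker.
                logging.exception("failed to process row %r", row)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for row in csv.reader(csv_file):
        tasks.put(row)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


# Usage: sum the numbers on each line. Result order is not guaranteed.
totals = process_rows(io.StringIO("1,2\n3,4\n5,6\n"),
                      lambda row: sum(int(cell) for cell in row))
```

The standard library's `concurrent.futures.ThreadPoolExecutor` covers the same ground with less code; the hand-rolled version is shown only to make the moving parts visible.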
Part of my day's experiment was to play with implementing Python data structures on top of Redis. Here we take a look at dictionaries and lists, but it should be straightforward to extend this idea to sets as well.
If you're doing analytics, reports or dealing with memory constraints in Redis, you're probably dealing with keeping two sorted sets mutually consistent. This article also takes a look at using MULTI/EXEC to keep them in sync.
I've been working on a Facebook application with a couple of friends recently. We decided to use the PyFacebook library, but there was a brief period of intense confusion on my part about how to use it without the included middleware. I worked through it, though, and this article has some advice on how you can do the same.
A quick and pointless look at implementing tail in Python. Something of a koan.
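One minimal way to get tail-like behaviour, which may well differ from the approach the article takes, is a bounded `collections.deque`. The function name `tail` and its parameter `n` are assumptions for this sketch, and it reads through the whole input rather than seeking from the end of the file.

```python
import io
from collections import deque


def tail(lines, n=10):
    """Return the last n items of any iterable of lines, such as an open file."""
    # A deque with maxlen discards the oldest entries as new ones arrive,
    # so only n lines are ever held in memory.
    return list(deque(lines, maxlen=n))


# Usage with an in-memory file; a real call would pass open("some.log").
last_two = tail(io.StringIO("a\nb\nc\nd\n"), n=2)
```

This is simple and memory-friendly, but on a very large file it still has to scan every line to reach the end.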
Software engineer, technical leader, sci-fi reader, and so on. Born in NC, living in SF, and glad to grab a coffee.