Pages similar to An Introduction to Compassionate Screen Scraping

Deploying Django with Fabric

A look at how to manage deployment complexity with Django using Fabric. Something of a continuation on the post from yesterday.

An Epic Introduction to PyObjC and Cocoa

PyObjC is one of the most helpful projects I have ever used, but a number of individuals have been having trouble getting started with PyObjC on Leopard because the documentation is in a bit of disarray. In particular, there didn't seem to be a comprehensive, completely up-to-date tutorial that could introduce a newcomer to all the important aspects of PyObjC. Here is my attempt to fill that void. With a vengeance.

Python Content Scraper for OneManga.com

I spent a while today writing a fairly kind content scraper for OneManga.com, which shows how to use Python's httplib2 and BeautifulSoup to scrape data with a flexible API and minimal HTTP connections.

Huge CSV and XML Files in Python

01/22/2009

Quick walkthrough of my code for converting a very large CSV file into a very large XML file using the Python standard libraries. Despite a few issues along the way, it was a very pleasant experience.
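
The article's own code is not reproduced on this page. As a rough sketch of the general idea, the example below streams rows out of a CSV file and writes them as XML one row at a time, so neither file has to fit in memory. The function name csv_to_xml, the tag names and the sample data are all made up for illustration, and the sketch assumes every column header is a valid XML element name and every row has the same number of fields as the header.

```python
import csv
import io
from xml.sax.saxutils import escape


def csv_to_xml(csv_file, xml_file, root_tag="rows", row_tag="row"):
    """Copy rows from an open CSV file to an open XML file, one row at a time.

    Returns the number of data rows written.
    """
    reader = csv.DictReader(csv_file)  # first line supplies the element names
    xml_file.write("<%s>\n" % root_tag)
    count = 0
    for record in reader:
        xml_file.write("  <%s>" % row_tag)
        for field, value in record.items():
            # escape() replaces &, < and > so the text stays well-formed
            xml_file.write("<%s>%s</%s>" % (field, escape(value or ""), field))
        xml_file.write("</%s>\n" % row_tag)
        count += 1
    xml_file.write("</%s>\n" % root_tag)
    return count


# In-memory files stand in for the very large files on disk.
source = io.StringIO("name,note\nada,a < b\ngrace,plain text\n")
target = io.StringIO()
rows_written = csv_to_xml(source, target)
print(target.getvalue())
```

With real files, the same function can be passed two objects returned by open(); passing newline="" when opening the CSV file is what the csv module documentation recommends.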

Using Optional Parameters in Django Urls

02/04/2008

A simple but helpful trick for using optional parameters in Django views to allow one view to serve multiple URLs with varying parameters.

Using Threadpools in Python

02/10/2009

This article takes a look at creating a threadpool in Python. Specifically, it takes a stab at iteratively processing CSV and XML files and farming out the parsed data for processing by a threadpool. The Python logging, csv and ElementTree modules make cameo appearances.
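
The article's implementation is not shown on this page, so what follows is only a minimal sketch of the pattern it describes: rows are parsed one at a time and handed to a small pool of worker threads through a queue. The function name process_rows, the worker count and the sample data are assumptions made for the example, and error handling inside the workers is left out.

```python
import csv
import io
import queue
import threading


def process_rows(rows, handler, num_workers=4):
    """Run handler(row) on every row using a small pool of worker threads.

    Results are returned in completion order, not input order.
    """
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            row = tasks.get()
            if row is None:  # sentinel: no more work for this worker
                break
            value = handler(row)
            with lock:  # guard the shared results list
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for row in rows:  # rows are pulled from the iterator one at a time
        tasks.put(row)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


# A tiny in-memory CSV stands in for a large file on disk.
data = io.StringIO("id,amount\n1,10\n2,20\n3,30\n")
totals = process_rows(csv.DictReader(data), lambda row: int(row["amount"]) * 2)
print(sorted(totals))  # [20, 40, 60]
```

On current Python versions, concurrent.futures.ThreadPoolExecutor from the standard library covers the same ground with less code; the hand-rolled version is shown only because it makes the queue and the workers visible.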

Python Datastructures Backed by Redis

09/05/2010

Part of my day's experiment was to play with implementing Python data structures on top of Redis. Here we take a look at dictionaries and lists, but it should be straightforward to extend this idea to sets as well.

Storing Bounded Timeboxes in Redis

04/08/2011

If you're doing analytics, reports or dealing with memory constraints in Redis, you're probably dealing with keeping two sorted-sets mutually consistent. This article also takes a look at using multi/exec to keep it fresh.

Using PyFacebook without the Facebook middleware

I've been working on a Facebook application with a couple of friends recently. We decided to use the PyFacebook library, but there was a brief period of intense confusion on my part about how to use it without the included middleware. I worked through it, though, and this article has some advice on how you can do the same.

Tailing in Python

05/16/2010

A quick and pointless look at implementing tail in Python. Something of a koan.
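
The article's version is not reproduced here. One common way to implement tail, sketched below under a few assumptions, is to read the file backwards in fixed-size blocks until enough newlines have been seen. The function name tail, the block_size parameter and the sample file are made up for the example, and the file is assumed to be UTF-8 text.

```python
import os
import tempfile


def tail(path, n=10, block_size=1024):
    """Return the last n lines of the file at path as a list of strings."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        data = b""
        # Read backwards until there are more than n newlines (so the last
        # n lines are complete) or the start of the file is reached.
        while position > 0 and data.count(b"\n") <= n:
            step = min(block_size, position)
            position -= step
            f.seek(position)
            data = f.read(step) + data
    return [line.decode("utf-8") for line in data.splitlines()[-n:]]


# Write a small sample file, then read its last two lines. The tiny
# block_size forces several backward reads.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "example.log")
    with open(sample_path, "w") as f:
        f.write("one\ntwo\nthree\nfour\nfive\n")
    last_two = tail(sample_path, n=2, block_size=4)
    everything = tail(sample_path, n=10)
print(last_two)  # ['four', 'five']
```

Reading from the end keeps the cost proportional to the size of the requested lines rather than the size of the whole file.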

All Rights Reserved, Will Larson 2007 - 2014.