A Command Line Tool for Loading CouchDB Documents

December 9, 2008. Filed under python, couchdb

While working with CouchDB this weekend, I found myself wanting something more than creating content by hand in Futon, but less than writing a new custom script for each experiment: a script that would take a file or a directory of files, convert the contents into JSON, and upload them into CouchDB.

So, I wrote it.

Justifications

  1. Creating multi-line values in Futon is not particularly straightforward, and the standby solution of saving the value in another file and pasting it in is suboptimal.

  2. For experimentation, I was often deleting databases, and wanted a quick and simple way to load sample data and views.

  3. I wanted an easy way to add one-off data without writing throw-away code (the perfect domain for a command-line tool).

  4. Writing document data in Python gives me some niceties:

    1. good indentation and syntax highlighting for cleaner data,
    2. translation of data structures into JSON via simplejson, which validates the syntax of the Python data structure and ensures that the output JSON is valid.

  5. Command-line tools play together much better than scripts do.

File Formats

The script accepts four file formats:

  1. A Python dict in a file with extension .py.

    { '_id':'123',"title":"hi","etc":"blah" }
    
  2. A Python list of dicts in a file with extension .py.

    [ {'title':'a'}, {'title':'b'}]
    
  3. A JSON dict in a file with extension .js or .json. (Um. A simple example is actually identical in structure to the Python example.)

  4. A JSON array of dicts in a file with extension .js or .json. (Again, this will actually be identical with the Python list example.)
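
As a rough illustration of how files in these four formats could be read, here is a minimal sketch. It is not the code from doc_utils.py: the function name `load_documents` is hypothetical, it uses the standard library `json` module where the post mentions simplejson, and it assumes `.py` files are parsed with `ast.literal_eval`, so they may only contain plain literals.

```python
import ast
import json
import os


def load_documents(path):
    """Return the dict or list of dicts stored in a .py, .js, or .json file."""
    ext = os.path.splitext(path)[1].lower()
    with open(path, encoding="utf-8") as handle:
        text = handle.read().strip()
    if ext == ".py":
        # Assumption: literal_eval rather than eval, so only plain literals
        # (dicts, lists, strings, numbers, ...) are accepted.
        data = ast.literal_eval(text)
    elif ext in (".js", ".json"):
        data = json.loads(text)
    else:
        raise ValueError("unsupported extension: %r" % ext)
    # Accept a single document (dict) or a batch (list of dicts).
    if isinstance(data, dict):
        return data
    if isinstance(data, list) and all(isinstance(doc, dict) for doc in data):
        return data
    raise ValueError("expected a dict or a list of dicts in %s" % path)
```

Both failure modes (a syntax error in the file, or a value that is neither a dict nor a list of dicts) surface before anything is sent to CouchDB.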

The array formats are always sent via the bulk document API (each dict within the array is a document), and the dict formats are always sent via the PUT API for creating documents.
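
The split between the two APIs can be sketched as a small function that decides which request to make. This is an illustration rather than the script's actual code: `plan_request` and `doc_path` are hypothetical names, and generating a random id for a dict that has no `_id` is an assumption, since the post does not say what happens in that case. The `_bulk_docs` endpoint and its `{"docs": [...]}` body are part of CouchDB's HTTP API.

```python
import json
import uuid
from urllib.parse import quote


def doc_path(doc_id):
    """URL-encode a document id, keeping the literal '_design/' prefix."""
    if doc_id.startswith("_design/"):
        return "_design/" + quote(doc_id[len("_design/"):], safe="")
    return quote(doc_id, safe="")


def plan_request(base_url, database, data):
    """Return (method, url, body) for the contents of one loaded file."""
    db_url = "%s/%s" % (base_url.rstrip("/"), quote(database, safe=""))
    if isinstance(data, list):
        # Bulk document API: POST {"docs": [...]} to /<db>/_bulk_docs.
        return "POST", db_url + "/_bulk_docs", json.dumps({"docs": data})
    # Single document: PUT to /<db>/<id>.
    # Assumption: a random id is generated when the dict has no '_id'.
    doc_id = data.get("_id") or uuid.uuid4().hex
    return "PUT", "%s/%s" % (db_url, doc_path(doc_id)), json.dumps(data)
```

Returning the method, URL, and body instead of sending the request keeps the decision easy to check without a running CouchDB instance; any HTTP client can then perform the call with a `Content-Type: application/json` header.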

Usage

Usage is fairly straightforward.

python doc_utils.py database-name
python doc_utils.py db path/to/folder
python doc_utils.py db path/my_views.py
python doc_utils.py db --port=3432
python doc_utils.py db --hostname=example.com

Any combination of the above options works as well, so you can use it to load arbitrary paths to arbitrary instances of CouchDB.

Download

You can grab the source code from its GitHub repository, or download it as a zip.