Irrational Exuberance!

Implementing Threaded Comments in LifeFlow

January 4, 2008. Filed under djangolifeflow

Recently I decided that I wanted to add threaded comments to LifeFlow. This is mostly because the commenting system thus far has caused (readers) relatively more suffering than my conscience can cope with. And I hoped that adding threaded comments would somehow balance out this travesty I have inflicted on you dear readers. Probably just a lie I tell myself.

Regardless, I wanted to implement a robust comment threading system. I had two design goals in mind:

  1. As little extra data to store in the database as possible.
  2. Keep the computational complexity reasonable (it should be able to easily handle ~500 comments per post).

I thought up a couple of different ways to approach this. One way I didn't think to approach it was storing the hierarchical data in the database. I don't like the idea of storing the full path in the database. It also seems like their method would make rearranging comments rather complex (I readily admit that there aren't many situations that require arbitrary rearranging of comments. Unless you implement voting or have some other form of comment promotion based on merit).

What data do I need to store?

The only extra piece of data I decided to store is the id of the parent comment. (The other piece of data I need is what entry the comment is associated with, but I was already storing that.)

The challenge

Because I am not storing paths in the database, but am instead only storing each node's parent, my real challenge is recreating the structure of the comments using that not-perfectly-suited datum.

A simple, although imperfect answer

So here is my solution (in Django code):

def organize_comments(self):
    def build_relations(dict, comment=None):
        if comment is None: id = None
        else: id = comment.id
        try:
            children = dict[id]
            return [comment, [build_relations(dict, x) for x in children]]
        except:
            return comment
    dict = {None:[]}
    all = Comment.objects.select_related().filter(entry=self)
    for comment in all:
        if comment.parent: id = comment.parent.id
        else: id = None
        try:
            dict[id].append(comment)
        except KeyError:
            dict[id] = [comment]
    relations = build_relations(dict)
    if len(relations) == 1: return None
    else: return relations[1]

Basically I build the comment's structure in a hashmap (with key None representing a comment without a parent), and then use that hashmap to recursively build a list containing that structure.

The reason that this can be done fairly efficiently is that the select_related() method (in the Django SVN, not available in 0.96 or earlier) batch fetches related objects, so we will only be hitting the database once, despite all of our tomfoolery.

Its even tail-recursive, which would be meaninful if Python gave a damn about tail recursion. Eventually I'll have to convert the recursion into a loop of some sort.

So, LifeFlow now has support for threaded comments, at least at the model level. I still have to revamp the templates and views to expose the new functionality to users.

Update 1/6/08

Threaded comments are now 90% implemented. Everything is implemented except a visual guide to the threading. That is going to require some custom tagging. You can see the rather awkward implementation of the comment threading in the models.py file from LifeFlow. Its under the organize_comments method of the Entry model.

updated 1/7/08

Have the visuals implemented as well now. All in all its working pretty well. I ended up writing a quick custom tag that is used to throttle the depth of comments.

from django import template
register = template.Library()

def boundary(value, arg):
    """Defines a boundary for an integer. If the value of the integer
    is higher than the boundary, then the boundary is returned instead.

    Example:  {{ comment.depth|:"4" }} will return 4 if the value of
    comment.depth is 4 or higher, but will return 1, 2 or 3 if the
    value of comment.depth is 1, 2 or 3 respectively.
    """
    value = int(value)
    boundary = int(arg)
    if value > boundary:
        return boundary
    else:
        return value

register.filter('boundary', boundary)

I use this to throttle the comments to a depth of 5. Anything deeper will be displayed at the level five depth. This makes it much easier to provide the visual look for comments (don't have to do anything dynamic), and seems like a win to me. It also means that comments won't run far away into the right-margin sunset. Also a win.

You can take a look at the css or the template to get a more complete view of the implementation (and the model is still here). The template needs to be aligned a bit more consistently, but then again that template is for my custom remake of the LifeFlow templates, and I don't expect it to recieve a whole lot of use from others... but thats just an excuse.

Upcoming Changes

I will still need to revamp the comments to play nicely with Internet Explorerer (the submission part, that is), and also I am looking into invalidating the cached blog entry whenever a new comment is added, because its kind of frustrating for users to not see their comments until a few minutes have passed.