Don't Repeat Yourself for Bloggers: Dynamic Blog Context in LifeFlow

Published on January 9, 2008. Tags: blog, lifeflow, markdown

A great deal of LifeFlow's design is the result of three observations:

Markdown makes writing fun.
Links make blogs better.
Writing links sucks.

Out of these observations, Dynamic Blog Context, or DBC, was conceived.

DBC is essentially a battery of automatically generated references that you can use when writing your blog entries. You may find yourself writing a lot of throwaway references to link to a certain tag of yours, or to a certain project. You may get annoyed having to figure out the link to the previous blog entry, or remembering the link to the first entry in some blog series.

Dynamic Blog Context does its best to make things a little bit better.

Let's take a look at an example:

Please take a look [at my articles about Japan][tag japan].

The magic here (such as it is) is that this just works, instead of having to write it out like this:

[tag japan]: /tags/japan/
Please take a look [at my articles about Japan][tag japan].

It's a pleasant convenience, and it helps abstract your entries from the specific URLs and implementation details of your blog. Using DBC you could rearrange your blog and then just resave your entries (DBC is computed at save), and they would update to the new structure.

A number of additional DBC references are in the works. They are quite simple, and I am working on a way to let users contribute their own without breaking their SVN checkout. At the moment you can add more by editing lifeflow/markup/lifeflowmarkup.py, although that code will likely get cleaned up somehow in the near future. Please check the project page for current details, but here is the currently available DBC:

[tag slug] - creates a link to the tag with @slug
[comment pk] - creates a link to the comment with @pk (works with comments posted on any entry)
[project slug] - creates a link to the project with @slug
[author] - creates a link to the author's biography page if there is exactly one author; if there are more or fewer authors, it creates a link to the /author/ page
[comments] - creates a link to the comment division for the current entry
[projects] - creates a link to the /projects/ page
[series] - creates a link to the /articles/ page
[tags] - creates a link to the site's tag cloud

The others on the way (probably tonight):

[previous] - a link to the previous entry (if and only if one exists)
[next] - a link to the next entry (if and only if one exists)
[series number_of_entry] - a link to the nth (starting at 1, not zero) entry in the first series the entry belongs to (very few entries exist in multiple series, so this seems like the appropriate behavior in 99% of situations)
[series slug number_of_entry] - retrieve the nth article from series with @slug
[freq alias] - allows you to specify frequently used links and give them an alias. Basically a hashmap for your links.
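As a rough sketch of how the two planned series forms could be resolved (this is not LifeFlow code, and these contexts were not yet implemented when this was written): the function name `resolve_series`, the `all_series` mapping, and the entry URLs below are all hypothetical. The point is only to show the counting-from-1 rule and the fallback to the entry's first series.

```python
def resolve_series(ref_id, entry_series, all_series):
    """Resolve 'series n' or 'series slug n' to an entry URL, or None.

    entry_series: slugs of the series the current entry belongs to (hypothetical).
    all_series: mapping of series slug -> ordered list of entry URLs (hypothetical).
    """
    parts = ref_id.split()
    if not parts or parts[0] != "series":
        return None
    if len(parts) == 2:
        # No slug given: fall back to the first series of the current entry.
        if not entry_series:
            return None
        slug, number = entry_series[0], parts[1]
    elif len(parts) == 3:
        slug, number = parts[1], parts[2]
    else:
        return None
    try:
        index = int(number) - 1  # ids count from 1, Python lists from 0
    except ValueError:
        return None
    entries = all_series.get(slug, [])
    if 0 <= index < len(entries):
        return entries[index]
    return None


# Made-up data standing in for what would really come from the database.
all_series = {
    "japan-trip": ["/entry/arrival/", "/entry/kyoto/"],
    "django": ["/entry/models/"],
}
second_in_own_series = resolve_series("series 2", ["japan-trip"], all_series)
first_in_django = resolve_series("series django 1", ["japan-trip"], all_series)
out_of_range = resolve_series("series 0", ["japan-trip"], all_series)
```

An id that is out of range, or that names an unknown series, resolves to None, which matches the "if and only if one exists" behavior described for [previous] and [next].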

There are undoubtedly other places where Dynamic Blog Context would be appropriate, and I am open to suggestions about where those places may be. Let me know if you have any good ideas!

Implementation

(You may want to read through this article about extending Python-Markdown if the implementation details are a bit unintelligible. If you couldn't care less, go ahead to the ending thoughts section.)

The implementation for DBC is rather simple. I have added another preprocessor to Python-Markdown, whose primary function is run:

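def run(self, lines):
    def clean(match):
        # each findall result is indexed from the end; the last item is the link id
        return match[-1]
    # Markdown hands over a list of lines; rejoin them so the regex sees the whole entry
    text = u"\n".join(lines)
    refs = self.LIFEFLOW_RE.findall(text)
    cleaned = [clean(x) for x in refs]
    processed = [self.process_dynamic(x) for x in cleaned]
    # None means no new reference was built for that id
    dynamic_refs = [x for x in processed if x is not None]
    static_refs = self.build_static_references()
    # the generated reference definitions go on top of the original lines
    return static_refs + dynamic_refs + lines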

run is called by the Markdown library before it renders the Markdown into HTML. run receives one argument (other than the ubiquitous self): lines, a list of all the lines of text passed to the Markdown renderer.

Something like this is happening:

def markdown(self, text):
    lines = text.split(u"\n")
    # each preprocessor takes the list of lines and returns a (possibly longer) list
    lines = preprocessor.run(lines)

Anyway, I take those lines and put them back into a giant chunk (kind of redundant, but you have to walk in the lines if you want to play), and then scan through text looking for reference links. Basically anything that looks like:

[this is a description][and this is an id]

Once I find a reference link, I check whether I have already seen that id; if so, I ignore it. Otherwise I run its id through a few loops and check if it matches any of the patterns I know. If it does match, I create a new reference for it to refer to. After going through all the reference links, I prepend any new references I created to the top of the text and send it on its way to the other preprocessors (and eventually it gets turned into HTML).
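The steps above can be sketched as a small standalone function. This is not the LifeFlow code: the regular expression, `add_references`, and the `resolve_tag` callback are hypothetical names made up for illustration, and only the /tags/japan/ URL shape comes from the example earlier in this entry.

```python
import re

# Matches reference-style links such as [description][id].
# This pattern is an assumption for illustration, not LifeFlow's LIFEFLOW_RE.
REFERENCE_LINK_RE = re.compile(r"\[([^\]]+)\]\[([^\]]+)\]")


def add_references(lines, resolve):
    """Prepend a reference definition for every id that resolve() recognizes.

    resolve is a hypothetical callback: it takes an id string and returns
    a URL, or None when the id matches no known pattern.
    """
    text = "\n".join(lines)
    seen = set()
    new_refs = []
    for _description, ref_id in REFERENCE_LINK_RE.findall(text):
        if ref_id in seen:
            continue  # already handled this id, ignore repeats
        seen.add(ref_id)
        url = resolve(ref_id)
        if url is not None:
            new_refs.append("[%s]: %s" % (ref_id, url))
    # New definitions go at the top; the original lines follow untouched.
    return new_refs + lines


def resolve_tag(ref_id):
    """Toy resolver that only understands ids of the form 'tag <slug>'."""
    parts = ref_id.split()
    if len(parts) == 2 and parts[0] == "tag":
        return "/tags/%s/" % parts[1]
    return None


entry = [
    "Please take a look [at my articles about Japan][tag japan].",
    "More on [Japan][tag japan] and [something else][elsewhere].",
]
result = add_references(entry, resolve_tag)
```

Here `result` starts with a single generated definition for `tag japan`, followed by the two original lines. The second use of the same id is skipped, and the unknown id `elsewhere` is left for the author to define by hand.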

You may have noticed I got a bit fuzzy at the "I run its id through a few loops and..." I wanted to explain the overall process first, but now let me explain a little bit more about that part.

There are two types of references that DBC makes: (ironically) static and dynamic. Static context is anything that doesn't change. An example of this is the [tags] reference. It's always going to go to '/tags' (until the structure of the blog software gets changed, and then the string representing it will have to get changed as well). Static context is always added at the top of the text. It doesn't require any database accesses, and it doesn't hurt to have it (and it'll evaporate into the mist by the time the final HTML is rendered).
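Static context could be as simple as a fixed table that gets turned into reference definitions on every save. The sketch below is an assumption about the shape, not the real method body: the name `STATIC_CONTEXT` is hypothetical, and the paths are taken from the list of available DBC earlier in this entry (written with a trailing slash here).

```python
# Fixed ids and the URLs they point to. STATIC_CONTEXT is a hypothetical
# name; the paths follow the list of available contexts in this entry.
STATIC_CONTEXT = {
    "tags": "/tags/",
    "projects": "/projects/",
    "series": "/articles/",
}


def build_static_references(context=STATIC_CONTEXT):
    """Return one Markdown reference definition per static id, sorted by id."""
    return ["[%s]: %s" % (ref_id, url) for ref_id, url in sorted(context.items())]


static_refs = build_static_references()
```

Because no lookup is involved, emitting all of these on every entry costs next to nothing, and Markdown consumes reference definitions rather than rendering them, so unused ones leave no trace in the HTML.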

Dynamic context is handled differently. These are things that can't be represented by an unchanging string, but have to be calculated somehow. Many of those calculations involve hitting the database, so those references are only calculated if there is a reference link referring to them (it is also careful to build each reference only once and store it in case it is referred to multiple times; no need to be wasteful).
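A minimal sketch of the build-once idea, assuming a simple dictionary cache. The `DynamicContext` class, the `lookup_tag_url` helper, and the in-memory `KNOWN_TAGS` set (standing in for the database) are all hypothetical; only the method name `process_dynamic` and the /tags/slug/ URL shape come from this entry.

```python
# Stand-in for the database: slugs of tags that exist. Purely illustrative.
KNOWN_TAGS = {"japan", "python"}

lookups = []  # records every simulated database hit


def lookup_tag_url(slug):
    """Pretend database query: return the tag's URL, or None if missing."""
    lookups.append(slug)
    if slug in KNOWN_TAGS:
        return "/tags/%s/" % slug
    return None


class DynamicContext(object):
    """Builds dynamic references lazily and remembers the ones already built."""

    def __init__(self):
        self._cache = {}

    def process_dynamic(self, ref_id):
        if ref_id in self._cache:
            return self._cache[ref_id]  # built earlier, no second query
        parts = ref_id.split()
        reference = None
        if len(parts) == 2 and parts[0] == "tag":
            url = lookup_tag_url(parts[1])
            if url is not None:
                reference = "[%s]: %s" % (ref_id, url)
        # Cache misses too, so an unknown id is only looked up once.
        self._cache[ref_id] = reference
        return reference


context = DynamicContext()
first = context.process_dynamic("tag japan")
second = context.process_dynamic("tag japan")
missing = context.process_dynamic("tag atlantis")
```

After these three calls `lookups` holds just two slugs, because the repeated `tag japan` id is answered from the cache. In this sketch a repeated id returns the cached reference again, whereas the approach described above ignores ids it has already seen; either way the expensive lookup happens once.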

The current code is a bit ugly (a bunch of nested flow control, oh my), but you can take a look at it in SVN.

Ending Thoughts

In the end, I think this is a fairly elegant way to make it easier and quicker to write blog entries. It's bringing Don't Repeat Yourself to the blogging process, and I think that is a win.

The other aspect I like here is providing a layer of abstraction between your blog entries and the URLs and resources they refer to. Maybe you refer to the Wikipedia page about Archaic Felines a lot, and the Wikipedia Editors have a closed room meeting and rename the entry to Archaic-but-not-in-a-demeaning-sense-Felines. It's nice to only have to make one change in your frequently used list, instead of making changes (and forgetting to make changes) in all your entries.

In the same vein, some absent-minded implementor (i.e. me) might move the tags URL from /tags/ to /important-stuff/tags/. Dynamic Blog Context makes that move invisible.

I'd love to hear any thoughts (apparently comment posting actually works in IE 7 now, and maybe even 6, although I didn't do it on purpose).

Update: January 13th

Over the past few days this system has been refined and improved substantially. The biggest change is that it is now based on the SVN head of Python-Markdown instead of the rather old Python-Markdown 0.9. This means that a lot of previous issues have disappeared overnight. I did find, though, that the new Python-Markdown is less amenable to the sort of dynamic stretching that I am doing here (at least as far as integration goes; the bulk of the underlying code is the same).

There are a handful of newly implemented contexts as well.

[previous] - the previous entry
[next] - the next entry
[series nth] - the @nth entry in the series
[series slug nth] - the @nth entry in series with slug @slug
[file name] - the file (image or otherwise) with @name
[f name] - a shortcut for file