I've spent a bit of time with AuditTrail over the past day,
since I first discovered it, and I've been quite pleased with
it. However, my app makes a large number of changes, and
I was beginning to experience a bit of database bloat because
of the growing number of audits.
After a day of usage, one of my models had about 180 revisions,
and while each revision itself is small, it was pretty clear that
I wasn't going to be able to ignore the situation without causing
myself some serious headaches in the relatively near future.
(Recording only diffs is a nice advantage of something like
django-rcsfield here, since it can get by with much less space.)
Fortunately, depending on how you're using revisions, there is a
fairly simple solution to this dilemma: throw the excess revisions
away. I didn't want to perform extra database lookups every time
a new revision was created, so I decided that adding an
extension to manage.py, which I could run periodically from a
cron job, would be an adequate solution.
So I set up the skeleton for a management command:
At first I intended to go with a very specific set of rules
for picking the revisions to keep:
1. All revisions in the past hour,
2. The first revision older than one hour,
3. The first revision older than one day,
4. The first revision older than one week,
5. The first revision older than one month,
6. and so on...
But then I started actually writing that code, and my
enthusiasm for that approach swiftly dwindled. Instead
I decided I could accomplish roughly what I wanted much
more concisely by using a simple backoff to determine
the cutoffs for dates.
Depending on the type of backoff you use, you can control
the spacing of the revisions that get saved.
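As a rough illustration of the idea (a minimal sketch, not the code from this project), a multiplicative backoff can be written as a small generator. The name `multiplicative_backoff` and its `start` and `factor` parameters are made up for this example; the defaults of 60 seconds and a factor of ten are the values I ended up using.

```python
from itertools import islice


def multiplicative_backoff(start=60, factor=10):
    """Yield an endless series of cutoff ages in seconds.

    Hypothetical helper: each cutoff is `factor` times the previous one.
    """
    age = start
    while True:
        yield age
        age *= factor


# Take the first six cutoffs (in seconds) from the endless generator.
cutoffs = list(islice(multiplicative_backoff(), 6))
```

A larger `factor` spreads the saved revisions further apart, while a smaller one keeps more of them.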
You could also do an additive backoff, etc. For my
needs the multiplicative backoff worked well.
Starting from 60 seconds and multiplying by ten,
it follows this pattern: 1 minute, 10 minutes, roughly
1.7 hours, 17 hours, 7 days, 10 weeks, and so on.
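To make the selection rule concrete, here is a standalone sketch that works on plain timestamps rather than on AuditTrail's models. It is an assumption about how the rule could be applied, not the actual command: the function name `revisions_to_keep` and its parameters are hypothetical. It keeps every revision newer than the first cutoff, plus the newest revision older than each successive cutoff.

```python
from datetime import datetime, timedelta


def revisions_to_keep(timestamps, now, start=60, factor=10):
    """Return the timestamps worth keeping, newest first.

    Hypothetical helper: keeps everything younger than `start` seconds,
    then the first revision found beyond each backoff cutoff.
    """
    keep = []
    cutoff = start
    for ts in sorted(timestamps, reverse=True):  # newest first
        elapsed = (now - ts).total_seconds()
        if elapsed < start:
            # Recent revisions are always kept.
            keep.append(ts)
        elif elapsed >= cutoff:
            # First revision older than the current cutoff: keep it,
            # then advance the cutoff past this revision's age.
            keep.append(ts)
            while cutoff <= elapsed:
                cutoff *= factor
        # Anything else falls between two cutoffs and is discarded.
    return keep


now = datetime(2024, 1, 1, 12, 0, 0)
ages = [10, 30, 90, 120, 300, 700, 800, 7000]  # seconds before `now`
stamps = [now - timedelta(seconds=a) for a in ages]
kept = revisions_to_keep(stamps, now)
kept_ages = [int((now - ts).total_seconds()) for ts in kept]
# kept_ages == [10, 30, 90, 700, 7000]
```

In a real command the discarded revisions would be deleted from the audit table instead of just being left out of a list.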
Here is the implementation of the clean_audit_trails