How to Migrate Data Across Model Changes

July 1, 2007. Filed under django, python

When you create a Django application, you use models to represent your content, your data, your webpage's lifeforce. Unfortunately, the first set of models you make is wrong. So is your second set of models. Oh, and so are the models you are using right now. Sure, by this point they are pretty good, but they could be better. So... why aren't they?

My answer to that question has usually been "Because there is no way in hell I am going to transfer my data." Early on I went in and added an extra column to my database by hand, and made that column default to a reasonable value to avoid having unexpected nulls flying out of the database. And then I wrote some code to automatically calculate the correct value of that column each time the model was saved.

Please don't do that. Ever. It just isn't very bright.

You've heard of fixtures, right? The Django documentation mentions them here and there, casually assuming you know exactly what they are and how to use them. They are so easy to use, why bother documenting them? Okay, it's a beta project, I know, I know, and they really are super easy to use. You just need a push in the right direction, and this article is me shoving you for the benefit of all of us.

So you have a database with some data in it. Let's dump that data to disk:

python manage.py dumpdata myapp --format=xml > myappdata.xml

By default your data will be dumped as JSON. That is okay, but JSON gutted my Japanese-encoded data, so I decided that XML would be the provider of my future data dumps. If you are only using simple ASCII data then JSON is probably fine. Then again, XML is much easier to parse with BeautifulSoup, which is sometimes necessary when making complex model changes. So... let's just use XML.

Now we have the data saved. Go into your code and make some changes to your models. Hell, you might remove an entire model, or just add a CharField or BooleanField to an existing model. Anything's good.

Go ahead and reset your database.

python manage.py reset myapp

Now you'll simply load in that data you saved earlier.

python manage.py loaddata myappdata.xml

And that's it, your data has been recreated in your new tables. This technique becomes more and more difficult as the complexity of your changes increases, and you may eventually find yourself writing scripts to convert the old XML into a new form. My only words of encouragement are to write those scripts, and don't do it by hand. Doing it by hand seems easier at first, until you start making human errors and leaving timebombs throughout your data file.
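To give a feel for what one of those conversion scripts might look like, here is a minimal sketch. It assumes the dump uses Django's XML fixture layout (a django-objects root holding object elements, each with field children), and it uses the standard library's ElementTree rather than BeautifulSoup so that it has no dependencies. The model label myapp.entry, the field name is_public and its default of "True" are all hypothetical, made up for illustration; swap in whatever you actually changed.

```python
import sys
import xml.etree.ElementTree as ET


def add_field(root, model, name, field_type, default):
    """Add a <field> with a default value to every object of `model`.

    Returns the number of objects that were changed.
    """
    changed = 0
    for obj in root.iter("object"):
        if obj.get("model") != model:
            continue
        # Skip objects that already have the field, so reruns are harmless.
        if any(f.get("name") == name for f in obj.findall("field")):
            continue
        field = ET.SubElement(obj, "field", {"type": field_type, "name": name})
        field.text = default
        changed += 1
    return changed


def convert(old_path, new_path):
    """Read the old dump, patch it, and write the result to a new file."""
    tree = ET.parse(old_path)
    # Hypothetical change: myapp.entry gained a BooleanField called is_public.
    add_field(tree.getroot(), "myapp.entry", "is_public", "BooleanField", "True")
    tree.write(new_path, encoding="utf-8", xml_declaration=True)


# Usage: python add_field.py myappdata.xml myappdata_new.xml
if __name__ == "__main__" and len(sys.argv) == 3:
    convert(sys.argv[1], sys.argv[2])
```

Writing to a second file instead of overwriting the original means a bad conversion never costs you the dump you started with.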

Disclaimer: Do a trial run with this technique before using it on anything important.
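One way to make that trial run less of a leap of faith is to dump the data again after loading it and compare the two dumps. The sketch below only counts objects per model, so it will catch rows that silently went missing but not fields that changed; it assumes both files are XML dumps in Django's fixture layout, and the function names are my own invention.

```python
from collections import Counter
import sys
import xml.etree.ElementTree as ET


def count_objects(root):
    """Count <object> elements per model label in an XML fixture."""
    return Counter(obj.get("model") for obj in root.iter("object"))


def compare(before_root, after_root):
    """Return {model: (before, after)} for every model whose count differs."""
    before = count_objects(before_root)
    after = count_objects(after_root)
    return {
        model: (before[model], after[model])
        for model in sorted(set(before) | set(after))
        if before[model] != after[model]
    }


# Usage: python compare_dumps.py before.xml after.xml
if __name__ == "__main__" and len(sys.argv) == 3:
    diff = compare(
        ET.parse(sys.argv[1]).getroot(),
        ET.parse(sys.argv[2]).getroot(),
    )
    for model, (old, new) in diff.items():
        print(f"{model}: {old} before, {new} after")
    # A non-zero exit status signals that at least one count changed.
    sys.exit(1 if diff else 0)
```

If you deliberately removed a model, expect it to show up here with a count of zero afterwards; anything else in the output deserves a closer look.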

Good luck, and feel free to shoot me any questions if you are trying to make a more involved alteration to your database.