When you create a Django application, you use models to represent your content, your data, your webpage's lifeforce. Unfortunately, the first set of models you make is wrong. So is your second set of models. Oh, and so are the models you are using right now. Sure, by this point they are pretty good, but they could be better. So... why aren't they?
My answer to that question has usually been "Because there is no way in hell I am going to transfer my data." Early on I went in and added an extra column to my database by hand, and made that column default to a reasonable value to avoid having unexpected nulls flying out of the database. And then I wrote some code to automatically calculate the correct value of that column each time the model was saved.
Please don't do that. Ever. It just isn't very bright.
You've heard of fixtures, right? The Django documentation mentions them here and there, casually assuming you know exactly what they are and how to use them, because they are so easy to use, why bother documenting them? Okay, it's a beta project, I know, I know, and they are super easy to use. You just need a push in the right direction, and this article is me shoving you for the benefit of all of us.
So you have a database with some data in it. Let's dump that data to disk:
python manage.py dumpdata myapp --format=xml > myappdata.xml
By default your data will be dumped as JSON. Which is okay, but JSON gutted my Japanese-encoded data, so I decided that XML would be the provider of my future datadumps. If you are only using simple ASCII data then JSON is probably fine. Then again, XML is much easier to parse via BeautifulSoup, which is sometimes necessary when making complex model changes. So... let's just use XML.
Now we have the data saved. Go ahead and make some change to your models. Hell, you might remove an entire model, or just add a CharField or BooleanField to an existing model. Anything's good.
Go ahead and reset your database.
python manage.py reset myapp
Now you'll simply load in that data you saved earlier.
python manage.py loaddata myappdata.xml
And that's it: your data has been recreated in your new tables. This technique becomes more and more difficult as the complexity of your changes increases, and you may eventually find yourself writing scripts to convert the old XML into a new form. My only words of encouragement are to write those scripts, and don't do it by hand. Doing it by hand seems easier at first, until you start making human errors and thus leaving timebombs throughout your data file.
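As a rough idea of what one of those conversion scripts might look like, here is a minimal sketch that adds a new field, with a default value, to every object of one model in an XML fixture. It uses Python's standard xml.etree.ElementTree rather than BeautifulSoup, and the model name myapp.entry, the field name subtitle, and the helper add_field are all made up for illustration. Treat it as a starting point, not the one true way to do it.

```python
import xml.etree.ElementTree as ET


def add_field(fixture_xml, model, name, field_type, default):
    """Return fixture XML with a new <field> added to every object of `model`.

    Hypothetical helper: the names and defaults are illustrative assumptions.
    """
    root = ET.fromstring(fixture_xml)
    for obj in root.iter("object"):
        if obj.get("model") != model:
            continue
        # Skip objects that already have the field, so re-running is harmless.
        if any(f.get("name") == name for f in obj.findall("field")):
            continue
        field = ET.SubElement(obj, "field", {"type": field_type, "name": name})
        field.text = default
    return ET.tostring(root, encoding="unicode")


# A tiny stand-in for the file written by dumpdata --format=xml.
OLD = """<django-objects version="1.0">
  <object pk="1" model="myapp.entry">
    <field type="CharField" name="title">Hello</field>
  </object>
  <object pk="2" model="myapp.tag">
    <field type="CharField" name="name">django</field>
  </object>
</django-objects>"""

NEW = add_field(OLD, "myapp.entry", "subtitle", "CharField", "untitled")
print(NEW)
```

For a real dump you would read the fixture from disk, run it through a function like this, and write the result to a new file, keeping the original dump around in case the script gets something wrong.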
Disclaimer: Do a trial run with this technique before using it on anything important.
Good luck, and feel free to shoot me any questions if you are trying to make a more robust alteration to your database.
Will, I discovered your site today: the superb django-facebook tutorial, lifeflow ... all of it amazingly clean, and I like your writing style. Rest assured that I have finally decided to use your lifeflow for my personal sites :-) Thanks a lot for it.
On migrations: I had exactly this idea of how to do migrations as easily as possible, at least during ongoing development on a test database, where there isn't a lot of data and dumping/reloading into SQLite should be very fast. It is interesting that you described so simply and briefly how to do it using existing features of Django, because I haven't seen it explained this simply anywhere else on the net... :-)
I fell so much in love with Django because, many years ago, I developed an application using the Czech DOS relational database system "PC-FAND" (it died with DOS, grrr), which, ahead of its time, introduced things VERY similar to MVC and ORM (really :-), and I developed a pathological dependency on this development approach. In this tool it was possible to "change the model" and the tool immediately restructured the real database data according to the change, possibly with a warning about losing some data ...
Since SQLite3 today does not allow such easy modifications in SQL (ALTER TABLE on fields, as far as I know), the best approach really is to DUMP all the data to XML, recreate everything from scratch with the modifications, and then load it back again.
Once a project and its apps are designed and implemented, there is less need for such flexibility, but things like django-evolution are here too...
... a long post, so it's obvious that I desperately need my own blog :-)) Thanks again for LIFEFLOW.
Thanks for all the kind words, Petr.
I think most people don't think about migrating this way because it can require a bit of wrangling by hand. Also, it requires a little foresight about how the different databases deal with data. Like how SQLite will willingly eat some data that PostgreSQL will reject. Really though, it's how I migrate all my data changes.
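One concrete case of SQLite eating data, offered as an illustration rather than the only difference that matters: SQLite does not enforce the declared length of a VARCHAR column, so a value that is too long gets stored in full. The table and column names below are made up for the example.

```python
import sqlite3

# In-memory database, so nothing on disk is touched.
conn = sqlite3.connect(":memory:")

# Declare a column that is nominally limited to 5 characters.
conn.execute("CREATE TABLE entry (title VARCHAR(5))")

# SQLite accepts a much longer value without complaint.
conn.execute("INSERT INTO entry (title) VALUES (?)", ("far too long",))

stored = conn.execute("SELECT title FROM entry").fetchone()[0]
print(repr(stored), len(stored))
conn.close()
```

PostgreSQL raises an error on the same insert into a varchar(5) column, so a fixture that loads happily into a SQLite development database can still fail on a PostgreSQL server.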
My usual system is dumping it from the source (like the server running lethain.com), then reloading it and testing the model change on a development deployment and getting the data updated for it, then dumping that, and reloading the fixed set onto the production server. This means that I get to screw up my development server, which is okay, instead of my production server, which is... less good.
Anyway, take care, and let me know if you need any help with lifeflow.