.. _datamig:

==================
Database Migration
==================

A Python dump of a database is not only a :ref:`backup`, it is also
the base of every database migration à la Lino.

Upgrading to a newer version of the :ref:`application
<application>` you are running, or changing certain parameters
in your :xfile:`settings.py`, might require **changes in the
database structure**.  This is called database migration.

- Before upgrading or applying configuration changes, 
  create a :ref:`backup`.
  
- After upgrading or applying configuration changes, 
  restore your database from that backup.
  The :xfile:`restore.py` script will automatically detect version changes and 
  apply any necessary changes to your data.
  
For example, here is an upgrade with data migration of a :ref:`voga`
site::
  
  $ python manage.py dump2py 20130827
  $ pip install -U lino_voga
  $ python manage.py run 20130827/restore.py 
  
It is of course recommended to stop any other processes 
which might access your database during the whole procedure.
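
The three commands above could also be scripted. The following is only
a sketch and not part of Lino: the names `migration_steps` and
`run_migration` are hypothetical, and it assumes that the `python` and
`pip` found on your `PATH` belong to the environment of your site.
Because `subprocess.check_call` raises an exception when a command
fails, the restore is never attempted after a failed dump or upgrade::

    import subprocess


    def migration_steps(snapshot, package):
        """Return the dump, upgrade and restore commands, in that order."""
        return [
            ["python", "manage.py", "dump2py", snapshot],
            ["pip", "install", "-U", package],
            ["python", "manage.py", "run", f"{snapshot}/restore.py"],
        ]


    def run_migration(snapshot, package, runner=subprocess.check_call):
        """Run each step; the default runner raises on the first failure."""
        for command in migration_steps(snapshot, package):
            runner(command)

Calling `run_migration("20130827", "lino_voga")` from the project
directory would then issue the same commands as the example above.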


.. _ddt:

Double Dump Test (DDT)
----------------------

A `Double Dump Test` is a method to test for possible problems
e.g. after a :ref:`datamig`: we make a first dump of the database to a
directory `a`, then we restore that dump to the database, then
we make a second dump to a directory `b`.  Finally we launch
`diff a b` to verify that both dumps are identical.

Background:

When :xfile:`restore.py` has terminated without any warnings
or error messages, chances are good that your database has
been migrated successfully.

But here is one more automated test that you may run when everything
seems okay: a :ref:`ddt`.

This consists of the following steps:

- Make another dump of the freshly migrated database
  to a directory `a`.
- Restore this dump to the database.
- Make one more dump, this time to a directory `b`.
- Compare the directories :file:`a` and :file:`b`.

In other words::  
  
  $ python manage.py dump2py a
  $ python manage.py run a/restore.py 
  $ python manage.py dump2py b
  $ diff a b
 
If there's no difference between the two dumps, then the test succeeded!
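
Where `diff` is not available, the comparison step can be approximated
in Python. This is a minimal sketch under the assumption that each dump
is a flat directory of files; the function name `differing_files` is
hypothetical and not part of Lino. It compares file contents byte by
byte and also reports files that exist in only one of the two dumps::

    import filecmp
    from pathlib import Path


    def differing_files(dir_a, dir_b):
        """Return sorted names of files that differ or exist in one dump only."""
        names_a = {p.name for p in Path(dir_a).iterdir() if p.is_file()}
        names_b = {p.name for p in Path(dir_b).iterdir() if p.is_file()}
        common = sorted(names_a & names_b)
        # shallow=False forces a comparison of the actual file contents.
        _, mismatch, errors = filecmp.cmpfiles(
            dir_a, dir_b, common, shallow=False)
        return sorted((names_a ^ names_b) | set(mismatch) | set(errors))

An empty list returned by `differing_files("a", "b")` corresponds to
`diff` printing nothing.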
  
  
Designing data migrations for your application
----------------------------------------------

Designing data migrations for your application
is easy but not yet well documented.

The main trick is that any :xfile:`restore.py` file generated by
:manage:`dump2py` contains the following line::

    settings.SITE.install_migrations(globals())

This means that the script itself calls
the :func:`install_migrations <lino.utils.dpy.install_migrations>`
method of your application *before* it starts to load
any database object.
It also passes its own `globals()` dict, which means
that your migration code can potentially change everything.
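
To illustrate why receiving the `globals()` dict is so powerful, here
is a self-contained toy example. It does not use Lino at all: the
function `create_contacts_person`, the field names and the plain dicts
standing in for database objects are all assumptions made up for this
illustration. The idea is that a migration hook may replace a function
in the namespace of the script, so that rows written in the old format
end up in the new structure::

    def create_contacts_person(id, first_name, last_name):
        """Stand-in for a function of the dump, written for the old structure."""
        return {"id": id, "first_name": first_name, "last_name": last_name}


    def install_migrations(globals_dict):
        """Hypothetical hook: replace the old function by an adapted one."""

        def create_contacts_person(id, first_name, last_name):
            # The new structure has a single `name` field instead of two.
            return {"id": id, "name": f"{first_name} {last_name}"}

        globals_dict["create_contacts_person"] = create_contacts_person


    # Simulate the namespace of a restore script.
    namespace = {"create_contacts_person": create_contacts_person}
    install_migrations(namespace)

    # The data rows of the old dump still pass the old arguments.
    person = namespace["create_contacts_person"](1, "Ada", "Lovelace")

After the hook has run, `person` has the new structure although the
row was written for the old one.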

To see real-life examples, look at the source code of
:mod:`lino_welfare.migrate`
and
:mod:`lino_welfare.old_migrate`.

A magical `before_dumpy_save` attribute may contain custom
code to apply inside the try...except block around the save operation.
If that code fails, the deserializer will simply
defer the save operation and try it again.
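
The principle of deferring failed save operations can be sketched as
follows. This is not the code of the Lino deserializer: the names
`save_with_deferral` and `try_save` are hypothetical, and the sketch
only assumes that a save which fails now may succeed later, once the
objects it depends on have been saved. It gives up when a complete
pass saves nothing, because further passes could not succeed either::

    def save_with_deferral(objects, try_save):
        """Save all objects, retrying failed ones until no progress is made."""
        pending = list(objects)
        while pending:
            deferred = []
            for obj in pending:
                try:
                    try_save(obj)
                except Exception:
                    deferred.append(obj)  # try again in the next pass
            if len(deferred) == len(pending):
                raise RuntimeError(
                    f"{len(deferred)} object(s) could not be saved")
            pending = deferred


    saved = []


    def try_save(obj):
        """Toy save: fails while the object it depends on is not saved."""
        name, needs = obj
        if needs is not None and needs not in saved:
            raise ValueError(f"{name} needs {needs}")
        saved.append(name)


    save_with_deferral([("child", "parent"), ("parent", None)], try_save)

In this example the child is deferred in the first pass and saved in
the second one, after its parent.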

  
Models that get special handling
--------------------------------

- `ContentType` objects aren't stored in a dump because they
  can always be recreated.
- `Site` and `Permission` objects *must* be stored and *must not* be
  re-created.
- `Session` objects are not stored in a dump: losing them
  is acceptable.



Note about `django-extensions <https://github.com/django-extensions>`_ 
----------------------------------------------------------------------

`django-extensions <https://github.com/django-extensions>`_
has a comparable command, "dumpscript".
Differences:

- dumpy produces fixtures to be restored with loaddata, while
  dumpscript produces a simple Python script to be restored with runscript.
- The fixtures generated by dumpy are designed to make it possible to
  write automated data migrations.