Updating database migrations with data

I’ve been using Migratus for my database migrations for a few years, and have been happy with it. My database files are divided into “init.sql” and various “populate*.sql” types. The problem I run into with live (though only max of 100 inserts per week) databases, though, is that as this project proceeds I find I need to change one of the schemas defined in init.sql, and this may in turn break some of the populate sql files. Now, the manual process is as follows:

  1. Take a dump of the live database tables and replace the “populate” files with the new dumps of those respective tables.
  2. Implement my changes, massaging the data into correct form to respect those changes
  3. Perform the repaired migration on the live database (hoping, fairly reasonably in this particular case, that nothing new has been added since I took the dump).

Obviously this is a messy, tedious process and there is plenty of room for error. It also seems like something that should be a solved problem. What are your solutions for the scenario of updating your migrations with respect to additional data? And, ultimately, is there a solution that automates this update process?

Edit: looking at the docs for migratus, I get the feeling I might have misunderstood your question. In that case, I apologize.

I usually prefer to treat database migrations as immutable, i.e. when I need to make a change in the database schema, I create a new migration file that applies the schema changes as well as massaging existing data into the new format.

That way the initial database population code is also immutable and can be run on a new database, and migrating the database will update the shape of the data as well.

I have not used migratus, so I am not certain how my response fits into your mental model of migrations. My guess is it should be reasonably close to ActiveRecord migrations.

1 Like

My opinion: after you wrote a migration and shared it with others, you should in general not change it (unless you are the only one working on the code and database, or have agreed with all developers to start from scratch, there are exceptions).

migration 1: add table
migration 2: populate table
migration 3: alter table (e.g. add column) + update the populated data from migration 2.

1 Like

Good point; Migratus supports this concept. I guess the part of me that hadn’t considered it was just thinking, “inefficient something something something”, but in reality, this makes a lot of sense. Of course, it still means I need to manually take a shot of the latest contents of the database before I can start massaging. I was hoping there might be a way to automate that process (e.g. define automatic updates for my migration files).

As it happens for this project, both of your “unless” apply to me here. The three steps you mention are essentially the steps I take each time, except I haven’t tried adding a different table and then swapping it in (if I understand your steps).

If you apply your migrations to a development database instead of the production database, you should not need to dump anything before running a migration. The process could go something like this:

CREATE TABLE copy_table (LIKE table_to_change);
INSERT INTO copy_table SELECT * FROM table_to_change; -- this is your dump
ALTER TABLE table_to_change... ; -- make any changes to the original table
UPDATE table_to_change SET column_replacing_dropped_column = copy_table.replaced_column + any changes; -- and re-insert based on what was in the table before you changed it

It looks like migratus wraps migrations in a transaction, so doing it this way means that you do not risk changes to the original table while your migration is running.

1 Like

Excellent ideas! I think I’m seeing how to do this. I think I can adjust your ideas effectively to include steps for removing my near-duplicates and inserting my new near-duplicate-preventing constraint. That’s perfect. Thanks!

1 Like

You are welcome. :slight_smile:

I just wanted to clarify that the above statement just means that you should not run your migrations against your production database while developing them. When you have made certain a migration does as intended on a database copy, it is ready to be run against the production database.

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.