Monday, 4 August 2014

Using Models in Migrations

In the primate social experiment of monkeys and bananas (best link I could find), I would be one of the monkeys who got hosed.

When told not to use models in migrations, I went ahead and did so anyway, not understanding why, and got hosed. We can learn from our mistakes not to do something again, but learning why is even more powerful. And learning why we shouldn’t use models in migrations also teaches us how to use models in migrations safely.

An Example

Firstly, let’s consider an example where we might want to use a model in a migration. Suppose we have a Task model with a boolean flag “is_active”, which can only represent two states, and we want to migrate it to a new column “status” that can hold the values “inactive”, “active”, and “pending”. After writing a migration to add the status column, we might write another migration like this:

class UpdateTaskStatus < ActiveRecord::Migration
  def up
    Task.where(:is_active => false).update_all(:status => "inactive")
    Task.where(:is_active => true).update_all(:status => "active")
  end

  def down
    # no action required
  end
end

Run bin/rake db:migrate and our status fields are populated.

So what can go wrong?

Lots. One of the easiest examples to recognise is when the Task model is refactored. For example, it might be renamed to Job. The next developer who joins the team, or anyone who resets their database, then runs a migration that refers to the model Task, which no longer exists.

Other, more subtle conflicts can arise when behaviour added to the model class depends on later migrations. For example, adding the paranoia gem to a project and adding its deleted_at column to a model will break earlier migrations that refer to the model, because they run before that column exists in the database.

Changes to default scopes, validations and relationships can all break a migration, because the migration may run against a schema that does not yet have the fields those changes depend on.

There is another caveat: when a model is loaded, it inspects the database to discover which columns exist for that model. If multiple migrations refer to the same model, the model class and its column information will be loaded by the first migration using that model. Later migrations using the same model may not see columns that were added by migrations in between. (Yes, you can reset the column information on a model with reset_column_information as a workaround.)
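The caching caveat can be illustrated without a database. The sketch below is a plain-Ruby stand-in, not ActiveRecord itself: SCHEMA and FakeModel are hypothetical names invented for this illustration, and the only thing borrowed from Rails is the method name reset_column_information.

```ruby
# Plain-Ruby stand-in, NOT ActiveRecord. SCHEMA plays the role of the
# database and FakeModel the role of ActiveRecord::Base (both hypothetical).
SCHEMA = { "tasks" => ["id", "is_active"] }

class FakeModel
  class << self
    attr_accessor :table_name

    # Column names are read from the "database" once and then memoised.
    def column_names
      @column_names ||= SCHEMA.fetch(table_name).dup
    end

    # Forget the memoised columns so the next call re-reads the schema.
    def reset_column_information
      @column_names = nil
    end
  end
end

class Task < FakeModel
  self.table_name = "tasks"
end

Task.column_names              # an early migration loads the column list
SCHEMA["tasks"] << "status"    # a later migration adds a column
stale = Task.column_names      # still the cached list, without "status"
Task.reset_column_information  # the workaround
fresh = Task.column_names      # re-read, now includes "status"
```

In a real migration the equivalent step would be calling Task.reset_column_information after the column is added and before the model is used.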

So is there a way to use a model in the migration safely?

Yes.

Create a new model in the migration itself. This model will be a minimal implementation of only what is required for the migration to execute. No change made to the application’s model (including renaming or deleting it) will affect this new model, which is used only in the migration.

Example 1. Creating a new Task model inside the migration:

References to Task within the migration will be to UpdateTaskStatus::Task and not ::Task. This Task model is contained safely within the migration and free from being broken by other changes to the Task model in the application.

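class UpdateTaskStatus < ActiveRecord::Migration
  # Minimal migration-local model: no scopes, validations or associations,
  # only the mapping to the tasks table.
  class Task < ActiveRecord::Base
  end

  def up
    # Task here resolves to UpdateTaskStatus::Task, not ::Task.
    Task.where(:is_active => false).update_all(:status => "inactive")
    Task.where(:is_active => true).update_all(:status => "active")
  end
end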
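Why does Task inside the migration pick up the nested class rather than the application’s model? It comes down to how Ruby resolves constants. The sketch below shows this with plain Ruby classes standing in for the model and the migration; the origin and app_model methods are hypothetical helpers added only to make the result visible.

```ruby
# Stand-in for the application model (::Task). In a real app this would
# inherit from ActiveRecord::Base.
class Task
  def self.origin
    "application model"
  end
end

# Stand-in for the migration class.
class UpdateTaskStatus
  # Defined inside the migration, so its full name is UpdateTaskStatus::Task.
  class Task
    def self.origin
      "migration-local model"
    end
  end

  def up
    # Ruby searches the enclosing lexical scope first, so this finds
    # UpdateTaskStatus::Task before it would ever reach ::Task.
    Task.origin
  end

  def app_model
    # The leading :: forces lookup from the top level.
    ::Task.origin
  end
end

migration = UpdateTaskStatus.new
migration.up         # the nested class answers
migration.app_model  # the top-level class answers
```

The two classes are entirely separate objects, which is why renaming or changing ::Task leaves the nested one untouched.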

Example 2. Creating a new anonymous model inside the migration:

A new subclass of ActiveRecord::Base is created without a name and pointed at the table of interest. Being completely anonymous, no changes to any models in the application can affect this model.

class UpdateTaskStatus < ActiveRecord::Migration
  def up
    model = Class.new(ActiveRecord::Base)
    model.table_name = "tasks"
    model.where(:is_active => false).update_all(:status => "inactive")
    model.where(:is_active => true).update_all(:status => "active")
  end
end
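The anonymous model needs its table name set by hand because Rails normally derives the table name from the class name, and an anonymous class has none. The sketch below mimics that idea in plain Ruby. FakeBase is a hypothetical stand-in, not ActiveRecord::Base, and its pluralisation is deliberately crude.

```ruby
# Plain-Ruby stand-in for ActiveRecord::Base (hypothetical, illustration only).
class FakeBase
  class << self
    attr_writer :table_name

    # Use the explicit table name if one was set, otherwise derive it
    # from the class name.
    def table_name
      @table_name || derive_table_name
    end

    private

    def derive_table_name
      raise ArgumentError, "anonymous class has no name to derive a table from" if name.nil?
      name.downcase + "s" # crude pluralisation, good enough for the sketch
    end
  end
end

# A named class can have its table inferred from its name.
class Task < FakeBase; end
named_table = Task.table_name

# An anonymous class has no name, so the table must be given explicitly.
model = Class.new(FakeBase)
model.table_name = "tasks"
anonymous_table = model.table_name
```

Both end up pointing at the same table, but only the anonymous one is immune to a rename of the application’s Task class.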
