My Problems with Flyway

Last week, I released a new version of the application I’m working on. But some of my services were unable to start. Flyway was unable to update my database.

I faced this problem many times. And I’ve created a simple framework to easily solve this situation.

The Context

With each of my Microservices, I add Flyway. This way, before running the service, it upgrades the database schema.

I choose to integrate Flyway inside each service instead of running Flyway from my CI/CD. Because this way, I don’t need to open the database connection to my CI/CD pipeline.

So, if the Flyway migration fails, the service won’t start.

flyway migrate
rc=$?; if [[ $rc != 0 ]]; then exit $rc; fi
mvn spring-boot:run

Even if the service starts, it won’t run as the database is in an older version.

This occurred to me too many times for many different reasons.

The Problems

I saw many reasons the Flyway migration fails:

  • Due to a foreign key which is not present in the staging environments;
  • Due to an intensive workload in production which limits the locks;
  • Or some other reason

So, I decided to prepare a procedure to mitigate this situation.

Make sure it’s from Flyway

First of all, I must make sure it’s Flyway which fails to run.

I must go check the logs and see where the error occurs.

Flyway prints different log lines depending on the cause. But knowing it’s Flyway is the first step.

Sometimes, Flyway remains locked and the application doesn’t start. This means that the execution path is blocked between Flyway and my application.

Once I get the Flyway error message, let’s examine it to determine the problem.

Checksum Problem

This error indicates that somebody updated the changeset 001.101. We can never change the content of a changeset which was run.

A changeset is like a Git commit, it’s immutable, once run, don’t touch it!

ERROR: Validate failed: Migrations have failed validation
Migration checksum mismatch for migration version 001.101
-> Applied to database : 1884708740
-> Resolved locally    : -1560729926. Either revert the changes to the migration, or run repair to update the schema history.

Missing Migration

Here, I can see that the database has the changeset 001.130 applied, but the file itself doesn’t exist in my project.

ERROR: Validate failed: Migrations have failed validation
Detected applied migration not resolved locally: 001.130. If you removed this migration intentionally, run repair to mark the migration as deleted.

Error on Query

In this error message, the changeset 001.132 failed to run for some unknown reason (maybe a SQL syntax error).

ERROR: Migration V001.132__drop.sql failed
-------------------------------------
SQL State  : 42P01
Error Code : 0
Message    : ERROR: table "some_table" does not exist
Location   : V001.132__drop.sql
Line       : 1
Statement  : drop table some_table

Unknown Migration

And this one indicates that I have an old changeset, 001.123 which wasn’t applied to my database. The last changeset applied to my database is newer than 001.123.

ERROR: Validate failed: Migrations have failed validation
Detected resolved migration not applied to database: 001.123. To ignore this migration, set -ignoreIgnoredMigrations=true. To allow executing this migration, set -outOfOrder=true.

The Solution

Now, depending on the problem, I can apply different solutions.

Stop the Production

Flyway may generate deadlocks. As the production is trying to update some rows, but Flyways is trying to update the entire table.

To solve this, a solution may be to stop the production server. This will allow Flyway to perform the migration, and continue.

But sometimes this solution is not acceptable.

Rollback a Single Changeset

If you are like me and don’t want to pay the premium license of Flyway, you don’t have the rollback feature.

What I’ve done, is create a separate folder, rollbacks, with all the rollback queries named as the changesets.

Even if it’s not with Flyway, I can easily get the SQL query to rollback a feature.

As I have the rollback queries, I can easily find them and apply them. I must also make sure to rollback the Flyway row.

DELETE FROM flyway_schema_history WHERE version = '001.123';

Update the Checksum

If the problem is the checksum, and the correct version is the new one, just update the row.

UPDATE flyway_schema_history SET checksum = -1560729926 WHERE checksum = 1884708740 AND version = '001.101';

On the other side, if the correct checksum is the one stored in the database, I must rollback the changes done in the changeset file.

This implies looking in the history of the changeset file for the changes done and rollback them.

Change the Order

If an old file appear in the Flyway migration changesets, I may want to applied it even if it’s out of order.

To do so, I need to add the following option:

flyway migrate -outOfOrder=true

But this must be taken with caution. As an old file may run a query on an outdated schema. I mean, the old changeset may update a table which does not exist anymore.

Rollback Everything

I hope this won’t happen.

This means rollback the application and all the migrations associated with the current version of the application.

Once rollbacked the application, I must identify all the applied changesets and rollback them individually.

To rollback individually the rollbacks, I can follow the previous point. Run the rollback query and delete the row on the Flyway history table.

Conclusion

I’ve added a new step in my CI/CD pipeline to test the Flyway migration, the rollback and the migration again.

flyway migrate
ROLLBACK_FILES=$(ls -r flyway/rollback/*)
for f in $ROLLBACK_FILES; do; psql -f $f; done
flyway migrate

This ensures that both the migrations and rollbacks are correct.

Then, the rollout should go fine.

My New ebook, How to Master Git With 20 Commands, is available now.

Leave a comment

A WordPress.com Website.