Why You Should Use Database Transactions

Database transactions are an important part of keeping your database consistent. It seems that a lot of developers haven’t been exposed to transactions or choose not to use them, which is unfortunate because it’s easier for your data to get corrupted than you might think.

Take this, for example: you have a database with two tables, owner and pet, where each owner has zero or more pets. Below is an implementation in MySQL. If you have MySQL installed, you should be able to log into MySQL by doing the following:

mysql -u youruser -p

You will be prompted for a password and then you’ll be in the MySQL CLI.

First, let’s create the database.

mysql> create database example;
Query OK, 1 row affected (0.00 sec)

Then tell MySQL you want to be using the database called example.

mysql> use example
Database changed

Create the owner table.

mysql> create table owner (
owner_id int unsigned auto_increment not null,
primary key (owner_id),
name varchar(30) not null
) engine=innodb;
Query OK, 0 rows affected (0.11 sec)

Create the pet table.

mysql> create table pet (
pet_id int unsigned auto_increment not null,
primary key (pet_id),
owner_id int unsigned not null,
foreign key (owner_id) references owner (owner_id),
name varchar(30) not null
) engine=innodb;
Query OK, 0 rows affected (0.02 sec)
mysql> show tables;
+-------------------+
| Tables_in_example |
+-------------------+
| owner             |
| pet               |
+-------------------+
2 rows in set (0.00 sec)

Now you can see that the two tables have been successfully added. Let’s add some data to the database:

mysql> insert into owner (name) values ('Alice'), ('Bob');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0
mysql> select * from owner;
+----------+-------+
| owner_id | name  |
+----------+-------+
|        1 | Alice |
|        2 | Bob   |
+----------+-------+
2 rows in set (0.00 sec)
mysql> insert into pet (owner_id, name) values
(1, 'Fido'), (2, 'Scruffy'), (2, 'Hoppy the Frog');
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

If we select owners and pets from our database, we should see that Alice has a pet named Fido and that Bob has two pets: Hoppy the Frog and Scruffy.

mysql> select o.name owner_name, p.name pet_name
from owner o
join pet p on o.owner_id = p.owner_id
order by o.name;
+------------+----------------+
| owner_name | pet_name       |
+------------+----------------+
| Alice      | Fido           |
| Bob        | Scruffy        |
| Bob        | Hoppy the Frog |
+------------+----------------+
3 rows in set (0.00 sec)

There they are. Below, we’ll be going through some examples that delete records. After each example, we could just manually re-insert the records we deleted, but that would get old fast. Instead, let’s back up our database using mysqldump. Log out of the MySQL CLI (or open a new terminal) and type this:

mysqldump example -u youruser -p > example_backup.sql

We’ll be able to use example_backup.sql at anytime in the future to restore our database.

Now let’s say Alice is no longer a customer of ours and we want to remove her from our database. This will be a two-part operation:

  1. Delete Alice’s pet(s) from the pet table
  2. Delete Alice from the owner table

Note carefully that the order of these two operations matters. We can’t delete Alice from the owner table before we delete her pets because pet.owner_id has a foreign key constraint that points to owner.owner_id. If you try to do it that way, you’ll get an error.

Let’s delete Alice’s pets and then delete Alice:

mysql> delete from pet where owner_id = 1;
Query OK, 1 row affected (0.00 sec)

mysql> delete from owner where owner_id = 1;
Query OK, 1 row affected (0.01 sec)

If we repeat our query from earlier, we should see that Alice is gone and Bob is still there.

mysql> select o.name owner_name,
    -> p.name pet_name
    -> from owner o
    -> join pet p on o.owner_id = p.owner_id
    -> order by o.name;
+------------+----------------+
| owner_name | pet_name       |
+------------+----------------+
| Bob        | Scruffy        |
| Bob        | Hoppy the Frog |
+------------+----------------+
2 rows in set (0.00 sec)

That looks right, but let’s query each table individually to make sure:

mysql> select * from owner;
+----------+------+
| owner_id | name |
+----------+------+
|        2 | Bob  |
+----------+------+
1 row in set (0.00 sec)

mysql> select * from pet;
+--------+----------+----------------+
| pet_id | owner_id | name           |
+--------+----------+----------------+
|      2 |        2 | Scruffy        |
|      3 |        2 | Hoppy the Frog |
+--------+----------+----------------+
2 rows in set (0.00 sec)

Good, that’s what we expected. So if everything worked just fine without using any transactions, what’s the problem? Well, usually there is no problem, but it’s all too easy for a condition to arise where there will be a problem. I’ll explain in a moment, but first let’s restore our database:

mysql example -u youruser -p < example_backup.sql

Now that we have our original database back, let's imagine that your company's DBA decides that the owner table is poorly named and it should really be called user:

mysql> alter table owner rename user;
Query OK, 0 rows affected (0.00 sec)

mysql> show tables;
+-------------------+
| Tables_in_example |
+-------------------+
| pet               |
| user              |
+-------------------+
2 rows in set (0.00 sec)

The DBA informs the developers of this, but one of the developers at your company is lazy and doesn't like to read e-mails, so he never pays attention to the change, so he never updates his code. (Actually, this DBA could be you, and the person who forgets to update his code could very likely also be YOU! I just made them different people to show how easily this can happen.)

Now what happens when your deletion code runs?

mysql> delete from pet where owner_id = 1;
Query OK, 1 row affected (0.01 sec)

mysql> delete from owner where owner_id = 1;
ERROR 1146 (42S02): Table 'example.owner' doesn't exist

The pets got deleted just fine, but when we try to delete Alice, we get an error because the owner table no longer exists. What do we have in our database now?

mysql> select o.name owner_name, p.name pet_name
from user o left join pet p on o.owner_id = p.owner_id
order by o.name;
+------------+----------------+
| owner_name | pet_name       |
+------------+----------------+
| Alice      | NULL           |
| Bob        | Scruffy        |
| Bob        | Hoppy the Frog |
+------------+----------------+
3 rows in set (0.00 sec)

(Note that I changed the JOIN to a LEFT JOIN so that we'll see all users in the database, regardless of whether or not they have pets.)

As you can see, Alice is still there but her pets are gone. Is that what we wanted to do? Certainly not. Our database is now in an inconsistent state and we can't trust it. That's bad!

What if we had wrapped our deletion code in a transaction? We'll look at that, but first let's restore our table again:

mysql example -u youruser -p < example_backup.sql

Let me pause for a moment to explain how transactions are usually coded. In your server-side script (e.g. PHP, Python, ASP), you'll usually have something similar to this:

class Owner
{
  // Delete an owner and his or her pets.
  function delete()
  {
    // Start the database transaction.
    $db->startTransaction();
    try {
      $this->deletePets(); // Contains "DELETE FROM pet..." SQL
      $this->deleteSelf(); // Contains "DELETE FROM owner..." SQL
      // If everything went okay, commit the change.
      $db->commit();
    } catch (Exception $e) {
      // If there was an error, roll back the transaction,
      // leaving the database untouched.
      $db->rollback();
    }
  }
}

Before we delete an owner, we say to the database, "Hey database, what we're about to do might not work. Hold on for a second and don't actually make these changes. Just pretend like we're making the changes."

If anything went wrong in the process of deleting a pet or owner, an exception will be raised so that our server-side script knows whether or not the operation was successful. If no exception was raised, we say, "Okay, database. It worked. Go ahead and make those changes for real." Otherwise we say, "Looks like that's not going to work after all. Just forget about the whole thing."

Since the examples I'm showing here are scripting-language-agnostic, I'm just going to show you some transaction examples in plain SQL.

If you run a SHOW TABLES command when you log back into the MySQL CLI, you might be slightly surprised:

mysql> show tables;
+-------------------+
| Tables_in_example |
+-------------------+
| owner             |
| pet               |
| user              |
+-------------------+
3 rows in set (0.00 sec)

Why didn't mysqldump restore our table back to what it originally was? Well, all we're doing when we restore is playing back the original creation of the database, not wiping everything out and then starting anew. So we'll have get rid of the user table manually:

mysql> drop table user;
Query OK, 0 rows affected (0.03 sec)

mysql> show tables;
+-------------------+
| Tables_in_example |
+-------------------+
| owner             |
| pet               |
+-------------------+
2 rows in set (0.00 sec)

Now let's do our deletion again, but this time let's start a transaction first:

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> delete from pet where owner_id = 1;
Query OK, 1 row affected (0.00 sec)

mysql> delete from owner where owner_id = 1;
Query OK, 1 row affected (0.01 sec)

mysql> select * from owner;
+----------+------+
| owner_id | name |
+----------+------+
|        2 | Bob  |
+----------+------+
1 row in set (0.00 sec)

mysql> select * from pet;
+--------+----------+----------------+
| pet_id | owner_id | name           |
+--------+----------+----------------+
|      2 |        2 | Scruffy        |
|      3 |        2 | Hoppy the Frog |
+--------+----------+----------------+
2 rows in set (0.00 sec)

At this point it looks like everything went as expected. Alice and her pets are gone. If we roll back, however, the changes don't stick.

mysql> rollback;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from owner;
+----------+-------+
| owner_id | name  |
+----------+-------+
|        1 | Alice |
|        2 | Bob   |
+----------+-------+
2 rows in set (0.00 sec)

mysql> select * from pet;
+--------+----------+----------------+
| pet_id | owner_id | name           |
+--------+----------+----------------+
|      1 |        1 | Fido           |
|      2 |        2 | Scruffy        |
|      3 |        2 | Hoppy the Frog |
+--------+----------+----------------+
3 rows in set (0.00 sec)

Let's do all the deletion stuff over again, this time committing:

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> delete from pet where owner_id = 1;
Query OK, 1 row affected (0.00 sec)

mysql> delete from owner where owner_id = 1;
Query OK, 1 row affected (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from owner;
+----------+------+
| owner_id | name |
+----------+------+
|        2 | Bob  |
+----------+------+
1 row in set (0.01 sec)

mysql> select * from pet;
+--------+----------+----------------+
| pet_id | owner_id | name           |
+--------+----------+----------------+
|      2 |        2 | Scruffy        |
|      3 |        2 | Hoppy the Frog |
+--------+----------+----------------+
2 rows in set (0.00 sec)

That exactly what you would expect, right? We made the changes, then committed them, so they stuck. Let's redo our example where the owner table got renamed to user, but this time let's use a transaction so we don't corrupt our data. But first we have to restore the database one final time.

mysql example -u youruser -p < example_backup.sql
mysql> alter table owner rename user;
Query OK, 0 rows affected (0.01 sec)

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> delete from pet where owner_id = 1;
Query OK, 1 row affected (0.00 sec)

mysql> delete from owner where owner_id = 1;
ERROR 1146 (42S02): Table 'example.owner' doesn't exist

There's our error again. But instead of being screwed this time, we can simply roll back:

mysql> rollback;
Query OK, 0 rows affected (0.01 sec)

mysql> select * from user;
+----------+-------+
| owner_id | name  |
+----------+-------+
|        1 | Alice |
|        2 | Bob   |
+----------+-------+
2 rows in set (0.00 sec)

mysql> select * from pet;
+--------+----------+----------------+
| pet_id | owner_id | name           |
+--------+----------+----------------+
|      1 |        1 | Fido           |
|      2 |        2 | Scruffy        |
|      3 |        2 | Hoppy the Frog |
+--------+----------+----------------+
3 rows in set (0.00 sec)

Thanks to transactions, our biggest problem went from being a corrupt database (a terrible thing to have) to being an error that we can catch and report without affecting our data. If you value your data integrity at all - and it would be very silly not to - you should use database transactions.

Side note: if you're using the Doctrine ORM, here's how to do a transaction in Doctrine.