For the average reader, the phrase “expert in ACID” most likely evokes a picture of either an eminent chemist or a jolly Deadhead. However, in the world of computer science, an understanding of "ACID" means something a bit different (though not mutually exclusive!). When dealing with relational databases, the term refers to a set of properties that describe certain characteristics of transactions.
To understand ACID, one must therefore first understand the definition of “transactions” that we're using here. Specifically, transactions are units of work, and they represent a particular kind of exchange. They have distinct rules: each transaction is always independent from each other transaction, and a transaction either occurs or it doesn’t—there’s no ambiguous middle ground. This, in fact, is thanks to ACID.
In computer science, where transactions are common and important, ACID is an acronym representing the following properties:
- Atomicity
- Consistency
- Isolation
- Durability
Taken together, these characteristics are another way of stating our definition of transactions: each transaction must be absolute and all-or-nothing (A); it must follow the constraints of the system inside of which it takes place (C); it must be independent of other transactions, with its in-progress changes hidden from them (I); and it must persist as a new truth of the system once it has completed (D).
There are many disciplines and applications where transactions, as governed by ACID, play a vital role. The most common examples are in programs and systems related to banking, where transactions (exchanges of money) are obvious and familiar to virtually everybody.
Consider the parameters we used to define transactions earlier: transactions must be independent and they must be absolute (they either happen or they don’t). Imagine a transaction involving moving some amount—say, $100—from one account to another. The movement involves two operations: 1) money is subtracted from the debtor account, and 2) money is added to the recipient account. The $100 has to start out in one place and end up in another—every penny must be accounted for. These operations need to both happen, or neither; transactions of cash must be zero-sum! If one operation happens and not the other, the bank (and, eventually, the economy) is going to be in big trouble, fast. The ACID properties of transactions are what allow the transfer of money to happen within a system in a logical, repeatable, preservable way.
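The transfer described above can be sketched in a few lines. This is a minimal illustration using Python's built-in `sqlite3` module, not a model of how a real bank works; the `accounts` table, the account names, and the `transfer` helper are all hypothetical, and balances are stored as integer cents.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"  # cents, never negative
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("alice", 10000), ("bob", 5000)],
    )
    conn.commit()
    return conn


def transfer(conn, sender, recipient, cents):
    """Move money between accounts; both updates apply, or neither does."""
    # Used as a context manager, the connection commits if the block
    # finishes normally and rolls back if an exception escapes it.
    with conn:
        debit = conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (cents, sender),
        )
        credit = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (cents, recipient),
        )
        if debit.rowcount != 1 or credit.rowcount != 1:
            raise ValueError("unknown account")


def balances(conn):
    """Return the current balances as a {name: cents} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If the recipient does not exist, the debit has already run by the time the error is raised, but the rollback undoes it, so no money disappears. The `CHECK` constraint plays the role of a consistency rule: an overdraft makes the first `UPDATE` fail, and the transaction is abandoned.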
Of course, banking is hardly the only use case. Other application areas include:
- Accounting. Suppose you have a web app with a billing feature. Customers receive invoices, and eventually all payments go into a table for reporting purposes. If someone pays an invoice, the invoice needs to be updated and an entry needs to be made in the payment log. If two people try to pay an invoice, only one should succeed (consider how this fits into the ACID model). An invoice shouldn’t be paid twice.
- Social networks. Suppose your database stores users and groups. Each user has a list of groups with which they are associated. Similarly, each group has a list of users who are members. The act of adding an item to or subtracting an item from one of these lists should qualify as a transaction. Adding a user to a group, for example, should result in an update of both the user’s group list and the group’s member list; the two lists need to be updated as part of a transaction so things stay consistent.
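To make the accounting example concrete, here is one possible sketch of "pay an invoice exactly once", again using Python's built-in `sqlite3` module. The `invoices` and `payments` tables and the `pay_invoice` helper are assumptions made for illustration; a real billing system would have more columns and more rules.

```python
import sqlite3


def make_billing_db():
    """Create an in-memory database with hypothetical billing tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        " id INTEGER PRIMARY KEY,"
        " amount INTEGER NOT NULL,"
        " paid INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE payments ("
        " id INTEGER PRIMARY KEY,"
        # UNIQUE is a second line of defense: at most one payment per invoice.
        " invoice_id INTEGER NOT NULL UNIQUE REFERENCES invoices(id),"
        " payer TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO invoices (id, amount) VALUES (1, 2500)")
    conn.commit()
    return conn


def pay_invoice(conn, invoice_id, payer):
    """Return True if this call paid the invoice, False otherwise."""
    with conn:  # commit on success, roll back if an exception escapes
        # Only an unpaid invoice matches, so a second payer updates 0 rows.
        cur = conn.execute(
            "UPDATE invoices SET paid = 1 WHERE id = ? AND paid = 0",
            (invoice_id,),
        )
        if cur.rowcount == 0:
            return False  # already paid, or no such invoice
        conn.execute(
            "INSERT INTO payments (invoice_id, payer) VALUES (?, ?)",
            (invoice_id, payer),
        )
        return True
```

The invoice update and the payment log entry live in the same transaction, so the log can never contain a payment for an invoice that is still marked unpaid, or the reverse. The same pattern would apply to the social network case: update the user's group list and the group's member list inside one transaction.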
In a wider scope, there are many cases where transactions prove helpful. In relational databases, if you have a really complicated, single-statement query, you can simplify it by separating it into multiple statements and wrapping them in a transaction to get the effect of a single-statement query: atomicity and isolation mean the statements take effect together, as one unit. Similarly, transactions are great for tests. You can start a transaction for each test, make changes to your database, and, at the end, roll back any changes to avoid unwanted effects. Further, you can often run tests in parallel, since isolation keeps one test's uncommitted changes hidden from the others.
Finally, transactions are great for concurrency. Applications that can run work concurrently can be much faster, and with transactions the database itself enforces consistency and isolation (to the degree your chosen isolation level provides). This means you don't need to worry as much about the ordering of actions within an application, as the independence of transactions is built in by definition.
Not all databases offer full ACID transactions, though your database may guarantee enough for your use case. For example, MongoDB originally offered atomicity only within a single document (multi-document transactions were added in version 4.0), but if that’s all you need, there’s no problem.
However, some databases don't support transactions at all. If your application needs them, or would benefit substantially from them, but your database doesn't provide them, you're left with a thorny problem. You can implement transactions yourself, ad hoc, though it can be very difficult. Once you give up transactional integrity at a lower level, it's hard to recover the ACID properties. Transactions in databases are like mutexes in multithreaded programming: if you don’t have them, you have to be very careful about not making mistakes. And it’s not always obvious if you're making mistakes!
As we've seen, transactions offer important benefits in a wide range of applications, and the ACID properties are at the root of those benefits. As a developer or DBA, the best way to use transactions properly is to first have a good understanding of the theory behind them and their use cases. Getting a handle on ACID is a logical first step; once you're comfortable with those properties, the nature and uses of transactions will be significantly clearer.