A transaction symbolizes a unit of work performed within a database management system (or similar system) against a database, and treated in a coherent and reliable way independent of other transactions. A transaction generally represents any change in database. Transactions in a database environment have two main purposes:
- To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations upon a database remain uncompleted, with unclear status.
- To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.
Transactions provide an "all-or-nothing" proposition, stating that each work-unit performed in a database must either complete in its entirety or have no effect whatsoever. Further, the system must isolate each transaction from other transactions, results must conform to existing constraints in the database, and transactions that complete successfully must get written to durable storage.
Databases and other data stores which treat the integrity of data as paramount often include the ability to handle transactions to maintain the integrity of data. A single transaction consists of one or more independent units of work, each reading and/or writing information to a database or other data store. When this happens it is often important to ensure that all such processing leaves the database or data store in a consistent state.
Examples from double-entry accounting systems often illustrate the concept of transactions. In double-entry accounting every debit requires the recording of an associated credit. If one writes a check for $100 to buy groceries, a transactional double-entry accounting system must record the following two entries to cover the single transaction:
- Debit $100 to Groceries Expense Account
- Credit $100 to Checking Account
A transactional system would make both entries pass or both entries would fail. By treating the recording of multiple entries as an atomic transactional unit of work the system maintains the integrity of the data recorded. In other words, nobody ends up with a situation in which a debit is recorded but no associated credit is recorded, or vice versa.
A transactional database is a DBMS where write transactions on the database are able to be rolled back if they are not completed properly (e.g. due to power or connectivity loss).
Most modern relational database management systems fall into the category of databases that support transactions.
In a database system a transaction might consist of one or more data-manipulation statements and queries, each reading and/or writing information in the database. Users of database systems consider consistency and integrity of data as highly important. A simple transaction is usually issued to the database system in a language like SQL wrapped in a transaction, using a pattern similar to the following:
- Begin the transaction
- Execute a set of data manipulations and/or queries
- If no errors occur then commit the transaction and end it
- If errors occur then rollback the transaction and end it
If no errors occurred during the execution of the transaction then the system commits the transaction. A transaction commit operation applies all data manipulations within the scope of the transaction and persists the results to the database. If an error occurs during the transaction, or if the user specifies a rollback operation, the data manipulations within the transaction are not persisted to the database. In no case can a partial transaction be committed to the database since that would leave the database in an inconsistent state.
Internally, multi-user databases store and process transactions, often by using a transaction ID or XID.
There are multiple varying ways for transactions to be implemented other than the simple way documented above. Nested transactions, for example, are transactions which contain statements within them that start new transactions (i.e. sub-transactions). Multi-level transactions are a variant of nested transactions where the sub-transactions take place at different levels of a layered system architecture (e.g., with one operation at the database-engine level, one operation at the operating-system level) Another type of transaction is the compensating transaction.
Transactions are available in most SQL database implementations, though with varying levels of robustness. (MySQL, for example, does not support transactions in the MyISAM storage engine, which was its default storage engine before version 5.5.)
A transaction is typically started using the command
BEGIN (although the SQL standard specifies
START TRANSACTION). When the system processes a
COMMIT statement, the transaction ends with successful completion. A
ROLLBACK statement can also end the transaction, undoing any work performed since
BEGIN TRANSACTION. If autocommit was disabled using
START TRANSACTION, autocommit will also be re-enabled at the transaction's end.
One can set the isolation level for individual transactional operations as well as globally. At the READ COMMITTED level, the result of any work done after a transaction has commenced, but before it has ended, will remain invisible to other database-users until it has ended. At the lowest level (READ UNCOMMITTED), which may occasionally be used to ensure high concurrency, such changes will be visible.
Relational databases traditionally comprise tables with fixed size fields and thus records. Object databases comprise variable sized blobs (possibly incorporating a mime-type or serialized). The fundamental similarity though is the start and the commit or rollback.
After starting a transaction, database records or objects are locked, either read-only or read-write. Actual reads and writes can then occur. Once the user (and application) is happy, any changes are committed or rolled-back atomically, such that at the end of the transaction there is no inconsistency.
Database systems implement distributed transactions as transactions against multiple applications or hosts. A distributed transaction enforces the ACID properties over multiple systems or data stores, and might include systems such as databases, file systems, messaging systems, and other applications. In a distributed transaction a coordinating service ensures that all parts of the transaction are applied to all relevant systems. As with database and other transactions, if any part of the transaction fails, the entire transaction is rolled back across all affected systems.
- A transaction is a group of operations that are atomic, consistent, isolated, and durable (ACID).
- Beeri, C., Bernstein, P.A., and Goodman, N. A model for concurrency in nested transactions systems. Journal of the ACM, 36(1):230-269, 1989
- "MSDN Library". Retrieved 16 October 2014.
- Philip A. Bernstein, Eric Newcomer (2009): Principles of Transaction Processing, 2nd Edition, Morgan Kaufmann (Elsevier), ISBN 978-1-55860-623-4
- Gerhard Weikum, Gottfried Vossen (2001), Transactional information systems: theory, algorithms, and the practice of concurrency control and recovery, Morgan Kaufmann, ISBN 1-55860-508-8