Transactions and Concurrency Control

Overview

What a transaction is and how ACID keeps it reliable.
Where concurrency goes wrong: lost updates, non-repeatable reads and phantoms.
How to judge correctness with serializability and precedence graphs.
How DBMSs enforce order: timestamp ordering, locks, two-phase locking and deadlock handling.

1. Transactions and ACID

1.1 What counts as a transaction?

A transaction is a sequence of operations treated as a single unit of work:
- It can read or modify the database,
- It ends with either COMMIT (make changes permanent) or ROLLBACK (undo them).

Elementary actions inside a transaction:
- Read: fetch data (SELECT).
- Write: change data (INSERT, UPDATE, DELETE).
- Commit: persist every change of this transaction.
- Rollback: cancel every change of this transaction.

Generic form:

-- Oracle starts a transaction implicitly on the first change
-- or when you set autocommit off.

-- Reads and writes here...

COMMIT;   -- or ROLLBACK;

Example - bank transfer in one transaction:

UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A';
UPDATE accounts SET balance = balance + 100 WHERE account_id = 'B';
COMMIT;  -- or ROLLBACK if any step fails

1.2 ACID properties

Atomicity - all or nothing: partial effects are not allowed.
Consistency - every constraint stays valid after a correct transaction.
Isolation - intermediate results stay private; the outcome must match some serial order.
Durability - once committed, data survives crashes.
Reference: Ensuring ACID semantics (Oracle docs)
Video overview: ACID properties in databases

2. Concurrency problems

When transactions overlap, their operations can interleave and create anomalies.

2.1 Lost update

Two transactions write the same data; one overwrites the other.
Timeline (balance starts at 600):
T1 reads 600 and plans to subtract 100.
T2 reads 600 and plans to add 100.
T2 writes 700.
T1 writes 500 -> T2's update is lost.

2.2 Non-repeatable read

A transaction reads a row twice and gets two different values because another transaction updated it between the reads.
Example: T2 sums accounts A and B. After reading A, T1 withdraws from A and B. When T2 reads B, the total depends on timing.

2.3 Phantom read (invalid dependency)

A transaction reads a set of rows that satisfy a condition.
Another transaction inserts or deletes rows matching that condition.
Re-running the same query returns a different set (extra or missing "phantom" rows).
Example: counting employees in a department while another session inserts one more employee there.
These issues motivate formal correctness criteria and control mechanisms.

Video on read anomalies: Dirty, inconsistent and phantom reads

3. Schedules, serializability and the scheduler

Schedule: the exact order of all reads, writes, commits and rollbacks from concurrent transactions.
Serial schedule: transactions run one after another with no interleaving (always correct but slow).
Interleaved schedule: operations are mixed to improve throughput.
Serializable schedule: interleaved execution that produces the same final database state as some serial order.

Scheduler: the DBMS component that orders operations to avoid incorrect interleavings.

3.1 Testing serializability with precedence (conflict) graphs

Conflicting operations are on the same data item where at least one is a write:
- Read-Write, Write-Read, Write-Write.

To check a schedule:
1. Create one node per transaction.
2. For each conflict, draw an edge Ti -> Tj if Ti's operation must occur before Tj's.
3. If the graph has no cycle, the schedule is conflict-serializable.

Another intuition: operations on different data can often be permuted (swapped) without changing the result; conflicts mark operations that cannot be swapped.
- Video: Serializability and recoverability

4. Timestamp ordering

Goal: impose a global time order so operations "respect" who arrived first.

Each transaction gets a unique timestamp TS(Ti) on start (clock time or counter).
For every data item X, the system tracks:
TSR(X) - timestamp of the last transaction that read X,
TSW(X) - timestamp of the last transaction that wrote X.

Rules
- Read Ti on X: allowed if TS(Ti) >= TSW(X); then set TSR(X) = max(TSR(X), TS(Ti)). Otherwise abort and restart Ti.
- Write Ti on X: allowed if TS(Ti) >= TSR(X) and TS(Ti) >= TSW(X); then set TSW(X) = TS(Ti). Otherwise abort and restart Ti.

Intuition: no operation is allowed to "go back in time" and overwrite or see something that a newer transaction already produced.
- Video: Timestamp ordering protocols

5. Lock-based protocols

Transactions must request locks before using data:
- SLOCK (shared/read) - many readers allowed; no writers.
- XLOCK (exclusive/write) - single owner; blocks all other access.
- UNLOCK - releases the item.

Compatibility:
- If an item is free: grant requested lock.
- If item has SLOCKs: new SLOCK allowed; XLOCK must wait.
- If item has XLOCK: everyone else waits.

Simple locking alone can still lead to non-serializable schedules, deadlocks, or cascading rollbacks (if a transaction reads data that is later rolled back).

6. Two-phase locking (2PL) and variants

Two phases per transaction:
- Growing: acquire locks, release none.
- Shrinking: release locks, acquire none.

Breaking the rule (releasing then acquiring) risks non-serializable schedules.

Variants:
- Strict 2PL: keep XLOCKs until COMMIT/ROLLBACK. Ensures serializability.
- Rigorous 2PL: keep both SLOCKs and XLOCKs until the end. Prevents cascading rollbacks.
- Conservative 2PL: request all needed locks before starting; if any lock is unavailable, wait before doing anything. Avoids deadlocks but can reduce parallelism.
- Concurrencyr overview (Arabic): Concurrency control

7. Deadlocks: detection and prevention

Deadlock: transactions wait for each other's locks forever (e.g., T1 holds a and wants b; T2 holds b and wants a).

Detection: build a wait-for graph (edge Ti -> Tj if Ti waits on Tj). A cycle means deadlock; abort one transaction to break it.

Prevention (use timestamps to decide who waits vs aborts):
- Wait-Die: older waits for younger; younger requesting a lock held by older is aborted ("dies") and restarts with the same timestamp.
- Wound-Wait: older aborts ("wounds") younger that block it; younger waits if the blocker is older.

Video: Shared vs exclusive locks with examples

8. Key takeaways

ACID keeps each transaction reliable; isolation matters most under concurrency.
Lost updates, non-repeatable reads and phantoms show why ordering rules are needed.
Conflict serializability (no cycles in the precedence graph) formalizes “safe” interleavings.
Timestamp ordering and lock-based (2PL) protocols are standard enforcement tools; each has trade-offs in aborts, waiting and throughput.
Deadlocks are a natural side effect of locking and must be detected or prevented.