Outline
- Redo Logging
- In-Class Exercise
- Undo/redo Logging
Introduction
- On Monday, we were looking at how to ensure database integrity under different types of failures.
- We were focusing in particular on how to handle system failures such as power loss or software errors
that cause parts of transactions to be lost.
- We talked about transactions on a database as a sequence of actions on the database whose effect we wanted to entirely
appear or to not appear at all. For example, an ATM withdrawal.
- We then looked at logging as a technique to recover the database to a consistent state in the event of a system failure.
- In undo logging, we have log entries of the form `langle START\ T\rangle` (written when the transactions starts), `langle COMMIT\ T\rangle` (written to indicate all value changes of T are on disk) and `langle ABORT\ T\rangle` (written to indicate all effects of `T` have been removed from disk), or `langle T,X,v rangle`. Here `T` is a transaction identtifier, `X` is a database item, and `v` is the old value of the item.
- When a transaction wants to change a value, it first must write a log entry `langle T,X,v rangle` to disk. In the event of a system failure, the undo log records are used to restore uncommitted or abort changes back to their prior values.
- The process to do this as described on Monday might be slow as it could involve reading the whole log file which might be very long.
- So we looked at checkpointing, the associated `langle CKPT\ T\rangle` record, and nonquiescent checkpointing as ways to improve this.
- Today, we begin by looking at redo logging, a second way to do recovery logging.
Redo Logging
- A problem with undo logging is that we cannot commit a transaction without
first writing all its changed data to disk. This might cost us I/Os.
- This requirement can be avoided using redo logging.
- The principle differences between undo and redo logging are:
- Undo logging cancels the effects of incomplete transactions and ignores committed ones;
redo logging ignores incomplete transactions and redoes committed ones as necessary.
- Undo logging requires changed DB elements to be written to disk prior to the commit
log record being written to disk; redo logging requires the COMMIT appear on disk before
any changed values reach the disk.
- We had rules `U1` and `U2` for undo logging to guarantee undo logging worked. We
will have a redo log rule `R1` which replaces these two rules.
The Redo Logging Rule
-
(R1) [Also called the write-ahead logging (WAL) rule] Before modifying X on disk, all log records
pertaining to this modification of `X`
(that is, both the update record `langle T, X, v rangle`
and the `langle COMMIT\ T rangle` record) must appear on disk.
-
Here `v` is now the new value not the old value as in undo logging.
- Redo logging gives rise to the following order for writing things to disk:
- Log records of changed DB elements.
- A Commit record.
- The Changed DB elements themselves
Recovery with Redo Logging
- To recover when using redo logging, we:
- Identify the committed transactions.
- Scan the log forward from the beginning. For each
log record `langle T, X, v rangle` encountered:
- If `T` is not a committed transaction, do nothing.
- If `T` committed, write the value for database element `X`.
- For each incomplete transaction `T`, write an `langle ABORT\ T rangle` record to the log and flush the log.
Checkpointing a Redo Log
- Checkpointing can also be done with redo logs. The steps are:
- Write a log record `langle START\ CKPT(T_1,...,T_k) rangle`, where `T_1,..., T_k` are the active transactions.
- Write to disk all database elements that were written to buffer for transactions that had committed when the START CKPT began.
- Write an `langle END\ CKPT rangle` record.
- To recover with a checkpointed redo log, look for last record of type `langle END\ CKPT rangle`.
- We only need to redo `T_i`'s or transaction committed after the corresponding `langle START\ CKPT(T_1,...,T_k) rangle`.
In-Class Exercise
- Consider the transaction from last: I(A), R(A,t), t:=t*2, W(A,t), I(B), R(B,t), t:= t*2, W(B,t), O(A), O(B)
- Make a table like last day with columns Transaction Op, Value t, Memory Value A, Disk Value A, Memory Value B, Disk Value B, Log Records.
- Now though make any writes to disk be consistent with redo logging. Indicate when the log records are written to disk by explicitly given flush logs.
- Please post your solution to the Apr 5 In-Class Exercise Thread.
Undo/Redo Logging
- Both undo and redo logging have disadvantages.
- An example disadvantage of redo logging is that all
modified blocks must be kept in buffers until the transaction
commits and the log records have been flushed.
- Both cause problems if one has a block in memory that was
modified by a transaction that is ongoing and modified by a
transaction that has committed.
- We can try to combine the best of both worlds and use undo/redo logging.
- In this set up we have the following new rules:
-
(UR1) Before modifying `X` on disk because of `T`, it is necessary
that the update record `langle T,X,v,w rangle` appear on disk. (`v` is the old value; `w` is the new)
- (UR2) A `langle COMMIT\ T rangle` must be flushed to disk as soon as it appears in the log.
- To recover when using undo/redo logging, we:
- Redo all the committed transactions in the order earliest first, and
- Undo all the incomplete transactions in the order latest first.
Checkpointing an Undo/Redo Log
- The steps this time are:
- Write a `langle START\ CKPT(T_1, ..., T_k) rangle` to the log
- Write all dirty buffers to disk.
- Write an `langle END\ CKPT rangle`, flush the log.
- To recover we can look for the last `langle END\ CKPT rangle` in the log and find its corresponding `langle START\ CKPT(T_1, ..., T_k) rangle`.
- By (2), we know transactions that have committed prior to this have had all their data on disk, so don't need to be redone.
- So we only need to redo committed transaction after `langle START\ CKPT(T_1, ..., T_k) rangle`.
- Similarly, we only have to undo uncommitted transactions up to the `langle START\ CKPT(T_1, ..., T_k) rangle`.
- We require a transaction not write any values even to memory
buffers until it is certain whether or not to abort.
Logging in DBMS systems
- Much of the recovery techniques currently in use in DBMS, date to IBMs System R (Gray, et al 1981) and to ARIES (Algorithm for Recovery and Isolation Exploiting Semantics) (Mohan et al 1992) also developed at IBM.
- System R did not use logging: A database was treated as a collection of pages accessed through a page table.
- When a transaction changes a page, it first copies the pages to a shadow page, and then makes changes to the shadow page.
- After this is done, the transaction modifies the page table to change the old page for the shadow page. When this happens the operation is viewed as "committing".
- To abort a transaction its shadow pages are discarded.
- This approach can lead to data fragmentation, and so was eventuallly superseded by logging.
- What ARIES does is essentially what we called Undo/Redo logging with checkpointing.
- It also keeps track of a dirty page table and a transaction table.
- The transaction table keeps track of all the transactions that are currently running and the dirty page table keeps track of all pages that have been modified but not written to disk (makes it easier to do checkpointing).
- To make it easier to scan through the logs, records have the format: `langle` Sequence Number, Transaction ID, Page ID, Redo, Undo, Previous Sequence Number `rangle`.
Here Sequence number refers to log entry, and the previous sequence number gives the last entry involving the same transaction.
- ARIES also adds one twist to the recovery process it also writes Compensation Log Records to record when an operation has already been undone. They have
the form `langle` Sequence Number, Transaction ID, Page ID, Redo, Previous Sequence Number, Next Undo Sequence Number `rangle` and make it faster to keep going if there is a crash during a recovery.
- ARIES is used by DB2, SQL Server, and InnoDB in MySQL/MariaDB. Oracle uses a variation on undo/redo logging.
- Postgres uses Write Ahead Logging (i.e., redo-based recovery).
- The Sqlite database command PRAGMA journal_mode = value; can be used to set the kind of logging Sqlite does. By default, its value is delete which roughly correspond to undo logging. It also support WAL.