CMU 15-645 Logging Schemes

Recovery Algorithm

  • Today: ensure DBMS can recover from a failure (during normal TX processing)
  • Idea: Recover to a state with ACID guarantee

Classifications

  • Storage Types: Volatile Storage (DRAM/SRAM) => Non-volatile Storage (HDD, SSD, NVM) => Stable Storage (multiple storage devices)
  • Failure Types: Transaction / System / Storage Media
    • Transaction Errors: Logical errors / Internal State Errors
    • System Errors: Software (Exceptions, e.g., divide by zero) / hardware (Power down)
    • Fail-Stop Assumption: Non-volatile Storage will not be corrupted
    • Storage Media Failure: Game Over! WTF

Use volatile memory for faster access:

  • Procedures (Buffer Pool):
    1. Copy target record into memory
    2. Perform the writes in memory
    3. Write Dirty records back to disks
  • Issue: There are two transactions: T1, T2. T1 is still uncommitted but T2 is already committed. At the same time, the dirty records is written back to non-volatile device. T1 needs to roll back. What’s happened?
    • Steal + Force:
      1. Never have to undo changes of an aborted TX because the changes were not written to disk
      2. Never have to redo changes of a committed TX because all the changes are guaranteed to be written to disk at commit time

Shadow Paging

  • Automatically switch roles between master and shadow
  • Undo: Remove the shadow pages, switch to the master
  • Cons:
    1. Copying the entire page table is expensive (a index structure like B+Tree, Path copying)
    2. Commit Overhead is high

WAL (Write Ahead Log)

  • Format <Tid, Object Id, Previous Value, Current Value> (Undo + Redo)
  • Trade-off:
    1. When should we write log entries to disk?
    2. When should we write dirty records to disk?

Log Styles

  • Physical Log: Record the data changes in the database
  • Logical Log: Higher level operations (e.g., UPDATE, INSERT)
    • Cons: higher time costing to recover
    • Pros: Less space needed

Checkpoint

  • Actions:
    1. Output onto stable storage all log records currently residing in main memory
    2. Output to the disk all modified blocks
    3. Write a <CHECKPOINT> entry to the log and flush to stable storage
  • Recovery: Redo committed TXs after checkpoint + Undo uncommitted TXs after checkpoint