Hashing, Storage Engines for DBMSs




CS157b

Chris Pollett

Feb 20, 2023

Outline

Introduction

Secondary-Storage Hash Tables

Insertion, Deletion and Efficiency

Extensible Hash Tables

Insertion into Extensible Hash Tables

Extensible Hashing Example 1
Extensible Hashing Example 2
  1. Compute `h(K)` look at first `i` bits.
  2. Suppose this equals `x`.
  3. Look up the `x`th entry in the indirection table, follow its pointer to a bucket.
  4. If there is room insert into this bucket. (What would do for left hand image above, if inserted the record 0111)
  5. If no room and the number of bits of hash value used to look up bucket way less than `i`, say `j`, divide entries in block `x` into two blocks according to the `j+1`st bit of the hash function. (What would do for left hand image above, if inserted both 0000, 0111).
  6. If `j=i` then we double the number of indirection pointers. That is set `i:=i+1`. Each old pointer on `i` bits now becomes two pointers on `i+1` bits. Now we can go to 3. (The right hand image is what happens after inserting first 0000, 0111, then inserting 1000 -- which causes the doubling).

Quiz

Which of the following statements is true?

  1. In Oracle terminology PCTFREE represents the percentage of a block that is reserved for future updates to existing rows.
  2. In B-tree algorithms, the root node is not treated differently than any other node.
  3. B-trees do not easily support range queries.

Linear Hash Tables

Example Linear Hash Table Insert Sequence

Dawn of Databases - COBOL and files

Dawn of Databases - IBM and ISAM

How Tables and Indexes are Implemented: IBM, Ingres, Postgres, Oracle