Finish Variable Length Records, Record Modification, and Indexes

CS157b

Chris Pollett

Feb 8, 2023

Outline

On Monday, we were talking about how blocks and records are identified by the DBMS using database addresses and in memory addresses.
We looked at how these addresses are converted to each other when a block is read into the DB buffer (in-memory) and when it is written back to disk using swizzling/unswizzling.
We then looked at variable length formats (for example to efficiently handle VARCHARs) for fields within records and variable format records (for XML and JSON).
We begin today by finishing looking at variable length record formats by looking at what happens when records store really large fields...

A record which has a field that has a value that forces spanning (such as a JPG, MPEG, etc) is called a blob (binary large object).
Blobs often need to be retrieved quickly so we want to store them in sequential blocks or stripe them across different disks.
Blobs also often need special index structures to be able to do things like retrieve say the 40 minute of a movie.
Sometimes it makes sense not to store the blob in the database at all, but use the filesystem instead.
This is because databases were originally optimize for small records and often incur a fair bit of overhead compared to the raw filesystem for supporting insert, delete, update, operations on larger objects.
In these cases, we may just store the filesystem path to the object in the database.
Sears, van Ingen, and Gray (2006) consider various workloads and when the overheads of keeping a blob in the database exceed the benefits.

Suppose we want to insert a record into a DB table.
If the records in the table are not sorted, then we can:
1. Find a block `B` of the table in question with space for the record.
2. If no such block exists, allocate a new block `B` and associate this new block with our table.
3. Add our record to block `B`.
The situation is harder if we need to keep records sorted.

The idea is to find the block where we need insert.
If there is space there, then add the record.
We might need to slide records around within the relevant block and update its offset table.
If there is no space, then we could:
1. look at an adjacent block in the sorted order and try to find space there then redistribute records
2. create an overflow block.

When we delete a record we may be able to reclaim its unused space.
We could try to slide records around so there is one big available space block.
We might also be able to get rid of any overflow blocks.
If this is not possible, then we should maintain an available space list in the block header.
- That is, we designate one of the offsets in the block header as the available space offset.
- This is null if there is no available space.
- Otherwise, it gives the offset to an available record slot.
- Available record slot header then have addresses to the next available record slot in the list, and so.
Once a record is deleted, we may need to place a tombstone on the record so that anything pointing to the record knows it was deleted.

Suppose we are storing records in sorted order and we can store two records per block (records are single alphabet characters).
We first store B, D, E, H. Then store A, C. Then delete D. Then store F, G.
Show after each insert what the blocks would look like if we were following the advice of the last two slides.
Post your solution to the Feb 8 In-Class Exercise Thread.

For fixed length records, one can just overwrite the existing record.
For variable length records, one has all the problems associated with insert and delete except that one doesn't have to have a tombstone.
If a variable length update record is shorter than the current record, we can:
1. overwrite the old record
2. add unused space to an available space list
If a variable length update record is longer than the current record, we can:
1. Try shifting records around in the block to free up space and write the new record.
2. If shifting doesn't work, we can try to create an overflow block.

When we store records we need to store enough information so we can process queries like
```
SELECT * 
FROM R
```
This kind of a query involves a whole scan of a table (so is often called a tablescan.)
It will run in time proportional to the number of rows in the table.
To do this, we need to store in the block header where in a block the records begin, and in the record header of these records, which relation they belong to.
We might want to reserve whole cylinders for particular relations to speed things up.
A query like:
```
SELECT * 
FROM EMP
WHERE name = 'John Smith';
```
only returns a small number of rows compared to the table.
Ideally, we want our queries to run in time proportional to the result set, not time proportional to the underlying tables.
To avoid doing table scans, for queries such as the above, we will use auxiliary data structures such as indexes.

We imagine our relations are stored in data structures within one or more files.
The data file (in some DBMSs like Mysql this is a real file, in sqlite all relations and their components and there stored in the same file) is where the record data is stored.
An index file is a file containing (index key, pointer) pairs as records.
A index key usually consists of one or more column values that are found in the relation, and the pointer points to a record which has those column values.
A given relation might have several index files.
As index keys often let one find records quickly in the relation they are sometimes referred to as search keys.

In this kind of file we sort the file on the attributes of the index.
Particular values for the attributes of the index are called sort keys.
In the case of a sequential file, we don't need an auxiliary index file.
Keeping the file sorted means we can use binary search to quick find records which have a particular sort key.