Joins, Completing Plans, System Failures

CS157b

Chris Pollett

Mar 15, 2023

Outline

Suppose we want to compute $(R_1 \bowtie R_2 \bowtie \ldots \bowtie R_n)$
We can build a table that contains:
- Each table `R_i`, as well as its size and number of values for the join attribute
- For each subset of `j` tables, `R_{i_1} ... R_{i_j}`, an estimate of the total cost of performing the join of this subset.
The second items above can be computed from the best joins of subsets `j-1` tables by:
1. Go through the entries for subsets of size `j-1`, joining one more table.
2. Check if this j-subset is already in the table.
3. If it is, compare its currently listed cost with the new way of joining.
4. If the new value is smaller, then replace its value in the table.

Suppose we have three tables R(A,B), S(B,C), U(C,D). Here T(R)=300, T(S) = 600, T(U) = 900 and V(R,A)=10, V(R,B)=20, V(S,B)=30, V(S,C)=40, V(U,C)=50, V(U,D)=60. Use the algorithm of the last slide to determine the cost of computing the natural join of these three tables.
Please post your solution to the Mar 15 In-Class Exercise Thread.

To finish up creating a query plan, we need to consider the following issues:
- How to select algorithms to implement the operations of the query plan not done as part of some earlier step as in the dynamic programming for join order
- Decide whether to materialize intermediate results or pipeline them
- Notations for physical plan operators

What way should we use to evaluate `sigma_C(R)`?
Assuming there are no multidimensional attributes we can, if possible:
- Choose an attribute `A` in `C`, with an index, and which is compared to a constant
- After using the index to find those things satisfying this part of `C`, apply the rest of the condition to complete the selection.
- If no such attribute exists, we need to do a table-scan
If there is more than one choice for `A`, we want to choose the most selective single indexed attribute, so we might do cost estimates over the different possibilities.

Request a nested loop join, if one can hope that the outer table can fit totally in memory (so maybe don't need to execute outer loop more than once).
Use a sort-join when one or both tables is already sorted on the join attribute or if there are two or more tables joined on the same attribute (as then sort step can be amortized).
If there is an index on the join attribute and the results are expected to be small use an index join.
Fall back to nested loop join if none of the above apply.

The naive way to evaluate queries is to write intermediate results back out to disk.
This is called materialization.
Pipelining is a technique whereby we pass the output of the sub-query computed so far to the next stage in the query evaluation. It is typically implemented using iterators.

Each node in logical query plan needs to be replaced by a physical operator.
Operators for leaves:
- TableScan(R)
- SortScan(R,L) - sort according to list L
- IndexScan(R,C) - look up index according to condition C
- IndexScan(R,A) - entire relation is retrieved via lookup on attribute A.
Operators for Selection:
- Filter(C) -- filter child results according to C.
- Some select might be an IndexScan followed by a Filter
Operator for Sorting:
- Sort(L) -- may occur anywhere interior to tree.
Other operators
- Join( type, # of buffers) - for example type might be two pass hash-join, and number of buffers might be 100

The physical query plan is typically represented as a tree.
Data must flow up the tree.
Since interior nodes may not be ancestors or descendants of each other in a bushy tree, we need to figure out an order to evaluate sub-trees of the same height.
To do this we evaluate sub-trees from the bottom up and evaluate sub-trees of the same height from left to right. (Pre-order.)
Execute all nodes of each sub-tree using a network of iterators.

We now start a new topic, how to handle system failures.
We would like our DBMS to:
- Protect data in the case of system failure.
- Keep the database in a consistent state even if multiple operations are being carried out at the same time.

Erroneous Data Entry -- We can try to prevent by type checking.
Media Failures -- We can try to prevent by using a RAID system or archiving data to tape
Catastrophic failures -- For example, fires, rogue airplanes, etc. We can try to distribute the database (How Sun recovered 9/11).
System failures -- These are things like power loss or software errors that cause a particular transaction to be lost.
Handling system failures are what we are going to focus for the next several lectures.

In order to understand the recovery process from system failure, we need to understand a little about transactions.
When doing a sequence of operations from a DMBS shell, each query or statement is roughly a transaction.
In an embedded SQL/JDBC setting, a transaction will be a sequence of database commands up until either a COMMIT or ROLLBACK (abort) command is issued.
The transaction manager in the DBMS has the job of making sure a transaction is executed correctly.

The Transaction Manager:
- Issues signals to the log manager so that necessary log records are written.
- Ensures that concurrently executing transactions do not interfere with each other.