Cost Based Plan Selection




CS157b

Chris Pollett

Mar 7, 2018

Outline

Estimating Costs of Operations

What to select for each plan we generate

  1. An order and grouping on the associative and commutative operations like join, unions, and intersections.
  2. An algorithm for each operator in the logical plan. For example, we need to choose between nested-loop join and hash-join.
  3. Additional operators - scanning, sorting, etc which are needed for the physical plan but are not present in the logical plan.
  4. The way in which arguments are passed from one operator to the next. (By intermediate results or by pipelining).

Estimating Sizes of Intermediate Relations

Estimating the Size of a Projection

Estimating the Size of a Selection

In-Class Exercise

Suppose we had a table PERSON with 100 million rows. In these rows, the EYE_COLOR attribute can take one of six values. Further, in these rows, the ANNUAL_INCOME attributes take on 5000 distinct values. Solve the following problems:

  1. You only want to date people that don't have red eyes. Write this as a relational algebra query, and estimate the number of tuples returned.
  2. You only want to date people that have green eyes. Write this as a relational algebra query, and estimate the number of tuples returned.
  3. You only want to date people that have annual incomes over 50000. Write this as a relational algebra query, and estimate the number of tuples returned.

Post your answers to the Mar 7 In-Class Exercise Thread.

Estimating the Size of a Join

Natural Joins with Multiple Join Attributes

Joins of Many Relations