Chris Pollett> Old Classses >
CS157b

( Print View )

Student Corner:
[Submit Sec2]
[Grades Sec2]

[Online Final-PDF]

[Online Midterm-PDF]

[Lecture Notes]
[Discussion Board]

Course Info:
[Texts & Links]
[Description]
[Course Outcomes]
[Outcomes Matrix]
[Course Schedule]
[Grading]
[Requirements/HW/Quizzes]
[Class Protocols]
[Exam Info]
[Regrades]
[University Policies]
[Announcements]

HW Assignments:
[Hw1] [Hw2] [Hw3]
[Hw4] [Hw5] [Quizzes]

Practice Exams:
[Midterm] [Final]

CS157b Spring 2020Practice Final

To study for the final I would suggest you: (1) Know how to do (by heart) all the practice problems. (2) Go over your notes at least three times. Second and third time try to see how much you can remember from the first time. (3) Go over the homework problems. (4) Try to create your own problems similar to the ones I have given and solve them. (5) Skim the relevant sections from the book. (6) If you want to study in groups, at this point you are ready to quiz each other. The practice final is below. Here are some facts about the actual final: (a) It is comprehensive (b) It is open book, open notes, you will have at least 24 hours to do the test and upload it using the class homework submission mechanism. (d) Some problems will be dependent on some personal information about you such as your ID, and you will not receive credit if you don't make use of this information correctly. (e) It is 10 problems (3pts each), 6 problems will be on materials since the midterm, 4 problems will be from the topics of the midterm. (f) Two problems will be exactly (less typos) off of the practice final, and one will be off of the practice midterm.

  1. Suppose `T(R) = 30000` and `T(S) = 20000` and `S` and `R` share only one attribute `Y` where `V(R,Y) =100`, V(S,Y) =1000. What would be our cost-based plan selection estimate for the number of tuples returned by `W=\sigma_{ A lt c}(R)`? $R \bowtie S$ ?
  2. Briefly explain how the SQL ANALYZE command works for sqlite as well as what kind of statistics are stored in the table sqlite_stat1.
  3. Give sufficient rules for a database to be able to perform recovery when: (a) undo logging is being used (b) redo logging is being used, (c) undo-redo logging is being used.
  4. Explain how a non-quiescent checkpoint is done if we are using undo-redo logging. Explain how an non-quiescent archive is performed if we are using undo-redo logging.
  5. Briefly explain and give an example: (a) two-phase locking, (b) strict two phase locking, (c) recoverable schedule, (d) avoids cascading rollbacks schedule.
  6. Give an example situation where 2PL is in use by two transaction won't be serializable. What kind of locking might prevent this? Give an example write-too-late situation that might occur when multi-version timestamp concurrency is being used.
  7. Give the parallel algorithm from class for computing the join of two relations $R \bowtie S$. Give its cost on `N` machines. Give the distributed algorithm from class for computing $R \bowtie S$ on two machines.
  8. Define the following distributed locking concepts: (a) centralized locking, (b) Lock Coordinator, (c) primary copy locking. Suggest an algorithm to avoid deadlocks in the distributed setting.
  9. Give an example different from class of a schema a mediator might expose to application queries and how queries to that schema might be re-written to two distinct data sources.
  10. Define the following OLAP concepts (a) star-schema, (b) slicing data, (d) cube. How would the roll up operation in SQL help in computing frequent item sets?