Chris Pollett > Old Classes > CS256

CS256 Fall 2017 Practice Midterm 1

Studying for one of my tests does involve some memorization. I believe this is an important skill. Often people waste a lot of time and still fail to remember the things they are trying to memorize. Please use a technique that has been shown to work, such as the method of loci. Other memorization techniques can be found on the Wiki page for Moonwalking with Einstein. Given this, to study for the midterm I would suggest you:

  • Know how to do (by heart) all the practice problems.
  • Go over your notes at least three times. On the second and third passes, try to see how much you can remember from the first.
  • Go over the homework problems.
  • Try to create your own problems similar to the ones I have given and solve them.
  • Skim the relevant sections from the book.
  • If you want to study in groups, at this point you are ready to quiz each other.

The practice midterm is below. Here are some facts about the actual midterm: (a) It is closed book, closed notes. Nothing will be permitted on your desk except your pen or pencil and the test. (b) You should bring photo ID. (c) There will be more than one version of the test. Each version will be of comparable difficulty. (d) One problem (apart from typo fixes) on the actual test will be from the practice test.

  1. Briefly describe the following machine learning tasks: (a) Regression, (b) Transcription, (c) Denoising.
  2. Suppose we roll a 6-sided die four times. (a) What is the expected value of the sum of the four rolls? (b) Using Markov's inequality, give an upper bound on the probability of this sum being greater than 18. (c) Give an example of a pair of 3x3 matrices such that `AB ne BA`.
  3. Let `alpha = 2`. Suppose our current weight vector is `[1,2,3]` and `theta = 1`. The training example `x = [1,1,1]` is a false positive. How would the perceptron rule update `vec{w}` and `theta`? How would the Winnow rule update `vec{w}` and `theta`?
  4. Let `t` be the Boolean Threshold 1 Perceptron computing the inequality `(x_1 +x_2 +x_3) ge 2`. Let `X = { (0,0,0), (0,1,1) , (1,1,0), (1, 0, 0) }`. Compute a weight vector `vec{w}^t` and a constant `c_t >0` such that `t(vec{x}) = 0 => vec{w}^t \cdot vec{x} \leq 1 - c_t` and `t(vec{x}) = 1 => vec{w}^t \cdot vec{x} \geq 1 + c_t` for all `vec{x} in X`.
  5. Let `f` be the Nested Boolean Function `x_1 ^^ neg x_2 ^^ neg x_3 ^^ cdots ^^ neg x_n` (only the first input is unnegated). Explicitly compute the `w_i`'s needed to show that `f` can be represented by a boolean threshold function
    `w_1x_1 + cdots + w_nx_n geq theta_n`,
    with `theta_n = k +1/2` for some integer `k`, `w_i = \pm 2^{i-1}`, and `sum_{w_i<0}w_i < theta_n < sum_{w_i>0}w_i`.
  6. Write a Python function num_examples(n, epsilon, delta) which computes the number of training examples needed to PAC-learn an n-bit threshold function from a class of gradual threshold functions to within `epsilon`-error at least a `1-delta` fraction of the time.
  7. Prove that a two-input perceptron cannot compute the XOR function.
  8. Let `NAE(x_1,...,x_n)` be the function which is 1 if not all of its inputs have the same value and is `0` otherwise. Show how to compute this function as a composition of threshold functions of at most three layers.
  9. Define the following terms: (a) Maximal Margin Separator, (b) Radial Basis Function, (c) Polynomial Kernel (for an SVM).
  10. Briefly explain the update rule used in the S-K algorithm.
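
If you want to check your arithmetic on Problem 2(a) and 2(b) after working them by hand, the following is a minimal Python sketch. It is not an official solution. The function names `expected_sum`, `markov_bound`, and `exact_tail` are made up for this illustration, and it assumes a fair die and the usual form of Markov's inequality, `P(X >= a) <= E[X]/a` for a nonnegative `X` and `a > 0`. It also computes the exact tail probability by brute force so you can see how loose the Markov bound is.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # faces of one fair 6-sided die (assumed fair)
ROLLS = 4            # number of rolls in Problem 2

def expected_sum(rolls=ROLLS):
    """E[sum] by linearity of expectation: rolls * E[one die]."""
    one_die = Fraction(sum(FACES), len(FACES))
    return rolls * one_die

def markov_bound(mean, a):
    """Markov's inequality for nonnegative X and a > 0: P(X >= a) <= E[X]/a."""
    return min(Fraction(1), Fraction(mean) / a)

def exact_tail(threshold, rolls=ROLLS):
    """Exact P(sum > threshold), enumerating all 6**rolls outcomes."""
    outcomes = list(product(FACES, repeat=rolls))
    hits = sum(1 for o in outcomes if sum(o) > threshold)
    return Fraction(hits, len(outcomes))

mean = expected_sum()
# The sum is an integer, so "greater than 18" is the same event as ">= 19".
# Using a = 18 instead also gives a valid, slightly weaker, bound.
print("expected sum:", mean)
print("Markov bound with a = 19:", markov_bound(mean, 19))
print("Markov bound with a = 18:", markov_bound(mean, 18))
print("exact probability:", exact_tail(18))
```

Exact fractions are used instead of floats so the printed values can be compared directly with a hand computation. On the test itself you would be expected to do this without a computer.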