Finish Optimization




CS256

Chris Pollett

Nov 13, 2017

Outline

Introduction

Optimization and SGD

Excess Error

Momentum

Nesterov Momentum

Quiz

Which of the following is true?

  1. Weight perturbation involves adding small amounts of Gaussian noise to the `(x,y)` pairs while training.
  2. If early stopping is used, then for each epoch of training we hold some data in reserve for validation.
  3. In standard bagging, one trains several models separately and then has the model with the largest score determine the output.

Parameter Initialization

Initializing Biases

Adaptive Learning Rate Algorithms

Adaptive Learning Rate Algorithms that work with SGD