Minimization Methods, Hidden Units, Stochastic Gradient Descent

CS256

Chris Pollett

Oct 16, 2017

Outline

Introduction

Minimizing Functions

Newton Methods for Finding Minima

The BFGS Algorithm

In-Class Exercise

Gradient Descent

Hidden Units

Types of Activation

Remarks on Initialization

Stochastic Gradient Descent

Training Terminology

Visiting All the Data with SGD