
HW#4 --- last modified Saturday, 18-Nov-2017 22:33:37 PST.

Solution set.

Due date: Nov 21

Files to be submitted:

Purpose: To experiment with neural nets containing convolutional layers.

Related Course Outcomes:

The main course outcomes covered by this assignment are:

CLO4 -- Be able to select neural network layer types to build a network suitable for various learning tasks such as object classification, object detection, language processing, planning, policy selection, etc.

CLO5 -- Be able to select an appropriate regularization technique for a given learning task.

CLO6 -- Be able to code and train a multi-layer neural network with a library such as Caffe, Theano, or TensorFlow.

CLO7 -- Be able to measure the performance of a model, determine if more data is needed, and tune the model.


For this homework you will implement a convolutional neural network program in TensorFlow (or Caffe, though we may not have time to cover it in class) capable of learning the Zener card training data from Hw2. You will then use your program to conduct experiments and compare the results to the SVM models trained in Hw2. You can use the data generating program from Hw2 to make datasets.

Your neural net training program will be run from the command line with a line like:

python cost network_description epsilon max_updates class_letter model_file_name train_folder_name 
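For concreteness, here is a minimal sketch of how these positional arguments might be read, assuming your script's name sits between python and cost on the command line; the use of plain sys.argv and the variable names are illustrative assumptions, not part of the spec:

import sys

# Hypothetical sketch: read the positional arguments in the order given above.
cost = sys.argv[1]                 # cross, cross-l1, cross-l2, or ctest
network_description = sys.argv[2]  # path to the layer description file
epsilon = float(sys.argv[3])       # stopping threshold (ignored for ctest)
max_updates = int(sys.argv[4])     # maximum number of epochs
class_letter = sys.argv[5]         # which Zener card class to learn
model_file_name = sys.argv[6]      # where to save/load the trained model
train_folder_name = sys.argv[7]    # folder containing the training images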

The parameters epsilon, max_updates, class_letter, model_file_name, and train_folder_name are the same as in Hw2. An update here means one epoch. cost should be one of cross, cross-l1, cross-l2, or ctest, indicating whether training is done using just cross entropy, cross entropy with L1 regularization, cross entropy with L2 regularization, or no training at all, just testing (in which case epsilon and max_updates are ignored). network_description is the name of a file that should consist of a sequence of rows in the format:

feature_size num_features

These rows should be followed by a single row giving the number of units for a dense layer. For example, a network_description file might look like:

5 4
6 8
6 16
64

This would specify a neural net with the following layers: a convolutional layer with 4 feature maps using a 5x5 filter followed by a maxpool layer, then a convolutional layer with 8 feature maps using a 6x6 filter followed by a maxpool layer, then a convolutional layer with 16 feature maps using a 6x6 filter followed by a maxpool layer, and finally a dense layer of 64 units all of whose outputs connect to a single sigmoid perceptron. For the purposes of this homework, all other units use ReLU activations.
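As a rough illustration of how a network_description file might be turned into such a model, here is a minimal sketch using TensorFlow's Keras layers; the helper names, the 2x2 max-pooling, the 'same' padding, and the choice of optimizer are assumptions made for the example rather than requirements of the spec:

import tensorflow as tf

# Hypothetical sketch: map the cost option to an optional weight penalty.
def make_regularizer(cost, strength=1e-4):
    if cost == "cross-l1":
        return tf.keras.regularizers.l1(strength)
    if cost == "cross-l2":
        return tf.keras.regularizers.l2(strength)
    return None  # plain cross entropy for "cross" and "ctest"

# Hypothetical sketch: build a conv/maxpool pair per description row, then a
# dense layer and a single sigmoid output unit, as described above.
def build_model(network_description, input_shape, regularizer=None):
    with open(network_description) as f:
        rows = [line.split() for line in f if line.strip()]
    model = tf.keras.Sequential(
        [tf.keras.layers.InputLayer(input_shape=input_shape)])
    for feature_size, num_features in ((int(r[0]), int(r[1])) for r in rows[:-1]):
        model.add(tf.keras.layers.Conv2D(num_features, feature_size,
                                         padding='same', activation='relu',
                                         kernel_regularizer=regularizer))
        model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(int(rows[-1][0]), activation='relu',
                                    kernel_regularizer=regularizer))
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model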

You should train with 5-fold cross-validation. Your program should output a final average cost value for the training data (averaged over the five folds) and a final average cost for the validation data (averaged over the five folds). You can use the last trained model as the one you write to disk. When you do testing with your trained model, use a different data set than the one you used to train and validate with.
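A minimal sketch of the 5-fold loop, assuming the images and binary labels are already loaded into NumPy arrays and reusing the hypothetical build_model helper sketched above (epsilon-based early stopping is omitted for brevity):

import numpy as np

# Hypothetical sketch: 5-fold cross-validation with averaged final costs.
def cross_validate(X, y, network_description, max_updates, regularizer=None, folds=5):
    fold_indices = np.array_split(np.random.permutation(len(X)), folds)
    train_costs, val_costs, model = [], [], None
    for k in range(folds):
        val_idx = fold_indices[k]
        train_idx = np.concatenate([fold_indices[j] for j in range(folds) if j != k])
        model = build_model(network_description, X.shape[1:], regularizer)
        model.fit(X[train_idx], y[train_idx], epochs=max_updates, verbose=0)
        train_costs.append(model.evaluate(X[train_idx], y[train_idx], verbose=0)[0])
        val_costs.append(model.evaluate(X[val_idx], y[val_idx], verbose=0)[0])
    print("average training cost:", np.mean(train_costs))
    print("average validation cost:", np.mean(val_costs))
    return model  # the last trained model can be saved to model_file_name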

Once you have written the above program, I would like you to design and conduct experiments which investigate the following:

  1. Fix a network approximately like LeNet-5. If you plot max_updates versus training cost and validation cost, does the validation cost ever reach a minimum? How does a lower validation cost affect accuracy on test data?
  2. Start with a network approximately like LeNet-5. How does the choice of regularization affect the training convergence rate of your model?
  3. Start with a network approximately like LeNet-5. What is the effect of increasing or decreasing the number of layers on training accuracy? What about varying the feature size or the number of features per layer?
  4. For the best network you found in the above experiments, how do its accuracy, training time, etc. compare to the SVMs from Hw2?

The write-ups of your experiments should follow the guidelines from the Oct 11 lecture. Make sure to have at least one graph per experiment generated using matplotlib. Include your experiment write-ups in the file Hw4.pdf, which should be part of your submission.
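For example, a cost-versus-updates graph like the one experiment 1 asks about could be produced with a sketch like the following (the cost lists are placeholders you would collect during training):

import matplotlib.pyplot as plt

# Hypothetical sketch: plot training and validation cost against the number
# of updates (epochs) and save the figure for inclusion in Hw4.pdf.
def plot_costs(train_costs, val_costs, out_file="experiment1.png"):
    epochs = range(1, len(train_costs) + 1)
    plt.plot(epochs, train_costs, label="training cost")
    plt.plot(epochs, val_costs, label="validation cost")
    plt.xlabel("updates (epochs)")
    plt.ylabel("average cross-entropy cost")
    plt.legend()
    plt.savefig(out_file)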

Point Breakdown

Program works according to the spec above    2pts
Experiments 1-4 (2pts each)                  8pts
Total                                        10pts