Deliverable 4

Implementing the Demo2Vec model

Goal: To recreate the Demo2Vec model on the OPRA dataset, as discussed in Deliverable 3.

Aim: To gain insight into how to perform affordance learning on videos and to create a baseline model for the project.

Conclusion: A partial Demo2Vec model was created. I implemented a ConvLSTM to handle time-series data and used a human activity recognition dataset to build the neural network.

I experimented with and built a neural network that performs human activity recognition on the UCF101 Action Recognition dataset.

Dataset: From the UCF101 dataset, I chose five human activity classes:
Biking
Bench Press
Archery
Baby Crawling
Basketball
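
A minimal sketch of how these five classes could be selected and mapped to integer labels, assuming the standard UCF101 file naming (v_<ClassName>_g<group>_c<clip>.avi). The helper names label_for and select_videos are hypothetical and are not taken from the project code.

```python
import re

# The five chosen classes, spelled as they appear in UCF101 file names.
CLASSES = ["Archery", "BabyCrawling", "Basketball", "BenchPress", "Biking"]
CLASS_TO_INDEX = {name: i for i, name in enumerate(CLASSES)}

# UCF101 clips are named like "v_Biking_g01_c01.avi".
_NAME_RE = re.compile(r"^v_([A-Za-z]+)_g\d+_c\d+\.avi$")


def label_for(filename):
    """Return the integer label for a clip, or None if it is not in the five classes."""
    match = _NAME_RE.match(filename)
    if match is None:
        return None
    return CLASS_TO_INDEX.get(match.group(1))


def select_videos(filenames):
    """Keep only clips from the chosen classes, paired with their labels."""
    pairs = []
    for name in filenames:
        label = label_for(name)
        if label is not None:
            pairs.append((name, label))
    return pairs
```

Calling select_videos on a directory listing would then give the (file, label) pairs to feed into frame extraction.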

Model Generation:
1. I used the VGG-16 model, with pretrained weights from the ImageNet dataset, as the base model.
2. For videos, I implemented a ConvLSTM network to handle time-series data and extract spatiotemporal information from the input.
3. All videos were subsampled at a rate of 5 frames per second, and all generated frames were resized to 112×112 pixels.
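
To make the tensor shapes in the steps above concrete, here is a small worked sketch. It assumes RGB frames and that VGG-16 is used without its dense head, so only the five max-pooling stages shrink the image. The function names and the 6-second clip in the usage note are illustrative assumptions, not values from the project.

```python
TARGET_FPS = 5            # subsampling rate stated above
FRAME_SIZE = (112, 112)   # resized frame height and width stated above
CHANNELS = 3              # assumption: RGB input, as ImageNet-pretrained VGG-16 expects


def clip_shape(duration_s, target_fps=TARGET_FPS):
    """Shape (frames, height, width, channels) of one subsampled clip."""
    n_frames = int(duration_s * target_fps)
    return (n_frames, FRAME_SIZE[0], FRAME_SIZE[1], CHANNELS)


def vgg16_feature_shape(height, width):
    """Feature-map shape after VGG-16's five 2x2 max-pooling layers (no dense head)."""
    for _ in range(5):
        height, width = height // 2, width // 2
    return (height, width, 512)  # the last VGG-16 conv block has 512 channels
```

For example, a hypothetical 6-second clip gives clip_shape(6.0) == (30, 112, 112, 3), and each 112×112 frame maps to a 3×3×512 feature map, which is what the ConvLSTM would see at every time step under these assumptions.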

The rest of the implementation will be carried out next semester.

Code to generate Frames
[Image: Frame]
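
The original frame-generation code is not reproduced here. The following is a hedged sketch of one way to do it with OpenCV, assuming the 5 frames-per-second rate and 112×112 size described earlier. The function names sampling_step and extract_frames are hypothetical.

```python
def sampling_step(native_fps, target_fps=5):
    """Number of source frames to advance per kept frame (at least 1)."""
    if native_fps <= 0:
        return 1  # some containers report 0 fps; fall back to keeping every frame
    return max(1, round(native_fps / target_fps))


def extract_frames(video_path, target_fps=5, size=(112, 112)):
    """Read a video, keep roughly target_fps frames per second, and resize each one."""
    import cv2  # imported here so sampling_step works without OpenCV installed

    capture = cv2.VideoCapture(video_path)
    if not capture.isOpened():
        raise IOError(f"Could not open video: {video_path}")

    step = sampling_step(capture.get(cv2.CAP_PROP_FPS), target_fps)
    frames = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break  # end of the video
        if index % step == 0:
            frame = cv2.resize(frame, size)  # size is (width, height)
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # OpenCV reads BGR
        index += 1
    capture.release()
    return frames
```

UCF101 clips are recorded at 25 frames per second, so this sketch would keep every fifth frame.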

Base Model
[Image: Base Model]
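
The original model code is not reproduced here. As a rough sketch of how a VGG-16 base could be combined with a ConvLSTM in Keras: the sequence length, the 64 ConvLSTM filters, the frozen base, and the pooling and softmax head are all assumptions made for illustration, not details confirmed by this report.

```python
NUM_CLASSES = 5              # five UCF101 activities
FRAME_SHAPE = (112, 112, 3)  # resized RGB frames
SEQ_LEN = 20                 # assumption: frames per clip fed to the network
INPUT_SHAPE = (SEQ_LEN,) + FRAME_SHAPE


def build_model(seq_len=SEQ_LEN, frame_shape=FRAME_SHAPE, num_classes=NUM_CLASSES):
    """Per-frame VGG-16 features followed by a ConvLSTM and a softmax classifier."""
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16

    # Convolutional part of VGG-16 with ImageNet weights, without the dense head.
    base = VGG16(weights="imagenet", include_top=False, input_shape=frame_shape)
    base.trainable = False  # assumption: the pretrained base is kept frozen

    clip = layers.Input(shape=(seq_len,) + tuple(frame_shape))
    # Apply the same VGG-16 base to every frame in the clip.
    features = layers.TimeDistributed(base)(clip)
    # ConvLSTM keeps the spatial layout while modelling change across frames.
    x = layers.ConvLSTM2D(filters=64, kernel_size=(3, 3), padding="same",
                          return_sequences=False)(features)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(clip, outputs)
```

Calling build_model().summary() would print the layer-by-layer summary. Building the model downloads the ImageNet weights on first use.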

Model Summary
[Image: Summary]
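
The original summary output is not reproduced here. To show how the parameter counts in such a summary arise, the sketch below computes them by hand. The VGG-16 layer widths are the standard ones. The ConvLSTM settings (64 filters, 3×3 kernel, 512 input channels) and the 5-way dense head are assumptions for illustration, so the ConvLSTM and dense figures are not the project's actual numbers.

```python
# Output channels of the 13 convolutional layers in standard VGG-16.
VGG16_CONV_CHANNELS = [64, 64, 128, 128, 256, 256, 256,
                       512, 512, 512, 512, 512, 512]


def conv2d_params(in_ch, out_ch, kernel=3):
    """Weights plus biases of one 2D convolution."""
    return kernel * kernel * in_ch * out_ch + out_ch


def vgg16_base_params():
    """Parameters in VGG-16's convolutional base (no dense head), RGB input."""
    total, in_ch = 0, 3
    for out_ch in VGG16_CONV_CHANNELS:
        total += conv2d_params(in_ch, out_ch)
        in_ch = out_ch
    return total


def conv_lstm2d_params(in_channels, filters, kernel=3):
    """Four gates, each with an input kernel, a recurrent kernel, and a bias."""
    return 4 * filters * (kernel * kernel * (in_channels + filters) + 1)


def dense_params(in_features, units):
    """Weights plus biases of a fully connected layer."""
    return in_features * units + units
```

Under these assumptions, the base contributes 14,714,688 parameters, the ConvLSTM 1,327,360, and the 5-way dense layer 325.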