CS297 Proposal

Synthesizing Video From Given Frames

Lei Zhang (lei.zhang01@sjsu.edu)

Advisor: Dr. Chris Pollett

Description:

Since Ian Goodfellow introduced the Generative Adversarial Network (GAN) in 2014, GANs have become a hot research area. Researchers have published a number of papers that use GANs to create fake human faces, turn an image of a horse into an image of a zebra, and generate animations from screenplays. In this project, we will develop an app that takes a single photo as input and generates a short output video of a particular kind. For example, if the input is a head, one kind might be the head turning. These kinds are learned from a training set of videos. With this technology, AI can be used to generate longer fake video sequences. This work extends the earlier work of [1] and [3], where a start and an end keyframe were used to generate videos. Here, rather than the end frame, we supply a category of video we want.
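The project builds on the adversarial training idea of [7]: a generator learns to produce samples while a discriminator learns to tell them apart from real data. Below is a minimal sketch of that training loop, assuming PyTorch and torchvision with MNIST standing in for the project's actual training data; the model sizes and hyperparameters are illustrative only and are not the project's final design (see Deliverable 1).

# Minimal GAN sketch (assumes PyTorch and torchvision; MNIST is a stand-in dataset).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

latent_dim = 64

# Generator: maps a random latent vector to a flattened 28x28 image.
G = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 28 * 28), nn.Tanh(),
)

# Discriminator: scores how "real" a flattened image looks (raw logit output).
D = nn.Sequential(
    nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

data = DataLoader(
    datasets.MNIST("data", train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.5,), (0.5,)),  # match Tanh output range
                   ])),
    batch_size=128, shuffle=True,
)

for epoch in range(5):
    for real, _ in data:
        real = real.view(real.size(0), -1)
        n = real.size(0)
        ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)

        # Discriminator step: push real images toward label 1, generated toward 0.
        fake = G(torch.randn(n, latent_dim)).detach()
        loss_D = bce(D(real), ones) + bce(D(fake), zeros)
        opt_D.zero_grad(); loss_D.backward(); opt_D.step()

        # Generator step: try to make the discriminator label generated images as real.
        fake = G(torch.randn(n, latent_dim))
        loss_G = bce(D(fake), ones)
        opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    print(f"epoch {epoch}: loss_D={loss_D.item():.3f} loss_G={loss_G.item():.3f}")

The video GANs in the later deliverables replace the simple fully connected networks here with 3D convolutional generators and discriminators in the style of [6] and [8], but the adversarial loop stays the same.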
Schedule:

Deliverables:

The full project will be done when CS298 is completed. The following will be done by the end of CS297:

1. Implement a simple GAN in Python and use it to generate Chinese character digits.
2. Implement a video GAN to create fake videos.
3. Explore creating videos with pix2pix.
4. Create a GAN-based framework that generates fake videos from a single input picture.
5. Final CS297 report.

References:

[1] Li, Yunpeng, Dominik Roblek, and Marco Tagliasacchi. "From Here to There: Video Inbetweening Using Direct 3D Convolutions." arXiv preprint arXiv:1905.10240 (2019).

[2] Zakharov, Egor, Aliaksandra Shysheya, Egor Burkov, and Victor Lempitsky. "Few-Shot Adversarial Learning of Realistic Neural Talking Head Models." 2019.

[3] Clark, Aidan, Jeff Donahue, and Karen Simonyan. "Efficient Video Generation on Complex Datasets." arXiv preprint arXiv:1907.06571 (2019).

[4] Wang, Ting-Chun, et al. "High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.

[5] Ji, Shuiwang, et al. "3D Convolutional Neural Networks for Human Action Recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence 35.1 (2012): 221-231.

[6] Vondrick, Carl, Hamed Pirsiavash, and Antonio Torralba. "Generating Videos with Scene Dynamics." Advances in Neural Information Processing Systems. 2016.

[7] Goodfellow, Ian, et al. "Generative Adversarial Nets." Advances in Neural Information Processing Systems. 2014.

[8] Saito, Masaki, Eiichi Matsumoto, and Shunta Saito. "Temporal Generative Adversarial Nets with Singular Value Clipping." IEEE International Conference on Computer Vision (ICCV). 2017.