
Facial Expression Video Synthesis from the StyleGAN Latent Space




Lei Zhang
Chris Pollett (Presenting)
May, 2021

Introduction

A Girl's Face

Prior Image Generation Systems 1

As with many video generation systems, our system makes use of prior work on image generation:

Prior Image Generation Systems 2

Prior Video Generation Systems

TGAN and MoCoGAN Examples

TGAN 128x128 Frame Golf Scene
MoCoGAN 96x96 Face and 64x64 Taichi Scenes

Our System Architecture

Embedding Faces in StyleGAN

Image Embedding Neural Net
Example Embedded Faces

Flickr-Faces-HQ (FFHQ)

Images from Flickr-Faces-HQ dataset

IMPA-FACE3D

Images from IMPA-FACE3D dataset

Training Emotion Directions
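
One simple way to picture an emotion direction is as a vector in the latent space along which a face's expression changes. The sketch below is illustrative only: it estimates a direction as the difference between the mean latent code of expressive faces and the mean latent code of neutral faces, which is a stand-in technique and not necessarily how the directions in this system are trained. The names `emotion_direction` and `apply_direction`, the 512-dimensional latent size, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def emotion_direction(neutral_latents, emotion_latents):
    """Unit vector pointing from the neutral mean to the emotion mean."""
    direction = np.mean(emotion_latents, axis=0) - np.mean(neutral_latents, axis=0)
    return direction / np.linalg.norm(direction)

def apply_direction(latent, direction, strength):
    """Move a latent code along a direction by the given strength."""
    return latent + strength * direction

# Synthetic stand-in data: each "emotion" latent is its neutral latent
# shifted along a hidden direction, plus a little noise.
rng = np.random.default_rng(0)
true_dir = rng.standard_normal(512)
true_dir /= np.linalg.norm(true_dir)
neutral = rng.standard_normal((50, 512))
emotion = neutral + 3.0 * true_dir + 0.01 * rng.standard_normal((50, 512))

d = emotion_direction(neutral, emotion)
edited = apply_direction(neutral[0], d, strength=2.0)
```

With real data, the latent codes would come from embedding labeled face images, and `strength` would control how pronounced the expression appears in the generated frame.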

Predict Emotion Sequences

Keyframes and Interpolation

Keyframe Interpolation of a Girl's Face
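
Keyframe interpolation can be sketched as blending between consecutive latent codes, so that each in-between frame is generated from a latent lying on the segment between two keyframes. The example below is a minimal sketch that assumes linear interpolation and latent codes of shape (18, 512); the function name `interpolate_keyframes`, the `steps_between` parameter, and the random keyframes are hypothetical and chosen only for illustration.

```python
import numpy as np

def interpolate_keyframes(keyframes, steps_between):
    """Linearly interpolate between consecutive latent keyframes.

    keyframes: array of shape (K, ...) holding K latent codes.
    steps_between: number of frames produced per keyframe interval.
    Returns an array of shape ((K - 1) * steps_between + 1, ...).
    """
    keyframes = np.asarray(keyframes, dtype=np.float64)
    frames = []
    for start, end in zip(keyframes[:-1], keyframes[1:]):
        # t covers [0, 1), so the end keyframe is not emitted twice.
        for t in np.linspace(0.0, 1.0, steps_between, endpoint=False):
            frames.append((1.0 - t) * start + t * end)
    frames.append(keyframes[-1])  # close the sequence on the last keyframe
    return np.stack(frames)

rng = np.random.default_rng(0)
keys = rng.standard_normal((3, 18, 512))  # three stand-in keyframes
latents = interpolate_keyframes(keys, steps_between=8)
print(latents.shape)  # (17, 18, 512)
```

Each row of `latents` would then be passed through the generator to render one video frame.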

Experiments

Model       ACD
TGAN        0.305
MoCoGAN     0.201
Our Model   0.167
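
Assuming ACD in the scores above is the Average Content Distance from the MoCoGAN paper [4], it can be sketched as the mean pairwise L2 distance between per-frame feature vectors of a generated video, where lower values indicate that the subject's identity stays more consistent across frames. The function name `average_content_distance` is hypothetical, and the feature extractor that would produce the vectors (for example, a face-recognition network) is outside the scope of this sketch.

```python
import numpy as np

def average_content_distance(features):
    """Mean pairwise L2 distance between per-frame feature vectors.

    features: array of shape (T, D), one D-dimensional vector per frame,
    with T >= 2.
    """
    features = np.asarray(features, dtype=np.float64)
    # Index pairs (i, j) with i < j cover every unordered pair of frames once.
    i, j = np.triu_indices(features.shape[0], k=1)
    return float(np.mean(np.linalg.norm(features[i] - features[j], axis=1)))

# Toy example with three frames of 2-D features.
toy = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
acd = average_content_distance(toy)
```

In an evaluation, this value would be averaged over many generated videos to give a single score per model.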

Conclusion

We conclude this talk with some observations based on our experiments with our video generation model:

Slides and Git Repository

References

[1] M. Saito, E. Matsumoto, and S. Saito, "Temporal generative adversarial nets with singular value clipping," Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2830--2839.

[2] R. Abdal, Y. Qin, and P. Wonka, "Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?," Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019.

[3] T. Karras, et al., "Progressive growing of GANs for improved quality, stability, and variation," International Conference on Learning Representations (ICLR), 2018.

[4] S. Tulyakov, et al., "MoCoGAN: Decomposing Motion and Content for Video Generation," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 1526--1535, doi: 10.1109/CVPR.2018.00165.

[5] T. Karras, S. Laine, and T. Aila, "A style-based generator architecture for generative adversarial networks," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4396--4405, doi: 10.1109/CVPR.2019.00453.

[6] N. Aifanti, C. Papachristou, and A. Delopoulos, "The MUG facial expression database," 11th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS), IEEE, 2010.

[7] T. Karras, et al., "Analyzing and improving the image quality of StyleGAN," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 8107--8116.

[8] S. Ji, et al., "3D convolutional neural networks for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):221--231, 2013.