Using https://github.com/harskish/ganspace to find latent directions in a StyleGAN2 model trained on the pizza10 dataset.
GANSpace: Discovering Interpretable GAN Controls
Erik Härkönen [1,2], Aaron Hertzmann [2], Jaakko Lehtinen [1,3], Sylvain Paris [2]
[1] Aalto University, [2] Adobe Research, [3] NVIDIA
https://arxiv.org/abs/2004.02546

Abstract: This paper describes a simple technique to analyze Generative Adversarial Networks (GANs) and create interpretable controls for image synthesis, such as change of viewpoint, aging, lighting, and time of day. We identify important latent directions based on Principal Components Analysis (PCA) applied in activation space. Then, we show that interpretable edits can be defined based on layer-wise application of these edit directions. Moreover, we show that BigGAN can be controlled with layer-wise inputs in a StyleGAN-like manner. A user may identify a large number of interpretable controls with these mechanisms. We demonstrate results on GANs from various datasets.
Video: https://youtu.be/jdTICDa_eAI
Figure 1: The top four principal components obtained from PCA on the latent vectors control corresponding features: a) first component: size of the pizza; b) second component: shape of the pizza, with slices growing thinner toward one end of the range; c) third component: amount of cheese; d) fourth component: amount of tomato sauce in Margherita-style pizzas.
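The pipeline behind Figure 1 is short enough to sketch: sample many latent vectors, push them through the mapping network, run PCA on the resulting w codes, and use the principal components as edit directions, optionally restricted to a range of style layers. The sketch below is not the ganspace API; for self-containedness it borrows the Generator class from rosinality/stylegan2-pytorch (used in the next section as well), and the checkpoint path, sample count, and layer range are assumptions.

```python
import torch
from sklearn.decomposition import PCA
from model import Generator  # from rosinality/stylegan2-pytorch

# Load a pretrained 256px StyleGAN2 generator (checkpoint path is a placeholder).
g = Generator(256, 512, 8)
g.load_state_dict(torch.load("pizza10.pt", map_location="cpu")["g_ema"])
g.eval()

# GANSpace step 1: PCA on a large sample of intermediate latents w = M(z).
with torch.no_grad():
    z = torch.randn(10_000, 512)
    w = g.get_latent(z)

pca = PCA(n_components=80).fit(w.numpy())
components = torch.from_numpy(pca.components_).float()  # row i = direction v_i

# GANSpace step 2: layer-wise edit, adding sigma * v_i to the w codes of a
# chosen range of style layers only (a 256px generator has 14 of them).
def edit(w_in, direction, sigma, start=0, end=14, n_layers=14):
    w_layers = w_in.unsqueeze(1).repeat(1, n_layers, 1)
    w_layers[:, start:end] += sigma * direction
    return w_layers

# E.g. push the top component ("size of the pizza" in Figure 1a) by 2 sigma.
with torch.no_grad():
    img, _ = g([edit(w[:1], components[0], 2.0)], input_is_latent=True)
```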
For setup and usage instructions, open the notebook in Google Colab:
Using https://github.com/rosinality/stylegan2-pytorch to discover meaningful latent semantic directions, in an unsupervised manner, from a StyleGAN2 model trained on the pizza10 dataset.
Closed-Form Factorization of Latent Semantics in GANs
Yujun Shen, Bolei Zhou
The Chinese University of Hong Kong
https://arxiv.org/abs/2007.06600

Abstract: A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images. In order to identify such latent dimensions for image editing, previous methods typically annotate a collection of synthesized samples and train linear classifiers in the latent space. However, they require a clear definition of the target attribute as well as the corresponding manual annotations, limiting their applications in practice. In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner. In particular, we take a closer look into the generation mechanism of GANs and further propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights. With a lightning-fast implementation, our approach is capable of not only finding semantically meaningful dimensions comparably to the state-of-the-art supervised methods, but also resulting in far more versatile concepts across multiple GAN models trained on a wide range of datasets.
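The factorization itself reduces to a few lines of linear algebra: collect the style-modulation weight matrices of the pretrained generator, stack them into one matrix W, and take the right singular vectors of W (equivalently, the eigenvectors of WᵀW) as the semantic directions. The sketch below mirrors what closed_form_factorization.py in rosinality/stylegan2-pytorch does; the checkpoint key filter is an assumption about its layout.

```python
import torch

# SeFa-style closed-form factorization of a pretrained generator's weights.
ckpt = torch.load("pizza10.pt", map_location="cpu")

# Gather the style-modulation weights of the synthesis layers (the key
# filter, including the to_rgb exclusion, is an assumption about the layout).
weights = [v for k, v in ckpt["g_ema"].items()
           if "modulation.weight" in k and "to_rgb" not in k]
W = torch.cat(weights, dim=0)  # (sum of per-layer output dims, 512)

# Right singular vectors of W = eigenvectors of W^T W, largest first.
U, S, Vh = torch.linalg.svd(W, full_matrices=False)
eigvecs = Vh  # row i = the i-th semantic direction in latent space
```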
Figure 2: Moving the latent code along the 8th eigenvector controls the amount of toppings on the pizza. In each vertical set of images, the middle one is the original output, while the top and bottom are generated by moving the latent code by degrees +5 and -5, respectively.
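A sketch of the edit in Figure 2, again assuming the rosinality Generator API (apply_factor.py in that repo does the same thing, with truncation added): map a random z to w, then shift it along the 8th eigenvector from the factorization above by degrees -5, 0, and +5.

```python
import torch
from model import Generator  # from rosinality/stylegan2-pytorch

g = Generator(256, 512, 8)
g.load_state_dict(torch.load("pizza10.pt", map_location="cpu")["g_ema"])
g.eval()

k, degree = 7, 5.0               # 8th eigenvector (0-indexed), Figure 2's degree
direction = degree * eigvecs[k]  # eigvecs from the factorization above

with torch.no_grad():
    w = g.get_latent(torch.randn(1, 512))
    # bottom, middle, and top rows of a Figure 2 column, respectively
    for shift in (-direction, torch.zeros_like(direction), direction):
        img, _ = g([w + shift], input_is_latent=True)
```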
For setup and usage instructions, open the notebook in Google Colab: