PyTorch Implementation of Latent Space Anchoring (TPAMI 2023)

siyuhuang/Latent-Space-Anchoring

Latent-Space-Anchoring

PyTorch implementation of Domain-Scalable Unpaired Image Translation via Latent Space Anchoring

Siyu Huang* (Harvard), Jie An* (Rochester), Donglai Wei (BC), Zudi Lin (Amazon Alexa), Jiebo Luo (Rochester), Hanspeter Pfister (Harvard)
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

[Paper]

Given an unpaired image-to-image translation (UNIT) model trained on certain domains, it is challenging to incorporate new domains. This work introduces a domain-scalable UNIT method, termed latent space anchoring, which anchors images of different domains to the same latent space of frozen GANs by learning lightweight encoder and regressor models to reconstruct single-domain images. At inference, the learned encoders and decoders of different domains can be arbitrarily combined to translate images between any two domains without fine-tuning.
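The composition idea can be illustrated with a toy sketch. This is not the repository's code: `DomainModels`, `frozen_generator`, `make_toy_domain`, and `translate` are hypothetical names, images are stand-in lists of floats, and the sketch assumes that each domain's regressor decodes from the output of the shared frozen generator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]


def frozen_generator(latent: Vector) -> Vector:
    """Stand-in for the frozen GAN: maps a latent code to shared features."""
    return [2.0 * v for v in latent]


@dataclass(frozen=True)
class DomainModels:
    """Per-domain lightweight models (hypothetical container)."""
    encoder: Callable[[Vector], Vector]    # domain image -> shared latent code
    regressor: Callable[[Vector], Vector]  # generator features -> domain image


def make_toy_domain(offset: float) -> DomainModels:
    """Toy domain whose 'images' are generator features shifted by `offset`."""
    return DomainModels(
        encoder=lambda image: [(p - offset) / 2.0 for p in image],
        regressor=lambda feats: [f + offset for f in feats],
    )


def translate(models: Dict[str, DomainModels], source: str, target: str,
              image: Vector) -> Vector:
    """Encode with the source domain, decode with the target domain."""
    latent = models[source].encoder(image)
    features = frozen_generator(latent)
    return models[target].regressor(features)


# Each domain is fitted on its own; no pair of domains is trained jointly.
domains = {"mask": make_toy_domain(10.0), "sketch": make_toy_domain(-5.0)}
mask_image = [11.0, 12.0, 13.0]
print(translate(domains, "mask", "sketch", mask_image))  # [-4.0, -3.0, -2.0]
```

Adding a new domain in this toy only adds one entry to `domains`; the existing entries and the generator stay untouched, which mirrors the domain-scalability claim above.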

Installation

We recommend installing with Anaconda. All dependencies are listed in env.yaml.

conda env create -f env.yaml
conda activate lsa

Pretrained Models

Please download the pre-trained models from the following links.

| Name | Enc/Dec Domain | Generator Backbone |
| --- | --- | --- |
| seg2ffhq.pt | facial segmentation mask (CelebAMask-HQ) | StyleGAN2 trained on FFHQ face |
| sketch2ffhq.pt | facial sketch (CUFSF) | StyleGAN2 trained on FFHQ face |
| cat2dog.pt | cat face (AFHQ-cat) | StyleGAN2 trained on AFHQ-dog |

In addition, we provide the auxiliary pre-trained models used for training our models.

| Name | Description |
| --- | --- |
| stylegan2-ffhq-config-f.pt | StyleGAN2 generator on FFHQ face |
| psp_ffhq_encode.pt | The encoder for StyleGAN2-FFHQ inversion |
| model_ir_se50.pth | IR-SE50 model used for the encoder's weight initialization |

Testing

CelebAMask-to-FFHQ


Unpaired image translation from CelebAMask-HQ mask to FFHQ image. Figures: input, regressor output, generator output.

Download the CelebAMask-HQ dataset and put it in ./data/. Download the pre-trained model seg2ffhq.pt and put it in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CelebAMask-HQ
│   │   ├── face_parsing
│   │   │   ├── Data_preprocessing
│   │   │   │   ├── train_img
│   │   │   │   ├── train_label
│   │   │   │   ├── test_img
│   │   │   │   ├── test_label
├── pretrained_models
│   ├── seg2ffhq.pt
├── commands
│   ├── test_seg2ffhq.sh

Run:

bash commands/test_seg2ffhq.sh
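Before launching the script, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: `missing_paths` and `SEG2FFHQ_REQUIRED` are hypothetical names, and the path list is simply copied from the folder structure shown above. It assumes it is run from the repository root.

```python
from pathlib import Path
from typing import Iterable, List

# Paths taken from the folder structure listed above (relative to the repo root).
SEG2FFHQ_REQUIRED = [
    "data/CelebAMask-HQ/face_parsing/Data_preprocessing/train_img",
    "data/CelebAMask-HQ/face_parsing/Data_preprocessing/train_label",
    "data/CelebAMask-HQ/face_parsing/Data_preprocessing/test_img",
    "data/CelebAMask-HQ/face_parsing/Data_preprocessing/test_label",
    "pretrained_models/seg2ffhq.pt",
    "commands/test_seg2ffhq.sh",
]


def missing_paths(repo_root: str, required: Iterable[str]) -> List[str]:
    """Return the required relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).exists()]


missing = missing_paths(".", SEG2FFHQ_REQUIRED)
if missing:
    print("Missing paths:")
    for rel in missing:
        print("  " + rel)
else:
    print("Layout looks complete.")
```

The same helper can be reused for the other experiments by passing the paths from their respective folder structures.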

Sketch-to-FFHQ


Unpaired image translation from CUFSF facial sketch to FFHQ image. Figures: input, regressor output, generator output.

Download the CUFSF dataset and put it in ./data/. Manually split the dataset into training and test sets (we use the first 1k images as the training set and the rest as the test set). Download the pre-trained model sketch2ffhq.pt and put it in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CUFSF
│   │   ├── train
│   │   ├── test
├── pretrained_models
│   ├── sketch2ffhq.pt
├── commands
│   ├── test_sketch2ffhq.sh

Run:

bash commands/test_sketch2ffhq.sh
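The manual CUFSF split described above could be scripted along the following lines. This is a sketch under stated assumptions, not the authors' procedure: it assumes the downloaded sketches sit in one flat directory, and it interprets "first 1k images" as the first 1000 files in lexicographic filename order. `split_first_n`, `copy_split`, and the example source path are hypothetical.

```python
import shutil
from pathlib import Path
from typing import List, Tuple


def split_first_n(files: List[Path], n_train: int = 1000) -> Tuple[List[Path], List[Path]]:
    """Sort files by name, then return (first n_train files, remaining files)."""
    ordered = sorted(files, key=lambda p: p.name)
    return ordered[:n_train], ordered[n_train:]


def copy_split(source_dir: str, target_dir: str, n_train: int = 1000) -> Tuple[int, int]:
    """Copy files from source_dir into target_dir/train and target_dir/test.

    Returns the number of files copied into (train, test).
    """
    images = [p for p in Path(source_dir).iterdir() if p.is_file()]
    train, test = split_first_n(images, n_train)
    for subset_name, subset in (("train", train), ("test", test)):
        out_dir = Path(target_dir) / subset_name
        out_dir.mkdir(parents=True, exist_ok=True)
        for path in subset:
            shutil.copy2(path, out_dir / path.name)
    return len(train), len(test)


# Example call (the source path is a placeholder for wherever CUFSF was unpacked):
# copy_split("downloads/CUFSF_sketches", "data/CUFSF")
```

Lexicographic order places `10.jpg` before `2.jpg`, so if the filenames are not zero-padded the sort key may need to be adapted to match the intended ordering.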

FFHQ-to-CelebAMask


Unpaired image translation from FFHQ image to CelebAMask-HQ mask. Figures: input, regressor output, generator output.

Download the FFHQ dataset and put it in ./data/. Download the pre-trained models seg2ffhq.pt and psp_ffhq_encode.pt and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── ffhq
│   │   ├── images1024x1024
├── pretrained_models
│   ├── seg2ffhq.pt
│   ├── psp_ffhq_encode.pt
├── commands
│   ├── test_ffhq2seg.sh

Run:

bash commands/test_ffhq2seg.sh

FFHQ-to-Sketch


Unpaired image translation from FFHQ image to CUFSF facial sketch. Figures: input, regressor output, generator output.

Download the FFHQ dataset and put it in ./data/. Download the pre-trained models sketch2ffhq.pt and psp_ffhq_encode.pt and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── ffhq
│   │   ├── images1024x1024
├── pretrained_models
│   ├── sketch2ffhq.pt
│   ├── psp_ffhq_encode.pt
├── commands
│   ├── test_ffhq2sketch.sh

Run:

bash commands/test_ffhq2sketch.sh

CelebAMask-to-Sketch


Unpaired image translation from CelebAMask-HQ mask to CUFSF facial sketch. Figures: input, regressor output, generator output.

Download the CelebAMask-HQ dataset and put it in ./data/. Download the pre-trained models seg2ffhq.pt and sketch2ffhq.pt and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CelebAMask-HQ
│   │   ├── face_parsing
│   │   │   ├── Data_preprocessing
│   │   │   │   ├── train_img
│   │   │   │   ├── train_label
│   │   │   │   ├── test_img
│   │   │   │   ├── test_label
├── pretrained_models
│   ├── seg2ffhq.pt
│   ├── sketch2ffhq.pt
├── commands
│   ├── test_seg2sketch.sh

Run:

bash commands/test_seg2sketch.sh

Cat-to-Dog


Unpaired image translation from AFHQ-cat to AFHQ-dog. Figures: input, regressor output, generator output.

Download the AFHQ dataset and put it in ./data/. Download the pre-trained model cat2dog.pt and put it in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── AFHQ
│   │   ├── afhq
│   │   │   ├── train
│   │   │   │   ├── cat
│   │   │   │   ├── dog
│   │   │   │   ├── wild
│   │   │   ├── test
│   │   │   │   ├── cat
│   │   │   │   ├── dog
│   │   │   │   ├── wild
├── pretrained_models
│   ├── cat2dog.pt
├── commands
│   ├── test_cat2dog.sh

Run:

bash commands/test_cat2dog.sh

Training

Training requires a single GPU with at least 16 GB of memory. A smaller batch size may make training feasible with less GPU memory, although we have not tested it.
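To check the memory requirement above before starting a run, one option is to query `nvidia-smi`. The sketch below is an illustration rather than part of the repository; the helper names are hypothetical. The default threshold of 15000 MiB is an assumption, chosen because cards marketed as 16 GB often report somewhat less than 16384 MiB of total memory.

```python
import subprocess
from typing import List


def parse_total_memory_mib(smi_output: str) -> List[int]:
    """Parse one integer (MiB) per non-empty line of nvidia-smi CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_total_memory_mib() -> List[int]:
    """Ask nvidia-smi for the total memory of each visible GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_total_memory_mib(result.stdout)


def gpus_meeting_requirement(memory_mib: List[int], min_mib: int = 15000) -> List[int]:
    """Return indices of GPUs whose total memory is at least min_mib (assumed threshold)."""
    return [index for index, mib in enumerate(memory_mib) if mib >= min_mib]


# Example on a machine with an NVIDIA driver installed:
# print(gpus_meeting_requirement(query_total_memory_mib()))
```

Total memory is only an upper bound; other processes already using the GPU reduce what is available to training.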

CelebAMask-to-FFHQ

Train the encoder and regressor for the CelebAMask-HQ mask domain, using StyleGAN2-FFHQ as the generator backbone.

Download the CelebAMask-HQ and FFHQ datasets and put them in ./data/. Download the pre-trained models stylegan2-ffhq-config-f.pt and model_ir_se50.pth and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CelebAMask-HQ
│   │   ├── face_parsing
│   │   │   ├── Data_preprocessing
│   │   │   │   ├── train_img
│   │   │   │   ├── train_label
│   │   │   │   ├── test_img
│   │   │   │   ├── test_label
│   ├── ffhq
│   │   ├── images1024x1024
├── pretrained_models
│   ├── stylegan2-ffhq-config-f.pt
│   ├── model_ir_se50.pth
├── commands
│   ├── train_seg2ffhq.sh

Run:

bash commands/train_seg2ffhq.sh

The training results and model checkpoints will be saved in ./logs/seg2ffhq.

Sketch-to-FFHQ

Train the encoder and regressor for the CUFSF facial sketch domain, using StyleGAN2-FFHQ as the generator backbone.

Download the CUFSF and FFHQ datasets and put them in ./data/. Download the pre-trained models stylegan2-ffhq-config-f.pt and model_ir_se50.pth and put them in ./pretrained_models. The folder structure is:

Latent-Space-Anchoring
├── data
│   ├── CUFSF
│   │   ├── train
│   │   ├── test
│   ├── ffhq
│   │   ├── images1024x1024
├── pretrained_models
│   ├── stylegan2-ffhq-config-f.pt
│   ├── model_ir_se50.pth
├── commands
│   ├── train_sketch2ffhq.sh

Run:

bash commands/train_sketch2ffhq.sh

The training results and model checkpoints will be saved in ./logs/sketch2ffhq.

Diverse Generations

We support generating diverse outputs with the model.

Follow Testing: CelebAMask-to-FFHQ to prepare the dataset and pre-trained models. Run:

bash commands/inference_seg2ffhq.sh

High-Resolution Sampling

We support sampling high-resolution (i.e., 1024x1024) masks and images from random noise.

Follow Testing: CelebAMask-to-FFHQ to prepare the dataset and pre-trained models. Run:

bash commands/sampling_seg2ffhq.sh

Acknowledgements

This implementation is built upon StyleGAN2 and pixel2style2pixel.

Citation

@article{huang2023domain,
  author={Huang, Siyu and An, Jie and Wei, Donglai and Lin, Zudi and Luo, Jiebo and Pfister, Hanspeter},
  title={Domain-Scalable Unpaired Image Translation Via Latent Space Anchoring},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  year={2023},
}

Contact

Siyu Huang ([email protected])
