of being backwards-compatible. As such, we can use our previously-trained models. Make sure you first get acquainted with the official repository and its codebase, as we will be building upon it, thereby increasing its capabilities (but hopefully not its complexity!).

This repository adds/has the following changes (not yet the complete list):
* Dataset tool
    * Add `--center-crop-tall`: adds vertical black bars to the sides instead, in the same vein as the horizontal bars in `--center-crop-wide`.
    * Grayscale images in the dataset are converted to `RGB`.
    * If the dataset tool encounters an error, it is printed along with the offending image, but the rest of the dataset is still processed ([PR#39](https://github.com/NVlabs/stylegan3/pull/39) from [Andreas Jansson](https://github.com/andreasjansson)).
    * ***TODO:*** Add multi-crop, as used in [Earth View](https://github.com/PDillis/earthview#multi-crop---data_augmentpy).
* Training
    * `--mirrory`: Added vertical mirroring for doubling the dataset size (quadrupling if `--mirror` is also used; make sure your dataset has either or both of these symmetries for it to make sense to use them).
    * `--gamma`: If no R1 regularization weight is provided, the heuristic formula from [StyleGAN2](https://github.com/NVlabs/stylegan2) will be used.
    * `--aug`: ***TODO:*** add [Deceive-D/APA](https://github.com/EndlessSora/DeceiveD) as an option.
    * `--augpipe`: [StyleGAN2-ADA's](https://github.com/NVlabs/stylegan2-ada-pytorch) full list of augmentation pipelines is now available, i.e., individual augmentations (`blit`, `geom`, `color`, `filter`, `noise`, `cutout`) or their combinations (`bg`, `bgc`, `bgcf`, `bgcfn`, `bgcfnc`).
    * `--img-snap`: Set when to save snapshot images, independently of when the model is saved (e.g., save image snapshots more often to track how the model is training without saving the model itself, to save space).
    * `--snap-res`: The resolution of the snapshots, depending on how many images you wish to see per snapshot. Available resolutions: `1080p`, `4k`, and `8k`.
    * `--resume-kimg`: Starting number of `kimg`, useful when continuing training a previous run.
    * `--outdir`: Automatically set to `training-runs`, so there's no need to set it beforehand (in general, this is true throughout the repository).
    * `--metrics`: Now set by default to `None`, so there's no need to worry about this one.
    * `--resume`: All available pre-trained models from NVIDIA (and more) can be used via a simple dictionary key, depending on the `--cfg` used. For example, if you wish to use StyleGAN3's `config-r`, set `--cfg=stylegan3-r`; in addition, to transfer learn from FFHQU at 1024 resolution, set `--resume=ffhqu1024` (see the example training command after this list).
    * The full list of currently available models to transfer learn from (or synthesize new images with) is the following (***TODO:*** add a small description of each model, so the user can better know which to use for their particular use case; proper citation of the original authors as well):

    <details>
    <summary>StyleGAN2 models</summary>

    1. Majority, if not all, are `config-f`: set `--cfg=stylegan2`
        * `ffhq256`
        * `ffhqu256`
        * `ffhq512`
        * `ffhq1024`
        * `ffhqu1024`
        * `celebahq256`
        * `lsundog256`
        * `afhqcat512`
        * `afhqdog512`
        * `afhqwild512`
        * `afhq512`
        * `brecahad512`
        * `cifar10` (conditional, 10 classes)
        * `metfaces1024`
        * `metfacesu1024`
        * `lsuncar512` (config-f)
        * `lsuncat256` (config-f)
        * `lsunchurch256` (config-f)
        * `lsunhorse256` (config-f)
        * `minecraft1024` (thanks to @jeffheaton)
        * `imagenet512` (thanks to @shawwn)
        * `wikiart1024-C` (conditional, 167 classes; thanks to @pbaylies)
        * `wikiart1024-U` (thanks to @pbaylies)
        * `maps1024` (thanks to @tjukanov)
        * `fursona512` (thanks to @arfafax)
        * `mlpony512` (thanks to @arfafax)
        * `afhqcat256` (Deceive-D/APA models)
        * `anime256` (Deceive-D/APA models)
        * `cub256` (Deceive-D/APA models)
        * `sddogs1024` (Self-Distilled StyleGAN models)
        * `sdelephant512` (Self-Distilled StyleGAN models)
        * `sdhorses512` (Self-Distilled StyleGAN models)
        * `sdbicycles256` (Self-Distilled StyleGAN models)
        * `sdlions512` (Self-Distilled StyleGAN models)
        * `sdgiraffes512` (Self-Distilled StyleGAN models)
        * `sdparrots512` (Self-Distilled StyleGAN models)
    </details>

    <details>
    <summary>StyleGAN3 models</summary>

    1. `config-t`: set `--cfg=stylegan3-t`
        * `afhq512`
        * `ffhqu256`
        * `ffhq1024`
        * `ffhqu1024`
        * `metfaces1024`
        * `metfacesu1024`
        * `landscapes256` (thanks to @justinpinkney)
        * `wikiart1024` (thanks to @justinpinkney)
        * `mechfuture256` (thanks to @edstoica; 29 kimg tick)
        * `vivflowers256` (thanks to @edstoica; 68 kimg tick)
        * `alienglass256` (thanks to @edstoica; 38 kimg tick)
        * `scificity256` (thanks to @edstoica; 210 kimg tick)
        * `scifiship256` (thanks to @edstoica; 168 kimg tick)
    2. `config-r`: set `--cfg=stylegan3-r`
        * `afhq512`
        * `ffhq1024`
        * `ffhqu1024`
        * `ffhqu256`
        * `metfaces1024`
        * `metfacesu1024`
    </details>

    * The main sources of these pretrained models are the [official NVIDIA repository](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/research/models/stylegan3), as well as other community repositories, such as [Justin Pinkney](https://github.com/justinpinkney)'s [Awesome Pretrained StyleGAN2](https://github.com/justinpinkney/awesome-pretrained-stylegan2) and [Awesome Pretrained StyleGAN3](https://github.com/justinpinkney/awesome-pretrained-stylegan3), [Deceive-D/APA](https://github.com/EndlessSora/DeceiveD), [Self-Distilled StyleGAN/Internet Photos](https://github.com/self-distilled-stylegan/self-distilled-internet-photos), and [edstoica](https://github.com/edstoica)'s [Wombo Dream](https://www.wombo.art/)[-based models](https://github.com/edstoica/lucid_stylegan3_datasets_models). Others can be found around the net and are properly credited in this repository, so long as they can be easily downloaded with [`dnnlib.util.open_url`](https://github.com/PDillis/stylegan3-fun/blob/4ce9d6f7601641ba1e2906ed97f2739a63fb96e2/dnnlib/util.py#L396).
* [Project into W+](https://arxiv.org/abs/1904.03189)
    * Additional losses to use for better projection (e.g., using VGG16 or [CLIP](https://github.com/openai/CLIP))
    * Start from a random image (`random` or `perlin`, using [Mathieu Duchesneau's implementation](https://github.com/duchesneaumathieu/pyperlin)) or from an existing one
* Expansion on GUI/`visualizer.py`
    * Added the rest of the affine transformations
    * Added a widget for class-conditional models (***TODO:*** mix classes with continuous values for `cls`!)
* General model and code additions
    * [Multi-modal truncation trick](https://arxiv.org/abs/2202.12211): find the different clusters in your model and use the one closest to your dlatent, in order to increase the fidelity (***TODO:*** finish skeleton implementation)
    * StyleGAN3: anchor the latent space for easier-to-follow interpolations (thanks to [Rivers Have Wings](https://github.com/crowsonkb) and [nshepperd](https://github.com/nshepperd)).
    * Use the CPU instead of the GPU if desired (not recommended, but perfectly fine for generating images whenever the custom CUDA kernels fail to compile).
    * Added missing dependencies and channels so that the [`conda`](https://docs.conda.io/en/latest/) environment is correctly set up on Windows (PRs [#111](https://github.com/NVlabs/stylegan3/pull/111)/[#116](https://github.com/NVlabs/stylegan3/pull/116)/[#125](https://github.com/NVlabs/stylegan3/pull/125) and [#80](https://github.com/NVlabs/stylegan3/pull/80)/[#143](https://github.com/NVlabs/stylegan3/pull/143) from the base repository, respectively).
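
As an illustration of some of the training options above, here is a sketch of a transfer-learning run. The dataset path, GPU count, and batch size are placeholders, and `--data`, `--gpus`, and `--batch` follow the base StyleGAN3 `train.py`; adjust them to your own setup.

```bash
# Transfer learn from FFHQU at 1024x1024 using StyleGAN3's config-r.
# --outdir defaults to training-runs/ and --metrics defaults to None in this
# repository, so neither needs to be passed explicitly.
python train.py --cfg=stylegan3-r --resume=ffhqu1024 \
    --data=~/datasets/my-dataset-1024x1024.zip --gpus=2 --batch=32 \
    --mirror=1 --snap-res=4k
```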

***TODO:*** Finish documentation for better user experience, add videos/images, code samples, visuals...

---

See [Troubleshooting](./docs/troubleshooting.md) for help on common installation and run-time problems.

Pre-trained networks are stored as `*.pkl` files that can be referenced using local filenames or URLs:

```bash
# Generate an image using pre-trained AFHQv2 model ("Ours" in Figure 1, left).
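# A sketch following the base repository's gen_images.py; the network pickle is
# given here as a placeholder filename (a URL to a hosted model also works):
python gen_images.py --outdir=out --trunc=1 --seeds=2 --network=stylegan3-r-afhqv2-512x512.pkl
```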

Outputs from the above commands are placed under `out/*.png`, controlled by `--outdir`.

**Docker**: You can run the above curated image example using Docker as follows:

```bash
# Build the stylegan3:latest image
docker build --tag stylegan3 .
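
# A simplified sketch of running the same image-generation command inside the
# container; the volume mount and any user/permission flags may need adjusting
# for your setup:
docker run --gpus all -it --rm -v `pwd`:/scratch --workdir /scratch stylegan3 \
    python gen_images.py --outdir=out --trunc=1 --seeds=2 --network=stylegan3-r-afhqv2-512x512.pkl
```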

This release contains an interactive model visualization tool that can be used to explore various characteristics of a trained model. To start it, run:

```bash
python visualizer.py
```

You can use pre-trained networks in your own Python code as follows:

```python
import pickle
import torch

with open('ffhq.pkl', 'rb') as f:
    G = pickle.load(f)['G_ema'].cuda()  # torch.nn.Module
z = torch.randn([1, G.z_dim]).cuda()    # latent codes
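# A minimal sketch of calling the generator, assuming an unconditional model
# (class labels are then None); pickles can also be opened from a URL via
# dnnlib.util.open_url instead of open().
c = None                                 # class labels (None for unconditional models)
img = G(z, c)                            # NCHW, float32, dynamic range [-1, +1]
```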

The pickle contains three networks. `'G'` and `'D'` are instantaneous snapshots taken during training, and `'G_ema'` represents a moving average of the generator weights over several training steps.

The generator consists of two submodules, `G.mapping` and `G.synthesis`, that can be executed separately. They also support various additional options:

```python
w = G.mapping(z, c, truncation_psi=0.5, truncation_cutoff=8)
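# A sketch of the matching synthesis call, following the official example;
# both keyword arguments are optional:
img = G.synthesis(w, noise_mode='const', force_fp32=True)
```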

Datasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file `dataset.json` for labels.

**FFHQ**: Download the [Flickr-Faces-HQ dataset](https://github.com/NVlabs/ffhq-dataset) as 1024x1024 images and create a zip archive using `dataset_tool.py`:
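
A sketch of such a command, following the base repository's `dataset_tool.py` (the source and destination paths are placeholders):

```bash
# Pack the 1024x1024 FFHQ images into a training-ready ZIP archive.
python dataset_tool.py --source=/tmp/images1024x1024 --dest=~/datasets/ffhq-1024x1024.zip
```
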
See the [MetFaces README](https://github.com/NVlabs/metfaces-dataset) for information on how to obtain the unaligned MetFaces dataset images. Use the same steps as above to create a ZIP archive for training and validation.

**AFHQv2**: Download the [AFHQv2 dataset](https://github.com/clovaai/stargan-v2/blob/master/README.md#animal-faces-hq-dataset-afhq) and create a ZIP archive:
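
Again a sketch with placeholder paths; `--resolution` follows the base repository's `dataset_tool.py` and resizes the images while packing them:

```bash
python dataset_tool.py --source=~/downloads/afhqv2 --dest=~/datasets/afhqv2-512x512.zip --resolution=512x512
```
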
Note that the above command creates a single combined dataset using all images of all three classes (cats, dogs, and wild animals), matching the setup used in the StyleGAN3 paper. Alternatively, you can also create a separate dataset for each class:
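
For instance, a per-class sketch, assuming the standard AFHQ folder layout (repeat analogously for dogs and wild animals; paths are placeholders):

```bash
python dataset_tool.py --source=~/downloads/afhqv2/train/cat --dest=~/datasets/afhqv2cat-512x512.zip --resolution=512x512
```
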
The easiest way to inspect the spectral properties of a given generator is to use the built-in FFT mode in `visualizer.py`. In addition, you can visualize average 2D power spectra (Appendix A, Figure 15) as follows:

```bash
# Calculate dataset mean and std, needed in subsequent steps.
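# A sketch following the base repository's avg_spectra.py; the dataset path is a placeholder:
python avg_spectra.py stats --source=~/datasets/ffhq-1024x1024.zip
```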