
Commit e51295f

small pretrained models
1 parent 901ad00

5 files changed (+37, -22 lines)

.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -12,3 +12,4 @@ data/
 output/
 *.ipynb
 PRIVATE_*
+*_deprecated.yaml
```

README.md

Lines changed: 19 additions & 5 deletions

````diff
@@ -70,6 +70,8 @@ To train an MDEQ segmentation model on Cityscapes, do
 ```sh
 python -m torch.distributed.launch --nproc_per_node=4 tools/seg_train.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml
 ```
+where you should provide the pretrained ImageNet model path in the corresponding configuration (`.yaml`) file. We provide a sample pretrained model extractor in `pretrained_models/`, but you can also write your own script.
+
 Similarly, to test the model and generate segmentation results on Cityscapes, do
 
 ```sh
@@ -88,20 +90,32 @@ We provide some reasonably good pre-trained weights here so that one can quickly
 | ------------- | ----------------- | ------------------- | ----------------------- |
 | MDEQ-XL | ImageNet Classification | ImageNet | [download (.pkl)](https://drive.google.com/file/d/1MBUFBOAAI8m2eccNbHePrukpOiAzPbji/view?usp=sharing) |
 | MDEQ-XL | Cityscapes(val) Segmentation | Cityscapes | [download (.pkl)](https://drive.google.com/file/d/1Gu7pJLGvXBbU_sPxNfjiaROJwEwak2Z8/view?usp=sharing) |
+| MDEQ-Small | ImageNet Classification | ImageNet | [download (.pkl)](https://drive.google.com/file/d/12ANsUdJJ3_qb5nfiBVPOoON2GQ2v4W1g/view?usp=sharing) |
+| MDEQ-Small | Cityscapes(val) Segmentation | Cityscapes | [download (.pkl)](https://drive.google.com/file/d/11DZfYhHNK_XC6-Uob1Pp2pStS5EhP5dF/view?usp=sharing) |
 
 **Example of how to use the pretrained ImageNet model to train on Cityscapes**:
 1. Download the pretrained ImageNet `.pkl` file.
 2. Put the model under `pretrained_models/` folder with some file name `[FILENAME]`.
-3. In the corresponding `experiments/cityscapes/cls_MDEQ_XL.yaml`, set `PRETRAINED` to `"pretrained_models/[FILENAME]"`. Make sure you **don't** make it the `MODEL_FILE`.
-4. Run the MDEQ segmentation training command (see the "Usage" section above).
+3. In the corresponding `experiments/cityscapes/seg_MDEQ_[SIZE].yaml` (where `SIZE` is typically `SMALL`, `LARGE` or `XL`), set `MODEL.PRETRAINED` to `"pretrained_models/[FILENAME]"`.
+4. Run the MDEQ segmentation training command (see the "Usage" section above):
+```sh
+python -m torch.distributed.launch --nproc_per_node=[N_GPUS] tools/seg_train.py --cfg experiments/cityscapes/seg_MDEQ_[SIZE].yaml
+```
 
-(We'll soon update with the pretrained MDEQ-Large and MDEQ-Small ImageNet models!)
+**Example of how to use the pretrained Cityscapes model for inference**:
+1. Download the pretrained Cityscapes `.pkl` file.
+2. Put the model under `pretrained_models/` folder with some file name `[FILENAME]`.
+3. In the corresponding `experiments/cityscapes/seg_MDEQ_[SIZE].yaml` (where `SIZE` is typically `SMALL`, `LARGE` or `XL`), set `TEST.MODEL_FILE` to `"pretrained_models/[FILENAME]"`.
+4. Run the MDEQ segmentation testing command (see the "Usage" section above):
+```sh
+python tools/seg_test.py --cfg experiments/cityscapes/seg_MDEQ_[SIZE].yaml
+```
 
 
 ### Tips:
 
-- To load the Cityscapes pretrained model, download the `.pkl` file below and specify the path in `config.[TRAIN/TEST].MODEL_FILE` (which is `''` by default) in the `.yaml` files.
-- The difference between `[TRAIN/TEST].MODEL_FILE` and `MODEL.PRETRAINED` arguments in the yaml files: the former is used to load all of the model parameters; the latter is for compound training (e.g., when transferring from ImageNet to Cityscapes, we want to discard the final classifier FC layer).
+- To load the Cityscapes pretrained model, download the `.pkl` file and specify the path in `config.[TRAIN/TEST].MODEL_FILE` (which is `''` by default) in the `.yaml` files. This is **different** from setting `MODEL.PRETRAINED`; see the point below.
+- The difference between the `[TRAIN/TEST].MODEL_FILE` and `MODEL.PRETRAINED` arguments in the yaml files: the former is used to load all of the model parameters; the latter is for compound training (e.g., when transferring from ImageNet to Cityscapes, we want to discard the final classifier FC layers).
 - The repo supports checkpointing of models at each epoch. One can resume from a previously saved checkpoint by turning on the `TRAIN.RESUME` argument in the yaml files.
 - Just like DEQs, the MDEQ models can be slower than explicit deep networks, and even more so as the image size increases (because larger images typically require more Broyden iterations to converge well; see Figure 5 in the paper). But one can play with the forward and backward thresholds to adjust the runtime.
 
````
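The third step in both walkthroughs hinges on the `MODEL.PRETRAINED` vs. `[TRAIN/TEST].MODEL_FILE` distinction spelled out in the tips: a full load versus a filtered, backbone-only load. As a rough illustration of the filtered variant, consider the sketch below; the function name and the `.pkl` layout are assumptions for illustration, not the repo's actual loading code.

```python
# A minimal sketch of a MODEL.PRETRAINED-style partial load: keep only
# parameters whose names and shapes match the current model, so the
# final classifier layers are freshly initialized for the new task.
# `load_backbone_only` and the {name: tensor} .pkl layout are
# hypothetical, not the repo's actual API.
import pickle

import torch

def load_backbone_only(model: torch.nn.Module, pkl_path: str):
    with open(pkl_path, "rb") as f:
        pretrained = pickle.load(f)  # assumed: a {name: tensor} dict
    own = model.state_dict()
    # Drop anything absent from the current model or shape-mismatched
    # (e.g., the ImageNet classifier head when moving to Cityscapes).
    kept = {k: v for k, v in pretrained.items()
            if k in own and own[k].shape == v.shape}
    own.update(kept)
    model.load_state_dict(own)
    return sorted(pretrained.keys() - kept.keys())  # what was discarded
```

A `MODEL_FILE`-style load, by contrast, is just a plain `load_state_dict` over every parameter, which is why it is the right knob for inference but not for ImageNet-to-Cityscapes transfer.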

experiments/cityscapes/seg_mdeq_SMALL.yaml

Lines changed: 8 additions & 8 deletions

```diff
@@ -17,14 +17,14 @@ DATASET:
 MODEL:
   NAME: mdeq
   PRETRAINED: ''
-  NUM_LAYERS: 4
-  DROPOUT: 0.02
-  F_THRES: 26
-  B_THRES: 26
+  NUM_LAYERS: 3
+  DROPOUT: 0.05
+  F_THRES: 27
+  B_THRES: 30
   WNORM: true
   DOWNSAMPLE_TIMES: 2
   NUM_GROUPS: 8
-  EXPANSION_FACTOR: 4
+  EXPANSION_FACTOR: 5
   EXTRA:
     FINAL_CONV_KERNEL: 1
     FULL_STAGE:
@@ -51,11 +51,11 @@ TRAIN:
   - 1024
   - 512
   BASE_SIZE: 2048
-  BATCH_SIZE_PER_GPU: 3
+  BATCH_SIZE_PER_GPU: 2
   SHUFFLE: true
   BEGIN_EPOCH: 0
   END_EPOCH: 500
-  RESUME: false
+  RESUME: true
   OPTIMIZER: sgd
   LR: 0.01
   WD: 0.0002
@@ -65,7 +65,7 @@ TRAIN:
   MULTI_SCALE: true
   LR_SCHEDULER: 'cosine'
   DOWNSAMPLERATE: 1
-  PRETRAIN_STEPS: 45000
+  PRETRAIN_STEPS: 60000
   IGNORE_LABEL: 255
   SCALE_FACTOR: 16
 TEST:
```
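The retuned fields above are plain keys in the `.yaml` file, so their values are easy to sanity-check before launching a run. Below is a minimal sketch with PyYAML; the repo itself consumes these keys through its own config module, so this is only for inspection.

```python
# Print the solver budgets and warmup length retuned in this commit.
# Plain PyYAML is used here purely for inspection; the training
# scripts read the same keys through the repo's config system.
import yaml

with open("experiments/cityscapes/seg_mdeq_SMALL.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["MODEL"]["F_THRES"])         # 27 after this commit (was 26)
print(cfg["MODEL"]["B_THRES"])         # 30 after this commit (was 26)
print(cfg["TRAIN"]["PRETRAIN_STEPS"])  # 60000 after this commit (was 45000)
```

Note that `F_THRES`/`B_THRES` bound the forward and backward Broyden iterations, which is exactly the runtime knob mentioned in the README tips.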

experiments/imagenet/cls_mdeq_SMALL.yaml

Lines changed: 8 additions & 8 deletions

```diff
@@ -11,11 +11,11 @@ MODEL:
   NUM_CLASSES: 1000
   NUM_GROUPS: 8
   DROPOUT: 0.0
-  F_THRES: 24
-  B_THRES: 24
+  F_THRES: 26
+  B_THRES: 27
   WNORM: true
   DOWNSAMPLE_TIMES: 2
-  EXPANSION_FACTOR: 4
+  EXPANSION_FACTOR: 5
   IMAGE_SIZE:
   - 224
   - 224
@@ -25,10 +25,10 @@ MODEL:
   NUM_BRANCHES: 4
   BLOCK: BASIC
   HEAD_CHANNELS:
-  - 28
-  - 56
-  - 112
-  - 224
+  - 24
+  - 48
+  - 96
+  - 192
   FINAL_CHANSIZE: 2048
   NUM_BLOCKS:
   - 1
@@ -60,7 +60,7 @@ TRAIN:
   END_EPOCH: 100
   RESUME: true
   LR_SCHEDULER: 'cosine'
-  PRETRAIN_STEPS: 600000
+  PRETRAIN_STEPS: 500000
   LR_FACTOR: 0.1
   LR_STEP:
   - 30
```
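The `HEAD_CHANNELS` change from `[28, 56, 112, 224]` to `[24, 48, 96, 192]` narrows every branch of the classification head. A back-of-the-envelope sketch of the effect on head size follows; the 1x1-projection layout it assumes is a hypothetical stand-in for the actual MDEQ head, and only the channel counts come from the diff.

```python
# Rough comparison of classification-head sizes before and after this
# commit, assuming (hypothetically) 1x1 convs between consecutive
# branch widths plus a final projection into FINAL_CHANSIZE.
OLD = [28, 56, 112, 224]  # HEAD_CHANNELS before
NEW = [24, 48, 96, 192]   # HEAD_CHANNELS after
FINAL_CHANSIZE = 2048

def approx_head_params(chans, final=FINAL_CHANSIZE):
    between = sum(a * b for a, b in zip(chans, chans[1:]))
    return between + chans[-1] * final

print(approx_head_params(OLD), approx_head_params(NEW))
# NEW is narrower at every branch, consistent with a "small" model.
```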

lib/modules/broyden.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -127,7 +127,7 @@ def broyden(g, x0, threshold, eps, ls=False, name="unknown"):
     gx = g(x_est)  # (bsz, 2d, L')
     nstep = 0
     tnstep = 0
-    LBFGS_thres = min(threshold, 24)
+    LBFGS_thres = min(threshold, 27)
 
     # For fast calculation of inv_jacobian (approximately)
     Us = torch.zeros(bsz, total_hsize, n_elem, LBFGS_thres).to(dev)
```
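Raising the `LBFGS_thres` cap from 24 to 27 lets the limited-memory Broyden solver retain three more rank-one updates, at a linear cost in memory for the `Us`/`VTs` buffers. The sketch below illustrates that footprint with made-up tensor sizes; only the `Us` shape pattern comes from the snippet above, and the `VTs` shape is assumed to mirror it as in DEQ-style solvers.

```python
# Memory cost of the LBFGS_thres cap: Us and VTs each hold one slot
# per retained rank-one Broyden update, so their size grows linearly
# with the threshold. Dimensions below are illustrative only.
import torch

bsz, total_hsize, n_elem = 4, 80, 128 * 64  # hypothetical sizes

for thres in (24, 27):  # before / after this commit
    Us = torch.zeros(bsz, total_hsize, n_elem, thres)
    VTs = torch.zeros(bsz, thres, total_hsize, n_elem)  # assumed shape
    mib = (Us.numel() + VTs.numel()) * 4 / 2**20  # float32 bytes -> MiB
    print(f"LBFGS_thres={thres}: ~{mib:.0f} MiB for Us + VTs")
```

More retained updates give a better low-rank inverse-Jacobian estimate (and typically fewer wasted iterations) for a modest memory premium, which matches the higher `F_THRES`/`B_THRES` values in the retuned configs above.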
