Official PyTorch implementation of the papers:
- HyperMAML: Few-Shot Adaptation of Deep Models with Hypernetworks (2022) Przewięźlikowski M., Przybysz P., Tabor J., Zięba M., Spurek P., preprint
- HyperShot: Few-Shot Learning by Kernel HyperNetworks (2022) Sendera M., Przewięźlikowski M., Karanowski K., Zięba M., Tabor J., Spurek P., preprint
@misc{sendera2022hypershot,
doi = {10.48550/ARXIV.2203.11378},
url = {https://arxiv.org/abs/2203.11378},
author = {Sendera, Marcin and Przewięźlikowski, Marcin and Karanowski, Konrad and Zięba, Maciej and Tabor, Jacek and Spurek, Przemysław},
keywords = {Machine Learning (cs.LG), Artificial Intelligence (cs.AI), Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {HyperShot: Few-Shot Learning by Kernel HyperNetworks},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
Few-shot models aim at making predictions using a minimal number of labeled examples from a given task. The main challenge in this area is the one-shot setting where only one element represents each class. We propose HyperShot - the fusion of kernels and hypernetwork paradigm. Compared to reference approaches that apply a gradient-based adjustment of the parameters, our model aims to switch the classification module parameters depending on the task's embedding. In practice, we utilize a hypernetwork, which takes the aggregated information from support data and returns the classifier's parameters handcrafted for the considered problem. Moreover, we introduce the kernel-based representation of the support examples delivered to hypernetwork to create the parameters of the classification module. Consequently, we rely on relations between embeddings of the support examples instead of direct feature values provided by the backbone models. Thanks to this approach, our model can adapt to highly different tasks.
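The kernel-based representation of the support examples can be illustrated with a minimal sketch. It assumes a cosine-similarity kernel and plain Python lists standing in for backbone embeddings; the function names `cosine` and `kernel_matrix` are hypothetical and are not taken from hypernet_kernel.py.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def kernel_matrix(support):
    """Pairwise similarities between all support embeddings."""
    return [[cosine(u, v) for v in support] for u in support]

# Toy 3-way 1-shot support set (assumed 2-D embeddings, for illustration only).
support = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = kernel_matrix(support)

# In this sketch the hypernetwork would consume the flattened relations
# between support examples rather than the raw feature values.
hypernet_input = [k for row in K for k in row]
```

The point of the sketch is that the input size depends only on the number of support examples, not on the embedding dimension, which is one way to read "relations between embeddings instead of direct feature values".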
The aim of Few-Shot learning methods is to train models which can easily adapt to previously unseen tasks, based on small amounts of data. One of the most popular and elegant Few-Shot learning approaches is Model-Agnostic Meta-Learning (MAML). The main idea behind this method is to learn the general weights of the meta-model, which are further adapted to specific problems in a small number of gradient steps. However, the model’s main limitation lies in the fact that the update procedure is realized by gradient-based optimisation. In consequence, MAML cannot always modify weights to the essential level in one or even a few gradient iterations. On the other hand, using many gradient steps results in a complex and time-consuming optimization procedure, which is hard to train in practice, and may lead to overfitting. In this paper, we propose HyperMAML, a novel generalization of MAML, where the training of the update procedure is also part of the model. Namely, in HyperMAML, instead of updating the weights with gradient descent, we use for this purpose a trainable Hypernetwork. Consequently, in this framework, the model can generate significant updates whose range is not limited to a fixed number of gradient steps. Experiments show that HyperMAML consistently outperforms MAML and performs comparably to other state-of-the-art techniques in a number of standard Few-Shot learning benchmarks.
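The contrast between the two adaptation rules can be sketched on a toy problem. This is not the repository's implementation: the quadratic loss, the additive sign convention, the names `maml_adapt` and `hyper_adapt`, and the fixed-scaling stand-in for a trained hypernetwork are all assumptions made for illustration.

```python
def maml_adapt(theta, grad_fn, lr=0.1, steps=1):
    """MAML-style inner loop: a fixed number of small gradient steps."""
    for _ in range(steps):
        grads = grad_fn(theta)
        theta = [t - lr * g for t, g in zip(theta, grads)]
    return theta

def hyper_adapt(theta, hypernet, task_summary):
    """HyperMAML-style adaptation: a network predicts the update directly."""
    delta = hypernet(task_summary)
    return [t + d for t, d in zip(theta, delta)]

# Toy task: loss = 0.5 * (theta - optimum)^2, so the gradient is theta - optimum.
optimum = [5.0, -3.0]
grad_fn = lambda theta: [t - o for t, o in zip(theta, optimum)]

# Stand-in for a trained hypernetwork (assumption: a fixed scaling, not a real model).
# The task summary is set to the optimum only to keep the toy self-contained;
# in practice it would be derived from the support set.
toy_hypernet = lambda summary: [0.9 * s for s in summary]

theta0 = [0.0, 0.0]
after_maml = maml_adapt(theta0, grad_fn)                   # one small gradient step
after_hyper = hyper_adapt(theta0, toy_hypernet, optimum)   # one predicted update
```

In this toy, a single gradient step moves the weights only a tenth of the way to the task optimum, while the predicted update is not tied to the step size or the number of steps.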
- Python >= 3.7
- NumPy >= 1.19
- PyTorch >= 1.11
- GPyTorch >= 1.5.1
- (optional) neptune-client for logging training results into your Neptune project.
pip install numpy torch torchvision gpytorch h5py pillow
- HyperShot: hypernet_kernel.py
- HyperMAML: hypermaml.py
The various methods can be trained using the following syntax:
python train.py --dataset="miniImagenet" --method="hyper_maml" --train_n_way=5 --test_n_way=5 --n_shot=1 --seed=1 --train_aug
You can run python train.py --h to list all the possible arguments.
The train.py script performs the whole training and evaluation procedure.
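The --train_n_way, --test_n_way and --n_shot flags in the training command describe the shape of a few-shot episode: the number of classes per task and the number of labeled support examples per class. The sketch below shows one way such an episode could be sampled. It is not the sampler used by train.py; the function `sample_episode`, the `n_query` parameter, and the toy data are hypothetical.

```python
import random

def sample_episode(data_by_class, n_way, n_shot, n_query, rng):
    """Draw one N-way K-shot episode: a support set and a query set."""
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        # Sample without replacement so support and query never overlap.
        items = rng.sample(data_by_class[cls], n_shot + n_query)
        support += [(x, label) for x in items[:n_shot]]
        query += [(x, label) for x in items[n_shot:]]
    return support, query

# Toy dataset: 10 classes with 20 placeholder items each.
data = {f"class_{c}": [f"img_{c}_{i}" for i in range(20)] for c in range(10)}

# 5-way 1-shot, matching --train_n_way=5 --n_shot=1 in the command above.
support, query = sample_episode(data, n_way=5, n_shot=1, n_query=3,
                                rng=random.Random(1))
```

The model adapts on `support` and is evaluated on `query`; with n_shot=1 each class is represented by a single labeled example.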
This repository provides implementations of several few-shot learning methods:
- hyper_maml - HyperMAML: Few-Shot Adaptation of Deep Models with Hypernetworks
- hyper_shot - HyperShot: Few-Shot Learning by Kernel HyperNetworks
- hn_ppa - Few-Shot Image Recognition by Predicting Parameters from Activations
- DKT - Bayesian Meta-Learning for the Few-Shot Setting via Deep Kernels
- maml, maml_approx - Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
- protonet - Prototypical Networks for Few-shot Learning
- relationnet - Learning to Compare: Relation Network for Few-Shot Learning
- matchingnet - Matching Networks for One Shot Learning
- baseline++ - A Closer Look at Few-Shot Classification
- baseline - Feature Transfer
You must use those exact strings at training and test time when you call the script (see below).
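Because the method strings must match exactly, a small wrapper can catch typos before a long run starts. The sketch below uses only the flags and method strings that appear in this README; the helper `build_command` is hypothetical and not part of the repository.

```python
# Method strings as listed in this README.
METHODS = {
    "hyper_maml", "hyper_shot", "hn_ppa", "DKT", "maml", "maml_approx",
    "protonet", "relationnet", "matchingnet", "baseline++", "baseline",
}

def build_command(method, dataset="miniImagenet", n_way=5, n_shot=1, seed=1):
    """Return the train.py argument list, rejecting unknown method strings."""
    if method not in METHODS:
        raise ValueError(f"unknown method {method!r}; expected one of {sorted(METHODS)}")
    return [
        "python", "train.py",
        f"--dataset={dataset}",
        f"--method={method}",
        f"--train_n_way={n_way}",
        f"--test_n_way={n_way}",
        f"--n_shot={n_shot}",
        f"--seed={seed}",
        "--train_aug",
    ]

cmd = build_command("hyper_shot")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the project root, for example inside a loop over seeds.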
This is an example of how to download and prepare a dataset for training/testing. Here we assume the current directory is the project root folder:
cd filelists/DATASET_NAME/
sh download_DATASET_NAME.sh
Replace DATASET_NAME with one of the following: omniglot, CUB, miniImagenet, emnist, QMUL. Note that mini-ImageNet is a large dataset that requires substantial storage, so you can save the dataset in another location and then change the entry in configs.py accordingly.
These are the instructions to train and test the methods reported in the paper in the various conditions.
In addition, you can select the cross_char and cross datasets for cross-domain classification of Omniglot → EMNIST and mini-ImageNet → CUB, respectively.
The script allows training and testing on different backbone networks. By default the script will use the same backbone used in our experiments (Conv4). Check the file backbone.py for the available architectures, and use the parameter --model=BACKBONE_STRING where BACKBONE_STRING is one of the following: Conv4, Conv6, ResNet10|18|34|50|101.
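The shorthand ResNet10|18|34|50|101 stands for five separate backbone strings. A minimal sketch of expanding it and validating a --model value follows; the helpers `expand_backbones` and `model_flag` are hypothetical, and backbone.py remains the authoritative list of architectures.

```python
def expand_backbones(spec):
    """Expand 'ResNet10|18|34|50|101' into explicit backbone names."""
    head, *rest = spec.split("|")
    prefix = head.rstrip("0123456789")  # 'ResNet10' -> 'ResNet'
    return [head] + [prefix + depth for depth in rest]

# Backbone strings as listed in this README.
BACKBONES = ["Conv4", "Conv6"] + expand_backbones("ResNet10|18|34|50|101")

def model_flag(name="Conv4"):
    """Build the --model argument, rejecting names not listed above."""
    if name not in BACKBONES:
        raise ValueError(f"unknown backbone {name!r}; expected one of {BACKBONES}")
    return f"--model={name}"

print(BACKBONES)
print(model_flag("ResNet18"))
```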
We support logging the training / validation metrics and details to Neptune. To enable it, export the following environment variables before running train.py.
export NEPTUNE_PROJECT=...
export NEPTUNE_API_TOKEN=...
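A quick pre-flight check can confirm that both variables are visible to the Python process before a run is launched. This is a sketch only; the helper `missing_neptune_vars` is hypothetical, and how train.py reacts to missing variables is not specified here.

```python
import os

REQUIRED = ("NEPTUNE_PROJECT", "NEPTUNE_API_TOKEN")

def missing_neptune_vars(env=None):
    """Return the names of required Neptune variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

missing = missing_neptune_vars()
if missing:
    print("Neptune logging is not configured; missing:", ", ".join(missing))
else:
    print("Neptune variables are set.")
```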
This repository is a fork of: https://github.com/BayesWatch/deep-kernel-transfer, which in turn is a fork of https://github.com/wyharveychen/CloserLookFewShot.