Official implementation of the paper “Regression-Based Elastic Metric Learning on Shape Spaces of Cell Curves”.
[Paper] published at [NeurIPS Workshop Learning Meaningful Representations of Life]
- Regression-Based Elastic Metric Learning is a machine learning tool designed to improve analysis of discrete parameterizations of 2D curves changing over time.
- Specifically, we optimize geodesic regression analysis by learning the elastic metric parameters that model a given data trajectory close to a geodesic.
Left: A trajectory may follow a geodesic as calculated by one metric but not follow a geodesic as calculated by another metric. Our paradigm learns the elastic metric (parameterized by
- We consider a family of elastic metrics given by:
- We use the elastic metric implementation in Geomstats.
- The elastic metric is parameterized by $a$ and $b$, which quantify how much two shapes are "stretched" or "bent" compared to each other, respectively.
- Changing $a$ and $b$ changes the distance between points on the manifold of discrete curves, the space where we analyze curves. As such, changing $a$ and $b$ changes the nature of geodesics on that manifold.
- Note that the ratio $a/b$ is sufficient to describe variations of $g^{a, b}$. Thus, we set $b = 0.5$, as varying $b$ only changes the units of the calculation.
- Our paradigm learns the $a^*$ (and therefore the ratio $a^*/b$) that models the data trajectory as being closest to a geodesic, as evaluated by the coefficient of determination $R^2$.
- We use a gradient ascent algorithm, along with our derived analytical expression of $R^2$ in terms of $a$, to find the $a^*$ that maximizes $R^2$ for a given trajectory.
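The search for $a^*$ can be pictured as one-dimensional gradient ascent. The sketch below is illustrative only and is not the code in this repository: `toy_r2` is a hypothetical smooth stand-in for $R^2(a)$ with its peak placed at a made-up value, and the gradient is approximated by central finite differences rather than by the analytical expression derived in the paper.

```python
def toy_r2(a, a_peak=0.7):
    """Hypothetical stand-in for R^2(a): a concave curve peaking at a_peak."""
    return 1.0 - (a - a_peak) ** 2


def gradient_ascent(objective, a_init, lr=0.1, n_steps=200, eps=1e-5):
    """Maximize a scalar objective using a finite-difference gradient."""
    a = a_init
    for _ in range(n_steps):
        # Central difference approximates d(objective)/da at the current a.
        grad = (objective(a + eps) - objective(a - eps)) / (2 * eps)
        # Ascent step: move in the direction that increases the objective.
        a = a + lr * grad
    return a


a_star = gradient_ascent(toy_r2, a_init=0.2)
print(f"learned a* = {a_star:.4f}, toy R^2 = {toy_r2(a_star):.4f}")
```

With a real trajectory, `toy_r2` would be replaced by the $R^2$ obtained from geodesic regression under the elastic metric with parameter $a$.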
- We apply our paradigm to data trajectories of cell outlines changing over time.
- For each experiment, we generate a semi-synthetic data trajectory by drawing a geodesic between two real cancer cells.
- We create the trajectory with a predetermined 1) number of cells, 2) number of sampling points (how many points are sampled on each cell outline), 3) amount of noise, and 4) "true $a$" (the metric used to draw the geodesic between the real cancer cells). Note that because the semi-synthetic geodesic is drawn with the metric parameter $a_{true}$, the metric parameter $a_{true}$ WILL model the trajectory as a geodesic.
- Thus, the gradient ascent learning scheme aims to learn an $a^*$ close to $a_{true}$.
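To make the roles of the four settings concrete, here is a minimal NumPy sketch of a noisy trajectory between two curves. It rests on loud assumptions: the endpoints are a circle and an ellipse rather than real cell outlines, and plain linear interpolation stands in for the elastic geodesic, so no $a_{true}$ is involved. The function names are hypothetical and do not come from this repository.

```python
import numpy as np


def ellipse(n_sampling_points, semi_x, semi_y):
    """Sample a closed 2D curve (an ellipse) at n_sampling_points points."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_sampling_points, endpoint=False)
    return np.stack([semi_x * np.cos(theta), semi_y * np.sin(theta)], axis=1)


def noisy_trajectory(start, end, n_times, noise_std, seed=0):
    """Interpolate from start to end over n_times steps, then add noise.

    Linear interpolation is only a placeholder for an elastic geodesic.
    """
    rng = np.random.default_rng(seed)
    times = np.linspace(0.0, 1.0, n_times)
    clean = np.array([(1.0 - t) * start + t * end for t in times])
    # Independent Gaussian noise on every coordinate of every sampled point.
    return clean + noise_std * rng.normal(size=clean.shape)


circle = ellipse(n_sampling_points=50, semi_x=1.0, semi_y=1.0)
stretched = ellipse(n_sampling_points=50, semi_x=2.0, semi_y=1.0)
trajectory = noisy_trajectory(circle, stretched, n_times=20, noise_std=0.01)
print(trajectory.shape)  # (n_times, n_sampling_points, 2)
```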
- We compare the predictive power of $a^*$ regression to the predictive power of regression with the square-root-velocity (SRV) metric, which is a special case of the elastic metric where $a = 1$ and $b = 0.5$.
- Performing geodesic regression with our learned $a^*$ metric parameter improves the predictive power of geodesic regression, as geodesic regression is more accurate when the data trajectory is close to a geodesic.
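The comparison above is scored with the coefficient of determination $R^2$. As a rough illustration of that score, the sketch below computes $R^2$ between true and predicted curves using squared Euclidean distances. This is an assumption made for simplicity: it is not the elastic-metric expression derived in the paper, and `r2_score_curves` is a hypothetical name.

```python
import numpy as np


def r2_score_curves(true_curves, predicted_curves):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    true_curves = np.asarray(true_curves, dtype=float)
    predicted_curves = np.asarray(predicted_curves, dtype=float)
    # Mean curve over time, used as the trivial "predict the mean" baseline.
    mean_curve = true_curves.mean(axis=0)
    residual = np.sum((true_curves - predicted_curves) ** 2)
    total = np.sum((true_curves - mean_curve) ** 2)
    return 1.0 - residual / total


# Toy trajectory: a triangle that grows linearly over five time steps.
times = np.linspace(0.0, 1.0, 5)
base = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
true_curves = np.array([(1.0 + t) * base for t in times])

perfect = true_curves.copy()
constant = np.broadcast_to(true_curves.mean(axis=0), true_curves.shape)
print(r2_score_curves(true_curves, perfect))   # 1.0
print(r2_score_curves(true_curves, constant))  # 0.0
```

A model that predicts every curve exactly scores 1, and a model that always predicts the mean curve scores 0, so a higher score on held-out curves indicates better predictive power.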
If this code is useful to your research, please cite:
doi = {10.48550/ARXIV.2210.01932},
url = {},
author = {Myers, Adele and Miolane, Nina},
keywords = {Machine Learning (cs.LG), FOS: Computer and information sciences},
title = {Regression-Based Elastic Metric Learning on Shape Spaces of Elastic Curves},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
This code runs on Python 3.8. We recommend using Anaconda for easy installation. To create the necessary conda environment, run:
cd dyn
conda env create -f environment.yml
conda activate dyn
We use Wandb to keep track of our runs. To launch a new run, follow the steps below.
1. Set up Wandb logging.
Wandb is a powerful tool for logging performance during training, as well as animation artifacts. To use it, simply create an account, then run:
wandb login
to sign into your account.
Create a new project in Wandb called "metric-learning-TEST".
"" of our program runs through every combination of hyperparameters specified in "". Change any of the following hyperparameters:
- a_true: "a" is the metric parameter used to generate the synthetic shape trajectory. This is the metric parameter that the code is trying to learn.
- n_sampling_points: the number of points in each cell shape
- n_times: the number of cell shapes in the data trajectory
- noise_std: the amount of noise added to the synthetic data (how "noisy" each cell shape is)
- percent_train: percent of the data trajectory used to train the regression model in our code
- percent_val: percent of the data trajectory used to validate the regression model (using the coefficient of determination $R^2$) and learn a* (our code's best estimate of a_true). Note: the rest of the data trajectory that is not used to train or validate the model will be used to test the predictive power of a* regression against the baseline square-root-velocity (SRV) metric regression.
- dataset: you can either test the code on a synthetic geodesic between a circle and an ellipse or a semi-synthetic geodesic between two real cancer cells.
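The way these hyperparameters combine can be sketched in a few lines of Python. Everything below is an assumption for illustration: the values in the grid are made up, `percent_train` and `percent_val` are treated as percentages out of 100, and the train, validation, and test portions are taken as consecutive blocks in time. The actual configuration file and splitting logic in this repository may differ.

```python
import itertools

# Hypothetical values, chosen only for illustration.
hyperparameter_grid = {
    "a_true": [0.5, 1.0],
    "n_sampling_points": [50],
    "n_times": [20, 40],
    "noise_std": [0.0, 0.01],
}


def all_combinations(grid):
    """Return one dict per combination of the listed hyperparameter values."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in itertools.product(*grid.values())]


def split_indices(n_times, percent_train, percent_val):
    """Split time indices into consecutive train, validation, and test blocks."""
    n_train = int(n_times * percent_train / 100)
    n_val = int(n_times * percent_val / 100)
    indices = list(range(n_times))
    # Whatever is left after training and validation is held out for testing.
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


configs = all_combinations(hyperparameter_grid)
train, val, test = split_indices(n_times=20, percent_train=60, percent_val=30)
print(len(configs), len(train), len(val), len(test))
```

Each dict in `configs` corresponds to one run, so adding more values to any list multiplies the total number of runs.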
For a single run, use the command:
This will initiate runs with every combination of hyperparameters detailed in
You can see all of your runs by logging into the Wandb webpage and looking under your project name "metric-learning-TEST". Our code automatically names each run as