# fomo - Figuring out Machine Learning Ourselves

Fomo is an ongoing "machine learning bootcamp", built on the observation that ML is gaining traction while, collectively, we are not yet well aware of its strengths and weaknesses when applied in a biological/evolutionary context. To that end, we have established a collective monthly ML hackathon. We meet once a month, sit down together, and hash through different ML paradigms: we think about how they can be applied to the kinds of data we are generating and the questions we have, collectively work through tutorials, and maybe even do some preliminary exploratory analysis. For example, one month we might evaluate boosted regression: What are the core ideas behind it? How do you fit such a model in R and Python (see the sketch below)? How can it be applied to genetic, morphological, abundance, or phylogenetic data?
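As a taste of that kind of exercise, here is a minimal boosted-regression sketch in Python using scikit-learn. The synthetic data and the `GradientBoostingRegressor` settings are illustrative assumptions, not an agreed-on recipe.

```python
# Minimal boosted-regression sketch with scikit-learn. The synthetic
# "response ~ predictors" data stands in for real genetic/morphological/
# abundance measurements and is purely illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 5))  # e.g. five environmental predictors
y = 2 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.3, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05)
model.fit(X_train, y_train)
print("held-out R^2:", r2_score(y_test, model.predict(X_test)))
```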

## Pre-processing/Feature Engineering

## Unsupervised methods

## Gradient boosting & Random forest

TL;DR: Gradient boosting tends to yield better predictive accuracy, but random forest is easier to tune and often faster to train. Additionally, gradient boosting can struggle (overfit) when the training data are noisy.
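A quick way to see the trade-off is to cross-validate both models side by side. The sketch below uses scikit-learn on synthetic noisy data; the dataset, hyperparameters, and noise level are illustrative assumptions.

```python
# Side-by-side comparison of random forest vs. gradient boosting on a
# synthetic regression task with substantial label noise.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=1000, n_features=10, noise=25.0, random_state=0)

# Random forest has few knobs that matter much beyond n_estimators.
rf = RandomForestRegressor(n_estimators=200, random_state=0)
# Gradient boosting is more sensitive to learning_rate / n_estimators tuning.
gb = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05, random_state=0)

for name, model in [("random forest", rf), ("gradient boosting", gb)]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean CV R^2 = {scores.mean():.3f}")
```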

### Gradient Boosting

## Deep learning

### Deep learning frameworks

- fastai: a Python wrapper around PyTorch focused on making the construction and training of neural networks fast and easy. Good documentation and examples (focused on vision, text classification, and tabular datasets).
- keras: a high-level Python deep learning library. Tons of excellent examples in the repo (a minimal sketch follows this list).
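To give a feel for the Keras API, here is a minimal sketch of a small fully connected network for a tabular binary-classification task. The random data, layer sizes, and training settings are illustrative assumptions, not a recommendation.

```python
# Minimal Keras sketch: a tiny dense network on toy 20-feature tabular data.
import numpy as np
from tensorflow import keras

X = np.random.rand(256, 20).astype("float32")  # toy tabular features
y = np.random.randint(0, 2, size=(256,))       # toy binary labels

model = keras.Sequential([
    keras.layers.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```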

## Time series

## Other ML-like applications (with special attention to biodiversity analysis)