This repository gathers the material for the SIB "Introduction to Machine Learning" course. Its goal is to develop a largely language-agnostic course with both an R and a Python implementation; at the moment, only the Python implementation is done.
The course is aimed at life scientists who are already familiar with the Python programming language and who have a basic knowledge of statistics.
To follow the course, you need to have Python and Jupyter notebooks (a web-based notebook system for creating and sharing computational documents) installed.
If you need help with the installation, you can refer to these tips and instructions (NB: this links to another GitHub repository).
In addition, there are a number of libraries to install. See the instructions on installing prerequisite libraries.
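Once the libraries are installed, a quick check can confirm that they are importable from the environment you will use for the notebooks. The sketch below is only an illustration: the library names in `ASSUMED_LIBRARIES` are an assumption (this README does not list the prerequisites), and `missing_libraries` is a hypothetical helper, not part of the course code. Adjust the list to match the installation instructions.

```python
from importlib.util import find_spec

# Assumed library list: this README does not name the prerequisites,
# so edit these names to match the course's installation instructions.
ASSUMED_LIBRARIES = ["numpy", "pandas", "matplotlib", "sklearn"]


def missing_libraries(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_libraries(ASSUMED_LIBRARIES)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```

Run it with the same Python interpreter (or Jupyter kernel) that you plan to use for the course, since a library installed in another environment will not be visible.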
The course is organized into several numbered Jupyter notebooks, each corresponding to a chapter that interleaves theory, code demos, and exercises.
Following the course does not require any particular expertise with Jupyter notebooks, but if this is the first time you encounter them, we recommend this gentle introduction.
- Chapter0 : Python warm-up: a gentle warm-up on the basic usage of the libraries that are prerequisites for this course. You can use this notebook to assess your level before the course, or just as a way to get (re-)acquainted with these libraries.
- Chapter1 : Exploratory analysis and unsupervised learning
- Chapter2 : Machine Learning routine - distance-based model for classification
- Chapter3 : Machine Learning based on decision trees for classification
- Chapter4 : Machine Learning for regression
- Chapter Extra : Teasing Neural Network
Solutions to each practical can be found in the solutions/ folder and should be loadable directly in the Jupyter notebooks themselves.
Note also the utils.py and utils2.py files, which contain many utility functions we use to visually showcase the effect of various ML algorithms' hyper-parameters.
- data : contains the datasets
- exam : contains the data and instructions for an optional exam
- images : images generated or used in the notebooks
- python_notebooks
- references : references to the articles associated with some of the datasets
- slides : pptx / pdf of introductory slides
Please cite as :
Mueller, M., & Duchemin, W. (2023, October 25). Introduction to Machine Learning - SIB training. Zenodo. https://doi.org/10.5281/zenodo.10039454