Introduction to DaCapo
DaCapo is a framework for easily accessing established machine learning approaches to help identify objects (e.g., organelles) in large, multi-dimensional images. Its goal is to bring the power of machine learning tools to all biologists, regardless of how much (or little) experience they have using these techniques.
In simple terms, machine learning involves passing input data through a multi-layer neural network. Each layer of the network comprises many processing units (neurons). The first layer of the network receives input, such as an image, and each processing unit assigns an initially random value to its input. This value is then passed to several units in the second layer.
Each second-layer unit collects input from multiple first-layer units, giving different weight to different inputs. It combines the weighted inputs into a single value, which it sends to several units in the next layer. This process continues until the information has passed through all the layers.
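The layer-by-layer flow described above can be sketched in a few lines of plain Python. This is an illustration of the general idea only, not DaCapo's code: the helper names `make_layer` and `forward`, the layer sizes, and the use of a ReLU step (clipping negative values to zero) are all assumptions made for the example.

```python
import random


def make_layer(n_inputs, n_units, rng):
    """Build one layer: each unit holds one randomly initialised weight per input."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_units)]


def forward(layers, inputs):
    """Pass the inputs through every layer in turn and return the final values."""
    values = list(inputs)
    for layer in layers:
        # Each unit combines its weighted inputs into a single value.
        # Negative sums are clipped to zero (ReLU), an assumed choice.
        values = [
            max(0.0, sum(w * v for w, v in zip(unit_weights, values)))
            for unit_weights in layer
        ]
    return values


rng = random.Random(0)  # fixed seed so the sketch is repeatable
layers = [make_layer(4, 3, rng), make_layer(3, 2, rng)]  # 4 inputs -> 3 units -> 2 units
output = forward(layers, [0.2, 0.5, 0.1, 0.9])
print(output)
```

Real networks use far more units and operate on whole image volumes at once, but the principle of repeated weighted sums is the same.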
The output of the final layer is a prediction (guess) about the input, in the form required by the task. For example, it might predict which voxels in the image are part of a mitochondrion. This prediction is compared to the ground truth (correct answer), which is provided by the user. This comparison produces a loss value, which estimates the amount of error. This information is then used to adjust the weights between units, before the input is passed through the network again.
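To make the compare-and-adjust step concrete, here is a minimal sketch for a single unit with two weights. It assumes a squared-error loss and a plain gradient-descent update; the text does not say which loss or update rule DaCapo uses, and the names `predict`, `loss`, and `update` are hypothetical.

```python
def predict(weights, inputs):
    """Weighted sum of the inputs: the unit's prediction."""
    return sum(w * x for w, x in zip(weights, inputs))


def loss(prediction, truth):
    """Squared error between the prediction and the ground truth (assumed loss)."""
    return (prediction - truth) ** 2


def update(weights, inputs, truth, learning_rate=0.05):
    """Nudge each weight in the direction that reduces the squared error."""
    error = predict(weights, inputs) - truth
    # The gradient of the squared error with respect to each weight is 2 * error * input.
    return [w - learning_rate * 2 * error * x for w, x in zip(weights, inputs)]


weights = [0.5, -0.3]
inputs = [1.0, 2.0]
truth = 1.0  # ground truth supplied by the user

loss_before = loss(predict(weights, inputs), truth)
weights = update(weights, inputs, truth)
loss_after = loss(predict(weights, inputs), truth)
print(loss_before, loss_after)
```

After one adjustment the loss on this input is lower than before, which is the behaviour the paragraph above describes.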
Each passage through the network is called an iteration. A run comprises many iterations.
Periodically during a run, validation steps are performed to check that the model the neural network has developed on the training data also performs the task well on new data.
As a run progresses, the loss scores should go down while the validation scores go up. Looking at these scores will help you determine which iteration performed the task most accurately.
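The relationship between iterations, periodic validation, and choosing the best iteration can be sketched as a toy training loop. Everything here is an assumption made for illustration: the one-weight model, the data, the `validate_every` interval, and the validation score defined as `1 / (1 + validation loss)`. DaCapo's own evaluators and scores are not shown.

```python
def train(train_data, val_data, iterations=200, validate_every=20, learning_rate=0.01):
    """Train a one-weight model and record a validation score at regular intervals."""
    weight = 0.0
    history = []  # (iteration, validation score) pairs
    for iteration in range(1, iterations + 1):
        # One iteration: one training example passed through the model.
        x, y = train_data[(iteration - 1) % len(train_data)]
        error = weight * x - y
        weight -= learning_rate * 2 * error * x  # gradient step on squared error

        if iteration % validate_every == 0:
            # Validation uses data the model was not trained on.
            val_loss = sum((weight * vx - vy) ** 2 for vx, vy in val_data) / len(val_data)
            history.append((iteration, 1.0 / (1.0 + val_loss)))  # higher is better
    return history


train_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
val_data = [(4.0, 8.0), (5.0, 10.0)]

history = train(train_data, val_data)
best_iteration, best_score = max(history, key=lambda item: item[1])
print(best_iteration, best_score)
```

In this toy case the scores only improve, so a late iteration wins. In real runs the validation score can peak and then decline, which is why the scores are worth inspecting rather than simply taking the last iteration.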
DaCapo helps you configure and train established machine learning models to identify and label different objects in three-dimensional images. After a training run, DaCapo helps you identify the best learning model for your task, which you can then use to process your entire dataset.
- Preliminary step: produce ground truth data for training
- Configure a run
- Perform and evaluate the run
- Post-processing steps
- Use your results