The goal of this project is to build an AI that segments selected categories of road objects in real time. Each video frame is segmented independently of the others.
I used the following models to segment the video:
- BiSeNet V2 - paper - pretrained model
  Result: Mean Intersection over Union = 54%, Loss = 0.23
- Attention R2U-Net - pretrained model
  Result: Mean Intersection over Union = 55%, Loss = 0.21
- DDRNet - paper
- TMANet - paper
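For readers unfamiliar with the metric reported above, here is a minimal NumPy sketch of how Mean Intersection over Union can be computed from a predicted mask and a ground-truth mask. The function name `mean_iou` and the choice to skip classes absent from both masks are assumptions made for illustration; the evaluation code in this project may differ.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in the prediction or the target.

    `pred` and `target` hold integer class indices in [0, num_classes).
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Confusion matrix: rows are true classes, columns are predicted classes.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    # Assumption: classes that appear in neither mask are left out of the mean.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


# Tiny example: class 0 has IoU 1/2, class 1 has IoU 2/3, class 2 is absent.
print(round(mean_iou([[0, 1], [1, 1]], [[0, 1], [0, 1]], num_classes=3), 3))  # 0.583
```

In practice the confusion matrix is usually accumulated over the whole validation set before the per-class IoU is taken, rather than averaged image by image.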
I trained each of these models for about 48 hours on an i7-7700K, a 6 GB GTX 1060 and 28 GB of RAM.
To test the models on a video, you can use the UI.
First, install the required packages:
pip install -r requirements.txt
Then, start the UI:
python segmentation.py [Video Folder Path]
To train a model, you first need to download the A2D2 and Mapillary Vistas datasets.
Then, install the required packages:
pip install -r requirements.txt
After that, you might need to change some constants (dataset folders, epochs, learning rate, WandB, ...) in the file train.py:
code train.py
Finally, start the training:
python train.py
These are the categories the AI was trained to segment:
# | Name | Color |
---|---|---|
1 | Road | |
2 | Lane | |
3 | Crosswalk | |
4 | Curb | |
5 | Sidewalk | |
6 | Traffic Light | |
7 | Traffic Sign | |
8 | Person | |
9 | Bicycle | |
10 | Bus | |
11 | Car | |
12 | Motorcycle | |
13 | Truck | |
14 | Sky | |
15 | Nature | |
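As an illustration of how the categories in the table above can be rendered on a frame, here is a rough sketch that maps a mask of class indices to colors and blends it over the image. The names `CLASS_NAMES`, `make_placeholder_palette`, `colorize` and `overlay` are hypothetical, the palette is randomly generated and does not reproduce the project's actual colors, and the 0-based indexing (table row 1 is index 0) is an assumption.

```python
import numpy as np

# Category names in table order; assumption: table row 1 maps to index 0.
CLASS_NAMES = [
    "Road", "Lane", "Crosswalk", "Curb", "Sidewalk",
    "Traffic Light", "Traffic Sign", "Person", "Bicycle", "Bus",
    "Car", "Motorcycle", "Truck", "Sky", "Nature",
]


def make_placeholder_palette(num_classes, seed=0):
    """Return one RGB color per class. Placeholder colors, not the real ones."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 256, size=(num_classes, 3)).astype(np.uint8)


def colorize(mask, palette):
    """Turn an (H, W) array of class indices into an (H, W, 3) RGB image."""
    return palette[np.asarray(mask)]


def overlay(frame, color_mask, alpha=0.5):
    """Blend the colored mask over the frame; alpha=1.0 shows only the mask."""
    blended = (1 - alpha) * frame.astype(np.float32) + alpha * color_mask.astype(np.float32)
    return blended.astype(np.uint8)


palette = make_placeholder_palette(len(CLASS_NAMES))
mask = np.array([[0, 14], [3, 3]])            # Road, Nature, Curb, Curb
frame = np.zeros((2, 2, 3), dtype=np.uint8)   # stand-in for a video frame
result = overlay(frame, colorize(mask, palette), alpha=0.5)
print(result.shape)  # (2, 2, 3)
```

Indexing the palette with the mask colors every pixel in one vectorized step, which keeps the per-frame overhead low.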
The AI was trained using a mix of two datasets: A2D2 and Mapillary Vistas.
List of tools I used :