Identification Of Gunshots In Passive Acoustic Monitoring Recordings Using Convolutional Neural Networks
This code is part of my bachelor's thesis on gunshot identification in passive acoustic monitoring. The project aimed to improve the accuracy of gunshot detection in environmental audio signals by exploring Convolutional Neural Network (CNN) architectures and several pre-processing techniques: denoising, augmentation, and feature extraction. The primary objective was to develop a CNN-based model that surpasses the current Data Template Detector from the Elephant Listening Project (ELP), particularly in terms of precision and recall for gunshot identification.
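For reference, precision and recall can be computed from binary predictions as in the following minimal sketch. The function name and the toy labels are hypothetical and are not taken from the thesis code or data.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = gunshot)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # gunshots correctly detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # background correctly rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed gunshots
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels for illustration only (not results from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision penalizes false alarms, while recall penalizes missed gunshots, which is why both are reported alongside accuracy.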
The results show that a CNN built with Bayesian hyperparameter optimization, combined with data augmentation, low-pass filtering with a cutoff frequency of 1500 Hz, and extraction of the deltas and delta-deltas of Mel-frequency cepstral coefficients as pre-processing techniques, yields a model that predicts the presence of a gunshot with an accuracy of 99%, a precision of 100%, and a recall of 96%.
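Two of the pre-processing steps mentioned above, low-pass filtering at 1500 Hz and delta features, can be illustrated with the rough sketch below. It is not the thesis implementation: the filter design (a Hamming-windowed sinc FIR filter), the 8000 Hz sample rate, the number of taps, the regression-based delta formula, and all function names are assumptions chosen for illustration. The toy matrix stands in for real MFCCs, which would normally come from an audio library.

```python
import numpy as np


def lowpass_fir(signal, sample_rate, cutoff_hz=1500.0, num_taps=101):
    """Low-pass filter a 1-D signal with a Hamming-windowed sinc FIR filter."""
    if not 0 < cutoff_hz < sample_rate / 2:
        raise ValueError("cutoff_hz must lie between 0 and the Nyquist frequency")
    # Ideal low-pass impulse response, centred so the filter has linear phase.
    n = np.arange(num_taps) - (num_taps - 1) / 2
    fc = cutoff_hz / sample_rate  # cutoff in cycles per sample
    taps = 2 * fc * np.sinc(2 * fc * n)
    taps *= np.hamming(num_taps)  # taper to reduce stopband ripple
    taps /= taps.sum()            # unity gain at 0 Hz
    # mode="same" keeps the output aligned with the input for an odd tap count.
    return np.convolve(signal, taps, mode="same")


def delta(features, width=2):
    """Regression-based deltas along the time axis of a (n_coeffs, n_frames) matrix."""
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[1]
    padded = np.pad(features, ((0, 0), (width, width)), mode="edge")
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = np.zeros_like(features)
    for k in range(1, width + 1):
        ahead = padded[:, width + k : width + k + n_frames]
        behind = padded[:, width - k : width - k + n_frames]
        out += k * (ahead - behind)
    return out / denom


# Toy signal: a 300 Hz tone (kept) plus a 3000 Hz tone (removed by the filter).
sample_rate = 8000  # assumed sample rate, for illustration only
t = np.arange(sample_rate) / sample_rate
low = np.sin(2 * np.pi * 300 * t)
high = np.sin(2 * np.pi * 3000 * t)
mixed = low + high
filtered = lowpass_fir(mixed, sample_rate)

# Toy feature matrix standing in for MFCCs: one coefficient rising by 3 per frame.
toy_features = 3.0 * np.arange(10.0)[np.newaxis, :]
d1 = delta(toy_features)  # deltas
d2 = delta(d1)            # delta-deltas
```

Away from the signal edges, `filtered` closely follows the 300 Hz component. The deltas approximate the frame-to-frame slope of each coefficient, and the delta-deltas approximate its curvature.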
For a detailed understanding of the research and methodologies employed in this project, you can refer to the report.
For the defence of my work, you can refer to the presentation slides.
Rosamelia Carioni Porras
Main supervisor: Pietro Bonizzi
Second supervisor: Marijn ten Thij
To install all the required packages, run:
pip install -r requirements.txt
The files are organized as follows:
- The generation of clips from the 24-hour recordings provided by ELP can be found in the data_preprocessing folder. The clips were generated using data frames with timestamps provided by the organization. Some of the recordings were downloaded from AWS, and others were provided by the organization directly.
- The methods used in training and evaluating the different models can be found in the methods_audio folder. They include the generation of new data, the application of different denoising techniques, the generation of plots from signals, etc.
- The models folder contains a class that returns a specific model (architecture) based on what the caller requests.
- Different model architectures were built with hyperparameter tuning, specifically Bayesian optimization. They were built in the Jupyter notebooks named build_model_*.
- To answer the research questions of this work, different experiments were performed. For more details about the experiments, refer to the experiment_* notebooks.
- After the experiments had established which pre-processing techniques are best for identifying gunshots in environmental sounds, a final set of 4 models was trained and then tested on the test set provided by the organization. The training code can be found in the Jupyter notebook train_final_models.ipynb, and the testing code in results_test_set.ipynb.
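The timestamp-based clip generation mentioned in the list above can be sketched roughly as follows. This is a hypothetical illustration, not the code in data_preprocessing: the function name, the 4-second clip length, the 8000 Hz sample rate, the choice to centre each clip on its timestamp, and the example timestamps are all assumptions.

```python
def clip_bounds(event_times_s, sample_rate, clip_length_s, total_samples):
    """Return (start, end) sample indices of fixed-length clips centred on each event."""
    clip_samples = int(round(clip_length_s * sample_rate))
    if clip_samples > total_samples:
        raise ValueError("clip is longer than the recording")
    bounds = []
    for t in event_times_s:
        centre = int(round(t * sample_rate))
        start = centre - clip_samples // 2
        # Shift the window back inside the recording if it overruns either end.
        start = max(0, min(start, total_samples - clip_samples))
        bounds.append((start, start + clip_samples))
    return bounds


# Hypothetical 24-hour recording and event timestamps (in seconds from the start).
sample_rate = 8000
total_samples = 24 * 60 * 60 * sample_rate
events = [1.0, 3600.5, 86399.9]
bounds = clip_bounds(events, sample_rate, clip_length_s=4.0, total_samples=total_samples)
for start, end in bounds:
    print(start, end)
```

Each (start, end) pair can then be used to slice the loaded recording, so that only the short clips need to be kept for training rather than the full 24-hour files.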
Note that the code here is meant as a reference for what was done in this study and is not meant to be run by others, as not all of the data is publicly available. If you would like to run this code, you will need to obtain the data and update the paths in the code that reference it.