This is a semester-long project for my machine learning class, divided into three parts. Part 1 presents a machine learning pipeline that distinguishes Argentinian accents from other accents in audio samples. It compares logistic regression and random forest classifiers, using the Librosa library for audio feature extraction. Random forest achieved 82% accuracy; future work will focus on feature selection, data augmentation, and potentially convolutional neural networks for improved classification.
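To make the shape of the first pipeline concrete, here is a minimal sketch of how such a comparison could look. It is not the project's actual code: the feature set (spectral centroid, zero-crossing rate, chroma), the function names `extract_features` and `compare_classifiers`, and all hyperparameters are assumptions chosen for illustration, and scikit-learn is assumed as the classifier library.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def extract_features(path, sr=22050):
    """Summarise one audio clip as a fixed-length vector.

    The feature choice here is an assumption, not the project's feature set.
    """
    import librosa  # imported lazily so the classifier part runs without it

    y, sr = librosa.load(path, sr=sr)
    frames = np.vstack([
        librosa.feature.spectral_centroid(y=y, sr=sr),  # shape (1, T)
        librosa.feature.zero_crossing_rate(y),          # shape (1, T)
        librosa.feature.chroma_stft(y=y, sr=sr),        # shape (12, T)
    ])
    # Collapse the time axis: mean and standard deviation per feature row.
    return np.concatenate([frames.mean(axis=1), frames.std(axis=1)])


def compare_classifiers(X, y, cv=5, seed=0):
    """Return mean cross-validated accuracy for both classifiers."""
    models = {
        # Logistic regression is scale-sensitive, so standardise first.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": RandomForestClassifier(
            n_estimators=200, random_state=seed
        ),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }
```

In use, `X` would be the stacked output of `extract_features` over all clips and `y` a binary label (Argentinian or not); `compare_classifiers(X, y)` then returns one accuracy per model, which is the kind of number the 82% figure above refers to.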
In Part 2, a CNN and an RNN are evaluated on the same classification task as in the first part, this time using MFCC (Mel-frequency cepstral coefficient) features of the audio data together with data augmentation techniques.
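As a rough illustration of the Part 2 inputs, the sketch below shows two common waveform augmentations and an MFCC extraction step. The specific augmentations (additive noise and circular time shift), the function names, and the default parameters are assumptions; the project may use different techniques.

```python
import numpy as np


def add_noise(y, snr_db=20.0, rng=None):
    """Add white Gaussian noise at roughly the requested signal-to-noise ratio."""
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return y + rng.normal(0.0, np.sqrt(noise_power), size=y.shape)


def time_shift(y, max_fraction=0.1, rng=None):
    """Circularly shift the waveform by a random offset (samples wrap around)."""
    rng = np.random.default_rng() if rng is None else rng
    limit = int(len(y) * max_fraction)
    return np.roll(y, rng.integers(-limit, limit + 1))


def mfcc_features(y, sr=22050, n_mfcc=13):
    """Return an (n_mfcc, frames) MFCC matrix for one waveform."""
    import librosa  # imported lazily so the augmentations run without it

    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
```

Each augmented waveform would be passed through `mfcc_features`, and the resulting two-dimensional array can be treated as an image-like input for a CNN or as a sequence of frames for an RNN.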
Part 3, which is in progress, uses generative models and transfer learning.