Labs completed for Speech and Speaker Recognition, KTH 2020.
Step-by-step implementation of Mel filterbank and MFCC features. The correlation between these features was evaluated, and utterances were compared using Dynamic Time Warping.
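
As an illustration of the utterance comparison step, here is a minimal Dynamic Time Warping sketch. It is not the lab code: the function name `dtw`, the Euclidean local distance, and the plain-list representation of feature frames are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def dtw(x, y, dist=euclidean):
    """Accumulated DTW distance between two sequences of feature vectors.

    x, y: lists of frames, each frame a list of floats (e.g. MFCCs).
    dist: local distance between two frames (Euclidean is an assumption).
    """
    n, m = len(x), len(y)
    # acc[i][j] = cost of the best alignment of x[:i] with y[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(x[i - 1], y[j - 1])
            # Allowed steps: vertical, horizontal, diagonal
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

Two utterances that differ only in timing, such as `[[0.0], [1.0], [2.0]]` and `[[0.0], [1.0], [1.0], [2.0]]`, align at zero cost under this sketch.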
Algorithms for the evaluation and decoding of Hidden Markov Models were implemented and used to perform isolated word recognition. Algorithms for training Gaussian HMMs were also implemented.
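
The decoding step can be sketched with a log-domain Viterbi pass. This assumes decoding means finding the single best state sequence; the function name `viterbi` and its argument layout (per-frame log emission likelihoods, log start probabilities, log transition matrix) are hypothetical and not taken from the lab code.

```python
import math


def viterbi(log_emlik, log_startprob, log_transmat):
    """Best state path and its log probability for one utterance.

    log_emlik[t][j]    : log likelihood of frame t under state j
    log_startprob[j]   : log probability of starting in state j
    log_transmat[i][j] : log probability of moving from state i to j
    Use -math.inf for zero probabilities.
    """
    n_frames, n_states = len(log_emlik), len(log_startprob)
    delta = [[-math.inf] * n_states for _ in range(n_frames)]
    backptr = [[0] * n_states for _ in range(n_frames)]

    for j in range(n_states):
        delta[0][j] = log_startprob[j] + log_emlik[0][j]

    for t in range(1, n_frames):
        for j in range(n_states):
            # Best predecessor state for state j at time t
            best = max(range(n_states),
                       key=lambda i: delta[t - 1][i] + log_transmat[i][j])
            delta[t][j] = delta[t - 1][best] + log_transmat[best][j] + log_emlik[t][j]
            backptr[t][j] = best

    # Backtrack from the most likely final state
    last = max(range(n_states), key=lambda j: delta[-1][j])
    path = [last]
    for t in range(n_frames - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    path.reverse()
    return delta[-1][last], path
```

For isolated word recognition, one would typically score the utterance against each word model and pick the model with the highest log probability.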
Predefined Gaussian-emission HMM phonetic models were used to create time-aligned phonetic transcriptions of the TIDIGITS database. A DNN model for phoneme recognition was trained using Keras and evaluated with a frame-by-frame recognition score.
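
A frame-by-frame score can be computed roughly as follows. This sketch assumes the score is plain frame accuracy, that the network outputs one posterior vector per frame, and that the targets are the integer state or phoneme indices from the forced alignment; the name `frame_accuracy` is hypothetical.

```python
def frame_accuracy(posteriors, targets):
    """Fraction of frames whose most probable class matches the target.

    posteriors: list of per-frame class probability lists (network output)
    targets   : list of integer class indices, one per frame
    """
    if len(posteriors) != len(targets):
        raise ValueError("posteriors and targets must have the same length")
    # Arg-max over classes for each frame
    predicted = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in posteriors]
    correct = sum(p == t for p, t in zip(predicted, targets))
    return correct / len(targets)
```

With a Keras model, the posteriors would come from the model's predictions on the test features, converted to nested lists or compared directly as arrays.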