
Flipkart-GRID-Noise-Cancellation-Solution

Owners: Aditya Das, Nihal John George and Aryan Pandey

This repository contains team Third Degree Burn's solution for Round 3 of the Flipkart GRiD 2.0.

Correction to the video: we meant to say that the audios are padded to a length of 10 seconds, not 15 seconds. Any audio longer than that is cropped to 10 seconds before being fed into the model.
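For reference, a minimal sketch of this pad/crop step, assuming 16 kHz mono audio loaded with librosa (the actual preprocessing lives in the repository's scripts):

    # Pad or crop an audio file to exactly 10 seconds.
    # Assumptions: 16 kHz mono input, librosa available.
    import numpy as np
    import librosa

    TARGET_SECONDS = 10
    SAMPLE_RATE = 16000  # assumed sample rate

    def pad_or_crop(path):
        audio, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
        target_len = TARGET_SECONDS * SAMPLE_RATE
        if len(audio) >= target_len:
            return audio[:target_len]                        # crop longer clips to 10 s
        return np.pad(audio, (0, target_len - len(audio)))   # zero-pad shorter clips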

Link to the Drive folder where we have stored all our predictions on the input files: https://drive.google.com/drive/folders/1ewBhjymSAa-8PkT5S80DMuDDK8_dUK3a?usp=sharing

Link to WER for each file: link

Video link: link

API Usage

Without a GUI

We have provided a Flask-based API that takes in the path to an input file (or a directory containing multiple input files) and the path to an output directory where the processed files will be stored in WAV format. To use it, run the WSGI script in one terminal and then run the interacting script separately to start interacting with the server.

$ cd FlaskNoGUI

$ python wsgi.py

The scripts can be found here:
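As a rough illustration of how a client could talk to the running server, the snippet below assumes the server listens on localhost:5000 and exposes a hypothetical /denoise endpoint accepting JSON with "input_path" and "output_path" keys; check the scripts in FlaskNoGUI/ for the actual route and field names:

    # Hypothetical client sketch; the real interacting script is in FlaskNoGUI/.
    import requests

    payload = {
        "input_path": "samples/noisy_clip.wav",   # file or directory of input audio
        "output_path": "outputs/",                # directory where denoised WAVs are written
    }

    response = requests.post("http://127.0.0.1:5000/denoise", json=payload)
    print(response.status_code, response.text)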

With a GUI

DISCLAIMER - Move the gbl_model.h5 file from the 'FlaskNoGUI/Models/' folder to the 'FlaskGUI/Models/' folder before running the scripts in this section.

We have also made scripts that add a small tkinter GUI on top of the Flask setup. The order in which to run the scripts is the same: first run the WSGI script in a terminal, then run the interacting script separately to start interacting with the server. The only difference is that, on running the interacting script, a pop-up window appears in which you can choose whether to input a single file or a whole directory. In either case, you must select the input directory before the output directory, otherwise the code will not run.

The scripts for this can be found here:
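A rough outline of the GUI flow described above, assuming tkinter dialogs and the same hypothetical /denoise endpoint (the real widget layout is in FlaskGUI/):

    # Sketch only; the actual GUI interacting script lives in FlaskGUI/.
    import tkinter as tk
    from tkinter import filedialog, messagebox
    import requests

    root = tk.Tk()
    root.withdraw()  # hide the empty root window; we only need the dialogs

    single_file = messagebox.askyesno("Input mode", "Process a single file? (No = whole directory)")
    if single_file:
        input_path = filedialog.askopenfilename(title="Select input audio file")
    else:
        input_path = filedialog.askdirectory(title="Select input directory")

    # The output directory must be chosen after the input, as noted above.
    output_path = filedialog.askdirectory(title="Select output directory")

    requests.post("http://127.0.0.1:5000/denoise",
                  json={"input_path": input_path, "output_path": output_path})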

Model Building

The following are the scripts we used to build our model:

Datasets

We created our data manually: around 30 files containing only background noise and around 30 files containing clean voice. We then generated our dataset by mixing each clean audio with each background noise to create a new audio file, which gave us roughly 1000 audio files. We also added a function that scales the noise to produce three sets of audio: one with dimmed noise, one with the noise at the same level as the clean audio, and one with amplified noise. In total, this mixing gave us about 3000 audio samples.
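A minimal sketch of this mixing step, assuming librosa and soundfile for I/O and noise scale factors of 0.5 / 1.0 / 2.0 (the exact factors we used are not stated here):

    # Mix one clean file with one noise file at a given scale and write the result.
    # Assumptions: 16 kHz mono audio, librosa and soundfile installed.
    import numpy as np
    import librosa
    import soundfile as sf

    SR = 16000
    SCALES = [0.5, 1.0, 2.0]  # dimmed, equal-level, amplified noise (assumed values)

    def mix(clean_path, noise_path, scale, out_path):
        clean, _ = librosa.load(clean_path, sr=SR, mono=True)
        noise, _ = librosa.load(noise_path, sr=SR, mono=True)
        noise = np.resize(noise, len(clean))           # loop/trim noise to the clean length
        mixed = clean + scale * noise
        mixed /= max(np.max(np.abs(mixed)), 1e-8)      # normalise to avoid clipping
        sf.write(out_path, mixed, SR)

    # Pairing every clean file with every noise file at every scale gives
    # roughly 30 x 30 x 3 ≈ 3000 mixed samples.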

The datasets are currently uploaded to Kaggle and are private; anyone with the following links will be able to access them. The datasets will be made public after the competition is over, if Flipkart allows it.
