MTN-CBAM

MTN-CBAM is a multi-task network for keyword spotting (KWS) and speaker detection. The experiment is described in the paper “MTN-CBAM: Multi-Task Network with Convolutional Block Attention Module for Speaker Related Small-Footprint Keyword Spotting”.

Environment configuration

To run the Python scripts, the following libraries and packages are needed:
* Keras
* Librosa
* Numpy
* Pickle (part of the Python standard library)
* Matplotlib
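
Before running the scripts, it can help to confirm that these packages are importable. The following is a minimal sketch, not part of the repository: the lower-case import names in `REQUIRED_MODULES` and the helper `missing_modules` are assumptions made for illustration.

```python
import importlib.util

# Assumed import names for the packages listed above (hypothetical list,
# not taken from the repository's scripts).
REQUIRED_MODULES = ["keras", "librosa", "numpy", "pickle", "matplotlib"]


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only looks the modules up without importing them, so it is fast and does not verify package versions.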

Data and directory

By default, the Python scripts expect to find two folders inside this one: "HADataset" and "exp". The first contains the hearing aid speech database, which can be freely downloaded from Data Link. The second is the working directory, where all files produced by the scripts are stored.
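
The expected layout can be verified with a short check before launching anything. This is a sketch under the assumption that the scripts are started from the repository root; the helper `missing_folders` is hypothetical and does not exist in the repository.

```python
from pathlib import Path

# Folder names taken from the description above.
EXPECTED_FOLDERS = ["HADataset", "exp"]


def missing_folders(root="."):
    """Return the expected folder names that are not directories under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Found both HADataset and exp.")
```

The check only reports what is missing; it does not download the database or create any folder.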

Thanks to López-Espejo I, Tan Z, Jensen J, et al. for providing this data set: Keyword Spotting for Hearing Assistive Devices Robust to External Speakers[C]. Conference of the International Speech Communication Association, 2019: 3223-3227.

How to run

"run.sh" gives an example of how to run the scripts.

Pre-trained models from the paper

The pre-trained "MTN-CBAM" and "MTN-CBAM-2" models using the "2 * 2" convolution kernel are provided in "exp".
You can test them with "test_cbam.py"; see "run.sh" for the specific command.
