AudioSegmenter

Segments audio for a speech pipeline using JACK audio. Intended for use as a VAD (voice activity detection) stage in a speech recognition pipeline.

Disclaimer

The AudioSegmenter is in development and currently not recommended for use. Requirements will most likely change in the near future: we will move away from JACK audio as the sound framework to a more dedicated framework called esiaf, which is currently being worked on.

Current Segmentation Methods

A basic interface allows for different VAD algorithms. At this point in time, only one is implemented.

Double Threshold Segmenter

Reads audio and writes it back unchanged as long as all of the following hold:

The level has reached db_min
The level does not fall below db_keep_alive for more than time_keep_alive ms
The audio is not longer than time_max ms

Otherwise, it will output a zero signal.
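
The double threshold rule can be illustrated with a minimal sketch. This is not the code from this repository. It assumes block-wise processing, RMS level in dB relative to full scale, and that a new segment may open right after one was cut off by time_max. The names DoubleThresholdConfig, DoubleThresholdSketch and block_level_db are hypothetical; only the four parameter names come from the description above.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical parameter bundle; field names mirror the README parameters.
struct DoubleThresholdConfig {
    float db_min;              // level (dB) that opens a segment
    float db_keep_alive;       // level (dB) that keeps an open segment alive
    float time_keep_alive_ms;  // longest tolerated stretch below db_keep_alive
    float time_max_ms;         // longest allowed segment
};

// RMS level of a block in dB relative to full scale (assumed to be 1.0).
// Empty or near-silent blocks are clamped to -120 dB.
inline float block_level_db(const std::vector<float>& block) {
    if (block.empty()) return -120.0f;
    double sum = 0.0;
    for (float s : block) sum += static_cast<double>(s) * s;
    const double rms = std::sqrt(sum / static_cast<double>(block.size()));
    return rms > 1e-6 ? static_cast<float>(20.0 * std::log10(rms)) : -120.0f;
}

class DoubleThresholdSketch {
public:
    explicit DoubleThresholdSketch(DoubleThresholdConfig cfg) : cfg_(cfg) {}

    // Processes one block in place. Returns true if the block was passed
    // through, false if it was replaced by zeros.
    bool process(std::vector<float>& block, float block_ms) {
        const float db = block_level_db(block);

        // Upper threshold: open a segment once db_min is reached.
        if (!active_ && db >= cfg_.db_min) {
            active_ = true;
            segment_ms_ = 0.0f;
            quiet_ms_ = 0.0f;
        }

        if (active_) {
            // Lower threshold: count how long the level stays below it.
            quiet_ms_ = (db < cfg_.db_keep_alive) ? quiet_ms_ + block_ms : 0.0f;
            segment_ms_ += block_ms;
            if (quiet_ms_ > cfg_.time_keep_alive_ms ||
                segment_ms_ > cfg_.time_max_ms) {
                active_ = false;  // close the segment
            }
        }

        if (!active_) std::fill(block.begin(), block.end(), 0.0f);
        return active_;
    }

private:
    DoubleThresholdConfig cfg_;
    bool active_ = false;
    float segment_ms_ = 0.0f;
    float quiet_ms_ = 0.0f;
};
```

In this sketch, a caller would construct one DoubleThresholdSketch per audio stream and call process() on every incoming block, passing the block duration in milliseconds. Using two thresholds means a short dip in level inside an utterance does not close the segment, while background noise below db_min never opens one.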

Requirements

Jack audio
A reasonably recent gcc (5.4 is tested, but it should in theory work from 4.8 onwards)
SpeechRecPipelineMsgs (https://github.com/Slothologist/SpeechRecPipelineMsgs)
