Georgy Treshchev edited this page Sep 10, 2023 · 21 revisions

Runtime Speech Recognizer Documentation

Runtime Speech Recognizer is an open-source plugin that enables real-time, offline speech recognition. It is based on OpenAI's Whisper technology, specifically the whisper.cpp library, and supports multiple language models, which are pre-selected in the plugin's settings.

This plugin is somewhat experimental and may encounter issues on certain platforms, particularly because of the workarounds required when explicitly defining CPU instruction sets here. To make the plugin work reliably and accurately on these platforms, you may need to adjust the CPU-specific definitions in that file or, ideally, supply them through compiler flags, which unfortunately isn't feasible without modifying the UE source code. Alternatively, you can build the whisper.cpp library manually and include it in precompiled form, but this would require some changes to the plugin code. There are plans to refine the plugin to better accommodate this approach.

How to install

There are two ways to install the plugin:

  1. Through the marketplace.
  2. Manual installation. Select and download the release for the required engine version, then extract the archive into your project's Plugins folder so that you get the following path: "[ProjectName] / Plugins / RuntimeSpeechRecognizer".

On first run, install the language models (a dialog box will appear offering to do this automatically).
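
For the manual installation route, the extraction step can be scripted. The following is only a minimal sketch in Python, not part of the plugin: the function name install_plugin is hypothetical, and it assumes the downloaded release is a .zip archive whose top-level folder is named RuntimeSpeechRecognizer. If the archive is laid out differently, the check at the end fails instead of silently leaving the plugin in the wrong place.

```python
import zipfile
from pathlib import Path

# Folder name taken from the target path "[ProjectName]/Plugins/RuntimeSpeechRecognizer".
PLUGIN_DIR_NAME = "RuntimeSpeechRecognizer"


def install_plugin(archive_path, project_dir):
    """Extract a release archive into <project_dir>/Plugins.

    Assumption: the archive is a .zip file with a top-level
    "RuntimeSpeechRecognizer" folder. Returns the installed plugin path.
    """
    plugins_dir = Path(project_dir) / "Plugins"
    # A project without any plugins may not have a Plugins folder yet.
    plugins_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(plugins_dir)

    plugin_dir = plugins_dir / PLUGIN_DIR_NAME
    if not plugin_dir.is_dir():
        # The archive layout did not match the assumption above.
        raise FileNotFoundError(f"Expected plugin folder at {plugin_dir}")
    return plugin_dir
```

A call such as install_plugin("RuntimeSpeechRecognizer.zip", "MyProject") (both names are placeholders) would return the path MyProject/Plugins/RuntimeSpeechRecognizer once the archive has been extracted.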

Basic description

This plugin provides real-time speech recognition built on the whisper.cpp library. It matches incoming audio data, provided as either streaming or non-streaming input, against pre-trained language models.
