loud_llama is an AI voice assistant that utilizes Whisper for speech recognition, Ollama for large language model (LLM) capabilities, and XTTS for text-to-speech functionality.
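
At a high level, the assistant runs a listen, think, speak loop: Whisper transcribes the microphone input, Ollama generates a reply, and XTTS speaks it. The snippet below is not the project's code (the actual wiring lives in assistant.ipynb); it is only a minimal sketch of that pipeline using the openai-whisper, ollama, and Coqui TTS Python packages, with illustrative model and file names (small.en, llama3, xtts_v2, recording.wav, speaker.wav) chosen for the example:

```python
# Minimal sketch of the Whisper -> Ollama -> XTTS pipeline.
# Not loud_llama's actual code; packages, model names, and file names are assumptions.
import whisper                 # pip install openai-whisper
import ollama                  # pip install ollama (the Ollama service must be running)
from TTS.api import TTS        # pip install TTS (Coqui TTS, which provides XTTS)

stt = whisper.load_model("small.en")                        # speech recognition
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")  # text-to-speech

# 1. Transcribe a recorded utterance.
user_text = stt.transcribe("recording.wav")["text"]

# 2. Ask the LLM served by Ollama for a reply.
reply = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": user_text}],
)["message"]["content"]

# 3. Speak the reply (XTTS needs a reference voice clip and a language code).
tts.tts_to_file(text=reply, speaker_wav="speaker.wav", language="en",
                file_path="reply.wav")
```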
Follow these steps to set up the project:
- Install Python (version 3.11.x is recommended):
  Download it from the official website: [Python Downloads](https://www.python.org/downloads/).
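
  To confirm which interpreter you are running (3.11.x is the recommended series), you can print the version from Python:

  ```python
  # Print the active Python version; 3.11.x is recommended for loud_llama.
  import sys
  print(sys.version)
  ```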
- Download the Project:
  You can either clone the repository or download it as a ZIP file:
  - Clone with Git:
    ```
    git clone https://github.com/GongXiPing/loud_llama.git
    ```
  - Or download as a ZIP: go to Code > Download ZIP on the GitHub page.
- Install Required Packages:
  Navigate to the project directory and install the necessary packages using pip:
  ```
  pip install -r requirements.txt
  ```
- Install PyTorch:
  Download and install a suitable version of PyTorch for your system. Visit the PyTorch website and follow the instructions to select the appropriate configuration (OS, package manager, Python version, and CUDA version). For example, a typical command for installing PyTorch might look like this:
  ```
  pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
  ```
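
  After installing, a quick way to sanity-check the result is to import PyTorch and see whether CUDA is visible; a CPU-only install will print False (the models can still run on the CPU, just more slowly):

  ```python
  # Quick check of the PyTorch install and GPU visibility.
  import torch

  print(torch.__version__)          # installed PyTorch version
  print(torch.cuda.is_available())  # True if a CUDA-capable GPU is usable
  ```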
- Install Whisper for Speech Recognition:
  You can install Whisper either as a package or as an individual model:
  - Install as a Package (Recommended for Beginners):
    - Install it using pip:
      ```
      pip install openai-whisper
      ```
    - Set `using_as_package=True` in the Initialization and Parameters section of your code.
    - Choose a `model_name`:
      - For English only: `model_name = "small.en"`
      - For multilingual support: `model_name = "small"`
      - Consider using a larger model for improved transcription quality.
  - Install as an Individual Model:
    - Download a model from the Whisper series on Hugging Face.
    - Set the `model_path` to your model directory (an absolute path is recommended): `model_path = "path/to/your/model/whisper-small.en/"`
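
  The two options above roughly correspond to two different ways of loading the model. The sketch below is not loud_llama's own loader (that lives in the notebook's Initialization and Parameters section); it only illustrates, as an assumption, what `model_name` versus `model_path` typically select:

  ```python
  # Illustrative only: how the two install options are commonly loaded.

  # Option A: Whisper installed as a package (using_as_package=True).
  import whisper
  model = whisper.load_model("small.en")                      # model_name
  text_a = model.transcribe("recording.wav")["text"]

  # Option B: an individual checkpoint downloaded from Hugging Face,
  # loaded here via the transformers pipeline (one common approach,
  # which requires the transformers package).
  from transformers import pipeline
  asr = pipeline("automatic-speech-recognition",
                 model="path/to/your/model/whisper-small.en/")  # model_path
  text_b = asr("recording.wav")["text"]
  ```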
- Install Ollama for LLM Service:
  Follow the instructions on the Ollama GitHub page.
To use loud_llama:
- Start the Ollama Service:
  - Open a terminal and run the following command to start the Ollama service:
    ```
    ollama serve
    ```
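
  Once the service is running, you can optionally confirm it responds before launching the assistant. This check uses Ollama's local HTTP API and assumes a model has already been pulled (llama3 below is only an example, e.g. via `ollama pull llama3`):

  ```python
  # Optional sanity check: ask the local Ollama service for a short reply.
  import requests  # pip install requests

  resp = requests.post(
      "http://localhost:11434/api/generate",
      json={"model": "llama3", "prompt": "Say hello.", "stream": False},
      timeout=120,
  )
  print(resp.json()["response"])
  ```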
- Start the Assistant:
  - If you have an IDE that supports .ipynb files, open assistant.ipynb and run all the cells.
  - Alternatively, you can run it from the command line:
    ```
    ipython assistant.ipynb
    ```
- Interact with the Assistant:
  - Speak into the microphone when "Recording..." is displayed.
  - Wait for the assistant's vocal response.
  - To pause the conversation, simply stop speaking.
  - Press Enter to resume the conversation, type 'new' to start a new thread, or type 'bye' to exit.
Please note that loud_llama has so far been tested only on Windows with Python 3.11.9.
If you encounter any problems while using the program, please feel free to post an issue in the GitHub repository, and we will do our best to assist you.