
Chinese is not recognized with the default model #79

luobendewugong opened this issue Sep 4, 2024 · 10 comments

@luobendewugong

Hello, thanks for your work. I have two problems with the default model and would like to ask about them:

  1. It can only recognize English. Where do I need to set it to recognize Chinese, or do I need to replace it with another model?
  2. The recognition quality is not good and the spoken answers are not fluent. Do I need to switch to a different TTS model? The default model is actually quite big.

Thank you so much!

@Lbaiall

Lbaiall commented Sep 4, 2024

You should change the model to a different one from the HF Hub.

@andimarafioti
Member

Using the code from this PR: #60

You can call the system with:
python s2s_pipeline.py --recv_host 0.0.0.0 --send_host 0.0.0.0 --lm_model_name meta-llama/Meta-Llama-3.1-8B-Instruct --init_chat_role system --tts melo --stt_model_name openai/whisper-large-v3 --language zh

@andimarafioti
Member

There, Whisper is larger than the distil version (but it works for Chinese). The LLM is larger (but it works for Chinese, and you can swap it for another one). The TTS is smaller than the default (and it works for Chinese).
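
If the 8B LLM is too heavy, the same flags from the command above can be reused with a smaller Chinese-capable chat model from the Hub. A minimal sketch, assuming Qwen/Qwen2-1.5B-Instruct as the substitute (just one example, not something tested in this thread):

```
# Sketch: same invocation as above, only --lm_model_name is swapped.
# Qwen/Qwen2-1.5B-Instruct is an assumed example of a smaller
# Chinese-capable chat model; any similar Hub model should work here.
python s2s_pipeline.py \
  --recv_host 0.0.0.0 \
  --send_host 0.0.0.0 \
  --lm_model_name Qwen/Qwen2-1.5B-Instruct \
  --init_chat_role system \
  --tts melo \
  --stt_model_name openai/whisper-large-v3 \
  --language zh
```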

@andimarafioti andimarafioti self-assigned this Sep 4, 2024
@andimarafioti
Member

Let me know if it works :)

@andimarafioti
Member

I merged the PR for multiple languages, so you should be able to run with the code in main.

@Kong4Git

Kong4Git commented Sep 4, 2024

Hi, thanks for your work. I encountered an issue while running the code from your repository on my Mac. The error I received is as follows: ValueError: Please select a valid model

The error occurs when initializing the LightningWhisperMLX model with the following command:
python s2s_pipeline.py --local_mac_optimal_settings --device mps --lm_model_name meta-llama/Meta-Llama-3.1-8B-Instruct --init_chat_role system --tts melo --stt_model_name openai/whisper-large-v3 --language zh

could you please provide some guidance on what might be causing this issue or suggest any potential solutions?

Thank you very much for your help!

@andimarafioti
Member

You can run it on mac with:

python s2s_pipeline.py  --device mps --lm_model_name meta-llama/Meta-Llama-3.1-8B-Instruct --init_chat_role system --tts melo --stt_model_name openai/whisper-large-v3 --language zh --mode local

But we still haven't made the changes to the MLX classes to support Chinese, so generation will be quite slow.

@andimarafioti
Member

If you want to make the changes, we welcome PRs! Otherwise I'll adapt it in the coming days.

@luobendewugong
Author

Thank you very much, I can use Chinese now, but there are three more things I want to ask:

  1. Modifying 'init_chat_prompt' in 'LLM/language_model.py' doesn't seem to have any effect; no matter how I modify it, the LLM's answers don't change.
  2. '--language None' seems to be related to the language model. I am using qwen2-1.5b; when running without the '--language None' mode, I have to choose the language for the output.
  3. Can I use a model in GGUF format?

@andimarafioti
Member

  1. For the init_chat_prompt to take effect, you also need to set init_chat_role (see the sketch after this list).
  2. We changed it to '--language auto' because we thought it was more intuitive. In any case, it's related to everything: setting '--language auto' makes everything automatic, and setting '--language zh' should make everything Chinese.
  3. Do you mean for the LLM? I think you should be able to. Try it out and report back to me!
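
A minimal sketch of point 1, assuming the CLI exposes an --init_chat_prompt flag mirroring the init_chat_prompt setting in LLM/language_model.py (if it doesn't, editing that file and passing --init_chat_role should have the same effect):

```
# Sketch only: --init_chat_prompt as a CLI flag and the prompt text are
# assumptions, not confirmed in this thread; --init_chat_role must be set
# for the prompt to take effect.
python s2s_pipeline.py \
  --recv_host 0.0.0.0 \
  --send_host 0.0.0.0 \
  --lm_model_name meta-llama/Meta-Llama-3.1-8B-Instruct \
  --init_chat_role system \
  --init_chat_prompt "You are a helpful assistant. Answer in Chinese." \
  --tts melo \
  --stt_model_name openai/whisper-large-v3 \
  --language zh
```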
