Self Checks
- [x] This template is only for bug reports. For questions, please visit Discussions.
- [x] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please be sure to submit issues in English, or they will be closed. Thank you! :)
- [x] Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Docker)
Environment Details
Latest Docker image (sha256:40c9620c1dfd8efb1063a5826e72243efafc5f19784af5fa0603238e06b7dd62)
I have an RTX 3090 Ti with CUDA (inference works in the Gradio UI).
Steps to Reproduce
Run `api_server.py` with these arguments:
--listen 0.0.0.0:8080 --llama-checkpoint-path "checkpoints/fish-speech-1.5" --decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" --decoder-config-name firefly_gan_vq --compile
✔️ Expected Behavior
The API server should start up and complete the warm-up inference without errors.
❌ Actual Behavior
The warm-up already fails with an error in the `generate` method where it calls `decode_n_tokens`:
2024-12-21 03:12:34.092 | INFO | tools.vqgan.inference:load_model:43 - Loaded model: <All keys matched successfully>
2024-12-21 03:12:34.092 | INFO | tools.server.model_manager:load_decoder_model:108 - Decoder model loaded.
2024-12-21 03:12:34.126 | INFO | tools.llama.generate:generate_long:789 - Encoded text: Hello world.
2024-12-21 03:12:34.127 | INFO | tools.llama.generate:generate_long:807 - Generating sentence 1/1 of sample 1/1
  0%|          | 0/1023 [00:00<?, ?it/s]
/usr/local/lib/python3.12/contextlib.py:105: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
0%| | 0/1023 [00:05<?, ?it/s]
ERROR: Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/lifespan.py", line 36, in __call__
    await result
  File "/opt/fish-speech/tools/api_server.py", line 77, in initialize_app
    app.state.model_manager = ModelManager(
                              ^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/server/model_manager.py", line 66, in __init__
    self.warm_up(self.tts_inference_engine)
  File "/opt/fish-speech/tools/server/model_manager.py", line 122, in warm_up
    list(inference(request, tts_inference_engine))
  File "/opt/fish-speech/tools/server/inference.py", line 25, in inference_wrapper
    raise HTTPException(
baize.exceptions.HTTPException: (<HTTPStatus.INTERNAL_SERVER_ERROR: 500>, 'Failed running call_function <built-in method empty_like of type object at 0x7fa3d7c4b5e0>(*(FakeTensor(..., device=\'cuda:0\', size=(102048,), dtype=torch.bfloat16),), **{}):
Cannot set version_counter for inference tensor

from user code:
  File "/opt/fish-speech/tools/llama/generate.py", line 266, in decode_one_token_ar
    sample(
  File "/opt/fish-speech/tools/llama/generate.py", line 135, in sample
    idx_next = multinomial_sample_one_no_sync(probs)
  File "/opt/fish-speech/tools/llama/generate.py", line 52, in multinomial_sample_one_no_sync
    q = torch.empty_like(probs_sort).exponential_(1)

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

You can suppress this exception and fall back to eager by setting:
  import torch._dynamo
  torch._dynamo.config.suppress_errors = True
')
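For context, the failing line in the traceback is the exponential-sampling trick in `multinomial_sample_one_no_sync`, traced by `torch.compile` on what appears to be a tensor created under `torch.inference_mode()`. Below is a minimal standalone sketch of what I believe the failing interaction looks like; this is an assumption on my part, not fish-speech's actual call path (only the sampling pattern, size, and dtype are copied from the traceback):

```python
# Hedged repro sketch: compile the same sampling pattern as
# tools/llama/generate.py and call it on an inference-mode tensor.
import torch

@torch.compile
def sample_one(probs_sort: torch.Tensor) -> torch.Tensor:
    # Same pattern as multinomial_sample_one_no_sync in the traceback.
    q = torch.empty_like(probs_sort).exponential_(1)
    return torch.argmax(probs_sort / q, dim=-1, keepdim=True).to(dtype=torch.int)

with torch.inference_mode():
    # size=(102048,) and dtype=torch.bfloat16 taken from the FakeTensor
    # in the error message.
    logits = torch.randn(102048, device="cuda", dtype=torch.bfloat16)
    probs = torch.softmax(logits, dim=-1)
    sample_one(probs)  # expected: "Cannot set version_counter for inference tensor"
```

The error message itself suggests a temporary eager fallback; if it applies here, it would trade away the `--compile` speed-up on the failing ops rather than fix the underlying conflict:

```python
# Fallback quoted verbatim from the error message (workaround, not a fix).
import torch._dynamo
torch._dynamo.config.suppress_errors = True
```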