bug: some of the engine parameters in the model load request are ignored #1824

louis-jan · 2024-12-24T04:30:32Z

Cortex version

1.0.6

Describe the issue and expected behaviour

When starting a model, several engine parameters can be configured, as described here: https://github.com/janhq/cortex.llamacpp. However, when these parameters are sent through the cortex.cpp server, most of them are filtered out because the new model.yaml configuration hardcodes the set of accepted parameters.

After reviewing the model.yaml implementation, I noticed that the following settings have no effect because their declarations are missing, so they all fall back to their default values:

cpu_threads
n_batch
caching_enabled
grp_attn_n
grp_attn_w
mlock
grammar_file
model_type
model_alias
flash_attn
cache_type
use_mmap
llama_model_path
embedding
cont_batching
user_prompt
ai_prompt
system_prompt
pre_prompt
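
To make the failure mode concrete, here is a minimal sketch of how an allow-list filter silently drops undeclared keys. It is only an illustration: `ALLOWED_KEYS` and `filter_params` are hypothetical names, and the keys in the set are placeholders, not the actual list hardcoded in cortex.cpp.

```python
# Hypothetical allow-list; NOT the real set declared in cortex.cpp's model.yaml code.
ALLOWED_KEYS = {"model", "ctx_len", "ngl"}


def filter_params(request: dict) -> tuple[dict, list[str]]:
    """Split a load request into forwarded params and silently dropped keys."""
    forwarded = {k: v for k, v in request.items() if k in ALLOWED_KEYS}
    dropped = sorted(k for k in request if k not in ALLOWED_KEYS)
    return forwarded, dropped


# Example load request containing two of the parameters listed above.
request = {"model": "my-model", "ctx_len": 4096, "cpu_threads": 8, "n_batch": 512}
forwarded, dropped = filter_params(request)
print("forwarded:", forwarded)
print("dropped:", dropped)
```

In this sketch `cpu_threads` and `n_batch` never reach the engine, so the engine would use its defaults for them, which matches the behaviour reported here.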

Steps to Reproduce

Start cortex server
Start a model by sending a request with cpu_threads or n_batch settings
Observe cortex.log
See the error
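
A request for step 2 might be built roughly as follows. This is a sketch under assumptions: the issue does not state the server address, endpoint path, or model id, so `BASE_URL`, the `/v1/models/start` path, and `"my-model"` are placeholders to adjust for your setup. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumed values; the issue does not specify them.
BASE_URL = "http://127.0.0.1:39281"  # assumed local cortex server address
START_PATH = "/v1/models/start"      # assumed model start endpoint

# Load request carrying two of the parameters reported as ignored.
payload = {
    "model": "my-model",  # placeholder model id
    "cpu_threads": 4,
    "n_batch": 512,
}

req = urllib.request.Request(
    BASE_URL + START_PATH,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# To actually send it, uncomment the next line while the server is running:
# urllib.request.urlopen(req)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

After sending such a request, check cortex.log to see whether the values for `cpu_threads` and `n_batch` were passed on to the engine.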

Screenshots / Logs

No response

What is your OS?

Windows
Mac Silicon
Mac Intel
Linux / Ubuntu

What engine are you running?

cortex.llamacpp (default)
cortex.tensorrt-llm (Nvidia GPUs)
cortex.onnx (NPUs, DirectML)

Hardware Specs eg OS version, GPU

No response

louis-jan added the type: bug Something isn't working label Dec 24, 2024

github-project-automation bot added this to Jan & Cortex Dec 24, 2024

github-project-automation bot moved this to Investigating in Jan & Cortex Dec 24, 2024

louis-jan added the P1: important Important feature / fix label Dec 24, 2024

louis-jan assigned vansangpfiev Dec 24, 2024

louis-jan added this to the v1.0.7 milestone Dec 24, 2024

louis-jan moved this from Investigating to In Progress in Jan & Cortex Dec 24, 2024

vansangpfiev mentioned this issue Dec 24, 2024

fix: forward start model parameters #1825

Merged

3 tasks

vansangpfiev moved this from In Progress to QA in Jan & Cortex Dec 26, 2024
