
Commit 399d01f

Merge branch 'feat/use-llama-cpp-server' of github.com:janhq/cortex.llamacpp into feat/use-llama-cpp-server

sangjanai committed Jan 13, 2025 · 2 parents ba7e5af + 7bbc7fe
Showing 3 changed files with 3 additions and 1 deletion.
1 change: 1 addition & 0 deletions README.md
@@ -148,3 +148,4 @@ Table of parameters
 |`flash_attn` | Boolean| To enable Flash Attention, default is true|
 |`cache_type` | String| KV cache type: f16, q8_0, q4_0, default is f16|
 |`use_mmap` | Boolean| To enable mmap, default is true|
+|`ctx_shift` | Boolean| To enable context shift, default is true|
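
For reference, the rows above map one-to-one onto fields in the JSON body sent when loading a model. A minimal jsoncpp sketch of such a body follows; the flag names come from the table, while `model_path` and the chosen values are illustrative assumptions, not part of this commit.

```cpp
#include <iostream>
#include <json/json.h>  // jsoncpp, the JSON library the engine itself uses

int main() {
  // Hypothetical load-model request body. Flag names follow the
  // parameter table above; model_path is an assumed illustrative field.
  Json::Value body;
  body["model_path"] = "/models/llama-3.1-8b.gguf";
  body["flash_attn"] = true;
  body["cache_type"] = "f16";
  body["use_mmap"] = true;
  body["ctx_shift"] = false;  // opt out of the new default-on context shift

  Json::StreamWriterBuilder writer;
  std::cout << Json::writeString(writer, body) << std::endl;
  return 0;
}
```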
1 change: 1 addition & 0 deletions src/llama_engine.cc
@@ -712,6 +712,7 @@ bool LlamaEngine::LoadModelImpl(std::shared_ptr<Json::Value> json_body) {
     }
   }

+  params.ctx_shift = json_body->get("ctx_shift", true).asBool();
   params.n_gpu_layers =
       json_body->get("ngl", 300)
           .asInt();  // change from 100 -> 300 since llama 3.1 has 292 gpu layers
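The added line parses `ctx_shift` with jsoncpp's `Json::Value::get`, which returns the supplied default when a key is absent, so context shift stays enabled unless a client explicitly sends `false`. A standalone sketch of that default behavior:

```cpp
#include <iostream>
#include <memory>
#include <json/json.h>  // jsoncpp

int main() {
  auto json_body = std::make_shared<Json::Value>();
  (*json_body)["use_mmap"] = true;  // ctx_shift deliberately omitted

  // Same pattern as in LoadModelImpl above: an absent key yields the
  // default, so ctx_shift reads as true unless the client sets it false.
  std::cout << std::boolalpha
            << json_body->get("ctx_shift", true).asBool() << "\n";  // true

  (*json_body)["ctx_shift"] = false;
  std::cout << json_body->get("ctx_shift", true).asBool() << "\n";  // false
  return 0;
}
```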
