diff --git a/README.md b/README.md
index bce17fc..f65b531 100644
--- a/README.md
+++ b/README.md
@@ -69,6 +69,7 @@ Inference, Chat, API
 | `--tokenizer ` | Tokenizer to model. | `dllama_tokenizer_llama3.t` |
 | `--buffer-float-type ` | Float precision of synchronization. | `q80` |
 | `--workers ` | Addresses of workers (ip:port), separated by space. | `10.0.0.1:9991 10.0.0.2:9991` |
+| `--max-seq-len ` | The maximum sequence length, it helps to reduce the RAM usage. | `4096` |
 Inference, Chat, Worker, API
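
Reviewer note: the new row says that capping the sequence length reduces RAM usage. A rough way to see why is that a transformer's key/value cache typically grows linearly with the number of positions it must hold. The sketch below is a back-of-the-envelope estimate only; `kv_cache_bytes` is a hypothetical helper, and the model shape (32 layers, 8 KV heads, head dimension 128) and 4-byte cache values are illustrative assumptions, not figures taken from this project.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_value=4):
    """Estimate KV cache size: one K and one V tensor per layer (hypothetical helper)."""
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_value


# Assumed, illustrative model shape; adjust to the model actually in use.
full = kv_cache_bytes(seq_len=8192, n_layers=32, n_kv_heads=8, head_dim=128)
capped = kv_cache_bytes(seq_len=4096, n_layers=32, n_kv_heads=8, head_dim=128)

print(f"{full / 2**20:.0f} MiB -> {capped / 2**20:.0f} MiB")
```

Under these assumptions, halving the sequence length halves the estimated cache size. The real saving depends on how the runtime allocates and quantizes its buffers.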