This guide will help you understand how to convert AI models into different formats that our applications can use.
## What We're Doing
We convert LLMs on HuggingFace into special formats (GGUF, TensorRT, ONNX) so they can work with our applications called [Jan](https://github.com/janhq/jan) and [Cortex](https://github.com/janhq/cortex.cpp). Think of this like converting a video file from one format to another so it can play on different devices.
### New Model Conversion
#### Step 1: Create a New Model Template
This step will create a model repository on [Cortexso's Hugging Face account](https://huggingface.co/cortexso) and then generate two files: `model.yml`, the model's configuration file, and `metadata.yml`, its metadata file. Both are shown below.

**File 1: `model.yml`**

This file tells our system how to use the model. Copy this template and fill in the details:

```
# Basic Information
id: your_model_name      # Choose a unique name
model: your_model_name   # Same as above
name: your_model_name    # Same as above
version: 1               # Start with 1

# Model Settings
stop:                    # Where the model should stop generating
  - "<|eot_id|>"         # You might need to change this

# Default Settings (Usually keep these as they are)
stream: true
top_p: 0.9
temperature: 0.7
max_tokens: 4096

# Technical Settings
engine: llama-cpp        # The engine type
prompt_template: "<|begin_of_text|>..."  # How to format inputs
```

**File 2: `metadata.yml`**

```
version: 1
name: your_model_name
default: 8b-gguf-q4-km
```
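
To see why `prompt_template` and `stop` matter, here is a minimal Python sketch of how an inference engine typically uses them: the user's input is substituted into the template, and generation is cut at the first stop token. The helper names are illustrative, not part of Jan or Cortex, and the template shown is the ChatML-style default used by the conversion workflow below:

```python
# Illustrative sketch of how an engine uses `prompt_template` and `stop`
# from model.yml. Helper names are hypothetical, not a real engine API.

PROMPT_TEMPLATE = (
    "<|im_start|>system\n{system_message}<|im_end|>\n"
    "<|im_start|>user\n{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
STOP_TOKENS = ["<|im_end|>"]

def build_prompt(system_message: str, prompt: str) -> str:
    # The placeholders are substituted before the text is tokenized.
    return PROMPT_TEMPLATE.format(system_message=system_message, prompt=prompt)

def truncate_at_stop(generated: str, stop_tokens: list[str]) -> str:
    # Output is cut at the first stop token; if the wrong tokens are listed,
    # nothing matches and the model seems to "keep generating too much text".
    cut = len(generated)
    for token in stop_tokens:
        index = generated.find(token)
        if index != -1:
            cut = min(cut, index)
    return generated[:cut]

print(build_prompt("You are a helpful assistant.", "Hello!"))
print(truncate_at_stop("Hi there!<|im_end|>extra text", STOP_TOKENS))  # -> "Hi there!"
```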

The conversion workflow takes the following inputs:

- **`model_name`**: Name of the model to create (will be used in the repo name and files)
- **`prompt_template`**: Prompt template for the model (default: `<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n`)
- **`stop_tokens`**: Stop tokens for the model (comma-separated, e.g., `,</s>`) (default: `<|im_end|>`)
- **`engine`**: Engine to run the model (default: `llama-cpp`)

4. Click the `Run workflow` button to start the conversion
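
Once the workflow finishes, it can be worth sanity-checking the generated `model.yml` before pointing Jan or Cortex at it. A minimal sketch, assuming PyYAML is installed and using the field names from the template above:

```python
# Sketch: sanity-check the generated model.yml before serving it.
# Assumes PyYAML (pip install pyyaml); field names follow the template above.
import yaml

REQUIRED_FIELDS = ["id", "model", "name", "version", "stop", "engine", "prompt_template"]

with open("model.yml") as f:
    config = yaml.safe_load(f)

missing = [field for field in REQUIRED_FIELDS if field not in config]
if missing:
    raise SystemExit(f"model.yml is missing fields: {missing}")
print(f"{config['name']} looks OK (engine: {config['engine']})")
```
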
**Common Errors:**

- Wrong Stop Tokens: If the model keeps generating too much text, check the `stop` tokens in `model.yml`
- Engine Errors: Make sure you picked the right engine type in `model.yml`
- Template Issues: Double-check your `prompt_template` if the model gives weird outputs
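
For that last case, one quick, purely illustrative check is that the template still contains the placeholder the engine substitutes the user's input into:

```python
# Illustrative check for "Template Issues": the `{prompt}` placeholder must
# survive any edits, or the engine has nothing to substitute the input into.
def template_ok(prompt_template: str) -> bool:
    # {system_message} is optional in some templates; {prompt} never is.
    return "{prompt}" in prompt_template

assert template_ok("<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n")
```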