
Commit 4ee10a9

feat: resolveModelFile method (#351)
* feat: `resolveModelFile` method
* feat: `hf:` URI support
* fix: improve GGUF metadata read times
* fix: hide internal type
* docs: document the `hf:` URI
1 parent 578e710 commit 4ee10a9

38 files changed: +1150 −614 lines

.vitepress/config/apiReferenceSidebar.ts

Lines changed: 1 addition & 0 deletions

@@ -10,6 +10,7 @@ const categoryOrder = [
 
 const functionsOrder = [
     "getLlama",
+    "resolveModelFile",
     "defineChatSessionFunction",
     "createModelDownloader",
     "resolveChatWrapper",

docs/cli/pull.md

Lines changed: 7 additions & 1 deletion

@@ -13,10 +13,16 @@ const commandDoc = docs.pull;
 A wrapper around [`ipull`](https://www.npmjs.com/package/ipull)
 to download model files as fast as possible with parallel connections and other optimizations.
 
-Automatically handles split and binary-split models files, so only pass the URL to the first file of a model.
+Automatically handles split and binary-split model files, so only pass the URI to the first file of a model.
 
 If a file already exists and its size matches the expected size, it will not be downloaded again unless the `--override` flag is used.
 
+The supported URI schemes are:
+- **HTTP:** `https://`, `http://`
+- **Hugging Face:** `hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional)
+
+Learn more about using model URIs in the [Downloading Models guide](../guide/downloading-models.md#model-uris).
+
 > To programmatically download a model file in your code, use [`createModelDownloader()`](../api/functions/createModelDownloader.md)
 
 ## Usage
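The `hf:` scheme documented in this hunk has a simple, fixed shape. The helper below is a hypothetical sketch for illustration only (the function name and return shape are not part of node-llama-cpp); it assumes nothing beyond the `hf:<user>/<model>/<file-path>#<branch>` layout described above:

```typescript
// Hypothetical parser for the documented hf: URI layout:
// hf:<user>/<model>/<file-path>#<branch> (the #<branch> part is optional).
function parseHfUri(uri: string) {
    if (!uri.startsWith("hf:"))
        throw new Error("Not an hf: URI");

    const [pathPart = "", branch] = uri.slice("hf:".length).split("#");
    const [user, model, ...filePathParts] = pathPart.split("/");

    if (user == null || model == null || filePathParts.length === 0)
        throw new Error("Malformed hf: URI");

    return {user, model, filePath: filePathParts.join("/"), branch};
}

const parsed = parseHfUri("hf:user/model/model-file.gguf#main");
console.log(parsed.user);     // "user"
console.log(parsed.filePath); // "model-file.gguf"
console.log(parsed.branch);   // "main"
```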

docs/guide/choosing-a-model.md

Lines changed: 4 additions & 0 deletions

@@ -164,3 +164,7 @@ npx --no node-llama-cpp pull --dir ./models <model-file-url>
 >
 > If the model file URL is of a single part of a multi-part model (for example, [this model](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf)),
 > it will also download all the other parts as well into the same directory.
+
+::: tip
+Consider using [model URIs](./downloading-models.md#model-uris) to download and load models.
+:::

docs/guide/downloading-models.md

Lines changed: 47 additions & 2 deletions

@@ -69,16 +69,61 @@ This option is recommended for more advanced use cases, such as downloading mode
 If you know the exact model URLs you're going to need every time in your project, it's better to download the models
 automatically after running `npm install` as described in the [Using the CLI](#cli) section.
 
+## Model URIs {#model-uris}
+You can reference models using a URI instead of their full download URL when using the CLI and relevant methods.
+
+When downloading a model from a URI, the model files will be prefixed with a corresponding adaptation of the URI.
+
+To reference a model from Hugging Face, you can use the scheme
+<br/>
+`hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional).
+
+Here's an example usage of the Hugging Face URI scheme:
+```
+hf:mradermacher/Meta-Llama-3.1-8B-Instruct-GGUF/Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf
+```
+
+When using a URI to reference a model,
+it's recommended [to add it to your `package.json` file](#cli) to ensure it's downloaded when running `npm install`,
+and also to resolve it using the [`resolveModelFile`](../api/functions/resolveModelFile.md) method to get the full path of the resolved model file.
+
+Here's an example usage of the [`resolveModelFile`](../api/functions/resolveModelFile.md) method:
+```typescript
+import {fileURLToPath} from "url";
+import path from "path";
+import {getLlama, resolveModelFile} from "node-llama-cpp";
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+const modelsDirectory = path.join(__dirname, "models");
+
+const modelPath = await resolveModelFile(
+    "hf:user/model/model-file.gguf",
+    modelsDirectory
+);
+
+const llama = await getLlama();
+const model = await llama.loadModel({modelPath});
+```
+
+::: tip NOTE
+If a corresponding model file is not found in the given directory, the model will automatically be downloaded.
+
+When a file is being downloaded, the download progress is shown in the console by default.
+<br/>
+Set the [`cli`](../api/type-aliases/ResolveModelFileOptions#cli) option to `false` to disable this behavior.
+:::
+
 ## Downloading Gated Models From Hugging Face {#hf-token}
 Some models on Hugging Face are "gated", meaning they require a manual consent from you before you can download them.
 
 To download such models, after completing the consent form on the model card, you need to create a [Hugging Face token](https://huggingface.co/docs/hub/en/security-tokens) and set it in one of the following locations:
 * Set an environment variable called `HF_TOKEN` with the token
 * Set the `~/.cache/huggingface/token` file content to the token
 
-Now, using the CLI or the [`createModelDownloader`](../api/functions/createModelDownloader.md) method will automatically use the token to download gated models.
+Now, using the CLI, the [`createModelDownloader`](../api/functions/createModelDownloader.md) method,
+or the [`resolveModelFile`](../api/functions/resolveModelFile.md) method will automatically use the token to download gated models.
 
-Alternatively, you can use the token in the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option when using [`createModelDownloader`](../api/functions/createModelDownloader.md).
+Alternatively, you can use the token in the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option when using [`createModelDownloader`](../api/functions/createModelDownloader.md) or [`resolveModelFile`](../api/functions/resolveModelFile.md).
 
 ## Inspecting Remote Models
 You can inspect the metadata of a remote model without downloading it by either using the [`inspect gguf` command](../cli/inspect/gguf.md) with a URL,
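The guide above says downloaded model files "will be prefixed with a corresponding adaptation of the URI". The exact adaptation is internal to node-llama-cpp and not shown in this diff; the sketch below is a hypothetical illustration of the general idea of deriving a filesystem-safe file-name prefix from a model URI:

```typescript
// Hypothetical illustration only: the real prefixing scheme is internal to
// node-llama-cpp and not part of this diff. This shows the general idea of
// turning a model URI into a filesystem-safe file-name prefix.
function uriToFilePrefix(uri: string): string {
    return uri
        .replace(/^[a-z]+:(\/\/)?/, "") // drop the scheme ("hf:", "https://")
        .replace(/[/\\#?]/g, "_");      // replace path-unsafe characters
}

console.log(uriToFilePrefix("hf:user/model/model-file.gguf"));
// "user_model_model-file.gguf"
```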

docs/guide/index.md

Lines changed: 2 additions & 2 deletions

@@ -51,9 +51,9 @@ npx --no node-llama-cpp inspect gpu
 ```
 
 ## Getting a Model File
-We recommend you to get a GGUF model from either [Michael Radermacher on Hugging Face](https://huggingface.co/mradermacher) or [search HuggingFace directly](https://huggingface.co/models?library=gguf) for a GGUF model.
+We recommend getting a GGUF model from either [Michael Radermacher on Hugging Face](https://huggingface.co/mradermacher) or by [searching HuggingFace directly](https://huggingface.co/models?library=gguf) for a GGUF model.
 
-We recommend you to start by getting a small model that doesn't have a lot of parameters just to ensure everything works, so try downloading a `7B`/`8B` parameters model first (search for models with both `7B`/`8B` and `GGUF` in their name).
+We recommend starting by getting a small model that doesn't have a lot of parameters just to ensure everything works, so try downloading a `7B`/`8B` parameters model first (search for models with both `7B`/`8B` and `GGUF` in their name).
 
 For improved download speeds, you can use the [`pull`](../cli/pull.md) command to download a model:
 ```shell

scripts/scaffoldElectronExampleForCiBuild.ts

Lines changed: 2 additions & 2 deletions

@@ -40,8 +40,8 @@ await scaffoldProjectTemplate({
     directoryPath: resolvedPackageFolderPath,
     parameters: {
         [ProjectTemplateParameter.ProjectName]: projectName,
-        [ProjectTemplateParameter.ModelUrl]: "https://github.com/withcatai/node-llama-cpp",
-        [ProjectTemplateParameter.ModelFilename]: "model.gguf",
+        [ProjectTemplateParameter.ModelUriOrUrl]: "https://github.com/withcatai/node-llama-cpp",
+        [ProjectTemplateParameter.ModelUriOrFilename]: "model.gguf",
         [ProjectTemplateParameter.CurrentModuleVersion]: packageVersion
     }
 });

src/chatWrappers/Llama3_1ChatWrapper.ts

Lines changed: 9 additions & 7 deletions

@@ -36,13 +36,7 @@ export class Llama3_1ChatWrapper extends ChatWrapper {
     /**
      * @param options
      */
-    public constructor({
-        cuttingKnowledgeDate = new Date("2023-12-01T00:00:00Z"),
-        todayDate = () => new Date(),
-        noToolInstructions = false,
-
-        _specialTokensTextForPreamble = false
-    }: {
+    public constructor(options: {
         /**
          * Set to `null` to disable
         *

@@ -64,6 +58,14 @@ export class Llama3_1ChatWrapper extends ChatWrapper {
     } = {}) {
         super();
 
+        const {
+            cuttingKnowledgeDate = new Date("2023-12-01T00:00:00Z"),
+            todayDate = () => new Date(),
+            noToolInstructions = false,
+
+            _specialTokensTextForPreamble = false
+        } = options;
+
         this.cuttingKnowledgeDate = cuttingKnowledgeDate == null
             ? null
             : cuttingKnowledgeDate instanceof Function
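The refactor above moves the defaults out of the parameter destructuring and into a `const { ... } = options` statement in the body, leaving a single named `options` parameter. A minimal, self-contained sketch of the same pattern (the class and fields here are generic stand-ins, not code from the repository):

```typescript
// Defaults are applied in the body instead of the parameter list, leaving a
// single documentable "options" parameter. Behavior is identical either way.
class Wrapper {
    public readonly noToolInstructions: boolean;
    public readonly cutoff: Date | null;

    public constructor(options: {
        noToolInstructions?: boolean,
        cutoff?: Date | null
    } = {}) {
        const {
            noToolInstructions = false,
            cutoff = new Date("2023-12-01T00:00:00Z")
        } = options;

        this.noToolInstructions = noToolInstructions;
        this.cutoff = cutoff;
    }
}

console.log(new Wrapper().noToolInstructions);                           // false
console.log(new Wrapper({noToolInstructions: true}).noToolInstructions); // true
```

Note that a default only kicks in for `undefined`, so an explicit `cutoff: null` is preserved, matching the `cuttingKnowledgeDate == null` handling in the wrapper.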

src/cli/commands/ChatCommand.ts

Lines changed: 2 additions & 2 deletions

@@ -77,9 +77,9 @@ export const ChatCommand: CommandModule<object, ChatCommand> = {
 
     return yargs
         .option("modelPath", {
-            alias: ["m", "model", "path", "url"],
+            alias: ["m", "model", "path", "url", "uri"],
             type: "string",
-            description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
+            description: "Model file to use for the chat. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
         })
         .option("header", {
             alias: ["H"],

src/cli/commands/CompleteCommand.ts

Lines changed: 2 additions & 2 deletions

@@ -57,9 +57,9 @@ export const CompleteCommand: CommandModule<object, CompleteCommand> = {
 builder(yargs) {
     return yargs
         .option("modelPath", {
-            alias: ["m", "model", "path", "url"],
+            alias: ["m", "model", "path", "url", "uri"],
             type: "string",
-            description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
+            description: "Model file to use for the completion. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
         })
         .option("header", {
             alias: ["H"],

src/cli/commands/InfillCommand.ts

Lines changed: 2 additions & 2 deletions

@@ -59,9 +59,9 @@ export const InfillCommand: CommandModule<object, InfillCommand> = {
 builder(yargs) {
     return yargs
         .option("modelPath", {
-            alias: ["m", "model", "path", "url"],
+            alias: ["m", "model", "path", "url", "uri"],
             type: "string",
-            description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
+            description: "Model file to use for the infill. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
         })
         .option("header", {
             alias: ["H"],
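All three commands gain a `uri` alias for the `modelPath` option. In yargs, every alias is simply another flag name that resolves to the same option; the dependency-free sketch below illustrates that mapping (the table and function are illustrative, not yargs internals):

```typescript
// Sketch of an alias table: each alias resolves to the same canonical option
// name, mirroring the new "uri" alias for "modelPath" added in this commit.
const aliases: Record<string, string> = {
    m: "modelPath",
    model: "modelPath",
    path: "modelPath",
    url: "modelPath",
    uri: "modelPath",
    H: "header"
};

function canonicalOptionName(flag: string): string {
    return aliases[flag] ?? flag;
}

console.log(canonicalOptionName("uri"));     // "modelPath"
console.log(canonicalOptionName("verbose")); // "verbose"
```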
