Support streaming model import to avoid OOM for large (multi-GB) files #372

@sanchitmonga22

Description

Problem

ModelManager.importModel() and the model loading pipeline read entire files into memory via file.arrayBuffer(). For 2-8 GB LLM models, this spikes the JS heap and can cause OOM crashes on memory-constrained devices.

Current State

  • importModel() calls new Uint8Array(await file.arrayBuffer()) — full file in memory
  • ModelLoadContext.data: Uint8Array forces loaders to receive the full file
  • Double-buffering: one full copy on the JS heap plus a second copy in WASM linear memory
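
For reference, the buffered path described above looks roughly like this (a sketch based on the names in this issue; the surrounding ModelManager internals are assumed, not copied from the codebase):

```typescript
// Sketch of the current buffered import path. `file.arrayBuffer()`
// materializes the ENTIRE file on the JS heap at once; for a 4 GB
// model that is a ~4 GB allocation, before the backend makes a
// second copy into WASM linear memory.
interface ModelLoadContext {
  data: Uint8Array; // forces loaders to receive the whole file
}

async function importModelBuffered(file: Blob): Promise<ModelLoadContext> {
  // Full file resident on the JS heap from this point on.
  const data = new Uint8Array(await file.arrayBuffer());
  return { data };
}
```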

Proposed Solution

  1. Add streaming interface to ModelLoadContext: dataStream?: ReadableStream<Uint8Array>
  2. Update importModel() to use file.stream() and pipe chunks to storage
  3. When LocalFileStorage is active, avoid copy entirely by passing the File handle
  4. Update backend loaders to support chunked writes to their WASM FS
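
Steps 1, 2, and 4 above could be sketched as follows. `Blob.stream()` is a standard web API; everything else here (`ChunkSink`, `importModelStreaming`, the extended `ModelLoadContext`) is a hypothetical shape for illustration, not the project's actual interface:

```typescript
interface ModelLoadContext {
  data?: Uint8Array;                       // legacy buffered path (step 1 keeps it optional)
  dataStream?: ReadableStream<Uint8Array>; // proposed streaming path
}

// Hypothetical sink; a backend loader would implement this with
// chunked writes into its WASM FS (step 4).
interface ChunkSink {
  write(chunk: Uint8Array): Promise<void>;
}

async function importModelStreaming(file: Blob, sink: ChunkSink): Promise<number> {
  // file.stream() yields the file incrementally, so only one chunk
  // (typically tens of KB to a few MB) is resident at a time.
  const reader = file.stream().getReader();
  let written = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    await sink.write(value); // awaiting the sink applies backpressure
    written += value.byteLength;
  }
  return written;
}
```

Because each `sink.write()` is awaited before the next chunk is read, peak heap usage stays bounded by the chunk size regardless of file size, which is the point of the proposal.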

Impact

  • High for users downloading large LLMs (2+ GB)
  • Medium complexity — requires interface changes across core + backends

From PR #370 review comments (greptile + coderabbit).
