
bug: insufficient handling of insufficient memory #1457

Open · 1 of 3 tasks · Tracked by #3908
jlfranklin opened this issue Oct 2, 2024 · 4 comments
Labels: category: model running (Inference ux, handling context/parameters, runtime) · P1: important (Important feature / fix) · type: bug (Something isn't working)

Comments

@jlfranklin

Jan version

0.5.5

Describe the Bug

When there is insufficient memory to run the model, a flood of errors is written to the logs and the returned text is complete gibberish.

Jan should stop the model and say something like, "sorry, my brain is full."

Steps to Reproduce

  1. Clean install of Jan on an 8 GB MacBook Air (M1).
  2. Load Qwen Chat 7B.
  3. Ask it anything.

Screenshots / Logs

  • app.log
  • memory-pressure
  • sample-response

What is your OS?

  • [x] macOS
  • [ ] Windows
  • [ ] Linux
@jlfranklin jlfranklin added the type: bug Something isn't working label Oct 2, 2024
@imtuyethan
Contributor

The cortex.cpp team is working on this.

@freelerobot freelerobot transferred this issue from janhq/jan Oct 13, 2024
@freelerobot freelerobot added category: model running Inference ux, handling context/parameters, runtime P1: important Important feature / fix labels Oct 13, 2024
@freelerobot
Contributor

freelerobot commented Oct 13, 2024

Needed: proper error handling when:

  1. A user attempts to load a model too big to fit in available memory.
  2. Jan should show an error message, e.g.: "Unable to load model due to insufficient system memory. xx needed, xx available." (A sketch of such a pre-load check follows below.)

Feel free to reassign @vansangpfiev and move to a different sprint
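
A minimal sketch of what such a pre-load check could look like (not the actual cortex.cpp implementation; assumes macOS, and uses the model file size as a rough lower bound on required memory, ignoring KV cache and runtime overhead):

```cpp
// Hypothetical pre-load memory guard. macOS-only sketch; the file-size
// heuristic and the message format are illustrative, not cortex.cpp APIs.
#include <mach/mach.h>

#include <cstdint>
#include <filesystem>
#include <iostream>

// Approximate memory a new allocation can claim: free + inactive pages.
static uint64_t available_memory_bytes() {
  vm_size_t page_size = 0;
  vm_statistics64_data_t vm_stats{};
  mach_msg_type_number_t count = HOST_VM_INFO64_COUNT;
  host_page_size(mach_host_self(), &page_size);
  host_statistics64(mach_host_self(), HOST_VM_INFO64,
                    reinterpret_cast<host_info64_t>(&vm_stats), &count);
  return static_cast<uint64_t>(vm_stats.free_count + vm_stats.inactive_count) *
         page_size;
}

int main(int argc, char** argv) {
  if (argc < 2) {
    std::cerr << "usage: " << argv[0] << " <model.gguf>\n";
    return 2;
  }
  // Rough estimate: an mmap'd GGUF needs at least its file size in memory.
  const uint64_t needed = std::filesystem::file_size(argv[1]);
  const uint64_t available = available_memory_bytes();
  if (needed > available) {
    // Refuse the load with an actionable message instead of letting
    // inference run under memory pressure and return gibberish.
    std::cerr << "Unable to load model due to insufficient system memory: "
              << (needed >> 20) << " MiB needed, " << (available >> 20)
              << " MiB available.\n";
    return 1;
  }
  std::cout << "Memory check passed; proceeding with model load.\n";
  return 0;
}
```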

@freelerobot freelerobot moved this from Investigating to Planning in Menlo Oct 15, 2024
@dan-menlo
Contributor

Related to #1165

@gabrielle-ong
Contributor

Linked to #1108 to recommend models based on hardware
