Eval bug: Slow model loading w/ mmap #10478

Closed
hg0428 opened this issue Nov 25, 2024 · 6 comments

Comments

hg0428 commented Nov 25, 2024

Name and Version

all recent versions

Which operating systems do you know to be affected?

Mac

GGML backends

Metal

Hardware

Apple Silicon. M3 Max

Model

Any big model such as Mixtral 8x7b.

Steps to Reproduce

Load a large model with mmap enabled, then load it again with mmap disabled, and compare the load times. The mmap load is far slower.
See original issue here: #9244 (comment)
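
To put numbers on the gap outside llama.cpp, here is a minimal sketch that times a plain sequential read of a file against mapping it and touching one byte per page. It is an illustration under assumptions, not llama.cpp's actual loading path (which also has to hand the mapped memory to the Metal backend); the function names `time_plain_read`, `time_mmap_touch`, and `compare` are hypothetical helpers made up for this example.

```python
import mmap
import time


def time_plain_read(path, chunk_size=1 << 20):
    """Read the whole file with ordinary reads; return (seconds, bytes read)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total


def time_mmap_touch(path, page_size=mmap.PAGESIZE):
    """Map the file read-only and touch one byte per page so each page is faulted in.

    Returns (seconds, pages touched). The file must be non-empty.
    """
    start = time.perf_counter()
    touched = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for offset in range(0, len(mm), page_size):
                _ = mm[offset]  # reading the byte forces the page into memory
                touched += 1
    return time.perf_counter() - start, touched


def compare(path):
    """Run both strategies on the same file and return the raw measurements."""
    read_s, nbytes = time_plain_read(path)
    mmap_s, pages = time_mmap_touch(path)
    return {"bytes": nbytes, "pages": pages, "read_s": read_s, "mmap_s": mmap_s}
```

Calling `compare()` on the model file twice in a row is the interesting case: the second call should mostly be served from the OS page cache, so a large gap that persists there would point at something other than cold disk reads. Note that within a single call the plain read runs first and warms the cache for the mmap pass, so for a cold-cache comparison each function should be run on its own after a fresh boot or cache purge.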

First Bad Commit

idk

Relevant log output

it just loads for a long time.

xgdgsc commented Dec 6, 2024

How much RAM do you have? Is the second run just as slow? If not, it's expected mmap behavior.

Author

hg0428 commented Dec 9, 2024

How much RAM do you have? Is the second run just as slow? If not, it's expected mmap behavior.

I have 36 GB of RAM (32 GB available to the GPU). I tested with multiple large models, such as Mixtral 8x7b q4_k_m. Yes, it is still slow on the second run.
I don't mean just a little slow; it's more than 20x slower. If this is expected mmap behavior, then mmap should be disabled for large models, because I don't want to wait minutes for a model to load.


xgdgsc commented Dec 9, 2024

Or just use a frontend such as Ollama with the Continue extension, which lets you set useMmap per model: https://docs.continue.dev/reference.

Author

hg0428 commented Dec 9, 2024

Or just use a frontend such as Ollama with the Continue extension, which lets you set useMmap per model: https://docs.continue.dev/reference.

I already know how to disable it; my point is that mmap alone shouldn't make loading this slow. With smaller models (roughly under 40B at q4) the difference is negligible.

github-actions bot added the stale label Jan 9, 2025
Contributor

This issue was closed because it has been inactive for 14 days since being marked as stale.

Author

hg0428 commented Jan 24, 2025

This was not completed!
