ggerganov / llama.cpp Public


Issues: ggerganov/llama.cpp

changelog : libllama API

#9289 opened Sep 3, 2024 by ggerganov

Open 1

changelog : llama-server REST API

#9291 opened Sep 3, 2024 by ggerganov

Open 8

Labels 70 Milestones 0


257 Open 3,815 Closed


Issues list

Misc. bug: -sm row does not work with --device bug

Something isn't working

#10533 opened Nov 26, 2024 by mostlygeek

Compile bug: ARM neon instructions when compiling cuda bug-unconfirmed

#10531 opened Nov 26, 2024 by bruceloco

Misc. bug: Inconsistent Vulkan segfault bug-unconfirmed

#10528 opened Nov 26, 2024 by RobbyCBennett

Feature Request: enhancement

New feature or request

#10520 opened Nov 26, 2024 by atozj

4 tasks done

Compile bug: [SYCL] Build fails from the latest master bug-unconfirmed

#10518 opened Nov 26, 2024 by qnixsynapse

[CANN] Compile bug: cann backend build failed when manually specify SOC_TYPE or gcc version that isn't verified Ascend NPU

issues specific to Ascend NPUs

#10517 opened Nov 26, 2024 by leo-pony

[CANN] Operator support Ascend NPU

issues specific to Ascend NPUs

enhancement

New feature or request

#10512 opened Nov 26, 2024 by noemotiovon

4 tasks done

Misc. bug: SYCL builds >= 4040 have lower parallel performance by ~20% bug-unconfirmed

#10511 opened Nov 26, 2024 by 0xDEADFED5

Feature Request: Ability to cancel during prompt processing (llama_decode) enhancement

New feature or request

#10509 opened Nov 26, 2024 by bdashore3

4 tasks done

Feature Request: Add "tokens per second" information in the Web UI enhancement

New feature or request

good first issue

Good for newcomers

server/webui

#10502 opened Nov 25, 2024 by ggerganov

4 tasks done

Compile bug: make CC='/opt/AMD/aocc-compiler-5.0.0/bin/clang' CXX='/opt/AMD/aocc-compiler-5.0.0/bin/clang++' AMD_ZEN4_BLIS_5=1 GGML_CUDA=1 GGML_CUDA_FORCE_CUBLAS=1 GML_CUDA_F16=1 GGML_CUDA_FORCE_MMQ=1 CUDA_USE_TENSOR_CORES=1 GGML_RPC=1 bug-unconfirmed

#10493 opened Nov 25, 2024 by KarlHeinzMali

Eval bug: Slow model loading w/ mmap bug-unconfirmed

#10478 opened Nov 25, 2024 by hg0428

Misc. bug: Serving of custom static files is broken when API key is set. bug-unconfirmed

#10475 opened Nov 24, 2024 by shibe2

Misc. bug: poor concurrent request performance with llama-server in macOS bug-unconfirmed

#10473 opened Nov 24, 2024 by pengjiang80

Feature Request: better cross entropy loss CUDA kernel enhancement

New feature or request

#10467 opened Nov 23, 2024 by JohannesGaessler

4 tasks done

Misc. bug: Model provisioning doc link broken bug-unconfirmed

#10464 opened Nov 23, 2024 by paoletto

Compile bug: how to modify the cmakelists.txt bug-unconfirmed

#10462 opened Nov 23, 2024 by wangzd0209

Support for Macro-o1 by alibaba enhancement

New feature or request

#10461 opened Nov 23, 2024 by Meshwa428

4 tasks done

ggml : add ANE backend help wanted

Extra attention is needed

research 🔬

#10453 opened Nov 22, 2024 by ggerganov

Bug: 【CANN】ggml-cann/aclnn_ops.cpp:3007: GGML_ASSERT(n_dims == src0->ne[0]) failed Ascend NPU

issues specific to Ascend NPUs

#10451 opened Nov 22, 2024 by zyp2

Bug: Heavy throttling during token generation on Apple Silicon bug-unconfirmed medium severity

Used to report medium severity bugs in llama.cpp (e.g. Malfunctioning Features but still useable)

#10444 opened Nov 21, 2024 by Azirine

Bug: Flash Attention performs worse under ROCM bug-unconfirmed medium severity

Used to report medium severity bugs in llama.cpp (e.g. Malfunctioning Features but still useable)

#10439 opened Nov 20, 2024 by Mushoz

Bug: Severe Performance Degradation on Q4_0 CPU-only with MacOS / Apple Silicon M2, after PR#9921 / Version 4081 bug

Something isn't working

#10435 opened Nov 20, 2024 by AndreasKunar

Why server slot's cache_prompt is false by default? bug-unconfirmed medium severity

Used to report medium severity bugs in llama.cpp (e.g. Malfunctioning Features but still useable)

#10427 opened Nov 20, 2024 by Nekotekina

Bug: SYCL builds >= b4069 have half the context limit of previous builds bug-unconfirmed critical severity

Used to report critical severity bugs in llama.cpp (e.g. Crashing, Corrupted, Dataloss)

#10421 opened Nov 20, 2024 by 0xDEADFED5
