118k
Local LLMs
C++
llama.cpp
ggml-org/llama.cpp
The engine that put LLMs on laptops and phones.
High-performance LLM inference in plain C/C++ with quantization — the runtime under much of the local-model ecosystem, from laptops to Raspberry Pis.