ggml-org/llama.cpp
LLM inference in C/C++
Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. 100% private.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Evaluate the accuracy of LLM generated outputs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
Python bindings for llama.cpp