llm-d/llm-d
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
Kubernetes operator for local LLM inference with llama.cpp, vLLM, TGI, and mlx-server — multi-GPU NVIDIA + Apple Silicon Metal, autoscaling, air-gapped, production-ready
LLM inference in C/C++
🏗️ Fine-tune, build, and deploy open-source LLMs easily!
Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
kubectl for AI Agents
2 captures since 2026-05-23
AI agent config detected
Key config paths
AGENTS.md