cognisoc/zigllm
Learn how LLMs work by building one in Zig -- from tensors to text generation.
Inference Llama 2 in one file of pure Zig
Inference code for Llama models
llama.cpp fork with additional SOTA quants and improved performance
Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild
A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.