antimatter15/alpaca.cpp
Locally run an Instruction-Tuned Chat-Style LLM
Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.
Easy and Efficient Finetuning of LLMs. (Supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon.) Efficient quantized training and deployment of large models.
LLM as a Chatbot Service
llama.go is like llama.cpp in pure Golang!
QLoRA: Efficient Finetuning of Quantized LLMs
Inference code for Llama models