OptimalScale/LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
High-speed Large Language Model Serving for Local Deployment
FlashInfer: Kernel Library for LLM Serving
Easy and Efficient Finetuning of LLMs. (Supports LLaMA, LLaMA 2, LLaMA 3, Qwen, Baichuan, GLM, Falcon.) Efficient quantized training and deployment of large models.
SGLang is a high-performance serving framework for large language models and multimodal models.
QLoRA: Efficient Finetuning of Quantized LLMs
Making large AI models cheaper, faster and more accessible