vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Democratizing Reinforcement Learning for LLMs
A high-performance inference engine for LLM, VLM, DiT and REC models, optimized for diverse AI accelerators.
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Universal LLM Deployment Engine with ML Compilation
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes