vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Minimalistic large language model 3D-parallelism training
Fast Multimodal LLM on Mobile Devices
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.