UbiquitousLearning/mllm
Fast Multimodal LLM on Mobile Devices
Universal LLM Deployment Engine with ML Compilation
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
A high-throughput and memory-efficient inference and serving engine for LLMs
A high-performance inference engine for LLM, VLM, DiT and REC models, optimized for diverse AI accelerators.
MLX: An array framework for Apple silicon
Achieve state of the art inference performance with modern accelerators on Kubernetes