Tiiny-AI/PowerInfer
High-speed Large Language Model Serving for Local Deployment
FlashInfer: Kernel Library for LLM Serving
Fast and memory-efficient exact attention
A high-throughput and memory-efficient inference and serving engine for LLMs
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
SGLang is a high-performance serving framework for large language models and multimodal models.
A high-performance inference engine for LLM, VLM, DiT and REC models, optimized for diverse AI accelerators.
1 capture since 2026-05-25
AI agent config detected
Key config paths
.claude
AGENTS.md
CLAUDE.md
.claude
.claude/skills
.claude/skills/add-cuda-kernel
.claude/skills/add-cuda-kernel/SKILL.md
.claude/skills/benchmark-kernel
.claude/skills/benchmark-kernel/SKILL.md
.claude/skills/debug-cuda-crash
.claude/skills/debug-cuda-crash/SKILL.md