Awesome

GitHub projects from awesome lists

Search awesome repositories

Search names, descriptions, topics, tags, and stacks, then tune results by ecosystem, freshness, health, and cross-list signal.

Repos indexed: 17,379
Awesome lists tracked: 125
Current results: 19

19 repos shown

Topic: vllm

modelscope/FunASR

Open-source speech recognition toolkit for training, inference, streaming ASR, VAD, punctuation, speaker diarization pipelines, and OpenAI-compatible/MCP serving.

Stack

Python Android FastAPI Vue CMake CocoaPods .NET SDK

GitHub topics

#asr #audio #chinese #emotion-recognition #funasr #mcp-server

Updated: 2026-07-15
Lists: 2 list mentions
First commit: 2022-11-24
History: 5 history points
License: MIT
Issues: 5 open

Stars: 19,261
Forks: 1,938
Commits: 5,344 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Orchestra-Research/AI-Research-SKILLs

Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your Claude Code, Codex, or Gemini agent becomes a fully equipped AI research agent. Maintained by Orchestra Research.

AI dev

Stack

TeX React npm

GitHub topics

#ai #ai-research #claude #claude-code #claude-skills #codex

Updated: 2026-06-16
Lists: 0 list mentions
First commit: 2025-11-03
History: 48 history points
License: MIT
Issues: 16 open

Stars: 10,836
Forks: 800
Commits: 220 commits
Star growth, last 7 days: +188 +1.8%
Commit velocity, last 7 days: 0 0.0%

LMCache/LMCache

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

AI dev

Stack

Python Cargo CMake Go modules

GitHub topics

#amd #cuda #fast #inference #kv-cache #llm

Updated: 2026-07-14
Lists: 1 list mention
First commit: 2024-05-29
History: 5 history points
License: Apache-2.0
Issues: 402 open

Stars: 10,553
Forks: 1,556
Commits: 1,948 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

OpenRLHF/OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

Stack

Python PEP 517 pip

GitHub topics

#large-language-models #proximal-policy-optimization #raylib #reinforcement-learning #reinforcement-learning-from-human-feedback #transformers

Updated: 2026-07-14
Lists: 2 list mentions
First commit: 2023-07-30
History: 5 history points
License: Apache-2.0
Issues: 342 open

Stars: 9,789
Forks: 989
Commits: 1,548 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

xorbitsai/inference

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

AI dev

Stack

Python FastAPI LangChain Next.js pytest npm PEP 517 pip

GitHub topics

#artificial-intelligence #chatglm #deployment #flan-t5 #gemma #ggml

Updated: 2026-07-18
Lists: 3 list mentions
First commit: 2023-06-14
History: 5 history points
License: Apache-2.0
Issues: 50 open

Stars: 9,437
Forks: 845
Commits: 2,031 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

ai-dynamo/dynamo

A Datacenter Scale Distributed Inference Serving Framework

AI dev

Stack

Rust Axum Cobra FastAPI gRPC Go Cargo Go modules npm

GitHub topics

#diffusion #disaggregated-serving #kubernetes #llm-inference #omni #routing-engine

Updated: 2026-07-15
Lists: 1 list mention
First commit: 2024-12-20
History: 5 history points
License: NOASSERTION
Issues: 828 open

Stars: 7,489
Forks: 1,338
Commits: 6,031 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

kvcache-ai/Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

AI dev

Stack

C++ Cargo CMake Go modules

GitHub topics

#disaggregation #inference #kvcache #llm #rdma #reinforcement-learning

Updated: 2026-07-15
Lists: 1 list mention
First commit: 2024-06-25
History: 5 history points
License: Apache-2.0
Issues: 506 open

Stars: 5,823
Forks: 963
Commits: 1,580 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

kserve/kserve

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Stack

Go Cobra FastAPI Gin gRPC Go Go modules PEP 517 pip

GitHub topics

#artificial-intelligence #cncf #genai #hacktoberfest #istio #k8s

Updated: 2026-07-14
Lists: 1 list mention
First commit: 2019-03-27
History: 5 history points
License: Apache-2.0
Issues: 486 open

Stars: 5,687
Forks: 1,575
Commits: 2,405 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

OpenBMB/UltraRAG

A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines

AI dev

Stack

Python FastAPI Flask Jupyter pytest npm uv

GitHub topics

#deepseek #demo #easy #embedding #flask #gpt

Updated: 2026-07-15
Lists: 1 list mention
First commit: 2025-01-16
History: 5 history points
License: Apache-2.0
Issues: 28 open

Stars: 5,647
Forks: 434
Commits: 424 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

gpustack/gpustack

A GPU cluster manager for high-performance AI model serving (vLLM, SGLang) and on-demand SSH-accessible GPU instances.

AI dev

Stack

Python FastAPI pytest pip uv

GitHub topics

#ascend #cuda #deepseek #distributed-inference #genai #high-performance-inference

Updated: 2026-07-05
Lists: 3 list mentions
First commit: 2024-04-25
History: 5 history points
License: Apache-2.0
Issues: 616 open

Stars: 5,272
Forks: 561
Commits: 3,226 commits
Star growth, last 7 days: No 7-day history
Commit velocity, last 7 days: No 7-day history

mostlygeek/llama-swap

Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc

AI dev

Stack

Go Go modules npm

GitHub topics

#golang #llama #llamacpp #localllama #localllm #openai

Updated: 2026-07-15
Lists: 2 list mentions
First commit: 2024-10-04
History: 5 history points
License: MIT
Issues: 78 open

Stars: 5,001
Forks: 375
Commits: 527 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

lemony-ai/cascadeflow

Cascading runtime for AI agents. Optimize cost, latency, quality, and policy decisions inside the agent loop.

AI dev

Stack

Python Express FastAPI Next.js PydanticAI npm PEP 517 pip

GitHub topics

#agent #ai #anthropic #api #budgets #claude

Updated: 2026-07-01
Lists: 2 list mentions
First commit: 2025-10-02
History: 3 history points
License: MIT
Issues: 6 open

Stars: 3,218
Forks: 669
Commits: 507 commits
Star growth, last 7 days: No 7-day history
Commit velocity, last 7 days: No 7-day history

containers/ramalama

RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.

AI dev

Stack

Python pytest React npm pip uv

GitHub topics

#ai #containers #cuda #hacktoberfest #hip #inference-server

Updated: 2026-07-14
Lists: 1 list mention
First commit: 2024-06-08
History: 5 history points
License: MIT
Issues: 104 open

Stars: 2,956
Forks: 348
Commits: 4,414 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

bricks-cloud/BricksLLM

🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and monitoring per user, application, or environment. Supports OpenAI, Azure OpenAI, Anthropic, vLLM, and open-source LLMs.

Stack

Go Gin gRPC Go Go modules

GitHub topics

#ai #anthropic #api #artificial-intelligence #azure #docker

Updated: 2025-01-05
Lists: 1 list mention
First commit: 2023-07-18
History: 4 history points
License: MIT
Issues: 21 open

Stars: 1,216
Forks: 95
Commits: 739 commits
Star growth, last 7 days: No 7-day history
Commit velocity, last 7 days: No 7-day history

jakobdylanc/llmcord

Make Discord your LLM frontend - Supports any OpenAI compatible API (OpenRouter, Ollama and more)

Stack

Python pip

GitHub topics

#bot #chatbot #discord #discord-bot #gemini #gpt-5

Updated: 2026-07-02
Lists: 2 list mentions
First commit: 2023-07-29
History: 7 history points
License: MIT
Issues: 5 open

Stars: 815
Forks: 198
Commits: 480 commits
Star growth, last 7 days: No 7-day history
Commit velocity, last 7 days: No 7-day history

pgalko/BambooAI

A Python library powered by Language Models (LLMs) for conversational data discovery and analysis.

Stack

Python Flask pytest pip

GitHub topics

#ai #ai-agents #anthropic #data-analysis #data-science #docker

Updated: 2026-06-03
Lists: 1 list mention
First commit: 2023-05-07
History: 6 history points
License: MIT
Issues: 15 open

Stars: 782
Forks: 84
Commits: 391 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

defilantech/LLMKube

Kubernetes operator for self-hosted LLM inference across a heterogeneous GPU fleet: NVIDIA CUDA, AMD Vulkan, and Apple Silicon Metal. Runtimes: llama.cpp, vLLM, TGI, mlx-server. Multi-GPU sharding, model caching, OpenAI-compatible endpoints. Apache-2.0, run across homelab and on-prem fleets, actively developed.

AI dev

Stack

Go Cobra gRPC Go Go modules

GitHub topics

#ai #apple-silicon #autoscaling #edge-computing #gguf #gpu

Updated: 2026-07-06
Lists: 1 list mention
First commit: 2025-11-17
History: 5 history points
License: Apache-2.0
Issues: 60 open

Stars: 162
Forks: 24
Commits: 630 commits
Star growth, last 7 days: No 7-day history
Commit velocity, last 7 days: No 7-day history

tiannuo-yang/SearchAgent-X

A High-Efficiency System of Large Language Model Based Search Agents

Stack

Python FastAPI pytest Starlette CMake PEP 517 pip

GitHub topics

#agent #ai #approximate-nearest-neighbor-search #efficient-ai #information-retrieval #llm

Updated: 2025-07-02
Lists: 1 list mention
First commit: 2025-05-21
History: 5 history points
License: Unknown
Issues: 1 open

stars

Forks: 5
Commits: 12 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

informatico-madrid/blackwell-linux-infra-optimizer

Optimized vLLM deployment for NVIDIA Blackwell (RTX 5090) on Linux Kernel 6.14. Resolves SM_120 kernel incompatibilities, P2P deadlocks, and memory fragmentation for high-performance LLM inference.

Stack

Dockerfile

GitHub topics

#blackwell #cuda #deepseek #infrastructure #linux-kernel #llm

Updated: 2026-01-17
Lists: 1 list mention
First commit: 2026-01-16
History: 5 history points
License: Unknown
Issues: 0 open

stars

Forks: 1
Commits: 11 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%
