Awesome List
Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
GitHub stars and default-branch commits for alvinreal/awesome-opensource-ai.
767 repos currently saved from this list.
🦉 Data Versioning and ML Experiments
An orchestration platform for the development, production, and observation of data assets.
Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust
Hindsight: Agent Memory That Learns
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering high-performance on-device LLMs and Edge AI.
Burn is a next generation tensor library and Deep Learning Framework that doesn't compromise on flexibility, efficiency and portability.
FinRL®: Financial Reinforcement Learning. 🔥
The Self-hosted AI Starter Kit is an open-source template that quickly sets up a local AI environment. Curated by n8n, it provides essential tools for creating secure, self-hosted AI workflows.
Unified framework for building enterprise RAG pipelines with small, specialized models
Replace 'hub' with 'ingest' in any GitHub URL to get a prompt-friendly extract of a codebase
SciPy library main repository
Tensor library for machine learning
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.
An open source, privacy focused alternative to NotebookLM for teams with no data limits. Join our Discord: https://discord.gg/ejRNvftDp9
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).
A hyperparameter optimization framework
Supercharge Your LLM Application Evaluations 🚀
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
Open-source simulator for autonomous driving research.
Structured Outputs
An open source implementation of CLIP.
Parallel computing with task scheduling
Simple, unified interface to multiple Generative AI providers
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
The developer platform for on-demand cloud development environments to create software faster and more securely.
Nano vLLM
Open3D: A Modern Library for 3D Data Processing
Multi-Joint dynamics with Contact. A general purpose physics simulator.
Open Machine Learning Compiler Framework
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Minimal reproduction of DeepSeek R1-Zero
Structured outputs for LLMs
🔎 Open source distributed and RESTful search engine.
Easy-to-use and powerful LLM and SLM library with awesome model zoo.
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
A framework for few-shot evaluation of language models.
💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
Open Source framework for voice and multimodal conversational AI
Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.
A Library for Advanced Deep Time Series Models for General Time Series Analysis.
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open source community will contribute to it.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
LangChain4j is an idiomatic, open-source Java library for building LLM-powered applications on the JVM. It offers a unified API over popular LLM providers and vector stores, and makes implementing tool calling (including MCP support), agents and RAG easy. It integrates seamlessly with enterprise Java frameworks like Quarkus and Spring Boot.
LangGPT: Empowering everyone to become a prompt expert! 🚀 📌 Originator of the Structured Prompt 📌 Initiator of the Meta-Prompt 📌 The most popular paradigm for putting prompts into practice | Language of GPT The pioneering framework for structured & meta-prompt design 10,000+ ⭐ | Battle-tested by thousands of users worldwide Created by 云中江树
The Context Platform for your Data and AI Stack
Access large language models from the command-line
Go ahead and axolotl questions