Awesome

GitHub projects from awesome lists

Search awesome repositories

Search names, descriptions, topics, tags, and stacks, then tune results by ecosystem, freshness, health, and cross-list signal.

Repos indexed: 10,634
Awesome lists tracked: 83
Current results: 10,634

Find repositories

Start broad, then narrow by ecosystem, freshness, health, and growth.

Tune results by ecosystem and health. More filters: topics, generated tags, stack, age, archive status, and growth.

The age filter uses known first-commit dates.

Momentum
Commit growth: filters by commit-count growth since Awesome first tracked the repository. Enter the minimum percentage increase.
Star growth: filters by GitHub star growth since Awesome first tracked the repository. Enter the minimum percentage increase.

esbatmop/MNBVC

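MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts and dialogue, forum posts, wiki, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.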

#chinese #chinese-language #chinese-nlp #chinese-simplified #corpus-data
2 awesome lists · 300 commits · first commit 2022-12-31 · 2 history points · updated 2026-05-23