NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

Stars: 16,443
Forks: 3,993
Commits: 8,913
Language: Python
Awesome lists: 2

Similar repositories

huggingface/nanotron

Minimalistic large language model 3D-parallelism training

2,699 stars, Python, 2 awesome lists

NVIDIA/Model-Optimizer

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.

2,759 stars, Python, 1 awesome list

InternLM/xtuner

A Next-Generation Training Engine Built for Ultra-Large MoE Models

5,137 stars, Python, 1 awesome list

deepspeedai/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

42,395 stars, Python, 1 awesome list

kvcache-ai/ktransformers

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

17,195 stars, Python, 1 awesome list

open-mmlab/mmengine

OpenMMLab Foundational Library for Training Deep Learning Models

1,475 stars, Python, 1 awesome list

Tracked growth

1 capture since 2026-05-25

Latest capture 2026-05-25 21:10

Stars history (chart: total stars)

Commits history (chart: default branch commits)

Metadata

AI development signals

AI agent config detected

10 config paths: 8 files, 2 directories
Agent instructions, Agent workspace (2), Claude Code (4), CodeRabbit, Cursor, +1

Key config paths

  • dir .agents
  • dir .claude
  • file .coderabbit.yaml
  • file .cursorrules
  • file AGENTS.md
  • file CLAUDE.md

1 more config path detected.

Review config paths
  • Agent workspace .agents
  • Agent workspace .agents/skills
  • Claude Code .claude
  • Claude Code .claude/settings.json
  • Claude Code .claude/skills
  • CodeRabbit .coderabbit.yaml
  • Cursor .cursorrules
  • Agent instructions AGENTS.md
  • Claude Code CLAUDE.md
  • Greptile greptile.json