NVIDIA/Model-Optimizer

A unified library of state-of-the-art (SOTA) model optimization techniques, including quantization, pruning, distillation, and speculative decoding. It compresses deep learning models for downstream deployment frameworks such as TensorRT-LLM, TensorRT, and vLLM to improve inference speed.

Stars: 2,759
Forks: 407
Commits: 817
Language: Python
Awesome lists: 1

Similar repositories

NVIDIA/TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

13,725 stars
Python, 2 awesome lists

NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

16,443 stars
Python, 2 awesome lists

kaito-project/aikit

🏗️ Fine-tune, build, and deploy open-source LLMs easily!

524 stars
Go, 2 awesome lists

huggingface/nanotron

Minimalistic large language model 3D-parallelism training

2,699 stars
Python, 2 awesome lists

pytorch/ao

PyTorch native quantization and sparsity for training and inference

2,833 stars
Python, 1 awesome list

Tracked growth

1 capture since 2026-05-25

Latest capture 2026-05-25 21:10

Stars history (chart: total stars)

Commits history (chart: default branch commits)

Metadata

AI development signals

AI agent config detected

88 config paths: 62 files, 26 directories
Agent instructions, Agent workspace (2), Claude Code (84), CodeRabbit

Key config paths

  • dir .agents
  • dir .claude
  • file .coderabbit.yaml
  • file AGENTS.md
  • file CLAUDE.md
  • file tools/debugger/CLAUDE.md

1 more config path detected.

Review config paths
  • Agent workspace .agents
  • Agent workspace .agents/TOOLING.md
  • Claude Code .claude
  • Claude Code .claude/clusters.yaml.example
  • Claude Code .claude/scripts
  • Claude Code .claude/scripts/sync-upstream-skills.sh
  • Claude Code .claude/skills
  • Claude Code .claude/skills/accessing-mlflow
  • Claude Code .claude/skills/accessing-mlflow/SKILL.md
  • Claude Code .claude/skills/common
  • Claude Code .claude/skills/common/credentials.md
  • Claude Code .claude/skills/common/environment-setup.md
  • Claude Code .claude/skills/common/remote-execution.md
  • Claude Code .claude/skills/common/remote_exec.sh
  • Claude Code .claude/skills/common/slurm-setup.md
  • Claude Code .claude/skills/common/workspace-management.md
  • Claude Code .claude/skills/compare-results
  • Claude Code .claude/skills/compare-results/SKILL.md
  • Claude Code .claude/skills/compare-results/tests
  • Claude Code .claude/skills/compare-results/tests/evals.json
  • Claude Code .claude/skills/debug
  • Claude Code .claude/skills/debug/SKILL.md
  • Claude Code .claude/skills/deployment
  • Claude Code .claude/skills/deployment/references

Showing the first 24 paths. 64 more detected.