
NVIDIA/TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

Stars: 13,725
Forks: 2,408
Commits: 6,838
Language: Python
Awesome lists: 2

Similar repositories

NVIDIA/Model-Optimizer

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.

2,759 stars · Python · 1 awesome list

llm-d/llm-d

Achieve state of the art inference performance with modern accelerators on Kubernetes

3,250 stars · Shell · 1 awesome list

tensorzero/tensorzero

TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.

11,397 stars · Rust · 4 awesome lists

InternLM/lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

7,873 stars · Python · 2 awesome lists

mlc-ai/mlc-llm

Universal LLM Deployment Engine with ML Compilation

22,709 stars · Python · 1 awesome list

Tracked growth

1 capture since 2026-05-25

Latest capture 2026-05-25 21:10

[Chart: Stars history (total stars)]

[Chart: Commits history (default branch commits)]

Metadata

AI development signals

AI agent config detected

200 config paths (145 files, 55 directories)
Agent instructions: 3 · Claude Code: 196 · CodeRabbit

Key config paths

  • dir .claude
  • file .coderabbit.yaml
  • file .codex/AGENTS.md
  • file AGENTS.md
  • file CLAUDE.md
  • file tensorrt_llm/runtime/kv_cache_manager_v2/AGENTS.md

1 more config path detected.
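
The per-tool and per-kind counts reported for the config paths can be illustrated with a small tally over the six key paths listed above. This is a minimal sketch only: the `classify` function and its name-matching rule are hypothetical, chosen for illustration, and are not the tracker's documented logic. In particular, treating `CLAUDE.md` as a Claude Code path is an assumption.

```python
from collections import Counter
from pathlib import PurePosixPath

# The six key config paths listed above, as (kind, path) pairs.
KEY_PATHS = [
    ("dir", ".claude"),
    ("file", ".coderabbit.yaml"),
    ("file", ".codex/AGENTS.md"),
    ("file", "AGENTS.md"),
    ("file", "CLAUDE.md"),
    ("file", "tensorrt_llm/runtime/kv_cache_manager_v2/AGENTS.md"),
]


def classify(path: str) -> str:
    """Hypothetical labeling rule (an assumption, not the tracker's real rule)."""
    parts = PurePosixPath(path).parts
    name = parts[-1]
    if name == "AGENTS.md":
        return "Agent instructions"
    if parts[0] == ".claude" or name == "CLAUDE.md":
        return "Claude Code"
    if name == ".coderabbit.yaml":
        return "CodeRabbit"
    return "Other"


# Tally the listed paths by assumed tool label and by file/directory kind.
by_tool = Counter(classify(path) for _, path in KEY_PATHS)
by_kind = Counter(kind for kind, _ in KEY_PATHS)

print(dict(by_tool))
print(dict(by_kind))
```

The sketch covers only the paths shown on this page, so its totals are far smaller than the 200 paths the tracker reports.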

Review config paths
  • Claude Code .claude
  • Claude Code .claude/agent-tests
  • Claude Code .claude/agent-tests/perf-test-sync
  • Claude Code .claude/agent-tests/perf-test-sync/build_prompt.py
  • Claude Code .claude/agent-tests/perf-test-sync/promptfooconfig.yaml
  • Claude Code .claude/agent-tests/perf-test-sync/render_report.py
  • Claude Code .claude/agent-tests/perf-test-sync/run.sh
  • Claude Code .claude/agents
  • Claude Code .claude/agents/ad-conf-check-update.md
  • Claude Code .claude/agents/ad-debug-agent.md
  • Claude Code .claude/agents/ad-onboard-reviewer.md
  • Claude Code .claude/agents/ad-run-agent.md
  • Claude Code .claude/agents/exec-compile-specialist.md
  • Claude Code .claude/agents/kernel-cuda-specialist.md
  • Claude Code .claude/agents/kernel-cute-specialist.md
  • Claude Code .claude/agents/kernel-tileir-specialist.md
  • Claude Code .claude/agents/kernel-triton-specialist.md
  • Claude Code .claude/agents/perf-profiling-specialist.md
  • Claude Code .claude/agents/perf-test-sync.md
  • Claude Code .claude/agents/perf-torch-cuda-graph-specialist.md
  • Claude Code .claude/README.md
  • Claude Code .claude/skills
  • Claude Code .claude/skills/ad-accuracy-debug
  • Claude Code .claude/skills/ad-accuracy-debug/SKILL.md

Showing the first 24 paths. 176 more detected.