informatico-madrid/blackwell-linux-infra-optimizer

Optimized vLLM deployment for NVIDIA Blackwell (RTX 5090) on Linux Kernel 6.14. Resolves SM_120 kernel incompatibilities, P2P deadlocks, and memory fragmentation for high-performance LLM inference.

  • Stars: 5
  • Forks: 1
  • Commits: 11
  • Language: Dockerfile
  • Awesome lists: 1

Similar repositories

llm-d/llm-d

Achieve state of the art inference performance with modern accelerators on Kubernetes

3250 stars
Shell 1 awesome list

bobazooba/xllm

🦖 X—LLM: Cutting Edge & Easy LLM Finetuning

408 stars
Python 1 awesome list

NVIDIA/Model-Optimizer

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.

2759 stars
Python 1 awesome list

vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

81308 stars
Python 3 awesome lists

NVIDIA/TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

13725 stars
Python 2 awesome lists

Tracked growth

1 capture since 2026-05-25

Latest capture 2026-05-25 20:53

[Chart: Stars history (total stars)]

[Chart: Commits history (default branch commits)]

Metadata

  • Created: 2026-01-16
  • First commit: —
  • Last pushed: 2026-01-17
  • Archived: no
  • Stack detected: —
  • License: —

AI development signals

No AI development config files detected.