google-research/text-to-text-transfer-transformer
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Mesh TensorFlow: Model Parallelism Made Easier
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
Ongoing research training transformer models at scale
Fast inference engine for Transformer models
Minimalistic large language model 3D-parallelism training
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries