zai-org/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
Data processing for and with foundation models!
Toolkit for linearizing PDFs for LLM datasets/training
Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning, training, and prompt engineering examples. A bonus section with ChatGPT, GPT-3.5-turbo, GPT-4, and DALL-E, including jump-starting GPT-4, speech-to-text, text-to-speech, text-to-image generation with DALL-E, Google Cloud AI, HuggingGPT, and more.