zai-org/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
An open source implementation of CLIP.
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
Generative Models by Stability AI
Easy and Efficient Finetuning LLMs. (Supported LLama, LLama2, LLama3, Qwen, Baichuan, GLM , Falcon) 大模型高效量化训练+部署.
No description.
1 capture since 2026-05-25