open-compass/VLMEvalKit
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
An Open Source implementation of Notebook LM with more flexibility and features
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
Automated Penetration Testing Agentic Framework Powered by Large Language Models
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
Financial data platform for analysts, quants and AI agents.