inparallel/SaudiNewsNet
This repo contains a set of Arabic newspaper articles, along with metadata, extracted from various Saudi newspapers.
RightNow Arabic LLM Corpus - One of the largest high-quality Arabic text datasets for LLM training
P2PCLAW: Training Dataset for Autonomous Scientific Peer Review - Apache 2.0
CAJAL — Local scientific paper generator. Qwen 27B fine-tuned for academic writing. AI Tribunal peer review. 6x4 CognitionBoard. 2.7x token compression. Runs offline on RTX 3090. Apache 2.0.
Easy and efficient fine-tuning of LLMs (supports Llama, Llama 2, Llama 3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
The Open Cookbook for Top-Tier Code Large Language Model