microsoft/LLMLingua
[EMNLP'23, ACL'24] Compresses the prompt and KV-Cache to speed up LLM inference and enhance the model's perception of key information, achieving up to 20x compression with minimal performance loss.
Enforce the output format (JSON Schema, regex, etc.) of a language model
A language for constraint-guided and efficient LLM programming.
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It lets users chat with LLMs, execute structured function calls, and get structured output. It also works with models not fine-tuned for JSON output and function calls.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
The Security Toolkit for LLM Interactions
A framework for few-shot evaluation of language models.