Related resources from around the web

Jan 20, 2023

People are writing great tools and papers for improving outputs from GPT. Here are some cool ones we've seen:

Prompting libraries & tools (in alphabetical order)

  • Arthur Shield: A paid product for detecting toxicity, hallucination, prompt injection, etc.
  • Baserun: A paid product for testing, debugging, and monitoring LLM-based apps
  • Chainlit: A Python library for making chatbot interfaces
  • Embedchain: A Python library for managing and syncing unstructured data with LLMs
  • FLAML (A Fast Library for Automated Machine Learning & Tuning): A Python library for automating selection of models, hyperparameters, and other tunable choices.
  • Guardrails.ai: A Python library for validating outputs and retrying failures. Still in alpha, so expect sharp edges and bugs.
  • Guidance: A handy-looking Python library from Microsoft that uses Handlebars templating to interleave generation, prompting, and logical control
  • Haystack: Open-source LLM orchestration framework to build customizable, production-ready LLM applications in Python.
  • HoneyHive: An enterprise platform to evaluate, debug, and monitor LLM apps.
  • LangChain: A popular Python/JavaScript library for chaining sequences of language model prompts.
  • LiteLLM: A minimal Python library for calling LLM APIs with a consistent format.
  • LlamaIndex: A Python library for augmenting LLM apps with data.
  • LMQL: A programming language for LLM interaction with support for typed prompting, control flow, constraints, and tools.
  • OpenAI Evals: An open-source library for evaluating task performance of language models and prompts.
  • Outlines: A Python library that provides a domain-specific language to simplify prompting and constrain generation.
  • Parea AI: A platform for debugging, testing, and monitoring LLM apps.
  • Portkey: A platform for observability, model management, evals, and security for LLM apps.
  • Promptify: A small Python library for using language models to perform NLP tasks.
  • PromptPerfect: A paid product for testing and improving prompts.
  • Prompttools: Open-source Python tools for testing and evaluating models, vector DBs, and prompts.
  • Scale Spellbook: A paid product for building, comparing, and shipping language model apps.
  • Semantic Kernel: A Python/C#/Java library from Microsoft that supports prompt templating, function chaining, vectorized memory, and intelligent planning.
  • Vellum: A paid AI product development platform to experiment with, evaluate, and deploy advanced LLM apps.
  • Weights & Biases: A paid product for tracking model training and prompt engineering experiments.
  • YiVal: An open-source GenAI-Ops tool for tuning and evaluating prompts, retrieval configurations, and model parameters using customizable datasets, evaluation methods, and evolution strategies.

Prompting guides

Video courses

Papers on advanced prompting to improve reasoning