Jan 20, 2023
People are writing great tools and papers for improving outputs from GPT. Here are some cool ones we've seen:

Prompting libraries & tools (in alphabetical order)

  • Arthur Shield: A paid product for detecting toxicity, hallucination, prompt injection, etc.
  • Baserun: A paid product for testing, debugging, and monitoring LLM-based apps.
  • Chainlit: A Python library for making chatbot interfaces.
  • Embedchain: A Python library for managing and syncing unstructured data with LLMs.
  • FLAML (A Fast Library for Automated Machine Learning & Tuning): A Python library for automating selection of models, hyperparameters, and other tunable choices.
  • A Python library for validating outputs and retrying failures. Still in alpha, so expect sharp edges and bugs.
  • Guidance: A handy-looking Python library from Microsoft that uses Handlebars templating to interleave generation, prompting, and logical control.
  • Haystack: An open-source LLM orchestration framework for building customizable, production-ready LLM applications in Python.
  • HoneyHive: An enterprise platform to evaluate, debug, and monitor LLM apps.
  • LangChain: A popular Python/JavaScript library for chaining sequences of language model prompts.
  • LiteLLM: A minimal Python library for calling LLM APIs with a consistent format.
  • LlamaIndex: A Python library for augmenting LLM apps with data.
  • LMQL: A programming language for LLM interaction with support for typed prompting, control flow, constraints, and tools.
  • OpenAI Evals: An open-source library for evaluating task performance of language models and prompts.
  • Outlines: A Python library that provides a domain-specific language to simplify prompting and constrain generation.
  • Parea AI: A platform for debugging, testing, and monitoring LLM apps.
  • Portkey: A platform for observability, model management, evals, and security for LLM apps.
  • Promptify: A small Python library for using language models to perform NLP tasks.
  • PromptPerfect: A paid product for testing and improving prompts.
  • Prompttools: Open-source Python tools for testing and evaluating models, vector DBs, and prompts.
  • Scale Spellbook: A paid product for building, comparing, and shipping language model apps.
  • Semantic Kernel: A Python/C#/Java library from Microsoft that supports prompt templating, function chaining, vectorized memory, and intelligent planning.
  • Vellum: A paid AI product development platform to experiment with, evaluate, and deploy advanced LLM apps.
  • Weights & Biases: A paid product for tracking model training and prompt engineering experiments.
  • YiVal: An open-source GenAI-Ops tool for tuning and evaluating prompts, retrieval configurations, and model parameters using customizable datasets, evaluation methods, and evolution strategies.

