Run model experiments

Velvet is an AI gateway that works with any model. Use our evaluation framework to test models, settings, and metrics on your request logs.

Evaluations overview

LLMs are inherently unpredictable, which can make feature development challenging. With Velvet Evaluations, you can feel confident that your LLM-powered features will work the way you expect. Test your request logs against different models, settings, and metrics.

Use cases:

  • Run deterministic tests to check for accuracy, latency, cost, and more.
  • Use an LLM as a judge to check for relevance, accuracy, and quality (see the sketch after this list).
  • Run complex evaluations for data validation, RAG, and agentic workflows.
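To make these concrete, here is a minimal LLM-as-a-judge sketch in Python. The log structure, judge prompt, and model name are illustrative assumptions for this post, not Velvet's evaluation API; the docs cover the real framework.

```python
# Minimal LLM-as-a-judge sketch. The `logs` structure, judge prompt,
# and model name are illustrative assumptions, not Velvet's API.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_PROMPT = (
    "You are grading an AI assistant's answer.\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Reply with a single integer from 1 (poor) to 5 (excellent)."
)

def judge(question: str, answer: str) -> int:
    """Ask a model to score one logged response for quality."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(question=question, answer=answer),
        }],
    )
    return int(response.choices[0].message.content.strip())

# Hypothetical request logs pulled from your gateway.
logs = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is the capital of France?", "answer": "Paris"},
]

scores = [judge(log["question"], log["answer"]) for log in logs]
print(f"Mean quality score: {sum(scores) / len(scores):.2f}")
```

The same loop covers the deterministic cases: instead of calling a judge model, compare logged latency, token cost, or exact-match accuracy against fixed thresholds.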

Watch our video introduction, or see our docs for in-depth tutorials on running evaluations in your application.

Get started

Create a workspace and add Velvet as your proxy to get started.

  1. Create your Velvet account.
  2. Add Velvet as your proxy with just 2 lines of code, as sketched below. Read the docs for details.
  3. Run evaluations and continuous monitoring on your AI features.
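To make step 2 concrete, here is a sketch of routing OpenAI traffic through a gateway proxy with the OpenAI Python SDK. The gateway URL and auth header below are placeholders, not Velvet's actual endpoint; copy the exact values from the docs.

```python
# Sketch of pointing the OpenAI SDK at a gateway proxy.
# The gateway URL and header name are placeholders; use the exact
# values from Velvet's docs for your workspace.
from openai import OpenAI

client = OpenAI(
    base_url="https://YOUR-VELVET-GATEWAY/v1",               # placeholder endpoint
    default_headers={"velvet-api-key": "YOUR_VELVET_KEY"},   # placeholder header
)

# Requests now flow through the proxy, where they are logged and
# become available for evaluation and monitoring.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```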

