Monitor AI features in production

Velvet's evaluation framework helps you run continuous tests on AI features in production. Monitor model configurations, versions, and metrics, and get alerted to week-over-week changes.

Continuous testing overview

LLMs are inherently unpredictable, which makes production reliability a challenge. With Velvet Continuous Monitoring, you can be confident that your AI-powered features keep working the way you expect. Sample your request logs, test them against models, settings, and metrics, and get alerts when tests fail.
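The sampling step above can be sketched in a few lines. This is an illustrative sketch only, not Velvet's API: the log shape and the `sample_logs` helper are assumptions, showing how a fixed fraction of request logs might be drawn deterministically for testing.

```python
# Sketch: deterministically sample a fraction of request logs to test.
# The log fields ("id", "model") and this helper are illustrative assumptions.
import random

def sample_logs(logs: list[dict], rate: float = 0.1, seed: int = 42) -> list[dict]:
    """Pick roughly `rate` of the logs, seeded so runs are reproducible."""
    rng = random.Random(seed)
    k = max(1, int(len(logs) * rate))
    return rng.sample(logs, k)

logs = [{"id": i, "model": "gpt-4o"} for i in range(100)]
picked = sample_logs(logs)
print(len(picked))  # 10
```

Seeding the sampler keeps test runs comparable from week to week, which matters when you alert on changes.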

Use cases:

  • Run deterministic tests to check for accuracy, latency, cost, and more.
  • Use an LLM as a judge to check for relevance, accuracy, and quality.
  • Run complex evaluations for data validation, RAG, and agentic workflows.
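The first use case, deterministic tests, can be sketched as simple threshold checks over a sampled log. The field names (`latency_ms`, `cost_usd`, `output`) and thresholds below are illustrative assumptions, not Velvet's schema:

```python
# Sketch of deterministic checks over one sampled request log.
# Field names and thresholds are assumptions for illustration.

def check_log(log: dict, max_latency_ms: float = 2000, max_cost_usd: float = 0.01) -> list[str]:
    """Return the names of failed checks for one request log."""
    failures = []
    if log["latency_ms"] > max_latency_ms:
        failures.append("latency")      # too slow
    if log["cost_usd"] > max_cost_usd:
        failures.append("cost")         # too expensive
    if not log["output"].strip():
        failures.append("empty_output") # model returned nothing useful
    return failures

sample = {"latency_ms": 3500, "cost_usd": 0.004, "output": "Paris"}
print(check_log(sample))  # ['latency']
```

Any non-empty result would trigger an alert; LLM-as-judge checks follow the same shape but score the output with a second model instead of a threshold.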

Watch a video introduction below, or see our docs for in-depth tutorials on running continuous testing in your application.

Get started

Create a workspace and add Velvet as your proxy to get started.

  1. Create your account here.
  2. Add Velvet as your proxy with just two lines of code. Read the docs.
  3. Run evaluations and continuous monitoring on your AI features.
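Proxy setups like step 2 typically amount to pointing your existing API client at the gateway's host instead of the provider's. As a rough sketch (the hostname `gateway.example.com` is a placeholder, not Velvet's real endpoint), the rewrite looks like:

```python
# Sketch: route provider API calls through a logging proxy by swapping the host.
# "gateway.example.com" is a placeholder, not Velvet's actual endpoint.
from urllib.parse import urlparse, urlunparse

def proxied_url(original: str, proxy_host: str = "gateway.example.com") -> str:
    """Swap the API host for the proxy host, keeping scheme and path intact."""
    parts = urlparse(original)
    return urlunparse(parts._replace(netloc=proxy_host))

print(proxied_url("https://api.openai.com/v1/chat/completions"))
# https://gateway.example.com/v1/chat/completions
```

Because only the base URL changes, the request and response bodies pass through unmodified, which is what lets the gateway log every request without code changes elsewhere.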
