Velvet’s first feature was a data copilot. Our AI-powered editor turns natural language queries into SQL, returning real-time database results. In this article, we share lessons learned optimizing our usage of OpenAI.
We needed a way for customers to easily query their production data. So we set out to build a best-in-class AI SQL editor, using OpenAI to automate SQL.
Requirements: Customers can connect a database, ask natural language questions that get translated into SQL, and get real-time results back in milliseconds.
Why? A SQL query that might take a human hours to construct should take only a few seconds using our data copilot. This not only has a 10x effect in engineering workflows, but also empowers any stakeholder to query production data.
Connecting to OpenAI was relatively easy. We started with basic prompting to instruct OpenAI, something like “you are a data engineer and you only write SQL”. We shipped an MVP in a few weeks that could capture any event-based data source. The AI copilot could write contextual SQL and output live data results.
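That first integration can be sketched with a plain HTTP call to the chat completions endpoint. This is a minimal illustration, not Velvet's actual code: the model name, the exact system prompt wording, and the helper names are assumptions.

```python
import json
import os
import urllib.request

SYSTEM_PROMPT = "You are a data engineer and you only write SQL."

def build_payload(question: str, model: str = "gpt-4o") -> dict:
    """Assemble the chat-completions request body with the SQL-only system prompt."""
    return {
        "model": model,  # hypothetical model choice
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    }

def generate_sql(question: str) -> str:
    """POST the request to OpenAI (requires OPENAI_API_KEY in the environment)."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(build_payload(question)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

In practice you would use the official OpenAI SDK rather than raw HTTP; the point here is just how little scaffolding the MVP prompt needed.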
This first version ingested webhooks from tools like Stripe or Webflow. These tools throw off thousands of events, so it was an easy way to capture and unify customer data. But webhook payloads are often unstructured and incomplete, which made it harder to generate relevant SQL.
There were some obvious customer bottlenecks we knew we had to fix. Connecting data was onerous and incomplete, the editor only surfaced one SQL statement at a time, and our context generation was rudimentary. We set out to solve these problems first.
App improvements: Customers were frustrated when they hit an error and couldn't find a path forward. We updated the UI so users could iteratively build on queries without losing context, and added self-healing so errors could resolve themselves. And most importantly, we added direct database connections instead of event-based data sources.
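The self-healing loop described above can be sketched as a retry that feeds the database error back for a corrected query. This is a simplified illustration, assuming hypothetical `execute` and `regenerate` callables (the latter standing in for a follow-up model call); Velvet's actual implementation may differ.

```python
def run_with_self_healing(sql, execute, regenerate, max_attempts=3):
    """Run the generated SQL; on failure, ask the model to repair it and retry.

    execute(sql)           -- runs the query, raising on error
    regenerate(sql, error) -- returns a corrected query given the failure message
    """
    for attempt in range(max_attempts):
        try:
            return execute(sql)
        except Exception as err:
            # Feed the error text back so the next attempt can fix the query
            sql = regenerate(sql, str(err))
    raise RuntimeError(f"query still failing after {max_attempts} attempts")
```

The key design choice is that the error message itself becomes model context, so a typo like a misspelled column name can often be fixed without the user ever seeing the failure.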
AI improvements: We started to see patterns of when OpenAI had trouble processing user requests. After a variety of experiments, we found that reducing temperature, adding additional prompts, and improving the database schema context allowed the model to return more predictable SQL.
This version of the product felt like a 100x improvement. It was faster and more intuitive than writing SQL by hand or using ChatGPT. We loved using it internally and quickly onboarded beta customers.
We had moved past the MVP validation stage, and were ready to optimize. But it was hard to tell what was working well, and where the problems lived. When a customer reported a problem, was it because of their database connection, an error in our application layer, the way we managed context, or limitations to the model itself?
Whether a user's feedback was positive or negative depended largely on the LLM response, but we had no quantitative insight into response performance.
There were two things we couldn’t observe or control: the customer’s database and the OpenAI model. We could at least troubleshoot database connection issues with our customers. But without raw LLM logs in a queryable database, it felt like we were crossing our fingers and hoping for the best from the OpenAI gods.
We started warehousing our LLM requests to analyze what was happening. Granular access to our OpenAI logs created a sense of familiarity and control. We could utilize their models as part of our tech stack, instead of as a black box.
No one can predict the future, especially when it comes to AI. We felt it was an unnecessary risk to be dependent on one company, one model, or one data platform. With our OpenAI logs warehoused in our own database, we have technical optionality moving forward.
Velvet is an AI-first data pipeline. We help you warehouse your LLM requests to your database. It’s free to get started, and you have full ownership of your data.
Test models, settings, and metrics against historical request logs.
Use our data copilot to query your AI request logs with SQL.
Use Velvet to observe, analyze, and optimize your AI features.