Build an AI-powered recommendation feature

Never built an AI feature? This week’s article is for you. We’ll walk through the evolution of building an AI-powered book recommendation app. Read about 6 experiments for developing a valuable AI feature.

6 experiments for building a useful feature on top of OpenAI

You're an app developer tasked with helping users discover new books to read. Your app has a large, constantly evolving catalog of books. Because of the volume of books available, your end users struggle to find books that are relevant to them.

Your feature needs to take the user's search criteria, past reads, and interests into consideration. You decide to use OpenAI to return relevant results.

Experiment 1: Connect to OpenAI

You make an MVP chat interface that accepts user inputs, sends them to OpenAI, and returns answers. Without additional prompting work, your feature acts as a thin wrapper for OpenAI: it has no personality, content, or capability unique to your product or end users.
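A minimal version of this wrapper might look like the sketch below, using OpenAI's Python SDK. The model name and the recommend helper are illustrative, not prescribed:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def recommend(user_input: str) -> str:
    # Forward the user's input to OpenAI verbatim and return the reply.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; any chat model works here
        messages=[{"role": "user", "content": user_input}],
    )
    return response.choices[0].message.content
```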

Experiment 2: Layer on prompt engineering

Next, you layer on prompts that instruct the LLM how to act. While this is often called prompt engineering, it doesn’t require much coding expertise. A "prompt" is a text instruction passed to the LLM alongside the user's input. This extra context instructs the model to behave in a particular way. A prompt changes the results you'll get from that specific call to OpenAI, but it won’t train the model’s general knowledge for future calls (more on that later).

An example prompt might be: “You're a librarian specializing in fiction. People come to you when they don't know what to read next. You take into consideration what that person has read in the past to inform your recommendations and suggest four books at a time. You can ask additional questions to produce better recommendations.”

Now your chat feature talks more like a librarian, returns four results at a time, and focuses on fiction from the start.
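In code, the prompt typically travels as a system message alongside each user input. A sketch, revising the hypothetical recommend helper from above:

```python
from openai import OpenAI

client = OpenAI()

LIBRARIAN_PROMPT = (
    "You're a librarian specializing in fiction. People come to you when "
    "they don't know what to read next. You take into consideration what "
    "that person has read in the past, suggest four books at a time, and "
    "ask additional questions to produce better recommendations."
)

def recommend(user_input: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": LIBRARIAN_PROMPT},  # the prompt
            {"role": "user", "content": user_input},
        ],
    )
    return response.choices[0].message.content
```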

Experiment 3: Design a more focused interface

While text-based chat is functional, you see engagement decline after someone tries the feature once. Plus, you notice that most users type in just a few keywords, something like "beach reads based in italy with a female protagonist".

You try implementing a more focused book-discovery experience that includes a mix of prepared recommendations and chat.

Now your app starts feeling more familiar to users. It's not so obviously "AI", but it recommends relevant books quickly, and keeps the user focused on the task at hand.
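One way to drive those prepared recommendation cards (our assumption about the implementation, not something this walkthrough prescribes) is to ask the model for structured JSON the UI can render directly. A sketch using OpenAI's JSON mode, reusing client and LIBRARIAN_PROMPT from the earlier sketches:

```python
import json

def recommend_cards(query: str) -> list[dict]:
    # Force valid JSON so the response can feed a card-based UI directly.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {
                "role": "system",
                "content": LIBRARIAN_PROMPT
                + ' Reply as JSON: {"books": [{"title": "...", "author": "...", "reason": "..."}]}',
            },
            {"role": "user", "content": query},
        ],
    )
    return json.loads(response.choices[0].message.content)["books"]
```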

Experiment 4: Use RAG to improve results

You realize your book recommendation feature isn't effectively converting to book rentals, and each query to OpenAI is expensive and time-consuming. You want the LLM to focus on your app's own catalog of available books, rather than all books on the internet. And because your library catalog is constantly changing, that data needs to be updated frequently.

You implement retrieval-augmented generation (RAG) to address these problems. Now the LLM narrows its scope to your own database of books, improving the relevance of results. Adding RAG changes the results of each call, but doesn't train the model to reason differently.
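A minimal RAG sketch: embed the user's query, pull the closest matches from your catalog, and pass only those to the model. Here load_catalog_embeddings is a hypothetical helper you'd rerun as the catalog changes; OpenAI embeddings are unit-length, so a dot product works as cosine similarity:

```python
import numpy as np

# Hypothetical helper: returns (book description, embedding vector) pairs,
# built offline with client.embeddings.create and refreshed as the catalog changes.
catalog = load_catalog_embeddings()

def retrieve(query: str, k: int = 20) -> list[str]:
    emb = client.embeddings.create(
        model="text-embedding-3-small", input=query
    ).data[0].embedding
    q = np.array(emb)
    # Rank catalog entries by similarity to the query and keep the top k.
    scored = sorted(catalog, key=lambda item: -float(np.dot(q, item[1])))
    return [text for text, _ in scored[:k]]

def recommend_with_rag(query: str) -> str:
    context = "\n".join(retrieve(query))
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": LIBRARIAN_PROMPT
                + " Only recommend books from this catalog:\n" + context,
            },
            {"role": "user", "content": query},
        ],
    )
    return response.choices[0].message.content
```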

At this point, your recommendation feature is working: you’ve found product-market fit with millions of end users.

Experiment 5: Add caching for repeat queries

You notice some people use the search bar to find a book they've already read. In this case, they aren't looking for different (more creative) results; they simply want to find the exact book or recommendation list they found previously.

You add caching so you can surface past search results without sending new context to OpenAI every time. This makes your repeat searches faster and cheaper.
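A naive version of that cache, keyed on the normalized query (a production version might live in Redis and expire entries as the catalog changes):

```python
cache: dict[str, str] = {}

def recommend_cached(query: str) -> str:
    key = " ".join(query.lower().split())  # normalize case and whitespace
    if key not in cache:
        # Only call OpenAI on a cache miss; repeat searches are served from memory.
        cache[key] = recommend_with_rag(key)
    return cache[key]
```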

Experiment 6: Fine-tune a model

You built a useful feature using one of OpenAI's pre-trained models. Experimenting with prompt engineering, RAG, UI, and caching helped you validate this feature at a reasonable cost. Now you have broad adoption, and you're ready to optimize further.

You decide to invest in fine-tuning a model to further reduce costs and latency and improve accuracy. With fine-tuning, you’re changing the model itself, not just the results of each call.
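Kicking off a fine-tuning job looks roughly like the sketch below: upload a JSONL file of example conversations in chat format, start a job, then reference the resulting model ID instead of the base model. The file name and base model here are illustrative:

```python
# Upload training data: one JSON object per line, e.g. {"messages": [...]}.
training_file = client.files.create(
    file=open("librarian_examples.jsonl", "rb"),  # illustrative file name
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # a base model that supports fine-tuning
)
# When the job completes, use its fine_tuned_model ID in place of "gpt-4o-mini".
```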

Note: The fine-tuning process is expensive and can result in catastrophic failures, so it’s best to rely on other tactics until you identify a problem worth the investment.

Run your own experiments

We walked through 6 experiments for testing and optimizing a new AI feature:

  1. Connect to OpenAI: A basic chat wrapper
  2. Prompt engineering: Instruct the LLM how to act
  3. User interface: Design a focused experience
  4. Retrieval-augmented generation: Point to a particular data source
  5. Caching: Faster and cheaper repeat queries
  6. Fine-tuning: Train a custom model for your use case

LLM requests, responses, and parameters can be used to analyze, optimize, and fine-tune your AI features. Velvet makes it easy to warehouse every request from OpenAI.

Email team@usevelvet.com, or schedule a call to get started.
