Build an AI Assistant in Your SaaS App Using OpenAI and Pinecone
Learn how to integrate OpenAI and Pinecone to create a context-aware AI assistant in your SaaS app. Great for startups and devs looking to add conversational intelligence.
Introduction
Adding AI chat capabilities to your SaaS product can dramatically improve support, onboarding, and automation. In this guide, we'll show you how to integrate OpenAI's GPT model with Pinecone to build an intelligent assistant that remembers, responds, and grows with user input.
Why Combine OpenAI with Pinecone?
- OpenAI provides state-of-the-art language models for understanding and generating human-like text.
- Pinecone offers a scalable vector database to store and search memory embeddings.
- Together, they enable AI agents that can reference past chats, documents, or user actions.
System Architecture Overview
We'll build a Flask-based backend that stores user chat history as embeddings in Pinecone and retrieves relevant context before sending queries to OpenAI.
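At a high level, every chat request follows the same four steps. The sketch below is a minimal illustration of that flow; `embed_fn`, `search_fn`, `chat_fn`, and `store_fn` are hypothetical callables standing in for the OpenAI and Pinecone calls defined later in this guide, not SDK APIs:

```python
def handle_chat_turn(user_id, query, embed_fn, search_fn, chat_fn, store_fn):
    """Embed the query, pull relevant memories, answer, then persist the turn."""
    query_vector = embed_fn(query)              # 1. embed the incoming query
    context = search_fn(user_id, query_vector)  # 2. fetch nearest past memories
    answer = chat_fn(query, context)            # 3. generate a grounded reply
    store_fn(user_id, query)                    # 4. remember this turn
    return answer
```

In the Flask app, a chat route would simply call a function like this with the concrete implementations from the following sections.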
Creating Embeddings with OpenAI
This example uses the pre-1.0 `openai` Python SDK; newer versions of the library expose the same call as `client.embeddings.create`.

```python
import os

import openai
from dotenv import load_dotenv

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

def get_embedding(text):
    """Return the embedding vector for a piece of text."""
    response = openai.Embedding.create(
        input=text,
        model="text-embedding-ada-002"
    )
    return response["data"][0]["embedding"]
```
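One practical caveat: `text-embedding-ada-002` accepts at most 8,191 tokens per input, so longer documents should be split into chunks before embedding. A rough character-based splitter is sketched below, using the common ~4-characters-per-token heuristic; `chunk_text` and its default limits are illustrative choices, not part of the OpenAI API:

```python
def chunk_text(text, max_chars=2000, overlap=200):
    """Split text into overlapping chunks small enough to embed individually.

    max_chars ~ 500 tokens under the rough 4-characters-per-token heuristic;
    the overlap keeps sentences that straddle a chunk boundary retrievable.
    """
    if len(text) <= max_chars:
        return [text]
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

Each chunk can then be passed to `get_embedding` and stored as its own vector. A production version would count real tokens (e.g. with a tokenizer) rather than characters.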
Storing and Querying Vectors in Pinecone
The snippet below uses the classic `pinecone-client` interface (`pinecone.init`); newer SDK versions construct a `Pinecone` client object instead. Note that we store `user_id` in the vector metadata and filter on it at query time, so one user's memories never leak into another user's context.

```python
from uuid import uuid4

import pinecone

pinecone.init(api_key="YOUR_PINECONE_API_KEY", environment="us-east1-gcp")
index = pinecone.Index("chat-memory")

def store_memory(user_id, text):
    vector = get_embedding(text)
    index.upsert([
        (f"{user_id}:{uuid4()}", vector, {"user_id": user_id, "text": text})
    ])

def retrieve_context(user_id, query):
    vector = get_embedding(query)
    results = index.query(
        vector=vector,
        top_k=5,
        include_metadata=True,
        filter={"user_id": {"$eq": user_id}},  # scope results to this user
    )
    return [match.metadata["text"] for match in results.matches]
```
Generating Context-Aware Responses
Now we can feed the retrieved context into OpenAI's chat completion API and return just the assistant's reply text.

```python
def chat_with_context(query, context):
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Context:\n" + "\n".join(context)},
        {"role": "user", "content": query},
    ]
    response = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    return response["choices"][0]["message"]["content"]
```
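The retrieved snippets can collectively exceed the model's prompt budget, so it helps to trim the context before building the messages list. A minimal sketch follows; the character budget is an assumption for illustration, and a production version would count tokens with a real tokenizer instead:

```python
def trim_context(context, max_chars=6000):
    """Keep the most relevant snippets (already ranked by the vector search)
    that fit within a rough character budget."""
    kept, used = [], 0
    for snippet in context:  # context arrives best-match-first
        if used + len(snippet) > max_chars:
            break
        kept.append(snippet)
        used += len(snippet)
    return kept
```

Because Pinecone returns matches in descending relevance order, truncating from the tail drops the least useful memories first.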
Next Steps
- Add user authentication and scope context to user sessions
- Set up periodic vector cleanups and deduplication
- Enable memory updating or forgetting based on relevance
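For the deduplication item above, one lightweight approach is to query the index for the nearest existing vector before upserting, and skip the write when the new memory is nearly identical. A sketch of the similarity check (the 0.97 threshold is an arbitrary assumption to tune for your data):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_near_duplicate(new_vector, nearest_vector, threshold=0.97):
    """True when the new memory is effectively already stored."""
    return cosine_similarity(new_vector, nearest_vector) >= threshold
```

`store_memory` could call this against the top-1 query result and return early on a near-duplicate, keeping the index lean without a separate cleanup job.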
Need Help Building It?
If you want to add a Pinecone-powered AI chat assistant to your app, I offer consulting and development services for startups, SaaS companies, and internal tools. Let's build something amazing together—reach out via my contact form.
Further Reading
- OpenAI's embedding docs
- Pinecone vector DB examples
- LangChain and RAG architecture basics