Build an AI Assistant in Your SaaS App Using OpenAI and Pinecone
Learn how to integrate OpenAI and Pinecone to create a context-aware AI assistant in your SaaS app. Great for startups and devs looking to add conversational intelligence.
Introduction
Adding AI chat capabilities to your SaaS product can dramatically improve support, onboarding, and automation. In this guide, we'll show you how to integrate OpenAI's GPT model with Pinecone to build an intelligent assistant that remembers, responds, and grows with user input.
Why Combine OpenAI with Pinecone?
- OpenAI provides state-of-the-art language models for understanding and generating human-like text.
- Pinecone offers a scalable vector database to store and search memory embeddings.
- Together, they enable AI agents that can reference past chats, documents, or user actions.
System Architecture Overview
We'll build a Flask-based backend that stores user chat history as embeddings in Pinecone and retrieves relevant context before sending queries to OpenAI.
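At a high level, every chat request follows the same four steps. The sketch below is a minimal illustration of that flow; `embed_fn`, `search_fn`, `chat_fn`, and `store_fn` are hypothetical callables standing in for the OpenAI and Pinecone calls defined later in this guide, not SDK APIs:

```python
def handle_chat_turn(user_id, query, embed_fn, search_fn, chat_fn, store_fn):
    """Embed the query, pull relevant memories, answer, then persist the turn."""
    query_vector = embed_fn(query)              # 1. embed the incoming query
    context = search_fn(user_id, query_vector)  # 2. fetch nearest past memories
    answer = chat_fn(query, context)            # 3. generate a grounded reply
    store_fn(user_id, query)                    # 4. remember this turn
    return answer
```

In the Flask app, a chat route would simply call a function like this with the concrete implementations from the following sections.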
Creating Embeddings with OpenAI
This example uses the pre-1.0 `openai` Python SDK; newer versions of the library expose the same call as `client.embeddings.create`.

```python
import os

import openai
from dotenv import load_dotenv

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

def get_embedding(text):
    """Return the embedding vector for a piece of text."""
    response = openai.Embedding.create(
        input=text,
        model="text-embedding-ada-002"
    )
    return response["data"][0]["embedding"]
```
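One practical caveat: `text-embedding-ada-002` accepts at most 8,191 tokens per input, so longer documents should be split into chunks before embedding. A rough character-based splitter is sketched below, using the common ~4-characters-per-token heuristic; `chunk_text` and its default limits are illustrative choices, not part of the OpenAI API:

```python
def chunk_text(text, max_chars=2000, overlap=200):
    """Split text into overlapping chunks small enough to embed individually.

    max_chars ~ 500 tokens under the rough 4-characters-per-token heuristic;
    the overlap keeps sentences that straddle a chunk boundary retrievable.
    """
    if len(text) <= max_chars:
        return [text]
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

Each chunk can then be passed to `get_embedding` and stored as its own vector. A production version would count real tokens (e.g. with a tokenizer) rather than characters.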
Storing and Querying Vectors in Pinecone
The snippet below uses the classic `pinecone-client` interface (`pinecone.init`); newer SDK versions construct a `Pinecone` client object instead. Note that we store `user_id` in the vector metadata and filter on it at query time, so one user's memories never leak into another user's context.

```python
from uuid import uuid4

import pinecone

pinecone.init(api_key="YOUR_PINECONE_API_KEY", environment="us-east1-gcp")
index = pinecone.Index("chat-memory")

def store_memory(user_id, text):
    vector = get_embedding(text)
    index.upsert([
        (f"{user_id}:{uuid4()}", vector, {"user_id": user_id, "text": text})
    ])

def retrieve_context(user_id, query):
    vector = get_embedding(query)
    results = index.query(
        vector=vector,
        top_k=5,
        include_metadata=True,
        filter={"user_id": {"$eq": user_id}},  # scope results to this user
    )
    return [match.metadata["text"] for match in results.matches]
```
Generating Context-Aware Responses
Now we can feed the retrieved context into OpenAI's chat completion API and return just the assistant's reply text.

```python
def chat_with_context(query, context):
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Context:\n" + "\n".join(context)},
        {"role": "user", "content": query},
    ]
    response = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    return response["choices"][0]["message"]["content"]
```
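The retrieved snippets can collectively exceed the model's prompt budget, so it helps to trim the context before building the messages list. A minimal sketch follows; the character budget is an assumption for illustration, and a production version would count tokens with a real tokenizer instead:

```python
def trim_context(context, max_chars=6000):
    """Keep the most relevant snippets (already ranked by the vector search)
    that fit within a rough character budget."""
    kept, used = [], 0
    for snippet in context:  # context arrives best-match-first
        if used + len(snippet) > max_chars:
            break
        kept.append(snippet)
        used += len(snippet)
    return kept
```

Because Pinecone returns matches in descending relevance order, truncating from the tail drops the least useful memories first.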
Next Steps
- Add user authentication and scope context to user sessions
- Set up periodic vector cleanups and deduplication
- Enable memory updating or forgetting based on relevance
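For the deduplication item above, one lightweight approach is to query the index for the nearest existing vector before upserting, and skip the write when the new memory is nearly identical. A sketch of the similarity check (the 0.97 threshold is an arbitrary assumption to tune for your data):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_near_duplicate(new_vector, nearest_vector, threshold=0.97):
    """True when the new memory is effectively already stored."""
    return cosine_similarity(new_vector, nearest_vector) >= threshold
```

`store_memory` could call this against the top-1 query result and return early on a near-duplicate, keeping the index lean without a separate cleanup job.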
Need Help Building It?
If you want to add a Pinecone-powered AI chat assistant to your app, I offer consulting and development services for startups, SaaS companies, and internal tools. Let's build something amazing together—reach out via my contact form.
Further Reading
- OpenAI's embedding docs
- Pinecone vector DB examples
- LangChain and RAG architecture basics