Build an AI Assistant
Develop an intelligent virtual assistant that can use tools, search knowledge bases, and complete tasks — going beyond simple conversation.
Introduction
An AI assistant goes far beyond a chatbot. While a chatbot can only converse, an assistant can use tools, search databases, execute actions, and integrate with external systems to actually get things done.
In this guide, you will build an AI assistant with function calling and retrieval-augmented generation (RAG) — two of the most powerful techniques in modern AI development.
Chatbot vs. Assistant
Understanding the difference is important:
- Chatbot — Responds to messages using only its training data. It can answer questions and hold conversations, but cannot take actions or access external information.
- AI Assistant — Can use tools (APIs, databases, file systems), retrieve real-time information, execute functions, and take actions on behalf of the user.
💡 Think of It This Way
A chatbot is like talking to someone who knows a lot but is sitting in an empty room. An assistant is like talking to someone who has a computer, phone, and filing cabinet at their desk — they can look things up and take action.
Assistant Architecture
A well-designed AI assistant follows a four-step pipeline:
Understand the Request
The AI parses the user's message to understand what they need. It identifies whether this is a simple question, a tool-based task, or a multi-step request.
Reason and Plan
The AI decides which tools or data sources it needs. For complex requests, it breaks the task into smaller steps.
Execute Tools
The AI calls the appropriate functions — searching a database, calling an API, reading a file — and collects the results.
Formulate Response
The AI combines the tool results with its own knowledge to generate a clear, helpful response.
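The four steps above can be sketched as a plain control flow before any model is involved. This is a conceptual outline only: handle_request and its naive keyword check are hypothetical placeholders, not part of any API — in the real assistant, the model itself performs the understanding and planning.

```python
def handle_request(user_message):
    # 1. Understand: classify the request (here, a naive keyword check
    #    stands in for the model's interpretation)
    needs_tool = "weather" in user_message.lower()

    # 2. Reason and plan: decide which tools to run
    plan = ["get_weather"] if needs_tool else []

    # 3. Execute tools and collect the results
    results = {}
    for tool in plan:
        if tool == "get_weather":
            results[tool] = {"temp": "18°C", "condition": "Sunny"}

    # 4. Formulate a response from the tool results,
    #    falling back to model knowledge when no tool ran
    if results:
        w = results["get_weather"]
        return f"It is {w['temp']} and {w['condition'].lower()}."
    return "Answering from model knowledge."
```

Each stage in the real pipeline replaces one of these stubs: the LLM does steps 1, 2, and 4, while your code does step 3.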
Function Calling
Function calling allows the AI to interact with external systems. You define the available tools, and the AI decides when and how to use them.
import openai, json, os
from dotenv import load_dotenv

load_dotenv()
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Define tools the assistant can use
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g. Berlin"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Search the company knowledge base",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    }
                },
                "required": ["query"]
            }
        }
    }
]

# Implement the actual functions
def get_weather(location):
    # In production, call a real weather API
    return {"temp": "18°C", "condition": "Sunny", "location": location}

def search_database(query):
    # In production, search your actual database
    return {"results": [f"Found info about: {query}"]}

Now let us write the logic that handles the AI's decision to call a function:
def run_assistant(user_message, conversation_history):
    conversation_history.append(
        {"role": "user", "content": user_message}
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=conversation_history,
        tools=tools,
        tool_choice="auto"
    )
    message = response.choices[0].message

    # Check if the model wants to call a function
    if message.tool_calls:
        conversation_history.append(message)
        for tool_call in message.tool_calls:
            fn_name = tool_call.function.name
            fn_args = json.loads(tool_call.function.arguments)

            # Execute the function
            if fn_name == "get_weather":
                result = get_weather(**fn_args)
            elif fn_name == "search_database":
                result = search_database(**fn_args)
            else:
                result = {"error": f"Unknown tool: {fn_name}"}

            conversation_history.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result)
            })

        # Get the final response with the function results included
        final_response = client.chat.completions.create(
            model="gpt-4",
            messages=conversation_history
        )
        return final_response.choices[0].message.content

    return message.content

Retrieval-Augmented Generation (RAG)
RAG lets your assistant answer questions from your own documents, not just from its training data. It works by finding relevant document chunks and injecting them into the AI's context.
from openai import OpenAI

client = OpenAI()

def create_embeddings(texts):
    """Convert text chunks into vector embeddings."""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )
    return [item.embedding for item in response.data]

def find_relevant_context(query, documents, doc_embeddings):
    """Find the most relevant documents for a query."""
    query_embedding = create_embeddings([query])[0]

    # OpenAI embeddings are unit-length, so this dot product
    # equals cosine similarity
    similarities = []
    for i, doc_emb in enumerate(doc_embeddings):
        similarity = sum(a * b for a, b in zip(query_embedding, doc_emb))
        similarities.append((similarity, documents[i]))

    # Return the top 3 most relevant documents
    similarities.sort(reverse=True)
    return [doc for _, doc in similarities[:3]]

def ask_with_context(query, context_docs):
    """Ask the AI with retrieved context."""
    context = "\n\n".join(context_docs)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"""Answer based on this context:
{context}
If the answer isn't in the context, say so."""},
            {"role": "user", "content": query}
        ]
    )
    return response.choices[0].message.content

✅ Choosing a Vector Database
For production RAG systems, use a dedicated vector database like Pinecone, Weaviate, or ChromaDB instead of the simple in-memory approach shown here. They handle millions of documents efficiently.
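Note that the dot-product shortcut in find_relevant_context above only works because OpenAI embeddings come back unit-normalized. If you swap in an embedding model whose vectors are not normalized, compute the full cosine similarity instead. A minimal sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Because cosine similarity divides out vector length, it gives the same ranking regardless of whether the embeddings are normalized.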
Building the Assistant
Let us combine function calling and RAG into a complete assistant:
class AIAssistant:
    def __init__(self, name, system_prompt, tools=None):
        self.name = name
        self.client = OpenAI()
        self.tools = tools or []
        self.history = [{"role": "system", "content": system_prompt}]
        self.knowledge_base = []
        self.kb_embeddings = []

    def add_knowledge(self, documents):
        """Add documents to the assistant's knowledge base."""
        self.knowledge_base.extend(documents)
        self.kb_embeddings = create_embeddings(self.knowledge_base)

    def ask(self, question):
        """Process a question with RAG + function calling."""
        # Step 1: Retrieve relevant context
        context = ""
        if self.knowledge_base:
            relevant = find_relevant_context(
                question, self.knowledge_base, self.kb_embeddings
            )
            context = "\nRelevant info: " + " | ".join(relevant)

        # Step 2: Add the context to the question
        enhanced_question = question + context

        # Step 3: Run through the LLM with tools
        return run_assistant(enhanced_question, self.history)

# Usage
assistant = AIAssistant(
    name="Company Helper",
    system_prompt="You are a helpful company assistant.",
    tools=tools
)

assistant.add_knowledge([
    "Our office hours are 9 AM to 6 PM CET.",
    "We offer free consultations on Tuesdays.",
    "Our main product is an AI workspace platform."
])

response = assistant.ask("What are your office hours?")
print(response)

Testing Your Assistant
Thorough testing is essential for AI assistants. Test these scenarios:
- Simple questions that should be answered from the knowledge base.
- Questions that require tool usage (e.g., weather queries).
- Multi-step requests that need both knowledge retrieval and tool execution.
- Edge cases: questions outside the assistant's scope, ambiguous requests, and simultaneous tool calls.
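The tool-usage scenarios can be smoke-tested without live API calls by exercising the tool functions directly. The sketch below redefines the get_weather and search_database stubs from earlier so it runs standalone; a real suite would additionally mock the model's tool-call decisions to cover the dispatch logic in run_assistant.

```python
# Stubs mirroring the tool implementations from earlier in this guide
def get_weather(location):
    return {"temp": "18°C", "condition": "Sunny", "location": location}

def search_database(query):
    return {"results": [f"Found info about: {query}"]}

def test_tools():
    # Tool usage: weather queries return structured data
    weather = get_weather("Berlin")
    assert weather["location"] == "Berlin"
    assert "temp" in weather

    # Knowledge-base search returns a non-empty result list
    hits = search_database("office hours")
    assert hits["results"], "expected at least one result"

test_tools()
```

Keeping the tool functions pure (inputs in, dict out) is what makes them this easy to test in isolation.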
Summary
You have built a powerful AI assistant. Key takeaways:
- AI assistants extend chatbots with tools, knowledge bases, and action capabilities.
- Function calling lets the AI decide when to use external tools and APIs.
- RAG enables the assistant to answer from your own documents with high accuracy.
- Combine function calling and RAG for assistants that can both know and do.