Build an AI Assistant
Develop an intelligent virtual assistant that can use tools, search knowledge bases, and complete tasks — going beyond simple conversation.
Introduction
An AI assistant goes far beyond a chatbot. While a chatbot can only converse, an assistant can use tools, search databases, execute actions, and integrate with external systems to actually get things done.
In this guide, you will build an AI assistant with function calling and retrieval-augmented generation (RAG) — two of the most powerful techniques in modern AI development.
Chatbot vs. Assistant
Understanding the difference is important:
- Chatbot — Responds to messages using only its training data. It can answer questions and hold conversations, but cannot take actions or access external information.
- AI Assistant — Can use tools (APIs, databases, file systems), retrieve real-time information, execute functions, and take actions on behalf of the user.
💡 Think of It This Way
A chatbot is like talking to someone who knows a lot but is sitting in an empty room. An assistant is like talking to someone who has a computer, phone, and filing cabinet at their desk — they can look things up and take action.
Assistant Architecture
A well-designed AI assistant follows a four-step pipeline:
Understand the Request
The AI parses the user's message to understand what they need. It identifies whether this is a simple question, a tool-based task, or a multi-step request.
Reason and Plan
The AI decides which tools or data sources it needs. For complex requests, it breaks the task into smaller steps.
Execute Tools
The AI calls the appropriate functions — searching a database, calling an API, reading a file — and collects the results.
Formulate Response
The AI combines the tool results with its own knowledge to generate a clear, helpful response.
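The four steps above can be sketched as a plain control flow before any model is involved. This is a conceptual outline only: handle_request and its naive keyword check are hypothetical placeholders, not part of any API — in the real assistant, the model itself performs the understanding and planning.

```python
def handle_request(user_message):
    # 1. Understand: classify the request (here, a naive keyword check
    #    stands in for the model's interpretation)
    needs_tool = "weather" in user_message.lower()

    # 2. Reason and plan: decide which tools to run
    plan = ["get_weather"] if needs_tool else []

    # 3. Execute tools and collect the results
    results = {}
    for tool in plan:
        if tool == "get_weather":
            results[tool] = {"temp": "18°C", "condition": "Sunny"}

    # 4. Formulate a response from the tool results,
    #    falling back to model knowledge when no tool ran
    if results:
        w = results["get_weather"]
        return f"It is {w['temp']} and {w['condition'].lower()}."
    return "Answering from model knowledge."
```

Each stage in the real pipeline replaces one of these stubs: the LLM does steps 1, 2, and 4, while your code does step 3.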
Function Calling
Function calling allows the AI to interact with external systems. You define the available tools, and the AI decides when and how to use them.
import openai, json, os
from dotenv import load_dotenv

load_dotenv()
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Define tools the assistant can use
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g. Berlin"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Search the company knowledge base",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    }
                },
                "required": ["query"]
            }
        }
    }
]

# Implement the actual functions
def get_weather(location):
    # In production, call a real weather API
    return {"temp": "18°C", "condition": "Sunny", "location": location}

def search_database(query):
    # In production, search your actual database
    return {"results": [f"Found info about: {query}"]}

Now let us write the logic that handles the AI's decision to call a function:
def run_assistant(user_message, conversation_history):
    conversation_history.append(
        {"role": "user", "content": user_message}
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=conversation_history,
        tools=tools,
        tool_choice="auto"
    )
    message = response.choices[0].message

    # Check if the model wants to call a function
    if message.tool_calls:
        conversation_history.append(message)
        for tool_call in message.tool_calls:
            fn_name = tool_call.function.name
            fn_args = json.loads(tool_call.function.arguments)

            # Execute the function
            if fn_name == "get_weather":
                result = get_weather(**fn_args)
            elif fn_name == "search_database":
                result = search_database(**fn_args)
            else:
                result = {"error": f"Unknown tool: {fn_name}"}

            conversation_history.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result)
            })

        # Get the final response with the function results included
        final_response = client.chat.completions.create(
            model="gpt-4",
            messages=conversation_history
        )
        return final_response.choices[0].message.content

    return message.content

Retrieval-Augmented Generation (RAG)
RAG lets your assistant answer questions from your own documents, not just from its training data. It works by finding relevant document chunks and injecting them into the AI's context.
from openai import OpenAI

client = OpenAI()

def create_embeddings(texts):
    """Convert text chunks into vector embeddings."""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )
    return [item.embedding for item in response.data]

def find_relevant_context(query, documents, doc_embeddings):
    """Find the most relevant documents for a query."""
    query_embedding = create_embeddings([query])[0]

    # OpenAI embeddings are unit-length, so this dot product
    # equals cosine similarity
    similarities = []
    for i, doc_emb in enumerate(doc_embeddings):
        similarity = sum(a * b for a, b in zip(query_embedding, doc_emb))
        similarities.append((similarity, documents[i]))

    # Return the top 3 most relevant documents
    similarities.sort(reverse=True)
    return [doc for _, doc in similarities[:3]]

def ask_with_context(query, context_docs):
    """Ask the AI with retrieved context."""
    context = "\n\n".join(context_docs)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"""Answer based on this context:
{context}
If the answer isn't in the context, say so."""},
            {"role": "user", "content": query}
        ]
    )
    return response.choices[0].message.content

✅ Choosing a Vector Database
For production RAG systems, use a dedicated vector database like Pinecone, Weaviate, or ChromaDB instead of the simple in-memory approach shown here. They handle millions of documents efficiently.
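Note that the dot-product shortcut in find_relevant_context above only works because OpenAI embeddings come back unit-normalized. If you swap in an embedding model whose vectors are not normalized, compute the full cosine similarity instead. A minimal sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Because cosine similarity divides out vector length, it gives the same ranking regardless of whether the embeddings are normalized.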
Building the Assistant
Let us combine function calling and RAG into a complete assistant:
class AIAssistant:
    def __init__(self, name, system_prompt, tools=None):
        self.name = name
        self.client = OpenAI()
        self.tools = tools or []
        self.history = [{"role": "system", "content": system_prompt}]
        self.knowledge_base = []
        self.kb_embeddings = []

    def add_knowledge(self, documents):
        """Add documents to the assistant's knowledge base."""
        self.knowledge_base.extend(documents)
        self.kb_embeddings = create_embeddings(self.knowledge_base)

    def ask(self, question):
        """Process a question with RAG + function calling."""
        # Step 1: Retrieve relevant context
        context = ""
        if self.knowledge_base:
            relevant = find_relevant_context(
                question, self.knowledge_base, self.kb_embeddings
            )
            context = "\nRelevant info: " + " | ".join(relevant)

        # Step 2: Add the context to the question
        enhanced_question = question + context

        # Step 3: Run through the LLM with tools
        return run_assistant(enhanced_question, self.history)

# Usage
assistant = AIAssistant(
    name="Company Helper",
    system_prompt="You are a helpful company assistant.",
    tools=tools
)

assistant.add_knowledge([
    "Our office hours are 9 AM to 6 PM CET.",
    "We offer free consultations on Tuesdays.",
    "Our main product is an AI workspace platform."
])

response = assistant.ask("What are your office hours?")
print(response)

Testing Your Assistant
Thorough testing is essential for AI assistants. Test these scenarios:
- Simple questions that should be answered from the knowledge base.
- Questions that require tool usage (e.g., weather queries).
- Multi-step requests that need both knowledge retrieval and tool execution.
- Edge cases: questions outside the assistant's scope, ambiguous requests, and simultaneous tool calls.
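The tool-usage scenarios can be smoke-tested without live API calls by exercising the tool functions directly. The sketch below redefines the get_weather and search_database stubs from earlier so it runs standalone; a real suite would additionally mock the model's tool-call decisions to cover the dispatch logic in run_assistant.

```python
# Stubs mirroring the tool implementations from earlier in this guide
def get_weather(location):
    return {"temp": "18°C", "condition": "Sunny", "location": location}

def search_database(query):
    return {"results": [f"Found info about: {query}"]}

def test_tools():
    # Tool usage: weather queries return structured data
    weather = get_weather("Berlin")
    assert weather["location"] == "Berlin"
    assert "temp" in weather

    # Knowledge-base search returns a non-empty result list
    hits = search_database("office hours")
    assert hits["results"], "expected at least one result"

test_tools()
```

Keeping the tool functions pure (inputs in, dict out) is what makes them this easy to test in isolation.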
Summary
You have built a powerful AI assistant. Key takeaways:
- AI assistants extend chatbots with tools, knowledge bases, and action capabilities.
- Function calling lets the AI decide when to use external tools and APIs.
- RAG enables the assistant to answer from your own documents with high accuracy.
- Combine function calling and RAG for assistants that can both know and do.