TechFrontier Original

Advanced AI Agent Techniques

Take your AI assistant to the next level with advanced features like tool usage, knowledge retrieval, and intelligent workflows.

60 min
Intermediate
Updated: March 2023
AI, Python, LangChain, Agents

Overview

Welcome to the Advanced AI Agent Techniques tutorial. If you've completed our Build Your First AI Agent guide, you're ready to take your AI assistant to the next level with sophisticated features that dramatically enhance its capabilities.

In this tutorial, you'll learn how to transform your basic AI assistant into a powerful AI agent that can:

  • Use external tools to perform real-world tasks
  • Maintain sophisticated memory systems for better context retention
  • Implement advanced reasoning techniques for complex problem-solving
  • Deploy your agent as a production-ready service

What Makes an AI Agent Different?

While a basic AI assistant can have conversations and generate text, an AI agent can:

  • Take Action: Interact with the outside world through tools and APIs
  • Plan & Reason: Break down complex tasks and determine the best approach
  • Learn & Remember: Retain information across sessions using sophisticated memory systems
  • Collaborate: Work with other systems and even other agents to accomplish goals

Prerequisites

Working Basic AI Agent

Before proceeding with this tutorial, you should have:

  • Completed our Build Your First AI Agent tutorial
  • A working AI assistant with a functional user interface
  • Familiarity with Python and basic AI concepts

We'll be building upon the code from the previous tutorial. If you haven't completed it yet, we recommend doing so before continuing here.

Required Packages

In addition to the packages from the basic tutorial, you'll need to install the following:

pip install langchain requests beautifulsoup4 duckduckgo-search faiss-cpu networkx matplotlib

Here's what each package is for:

  • langchain: Framework for building applications with language models
  • requests & beautifulsoup4: For fetching and parsing web pages
  • duckduckgo-search: Python client for the DuckDuckGo search engine
  • faiss-cpu: Vector store for embedding-based memory
  • networkx & matplotlib: For creating and visualizing knowledge graphs

System Requirements

Adding these advanced features will increase the system requirements:

  • At least 16GB RAM recommended
  • An additional 2GB of disk space
  • Internet connection for some features (like web search)

Integrating External Tools

The key feature that transforms an AI assistant into an AI agent is the ability to use tools to interact with the external world. In this section, we'll add several powerful tools to our agent.

Creating a Tool Interface

First, let's define a standard interface for tools that our agent can use:

# tools.py

from typing import List, Dict, Callable, Any, Optional
import inspect
import json

class Tool:
    def __init__(self, name: str, description: str, func: Callable):
        """
        Initialize a tool for the AI agent.
        
        Args:
            name: The name of the tool
            description: A description of what the tool does and how to use it
            func: The function that implements the tool's functionality
        """
        self.name = name
        self.description = description
        self.func = func
        self.signature = inspect.signature(func)
    
    def __call__(self, *args, **kwargs) -> Any:
        """Call the tool function with the provided arguments."""
        return self.func(*args, **kwargs)
    
    def get_parameters(self) -> Dict:
        """Get parameter information for this tool."""
        params = {}
        for name, param in self.signature.parameters.items():
            if name == 'self':
                continue
            annotation = param.annotation
            params[name] = {
                'type': 'Any' if annotation is inspect.Parameter.empty else str(annotation),
                'default': None if param.default is inspect.Parameter.empty else param.default
            }
        return params
    
    def to_dict(self) -> Dict:
        """Convert tool to a dictionary for JSON serialization."""
        return {
            'name': self.name,
            'description': self.description,
            'parameters': self.get_parameters()
        }


class ToolRegistry:
    def __init__(self):
        """Initialize a registry for tools that the agent can use."""
        self.tools: Dict[str, Tool] = {}
    
    def register_tool(self, tool: Tool) -> None:
        """Register a tool in the registry."""
        self.tools[tool.name] = tool
    
    def get_tool(self, name: str) -> Optional[Tool]:
        """Get a tool by name."""
        return self.tools.get(name)
    
    def list_tools(self) -> List[Dict]:
        """List all available tools."""
        return [tool.to_dict() for tool in self.tools.values()]
    
    def execute_tool(self, name: str, args_json: str) -> Any:
        """Execute a tool by name with JSON-serialized arguments."""
        tool = self.get_tool(name)
        if not tool:
            return f"Error: Tool '{name}' not found"
        
        try:
            args = json.loads(args_json)
            return tool(**args)
        except Exception as e:
            return f"Error executing tool: {str(e)}"

This code creates a structured way to define tools and a registry to manage them. Each tool has a name, description, and a function that implements its functionality.
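
To see the interface in action, here's a minimal sketch that registers a trivial, purely illustrative word_count tool and invokes it through the registry:

# example_tool_usage.py (illustrative)
from tools import Tool, ToolRegistry

def get_word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

registry = ToolRegistry()
registry.register_tool(Tool(
    name="word_count",
    description="Count the number of words in a text.",
    func=get_word_count
))

# The agent passes arguments as a JSON string, so we do the same here
print(registry.execute_tool("word_count", '{"text": "hello agent world"}'))  # 3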

Web Search Tool

One of the most powerful tools we can give our agent is the ability to search the web for information:

# web_search_tool.py

from duckduckgo_search import DDGS
from tools import Tool

def search_web(query: str, num_results: int = 5) -> str:
    """
    Search the web for information using DuckDuckGo.
    
    Args:
        query: The search query
        num_results: Number of results to return (default: 5)
    
    Returns:
        A string with the search results
    """
    try:
        with DDGS() as ddgs:
            results = list(ddgs.text(query, max_results=num_results))
        
        if not results:
            return "No results found."
        
        formatted_results = []
        for i, result in enumerate(results, 1):
            formatted_results.append(f"{i}. {result['title']}")
            formatted_results.append(f"   {result['body']}")
            formatted_results.append(f"   Source: {result['href']}")
            formatted_results.append("")
        
        return "\n".join(formatted_results)
    except Exception as e:
        return f"Error searching the web: {str(e)}"

# Create a Tool object for web search
web_search_tool = Tool(
    name="web_search",
    description="Search the web for information. Useful for questions about current events or specific facts.",
    func=search_web
)
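
You can sanity-check the tool on its own before handing it to the agent:

# Quick manual test (requires an internet connection)
print(search_web("latest developments in AI agents", num_results=3))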

Privacy Note

When using web search, be aware that:

  • Queries will be sent to external search engines
  • Your IP address may be visible to these services
  • For complete privacy, you may want to route requests through a proxy

Advanced Memory Systems

One of the limitations of basic AI assistants is their inability to remember past interactions effectively. In this section, we'll implement advanced memory systems that allow our agent to retain and retrieve information over time.

Vector Store Memory

Vector stores are a powerful way to implement semantic memory for our agent. They allow us to store and retrieve information based on meaning rather than exact matching:

# vector_memory.py

import os
import numpy as np
import faiss
from typing import List, Dict, Tuple, Optional
import json
import torch
from transformers import AutoTokenizer, AutoModel

class VectorMemory:
    def __init__(self, model_name: str = "sentence-transformers/all-MiniLM-L6-v2"):
        """
        Initialize a vector-based memory system.
        
        Args:
            model_name: The name of the sentence transformer model to use
        """
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name)
        self.dimension = self.model.config.hidden_size
        self.index = faiss.IndexFlatL2(self.dimension)
        self.texts = []
        
    def _encode(self, text: str) -> np.ndarray:
        """Encode text into a vector representation."""
        inputs = self.tokenizer(text, return_tensors='pt', 
                               padding=True, truncation=True, max_length=512)
        with torch.no_grad():
            outputs = self.model(**inputs)
        
        # Attention-masked mean pooling (ignores padding tokens)
        mask = inputs['attention_mask'].unsqueeze(-1).float()
        summed = (outputs.last_hidden_state * mask).sum(dim=1)
        counts = mask.sum(dim=1).clamp(min=1e-9)
        return (summed / counts).cpu().numpy()
    
    def add(self, text: str) -> None:
        """Add a text to memory."""
        vector = self._encode(text)
        self.index.add(vector)
        self.texts.append(text)
    
    def search(self, query: str, k: int = 5) -> List[Tuple[int, str, float]]:
        """
        Search for similar texts in memory.
        
        Args:
            query: The search query
            k: Number of results to return
        
        Returns:
            List of tuples containing (index, text, distance)
        """
        vector = self._encode(query)
        distances, indices = self.index.search(vector, k)
        
        results = []
        for i, idx in enumerate(indices[0]):
            if 0 <= idx < len(self.texts):  # faiss returns -1 for missing results
                results.append((int(idx), self.texts[idx], float(distances[0][i])))
        
        return results
    
    def save(self, filepath: str) -> None:
        """Save the memory to disk."""
        directory = os.path.dirname(filepath)
        if directory and not os.path.exists(directory):
            os.makedirs(directory)
            
        faiss.write_index(self.index, f"{filepath}.index")
        with open(f"{filepath}.texts", "w") as f:
            json.dump(self.texts, f)
    
    @classmethod
    def load(cls, filepath: str, model_name: str = "sentence-transformers/all-MiniLM-L6-v2") -> 'VectorMemory':
        """Load a memory from disk."""
        memory = cls(model_name)
        memory.index = faiss.read_index(f"{filepath}.index")
        with open(f"{filepath}.texts", "r") as f:
            memory.texts = json.load(f)
        return memory

This vector store allows our agent to store information and retrieve it based on semantic similarity, which is much more powerful than simple keyword matching.
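
Here's a minimal sketch of the memory in use (the model weights are downloaded from Hugging Face on first run):

# vector_memory_demo.py (illustrative)
from vector_memory import VectorMemory

memory = VectorMemory()
memory.add("The user's favorite programming language is Python.")
memory.add("The user lives in Berlin and works on robotics.")

# Retrieval is by meaning, not keywords: "reside" never appears above
for idx, text, distance in memory.search("Where does the user reside?", k=1):
    print(f"{text} (L2 distance: {distance:.3f})")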

Conversation Summarization

For long conversations, we can use summarization to compress the history while retaining the important information:

# conversation_summary.py

class ConversationSummaryMemory:
    def __init__(self, llm, max_tokens=1000):
        """
        Initialize a conversation summary memory.
        
        Args:
            llm: The language model to use for summarization
            max_tokens: Maximum number of tokens to keep in the summary
        """
        self.llm = llm
        self.max_tokens = max_tokens
        self.summary = ""
        self.current_conversation = []
    
    def add_interaction(self, user_input: str, ai_response: str) -> None:
        """Add a user-AI interaction to the memory."""
        self.current_conversation.append({
            "user": user_input,
            "ai": ai_response
        })
        
        # If the conversation is getting long, summarize it
        if self._estimate_tokens(str(self.current_conversation)) > self.max_tokens:
            self._summarize_conversation()
    
    def _summarize_conversation(self) -> None:
        """Summarize the current conversation and reset it."""
        if not self.current_conversation:
            return
            
        conversation_text = "\n".join([
            f"User: {turn['user']}\nAI: {turn['ai']}"
            for turn in self.current_conversation
        ])
        
        prompt = f"""
        Previous conversation summary:
        {self.summary}
        
        New conversation:
        {conversation_text}
        
        Create an updated summary of the entire conversation that includes the key points from both the previous summary and the new conversation.
        """
        
        self.summary = self.llm.generate_response(prompt, "You are an expert summarizer.")
        self.current_conversation = []
    
    def get_context(self) -> str:
        """Get the current conversation context."""
        if self.summary:
            result = f"Conversation summary: {self.summary}\n\nRecent messages:\n"
        else:
            result = "Conversation:\n"
            
        for turn in self.current_conversation:
            result += f"User: {turn['user']}\nAI: {turn['ai']}\n"
            
        return result
    
    def _estimate_tokens(self, text: str) -> int:
        """Roughly estimate the number of tokens in a text."""
        # A very rough approximation: 1 token ≈ 4 characters
        return len(text) // 4
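
Here's how this memory slots into a conversation loop (a sketch, assuming the llm object from the basic tutorial exposes generate_response(prompt, system_prompt)):

# Illustrative usage; `llm` comes from the basic tutorial
memory = ConversationSummaryMemory(llm, max_tokens=500)
memory.add_interaction(
    "What's the capital of France?",
    "The capital of France is Paris."
)

# Prepend this context to your next prompt so the model sees prior turns
print(memory.get_context())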

Building Reasoning Chains

To handle complex tasks, modern AI agents use structured reasoning techniques that break a problem down into explicit steps. This markedly improves their ability to solve multi-step problems that a single-pass response often gets wrong.

Chain of Thought Reasoning

Chain of Thought (CoT) prompting encourages the model to work through a problem step by step:

# chain_of_thought.py

def chain_of_thought_prompt(question: str) -> str:
    """
    Create a prompt that encourages chain-of-thought reasoning.
    
    Args:
        question: The user's question
    
    Returns:
        A prompt template that encourages step-by-step reasoning
    """
    return f"""
    Question: {question}
    
    Let's think through this step by step:
    1. First, I'll identify what information we need to solve this problem.
    2. Then, I'll gather the relevant facts and determine what approach to take.
    3. Next, I'll work through the solution methodically.
    4. Finally, I'll verify my answer and provide a clear explanation.
    
    Step-by-step solution:
    """

def generate_cot_response(llm, question: str) -> str:
    """
    Generate a response using chain of thought reasoning.
    
    Args:
        llm: The language model to use
        question: The user's question
    
    Returns:
        The model's response
    """
    prompt = chain_of_thought_prompt(question)
    response = llm.generate_response(prompt, "You are a logical problem solver.")
    
    # Extract the final answer from the reasoning
    lines = response.strip().split('\n')
    answer_line = None
    
    for line in reversed(lines):
        stripped = line.strip()
        if stripped.startswith(("Therefore,", "In conclusion,", "The answer is")):
            answer_line = stripped
            break
    
    if answer_line:
        return f"{response}\n\nFinal answer: {answer_line}"
    else:
        return response
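
Using it is a single call (again assuming the generate_response interface from the basic tutorial):

# Illustrative usage
answer = generate_cot_response(
    llm,
    "A train departs at 9:15 and the trip takes 2 hours 50 minutes. When does it arrive?"
)
print(answer)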

ReAct Framework

The ReAct (Reasoning + Acting) framework combines reasoning with tool usage for more powerful capabilities:

# react_agent.py

from typing import List, Dict, Any, Optional
import json
import re

class ReActAgent:
    def __init__(self, llm, tool_registry):
        """
        Initialize a ReAct agent.
        
        Args:
            llm: The language model to use
            tool_registry: The registry of tools available to the agent
        """
        self.llm = llm
        self.tool_registry = tool_registry
        self.max_steps = 10  # Maximum number of reasoning steps
    
    def run(self, user_query: str) -> str:
        """
        Run the ReAct agent on a user query.
        
        Args:
            user_query: The user's query
        
        Returns:
            The agent's final response
        """
        available_tools = self.tool_registry.list_tools()
        tool_descriptions = "\n".join([
            f"- {tool['name']}: {tool['description']}"
            for tool in available_tools
        ])
        
        system_prompt = f"""
        You are an intelligent agent that can reason and use tools to solve problems.
        
        Available tools:
        {tool_descriptions}
        
        For each step:
        1. Think about what needs to be done
        2. Decide if you need to use a tool or can answer directly
        3. If using a tool:
           - Specify the tool name in [TOOL] tags
           - Provide the arguments as a JSON object in [ARGS] tags
        4. After using tools or reasoning, provide your final answer in [FINAL ANSWER] tags
        
        Example:
        I need to search for current information.
        [TOOL] web_search
        [ARGS] {{"query": "latest news about AI"}}
        
        Based on my search, I found that...
        [FINAL ANSWER] Here is the information you requested...
        """
        
        prompt = f"User query: {user_query}\n\nLet's solve this step by step:"
        
        # Track the conversation
        conversation = [{"role": "system", "content": system_prompt},
                       {"role": "user", "content": prompt}]
        
        # Execute the reasoning steps
        for step in range(self.max_steps):
            # Get the next reasoning step from the LLM
            response = self.llm.chat_completion(conversation)
            reasoning_step = response.choices[0].message.content
            
            # Check if the agent has provided a final answer
            if "[FINAL ANSWER]" in reasoning_step:
                final_answer = re.search(r'\[FINAL ANSWER\](.*?)($|(?=\[TOOL\]))', 
                                        reasoning_step, re.DOTALL)
                if final_answer:
                    return final_answer.group(1).strip()
            
            # Check if the agent wants to use a tool
            if "[TOOL]" in reasoning_step and "[ARGS]" in reasoning_step:
                # Extract tool name and arguments
                tool_match = re.search(r'\[TOOL\](.*?)($|(?=\[ARGS\]))', reasoning_step, re.DOTALL)
                args_match = re.search(r'\[ARGS\](.*?)($|(?=\[TOOL\]|\[FINAL ANSWER\]))', 
                                     reasoning_step, re.DOTALL)
                
                if tool_match and args_match:
                    tool_name = tool_match.group(1).strip()
                    args_json = args_match.group(1).strip()
                    
                    # Execute the tool
                    tool_result = self.tool_registry.execute_tool(tool_name, args_json)
                    
                    # Add the tool execution to the conversation
                    observation = f"Observation: {tool_result}"
                    conversation.append({"role": "assistant", "content": reasoning_step})
                    conversation.append({"role": "user", "content": observation})
                    continue
            
            # If we get here, the reasoning step didn't trigger a tool or final answer
            conversation.append({"role": "assistant", "content": reasoning_step})
            conversation.append({"role": "user", "content": "Continue your reasoning. If you have an answer, provide it in [FINAL ANSWER] tags."})
        
        # If we've reached the maximum number of steps without a final answer
        return "I've thought about this problem extensively but haven't reached a definitive conclusion. Here's what I've determined so far: " + reasoning_step

Be Careful with Recursion

When implementing reasoning chains, be wary of infinite loops or deep recursion:

  • Always set a maximum number of reasoning steps
  • Implement clear stopping conditions
  • Add timeouts to prevent excessive computation (see the sketch below)
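
For example, a simple wall-clock budget can be checked at the top of each reasoning step (a minimal sketch; TimeBudget is not part of the code above):

import time

class TimeBudget:
    """Wall-clock budget for a reasoning loop."""
    def __init__(self, seconds: float):
        self.deadline = time.monotonic() + seconds

    def exceeded(self) -> bool:
        return time.monotonic() > self.deadline

# Inside ReActAgent.run you could then write:
#   budget = TimeBudget(30.0)
#   for step in range(self.max_steps):
#       if budget.exceeded():
#           return "Stopped early: time budget exceeded."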

Production Deployment

Once you've built an advanced AI agent, you may want to deploy it as a production service that others can use. This section covers the key considerations for deployment.

Packaging Your Agent

To make your agent easy to distribute and install, you can package it as a Python package:

# Project structure for packaging
my_ai_agent/
├── pyproject.toml
├── README.md
├── LICENSE
└── src/
    └── my_ai_agent/
        ├── __init__.py
        ├── agent.py         # Main agent code
        ├── tools.py         # Tool definitions
        ├── memory.py        # Memory systems
        ├── reasoning.py     # Reasoning chains
        ├── web_interface.py # Web UI code
        └── run_agent.py     # Entry point script

Create a pyproject.toml file to define your package:

# pyproject.toml
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "my-ai-agent"
version = "0.1.0"
description = "An advanced AI agent with tools, memory, and reasoning capabilities"
readme = "README.md"
authors = [
    {name = "Your Name", email = "your.email@example.com"}
]
license = {text = "MIT"}
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
]
requires-python = ">=3.8"
dependencies = [
    "gradio>=3.32.0",
    "torch>=2.0.0",
    "transformers>=4.30.0",
    "langchain>=0.0.200",
    "faiss-cpu>=1.7.4",
    "duckduckgo-search>=2.8.6",
    "requests>=2.28.0",
    "beautifulsoup4>=4.12.0",
    "networkx>=3.1",
    "matplotlib>=3.7.0"
]

[project.scripts]
my-ai-agent = "my_ai_agent.run_agent:main"

To build and install your package:

# Install the build tool, then build the package
pip install build
python -m build

# Install from the local build
pip install dist/my_ai_agent-0.1.0-py3-none-any.whl

Creating an API Service

You can deploy your agent as an API service using Flask:

# api_service.py
from flask import Flask, request, jsonify
from my_ai_agent.agent import ReActAgent
from my_ai_agent.tools import ToolRegistry
from my_ai_agent.memory import VectorMemory
import os

app = Flask(__name__)

# Initialize the agent components
llm = initialize_llm()  # Your LLM initialization code
tool_registry = ToolRegistry()
vector_memory = VectorMemory()

# Register tools
# ... (your tool registration code)

# Create the agent
agent = ReActAgent(llm, tool_registry)

@app.route('/api/query', methods=['POST'])
def query_agent():
    data = request.get_json(silent=True)
    if not data or 'query' not in data:
        return jsonify({'error': 'No query provided'}), 400
    
    user_query = data['query']
    session_id = data.get('session_id', 'default')
    
    # Handle conversation context if provided
    if 'context' in data:
        # Process context (e.g., add to memory)
        pass
    
    # Run the agent
    response = agent.run(user_query)
    
    return jsonify({
        'response': response,
        'session_id': session_id
    })

if __name__ == '__main__':
    port = int(os.environ.get('PORT', 5000))
    app.run(host='0.0.0.0', port=port)
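
To verify the endpoint, you can send a test request with the requests library (assuming the service is running locally on port 5000):

# test_api.py (illustrative client)
import requests

resp = requests.post(
    "http://localhost:5000/api/query",
    json={"query": "Summarize today's AI news."}
)
resp.raise_for_status()
print(resp.json()["response"])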

For production deployment, you'd typically use a WSGI server like Gunicorn and possibly a reverse proxy like Nginx:

# Run with Gunicorn
gunicorn -w 4 -b 0.0.0.0:5000 api_service:app

Security Considerations

When deploying your agent as a service:

  • Add proper authentication and authorization
  • Set rate limits to prevent abuse (see the sketch after this list)
  • Validate all user inputs
  • Consider implementing a content filter
  • Use HTTPS for all API endpoints
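
As a concrete example, rate limiting can be bolted on with the Flask-Limiter package (a sketch based on the Flask-Limiter 3.x API; install it with pip install Flask-Limiter):

# Rate limiting sketch for api_service.py
from flask import Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)
limiter = Limiter(
    get_remote_address,            # identify clients by IP address
    app=app,
    default_limits=["60 per hour"]
)

# Then decorate the expensive agent endpoint:
# @app.route('/api/query', methods=['POST'])
# @limiter.limit("10 per minute")
# def query_agent():
#     ...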

Next Steps and Resources

Congratulations on completing this advanced AI agent tutorial! You've learned how to enhance your AI assistant with tools, sophisticated memory systems, advanced reasoning capabilities, and deployment techniques.

Where to Go From Here

To continue your AI agent development journey:

Fine-tune Your Model

Fine-tune the underlying language model on your specific domain for better performance.

Learn about fine-tuning

Multi-Agent Systems

Build systems where multiple agents collaborate to solve complex problems.

Explore multi-agent architectures

Embodied AI

Connect your agent to physical hardware like robots or smart home devices.

Discover embodied AI

AI Safety and Alignment

Learn techniques to ensure your agent behaves safely and aligns with human values.

Study AI safety

Additional Resources

Share Your Creation!

We'd love to see what you build with the techniques from this tutorial. Join our community and share your AI agent projects!
