TechFrontier Original

Build Your First Local AI Agent

Create a powerful AI assistant that runs entirely on your computer, giving you full control over your data and privacy.

45 min
Beginner
Updated: June 2023
AI, Python, LLM
[Image: AI agent illustration]

Overview

Welcome to this comprehensive guide on building your own local AI agent. In this tutorial, you'll learn how to create a powerful AI assistant that runs entirely on your computer, giving you full control over your data and privacy.

By the end of this tutorial, you'll have a functional AI assistant that can:

  • Answer questions on a wide range of topics
  • Generate creative content like stories and poems
  • Help with coding and problem-solving
  • Run completely offline without sending your data to external servers

Why Build a Local AI Agent?

Running AI locally gives you several advantages:

  • Privacy: Your data never leaves your computer
  • Customization: Full control to modify the AI for your specific needs
  • No Subscription Fees: Once set up, it's yours to use without ongoing costs
  • Offline Access: Use your AI assistant even without internet access

Prerequisites

Before we begin, make sure you have:

  • A computer with at least 8GB RAM (16GB recommended)
  • At least 10GB of free disk space
  • Basic familiarity with command line interfaces
  • Python 3.10 or later installed
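If you'd like to confirm the disk-space prerequisite from Python, the standard library can report it. This is just a convenience check, not a required step:

```python
import shutil

# Check free space on the drive containing the current directory.
total, used, free = shutil.disk_usage(".")
free_gb = free / 1e9
print(f"Free disk space: {free_gb:.1f} GB")
if free_gb < 10:
    print("Warning: this tutorial recommends at least 10 GB free.")
```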

Setting Up Your Environment

Installing Python

If you don't already have Python installed:

  1. Visit python.org and download Python 3.10 or later
  2. During installation, make sure to check "Add Python to PATH"
  3. Verify installation by opening a terminal/command prompt and typing:
python --version

or on some systems:

python3 --version

Creating a Project Directory

Let's create a dedicated directory for our AI assistant project:

mkdir my-ai-assistant
cd my-ai-assistant

Setting Up a Virtual Environment

A virtual environment keeps your project dependencies isolated from other Python projects:

On Windows:

python -m venv venv
venv\Scripts\activate

On macOS/Linux:

python3 -m venv venv
source venv/bin/activate

You should now see (venv) at the beginning of your command prompt, indicating the virtual environment is active.

Installing Required Packages

With your virtual environment activated, install the necessary packages:

pip install llama-cpp-python gradio requests tqdm

Package Information

Here's what each package does:

  • llama-cpp-python: Python bindings for the llama.cpp library to run LLM models
  • gradio: Library for creating web interfaces for ML models
  • requests: For downloading models
  • tqdm: For progress bars during downloads
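After installing, you can sanity-check your environment with a short script. One detail worth knowing: the pip package llama-cpp-python is imported under the name llama_cpp. The helper below is just a convenience, not part of the assistant itself:

```python
# Report which of the tutorial's dependencies are importable
# in the current environment, without raising on missing ones.
import importlib.util

def check_packages(names):
    """Return a dict mapping module name -> True/False (importable or not)."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Note: the pip package "llama-cpp-python" is imported as "llama_cpp".
status = check_packages(["llama_cpp", "gradio", "requests", "tqdm"])
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'MISSING'}")
```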

Building Your AI Assistant

Now that we have our environment set up, let's create the core functionality of our AI assistant.

Creating the Assistant Script

Create a new file called ai_assistant.py in your project directory. This will be the main script for our AI assistant. Note that it expects the model file to be in a models subdirectory; the launcher script later in this tutorial downloads it automatically if it's missing.

import os
from llama_cpp import Llama
import gradio as gr

# Initialize the language model
model_path = os.path.join("models", "llama-2-7b-chat.ggmlv3.q4_0.bin")
llm = Llama(model_path=model_path, n_ctx=2048)

# Define a function to generate responses
def generate_response(prompt, system_prompt="You are a helpful AI assistant.", max_tokens=512):
    # Format the prompt with system instructions
    full_prompt = f"{system_prompt}\n\nUser: {prompt}\n\nAssistant:"
    
    # Generate a response
    output = llm(
        full_prompt,
        max_tokens=max_tokens,
        stop=["User:", "\n\nUser:"],
        echo=False
    )
    
    # Extract and return the generated text
    return output['choices'][0]['text'].strip()

# Create a simple web interface
def create_interface():
    with gr.Blocks(css="footer {visibility: hidden}") as demo:
        gr.Markdown("# Your Personal AI Assistant")
        gr.Markdown("Ask me anything or give me a task to help you with!")
        
        with gr.Row():
            with gr.Column():
                system_prompt = gr.Textbox(
                    label="System Prompt (Instructions for the AI)",
                    value="You are a helpful AI assistant that provides accurate, informative responses.",
                    lines=2
                )
                
                user_input = gr.Textbox(
                    label="Your Question or Request",
                    placeholder="Type your question here...",
                    lines=3
                )
                
                submit_btn = gr.Button("Get Response")
                
            with gr.Column():
                output = gr.Textbox(label="AI Response", lines=12)
                
        submit_btn.click(
            fn=generate_response,
            inputs=[user_input, system_prompt],
            outputs=output
        )
        
    return demo

# Run the interface
if __name__ == "__main__":
    interface = create_interface()
    interface.launch(share=False)

Response Generation

Let's break down the key parts of our code:

Code Explanation

  • Model Initialization: We load the language model from the file path
  • Response Generation: The generate_response function takes a user prompt and system instructions, formats them, and sends them to the model
  • System Prompt: This defines the AI's personality and behavior
  • Stop Sequences: These tell the model when to stop generating text
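To make the prompt format concrete, here is a small standalone sketch of how the pieces are assembled before they reach the model. No model is loaded; we only build the string:

```python
# Standalone illustration of the prompt format used by generate_response.
def build_prompt(user_prompt, system_prompt="You are a helpful AI assistant."):
    return f"{system_prompt}\n\nUser: {user_prompt}\n\nAssistant:"

prompt = build_prompt("What is Python?")
print(prompt)

# The stop sequences ["User:", "\n\nUser:"] cut generation off if the model
# starts inventing the next user turn on its own.
```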

Web Interface

We're using Gradio to create a simple web interface for our AI assistant:

Interface Features

  • System Prompt Field: Allows customizing the AI's behavior
  • User Input Field: Where you type your questions or requests
  • Response Output: Displays the AI's responses
  • Local Web Server: Runs on your computer, accessible via web browser

Running Your AI Assistant

Now that we've built our AI assistant, let's run it and start interacting with it.

Running the Script

Make sure your virtual environment is activated, then run the assistant script:

python ai_assistant.py

You should see output similar to:

Running on local URL:  http://127.0.0.1:7860

Open this URL in your web browser to access your AI assistant's interface.

First Run Notice

The first time you run the assistant, it might take a minute or two to load the model into memory. Subsequent runs will be faster as the model is already loaded.

Interface Screenshot

[Image: AI assistant interface]

Your AI assistant interface should look similar to this.

Basic Interactions

Now that your assistant is running, try asking it various questions:

User: Explain quantum computing in simple terms
AI: Quantum computing is like having a super-powered calculator that can consider multiple possibilities at once, instead of one at a time like regular computers. In traditional computing, we use bits that are either 0 or 1. But quantum computers use "qubits" that can exist in multiple states simultaneously thanks to a phenomenon called superposition. It's a bit like being able to flip many coins at once and having them all be both heads and tails until you look at them. This allows quantum computers to solve certain complex problems much faster than regular computers, especially in fields like cryptography, drug discovery, and optimization problems. Think of it this way: if you needed to find a specific name in a phone book, a regular computer would have to check each name one by one. A quantum computer could check many names simultaneously, potentially finding the answer much faster.
User: Write a short poem about technology
AI: Digital Dawn

Silicon dreams in metal frames,
Electrons dance, never the same.
Invisible currents, powerful and swift,
Giving human potential a mighty lift.
Screens aglow with knowledge vast,
Present connected to future and past.
In our hands, worlds unfold,
Stories written, yet to be told.
Technology's march, relentless, bright—
Guiding us forward with its light.

Customizing the System Prompt

The system prompt defines your assistant's personality and capabilities. Try these different system prompts:

For a creative writing assistant:

You are a creative writing assistant that specializes in storytelling, poetry, and creative content. You provide imaginative and engaging responses.

For a programming assistant:

You are a programming assistant with expertise in multiple programming languages. You provide clear, concise code examples with explanations.

For a learning assistant:

You are a patient and educational assistant that explains complex topics in simple terms. You break down difficult concepts into easy-to-understand explanations.
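One convenient way to manage these presets is a small dictionary keyed by role. This is a pattern you might adopt, not part of the original script:

```python
# Hypothetical helper: keep the system prompts from this section in one place.
SYSTEM_PROMPTS = {
    "default": "You are a helpful AI assistant that provides accurate, "
               "informative responses.",
    "creative": "You are a creative writing assistant that specializes in "
                "storytelling, poetry, and creative content. You provide "
                "imaginative and engaging responses.",
    "programming": "You are a programming assistant with expertise in multiple "
                   "programming languages. You provide clear, concise code "
                   "examples with explanations.",
    "learning": "You are a patient and educational assistant that explains "
                "complex topics in simple terms. You break down difficult "
                "concepts into easy-to-understand explanations.",
}

def get_system_prompt(role):
    """Fall back to the default prompt if the role is unknown."""
    return SYSTEM_PROMPTS.get(role, SYSTEM_PROMPTS["default"])

print(get_system_prompt("creative"))
```

You could then pass get_system_prompt("programming") as the system_prompt argument instead of typing the text into the interface each time.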

Enhancing Your AI Assistant

Now that you have a basic AI assistant working, let's explore ways to enhance it with additional features.

Adding Memory to Your Assistant

One limitation of our current assistant is that it doesn't remember previous exchanges in a conversation. Let's modify our code to add memory:

# Add at the top of your script
conversation_history = []

# Replace the generate_response function with this version (and remember to
# update the submit_btn.click call to use generate_response_with_memory)
def generate_response_with_memory(prompt, system_prompt="You are a helpful AI assistant.", max_tokens=512):
    global conversation_history
    
    # Add user message to history
    conversation_history.append(f"User: {prompt}")
    
    # Format the prompt with system instructions and conversation history
    full_prompt = f"{system_prompt}\n\n"
    
    # Add conversation history (last 10 messages, i.e. 5 exchanges, to avoid context length issues)
    for message in conversation_history[-10:]:
        full_prompt += f"{message}\n\n"
    
    full_prompt += "Assistant:"
    
    # Generate a response
    output = llm(
        full_prompt,
        max_tokens=max_tokens,
        stop=["User:", "\n\nUser:"],
        echo=False
    )
    
    # Extract the generated text
    response = output['choices'][0]['text'].strip()
    
    # Add assistant response to history
    conversation_history.append(f"Assistant: {response}")
    
    return response
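The [-10:] slice keeps the last ten messages, which is five user/assistant exchanges. A quick standalone check of that windowing logic (no model needed):

```python
# Simulate the history window used in generate_response_with_memory.
history = []
for i in range(8):  # 8 exchanges = 16 messages
    history.append(f"User: question {i}")
    history.append(f"Assistant: answer {i}")

window = history[-10:]  # the slice used in the tutorial code
print(len(window))   # -> 10 messages
print(window[0])     # -> "User: question 3" (oldest message kept)
```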

Packaging Your AI Assistant

Let's create a simple script to package your AI assistant as a standalone application.

Creating a Launcher Script

Create a file called launch_assistant.py:

import os
import sys
import subprocess
import webbrowser
import time

def check_environment():
    """Check if the virtual environment exists and create it if not."""
    if not os.path.exists("venv"):
        print("Virtual environment not found. Creating one...")
        subprocess.run([sys.executable, "-m", "venv", "venv"])
    
    # Activate virtual environment and install dependencies
    if sys.platform == "win32":
        python = os.path.join("venv", "Scripts", "python.exe")
        pip = os.path.join("venv", "Scripts", "pip.exe")
    else:
        python = os.path.join("venv", "bin", "python")
        pip = os.path.join("venv", "bin", "pip")
    
    # Check if dependencies are installed (an import test works on both
    # Windows and Unix, unlike checking a hard-coded site-packages path)
    result = subprocess.run([python, "-c", "import llama_cpp"], capture_output=True)
    if result.returncode != 0:
        print("Installing dependencies...")
        subprocess.run([pip, "install", "llama-cpp-python", "gradio", "requests", "tqdm"])
    
    return python

def check_model():
    """Check if the model exists and download it if not."""
    model_path = os.path.join("models", "llama-2-7b-chat.ggmlv3.q4_0.bin")
    if not os.path.exists(model_path):
        print("Model not found. Downloading...")
        if not os.path.exists("models"):
            os.makedirs("models")
        
        # Create and run the download script
        with open("download_model.py", "w") as f:
            f.write('''
import requests
import os
from tqdm import tqdm

def download_model(url, save_path):
    response = requests.get(url, stream=True)
    response.raise_for_status()
    total_size = int(response.headers.get('content-length', 0))
    block_size = 1024 * 1024  # 1 MiB

    # Track progress in bytes so the bar reflects actual download size
    with open(save_path, 'wb') as f, tqdm(total=total_size, unit='B', unit_scale=True) as bar:
        for data in response.iter_content(block_size):
            f.write(data)
            bar.update(len(data))

model_url = "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/resolve/main/llama-2-7b-chat.ggmlv3.q4_0.bin"
save_path = "models/llama-2-7b-chat.ggmlv3.q4_0.bin"

print(f"Downloading model to {save_path}...")
download_model(model_url, save_path)
print("Download complete!")
''')
        
        python = check_environment()
        subprocess.run([python, "download_model.py"])

def launch_assistant():
    """Launch the AI assistant."""
    python = check_environment()
    check_model()
    
    # Check which assistant version to run
    assistant_file = "ai_assistant.py"
    if os.path.exists("ai_assistant_with_memory.py"):
        assistant_file = "ai_assistant_with_memory.py"
    if os.path.exists("ai_assistant_with_file_analysis.py"):
        assistant_file = "ai_assistant_with_file_analysis.py"
    
    print(f"Launching AI assistant ({assistant_file})...")
    
    # Start the assistant
    process = subprocess.Popen([python, assistant_file])
    
    # Wait a moment for the server to start
    time.sleep(3)
    
    # Open the web interface in the default browser
    webbrowser.open("http://127.0.0.1:7860")
    
    return process

if __name__ == "__main__":
    process = launch_assistant()
    
    print("AI assistant is running. Press Ctrl+C to stop.")
    try:
        process.wait()
    except KeyboardInterrupt:
        process.terminate()
        print("\nAI assistant stopped.")

Now you can simply run:

python launch_assistant.py

This script will:

  1. Check if the virtual environment exists and create it if needed
  2. Install required dependencies if they're missing
  3. Check if the model exists and download it if needed
  4. Launch the most advanced version of your AI assistant
  5. Open your web browser to the assistant's interface

Next Steps and Resources

Congratulations! You've successfully built your own local AI assistant. Here are some ways to further enhance your assistant:

Try Different Models

You can experiment with different models to find the balance between performance and resource usage:

  • Smaller models (3B-7B parameters) run faster but may have limited capabilities
  • Larger models (13B-70B parameters) offer better responses but require more RAM
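As a rough rule of thumb, a 4-bit quantized model takes about half a byte per parameter on disk and in memory, plus some overhead for the context window. This back-of-the-envelope estimate is an approximation, not an exact figure:

```python
def approx_model_gb(billions_of_params, bits_per_weight=4):
    """Very rough size estimate for a quantized model: params * bits / 8."""
    return billions_of_params * 1e9 * bits_per_weight / 8 / 1e9

# Ballpark sizes for the model families mentioned below.
for size in (7, 13, 70):
    print(f"{size}B model at 4-bit: ~{approx_model_gb(size):.1f} GB")
```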

Some models to try:

  • Llama 2 (7B, 13B, 70B)
  • Mistral (7B)
  • Vicuna
  • Orca

Add More Features

Consider adding these features to your assistant:

  • Voice input and output using speech recognition and text-to-speech
  • Integration with local documents and knowledge bases
  • Custom tools and plugins for specific tasks
  • A desktop application wrapper using Electron or PyQt

Learn More

To deepen your understanding of AI assistants, explore the documentation for the tools used in this tutorial, such as llama.cpp, llama-cpp-python, and Gradio.

Congratulations!

You've successfully completed the "Build Your First Local AI Agent" module. You now have the knowledge to create, customize, and deploy your own AI assistant locally.