Self-Hosted AI Agent on AWS EC2

A comprehensive setup guide for running AI models on your own infrastructure

Setup Guide for Self-Hosted AI Agent

This guide walks you through setting up your own AI agent on AWS EC2, from launching an instance to deploying and using your own AI model.

Prerequisites

  • An AWS account
  • Basic familiarity with command line interfaces
  • A domain name (optional, but recommended for SSL)

Step 1: Launching an EC2 Instance

First, you'll need to launch an EC2 instance with sufficient resources to run your AI model. We recommend:

  • Instance type: g4dn.xlarge (one NVIDIA T4 GPU with 16 GB of GPU memory, for GPU acceleration)
  • Storage: At least 100GB
  • Operating System: Ubuntu 22.04
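
If you prefer the AWS CLI to the console, the recommendations above might translate into a launch command along the lines of this sketch. It assumes the AWS CLI is installed and configured. The AMI ID and key pair name are hypothetical placeholders, and /dev/sda1 is assumed to be the root device name of the Ubuntu AMI you choose. The helper only prints the command so you can review it before running anything.

```shell
# Sketch: print (not run) an AWS CLI launch command for the recommended setup.
# AMI_ID and KEY_NAME are hypothetical placeholders; look up the current
# Ubuntu 22.04 AMI ID for your region and use your own key pair name.
AMI_ID="${AMI_ID:-ami-REPLACE_ME}"
KEY_NAME="${KEY_NAME:-your-key}"

build_launch_command() {
  # $1 = instance type, $2 = root volume size in GB
  # The block device mapping is single-quoted in the output so that a shell
  # does not brace-expand it when the command is pasted.
  printf '%s\n' "aws ec2 run-instances --image-id $AMI_ID --instance-type $1 --key-name $KEY_NAME --count 1 --block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=$2,VolumeType=gp3}'"
}

# Review the printed command, then paste it into your shell to launch.
build_launch_command g4dn.xlarge 100
```

Printing first is a deliberate choice: a GPU instance starts billing as soon as it launches, so it is worth checking the instance type and volume size before running the command.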

Connection Commands

# SSH into your instance
ssh -i your-key.pem ubuntu@your-instance-ip

Step 2: Installing Dependencies

Once connected to your instance, you'll need to install several dependencies:

# Update package lists
sudo apt update
sudo apt upgrade -y

# Install required packages (reboot afterwards so the NVIDIA driver loads)
sudo apt install -y build-essential python3-pip nvidia-driver-535
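
Before moving on, it may help to confirm that the packages above actually put their tools on your PATH. The sketch below uses a hypothetical helper, check_commands, which is not part of any package. The command names gcc, pip3, and nvidia-smi are the ones these packages normally provide.

```shell
# Sketch: confirm that the tools installed above are available.
# Returns 0 if every named command is found on PATH, 1 otherwise.
check_commands() {
  missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "found:   $cmd"
    else
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# Example (after rebooting so the NVIDIA kernel module is loaded):
#   check_commands gcc pip3 nvidia-smi && nvidia-smi
```

If nvidia-smi is found but reports that it cannot communicate with the driver, a reboot (sudo reboot) is usually the first thing to try.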

Step 3: Installing Ollama

Ollama is an easy-to-use framework for running LLMs locally:

curl -fsSL https://ollama.com/install.sh | sh

Step 4: Running Your First Model

Now you can pull and run a model:

# Pull a model (e.g., Llama 2)
ollama pull llama2

# Run the model
ollama run llama2
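
Besides the interactive prompt, Ollama also serves an HTTP API, which is what an agent or script would normally talk to. The sketch below assumes the server is listening on its default address, localhost:11434. The helper names build_generate_payload and ask_model are hypothetical, and the payload builder does no JSON escaping, so it assumes the prompt contains no double quotes or backslashes.

```shell
# Sketch: query a local Ollama server over HTTP instead of the interactive CLI.
# Assumes Ollama is listening on its default address, localhost:11434.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Build the JSON body for the /api/generate endpoint.
# Assumption: model and prompt contain no double quotes or backslashes,
# because this helper performs no JSON escaping.
build_generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Hypothetical helper: send one prompt and print the raw JSON response.
ask_model() {
  curl -s "$OLLAMA_URL/api/generate" -d "$(build_generate_payload "$1" "$2")"
}

# Example:
#   ask_model llama2 "Why is the sky blue?"
```

Setting "stream" to false asks for a single JSON object rather than a stream of partial responses, which is easier to handle in a shell script.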

Step 5: Setting Up a Web Interface

For easier interaction, you can set up a web UI:

# Clone the web UI repository
git clone https://github.com/ollama-webui/ollama-webui.git
cd ollama-webui

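# Install Docker and Docker Compose (not covered in Step 2), then start the UI
sudo apt install -y docker.io docker-compose
sudo docker-compose up -d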

Additional Resources: For more detailed instructions, see the Setup Scripts section below.

Cost Estimate for Self-Hosted AI Agent

Running your own AI infrastructure on AWS involves various costs. Here's a breakdown to help you budget for your deployment.

EC2 Instance Costs

Instance Type    Hourly Cost    Monthly Cost (24/7)
g4dn.xlarge      $0.526         ~$384
g4dn.2xlarge     $0.752         ~$550
g5.xlarge        $1.006         ~$734
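
The monthly figures in the table can be reproduced, to within rounding, from the hourly rates. This sketch assumes an average month of 730 hours (8,760 hours per year divided by 12). The function name monthly_cost is hypothetical.

```shell
# Sketch: derive a 24/7 monthly cost from an hourly rate.
# Assumption: an average month has 730 hours (8,760 hours per year / 12).
monthly_cost() {
  # $1 = hourly rate in USD; prints the monthly cost rounded to whole dollars
  awk -v rate="$1" 'BEGIN { printf "%.0f\n", rate * 730 }'
}

monthly_cost 0.526   # g4dn.xlarge
monthly_cost 1.006   # g5.xlarge
```

You can substitute the current hourly rate for your region to get an updated estimate.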

Storage Costs

EBS (Elastic Block Storage) costs are additional:

  • gp3 Storage: $0.08/GB per month
  • 100GB: ~$8/month
  • 200GB: ~$16/month

Cost Optimization Strategies

To reduce costs:

  • Use Auto-Shutdown: Have the instance shut itself down when idle, using the setup_auto_shutdown.sh script in the Setup Scripts section
  • Reserved Instances: For long-term projects, consider purchasing reserved instances for savings of up to 72% compared with on-demand pricing
  • Spot Instances: For non-critical workloads that can tolerate interruption, spot instances can save up to 90%

Total Estimated Costs

A typical setup with cost optimization might cost:

  • On-demand use (8 hours/day): ~$130/month
  • Full-time use with reserved instance: ~$200/month
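
To see roughly where a part-time estimate comes from, you can combine the hourly instance rate with the storage rate. This sketch assumes a 30-day month and the gp3 rate of $0.08 per GB per month quoted under Storage Costs. The function name estimate_monthly_total is hypothetical.

```shell
# Sketch: estimate a monthly bill from daily usage hours plus EBS storage.
# Assumptions: a 30-day month, and gp3 storage at $0.08 per GB per month.
estimate_monthly_total() {
  # $1 = hourly rate (USD), $2 = hours of use per day, $3 = storage in GB
  awk -v rate="$1" -v hours="$2" -v gb="$3" \
    'BEGIN { printf "%.2f\n", rate * hours * 30 + gb * 0.08 }'
}

# g4dn.xlarge for 8 hours/day with a 100 GB volume
estimate_monthly_total 0.526 8 100
```

Under these assumptions the result lands close to the on-demand figure above. Note that the storage term applies to the whole month, because EBS volumes are billed even while the instance is stopped.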

Cost Warning: Remember that costs can vary based on region and additional services. Always monitor your AWS billing dashboard and set up budget alerts to avoid unexpected charges.

Setup Scripts

Below are helpful scripts for setting up your self-hosted AI environment. You can copy and use these scripts to automate your setup process.

install_ollama.sh

#!/bin/bash
# Script to install Ollama on Ubuntu

echo "Installing Ollama on Ubuntu..."

# Update package lists
sudo apt update
sudo apt upgrade -y

# Install prerequisites
sudo apt install -y curl build-essential

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Verify installation
echo "Verifying Ollama installation..."
ollama --version

# Pull a model for testing
echo "Pulling Llama2 model for testing..."
ollama pull llama2

echo "Ollama installation complete!"
echo "You can now run a model with: ollama run llama2"
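
One thing the script above does not do is wait for the Ollama server to be ready before pulling a model. If the pull fails right after installation, a small retry loop may help. This is a sketch: wait_for is a hypothetical helper, and the probe in the example assumes Ollama's default address, localhost:11434.

```shell
# Sketch: retry a probe command until it succeeds or the attempts run out.
# wait_for is a hypothetical helper, not part of Ollama.
wait_for() {
  # $1 = max attempts, $2 = seconds between attempts, remaining args = probe
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: wait up to about 30 seconds for the local Ollama API, then pull.
#   wait_for 30 1 curl -sf http://localhost:11434/api/tags && ollama pull llama2
```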

install_code_server.sh

#!/bin/bash
# Script to install Code Server (VS Code in browser)

echo "Installing Code Server..."

# Update package lists
sudo apt update
sudo apt upgrade -y

# Install prerequisites
sudo apt install -y curl wget

# Install Code Server
curl -fsSL https://code-server.dev/install.sh | sh

# Configure Code Server
mkdir -p ~/.config/code-server
cat > ~/.config/code-server/config.yaml << EOF
bind-addr: 0.0.0.0:8080
auth: password
password: $(openssl rand -base64 12)
cert: false
EOF

# Set up systemd service
sudo systemctl enable --now code-server@$USER

echo "Code Server installation complete!"
echo "You can access Code Server at: http://your-server-ip:8080"
echo "Password is in ~/.config/code-server/config.yaml"
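
Since the password is generated at install time, you will need to read it back from the config file before logging in. A minimal sketch, assuming the config layout written by the script above (the function name code_server_password is hypothetical):

```shell
# Sketch: print the generated password from a code-server config file.
# Defaults to the path written by install_code_server.sh above.
code_server_password() {
  config="${1:-$HOME/.config/code-server/config.yaml}"
  # Print the value after "password:" on the first matching line.
  awk '$1 == "password:" { print $2; exit }' "$config"
}

# Example:
#   code_server_password
```

Because this configuration serves plain HTTP (cert: false), one option is to leave port 8080 closed in your security group and reach Code Server through an SSH tunnel instead: ssh -i your-key.pem -L 8080:localhost:8080 ubuntu@your-instance-ip, then open http://localhost:8080 locally.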

setup_auto_shutdown.sh

#!/bin/bash
# Script to set up automatic shutdown of EC2 instance when idle

echo "Setting up automatic shutdown for EC2 instance..."

# Install bc, which the shutdown script uses for decimal comparisons
sudo apt install -y bc

# Create the shutdown script
cat > ~/auto_shutdown.sh << 'EOF'
#!/bin/bash

# Configuration
IDLE_TIME_THRESHOLD=30  # Minutes
CPU_THRESHOLD=10        # Percent
MEMORY_THRESHOLD=20     # Percent
CHECK_INTERVAL=5        # Minutes

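# Get instance ID and region from the metadata service (IMDSv2 needs a session token)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
INSTANCE_ID=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id)
REGION=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/region)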

while true; do
  # Get current CPU utilization (100 minus the idle percentage reported by top)
  CPU_UTIL=$(top -bn1 | grep "Cpu(s)" | sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | awk '{print 100 - $1}')
  
  # Get memory utilization
  MEM_UTIL=$(free | grep Mem | awk '{print $3/$2 * 100.0}')
  
  # Check if system is idle
  if (( $(echo "$CPU_UTIL < $CPU_THRESHOLD" | bc -l) )) && (( $(echo "$MEM_UTIL < $MEMORY_THRESHOLD" | bc -l) )); then
    echo "System is idle. CPU: $CPU_UTIL%, Memory: $MEM_UTIL%"
    echo "Shutting down in $IDLE_TIME_THRESHOLD minutes unless activity resumes..."
    
    # Sleep and check again before shutdown
    sleep $(($IDLE_TIME_THRESHOLD * 60))
    
    # Get CPU and memory usage again
    CPU_UTIL=$(top -bn1 | grep "Cpu(s)" | sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | awk '{print 100 - $1}')
    MEM_UTIL=$(free | grep Mem | awk '{print $3/$2 * 100.0}')
    
    # Final check before shutdown
    if (( $(echo "$CPU_UTIL < $CPU_THRESHOLD" | bc -l) )) && (( $(echo "$MEM_UTIL < $MEMORY_THRESHOLD" | bc -l) )); then
      echo "System still idle. Shutting down now."
      sudo shutdown -h now
    else
      echo "Activity detected. Shutdown cancelled."
    fi
  else
    echo "System active. CPU: $CPU_UTIL%, Memory: $MEM_UTIL%"
  fi
  
  sleep $(($CHECK_INTERVAL * 60))
done
EOF

# Make script executable
chmod +x ~/auto_shutdown.sh

# Create systemd service
sudo bash -c 'cat > /etc/systemd/system/auto-shutdown.service << EOF
[Unit]
Description=Automatic EC2 Shutdown Service
After=network.target

[Service]
User=ubuntu
ExecStart=/bin/bash /home/ubuntu/auto_shutdown.sh
Restart=always

[Install]
WantedBy=multi-user.target
EOF'

# Enable and start service
sudo systemctl enable auto-shutdown.service
sudo systemctl start auto-shutdown.service

echo "Auto-shutdown service installed and started."
echo "The instance will shut down after 30 minutes of CPU < 10% and memory < 20% (thresholds are set in ~/auto_shutdown.sh)"
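
The idle check above looks only at CPU and memory, yet model inference on a GPU instance may keep the GPU busy while the CPU stays mostly idle. The sketch below shows one way the check could be extended. It assumes nvidia-smi is installed, and both helper names and the GPU_THRESHOLD setting are hypothetical additions, not part of the script above. The comparison helper uses awk, so it does not depend on bc.

```shell
# Sketch: idle-check helpers that also take GPU load into account.
# below_threshold compares two decimal numbers using awk.
below_threshold() {
  # $1 = measured value, $2 = threshold; succeeds when value < threshold
  awk -v v="$1" -v t="$2" 'BEGIN { exit !((v + 0) < (t + 0)) }'
}

# Highest utilization (percent) across all GPUs, as reported by nvidia-smi.
gpu_utilization() {
  nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits |
    sort -n | tail -n 1
}

# Example use inside the monitoring loop (GPU_THRESHOLD is hypothetical):
#   if below_threshold "$CPU_UTIL" "$CPU_THRESHOLD" &&
#      below_threshold "$(gpu_utilization)" "${GPU_THRESHOLD:-5}"; then
#     echo "CPU and GPU are both idle"
#   fi
```

Taking the highest value across GPUs is a conservative choice: the instance counts as idle only when every GPU is below the threshold.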

Usage Tip: You can download these scripts directly to your EC2 instance using wget or copy-paste them into files. Make sure to set proper permissions with chmod +x before running.
