Stop Paying for Idle EC2: Automate Underutilization Reports with Bash & CloudWatch
A Bash script that queries CloudWatch for 7-day CPU and memory averages across every running EC2 instance, flags the underutilized ones, emails an HTML report with a CSV attachment, and runs itself every morning via cron — so you never miss a wasteful instance again.
The Idle Instance Tax
Every AWS account has them. EC2 instances that were spun up for a project, a test, a demo — and never sized down or terminated when the load disappeared. They sit at 3–5% CPU, consuming a Reserved Instance or racking up On-Demand charges every hour, every day, every month.
AWS Cost Explorer will tell you your bill is high. It won't tell you which specific instances are wasting money or what you should do about them. That gap is exactly what this script fills: a daily automated report that names names, shows the data, and recommends an action.
How the Script Works
The pipeline is simple: query every running EC2 instance → pull 7-day average CloudWatch metrics for CPU and memory → compare against configurable thresholds → generate an HTML email with a colour-coded table and a CSV attachment for tracking in spreadsheets.
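Before wiring anything up, you can sanity-check the query window the script will compute. This standalone sketch assumes GNU date (the same tool the script uses) and prints the 7-day ISO-8601 window that gets passed to CloudWatch:

```shell
#!/bin/bash
# Compute the same 7-day UTC window the report script passes to CloudWatch
LOOKBACK_DAYS=7
END_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
START_TIME=$(date -u -d "${LOOKBACK_DAYS} days ago" +"%Y-%m-%dT%H:%M:%SZ")
echo "Window: ${START_TIME} -> ${END_TIME}"
```

On BSD/macOS date the `-d "... days ago"` flag doesn't exist (`-v-7d` is the equivalent), which is worth knowing if you test the script locally before deploying to Linux.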
Prerequisites
Three things need to be in place before the script runs correctly:
1. The AWS CLI, jq, bc, and a working sendmail on the host that runs the report.
2. IAM permissions for ec2:DescribeInstances and cloudwatch:GetMetricStatistics. Read-only, minimal blast radius.
3. The CloudWatch Agent on any instance you want memory data for. This one is optional — the script gracefully shows "N/A" for instances without the CloudWatch Agent installed.
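A quick preflight loop catches a missing dependency before the first scheduled run fails silently at 8am. A minimal sketch; the command list mirrors what the script shells out to:

```shell
#!/bin/bash
# Preflight: verify every external command the report script depends on
MISSING=0
for cmd in aws jq bc sendmail; do
  if ! command -v "${cmd}" >/dev/null 2>&1; then
    echo "MISSING: ${cmd}"
    MISSING=1
  fi
done
if [ "${MISSING}" -eq 0 ]; then
  echo "All dependencies present"
fi
```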
Underutilization Thresholds
The script uses three severity tiers. Tune these constants at the top of the script to match your organisation's cost policies:
| Tier | CPU (7d avg) | Memory (7d avg) | Recommendation |
|---|---|---|---|
| Critical | < 5% | < 10% | Terminate or stop immediately |
| Warning | < 10% | < 20% | Downsize to next smaller type |
| Healthy | ≥ 10% | ≥ 20% | No action needed |
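Because Bash can't compare floats natively, the tier logic reduces to a pair of bc comparisons. A minimal standalone sketch of the CPU side; classify_cpu is a hypothetical helper for illustration, not part of the script itself:

```shell
#!/bin/bash
# Standalone sketch of the CPU tier logic; thresholds mirror the table above
CPU_WARN_THRESHOLD=10
CPU_CRIT_THRESHOLD=5

classify_cpu() {
  local cpu="$1"
  # bc -l prints 1 when the comparison is true, 0 otherwise
  if [ "$(echo "${cpu} < ${CPU_CRIT_THRESHOLD}" | bc -l)" -eq 1 ]; then
    echo "critical"
  elif [ "$(echo "${cpu} < ${CPU_WARN_THRESHOLD}" | bc -l)" -eq 1 ]; then
    echo "warning"
  else
    echo "healthy"
  fi
}

classify_cpu 3.20   # critical
classify_cpu 7.85   # warning
classify_cpu 42.00  # healthy
```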
The Full Script
Drop this in /opt/ec2-report/ec2_underutilized_report.sh, set your email address and thresholds at the top, then run chmod +x on it. The only external dependencies are the AWS CLI, jq, bc, and sendmail; everything else is plain Bash.
The script works in four stages:
- Discovery: lists every running instance with a single describe-instances call.
- Metrics: pulls CPU from the AWS/EC2 namespace and memory from the CWAgent namespace. Handles missing CWAgent gracefully.
- Classification: uses bc for float arithmetic, assigns a severity level and a human-readable recommendation string.
- Delivery: pipes the report to sendmail as a MIME multipart email with the CSV attached.
#!/bin/bash
# ─────────────────────────────────────────────────────────────────
# ec2_underutilized_report.sh
# Identifies underutilized EC2 instances via CloudWatch,
# generates an HTML email report + CSV attachment, logs everything.
# ─────────────────────────────────────────────────────────────────
set -euo pipefail
# ── Configuration ─────────────────────────────────────────────────
REPORT_EMAIL="devops-team@yourcompany.com"
FROM_EMAIL="aws-reports@yourcompany.com"
REGION="us-east-1"
LOOKBACK_DAYS=7
PERIOD_SECONDS=$(( LOOKBACK_DAYS * 86400 ))
# Underutilization thresholds
CPU_WARN_THRESHOLD=10 # % — downsize recommendation
CPU_CRIT_THRESHOLD=5 # % — terminate/stop recommendation
MEM_WARN_THRESHOLD=20 # % — downsize recommendation
MEM_CRIT_THRESHOLD=10 # % — terminate/stop recommendation
# Paths
REPORT_DIR="/opt/ec2-report"
LOG_FILE="${REPORT_DIR}/ec2_report.log"
CSV_FILE="${REPORT_DIR}/ec2_underutilized_$(date +%Y%m%d).csv"
HTML_BODY="${REPORT_DIR}/report_body.html"
mkdir -p "${REPORT_DIR}"
# ── Logging ────────────────────────────────────────────────────────
log() {
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "${LOG_FILE}"
}
log "=== EC2 Underutilization Report started ==="
# ── Time window for CloudWatch queries ────────────────────────────
END_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
START_TIME=$(date -u -d "${LOOKBACK_DAYS} days ago" +"%Y-%m-%dT%H:%M:%SZ")
# ── Fetch all running EC2 instances ───────────────────────────────
log "Discovering running EC2 instances in ${REGION}..."
INSTANCES=$(aws ec2 describe-instances \
  --region "${REGION}" \
  --filters "Name=instance-state-name,Values=running" \
  --query "Reservations[].Instances[].[InstanceId,InstanceType,Tags[?Key=='Name'].Value|[0]]" \
  --output json)
INSTANCE_COUNT=$(echo "${INSTANCES}" | jq 'length')
log "Found ${INSTANCE_COUNT} running instances"
if [[ "${INSTANCE_COUNT}" -eq 0 ]]; then
  log "No running instances found — exiting."
  exit 0
fi
# ── CloudWatch metric helper ───────────────────────────────────────
get_metric() {
  local instance_id="$1"
  local namespace="$2"
  local metric_name="$3"
  local dim_name="$4"
  local result

  # "|| true" stops set -e from killing the whole run when a single
  # metric query fails (throttling, missing metric, etc.)
  result=$(aws cloudwatch get-metric-statistics \
    --region "${REGION}" \
    --namespace "${namespace}" \
    --metric-name "${metric_name}" \
    --dimensions "Name=${dim_name},Value=${instance_id}" \
    --start-time "${START_TIME}" \
    --end-time "${END_TIME}" \
    --period "${PERIOD_SECONDS}" \
    --statistics Average \
    --query "Datapoints[0].Average" \
    --output text 2>/dev/null) || true

  # Return "N/A" if CloudWatch has no data (CWAgent not installed etc.)
  if [[ "${result}" == "None" ]] || [[ -z "${result}" ]]; then
    echo "N/A"
  else
    # Round to 2 decimal places
    printf "%.2f" "${result}"
  fi
}
# ── Severity classification ────────────────────────────────────────
get_severity() {
  local cpu="$1" mem="$2"
  if [[ "${cpu}" == "N/A" ]]; then
    echo "unknown"
    return
  fi
  local cpu_crit cpu_warn
  cpu_crit=$(echo "${cpu} < ${CPU_CRIT_THRESHOLD}" | bc -l)
  cpu_warn=$(echo "${cpu} < ${CPU_WARN_THRESHOLD}" | bc -l)
  # Memory only tightens the verdict when the CWAgent metric exists;
  # without it, CPU alone decides (defaults of 1 mean "no veto")
  local mem_crit=1 mem_warn=1
  if [[ "${mem}" != "N/A" ]]; then
    mem_crit=$(echo "${mem} < ${MEM_CRIT_THRESHOLD}" | bc -l)
    mem_warn=$(echo "${mem} < ${MEM_WARN_THRESHOLD}" | bc -l)
  fi
  if [[ "${cpu_crit}" -eq 1 && "${mem_crit}" -eq 1 ]]; then
    echo "critical"
  elif [[ "${cpu_warn}" -eq 1 && "${mem_warn}" -eq 1 ]]; then
    echo "warning"
  else
    echo "healthy"
  fi
}
get_recommendation() {
  case "$1" in
    critical) echo "Consider stopping or terminating" ;;
    warning)  echo "Downsize to smaller instance type" ;;
    healthy)  echo "No action needed" ;;
    *)        echo "Install CloudWatch Agent for memory metrics" ;;
  esac
}
# ── Build HTML report ──────────────────────────────────────────────
log "Building HTML report..."
cat > "${HTML_BODY}" <<'HTML'
<!DOCTYPE html>
<html><head><meta charset="UTF-8"/>
<style>
body{font-family:Arial,sans-serif;background:#f4f4f4;padding:20px}
h2{color:#232f3e}
table{border-collapse:collapse;width:100%;background:#fff;box-shadow:0 1px 3px rgba(0,0,0,.1)}
th{background:#232f3e;color:#fff;padding:10px 14px;text-align:left;font-size:13px}
td{padding:9px 14px;font-size:13px;border-bottom:1px solid #eee}
.critical{background:#fff0f0;color:#c0392b;font-weight:bold}
.warning{background:#fffbe6;color:#d68910}
.healthy{background:#f0fff4;color:#1e8449}
.unknown{background:#f8f8f8;color:#888}
.badge{padding:2px 8px;border-radius:10px;font-size:11px}
.badge-crit{background:#fadbd8;color:#c0392b}
.badge-warn{background:#fef9e7;color:#d68910}
.badge-ok{background:#d5f5e3;color:#1e8449}
</style></head><body>
HTML
echo "<h2>EC2 Underutilization Report — $(date '+%B %d, %Y')</h2>" >> "${HTML_BODY}"
echo "<p>Region: <strong>${REGION}</strong> | Lookback: <strong>${LOOKBACK_DAYS} days</strong> | Instances scanned: <strong>${INSTANCE_COUNT}</strong></p>" >> "${HTML_BODY}"
echo "<table><tr><th>Instance ID</th><th>Name</th><th>Type</th><th>CPU Avg (7d)</th><th>Memory Avg (7d)</th><th>Status</th><th>Recommendation</th></tr>" >> "${HTML_BODY}"
# CSV header
echo "InstanceId,Name,InstanceType,CPU_7d_Avg,Memory_7d_Avg,Status,Recommendation" > "${CSV_FILE}"
# ── Per-instance loop ──────────────────────────────────────────────
while IFS= read -r instance; do
  INSTANCE_ID=$(echo "${instance}" | jq -r '.[0]')
  INSTANCE_TYPE=$(echo "${instance}" | jq -r '.[1]')
  INSTANCE_NAME=$(echo "${instance}" | jq -r '.[2] // "Unnamed"')
  log "Processing ${INSTANCE_ID} (${INSTANCE_NAME} / ${INSTANCE_TYPE})"

  # Pull metrics
  CPU_AVG=$(get_metric "${INSTANCE_ID}" "AWS/EC2" "CPUUtilization" "InstanceId")
  MEM_AVG=$(get_metric "${INSTANCE_ID}" "CWAgent" "mem_used_percent" "InstanceId")
  SEVERITY=$(get_severity "${CPU_AVG}" "${MEM_AVG}")
  RECOMMENDATION=$(get_recommendation "${SEVERITY}")

  # HTML badge styling
  case "${SEVERITY}" in
    critical) BADGE='<span class="badge badge-crit">Critical</span>' ;;
    warning)  BADGE='<span class="badge badge-warn">Warning</span>' ;;
    healthy)  BADGE='<span class="badge badge-ok">Healthy</span>' ;;
    *)        BADGE='<span class="badge">Unknown</span>' ;;
  esac

  # Append HTML table row
  cat >> "${HTML_BODY}" <<EOF
<tr class="${SEVERITY}">
  <td><code>${INSTANCE_ID}</code></td>
  <td>${INSTANCE_NAME}</td>
  <td><code>${INSTANCE_TYPE}</code></td>
  <td>${CPU_AVG}%</td>
  <td>${MEM_AVG}%</td>
  <td>${BADGE}</td>
  <td>${RECOMMENDATION}</td>
</tr>
EOF

  # Append CSV row
  echo "${INSTANCE_ID},${INSTANCE_NAME},${INSTANCE_TYPE},${CPU_AVG},${MEM_AVG},${SEVERITY},${RECOMMENDATION}" >> "${CSV_FILE}"
done < <(echo "${INSTANCES}" | jq -c '.[]')
# Close HTML
echo "</table><p style='color:#999;font-size:11px'>Generated by ec2_underutilized_report.sh at $(date)</p></body></html>" >> "${HTML_BODY}"
# ── Send email via sendmail (MIME multipart) ───────────────────────
log "Sending report to ${REPORT_EMAIL}..."
BOUNDARY="boundary_$(date +%s)"
CSV_BASENAME=$(basename "${CSV_FILE}")
# Wrap base64 at 76 chars — SMTP rejects over-long lines
CSV_B64=$(base64 -w 76 "${CSV_FILE}")

# Note the blank lines: MIME requires an empty line between each set of
# headers and the body that follows
sendmail -t <<MAIL
To: ${REPORT_EMAIL}
From: ${FROM_EMAIL}
Subject: [AWS] EC2 Underutilization Report — $(date '+%Y-%m-%d') (${REGION})
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="${BOUNDARY}"

--${BOUNDARY}
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 7bit

$(cat "${HTML_BODY}")

--${BOUNDARY}
Content-Type: text/csv; name="${CSV_BASENAME}"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="${CSV_BASENAME}"

${CSV_B64}
--${BOUNDARY}--
MAIL
log "Report sent successfully."
log "CSV saved to: ${CSV_FILE}"
log "=== Report complete ==="
IAM Policy — Least Privilege
Attach this policy to the EC2 instance profile running the script. It covers exactly what's needed — describe instances, read CloudWatch metrics, nothing else.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EC2Describe",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeTags"
      ],
      "Resource": "*"
    },
    {
      "Sid": "CloudWatchReadMetrics",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics"
      ],
      "Resource": "*"
    }
  ]
}
Enabling Memory Metrics with CloudWatch Agent
By default, CloudWatch has no memory data for EC2 — memory usage is not exposed by the hypervisor. You need the CloudWatch Agent running inside the instance to push mem_used_percent. Without it, the script shows "N/A" for memory — still useful, just without memory signal.
# Quick install + configure CloudWatch Agent for memory metrics
# 1. Install
sudo yum install -y amazon-cloudwatch-agent # Amazon Linux
# sudo apt install -y amazon-cloudwatch-agent # Ubuntu/Debian
# 2. Minimal config — writes mem_used_percent every 60s
sudo tee /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json > /dev/null <<'EOF'
{
  "metrics": {
    "namespace": "CWAgent",
    "metrics_collected": {
      "mem": {
        "measurement": ["mem_used_percent"],
        "metrics_collection_interval": 60
      }
    },
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    }
  }
}
EOF
# 3. Start and enable
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config \
  -m ec2 \
  -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json \
  -s
# Verify it's running
sudo systemctl status amazon-cloudwatch-agent
Deploy & Schedule
Get the script onto your reporting EC2 instance, make it executable, then wire it into cron for daily delivery:
# 1. Create directory and drop script in place
sudo mkdir -p /opt/ec2-report
sudo curl -o /opt/ec2-report/ec2_underutilized_report.sh [your-script-url]
sudo chmod +x /opt/ec2-report/ec2_underutilized_report.sh
# 2. Test run first — check output before scheduling
/opt/ec2-report/ec2_underutilized_report.sh
# Watch logs in real-time during test
tail -f /opt/ec2-report/ec2_report.log
# 3. Schedule via cron — every day at 8am
# (crontab -e, then add this line)
0 8 * * * /opt/ec2-report/ec2_underutilized_report.sh >> /opt/ec2-report/ec2_report.log 2>&1
# 4. Verify cron registered the job
crontab -l
What the Report Looks Like
Every morning, the team receives an email with a table like this. Rows are colour-coded by severity — red for critical, yellow for warnings, green for healthy. The CSV attachment lets you paste straight into a spreadsheet to track trends over weeks.
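If you'd rather filter on the command line than in a spreadsheet, the fixed column order makes that a one-liner. A sketch with hypothetical sample rows matching the script's CSV header (the instance IDs and values here are made up for illustration):

```shell
#!/bin/bash
# Build a sample CSV matching the script's header, then pull the critical rows
cat > /tmp/ec2_sample.csv <<'EOF'
InstanceId,Name,InstanceType,CPU_7d_Avg,Memory_7d_Avg,Status,Recommendation
i-0abc,web-1,t3.large,3.10,8.50,critical,Consider stopping or terminating
i-0def,api-1,m5.xlarge,8.90,18.20,warning,Downsize to smaller instance type
i-0ghi,db-1,r5.large,45.00,62.00,healthy,No action needed
EOF

# Column 6 is Status; print the instance, type, and CPU for critical rows
awk -F',' '$6 == "critical" {print $1, $3, $4"% CPU"}' /tmp/ec2_sample.csv
# prints: i-0abc t3.large 3.10% CPU
```

One caveat: the script doesn't quote CSV fields, so a Name tag containing a comma would shift columns. If your tags allow commas, sanitize the name before writing the row.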
What You Get
- An audit trail: every run is logged to ec2_report.log. When your finance team asks why a specific instance was terminated last Tuesday, the log has the metric values that triggered the recommendation.
- An extensible pipeline: swap the sendmail step for a Slack webhook posting to your #finops channel. Or feed it into a DynamoDB table and build a week-over-week savings tracker. The script is a data source — the HTML email is just the simplest output format to start with.