Aurora PostgreSQL at Scale: Replication, Failover & Performance Tuning
Deep dive into Amazon Aurora PostgreSQL — cluster architecture, read replica scaling, automated failover under 30s, RDS Proxy for connection pooling, and parameter group tuning for high-throughput OLTP.
Why Aurora Over Standard RDS?
Aurora PostgreSQL isn't just managed Postgres; it runs on a fundamentally different storage architecture. The cluster volume keeps six copies of your data across three Availability Zones, automatically repairs corrupted blocks, and grows up to 128 TiB. With at least one replica in the cluster, failover typically completes in under 30 seconds, versus 1–2 minutes for standard Multi-AZ RDS.
Cluster Architecture
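The snippets in later sections reference aws_rds_cluster.main without defining it. A minimal sketch of what such a cluster might look like follows, assuming one writer and one reader on the shared cluster volume. The identifiers, instance class, and both input variables are illustrative assumptions, not values from the original setup.

```hcl
# Hypothetical inputs: supply your own subnet group and security groups.
variable "db_subnet_group_name" {
  type = string
}

variable "db_security_group_ids" {
  type = list(string)
}

resource "aws_rds_cluster" "main" {
  cluster_identifier          = "aurora-main" # illustrative name
  engine                      = "aurora-postgresql"
  database_name               = "app"       # illustrative
  master_username             = "app_admin" # illustrative
  manage_master_user_password = true        # RDS stores the password in Secrets Manager
  db_subnet_group_name        = var.db_subnet_group_name
  vpc_security_group_ids      = var.db_security_group_ids
  storage_encrypted           = true
}

# Two instances share the cluster volume: one serves as the writer,
# the other as a reader that can be promoted on failover.
resource "aws_rds_cluster_instance" "main" {
  count              = 2
  identifier         = "aurora-main-${count.index}"
  cluster_identifier = aws_rds_cluster.main.id
  engine             = aws_rds_cluster.main.engine
  engine_version     = aws_rds_cluster.main.engine_version
  instance_class     = "db.r6g.large" # illustrative size
  promotion_tier     = count.index    # lower tier is preferred as the failover target
}
```

Because every instance reads from the same storage volume, adding a reader does not require copying data, which is what keeps replica creation and failover fast.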
RDS Proxy: Connection Pooling
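Each Lambda execution environment opens its own DB connection on cold start. With 500 concurrent Lambda executions that is roughly 500 connections, enough to exhaust PostgreSQL's max_connections on smaller instance classes. RDS Proxy sits between the functions and the cluster, pooling connections and multiplexing many client sessions over a smaller set of database connections.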
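resource "aws_db_proxy" "main" {
  name           = "aurora-proxy"
  engine_family  = "POSTGRESQL"
  role_arn       = aws_iam_role.rds_proxy.arn
  require_tls    = true
  vpc_subnet_ids = var.private_subnet_ids # required argument; the variable name is an assumption

  auth {
    auth_scheme = "SECRETS"
    iam_auth    = "REQUIRED" # clients authenticate with IAM tokens instead of static passwords
    secret_arn  = aws_secretsmanager_secret.db_creds.arn
  }
}

# Targets are registered with a separate resource, not a block inside aws_db_proxy.
resource "aws_db_proxy_target" "main" {
  db_proxy_name         = aws_db_proxy.main.name
  target_group_name     = "default"
  db_cluster_identifier = aws_rds_cluster.main.cluster_identifier
}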
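How aggressively the proxy pools connections is controlled on its default target group. The sketch below shows one way this could be configured; the three numbers are illustrative starting points rather than values from the original deployment, and it assumes the aws_db_proxy.main resource above.

```hcl
resource "aws_db_proxy_default_target_group" "main" {
  db_proxy_name = aws_db_proxy.main.name

  connection_pool_config {
    # Share of the cluster's max_connections the proxy may use (illustrative).
    max_connections_percent = 75
    # Share of those connections the proxy may keep open while idle (illustrative).
    max_idle_connections_percent = 25
    # Seconds a client waits for a pooled connection before the request fails (illustrative).
    connection_borrow_timeout = 30
  }
}
```

Leaving headroom below 100% keeps some connections free for migrations and admin sessions that bypass the proxy. Multiplexing only works for sessions that are not pinned: session-level SET statements and temporary tables pin a client to one database connection, which reduces the benefit.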
Parameter Group Tuning
Aurora's defaults are conservative. The highest-impact changes for OLTP workloads:
- random_page_cost = 1.1 (default 4.0): Aurora uses SSD-backed distributed storage, so index scans are cheap
- effective_cache_size = 75% of RAM: tells the query planner how much memory is available for caching
- work_mem = 4MB globally, set per-session to 256MB for analytical queries
- log_min_duration_statement = 1000: log all queries over 1 second (the value is in milliseconds)
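Some of the settings above could be expressed as a cluster parameter group, sketched below. The group name and the family are assumptions (the family must match your engine major version), and it is worth confirming that each parameter is modifiable at cluster level for that version. effective_cache_size is deliberately left out because its value is expressed in 8 kB pages and depends on instance memory, so check the current default before overriding it.

```hcl
resource "aws_rds_cluster_parameter_group" "oltp" {
  name   = "aurora-pg-oltp"      # hypothetical name
  family = "aurora-postgresql15" # assumption: adjust to your engine major version

  parameter {
    name  = "random_page_cost"
    value = "1.1"
  }

  parameter {
    name  = "work_mem"
    value = "4096" # kB, i.e. 4MB
  }

  parameter {
    name  = "log_min_duration_statement"
    value = "1000" # milliseconds
  }
}
```

The group takes effect once it is attached to the cluster through the db_cluster_parameter_group_name argument of aws_rds_cluster. The per-session work_mem increase for analytical queries stays in the application or SQL session, not in the parameter group.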
Read Replica Auto Scaling
Aurora auto-scales read replicas based on CPU or connection count. We scale out at 60% average CPU across replicas — this absorbed a 10× traffic spike during a product launch with zero manual intervention.
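# Register the replica count as a scalable target first; the policy cannot exist without it.
# min_capacity and max_capacity are illustrative values.
resource "aws_appautoscaling_target" "aurora_read" {
  service_namespace  = "rds"
  resource_id        = "cluster:${aws_rds_cluster.main.cluster_identifier}"
  scalable_dimension = "rds:cluster:ReadReplicaCount"
  min_capacity       = 1
  max_capacity       = 15 # Aurora allows at most 15 replicas per cluster
}

resource "aws_appautoscaling_policy" "aurora_read" {
  name               = "aurora-read-cpu-target-tracking"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.aurora_read.service_namespace
  resource_id        = aws_appautoscaling_target.aurora_read.resource_id
  scalable_dimension = aws_appautoscaling_target.aurora_read.scalable_dimension

  target_tracking_scaling_policy_configuration {
    target_value = 60.0
    predefined_metric_specification {
      predefined_metric_type = "RDSReaderAverageCPUUtilization"
    }
    scale_in_cooldown  = 300
    scale_out_cooldown = 60
  }
}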
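For connection-bound workloads, the same mechanism can track average connections per reader instead of CPU. The following is a sketch only: the target of 500 connections is a hypothetical number to be derived from your instance class, and the policy assumes an aws_appautoscaling_target is already registered for the cluster's ReadReplicaCount dimension.

```hcl
resource "aws_appautoscaling_policy" "aurora_read_connections" {
  name               = "aurora-read-connections-target-tracking" # hypothetical name
  policy_type        = "TargetTrackingScaling"
  service_namespace  = "rds"
  resource_id        = "cluster:${aws_rds_cluster.main.cluster_identifier}"
  scalable_dimension = "rds:cluster:ReadReplicaCount"

  target_tracking_scaling_policy_configuration {
    target_value = 500 # hypothetical average connections per reader
    predefined_metric_specification {
      predefined_metric_type = "RDSReaderAverageDatabaseConnections"
    }
    scale_in_cooldown  = 300
    scale_out_cooldown = 60
  }
}
```

When a CPU policy and a connection policy are both attached, Application Auto Scaling scales out if either one calls for it and scales in only when both agree.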