Reducing AWS Spend by 42% for a Mid-Size SaaS Company

Background

A fast-growing B2B SaaS company had scaled from 20 to 200+ engineers over three years. Their AWS footprint had grown proportionally — but their cloud cost practices hadn't. By the time I was engaged, they were spending approximately $180,000/month on AWS with no clear ownership model for that spend.

The immediate symptoms:

Month-over-month cost growth of 8–12% with no corresponding growth in revenue or user base
Finance team receiving a monthly bill with no ability to attribute costs to products or teams
Engineers unaware of the cost implications of their infrastructure choices
Zero Reserved Instance or Savings Plan coverage

The Problem

Three root causes were driving the runaway spend:

1. No Cost Visibility

The AWS account had no tagging strategy. Every resource — EC2 instances, RDS databases, S3 buckets, load balancers — was untagged. This meant the $180,000 monthly bill arrived as an undifferentiated blob. Finance couldn't answer "which team is spending this?" and engineering couldn't answer "is our product profitable on an infrastructure basis?"

2. Massive Over-Provisioning

Without cost awareness, engineers had defaulted to large instance types. A review of their EC2 fleet found:

68% of EC2 instances were running below 15% average CPU over a 30-day window
12 m5.4xlarge instances ($0.768/hr each) were used for microservices averaging 4% CPU
Several r5.2xlarge memory-optimized instances were running workloads that didn't require high memory
Development and staging environments were running 24/7 on production-class instances
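
To put the m5.4xlarge finding in monthly terms, here is a minimal arithmetic sketch. The $0.768/hr rate and the count of 12 come from the review above; the 730-hour month is an assumption (8,760 hours per year divided by 12), and the function name is purely illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month (8,760 / 12)


def monthly_on_demand_cost(hourly_rate: float, instance_count: int,
                           hours: float = HOURS_PER_MONTH) -> float:
    """Return the monthly On-Demand cost for a group of identical instances."""
    return round(hourly_rate * hours * instance_count, 2)


# 12 m5.4xlarge instances at the rate quoted in the fleet review
fleet_cost = monthly_on_demand_cost(0.768, 12)
print(f"12 x m5.4xlarge: ${fleet_cost:,.2f}/month")
```

At 4% average CPU, most of that figure pays for idle capacity, which is why these instances were the first rightsizing candidates.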

3. All On-Demand, No Commitments

100% of their EC2 and Fargate spend was On-Demand. For a company with 3+ years of stable baseline workloads, this was the most expensive possible pricing model.

Monthly EC2 On-Demand spend:     $94,000
Equivalent with Savings Plan:    ~$54,000
Monthly overpayment:             ~$40,000

Analysis

The engagement began with a 2-week discovery phase:

Week 1: Data Collection

Exported 6 months of Cost and Usage Reports (CUR) to S3
Used AWS Cost Explorer to identify top 20 cost drivers
Ran Python scripts against the CUR data to segment spend by service, region, and instance type
Pulled CloudWatch CPU and memory metrics for all EC2 instances (memory utilization is only published where the CloudWatch agent is installed)
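
The segmentation step can be sketched roughly as follows. This is not the script used in the engagement; it is a standard-library illustration that assumes the CSV form of the CUR with column names such as `lineItem/UnblendedCost`, `lineItem/ProductCode`, `product/region`, and `product/instanceType`. The sample rows are made up for the example.

```python
import csv
import io
from collections import defaultdict

# Assumed column names, following the CSV CUR naming scheme.
COST_COL = "lineItem/UnblendedCost"
GROUP_COLS = ("lineItem/ProductCode", "product/region", "product/instanceType")


def segment_spend(rows, group_cols=GROUP_COLS, cost_col=COST_COL):
    """Sum cost per (service, region, instance type), largest first."""
    totals = defaultdict(float)
    for row in rows:
        # Non-compute line items have no instance type; bucket them as "(none)".
        key = tuple(row.get(col) or "(none)" for col in group_cols)
        totals[key] += float(row.get(cost_col) or 0.0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample standing in for a real CUR export.
SAMPLE = """lineItem/ProductCode,product/region,product/instanceType,lineItem/UnblendedCost
AmazonEC2,us-east-1,m5.4xlarge,560.64
AmazonEC2,us-east-1,m5.4xlarge,560.64
AmazonRDS,us-east-1,db.r5.large,175.20
AmazonS3,us-east-1,,42.00
"""

for key, cost in segment_spend(csv.DictReader(io.StringIO(SAMPLE))):
    print(key, round(cost, 2))
```

For a real six-month export, the same function can be fed from `csv.DictReader` over each downloaded report file, since it only needs an iterable of row dictionaries.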

Week 2: Pattern Analysis

The CloudWatch data revealed clear patterns:

Instance Category	Count	Avg CPU	Monthly Cost	Recommended Action
Heavily over-provisioned	47	< 10%	$42,300	Right-size 2 tiers down
Moderately over-provisioned	31	10–20%	$18,600	Right-size 1 tier down
Appropriately sized	28	20–60%	$22,100	Commit with Savings Plan
Under-provisioned	3	> 80%	$3,100	Scale up or investigate
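
The bucketing in the table can be expressed as a small classifier. This is a sketch under stated assumptions: the thresholds are taken from the table, the handling of boundary values (exactly 10%, 20%, 60%) is a guess, the 60–80% range is not covered by the table and is returned as "unclassified", and a "tier" is assumed to mean one step down a doubling size ladder within the same instance family.

```python
# Assumption: one "tier" is one step on this ladder within an instance family.
SIZE_LADDER = ["large", "xlarge", "2xlarge", "4xlarge", "8xlarge"]


def classify(avg_cpu: float) -> tuple:
    """Return (category, tiers_to_step_down) for a 30-day average CPU percent."""
    if avg_cpu < 10:
        return ("heavily over-provisioned", 2)
    if avg_cpu <= 20:
        return ("moderately over-provisioned", 1)
    if avg_cpu <= 60:
        return ("appropriately sized", 0)
    if avg_cpu > 80:
        return ("under-provisioned", 0)
    return ("unclassified", 0)  # 60-80% is not covered by the table


def step_down(instance_type: str, tiers: int) -> str:
    """Move an instance type down the ladder, stopping at the smallest size."""
    family, size = instance_type.split(".", 1)
    if size not in SIZE_LADDER:
        return instance_type  # unknown size: leave unchanged
    index = max(SIZE_LADDER.index(size) - tiers, 0)
    return f"{family}.{SIZE_LADDER[index]}"


category, tiers = classify(4.0)
print(category, "->", step_down("m5.2xlarge", tiers))
```

CPU alone is a coarse signal; memory, network, and burst behaviour should be checked before any instance is actually resized.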

A cost anomaly analysis also found:

$4,200/month in forgotten data transfer fees from a deprecated cross-region replication job
$2,800/month in unattached EBS volumes (73 volumes, some dating back 2 years)
$1,100/month in Elastic IPs not associated with running instances
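
A rough sketch of how unattached volumes and idle Elastic IPs can be listed with boto3 follows. It assumes boto3 is installed and credentials are configured; the helper names are hypothetical. The filtering logic is kept separate from the API calls so it can be checked against sample data. As a simplification, it flags only addresses with no association at all, and does not catch addresses attached to stopped instances.

```python
def unattached_volumes(volumes):
    """Return IDs of volumes in the 'available' state (not attached to an instance)."""
    return [v["VolumeId"] for v in volumes if v.get("State") == "available"]


def unassociated_addresses(addresses):
    """Return identifiers of Elastic IPs that have no AssociationId."""
    return [a.get("AllocationId") or a.get("PublicIp")
            for a in addresses if "AssociationId" not in a]


def scan_region(region_name):
    """Collect zombie candidates in one region (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the pure helpers work without it

    ec2 = boto3.client("ec2", region_name=region_name)
    volumes = []
    for page in ec2.get_paginator("describe_volumes").paginate():
        volumes.extend(page["Volumes"])
    addresses = ec2.describe_addresses()["Addresses"]
    return unattached_volumes(volumes), unassociated_addresses(addresses)


# Made-up records shaped like the EC2 API responses.
sample_volumes = [
    {"VolumeId": "vol-aaa", "State": "available"},
    {"VolumeId": "vol-bbb", "State": "in-use"},
]
sample_addresses = [
    {"AllocationId": "eipalloc-111", "PublicIp": "203.0.113.10"},
    {"AllocationId": "eipalloc-222", "PublicIp": "203.0.113.11",
     "AssociationId": "eipassoc-999"},
]
print(unattached_volumes(sample_volumes))
print(unassociated_addresses(sample_addresses))
```

The output of a scan like this is a candidate list only; each item still needs an owner's confirmation before deletion.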

Solution

The optimization was executed in three phases over 90 days:

Phase 1: Quick Wins (Weeks 1–4)

Tagging:

Defined a four-key tag schema: Team, Environment, Product, CostCenter
Used AWS Tag Editor to audit all resources
Wrote an AWS Config rule to alert on untagged resource creation going forward
Gave engineering leads their team's cost report for the first time
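
The four-key schema lends itself to a simple compliance check. The sketch below is illustrative rather than the Config rule used in the engagement: it assumes resource tags have already been collected into a dictionary keyed by resource ID, and the function names are hypothetical.

```python
REQUIRED_TAGS = ("Team", "Environment", "Product", "CostCenter")


def missing_tags(tags: dict) -> list:
    """Return the required tag keys that are absent or blank on a resource."""
    return [key for key in REQUIRED_TAGS if not str(tags.get(key) or "").strip()]


def compliance_score(resources: dict) -> float:
    """Percentage of resources that carry every required tag."""
    if not resources:
        return 100.0
    compliant = sum(1 for tags in resources.values() if not missing_tags(tags))
    return round(100.0 * compliant / len(resources), 1)


# Made-up inventory: resource ID -> tags
inventory = {
    "i-001": {"Team": "payments", "Environment": "prod",
              "Product": "billing", "CostCenter": "cc-12"},
    "i-002": {"Team": "payments", "Environment": "dev"},
    "vol-003": {},
}
for resource_id, tags in inventory.items():
    print(resource_id, missing_tags(tags))
print("compliance:", compliance_score(inventory), "%")
```

A score of this kind is what a shared tagging dashboard would track over time, per team or per account.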

Zombie resource cleanup:

Deleted 73 unattached EBS volumes: $2,800/month saved
Released 18 unassociated Elastic IPs: $1,100/month saved
Terminated the cross-region replication job after confirming with the data team it was unused: $4,200/month saved
Identified and terminated 8 forgotten development instances: $3,600/month saved

Phase 1 total: $11,700/month saved

Phase 2: Rightsizing (Weeks 5–10)

Working with engineering leads, we systematically right-sized over-provisioned instances:

47 heavily over-provisioned instances right-sized 2 tiers (e.g., m5.2xlarge → m5.large)
31 moderately over-provisioned instances right-sized 1 tier
All dev/staging environments moved to a scheduled start/stop policy (8am–8pm weekdays only)
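
The scheduling rule can be sketched as a simple time check. This is an illustration of the 8am–8pm weekday window only; it assumes timestamps are already in the team's local time zone, and it leaves out the actual start/stop calls.

```python
from datetime import datetime

START_HOUR = 8   # 8am
STOP_HOUR = 20   # 8pm
HOURS_PER_WEEK = 7 * 24


def should_be_running(now: datetime) -> bool:
    """True on Monday-Friday between 08:00 (inclusive) and 20:00 (exclusive)."""
    return now.weekday() < 5 and START_HOUR <= now.hour < STOP_HOUR


def weekly_running_hours() -> int:
    """Hours per week an environment is up under the schedule."""
    return 5 * (STOP_HOUR - START_HOUR)


running = weekly_running_hours()
idle_share = 1 - running / HOURS_PER_WEEK
print(f"{running} of {HOURS_PER_WEEK} hours running, {idle_share:.1%} switched off")
```

Under this window an environment is switched off for roughly two thirds of the week, which is where the scheduling savings come from.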

Key enabler: We set up CloudWatch dashboards per team so engineers could see their instance metrics in real time. This transparency changed behavior — several engineers proactively right-sized their own instances before we asked.

Phase 2 total: $44,600/month saved ($38,400 from rightsizing + $6,200 from dev/staging scheduling)

Phase 3: Commitment Strategy (Weeks 10–12)

With rightsizing complete, we had a clear picture of the stable baseline workload. We purchased:

1-year Compute Savings Plan covering $32,000/month of On-Demand EC2 spend at an effective discount of 38%: $12,160/month saved
Reserved m5.large instances for the 12 most stable microservices: $3,140/month saved

Phase 3 total: $15,300/month saved
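
The Phase 3 figures follow directly from the numbers above; a short worked check, using integer arithmetic to avoid rounding noise (variable names are illustrative):

```python
covered_spend = 32_000      # monthly On-Demand spend covered by the Savings Plan
discount_percent = 38       # effective discount quoted above
ri_savings = 3_140          # monthly saving from the Reserved m5.large instances

savings_plan_savings = covered_spend * discount_percent // 100
phase3_total = savings_plan_savings + ri_savings

print(f"Savings Plan: ${savings_plan_savings:,}/month")
print(f"Phase 3 total: ${phase3_total:,}/month")
```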

Outcome

Over 90 days, the company's monthly AWS spend dropped from $180,000 to $104,400 — a 42% reduction.

Category	Monthly Savings
Zombie resource cleanup	$11,700
EC2 rightsizing	$38,400
Dev/staging scheduling	$6,200
Savings Plans & RIs	$15,300
Total	$75,600

Annualized savings: $907,200

Beyond the direct cost savings, the company established a sustainable FinOps practice:

Weekly cost review meetings with engineering leads
Automated anomaly alerts via AWS Budgets
A tagging compliance score tracked on a shared dashboard
New hire onboarding includes a "cloud cost awareness" module

Lessons Learned

1. Visibility drives behavior. The single highest-leverage action was sharing cost reports with engineering teams. Engineers aren't indifferent to cost — they just didn't have the information to act on it.

2. Start with cleanup before committing. Purchasing Reserved Instances before rightsizing locks in waste. Always clean up and right-size first, then commit.

3. Dev/staging scheduling is underrated. Development environments that run 24/7 burn money overnight and on weekends when nobody is working; under the 8am–8pm weekday schedule used here, they run for only 60 of the 168 hours in a week. Scheduled stop/start policies are easy to implement and require no code changes.

4. FinOps is a cultural change, not a technical fix. The tools and analysis are the easy part. Getting buy-in from engineering teams — and making cost a first-class metric alongside performance and reliability — is where the real work happens.

Interested in a similar engagement for your AWS environment? Get in touch.