Recommendations for AWS

CloudZero analyzes your AWS environment and generates recommendations that identify specific resources where you can reduce costs or improve efficiency. Each recommendation includes the affected resource, the estimated savings, and guidance on how to address it.

For details on how to work with recommendations in the CloudZero UI (search, filter, group, take action), see Recommendations.

What you need

Several AWS recommendations require one or both of the following to be enabled in your AWS account. The overview table below marks which prerequisites each recommendation requires. Recommendations marked "CloudZero" use CloudZero's own billing data analysis and require no additional AWS configuration.

Prerequisite	What it enables	How to enable
Cost Optimization Hub (COH)	Savings Plans and Reserved Instance purchase recommendations, Lambda cost optimization	Enable in the AWS Cost Management console
AWS Compute Optimizer (CO)	Rightsizing, deletion, upgrade, and migration recommendations for EC2, EBS, RDS, Aurora, Fargate, and Lambda	Opt in through the AWS Compute Optimizer console

Overview of AWS Recommendations

Artificial Intelligence

Recommendation	Source	What CloudZero identifies
AWS Savings Plans Purchase Recommendations for Amazon SageMaker AI	COH	Savings Plans purchase opportunities for Amazon SageMaker based on your usage patterns

Compute

Recommendation	Source	What CloudZero identifies
Amazon EC2 Instance Consolidation for Microsoft SQL Server	CloudZero	Opportunities to consolidate Microsoft SQL Server licenses on EC2 instances
Amazon EC2 Instance Over-Provisioned for Microsoft SQL Server	CloudZero	EC2 instances running Microsoft SQL Server that have more vCPUs than needed
Amazon EC2 Instances Stopped	CloudZero	EC2 instances that are stopped and are candidates for termination
Amazon EC2 Migrate to Graviton	COH + CO	EC2 instances that can be migrated to Graviton-based instances
Amazon EC2 Reserved Instance Lease Expiration	CloudZero	EC2 Reserved Instances approaching lease expiration
Amazon EC2 Reserved Instance Optimization	CloudZero	EC2 Reserved Instance optimization opportunities
Amazon EC2 Rightsize Instances	COH + CO	EC2 instances that should be rightsized to optimize cost and performance
Amazon EC2 Stop Instances	COH + CO	EC2 instances that should be stopped to reduce costs
Amazon EC2 Upgrade Instances	COH + CO	EC2 instances that should be upgraded to newer generation instances
AWS Fargate Cost Optimization Delete Recommendations for Amazon ECS	CloudZero	Unused or idle Fargate services that should be deleted
AWS Fargate Cost Optimization Recommendations for Amazon ECS	COH + CO	Fargate services with over-provisioned CPU or memory allocations
AWS Lambda Cost Optimization Recommendations for Functions	COH + CO	Lambda functions with cost optimization opportunities
AWS Savings Plans Purchase Recommendations for Compute	COH	Savings Plans purchase opportunities for compute resources
Configure ECR Repository Lifecycle Policy to Reduce Storage Costs	CloudZero	ECR repositories without lifecycle policies configured
Delete EBS Snapshot Older Than 180 Days	CloudZero	EBS snapshots older than 180 days still incurring costs
EKS Clusters Incurring Extended Support Charges	CloudZero	EKS clusters incurring extended support charges for end-of-standard-support Kubernetes versions
Excessive EC2 Cross-Region Data Transfer	CloudZero	Accounts where cross-region data transfer costs exceed 10% of total EC2 data transfer costs
Excessive EC2/ELB Internet Traffic Bypassing CloudFront	CloudZero	Accounts using CloudFront but with significant direct internet egress from EC2/ELB
Fix Lambda Function with Excessive Error Rate	CloudZero	Lambda functions experiencing high error rates
Fix Lambda Function with Excessive Timeouts	CloudZero	Lambda functions experiencing excessive timeouts
Migrate EMR Serverless to ARM (Graviton)	CloudZero	EMR Serverless workloads on x86 architecture that could migrate to ARM Graviton
Older Generation Instances Detected	CloudZero	EC2, RDS, and ElastiCache older generation instances with at least $500 in spend

Databases

Recommendation	Source	What CloudZero identifies
Amazon Aurora Delete Clusters	COH + CO	Aurora clusters that should be deleted to reduce costs
Amazon Aurora Migrate to Graviton	COH + CO	Aurora clusters that can be migrated to Graviton-based instances
Amazon Aurora Rightsize Clusters	COH + CO	Aurora clusters that should be rightsized to optimize cost and performance
Amazon Aurora Upgrade Clusters	COH + CO	Aurora clusters that should be upgraded to newer generation types
Amazon DynamoDB Reserved Capacity Purchase Recommendations	COH	DynamoDB reserved capacity purchase opportunities
Amazon ElastiCache Reserved Node Purchase Recommendations	COH	ElastiCache Reserved Node purchase opportunities
Amazon MemoryDB Reserved Node Purchase Recommendations	COH	MemoryDB Reserved Node purchase opportunities
Amazon OpenSearch Service Reserved Instance Purchase Recommendations	COH	OpenSearch Service Reserved Instance purchase opportunities
Amazon RDS Delete Instances	COH + CO	RDS instances that should be deleted to reduce costs
Amazon RDS Migrate to Graviton	COH + CO	RDS instances that can be migrated to Graviton-based instances
Amazon RDS Reserved Instance Purchase Recommendations	COH	RDS Reserved Instance purchase opportunities
Amazon RDS Rightsize Instances	COH + CO	RDS instances that should be rightsized to optimize cost and performance
Amazon RDS Storage Delete Recommendations	COH + CO	RDS database instances with storage that can be deleted
Amazon RDS Storage Rightsize Recommendations	COH + CO	RDS database instances with storage that can be rightsized
Amazon RDS Storage Upgrade Recommendations	COH + CO	RDS database instances where storage can be upgraded to more cost-effective options
Amazon RDS Upgrade Instances	COH + CO	RDS instances that should be upgraded to newer generation types
Amazon Redshift Reserved Node Purchase Recommendations	COH	Redshift Reserved Node purchase opportunities
Delete Inactive DynamoDB Tables	CloudZero	DynamoDB tables incurring storage costs with no usage activity
Excessive RDS Backup Retention	CloudZero	RDS backups and manual snapshots retained beyond 90 days
RDS Clusters Incurring Extended Support Charges	CloudZero	RDS instances and clusters on outdated engine versions incurring extended support charges
RDS Snapshot Costs Are Higher Than Expected	CloudZero	RDS snapshot costs exceeding 10% of total RDS costs
Underutilized Amazon Redshift Clusters	CloudZero	Redshift clusters that are underutilized
Upgrade Elasticsearch to Avoid Extended Support Charges	CloudZero	Elasticsearch domains running EOL versions incurring extended support fees
Upgrade OpenSearch to Avoid Extended Support Charges	CloudZero	OpenSearch domains running EOL versions incurring extended support fees

Management Tools

Recommendation	Source	What CloudZero identifies
CloudWatch Costs Higher Than Expected	CloudZero	CloudWatch costs that have increased beyond expected thresholds
Redundant CloudTrail Usage Detected	CloudZero	Accounts being charged for CloudTrail events due to redundant instances

Networking & Content Delivery

Recommendation	Source	What CloudZero identifies
Delete Idle Load Balancer	CloudZero	Classic Load Balancers that are idle
Delete Inactive AWS Network Firewall	CloudZero	Network Firewalls that have processed 0 bytes in the last 30 days
Delete Inactive Gateway Load Balancer Endpoint	CloudZero	Gateway Load Balancer endpoints that have processed 0 bytes in the last 30 days
Delete Inactive VPC Interface Endpoint	CloudZero	VPC interface endpoints that have processed 0 bytes in the last 30 days
Inefficient AWS NAT Gateway Detected	CloudZero	NAT Gateways with hourly charges but minimal data processing
Managed NAT Gateway with Excessive Data Transfer	CloudZero	NAT Gateways where data transfer costs exceed 60% of total gateway costs
Release Idle Elastic IP Addresses	CloudZero	Elastic IP addresses allocated but not associated with running resources

Storage

Recommendation	Source	What CloudZero identifies
Amazon EBS Delete Volumes	COH + CO	EBS volumes that should be deleted to reduce costs
Amazon EBS Rightsize Volumes	COH + CO	EBS volumes that should be rightsized to optimize cost and performance
Amazon EBS Upgrade Volumes	COH + CO	EBS volumes that should be upgraded to newer generation types
Configure S3 Lifecycle Policy to Abort Incomplete Multipart Uploads	CloudZero	S3 buckets without lifecycle policies for incomplete multipart upload cleanup
Consider Intelligent-Tiering or Lifecycle Rules for S3	CloudZero	S3 buckets with spend only on Standard Storage
High Data Retrieval Costs for S3 Glacier Storage	CloudZero	Data retrieval costs on S3 Glacier storage tiers exceeding $100 over 30 days
High Non-Standard API Requests for S3	CloudZero	High spend on non-standard API requests (LIST, HEAD) to S3
High Ratio of S3 API Cost to Storage Cost	CloudZero	S3 buckets where API request costs exceed 80% of total bucket costs
High S3 Administrative Fees	CloudZero	S3 buckets where administrative fees exceed 10% of total bucket cost
Unarchived Old EBS Snapshots	CloudZero	EBS snapshots stored in standard storage for over 90 days, candidates for archive

Artificial Intelligence

AWS Savings Plans Purchase Recommendations for Amazon SageMaker AI

Amazon SageMaker Savings Plans offer significant savings on SageMaker usage in exchange for a commitment to a consistent amount of usage (measured in $/hour) for a one- or three-year term. AWS Trusted Advisor analyzes your SageMaker usage patterns and provides recommendations for purchasing Savings Plans that could reduce your costs.

How to address this

Review the recommended Savings Plan commitment amount and term
Navigate to the AWS Cost Management console
Go to Savings Plans > Purchase Savings Plans
Select SageMaker Savings Plans as the Savings Plans type
Enter the recommended commitment amount
Choose the appropriate term (1-year or 3-year)
Select the payment option (All Upfront, Partial Upfront, or No Upfront)
Review and complete the purchase

Additional details

Savings Plans provide flexibility to change instance families, sizes, operating systems, and regions
Longer commitment terms (3 years) typically offer higher savings rates
All Upfront payment provides the highest discount
Savings Plans automatically apply to eligible usage across your AWS account
You can stack multiple Savings Plans to match your usage patterns
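
Because the commitment is expressed in $/hour over a fixed term, it can help to translate a recommended hourly amount into total committed spend before purchasing. The following is a minimal sketch; the function name and the 8,760 hours-per-year figure (365 days) are assumptions for illustration, not values taken from AWS or CloudZero.

```python
HOURS_PER_YEAR = 8760  # assumption: 365-day year


def total_commitment(hourly_commitment, term_years):
    """Return the total spend committed over a 1-year or 3-year term."""
    if term_years not in (1, 3):
        raise ValueError("Savings Plans terms are 1 or 3 years")
    return round(hourly_commitment * HOURS_PER_YEAR * term_years, 2)


# Hypothetical recommendation of $2.50/hour
print(total_commitment(2.50, 1))
print(total_commitment(2.50, 3))
```

The committed amount is billed whether or not usage reaches it, so compare this total with your expected SageMaker usage over the same term.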

Compute

Amazon EC2 Instance Consolidation for Microsoft SQL Server

Identifies opportunities to consolidate Microsoft SQL Server licenses on Amazon EC2 instances by using instances with more vCPUs to reduce licensing costs.

This recommendation analyzes your EC2 instances running Microsoft SQL Server and identifies cases where instances are running with fewer vCPUs than the minimum required for SQL Server licensing, multiple smaller instances could be consolidated into larger instances, or SQL Server editions could benefit from instance consolidation.

Microsoft SQL Server licensing is often based on core/vCPU counts, and there are minimum licensing requirements. By consolidating workloads onto instances with more vCPUs that meet or exceed these minimums, you can reduce the total number of SQL Server licenses needed, improve SQL Server performance through better resource allocation, and simplify management by reducing the number of instances.

How to address this

Review the specific instances flagged, noting current instance type, vCPU count, SQL Server edition, and minimum recommended vCPU count
Analyze whether workloads can be consolidated onto larger instance types, combined with other SQL Server instances, or migrated to instances that better match licensing tiers
Plan the consolidation: identify target instance types, group compatible workloads, schedule during maintenance windows, and prepare rollback procedures
Test SQL Server performance on consolidated instances, verify license compliance, and test application connectivity before production implementation
Implement during scheduled maintenance: backup databases, follow SQL Server best practices, and update monitoring and backup configurations
Monitor performance metrics, verify cost savings, and document the new configuration

Additional details

Cost impact calculation:

Inefficiency Ratio = (Minimum vCPU - Current vCPU) / Minimum vCPU
Estimated Savings = Instance Cost × Inefficiency Ratio × 0.30

The 0.30 factor estimates that SQL Server licensing comprises approximately 30% of total EC2 instance costs. For example, an instance with 1 vCPU but requiring 4 vCPU minimum (75% inefficiency) costing $100/month would show estimated savings of $22.50/month ($100 × 0.75 × 0.30). Actual savings vary based on your specific licensing agreements and instance types.
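
The calculation above can be sketched in a few lines of Python. The function and parameter names are hypothetical, and treating instances that already meet the minimum as having zero inefficiency is an assumption of this sketch rather than documented CloudZero behavior.

```python
def sql_server_consolidation_savings(instance_cost, current_vcpu, minimum_vcpu,
                                     licensing_share=0.30):
    """Estimate monthly savings from the inefficiency-ratio formula."""
    if minimum_vcpu <= 0:
        raise ValueError("minimum_vcpu must be positive")
    # Assumption: an instance at or above the minimum has no inefficiency.
    inefficiency_ratio = max(minimum_vcpu - current_vcpu, 0) / minimum_vcpu
    return instance_cost * inefficiency_ratio * licensing_share


# The worked example: 1 vCPU, 4 vCPU minimum, $100/month
print(round(sql_server_consolidation_savings(100.0, 1, 4), 2))
```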

Licensing Models: Ensure compliance with Microsoft SQL Server licensing agreements when consolidating instances
High Availability: Consider the impact on your high availability and disaster recovery strategy
Resource Isolation: Evaluate whether workload consolidation aligns with your security and isolation requirements

Amazon EC2 Instance Over-Provisioned for Microsoft SQL Server

Identifies EC2 instances running Microsoft SQL Server that have more vCPUs than needed for SQL Server licensing, presenting opportunities to rightsize to smaller instance types and reduce costs.

Each recommendation includes the current instance type and vCPU count, maximum recommended vCPU count based on workload analysis, recommended instance type, and estimated monthly savings.

How to address this

Review each flagged instance: current instance type, vCPU count, recommended instance type, estimated savings, and SQL Server edition
Analyze workload patterns: review CPU utilization over time, identify peak usage, verify the recommended size can handle peak loads
Plan the rightsizing: prioritize by savings potential, schedule during maintenance windows, prepare rollback procedures
Test in non-production: validate SQL Server performance, verify application behavior under load
Execute during scheduled maintenance: stop the instance, change the instance type, start and verify SQL Server, test connectivity
Monitor CPU and memory utilization, SQL Server performance, and application response times

Additional details

Licensing Compliance: Verify that downsizing maintains compliance with Microsoft SQL Server licensing requirements
High Availability: Consider the impact on your HA/DR strategy when changing instance types
Stop/Start Impact: Changing instance types requires stopping the instance, which causes downtime
Elastic IPs: Retained when changing instance types
Instance Store: Data is lost when stopping the instance (EBS-backed volumes are preserved)

Amazon EC2 Instances Stopped

Identifies EC2 instances that are currently stopped and are candidates for termination to reduce costs. While stopped instances do not incur compute charges, they still have associated costs from EBS volumes, Elastic IP addresses, and other resources.

How to address this

Review stopped instances to determine if they are still needed
Check for associated EBS volumes and other resources
Consider terminating instances that are no longer required
Verify no critical data will be lost before termination

Amazon EC2 Migrate to Graviton

Identifies EC2 instances that can be migrated to Graviton-based instances for cost optimization.

How to address this

Review application compatibility with ARM-based processors
Migrate eligible instances to Graviton-based instance types
Test performance and functionality after migration

Amazon EC2 Reserved Instance Lease Expiration

Identifies EC2 Reserved Instances approaching lease expiration. When a Reserved Instance lease expires, the instances keep running but are billed at On-Demand rates, which forfeits a Reserved Instance discount of up to 72%.

How to address this

Review Reserved Instances approaching expiration
Consider renewing leases for consistent workloads
Evaluate if Reserved Instances still match current usage patterns
Consider converting to Savings Plans for more flexibility
Plan for renewal well before expiration to avoid cost spikes

Additional details

Plan renewals 30-60 days before expiration
Consider usage patterns and workload changes
Evaluate if Reserved Instances still provide optimal coverage
Review instance types and sizes for current needs
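
To turn the 30-60 day guidance into calendar dates, a small helper like the one below can be used. This is an illustrative sketch: the function name and the example expiration date are hypothetical.

```python
from datetime import date, timedelta


def renewal_window(expiration, earliest_days=60, latest_days=30):
    """Return (start, end) of the suggested renewal planning window."""
    start = expiration - timedelta(days=earliest_days)
    end = expiration - timedelta(days=latest_days)
    return start, end


# Hypothetical lease expiring on 31 December 2025
start, end = renewal_window(date(2025, 12, 31))
print(start.isoformat(), end.isoformat())
```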

Amazon EC2 Reserved Instance Optimization

Identifies EC2 Reserved Instance optimization opportunities. AWS Trusted Advisor analyzes your usage patterns and recommends purchases or modifications that can reduce your compute costs.

How to address this

Purchase new Reserved Instances for consistent workloads
Modify existing Reserved Instances to better match your usage patterns
Exchange Reserved Instances for different instance types or regions
Consider Reserved Instance Marketplace for unused capacity

Additional details

Reserved Instances can provide up to 72% savings compared to On-Demand pricing for consistent workloads.

Amazon EC2 Rightsize Instances

Identifies EC2 instances that are over-provisioned or under-provisioned and should be rightsized to optimize cost and performance.

How to address this

Rightsize instances to match actual resource utilization
Review CPU, memory, and network utilization metrics
Test performance after rightsizing to ensure application requirements are met
Consider using CloudWatch metrics to validate rightsizing recommendations

Amazon EC2 Stop Instances

Identifies EC2 instances that should be stopped to reduce costs.

How to address this

Stop instances that are not actively being used
Review instance usage patterns before stopping
Consider using scheduled stop/start for development instances
Implement automated stop policies for non-production environments

Amazon EC2 Upgrade Instances

Identifies EC2 instances that should be upgraded to newer generation instances for better price-performance.

How to address this

Upgrade instances to newer generation types for better price-performance
Review application compatibility with newer instance types
Test performance and functionality after upgrade
Consider Reserved Instances for upgraded instances to maximize savings

AWS Fargate Cost Optimization Delete Recommendations for Amazon ECS

Identifies unused or idle Fargate services that should be deleted to eliminate unnecessary costs. Fargate services with no recent task executions, idle services, and services inactive for extended periods are flagged for deletion.

How to address this

Delete the identified Fargate services to eliminate ongoing costs from idle container infrastructure including CPU, memory, and data transfer charges.

AWS Fargate Cost Optimization Recommendations for Amazon ECS

Identifies Fargate services with over-provisioned CPU or memory allocations that should be rightsized to reduce costs while maintaining performance.

How to address this

Rightsize Fargate services to more appropriate CPU and memory allocations based on actual utilization patterns.

AWS Lambda Cost Optimization Recommendations for Functions

This recommendation identifies AWS Lambda functions that have cost optimization opportunities based on AWS Trusted Advisor recommendations.

How it works

AWS Trusted Advisor continuously monitors your Lambda functions and provides recommendations for cost optimization opportunities. This recommendation surfaces those recommendations to help you identify potential savings.

What CloudZero identifies

Lambda functions with cost optimization opportunities
Functions that could benefit from memory allocation adjustments
Underutilized Lambda functions that could be optimized
Function configuration optimizations
Cost optimization recommendations from AWS Trusted Advisor

Additional details

Uses AWS Trusted Advisor's c1z7kmr05n check for Lambda cost optimization
Leverages Trusted Advisor's estimated savings calculations
Provides dynamic titles with specific recommendations and resource IDs
Focuses on Lambda service costs and function-specific optimizations

Cost impact

The recommendation calculates potential savings based on Trusted Advisor's estimates for cost optimization opportunities in Lambda functions, including memory allocation, timeout settings, and other function-specific optimizations.

AWS Savings Plans Purchase Recommendations for Compute

This recommendation identifies AWS Savings Plans purchase opportunities for compute resources based on AWS Trusted Advisor recommendations.

How it works

AWS Trusted Advisor analyzes your compute usage patterns across Amazon EC2, AWS Fargate, and AWS Lambda to provide Savings Plans purchase recommendations. This recommendation surfaces those opportunities to help you identify potential savings through committed usage discounts.

What CloudZero identifies

Savings Plans purchase opportunities for compute resources
Recommended commitment amounts and terms
Estimated monthly savings from purchasing Savings Plans
Account-level purchase recommendations
Cost optimization opportunities from AWS Cost Optimization Hub

Additional details

Uses AWS Trusted Advisor's c1z7kmr09n check for Savings Plans recommendations
Leverages Trusted Advisor's estimated savings calculations
Provides dynamic titles with specific recommendations
Covers EC2, Fargate, and Lambda compute resources
Account-level recommendations rather than resource-specific

Cost impact

The recommendation calculates potential savings based on Trusted Advisor's estimates for Savings Plans purchases, including recommended commitment amounts, terms, and expected monthly savings.

Configure ECR Repository Lifecycle Policy to Reduce Storage Costs

This check identifies Amazon Elastic Container Registry (ECR) repositories that do not have lifecycle policies configured. Without lifecycle policies, repositories can accumulate old, unused, and untagged container images over time, leading to unnecessary storage costs.

Additional details

ECR repositories without lifecycle policies tend to accumulate images indefinitely. This includes:

Old image versions that are no longer deployed
Untagged images from failed or interrupted builds
Development and testing images that are no longer needed
Multiple versions of images that exceed retention requirements

Implementing lifecycle policies can significantly reduce storage costs by automatically removing old or unused images based on criteria you define.

How to address this

Configure lifecycle policies for your ECR repositories to automatically clean up old and unused images. A typical lifecycle policy might:

Keep only the last N tagged images
Remove untagged images after a certain period (e.g., 7-14 days)
Remove images older than a certain age
Keep images with specific tags (like "production" or "latest")

Cost impact

The estimated savings is based on your current ECR storage costs. By implementing lifecycle policies, you can typically reduce storage by 20-30% through removal of:

Untagged images from failed builds
Old versions of images no longer in use
Development and testing images

Actual savings will vary based on your image retention requirements and current repository management practices.

How to address this

Open the Amazon ECR console
Navigate to the repository identified in the recommendation
Click "Lifecycle Policy" in the left navigation
Create a new lifecycle policy using the visual editor or JSON
Define rules for image retention (e.g., keep last 10 images, remove untagged after 7 days)
Test the policy using the lifecycle policy preview (test rules) before applying it
Save and enable the lifecycle policy
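
If you prefer the JSON editor, the example rules above (keep the last 10 images, remove untagged images after 7 days) could be expressed roughly as follows. This is a sketch: the function name and default values are illustrative, and the rule descriptions are free text.

```python
import json


def build_lifecycle_policy(keep_last=10, untagged_days=7):
    """Build an ECR lifecycle policy document as a dict."""
    return {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Expire untagged images after {untagged_days} days",
                "selection": {
                    "tagStatus": "untagged",
                    "countType": "sinceImagePushed",
                    "countUnit": "days",
                    "countNumber": untagged_days,
                },
                "action": {"type": "expire"},
            },
            {
                # A rule with tagStatus "any" must have the highest priority number.
                "rulePriority": 2,
                "description": f"Keep only the last {keep_last} images",
                "selection": {
                    "tagStatus": "any",
                    "countType": "imageCountMoreThan",
                    "countNumber": keep_last,
                },
                "action": {"type": "expire"},
            },
        ]
    }


policy_json = json.dumps(build_lifecycle_policy(), indent=2)
print(policy_json)
```

The printed JSON can be pasted into the console's JSON editor, or saved to a file and applied with aws ecr put-lifecycle-policy --repository-name <repository-name> --lifecycle-policy-text file://policy.json.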

Delete EBS Snapshot Older Than 180 Days

This recommendation identifies EBS snapshots that are older than 180 days and still incurring costs. Snapshots this old are often outdated and no longer needed, so deleting them reduces storage spend.

Threshold: This recommendation is created when the total spend for the identified snapshots exceeds $500 in real cost. Once cleanup reduces that total below $500, the recommendation closes automatically.
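
The age and spend thresholds combine as in the sketch below. The snapshot records, field names, and dates are hypothetical, and treating a total of exactly $500 as closed is an assumption of this example.

```python
from datetime import date, timedelta

MAX_AGE_DAYS = 180
SPEND_THRESHOLD = 500.0


def old_snapshot_spend(snapshots, today):
    """Sum the cost of snapshots created more than MAX_AGE_DAYS ago."""
    cutoff = today - timedelta(days=MAX_AGE_DAYS)
    return sum(s["cost"] for s in snapshots if s["created"] < cutoff)


def recommendation_is_open(snapshots, today):
    """The recommendation stays open while old-snapshot spend exceeds the threshold."""
    return old_snapshot_spend(snapshots, today) > SPEND_THRESHOLD


today = date(2025, 7, 1)
snapshots = [
    {"id": "snap-a", "created": date(2024, 6, 1), "cost": 400.0},
    {"id": "snap-b", "created": date(2024, 11, 15), "cost": 150.0},
    {"id": "snap-c", "created": date(2025, 5, 1), "cost": 900.0},  # too recent to count
]
print(recommendation_is_open(snapshots, today))
print(recommendation_is_open(snapshots[1:], today))  # after deleting snap-a
```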

EKS Clusters Incurring Extended Support Charges

This recommendation identifies Amazon EKS (Elastic Kubernetes Service) clusters that are incurring extended support charges for using Kubernetes versions that have reached end-of-standard-support.

What Are EKS Extended Support Charges?

AWS charges additional fees for EKS clusters running on Kubernetes versions that have passed their standard support end date. Extended support provides:

Security patches and bug fixes for the Kubernetes control plane
Continued access to Amazon EKS optimized AMIs
Technical support for the extended version

However, these charges can be significant and are avoidable by upgrading to a supported Kubernetes version.

Cost impact

Clusters in extended support are billed at a higher control plane rate:

~$0.60/hour per cluster (~$438/month) while the cluster runs a version in extended support
This replaces the standard EKS cluster rate ($0.10/hour, ~$73/month)
Represents a 6x increase in control plane costs

For organizations with multiple clusters, these charges can accumulate to thousands of dollars per month.
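
To see how the charges accumulate across clusters, the arithmetic can be sketched as below. It assumes 730 hours per month and treats $0.60/hour as the total extended support rate, which is what the 6x figure implies; the function names are illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average month
STANDARD_RATE = 0.10   # $/hour per cluster, standard support
EXTENDED_RATE = 0.60   # $/hour per cluster, extended support


def monthly_control_plane_cost(cluster_count, hourly_rate):
    """Monthly control plane cost for a number of clusters at a given rate."""
    return round(cluster_count * hourly_rate * HOURS_PER_MONTH, 2)


def avoidable_monthly_cost(cluster_count):
    """Monthly amount saved by moving clusters back to standard support."""
    extra_rate = EXTENDED_RATE - STANDARD_RATE
    return round(cluster_count * extra_rate * HOURS_PER_MONTH, 2)


print(monthly_control_plane_cost(1, EXTENDED_RATE))
print(avoidable_monthly_cost(5))
```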

Additional details

Cost Optimization: Eliminating extended support charges immediately reduces EKS costs
Security: Newer Kubernetes versions include important security improvements
Features: Access to latest Kubernetes features and improvements
Performance: Newer versions often include performance enhancements
Compliance: Running EOL software can violate security policies

How to address this

Upgrade your EKS clusters to a Kubernetes version that is within standard support.

Check Current Version

aws eks describe-cluster --name <cluster-name> --query cluster.version

Upgrade Process

Review the upgrade path: EKS only allows upgrading one minor version at a time (e.g., 1.21 → 1.22 → 1.23)

Update control plane:

aws eks update-cluster-version --name <cluster-name> --kubernetes-version <version>

Update node groups:
- Managed node groups: Update through AWS Console or CLI
- Self-managed nodes: Update AMIs and roll out new nodes

Update add-ons:

aws eks update-addon --cluster-name <cluster-name> --addon-name <addon> --addon-version <version>

Test thoroughly between each version upgrade
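
Because each upgrade moves only one minor version, it can be useful to list the intermediate versions before scheduling the work. The helper below is a hypothetical sketch that only handles versions written as major.minor within the same major version.

```python
def upgrade_path(current, target):
    """Return the minor versions to step through, one at a time."""
    cur_major, cur_minor = (int(part) for part in current.split("."))
    tgt_major, tgt_minor = (int(part) for part in target.split("."))
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles one major version")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not supported")
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


print(upgrade_path("1.21", "1.23"))
```

Each entry in the returned list corresponds to one pass through the control plane, node group, and add-on updates above.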

Important Considerations

Application compatibility: Test workloads with new Kubernetes API versions
Deprecated APIs: Check for deprecated API usage in your manifests
Add-ons: Ensure all add-ons (CNI, CoreDNS, kube-proxy) are compatible
Helm charts: Verify Helm chart compatibility with target version
Downtime: Plan upgrade window (control plane upgrade causes brief API disruption)

Current Support Timeline

AWS provides 14 months of standard support for each Kubernetes version. For current version support dates, see the AWS EKS documentation.

Best Practices

Stay current: Aim to be within 2 minor versions of latest
Upgrade regularly: Don't let versions fall too far behind
Test in non-prod first: Always test upgrades in dev/staging
Automate: Use GitOps tools (ArgoCD, Flux) for consistent deployments
Monitor: Set up alerts for version EOL dates
Plan ahead: Schedule upgrades well before standard support ends

Excessive EC2 Cross-Region Data Transfer

This recommendation identifies AWS accounts where EC2 cross-region data transfer costs exceed 10% of total EC2 data transfer costs. Cross-region data transfer occurs when EC2 instances in one region communicate with resources in another region, incurring per-GB charges that are significantly higher than same-region transfers. These costs often indicate architectural inefficiencies.

Cost impact

Estimated savings: 75% reduction by architecting to keep data within the same region.

Cross-region data transfer is charged per GB, while same-region transfers within an availability zone are free. By consolidating resources within a single region or deploying complete regional stacks, most cross-region costs can be eliminated.
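
The 10% threshold and 75% savings estimate can be illustrated with a short sketch. The function name and inputs are hypothetical, and applying the 75% rate to the cross-region portion of the bill is an assumption about how the estimate is derived.

```python
SHARE_THRESHOLD = 0.10  # cross-region share of EC2 data transfer cost
SAVINGS_RATE = 0.75     # estimated reduction from keeping data in-region


def cross_region_finding(cross_region_cost, total_transfer_cost):
    """Return the share and estimated savings, or None if below the threshold."""
    if total_transfer_cost <= 0:
        return None
    share = cross_region_cost / total_transfer_cost
    if share <= SHARE_THRESHOLD:
        return None
    return {
        "share": round(share, 4),
        # Assumption: the savings rate applies to cross-region cost only.
        "estimated_savings": round(cross_region_cost * SAVINGS_RATE, 2),
    }


print(cross_region_finding(2000.0, 8000.0))
print(cross_region_finding(500.0, 8000.0))
```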

Additional details

High Cost: Cross-region transfers are significantly more expensive than same-region transfers
Performance: Added latency between regions impacts application response times
Architectural Issues: Services and data not co-located
Hidden Costs: Compute overhead, replication delays, retry logic

Common Causes

Multi-region without purpose: HA deployed but never used
Legacy migration artifacts: Partial migration between regions
Centralized data stores: Single database/cache serving multiple regions
VPC peering misuse: Cross-region peering for convenience
Backup/DR traffic: Continuous replication instead of snapshots

How to address this

Step 1: Identify Traffic Sources

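Use VPC Flow Logs or Cost Explorer to identify which resources are generating cross-region traffic. In Cost Explorer, inter-region transfer appears under usage types that name both regions, such as USE1-USW2-AWS-Out-Bytes; DataTransfer-Regional-Bytes covers traffic between Availability Zones within a single region.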

Step 2: Fix Common Patterns

Application and Database Split:

Incorrect: App in us-east-1 communicating with RDS in us-west-2
Correct: App in us-east-1 communicating with RDS in us-east-1

Centralized Services:

Incorrect: Services in multiple regions connecting to single Redis in us-east-1
Correct: Each region has its own Redis instance

Cross-Region Microservices:

Incorrect: Service A in us-east-1 calling Service B in us-west-2
Correct: Both services in same region, or both deployed in each region

Step 3: Choose Architecture Strategy

Option A: Single-Region (Simplest)

Deploy all resources in one region
Best for most applications

Option B: True Multi-Region (For HA/DR)

Deploy complete independent stacks in each region
Use Route53 geo-routing
NO cross-region traffic during normal operation

Option C: Active-Passive DR (Lower cost)

Primary region with hot data
Standby region with snapshots only
Failover only in disasters

Step 4: Use VPC Endpoints

Replace cross-region AWS service calls with VPC endpoints and regional buckets.

Step 5: Optimize Required Cross-Region Traffic

If cross-region is unavoidable:

Use AWS PrivateLink (lower cost)
Batch transfers instead of real-time streaming
Compress data before transfer

Step 6: Monitor

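Set up cost alerts, for example an AWS Budgets alert filtered to the inter-region usage types (those that name both regions, such as USE1-USW2-AWS-Out-Bytes), to catch regressions.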

When Cross-Region Is Acceptable

Disaster recovery snapshots (periodic, not continuous)
CloudFront with regional origins
Regulatory compliance requirements
True global applications with regional data isolation

Prevention Strategies

Enforce regional deployment patterns in IaC
Use security groups to block unexpected cross-region traffic
Tag resources with region and monitor spending
Review architecture for new deployments

Excessive EC2/ELB Internet Traffic Bypassing CloudFront

This recommendation identifies AWS accounts using CloudFront CDN but with significant direct internet egress from EC2/ELB. When traffic bypasses CloudFront, you pay 2-5x higher data transfer costs and miss caching, DDoS protection, and global performance benefits.

What CloudZero identifies

Accounts with active CloudFront distributions
EC2/ELB direct egress >10% of CloudFront egress costs
Minimum $1,000/month direct egress
New services bypassing existing CDN architecture

Cost impact

Savings calculation: 50% reduction in direct egress costs through CloudFront routing, caching, and Origin Shield.

Example: $75k/month in direct egress = $37,500/month savings ($450k annually)
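
The detection criteria and savings calculation above can be sketched as follows. The function name and the CloudFront egress figure are hypothetical, and the exact comparison operators are assumptions of this example.

```python
DIRECT_SHARE_THRESHOLD = 0.10  # direct egress relative to CloudFront egress cost
MIN_DIRECT_EGRESS = 1000.0     # minimum monthly direct egress in dollars
SAVINGS_RATE = 0.50            # estimated reduction in direct egress cost


def cloudfront_bypass_savings(direct_egress, cloudfront_egress):
    """Return estimated savings, or None if the account is not flagged."""
    if cloudfront_egress <= 0:
        return None  # no active CloudFront usage to compare against
    if direct_egress < MIN_DIRECT_EGRESS:
        return None
    if direct_egress <= DIRECT_SHARE_THRESHOLD * cloudfront_egress:
        return None
    monthly = direct_egress * SAVINGS_RATE
    return {"monthly": monthly, "annual": monthly * 12}


# The $75k/month example, with a hypothetical $200k/month CloudFront bill
print(cloudfront_bypass_savings(75_000.0, 200_000.0))
```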

Why 50% savings:

CloudFront caching reduces origin bandwidth 50-90%
Origin Shield adds additional cache layer
Reduced compute costs from fewer origin requests
Better compression and optimization

Additional details

1. Higher data transfer costs

Direct is 2-5x more expensive

2. No caching benefits

Every request hits origin servers
Increased compute and database load
Higher latency for global users

3. Missing security & performance

No AWS Shield DDoS protection
Single-region latency vs edge caching
Increased attack surface

Common Causes

New services deployed without CDN: Microservices/APIs bypass existing CloudFront
"Dynamic content" misconception: CloudFront caches API responses; even 1-second cache helps
Legacy architecture: Pre-CDN infrastructure still serving traffic
Direct API access: Mobile apps/integrations pointing to ALB/EC2 directly

How to address this

Step 1: Identify Sources

Use AWS Cost Explorer to find high-egress resources:

Service: EC2/ELB
Usage Type: DataTransfer-Out-Bytes
Group by: Resource

Step 2: Add Origins to CloudFront

Console: CloudFront → Distributions → Origins → Create origin

Origin domain: Your ALB DNS or EC2 endpoint
Protocol: HTTPS only
Enable Origin Shield for additional caching

Terraform example:

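# Fragment: both blocks belong inside an aws_cloudfront_distribution resource.
origin {
  domain_name = aws_lb.app.dns_name
  origin_id   = "ALB"

  custom_origin_config {
    http_port              = 80
    https_port             = 443
    origin_protocol_policy = "https-only"
    origin_ssl_protocols   = ["TLSv1.2"]
  }

  origin_shield {
    enabled              = true
    origin_shield_region = "us-east-1"
  }
}

default_cache_behavior {
  target_origin_id       = "ALB"
  viewer_protocol_policy = "redirect-to-https"
  allowed_methods        = ["GET", "HEAD"] # Extend if the origin handles writes
  cached_methods         = ["GET", "HEAD"]

  min_ttl     = 0
  default_ttl = 60 # Even a 1-minute TTL helps
  max_ttl     = 3600
}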

Step 3: Update DNS & Application Configs

Point your domain to CloudFront instead of direct ALB/EC2 endpoints.

Step 4: Configure Caching

For dynamic content, cache based on query strings with short TTLs (30-60 seconds).
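
A rough model helps show why short TTLs still pay off for busy endpoints. The Python sketch below is an illustration under simplifying assumptions: it considers one cache key at a single cache node, a steady request rate, and no variation from query strings or headers. Real CloudFront hit ratios are spread across many edge locations and will be lower. The function name is hypothetical.

```python
def estimated_hit_ratio(requests_per_second: float, ttl_seconds: float) -> float:
    """Approximate cache hit ratio for one cache key at a single cache node.

    Model: each TTL cycle starts with one miss that fetches from the origin,
    followed by requests_per_second * ttl_seconds hits until the object expires.
    """
    if requests_per_second < 0 or ttl_seconds < 0:
        raise ValueError("rate and TTL must be non-negative")
    hits_per_cycle = requests_per_second * ttl_seconds
    return hits_per_cycle / (1 + hits_per_cycle)


# Compare TTLs for an endpoint that receives 20 requests per second.
for ttl in (1, 30, 60):
    print(ttl, round(estimated_hit_ratio(20, ttl), 3))
```

Under this model, an endpoint receiving 20 requests per second reaches a hit ratio of roughly 95% with a 1-second TTL, while a rarely requested endpoint gains little from the same TTL.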

Step 5: Monitor Results

Check CloudFront CacheHitRate metric
Verify 50-90% reduction in origin requests
Monitor cost savings in Cost Explorer

When Direct Egress Is Acceptable

Database replication, backups to third-party services
VPN connections, B2B integrations with strict IP requirements
Streaming protocols not supported by CloudFront

Fix Lambda Function with Excessive Error Rate

This recommendation identifies AWS Lambda functions that are experiencing high error rates, which can impact reliability, user experience, and costs.

How it works

AWS Trusted Advisor monitors your Lambda functions and identifies those with elevated error rates. Functions with high error rates can indicate code issues, configuration problems, or external service dependencies that impact application reliability and increase operational costs.

What CloudZero identifies

Lambda functions with high error rates
Functions that need error handling improvements
Code quality and reliability optimization opportunities
Configuration issues that are causing failures
Cost optimization recommendations from AWS Trusted Advisor

How it works

Uses AWS Trusted Advisor's L4dfs2Q3C2 check for Lambda function error rate analysis
Leverages Trusted Advisor's error metrics and recommendations
Provides dynamic titles with specific actions
Covers all Lambda functions across all regions
Function-level recommendations for targeted optimization

Cost and Reliability Impact

Lambda functions with high error rates can result in:

Increased execution costs from failed invocations
Poor user experience from service failures
Potential cascading failures in dependent systems
Higher operational overhead for error handling
Missed opportunities for reliability optimization

Error Rate Optimization Strategies

Error handling: Implement comprehensive error handling and logging
Code quality: Improve code robustness and error prevention
Configuration review: Check function configuration and permissions
External service reliability: Optimize calls to external services
Retry logic: Implement appropriate retry mechanisms
Monitoring and alerting: Set up proper monitoring for error detection

Common Error Causes

Permission issues: Insufficient IAM permissions for function execution
External service failures: Unreliable external API or service calls
Resource constraints: Insufficient memory or timeout configurations
Code bugs: Logic errors or unhandled exceptions
Configuration problems: Incorrect environment variables or settings
Network issues: Connectivity problems to external resources

Reliability Improvement Recommendations

Implement comprehensive error handling: Catch and handle all potential errors
Add proper logging: Use structured logging for better debugging
Review IAM permissions: Ensure functions have appropriate permissions
Optimize external calls: Implement timeouts and retry logic for external services
Monitor error patterns: Use CloudWatch to track error trends
Implement circuit breakers: Prevent cascading failures

Best Practices

Implement proper error handling and logging in all functions
Use CloudWatch metrics to monitor error rates and trends
Set up alerts for error rate thresholds
Implement retry logic with exponential backoff
Review and test error scenarios regularly
Use dead letter queues for failed function invocations
Monitor external service dependencies and their reliability
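
As an illustration of retry logic with exponential backoff, here is a minimal Python sketch. It is one common pattern (exponential backoff with full jitter), not a method prescribed by Trusted Advisor or CloudZero. The function name, default delays, and attempt count are assumptions chosen for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(operation: Callable[[], T],
                      max_attempts: int = 4,
                      base_delay: float = 0.2,
                      max_delay: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation, retrying failures with exponential backoff and full jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            # The delay cap doubles each attempt: base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the function can be tested without waiting. In production code, catch only the exception types that are known to be transient, and keep the total retry time below the function's configured timeout.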

Fix Lambda Function with Excessive Timeouts

This recommendation identifies AWS Lambda functions that are experiencing excessive timeouts, which can impact performance, reliability, and costs.

How it works

AWS Trusted Advisor monitors your Lambda functions and identifies those with excessive timeout occurrences. Functions that frequently timeout can indicate performance issues, inefficient code, or inappropriate timeout configurations that impact user experience and increase costs.

What CloudZero identifies

Lambda functions with high timeout rates
Functions that need timeout configuration adjustments
Performance optimization opportunities
Code efficiency improvements
Cost optimization recommendations from AWS Trusted Advisor

How it works

Uses AWS Trusted Advisor's L4dfs2Q3C3 check for Lambda function timeout analysis
Leverages Trusted Advisor's performance metrics and recommendations
Provides dynamic titles with specific actions
Covers all Lambda functions across all regions
Function-level recommendations for targeted optimization

Cost and Performance Impact

Lambda functions with excessive timeouts can result in:

Increased execution costs due to longer running times
Poor user experience from slow response times
Potential cascading failures in dependent systems
Higher error rates and reduced reliability
Missed opportunities for performance optimization

Timeout Optimization Strategies

Timeout configuration: Adjust function timeout settings appropriately
Code optimization: Improve function efficiency and reduce execution time
Resource allocation: Increase memory allocation for better performance
Async processing: Use asynchronous patterns for long-running operations
External service optimization: Optimize calls to external services
Caching strategies: Implement caching to reduce redundant operations

Common Timeout Causes

External API calls: Slow or unresponsive external services
Database queries: Inefficient or slow database operations
File processing: Large file operations without streaming
Memory constraints: Insufficient memory allocation
Cold starts: Initialization delays for complex functions
Network latency: Slow network connections to external resources

Performance Optimization Recommendations

Monitor execution times: Track function performance metrics
Optimize code: Refactor inefficient algorithms and operations
Use appropriate timeouts: Set realistic timeout values based on actual execution times
Implement retry logic: Handle transient failures gracefully
Consider async patterns: Use asynchronous processing for long operations
Optimize dependencies: Minimize and optimize external service calls

Best Practices

Set timeout values based on actual execution time plus buffer
Implement proper error handling and retry mechanisms
Use CloudWatch metrics to monitor function performance
Consider breaking large functions into smaller, focused functions
Implement caching strategies for frequently accessed data
Monitor and optimize external service dependencies
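
The first practice above (timeout based on actual execution time plus a buffer) can be sketched in Python as follows. The function name, the 99th percentile, and the 50% buffer are assumptions for illustration; the 900-second cap reflects Lambda's maximum configurable timeout. Durations would typically come from the function's CloudWatch Duration metric or logs.

```python
import math

LAMBDA_MAX_TIMEOUT_SECONDS = 900  # Lambda's maximum configurable timeout


def suggest_timeout_seconds(durations_ms: list[float],
                            percentile: float = 0.99,
                            buffer_ratio: float = 0.5) -> int:
    """Suggest a timeout: a high percentile of observed durations plus a buffer."""
    if not durations_ms:
        raise ValueError("need at least one observed duration")
    ordered = sorted(durations_ms)
    # Nearest-rank percentile: the value at position ceil(p * n), 1-indexed.
    rank = max(1, math.ceil(percentile * len(ordered)))
    observed_ms = ordered[rank - 1]
    seconds = math.ceil(observed_ms * (1 + buffer_ratio) / 1000)
    # Keep the result within Lambda's allowed range of 1 to 900 seconds.
    return min(max(seconds, 1), LAMBDA_MAX_TIMEOUT_SECONDS)


print(suggest_timeout_seconds([1000, 2000, 3000, 4000], percentile=1.0))  # 6
```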

Migrate EMR Serverless to ARM (Graviton)

This recommendation identifies AWS accounts running EMR Serverless workloads on x86 (Intel/AMD) architecture that could achieve significant cost savings by migrating to ARM-based Graviton processors. AWS Graviton processors offer up to 20% cost savings with equivalent or better performance for most EMR Serverless workloads.

What CloudZero identifies

EMR Serverless applications running on x86 architecture
Accounts with any non-ARM EMR Serverless usage
Potential savings from migrating to ARM Graviton instances
Both fully x86 deployments and mixed x86/ARM environments

Cost impact

The recommendation calculates potential savings based on:

20% cost reduction from migrating x86 workloads to ARM Graviton
Current monthly x86 EMR Serverless spend
No performance degradation expected (often performance improves)

Example Scenario

Metric	Value
Monthly x86 EMR Serverless cost	$100,000
ARM migration savings (20%)	$20,000/month
Annual savings	$240,000

Additional details

Immediate Cost Savings: 20% reduction in compute costs with minimal effort
No Performance Trade-off: Graviton processors often provide better performance
Simple Migration: Usually just requires changing instance configuration
Growing Support: Most Spark libraries and frameworks support ARM
Environmental Impact: Graviton processors are more energy efficient

How to address this

Step 1: Verify Application Compatibility

Most EMR Serverless workloads are compatible with ARM.

Check for any architecture-specific dependencies:

Review custom libraries and packages
Verify third-party integrations support ARM
Test in development environment first
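
One way to start the dependency review is to scan the file names of packaged artifacts (for example, Python wheels shipped with a job) for x86-only platform tags. The Python sketch below is a heuristic under stated assumptions: it only inspects file names, treats the text before the first hyphen as the package name, and uses hypothetical file names in the example. It does not replace testing the job on ARM.

```python
X86_MARKERS = ("x86_64", "amd64")
ARM_MARKERS = ("aarch64", "arm64")


def x86_only_artifacts(filenames: list[str]) -> list[str]:
    """Return artifacts tagged for x86 whose package has no ARM build in the list."""
    def package(name: str) -> str:
        return name.split("-")[0].lower()

    # Packages that ship at least one ARM-tagged artifact.
    arm_packages = {package(f) for f in filenames
                    if any(marker in f.lower() for marker in ARM_MARKERS)}
    return [f for f in filenames
            if any(marker in f.lower() for marker in X86_MARKERS)
            and package(f) not in arm_packages]


artifacts = [
    "numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.whl",
    "numpy-1.26.4-cp311-cp311-manylinux_2_17_aarch64.whl",
    "legacylib-0.3-cp311-cp311-linux_x86_64.whl",  # hypothetical package
    "requests-2.31.0-py3-none-any.whl",
]
print(x86_only_artifacts(artifacts))
```

In this example only the hypothetical `legacylib` wheel is reported, because it has an x86 tag and no ARM counterpart. Pure-Python artifacts (tagged `any`) are not flagged.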

Step 2: Update EMR Serverless Application Configuration

Via AWS Console:

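Navigate to EMR Studio → Applications
Select your application and stop it if it is running (an application can only be updated in the created or stopped state)
Edit application settings
Under "Architecture", select arm64 (Graviton)
Save and start the application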

Via AWS CLI:

aws emr-serverless update-application \
    --application-id <application-id> \
    --architecture ARM64

Via Terraform:

resource "aws_emrserverless_application" "example" {
  name          = "my-application"
  release_label = "emr-6.10.0"
  type          = "Spark"

  architecture = "ARM64"  # Change from "X86_64"
}

Step 3: Monitor and Validate

Monitor job execution times (should be equal or better)
Verify cost reduction in billing (appears within 24-48 hours)
Check application logs for any architecture-related issues

Additional details

Graviton Benefits

Cost: 20% cheaper than comparable x86 instances
Performance: Up to 40% better price-performance
Memory: Same memory-to-vCPU ratios available
Compatibility: Supports Spark 3.x, Python 3.7+, Java 8+

Known Limitations

Some legacy libraries do not support ARM (rare in modern Spark)
Custom native code requires recompilation
Third-party connectors should be verified

Best Practices

Start with non-production workloads
Run parallel tests comparing x86 vs ARM performance
Migrate incrementally (application by application)
Update documentation to default to ARM for new applications

Older Generation Instances Detected

AWS regularly releases new generations of Amazon EC2, RDS, and ElastiCache instances, and newer generations typically deliver better performance at a lower price. Periodically check your environments for older generation instances that can be upgraded to the current generation. Upgrading generally saves up to 15% of the cost of the older generation instances.

Threshold: This recommendation is created when the total real cost of the identified older generation Amazon EC2, RDS, and ElastiCache instances is at least $500. It is marked as Addressed once that spend falls below $500.

Databases

Amazon Aurora Delete Clusters

This recommendation identifies Aurora clusters that should be deleted to reduce costs.

How it works

Identifies Aurora clusters that are candidates for deletion
Provides estimated cost savings from deleting unused clusters
Uses AWS Trusted Advisor recommendations to identify optimal deletion targets

How to address this

Delete Aurora clusters that are no longer needed
Create final snapshots before deletion if data needs to be preserved
Ensure no applications are dependent on the clusters
Review Aurora Serverless v1 clusters that are candidates for deletion
Consider the impact on read replicas and other dependent resources

Amazon Aurora Migrate to Graviton

This recommendation identifies Aurora clusters that can be migrated to Graviton-based instances for cost optimization.

How it works

Identifies Aurora clusters that are candidates for migration to Graviton processors
Provides estimated cost savings from the migration
Uses AWS Trusted Advisor recommendations to identify optimal migration targets

How to address this

Migrate eligible Aurora clusters to Graviton-based instance types
Review application compatibility with ARM-based processors
Test database performance and functionality after migration
Plan for maintenance windows during migration
Consider the performance benefits and cost savings of Graviton instances
Evaluate Aurora Serverless v2 with Graviton for variable workloads

Amazon Aurora Rightsize Clusters

This recommendation identifies Aurora clusters that should be rightsized to optimize cost and performance.

How it works

Identifies Aurora clusters that are over-provisioned or under-provisioned
Provides estimated cost savings from rightsizing clusters
Uses AWS Trusted Advisor recommendations to identify optimal rightsizing targets

How to address this

Rightsize Aurora clusters to match actual resource utilization
Review CPU, memory, and I/O utilization metrics
Consider performance requirements when rightsizing
Test application performance after rightsizing to ensure requirements are met
Monitor Aurora cluster performance metrics during and after rightsizing
Consider Aurora Serverless v2 for variable workloads that benefit from auto-scaling

Amazon Aurora Upgrade Clusters

This recommendation identifies Aurora clusters that should be upgraded to newer generation types for cost optimization.

How it works

Identifies Aurora clusters that are candidates for upgrading to newer generation types
Provides estimated cost savings from upgrading to more efficient cluster types
Uses AWS Trusted Advisor recommendations to identify optimal upgrade targets

How to address this

Upgrade Aurora clusters to newer generation types for better price-performance
Review application compatibility with newer cluster types
Plan for maintenance windows during upgrades
Test database performance and functionality after upgrade
Consider Aurora Serverless v2 for variable workloads
Evaluate the benefits of upgrading to newer Aurora engine versions

Amazon DynamoDB Reserved Capacity Purchase Recommendations

This recommendation identifies Amazon DynamoDB reserved capacity purchase opportunities based on AWS Trusted Advisor recommendations.

How it works

AWS Trusted Advisor analyzes your DynamoDB usage patterns to provide reserved capacity purchase recommendations. This recommendation surfaces those opportunities to help you identify potential savings through committed usage discounts for DynamoDB read and write capacity units.

What CloudZero identifies

Reserved capacity purchase opportunities for DynamoDB tables
Recommended reserved capacity amounts for read and write units
Estimated monthly savings from purchasing reserved capacity
Table-specific purchase recommendations
Cost optimization opportunities from AWS Cost Optimization Hub

How it works

Uses AWS Trusted Advisor's c1z7kmr15n check for DynamoDB reserved capacity recommendations
Leverages Trusted Advisor's estimated savings calculations
Provides dynamic titles with specific recommendations
Covers DynamoDB read and write capacity units
Table-level recommendations for targeted optimization

Cost impact

The recommendation calculates potential savings based on Trusted Advisor's estimates for DynamoDB reserved capacity purchases, including recommended capacity amounts, terms, and expected monthly savings compared to on-demand pricing.

Reserved Capacity Benefits

Up to 70% savings compared to on-demand pricing
Predictable billing for consistent workloads
No upfront payment required (No Upfront option available)
Flexible terms (1 or 3 years)
Automatic application to matching tables

Amazon ElastiCache Reserved Node Purchase Recommendations

This recommendation identifies ElastiCache Reserved Node purchase opportunities based on AWS Trusted Advisor recommendations.

How it works

AWS Trusted Advisor analyzes your ElastiCache usage patterns and recommends Reserved Node purchases that can reduce your caching costs. This recommendation surfaces those recommendations to help you optimize your Reserved Node portfolio for consistent workloads.

How to address this

Purchase Reserved Nodes for ElastiCache clusters with consistent usage patterns
Consider 1-year or 3-year term options based on workload stability
Evaluate payment options (All Upfront, Partial Upfront, No Upfront)
Review node types and sizes to ensure optimal capacity planning

Cost impact

Reserved Nodes can provide significant savings compared to On-Demand pricing for consistent workloads. The cost impact represents the potential monthly savings from implementing the recommended Reserved Node purchases.

Amazon MemoryDB Reserved Node Purchase Recommendations

This recommendation identifies MemoryDB Reserved Node purchase opportunities based on AWS Trusted Advisor recommendations.

How it works

AWS Trusted Advisor analyzes your MemoryDB usage patterns and recommends Reserved Node purchases that can reduce your in-memory database costs. This recommendation surfaces those recommendations to help you optimize your Reserved Node portfolio for consistent workloads.

How to address this

Purchase Reserved Nodes for MemoryDB clusters with consistent usage patterns
Consider 1-year or 3-year term options based on workload stability
Evaluate payment options (All Upfront, Partial Upfront, No Upfront)
Review node types and sizes to ensure optimal capacity planning

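Cost impact

Reserved Nodes can provide significant savings compared to On-Demand pricing for consistent workloads. The cost impact represents the potential monthly savings from implementing the recommended Reserved Node purchases.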

Amazon OpenSearch Service Reserved Instance Purchase Recommendations

This recommendation identifies OpenSearch Service Reserved Instance purchase opportunities based on AWS Trusted Advisor recommendations.

How it works

AWS Trusted Advisor analyzes your Amazon OpenSearch Service usage patterns and recommends Reserved Instance purchases that can reduce your search and analytics costs. This recommendation surfaces those recommendations to help you optimize your Reserved Instance portfolio for consistent workloads.

How to address this

Purchase Reserved Instances for OpenSearch domains with consistent usage patterns
Consider 1-year or 3-year term options based on workload stability
Evaluate payment options (All Upfront, Partial Upfront, No Upfront)
Review instance types and sizes to ensure optimal capacity planning
Consider Reserved Instance purchases for both data and master nodes

Cost impact

Reserved Instances can provide significant savings compared to On-Demand pricing for consistent workloads. The cost impact represents the potential monthly savings from implementing the recommended Reserved Instance purchases.

Amazon RDS Delete Instances

This recommendation identifies RDS instances that should be deleted to reduce costs.

How it works

Identifies RDS instances that are candidates for deletion
Provides estimated cost savings from deleting unused instances
Uses AWS Trusted Advisor recommendations to identify optimal deletion targets

How to address this

Delete RDS instances that are no longer needed
Create final snapshots before deletion if data needs to be preserved
Ensure no applications are dependent on the instances
Review read replicas and other dependent resources before deletion

Amazon RDS Migrate to Graviton

This recommendation identifies RDS instances that can be migrated to Graviton-based instances for cost optimization.

How it works

Identifies RDS instances that are candidates for migration to Graviton processors
Provides estimated cost savings from the migration
Uses AWS Trusted Advisor recommendations to identify optimal migration targets

How to address this

Migrate eligible RDS instances to Graviton-based instance types
Review application compatibility with ARM-based processors
Test database performance and functionality after migration
Plan for maintenance windows during migration
Consider the performance benefits and cost savings of Graviton instances

Amazon RDS Reserved Instance Purchase Recommendations

This recommendation identifies Amazon RDS Reserved Instance purchase opportunities based on AWS Trusted Advisor recommendations.

How it works

AWS Trusted Advisor analyzes your RDS usage patterns to provide Reserved Instance purchase recommendations. This recommendation surfaces those opportunities to help you identify potential savings through committed usage discounts for RDS database instances.

What CloudZero identifies

Reserved Instance purchase opportunities for RDS database instances
Recommended Reserved Instance configurations (instance type, size, engine)
Estimated monthly savings from purchasing Reserved Instances
Database-specific purchase recommendations
Cost optimization opportunities from AWS Cost Optimization Hub

How it works

Uses AWS Trusted Advisor's c1z7kmr11n check for RDS Reserved Instance recommendations
Leverages Trusted Advisor's estimated savings calculations
Provides dynamic titles with specific recommendations and savings percentages
Covers all RDS database engines (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server)
Database-level recommendations for targeted optimization

Cost impact

The recommendation calculates potential savings based on Trusted Advisor's estimates for RDS Reserved Instance purchases, including:

Up to 69% savings compared to on-demand pricing
Recommended instance types and sizes
Expected monthly savings from Reserved Instance commitments
Database engine-specific optimization opportunities

Reserved Instance Benefits

Significant cost savings: Up to 69% compared to on-demand pricing
Predictable billing: Fixed monthly costs for database instances
No upfront payment: No Upfront option available for flexibility
Flexible terms: 1 or 3-year commitment options
Engine coverage: Available for all major RDS database engines
Multi-AZ support: Reserved Instances work with Multi-AZ deployments

Database Engines Supported

Amazon Aurora (MySQL and PostgreSQL compatible)
MySQL
PostgreSQL
MariaDB
Oracle
SQL Server

How to address this

Review RDS usage patterns to identify consistent workloads
Consider Reserved Instances for production databases with steady usage
Evaluate different Reserved Instance terms (1 vs 3 years)
Plan for database growth when selecting Reserved Instance sizes
Monitor Reserved Instance coverage to maximize savings

Amazon RDS Rightsize Instances

This recommendation identifies RDS instances that should be rightsized to optimize cost and performance.

How it works

Identifies RDS instances that are over-provisioned or under-provisioned
Provides estimated cost savings from rightsizing instances
Uses AWS Trusted Advisor recommendations to identify optimal rightsizing targets

How to address this

Rightsize RDS instances to match actual resource utilization
Review CPU, memory, and I/O utilization metrics
Consider performance requirements when rightsizing
Test application performance after rightsizing to ensure requirements are met
Monitor database performance metrics during and after rightsizing

Amazon RDS Storage Delete Recommendations

This recommendation identifies Amazon RDS database instances with storage that can be deleted to reduce costs, typically for unused or redundant storage.

What CloudZero identifies

RDS database instances with unused storage allocations
Redundant storage that can be safely removed
Opportunities to eliminate unnecessary storage costs

How to address this

Review storage utilization and identify unused allocations
Delete unused or redundant storage
Clean up orphaned storage resources

Cost impact

Eliminates ongoing storage costs for unused capacity
Provides immediate cost savings
Reduces overall RDS storage expenses

Implementation Effort

Medium - Requires careful verification that storage is truly unused and safe to delete.

Additional details

Ensure storage is truly unused before deletion
Verify no dependencies exist before removing storage
Consider backup requirements before deletion
Test deletion process in non-production environments first

Amazon RDS Storage Rightsize Recommendations

This recommendation identifies Amazon RDS database instances with storage that can be rightsized to reduce costs while maintaining performance.

What CloudZero identifies

RDS database instances with over-provisioned storage
Opportunities to reduce storage allocation to match actual usage
Potential cost savings from rightsizing storage

How to address this

Review current storage utilization patterns
Rightsize storage allocation to match actual usage
Monitor performance after rightsizing to ensure no impact

Cost impact

Reduces storage costs by eliminating over-provisioned capacity
Maintains database performance while optimizing costs
Provides immediate cost savings on storage charges

Implementation Effort

Medium - Requires careful analysis of usage patterns and testing in non-production environments first.

Additional details

Always test storage changes in non-production environments first
Monitor database performance after rightsizing
Consider future growth when determining new storage allocation

Amazon RDS Storage Upgrade Recommendations

This recommendation identifies Amazon RDS database instances where storage can be upgraded to more cost-effective options or better performance tiers.

What CloudZero identifies

RDS database instances using outdated storage types
Opportunities to upgrade to more cost-effective storage options
Storage that can benefit from performance improvements

How to address this

Review current storage type and performance requirements
Upgrade to more cost-effective storage options
Consider performance improvements available with newer storage types

Cost impact

Reduces storage costs through more efficient storage types
May improve performance while reducing costs
Provides long-term cost optimization benefits

Implementation Effort

Medium - Requires planning for storage migration and potential downtime.

Additional details

Plan for potential downtime during storage upgrades
Test upgrade process in non-production environments first
Consider performance impact of storage changes
Verify compatibility with current database configuration

Amazon RDS Upgrade Instances

This recommendation identifies RDS instances that should be upgraded to newer generation types for cost optimization.

How it works

Identifies RDS instances that are candidates for upgrading to newer generation types
Provides estimated cost savings from upgrading to more efficient instance types
Uses AWS Trusted Advisor recommendations to identify optimal upgrade targets

How to address this

Upgrade RDS instances to newer generation types for better price-performance
Review application compatibility with newer instance types
Plan for maintenance windows during upgrades
Test performance and functionality after upgrade
Consider Reserved Instances for upgraded instances to maximize savings

Amazon Redshift Reserved Node Purchase Recommendations

This recommendation identifies Redshift Reserved Node purchase opportunities based on AWS Trusted Advisor recommendations.

How it works

AWS Trusted Advisor analyzes your Amazon Redshift usage patterns and recommends Reserved Node purchases that can reduce your data warehouse costs. This recommendation surfaces those recommendations to help you optimize your Reserved Node portfolio for consistent workloads.

How to address this

Purchase Reserved Nodes for Redshift clusters with consistent usage patterns
Consider 1-year or 3-year term options based on workload stability
Evaluate payment options (All Upfront, Partial Upfront, No Upfront)
Review node types and sizes to ensure optimal capacity planning
Consider Reserved Node purchases for both leader and compute nodes

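Cost impact

Reserved Nodes can provide significant savings compared to On-Demand pricing for consistent workloads. The cost impact represents the potential monthly savings from implementing the recommended Reserved Node purchases.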

Delete Inactive DynamoDB Tables

This recommendation identifies DynamoDB tables that are incurring storage costs but show no usage activity. Inactive tables continue to accrue charges based on data size (per GB-month) even when nothing reads from or writes to them, so you pay for storage without gaining any value. These tables are often remnants of:

Completed projects or migrations
Testing and development environments
Deprecated features or services
Data that should have been archived or deleted

Threshold: This recommendation is created if a DynamoDB table has had no read or write activity in the past 30 days.
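
The threshold can be expressed as a small check over daily activity totals. This Python sketch is illustrative only: the function name and input shape are hypothetical, and it assumes you have daily sums of consumed read and write capacity for the table (for example, from CloudWatch). It is not CloudZero's detection logic.

```python
def is_inactive_table(daily_reads: list[float],
                      daily_writes: list[float],
                      window_days: int = 30) -> bool:
    """True when the most recent window_days show no consumed reads or writes.

    Each list holds one daily total, oldest first.
    """
    if len(daily_reads) < window_days or len(daily_writes) < window_days:
        return False  # Not enough history to decide.
    recent = daily_reads[-window_days:] + daily_writes[-window_days:]
    return all(value == 0 for value in recent)


print(is_inactive_table([0.0] * 30, [0.0] * 30))          # True
print(is_inactive_table([0.0] * 29 + [12.0], [0.0] * 30))  # False: a read occurred
```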

Recommended action: Investigate ownership and verify if the table is truly unused. If so, delete the table to eliminate ongoing storage costs. If there is any chance the table is needed later, export the data to S3.

Excessive RDS Backup Retention

CloudZero has identified Amazon RDS backups and manual snapshots retained beyond 90 days, potentially exceeding business or compliance requirements. Long-term retention of RDS snapshots accumulates significant costs over time.

How it works

This recommendation identifies RDS backups and snapshots that are:

Older than 90 days
Incurring ongoing storage costs
Potentially exceeding necessary retention requirements

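RDS automated backups are kept only for their configured retention period (up to 35 days) and are then cleaned up automatically. This recommendation therefore focuses on manual snapshots, which are retained indefinitely unless explicitly deleted.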

Additional details

Cost Accumulation: Storage costs add up as snapshots accumulate ($0.095/GB-month)
Unmanaged Snapshots: Manual snapshots created for one-time purposes often remain in place indefinitely
Example: 50 old 500GB snapshots = ~$28,500/year in unnecessary costs
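
The example figure follows from multiplying snapshot count, size, and the per-GB price. A short Python sketch of that arithmetic, with a hypothetical function name and the $0.095/GB-month price quoted above as the default:

```python
def snapshot_storage_cost(count: int, size_gb: float,
                          price_per_gb_month: float = 0.095) -> tuple[float, float]:
    """Return (monthly, annual) storage cost in USD for equally sized snapshots."""
    monthly = count * size_gb * price_per_gb_month
    return monthly, monthly * 12


monthly, annual = snapshot_storage_cost(50, 500)
print(f"${monthly:,.0f}/month, ${annual:,.0f}/year")  # $2,375/month, $28,500/year
```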

How to address this

Review Snapshot Inventory:
- Navigate to RDS → Snapshots → Filter "Manual snapshots" → Sort by date
- Identify purpose of each snapshot (testing, compliance, migration, etc.)
- Determine which are still needed
```
aws rds describe-db-snapshots --snapshot-type manual \
  --query 'DBSnapshots[?SnapshotCreateTime<=`2023-01-01`].[DBSnapshotIdentifier,SnapshotCreateTime]'
```
Establish Retention Policy:
- Daily backups: 7-30 days
- Weekly backups: 4-12 weeks
- Monthly backups: 12 months
- Yearly backups: 7 years (compliance only)
- Document and communicate policy
Delete Unnecessary Snapshots:

Important: Deletion is permanent - always verify first
```
aws rds delete-db-snapshot --db-snapshot-identifier mydb-snapshot-2023-01-15
```
Implement Automated Lifecycle:
- Use AWS Lambda to auto-delete based on tags and age
- Tag snapshots: Purpose, RetentionDays, Retain
- Use AWS Backup for centralized lifecycle management
Consider Alternative Storage:
- Export to S3 Glacier Deep Archive (~$0.00099/GB-month, 99% cheaper)
- Use AWS Backup archive tier
- Keep only "hot" backups in RDS format
Monitor and Alert:
- Set up Cost Anomaly Detection for RDS backup storage
- Create CloudWatch dashboards for snapshot age and costs
- Alert on snapshots exceeding retention policy
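
The tag-based lifecycle idea above can be sketched as a selection function that an automated job (such as a scheduled Lambda function) might call before deleting anything. This is a minimal Python sketch under assumptions: the snapshot dictionary shape is hypothetical, the `RetentionDays` and `Retain` tags follow the tagging scheme listed above, and the 90-day default mirrors this recommendation's threshold. It only selects candidates and performs no deletion.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_delete(snapshots: list[dict], now: datetime,
                        default_retention_days: int = 90) -> list[str]:
    """Return IDs of snapshots past their retention period and not tagged Retain=true."""
    expired = []
    for snap in snapshots:
        tags = snap.get("tags", {})
        if tags.get("Retain", "").lower() == "true":
            continue  # Explicitly protected, for example compliance snapshots.
        retention = int(tags.get("RetentionDays", default_retention_days))
        if now - snap["created"] > timedelta(days=retention):
            expired.append(snap["id"])
    return expired


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
inventory = [
    {"id": "old-test", "created": now - timedelta(days=200), "tags": {}},
    {"id": "audit", "created": now - timedelta(days=200), "tags": {"Retain": "true"}},
    {"id": "recent", "created": now - timedelta(days=40), "tags": {}},
]
print(snapshots_to_delete(inventory, now))  # ['old-test']
```

Reviewing the returned list with database owners before deleting keeps the process consistent with the warning that deletion is permanent.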

Cost impact

Conservative estimate: 50% reduction (assumes some retention needed)
Unnecessary snapshots: 100% recoverable
Pricing: ~$0.095/GB-month (standard), ~$0.021/GB-month (Aurora excess)

Important Considerations

Retention Best Practices

Keep:

Compliance-required snapshots
Recent backups (30-90 days) for disaster recovery
Pre-migration/upgrade snapshots (until validated)

Consider Deleting:

Ad-hoc test snapshots
Post-deployment snapshots (after validation)
Duplicate snapshots
Decommissioned database snapshots

Operational Risks

Deletion is permanent (no undelete)
Verify with database owners before deletion
Export to S3 if uncertain
Test that remaining snapshots are restorable

AWS Backup Alternative

Use AWS Backup for automated lifecycle management:

Centralized management across services
Automated retention and expiration
Compliance reporting
Cold storage transitions

RDS Clusters Incurring Extended Support Charges

CloudZero has identified Amazon RDS database instances and clusters that are running on outdated engine versions and incurring AWS extended support charges. These charges apply when you continue running RDS database engines beyond their standard support end date.

AWS extended support fees can add significant costs to your RDS spending, often 50-100% more than the base instance cost for older engine versions. By upgrading to a supported engine version, you can eliminate these charges entirely while also benefiting from security patches, bug fixes, and performance improvements.

How it works

This recommendation identifies RDS resources that are:

Running database engine versions that are past their standard support period
Incurring AWS extended support charges (typically identified by "ExtendedSupport" in usage types or line item descriptions)
Eligible for upgrade to newer, supported engine versions without extended support fees
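
The matching rule above can be sketched as a filter over billing line items. This is only an illustration under stated assumptions: the field names (`resource_id`, `usage_type`, `description`, `cost`) are hypothetical and do not reflect CloudZero's internal schema or the exact layout of an AWS billing export.

```python
from collections import defaultdict


def extended_support_cost_by_resource(line_items):
    """Sum the cost of line items that mention extended support, per resource."""
    totals = defaultdict(float)
    for item in line_items:
        text = f"{item.get('usage_type', '')} {item.get('description', '')}"
        # Strip spaces so "Extended Support" and "ExtendedSupport" both match.
        if "extendedsupport" in text.replace(" ", "").lower():
            totals[item["resource_id"]] += item["cost"]
    return dict(totals)
```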

Common database engines affected include:

MySQL 5.7 and earlier versions
PostgreSQL 11 and earlier versions
MariaDB 10.3 and earlier versions
Oracle database versions past their support dates
SQL Server versions past their support dates

Additional details

Cost Savings: Extended support charges can double your RDS costs for affected instances. Upgrading eliminates these fees completely
Security: Newer engine versions receive active security patches and vulnerability fixes
Performance: Modern database versions include performance optimizations and new features
Compliance: Many compliance frameworks require running currently supported software versions
Future-Proofing: Avoiding technical debt by staying on supported versions

How to address this

Review Affected Resources: Identify all RDS instances/clusters incurring extended support charges and their current engine versions
Plan Upgrades: For each affected resource:
- Check AWS documentation for the upgrade path to the latest supported version
- Review application compatibility with newer database versions
- Identify any deprecated features your application uses
- Plan maintenance windows for the upgrade
Test in Non-Production: Before upgrading production databases:
- Restore a snapshot to a test environment
- Upgrade the test instance to the target version
- Run application regression tests
- Verify query performance and compatibility
- Test backup and restore procedures
Perform Upgrades: Execute the upgrade during scheduled maintenance windows:
- For minor version upgrades: can often be done with minimal downtime
- For major version upgrades: requires more planning and testing
- Use RDS Blue/Green deployments to minimize downtime during upgrades when available (switchover still involves a brief interruption)
- Take a manual snapshot before upgrading as a safety measure
Monitor Post-Upgrade: After upgrading:
- Verify application connectivity and functionality
- Monitor database performance metrics
- Check for any application errors or warnings
- Confirm extended support charges stop appearing in billing
Establish Upgrade Cadence: Prevent future extended support charges:
- Track RDS engine version end-of-support dates
- Schedule regular database upgrades before support ends
- Test new versions early in non-production environments
- Keep documentation of version-specific application requirements

Additional details

Upgrade Paths: Some major version upgrades require intermediate steps (e.g., MySQL 5.7 → 8.0 requires upgrading to 5.7.latest first)
Downtime: Plan for maintenance windows; consider using read replicas for minimal downtime migrations
Parameter Groups: Review and update parameter groups to ensure compatibility with new versions
Application Changes: Some applications need code changes for newer database versions
Backup Strategy: Always take manual snapshots before major version upgrades
Blue/Green Deployments: Use RDS Blue/Green deployments for safer, low-downtime upgrades when available

For detailed upgrade procedures, consult the AWS RDS documentation for your specific database engine.

RDS Snapshot Costs Are Higher Than Expected

This recommendation is created when RDS snapshot costs exceed 10% of total RDS costs. In a typical organization, snapshots account for 1% to 5% of the total cost of the RDS service. Snapshot costs above that range indicate an excessive number of snapshots. This is often caused by missing or inadequate snapshot retention rules that leave a large number of automated snapshots in place, or by manual snapshots that the retention rules do not manage. To save money, tighten your snapshot retention rules and clean up any unnecessary snapshots.

Threshold: This recommendation is created if the real cost of the identified snapshots exceeds 10% of the real cost of the entire RDS service and is at least $500. When RDS snapshot spend falls below 10% of total RDS cost, the recommendation is closed automatically.
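
The threshold rule can be expressed as a small check. This is a sketch of the rule as described here, not CloudZero's actual code; the function and parameter names are hypothetical.

```python
def snapshot_recommendation_open(snapshot_cost, total_rds_cost, ratio=0.10, floor=500.0):
    """Return True when snapshot spend exceeds the ratio and meets the dollar floor."""
    if total_rds_cost <= 0:
        return False
    return snapshot_cost > ratio * total_rds_cost and snapshot_cost >= floor
```

For example, $600 of snapshot spend against $5,000 of total RDS spend (12%) opens the recommendation, while the same $600 against $10,000 (6%) does not.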

Underutilized Amazon Redshift Clusters

This recommendation identifies Amazon Redshift clusters that are underutilized and could benefit from optimization, based on AWS Trusted Advisor recommendations.

How it works

AWS Trusted Advisor analyzes your Redshift cluster usage patterns and identifies clusters that are not being fully utilized. This surfaces recommendations to help you optimize your data warehouse resources and reduce costs for underutilized infrastructure.

What CloudZero identifies

Redshift clusters with low CPU utilization
Clusters with minimal query activity
Underutilized storage and compute resources
Clusters that are candidates for downsizing or deletion

How it works

Uses AWS Trusted Advisor's G31sQ1E9U check for underutilized Redshift clusters
Leverages Trusted Advisor's estimated savings calculations
Provides dynamic titles with specific recommended actions

Cost impact

The recommendation calculates potential monthly savings from optimizing underutilized Redshift clusters, helping you eliminate costs for unused or underutilized data warehouse capacity.

Upgrade Elasticsearch to Avoid Extended Support Charges

How it works

AWS charges additional Extended Support fees for Elasticsearch domains running end-of-life (EOL) versions. These charges can add 50-100% to your regular Elasticsearch costs and increase over time.

Additional details

High cost: Extended support can double your Elasticsearch bill
Escalating fees: Charges increase the longer you stay on old versions
Security risk: EOL versions no longer receive security patches
Performance: Newer versions offer better performance and features

Common EOL Versions with Extended Support

Elasticsearch 6.x (all versions)
Elasticsearch 7.0 - 7.9 (early 7.x versions)

Recommendation

Upgrade to the latest supported Elasticsearch version (7.10 or later, or migrate to OpenSearch).

Implementation Steps

Option 1: In-Place Upgrade (Recommended)

Review compatibility: Check application compatibility with target version
Backup domain: Create manual snapshot before upgrade
Upgrade domain: Use AWS Console or CLI to perform rolling upgrade
Test thoroughly: Validate all queries and integrations work
Monitor performance: Watch cluster health and query latency

Option 2: Blue/Green Deployment

Create new domain with target version
Reindex data from old domain
Update application endpoints
Monitor and validate
Delete old domain

Option 3: Migrate to OpenSearch

Consider migrating to OpenSearch (AWS's open-source fork) for long-term support and latest features.

Cost impact

Eliminates 100% of extended support charges immediately upon upgrade. For a typical domain, this can save $500-$5,000/month depending on cluster size.

Additional details

Latest security patches
Improved performance and efficiency
Access to new features
Better AWS support
Lower operational risk

Upgrade OpenSearch to Avoid Extended Support Charges

How it works

AWS charges additional Extended Support fees for OpenSearch domains running end-of-life (EOL) versions. These charges can add 50-100% to your regular OpenSearch costs and increase over time.

Additional details

High cost: Extended support can double your OpenSearch bill
Escalating fees: Charges increase the longer you stay on old versions
Security risk: EOL versions no longer receive security patches
Performance: Newer versions offer better performance and features

Common EOL Versions with Extended Support

OpenSearch 1.0 - 1.2 (early 1.x versions)
Legacy Elasticsearch 7.10 versions migrated to OpenSearch

Recommendation

Upgrade to the latest supported OpenSearch version (2.x or later).

Implementation Steps

Option 1: In-Place Upgrade (Recommended)

Review compatibility: Check application compatibility with target version
Backup domain: Create manual snapshot before upgrade
Upgrade domain: Use AWS Console or CLI to perform rolling upgrade
Test thoroughly: Validate all queries and integrations work
Monitor performance: Watch cluster health and query latency

Option 2: Blue/Green Deployment

Create new domain with target version
Reindex data from old domain
Update application endpoints
Monitor and validate
Delete old domain

Cost impact

Eliminates 100% of extended support charges immediately upon upgrade. For a typical domain, this can save $500-$5,000/month depending on cluster size.

Additional details

Latest security patches
Improved performance and efficiency
Access to new features (OpenSearch 2.x includes significant improvements)
Better AWS support
Lower operational risk

Version Compatibility

OpenSearch maintains strong backward compatibility:

Most applications work without code changes
Query syntax largely unchanged from Elasticsearch 7.10
Plugin compatibility improved in 2.x

Management Tools

CloudWatch Costs Higher Than Expected

The AWS CloudWatch service should be only a small part of your cloud bill. This recommendation detects increases in CloudWatch costs, which indicate that you may be using CloudWatch more extensively than necessary and can clean up unneeded CloudWatch log groups.

Threshold: This recommendation is created if the total spend for the identified CloudWatch log groups exceeds a sliding-scale threshold that depends on your total 30-day real cost for all AWS services, and is at least $500. The following table shows the sliding scale.

30 Day Spend for All AWS Services	30 Day Spend Threshold for CloudWatch Logs
< $10,000.00	$50.00
Between $10,000.00 and $50,000.00	$100.00
Between $50,000.00 and $100,000.00	$250.00
Between $100,000.00 and $500,000.00	$500.00
Between $500,000.00 and $2,500,000.00	$750.00
> $2,500,000.00	$1,000.00

When your CloudWatch cost falls below the threshold based on your AWS spend, or if it falls below $500, the recommendation is closed automatically.
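
The sliding scale can be sketched as a band lookup. This is an illustration only: the table does not say which band a spend exactly on a boundary belongs to, so the code assumes it falls into the higher band, and the function names are hypothetical.

```python
import bisect

# Upper bounds of each 30-day total-spend band, taken from the table above.
_UPPER_BOUNDS = [10_000, 50_000, 100_000, 500_000, 2_500_000]
# CloudWatch Logs thresholds for each band, in the same order.
_CLOUDWATCH_THRESHOLDS = [50, 100, 250, 500, 750, 1_000]
MINIMUM_SPEND = 500


def cloudwatch_threshold(total_aws_spend):
    """Look up the sliding-scale threshold for a given 30-day AWS spend."""
    # bisect_right places a spend equal to a boundary in the higher band (assumption).
    band = bisect.bisect_right(_UPPER_BOUNDS, total_aws_spend)
    return _CLOUDWATCH_THRESHOLDS[band]


def recommendation_open(cloudwatch_spend, total_aws_spend):
    """Apply both conditions: above the sliding-scale threshold and at least $500."""
    return (cloudwatch_spend > cloudwatch_threshold(total_aws_spend)
            and cloudwatch_spend >= MINIMUM_SPEND)
```

The same lookup would apply to the CloudTrail scale later in this section by swapping in that table's threshold values.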

Redundant CloudTrail Usage Detected

The AWS CloudTrail service typically does not cost anything unless you have more than one instance in an account. This recommendation detects whether you are being charged for CloudTrail events, which indicates that you have more than one instance in an account. You can clean up any redundant CloudTrail instances to eliminate unnecessary spend.

Threshold: This recommendation is created if the total spend for the identified CloudTrail events exceeds a sliding-scale threshold that depends on your total 30-day real cost for all AWS services, and the CloudTrail cost is at least $500. The following table shows the sliding scale.

30 Day Spend for All AWS Services	30 Day Spend Threshold for CloudTrail
< $10,000.00	$100.00
Between $10,000.00 and $50,000.00	$250.00
Between $50,000.00 and $100,000.00	$500.00
Between $100,000.00 and $500,000.00	$1,000.00
Between $500,000.00 and $2,500,000.00	$2,500.00
> $2,500,000.00	$5,000.00

When your CloudTrail cost falls below the threshold based on your AWS spend, or if it falls below $500, the recommendation is closed automatically.

Networking & Content Delivery

Delete Idle Load Balancer

This recommendation identifies AWS Classic Load Balancers (ELBs) that are idle and can be deleted to reduce costs.

How it works

This recommendation uses AWS Trusted Advisor data to identify Classic Load Balancers that have been idle for an extended period. These load balancers continue to incur charges even when not actively serving traffic.

Additional details

Cost Savings: Idle ELBs incur hourly charges even when not in use
Resource Cleanup: Helps maintain a clean AWS environment
Security: Reduces attack surface by removing unused resources

How to address this

Review: Verify the ELB is truly unused by checking application logs and monitoring
Backup: Document the ELB configuration before deletion
Delete: Remove the idle ELB through AWS Console or CLI
Monitor: Ensure no applications were depending on the deleted ELB

Delete Inactive AWS Network Firewall

This recommendation identifies AWS Network Firewalls that appear to be inactive and could be deleted to reduce costs.

How it works

AWS Trusted Advisor monitors your AWS Network Firewalls and identifies firewalls that have processed 0 bytes of data in the last 30 days. Network Firewalls incur significant hourly charges even when not actively processing traffic, making inactive firewalls a major source of unnecessary spending.

What CloudZero identifies

Network Firewalls with 0 bytes processed in the last 30 days
Unused firewalls that are still incurring hourly charges
Firewalls that were provisioned but never used or are no longer needed
Opportunities to eliminate unused network security infrastructure

How to address this

Delete Network Firewalls that have not processed any traffic in the last 30 days
Review your VPC security architecture to ensure firewalls are still required
Verify that the firewall is not being used for security inspection or filtering
Consider consolidating multiple firewalls if possible
Confirm with security and network teams before deletion to avoid creating security gaps

How it works

Uses AWS Trusted Advisor's c2vlfg0bfw check for inactive Network Firewalls
Identifies firewalls with zero data transfer over 30 days
Provides Network Firewall ARNs for easy identification
Focuses on reducing unnecessary hourly firewall charges

Cost impact

AWS Network Firewalls have substantial hourly charges that accumulate continuously. Each inactive Network Firewall represents significant ongoing waste that can be eliminated immediately. Deleting inactive Network Firewalls provides immediate and substantial cost savings.

Delete Inactive Gateway Load Balancer Endpoint

This recommendation identifies Gateway Load Balancer endpoints that appear to be inactive and could be deleted to reduce costs.

How it works

AWS Trusted Advisor monitors your Gateway Load Balancer (GWLB) endpoints and identifies endpoints that have processed 0 bytes of data in the last 30 days. Gateway Load Balancer endpoints incur hourly charges even when not actively processing traffic, making inactive endpoints a source of unnecessary spending.

What CloudZero identifies

Gateway Load Balancer endpoints with 0 bytes processed in the last 30 days
Unused GWLB endpoints that are still incurring hourly charges
Endpoints that were created for testing or temporary use
Opportunities to clean up unused network infrastructure

How to address this

Delete Gateway Load Balancer endpoints that have not been used in the last 30 days
Review your network architecture to ensure endpoints are still needed
Verify that security appliances or inspection services no longer require the endpoint
Confirm with application teams before deletion to avoid service disruption

How it works

Uses AWS Trusted Advisor's c2vlfg0k35 check for inactive GWLB endpoints
Identifies endpoints with zero data transfer over 30 days
Provides endpoint IDs and ARNs for easy identification
Focuses on reducing unnecessary hourly endpoint charges

Cost impact

Gateway Load Balancer endpoints incur hourly charges that accumulate over time. Deleting inactive endpoints eliminates ongoing hourly charges and helps maintain a clean, cost-effective network architecture. While individual endpoint costs are modest, multiple inactive endpoints can represent significant unnecessary spending.

Delete Inactive VPC Interface Endpoint

This recommendation identifies VPC interface endpoints that appear to be inactive and could be deleted to reduce costs.

How it works

AWS Trusted Advisor monitors your VPC interface endpoints and identifies endpoints that have processed 0 bytes of data in the last 30 days. VPC interface endpoints incur hourly charges and data processing costs even when not actively used, making inactive endpoints a source of unnecessary spending.

What CloudZero identifies

VPC interface endpoints with 0 bytes processed in the last 30 days
Unused PrivateLink connections that are still incurring hourly charges
Endpoints that were created for testing or temporary use
Opportunities to consolidate endpoints using centralized architectures

How to address this

Delete VPC interface endpoints that have not been used in the last 30 days
Review your architecture to ensure endpoints are still needed
Consider deploying VPC interface endpoints in a centralized architecture using Transit Gateway to reduce hourly charges on inactive endpoints
Verify that applications no longer require the endpoint before deletion

How it works

Uses AWS Trusted Advisor's c2vlfg0jp6 check for inactive VPC endpoints
Identifies endpoints with zero data transfer over 30 days
Provides endpoint IDs, VPC IDs, and subnet information for easy identification
Focuses on reducing unnecessary hourly endpoint charges

Cost impact

While individual VPC interface endpoints have modest hourly costs, these charges accumulate over time and across multiple endpoints. Deleting inactive endpoints eliminates ongoing hourly charges and helps maintain a clean, cost-effective network architecture.

Inefficient AWS NAT Gateway Detected

The AWS VPC service provides NAT Gateways so that resources in private subnets can access resources outside your VPC. When using NAT Gateways, you are charged per NAT Gateway-Hour (rounded up to the hour) and per GB Data Processed.

This recommendation detects NAT Gateways that have hourly charges without appreciable corresponding data processing charges. This indicates unused NAT Gateways that you can clean up.

Threshold: This recommendation is created if the total real cost of the identified NAT Gateways with low data processing charges is at least $500. It is marked as Addressed when that spend falls below $500.

Managed NAT Gateway with Excessive Data Transfer

CloudZero has identified AWS NAT Gateways where data transfer costs represent an unusually high percentage of total gateway costs. While NAT Gateways include both hourly charges and data processing fees, excessive data transfer costs often indicate opportunities to optimize network architecture and reduce unnecessary cross-AZ or internet-bound traffic.

How it works

This recommendation identifies NAT Gateways where:

Data transfer costs exceed 60% of total NAT Gateway costs

High data transfer ratios can indicate:

Unnecessary cross-Availability Zone traffic
Inefficient application architectures routing excessive traffic through NAT
Missing VPC endpoints for AWS services (S3, DynamoDB, etc.)
Applications that could benefit from VPC peering or PrivateLink
Workloads that would be better served by alternative connectivity solutions

Additional details

Cost Optimization: NAT Gateway data processing fees are expensive and can add up quickly with high-volume workloads
Architecture Efficiency: High data transfer often signals architectural issues that impact both cost and performance
Service Availability: Reducing NAT Gateway dependency can improve resilience and reduce single points of failure
Performance: Alternative solutions like VPC endpoints can provide lower latency and higher throughput

How to address this

Analyze Traffic Patterns:
- Use VPC Flow Logs to identify sources and destinations of NAT Gateway traffic
- Determine which applications or services are generating the most traffic
- Identify whether traffic is internet-bound or AWS service traffic
- Check for cross-AZ traffic that could be optimized
Implement VPC Endpoints for AWS Services:
- Create Gateway VPC Endpoints for S3 and DynamoDB (no additional cost)
- Deploy Interface VPC Endpoints for services like:
  - ECR (Elastic Container Registry)
  - ECS (Elastic Container Service)
  - Systems Manager
  - CloudWatch Logs
  - Secrets Manager
  - KMS
- VPC endpoints eliminate NAT Gateway traffic for these services entirely
Optimize Cross-AZ Traffic:
- Review application architectures that route traffic between Availability Zones through NAT
- Consider deploying NAT Gateways in each AZ to keep traffic local
- Evaluate whether cross-AZ traffic is necessary or can be redesigned
Consider VPC Peering or PrivateLink:
- For inter-VPC communication, use VPC peering instead of routing through NAT and internet
- For service-to-service communication, consider AWS PrivateLink
- These alternatives avoid both NAT Gateway costs and internet egress charges
Evaluate Alternative Connectivity:
- For large data transfers to the internet, consider using:
  - Direct Connect for consistent high-volume workloads
  - S3 Transfer Acceleration for uploads
  - CloudFront for content delivery
- For outbound-only instances, consider NAT instances for very high throughput scenarios (though less managed)
Right-size NAT Gateway Deployment:
- Review whether you need NAT Gateways in all Availability Zones
- Consider consolidating in lower-traffic environments (dev/test)
- Balance high availability needs with cost optimization
Monitor and Set Alerts:
- Configure CloudWatch alarms for NAT Gateway data processing
- Track data transfer trends over time
- Set up cost anomaly detection for unexpected spikes

Cost Impact Calculation

The cost impact represents the excessive portion of data transfer costs:

Baseline: Normal NAT Gateway usage typically has data transfer costs around 40-60% of total costs
Threshold: This recommendation flags gateways where data transfer exceeds 60%
Savings: Cost impact = (Data Transfer Ratio - 0.60) × Total NAT Gateway Cost

For example, a NAT Gateway with:

$100/month total cost
80% data transfer costs
Cost impact = (0.80 - 0.60) × $100 = $20/month potential savings
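
The formula above can be written as a short function. This is a sketch with hypothetical names; it clamps the result at zero for gateways below the 60% threshold, which the formula implies but does not state.

```python
def nat_cost_impact(total_cost, data_transfer_cost, threshold=0.60):
    """Cost impact = (data transfer ratio - threshold) x total NAT Gateway cost."""
    if total_cost <= 0:
        return 0.0
    ratio = data_transfer_cost / total_cost
    # Gateways at or below the threshold have no excess to recover.
    return round(max(0.0, ratio - threshold) * total_cost, 2)
```

Calling `nat_cost_impact(100, 80)` reproduces the $20/month figure from the example.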

Additional details

High Availability: When implementing changes, maintain redundancy across Availability Zones for production workloads
Compliance: Some regulatory requirements mandate specific network architectures
Migration Planning: Moving to VPC endpoints or alternative solutions requires application testing and validation
Performance Impact: Always test performance after architectural changes
Incremental Optimization: Start with high-impact services (S3, ECR) before optimizing smaller traffic sources

Release Idle Elastic IP Addresses

Elastic IP addresses (EIPs) that are allocated but not associated with running resources incur hourly charges. This recommendation identifies idle EIPs that can be released to reduce costs.

What Are Elastic IPs?

Static IPv4 addresses for AWS resources that allow you to:

Maintain consistent public IPs across instance replacements
Quickly remap IPs to different instances
Mask availability zone failures

Common Causes

Terminated Instances: EIP not released when EC2 instance deleted
Testing/Development: Allocated for testing and not released afterward
Infrastructure Changes: Old IPs from decommissioned services
Deleted Resources: EIPs from removed NAT Gateways or Load Balancers

Detection Method

Uses AmazonVPC billing data for idle addresses:

Service: AmazonVPC
Usage Type: PublicIPv4:IdleAddress
Criteria: Idle 7+ days

Cost impact

Idle EIPs	Monthly	Annual
5	$18	$216
10	$36	$432
50	$180	$2,160
100	$360	$4,320
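
The table values follow from a flat hourly rate. The sketch below assumes the $0.005 per hour public IPv4 rate and a 720-hour (30-day) month, which together give the $3.60 per address per month used in this section; the function name is hypothetical.

```python
HOURLY_RATE = 0.005      # USD per idle public IPv4 address per hour (assumed rate)
HOURS_PER_MONTH = 720    # 30-day month, matching the $3.60/month figure


def idle_eip_cost(count):
    """Return (monthly, annual) cost in USD for a number of idle Elastic IPs."""
    monthly = round(count * HOURLY_RATE * HOURS_PER_MONTH, 2)
    return monthly, round(monthly * 12, 2)
```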

How to address this

1. Verify EIP Status

```
aws ec2 describe-addresses --allocation-ids eipalloc-xxxxxxxxx
```

Check output:

InstanceId: null → Safe to release
InstanceId: i-xxxxx → Still in use, don't release
NetworkInterfaceId: eni-xxxxx → Check if ENI is attached
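
The checks above can be applied to the JSON that `describe-addresses` prints. The sketch below is a hedged illustration with hypothetical function and label names. It treats any address with an `InstanceId` as in use, any address with a `NetworkInterfaceId` or `AssociationId` as needing a manual check, and everything else as idle.

```python
import json


def classify_address(address):
    """Classify one entry from the Addresses list of describe-addresses output."""
    if address.get("InstanceId"):
        return "in-use"
    if address.get("NetworkInterfaceId") or address.get("AssociationId"):
        return "check-eni"
    return "idle"


def idle_allocation_ids(describe_addresses_json):
    """Return allocation IDs that have no instance or network interface attached."""
    data = json.loads(describe_addresses_json)
    return [a["AllocationId"] for a in data.get("Addresses", [])
            if classify_address(a) == "idle"]
```

An "idle" result only reflects association status; the dependency checks in the next step still apply.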

2. Check Dependencies

Before releasing, verify the EIP is NOT referenced in:

DNS A records
Firewall allowlist rules
Application configurations
Documentation

3. Release the EIP

AWS Console:

EC2 → Elastic IPs
Select unassociated EIP
Actions → Release Elastic IP addresses

AWS CLI:

```
aws ec2 release-address --allocation-id eipalloc-xxxxxxxxx
```

4. Update References

After release:

Update DNS records (if applicable)
Remove from firewall rules
Update documentation

Important Considerations

Do NOT release if:

Referenced in DNS (update DNS first)
In firewall allowlists (update rules first)
Reserved for disaster recovery
Actively used (verify association status)

Recovery: You cannot recover the same IP once released. You must allocate a new one and update all references.

Best Practices

Tag all EIPs:

```
aws ec2 create-tags --resources eipalloc-xxx --tags \
  Key=Name,Value="Production API" \
  Key=Owner,Value=team-name
```

Regular audits: Review idle EIPs monthly in all regions
Automation: Set up Lambda to alert on idle EIPs detected
Use alternatives when possible:
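- Auto-assigned public IPs (released automatically when the instance stops or terminates, though billed at the public IPv4 rate while in use)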
- Application Load Balancer (AWS-managed IPs)
- CloudFront (global edge network)

Cost Optimization

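Multiple EIPs per instance: Since February 1, 2024, AWS bills every public IPv4 address at $0.005/hour (about $3.60/month), whether it is associated or idle, so each additional EIP adds cost
NAT Gateway vs Instance: The Elastic IP attached to a public NAT Gateway is billed at the same public IPv4 rate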
Load Balancers: ALB/NLB don't require EIPs, often cheaper at scale

Storage

Amazon EBS Delete Volumes

This recommendation identifies EBS volumes that should be deleted to reduce costs.

How it works

Identifies EBS volumes that are candidates for deletion
Provides estimated cost savings from deleting unused volumes
Uses AWS Trusted Advisor recommendations to identify optimal deletion targets

How to address this

Delete EBS volumes that are no longer needed
Review volume snapshots before deletion
Ensure volumes are not attached to running instances
Consider creating snapshots for important data before deletion

Amazon EBS Rightsize Volumes

This recommendation identifies EBS volumes that should be rightsized to optimize cost and performance.

How it works

Identifies EBS volumes that are over-provisioned or under-provisioned
Provides estimated cost savings from rightsizing volumes
Uses AWS Trusted Advisor recommendations to identify optimal rightsizing targets

How to address this

Rightsize EBS volumes to match actual storage requirements
Review volume utilization metrics and I/O patterns
Consider performance requirements when rightsizing
Test application performance after rightsizing to ensure requirements are met

Amazon EBS Upgrade Volumes

This recommendation identifies EBS volumes that should be upgraded to newer generation types for cost optimization.

How it works

Identifies EBS volumes that are candidates for upgrading to newer generation types
Provides estimated cost savings from upgrading to more efficient volume types
Uses AWS Trusted Advisor recommendations to identify optimal upgrade targets

How to address this

Upgrade EBS volumes to newer generation types (e.g., gp3 instead of gp2)
Review performance requirements before upgrading
Test application performance after upgrade
Consider the trade-offs between cost and performance

Configure S3 Lifecycle Policy to Abort Incomplete Multipart Uploads

This recommendation identifies Amazon S3 buckets that do not have lifecycle policies configured to automatically abort incomplete multipart uploads, which can lead to unnecessary storage costs.

How it works

AWS Trusted Advisor monitors your S3 buckets and identifies those without lifecycle policies configured to abort incomplete multipart uploads. Incomplete multipart uploads continue to incur storage costs until they are explicitly aborted or automatically cleaned up by lifecycle policies.

What CloudZero identifies

S3 buckets without lifecycle policies for incomplete multipart upload cleanup
Opportunities to implement lifecycle policies for multipart upload management
Buckets that are accumulating costs from incomplete uploads
Recommendations for appropriate lifecycle policy configurations
Cost optimization opportunities from AWS Trusted Advisor

How it works

Uses AWS Trusted Advisor's c1cj39rr6v check for incomplete multipart upload abort configuration
Leverages Trusted Advisor's cost estimates and recommendations
Provides dynamic titles with specific actions
Covers all S3 buckets across all regions
Bucket-level recommendations for targeted optimization

Cost impact

Buckets without incomplete multipart upload abort policies can result in:

Accumulation of incomplete multipart upload parts over time
Continued storage costs for failed or abandoned uploads
Wasted storage space from incomplete upload fragments
Missed opportunities for cost optimization through automated cleanup

Multipart Upload Lifecycle Policy Benefits

Automated cleanup: Abort incomplete multipart uploads automatically
Cost control: Eliminate storage costs from failed uploads
Storage optimization: Free up storage space from abandoned uploads
Predictable costs: Better control over multipart upload-related storage costs
Simplified management: No manual intervention required for cleanup

Common Multipart Upload Lifecycle Configurations

Immediate cleanup: Abort incomplete multipart uploads after 1 day
Standard cleanup: Abort incomplete multipart uploads after 7 days
Extended cleanup: Abort incomplete multipart uploads after 30 days
Comprehensive policy: Combine with other lifecycle rules for complete bucket management
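
A lifecycle rule of this kind can be sketched as follows. The dictionary shape follows the S3 lifecycle configuration format (`AbortIncompleteMultipartUpload` with `DaysAfterInitiation`); the function name and rule ID are hypothetical, and the empty prefix filter is an assumption that applies the rule to the whole bucket.

```python
import json


def abort_incomplete_uploads_rule(days=7, rule_id="abort-incomplete-multipart-uploads"):
    """Build a lifecycle configuration that aborts stale multipart uploads."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                # An empty prefix applies the rule to every object in the bucket.
                "Filter": {"Prefix": ""},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


# Print the configuration as JSON so it can be saved to a file and reviewed.
print(json.dumps(abort_incomplete_uploads_rule(7), indent=2))
```

The JSON output could be saved to a file and applied with `aws s3api put-bucket-lifecycle-configuration`. That call replaces the bucket's existing lifecycle configuration, so merge this rule with any rules already in place first.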

Multipart Upload Considerations

Upload timeouts: Incomplete uploads can occur due to network issues or application failures
Storage costs: Each part of an incomplete multipart upload incurs storage charges
Cleanup timing: Balance between allowing retry attempts and cost control
Application integration: Ensure applications handle multipart upload failures gracefully

How to address this

Review buckets without incomplete multipart upload abort policies
Implement lifecycle policies specifically for multipart upload cleanup
Consider application retry patterns when setting abort timing
Monitor incomplete multipart upload accumulation
Use lifecycle policies to automate multipart upload cleanup
Regularly review and adjust abort policies based on usage patterns

Consider Intelligent-Tiering or Lifecycle Rules for S3

This recommendation is created when there are S3 buckets with spend only on Standard Storage, indicating that use of Intelligent-Tiering or Lifecycle policies could be applied to reduce cost.

Threshold: This recommendation is created if 10% of the total spend on S3 buckets that use Standard storage only is greater than $500.

Standard storage is the default storage class for objects in S3 and is the most expensive. Standard storage is best used for data that needs to be accessed frequently with fastest access time for data retrieval.

Consider the following when determining if S3 Intelligent-Tiering or S3 Lifecycle could be applied to the S3 resources listed to save up to 10% on storage costs.

S3 Intelligent-Tiering:

Amazon S3 Intelligent-Tiering is an Amazon S3 storage class designed to optimize storage costs by automatically moving data to the most cost-effective access tier when access patterns change, without performance impact or operational overhead.
S3 Intelligent-Tiering automatically stores objects in three access tiers:
- Frequent Access tier: The default access tier that any object created or transitioned to S3 Intelligent-Tiering begins its lifecycle in. An object remains in this tier as long as it is being accessed. If objects in other tiers are accessed later, S3 Intelligent-Tiering automatically moves the objects back to this tier.
- Infrequent Access tier: If an object is not accessed for 30 consecutive days, the object moves to the Infrequent Access tier with savings up to 40%.
- Archive Instant Access tier: If an object is not accessed for 90 consecutive days, the object moves to the Archive Instant Access tier with savings up to 68%.
When to use Intelligent-Tiering: Ideal for data with unknown, changing, or unpredictable access patterns, independent of object size or retention period. This includes data for new applications, data analytics, user-generated content, and data lakes.
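
The tiering rules above can be summarized in a small sketch that maps the number of consecutive days without access to the tier an object would be in. The function name is hypothetical. S3 performs this tiering automatically, so the code is only an aid to reasoning about which tier your data is likely to occupy.

```python
def intelligent_tiering_tier(days_since_last_access: int) -> str:
    """Return the automatic access tier for an object, given days without access."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access cannot be negative")
    if days_since_last_access >= 90:
        return "Archive Instant Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"


for days in (0, 29, 30, 89, 90):
    print(days, intelligent_tiering_tier(days))
```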

S3 Lifecycle Rules:

S3 Lifecycle helps you store objects in a cost-effective way throughout their lifecycle by transitioning them to lower-cost storage classes or deleting expired objects on your behalf.
By default, a Lifecycle rule applies to all existing and future objects in an S3 bucket. A rule can also include a filter that limits it to a prefix or a set of tags.
When to use Lifecycle policies: If you have a well-defined access pattern for your data. Ideal for data needing access for a specific period and then archiving at a cheaper storage tier.

ℹ️
Object monitoring and automation for Intelligent-Tiering incurs a small monthly charge. Learn more about S3 pricing and the additional costs associated with S3 in this blog post.
Amazon S3 Lifecycle can be used to transition new objects that are programmatically uploaded to the S3 Intelligent-Tiering storage class.
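
As a sketch of the Lifecycle-to-Intelligent-Tiering approach mentioned in the note above, the following builds a lifecycle configuration with a single transition rule. The rule ID, the empty prefix, and the 0-day transition are assumptions chosen for illustration; adjust them to match your bucket layout.

```python
def intelligent_tiering_transition_config(prefix: str = "", days: int = 0) -> dict:
    """Build a lifecycle configuration that moves objects to Intelligent-Tiering."""
    if days < 0:
        raise ValueError("days cannot be negative")
    return {
        "Rules": [
            {
                "ID": "move-to-intelligent-tiering",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix: whole bucket
                "Transitions": [
                    {"Days": days, "StorageClass": "INTELLIGENT_TIERING"}
                ],
            }
        ]
    }


config = intelligent_tiering_transition_config(prefix="logs/", days=0)
print(config)
```

As with any lifecycle configuration, applying it through the S3 API replaces the rules already on the bucket, so include existing rules in the same configuration.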

The resource table shows the list of buckets with spend only on Standard storage.

High Data Retrieval Costs for S3 Glacier Storage

This recommendation identifies data retrieval costs for an S3 bucket that uses an S3 Glacier storage tier. High data retrieval costs indicate frequently accessed data that could be optimized by moving it to a more cost-effective storage class.

Threshold: This recommendation is created if data retrieval costs for data stored in long-term or archival storage on any S3 bucket exceed $100 over the last 30 days. When the cost impact from all S3 buckets drops back to $100 or below, the Recommendation will be resolved.

AWS charges for storing objects in your S3 buckets, and for certain tiers, data retrieval per gigabyte. Amazon S3 provides the following S3 Glacier storage classes:

S3 Glacier Instant Retrieval (GLACIER_IR): Use for long-term data that is rarely accessed and requires milliseconds for retrieval. Data in this storage class is available for real-time access.
S3 Glacier Flexible Retrieval (GLACIER): Use for archives where portions of the data need to be retrieved in minutes. Data in this storage class is archived, and not available for real-time access.
S3 Glacier Deep Archive (DEEP_ARCHIVE): Use for archiving data that rarely needs to be accessed. Data in this storage class is archived, and not available for real-time access.

While objects stored in these tiers have lower storage costs, there is a cost for retrieving data. Review these buckets to determine why data retrieval is needed, and consider moving frequently accessed data to the Standard storage tier, which does not charge for data retrieval.

ℹ️
A small fee is applied for objects transitioned between storage classes, which is usually very low. Learn more about S3 pricing and the additional costs associated with S3 in this blog post.
Learn more about how to change the storage class for existing objects in the AWS documentation.

High Non-Standard API Requests for S3

This recommendation identifies high spend on non-standard API requests to S3. This high spend indicates excess overhead operations on your objects in S3.

Threshold: This recommendation is created if reducing non-standard S3 API calls will save at least $500 based on a 95% savings rate. When reducing non-standard S3 API calls results in savings less than $500, the Recommendation will automatically be closed.
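
The threshold arithmetic can be sketched as follows, assuming the 95% savings rate is applied directly to the spend on non-standard API requests. The function names are illustrative only.

```python
SAVINGS_RATE = 0.95  # savings rate stated in the threshold above
THRESHOLD_USD = 500.0


def potential_savings(non_standard_api_spend_usd: float) -> float:
    """Estimated savings from reducing non-standard API calls."""
    return non_standard_api_spend_usd * SAVINGS_RATE


def recommendation_is_open(non_standard_api_spend_usd: float) -> bool:
    """True when the estimated savings are at least $500."""
    return potential_savings(non_standard_api_spend_usd) >= THRESHOLD_USD


# Spend needed before the estimated savings reach the threshold.
minimum_spend = round(THRESHOLD_USD / SAVINGS_RATE, 2)
print(minimum_spend)
```

Under this assumption, roughly $526 of spend on non-standard requests is needed before the recommendation is created.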

Non-standard API requests for S3 include operations like LIST and HEAD. The LIST operation enumerates buckets or the objects in a bucket, and the HEAD operation retrieves metadata about an object without retrieving the object itself. These operations are categorized as overhead costs, while all other request types, such as GET and PUT, are considered operational costs.

High spend on these overhead operations in comparison to operational costs indicates these operations are adding disproportionate cost. This is normal if you are serving private objects, since HEAD requests for private objects cannot be cached due to the need to generate a signed URL. Public objects can be cached because they do not require a signed URL. For publicly served objects, consider caching these requests with CloudFront to reduce costs.

ℹ️
If object metadata changes frequently, you need to set shorter cache expiration times to ensure your application is receiving the latest information.

High Ratio of S3 API Cost to Storage Cost

This recommendation is created when spend on API requests to an S3 bucket represents more than 80% of the costs for that bucket. This high ratio of request cost to storage cost indicates frequently accessed data that could be moved to a different storage class.

When API request costs are high, it is often because the data being accessed is in an Infrequent Access tier. While Infrequent Access tiers have lower storage costs, there is a cost for every gigabyte of data retrieved, and it is billed as an API request.

Consider moving frequently accessed data to Standard storage tier, which does not charge for data retrieval, to save up to 50% on S3 spend.
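
The 80% check that triggers this recommendation can be sketched as a simple ratio. The function names and the handling of zero-cost buckets are assumptions for illustration.

```python
API_COST_RATIO_LIMIT = 0.80  # "more than 80% of the costs for that bucket"


def api_cost_ratio(api_cost_usd: float, total_bucket_cost_usd: float) -> float:
    """Share of a bucket's total cost that comes from API requests."""
    if total_bucket_cost_usd <= 0:
        return 0.0  # assumption: a bucket with no cost is never flagged
    return api_cost_usd / total_bucket_cost_usd


def has_high_api_ratio(api_cost_usd: float, total_bucket_cost_usd: float) -> bool:
    """True when API requests account for more than 80% of the bucket's cost."""
    return api_cost_ratio(api_cost_usd, total_bucket_cost_usd) > API_COST_RATIO_LIMIT


print(has_high_api_ratio(850.0, 1000.0))
print(has_high_api_ratio(500.0, 1000.0))
```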

High S3 Administrative Fees

Typically, administrative fees and other miscellaneous costs for a single S3 bucket should not exceed 10% of the total cost of the bucket. Fees related to AWS StorageLens and StorageAnalytics are not included in this check. When administrative fees and miscellaneous costs exceed the 10% threshold, the excess cost usually points to inefficient use of the S3 bucket or potentially unused buckets. The cost impact for this Recommendation is calculated by subtracting the per-bucket fee threshold (10% of the total 30-day bucket cost) from the total administrative fees for the specified S3 buckets.

Threshold: This recommendation is created if the total cost impact exceeds $500 in real cost for the last 30 days. When the cost impact drops back below $500, the Recommendation will be resolved.
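
The cost impact calculation can be sketched as below. Treating buckets whose fees fall under the 10% allowance as contributing zero is an assumption made for this example, and the sample figures are hypothetical.

```python
FEE_ALLOWANCE_RATE = 0.10  # fees up to 10% of a bucket's 30-day cost are expected
THRESHOLD_USD = 500.0


def bucket_cost_impact(total_cost_usd: float, admin_fees_usd: float) -> float:
    """Administrative fees in excess of the 10% allowance for one bucket."""
    allowance = FEE_ALLOWANCE_RATE * total_cost_usd
    return max(0.0, admin_fees_usd - allowance)  # assumption: never negative


def total_cost_impact(buckets: list) -> float:
    """Sum the excess fees across buckets given as (total_cost, admin_fees) pairs."""
    return sum(bucket_cost_impact(total, fees) for total, fees in buckets)


# Hypothetical 30-day figures: (total bucket cost, administrative fees)
buckets = [(4000.0, 1000.0), (2000.0, 150.0)]
impact = total_cost_impact(buckets)
print(round(impact, 2), impact > THRESHOLD_USD)
```

In this example the first bucket exceeds its $400 allowance by $600, the second stays under its $200 allowance, and the combined impact of $600 is above the $500 threshold.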

You can view the fees by grouping by Service Detail. They include:

Fee	Description
DeleteObject (Early Delete)	Some storage tiers are meant for infrequent access and have a minimum storage duration, such as 30 days. Objects deleted, moved, or overwritten prior to the minimum storage duration incur the normal storage charge plus a pro-rated fee for the remaining days. These fees represent the pro-rated cost. Check that you are using the appropriate storage tiers and services based on your access patterns.
SmObjects (Small Objects)	Some storage tiers and services have a minimum billable object size of 128 KB. Objects smaller than 128 KB are charged for 128 KB. These fees represent the difference between the actual storage used and the minimum billable object size. Check that you are using the appropriate storage tiers and services based on your object sizes.
Inventory	Amazon S3 Inventory is a service that generates reports on the content of your S3 buckets. These reports are generated for your own management or auditing purposes, or they are generated for use in conjunction with other AWS services, such as Intelligent-Tiering. These fees are associated with the generation and storage of these reports. Check your usage of the inventory services and determine if they are necessary.
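
To illustrate the Small Objects fee from the table above, this sketch computes the billable size of an object under a 128 KB minimum and the overhead that the fee represents. The function names are hypothetical, and 128 KB is taken as 128 × 1,024 bytes.

```python
MIN_BILLABLE_BYTES = 128 * 1024  # 128 KB minimum billable object size


def billable_bytes(object_size_bytes: int) -> int:
    """Size an object is charged for in a tier with a 128 KB minimum."""
    return max(object_size_bytes, MIN_BILLABLE_BYTES)


def small_object_overhead_bytes(object_size_bytes: int) -> int:
    """Bytes billed beyond the object's actual size."""
    return billable_bytes(object_size_bytes) - object_size_bytes


size = 10 * 1024  # a 10 KB object
print(billable_bytes(size), small_object_overhead_bytes(size))
```

A bucket holding many objects of this size in such a tier pays for more than twelve times the storage it actually uses, which is why small objects are usually better kept in Standard storage.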

Unarchived Old EBS Snapshots

CloudZero has identified Amazon EBS snapshots that have been stored for an extended period in standard snapshot storage. These long-term snapshots are excellent candidates for EBS Snapshot Archive, which can reduce storage costs by up to 75% for snapshots that are rarely accessed.

How it works

This recommendation identifies EBS snapshots that are:

Stored in standard EBS snapshot storage (not archived)
Older than 90 days
Incurring ongoing standard snapshot storage costs
Good candidates for migration to EBS Snapshot Archive tier
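
The selection criteria above can be sketched as a filter over snapshot records. The records here are hypothetical sample data shaped like entries in an EC2 DescribeSnapshots response (SnapshotId, StorageTier, StartTime). This is not how CloudZero computes the recommendation, only an illustration of the criteria.

```python
from datetime import datetime, timedelta, timezone


def archive_candidates(snapshots, now=None, min_age_days=90):
    """Return IDs of standard-tier snapshots older than min_age_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=min_age_days)
    return [
        s["SnapshotId"]
        for s in snapshots
        if s.get("StorageTier") == "standard" and s["StartTime"] < cutoff
    ]


# Hypothetical sample data for illustration.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
sample = [
    {"SnapshotId": "snap-old-standard", "StorageTier": "standard",
     "StartTime": datetime(2023, 1, 15, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-old-archived", "StorageTier": "archive",
     "StartTime": datetime(2023, 1, 15, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-recent", "StorageTier": "standard",
     "StartTime": datetime(2024, 5, 20, tzinfo=timezone.utc)},
]
candidates = archive_candidates(sample, now=now)
print(candidates)
```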

EBS Snapshot Archive is designed for long-term retention of snapshots that are accessed infrequently, such as compliance archives, disaster recovery backups, or historical reference snapshots.

Additional details

Cost Savings: Snapshot Archive storage costs ~75% less than standard snapshot storage ($0.0125/GB-month vs $0.05/GB-month)
Compliance: Maintain required long-term backups while dramatically reducing costs
No Data Loss: Archives preserve complete snapshot data in a lower-cost tier
Scalability: As snapshot storage grows over time, these savings compound

For example, a 1 TB snapshot (approximated as 1,000 GB) stored for a year:

Standard storage: $600/year
Archive storage: $150/year
Savings: $450/year per TB
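
The yearly figures above follow from the per-GB prices. This sketch reproduces them, assuming 1 TB is rounded to 1,000 GB and that the archived snapshot is billed at the same size as the standard one. Actual prices vary by region.

```python
STANDARD_PRICE = 0.05    # USD per GB-month, standard snapshot storage
ARCHIVE_PRICE = 0.0125   # USD per GB-month, archive snapshot storage


def yearly_cost(size_gb: float, price_per_gb_month: float) -> float:
    """Storage cost for twelve months at a flat per-GB monthly price."""
    return size_gb * price_per_gb_month * 12


size_gb = 1000  # 1 TB, approximated as 1,000 GB
standard = round(yearly_cost(size_gb, STANDARD_PRICE), 2)
archive = round(yearly_cost(size_gb, ARCHIVE_PRICE), 2)
savings = round(standard - archive, 2)
print(standard, archive, savings)
```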

How to address this

Review Snapshot Usage Patterns:
- Identify snapshots that are retained for compliance or disaster recovery
- Determine which snapshots are rarely or never restored
- Confirm snapshots older than 90 days are good archival candidates
- Verify that longer restore times (24-72 hours) are acceptable

Archive Eligible Snapshots:

Via AWS Console:

Navigate to EC2 → Snapshots
Select snapshot(s) to archive
Actions → Archive snapshot

Via AWS CLI:

aws ec2 modify-snapshot-tier \
  --snapshot-id snap-1234567890abcdef0 \
  --storage-tier archive

Bulk Archive via CLI:

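# Archive every standard-tier snapshot created on or before the cutoff date.
# Wrapping SnapshotId in [ ] makes --output text print one ID per line.
CUTOFF="2023-01-01"
aws ec2 describe-snapshots \
  --owner-ids self \
  --filters Name=storage-tier,Values=standard \
  --query "Snapshots[?StartTime<='${CUTOFF}'].[SnapshotId]" \
  --output text | \
while read -r snap; do
  aws ec2 modify-snapshot-tier \
    --snapshot-id "$snap" \
    --storage-tier archive
done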

Implement Automated Archival Policies:
- Use Amazon Data Lifecycle Manager (DLM) to automatically archive snapshots based on age
- Create lifecycle policies that:
  - Move snapshots to archive tier after 90 days
  - Delete archived snapshots after retention period expires
  - Apply to specific volumes by tags

Set Up Monitoring:
- Track snapshot storage costs over time
- Monitor archive vs standard storage distribution
- Set CloudWatch alarms for unexpected snapshot growth
- Review archived snapshots quarterly to confirm retention needs

Document Restore Process:
- Document that archived snapshots take 24-72 hours to restore
- Update disaster recovery runbooks with new restore timelines
- Communicate changes to teams that need to restore snapshots
- Test restore process from archive to verify procedures

Review Retention Policies:
- Evaluate whether all snapshots need to be retained
- Delete snapshots that are no longer needed for compliance or recovery
- Consider tiered retention: recent snapshots → archive → deletion

Cost Impact Calculation

The cost impact represents potential savings from archiving:

Standard Storage: ~$0.05 per GB-month (varies by region)
Archive Storage: ~$0.0125 per GB-month (75% cheaper)
Savings: 75% of current standard snapshot storage costs

For a snapshot older than 90 days with $100/month in storage costs:

Moving to archive saves: $75/month or $900/year

Important Considerations

Restore Times

Standard snapshots: Instant availability for volume creation
Archived snapshots: 24-72 hours to restore to standard tier before use
Only archive snapshots where slow restore is acceptable

Use Cases for Archive

Good candidates:

Compliance/regulatory retention backups
Long-term disaster recovery snapshots
Historical reference snapshots
End-of-month/quarter/year snapshots
Snapshots of decommissioned resources

Poor candidates:

Snapshots for active disaster recovery (need fast restore)
Recent snapshots (< 90 days old)
Snapshots used for frequent testing or development
Snapshots that need to be available quickly

Pricing Considerations

Archive storage: $0.0125/GB-month (~$12.80/TB-month)
Restore from archive: $0.03/GB retrieval charge (one-time when restoring)
Standard storage: $0.05/GB-month (~$51.20/TB-month)

If you need to restore an archived snapshot frequently, the retrieval charges can offset savings.
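
To see how retrieval charges eat into savings, the sketch below nets restore charges against the monthly storage savings, using the prices listed above. It assumes the archived snapshot is billed at the same size as the standard one and ignores any standard-tier storage incurred while a restored copy exists. The function names are illustrative.

```python
STANDARD_PRICE = 0.05    # USD per GB-month
ARCHIVE_PRICE = 0.0125   # USD per GB-month
RESTORE_PRICE = 0.03     # USD per GB, charged per restore


def months_to_recoup_restore() -> float:
    """Months of archive savings consumed by one restore of the same data."""
    return RESTORE_PRICE / (STANDARD_PRICE - ARCHIVE_PRICE)


def net_savings(size_gb: float, months: int, restores: int) -> float:
    """Storage savings over a period, minus the retrieval charges."""
    storage_savings = size_gb * (STANDARD_PRICE - ARCHIVE_PRICE) * months
    retrieval_charges = size_gb * RESTORE_PRICE * restores
    return storage_savings - retrieval_charges


print(round(months_to_recoup_restore(), 2))
print(round(net_savings(1000, 12, 1), 2))
print(round(net_savings(1000, 12, 15), 2))
```

At these prices each restore cancels a little under one month of savings, so a snapshot that is restored more than about once a month costs more in the archive tier than in standard storage.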

Operational Impact

No changes to snapshot permissions or sharing
Snapshot IDs remain the same
Tags and metadata are preserved
Can restore to standard tier at any time (with 24-72 hour delay)

Best Practices

Age-Based Policy: Archive snapshots automatically after 90-180 days
Tag-Based Archival: Use tags to identify archive candidates (e.g., Archivable=true)
Test Restore Process: Periodically test restoring from archive to verify procedures
Lifecycle Management: Use DLM for automated archival and eventual deletion
Cost Tracking: Monitor savings from archival in Cost Explorer, using cost allocation tags to group snapshot spend
Document Exceptions: Clearly identify snapshots that should never be archived

ℹ️
Have questions or feedback? Reach out to your account manager.