Optimizing AWS Spending: A Guide to Remove Stale Resources for Cost Efficiency

Concepts to know:

  • AWS Lambda vs. AWS EC2

    AWS Lambda is a serverless computing service provided by Amazon Web Services (AWS) that lets developers run code in response to events without managing servers. It scales automatically and charges based on compute time.

    AWS EC2 (Elastic Compute Cloud) offers virtual servers that users can manage and scale as needed, providing full control over the infrastructure. In contrast, Lambda abstracts away server management and scaling, charging only for compute time.

  • Why is Cost Optimization very important?

    Cost optimization is crucial in the cloud due to its pay-as-you-go model. Effective strategies ensure efficient resource utilization, preventing wastage and controlling expenses.

  • What is EBS volume?

    Amazon Elastic Block Store (EBS) provides persistent block-level storage volumes for use with Amazon EC2 instances. EBS volumes are like virtual hard drives and can be attached to EC2 instances to store data persistently. They offer durability, scalability, and the ability to be backed up and restored easily.

  • What is a Snapshot?

    A snapshot is a point-in-time copy of data stored in Amazon Web Services (AWS) services such as Amazon Elastic Block Store (EBS) volumes or Amazon Redshift clusters. It captures the entire state of the data at the time the snapshot is created, enabling easy backups, replication, and recovery.

  • What are Stale Resources?

    Stale resources refer to unused or outdated resources within a system, typically in the context of cloud computing environments or IT infrastructure. These can include unused virtual machines, storage volumes, databases, or other resources that were provisioned but are no longer actively utilized or needed for operations.

Overview of the Project:

  • High level overview

Explanation:

When you create an AWS EC2 instance, it automatically includes an EBS root volume for storage. If you have important data on this volume, it's worth creating a snapshot to back it up. By default, terminating the instance also deletes its root volume, unless you detach the volume or turn off its DeleteOnTermination setting beforehand. Snapshots, however, are not deleted along with the instance or the volume: if you forget to delete them, they keep incurring storage charges.
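Whether a volume disappears with its instance depends on the DeleteOnTermination flag, which appears in the block device mappings returned by describe_instances. The following is a minimal sketch rather than part of the original tutorial: the helper name volumes_surviving_termination and the sample IDs are hypothetical, and the function only inspects a response dictionary, so it runs without AWS credentials.

```python
def volumes_surviving_termination(describe_instances_response):
    """Return IDs of attached EBS volumes that are NOT deleted on termination.

    Expects the dictionary shape returned by boto3's
    ec2.describe_instances(); no AWS call is made here.
    """
    surviving = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            for mapping in instance.get("BlockDeviceMappings", []):
                ebs = mapping.get("Ebs")
                # Skip mappings without EBS details; keep those flagged to survive.
                if ebs and not ebs.get("DeleteOnTermination", False):
                    surviving.append(ebs["VolumeId"])
    return surviving


# Hypothetical sample shaped like a describe_instances() response.
sample_response = {
    "Reservations": [
        {
            "Instances": [
                {
                    "InstanceId": "i-0example",
                    "BlockDeviceMappings": [
                        {"DeviceName": "/dev/xvda",
                         "Ebs": {"VolumeId": "vol-0root", "DeleteOnTermination": True}},
                        {"DeviceName": "/dev/sdf",
                         "Ebs": {"VolumeId": "vol-0data", "DeleteOnTermination": False}},
                    ],
                }
            ]
        }
    ]
}

print(volumes_surviving_termination(sample_response))  # prints ['vol-0data']
```

In a real account, the same function could be fed the result of boto3.client("ec2").describe_instances() to see which volumes would be left behind after termination.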

Steps to reproduce:

  1. Create an EC2 instance: This automatically creates a corresponding EBS volume.

  2. Create a snapshot of the volume: Navigate to your instance, select "Create snapshot," and choose the volume you want to back up.

  3. Verify your resources: You will now have one instance, one volume attached to it, and one snapshot of the volume.

  4. Delete forgotten snapshots: If you forget to delete snapshots after completing your work, they keep accruing storage charges. The Lambda function built in the next section automates the cleanup of these stale resources.
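Steps 2 and 3 above can also be done from code instead of the console. The sketch below is an illustration under assumptions: the helper names snapshot_volume and count_own_snapshots are hypothetical, and the EC2 client is passed in as a parameter so the functions can be tried against a stub before touching a real account. The two calls it relies on, create_snapshot and describe_snapshots, are standard boto3 EC2 client methods.

```python
def snapshot_volume(ec2, volume_id, description="Backup taken for the cleanup walkthrough"):
    """Create a snapshot of one EBS volume and return the new snapshot ID.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    return response["SnapshotId"]


def count_own_snapshots(ec2):
    """Return how many snapshots are owned by the calling account."""
    response = ec2.describe_snapshots(OwnerIds=["self"])
    return len(response["Snapshots"])


# Example use (requires the boto3 package and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   new_id = snapshot_volume(ec2, "vol-0123456789abcdef0")  # hypothetical volume ID
#   print(new_id, count_own_snapshots(ec2))
```

After step 2, count_own_snapshots should report at least one snapshot, which is the state step 3 asks you to verify.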

Creating Lambda Function:

1. Access the AWS Management Console: Open a web browser and navigate to the AWS Management Console at console.aws.amazon.com.

2. Navigate to the Lambda service: In the search bar at the top of the console, type "Lambda" and click the "Lambda" service to open the Lambda console.

3. Create a new function:

  • Click the "Create function" button.

  • Choose "Author from scratch" as the creation method.

  • Enter a descriptive name for your function, such as "CleanUpStaleSnapshots."

  • Select a Python runtime (e.g., Python 3.9), since the function code below is written in Python.

  • Choose an execution role. Attaching the "AmazonEC2FullAccess" managed policy to a new role would work, but it grants far more than this function needs. This walkthrough instead starts from a role without EC2 permissions and adds only the required ones in the later steps.

  • Click "Create function" to proceed.

4. Paste the code below into the Lambda function's code section.

import boto3

def lambda_handler(event, context):
    ec2 = boto3.client('ec2')

    # Get all EBS snapshots owned by this account
    response = ec2.describe_snapshots(OwnerIds=['self'])

    # Get the IDs of all running EC2 instances
    instances_response = ec2.describe_instances(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
    active_instance_ids = set()

    for reservation in instances_response['Reservations']:
        for instance in reservation['Instances']:
            active_instance_ids.add(instance['InstanceId'])

    # Delete each snapshot whose source volume is missing or is not attached to a running instance
    for snapshot in response['Snapshots']:
        snapshot_id = snapshot['SnapshotId']
        volume_id = snapshot.get('VolumeId')

        if not volume_id:
            # The snapshot has no source volume recorded
            ec2.delete_snapshot(SnapshotId=snapshot_id)
            print(f"Deleted EBS snapshot {snapshot_id} as it has no source volume.")
        else:
            try:
                # Raises InvalidVolume.NotFound if the volume no longer exists
                volume_response = ec2.describe_volumes(VolumeIds=[volume_id])
                attachments = volume_response['Volumes'][0]['Attachments']
                # Keep the snapshot only if its volume is attached to a running instance
                if not any(a.get('InstanceId') in active_instance_ids for a in attachments):
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
                    print(f"Deleted EBS snapshot {snapshot_id} as it was taken from a volume not attached to any running instance.")
            except ec2.exceptions.ClientError as e:
                if e.response['Error']['Code'] == 'InvalidVolume.NotFound':
                    # The volume associated with the snapshot is not found (it might have been deleted)
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
                    print(f"Deleted EBS snapshot {snapshot_id} as its associated volume was not found.")
                else:
                    # Report other errors (e.g. missing permissions) instead of silently ignoring them
                    print(f"Skipped EBS snapshot {snapshot_id} due to an unexpected error: {e}")
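Because the handler above deletes snapshots as soon as it finds them, it can help to preview its decisions first. The following is a minimal sketch of the rules stated in the handler's comments, written as pure functions so they can be checked without AWS access. The names stale_reason and dry_run, and all sample IDs, are hypothetical and not part of the original tutorial.

```python
def stale_reason(snapshot, volumes_by_id, running_instance_ids):
    """Return why a snapshot looks stale, or None if it should be kept.

    snapshot: one entry from describe_snapshots()["Snapshots"].
    volumes_by_id: maps volume ID -> entry from describe_volumes()["Volumes"].
    running_instance_ids: set of IDs of instances in the "running" state.
    """
    volume_id = snapshot.get("VolumeId")
    if not volume_id:
        return "no source volume recorded"
    volume = volumes_by_id.get(volume_id)
    if volume is None:
        return "source volume no longer exists"
    attached_to_running = any(
        attachment.get("InstanceId") in running_instance_ids
        for attachment in volume.get("Attachments", [])
    )
    if not attached_to_running:
        return "source volume not attached to a running instance"
    return None


def dry_run(snapshots, volumes_by_id, running_instance_ids):
    """List (snapshot_id, reason) pairs without deleting anything."""
    report = []
    for snapshot in snapshots:
        reason = stale_reason(snapshot, volumes_by_id, running_instance_ids)
        if reason:
            report.append((snapshot["SnapshotId"], reason))
    return report


# Hypothetical sample data shaped like the boto3 responses.
sample_snapshots = [
    {"SnapshotId": "snap-a", "VolumeId": "vol-1"},     # volume in use
    {"SnapshotId": "snap-b", "VolumeId": "vol-2"},     # volume detached
    {"SnapshotId": "snap-c", "VolumeId": "vol-gone"},  # volume deleted
]
sample_volumes = {
    "vol-1": {"Attachments": [{"InstanceId": "i-run"}]},
    "vol-2": {"Attachments": []},
}

for snapshot_id, reason in dry_run(sample_snapshots, sample_volumes, {"i-run"}):
    print(f"Would delete {snapshot_id}: {reason}")
```

Keeping the decision separate from the delete call makes it possible to log the candidates for a few runs before letting the function remove anything.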

5. Once you've configured everything, click the "Review" button to review the changes. If everything looks good, click "Deploy" to make your function live.

6. Then open the Configuration section, go to General configuration, and increase the Lambda function's timeout to 10 seconds.

7. By default, a Lambda function's timeout is 3 seconds.

8. Next, go back to the code and click "Test". The first run fails with a permissions error.

9. The error states that the function's role is not authorized to perform the DescribeSnapshots action.

10. To fix this, go to the role that executes the function and add the missing permissions.

11. To do so, open the Configuration section in Lambda, navigate to Permissions, and click the role name.

12. Clicking the role name takes you to the IAM console. From there, navigate to the Policies section and create a new policy.

Select the EC2 service, then choose the actions "DeleteSnapshot" and "DescribeSnapshots".

13. Give the policy a name and create it.

14. Now attach the policy to the role that executes the Lambda function.

15. Open the role in the IAM console and add the new policy to its permissions. Afterwards, the role's permissions list in the IAM console includes the new policy.

16. Return to the code and click "Test" again. You will encounter another error.

17. This time, the error states that the role is not authorized to perform the DescribeInstances action.

18. To address this, create another policy in the same way, this time with the actions "DescribeVolumes" and "DescribeInstances".

19. After creating the policy, attach it to the role as well. The role then lists both sets of permissions.

20. Click "Test" again; this time you shouldn't encounter any errors. Because the snapshot's volume is still attached to the running instance, the snapshot is kept.

21. Now terminate the EC2 instance. By default, its EBS root volume is deleted along with it. You will notice that both the instance and the volume are gone, but the snapshot remains.

22. Click "Test" again. The function now deletes any snapshot whose source volume no longer exists.

23. After this test run, you will see that the snapshot whose volume was deleted is gone as well.

24. After completing all the tasks, delete the Lambda function and the associated policies. Leaving unused resources behind can incur unnecessary costs.
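Looking back at steps 12 to 19, the permissions added to the role amount to four EC2 actions. As a hedged sketch, they could be expressed as a single IAM policy document like the one below. Merging the two policies into one is an assumption made for brevity (the walkthrough creates them separately), and the function name build_snapshot_cleanup_policy is hypothetical.

```python
import json


def build_snapshot_cleanup_policy():
    """Return an IAM policy document covering the four EC2 actions the function uses."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeSnapshots",
                    "ec2:DescribeVolumes",
                    "ec2:DescribeInstances",
                    "ec2:DeleteSnapshot",
                ],
                # "*" keeps the sketch simple; it is not a tuned resource scope.
                "Resource": "*",
            }
        ],
    }


# Print the JSON that could be pasted into the IAM policy editor.
print(json.dumps(build_snapshot_cleanup_policy(), indent=2))
```

Compared with a broad managed policy, a document like this limits the function to reading instance, volume, and snapshot metadata and deleting snapshots.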

Credits: This article follows a tutorial from YouTube.

Special thanks to Abhishek Veeramalla for his guidance through his YouTube videos.
