A long time ago I had a problem: when I gave people access to S3, they treated it like a garbage can. The bucket was supposed to be for development purposes only, for testing whether features worked and so on. But people kept testing different things and forgetting to delete their files.
I decided to do something about it once and for all. These are test buckets, so they should stay empty, with no sensitive data and preferably no data at all 🙂
But what exactly? I didn’t want to log in to S3 every now and then and clean up by hand.
And I won’t. Such a process can be automated, and I decided to use an AWS Lambda function for this.
You can read a guide on how to create a function and give it the appropriate (minimum) permissions in the article How to automatically copy data from AWS S3 – Lambda events (lepczynski.it).
Today I’m going to give a quick demonstration of how S3 can be cleaned automatically in a simple way. Once I have created a function in Python 3.9 and attached the appropriate IAM role to it, I can move on to writing the code.
The matter is simple: you can use boto3. In the Lambda function’s code, just add:
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('test001asde001')
bucket.objects.all().delete()  # removes every object in the bucket
test001asde001 – this is my AWS S3 bucket from which I will be removing all files.
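Inside Lambda, these calls have to live in a handler function. Here is a minimal sketch of how that might look. The helper name empty_bucket and the BUCKET_NAME environment variable are my own illustrative assumptions, not part of the original setup; only the bucket name comes from this article.

```python
import os

# Bucket name used in this article. BUCKET_NAME is a hypothetical
# environment variable that overrides it if set.
DEFAULT_BUCKET = "test001asde001"


def empty_bucket(bucket):
    """Delete all objects in a boto3 Bucket resource.

    Returns the number of keys S3 reported as deleted.
    """
    # The batch delete returns one response dict per request it sent.
    responses = bucket.objects.all().delete()
    return sum(len(r.get("Deleted", [])) for r in responses)


def lambda_handler(event, context):
    # Imported here so empty_bucket() can be exercised without boto3.
    import boto3

    bucket_name = os.environ.get("BUCKET_NAME", DEFAULT_BUCKET)
    bucket = boto3.resource("s3").Bucket(bucket_name)
    deleted = empty_bucket(bucket)
    return {"bucket": bucket_name, "deleted": deleted}
```

One thing to keep in mind: if versioning is enabled on the bucket, this only adds delete markers. To remove the old versions as well, bucket.object_versions.delete() would be the call to look at.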
Obviously, I am not an expert in writing code and do not want to complicate it unnecessarily, so I am limiting myself to the minimum that will allow it to work properly and make the subject easier to understand.
Well, okay, I have a function that works and deletes all files. I don’t have to do it manually, so that’s cool. However, I’m a pretty lazy person and I wouldn’t want to run this function manually every day.
The best thing would be to run the function automatically on a schedule, like a CRON job. But is there such a thing in AWS? Of course there is 🙂 It’s called a CloudWatch Events Rule (the service is now part of Amazon EventBridge).
Just add a new trigger to the function and select CloudWatch Events. I like cron, so I use it often. Here is a cool calculator, https://crontab.guru/ (keep in mind it uses the standard five-field syntax, while AWS cron expressions have six fields), and a helpful explanation of how to set up cron: Schedule Expressions for Rules – Amazon CloudWatch Events.
I want my lambda function to start every day at 20:00, so I set the following value:
cron(0 20 * * ? *)
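If you prefer code to clicking through the console, the same trigger can probably be set up with boto3 as well. This is only a sketch: the function name schedule_lambda, the rule name and the function ARN are placeholders I made up, and the clients are passed in as parameters so the function can be tried out with stand-ins.

```python
def schedule_lambda(events, lambda_client, function_arn, rule_name, expression):
    """Create (or update) a scheduled rule and point it at a Lambda function.

    events        -- a boto3 "events" client
    lambda_client -- a boto3 "lambda" client
    """
    # 1. The rule itself, with the cron expression.
    rule = events.put_rule(
        Name=rule_name,
        ScheduleExpression=expression,
        State="ENABLED",
    )
    # 2. Permission for the rule to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    # 3. The function as the rule's target.
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule["RuleArn"]


# Hypothetical usage (names and ARN are placeholders):
# import boto3
# schedule_lambda(
#     boto3.client("events"),
#     boto3.client("lambda"),
#     "<your function ARN>",
#     "clean-test-bucket-daily",
#     "cron(0 20 * * ? *)",
# )
```

Running it a second time with the same rule name will most likely fail at add_permission, because a statement with that ID already exists, so treat it as a one-off setup script.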
When setting up the CloudWatch Events Rule, remember one very important thing: you specify the time in UTC 🙂
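To avoid doing the UTC conversion in my head, a small helper like the one below could build the expression from a local time. It is just a sketch: daily_cron_utc is a name I made up, and the UTC+2 offset in the example is an assumption for illustration, not anything AWS-specific.

```python
from datetime import date, datetime, time, timedelta, timezone


def daily_cron_utc(local_hour, local_minute, tz, on_date):
    """Build a daily AWS cron expression from a local wall-clock time.

    tz is any tzinfo object; its offset on on_date is used for the
    conversion to UTC.
    """
    local = datetime.combine(on_date, time(local_hour, local_minute), tzinfo=tz)
    utc = local.astimezone(timezone.utc)
    # AWS field order: minutes hours day-of-month month day-of-week year
    return f"cron({utc.minute} {utc.hour} * * ? *)"


# Example: 20:00 in a zone that is two hours ahead of UTC.
utc_plus_2 = timezone(timedelta(hours=2))
print(daily_cron_utc(20, 0, utc_plus_2, date(2024, 7, 1)))
```

The rule itself knows nothing about daylight saving time, so if your local offset changes during the year, the function will start an hour earlier or later in local terms unless you update the expression.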
As you can see, Lambda functions do not have to be complicated. Using a CloudWatch Events Rule, you can call a Lambda function as simply as you would run a script from CRON.
With this simple tutorial, you can run any Lambda function at a specific time. For example, you could shut down EC2 instances for the weekend and turn them back on afterwards, or shut down all development machines every day after work and turn them on in the morning before work. This will have a positive impact on your AWS bill: the machines run for less time, so you pay less. If you’re curious how to do this, I invite you to read the tutorial How to automatically run EC2 in the AWS cloud?