
Automatically delete old files from AWS S3


We all accumulate a lot of files as we work. Fast internet connections and easy access to data encourage this. Unfortunately, a large part of the data we collect becomes unnecessary or outdated after some time. Some people review their datasets and periodically delete what they no longer need.

What about the cloud? Since storage prices have become very low, many of us keep data not only at home but also in the cloud. Why?

Because it is cheap, convenient, and we have access to it from anywhere in the world 🙂

Nowadays we need to filter data well to find what is actually useful to us. If we don’t, we’ll drown in a flood of information. We stubbornly collect the data we think we need, keep it, and after some time it becomes unnecessary and obsolete.

The same applies to logs stored in the cloud. Storing them is also cheap and convenient, only… logs also have a shelf life. Let’s not keep data longer than we have to. For some systems that will be 30 days, for others 90 days, and for some a year or several years, if laws and regulations require it.

If you have any data in AWS that you would like to delete automatically after a certain period of time, this article is for you!

Automatic deletion of data from the entire S3 bucket

On the AWS (Amazon Web Services) platform, we can easily delete data from an S3 bucket automatically. Open Amazon S3 and select the bucket from the list on which you want to enable automatic deletion of files after a specified time.

[Screenshot: AWS S3 – Simple Storage Service]

Go to Management and click Create lifecycle rule.

[Screenshot: AWS S3 – Management tab]

Give the rule a name. Select the option stating that the rule applies to all objects in the bucket, and tick the acknowledgement checkbox that appears. Be aware that all objects older than the number of days specified below will be deleted.

[Screenshot: S3 – Create lifecycle rule]

If we want files to be deleted after 30 days, select the “Expire current versions of objects” option and enter the number of days after which files should be deleted.

[Screenshot: AWS S3 – Lifecycle rule actions]

In the summary we should see something similar to the picture below. When everything is correct, click Create rule and our automatic file deletion rule is ready.

[Screenshot: AWS S3 – timeline summary]
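If you prefer to script this step instead of clicking through the console, the same kind of rule can be created with boto3. Below is a minimal sketch, assuming a hypothetical bucket named “my-log-bucket”; note that this call replaces the bucket’s entire lifecycle configuration, so include any existing rules you want to keep.

import boto3

s3 = boto3.client("s3")

# Expire (delete) current object versions in the whole bucket after 30 days.
# "my-log-bucket" is a placeholder - replace it with your bucket name.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-log-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "delete-after-30-days",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix = all objects
                "Expiration": {"Days": 30},
            }
        ]
    },
)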

Automatic deletion of data from a single folder in an S3 bucket

Everything works nicely, but what if we want a different retention period for each group of files? We are not going to create a separate bucket for every little thing; managing that many buckets would be a nightmare. There is an easier way: we can limit the deletion of files to a specific folder or subfolder.

I strongly recommend that you check this option on a test bucket if you are just learning, or make a copy of the bucket you are implementing it on, just in case.

If you make a mistake, you could lose your data.

If you are sure of what you are doing, you can limit file deletion to a specific folder only.

The whole process is very similar to before; the only difference is that this time you select the “Limit the scope of this rule using one or more filters” option and enter the name of the folder, ending with ‘/’, that the rule should apply to, for example ‘folder1/’.

[Screenshot: AWS S3 – create lifecycle rule with a prefix (2022)]
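The folder-scoped variant can be scripted the same way. A minimal boto3 sketch, again with placeholder bucket and folder names:

import boto3

s3 = boto3.client("s3")

# Delete objects under "folder1/" 30 days after creation; the rest of
# the bucket is left untouched. Bucket and prefix are placeholders.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-log-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "delete-folder1-after-30-days",
                "Status": "Enabled",
                "Filter": {"Prefix": "folder1/"},
                "Expiration": {"Days": 30},
            }
        ]
    },
)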

Summary of Lifecycle rules

[Screenshot: AWS S3 – lifecycle rules summary]

These are the basics; our automation can be extended much further. Even the basics will help you avoid unnecessary expenses and keep order by automatically deleting old logs and outdated data from AWS S3.

If this article has given you some value, or you know someone who might need it, please share it on the Internet. Don’t let it sit idle on this blog and waste its potential.

I would also be pleased if you left a comment.

If you do not like to spend money unnecessarily, I invite you to read other articles on saving money in the cloud.

39 thoughts on “Automatically delete old files from AWS S3”

  1. How do I delete files that are older than x days? Can we use an API or script for that?
    Please share an example with an explanation.

    1. The simplest way is to use a lifecycle rule.

      There is no single API or CLI command to delete files older than x days.

      For example, you can mount S3 as a network drive (for example through s3fs) and use the Linux find command to locate and delete files older than x days.

      You can also first use “aws s3 ls” to find files older than x days, and then use “aws s3 rm” to delete them.

      There are many ways. You can find some examples on Stack Overflow, but I haven’t tested them 😉 https://stackoverflow.com/questions/50467698/how-to-delete-files-older-than-7-days-in-amazon-s3
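      A rough boto3 sketch of that idea (a sketch only, not tested against your setup; the bucket name is a placeholder):

      import boto3
      from datetime import datetime, timedelta, timezone

      s3 = boto3.client("s3")
      cutoff = datetime.now(timezone.utc) - timedelta(days=30)

      # Walk the whole bucket and delete objects last modified before the cutoff.
      paginator = s3.get_paginator("list_objects_v2")
      for page in paginator.paginate(Bucket="my-log-bucket"):
          for obj in page.get("Contents", []):
              if obj["LastModified"] < cutoff:
                  s3.delete_object(Bucket="my-log-bucket", Key=obj["Key"])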

  2. Your use of prefixes is incorrect. What you have written here will achieve nothing.

  3. Sorry for this. You’re right, I forgot to add ‘/’ at the end of the directory names, but when I tested it, it worked fine.
    To be consistent with the documentation, I will correct the article. Thank you very much for pointing it out.

  4. Hi, can we also use a lifecycle rule to delete an object in the S3 Glacier Deep Archive class before 180 days?

  5. Very useful tutorial.
    I’d like to know how I can delete only .zip files.
    I have different types of archives and I want to delete only the old .zip files.

  6. When I want to delete files with a specific extension in MediaStore, I don’t use:
    "path": [
    {"prefix": "folder1/"}
    ],

    Instead, I use:
    "path": [
    {"wildcard": "folder1/*.zip"}
    ],

    In this example I delete all files with the ‘.zip’ extension from ‘folder1’.

    I just don’t know if you can use this in S3.
    Remember that it is better to test any changes in a test environment.
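    In S3, lifecycle filters do not support wildcards; they can match only a key prefix, object tags, and object size. One possible workaround, sketched below with placeholder bucket, key, and tag names, is to tag the .zip objects and filter the rule by that tag:

    import boto3

    s3 = boto3.client("s3")

    # Tag one archive so a lifecycle rule can match it (tags can also be
    # applied at upload time). All names here are placeholders.
    s3.put_object_tagging(
        Bucket="my-log-bucket",
        Key="folder1/backup.zip",
        Tagging={"TagSet": [{"Key": "cleanup", "Value": "zip"}]},
    )

    # Expire only objects carrying that tag after 30 days.
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-log-bucket",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "delete-tagged-zips",
                    "Status": "Enabled",
                    "Filter": {"Tag": {"Key": "cleanup", "Value": "zip"}},
                    "Expiration": {"Days": 30},
                }
            ]
        },
    )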

  7. I set a lifecycle rule for my entire S3 bucket with expiration set to one day, so the files should have been deleted automatically the next day, but they were not. Why?

  8. Amazon S3 runs lifecycle rules once every day. After the first time that Amazon S3 runs the rules, all objects that are eligible for expiration are marked for deletion. You’re no longer charged for objects that are marked for deletion.
    However, rules might take a few days to run before the bucket is empty because expiring object versions and cleaning up delete markers are asynchronous steps. For more information about this asynchronous object removal in Amazon S3, see Expiring objects.

  9. Can I clean my S3 bucket every 2 hours? I suppose that can’t be done with lifecycle rules; is there any other way?

  10. This is a very useful article. Can we create one rule to delete files older than 30 days inside subfolders like:

    Bucket-name/ServerName/DatabaseName/FullBackup/
    Bucket-name/ServerName/DatabaseName/DiffBackup/
    Bucket-name/ServerName/DatabaseName/LogBackup/

    What prefix name should we give?

  11. Thanks for this article. It sounds like I need to use the lifecycle options. Originally I was under the impression that I could use the --expires flag on the cp command. I thought that would copy files to the bucket and set an expiry date at the individual object level instead of at the bucket or folder level. It sounds like that is not true.

    1. If I understand it correctly, the --expires flag is used for something else: it sets the object’s Expires header, which controls caching rather than deletion:
      --expires (string) The date and time at which the object is no longer cacheable.

      I think Lifecycle will be a good choice.

  12. Hi, where do I implement this “wildcard”? There is no wildcard option in the S3 Lifecycle Configuration.

  13. Hello,

    Thanks for the information, but is there a way to customize when the lifecycle rule triggers, for example from midnight UTC to 4 pm UTC?

  14. Hi, and thanks for answering my question regarding the wildcard, but is there a way to change when the lifecycle rule runs or triggers? I know it always runs at midnight UTC, but can I change that?

  15. Thanks a lot for sharing this. Is it possible to set one lifecycle policy for multiple folders, or do I need to set a separate policy for each folder and subfolder?

  16. You can use only a single prefix in a rule.
    If you have many prefixes, you must duplicate the rule and change the prefix.
    You can find more info in the documentation.

    Of course, for all subfolders of s3_folder you can set a single main rule with the prefix s3_folder/; you don’t need to set a rule for each subfolder such as s3_folder/folder1/, s3_folder/folder2/, s3_folder/folder3/.
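    For example, one configuration with a separate rule per prefix might look like this minimal boto3 sketch (bucket name and prefixes are placeholders):

    import boto3

    s3 = boto3.client("s3")

    # One lifecycle configuration with one 30-day expiration rule per prefix.
    prefixes = ["FullBackup/", "DiffBackup/", "LogBackup/"]
    rules = [
        {
            "ID": f"delete-{prefix.rstrip('/')}-after-30-days",
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},
            "Expiration": {"Days": 30},
        }
        for prefix in prefixes
    ]
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-backup-bucket",
        LifecycleConfiguration={"Rules": rules},
    )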

  17. Hello,

    I want to generate a monthly report of data availability on a specific day.
    How can we do this? Are there any configurations or other ways to achieve it?

    Thanks in advance!

  18. It’s a bit unrelated to this article, but you can use ‘AWS Config’ for AWS resources. Maybe someday I will write an article about AWS Config; for now you can read about it on the AWS website https://aws.amazon.com/config/

    If you only mean S3, you can enable logging of what is happening on S3. But if you want to monitor something specific on S3, it’s probably best to use a Lambda, Athena, or an external tool. You can take a look at those.

  19. I need your help. Why is the test file I uploaded to the folder not deleted after 1 day? I set a rule on my specific sample test folder with a 1-day expiration for automatic deletion. Why is the test file not deleted? Thank you.

  20. I can’t see your configuration, so it’s hard to say; maybe there is some small syntax or configuration error. Or maybe everything is fine and you just have to wait.

    Amazon S3 runs lifecycle rules once every day. After the first time that Amazon S3 runs the rules, all objects that are eligible for expiration are marked for deletion. You’re no longer charged for objects that are marked for deletion.
    However, rules might take a few days to run before the bucket is empty because expiring object versions and cleaning up delete markers are asynchronous steps. For more information about this asynchronous object removal in Amazon S3, see Expiring objects.

  21. Hello. The files are successfully deleted; there is some delay, but they do get deleted. My problem is that after all the files have been deleted, the folder (object) I used for the lifecycle rule is also deleted. How do I retain that folder after all the files have been deleted? Thank you so much.

  22. You can create a folder with put_object:

    import boto3

    # A "folder" in S3 is just a 0-byte object whose key ends with '/'.
    client = boto3.client('s3')
    response = client.put_object(Bucket='test-745df33637-source-lambda-copy', Body='', Key='test-folder/')

    I checked that it works, but remember to add the extra permissions the Lambda function needs to create objects in S3.

  23. For example:
    I have a bucket named abc with folders and many subfolders like below.

    /Folder1/folder2/folder3/objects
    Folder1 and folder2 also contain some objects. My aim is to delete the objects under folder3 alone, without deleting folder3 itself. How do I achieve this using a prefix?

    I tried the prefixes below:
    Folder1/folder2/folder3/
    /Folder1/folder2/folder3/*
    /Folder1/folder2/folder3

    But most of the time, folder3 also gets deleted along with the objects inside it.

  24. Hi. When you create a folder in Amazon S3, S3 creates a 0-byte object with a key set to the folder name that you provided. If that object is deleted and there are no files using that path, the folder will also disappear.

  25. I need to delete files older than 100 days on a non-versioned bucket.
    The option with the lifecycle type “Expire current versions of objects” does not work.

    I modified the lifecycle to “Permanently delete noncurrent versions of objects”
    Days after objects become noncurrent – 100
    However, checking an old object’s properties does not show its expiration date:
    “Expiration rule
    You can use a lifecycle configuration to define expiration rules to schedule the removal of this object after a pre-defined time period.

    Expiration date
    The object will be permanently deleted on this date.
    -”

    I need to wait until midnight – then the rule is executed (so bad, AWS, so bad).

    Still, it should work for a non-versioned bucket:
    “Non-versioned bucket – The Expiration action results in Amazon S3 permanently removing the object.”
    from the doc:
    https://docs.aws.amazon.com/AmazonS3/latest/userguide/intro-lifecycle-rules.html?icmpid=docs_amazons3_console#intro-lifecycle-rules-actions

    PS. The AWS CLI does not show the execution time of a lifecycle rule – is there another way to get it?

  26. Why do you think that “Expire current versions of objects” doesn’t work?

    Amazon S3 runs lifecycle rules once every day. After the first time that Amazon S3 runs the rules, all objects that are eligible for expiration are marked for deletion. You’re no longer charged for objects that are marked for deletion.

    However, rules might take a few days to run before the bucket is empty because expiring object versions and cleaning up delete markers are asynchronous steps. For more information about this asynchronous object removal in Amazon S3, see Expiring objects – https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-expire-general-considerations.html

  27. It’s not applying the lifecycle rule because the version ID of the object is NULL.
    Why? Because at the beginning the bucket was non-versioned. Even after I enabled versioning, the version ID of the existing objects is still NULL.

    DOC:
    https://docs.aws.amazon.com/AmazonS3/latest/userguide/troubleshooting-versioning.html

    “When versioning is suspended on a bucket, Amazon S3 marks the version ID as NULL on newly created objects. An expiration action in a versioning-suspended bucket causes Amazon S3 to create a delete marker with NULL as the version ID. In a versioning-suspended bucket, a NULL delete marker is created for any delete request. These NULL delete markers are also called expired object delete markers when all object versions are deleted and only a single delete marker remains. If too many NULL delete markers accumulate, performance degradation in the bucket occurs.”
