Amazon Introduces S3 Batch Replication to Replicate Existing Objects
Amazon recently introduced Batch Replication for S3, an option to replicate existing objects and synchronize buckets. The new feature is designed for use cases such as setting up disaster recovery, reducing latency, or transferring ownership of existing data.
While S3 Replication has been available since 2015, until now, customers had to develop their own solutions to copy objects created before the replication rule was configured. Additionally, manually copying objects between buckets did not preserve metadata such as version ID or object creation time.
Marcia Villalba, senior developer advocate at AWS, highlights key use cases for the new feature:
Customers may wish to copy their data to a new AWS Region for a disaster recovery setup. (…) Another reason for copying existing data comes from growing organizations around the world. (…) Another common use case we see is customers going through mergers and acquisitions where they need to transfer ownership of existing data from one AWS account to another.
S3 Batch Replication can be used to replicate existing objects, objects that were added to a bucket before a replication rule was configured, objects that previously failed to replicate due to insufficient permissions, objects that have already been replicated to another destination bucket, and replicas of objects created by a replication rule.
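A Batch Replication job is submitted through the S3 Control API and can let S3 generate the manifest of eligible objects automatically. The sketch below assembles the request arguments for such a job using boto3; the account ID, bucket ARNs, and IAM role ARN are placeholder values.

```python
# Hypothetical sketch of an S3 Batch Replication job request (boto3 S3 Control
# API). All identifiers below are placeholders, not real resources.

def build_batch_replication_job(account_id, source_bucket_arn,
                                report_bucket_arn, role_arn):
    """Assemble create_job arguments for a Batch Replication job that asks
    S3 to generate the manifest of replication-eligible objects."""
    return {
        "AccountId": account_id,
        "ConfirmationRequired": True,
        # S3ReplicateObject is the Batch Operations replication operation.
        "Operation": {"S3ReplicateObject": {}},
        "Priority": 1,
        "RoleArn": role_arn,
        # Completion report listing only the objects that failed to replicate.
        "Report": {
            "Bucket": report_bucket_arn,
            "Format": "Report_CSV_20180820",
            "Enabled": True,
            "ReportScope": "FailedTasksOnly",
        },
        # Let S3 build the manifest instead of supplying one (incurs the
        # optional manifest generation fee mentioned below).
        "ManifestGenerator": {
            "S3JobManifestGenerator": {
                "SourceBucket": source_bucket_arn,
                "EnableManifestOutput": False,
                "Filter": {"EligibleForReplication": True},
            }
        },
    }

# Usage with boto3 (not executed here; requires AWS credentials):
# import boto3
# s3control = boto3.client("s3control")
# job = s3control.create_job(**build_batch_replication_job(
#     "111122223333",
#     "arn:aws:s3:::example-source-bucket",
#     "arn:aws:s3:::example-report-bucket",
#     "arn:aws:iam::111122223333:role/example-batch-replication-role"))
```

The replication rules themselves still come from the bucket's replication configuration; the job only selects which existing objects to push through them.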
Paul Meighan, senior manager at AWS, summarizes in a tweet:
Amazon S3 batch replication lets you easily populate a newly created bucket with existing objects, retry objects that were previously unable to replicate, migrate data between accounts, or add new buckets to your data lake.
The article “Replicating existing objects between S3 buckets” has been updated to reflect the latest functionality, with Akhil Aendapally, senior solutions architect at AWS, and Steven Dolan, enterprise support manager at AWS, recommending:
To monitor the replication status of your existing objects, configure Amazon S3 inventory on the source bucket at least 48 hours before enabling replication.
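The recommendation above amounts to turning on an S3 Inventory report on the source bucket that includes the replication status of each object. A minimal sketch of such a configuration with boto3 follows; the bucket names, ARNs, and account ID are placeholders.

```python
# Hypothetical sketch: a daily S3 Inventory configuration that includes the
# ReplicationStatus field, so the replication state of existing objects can
# be monitored. All identifiers are placeholders.

def build_inventory_configuration(destination_bucket_arn,
                                  destination_account_id):
    """Assemble an S3 Inventory configuration producing a daily CSV report
    with per-object replication status."""
    return {
        "Destination": {
            "S3BucketDestination": {
                "AccountId": destination_account_id,
                "Bucket": destination_bucket_arn,
                "Format": "CSV",
                "Prefix": "inventory",
            }
        },
        "IsEnabled": True,
        "Id": "replication-status-inventory",
        "IncludedObjectVersions": "Current",
        # ReplicationStatus is the field that makes the report useful here.
        "OptionalFields": ["ReplicationStatus", "LastModifiedDate"],
        "Schedule": {"Frequency": "Daily"},
    }

# Usage with boto3 (not executed here; requires AWS credentials):
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_inventory_configuration(
#     Bucket="example-source-bucket",
#     Id="replication-status-inventory",
#     InventoryConfiguration=build_inventory_configuration(
#         "arn:aws:s3:::example-inventory-bucket", "111122223333"))
```

Because inventory reports are generated on a daily or weekly schedule, configuring this ahead of time, as the 48-hour guidance suggests, ensures a report exists before replication is enabled.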
Corey Quinn, cloud economist at The Duckbill Group, warns in his newsletter:
This is a remarkably strong candidate for what could potentially be “the most expensive API call in all of AWS.” Be careful with this one!
In addition to the cost of storing replicated data in the destination bucket, customers are charged replication fees, data transfer fees, batch operations fees, optional manifest generation fees, and Key Management Service (KMS) costs. S3 Batch Replication is available in all AWS Regions.