Workflow - Using the Cloud Storage Object API

System Admin

Note: Replication can only be enabled on Container Vaults, which requires Enabling Container Mode on the system.

To enable replication for bucket users, an operator with the System Admin role must first enable it on the Container Vault through these steps:

  1. Enable Replication on the Storage Pool
    1. (Optional) Configure replication rate limits
    2. (Optional) Configure custom rate distribution across the Access Pools where vaults from this Storage Pool are deployed. By default, the rates are distributed evenly across all Access Pools.
    3. (Optional) Configure Certificate PEM. See Configuring replication for various network setups.
  2. Enable Replication on Container Vaults within the Storage Pool
    • (Optional) Configure Replication Endpoint. This endpoint is used by Cloud Object Storage when servicing background replication sync operations to this vault.
    • If no endpoint is specified, the replication agent syncs to the destination Access Pool devices by using their local IP addresses. This does not work if there is a network partition between the source and destination Access Pools; in that case, an operator must specify an endpoint that resolves to one or more correct destination Access Pools.
    • (Optional) Configure Sync Latency Alerting Threshold. When many replication syncs take longer than this latency value to complete, an alert is generated. See the Sync Latency Alert Event.

Bucket Owner

Once Replication is enabled for a Vault, users are allowed to enable replication for buckets within that Vault. This is done by using the S3 PutBucketReplication API.

A few conditions must be met before replication can be successfully configured on a bucket:

  • Network connectivity between the source and destination buckets (that is, between the Access Pools where the respective vaults are deployed).
  • Only bucket owners are permitted to configure replication policies.
  • Object versioning must be enabled on both source and destination buckets.
  • Destination bucket owner must grant bucket WRITE permission (by means of ACL) to the source bucket owner.

All of these conditions are required for the Cloud Object Storage system to successfully replicate objects in the background. Therefore, they are checked as part of the PutBucketReplication API call to help ensure that the user has properly prepared the source and destination buckets for replication.
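The conditions above can be expressed as a client-side pre-flight check. The following is a minimal sketch only: `BucketState`, `replication_precondition_errors`, and the `network_ok` flag are hypothetical names invented for illustration, and the system's own validation inside PutBucketReplication may differ.

```python
from dataclasses import dataclass, field


@dataclass
class BucketState:
    """Hypothetical snapshot of the bucket properties relevant to replication."""
    owner: str
    versioning_enabled: bool
    # Users that hold bucket WRITE permission through the bucket ACL.
    write_grantees: set = field(default_factory=set)


def replication_precondition_errors(caller, source, dest, network_ok=True):
    """Return a list of unmet conditions; an empty list means all checks pass."""
    errors = []
    if not network_ok:
        errors.append("no network connectivity between source and destination")
    if caller != source.owner:
        errors.append("caller is not the source bucket owner")
    if not source.versioning_enabled:
        errors.append("versioning is not enabled on the source bucket")
    if not dest.versioning_enabled:
        errors.append("versioning is not enabled on the destination bucket")
    if source.owner not in dest.write_grantees:
        errors.append("source bucket owner lacks WRITE on the destination bucket")
    return errors
```

A caller would populate the two `BucketState` values from its own knowledge of the buckets and refuse to submit the replication policy while the returned list is non-empty.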

The bucket containing the replication configuration is the replication source bucket (where objects are being copied from).

A Replication Policy/Configuration consists of a set of rules that each specify:

  • Destination bucket
    • Where objects are replicated to
  • Priority
    • Used to decide which rule is applied when multiple rules apply to a given object. The rule with the higher priority wins.
  • Status
    • Whether this rule is enabled/disabled. Disabled rules are ignored.
  • Delete Marker Replication
    • Whether delete markers will be replicated
  • Filter
    • Used to decide which objects are subject to replication. Can specify an object name prefix and/or tags (multiple tags are allowed).
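As an illustration of the rule fields listed above, the sketch below builds a replication configuration as a Python dictionary. It assumes the field names and filter layout of the AWS S3 ReplicationConfiguration structure; whether this system accepts exactly that layout, and what format the destination bucket identifier takes, are not stated here and should be checked against the API reference. `make_rule` is a hypothetical helper.

```python
def make_rule(rule_id, destination_bucket, priority, enabled=True,
              replicate_delete_markers=False, prefix=None, tags=None):
    """Build one replication rule (assumed AWS S3 ReplicationConfiguration shape)."""
    tag_list = [{"Key": k, "Value": v} for k, v in sorted((tags or {}).items())]

    # Assumed S3 filter grammar: a lone prefix or a lone tag sits directly in
    # "Filter"; any combination is wrapped in an "And" element.
    if not tag_list:
        rule_filter = {"Prefix": prefix if prefix is not None else ""}
    elif prefix is None and len(tag_list) == 1:
        rule_filter = {"Tag": tag_list[0]}
    else:
        and_clause = {"Tags": tag_list}
        if prefix is not None:
            and_clause["Prefix"] = prefix
        rule_filter = {"And": and_clause}

    return {
        "ID": rule_id,
        "Priority": priority,
        "Status": "Enabled" if enabled else "Disabled",
        "DeleteMarkerReplication": {
            "Status": "Enabled" if replicate_delete_markers else "Disabled"
        },
        "Filter": rule_filter,
        # Passed through as-is; the identifier format is deployment-specific.
        "Destination": {"Bucket": destination_bucket},
    }


# Note: AWS S3 also requires a top-level "Role" field. Whether this system
# expects one is not stated in the text, so it is omitted here.
replication_configuration = {
    "Rules": [
        make_rule("all-objects", "backup-bucket", priority=1),
        make_rule("logs", "log-archive", priority=5, prefix="logs/",
                  tags={"team": "ops"}, replicate_delete_markers=True),
    ]
}
```

With an S3 client such as boto3, a dictionary of this shape would typically be passed as the `ReplicationConfiguration` argument of `put_bucket_replication`, assuming the endpoint accepts the same structure.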

Once a bucket is configured with a Replication policy, all new object write operations into this source bucket are evaluated against the bucket's Replication policy to determine whether they should be replicated. The following operations are replicated:

  • New object write
    • Any API that creates a new object version (PutObject, PostObject). For multipart uploads, the terminal CompleteMultipartUpload operation triggers replication.
  • Delete marker creation
    • New delete markers are created when a user calls DeleteObject (without a version ID) on versioned buckets.
    • A delete marker is replicated only when the rule has delete marker replication enabled.
  • Object tag update
  • Object lock retention updates
  • Object lock legal hold updates
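The evaluation described above (ignore disabled rules, match the filter, let the highest priority win) can be sketched as follows. This is an illustration, not the system's implementation: the rule dictionaries follow the AWS S3 ReplicationConfiguration layout as an assumption, `rule_matches` and `select_rule` are hypothetical names, and checking the delete marker setting only on the winning rule is also an assumption.

```python
def rule_matches(rule, key, tags):
    """True if the rule is enabled and its prefix/tag filter matches the object."""
    if rule["Status"] != "Enabled":
        return False  # disabled rules are ignored
    flt = rule.get("Filter", {})
    cond = flt.get("And", flt)  # combined filters live under "And"
    wanted = list(cond.get("Tags", []))
    if "Tag" in cond:
        wanted.append(cond["Tag"])
    if not key.startswith(cond.get("Prefix", "")):
        return False
    return all(tags.get(t["Key"]) == t["Value"] for t in wanted)


def select_rule(rules, key, tags, is_delete_marker=False):
    """Return the single rule to enforce for this operation, or None."""
    candidates = [r for r in rules if rule_matches(r, key, tags)]
    if not candidates:
        return None
    winner = max(candidates, key=lambda r: r["Priority"])
    # Assumption: the delete marker setting is checked on the winning rule only.
    if is_delete_marker and winner["DeleteMarkerReplication"]["Status"] != "Enabled":
        return None
    return winner


SAMPLE_RULES = [
    {"ID": "all", "Priority": 1, "Status": "Enabled", "Filter": {"Prefix": ""},
     "DeleteMarkerReplication": {"Status": "Disabled"},
     "Destination": {"Bucket": "backup-a"}},
    {"ID": "logs", "Priority": 5, "Status": "Enabled", "Filter": {"Prefix": "logs/"},
     "DeleteMarkerReplication": {"Status": "Enabled"},
     "Destination": {"Bucket": "backup-b"}},
    {"ID": "off", "Priority": 9, "Status": "Disabled", "Filter": {"Prefix": ""},
     "DeleteMarkerReplication": {"Status": "Enabled"},
     "Destination": {"Bucket": "backup-c"}},
]
```

With these sample rules, an object named `logs/a.txt` is handled by the `logs` rule because it has the highest priority among the enabled rules that match, while the disabled `off` rule never applies despite its priority of 9.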

When an operation is eligible for replication, a replication work item is inserted into an internal queue (maintained per container vault) for background processing. The Access Pools in which the container vault is deployed are responsible for processing these work items. A replication agent, which runs on each Accesser device within these pools, processes each work item and deletes it once the change is successfully replicated. A work item is executed by reading the object from the source bucket and replicating the object content or metadata (depending on the original operation) to the destination bucket.

When a version is replicated, the replicas have the same metadata as the source, which includes properties such as:

  • Version ID
  • ETag
  • Last modified time
  • Object tags
  • Object lock retention / legal hold
  • Content metadata (Content-* headers)
  • User-defined metadata (x-amz-meta-* headers)

Since only a single rule is enforced on each user operation, each user operation results in at most one background sync work item being inserted.

On the destination bucket, a replica object can be read, deleted, or modified in the same manner as any other user-generated object. However, manual modifications to replicas may cause them to go out of sync with the source bucket: for a given piece of metadata (tags, retention, or legal hold), a newer update on the replica (based on the modification timestamp) prevents older (“stale”) updates from the source from being synced. Similarly, if a replica version is permanently deleted (for example, with DeleteObjectVersion), any future updates to that version on the source are not synced over because there is no longer a replica to update.
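The stale-update behavior for replica metadata can be sketched as a timestamp comparison. This is a simplified illustration under assumptions: `should_apply_metadata_sync` is a hypothetical name, timestamps are compared per metadata field, and a source update with an equal timestamp is treated as stale, which the text does not specify.

```python
from datetime import datetime, timezone


def should_apply_metadata_sync(replica_exists, replica_modified, source_modified):
    """Decide whether a metadata update from the source is applied to the replica.

    replica_modified: when this metadata (tags, retention, or legal hold)
                      was last changed on the replica.
    source_modified:  when the update being synced was made on the source.
    """
    if not replica_exists:
        # The replica version was permanently deleted: nothing left to update.
        return False
    # Assumption: only strictly newer source updates win.
    return source_modified > replica_modified


source_update = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
manual_replica_edit = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)

# A manual edit on the replica made after the source update blocks the sync.
blocked = not should_apply_metadata_sync(True, manual_replica_edit, source_update)
```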

Synchronization work items are eligible for execution as soon as they are created, but various factors may delay or prevent their successful execution. Causes include temporary outages of the source or target bucket, loss of connectivity, lack of permissions, or deletion of the source object before it is synced. Users can list work items that are delayed or cannot be processed with the ListReplicationFailures API call. These work items continue to be reattempted daily for 30 days.

Once a failed replication has gone 30 days without succeeding, the system stops retrying it daily. However, users can trigger a reattempt of all failures within their bucket by using a dedicated API; see Schedule reattempt of failed replications in a bucket.
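The retry window can be pictured with a small date calculation. The sketch below is illustrative only: `still_retried_automatically` is a hypothetical helper, and measuring the 30 days from the first failure with an exclusive boundary is an assumption, since the exact schedule is not specified here.

```python
from datetime import datetime, timedelta, timezone

RETRY_WINDOW = timedelta(days=30)  # daily reattempts continue for 30 days


def still_retried_automatically(first_failure, now):
    """True while a failed work item is still inside the automatic retry window."""
    return now - first_failure < RETRY_WINDOW


first_failure = datetime(2024, 3, 1, tzinfo=timezone.utc)
in_window = still_retried_automatically(
    first_failure, first_failure + timedelta(days=10))
expired = not still_retried_automatically(
    first_failure, first_failure + timedelta(days=45))
```

An item for which the function returns False is no longer retried by the system on its own and would need a user-triggered reattempt.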