Metrics
The Device API includes metrics representing the performance of the object lifecycle scanners.
Container Listing Metrics
Lists all the buckets in the container index of the service vault and identifies which have expiration rules.
| Metric | Type | Device API Key | Comments |
|---|---|---|---|
| Run Time | Gauge | lifecycle.containerListing.runTime | Number of milliseconds for which the current scan has been running. Because the container listing process runs on only a single IBM Accesser device per day, only one node should report a non-zero value for this metric each day. This gauge resets to zero at the start of each scanning interval if the previous scan completed. If the previous scan did not complete, the value keeps increasing. Once the scan completes, the value stays constant. |
| Cycle Start Time | Gauge | lifecycle.containerListing.cycleStartTime | Time in milliseconds since Epoch, corresponding to when the current scanning cycle started, but not necessarily when the scanning itself started. For example, if the IBM Accesser device reboots at 06:00, the cycle start time is 00:00 and the scan start time is approximately 06:00. |
| End Time | Gauge | lifecycle.containerListing.endTime | Time in milliseconds since Epoch, corresponding to when scanning is finished. This gauge resets to zero when the IBM Accesser device restarts and remains zero until a scan is completed. |
| Work items leased | Counter | lifecycle.containerListing.leasedWorkItems | The number of times this node has won the election to scan the container index in the service vault for containers that have enabled expiration rules. |
| Range creation work items created for Expiration | Counter | lifecycle.containerListing.containerWorkItemsCreated | Number of buckets to be scanned for all expiration-related lifecycles. |
| Range creation work items created for non-versioned object / current version expiration | Counter | lifecycle.containerListing.regularExpiration.containerWorkItemsCreated | Number of buckets to be scanned for non-versioned object / current version expiration. |
| Range creation work items created for non-current version and expired object delete marker expiration | Counter | lifecycle.containerListing.versionExpiration.containerWorkItemsCreated | Number of buckets to be scanned for non-current version and expired object delete marker expiration. |
| Work items created for Aborting Incomplete Multipart Uploads | Counter | lifecycle.containerListing.abortMpu.containerWorkItemsCreated | Number of buckets to be scanned for aborting incomplete multipart uploads. |
| work items finished | Counter | lifecycle.containerListing.finishedWorkItems | The number of times this node has finished scanning the container index for containers that have expiration policies. Note: There is a global election process, so a single IBM Accesser device runs a single scan each day. |
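
The single-node behavior described for the Run Time gauge can be checked with a short script. The following is a minimal sketch, assuming the `lifecycle.containerListing.runTime` value has already been collected from each node into a dictionary; the node names and the collection step are hypothetical and not part of the Device API.

```python
def nodes_running_container_listing(run_times_ms):
    """Return the nodes whose runTime gauge is non-zero, sorted by name."""
    return sorted(node for node, ms in run_times_ms.items() if ms > 0)


# Hypothetical sample: node name -> lifecycle.containerListing.runTime value
sample = {"accesser-1": 0, "accesser-2": 84213, "accesser-3": 0}

active = nodes_running_container_listing(sample)
if len(active) > 1:
    # More than one node reporting a scan contradicts the daily election
    print("unexpected: more than one node reports a scan:", active)
else:
    print("scanning node(s):", active)
```

An empty result can simply mean that no scan has started yet in the current interval.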
Lifecycle scanning range metrics
Scans a bucket and creates work items representing approximately 1000 objects to scan.
| Metric | Type | Device API Key | Comments |
|---|---|---|---|
| Run Time | Gauge | lifecycle.scanningRangeCreation.runTime | Number of milliseconds for which the current scan has been running. This gauge resets to zero at the start of each scanning interval if the previous scan completed. If the previous scan did not complete, the value keeps increasing. Once the scan completes, the value stays constant. |
| Cycle Start Time | Gauge | lifecycle.scanningRangeCreation.cycleStartTime | Time in milliseconds since Epoch, corresponding to when the current scanning cycle started, but not necessarily when the scanning itself started. For example, if the IBM Accesser device reboots at 06:00, the cycle start time is 00:00 and the scan start time is approximately 06:00. |
| End Time | Gauge | lifecycle.scanningRangeCreation.endTime | Time in milliseconds since Epoch, corresponding to when scanning is finished. This gauge resets to zero when the IBM Accesser device restarts and remains zero until a scan is completed. |
| work items leased | Counter | lifecycle.scanningRangeCreation.leasedWorkItems | Number of buckets to scan by this node. |
| work items finished | Counter | lifecycle.scanningRangeCreation.finishedWorkItems | Number of buckets finished scanning by this node. This is how many buckets with enabled expiry were analyzed by this node to create the work items to scan name ranges of 1,000 objects. |
| Scanning Ranges Created | Counter | lifecycle.scanningRangeCreation.rangesCreated | This is the amount of work this node created for the Lifecycle name index scanner. |
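
The Cycle Start Time and End Time gauges are reported in milliseconds since Epoch, and End Time stays at zero until a scan completes after a restart. The following is a minimal sketch of how a monitoring script might interpret such values; the function name and the sample value are illustrative assumptions, not part of the Device API.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(value_ms):
    """Convert an epoch-millisecond gauge to a timezone-aware UTC datetime.

    Returns None for 0, which for endTime means that no scan has
    completed since the device restarted.
    """
    if value_ms == 0:
        return None
    return datetime.fromtimestamp(value_ms / 1000, tz=timezone.utc)


# Hypothetical endTime sample: exactly one day after the Epoch
finished = epoch_ms_to_utc(86_400_000)
print(finished.isoformat() if finished else "no completed scan yet")
```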
Lifecycle name index scan metrics
Scans objects to find objects to be deleted.
| Metric | Type | Device API Key | Comments |
|---|---|---|---|
| Run Time | Gauge | lifecycle.nameIndexScan.runTime | Number of milliseconds for which the current scan has been running. This gauge resets to zero at the start of each scanning interval if the previous scan completed. If the previous scan did not complete, the value keeps increasing. Once the scan completes, the value stays constant. |
| Cycle Start Time | Gauge | lifecycle.nameIndexScan.cycleStartTime | Time in milliseconds since Epoch, corresponding to when the current scanning cycle started, but not necessarily when the scanning itself started. For example, if the IBM Accesser device reboots at 06:00, the cycle start time is 00:00 and the scan start time is approximately 06:00. |
| End Time | Gauge | lifecycle.nameIndexScan.endTime | Time in milliseconds since Epoch, corresponding to when scanning is finished. This gauge resets to zero when the IBM Accesser device restarts and remains zero until a scan is completed. |
| work items leased | Counter | lifecycle.nameIndexScan.leasedWorkItems | The number of name scan ranges leased. |
| Objects to expire | Counter | lifecycle.nameIndexScan.expireObjects | Number of objects this node found to expire. |
| Bytes to expire | Counter | lifecycle.nameIndexScan.expireBytes | Number of bytes this node found to expire. |
| Objects scanned | Counter | lifecycle.nameIndexScan.objectsScanned | Number of objects scanned by this node. Important: Use this metric to compute the scanning rate of a node. |
| Scanning Ranges Processed/ Work completed | Counter | lifecycle.nameIndexScan.finishedWorkItems | The counter of processed work items. It is the consumption count for the index range scanner's output count. You can compare these values to see how far along the name index scanner is in the day's work. |
| Objects to expire over limit | Gauge | lifecycle.nameIndexScan.expireObjectsOverLimit | Number of objects found by the node to expire, but were discarded because they are over the limit (device limit or per-account limit). |
| Objects to expire in future | Gauge | lifecycle.nameIndexScan.expireObjectsDayX | Predicted number of objects identified to expire in day (X) in future. X = [1, 2, 3, 4, 5, 6, 7 and 8to14]. Note: 8to14 represents the objects predicted to expire in the following week. |
| Objects to expire over the limit in future | Gauge | lifecycle.nameIndexScan.expireObjectsOverLimitDayX | Predicted number of objects identified to expire in day (X) in future, but are over the limit (device limit or per-account limit). X = [1, 2, 3, 4, 5, 6, 7 and 8to14]. Note: 8to14 represents the objects predicted to expire in the following week. |
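
The table suggests two derived values: a scanning rate computed from `objectsScanned`, and a progress figure obtained by comparing `finishedWorkItems` with the scanning ranges created. The following is a minimal sketch of both calculations. It assumes that two samples of the counter were taken a known number of seconds apart with no device restart in between, and that the range counts are totals summed across nodes; the function names and sample values are hypothetical.

```python
def scan_rate(objects_before, objects_after, seconds_between):
    """Objects scanned per second between two samples of objectsScanned."""
    if seconds_between <= 0:
        raise ValueError("seconds_between must be positive")
    return (objects_after - objects_before) / seconds_between


def scan_progress(ranges_finished, ranges_created):
    """Fraction of created scanning ranges that have been processed."""
    if ranges_created == 0:
        return 0.0  # nothing has been created yet, so nothing to report
    return min(ranges_finished / ranges_created, 1.0)


# Hypothetical samples taken 60 seconds apart
print(scan_rate(1000, 4000, 60))   # objects per second
print(scan_progress(250, 1000))    # fraction of the day's ranges processed
```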
Space Reclamation metrics
| Metric | Type | Device API Key | Comments |
|---|---|---|---|
| Run Time | Gauge | lifecycle.expirationSpaceReclamation.runTime | Number of milliseconds for which the current scan has been running. This gauge resets to zero at the start of each scanning interval if the previous scan completed. If the previous scan did not complete, the value keeps increasing. Once the scan completes, the value stays constant. |
| Cycle Start Time | Gauge | lifecycle.expirationSpaceReclamation.cycleStartTime | Time in milliseconds since Epoch, corresponding to when the current reclamation cycle started, but not necessarily when the scanning itself started. For example, if the IBM Accesser device reboots at 06:00, the cycle start time is 00:00 and the scan start time is approximately 06:00. |
| End Time | Gauge | lifecycle.expirationSpaceReclamation.endTime | Time in milliseconds since Epoch, corresponding to when reclamation is finished. This gauge resets to zero when the IBM Accesser device restarts and will remain zero until a scan is completed. |
| objects deleted | Counter | lifecycle.expirationSpaceReclamation.objectsDeleted | Objects deleted. |
| bytes deleted | Counter | lifecycle.expirationSpaceReclamation.bytesDeleted | Bytes deleted. |
| delete_markers_created | Counter | lifecycle.expirationSpaceReclamation.deleteMarkersCreated | Number of delete markers created as part of expiration of versioned objects. |
| work items leased | Counter | lifecycle.expirationSpaceReclamation.leasedWorkItems | The number of work items leased. This equals the sum of the objects deleted and the delete failure metrics. This is approximately the number of deletes attempted, excluding retries. |
| work items finished | Counter | lifecycle.expirationSpaceReclamation.finishedWorkItems | The number of work items completed. This is approximately the number of deletes attempted, excluding retries. |
| objects not found | Counter | lifecycle.expirationSpaceReclamation.objectDeleteExceptions.notFound | Deletes that failed because the object could not be found, most likely because it was deleted by a user before it could be expired. |
| versions not found | Counter | lifecycle.expirationSpaceReclamation.objectDeleteExceptions.versionNotFound | Version deletes that failed because the object version to delete could not be found. |
| policy prevented delete | Counter | lifecycle.expirationSpaceReclamation.objectDeleteExceptions.lifecyclePrecondition | Deletes that failed because the lifecycle policy no longer indicates that the object is to be deleted, most likely because the policy was changed, or the object was overwritten. |
| I/O failure during delete | Counter | lifecycle.expirationSpaceReclamation.objectDeleteExceptions.objectIo | Deletes that failed due to internal I/O errors. |
| protected objects not deletable | Counter | lifecycle.expirationSpaceReclamation.objectDeleteExceptions.protected | Deletes that failed because the object is protected by a retention policy or legal hold. |
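
The delete failure counters share a common key prefix, so they can be summarized together. The following is a minimal sketch, assuming the metrics have already been retrieved into a flat dictionary keyed by Device API key; that dictionary shape, the function name, and the sample values are assumptions for illustration.

```python
PREFIX = "lifecycle.expirationSpaceReclamation.objectDeleteExceptions."
REASONS = ["notFound", "versionNotFound", "lifecyclePrecondition",
           "objectIo", "protected"]


def delete_failures(metrics):
    """Return {reason: count} for the delete failure counters.

    Keys that are missing from the input are reported as 0.
    """
    return {reason: metrics.get(PREFIX + reason, 0) for reason in REASONS}


# Hypothetical metrics snapshot from one node
snapshot = {PREFIX + "notFound": 12, PREFIX + "objectIo": 3}
failures = delete_failures(snapshot)
print(failures, "total:", sum(failures.values()))
```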
Version Scanning Range metrics
| Metric | Type | Device API Key | Comments |
|---|---|---|---|
| Run Time | Gauge | lifecycle.versionScanningRangeCreation.runTime | Number of milliseconds for which the current scan has been running. This gauge resets to zero at the start of each scanning interval if the previous scan completed. If the previous scan did not complete, the value keeps increasing. Once the scan completes, the value stays constant. |
| Cycle Start Time | Gauge | lifecycle.versionScanningRangeCreation.cycleStartTime | Time in milliseconds since Epoch, corresponding to when the current scanning cycle started, but not necessarily when the scanning itself started. For example, if the IBM Accesser device reboots at 06:00, the cycle start time is 00:00 and the scan start time is approximately 06:00. |
| work items leased | Counter | lifecycle.versionScanningRangeCreation.leasedWorkItems | Number of buckets to scan by this node. |
| work items finished | Counter | lifecycle.versionScanningRangeCreation.finishedWorkItems | Number of buckets finished scanning by this node. This is how many versioned buckets with enabled expiry were analyzed by this node to create the work items to scan version index ranges. |
| Scanning Range Created | Counter | lifecycle.versionScanningRangeCreation.rangesCreated | This is the amount of work this node created for the Version Index scanner. |
Version Index Scan metrics
| Metric | Type | Device API Key | Comments |
|---|---|---|---|
| Run Time | Gauge | lifecycle.versionIndexScan.runTime | Number of milliseconds for which the current scan has been running. This gauge resets to zero at the start of each scanning interval if the previous scan completed. If the previous scan did not complete, the value keeps increasing. Once the scan completes, the value stays constant. |
| Cycle Start Time | Gauge | lifecycle.versionIndexScan.cycleStartTime | Time in milliseconds since Epoch, corresponding to when the current scanning cycle started, but not necessarily when the scanning itself started. For example, if the IBM Accesser device reboots at 06:00, the cycle start time is 00:00 and the scan start time is approximately 06:00. |
| End Time | Gauge | lifecycle.versionIndexScan.endTime | Time in milliseconds since Epoch, corresponding to when scanning is finished. This gauge resets to zero when the IBM Accesser device restarts and will remain zero until a scan is completed. |
| work items leased | Counter | lifecycle.versionIndexScan.leasedWorkItems | The number of version index ranges leased. |
| work items finished | Counter | lifecycle.versionIndexScan.finishedWorkItems | The counter of processed work items. It is the consumption count for the version index range scanner's output count. You can compare these values to see how far along the version index scanner is in the day's work. |
| Versioned Objects scanned | Counter | lifecycle.versionIndexScan.objectsScanned | Number of versioned objects scanned by this node. |
| Expired versioned objects | Counter | lifecycle.versionIndexScan.expireObjects | Number of versioned objects that were evaluated as due to expire the next day. |
| Expired versioned objects bytes | Counter | lifecycle.versionIndexScan.expireBytes | Number of bytes corresponding to the versioned objects that were evaluated as due to expire the next day. |
| Versioned objects to expire over limit | Gauge | lifecycle.versionIndexScan.expireVersionedObjectsOverLimit | Number of versioned objects found by the node to expire, but were discarded as they are over the limit (device limit or per-account limit). |
| Versioned objects to expire in future | Gauge | lifecycle.versionIndexScan.expireVersionedObjectsDayX | Predicted number of versioned objects identified to expire in day (X) in future. X = [1, 2, 3, 4, 5, 6, 7 and 8to14]. Note: 8to14 represents the versioned objects predicted to expire in the following week. |
| Versioned objects to expire over the limit in future | Gauge | lifecycle.versionIndexScan.expireVersionedObjectsOverLimitDayX | Predicted number of versioned objects identified to expire in day (X) in future, but are over the limit (device limit or per-account limit). X = [1, 2, 3, 4, 5, 6, 7 and 8to14]. Note: 8to14 represents the versioned objects predicted to expire in the following week. |
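
The `DayX` keys stand for a family of gauges, one per value of X. The following is a minimal sketch that expands such a key into its members. It assumes that the concrete key is formed by substituting the value directly for `X` (for example `...Day1` and `...Day8to14`); verify that spelling against the keys your device actually reports, because it is an assumption here.

```python
# Values of X listed in the tables: 1 through 7, plus the 8to14 bucket
DAY_SUFFIXES = [str(day) for day in range(1, 8)] + ["8to14"]


def day_keys(base_key):
    """Expand a key ending in 'DayX' into one key per value of X."""
    if not base_key.endswith("DayX"):
        raise ValueError("expected a key ending in 'DayX'")
    stem = base_key[: -len("X")]  # keep the 'Day' part, drop the placeholder
    return [stem + suffix for suffix in DAY_SUFFIXES]


for key in day_keys("lifecycle.versionIndexScan.expireVersionedObjectsDayX"):
    print(key)
```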
Multipart Upload Scanning metrics
| Metric | Type | Device API Key | Comments |
|---|---|---|---|
| Run Time | Gauge | lifecycle.mpuScanning.runTime | Number of milliseconds for which the current scan has been running. This gauge resets to zero at the start of each scanning interval if the previous scan completed. If the previous scan did not complete, the value keeps increasing. Once the scan completes, the value stays constant. |
| Cycle Start Time | Gauge | lifecycle.mpuScanning.cycleStartTime | Time in milliseconds since Epoch, corresponding to when the current scanning cycle started, but not necessarily when the scanning itself started. For example, if the IBM Accesser device reboots at 06:00, the cycle start time is 00:00 and the scan start time is approximately 06:00. |
| End Time | Gauge | lifecycle.mpuScanning.endTime | Time in milliseconds since Epoch, corresponding to when scanning is finished. This gauge resets to zero when the IBM Accesser device restarts and will remain zero until a scan is completed. |
| Work items leased | Counter | lifecycle.mpuScanning.leasedWorkItems | The number of Incomplete Multipart Upload Transactions leased. |
| Work items finished | Counter | lifecycle.mpuScanning.finishedWorkItems | The number of incomplete Multipart Upload Transactions which were successfully evaluated after leasing. |
| Scanned transactions | Counter | lifecycle.mpuScanning.scannedTransactions | Number of incomplete Multipart Upload Transactions which were identified and evaluated against the policy. |
| Expired transactions | Counter | lifecycle.mpuScanning.expiredTransactions | Number of incomplete Multipart Upload Transactions which were evaluated against the policy and determined to have expired and need to be aborted. |
| Expired parts | Counter | lifecycle.mpuScanning.expiredParts | Number of Incomplete Multipart Upload parts which have been evaluated to have expired and have to be deleted. |
| Expired bytes | Counter | lifecycle.mpuScanning.expiredBytes | Total bytes corresponding to the Incomplete Multipart Upload parts which have been evaluated to have expired and have to be reclaimed. |
| Expired MPU Parts over limit | Gauge | lifecycle.mpuScanning.expiredPartsOverLimit | Number of MPU Parts found by the node to expire, but were discarded as they are over the limit (device limit or per-account limit). |
| Expired MPU Parts daily | Gauge | lifecycle.mpuScanning.expiredPartsDayX | Predicted number of MPU parts identified to expire in day (X) in future. X = [1, 2, 3, 4, 5, 6, 7 and 8to14]. Note: 8to14 represents the MPU Parts predicted to expire in the following week. |
| Expired MPU parts over limit in future | Gauge | lifecycle.mpuScanning.expiredPartsOverLimitDayX | Predicted number of MPU parts identified to expire in day (X) in future, but are over the limit (device limit or per-account limit). X = [1, 2, 3, 4, 5, 6, 7 and 8to14]. Note: 8to14 represents the MPU Parts predicted to expire in the following week. |
Multipart Upload Reclamation metrics
| Metric | Type | Device API Key | Comments |
|---|---|---|---|
| Run Time | Gauge | lifecycle.mpuSpaceReclamation.runTime | Number of milliseconds for which the current scan has been running. This gauge resets to zero at the start of each scanning interval if the previous scan completed. If the previous scan did not complete, the value keeps increasing. Once the scan completes, the value stays constant. |
| Cycle Start Time | Gauge | lifecycle.mpuSpaceReclamation.cycleStartTime | Time in milliseconds since Epoch, corresponding to when the current scanning cycle started, but not necessarily when the scanning itself started. For example, if the IBM Accesser device reboots at 06:00, the cycle start time is 00:00 and the scan start time is approximately 06:00. |
| End Time | Gauge | lifecycle.mpuSpaceReclamation.endTime | Time in milliseconds since Epoch, corresponding to when scanning is finished. This gauge resets to zero when the IBM Accesser device restarts and will remain zero until a scan is completed. |
| Work items leased | Counter | lifecycle.mpuSpaceReclamation.leasedWorkItems | The number of Incomplete Multipart Upload Transactions leased. |
| Work items finished | Counter | lifecycle.mpuSpaceReclamation.finishedWorkItems | The number of incomplete Multipart Upload Transactions which were successfully evaluated after leasing. |
| Incomplete MPU transactions deleted | Counter | lifecycle.mpuSpaceReclamation.transactionsDeleted | Number of Expired incomplete Multipart Upload Transactions which were successfully aborted. |
| Incomplete MPU transactions bytes deleted | Counter | lifecycle.mpuSpaceReclamation.transactionsBytesDeleted | Number of bytes corresponding to Expired incomplete Multipart Upload Transactions which were successfully aborted. |
| Incomplete MPU parts deleted | Counter | lifecycle.mpuSpaceReclamation.transactionsPartsDeleted | Number of Multipart Upload parts corresponding to incomplete transactions which have been successfully deleted. |
| I/O failure during MPU Abort | Counter | lifecycle.mpuSpaceReclamation.mpuAbortExceptions.objectIo | Aborts of Incomplete Multipart Upload parts that failed due to internal I/O errors. |
| Policy prevented delete during MPU Abort | Counter | lifecycle.mpuSpaceReclamation.mpuAbortExceptions.lifecyclePrecondition | Abort of Incomplete MPU that failed because the lifecycle policy no longer indicates that the MPU has to be aborted, most likely because the policy was changed. |
| Incomplete MPUs not found | Counter | lifecycle.mpuSpaceReclamation.mpuAbortExceptions.notFound | Abort of Incomplete MPU that failed because the identified incomplete MPU was already aborted by the user. |
Replication metrics
| Metric | Type | Device API Key | Comments |
|---|---|---|---|
| Replicated bytes | Counter | replicationAgent.consumer.contentSyncBytesProcessed | Total object bytes processed by background replication agent |
| Networking issues | Counter | replicationAgent.consumer.networkingIssues | Total number of networking issues encountered while replicating objects |
| Throttled replication requests | Counter | replicationAgent.consumer.requestsOutboundThrottled | Total number of outbound replication requests that were throttled at source side |
| Internal replication failures | Counter | replicationAgent.consumer.syncInternalFailures | Total number of internal failures encountered while replicating objects |
| Bytes queued for replication | Counter | replicationAgent.producer.contentSyncBytesQueued | Total number of object bytes that were queued for replication (as a result of user uploads) |
| Account-throttled requests | Counter | replicationAgent.producer.requestsAccountThrottled | Total number of replication requests that were throttled due to account limits |
| Replication queuing latency | Summary | replicationAgent.producer.workItemInsertLatency | Running average of replication queuing latency |
| Replication queuing failures | Counter | replicationAgent.producer.workItemQueueInsertFailures | Total number of failures encountered when queuing replications |
| Replications queued | Counter | replicationAgent.producer.workItemsQueued | Total number of replications queued |
| Fast replications removed | Counter | replicationAgent.scheduler.finishedWorkItems | Total number of replications removed from fast queue |
| Replication work lease latency | Summary | replicationAgent.scheduler.workItemLeaseLatency | Windowed average of replication work lease latency |
| Vault replication network errors | Counter | replicationVault.{vault-id}.network | Total number of replication network errors for vault {vault-id} |
| Vault replication sync delays over limit | Counter | replicationVault.{vault-id}.syncDelayOverLimit | Total number of replication latencies over limit for vault {vault-id} |
| Vault replication system errors | Counter | replicationVault.{vault-id}.system | Total number of replication system errors for vault {vault-id} |
| Vault replication user errors | Counter | replicationVault.{vault-id}.user | Total number of replication user errors for vault {vault-id} |