Implementing runtime retention policy

The data retention policy governs the retention and deletion of records based on specified parameters. It preserves a designated number of application and kernel records for each instance ID. OCP administrators can configure and implement a runtime retention policy in the spark-hb-runtime-retention-policy ConfigMap.

Configuration

OCP administrators must configure the following:

The number of applications and kernels to retain for a particular instance. Additionally, you must determine whether the instance falls under 'project', 'space', or 'service_instance', based on whether it was created by a runtime project, a runtime space, or an external service instance, respectively.

After you complete the configuration, a cronjob enforces the runtime retention policy. The cronjob retrieves the JSON file from the ConfigMap and manages records according to the rules that the file defines. The cronjob runs once a day, at midnight.

Procedure

Select the Instance Type: The cronjob identifies the appropriate context type ('service_instance', 'project', 'space') based on the JSON structure.
Select the Identifier ('instance-id', 'project-id', 'space-id'): Within each instance type, the cronjob iterates through the respective identifiers to determine the specific instance.

Retention Criteria: For each identifier, the cronjob determines the number of applications and kernels to be preserved. Records are retained based on their creation time, in compliance with the specified retention limits. Records that do not meet the retention criteria and are in one of the defined states ('STOPPED', 'FINISHED', 'FAILED', 'AUTO_TERMINATED' for applications; 'Deleted', 'Culled', 'Killed', 'AutoTerminated' for kernels) are deleted.
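
The retention step can be illustrated with a minimal Python sketch. This is not the cronjob's actual implementation: the Record class and the select_for_deletion function are hypothetical names introduced here, and the sketch assumes that the newest records are the ones preserved and that records in non-deletable states still count toward the limit. Only the state names come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime

# Deletable states, as listed in the retention criteria.
APPLICATION_STATES = {"STOPPED", "FINISHED", "FAILED", "AUTO_TERMINATED"}
KERNEL_STATES = {"Deleted", "Culled", "Killed", "AutoTerminated"}


@dataclass
class Record:
    """Hypothetical record shape; the real record fields are not documented here."""
    record_id: str
    state: str
    created_at: datetime


def select_for_deletion(records, retain, deletable_states):
    """Return the records that fall outside the retention limit and are deletable."""
    if retain < 0:
        raise ValueError("retain must be a non-negative integer")
    # Assumption: the most recently created records are the ones preserved.
    ordered = sorted(records, key=lambda r: r.created_at, reverse=True)
    overflow = ordered[retain:]
    # Only records in one of the listed states are eligible for deletion.
    return [r for r in overflow if r.state in deletable_states]
```

For applications, the function would be called with APPLICATION_STATES and the configured application limit; for kernels, with KERNEL_STATES and the configured kernel limit.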

The spark-hb-runtime-retention-policy ConfigMap stores the runtime retention policy. The retention rules are defined in a JSON file, config.json, within this ConfigMap. The OCP administrator must configure the JSON file.

The JSON structure within 'config.json' has the following format:

{
    "spark_retention_policy": {
        "service_instance": {
            "<instance-id>": {
                "retained_applications": <number>,
                "retained_kernels": <number>
            }
        },
        "project": {
            "<project-id>": {
                "retained_applications": <number>,
                "retained_kernels": <number>
            }
        },
        "space": {
            "<space-id>": {
                "retained_applications": <number>,
                "retained_kernels": <number>
            }
        }
    }
}
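
Before you apply a config.json, it can help to check that the file parses and follows the structure above. The following Python sketch is one way to do that. The identifiers and numbers in EXAMPLE_CONFIG are placeholders, iter_retention_rules is a hypothetical helper, and the sketch assumes that both retained_applications and retained_kernels are required and must be non-negative integers, which the documented format does not state explicitly.

```python
import json

CONTEXT_TYPES = ("service_instance", "project", "space")
LIMIT_KEYS = ("retained_applications", "retained_kernels")

# Placeholder identifiers and limits, for illustration only.
EXAMPLE_CONFIG = """
{
    "spark_retention_policy": {
        "service_instance": {
            "example-instance-id": {"retained_applications": 10, "retained_kernels": 5}
        },
        "project": {
            "example-project-id": {"retained_applications": 20, "retained_kernels": 0}
        },
        "space": {}
    }
}
"""


def iter_retention_rules(config_text):
    """Yield (context_type, identifier, limits) for every entry in the policy."""
    policy = json.loads(config_text)["spark_retention_policy"]
    for context_type in CONTEXT_TYPES:
        # A context type with no entries may be empty or absent.
        for identifier, limits in policy.get(context_type, {}).items():
            for key in LIMIT_KEYS:
                value = limits.get(key)
                # Assumption: each limit is a required, non-negative integer.
                # bool is a subclass of int in Python, so exclude it explicitly.
                if isinstance(value, bool) or not isinstance(value, int) or value < 0:
                    raise ValueError(
                        f"{context_type}/{identifier}: {key} must be a non-negative integer"
                    )
            yield context_type, identifier, limits


if __name__ == "__main__":
    for context_type, identifier, limits in iter_retention_rules(EXAMPLE_CONFIG):
        print(context_type, identifier, limits)
```

The traversal order mirrors the procedure: first the instance type, then each identifier within that type, then the retention limits for that identifier.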