Using the Resilience dimension

Use the Resilience dimension to configure how resilience is evaluated for your application deployments, run assessments based on defined requirements and metrics, and analyze scores and risk levels to identify and address resilience gaps.

Note: When upgrading your Concert instance from v1.x to v2.x, existing resilience data is not retained. After upgrading, recreate your resilience configurations in the new version.

Before you begin

Before you use the Resilience dimension:

Make sure that the required metrics are available for your application deployment so that assessments can generate scores and evaluate risk.
Be familiar with the resilience requirements that you want to apply when you create a resilience profile.

Step 1: Understand your resilience goals

Before configuring resilience in Concert, identify the non-functional requirements (NFRs) that define resilience for your application deployments and the target scores that represent acceptable performance and risk levels. Start by reviewing the available resilience libraries (default or custom) to understand how requirements are organized across categories and how they are measured. Starting with Concert v2.2, resilience libraries include predefined categories, requirements, associated metrics, and scoring criteria that define how application deployment resilience is evaluated.

Select the categories and requirements that are relevant to your application deployment and business objectives. These requirements are evaluated using metrics, which can include both leading indicators (predictive signals) and lagging indicators (historical outcomes). Ensure that your application deployments generate and provide the required metrics so that Concert can accurately calculate scores and evaluate risk during assessments. This preparation gives you the correct requirements, metrics, and scoring context before you create a resilience profile and run assessments.

Step 2: Create a resilience profile

A resilience profile defines how resilience is evaluated for your application deployment by combining selected requirements with scoring logic, target scores, and risk thresholds. Create a resilience profile by selecting relevant requirements from one or more resilience libraries and configuring how those requirements are measured and scored. When creating a resilience profile, consider the following:

Select the requirements that best represent your application deployment’s resilience goals across relevant categories.
Define target scores to indicate the expected level of performance for each requirement.
Configure risk thresholds to determine how scores translate into risk levels.
Assign weights to requirements to reflect their relative importance in the overall resilience score.
Specify how each requirement is evaluated using metrics.

You can create a profile from scratch, use predefined options, or clone an existing resilience profile and modify it to accelerate setup.

Refer to Creating a resilience profile for details and instructions.
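To make the relationship between weights, scores, and risk thresholds concrete, the following Python sketch combines them in one place. It is an illustration only: the weighted-average formula, the 0 to 100 score scale, the threshold values, and every name in the code (Requirement, overall_score, risk_level, RISK_THRESHOLDS) are assumptions made for this example and do not describe how Concert calculates scores internally.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    """Hypothetical stand-in for one requirement in a resilience profile."""
    name: str
    weight: float  # relative importance in the overall score
    target: float  # expected score (assumed 0-100 scale)
    score: float   # score produced by an assessment


def overall_score(requirements):
    """Weighted average of requirement scores (assumed formula)."""
    total_weight = sum(r.weight for r in requirements)
    if total_weight == 0:
        raise ValueError("at least one requirement needs a non-zero weight")
    return sum(r.score * r.weight for r in requirements) / total_weight


def risk_level(score, thresholds):
    """Return the label of the highest threshold that the score reaches."""
    for minimum, label in sorted(thresholds, reverse=True):
        if score >= minimum:
            return label
    # Score is below every minimum: fall back to the lowest band.
    return min(thresholds)[1]


# Illustrative thresholds: (minimum score, risk label).
RISK_THRESHOLDS = [(80, "low"), (50, "medium"), (0, "high")]

requirements = [
    Requirement("availability", weight=3.0, target=95.0, score=90.0),
    Requirement("recovery_time", weight=2.0, target=80.0, score=60.0),
    Requirement("observability", weight=1.0, target=70.0, score=30.0),
]

total = overall_score(requirements)
print(total, risk_level(total, RISK_THRESHOLDS))
```

In this sketch, a requirement with a higher weight pulls the overall score toward its own score, which is why a low score on a heavily weighted requirement affects the risk level more than the same score on a lightly weighted one.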

Step 3: Define a posture (assessment plan)

In the Resilience dimension, a posture is an assessment plan that specifies which resilience profile (and its associated requirements and scoring logic) is applied when evaluating the resilience of an application deployment. A single resilience profile can be reused across multiple posture configurations. When defining a posture:

Select the resilience profile that you created in the previous step.
Associate the profile with the relevant application deployment or group of application deployments.
Define the aggregation period to determine how frequently metrics are collected and resilience scores are generated (for example, latest, hourly, daily, or monthly).

Note: If you are entering resilience data manually, you can do this while creating or editing the posture configuration for the relevant application deployment.

The aggregation period defines how resilience metrics are grouped and evaluated within a specific time window. Resilience scores are generated when an assessment is run using data from the current aggregation period. The posture ensures that the selected profile is consistently applied during assessments, enabling standardized measurement of resilience scores and risk levels over time.
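As a rough illustration of how an aggregation period groups metric samples into time windows, consider the Python sketch below. The period names follow the examples given above (latest, hourly, daily, monthly), but the function names, the sample data, and the use of a mean as the aggregation method are assumptions for this example, not a description of Concert's behavior.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def bucket_key(timestamp, period):
    """Map a timestamp to the time window it belongs to."""
    if period == "hourly":
        return timestamp.strftime("%Y-%m-%d %H:00")
    if period == "daily":
        return timestamp.strftime("%Y-%m-%d")
    if period == "monthly":
        return timestamp.strftime("%Y-%m")
    raise ValueError(f"unsupported period: {period}")


def aggregate(samples, period):
    """Group (timestamp, value) samples by window and average each window."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[bucket_key(timestamp, period)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def latest(samples):
    """Return the value of the most recent sample."""
    return max(samples, key=lambda sample: sample[0])[1]


# Hypothetical availability samples (percent).
samples = [
    (datetime(2024, 5, 1, 9, 15), 99.0),
    (datetime(2024, 5, 1, 9, 45), 97.0),
    (datetime(2024, 5, 1, 10, 5), 95.0),
    (datetime(2024, 5, 2, 8, 0), 91.0),
]

print(aggregate(samples, "daily"))
print(latest(samples))
```

The same four samples produce three hourly windows, two daily windows, or one monthly window, so the choice of aggregation period changes which values an assessment sees.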

Refer to Creating a posture plan for details and instructions.

Step 4: Use Concert Workflows to automate resilience data ingestion (optional)

Use Concert Workflows to import prebuilt workflows and ingest resilience metrics into Concert from external systems. These workflows collect metrics from sources such as application performance monitoring (APM) and IT service management (ITSM) tools, and send the data to Concert, where it is mapped to the requirements defined in your resilience profile. You can also configure recurring data ingestion by scheduling jobs to automatically import resilience data at defined intervals. This ensures that your assessments are based on up-to-date and consistent metric data.

Refer to Importing resilience data using Concert Workflows for details on the supported workflows, and to Creating jobs for instructions on scheduling recurring data ingestion.

Once resilience data is ingested and available, you can run assessments to evaluate your application deployment’s resilience posture.

Step 5: Run a resilience assessment

Run a resilience assessment to evaluate how well your application deployment meets the requirements defined in its assigned resilience profile. During an assessment, Concert uses the available metric data to evaluate performance across all configured requirements and generate a comprehensive view of your application deployment’s resilience posture. The assessment provides:

An overall resilience score for the application deployment
Category-level scores (for example, availability, security, and observability)
Requirement-level scores based on associated metrics
Risk levels that indicate the severity of gaps against defined targets
Potential disruption cost to quantify the business impact of resilience gaps

The results also include detailed metric data, such as values, sources, aggregation methods, and timestamps, to support deeper analysis. Assessments are executed based on your posture plan configuration and can be triggered manually or through the /evaluate API, using the latest available data within the defined aggregation period.

Refer to resilience assessments for more details and instructions.

Step 6: Review assessment results and take action

After an assessment is complete, review the results to understand your application deployment’s resilience posture and identify areas for improvement. To view the latest and historical assessment results, go to Dimensions > Resilience and click the name of a posture in the list. From the assessment results, you can:

Identify requirements that are not meeting target thresholds
Detect high-risk areas that require immediate attention
Understand which metrics are contributing to low scores
Compare assessments over time to evaluate whether resilience is improving or degrading
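If you export or copy requirement scores out of Concert, the kind of review listed above can be sketched in a few lines of Python. The dictionaries, requirement names, and values below are hypothetical, and the functions are illustrative helpers rather than part of any Concert API.

```python
def below_target(scores, targets):
    """Return (requirement, gap) pairs under target, largest gap first."""
    gaps = {
        name: targets[name] - score
        for name, score in scores.items()
        if name in targets and score < targets[name]
    }
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


def score_changes(previous, latest):
    """Return the score change per requirement between two assessments."""
    return {
        name: latest[name] - previous[name]
        for name in latest
        if name in previous
    }


# Hypothetical target scores and two assessment results.
targets = {"availability": 95, "recovery_time": 80, "observability": 70}
previous = {"availability": 92, "recovery_time": 85, "observability": 50}
latest = {"availability": 96, "recovery_time": 70, "observability": 55}

print(below_target(latest, targets))
print(score_changes(previous, latest))
```

A positive change means that the requirement improved since the earlier assessment, and a negative change means that it degraded, even if the score is still above its target.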

Based on these insights, you can take actions to improve resilience:

Prioritize remediation for high-risk or low-performing requirements
Investigate underlying metric data to determine root causes
Initiate follow-up actions directly from the assessment view

The Resilience dimension in Concert provides a structured approach to evaluate and improve application deployment resilience by connecting requirements, metrics, and assessment outcomes. By continuously assessing performance, analyzing risk, and taking targeted actions, you can identify gaps early and strengthen your application deployment’s ability to withstand and recover from disruptions over time.

Note: When creating resilience configuration resources such as resilience libraries, profiles, postures, and metrics, resource names must use only supported characters. Supported characters include uppercase letters (A–Z), lowercase letters (a–z), numbers (0–9), and the following special characters: underscore (_), dot (.), colon (:), and hyphen (-).

Resource names that do not follow these rules cannot be used consistently across the Resilience dimension.
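If you generate resource names in scripts, you can check them against the supported character set before you create the resource. The following Python sketch encodes the characters listed above as a regular expression. The function names are hypothetical, and treating an empty name as invalid is an assumption, because the naming rules above do not mention length limits.

```python
import re

# Letters, digits, underscore, dot, colon, and hyphen, as listed above.
_SUPPORTED_NAME = re.compile(r"[A-Za-z0-9_.:-]+")
_UNSUPPORTED_CHAR = re.compile(r"[^A-Za-z0-9_.:-]")


def is_valid_resource_name(name):
    """Return True if the name contains only supported characters."""
    return _SUPPORTED_NAME.fullmatch(name) is not None


def unsupported_characters(name):
    """Return the distinct unsupported characters found in the name."""
    return sorted(set(_UNSUPPORTED_CHAR.findall(name)))


print(is_valid_resource_name("payments-api:prod_v2.1"))
print(unsupported_characters("payments api/prod"))
```

Reporting the offending characters, rather than only a pass or fail result, makes it easier to correct a name before it is used across libraries, profiles, postures, and metrics.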