Disaster Recovery (DR)
If your production environment becomes unavailable or unusable during the ongoing operation of Sterling Intelligent Promising, the disaster recovery process is triggered.
The active environment is sized to house the active clusters and instances (for performance and production support tasks) and the dormant DR instance (as a contingency). All the instances share virtual machines for web, application, and database resources. For this reason, during a declared disaster, IBM quarantines all of your active environments so that DR activities are not interrupted and IBM can focus fully on restoring availability. The disaster recovery architecture includes all the servers, networks, scripts, and databases that are involved in backing up data and switching between active environments when needed. The active data clusters are housed in different IBM® SoftLayer® data centers.
As part of the disaster recovery process, operational and transactional data within your production environment, such as orders, is routinely replicated throughout the day and backed up to the disaster recovery instance. The data replication includes backing up your PII and other regulated data. Your production environment web and application data is backed up hourly to the disaster recovery instance. The application data includes file system artifacts, such as CSS, images, static content, and SaaS extension artifacts. IBM also backs up key environment and site data, such as infrastructure and configuration data, extensions, and files, daily. Backups of your production environment databases are also completed daily. Local backups, which can be used for small-scale recovery events, are also completed and moved to a remote storage location. Transaction logs are maintained in both your live and disaster recovery data centers.
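The backup cadences described above can be summarized in a small lookup table. The following Python sketch is only an illustration: the category names and the `stale_backups` helper are hypothetical and are not part of any IBM tooling. It assumes that you track the time of the last completed backup for each category yourself.

```python
from datetime import datetime, timedelta

# Cadences taken from the paragraph above; the keys are hypothetical labels.
BACKUP_INTERVALS = {
    "web_and_application_data": timedelta(hours=1),
    "environment_and_site_data": timedelta(days=1),
    "databases": timedelta(days=1),
}


def stale_backups(last_backup_at, now):
    """Return the categories whose last backup is older than its cadence.

    A category with no recorded backup is also reported as stale.
    """
    return sorted(
        name
        for name, interval in BACKUP_INTERVALS.items()
        if name not in last_backup_at or now - last_backup_at[name] > interval
    )


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    last = {
        "web_and_application_data": now - timedelta(minutes=30),
        "environment_and_site_data": now - timedelta(hours=3),
        "databases": now - timedelta(hours=25),
    }
    print(stale_backups(last, now))
```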
Production environment data, including web and application data, is replicated and backed up through a private IBM SoftLayer network between your active environments. Your disaster recovery databases, which are always maintained in a near-ready state, use this network to replicate data in near-synchronous mode by using high availability disaster recovery (HADR) functions.
Objectives
- Recovery Time Objective (RTO): The RTO is the elapsed time between the disaster being declared and the restoration of your production environment service.
- Recovery Point Objective (RPO): The RPO is the point in time in the past to which your environment recovers. The RPO indicates the amount of potential data loss, or the age of the data that must be recovered from the disaster recovery backups for normal operations to resume.
- If you subscribe to the Inventory service in the Sterling Intelligent Promising Essentials package, the Service Level Objective (SLO) offered is an RTO within 7 days with an RPO of 48 hours.
- If you subscribe to the Inventory service in the Sterling Intelligent Promising Standard package, the SLO offered is an RTO within 48 hours with an RPO of 24 hours.
- Additionally, if you purchase the SLO improvement options for the Inventory service in the Sterling Intelligent Promising Standard package, the expected RTO is within 4 hours, with an RPO of 2 hours.
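To make the objectives concrete, the following Python sketch compares the timings of a recovery event against the SLO values listed above. It is a minimal illustration under stated assumptions: the package keys, the `Slo` class, and the `check_recovery` function are hypothetical names, and the sketch assumes that data age is measured back from the moment the disaster is declared.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Slo:
    rto: timedelta  # maximum time from declaration to restored service
    rpo: timedelta  # maximum age of the data that is recovered


# Values from the list above; the dictionary keys are hypothetical labels.
SLO_BY_PACKAGE = {
    "essentials": Slo(rto=timedelta(days=7), rpo=timedelta(hours=48)),
    "standard": Slo(rto=timedelta(hours=48), rpo=timedelta(hours=24)),
    "standard_improved": Slo(rto=timedelta(hours=4), rpo=timedelta(hours=2)),
}


def check_recovery(package, declared_at, restored_at, recovery_point):
    """Report whether a recovery met the RTO and RPO for a package.

    Assumption: the RPO is evaluated as the gap between the recovery
    point and the time the disaster was declared.
    """
    slo = SLO_BY_PACKAGE[package]
    actual_rto = restored_at - declared_at
    actual_rpo = declared_at - recovery_point
    return {
        "rto_met": actual_rto <= slo.rto,
        "rpo_met": actual_rpo <= slo.rpo,
    }


if __name__ == "__main__":
    declared = datetime(2024, 1, 1, 12, 0)
    result = check_recovery(
        "standard",
        declared_at=declared,
        restored_at=declared + timedelta(hours=30),
        recovery_point=declared - timedelta(hours=3),
    )
    print(result)
```

In this example the service is restored 30 hours after the declaration from data that is 3 hours old, which falls within the Standard package objectives but not within the improved ones.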
Process
When a disaster occurs, the following steps are completed during the disaster recovery process:
- If your production environment or primary data center experiences a severe problem, which after investigation is deemed irreversible, IBM declares that a disaster occurred and starts implementing the disaster recovery process.
- IBM issues an alert to you and any other relevant parties, such as your business partners, if you are using a business partner to support your services.
- IBM activates the disaster recovery process to temporarily switch your active environment. During the switch, the active environment is unavailable. As part of this activation, IBM activates the disaster recovery application servers on your backed-up production code base. IBM also validates that the network file systems for your site are mounted and available.
- After the disaster recovery process ends, your production environment is restored.
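The steps above follow a fixed order, which can be modeled as a simple sequence of stages. The Python sketch below is only an illustrative model for tracking where an incident stands. The `DrStage` names and the `next_stage` helper are hypothetical and do not correspond to any IBM interface.

```python
from enum import Enum


class DrStage(Enum):
    """Hypothetical stage names that mirror the process steps above."""
    NORMAL_OPERATION = 1
    DISASTER_DECLARED = 2
    PARTIES_ALERTED = 3
    DR_ENVIRONMENT_ACTIVE = 4
    PRODUCTION_RESTORED = 5


_ORDER = list(DrStage)  # Enum members iterate in definition order


def next_stage(stage):
    """Return the stage that follows `stage`, or raise if it is the last."""
    index = _ORDER.index(stage)
    if index == len(_ORDER) - 1:
        raise ValueError(f"{stage.name} is the final stage")
    return _ORDER[index + 1]


if __name__ == "__main__":
    stage = DrStage.NORMAL_OPERATION
    while stage is not DrStage.PRODUCTION_RESTORED:
        stage = next_stage(stage)
        print(stage.name)
```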