
Backups and Disaster Recovery

Application Backups

Data used by applications within the Maximo Application Suite portfolio are backed up according to the following:

All backups are encrypted. Communication between applications, backup scripts, the storage layer, and database services is performed via secure transport, and these services are accessed only via private endpoints that are offered by the service. Production backups are performed once a day. Backups are stored in a separate AWS data center location.

System Configuration Backups

Maximo Application Suite uses different components to deliver the applications to clients. Each of these components is backed up using the appropriate backup tool for that component. In general, all component backups:

  • are encrypted
  • are taken daily
  • are stored in a separate data center
  • are saved for 30 days

Database Backup Retention

Database backups are retained for the standard duration of 14 days for Production environments and 7 days for Non-Production environments. If the customer wants to retain a backup for longer than the standard duration, the IBM SRE team will perform a database export and save it to the COS bucket. It is then the responsibility of the customer to download and maintain those exports. If a restore from one of the downloaded exports is required, the customer must upload the export file back to the COS bucket and open a case specifying which export file to use and which environment to restore.
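The standard retention windows can be expressed as a small check. This is a minimal sketch for illustration only, not IBM tooling: the function name `is_retained`, the environment keys, and the inclusive boundary (a backup exactly 14 days old still counts as retained) are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Standard retention windows stated above, in days.
RETENTION_DAYS = {"production": 14, "non-production": 7}


def is_retained(backup_time: datetime, environment: str, now: datetime) -> bool:
    """Return True if a backup taken at backup_time is still inside the
    standard retention window for the given environment.

    The inclusive boundary is an assumption made for this sketch.
    """
    window = timedelta(days=RETENTION_DAYS[environment])
    return now - backup_time <= window


if __name__ == "__main__":
    now = datetime(2024, 5, 15, tzinfo=timezone.utc)
    ten_days_old = now - timedelta(days=10)
    print(is_retained(ten_days_old, "production", now))
    print(is_retained(ten_days_old, "non-production", now))
```

A backup that has aged out of the window would need to have been exported to the COS bucket beforehand, as described above.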

Restore

Restore requests must be submitted via a case (ticket) through the IBM Support Community Portal. The expected turnaround time depends on the severity and the size of the restore required; generally, expect 1 to 3 days for a restore to complete. A database can be restored only to one of the previous daily backups; restoring to a point in time is not possible.
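Because only daily backups are available, a requested restore time maps to the latest daily backup taken at or before it. The following is a hedged sketch of that mapping; `pick_restore_point` and the sample timestamps are hypothetical and do not reflect how IBM selects backups internally.

```python
from datetime import datetime
from typing import Optional, Sequence


def pick_restore_point(daily_backups: Sequence[datetime],
                       requested: datetime) -> Optional[datetime]:
    """Return the most recent daily backup taken at or before `requested`.

    Returns None when no backup is old enough, which means the request
    cannot be satisfied from the available daily backups.
    """
    candidates = [b for b in daily_backups if b <= requested]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    # Hypothetical backup times, one per day.
    backups = [datetime(2024, 5, 1, 2), datetime(2024, 5, 2, 2), datetime(2024, 5, 3, 2)]
    print(pick_restore_point(backups, datetime(2024, 5, 2, 15)))
```

Any changes made between the selected backup and the requested time are not recovered, which is the practical effect of having no point-in-time restore.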

Disaster Recovery

All clients of MAS SaaS (including MAS SaaS Essentials, Standard or Premium) receive the standard Disaster Recovery as outlined below. Clients who have purchased the Enhanced Disaster Recovery option can find the details of that offering further below.

Standard Disaster Recovery

In the event of a disaster recovery issue with the Maximo Application Suite SaaS offering for a specific customer, IBM's focus will be in the following order:

  1. Recover the existing infrastructure in place
  2. Recover within the same AWS data center to a new infrastructure
  3. Recover to a secondary AWS data center

In the event a disaster is declared, the base parameters are:

Recovery Time Objective (RTO) - 72 hours
Recovery Point Objective (RPO) - 24 hours

RTO is the longest possible time needed to make the application available.
RPO is the longest possible time since data was last backed up.

These are Service Level Objectives (SLOs), not Service Level Agreements.
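To make the two objectives concrete, the sketch below compares a hypothetical incident timeline against the standard RTO and RPO. It assumes that the RTO clock starts when the disaster is declared and that data loss is measured from the last backup to the declaration; both are simplifications for illustration, and `within_objectives` is not an IBM function.

```python
from datetime import datetime, timedelta

# Standard Disaster Recovery objectives stated above.
RTO = timedelta(hours=72)
RPO = timedelta(hours=24)


def within_objectives(last_backup: datetime,
                      disaster_declared: datetime,
                      service_restored: datetime) -> dict:
    """Check a timeline against the standard RTO and RPO.

    Assumption: downtime is counted from the disaster declaration, and the
    data-loss window runs from the last backup to the declaration.
    """
    data_loss = disaster_declared - last_backup
    downtime = service_restored - disaster_declared
    return {"rpo_met": data_loss <= RPO, "rto_met": downtime <= RTO}


if __name__ == "__main__":
    print(within_objectives(
        last_backup=datetime(2024, 5, 1, 2),
        disaster_declared=datetime(2024, 5, 1, 20),
        service_restored=datetime(2024, 5, 3, 20),
    ))
```

With daily backups, the data-loss window is at most about one day, which is consistent with the 24-hour RPO.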

IBM performs internal MAS disaster recovery testing annually per ISO-27001 guidelines.

For information on IBM business continuity, refer to IBM's Business Continuity Management Position Paper. It is IBM Confidential and can be shared under a non-disclosure agreement. Note that only IBM personnel can access the link below; if you need the paper, contact your IBM salesperson or Customer Success Manager.

Enhanced Disaster Recovery

Clients who have purchased the Enhanced Disaster Recovery option (only available on MAS SaaS Premium, non-shared clusters and only for the Manage Application) have the following as part of the option:

  • A MAS instance will be deployed in a separate data center or a second availability zone, depending on the region and the data residency rules of the client. This instance mirrors the production instance in terms of size and configuration of MAS Core and the Manage application.
  • Data will be replicated from production in near real time to this disaster recovery instance. The replica database will be in read only mode to receive the data.
  • Attachments are also replicated.
  • The Disaster Recovery instance will be kept at the same version of OpenShift, MAS and subcomponents as the production instance.
  • If the production site has a VPN, then this will be enabled on the Disaster Recovery site as well.
  • SAML / LDAP configuration will be configured to match production.

In the event a disaster is declared, the base parameters are:

  • Recovery Time Objective (RTO) - 6 hours
  • Recovery Point Objective (RPO) - 15 minutes
  • RTO is the longest possible time needed to make the application available. RPO is the longest possible time since data was last backed up.
  • These are Service Level Objectives (SLOs), not Service Level Agreements.

Exclusions and Limitations of Enhanced DR

The DocumentDB, which stores users, is only backed up every 24 hours and cannot be replicated. Any users who were added to the system between the last backup and the disaster would need to be manually added to the Disaster Recovery instance. If the client has a replica database as part of the production instance, this would be deployed after the disaster has been declared, as there are physical limitations on the number of replicas allowed.
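The user gap described above can be worked out from user creation times. This is an illustrative sketch only: it assumes the client keeps its own record of when each user was created (the `users_created` mapping is hypothetical), and it does not query DocumentDB.

```python
from datetime import datetime
from typing import Dict, List


def users_to_readd(users_created: Dict[str, datetime],
                   last_backup: datetime,
                   disaster_time: datetime) -> List[str]:
    """Return the users created after the last backup and up to the disaster.

    These are the users that would be missing from the Disaster Recovery
    instance and would need to be added manually.
    """
    return sorted(
        user for user, created in users_created.items()
        if last_backup < created <= disaster_time
    )


if __name__ == "__main__":
    # Hypothetical user records.
    created = {
        "alice": datetime(2024, 5, 1, 9),
        "bob": datetime(2024, 5, 2, 10),
        "carol": datetime(2024, 5, 2, 14),
    }
    print(users_to_readd(created, datetime(2024, 5, 2, 2), datetime(2024, 5, 2, 16)))
```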

Testing

Once a year, a client may request a DR test of their Enhanced DR solution. The request is made through a support ticket and should give 2 to 3 months' advance notice to ensure proper coordination. The base test consists of:

  • Replication to the DR site is turned off
  • Cron tasks and integrations on the DR database are turned off through a script (the client is responsible for ensuring that all necessary cron tasks and integrations are included).
  • Replica DB is made active and connected to the application
  • Application is brought online
  • Access is through a URL of the form client-DR.suite.maximo.com
  • Client logs in and tests.
  • Client will typically check that data changed or added in the 30 minutes prior to the start of the test is available, showing that replication worked.
  • Client can check other functions.
  • Once complete, the site is turned off, the database is put back into standby mode and reconnected to the production database, and pending transactions are then replicated.
  • Results are documented. Any issues are remediated.
  • This is done during normal working hours, Monday to Friday.
  • At no time is the production environment affected or altered. The DR site cannot be used as a production site for a period of time as part of the test.
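The replication check in the test steps above (confirming that data changed in the 30 minutes before the test is present on the DR site) can be sketched as a simple comparison. The record identifiers, the `prod_changes` mapping, and the `missing_in_dr` function are hypothetical; a real check would be run by the client against their own Manage data.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set


def missing_in_dr(prod_changes: Dict[str, datetime],
                  dr_record_ids: Set[str],
                  test_start: datetime,
                  window: timedelta = timedelta(minutes=30)) -> List[str]:
    """Return production records changed within `window` before the test
    start that are not present on the DR site.

    An empty list suggests replication was working up to the test start.
    """
    recent = [
        record_id for record_id, changed in prod_changes.items()
        if test_start - window <= changed <= test_start
    ]
    return sorted(r for r in recent if r not in dr_record_ids)


if __name__ == "__main__":
    start = datetime(2024, 6, 3, 10, 0)
    changes = {
        "WO-1001": datetime(2024, 6, 3, 9, 40),
        "WO-1002": datetime(2024, 6, 3, 9, 55),
        "WO-0900": datetime(2024, 6, 3, 8, 0),  # outside the 30-minute window
    }
    print(missing_in_dr(changes, {"WO-1001", "WO-0900"}, start))
```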

https://w3.ibm.com/w3publisher/global-bcm/faq#ClientRequests