Backup and restore performance improvement

Deployment options: Netezza Performance Server for Cloud Pak for Data System, Netezza Performance Server for Cloud Pak for Data, Netezza Performance Server for Cloud Pak for Data as a Service

The following measurements were taken on single-rack Netezza "Mako" and "Striper" systems holding about 147 GB of data in about 5000 tables, distributed across 240 data slices. Elapsed times are in seconds, averaged across three runs. The elapsed times for the initial full backup, and for the initial restore of that full backup, are not shown.

The fraction of tables that are modified between the full and the differential backup has a significant impact on the performance benefit. The benefit comes from skipping the queries that would otherwise back up and restore nonexistent inserted and deleted rows for tables that did not change between backups.

When a larger absolute amount of data changes between backups (more than the 10% of 147 GB used in these tests), the relative speedup is likely to be smaller than shown. In that case, backing up and restoring the data that did change can dominate the time saved by avoiding the short queries that back up and restore no rows.

Table 1. Performance improvement for Mako
Elapsed times are in seconds. OFF = elapsed time with TRACK CHANGES OFF; ON = elapsed time with TRACK CHANGES ON. For "Restoring one increment", OFF and ON refer to the setting on the source.

                                           Differential backup     Restoring one increment
Scenario                                   OFF    ON     Speedup   OFF    ON     Speedup
~10% of data changed across 10% of tables  446    96     4.65x     1108   171    6.48x
~10% of data changed across 25% of tables  435    149    2.92x     1089   409    2.66x
One row changed in each table*             463    348    1.33x     1378   1053   1.33x

Table 2. Performance improvement for Striper
Elapsed times are in seconds. OFF = elapsed time with TRACK CHANGES OFF; ON = elapsed time with TRACK CHANGES ON. For "Restoring one increment", OFF and ON refer to the setting on the source.

                                           Differential backup     Restoring one increment
Scenario                                   OFF    ON     Speedup   OFF    ON     Speedup
~10% of data changed across 10% of tables  474    112    4.23x     1184   172    6.30x
~10% of data changed across 25% of tables  466    134    3.48x     1153   395    2.92x
One row changed in each table*             478    362    1.32x     1671   1135   1.39x

*In the test with one row changed in each table, roughly one-third of the changes were inserts, one-third were deletes, and one-third were updates.
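
As a rough illustration of how the Speedup column can be read, the sketch below assumes that speedup is simply the elapsed time with TRACK CHANGES OFF divided by the elapsed time with TRACK CHANGES ON. The source does not state the formula, so this is an assumption, and the function name `speedup` is hypothetical. The input values are the Mako differential backup times from Table 1.

```python
def speedup(elapsed_off: float, elapsed_on: float) -> float:
    """Assumed definition: time with TRACK CHANGES OFF / time with TRACK CHANGES ON."""
    if elapsed_on <= 0:
        raise ValueError("elapsed_on must be positive")
    return elapsed_off / elapsed_on


# Mako differential backup times in seconds (OFF, ON), copied from Table 1.
mako_backup = {
    "~10% of data changed across 10% of tables": (446, 96),
    "~10% of data changed across 25% of tables": (435, 149),
    "One row changed in each table": (463, 348),
}

for scenario, (off, on) in mako_backup.items():
    print(f"{scenario}: {speedup(off, on):.2f}x")
```

For these three rows the ratio agrees with the published values (4.65x, 2.92x, 1.33x). The tables report averages of three runs, so this ratio of averaged times may not reproduce every published cell.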

A table with one or more updated rows requires two queries during incremental backup, to retrieve the inserted rows and the deleted rows, and two queries during restore. This is the same as in prior releases or with TRACK CHANGES OFF, so such tables see no benefit on backup or on restore.

A table with only inserted rows or only deleted rows between backups requires just one query during incremental backup to retrieve the inserted or the deleted rows, and just one query during restore to insert or delete those rows. This accounts for the modest speedup in both backup and restore in the test where one row changed in each table.
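
The per-table query counts described above can be summarized in a small model. This is a sketch based only on the description in this section, not on the product's implementation. The names `Change`, `queries_per_table`, and `total_queries` are hypothetical, and the model counts queries only, so it says nothing about the time spent moving the changed rows.

```python
from enum import Enum
from typing import Iterable


class Change(Enum):
    NONE = "none"                # table did not change between backups
    INSERT_ONLY = "insert_only"  # only inserted rows
    DELETE_ONLY = "delete_only"  # only deleted rows
    UPDATE = "update"            # updated rows: inserted and deleted rows to retrieve


def queries_per_table(change: Change, track_changes: bool) -> int:
    """Queries issued for one table during an incremental backup (or restore)."""
    if not track_changes:
        # Assumed from the text: without tracking, every table gets both queries.
        return 2
    if change is Change.NONE:
        return 0  # queries for nonexistent rows are avoided
    if change in (Change.INSERT_ONLY, Change.DELETE_ONLY):
        return 1
    return 2      # updated tables see no benefit


def total_queries(changes: Iterable[Change], track_changes: bool) -> int:
    return sum(queries_per_table(c, track_changes) for c in changes)


# Toy example: 10 tables, of which one was updated and nine are unchanged.
tables = [Change.UPDATE] + [Change.NONE] * 9
print(total_queries(tables, track_changes=False))  # 20
print(total_queries(tables, track_changes=True))   # 2
```

In this toy model, a workload split evenly between insert-only, delete-only, and updated tables drops from 6 queries to 4 for every three tables, which is consistent in direction with the modest speedup reported for the one-row-per-table test.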