Backup and restore SQL schemas for DB2 Universal Database

A library for logical schema-level backup, restore, copy, and drop

This article provides logical schema level backup, restore, copy and drop schema procedures for IBM DB2® Universal Database™ for Linux®, UNIX®, and Windows® (DB2 UDB). The procedures handle all common SQL objects and their properties, including privileges, statistics, and sequence/identity states. Using these procedures greatly simplifies deployment and upgrades of ISV applications that use schemas to modularize their products.

Serge Rielau (srielau@ca.ibm.com), Senior Software Developer, IBM

Serge RielauSerge Rielau is part of the DB2 Solutions Development team, where he closely works with customers and business partners to port or migrate their applications from competitive RDBMS to DB2 for Linux. UNIX, and Windows. Prior to this role, he worked for 7 years as a team lead and technical manager in the DB2 SQL Compiler Development team. As an expert in the SQL language, Serge is an active participant in the comp.databases.ibm-db2 newsgroup.



09 February 2006

Motivation

The SQL language provides the concept of a schema to group SQL objects of all kinds together. Schemas can be understood as the SQL equivalent of directories in a file system. DB2 UDB for Linux, UNIX, and Windows traditionally has had strong support for the SQL schema concept, having separated the USER from the CURRENT SCHEMA early on and providing a PATH for routine resolution.

While usage of schemas is easy, their management is a different matter. Requirements such as renaming a schema, schema-level backup and restore, and making a copy of a schema cause significant challenges, since they must be hand scripted, and the number of objects and their dependencies are complex.

This article provides a library of procedures with a simple interface that allow the user to back up all objects in a given schema onto the file system, restore the backup under a different schema, and also drop all objects in a given schema.

Installation

After downloading and extracting the backupschema.zip file included in this article, you will find the following file:

  • backupschema.sql

    This file contains the DDL, which installs the procedures as well as various helper routines and tables. By default, the schema used for all objects is "ADMIN". If you want to change this schema, simply modify the SET SCHEMA and the SET PATH statements at the beginning of this file to replace "ADMIN" with a schema name of your choice

    To execute the script, connect to the database as the DB2 administrator and then execute db2 -tvf backupschema.sql

The procedures use the SYSPROC.ADMIN_CMD() stored procedure to export and the SYSPROC.DB2LOAD() stored procedure to load. (SYSPROC.ADMIN_CMD() has been introduced in DB2 V8.2.2 (FP 9). Therefore, V8.2.2 is the minimum version requirement.) SYSPROC.ADMIN_CMD() executes under the fenced user id and group specified by the owner of the sqllib\adm\.fenced file. As a result the exported files have the same owner and group as sqllib\adm\.fenced. Therefore it is important to ensure that the user or group has privileges to write to the specified directories, while the user id expected to access the files also has access. The simple solution is to change ownership of sqllib\adm\.fenced to the instance owner. The safer solution is to have both IDs as members of a shared group and change only the group ownership of sqllib\adm\.fenced to that shared group. This way it is ensured that you can access the produced files through the group membership.

Specification

DROPSCHEMA(<SCHEMA>)

The procedure will drop all objects defined in the schema and finally drop the schema itself.

Arguments

  1. <SCHEMA>

    The name of the schema to be dropped. SQL identifiers are case-sensitive. That is to drop the schema "MYSCHEMA", the schema must be capitalized: 'MYSCHEMA'

If an object in another schema depends on the schema to be dropped, the regular rules for dropping objects apply. That is, in most cases the dependent object will be dropped or invalidated as well. Some objects, such as SQL functions, however, enforce RESTRICT semantics. If such a case is encountered, the procedure will fail.

The DROPSCHEMA procedure is under transaction control and does not perform commit or rollback on its own. So to persist the dropping of the schema, the drop must be commited. Similarly, if an error occurs, the invoker of the procedure needs to decide whether to roll back the process to that point or not.

BACKUPSCHEMA(<SCHEMA>,<DIRECTORY>)

The procedure performs a logical backup of all objects in the schema to the file system. Supported objects are:

  • Tables including statistics. Not supported are range clustered tables (RCT), materialized query tables (MQT), and typed tables.
  • Views with the exception of typed views
  • Functions including sourced, SQL and external functions. Note that the executables for external functions are not backed up. Statistics for functions are not backed up.
  • SQL Procedures and external procedures. Note that the executables for external procedures are not backed up.
  • Distinct datatypes. Note that structured data types are not supported.
  • Indexes, with the exception of index extensions
  • All constraints
  • Sequences, including their high watermark
  • Comments on all supported objects
  • Privileges for all supported objects

The backup includes the PATH, SCHEMA, and PREP options, where applicable.

Federated objects such as NICKNAMEs as well as the database partitioning feature (DPF) are not supported.

Arguments

  1. <SCHEMA>

    The case-sensitive name of the schema to be backed up.

  2. <DIRECTORY>

    The directory in the file system to which the objects are to be backed up. The directory must exist and end with a slash on UNIX or Linux and a backslash on Windows; for example, 'D:\TEMP\MYSCHEMA\'. If the directory is NULL or an empty string, no files will be exported. The procedure will only populate the DDLLOG table with the DDL statements. This mode is used by the COPYSCHEMA procedure.

After a successful backup, the directory will contain IXF files for:

  • Tables in the schema that was backed up
  • SYSSTAT views containing the table, column, and index statistics
  • The DDL log containing the infomation to recreate all objects.

The user is free to add additional files, such as executables for external routines, to the directory. Once complete, the user can zip and tar the directory, archive, or FTP as desired.

RESTORESCHEMA(<SCHEMA>,<TABLESPACEINFO>,<LOADDIRECTORY>,<MESSAGEDIRECTORY>)

This procedure restores a schema that was previously backed up using BACKUPSCHEMA. All objects are being created using the ID of the current user. If any object to be restored references an object in another schema, and this object does not exist in the target database, the procedure will fail. This procedure performs internal commits.

If the restore is done into a different schema than the schema used for BACKUPSCHEMA, all SQL objects containing SQL bodies (such as views, SQL routines, and check constraints) must not have explicit references to the source schema in their definitions. That is: CHECK (MYSCHEMA.MYTABLE.C1 > 0) cannot be restored successfully into a schema other than "MYSCHEMA" while CHECK (MYTABLE.C1 > 0) can. To enable "redirected" RESTORESCHEMA, it is therefore recommended to exploit the CURRENT PATH and CURRENT SCHEMA special registers instead of qualifying the local schema expclicitly when defining SQL objects.

Arguments

  1. <SCHEMA>

    The case-sensitive name of the schema into which the objects are to be restored. If the schema already exists, it is dropped first. If the schema is NULL or the empty string, the procedure restores the schema to the name from where it was backed up. This mode is used by the COPYSCHEMA procedure.

  2. <TABLESPACEINFO>

    A string containing tablespace information to be added to each table definition. The clause may contain the IN <tablespace>, LONG IN <lobtablespace>, and INDEX IN <indextablespace> clauses. This parameter can be an empty string or NULL, if defaults should be picked.

    For example: 'IN DATA8K INDEX IN INDEX4K'

  3. <LOADDIRECTORY>

    The directory in the file system where the files generated by BACKUPSCHEMA are located. The directory must end with a slash on UNIX or Linux and a backslash on Windows; for example, 'D:\TEMP\MYSCHEMA\'.

  4. <MESSAGEDIRECTORY>

    The directory in the filesystem to which the procedure will write load message files. The directory must end with a slash or backslash as apropriate for the operating system.

COPYSCHEMA(<TARGETSCHEMA>,<TARGETTABLESPACEINFO>,<SOURCESCHEMA>)

This procedure makes a copy of a schema within the same database.

Arguments

  1. <TARGETSCHEMA>

    The case-sensitive name of the new schema.

  2. <TABLESPACEINFO>

    A string containing tablespace information to be added to each table definition. If NULL or '' is provided, defaults will be chosen.

  3. <SOURCESCHEMA>

    The case-sensitive name of the schema to be copied.

REBINDPROCEDURES(<SCHEMA>)

This procedure rebinds all SQL procedures in a schema. The procedure should be called after new statistics have been collected for tables used in procedures within the schema. That is, the procedure will re-optimize all SQL statements used in the SQL Procedures.

Arguments

  1. <SCHEMA>

    The case-sensitive name of the schema.

Examples

  • CALL ADMIN.DROPSCHEMA('SAMPLE')

    This procedure call drops all objects in the schema "SAMPLE".

  • CALL ADMIN.BACKUPSCHEMA('UTIL', 'D:\TEMP\UTIL_SCHEMA\')

    This procedure call stores all objects in the schema "UTIL" in the directory D:\TEMP\UTIL_SCHEMA\.

  • CALL ADMIN.RESTORESCHEMA(NULL, NULL, 'D:\TEMP\UTIL_SCHEMA\', 'D:\TEMP\')

    This procedure call retrieves the objects from D:\TEMP\UTIL_SCHEMA\ and restores them to the original schema using default tablespaces. Messages from load are written to D:\TEMP\

  • CALL ADMIN.RESTORESCHEMA('SRIELAU', 'IN "DATA" INDEX IN "INDEX"', 'D:\TEMP\USER_SCHEMA\', 'D:\TEMP\')

    This procedure call retrieves the objects from D:\TEMP\USER_SCHEMA\ and restores them to the "SRIELAU" schema using the tablespace "DATA" for table data and "INDEX" for indices. Messages from LOAD are written to D:\TEMP\

  • CALL ADMIN.COPYSCHEMA('SRIELAU', 'IN "DATA" INDEX IN "INDEX" LONG IN "LONG"', 'STOLZE')

    This procedure call duplicates all objects in the schema "STOLZE" into the schema "SRIELAU" schema using the tablespace "DATA" for table data, "INDEX" for indices, and "LONG" for LOB data.

  • CALL ADMIN.REBINDPROCEDURES('SRIELAU'')

    This procedure call rebinds all SQL procedures in the schema SRIELAU, picking up new statistics.

Implementation

This section gives a high-level overview on the implementation of the procedures discussed so far in this article.

The main challenge to a successful restoration of all objects in a schema is the handling of the many dependencies between different objects within the schema. For example, tables may depend on user-defined distinct types; check constraints depend on functions; views depend on views, aliases, and functions; and so on.

Since DB2 UDB cannot create an object that depends on a non-existing entity, the correct order of execution appears vital. Upon looking at the problem closer, you can see, however, that the depth of the dependency graph tends to be shallow, and furthermore that some objects are natural endpoints. That is, user-defined distinct types can never depend on any other DDL object, and tables (excluding constraints) depend only on distinct types. Having realized that, it is possible to implement a restore in three phases:

  1. Restore all distinct types followed by all tables with their indexes.
  2. Copy the data into the tables and get statisticts. This is done early to ensure execution plans for SQL procedures are correct.
  3. Restore all other objects in a loop. That is, don't worry if any given object creation fails as long as progress is being made per iteration through the list of objects to be restored. Only if the process stalls is an error returned.

With this basic algorithm for restore, it is easy to implement a simple infrastructure.

Here the table DDLLOG is used by the procedure BACKUPSCHEMA to record the DDL statements. The first row (with the number 0) contains the source schema. This is followed by the DDL statements for distinct types and then tables. The end of this phase is marked by an empty row, which is followed by all other objects.

The table contains a SUCCESS column, which is used by RESTORESCHEMA to log the successful creation of an object.

The composition of the DDL objects is entirely based on the documented SYSCAT catalog views, with one exception. The high watermark for an IDENTITY column needs to be retrieved from SYSIBM.SYSSEQUENCES.LASTASSIGNVAL, which is not exposed in DB2 UDB V8.

The interaction with the filesystem is achieved through the SYSPROC.ADMIN_CMD procedure, which supports export and is used to write the DDLLOG table, the user data, and the statisticts to files. To load SYSPROC.DB2LOAD is used.

The same brute-force approach used to restore a schema is used to drop a schema. The DROPSCHEMA procedure simply keeps attempting to drop objects until it either fails to make progress, or all objects are dropped.

Conclusion

This article provides a group of powerful procedures that allow for schema-level operations such as logical backup, restore, and copy of all objects in a given schema. In addition to providing value to ISVs and end customers with this library, this article also shows that DB2 UDB has sufficient SQL APIs to allow for powerful added value using customer-accesible methods.


Download

DescriptionNameSize
Scripts and source code used in this articlebackupschema.zip12KB

Resources

Learn

Get products and technologies

  • Download a free trial version of DB2 Universal Database Enterprise Server Edition.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.
  • Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edtion and provides a solid base to build and deploy applications.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=103558
ArticleTitle=Backup and restore SQL schemas for DB2 Universal Database
publish-date=02092006