Maintaining the database

Maintain your database by, for example, reorganizing tables and indexes.

Database statistics

To ensure that the database is using the optimal path to access data, periodically generate and update statistics on all the tables and indexes.

Configure the Platform Maintenance Scheduler, which was formerly named Cleanup Agent, to ensure that the statistics are updated daily. Regularly analyze the log on the server that runs the Platform Maintenance Scheduler to verify that the agent completes successfully and runs the appropriate database statistics command. If you do not use the Platform Maintenance Scheduler for this task, schedule the statistics update to run on the database daily or weekly, depending on the system that is used.

IBM Db2 provides the runstats tool to update statistics directly on the database because the Platform Maintenance Scheduler does not automatically call runstats on that database platform.
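For example, the following sketch collects table, distribution, and detailed index statistics for a single table from the Db2 command line processor. The TRIDATA schema and IBS_SPEC table names are reused from examples elsewhere in this document as placeholders; substitute your own names:

```sql
-- Collect table and distribution statistics plus detailed index statistics
-- TRIDATA.IBS_SPEC is a placeholder; run for each TRIRIGA table or script over the schema
RUNSTATS ON TABLE TRIDATA.IBS_SPEC WITH DISTRIBUTION AND DETAILED INDEXES ALL;
```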

Oracle provides equivalent commands for manually updating statistics on the database:

  • dbms_stats.gather_schema_stats
  • dbms_stats.gather_table_stats
  • dbms_stats.gather_index_stats
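For example, a schema-wide gather that also cascades to the indexes might look like the following sketch. TRIDATA is assumed to be the TRIRIGA schema owner; adjust the name for your installation:

```sql
-- Gather statistics for every table in the schema;
-- cascade => TRUE also gathers statistics for the associated indexes
EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'TRIDATA', cascade => TRUE);
```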

Reorganizing tables and indexes

If you do not expect further growth in a table, reorganize it to reclaim space. When many updates are made to a table, its space can become fragmented. However, if you expect further inserts, the database can reuse that space within the table, so a reorganization might not be needed.

Reorganizations or rebuilds can also improve performance. When the data is not aligned on an index, performance can become degraded.

Consider these questions before you reorganize or rebuild tables and indexes:

  • Does fragmented space cause a performance problem?
  • Are the records in the table updated or are new records inserted within a short period?
  • Does fragmented space cause inefficient storage use?
  • Is the order of the table data aligned with a particular index?
  • Has performance degraded over time or has it happened at a specific time?
  • What changes might cause the incident?
  • How does the performance correlate to system CPU and I/O use?
  • If table reorganization is done, what is the expected benefit? Running reorganization affects data availability, uses system resources, and takes time to complete.

Reorganize and rebuild the tables based on the following guidance:

  • Reorganize a table only after careful analysis determines that a reorganization corrects the performance problem.
  • Reorganize tables that have many deletes and updates to reclaim space only when you determine that no further inserts are done in the near term.
  • Do not reorganize metadata tables.
  • If you determine that a reorganization improves a performance issue with the table or index, then consider whether to reorganize online or offline while the system is down. System resources are needed for reorganizations. A reorganization can slow down the system. Also, consider that users must wait for locks during an online reorganization.
  • Maintain the size of the database by archiving and trimming tables where appropriate to reduce the time that is needed for reorganization.
  • Index rebuilds can improve performance greatly but can also slow down the database system considerably while they run. Therefore, rebuild indexes offline if possible and only when needed.

The Platform Maintenance Scheduler has commands for IBM Db2 that automatically reorganize tables and indexes. In addition, IBM Db2 provides the REORGCHK and REORG TABLE commands to perform these activities directly on the database. However, use the Platform Maintenance Scheduler for this task.
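If you do run the Db2 commands directly, a typical sequence checks first and then reorganizes only the flagged tables. The following is a sketch from the Db2 command line processor; TRIDATA.IBS_SPEC is a placeholder name reused from examples elsewhere in this document:

```sql
-- Update statistics and report which tables and indexes need reorganization
REORGCHK UPDATE STATISTICS ON TABLE ALL;
-- Reorganize one flagged table and its indexes
REORG TABLE TRIDATA.IBS_SPEC;
REORG INDEXES ALL FOR TABLE TRIDATA.IBS_SPEC;
```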

For more information about reorganizing tables and indexes on an Oracle database, see the Oracle Database Documentation Library. For Microsoft SQL Server databases, see Reorganize and Rebuild Indexes in the Microsoft documentation.

Multibyte character sets

Using multibyte character sets (MBCS) on the database has performance implications. Single-byte character sets are better for performance than multibyte character sets, and they are also the most efficient in terms of storage.

Archiving and deleting historical data

Deleting unnecessary data from the database helps to optimize its performance. If you are archiving to an external system or database, use integrations or reports to query the archived data. Archiving data can also facilitate upgrading to a new version of TRIRIGA® because the historical data is not critical to day-to-day operations and might not be needed in the new system.

Some databases have tools to assess data usage and aid in information lifecycle management, such as heat maps. Use these tools to identify data that might be purged or archived.
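For example, on Oracle Database 12c or later, the Heat Map feature can track segment-level access. The following sketch assumes a TRIDATA schema owner and that the feature is licensed and available in your edition:

```sql
-- Enable segment-level access tracking (requires the Heat Map feature)
ALTER SYSTEM SET HEAT_MAP = ON;
-- Review access information for TRIRIGA segments; TRIDATA is a placeholder owner
SELECT object_name, segment_write_time, full_scan, lookup_scan
FROM dba_heat_map_segment
WHERE owner = 'TRIDATA';
```

Segments with no recent writes or scans are candidates for archiving or purging.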

Table spaces

TRIRIGA uses two table spaces, TRIDATA_DATA and TRIDATA_INDX. These names are not mandatory, and you can use your own names. These table spaces separate the data tables from the indexes: TRIDATA_DATA contains all the IBM_TRIRIGA tables, and TRIDATA_INDX stores all the indexes.

Over time, custom indexes might be created in the wrong table space. Check these indexes after system changes.
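On Oracle, for example, a quick check for indexes that were created outside the index table space might look like the following sketch, assuming the TRIDATA_INDX name from above and a TRIDATA schema owner:

```sql
-- List indexes that are not in the dedicated index table space
SELECT index_name, table_name, tablespace_name
FROM dba_indexes
WHERE owner = 'TRIDATA'
AND tablespace_name <> 'TRIDATA_INDX';
```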

Use a third table space if you have large amounts of data and heavy usage of the TRIRIGA Document Manager. Moving the tables that store the binary large objects (BLOBs) into their own table space optimizes performance. However, TRIRIGA does not currently manage multiple table spaces, so track and record any changes that you make.

Database server virtualization

Virtualization always adds processor usage, and while virtualization is supported in all tiers of the stack, avoid it for the database tier because of its resource use. Testing shows a near-linear 30% CPU overhead when a database runs on a virtual machine (VM), with transactions per second also decreasing by approximately 5%.

TRIRIGA is not aware if it is running SQL statements against a virtualized or physical database. TRIRIGA makes requests and awaits responses. However, with virtualization in place, these responses might be delayed due to overhead, which is outside of TRIRIGA. Usually, this delay is fractions of a second, and while negligible, this delay can accumulate during heavy system usage and potentially become a performance degradation. Given this behavior, run the database on physical hardware.

If you are using virtualization for the database, it becomes another layer to consider when you diagnose performance issues. Use dedicated resources, including dedicated fast storage that is attached to the virtual environment.

Virtualized environments

Deployments in virtualized environments such as VMware might not see the same performance benefits from these practices. Using shared resources in virtualized environments does not give optimal performance. Instead, use dedicated CPU and memory resources.

Internal testing shows that putting the TRIRIGA database in a virtualized environment such as VMware has significant performance impacts. These impacts increase rapidly with larger workloads. While some customers run their TRIRIGA database successfully on a virtual machine, others use dedicated hardware for optimal performance.

If you deploy TRIRIGA on VMware, see the following tuning guide that is available from VMware: Performance Best Practices for VMware vSphere 5.5

Database server throughput

Ensure that the throughput of the database server meets expectations for enterprise-class fast data storage, which is 8 Gbps Fibre Channel or better. Follow the instructions by using the spreadsheet for your database server.

The spreadsheet shows whether a performance issue is at the database I/O level. This test removes any TRIRIGA platform code and runs a full table scan on one of the largest tables in the system. The number of blocks that are processed is divided by the number of seconds it takes to run the full table scan. Then, compare the result to a reference system that is configured with an 8 Gbps Fibre Channel storage area network (SAN).

To test the throughput of the database server, take the following steps:
  1. Download and open the spreadsheet file that corresponds to your database server. Several statements are provided in each spreadsheet.
  2. In a database tool that corresponds to your database server, such as Oracle SQL Developer, run the spreadsheet statements as your TRIDATA database user.
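The spreadsheets contain the exact statements for each platform. As a rough illustration of the technique on Oracle, in SQL*Plus you might time a forced full scan and compare the elapsed time with the block count from the table statistics. The hint and view shown here are a sketch; ibs_spec is one of the larger TRIRIGA tables:

```sql
-- Report the number of blocks from the table statistics
SELECT blocks FROM user_tables WHERE table_name = 'IBS_SPEC';
-- Time a full table scan; divide blocks by elapsed seconds for throughput
SET TIMING ON
SELECT /*+ FULL(t) */ COUNT(*) FROM ibs_spec t;
```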

Database disk-caching policies

Your disk-caching policies might affect database performance. For example, even if the hardware of your staging and production environments is identical, your staging disks might be configured with a write-back caching policy while your production disks are configured with a write-through caching policy. In this example, write-back might be 4 times faster than write-through.

To verify that differences in disk-caching policies are affecting performance, run a test on staging by switching one disk, or group of disks, from one caching policy to another policy. Then conduct a speed test, and switch back to the original policy for another speed test.

After you verify any differences in performance, implement the disk-caching policy change to your production environment.

Take the following steps:
  1. Stop your production site and database.
  2. Ensure that no processes or services are either reading or writing to any of the production disks. Ensure that there are no side effects due to cached data.
  3. Change your production disks to the write-back caching policy.
  4. Run a quick speed test to verify that performance is improved.
  5. Restart your production database and site. Depending on your environment, the outage takes approximately 30-45 minutes.
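For the quick speed test in the steps above, a simple sequential write with dd gives a rough throughput number on a Linux database server. The file path is a placeholder, and conv=fdatasync forces the data to disk so that the caching policy is actually exercised:

```shell
# Write 256 MB sequentially; conv=fdatasync flushes data to disk
# before dd reports the throughput, so the cache policy is measured
dd if=/dev/zero of=/tmp/dd_speed_test bs=1M count=256 conv=fdatasync
# Remove the temporary test file
rm /tmp/dd_speed_test
```

Run the same command before and after the caching-policy change and compare the MB/s figures that dd reports.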

Creating batches for business object (BO) data cleanup

If you are doing a large BO data cleanup exercise and have hundreds of thousands or millions of records to delete, create batches to spread the cleanup work over time. Use the mod function to create the batches. The mod function returns the remainder as an integer when one number is divided by another. Use this function to split a large set of numbers into groups, or batches, of nearly equal size.

For example, run the following query to divide the set of spec_id records into 7 roughly equal batches:

  • Oracle Database or IBM Db2:
    select count(*), mod(spec_id ,7) from ibs_spec
    group by mod(spec_id ,7);
  • Microsoft SQL Server:
    select count(*), spec_id % 7 from ibs_spec
    group by spec_id % 7;

Using this information, split the cleanup work into a more reasonable number of records. Typically, cleaning 10,000-20,000 records takes about 30-60 minutes. You can modify the query by adding a where clause to filter only the records to be deleted.

  • To find the number of batches that are needed, divide the total number of records to be deleted by a batch size, for example 20,000 records for each batch:
    select count(*)/20000 from ibs_spec 
    where object_id < 0
    and system_flag = 0;

For example, assume that the result of this query is 7 batches. Then run the following query to verify the number of records to be cleaned up in the following days:

  • Oracle Database or IBM Db2:
    select count(*), mod(spec_id ,7) from ibs_spec
    where object_id < 0
    and system_flag = 0
    group by mod(spec_id ,7);
  • Microsoft SQL Server:
    select count(*), spec_id % 7 from ibs_spec
    where object_id < 0
    and system_flag = 0
    group by spec_id % 7;

To implement the batches, update the ibs_spec table and set updated_date to a future date, from today up to six days ahead, based on the batch number:

  • Oracle Database or IBM Db2:
    update ibs_spec 
    set updated_date = sysdate + mod(spec_id ,7) 
    where object_id < 0 and system_flag = 0;
  • Microsoft SQL Server:
    update ibs_spec 
    set updated_date = getdate() + spec_id % 7  
    where object_id < 0 and system_flag = 0;