Database Server Tuning and Maintenance
5. Database Server Tuning and Maintenance
The database is central to the functionality of TRIRIGA®. This database stores all data that is collected and calculated by the applications. The database also stores metadata for configuring and maintaining the environment. The database server processes all transactions from the applications. The integrated reporting function accesses the data in the database to generate documents. Reporting also generates resource-intensive management reports.
Because all functionality is based on database performance, the database should be a key focus for performance tuning. TRIRIGA recommends that the database server be co-located on the same subnet as the application and process servers.
For this reason, cloud database offerings such as IBM DB2 on Cloud Service or Oracle Database Exadata Cloud Service are not recommended for adequate performance. However, hosting the entire TRIRIGA infrastructure in the cloud where the TRIRIGA infrastructure is hosted on the same subnet is a viable configuration for good performance. In addition, customers have reported good performance using Oracle Exadata Database Machine in on-premise deployments.
Always check the IBM TRIRIGA Compatibility Matrix for the latest database versions and fix packs supported. Performance issues where the database server is not co-located with the application and process servers are not supported.
You should apply standard database-tuning techniques to TRIRIGA and periodically monitor production databases during peak load. You can use standard monitoring tools such as those native to the database platform. If necessary, adjust parameters to resolve the bottlenecks as suggested by the monitoring tools.
5.1 General Database Tuning
5.1.1 Indexing
Indexing a database requires a good understanding of the data, the user functions, and how databases use indexes. Indexes use key parts of data from a table in a binary structure to enhance searching capability. Each record of data in the table must have associated data in the index. Opportunities for indexing should be evaluated on a regular basis by analyzing the TRIRIGA logs using the Performance Analyzer for long running queries and/or using database tools such as Oracle AWR Reports and DB2 snapshots to identify long running queries.
When long running SQL queries are identified, you can use query tuning utilities native to the database platform to identify potential indexes to improve the query response time. Oracle SQL Developer, IBM Data Studio, and Microsoft SQL Server Management Studio each provide SQL tuning utilities. In general, each of the above utilities operate similarly in that the SQL query to be tuned is input into the tool, and the option to tune the query is selected. The utility will then list any potential indexes that can be created to improve query performance. For more information, see the documentation for your database platform.
Indexing can greatly increase search speeds. However, a drawback of indexes is that for each insert, update, or delete, the index must also be updated. Database administrators often apply many indexes to a table to enhance searching, and then sometimes find that other activities have slowed. Review and test all potential indexes to ensure that you have the right balance for searching and for updating tables.
TRIRIGA includes several built-in
indexes against IBS_SPEC_ASSIGNMENTS
and these should remain. Some additional
IBS_SPEC_ASSIGNMENTS
indexes are also recommended and these should be added, but
analyzed to ensure that the indexes are proven to help. Any additional indexes added to
IBS_SPEC_ASSIGNMENTS
are generally not recommended and should be avoided,
because they are particularly prone to cause issues, due to the volatile nature of such a large
table in TRIRIGA. However, if an
analysis proves that an additional index helps, then it can be added.
Indexes are extremely dependent on the configuration and data composition. Customers and partners must be proactive in analyzing the configurations and data for indexes that might be helpful. An index for one customer may not be helpful for another customer. The indexes that are shipped with TRIRIGA and recommended in these best practices have been shown to help in most use cases.
Also check existing indexes against newly recommended indexes to make sure there is no overlap. For example, the utilities may recommend an index on columns A, B, and C. However, there may be an existing index on columns A and B. Instead of having multiple indexes, simply modify the existing index to add column C to the end.
Customizing TRIRIGA can change the way you select information from the database. Some customizations include additional tables and columns. If you have customized TRIRIGA, carefully compare indexes to the user functions that use them. You may need to alter existing indexes to include new columns and create new indexes for new tables.
IBM recommends using the Database Table Manager tool to view, alter, drop, and create database table indexes as they will then be managed by the product. For more information, see Migrating database table indexes.
a. Understanding Your Data by Analyzing IBS_SPEC_ASSIGNMENTS Table
If your database is converted to enable Module Level
Associations (MLA), the ALLOW_REVERSE_ASSOCIATION
property will no
longer be honored. The Reverse Association flag in queries and reports
will be ignored, and only forward associations will be allowed in queries and reports.
As of 3.8, the Module Level Assignments converter tool is now renamed Module Level Associations.
TRIRIGA stores all associations
between records in the IBS_SPEC_ASSIGNMENTS
table. Over time, this table can grow,
and can exceed 100 million rows. Processes like the Platform Maintenance
Scheduler (Cleanup Agent) will take longer to finish because it queries this table to
clean associated data. By analyzing your IBS_SPEC_ASSIGNMENTS
table, you may be
able to identify data in your system that you can delete, and consequently, reduce the size of your
table.
For more information, see Analyzing Your IBS_SPEC_ASSIGNMENTS
Table.
5.1.2 Recommended Performance Tuning Indexes
IBM recommends tuning indexes that are the result of iterative performance tuning cycles and collaboration with the TRIRIGA development team. The indexes listed here are not included in the TRIRIGA base product unless otherwise stated.
These indexes, which are listed by database platform, provide significant performance improvements when measured against a broad performance test workload. While TRIRIGA recommends adding these indexes per their respective database platform, performance gains may vary depending on an array of factors including application usage, load patterns, hardware sizing, application, database server configuration, and so on. Database administrators should monitor databases for efficient index usage to determine the overall impact produced by applying the recommended indexes, and to determine additional indexes that will improve performance based on situational and data composition needs.
- See the section 5.3.2 IBM DB2 Application Platform Indexes.
- See the section 5.3.3 Reserve Indexes for DB2.
- See the section 5.4.2 Oracle Application Platform Indexes.
- See the section 5.4.3 Reserve Indexes for Oracle.
- See the section 5.5.2 Microsoft SQL Server Application Platform Indexes.
- See the section 5.5.3 Reserve Indexes for SQL Server.
5.1.3 Query Tuning
Review custom queries and reports for efficient SQL and use of indexes. In all of these cases, improving the efficiency of user queries also improves the efficiency of the workflows that use them.
Many long running queries and reports are typically found to be those that are not properly filtered. In addition, most of the filters in reports and queries are generated by individual users with run-time filter operators and by association filter operators defined in the Report Manager. While these are powerful features of TRIRIGA, they can potentially produce inefficient SQL.
- See the section 7.3.1 Query Tuning.
5.2 Database Maintenance
5.2.1 Database Statistics
To ensure that the database is using the optimal path to access data, statistics must be periodically generated and updated on all the tables and indexes.
The Platform Maintenance Scheduler (formerly Cleanup Agent) must be running and setup properly to ensure the statistics are updated daily. The server log on the server running the Platform Maintenance Scheduler agent must be analyzed regularly to verify the agent is completing successfully and executing the appropriate database statistics command needed. If the Platform Maintenance Scheduler agent is not being used for this task, then the statistics update must be scheduled to run on the back-end daily or weekly, depending on the system used.
IBM
DB2 provides the
runstats
tool to update statistics directly on the back-end database since the
Platform Maintenance Scheduler does not automatically call runstats
on that
database platform.
Oracle provides equivalent commands for manually updating statistics on the back-end database:
dbms_stats.gather_schema_stats
dbms_stats.gather_table_stats
dbms_stats.gather_index_stats
However, it is recommended that you use the Platform Maintenance Scheduler for this task.
5.2.2 Reorganizing Tables and Indexes
You may want to perform reorganizations to reclaim space in a table. When many updates are made on a table, the space can become fragmented. If no further growth is expected in the table, then the space can be reclaimed. However, if further inserts are expected, then the database should be effectively reusing the space within the table.
Another case where reorganizations or rebuilds can be helpful is to improve performance. When the data is not aligned on an index, performance can become degraded.
Some questions to consider before doing reorganizations or rebuilds are:
- Is fragmented space causing a performance problem?
- Will the records in the table be updated or will new records inserted within a short period?
- Is fragmented space resulting in inefficient storage utilization?
- Is the order of the table data aligned with a particular index?
- Has performance degraded over time or has it happened at a specific time?
- What changes could have caused the incident?
- How does the performance correlate to system CPU and I/O utilization?
- If table reorganization is done, what is the expected benefit? Running reorganization affects data availability, consumes system resources, and takes time to complete.
General recommendations for reorganizations and rebuilds are:
- Only run a reorganization against a table when careful analysis has been done to determine that a reorganization corrects the performance problem.
- Only reorganize tables that have had many deletes and updates to reclaim space when it has been determined that no further inserts will be done in the near term.
- Do not reorganize metadata tables.
- If it has been determined that a reorganization improves a performance issue with the table/index, then careful consideration must be given on whether the reorganization should be done online or offline while the system is down. System resources are required to do reorganizations. A reorganization can slow down the system. As well, consider that users must wait for locks during an online reorganization.
- Carefully maintain the size of the database by archiving and trimming tables where appropriate to reduce the time needed for reorganization.
- Index rebuilds can help performance greatly, but can slow the database system down greatly and therefore should be done offline if possible and only if needed.
The Platform Maintenance Scheduler (formerly Cleanup Agent) has commands
for IBM
DB2 that will automatically
REORG
tables and indexes. In addition, IBM
DB2 provides the
reorgchk
and reorg
tools to perform these activities directly on
the back-end database. However, it is recommended that you use the Platform Maintenance Scheduler
for this task.
For assistance in reorganizing tables and indexes on an Oracle database, see the Oracle Database Documentation Library. For a Microsoft SQL Server database, see the Reorganize and Rebuild Indexes.
5.2.3 Multibyte Character Sets
When using multibyte character sets (MBCS) on the back-end database system, be aware that there are performance implications. Single-byte character sets are more optimal for performance than multibyte character sets, and they also are the most efficient in terms of space requirements.
5.2.4 Archiving and Deleting Historical Data
Deleting unnecessary data from the database helps to optimize its performance. If archived to an external system or database, integrations or reports can be used to query the archived data. Archiving data can also facilitate upgrading to a new version of TRIRIGA because the historical data is not critical to day-to-day operations and might not be needed in the new system.
Some databases have built-in tools to assess data usage and aid in Information Lifecycle Management, such as heat maps. These tools can be utilized to help identify data that may be purged or archived on an ongoing basis.
5.2.5 Tablespaces
By default, TRIRIGA uses two
different tablespaces, TRIDATA_DATA
and TRIDATA_INDX
. These
specific names are not required and unique names may be used. The intention is to separate the data
tables from the indexes. TRIDATA_DATA
should contain all of the
IBM_TRIRIGA
tables and the TRIDATA_INDX
tablespace should store
all the indexes.
Over time, during the normal use of the system, custom indexes might not get created in the proper tablespace. This should be checked in a regular basis after changes have been made to the system.
In addition, a third tablespace might be ideal for those that have large amounts of data and heavy usage of the TRIRIGA Document Manager. Moving the tables that store the binary large objects (BLOBs) into their own tablespaces may be ideal for administration and for optimum performance. However, note that TRIRIGA does not currently manage multiple tablespaces. Therefore, any changes you make should be tracked and recreated as needed (for example, during upgrades).
5.2.6 Database Server Virtualization
Virtualization by nature will always add some level of overhead, and while virtualization is supported in all tiers of the stack, it is not recommended for the database tier due to its heavy utilization. Testing has shown an almost linear 30% CPU resource overhead with a database run on a VM, with transactions per second also decreasing by approximately 5%.
The TRIRIGA platform is not aware if it is executing SQL statements against a virtualized or physical database; it makes requests and awaits responses. However, with virtualization in place, these responses may be delayed slightly due to the aforementioned overhead, which is outside of TRIRIGA purview. In most cases, this delay is fractions of a second, and while negligible alone, this can add up during very heavy system usage and potentially become a noticeable performance degradation. Given this behavior, the best practice recommendation is to run the database on physical hardware.
If choosing the use of virtualization for the database, this becomes another layer to consider when diagnosing performance issues. Dedicated resources, including dedicated fast storage attached to the virtual environment, should be used.
Deployments in virtualized environments such as VMware might not see the same performance benefits from the recommendations made in these best practices. Using shared resources in virtualized environments is not recommended for optimal performance. Use dedicated CPU and memory resources instead.
In addition, internal testing has shown that putting the TRIRIGA database in a virtualized environment such as VMware can have significant performance impacts. These impacts increase rapidly with larger workloads. While some customers run their TRIRIGA database successfully in a virtual machine, IBM strongly recommends dedicated unvirtualized hardware for optimal performance.
If you deploy TRIRIGA on VMware, see the following tuning guides available from VMware:
5.2.7 Database Server Throughput
Make sure that the throughput of the database server meets expectations for enterprise-class fast data storage (8 Gbps Fibre Channel or better). Follow the instructions with the spreadsheet below that corresponds to your database server.
The purpose of spreadsheet is to show if a performance issue is at the database I/O level. This test removes any TRIRIGA platform code and runs a full table scan on one of the largest tables in the system. The number of blocks that are processed is divided by the number of seconds it takes to execute the full table scan (SQL 1). Then the result can be compared to a reference system configured with an 8 Gbps Fibre Channel SAN.
To test the throughput of the database server, perform the following steps:
- Step 1. Download and open the spreadsheet file that corresponds to your database server. Several statements are provided in each spreadsheet.
- Step 2. In a database tool that corresponds to your database server, such as Oracle SQL Developer, DB2 Data Studio, or SQL Plus, execute
the spreadsheet statements as your
TRIDATA
database user.
5.2.8 Database Disk-Caching Policies
In some cases, database performance might be affected by your disk-caching policies.
For example, if the hardware of your staging and production environments are identical, your staging disks might be configured with a "write-back" caching policy, while your production disks might be configured with a "write-through" caching policy. In this example, "write-back" might be as high as 4x faster than "write-through".
To verify that differences in disk-caching policies are affecting performance, you can perform a quick test on staging by switching one disk, or group of disks, from one caching policy to another policy, conducting a speed test, then switching back to the original policy for another speed test.
After verifying any differences in performance, to implement the disk-caching policy change to your production environment, perform the following steps:
- Step 1. Stop your production site and database.
- Step 2. Ensure that there are no processes or services that are either reading or writing to any of the production disks. Ensure that there are no side-effects due to cached data.
- Step 3. Change your production disks to the "write-back" caching policy.
- Step 4. Perform a quick speed test to verify that performance is improved.
- Step 5. Restart your production database and site. Depending on your environment, the outage should only take 30-45 minutes.
5.2.9 Creating Batches for BO Data Cleanup
If you are doing a large BO data cleanup exercise, and have hundreds of thousands or millions of
records to delete, you can create batches to spread the cleanup work over a period of time. The key
to do this is to understand the power of the mod
(modulus) function. The modulus
function returns the remainder as an integer when dividing one number by another. This is very
useful in grouping a large set of numbers into a nearly-equivalent number of groups or batches.
For example, run the following query to divide the set of spec_id
records into 7
roughly-equal batches.
- Oracle Database or IBM
DB2:
select count(*), mod(spec_id ,7) from ibs_spec group by mod(spec_id ,7);
- Microsoft SQL Server:
select count(*), spec_id % 7 from ibs_spec group by spec_id % 7;
With this information, you can now plan on splitting the cleanup work into a more reasonable
number of records to clean up. Typically, cleaning 10,000-20,000 records would take about 30-60
minutes. You can modify the query above, adding the where
clause to filter only the
records to be deleted.
- First, find the total number of records to be deleted. Then run the following
query:
select count(*)/20000 from ibs_spec where object_id < 0 and system_flag = 0;
For example, let's say that the result of this query is 7 (batches). Then run the following query to verify the number of records that will be cleaned up in the following days:
- Oracle Database or IBM
DB2:
select count(*), mod(spec_id ,7) from ibs_spec where object_id < 0 and system_flag = 0 group by mod(spec_id ,7);
- Microsoft SQL Server:
select count(*), spec_id % 7 from ibs_spec where object_id < 0 and system_flag = 0 group by spec_id % 7;
To implement this, update the ibs_spec
table and set the
updated_date
from today, adding a day into the future:
- Oracle Database or IBM
DB2:
update ibs_spec set updated_date = sysdate + mod(spec_id ,7) where object_id < 0 and system_flag = 0;
- Microsoft SQL Server:
update ibs_spec set updated_date = getdate() + spec_id % 7 where object_id < 0 and and system_flag = 0;