Configuring Lifecycle Query Engine with Jena or Link Index Provider with Jena for improving performance and scalability

When you configure Lifecycle Query Engine with Jena or Link Index Provider (LDX) with Jena, you can make deployment decisions and configuration choices that improve its performance and scalability.

Hardware recommendations

  • Deploy Lifecycle Query Engine with Jena or LDX with Jena on a dedicated computer that has adequate CPU, memory, and hard disk capacity.
  • For best performance, deploy Lifecycle Query Engine with Jena or LDX with Jena as a stand-alone application on a web server.
  • If you deploy Lifecycle Query Engine with Jena or LDX with Jena on a virtual machine, use a dedicated VM with dedicated resources, such as CPU, RAM, and disk I/O.

RAM

Lifecycle Query Engine with Jena or LDX with Jena uses two kinds of memory: heap memory and native memory. Heap memory is allocated through the JVM properties and is used by Lifecycle Query Engine with Jena or LDX with Jena for various heap allocations. The operating system allocates native memory on demand; Lifecycle Query Engine with Jena or LDX with Jena uses it to load the index into memory. Calculate the amount of RAM that the system needs for Lifecycle Query Engine with Jena or LDX with Jena to run efficiently from the heap and native memory projections.
Note: The following recommendations are sufficient for any current data set operation. If you continue to add new projects, users, or data, you must have sufficient memory for future growth.
  • The JVM heap size for Lifecycle Query Engine with Jena or LDX with Jena must be 4 GB or greater. The data set size is the sum of the sizes of all the index folders on the hard disk. For a data set size greater than 16 GB, the heap size must be 25% of the data set size. For example, if the estimated size of the data set on the disk is 200 GB, the heap allocation must be 50 GB (25% of 200 GB).
  • In addition to the JVM heap allocation, sufficient free memory must be available to load the data set into memory. For example, if the estimated indexed data set is 200 GB on the disk, at least 200 GB of memory must be available in addition to the 50 GB (25% of 200 GB) JVM heap allocation. In this case, the total memory that is reserved for Lifecycle Query Engine with Jena or LDX with Jena must be at least 250 GB.

These memory settings are the same for reindexing (whether or not direct I/O mode is used) and for normal usage of Lifecycle Query Engine with Jena or LDX with Jena.

Reserved memory for Lifecycle Query Engine with Jena or LDX with Jena must be in addition to the memory that is required by the operating system and other processes.
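
The sizing rules above can be expressed as a small calculation. The following Python sketch is illustrative only: the function name lqe_memory_plan and its result keys are hypothetical, and it encodes only the rules stated in this section (a minimum heap of 4 GB, a heap of 25% of the data set size above 16 GB, and native memory equal to the data set size). It does not include memory for the operating system, other processes, or future growth.

```python
MIN_HEAP_GB = 4.0      # minimum JVM heap stated in this section
HEAP_FRACTION = 0.25   # heap is 25% of the data set size above 16 GB


def lqe_memory_plan(dataset_gb):
    """Return heap, native, and total memory (GB) for a data set size on disk."""
    if dataset_gb < 0:
        raise ValueError("dataset_gb must not be negative")
    # 25% of 16 GB is exactly the 4 GB minimum, so max() covers both rules.
    heap_gb = max(MIN_HEAP_GB, HEAP_FRACTION * dataset_gb)
    # Native memory must be able to hold the whole data set.
    native_gb = dataset_gb
    return {
        "heap_gb": heap_gb,
        "native_gb": native_gb,
        "total_gb": heap_gb + native_gb,
    }


if __name__ == "__main__":
    # The 200 GB example from the text: 50 GB heap, 250 GB in total.
    print(lqe_memory_plan(200))
```

For the 200 GB example, the sketch returns a 50 GB heap and a 250 GB total, which matches the figures in the text.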

Estimating the size of the data set

The data set size is the sum of the sizes of the index folders: size of indexTdb + size of shapeTdb + size of versionTdb + size of textIndex + size of shapeText. Use this total as the data set size in the memory calculations.
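
One way to obtain this total is to sum the file sizes under each index folder. The following Python sketch assumes that all five folders are located under one parent directory; the index_root parameter and the function names are hypothetical, and folders that do not exist are skipped.

```python
import os

# Folder names as listed in this section.
INDEX_FOLDERS = ("indexTdb", "shapeTdb", "versionTdb", "textIndex", "shapeText")


def folder_size_bytes(path):
    """Sum the sizes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):  # do not count symbolic links
                total += os.path.getsize(file_path)
    return total


def dataset_size_bytes(index_root, folders=INDEX_FOLDERS):
    """Sum the sizes of the index folders that exist under index_root."""
    total = 0
    for folder in folders:
        path = os.path.join(index_root, folder)
        if os.path.isdir(path):
            total += folder_size_bytes(path)
    return total
```

Divide the result by 1024 ** 3 to express it in GB for the memory calculations.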

Configuring the JVM heap
  • The minimum size of the JVM heap can be configured by using the -Xms JVM property. For example, if the estimated heap size is 4 GB, it can be configured as -Xms4G.
  • Set the maximum size of the JVM heap, which is specified by the -Xmx JVM property, to the same value as the minimum size. For example, to set the maximum size to 4 GB, use -Xmx4G.
  • Set the heap nursery size to one-fifth of the maximum heap. For example, the nursery size for -Xmx5G is one-fifth of 5 GB or -Xmn1G.
  • You might need extra heap memory to support high query loads.
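
As an illustration, the heap rules in this list can be turned into JVM options. In the following Python sketch, the function name jvm_heap_flags is hypothetical. The nursery size is written in megabytes so that heap sizes that are not a multiple of 5 GB still produce a whole number; -Xmn1024M is equivalent to the -Xmn1G value from the example above.

```python
from typing import List


def jvm_heap_flags(heap_gb: int) -> List[str]:
    """Build -Xms, -Xmx, and -Xmn options for a heap size in whole GB."""
    if heap_gb < 4:
        raise ValueError("the heap must be 4 GB or greater")
    heap_mb = heap_gb * 1024
    nursery_mb = heap_mb // 5  # one-fifth of the maximum heap, rounded down
    return [
        f"-Xms{heap_gb}G",      # minimum heap
        f"-Xmx{heap_gb}G",      # maximum heap, same value as the minimum
        f"-Xmn{nursery_mb}M",   # nursery
    ]


if __name__ == "__main__":
    print(" ".join(jvm_heap_flags(5)))
```

The sketch does not add the extra heap that high query loads might need; increase the input value for that case.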
CPU

Deploy Lifecycle Query Engine with Jena or LDX with Jena on servers with CPUs that have clock speeds greater than 2 GHz. CPUs with higher clock speeds increase indexing performance and reduce query execution times.

Deploy Lifecycle Query Engine with Jena or LDX with Jena on servers with multi-core CPUs to increase the capacity for concurrent query executions.

Storage

Deploy Lifecycle Query Engine with Jena or LDX with Jena on servers with solid-state drives (SSDs). SSDs offer significant improvements for disk read and write operations, which improves indexing and query execution performance. SSDs can increase indexing performance by a factor of two.

Network

Deploy Lifecycle Query Engine with Jena or LDX with Jena on the same network as its data sources. A faster network improves indexing performance because Lifecycle Query Engine with Jena, LDX with Jena, and the data providers can respond to requests more quickly.

Operating system recommendations

Deploy Lifecycle Query Engine with Jena or LDX with Jena on a Linux-based system. Although Lifecycle Query Engine with Jena or LDX with Jena works on a Windows-based system, Linux-based systems provide slightly better performance. On Windows-based systems, you might encounter issues when Lifecycle Query Engine with Jena or LDX with Jena needs to make large writes to disk. For more information, see the document titled IBM Engineering Lifecycle Query Engine performance is very poor for query and indexing on Windows servers.

On Linux systems, use a kernel version later than the 3.x level where possible.

Server deployment recommendations

IBM® WebSphere® Liberty settings
Change the lazy load setting (the deferServletLoad attribute of webContainer) in the server.xml file in the server/liberty/servers/clm folder from
webContainer deferServletLoad="true"
to
webContainer deferServletLoad="false"

Server settings

On the server where Lifecycle Query Engine with Jena or LDX with Jena is deployed, increase the maximum number of threads for request processing. This increases the number of simultaneous requests that the server, and therefore Lifecycle Query Engine with Jena or LDX with Jena, can handle. On Tomcat, set the number of threads to a value in the range 200 - 250. This range applies when Lifecycle Query Engine with Jena or LDX with Jena is the only application on the server; if other web applications are installed on this server, consider increasing this value further.

On the server where Lifecycle Query Engine with Jena or LDX with Jena is deployed, increase the queue size for the maximum number of incoming requests. Increasing this value allows more incoming requests to be queued before the server rejects them. On Tomcat, increase the number of queued requests from the default of 100 to 250. This value applies when Lifecycle Query Engine with Jena or LDX with Jena is the only application on the server; if other web applications are installed on this server, consider a higher value.
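
The following Python sketch shows one way to review these two settings in a Tomcat server.xml file. It assumes that the thread and queue settings described above correspond to the maxThreads and acceptCount attributes of the Tomcat Connector element, with Tomcat defaults of 200 and 100; this mapping is an assumption and is not stated in this document. The function name check_connectors is hypothetical. The sketch does not handle connectors that delegate to a shared Executor or attribute values that use property placeholders.

```python
import xml.etree.ElementTree as ET

# Assumed Tomcat defaults when an attribute is absent from the Connector.
DEFAULT_MAX_THREADS = 200
DEFAULT_ACCEPT_COUNT = 100

# Targets taken from the recommendations above.
MIN_MAX_THREADS = 200
MIN_ACCEPT_COUNT = 250


def check_connectors(server_xml_text):
    """Return (port, attribute, value) for each setting below the target."""
    findings = []
    root = ET.fromstring(server_xml_text)
    for connector in root.iter("Connector"):
        port = connector.get("port", "?")
        max_threads = int(connector.get("maxThreads", DEFAULT_MAX_THREADS))
        accept_count = int(connector.get("acceptCount", DEFAULT_ACCEPT_COUNT))
        if max_threads < MIN_MAX_THREADS:
            findings.append((port, "maxThreads", max_threads))
        if accept_count < MIN_ACCEPT_COUNT:
            findings.append((port, "acceptCount", accept_count))
    return findings


if __name__ == "__main__":
    sample = (
        '<Server><Service name="Catalina">'
        '<Connector port="9443" maxThreads="150"/>'
        "</Service></Server>"
    )
    for finding in check_connectors(sample):
        print(finding)
```

An empty result means that every connector in the file meets both targets under these assumptions.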

Lifecycle Query Engine recommendations

Indexing performance

Indexing performance depends on several factors: CPU processing capability, hard disk read/write speeds, and network latency. Lifecycle Query Engine with Jena or LDX with Jena is a highly concurrent web application that uses multi-core CPUs for parallel processing. Because the index is written to a hard disk, the disk must support fast read/write speeds. When Lifecycle Query Engine with Jena or LDX with Jena indexes a data provider, the network must allow Lifecycle Query Engine with Jena or LDX with Jena and the data provider to exchange HTTP messages quickly.

For best indexing performance:
  • Deploy Lifecycle Query Engine with Jena or LDX with Jena on servers with CPUs that have clock speeds greater than 2 GHz. Faster CPUs increase indexing performance.
  • Deploy Lifecycle Query Engine with Jena or LDX with Jena on servers with multi-core CPUs to increase the capacity for concurrent processing.
  • Deploy Lifecycle Query Engine with Jena or LDX with Jena on servers with SSDs when possible to increase indexing performance. SSDs can increase indexing performance by a factor of two.
  • Deploy Lifecycle Query Engine with Jena or LDX with Jena on the same network subsystems as its data sources for faster indexing performance.
  • Increase the number of threads for first-time and incremental indexing to achieve higher throughput. For more information, see Connecting Lifecycle Query Engine or Link Index Provider to applications that use the same Jazz Team Server.

Reindexing

You can add more threads to fetch resources from the data provider for indexing, reindexing, or change log processing. Consider the number of vCPUs available to the Lifecycle Query Engine with Jena or LDX with Jena servers, and whether more threads might put a higher load on the data provider server.

Indexing or reindexing

  • For indexing or reindexing, if you have a low number of vCPUs, use the default settings.
  • For indexing or reindexing, if you have more than 20 vCPUs and the data provider server can take the extra load, then you can increase the number of threads to 6-8.
  • Increasing the number of threads can improve the time that is required for indexing or reindexing.

For change log processing, a change to the default number of threads might not make a noticeable difference unless you have a high volume of changes from the data source. For more information, see Configuring data providers.

Query performance or scalability
Query performance also depends on many factors, such as CPU processing capability, RAM capacity, hard disk read/write speeds, network latency, indexed data set size, data complexity, and query optimization.
  • Increased CPU capacity increases query execution performance.
  • Increased RAM capacity improves in-memory computations and prevents potential memory constraints.
  • SSDs reduce read and write times.
  • The indexed data set size makes a significant difference, because a larger data set increases the number of nodes that queries must traverse.
  • All queries must be optimized by reducing the result set as early as possible in the query structure. Query response times must target less than 100 milliseconds (ms) for optimum scaling in larger data sets. Understanding the data structures helps you restructure a query for optimum efficiency.
For query performance:
  • Deploy Lifecycle Query Engine with Jena or LDX with Jena on servers with CPUs that have clock speeds greater than 2 GHz. Faster CPUs increase query execution performance.
  • Increase RAM capacity to the expected data set size * 1.25 to improve query performance (see earlier examples).
  • Deploy Lifecycle Query Engine with Jena or LDX with Jena on servers with SSDs when possible to increase read performance for query execution.
  • Deploy Lifecycle Query Engine with Jena or LDX with Jena on the same network subsystems as its data sources.
  • Always optimize queries to run faster than 100 ms. Queries can be written in many ways, and narrowing the results efficiently depends on understanding the data structure, data set size, and data complexity.
  • Adjust the HTTP Connection timeout (seconds) and Socket timeouts (seconds) property values on the Lifecycle Query Engine or LDX > Administration > Advanced Properties page based on the response times for your optimized queries.

For more information, see this Jazz.net article.

Recommended defaults for Lifecycle Query Engine with Jena or LDX with Jena primary and secondary logs for Db2

Because the Lifecycle Query Engine with Jena or LDX with Jena component processes data in parallel when indexing, the database might be busy. The following settings are guidelines. Adjust them depending on your data load, both initially and over time.

If the MAXAPPLS database parameter is not set to AUTOMATIC, increase its value to 300 to allow for the concurrent connections that Lifecycle Query Engine with Jena or LDX with Jena uses to process data.

db2 update db cfg for LQE using maxappls 300
db2 update db cfg for LQE using locklist 20000
db2 update db cfg for LQE using logfilsiz 20000
db2 update db cfg for LQE using logprimary 25
db2 update db cfg for LQE using logsecond 100

Backing up Lifecycle Query Engine with Jena or LDX with Jena

Compaction

Schedule your compactions to run before the backups.

Run compaction at least once a week on Windows and at least once every two weeks on Linux.
Note: If the total index size changes by more than 5% between compactions, run compactions more frequently.
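
The 5% rule in the note can be checked by comparing the total index size recorded at two consecutive compactions. The following Python sketch is a minimal illustration; the function name and the idea of recording the sizes yourself are assumptions, not a product feature.

```python
def index_change_exceeds_threshold(previous_bytes, current_bytes, threshold=0.05):
    """Return True if the index size changed by more than the threshold."""
    if previous_bytes <= 0:
        raise ValueError("previous_bytes must be positive")
    # Relative change in either direction (growth or shrinkage).
    change = abs(current_bytes - previous_bytes) / previous_bytes
    return change > threshold


if __name__ == "__main__":
    # A change from 100 GB to 106 GB is 6%, which suggests more frequent compaction.
    print(index_change_exceeds_threshold(100 * 1024**3, 106 * 1024**3))
```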

Compaction must be done during a scheduled maintenance window.

Compaction requires free disk space equal to the total size of each tdb. Like a backup, compaction can fail because of this disk space check. The check is done against the location of the index folder, which in this case is on the same partition as the backup location.

If you see the following error in the log, complete the following steps:
2019-01-01 00:48:00,907 [lqe.BackupScheduler0-task-thread-0] ERROR bm.team.integration.lqe.lib.backup.impl.BackupTask - CRLQE0475E A fatal error occurred during backup.
  1. Ensure that the backup area has sufficient space: at least twice the size of the index directory.
  2. If you need to troubleshoot further, edit the conf/lqe/log4j2.xml file. Change the backup logging to debug level instead of trace, and add logging for the compaction.
    <Logger name="log4j.logger.com.ibm.team.integration.lqe.lib.backup.BackupScheduler" level="DEBUG" additivity="false">
      <AppenderRef ref="mainLog"/>
    </Logger>
    <Logger name="log4j.logger.com.ibm.team.jis.lqe.compaction.CompactionScheduler" level="DEBUG" additivity="false">
      <AppenderRef ref="mainLog"/>
    </Logger>
    <Logger name="log4j.logger.com.ibm.team.jis.lqe.compaction.CompactionTask" level="DEBUG" additivity="false">
      <AppenderRef ref="mainLog"/>
    </Logger>
    <Logger name="log4j.logger.com.ibm.team.jis.lqe.compaction.CompactionUtils" level="DEBUG" additivity="false">
      <AppenderRef ref="mainLog"/>
    </Logger>
    Reload log4j2.xml from the Lifecycle Query Engine or LDX with Jena application.
  3. Run compaction. Capture the lqe.log files when it finishes.
  4. Run a backup.

Backup

Backups must be done during a scheduled maintenance window. If you do not use Global Configurations, the free disk space requirement for the backup folder is (twice the size of each tdb) + 300k (100k per tdb). If you use Global Configurations, use the formula from the document titled The backup fails with "not enough space available" despite there is still available disk space. The backup operation requires at least two times the size of the Lifecycle Query Engine with Jena or LDX with Jena indexes because the backup process copies the indexes, together with the Lifecycle Query Engine with Jena or LDX with Jena type system model, selections, and selects from the relational database, to the backup location on the disk. The indexes can be compressed to reduce the size.

When the compress backup option is selected, a second size check is done after the backup folder is created. That second size check requires free disk space equal to the size of the backup folder.
Note: Lifecycle Query Engine with Jena or LDX with Jena backups are not useful if the backup is older than the rebase period of the TRS feeds of IBM Engineering Test Management (Engineering Test Management) or IBM Engineering Requirements Management DOORS Next (DOORS Next), which must be 30 days at most. Therefore, do not keep Lifecycle Query Engine with Jena or LDX with Jena backups for more than a month.
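
The free space formula for backups without Global Configurations can be checked before a maintenance window. The following Python sketch is a rough illustration under two assumptions: that "100k" means 100 × 1024 bytes, and that you supply the size of each tdb yourself. The function names are hypothetical, and the sketch does not cover the Global Configurations formula or the second check for compressed backups.

```python
import shutil

# Assumption: "100k per tdb" is read as 100 * 1024 bytes.
PER_TDB_OVERHEAD_BYTES = 100 * 1024


def required_backup_bytes(tdb_sizes_bytes):
    """Twice the size of each tdb plus the per-tdb overhead."""
    return sum(2 * size + PER_TDB_OVERHEAD_BYTES for size in tdb_sizes_bytes)


def has_space_for_backup(backup_dir, tdb_sizes_bytes):
    """Compare the free space on the backup partition with the requirement."""
    free_bytes = shutil.disk_usage(backup_dir).free
    return free_bytes >= required_backup_bytes(tdb_sizes_bytes)


if __name__ == "__main__":
    # Hypothetical sizes for three tdb folders, in bytes.
    sizes = [40 * 1024**3, 5 * 1024**3, 2 * 1024**3]
    print(required_backup_bytes(sizes))
    print(has_space_for_backup(".", sizes))
```

With three tdb folders, the overhead term adds up to the 300k figure in the text.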

Migration

If you are moving Lifecycle Query Engine with Jena or LDX with Jena data from Jazz Reporting Service version 6.0.0 to 7.0.0, first remove the IBM Engineering Requirements Management DOORS Next TRS from Lifecycle Query Engine with Jena or LDX with Jena. Lifecycle Query Engine with Jena or LDX with Jena must run through a lengthy migration, and removing the TRS shortens the migration time. After you move the Lifecycle Query Engine with Jena or LDX with Jena data to Jazz Reporting Service version 7.0.0, rebase the DOORS Next TRS and add it back to Lifecycle Query Engine with Jena or LDX with Jena.

How long does it take for Report Builder to see changes made in the application data?

When you update application data, the time that it takes for the changes to be reflected varies depending on the process. Consider the following scenarios and the time that each takes to be reflected in Report Builder.

A project area is created, modified, deleted, or archived:

When a project area is created, Lifecycle Query Engine with Jena might index the change within a few minutes, because Lifecycle Query Engine with Jena reads the TRS feed every 60 seconds* by default. When a project area is modified, deleted, or archived, a large number of TRS change events can be created.

The larger the project area is, the longer it might take to respond to all the change events. Very large project areas can take minutes to hours. Report Builder needs to rebuild the metamodel for project area changes, but can query project area changes in a report after Lifecycle Query Engine with Jena has indexed the changes.

Access membership of a project area is changed:

Lifecycle Query Engine with Jena determines access membership from project areas by using Reportable REST APIs. By default, Lifecycle Query Engine with Jena initiates this process every 15 minutes. You can change this interval by editing the Access Context List Refresh Rate advanced property. Changes to access membership might be reflected in Jazz Reporting Service within 20 - 30 minutes of the change to the project area.

An artifact is created, deleted, or modified:

An artifact modification is automatically indexed in Lifecycle Query Engine with Jena by TRS feed updates within 60 seconds* by default. The change is typically apparent in the report results within these 60 seconds, but might take longer depending on the pending changes in the change log and the response time of the data provider.

Resource shape is changed:

Artifact type changes are automatically indexed in Lifecycle Query Engine with Jena by TRS feed updates. Lifecycle Query Engine with Jena reads the TRS feed every 60 seconds* by default. The change might take longer depending on the pending changes in the change log and the response time of the data provider.

In Report Builder, the artifact type changes are not reflected in the metamodel until the metamodel is refreshed, either manually or automatically. By default, the automatic refresh occurs every 12 hours, at 6:00 AM and 6:00 PM (the default is set in app.properties).

Lifecycle Query Engine with Jena vocabulary is modified:

Lifecycle Query Engine with Jena vocabulary updates are automatically indexed in Lifecycle Query Engine with Jena within 60 seconds*, depending on the change log size and the response time of the data provider. In Report Builder, any artifact type changes are not reflected in the metamodel until the metamodel is refreshed, either manually or automatically. By default, the automatic refresh occurs every 12 hours, at 6:00 AM and 6:00 PM (the default is set in app.properties).

Failed patches in Lifecycle Query Engine with Jena:

If patches fail in Lifecycle Query Engine with Jena during indexing, report results might be incomplete. The failed patches are retried after 60 minutes, after which the report results are complete.

Note:

* Lifecycle Query Engine with Jena reads the TRS feed every 60 seconds by default. To change this refresh rate, change the value of the Refresh rate (seconds) property for the respective data provider.