
Mustgather: Investigating LQE or LDX performance problems

How To


Summary

Details of data to be collected when you experience performance problems with the IBM Lifecycle Query Engine (LQE) and the IBM Link Index Provider (LDX)

Objective

To help the IBM Engineering Lifecycle Management (ELM) support team investigate LQE and LDX performance concerns, collect the following details and upload them to the support case.

Steps

1: Describe the issue

Describe in detail what happens:
  • Was the LQE/LDX page available? Did you have the issue with all reports or only with some?
  • When exactly (date and time) did you experience the issue?
  • Did a restart resolve the issue?

2: Collect detailed logs from the LQE or LDX server after a restart of the server
For detailed instructions on how to collect the logs, see Running IBM Support Assistant Data Collector. Note: you can gather this data right after restarting the server.

3: Collect 4 to 6 Java cores at a 3 to 5-minute interval
Use the instructions from the How to gather Java cores for different application servers in Engineering Lifecycle Management applications document. Note: these files need to be gathered before restarting the server.
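A minimal sketch of how this can be done manually on Linux with the IBM JVM (the process lookup pattern is an assumption; adjust it to match your Liberty server process):

  # Find the Liberty server process ID (the pattern is an example, adjust to your environment)
  PID=$(pgrep -f "ws-server.jar" | head -n 1)
  # Request 5 javacores, one every 3 minutes; kill -3 (SIGQUIT) makes the IBM JVM write a javacore
  for i in 1 2 3 4 5; do kill -3 "$PID"; sleep 180; done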

4: Collect Verbose GC logs

The default location of the verbose GC log files is <elm-installation>\server\liberty\servers\clm. These files can be gathered after a server restart.
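If verbose GC logging is not already enabled, a minimal sketch of the entries to add to the Liberty jvm.options file for the IBM J9 JVM (the log file name and the rotation values of 5 files with 10,000 GC cycles each are examples, not mandated settings):

  -verbose:gc
  -Xverbosegclog:logs/verbosegc.%seq.log,5,10000

Restart the server after changing jvm.options so that the options take effect.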

5: Get the heapdumps and Java cores

If there was an out-of-memory exception, javacore, heapdump, or core files were generated in your server installation directory. Collect all of these files.
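On Linux, a quick way to locate these artifacts is to search the installation directory for the default IBM JVM file name patterns (the search root below is an example; adjust it to your installation path):

  find /opt/IBM/JazzTeamServer -type f \( -name "javacore.*.txt" -o -name "heapdump.*.phd" -o -name "core.*.dmp" -o -name "Snap.*.trc" \)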

 

6: Collect information about your disk speed
Run a standard tool applicable to your operating system (for example, dd or hdparm on Linux) to check your disk speed.
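For example, a simple write and read check on Linux (the test file path and the device name are examples; point them at the disk that hosts the LQE or LDX index, and note that hdparm usually requires root):

  # Sequential write speed, bypassing the page cache
  dd if=/dev/zero of=/opt/IBM/dd-testfile bs=1M count=1024 oflag=direct
  # Cached and buffered read speed of the underlying device
  hdparm -tT /dev/sda
  # Remove the test file afterwards
  rm /opt/IBM/dd-testfile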

7: Collect the LQE or LDX export data
LQE or LDX metrics can be found at the LQE export link: https://${server}:${port}/lqe/export
The link downloads a compressed file with the following naming convention: lqe_metadata_export_YYYYMMMDDhhmmss###.zip

NOTE: The metrics data is stored in the database, so this data is still useful even after a server restart. If you were unable to gather it before the restart, gather it immediately after the restart.
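If you prefer to download the export from the command line, a sketch with curl (this assumes the export URL accepts basic authentication; with form-based or Jazz authentication, download the file from the browser instead):

  curl -k -u adminUser:password -o lqe_metadata_export.zip "https://${server}:${port}/lqe/export"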

8: Document the LQE or LDX pages with screen captures:
a) Latest running and completed queries in the Query Statistics page.
Capture the following LQE or LDX pages:

  • The "Completed Queries" section
    https://${server}:${port}/lqe/web/health/query-stats#section=completed
    Capture the query graph and some of the details of the first 20 queries listed.
  • The "Running Queries" section
    https://${server}:${port}/lqe/web/health/query-stats#section=running
    For a historical point of view, select a time range that represents the perceived slowness and capture the long-running queries reported.

b) Advanced Properties timeout values
https://${server}:${port}/lqe/web/admin/advanced

c) Statistics
Collect the Server statistics
https://${server}:${port}/lqe/web/health/stats

d) LQE or LDX Overview
https://${server}:${port}/lqe/web/health/overview
This view gives details on the index sizes.

9: Collect the value of the pending Journal Writebacks
This value is available on the Statistics page.
You can also get it by using the queuedCommits value in the LQE MBeans. See Monitoring the performance of Lifecycle Query Engine using MBeans.
In the file system, you can review the size of the journal files on the server that hosts LQE or LDX. When the file size is not 0, it indicates that you have pending Journal Writebacks.
Collect the size on disk of the journal.jrnl files by using the following commands in the LQE or LDX ${JAZZ_HOME}/server/conf/lqe (or ldx) directory:

Linux: ls -lsa * | grep -e "jour*" -e "Tdb"

Example:

[user@ELM lqe]$ pwd

/opt/IBM/JazzTeamServer/server/conf/lqe

[chevalie@ELM706 lqe]$ ls -lsa * | grep -e "jour*" -e "Tdb"

historyTdb:

  0 -rw-r-xr-x  1 root root        0 Mar 26 14:01 journal.jrnl

indexTdb:

   0 -rw-r-xr-x  1 root root        0 Mar 26 14:01 journal.jrnl

shapeTdb:

   0 -rw-r-xr-x  1 root root        0 Mar 26 14:01 journal.jrnl

versionTdb:

  0 -rw-r-xr-x  1 root root        0 Mar 26 14:01 journal.jrnl

Windows: dir /S | findstr  "Tdb\> journ"

C:\Products\IBM\LQE_6061\server\conf\lqe>dir /S | findstr  "Tdb\> journ"

03/25/2020  04:02 PM    <DIR>          historyTdb

03/25/2020  04:02 PM    <DIR>          indexTdb

03/25/2020  04:02 PM    <DIR>          shapeTdb

03/25/2020  04:02 PM    <DIR>          versionTdb

 Directory of C:\Products\IBM\LQE_6061\server\conf\lqe\historyTdb

03/27/2020  12:00 AM                 0 journal.jrnl

 Directory of C:\Products\IBM\LQE_6061\server\conf\lqe\indexTdb

03/20/2020  08:43 AM                 0 journal.jrnl

 Directory of C:\Products\IBM\LQE_6061\server\conf\lqe\shapeTdb

03/20/2020  08:43 AM                 0 journal.jrnl

 Directory of C:\Products\IBM\LQE_6061\server\conf\lqe\versionTdb

03/20/2020  08:43 AM                 0 journal.jrnl

On a quiet LQE or LDX server, the expected size of the pending Journal Writebacks is 0. A non-zero value does not necessarily indicate a problem; it can simply mean that the server is busy reading the index for a SPARQL query. However, the larger the value gets, the larger the impact on performance. It is not only the number of pending Journal Writebacks that is important; you also need to check for how long the pending Journal Writebacks stay continuously above 0.

NOTE: It is recommended to gather the journal file size or the pending Journal Writebacks value continuously, for example every minute. If the issue appears, collect the values that you recorded so far to see how this value was growing over time.
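A minimal sketch of such a collection loop on Linux (the conf directory path and the output file are examples):

  cd /opt/IBM/JazzTeamServer/server/conf/lqe
  # Record the journal file sizes once per minute until interrupted
  while true; do date; ls -lsa * | grep -e "jour*" -e "Tdb"; sleep 60; done >> /tmp/journal-size.log 2>&1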

10: Check for long-running, expensive operations

Operations such as long updates to the index, backups, or compaction are relatively expensive and are potential contributors to performance degradation. Navigate to the Data Sources section and check whether any of these operations were running at the time of the issue. Make a screen capture of the Data Sources page and attach it to the support case.

11: Gather proxy server logs

If you use a set of LQE servers behind a proxy server, gather the access logs from that proxy server.

For an IBM HTTP Server (IHS), the log file is named access.log.

12: Provide the report from your monitoring tool

Provide the report from a monitoring tool such as Splunk, Instana, or another APM tool, if you configured one to gather data from the server.
The most significant usage parameters in the report are: processor, memory, Java heap, thread connection pool, garbage collection time, number of active services, disk and network I/O, and number of expensive scenarios. Two charts are required: one from the last week and one from the last day.

Document Location

Worldwide

[{"Type":"SW","Line of Business":{"code":"LOB59","label":"Sustainability Software"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSTU9C","label":"Jazz Reporting Service"},"ARM Category":[{"code":"a8m0z000000GmlzAAC","label":"Jazz Reporting Service-\u003ELifecycle Query Engine-\u003EPerformance \/ MBeans"}],"ARM Case Number":"TS003523629","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Version(s)"}]

Document Information

Modified date:
20 September 2024

UID

ibm16128283