This document applies to Rational DOORS Next Generation 6.0.6 iFix003 and later.
It is possible to construct a query across a large data set that runs for a long time. Such a long-running query might consume resources and cause the system to become unstable.
A new Rogue Query Monitor is introduced to capture these queries.
It is now possible to abort a long-running query before it impacts system stability.
Queries that exceed a specified timeout produce a warning message. The message is displayed to the user in the web UI and is also written to rm.log on the server. Since 6.0.6 iFix003, it is possible to automatically abort queries that exceed a defined timeout.
Note: Some RM internal queries are excluded from monitoring because they are expected to be long running. These include ETL jobs for Reporting.
This document details how to use the timeout values and how to disable this feature, if required.
The following steps apply to the new Rogue Query Monitor functionality introduced as a stability fix within Rational DOORS Next Generation 6.0.6 iFix003.
The RM server uses the following advanced properties for the Rogue Query Monitor:
1) Rogue Query Monitor run interval
- Name - Rogue Query Monitor run interval (in seconds)
- Description - Run interval in seconds for the Rogue Query Monitor. A value of 0 disables the Rogue Query Monitor
- Default value - 30 seconds
- Changing the value requires a server restart to become active
- A value of 0 seconds results in no query monitor running. This reverts to the pre-iFix003 behavior and must be used only under the direction of IBM Support.
2) Rogue Query timeout
- Name - Rogue SPARQL Query abort timeout (in ms)
- Description - Enables the RM Rogue Query Monitor to abort exceedingly long running SPARQL queries. If the execution of a query exceeds the amount of time in milliseconds specified here, the query execution aborts to avoid locking up the server
- Default value - 60000 (1 minute)
- The value can be changed on a running server
- There is no minimum setting. The value is added to the client timeout value; a negative value such as -1 (ms) is deducted from it, and a value of 0 leaves the client timeout as the effective limit
3) Web UI Query timeout (exists already)
- Name - query.client.timeout
- Description - Time after which the web UI expects a query to time out
- Default value - 30000 milliseconds (30 seconds)
Allowed runtime calculation
Logic used by the RM Rogue Query Monitor to calculate query abort:
Query start time + (Web UI Query timeout + Rogue Query timeout) < current time: abort query in Jena by using the Thread ID of the query
Using this calculation:
The minimum query runtime with the default settings is (30 seconds client timeout + 60 seconds rogue timeout) + 1 second for the rogue query monitor interval = 91 seconds.
The maximum query runtime with the default settings is (30 seconds client timeout + 60 seconds rogue timeout) + 30 seconds for the rogue query monitor to take effect = 120 seconds (plus a minimal delay while the query monitor iterates through each of the queries running at that moment).
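The arithmetic above can be checked with a minimal Python sketch. It is only an illustration, not the RM server's implementation: the function and parameter names are hypothetical, and it assumes the monitor aborts a query once the current time has passed the start time plus both timeouts.

```python
# Illustrative sketch only; all names here are hypothetical, not RM server APIs.

def should_abort(start_ms, now_ms, client_timeout_ms=30_000, rogue_timeout_ms=60_000):
    """Return True once the query has outlived client timeout + rogue timeout."""
    deadline_ms = start_ms + client_timeout_ms + rogue_timeout_ms
    return now_ms > deadline_ms


def runtime_bounds_s(client_timeout_ms=30_000, rogue_timeout_ms=60_000, interval_s=30):
    """Return (minimum, maximum) seconds a query runs before the monitor aborts it.

    The minimum adds 1 second, as in the text above; the maximum adds one
    full monitor run interval, because the query may exceed its allowed
    runtime just after a monitor pass.
    """
    allowed_s = (client_timeout_ms + rogue_timeout_ms) // 1000
    return allowed_s + 1, allowed_s + interval_s


if __name__ == "__main__":
    print(runtime_bounds_s())  # default settings
```

With the default settings this yields the 91-second and 120-second bounds stated above. Passing a negative rogue timeout shows how that value is deducted from the client timeout.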
The RM admin debug page for running SPARQL queries now has a checkbox to allow for long running queries.
Additional logging accompanies this functionality: informational logging is enabled by default, and advanced logging can be enabled with a log4j property.
Informational logging that is available in rm.log:
At server startup:
CRRRS8752I The RM query monitor task started. The run interval for the task is set to 30 seconds. The maximum query run time is set to 1 minute 30 seconds.
CRRRS8753I The RM query monitor task is disabled. The run interval for the task is set to 0 seconds.
When a rogue query is detected:
CRRRS8754W The RM query monitor detected and will cancel a query that started running at 8/23/18 3:50 PM and has been running for 1 minute 37 seconds. The query ID is 33f9b07d-d771-437a-854b-3a3437a2b0ed with thread 574.
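When reviewing rm.log, it can help to pull the query ID, thread, and runtime out of each CRRRS8754W message. The following Python sketch is one way to do that. It assumes the message text matches the sample above exactly; the function name and the returned keys are hypothetical, and real log lines may carry a timestamp or logger prefix, which is why the pattern is searched for anywhere in the line.

```python
import re

# Pattern modeled on the sample CRRRS8754W message shown above (an assumption;
# the wording could differ between versions or locales).
ROGUE_QUERY_RE = re.compile(
    r"CRRRS8754W .*? started running at (?P<started>.+?)"
    r" and has been running for (?P<duration>.+?)\."
    r" The query ID is (?P<query_id>[0-9a-fA-F-]+)"
    r" with thread (?P<thread>\d+)"
)


def parse_rogue_query_warning(line):
    """Return a dict of fields from a CRRRS8754W line, or None if no match."""
    match = ROGUE_QUERY_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["thread"] = int(fields["thread"])
    return fields


if __name__ == "__main__":
    sample = (
        "CRRRS8754W The RM query monitor detected and will cancel a query "
        "that started running at 8/23/18 3:50 PM and has been running for "
        "1 minute 37 seconds. The query ID is "
        "33f9b07d-d771-437a-854b-3a3437a2b0ed with thread 574."
    )
    print(parse_rogue_query_warning(sample))
```

Applying the function to each line of rm.log and keeping the non-None results gives a list of the queries the monitor cancelled.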
Debug logging can be invoked in order to fully understand what is occurring when operations and queries are timing out with:
If IBM Support asks you to run a specific tool to troubleshoot a situation, or to run a corrective procedure, you might need to adjust these settings. If nothing appears to be happening in the web UI after 2 minutes, check rm.log.
One example is ReqIF export: Jazz.net defect https://jazz.net/jazz03/resource/itemName/com.ibm.team.workitem.WorkItem/126574 (APAR PH05142) describes a user action impacted by this feature. The advice is to change the rogue query timeout to a value that allows your exports to complete.
IBM Support can advise whether to turn off this feature temporarily, or whether to adjust the timeout.
15 April 2022