Monitoring indexing activity statistics
By monitoring the indexing activity, you ensure that indexing and queries are completing successfully, and that defined memory thresholds are not exceeded. You can see the status of the insertion or deletion of the selects during indexing, reindexing, and validation in Lifecycle Query Engine.
Procedure
- On the Health Monitoring page, click Statistics in the menu, and go to the properties under Dataset Statistics.
- Active Transactions: The number of active readers and writers, providing a view of activities of the index: queries, base logs, change logs, and so on.
- Completed Transactions: The number of completed transactions, including read transactions, write transactions, and aborted transactions.
- Journal Writebacks: In Lifecycle Query Engine with Jena, you can view the journal writebacks. A journal is a Jena TDB write-ahead log. This field displays the number of transactions in the journal that are pending or have been written to the backing index. Monitor the number of pending and completed journal writebacks:
- Under normal conditions, the number of pending journal writebacks stays low because the writebacks move from a pending state to a completed state.
- When the number of pending journal writebacks starts to increase and does not decrease over time, there is potential for heap space issues if this condition continues unabated.
- Suspensions: The number of suspensions that are pending or completed, that timed out, or that showed errors because the JVM heap space reached a threshold. (The heap usage threshold is set in advanced properties.) This situation causes the suspension of incoming transaction activities. To address this problem, you can use the dataset suspension feature in Lifecycle Query Engine. A dataset suspension does the following tasks:
- Blocks all new read and write operations to the dataset
- Waits for existing read operations to complete
- Attempts to flush journal writebacks to the index
The following Lifecycle Query Engine advanced properties control dataset suspension:
- Heap Suspension Enabled
- Initiates dataset suspension when a heap threshold is exceeded. This property is disabled by default.
- Heap Usage Threshold
- The percentage of heap usage that triggers suspension. The default is 85%.
- Stack Suspension Enabled
- After a commit, initiates dataset suspension if the number of pending journal writes exceeds the maximum pending writebacks threshold. Starting in version 7.0.1, this property is enabled by default.
- Maximum Pending Writebacks
- The number of pending writebacks above which stack suspension takes place.
- Suspend Timeout
- The number of seconds to wait for read operations to complete before attempting a journal flush. This value must be greater than the total of query timeout and rogue query timeout. For example, if the default query timeout is 600 seconds and default rogue timeout is 180 seconds, this value must be greater than 780 seconds.
- Overloads
- The number of times an overload condition was encountered. An overload can occur in the following situations:
- When the JVM garbage collection process starts because the heap usage threshold was reached and, after garbage collection, the JVM heap still exceeds the heap usage threshold.
- When the maximum value of pending writebacks is exceeded and, after stack suspension, the maximum value of pending writebacks is still exceeded.
- Running or Completed Queries
- The number of queries that are running or completed. Watch the number of running queries over time to determine whether they might lock index writes because the queries do not finish or do not time out properly.
- To view currently running queries in Lifecycle Query Engine with Jena, go to Health Monitoring > Queries, and click Running Queries. All queries must end normally or by timing out; however, in Apache Jena, some queries might become rogue. For information about how to handle rogue queries, see Preventing Out-of-memory errors in Lifecycle Query Engine.
- To view completed queries in Lifecycle Query Engine with Jena, go to Health Monitoring > Queries, and click Completed Queries. You can review the details of the completed queries, such as status, user, dataset node, execution node, and start time. The Config scope column shows the version resources count, so you can use it to monitor the configuration scope of queries in Lifecycle Query Engine. Administrators can also review the global configurations (GC) count in the Config scope column. If a query has a large config scope value, the administrator can suggest that the user optimize the query and use a GC with a smaller scope.
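
Two of the checks described in this topic can be sketched in code: the Suspend Timeout rule (the value must be greater than the total of the query timeout and the rogue query timeout) and the warning sign for journal writebacks (a pending count that increases and does not decrease over time). The following Python sketch is illustrative only. The function names, the parameter names, and the minimum sample count are hypothetical, and the sample values are assumed to be read manually from the Statistics page. The sketch does not call any Lifecycle Query Engine API.

```python
from typing import Sequence


def suspend_timeout_is_valid(suspend_timeout_s: int,
                             query_timeout_s: int,
                             rogue_timeout_s: int) -> bool:
    """Check the Suspend Timeout rule from this topic.

    The suspend timeout must be strictly greater than the total of the
    query timeout and the rogue query timeout (all values in seconds).
    """
    return suspend_timeout_s > query_timeout_s + rogue_timeout_s


def pending_writebacks_growing(samples: Sequence[int],
                               min_samples: int = 3) -> bool:
    """Flag a pending-writeback count that rises and never falls.

    `samples` holds pending journal writeback counts, oldest first.
    `min_samples` is a hypothetical guard so that one or two readings
    do not raise a warning on their own.
    """
    if len(samples) < min_samples:
        return False
    # True when no reading is lower than the reading before it.
    never_decreases = all(later >= earlier
                          for earlier, later in zip(samples, samples[1:]))
    # Require a net increase so that a flat series is not flagged.
    return never_decreases and samples[-1] > samples[0]


if __name__ == "__main__":
    # Timeouts from the example in this topic: 600 s query, 180 s rogue.
    print(suspend_timeout_is_valid(780, 600, 180))
    print(suspend_timeout_is_valid(781, 600, 180))
    # Hypothetical readings taken a few minutes apart.
    print(pending_writebacks_growing([2, 5, 9, 14]))
    print(pending_writebacks_growing([4, 1, 0, 2, 1]))
```

With the example timeouts from this topic, a suspend timeout of 780 seconds fails the check because it is not greater than 600 + 180, while 781 seconds passes. The trend check is deliberately conservative: a count that fluctuates or returns to a low value is treated as normal behavior.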