Monitoring indexing activity statistics

By monitoring indexing activity, you can verify that indexing and queries complete successfully and that the defined memory thresholds are not exceeded.

Procedure

  • On the Health Monitoring page, click Statistics in the menu, and go to the properties under Dataset Statistics.
    • Active Transactions: The number of active readers and writers, providing a view of activities of the index: queries, base logs, change logs, and so on.
    • Completed Transactions: The number of completed transactions, including read transactions, write transactions, and aborted transactions.
    • Journal Writebacks: A journal is a Jena TDB write-ahead log. This field displays the number of transactions in the journal that are pending or are written to the backing index. You must monitor the number of pending and completed journal writebacks:
      • Under normal conditions, the number of pending journal writebacks stays low because writebacks move steadily from a pending state to a completed state.
      • If the number of pending journal writebacks starts to increase and does not decrease over time, heap space issues can occur if the condition persists.
    • Suspensions: The number of suspensions that are pending or completed, that timed out, or that showed errors because the JVM heap space reached a threshold. The heap usage threshold is set in the advanced properties. This situation causes the suspension of incoming transaction activities.
      To address this problem, you can use the dataset suspension feature in Lifecycle Query Engine (LQE). A dataset suspension does the following tasks:
      • Blocks all new read and write operations to the dataset
      • Waits for existing read operations to complete
      • Attempts to flush journal writebacks to the index
      The following LQE advanced properties control dataset suspension:
      • Heap Suspension Enabled: Initiates dataset suspension when a heap threshold is exceeded. This property is disabled by default.
      • Heap Usage Threshold: Percentage of heap that is used to trigger suspension. The default is 85%.
      • Stack Suspension Enabled: After a commit, initiates dataset suspension if the number of pending journal writes exceeds the maximum pending writebacks threshold. Starting in version 7.0.1, this property is enabled by default.
      • Maximum Pending Writebacks: The number of pending writebacks above which stack suspension is triggered.
      • Suspend Timeout: The number of seconds to wait for read operations to complete before attempting a journal flush. This value must be greater than the total of query timeout and rogue query timeout. For example, if the default query timeout is 600 seconds and default rogue timeout is 180 seconds, this value must be greater than 780 seconds.
    • Overloads: The number of times an overload condition was encountered. An overload can occur in the following situations:
      • When the JVM garbage collection process starts because the heap usage threshold was reached and, after garbage collection, the JVM heap still exceeds the heap usage threshold.
      • When the maximum value of pending writebacks is exceeded and, after stack suspension, the maximum value of pending writebacks is still exceeded.
      A journal writeback might have a stack of dataset views. If incoming requests don't pause, a backlog of journal writebacks, with a corresponding stack of datasets, can be queued and might lead to a stack overflow. The stack indicates the number of times a journal writeback stack overflow was prevented when the heap usage threshold was exceeded.
    • Running or Completed Queries: The number of queries that are running or completed. Watch the number of running queries over time to determine whether they might block index writes because queries don't finish or do not time out properly.
  • To view currently running queries, go to Health Monitoring > Queries, and click Running Queries. All queries must end normally or by timing out; however, in Apache Jena, some queries might become rogue.
    For information about how to handle rogue queries, see Preventing Out-of-memory errors in Lifecycle Query Engine.
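
The two numeric checks described in this procedure, the Suspend Timeout rule and the watch for pending journal writebacks that keep growing, can be sketched as small helper functions. This is only an illustration, not part of LQE: the function names, the whole-second assumption, and the five-sample window are hypothetical, and the sample values are assumed to be read manually from the Dataset Statistics properties at regular intervals.

```python
from typing import Sequence


def min_suspend_timeout(query_timeout_s: int, rogue_query_timeout_s: int) -> int:
    """Smallest whole-second Suspend Timeout that is greater than the
    total of the query timeout and the rogue query timeout."""
    return query_timeout_s + rogue_query_timeout_s + 1


def writebacks_keep_growing(pending_samples: Sequence[int], window: int = 5) -> bool:
    """Return True when the last `window` samples of pending journal
    writebacks never decrease and end higher than they started.

    `window` is a hypothetical choice; pick one that matches how often
    you read the statistics.
    """
    if window < 2 or len(pending_samples) < window:
        return False  # not enough data to judge a trend
    recent = list(pending_samples[-window:])
    never_decreases = all(later >= earlier for earlier, later in zip(recent, recent[1:]))
    return never_decreases and recent[-1] > recent[0]


if __name__ == "__main__":
    # Example timeouts from this topic: 600 s query timeout, 180 s rogue timeout.
    print(min_suspend_timeout(600, 180))
    # Hypothetical readings of pending journal writebacks over time.
    print(writebacks_keep_growing([3, 2, 4, 6, 9, 12, 15]))
```

With the example timeouts from this topic, the first helper returns 781, the smallest whole number of seconds that is greater than 780. The second helper only flags a sustained rise; a backlog that drains again between readings is treated as normal.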