IBM Support

Handling inconsistencies in Search query results

Troubleshooting


Problem

Summary

DSE Search uses a consistency level of LOCAL_ONE for search requests, so the same query hitting different nodes may return different results. This behavior is expected and can occur for the following reasons:

  • Failed mutations/data inconsistency
  • Some nodes may have committed/flushed while others have not. The flush interval may be the same on every node, but the moment at which each node actually flushes can differ.


Applies To

  • DSE Search - all versions



Solution

To troubleshoot inconsistencies in query results, use session stickiness, run subrange node repair, and follow best practices for soft commit points on different replica nodes.


DSE Search implements an efficient, highly available distributed search algorithm on top of the database. For each query, the algorithm tries to select the minimum number of replica nodes required to cover all token ranges while avoiding hot spots. Because the database is eventually consistent, some replica nodes might not yet have received or indexed the latest updates. Because different queries can select different replica nodes, DSE Search might return inconsistent results (different numFound counts) between queries. This behavior is intrinsic to how highly available distributed systems work, as described in the ACM article "Eventually Consistent" by Werner Vogels.

Most of the time, eventual consistency is not an issue. For cases where it is, DSE Search implements session stickiness, which guarantees that consecutive queries hit the same set of nodes on a healthy, stable cluster and therefore return monotonic results. Session stickiness works by adding a session seed to the request parameters as follows:

shard.shuffling.strategy=SEED
shard.shuffling.seed=session_id // any value that you specify

 

With this strategy, you are guaranteed to reuse the same shards, so query results should be identical throughout a session, because each query pulls data from the same set of nodes as the previous one.
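
As an illustration, the following minimal Python sketch builds an HTTP search URL that carries the two session stickiness parameters. Only the parameter names shard.shuffling.strategy and shard.shuffling.seed come from this document. The host, port, core name (ks.tbl), and /select path in the usage lines are assumptions about a typical Solr-style endpoint, and the function name is hypothetical. The sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode


def build_sticky_query_url(base_url, query, session_id, rows=10):
    """Return a search URL whose parameters pin shard selection to a session.

    base_url is assumed to point at the search handler of the target core.
    session_id can be any value, as long as the application sends the same
    value for every query that belongs to the same session.
    """
    params = {
        "q": query,
        "rows": rows,
        "wt": "json",
        # Session stickiness parameters described in this document.
        "shard.shuffling.strategy": "SEED",
        "shard.shuffling.seed": session_id,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and session identifier, for illustration only.
url = build_sticky_query_url(
    "http://localhost:8983/solr/ks.tbl/select", "*:*", "user-42"
)
```

Reusing the same seed value for every query in a session is what keeps shard selection stable. A different seed may select a different set of replica nodes.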


In unstable clusters with missed updates due to failures or network partitions, consistent results can be achieved by repairing nodes using the DSE OpsCenter Repair Service.


Finally, another minor source of inconsistencies is different soft commit points on different replica nodes. An item might already be indexed and committed on one node but not yet on its replicas. This situation is primarily a function of the load on each node. Implement the following best practices:

  • Evenly balance read/write load between nodes
  • Properly tune soft commit time and async indexing concurrency
  • Configure back pressure in the dse.yaml file


For information about multi-threaded asynchronous indexing that uses a back pressure mechanism, see "Configuring and tuning indexing performance".


DSE Search buffers insert requests from the database to maximize insert throughput, so that application insert requests can be acknowledged as quickly as possible. However, if the number of requests accumulated in the buffer exceeds a configurable limit, DSE Search pauses or blocks incoming requests until it catches up with the buffered requests. In extreme cases, that pause causes the application request to time out.
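
The buffering behavior described above can be pictured with a generic bounded queue. The following Python sketch is not DSE Search's implementation. It only illustrates the general back pressure pattern, with a hypothetical buffer size of 2 and a hypothetical 50 ms client timeout: a producer blocks when the buffer is full and gives up if no space frees up in time.

```python
import queue


def submit(buffer, item, timeout_s):
    """Try to enqueue an insert request.

    Blocks while the buffer is full and returns False if no slot frees up
    within timeout_s seconds, which models an application-side timeout.
    """
    try:
        buffer.put(item, timeout=timeout_s)
        return True
    except queue.Full:
        return False


# Hypothetical limit: at most 2 buffered requests.
buffer = queue.Queue(maxsize=2)

# No consumer is draining the buffer, so the third request times out.
results = [submit(buffer, i, 0.05) for i in range(3)]

# The indexer "catches up" by taking one request off the buffer,
# after which new requests are accepted again.
buffer.get()
results.append(submit(buffer, 3, 0.05))
```

In this sketch, the first two requests are accepted, the third is rejected after the timeout, and the fourth is accepted once a slot has been freed.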



Last Reviewed: 12/15/2023

Document Location

Worldwide


Historical Number

ka0Ui0000000Em9IAE

Document Information

Modified date:
30 January 2026

UID

ibm17258745