IBM Support

CQL Solr query returns 'null' for a non-indexed field

Troubleshooting


Problem

Summary

This article discusses a scenario when CQL Solr queries can return null values unexpectedly.


Applies to

  • DataStax Enterprise 5.1
  • DataStax Enterprise 6.0
  • DataStax Enterprise 6.7


Symptoms

When running a Solr CQL query, a non-indexed field returns 'null' even though a value exists for the field. Consider the following table:

CREATE TABLE customer.orders (
item_id UUID, 
item text, 
lastname text, 
firstname text, 
solr_query text, 
order_type text, 
PRIMARY KEY (item_id, lastname));
 

For this example, we'll create a search index (Solr core) for the table using the following custom schema.xml file:

$ cat customer_orders_schema.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schema name="autoSolrSchema" version="1.5">
<types>
<fieldType class="org.apache.solr.schema.TrieDateField" name="TrieDateField"/>
<fieldType class="org.apache.solr.schema.UUIDField" name="UUIDField"/>
<fieldType class="org.apache.solr.schema.StrField" name="StrField"/>
<fieldType class="org.apache.solr.schema.BoolField" name="BoolField"/>
<fieldType class="org.apache.solr.schema.TextField" name="TextField">
 <analyzer>
                <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>
</types>
<fields>
<field indexed="true" multiValued="false" docValues="true" name="item_id" stored="true" type="UUIDField"/>
<field indexed="true" multiValued="false" docValues="true" name="lastname" stored="true" type="StrField"/>
<field indexed="true" multiValued="false" docValues="true" name="firstname" stored="true" type="StrField"/>
</fields>
<uniqueKey>(item_id,lastname)</uniqueKey>
</schema>
 

Next, we'll use dsetool to create the Solr core:

$ dsetool create_core customer.orders reindex=true schema=./customer_orders_schema.xml solrconfig=./customer_orders_config.xml


Now we'll insert data into the table:

insert into customer.orders (item_id, item, lastname, firstname, order_type) values (a6f94417-e27b-444f-9d0c-dccb588e421f, 'grill', 'Smith', 'Joe', 'app');


Running the following CQL Solr query returns 'null' for the 'order_type' field instead of the inserted value of 'app':

cqlsh> SELECT item_id, lastname, firstname, order_type FROM customer.orders WHERE item_id = a6f94417-e27b-444f-9d0c-dccb588e421f and solr_query='{"q": "*:*","fq":"(firstname:Joe)"}';

item_id | lastname | firstname | order_type
--------------------------------------+----------+-----------+------------
 a6f94417-e27b-444f-9d0c-dccb588e421f | Smith | Joe | null

(1 rows)


Cause

Solr single-pass CQL queries and HTTP queries use the custom Lucene stored fields codec which use the Solr schema to discover which fields to return.

There are a number of ways to generate a single-pass query instead of the standard two-pass query, without realizing it is happening:

  • Specify the distrib.singlePass Boolean parameter in the query expression:
SELECT * FROM keyspace.table WHERE solr_query = '{"q" : "*:*", "distrib.singlePass" : true}' 
  • Use a token() or partition key restriction in the WHERE clause:
SELECT * FROM keyspace.table WHERE token(id) >= 3074457345618258604 AND solr_query = '{"q" : "*:*"}' 
  • Execute a COUNT query:
SELECT count(*) FROM keyspace.table WHERE solr_query = '{"q" : "*:*"}' 
  • Specify only the primary key elements in the SELECT clause, using the table from the example above, we could have:
SELECT item_id FROM customer.orders WHERE solr_query='{"q": "*:*","fq":"(item_id:a6f94417-e27b-444f-9d0c-dccb588e421f)"}';

Solution

If you end up with a Solr single-pass CQL query through any of the above methods, the custom Lucene stored fields codec will be used.


That is, they use our custom Lucene stored fields codec, rather than just reading entire CQL rows as non-single pass CQL Solr queries would because the codec only knows about fields, indexed or not, that exist in the schema.xml.


Using the example above, we could add the following line to the 'customer_orders_schema.xml' file:

<field indexed="false" multiValued="false" docValues="false" name="order_type" stored="true" type="StrField"/>


After changing the file, we would need to reload the core:

dsetool reload_core customer.orders reindex=false schema=./customer_orders_schema.xml solrconfig=./customer_orders_config.xml


You'll notice that the same query returning 'null' for the 'order_type' field above, will now return the stored value of 'app' correctly:

cqlsh> SELECT item_id, lastname, firstname, order_type FROM customer.orders WHERE item_id = a6f94417-e27b-444f-9d0c-dccb588e421f and solr_query='{"q": "*:*","fq":"(firstname:Joe)"}';
 
 item_id | lastname | firstname | order_type
--------------------------------------+----------+-----------+------------
 a6f94417-e27b-444f-9d0c-dccb588e421f | Smith | Joe | app

(1 rows)


We could also re-write the query to avoid generating a single-pass query to achieve the same results:

cqlsh> SELECT item_id, lastname, firstname, order_type FROM customer.orders WHERE solr_query='{"q": "*:*","fq":"(item_id:a6f94417-e27b-444f-9d0c-dccb588e421f)"}';

item_id | lastname | firstname | order_type
--------------------------------------+----------+-----------+------------
 a6f94417-e27b-444f-9d0c-dccb588e421f | Smith | Joe | app

(1 rows)

*Note that if you use the generateResources=true option when creating the Solr core, it will include all table fields in the schema.xml file by default, for example:

dsetool create_core keyspace.table generateResources=true
 

In DSE 5.1 and earlier, the stored="true" property is required in the schema.xml file for the field to be returnable in a query.


In DSE 6.0 and later, the field must be present but with indexed=false in the schema.xml.


DataStax documentation

JSON single-pass distributed query

Document Location

Worldwide

[{"Type":"MASTER","Line of Business":{"code":"LOB76","label":"Data Platform"},"Business Unit":{"code":"BU048","label":"IBM Software"},"Product":{"code":"SSCR56","label":"IBM DataStax Enterprise"},"ARM Category":[{"code":"","label":""}],"ARM Case Number":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Version(s)"}]

Historical Number

ka0Ui0000000QSLIA2

Document Information

Modified date:
30 January 2026

UID

ibm17258552