Troubleshooting

Accessing error messages

Informational and error messages display in different locations based on the type of information:

Flow configuration issues
The flow canvas provides guidance and error details as follows:
  • Issues found by implicit validation display in the Issues list.
  • An error icon displays at the stage where the problem occurs or on the canvas for flow configuration issues.
Error record information
You can use the Error Records flow properties to write error records and related details to another system for review. The information in the record header attributes can help you determine the problem that occurred. For more information, see Internal attributes.
For more information about error records and error record handling, see Error record handling.

Flow basics

Use the following tips for help with flow basics:

Why isn't the Run icon enabled?
You can start a flow when it is valid. Use the Issues icon to review the list of issues in your flow. When you resolve the issues, the Run icon becomes enabled.

General validation errors

Use the following tips for help with general flow validation errors:
The flow has the following set of validation errors for a stage:
CONTAINER_0901 - Could not find stage definition for <stage library name>:<stage name>.
CREATION_006 - Stage definition not found. Library <stage library name>. Stage <stage name>. 
Version <version>
VALIDATION_0006 - Stage definition does not exist, library <stage library name>, 
name <stage name>, version <version>
The flow uses a stage that is not installed on the Data Collector engine. This might happen if you imported a flow from a different version of the engine and the current engine is not enabled to use the stage.
If the Data Collector engine uses a different version of the stage, you might delete the invalid version and replace it with a local valid version. For example, if the flow from an older Data Collector version uses an older version of the MongoDB Atlas target, you might replace it with a version used by the current engine.
If you need to use a stage that is not installed on the engine, install the related stage library. For information about installing additional drivers, see Install external libraries.

Sources

Use the following tips for help with source stages and systems.

Directory

Why isn't the Directory source reading all of my files?
The Directory source reads a set of files based on the configured file name pattern, read order, and first file to process. If new files arrive after the Directory source has passed their position in the read order, the Directory source does not read the files unless you reset the source.
When using the last-modified timestamp read order, arriving files should have timestamps that are later than the files in the directory.
Similarly, when using the lexicographically ascending file name read order, make sure that the naming convention for the files is lexicographically ascending. For example, filename-1.log, filename-2.log, and so on works until filename-10.log arrives. If filename-10.log arrives after the Directory source completes reading filename-2.log, the Directory source does not read filename-10.log because it is lexicographically earlier than filename-2.log.
For more information, see Read order.
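The ordering problem can be reproduced with a short Python sketch. It is an illustration only: the file names come from the example above, and zero-padding the counter (the hypothetical padded_name helper) is one possible naming convention, not a requirement of the Directory source.

```python
# Plain string sorting compares character by character, so "10" sorts before "2".
names = ["filename-1.log", "filename-2.log", "filename-10.log"]
print(sorted(names))


def padded_name(index, width=4):
    """Return a file name with a zero-padded counter (hypothetical naming helper)."""
    return "filename-{:0{}d}.log".format(index, width)


# With a fixed-width counter, lexicographic order matches numeric order.
padded = [padded_name(i) for i in (1, 2, 10)]
print(sorted(padded))
```

In the first list, filename-10.log sorts between filename-1.log and filename-2.log, which is why the source skips it once it has passed filename-2.log.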

Elasticsearch

A flow with an Elasticsearch source fails to start with an SSL/TLS error, such as the following:
ELASTICSEARCH_43 - Could not connect to the server(s) <SSL/TLS error details>
This message can display due to many different SSL/TLS issues. Review the details of related messages to determine the corrective measures to take.
Here are some examples of when a version of the message might display:
  • If you configure the stage to use SSL/TLS, but do not specify an HTTPS-enabled port.

    To resolve the issue, specify an HTTPS-enabled port.

  • If you configure the stage to use SSL/TLS and specify an HTTPS-enabled port, but configure the URL incorrectly, such as http://<host>:<port>.

    To resolve this issue, update the URL to use HTTPS, as follows: https://<host>:<port>.

  • If you configure the stage to use SSL/TLS, but do not have a certificate in the specified truststore.

    To resolve the issue, place a valid certificate in the truststore.
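As a rough pre-flight check for the second case, the URL scheme can be inspected before starting the flow. The sketch below is not part of the product; check_tls_url is a hypothetical helper, and it cannot tell whether a port is actually HTTPS-enabled or whether the truststore holds a valid certificate.

```python
from urllib.parse import urlparse


def check_tls_url(url, tls_enabled):
    """Return a list of problems found in the URL scheme (hypothetical helper)."""
    problems = []
    scheme = urlparse(url).scheme
    # When SSL/TLS is enabled, the URL is expected to use the https scheme.
    if tls_enabled and scheme != "https":
        problems.append("SSL/TLS is enabled but the URL scheme is '%s'" % scheme)
    return problems


print(check_tls_url("http://es.example.com:9200", tls_enabled=True))
print(check_tls_url("https://es.example.com:9200", tls_enabled=True))
```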

JDBC sources

My MySQL JDBC Driver 5.0 fails to validate the query in my JDBC Query Consumer source.
This can occur when you use a LIMIT clause in your query.
Workaround: Upgrade to version 5.1.
I'm using a JDBC source to read MySQL data. Why are datetime values set to zero being treated like error records?
MySQL treats invalid dates as an exception, so both the JDBC Query Consumer and the JDBC Multitable Consumer create error records for invalid dates.
You can override this behavior by setting a JDBC configuration property in the source. Add the zeroDateTimeBehavior property and set the value to "convertToNull".
For more information about this and other MySQL-specific JDBC configuration properties, see http://dev.mysql.com/doc/connector-j/en/connector-j-reference-configuration-properties.html.
A flow using the JDBC Query Consumer source keeps stopping with the following error:
JDBC_77 <db error message> attempting to execute query '<query>'. Giving up 
after <error count> errors as per stage configuration. First error: <first db error>.
This occurs when the source cannot successfully execute a query. To handle transient connection or network errors, try increasing the value for the Number of Retries Upon Query Error property on the JDBC tab of the source.
My flow using a JDBC source generates an out-of-memory error when reading a large table.
When the Auto Commit property is enabled in a JDBC source, some drivers ignore the fetch-size restriction, configured by the Max Batch Size property in the source. This can lead to an out-of-memory error when reading a large table that cannot entirely fit in memory.
To resolve the issue, disable the Auto Commit property on the Advanced tab of the source.

Oracle CDC Client

Data preview continually times out for my Oracle CDC Client flow.
Flows that use the Oracle CDC Client can take longer than expected to initiate for data preview. If preview times out, try increasing the Preview Timeout property incrementally.

For more information about using preview with this source, see Data preview with Oracle CDC Client.

My Oracle CDC Client flow has paused processing during a daylight saving time change.
If the source is configured to use a database time zone that uses daylight saving time, then the flow pauses processing during the time change window to ensure that all data is correctly processed. After the time change completes, the flow resumes processing at the last-saved offset.
For more information, see Database time zone.

PostgreSQL CDC Client

A PostgreSQL CDC Client flow generates the following error:
com.streamsets.pipeline.api.StageException: JDBC_606 - Wal Sender is not active
This can occur when the Status Interval property configured for the source is larger than the wal_sender_timeout property in the PostgreSQL postgresql.conf configuration file.
The Status Interval property should be less than the wal_sender_timeout property. Ideally, it should be set to half of the value of the wal_sender_timeout property.
For example, you can use the default status interval of 30 seconds with the default wal_sender_timeout value of 60000 milliseconds, or 1 minute.
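The relationship between the two settings can be checked with a few lines of Python. This is a sketch only: check_status_interval is a hypothetical helper, and it assumes the Status Interval is expressed in seconds and wal_sender_timeout in milliseconds, as in the example above.

```python
def check_status_interval(status_interval_s, wal_sender_timeout_ms):
    """Compare Status Interval (seconds) with wal_sender_timeout (milliseconds)."""
    timeout_s = wal_sender_timeout_ms / 1000.0
    return {
        # The status interval should be less than the timeout.
        "valid": status_interval_s < timeout_s,
        # Ideally, the status interval is half of the timeout.
        "recommended_s": timeout_s / 2.0,
    }


result = check_status_interval(30, 60000)
print(result)
```

With the defaults, the check passes and the recommended value equals the default status interval of 30 seconds.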

Salesforce

A flow generates a buffering capacity error
When flows with a Salesforce source fail due to a buffering capacity error, such as Buffering capacity 1048576 exceeded, increase the buffer size by editing the Streaming Buffer Size property on the Subscribe tab.

Scripting sources

A flow fails to stop when manually stopped
Scripts must include code that stops the script when users stop the flow. In the script, use the sdc.isStopped method to check whether the flow has been stopped.
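The following Python sketch shows the general shape of such a loop. The FakeSdc class is a stand-in written for this example so that the code runs outside the engine; in a real script, the engine provides the sdc object, and only the sdc.isStopped call is taken from the text above.

```python
class FakeSdc(object):
    """Stand-in for the engine-provided sdc object; reports a stop after N checks."""

    def __init__(self, stop_after):
        self._checks_left = stop_after

    def isStopped(self):
        self._checks_left -= 1
        return self._checks_left < 0


def run(sdc):
    batches = 0
    # Check the stop flag on every iteration so that a manual stop ends the loop.
    while not sdc.isStopped():
        batches += 1  # a real script would create and process a batch here
    return batches


print(run(FakeSdc(stop_after=3)))
```

A loop that never checks the stop flag, such as while True without a break, keeps running after the flow is stopped.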
A Jython script does not proceed beyond import lock
Flows freeze if Jython scripts do not release the import lock upon a failure or error. When a script does not release an import lock, you must restart Data Collector to release the lock. To avoid the problem, use a try statement with a finally block in the Jython script. For more information, see Thread safety in Jython scripts.
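The try and finally pattern can be sketched as follows. This is a generic Python illustration, not the engine's API: the threading.Lock object and the guarded_import helper are stand-ins for the import lock, so refer to Thread safety in Jython scripts for the actual calls to use.

```python
import threading

import_lock = threading.Lock()  # stand-in for the engine's import lock


def guarded_import(module_name):
    """Import a module while holding the lock; release it even if the import fails."""
    import_lock.acquire()
    try:
        return __import__(module_name)
    finally:
        # The finally block runs on success and on failure, so the lock never leaks.
        import_lock.release()


try:
    guarded_import("module_that_does_not_exist")
except ImportError:
    pass

# The lock is free again even though the import raised an error.
print(import_lock.locked())
```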

SQL Server CDC Client

Previewing data does not show any values
When you set the Maximum Transaction Length property, the source fetches data in multiple time windows. The property determines the size of each time window. Previewing data only shows data from the first time window, but the source might need to process multiple time windows before finding changed values to show in the preview.

To see values when previewing data, increase the Maximum Transaction Length property or set it to -1 to fetch data in one time window.

A no-more-data event is generated before reading all changes
When you set the Maximum Transaction Length property, the source fetches data in multiple time windows. The property determines the size of each time window. After processing all available rows in each time window, the source generates a no-more-data event, even when subsequent time windows remain for processing.

Processors

Use the following tip for help with processors.

Encrypt and Decrypt Fields

The following error message displays in the log after I start the flow:
CONTAINER_0701 - Stage 'EncryptandDecryptFields_01' initialization error: java.lang.IllegalArgumentException: Input byte array has incorrect ending byte at 44
When the processor uses a user-supplied key, the length of the Base64 encoded key that you provide must match the length of the key expected by the selected cipher suite. For example, if the processor uses a 256-bit (32-byte) cipher suite, the Base64 encoded key must be 32 bytes in length.
You can receive this message when the length of the Base64 encoded key is not the expected length.
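One way to check a key before configuring the processor is to decode it and measure the result. The sketch below assumes that the expected length applies to the decoded key, so a 32-byte key appears as a 44-character Base64 string. The check_key helper and the sample key are illustrative only and are not part of the product.

```python
import base64
import binascii


def check_key(encoded_key, expected_bytes=32):
    """Return (ok, message) for a Base64 encoded key; 32 bytes is a 256-bit key."""
    try:
        raw = base64.b64decode(encoded_key, validate=True)
    except (binascii.Error, ValueError) as exc:
        return False, "not valid Base64: %s" % exc
    if len(raw) != expected_bytes:
        return False, "decoded key is %d bytes, expected %d" % (len(raw), expected_bytes)
    return True, "ok"


# Illustrative key only; never use a predictable key like this in a real flow.
sample = base64.b64encode(bytes(range(32))).decode("ascii")
print(len(sample), check_key(sample))
```

A key that was truncated or padded incorrectly when it was copied fails the decode step, which is similar to the failure reported in the log message above.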

Targets

Use the following tips for help with target stages and systems.

Azure Data Lake Storage

An Azure Data Lake Storage target seems to be causing out of memory errors, with the following object using all available memory:
com.streamsets.pipeline.stage.destination.hdfs.writer.ActiveRecordWriters
This can occur due to a known Hadoop issue, which can affect the Azure Data Lake Storage Gen2 target.
For a description of a workaround, see the documentation for the Gen2 target.

Cassandra

Why is the flow failing entire batches when only a few records have a problem?
Due to Cassandra requirements, when you write to a Cassandra cluster, batches are atomic. This means that an error in one or more records causes the entire batch to fail.
Why is all of my data being sent to error? Every batch is failing.
When every batch fails, you might have a data type mismatch. Cassandra requires the data type of the data to exactly match the data type of the Cassandra column.
To determine the issue, check the error messages associated with the error records. If you see a message like the following, you have a data type mismatch. The following error message indicates that the data type mismatch is caused by Integer data being written to a Varchar column:
CASSANDRA_06 - Could not prepare record 'sdk:': 
Invalid type for value 0 of CQL type varchar, expecting class java.lang.String but class java.lang.Integer provided
To correct the problem, you might use a Field Type Converter processor to convert field data types. In this case, you would convert the integer data to string.

Elasticsearch target

A flow with an Elasticsearch target fails to start with an SSL/TLS error, such as the following:
ELASTICSEARCH_43 - Could not connect to the server(s) <SSL/TLS error details>
This message can display due to many different SSL/TLS issues. Review the details of related messages to determine the corrective measures to take.
Here are some examples of when a version of the message might display:
  • If you configure the stage to use SSL/TLS, but do not specify an HTTPS-enabled port.

    To resolve the issue, specify an HTTPS-enabled port.

  • If you configure the stage to use SSL/TLS and specify an HTTPS-enabled port, but configure the URL incorrectly, such as http://<host>:<port>.

    To resolve this issue, update the URL to use HTTPS, as follows: https://<host>:<port>.

  • If you configure the stage to use SSL/TLS, but do not have a certificate in the specified truststore.

    To resolve the issue, place a valid certificate in the truststore.

Kafka Producer

Can the Kafka Producer create topics?
The Kafka Producer can create a topic when all of the following are true:
  • You configure the Kafka Producer to write to a topic name that does not exist.
  • At least one of the Kafka brokers defined for the Kafka Producer has the auto.create.topics.enable property enabled.
  • The broker with the enabled property is up and available when the Kafka Producer looks for the topic.
A flow that writes to Kafka keeps failing and restarting in an endless cycle.
This can happen when the flow tries to write a message to Kafka 0.8 that is longer than the Kafka maximum message size.
Workaround: Reconfigure Kafka brokers to allow larger messages or ensure that incoming records are within the configured limit.

JDBC connections

Use the following tips for help with stages that use JDBC connections to connect to databases. For some stages, Data Collector includes the necessary JDBC driver to connect to the database. For other stages, you must install a JDBC driver.

The following stages require you to install a JDBC driver:
  • JDBC Multitable Consumer source
  • JDBC Query Consumer source
  • MySQL Binary Log source
  • Oracle Bulkload source
  • Oracle CDC source
  • Oracle CDC Client source
  • Oracle Multitable Consumer source
  • Oracle target
  • SAP HANA Query Consumer source
  • JDBC Lookup processor
  • JDBC Tee processor
  • SQL Parser processor, when using the database to resolve the schema
  • JDBC Producer target
  • JDBC Query executor

No suitable driver

When Data Collector cannot find the JDBC driver for a stage, Data Collector might generate one of the following error messages:
JDBC_00 - Cannot connect to specified database: com.streamsets.pipeline.api.StageException:
JDBC_06 - Failed to initialize connection pool: java.sql.SQLException: No suitable driver

Verify that you have followed the instructions to install additional drivers, as explained in Install external libraries.

You can also use these additional tips to help resolve the issue:

The JDBC connection string is not correct.
The JDBC Connection String property for the stage must include the jdbc: prefix. For example, a PostgreSQL connection string might be jdbc:postgresql://<database host>/<database name>.
Check your database documentation for the required connection string format. For example, if you are using a non-standard port, you must specify it in the connection string.
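As an illustration, the following Python sketch builds a PostgreSQL connection string with an optional port and checks for the jdbc: prefix. The helper names, host, database name, and port are made up for the example; the format for other databases differs, so check the database documentation.

```python
def postgresql_jdbc_url(host, database, port=None):
    """Build a PostgreSQL JDBC connection string; add the port only when given."""
    authority = host if port is None else "%s:%d" % (host, port)
    return "jdbc:postgresql://%s/%s" % (authority, database)


def has_jdbc_prefix(connection_string):
    """Return True if the connection string starts with the required jdbc: prefix."""
    return connection_string.startswith("jdbc:")


url = postgresql_jdbc_url("db.example.com", "sales", port=5433)
print(url, has_jdbc_prefix(url))
```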
The external resource archive file containing the JDBC driver is not set up correctly.
When you include the JDBC driver in an external resource archive file, the archive file must use the required folder names and directory structure. For details about the required archive file structure, see Archive structure.
JDBC drivers do not load or register correctly.
Sometimes JDBC drivers that a flow requires do not load or register correctly. For example, a JDBC driver might not correctly support JDBC 4.0 auto-loading, resulting in a No suitable driver error message.
Two approaches can resolve this issue:
  • Add the class name for the driver in the JDBC Class Driver Name property on the Legacy Drivers tab for the stage.
  • Configure Data Collector to automatically load specific drivers. In the Data Collector configuration properties, uncomment the stage.conf_com.streamsets.pipeline.stage.jdbc.drivers.load property and set it to a comma-separated list of the JDBC drivers required by stages in your flows.

Cannot connect to database

When Data Collector cannot connect to the database, an error message like the following displays. The exact message can vary depending on the driver:

JDBC_00 - Cannot connect to specified database: com.zaxxer.hikari.pool.PoolInitializationException:
Exception during pool initialization: The TCP/IP connection to the host 1.2.3.4, port 1234 has failed
In this case, verify that the Data Collector machine can access the database machine on the relevant port. You can use tools such as ping and netcat (nc) for this purpose. For example, to verify that the host 1.2.3.4 is accessible:
$ ping 1.2.3.4 
PING 1.2.3.4 (1.2.3.4): 56 data bytes 
64 bytes from 1.2.3.4: icmp_seq=0 ttl=57 time=12.063 ms 
64 bytes from 1.2.3.4: icmp_seq=1 ttl=57 time=11.356 ms 
64 bytes from 1.2.3.4: icmp_seq=2 ttl=57 time=11.626 ms 
^C
--- 1.2.3.4 ping statistics --- 
3 packets transmitted, 3 packets received, 0.0% packet loss 
round-trip min/avg/max/stddev = 11.356/11.682/12.063/0.291 ms
Then to verify that port 1234 can be reached:
$ nc -v -z -w2 1.2.3.4 1234 
nc: connectx to 1.2.3.4 port 1234 (tcp) failed: Connection refused

If the host or port is not accessible, check the routing and firewall configuration.
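If nc is not available on the Data Collector machine, a similar TCP check can be scripted. The following Python sketch uses only the standard library; port_is_reachable is a hypothetical helper, and the two-second timeout mirrors the -w2 option used above.

```python
import socket


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, port_is_reachable("1.2.3.4", 1234) returns False when the connection is refused, as in the nc output above. Like nc, this check verifies only that the port accepts TCP connections, not that the database accepts the configured credentials.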

MySQL JDBC driver and time values

Due to a MySQL JDBC driver issue, the driver cannot return time values to the millisecond. Instead, the driver returns the values to the second.

For example, if a column has a value of 20:12:50.581, the driver reads the value as 20:12:50.000.

Performance

Use the following tips for help with performance:
How can I decrease the delay between reads from the source system?
A long delay can occur between reads from the source system when a flow reads records faster than it can process them or write them to the target system. Because a flow processes one batch at a time, the flow must wait until a batch is committed to the target system before reading the next batch, preventing the flow from reading at a steady rate. Reading data at a steady rate provides better performance than reading sporadically.
If you cannot increase the throughput for the processors or target, limit the rate at which the flow reads records from the source system. Configure the Rate Limit property for the flow to define the maximum number of records that the flow can read in a second.
When I try to start one or more flows, I receive an error that not enough threads are available
By default, Data Collector can run approximately 22 standalone flows at the same time. If you run a larger number of standalone flows at the same time, you might receive the following error:
CONTAINER_0166 - Cannot start flow '<flow name>' as there are not enough threads available
To resolve this error, increase the value of the runner.thread.pool.size property in the Data Collector configuration properties.
How can I improve the general flow performance?
You might improve performance by adjusting the batch size used by the flow. The batch size determines how much data passes through the flow at one time. By default, the batch size is 1000 records.
You might adjust the batch size based on the size of the records or the speed of their arrival. For example, if your records are very large, you might reduce the batch size to increase the processing speed. Or if the records are small and arrive quickly, you might increase the batch size.
Experiment with the batch size and review the results.
To change the batch size, configure the production.maxBatchSize property in the Data Collector configuration properties.