Known issues for Analytics Engine powered by Apache Spark

The following limitations and known issues apply to Analytics Engine powered by Apache Spark.

You reach the data limit in the Spark database

Applies to: 4.6.0 and later

Historical data is not removed from the Spark database, so the database eventually reaches its size limit. As a result, you might experience backup and restore failures, out-of-memory (OOM) issues in the metastore database, or disruptions caused by degraded database performance.

Workaround:

The following commands exec into the zen-metastoredb pod, open a SQL session against the Spark database, and remove historical data that is older than 60 days:

oc rsh zen-metastoredb-0
cp -r /certs/ /tmp/ && cd /tmp && chmod 0600 ./certs/* && cd /cockroach
./cockroach sql --certs-dir=/tmp/certs/ --host=zen-metastoredb-0.zen-metastoredb
SET database = spark;
DELETE FROM job WHERE creation_date < NOW() - INTERVAL '60 day' AND state IN ('DELETED', 'FAILED', 'DELETE_FAILED');
DELETE FROM deployment WHERE creation_date < NOW() - INTERVAL '60 day';
DELETE FROM deploy_request WHERE creation_date < NOW() - INTERVAL '60 day';

Timeout message when submitting a Spark job

When you submit a Spark application, the Spark service expects you to create a SparkContext or SparkSession at the beginning of your application code. When you submit the Spark job through the REST API, the API returns the Spark application ID as soon as the SparkContext is created.

However, if you don't create a SparkContext or SparkSession:

  • At the beginning of the Spark application
  • At all in the Spark application, for example because the application is plain Python, Scala, or R code

then the REST API waits for your application to complete, which can lead to a REST API timeout. The reason is that the Spark service expects the Spark application to have started, which is not the case if you are running a plain Python, Scala, or R application. The application is still listed in the Jobs UI even though the REST API timed out.

Applies to: Spark applications using V2 or V3 APIs only.
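
To avoid the timeout, create the SparkSession at the start of your application code. The following minimal PySpark sketch illustrates the idea; the application name and the sample workload are placeholders, not part of the service API:

from pyspark.sql import SparkSession

# Create the SparkSession first so that the Spark service detects that the
# application has started and the REST API can return the application ID.
spark = SparkSession.builder.appName("example-app").getOrCreate()

# Placeholder workload: the rest of the application logic runs after the
# session exists.
df = spark.range(100)
print(df.count())

spark.stop()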