Troubleshooting
Problem
IBM Spectrum Conductor reports a FileNotFoundException. The error occurs only for one Spark instance group (SIG), jh-spark211, and one shuffle directory, /data10/spark/jh-spark211/local_dir/.
2019-05-28 17:32:58.724 com.spark.rules.DefaultRuleRunner.runRules(DefaultRuleRunner.java:34)
TaskFailException----------------------------------------
com.spark.rules.common.TaskFailException: org.apache.spark.SparkException: Job aborted.
    at com.spark.rules.executor.AbstractExecutor.process(AbstractExecutor.java:60)
    at com.spark.rules.executor.ExecutorGroup.processRule(ExecutorGroup.java:87)
    at com.spark.rules.executor.ExecutorGroup.processRules(ExecutorGroup.java:74)
    .................
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:520)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:215)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:198)
    at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:494)
    at com.spark.rules.executor.output.Spark2Hdfs.doProcess(Spark2Hdfs.java:35)
    at com.spark.rules.executor.AbstractExecutor.process(AbstractExecutor.java:50)
    ... 15 more
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 11.0 failed 4 times, most recent failure: Lost task 10.3 in stage 11.0 (TID 1320, dsszbyz-etl-node57, executor 74-08cf9e79-87e4-48f7-bb33-808b66b881a8): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:204)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$3.apply(FileFormatWriter.scala:129)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$3.apply(FileFormatWriter.scala:128)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:396)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.lang.Thread.run(Thread.java:795)
Caused by: org.apache.spark.shuffle.FetchFailedException: java.lang.RuntimeException: Failed to open file: /data10/spark/jh-spark211/local_dir/d2f5f7e6-adcb-4c7d-ae56-f0bb20a957db_hive_hive/blockmgr-6976f3e6-7539-4b60-8ebc-bf3efe46b8f7/25/shuffle_13_168_0.index
    at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getSortBasedShuffleBlockData(ExternalShuffleBlockResolver.java:262)
    at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getBlockData(ExternalShuffleBlockResolver.java:174)
    at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.handleMessage(ExternalShuffleBlockHandler.java:99)
    at org.apache.spark.deploy.ego.EGOExternalShuffleBlockHandler.handleMessage(EGOShuffleService.scala:127)
    ..............
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
    at java.lang.Thread.run(Thread.java:795)
Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: /data10/spark/jh-spark211/local_dir/d2f5f7e6-adcb-4c7d-ae56-f0bb20a957db_hive_hive/blockmgr-6976f3e6-7539-4b60-8ebc-bf3efe46b8f7/25/shuffle_13_168_0.index (Permission denied)
    ..............................
Caused by: java.io.FileNotFoundException: /data10/spark/jh-spark211/local_dir/d2f5f7e6-adcb-4c7d-ae56-f0bb20a957db_hive_hive/blockmgr-6976f3e6-7539-4b60-8ebc-bf3efe46b8f7/25/shuffle_13_168_0.index (Permission denied)
    ..............
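The resolution itself is in the full article, but the innermost cause, a "Permission denied" on one shuffle index file under a single local_dir disk, points at that directory's ownership, mode, or filesystem state. As a sketch only, the following hypothetical helper (check_shuffle_dir is not a Conductor command; it is an assumption for illustration) gathers the facts you would compare against the healthy dataN disks:

```shell
# check_shuffle_dir prints ownership/mode and filesystem usage for a
# shuffle directory, or a "missing" marker if the path does not exist.
check_shuffle_dir() {
    dir="$1"
    if [ -e "$dir" ]; then
        # Owner, group, and permission bits of the directory itself.
        ls -ld "$dir"
        # Whether the backing filesystem is full (a common cause of
        # shuffle-file trouble on a single disk).
        df -h "$dir"
    else
        echo "missing: $dir"
    fi
}

# Path taken from the stack trace. On the affected host, compare this
# output with the same directory under the other /dataN disks; a
# differing owner or mode would explain the "Permission denied".
check_shuffle_dir /data10/spark/jh-spark211/local_dir
```

Run this as the OS execution user of the SIG, since the external shuffle service reads the index file with that user's credentials.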
Document Location
Worldwide
Business Unit: IBM Software
Product: IBM Spectrum Conductor with Spark
Platform: Linux
Version: All Versions
Line of Business: Automation Platform
Document Information
Modified date:
06 September 2019
UID
ibm10964712