IBM Support

Spark task lost and failed due to timeout

Troubleshooting


Problem

A Spark task is lost and fails because of a timeout.

Symptom

The Spark job fails with a task timeout. The Spark driver log contains the following messages:

19/10/31 18:31:53 INFO TaskSetManager: Starting task 823.0 in stage 2.0 (TID 1116, <hostname>, executor 3-46246ed5-2297-4a85-a088-e133fa202c6b, partition 823, PROCESS_LOCAL, 8509 bytes)

19/10/31 18:32:07 INFO TaskSetManager: [task] [failed] taskName:823.0 taskId:1116 stageId:2.0 executorId:3-46246ed5-2297-4a85-a088-e133fa202c6b

19/10/31 18:32:07 WARN TaskSetManager: Lost task 823.0 in stage 2.0 (TID 1116, <hostname>, executor 3-46246ed5-2297-4a85-a088-e133fa202c6b): ExecutorLostFailure (executor 3-46246ed5-2297-4a85-a088-e133fa202c6b exited caused by one of the running tasks) Reason: remote Rpc client disassociated
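To check whether a driver log shows the same symptom, the lost-task lines can be pulled out with a small script. The following is a minimal sketch in Python, not part of the original article: the function name `find_lost_tasks` is hypothetical, and the regular expression is inferred only from the single WARN line shown above, so it may need adjusting for other Spark versions or log layouts.

```python
import re

# Pattern modeled on the WARN line in the Symptom section. The field layout
# is an assumption based on that one sample line.
LOST_TASK = re.compile(
    r"WARN TaskSetManager: Lost task (?P<task>\S+) in stage (?P<stage>\S+) "
    r"\(TID (?P<tid>\d+), (?P<host>[^,]+), executor (?P<executor>[^)]+)\): "
    r"ExecutorLostFailure .*Reason: (?P<reason>.*)$"
)


def find_lost_tasks(lines):
    """Return one dict per 'Lost task ... ExecutorLostFailure' log line."""
    hits = []
    for line in lines:
        match = LOST_TASK.search(line)
        if match:
            hits.append(match.groupdict())
    return hits


def summarize(path):
    """Print task, stage, executor and reason for each lost task in a log file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for hit in find_lost_tasks(handle):
            print(
                f"task {hit['task']} (TID {hit['tid']}) in stage {hit['stage']} "
                f"lost on executor {hit['executor']}: {hit['reason']}"
            )
```

Calling `summarize` with the path to a saved driver log lists each lost task together with its executor ID and the reported reason, which makes it easier to see whether the failures cluster on one executor or host.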

Document Location

Worldwide

Product: IBM Spectrum Conductor
Platform: Linux
Version: All Versions
Business Unit: IBM Software
Line of Business: Automation Platform

Document Information

Modified date:
24 December 2019

UID

ibm11163848