IBM Support

How do you find the executor logs for Spark performance issues in the Data Science Experience Notebook?

Question & Answer


Question

How do you find the executor logs for Spark performance issues in the Data Science Experience Notebook?

Answer

Short Description

How to check Spark performance for a Data Science Experience notebook

What's Happening

You are running a Spark job and it hangs: you do not get any results back, or the cell stays in the running state.

Why it's Happening

There are several possible reasons why a Spark notebook hangs.

  • Cause: The Spark service that your notebook is connected to is having issues.
    You cannot run any notebook, or even a simple job such as sc.parallelize([1,2,3,4]).count() (see the connectivity sanity-check sketch after this list).

  • Cause: The code you are running is taking too long in certain stages of the Spark job. You need to check the executor logs to figure out what is wrong. To check the executor logs, you need the application ID and access to the Spark History user interface. To find the application ID, run the following command in your Python notebook:

    sc.applicationId

    After you find the application ID, complete these steps:
    1. Run your cells.

    2. Go to the Spark History user interface and show the incomplete applications.

    3. Locate the application ID that you found above and open it.

    4. Go to the Executors tab to see the list of executors.

    You can also check the Stages tab to see which stage and task is stuck, or monitor progress from within the notebook (see the progress-monitoring sketch after this list). You can also check the executor logs in the notebook; see the Performance_HowTofindSparkHistory topic for information on how to check the shared notebook.
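A minimal sketch of the connectivity sanity check mentioned in the first cause, assuming the SparkContext is preconfigured as sc in the Data Science Experience notebook (the SparkSession fallback is an assumption for running the same check elsewhere). If even this trivial job hangs or fails, the Spark service itself, rather than your code, is the likely cause.

    # Connectivity sanity check for the Spark service.
    try:
        sc  # usually preconfigured in a Data Science Experience notebook
    except NameError:
        from pyspark.sql import SparkSession
        sc = SparkSession.builder.getOrCreate().sparkContext

    # A trivial job: if this hangs or fails, suspect the Spark service.
    count = sc.parallelize([1, 2, 3, 4]).count()
    print("Sanity check count:", count)  # expected output: 4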
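As a programmatic alternative to watching the Stages tab, the following sketch prints the application ID that you look up in the Spark History user interface and then polls the PySpark status tracker from a background thread while a slow cell runs, so you can see which stage is stuck. The slow action shown in the comment is a hypothetical placeholder.

    import threading
    import time

    # The application ID to locate in the Spark History user interface.
    print("Application ID:", sc.applicationId)

    def report_progress(stop_event, interval_seconds=10):
        """Periodically print the active stages and their task progress."""
        tracker = sc.statusTracker()
        while not stop_event.is_set():
            for stage_id in tracker.getActiveStageIds():
                info = tracker.getStageInfo(stage_id)
                if info is not None:
                    print("Stage %d (%s): %d/%d tasks completed, %d active"
                          % (info.stageId, info.name, info.numCompletedTasks,
                             info.numTasks, info.numActiveTasks))
            time.sleep(interval_seconds)

    stop = threading.Event()
    threading.Thread(target=report_progress, args=(stop,), daemon=True).start()

    # Run the slow transformation or action here, for example:
    # result = some_rdd.map(expensive_function).collect()  # hypothetical
    stop.set()  # stop the reporter when you are done investigating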

[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSCLA9","label":"IBM Watson Studio Cloud"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
01 August 2019

UID

ibm1KB0010489