Temporal event training fails when run outside the pod

The temporal event training command fails when it is run from outside the pod.


When you run the temporal event training command from outside the pod, the command fails with Spark error messages. This issue is observed on smaller deployments, such as trial-size clusters.


This issue occurs because Spark does not perform adequately when the deployment has only one pod for the controller node or the worker node.


To resolve this issue, run the training command from within the pod by completing the following steps:
  1. Delete any existing trainer pod and create a new one by running the command:
    kubectl delete pod trainer; kubectl run trainer -it --command=true --restart=Never --env=LICENSE=accept --env=RUNNING_IN_CONTAINER=true --image=$TOOL_VERSION --command sleep infinity
    The command times out, which is expected behavior.
  2. Verify that the pod is running by running the command:
    oc get pods | grep trainer
  3. Open a shell in the pod by running the command:
    oc exec -it trainer -- /bin/bash
  4. Run the training command from within the pod:
    cd bin
    ./runTraining.sh -r <release_name> -t cfd95b7e-3bc7-4006-a4a8-a73a79c71255 -a seasonal-events -s <start_date> -e <end_date> -d true
    ./runTraining.sh -r <release_name> -t cfd95b7e-3bc7-4006-a4a8-a73a79c71255 -a related-events -s <start_date> -e <end_date> -d true
    Where <release_name> is the full name of the release, and <start_date> and <end_date> are the start and end dates of the data that is selected for the training.
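
The checks and commands in the steps above can be sketched as two small shell helpers. This is only an illustrative sketch, not part of the product: the function names (trainer_is_running, training_cmd, print_training_cmds) and the variable names (RELEASE_NAME, START_DATE, END_DATE, T_ARG) are assumptions introduced here. The first helper reads `oc get pods` output and succeeds only if a pod named trainer is in the Running state. The others print the two runTraining.sh commands from step 4 so that they can be reviewed before you run them inside the pod.

```shell
# Sketch only: helper names and variable names are hypothetical.

# Value passed to -t in the steps above.
T_ARG="cfd95b7e-3bc7-4006-a4a8-a73a79c71255"

# Read `oc get pods` output on stdin; succeed only if a pod named
# "trainer" is listed with STATUS "Running" (third column).
trainer_is_running() {
  awk '$1 == "trainer" && $3 == "Running" { found = 1 } END { exit !found }'
}

# Print the runTraining.sh invocation for one algorithm ($1).
# RELEASE_NAME, START_DATE, and END_DATE must be set by the caller.
training_cmd() {
  printf './runTraining.sh -r %s -t %s -a %s -s %s -e %s -d true\n' \
    "$RELEASE_NAME" "$T_ARG" "$1" "$START_DATE" "$END_DATE"
}

# Print both training commands in the order used in step 4.
print_training_cmds() {
  for algorithm in seasonal-events related-events; do
    training_cmd "$algorithm"
  done
}
```

For example, `oc get pods | trainer_is_running && echo "trainer pod is running"` covers the check in step 2. After you set RELEASE_NAME, START_DATE, and END_DATE to your own values, print_training_cmds prints the two commands to run from the bin directory inside the pod.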