Temporal event training fails when run outside the pod
When you run the temporal event training command from outside the pod, the command fails with Spark error messages. This issue is observed on smaller deployments (trial-size clusters).
This issue occurs because the Spark pod does not perform optimally when the deployment has only a single pod acting as the controller or worker node.
- Run the training command. The command times out, which is expected behavior.
- Delete the trainer pod and create a new long-running one by running the following command:
kubectl delete pod trainer; kubectl run trainer -it --restart=Never --env=LICENSE=accept --env=RUNNING_IN_CONTAINER=true --image=$TOOL_VERSION --command -- sleep infinity
- Verify that the pod is running by running the following command:
oc get pods | grep trainer
- Log in to the pod by running the following command:
oc exec -it trainer -- /bin/bash
- Run the training commands from within the pod:
cd bin
./runTraining.sh -r <release_name> -t cfd95b7e-3bc7-4006-a4a8-a73a79c71255 -a seasonal-events -s <start_date> -e <end_date> -d true
./runTraining.sh -r <release_name> -t cfd95b7e-3bc7-4006-a4a8-a73a79c71255 -a related-events -s <start_date> -e <end_date> -d true
Where <release_name> is the full text value of the release, and <start_date> and <end_date> are the start and end dates of the data that is selected for the training.
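The steps above can be collected into a small wrapper script. This is only a sketch, not part of the product: the `run` dry-run helper, the `workaround` function, and the `RELEASE_NAME`, `START_DATE`, and `END_DATE` variables are assumptions for illustration, and the training is invoked through `oc exec` rather than an interactive shell.

```shell
#!/bin/sh
# Hypothetical wrapper around the documented workaround steps.
# TOOL_VERSION, RELEASE_NAME, START_DATE, and END_DATE must be set by
# the caller. With DRY_RUN=1 the commands are printed instead of
# executed, so the sequence can be reviewed before touching the cluster.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

workaround() {
  # Recreate the trainer pod as a long-running pod.
  run kubectl delete pod trainer
  run kubectl run trainer --restart=Never \
    --env=LICENSE=accept --env=RUNNING_IN_CONTAINER=true \
    --image="$TOOL_VERSION" --command -- sleep infinity

  # Confirm the pod is up.
  run oc get pods

  # Run both training algorithms inside the pod, one after the other.
  for algo in seasonal-events related-events; do
    run oc exec trainer -- bin/runTraining.sh \
      -r "$RELEASE_NAME" -t cfd95b7e-3bc7-4006-a4a8-a73a79c71255 \
      -a "$algo" -s "$START_DATE" -e "$END_DATE" -d true
  done
}
```

Setting `DRY_RUN=1` and calling `workaround` prints each command in order, which is a convenient way to double-check the pod name and date range before running the real thing.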