IBM Cloud Code Engine is a fully managed, serverless platform that runs your containerized workloads, including web apps, microservices, event-driven functions and batch jobs with run-to-completion characteristics. Code Engine even builds container images for you from your source code. Because these workloads are all hosted within the same Code Engine platform, all of them can seamlessly work together. The Code Engine experience is designed so that you can focus on writing code and not on the infrastructure that is needed to host it.
Advanced users can benefit from the open Kubernetes API that Code Engine exposes in order to run technologies like Ray.
Ray is an open technology for “fast and simple distributed computing.” It makes it easy for data scientists and application developers to run their code in a distributed fashion. It also provides a lean, easy-to-use interface for distributed programming with many different libraries, well suited to machine learning and other compute-intensive tasks.
The following are a few features offered by Ray:
As described above, Ray also runs on Kubernetes. Since Code Engine exposes the Kubernetes API and runs containers at high scale in a serverless fashion, we looked at bringing the two together to “boost your serverless compute.”
That’s simple:
The idea is to run the Ray nodes as containers in the namespace that belongs to a Code Engine project. The configuration of the containers, the scaling, and the submission of tasks are handled entirely by Ray; Code Engine simply provides the compute infrastructure to create and run the containers.
For further reading about the Code Engine architecture, check out “Learning about Code Engine architecture and workload isolation.”
For further reading about Ray and how autoscaling works, see the “Ray Cluster Overview.”
Before you can get started, make sure you have the latest CLIs for Code Engine, Kubernetes and Ray installed:

1. Install Python 3.7 (for example, brew install python@3.7) and install Ray accordingly with pip3 install -U https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-2.0.0.dev0-cp37-cp37m-macosx_10_13_intel.whl.
2. Install the Kubernetes libraries: pip3 install kubernetes.
3. Verify the installation by typing ray in your terminal and checking that you can see the Ray command line help.

In order to start a Ray cluster, you need to define the cluster specification and where to run the cluster. Since Code Engine exposes the Kubernetes API, we can use the Kubernetes provider from Ray.
To tell Ray which Kubernetes environment to use, configure your kubeconfig so that kubectl points to the namespace of your Code Engine project.
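To double-check which cluster and namespace your kubeconfig currently targets, a minimal sketch using the kubernetes Python client installed above can print the active context (assuming the kubeconfig has already been set up for your project):

# Print the Kubernetes context and namespace that kubectl (and Ray's Kubernetes
# provider) will use, as a quick sanity check of the kubeconfig.
from kubernetes import config

contexts, active = config.list_kube_config_contexts()
print("active context:", active["name"])
print("namespace:", active["context"].get("namespace", "default"))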
With the following command you can download a basic Ray cluster definition and customize it for your namespace:
The downloaded example describes a Ray cluster named example-cluster with up to 10 workers, using the Kubernetes provider pointing to your Code Engine project. Please see the section “One final note for advanced Kubernetes users” below for more information.
Now you can start the Ray cluster by running ray up example-cluster.yaml. This command will create the Ray head node as a Kubernetes pod in your Code Engine project. When you create the Ray cluster for the first time, it can take up to three minutes until the Ray image is downloaded from the Ray repository. Ray will emit “SSH still not available” error messages until the image has been downloaded and the pod has started.
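If you would like to watch the head node pod come up, the kubernetes Python client can list the pods in the project's namespace; the namespace value below is a placeholder for your own Code Engine project namespace:

# List the pods in the Code Engine project's namespace to watch the Ray head
# node (and, later, the worker nodes) start up.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()
namespace = "your-code-engine-namespace"  # placeholder: use your project's namespace
for pod in v1.list_namespaced_pod(namespace).items:
    print(pod.metadata.name, pod.status.phase)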
The following very simple Ray program example illustrates how tasks can be executed in the remote Ray cluster. The function f simulates a workload and is called 200 times; the 200 invocations are distributed among the Ray cluster nodes:
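A minimal sketch of such a program, assuming f simulates its workload with a short sleep and reports the IP address of the host it ran on, could look like this:

# example.py: a minimal sketch of the program described above.
import socket
import time

import ray

# Connect to the Ray cluster this script is submitted to.
ray.init(address="auto")

@ray.remote
def f(index):
    # Simulate some work, then report which host ran this invocation.
    time.sleep(1)
    host = socket.gethostbyname(socket.gethostname())
    print(f"inside f({index}/200), host={host}")
    return index

# Launch 200 invocations of f; Ray distributes them across the cluster nodes.
results = ray.get([f.remote(i) for i in range(1, 201)])
print(f"finished {len(results)} tasks")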
After copying the above program to a file named example.py, you can run it by submitting it to the Ray cluster:
This will copy the Python program to the Ray head node and start it there. In parallel, the autoscaler will begin to scale up the Ray cluster with additional worker nodes. The Python program will be copied to the newly created worker nodes and run there in addition to the head node. The autoscaler scales up the Ray cluster in chunks until the maximum number of workers (10) is reached as defined in example-cluster.yaml with available_node_types.worker_node.max_workers. Finally, one minute (see idle_timeout_minutes in example-cluster.yaml) after the program has terminated, the autoscaler starts to downscale the Ray cluster until all worker nodes are deleted and only the head node is left.
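If you would rather inspect these two settings programmatically than open the file by hand, a small sketch with PyYAML (pip3 install pyyaml, which is an extra assumption and not part of the steps above) can print them:

# Print the autoscaler-related settings referenced above from example-cluster.yaml.
import yaml

with open("example-cluster.yaml") as f:
    spec = yaml.safe_load(f)

print("idle_timeout_minutes:", spec["idle_timeout_minutes"])
print("worker_node max_workers:", spec["available_node_types"]["worker_node"]["max_workers"])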
Example output: program output and autoscaler event messages. Each line of the form inside f(<index>/200), host=<ip> reflects the n-th invocation of function f, with index = 1, 2, …, 200. The output ends with a summary:
You can monitor the execution of the Ray tasks and the autoscaling of resources by running the following:
$ ray exec example-cluster.yaml 'tail -n 100 -f /tmp/ray/session_latest/logs/monitor*'
Alternatively, you can bring up the Ray dashboard and monitor the cluster with your browser:
$ ray dashboard example-cluster.yaml &
Forwarding from 127.0.0.1:8265 -> 8265
You can type http://127.0.0.1:8265 in your browser to view the dashboard:
When you’re done, you can simply tear down the cluster by running ray down example-cluster.yaml.
Let’s explain a few specifics of IBM Cloud Code Engine that might be interesting for advanced Kubernetes users. When running your Kubernetes workload (Pods, Deployments) in Code Engine, the following aspects should be kept in mind:
The example above illustrated a lean way to run a Ray cluster in an IBM Cloud Code Engine project in a purely serverless fashion, without the need to deal with any infrastructure or Kubernetes cluster management.