How and why to run Ray — an open technology for fast and simple distributed computing — on IBM Cloud Code Engine.

What is IBM Cloud Code Engine?

IBM Cloud Code Engine is a fully managed, serverless platform that runs your containerized workloads, including web apps, microservices, event-driven functions and batch jobs with run-to-completion characteristics. Code Engine even builds container images for you from your source code. Because these workloads are all hosted within the same Code Engine platform, all of them can seamlessly work together. The Code Engine experience is designed so that you can focus on writing code and not on the infrastructure that is needed to host it. 

Advanced users can benefit from the open Kubernetes API that Code Engine exposes in order to run technologies like Ray.

What is Ray?

Ray is an open technology for “fast and simple distributed computing.” It makes it easy for data scientists and application developers to run their code in a distributed fashion. It provides a lean, simple interface for distributed programming, along with many libraries that are well suited to machine learning and other compute-intensive tasks. 

The following are a few features offered by Ray: 

  • Single-machine code can be parallelized with little to no code changes.
  • Scale everywhere: Run the same code on laptops, multi-core machines, any cloud provider or on a Kubernetes cluster.
  • Out-of-the-box scalable machine learning libraries. There is a large ecosystem of applications, libraries and tools on top of the core Ray to enable complex applications. 
  • It is open source.

As described above, Ray also runs on Kubernetes. Since Code Engine exposes Kubernetes APIs and runs containers at high scale in a serverless fashion, we looked at bringing the two together in order to “boost your serverless compute.”

Why should I run Ray on Code Engine?

That’s simple:

  • You don’t want to manage clusters or infrastructure. Instead, you want to spend your time on your actual job and task.
  • You don’t want to wait for your local computer to process a large data set, but want to simply offload the heavy compute to the cloud.
  • You don’t want to waste money running infrastructure or containers when there is nothing to run; instead, you want automatic scaling to zero so that idle time does not introduce any charges.
  • You don’t want to use different programming models; you want the same developer experience locally and in the cloud.

The idea is to run the Ray nodes as containers in the namespace that belongs to a Code Engine project. The configuration of the containers, the scaling, and the submission of tasks are handled entirely by Ray; Code Engine simply provides the compute infrastructure to create and run the containers.

For further reading about the Code Engine architecture, check out “Learning about Code Engine architecture and workload isolation.” 

For further reading about Ray and how autoscaling works, see the overview in “Ray Cluster Overview.”

How to run your Ray task on IBM Cloud Code Engine in a few steps

Before you get started, make sure you have the latest CLIs for Code Engine, Kubernetes and Ray installed.

Install the pre-requisites for Code Engine and Ray

  1. Get ready with IBM Code Engine
  2. Set up the Kubernetes CLI 
  3. Install Python3 libraries for Ray and Kubernetes:
    • Follow the Ray installation instructions for your operating system and Python3 version. For example, in our macOS-based installation, we used Python 3.7 (brew install python@3.7) and installed Ray with pip3 install -U ray
    • Install Kubernetes libraries: pip3 install kubernetes
    • Verify you can run ray in your terminal and that you can see the ray command line help.

Step 1: Define your Ray cluster

In order to start a Ray cluster, you need to define the cluster specification and where to run the Ray cluster. Since Code Engine exposes the Kubernetes API, we can use the Kubernetes provider from Ray. 

To tell Ray which Kubernetes environment to use, run the following commands:

  1. If not already done, create a project in Code Engine:
    ibmcloud ce project create -n <your project name>
  2. Select the Code Engine project and switch the `kubectl` context to the project:
    ibmcloud ce project select -n <your project name>  -k
  3. Extract the Kubernetes namespace assigned to your project. The namespace can be found in the NAME column in the output of the command:
    kubectl get namespace
  4. Export the namespace:
    export NAMESPACE=<namespace from above (3.)>

With the following command you can download a basic Ray cluster definition and customize it for your namespace:

$ curl -L <URL of the example cluster definition> | sed "s/NAMESPACE/$NAMESPACE/" > example-cluster.yaml

The downloaded example describes a Ray cluster with the following characteristics:

  • A cluster named example-cluster with up to 10 workers using the Kubernetes provider pointing to your Code Engine project
  • A head node type with 1 vCPU and 2 GB memory using the ray image rayproject/ray:1.5.1
  • A worker node type with 0.5 vCPU and 1 GB memory using the ray image rayproject/ray:1.5.1
  • The default startup commands to start Ray within the container and listen to the proper ports
  • The autoscaler upscale speed is set to 10 for quick upscaling in this short and simple demo
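Sketched in YAML, the relevant parts of such a cluster definition look roughly like the following. This is a simplified, hand-written excerpt based on the Ray autoscaler configuration schema; the field values mirror the bullet list above, but the actual downloaded file differs in detail:

```yaml
cluster_name: example-cluster
max_workers: 10
upscaling_speed: 10
idle_timeout_minutes: 1

provider:
  type: kubernetes
  namespace: <your Code Engine project namespace>

available_node_types:
  head_node:
    node_config:
      spec:
        containers:
          - image: rayproject/ray:1.5.1
            resources:
              requests: {cpu: 1, memory: 2G}
              limits: {cpu: 1, memory: 2G}    # identical to requests, as Code Engine requires
  worker_node:
    max_workers: 10
    node_config:
      spec:
        containers:
          - image: rayproject/ray:1.5.1
            resources:
              requests: {cpu: 500m, memory: 1G}
              limits: {cpu: 500m, memory: 1G}

head_node_type: head_node
```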

Please see the section “One final note for advanced Kubernetes users” below for more information.

Step 2: “Ray up” your cluster

Now you can start the Ray cluster by running ray up example-cluster.yaml. This command creates the Ray head node as a Kubernetes Pod in your Code Engine project. When you create the Ray cluster for the first time, it can take up to three minutes until the Ray image is downloaded from the Ray repository. Ray will emit “SSH still not available” error messages until the image has been downloaded and the pod has started.

Step 3: Submit a very simple but distributed Ray program  

The following very simple Ray program illustrates how tasks can be executed in the remote Ray cluster. The function f simulates a workload and is called 200 times. The 200 workloads are distributed among the Ray cluster nodes:

# import all helper libraries
from collections import Counter
import socket
import time
import ray

# connect to ray
ray.init(address='auto', _redis_password='5241590000000000')

# f: returns the hostname/ip of the current container executing the function. f is annotated
# with @ray.remote to indicate that it should be distributed among the nodes in the Ray cluster.
@ray.remote
def f(counter, total):
    host = socket.gethostbyname(socket.gethostname())
    print("inside f({}/{}), host={}".format(counter + 1, total, host))
    # simulate work for 5 seconds
    time.sleep(5)
    # return IP address
    return host

# execute function f 200 times and fetch the results
n = 200
object_ids = [f.remote(i, n) for i in range(n)]
ip_addresses = ray.get(object_ids)

# print a histogram of the number of tasks executed per hostname/ip
print('Tasks executed')
for ip_address, num_tasks in Counter(ip_addresses).items():
    print('    {} tasks on {}'.format(num_tasks, ip_address))

After you have copied the above program to a file, you can run it by submitting it to the Ray cluster:

# Check that the head node pod status is Running. 
# If the status is ContainerCreating, the Ray image download is still running.
kubectl get pods
# Submit the program to the Ray cluster; example.py stands for the file
# you saved the program to
ray submit example-cluster.yaml example.py

This will copy the Python program to the Ray head node and start it there. In parallel, the autoscaler will begin to scale up the Ray cluster with additional worker nodes. The Python program will be copied to the newly created worker nodes and run there in addition to the head node. The autoscaler scales up the Ray cluster in chunks until the maximum number of workers (10) is reached as defined in example-cluster.yaml with available_node_types.worker_node.max_workers. Finally, one minute (see idle_timeout_minutes in example-cluster.yaml) after the program has terminated, the autoscaler starts to downscale the Ray cluster until all worker nodes are deleted and only the head node is left. 

Example output: Program output and autoscaler event messages

  • A line inside f(<index>/200), host=<ip> reflects the <index>-th invocation of function f, with index = 1, 2, … 200
  • The output ends with a summary showing the distribution of the 200 function calls among the 11 nodes of the Ray cluster (1 head node and 10 worker nodes). Because the cluster starts with no worker nodes, the distribution is not equal: it takes some time until the autoscaler has created all workers. If you re-submit the example within a minute, while all worker nodes are still running, the distribution will be closer to an even split across the nodes.

Step 4: Monitor your cluster

You can monitor the execution of the Ray tasks and the autoscaling of resources by running the following:

$ ray exec example-cluster.yaml 'tail -n 100 -f /tmp/ray/session_latest/logs/monitor*'

Alternatively, you can bring up the Ray dashboard and monitor the cluster with your browser:

$ ray dashboard example-cluster.yaml &
Forwarding from -> 8265

Open the forwarded address shown in the command output in your browser to view the dashboard.

Step 5: “Ray down” your cluster

If you’re done, you can simply tear down the cluster by running ray down example-cluster.yaml.

One final note for advanced Kubernetes users

Let’s explain a few specifics of IBM Cloud Code Engine that might be interesting for advanced Kubernetes users. When running your Kubernetes workload (Pods, Deployments) in Code Engine, the following aspects should be kept in mind:

  • A default service account exists that has specific permissions and cannot be changed.
  • Roles and role bindings cannot be managed, and the existing service accounts must be used.
  • Resource requests and resource limits must be identical.
  • Image pull secrets can be created and used as expected to pull images from a private registry.
  • imagePullPolicy must be set to Always or kept unset.
  • Config maps and secrets can be managed and used as expected to configure the Pods.
  • Services must be of type ClusterIP only.
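For example, a Pod spec that respects these constraints would look roughly like the following. This is a hand-written sketch; the image reference and all names are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-workload                          # placeholder name
spec:
  containers:
    - name: app
      image: us.icr.io/<namespace>/<image>   # placeholder private-registry image
      imagePullPolicy: Always                # must be Always, or left unset
      resources:
        requests:
          cpu: 500m
          memory: 1G
        limits:                              # must be identical to the requests
          cpu: 500m
          memory: 1G
  imagePullSecrets:
    - name: my-registry-secret               # optional, for a private registry
```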


The example above illustrated a lean way to run a Ray cluster in an IBM Cloud Code Engine project in a purely serverless fashion, without the need to deal with any infrastructure or Kubernetes cluster management.

