An IBM Cloud Object Storage (COS) bucket can be used as the backing store for a PVC.

When you imagine a file system, you are probably thinking of the block storage provided by disk drives. Object storage buckets can also be used for file system volumes on Kubernetes and might fit well into your application. For example, buckets can be managed outside of the application with a variety of tools and the IBM Cloud Console. Getting data in and out is a breeze.

This post starts from scratch and demonstrates the creation of a cluster, buckets, volumes and applications. Off-the-shelf container images for nginx and jekyll will be used to demonstrate, so no application coding is required.


Background

IBM Cloud supports fully managed Kubernetes clusters. “Storing data on Block Storage for VPC” explains how to use high-performance block storage for Kubernetes Persistent Volume Claims (PVCs). A PVC provides the backing read/write storage that is mounted as a volume in a pod.

An IBM Cloud Object Storage (COS) bucket can be used as the backing store for a PVC. Buckets might fit your use case better than block storage. Examples include the following:

  • Import or export data
  • Share data between pods
  • Utilize COS buckets using file system APIs

These are a few things to think about when choosing a bucket versus block storage:

  • Price: Bucket objects are persisted for a few pennies per GB per month. Pay as you grow and start for free.
  • Multiple pods: Multiple pods can mount the same PVC bucket.
  • Operational simplicity: A COS bucket can be easily populated and read in a variety of workflows.
  • Resiliency: Location and resiliency choices include cross-region and regional options.
  • Performance: Block storage and buckets have drastically different characteristics. Verify against application requirements.

This post demonstrates how to create everything from scratch. The scripts and Terraform configuration are available on GitHub. Use them to create all of the resources required to see PVC buckets in action. The basic steps are as follows:

  • Create a VPC and Kubernetes cluster.
  • Create a COS instance with associated Kubernetes secrets and storage classes.
  • Create the Kubernetes resources: PVC, deployment, service and ingress.
  • Verify it works.
  • Create a blog in a second PVC using Jekyll.
  • Serve the blog with nginx.

A security feature of the bucket is also highlighted — the bucket IP “allow list” limits bucket access to the Kubernetes cluster VPC.

OK, let's do it!

Prerequisites

The provisioning steps are done from the CLI, which lets you move them to a CI/CD pipeline or into IBM Cloud Schematics over time. See “Getting started with solution tutorials” for help with the following required tools:

  • Git
  • IBM Cloud CLI
  • Terraform
  • Docker
  • jq

You should be the account owner or have enough privileges to create resources. In the terminal, make your own copy of the local.env file and make the suggested edits. The prereq script will verify the tools are installed. This step is complete when you see: >>> Prerequisites met.

git clone https://github.com/IBM-Cloud/kubernetes-cos-pvc
cd kubernetes-cos-pvc
cp template.local.env local.env
"${EDITOR:-vi}" local.env   # make the suggested edits
source local.env
./000-prereq.sh
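
The prerequisite script itself is not reproduced in this post. As a rough illustration only, a tool check of this kind could look like the following sketch. The function names are hypothetical and the real 000-prereq.sh may work differently.

```shell
#!/bin/sh
# Hypothetical sketch: not the repository's 000-prereq.sh.
# Prints the name of every tool in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    # command -v is the POSIX way to test whether a command can be found
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Returns 0 and prints the success marker only when nothing is missing.
check_prereqs() {
  missing=$(missing_tools git ibmcloud terraform docker jq)
  if [ -n "$missing" ]; then
    echo "Missing tools: $missing" >&2
    return 1
  fi
  echo ">>> Prerequisites met."
}
```

Calling check_prereqs at the top of a setup script stops the run early with a list of what still needs to be installed.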

Container Registry Service

The default template.local.env sets IBMCR to “false”, which makes this step optional. Set IBMCR to “true” to use the IBM Container Registry, for example when pod access to hub.docker.com is disabled. This step copies the required Docker images into a newly created namespace in the IBM Container Registry. Image names will resemble us.icr.io/$BASENAME/:

./010-container-registry.sh

Cluster

You can use an existing VPC-based cluster or execute this step to create a cluster. Either way, the local.env CLUSTER_NAME is required. The creation is done in the cluster/ directory where you will find the Terraform configuration. The script will create a terraform.tfvars, then execute Terraform.

Take a look at cluster/main.tf to see all of the resources created. For example:

  • Resource “ibm_is_vpc”: creates a VPC
  • Resource “ibm_container_vpc_cluster”: creates a cluster

./020-create-cluster.sh

It can take over 30 minutes to create a cluster. Once it completes, check out the cluster in the cloud console.
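
The pattern of generating a terraform.tfvars file and then running Terraform can be sketched roughly as follows. This is an illustration under assumptions, not the repository's script: the helper names and the Terraform variable names in the usage note are hypothetical, and values containing double quotes are not escaped.

```shell
# Hypothetical sketch: the variable names passed in are assumptions,
# not necessarily the ones the repository's scripts use.
# Writes one `name = "value"` line per NAME=VALUE argument.
write_tfvars() {
  out=$1
  shift
  : > "$out"                      # truncate or create the file
  for pair in "$@"; do
    name=${pair%%=*}              # text before the first '='
    value=${pair#*=}              # text after the first '='
    printf '%s = "%s"\n' "$name" "$value" >> "$out"
  done
}

# Generates terraform.tfvars in the given directory, then runs Terraform there.
apply_dir() {
  dir=$1
  shift
  write_tfvars "$dir/terraform.tfvars" "$@" &&
    terraform -chdir="$dir" init &&
    terraform -chdir="$dir" apply -auto-approve
}
```

For example, apply_dir cluster "cluster_name=$CLUSTER_NAME" would write cluster/terraform.tfvars and apply the configuration in cluster/, assuming the configuration declares a variable with that name.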

Resources

The rest of the resources are created in the terraform/ directory. The script will create a terraform.tfvars, then execute Terraform.

TL;DR: skip down to 025-create-resources.sh.

IBM Cloud Object Storage (COS) bucket storage classes are installed by the resources in cos_storage_class.tf. If you previously followed “Installing the IBM Cloud Object Storage plug-in”, the comments in cos_storage_class.tf can be used to avoid this step.

The main.tf file creates the rest of the resources:

  • COS instance and secret keys that are then used to populate a couple of Kubernetes secrets
  • PVC configured to create a bucket automatically with limited access
  • Deployment for the nginx image
  • Service to expose the deployment pods
  • Ingress to expose the service to the public

Optionally, you can open main.tf and take a closer look at a few of the resources. TL;DR: skip down and execute the shell script.

Here is a cut-down version of the PVC:

resource "kubernetes_persistent_volume_claim" "pvc" {
  metadata {
    name = local.pvc_nginx_claim_name
    annotations = {
      "ibm.io/auto-create-bucket" = "true"
      "ibm.io/set-access-policy"  = "true"
      # ...
    }
  }
  spec {
    storage_class_name = "ibmc-s3fs-standard-regional"
    # ... access_modes and resources omitted
  }
}

The basics are pretty simple. The annotations configure the bucket: one creates the bucket automatically and the other sets the access policy, which restricts the bucket to an allow list of IP addresses (demonstrated later). The full list of annotations and storage classes can be found in the documentation at “Storing data on IBM Cloud Object Storage.”

Here is a cut-down version of the nginx deployment:

resource "kubernetes_deployment" "nginx" {
  # ... metadata omitted
  spec {
    # ... replicas and selector omitted
    template {
      # ... pod metadata omitted
      spec {
        container {
          name    = "nginx"
          image   = var.imagefqn_nginx
          command = ["sh", "-c", "echo '#Success' > /usr/share/nginx/html/index.html ; exec nginx -g 'daemon off;'"]
          port {
            container_port = 80
          }
          volume_mount {
            name       = "volname"
            mount_path = "/usr/share/nginx/html"
          }
        }
        volume {
          name = "volname"
          persistent_volume_claim {
            claim_name = local.pvc_nginx_claim_name
          }
        }
      }
    }
  }
}

The command echoes a string to the default site file for nginx (index.html). We can test this later to verify success. The volume_mount adds the volume to a directory within the deployment’s pod. The volume configuration ties the PVC to the deployment.
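
Once the deployment is running, one way to confirm that the file really landed on the mounted volume is to read it back from inside a pod. The following is a small sketch that assumes kubectl is already configured for the cluster. The helper name is hypothetical and the deployment name is passed in because the actual name depends on your configuration.

```shell
# Sketch only: the deployment name is an assumption (pass your own).
# Prints index.html as seen from inside one pod of the deployment.
show_index() {
  deployment=$1
  kubectl exec "deploy/$deployment" -- cat /usr/share/nginx/html/index.html
}
```

Because the mount path is backed by the PVC, the same index.html should also be visible as an object in the bucket.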

Two more configuration files demonstrate writing content to a bucket for later reading and sharing a PVC with another deployment. The first is jekyllblog.tf:

  • Resource “kubernetes_persistent_volume_claim” “jekyllblog”: PVC and associated bucket
  • Resource “kubernetes_deployment” “jekyllblog”: deployment that generates a blog and starts a web server

Then, jekyllnginx.tf has a deployment that mounts the same PVC:

  • Resource “kubernetes_deployment” “jekyllnginx”: deployment that mounts the same PVC and symlinks the nginx html directory to the generated site. These are the commands:
    • cd /usr/share/nginx
    • rm -rf html
    • ln -s /blog/kubernetes-cos-pvc/example/jekyllblog/myblog/_site html
    • exec nginx -g 'daemon off;'

Ingress exposes all three of these services with the subdomains nginx.<ingress domain>, jekyllblog.<ingress domain> and jekyllnginx.<ingress domain>. Here is a cut-down version:

resource "kubernetes_ingress" "example_ingress" {
  # ... metadata omitted
  spec {
    tls {
      secret_name = data.ibm_container_vpc_cluster.cluster.ingress_secret
      hosts       = [data.ibm_container_vpc_cluster.cluster.ingress_hostname]
    }
    rule {
      host = "nginx.${data.ibm_container_vpc_cluster.cluster.ingress_hostname}"
      http {
        path {
          backend {
            service_name = kubernetes_service.nginx.metadata[0].name
            service_port = 80
          }
        }
      }
    }
    rule {
      host = "jekyllblog.${data.ibm_container_vpc_cluster.cluster.ingress_hostname}"
      # ...
    }
    rule {
      host = "jekyllnginx.${data.ibm_container_vpc_cluster.cluster.ingress_hostname}"
      # ...
    }
  }
}

Write configuration values to terraform/terraform.tfvars and then execute the Terraform configuration:

./025-create-resources.sh

Once this completes, check out the following:

  • IBM Cloud Object Storage instance (find it in the storage section of the resource list)
  • Navigate to the COS instance and then the bucket with nginx in the name. Look for the following:
    • The Objects section indicates “Access denied” because of the access policies.
    • The Access policies section in the Authorized IPs panel has a list of IP addresses that can access this bucket. The VPC’s cloud service endpoint source addresses are listed. See “VPC behind the curtain” for more details.
  • In the Kubernetes clusters list, click your cluster and open the Kubernetes dashboard to see the rest of the resources that were created, or use the kubectl command line (for example, kubectl get deployments). The names all start with $BASENAME:
    • deployments
    • pods
    • services
    • ingresses
    • persistentvolumeclaims
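
From the command line, the resource types listed above can be fetched in a single kubectl call and filtered by name prefix. This is a minimal sketch that assumes kubectl already points at the cluster and that BASENAME is set as in local.env. The helper name is hypothetical.

```shell
# Sketch only: lists the demo's resources in the current namespace.
# -o name prints one "kind/name" entry per line, so a "/<prefix>" match
# keeps only resources whose name starts with the prefix.
list_demo_resources() {
  prefix=$1
  kubectl get deployments,pods,services,ingresses,persistentvolumeclaims -o name |
    grep "/$prefix"
}
```

A typical call would be list_demo_resources "$BASENAME".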

Test

Run the test script. It reads the simple nginx service with curl and expects the #Success string that was written to index.html. Two URLs are also displayed; open each of them to verify that the blog is being served by the other two deployments. It can take a couple of minutes for the blogs to become available:

./030-test.sh
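
Because the services can take a few minutes to respond, a check like this usually needs a retry loop. The following is a hedged sketch of such a loop, not the contents of 030-test.sh. The function name, the default attempt count and the delay are all assumptions.

```shell
# Hypothetical sketch, not the repository's 030-test.sh.
# Polls a URL until the response body contains the expected text.
# Arguments: url, expected text, attempts (default 30), delay in seconds (default 10).
wait_for_text() {
  url=$1
  expected=$2
  attempts=${3:-30}
  delay=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s silences progress output, -f turns HTTP errors into a non-zero exit
    if curl -sf "$url" | grep -q "$expected"; then
      echo "ok: $url"
      return 0
    fi
    i=$((i + 1))
    # wait before the next attempt, but not after the last one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "failed: $url" >&2
  return 1
}
```

For the nginx service, a call could look like wait_for_text "https://nginx.<ingress domain>" Success, with the ingress domain of your cluster substituted in.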

Clean up

Clean it all up with the following command:

./040-cleanup.sh

Conclusion

A bucket-backed Kubernetes Persistent Volume Claim (PVC) works well for hosting static content, and a production environment could build the static content in the CI/CD pipeline to create a new release.

A PVC bucket may help with legacy applications that use a local volume for storage:

  • Create a container image for the application. 
  • Upload the local volume to a bucket, and mount a PVC bucket for the application.
  • Manage backup and restore through bucket operations.

The PVC bucket can be used to export application data for archives, analysis and more.
