Handle the Unexpected with Bluemix Auto-Scaling

One of the intrinsic benefits of cloud platform services is that they drastically simplify the process of scaling your application. Scaling can be done either vertically – by increasing the amount of memory available to each instance – or horizontally – by creating additional instances. You can scale manually, either through the Cloud Foundry CLI or the application dashboard. However, it is also useful to be able to scale automatically in order to react quickly to spikes in application usage. The Auto-Scaling service lets users dynamically scale their application horizontally by creating custom policies that dictate scaling behavior.

By binding the service to your application, Auto-Scaling allows you to define rules that control the behavior for scaling in and out. Depending on what runtime your application uses, you are able to build scaling rules based on CPU utilization, memory usage, throughput and JVM heap usage. In addition, users can have more fine-grained control over their scaling policies by controlling the breach time required to scale or the amount of time the application must “cooldown” before the application scales again.

Example of a scaling rule based on CPU utilization.

To demonstrate the capabilities of this service, I have put together a demo application that scales when a load test is applied. The web app will visually display the current CPU utilization and memory usage of the instance from which the page was served. The color of the gauges indicates if the node has reached its threshold for scaling and is useful as a visual indicator of our application performance while the load test is being run. In addition to Auto-Scaling, we will be using the following 3rd party services to help run the load test and monitor application performance:


BlazeMeter

BlazeMeter provides performance and load testing services for web and mobile applications. Its rich feature set allows testing from multiple locations, offers real-time reporting and analytics, works on public or private clouds, and much more. Best of all, it is 100% compatible with the open-source Apache JMeter. As a 3rd party service in the Bluemix catalog, BlazeMeter offers a free tier for all users of the platform. We will be using this service to create a JMeter script and run a stress test on our demo application.

New Relic

New Relic is the market leader in providing real-time application analytics and visualizations of performance data. In this demo, we will be using New Relic’s application performance monitoring service to gain in-depth insight into our application while the stress test is applied. Like BlazeMeter, New Relic offers a free trial of its service to all Bluemix users. As a note, IBM’s Monitoring and Analytics service provides the same functionality, but to showcase Bluemix’s ability to easily integrate with 3rd party SaaS offerings, I’ve decided to use New Relic instead.

Auto Scaling Demo Diagram

If you would like to follow along with the step-by-step walk-through, check out the video below:

Without further ado, let’s get started with the demo!

What You’ll Need

  1. A Bluemix account – the free trial will work just fine if you do not already have an account
  2. The Cloud Foundry CLI – For an introductory guide, see here
  3. Google Chrome and the BlazeMeter Chrome Extension
  4. Node.js and any plain, old text editor
  5. Node packages: express, dust.js, consolidate, util, usage, cron

Step 1 – Deploy your App to Bluemix

    1. Either download the .zip file of this GitHub repository or clone the repository to your local machine using the following command:
      <code>git clone</code>
    2. Update the manifest.yml file to reflect your desired application and host name. Remember, the host name must be unique for all Bluemix applications.
    3. Update the package.json file with the same application name you used in the manifest.yml file.
    4. Install all node modules in the application working directory using npm.
    5. Open a terminal window and navigate to the directory that contains the sample app.
    6. Log in to Bluemix using the Cloud Foundry CLI and push the application.

Congratulations! Your app is now on Bluemix and accessible from the internet. Type the URL into your browser and check out the home page.

The application home page. Gauge activity should be normal under light load conditions.
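To give a rough idea of what sits behind gauges like these, here is a minimal sketch of how a Node.js instance might sample its own memory usage and pick a gauge color. It does not reproduce the demo's code: it uses Node's built-in `os` and `process` modules rather than the `usage` package listed above, and the function names, the 512 MB memory limit, and the 60% threshold are all assumptions chosen for illustration.

```javascript
const os = require('os');

// Hypothetical threshold, chosen to match the scale-out rule configured later.
const SCALE_OUT_THRESHOLD = 60; // percent

// Memory in use as a percentage of the instance's limit, to one decimal place.
function memoryPercent(usedBytes, limitBytes) {
  return Math.round((usedBytes / limitBytes) * 1000) / 10;
}

// Red once the metric reaches the scaling threshold, green otherwise.
function gaugeColor(percent, threshold) {
  return percent >= threshold ? 'red' : 'green';
}

// Take one reading for the instance that is running this process.
function sampleInstance(limitBytes) {
  const memory = memoryPercent(process.memoryUsage().rss, limitBytes);
  return {
    memory,
    memoryColor: gaugeColor(memory, SCALE_OUT_THRESHOLD),
    // One-minute load average as a stand-in for CPU activity
    // (always 0 on Windows).
    loadAverage: os.loadavg()[0],
  };
}

// Assumed memory limit of 512 MB for the instance.
console.log(sampleInstance(512 * 1024 * 1024));
```

Because each instance runs its own copy of this code, every page shows the readings of whichever instance happened to serve it.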

Step 2 – Add and Configure the Auto-Scaling Service

  1. Verify that your app is running by checking the ‘App Health’ section of the application dashboard.
  2. Click the ‘Add A Service Or API’ button.
  3. Select Auto-Scaling from the DevOps section of the service catalog.
    Select the Auto-Scaling service under Dev-Ops
  4. Select your application from the dropdown menu and click ‘Create’.
  5. After the service is bound, you will be prompted to restage the application. Click ‘RESTAGE’.
  6. Once the application has restarted, the Auto-Scaling service will be successfully bound to your app.

Step 3 – Create an Auto-Scaling Policy

    1. Navigate to the Auto-Scaling dashboard and select ‘Create Auto-Scaling Policy’ from the console.
      Click the button to create an Auto-Scaling policy
    2. Update the name of the policy and make the following changes:
      • The minimum number of application instances: 1
      • The maximum number of application instances: 2
    3. Alter the policy to reflect the following configuration:
      • Metric type: Memory
      • If average memory utilization exceeds 60%, then increase 1 instance
      • If average memory utilization is below 40%, then decrease 1 instance
      • Statistic Window: 30 seconds
      • Breach Duration: 60 seconds
      • Cooldown period for scaling out: 600 seconds
      • Cooldown period for scaling in: 600 seconds

The rule defines the thresholds for scaling and the timing conditions around them.
For a more detailed description of these configuration fields, reference this.

    4. Click ‘Save’ and return to the ‘Policy Configuration’ tab.
      The 'Policy Configuration' tab should now display your new Auto-Scaling rule.


You have now added auto-scaling to your application. Based on this setup, your application will scale out to two instances if the memory usage exceeds 60% for more than one minute. Conversely, if the application is already running on two instances and the memory usage stays below 40% for more than one minute, an instance will be deprovisioned.
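To make the interplay between thresholds, breach duration, and cooldown more concrete, here is a small simulation of a policy with the values configured in this step. This is only my reading of the policy semantics, not the Auto-Scaling service's actual algorithm; the `policy` object, its field names, and the `simulate` function are hypothetical.

```javascript
// Hypothetical policy object mirroring the values configured in this step.
const policy = {
  minInstances: 1,
  maxInstances: 2,
  upperThreshold: 60,  // percent memory, scale out above this
  lowerThreshold: 40,  // percent memory, scale in below this
  statisticWindow: 30, // seconds covered by each averaged sample
  breachDuration: 60,  // seconds a threshold must stay breached
  cooldownOut: 600,    // seconds to wait after scaling out
  cooldownIn: 600,     // seconds to wait after scaling in
};

// samples: one averaged memory percentage per statistic window.
// Returns the scaling events as { time, instances } objects.
function simulate(policy, samples, startInstances) {
  // Number of consecutive breaching windows needed to fill the breach duration.
  const needed = Math.ceil(policy.breachDuration / policy.statisticWindow);
  let instances = startInstances;
  let above = 0;
  let below = 0;
  let cooldownUntil = 0;
  const events = [];

  samples.forEach((value, i) => {
    const time = (i + 1) * policy.statisticWindow;
    above = value > policy.upperThreshold ? above + 1 : 0;
    below = value < policy.lowerThreshold ? below + 1 : 0;

    if (time < cooldownUntil) return; // still cooling down, take no action

    if (above >= needed && instances < policy.maxInstances) {
      instances += 1;
      cooldownUntil = time + policy.cooldownOut;
      above = 0;
      events.push({ time, instances });
    } else if (below >= needed && instances > policy.minInstances) {
      instances -= 1;
      cooldownUntil = time + policy.cooldownIn;
      below = 0;
      events.push({ time, instances });
    }
  });
  return events;
}

// A load spike followed by a long quiet period (made-up sample values).
const spike = [50, 65, 70].concat(new Array(20).fill(30));
console.log(simulate(policy, spike, 1));
```

With these made-up samples, the sketch scales out at the 90-second mark, after two consecutive windows above 60%, and scales back in at 690 seconds, once the 600-second cooldown has expired. A single window above the threshold is not enough to trigger anything.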

The core of your application is now complete, but we still want to add enhanced performance monitoring capabilities. For that, we will add the New Relic service.

Step 4 – Add and Configure the New Relic Service

    1. Select the ‘Add A Service Or API’ button from the application dashboard.
    2. Select New Relic from the DevOps section of the service catalog.
      Select the New Relic service under Dev-Ops
    3. Select your application from the dropdown menu, update the service name, and click ‘Create’.
    4. After the service is bound, you will be prompted to restage the application. Click ‘RESTAGE’.
    5. Once the restart is complete, navigate to the New Relic service and open up the dashboard.
      Click this button in Bluemix to open up the New Relic dashboard
    6. Run the following npm command in the directory where your application is stored to install the Node.js agent:
      <code>npm install newrelic</code>
    7. Update app.js so that the New Relic module is loaded first in the server startup script.
    8. Copy the newrelic.js file from /node_modules/newrelic/newrelic.js into the root folder of the app.
      Move newrelic.js File
    9. Push the application back to Bluemix using the CLI.
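For reference, loading the agent first usually means that the very first statement in app.js requires the `newrelic` module, so the agent can instrument everything loaded after it. The sketch below shows that ordering. The try/catch fallback and the `newrelicLoaded` flag are my own additions so the app still starts on a machine where the agent is not installed; the rest of the file layout is hypothetical.

```javascript
// app.js (top of file). Hypothetical layout; the real app.js may differ.

// The agent must be required before any other module so that it can
// instrument them as they are loaded. It reads its settings from the
// newrelic.js file in the app's root folder.
let newrelicLoaded = false;
try {
  require('newrelic');
  newrelicLoaded = true;
} catch (err) {
  // Assumption: run without monitoring rather than crash, for example
  // during local development before `npm install newrelic` has been run.
  console.warn('New Relic agent not loaded:', err.message);
}

// Every other require (express, consolidate, and so on) comes after the agent.
const http = require('http');

console.log('New Relic agent loaded:', newrelicLoaded);
```

If you would rather fail fast when the agent is missing, drop the try/catch and keep the bare `require('newrelic');` as the first line.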

Once the application is pushed and restarted, app performance metrics should be visible on the New Relic dashboard within five minutes. You can use this dashboard to monitor application throughput, error events, and several other key performance indicators of your application. Now let’s create the load test so we can generate useful performance metrics!

Step 5 – Add the BlazeMeter Service and Create a Load Test

  1. Select the ‘Add A Service Or API’ button from the application dashboard.
  2. Select BlazeMeter from the DevOps section of the service catalog.
    Select the BlazeMeter service under Dev-Ops
  3. Update the service name and click ‘Create’. This service cannot be bound to any particular application.
  4. Go to the newly created BlazeMeter service and open up the dashboard.
  5. Now that you are logged into BlazeMeter, open up the BlazeMeter Chrome extension. To create a load test using the plugin, complete the following steps:
    • Update the load test name (A)
    • Keep the concurrency level set at 50 (B)
    • Click the record button (C)
    • Go to the application home page in the browser
    • Stop the recording (D)
    • Click the .jmx button to export the load test as a JMeter file (E)
    • Select only the ‘’ domain and click ‘Submit’ (F)
      BlazeMeter Plugin
  6. In the BlazeMeter dashboard, click the ‘Add Test’ button on the toolbar.
  7. Update the test name, upload your JMeter script, and select ‘Save’ to create the test.
  8. Once the test is created on the BlazeMeter servers, click the ‘Start’ button in the toolbar.
    Start BlazeMeter Test

Step 6 – Monitor Application Performance

Now it is time to see all these services in action! While BlazeMeter provisions the necessary servers for the test, pull up the following screens so you can see your app’s performance metrics while it runs:

  • Application: Open your application’s home page in at least one browser tab
  • Auto-Scaling Dashboard: Pull up the ‘Metric Statistics’ tab of the Auto-Scaling service
  • New Relic: In the New Relic dashboard, navigate to your application and monitor the ‘Web transaction response time’ to see how your app’s response time changes under load
  • BlazeMeter Report: Once the test has started, your BlazeMeter test report will be available for real-time monitoring of app performance as the number of test clients hitting your app increases

Top-Left: The demo application home page shows the CPU and memory utilization of instance zero; Top-Right: The New Relic dashboard displays the response time, both in request queuing and in the actual node; Bottom-Left: The BlazeMeter report offers many views, but here we see the relationship between concurrent users and error rate over time; Bottom-Right: The Auto-Scaling console shows the memory usage over time relative to the policy thresholds

Step 7 – Application Scaling

Once your application’s memory usage has exceeded 60% for one minute, another instance will be provisioned and started. You should receive a notification of this in the Bluemix console, but to verify that the app has scaled, check the activity log on the right side of the dashboard.
The activity log will show an entry for scaling instances to 2 after scaling is complete

Now that there are two instances of your application, traffic will be distributed between both instances by a load balancer. If you attempt to hit your app, you can see which instance your client was served from right below the page header. If you compare the metrics between a page served from instance zero and instance one, you will notice that the gauges indicate the metrics for that particular instance. This is because multiple instances of the same application might not be located on the same server and thus may not share any resources. You will see the same behavior in the Auto-Scaling web console when viewing metric statistics for the individual instances.
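If you are curious how a page can know which instance served it, Cloud Foundry exposes the instance index to each instance through its environment. The sketch below reads it from the `CF_INSTANCE_INDEX` variable and falls back to the `instance_index` field of the `VCAP_APPLICATION` JSON. Whether the demo app does it exactly this way is an assumption on my part, and `getInstanceIndex` is a hypothetical helper name.

```javascript
// Return the zero-based index of the current instance, or null if the
// environment does not reveal it (for example when running locally).
function getInstanceIndex(env) {
  if (env.CF_INSTANCE_INDEX !== undefined) {
    const index = Number.parseInt(env.CF_INSTANCE_INDEX, 10);
    if (!Number.isNaN(index)) return index;
  }
  try {
    // VCAP_APPLICATION is a JSON document describing the running app.
    const app = JSON.parse(env.VCAP_APPLICATION || '{}');
    return Number.isInteger(app.instance_index) ? app.instance_index : null;
  } catch (err) {
    return null; // malformed JSON, treat as unknown
  }
}

console.log('Served from instance:', getInstanceIndex(process.env));
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, makes the helper easy to try out locally with fake values.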

Step 8 – Post Mortem

You can either wait for your BlazeMeter test to conclude or stop it after the second instance of your application has started. After the test is complete, it is important to review your test data in New Relic and the BlazeMeter report. By cross-referencing this historical metric data, you will be better equipped to determine the bottlenecks in your application and at what load/stress levels performance begins to degrade. These tools are essential to developing your application scaling policies in order to achieve optimal resiliency and reduce errors.
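As a rough illustration of that cross-referencing, suppose you have combined the concurrency levels from the BlazeMeter report with the error rates and response times from New Relic into a list of samples. The sketch below finds the lowest load level at which either metric crosses a limit you consider acceptable. The sample data, field names, and limits are all made up for the example; neither tool exports data in this exact shape.

```javascript
// Find the lowest number of concurrent users at which the error rate or the
// response time exceeds the given limits. Returns null if it never happens.
function firstDegradation(samples, limits) {
  const byLoad = [...samples].sort((a, b) => a.users - b.users);
  const hit = byLoad.find(
    (s) => s.errorRate > limits.maxErrorRate || s.responseMs > limits.maxResponseMs
  );
  return hit ? hit.users : null;
}

// Made-up measurements: error rate as a fraction, response time in milliseconds.
const measurements = [
  { users: 50, errorRate: 0.08, responseMs: 900 },
  { users: 10, errorRate: 0.0, responseMs: 120 },
  { users: 30, errorRate: 0.01, responseMs: 300 },
];

console.log(firstDegradation(measurements, { maxErrorRate: 0.05, maxResponseMs: 500 }));
```

A load level found this way is a reasonable starting point for deciding where to set the scale-out threshold and how short the breach duration needs to be.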

Auto-Scaling Service – Coming Features

You may have noticed that this demo only applies to applications deployed on Cloud Foundry and does not speak to applications on containers or VMs. The main reason is that auto-scaling functionality is baked right into the core features of the latter two deployment models. Instead of requiring a service to interface with the application in order to provide custom auto-scaling policies, containers and VMs have auto-scaling rules built into their respective dashboards. This model will ultimately make its way over to Cloud Foundry applications, but right now we have bigger fish to fry over in the Bluemix labs.

Our developers and designers are currently working hard on the concept of what we call ‘projects’. A project is essentially an amalgamation of applications deployed on Cloud Foundry, containers, and VMs, all working in conjunction to deliver what the end user recognizes as a single application. The reason you might architect a system like this is that each deployment model has its own advantages and disadvantages. In this manner, you can contain singular functional operations, like user sign-up, in their own application or service. When that particular operation gets bombarded with traffic, you will be able to scale it out without adversely affecting any of its interfacing applications. This strategy is not only effective in maintaining your overall system’s resiliency, it also helps cut costs by minimizing the total memory your applications need to add when scaling out.

Stay tuned for further updates!
