In a well-designed application, each function hands off its state to the next function smoothly and quickly. The application is always available, with no interruptions (or at most imperceptible ones) except for scheduled maintenance downtime.
Inevitably, the developer decides to migrate the application to a cloud environment. In this scenario, the developer is not proactive. To prepare for migration, he makes a few changes to the application, such as adding a login box so that users with proper credentials can use it. The credentials determine how much control the user has over the application, which depends on the type of cloud service he rents.
If the developer migrates the application as Software as a Service (SaaS), the only control the user has is to access the application from a desktop or mobile device in order to process business tasks such as billing and invoicing; under this scheme, the user is not allowed to control the operating system, hardware, or network infrastructure on which any SaaS application is running. Only the developer can build, deploy, run, and manage upgrades and patches to all functionalities of the application.
If the developer migrates the application as part of Platform as a Service (PaaS), the user can control the application within the full business life cycle of the platform, manipulating things such as spreadsheets and payroll processing. The user determines what changes to make to the application (like adding an option to a drop-down list provided by the developer). The user is still not allowed to control the operating system, hardware, or network infrastructure.
Shortly after the developer completes the migration process, he encounters interoperability and performance issues. Uptime availability begins to slide down from the guaranteed level set forth in a service level agreement (SLA). The users howl about the application responding slower than usual to their requests for data they need to make critical business decisions; the users have lost new business opportunities.
Frantic for fast solutions, the developer may become reactive.
The nightmare of being reactive
First, he stops production in the cloud while pretending that it is down for scheduled maintenance. He puts a notice on users' browsers that says: "Service is temporarily down. Please wait." Then he begins to work on the application in house.
In no time, the developer finds that it is difficult to locate the code he needs to expand to include a drop-down list of business tasks (such as invoicing). He discovers too late that the application, which ran perfectly well in house, was written by a previous developer as one lengthy single unit rather than as, say, 500 modular parts.
Frantically, he patches up the application with a modular part for the drop-down list and then tests it in the cloud to determine how evenly resource instances are allocated to run the application. He discovers too late that load balancing failed: one resource instance that had reached its maximum capacity failed. Other resource instances that had reached 75 percent of their maximum capacity could not take over the business transactions from the failed resource instance.
Before the migration, the developer did not ensure that each resource instance was used at no more than 50 percent of its capacity, so that if one resource instance failed, the healthy resource instances could take over.
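The arithmetic behind this failure can be checked with a few lines of code. The following is a minimal sketch, not the developer's actual tooling: the function name can_absorb_failure is hypothetical, and it assumes that every instance has the same capacity, that the failed instance's load can be split freely across the survivors, and that there were two surviving instances.

```python
def can_absorb_failure(utilizations, failed_index):
    """Return True if the surviving instances can take over the failed one's load.

    Assumption: every instance has the same capacity (1.0 = 100 percent) and
    the failed instance's load can be split freely across the survivors.
    """
    survivors = [u for i, u in enumerate(utilizations) if i != failed_index]
    if not survivors:
        return False
    spare = sum(1.0 - u for u in survivors)  # total unused capacity left
    return utilizations[failed_index] <= spare


# The scenario above: one instance at 100 percent, two others at 75 percent.
print(can_absorb_failure([1.0, 0.75, 0.75], failed_index=0))  # False
# Every instance held at 50 percent: the survivors have room for the failed load.
print(can_absorb_failure([0.5, 0.5, 0.5], failed_index=0))    # True
```

With the instances at 75 percent, the two survivors have only 50 percent of one instance's capacity to spare between them, which is not enough to carry a full instance's load.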
Meanwhile, users complain about prolonged maintenance work when they begin to suspect the developer has been having problems with the application. The users demand credits, refunds, and free months to compensate for lack of service as specified in the SLA. They demand they be allowed to terminate the service if unsatisfactory service continues.
At this point the developer groans and moans that he did not do two things before he migrated the application:
- He did not divide the application into modular parts.
- He did not include a resource threshold window in the application to track how the resource instances are being used.
The developer grieves over the loss of service, users, and reputation, all because he did not make proactive behavioral changes in the first place. It would have been a different story had he been proactive.
Proactive behavioral changes
If a developer makes proactive behavioral changes, he can keep his existing users happy, protect his reputation, and gain new subscribers.
First, he gets a team to help him to divide the application into modules, each focusing on a behavioral change. His team tests the modules in house and checks with the users to ensure that the modules' behaviors meet their expectations. Once that is done, he migrates the application to a cloud service type.
Following are two module examples that the team can include in the application:
- Drop-down controls
- Threshold thumbnail windows
Drop-down controls module
This module provides drop-down list controls that let users choose from a list of values. With one mouse click, a user chooses one and only one option. While holding down the CTRL key, the user can choose multiple options in the list. With a standard drop-down list, the user is limited to the choices in the list, but with a combo box, the user can enter a value that is not in the list.
One example is a drop-down list of business tasks that need to be activated:
- Payroll processing
- Spreadsheet
- Invoicing
- Billing
If you choose the payroll processing option, you get the application to connect in the background with another application to start payroll processing. If you choose the spreadsheet option, a list of spreadsheets appears. You then choose a spreadsheet to view and edit or you can add a new sheet.
If you choose the invoicing option, a list of invoices appears. You then choose an invoice to view and print. You can bring up a blank invoice form to fill out and then submit it to a recipient via email. If you choose the billing option, you choose an account to view and send payment.
If a value does not appear in the list of choices and the user knows a new business task has been added to the application, he can enter, say, invoicing2 in the combo box.
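The difference between the two controls comes down to how an unlisted value is handled. Here is a minimal sketch under stated assumptions: the function names are hypothetical, the task names are taken from the options described above, and adding the typed value to the list is an illustrative choice rather than required behavior.

```python
TASKS = ["payroll processing", "spreadsheet", "invoicing", "billing"]


def select_from_dropdown(options, choice):
    """Standard drop-down list: the choice must already be in the list."""
    if choice not in options:
        raise ValueError(f"{choice!r} is not an option in the list")
    return choice


def select_from_combo_box(options, choice):
    """Combo box: a typed value that is not in the list is accepted.

    Assumption: the new value is also appended so it is offered next time.
    """
    if choice not in options:
        options.append(choice)
    return choice


tasks = list(TASKS)
print(select_from_dropdown(tasks, "invoicing"))    # invoicing
print(select_from_combo_box(tasks, "invoicing2"))  # invoicing2
print(tasks[-1])                                   # invoicing2
```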
Here is another drop-down list showing four threshold options:
- Resource
- User
- Data request
- Dashboard
If you choose only one option with a mouse click, a threshold thumbnail window module (in the next section) brings up a window to show how well the application is performing on that option.
For example, if you choose only the resource option, a thumbnail window on resource usage appears on the screen. If you then choose only the user option, a thumbnail window on user status appears. To choose the first two options together, hold down the CTRL key while clicking so that a thumbnail window for each option appears side by side. Choose the dashboard option to show the first three options in one window, each showing a different threshold level.
Threshold thumbnail window module
The module pops up a thumbnail window activated by your selection of an option in the threshold drop-down list. The window shows how well the application is performing in real time with respect to a threshold level set by the provider in a threshold policy that should be a part of the SLA.
If the module detects that the application is performing at or below the acceptable threshold level, the application is running as planned, ensuring operational continuity. If it is above the threshold level while still under the maximum level, it is an indication that application operations might be interrupted due to something like insufficient resources to complete a task, users not logging off when done with the service, or an excessive number of data requests remaining in a queue.
If the interruption lasts for a few milliseconds, it is usually not noticeable to the human eye, ensuring operational continuity.
If you feel the threshold level is set too high, you may need to contact the provider about lowering the level to a more acceptable level.
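The check the module performs can be expressed as a three-way comparison. This is a minimal sketch, assuming a single numeric measurement per option; the function name classify and the status labels are hypothetical and do not come from any particular monitoring product.

```python
def classify(value, threshold, maximum):
    """Compare a measurement with the threshold and maximum levels.

    The labels are illustrative:
      "normal"     - at or below the threshold; running as planned
      "warning"    - above the threshold but still under the maximum
      "at maximum" - no headroom left
    """
    if not 0 <= threshold < maximum:
        raise ValueError("threshold must be non-negative and below the maximum")
    if value <= threshold:
        return "normal"
    if value < maximum:
        return "warning"
    return "at maximum"


# Hypothetical resource usage in percent, with a 70 percent threshold.
print(classify(60, threshold=70, maximum=100))   # normal
print(classify(85, threshold=70, maximum=100))   # warning
print(classify(100, threshold=70, maximum=100))  # at maximum
```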
The next question to answer is "How do I remain proactive?"
How you can remain proactive
After successful changes to the application bound to a cloud environment, make sure you remain proactive. Ask yourself questions about:
- Threshold policies.
- Browsers' inherent limitations.
- Application's statefulness.
- Failover mechanism types.
- Security issues.
Consider three threshold policy types: resource, user, and data request. For each policy type, application testing may have a different threshold level than production. Use capacity planning ahead of time to prepare your system to implement threshold policies.
Resource threshold policy ensures that resource consumption is balanced dynamically for applications in the cloud. The threshold level is set below the maximum number of additional resource instances that could be consumed. When resource consumption exceeds the threshold level during a spike in workload demands, additional resource instances are allocated. When the demand returns to or below the threshold level, the resource instances that were created are freed up and put to other use.
User threshold policy ensures that users can access the application concurrently up to the limit specified in a user license from the provider. For example, the license covers 3,000 users but allows a maximum of only 2,500 users to access the application concurrently, and the threshold level is set at 2,000 concurrent users. If the number of concurrent users is at or below the threshold level, the application is continuously available, assuming that resource consumption and data requests are at or below their respective threshold levels.
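A resource threshold policy of this kind can be sketched as a function that returns how many instances should be running for a given demand. This is an illustration under stated assumptions, not a provider's API: the names target_instances and ceil_div are hypothetical, instances are assumed to have equal capacity, and the numbers are invented.

```python
def ceil_div(a, b):
    """Integer division rounded up (avoids floating-point edge cases)."""
    return -(-a // b)


def target_instances(demand, capacity_per_instance, threshold_percent, base_instances):
    """Smallest instance count that keeps utilization at or below the threshold.

    Never drops below base_instances, the allocation used for normal workloads.
    """
    needed = ceil_div(demand * 100, capacity_per_instance * threshold_percent)
    return max(base_instances, needed)


# Hypothetical numbers: each instance handles 100 requests per second,
# the threshold is 70 percent, and two instances run during normal demand.
for demand in (100, 300, 100):
    print(demand, target_instances(demand, 100, 70, base_instances=2))
# 100 2
# 300 5
# 100 2
```

During the spike, three additional instances are allocated; when demand falls back, the count returns to two and the extra instances are freed.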
Data request threshold policy ensures data requests to the application can be processed immediately. The threshold level is set below the maximum number of data requests and the maximum size of data requests that users can send concurrently. If the number of data requests exceeds the threshold level, a message should pop up to show how many data requests are in queue waiting to be processed.
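The queue behavior described here can be sketched with a small class. It is a minimal illustration: the class name RequestQueue, the message wording, and the choice to raise an error at the maximum are assumptions, and only the number of requests (not their size) is tracked.

```python
from collections import deque


class RequestQueue:
    """Queue that reports its backlog once the threshold level is exceeded."""

    def __init__(self, threshold, maximum):
        self.threshold = threshold
        self.maximum = maximum
        self._queue = deque()

    def submit(self, request):
        """Queue a request; return a message if the backlog exceeds the threshold."""
        if len(self._queue) >= self.maximum:
            raise OverflowError("maximum number of data requests reached")
        self._queue.append(request)
        if len(self._queue) > self.threshold:
            return f"{len(self._queue)} data requests are waiting in the queue"
        return None


queue = RequestQueue(threshold=2, maximum=4)
print(queue.submit("report A"))  # None
print(queue.submit("report B"))  # None
print(queue.submit("report C"))  # 3 data requests are waiting in the queue
```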
Browsers' inherent limitations
If you are using both Mozilla Firefox and Internet Explorer, did you know that Firefox loads pages faster and uses less memory than Internet Explorer? Firefox's main limitation is that it can shut down or respond slowly due to the unavailability of:
- Additional resources after resource consumption reaches the maximum capacity of resource instances.
- A threshold thumbnail window to alert you when resource consumption exceeds the threshold level.
To fix the problem, make sure resource consumption stays at or below the threshold level and instance resource redundancy failover is working.
If the thumbnail window shows resource consumption has exceeded the threshold level, it should:
- Pop up a warning message on consumption excess.
- Ask you to shut down the browser or have the system perform automatic deletion of firefox.exe processes and restart the browser.
Statefulness of application functions
Do you know how well statefulness of application functions is performing? Statefulness refers to how well one state of an application function is behaving when it goes to subsequent states of other functions in the cloud.
In this oversimplified example, a business buys a retail item in the cloud. The state of the function that validates the credit card information should move quickly to the state of the function that sends the information to a bank account, and then move quickly to the state of the function that notifies the customer that the retail item has been shipped to the business address.
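One way to observe how quickly each state hands off to the next is to run the functions in sequence and time each one. The sketch below assumes the three functions can be represented as simple callables; the state names, the TRANSITIONS table, and run_purchase are hypothetical.

```python
import time

# Each state names the state that follows it; None marks the end of the flow.
TRANSITIONS = {
    "validate_card": "send_to_bank",
    "send_to_bank": "notify_customer",
    "notify_customer": None,
}


def run_purchase(handlers, start="validate_card"):
    """Run each function in order and record how long each state took."""
    timings = []
    state = start
    while state is not None:
        began = time.perf_counter()
        handlers[state]()  # perform the work for this state
        timings.append((state, time.perf_counter() - began))
        state = TRANSITIONS[state]
    return timings


# Placeholder handlers that do nothing; real ones would call the services.
handlers = {name: (lambda: None) for name in TRANSITIONS}
for state, seconds in run_purchase(handlers):
    print(f"{state}: {seconds:.6f} s")
```

Running the same measurement in house and in the cloud gives a per-state comparison that shows which transition is slowing down.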
If going from one state to other states results in the application running slower in the cloud than in house, consider the possible causes:
- Flaws in application logic in handling business transactions.
- Threshold level in the resource, user, and/or data requests set too high.
- Inefficient resource instances resulting in untimely statefulness.
- Users or data requests competing for the same resource instances.
Failover mechanisms ensure operational continuity during events like impending hardware failures, resource stress, or prolonged latency delays. Exceptions include acts of God, the provider's scheduled downtime maintenance, or accidental cutting of the fiber optics line.
Here are some failover mechanism examples to consider:
- Load sharing redundant: Two or more systems loaded with no more than 50 percent of the total load. When a device fails, other devices pick up the load with little or no interruption or changes in threshold levels.
- Instance resource redundant: Two or more resource instances loaded with no more than 50 percent of the total load. When a resource instance fails, other resource instances pick up the load with little or no changes in threshold levels.
- Data request queue redundant: Two or more data request queues loaded with no more than 50 percent of the queue. When a queue fails, other queues pick up the load with little or no interruption.
- Alternate connection retry: If network interruption lasts more than two minutes, reconnect to another physical server or virtual machine via alternate network connections.
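The alternate connection retry mechanism can be sketched as a loop over candidate endpoints. This is an illustration only: connect_with_failover and the endpoint names are hypothetical, and the connect callable is assumed to raise ConnectionError once its own timeout (for example, the two minutes mentioned above) has elapsed.

```python
def connect_with_failover(endpoints, connect):
    """Try each endpoint in order; return the first connection that succeeds.

    Assumption: `connect` raises ConnectionError when an endpoint cannot be
    reached within the caller's chosen timeout.
    """
    errors = []
    for endpoint in endpoints:
        try:
            return connect(endpoint)
        except ConnectionError as exc:
            errors.append((endpoint, str(exc)))
    raise ConnectionError(f"all endpoints failed: {errors}")


def fake_connect(endpoint):
    """Stand-in for a real connection call; the primary server is down."""
    if endpoint == "primary-server":
        raise ConnectionError("network interruption")
    return f"connected to {endpoint}"


print(connect_with_failover(["primary-server", "backup-vm"], fake_connect))
# connected to backup-vm
```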
If your company posts revenues greater than US$1 billion a year, private clouds may be more cost effective than public clouds. A private internal cloud has many of the same business characteristics as a public cloud, but with much higher levels of governance over security, availability, and failover mechanisms than a small business with revenues of less than US$1 million would typically have.
An internal private cloud allows you to store and get data from known locations in a specific jurisdiction (like the United States or Canada). It is suitable for storing your sensitive, compliance, privacy, and test data.
In contrast, data in a public cloud can be stored in unknown locations and might not be easy to retrieve. Unknown locations are not suitable for storing compliance, privacy, sensitive, and test data.
One security issue is whether the provider's administrator can access your sensitive, compliance, privacy, and test data stored at known locations without your permission. Another issue is that the stored data might be in geographical areas where privacy and compliance regulations differ from those in countries you are familiar with. Laws on data export controls can vary from one country to another.
Before you give the administrator permission, spell out his responsibilities in a contract: what he can and cannot do in accessing your data, what laws on data exports he must comply with, and what policies he must implement to mitigate the risk of hackers setting up command and control (CnC) centers in the cloud.
Hackers have tools to detect data traveling from your desktop to unknown locations and vice versa. PaaS platforms have been used as CnC centers to direct the operations of a botnet used in distributed denial of service (DDoS) attacks and to install malware applications in the cloud. Hackers could devise and deploy malicious applications for:
- Allocating malicious resource instances.
- Resetting threshold levels to unrealistically high values.
- Changing the state of one application function to subsequent states of malicious functions.
Changing application behavior from in house to the cloud requires planning ahead so that you can remain proactive in making behavior changes after the applications are migrated. A team of developers, users, and business analysts needs to work together in setting threshold policies, overcoming browsers' inherent limitations, customizing failover mechanisms, and considering security issues in the cloud. The team will find that resolving these issues makes the task of staying proactive while making behavioral changes a much easier one.