February 27, 2014 | Written by: Indrajit Bhattacharya
Share this post:
I was in a session at IBM Pulse 2014 where Phil Jackson (@underscorephil) from SoftLayer talked about bottlenecks in IT infrastructure. This is a very interesting topic, especially for IT infrastructure planners, most of whom build a percentage buffer into their capacity sizing, and not without reason.
The question: are all bottlenecks evil? Bottlenecks are certainly never good. But are all of them so evil that we need to throw money at preventing them, even when they occur rarely or only at certain peak times? There is no straight answer, but consider the following points:
• Criticality: How much impact does it have? If a page loads slowly, can users live with it without changing their perception of the application? Most web applications have some pages that load fast and others that load slowly, so that could be acceptable.
• Data: Will there be a data loss? In case of a bottleneck, would there be a data loss and what would be the impact of the data loss?
• Frequency: Is it occasional? A bottleneck occurring once in a while on a non-critical workload could be manageable.
• Component: Which piece of the application jigsaw does it impact? There are applications that are designed to withstand certain component failures or lag in performance.
• Recoverability: How quickly will it come back? Recovery can be automatic, such as a temporary jam in a network link that clears on its own, or it may require restarting a service or the operating system.
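One way to make this checklist concrete is to score each factor and compare the total against an investment threshold. The sketch below is purely illustrative: the class, the 1-to-5 scales, and the threshold are all hypothetical choices, not a real methodology.

```python
# Hypothetical scoring sketch for the five factors above.
# All names, scales, and the threshold are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Bottleneck:
    criticality: int       # 1 (cosmetic) .. 5 (blocks core workflows)
    data_loss_risk: int    # 1 (no loss)  .. 5 (permanent data loss)
    frequency: int         # 1 (rare)     .. 5 (constant)
    component_impact: int  # 1 (tolerant component) .. 5 (single point of failure)
    recovery_cost: int     # 1 (self-healing) .. 5 (manual OS restart)

def should_invest(b: Bottleneck, threshold: int = 15) -> bool:
    """Return True if the combined score suggests spending to remove it."""
    score = (b.criticality + b.data_loss_risk + b.frequency
             + b.component_impact + b.recovery_cost)
    return score >= threshold

# A slow non-critical page that recovers on its own scores low:
minor = Bottleneck(2, 1, 2, 1, 1)
print(should_invest(minor))  # False: likely cheaper to live with it
```

In practice you would weight the factors (data loss usually matters more than a slow page), but even a crude score makes the "is this bottleneck worth the money?" conversation explicit.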
Looking at these and other factors will enable you to reach an informed decision on how much to invest to prevent and address bottlenecks. There could very well be situations where bottlenecks are allowed, especially when the cost to prevent and address them far outweighs the loss (in terms of dollars and branding) or gain.
For those who are looking at addressing bottlenecks at the compute layer, it can be achieved by scaling applications. Workloads can be scaled horizontally or vertically depending on the nature of the workload running:
• Horizontal scaling: This is also called scale out. Workloads that have the ability to add more parallel instances and are able to spread the load across the new instances can be horizontally scaled. For example, this can be enabled for web servers where more instances can be added and load balanced dynamically.
• Vertical scaling: In many cases it may not be possible to add new instances without requiring multiple changes to the environment. A typical example could be database servers where it would probably be easier to scale up or have a larger instance to take on the additional workloads.
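The choice between the two approaches can be reduced to one question: can the workload spread load across parallel instances? The sketch below is a hypothetical illustration of that decision; the function and field names are made up for this example.

```python
# Illustrative sketch (hypothetical names): pick scale-out vs. scale-up
# based on whether a workload can run as parallel, load-balanced instances.

def plan_scaling(workload: str, parallelizable: bool,
                 instances: int, cpu_cores: int) -> dict:
    """Return a new target shape for the workload under increased load."""
    if parallelizable:
        # Horizontal (scale out): add an instance behind the load balancer.
        return {"workload": workload, "instances": instances + 1,
                "cpu_cores": cpu_cores}
    # Vertical (scale up): grow the single instance instead.
    return {"workload": workload, "instances": instances,
            "cpu_cores": cpu_cores * 2}

print(plan_scaling("web", parallelizable=True, instances=3, cpu_cores=4))
# {'workload': 'web', 'instances': 4, 'cpu_cores': 4}
print(plan_scaling("db", parallelizable=False, instances=1, cpu_cores=8))
# {'workload': 'db', 'instances': 1, 'cpu_cores': 16}
```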
While scaling workloads, take care to choose the right approach and to understand its implications; some scaling operations require a restart of services or even the operating system.
Cloud providers like SoftLayer provide APIs that can be used to automate the scaling of workloads based on application or business logic. The same can also be achieved on SoftLayer through third-party products like RightScale. And then there are other cloud providers that enable autoscaling as part of their core offering, whereby capacity is added or removed automatically based on conditions defined by the user.
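The API-driven approach boils down to a control loop: read a metric, compare it against thresholds, and call the provider's API to add or remove capacity. The sketch below shows that loop; `CloudClient` is a hypothetical stand-in for a real provider SDK (such as SoftLayer's API client), and the thresholds are assumptions you would tune for your workload.

```python
# Sketch of API-driven scaling. CloudClient and its methods are
# hypothetical stand-ins for a real provider SDK; thresholds are assumed.
class CloudClient:
    """Minimal fake client so the decision logic below is runnable."""
    def __init__(self, instances: int):
        self.instances = instances

    def add_instance(self) -> None:
        self.instances += 1          # real SDK: provision a new VM

    def remove_instance(self) -> None:
        self.instances = max(1, self.instances - 1)  # real SDK: cancel a VM

def autoscale(client: CloudClient, avg_cpu: float,
              scale_out_at: float = 0.8, scale_in_at: float = 0.3) -> str:
    """Apply one autoscaling decision based on average CPU utilization."""
    if avg_cpu > scale_out_at:
        client.add_instance()
        return "scaled out"
    if avg_cpu < scale_in_at and client.instances > 1:
        client.remove_instance()
        return "scaled in"
    return "no change"

fleet = CloudClient(instances=2)
print(autoscale(fleet, avg_cpu=0.92))  # scaled out (fleet grows to 3)
```

Running this logic yourself, rather than relying on a provider's built-in autoscaler, is exactly the control trade-off discussed below: you decide the metric, the thresholds, and the timing.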
Personally, I would like to have control and would prefer the API method that lets me choose when and how to do it. I am curious to know your views on this. Please comment below and let me know what you think!