Solution performance has two key aspects. One is end-user response time, as discussed in the prior post; the other is capacity management, the discipline of predicting workloads and sizing systems so that response times stay reasonable for the number of users accessing them. Capacity management differs between on-premises environments and cloud solutions like LotusLive (SmartCloud) in one pivotal way.

In the on-premises world, each server is usually looked at in isolation. It serves a specific purpose (file server, database server, mail server, etc.) for a specific set of users with reasonably consistent work patterns. Single sign-on (SSO) systems notwithstanding, it is often sufficient to size and scale each service independently, and we typically determine or plan for a concurrency rate, the percentage of users active at any given instant, for each individual service.

In our LotusLive cloud environment, we need to look at the combined solution of all services for capacity management because of the integration points and dependencies between components. For example, every meeting user goes through the same authentication mechanism used by those accessing file sharing, communities, chat services, and more. And a meeting moderator may be spawning transactions to the Files service by sharing a file in a web conference, even though their end-user experience is "conducting a meeting". For that reason, we need to determine concurrency rates across services, not just individually for each service.

There is also a need to track resource consumption (CPU, I/O, memory) and analyze trends in real time in order to predict when additional virtual machines (VMs) need to be spun up. For example, internal operational data give us a pretty clear idea of the CPU consumption level at which service quality starts to deteriorate. That allows us to spin up an additional image before that level is reached, so we can serve a growing workload without degrading service quality.

The multi-tenant nature of the cloud solution means that the user base is constantly growing; it is not nearly as stable as in most on-premises environments. In on-premises environments, significant changes in user base and workload tend to come from events such as mergers and acquisitions. In the cloud, we live that change every day as new companies sign on to use the services and add potentially large numbers of users to regional data centers. Virtualization technology is what enables us to quickly provision as many servers as needed to handle rapid growth in workloads. And good capacity management keeps capacity ahead of the growing demand.
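To make the trend-based prediction a little more concrete, here is a minimal sketch in Python. It is an illustration only, not how LotusLive actually does it: the function names, the 70% threshold, and the three-interval lead time are all hypothetical, and a simple least-squares line stands in for whatever trend analysis a real operations team would use. The idea is to fit a line through recent CPU readings and estimate how many sampling intervals remain before the known "quality starts to slip" level is reached.

```python
from typing import Optional, Sequence, Tuple


def fit_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Fit a least-squares line through equally spaced samples.

    Returns (slope, intercept), where x is the sample index 0..n-1.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def intervals_until(samples: Sequence[float], threshold: float) -> Optional[float]:
    """Estimate how many sampling intervals remain until the trend hits threshold.

    Returns 0.0 if the fitted value is already at or above the threshold,
    and None if the trend is flat or falling (no crossing predicted).
    """
    slope, intercept = fit_trend(samples)
    current = intercept + slope * (len(samples) - 1)  # fitted value at latest sample
    if current >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - current) / slope


def should_provision(samples: Sequence[float], threshold: float,
                     lead_intervals: float) -> bool:
    """True if the threshold is predicted to be reached within the lead time.

    lead_intervals is a hypothetical stand-in for how long it takes to
    bring a new virtual machine into service.
    """
    eta = intervals_until(samples, threshold)
    return eta is not None and eta <= lead_intervals


# Hypothetical CPU utilization readings (percent), one per sampling interval.
cpu_readings = [40, 45, 50, 55, 60]
print(intervals_until(cpu_readings, threshold=70))
print(should_provision(cpu_readings, threshold=70, lead_intervals=3))
```

In a combined solution like ours, the same check would have to run against shared components such as authentication as well as the individual services, since one user's meeting can generate load in several places at once.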
PS: To sort the blog and display just the ‘Cloud Difference’ series, click on the “cloud_difference” tag below the title of any post in the series.