Scalability and web server size
Large system deployments require capacity planning. During this process, information is gathered about the workload that the deployed system must support. That information is then used as input to models that determine the CPU, network, and I/O capacity required to accommodate the planned workload. After that determination has been made, a second issue must be addressed: how should the required resources be partitioned?
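As a rough illustration of the sizing step, the sketch below estimates processor capacity for a hypothetical workload. The request rate, per-request CPU cost, and utilization target are assumed, illustrative values, not measurements from any particular system.

```python
# Rough capacity-planning arithmetic (illustrative numbers only).

peak_requests_per_second = 500    # assumed peak workload
cpu_seconds_per_request = 0.004   # assumed CPU cost of one request
target_utilization = 0.6          # keep headroom for spikes and failover

# CPU-seconds of work arriving each second, i.e. the number of fully
# busy processors the workload would consume on its own.
cpu_demand = peak_requests_per_second * cpu_seconds_per_request

# Processors required so that steady-state utilization stays at the target.
required_cpus = cpu_demand / target_utilization

print(f"CPU demand: {cpu_demand:.2f} processors")
print(f"Required capacity at {target_utilization:.0%} utilization: "
      f"{required_cpus:.2f} processors")
```

With these assumed numbers, the workload consumes the equivalent of two fully busy processors, so roughly three and a third processors of capacity are needed to hold utilization near the target.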
High-availability concerns typically steer deployments away from a single large server and toward smaller, more numerous servers. This direction is also consistent with an architecture that promotes software performance and scalability. Web-tier deployments in particular make better use of their computing resources when they are built from smaller servers: for most workloads, two single-CPU systems can sustain more throughput than one dual-CPU system.
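The throughput comparison can be made concrete with a simple scaling model. The sketch below assumes a dual-CPU server delivers less than twice the throughput of a single-CPU server because of SMP overhead; the 1.8 scaling factor and the baseline throughput are assumed values for illustration, not measurements.

```python
# Illustrative comparison: two single-CPU servers vs. one dual-CPU server.
# The SMP scaling factor is an assumption for illustration; real scaling
# depends on the workload and the hardware.

single_cpu_throughput = 1000   # requests/sec on one single-CPU server (assumed)
smp_scaling_factor = 1.8       # assumed throughput multiplier for a 2-CPU server

one_dual_cpu_server = single_cpu_throughput * smp_scaling_factor
two_single_cpu_servers = 2 * single_cpu_throughput

print(f"One dual-CPU server:    {one_dual_cpu_server:.0f} requests/sec")
print(f"Two single-CPU servers: {two_single_cpu_servers:.0f} requests/sec")
```

Under these assumptions, the two single-CPU servers deliver 2000 requests per second against 1800 for the dual-CPU server, before accounting for any load-balancing overhead in front of the pair.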
To eliminate single points of failure and to maximize the use of available capacity, build the web tier from multiple machines or partitions, each with a smaller share of the processor allocation, rather than one large machine or partition that holds the entire allocation.
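The availability benefit of partitioning can be seen with a short calculation. Assuming each web-tier server fails independently with the same probability during a given window (the 2% figure below is purely illustrative), the chance that the entire tier is unavailable drops sharply as the same capacity is spread across more servers.

```python
# Probability that the entire web tier is unavailable, assuming each
# server fails independently with the same probability (assumed value).

p_server_down = 0.02   # assumed probability that a single server is down

for servers in (1, 2, 4):
    p_tier_down = p_server_down ** servers
    print(f"{servers} server(s): P(entire tier down) = {p_tier_down:.6f}")
```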