Yes, Jon, there is a mainframe that can help replace 1500 x86 servers
Comments (4) Visits (19945)
After my response to Jon Toigo on Drunken Data with my post [Eleven answers about Deduplication], Jon follows up with a request to validate the numbers quoted in the February 26 Press Release[IBM launches New “System z10” Mainframe], particularly the estimate that a single mainframe can handle to the workload of 1500 x86 servers, in his post[A Bit More Blegging]. The timing is perfect in that IBM launched[the next wave of Project Big Green] today.To avoid sounding like an [editorial from the New York Sun],I checked the facts, and spoke to the person in IBM who did all the calculations. Jon, as always, you have my permission to publish this on your site if you want.
( I cannot take credit for coining the new term "bleg". I saw this term first used over on the [FreakonomicsBlog]. If you have not yet read the book "Freakonomics", I highly recommend it! The authors' blog is excellent as well.)
For this comparison, it is important to figure out how much workload a mainframe can support, how much an x86 can support, and then divide one from the other. Sounds simple enough, right? And what workload should you choose?IBM chose a business-oriented "data-intensive" workload using Oracle database. (If you wanted instead a scie
Moving Oracle workloads from x86 over to mainframe is quite common since [IBM and Cisco joined forces to meet Linux On Mainframe demand]. For more information on consolidating x86 servers running Oracle over to a mainframe, read the [Quick Reference], IBM Redbook titled ["Using Oracle Solutions on Linux for System z"], or this [presentation by Jim Elliott, IBM System z Specialist].
In a shop full of x86 servers, there are production servers, test and development servers, quality assurance servers, standby idle servers for high availability, and so on. On average, these are only 10 percent utilized.For example, consider the following mix of servers:
While [some might question, dispute or challenge this ten percent] estimate, it matches the logic used to justify VMware, XEN, Virtual Iron or other virtualization technologies. Running 10 to 20 "virtual servers" on a single physical x86 machine assumes a similar 5-10 percent utilization rate.
Note: The following paragraphs have been revised per comments received.
Keep in mind that each processor is different, and we now have Intel, AMD, SPARC, PA-RISC and POWER (and others); 32-bit versus 64-bit; dual-core and quad-core; and different co-processor chip sets to worry about. AMD Opteron processors come in different speeds, but we are comparing against the 2.8GHz, so 1500 times 6.866 times 2.8 is 28,337. Since these would be running as Linux guests under z/VM, we add an additional 7 percent overhead or 2,019 MIPS. We then subtract 15 percent for "smoothing", which is what happens when you consolidate workloads that have different peaks and valleys in workload, or 4,326 MIPS.The end is that we need a machine to do 26,530 MIPS. Thanks to advances in "Hypervisor" technological synergy between the z/VM operating system and the underlying z10 EC hardware, the mainframe can easily run 90 percent utilized when aggregating multiple workloads, so a 29,477 MIPS machine running at 90 percent utilization can handle these 26,530 MIPS.
N-way machines, from a little 2-way Sun Fire X2100 to the might 64-way z10 EC mainframe, are called "Symmetric Multiprocessors". All of the processors or cores are in play, but sometimes they have to take turns, wait for exclusive access on a shared resource, such as cache or the bus. When your car is stopped at a red light, you are waiting for your turn to use the shared "intersection". As a result, you don't get linear improvement, but rather you get diminishing returns. This is known generically as the "SMP effect", and in IBM documents this as [Large System Performance Reference].While a 1-way z10 EC can handle 920 MIPS, the 64-way can only handle30,657 MIPS. The 29,477 MIPS needed for the Sun x2100 workload can be handled by a 61-way, giving you three extra processors to handle unexpected peaks in workload.
But are 1500 Linux guest images architecturally possible? A long time ago, David Boyes of[Sine Nomine Associates] ran 41,400 Linux guest images on a single mainframe using his [Test Plan Charlie], and IBM internallywas able to get 98,000 images, and in both cases these were on machines less powerful than the z10 EC. Neitherof these were tests ran I/O intensive workloads, but extreme limits are always worth testing. The 1500-to-1 reduction in IBM's press release is edge
The z10 EC can handle up to 60 LPARs, and each LPAR can run z/VM which acts much like VMware in allowing multipleLinux guests per z/VM instance. For 1500 Linux guests, you could have 25 guests each on 60 z/VM LPARs, or 250 guests on each of six z/VM LPARs, or 750 guests on two LPARs. with z/VM 5.3, each LPAR can support up to 256GB of memory and 32 processors, so you need at least two LPAR to use all 64 engines. Also, there are good reasons to have different guests under different z/VM LPARs, such as separating development/test from production workloads. If you had to re-IPLa specific z/VM LPAR, it could be done without impacting the workloads on other LPARs.
To access storage, IBM offers N-port ID Virtualization (NPIV). Without NPIV, two Linux guest images could not access the same LUN through the same FCP port because this would confuse the Host Bus Adapter (HBA), which IBM calls "FICON Express" cards. For example, Linux guest 1 asks to read LUN 587 block 32 and this is sent out a specific port, to a switch, to a disk system. Meanwhile, Linux guest 2 asks to read LUN 587 block 49. The data comes back to the z10 EC with the data, gives it to the correct z/VM LPAR, but then what? How does z/VM know which of the many Linux guests to give the data to? Both touched the same LUN, so it is unclear which made the request. To solve this, NPIV assigns a virtual "World Wide Port Name" (WWPN), up to 256 of them per physical port, so you can have up to 256 Linux guests sharing the same physical HBA port to access the same LUN.If you had 250 guests on each of six z/VM LPARs, and each LPAR had its own set of HBA ports, then all 1500 guestscould access the same LUN.
Yes, the z10 EC machines support Sysplex. The concept is confusing, but "Sysplex" in IBM terminology just means that you can have LPARs either on the same machine or on separate mainframes, all sharing the same time source, whether this be a "Sysplex Timer" or by using the "Server Time Protocol" (STP). The z10 EC can have STP over 6 Gbps Infiniband over distance. If you wantedto have all 1500 Linux guests time stamp data identically, all six z/VM LPARs need access to the shared time source. This can help in a re-do or roll-back situation for Oracle databases to complete or back-out "Units of Work" transactions. This time stamp is also used to form consistency groups in "z/OS Global Mirror", formerly called "XRC" for Extended Remote Distance Copy. Currently, the "timestamp" on I/O applies only to z/OS and Linux and not other operating systems. (The time stamp is done through the CKD driver on Linux, and contributed back to the open source community so that it is available from both Novell SUSE and Red Hat distributions.)To have XRC have consistency between z/OS and Linux, the Linux guests would need to access native CKD volumes,rather than VM Minidisks or FCP-oriented LUNs.
Note: this is different than "Parallel Sysplex" which refers to having up to 32 z/OS images sharing a common "Coupling Facility" which acts as shared memory for applications. z/VM and Linux do not participate in"Parallel Sysplex".
As for the price, mainframes list for as little as "six figures" to as much as several million dollars, but I have no idea how much this particular model would cost. And, of course, this is just the hardware cost. I could not find the math for the $667 per server replacement you mentioned, so don't have details on that.You would need to purchase z/VM licenses, and possibly support contracts for Linux on System z to be fully comparable to all of the software license and support costs of the VMware, Solaris, Linux and/or Windows licenses you run on the x86 machines.
This is where a lot of the savings come from, as a lot of software is licensed "per processor" or "per core", and so software on 64 mainframe processors can be substantially less expensive than 1500 processors or 3000 cores.IBM does "eat its own cooking" in this case. IBM is consolidating 3900 one-
All of the components of DFSMS (including DFP, DFHSM, DFDSS and DFRMM) were merged into a single product "DFSMS for z/OS" and is now an included element in the base z/OS operating system. As a result of these, customers typically have 80 to 90 percent utilization on their mainframe disk. For the 1500 Linux guests, however, most of the DFSMS features of z/OS do not apply. These functions were not "ported over" to z/VM nor Linux on any platform.
Note: DFSMS can backup or dump Linux on System z partitions or volumes. See this [Appendix C. HOWTO backup Linux data through z/OS] for details.
Instead, the DFSMS concepts have been re-implemented into a new product called "Scale-Out File Services" (SOFS) which would provide NAS interfaces to a blended disk-and-tape environment. The SOFS disk can be kept at 90 percent utilization because policies can place data, movedata and even expire files, just like DFSMS does for z/OS data sets. SOFS supports standard NAS protocols such as CIFS, NFS, FTP and HTTP, and these could be access from the 1500 Linux guests over an Ethernet Network Interface Card (NIC), which IBM calls "OSA Express" cards.
Lastly, IBM z10 EC is not emulating x86 or x86-64 interfaces for any of these workloads. No doubt IBM and AMD could collaborate together to come up with an AMD Opteron emulator for the S/390 chipset, and load Windows 2003 right on top of it, but that would just result in all kinds of emulation overhead.Instead, Linux on System z guests can run comparable workloads. There are many Linux applications that are functionally equivalent or the same as their Windows counterparts. If you run Oracle on Windows, you could run Oracle on Linux. If you run MS Exchange on Windows, you could run Bynari on Linux and let all of your Outlook Express users not even know their Exchange server had been moved! Linux guest images can be application servers, web servers, database servers, network infrastructure servers, file servers, firewall, DNS, and so on. For nearly any business workload you can assign to an x86 server in a datacenter, there is likely an option for Linux on System z.
Hope this answers all of your questions, Jon. These were estimates based on basic assumptions. This is not to imply that IBM z10 EC and VMware are the only technologies that help in this area, you can certainly find virtualization on other systems and through other software.I have asked IBM to make public the "TCO framework" that sheds more light on this.As they say, "Your mileage may vary."
For more on this series, check out the following posts:
If in your travels, Jon, you run into someone interested to see how IBM could help consolidate rack-mounted servers over to a z10 EC mainframe, have them ask IBM for a "Scorpion study". That is the name of the assessment that evaluates a specific client situation, and can then recommend a more accurate estimate configuration.
technorati tags: Jon Toigo, DrunkenData, bleg, IBM, z10, EC, E64, mainframe, x86, AMD, Opteron, Sun, Fire, X2100, petaFLOP, Freakonomics, Red Hat, RHEL, IFL, VMware, Jim Elliott, Xen, Virtual Iron, Solaris, Linux, Windows, Project Big Green, Infiniband, STP, Sysplex, Scorpion study, MS Exchange, Bynari, Oracle