by Jeff Scheel, IBM Linux on Power Chief Engineer
As promised, here is my first blog post on little endian, or "LE" as we call it. What better place to start than with a list of frequently asked questions (FAQs)? Hopefully, you'll find this helpful. Let me know if you have any questions I missed.
What is big endian and little endian, anyway?
In order to perform operations on data, computers routinely load and store bytes of data from and to memory, the network, and disk. This data management generally follows one of two schemes: little endian or big endian.
Imagine the number one hundred twenty three. When representing this number with numerals, we typically write it with the most significant digit first and the least significant digit last: 123. This is big endian. Mainframes and RISC architectures like POWER default to big endian when manipulating data.
Some microprocessor architectures store the digits representing one hundred twenty three in reverse – the least significant digit first and the most significant digit last: 321. This is little endian. x86 architectures use little endian when storing data.
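On real hardware, the unit that gets reordered is the byte rather than the decimal digit. As a minimal sketch of the same idea, the following Python snippet packs one 32-bit integer both ways using the standard struct module. The value 0x01020304 is chosen only because its four bytes are easy to tell apart.

```python
import struct

# A 32-bit integer whose four bytes are all different.
value = 0x01020304

# ">" requests big endian: most significant byte first.
big = struct.pack(">I", value)

# "<" requests little endian: least significant byte first.
little = struct.pack("<I", value)

print(big.hex())     # 01020304
print(little.hex())  # 04030201
```

The same number produces two different byte sequences in memory, on disk, or on the wire, and each is simply the other in reverse.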
Why do people care about what endian mode their platform runs?
Most users do not care which endian mode their platform is using. They simply care about what applications are supported by their Linux operating systems. For the most part, only application providers care about endianness. For example:
A software developer whose code manipulates data through pointer casting or bitfields would not be able to simply recompile an application to move it from one endian mode to another.
A user who stores large amounts of data on disk, or exchanges it among systems over network connections, without considering endian schemes risks application failures ranging from very subtle errors to complete failures.
A system accelerator programmer (GPU or FPGA) who needs to share memory with applications running on the system processor must share data in a predetermined endianness for the application to function correctly.
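To illustrate the data-exchange case, here is a small sketch of what happens when bytes written in one byte order are read back in the other. The helper names encode_record and decode_record are hypothetical and exist only for this example. The usual remedy is what the helpers do: fix the byte order explicitly in the file or wire format instead of relying on the native order of whichever machine happens to run the code.

```python
import struct

def encode_record(count: int) -> bytes:
    """Serialize a 32-bit unsigned count in an explicit big endian order."""
    return struct.pack(">I", count)

def decode_record(payload: bytes) -> int:
    """Deserialize using the same explicit byte order as encode_record."""
    return struct.unpack(">I", payload)[0]

payload = encode_record(123)

# Reader and writer agree on the byte order, so the value round-trips.
print(decode_record(payload))  # 123

# A reader that assumes little endian sees a very different number.
wrong = struct.unpack("<I", payload)[0]
print(wrong)  # 2063597568
```

Because the format string names the byte order, this code behaves the same on big endian and little endian hosts.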
Why is Linux on Power transitioning from big endian to little endian?
The Power architecture is bi-endian in that it supports accessing data in both little endian and big endian modes. Although Power already has Linux distributions and supporting applications that run in big endian mode, the Linux application ecosystem for x86 platforms is much larger and Linux on x86 uses little endian mode. Numerous clients, software partners, and IBM’s own software developers have told us that porting their software to Power becomes simpler if the Linux environment on Power supports little endian mode, more closely matching the environment provided by Linux on x86. This new level of support will lower the barrier to entry for porting Linux on x86 software to Linux on Power.
Which Linux distributions will support little endian on Power?
So far, only Canonical’s Ubuntu Server 14.04 distribution supports little endian on Power. Plans are underway in the community distributions of Debian and openSUSE for little endian releases.
Additionally, SUSE has stated publicly that SLES 12 will be little endian when it becomes available. See SUSE Conversations for more information.
Red Hat has not yet publicly disclosed their plans around a little endian operating system. However, work to create a ppc64le architecture has started in Fedora.
Which Linux distributions will support big endian on Power?
It is IBM's understanding that Red Hat and SUSE will continue to support their existing big endian releases on Power for their full product lifecycles.
While SUSE has announced their plans to transition their distribution to little endian (see above), Red Hat has not disclosed anything. The newly available Red Hat Enterprise Linux 7 operates in big endian mode on Power. Specifics about the transition to little endian will be decided and disclosed by Red Hat.
What about Linux applications that have already been optimized for big endian on Power?
The existing PowerLinux application portfolio supports only big endian mode today. Open source applications have begun extending their support to little endian mode on Power Systems. Existing third party and IBM applications will likely migrate more slowly and deliberately. As such, Power hardware will support both endian modes for the foreseeable future so that existing Linux applications optimized for a big endian platform will continue to run unchanged while new applications optimized for little endian mode are added.
Can applications compiled for x86 (Windows or Linux) run without change on little endian Power?
Because the x86 and Power processors use different instruction set architectures (ISAs) – the binary executable known to the processor – compiled applications will need at least a recompile on the new platform. Whether source code changes are required depends on how many optimizations have been made in the application source – such as the use of assembler language and any assumptions about page size or cache line size, etc.
However, interpreted applications such as those written in Java, Perl, Python, PHP, Ruby and others should be capable of migrating with little to no change.
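As a small illustration of the page size assumption mentioned above, the sketch below asks the platform for its page size instead of hard-coding 4096. The helper round_up_to_page is a hypothetical name used only for this example. mmap.PAGESIZE is part of the Python standard library.

```python
import mmap

# The page size reported by the running system. Code that hard-codes 4096
# here silently bakes in an x86 assumption.
PAGE_SIZE = mmap.PAGESIZE

def round_up_to_page(nbytes: int, page_size: int = PAGE_SIZE) -> int:
    """Round a byte count up to the next multiple of the page size."""
    return -(-nbytes // page_size) * page_size

print(PAGE_SIZE)
print(round_up_to_page(100_000, 4096))       # 102400 with 4K pages
print(round_up_to_page(100_000, 64 * 1024))  # 131072 with 64K pages
```

The same request rounds to different sizes depending on the page size, which is why buffer and alignment logic should query the value at run time.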
Does this transition affect application ecosystems for AIX or IBM i?
No, there will be no effect on AIX or IBM i application environments as a result of this change.
What if I want to run a mix of big endian and little endian applications on the same Power System?
Each Linux distribution will support a particular endian mode, little or big. Applications always certify to specific distributions. As such, endian mode decisions should be transparent to the end user. Customers should not have to consider endianness in their application choice.
If one requires different Linux distributions or the same distribution at different releases on a single server, then Power Systems virtualization (LPARs or VMs) allows customers to run applications supported by a big endian Linux distribution like RHEL 6 as well as applications supported by a little endian distribution like Canonical’s Ubuntu Server at the same time. However, concurrent little endian and big endian support on the same server will not be available until a future date. See more details in the questions below.
Which POWER processors support little endian mode?
The POWER8 processor is the first processor to support little endian and big endian modes equivalently. Although previous generations of the POWER processors had basic little endian functionality, they did not fully implement the necessary instructions in such a way as to enable enterprise operating system offerings.
Where can little endian distributions run on Power?
When IBM announced POWER8 in April 2014, little endian (LE) operating systems were initially supported as KVM guests. Further, KVM support was limited to either all LE or all big endian (BE) guests. In coming releases, IBM expects to support concurrent LE and BE guests in KVM, as well as LE guests on PowerVM.
Do POWER systems support the running of mixed environments of big and little endian operating systems?
The POWER8 processor supports mixing of big and little endian memory accesses at the core level, through the use of SPR (special purpose register) settings. While this could technically support the running of both big and little endian software threads, the complexity of implementing such a design point would be high. Therefore, IBM has elected to enable operating system versions as completely big endian or little endian by design.
The virtualization capabilities of the POWER platform have allowed for mixed environments of operating system levels and types. This same isolation mechanism applies to big and little endian operating systems. However, in implementing the initial releases of little endian, IBM has introduced some short-term limitations on where LE operating systems can run. Over time, these will be removed and both KVM and PowerVM will support concurrent mixing of LE and BE operating systems.
See the previous question for more information.
Does PowerVM support little endian operating systems?
While the POWER8 systems support little endian (LE) mode, IBM has not yet completed the software development and testing to enable LE operating systems on PowerVM. The outlook is that this function will be delivered around mid-2015. When this capability is delivered, PowerVM will support the mixing of both big endian (BE) and LE operating systems. It will also allow LE operating systems to run on the Power Integrated Facilities for Linux (IFLs).
Does PowerKVM support mixing of little endian and big endian operating systems?
Testing has not yet been completed to enable the mixing of little endian (LE) and big endian (BE) guests for KVM. Until it is, IBM supports guests of the same type – all LE or all BE.
IBM hopes to support mixing of guest types around mid-2015.
Can I run big endian applications on a little endian operating system or vice versa?
No, the operating system enablement only supports applications of the same type. As such, a little endian operating system (ppc64le or ppc64el) can only run little endian applications built for this software platform. Likewise, big endian operating systems (ppc64) only support software built for big endian.
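For anyone unsure which kind of environment they are on, a quick check is sketched below using only the Python standard library. The helper describe_platform is a hypothetical name. The exact machine string is reported by the operating system, so treat the values mentioned in the comments as what one would typically expect, not as a guarantee.

```python
import platform
import sys

def describe_platform() -> str:
    """Return the machine type and the byte order of the running interpreter."""
    # platform.machine() passes through what the OS reports, for example
    # "ppc64" on a big endian Power distribution or "ppc64le" on a little
    # endian one. sys.byteorder is always "big" or "little".
    return f"{platform.machine()} ({sys.byteorder} endian)"

print(describe_platform())
```

An application package built for one of these environments should only be installed where both the machine type and the byte order match.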
January 23, 2015 - Author's update
A couple of noteworthy activities have occurred since this blog was originally published.
A little endian (LE) version of RHEL 7.1 has been released in beta form. This announcement indicates that RHEL 7 updates will have both the existing big endian (BE) offering and a new LE offering. For more information about the beta, see the RHEL 7.1 beta announcement information. This means that all three Linux on Power distribution partners -- SUSE, Canonical, and now Red Hat -- have LE operating systems.
IBM PowerKVM now supports the mixture of BE and LE guests beginning with the 2.1.1 update in October 2014. This was a subtle change that is hard to find in documentation.
Support for LE operating systems on PowerVM continues to make progress toward a delivery sooner rather than later this year. When this is delivered, the mixing of BE and LE logical partitions will be supported.
Additionally, the following question keeps being asked and needs its own FAQ:
Can I run x86 Linux applications on LE Linux on Power operating systems unchanged?
If your application was written in a dynamic language, it is highly portable and often migrates to BE and LE Linux on Power operating system environments without change. Examples include applications written in Java, php, perl, python, etc.
If your application was written in a compiled language like C/C++, it must be recompiled on Power in both the BE and LE operating systems. Applications migrating from x86 Linux onto an LE Linux operating system on Power will migrate without concern for data layout (endianness). Applications migrating onto BE operating systems need to be reviewed for consistent data access, especially if they will share data using disk or networking with LE systems.
Modified on by jhopper
This article discusses zswap and provides some initial performance data to demonstrate the potential benefits for a system (partition or guest) which has constrained memory and is beginning to swap memory pages to disk. The technique improves the throughput of a system, while significantly reducing the disk I/O activity normally associated with page swapping. We also explore how zswap works in conjunction with the new compression accelerator feature of the POWER7+ processor to potentially improve the system throughput even more than software compression alone.
This article is a good example of the ongoing collaboration that occurs in the Linux open-source community. New implementations are proposed, discussed, debated, refined and updated across developers, community members, interested customers, and performance teams. Here on the PowerLinux technical community, we are working to highlight more of these examples of work-in-progress from the broader Linux community. These proposals are applicable to both x86 systems and Power systems, so examples shown below cover both realms.
What is zswap?
Zswap is a new lightweight backend framework that takes pages that are in the process of being swapped out and attempts to compress them and store them in a RAM-based memory pool. Aside from a small reserved portion intended for very low-memory situations, this zswap pool is not pre-allocated. It grows on demand, and the maximum size is user-configurable. Zswap leverages an existing frontend already in mainline called frontswap. The zswap/frontswap process intercepts the normal swap path before the page is actually swapped out, so the existing swap page selection algorithms are unchanged. Zswap also introduces key functionality that automatically evicts pages from the zswap pool to a swap device when the zswap pool is full. This prevents stale pages from filling up the pool.
The zswap patches have been submitted to the Linux Kernel Mailing List (lkml) for review; you can view them in this post.
Instructions for building a zswap-enabled kernel on a system installed with Fedora 17 can be found on this wiki.
What are the benefits?
When a page is compressed and stored in a RAM-based memory pool instead of actually being swapped out to a swap device, this results in a significant I/O reduction and in some cases can significantly improve workload performance. The same is true when a page is "swapped back in" - retrieving the desired page from the in-memory zswap pool and decompressing it can result in performance improvements and I/O reductions compared to actually retrieving the page from a swap device.
Using the SPECjbb2005 workload for our engineering tests, we gathered some performance data to show the benefits of zswap. SPECjbb2005 uses a Java™ benchmark that evaluates server performance and calculates a throughput metric called "bops" (business operations per second). To find out more about this benchmark or see the latest official results, see the SPEC web site. Note that the following results are not tuned for optimal performance and should not be considered official benchmark results for the system, but rather results obtained for research purposes. We liked this benchmark for this use case because we could more carefully control the amount of active memory being used in increments.
The SPECjbb2005 workload ramps up a specified number of "warehouses", or units of stored data, during the run. The number of warehouses is a user-controlled setting that is configured depending on the number of threads available to the JVM. As the benchmark increases the number of warehouses throughout the run, the system utilization level increases. A bops score is reported for each warehouse run. For this work, we focused on the bops score from the warehouse that keeps the system about 50% utilized. We also increased the default runtime for each warehouse to 5 minutes since swapping can be bursty and a longer runtime helps to achieve more consistent results.
For these results, the system was assigned 2 cores, 10 GB of memory, and a 20 GB swap device. A single JVM was created for the SPECjbb2005 runs, using IBM Java. First, a baseline measurement was taken where normal swapping activity occurred, then a run with zswap enabled was measured to show the benefits of zswap. We gathered results on both a Power7+ system and an x86 system to observe the performance impacts on different architecture types. The mpstat, vmstat, and iostat profilers from the sysstat package were used to record CPU utilization, memory usage, and I/O statistics. We would recommend taking advantage of the lpcpu package to gather these data points.
To demonstrate the performance effects of swapping and compression, we started with a JVM heap size that could be covered by available memory, and then increased the JVM heap size in increments until we were well beyond the amount of free memory, which forced swapping and/or compression to occur. We recorded the throughput metric and swap rate at each data point to measure the impacts as the workload demanded more and more pages.
Setting up zswap
With the current implementation, zswap is enabled by this kernel boot parameter:
We looked at several new in-kernel stats to determine the characteristics of compression during the run. The metrics used were as follows:
pool_pages - number of pages backing the compressed memory pool
reject_compress_poor - rejected pages due to poor compression policy (cumulative) (see max_compressed_page_size sysfs attribute)
reject_zsmalloc_fail - rejected pages due to zsmalloc failure (cumulative)
reject_kmemcache_fail - rejected pages due to kmem failure (cumulative)
reject_tmppage_fail - rejected pages due to tmppage failure (cumulative)
reject_flush_attempted - reject flush attempted (cumulative)
reject_flush_fail - reject flush failed (cumulative)
stored_pages - number of compressed pages stored in zswap
outstanding_flushes - the number of pages queued to be written back
flushed_pages - the number of pages written back from zswap to the swap device (cumulative)
saved_by_flush - the number of stores that succeeded after an initial failure due to reclaim by flushing pages to the swap device
pool_limit_hit - the zswap pool limit has been reached
failed_stores - how many store attempts have failed (cumulative)
loads - how many loads were attempted (all should succeed) (cumulative)
succ_stores - how many store attempts have succeeded (cumulative)
invalidates - how many invalidates were attempted (cumulative)
There are two user-configurable zswap attributes:
max_pool_percent - the maximum percentage of memory that the compressed pool can occupy
max_compressed_page_size - the maximum size of an acceptable compressed page. Any pages that do not compress to be less than or equal to this size will be rejected (i.e., sent to the actual swap device)
To observe performance and swapping behavior once the zswap pool becomes full, we set the max_pool_percent parameter to 20 - this means that zswap can use up to 20% of the 10GB of total memory.
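The counters and attributes described above can be collected with a short script. The following is a minimal sketch, not the tooling we used for these runs. The two directory paths are assumptions (the text here does not say where the files live, and the location may differ between patch versions and kernels), and read_counters and pool_limit_bytes are hypothetical helper names.

```python
from pathlib import Path
from typing import Dict

# Assumed locations, labeled as such: adjust for the kernel being tested.
STATS_DIR = "/sys/kernel/debug/zswap"
PARAMS_DIR = "/sys/module/zswap/parameters"

def read_counters(directory: str) -> Dict[str, int]:
    """Read every file in a directory that contains a single integer.

    Returns an empty dict if the directory is missing or unreadable.
    """
    try:
        entries = sorted(Path(directory).iterdir())
    except OSError:
        return {}
    counters: Dict[str, int] = {}
    for entry in entries:
        try:
            counters[entry.name] = int(entry.read_text().strip())
        except (OSError, ValueError):
            # Skip subdirectories, unreadable files, and non-integer content.
            continue
    return counters

def pool_limit_bytes(total_memory_bytes: int, max_pool_percent: int) -> int:
    """Upper bound on the compressed pool size implied by max_pool_percent."""
    return total_memory_bytes * max_pool_percent // 100

if __name__ == "__main__":
    for name, value in read_counters(STATS_DIR).items():
        print(f"{name}: {value}")
    # 20% of 10 GiB, the setting used for the software compression runs.
    print(pool_limit_bytes(10 * 2**30, 20))  # 2147483648
```

Sampling the cumulative counters before and after a run and taking the difference gives the per-run values.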
The following graphs represent the SPECjbb2005 performance and swap rate for a run using the normal swapping mechanism.
Note that as "available" memory is used up around 10GB, the performance falls off very quickly (the Blue Line) and normal page swapping (the Red Line) to disk increases. The behavior is consistent both on Power7+ and x86 systems.
Power7+ baseline results:
x86 baseline results:
As you can see, performance dramatically decreased once the system started swapping and continued to level off as the JVM heap was increased.
The following graphs represent the SPECjbb2005 performance and swap rate for a run when zswap is enabled. In these cases, memory is now being compressed, which significantly reduces the need to go to disk for swapped pages. Performance of the workload (the blue line) still drops off, though not as sharply. More importantly, the system load on I/O drops dramatically.
Power7+ with zswap compression:
x86 with zswap compression:
As you can see, the swap (I/O) rate was dramatically reduced. This is because most pages were compressed and stored in the zswap pool instead of swapped to disk, and taken from the zswap pool and decompressed instead of swapped in from disk when the page was requested again. The small amount of "real" swapping that occurred is due to the fact that some pages compressed poorly - which means they did not meet a user-defined max compressed page size - and were therefore swapped out to the disk, and/or stale pages were evicted from the zswap pool.
Looking at the zswap metrics for each run, we can calculate some interesting statistics from this set of runs - keep in mind the base page size is different between Power (64K pages) and x86 (4K pages), which accounts for some of the different behaviour. Also note that we set the max zswap pool size to 20% of total memory for these runs, as mentioned above - this max setting can be adjusted as needed. On Power, the average zswap compression ratio was 4.3. On x86, the average zswap compression ratio was 3.6. For the Power runs, we saw entries for "pool_limit_hit" starting at the 17 GB data point. For the x86 runs, the pool limit was hit earlier - starting at the 15.5 GB data point. For the Power runs, at most the zswap pool stored 139,759 pages. For the x86 runs, the max number of stored pages was 1,914,720. This means all those pages were compressed and stored in the zswap pool, rather than being swapped out to disk, which results in the performance improvements seen here.
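As a rough consistency check on these numbers, the sketch below estimates how much memory the stored pages would occupy once compressed. It is an approximation under stated assumptions: it applies the average compression ratio to the peak stored-page count, uses binary units, and ignores allocator overhead. The function name compressed_footprint is hypothetical.

```python
GIB = 2**30

def compressed_footprint(stored_pages: int, page_size: int, ratio: float) -> float:
    """Estimate the bytes occupied by stored pages after compression."""
    return stored_pages * page_size / ratio

# Peak stored pages, base page size, and average compression ratio from the text.
power_uncompressed = 139_759 * 64 * 1024
x86_uncompressed = 1_914_720 * 4 * 1024

power_pool = compressed_footprint(139_759, 64 * 1024, 4.3)
x86_pool = compressed_footprint(1_914_720, 4 * 1024, 3.6)

print(f"Power: {power_uncompressed / GIB:.2f} GiB held in {power_pool / GIB:.2f} GiB")
print(f"x86:   {x86_uncompressed / GIB:.2f} GiB held in {x86_pool / GIB:.2f} GiB")
```

Both estimates land close to 2 GiB, which matches a pool limit of 20% of 10 GB and is consistent with the pool_limit_hit entries seen in both sets of runs.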
POWER7+ hardware acceleration
The POWER7+ processor introduces new onboard hardware assist accelerators that offer memory compression and decompression capabilities, which can provide significant performance advantages over software compression. As an example, the system specifications for the IBM Flex System p260 and p460 Compute Nodes mention the "Memory Expansion acceleration" feature of the processor.
The current zswap implementation is designed to work with these hardware accelerators when they are available, allowing for either software compression or hardware compression. When a user enables zswap and the hardware accelerator, zswap simply passes the pages to be compressed or decompressed off to the accelerator instead of performing the work in software. Here we demonstrate the performance advantages that can result from leveraging the POWER7+ on-chip memory compression accelerator.
POWER7+ hardware compression results
Because the hardware accelerator speeds up compression, looking at the zswap metrics we observed that there were more store and load requests in a given amount of time, which filled up the zswap pool faster than a software compression run. Because of this behavior, we set the max_pool_percent parameter to 30 for the hardware compression runs - this means that zswap can use up to 30% of the 10GB of total memory.
The following graph represents the SPECjbb2005 performance and swap rate for a run when zswap and the POWER7+ hardware accelerator are enabled. In this case, memory is now being compressed in hardware instead of software, and this results in a significant performance improvement. Performance of the workload (the blue line) still drops off, but even less sharply than the zswap software compression case, and the system load on I/O still remains very low.
Power7+ hardware compression:
As you can see, the swap (I/O) rate was dramatically reduced. This is because most pages were compressed using the hardware accelerator and stored in the zswap pool instead of swapped to disk, and taken from the zswap pool and decompressed in the hardware accelerator instead of swapped in from disk when the page was requested again. The small amount of "real" swapping that occurred is due to the fact that some pages compressed poorly - which means they did not meet a user-defined max compressed page size - and were therefore swapped out to the disk, and/or stale pages were evicted from the zswap pool.
The following graphs show the performance comparison between normal swapping and zswap compression, and the POWER7+ graph also includes the hardware compression results, showing that the hardware accelerator provides even more performance advantages over software compression alone:
Power7+ performance comparison:
x86 performance comparison:
As you can see, this workload shows up to a 40% performance improvement in some cases after the heap size exceeds available memory when zswap is enabled, and the POWER7+ results show that the hardware accelerator can improve the performance by up to 60% in some cases compared to the baseline performance.
Swap (I/O) comparison
The following graphs show the swap rate comparison between normal swapping and zswap compression, and the POWER7+ graph also includes the hardware compression results. Swap rates are dramatically reduced on both architectures when zswap is enabled, and the POWER7+ hardware accelerator keeps them just as low.
Power7+ swap I/O comparison:
x86 swap I/O comparison:
The new zswap implementation can improve performance while reducing swap I/O, which can also have positive effects on other partitions that share the same I/O bus. The new POWER7+ on-chip memory compression accelerator can be leveraged to provide performance improvements while still keeping swap I/O very low.
By Jeff Scheel
As you likely have heard, Arvind Krishna, IBM General Manager for Development and Manufacturing in the IBM Systems & Technology Group, announced that Power Systems would be supporting KVM. This is an exciting announcement for numerous reasons that I'll defer for another posting. For this blog entry, I thought I'd do a question and answer session based on common questions I've been asked in the past couple of weeks. However, before I do so, I need to remind you that these are our current thoughts at this time: things may change.
Q: When will KVM be available on Power?
A: The outlook for general availability is next year. However, IBM has already started releasing patches to various KVM communities to support the POWER platform.
Q: On what systems does IBM intend to support KVM?
A: IBM intends to initially support KVM on a limited set of models, targeted at the entry end of the system servers. This strategy supports IBM's efforts to capture the largest growing market, x86 Linux servers in the 2-socket and smaller space.
Q: How does IBM plan to position KVM against PowerVM?
A: IBM remains committed to PowerVM being the premier enterprise virtualization software in the industry. With KVM on Power, IBM will be targeting x86 customers on entry servers but will offer both KVM and PowerVM to meet the varying virtualization needs of PowerLinux customers. However, KVM virtualization technology represents an opportunity to simplify customers' virtualization infrastructure with a single hypervisor and management software across multiple platforms.
Q: What Linux versions from Red Hat and SUSE will provide KVM hosts support on Power?
A: The decision to provide KVM on PowerLinux will be made by Red Hat and SUSE. IBM will be working with them in the months to come and would welcome their support.
Q: What management and cloud software will support KVM on Power?
A: For KVM node management, IBM intends to work with multiple vendors, including Red Hat and SUSE, to certify KVM on Power into their system management software offerings. Additionally, IBM plans to contribute any patches necessary to OpenStack to extend the KVM driver to Power. Using this foundation, additional IBM and third-party software should provide a diverse set of management software.
Q: What will software providers need to do to support KVM on Power?
A: Most software providers have become comfortable with some form of virtualization such as PowerVM, VMware, and KVM. Just like with applications on Linux, software providers should find that applications in the KVM environment behave similarly on x86 and Power platforms. As such, each vendor should already understand the challenges, if any, that KVM on Power would present.
Q: What operating systems will be supported as guests in KVM on Power?
A: Given that KVM is initially targeted to be released on Linux-only servers, only Linux is planned at this time. IBM plans to certify the latest updates of RHEL 6 and SLES 11 as KVM guests.
Q: How will KVM run on the Power Systems?
A: The design goal of KVM on Power is to be just another hardware platform supporting KVM. As such, KVM on Power will be true to the KVM design point of a KVM host image that supports one or more guests. PowerVM constructs such as the HMC, IVM, and VIOS will not exist in KVM. Management and virtualization will occur through the KVM host image.
Q: Will KVM run in a PowerVM logical partition (LPAR)?
A: While KVM supports a user-mode virtualization that can run on any Linux operating system, KVM on Power is being developed to run natively on the system, not nested in PowerVM. This is done to enable KVM to run optimally using the POWER processor Hypervisor Mode. As such, the system will make a decision very early in the boot process to run KVM or PowerVM. This is envisioned as a selectable option managed by the Service Processor (FSP).
Q: Will it be possible to migrate from KVM on Power to PowerVM or vice versa?
A: While the virtualization mode will be selectable on systems, the process of migrating between KVM and PowerVM will require additional steps such that frequent migrations will be unlikely. However, in the case where a customer wishes to upgrade to PowerVM to acquire advanced virtualization capabilities, this migration should be supported. Steps to back up and restore the VM image will be required when migrating in either direction.
Q: Will AIX or IBM i run in KVM on Power?
A: Given that KVM initially runs on Linux-only platforms, support for non-Linux operating systems has not been planned at this time.
Q: Will Windows run in KVM on Power?
A: Windows does not run on Power Systems. As such, supporting it in a KVM guest VM will not work.
Hopefully, these questions were helpful to folks. As usual, follow-up questions and comments are appreciated.
By: Anirban Chatterjee.
This month, the PowerLinux team is announcing the biggest technology change in PowerLinux servers since we launched, with the availability of our POWER7+ chips on the platform.
POWER7+ is more than just a speed bump on our POWER7 processors. Our hardware teams have worked hard to increase the flexibility of the platform, bringing balanced performance increases while keeping other factors like energy consumption at bay. Some examples:
- We’ve doubled the memory capacity in servers like the 7R1 and 7R2. We’ve also doubled the number of virtual machines you can allocate to a single processor core. This means we’ve dramatically increased the system’s flexibility when it comes to deploying virtualized workloads … in many cases, this will eliminate memory as the gating factor, allowing users to drive utilization rates even higher and boost system efficiency.
- We’ve reduced the feature size in the chips from 45 nm to 32 nm. This is not just a simple die shrink, though … with every shrink, the chip team has to work even harder to ensure the computational and thermal stability of the chip while driving higher clock speeds. In PowerLinux servers like the 7R2, the new chips now top out at 4.2 GHz.
- Because we have more available chip real estate now that we’ve shrunk the die size, we’ve bumped up the L3 cache from 4 MB to 10 MB. This significantly boosts performance in workloads that are memory dependent, like Java and big data applications.
- Feature additions to POWER7+ allow us to improve chip reliability and boost energy savings. We’ve added self-healing capabilities and automatic processor reinitialization to increase system robustness, and we’ve introduced a new energy saving mode that saves 45% more energy than before when the processor is idle.
The new performance capabilities afforded by POWER7+ enable some pretty interesting possibilities when it comes to reducing costs. For example, we’ve found that people typically need just two dual-socket (16 core) PowerLinux 7R2s to do what it would take three dual-socket (16 core) Xeon servers to do. Given the already competitive pricing on the 7R2s, this means that you can potentially save north of 40% on your costs of acquisition by choosing PowerLinux.
These changes make PowerLinux an ideal platform for the most critical workloads your business runs today, like your customer facing web applications, or your ERP system. Customers like Kwik Fit (PDF) and IT Informatik (PDF) are already realizing the benefits. Click the links to read the case studies on these customers.
But PowerLinux is also a great platform for today’s growth workloads, like development and deployment of mobile and web applications. To make it easier for businesses to create and launch these types of client experiences, we’re introducing a new solution for WebSphere mobile and web applications that leverages the lightweight WebSphere Liberty Profile software. This is a light, easily reconfigured web app environment that makes it simple for developers to test and deploy applications.
As 2013 progresses, we'll continue to bring you more
announcements that improve the PowerLinux platform's ability to reduce costs while improving
efficiency, enabling new and growth workloads, and giving you a better overall
Modified on by jerberstark
By: Breno Leitão.
This tutorial explains how to create a RAID device on PowerLinux machines using an array of disks. This step-by-step tutorial includes identifying the disks, formatting them, combining them in a RAID array, creating a partition and, finally, creating a file system on this partition.
The PowerLinux machines support a RAID (Redundant Array of Independent Disks) card. A RAID card is a device that combines a set of physical disks into a logical unit to achieve better performance and more data redundancy. A RAID array can also be created by the operating system (known as software-based RAID), but that consumes some of the machine's CPU cycles to manage and control the disk array. A RAID card, such as the one embedded in PowerLinux machines, instead offers hardware-based RAID: operations on the disk array are offloaded to the card rather than spending CPU cycles on managing the disks, which makes it more efficient than software-based RAID solutions.
The RAID adapters on PowerLinux machines support several different RAID protection levels. Depending on the protection level, you get different benefits, such as potentially higher data transfer rates, lower latency, and data redundancy when compared to a single big disk. You might also want to combine these benefits in the same disk array, which is also possible depending on the RAID protection level.
Using RAID is usually a trade-off between disk space and redundancy: depending on the RAID protection level, part of the disk space is used to store redundant data and is therefore not available for general use. The space actually available to users varies from 50% to 100% of the total disk space.
The RAID protection levels supported by most PowerLinux RAID adapters are:
RAID 0: In this configuration, a block of data is striped across the disks in the array, so read/write operations can happen in parallel on those disks. There is no fault tolerance, i.e., if a disk fails, all the data is lost. This level usually improves data throughput.
Requires at least 1 disk. In a single disk RAID 0 configuration, no striping occurs.
RAID 1: On this level, the data is written to 2 disks at the same time. As both disks hold the same data, a read operation is served by the disk that has the lower latency. If one disk fails, all the data is still preserved on the other disk. Once the failed disk is replaced, the RAID is rebuilt. As expected, this level improves data redundancy.
Requires at least 2 disks. Note: The PowerLinux RAID adapters refer to this RAID level as RAID 10.
RAID 5: On this level, the data and parity information are spread across all the disks. If one disk fails, all the data is still available, since the original data can be reconstructed using the parity data from the other disks. If more than one disk fails, the data is lost. (After a disk failure, operations may be slower, since any data that was on the lost disk has to be reconstructed when it is accessed.) One disk's worth of capacity is consumed for redundancy information for the array.
Requires at least 3 disks.
RAID 6: The same as RAID 5, but up to 2 disks can fail and the data will still be preserved. Two disks' worth of capacity is consumed for redundancy information for the array.
Requires at least 4 disks.
RAID 10: This level combines the best concepts of RAID 1 and RAID 0. On RAID 10, the data is striped across a set of hard disks, and these hard disks are mirrored to another set of hard disks. So, you get very good throughput and also data redundancy.
Requires at least 2 disks for mirroring and striping. In a two disk RAID 10 configuration, no striping occurs.
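The capacity rules in the list above can be summarized in a small sketch. It assumes all disks in the array are the same size and that RAID 10 uses an even number of disks; the function name and the error handling are my own choices for illustration, not something the adapter or iprconfig exposes.

```python
# Minimum number of disks per RAID level, as listed above.
MIN_DISKS = {0: 1, 5: 3, 6: 4, 10: 2}


def usable_disks(level: int, disks: int) -> int:
    """Number of disks' worth of capacity left for user data."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} requires at least {MIN_DISKS[level]} disks")
    if level == 0:
        return disks          # striping only, no redundancy
    if level == 5:
        return disks - 1      # one disk's worth of parity
    if level == 6:
        return disks - 2      # two disks' worth of parity
    # RAID 10: every disk has a mirror, so half the capacity is usable.
    if disks % 2:
        raise ValueError("this sketch assumes an even number of disks for RAID 10")
    return disks // 2


for level in (0, 5, 6, 10):
    n = usable_disks(level, 6)
    print(f"RAID {level:>2} with 6 disks: {n} usable ({n / 6:.0%})")
```

With six disks, the usable share runs from 50% (RAID 10) to 100% (RAID 0), which matches the range quoted earlier.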
RAID card on PowerLinux
The PowerLinux machines come with an embedded RAID controller that supports up to 6 SAS or SSD disks and RAID levels 0, 5, 6 and 10 on 7R1 machines. RAID 1 is also supported as a subset of RAID 10, since the RAID controller allows you to create a RAID 10 with just two disks in the array. In this case, since there are not enough disks to both mirror and stripe, the data just gets mirrored on both disks instead of striped, which is what RAID 1 does. So, the card also supports RAID 1, in a different form.
On Linux, the device is listed as a PCI-E device named "IBM Obsidian-E PCI-E SCSI controller".
In order to manage this controller, there is a set of tools in the package called iprutils that helps the system administrator create, configure and delete disks and arrays using the RAID controller.
The iprutils package provides the iprconfig application. iprconfig is the tool responsible for configuring the RAID devices on your machine, and it is the tool covered below.
The device driver
The device driver for the PowerLinux RAID controller is named ipr.ko. It's currently part of the Linux kernel and comes with all the supported Linux distros for PowerLinux. So, it's recommended to always use the latest supported kernel version from the distro in order to get the most from your PowerLinux machine.
Using iprconfig tool
The iprconfig tool is a very easy application to use. It's a text-based (TUI) application that helps you list and configure the RAID controller and the disks on your system. iprconfig also allows you to check the controller log and upgrade the card firmware. An example of the iprconfig screen can be seen in Figure 1.
From here on, a step-by-step tutorial shows how to format a set of disks, combine them in a RAID array, create a partition on this array and, then, create a file system on this partition. This is an easy process that might take less than 30 minutes. For this tutorial, we are going to create a RAID 5 device, meaning that the data and its parity information will be striped across the disks in the array.
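To make the RAID 5 idea a bit more concrete, here is a toy sketch of how parity allows one lost block to be rebuilt. It uses plain XOR over three tiny made-up data blocks; a real adapter works on much larger stripes, rotates the parity across the disks, and does all of this in hardware, so treat this only as an illustration of the principle.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per disk (made-up contents), plus their parity.
data = [b"PWR7", b"LNUX", b"RAID"]
parity = xor_blocks(data)

# Pretend the second disk failed: XOR the surviving blocks with the parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Because XOR-ing everything together (data plus parity) always yields zeros, any single missing block is simply the XOR of all the others.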
You can also use iprconfig as a command-line tool; in this case, you pass the parameters you want on the command line. For example, in order to see which RAID levels are supported on a controller (sg5), the following command should be used:
# iprconfig -c query-supported-raid-levels sg5
0 5 10 6
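If you want to use that query from a script, something along these lines could work. It is only a sketch: it assumes the command prints the levels as whitespace-separated numbers exactly as in the sample output above, and the two function names are hypothetical helpers, not part of iprutils.

```python
import subprocess
from typing import List


def parse_raid_levels(output: str) -> List[int]:
    """Turn output such as '0 5 10 6' into a sorted list of levels."""
    return sorted(int(token) for token in output.split())


def query_supported_raid_levels(device: str) -> List[int]:
    """Run the iprconfig query shown above for one controller (needs iprutils)."""
    result = subprocess.run(
        ["iprconfig", "-c", "query-supported-raid-levels", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_raid_levels(result.stdout)


# Parsing the sample output from the tutorial; no hardware needed for this part.
print(parse_raid_levels("0 5 10 6\n"))  # [0, 5, 6, 10]
```

On a machine with the adapter, query_supported_raid_levels("sg5") would run the real command; a script could then check that, say, level 5 is in the list before going any further.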
Formatting a disk in RAID mode.
In order to use a disk as part of a disk array, it needs to be formatted specifically for RAID, also known as advanced function mode. If a disk is not formatted properly, you cannot add it to a RAID array. In order to format a disk in advanced function mode, follow these steps:
Launch the iprconfig tool on a console.
Select the menu Work with disk arrays (as shown in Figure 1)
In order to do it, press 2 and then enter.
Select Format device for RAID function
In order to do it, press 5 and then enter.
Then select the disks you want to format (all of those that are going to take part in the disk array) and continue. (As shown in Figure 2)
In order to select the disks, use the up/down arrows to move to each device and press 1 to select it.
Wait until the disks are formatted, as shown in Figure 3. (It takes a few minutes until the disks are formatted.)
Figure 3: Disks being formatted
Creating a RAID disk array
As explained above, in order to add a disk to a RAID array, the disk needs to be formatted in RAID mode. Once the disks are formatted in RAID mode, they are available to be added to a disk array, and the RAID device can be created. You can create as many RAID devices as you want and give each of them a set of disks. Let's go through the process of creating an array device. When creating a RAID array, it is recommended to format the devices for RAID as described in the previous step and then create the disk array following the steps below without exiting the iprconfig tool. Exiting the tool will result in the loss of knowledge that the disks have just been zeroed, and it will take longer for the array to initialize.
Launch the iprconfig tool.
Select Work with disk arrays, as shown in Figure 1.
Select Create a disk array.
Select the controller you want (You might have just one controller), as shown in Figure 4.
Select the disks that will be part of the array, as shown in Figure 5.
Press '1' over the disks you want to select.
Select the RAID type, as shown in Figure 6.
Go into the Protect Level and press 'c' to change the RAID level.
Select the disks that will take part in that disk array.
It may take a while to have the array created.
Figure 4: Selecting the RAID controller
Figure 6: Select the RAID type
Start using the RAID array
Once the RAID array has been created, it becomes a block device like any other block device on the system. You can create a partition on the device, make a file system on it, and start using it. The next steps will help you create a partition and a file system on the array partition. In this tutorial, I will create just one partition using the whole array and format it with the EXT4 file system, as shown below:
Figure 8: Creating an EXT4 file system on the partition over the RAID device
Once the file system is created in the partition, you can use this partition as a traditional file system, i.e., mount it on a directory and start using it. All the RAID operations, such as managing, striping, checksumming or mirroring the data, are offloaded to the RAID card and happen transparently.
Modified on by jscheel
by Jeff Scheel, IBM Linux on Power Chief Engineer
In June of last year, I started publicly discussing the role that little endian (LE) plays in our Linux on Power strategy with the blog, Just the FAQs about Little Endian. Then, in August I attempted to eliminate uncertainty in my Removing the FUD and Demystifying LE (little endian) article. With the announcement of the Red Hat Enterprise Linux 7.1 beta delivering an LE version, it is time to revisit little endian from the perspective of an application developer.
The release of RHEL 7.1 LE completes the offerings of little endian operating systems. Canonical had Ubuntu 14.04 ready for the POWER8 launch in May. SUSE supported the launch with public statements by Michael Miller in May about SLES 12 being LE, and publicly released it in October. It is now time for application developers to get busy: little endian Linux on Power is here!
One thing that being a developer by training has taught me is that “we” often need to be convinced that work is worth doing. Little endian Linux on Power is about reducing the cost of migrating an application AND providing additional value to the end application.
Being able to run Linux on Power in LE mode means that applications have one less thing – data endianness – to worry about in the port. While technical differences such as assembler language, page size, and cache size still exist, developers and architects tend to worry most about data endianness because finding and fixing all the problems can be very time consuming. By enabling Power to run in the same endian mode as x86 (the de facto Linux platform of choice for developers), applications can simply be recompiled without having to worry about endianness. Further, if one is going to build a solution that mixes x86 and POWER systems, exchanging data on disk or across the network in the same endian mode greatly simplifies the application as well. Then, add in the ability to accelerate Power applications with (inherently little endian) GPUs and the benefits of little endian become “a no brainer”.
So, hopefully, we're past the “why should I do this?” phase and can now address the list of technical resources for migrating to Linux on Power. My favorite resources include:
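For readers who have not been bitten by this yet, here is a small sketch of what an endianness mismatch looks like when data crosses between systems. It uses Python's standard struct module with an arbitrary example value; it is only meant to illustrate the class of bug, not any particular application.

```python
import struct
import sys

value = 0x01020304

# The same 32-bit integer laid out in memory in each byte order.
little = struct.pack("<I", value)   # least significant byte first
big = struct.pack(">I", value)      # most significant byte first
print(little.hex(), big.hex())      # 04030201 01020304

# Bytes written by a little endian system and read back assuming big endian.
misread = struct.unpack(">I", little)[0]
print(hex(misread))                 # 0x4030201

# Reading with the byte order the data was written in recovers the value.
print(struct.unpack("<I", little)[0] == value)

# The byte order of the machine running this script.
print(sys.byteorder)
```

When both ends run in the same endian mode, as with x86 and little endian Linux on Power, code that ships raw integers around never hits the misread case in the first place.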
The Linux on Power community in developerWorks has a great wiki page Porting from Intel x86 to Power systems running Linux that provides a great starting point for the process.
If you are migrating your application from x86 Linux and like bundles or toolkits, the Software Development Toolkit for Linux on Power provides an Eclipse-based environment for C/C++ applications with a porting wizard (The Migration Advisor) and a tuning wizard (Source Code Analyzer) for efficient development. This bundle further provides the latest free software (GNU) tools, oprofile, gdb and several Power-unique tools such as FDPR for post link optimization, pthread-mon to analyze highly threaded applications, and CPI (cycles per instruction) tooling to visually show inefficiencies.
For the best advice on tuning your application, I recommend starting at the Performance Rocks – Best Practices wiki page in developerWorks.
The Performance Optimization and Tuning Techniques for IBM Processors, including IBM POWER8 Redbook provides excellent insight to the Power processor.
Now let us take a look at “where can I get started?” The answers to this question depend on your role in the software ecosystem. If you are a software provider, my colleague Bob Dick recently published his thoughts on how to get started in the Using the IBM Power Development Cloud for Red Hat Enterprise Linux 7.1 (little endian) Beta application testing blog posting. Programs like IBM PartnerWorld provide this and more resources to facilitate porting. Check them out.
If you are an “in house” owner of an application in your enterprise, finding a system on which to port your application could be challenging. Of course, your IBM Sales contact or your business partner can provide alternatives such as try-and-buy or proof-of-concept systems. Do not hesitate to start with them. If you do not know them, or if this does not work out, go to the cloud! Site Ox offers a two week free trial for development purposes. Visit their website for details. As we move forward, I remain hopeful that other vendors will provide public offerings of Linux on Power images. Further, if you do not at first see the particular release for which you are looking, reach out to the service provider and request it. They might just surprise you and have a plan to provide it. If not, it helps them to hear your needs.
For open source developers, the access to free cloud images increases. The Open Source Labs at Oregon State University hosts Power development images (VMs). University of Campinas (UNICAMP) also hosts a minicloud in Brazil. In China, the SuperVessel Cloud provides a similar service to developers. In addition to these three locations, we are hoping to extend our offerings in both Europe and India in the near future. Again, the particular releases hosted at these sites may vary, but will generally include the little endian versions of Fedora, openSUSE, and Debian. If none of these sites or offerings work for you, feel free to reach out to me on Google+ (loaner post) to explore a dedicated loaner system.
With a complete set of little endian Linux on Power distributions, a robust list of technical resources, and plenty of resources for porting applications, the future is here. Take the first step. Seize the moment. Let's see what you can do with Linux on Power!
Modified on by jscheel
by Jeff Scheel, IBM Linux on Power Chief Engineer
Never before in my almost 25 year history with IBM have I anticipated a launch like that of our latest POWER8 systems. This launch releases the next generation of POWER processors in 1- and 2-socket scale-out servers with a focus on delivering new Linux solutions, especially in the area of open source clouds built on KVM – a dream come true for this “Chief Engineer.”
For me, the most important announcement is the release of an IBM KVM product, PowerKVM, on our POWER8 Linux-only servers, the S812L and S822L. For the past year, we have worked very hard in the open source communities to enable this open virtualization technology. Even though IBM is the first to offer KVM on Power Systems with our own product, rest assured that it has been enabled in a way such that any Linux partner can provide their own product. In fact, key open source community distributions such as Fedora 20 and openSUSE 13.2 have already been enabled. IBM remains committed to enabling Power Systems as a platform for open innovation.
The launch of this product marks the start of a new era in virtualization: convergence of hypervisors. As I've stated explicitly in my last blog, “Is PowerLinux a New Platform? Not really...” and have talked about in multiple forums for the past couple years, hypervisor commonality will be the next place enterprises save money. KVM unleashes the potential for a single hypervisor to run all platforms in the Enterprise as it continues to mature over the next 5-7 years, just as the Linux operating system has done in the past 10 years. In the meantime, PowerKVM provides an excellent opportunity for customers to leverage synergies with KVM and OpenStack to build lightweight, flexible, highly virtualized cloud solutions. KVM on Power (from IBM or anyone else wishing to release a version) will look and act just like KVM on x86, making it an excellent choice for new customers on the platform to exploit POWER8 benefits while reducing the learning curve to a bare minimum.
My excitement about this launch continues with the POWER8 processor. This generation of processors will thrive in the data-centric generation of cloud computing. Growth of the maximum number of cores per socket from 8 to 12, increases in the hardware threads per core from 4 to 8, significant improvement in the single thread performance over previous generations, and integration of a PCIe Gen3 controller into the chip will provide applications plenty of resources for big data and analytic workloads. In addition to the traditional “bigger, better, faster, stronger” enhancements associated with a new processor generation, the POWER8 processor also adds two exciting new technologies: CAPI (Coherent Accelerator Processor Interface) for connecting accelerators directly into the system, and a new processor mode to access memory in a little endian format. These processor technologies will fuel software innovation for years to come.
CAPI enables POWER8 processors and out-of-core accelerators like GPUs (graphics processing units) and FPGAs (field-programmable gate arrays) to cache-coherently share memory with the applications running on the main processors as if they were “in-core”. This greatly simplifies application acceleration design, which has traditionally required meticulous planning of shared memory usage through techniques such as pinned memory. Look for NVIDIA, our new OpenPOWER partner, to leverage this technology as they help bring GPU-based solutions to Power Systems in the coming months. Meanwhile, IBM will combine CAPI with FPGAs and emerging software applications like Redis to overcome traditional performance limitations.
Support for little endian mode in the POWER8 processor enables a new generation of applications. While I plan to devote future blog entries to this topic, let me take a moment to address some basic facts about the technology. First, the traditional POWER operating systems (AIX and IBM i) are big endian and will continue to run as such on POWER8 and future systems. Second, Linux operating systems over time will be migrating from being big endian to little endian in an organized way that by no means compromises the support cycles for their existing releases. Third, as little endian operating systems come to market, a new little endian application ecosystem will grow. Given that most applications today support one or more little endian operating systems (Windows or x86 Linux) and at least one big endian operating system (Linux on Power, AIX, or z/OS), this ecosystem should grow quickly. Finally, little endian mode on POWER8 is not a magic bullet: compiled x86 Linux applications will still require a recompile at a minimum because the x86 and Power instruction set architectures are still different. With this new support, Power will simplify one step in the porting process for new applications, but developing the whole ecosystem will take time.
Finally, as I look at this first set of POWER8 systems, my excitement peaks over the form factor in these initial offerings and the solutions they will enable. For the first time in as long as I can remember, we are bringing our newest technology to the entry, scale-out servers, with 1- and 2-socket systems in 2U and 4U footprints. Complement the offering with a continuation of the total cost of acquisition (TCA) competitive “Linux-only” servers, the S812L, S822L, and S824L models, and the Linux market can benefit immediately. Add a new Linux operating system from Canonical, Ubuntu version 14.04, and new applications from SugarCRM, CFEngine, Redis Labs and Zend, and the platform takes a huge leap forward in open innovation.
PowerKVM, POWER8, scale-out servers, and new Linux solutions. Wow! This POWER8 launch delivers so many new technologies from the processor through the software offerings enabled with KVM. It's truly hard not to sound like Buzz Lightyear from the movie Toy Story who frequently hollers, “To infinity and beyond!” If you are a POWER customer, you should be excited about the new solutions these new servers provide for you. If you are new to POWER, combine Linux with KVM to explore the benefits of the platform. The future is here. What will you do with it?
By: Bill Buros.
Within the Linux open-source community, and across the Power systems, there are a number of new, emerging, and in some cases maturing technologies actively being worked on, prototyped, and prepared for enterprise-class deployment. Members of the IBM Linux Technology Center around the world are involved with these projects, collaborating with peers across the open-source community.
Over the coming weeks and rolling into next year, here on the PowerLinux Technical Community we plan to begin highlighting these technologies in a series of Blog articles and updates.
Some of the technologies are simply prototypes which are being worked with the open-source community, and would naturally be expected to evolve as community feedback and interactions shape the proposals. Some of the technologies are unique to the Power processors and systems, in which case the articles provide deeper technical insights into what's happening with your Power system and how to recognize, control, and optimize those behaviors. And finally, some of the technologies are ready for prime time and will be highlighted to encourage users, customers and administrators to consider taking advantage of these packages.
For example, in the coming days, watch for an article on a Linux kernel "Memory Compression" prototype, where the Linux kernel can compress memory pages in memory instead of swapping the pages out to disk. Another article will provide technical details on I/O adapters which can take advantage of special adapter slots which support Dynamic DMA Windows on Power systems. Discussions within our own teams have concluded that "operf" (a nice combination of oprofile and perf) is ready for more extensive use and will be promoted - it is easier to set up and run, and does not require root access. These are just a few of the articles being discussed and developed.
Our plan is to begin expanding our Technical Articles on the PowerLinux Blog - providing a deeper and more technical focus around Power and Linux. The Blog will still provide updates on packages, releases, and product updates, but our hope is to provide a stronger technical dive into the technologies behind the solutions and systems.
Modified on by jscheel
by Jeff Scheel, IBM Linux on Power Chief Engineer
If you were like me and enjoyed the July 4th holiday week, you likely missed a very subtle – yet significant – event: the July 2nd release of firmware by IBM for use in the OpenPOWER Ecosystem under the Apache license, version 2.0. This action further signifies IBM's commitment to opening up the Power architecture and fulfills IBM's promise to the OpenPOWER Foundation. It also demonstrates cross-company innovation with the inclusion of significant code contributions from Google.
The OpenPOWER Foundation, established in 2013 by IBM, Google, Tyan, Mellanox, and NVIDIA, represents a growing alliance of over 40 organizations committed to building new technology based on the POWER microprocessor architecture. Plans to form the organization were first revealed in this IBM Press release stating that “The move makes POWER hardware and software available to open development...”
But, honestly, what makes the release of open source firmware so significant for OpenPOWER? In a nutshell, it enables collaboration around hardware. As OpenPOWER Foundation members innovate around the processor and system architecture, having a freely available, easily extensible, and simple to re-distribute firmware becomes a critical requirement. The release of this firmware under the Apache license provides a solid base around which a community can be built to support the OpenPOWER Ecosystem and to enable future hardware and software innovation. While a community could have created a new code base from scratch, IBM has shortened the time to innovation by tens to a hundred person-years through this contribution.
If you would like to get involved in this emerging firmware community or simply want to follow along, the project list is currently available on GitHub at https://github.com/open-power. If you want to be involved in the larger OpenPOWER Foundation group responsible for the firmware, keep your eyes open for information on the proposed System Software Work Group. A draft charter is presently being reviewed that allows for an open work group in which anyone can participate.
So, let the innovation begin. Come help us put the “Open” in OpenPOWER!
By: Jeff Scheel.
Ever wondered what IBM means when they say "Power Linux"? I get questions all the time like, "How does Power Linux differ from RHEL or SLES?" So, I thought perhaps it might be time to address this in writing, in a common place.
Here's the simple answer: Power Linux generally refers to all supported Linux distributions that run on IBM Power Systems servers. This means that Power Linux includes all Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES) versions. We simply say "Power Linux" instead of "RHEL and SLES" each time we wish to refer collectively to these Linux distributions. (What would one expect from us, the master of 3 letter acronyms and abbreviations?)
While this simplification sounds reasonable on the surface, in practice it has created some confusion. People believe that Power Linux is somehow different from SLES and RHEL on other platforms. It is not. Further, RHEL 6.2 for Power Systems is built from common source code, tested at the same time across platforms, made generally available on the same date, and supported the same way as RHEL 6.2 for x86 platforms. These same commonalities apply to other RHEL releases/versions, as well as SLES releases/versions. It is this standardization across multiple hardware platforms that elevates Linux from just another operating system to a multi-platform operating system that enables customers, business partners, and software providers to leverage hardware platform strengths without having to invest in new operating systems.
As my college physics professor used to say: "Is it as transparent as mud?"
Modified on by Bill_Buros
By Kent Yoder
I've just released a white paper with some info on using the encryption and random number generation hardware accelerators in the POWER7+ CPU in Fedora Linux. It covers a few areas:
1. Background on the hardware and software architecture
2. Setting up Fedora to use the accelerators for disk encryption and IPSec
3. How to monitor that you're really getting hardware acceleration
Hope you find it useful, and don't hesitate to send me questions or comments!
Modified on by JeffAntley
By: Jeff Scheel, PowerLinux Chief Engineer
A software provider recently said, point blank, at the beginning of our discussion, “Convince me that PowerLinux is not a new platform.”
The core of my answer went like this: the value of Linux is the same for software providers as it is for customers – Linux provides a single operating system environment across different hardware platforms. Customers and partners who have both x86 Linux skills and Power System skills have all the knowledge they need to run Linux on IBM Power Systems and PowerLinux servers. Customers and partners who only have x86 skills will need some training in PowerVM and the platform, but will quickly find commonality in the operating system.
This software provider's simple request strikes at the heart of the IT challenge – managing expense for the data center. Everyone is struggling to control software deployments in the enterprise. One can visualize this challenge as being depicted by a two-dimensional matrix (spreadsheet) of applications (across the top) against platforms (down the left side).
The complexity of this problem becomes most daunting when one considers two key factors:
Adding a new platform to the matrix means addressing all the applications and vice versa.
The application list continuously grows, seemingly daily.
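The matrix is easy to play with in a few lines of code. The application and platform names below are placeholders I made up for the sake of the example; the point is only how the number of combinations to support grows.

```python
from itertools import product

# Hypothetical applications (columns) and platforms (rows).
applications = ["web server", "database", "ERP", "analytics"]
platforms = ["x86 Linux", "Power Linux", "AIX"]


def support_matrix(apps, plats):
    """Every (platform, application) pair that has to be deployed and supported."""
    return list(product(plats, apps))


before = support_matrix(applications, platforms)
with_new_platform = support_matrix(applications, platforms + ["new platform"])
with_new_app = support_matrix(applications + ["new app"], platforms)

# Total cells, cells added by one new platform, cells added by one new application.
print(len(before), len(with_new_platform) - len(before), len(with_new_app) - len(before))
# 12 4 3
```

One new platform adds a cell for every application, and one new application adds a cell for every platform, which is why commonality across platforms matters so much.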
The good news for all of us in the IT industry is that Linux and open source are driving commonality into software and redefining the “platform” to be the most basic components of the hardware platform. To fully understand this conclusion, we need to look at how the definition of “platform” has evolved over time.
Software Engineers typically draw the software stack with hardware on the bottom of the stack and applications on top, as shown below.
Applications run on operating systems, hosted by hypervisors, which virtualize the hardware architecture.
For the sake of simplicity, I have grouped several categories together. For example, middleware is really an enabling component of applications and as such could be a separate layer below “Applications.” Additionally, runtime environments like Java or Perl have been lumped into the operating systems level instead of being placed as a unique layer on top of “Operating Systems.” One could just as easily include hypervisors as part of the hardware architecture. To enable a technology trends discussion later in this blog entry, I have elected to explicitly separate “Hypervisors” as a distinct layer.
You must consider the visibility to users in the IT infrastructure of each software stack component to appreciate the impacts on the business. With a little insight, you can see that most people in the enterprise touch the application layer, while fewest touch the hardware. This observation suggests a roughly graduated pyramid, as depicted below.
The implication of this graduated visibility means that changing an application causes more expense for an enterprise than does changing hardware architectures, especially when changes in a top layer drive changes in the foundational layers below it.
Does anyone remember the day when applications ran on only one platform? Not too long ago, “platform” meant the whole software stack, a silo top-to-bottom. If I wanted to run the AmiPro word processor, I did so on my DOS-based personal computer. To do word processing at work, I used BookMaster on the mainframe. Applications were the “platform.”
Over time, application providers began to write applications that could handle the disparate operating system environments. Applications (middleware) like IBM's DB2 database are available for Windows and UNIX operating systems. The advent of interpreted languages like Java further helped application portability. In this time frame, people began to think of the “platform” as the layers from the operating system downward.
With the adoption of the Linux operating system, the definition of “platform” continues to move further down the stack. When customers run RHEL 6.3 or SLES 11 SP3 in their environment, they get the same kernel version, built with the same compiler (which of course generates different binaries for each processor instruction set), running the same libraries, leveraging the same file systems, and including the same levels of many common applications such as the Apache webserver. In this structure, properly written applications can expect common application environments, even though they may be running in different hypervisors on different hardware platforms. For well-written applications, “porting” to a new platform generally becomes a recompile and a regression test.
The value of Linux to the enterprise comes from commonality. Differences drive expense. Commonality saves money. The greater the “visibility” for a component of the software stack, the larger the potential for savings from commonality. So, it should not surprise us that the historical trend of convergence has been top-down in our software stack.
While the majority of commonality has been with applications and operating systems, some convergence has occurred within the hardware architecture layer. I/O has developed standards like PCI. Networking converged to Ethernet, despite a competing standard of Token Ring. Likewise, storage area networking selected the Fibre Channel protocol over SSA. While these gains have been beneficial, the opportunity is probably limited because processor architectures will likely never converge to a single architecture. Every processor type and system architecture focuses on solving a different challenge.
As we look toward the future, opportunity exists for continued convergence in the software stack with hypervisors. Today's enterprises have their virtualization options dictated by the hardware platform. Power systems virtualize with PowerVM, mainframes use z/VM and PR/SM, x86 systems use VMware, Hyper-V, Xen, or KVM. Enterprises with heterogeneous hardware platforms run different hypervisors on each platform -- not by choice, but by mandate.
With Linux embracing KVM, the potential for a single, cross-platform hypervisor emerges. In the not too distant future, enterprises should be able to leverage KVM to reduce their virtualization expense, again changing the definition of “platform.” In this final picture, commonality has driven the definition of “platform” to simply being a new hardware architecture, reducing the impact (and resulting expense) to the most minimal definition.
Now that we have completed the analysis of the software stack convergence, we can provide a deeper answer to the original question: Is PowerLinux a new platform? Not really. Linux on Power is “just Linux”, RHEL 6.5 or SLES 11 SP3. For a software provider such as the one who posed the original challenge, their x86 Linux expertise combined with their Power System skills facilitates support of Linux on Power in a very cost-effective fashion. As Power embraces KVM, even the hypervisor differences will be eliminated and the impact of adopting Linux on Power will truly be isolated to a few people for software providers and customers alike.
While this convergence can be viewed from a historical perspective, one could also view these steps as phases of application maturity. New applications typically begin on a single operating system and hardware stack. Over time, applications that wish to support multiple hardware platforms will port. Modern applications, such as those written in interpreted languages like Java or Python or those compiled using open source compilers like gcc, will migrate to new hardware platforms quickly. Linux and open source tools enable software vendors to maximize their addressable market by writing to common libraries and compiling using standard tools common to all platforms that run the Linux operating system.
The evolution of the software stack has been on an exciting trend. Open source software has driven and will continue to drive a convergence of the software stack as the IT industry evolves. We have come a long way from the days of “application silos.” Today's runtime environment with the Linux operating system provides a common architecture that enables cross-platform application support. With the emergence of KVM as a virtualization technology, convergence will continue into the hypervisor layer. Once this transformation is complete, platform differences will be reduced to the fundamentals of the processor architecture, enabling IT customers to minimize expense while leveraging the fundamental advantages each platform can provide to their solution. What enterprise would not want to be allowed the flexibility of selecting the best platform for their solution while minimizing expenses?
If you are an IBM Business Partner or IBM employee who needed remote access to Power Systems servers in the past, you might have come across the Virtual Loaner Program (VLP). The VLP is gone now, but not to worry, it has been replaced by the Power Development Platform (PDP).
In addition to the name change, the program added new features and comes with an improved web interface. As did the VLP, the PDP focuses on bringing ISVs, other Business Partners and IBM employees worldwide remote access to IBM POWER processor-based servers on the IBM AIX, IBM i and Linux operating systems. The PDP brings the latest in IBM Power Systems hardware for porting, testing, certifying and demonstrating applications.
Besides a new Linux porting image with IBM DB2 10.x and IBM WebSphere 8.5.5, which will especially interest PowerLinux community members, the enhancements include improved reservation navigation, the capability to allow expansion beyond Virtual Server Access, and deeper social media integration to provide users with more news and information.
So, check out the PDP site at ibm.com/partnerworld/pdp and see it for yourselves. Please note that you need to be a PartnerWorld member or an IBM employee to reserve a virtual system.
By: Fabio Dassan dos Santos
One of the new and noteworthy features for this 5.3 release, the LPAR Cloning and Restoration tool, focuses on extending value in this category by providing a quick and easy way of creating re-usable system images across LPARs.
Through very few steps, it is possible to achieve the following with this new tool:
- Save all available devices of the LPAR in backup images;
- Use compression methods to decrease the size of the backup image;
- Store these images in an NFS server share;
- Associate previously saved images with available devices of an LPAR and restore the system.
This function is especially useful in situations where there is a need to preserve a certain system level, or to quickly replicate system images to multiple LPARs in a virtualized environment.
By: Jeff Scheel.
What an exciting week in Miami, FL!!! I spent last week at Power Technical University, helping people Think Power Linux. We had lots of great discussions. A big "thank you" goes out to all who attended sessions, a bigger "THANK YOU" to those who asked questions and participated in the discussion.
Here are some of my key thoughts from the event:
- The interest in Linux continues to increase. Although I don't keep formal counts, attendance at the Linux sessions is up over last year which was better than the previous. The first Trends and Directions presentation was standing-room-only, largely due to overflow from the other sessions. But even before the overflow wave started, we had at least 40 attendees in the room. I've posted the deck for people to review who didn't make the session.
- Power customers continue to grapple with the question of "Why Power Linux?" Those attending the sessions frequently feel that they are trying to convince their enterprises to consider Power when deploying Linux. When I provide the simplified answer of there being two reasons to do Power Linux -- the value of the Power Platform (virtualization, RAS, and performance) and all of the additional value-add items that we provide (pre-load, Installation Toolkit, Simplified Setup Tool, Software Development Toolkit, and Think Power Linux community) -- the answer seems to resonate. Folks understand that the platform provides value to all Power operating systems. They also appreciate the value-add initiatives that reduce their time-to-value for Linux solutions on Power Systems.
- The 2011 focus items on the SDK for application development and the new Think Power Linux community are definitely needed and timely. The reception of these items has been resoundingly positive. Customers are happy that we're working to simplify the porting process with the SDK and they're looking for places to ask their questions and find the latest information on the product.
- In a great discussion with a Power Linux customer, I learned that customers are still grappling with backup solutions similar to mksysb. While we have an open source solution that we're looking at for our Installation Toolkit next year, this customer discovered that Storix has made their SBAdmin product available for Power Linux. He implemented it and was very impressed with the function, support, and price. What a great thing to learn and hear from a customer!
If you attended the conference, I hope you found as much value as I did. If you didn't attend, perhaps you can join us at a future event.
By: Jeff Scheel.
Two weeks ago, I blogged about my thoughts after attending Power Technical University in Miami. This week, I bring you my thoughts from our event in Copenhagen, Denmark.
It never ceases to amaze me what I learn at these events. While the topics I presented were identical to the session in Miami two weeks ago, I still learned a bundle from the Power Linux customers who attended in Copenhagen.
Here are my thoughts from this week:
- Again, I took a significant number of business cards to Copenhagen and still ran out before the week was over. Interest in Power Linux was definitely greater than in Lyon, France last year!!! Power customers are definitely "thinking Linux". I did my best to help extend this to "Think Power Linux". I believe we see this growth reflected here in our new community, where our membership continues to grow. We've passed the yearly goal of 150. Can we top 200 before year-end?
- If you remember my blog from Miami, I was surprised to meet a customer who taught me that Storix had a solution for Power Linux. In Copenhagen, I met another Storix customer and had more discussions about mksysb-type backups. There's a real need for this solution in the marketplace. While some open source solutions exist, none yet support Power. Having Storix supporting the platform is great because of their deep heritage within the UNIX marketplace.
- On the theme of surprising solutions, I found an answer to a frequent question: How do I size my Linux partition? Midrange Performance Group provides a Power Navigator product to perform capacity planning on AIX, Linux, and other proprietary operating systems. As I understand this solution, it can help you migrate a Linux workload from x86 to Power using data from the nmon tool. Give it a look if this is a problem with which you've been grappling. Oh, by the way, did I mention that the Linux Installation Toolkit includes nmon?
- I attended a great presentation on the Linux on Power Best Practices in virtualized environments by Dr. Michael Perzl (email@example.com). He did a terrific job of detailing HA configurations for Power Linux and showing the similarities/differences with the AIX equivalents. I've posted a PDF export of his presentation to our community. (Please note, the formatting issues in the PDF are a result of my export, not Michael's presentation.) Feel free to reach out to him as one of our many technical experts.
- Finally, the issues in Europe are the same as in the United States: How do we differentiate Power Linux from Intel Linux? How do we "sell" Power Linux within a business that believes Linux is x86 only? If you haven't read my approach to answering these questions, feel free to refer back to my blog about Power Tech U. in Miami.
If I met you in Copenhagen and you joined because of my presentation, feel free to comment against this blog and provide feedback. Welcome to our community! Help make us better.
If you just follow along and my postings spur any thoughts or comments, feel free to comment as well. Did I spur any thoughts, comments, or questions?
Well, that's all. Thanks for Thinking Power Linux today.
By: Jenifer Hopper
This article discusses some basic XML tuning tips for PowerKVM guests. It helps new users get started with editing guest XML definitions, and walks through some simple tuning examples.
The article covers various options to tune the guest disk, network, CPU, and memory. It also includes some example guest resource pinning configurations for different scenarios. Applying these tips may help improve application performance by ensuring your guest is configured properly and optimized for the KVM environment.
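To give a flavor of what editing a guest XML definition looks like, here is a minimal sketch in Python. It is not taken from the article: the guest name, vCPU count, and host CPU numbers are hypothetical, and a real guest definition has many more elements. The `<cputune>` and `<vcpupin>` elements are the standard libvirt way to express vCPU pinning.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down guest definition for illustration only.
GUEST_XML = """<domain type='kvm'>
  <name>guest1</name>
  <vcpu placement='static'>2</vcpu>
</domain>"""


def pin_vcpus(domain_xml, pinning):
    """Return the domain XML with one <vcpupin> entry per vCPU in `pinning`.

    `pinning` maps a guest vCPU number to a host CPU set string.
    """
    root = ET.fromstring(domain_xml)
    cputune = root.find("cputune")
    if cputune is None:
        # Create the <cputune> block if the guest does not have one yet.
        cputune = ET.SubElement(root, "cputune")
    for vcpu, cpuset in sorted(pinning.items()):
        ET.SubElement(cputune, "vcpupin", {"vcpu": str(vcpu), "cpuset": cpuset})
    return ET.tostring(root, encoding="unicode")


# Assumed mapping: guest vCPU 0 on host CPU 0, guest vCPU 1 on host CPU 8.
tuned = pin_vcpus(GUEST_XML, {0: "0", 1: "8"})
print(tuned)
```

In practice you would apply the same change with `virsh edit` on the live definition; the sketch only shows the shape of the resulting XML and does not check for pinning entries that already exist.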
by Jeff Scheel, IBM Linux on Power Chief Engineer
I couldn't resist the urge to use TLAs (three letter acronyms) to dispel the FUD (fear, uncertainty, and doubt) on my favorite topic, LE (little endian).
If you are like most customers (and my mother), the concept of data endianness rarely, if ever, enters your mind. You buy applications, operating systems, and computers. All you care about is that the operating systems run your applications on the computer to accomplish your goals. If you have heard that Linux on Power is moving to little endian and are worried, your approach to survival is simple: focus on release planning details for your systems – applications, operating system, and hardware.
Who does care then about data endianness? Programmers, software vendors, hardware vendors and those 1% technology geeks who do development in the IT industry. These are the people who have noticed that the POWER8 hardware can operate with data in either big or little endian mode and for whom I wrote the “Just the FAQs about Little Endian” blog back in June. They appreciate how little endian simplifies their Linux applications running on both x86 and Power systems; they understand how little endian simplifies data sharing on disk or over the network between x86 and Power systems; and they drool over the potential of running GPUs (graphics processing units, sorry couldn't resist another TLA) on their Power Systems to create highly optimized applications common to scientific or analytic workloads.
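For the 1% who do care, the data-sharing problem is easy to see in a few lines. The following is a minimal sketch in Python, not something from the FAQ; the value 0x0A0B0C0D is an arbitrary example.

```python
import struct
import sys

value = 0x0A0B0C0D

# The same 32-bit integer laid out in memory under each scheme.
big = struct.pack(">I", value)      # most significant byte first
little = struct.pack("<I", value)   # least significant byte first

print("native byte order:", sys.byteorder)
print("big endian bytes:   ", big.hex())     # 0a0b0c0d
print("little endian bytes:", little.hex())  # 0d0c0b0a

# Data written by a big endian system and read back with a little endian
# assumption comes out as a different number.
misread = struct.unpack("<I", big)[0]
print("misread value:", hex(misread))        # 0xd0c0b0a
```

When both systems agree on the byte order, as x86 and little endian Power do, the bytes on disk or on the wire can be shared without any swapping step.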
But for the remaining 99%, what you need to know about LE can be simplified to the following key points:
Existing Linux on Power operating systems will be transitioning from BE (big endian) to LE, most likely at a major release boundary. At this boundary, the process of upgrading will involve more work than previous upgrades because the operating system and all applications will need to make the transition from BE to LE. SUSE has already announced that SLES 12 will be their transition point to LE. Red Hat shipped RHEL 7 as BE and has not announced their LE plans.
Note: Canonical will not be transitioning Ubuntu, because it started as an LE operating system.
Only POWER8 and future generations of POWER processors will be capable of running LE. As operating systems make the transition, you will not be able to upgrade POWER7+ and older systems to the new LE releases. Just leave those systems on the old release. If you want to run SLES 12 or Ubuntu 14.04, you will need a POWER8 system.
IBM plans to support LE operating systems in LPARs (logical partitions) and VMs (virtual machines). POWER8 and newer systems will eventually support intermixing LE and BE operating systems on both PowerVM and KVM, but this support will be a staged delivery with complete intermixing support around mid-year 2015. Completion of all testing of various BE and LE configurations takes time. Your patience is greatly appreciated. The LE FAQ above has some valuable details about current limitations.
Hopefully you now understand that you need not learn about esoteric programming concepts as we head into the brave new world of little endian Linux on Power. Instead, as we progress through this transition, if you simply spend a little more time in planning to double check support for your applications, operating systems, and hardware during each step, you will have success. That's what I'd recommend to my mother – just a little due diligence.
By: Jeff Scheel.
As I've often mentioned in my blogs, I like to answer questions which I've been asked very frequently. Today's topic provides more details about our marketing slogan "Industry standard, Tuned to the task". We've worked so hard to eliminate the myth that PowerLinux is different (see my blog What does IBM mean when it says, "PowerLinux?") that we're now getting questions like:
- Will my (x86) Linux application just run on PowerLinux?
- Can I use the same DVDs to install my PowerLinux or Power System server as I used for my x86 server?
- Can I migrate a VMware Linux image to PowerVM?
The short answer to all of these questions is the same, "No".
The technical explanation stems from a fundamental component of computer architecture -- the processor. Every processor architecture has a different machine language (command set) that it supports. So, even though Linux is built from the same source code, using the same compiler, the final executable system code is different because the compiler "targets" (builds for, generates machine instructions for) the specific processor type.
For some people, this may make perfect sense if we talk about a simple "Hello world" application. If I compile it on my x86 PC at home, I understand that I should not expect it to run on the Power System at work. In fact, if I build the application in both places, x86 PC and Power System, I will see that even the resulting binaries are different sizes, demonstrating that the machine languages of the two processor architectures are different.
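You can see this in the binaries themselves: on Linux, every executable starts with an ELF header that records the processor it was built for and its byte order. The following is a minimal sketch in Python, assuming the standard ELF header layout; the helper names and the synthetic headers are illustrative only and stand in for real compiled programs.

```python
import struct

# e_machine values from the ELF specification (a small subset).
MACHINES = {3: "x86 (32-bit)", 20: "PowerPC (32-bit)",
            21: "PowerPC (64-bit)", 62: "x86-64"}


def describe_elf(header):
    """Return (architecture, byte order) from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # Byte 5 (EI_DATA) records the byte order: 1 = little, 2 = big.
    if header[5] == 1:
        order, fmt = "little endian", "<H"
    elif header[5] == 2:
        order, fmt = "big endian", ">H"
    else:
        raise ValueError("unknown byte order")
    # e_machine is a 16-bit field at offset 18, stored in the file's byte order.
    (machine,) = struct.unpack_from(fmt, header, 18)
    return MACHINES.get(machine, "unknown (%d)" % machine), order


def describe_file(path):
    """Read the header of a real binary, for example describe_file('/bin/ls')."""
    with open(path, "rb") as f:
        return describe_elf(f.read(20))


def fake_header(ei_data, machine):
    """Build a synthetic 64-bit ELF header prefix (illustration only)."""
    fmt = "<H" if ei_data == 1 else ">H"
    ident = b"\x7fELF" + bytes([2, ei_data, 1]) + bytes(9)  # 16-byte e_ident
    return ident + struct.pack(fmt, 2) + struct.pack(fmt, machine)


print(describe_elf(fake_header(1, 62)))  # ('x86-64', 'little endian')
print(describe_elf(fake_header(2, 21)))  # ('PowerPC (64-bit)', 'big endian')
```

The loader checks these fields before running a program, which is why a "Hello world" built on x86 is rejected on a Power System instead of being executed.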
Since the Linux operating system is simply a large collection of programs, the same explanation applies to the operating system. Red Hat and SUSE build their distributions from the same source using the same compilers, but generate different programs for each processor architecture. Then, they bundle and distribute all the programs for the operating system on a specific processor architecture onto different DVDs -- one set for x86, one for PowerLinux, etc.
Now, let's look at an installed image. Once I get that operating system running with the programs compiled for my architecture, the answer to the final bullet above should become obvious. The executable program is unique to the processor architecture. So, the migration of VMs must naturally stay on the same processor architecture. PowerVM can move VMs (or LPARs as I grew up knowing them) between different versions of the architecture such as POWER6 to POWER7, but it cannot move them from POWER7 to Nehalem because the executable binaries are only understandable to the processor for which they were built.
Hopefully, this now makes perfect sense. But if not, let me try one more analogy. If you and I were identical twins, dressed identically by our mother, and were trained to play the violin for the same number of years by the same teacher, we would play the same piece of music (say Fritz Kreisler's Praeludium and Allegro) differently. Why? Because our brains are different. Even though the source code (written music) is the same, the executable program (our playing of the music) would be different because our processors (brains) have a different architecture even though the computer system has all the same components like I/O (violin) and chassis (clothes and body structure).
By: Brent Baude.
On May 11th, Fedora announced a beta for F17 for ppc64. This is another milestone in the march towards Fedora 17 for the powerpc architecture.
The beta announcement itself can be found here ->
Yeah, lots of packages have been updated and so forth but there are two interesting pieces I'd like to draw some attention towards. Firstly, adoption of grub2 continues in the beta. We smoothed out some of the rough edges since the alpha timeframe and have a number of additional patches we'll push as well.
Secondly, rpm and yum are now equipped to deal with a ppc64p7 subarch. I'll write up more on this topic as we near or pass Fedora 17 General Availability, but the basic function is that we have a POWER7 subarch (akin to i686 for x86) where certain optimizations are passed to the binary rpms. This is an exciting time for the architecture and Fedora!
Stay tuned as we catch our breath and begin to share more!