By: Bill Buros.
There's quite a bit going on in the world of Linux on Power, where several of us have some focus on performance improvements. Recently, a series of articles published on developerWorks nicely highlights the performance gains that the gcc packaged in the Advance Toolchain provides over the gcc packaged with the Linux operating system.
Two articles are available which dive into performance gains across a number of workloads in the SPECcpu2006 suite. The approach is simple: build with the gcc bundled with your version and release of the operating system, and measure the performance. Then install the Advance Toolchain (a couple of RPMs), change the path to gcc, rebuild, re-run, and compare the performance.
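As a rough sketch, assuming the Advance Toolchain 4.0 RPM names and the /opt/at4.0 install prefix (both illustrative; use the names and paths from the release you download), the compare loop looks like this:

```shell
# Baseline: build and time the workload with the distro gcc
gcc -O3 -o workload workload.c
time ./workload

# Install the Advance Toolchain runtime and devel RPMs
# (hypothetical package names, for illustration only)
sudo rpm -ivh advance-toolchain-at4.0-runtime-*.ppc64.rpm \
              advance-toolchain-at4.0-devel-*.ppc64.rpm

# Put the Advance Toolchain gcc first on the PATH, rebuild, re-run
export PATH=/opt/at4.0/bin:$PATH
gcc -O3 -o workload workload.c
time ./workload   # compare against the baseline run
```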
Naturally, your mileage will vary. If you have a workload example of something you think should be running faster, let us know, we'll try to help!
And more gcc improvements are coming. It just keeps getting better.
(Sept 19th 2011 - updated the url link to the Advance Toolchain 4.0 article)
Last modified by breno.leitao
By: Breno Leitão
This technical preview tutorial explains how users of IBM's latest POWER8-based scale-out Linux servers can try Ubuntu running non-virtualized. We show how Ubuntu can be installed directly on the OPAL firmware, and run as a single-image operating system directly on the system.
Ubuntu 14.04 is generally available today and fully supported as a PowerKVM guest on the IBM Power Systems shown below:
For more details on Ubuntu 14.04 - see Canonical's What’s new in 14.04 LTS document.
It is also possible to run Ubuntu 14.04 directly on these systems, which the development teams refer to as non-virtualized mode, or "bare-metal". There is no PowerVM LPAR layer, and there is no PowerKVM hosting layer. This capability is being developed in the open-source communities, so over time new versions of other Linux distributions are expected to be enabled for this support as well.
Note: If you are running Ubuntu 14.04 non-virtualized, you need to upgrade the kernel packages to get cpufreq support. The 3.13.0-32 level works.
The OPAL firmware referenced below is designed to allow a Linux operating system to run directly on the POWER8 system. Running directly on the hardware also enables the operating system to be a KVM host, creating and controlling KVM guests. In the scenario described in this article, however, there is no KVM hosting, and Ubuntu 14.04 simply runs as the operating system directly on the system.
Because the OPAL firmware enables the PowerKVM mode, the terminology used in selecting the firmware below is targeted at that mode. In practice, OPAL firmware enables a Linux operating system to run directly on the system, and running KVM in that operating system is not a requirement.
Technical preview only at this time. The ability to run Ubuntu directly on the POWER8 Linux-only system is provided as-is, is not a supported configuration option at this time, and is therefore not for production use. This ability is provided as a technical preview only. If you encounter any problems running the non-virtualized technical preview, you can report bugs against Ubuntu in Launchpad. Alternatively, you can always ask a question in the Forums here on the Community!
You are overwriting your PowerKVM install. These instructions replace (destroy) your existing PowerKVM host and all of its guests. The PowerKVM software can be re-installed at a later time, and your guests can be re-created.
Your system must have access to the external web for access to Canonical's netboot server - or you will need a DVD image downloaded and burned to a DVD. These instructions assume a netboot load.
1. In order to install Ubuntu 14.04 on the IBM Power system, the system needs to be set to KVM as the hypervisor mode. This step selects the OPAL firmware to be loaded. If you built the system with the PowerKVM configuration, you are ready to go; otherwise, you can configure it using the following steps:
Turn off the server, go to the server's Advanced System Management (ASM) interface, and under System Configuration ⇒ Hypervisor Configuration, set the hypervisor mode to KVM (or OPAL) and choose an IPMI password.
2. Once the machine is in PowerKVM mode, you need to connect to the FSP using IPMI to get the machine console. Run the following IPMI commands:
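The command listing did not survive in this copy of the article; a typical ipmitool session looks like the following sketch, where the FSP address and the IPMI password you chose in the ASM step are placeholders:

```shell
# Check/control system power through the FSP's IPMI interface
ipmitool -I lanplus -H <fsp-ip-address> -P <ipmi-password> chassis power on

# Attach to the machine console via Serial-over-LAN
ipmitool -I lanplus -H <fsp-ip-address> -P <ipmi-password> sol activate
```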
Once you run the last command, you will see the machine console, and everything you type is sent to the machine. To exit the console, type ~. (and ~? shows the help menu).
3. Once the machine is booted up, you will see the petitboot console, as shown below:
Petitboot is the bootloader for IBM Power machines configured with PowerKVM. From here, you can insert an Ubuntu DVD into the machine's DVD-ROM drive and boot from it. You can also boot from the network.
This document will explain how to install from the network.
4. In order to install from the network, you need to configure the System Network in the 'System Configuration' menu entry. Once the network is configured, you can create a new entry in the petitboot by pressing the letter 'n'. By creating a new entry, you will go into Option Editor to configure the entry details, as shown:
Once you are editing the boot entry, you should choose the 'Specify paths/URLs manually' option, and you must provide the installer kernel and initrd. For Ubuntu, you should point it to the Canonical Ubuntu 14.04 netboot website.
In this example, I used version 14.04 with the following URLs:
For a more recent version, such as 14.04.1 or 14.04.2 and updates, check the download section of the Ubuntu ppc64el wiki page.
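The URL list itself was not preserved in this copy. For illustration only, the netboot kernel and initrd for the 14.04 (trusty) ppc64el installer typically live under paths of this shape on Canonical's ports archive; verify the exact URLs on the Ubuntu ppc64el wiki before using them:

```shell
# Illustrative paths - confirm against the current archive layout before use
KERNEL=http://ports.ubuntu.com/ubuntu-ports/dists/trusty/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/vmlinux
INITRD=http://ports.ubuntu.com/ubuntu-ports/dists/trusty/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/initrd.gz
```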
You do not need to fill out the other entries if you just want to do a default installation. Once you finish the configuration, go back to the petitboot menu and boot the entry you just configured, "User Item 1".
Then just boot that entry by pressing 'Enter', and the Ubuntu 14.04 installer will launch, as shown:
When you see this screen, select the language you want to use during the installation and proceed through normal Ubuntu 14.04 installation processes.
For more information about the Ubuntu installation process, check the Ubuntu 14.04 Installation Guide. For more information about Petitboot, you can check the IBM PowerKVM Redbook.
Last modified by rpsene
You can also follow the steps described here.
But here are some good references:
Just put your question here.
There are a lot of experts willing to help you. Also, you can ask for help from within the SDK. Just select an
Last modified by breno.leitao
I would like to announce a draft of a PowerKVM Redbook.
PowerKVM is an open source hypervisor based on the POWER8™ technology. In the following document, you will learn how to set up and configure this hypervisor. It also covers how to control KVM from the console and the Web UI.
IBM Redbooks are great technical collateral intended to quickly bring you up to speed with the latest technologies, offerings, and best practices!
Last modified by PowerLinuxTeam
A good day of playing around on the KVM guest from SiteOx.com running the latest Ubuntu 14.04 for Power.
(continued from my first impressions article)
First and foremost: I am very impressed with how easy this has been. For the longer term, I'm working out how to get in and get out in a day, since I don't really want to be billed for a system just sitting there over time, but this looks pretty quick and easy to provision, set up, copy files to, run/test/debug/tune/repeat, copy files back, and let go of the server. That makes a $3 investment for a one-day extended set of experiments look pretty doable.
The good folks at SiteOx were very responsive with problem tickets - which really were just simple questions - and they've already set up a DeveloperWorks Community just for questions, support, and hints/tips.
Some quick observations from the day.
1. apt-get comes set up and ready to go. Adding packages was a breeze. I really appreciated that. The installation speed was quick - the connections looked pretty good - but I didn't do any specific performance measurements.
2. For the guest configuration, the setup of Power hardware threads (SMT mode) for the processor cores wasn't ideal for Power. The setup is currently only a single thread per core. As I mentioned yesterday, I purchased / ordered four processor cores - and got four CPUs. That really should come with SMT=4 turned on, this being a Power7 based server until Power8 is available. We are working with the teams to get this updated. In general, the user/admin of the system (each guest) should be able to control the SMT mode for his/her guest image.
3. I will reiterate yesterday's comment about the flexibility of Ubuntu being built for Power7 systems to enable teams to take advantage of older systems - but Ubuntu will *only* be supported on IBM's Power8 servers. You actually need a special version of the underlying firmware which enables KVM hosting, so for customers with existing Power7 or Power7+ systems, no, this simply won't be possible.
4. All of my go-to tools worked. oprofile, operf, perf, gcc, gedit, vi, sar, top, etc etc. I'm re-building the Ubuntu kernel now just to see how that works. Couple of install commands, and the build commands and it's off and running.
5. I did get a question from peers about running a desktop from the server. I never really do that for a server, but pursued it a bit for curiosity's sake. It appears that Unity won't be supported (intentionally) for servers - Unity requires local 3D accelerators. Makes sense. I believe there are various threads going on to enable simpler alternative desktops which should work over an X connection. Not a high priority for me.
6. We're looking to get the Advance Toolchain from IBM automatically installed. That'll be good. It's easy to use and exploit.
7. Sometime over the next month or two or three (is that vague enough?), the IBM SDK is expected to be in beta mode for Ubuntu. That'll be really good.
8. scp (copying files) to/from the guest worked fine. For example, I can copy my test application to the guest. Play around - test - debug - compile again - tune a bit. And then scp the files back.
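The kernel rebuild mentioned in item 4 really is just a couple of commands; a sketch of the standard Ubuntu packaged-kernel approach (build targets assumed per the usual Ubuntu kernel-build instructions) looks like:

```shell
# Fetch build dependencies and the source of the running kernel
sudo apt-get build-dep linux-image-$(uname -r)
apt-get source linux-image-$(uname -r)

# Build binary kernel packages (this takes a while)
cd linux-*/
fakeroot debian/rules clean
fakeroot debian/rules binary-headers binary-generic
```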
More pieces planned next week.
If you're looking for quick and easy access to a developer's platform to do some porting work, I would recommend you start thinking about this offering.
I do think for new ports, you'll want the IBM SDK with the migration tools, but in the meantime, this platform will provide easy access to a Power system and the latest Ubuntu server images.
Last modified by jhopper
The new Linux kernel feature "zswap" is discussed, with some initial performance data provided to demonstrate the potential benefits for a system (partition or guest) which has constrained memory and is beginning to swap memory pages to disk. The technique improves the throughput of a system, while significantly reducing the disk I/O activity normally associated with page swapping. We also explore how zswap works in conjunction with the new compression accelerator feature of the POWER7+ processor to potentially improve the system throughput even more than software compression alone.
This article is a good example of the ongoing collaboration that occurs in the Linux open-source community. New implementations are proposed, discussed, debated, refined and updated across developers, community members, interested customers, and performance teams. Here on the PowerLinux technical community, we are working to highlight more of these examples of work-in-progress from the broader Linux community. These proposals are applicable to both x86 systems and Power systems, so examples shown below cover both realms.
What is zswap?
Zswap is a new lightweight backend framework that takes pages that are in the process of being swapped out and attempts to compress them and store them in a RAM-based memory pool. Aside from a small reserved portion intended for very low-memory situations, this zswap pool is not pre-allocated; it grows on demand, and its maximum size is user-configurable. Zswap leverages an existing frontend already in mainline called frontswap. The zswap/frontswap process intercepts the normal swap path before the page is actually swapped out, so the existing swap page selection algorithms are unchanged. Zswap also introduces key functionality that automatically evicts pages from the zswap pool to a swap device when the zswap pool is full. This prevents stale pages from filling up the pool.
The zswap patches have been submitted to the Linux Kernel Mailing List (lkml) for review; you can view them in this post
Instructions for building a zswap-enabled kernel on a system installed with Fedora 17 can be found on this wiki
What are the benefits?
When a page is compressed and stored in a RAM-based memory pool instead of actually being swapped out to a swap device, this results in a significant I/O reduction and in some cases can significantly improve workload performance. The same is true when a page is "swapped back in" - retrieving the desired page from the in-memory zswap pool and decompressing it can result in performance improvements and I/O reductions compared to actually retrieving the page from a swap device.
Using the SPECjbb2005 workload for our engineering tests, we gathered some performance data to show the benefits of zswap. SPECjbb2005 is a Java™ benchmark that evaluates server performance and calculates a throughput metric called "bops" (business operations per second). To find out more about this benchmark or see the latest official results, see the SPEC web site. Note that the following results are not tuned for optimal performance and should not be considered official benchmark results for the system, but rather results obtained for research purposes. We liked this benchmark for this use case because we could carefully control, in increments, the amount of active memory being used.
The SPECjbb2005 workload ramps up a specified number of "warehouses", or units of stored data, during the run. The number of warehouses is a user-controlled setting that is configured depending on the number of threads available to the JVM. As the benchmark increases the number of warehouses throughout the run, the system utilization level increases. A bops score is reported for each warehouse run. For this work, we focused on the bops score from the warehouse that keeps the system about 50% utilized. We also increased the default runtime for each warehouse to 5 minutes since swapping can be bursty and a longer runtime helps to achieve more consistent results.
For these results, the system was assigned 2 cores, 10 GB of memory, and a 20 GB swap device. A single JVM was created for the SPECjbb2005 runs, using IBM Java. First, a baseline measurement was taken where normal swapping activity occurred; then a run with zswap enabled was measured to show the benefits of zswap. We gathered results on both a Power7+ system and an x86 system to observe the performance impacts on different architecture types. The mpstat, vmstat, and iostat tools from the sysstat package were used to record CPU utilization, memory usage, and I/O statistics. We would recommend taking advantage of the lpcpu package to gather these data points.
To demonstrate the performance effects of swapping and compression, we started with a JVM heap size that could be covered by available memory, and then increased the JVM heap size in increments until we were well beyond the amount of free memory, which forced swapping and/or compression to occur. We recorded the throughput metric and swap rate at each data point to measure the impacts as the workload demanded more and more pages.
Setting up zswap
With the current implementation, zswap is enabled by this kernel boot parameter:
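The parameter itself did not survive in this copy; in the zswap patch series it is enabled from the kernel command line, like so:

```shell
# Append to the kernel boot line in your bootloader configuration
zswap.enabled=1
```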
We looked at several new in-kernel stats to determine the characteristics of compression during the run. The metrics used were as follows:
pool_pages - number of pages backing the compressed memory pool
reject_compress_poor - reject pages due to poor compression policy (cumulative) (see max_compressed_page_size sysfs attribute)
reject_zsmalloc_fail - rejected pages due to zsmalloc failure (cumulative)
reject_kmemcache_fail - rejected pages due to kmem failure (cumulative)
reject_tmppage_fail - rejected pages due to tmppage failure (cumulative)
reject_flush_attempted - reject flush attempted (cumulative)
reject_flush_fail - reject flush failed (cumulative)
stored_pages - number of compressed pages stored in zswap
outstanding_flushes - the number of pages queued to be written back
flushed_pages - the number of pages written back from zswap to the swap device (cumulative)
saved_by_flush - the number of stores that succeeded after an initial failure due to reclaim by flushing pages to the swap device
pool_limit_hit - the zswap pool limit has been reached
failed_stores - how many store attempts have failed (cumulative)
loads - how many loads were attempted (all should succeed) (cumulative)
succ_stores - how many store attempts have succeeded (cumulative)
invalidates - how many invalidates were attempted (cumulative)
There are two user-configurable zswap attributes:
max_pool_percent - the maximum percentage of memory that the compressed pool can occupy
max_compressed_page_size - the maximum size of an acceptable compressed page. Any pages that do not compress to be less than or equal to this size will be rejected (i.e. sent to the actual swap device)
To observe performance and swapping behavior once the zswap pool becomes full, we set the max_pool_percent parameter to 20 - this means that zswap can use up to 20% of the 10GB of total memory.
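For reference, here is a sketch of how that limit can be set at run time; the sysfs path shown matches the zswap module parameters in later mainline kernels and is assumed here for the patch-level kernel as well:

```shell
# Let the compressed pool grow to 20% of total memory
echo 20 | sudo tee /sys/module/zswap/parameters/max_pool_percent
cat /sys/module/zswap/parameters/max_pool_percent   # verify the new value
```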
The following graphs represent the SPECjbb2005 performance and swap rate for a run using the normal swapping mechanism.
Note that as "available" memory is used up around 10GB, the performance falls off very quickly (the blue line) and normal page swapping to disk (the red line) increases. The behavior is consistent on both Power7+ and x86 systems.
Power7+ baseline results:
x86 baseline results:
As you can see, performance dramatically decreased once the system started swapping and continued to level off as the JVM heap was increased.
The following graphs represent the SPECjbb2005 performance and swap rate for a run when zswap is enabled. In these cases, memory is now being compressed, which significantly reduces the need to go to disk for swapped pages. Performance of the workload (the blue line) still drops off but not as sharply, but more importantly the system load on I/O drops dramatically.
Power7+ with zswap compression:
x86 with zswap compression:
As you can see, the swap (I/O) rate was dramatically reduced. This is because most pages were compressed and stored in the zswap pool instead of swapped to disk, and taken from the zswap pool and decompressed instead of swapped in from disk when the page was requested again. The small amount of "real" swapping that occurred is due to the fact that some pages compressed poorly - which means they did not meet a user-defined max compressed page size - and were therefore swapped out to the disk, and/or stale pages were evicted from the zswap pool.
Looking at the zswap metrics for each run, we can calculate some interesting statistics from this set of runs - keep in mind the base page size is different between Power (64K pages) and x86 (4K pages), which accounts for some of the different behaviour. Also note that we set the max zswap pool size to 20% of total memory for these runs, as mentioned above - this max setting can be adjusted as needed. On Power, the average zswap compression ratio was 4.3. On x86, the average zswap compression ratio was 3.6. For the Power runs, we saw entries for "pool_limit_hit" starting at the 17 GB data point. For the x86 runs, the pool limit was hit earlier - starting at the 15.5 GB data point. For the Power runs, at most the zswap pool stored 139,759 pages. For the x86 runs, the max number of stored pages was 1,914,720. This means all those pages were compressed and stored in the zswap pool, rather than being swapped out to disk, which results in the performance improvements seen here.
POWER7+ hardware acceleration
The POWER7+ processor introduces new onboard hardware-assist accelerators that offer memory compression and decompression capabilities, which can provide significant performance advantages over software compression. As an example, the system specifications for the IBM Flex System p260 and p460 Compute Nodes mention the "Memory Expansion acceleration" feature of the processor.
The current zswap implementation is designed to work with these hardware accelerators when they are available, allowing for either software compression or hardware compression. When a user enables zswap and the hardware accelerator, zswap simply passes the pages to be compressed or decompressed off to the accelerator instead of performing the work in software. Here we demonstrate the performance advantages that can result from leveraging the POWER7+ on-chip memory compression accelerator.
POWER7+ hardware compression results
Because the hardware accelerator speeds up compression, looking at the zswap metrics we observed that there were more store and load requests in a given amount of time, which filled up the zswap pool faster than a software compression run. Because of this behavior, we set the max_pool_percent parameter to 30 for the hardware compression runs - this means that zswap can use up to 30% of the 10GB of total memory.
The following graph represents the SPECjbb2005 performance and swap rate for a run when zswap and the POWER7+ hardware accelerator are enabled. In this case, memory is now being compressed in hardware instead of software, and this results in a significant performance improvement. Performance of the workload (the blue line) still drops off, but even less sharply than the zswap software compression case, and the system load on I/O still remains very low.
Power7+ hardware compression:
As you can see, the swap (I/O) rate was dramatically reduced. This is because most pages were compressed using the hardware accelerator and stored in the zswap pool instead of swapped to disk, and taken from the zswap pool and decompressed in the hardware accelerator instead of swapped in from disk when the page was requested again. The small amount of "real" swapping that occurred is due to the fact that some pages compressed poorly - which means they did not meet a user-defined max compressed page size - and were therefore swapped out to the disk, and/or stale pages were evicted from the zswap pool.
The following graphs show the performance comparison between normal swapping and zswap compression, and the POWER7+ graph also includes the hardware compression results, showing that the hardware accelerator provides even more performance advantages over software compression alone:
Power7+ performance comparison:
x86 performance comparison:
As you can see, this workload shows up to a 40% performance improvement in some cases after the heap size exceeds available memory when zswap is enabled, and the POWER7+ results show that the hardware accelerator can improve the performance by up to 60% in some cases compared to the baseline performance.
Swap (I/O) comparison
The following graphs show the swap rate comparison between normal swapping and zswap compression, and the POWER7+ graph includes the hardware compression results, showing that the hardware accelerator also reduces the swap rate dramatically. Swap rates are dramatically reduced on both architectures when zswap is enabled, including the POWER7+ hardware compression results.
Power7+ swap I/O comparison:
x86 swap I/O comparison:
The new zswap implementation can improve performance while reducing swap I/O, which can also have positive effects on other partitions that share the same I/O bus. The new POWER7+ on-chip memory compression accelerator can be leveraged to provide performance improvements while still keeping swap I/O very low.
Last modified by Christy Norman
By now everyone has heard of the microservices revolution that is sweeping the industry. Microservices are here to change the world, and developers and companies alike are rushing to transform their workflows to this new model.
Some workflows require the use of host devices, so docker provides a --device flag that will pass through a host device (such as a block device) into a container. However, for NVIDIA GPUs it’s not as simple as using the --device flag. To allow developers to isolate GPUs in docker containers, NVIDIA has created a wrapper for the docker command aptly named nvidia-docker. You can read more about why --device isn’t sufficient here: https://github.com/NVIDIA/nvidia-docker/wiki/Why%20NVIDIA%20Docker. Primarily, using a GPU “requires the installation of the NVIDIA driver.”
nvidia-docker is a great tool for developers using NVIDIA GPUs, and NVIDIA is a big part of the OpenPOWER Foundation – so it’s obvious that we would want to get ppc64le support into the nvidia-docker project. Luckily, the project was well laid-out and it was a piece of cake to get ppc64le support added.
Let’s walk through building the required packages and docker images, and install the nvidia-docker plugin.
If you're unfamiliar with the nvidia-docker project's components, here is a quick summary. There are two main components: 1) the plugin and command wrapper, and 2) the docker image build scripts (makefiles and dockerfiles). The plugin and command can be installed by building a package, or by using make and make install. If you use make, you'll have to start the nvidia-docker service manually (e.g. $ systemctl start nvidia-docker). For POWER, you can install nvidia-docker either by building a deb package or via make, and you can also build docker images for CUDA 7.5 (which use Ubuntu 14.04 as the image base) and CUDA 8.0 (which use Ubuntu 16.04 as the image base).
This walk-through is using Ubuntu 16.04-based images with CUDA 8.0.
Pre-Req: You have to install the nvidia drivers for the GPUs you’re going to use on your host system in order to use nvidia-docker. This article assumes that you have been running GPU workloads on your system and have those libraries pre-installed.
Check that your installed nvidia driver supports the CUDA 8.0 toolkit version here: https://github.com/NVIDIA/nvidia-docker/wiki/CUDA#requirements
Build the nvidia-docker plugin deb
Optional: If you already have docker installed, and it’s version 1.9 or later, you can choose to skip this step and force-install the deb we create next.
Docker, Inc. has its own repositories for those who want to use a more recent docker version than their distro might ship. Those repositories contain the docker-engine package. If you want to install docker-engine on ppc64le, use the following:
$ echo 'deb http://ftp.unicamp.br/pub/ppc64el/ubuntu/16_04/docker-1.13.1-ppc64el/ xenial main' | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update && sudo apt-get install docker-engine
To use docker as a non-root user (e.g. your-username), execute the following to add the user to the docker group.
$ sudo usermod -aG docker your-username
Perform the rest of the tasks as the user you added to the docker group.
Clone the repository and build the deb
First clone the nvidia-docker repository and checkout the ppc64le branch:
$ git clone https://github.com/NVIDIA/nvidia-docker.git && cd nvidia-docker && git fetch --all
$ git checkout ppc64le
Next, create the installable plugin package. This uses docker, so you’ll see some docker image builds happen. When the images have built and the deb has been built (inside a docker image), you’ll have a deb in your local filesystem.
$ make deb
When prompted to update the changelog, choose 'n'.
Note: If you’re not running on Ubuntu, you can also run ‘make’ and ‘make install’ and then start the plugin manually by starting the nvidia-docker service. Building and installing the deb does this all for you.
Install and verify the deb
$ cd tools/dist && sudo dpkg -i nvidia-docker_1.0.0~rc.3-1_ppc64el.deb
Installing the deb installs the nvidia-docker plugin, as well as a wrapper for the docker command. You'll notice that you can run all docker commands using nvidia-docker instead. From now on, if you're working with GPUs, use 'nvidia-docker' instead of 'docker'. For example, $ nvidia-docker images will list your docker images, just like $ docker images does.
Build the cuda images using Ubuntu 16.04 (xenial) as the image base. This will also use docker, so you will see docker images being created.
$ OS=ubuntu-16.04 make cuda
Note: If your build errors out complaining that some packages couldn’t be installed, scroll up and see if there was a warning that the repository didn’t have a Release file. Delete your cuda images and try again if so. Sometimes a network hiccup causes issues.
Now you’ll have new docker images:
The nvidia-docker images were created during the 'make deb' step, and the cuda images were just created. The cuda images will be used to run containers with GPU workloads.
Using Your Images
Let’s make sure the images built correctly and that the nvidia-docker plugin is working.
$ nvidia-docker run --rm cuda:8.0 nvidia-smi
This is your GPU inside a container!
Now that you have a base cuda image, you can use it to create your own GPU workloads. You can create Dockerfiles using 'FROM cuda:8.0' to use the image you just created.
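As a minimal sketch of such a Dockerfile (the workload, file names, and packages are illustrative assumptions, not part of the nvidia-docker project):

```dockerfile
# Derive a GPU workload image from the locally built cuda:8.0 image
FROM cuda:8.0

# Illustrative: install whatever build tools your workload needs
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential && \
    rm -rf /var/lib/apt/lists/*

# Copy in and build a hypothetical CUDA application
COPY . /workspace
WORKDIR /workspace
RUN make

CMD ["./my-gpu-app"]
```

Build it with $ nvidia-docker build -t my-gpu-app . and run it with $ nvidia-docker run --rm my-gpu-app.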
You can reach out to us by filing an issue in the nvidia-docker repo, or leaving a comment on this blog post.
Last modified by Bill_Buros
A brand new release of IBM PowerKVM - V3.1.0 is now available. PowerKVM V3.1.0 delivers new server and I/O adapter support, enhanced availability options, and improved automation. These new PowerKVM capabilities give you more virtualization options for PowerKVM that improve usability, performance, and availability. PowerKVM V3.1.0 offers the following new functions:
- The host OS runs in Little Endian mode
- vCPU and memory hot-plug support
  - vCPU hot-plug add and remove
  - Memory hot-plug add
- Automated dynamic micro-threading support, which sets micro-threading dynamically when systems require it during resource over-commitment
- Support for the new Linux scale-out S812LC (Habanero) and S822LC (Firestone) servers
- Support for new (and existing) Linux distribution releases as guest VMs
  - RHEL6.7 BE, RHEL7.1 BE and LE, RHEL7.2 BE and LE
  - SLES11 SP4 BE, SLES12 LE, SLES12 SP1 LE (when available)
  - Ubuntu 14.04.3, 15.04, 15.10 LE
- Development Kit for PowerKVM (AS-IS support), which can be found here.
For more information and documentation, please refer to our official PowerKVM V3.1 documentation page.
About the IBM PowerKVM
IBM® PowerKVM™ provides an open virtualization choice for IBM scale-out Linux systems based on the POWER8™ technology. This solution includes the Linux open source technology of KVM virtualization and is designed to complement the performance, scalability and security qualities of Linux. It provides an open, extendable solution for running virtual machines (VMs) on Linux scale-out servers, enabling cloud deployments, scale-out processing and big data solutions while reducing complexity and cost.
For more information about it, visit http://www-03.ibm.com/systems/power/software/linux/powerkvm/
Last modified by Bill_Buros
This is the second part in a series of articles on the topic Using the IBM SDK for Linux on Power to improve the performance of a large open source project, in which I will be using the Migration Advisor. Here we continue exploring the performance of the PHP open source project by looking for tell-tale signs of missing optimization for the POWER platform in the source.
The Migration Advisor (MA) looks for places in the code that use Intel inline assembly or Intel-specific compiler built-ins, where that Intel-specific code is wrapped in C Pre-Processor (CPP) conditional macro logic and there is no corresponding conditional logic for __powerpc__ or __powerpc64__ equivalents.
This is really handy if you are under pressure to “port” an application or library to the POWER platform, or the port “so far” is not performing and you want to find the “low-hanging-watermelons” before doing deep performance analysis. A common problem we see is that some low-level optimization has been done for Intel but little or none for POWER.
In some cases involving well-known patterns that MA recognizes, it can offer a “quick fix”: the IBM SDK offers to edit the specific source code for you, inserting conditional compile logic around the Intel-specific code and inserting the equivalent POWER code into the __powerpc64__ leg of the conditional logic.
In the cases where the Intel-specific code is not recognized, or the conditional logic is too complex for the MA to deal with, MA is still useful. Examples include CPP conditional logic involving multiple else-if (#elif) sections, or conditional logic that falls through to generic C/C++ code for the default (non-Intel) case. MA still finds these cases and points to the exact source files and lines that need work; it just means you will get to use your programmer skills to provide the appropriate fix. So general C/C++ and CPP macro skills are still needed.
It also helps to be familiar with GCC built-in functions, including the common cross-platform, Intel-specific, and POWER-specific functions. We see these built-in functions a lot in open source code. You will also see atomic operations both in inline assembler and using the legacy IA64-style __sync built-ins. Here it is best to convert these to the new C11-standard atomic operation built-ins.
We are also seeing a lot more use of the vector intrinsic built-ins. These are not provided by the GCC compiler directly but as macros in Intel-specific header files that map the Intel intrinsic to the corresponding GCC built-in. For example, the _mm_add_pd intrinsic maps to the GCC built-in __builtin_ia32_addpd. The MA scans for both spellings and, where recognized, will provide a quick fix mapping to the equivalent POWER vector extension built-in.
Of course both platforms are evolving quickly, and each adds new instructions with succeeding generations. This is a moving target, and while we strive to keep up, the MA may not have a quick fix for the latest intrinsics and built-ins. So if you find a case where the MA does not recognize an intrinsic or does not provide a quick fix, please contact us on the portal. We maintain a table with the mappings we have.
This wraps up the Migration Advisor introduction. I plan to follow-up soon with a series on the Source Code Advisor.
Modified by Bill_Buros
When I started this Zen and art of PowerLinux Performance series I had the best of intentions to contribute regularly. Then I got busy: enabling Little Endian for Power 64-bit, which required a new ABI, plus coordinating the enablement of all the pieces of the toolchain for the new target and platform, and then supporting the bring-up of multiple Linux distributions.
Now we are mostly past the “when will we get Distro X on hardware Y?” stage and deep into the “How do I tune my application for POWER?” and “I am not happy with the performance of open source package Z on POWER!” stage. How I wish I could be answering these questions 24x7, but that tends to interfere with my day job as Toolchain Architect.
Fortunately I have another job as the architect for the IBM Software Development Kit for Linux on Power (SDK). This gives me a chance to influence the development of additional tools that apply directly to the questions mentioned above. The idea is to provide powerful tools that embed deep knowledge of performance tuning and the POWER architecture into the tool itself, so that a good set of tools will allow Linux developers to analyze and tune their own applications.
I do hear concerns that a fancy Integrated Development Environment (IDE) will be too complicated to use and may not work with existing open source projects or large projects. The good news is that Eclipse.org tools (on which the SDK is built) are extremely flexible and extensible while maintaining a strong emphasis on usability. The SDK team has focused on selecting and integrating the plugin components that support C/C++ development for Linux and migration of existing applications to Linux on Power.
This support includes existing Eclipse plugins like the C/C++ Development Tools (CDT) and the Linux Tools Project (LTP), which integrates with and leverages the existing Linux tools that we are all familiar with. For example, the SDK can import and build existing autotools and Makefile projects. Once a project is imported, the SDK supports the usual edit, compile, debug cycles with additional tools that leverage existing Linux tools for code profiling (gprof, perf, and OProfile) and memory/heap analysis (Valgrind: memcheck and massif).
The SDK team has also developed unique and powerful POWER specific plugins and extensions, for the Eclipse framework, to analyze and tune your application. This includes the Migration Advisor (MA), Source Code Advisor (SCA), Feedback Directed Program Restructuring (FDPR), and more. These combined with the profiling tools provide a complete application porting, analysis and tuning tool kit.
Of course it definitely helps if you have general Linux development skills. To run some of the SDK's special tools against a project you will need to understand how the project is structured. For example, to run tests or benchmarks internal to the project (such as those normally run under “make check”) you will need to read up on the set-up needed to run that test. Does it need LD_LIBRARY_PATH set to the project libraries you just built? Where do you find, and how do you run, the command-line interpreter for the project (like php or python) without installing it (and potentially messing up your system)?
So what about larger projects? Well, I personally have imported open source projects like the GNU Compiler Collection (GCC) and GLIBC (as part of my day job). On occasion you will need to bump up the ulimits for the stack and the Java heap sizes for the SDK (look in /opt/ibm/ibm-sdk-lop/sdk_launcher.ini) to successfully import a really large project. And the initial source code indexing might take a while to process (an hour or two), but this is a small price to pay to gain deeper insight into the structure and performance of your company's application or your favorite open source package.
For this performance series I will take you through the process of importing, analyzing, and tuning a popular open source dynamic language interpreter, PHP.
The first step of importing the PHP projects is described in Using the IBM SDK for Linux on Power to improve the performance of a large open source project. Additional installments will demonstrate using the Migration Advisor and Source Code Advisor to find performance bottlenecks and suggest code changes that improve or eliminate these bottlenecks.
Modified by Pradipta_Kumar
Docker is now part of Ubuntu Vivid Vervet (15.04) for Power LE
Ubuntu Vivid, which has just been released, includes Docker (version 1.5) support. You can find the detailed release notes here.
To use Docker with Vivid on either bare metal or a virtual machine on Power systems, just install it from the repository, start the Docker engine, and off you go.
$sudo apt-get install docker.io
Start the service and verify the installation
$sudo service docker start
$sudo docker info
Storage Driver: devicemapper
Pool Name: docker-253:1-249191-pool
Pool Blocksize: 65.54 kB
Backing Filesystem: extfs
$sudo docker version
Client version: 1.5.0
Client API version: 1.17
Go version (client): go1.4.2 gccgo (Ubuntu 5.1~rc1-0ubuntu1) 5.0.1 20150414 (prerelease) [gcc-5-branch revision 222102]
Git commit (client): a8a31ef
OS/Arch (client): linux/ppc64le
Server version: 1.5.0
Server API version: 1.17
Go version (server): go1.4.2 gccgo (Ubuntu 5.1~rc1-0ubuntu1) 5.0.1 20150414 (prerelease) [gcc-5-branch revision 222102]
Git commit (server): a8a31ef
Create your first docker image
Since there is no official Power image available on Docker Hub, you'll need to create the base image from scratch. Here is an example of creating an Ubuntu Trusty (14.04) base image.
$sudo apt-get install debootstrap
$sudo debootstrap --components=main,universe trusty trusty
$sudo tar -C trusty -c . | sudo docker import - test/ubuntu_ppc64el:trusty
$sudo docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
test/ubuntu_ppc64el trusty c5efc5ec18e0 31 minutes ago 269.3 MB
Run your first docker container
$sudo docker run -itd test/ubuntu_ppc64el:trusty /bin/sh
Note that the image is referred to along with the tag 'trusty'. If no tag is specified, the default tag 'latest' is assumed and Docker will try to search for the image with tag 'latest':
$ docker run -it test/ubuntu_ppc64el /bin/sh
Unable to find image 'test/ubuntu_ppc64el:latest' locally
Pulling repository test/ubuntu_ppc64el
time="2015-05-18T05:31:54Z" level=fatal msg="Error: image test/ubuntu_ppc64el:latest not found"
If you would like to refer to the image as above (without an explicit tag), ensure you tag the image as 'latest':
$ docker tag test/ubuntu_ppc64el:trusty test/ubuntu_ppc64el:latest
You can also modify the image and save the image in dockerhub or your private repository for later use.
Build and Use Upstream Docker
If you want to build the latest Docker from source, all the necessary tools come with Vivid. Here are the instructions to get you started.
Install the gccgo compiler and docker build dependencies
$sudo apt-get install gccgo binutils libsqlite3-dev btrfs-tools libdevmapper-dev
Setup required directories
$mkdir -p ~/docker.bld/src/github.com/docker
Clone the docker source in the appropriate location.
Build dynamically linked docker binary
---> Making bundle: dyngccgo (in bundles/1.7.0-dev/dyngccgo)
Created binary: /home/ubuntu/docker.bld/src/github.com/docker/docker/bundles/1.7.0-dev/dyngccgo/dockerinit-1.7.0-dev
Created binary: /home/ubuntu/docker.bld/src/github.com/docker/docker/bundles/1.7.0-dev/dyngccgo/docker-1.7.0-dev
And that's all. You can now start using the latest docker on your Ubuntu Vivid system.
Hope this will be useful in your docker journey on Power systems.
Modified by rdutra
The SDK for Linux on Power team today announced the GA of SDK 1.7.0, which supports x86_64 and POWER Little Endian operating systems:
Red Hat Enterprise Linux 7.1
SUSE Linux Enterprise Server 12
Red Hat Enterprise Linux 6.5
Red Hat Enterprise Linux 7.1
SUSE Linux Enterprise Server 12
Features and changes
- Power Performance Advisor plugin:
The POWER Performance Advisor (PPA) plugin allows users to profile C/C++ applications, selecting a set of metrics based on the chosen target processor. PPA leverages ocount, an OProfile tool used to count native hardware events, to gather the processor performance data and calculate the metrics.
- Remote Setup Wizard:
An installation wizard that allows users to install the SDK packages and their dependencies on Power Systems using the x86_64 SDK. It automatically detects the operating system of the target system and automates all the steps necessary to install the SDK.
The launch bar provides quick access to the plugin launchers. It works like a shortcut for common tasks like running, debugging, and profiling an application.
The SDK provides a new welcome page with useful information about the supported tools.
- IBM POWER8 Functional Simulator supported distributions:
The IBM POWER8 Functional Simulator now supports only Little Endian operating systems.
The cheat sheets provide quick help inside the Eclipse IDE (Help > Cheat Sheets). Each cheat sheet is designed to help complete a specific task, listing the sequence of steps required to achieve that goal.
is now available.
About the IBM SDK for Linux on Power
The IBM Software Development Kit for Linux on Power (SDK) is a free, Eclipse-based Integrated Development Environment (IDE). The SDK integrates C/C++ source development with the Advance Toolchain, Post-Link Optimization, and classic Linux performance analysis tools, including OProfile, perf, and Valgrind. For more information about the IBM SDK for Linux on Power, visit: https://www-304.ibm.com/webapp/set2/sas/f/lopdiags/sdklop.html
Modified by wainersm
A wide range of applications applied to technical areas such as computational vision, chemistry, bioinformatics, molecular biology, engineering and financial analysis are using heterogeneous computing systems with general purpose GPU (Graphics Processing Unit) hardware as their high performance platform of choice.
Recently launched, the IBM Power System S824L comes into play to explore use of the NVIDIA Tesla K40 GPU combined with the latest IBM POWER8 CPU, providing a unique platform for heterogeneous high performance computing.
The Power S824L system comes with up to two Tesla K40 GPU cards (based on the Kepler(TM) architecture), each of them able to deliver 4.29 Tflops of peak single-precision and 1.43 Tflops of peak double-precision floating point performance. The Tesla K40 GPU features:
15 SMX (Streaming Multiprocessor)
Simultaneously execute 4 warps (group of 32 parallel threads)
ALU fully compliant with IEEE 754-2008 standard
64 KB configurable shared memory and L1 cache per multiprocessor.
48 KB read-only data cache per multiprocessor
1536 KB L2 cache
12 GB DRAM (GDDR5)
GPU Boost Clock
2880 CUDA cores (192 per multiprocessor)
Supports CUDA compute capability 3.5
Dynamic parallelism (ability to launch nested CUDA kernels)
Hyper-Q (allows several CPU threads/processes to dispatch CUDA kernels concurrently)
C/C++ CUDA programming support for POWER8 was first introduced with CUDA Toolkit 5.5 for Ubuntu 14.10 ppc64le. As of this writing, version 7 is the latest CUDA Toolkit release, and it supports Ubuntu 14.04 ppc64le as well. The toolkit comes with the following tools and libraries that allow development of CUDA applications on Power:
NVCC (NVidia CUDA Compiler) - front-end compiler
CUDA GDB - command line GDB-based debugger
CUDA Memcheck - command line memory and race checker tool
nvprof - command line profiling tool
binary utilities - include cuobjdump and nvdisasm
POWER cross-compilation support (new in CUDA Toolkit 7.0)
GPU-accelerated libraries - many libraries and APIs, for example cuBLAS, cuFFT, cuSPARSE, and Thrust
NSight Eclipse Edition - Eclipse-based Integrated Development Environment (IDE)
Because a CUDA application has portions of code that run exclusively on the host or the device processor, NVCC is a front-end compiler driver that simplifies the process of compiling C/C++ code. As back-end compiler, either the distro's GCC or the IBM XL C/C++ compiler 13.1.1 (or newer) can be used; it generates the objects that run on the host processor, while nvcc compiles the portions of code targeting the GPU device.
The CUDA Toolkit for Linux on POWER8 can be downloaded free of charge from https://developer.nvidia.com/cuda-downloads#linux-power8
Java applications can also exploit GPU-accelerated operations as of IBM Java SDK versions 7.1 and 8.0, which make the following packages available to applications:
com.ibm.gpu - provides classes with GPU-offloaded operations (e.g. arrays sorting)
com.ibm.cuda - enables low-level access to CUDA devices. For example, the API allows loading/unloading CUDA modules within the GPU device to execute kernel functions.
Read much more about NVIDIA CUDA on IBM POWER8 in the following IBM Redpapers:
Modified by jscheel
by Jeff Scheel, IBM Linux on Power Chief Engineer
In June of last year, I started publicly discussing the role that little endian (LE) plays in our Linux on Power strategy with the blog, Just the FAQs about Little Endian. Then, in August I attempted to eliminate uncertainty in my Removing the FUD and Demystifying LE (little endian) article. With the announcement of the Red Hat Enterprise Linux 7.1 beta delivering an LE version, it is time to revisit little endian from the perspective of an application developer.
The release of RHEL 7.1 LE completes the offerings of little endian operating systems. Canonical had Ubuntu 14.04 ready for the POWER8 launch in May. SUSE supported the launch with public statements by Michael Miller in May about SLES 12 being LE, and publicly released it in October. It is now time for application developers to get busy: little endian Linux on Power is here!
One thing that being a developer by training has taught me is that “we” often need to be convinced that work is worth doing. Little endian Linux on Power is about reducing the cost of migrating an application AND providing additional value to the end application.
Being able to run Linux on Power in LE mode means that applications have one less thing – data endianness – to worry about in the port. While technical differences such as assembler language, page size, and cache size still exist, developers and architects tend to worry most about data endianness because finding and fixing all the problems can be very time consuming. By enabling Power to run in the same endian mode as x86 (the de facto Linux platform of choice for developers), applications can simply be recompiled without having to worry about endianness. Further, if one is going to build a solution mixing x86 and POWER systems, exchanging data on disk or across the network in the same endian mode greatly simplifies the application as well. Then add in the ability to accelerate Power applications with (inherently little endian) GPUs, and the benefits of little endian become “a no brainer”.
So, hopefully, we're past the “why should I do this?” phase and can now address the list of technical resources for migrating to Linux on Power. My favorite resources include:
The Linux on Power community in developerWorks has a wiki page, Porting from Intel x86 to Power systems running Linux, that provides a great starting point for the process.
If you are migrating your application from x86 Linux and like bundles or toolkits, the Software Development Kit for Linux on Power provides an Eclipse-based environment for C/C++ applications with a porting wizard (the Migration Advisor) and a tuning wizard (the Source Code Advisor) for efficient development. This bundle further provides the latest free software (GNU) tools, OProfile, gdb, and several Power-unique tools such as FDPR for post-link optimization, pthread-mon to analyze highly threaded applications, and CPI (cycles per instruction) tooling to visually show inefficiencies.
For the best advice on tuning your application, I recommend starting at the Performance Rocks – Best Practices wiki page in developerWorks.
The Performance Optimization and Tuning Techniques for IBM Processors, including IBM POWER8 Redbook provides excellent insight into the Power processor.
Now let us take a look at “where can I get started?” The answers to this question depend on your role in the software ecosystem. If you are a software provider, my colleague Bob Dick recently published his thoughts on how to get started in the Using the IBM Power Development Cloud for Red Hat Enterprise Linux 7.1 (little endian) Beta application testing blog post. Programs like IBM PartnerWorld provide this and more resources to facilitate porting. Check them out.
If you are a “in house” owner of an application in your enterprise, finding a system on which to port your application could be challenging. Of course, your IBM Sales contact or your business partner can provide alternatives such as try-and-buy or proof-of-concept systems. Do not hesitate to start with them. If you do not know them, or if this does not work out, go to the cloud! Site Ox offers a two week free trial for development purposes. Visit their website for details. As we move forward, I remain hopeful that other vendors will provide public offerings of Linux on Power images. Further, if you do not at first see the particular release for which you are looking, reach out to the service provider and request it. They might just surprise you and have a plan to provide it. If not, it helps them to hear your needs.
For open source developers, access to free cloud images is increasing. The Open Source Labs at Oregon State University hosts Power development images (VMs). The University of Campinas (UNICAMP) also hosts a minicloud in Brazil. In China, the SuperVessel Cloud provides a similar service to developers. In addition to these three locations, we are hoping to extend our offerings to both Europe and India in the near future. Again, the particular releases hosted at these sites may vary, but will generally include the little endian versions of Fedora, openSUSE, and Debian. If none of these sites or offerings work for you, feel free to reach out to me on Google+ (loaner post) to explore a dedicated loaner system.
With a complete set of little endian Linux on Power distributions, a robust list of technical resources, and plenty of resources for porting applications, the future is here. Take the first step. Seize the moment. Let's see what you can do with Linux on Power!
Modified by PowerLinuxTeam
System administrators are used to handling several different complex services that run interconnected and in parallel on high-availability servers. In this scenario, it is hard to understand the whole architecture and all the application connections on a single server well enough not to break something when trying to fix or update it. It is also common to bump into servers that are not even updated properly, because the system administrators are so afraid of the architecture the server was built on that they usually prefer not to touch it. This is the kind of issue that we will try to solve in this article.
With the advent of virtualization and tools for handling nested operating systems, some engineers recognized system administrators' apprehension and tried to reduce it by using container isolation, pre-installed applications, and version control concepts. They eventually came up with Docker, which is a layer over Linux containers and can be seen as lightweight virtualization.
In the Docker architecture, each application runs in a single container, and each container supports only one application, meaning that all files in a single container exist solely to support that application. In this scenario, a container is a minimal operating system plus the application it will run. One of the most common minimal operating systems is Ubuntu Core, which is based on, as you might expect, Ubuntu.
Ubuntu Core and Docker on an IBM POWER8 server is a good combination for customers looking for an efficient and agile method to deploy complex workloads on the cloud.
What is Ubuntu Core?
Ubuntu Core is the minimal Ubuntu installation, ideal for small environments. It contains just the basic OS layer needed to support any other software running on top of it.
It is basically a 200MB Linux rootfs that contains around 196 packages that can be extended to be a full distro. Since it contains the dpkg and apt tools, you can install whatever you want to tailor it to your needs.
Ubuntu Core follows the traditional Ubuntu release cycle; it started shipping for the ppc64el architecture with 14.04, and the latest release is 14.10. You can also find daily builds at the Ubuntu Core daily builds website. In this article we are going to use Ubuntu Core version 14.04 on a standard 14.10 Ubuntu Docker host.
Ubuntu Core can be used in containers, Docker, chroot, and virtualization environments.
What are Linux Containers?
Linux Containers are the basic technology behind Docker. A Linux container is the technology that enables operating system virtualization on Linux, providing all the infrastructure to isolate different containers.
Linux containers rely basically on two technologies:
Cgroups: a Linux kernel technology that mainly provides resource limitation; together with kernel namespaces it enables process isolation.
LXC tools: a userspace daemon and client tools that enable container management.
In order to guarantee that Linux containers work fine on your machine, you can run a script that verifies that your environment is sane. To do so, as root:
Install LXC package
Run the lxc-checkconfig
Verify that everything is enabled, as in:
Figure 1: Verify that your server supports containers.
What is Docker?
Docker is an infrastructure to deploy applications inside Linux containers: instead of having a full set of applications running on the same machine, you can have one application per container and deploy multiple containers.
Each container has version control, which means that you can 'commit' and 'revert' changes, so you can handle a complex set of applications by breaking them into simpler containers.
Using Docker Hub, you can also upload, share, and download pre-configured containers for your environment, so instead of installing a complex application, you can download a container with the application already installed.
Installing Docker in Ubuntu on Power
In order to install Docker on Ubuntu 14.10, follow these steps as root:
Install Docker repository in Ubuntu:
Update the archive index
Install the docker package
This is an example of the package being installed on a 2-socket POWER8 S822-L machine.
(you may click on these images to see a larger version of the image)
Figure 2: Docker.io package being installed on Ubuntu 14.10
Creating the Ubuntu Core image in Docker
Download Ubuntu Core based on 14.04 version
Import the files into docker
Guarantee that the image was created:
Assure that your image is running fine:
Figure 3: Listing the fresh installed ubuntucore image
Figure 4: Assuring that the container is running
Using a preloaded Ubuntu Core image
If you want to use a default Ubuntu Core image for ppc64el, you don't need to follow the steps above; you can just pull an already-built image. In order to do so, you need to be registered at the Docker Hub website, which is similar to GitHub for Docker images. Once you have registered there, just log in using docker and then search for a ppc64el image.
# docker login
Figure 5: Authenticating in Docker hub
# docker pull image.
This is an example that I have been using on my personal registry.
Figure 6: Download Ubuntu Core 14.04 for POWER
This is going to create an Ubuntu Core image on your system, which will enable you to play with Docker and Ubuntu Core on POWER. Once you have that, you can list the Docker images and you will find the image leitao/ubuntucore among your images. As always, you can run any command on this image using:
# docker run image <command>
Committing and reverting changes
Once you have your image running, you can change it as you want and commit the changes. For example, let's install a package into my ubuntucore image:
After git is installed, you can check that there is a modification using:
Later, you can see which files were modified after the change, using the container ID:
If you like your change, you can commit it using docker commit command and then upload the changed image using docker push command.
Automating container creation
It is possible to automate the image creation using the concept of Dockerfile, which describes how a Docker image is created starting from a base image.
A Dockerfile is generally used to create custom images, as you can execute commands and copy files into an already existing image, giving it personalized content.
A Dockerfile is a plain text file that uses a small set of commands to create a customized image. In the following example we are going to create an image with MongoDB installed and copy some customized etc files.
This is what the Dockerfile looks like:
# Use the initial image called ubuntucore
FROM ubuntucore
# The author of this new image
MAINTAINER "Breno Leitao"
# Run a command to install mongodb into the image
RUN apt-get install -y mongodb
# Copy the directory /srv/mongodb/etc into the image's /etc
COPY /srv/mongodb/etc /etc
In order to run this Dockerfile, you need to save it and run the following command on the directory that contains this file:
# docker build .
Once you run it, you will have a new image based on the ubuntucore image, with the mongodb package installed and the files from the host's /srv/mongodb/etc copied to the container's /etc directory.