Reduce Linux power consumption, Part 3: Tuning results

This three-part series is your starting point for tuning your system for power efficiency. In Part 3, the author compares the performance of the five in-kernel governors in both tuned and untuned states to show you how to optimize a Linux-based System x server.



Jenifer Hopper, Software Engineer, IBM  

Jenifer Hopper is a software engineer for the IBM Linux performance group in Austin, Texas. Her current focus is on High Performance Computing (HPC) and energy workloads, as well as system profilers and data analysis tools.



07 October 2009


About this series

In this series, learn how to tune your Linux-based IBM System x server for power efficiency. You'll learn about the in-kernel governors and their settings and how to use them; you'll also see the effects of the tuned governors on a power performance and e-commerce workload. The examples are based on a System x server running Red Hat Enterprise Linux version 5.2 (RHEL 5.2), but the same guidelines apply to any of the 2.6.x kernels, as well as any processor type that supports frequency scaling.

Part 1 introduces the components and concepts you'll need to tune your system for power efficiency, including the Linux CPUfreq subsystem, C and P states, and the five in-kernel governors.

Part 2 gives more details on the general settings of the Linux CPUfreq subsystem and the five in-kernel governors—performance, powersave, userspace, ondemand, and conservative—and their settings.

Part 3 compares the performance of the five in-kernel governors in both a tuned and an untuned state to show you what results you can achieve by power tuning your system.

Workloads and governor effects

Power efficiency is an important consideration for anyone concerned with business costs or environmental issues. In this final article in the series, let's look at the difference in power efficiency (in real numbers and charts) that you get from tuning the Linux CPUfreq subsystem and in-kernel governors to change the processor's operating frequency without having a major impact on performance.

In Part 2, you saw how to use and tune the governors, so now you'll see some governor effects. I use two popular workloads to compare performance and power consumption and show how a tuned governor can provide power savings without sacrificing performance:

  • A workload from the SPECpower_ssj2008 benchmark that evaluates both power and performance
  • A workload from an e-commerce shopping application that gathers many statistics during a simulated online shopping session, including latency times and the number of requests per second

These comparisons were made on an IBM System x® 3650 running Red Hat Enterprise Linux 5.2.


SPECpower_ssj2008 workload

The five in-kernel governors again

Performance governor
Static | Sets processor to highest frequency | User adjustable frequency range | Fastest speed; no power savings

Powersave governor
Static | Sets processor to lowest frequency | User adjustable frequency range | Lowest speed; few power savings

Userspace governor
Dynamic | Allows user to manually set frequencies | User adjustable frequency range | Useful to set unique power policy

Ondemand governor
Dynamic | Changes frequency based on processor use | User adjustable range, utilization check rate and threshold | Manages processor utilization power downwards

Conservative governor
Dynamic | Changes frequency based on processor use | User adjustable range, utilization check, threshold, and frequency step rate | Manages processor utilization power up and down

The following results are from the SPECpower_ssj2008 benchmark that evaluates both power and performance. To find out more about this benchmark or see the latest official benchmark results, see the SPEC Web site (Resources for a link). Note that these results are not tuned for optimal performance and should not be considered official benchmark results for the system, but rather results obtained for research purposes.

SPECpower_ssj2008 uses a Java™ benchmark to get a performance score in the unit ssj_ops (ssj operations) and runs the benchmark at loads from 100 percent down to idle. The higher this score, the more the system can compute.

SPECpower_ssj2008 also measures power in watts and calculates a performance-to-power ratio at each of the loads. The higher the ratio, the better the system's performance-to-power efficiency.
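The ratio described above is a simple division of score by power at each load level. The following is a minimal sketch of that arithmetic; the function name and the sample readings are placeholders invented for illustration and are not measurements from these runs.

```python
def performance_to_power_ratio(ssj_ops: float, average_watts: float) -> float:
    """Return the score per watt (ssj_ops / watts) for one load level."""
    if average_watts <= 0:
        raise ValueError("average power must be positive")
    return ssj_ops / average_watts


# Placeholder readings (NOT results from this article): load -> (ssj_ops, watts)
sample_levels = {
    "100%": (200_000.0, 250.0),
    "50%": (100_000.0, 200.0),
}

# One ratio per load level, as the benchmark reports them
ratios = {
    level: performance_to_power_ratio(ops, watts)
    for level, (ops, watts) in sample_levels.items()
}
```

With these placeholder values, the 100 percent load has the higher ratio, so it would be the more power-efficient operating point.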

Default governor comparison

Figure 1 compares the effects of the five in-kernel governors, all running with their default settings. The tunables sched_mc_power_savings and sched_smt_power_savings were off and the CPU frequency daemon cpuspeed was running in conjunction with the userspace governor.
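Before a comparison like this one, it helps to confirm which governor each processor is using. The sketch below reads and sets the governor through the CPUfreq sysfs files scaling_governor and scaling_available_governors. The helper names are hypothetical, writing requires root privileges, and the cpu_root parameter exists only so the code can be pointed at a test directory.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def cpufreq_dirs(cpu_root: Path = CPU_ROOT) -> list:
    """Return the per-processor cpufreq directories (cpu0/cpufreq, ...)."""
    return sorted(p for p in cpu_root.glob("cpu[0-9]*/cpufreq") if p.is_dir())


def current_governors(cpu_root: Path = CPU_ROOT) -> dict:
    """Map each processor name to the governor it is currently using."""
    return {
        d.parent.name: (d / "scaling_governor").read_text().strip()
        for d in cpufreq_dirs(cpu_root)
    }


def set_governor(name: str, cpu_root: Path = CPU_ROOT) -> None:
    """Select the named governor on every processor that offers it."""
    for d in cpufreq_dirs(cpu_root):
        available = (d / "scaling_available_governors").read_text().split()
        if name not in available:
            raise ValueError(f"{name} is not available on {d.parent.name}")
        (d / "scaling_governor").write_text(name)
```

For example, calling set_governor("conservative") as root and then current_governors() should report conservative for every processor, assuming the governor is built in or its module is loaded.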

Figure 1. Score and power consumption for default

The dotted lines show the score in ssj_ops, a SPECpower_ssj2008 performance metric. As you can see, the only governor that causes a major drop in performance is the powersave governor, because it statically sets the processor frequency to the lowest available to save as much power as possible.

The solid lines show the power consumption. Again, the powersave governor uses less power than the others, but at the expense of performance. You can also see the difference in idle power between the governors. The performance governor always runs at the highest frequency and therefore draws around 10 watts more at idle than the other governors. The userspace governor, with the cpuspeed daemon, seems to be the best of the default governors at providing power savings without hurting performance. We can confirm this by comparing the performance-to-power ratios for each run in Figure 2.

Figure 2. Performance-to-power ratio for default


The performance-to-power ratio is a metric calculated by SPECpower_ssj2008 to highlight how power-efficient a system is by comparing the score received to the amount of power used to achieve that score, so the higher the ratio the better.

As you can see, the userspace governor, in conjunction with the cpuspeed daemon, has a better performance-to-power ratio than the others for most of the loads when the governors are running their default configurations for this setup; therefore, the userspace governor is more power efficient.

Tuning

As discussed earlier in Part 2, there are some optional tuning parameters for the ondemand and conservative governors. Here we will discuss how changing the utilization thresholds can affect the governor's power efficiency.

Remember tunable schedulers

sched_mc_power_savings is for scheduling processes on cores.

sched_smt_power_savings is for scheduling processes on hyperthreads on a core.

Ondemand
The ondemand up_threshold is set to 80 by default, meaning that once the CPU utilization reaches above 80 percent, the governor will increase the frequency. Here I'll show how you can tune the governor to be more power efficient simply by changing the up_threshold to 98.
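A minimal sketch of applying this setting from Python follows. It assumes the ondemand governor is already active and that its tunables appear either under each processor's cpufreq/ondemand directory (the layout on the older 2.6.x kernels discussed here) or under a single system-wide cpufreq/ondemand directory (an assumption about newer kernels). The function names are hypothetical, and the 1 to 100 check is only a sanity check; the kernel applies its own limits.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def governor_tunable_paths(governor: str, tunable: str,
                           cpu_root: Path = CPU_ROOT) -> list:
    """Find the sysfs files for one governor tunable in either layout."""
    per_cpu = sorted(cpu_root.glob(f"cpu[0-9]*/cpufreq/{governor}/{tunable}"))
    if per_cpu:
        return per_cpu
    system_wide = cpu_root / "cpufreq" / governor / tunable
    return [system_wide] if system_wide.exists() else []


def set_ondemand_up_threshold(percent: int, cpu_root: Path = CPU_ROOT) -> list:
    """Write the up_threshold to every ondemand tunable file that exists."""
    if not 1 <= percent <= 100:
        raise ValueError("up_threshold is a utilization percentage")
    paths = governor_tunable_paths("ondemand", "up_threshold", cpu_root)
    if not paths:
        raise FileNotFoundError("ondemand tunables not found; is the governor active?")
    for path in paths:
        path.write_text(f"{percent}\n")
    return paths
```

Calling set_ondemand_up_threshold(98) as root corresponds to the tuned configuration used in this section.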

Figure 3 compares the ondemand governor running the default configuration (an up_threshold of 80) with the tuned ondemand governor running with an up_threshold of 98. The tunables sched_mc_power_savings and sched_smt_power_savings were off during these runs.

Figure 3. Score and power consumption for ondemand

As you can see by the dotted lines, both the default and tuned ondemand governor achieved a very similar score, so changing the up_threshold does not show any performance impact. The solid lines, which show power consumption, do show a slight difference. As you can see, raising the up_threshold to 98 results in slightly lower power consumption than using the default threshold.

Next, let's look at the performance-to-power ratios in Figure 4.

Figure 4. Performance-to-power ratio for ondemand

Here you can see that for almost every load, the tuned ondemand governor with an upwards utilization threshold of 98 is slightly more power efficient than the default ondemand governor.

Conservative
The conservative governor has two threshold values that can be tuned:

  • First, the up_threshold is set to 80 by default, meaning that once the processor utilization reaches above 80 percent, the governor increases the frequency.
  • Also, there is a down_threshold, which is set to 20 by default. This means that once the governor finds the processors to be less than 20 percent utilized, it will start stepping down the frequency to save power.

I'll demonstrate how you can tune the conservative governor to be more power efficient simply by changing the up_threshold to 98 and the down_threshold to 95. This is fairly aggressive tuning for the governor, but I'll show that the aggressively tuned conservative governor is more power efficient.
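To see why these thresholds can matter at mid-range loads, here is a deliberately simplified model of the two-threshold stepping described above. It is a sketch, not the kernel's algorithm: it moves one frequency level per sample instead of using the real frequency step setting, it assumes utilization scales inversely with frequency, and the frequency table is hypothetical.

```python
def simulate_conservative(demand_pct, freqs_mhz, up_threshold, down_threshold,
                          samples=10):
    """Return the frequency chosen at each sample for a constant demand.

    demand_pct is the work offered, as a percentage of what the processor
    can do at its highest frequency.
    """
    if not down_threshold < up_threshold:
        raise ValueError("down_threshold must be below up_threshold")
    top = len(freqs_mhz) - 1
    level = top  # start at the highest frequency
    history = []
    for _ in range(samples):
        # The same work looks like higher utilization at a lower frequency
        utilization = min(100.0, demand_pct * freqs_mhz[top] / freqs_mhz[level])
        if utilization > up_threshold and level < top:
            level += 1  # step up one level
        elif utilization < down_threshold and level > 0:
            level -= 1  # step down one level
        history.append(freqs_mhz[level])
    return history


FREQS_MHZ = [2000, 2333, 2666, 3000]  # hypothetical frequency table

default_trace = simulate_conservative(50, FREQS_MHZ, up_threshold=80, down_threshold=20)
tuned_trace = simulate_conservative(50, FREQS_MHZ, up_threshold=98, down_threshold=95)
```

In this model, a steady 50 percent demand never drops below the default down_threshold of 20, so the default configuration stays at the highest frequency, while the tuned configuration steps down to the lowest frequency within a few samples.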

Figure 5 compares the conservative governor's effectiveness running the default configuration, an up_threshold of 80 and a down_threshold of 20, versus the tuned conservative governor running with an up_threshold of 98 and a down_threshold of 95. The tunables sched_mc_power_savings and sched_smt_power_savings were off during these runs.

Figure 5. Score and power consumption for conservative

Again, the dotted lines show there is no performance impact when using an aggressively tuned governor. The solid lines show the difference in power consumption between the default and tuned governor; it is very clear that the tuned governor pulls much less power at the middle loads, up to about 40 watts less at the 50 percent load. This is a significant power savings. You can confirm these observations by comparing the performance-to-power ratios in Figure 6.

Figure 6. Performance-to-power ratio for conservative

The ratios show that the tuned conservative governor increases its power efficiency over the default conservative governor for the 30 through 90 percent loads.

Tuned governors comparison

In this section, I'll compare the tuned ondemand and conservative governors to the other three governors. Figure 7 compares all five governors with ondemand and conservative threshold tuning. The tunables sched_mc_power_savings and sched_smt_power_savings were off, and the CPU frequency daemon cpuspeed was running in conjunction with the userspace governor.

Figure 7. Score and power consumption for tuned governors

Here you can see again what a big performance hit the powersave governor takes, since it runs only at the lowest possible frequency, although it does consume much less power than the others. (We'll look at the powersave governor's power efficiency compared to the others in the next graph.) The other four governors achieved a similar score regardless of their tuning. Again, the performance governor runs only at the highest available frequency, and you can see how great a difference this makes in power consumption by comparing the solid lines. The userspace governor, running with the cpuspeed daemon, and the tuned conservative governor pull the least power after the powersave governor. The userspace governor appears to consume slightly less power than the tuned conservative governor around the 30 to 50 percent loads, and the tuned conservative governor pulls less power for the loads above 50 percent. We can see which one has better power efficiency by comparing their performance-to-power ratios in Figure 8.

Figure 8. Performance-to-power ratio for tuned governors

From this graph you can see that the power efficiency of the tuned conservative governor and of the userspace governor running cpuspeed is very similar. The final SPECpower_ssj2008 score, not shown here, indicates that the tuned conservative governor has the best overall power efficiency, but only by a very small margin.

sched_mc_power_savings comparison

As I discussed earlier, the sched_mc_power_savings tunable attempts to consolidate processes to as few cores as possible in order to save power. Figures 9 and 10 show a comparison of CPU utilization for a run with sched_mc_power_savings on (1) and off (0), running with the default conservative governor. The following comparisons show the utilization for each processor at the 10 percent load, so the system is on average 10 percent utilized.

Figure 9. sched_mc_power_savings off
Figure 10. sched_mc_power_savings on

You can clearly see the difference in the two graphs. The first graph (Figure 9) with sched_mc_power_savings off shows that four of the processors are running at about 15 percent, and the other four are running at about 5 percent utilization. The second graph (Figure 10) with sched_mc_power_savings on shows that the load was consolidated onto four processors, now at about 20 percent, and the other four are idle. Using this tunable in conjunction with an in-kernel CPUfreq governor can reduce power consumption since the consolidation allows some of the processors to be idle and therefore able to run at a lower frequency.

sched_smt_power_savings comparison

Like sched_mc_power_savings, the sched_smt_power_savings tunable attempts to consolidate load in order to save power, in this case onto the hyperthreads of as few cores as possible. Figures 11 and 12 compare processor utilization for runs with sched_smt_power_savings on (1) and off (0), using the default conservative governor on a system that supports hyperthreading. The comparisons show the utilization for each processor at the 10 percent load, so the system is, on average, 10 percent utilized.

Figure 11. sched_smt_power_savings off
Figure 12. sched_smt_power_savings on

Again you can see that the load was consolidated when the setting was on. If the CPUs that are idling or close to idling can use CPUfreq governors to lower the frequency and/or idle C states in conjunction with this type of scheduling, power savings may be possible.
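The two scheduler tunables can be switched on (1) or off (0) by writing to their sysfs files. The sketch below assumes they are exposed as sched_mc_power_savings and sched_smt_power_savings directly under /sys/devices/system/cpu, and it skips any tunable the running kernel does not offer. The function name is hypothetical, and writing requires root privileges.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def set_sched_power_savings(mc=None, smt=None, cpu_root: Path = CPU_ROOT) -> dict:
    """Set the requested scheduler tunables and return those actually written.

    Pass 0 (off) or 1 (on); leave an argument as None to keep it unchanged.
    """
    requested = {
        "sched_mc_power_savings": mc,
        "sched_smt_power_savings": smt,
    }
    applied = {}
    for name, value in requested.items():
        if value is None:
            continue
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1")
        path = cpu_root / name
        if not path.exists():
            continue  # this kernel or hardware does not expose the tunable
        path.write_text(f"{value}\n")
        applied[name] = value
    return applied
```

For example, set_sched_power_savings(mc=1) corresponds to the "on" run shown in Figure 10. An empty return value means nothing was written.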


An e-commerce workload

In this section, I'll compare the governor effects on another type of workload. The following results are from an e-commerce shopping application that gathers many statistics during a simulated online shopping session, including latency times and the number of requests per second. This application uses an Apache front end, a PHP implementation, and a MySQL database to create a usable shopping site. Note that these results are not tuned for optimal performance and should not be considered official results for the system. We'll compare the effects on the workload at various utilization loads.

Default governors comparison

The following graphs compare the effects of two tunable in-kernel governors, conservative and ondemand, and the performance governor as a baseline comparison. All governors are running with their default settings, and the tunables sched_mc_power_savings and sched_smt_power_savings were off during these runs.

The Figure 13 series shows the statistics for an online shopping session with 500 clients total. The system under test is 8-12 percent utilized on average when running 500 clients.

Figure 13a. Performance in requests per second
Figure 13b. Latency in milliseconds
Figure 13c. Average power in watts
Figure 13d. Performance per watt

Figure 13a shows the performance of the shopping session in requests per second. You can see that all three governors have almost exactly the same number of requests per second.

There is a slight difference in average latency as you can see in Figure 13b. The conservative governor has a latency of almost 7ms more than the performance governor, but for an application such as an online shopping cart, most users will not notice a few extra milliseconds, so this difference may be considered negligible.

Figure 13c shows the average power consumption. You can see that the conservative governor saves about 20W over the performance governor (that is, over no processor frequency scaling), and the ondemand governor saves about 15W on average.

Figure 13d shows the performance per watt, calculated as the number of requests per second divided by the average power consumed. The governors' similar performance and the power savings from the two dynamic governors translate into a higher performance per watt. The conservative governor is the most power-efficient for a load of 8-12 percent utilization, closely followed by the ondemand governor.
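The performance-per-watt comparison can be reproduced for your own measurements with a few lines of code. This is a minimal sketch; the function names are hypothetical and the numbers are placeholders chosen only to show the calculation, not the values measured in these runs.

```python
def performance_per_watt(requests_per_second: float, average_watts: float) -> float:
    """Requests per second divided by average power."""
    if average_watts <= 0:
        raise ValueError("average power must be positive")
    return requests_per_second / average_watts


def rank_governors(measurements: dict) -> list:
    """Return (governor, performance per watt) pairs, most efficient first."""
    scored = {
        name: performance_per_watt(rps, watts)
        for name, (rps, watts) in measurements.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Placeholder data (NOT this article's results): governor -> (requests/s, watts)
placeholder_runs = {
    "performance": (100.0, 250.0),
    "ondemand": (100.0, 235.0),
    "conservative": (100.0, 230.0),
}

ranking = rank_governors(placeholder_runs)
```

Because the placeholder request rates are identical, the ranking here is decided entirely by power draw, which mirrors the situation the text describes for the lightest load.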

Next we'll compare the performance per watt for each of the three default governors for some larger loads to see whether the default conservative governor still has better power efficiency than the other two default governors. Figure 14 shows a load of 1,000 clients, which results in an average utilization of around 20-25 percent.

Figure 14. Default governor comparison for 1,000 clients

From this chart you can see that the conservative governor is the most power-efficient governor for this load as well. For this run, the conservative governor drew about 25W less than the performance governor while still serving almost exactly the same number of requests per second. The conservative governor's average request latency was about 5ms higher than that of the other two governors for this load.

Last, let's look at the performance per watt for a load of 2,000 clients in Figure 15, which pushes the system under test to around 45-60 percent utilization on average.

Figure 15. Default governor comparison for 2,000 clients

For this load, the default ondemand governor had a slightly better performance per watt. The ondemand and conservative governors both saved about 15W here, but the default conservative governor took a performance hit since it completed about 8 fewer requests per second than the others and had a latency of about 0.15 seconds more than the performance governor. The ondemand governor won here since it achieved virtually the same number of requests per second with a latency of only 50ms more than the performance governor, which of course represents what the system would achieve without any processor frequency scaling at all.

Tuned governors comparison

Now we'll compare how the tuned ondemand and conservative governors behave with this workload. Again, the tuning of the governors was achieved by changing the utilization thresholds. The tuned conservative governor had its up_threshold set to 98 and the down_threshold set to 95. The tuned ondemand governor was running with an up_threshold of 98 as well. We'll look at the effects of the tuned governor on a heavier load of 2,000 clients (45-60 percent utilization on average) in the Figure 16 series.

Lighter loads do not show much of a difference, because the tuned governors act the same as the default governors for all loads under 20 percent utilization since that is the default down_threshold. The tunables sched_mc_power_savings and sched_smt_power_savings were off for these runs.

Figure 16a. Performance in requests per second
Figure 16b. Latency in milliseconds
Figure 16c. Average power in watts
Figure 16d. Performance per watt

Figures 16a and 16b show that the tuned conservative governor took a slight performance hit of about 13 fewer requests per second and a higher latency of about 0.28 seconds more than the performance governor; however, you can see from Figure 16c that the tuned conservative governor achieved a significant power savings of about 55W over no processor scaling. Even with the slight performance hit, the tuned conservative governor was the most power-efficient by far.


Conclusion

In this 3-part series, I've shown that in most cases, a tuned conservative governor with an up_threshold of 98 and a down_threshold of 95 achieves the best performance-to-power efficiency. In some cases, this governor can have a slight effect on performance.

You must decide whether the possible effect on performance is worth achieving potentially significant power savings. As I discussed, there are many tunables for the dynamic in-kernel governors that you can adjust to affect the performance of the governor, which in turn can affect the performance of the workload running.

As always, there is a tradeoff between power savings and performance, but I hope I've shown you how to reduce the effects on performance to a negligible degree while getting a better power efficiency from the system.

Resources

Learn

Get products and technologies

  • With IBM trial software, available for download directly from developerWorks, build your next development project on Linux.
