Workloads and governor effects
Power efficiency is an important consideration for anyone concerned with business costs or environmental issues. In this final article in the series, let's look at the difference in power efficiency (in real numbers and charts) that you get from tuning the Linux CPUfreq subsystem and in-kernel governors to change the processor's operating frequency without having a major impact on performance.
In Part 2, you saw how to use and tune the governors, so now you'll see some governor effects. I use two popular workloads to compare performance and power consumption and show how a tuned governor can provide power savings without sacrificing performance:
- A workload from the SPECpower_ssj2008 benchmark that evaluates both power and performance
- A workload from an e-commerce shopping application that gathers many statistics during a simulated online shopping session, including latency times and the number of requests per second
These comparisons were made on an IBM System x® 3650 running Red Hat Enterprise Linux 5.2.
The following results are from the SPECpower_ssj2008 benchmark that evaluates both power and performance. To find out more about this benchmark or see the latest official benchmark results, see the SPEC Web site (see Resources for a link). Note that these results are not tuned for optimal performance and should not be considered official benchmark results for the system, but rather results obtained for research purposes.
SPECpower_ssj2008 uses a Java™ benchmark to get a performance score in the unit ssj_ops (ssj operations) and runs the benchmark at loads from 100 percent down to idle. The higher this score, the more the system can compute.
SPECpower_ssj2008 also measures power in watts and calculates a performance-to-power ratio at each of the loads. The higher the ratio, the better the system's performance-to-power efficiency.
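To make the ratio concrete, here is a minimal sketch of the per-load calculation as described above: the score in ssj_ops divided by the average power in watts. The function names and the sample numbers are placeholders for illustration only; they are not measurements from these runs and not part of the benchmark's own tooling.

```python
def performance_to_power(ssj_ops, watts):
    """Return the performance-to-power ratio (ssj_ops per watt) for one load level."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return ssj_ops / watts


def ratios_by_load(samples):
    """Map each target load (percent) to its ratio.

    samples maps load percent -> (ssj_ops, average watts).
    """
    return {load: performance_to_power(ops, watts)
            for load, (ops, watts) in samples.items()}


if __name__ == "__main__":
    # Placeholder values, NOT results from the article.
    samples = {100: (200000.0, 250.0), 50: (100000.0, 200.0), 10: (20000.0, 160.0)}
    for load, ratio in sorted(ratios_by_load(samples).items(), reverse=True):
        print(f"{load:3d}% load: {ratio:8.1f} ssj_ops/W")
```

A governor that lowers power at a given load without lowering the score raises the ratio for that load, which is what the ratio charts in this article compare.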
Figure 1 compares the effects of the five in-kernel governors, all running
with their default settings. The tunables
sched_mc_power_savings and
sched_smt_power_savings were off and the CPU
frequency daemon cpuspeed was running in
conjunction with the userspace governor.
Figure 1. Score and power consumption for default
The dotted lines show the score in ssj_ops, a SPECpower_ssj2008 performance metric. As you can see, the only governor that causes a major drop in performance is the powersave governor. This is, of course, because the powersave governor statically sets the processor frequency to the lowest available to save as much power as possible.
The solid lines show the power consumption. Again, the powersave governor uses less power than the others, but at the expense of performance. You can also see the difference in idle power between the governors. The performance governor always runs at the highest frequency and therefore draws around 10 watts more at idle than the other governors. The userspace governor, with the cpuspeed daemon, seems to be the best of the default governors at providing power savings without hurting performance. We can confirm this by comparing the performance-to-power ratios for each run in Figure 2.
Figure 2. Performance-to-power ratio for default
The performance-to-power ratio is a metric calculated by SPECpower_ssj2008 to highlight how power-efficient a system is by comparing the score received to the amount of power used to achieve that score, so the higher the ratio the better.
As you can see, the userspace governor, in conjunction with the
cpuspeed daemon, has a better
performance-to-power ratio than the others for most of the loads when the
governors are running their default configurations for this setup;
therefore, the userspace governor is more power efficient.
As discussed earlier in Part 2, there are some optional tuning parameters for the ondemand and conservative governors. Here we will discuss how changing the utilization thresholds can affect the governor's power efficiency.
Ondemand
The ondemand up_threshold
is set to 80 by default, meaning that once the CPU utilization reaches
above 80 percent, the governor will increase the frequency. Here I'll show
how you can tune the governor to be more power efficient simply by
changing the up_threshold to 98.
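As a rough sketch of how that change could be scripted, the following Python function writes a new up_threshold to the ondemand tunable in sysfs. The paths are an assumption: kernels of the era used here expose the file per CPU under cpuN/cpufreq/ondemand/, while later kernels may expose a single global file instead, so the sketch tries both. The function name and the sysfs_cpu parameter are hypothetical, and writing to the real sysfs tree requires root privileges and the ondemand governor to be active.

```python
from pathlib import Path

# Assumed root of the CPU sysfs tree; layout below it varies by kernel version.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def set_ondemand_up_threshold(value, sysfs_cpu=SYSFS_CPU):
    """Write `value` to every ondemand up_threshold file found; return the paths written."""
    if not 1 <= value <= 100:
        raise ValueError("up_threshold is a percentage between 1 and 100")
    # Per-CPU layout (assumed for older kernels).
    targets = sorted(sysfs_cpu.glob("cpu[0-9]*/cpufreq/ondemand/up_threshold"))
    # Global layout (assumed for newer kernels).
    global_file = sysfs_cpu / "cpufreq" / "ondemand" / "up_threshold"
    if global_file.exists():
        targets.append(global_file)
    for tunable in targets:
        # The kernel may still reject values outside its own allowed range.
        tunable.write_text(f"{value}\n")
    return targets
```

Calling set_ondemand_up_threshold(98) as root would apply the tuned value discussed here; an empty return list means no ondemand tunable was found at either assumed location.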
Figure 3 compares the ondemand governor's effectiveness running the default configuration, with an up_threshold of 80, versus the tuned ondemand governor, running with an up_threshold of 98. The tunables sched_mc_power_savings and sched_smt_power_savings were off during these runs.
Figure 3. Score and power consumption for ondemand
As you can see by the dotted lines, the default and tuned ondemand governors achieved very similar scores, so changing the up_threshold does not show any performance impact. The solid lines, which show power consumption, do show a slight difference: raising the up_threshold to 98 results in slightly lower power consumption than using the default threshold.
Next, let's look at the performance-to-power ratios in Figure 4.
Figure 4. Performance-to-power ratio for ondemand
Here you can see that for almost every load, the tuned ondemand governor with an up_threshold of 98 is slightly more power efficient than the default ondemand governor.
Conservative
The conservative governor has two threshold
values that can be tuned:
- First, the up_threshold is set to 80 by default, meaning that once the processor utilization reaches above 80 percent, the governor increases the frequency.
- Also, there is a down_threshold, which is set to 20 by default. This means that once the governor finds the processors to be less than 20 percent utilized, it will start stepping down the frequency to save power.
I'll demonstrate how you can tune the conservative governor to be more
power efficient simply by changing the
up_threshold to 98 and the
down_threshold to 95. This is fairly aggressive
tuning for the governor, but I'll show that the aggressively tuned
conservative governor is more power efficient.
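To see why these thresholds count as aggressive, consider the following simplified model of the two-threshold decision described above. It is a sketch, not the kernel's implementation: it treats utilization as a fixed input, assumes a step of 5 percent of the maximum frequency, and uses a hypothetical function name and hypothetical frequencies.

```python
def conservative_next_khz(util_percent, current_khz, min_khz, max_khz,
                          up_threshold=80, down_threshold=20,
                          freq_step_percent=5):
    """Return the next frequency under a simplified conservative-style policy."""
    if not down_threshold < up_threshold:
        raise ValueError("down_threshold must be below up_threshold")
    # Assumed step size: a fixed percentage of the maximum frequency.
    step_khz = max_khz * freq_step_percent // 100
    if util_percent > up_threshold:
        return min(max_khz, current_khz + step_khz)   # step up, capped at max
    if util_percent < down_threshold:
        return max(min_khz, current_khz - step_khz)   # step down, floored at min
    return current_khz                                # between thresholds: hold


if __name__ == "__main__":
    # Hypothetical 2.0-3.0 GHz processor at 50 percent utilization.
    args = dict(util_percent=50, current_khz=3000000,
                min_khz=2000000, max_khz=3000000)
    print("default:", conservative_next_khz(**args))
    print("tuned:  ", conservative_next_khz(up_threshold=98,
                                            down_threshold=95, **args))
```

In this model, a 50 percent load sits between the default thresholds, so the default configuration holds its frequency, while the tuned configuration keeps stepping down for any utilization below 95 percent.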
Figure 5 compares the conservative governor's effectiveness running the
default configuration, an up_threshold of 80
and a down_threshold of 20, versus the tuned
conservative governor running with an
up_threshold of 98 and a
down_threshold of 95. The tunables
sched_mc_power_savings and
sched_smt_power_savings were off during these
runs.
Figure 5. Score and power consumption for conservative
Again, the dotted lines show there is no performance impact when using an aggressively tuned governor. The solid lines show the difference in power consumption between the default and tuned governor; it is very clear that the tuned governor pulls much less power at the middle loads, up to about 40 watts less at the 50 percent load. This is a significant power savings. You can confirm these observations by comparing the performance-to-power ratios in Figure 6.
Figure 6. Performance-to-power ratio for conservative
The ratios show that the tuned conservative governor increases its power efficiency over the default conservative governor for the 30 through 90 percent loads.
In this section, I'll compare the tuned ondemand and conservative
governors to the other three governors. Figure 7 compares all five
governors with ondemand and conservative threshold tuning. The tunables
sched_mc_power_savings and
sched_smt_power_savings were off, and the CPU
frequency daemon cpuspeed was running in conjunction with the userspace
governor.
Figure 7. Score and power consumption for tuned governors
Here you can see again what a big performance hit the powersave governor takes, since it only runs at the lowest possible frequency, although it does consume much less power than the others. (We'll look at the powersave governor's power efficiency compared to the others in the next graph.) The other four governors achieved a similar score regardless of their tuning.
Again, the performance governor only runs at the highest available frequency, and you can see how great a difference this makes in power consumption by comparing the solid lines. The userspace governor, running with the cpuspeed daemon, and the tuned conservative governor draw the least power after the powersave governor. The userspace governor appears to consume slightly less power than the tuned conservative governor around the 30 to 50 percent loads, and the tuned conservative governor draws less power for the loads above 50 percent. We can see which one has the better power efficiency by comparing their performance-to-power ratios in Figure 8.
Figure 8. Performance-to-power ratio for tuned governors
From this graph you can see that the power efficiency of the tuned conservative governor and that of the userspace governor running cpuspeed are very similar. The final SPECpower_ssj2008 score, not shown here, indicates that the tuned conservative governor has the best overall power efficiency, but only by a very small margin.
sched_mc_power_savings comparison
As I discussed earlier, the
sched_mc_power_savings tunable attempts to
consolidate processes to as few cores as possible in order to save power.
Figures 9 and 10 show a comparison of CPU utilization for a run with
sched_mc_power_savings on (1) and off (0),
running with the default conservative governor. The following comparisons
show the utilization for each processor at the 10 percent load, so the
system is on average 10 percent utilized.
Figure 9. sched_mc_power_savings off
Figure 10. sched_mc_power_savings on
You can clearly see the difference in the two graphs. The first graph
(Figure 9) with
sched_mc_power_savings off shows that four of
the processors are running at about 15 percent, and the other four are
running at about 5 percent utilization. The second graph (Figure 10) with
sched_mc_power_savings on shows that the load
was consolidated onto four processors, now at about 20 percent, and the
other four are idle. Using this tunable in conjunction with an in-kernel
CPUfreq governor can reduce power consumption since the consolidation
allows some of the processors to be idle and therefore able to run at a
lower frequency.
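The arithmetic behind the two graphs can be checked with a short sketch. The per-processor values below are the approximate figures quoted in the text (about 15 and 5 percent with the tunable off, about 20 percent and idle with it on); the function name and the 1 percent idle cutoff are assumptions made for illustration.

```python
def summarize(per_cpu_util, idle_below=1.0):
    """Return (average utilization, number of CPUs considered idle)."""
    average = sum(per_cpu_util) / len(per_cpu_util)
    idle = sum(1 for util in per_cpu_util if util < idle_below)
    return average, idle


if __name__ == "__main__":
    mc_off = [15, 15, 15, 15, 5, 5, 5, 5]    # load spread over all eight CPUs
    mc_on = [20, 20, 20, 20, 0, 0, 0, 0]     # load consolidated onto four CPUs
    print("off:", summarize(mc_off))
    print("on: ", summarize(mc_on))
```

Both distributions carry the same total work, so the average stays at the 10 percent load; what changes is how many processors are left idle and free to drop to a lower frequency.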
sched_smt_power_savings comparison
Like sched_mc_power_savings, the sched_smt_power_savings tunable attempts to consolidate work in order to save power, in this case by packing processes onto the hardware threads (hyperthreads) of as few cores as possible. Figures 11 and 12 compare processor utilization for a run with sched_smt_power_savings on (1) and off (0), running with the default conservative governor on a system that supports hyperthreading. The comparisons show the utilization for each processor at the 10 percent load, so the system is, on average, 10 percent utilized.
Figure 11. sched_smt_power_savings off
Figure 12. sched_smt_power_savings on
Again, you can see that the load was consolidated when the setting was on. If the CPUs that are idle or nearly idle can then lower their frequency through a CPUfreq governor, enter idle C-states, or both, this type of scheduling may yield power savings.
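For reference, here is a minimal sketch of switching either scheduler tunable on (1) or off (0) from Python. It assumes the tunables appear as files directly under /sys/devices/system/cpu, as they do on kernels that provide them; kernels without these tunables will not have the files, in which case the function reports that nothing was written. The function name and the sysfs_cpu parameter are hypothetical, and writing to the real files requires root privileges.

```python
from pathlib import Path

# Assumed location of the scheduler power-savings tunables.
SYSFS_CPU = Path("/sys/devices/system/cpu")
TUNABLES = ("sched_mc_power_savings", "sched_smt_power_savings")


def set_sched_power_savings(name, enabled, sysfs_cpu=SYSFS_CPU):
    """Write 1 or 0 to the named tunable; return False if the kernel lacks it."""
    if name not in TUNABLES:
        raise ValueError(f"unknown tunable: {name}")
    tunable = sysfs_cpu / name
    if not tunable.exists():
        return False          # this kernel does not expose the tunable
    tunable.write_text("1\n" if enabled else "0\n")
    return True
```

For example, set_sched_power_savings("sched_mc_power_savings", True) corresponds to the "on (1)" runs compared above.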
In this section, I'll compare the governor effects on another type of workload. The following results are from an e-commerce shopping application that gathers many statistics during a simulated online shopping session, including latency times and the number of requests per second. This application uses an Apache front end, a PHP implementation, and a MySQL database to create a usable shopping site. Note that these results are not tuned for optimal performance and should not be considered official results for the system. We'll compare the effects on the workload at various utilization loads.
The following graphs compare the effects of two tunable in-kernel
governors, conservative and ondemand, and the performance governor as a
baseline comparison. All governors are running with their default
settings,
and the tunables sched_mc_power_savings and
sched_smt_power_savings were off during these
runs.
The Figure 13 series shows the statistics for an online shopping session with 500 clients total. The system under test is 8-12 percent utilized on average when running 500 clients.
Figure 13a. Performance in requests per second
Figure 13b. Latency in milliseconds
Figure 13c. Average power in watts
Figure 13d. Performance per watt
Figure 13a shows the performance of the shopping session in requests per second. You can see that all three governors have almost exactly the same number of requests per second.
There is a slight difference in average latency as you can see in Figure 13b. The conservative governor has a latency of almost 7ms more than the performance governor, but for an application such as an online shopping cart, most users will not notice a few extra milliseconds, so this difference may be considered negligible.
Figure 13c shows the average power consumption. You can see that the conservative governor saves about 20W over the performance governor (that is, over no processor frequency scaling), and the ondemand governor saves about 15W on average.
Figure 13d shows the performance per watt by taking the number of requests per second divided by the average power consumed. The governors' similar performance and the power savings by the two dynamic governors translate into a higher performance per watt. The conservative governor is the most power-efficient for a load of 8-12 percent utilization, closely followed by the ondemand governor.
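The relationship between a power saving and the resulting performance per watt can be sketched as follows. Because performance per watt is requests per second divided by average power, a saving at unchanged throughput improves the metric by the ratio of the old power to the new power. The function name is hypothetical, and the 200W baseline in the example is a placeholder, since the absolute power figures appear only in the charts.

```python
def efficiency_gain_percent(baseline_watts, saved_watts, throughput_ratio=1.0):
    """Percent change in performance per watt relative to the baseline.

    throughput_ratio is new requests/s divided by baseline requests/s
    (1.0 means throughput is unchanged).
    """
    new_watts = baseline_watts - saved_watts
    if baseline_watts <= 0 or new_watts <= 0:
        raise ValueError("power values must be positive")
    return (throughput_ratio * baseline_watts / new_watts - 1.0) * 100.0


if __name__ == "__main__":
    # Placeholder baseline of 200W; the 20W saving is the figure quoted above.
    print(f"{efficiency_gain_percent(200.0, 20.0):.1f}% better performance per watt")
```

The lower the baseline power, the more a fixed saving in watts improves the metric, which is one reason the same governor can look more or less attractive at different loads.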
Next we'll compare the performance per watt for each of the three default governors for some larger loads to see whether the default conservative governor still has better power efficiency than the other two default governors. Figure 14 shows a load of 1,000 clients, which results in an average utilization of around 20-25 percent.
Figure 14. Default governor comparison for 1,000 clients
From this chart you can see that the conservative governor is the most power-efficient governor for this load as well. For this run, the conservative governor used about 25W less than the performance governor while still serving almost exactly the same number of requests per second. The conservative governor's average request latency was about 5ms higher than that of the other two governors for this load.
Last, let's look at the performance per watt for a load of 2,000 clients in Figure 15, which pushes the system under test to around 45-60 percent utilization on average.
Figure 15. Default governor comparison for 2,000 clients
For this load, the default ondemand governor had a slightly better performance per watt. The ondemand and conservative governors both saved about 15W here, but the default conservative governor took a performance hit since it completed about 8 fewer requests per second than the others and had a latency of about 0.15 seconds more than the performance governor. The ondemand governor won here since it achieved virtually the same number of requests per second with a latency of only 50ms more than the performance governor, which of course represents what the system would achieve without any processor frequency scaling at all.
Now we'll compare how the tuned ondemand and conservative governors behave
with this workload. Again, the tuning of the governors was achieved by
changing the utilization thresholds. The tuned conservative governor had
its up_threshold set to 98 and the
down_threshold set to 95. The tuned ondemand
governor was running with an up_threshold of 98
as well. We'll look at the effects of the tuned governor on a heavier load
of 2,000 clients (45-60 percent utilization on average) in the Figure 16
series.
Lighter loads do not show much of a difference, because the tuned governors
act the same as the default governors for all loads under 20 percent
utilization since that is the default
down_threshold. The tunables
sched_mc_power_savings and
sched_smt_power_savings were off for these
runs.
Figure 16a. Performance in requests per second
Figure 16b. Latency in milliseconds
Figure 16c. Average power in watts
Figure 16d. Performance per watt
Figures 16a and 16b show that the tuned conservative governor took a slight performance hit of about 13 fewer requests per second and a higher latency of about 0.28 seconds more than the performance governor; however, you can see from Figure 16c that the tuned conservative governor achieved a significant power savings of about 55W over no processor scaling. Even with the slight performance hit, the tuned conservative governor was the most power-efficient by far.
In this three-part series, I've shown that in most cases, a tuned conservative governor with an up_threshold of 98 and a down_threshold of 95 achieves the best performance-to-power efficiency. In some cases, this governor can have a slight effect on performance.
You must decide whether the possible effect on performance is worth achieving potentially significant power savings. As I discussed, there are many tunables for the dynamic in-kernel governors that you can adjust to affect the performance of the governor, which in turn can affect the performance of the workload running.
As always, there is a tradeoff between power savings and performance, but I hope I've shown you how to reduce the effects on performance to a negligible degree while getting a better power efficiency from the system.
Learn
- Check out these additional materials on power consumption:
- The tutorial "How to make use of Dynamic Frequency Scaling"
- The tutorial "Enhanced Intel SpeedStep Technology and Demand-Based Switching on Linux"
- The article "Making power policy just work" (on power schedulers)
- The article "CPU frequency scaling in Linux"
- The documentation "Linux CPUfreq Governors" (on CPU frequency and voltage scaling code in the Linux kernel)
- The Gentoo "Power Management Guide" (comes with a caveat — for laptops, so don't apply to servers unless you know what you're doing!)
- The tutorial "How to use CPU frequency scaling (cpufreq)"
- This wiki entry on CPU Frequency Scaling
- The tutorial "Scheduler tunables for multi-socket systems"
- And data on the CPUfreq subsystem from kernel.org
- SPECpower_ssj2008 is the first industry-standard SPEC benchmark that evaluates the power and performance characteristics of volume server class computers. The initial benchmark addresses the performance of server-side Java, and additional workloads are planned. You can see the latest results.
- Review the list of hardware that supports the CPUfreq subsystem.
- Need help rebuilding/rebooting your kernel? Try Kwan Lowe's "Kernel Rebuild Guide."