linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Len Brown <lenb@kernel.org>
To: Alex Shi <alex.shi@intel.com>
Cc: mingo@redhat.com, peterz@infradead.org, tglx@linutronix.de,
	akpm@linux-foundation.org, arjan@linux.intel.com, bp@alien8.de,
	pjt@google.com, namhyung@kernel.org, efault@gmx.de,
	morten.rasmussen@arm.com, vincent.guittot@linaro.org,
	gregkh@linuxfoundation.org, preeti@linux.vnet.ibm.com,
	viresh.kumar@linaro.org, linux-kernel@vger.kernel.org,
	len.brown@intel.com, rafael.j.wysocki@intel.com, jkosina@suse.cz,
	clark.williams@gmail.com, tony.luck@intel.com,
	keescook@chromium.org, mgorman@suse.de, riel@redhat.com,
	Linux PM list <linux-pm@vger.kernel.org>
Subject: Re: [patch v7 0/21] sched: power aware scheduling
Date: Thu, 11 Apr 2013 17:02:45 -0400	[thread overview]
Message-ID: <516724F5.20504@kernel.org> (raw)
In-Reply-To: <1365040862-8390-1-git-send-email-alex.shi@intel.com>

On 04/03/2013 10:00 PM, Alex Shi wrote:

> As mentioned in the power aware scheduling proposal, Power aware
> scheduling has 2 assumptions:
> 1, race to idle is helpful for power saving
> 2, less active sched groups will reduce cpu power consumption

linux-pm@vger.kernel.org should be cc:
on Linux proposals that affect power.

> Since the patch can perfect pack tasks into fewer groups, I just show
> some performance/power testing data here:
> =========================================
> $for ((i = 0; i < x; i++)) ; do while true; do :; done  &   done
> 
> On my SNB laptop with 4 core* HT: the data is avg Watts
>          powersaving     performance
> x = 8	 72.9482 	 72.6702
> x = 4	 61.2737 	 66.7649
> x = 2	 44.8491 	 59.0679
> x = 1	 43.225 	 43.0638

> on SNB EP machine with 2 sockets * 8 cores * HT:
>          powersaving     performance
> x = 32	 393.062 	 395.134
> x = 16	 277.438 	 376.152
> x = 8	 209.33 	 272.398
> x = 4	 199 	         238.309
> x = 2	 175.245 	 210.739
> x = 1	 174.264 	 173.603

The numbers above say nothing about performance,
and thus don't tell us much.

In particular, they don't tell us if reducing power
by hacking the scheduler is more or less efficient
than using the existing techniques that are already shipping,
such as controlling P-states.

> tasks number keep waving benchmark, 'make -j <x> vmlinux'
> on my SNB EP 2 sockets machine with 8 cores * HT:
>          powersaving              performance
> x = 2    189.416 /228 23          193.355 /209 24

Energy = Power * Time

189.416*228 = 43186.848 Joules for powersaving to retire the workload
193.355*209 = 40411.195 Joules for performance to retire the workload.

So the net effect of the 'powersaving' mode here is:
1. 228/209 = 9% performance degradation
2. 43186.848/40411.195 = 6.9 % more energy to retire the workload.

These numbers suggest that this patch series simultaneously
has a negative impact on performance and energy required
to retire the workload.  Why do it?

> x = 4    215.728 /132 35          219.69 /122 37

ditto here.
8% increase in time.
6% increase in energy.

> x = 8    244.31 /75 54            252.709 /68 58

ditto here
10% increase in time.
6% increase in energy.

> x = 16   299.915 /43 77           259.127 /58 66

Are you sure that powersave mode ran in 43 seconds
when performance mode ran in 58 seconds?

If that is true, than somewhere in this patch series
you have a _significant_ performance benefit
on this workload under these conditions!

Interestingly, powersave mode also ran at
15% higher power than performance mode.
maybe "powersave" isn't quite the right name for it:-)

> x = 32   341.221 /35 83           323.418 /38 81

Why does this patch series have a performance impact (8%)
at x=32.  All the processors are always busy, no?

> data explains: 189.416 /228 23
> 	189.416: average Watts during compilation
> 	228: seconds(compile time)
> 	23:  scaled performance/watts = 1000000 / seconds / watts
> The performance value of kbuild is better on threads 16/32, that's due
> to lazy power balance reduced the context switch and CPU has more boost 
> chance on powersaving balance.

25% is a huge difference in performance.
Can you get a performance benefit in that scenario
without having a negative performance impact
in the other scenarios?  In particular,
an 8% hit to the fully utilized case is a deal killer.

The x=16 performance change here suggest there is value
someplace in this patch series to increase performance.
However, the case that these scheduling changes are
a benefit from an energy efficiency point of view
is yet to be made.

thanks,
-Len Brown
Intel Open Source Technology Center


  parent reply	other threads:[~2013-04-11 21:02 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-04  2:00 [patch v7 0/21] sched: power aware scheduling Alex Shi
2013-04-04  2:00 ` [patch v7 01/21] Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking" Alex Shi
2013-04-04  2:00 ` [patch v7 02/21] sched: set initial value of runnable avg for new forked task Alex Shi
2013-05-06  3:24   ` Preeti U Murthy
2013-04-04  2:00 ` [patch v7 03/21] sched: add sched balance policies in kernel Alex Shi
2013-04-04  2:00 ` [patch v7 04/21] sched: add sysfs interface for sched_balance_policy selection Alex Shi
2013-04-04  2:00 ` [patch v7 05/21] sched: log the cpu utilization at rq Alex Shi
2013-05-06  3:26   ` Preeti U Murthy
2013-05-06  5:22     ` Alex Shi
2013-05-06 12:03   ` Phil Carmody
2013-05-06 12:35     ` Alex Shi
2013-05-06 21:19   ` Paul Turner
2013-04-04  2:00 ` [patch v7 06/21] sched: add new sg/sd_lb_stats fields for incoming fork/exec/wake balancing Alex Shi
2013-04-04  2:00 ` [patch v7 07/21] sched: move sg/sd_lb_stats struct ahead Alex Shi
2013-04-04  2:00 ` [patch v7 08/21] sched: scale_rt_power rename and meaning change Alex Shi
2013-04-04  2:00 ` [patch v7 09/21] sched: get rq potential maximum utilization Alex Shi
2013-04-04  2:00 ` [patch v7 10/21] sched: add power aware scheduling in fork/exec/wake Alex Shi
2013-04-04  2:00 ` [patch v7 11/21] sched: add sched_burst_threshold_ns as wakeup burst indicator Alex Shi
2013-04-04  2:00 ` [patch v7 12/21] sched: using avg_idle to detect bursty wakeup Alex Shi
2013-04-04  2:00 ` [patch v7 13/21] sched: packing transitory tasks in wakeup power balancing Alex Shi
2013-04-04  2:00 ` [patch v7 14/21] sched: add power/performance balance allow flag Alex Shi
2013-04-04  2:00 ` [patch v7 15/21] sched: pull all tasks from source group Alex Shi
2013-04-04  5:59   ` Namhyung Kim
2013-04-06 11:49     ` Alex Shi
2013-04-04  2:00 ` [patch v7 16/21] sched: no balance for prefer_sibling in power scheduling Alex Shi
2013-05-06  3:26   ` Preeti U Murthy
2013-04-04  2:00 ` [patch v7 17/21] sched: add new members of sd_lb_stats Alex Shi
2013-04-04  2:00 ` [patch v7 18/21] sched: power aware load balance Alex Shi
2013-04-04  2:01 ` [patch v7 19/21] sched: lazy power balance Alex Shi
2013-04-04  2:01 ` [patch v7 20/21] sched: don't do power balance on share cpu power domain Alex Shi
2013-04-08  3:17   ` Preeti U Murthy
2013-04-08  3:25     ` Preeti U Murthy
2013-04-08  4:19       ` Alex Shi
2013-04-04  2:01 ` [patch v7 21/21] sched: make sure select_tas_rq_fair get a cpu Alex Shi
2013-04-11 21:02 ` Len Brown [this message]
2013-04-12  8:46   ` [patch v7 0/21] sched: power aware scheduling Alex Shi
2013-04-12 16:23     ` Borislav Petkov
2013-04-12 16:48       ` Mike Galbraith
2013-04-12 17:12         ` Borislav Petkov
2013-04-14  1:36           ` Alex Shi
2013-04-17 21:53         ` Len Brown
2013-04-18  1:51           ` Mike Galbraith
2013-04-26 15:11           ` Mike Galbraith
2013-04-30  5:16             ` Mike Galbraith
2013-04-30  8:30               ` Mike Galbraith
2013-04-30  8:41                 ` Ingo Molnar
2013-04-30  9:35                   ` Mike Galbraith
2013-04-30  9:49                     ` Mike Galbraith
2013-04-30  9:56                       ` Mike Galbraith
2013-05-17  8:06                         ` Preeti U Murthy
2013-05-20  1:01                           ` Alex Shi
2013-05-20  2:30                             ` Preeti U Murthy
2013-04-14  1:28       ` Alex Shi
2013-04-14  5:10         ` Alex Shi
2013-04-14 15:59         ` Borislav Petkov
2013-04-15  6:04           ` Alex Shi
2013-04-15  6:16             ` Alex Shi
2013-04-15  9:52               ` Borislav Petkov
2013-04-15 13:50                 ` Alex Shi
2013-04-15 23:12                   ` Borislav Petkov
2013-04-16  0:22                     ` Alex Shi
2013-04-16 10:24                       ` Borislav Petkov
2013-04-17  1:18                         ` Alex Shi
2013-04-17  7:38                           ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=516724F5.20504@kernel.org \
    --to=lenb@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@intel.com \
    --cc=arjan@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=clark.williams@gmail.com \
    --cc=efault@gmx.de \
    --cc=gregkh@linuxfoundation.org \
    --cc=jkosina@suse.cz \
    --cc=keescook@chromium.org \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=preeti@linux.vnet.ibm.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).