All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Yuyang Du <yuyang.du@intel.com>
Cc: Dirk Brandewie <dirk.brandewie@gmail.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>,
	"preeti@linux.vnet.ibm.com" <preeti@linux.vnet.ibm.com>,
	Dietmar Eggemann <Dietmar.Eggemann@arm.com>,
	len.brown@intel.com, jacob.jun.pan@linux.intel.com
Subject: Re: [RFC PATCH 06/16] arm: topology: Define TC2 sched energy and provide it to scheduler
Date: Tue, 10 Jun 2014 12:16:22 +0200	[thread overview]
Message-ID: <20140610101622.GB6758@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <20140607232628.GC22261@intel.com>

[-- Attachment #1: Type: text/plain, Size: 3287 bytes --]

On Sun, Jun 08, 2014 at 07:26:29AM +0800, Yuyang Du wrote:
> Ok. I think we understand each other. But one more thing, I said P ~ V^3,
> because P ~ V^2*f and f ~ V, so P ~ V^3. Maybe some frequencies share the same
> voltage, but you can still safely assume V changes with f in general, and it
> will be more and more so, since we do need finer control over power consumption.

I didn't know the frequency part was proportionate to another voltage
term, ok, then the cubic term makes sense.

> > Sure, but realize that we must fully understand this governor and
> > integrate it in the scheduler if we're to attain the goal of IPC/watt
> > optimized scheduling behaviour.
> > 
> 
> Attain the goal of IPC/watt optimized?
> 
> I don't see how it can be done like this. As I said, what is unknown for
> prediction is perf scaling *and* changing workload. So the challenge for pstate
> control is in both. But I see more chanllenge in the changing workload than
> in the performance scaling or the resulting IPC impact (if workload is
> fixed).

But for the scheduler the workload change isn't that big a problem; we
know the history of each task, we know when tasks wake up and when we
move them around. Therefore we can fairly accurately predict this.

And given a simple P state model (like ARM) where the CPU simply does
what you tell it to, that all works out. We can change P-state at task
wakeup/sleep/migration and compute the most efficient P-state, and task
distribution, for the new task-set.

> Currently, all freq governors take CPU utilization (load%) as the indicator
> (target), which can server both: workload and perf scaling.

So the current cpufreq stuff is terminally broken in too many ways; its
sampling, so it misses a lot of changes, its strictly cpu local, so it
completely misses SMP information (like the migrations etc..)

If we move a 50% task from CPU1 to CPU0, a sampling thing takes time to
adjust on both CPUs, whereas if its scheduler driven, we can instantly
adjust and be done, because we _know_ what we moved.

Now some of that is due to hysterical raisins, and some of that due to
broken hardware (hardware that needs to schedule in order to change its
state because its behind some broken bus or other). But we should
basically kill off cpufreq for anything recent and sane.

> As for IPC/watt optimized, I don't see how it can be practical. Too micro to
> be used for the general well-being?

What other target would you optimize for? The purpose here is to build
an energy aware scheduler, one that schedules tasks so that the total
amount of energy, for the given amount of work, is minimal.

So we can't measure in Watt, since if we forced the CPU into the lowest
P-state (or even C-state for that matter) work would simply not
complete. So we need a complete energy term.

Now. IPC is instructions/cycle, Watt is Joule/second, so IPC/Watt is

instructions   second
------------ * ------ ~ instructions / joule
  cycle        joule

Seeing how both cycles and seconds are time units.

So for any given amount of instructions, the work needs to be done, we
want the minimal amount of energy consumed, and IPC/Watt is the natural
metric to measure this over an entire workload.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

  parent reply	other threads:[~2014-06-10 10:16 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-23 18:16 [RFC PATCH 00/16] sched: Energy cost model for energy-aware scheduling Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 01/16] sched: Documentation for scheduler energy cost model Morten Rasmussen
2014-06-05  8:49   ` Vincent Guittot
2014-06-05 11:35     ` Morten Rasmussen
2014-06-05 15:02       ` Vincent Guittot
2014-05-23 18:16 ` [RFC PATCH 02/16] sched: Introduce CONFIG_SCHED_ENERGY Morten Rasmussen
2014-06-08  6:03   ` Henrik Austad
2014-06-09 10:20     ` Morten Rasmussen
2014-06-10  9:39       ` Peter Zijlstra
2014-06-10 10:06         ` Morten Rasmussen
2014-06-10 10:23           ` Peter Zijlstra
2014-06-10 11:17             ` Henrik Austad
2014-06-10 12:19               ` Peter Zijlstra
2014-06-10 11:24             ` Morten Rasmussen
2014-06-10 12:24               ` Peter Zijlstra
2014-06-10 14:41                 ` Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 03/16] sched: Introduce sd energy data structures Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 04/16] sched: Allocate and initialize sched energy Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 05/16] sched: Add sd energy procfs interface Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 06/16] arm: topology: Define TC2 sched energy and provide it to scheduler Morten Rasmussen
2014-05-30 12:04   ` Peter Zijlstra
2014-06-02 14:15     ` Morten Rasmussen
2014-06-03 11:41       ` Peter Zijlstra
2014-06-04 13:49         ` Morten Rasmussen
2014-06-03 11:44   ` Peter Zijlstra
2014-06-04 15:42     ` Morten Rasmussen
2014-06-04 16:16       ` Peter Zijlstra
2014-06-06 13:15         ` Morten Rasmussen
2014-06-06 13:43           ` Peter Zijlstra
2014-06-06 14:29             ` Morten Rasmussen
2014-06-12 15:05               ` Vince Weaver
2014-06-03 11:50   ` Peter Zijlstra
2014-06-04 16:02     ` Morten Rasmussen
2014-06-04 17:27       ` Peter Zijlstra
2014-06-04 21:56         ` Rafael J. Wysocki
2014-06-05  6:52           ` Peter Zijlstra
2014-06-05 15:03             ` Dirk Brandewie
2014-06-05 20:29               ` Yuyang Du
2014-06-06  8:05                 ` Peter Zijlstra
2014-06-06  0:35                   ` Yuyang Du
2014-06-06 10:50                     ` Peter Zijlstra
2014-06-06 12:13                       ` Ingo Molnar
2014-06-06 12:27                         ` Ingo Molnar
2014-06-06 14:11                           ` Morten Rasmussen
2014-06-07  2:33                           ` Nicolas Pitre
2014-06-09  8:27                             ` Morten Rasmussen
2014-06-09 13:22                               ` Nicolas Pitre
2014-06-11 11:02                                 ` Eduardo Valentin
2014-06-11 11:42                                   ` Morten Rasmussen
2014-06-11 11:43                                     ` Eduardo Valentin
2014-06-11 13:37                                       ` Morten Rasmussen
2014-06-07 23:53                         ` Yuyang Du
2014-06-07 23:26                       ` Yuyang Du
2014-06-09  8:59                         ` Morten Rasmussen
2014-06-09  2:15                           ` Yuyang Du
2014-06-10 10:16                         ` Peter Zijlstra [this message]
2014-06-10 17:01                           ` Nicolas Pitre
2014-06-10 18:35                           ` Yuyang Du
2014-06-06 16:27                     ` Jacob Pan
2014-06-06 13:03         ` Morten Rasmussen
2014-06-07  2:52         ` Nicolas Pitre
2014-05-23 18:16 ` [RFC PATCH 07/16] sched: Introduce system-wide sched_energy Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 08/16] sched: Introduce SD_SHARE_CAP_STATES sched_domain flag Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 09/16] sched, cpufreq: Introduce current cpu compute capacity into scheduler Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 10/16] sched, cpufreq: Current compute capacity hack for ARM TC2 Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 11/16] sched: Energy model functions Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 12/16] sched: Task wakeup tracking Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 13/16] sched: Take task wakeups into account in energy estimates Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 14/16] sched: Use energy model in select_idle_sibling Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 15/16] sched: Use energy to guide wakeup task placement Morten Rasmussen
2014-05-23 18:16 ` [RFC PATCH 16/16] sched: Disable wake_affine to broaden the scope of wakeup target cpus Morten Rasmussen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140610101622.GB6758@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=Dietmar.Eggemann@arm.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=dirk.brandewie@gmail.com \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=morten.rasmussen@arm.com \
    --cc=preeti@linux.vnet.ibm.com \
    --cc=rjw@rjwysocki.net \
    --cc=vincent.guittot@linaro.org \
    --cc=yuyang.du@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.