From: Morten Rasmussen <morten.rasmussen@arm.com>
To: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Preeti U Murthy <preeti@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Len Brown <len.brown@intel.com>,
	Preeti Murthy <preeti.lkml@gmail.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	LKML <linux-kernel@vger.kernel.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	Lists linaro-kernel <linaro-kernel@lists.linaro.org>
Subject: Re: [RFC PATCH 3/3] idle: store the idle state index in the struct rq
Date: Mon, 3 Feb 2014 12:54:41 +0000
Message-ID: <20140203125441.GD19029@e103034-lin>
In-Reply-To: <alpine.LFD.2.11.1401311236470.2312@knanqh.ubzr>

On Fri, Jan 31, 2014 at 06:19:26PM +0000, Nicolas Pitre wrote:
> Right now (on ARM at least but I imagine this is pretty universal), the 
> biggest impact on information accuracy for a CPU depends on what the 
> other CPUs are doing.  The most obvious example is cluster power down.  
> For a cluster to be powered down, all the CPUs sharing this cluster must 
> also be powered down.  And all those CPUs must have agreed to a possible 
> cluster power down in advance as well.  But it is not because an idle 
> CPU has agreed to the extra latency imposed by a cluster power down that 
> the cluster has actually powered down since another CPU in that cluster 
> might still be running, in which case the recorded latency information 
> for that idle CPU would be higher than it would be in practice at that 
> moment.
> 
> A cluster should map naturally to a scheduling domain.  If we need to 
> wake up a CPU, it is quite obvious that we should prefer an idle CPU 
> from a scheduling domain which load is not zero.  If the load is not 
> zero then this means that any idle CPU in that domain, even if it 
> indicated it was ready for a cluster power down, will not require the 
> cluster power-up latency as some other CPUs must still be running.  But 
> we already know that of course even if the recorded latency might not 
> say so.
> 
> In other words, the hardware latency information is dynamic of course.  
> But we might not _need_ to have it reflected at the scheduler domain all 
> the time as in this case it can be inferred by the scheduling domain 
> load.

I agree that the existing sched domain hierarchy should be used to
represent the power topology. But it is not clear to me how much we can
say about the C-state of a cpu without checking the load of the entire
cluster every time.

We would need to know which C-states (indexes) are per cpu and which are
per cluster, and ignore the cluster states when the cluster load is
non-zero.

The sched domain load is currently not maintained in the scheduler; it is
only computed when needed. But I guess you could derive the necessary
information from the idle cpu masks.
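
As a rough illustration of the idea above, here is a minimal userspace
sketch of how an effective idle state index could be inferred from an
idle cpu mask. Everything in it is an assumption made for illustration:
struct idle_state, the states[] table, and effective_idle_index() are
hypothetical names, not existing kernel interfaces, and the table
contents are invented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one idle state; not a kernel structure. */
struct idle_state {
	const char *name;
	bool cluster_level;	/* true if the whole cluster must be idle */
};

/* Hypothetical table, ordered from shallowest (index 0) to deepest. */
const struct idle_state states[] = {
	{ "WFI",         false },
	{ "cpu-off",     false },
	{ "cluster-off", true  },
};

/*
 * Return the idle state index a cpu is effectively in.
 *
 * requested:    index the cpu agreed to when it went idle
 * cluster_mask: bitmask of the cpus that belong to the cluster
 * idle_mask:    bitmask of the cpus that are currently idle
 */
int effective_idle_index(int requested, uint32_t cluster_mask,
			 uint32_t idle_mask)
{
	bool cluster_idle = (idle_mask & cluster_mask) == cluster_mask;
	int idx = requested;

	/* Ignore cluster-level states while any cpu in the cluster runs. */
	while (idx > 0 && states[idx].cluster_level && !cluster_idle)
		idx--;

	return idx;
}
```

With a cluster mask of 0x0f, a cpu that requested index 2 is treated as
being at index 1 while the idle mask is 0x0e (cpu 0 still running), and
at index 2 once the idle mask becomes 0x0f. Note that this still needs
to know which states are cluster-level, which is exactly the extra
information mentioned above.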

> 
> Within a scheduling domain it is OK to pick up the best idle CPU by 
> looking at the index as it is best to leave those CPUs ready for a 
> cluster power down set to that state and prefer one which is not.  And a 
> scheduling domain with a load of zero should be left alone if idle CPUs 
> are found in another domain which load is not zero, irrespective of 
> absolute latency information. So all the existing heuristics already in 
> place to optimize cache utilization and so on will make things just work 
> for idle as well.

IIUC, you propose to only use the index when picking an idle cpu inside
an already busy sched domain and leave idle sched domains alone if
possible. It may work for homogeneous SMP systems, but I don't think it
will work for heterogeneous systems like big.LITTLE.

If the little cluster has zero load and the big has stuff running, it
doesn't mean that it is a good idea to wake up another big cpu. It may
be more power efficient to wake up the little cluster. Comparing idle
state index of a big and little cpu won't help us in making that choice
as the clusters may have different idle states and the costs associated
with each state are different.

I'm therefore not convinced that the idle state index is the right thing
to give the scheduler. Using a cost metric would be better in my
opinion.
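
To make the difference concrete, here is a minimal sketch of choosing
an idle cpu by cost rather than by index. It is only an illustration
under stated assumptions: struct idle_cost, struct idle_cpu,
cheapest_idle_cpu(), and both cost tables are hypothetical, and the
numbers are invented rather than measured.

```c
#include <stddef.h>

/* Hypothetical per-state wake-up cost; units are arbitrary. */
struct idle_cost {
	unsigned int exit_latency_us;
	unsigned int wakeup_energy;
};

/* Invented example tables; big and little have different idle states. */
const struct idle_cost big_costs[] = {
	{ 10, 50 }, { 500, 400 },
};
const struct idle_cost little_costs[] = {
	{ 5, 10 }, { 300, 80 }, { 1000, 150 },
};

struct idle_cpu {
	int cpu;
	int state_index;		/* index into this cpu's own table */
	const struct idle_cost *table;	/* per-cluster cost table */
};

/*
 * Return the id of the idle cpu that is cheapest to wake up, comparing
 * wake-up energy first and exit latency as a tie-breaker.
 * Returns -1 if the list is empty.
 */
int cheapest_idle_cpu(const struct idle_cpu *cpus, size_t n)
{
	const struct idle_cost *best_cost = NULL;
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		const struct idle_cost *c =
			&cpus[i].table[cpus[i].state_index];

		if (!best_cost ||
		    c->wakeup_energy < best_cost->wakeup_energy ||
		    (c->wakeup_energy == best_cost->wakeup_energy &&
		     c->exit_latency_us < best_cost->exit_latency_us)) {
			best = cpus[i].cpu;
			best_cost = c;
		}
	}

	return best;
}
```

With these invented tables, a big cpu at index 1 looks shallower than a
little cpu at index 2, yet the little cpu is the cheaper one to wake up
in terms of energy. A raw index comparison across clusters cannot see
that.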

Morten


