linux-kernel.vger.kernel.org archive mirror
From: Morten Rasmussen <morten.rasmussen@arm.com>
To: David Lang <david@lang.hm>
Cc: Ingo Molnar <mingo@kernel.org>,
	"alex.shi@intel.com" <alex.shi@intel.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"preeti@linux.vnet.ibm.com" <preeti@linux.vnet.ibm.com>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"efault@gmx.de" <efault@gmx.de>,
	"pjt@google.com" <pjt@google.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linaro-kernel@lists.linaro.org" <linaro-kernel@lists.linaro.org>,
	"arjan@linux.intel.com" <arjan@linux.intel.com>,
	"len.brown@intel.com" <len.brown@intel.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	Catalin Marinas <Catalin.Marinas@arm.com>
Subject: Re: power-efficient scheduling design
Date: Wed, 19 Jun 2013 13:39:33 +0100	[thread overview]
Message-ID: <20130619123933.GG5460@e103034-lin> (raw)
In-Reply-To: <alpine.DEB.2.02.1306181032160.9258@nftneq.ynat.uz>

On Tue, Jun 18, 2013 at 06:39:27PM +0100, David Lang wrote:
> On Tue, 18 Jun 2013, Morten Rasmussen wrote:
> 
> >> I don't think that you are passing nearly enough information around.
> >>
> >> A fairly simple example:
> >>
> >> Take a relatively modern 4-core system with turbo mode, where speed controls
> >> affect two cores at a time (I don't know the details of the available CPUs
> >> well enough to say whether this exactly matches any existing system, but I
> >> think it's a reasonable fit).
> >>
> >> If you are running with a loadavg of 2, should you power down 2 cores and run
> >> the other two in turbo mode, power down 2 cores and not increase the speed, or
> >> leave all 4 cores running as is?
> >>
> >> Depending on the mix of processes, I could see any one of the three being the
> >> right answer.
> >>
> >> If you have a process that's maxing out its cpu time on one core, going to
> >> turbo mode is the right thing, as the other processes should fit on the other
> >> core and that process will use more CPU (theoretically getting done sooner).
> >>
> >> If no process is close to maxing out a core, then if you are in power-saving
> >> mode, you probably want to shut down two cores and run everything on the other
> >> two.
> >>
> >> If you only have two processes eating almost all your CPU time, going to two
> >> cores is probably the right thing to do.
> >>
> >> If you have more processes, each eating a little bit of time, then continuing
> >> to run on all four cores uses more cache, and could let all of the tasks finish
> >> faster.
> >>
> >>
> >> So, how is the Power Scheduler going to get this level of information?
> >>
> >> It doesn't seem reasonable either to pass this much data around, or to try to
> >> give two independent tools access to the same raw data (since that data is so
> >> tied to the internal details of the scheduler). If we are talking about two
> >> parts of the same thing, then it's perfectly legitimate to have this sort of
> >> intimate knowledge of the internal data structures.
> >
> > I realize that my description is not very clear on this point. Total
> > load is clearly not enough information for the power scheduler to make
> > any reasonable decisions. By current load, I mean per-cpu load, number
> > of tasks, and possibly more task statistics. Enough information to
> > determine the best use of the system cpus.
> >
> > As stated in my previous reply, this is not the ultimate design. I
> > expect to have many design iterations. If it turns out that it doesn't
> > make sense to have a separate power scheduler, then we should merge
> > them. I just propose to divide the design into manageable components. A
> > unified design covering the scheduler, two other policy frameworks, and
> > new policies is too complex in my opinion.
> >
> > The power scheduler may be viewed as an external extension to the
> > periodic scheduler load balance. I don't see a major problem in
> > accessing raw data in the scheduler. The power scheduler will live in
> > sched/power.c. In a unified solution where you put everything into
> > sched/fair.c you would still need access to the same raw data to make
> > the right power scheduling decisions. By having the power scheduler
> > separately we just attempt to minimize the entanglement.
> 
> Why insist on this being treated as an external component that you have to pass 
> messages to?
> 
> If you allow it to be combined, then it can look up the info it needs rather 
> than trying to define an API between the two that accounts for everything that 
> you need to know (now and in the future).

I don't see why you cannot read the internal scheduler data structures
from the power scheduler (with appropriate attention to locking). The
point of the proposed design is not to define interfaces, it is to divide
the problem into manageable components.

Let me repeat: if, while developing the solution, we find that the
separation doesn't make sense, I have no problem merging them. I don't
insist on the separation; my point is that we need to partition this
very complex problem and let it evolve into a reasonable solution.

> 
> This will mean that as the internals of one change it will affect the internals 
> of the other, but it seems like this is far more likely to be successful.
> 

That is no different from having a merged design. If you change
something in the scheduler you would have to consider all the power
implications anyway. The power scheduler design would give you at least
a vague separation and the possibility of not having a power scheduler
at all.

> If you have hundreds or thousands of processes, it's bad enough to look up the 
> data directly, but trying to marshal the information to send it to a separate 
> component seems counterproductive.

I don't see why that should be necessary.

Morten

