From: Steve Muckle <steve.muckle@linaro.org>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steve Muckle <steve.muckle@linaro.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Juri Lelli <Juri.Lelli@arm.com>,
	Patrick Bellasi <patrick.bellasi@arm.com>
Subject: Re: [PATCH 2/2] sched: cpufreq: use rt_avg as estimate of required RT CPU capacity
Date: Wed, 31 Aug 2016 07:49:39 -0700
Message-ID: <20160831144939.GM5599@graphite.smuckle.net>
In-Reply-To: <1779842.1JHXT67au9@vostro.rjw.lan>

On Wed, Aug 31, 2016 at 03:31:07AM +0200, Rafael J. Wysocki wrote:
> On Friday, August 26, 2016 11:40:48 AM Steve Muckle wrote:
> > A policy of going to fmax on any RT activity will be detrimental
> > for power on many platforms. Often RT accounts for only a small amount
> > of CPU activity so sending the CPU frequency to fmax is overkill. Worse
> > still, some platforms may not be able to even complete the CPU frequency
> > change before the RT activity has already completed.
> > 
> > Cpufreq governors have not treated RT activity this way in the past so
> > it is not part of the expected semantics of the RT scheduling class. The
> > DL class offers guarantees about task completion and could be used for
> > this purpose.
> > 
> > Modify the schedutil algorithm to instead use rt_avg as an estimate of
> > RT utilization of the CPU.
> > 
> > Based on previous work by Vincent Guittot <vincent.guittot@linaro.org>.
> 
> If we do it for RT, why not to do a similar thing for DL?  As in the
> original patch from Peter, for example?

Agreed, DL should have a similar change. I think that could be done in a
separate patch. I also would need to discuss it with the deadline sched
devs to fully understand the metric used there.

> 
> > Signed-off-by: Steve Muckle <smuckle@linaro.org>
> > ---
> >  kernel/sched/cpufreq_schedutil.c | 26 +++++++++++++++++---------
> >  1 file changed, 17 insertions(+), 9 deletions(-)
> > 
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index cb8a77b1ef1b..89094a466250 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -146,13 +146,21 @@ static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long util,
> >  
> >  static void sugov_get_util(unsigned long *util, unsigned long *max)
> >  {
> > -	struct rq *rq = this_rq();
> > -	unsigned long cfs_max;
> > +	int cpu = smp_processor_id();
> > +	struct rq *rq = cpu_rq(cpu);
> > +	unsigned long max_cap, rt;
> > +	s64 delta;
> >  
> > -	cfs_max = arch_scale_cpu_capacity(NULL, smp_processor_id());
> > +	max_cap = arch_scale_cpu_capacity(NULL, cpu);
> >  
> > -	*util = min(rq->cfs.avg.util_avg, cfs_max);
> > -	*max = cfs_max;
> > +	delta = rq_clock(rq) - rq->age_stamp;
> > +	if (unlikely(delta < 0))
> > +		delta = 0;
> > +	rt = div64_u64(rq->rt_avg, sched_avg_period() + delta);
> > +	rt = (rt * max_cap) >> SCHED_CAPACITY_SHIFT;
> 
> These computations are rather heavy, so I wonder if they are avoidable based
> on the flags, for example?

Yeah, the div is bad. I don't know that we can avoid it based on the
flags because rt_avg will decay during CFS activity and you'd want to
take note of that.

One way to make this a little better is to assume that the divisor,
sched_avg_period() + delta, fits into 32 bits so that div_u64 can be
used, which I believe is less bad. Doing that means placing a
restriction on how large sysctl_sched_time_avg (which determines
sched_avg_period()) can be, a max of 4.2 seconds I think. I don't know
that anyone uses a value that large anyway but there's currently no
limit on it.
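
A minimal user-space sketch of the 32-bit divisor idea above, not kernel
code. The helper names (divisor_fits_u32, div_u64_model, rt_ratio) are
hypothetical; div_u64_model only mirrors the u64-by-u32 shape of the
kernel's div_u64(), and the fallback mirrors div64_u64(). All times are
assumed to be in nanoseconds, so a divisor fits in 32 bits only while
it stays below 2^32 ns, roughly 4.29 seconds.

```c
#include <assert.h>
#include <stdint.h>

/* True when the divisor (in ns) can be passed as a 32-bit value. */
static int divisor_fits_u32(uint64_t divisor_ns)
{
	return divisor_ns <= UINT32_MAX;
}

/* Models the shape of div_u64(): 64-bit dividend, 32-bit divisor. */
static uint64_t div_u64_model(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}

/*
 * Divide rt_avg by (period + delta), taking the cheaper 64-by-32 path
 * when the divisor allows it and the full 64-by-64 path otherwise.
 */
static uint64_t rt_ratio(uint64_t rt_avg, uint64_t period_ns, uint64_t delta_ns)
{
	uint64_t divisor = period_ns + delta_ns;

	assert(divisor != 0);
	if (divisor_fits_u32(divisor))
		return div_u64_model(rt_avg, (uint32_t)divisor);
	return rt_avg / divisor;
}
```

Enforcing an upper bound on sysctl_sched_time_avg would let the fallback
branch go away entirely, which is the restriction discussed above.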

Another option would be just adding another separate metric to track rt
activity that is more mathematically favorable to deal with.

Both of these seemed potentially heavy-handed so I figured I'd just start
with the obvious, if suboptimal, solution...

> Plus is SCHED_CAPACITY_SHIFT actually defined for all architectures?

Yes.

> One more ugly thing is about using rq_clock(rq) directly from here whereas we
> pass it around as the 'time' argument elsewhere.

Sure, I'll clean this up.

> 
> > +
> > +	*util = min(rq->cfs.avg.util_avg + rt, max_cap);
> > +	*max = max_cap;
> >  }
