linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Juri Lelli <juri.lelli@arm.com>,
	Patrick Bellasi <patrick.bellasi@arm.com>,
	Joel Fernandes <joelaf@google.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: [RFC][PATCH 2/2] cpufreq: schedutil: Force max frequency on busy CPUs
Date: Mon, 20 Mar 2017 13:59:58 +0100	[thread overview]
Message-ID: <3315477.nk1LFSfS7V@aspire.rjw.lan> (raw)
In-Reply-To: <CAKfTPtD8bp-mB=9Rjufeyj3weg6T3b1J-o+Sc2Oe2EMGX3zKzQ@mail.gmail.com>

On Monday, March 20, 2017 09:26:34 AM Vincent Guittot wrote:
> On 20 March 2017 at 04:57, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 19-03-17, 14:34, Rafael J. Wysocki wrote:
> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>
> >> The PELT metric used by the schedutil governor underestimates the
> >> CPU utilization in some cases.  The reason for that may be time spent
> >> in interrupt handlers and similar which is not accounted for by PELT.
> 
> Are you sure of the root cause  described above (time stolen by irq
> handler) or is it just a hypotheses ? That would be good to be sure of
> the root cause

No, I'm not sure.  That's why I said "may be". :-)

> Furthermore, IIRC the time spent in irq context is also accounted as
> run time for the running cfs task but not RT and deadline task running
> time

OK

Anyway, the problem is that we have a 100% busy CPU which quite evidently
is not reported as 100% busy by the metric we use.

Now, if I was sure about the root cause, I would fix it rather than suggest
workarounds.

> So I'm not really aligned with the description of your problem: PELT
> metric underestimates the load of the CPU.  The PELT is just about
> tracking CFS task utilization but not whole CPU utilization and
> according to your description of the problem (time stolen by irq),
> your problem doesn't come from an underestimation of CFS task but from
> time spent in something else but not accounted in the value used by
> schedutil

That's fair enough, but the assumption was that PELT would be sufficient for
that.  To me, it clearly isn't, so the assumption was incorrect.

> That would be good to be sure of what is running during this not
> accounted time and find a way to account them

Yes, I agree.

I'm not sure if I can carry out full investigation of that any time soon, however.

I sent this mostly to let everybody know that there was a problem and how it
could be worked around.  That's why it is an RFC.

> 
> >>
> >> That can be easily demonstrated by running kernel compilation on
> >> a Sandy Bridge Intel processor, running turbostat in parallel with
> >> it and looking at the values written to the MSR_IA32_PERF_CTL
> >> register.  Namely, the expected result would be that when all CPUs
> >> were 100% busy, all of them would be requested to run in the maximum
> >> P-state, but observation shows that this clearly isn't the case.
> >> The CPUs run in the maximum P-state for a while and then are
> >> requested to run slower and go back to the maximum P-state after
> >> a while again.  That causes the actual frequency of the processor to
> >> visibly oscillate below the sustainable maximum in a jittery fashion
> >> which clearly is not desirable.
> >>
> >> To work around this issue use the observation that, from the
> >> schedutil governor's perspective, CPUs that are never idle should
> >> always run at the maximum frequency and make that happen.
> >>
> >> To that end, add a counter of idle calls to struct sugov_cpu and
> >> modify cpuidle_idle_call() to increment that counter every time it
> >> is about to put the given CPU into an idle state.  Next, make the
> >> schedutil governor look at that counter for the current CPU every
> >> time before it is about to start heavy computations.  If the counter
> >> has not changed for over SUGOV_BUSY_THRESHOLD time (equal to 50 ms),
> >> the CPU has not been idle for at least that long and the governor
> >> will choose the maximum frequency for it without looking at the PELT
> >> metric at all.
> >
> > Looks like we are fixing a PELT problem with a schedutil Hack :)
> 
> I would not say that it's a PELT problem (at least based on current
> description) but more that we don't track all
> activities of CPU

I generally agree.  PELT does what it does.

However, using PELT the way we do that in schedutil turns out to be problematic.

Thanks,
Rafael

  parent reply	other threads:[~2017-03-20 13:06 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-19 13:21 [PATCH 0/2] cpufreq: schedutil: Fix and optimization Rafael J. Wysocki
2017-03-19 13:30 ` [PATCH 1/2] cpufreq: schedutil: Fix per-CPU structure initialization in sugov_start() Rafael J. Wysocki
2017-03-20  3:28   ` Viresh Kumar
2017-03-20 12:36     ` Rafael J. Wysocki
2017-03-19 13:34 ` [RFC][PATCH 2/2] cpufreq: schedutil: Force max frequency on busy CPUs Rafael J. Wysocki
2017-03-19 21:24   ` Rafael J. Wysocki
2017-03-19 21:42     ` Rafael J. Wysocki
2017-03-20 10:38     ` Peter Zijlstra
2017-03-20 12:31       ` Rafael J. Wysocki
2017-03-20  3:57   ` Viresh Kumar
2017-03-20  8:26     ` Vincent Guittot
2017-03-20 12:34       ` Patrick Bellasi
2017-03-22 23:56         ` Joel Fernandes
2017-03-23 22:08           ` Vincent Guittot
2017-03-25  3:48             ` Joel Fernandes
2017-03-27  6:59               ` Vincent Guittot
2017-03-20 12:59       ` Rafael J. Wysocki [this message]
2017-03-20 13:20         ` Vincent Guittot
2017-03-20 12:48     ` Rafael J. Wysocki
2017-03-20 10:36   ` Peter Zijlstra
2017-03-20 12:35     ` Rafael J. Wysocki
2017-03-20 12:50       ` Peter Zijlstra
2017-03-20 13:04         ` Rafael J. Wysocki
2017-03-20 13:06         ` Patrick Bellasi
2017-03-20 13:05           ` Rafael J. Wysocki
2017-03-20 14:13             ` Patrick Bellasi
2017-03-20 21:46   ` [RFC][PATCH v2 2/2] cpufreq: schedutil: Avoid decreasing frequency of " Rafael J. Wysocki
2017-03-21  6:40     ` Viresh Kumar
2017-03-21 12:30       ` Rafael J. Wysocki
2017-03-21  8:50     ` Vincent Guittot
2017-03-21 11:56       ` Patrick Bellasi
2017-03-21 13:22       ` Peter Zijlstra
2017-03-21 13:37         ` Vincent Guittot
2017-03-21 14:03           ` Peter Zijlstra
2017-03-21 14:18             ` Vincent Guittot
2017-03-21 14:25             ` Patrick Bellasi
     [not found]             ` <CAKfTPtALorn7HNpz4LOfWWSc3u+9y5iHB5byzfTHGQXDA+tVJQ@mail.gmail.com>
2017-03-21 14:58               ` Peter Zijlstra
2017-03-21 17:00                 ` Vincent Guittot
2017-03-21 17:01                   ` Vincent Guittot
2017-03-21 14:26           ` Rafael J. Wysocki
2017-03-21 14:38             ` Patrick Bellasi
2017-03-21 14:46               ` Rafael J. Wysocki
2017-03-21 14:50                 ` Rafael J. Wysocki
2017-03-21 15:04                 ` Peter Zijlstra
2017-03-21 15:18                   ` Rafael J. Wysocki
2017-03-21 17:00                     ` Peter Zijlstra
2017-03-21 17:17                       ` Rafael J. Wysocki
2017-03-21 15:08                 ` Patrick Bellasi
2017-03-21 15:18                   ` Peter Zijlstra
2017-03-21 19:28                     ` Patrick Bellasi
2017-03-21 15:02             ` Peter Zijlstra
2017-03-21 11:50     ` Patrick Bellasi
2017-03-21 23:08     ` [RFC][PATCH v3 2/2] cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely Rafael J. Wysocki
2017-03-22  9:26       ` Peter Zijlstra
2017-03-22  9:54       ` Viresh Kumar
2017-03-23  1:04       ` Joel Fernandes
2017-03-23 19:26       ` Sai Gurrappadi
2017-03-23 20:48         ` Sai Gurrappadi
2017-03-24  1:39         ` Rafael J. Wysocki
2017-03-24 19:08           ` Sai Gurrappadi
2017-03-25  1:14       ` Sai Gurrappadi
2017-03-25  1:39         ` Rafael J. Wysocki
2017-03-27  7:04         ` Vincent Guittot
2017-03-27 21:01           ` Sai Gurrappadi
2017-03-27 21:11             ` Rafael J. Wysocki
2017-05-08  3:49       ` Wanpeng Li
2017-05-08  4:01         ` Viresh Kumar
2017-05-08  5:15           ` Wanpeng Li
2017-05-08 22:16           ` Rafael J. Wysocki
2017-05-08 22:36             ` Wanpeng Li
2017-05-08 23:01               ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3315477.nk1LFSfS7V@aspire.rjw.lan \
    --to=rjw@rjwysocki.net \
    --cc=joelaf@google.com \
    --cc=juri.lelli@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=patrick.bellasi@arm.com \
    --cc=peterz@infradead.org \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).