From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Sai Gurrappadi <sgurrappadi@nvidia.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Linux PM <linux-pm@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Juri Lelli <juri.lelli@arm.com>,
	Patrick Bellasi <patrick.bellasi@arm.com>,
	Joel Fernandes <joelaf@google.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Boonstoppel <pboonstoppel@nvidia.com>
Subject: Re: [RFC][PATCH v3 2/2] cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely
Date: Mon, 27 Mar 2017 23:11:12 +0200
Message-ID: <CAJZ5v0ghzyNG-HL8gnfcPeq=t7sosd6b54rr2gQbm7FpVkUh9Q@mail.gmail.com>
In-Reply-To: <58D97DBE.2090508@nvidia.com>

On Mon, Mar 27, 2017 at 11:01 PM, Sai Gurrappadi <sgurrappadi@nvidia.com> wrote:
> Hi Vincent,
>
> On 03/27/2017 12:04 AM, Vincent Guittot wrote:
>> On 25 March 2017 at 02:14, Sai Gurrappadi <sgurrappadi@nvidia.com> wrote:
>>> Hi Rafael,
>>>
>>> On 03/21/2017 04:08 PM, Rafael J. Wysocki wrote:
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>> The way the schedutil governor uses the PELT metric causes it to
>>>> underestimate the CPU utilization in some cases.
>>>>
>>>> That can be easily demonstrated by running kernel compilation on
>>>> a Sandy Bridge Intel processor, running turbostat in parallel with
>>>> it and looking at the values written to the MSR_IA32_PERF_CTL
>>>> register.  Namely, the expected result would be that when all CPUs
>>>> were 100% busy, all of them would be requested to run in the maximum
>>>> P-state, but observation shows that this clearly isn't the case.
>>>> The CPUs run in the maximum P-state for a while and then are
>>>> requested to run slower and go back to the maximum P-state after
>>>> a while again.  That causes the actual frequency of the processor to
>>>> visibly oscillate below the sustainable maximum in a jittery fashion
>>>> which clearly is not desirable.
>>>>
>>>> That has been attributed to CPU utilization metric updates on task
>>>> migration that cause the total utilization value for the CPU to be
>>>> reduced by the utilization of the migrated task.  If that happens,
>>>> the schedutil governor may see a CPU utilization reduction and will
>>>> attempt to reduce the CPU frequency accordingly right away.  That
>>>> may be premature, though, for example if the system is generally
>>>> busy and there are other runnable tasks waiting to be run on that
>>>> CPU already.
>>>>
>>>
>>> Thinking out loud a bit, I wonder if what you really want to do is basically:
>>>
>>> schedutil_cpu_util(cpu) = max(cpu_rq(cpu)->cfs.util_avg, total_cpu_util_avg);
>>>
>>> where total_cpu_util_avg tracks the average utilization of the CPU itself
>>> over time (the % of time the CPU was busy) in the same PELT-like manner.
>>> The difference here is that it doesn't change instantaneously as tasks
>>> migrate in/out; it decays/accumulates just like the per-entity util_avgs.
>>
>> But then we lose the benefit of the immediate decrease when tasks migrate.
>
> Indeed, this is not ideal.
>
>> Instead of total_cpu_util_avg, we would do better to track RT utilization
>> in the same manner, so that with the ongoing work for deadline we will have:
>> total_utilization = cfs.util_avg + rt's util_avg + deadline's util_avg
>> and we still take advantage of the task migration effect.
>
> I agree that we need better tracking for RT and DL tasks but that doesn't solve the overloaded case with more than one CFS thread sharing a CPU.
>
> In the overloaded case, we care not just about the instant when the
> migration happens but also about the subsequent windows in which the PELT
> metric is slowly ramping up to reflect the real utilization of a task now
> that it has a CPU to itself.
>
> Maybe there are better ways to solve that though :-)

I wonder if it's viable to postpone the utilization update on both the
source and target runqueues until the task has been fully migrated?

That would make the artificial utilization reductions go away.
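
For reference, the drop under discussion and the effect of the max()
suggestion quoted above can be sketched with a rough floating-point model.
This is not kernel code: pelt_step(), simulate_migration() and struct
sim_result are hypothetical names, the kernel uses fixed-point arithmetic,
and a task's share of CPU time is modelled as a smooth fraction rather than
as alternating run/wait periods.  Only the decay factor (y^32 ~= 0.5) and
the 1024 scale are meant to match PELT.

```c
/* Illustrative model only; all names below are assumptions. */
#define UTIL_MAX 1024.0
#define PELT_Y   0.97857206	/* y chosen so that y^32 ~= 0.5 */

struct sim_result {
	double rq_util_before;	/* summed task util just before migration */
	double rq_util_after;	/* summed task util just after migration */
	double cpu_util_after;	/* CPU-level busy average after migration */
	double governor_input;	/* max(rq_util_after, cpu_util_after) */
};

/* One averaging period: decay the old value, add the new contribution. */
static double pelt_step(double avg, double running_fraction)
{
	return avg * PELT_Y + UTIL_MAX * (1.0 - PELT_Y) * running_fraction;
}

static struct sim_result simulate_migration(int warmup_periods)
{
	double task_a = 0.0, task_b = 0.0, cpu_busy = 0.0;
	struct sim_result r;
	int i;

	/*
	 * Two always-runnable tasks share one CPU: each gets half of the
	 * time, while the CPU itself is busy all of the time.
	 */
	for (i = 0; i < warmup_periods; i++) {
		task_a = pelt_step(task_a, 0.5);
		task_b = pelt_step(task_b, 0.5);
		cpu_busy = pelt_step(cpu_busy, 1.0);
	}
	r.rq_util_before = task_a + task_b;

	/* Task B migrates away: its util leaves the runqueue sum at once. */
	r.rq_util_after = task_a;

	/* The CPU-level average is not touched by the migration itself. */
	r.cpu_util_after = cpu_busy;

	r.governor_input = r.rq_util_after > r.cpu_util_after ?
			   r.rq_util_after : r.cpu_util_after;
	return r;
}
```

In this model, after a long enough warm-up the runqueue sum falls to
roughly half of its previous value the moment task B leaves, even though
task A is about to use the whole CPU, whereas the max() stays near the
top of the scale.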

Thanks,
Rafael
