linux-kernel.vger.kernel.org archive mirror
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Giovanni Gherdovich <ggherdovich@suse.cz>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Borislav Petkov <bp@suse.de>, Len Brown <lenb@kernel.org>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Matt Fleming <matt@codeblueprint.co.uk>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>, Paul Turner <pjt@google.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Quentin Perret <qperret@qperret.net>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Doug Smythies <dsmythies@telus.net>
Subject: Re: [PATCH v2 2/2] cpufreq: intel_pstate: Conditional frequency invariant accounting
Date: Fri, 4 Oct 2019 11:17:37 +0200
Message-ID: <CAJZ5v0gAsd4=LOd0BBJGZgwg2TYUuQP_-FzYXS4k+XK1vfM_3g@mail.gmail.com>
In-Reply-To: <1570179472.30086.4.camel@suse.cz>

On Fri, Oct 4, 2019 at 10:52 AM Giovanni Gherdovich <ggherdovich@suse.cz> wrote:
>
> On Fri, 2019-10-04 at 10:29 +0200, Rafael J. Wysocki wrote:
> > On Fri, Oct 4, 2019 at 10:24 AM Giovanni Gherdovich <ggherdovich@suse.cz> wrote:
> > >
> > > On Thu, 2019-10-03 at 20:31 -0700, Srinivas Pandruvada wrote:
> > > > On Thu, 2019-10-03 at 20:05 +0200, Rafael J. Wysocki wrote:
> > > > > On Wednesday, October 2, 2019 2:29:26 PM CEST Giovanni Gherdovich
> > > > > wrote:
> > > > > > From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> > > > > >
> > > > > > intel_pstate has two operating modes: active and passive. In "active"
> > > > > > mode, the in-built scaling governor is used and in "passive" mode, the
> > > > > > driver can be used with any governor like "schedutil". In "active" mode
> > > > > > > the utilization values from schedutil are not used and there is a
> > > > > > > requirement from high performance computing use cases not to read
> > > > > > > any APERF/MPERF MSRs either.
> > > > >
> > > > > Well, this isn't quite convincing.
> > > > >
> > > > > In particular, I don't see why the "don't read APERF/MPERF MSRs" argument
> > > > > applies *only* to intel_pstate in the "active" mode.  What about
> > > > > intel_pstate in the "passive" mode combined with the "performance"
> > > > > governor?  Or any other governor different from "schedutil" for that
> > > > > matter?
> > > > >
> > > > > And what about acpi_cpufreq combined with any governor different from
> > > > > "schedutil"?
> > > > >
> > > > > Scale invariance is not really needed in all of those cases right now
> > > > > AFAICS, or is it?
> > > >
> > > > Correct. This is just part of the patch to disable in active mode
> > > > (particularly in HWP and performance mode).
> > > >
> > > > But this patch is 2 years old. The folks who wanted this disable
> > > > intel-pstate and use the userspace governor with acpi-cpufreq. So it
> > > > may be better to address those cases too.
> > >
> > > I disagree with "scale invariance is needed only by the schedutil governor";
> > > the two other users are the CPU's estimated utilization in the wakeup path,
> > > via cpu_util_without(), as well as the load-balance path, via cpu_util() which
> > > is used by update_sg_lb_stats().
> >
> > OK, so there are reasons to run the scale invariance code which are
> > not related to the cpufreq governor in use.
> >
> > I wonder then why those reasons are not relevant for intel_pstate in
> > the "active" mode.
> >
> > > Also remember that scale invariance is applied to both PELT signals util_avg
> > > and load_avg; schedutil uses the former but not the latter.
> > >
> > > I understand Srinivas' patch to disable MSR accesses during the tick as a
> > > band-aid solution to address a specific use case he cares about, but I don't
> > > think that extending this approach to any non-schedutil governor is a good
> > > idea -- you'd be killing load balancing in the process.
> >
> > But that is also the case for intel_pstate in the "active" mode, isn't it?
>
> Sure it is.
>
> Now, what's the performance impact of losing scale-invariance in PELT signals?

That needs to be measured.

> And what's the performance impact of accessing two MSRs at the scheduler tick
> on each CPU?

That would be the MSR access latency times two and I don't remember
the exact numbers off the top of my head.  It would also depend on
how much time it takes to run the tick without those two MSR accesses,
on average.
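
To make that estimate concrete, here is a rough userspace sketch of the
arithmetic.  The function name and both parameters are hypothetical, and
any latency numbers fed into it are placeholders that would have to be
measured on the hardware in question:

```c
#include <assert.h>

/*
 * Fraction of the total tick time spent on the two extra MSR reads.
 *
 * msr_read_ns:  assumed latency of one MSR read (placeholder, not measured)
 * tick_base_ns: assumed average tick cost without the two reads
 */
static double msr_tick_overhead(double msr_read_ns, double tick_base_ns)
{
	double added_ns = 2.0 * msr_read_ns;	/* one APERF read + one MPERF read */

	assert(msr_read_ns >= 0.0 && tick_base_ns > 0.0);
	return added_ns / (tick_base_ns + added_ns);
}
```

With made-up inputs of 100 ns per read and 1800 ns for the rest of the
tick, the two reads would account for 200 ns out of 2000 ns.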

The question I have, however, is whether or not it really is necessary
to update arch_cpu_freq on every tick.  Maybe it would be sufficient
to do that every 10 ms, say (in case the tick is more frequent than
that), or similar?
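
Something along these lines is what I have in mind.  This is a
standalone sketch, not the code from patch [1/2]: the struct, the
function names and the fixed-point layout are all made up for
illustration, and the caller is assumed to read APERF/MPERF only when
the "due" check says so:

```c
#include <assert.h>
#include <stdint.h>

#define SCALE_SHIFT		10		/* 1024 == running at max frequency */
#define UPDATE_PERIOD_NS	10000000ULL	/* 10 ms between updates */

/* Hypothetical per-CPU state; none of these names come from the patch. */
struct freq_scale_state {
	uint64_t last_update_ns;
	uint64_t prev_aperf;
	uint64_t prev_mperf;
	uint64_t max_ratio;	/* max/base frequency ratio, 1024 == 1.0 */
	uint64_t scale;		/* last computed scale factor, 0..1024 */
};

/* Cheap check done on every tick; no MSR access is needed for it. */
static int freq_scale_update_due(const struct freq_scale_state *st,
				 uint64_t now_ns)
{
	return now_ns - st->last_update_ns >= UPDATE_PERIOD_NS;
}

/* Called with fresh counter values only when the update is due. */
static void freq_scale_update(struct freq_scale_state *st, uint64_t now_ns,
			      uint64_t aperf, uint64_t mperf)
{
	uint64_t da = aperf - st->prev_aperf;
	uint64_t dm = mperf - st->prev_mperf;
	uint64_t scale;

	assert(st->max_ratio > 0);

	st->last_update_ns = now_ns;
	st->prev_aperf = aperf;
	st->prev_mperf = mperf;

	if (dm == 0)
		return;		/* no reference cycles elapsed, keep old value */

	/* (delta APERF / delta MPERF) relative to the max/base ratio */
	scale = (da << (2 * SCALE_SHIFT)) / (dm * st->max_ratio);
	if (scale > (1ULL << SCALE_SHIFT))
		scale = 1ULL << SCALE_SHIFT;
	st->scale = scale;
}
```

With a 4 ms tick, for example, only every third tick would pay for the
two MSR reads, at the cost of a scale factor that is up to 12 ms stale.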

> I am sporting Srinivas' patch because he expressed the concern that the losses
> don't justify the gains for a specific class of users (supercomputing),
> although I don't fully like the idea (and arguably that should be measured).

My point is that this patch doesn't even cover the entire case in
question, because the HPC people may very well be using cpufreq
driver/governor configurations different from intel_pstate in the
"active" mode.  Moreover, given the lack of data, it is even hard to
say what the potential impact is, if any.

I guess it should be a fairly straightforward exercise to compare the
results of the various benchmarks using intel_pstate in the "passive"
mode and the "performance" governor with and without patch [1/2] from
this series. If you see any perf regressions after applying the patch,
that's what the others will probably see as well.
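
For the comparison itself, a trivial helper like the one below would
do.  It is only a sketch: the function names are hypothetical, it
assumes a benchmark metric where higher is better, and it does not
account for run-to-run noise, which you would need to look at
separately:

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n benchmark results; n must be non-zero. */
static double bench_mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/*
 * Relative change of the patched kernel versus the baseline, in percent.
 * Negative values mean a regression for higher-is-better metrics.
 */
static double bench_change_pct(const double *base, size_t n_base,
			       const double *patched, size_t n_patched)
{
	double mb = bench_mean(base, n_base);
	double mp = bench_mean(patched, n_patched);

	assert(mb != 0.0);
	return (mp - mb) / mb * 100.0;
}
```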
