Linux-PM Archive on lore.kernel.org
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Will Deacon <will@kernel.org>,
	Russell King - ARM Linux <linux@armlinux.org.uk>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Linux PM <linux-pm@vger.kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/8] cpufreq: move invariance setter calls in cpufreq core
Date: Wed, 1 Jul 2020 17:51:26 +0200
Message-ID: <CAJZ5v0gg4CtixKXEWG4agPATJxm5NZ4bnNVsqt7mRpwZS0Nygw@mail.gmail.com>
In-Reply-To: <20200701152751.GA29496@arm.com>

On Wed, Jul 1, 2020 at 5:28 PM Ionela Voinescu <ionela.voinescu@arm.com> wrote:
>
> Hey,
>
> On Wednesday 01 Jul 2020 at 16:16:19 (+0530), Viresh Kumar wrote:
> > On 01-07-20, 10:07, Ionela Voinescu wrote:
> > > From: Valentin Schneider <valentin.schneider@arm.com>
> > >
> > > To properly scale its per-entity load-tracking signals, the task scheduler
> > > needs to be given a frequency scale factor, i.e. some image of the current
> > > frequency the CPU is running at. Currently, this scale can be computed
> > > either by using counters (APERF/MPERF on x86, AMU on arm64), or by
> > > piggy-backing on the frequency selection done by cpufreq.
> > >
> > > For the latter, drivers have to explicitly set the scale factor
> > > themselves, despite it being purely boiler-plate code: the required
> > > information depends entirely on the kind of frequency switch callback
> > > implemented by the driver, i.e. either of: target_index(), target(),
> > > fast_switch() and setpolicy().
> > >
> > > The fitness of those callbacks with regard to driving the Frequency
> > > Invariance Engine (FIE) is studied below:
> > >
> > > target_index()
> > > ==============
> > > Documentation states that the chosen frequency "must be determined by
> > > freq_table[index].frequency". It isn't clear if it *has* to be that
> > > frequency, or if it can use that frequency value to do some computation
> > > that ultimately leads to a different frequency selection. Nearly all
> > > drivers go for the former; the vexpress-spc-cpufreq driver is the
> > > exception, with an atypical implementation.
> > >
> > > Therefore, the hook works on the assumption the core can use
> > > freq_table[index].frequency.
> > >
> > > target()
> > > ========
> > > This has been flagged as deprecated since:
> > >
> > >   commit 9c0ebcf78fde ("cpufreq: Implement light weight ->target_index() routine")
> > >
> > > It also doesn't have that many users:
> > >
> > >   cpufreq-nforce2.c:371:2:  .target = nforce2_target,
> > >   cppc_cpufreq.c:416:2:             .target = cppc_cpufreq_set_target,
> > >   pcc-cpufreq.c:573:2:              .target = pcc_cpufreq_target,
> > >
> > > Should we care about drivers using this hook, we may be able to exploit
> > > cpufreq_freq_transition_{begin, end}(). Otherwise, if FIE support is
> > > desired in their current state, arch_set_freq_scale() could still be
> > > called directly by the driver, while CPUFREQ_CUSTOM_SET_FREQ_SCALE
> > > could be used to mark support for it.
> > >
> > > fast_switch()
> > > =============
> > > This callback *has* to return the frequency that was selected.
> > >
> > > setpolicy()
> > > ===========
> > > This callback has no designated way of reporting which frequency was
> > > ultimately chosen. But there are only two drivers using setpolicy(),
> > > and neither of them currently has FIE support:
> > >
> > >   drivers/cpufreq/longrun.c:281:    .setpolicy      = longrun_set_policy,
> > >   drivers/cpufreq/intel_pstate.c:2215:      .setpolicy      = intel_pstate_set_policy,
> > >
> > > The intel_pstate is known to use counter-driven frequency invariance.
> >
> > Same for acpi-cpufreq driver as well ?
> >
>
> The acpi-cpufreq driver defines target_index() and fast_switch(), so it
> should go through the setting in cpufreq core. But x86 does not actually
> define arch_set_freq_scale(), so when called it won't do anything (it
> won't set any frequency scale factor); instead, x86 relies on counters
> to set the factor through arch_scale_freq_tick().

Right.

So on x86 (Intel flavor of it at least), cpufreq has nothing to do
with this regardless of what driver is in use.
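
To illustrate why the cpufreq-side call has no effect there, here is a
minimal user-space sketch of the "empty default" pattern. It is not the
kernel's code: struct cpumask is a stand-in, and DEMO_ARCH_USES_CPUFREQ_SCALE,
demo_freq_scale and demo_core_switch() are hypothetical names invented for
this example. Only the arch_set_freq_scale() argument order follows the
patch quoted below.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's cpumask type. */
struct cpumask { unsigned long bits; };

/* Hypothetical per-CPU scale factor; 1024 means "running at max". */
static unsigned long demo_freq_scale = 1024;

#ifdef DEMO_ARCH_USES_CPUFREQ_SCALE
/* An architecture that derives the scale from cpufreq provides a setter. */
static inline void arch_set_freq_scale(const struct cpumask *cpus,
				       unsigned long cur_freq,
				       unsigned long max_freq)
{
	(void)cpus;
	demo_freq_scale = (cur_freq << 10) / max_freq;
}
#else
/* Otherwise the setter is an empty stub, so calling it changes nothing. */
static inline void arch_set_freq_scale(const struct cpumask *cpus,
				       unsigned long cur_freq,
				       unsigned long max_freq)
{
	(void)cpus;
	(void)cur_freq;
	(void)max_freq;
}
#endif

/* What a core-side hook would do after a successful frequency switch. */
static unsigned long demo_core_switch(unsigned long cur_freq,
				      unsigned long max_freq)
{
	struct cpumask cpus = { 0x1 };

	arch_set_freq_scale(&cpus, cur_freq, max_freq);
	return demo_freq_scale;
}
```

Built without DEMO_ARCH_USES_CPUFREQ_SCALE, demo_core_switch() leaves the
scale untouched no matter which frequency the driver selected, so in that
configuration the factor has to come from somewhere else (the tick).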

> But this cpufreq functionality could potentially be used.

How so?

>
> > And I think we should do the freq-invariance thing for all the above categories
> > nevertheless.
> >
>
> I'm not sure what you mean by this. You mean we should also (try to) set
> the frequency scale factor for drivers defining setpolicy() and target()?

No, we shouldn't.

The sched tick potentially does that already and nothing more needs to
be done unless we know for a fact that the scale factor is not
set by the tick.

> > > If FIE support is desired in their current state, arch_set_freq_scale()
> > > could still be called directly by the driver, while
> > > CPUFREQ_CUSTOM_SET_FREQ_SCALE could be used to mark support for it.
> > >
> > > Conclusion
> > > ==========
> > >
> > > Given that the significant majority of current FIE enabled drivers use
> > > callbacks that lend themselves to triggering the setting of the FIE scale
> > > factor in a generic way, move the invariance setter calls to cpufreq core,
> > > while filtering drivers that flag custom support using
> > > CPUFREQ_CUSTOM_SET_FREQ_SCALE.
> > >
> > > Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> > > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > > Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
> > > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > > ---
> > >  drivers/cpufreq/cpufreq.c | 20 +++++++++++++++++---
> > >  1 file changed, 17 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > > index 0128de3603df..83b58483a39b 100644
> > > --- a/drivers/cpufreq/cpufreq.c
> > > +++ b/drivers/cpufreq/cpufreq.c
> > > @@ -2046,9 +2046,16 @@ EXPORT_SYMBOL(cpufreq_unregister_notifier);
> > >  unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
> > >                                     unsigned int target_freq)
> > >  {
> > > +   unsigned int freq;
> > > +
> > >     target_freq = clamp_val(target_freq, policy->min, policy->max);
> > > +   freq = cpufreq_driver->fast_switch(policy, target_freq);
> > > +
> >
> > > +   if (freq && !(cpufreq_driver->flags & CPUFREQ_CUSTOM_SET_FREQ_SCALE))
> > > +           arch_set_freq_scale(policy->related_cpus, freq,
> > > +                               policy->cpuinfo.max_freq);

policy->cpuinfo.max_freq need not be the one to use in all cases when
boost is supported.

policy->cpuinfo.max_freq may be the max boost freq and you may want to
scale with respect to the max sustainable one anyway.
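
A small worked example of the difference, as a user-space sketch. The
formula (current frequency shifted by 10 bits, divided by a reference
maximum) is the usual capacity-scale arithmetic; freq_scale() and the
2 GHz / 3 GHz figures are assumptions made up for illustration and are
not taken from any real platform.

```c
#define SCHED_CAPACITY_SHIFT 10	/* a scale of 1024 means "at the reference max" */

/*
 * Hypothetical helper: frequency scale factor of cur_khz relative to
 * ref_max_khz, in 1/1024 units.
 */
static unsigned long long freq_scale(unsigned long long cur_khz,
				     unsigned long long ref_max_khz)
{
	return (cur_khz << SCHED_CAPACITY_SHIFT) / ref_max_khz;
}

/*
 * Assumed example platform: max sustainable frequency 2000000 kHz,
 * max boost frequency 3000000 kHz, CPU currently at 2000000 kHz.
 *
 *   freq_scale(2000000, 3000000) -> 682   (scaled against boost max)
 *   freq_scale(2000000, 2000000) -> 1024  (scaled against sustainable max)
 */
```

With the boost frequency as the reference, a CPU running flat out at its
sustainable maximum is reported at roughly two thirds of full scale, which
is why the choice of reference matters.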

> > This needs to be a separate function.
> >
>
> Yes, that would be nicer.
>
> > >
> > > -   return cpufreq_driver->fast_switch(policy, target_freq);
> > > +   return freq;
> > >  }
> > >  EXPORT_SYMBOL_GPL(cpufreq_driver_fast_switch);
> > >
> > > @@ -2140,7 +2147,7 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
> > >                         unsigned int relation)
> > >  {
> > >     unsigned int old_target_freq = target_freq;
> > > -   int index;
> > > +   int index, retval;
> > >
> > >     if (cpufreq_disabled())
> > >             return -ENODEV;
> > > @@ -2171,7 +2178,14 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
> > >
> > >     index = cpufreq_frequency_table_target(policy, target_freq, relation);
> > >
> > > -   return __target_index(policy, index);
> > > +   retval = __target_index(policy, index);
> > > +
> > > +   if (!retval && !(cpufreq_driver->flags & CPUFREQ_CUSTOM_SET_FREQ_SCALE))
> > > +           arch_set_freq_scale(policy->related_cpus,
> > > +                               policy->freq_table[index].frequency,
> >
> > policy->cur gets updated for both target and target_index type drivers. You can
> > use that safely. It gets updated after the postchange notification.
> >
>
> This would allow us to cover the drivers that define target() as well (not
> only target_index() and fast_switch()). Looking over the code we only take
> that path (calling cpufreq_freq_transition_end()), for
> !CPUFREQ_ASYNC_NOTIFICATION. But again, that's only used for
> powernow-k8 which is deprecated.
>
> I'll attempt a nice way to use this.

On arches like x86, policy->cur may not be the current frequency of
the CPU, though.  On relatively recent systems it actually isn't that
frequency most of the time.
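
As a rough sketch of why counters give a different answer than
policy->cur: the scale can be derived from how many "actual" cycles
elapsed versus "reference" cycles over a tick. The code below is a
simplified user-space model and not the kernel's implementation;
counter_freq_scale(), its parameters and the numbers in the comment are
assumptions for illustration only.

```c
#include <stdint.h>

#define SCALE_SHIFT 10
#define SCALE_MAX   (1ULL << SCALE_SHIFT)

/*
 * Hypothetical helper.
 * d_actual:   actual cycles elapsed since the last tick
 * d_ref:      reference cycles (at base frequency) since the last tick
 * max_ratio:  max_freq / base_freq, in 1/1024 units
 * Returns the average frequency over the interval relative to max_freq,
 * in 1/1024 units, clamped to 1024.
 */
static uint64_t counter_freq_scale(uint64_t d_actual, uint64_t d_ref,
				   uint64_t max_ratio)
{
	uint64_t scale;

	/* No usable sample: fall back to "full scale". */
	if (!d_ref || !max_ratio)
		return SCALE_MAX;

	scale = (d_actual << (2 * SCALE_SHIFT)) / (d_ref * max_ratio);
	return scale > SCALE_MAX ? SCALE_MAX : scale;
}

/*
 * Assumed example: base 2 GHz, max 3 GHz, so max_ratio = 1536.
 * 1200 actual cycles per 1000 reference cycles means the CPU averaged
 * 2.4 GHz, i.e. a scale of 819, whatever frequency was last requested.
 */
```

If the last request recorded in policy->cur had been 2 GHz, a scale
computed from it against the same 3 GHz maximum would be 682 rather than
819, because the hardware picked its own operating point in between.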

Thanks!


Thread overview: 29+ messages
2020-07-01  9:07 [PATCH 0/8] cpufreq: improve frequency invariance support Ionela Voinescu
2020-07-01  9:07 ` [PATCH 1/8] cpufreq: allow drivers to flag custom support for freq invariance Ionela Voinescu
2020-07-01 10:46   ` Viresh Kumar
2020-07-01 13:33     ` Ionela Voinescu
2020-07-01 16:05       ` Rafael J. Wysocki
2020-07-01 18:06         ` Ionela Voinescu
2020-07-02  2:58         ` Viresh Kumar
2020-07-02 11:44           ` Ionela Voinescu
2020-07-06 12:14             ` Dietmar Eggemann
2020-07-09  8:53               ` Ionela Voinescu
2020-07-09  9:09                 ` Viresh Kumar
2020-07-01  9:07 ` [PATCH 2/8] cpufreq: move invariance setter calls in cpufreq core Ionela Voinescu
2020-07-01 10:46   ` Viresh Kumar
2020-07-01 15:27     ` Ionela Voinescu
2020-07-01 15:51       ` Rafael J. Wysocki [this message]
2020-07-02  3:01         ` Viresh Kumar
2020-07-02 11:45         ` Ionela Voinescu
2020-07-01  9:07 ` [PATCH 3/8] cpufreq,drivers: remove setting of frequency scale factor Ionela Voinescu
2020-07-01  9:07 ` [PATCH 4/8] cpufreq,vexpress-spc: fix Frequency Invariance (FI) for bL switching Ionela Voinescu
2020-07-01 10:46   ` Viresh Kumar
2020-07-01 14:07     ` Ionela Voinescu
2020-07-02  3:05       ` Viresh Kumar
2020-07-02 11:41         ` Ionela Voinescu
2020-07-02 11:46           ` Viresh Kumar
2020-07-01  9:07 ` [PATCH 5/8] cpufreq: report whether cpufreq supports Frequency Invariance (FI) Ionela Voinescu
2020-07-01 10:46   ` Viresh Kumar
2020-07-01  9:07 ` [PATCH 6/8] arch_topology,cpufreq,sched/core: constify arch_* cpumasks Ionela Voinescu
2020-07-01  9:07 ` [PATCH 7/8] arch_topology,arm64: define arch_scale_freq_invariant() Ionela Voinescu
2020-07-01  9:07 ` [PATCH 8/8] cpufreq: make schedutil the default for arm and arm64 Ionela Voinescu
