linux-kernel.vger.kernel.org archive mirror
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Doug Smythies <dsmythies@telus.net>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Rafael Wysocki <rjw@rjwysocki.net>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	"v4 . 18+" <stable@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] cpufreq: schedutil: Don't skip freq update when limits change
Date: Mon, 29 Jul 2019 10:37:35 +0200
Message-ID: <CAJZ5v0gaW=ujtsDmewrVXL7V8K0YZysNqwu=qKLw+kPC86ydqA@mail.gmail.com>
In-Reply-To: <20190729083219.fe4xxq4ugmetzntm@vireshk-i7>

On Mon, Jul 29, 2019 at 10:32 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 29-07-19, 00:55, Doug Smythies wrote:
> > On 2019.07.25 23:58 Viresh Kumar wrote:
> > > Hmm, so I tried to reproduce your setup on my ARM board.
> > > - booted only with CPU0 so I hit the sugov_update_single() routine
> > > - And applied below diff to make CPU look permanently busy:
> > >
> > > -------------------------8<-------------------------
> > > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > > index 2f382b0959e5..afb47490e5dc 100644
> > > --- a/kernel/sched/cpufreq_schedutil.c
> > > +++ b/kernel/sched/cpufreq_schedutil.c
> > > @@ -121,6 +121,7 @@ static void sugov_fast_switch(struct sugov_policy *sg_policy, u64 time,
> > >         if (!sugov_update_next_freq(sg_policy, time, next_freq))
> > >                return;
> > >
> > > +       pr_info("%s: %d: %u\n", __func__, __LINE__, freq);
> >
> > ?? there is no "freq" variable here, and so this doesn't compile. However this works:
> >
> > +       pr_info("%s: %d: %u\n", __func__, __LINE__, next_freq);
>
> There are two paths we can take to change the frequency, normal
> sleep-able path (sugov_work) or fast path. Only one of them is taken
> by any driver ever. In your case it is the fast path always and in
> mine it was the slow path.
>
> I only tested the diff with the slow path and copy-pasted it to the fast
> path when sending it to you, hence the build issue. Sorry about that.
>
> Also make sure that the print is added after sugov_update_next_freq()
> is called, not before it.
>
> > >         next_freq = cpufreq_driver_fast_switch(policy, next_freq);
> > >        if (!next_freq)
> > >                return;
> > > @@ -424,14 +425,10 @@ static unsigned long sugov_iowait_apply(struct sugov_cpu *sg_cpu, u64 time,
> > > #ifdef CONFIG_NO_HZ_COMMON
> > > static bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu)
> > > {
> > > -       unsigned long idle_calls = tick_nohz_get_idle_calls_cpu(sg_cpu->cpu);
> > > -       bool ret = idle_calls == sg_cpu->saved_idle_calls;
> > > -
> > > -       sg_cpu->saved_idle_calls = idle_calls;
> > > -       return ret;
> > > +       return true;
> > >  }
> > >  #else
> > > -static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
> > > +static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return true; }
> > > #endif /* CONFIG_NO_HZ_COMMON */
> > >
> > >  /*
> > > @@ -565,6 +562,7 @@ static void sugov_work(struct kthread_work *work)
> > >         sg_policy->work_in_progress = false;
> > >         raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);
> > >
> > > +       pr_info("%s: %d: %u\n", __func__, __LINE__, freq);
> > >         mutex_lock(&sg_policy->work_lock);
> > >         __cpufreq_driver_target(sg_policy->policy, freq, CPUFREQ_RELATION_L);
> > >         mutex_unlock(&sg_policy->work_lock);
> > >
> > > -------------------------8<-------------------------
> > >
> > > > Now, the frequency never goes down and so gets set to the maximum
> > > > possible after a bit.
> > >
> > > - Then I did:
> > >
> > > echo <any-low-freq-value> > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
> > >
> > > Without my patch applied:
> > >        The print never gets printed and so frequency doesn't go down.
> > >
> > > With my patch applied:
> > >        The print gets printed immediately from sugov_work() and so
> > >        the frequency reduces.
> > >
> > > Can you try with this diff along with my Patch2 ? I suspect there may
> > > be something wrong with the intel_cpufreq driver as the patch fixes
> > > the only path we have in the schedutil governor which takes busyness
> > > of a CPU into account.
> >
> > With this diff along with your patch2, there is never a print message
> > from sugov_work. There are print messages from sugov_fast_switch.
>
> Which is okay. sugov_work won't get hit in your case as I explained
> above.
>
> > Note that for the intel_cpufreq CPU scaling driver and the schedutil
> > governor I adjust the maximum clock frequency this way:
> >
> > echo <any-low-percent> > /sys/devices/system/cpu/intel_pstate/max_perf_pct
>
> This should eventually call sugov_limits() in schedutil governor, this
> can be easily checked with another print message.
>
> > I also applied the pr_info messages to the reverted kernel, and re-did
> > my tests (where everything works as expected). There is never a print
> > message from sugov_work. There are from sugov_fast_switch.
>
> that's fine.
>
> > Notes:
> >
> > I do not know if:
> > /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq
> > /sys/devices/system/cpu/cpufreq/policy*/scaling_min_freq
> > Need to be accurate when using the intel_pstate driver in passive mode.
> > They are not.
> > The commit comment for 9083e4986124389e2a7c0ffca95630a4983887f0
> > suggests that they might need to be representative.
> > I wonder if something similar to that commit is needed
> > for other global changes, such as max_perf_pct and min_perf_pct?
>
> We are already calling intel_pstate_update_policies() in that case, so
> it should be fine I believe.
>
> > intel_cpufreq/ondemand doesn't work properly on the reverted kernel.
>
> reverted kernel ? The patch you reverted was only for schedutil and it
> shouldn't have anything to do with ondemand.
>
> > (just discovered, not investigated)
> > I don't know about other governors.
>
> When you do:
>
> echo <any-low-percent> > /sys/devices/system/cpu/intel_pstate/max_perf_pct
>
> How soon does the print from sugov_fast_switch() get printed?
> Immediately ? Check with both the kernels, with my patch and with the
> reverted patch.
>
> Also see if there is any difference in the next_freq value in both the
> kernels when you change max_perf_pct.
>
> FWIW, we now know the difference between intel-pstate and
> acpi-cpufreq/my testcase and why we see differences here. In the cases
> where my patch fixed the issue (acpi/ARM), we were really changing the
> limits, i.e. policy->min/max. This happened because we touched
> scaling_max_freq directly.
>
> For the case of intel-pstate, you are changing max_perf_pct which
> doesn't change policy->max directly. I am not very sure how all of it
> works really, but at least schedutil will not see policy->max changing.
>
> @Rafael: Do you understand why things don't work properly with
> intel_cpufreq driver ?

I haven't tried to understand this yet, so no.

My somewhat educated guess is that using max_perf_pct has to do with
it, so I would try to retest to see if there's any difference when
scaling_max_freq is used instead of that.
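
For readers following the thread, here is a minimal userspace sketch of the
behavior being debugged: a CPU that looks permanently busy keeps its previous
frequency request, so a lowered limit is not acted on unless the pending
limits change overrides the busy check. This is not the kernel code. The
struct, the function names (model_limits, model_update_single) and the
honor_limits switch are hypothetical and exist only for illustration. It
models only the case where the upper limit itself changes (as with
scaling_max_freq), and says nothing about why max_perf_pct behaves differently.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the governor's per-policy state. */
struct model_policy {
	unsigned int max;        /* current upper limit, in kHz */
	unsigned int next_freq;  /* last frequency that was requested */
	bool need_freq_update;   /* set when the limits have changed */
};

/* Model of a limits change: record the new maximum and flag it. */
static void model_limits(struct model_policy *p, unsigned int new_max)
{
	p->max = new_max;
	p->need_freq_update = true;
}

/*
 * Model of one single-CPU update.
 *   wanted       - frequency derived from utilization
 *   busy         - stand-in for the "CPU has not been idle recently" check
 *   honor_limits - if true, a pending limits change overrides the busy hold
 * Returns the frequency that would be requested next.
 */
static unsigned int model_update_single(struct model_policy *p,
					unsigned int wanted, bool busy,
					bool honor_limits)
{
	/* Clamp the wanted frequency to the current upper limit. */
	unsigned int next = wanted > p->max ? p->max : wanted;

	/* Assumed fix: ignore busyness once when the limits have changed. */
	if (honor_limits && p->need_freq_update)
		busy = false;

	/* Busy hold: do not reduce the frequency while the CPU looks busy. */
	if (busy && next < p->next_freq)
		next = p->next_freq;

	p->need_freq_update = false;
	p->next_freq = next;
	return next;
}
```

In this model, starting at max = next_freq = 3000000 and lowering the limit to
1500000 while busy is always true, the call with honor_limits false returns
the old 3000000, so nothing new would be requested. With honor_limits true it
returns 1500000.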
