linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Anson Huang <anson.huang@nxp.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Peng Fan <peng.fan@nxp.com>, Jacky Bai <ping.bai@nxp.com>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>
Subject: Re: About CPU hot-plug stress test failed in cpufreq driver
Date: Tue, 10 Dec 2019 10:04:00 +0100	[thread overview]
Message-ID: <CAJZ5v0hZeEp0PToT0su25YV4pdG6OyzT_H9wu2R45=jkyo_y4g@mail.gmail.com> (raw)
In-Reply-To: <40413247.HltoIgKm8r@kreacher>

On Tue, Dec 10, 2019 at 9:50 AM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> On Tuesday, December 10, 2019 9:45:09 AM CET Anson Huang wrote:
> >
> > > -----Original Message-----
> > > From: Rafael J. Wysocki <rafael@kernel.org>
> > > Sent: Tuesday, December 10, 2019 4:38 PM
> > > To: Anson Huang <anson.huang@nxp.com>
> > > Cc: Rafael J. Wysocki <rafael@kernel.org>; Viresh Kumar
> > > <viresh.kumar@linaro.org>; Peng Fan <peng.fan@nxp.com>; Rafael J.
> > > Wysocki <rjw@rjwysocki.net>; Jacky Bai <ping.bai@nxp.com>; linux-
> > > pm@vger.kernel.org; Vincent Guittot <vincent.guittot@linaro.org>; Peter
> > > Zijlstra <peterz@infradead.org>; Paul McKenney
> > > <paulmck@linux.vnet.ibm.com>
> > > Subject: Re: About CPU hot-plug stress test failed in cpufreq driver
> > >
> > > On Tue, Dec 10, 2019 at 9:29 AM Anson Huang <anson.huang@nxp.com>
> > > wrote:
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Rafael J. Wysocki <rafael@kernel.org>
> > > > > Sent: Tuesday, December 10, 2019 4:22 PM
> > > > > To: Viresh Kumar <viresh.kumar@linaro.org>
> > > > > Cc: Peng Fan <peng.fan@nxp.com>; Rafael J. Wysocki
> > > > > <rafael@kernel.org>; Anson Huang <anson.huang@nxp.com>; Rafael J.
> > > > > Wysocki <rjw@rjwysocki.net>; Jacky Bai <ping.bai@nxp.com>; linux-
> > > > > pm@vger.kernel.org; Vincent Guittot <vincent.guittot@linaro.org>;
> > > > > Peter Zijlstra <peterz@infradead.org>; Paul McKenney
> > > > > <paulmck@linux.vnet.ibm.com>
> > > > > Subject: Re: About CPU hot-plug stress test failed in cpufreq driver
> > > > >
> > > > > On Tue, Dec 10, 2019 at 8:05 AM Viresh Kumar
> > > > > <viresh.kumar@linaro.org>
> > > > > wrote:
> > > > > >
> > > > > > +few more guys
> > > > > >
> > > > > > On 10-12-19, 05:53, Peng Fan wrote:
> > > > > > > But per
> > > > > > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%
> > > > > > > 2Fel
> > > > > > > ixir.bootlin.com%2Flinux%2Fv5.5-
> > > > > rc1%2Fsource%2Fkernel%2Fsched%2Fsche
> > > > > > >
> > > > >
> > > d.h%23L2293&amp;data=02%7C01%7Canson.huang%40nxp.com%7C6f44900
> > > > > be3404
> > > > > > >
> > > > >
> > > e7d355708d77d4a16fa%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%
> > > > > 7C637
> > > > > > >
> > > > >
> > > 115629475456329&amp;sdata=XXhwvuTOBb3TLmerwkr1zKbaWNA8xA%2Bl
> > > > > W%2Faw31
> > > > > > > 0AYcM%3D&amp;reserved=0
> > > > > > > cpu_of(rq) and smp_processor_id() is possible to not the same,
> > > > > > >
> > > > > > > When cpu_of(rq) is not equal to smp_processor_id(),
> > > > > > > dbs_update_util_handler will use irq_work_queue to
> > > > > > > smp_processor_id(), not cpu_of(rq). Is this expected?
> > > > > > > Or should the irq_work be queued to cpu_of(rq)?
> > > > > >
> > > > > > Okay, sorry for the long weekend where I couldn't get time to reply at
> > > all.
> > > > >
> > > > > No worries. :-)
> > > > >
> > > > > > First of all, lets try to understand dvfs_possible_from_any_cpu.
> > > > > >
> > > > > > Who can update the frequency of a CPU ? For many
> > > > > > architectures/platforms the eventual code that writes to some
> > > > > > register to change the frequency should only run on the local CPU,
> > > > > > as these registers are per-cpu registers and not something shared
> > > between CPUs.
> > > > > >
> > > > > > But for the ARM architecture, we have a PLL and then some more
> > > > > > registers to play with the clk provided to the CPU blocks and
> > > > > > these registers (which are updated as a result of clk_set_rate())
> > > > > > are part of a
> > > > > block outside of the CPU blocks.
> > > > > > And so any CPU (even if it is not part of the same cpufreq policy)
> > > > > > can update it. Setting this flag allows that and eventually we may
> > > > > > end up updating the frequency sooner, instead of later (which may
> > > > > > be less effective). That was the idea of the remote-wakeup series.
> > > > > > This stuff is absolutely correct and so cpufreq-dt does it for everyone.
> > > > > >
> > > > > > This also means that the normal work and irq-work both can run on
> > > > > > any CPU for your platform and it should be okay to do that.
> > > > >
> > > > > And it the failing case all of the CPUs in the system are in the
> > > > > same policy anyway, so dvfs_possible_from_any_cpu is a red herring.
> > > > >
> > > > > > Now, we have necessary measures in place to make sure that after
> > > > > > stopping and before starting a governor, the scheduler hooks to
> > > > > > save the cpufreq governor pointer and updates to policy->cpus are
> > > > > > made properly, to make sure that we never ever schedule a work or
> > > > > > irq-work on a CPU which is offline. Now it looks like this isn't
> > > > > > working as expected and we need to find what exactly is broken here.
> > > > > >
> > > > > > And yes, I did the testing on Hikey 620, an octa-core ARM platform
> > > > > > which has a single cpufreq policy which has all the 8 CPUs. And
> > > > > > yes, I am using cpufreq-dt only and I wasn't able to reproduce the
> > > > > > problem with mainline kernel as I explained earlier.
> > > > > >
> > > > > > The problem is somewhere between the scheduler's governor hook
> > > > > running
> > > > > > or queuing work on a CPU which is in the middle of getting
> > > > > > offline/online and there is some race around that. The problem
> > > > > > hence may not be related to just cpufreq, but a wider variety of clients.
> > > > >
> > > > > The problem is that a CPU is running a governor hook which it
> > > > > shouldn't be running at all.
> > > > >
> > > > > The observation that dvfs_possible_from_any_cpu makes a difference
> > > > > only means that the governor hook is running on a CPU that is not
> > > > > present in the
> > > > > policy->cpus mask.  On the platform(s) in question this cannot
> > > > > policy->happen as
> > > > > long as RCU works as expected.
> > > >
> > > > If I understand correctly, the governor hook ONLY be clear on the CPU
> > > > being offline and after governor stopped, but the CPU being offline
> > > > could still run into below function to help other CPU update the util,
> > > > and it ONLY checks the cpu_of(rq)'s governor hook which is valid as that
> > > CPU is online.
> > > >
> > > > So the question is how to avoid the CPU being offline and already
> > > > finish the governor stop flow be scheduled to help other CPU update the
> > > util.
> > > >
> > > >  static inline void cpufreq_update_util(struct rq *rq, unsigned int
> > > > flags)  {
> > > >          struct update_util_data *data;
> > > >
> > > >          data =
> > > rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
> > > >                                                    cpu_of(rq)));
> > > >          if (data)
> > > >                  data->func(data, rq_clock(rq), flags);  }
> > >
> > > OK, so that's where the problem is, good catch!
> > >
> > > So what happens is that a CPU going offline runs some scheduler code that
> > > invokes cpufreq_update_util().  Incidentally, it is not the cpu_of(rq), but that
> > > CPU is still online, so the callback is invoked and then policy->cpus test is
> > > bypassed because of dvfs_possible_from_any_cpu.
> >
> > If this is the issue, add another check here for the current CPU's governor hook?
> > Or any other better place to make sure the CPU being offline NOT to be queued to irq work?
>
> Generally, yes.
>
> Something like the patch below should help if I'm not mistaken:
>
> ---
>  include/linux/cpufreq.h |    8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> Index: linux-pm/include/linux/cpufreq.h
> ===================================================================
> --- linux-pm.orig/include/linux/cpufreq.h
> +++ linux-pm/include/linux/cpufreq.h
> @@ -599,11 +599,13 @@ static inline bool cpufreq_this_cpu_can_
>  {
>         /*
>          * Allow remote callbacks if:
> -        * - dvfs_possible_from_any_cpu flag is set
>          * - the local and remote CPUs share cpufreq policy
> +        * - dvfs_possible_from_any_cpu flag is set and the CPU running the
> +        *   code is not going offline.
>          */
> -       return policy->dvfs_possible_from_any_cpu ||
> -               cpumask_test_cpu(smp_processor_id(), policy->cpus);
> +       return cpumask_test_cpu(smp_processor_id(), policy->cpus) ||
> +               (policy->dvfs_possible_from_any_cpu &&
> +                !cpumask_test_cpu(smp_processor_id(), policy->related_cpus));
>  }
>
>  /*********************************************************************
>

Actually, this is not sufficient, because the CPU going offline need
not belong to the same policy and if that is the case, the problem
will still occur AFAICS.

Please test it anyway so as to confirm that we are on the right track, though.

A better approach would be to queue the irq_work on a different CPU if
the CPU running the code is not in the policy->cpus set.

I'll cut a patch for that too in a minute.

  parent reply	other threads:[~2019-12-10  9:04 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <DB3PR0402MB391626A8ECFDC182C6EDCF8DF54E0@DB3PR0402MB3916.eurprd04.prod.outlook.com>
2019-11-21  9:35 ` About CPU hot-plug stress test failed in cpufreq driver Viresh Kumar
2019-11-21 10:13   ` Anson Huang
2019-11-21 10:53     ` Rafael J. Wysocki
2019-11-21 10:56       ` Rafael J. Wysocki
2019-11-22  5:15         ` Anson Huang
2019-11-22  9:59           ` Rafael J. Wysocki
2019-11-25  6:05             ` Anson Huang
2019-11-25  9:43               ` Anson Huang
2019-11-26  6:18                 ` Viresh Kumar
2019-11-26  8:22                   ` Anson Huang
2019-11-26  8:25                     ` Viresh Kumar
2019-11-25 12:44               ` Rafael J. Wysocki
2019-11-26  8:57                 ` Rafael J. Wysocki
2019-11-29 11:39                 ` Rafael J. Wysocki
2019-11-29 13:44                   ` Anson Huang
2019-12-05  8:53                     ` Anson Huang
2019-12-05 10:48                       ` Rafael J. Wysocki
2019-12-05 13:18                         ` Anson Huang
2019-12-05 15:52                           ` Rafael J. Wysocki
2019-12-09 10:31                             ` Peng Fan
2019-12-09 10:37                             ` Anson Huang
2019-12-09 10:56                               ` Anson Huang
2019-12-09 11:23                                 ` Rafael J. Wysocki
2019-12-09 12:32                                   ` Anson Huang
2019-12-09 12:44                                     ` Rafael J. Wysocki
2019-12-09 14:18                                       ` Anson Huang
2019-12-10  5:39                                         ` Anson Huang
2019-12-10  5:53                                       ` Peng Fan
2019-12-10  7:05                                         ` Viresh Kumar
2019-12-10  8:22                                           ` Rafael J. Wysocki
2019-12-10  8:29                                             ` Anson Huang
2019-12-10  8:36                                               ` Viresh Kumar
2019-12-10  8:37                                                 ` Peng Fan
2019-12-10  8:37                                               ` Rafael J. Wysocki
2019-12-10  8:43                                                 ` Peng Fan
2019-12-10  8:45                                                 ` Anson Huang
2019-12-10  8:50                                                   ` Rafael J. Wysocki
2019-12-10  8:51                                                     ` Anson Huang
2019-12-10 10:39                                                       ` Rafael J. Wysocki
2019-12-10 10:54                                                         ` Rafael J. Wysocki
2019-12-11  5:08                                                           ` Anson Huang
2019-12-11  8:59                                                           ` Peng Fan
2019-12-11  9:36                                                             ` Rafael J. Wysocki
2019-12-11  9:43                                                               ` Peng Fan
2019-12-11  9:52                                                                 ` Rafael J. Wysocki
2019-12-11 10:11                                                                   ` Peng Fan
2019-12-10 10:54                                                         ` Viresh Kumar
2019-12-10 11:07                                                           ` Rafael J. Wysocki
2019-12-10  8:57                                                     ` Viresh Kumar
2019-12-10 11:03                                                       ` Rafael J. Wysocki
2019-12-10  9:04                                                     ` Rafael J. Wysocki [this message]
2019-12-10  8:31                                             ` Viresh Kumar
2019-12-10  8:12                                         ` Rafael J. Wysocki
2019-12-05 11:00                       ` Viresh Kumar
2019-12-05 11:10                         ` Rafael J. Wysocki
2019-12-05 11:17                           ` Viresh Kumar
2019-11-21 10:37   ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJZ5v0hZeEp0PToT0su25YV4pdG6OyzT_H9wu2R45=jkyo_y4g@mail.gmail.com' \
    --to=rafael@kernel.org \
    --cc=anson.huang@nxp.com \
    --cc=linux-pm@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peng.fan@nxp.com \
    --cc=peterz@infradead.org \
    --cc=ping.bai@nxp.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).