From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Peng Fan <peng.fan@nxp.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Anson Huang <anson.huang@nxp.com>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>,
Jacky Bai <ping.bai@nxp.com>,
"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
Peter Zijlstra <peterz@infradead.org>,
Paul McKenney <paulmck@linux.vnet.ibm.com>
Subject: Re: About CPU hot-plug stress test failed in cpufreq driver
Date: Tue, 10 Dec 2019 09:22:12 +0100 [thread overview]
Message-ID: <CAJZ5v0hQeZDOiQC=UT+nEOc27E7Rn0O_7tLbeO1YMinCntDFCQ@mail.gmail.com> (raw)
In-Reply-To: <20191210070535.bvfzigolydhyz2ix@vireshk-i7>
On Tue, Dec 10, 2019 at 8:05 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> +few more guys
>
> On 10-12-19, 05:53, Peng Fan wrote:
> > But per https://elixir.bootlin.com/linux/v5.5-rc1/source/kernel/sched/sched.h#L2293
> > cpu_of(rq) and smp_processor_id() is possible to not the same,
> >
> > When cpu_of(rq) is not equal to smp_processor_id(), dbs_update_util_handler
> > will use irq_work_queue to smp_processor_id(), not cpu_of(rq). Is this
> > expected?
> > Or should the irq_work be queued to cpu_of(rq)?
>
> Okay, sorry for the long weekend where I couldn't get time to reply at all.
No worries. :-)
> First of all, lets try to understand dvfs_possible_from_any_cpu.
>
> Who can update the frequency of a CPU ? For many architectures/platforms the
> eventual code that writes to some register to change the frequency should only
> run on the local CPU, as these registers are per-cpu registers and not something
> shared between CPUs.
>
> But for the ARM architecture, we have a PLL and then some more registers to play
> with the clk provided to the CPU blocks and these registers (which are updated
> as a result of clk_set_rate()) are part of a block outside of the CPU blocks.
> And so any CPU (even if it is not part of the same cpufreq policy) can update
> it. Setting this flag allows that and eventually we may end up updating the
> frequency sooner, instead of later (which may be less effective). That was the
> idea of the remote-wakeup series. This stuff is absolutely correct and so
> cpufreq-dt does it for everyone.
>
> This also means that the normal work and irq-work both can run on any CPU for
> your platform and it should be okay to do that.
And it the failing case all of the CPUs in the system are in the same
policy anyway, so dvfs_possible_from_any_cpu is a red herring.
> Now, we have necessary measures in place to make sure that after stopping and
> before starting a governor, the scheduler hooks to save the cpufreq governor
> pointer and updates to policy->cpus are made properly, to make sure that we
> never ever schedule a work or irq-work on a CPU which is offline. Now it looks
> like this isn't working as expected and we need to find what exactly is broken
> here.
>
> And yes, I did the testing on Hikey 620, an octa-core ARM platform which has a
> single cpufreq policy which has all the 8 CPUs. And yes, I am using cpufreq-dt
> only and I wasn't able to reproduce the problem with mainline kernel as I
> explained earlier.
>
> The problem is somewhere between the scheduler's governor hook running or
> queuing work on a CPU which is in the middle of getting offline/online and there
> is some race around that. The problem hence may not be related to just cpufreq,
> but a wider variety of clients.
The problem is that a CPU is running a governor hook which it
shouldn't be running at all.
The observation that dvfs_possible_from_any_cpu makes a difference
only means that the governor hook is running on a CPU that is not
present in the policy->cpus mask. On the platform(s) in question this
cannot happen as long as RCU works as expected.
next prev parent reply other threads:[~2019-12-10 8:22 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <DB3PR0402MB391626A8ECFDC182C6EDCF8DF54E0@DB3PR0402MB3916.eurprd04.prod.outlook.com>
2019-11-21 9:35 ` About CPU hot-plug stress test failed in cpufreq driver Viresh Kumar
2019-11-21 10:13 ` Anson Huang
2019-11-21 10:53 ` Rafael J. Wysocki
2019-11-21 10:56 ` Rafael J. Wysocki
2019-11-22 5:15 ` Anson Huang
2019-11-22 9:59 ` Rafael J. Wysocki
2019-11-25 6:05 ` Anson Huang
2019-11-25 9:43 ` Anson Huang
2019-11-26 6:18 ` Viresh Kumar
2019-11-26 8:22 ` Anson Huang
2019-11-26 8:25 ` Viresh Kumar
2019-11-25 12:44 ` Rafael J. Wysocki
2019-11-26 8:57 ` Rafael J. Wysocki
2019-11-29 11:39 ` Rafael J. Wysocki
2019-11-29 13:44 ` Anson Huang
2019-12-05 8:53 ` Anson Huang
2019-12-05 10:48 ` Rafael J. Wysocki
2019-12-05 13:18 ` Anson Huang
2019-12-05 15:52 ` Rafael J. Wysocki
2019-12-09 10:31 ` Peng Fan
2019-12-09 10:37 ` Anson Huang
2019-12-09 10:56 ` Anson Huang
2019-12-09 11:23 ` Rafael J. Wysocki
2019-12-09 12:32 ` Anson Huang
2019-12-09 12:44 ` Rafael J. Wysocki
2019-12-09 14:18 ` Anson Huang
2019-12-10 5:39 ` Anson Huang
2019-12-10 5:53 ` Peng Fan
2019-12-10 7:05 ` Viresh Kumar
2019-12-10 8:22 ` Rafael J. Wysocki [this message]
2019-12-10 8:29 ` Anson Huang
2019-12-10 8:36 ` Viresh Kumar
2019-12-10 8:37 ` Peng Fan
2019-12-10 8:37 ` Rafael J. Wysocki
2019-12-10 8:43 ` Peng Fan
2019-12-10 8:45 ` Anson Huang
2019-12-10 8:50 ` Rafael J. Wysocki
2019-12-10 8:51 ` Anson Huang
2019-12-10 10:39 ` Rafael J. Wysocki
2019-12-10 10:54 ` Rafael J. Wysocki
2019-12-11 5:08 ` Anson Huang
2019-12-11 8:59 ` Peng Fan
2019-12-11 9:36 ` Rafael J. Wysocki
2019-12-11 9:43 ` Peng Fan
2019-12-11 9:52 ` Rafael J. Wysocki
2019-12-11 10:11 ` Peng Fan
2019-12-10 10:54 ` Viresh Kumar
2019-12-10 11:07 ` Rafael J. Wysocki
2019-12-10 8:57 ` Viresh Kumar
2019-12-10 11:03 ` Rafael J. Wysocki
2019-12-10 9:04 ` Rafael J. Wysocki
2019-12-10 8:31 ` Viresh Kumar
2019-12-10 8:12 ` Rafael J. Wysocki
2019-12-05 11:00 ` Viresh Kumar
2019-12-05 11:10 ` Rafael J. Wysocki
2019-12-05 11:17 ` Viresh Kumar
2019-11-21 10:37 ` Rafael J. Wysocki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJZ5v0hQeZDOiQC=UT+nEOc27E7Rn0O_7tLbeO1YMinCntDFCQ@mail.gmail.com' \
--to=rafael@kernel.org \
--cc=anson.huang@nxp.com \
--cc=linux-pm@vger.kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peng.fan@nxp.com \
--cc=peterz@infradead.org \
--cc=ping.bai@nxp.com \
--cc=rjw@rjwysocki.net \
--cc=vincent.guittot@linaro.org \
--cc=viresh.kumar@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).