linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH][RFC] cpufreq: Bring CPUs up even if cpufreq_online failed
@ 2017-03-25  4:20 Chen Yu
  2017-03-27 21:32 ` Rafael J. Wysocki
  2017-03-28 16:23 ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 4+ messages in thread
From: Chen Yu @ 2017-03-25  4:20 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-kernel, Sebastian Andrzej Siewior, Chen Yu,
	Rafael J. Wysocki, Viresh Kumar, Stable # 4 . 9+

There is a report that after
commit 27622b061eb4 ("cpufreq: Convert to hotplug state machine"),
the normal CPU offline/online cycle failed on some platforms.
According to the ftrace result, this problem was triggered on
platforms using acpi-freq as the default cpufreq driver,
and due to the lack of some ACPI freq method(_PCT eg), the
cpufreq_online failed and returned a negative value, thus the cpu
hotplug statemachine rollbacked the CPU online process. Actually
the failure of cpufreq_online should not impact the whole CPU
online process according to the original semantics before above patch.
BTW, during system bootup the cpufreq_online is not invoked via
cpuhotplug statemachine but by the cpufreq device creation process,
thus the APs can be brought up although cpufreq_online failed in that
stage.

This patch ignores the return value of cpufreq_online/offline and
prints a warning if there is a failure.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194581
Fixes: 27622b061eb4 ("cpufreq: Convert to hotplug state machine")
Reported-and-tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-pm@vger.kernel.org
Cc: Stable <stable@vger.kernel.org> # 4.9+
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 drivers/cpufreq/cpufreq.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b8ff617..1c13873 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2391,6 +2391,28 @@ EXPORT_SYMBOL_GPL(cpufreq_boost_enabled);
  *********************************************************************/
 static enum cpuhp_state hp_online;
 
+static int cpuhp_cpufreq_online(unsigned int cpu)
+{
+	int ret = cpufreq_online(cpu);
+
+	if (ret)
+		pr_err("Failed to bring cpufreq online for CPU%u. (%d)\n",
+			cpu, ret);
+
+	return 0;
+}
+
+static int cpuhp_cpufreq_offline(unsigned int cpu)
+{
+	int ret = cpufreq_offline(cpu);
+
+	if (ret)
+		pr_err("Failed to put cpufreq offline for CPU%u. (%d)\n",
+			cpu, ret);
+
+	return 0;
+}
+
 /**
  * cpufreq_register_driver - register a CPU Frequency driver
  * @driver_data: A struct cpufreq_driver containing the values#
@@ -2453,8 +2475,8 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	}
 
 	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "cpufreq:online",
-					cpufreq_online,
-					cpufreq_offline);
+					cpuhp_cpufreq_online,
+					cpuhp_cpufreq_offline);
 	if (ret < 0)
 		goto err_if_unreg;
 	hp_online = ret;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH][RFC] cpufreq: Bring CPUs up even if cpufreq_online failed
  2017-03-25  4:20 [PATCH][RFC] cpufreq: Bring CPUs up even if cpufreq_online failed Chen Yu
@ 2017-03-27 21:32 ` Rafael J. Wysocki
  2017-03-28 16:23 ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 4+ messages in thread
From: Rafael J. Wysocki @ 2017-03-27 21:32 UTC (permalink / raw)
  To: Chen Yu
  Cc: Linux PM, Linux Kernel Mailing List, Sebastian Andrzej Siewior,
	Rafael J. Wysocki, Viresh Kumar, Stable # 4 . 9+

On Sat, Mar 25, 2017 at 5:20 AM, Chen Yu <yu.c.chen@intel.com> wrote:
> There is a report that after
> commit 27622b061eb4 ("cpufreq: Convert to hotplug state machine"),
> the normal CPU offline/online cycle failed on some platforms.
> According to the ftrace result, this problem was triggered on
> platforms using acpi-freq as the default cpufreq driver,
> and due to the lack of some ACPI freq method(_PCT eg), the
> cpufreq_online failed and returned a negative value, thus the cpu
> hotplug statemachine rollbacked the CPU online process. Actually
> the failure of cpufreq_online should not impact the whole CPU
> online process according to the original semantics before above patch.
> BTW, during system bootup the cpufreq_online is not invoked via
> cpuhotplug statemachine but by the cpufreq device creation process,
> thus the APs can be brought up although cpufreq_online failed in that
> stage.
>
> This patch ignores the return value of cpufreq_online/offline and
> prints a warning if there is a failure.
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194581
> Fixes: 27622b061eb4 ("cpufreq: Convert to hotplug state machine")
> Reported-and-tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: linux-pm@vger.kernel.org
> Cc: Stable <stable@vger.kernel.org> # 4.9+
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
>  drivers/cpufreq/cpufreq.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b8ff617..1c13873 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2391,6 +2391,28 @@ EXPORT_SYMBOL_GPL(cpufreq_boost_enabled);
>   *********************************************************************/
>  static enum cpuhp_state hp_online;
>
> +static int cpuhp_cpufreq_online(unsigned int cpu)
> +{
> +       int ret = cpufreq_online(cpu);
> +
> +       if (ret)
> +               pr_err("Failed to bring cpufreq online for CPU%u. (%d)\n",
> +                       cpu, ret);

This pr_err() is not particularly useful IMO, because cpufreq_online()
complains on the majority of errors.

It would be better to make cpufreq_online() log errors with pr_err()
in all cases instead.

> +
> +       return 0;
> +}
> +
> +static int cpuhp_cpufreq_offline(unsigned int cpu)
> +{
> +       int ret = cpufreq_offline(cpu);
> +
> +       if (ret)
> +               pr_err("Failed to put cpufreq offline for CPU%u. (%d)\n",
> +                       cpu, ret);

And analogously here.

> +
> +       return 0;
> +}
> +
>  /**
>   * cpufreq_register_driver - register a CPU Frequency driver
>   * @driver_data: A struct cpufreq_driver containing the values#
> @@ -2453,8 +2475,8 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
>         }
>
>         ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "cpufreq:online",
> -                                       cpufreq_online,
> -                                       cpufreq_offline);
> +                                       cpuhp_cpufreq_online,
> +                                       cpuhp_cpufreq_offline);
>         if (ret < 0)
>                 goto err_if_unreg;
>         hp_online = ret;
> --
> 2.7.4

The rest looks OK to me.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH][RFC] cpufreq: Bring CPUs up even if cpufreq_online failed
  2017-03-25  4:20 [PATCH][RFC] cpufreq: Bring CPUs up even if cpufreq_online failed Chen Yu
  2017-03-27 21:32 ` Rafael J. Wysocki
@ 2017-03-28 16:23 ` Sebastian Andrzej Siewior
  2017-03-28 16:40   ` Rafael J. Wysocki
  1 sibling, 1 reply; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-03-28 16:23 UTC (permalink / raw)
  To: Chen Yu; +Cc: linux-pm, linux-kernel, Rafael J. Wysocki, Viresh Kumar

On 2017-03-25 12:20:11 [+0800], Chen Yu wrote:
> There is a report that after
> commit 27622b061eb4 ("cpufreq: Convert to hotplug state machine"),
> the normal CPU offline/online cycle failed on some platforms.
> According to the ftrace result, this problem was triggered on
> platforms using acpi-freq as the default cpufreq driver,
> and due to the lack of some ACPI freq method(_PCT eg), the
> cpufreq_online failed and returned a negative value, thus the cpu
> hotplug statemachine rollbacked the CPU online process. Actually
> the failure of cpufreq_online should not impact the whole CPU
> online process according to the original semantics before above patch.

Well, an error during bring up of CPU should not keep the system going
like nothing happend and cpufreq was ignoring return values without a
comment _why_ it is a good iea to do so.

> BTW, during system bootup the cpufreq_online is not invoked via
> cpuhotplug statemachine but by the cpufreq device creation process,
> thus the APs can be brought up although cpufreq_online failed in that
> stage.
> 
> This patch ignores the return value of cpufreq_online/offline and
> prints a warning if there is a failure.

What about dealing with this known error instead printing? If something
like "cpufreq_policy_alloc()" fails I will definitely a rollback and not
just a print. 

So what happens if we miss this "method(_PCT eg)"? We still want the
hotplug event right? So I would suggest a pr_once() that this _PCT
thingy is missing and continue without an error. I think pr_err_once()
is enough because I doubt the situation changes without an BIOS update
and a pr_err() will be visible also during suspend/resume, right?

Sebastian

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH][RFC] cpufreq: Bring CPUs up even if cpufreq_online failed
  2017-03-28 16:23 ` Sebastian Andrzej Siewior
@ 2017-03-28 16:40   ` Rafael J. Wysocki
  0 siblings, 0 replies; 4+ messages in thread
From: Rafael J. Wysocki @ 2017-03-28 16:40 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: Chen Yu, linux-pm, linux-kernel, Viresh Kumar

On Tuesday, March 28, 2017 06:23:17 PM Sebastian Andrzej Siewior wrote:
> On 2017-03-25 12:20:11 [+0800], Chen Yu wrote:
> > There is a report that after
> > commit 27622b061eb4 ("cpufreq: Convert to hotplug state machine"),
> > the normal CPU offline/online cycle failed on some platforms.
> > According to the ftrace result, this problem was triggered on
> > platforms using acpi-freq as the default cpufreq driver,
> > and due to the lack of some ACPI freq method(_PCT eg), the
> > cpufreq_online failed and returned a negative value, thus the cpu
> > hotplug statemachine rollbacked the CPU online process. Actually
> > the failure of cpufreq_online should not impact the whole CPU
> > online process according to the original semantics before above patch.
> 
> Well, an error during bring up of CPU should not keep the system going
> like nothing happend and cpufreq was ignoring return values without a
> comment _why_ it is a good iea to do so.
> 
> > BTW, during system bootup the cpufreq_online is not invoked via
> > cpuhotplug statemachine but by the cpufreq device creation process,
> > thus the APs can be brought up although cpufreq_online failed in that
> > stage.
> > 
> > This patch ignores the return value of cpufreq_online/offline and
> > prints a warning if there is a failure.
> 
> What about dealing with this known error instead printing? If something
> like "cpufreq_policy_alloc()" fails I will definitely a rollback and not
> just a print. 

cpufreq_online() will do a proper rollback in that case.  It may even log an
error by itself. :-)

> So what happens if we miss this "method(_PCT eg)"? We still want the
> hotplug event right? So I would suggest a pr_once() that this _PCT
> thingy is missing and continue without an error. I think pr_err_once()
> is enough because I doubt the situation changes without an BIOS update
> and a pr_err() will be visible also during suspend/resume, right?

Right.

That's why I wouldn't print anything here and let cpufreq_online()
and cpufreq_offline() deal with that.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-03-28 16:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-25  4:20 [PATCH][RFC] cpufreq: Bring CPUs up even if cpufreq_online failed Chen Yu
2017-03-27 21:32 ` Rafael J. Wysocki
2017-03-28 16:23 ` Sebastian Andrzej Siewior
2017-03-28 16:40   ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).