[PATCH 0/5] cpufreq: cppc: Fix suspend/resume specific races with FIE code


* [PATCH 0/5] cpufreq: cppc: Fix suspend/resume specific races with FIE code
@ 2021-06-10  8:23 ` Viresh Kumar
  0 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2021-06-10  8:23 UTC (permalink / raw)
  To: Rafael Wysocki, Qian Cai, Benjamin Herrenschmidt,
	Jonathan Corbet, Len Brown, Michael Ellerman, Paul Mackerras,
	Srinivas Pandruvada, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, Ionela Voinescu, Dirk Brandewie,
	linux-doc, linux-kernel, linuxppc-dev

Hi Qian,

It would be helpful if you could test this patchset and confirm whether the
races you mentioned have gone away and that the FIE code works as intended.

I don't have a real setup and so it won't be easy for me to test this out.

I have already sent a temporary fix for 5.13; this patchset is targeted for
5.14 and is based on that fix.

-------------------------8<-------------------------

The CPPC driver currently stops the frequency invariance related
kthread_work and irq_work from cppc_freq_invariance_exit() which is only
called during driver's removal.

This is not sufficient, as CPUs can be hot-unplugged while the driver is
in use; the same happens during system suspend/resume.

In such cases we can reach a state where the CPU has been removed by the
kernel but its kthread_work or irq_work hasn't been stopped.

Fix this by implementing the start_cpu() and stop_cpu() callbacks in the
cpufreq core, which will be called for each CPU's addition/removal.

A similar callback, stop_cpu(), already existed in the cpufreq core. It isn't
required in its old form anymore, so its users are migrated to the exit()
callback instead.
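
The gap described above can be illustrated with a small userspace model. This
is only a sketch under stated assumptions: struct fie_state, fie_tick() and
the cpu_offline_*() helpers are hypothetical names, and a boolean flag stands
in for a queued kthread_work/irq_work. It is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the per-CPU FIE state (not the kernel struct). */
struct fie_state {
	bool online;       /* CPU is currently online */
	bool work_pending; /* models a queued kthread_work/irq_work */
};

static struct fie_state fie[NR_CPUS];

/* Bring every CPU back to "offline, nothing queued". */
static void fie_reset(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		fie[cpu].online = false;
		fie[cpu].work_pending = false;
	}
}

/* Models a scheduler tick queuing FIE work on an online CPU. */
static void fie_tick(unsigned int cpu)
{
	if (fie[cpu].online)
		fie[cpu].work_pending = true;
}

/* Old behaviour: hot-unplug leaves the queued work untouched. */
static void cpu_offline_old(unsigned int cpu)
{
	fie[cpu].online = false;
}

/* Proposed behaviour: a per-CPU stop hook cancels the work first. */
static void cpu_offline_new(unsigned int cpu)
{
	fie[cpu].work_pending = false; /* models cancel/sync of pending work */
	fie[cpu].online = false;
}

/* True if any offline CPU still has work queued (the problematic state). */
static bool has_stale_work(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (!fie[cpu].online && fie[cpu].work_pending)
			return true;
	return false;
}
```

With cpu_offline_old() a CPU can go away with work still queued; with
cpu_offline_new() has_stale_work() stays false after the same sequence.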

This is targeted for v5.14-rc1.

--
Viresh

Viresh Kumar (5):
  cpufreq: cppc: Migrate to ->exit() callback instead of ->stop_cpu()
  cpufreq: intel_pstate: Migrate to ->exit() callback instead of
    ->stop_cpu()
  cpufreq: powernv: Migrate to ->exit() callback instead of
    ->stop_cpu()
  cpufreq: Add start_cpu() and stop_cpu() callbacks
  cpufreq: cppc: Fix suspend/resume specific races with the FIE code

 Documentation/cpu-freq/cpu-drivers.rst |   7 +-
 drivers/cpufreq/Kconfig.arm            |   1 -
 drivers/cpufreq/cppc_cpufreq.c         | 163 ++++++++++++++-----------
 drivers/cpufreq/cpufreq.c              |  11 +-
 drivers/cpufreq/intel_pstate.c         |   9 +-
 drivers/cpufreq/powernv-cpufreq.c      |  23 ++--
 include/linux/cpufreq.h                |   5 +-
 7 files changed, 119 insertions(+), 100 deletions(-)

-- 
2.31.1.272.g89b43f80a514


* [PATCH 1/5] cpufreq: cppc: Migrate to ->exit() callback instead of ->stop_cpu()
  2021-06-10  8:23 ` Viresh Kumar
  (?)
@ 2021-06-10  8:23 ` Viresh Kumar
  -1 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2021-06-10  8:23 UTC (permalink / raw)
  To: Rafael Wysocki, Qian Cai, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, Ionela Voinescu, linux-kernel

commit 367dc4aa932b ("cpufreq: Add stop CPU callback to cpufreq_driver
interface") added the stop_cpu() callback to allow the drivers to do
clean up before the CPU is completely down and its state cannot be
modified.

At that time the CPU hotplug framework used to call the cpufreq core's
registered notifier for different events like CPU_DOWN_PREPARE and
CPU_POST_DEAD. The stop_cpu() callback was called during the
CPU_DOWN_PREPARE event.

This is no longer the case: cpuhp_cpufreq_offline() is called only once
by the CPU hotplug core now and we don't really need two separate
callbacks for cpufreq drivers, i.e. stop_cpu() and exit(), as everything
can be done from the exit() callback itself.

Migrate to using the exit() callback instead of stop_cpu().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cppc_cpufreq.c | 46 ++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 22 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 3848b4c222e1..30a861538784 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -384,27 +384,6 @@ static int cppc_verify_policy(struct cpufreq_policy_data *policy)
 	return 0;
 }
 
-static void cppc_cpufreq_stop_cpu(struct cpufreq_policy *policy)
-{
-	struct cppc_cpudata *cpu_data = policy->driver_data;
-	struct cppc_perf_caps *caps = &cpu_data->perf_caps;
-	unsigned int cpu = policy->cpu;
-	int ret;
-
-	cpu_data->perf_ctrls.desired_perf = caps->lowest_perf;
-
-	ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
-	if (ret)
-		pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
-			 caps->lowest_perf, cpu, ret);
-
-	/* Remove CPU node from list and free driver data for policy */
-	free_cpumask_var(cpu_data->shared_cpu_map);
-	list_del(&cpu_data->node);
-	kfree(policy->driver_data);
-	policy->driver_data = NULL;
-}
-
 /*
  * The PCC subspace describes the rate at which platform can accept commands
  * on the shared PCC channel (including READs which do not count towards freq
@@ -557,6 +536,29 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 	return ret;
 }
 
+static int cppc_cpufreq_cpu_exit(struct cpufreq_policy *policy)
+{
+	struct cppc_cpudata *cpu_data = policy->driver_data;
+	struct cppc_perf_caps *caps = &cpu_data->perf_caps;
+	unsigned int cpu = policy->cpu;
+	int ret;
+
+	cpu_data->perf_ctrls.desired_perf = caps->lowest_perf;
+
+	ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
+	if (ret)
+		pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
+			 caps->lowest_perf, cpu, ret);
+
+	/* Remove CPU node from list and free driver data for policy */
+	free_cpumask_var(cpu_data->shared_cpu_map);
+	list_del(&cpu_data->node);
+	kfree(policy->driver_data);
+	policy->driver_data = NULL;
+
+	return 0;
+}
+
 static inline u64 get_delta(u64 t1, u64 t0)
 {
 	if (t1 > t0 || t0 > ~(u32)0)
@@ -665,7 +667,7 @@ static struct cpufreq_driver cppc_cpufreq_driver = {
 	.target = cppc_cpufreq_set_target,
 	.get = cppc_cpufreq_get_rate,
 	.init = cppc_cpufreq_cpu_init,
-	.stop_cpu = cppc_cpufreq_stop_cpu,
+	.exit = cppc_cpufreq_cpu_exit,
 	.set_boost = cppc_cpufreq_set_boost,
 	.attr = cppc_cpufreq_attr,
 	.name = "cppc_cpufreq",
-- 
2.31.1.272.g89b43f80a514



* [PATCH 2/5] cpufreq: intel_pstate: Migrate to ->exit() callback instead of ->stop_cpu()
  2021-06-10  8:23 ` Viresh Kumar
  (?)
  (?)
@ 2021-06-10  8:23 ` Viresh Kumar
  -1 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2021-06-10  8:23 UTC (permalink / raw)
  To: Rafael Wysocki, Qian Cai, Srinivas Pandruvada, Len Brown, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, Ionela Voinescu, Dirk Brandewie, linux-kernel

commit 367dc4aa932b ("cpufreq: Add stop CPU callback to cpufreq_driver
interface") added the stop_cpu() callback to allow the drivers to do
clean up before the CPU is completely down and its state cannot be
modified.

At that time the CPU hotplug framework used to call the cpufreq core's
registered notifier for different events like CPU_DOWN_PREPARE and
CPU_POST_DEAD. The stop_cpu() callback was called during the
CPU_DOWN_PREPARE event.

This is no longer the case: cpuhp_cpufreq_offline() is called only once
by the CPU hotplug core now and we don't really need two separate
callbacks for cpufreq drivers, i.e. stop_cpu() and exit(), as everything
can be done from the exit() callback itself.

Migrate to using the exit() callback instead of stop_cpu().

Cc: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/intel_pstate.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index f0401064d7aa..9d3191663925 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2374,17 +2374,11 @@ static int intel_pstate_cpu_online(struct cpufreq_policy *policy)
 	return 0;
 }
 
-static void intel_pstate_stop_cpu(struct cpufreq_policy *policy)
-{
-	pr_debug("CPU %d stopping\n", policy->cpu);
-
-	intel_pstate_clear_update_util_hook(policy->cpu);
-}
-
 static int intel_pstate_cpu_exit(struct cpufreq_policy *policy)
 {
 	pr_debug("CPU %d exiting\n", policy->cpu);
 
+	intel_pstate_clear_update_util_hook(policy->cpu);
 	policy->fast_switch_possible = false;
 
 	return 0;
@@ -2451,7 +2445,6 @@ static struct cpufreq_driver intel_pstate = {
 	.resume		= intel_pstate_resume,
 	.init		= intel_pstate_cpu_init,
 	.exit		= intel_pstate_cpu_exit,
-	.stop_cpu	= intel_pstate_stop_cpu,
 	.offline	= intel_pstate_cpu_offline,
 	.online		= intel_pstate_cpu_online,
 	.update_limits	= intel_pstate_update_limits,
-- 
2.31.1.272.g89b43f80a514



* [PATCH 3/5] cpufreq: powernv: Migrate to ->exit() callback instead of ->stop_cpu()
  2021-06-10  8:23 ` Viresh Kumar
@ 2021-06-10  8:23   ` Viresh Kumar
  -1 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2021-06-10  8:23 UTC (permalink / raw)
  To: Rafael Wysocki, Qian Cai, Viresh Kumar, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras
  Cc: linux-pm, Vincent Guittot, Ionela Voinescu, linuxppc-dev, linux-kernel

commit 367dc4aa932b ("cpufreq: Add stop CPU callback to cpufreq_driver
interface") added the stop_cpu() callback to allow the drivers to do
clean up before the CPU is completely down and its state cannot be
modified.

At that time the CPU hotplug framework used to call the cpufreq core's
registered notifier for different events like CPU_DOWN_PREPARE and
CPU_POST_DEAD. The stop_cpu() callback was called during the
CPU_DOWN_PREPARE event.

This is no longer the case: cpuhp_cpufreq_offline() is called only once
by the CPU hotplug core now and we don't really need two separate
callbacks for cpufreq drivers, i.e. stop_cpu() and exit(), as everything
can be done from the exit() callback itself.

Migrate to using the exit() callback instead of stop_cpu().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/powernv-cpufreq.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index e439b43c19eb..005600cef273 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -875,7 +875,15 @@ static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
 
 static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
 {
-	/* timer is deleted in cpufreq_cpu_stop() */
+	struct powernv_smp_call_data freq_data;
+	struct global_pstate_info *gpstates = policy->driver_data;
+
+	freq_data.pstate_id = idx_to_pstate(powernv_pstate_info.min);
+	freq_data.gpstate_id = idx_to_pstate(powernv_pstate_info.min);
+	smp_call_function_single(policy->cpu, set_pstate, &freq_data, 1);
+	if (gpstates)
+		del_timer_sync(&gpstates->timer);
+
 	kfree(policy->driver_data);
 
 	return 0;
@@ -1007,18 +1015,6 @@ static struct notifier_block powernv_cpufreq_opal_nb = {
 	.priority	= 0,
 };
 
-static void powernv_cpufreq_stop_cpu(struct cpufreq_policy *policy)
-{
-	struct powernv_smp_call_data freq_data;
-	struct global_pstate_info *gpstates = policy->driver_data;
-
-	freq_data.pstate_id = idx_to_pstate(powernv_pstate_info.min);
-	freq_data.gpstate_id = idx_to_pstate(powernv_pstate_info.min);
-	smp_call_function_single(policy->cpu, set_pstate, &freq_data, 1);
-	if (gpstates)
-		del_timer_sync(&gpstates->timer);
-}
-
 static unsigned int powernv_fast_switch(struct cpufreq_policy *policy,
 					unsigned int target_freq)
 {
@@ -1042,7 +1038,6 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
 	.target_index	= powernv_cpufreq_target_index,
 	.fast_switch	= powernv_fast_switch,
 	.get		= powernv_cpufreq_get,
-	.stop_cpu	= powernv_cpufreq_stop_cpu,
 	.attr		= powernv_cpu_freq_attr,
 };
 
-- 
2.31.1.272.g89b43f80a514



* [PATCH 4/5] cpufreq: Add start_cpu() and stop_cpu() callbacks
  2021-06-10  8:23 ` Viresh Kumar
                   ` (3 preceding siblings ...)
  (?)
@ 2021-06-10  8:24 ` Viresh Kumar
  -1 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2021-06-10  8:24 UTC (permalink / raw)
  To: Rafael Wysocki, Qian Cai, Viresh Kumar, Jonathan Corbet
  Cc: linux-pm, Vincent Guittot, Ionela Voinescu, linux-doc, linux-kernel

On CPU hotplug, the cpufreq core doesn't call any driver-specific
callback unless all the CPUs of a policy have gone away.

There is now a need for a callback to be called in such cases (for the
CPPC cpufreq driver). Reuse the existing stop_cpu() callback and add a
new one for start_cpu().
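
The dispatch described above can be sketched in plain C. This is a simplified
userspace model, not the cpufreq core: struct policy, struct driver_ops,
core_add_policy_cpu(), core_offline() and the demo_* counters are all
hypothetical, and governor handling is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy: a bitmask of online CPUs sharing one policy. */
struct policy {
	unsigned int cpus; /* bit N set => CPU N is online in this policy */
};

/* Simplified driver ops; every hook is optional and may be NULL. */
struct driver_ops {
	void (*start_cpu)(struct policy *p, unsigned int cpu);
	void (*stop_cpu)(struct policy *p, unsigned int cpu);
	int (*exit)(struct policy *p);
};

/* Core path for a CPU joining an existing policy. */
static void core_add_policy_cpu(const struct driver_ops *ops,
				struct policy *p, unsigned int cpu)
{
	p->cpus |= 1u << cpu;

	/* Per-CPU initialization, if the driver asked for it */
	if (ops->start_cpu)
		ops->start_cpu(p, cpu);
}

/* Core path for a CPU going offline. */
static void core_offline(const struct driver_ops *ops,
			 struct policy *p, unsigned int cpu)
{
	p->cpus &= ~(1u << cpu);

	/* Per-CPU teardown runs for every CPU that leaves ... */
	if (ops->stop_cpu)
		ops->stop_cpu(p, cpu);

	/* ... while exit() runs only once the policy has no CPUs left. */
	if (!p->cpus && ops->exit)
		ops->exit(p);
}

/* Demo driver that only counts how often each hook fires. */
static int demo_starts, demo_stops, demo_exits;

static void demo_start_cpu(struct policy *p, unsigned int cpu)
{
	(void)p; (void)cpu;
	demo_starts++;
}

static void demo_stop_cpu(struct policy *p, unsigned int cpu)
{
	(void)p; (void)cpu;
	demo_stops++;
}

static int demo_exit(struct policy *p)
{
	(void)p;
	demo_exits++;
	return 0;
}

static const struct driver_ops demo_ops = {
	.start_cpu = demo_start_cpu,
	.stop_cpu = demo_stop_cpu,
	.exit = demo_exit,
};
```

In this model, removing one of two CPUs triggers stop_cpu() but not exit();
removing the last one triggers both.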

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 Documentation/cpu-freq/cpu-drivers.rst |  7 +++++--
 drivers/cpufreq/cpufreq.c              | 11 ++++++++---
 include/linux/cpufreq.h                |  5 ++++-
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/Documentation/cpu-freq/cpu-drivers.rst b/Documentation/cpu-freq/cpu-drivers.rst
index a697278ce190..15cfe42b4075 100644
--- a/Documentation/cpu-freq/cpu-drivers.rst
+++ b/Documentation/cpu-freq/cpu-drivers.rst
@@ -71,8 +71,11 @@ And optionally
  .exit - A pointer to a per-policy cleanup function called during
  CPU_POST_DEAD phase of cpu hotplug process.
 
- .stop_cpu - A pointer to a per-policy stop function called during
- CPU_DOWN_PREPARE phase of cpu hotplug process.
+ .start_cpu - A pointer to a per-policy per-cpu start function called
+ during CPU online phase.
+
+ .stop_cpu - A pointer to a per-policy per-cpu stop function called
+ during CPU offline phase.
 
  .suspend - A pointer to a per-policy suspend function which is called
  with interrupts disabled and _after_ the governor is stopped for the
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 802abc925b2a..fac2522be5c3 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1119,6 +1119,10 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, unsigned int cp
 
 	cpumask_set_cpu(cpu, policy->cpus);
 
+	/* Do CPU specific initialization if required */
+	if (cpufreq_driver->start_cpu)
+		cpufreq_driver->start_cpu(policy, cpu);
+
 	if (has_target()) {
 		ret = cpufreq_start_governor(policy);
 		if (ret)
@@ -1581,6 +1585,10 @@ static int cpufreq_offline(unsigned int cpu)
 		policy->cpu = cpumask_any(policy->cpus);
 	}
 
+	/* Do CPU specific de-initialization if required */
+	if (cpufreq_driver->stop_cpu)
+		cpufreq_driver->stop_cpu(policy, cpu);
+
 	/* Start governor again for active policy */
 	if (!policy_is_inactive(policy)) {
 		if (has_target()) {
@@ -1597,9 +1605,6 @@ static int cpufreq_offline(unsigned int cpu)
 		policy->cdev = NULL;
 	}
 
-	if (cpufreq_driver->stop_cpu)
-		cpufreq_driver->stop_cpu(policy);
-
 	if (has_target())
 		cpufreq_exit_governor(policy);
 
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 353969c7acd3..c281b3df4e2f 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -371,7 +371,10 @@ struct cpufreq_driver {
 	int		(*online)(struct cpufreq_policy *policy);
 	int		(*offline)(struct cpufreq_policy *policy);
 	int		(*exit)(struct cpufreq_policy *policy);
-	void		(*stop_cpu)(struct cpufreq_policy *policy);
+
+	/* CPU specific start/stop */
+	void		(*start_cpu)(struct cpufreq_policy *policy, unsigned int cpu);
+	void		(*stop_cpu)(struct cpufreq_policy *policy, unsigned int cpu);
 	int		(*suspend)(struct cpufreq_policy *policy);
 	int		(*resume)(struct cpufreq_policy *policy);
 
-- 
2.31.1.272.g89b43f80a514



* [PATCH 5/5] cpufreq: cppc: Fix suspend/resume specific races with the FIE code
  2021-06-10  8:23 ` Viresh Kumar
                   ` (4 preceding siblings ...)
  (?)
@ 2021-06-10  8:24 ` Viresh Kumar
  -1 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2021-06-10  8:24 UTC (permalink / raw)
  To: Rafael Wysocki, Qian Cai, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, Ionela Voinescu, linux-kernel

The CPPC driver currently stops the frequency invariance related
kthread_work and irq_work from cppc_freq_invariance_exit() which is only
called during driver's removal.

This is not sufficient, as CPUs can be hot-unplugged while the driver is
in use; the same happens during system suspend/resume.

In such cases we can reach a state where the CPU has been removed by the
kernel but its kthread_work or irq_work hasn't been stopped.

Fix this by implementing the start_cpu() and stop_cpu() callbacks of the
cpufreq core, which will be called for each CPU's addition/removal.
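
The shape of this change, per-CPU start/stop helpers that the policy-level
init/exit paths simply loop over, can be sketched as a userspace model. The
names below (struct fie_cpu, fie_start_cpu(), fie_stop_cpu(),
fie_policy_init(), fie_policy_exit(), fie_tick()) are hypothetical, and
booleans stand in for scale-freq source registration and queued work; the
real driver uses the kernel primitives shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical per-CPU state for the model. */
struct fie_cpu {
	bool registered;   /* models being a scale-freq source for this CPU */
	bool work_pending; /* models queued irq_work / kthread_work */
};

static struct fie_cpu fie[NR_CPUS];

/* Per-CPU start: reset state, then register for tick callbacks. */
static void fie_start_cpu(unsigned int cpu)
{
	fie[cpu].work_pending = false;
	fie[cpu].registered = true;
}

/*
 * Per-CPU stop: unregister first so no new work can be queued,
 * then drop whatever is still pending.
 */
static void fie_stop_cpu(unsigned int cpu)
{
	fie[cpu].registered = false;
	fie[cpu].work_pending = false;
}

/* Policy init just starts every CPU in the policy's mask. */
static void fie_policy_init(unsigned int mask)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			fie_start_cpu(cpu);
}

/* Policy exit stops every CPU still in the mask. */
static void fie_policy_exit(unsigned int mask)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			fie_stop_cpu(cpu);
}

/* Models a tick: only registered CPUs get work queued. */
static void fie_tick(unsigned int cpu)
{
	if (fie[cpu].registered)
		fie[cpu].work_pending = true;
}
```

Because the same helpers serve both hotplug and policy teardown, a CPU that
leaves early is cleaned up without waiting for driver removal.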

The FIE feature was marked BROKEN earlier; revert that.

Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/Kconfig.arm    |   1 -
 drivers/cpufreq/cppc_cpufreq.c | 117 +++++++++++++++++++--------------
 2 files changed, 68 insertions(+), 50 deletions(-)

diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index 614c34350f41..a5c5f70acfc9 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -22,7 +22,6 @@ config ACPI_CPPC_CPUFREQ
 config ACPI_CPPC_CPUFREQ_FIE
 	bool "Frequency Invariance support for CPPC cpufreq driver"
 	depends on ACPI_CPPC_CPUFREQ && GENERIC_ARCH_TOPOLOGY
-	depends on BROKEN
 	default y
 	help
 	  This extends frequency invariance support in the CPPC cpufreq driver,
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 30a861538784..82167c657098 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -74,7 +74,6 @@ struct cppc_freq_invariance {
 
 static DEFINE_PER_CPU(struct cppc_freq_invariance, cppc_freq_inv);
 static struct kthread_worker *kworker_fie;
-static bool fie_disabled;
 
 static struct cpufreq_driver cppc_cpufreq_driver;
 static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu);
@@ -151,35 +150,64 @@ static struct scale_freq_data cppc_sftd = {
 	.set_freq_scale = cppc_scale_freq_tick,
 };
 
-static void cppc_freq_invariance_policy_init(struct cpufreq_policy *policy,
-					     struct cppc_cpudata *cpu_data)
+static void cppc_cpufreq_start_cpu(struct cpufreq_policy *policy,
+				   unsigned int cpu)
 {
+	struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, cpu);
 	struct cppc_perf_fb_ctrs fb_ctrs = {0};
-	struct cppc_freq_invariance *cppc_fi;
-	int i, ret;
+	int ret;
 
-	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
-		return;
+	cppc_fi->cpu = cpu;
+	cppc_fi->cpu_data = policy->driver_data;
+	kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
+	init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
 
-	if (fie_disabled)
+	ret = cppc_get_perf_ctrs(cpu, &fb_ctrs);
+	if (ret) {
+		pr_warn("%s: failed to read perf counters: %d\n",
+				__func__, ret);
 		return;
+	} else {
+		cppc_fi->prev_perf_fb_ctrs = fb_ctrs;
+	}
 
-	for_each_cpu(i, policy->cpus) {
-		cppc_fi = &per_cpu(cppc_freq_inv, i);
-		cppc_fi->cpu = i;
-		cppc_fi->cpu_data = cpu_data;
-		kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
-		init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
+	/* Register for freq-invariance */
+	topology_set_scale_freq_source(&cppc_sftd, cpumask_of(cpu));
+}
 
-		ret = cppc_get_perf_ctrs(i, &fb_ctrs);
-		if (ret) {
-			pr_warn("%s: failed to read perf counters: %d\n",
-				__func__, ret);
-			fie_disabled = true;
-		} else {
-			cppc_fi->prev_perf_fb_ctrs = fb_ctrs;
-		}
-	}
+static void cppc_cpufreq_stop_cpu(struct cpufreq_policy *policy,
+				  unsigned int cpu)
+{
+	struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, cpu);
+
+	topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, cpumask_of(cpu));
+
+	irq_work_sync(&cppc_fi->irq_work);
+	kthread_cancel_work_sync(&cppc_fi->work);
+}
+
+static int cppc_freq_invariance_policy_init(struct cpufreq_policy *policy)
+{
+	int cpu;
+
+	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+		return 0;
+
+	for_each_cpu(cpu, policy->cpus)
+		cppc_cpufreq_start_cpu(policy, cpu);
+
+	return 0;
+}
+
+static void cppc_freq_invariance_policy_exit(struct cpufreq_policy *policy)
+{
+	int cpu;
+
+	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
+		return;
+
+	for_each_cpu(cpu, policy->cpus)
+		cppc_cpufreq_stop_cpu(policy, cpu);
 }
 
 static void __init cppc_freq_invariance_init(void)
@@ -202,9 +230,6 @@ static void __init cppc_freq_invariance_init(void)
 	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
 		return;
 
-	if (fie_disabled)
-		return;
-
 	kworker_fie = kthread_create_worker(0, "cppc_fie");
 	if (IS_ERR(kworker_fie))
 		return;
@@ -217,36 +242,28 @@ static void __init cppc_freq_invariance_init(void)
 		return;
 	}
 
-	/* Register for freq-invariance */
-	topology_set_scale_freq_source(&cppc_sftd, cpu_present_mask);
+	cppc_cpufreq_driver.start_cpu = cppc_cpufreq_start_cpu;
+	cppc_cpufreq_driver.stop_cpu = cppc_cpufreq_stop_cpu;
 }
 
 static void cppc_freq_invariance_exit(void)
 {
-	struct cppc_freq_invariance *cppc_fi;
-	int i;
-
 	if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
 		return;
 
-	if (fie_disabled)
-		return;
-
-	topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, cpu_present_mask);
-
-	for_each_possible_cpu(i) {
-		cppc_fi = &per_cpu(cppc_freq_inv, i);
-		irq_work_sync(&cppc_fi->irq_work);
-	}
-
 	kthread_destroy_worker(kworker_fie);
 	kworker_fie = NULL;
 }
 
 #else
+static inline int
+cppc_freq_invariance_policy_init(struct cpufreq_policy *policy)
+{
+	return 0;
+}
+
 static inline void
-cppc_freq_invariance_policy_init(struct cpufreq_policy *policy,
-				 struct cppc_cpudata *cpu_data)
+cppc_freq_invariance_policy_exit(struct cpufreq_policy *policy)
 {
 }
 
@@ -529,11 +546,10 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 	if (ret) {
 		pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
 			 caps->highest_perf, cpu, ret);
-	} else {
-		cppc_freq_invariance_policy_init(policy, cpu_data);
+		return ret;
 	}
 
-	return ret;
+	return cppc_freq_invariance_policy_init(policy);
 }
 
 static int cppc_cpufreq_cpu_exit(struct cpufreq_policy *policy)
@@ -543,6 +559,8 @@ static int cppc_cpufreq_cpu_exit(struct cpufreq_policy *policy)
 	unsigned int cpu = policy->cpu;
 	int ret;
 
+	cppc_freq_invariance_policy_exit(policy);
+
 	cpu_data->perf_ctrls.desired_perf = caps->lowest_perf;
 
 	ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
@@ -728,10 +746,11 @@ static int __init cppc_cpufreq_init(void)
 	INIT_LIST_HEAD(&cpu_data_list);
 
 	cppc_check_hisi_workaround();
+	cppc_freq_invariance_init();
 
 	ret = cpufreq_register_driver(&cppc_cpufreq_driver);
-	if (!ret)
-		cppc_freq_invariance_init();
+	if (ret)
+		cppc_freq_invariance_exit();
 
 	return ret;
 }
@@ -750,8 +769,8 @@ static inline void free_cpu_data(void)
 
 static void __exit cppc_cpufreq_exit(void)
 {
-	cppc_freq_invariance_exit();
 	cpufreq_unregister_driver(&cppc_cpufreq_driver);
+	cppc_freq_invariance_exit();
 
 	free_cpu_data();
 }
-- 
2.31.1.272.g89b43f80a514



* Re: [PATCH 0/5] cpufreq: cppc: Fix suspend/resume specific races with FIE code
  2021-06-10  8:23 ` Viresh Kumar
@ 2021-06-14 13:48   ` Qian Cai
  -1 siblings, 0 replies; 18+ messages in thread
From: Qian Cai @ 2021-06-14 13:48 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, Benjamin Herrenschmidt,
	Jonathan Corbet, Len Brown, Michael Ellerman, Paul Mackerras,
	Srinivas Pandruvada
  Cc: linux-pm, Vincent Guittot, Ionela Voinescu, Dirk Brandewie,
	linux-doc, linux-kernel, linuxppc-dev



On 6/10/2021 4:23 AM, Viresh Kumar wrote:
> Hi Qian,
> 
> It would be helpful if you can test this patchset and confirm if the races you
> mentioned went away or not and that the FIE code works as we wanted it to.
> 
> I don't have a real setup and so it won't be easy for me to test this out.
> 
> I have already sent a temporary fix for 5.13 and this patchset is targeted for
> 5.14 and is based over that.

Unfortunately, this series looks like it needs more work.

[  487.773586][    T0] CPU17: Booted secondary processor 0x0000000801 [0x503f0002]
[  487.976495][  T670] list_del corruption. next->prev should be ffff009b66e9ec70, but was ffff009b66dfec70
[  487.987037][  T670] ------------[ cut here ]------------
[  487.992351][  T670] kernel BUG at lib/list_debug.c:54!
[  487.997810][  T670] Internal error: Oops - BUG: 0 [#1] SMP
[  488.003295][  T670] Modules linked in: cpufreq_userspace xfs loop cppc_cpufreq processor efivarfs ip_tables x_tables ext4 mbcache jbd2 dm_mod igb i2c_algo_bit nvme mlx5_core i2c_core nvme_core firmware_class
[  488.021759][  T670] CPU: 1 PID: 670 Comm: cppc_fie Not tainted 5.13.0-rc5-next-20210611+ #46
[  488.030190][  T670] Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020
[  488.038705][  T670] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO BTYPE=--)
[  488.045398][  T670] pc : __list_del_entry_valid+0x154/0x158
[  488.050969][  T670] lr : __list_del_entry_valid+0x154/0x158
[  488.056534][  T670] sp : ffff8000229afd70
[  488.060534][  T670] x29: ffff8000229afd70 x28: ffff0008c8f4f340 x27: dfff800000000000
[  488.068361][  T670] x26: ffff009b66e9ec70 x25: ffff800011c8b4d0 x24: ffff0008d4bfe488
[  488.076188][  T670] x23: ffff0008c8f4f340 x22: ffff0008c8f4f340 x21: ffff009b6789ec70
[  488.084015][  T670] x20: ffff0008d4bfe4c8 x19: ffff009b66e9ec70 x18: ffff0008c8f4fd70
[  488.091842][  T670] x17: 20747562202c3037 x16: 6365396536366239 x15: 0000000000000028
[  488.099669][  T670] x14: 0000000000000000 x13: 0000000000000001 x12: ffff60136cdd3447
[  488.107495][  T670] x11: 1fffe0136cdd3446 x10: ffff60136cdd3446 x9 : ffff8000103ee444
[  488.115322][  T670] x8 : ffff009b66e9a237 x7 : 0000000000000001 x6 : ffff009b66e9a230
[  488.123149][  T670] x5 : 00009fec9322cbba x4 : ffff60136cdd3447 x3 : 1fffe001191e9e69
[  488.130975][  T670] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000054
[  488.138803][  T670] Call trace:
[  488.141935][  T670]  __list_del_entry_valid+0x154/0x158
[  488.147153][  T670]  kthread_worker_fn+0x15c/0xda0
[  488.151939][  T670]  kthread+0x3ac/0x460
[  488.155854][  T670]  ret_from_fork+0x10/0x18
[  488.160120][  T670] Code: 911e8000 aa1303e1 910a0000 941b595b (d4210000)
[  488.166901][  T670] ---[ end trace e637e2d38b2cc087 ]---
[  488.172206][  T670] Kernel panic - not syncing: Oops - BUG: Fatal exception
[  488.179182][  T670] SMP: stopping secondary CPUs
[  489.209347][  T670] SMP: failed to stop secondary CPUs 0-1,10-11,16-17,31
[  489.216128][  T][  T670] Memoryn ]---

> 
> -------------------------8<-------------------------
> 
> The CPPC driver currently stops the frequency invariance related
> kthread_work and irq_work from cppc_freq_invariance_exit() which is only
> called during driver's removal.
> 
> This is not sufficient as the CPUs can get hot-plugged out while the
> driver is in use; the same also happens during system suspend/resume.
> 
> In such cases we can reach a state where the CPU is removed by the
> kernel but its kthread_work or irq_work aren't stopped.
> 
> Fix this by implementing the start_cpu() and stop_cpu() callbacks in the
> cpufreq core, which will be called for each CPU's addition/removal.
> 
> A similar call was already available in the cpufreq core, which isn't required
> anymore and so its users are migrated to use exit() callback instead.
> 
> This is targeted for v5.14-rc1.
> 
> --
> Viresh
> 
> Viresh Kumar (5):
>   cpufreq: cppc: Migrate to ->exit() callback instead of ->stop_cpu()
>   cpufreq: intel_pstate: Migrate to ->exit() callback instead of
>     ->stop_cpu()
>   cpufreq: powerenv: Migrate to ->exit() callback instead of
>     ->stop_cpu()
>   cpufreq: Add start_cpu() and stop_cpu() callbacks
>   cpufreq: cppc: Fix suspend/resume specific races with the FIE code
> 
>  Documentation/cpu-freq/cpu-drivers.rst |   7 +-
>  drivers/cpufreq/Kconfig.arm            |   1 -
>  drivers/cpufreq/cppc_cpufreq.c         | 163 ++++++++++++++-----------
>  drivers/cpufreq/cpufreq.c              |  11 +-
>  drivers/cpufreq/intel_pstate.c         |   9 +-
>  drivers/cpufreq/powernv-cpufreq.c      |  23 ++--
>  include/linux/cpufreq.h                |   5 +-
>  7 files changed, 119 insertions(+), 100 deletions(-)
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] cpufreq: cppc: Fix suspend/resume specific races with FIE code
  2021-06-14 13:48   ` Qian Cai
@ 2021-06-15  7:50     ` Viresh Kumar
  -1 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2021-06-15  7:50 UTC (permalink / raw)
  To: Qian Cai
  Cc: Rafael Wysocki, Benjamin Herrenschmidt, Jonathan Corbet,
	Len Brown, Michael Ellerman, Paul Mackerras, Srinivas Pandruvada,
	linux-pm, Vincent Guittot, Ionela Voinescu, Dirk Brandewie,
	linux-doc, linux-kernel, linuxppc-dev

Hi Qian,

First of all, thanks for testing this. I need more of your help to test
this out :)

FWIW, I did test this on my Hikey board today, with some hacks, and
tried multiple insmod/rmmod operations for the driver, and I wasn't
able to reproduce the issue you reported. I did enable the list-debug
config option.

On 14-06-21, 09:48, Qian Cai wrote:
> Unfortunately, this series looks like it needs more work.
> 
> [  487.773586][    T0] CPU17: Booted secondary processor 0x0000000801 [0x503f0002]
> [  487.976495][  T670] list_del corruption. next->prev should be ffff009b66e9ec70, but was ffff009b66dfec70
> [  487.987037][  T670] ------------[ cut here ]------------
> [  487.992351][  T670] kernel BUG at lib/list_debug.c:54!
> [  487.997810][  T670] Internal error: Oops - BUG: 0 [#1] SMP
> [  488.003295][  T670] Modules linked in: cpufreq_userspace xfs loop cppc_cpufreq processor efivarfs ip_tables x_tables ext4 mbcache jbd2 dm_mod igb i2c_algo_bit nvme mlx5_core i2c_core nvme_core firmware_class
> [  488.021759][  T670] CPU: 1 PID: 670 Comm: cppc_fie Not tainted 5.13.0-rc5-next-20210611+ #46
> [  488.030190][  T670] Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020
> [  488.038705][  T670] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO BTYPE=--)
> [  488.045398][  T670] pc : __list_del_entry_valid+0x154/0x158
> [  488.050969][  T670] lr : __list_del_entry_valid+0x154/0x158
> [  488.056534][  T670] sp : ffff8000229afd70
> [  488.060534][  T670] x29: ffff8000229afd70 x28: ffff0008c8f4f340 x27: dfff800000000000
> [  488.068361][  T670] x26: ffff009b66e9ec70 x25: ffff800011c8b4d0 x24: ffff0008d4bfe488
> [  488.076188][  T670] x23: ffff0008c8f4f340 x22: ffff0008c8f4f340 x21: ffff009b6789ec70
> [  488.084015][  T670] x20: ffff0008d4bfe4c8 x19: ffff009b66e9ec70 x18: ffff0008c8f4fd70
> [  488.091842][  T670] x17: 20747562202c3037 x16: 6365396536366239 x15: 0000000000000028
> [  488.099669][  T670] x14: 0000000000000000 x13: 0000000000000001 x12: ffff60136cdd3447
> [  488.107495][  T670] x11: 1fffe0136cdd3446 x10: ffff60136cdd3446 x9 : ffff8000103ee444
> [  488.115322][  T670] x8 : ffff009b66e9a237 x7 : 0000000000000001 x6 : ffff009b66e9a230
> [  488.123149][  T670] x5 : 00009fec9322cbba x4 : ffff60136cdd3447 x3 : 1fffe001191e9e69
> [  488.130975][  T670] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000054
> [  488.138803][  T670] Call trace:
> [  488.141935][  T670]  __list_del_entry_valid+0x154/0x158
> [  488.147153][  T670]  kthread_worker_fn+0x15c/0xda0

This is a strange place to get the issue from. And this is a new
issue.

> [  488.151939][  T670]  kthread+0x3ac/0x460
> [  488.155854][  T670]  ret_from_fork+0x10/0x18
> [  488.160120][  T670] Code: 911e8000 aa1303e1 910a0000 941b595b (d4210000)
> [  488.166901][  T670] ---[ end trace e637e2d38b2cc087 ]---
> [  488.172206][  T670] Kernel panic - not syncing: Oops - BUG: Fatal exception
> [  488.179182][  T670] SMP: stopping secondary CPUs
> [  489.209347][  T670] SMP: failed to stop secondary CPUs 0-1,10-11,16-17,31
> [  489.216128][  T][  T670] Memoryn ]---

Can you give details on what exactly you did to get this? Was it a
normal boot or something more?

I have made some changes to the way the calls were happening, which may
get this sorted. Can you please try this branch?

https://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git/log/?h=cpufreq/cppc

I can see one place where a race can happen, i.e. between
topology_clear_scale_freq_source() and topology_scale_freq_tick(). It
is possible that sfd->set_freq_scale() may get called for a previously
set handler as there is no protection there.

I will see how to fix that. But I am not sure if the issue reported
above comes from there.

Anyway, please give my branch a try, let's see.

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] cpufreq: cppc: Fix suspend/resume specific races with FIE code
  2021-06-15  7:50     ` Viresh Kumar
@ 2021-06-15  9:38       ` Viresh Kumar
  -1 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2021-06-15  9:38 UTC (permalink / raw)
  To: Qian Cai
  Cc: Rafael Wysocki, Benjamin Herrenschmidt, Jonathan Corbet,
	Len Brown, Michael Ellerman, Paul Mackerras, Srinivas Pandruvada,
	linux-pm, Vincent Guittot, Ionela Voinescu, Dirk Brandewie,
	linux-doc, linux-kernel, linuxppc-dev

On 15-06-21, 13:20, Viresh Kumar wrote:
> I can see one place where a race can happen, i.e. between
> topology_clear_scale_freq_source() and topology_scale_freq_tick(). It
> is possible that sfd->set_freq_scale() may get called for a previously
> set handler as there is no protection there.
> 
> I will see how to fix that. But I am not sure if the issue reported
> above comes from there.

I have tried to fix this race and pushed the relevant patch to my
branch. Please pick the latest branch and hopefully everything will
just work.

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] cpufreq: cppc: Fix suspend/resume specific races with FIE code
  2021-06-15  7:50     ` Viresh Kumar
@ 2021-06-15 12:17       ` Qian Cai
  -1 siblings, 0 replies; 18+ messages in thread
From: Qian Cai @ 2021-06-15 12:17 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Benjamin Herrenschmidt, Jonathan Corbet,
	Len Brown, Michael Ellerman, Paul Mackerras, Srinivas Pandruvada,
	linux-pm, Vincent Guittot, Ionela Voinescu, Dirk Brandewie,
	linux-doc, linux-kernel, linuxppc-dev



On 6/15/2021 3:50 AM, Viresh Kumar wrote:
> Hi Qian,
> 
> First of all thanks for testing this, I need more of your help to test
> this out :)
> 
> FWIW, I did test this on my Hikey board today, with some hacks, and
> tried multiple insmod/rmmod operations for the driver, and I wasn't
> able to reproduce the issue you reported. I did enable the list-debug
> config option.

The setup here is an arm64 server with 32 CPUs.

> 
> On 14-06-21, 09:48, Qian Cai wrote:
>> Unfortunately, this series looks like it needs more work.
>>
>> [  487.773586][    T0] CPU17: Booted secondary processor 0x0000000801 [0x503f0002]
>> [  487.976495][  T670] list_del corruption. next->prev should be ffff009b66e9ec70, but was ffff009b66dfec70
>> [  487.987037][  T670] ------------[ cut here ]------------
>> [  487.992351][  T670] kernel BUG at lib/list_debug.c:54!
>> [  487.997810][  T670] Internal error: Oops - BUG: 0 [#1] SMP
>> [  488.003295][  T670] Modules linked in: cpufreq_userspace xfs loop cppc_cpufreq processor efivarfs ip_tables x_tables ext4 mbcache jbd2 dm_mod igb i2c_algo_bit nvme mlx5_core i2c_core nvme_core firmware_class
>> [  488.021759][  T670] CPU: 1 PID: 670 Comm: cppc_fie Not tainted 5.13.0-rc5-next-20210611+ #46
>> [  488.030190][  T670] Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020
>> [  488.038705][  T670] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO BTYPE=--)
>> [  488.045398][  T670] pc : __list_del_entry_valid+0x154/0x158
>> [  488.050969][  T670] lr : __list_del_entry_valid+0x154/0x158
>> [  488.056534][  T670] sp : ffff8000229afd70
>> [  488.060534][  T670] x29: ffff8000229afd70 x28: ffff0008c8f4f340 x27: dfff800000000000
>> [  488.068361][  T670] x26: ffff009b66e9ec70 x25: ffff800011c8b4d0 x24: ffff0008d4bfe488
>> [  488.076188][  T670] x23: ffff0008c8f4f340 x22: ffff0008c8f4f340 x21: ffff009b6789ec70
>> [  488.084015][  T670] x20: ffff0008d4bfe4c8 x19: ffff009b66e9ec70 x18: ffff0008c8f4fd70
>> [  488.091842][  T670] x17: 20747562202c3037 x16: 6365396536366239 x15: 0000000000000028
>> [  488.099669][  T670] x14: 0000000000000000 x13: 0000000000000001 x12: ffff60136cdd3447
>> [  488.107495][  T670] x11: 1fffe0136cdd3446 x10: ffff60136cdd3446 x9 : ffff8000103ee444
>> [  488.115322][  T670] x8 : ffff009b66e9a237 x7 : 0000000000000001 x6 : ffff009b66e9a230
>> [  488.123149][  T670] x5 : 00009fec9322cbba x4 : ffff60136cdd3447 x3 : 1fffe001191e9e69
>> [  488.130975][  T670] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000054
>> [  488.138803][  T670] Call trace:
>> [  488.141935][  T670]  __list_del_entry_valid+0x154/0x158
>> [  488.147153][  T670]  kthread_worker_fn+0x15c/0xda0
> 
> This is a strange place to get the issue from. And this is a new
> issue.

Well, it was still the same exercise with CPU online/offline.

> 
>> [  488.151939][  T670]  kthread+0x3ac/0x460
>> [  488.155854][  T670]  ret_from_fork+0x10/0x18
>> [  488.160120][  T670] Code: 911e8000 aa1303e1 910a0000 941b595b (d4210000)
>> [  488.166901][  T670] ---[ end trace e637e2d38b2cc087 ]---
>> [  488.172206][  T670] Kernel panic - not syncing: Oops - BUG: Fatal exception
>> [  488.179182][  T670] SMP: stopping secondary CPUs
>> [  489.209347][  T670] SMP: failed to stop secondary CPUs 0-1,10-11,16-17,31
>> [  489.216128][  T][  T670] Memoryn ]---
> 
> Can you give details on what exactly you did to get this? Was it a
> normal boot or something more?

Basically, the cpufreq driver is CPPC and the governor is schedutil.
I ran a few workloads to get the CPUs scaling up and down, then
offlined all CPUs until only the last one was left, and then onlined
all CPUs again.
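
The offline/online part of that exercise could be scripted roughly as
below. This is only a sketch, not the script that was actually run: the
CPU_SYSFS variable and the function names are made up for illustration,
and it assumes that "until the last one" means leaving the
lowest-numbered CPU online. Only the standard sysfs "online" files are
relied upon.

```shell
#!/bin/sh
# Hypothetical helper names; CPU_SYSFS defaults to the real sysfs path
# and can be overridden to point at a scratch directory for a dry run.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Write 0 or 1 to a CPU's "online" file. CPUs without a writable
# "online" file (often the boot CPU) are silently skipped.
set_online() {
    f="$CPU_SYSFS/cpu$1/online"
    if [ -w "$f" ]; then
        echo "$2" > "$f"
    fi
}

# Print the numeric ids of all cpuN directories in ascending order.
list_cpus() {
    for d in "$CPU_SYSFS"/cpu[0-9]*; do
        [ -d "$d" ] && echo "${d##*/cpu}"
    done | sort -n
}

# Offline every CPU except the first one listed, then online them all.
cycle_cpus() {
    first=$(list_cpus | head -n 1)
    for c in $(list_cpus); do
        [ "$c" = "$first" ] || set_online "$c" 0
    done
    for c in $(list_cpus); do
        set_online "$c" 1
    done
}
```

Calling cycle_cpus as root, after the workloads have run, performs one
offline/online pass; wrapping the call in a loop repeats it.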

> 
> I have made some changes to the way the calls were happening, which may
> get this sorted. Can you please try this branch?
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git/log/?h=cpufreq/cppc
> 
> I can see one place where a race can happen, i.e. between
> topology_clear_scale_freq_source() and topology_scale_freq_tick(). It
> is possible that sfd->set_freq_scale() may get called for a previously
> set handler as there is no protection there.
> 
> I will see how to fix that. But I am not sure if the issue reported
> above comes from there.
> 
> Anyway, please give my branch a try, let's see.

I am hesitant to try this at the moment because this all feels like
shooting in the dark. Ideally, you will eventually be able to get
access to one of those arm64 servers (Huawei, Ampere, TX2, FJ etc) and
really try the same exercises yourself with debugging options like
list debugging and KASAN on. That way you could fix things much more
efficiently. I could share the .config with you once you are there.

Last but not least, once you have narrowed down the issues better, I'd
hope to see someone else familiar with the code there review those
patches first (feel free to Cc me once you are ready to post) before I
rerun the whole thing again. That way we don't waste each other's time
going back and forth chasing shadows.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] cpufreq: cppc: Fix suspend/resume specific races with FIE code
  2021-06-15 12:17       ` Qian Cai
@ 2021-06-16  4:57         ` Viresh Kumar
  -1 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2021-06-16  4:57 UTC (permalink / raw)
  To: Qian Cai
  Cc: Rafael Wysocki, Benjamin Herrenschmidt, Jonathan Corbet,
	Len Brown, Michael Ellerman, Paul Mackerras, Srinivas Pandruvada,
	linux-pm, Vincent Guittot, Ionela Voinescu, Dirk Brandewie,
	linux-doc, linux-kernel, linuxppc-dev

On 15-06-21, 08:17, Qian Cai wrote:
> On 6/15/2021 3:50 AM, Viresh Kumar wrote:
> > This is a strange place to get the issue from. And this is a new
> > issue.
> 
> Well, it was still the same exercise with CPU online/offline.
> 
> > 
> >> [  488.151939][  T670]  kthread+0x3ac/0x460
> >> [  488.155854][  T670]  ret_from_fork+0x10/0x18
> >> [  488.160120][  T670] Code: 911e8000 aa1303e1 910a0000 941b595b (d4210000)
> >> [  488.166901][  T670] ---[ end trace e637e2d38b2cc087 ]---
> >> [  488.172206][  T670] Kernel panic - not syncing: Oops - BUG: Fatal exception
> >> [  488.179182][  T670] SMP: stopping secondary CPUs
> >> [  489.209347][  T670] SMP: failed to stop secondary CPUs 0-1,10-11,16-17,31
> >> [  489.216128][  T][  T670] Memoryn ]---
> > 
> > Can you give details on what exactly you did to get this? Was it a
> > normal boot or something more?
> 
> Basically, the cpufreq driver is CPPC and the governor is schedutil.
> I ran a few workloads to get the CPUs scaling up and down, then
> offlined all CPUs until only the last one was left, and then onlined
> all CPUs again.

Hmm, okay.

So I basically have a very similar setup with 8 cores (one policy per
CPU). The only difference is that I don't end up reading the
performance counters; everything else remains the same. So I should
now see the same issues as you, if there are any.

Since the insmod/rmmod setup is a bit different, this is what I tried
today for around an hour, with CONFIG_DEBUG_LIST and the RCU debugging
options enabled.

# Repeatedly offline and then online CPUs 1-7 (CPU0 stays online).
while true; do
    for i in $(seq 1 7); do
        echo 0 > "/sys/devices/system/cpu/cpu$i/online"
    done

    for i in $(seq 1 7); do
        echo 1 > "/sys/devices/system/cpu/cpu$i/online"
    done
done

I don't see any crashes, oopses or warnings with the latest code.

> I am hesitant to try this at the moment because this all feels like
> shooting in the dark.

I understand your point and you aren't completely wrong here. It
wasn't completely in the dark, but since I am unable to reproduce the
issue at my end, I asked for help.

FWIW, I think one possible cause of the kthread corruption could have
been the race in the topology-related code. I already fixed that in my
tree yesterday.

> Ideally, you will be able to get access to one
> of those arm64 servers (Huawei, Ampere, TX2, FJ etc) eventually and
> really try the same exercises yourself with those debugging options
> like list debugging and KASAN on. That way you could fix things way
> more efficiently.

Yeah, I thought this work was over, and I am not normally a user of
it. I had to enable it for ARM servers, and I took the help of my
colleagues (Vincent Guittot and Ionela) to test it.

I have also asked Vincent to give it a try again.

> I could share the .config with you once you are there. Last
> but not least, once you have narrowed down the issues better, I'd
> hope to see someone else familiar with the code there review
> those patches first (feel free to Cc me once you are ready to
> post) before I rerun the whole thing again. That way we don't
> waste each other's time going back and forth chasing shadows.

I did send the patches out for review. This last issue (the one you
reported) was a different race altogether, so I asked for testing
without reviews.

Anyway, I am quite sure my tests have covered such issues now. I will
send out patches again soon.

Thanks Qian.

-- 
viresh

end of thread, other threads:[~2021-06-16  4:57 UTC | newest]

Thread overview: 18+ messages
2021-06-10  8:23 [PATCH 0/5] cpufreq: cppc: Fix suspend/resume specific races with FIE code Viresh Kumar
2021-06-10  8:23 ` Viresh Kumar
2021-06-10  8:23 ` [PATCH 1/5] cpufreq: cppc: Migrate to ->exit() callback instead of ->stop_cpu() Viresh Kumar
2021-06-10  8:23 ` [PATCH 2/5] cpufreq: intel_pstate: " Viresh Kumar
2021-06-10  8:23 ` [PATCH 3/5] cpufreq: powerenv: " Viresh Kumar
2021-06-10  8:23   ` Viresh Kumar
2021-06-10  8:24 ` [PATCH 4/5] cpufreq: Add start_cpu() and stop_cpu() callbacks Viresh Kumar
2021-06-10  8:24 ` [PATCH 5/5] cpufreq: cppc: Fix suspend/resume specific races with the FIE code Viresh Kumar
2021-06-14 13:48 ` [PATCH 0/5] cpufreq: cppc: Fix suspend/resume specific races with " Qian Cai
2021-06-14 13:48   ` Qian Cai
2021-06-15  7:50   ` Viresh Kumar
2021-06-15  7:50     ` Viresh Kumar
2021-06-15  9:38     ` Viresh Kumar
2021-06-15  9:38       ` Viresh Kumar
2021-06-15 12:17     ` Qian Cai
2021-06-15 12:17       ` Qian Cai
2021-06-16  4:57       ` Viresh Kumar
2021-06-16  4:57         ` Viresh Kumar
