linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Fix Arm system PMU hotplug issues
@ 2019-02-04 17:09 Robin Murphy
  2019-02-04 17:09 ` [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance Robin Murphy
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Robin Murphy @ 2019-02-04 17:09 UTC (permalink / raw)
  To: will.deacon, mark.rutland
  Cc: suzuki.poulose, peterz, bigeasy, linux-kernel, tglx, linux-arm-kernel

Hi all,

Following the report of a preemption-related bug in arm-cci, it turns
out there's a fair bit of cleaning up to do in this area. I've started
here with the Arm drivers that I'm fairly familiar with - I suspect the
hisi/qcom/xgene ones suffer from similar issues, but it's going to take
me a while longer to figure them out in detail.

Robin.


Robin Murphy (5):
  perf/arm-cci: Fix CPU hotplug race avoidance
  cpu/hotplug: Export __cpuhp_state_add_instance_cpuslocked()
  perf/arm-ccn: Fix CPU hotplug race avoidance
  cpu/hotplug: Add locked variant of cpuhp_state_add_instance()
  perf/arm_dsu: Fix CPU hotplug races

 drivers/perf/arm-cci.c     | 22 ++++++++++++----------
 drivers/perf/arm-ccn.c     | 25 +++++++++++++------------
 drivers/perf/arm_dsu_pmu.c |  8 +++++---
 include/linux/cpuhotplug.h |  6 ++++++
 kernel/cpu.c               |  1 +
 5 files changed, 37 insertions(+), 25 deletions(-)

-- 
2.20.1.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance
  2019-02-04 17:09 [PATCH 0/5] Fix Arm system PMU hotplug issues Robin Murphy
@ 2019-02-04 17:09 ` Robin Murphy
  2019-02-05 10:10   ` Corentin Labbe
                     ` (2 more replies)
  2019-02-04 17:09 ` [PATCH 2/5] cpu/hotplug: Export __cpuhp_state_add_instance_cpuslocked() Robin Murphy
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 14+ messages in thread
From: Robin Murphy @ 2019-02-04 17:09 UTC (permalink / raw)
  To: will.deacon, mark.rutland
  Cc: suzuki.poulose, peterz, bigeasy, linux-kernel, tglx, Li, Meng,
	linux-arm-kernel

The arm-cci probe logic faces a cyclic dependency wherein it has to pick
a valid CPU to associate with before registering the PMU device, has to
have the PMU state initialised before handling hotplug events in case it
must be migrated, but has to have the hotplug notifier registered before
the chosen CPU may go offline lest things get out of sync. The present
code has tried to solve the races by using get_cpu() to pick the current
CPU and prevent it from disappearing while the other two registrations
are performed, but that results in taking mutexes with preemption
disabled, which makes certain configurations very unhappy:

[ 1.983337] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:2004
[ 1.983340] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
[ 1.983342] Preemption disabled at:
[ 1.983353] [<ffffff80089801f4>] cci_pmu_probe+0x1dc/0x488
[ 1.983360] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.20-rt8-yocto-preempt-rt #1
[ 1.983362] Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
[ 1.983364] Call trace:
[ 1.983369] dump_backtrace+0x0/0x158
[ 1.983372] show_stack+0x24/0x30
[ 1.983378] dump_stack+0x80/0xa4
[ 1.983383] ___might_sleep+0x138/0x160
[ 1.983386] __might_sleep+0x58/0x90
[ 1.983391] __rt_mutex_lock_state+0x30/0xc0
[ 1.983395] _mutex_lock+0x24/0x30
[ 1.983400] perf_pmu_register+0x2c/0x388
[ 1.983404] cci_pmu_probe+0x2bc/0x488
[ 1.983409] platform_drv_probe+0x58/0xa8

However, we don't actually mind being preempted or migrated at this
point; all that really matters is that whichever CPU we pick does not
get offlined before we're done. Thus, do the robust thing and instead
take the lock to inhibit CPU hotplug for the duration. This also
revealed an additional race in assigning the global pointer too late
relative to the hotplug notifier, so that gets fixed in the process.

Reported-by: "Li, Meng" <Meng.Li@windriver.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/perf/arm-cci.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
index 1bfeb160c5b1..f6d9df07ec9b 100644
--- a/drivers/perf/arm-cci.c
+++ b/drivers/perf/arm-cci.c
@@ -1692,21 +1692,23 @@ static int cci_pmu_probe(struct platform_device *pdev)
 	raw_spin_lock_init(&cci_pmu->hw_events.pmu_lock);
 	mutex_init(&cci_pmu->reserve_mutex);
 	atomic_set(&cci_pmu->active_events, 0);
-	cci_pmu->cpu = get_cpu();
+
+	cpus_read_lock();
+	cci_pmu->cpu = smp_processor_id();
 
 	ret = cci_pmu_init(cci_pmu, pdev);
-	if (ret) {
-		put_cpu();
-		return ret;
-	}
+	if (ret)
+		goto out;
 
-	cpuhp_setup_state_nocalls(CPUHP_AP_PERF_ARM_CCI_ONLINE,
-				  "perf/arm/cci:online", NULL,
-				  cci_pmu_offline_cpu);
-	put_cpu();
 	g_cci_pmu = cci_pmu;
+	cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_PERF_ARM_CCI_ONLINE,
+					     "perf/arm/cci:online", NULL,
+					     cci_pmu_offline_cpu);
+
 	pr_info("ARM %s PMU driver probed", cci_pmu->model->name);
-	return 0;
+out:
+	cpus_read_unlock();
+	return ret;
 }
 
 static int cci_pmu_remove(struct platform_device *pdev)
-- 
2.20.1.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/5] cpu/hotplug: Export __cpuhp_state_add_instance_cpuslocked()
  2019-02-04 17:09 [PATCH 0/5] Fix Arm system PMU hotplug issues Robin Murphy
  2019-02-04 17:09 ` [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance Robin Murphy
@ 2019-02-04 17:09 ` Robin Murphy
  2019-02-04 17:09 ` [PATCH 3/5] perf/arm-ccn: Fix CPU hotplug race avoidance Robin Murphy
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Robin Murphy @ 2019-02-04 17:09 UTC (permalink / raw)
  To: will.deacon, mark.rutland
  Cc: suzuki.poulose, peterz, bigeasy, linux-kernel, tglx, linux-arm-kernel

This is the only member of the family not exported already, but
happens to be the one I need to fix a bug in a modular driver.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 kernel/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 91d5c38eb7e5..c1bd8ed7546e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1683,6 +1683,7 @@ int __cpuhp_state_add_instance_cpuslocked(enum cpuhp_state state,
 	mutex_unlock(&cpuhp_state_mutex);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(__cpuhp_state_add_instance_cpuslocked);
 
 int __cpuhp_state_add_instance(enum cpuhp_state state, struct hlist_node *node,
 			       bool invoke)
-- 
2.20.1.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/5] perf/arm-ccn: Fix CPU hotplug race avoidance
  2019-02-04 17:09 [PATCH 0/5] Fix Arm system PMU hotplug issues Robin Murphy
  2019-02-04 17:09 ` [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance Robin Murphy
  2019-02-04 17:09 ` [PATCH 2/5] cpu/hotplug: Export __cpuhp_state_add_instance_cpuslocked() Robin Murphy
@ 2019-02-04 17:09 ` Robin Murphy
  2019-02-05 11:38   ` Suzuki K Poulose
  2019-02-10 20:44   ` Thomas Gleixner
  2019-02-04 17:09 ` [PATCH 4/5] cpu/hotplug: Add locked variant of cpuhp_state_add_instance() Robin Murphy
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 14+ messages in thread
From: Robin Murphy @ 2019-02-04 17:09 UTC (permalink / raw)
  To: will.deacon, mark.rutland
  Cc: suzuki.poulose, peterz, bigeasy, linux-kernel, tglx, linux-arm-kernel

Like arm-cci, arm-ccn has the same issue where disabling preemption to
avoid races between registering the PMU device and the hotplug notifier
can lead to those operations taking mutexes in an invalid context. Fix
it the same way by disabling hotplug instead of preemption. Since we
only ever associate the PMU instance with a single CPU, we can also take
the opportunity to slightly simplify the hotplug handling to track just
that CPU number instead of a full cpumask.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/perf/arm-ccn.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/perf/arm-ccn.c b/drivers/perf/arm-ccn.c
index 7dd850e02f19..8629893a9ef6 100644
--- a/drivers/perf/arm-ccn.c
+++ b/drivers/perf/arm-ccn.c
@@ -167,7 +167,7 @@ struct arm_ccn_dt {
 
 	struct hrtimer hrtimer;
 
-	cpumask_t cpu;
+	unsigned int cpu;
 	struct hlist_node node;
 
 	struct pmu pmu;
@@ -559,7 +559,7 @@ static ssize_t arm_ccn_pmu_cpumask_show(struct device *dev,
 {
 	struct arm_ccn *ccn = pmu_to_arm_ccn(dev_get_drvdata(dev));
 
-	return cpumap_print_to_pagebuf(true, buf, &ccn->dt.cpu);
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(ccn->dt.cpu));
 }
 
 static struct device_attribute arm_ccn_pmu_cpumask_attr =
@@ -762,7 +762,7 @@ static int arm_ccn_pmu_event_init(struct perf_event *event)
 	 * mitigate this, we enforce CPU assignment to one, selected
 	 * processor (the one described in the "cpumask" attribute).
 	 */
-	event->cpu = cpumask_first(&ccn->dt.cpu);
+	event->cpu = ccn->dt.cpu;
 
 	node_xp = CCN_CONFIG_NODE(event->attr.config);
 	type = CCN_CONFIG_TYPE(event->attr.config);
@@ -1218,15 +1218,15 @@ static int arm_ccn_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 	struct arm_ccn *ccn = container_of(dt, struct arm_ccn, dt);
 	unsigned int target;
 
-	if (!cpumask_test_and_clear_cpu(cpu, &dt->cpu))
+	if (cpu != dt->cpu)
 		return 0;
 	target = cpumask_any_but(cpu_online_mask, cpu);
 	if (target >= nr_cpu_ids)
 		return 0;
 	perf_pmu_migrate_context(&dt->pmu, cpu, target);
-	cpumask_set_cpu(target, &dt->cpu);
+	dt->cpu = target;
 	if (ccn->irq)
-		WARN_ON(irq_set_affinity_hint(ccn->irq, &dt->cpu) != 0);
+		WARN_ON(irq_set_affinity_hint(ccn->irq, cpumask_of(dt->cpu)));
 	return 0;
 }
 
@@ -1301,11 +1301,12 @@ static int arm_ccn_pmu_init(struct arm_ccn *ccn)
 	}
 
 	/* Pick one CPU which we will use to collect data from CCN... */
-	cpumask_set_cpu(get_cpu(), &ccn->dt.cpu);
+	cpus_read_lock();
+	ccn->dt.cpu = smp_processor_id();
 
 	/* Also make sure that the overflow interrupt is handled by this CPU */
 	if (ccn->irq) {
-		err = irq_set_affinity_hint(ccn->irq, &ccn->dt.cpu);
+		err = irq_set_affinity_hint(ccn->irq, cpumask_of(ccn->dt.cpu));
 		if (err) {
 			dev_err(ccn->dev, "Failed to set interrupt affinity!\n");
 			goto error_set_affinity;
@@ -1316,14 +1317,14 @@ static int arm_ccn_pmu_init(struct arm_ccn *ccn)
 	if (err)
 		goto error_pmu_register;
 
-	cpuhp_state_add_instance_nocalls(CPUHP_AP_PERF_ARM_CCN_ONLINE,
-					 &ccn->dt.node);
-	put_cpu();
+	cpuhp_state_add_instance_nocalls_cpuslocked(CPUHP_AP_PERF_ARM_CCN_ONLINE,
+						    &ccn->dt.node);
+	cpus_read_unlock();
 	return 0;
 
 error_pmu_register:
 error_set_affinity:
-	put_cpu();
+	cpus_read_unlock();
 error_choose_name:
 	ida_simple_remove(&arm_ccn_pmu_ida, ccn->dt.id);
 	for (i = 0; i < ccn->num_xps; i++)
-- 
2.20.1.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/5] cpu/hotplug: Add locked variant of cpuhp_state_add_instance()
  2019-02-04 17:09 [PATCH 0/5] Fix Arm system PMU hotplug issues Robin Murphy
                   ` (2 preceding siblings ...)
  2019-02-04 17:09 ` [PATCH 3/5] perf/arm-ccn: Fix CPU hotplug race avoidance Robin Murphy
@ 2019-02-04 17:09 ` Robin Murphy
  2019-02-04 17:09 ` [PATCH 5/5] perf/arm_dsu: Fix CPU hotplug races Robin Murphy
  2019-02-05  9:25 ` [PATCH 0/5] Fix Arm system PMU hotplug issues Mark Rutland
  5 siblings, 0 replies; 14+ messages in thread
From: Robin Murphy @ 2019-02-04 17:09 UTC (permalink / raw)
  To: will.deacon, mark.rutland
  Cc: suzuki.poulose, peterz, bigeasy, linux-kernel, tglx, linux-arm-kernel

A helper already exists for nearly every combination of parameters,
except of course the one that I now need.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 include/linux/cpuhotplug.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index fd586d0301e7..1d6231fe1590 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -290,6 +290,12 @@ static inline int cpuhp_state_add_instance(enum cpuhp_state state,
 	return __cpuhp_state_add_instance(state, node, true);
 }
 
+static inline int cpuhp_state_add_instance_cpuslocked(enum cpuhp_state state,
+						      struct hlist_node *node)
+{
+	return __cpuhp_state_add_instance_cpuslocked(state, node, true);
+}
+
 /**
  * cpuhp_state_add_instance_nocalls - Add an instance for a state without
  *                                    invoking the startup callback.
-- 
2.20.1.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/5] perf/arm_dsu: Fix CPU hotplug races
  2019-02-04 17:09 [PATCH 0/5] Fix Arm system PMU hotplug issues Robin Murphy
                   ` (3 preceding siblings ...)
  2019-02-04 17:09 ` [PATCH 4/5] cpu/hotplug: Add locked variant of cpuhp_state_add_instance() Robin Murphy
@ 2019-02-04 17:09 ` Robin Murphy
  2019-02-05 11:40   ` Suzuki K Poulose
  2019-02-05  9:25 ` [PATCH 0/5] Fix Arm system PMU hotplug issues Mark Rutland
  5 siblings, 1 reply; 14+ messages in thread
From: Robin Murphy @ 2019-02-04 17:09 UTC (permalink / raw)
  To: will.deacon, mark.rutland
  Cc: suzuki.poulose, peterz, bigeasy, linux-kernel, tglx, linux-arm-kernel

Like other system PMUs which associate themselves with an arbitrary CPU
for housekeeping purposes, arm_dsu has a race between registering the
hotplug notifier and registering the PMU device, such that the hotplug
niotifier can potentially fire and attempt to migrate the PMU context
before the latter is valid. This is easily resolved by inhibiting
hotplug until both the notifier and PMU device are successfully set up.

For the same reason, also suppress any synchronous notifier calls in the
cleanup path if PMU registration fails.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/perf/arm_dsu_pmu.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index 660cb8ac886a..cfaca06b964a 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -717,7 +717,8 @@ static int dsu_pmu_device_probe(struct platform_device *pdev)
 
 	dsu_pmu->irq = irq;
 	platform_set_drvdata(pdev, dsu_pmu);
-	rc = cpuhp_state_add_instance(dsu_pmu_cpuhp_state,
+	cpus_read_lock();
+	rc = cpuhp_state_add_instance_cpuslocked(dsu_pmu_cpuhp_state,
 						&dsu_pmu->cpuhp_node);
 	if (rc)
 		return rc;
@@ -738,9 +739,10 @@ static int dsu_pmu_device_probe(struct platform_device *pdev)
 	};
 
 	rc = perf_pmu_register(&dsu_pmu->pmu, name, -1);
+	cpus_read_unlock();
 	if (rc) {
-		cpuhp_state_remove_instance(dsu_pmu_cpuhp_state,
-						 &dsu_pmu->cpuhp_node);
+		cpuhp_state_remove_instance_nocalls(dsu_pmu_cpuhp_state,
+						    &dsu_pmu->cpuhp_node);
 		irq_set_affinity_hint(dsu_pmu->irq, NULL);
 	}
 
-- 
2.20.1.dirty


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/5] Fix Arm system PMU hotplug issues
  2019-02-04 17:09 [PATCH 0/5] Fix Arm system PMU hotplug issues Robin Murphy
                   ` (4 preceding siblings ...)
  2019-02-04 17:09 ` [PATCH 5/5] perf/arm_dsu: Fix CPU hotplug races Robin Murphy
@ 2019-02-05  9:25 ` Mark Rutland
  5 siblings, 0 replies; 14+ messages in thread
From: Mark Rutland @ 2019-02-05  9:25 UTC (permalink / raw)
  To: Robin Murphy
  Cc: suzuki.poulose, peterz, bigeasy, will.deacon, linux-kernel, tglx,
	linux-arm-kernel

On Mon, Feb 04, 2019 at 05:09:03PM +0000, Robin Murphy wrote:
> Hi all,
> 
> Following the report of a preemption-related bug in arm-cci, it turns
> out there's a fair bit of cleaning up to do in this area. I've started
> here with the Arm drivers that I'm fairly familiar with - I suspect the
> hisi/qcom/xgene ones suffer from similar issues, but it's going to take
> me a while longer to figure them out in detail.
> 
> Robin.

These all looks sound to me, so FWIW:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance
  2019-02-04 17:09 ` [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance Robin Murphy
@ 2019-02-05 10:10   ` Corentin Labbe
  2019-02-05 11:19   ` Suzuki K Poulose
  2019-02-10 20:43   ` Thomas Gleixner
  2 siblings, 0 replies; 14+ messages in thread
From: Corentin Labbe @ 2019-02-05 10:10 UTC (permalink / raw)
  To: Robin Murphy
  Cc: mark.rutland, suzuki.poulose, peterz, bigeasy, will.deacon,
	linux-kernel, tglx, Li, Meng, linux-arm-kernel

On Mon, Feb 04, 2019 at 05:09:04PM +0000, Robin Murphy wrote:
> The arm-cci probe logic faces a cyclic dependency wherein it has to pick
> a valid CPU to associate with before registering the PMU device, has to
> have the PMU state initialised before handling hotplug events in case it
> must be migrated, but has to have the hotplug notifier registered before
> the chosen CPU may go offline lest things get out of sync. The present
> code has tried to solve the races by using get_cpu() to pick the current
> CPU and prevent it from disappearing while the other two registrations
> are performed, but that results in taking mutexes with preemption
> disabled, which makes certain configurations very unhappy:
> 
> [ 1.983337] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:2004
> [ 1.983340] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
> [ 1.983342] Preemption disabled at:
> [ 1.983353] [<ffffff80089801f4>] cci_pmu_probe+0x1dc/0x488
> [ 1.983360] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.20-rt8-yocto-preempt-rt #1
> [ 1.983362] Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
> [ 1.983364] Call trace:
> [ 1.983369] dump_backtrace+0x0/0x158
> [ 1.983372] show_stack+0x24/0x30
> [ 1.983378] dump_stack+0x80/0xa4
> [ 1.983383] ___might_sleep+0x138/0x160
> [ 1.983386] __might_sleep+0x58/0x90
> [ 1.983391] __rt_mutex_lock_state+0x30/0xc0
> [ 1.983395] _mutex_lock+0x24/0x30
> [ 1.983400] perf_pmu_register+0x2c/0x388
> [ 1.983404] cci_pmu_probe+0x2bc/0x488
> [ 1.983409] platform_drv_probe+0x58/0xa8
> 
> However, we don't actually mind being preempted or migrated at this
> point; all that really matters is that whichever CPU we pick does not
> get offlined before we're done. Thus, do the robust thing and instead
> take the lock to inhibit CPU hotplug for the duration. This also
> revealed an additional race in assigning the global pointer too late
> relative to the hotplug notifier, so that gets fixed in the process.
> 
> Reported-by: "Li, Meng" <Meng.Li@windriver.com>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/perf/arm-cci.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
> index 1bfeb160c5b1..f6d9df07ec9b 100644
> --- a/drivers/perf/arm-cci.c
> +++ b/drivers/perf/arm-cci.c
> @@ -1692,21 +1692,23 @@ static int cci_pmu_probe(struct platform_device *pdev)
>  	raw_spin_lock_init(&cci_pmu->hw_events.pmu_lock);
>  	mutex_init(&cci_pmu->reserve_mutex);
>  	atomic_set(&cci_pmu->active_events, 0);
> -	cci_pmu->cpu = get_cpu();
> +
> +	cpus_read_lock();
> +	cci_pmu->cpu = smp_processor_id();
>  
>  	ret = cci_pmu_init(cci_pmu, pdev);
> -	if (ret) {
> -		put_cpu();
> -		return ret;
> -	}
> +	if (ret)
> +		goto out;
>  
> -	cpuhp_setup_state_nocalls(CPUHP_AP_PERF_ARM_CCI_ONLINE,
> -				  "perf/arm/cci:online", NULL,
> -				  cci_pmu_offline_cpu);
> -	put_cpu();
>  	g_cci_pmu = cci_pmu;
> +	cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_PERF_ARM_CCI_ONLINE,
> +					     "perf/arm/cci:online", NULL,
> +					     cci_pmu_offline_cpu);
> +
>  	pr_info("ARM %s PMU driver probed", cci_pmu->model->name);
> -	return 0;
> +out:
> +	cpus_read_unlock();
> +	return ret;
>  }
>  
>  static int cci_pmu_remove(struct platform_device *pdev)
> -- 
> 2.20.1.dirty

Hello

Thanks, this patch fix my issue that I has reported here:
https://lkml.org/lkml/2017/12/29/139
https://lkml.org/lkml/2018/11/12/1901

Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Tested-on: sun8i-a83t-bananapi-m3

Regards

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance
  2019-02-04 17:09 ` [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance Robin Murphy
  2019-02-05 10:10   ` Corentin Labbe
@ 2019-02-05 11:19   ` Suzuki K Poulose
  2019-02-10 20:43   ` Thomas Gleixner
  2 siblings, 0 replies; 14+ messages in thread
From: Suzuki K Poulose @ 2019-02-05 11:19 UTC (permalink / raw)
  To: robin.murphy, will.deacon, mark.rutland
  Cc: peterz, bigeasy, linux-kernel, tglx, Meng.Li, linux-arm-kernel

Robin,

On 04/02/2019 17:09, Robin Murphy wrote:
> The arm-cci probe logic faces a cyclic dependency wherein it has to pick
> a valid CPU to associate with before registering the PMU device, has to
> have the PMU state initialised before handling hotplug events in case it
> must be migrated, but has to have the hotplug notifier registered before
> the chosen CPU may go offline lest things get out of sync. The present
> code has tried to solve the races by using get_cpu() to pick the current
> CPU and prevent it from disappearing while the other two registrations
> are performed, but that results in taking mutexes with preemption
> disabled, which makes certain configurations very unhappy:
> 
> [ 1.983337] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:2004
> [ 1.983340] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
> [ 1.983342] Preemption disabled at:
> [ 1.983353] [<ffffff80089801f4>] cci_pmu_probe+0x1dc/0x488
> [ 1.983360] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.20-rt8-yocto-preempt-rt #1
> [ 1.983362] Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
> [ 1.983364] Call trace:
> [ 1.983369] dump_backtrace+0x0/0x158
> [ 1.983372] show_stack+0x24/0x30
> [ 1.983378] dump_stack+0x80/0xa4
> [ 1.983383] ___might_sleep+0x138/0x160
> [ 1.983386] __might_sleep+0x58/0x90
> [ 1.983391] __rt_mutex_lock_state+0x30/0xc0
> [ 1.983395] _mutex_lock+0x24/0x30
> [ 1.983400] perf_pmu_register+0x2c/0x388
> [ 1.983404] cci_pmu_probe+0x2bc/0x488
> [ 1.983409] platform_drv_probe+0x58/0xa8
> 
> However, we don't actually mind being preempted or migrated at this
> point; all that really matters is that whichever CPU we pick does not
> get offlined before we're done. Thus, do the robust thing and instead
> take the lock to inhibit CPU hotplug for the duration. This also
> revealed an additional race in assigning the global pointer too late
> relative to the hotplug notifier, so that gets fixed in the process.
> 
> Reported-by: "Li, Meng" <Meng.Li@windriver.com>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Thanks for fixing the issues.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/5] perf/arm-ccn: Fix CPU hotplug race avoidance
  2019-02-04 17:09 ` [PATCH 3/5] perf/arm-ccn: Fix CPU hotplug race avoidance Robin Murphy
@ 2019-02-05 11:38   ` Suzuki K Poulose
  2019-02-10 20:44   ` Thomas Gleixner
  1 sibling, 0 replies; 14+ messages in thread
From: Suzuki K Poulose @ 2019-02-05 11:38 UTC (permalink / raw)
  To: robin.murphy, will.deacon, mark.rutland
  Cc: peterz, tglx, bigeasy, linux-kernel, linux-arm-kernel



On 04/02/2019 17:09, Robin Murphy wrote:
> Like arm-cci, arm-ccn has the same issue where disabling preemption to
> avoid races between registering the PMU device and the hotplug notifier
> can lead to those operations taking mutexes in an invalid context. Fix
> it the same way by disabling hotplug instead of preemption. Since we
> only ever associate the PMU instance with a single CPU, we can also take
> the opportunity to slightly simplify the hotplug handling to track just
> that CPU number instead of a full cpumask.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Reviewed-by: Suzuki K Poulse <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] perf/arm_dsu: Fix CPU hotplug races
  2019-02-04 17:09 ` [PATCH 5/5] perf/arm_dsu: Fix CPU hotplug races Robin Murphy
@ 2019-02-05 11:40   ` Suzuki K Poulose
  2019-02-05 13:04     ` Robin Murphy
  0 siblings, 1 reply; 14+ messages in thread
From: Suzuki K Poulose @ 2019-02-05 11:40 UTC (permalink / raw)
  To: robin.murphy, will.deacon, mark.rutland
  Cc: peterz, tglx, bigeasy, linux-kernel, linux-arm-kernel



On 04/02/2019 17:09, Robin Murphy wrote:
> Like other system PMUs which associate themselves with an arbitrary CPU
> for housekeeping purposes, arm_dsu has a race between registering the
> hotplug notifier and registering the PMU device, such that the hotplug
> niotifier can potentially fire and attempt to migrate the PMU context
> before the latter is valid. This is easily resolved by inhibiting
> hotplug until both the notifier and PMU device are successfully set up.
> 
> For the same reason, also suppress any synchronous notifier calls in the
> cleanup path if PMU registration fails.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Should we add :

Fixes: commit 7520fa99246dade7ab6 ("perf: ARM DynamIQ Shared Unit PMU support")

Either way:

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] perf/arm_dsu: Fix CPU hotplug races
  2019-02-05 11:40   ` Suzuki K Poulose
@ 2019-02-05 13:04     ` Robin Murphy
  0 siblings, 0 replies; 14+ messages in thread
From: Robin Murphy @ 2019-02-05 13:04 UTC (permalink / raw)
  To: Suzuki K Poulose, will.deacon, mark.rutland
  Cc: peterz, tglx, bigeasy, linux-kernel, linux-arm-kernel

Hi Suzuki,

On 05/02/2019 11:40, Suzuki K Poulose wrote:
> On 04/02/2019 17:09, Robin Murphy wrote:
>> Like other system PMUs which associate themselves with an arbitrary CPU
>> for housekeeping purposes, arm_dsu has a race between registering the
>> hotplug notifier and registering the PMU device, such that the hotplug
>> niotifier can potentially fire and attempt to migrate the PMU context
>> before the latter is valid. This is easily resolved by inhibiting
>> hotplug until both the notifier and PMU device are successfully set up.
>>
>> For the same reason, also suppress any synchronous notifier calls in the
>> cleanup path if PMU registration fails.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> 
> Should we add :
> 
> Fixes: commit 7520fa99246dade7ab6 ("perf: ARM DynamIQ Shared Unit PMU 
> support")
> 
> Either way:
> 
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Thanks for the reviews! I think this patch is worthwhile for cleanliness 
and consistency, but since it's neither self-contained (with the 
dependency on #4) nor "fix[ing] a real bug that bothers people" I'm not 
convinced it really deserves backporting - unlike the preemption issue, 
actually getting a crash from this race in practice is unlikely enough 
that it would probably require some determined, deliberate effort to 
trigger it with just the right conditions.

Cheers,
Robin.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance
  2019-02-04 17:09 ` [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance Robin Murphy
  2019-02-05 10:10   ` Corentin Labbe
  2019-02-05 11:19   ` Suzuki K Poulose
@ 2019-02-10 20:43   ` Thomas Gleixner
  2 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2019-02-10 20:43 UTC (permalink / raw)
  To: Robin Murphy
  Cc: mark.rutland, suzuki.poulose, peterz, bigeasy, will.deacon,
	linux-kernel, Li, Meng, linux-arm-kernel

On Mon, 4 Feb 2019, Robin Murphy wrote:
> +
> +	cpus_read_lock();
> +	cci_pmu->cpu = smp_processor_id();

That wants to be raw_smp_processor_id() because this is preemptible
context and debug_smp_processor_id() will complain. 

Thanks,

	tglx

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/5] perf/arm-ccn: Fix CPU hotplug race avoidance
  2019-02-04 17:09 ` [PATCH 3/5] perf/arm-ccn: Fix CPU hotplug race avoidance Robin Murphy
  2019-02-05 11:38   ` Suzuki K Poulose
@ 2019-02-10 20:44   ` Thomas Gleixner
  1 sibling, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2019-02-10 20:44 UTC (permalink / raw)
  To: Robin Murphy
  Cc: mark.rutland, suzuki.poulose, peterz, bigeasy, will.deacon,
	linux-kernel, linux-arm-kernel

On Mon, 4 Feb 2019, Robin Murphy wrote:
>  	/* Pick one CPU which we will use to collect data from CCN... */
> -	cpumask_set_cpu(get_cpu(), &ccn->dt.cpu);
> +	cpus_read_lock();
> +	ccn->dt.cpu = smp_processor_id();

Again raw_smp_processor_id()

Thanks,

	tglx

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-02-10 20:44 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-04 17:09 [PATCH 0/5] Fix Arm system PMU hotplug issues Robin Murphy
2019-02-04 17:09 ` [PATCH 1/5] perf/arm-cci: Fix CPU hotplug race avoidance Robin Murphy
2019-02-05 10:10   ` Corentin Labbe
2019-02-05 11:19   ` Suzuki K Poulose
2019-02-10 20:43   ` Thomas Gleixner
2019-02-04 17:09 ` [PATCH 2/5] cpu/hotplug: Export __cpuhp_state_add_instance_cpuslocked() Robin Murphy
2019-02-04 17:09 ` [PATCH 3/5] perf/arm-ccn: Fix CPU hotplug race avoidance Robin Murphy
2019-02-05 11:38   ` Suzuki K Poulose
2019-02-10 20:44   ` Thomas Gleixner
2019-02-04 17:09 ` [PATCH 4/5] cpu/hotplug: Add locked variant of cpuhp_state_add_instance() Robin Murphy
2019-02-04 17:09 ` [PATCH 5/5] perf/arm_dsu: Fix CPU hotplug races Robin Murphy
2019-02-05 11:40   ` Suzuki K Poulose
2019-02-05 13:04     ` Robin Murphy
2019-02-05  9:25 ` [PATCH 0/5] Fix Arm system PMU hotplug issues Mark Rutland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).