linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] arm_pmu: fix fallout from context handling rewrite
@ 2023-02-16 14:12 Mark Rutland
  2023-02-16 14:12 ` [PATCH 1/2] arm_pmu: fix event CPU filtering Mark Rutland
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Mark Rutland @ 2023-02-16 14:12 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: asahi, ecurtin, j, lina, linux-kernel, mark.rutland, peterz,
	ravi.bangoria, will

Janne reports [1] that perf has been broken on Apple M1 as of commit:

  bd27568117664b8b ("perf: Rewrite core context handling")

This is due to changes to pmu::filter_match() and
arm_pmu::filter_match(), which have been renamed and had their polarity
inverted, but the conversion was inconsistent, and so in some cases we
return the opposite result relative to what we had intended. This
results in consistently losing events on Apple M1.

That commit also (silently) removed the filtering of CHAIN events, which
is undesireable.

These patches fix and simplify the CPU filtering, and replace the CHAIN
event filtering with early rejection of CHAIN events, which is much
simpler.

Thanks,
Mark

[1] https://lore.kernel.org/asahi/20230215-arm_pmu_m1_regression-v1-1-f5a266577c8d@jannau.net/

Mark Rutland (2):
  arm_pmu: fix event CPU filtering
  arm64: perf: reject CHAIN events at creation time

 arch/arm64/kernel/perf_event.c | 15 ++++++++-------
 drivers/perf/arm_pmu.c         |  8 +-------
 include/linux/perf/arm_pmu.h   |  1 -
 3 files changed, 9 insertions(+), 15 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] arm_pmu: fix event CPU filtering
  2023-02-16 14:12 [PATCH 0/2] arm_pmu: fix fallout from context handling rewrite Mark Rutland
@ 2023-02-16 14:12 ` Mark Rutland
  2023-02-16 14:35   ` Janne Grunau
  2023-02-16 14:12 ` [PATCH 2/2] arm64: perf: reject CHAIN events at creation time Mark Rutland
  2023-02-16 22:14 ` [PATCH 0/2] arm_pmu: fix fallout from context handling rewrite Will Deacon
  2 siblings, 1 reply; 7+ messages in thread
From: Mark Rutland @ 2023-02-16 14:12 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: asahi, ecurtin, j, lina, linux-kernel, mark.rutland, peterz,
	ravi.bangoria, will

Janne reports that perf has been broken on Apple M1 as of commit:

  bd27568117664b8b ("perf: Rewrite core context handling")

That commit replaced the pmu::filter_match() callback with
pmu::filter(), whose return value has the opposite polarity, with true
implying events should be ignored rather than scheduled. While an
attempt was made to update the logic in armv8pmu_filter() and
armpmu_filter() accordingly, the return value remains inverted in a
couple of cases:

* If the arm_pmu does not have an arm_pmu::filter() callback,
  armpmu_filter() will always return whether the CPU is supported rather
  than whether the CPU is not supported.

  As a result, the perf core will not schedule events on supported CPUs,
  resulting in a loss of events. Additionally, the perf core will
  attempt to schedule events on unsupported CPUs, but this will be
  rejected by armpmu_add(), which may result in a loss of events from
  other PMUs on those unsupported CPUs.

* If the arm_pmu does have an arm_pmu::filter() callback, and
  armpmu_filter() is called on a CPU which is not supported by the
  arm_pmu, armpmu_filter() will return false rather than true.

  As a result, the perf core will attempt to schedule events on
  unsupported CPUs, but this will be rejected by armpmu_add(), which may
  result in a loss of events from other PMUs on those unsupported CPUs.

This means a loss of events can be seen with any arm_pmu driver, but
with the ARMv8 PMUv3 driver (which is the only arm_pmu driver with an
arm_pmu::filter() callback) the event loss will be more limited and may
go unnoticed, which is how this issue evaded testing so far.

Fix the CPU filtering by performing this consistently in
armpmu_filter(), and remove the redundant arm_pmu::filter() callback and
armv8pmu_filter() implementation.

Commit bd2756811766 also silently removed the CHAIN event filtering from
armv8pmu_filter(), which will be addressed by a separate patch without
using the filter callback.

Fixes: bd27568117664b8b ("perf: Rewrite core context handling")
Reported-by: Janne Grunau <j@jannau.net>
Link: https://lore.kernel.org/asahi/20230215-arm_pmu_m1_regression-v1-1-f5a266577c8d@jannau.net/
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Asahi Lina <lina@asahilina.net>
Cc: Eric Curtin <ecurtin@redhat.com>
---
 arch/arm64/kernel/perf_event.c | 7 -------
 drivers/perf/arm_pmu.c         | 8 +-------
 include/linux/perf/arm_pmu.h   | 1 -
 3 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index a5193f2146a6..3e43538f6b72 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1023,12 +1023,6 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
 	return 0;
 }
 
-static bool armv8pmu_filter(struct pmu *pmu, int cpu)
-{
-	struct arm_pmu *armpmu = to_arm_pmu(pmu);
-	return !cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus);
-}
-
 static void armv8pmu_reset(void *info)
 {
 	struct arm_pmu *cpu_pmu = (struct arm_pmu *)info;
@@ -1258,7 +1252,6 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 	cpu_pmu->stop			= armv8pmu_stop;
 	cpu_pmu->reset			= armv8pmu_reset;
 	cpu_pmu->set_event_filter	= armv8pmu_set_event_filter;
-	cpu_pmu->filter			= armv8pmu_filter;
 
 	cpu_pmu->pmu.event_idx		= armv8pmu_user_event_idx;
 
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 9b593f985805..40f70f83daba 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -550,13 +550,7 @@ static void armpmu_disable(struct pmu *pmu)
 static bool armpmu_filter(struct pmu *pmu, int cpu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
-	bool ret;
-
-	ret = cpumask_test_cpu(cpu, &armpmu->supported_cpus);
-	if (ret && armpmu->filter)
-		return armpmu->filter(pmu, cpu);
-
-	return ret;
+	return !cpumask_test_cpu(cpu, &armpmu->supported_cpus);
 }
 
 static ssize_t cpus_show(struct device *dev,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index ef914a600087..525b5d64e394 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -100,7 +100,6 @@ struct arm_pmu {
 	void		(*stop)(struct arm_pmu *);
 	void		(*reset)(void *);
 	int		(*map_event)(struct perf_event *event);
-	bool		(*filter)(struct pmu *pmu, int cpu);
 	int		num_events;
 	bool		secure_access; /* 32-bit ARM only */
 #define ARMV8_PMUV3_MAX_COMMON_EVENTS		0x40
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] arm64: perf: reject CHAIN events at creation time
  2023-02-16 14:12 [PATCH 0/2] arm_pmu: fix fallout from context handling rewrite Mark Rutland
  2023-02-16 14:12 ` [PATCH 1/2] arm_pmu: fix event CPU filtering Mark Rutland
@ 2023-02-16 14:12 ` Mark Rutland
  2023-02-16 22:14 ` [PATCH 0/2] arm_pmu: fix fallout from context handling rewrite Will Deacon
  2 siblings, 0 replies; 7+ messages in thread
From: Mark Rutland @ 2023-02-16 14:12 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: asahi, ecurtin, j, lina, linux-kernel, mark.rutland, peterz,
	ravi.bangoria, will

Currently it's possible for a user to open CHAIN events arbitrarily,
which we previously tried to rule out in commit:

  ca2b497253ad01c8 ("arm64: perf: Reject stand-alone CHAIN events for PMUv3")

Which allowed the events to be opened, but prevented them from being
scheduled by by using an arm_pmu::filter_match hook to reject the
relevant events.

The CHAIN event filtering in the arm_pmu::filter_match hook was silently
removed in commit:

  bd27568117664b8b ("perf: Rewrite core context handling")

As a result, it's now possible for users to open CHAIN events, and for
these to be installed arbitrarily.

Fix this by rejecting CHAIN events at creation time. This avoids the
creation of events which will never count, and doesn't require using the
dynamic filtering.

Attempting to open a CHAIN event (0x1e) will now be rejected:

| # ./perf stat -e armv8_pmuv3/config=0x1e/ ls
| perf
|
|  Performance counter stats for 'ls':
|
|    <not supported>      armv8_pmuv3/config=0x1e/
|
|        0.002197470 seconds time elapsed
|
|        0.000000000 seconds user
|        0.002294000 seconds sys

Other events (e.g. CPU_CYCLES / 0x11) will open as usual:

| # ./perf stat -e armv8_pmuv3/config=0x11/ ls
| perf
|
|  Performance counter stats for 'ls':
|
|            2538761      armv8_pmuv3/config=0x11/
|
|        0.002227330 seconds time elapsed
|
|        0.002369000 seconds user
|        0.000000000 seconds sys

Fixes: bd27568117664b8b ("perf: Rewrite core context handling")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/perf_event.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 3e43538f6b72..dde06c0f97f3 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1063,6 +1063,14 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
 				       &armv8_pmuv3_perf_cache_map,
 				       ARMV8_PMU_EVTYPE_EVENT);
 
+	/*
+	 * CHAIN events only work when paired with an adjacent counter, and it
+	 * never makes sense for a user to open one in isolation, as they'll be
+	 * rotated arbitrarily.
+	 */
+	if (hw_event_id == ARMV8_PMUV3_PERFCTR_CHAIN)
+		return -EINVAL;
+
 	if (armv8pmu_event_is_64bit(event))
 		event->hw.flags |= ARMPMU_EVT_64BIT;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] arm_pmu: fix event CPU filtering
  2023-02-16 14:12 ` [PATCH 1/2] arm_pmu: fix event CPU filtering Mark Rutland
@ 2023-02-16 14:35   ` Janne Grunau
  2023-02-16 15:13     ` Mark Rutland
  0 siblings, 1 reply; 7+ messages in thread
From: Janne Grunau @ 2023-02-16 14:35 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, asahi, ecurtin, lina, linux-kernel, peterz,
	ravi.bangoria, will

On 2023-02-16 14:12:38 +0000, Mark Rutland wrote:
> Janne reports that perf has been broken on Apple M1 as of commit:
> 
>   bd27568117664b8b ("perf: Rewrite core context handling")
> 
> That commit replaced the pmu::filter_match() callback with
> pmu::filter(), whose return value has the opposite polarity, with true
> implying events should be ignored rather than scheduled. While an
> attempt was made to update the logic in armv8pmu_filter() and
> armpmu_filter() accordingly, the return value remains inverted in a
> couple of cases:
> 
> * If the arm_pmu does not have an arm_pmu::filter() callback,
>   armpmu_filter() will always return whether the CPU is supported rather
>   than whether the CPU is not supported.
> 
>   As a result, the perf core will not schedule events on supported CPUs,
>   resulting in a loss of events. Additionally, the perf core will
>   attempt to schedule events on unsupported CPUs, but this will be
>   rejected by armpmu_add(), which may result in a loss of events from
>   other PMUs on those unsupported CPUs.
> 
> * If the arm_pmu does have an arm_pmu::filter() callback, and
>   armpmu_filter() is called on a CPU which is not supported by the
>   arm_pmu, armpmu_filter() will return false rather than true.
> 
>   As a result, the perf core will attempt to schedule events on
>   unsupported CPUs, but this will be rejected by armpmu_add(), which may
>   result in a loss of events from other PMUs on those unsupported CPUs.
> 
> This means a loss of events can be seen with any arm_pmu driver, but
> with the ARMv8 PMUv3 driver (which is the only arm_pmu driver with an
> arm_pmu::filter() callback) the event loss will be more limited and may
> go unnoticed, which is how this issue evaded testing so far.
> 
> Fix the CPU filtering by performing this consistently in
> armpmu_filter(), and remove the redundant arm_pmu::filter() callback and
> armv8pmu_filter() implementation.
> 
> Commit bd2756811766 also silently removed the CHAIN event filtering from
> armv8pmu_filter(), which will be addressed by a separate patch without
> using the filter callback.
> 
> Fixes: bd27568117664b8b ("perf: Rewrite core context handling")
> Reported-by: Janne Grunau <j@jannau.net>
> Link: https://lore.kernel.org/asahi/20230215-arm_pmu_m1_regression-v1-1-f5a266577c8d@jannau.net/
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ravi Bangoria <ravi.bangoria@amd.com>
> Cc: Asahi Lina <lina@asahilina.net>
> Cc: Eric Curtin <ecurtin@redhat.com>
> ---
>  arch/arm64/kernel/perf_event.c | 7 -------
>  drivers/perf/arm_pmu.c         | 8 +-------
>  include/linux/perf/arm_pmu.h   | 1 -
>  3 files changed, 1 insertion(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index a5193f2146a6..3e43538f6b72 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -1023,12 +1023,6 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
>  	return 0;
>  }
>  
> -static bool armv8pmu_filter(struct pmu *pmu, int cpu)
> -{
> -	struct arm_pmu *armpmu = to_arm_pmu(pmu);
> -	return !cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus);
> -}
> -
>  static void armv8pmu_reset(void *info)
>  {
>  	struct arm_pmu *cpu_pmu = (struct arm_pmu *)info;
> @@ -1258,7 +1252,6 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
>  	cpu_pmu->stop			= armv8pmu_stop;
>  	cpu_pmu->reset			= armv8pmu_reset;
>  	cpu_pmu->set_event_filter	= armv8pmu_set_event_filter;
> -	cpu_pmu->filter			= armv8pmu_filter;
>  
>  	cpu_pmu->pmu.event_idx		= armv8pmu_user_event_idx;
>  
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 9b593f985805..40f70f83daba 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -550,13 +550,7 @@ static void armpmu_disable(struct pmu *pmu)
>  static bool armpmu_filter(struct pmu *pmu, int cpu)
>  {
>  	struct arm_pmu *armpmu = to_arm_pmu(pmu);
> -	bool ret;
> -
> -	ret = cpumask_test_cpu(cpu, &armpmu->supported_cpus);
> -	if (ret && armpmu->filter)
> -		return armpmu->filter(pmu, cpu);
> -
> -	return ret;
> +	return !cpumask_test_cpu(cpu, &armpmu->supported_cpus);
>  }
>  
>  static ssize_t cpus_show(struct device *dev,
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index ef914a600087..525b5d64e394 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -100,7 +100,6 @@ struct arm_pmu {
>  	void		(*stop)(struct arm_pmu *);
>  	void		(*reset)(void *);
>  	int		(*map_event)(struct perf_event *event);
> -	bool		(*filter)(struct pmu *pmu, int cpu);
>  	int		num_events;
>  	bool		secure_access; /* 32-bit ARM only */
>  #define ARMV8_PMUV3_MAX_COMMON_EVENTS		0x40

This works as well. I limited the patch to the minimal fix this                                                   
this late in the cycle.

Tested-by: Janne Grunau <j@jannau.net>

thanks,
Janne

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] arm_pmu: fix event CPU filtering
  2023-02-16 14:35   ` Janne Grunau
@ 2023-02-16 15:13     ` Mark Rutland
  2023-02-16 15:17       ` Will Deacon
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Rutland @ 2023-02-16 15:13 UTC (permalink / raw)
  To: Janne Grunau, will
  Cc: linux-arm-kernel, asahi, ecurtin, lina, linux-kernel, peterz,
	ravi.bangoria

On Thu, Feb 16, 2023 at 03:35:19PM +0100, Janne Grunau wrote:
> On 2023-02-16 14:12:38 +0000, Mark Rutland wrote:
> > Fix the CPU filtering by performing this consistently in
> > armpmu_filter(), and remove the redundant arm_pmu::filter() callback and
> > armv8pmu_filter() implementation.
> > 
> > Commit bd2756811766 also silently removed the CHAIN event filtering from
> > armv8pmu_filter(), which will be addressed by a separate patch without
> > using the filter callback.

[...]

> This works as well. I limited the patch to the minimal fix this                                                   
> this late in the cycle.

I did appreciate that you'd made the effort for the minimal fix; had the issue
with CHAIN events not existed I would have acked that as-is and done the
simplification later. Given the CHAIN issue and given the simplification make
the code "obviously correct" I think it's preferable to do both bits now.

> Tested-by: Janne Grunau <j@jannau.net>

Thanks!

Hopefully Will or Peter can pick this up shortly; I'm assuming that Will can
take this via the arm64 tree.

Mark.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] arm_pmu: fix event CPU filtering
  2023-02-16 15:13     ` Mark Rutland
@ 2023-02-16 15:17       ` Will Deacon
  0 siblings, 0 replies; 7+ messages in thread
From: Will Deacon @ 2023-02-16 15:17 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Janne Grunau, linux-arm-kernel, asahi, ecurtin, lina,
	linux-kernel, peterz, ravi.bangoria

On Thu, Feb 16, 2023 at 03:13:11PM +0000, Mark Rutland wrote:
> On Thu, Feb 16, 2023 at 03:35:19PM +0100, Janne Grunau wrote:
> > On 2023-02-16 14:12:38 +0000, Mark Rutland wrote:
> > > Fix the CPU filtering by performing this consistently in
> > > armpmu_filter(), and remove the redundant arm_pmu::filter() callback and
> > > armv8pmu_filter() implementation.
> > > 
> > > Commit bd2756811766 also silently removed the CHAIN event filtering from
> > > armv8pmu_filter(), which will be addressed by a separate patch without
> > > using the filter callback.
> 
> [...]
> 
> > This works as well. I limited the patch to the minimal fix this                                                   
> > this late in the cycle.
> 
> I did appreciate that you'd made the effort for the minimal fix; had the issue
> with CHAIN events not existed I would have acked that as-is and done the
> simplification later. Given the CHAIN issue and given the simplification make
> the code "obviously correct" I think it's preferable to do both bits now.
> 
> > Tested-by: Janne Grunau <j@jannau.net>
> 
> Thanks!
> 
> Hopefully Will or Peter can pick this up shortly; I'm assuming that Will can
> take this via the arm64 tree.

I'll grab 'em.

Will

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] arm_pmu: fix fallout from context handling rewrite
  2023-02-16 14:12 [PATCH 0/2] arm_pmu: fix fallout from context handling rewrite Mark Rutland
  2023-02-16 14:12 ` [PATCH 1/2] arm_pmu: fix event CPU filtering Mark Rutland
  2023-02-16 14:12 ` [PATCH 2/2] arm64: perf: reject CHAIN events at creation time Mark Rutland
@ 2023-02-16 22:14 ` Will Deacon
  2 siblings, 0 replies; 7+ messages in thread
From: Will Deacon @ 2023-02-16 22:14 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel
  Cc: catalin.marinas, kernel-team, Will Deacon, peterz, ravi.bangoria,
	j, ecurtin, lina, asahi, linux-kernel

On Thu, 16 Feb 2023 14:12:37 +0000, Mark Rutland wrote:
> Janne reports [1] that perf has been broken on Apple M1 as of commit:
> 
>   bd27568117664b8b ("perf: Rewrite core context handling")
> 
> This is due to changes to pmu::filter_match() and
> arm_pmu::filter_match(), which have been renamed and had their polarity
> inverted, but the conversion was inconsistent, and so in some cases we
> return the opposite result relative to what we had intended. This
> results in consistently losing events on Apple M1.
> 
> [...]

Applied to arm64 (for-next/fixes), thanks!

[1/2] arm_pmu: fix event CPU filtering
      https://git.kernel.org/arm64/c/61d038627343
[2/2] arm64: perf: reject CHAIN events at creation time
      https://git.kernel.org/arm64/c/853e2dac25c1

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-02-16 22:14 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-16 14:12 [PATCH 0/2] arm_pmu: fix fallout from context handling rewrite Mark Rutland
2023-02-16 14:12 ` [PATCH 1/2] arm_pmu: fix event CPU filtering Mark Rutland
2023-02-16 14:35   ` Janne Grunau
2023-02-16 15:13     ` Mark Rutland
2023-02-16 15:17       ` Will Deacon
2023-02-16 14:12 ` [PATCH 2/2] arm64: perf: reject CHAIN events at creation time Mark Rutland
2023-02-16 22:14 ` [PATCH 0/2] arm_pmu: fix fallout from context handling rewrite Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).