linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFCv1 0/4] arm64: Use static key for PID in CONTEXTIDR
@ 2021-10-21 13:45 Leo Yan
  2021-10-21 13:45 ` [RFCv1 1/4] arm64: Use static key for tracing " Leo Yan
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Leo Yan @ 2021-10-21 13:45 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

Kernel provides configuration PID_IN_CONTEXTIDR so we can save PID into
system register contextidr_el{1|2}.  This means developer must build
kernel with this config, otherwise, the feature is disabled and cannot
be used by hardware tracing modules (Like Arm CoreSight, SPE, etc) on
the fly.

Suggested by Stephane Eranian, this patch series introduces static key
for PID in CONTEXTIDR.  If the config PID_IN_CONTEXTIDR is selected when
build kernel, then the static key will be set as true and kernel will
always trace PID into CONTEXTIDR, so we can keep the same semantics for
PID_IN_CONTEXTIDR before and after applying this patch series.

If the config PID_IN_CONTEXTIDR is not selected, the kernel modules
still can invoke the pair functions contextidr_enable() and
contextidr_disable() to dynamically enable and disable PID tracing in
the profiling flow.  As result, Arm SPE is the first consumer to use
static key.

When I review Arm CoreSight driver, I found it misses to check root PID
namespace for its register setting.  So it would use a dedicate patch
series to firstly correct namespace checking and then apply static key
for PID tracing.  For this reason, this patch set doesn't contain Arm
CoreSight related enhancement.

We also need to provide arm32 variant to use static key for PID in
CONTEXTIDR.  I'd like to send out this patch series firstly for comment,
in case this approach is not accepted by maintainer.  If we can conclude
this is the right thing to do, I will supplement arm32 support in next
spin.

This patch set has been verified on Hisilicon D06 board with Arm SPE
driver.

I tested the performance for using static key, the result doesn't
show regression.  In the testing, I used the command 'perf bench sched
messaging -t -g 20 -l 1000' to measure the scheduling latency for 4
different modes:

    'dis': Disable kernel configuration PID_IN_CONTEXTIDR.
    'enb': Enable kernel configuration PID_IN_CONTEXTIDR.
    'true': Set static key to 'TRUE'
    'false': Set static key to 'FALSE' (so don't store PID into CONTEXTIDR)

The testing iterates for 5 loops for every configuration, and get the
run time (in seconds):

          dis      enb     true     false
    ---+--------+--------+--------+-------
    #1   26.568 | 26.786 | 26.056 | 26.197
    #2   26.442 | 26.457 | 26.458 | 26.846
    #3   26.719 | 26.701 | 27.119 | 26.281
    #4   26.448 | 27.595 | 26.953 | 27.043
    #5   27.017 | 27.263 | 26.638 | 26.933
    ---+--------+--------+--------+-------
    avg. 26.638 | 26.960 | 26.644 | 26.66
    ---+--------+--------+--------+-------
    delta pct.  | 1.21%  | 0.02%  | 0.08%


Leo Yan (4):
  arm64: Use static key for tracing PID in CONTEXTIDR
  arm64: entry: Always apply workaround for contextidr_el1
  arm64: Introduce functions for controlling PID tracing
  perf: arm_spe: Dynamically switch PID tracing to contextidr

 arch/arm64/include/asm/mmu_context.h | 14 +++++++++++++-
 arch/arm64/kernel/entry.S            |  4 ----
 arch/arm64/kernel/process.c          | 11 +++++++++++
 drivers/perf/arm_spe_pmu.c           | 13 ++++++++++++-
 4 files changed, 36 insertions(+), 6 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFCv1 1/4] arm64: Use static key for tracing PID in CONTEXTIDR
  2021-10-21 13:45 [RFCv1 0/4] arm64: Use static key for PID in CONTEXTIDR Leo Yan
@ 2021-10-21 13:45 ` Leo Yan
  2021-10-21 14:33   ` James Clark
  2021-10-21 13:45 ` [RFCv1 2/4] arm64: entry: Always apply workaround for contextidr_el1 Leo Yan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Leo Yan @ 2021-10-21 13:45 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

The kernel provides CONFIG_PID_IN_CONTEXTIDR for tracing PID into system
register CONTEXTIDR; we need to statically enable this configuration
when build kernel image to use this feature.

On the other hand, hardware tracing modules (e.g. Arm CoreSight, SPE,
etc) rely on this feature to provide context info in their tracing data.
If kernel has not enabled configuration CONFIG_PID_IN_CONTEXTIDR, then
tracing modules have no chance to capture PID related info.

This patch introduces static key for tracing PID in CONTEXTIDR, it
provides a possibility for device driver to dynamically enable and
disable tracing PID into CONTEXTIDR as needed.

As the first step, the kernel increases the static key if
CONFIG_PID_IN_CONTEXTIDR is enabled when booting kernel, in this case
kernel will always trace PID into CONTEXTIDR at the runtime.  This means
before and after applying this patch, the semantics for
CONFIG_PID_IN_CONTEXTIDR are consistent.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/include/asm/mmu_context.h |  4 +++-
 arch/arm64/kernel/process.c          | 11 +++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index f4ba93d4ffeb..e1f33616f83a 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -26,9 +26,11 @@
 
 extern bool rodata_full;
 
+DECLARE_STATIC_KEY_FALSE(contextidr_in_use);
+
 static inline void contextidr_thread_switch(struct task_struct *next)
 {
-	if (!IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR))
+	if (!static_branch_unlikely(&contextidr_in_use))
 		return;
 
 	write_sysreg(task_pid_nr(next), contextidr_el1);
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 40adb8cdbf5a..d744c0c7e4c4 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -61,6 +61,9 @@ unsigned long __stack_chk_guard __ro_after_init;
 EXPORT_SYMBOL(__stack_chk_guard);
 #endif
 
+DEFINE_STATIC_KEY_FALSE(contextidr_in_use);
+EXPORT_SYMBOL_GPL(contextidr_in_use);
+
 /*
  * Function pointers to optional machine specific functions
  */
@@ -721,3 +724,11 @@ int arch_elf_adjust_prot(int prot, const struct arch_elf_state *state,
 	return prot;
 }
 #endif
+
+static int __init contextidr_init(void)
+{
+	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR))
+		static_branch_inc(&contextidr_in_use);
+	return 0;
+}
+early_initcall(contextidr_init);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFCv1 2/4] arm64: entry: Always apply workaround for contextidr_el1
  2021-10-21 13:45 [RFCv1 0/4] arm64: Use static key for PID in CONTEXTIDR Leo Yan
  2021-10-21 13:45 ` [RFCv1 1/4] arm64: Use static key for tracing " Leo Yan
@ 2021-10-21 13:45 ` Leo Yan
  2021-10-21 13:45 ` [RFCv1 3/4] arm64: Introduce functions for controlling PID tracing Leo Yan
  2021-10-21 13:45 ` [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr Leo Yan
  3 siblings, 0 replies; 24+ messages in thread
From: Leo Yan @ 2021-10-21 13:45 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

After introducing static key as the switch for tracing PID into
contextidr_el1, the kernel can dynamically turn on the static key to
use PID tracing to contextidr_el1.  This means even the config
CONFIG_PID_IN_CONTEXTIDR is not selected, the kernel still can use
contextidr_el1.

When erratum 84571 is detected, the workaround should be always
applied on contextidr_el1, particularly if the static key is enabled
dynamically.

This patch would introduce minor overload by one extra instruction for
accessing system register contextidr_el1 and it only impacts platform
which erratum 84571.  So it's expected to not cause any significant
regression.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/kernel/entry.S | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index bc6d5a970a13..c41a4cfff527 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -356,12 +356,8 @@ alternative_else_nop_endif
 
 #ifdef CONFIG_ARM64_ERRATUM_845719
 alternative_if ARM64_WORKAROUND_845719
-#ifdef CONFIG_PID_IN_CONTEXTIDR
 	mrs	x29, contextidr_el1
 	msr	contextidr_el1, x29
-#else
-	msr contextidr_el1, xzr
-#endif
 alternative_else_nop_endif
 #endif
 3:
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFCv1 3/4] arm64: Introduce functions for controlling PID tracing
  2021-10-21 13:45 [RFCv1 0/4] arm64: Use static key for PID in CONTEXTIDR Leo Yan
  2021-10-21 13:45 ` [RFCv1 1/4] arm64: Use static key for tracing " Leo Yan
  2021-10-21 13:45 ` [RFCv1 2/4] arm64: entry: Always apply workaround for contextidr_el1 Leo Yan
@ 2021-10-21 13:45 ` Leo Yan
  2021-10-21 13:45 ` [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr Leo Yan
  3 siblings, 0 replies; 24+ messages in thread
From: Leo Yan @ 2021-10-21 13:45 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

Introduce two functions contextidr_enable() and contextidr_disable(),
which can be used by kernel modules to turn on or off PID tracing in
contextidr register.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 arch/arm64/include/asm/mmu_context.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index e1f33616f83a..0c1669db19a1 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -37,6 +37,16 @@ static inline void contextidr_thread_switch(struct task_struct *next)
 	isb();
 }
 
+static inline void contextidr_enable(void)
+{
+	static_branch_inc(&contextidr_in_use);
+}
+
+static inline void contextidr_disable(void)
+{
+	static_branch_dec(&contextidr_in_use);
+}
+
 /*
  * Set TTBR0 to reserved_pg_dir. No translations will be possible via TTBR0.
  */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-10-21 13:45 [RFCv1 0/4] arm64: Use static key for PID in CONTEXTIDR Leo Yan
                   ` (2 preceding siblings ...)
  2021-10-21 13:45 ` [RFCv1 3/4] arm64: Introduce functions for controlling PID tracing Leo Yan
@ 2021-10-21 13:45 ` Leo Yan
  2021-10-21 15:49   ` Kees Cook
  2021-10-22 15:36   ` James Clark
  3 siblings, 2 replies; 24+ messages in thread
From: Leo Yan @ 2021-10-21 13:45 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
driver invokes these functions to dynamically enable it during
profiling when the program runs in root PID name space, and disable PID
tracing when the perf event is stopped.

Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
tracing, so this patch uses the consistent condition for setting bit
EL1_CX for PMSCR.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 drivers/perf/arm_spe_pmu.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index d44bcc29d99c..935343cdcb39 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -23,6 +23,7 @@
 #include <linux/irq.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
+#include <linux/mmu_context.h>
 #include <linux/module.h>
 #include <linux/of_address.h>
 #include <linux/of_device.h>
@@ -272,7 +273,7 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
 	if (!attr->exclude_kernel)
 		reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
 
-	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
+	if (perfmon_capable() && (task_active_pid_ns(current) == &init_pid_ns))
 		reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
 
 	return reg;
@@ -731,6 +732,13 @@ static void arm_spe_pmu_start(struct perf_event *event, int flags)
 	if (hwc->state)
 		return;
 
+	/*
+	 * Enable tracing PID to contextidr if profiling program runs in
+	 * root PID namespace.
+	 */
+	if (perfmon_capable() && (task_active_pid_ns(current) == &init_pid_ns))
+		contextidr_enable();
+
 	reg = arm_spe_event_to_pmsfcr(event);
 	write_sysreg_s(reg, SYS_PMSFCR_EL1);
 
@@ -792,6 +800,9 @@ static void arm_spe_pmu_stop(struct perf_event *event, int flags)
 	}
 
 	hwc->state |= PERF_HES_STOPPED;
+
+	if (perfmon_capable() && (task_active_pid_ns(current) == &init_pid_ns))
+		contextidr_disable();
 }
 
 static int arm_spe_pmu_add(struct perf_event *event, int flags)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFCv1 1/4] arm64: Use static key for tracing PID in CONTEXTIDR
  2021-10-21 13:45 ` [RFCv1 1/4] arm64: Use static key for tracing " Leo Yan
@ 2021-10-21 14:33   ` James Clark
  2021-10-21 14:37     ` Leo Yan
  2021-10-21 15:47     ` Kees Cook
  0 siblings, 2 replies; 24+ messages in thread
From: James Clark @ 2021-10-21 14:33 UTC (permalink / raw)
  To: Leo Yan, Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, linux-arm-kernel, linux-kernel



On 21/10/2021 14:45, Leo Yan wrote:
> The kernel provides CONFIG_PID_IN_CONTEXTIDR for tracing PID into system
> register CONTEXTIDR; we need to statically enable this configuration
> when build kernel image to use this feature.
> 
> On the other hand, hardware tracing modules (e.g. Arm CoreSight, SPE,
> etc) rely on this feature to provide context info in their tracing data.
> If kernel has not enabled configuration CONFIG_PID_IN_CONTEXTIDR, then
> tracing modules have no chance to capture PID related info.
> 
> This patch introduces static key for tracing PID in CONTEXTIDR, it
> provides a possibility for device driver to dynamically enable and
> disable tracing PID into CONTEXTIDR as needed.
> 
> As the first step, the kernel increases the static key if
> CONFIG_PID_IN_CONTEXTIDR is enabled when booting kernel, in this case
> kernel will always trace PID into CONTEXTIDR at the runtime.  This means
> before and after applying this patch, the semantics for
> CONFIG_PID_IN_CONTEXTIDR are consistent.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  arch/arm64/include/asm/mmu_context.h |  4 +++-
>  arch/arm64/kernel/process.c          | 11 +++++++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index f4ba93d4ffeb..e1f33616f83a 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -26,9 +26,11 @@
>  
>  extern bool rodata_full;
>  
> +DECLARE_STATIC_KEY_FALSE(contextidr_in_use);
> +
>  static inline void contextidr_thread_switch(struct task_struct *next)
>  {
> -	if (!IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR))
> +	if (!static_branch_unlikely(&contextidr_in_use))
>  		return;
>  
>  	write_sysreg(task_pid_nr(next), contextidr_el1);
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 40adb8cdbf5a..d744c0c7e4c4 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -61,6 +61,9 @@ unsigned long __stack_chk_guard __ro_after_init;
>  EXPORT_SYMBOL(__stack_chk_guard);
>  #endif
>  
> +DEFINE_STATIC_KEY_FALSE(contextidr_in_use);
> +EXPORT_SYMBOL_GPL(contextidr_in_use);
> +
>  /*
>   * Function pointers to optional machine specific functions
>   */
> @@ -721,3 +724,11 @@ int arch_elf_adjust_prot(int prot, const struct arch_elf_state *state,
>  	return prot;
>  }
>  #endif
> +
> +static int __init contextidr_init(void)
> +{
> +	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR))
> +		static_branch_inc(&contextidr_in_use);
> +	return 0;
> +}
> +early_initcall(contextidr_init);

Hi Leo,

Can you skip this early_initcall() part if you do something like this:

    DECLARE_STATIC_KEY_MAYBE(CONFIG_PID_IN_CONTEXTIDR, contextidr_in_use)

It seems like there is a way to conditionally initialise it to true.

James


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 1/4] arm64: Use static key for tracing PID in CONTEXTIDR
  2021-10-21 14:33   ` James Clark
@ 2021-10-21 14:37     ` Leo Yan
  2021-10-21 15:47     ` Kees Cook
  1 sibling, 0 replies; 24+ messages in thread
From: Leo Yan @ 2021-10-21 14:37 UTC (permalink / raw)
  To: James Clark
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, linux-arm-kernel, linux-kernel

Hi James,

On Thu, Oct 21, 2021 at 03:33:01PM +0100, James Clark wrote:

[...]

> > +static int __init contextidr_init(void)
> > +{
> > +	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR))
> > +		static_branch_inc(&contextidr_in_use);
> > +	return 0;
> > +}
> > +early_initcall(contextidr_init);
> 
> Hi Leo,
> 
> Can you skip this early_initcall() part if you do something like this:
> 
>     DECLARE_STATIC_KEY_MAYBE(CONFIG_PID_IN_CONTEXTIDR, contextidr_in_use)
> 
> It seems like there is a way to conditionally initialise it to true.

Thanks for good point!  Will test this way in next spin.

Leo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 1/4] arm64: Use static key for tracing PID in CONTEXTIDR
  2021-10-21 14:33   ` James Clark
  2021-10-21 14:37     ` Leo Yan
@ 2021-10-21 15:47     ` Kees Cook
  1 sibling, 0 replies; 24+ messages in thread
From: Kees Cook @ 2021-10-21 15:47 UTC (permalink / raw)
  To: James Clark
  Cc: Leo Yan, Catalin Marinas, Will Deacon, Mark Rutland,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, linux-arm-kernel, linux-kernel

On Thu, Oct 21, 2021 at 03:33:01PM +0100, James Clark wrote:
> 
> 
> On 21/10/2021 14:45, Leo Yan wrote:
> > The kernel provides CONFIG_PID_IN_CONTEXTIDR for tracing PID into system
> > register CONTEXTIDR; we need to statically enable this configuration
> > when build kernel image to use this feature.
> > 
> > On the other hand, hardware tracing modules (e.g. Arm CoreSight, SPE,
> > etc) rely on this feature to provide context info in their tracing data.
> > If kernel has not enabled configuration CONFIG_PID_IN_CONTEXTIDR, then
> > tracing modules have no chance to capture PID related info.
> > 
> > This patch introduces static key for tracing PID in CONTEXTIDR, it
> > provides a possibility for device driver to dynamically enable and
> > disable tracing PID into CONTEXTIDR as needed.
> > 
> > As the first step, the kernel increases the static key if
> > CONFIG_PID_IN_CONTEXTIDR is enabled when booting kernel, in this case
> > kernel will always trace PID into CONTEXTIDR at the runtime.  This means
> > before and after applying this patch, the semantics for
> > CONFIG_PID_IN_CONTEXTIDR are consistent.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  arch/arm64/include/asm/mmu_context.h |  4 +++-
> >  arch/arm64/kernel/process.c          | 11 +++++++++++
> >  2 files changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> > index f4ba93d4ffeb..e1f33616f83a 100644
> > --- a/arch/arm64/include/asm/mmu_context.h
> > +++ b/arch/arm64/include/asm/mmu_context.h
> > @@ -26,9 +26,11 @@
> >  
> >  extern bool rodata_full;
> >  
> > +DECLARE_STATIC_KEY_FALSE(contextidr_in_use);
> > +
> >  static inline void contextidr_thread_switch(struct task_struct *next)
> >  {
> > -	if (!IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR))
> > +	if (!static_branch_unlikely(&contextidr_in_use))
> >  		return;
> >  
> >  	write_sysreg(task_pid_nr(next), contextidr_el1);
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 40adb8cdbf5a..d744c0c7e4c4 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -61,6 +61,9 @@ unsigned long __stack_chk_guard __ro_after_init;
> >  EXPORT_SYMBOL(__stack_chk_guard);
> >  #endif
> >  
> > +DEFINE_STATIC_KEY_FALSE(contextidr_in_use);
> > +EXPORT_SYMBOL_GPL(contextidr_in_use);
> > +
> >  /*
> >   * Function pointers to optional machine specific functions
> >   */
> > @@ -721,3 +724,11 @@ int arch_elf_adjust_prot(int prot, const struct arch_elf_state *state,
> >  	return prot;
> >  }
> >  #endif
> > +
> > +static int __init contextidr_init(void)
> > +{
> > +	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR))
> > +		static_branch_inc(&contextidr_in_use);
> > +	return 0;
> > +}
> > +early_initcall(contextidr_init);
> 
> Hi Leo,
> 
> Can you skip this early_initcall() part if you do something like this:
> 
>     DECLARE_STATIC_KEY_MAYBE(CONFIG_PID_IN_CONTEXTIDR, contextidr_in_use)
> 
> It seems like there is a way to conditionally initialise it to true.

I was going to suggest the same thing. :) With this change, it looks
good:

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-10-21 13:45 ` [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr Leo Yan
@ 2021-10-21 15:49   ` Kees Cook
  2021-10-22  2:09     ` Leo Yan
  2021-11-01 15:28     ` Leo Yan
  2021-10-22 15:36   ` James Clark
  1 sibling, 2 replies; 24+ messages in thread
From: Kees Cook @ 2021-10-21 15:49 UTC (permalink / raw)
  To: Leo Yan
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

On Thu, Oct 21, 2021 at 09:45:30PM +0800, Leo Yan wrote:
> Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
> driver invokes these functions to dynamically enable it during
> profiling when the program runs in root PID name space, and disable PID
> tracing when the perf event is stopped.
> 
> Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
> tracing, so this patch uses the consistent condition for setting bit
> EL1_CX for PMSCR.

My own preference here would be to not bother with the new
enable/disable helpers, but just open code it right here. (Save a patch
and is the only user.) But I defer to the taste of arm64 maintainers. :)

-Kees

> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  drivers/perf/arm_spe_pmu.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index d44bcc29d99c..935343cdcb39 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -23,6 +23,7 @@
>  #include <linux/irq.h>
>  #include <linux/kernel.h>
>  #include <linux/list.h>
> +#include <linux/mmu_context.h>
>  #include <linux/module.h>
>  #include <linux/of_address.h>
>  #include <linux/of_device.h>
> @@ -272,7 +273,7 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
>  	if (!attr->exclude_kernel)
>  		reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
>  
> -	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
> +	if (perfmon_capable() && (task_active_pid_ns(current) == &init_pid_ns))
>  		reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
>  
>  	return reg;
> @@ -731,6 +732,13 @@ static void arm_spe_pmu_start(struct perf_event *event, int flags)
>  	if (hwc->state)
>  		return;
>  
> +	/*
> +	 * Enable tracing PID to contextidr if profiling program runs in
> +	 * root PID namespace.
> +	 */
> +	if (perfmon_capable() && (task_active_pid_ns(current) == &init_pid_ns))
> +		contextidr_enable();
> +
>  	reg = arm_spe_event_to_pmsfcr(event);
>  	write_sysreg_s(reg, SYS_PMSFCR_EL1);
>  
> @@ -792,6 +800,9 @@ static void arm_spe_pmu_stop(struct perf_event *event, int flags)
>  	}
>  
>  	hwc->state |= PERF_HES_STOPPED;
> +
> +	if (perfmon_capable() && (task_active_pid_ns(current) == &init_pid_ns))
> +		contextidr_disable();
>  }
>  
>  static int arm_spe_pmu_add(struct perf_event *event, int flags)
> -- 
> 2.25.1
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-10-21 15:49   ` Kees Cook
@ 2021-10-22  2:09     ` Leo Yan
  2021-11-01 15:28     ` Leo Yan
  1 sibling, 0 replies; 24+ messages in thread
From: Leo Yan @ 2021-10-22  2:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

Hi Kees,

On Thu, Oct 21, 2021 at 08:49:46AM -0700, Kees Cook wrote:
> On Thu, Oct 21, 2021 at 09:45:30PM +0800, Leo Yan wrote:
> > Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
> > driver invokes these functions to dynamically enable it during
> > profiling when the program runs in root PID name space, and disable PID
> > tracing when the perf event is stopped.
> > 
> > Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
> > tracing, so this patch uses the consistent condition for setting bit
> > EL1_CX for PMSCR.
> 
> My own preference here would be to not bother with the new
> enable/disable helpers, but just open code it right here. (Save a patch
> and is the only user.) But I defer to the taste of arm64 maintainers. :)

Yes, with your reminding I recognize that we can avoid the new helpers.
Just remind, tracing PID in contextidr is not only used by Arm SPE
driver, it will be used in Arm CoreSight driver as well.  I plan to use
a separate patch set to address Arm CoreSight (CoreSight driver misses
to checking root PID namespace so need firstly fix that issue).

Just give more info, so you and arm64 maintainers could judge we
should use helpers or directly access static key.

Thanks for your review!

Leo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-10-21 13:45 ` [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr Leo Yan
  2021-10-21 15:49   ` Kees Cook
@ 2021-10-22 15:36   ` James Clark
  2021-10-22 15:40     ` James Clark
  2021-10-22 16:23     ` James Clark
  1 sibling, 2 replies; 24+ messages in thread
From: James Clark @ 2021-10-22 15:36 UTC (permalink / raw)
  To: Leo Yan, Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, linux-arm-kernel, linux-kernel



On 21/10/2021 14:45, Leo Yan wrote:
> Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
> driver invokes these functions to dynamically enable it during
> profiling when the program runs in root PID name space, and disable PID
> tracing when the perf event is stopped.
> 
> Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
> tracing, so this patch uses the consistent condition for setting bit
> EL1_CX for PMSCR.

Hi Leo,

I've been testing this change, but I'm seeing something strange. Not sure
if it's a problem on my side or not yet. With this command:

 sudo ./perf record -vvv -e arm_spe//u -- taskset --cpu-list 1 bash -c ls

I'm only seeing 0 values for context:

 sudo ./perf report -D | grep CONTEXT

.  00038dce:  65 00 00 00 00                                  CONTEXT 0x0 el2
.  00038e0e:  65 00 00 00 00                                  CONTEXT 0x0 el2

I added a printk to the function, and I see it print non zero values, although
there are some zero ones mixed in there too:

 diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 0c1669db19a1..8f0fb43a5fac 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -33,7 +33,8 @@ static inline void contextidr_thread_switch(struct task_struct *next)
        if (!static_branch_unlikely(&contextidr_in_use))
                return;
 
-       write_sysreg(task_pid_nr(next), contextidr_el1);
+       printk("Set %d\n", task_pid_nr(next));
+       write_sysreg(task_pid_nr(next), contextidr_el2);
        isb();
 }
 

Results in this:

[   53.257905] Set 77
[   53.257909] Set 0
[   53.258180] Set 77
[   53.258183] Set 0
[   53.258385] Set 309
[   53.258385] Set 172
[   53.258425] Set 77
[   53.258443] Set 990
[   53.258449] Set 77
[   53.258455] Set 990
[   53.258467] Set 310
[   53.258719] Set 7
[   53.258728] Set 77
[   53.258731] Set 0
[   53.258733] Set 0
[   53.258738] Set 7


Without your patchset I don't get 0 values in the SPE trace anymore:

.  0000050e:  65 b1 01 00 00                                  CONTEXT 0x1b1 el2
.  0000054e:  65 b1 01 00 00                                  CONTEXT 0x1b1 el2
.  0000058e:  65 ac 01 00 00                                  CONTEXT 0x1ac el2
.  000005ce:  65 ac 01 00 00                                  CONTEXT 0x1ac el2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-10-22 15:36   ` James Clark
@ 2021-10-22 15:40     ` James Clark
  2021-10-22 16:23     ` James Clark
  1 sibling, 0 replies; 24+ messages in thread
From: James Clark @ 2021-10-22 15:40 UTC (permalink / raw)
  To: Leo Yan
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, linux-arm-kernel, linux-kernel



On 22/10/2021 16:36, James Clark wrote:
> 
> 
> On 21/10/2021 14:45, Leo Yan wrote:
>> Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
>> driver invokes these functions to dynamically enable it during
>> profiling when the program runs in root PID name space, and disable PID
>> tracing when the perf event is stopped.
>>
>> Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
>> tracing, so this patch uses the consistent condition for setting bit
>> EL1_CX for PMSCR.
> 
> Hi Leo,
> 
> I've been testing this change, but I'm seeing something strange. Not sure
> if it's a problem on my side or not yet. With this command:
> 
>  sudo ./perf record -vvv -e arm_spe//u -- taskset --cpu-list 1 bash -c ls
> 
> I'm only seeing 0 values for context:
> 
>  sudo ./perf report -D | grep CONTEXT
> 
> .  00038dce:  65 00 00 00 00                                  CONTEXT 0x0 el2
> .  00038e0e:  65 00 00 00 00                                  CONTEXT 0x0 el2
> 
> I added a printk to the function, and I see it print non zero values, although
> there are some zero ones mixed in there too:
> 
>  diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index 0c1669db19a1..8f0fb43a5fac 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -33,7 +33,8 @@ static inline void contextidr_thread_switch(struct task_struct *next)
>         if (!static_branch_unlikely(&contextidr_in_use))
>                 return;
>  
> -       write_sysreg(task_pid_nr(next), contextidr_el1);
> +       printk("Set %d\n", task_pid_nr(next));
> +       write_sysreg(task_pid_nr(next), contextidr_el2);

Ignore this second line change, that doesn't even compile. It's just the printk that I added.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-10-22 15:36   ` James Clark
  2021-10-22 15:40     ` James Clark
@ 2021-10-22 16:23     ` James Clark
  2021-10-24 10:25       ` Leo Yan
  1 sibling, 1 reply; 24+ messages in thread
From: James Clark @ 2021-10-22 16:23 UTC (permalink / raw)
  To: Leo Yan
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, linux-arm-kernel, linux-kernel



On 22/10/2021 16:36, James Clark wrote:
> 
> 
> On 21/10/2021 14:45, Leo Yan wrote:
>> Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
>> driver invokes these functions to dynamically enable it during
>> profiling when the program runs in root PID name space, and disable PID
>> tracing when the perf event is stopped.
>>
>> Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
>> tracing, so this patch uses the consistent condition for setting bit
>> EL1_CX for PMSCR.
> 
> Hi Leo,
> 
> I've been testing this change, but I'm seeing something strange. Not sure
> if it's a problem on my side or not yet. With this command:
> 
>  sudo ./perf record -vvv -e arm_spe//u -- taskset --cpu-list 1 bash -c ls
> 
> I'm only seeing 0 values for context:
> 
>  sudo ./perf report -D | grep CONTEXT
> 
> .  00038dce:  65 00 00 00 00                                  CONTEXT 0x0 el2
> .  00038e0e:  65 00 00 00 00                                  CONTEXT 0x0 el2
> 
> I added a printk to the function, and I see it print non zero values, although
> there are some zero ones mixed in there too:
> 
>  diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index 0c1669db19a1..8f0fb43a5fac 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -33,7 +33,8 @@ static inline void contextidr_thread_switch(struct task_struct *next)
>         if (!static_branch_unlikely(&contextidr_in_use))
>                 return;
>  
> -       write_sysreg(task_pid_nr(next), contextidr_el1);
> +       printk("Set %d\n", task_pid_nr(next));
> +       write_sysreg(task_pid_nr(next), contextidr_el2);
>         isb();
>  }
>  
> 
> Results in this:
> 
> [   53.257905] Set 77
> [   53.257909] Set 0
> [   53.258180] Set 77
> [   53.258183] Set 0
> [   53.258385] Set 309
> [   53.258385] Set 172
> [   53.258425] Set 77
> [   53.258443] Set 990
> [   53.258449] Set 77
> [   53.258455] Set 990
> [   53.258467] Set 310
> [   53.258719] Set 7
> [   53.258728] Set 77
> [   53.258731] Set 0
> [   53.258733] Set 0
> [   53.258738] Set 7
> 
> 
> Without your patchset I don't get 0 values in the SPE trace anymore:
> 
> .  0000050e:  65 b1 01 00 00                                  CONTEXT 0x1b1 el2
> .  0000054e:  65 b1 01 00 00                                  CONTEXT 0x1b1 el2
> .  0000058e:  65 ac 01 00 00                                  CONTEXT 0x1ac el2
> .  000005ce:  65 ac 01 00 00                                  CONTEXT 0x1ac el2
> 

Is it an issue with building with CONTEXTIDR disabled? Seems like this change results
in context packets set to 0 when it's disabled rather than having the packets disabled
like they used to be:

zcat /proc/config.gz | grep CONTEXTIDR
# CONFIG_PID_IN_CONTEXTIDR is not set

sudo ./perf report -D | grep CONTEXT
.  00045b4e:  65 00 00 00 00                                  CONTEXT 0x0 el2

When I build with CONFIG_PID_IN_CONTEXTIDR=y the contexts are non zero so it seems to
be working that way. But ./perf record -e arm_spe//u -a does have context IDs even
when CONFIG_PID_IN_CONTEXTIDR=n. So I'm still a bit confused.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-10-22 16:23     ` James Clark
@ 2021-10-24 10:25       ` Leo Yan
  0 siblings, 0 replies; 24+ messages in thread
From: Leo Yan @ 2021-10-24 10:25 UTC (permalink / raw)
  To: James Clark
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Kees Cook,
	Ard Biesheuvel, Sami Tolvanen, Nicholas Piggin, James Morse,
	Marc Zyngier, Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, linux-arm-kernel, linux-kernel

Hi James,

On Fri, Oct 22, 2021 at 05:23:23PM +0100, James Clark wrote:
> On 22/10/2021 16:36, James Clark wrote:
> > 
> > 
> > On 21/10/2021 14:45, Leo Yan wrote:
> >> Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
> >> driver invokes these functions to dynamically enable it during
> >> profiling when the program runs in root PID name space, and disable PID
> >> tracing when the perf event is stopped.
> >>
> >> Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
> >> tracing, so this patch uses the consistent condition for setting bit
> >> EL1_CX for PMSCR.
> > 
> > Hi Leo,
> > 
> > I've been testing this change, but I'm seeing something strange. Not sure
> > if it's a problem on my side or not yet. With this command:
> > 
> >  sudo ./perf record -vvv -e arm_spe//u -- taskset --cpu-list 1 bash -c ls
> > 
> > I'm only seeing 0 values for context:
> > 
> >  sudo ./perf report -D | grep CONTEXT
> > 
> > .  00038dce:  65 00 00 00 00                                  CONTEXT 0x0 el2
> > .  00038e0e:  65 00 00 00 00                                  CONTEXT 0x0 el2

Good catch!  I reproduced this issue at my side and looked into the
flow, the root cause is relevant with timing.

When perf launches the program 'taskset --cpu-list 1 bash -c ls', it
forks a new process and 'ls' program is scheduled in, then function
arm_spe_pmu_start() invokes contextidr_enable() to enable the PID
tracing in contextidr.  Since 'ls' program executes very short and it
simply runs to the end (so in the middle of 'ls' there have no any
context switching on the CPU), there have no any new PID is written
into contextidr and CPU's contextidr keeps zero.  This is the reason
we see the context packets contain zeros for PID.

To fix this issue, we should enable PID tracing when setup AUX ring
buffer, at this phase, the profiled program has not been started yet.
So when the profiled program is scheduled in at the first time, PID
traing is getting ready and we can see the expected context packet in
Arm SPE trace data.   So this patch should be updated as below, I will
apply it in next spin if no objection.

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index c21cf1385cc0..85aa2eab0c2e 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -876,6 +867,13 @@ static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
        buf->nr_pages   = nr_pages;
        buf->snapshot   = snapshot;

+       /*
+        * Enable tracing PID to contextidr if profiling program runs in
+        * root PID namespace.
+        */
+       if (perfmon_capable() && (task_active_pid_ns(current) == &init_pid_ns))
+               contextidr_enable();
+
        kfree(pglist);
        return buf;

@@ -890,6 +888,9 @@ static void arm_spe_pmu_free_aux(void *aux)
 {
        struct arm_spe_pmu_buf *buf = aux;

+       if (perfmon_capable() && (task_active_pid_ns(current) == &init_pid_ns))
+               contextidr_disable();
+
        vunmap(buf->base);
        kfree(buf);
 }

Thanks a lot for detailed testing!

Leo

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-10-21 15:49   ` Kees Cook
  2021-10-22  2:09     ` Leo Yan
@ 2021-11-01 15:28     ` Leo Yan
  2021-12-03 16:22       ` Catalin Marinas
  1 sibling, 1 reply; 24+ messages in thread
From: Leo Yan @ 2021-11-01 15:28 UTC (permalink / raw)
  To: Kees Cook
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

Hi Catalin, Will,

On Thu, Oct 21, 2021 at 08:49:46AM -0700, Kees Cook wrote:
> On Thu, Oct 21, 2021 at 09:45:30PM +0800, Leo Yan wrote:
> > Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
> > driver invokes these functions to dynamically enable it during
> > profiling when the program runs in root PID name space, and disable PID
> > tracing when the perf event is stopped.
> > 
> > Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
> > tracing, so this patch uses the consistent condition for setting bit
> > EL1_CX for PMSCR.
> 
> My own preference here would be to not bother with the new
> enable/disable helpers, but just open code it right here. (Save a patch
> and is the only user.) But I defer to the taste of arm64 maintainers. :)

Before I send out a new version for this patch set (for support
dynamic PID tracing on Arm64), I'd like to get your opinions for two
things:

- Firstly, as Kees suggested to directly use variable
  'contextidr_in_use' in drivers, which is exported as GPL symbol,
  it's not necessarily to add two helpers contextidr_{enable|disable}().
  What's your preference for this?

- Secondly, now this patch set only support dynamic PID tracing for
  Arm64; and there would be two customers to use dynamic PID tracing:
  Arm SPE and Coresight ETMv4.x.  So this patch set doesn't support
  dynamic PID tracing for Arm32 (under arch/arm).

  Do you accept this patch set for enabling PID tracing on Arm64 and we
  can defer to support Arm32 when really need PID tracing on Arm32?
  Or we should enable PID dynamic tracing for Arm64 and Arm32 in one
  go?


Thanks,
Leo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-11-01 15:28     ` Leo Yan
@ 2021-12-03 16:22       ` Catalin Marinas
  2021-12-05 13:51         ` Leo Yan
  0 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2021-12-03 16:22 UTC (permalink / raw)
  To: Leo Yan
  Cc: Kees Cook, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

On Mon, Nov 01, 2021 at 11:28:35PM +0800, Leo Yan wrote:
> On Thu, Oct 21, 2021 at 08:49:46AM -0700, Kees Cook wrote:
> > On Thu, Oct 21, 2021 at 09:45:30PM +0800, Leo Yan wrote:
> > > Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
> > > driver invokes these functions to dynamically enable it during
> > > profiling when the program runs in root PID name space, and disable PID
> > > tracing when the perf event is stopped.
> > > 
> > > Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
> > > tracing, so this patch uses the consistent condition for setting bit
> > > EL1_CX for PMSCR.
> > 
> > My own preference here would be to not bother with the new
> > enable/disable helpers, but just open code it right here. (Save a patch
> > and is the only user.) But I defer to the taste of arm64 maintainers. :)
> 
> Before I send out a new version for this patch set (for support
> dynamic PID tracing on Arm64), I'd like to get your opinions for two
> things:
> 
> - Firstly, as Kees suggested to directly use variable
>   'contextidr_in_use' in drivers, which is exported as GPL symbol,
>   it's not necessarily to add two helpers contextidr_{enable|disable}().
>   What's your preference for this?

My preference would be to keep the helpers.

> - Secondly, now this patch set only support dynamic PID tracing for
>   Arm64; and there would be two customers to use dynamic PID tracing:
>   Arm SPE and Coresight ETMv4.x.  So this patch set doesn't support
>   dynamic PID tracing for Arm32 (under arch/arm).
> 
>   Do you accept this patch set for enabling PID tracing on Arm64 and we
>   can defer to support Arm32 when really need PID tracing on Arm32?
>   Or we should enable PID dynamic tracing for Arm64 and Arm32 in one
>   go?

If it doesn't break arm32, it's fine by me.

What's the cost of always enabling CONFIG_PID_IN_CONTEXTIDR? If it's
negligible, I'd not bother at all with any of the enabling/disabling.

Another question: can you run multiple instances of SPE for different
threads on different CPUs? What happens to the global contextidr_in_use
key when one of them stops?

-- 
Catalin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-12-03 16:22       ` Catalin Marinas
@ 2021-12-05 13:51         ` Leo Yan
  2021-12-07 11:48           ` Catalin Marinas
  0 siblings, 1 reply; 24+ messages in thread
From: Leo Yan @ 2021-12-05 13:51 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

Hi Catalin,

On Fri, Dec 03, 2021 at 04:22:42PM +0000, Catalin Marinas wrote:
> On Mon, Nov 01, 2021 at 11:28:35PM +0800, Leo Yan wrote:
> > On Thu, Oct 21, 2021 at 08:49:46AM -0700, Kees Cook wrote:
> > > On Thu, Oct 21, 2021 at 09:45:30PM +0800, Leo Yan wrote:
> > > > Now Arm64 provides API for enabling and disable PID tracing, Arm SPE
> > > > driver invokes these functions to dynamically enable it during
> > > > profiling when the program runs in root PID name space, and disable PID
> > > > tracing when the perf event is stopped.
> > > > 
> > > > Device drivers should not depend on CONFIG_PID_IN_CONTEXTIDR for PID
> > > > tracing, so this patch uses the consistent condition for setting bit
> > > > EL1_CX for PMSCR.
> > > 
> > > My own preference here would be to not bother with the new
> > > enable/disable helpers, but just open code it right here. (Save a patch
> > > and is the only user.) But I defer to the taste of arm64 maintainers. :)
> > 
> > Before I send out a new version for this patch set (for support
> > dynamic PID tracing on Arm64), I'd like to get your opinions for two
> > things:
> > 
> > - Firstly, as Kees suggested to directly use variable
> >   'contextidr_in_use' in drivers, which is exported as GPL symbol,
> >   it's not necessarily to add two helpers contextidr_{enable|disable}().
> >   What's your preference for this?
> 
> My preference would be to keep the helpers.

Okay, will keep the helpers.

> > - Secondly, now this patch set only support dynamic PID tracing for
> >   Arm64; and there would be two customers to use dynamic PID tracing:
> >   Arm SPE and Coresight ETMv4.x.  So this patch set doesn't support
> >   dynamic PID tracing for Arm32 (under arch/arm).
> > 
> >   Do you accept this patch set for enabling PID tracing on Arm64 and we
> >   can defer to support Arm32 when really need PID tracing on Arm32?
> >   Or we should enable PID dynamic tracing for Arm64 and Arm32 in one
> >   go?
> 
> If it doesn't break arm32, it's fine by me.

In next spin, I will introduce a patch for new Arm32 helpers.  Since
now we have no requirement for dynamic PID tracing, the Arm32 helpers
will be nop operations and can be used for kernel compilation.

> What's the cost of always enabling CONFIG_PID_IN_CONTEXTIDR? If it's
> negligible, I'd not bother at all with any of the enabling/disabling.

Yes, I compared performance for PID tracing with always enabling and
disabling CONFIG_PID_IN_CONTEXTIDR, and also compared with using
static key for enabling/disabling PID tracing.  The result shows the
cost is negligible based on the benchmark 'perf bench sched'.

Please see the detailed data in below link (note the testing results
came from my Juno board):
https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/

> Another question: can you run multiple instances of SPE for different
> threads on different CPUs? What happens to the global contextidr_in_use
> key when one of them stops?

No, I only can launch one instance for Arm SPE event via perf tool; when
I tried to launch a second instance, perf tool reports failure:

The sys_perf_event_open() syscall returned with 16 (Device or resource busy) for event (arm_spe_0/load_filter=1,store_filter=1/u).

Alternatively, I'd like give several examples for contextidr_in_use key
values when run different perf modes.

Firstly, I added two kprobe points to monitor contextidr_in_use key
values with below commands, the first probe point is to monitor static
key's increment, and second probe point is to monitor key's
descrement:

  # perf probe --add 'arm_spe_pmu_setup_aux:45 contextidr_in_use=contextidr_in_use.key.enabled.counter:u32'
  # perf probe --add 'arm_spe_pmu_free_aux:7 contextidr_in_use=contextidr_in_use.key.enabled.counter:u32'

Case 1: run perf tool with 'per-thread' mode:

  # perf record -e arm_spe_0/load_filter=1,store_filter=1/u --per-thread -- uname

Trace log shows contextidr_in_use is increment to 1 when setup AUX
buffer and it descrement to 0 when free AUX buffer (before close SPE
event).

  perf-2393    [077] d..1.   427.161612: arm_spe_pmu_setup_aux_L45: (arm_spe_pmu_setup_aux+0x130/0x200) contextidr_in_use=1
  perf-2393    [077] d..1.   427.477993: arm_spe_pmu_free_aux_L7: (arm_spe_pmu_free_aux+0x44/0xa0) contextidr_in_use=0

Case 2: perf tool runs with system-wide mode:

  # perf record -e arm_spe_0/load_filter=1,store_filter=1/u -a -- uname

The system has 128 cores, so every CPU has its own AUX buffer, thus
the static key will increase from 0 to 128; reversely, the static key
will decrease from 128 to 0 when perf tool exists:

  perf-2395    [077] d..1.   435.647270: arm_spe_pmu_setup_aux_L45: (arm_spe_pmu_setup_aux+0x130/0x200) contextidr_in_use=1
  perf-2395    [077] d..1.   435.647912: arm_spe_pmu_setup_aux_L45: (arm_spe_pmu_setup_aux+0x130/0x200) contextidr_in_use=2
  ...
  perf-2395    [077] d..1.   435.709717: arm_spe_pmu_setup_aux_L45: (arm_spe_pmu_setup_aux+0x130/0x200) contextidr_in_use=128
  perf-2395    [127] d..1.   436.734142: arm_spe_pmu_free_aux_L7: (arm_spe_pmu_free_aux+0x44/0xa0) contextidr_in_use=127
  perf-2395    [127] d..1.   436.734438: arm_spe_pmu_free_aux_L7: (arm_spe_pmu_free_aux+0x44/0xa0) contextidr_in_use=126
  perf-2395    [127] d..1.   436.734682: arm_spe_pmu_free_aux_L7: (arm_spe_pmu_free_aux+0x44/0xa0) contextidr_in_use=125
  ...
  perf-2395    [127] d..1.   436.763856: arm_spe_pmu_free_aux_L7: (arm_spe_pmu_free_aux+0x44/0xa0) contextidr_in_use=0

Case 3: perf tool runs for CPU-wise mode:

  # perf record -e arm_spe_0/load_filter=1,store_filter=1/u -C 1,3,4  -- uname

The option '-C 1,3,4' specifies to only enable Arm SPE tracing on
three CPUs (CPU1/3/4), so the contextidr_in_use key will increment for
3 times when open perf event, and decrement the static key for 3 times
when close SPE event:

  perf-2404    [077] d..1.   450.564087: arm_spe_pmu_setup_aux_L45: (arm_spe_pmu_setup_aux+0x130/0x200) contextidr_in_use=1
  perf-2404    [077] d..1.   450.564744: arm_spe_pmu_setup_aux_L45: (arm_spe_pmu_setup_aux+0x130/0x200) contextidr_in_use=2
  perf-2404    [077] d..1.   450.565369: arm_spe_pmu_setup_aux_L45: (arm_spe_pmu_setup_aux+0x130/0x200) contextidr_in_use=3
  perf-2404    [004] d..1.   451.567532: arm_spe_pmu_free_aux_L7: (arm_spe_pmu_free_aux+0x44/0xa0) contextidr_in_use=2
  perf-2404    [004] d..1.   451.567823: arm_spe_pmu_free_aux_L7: (arm_spe_pmu_free_aux+0x44/0xa0) contextidr_in_use=1
  perf-2404    [004] d..1.   451.568120: arm_spe_pmu_free_aux_L7: (arm_spe_pmu_free_aux+0x44/0xa0) contextidr_in_use=0

Hope these three cases can demonstrate the usage for contextidr_in_use
static key.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-12-05 13:51         ` Leo Yan
@ 2021-12-07 11:48           ` Catalin Marinas
  2021-12-07 12:31             ` Leo Yan
  0 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2021-12-07 11:48 UTC (permalink / raw)
  To: Leo Yan
  Cc: Kees Cook, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

On Sun, Dec 05, 2021 at 09:51:03PM +0800, Leo Yan wrote:
> On Fri, Dec 03, 2021 at 04:22:42PM +0000, Catalin Marinas wrote:
> > What's the cost of always enabling CONFIG_PID_IN_CONTEXTIDR? If it's
> > negligible, I'd not bother at all with any of the enabling/disabling.
> 
> Yes, I compared performance for PID tracing with always enabling and
> disabling CONFIG_PID_IN_CONTEXTIDR, and also compared with using
> static key for enabling/disabling PID tracing.  The result shows the
> cost is negligible based on the benchmark 'perf bench sched'.
> 
> Please see the detailed data in below link (note the testing results
> came from my Juno board):
> https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/

The table wasn't entirely clear to me. So the dis/enb benchmarks are
without this patchset applied. There seems to be a minor drop but it's
probably noise. Anyway, do we need this patchset or we just make
CONFIG_PID_IN_CONTEXTIDR default to y?

> > Another question: can you run multiple instances of SPE for different
> > threads on different CPUs? What happens to the global contextidr_in_use
> > key when one of them stops?
> 
> No, I only can launch one instance for Arm SPE event via perf tool; when
> I tried to launch a second instance, perf tool reports failure:
> 
> The sys_perf_event_open() syscall returned with 16 (Device or resource
> busy) for event (arm_spe_0/load_filter=1,store_filter=1/u).
[...]
> Alternatively, I'd like give several examples for contextidr_in_use key
> values when run different perf modes.
[...]
> Hope these three cases can demonstrate the usage for contextidr_in_use
> static key.

OK, so we can have multiple uses of PID in CONTEXTIDR. Since
static_branch_inc() is refcounted, we get away with this but the
downside is that a CPU won't notice until its next thread switch.

-- 
Catalin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-12-07 11:48           ` Catalin Marinas
@ 2021-12-07 12:31             ` Leo Yan
  2021-12-08 17:29               ` Catalin Marinas
  0 siblings, 1 reply; 24+ messages in thread
From: Leo Yan @ 2021-12-07 12:31 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

On Tue, Dec 07, 2021 at 11:48:00AM +0000, Catalin Marinas wrote:
> On Sun, Dec 05, 2021 at 09:51:03PM +0800, Leo Yan wrote:
> > On Fri, Dec 03, 2021 at 04:22:42PM +0000, Catalin Marinas wrote:
> > > What's the cost of always enabling CONFIG_PID_IN_CONTEXTIDR? If it's
> > > negligible, I'd not bother at all with any of the enabling/disabling.
> > 
> > Yes, I compared performance for PID tracing with always enabling and
> > disabling CONFIG_PID_IN_CONTEXTIDR, and also compared with using
> > static key for enabling/disabling PID tracing.  The result shows the
> > cost is negligible based on the benchmark 'perf bench sched'.
> > 
> > Please see the detailed data in below link (note the testing results
> > came from my Juno board):
> > https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/
> 
> The table wasn't entirely clear to me. So the dis/enb benchmarks are
> without this patchset applied.

Yes, dis/enb metrics don't apply this patchset.

> There seems to be a minor drop but it's
> probably noise. Anyway, do we need this patchset or we just make
> CONFIG_PID_IN_CONTEXTIDR default to y?

Good point.  I remembered before we had discussed for making
CONFIG_PID_IN_CONTEXTIDR to 'y', but this approach is not always valid,
especially when the profiling process runs in non-root PID namespace,
in this case, hardware tracing data (e.g. Arm SPE or CoreSight) cannot
trust the PID values from tracing since the PID conflicts between
different PID namespaces.

So this patchset is to add the fundamental mechanism for dynamically
enabling and disable PID tracing into CONTEXTIDR.  Based on it, we can
use helpers to dynamically enable PID tracing _only_ when process runs
in root PID namespace.

> > > Another question: can you run multiple instances of SPE for different
> > > threads on different CPUs? What happens to the global contextidr_in_use
> > > key when one of them stops?
> > 
> > No, I only can launch one instance for Arm SPE event via perf tool; when
> > I tried to launch a second instance, perf tool reports failure:
> > 
> > The sys_perf_event_open() syscall returned with 16 (Device or resource
> > busy) for event (arm_spe_0/load_filter=1,store_filter=1/u).
> [...]
> > Alternatively, I'd like give several examples for contextidr_in_use key
> > values when run different perf modes.
> [...]
> > Hope these three cases can demonstrate the usage for contextidr_in_use
> > static key.
> 
> OK, so we can have multiple uses of PID in CONTEXTIDR. Since
> static_branch_inc() is refcounted, we get away with this but the
> downside is that a CPU won't notice until its next thread switch.

Yes, it's true that after static key is decreased to 0, any CPUs in
the system will notice for the '0' value until the next context
switch.  But I think this will not introduce issue, please see the
flow:

  Step 1: Perf tool: enable event (Arm SPE or CoreSight event)
  Step 2: Perf tool: enable PID tracing when setup AUX buf
  Step 3: Profiling, run specific program or for all CPUs;
          it can have many context switch and all relevant PIDs
          are stored into CONTEXTIDR.
  Step 4: Perf tool: disable PID tracing when free AUX buf
  Step 5: Perf tool: disable event

The latency you mentioned is introduced between step 4 and step 5,
but it doesn't impact any PID tracing for step 3.

And when the CPU will switch context, it will detect the static key is
decreased to zero so it can skip PID tracing.  Thus it also will not
introduce any redundant PID tracing.

Do I miss any potential issue?

Thanks,
Leo
> 
> -- 
> Catalin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-12-07 12:31             ` Leo Yan
@ 2021-12-08 17:29               ` Catalin Marinas
  2021-12-10  7:59                 ` Leo Yan
  0 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2021-12-08 17:29 UTC (permalink / raw)
  To: Leo Yan
  Cc: Kees Cook, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

On Tue, Dec 07, 2021 at 08:31:18PM +0800, Leo Yan wrote:
> On Tue, Dec 07, 2021 at 11:48:00AM +0000, Catalin Marinas wrote:
> > On Sun, Dec 05, 2021 at 09:51:03PM +0800, Leo Yan wrote:
> > > On Fri, Dec 03, 2021 at 04:22:42PM +0000, Catalin Marinas wrote:
> > > > What's the cost of always enabling CONFIG_PID_IN_CONTEXTIDR? If it's
> > > > negligible, I'd not bother at all with any of the enabling/disabling.
> > > 
> > > Yes, I compared performance for PID tracing with always enabling and
> > > disabling CONFIG_PID_IN_CONTEXTIDR, and also compared with using
> > > static key for enabling/disabling PID tracing.  The result shows the
> > > cost is negligible based on the benchmark 'perf bench sched'.
> > > 
> > > Please see the detailed data in below link (note the testing results
> > > came from my Juno board):
> > > https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/
> > 
> > The table wasn't entirely clear to me. So the dis/enb benchmarks are
> > without this patchset applied.
> 
> Yes, dis/enb metrics don't apply this patchset.
> 
> > There seems to be a minor drop but it's
> > probably noise. Anyway, do we need this patchset or we just make
> > CONFIG_PID_IN_CONTEXTIDR default to y?
> 
> Good point.  I remembered before we had discussed for making
> CONFIG_PID_IN_CONTEXTIDR to 'y', but this approach is not always valid,
> especially when the profiling process runs in non-root PID namespace,
> in this case, hardware tracing data (e.g. Arm SPE or CoreSight) cannot
> trust the PID values from tracing since the PID conflicts between
> different PID namespaces.
> 
> So this patchset is to add the fundamental mechanism for dynamically
> enabling and disable PID tracing into CONTEXTIDR.  Based on it, we can
> use helpers to dynamically enable PID tracing _only_ when process runs
> in root PID namespace.

I don't think your approach fully works. Let's say you are tracing two
processes, one in the root PID namespace, the other not. Since the
former enables PID in CONTEXTIDR, you automatically get some PID in
CONTEXTIDR for the latter whether you requested it explicitly or not.

I wonder whether it makes more sense to turn this on per thread. You set
some TIF flag and set the PID in contextidr_thread_switch() only if the
flag is set. You could also check there if the PID is in the root
namespace and avoid setting CONTEXTIDR (or write 0).

-- 
Catalin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-12-08 17:29               ` Catalin Marinas
@ 2021-12-10  7:59                 ` Leo Yan
  2021-12-17  7:58                   ` Leo Yan
  2022-01-17 18:48                   ` Catalin Marinas
  0 siblings, 2 replies; 24+ messages in thread
From: Leo Yan @ 2021-12-10  7:59 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

On Wed, Dec 08, 2021 at 05:29:41PM +0000, Catalin Marinas wrote:
> On Tue, Dec 07, 2021 at 08:31:18PM +0800, Leo Yan wrote:
> > On Tue, Dec 07, 2021 at 11:48:00AM +0000, Catalin Marinas wrote:
> > > On Sun, Dec 05, 2021 at 09:51:03PM +0800, Leo Yan wrote:
> > > > On Fri, Dec 03, 2021 at 04:22:42PM +0000, Catalin Marinas wrote:
> > > > > What's the cost of always enabling CONFIG_PID_IN_CONTEXTIDR? If it's
> > > > > negligible, I'd not bother at all with any of the enabling/disabling.
> > > > 
> > > > Yes, I compared performance for PID tracing with always enabling and
> > > > disabling CONFIG_PID_IN_CONTEXTIDR, and also compared with using
> > > > static key for enabling/disabling PID tracing.  The result shows the
> > > > cost is negligible based on the benchmark 'perf bench sched'.
> > > > 
> > > > Please see the detailed data in below link (note the testing results
> > > > came from my Juno board):
> > > > https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/
> > > 
> > > The table wasn't entirely clear to me. So the dis/enb benchmarks are
> > > without this patchset applied.
> > 
> > Yes, dis/enb metrics don't apply this patchset.
> > 
> > > There seems to be a minor drop but it's
> > > probably noise. Anyway, do we need this patchset or we just make
> > > CONFIG_PID_IN_CONTEXTIDR default to y?
> > 
> > Good point.  I remembered before we had discussed for making
> > CONFIG_PID_IN_CONTEXTIDR to 'y', but this approach is not always valid,
> > especially when the profiling process runs in non-root PID namespace,
> > in this case, hardware tracing data (e.g. Arm SPE or CoreSight) cannot
> > trust the PID values from tracing since the PID conflicts between
> > different PID namespaces.
> > 
> > So this patchset is to add the fundamental mechanism for dynamically
> > enabling and disable PID tracing into CONTEXTIDR.  Based on it, we can
> > use helpers to dynamically enable PID tracing _only_ when process runs
> > in root PID namespace.
> 
> I don't think your approach fully works. Let's say you are tracing two
> processes, one in the root PID namespace, the other not. Since the
> former enables PID in CONTEXTIDR, you automatically get some PID in
> CONTEXTIDR for the latter whether you requested it explicitly or not.

The key point is kernel always sets a PID number from the root PID
namespace into the system register CONTEXTIDR.

Let's see a case:

  # unshare --fork --pid perf record -e cs_etm// -m 64K,64K -a -o perf_test.data -- ./multi_threads_test

In this case, with command "unshare --fork --pid", perf tool and the
profiled program multi_threads_test run in the created PID namespace.
We can see the dumped PID values:

   <idle>-0       [000] d..2.   331.751681: __switch_to: contextidr_thread_switch: task perf-exec pid 840 ns_pid 2
   <idle>-0       [002] d..2.   331.755930: __switch_to: contextidr_thread_switch: task multi_threads_t pid 842 ns_pid 4
   <idle>-0       [003] d..2.   331.755993: __switch_to: contextidr_thread_switch: task multi_threads_t pid 843 ns_pid 5
  sugov:0-129     [005] d..2.   332.323469: __switch_to: contextidr_thread_switch: task perf pid 841 ns_pid 3

We can see processes have two different PIDs values, say task
'perf-exec', 840 is its PID number from the root PID namespace, and 2
is its PID from the active namespace the task is belonging to.

But the register CONTEXTIDR is _only_ used to trace PIDs from the root
PID namespace and ignores PID values from any other PID namespace.

Come back to your question, if there have two tracing sessions, one
session runs in the root PID namespace, and another runs in the
non-root PID namespace.  Since we can gather the PIDs in the root namespace,
the session in the root PID namespace can work perfectly, and we can
ensure the PID infos matching well between kernel's PID tracing and user
space tool (note: perf needs gather process info with process' pid/tid for
post parsing).

For the later session in non-root PID name space, we doesn't capture
any PID tracing data.  This results in the profiled result doesn't
contain any PID info, although it's painful that the PID info is
incomplete but we don't deliver any wrong info for users studying
profiling result and PID is always '-1'.

I think the purpose for this patchset is to allow us to dynamically
enable PID tracing when the tracing session runs in root namespace.

Furthermore, please see the patch [1], you could see we also need an
extra checking in SPE driver for setting SPE's control register to
disable PID packets so can avoid PID polutions if the session runs in
non-root PID namespace:

  // Only enable PID packets when run in root namespace
  if (perfmon_capable() && (task_active_pid_ns(current) == &init_pid_ns))
 		reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);

[1] https://lore.kernel.org/lkml/20211021134530.206216-5-leo.yan@linaro.org/

> I wonder whether it makes more sense to turn this on per thread. You set
> some TIF flag and set the PID in contextidr_thread_switch() only if the
> flag is set.

Good point.  IIRC, before Suzuki brought up this suggestion as well.

I considered it and these two days I looked into the perf code, it's feasible
to apply the TIF flag for perf's 'per-thread' mode or process mode.  In these
modes, we can set the thread flags when profiled task/thread is forked.
But the difficult thing is how we can handle for CPU wide profiling.

In CPU wide mode, the perf tool opens the tracing events on several
or all CPUs, the running tasks running on the CPU are not forked
from perf session, in the below diagram, there have three tasks (T1/2/3),
we can see tasks T2/T3 even were created before perf tool is
lanched; in this case, it's hard for perf tool to set TIF flag for all
tasks in the system, and it's possible to see the tasks migrate from
one CPU to another CPU.

            Perf enables event                  Perf disables event
               on the CPU                          on the CPU
                  |                                   |
         ##T2##   V    ###T1###  ##T3##  ####T2#####  V
  CPU0  ------------------------------------------------->
                  |                                   |
            #T3#  V    #######T2######                V
  CPU1  ------------------------------------------------->

                                        ##T1##
  CPU2  ------------------------------------------------->
              ^
              ` The event is not enabled on CPU2 so CPU2 doesn't
                capture any trace data.

> You could also check there if the PID is in the root
> namespace and avoid setting CONTEXTIDR (or write 0).

This could introduce mess.  Writing 0 can lead the decoder to take it
as idle thread; if skip setting CONTEXTIDR, the tracer might use a
stale stale PID number (the previous one ID number).

Alternatively, if you accept to always set PID to CONTEXTIDR in
contextidr_thread_switch(), it would be fine for me and we can only
need to control PID packets in SPE and CoreSight drivers.

Please let me know your opinion, thanks!

Leo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-12-10  7:59                 ` Leo Yan
@ 2021-12-17  7:58                   ` Leo Yan
  2022-01-17 18:48                   ` Catalin Marinas
  1 sibling, 0 replies; 24+ messages in thread
From: Leo Yan @ 2021-12-17  7:58 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

On Fri, Dec 10, 2021 at 03:59:18PM +0800, Leo Yan wrote:

[...]

> > You could also check there if the PID is in the root
> > namespace and avoid setting CONTEXTIDR (or write 0).
> 
> This could introduce mess.  Writing 0 can lead the decoder to take it
> as idle thread; if skip setting CONTEXTIDR, the tracer might use a
> stale stale PID number (the previous one ID number).
> 
> Alternatively, if you accept to always set PID to CONTEXTIDR in
> contextidr_thread_switch(), it would be fine for me and we can only
> need to control PID packets in SPE and CoreSight drivers.
> 
> Please let me know your opinion, thanks!

Gentle ping, Catalin.  If anything is blur and you want me to clarify,
please let me know.  Sorry if you are in holiday and if so we can
delay after holiday.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2021-12-10  7:59                 ` Leo Yan
  2021-12-17  7:58                   ` Leo Yan
@ 2022-01-17 18:48                   ` Catalin Marinas
  2022-02-01 13:02                     ` Leo Yan
  1 sibling, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2022-01-17 18:48 UTC (permalink / raw)
  To: Leo Yan
  Cc: Kees Cook, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

Hi Leo,

(sorry for the delay, holidays, merging window)

On Fri, Dec 10, 2021 at 03:59:18PM +0800, Leo Yan wrote:
> On Wed, Dec 08, 2021 at 05:29:41PM +0000, Catalin Marinas wrote:
> > On Tue, Dec 07, 2021 at 08:31:18PM +0800, Leo Yan wrote:
> > > On Tue, Dec 07, 2021 at 11:48:00AM +0000, Catalin Marinas wrote:
> > > > On Sun, Dec 05, 2021 at 09:51:03PM +0800, Leo Yan wrote:
> > > > > On Fri, Dec 03, 2021 at 04:22:42PM +0000, Catalin Marinas wrote:
> > > > > > What's the cost of always enabling CONFIG_PID_IN_CONTEXTIDR? If it's
> > > > > > negligible, I'd not bother at all with any of the enabling/disabling.
> > > > > 
> > > > > Yes, I compared performance for PID tracing with always enabling and
> > > > > disabling CONFIG_PID_IN_CONTEXTIDR, and also compared with using
> > > > > static key for enabling/disabling PID tracing.  The result shows the
> > > > > cost is negligible based on the benchmark 'perf bench sched'.
> > > > > 
> > > > > Please see the detailed data in below link (note the testing results
> > > > > came from my Juno board):
> > > > > https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@linaro.org/
> > > > 
> > > > The table wasn't entirely clear to me. So the dis/enb benchmarks are
> > > > without this patchset applied.
> > > 
> > > Yes, dis/enb metrics don't apply this patchset.
> > > 
> > > > There seems to be a minor drop but it's
> > > > probably noise. Anyway, do we need this patchset or we just make
> > > > CONFIG_PID_IN_CONTEXTIDR default to y?
> > > 
> > > Good point.  I remembered before we had discussed for making
> > > CONFIG_PID_IN_CONTEXTIDR to 'y', but this approach is not always valid,
> > > especially when the profiling process runs in non-root PID namespace,
> > > in this case, hardware tracing data (e.g. Arm SPE or CoreSight) cannot
> > > trust the PID values from tracing since the PID conflicts between
> > > different PID namespaces.
> > > 
> > > So this patchset is to add the fundamental mechanism for dynamically
> > > enabling and disable PID tracing into CONTEXTIDR.  Based on it, we can
> > > use helpers to dynamically enable PID tracing _only_ when process runs
> > > in root PID namespace.
> > 
> > I don't think your approach fully works. Let's say you are tracing two
> > processes, one in the root PID namespace, the other not. Since the
> > former enables PID in CONTEXTIDR, you automatically get some PID in
> > CONTEXTIDR for the latter whether you requested it explicitly or not.
> 
> The key point is kernel always sets a PID number from the root PID
> namespace into the system register CONTEXTIDR.

So earlier you mentioned that CONFIG_PID_IN_CONTEXTIDR=y is not always
right since a profiled task may run in a non-root PID namespace. But
this series introduces a single knob to turn on PID in CONTEXTIDR for
all CPUs, irrespective of whether they run a task in the root PID
namespace or not. The CPU running a task in the root ns may call
contextidr_enable() but the other CPUs may run tasks in non-root ns.

IIUC, with this patch, to avoid the root ns PID for a non-root ns task,
the SPE driver only sets the SYS_PMSCR_EL1_CX_SHIFT bit for the root ns
tasks. So, in this case, for a non-root ns task, does it matter that
CONTEXTIDR always contain the root ns PID? If it matters, see above why
a single knob is already doing this.

> Let's see a case:
> 
>   # unshare --fork --pid perf record -e cs_etm// -m 64K,64K -a -o perf_test.data -- ./multi_threads_test
> 
> In this case, with command "unshare --fork --pid", perf tool and the
> profiled program multi_threads_test run in the created PID namespace.
> We can see the dumped PID values:
> 
>    <idle>-0       [000] d..2.   331.751681: __switch_to: contextidr_thread_switch: task perf-exec pid 840 ns_pid 2
>    <idle>-0       [002] d..2.   331.755930: __switch_to: contextidr_thread_switch: task multi_threads_t pid 842 ns_pid 4
>    <idle>-0       [003] d..2.   331.755993: __switch_to: contextidr_thread_switch: task multi_threads_t pid 843 ns_pid 5
>   sugov:0-129     [005] d..2.   332.323469: __switch_to: contextidr_thread_switch: task perf pid 841 ns_pid 3
> 
> We can see processes have two different PIDs values, say task
> 'perf-exec', 840 is its PID number from the root PID namespace, and 2
> is its PID from the active namespace the task is belonging to.
> 
> But the register CONTEXTIDR is _only_ used to trace PIDs from the root
> PID namespace and ignores PID values from any other PID namespace.

Would it help if we used task_pid_nr_ns() instead for setting
CONTEXTIDR?

> Come back to your question, if there have two tracing sessions, one
> session runs in the root PID namespace, and another runs in the
> non-root PID namespace.  Since we can gather the PIDs in the root namespace,
> the session in the root PID namespace can work perfectly, and we can
> ensure the PID infos matching well between kernel's PID tracing and user
> space tool (note: perf needs gather process info with process' pid/tid for
> post parsing).
> 
> For the later session in non-root PID name space, we doesn't capture
> any PID tracing data.  This results in the profiled result doesn't
> contain any PID info, although it's painful that the PID info is
> incomplete but we don't deliver any wrong info for users studying
> profiling result and PID is always '-1'.

So what's actually disabling the capture of PID tracing data? Is it the
PMSCR_EL1.CX bit being 0?

> I think the purpose for this patchset is to allow us to dynamically
> enable PID tracing when the tracing session runs in root namespace.

But it doesn't do this with a single global knob for all CPUs. You only
need the SPE patch together with the CONTEXTIDR config set to y.

> Alternatively, if you accept to always set PID to CONTEXTIDR in
> contextidr_thread_switch(), it would be fine for me and we can only
> need to control PID packets in SPE and CoreSight drivers.

Well, we have the config option and infrastructure already, it's up to
Linux distros to turn it on. It doesn't necessarily need to be in
defconfig.

Now, if we removed the CONTEXTIDR setting altogether, is there a way for
perf tools to coordinate the traces with the scheduling events? This
would be a simpler approach from the kernel perspective (i.e. no work).
Alternatively, could we move the CONTEXTIDR setting to a perf driver.
I'm not too familiar with perf but I guess we could add some scheduling
event hooks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr
  2022-01-17 18:48                   ` Catalin Marinas
@ 2022-02-01 13:02                     ` Leo Yan
  0 siblings, 0 replies; 24+ messages in thread
From: Leo Yan @ 2022-02-01 13:02 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Will Deacon, Mark Rutland, Ard Biesheuvel,
	Sami Tolvanen, Nicholas Piggin, James Morse, Marc Zyngier,
	Joey Gouly, Peter Collingbourne, Vincenzo Frascino,
	Peter Zijlstra (Intel),
	Stephane Eranian, James Clark, linux-arm-kernel, linux-kernel

Hi Catalin,

On Mon, Jan 17, 2022 at 06:48:10PM +0000, Catalin Marinas wrote:
> Hi Leo,
> 
> (sorry for the delay, holidays, merging window)

I am sorry too for the delay.  I read your email but have no sufficient
time to reply it in the past two weeks.

[...]

> > > I don't think your approach fully works. Let's say you are tracing two
> > > processes, one in the root PID namespace, the other not. Since the
> > > former enables PID in CONTEXTIDR, you automatically get some PID in
> > > CONTEXTIDR for the latter whether you requested it explicitly or not.
> > 
> > The key point is kernel always sets a PID number from the root PID
> > namespace into the system register CONTEXTIDR.
> 
> So earlier you mentioned that CONFIG_PID_IN_CONTEXTIDR=y is not always
> right since a profiled task may run in a non-root PID namespace. But
> this series introduces a single knob to turn on PID in CONTEXTIDR for
> all CPUs, irrespective of whether they run a task in the root PID
> namespace or not. The CPU running a task in the root ns may call
> contextidr_enable() but the other CPUs may run tasks in non-root ns.

Sorry I introduced confusion.  Let me to clarify:

CONTEXTIDR should be always used to trace root ns PIDs, this can give us
a consistent semantics.  Especially, when we record trace data with
system wide mode (e.g. use perf command with option "-a" for tracing all
CPUs), CONTEXTIDR always contains root ns PIDs for all tasks.

On the flip side when perf and profiled program run in non-root PID
namespace, it must not rely on CONTEXTIDR anymore for PID tracing.  In
some implementations (E.g. Arm SPE, x86 Intel-PT), perf tool can use
the software trace events to retrieve PID numbers, and heavily based on
the timestamp correlation between software trace events and hardware
trace events.

> IIUC, with this patch, to avoid the root ns PID for a non-root ns task,
> the SPE driver only sets the SYS_PMSCR_EL1_CX_SHIFT bit for the root ns
> tasks. So, in this case, for a non-root ns task, does it matter that
> CONTEXTIDR always contain the root ns PID? If it matters, see above why
> a single knob is already doing this.

You are right, even CONTEXTIDR always contains the root ns PID, we
still can control the bit SYS_PMSCR_EL1_CX_SHIFT in SPE driver:
when the profiled program runs in non-root ns, Arm SPE driver clears
SYS_PMSCR_EL1_CX_SHIFT bit; and if the profiled program runs in root
ns, ARM SPE driver sets SYS_PMSCR_EL1_CX_SHIFT bit alternatively.

  Diagram 1: Perf/profiled program run in root namespace
  +------------+    +---------+    +-----------------+
  | CONTEXTIDR | -> | Arm SPE | -> | Perf Trace Data |
  +------------+    +---------+    +-----------------+
                         ^                 ^
                         |                 ` Contains context packet
                         ` Set SYS_PMSCR_EL1_CX_SHIFT bit

  Diagram 2: Perf/profiled program run in non-root namespace
  +------------+    +---------+    +-----------------+
  | CONTEXTIDR | -> | Arm SPE | -> | Perf Trace Data |
  +------------+    +---------+    +-----------------+
                         ^                 ^
                         |                 ` Doesn't contain any context packet
                         ` Clear SYS_PMSCR_EL1_CX_SHIFT bit

As shown in diagram 2, when Arm SPE driver detects if the session runs
in non-root PID namespace, it clears SYS_PMSCR_EL1_CX_SHIFT bit.  As a
result, the SPE tracing data in perf file doesn't contain any context
packets, thus perf tool will rollback to use software events to retrieve
PID numbers and assign them to samples.

> > Let's see a case:
> > 
> >   # unshare --fork --pid perf record -e cs_etm// -m 64K,64K -a -o perf_test.data -- ./multi_threads_test
> > 
> > In this case, with command "unshare --fork --pid", perf tool and the
> > profiled program multi_threads_test run in the created PID namespace.
> > We can see the dumped PID values:
> > 
> >    <idle>-0       [000] d..2.   331.751681: __switch_to: contextidr_thread_switch: task perf-exec pid 840 ns_pid 2
> >    <idle>-0       [002] d..2.   331.755930: __switch_to: contextidr_thread_switch: task multi_threads_t pid 842 ns_pid 4
> >    <idle>-0       [003] d..2.   331.755993: __switch_to: contextidr_thread_switch: task multi_threads_t pid 843 ns_pid 5
> >   sugov:0-129     [005] d..2.   332.323469: __switch_to: contextidr_thread_switch: task perf pid 841 ns_pid 3
> > 
> > We can see processes have two different PIDs values, say task
> > 'perf-exec', 840 is its PID number from the root PID namespace, and 2
> > is its PID from the active namespace the task is belonging to.
> > 
> > But the register CONTEXTIDR is _only_ used to trace PIDs from the root
> > PID namespace and ignores PID values from any other PID namespace.
> 
> Would it help if we used task_pid_nr_ns() instead for setting
> CONTEXTIDR?

No, I don't think using task_pid_nr_ns() is a good idea for setting
CONTEXTIDR; this will introduce mess for the system wide mode in the
perf docoding phase.

E.g. for below perf command:

  # perf record -e arm_spe_0// -a -- sleep 1

In this case, perf tool gathers all processes info from the root
PID namespace via proc FS in the recording phase.

If we store task_pid_nr_ns() into CONTEXTIDR, it will lead to confusion
for perf tool, since the PID numbers come from different namespaces,
during the decoding phase perf tool will wrongly match PIDs (from
different PID namespace) with the recorded PIDs from root namespace.

> > Come back to your question, if there have two tracing sessions, one
> > session runs in the root PID namespace, and another runs in the
> > non-root PID namespace.  Since we can gather the PIDs in the root namespace,
> > the session in the root PID namespace can work perfectly, and we can
> > ensure the PID infos matching well between kernel's PID tracing and user
> > space tool (note: perf needs gather process info with process' pid/tid for
> > post parsing).
> > 
> > For the later session in non-root PID name space, we doesn't capture
> > any PID tracing data.  This results in the profiled result doesn't
> > contain any PID info, although it's painful that the PID info is
> > incomplete but we don't deliver any wrong info for users studying
> > profiling result and PID is always '-1'.
> 
> So what's actually disabling the capture of PID tracing data? Is it the
> PMSCR_EL1.CX bit being 0?

Yes.  I agree it does make sense to clear PMSCR_EL1.CX bit for session
in non-root ns.

Sorry before I didn't make clear for this.

> > I think the purpose for this patchset is to allow us to dynamically
> > enable PID tracing when the tracing session runs in root namespace.
> 
> But it doesn't do this with a single global knob for all CPUs. You only
> need the SPE patch together with the CONTEXTIDR config set to y.

Yeah.

> > Alternatively, if you accept to always set PID to CONTEXTIDR in
> > contextidr_thread_switch(), it would be fine for me and we can only
> > need to control PID packets in SPE and CoreSight drivers.
> 
> Well, we have the config option and infrastructure already, it's up to
> Linux distros to turn it on. It doesn't necessarily need to be in
> defconfig.
> 
> Now, if we removed the CONTEXTIDR setting altogether, is there a way for
> perf tools to coordinate the traces with the scheduling events? This
> would be a simpler approach from the kernel perspective (i.e. no work).

Indeed, we can correlate between software scheduling events and
hardware tracing events, which can be sorted with timestamps.  Arm SPE
has supported this mode, please refer to the patch [1] for understanding
the implementation in perf tool.

But using the software scheduling events might introduce inaccuracy
for the statistics, the reason is there have gap between scheduling
events and task context switching, so this is why we still want to
support context packets (by using CONTEXTIDR) for root PID namespace.
this can give us more accurate results.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/tools/perf/util/arm-spe.c?id=9dc9855f18ba25d2bc536ea5ba6682855e385d66

> Alternatively, could we move the CONTEXTIDR setting to a perf driver.
> I'm not too familiar with perf but I guess we could add some scheduling
> event hooks.

It would introduce trouble if we move the CONTEXTIDR setting to a perf
driver, I can think out two reasons:

- The first reason is we might have not only one perf driver to use
  CONTEXTIDR, e.g. Arm SPE and Arm CoreSight both need to use
  CONTEXTIDR for PID tracing.  This means we either need to set
  CONTEXTIDR in both drivers so we will introduce redundant codes.
- Furthermore, for the perf system wide mode, perf only enables
  hardware event when start the session and disables hardware event
  when stop the session.  In this mode, perf driver is not notified
  by any scheduling event during the session.

Seems to me, so far we need to come back to apply below change:

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index e1a0c44bc686..a678562253dc 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -127,6 +127,7 @@ config XGENE_PMU
 config ARM_SPE_PMU
        tristate "Enable support for the ARMv8.2 Statistical Profiling Extension"
        depends on ARM64
+       select PID_IN_CONTEXTIDR
        help
          Enable perf support for the ARMv8.2 Statistical Profiling
          Extension, which provides periodic sampling of operations in

Thanks,
Leo

^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-02-01 13:02 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-21 13:45 [RFCv1 0/4] arm64: Use static key for PID in CONTEXTIDR Leo Yan
2021-10-21 13:45 ` [RFCv1 1/4] arm64: Use static key for tracing " Leo Yan
2021-10-21 14:33   ` James Clark
2021-10-21 14:37     ` Leo Yan
2021-10-21 15:47     ` Kees Cook
2021-10-21 13:45 ` [RFCv1 2/4] arm64: entry: Always apply workaround for contextidr_el1 Leo Yan
2021-10-21 13:45 ` [RFCv1 3/4] arm64: Introduce functions for controlling PID tracing Leo Yan
2021-10-21 13:45 ` [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr Leo Yan
2021-10-21 15:49   ` Kees Cook
2021-10-22  2:09     ` Leo Yan
2021-11-01 15:28     ` Leo Yan
2021-12-03 16:22       ` Catalin Marinas
2021-12-05 13:51         ` Leo Yan
2021-12-07 11:48           ` Catalin Marinas
2021-12-07 12:31             ` Leo Yan
2021-12-08 17:29               ` Catalin Marinas
2021-12-10  7:59                 ` Leo Yan
2021-12-17  7:58                   ` Leo Yan
2022-01-17 18:48                   ` Catalin Marinas
2022-02-01 13:02                     ` Leo Yan
2021-10-22 15:36   ` James Clark
2021-10-22 15:40     ` James Clark
2021-10-22 16:23     ` James Clark
2021-10-24 10:25       ` Leo Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).