From: Rob Herring <robh@kernel.org> To: Will Deacon <will@kernel.org>, Catalin Marinas <catalin.marinas@arm.com>, Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Jiri Olsa <jolsa@redhat.com>, Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Namhyung Kim <namhyung@kernel.org>, Raphael Gault <raphael.gault@arm.com>, Jonathan Cameron <Jonathan.Cameron@huawei.com>, Ian Rogers <irogers@google.com>, honnappa.nagarahalli@arm.com, Itaru Kitayama <itaru.kitayama@gmail.com> Subject: [PATCH v5 2/9] arm64: perf: Enable PMU counter direct access for perf event Date: Wed, 13 Jan 2021 20:05:58 -0600 [thread overview] Message-ID: <20210114020605.3943992-3-robh@kernel.org> (raw) In-Reply-To: <20210114020605.3943992-1-robh@kernel.org> From: Raphael Gault <raphael.gault@arm.com> Keep track of event opened with direct access to the hardware counters and modify permissions while they are open. The strategy used here is the same which x86 uses: every time an event is mapped, the permissions are set if required. The atomic field added in the mm_context helps keep track of the different event opened and de-activate the permissions when all are unmapped. We also need to update the permissions in the context switch code so that tasks keep the right permissions. Signed-off-by: Raphael Gault <raphael.gault@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> --- I'm not completely sure if using current->active_mm in an IPI is okay? It seems to work in my testing. Peter Z says (event->oncpu == smp_processor_id()) in the user page update is always true, but my testing says otherwise[1]. v5: - Only set cap_user_rdpmc if event is on current cpu - Limit enabling/disabling access to CPUs associated with the PMU (supported_cpus) and with the mm_struct matching current->active_mm. v2: - Move mapped/unmapped into arm64 code. Fixes arm32. - Rebase on cap_user_time_short changes Changes from Raphael's v4: - Drop homogeneous check - Disable access for chained counters - Set pmc_width in user page [1] https://lore.kernel.org/lkml/CAL_JsqK+eKef5NaVnBfARCjRE3MYhfBfe54F9YHKbsTnWqLmLw@mail.gmail.com/ --- arch/arm64/include/asm/mmu.h | 5 +++ arch/arm64/include/asm/mmu_context.h | 2 ++ arch/arm64/include/asm/perf_event.h | 14 +++++++++ arch/arm64/kernel/perf_event.c | 47 ++++++++++++++++++++++++++++ 4 files changed, 68 insertions(+) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 75beffe2ee8a..ee08447455da 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -18,6 +18,11 @@ typedef struct { atomic64_t id; + /* + * non-zero if userspace have access to hardware + * counters directly. + */ + atomic_t pmu_direct_access; #ifdef CONFIG_COMPAT void *sigpage; #endif diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 0b3079fd28eb..25aee6ab4127 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -21,6 +21,7 @@ #include <asm/proc-fns.h> #include <asm-generic/mm_hooks.h> #include <asm/cputype.h> +#include <asm/perf_event.h> #include <asm/sysreg.h> #include <asm/tlbflush.h> @@ -231,6 +232,7 @@ static inline void __switch_mm(struct mm_struct *next) } check_and_switch_context(next); + perf_switch_user_access(next); } static inline void diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h index 60731f602d3e..112f3f63b79e 100644 --- a/arch/arm64/include/asm/perf_event.h +++ b/arch/arm64/include/asm/perf_event.h @@ -8,6 +8,7 @@ #include <asm/stack_pointer.h> #include <asm/ptrace.h> +#include <linux/mm_types.h> #define ARMV8_PMU_MAX_COUNTERS 32 #define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1) @@ -254,4 +255,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs); (regs)->pstate = PSR_MODE_EL1h; \ } +static inline void perf_switch_user_access(struct mm_struct *mm) +{ + if (!IS_ENABLED(CONFIG_PERF_EVENTS)) + return; + + if (atomic_read(&mm->context.pmu_direct_access)) { + write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR, + pmuserenr_el0); + } else { + write_sysreg(0, pmuserenr_el0); + } +} + #endif diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 21f6f4cdd05f..38b1cdc7bbe2 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -889,6 +889,46 @@ static int armv8pmu_access_event_idx(struct perf_event *event) return event->hw.idx; } +static void refresh_pmuserenr(void *mm) +{ + if (mm == current->active_mm) + perf_switch_user_access(mm); +} + +static void armv8pmu_event_mapped(struct perf_event *event, struct mm_struct *mm) +{ + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); + + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + /* + * This function relies on not being called concurrently in two + * tasks in the same mm. Otherwise one task could observe + * pmu_direct_access > 1 and return all the way back to + * userspace with user access disabled while another task is still + * doing on_each_cpu_mask() to enable user access. + * + * For now, this can't happen because all callers hold mmap_lock + * for write. If this changes, we'll need a different solution. + */ + lockdep_assert_held_write(&mm->mmap_lock); + + if (atomic_inc_return(&mm->context.pmu_direct_access) == 1) + on_each_cpu_mask(&armpmu->supported_cpus, refresh_pmuserenr, mm, 1); +} + +static void armv8pmu_event_unmapped(struct perf_event *event, struct mm_struct *mm) +{ + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); + + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + if (atomic_dec_and_test(&mm->context.pmu_direct_access)) + on_each_cpu_mask(&armpmu->supported_cpus, refresh_pmuserenr, mm, 1); +} + /* * Add an event filter to a given event. */ @@ -1120,6 +1160,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name, cpu_pmu->filter_match = armv8pmu_filter_match; cpu_pmu->pmu.event_idx = armv8pmu_access_event_idx; + cpu_pmu->pmu.event_mapped = armv8pmu_event_mapped; + cpu_pmu->pmu.event_unmapped = armv8pmu_event_unmapped; cpu_pmu->name = name; cpu_pmu->map_event = map_event; @@ -1299,6 +1341,11 @@ void arch_perf_update_userpage(struct perf_event *event, userpg->cap_user_time = 0; userpg->cap_user_time_zero = 0; userpg->cap_user_time_short = 0; + userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR) && + (event->oncpu == smp_processor_id()); + + if (userpg->cap_user_rdpmc) + userpg->pmc_width = armv8pmu_event_is_64bit(event) ? 64 : 32; do { rd = sched_clock_read_begin(&seq); -- 2.27.0
WARNING: multiple messages have this Message-ID (diff)
From: Rob Herring <robh@kernel.org> To: Will Deacon <will@kernel.org>, Catalin Marinas <catalin.marinas@arm.com>, Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Jiri Olsa <jolsa@redhat.com>, Mark Rutland <mark.rutland@arm.com> Cc: Ian Rogers <irogers@google.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, linux-kernel@vger.kernel.org, honnappa.nagarahalli@arm.com, Raphael Gault <raphael.gault@arm.com>, Jonathan Cameron <Jonathan.Cameron@huawei.com>, Namhyung Kim <namhyung@kernel.org>, Itaru Kitayama <itaru.kitayama@gmail.com>, linux-arm-kernel@lists.infradead.org Subject: [PATCH v5 2/9] arm64: perf: Enable PMU counter direct access for perf event Date: Wed, 13 Jan 2021 20:05:58 -0600 [thread overview] Message-ID: <20210114020605.3943992-3-robh@kernel.org> (raw) In-Reply-To: <20210114020605.3943992-1-robh@kernel.org> From: Raphael Gault <raphael.gault@arm.com> Keep track of event opened with direct access to the hardware counters and modify permissions while they are open. The strategy used here is the same which x86 uses: every time an event is mapped, the permissions are set if required. The atomic field added in the mm_context helps keep track of the different event opened and de-activate the permissions when all are unmapped. We also need to update the permissions in the context switch code so that tasks keep the right permissions. Signed-off-by: Raphael Gault <raphael.gault@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> --- I'm not completely sure if using current->active_mm in an IPI is okay? It seems to work in my testing. Peter Z says (event->oncpu == smp_processor_id()) in the user page update is always true, but my testing says otherwise[1]. v5: - Only set cap_user_rdpmc if event is on current cpu - Limit enabling/disabling access to CPUs associated with the PMU (supported_cpus) and with the mm_struct matching current->active_mm. v2: - Move mapped/unmapped into arm64 code. Fixes arm32. - Rebase on cap_user_time_short changes Changes from Raphael's v4: - Drop homogeneous check - Disable access for chained counters - Set pmc_width in user page [1] https://lore.kernel.org/lkml/CAL_JsqK+eKef5NaVnBfARCjRE3MYhfBfe54F9YHKbsTnWqLmLw@mail.gmail.com/ --- arch/arm64/include/asm/mmu.h | 5 +++ arch/arm64/include/asm/mmu_context.h | 2 ++ arch/arm64/include/asm/perf_event.h | 14 +++++++++ arch/arm64/kernel/perf_event.c | 47 ++++++++++++++++++++++++++++ 4 files changed, 68 insertions(+) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 75beffe2ee8a..ee08447455da 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -18,6 +18,11 @@ typedef struct { atomic64_t id; + /* + * non-zero if userspace have access to hardware + * counters directly. + */ + atomic_t pmu_direct_access; #ifdef CONFIG_COMPAT void *sigpage; #endif diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index 0b3079fd28eb..25aee6ab4127 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -21,6 +21,7 @@ #include <asm/proc-fns.h> #include <asm-generic/mm_hooks.h> #include <asm/cputype.h> +#include <asm/perf_event.h> #include <asm/sysreg.h> #include <asm/tlbflush.h> @@ -231,6 +232,7 @@ static inline void __switch_mm(struct mm_struct *next) } check_and_switch_context(next); + perf_switch_user_access(next); } static inline void diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h index 60731f602d3e..112f3f63b79e 100644 --- a/arch/arm64/include/asm/perf_event.h +++ b/arch/arm64/include/asm/perf_event.h @@ -8,6 +8,7 @@ #include <asm/stack_pointer.h> #include <asm/ptrace.h> +#include <linux/mm_types.h> #define ARMV8_PMU_MAX_COUNTERS 32 #define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1) @@ -254,4 +255,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs); (regs)->pstate = PSR_MODE_EL1h; \ } +static inline void perf_switch_user_access(struct mm_struct *mm) +{ + if (!IS_ENABLED(CONFIG_PERF_EVENTS)) + return; + + if (atomic_read(&mm->context.pmu_direct_access)) { + write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR, + pmuserenr_el0); + } else { + write_sysreg(0, pmuserenr_el0); + } +} + #endif diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 21f6f4cdd05f..38b1cdc7bbe2 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -889,6 +889,46 @@ static int armv8pmu_access_event_idx(struct perf_event *event) return event->hw.idx; } +static void refresh_pmuserenr(void *mm) +{ + if (mm == current->active_mm) + perf_switch_user_access(mm); +} + +static void armv8pmu_event_mapped(struct perf_event *event, struct mm_struct *mm) +{ + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); + + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + /* + * This function relies on not being called concurrently in two + * tasks in the same mm. Otherwise one task could observe + * pmu_direct_access > 1 and return all the way back to + * userspace with user access disabled while another task is still + * doing on_each_cpu_mask() to enable user access. + * + * For now, this can't happen because all callers hold mmap_lock + * for write. If this changes, we'll need a different solution. + */ + lockdep_assert_held_write(&mm->mmap_lock); + + if (atomic_inc_return(&mm->context.pmu_direct_access) == 1) + on_each_cpu_mask(&armpmu->supported_cpus, refresh_pmuserenr, mm, 1); +} + +static void armv8pmu_event_unmapped(struct perf_event *event, struct mm_struct *mm) +{ + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); + + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR)) + return; + + if (atomic_dec_and_test(&mm->context.pmu_direct_access)) + on_each_cpu_mask(&armpmu->supported_cpus, refresh_pmuserenr, mm, 1); +} + /* * Add an event filter to a given event. */ @@ -1120,6 +1160,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name, cpu_pmu->filter_match = armv8pmu_filter_match; cpu_pmu->pmu.event_idx = armv8pmu_access_event_idx; + cpu_pmu->pmu.event_mapped = armv8pmu_event_mapped; + cpu_pmu->pmu.event_unmapped = armv8pmu_event_unmapped; cpu_pmu->name = name; cpu_pmu->map_event = map_event; @@ -1299,6 +1341,11 @@ void arch_perf_update_userpage(struct perf_event *event, userpg->cap_user_time = 0; userpg->cap_user_time_zero = 0; userpg->cap_user_time_short = 0; + userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR) && + (event->oncpu == smp_processor_id()); + + if (userpg->cap_user_rdpmc) + userpg->pmc_width = armv8pmu_event_is_64bit(event) ? 64 : 32; do { rd = sched_clock_read_begin(&seq); -- 2.27.0 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2021-01-14 2:09 UTC|newest] Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-01-14 2:05 [PATCH v5 0/9] libperf and arm64 userspace counter access support Rob Herring 2021-01-14 2:05 ` Rob Herring 2021-01-14 2:05 ` [PATCH v5 1/9] arm64: pmu: Add function implementation to update event index in userpage Rob Herring 2021-01-14 2:05 ` Rob Herring 2021-01-14 2:05 ` Rob Herring [this message] 2021-01-14 2:05 ` [PATCH v5 2/9] arm64: perf: Enable PMU counter direct access for perf event Rob Herring 2021-01-14 2:05 ` [PATCH v5 3/9] tools/include: Add an initial math64.h Rob Herring 2021-01-14 2:05 ` Rob Herring 2021-01-14 2:06 ` [PATCH v5 4/9] libperf: Add evsel mmap support Rob Herring 2021-01-14 2:06 ` Rob Herring 2021-01-23 22:42 ` Jiri Olsa 2021-01-23 22:42 ` Jiri Olsa 2021-01-14 2:06 ` [PATCH v5 5/9] libperf: tests: Add support for verbose printing Rob Herring 2021-01-14 2:06 ` Rob Herring 2021-01-14 2:06 ` [PATCH v5 6/9] libperf: Add support for user space counter access Rob Herring 2021-01-14 2:06 ` Rob Herring 2021-01-14 2:06 ` [PATCH v5 7/9] libperf: Add arm64 support to perf_mmap__read_self() Rob Herring 2021-01-14 2:06 ` Rob Herring 2021-01-14 2:06 ` [PATCH v5 8/9] perf: arm64: Add test for userspace counter access on heterogeneous systems Rob Herring 2021-01-14 2:06 ` Rob Herring 2021-01-14 2:06 ` [PATCH v5 9/9] Documentation: arm64: Document PMU counters access from userspace Rob Herring 2021-01-14 2:06 ` Rob Herring 2021-02-04 15:18 ` [PATCH v5 0/9] libperf and arm64 userspace counter access support Rob Herring 2021-02-04 15:18 ` Rob Herring
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20210114020605.3943992-3-robh@kernel.org \ --to=robh@kernel.org \ --cc=Jonathan.Cameron@huawei.com \ --cc=acme@kernel.org \ --cc=alexander.shishkin@linux.intel.com \ --cc=catalin.marinas@arm.com \ --cc=honnappa.nagarahalli@arm.com \ --cc=irogers@google.com \ --cc=itaru.kitayama@gmail.com \ --cc=jolsa@redhat.com \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=mark.rutland@arm.com \ --cc=mingo@redhat.com \ --cc=namhyung@kernel.org \ --cc=peterz@infradead.org \ --cc=raphael.gault@arm.com \ --cc=will@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.