linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 0/9] libperf and arm64 userspace counter access support
@ 2020-08-28 20:56 Rob Herring
  2020-08-28 20:56 ` [PATCH v2 1/9] arm64: pmu: Add hook to handle pmu-related undefined instructions Rob Herring
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

This is resurrecting Raphael's series[1] to enable userspace counter
access on arm64. My previous version is here[2].

New in this version is adding userspace read support into libperf rather
than adding yet another copy of the read loop. Details are in patch 5.


The following changes to the arm64 support have been made compared to
Raphael's last version:

The major change is support for heterogeneous systems with some
restrictions. Specifically, userspace must pin itself to like CPUs, open
a specific PMU by type, and use h/w specific events. The tests have been
reworked to demonstrate this.

Chained events are not supported. The problem with supporting chained
events is that there's no way to distinguish between a chained event and
a native 64-bit counter. We could add some flag, but do self monitoring
processes really need that? Native 64-bit counters are supported if the
PMU h/w has support. As there's already an explicit ABI to request 64-bit
counters, userspace can request 64-bit counters and, if user access is
not enabled, it must retry with 32-bit counters.

Prior versions broke the build on arm32 (surprisingly never caught by
0-day). As a result, event_mapped and event_unmapped implementations have
been moved into the arm64 code.

There was a bug in that pmc_width was not set in the user page. The tests
now check for this.

The documentation has been converted to rST. I've added sections on
chained events and heterogeneous systems.

The tests have been expanded to test the cycle counter access.

Rob

[1] https://lore.kernel.org/linux-arm-kernel/20190822144220.27860-1-raphael.gault@arm.com/
[2] https://lore.kernel.org/linux-arm-kernel/20200707205333.624938-1-robh@kernel.org/

Raphael Gault (4):
  arm64: pmu: Add hook to handle pmu-related undefined instructions
  arm64: pmu: Add function implementation to update event index in
    userpage
  arm64: perf: Enable pmu counter direct access for perf event on armv8
  Documentation: arm64: Document PMU counters access from userspace

Rob Herring (5):
  tools/include: Add an initial math64.h
  libperf: Add support for user space counter access
  libperf: Add arm64 support to perf_mmap__read_self()
  perf: arm64: Add test for userspace counter access on heterogeneous
    systems
  perf: Remove x86 specific rdpmc test

 Documentation/arm64/index.rst                 |   1 +
 .../arm64/perf_counter_user_access.rst        |  56 ++++++
 arch/arm64/include/asm/mmu.h                  |   5 +
 arch/arm64/include/asm/mmu_context.h          |   2 +
 arch/arm64/include/asm/perf_event.h           |  14 ++
 arch/arm64/kernel/cpufeature.c                |   4 +-
 arch/arm64/kernel/perf_event.c                | 116 +++++++++++
 include/linux/perf/arm_pmu.h                  |   2 +
 tools/include/linux/math64.h                  |  75 +++++++
 tools/lib/perf/Documentation/libperf.txt      |   1 +
 tools/lib/perf/evsel.c                        |  33 +++
 tools/lib/perf/include/internal/evsel.h       |   2 +
 tools/lib/perf/include/internal/mmap.h        |   3 +
 tools/lib/perf/include/perf/evsel.h           |   1 +
 tools/lib/perf/libperf.map                    |   1 +
 tools/lib/perf/mmap.c                         | 188 ++++++++++++++++++
 tools/lib/perf/tests/test-evsel.c             |  64 ++++++
 tools/perf/arch/arm64/include/arch-tests.h    |   7 +
 tools/perf/arch/arm64/tests/Build             |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c      |   4 +
 tools/perf/arch/arm64/tests/user-events.c     | 170 ++++++++++++++++
 tools/perf/arch/x86/include/arch-tests.h      |   1 -
 tools/perf/arch/x86/tests/Build               |   1 -
 tools/perf/arch/x86/tests/arch-tests.c        |   4 -
 tools/perf/arch/x86/tests/rdpmc.c             | 182 -----------------
 25 files changed, 748 insertions(+), 190 deletions(-)
 create mode 100644 Documentation/arm64/perf_counter_user_access.rst
 create mode 100644 tools/include/linux/math64.h
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c
 delete mode 100644 tools/perf/arch/x86/tests/rdpmc.c

--
2.25.1

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [PATCH v2 1/9] arm64: pmu: Add hook to handle pmu-related undefined instructions
  2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
@ 2020-08-28 20:56 ` Rob Herring
  2020-08-28 20:56 ` [PATCH v2 2/9] arm64: pmu: Add function implementation to update event index in userpage Rob Herring
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

From: Raphael Gault <raphael.gault@arm.com>

This patch protects userspace processes which access the PMU registers
in a big.LITTLE environment. It introduces a hook to handle the
resulting undefined instructions.

The goal here is to prevent the process from being interrupted by a
signal when the error is caused by the task being scheduled while
accessing a counter, causing the counter access to be invalid. As we
cannot efficiently determine the number of counters physically available
on both PMUs in that context, we consider that any faulting access to a
counter which is architecturally correct should not cause a SIGILL signal
if the permissions are set accordingly.

This commit also modifies the mask of the mrs_hook declared in
arch/arm64/kernel/cpufeature.c, which emulates only feature register
accesses. This is necessary because this hook's mask was too wide and
thus matched any mrs instruction, even those unrelated to the emulated
registers, which made the PMU emulation inefficient.
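
The effect of the mask change can be checked with a small userspace
sketch. It is illustrative only: the helper names are hypothetical, and
the register encodings used in the usage note are assumptions taken from
the architecture manual rather than from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the A64 encoding of "mrs x<rt>, <sysreg>".
 * op0 is 2 or 3; only its low bit is stored in the instruction (bit 19).
 */
static uint32_t mrs_encoding(unsigned int op0, unsigned int op1,
			     unsigned int crn, unsigned int crm,
			     unsigned int op2, unsigned int rt)
{
	return 0xd5300000u | ((op0 & 1u) << 19) | (op1 << 16) |
	       (crn << 12) | (crm << 8) | (op2 << 5) | rt;
}

/* An undef hook fires when the masked instruction equals instr_val. */
static bool hook_matches(uint32_t insn, uint32_t mask, uint32_t val)
{
	return (insn & mask) == val;
}
```

Assuming PMCCNTR_EL0 is (op0=3, op1=3, CRn=9, CRm=13, op2=0) and MIDR_EL1
is (3, 0, 0, 0, 0), the old mrs_hook mask/value pair matches both reads,
while the narrowed pair matches only the MIDR_EL1 read and leaves the
PMCCNTR_EL0 read to the new pmu_hook pair (0xffff8800 / 0xd53b8800).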

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
v2:
 - Fix warning for set but unused sys_reg
---
 arch/arm64/kernel/cpufeature.c |  4 +--
 arch/arm64/kernel/perf_event.c | 54 ++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a389b999482e..00bf53ffd9b0 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2811,8 +2811,8 @@ static int emulate_mrs(struct pt_regs *regs, u32 insn)
 }
 
 static struct undef_hook mrs_hook = {
-	.instr_mask = 0xfff00000,
-	.instr_val  = 0xd5300000,
+	.instr_mask = 0xffff0000,
+	.instr_val  = 0xd5380000,
 	.pstate_mask = PSR_AA32_MODE_MASK,
 	.pstate_val = PSR_MODE_EL0t,
 	.fn = emulate_mrs,
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 462f9a9cc44b..70538ae684da 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -8,9 +8,11 @@
  * This code is based heavily on the ARMv7 perf event code.
  */
 
+#include <asm/cpu.h>
 #include <asm/irq_regs.h>
 #include <asm/perf_event.h>
 #include <asm/sysreg.h>
+#include <asm/traps.h>
 #include <asm/virt.h>
 
 #include <clocksource/arm_arch_timer.h>
@@ -1016,6 +1018,58 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
 	return probe.present ? 0 : -ENODEV;
 }
 
+static int emulate_pmu(struct pt_regs *regs, u32 insn)
+{
+	u32 rt;
+	u32 pmuserenr;
+
+	rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+	pmuserenr = read_sysreg(pmuserenr_el0);
+
+	if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) !=
+	    (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR))
+		return -EINVAL;
+
+
+	/*
+	 * Userspace is expected to only use this in the context of the scheme
+	 * described in the struct perf_event_mmap_page comments.
+	 *
+	 * Given that context, we can only get here if we got migrated between
+	 * getting the register index and doing the MRS read.  This in turn
+	 * implies we'll fail the sequence and retry, so any value returned is
+	 * 'good', all we need is to be non-fatal.
+	 *
+	 * The choice of the value 0 is coming from the fact that when
+	 * accessing a register which is not counting events but is accessible,
+	 * we get 0.
+	 */
+	pt_regs_write_reg(regs, rt, 0);
+
+	arm64_skip_faulting_instruction(regs, 4);
+	return 0;
+}
+
+/*
+ * This hook will only be triggered by mrs
+ * instructions on PMU registers. This is mandatory
+ * in order to have a consistent behaviour even on
+ * big.LITTLE systems.
+ */
+static struct undef_hook pmu_hook = {
+	.instr_mask = 0xffff8800,
+	.instr_val  = 0xd53b8800,
+	.fn = emulate_pmu,
+};
+
+static int __init enable_pmu_emulation(void)
+{
+	register_undef_hook(&pmu_hook);
+	return 0;
+}
+
+core_initcall(enable_pmu_emulation);
+
 static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 			  int (*map_event)(struct perf_event *event),
 			  const struct attribute_group *events,
-- 
2.25.1



* [PATCH v2 2/9] arm64: pmu: Add function implementation to update event index in userpage
  2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
  2020-08-28 20:56 ` [PATCH v2 1/9] arm64: pmu: Add hook to handle pmu-related undefined instructions Rob Herring
@ 2020-08-28 20:56 ` Rob Herring
  2020-08-28 20:56 ` [PATCH v2 3/9] arm64: perf: Enable pmu counter direct access for perf event on armv8 Rob Herring
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

From: Raphael Gault <raphael.gault@arm.com>

In order to access the counter directly from userspace, we need to
provide the index of the counter using the userpage. We thus need to
override the event_idx function to retrieve the perf_event index and
convert it to the armv8 hardware index.

Since the arm_pmu driver can be used by any implementation, even
if not armv8, two components play a role in making sure the
behaviour is correct and consistent with the PMU capabilities:

* the ARMPMU_EL0_RD_CNTR flag, which denotes the capability to access
counters from userspace.
* the event_idx callback, which is implemented and initialized by
the PMU implementation: if no callback is provided, the default
behaviour applies, returning 0 as the index value.
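
As a rough illustration of the index convention, here is how a userspace
reader might decode the value published in the user page. This is a
sketch, not code from the series: the enum, struct and function names are
hypothetical, and it assumes the reader subtracts 1 from the published
index (as the libperf read loop in this series does), so that the
remapped cycle counter value of 32 becomes counter number 31.

```c
#include <stdint.h>

/* Hypothetical classification of the index read from the user page. */
enum user_counter_kind {
	USER_CNTR_NONE,		/* index 0: no direct access */
	USER_CNTR_EVENT,	/* a regular event counter */
	USER_CNTR_CYCLE,	/* the cycle counter, published as 32 */
};

struct user_counter {
	enum user_counter_kind kind;
	unsigned int nr;	/* counter number after the -1 offset */
};

static struct user_counter decode_user_index(uint32_t index)
{
	struct user_counter c = { USER_CNTR_NONE, 0 };

	if (index == 0)
		return c;

	c.nr = index - 1;
	c.kind = (c.nr == 31) ? USER_CNTR_CYCLE : USER_CNTR_EVENT;
	return c;
}
```

An index of 0 therefore always means "fall back to read()", which is also
what the default event_idx behaviour yields.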

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
 arch/arm64/kernel/perf_event.c | 21 +++++++++++++++++++++
 include/linux/perf/arm_pmu.h   |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 70538ae684da..2727d126cecd 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -820,6 +820,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events *cpuc,
 		clear_bit(idx - 1, cpuc->used_mask);
 }
 
+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return 0;
+
+	/*
+	 * We remap the cycle counter index to 32 to
+	 * match the offset applied to the rest of
+	 * the counter indices.
+	 */
+	if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+		return 32;
+
+	return event->hw.idx;
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -916,6 +932,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
 	if (armv8pmu_event_is_64bit(event))
 		event->hw.flags |= ARMPMU_EVT_64BIT;
 
+	if (!armv8pmu_event_is_chained(event))
+		event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
 	/* Only expose micro/arch events supported by this PMU */
 	if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
 	    && test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1092,6 +1111,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 	cpu_pmu->set_event_filter	= armv8pmu_set_event_filter;
 	cpu_pmu->filter_match		= armv8pmu_filter_match;
 
+	cpu_pmu->pmu.event_idx		= armv8pmu_access_event_idx;
+
 	cpu_pmu->name			= name;
 	cpu_pmu->map_event		= map_event;
 	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = events ?
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 5b616dde9a4c..74fbbbd29dc7 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -26,6 +26,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT		1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR		2
 
 #define HW_OP_UNSUPPORTED		0xFFFF
 #define C(_x)				PERF_COUNT_HW_CACHE_##_x
-- 
2.25.1



* [PATCH v2 3/9] arm64: perf: Enable pmu counter direct access for perf event on armv8
  2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
  2020-08-28 20:56 ` [PATCH v2 1/9] arm64: pmu: Add hook to handle pmu-related undefined instructions Rob Herring
  2020-08-28 20:56 ` [PATCH v2 2/9] arm64: pmu: Add function implementation to update event index in userpage Rob Herring
@ 2020-08-28 20:56 ` Rob Herring
  2020-08-28 20:56 ` [PATCH v2 4/9] tools/include: Add an initial math64.h Rob Herring
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

From: Raphael Gault <raphael.gault@arm.com>

Keep track of events opened with direct access to the hardware counters
and modify permissions while they are open.

The strategy used here is the same one x86 uses: every time an event
is mapped, the permissions are set if required. The atomic field added
in the mm_context helps keep track of the different events opened and
deactivate the permissions when all are unmapped.
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.
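
The counting scheme can be modelled in plain C11. This is only a
userspace sketch of the idea: struct mm_model and the function names are
hypothetical, a boolean stands in for the PMUSERENR_EL0 write, and the
cross-CPU calls of the real code are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm state added by this patch. */
struct mm_model {
	atomic_int pmu_direct_access;	/* number of mapped events */
	bool user_access_enabled;	/* models the PMUSERENR_EL0 state */
};

/* Models perf_switch_user_access(): derive the permission from the count. */
static void refresh_access(struct mm_model *mm)
{
	mm->user_access_enabled = atomic_load(&mm->pmu_direct_access) != 0;
}

/* First mapping (0 -> 1) turns user access on. */
static void event_mapped(struct mm_model *mm)
{
	if (atomic_fetch_add(&mm->pmu_direct_access, 1) == 0)
		refresh_access(mm);
}

/* Last unmapping (1 -> 0) turns user access off again. */
static void event_unmapped(struct mm_model *mm)
{
	if (atomic_fetch_sub(&mm->pmu_direct_access, 1) == 1)
		refresh_access(mm);
}
```

Only the 0 -> 1 and 1 -> 0 transitions touch the permission, so mapping
additional events in the same mm costs nothing beyond the atomic update.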

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
v2:
 - Move mapped/unmapped into arm64 code. Fixes arm32.
 - Rebase on cap_user_time_short changes

Changes from Raphael's v4:
  - Drop homogeneous check
  - Disable access for chained counters
  - Set pmc_width in user page
---
 arch/arm64/include/asm/mmu.h         |  5 ++++
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 ++++++++++
 arch/arm64/kernel/perf_event.c       | 41 ++++++++++++++++++++++++++++
 4 files changed, 62 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index a7a5ecaa2e83..52cfdb676f06 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -19,6 +19,11 @@
 
 typedef struct {
 	atomic64_t	id;
+	/*
+	 * non-zero if userspace has access to hardware
+	 * counters directly.
+	 */
+	atomic_t	pmu_direct_access;
 #ifdef CONFIG_COMPAT
 	void		*sigpage;
 #endif
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index f2d7537d6f83..d24589ecb07a 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -21,6 +21,7 @@
 #include <asm/proc-fns.h>
 #include <asm-generic/mm_hooks.h>
 #include <asm/cputype.h>
+#include <asm/perf_event.h>
 #include <asm/sysreg.h>
 #include <asm/tlbflush.h>
 
@@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next)
 	}
 
 	check_and_switch_context(next);
+	perf_switch_user_access(next);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 2c2d7dbe8a02..a025d9595d51 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -8,6 +8,7 @@
 
 #include <asm/stack_pointer.h>
 #include <asm/ptrace.h>
+#include <linux/mm_types.h>
 
 #define	ARMV8_PMU_MAX_COUNTERS	32
 #define	ARMV8_PMU_COUNTER_MASK	(ARMV8_PMU_MAX_COUNTERS - 1)
@@ -251,4 +252,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 	(regs)->pstate = PSR_MODE_EL1h;	\
 }
 
+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+	if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+		return;
+
+	if (atomic_read(&mm->context.pmu_direct_access)) {
+		write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+			     pmuserenr_el0);
+	} else {
+		write_sysreg(0, pmuserenr_el0);
+	}
+}
+
 #endif
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 2727d126cecd..cf44591f5be1 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -836,6 +836,41 @@ static int armv8pmu_access_event_idx(struct perf_event *event)
 	return event->hw.idx;
 }
 
+static void refresh_pmuserenr(void *mm)
+{
+	perf_switch_user_access(mm);
+}
+
+static void armv8pmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return;
+
+	/*
+	 * This function relies on not being called concurrently in two
+	 * tasks in the same mm.  Otherwise one task could observe
+	 * pmu_direct_access > 1 and return all the way back to
+	 * userspace with user access disabled while another task is still
+	 * doing on_each_cpu_mask() to enable user access.
+	 *
+	 * For now, this can't happen because all callers hold mmap_lock
+	 * for write.  If this changes, we'll need a different solution.
+	 */
+	lockdep_assert_held_write(&mm->mmap_lock);
+
+	if (atomic_inc_return(&mm->context.pmu_direct_access) == 1)
+		on_each_cpu(refresh_pmuserenr, mm, 1);
+}
+
+static void armv8pmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return;
+
+	if (atomic_dec_and_test(&mm->context.pmu_direct_access))
+		on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -1112,6 +1147,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 	cpu_pmu->filter_match		= armv8pmu_filter_match;
 
 	cpu_pmu->pmu.event_idx		= armv8pmu_access_event_idx;
+	cpu_pmu->pmu.event_mapped	= armv8pmu_event_mapped;
+	cpu_pmu->pmu.event_unmapped	= armv8pmu_event_unmapped;
 
 	cpu_pmu->name			= name;
 	cpu_pmu->map_event		= map_event;
@@ -1272,6 +1309,10 @@ void arch_perf_update_userpage(struct perf_event *event,
 	userpg->cap_user_time = 0;
 	userpg->cap_user_time_zero = 0;
 	userpg->cap_user_time_short = 0;
+	userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
+
+	if (userpg->cap_user_rdpmc)
+		userpg->pmc_width = armv8pmu_event_is_64bit(event) ? 64 : 32;
 
 	do {
 		rd = sched_clock_read_begin(&seq);
-- 
2.25.1



* [PATCH v2 4/9] tools/include: Add an initial math64.h
  2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
                   ` (2 preceding siblings ...)
  2020-08-28 20:56 ` [PATCH v2 3/9] arm64: perf: Enable pmu counter direct access for perf event on armv8 Rob Herring
@ 2020-08-28 20:56 ` Rob Herring
  2020-08-28 20:56 ` [PATCH v2 5/9] libperf: Add support for user space counter access Rob Herring
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

Add an initial math64.h similar to linux/math64.h with functions
mul_u64_u64_div64() and mul_u64_u32_shr(). This isn't a direct copy of
include/linux/math64.h as that doesn't define mul_u64_u64_div64().

The implementation was written by Peter Zijlstra based on linux/math64.h
and div64.h[1]. The original implementation was not optimal on arm64, as
__int128 division results in a call out to __udivti3, so I dropped the
__int128 variant of mul_u64_u64_div64().

[1] https://lore.kernel.org/lkml/20200322101848.GF2452@worktop.programming.kicks-ass.net/
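
For reference, the generic fallback can be tried outside the tree with
standard types. The sketch below mirrors the fallback in this patch but
uses uint64_t instead of u64 so it builds standalone. It computes
a * b / c by splitting a into quotient and remainder with respect to c,
which avoids overflowing the intermediate product as long as
quot * b and rem * b each fit in 64 bits (an assumption callers need to
keep in mind).

```c
#include <stdint.h>

/* a * b / c without forming the full a * b product. */
static uint64_t mul_u64_u64_div64(uint64_t a, uint64_t b, uint64_t c)
{
	uint64_t quot = a / c;
	uint64_t rem = a % c;

	/* a = quot * c + rem, so a * b / c = quot * b + rem * b / c */
	return quot * b + (rem * b) / c;
}
```

For example, scaling a count of 1000 by an enabled/running ratio of
200/100 is mul_u64_u64_div64(1000, 200, 100), and a call such as
mul_u64_u64_div64(1ULL << 40, 1ULL << 30, 1ULL << 30) stays in range even
though the naive product would need 70 bits.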

Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Rob Herring <robh@kernel.org>
---
I'm not really sure what's desired here. Unlike most of the headers copied
into tools/, this one is not a copy from include/linux/. Perhaps adding
mul_u64_u64_div64() to include/linux/math64.h first and then copying it
over is preferred? Or I could just move this into libperf?


 tools/include/linux/math64.h | 75 ++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)
 create mode 100644 tools/include/linux/math64.h

diff --git a/tools/include/linux/math64.h b/tools/include/linux/math64.h
new file mode 100644
index 000000000000..4ad45d5943dc
--- /dev/null
+++ b/tools/include/linux/math64.h
@@ -0,0 +1,75 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_MATH64_H
+#define _LINUX_MATH64_H
+
+#include <linux/types.h>
+
+#ifdef __x86_64__
+static inline u64 mul_u64_u64_div64(u64 a, u64 b, u64 c)
+{
+	u64 q;
+
+	asm ("mulq %2; divq %3" : "=a" (q)
+				: "a" (a), "rm" (b), "rm" (c)
+				: "rdx");
+
+	return q;
+}
+#define mul_u64_u64_div64 mul_u64_u64_div64
+#endif
+
+#ifdef __SIZEOF_INT128__
+static inline u64 mul_u64_u32_shr(u64 a, u32 b, unsigned int shift)
+{
+	return (u64)(((unsigned __int128)a * b) >> shift);
+}
+
+#else
+
+#ifdef __i386__
+static inline u64 mul_u32_u32(u32 a, u32 b)
+{
+	u32 high, low;
+
+	asm ("mull %[b]" : "=a" (low), "=d" (high)
+			 : [a] "a" (a), [b] "rm" (b) );
+
+	return low | ((u64)high) << 32;
+}
+#else
+static inline u64 mul_u32_u32(u32 a, u32 b)
+{
+	return (u64)a * b;
+}
+#endif
+
+static inline u64 mul_u64_u32_shr(u64 a, u32 b, unsigned int shift)
+{
+	u32 ah, al;
+	u64 ret;
+
+	al = a;
+	ah = a >> 32;
+
+	ret = mul_u32_u32(al, b) >> shift;
+	if (ah)
+		ret += mul_u32_u32(ah, b) << (32 - shift);
+
+	return ret;
+}
+
+#endif	/* __SIZEOF_INT128__ */
+
+#ifndef mul_u64_u64_div64
+static inline u64 mul_u64_u64_div64(u64 a, u64 b, u64 c)
+{
+	u64 quot, rem;
+
+	quot = a / c;
+	rem = a % c;
+
+	return quot * b + (rem * b) / c;
+}
+#endif
+
+#endif /* _LINUX_MATH64_H */
--
2.25.1


* [PATCH v2 5/9] libperf: Add support for user space counter access
  2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
                   ` (3 preceding siblings ...)
  2020-08-28 20:56 ` [PATCH v2 4/9] tools/include: Add an initial math64.h Rob Herring
@ 2020-08-28 20:56 ` Rob Herring
  2020-08-31  9:11   ` Jiri Olsa
                     ` (2 more replies)
  2020-08-28 20:56 ` [PATCH v2 6/9] libperf: Add arm64 support to perf_mmap__read_self() Rob Herring
                   ` (3 subsequent siblings)
  8 siblings, 3 replies; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

x86 and arm64 can both support direct access of event counters in
userspace. The access sequence is less than trivial and currently exists
in perf test code (tools/perf/arch/x86/tests/rdpmc.c) with copies in
projects such as PAPI and libpfm4.

In order to support userspace access, an event must be mmapped. While
there's already mmap support for evlist, the use case is a bit different
from self monitoring with userspace access. So let's add a new
perf_evsel__mmap() function to mmap an evsel. This allows implementing
userspace access as a fast path for perf_evsel__read().

The mmapped address is returned by perf_evsel__mmap() primarily for
users/tests to check if userspace access is enabled.
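
The core of the access sequence is a seqlock-style retry loop over the
mmapped page. The following is a simplified standalone model, not the
libperf code: struct fake_page is a hypothetical stand-in for the few
struct perf_event_mmap_page fields involved, the hardware counter read is
replaced by a stub, and time scaling is omitted.

```c
#include <stdint.h>

/* Hypothetical stand-in for the perf_event_mmap_page fields used here. */
struct fake_page {
	uint32_t lock;		/* sequence count, bumped by the kernel */
	uint32_t index;		/* 0 means no direct counter access */
	int64_t offset;		/* value to add to the hardware count */
	uint16_t pmc_width;	/* valid bits in the hardware counter */
};

/* Stub for the rdpmc/mrs read; tests set this to a chosen raw value. */
static uint64_t fake_hw_counter;

static uint64_t read_counter_stub(unsigned int counter)
{
	(void)counter;
	return fake_hw_counter;
}

static int read_self_model(const volatile struct fake_page *pc, uint64_t *val)
{
	uint32_t seq, idx;
	uint64_t cnt;

	do {
		seq = pc->lock;
		__asm__ volatile("" ::: "memory");	/* compiler barrier */

		idx = pc->index;
		cnt = (uint64_t)pc->offset;
		if (!idx)
			return -1;	/* caller falls back to read() */

		uint16_t width = pc->pmc_width;
		if (width == 0 || width > 64)
			return -1;	/* guard the shifts below */

		/* Keep only the low 'width' bits of the raw counter. */
		uint64_t evcnt = read_counter_stub(idx - 1);
		evcnt <<= 64 - width;
		evcnt >>= 64 - width;
		cnt += evcnt;

		__asm__ volatile("" ::: "memory");
	} while (pc->lock != seq);	/* retry if the page changed */

	*val = cnt;
	return 0;
}
```

A non-zero return is the signal to use the slow path, which is how
perf_evsel__read() can treat userspace access as an optional fast path.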

Signed-off-by: Rob Herring <robh@kernel.org>
---
 tools/lib/perf/Documentation/libperf.txt |  1 +
 tools/lib/perf/evsel.c                   | 33 +++++++++
 tools/lib/perf/include/internal/evsel.h  |  2 +
 tools/lib/perf/include/internal/mmap.h   |  3 +
 tools/lib/perf/include/perf/evsel.h      |  1 +
 tools/lib/perf/libperf.map               |  1 +
 tools/lib/perf/mmap.c                    | 90 ++++++++++++++++++++++++
 tools/lib/perf/tests/test-evsel.c        | 64 +++++++++++++++++
 8 files changed, 195 insertions(+)

diff --git a/tools/lib/perf/Documentation/libperf.txt b/tools/lib/perf/Documentation/libperf.txt
index 0c74c30ed23a..ca7478acc97c 100644
--- a/tools/lib/perf/Documentation/libperf.txt
+++ b/tools/lib/perf/Documentation/libperf.txt
@@ -136,6 +136,7 @@ SYNOPSIS
                        struct perf_thread_map *threads);
   void perf_evsel__close(struct perf_evsel *evsel);
   void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu);
+  void *perf_evsel__mmap(struct perf_evsel *evsel);
   int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
                        struct perf_counts_values *count);
   int perf_evsel__enable(struct perf_evsel *evsel);
diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 4dc06289f4c7..b0c94ef4d9b6 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -11,10 +11,12 @@
 #include <stdlib.h>
 #include <internal/xyarray.h>
 #include <internal/cpumap.h>
+#include <internal/mmap.h>
 #include <internal/threadmap.h>
 #include <internal/lib.h>
 #include <linux/string.h>
 #include <sys/ioctl.h>
+#include <sys/mman.h>
 
 void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr)
 {
@@ -156,6 +158,34 @@ void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu)
 	perf_evsel__close_fd_cpu(evsel, cpu);
 }
 
+void *perf_evsel__mmap(struct perf_evsel *evsel)
+{
+	int ret;
+	struct perf_mmap *map;
+	struct perf_mmap_param mp = {
+		.mask = -1,
+		.prot = PROT_READ | PROT_WRITE,
+	};
+
+	if (FD(evsel, 0, 0) < 0)
+		return NULL;
+
+	map = zalloc(sizeof(*map));
+	if (!map)
+		return NULL;
+
+	perf_mmap__init(map, NULL, false, NULL);
+
+	ret = perf_mmap__mmap(map, &mp, FD(evsel, 0, 0), 0);
+	if (ret) {
+		free(map);
+		return NULL;
+	}
+
+	evsel->mmap = map;
+	return map->base;
+}
+
 int perf_evsel__read_size(struct perf_evsel *evsel)
 {
 	u64 read_format = evsel->attr.read_format;
@@ -191,6 +221,9 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
 	if (FD(evsel, cpu, thread) < 0)
 		return -EINVAL;
 
+	if (evsel->mmap && !perf_mmap__read_self(evsel->mmap, count))
+		return 0;
+
 	if (readn(FD(evsel, cpu, thread), count->values, size) <= 0)
 		return -errno;
 
diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
index 1ffd083b235e..a7985dbb68ff 100644
--- a/tools/lib/perf/include/internal/evsel.h
+++ b/tools/lib/perf/include/internal/evsel.h
@@ -9,6 +9,7 @@
 
 struct perf_cpu_map;
 struct perf_thread_map;
+struct perf_mmap;
 struct xyarray;
 
 /*
@@ -40,6 +41,7 @@ struct perf_evsel {
 	struct perf_cpu_map	*cpus;
 	struct perf_cpu_map	*own_cpus;
 	struct perf_thread_map	*threads;
+	struct perf_mmap	*mmap;
 	struct xyarray		*fd;
 	struct xyarray		*sample_id;
 	u64			*id;
diff --git a/tools/lib/perf/include/internal/mmap.h b/tools/lib/perf/include/internal/mmap.h
index be7556e0a2b2..5e3422f40ed5 100644
--- a/tools/lib/perf/include/internal/mmap.h
+++ b/tools/lib/perf/include/internal/mmap.h
@@ -11,6 +11,7 @@
 #define PERF_SAMPLE_MAX_SIZE (1 << 16)
 
 struct perf_mmap;
+struct perf_counts_values;
 
 typedef void (*libperf_unmap_cb_t)(struct perf_mmap *map);
 
@@ -52,4 +53,6 @@ void perf_mmap__put(struct perf_mmap *map);
 
 u64 perf_mmap__read_head(struct perf_mmap *map);
 
+int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count);
+
 #endif /* __LIBPERF_INTERNAL_MMAP_H */
diff --git a/tools/lib/perf/include/perf/evsel.h b/tools/lib/perf/include/perf/evsel.h
index c82ec39a4ad0..6d0da962870c 100644
--- a/tools/lib/perf/include/perf/evsel.h
+++ b/tools/lib/perf/include/perf/evsel.h
@@ -27,6 +27,7 @@ LIBPERF_API int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *
 				 struct perf_thread_map *threads);
 LIBPERF_API void perf_evsel__close(struct perf_evsel *evsel);
 LIBPERF_API void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu);
+LIBPERF_API void *perf_evsel__mmap(struct perf_evsel *evsel);
 LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
 				 struct perf_counts_values *count);
 LIBPERF_API int perf_evsel__enable(struct perf_evsel *evsel);
diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map
index 7be1af8a546c..733a0647be8b 100644
--- a/tools/lib/perf/libperf.map
+++ b/tools/lib/perf/libperf.map
@@ -23,6 +23,7 @@ LIBPERF_0.0.1 {
 		perf_evsel__disable;
 		perf_evsel__open;
 		perf_evsel__close;
+		perf_evsel__mmap;
 		perf_evsel__read;
 		perf_evsel__cpus;
 		perf_evsel__threads;
diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
index 79d5ed6c38cc..cb07969cfdbf 100644
--- a/tools/lib/perf/mmap.c
+++ b/tools/lib/perf/mmap.c
@@ -8,9 +8,11 @@
 #include <linux/perf_event.h>
 #include <perf/mmap.h>
 #include <perf/event.h>
+#include <perf/evsel.h>
 #include <internal/mmap.h>
 #include <internal/lib.h>
 #include <linux/kernel.h>
+#include <linux/math64.h>
 #include "internal.h"
 
 void perf_mmap__init(struct perf_mmap *map, struct perf_mmap *prev,
@@ -273,3 +275,91 @@ union perf_event *perf_mmap__read_event(struct perf_mmap *map)
 
 	return event;
 }
+
+#if defined(__i386__) || defined(__x86_64__)
+static u64 read_perf_counter(unsigned int counter)
+{
+	unsigned int low, high;
+
+	asm volatile("rdpmc" : "=a" (low), "=d" (high) : "c" (counter));
+
+	return low | ((u64)high) << 32;
+}
+
+static u64 read_timestamp(void)
+{
+	unsigned int low, high;
+
+	asm volatile("rdtsc" : "=a" (low), "=d" (high));
+
+	return low | ((u64)high) << 32;
+}
+#else
+static u64 read_perf_counter(unsigned int counter) { return 0; }
+static u64 read_timestamp(void) { return 0; }
+#endif
+
+int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count)
+{
+	struct perf_event_mmap_page *pc = map->base;
+	u32 seq, idx, time_mult = 0, time_shift = 0;
+	u64 cnt, cyc = 0, time_offset = 0, time_cycles = 0, time_mask = ~0ULL;
+
+	BUG_ON(!pc);
+
+	if (!pc->cap_user_rdpmc)
+		return -1;
+
+	do {
+		seq = READ_ONCE(pc->lock);
+		barrier();
+
+		count->ena = READ_ONCE(pc->time_enabled);
+		count->run = READ_ONCE(pc->time_running);
+
+		if (pc->cap_user_time && count->ena != count->run) {
+			cyc = read_timestamp();
+			time_mult = READ_ONCE(pc->time_mult);
+			time_shift = READ_ONCE(pc->time_shift);
+			time_offset = READ_ONCE(pc->time_offset);
+
+			if (pc->cap_user_time_short) {
+				time_cycles = READ_ONCE(pc->time_cycles);
+				time_mask = READ_ONCE(pc->time_mask);
+			}
+		}
+
+		idx = READ_ONCE(pc->index);
+		cnt = READ_ONCE(pc->offset);
+		if (pc->cap_user_rdpmc && idx) {
+			u64 evcnt = read_perf_counter(idx - 1);
+			u16 width = READ_ONCE(pc->pmc_width);
+
+			evcnt <<= 64 - width;
+			evcnt >>= 64 - width;
+			cnt += evcnt;
+		} else
+			return -1;
+
+		barrier();
+	} while (READ_ONCE(pc->lock) != seq);
+
+	if (count->ena != count->run) {
+		u64 delta;
+
+		/* Adjust for cap_user_time_short, a nop if not */
+		cyc = time_cycles + ((cyc - time_cycles) & time_mask);
+
+		delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift);
+
+		count->ena += delta;
+		if (idx)
+			count->run += delta;
+
+		cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
+	}
+
+	count->val = cnt;
+
+	return 0;
+}
diff --git a/tools/lib/perf/tests/test-evsel.c b/tools/lib/perf/tests/test-evsel.c
index 135722ac965b..fd637d23216b 100644
--- a/tools/lib/perf/tests/test-evsel.c
+++ b/tools/lib/perf/tests/test-evsel.c
@@ -120,6 +120,68 @@ static int test_stat_thread_enable(void)
 	return 0;
 }
 
+static int test_stat_user_read(int event)
+{
+	struct perf_counts_values counts = { .val = 0 };
+	struct perf_thread_map *threads;
+	struct perf_evsel *evsel;
+	struct perf_event_mmap_page *pc;
+	struct perf_event_attr attr = {
+		.type	= PERF_TYPE_HARDWARE,
+		.config	= event,
+	};
+	int err, i;
+
+	threads = perf_thread_map__new_dummy();
+	__T("failed to create threads", threads);
+
+	perf_thread_map__set_pid(threads, 0, 0);
+
+	evsel = perf_evsel__new(&attr);
+	__T("failed to create evsel", evsel);
+
+	err = perf_evsel__open(evsel, NULL, threads);
+	__T("failed to open evsel", err == 0);
+
+	pc = perf_evsel__mmap(evsel);
+	__T("failed to mmap evsel", pc);
+
+#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
+	__T("userspace counter access not supported", pc->cap_user_rdpmc);
+	__T("userspace counter access not enabled", pc->index);
+	__T("userspace counter width not set", pc->pmc_width >= 32);
+#endif
+
+	perf_evsel__read(evsel, 0, 0, &counts);
+	__T("failed to read value for evsel", counts.val != 0);
+
+	fputs("\n", stderr);
+	for (i = 0; i < 5; i++) {
+		volatile int count = 0x10000 << i;
+		__u64 start, end, last = 0;
+
+		fprintf(stderr, "\tloop = %u, ", count);
+
+		perf_evsel__read(evsel, 0, 0, &counts);
+		start = counts.val;
+
+		while (count--) ;
+
+		perf_evsel__read(evsel, 0, 0, &counts);
+		end = counts.val;
+
+		__T("invalid counter data", (end - start) > last);
+		last = end - start;
+		fprintf(stderr, "count = %llu\n", end - start);
+	}
+
+	perf_evsel__close(evsel);
+	perf_evsel__delete(evsel);
+
+	perf_thread_map__put(threads);
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	__T_START;
@@ -129,6 +191,8 @@ int main(int argc, char **argv)
 	test_stat_cpu();
 	test_stat_thread();
 	test_stat_thread_enable();
+	test_stat_user_read(PERF_COUNT_HW_INSTRUCTIONS);
+	test_stat_user_read(PERF_COUNT_HW_CPU_CYCLES);
 
 	__T_END;
 	return 0;
-- 
2.25.1
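
For readers following the read loop in perf_mmap__read_self() above, here
is a minimal, self-contained sketch of the same idea: retry the read until
the sequence count is stable, keep only the valid low bits of the raw
counter, and scale by enabled/running time. Everything named here
(struct fake_page, fake_raw_counter(), truncate_to_width(), scale_count(),
read_self()) is a hypothetical stand-in and not libperf API. The patch
itself reads struct perf_event_mmap_page and uses READ_ONCE(), barrier()
and mul_u64_u64_div64(); the sketch assumes a compiler with
unsigned __int128 (GCC or Clang on a 64-bit target).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf_event_mmap_page fields used below. */
struct fake_page {
	volatile uint32_t lock;      /* sequence count, changes around updates */
	volatile uint32_t index;     /* hardware counter index + 1, 0 if none */
	volatile int64_t  offset;    /* value to add to the raw counter */
	volatile uint16_t pmc_width; /* number of valid bits in the raw counter */
};

/* Hypothetical raw counter read: junk in the high bits, small value below. */
static uint64_t fake_raw_counter(unsigned int counter)
{
	return 0xffffffff00000000ULL | (counter + 5);
}

/* Keep only the low `width` bits of a raw counter value. */
static uint64_t truncate_to_width(uint64_t raw, unsigned int width)
{
	assert(width >= 1 && width <= 64);
	raw <<= 64 - width;
	raw >>= 64 - width;
	return raw;
}

/* Scale a count by enabled/running time using a 128-bit intermediate. */
static uint64_t scale_count(uint64_t cnt, uint64_t ena, uint64_t run)
{
	if (!run)
		return 0;
	return (uint64_t)(((unsigned __int128)cnt * ena) / run);
}

/* Seqlock-style read: retry until the sequence count did not change. */
static int read_self(const struct fake_page *pc,
		     uint64_t (*read_raw)(unsigned int), uint64_t *val)
{
	uint32_t seq, idx;
	uint64_t cnt;

	do {
		seq = pc->lock;
		__asm__ __volatile__("" ::: "memory");	/* compiler barrier */

		idx = pc->index;
		if (!idx)
			return -1;	/* no hardware counter assigned */

		cnt = (uint64_t)pc->offset;
		cnt += truncate_to_width(read_raw(idx - 1), pc->pmc_width);

		__asm__ __volatile__("" ::: "memory");
	} while (pc->lock != seq);

	*val = cnt;
	return 0;
}
```

With offset = 100, pmc_width = 32 and index = 1, read_self() stores 105:
the junk in the upper 32 bits of the fake raw value is dropped before the
offset is added.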


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 6/9] libperf: Add arm64 support to perf_mmap__read_self()
  2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
                   ` (4 preceding siblings ...)
  2020-08-28 20:56 ` [PATCH v2 5/9] libperf: Add support for user space counter access Rob Herring
@ 2020-08-28 20:56 ` Rob Herring
  2020-08-28 20:56 ` [PATCH v2 7/9] perf: arm64: Add test for userspace counter access on heterogeneous systems Rob Herring
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

Add the arm64 variants for read_perf_counter() and read_timestamp().
Unfortunately the counter number is encoded into the instruction, so the
code is a bit verbose to enumerate all possible counters.
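
As an illustration of why the enumeration is needed, here is a small
portable sketch of the same pattern: one generated function per counter,
selected at run time through a table indexed by counter number. The
read_fake_counter_<n>() functions and read_counter() are hypothetical
stand-ins that return made-up values so the example runs on any
architecture; in the patch each slot would instead issue an mrs on the
matching pmevcntr<n>_el0 register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Stand-in for a per-register read. The register name is part of the
 * instruction on arm64, so one function per counter is generated; here
 * each function just returns a fake, recognisable value.
 */
#define FAKE_COUNTER_READ(n)					\
	static uint64_t read_fake_counter_##n(void)		\
	{							\
		return 1000u + (n);				\
	}

FAKE_COUNTER_READ(0)
FAKE_COUNTER_READ(1)
FAKE_COUNTER_READ(2)
FAKE_COUNTER_READ(3)

/* Dispatch on the counter number; slot i must hold the reader for counter i. */
static uint64_t read_counter(unsigned int counter)
{
	static uint64_t (* const read_f[])(void) = {
		read_fake_counter_0,
		read_fake_counter_1,
		read_fake_counter_2,
		read_fake_counter_3,
	};

	assert(ARRAY_SIZE(read_f) == 4);

	if (counter < ARRAY_SIZE(read_f))
		return read_f[counter]();

	return 0;	/* out of range, same fallback as the patch */
}
```

Because the table is indexed directly by counter number, the order of the
entries matters; a quick loop checking read_counter(i) against the
expected per-slot value catches a misplaced entry.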

Signed-off-by: Rob Herring <robh@kernel.org>
---
 tools/lib/perf/mmap.c | 98 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 98 insertions(+)

diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
index cb07969cfdbf..6cf939aae6b6 100644
--- a/tools/lib/perf/mmap.c
+++ b/tools/lib/perf/mmap.c
@@ -13,6 +13,7 @@
 #include <internal/lib.h>
 #include <linux/kernel.h>
 #include <linux/math64.h>
+#include <linux/stringify.h>
 #include "internal.h"
 
 void perf_mmap__init(struct perf_mmap *map, struct perf_mmap *prev,
@@ -294,6 +295,103 @@ static u64 read_timestamp(void)
 
 	return low | ((u64)high) << 32;
 }
+#elif defined(__aarch64__)
+#define read_sysreg(r) ({						\
+	u64 __val;							\
+	asm volatile("mrs %0, " __stringify(r) : "=r" (__val));		\
+	__val;								\
+})
+
+static u64 read_pmccntr(void)
+{
+	return read_sysreg(pmccntr_el0);
+}
+
+#define PMEVCNTR_READ(idx)					\
+	static u64 read_pmevcntr_##idx(void) {			\
+		return read_sysreg(pmevcntr##idx##_el0);	\
+	}
+
+PMEVCNTR_READ(0);
+PMEVCNTR_READ(1);
+PMEVCNTR_READ(2);
+PMEVCNTR_READ(3);
+PMEVCNTR_READ(4);
+PMEVCNTR_READ(5);
+PMEVCNTR_READ(6);
+PMEVCNTR_READ(7);
+PMEVCNTR_READ(8);
+PMEVCNTR_READ(9);
+PMEVCNTR_READ(10);
+PMEVCNTR_READ(11);
+PMEVCNTR_READ(12);
+PMEVCNTR_READ(13);
+PMEVCNTR_READ(14);
+PMEVCNTR_READ(15);
+PMEVCNTR_READ(16);
+PMEVCNTR_READ(17);
+PMEVCNTR_READ(18);
+PMEVCNTR_READ(19);
+PMEVCNTR_READ(20);
+PMEVCNTR_READ(21);
+PMEVCNTR_READ(22);
+PMEVCNTR_READ(23);
+PMEVCNTR_READ(24);
+PMEVCNTR_READ(25);
+PMEVCNTR_READ(26);
+PMEVCNTR_READ(27);
+PMEVCNTR_READ(28);
+PMEVCNTR_READ(29);
+PMEVCNTR_READ(30);
+
+/*
+ * Read a value direct from PMEVCNTR<idx>
+ */
+static u64 read_perf_counter(unsigned int counter)
+{
+	static u64 (* const read_f[])(void) = {
+		read_pmevcntr_0,
+		read_pmevcntr_1,
+		read_pmevcntr_2,
+		read_pmevcntr_3,
+		read_pmevcntr_4,
+		read_pmevcntr_5,
+		read_pmevcntr_6,
+		read_pmevcntr_7,
+		read_pmevcntr_8,
+		read_pmevcntr_9,
+		read_pmevcntr_10,
+		read_pmevcntr_11,
+		read_pmevcntr_12,
+		read_pmevcntr_13,
+		read_pmevcntr_14,
+		read_pmevcntr_15,
+		read_pmevcntr_16,
+		read_pmevcntr_17,
+		read_pmevcntr_18,
+		read_pmevcntr_19,
+		read_pmevcntr_20,
+		read_pmevcntr_21,
+		read_pmevcntr_22,
+		read_pmevcntr_23,
+		read_pmevcntr_24,
+		read_pmevcntr_25,
+		read_pmevcntr_26,
+		read_pmevcntr_27,
+		read_pmevcntr_28,
+		read_pmevcntr_29,
+		read_pmevcntr_30,
+		read_pmccntr
+	};
+
+	if (counter < ARRAY_SIZE(read_f))
+		return (read_f[counter])();
+
+	return 0;
+}
+
+static u64 read_timestamp(void) { return read_sysreg(cntvct_el0); }
+
 #else
 static u64 read_perf_counter(unsigned int counter) { return 0; }
 static u64 read_timestamp(void) { return 0; }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 7/9] perf: arm64: Add test for userspace counter access on heterogeneous systems
  2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
                   ` (5 preceding siblings ...)
  2020-08-28 20:56 ` [PATCH v2 6/9] libperf: Add arm64 support to perf_mmap__read_self() Rob Herring
@ 2020-08-28 20:56 ` Rob Herring
  2020-08-28 20:56 ` [PATCH v2 8/9] Documentation: arm64: Document PMU counters access from userspace Rob Herring
  2020-08-28 20:56 ` [PATCH v2 9/9] perf: Remove x86 specific rdpmc test Rob Herring
  8 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

On heterogeneous systems, userspace counter access only works with some
restrictions. The userspace process must be pinned to a homogeneous
subset of CPUs and must open the corresponding PMU for those CPUs. This
commit adds a test implementing these requirements.
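
The pinning requirement can be sketched on its own as below. This is only
an illustration under assumptions: pin_to_cpus() and first_allowed_cpu()
are hypothetical helpers, and the CPU list is supplied by the caller,
whereas the test in this patch derives it from the PMU's cpumask. Only
the glibc sched_setaffinity()/sched_getaffinity() calls and the CPU_*()
macros are real interfaces.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* CPU_SET() and sched_setaffinity() in glibc */
#endif
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Pin the calling thread to the given CPUs. Returns 0 on success, -1 on error. */
static int pin_to_cpus(const int *cpus, size_t ncpus)
{
	cpu_set_t set;
	size_t i;

	assert(cpus != NULL && ncpus > 0);

	CPU_ZERO(&set);
	for (i = 0; i < ncpus; i++)
		CPU_SET(cpus[i], &set);

	return sched_setaffinity(0, sizeof(set), &set);
}

/* Return the lowest CPU the calling thread may currently run on, or -1. */
static int first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	if (sched_getaffinity(0, sizeof(set), &set) < 0)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;

	return -1;
}
```

A caller would pass the CPUs that belong to one PMU, then open the event
with attr.type set to that PMU's type before reading counters.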

Signed-off-by: Rob Herring <robh@kernel.org>
---
v2:
- Drop all but heterogeneous test as others covered by libperf tests
- Rework to use libperf
---
 tools/perf/arch/arm64/include/arch-tests.h |   7 +
 tools/perf/arch/arm64/tests/Build          |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
 tools/perf/arch/arm64/tests/user-events.c  | 170 +++++++++++++++++++++
 4 files changed, 182 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..380ad34a3f09 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,18 @@
 #ifndef ARCH_TESTS_H
 #define ARCH_TESTS_H
 
+#include <linux/compiler.h>
+
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
+int test__arch_unwind_sample(struct perf_sample *sample,
+			     struct thread *thread);
 #endif
 
 extern struct test arch_tests[];
+int test__rd_pinned(struct test __maybe_unused *test,
+		       int __maybe_unused subtest);
+
 
 #endif
diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build
index a61c06bdb757..3f9a20c17fc6 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,5 @@
 perf-y += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
+perf-y += user-events.o
 perf-y += arch-tests.o
diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c
index 5b1543c98022..80ce7bd3c16d 100644
--- a/tools/perf/arch/arm64/tests/arch-tests.c
+++ b/tools/perf/arch/arm64/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
 		.func = test__dwarf_unwind,
 	},
 #endif
+	{
+		.desc = "Pinned CPU user counter access",
+		.func = test__rd_pinned,
+	},
 	{
 		.func = NULL,
 	},
diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c
new file mode 100644
index 000000000000..9cf30adf39d9
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/user-events.c
@@ -0,0 +1,170 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <unistd.h>
+#include <sched.h>
+#include <cpumap.h>
+
+#include <perf/core.h>
+#include <perf/threadmap.h>
+#include <perf/evsel.h>
+
+#include "pmu.h"
+#include "debug.h"
+#include "tests/tests.h"
+#include "arch-tests.h"
+
+static int run_test(struct perf_evsel *evsel)
+{
+	int n;
+	volatile int tmp = 0;
+	u64 delta, i, loops = 1000;
+	struct perf_counts_values counts = { .val = 0 };
+
+	for (n = 0; n < 6; n++) {
+		u64 stamp, now;
+
+		perf_evsel__read(evsel, 0, 0, &counts);
+		stamp = counts.val;
+
+		for (i = 0; i < loops; i++)
+			tmp++;
+
+		perf_evsel__read(evsel, 0, 0, &counts);
+		now = counts.val;
+		loops *= 10;
+
+		delta = now - stamp;
+		pr_debug("%14d: %14llu\n", n, (long long)delta);
+
+		if (!delta)
+			break;
+	}
+	return delta ? 0 : -1;
+}
+
+static struct perf_pmu *pmu_for_cpu(int cpu)
+{
+	int acpu, idx;
+	struct perf_pmu *pmu = NULL;
+
+	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+		if (pmu->is_uncore)
+			continue;
+		perf_cpu_map__for_each_cpu(acpu, idx, pmu->cpus)
+			if (acpu == cpu)
+				return pmu;
+	}
+	return NULL;
+}
+
+static bool pmu_is_homogeneous(void)
+{
+	int core_cnt = 0;
+	struct perf_pmu *pmu = NULL;
+
+	while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+		if (!pmu->is_uncore && !perf_cpu_map__empty(pmu->cpus))
+			core_cnt++;
+	}
+	return core_cnt == 1;
+}
+
+static int libperf_print(enum libperf_print_level level,
+			 const char *fmt, va_list ap)
+{
+	(void)level;
+	return vfprintf(stderr, fmt, ap);
+}
+
+static struct perf_evsel *perf_init(struct perf_event_attr *attr)
+{
+	int err;
+	struct perf_thread_map *threads;
+	struct perf_evsel *evsel;
+
+	libperf_init(libperf_print);
+
+	threads = perf_thread_map__new_dummy();
+	if (!threads) {
+		pr_err("failed to create threads\n");
+		return NULL;
+	}
+
+	perf_thread_map__set_pid(threads, 0, 0);
+
+	evsel = perf_evsel__new(attr);
+	if (!evsel) {
+		pr_err("failed to create evsel\n");
+		goto out_thread;
+	}
+
+	err = perf_evsel__open(evsel, NULL, threads);
+	if (err) {
+		pr_err("failed to open evsel\n");
+		goto out_open;
+	}
+
+	if (!perf_evsel__mmap(evsel)) {
+		pr_err("failed to mmap evsel\n");
+		goto out_mmap;
+	}
+
+	return evsel;
+
+out_mmap:
+	perf_evsel__close(evsel);
+out_open:
+	perf_evsel__delete(evsel);
+out_thread:
+	perf_thread_map__put(threads);
+	return NULL;
+}
+
+int test__rd_pinned(struct test __maybe_unused *test,
+		    int __maybe_unused subtest)
+{
+	int cpu, cputmp, ret = -1;
+	struct perf_evsel *evsel;
+	struct perf_event_attr attr = {
+		.config = 0x8, /* Instruction count */
+		.config1 = 0, /* 32-bit counter */
+		.exclude_kernel = 1,
+	};
+	cpu_set_t cpu_set;
+	struct perf_pmu *pmu;
+
+	if (pmu_is_homogeneous())
+		return TEST_SKIP;
+
+	cpu = sched_getcpu();
+	pmu = pmu_for_cpu(cpu);
+	if (!pmu)
+		return -1;
+	attr.type = pmu->type;
+
+	CPU_ZERO(&cpu_set);
+	perf_cpu_map__for_each_cpu(cpu, cputmp, pmu->cpus)
+		CPU_SET(cpu, &cpu_set);
+	if (sched_setaffinity(0, sizeof(cpu_set), &cpu_set) < 0)
+		pr_err("Could not set affinity\n");
+
+	evsel = perf_init(&attr);
+	if (!evsel)
+		return -1;
+
+	perf_cpu_map__for_each_cpu(cpu, cputmp, pmu->cpus) {
+		CPU_ZERO(&cpu_set);
+		CPU_SET(cpu, &cpu_set);
+		if (sched_setaffinity(0, sizeof(cpu_set), &cpu_set) < 0)
+			pr_err("Could not set affinity\n");
+
+		pr_debug("Running on CPU %d\n", cpu);
+
+		ret = run_test(evsel);
+		if (ret)
+			break;
+	}
+
+	perf_evsel__close(evsel);
+	perf_evsel__delete(evsel);
+	return ret;
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 8/9] Documentation: arm64: Document PMU counters access from userspace
  2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
                   ` (6 preceding siblings ...)
  2020-08-28 20:56 ` [PATCH v2 7/9] perf: arm64: Add test for userspace counter access on heterogeneous systems Rob Herring
@ 2020-08-28 20:56 ` Rob Herring
  2020-08-28 20:56 ` [PATCH v2 9/9] perf: Remove x86 specific rdpmc test Rob Herring
  8 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

From: Raphael Gault <raphael.gault@arm.com>

Add a documentation file describing access to the PMU hardware
counters from userspace.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
v2:
  - Update links to test examples

Changes from Raphael's v4:
  - Convert to reST
  - Update chained event status
  - Add section for heterogeneous systems
---
 Documentation/arm64/index.rst                 |  1 +
 .../arm64/perf_counter_user_access.rst        | 56 +++++++++++++++++++
 2 files changed, 57 insertions(+)
 create mode 100644 Documentation/arm64/perf_counter_user_access.rst

diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
index d9665d83c53a..c712a08e7627 100644
--- a/Documentation/arm64/index.rst
+++ b/Documentation/arm64/index.rst
@@ -15,6 +15,7 @@ ARM64 Architecture
     legacy_instructions
     memory
     perf
+    perf_counter_user_access
     pointer-authentication
     silicon-errata
     sve
diff --git a/Documentation/arm64/perf_counter_user_access.rst b/Documentation/arm64/perf_counter_user_access.rst
new file mode 100644
index 000000000000..e49e141f10cc
--- /dev/null
+++ b/Documentation/arm64/perf_counter_user_access.rst
@@ -0,0 +1,56 @@
+=============================================
+Access to PMU hardware counter from userspace
+=============================================
+
+Overview
+--------
+The perf userspace tool relies on the PMU to monitor events. It offers an
+abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to have access to the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the overhead
+by directly accessing the registers without having to go through the kernel.
+
+How-to
+------
+The focus is on the Armv8 PMUv3 driver, which makes sure that access to the PMU
+registers is enabled and that userspace has access to the relevant
+information in order to use them.
+
+In order to have access to the hardware counter it is necessary to open the event
+using the perf tool interface: the sys_perf_event_open syscall returns a fd which
+can subsequently be used with the mmap syscall in order to retrieve a page of
+memory containing information about the event.
+The PMU driver uses this page to expose to the user the hardware counter's
+index and other necessary data. Using this index enables the user to access the
+PMU registers using the `mrs` instruction.
+
+The userspace access is supported in libperf using the perf_evsel__mmap()
+and perf_evsel__read() functions. See `tools/lib/perf/tests/test-evsel.c`_ for
+an example.
+
+About heterogeneous systems
+---------------------------
+On heterogeneous systems such as big.LITTLE, userspace PMU counter access can
+only be enabled when the tasks are pinned to a homogeneous subset of cores and
+the corresponding PMU instance is opened by specifying the 'type' attribute.
+The use of generic event types is not supported in this case.
+
+Have a look at `tools/perf/arch/arm64/tests/user-events.c`_ for an example. It
+can be run using the perf tool to check that the access to the registers works
+correctly from userspace:
+
+.. code-block:: sh
+
+  perf test -v user
+
+About chained events
+--------------------
+Chained events are not supported in userspace. If a 64-bit counter is requested,
+userspace access will only be enabled if the underlying counter is 64-bit.
+
+.. Links
+.. _tools/perf/arch/arm64/tests/user-events.c:
+   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
-- 
2.25.1
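
The How-to section in the documentation above (open the event, mmap the
fd, look at the index in the returned page) can be sketched without
libperf roughly as follows. This is an assumption-laden illustration, not
the documented interface of this series: probe_user_counter_access() is a
hypothetical helper, it uses the generic PERF_TYPE_HARDWARE type (which
the documentation says is not supported for user access on heterogeneous
systems), and it only reports whether the kernel advertises user access
rather than reading the counter.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* syscall() declaration in glibc */
#endif
#include <linux/perf_event.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Open a self-monitoring instruction counter and check the mmap'ed page.
 * Returns 1 if userspace access is advertised and a counter is assigned,
 * 0 if not, and -1 if the event could not be opened or mapped.
 */
static int probe_user_counter_access(void)
{
	struct perf_event_attr attr;
	struct perf_event_mmap_page *pc;
	long page_size = sysconf(_SC_PAGESIZE);
	int fd, usable;

	if (page_size <= 0)
		return -1;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_INSTRUCTIONS;
	attr.exclude_kernel = 1;

	/* pid = 0, cpu = -1: count for the calling thread on any CPU */
	fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
	if (fd < 0)
		return -1;

	pc = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED, fd, 0);
	if (pc == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* index is the hardware counter number + 1, or 0 if none is usable */
	usable = pc->cap_user_rdpmc && pc->index != 0;

	munmap(pc, (size_t)page_size);
	close(fd);
	return usable;
}
```

A return of -1 is common in restricted environments (for example when
perf_event_paranoid forbids the open), so callers should treat it as
"unknown" and fall back to read() on the fd.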


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 9/9] perf: Remove x86 specific rdpmc test
  2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
                   ` (7 preceding siblings ...)
  2020-08-28 20:56 ` [PATCH v2 8/9] Documentation: arm64: Document PMU counters access from userspace Rob Herring
@ 2020-08-28 20:56 ` Rob Herring
  2020-08-31  9:11   ` Jiri Olsa
  8 siblings, 1 reply; 18+ messages in thread
From: Rob Herring @ 2020-08-28 20:56 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Alexander Shishkin, linux-kernel,
	honnappa.nagarahalli, Raphael Gault, Jonathan Cameron,
	Namhyung Kim, linux-arm-kernel

Now that we have a common libperf based userspace counter read test
with the same functionality as the x86 specific rdpmc test, let's remove
it.

Signed-off-by: Rob Herring <robh@kernel.org>
---
This one is optional. On the plus side, it eliminates a copy of the read
loop. The main downside I see is losing the ability to test in 'perf test'.


 tools/perf/arch/x86/include/arch-tests.h |   1 -
 tools/perf/arch/x86/tests/Build          |   1 -
 tools/perf/arch/x86/tests/arch-tests.c   |   4 -
 tools/perf/arch/x86/tests/rdpmc.c        | 182 -----------------------
 4 files changed, 188 deletions(-)
 delete mode 100644 tools/perf/arch/x86/tests/rdpmc.c

diff --git a/tools/perf/arch/x86/include/arch-tests.h b/tools/perf/arch/x86/include/arch-tests.h
index c41c5affe4be..d9c32ba0cdac 100644
--- a/tools/perf/arch/x86/include/arch-tests.h
+++ b/tools/perf/arch/x86/include/arch-tests.h
@@ -6,7 +6,6 @@
 struct test;

 /* Tests */
-int test__rdpmc(struct test *test __maybe_unused, int subtest);
 int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest);
 int test__insn_x86(struct test *test __maybe_unused, int subtest);
 int test__intel_pt_pkt_decoder(struct test *test, int subtest);
diff --git a/tools/perf/arch/x86/tests/Build b/tools/perf/arch/x86/tests/Build
index 2997c506550c..e3b3cd3b40e8 100644
--- a/tools/perf/arch/x86/tests/Build
+++ b/tools/perf/arch/x86/tests/Build
@@ -2,7 +2,6 @@ perf-$(CONFIG_DWARF_UNWIND) += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o

 perf-y += arch-tests.o
-perf-y += rdpmc.o
 perf-y += perf-time-to-tsc.o
 perf-$(CONFIG_AUXTRACE) += insn-x86.o intel-pt-pkt-decoder-test.o
 perf-$(CONFIG_X86_64) += bp-modify.o
diff --git a/tools/perf/arch/x86/tests/arch-tests.c b/tools/perf/arch/x86/tests/arch-tests.c
index 6763135aec17..300e9954d530 100644
--- a/tools/perf/arch/x86/tests/arch-tests.c
+++ b/tools/perf/arch/x86/tests/arch-tests.c
@@ -4,10 +4,6 @@
 #include "arch-tests.h"

 struct test arch_tests[] = {
-	{
-		.desc = "x86 rdpmc",
-		.func = test__rdpmc,
-	},
 	{
 		.desc = "Convert perf time to TSC",
 		.func = test__perf_time_to_tsc,
diff --git a/tools/perf/arch/x86/tests/rdpmc.c b/tools/perf/arch/x86/tests/rdpmc.c
deleted file mode 100644
index 1ea916656a2d..000000000000
--- a/tools/perf/arch/x86/tests/rdpmc.c
+++ /dev/null
@@ -1,182 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <errno.h>
-#include <unistd.h>
-#include <stdlib.h>
-#include <signal.h>
-#include <sys/mman.h>
-#include <sys/types.h>
-#include <sys/wait.h>
-#include <linux/string.h>
-#include <linux/types.h>
-#include "perf-sys.h"
-#include "debug.h"
-#include "tests/tests.h"
-#include "cloexec.h"
-#include "event.h"
-#include <internal/lib.h> // page_size
-#include "arch-tests.h"
-
-static u64 rdpmc(unsigned int counter)
-{
-	unsigned int low, high;
-
-	asm volatile("rdpmc" : "=a" (low), "=d" (high) : "c" (counter));
-
-	return low | ((u64)high) << 32;
-}
-
-static u64 rdtsc(void)
-{
-	unsigned int low, high;
-
-	asm volatile("rdtsc" : "=a" (low), "=d" (high));
-
-	return low | ((u64)high) << 32;
-}
-
-static u64 mmap_read_self(void *addr)
-{
-	struct perf_event_mmap_page *pc = addr;
-	u32 seq, idx, time_mult = 0, time_shift = 0;
-	u64 count, cyc = 0, time_offset = 0, enabled, running, delta;
-
-	do {
-		seq = pc->lock;
-		barrier();
-
-		enabled = pc->time_enabled;
-		running = pc->time_running;
-
-		if (enabled != running) {
-			cyc = rdtsc();
-			time_mult = pc->time_mult;
-			time_shift = pc->time_shift;
-			time_offset = pc->time_offset;
-		}
-
-		idx = pc->index;
-		count = pc->offset;
-		if (idx)
-			count += rdpmc(idx - 1);
-
-		barrier();
-	} while (pc->lock != seq);
-
-	if (enabled != running) {
-		u64 quot, rem;
-
-		quot = (cyc >> time_shift);
-		rem = cyc & (((u64)1 << time_shift) - 1);
-		delta = time_offset + quot * time_mult +
-			((rem * time_mult) >> time_shift);
-
-		enabled += delta;
-		if (idx)
-			running += delta;
-
-		quot = count / running;
-		rem = count % running;
-		count = quot * enabled + (rem * enabled) / running;
-	}
-
-	return count;
-}
-
-/*
- * If the RDPMC instruction faults then signal this back to the test parent task:
- */
-static void segfault_handler(int sig __maybe_unused,
-			     siginfo_t *info __maybe_unused,
-			     void *uc __maybe_unused)
-{
-	exit(-1);
-}
-
-static int __test__rdpmc(void)
-{
-	volatile int tmp = 0;
-	u64 i, loops = 1000;
-	int n;
-	int fd;
-	void *addr;
-	struct perf_event_attr attr = {
-		.type = PERF_TYPE_HARDWARE,
-		.config = PERF_COUNT_HW_INSTRUCTIONS,
-		.exclude_kernel = 1,
-	};
-	u64 delta_sum = 0;
-        struct sigaction sa;
-	char sbuf[STRERR_BUFSIZE];
-
-	sigfillset(&sa.sa_mask);
-	sa.sa_sigaction = segfault_handler;
-	sa.sa_flags = 0;
-	sigaction(SIGSEGV, &sa, NULL);
-
-	fd = sys_perf_event_open(&attr, 0, -1, -1,
-				 perf_event_open_cloexec_flag());
-	if (fd < 0) {
-		pr_err("Error: sys_perf_event_open() syscall returned "
-		       "with %d (%s)\n", fd,
-		       str_error_r(errno, sbuf, sizeof(sbuf)));
-		return -1;
-	}
-
-	addr = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
-	if (addr == (void *)(-1)) {
-		pr_err("Error: mmap() syscall returned with (%s)\n",
-		       str_error_r(errno, sbuf, sizeof(sbuf)));
-		goto out_close;
-	}
-
-	for (n = 0; n < 6; n++) {
-		u64 stamp, now, delta;
-
-		stamp = mmap_read_self(addr);
-
-		for (i = 0; i < loops; i++)
-			tmp++;
-
-		now = mmap_read_self(addr);
-		loops *= 10;
-
-		delta = now - stamp;
-		pr_debug("%14d: %14Lu\n", n, (long long)delta);
-
-		delta_sum += delta;
-	}
-
-	munmap(addr, page_size);
-	pr_debug("   ");
-out_close:
-	close(fd);
-
-	if (!delta_sum)
-		return -1;
-
-	return 0;
-}
-
-int test__rdpmc(struct test *test __maybe_unused, int subtest __maybe_unused)
-{
-	int status = 0;
-	int wret = 0;
-	int ret;
-	int pid;
-
-	pid = fork();
-	if (pid < 0)
-		return -1;
-
-	if (!pid) {
-		ret = __test__rdpmc();
-
-		exit(ret);
-	}
-
-	wret = waitpid(pid, &status, 0);
-	if (wret < 0 || status)
-		return -1;
-
-	return 0;
-}
--
2.25.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 9/9] perf: Remove x86 specific rdpmc test
  2020-08-28 20:56 ` [PATCH v2 9/9] perf: Remove x86 specific rdpmc test Rob Herring
@ 2020-08-31  9:11   ` Jiri Olsa
  0 siblings, 0 replies; 18+ messages in thread
From: Jiri Olsa @ 2020-08-31  9:11 UTC (permalink / raw)
  To: Rob Herring
  Cc: Mark Rutland, Ian Rogers, Peter Zijlstra, Catalin Marinas,
	linux-kernel, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Raphael Gault, Ingo Molnar, honnappa.nagarahalli,
	Jonathan Cameron, Namhyung Kim, Will Deacon, linux-arm-kernel

On Fri, Aug 28, 2020 at 02:56:14PM -0600, Rob Herring wrote:
> Now that we have a common libperf based userspace counter read test
> with the same functionality as the x86 specific rdpmc test, let's remove
> it.
> 
> Signed-off-by: Rob Herring <robh@kernel.org>
> ---
> This one is optional. On the plus side, it eliminates a copy of the read
> loop. The main downside I see is losing the ability to test in 'perf test'.
> 
> 
>  tools/perf/arch/x86/include/arch-tests.h |   1 -
>  tools/perf/arch/x86/tests/Build          |   1 -
>  tools/perf/arch/x86/tests/arch-tests.c   |   4 -
>  tools/perf/arch/x86/tests/rdpmc.c        | 182 -----------------------
>  4 files changed, 188 deletions(-)
>  delete mode 100644 tools/perf/arch/x86/tests/rdpmc.c
> 
> diff --git a/tools/perf/arch/x86/include/arch-tests.h b/tools/perf/arch/x86/include/arch-tests.h
> index c41c5affe4be..d9c32ba0cdac 100644
> --- a/tools/perf/arch/x86/include/arch-tests.h
> +++ b/tools/perf/arch/x86/include/arch-tests.h
> @@ -6,7 +6,6 @@
>  struct test;
> 
>  /* Tests */
> -int test__rdpmc(struct test *test __maybe_unused, int subtest);

we don't currently run libperf tests as part of the perf test suite,
so until we do that, I'd rather not remove the tests.

feel free to add the code that runs libperf tests within 'perf test'
command ;-)

thanks,
jirka


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/9] libperf: Add support for user space counter access
  2020-08-28 20:56 ` [PATCH v2 5/9] libperf: Add support for user space counter access Rob Herring
@ 2020-08-31  9:11   ` Jiri Olsa
  2020-09-02 16:58     ` Rob Herring
  2020-08-31  9:11   ` Jiri Olsa
  2020-09-02 18:07   ` Ian Rogers
  2 siblings, 1 reply; 18+ messages in thread
From: Jiri Olsa @ 2020-08-31  9:11 UTC (permalink / raw)
  To: Rob Herring
  Cc: Mark Rutland, Ian Rogers, Peter Zijlstra, Catalin Marinas,
	linux-kernel, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Raphael Gault, Ingo Molnar, honnappa.nagarahalli,
	Jonathan Cameron, Namhyung Kim, Will Deacon, linux-arm-kernel

On Fri, Aug 28, 2020 at 02:56:10PM -0600, Rob Herring wrote:

SNIP

>  #endif /* __LIBPERF_INTERNAL_MMAP_H */
> diff --git a/tools/lib/perf/include/perf/evsel.h b/tools/lib/perf/include/perf/evsel.h
> index c82ec39a4ad0..6d0da962870c 100644
> --- a/tools/lib/perf/include/perf/evsel.h
> +++ b/tools/lib/perf/include/perf/evsel.h
> @@ -27,6 +27,7 @@ LIBPERF_API int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *
>  				 struct perf_thread_map *threads);
>  LIBPERF_API void perf_evsel__close(struct perf_evsel *evsel);
>  LIBPERF_API void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu);
> +LIBPERF_API void *perf_evsel__mmap(struct perf_evsel *evsel);
>  LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
>  				 struct perf_counts_values *count);
>  LIBPERF_API int perf_evsel__enable(struct perf_evsel *evsel);
> diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map
> index 7be1af8a546c..733a0647be8b 100644
> --- a/tools/lib/perf/libperf.map
> +++ b/tools/lib/perf/libperf.map
> @@ -23,6 +23,7 @@ LIBPERF_0.0.1 {
>  		perf_evsel__disable;
>  		perf_evsel__open;
>  		perf_evsel__close;
> +		perf_evsel__mmap;
>  		perf_evsel__read;
>  		perf_evsel__cpus;
>  		perf_evsel__threads;

please put perf_evsel__mmap changes into separate patch

SNIP

> +int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count)
> +{
> +	struct perf_event_mmap_page *pc = map->base;
> +	u32 seq, idx, time_mult = 0, time_shift = 0;
> +	u64 cnt, cyc = 0, time_offset = 0, time_cycles = 0, time_mask = ~0ULL;
> +
> +	BUG_ON(!pc);
> +
> +	if (!pc->cap_user_rdpmc)
> +		return -1;
> +
> +	do {
> +		seq = READ_ONCE(pc->lock);
> +		barrier();
> +
> +		count->ena = READ_ONCE(pc->time_enabled);
> +		count->run = READ_ONCE(pc->time_running);
> +
> +		if (pc->cap_user_time && count->ena != count->run) {
> +			cyc = read_timestamp();
> +			time_mult = READ_ONCE(pc->time_mult);
> +			time_shift = READ_ONCE(pc->time_shift);
> +			time_offset = READ_ONCE(pc->time_offset);
> +
> +			if (pc->cap_user_time_short) {
> +				time_cycles = READ_ONCE(pc->time_cycles);
> +				time_mask = READ_ONCE(pc->time_mask);
> +			}
> +		}
> +
> +		idx = READ_ONCE(pc->index);
> +		cnt = READ_ONCE(pc->offset);
> +		if (pc->cap_user_rdpmc && idx) {

no need to check pc->cap_user_rdpmc again

> +static int test_stat_user_read(int event)
> +{
> +	struct perf_counts_values counts = { .val = 0 };
> +	struct perf_thread_map *threads;
> +	struct perf_evsel *evsel;
> +	struct perf_event_mmap_page *pc;
> +	struct perf_event_attr attr = {
> +		.type	= PERF_TYPE_HARDWARE,
> +		.config	= event,
> +	};
> +	int err, i;
> +
> +	threads = perf_thread_map__new_dummy();
> +	__T("failed to create threads", threads);
> +
> +	perf_thread_map__set_pid(threads, 0, 0);
> +
> +	evsel = perf_evsel__new(&attr);
> +	__T("failed to create evsel", evsel);
> +
> +	err = perf_evsel__open(evsel, NULL, threads);
> +	__T("failed to open evsel", err == 0);
> +
> +	pc = perf_evsel__mmap(evsel);
> +	__T("failed to mmap evsel", pc);
> +
> +#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
> +	__T("userspace counter access not supported", pc->cap_user_rdpmc);
> +	__T("userspace counter access not enabled", pc->index);
> +	__T("userspace counter width not set", pc->pmc_width >= 32);
> +#endif
> +
> +	perf_evsel__read(evsel, 0, 0, &counts);
> +	__T("failed to read value for evsel", counts.val != 0);
> +
> +	fputs("\n", stderr);
> +	for (i = 0; i < 5; i++) {
> +		volatile int count = 0x10000 << i;
> +		__u64 start, end, last = 0;
> +
> +		fprintf(stderr, "\tloop = %u, ", count);

we should add support to display verbose output for tests,
because right now this breaks the output:

- running test-cpumap.c...OK
- running test-threadmap.c...OK
- running test-evlist.c...OK
- running test-evsel.c...
        loop = 65536, count = 328035
        loop = 131072, count = 655715
        loop = 262144, count = 1311075
        loop = 524288, count = 2627060
        loop = 1048576, count = 5253540

        loop = 65536, count = 327594
        loop = 131072, count = 659930
        loop = 262144, count = 1378892
        loop = 524288, count = 2664341
        loop = 1048576, count = 5365682
OK

but we can do it in separate change later

thanks,
jirka


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/9] libperf: Add support for user space counter access
  2020-08-28 20:56 ` [PATCH v2 5/9] libperf: Add support for user space counter access Rob Herring
  2020-08-31  9:11   ` Jiri Olsa
@ 2020-08-31  9:11   ` Jiri Olsa
  2020-09-02 17:01     ` Rob Herring
  2020-09-02 18:07   ` Ian Rogers
  2 siblings, 1 reply; 18+ messages in thread
From: Jiri Olsa @ 2020-08-31  9:11 UTC (permalink / raw)
  To: Rob Herring
  Cc: Mark Rutland, Ian Rogers, Peter Zijlstra, Catalin Marinas,
	linux-kernel, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Raphael Gault, Ingo Molnar, honnappa.nagarahalli,
	Jonathan Cameron, Namhyung Kim, Will Deacon, linux-arm-kernel

On Fri, Aug 28, 2020 at 02:56:10PM -0600, Rob Herring wrote:

SNIP

>  
> +void *perf_evsel__mmap(struct perf_evsel *evsel)
> +{
> +	int ret;
> +	struct perf_mmap *map;
> +	struct perf_mmap_param mp = {
> +		.mask = -1,
> +		.prot = PROT_READ | PROT_WRITE,
> +	};
> +
> +	if (FD(evsel, 0, 0) < 0)
> +		return NULL;
> +
> +	map = zalloc(sizeof(*map));
> +	if (!map)
> +		return NULL;
> +
> +	perf_mmap__init(map, NULL, false, NULL);
> +
> +	ret = perf_mmap__mmap(map, &mp, FD(evsel, 0, 0), 0);
> +	if (ret) {
> +		free(map);
> +		return NULL;
> +	}
> +
> +	evsel->mmap = map;
> +	return map->base;
> +}

so this only maps first page, I think we should use different
name and keep perf_evsel__mmap for some generic map with size

perf_evsel__mmap_user
perf_evsel__mmap_zero
...?

not sure.. or we could add size argument

jirka



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/9] libperf: Add support for user space counter access
  2020-08-31  9:11   ` Jiri Olsa
@ 2020-09-02 16:58     ` Rob Herring
  0 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2020-09-02 16:58 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Peter Zijlstra, Catalin Marinas,
	linux-kernel, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Raphael Gault, Ingo Molnar, Honnappa Nagarahalli,
	Jonathan Cameron, Namhyung Kim, Will Deacon,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Mon, Aug 31, 2020 at 3:11 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Aug 28, 2020 at 02:56:10PM -0600, Rob Herring wrote:
>
> SNIP


> > +int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count)
> > +{
> > +     struct perf_event_mmap_page *pc = map->base;
> > +     u32 seq, idx, time_mult = 0, time_shift = 0;
> > +     u64 cnt, cyc = 0, time_offset = 0, time_cycles = 0, time_mask = ~0ULL;
> > +
> > +     BUG_ON(!pc);
> > +
> > +     if (!pc->cap_user_rdpmc)
> > +             return -1;
> > +
> > +     do {
> > +             seq = READ_ONCE(pc->lock);
> > +             barrier();
> > +
> > +             count->ena = READ_ONCE(pc->time_enabled);
> > +             count->run = READ_ONCE(pc->time_running);
> > +
> > +             if (pc->cap_user_time && count->ena != count->run) {
> > +                     cyc = read_timestamp();
> > +                     time_mult = READ_ONCE(pc->time_mult);
> > +                     time_shift = READ_ONCE(pc->time_shift);
> > +                     time_offset = READ_ONCE(pc->time_offset);
> > +
> > +                     if (pc->cap_user_time_short) {
> > +                             time_cycles = READ_ONCE(pc->time_cycles);
> > +                             time_mask = READ_ONCE(pc->time_mask);
> > +                     }
> > +             }
> > +
> > +             idx = READ_ONCE(pc->index);
> > +             cnt = READ_ONCE(pc->offset);
> > +             if (pc->cap_user_rdpmc && idx) {
>
> no need to check pc->cap_user_rdpmc again

I was thinking cap_user_rdpmc could change, but I guess idx will
always be 0 in that case.
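
A minimal sketch of why the index check alone is enough inside the retry
loop, assuming (as discussed above) that the kernel reports index 0
whenever the counter cannot be read from userspace. The struct and
function names here are hypothetical stand-ins, not the real
perf_event_mmap_page or libperf API, and the hardware counter read is
left out:

```c
#include <stdint.h>

/* Hypothetical stand-in for the few mmap page fields used below. */
struct user_page_sketch {
	volatile uint32_t lock;   /* sequence count, changed by the writer */
	volatile uint32_t index;  /* hardware counter index + 1, 0 = unavailable */
	volatile int64_t offset;  /* value to add to the hardware counter */
};

/* Compiler-only barrier, like barrier() in the patch (GCC/Clang syntax). */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Take a consistent snapshot of index and offset.
 * Returns 0 on success, -1 if userspace access is not available (index == 0).
 */
static int read_snapshot(const struct user_page_sketch *pc,
			 uint32_t *idx, int64_t *offset)
{
	uint32_t seq;

	do {
		seq = pc->lock;
		compiler_barrier();

		*idx = pc->index;
		*offset = pc->offset;
		if (*idx == 0)
			return -1;	/* no second capability check needed */

		compiler_barrier();
	} while (pc->lock != seq);	/* retry if the writer updated the page */

	return 0;
}
```

The caller would fall back to read() on the event fd when this returns -1.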

> > +static int test_stat_user_read(int event)
> > +{
> > +     struct perf_counts_values counts = { .val = 0 };
> > +     struct perf_thread_map *threads;
> > +     struct perf_evsel *evsel;
> > +     struct perf_event_mmap_page *pc;
> > +     struct perf_event_attr attr = {
> > +             .type   = PERF_TYPE_HARDWARE,
> > +             .config = event,
> > +     };
> > +     int err, i;
> > +
> > +     threads = perf_thread_map__new_dummy();
> > +     __T("failed to create threads", threads);
> > +
> > +     perf_thread_map__set_pid(threads, 0, 0);
> > +
> > +     evsel = perf_evsel__new(&attr);
> > +     __T("failed to create evsel", evsel);
> > +
> > +     err = perf_evsel__open(evsel, NULL, threads);
> > +     __T("failed to open evsel", err == 0);
> > +
> > +     pc = perf_evsel__mmap(evsel);
> > +     __T("failed to mmap evsel", pc);
> > +
> > +#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
> > +     __T("userspace counter access not supported", pc->cap_user_rdpmc);
> > +     __T("userspace counter access not enabled", pc->index);
> > +     __T("userspace counter width not set", pc->pmc_width >= 32);
> > +#endif
> > +
> > +     perf_evsel__read(evsel, 0, 0, &counts);
> > +     __T("failed to read value for evsel", counts.val != 0);
> > +
> > +     fputs("\n", stderr);
> > +     for (i = 0; i < 5; i++) {
> > +             volatile int count = 0x10000 << i;
> > +             __u64 start, end, last = 0;
> > +
> > +             fprintf(stderr, "\tloop = %u, ", count);
>
> we should add support to display verbose output for tests,
> because right now this breaks the output:
>
> - running test-cpumap.c...OK
> - running test-threadmap.c...OK
> - running test-evlist.c...OK
> - running test-evsel.c...
>         loop = 65536, count = 328035
>         loop = 131072, count = 655715
>         loop = 262144, count = 1311075
>         loop = 524288, count = 2627060
>         loop = 1048576, count = 5253540
>
>         loop = 65536, count = 327594
>         loop = 131072, count = 659930
>         loop = 262144, count = 1378892
>         loop = 524288, count = 2664341
>         loop = 1048576, count = 5365682
> OK
>
> but we can do it in separate change later

Would you like me to just comment this out then for now?

Rob


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/9] libperf: Add support for user space counter access
  2020-08-31  9:11   ` Jiri Olsa
@ 2020-09-02 17:01     ` Rob Herring
  0 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2020-09-02 17:01 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Mark Rutland, Ian Rogers, Peter Zijlstra, Catalin Marinas,
	linux-kernel, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Raphael Gault, Ingo Molnar, Honnappa Nagarahalli,
	Jonathan Cameron, Namhyung Kim, Will Deacon,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Mon, Aug 31, 2020 at 3:11 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Aug 28, 2020 at 02:56:10PM -0600, Rob Herring wrote:
>
> SNIP
>
> >
> > +void *perf_evsel__mmap(struct perf_evsel *evsel)
> > +{
> > +     int ret;
> > +     struct perf_mmap *map;
> > +     struct perf_mmap_param mp = {
> > +             .mask = -1,
> > +             .prot = PROT_READ | PROT_WRITE,
> > +     };
> > +
> > +     if (FD(evsel, 0, 0) < 0)
> > +             return NULL;
> > +
> > +     map = zalloc(sizeof(*map));
> > +     if (!map)
> > +             return NULL;
> > +
> > +     perf_mmap__init(map, NULL, false, NULL);
> > +
> > +     ret = perf_mmap__mmap(map, &mp, FD(evsel, 0, 0), 0);
> > +     if (ret) {
> > +             free(map);
> > +             return NULL;
> > +     }
> > +
> > +     evsel->mmap = map;
> > +     return map->base;
> > +}
>
> so this only maps first page, I think we should use different
> name and keep perf_evsel__mmap for some generic map with size
>
> perf_evsel__mmap_user
> perf_evsel__mmap_zero
> ...?
>
> not sure.. or we could add size argument

Adding a size arg is simple enough to do and saves the hard naming problem. :)
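
As a rough sketch of what a size argument implies for the mapping length,
assuming the usual perf ring buffer layout of one metadata page followed
by a power-of-two number of data pages. The helper name and its signature
are hypothetical, not part of libperf:

```c
#include <stddef.h>

/*
 * Hypothetical helper: bytes to mmap for one metadata page plus `pages`
 * data pages. `pages` must be 0 (metadata page only, which is what the
 * patch above maps) or a power of two; returns 0 for anything else.
 */
static size_t evsel_mmap_len(unsigned int pages, size_t page_size)
{
	if (pages & (pages - 1))
		return 0;	/* not 0 and not a power of two */

	return ((size_t)pages + 1) * page_size;
}
```

With pages == 0 this yields a single page, matching the behaviour of the
posted perf_evsel__mmap().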

Rob


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/9] libperf: Add support for user space counter access
  2020-08-28 20:56 ` [PATCH v2 5/9] libperf: Add support for user space counter access Rob Herring
  2020-08-31  9:11   ` Jiri Olsa
  2020-08-31  9:11   ` Jiri Olsa
@ 2020-09-02 18:07   ` Ian Rogers
  2020-09-02 19:48     ` Rob Herring
  2 siblings, 1 reply; 18+ messages in thread
From: Ian Rogers @ 2020-09-02 18:07 UTC (permalink / raw)
  To: Rob Herring
  Cc: Mark Rutland, Will Deacon, Peter Zijlstra, Catalin Marinas, LKML,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Raphael Gault,
	Ingo Molnar, honnappa.nagarahalli, Jonathan Cameron,
	Namhyung Kim, Jiri Olsa, Linux ARM

On Fri, Aug 28, 2020 at 1:56 PM Rob Herring <robh@kernel.org> wrote:
>
> x86 and arm64 can both support direct access of event counters in
> userspace. The access sequence is less than trivial and currently exists
> in perf test code (tools/perf/arch/x86/tests/rdpmc.c) with copies in
> projects such as PAPI and libpfm4.
>
> In order to support userspace access, an event must be mmapped. While
> there's already mmap support for evlist, the usecase is a bit different
> than the self monitoring with userspace access. So let's add a new
> perf_evsel__mmap() function to mmap an evsel. This allows implementing
> userspace access as a fastpath for perf_evsel__read().
>
> The mmapped address is returned by perf_evsel__mmap() primarily for
> users/tests to check if userspace access is enabled.
>
> Signed-off-by: Rob Herring <robh@kernel.org>
> ---
>  tools/lib/perf/Documentation/libperf.txt |  1 +
>  tools/lib/perf/evsel.c                   | 33 +++++++++
>  tools/lib/perf/include/internal/evsel.h  |  2 +
>  tools/lib/perf/include/internal/mmap.h   |  3 +
>  tools/lib/perf/include/perf/evsel.h      |  1 +
>  tools/lib/perf/libperf.map               |  1 +
>  tools/lib/perf/mmap.c                    | 90 ++++++++++++++++++++++++
>  tools/lib/perf/tests/test-evsel.c        | 64 +++++++++++++++++
>  8 files changed, 195 insertions(+)
>
> diff --git a/tools/lib/perf/Documentation/libperf.txt b/tools/lib/perf/Documentation/libperf.txt
> index 0c74c30ed23a..ca7478acc97c 100644
> --- a/tools/lib/perf/Documentation/libperf.txt
> +++ b/tools/lib/perf/Documentation/libperf.txt
> @@ -136,6 +136,7 @@ SYNOPSIS
>                         struct perf_thread_map *threads);
>    void perf_evsel__close(struct perf_evsel *evsel);
>    void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu);
> +  void *perf_evsel__mmap(struct perf_evsel *evsel);
>    int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
>                         struct perf_counts_values *count);
>    int perf_evsel__enable(struct perf_evsel *evsel);
> diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
> index 4dc06289f4c7..b0c94ef4d9b6 100644
> --- a/tools/lib/perf/evsel.c
> +++ b/tools/lib/perf/evsel.c
> @@ -11,10 +11,12 @@
>  #include <stdlib.h>
>  #include <internal/xyarray.h>
>  #include <internal/cpumap.h>
> +#include <internal/mmap.h>
>  #include <internal/threadmap.h>
>  #include <internal/lib.h>
>  #include <linux/string.h>
>  #include <sys/ioctl.h>
> +#include <sys/mman.h>
>
>  void perf_evsel__init(struct perf_evsel *evsel, struct perf_event_attr *attr)
>  {
> @@ -156,6 +158,34 @@ void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu)
>         perf_evsel__close_fd_cpu(evsel, cpu);
>  }
>
> +void *perf_evsel__mmap(struct perf_evsel *evsel)
> +{
> +       int ret;
> +       struct perf_mmap *map;
> +       struct perf_mmap_param mp = {
> +               .mask = -1,
> +               .prot = PROT_READ | PROT_WRITE,
> +       };
> +
> +       if (FD(evsel, 0, 0) < 0)
> +               return NULL;
> +
> +       map = zalloc(sizeof(*map));
> +       if (!map)
> +               return NULL;
> +
> +       perf_mmap__init(map, NULL, false, NULL);
> +
> +       ret = perf_mmap__mmap(map, &mp, FD(evsel, 0, 0), 0);
> +       if (ret) {
> +               free(map);
> +               return NULL;
> +       }
> +
> +       evsel->mmap = map;
> +       return map->base;
> +}
> +
>  int perf_evsel__read_size(struct perf_evsel *evsel)
>  {
>         u64 read_format = evsel->attr.read_format;
> @@ -191,6 +221,9 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
>         if (FD(evsel, cpu, thread) < 0)
>                 return -EINVAL;
>
> +       if (evsel->mmap && !perf_mmap__read_self(evsel->mmap, count))
> +               return 0;
> +
>         if (readn(FD(evsel, cpu, thread), count->values, size) <= 0)
>                 return -errno;
>
> diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/include/internal/evsel.h
> index 1ffd083b235e..a7985dbb68ff 100644
> --- a/tools/lib/perf/include/internal/evsel.h
> +++ b/tools/lib/perf/include/internal/evsel.h
> @@ -9,6 +9,7 @@
>
>  struct perf_cpu_map;
>  struct perf_thread_map;
> +struct perf_mmap;
>  struct xyarray;
>
>  /*
> @@ -40,6 +41,7 @@ struct perf_evsel {
>         struct perf_cpu_map     *cpus;
>         struct perf_cpu_map     *own_cpus;
>         struct perf_thread_map  *threads;
> +       struct perf_mmap        *mmap;
>         struct xyarray          *fd;
>         struct xyarray          *sample_id;
>         u64                     *id;
> diff --git a/tools/lib/perf/include/internal/mmap.h b/tools/lib/perf/include/internal/mmap.h
> index be7556e0a2b2..5e3422f40ed5 100644
> --- a/tools/lib/perf/include/internal/mmap.h
> +++ b/tools/lib/perf/include/internal/mmap.h
> @@ -11,6 +11,7 @@
>  #define PERF_SAMPLE_MAX_SIZE (1 << 16)
>
>  struct perf_mmap;
> +struct perf_counts_values;
>
>  typedef void (*libperf_unmap_cb_t)(struct perf_mmap *map);
>
> @@ -52,4 +53,6 @@ void perf_mmap__put(struct perf_mmap *map);
>
>  u64 perf_mmap__read_head(struct perf_mmap *map);
>
> +int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count);
> +
>  #endif /* __LIBPERF_INTERNAL_MMAP_H */
> diff --git a/tools/lib/perf/include/perf/evsel.h b/tools/lib/perf/include/perf/evsel.h
> index c82ec39a4ad0..6d0da962870c 100644
> --- a/tools/lib/perf/include/perf/evsel.h
> +++ b/tools/lib/perf/include/perf/evsel.h
> @@ -27,6 +27,7 @@ LIBPERF_API int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *
>                                  struct perf_thread_map *threads);
>  LIBPERF_API void perf_evsel__close(struct perf_evsel *evsel);
>  LIBPERF_API void perf_evsel__close_cpu(struct perf_evsel *evsel, int cpu);
> +LIBPERF_API void *perf_evsel__mmap(struct perf_evsel *evsel);
>  LIBPERF_API int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
>                                  struct perf_counts_values *count);
>  LIBPERF_API int perf_evsel__enable(struct perf_evsel *evsel);
> diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map
> index 7be1af8a546c..733a0647be8b 100644
> --- a/tools/lib/perf/libperf.map
> +++ b/tools/lib/perf/libperf.map
> @@ -23,6 +23,7 @@ LIBPERF_0.0.1 {
>                 perf_evsel__disable;
>                 perf_evsel__open;
>                 perf_evsel__close;
> +               perf_evsel__mmap;
>                 perf_evsel__read;
>                 perf_evsel__cpus;
>                 perf_evsel__threads;
> diff --git a/tools/lib/perf/mmap.c b/tools/lib/perf/mmap.c
> index 79d5ed6c38cc..cb07969cfdbf 100644
> --- a/tools/lib/perf/mmap.c
> +++ b/tools/lib/perf/mmap.c
> @@ -8,9 +8,11 @@
>  #include <linux/perf_event.h>
>  #include <perf/mmap.h>
>  #include <perf/event.h>
> +#include <perf/evsel.h>
>  #include <internal/mmap.h>
>  #include <internal/lib.h>
>  #include <linux/kernel.h>
> +#include <linux/math64.h>
>  #include "internal.h"
>
>  void perf_mmap__init(struct perf_mmap *map, struct perf_mmap *prev,
> @@ -273,3 +275,91 @@ union perf_event *perf_mmap__read_event(struct perf_mmap *map)
>
>         return event;
>  }
> +
> +#if defined(__i386__) || defined(__x86_64__)
> +static u64 read_perf_counter(unsigned int counter)
> +{
> +       unsigned int low, high;
> +
> +       asm volatile("rdpmc" : "=a" (low), "=d" (high) : "c" (counter));
> +
> +       return low | ((u64)high) << 32;
> +}
> +
> +static u64 read_timestamp(void)
> +{
> +       unsigned int low, high;
> +
> +       asm volatile("rdtsc" : "=a" (low), "=d" (high));
> +
> +       return low | ((u64)high) << 32;
> +}
> +#else
> +static u64 read_perf_counter(unsigned int counter) { return 0; }
> +static u64 read_timestamp(void) { return 0; }
> +#endif
> +
> +int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count)
> +{
> +       struct perf_event_mmap_page *pc = map->base;
> +       u32 seq, idx, time_mult = 0, time_shift = 0;
> +       u64 cnt, cyc = 0, time_offset = 0, time_cycles = 0, time_mask = ~0ULL;
> +
> +       BUG_ON(!pc);
> +
> +       if (!pc->cap_user_rdpmc)
> +               return -1;
> +
> +       do {
> +               seq = READ_ONCE(pc->lock);
> +               barrier();
> +
> +               count->ena = READ_ONCE(pc->time_enabled);
> +               count->run = READ_ONCE(pc->time_running);
> +
> +               if (pc->cap_user_time && count->ena != count->run) {
> +                       cyc = read_timestamp();
> +                       time_mult = READ_ONCE(pc->time_mult);
> +                       time_shift = READ_ONCE(pc->time_shift);
> +                       time_offset = READ_ONCE(pc->time_offset);
> +
> +                       if (pc->cap_user_time_short) {
> +                               time_cycles = READ_ONCE(pc->time_cycles);
> +                               time_mask = READ_ONCE(pc->time_mask);
> +                       }
> +               }
> +
> +               idx = READ_ONCE(pc->index);
> +               cnt = READ_ONCE(pc->offset);
> +               if (pc->cap_user_rdpmc && idx) {
> +                       u64 evcnt = read_perf_counter(idx - 1);
> +                       u16 width = READ_ONCE(pc->pmc_width);
> +
> +                       evcnt <<= 64 - width;
> +                       evcnt >>= 64 - width;
> +                       cnt += evcnt;
> +               } else
> +                       return -1;
> +
> +               barrier();
> +       } while (READ_ONCE(pc->lock) != seq);
> +
> +       if (count->ena != count->run) {

There's an existing bug here that I tried to resolve in this patch:
https://lore.kernel.org/lkml/CAP-5=fVRdqvswtyQMg5cB+ntTGda+SAYskjTQednEH-AeZo13g@mail.gmail.com/
Due to multiplexing, enabled may be > 0 but run == 0 and the divide
below can end up with divide by zero.
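
A minimal sketch of the kind of guard this calls for, not the fix from
the linked patch: the helper name is hypothetical, and it assumes a
compiler with unsigned __int128 (GCC or Clang on a 64-bit target) in
place of mul_u64_u64_div64():

```c
#include <stdint.h>

/*
 * Hypothetical helper: scale a raw count by enabled/running time, as is
 * done for multiplexed events. Returns -1 and leaves *scaled untouched
 * when the event never ran (run == 0), instead of dividing by zero.
 */
static int scale_count(uint64_t cnt, uint64_t ena, uint64_t run,
		       uint64_t *scaled)
{
	if (run == 0)
		return -1;

	if (ena == run) {
		*scaled = cnt;	/* no multiplexing, nothing to scale */
		return 0;
	}

	/* 128-bit intermediate so cnt * ena cannot overflow. */
	*scaled = (uint64_t)(((unsigned __int128)cnt * ena) / run);
	return 0;
}
```

A caller can then report the raw enabled and running times alongside the
error and let the user decide how to treat an event that was never
scheduled.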

I like the idea of this code being in a library, there's an intent
that the perf_event.h and test code be copy-paste-able, but there is
some pre-existing divergence. It would be nice if this code could be
closer to the sample code in both the test and perf_event.h.

As per the change above, I think running and enabled times need to be
out arguments.

Thanks,
Ian

> +               u64 delta;
> +
> +               /* Adjust for cap_user_time_short, a nop if not */
> +               cyc = time_cycles + ((cyc - time_cycles) & time_mask);
> +
> +               delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift);
> +
> +               count->ena += delta;
> +               if (idx)
> +                       count->run += delta;
> +
> +               cnt = mul_u64_u64_div64(cnt, count->ena, count->run);
> +       }
> +
> +       count->val = cnt;
> +
> +       return 0;
> +}
> diff --git a/tools/lib/perf/tests/test-evsel.c b/tools/lib/perf/tests/test-evsel.c
> index 135722ac965b..fd637d23216b 100644
> --- a/tools/lib/perf/tests/test-evsel.c
> +++ b/tools/lib/perf/tests/test-evsel.c
> @@ -120,6 +120,68 @@ static int test_stat_thread_enable(void)
>         return 0;
>  }
>
> +static int test_stat_user_read(int event)
> +{
> +       struct perf_counts_values counts = { .val = 0 };
> +       struct perf_thread_map *threads;
> +       struct perf_evsel *evsel;
> +       struct perf_event_mmap_page *pc;
> +       struct perf_event_attr attr = {
> +               .type   = PERF_TYPE_HARDWARE,
> +               .config = event,
> +       };
> +       int err, i;
> +
> +       threads = perf_thread_map__new_dummy();
> +       __T("failed to create threads", threads);
> +
> +       perf_thread_map__set_pid(threads, 0, 0);
> +
> +       evsel = perf_evsel__new(&attr);
> +       __T("failed to create evsel", evsel);
> +
> +       err = perf_evsel__open(evsel, NULL, threads);
> +       __T("failed to open evsel", err == 0);
> +
> +       pc = perf_evsel__mmap(evsel);
> +       __T("failed to mmap evsel", pc);
> +
> +#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
> +       __T("userspace counter access not supported", pc->cap_user_rdpmc);
> +       __T("userspace counter access not enabled", pc->index);
> +       __T("userspace counter width not set", pc->pmc_width >= 32);
> +#endif
> +
> +       perf_evsel__read(evsel, 0, 0, &counts);
> +       __T("failed to read value for evsel", counts.val != 0);
> +
> +       fputs("\n", stderr);
> +       for (i = 0; i < 5; i++) {
> +               volatile int count = 0x10000 << i;
> +               __u64 start, end, last = 0;
> +
> +               fprintf(stderr, "\tloop = %u, ", count);
> +
> +               perf_evsel__read(evsel, 0, 0, &counts);
> +               start = counts.val;
> +
> +               while (count--) ;
> +
> +               perf_evsel__read(evsel, 0, 0, &counts);
> +               end = counts.val;
> +
> +               __T("invalid counter data", (end - start) > last);
> +               last = end - start;
> +               fprintf(stderr, "count = %llu\n", end - start);
> +       }
> +
> +       perf_evsel__close(evsel);
> +       perf_evsel__delete(evsel);
> +
> +       perf_thread_map__put(threads);
> +       return 0;
> +}
> +
>  int main(int argc, char **argv)
>  {
>         __T_START;
> @@ -129,6 +191,8 @@ int main(int argc, char **argv)
>         test_stat_cpu();
>         test_stat_thread();
>         test_stat_thread_enable();
> +       test_stat_user_read(PERF_COUNT_HW_INSTRUCTIONS);
> +       test_stat_user_read(PERF_COUNT_HW_CPU_CYCLES);
>
>         __T_END;
>         return 0;
> --
> 2.25.1
>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/9] libperf: Add support for user space counter access
  2020-09-02 18:07   ` Ian Rogers
@ 2020-09-02 19:48     ` Rob Herring
  2020-09-04  5:51       ` Ian Rogers
  0 siblings, 1 reply; 18+ messages in thread
From: Rob Herring @ 2020-09-02 19:48 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Mark Rutland, Will Deacon, Peter Zijlstra, Catalin Marinas, LKML,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Raphael Gault,
	Ingo Molnar, Honnappa Nagarahalli, Jonathan Cameron,
	Namhyung Kim, Jiri Olsa, Linux ARM

On Wed, Sep 2, 2020 at 12:07 PM Ian Rogers <irogers@google.com> wrote:
>
> On Fri, Aug 28, 2020 at 1:56 PM Rob Herring <robh@kernel.org> wrote:
> >
> > x86 and arm64 can both support direct access of event counters in
> > userspace. The access sequence is less than trivial and currently exists
> > in perf test code (tools/perf/arch/x86/tests/rdpmc.c) with copies in
> > projects such as PAPI and libpfm4.
> >
> > In order to support userspace access, an event must be mmapped. While
> > there's already mmap support for evlist, the usecase is a bit different
> > than the self monitoring with userspace access. So let's add a new
> > perf_evsel__mmap() function to mmap an evsel. This allows implementing
> > userspace access as a fastpath for perf_evsel__read().
> >
> > The mmapped address is returned by perf_evsel__mmap() primarily for
> > users/tests to check if userspace access is enabled.
> >
> > Signed-off-by: Rob Herring <robh@kernel.org>
> > ---

> > +int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count)
> > +{
> > +       struct perf_event_mmap_page *pc = map->base;
> > +       u32 seq, idx, time_mult = 0, time_shift = 0;
> > +       u64 cnt, cyc = 0, time_offset = 0, time_cycles = 0, time_mask = ~0ULL;
> > +
> > +       BUG_ON(!pc);
> > +
> > +       if (!pc->cap_user_rdpmc)
> > +               return -1;
> > +
> > +       do {
> > +               seq = READ_ONCE(pc->lock);
> > +               barrier();
> > +
> > +               count->ena = READ_ONCE(pc->time_enabled);
> > +               count->run = READ_ONCE(pc->time_running);
> > +
> > +               if (pc->cap_user_time && count->ena != count->run) {
> > +                       cyc = read_timestamp();
> > +                       time_mult = READ_ONCE(pc->time_mult);
> > +                       time_shift = READ_ONCE(pc->time_shift);
> > +                       time_offset = READ_ONCE(pc->time_offset);
> > +
> > +                       if (pc->cap_user_time_short) {
> > +                               time_cycles = READ_ONCE(pc->time_cycles);
> > +                               time_mask = READ_ONCE(pc->time_mask);
> > +                       }
> > +               }
> > +
> > +               idx = READ_ONCE(pc->index);
> > +               cnt = READ_ONCE(pc->offset);
> > +               if (pc->cap_user_rdpmc && idx) {
> > +                       u64 evcnt = read_perf_counter(idx - 1);
> > +                       u16 width = READ_ONCE(pc->pmc_width);
> > +
> > +                       evcnt <<= 64 - width;
> > +                       evcnt >>= 64 - width;
> > +                       cnt += evcnt;
> > +               } else
> > +                       return -1;
> > +
> > +               barrier();
> > +       } while (READ_ONCE(pc->lock) != seq);
> > +
> > +       if (count->ena != count->run) {
>
> There's an existing bug here that I tried to resolve in this patch:
> https://lore.kernel.org/lkml/CAP-5=fVRdqvswtyQMg5cB+ntTGda+SAYskjTQednEH-AeZo13g@mail.gmail.com/
> Due to multiplexing, enabled may be > 0 but run == 0 and the divide
> below can end up with divide by zero.

Yeah, I saw that, but didn't try to also fix that issue here.

> I like the idea of this code being in a library, there's an intent
> that the perf_event.h and test code be copy-paste-able, but there is
> some pre-existing divergence. It would be nice if this code could be
> closer to the sample code in both the test and perf_event.h.

The only way we get and keep all the versions of the code aligned is
removing the other copies. We should just remove the code comment from
perf_event.h IMO. If rdpmc.c is going to stick around given some
resistance to removing it, then perhaps it should be converted to use
libperf. At that point it could also be arch independent. Though I
don't like the idea of having the same test twice.

> As per the change above, I think running and enabled times need to be
> out arguments.

They are now in this version.

Rob


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/9] libperf: Add support for user space counter access
  2020-09-02 19:48     ` Rob Herring
@ 2020-09-04  5:51       ` Ian Rogers
  0 siblings, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2020-09-04  5:51 UTC (permalink / raw)
  To: Rob Herring
  Cc: Mark Rutland, Will Deacon, Peter Zijlstra, Catalin Marinas, LKML,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Raphael Gault,
	Ingo Molnar, Honnappa Nagarahalli, Jonathan Cameron,
	Namhyung Kim, Jiri Olsa, Linux ARM

On Wed, Sep 2, 2020 at 12:48 PM Rob Herring <robh@kernel.org> wrote:
>
> On Wed, Sep 2, 2020 at 12:07 PM Ian Rogers <irogers@google.com> wrote:
> >
> > On Fri, Aug 28, 2020 at 1:56 PM Rob Herring <robh@kernel.org> wrote:
> > >
> > > x86 and arm64 can both support direct access of event counters in
> > > userspace. The access sequence is less than trivial and currently exists
> > > in perf test code (tools/perf/arch/x86/tests/rdpmc.c) with copies in
> > > projects such as PAPI and libpfm4.
> > >
> > > In order to support userspace access, an event must be mmapped. While
> > > there's already mmap support for evlist, the usecase is a bit different
> > > than the self monitoring with userspace access. So let's add a new
> > > perf_evsel__mmap() function to mmap an evsel. This allows implementing
> > > userspace access as a fastpath for perf_evsel__read().
> > >
> > > The mmapped address is returned by perf_evsel__mmap() primarily for
> > > users/tests to check if userspace access is enabled.
> > >
> > > Signed-off-by: Rob Herring <robh@kernel.org>
> > > ---
>
> > > +int perf_mmap__read_self(struct perf_mmap *map, struct perf_counts_values *count)
> > > +{
> > > +       struct perf_event_mmap_page *pc = map->base;
> > > +       u32 seq, idx, time_mult = 0, time_shift = 0;
> > > +       u64 cnt, cyc = 0, time_offset = 0, time_cycles = 0, time_mask = ~0ULL;
> > > +
> > > +       BUG_ON(!pc);
> > > +
> > > +       if (!pc->cap_user_rdpmc)
> > > +               return -1;
> > > +
> > > +       do {
> > > +               seq = READ_ONCE(pc->lock);
> > > +               barrier();
> > > +
> > > +               count->ena = READ_ONCE(pc->time_enabled);
> > > +               count->run = READ_ONCE(pc->time_running);
> > > +
> > > +               if (pc->cap_user_time && count->ena != count->run) {
> > > +                       cyc = read_timestamp();
> > > +                       time_mult = READ_ONCE(pc->time_mult);
> > > +                       time_shift = READ_ONCE(pc->time_shift);
> > > +                       time_offset = READ_ONCE(pc->time_offset);
> > > +
> > > +                       if (pc->cap_user_time_short) {
> > > +                               time_cycles = READ_ONCE(pc->time_cycles);
> > > +                               time_mask = READ_ONCE(pc->time_mask);
> > > +                       }
> > > +               }
> > > +
> > > +               idx = READ_ONCE(pc->index);
> > > +               cnt = READ_ONCE(pc->offset);
> > > +               if (pc->cap_user_rdpmc && idx) {
> > > +                       u64 evcnt = read_perf_counter(idx - 1);
> > > +                       u16 width = READ_ONCE(pc->pmc_width);
> > > +
> > > +                       evcnt <<= 64 - width;
> > > +                       evcnt >>= 64 - width;
> > > +                       cnt += evcnt;
> > > +               } else
> > > +                       return -1;
> > > +
> > > +               barrier();
> > > +       } while (READ_ONCE(pc->lock) != seq);
> > > +
> > > +       if (count->ena != count->run) {
> >
> > There's an existing bug here that I tried to resolve in this patch:
> > https://lore.kernel.org/lkml/CAP-5=fVRdqvswtyQMg5cB+ntTGda+SAYskjTQednEH-AeZo13g@mail.gmail.com/
> > Due to multiplexing, enabled may be > 0 while run == 0, so the divide
> > below can end up dividing by zero.
>
> Yeah, I saw that, but didn't try to also fix that issue here.
>
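
A minimal sketch of the guard discussed above, assuming the scaled value
is cnt * ena / run and that returning 0 is acceptable when the event never
ran. scale_count() is a hypothetical helper for illustration only; it is
not the libperf implementation, nor the fix from the linked patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a libperf function): scale a raw count by
 * enabled/running time to estimate what it would have been without
 * multiplexing. Returns 0 when the event never ran (run == 0), since
 * no estimate is possible and the division would be by zero.
 */
static uint64_t scale_count(uint64_t cnt, uint64_t ena, uint64_t run)
{
	if (run == 0)
		return 0;
	if (ena == run)
		return cnt;	/* event was never multiplexed out */

	/*
	 * unsigned __int128 is a GCC/Clang extension; it is used here so
	 * that cnt * ena cannot overflow before the divide.
	 */
	return (uint64_t)(((unsigned __int128)cnt * ena) / run);
}
```

Whether 0, the unscaled count, or an error is the right result for
run == 0 is a policy choice for the caller; the sketch only shows that the
check has to happen before the divide.
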
> > I like the idea of this code being in a library, there's an intent
> > that the perf_event.h and test code be copy-paste-able, but there is
> > some pre-existing divergence. It would be nice if this code could be
> > closer to the sample code in both the test and perf_event.h.
>
> The only way we get and keep all the versions of the code aligned is
> removing the other copies. We should just remove the code comment from
> perf_event.h IMO. If rdpmc.c is going to stick around given some
> resistance to removing it, then perhaps it should be converted to use
> libperf. At that point it could also be arch independent. Though I
> don't like the idea of having the same test twice.

This makes sense to me; perhaps others could comment. Given the
cleaned-up API, fixing or deleting tools/perf/arch/x86/tests/rdpmc.c is
desirable (as your patch set does). I wondered if we could follow Jiri's
suggestion to run the lib/perf tests with perf test. One way would be
to have a shell script wrapper in tools/perf/tests/shell. It's not clear,
though, how a shell script there could depend on tests built elsewhere
in the tree.

> > As per the change above, I think running and enabled times need to be
> > out arguments.
>
> They are now in this version.

Sorry, my mistake. I'd missed that.

Thanks,
Ian

> Rob


end of thread, other threads:[~2020-09-04  5:53 UTC | newest]

Thread overview: 18+ messages
2020-08-28 20:56 [PATCH v2 0/9] libperf and arm64 userspace counter access support Rob Herring
2020-08-28 20:56 ` [PATCH v2 1/9] arm64: pmu: Add hook to handle pmu-related undefined instructions Rob Herring
2020-08-28 20:56 ` [PATCH v2 2/9] arm64: pmu: Add function implementation to update event index in userpage Rob Herring
2020-08-28 20:56 ` [PATCH v2 3/9] arm64: perf: Enable pmu counter direct access for perf event on armv8 Rob Herring
2020-08-28 20:56 ` [PATCH v2 4/9] tools/include: Add an initial math64.h Rob Herring
2020-08-28 20:56 ` [PATCH v2 5/9] libperf: Add support for user space counter access Rob Herring
2020-08-31  9:11   ` Jiri Olsa
2020-09-02 16:58     ` Rob Herring
2020-08-31  9:11   ` Jiri Olsa
2020-09-02 17:01     ` Rob Herring
2020-09-02 18:07   ` Ian Rogers
2020-09-02 19:48     ` Rob Herring
2020-09-04  5:51       ` Ian Rogers
2020-08-28 20:56 ` [PATCH v2 6/9] libperf: Add arm64 support to perf_mmap__read_self() Rob Herring
2020-08-28 20:56 ` [PATCH v2 7/9] perf: arm64: Add test for userspace counter access on heterogeneous systems Rob Herring
2020-08-28 20:56 ` [PATCH v2 8/9] Documentation: arm64: Document PMU counters access from userspace Rob Herring
2020-08-28 20:56 ` [PATCH v2 9/9] perf: Remove x86 specific rdpmc test Rob Herring
2020-08-31  9:11   ` Jiri Olsa
