linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/6] arm64: Enable access to pmu registers by user-space
@ 2019-05-16 13:21 Raphael Gault
  2019-05-16 13:21 ` [PATCH 1/6] perf: arm64: Compile tests unconditionally Raphael Gault
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Raphael Gault @ 2019-05-16 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mingo, peterz, catalin.marinas, will.deacon, acme, mark.rutland,
	Raphael Gault

The perf user-space tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is cpu-dependent. We want to allow userspace tools to have
access to the registers storing the hardware counters' values directly.
This targets specifically self-monitoring tasks in order to reduce the
overhead by directly accessing the registers without having to go
through the kernel.
In order to do this we need to setup the pmu so that it exposes its registers
to userspace access.

The first patch enables the tests for arm64 architecture in the perf
tool to be compiled systematically.

The second patch add a test to the perf tool so that we can test that the
access to the registers works correctly from userspace.

The third patch focuses on the armv8 pmuv3 PMU support and makes sure that
the access to the pmu registers is enable and that the userspace have
access to the relevent information in order to use them.

The fourth patch adds a hook to handle faulting access to the pmu
registers. This is necessary in order to have a coherent behaviour
on big.LITTLE environment.

The fifth patch put in place callbacks to enable access to the hardware
counters from userspace when a compatible event is opened using the perf
API.

Note:
This series is applied on top of this patch (already acked):
https://patchwork.kernel.org/patch/10896407/

*** BLURB HERE ***

Raphael Gault (6):
  perf: arm64: Compile tests unconditionally
  perf: arm64: Add test to check userspace access to hardware counters.
  arm64: pmu: Add function implementation to update event index in
    userpage.
  arm64: pmu: Add hook to handle pmu-related undefined instructions
  arm64: perf: Enable pmu counter direct access for perf event on armv8
  Documentation: arm64: Document PMU counters access from userspace

 .../arm64/pmu_counter_user_access.txt         |  42 +++
 arch/arm64/include/asm/mmu.h                  |   6 +
 arch/arm64/include/asm/mmu_context.h          |   2 +
 arch/arm64/include/asm/perf_event.h           |  14 +
 arch/arm64/kernel/perf_event.c                |  72 +++++
 drivers/perf/arm_pmu.c                        |  48 ++++
 include/linux/perf/arm_pmu.h                  |   2 +
 tools/perf/arch/arm64/Build                   |   2 +-
 tools/perf/arch/arm64/include/arch-tests.h    |   6 +
 tools/perf/arch/arm64/tests/Build             |   3 +-
 tools/perf/arch/arm64/tests/arch-tests.c      |   4 +
 tools/perf/arch/arm64/tests/user-events.c     | 255 ++++++++++++++++++
 12 files changed, 454 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/6] perf: arm64: Compile tests unconditionally
  2019-05-16 13:21 [RFC 0/6] arm64: Enable access to pmu registers by user-space Raphael Gault
@ 2019-05-16 13:21 ` Raphael Gault
  2019-05-16 13:21 ` [PATCH 2/6] perf: arm64: Add test to check userspace access to hardware counters Raphael Gault
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Raphael Gault @ 2019-05-16 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mingo, peterz, catalin.marinas, will.deacon, acme, mark.rutland,
	Raphael Gault

In order to subsequently add more tests for the arm64 architecture
we compile the tests target for arm64 systematically.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 tools/perf/arch/arm64/Build       | 2 +-
 tools/perf/arch/arm64/tests/Build | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build
index 36222e64bbf7..a7dd46a5b678 100644
--- a/tools/perf/arch/arm64/Build
+++ b/tools/perf/arch/arm64/Build
@@ -1,2 +1,2 @@
 perf-y += util/
-perf-$(CONFIG_DWARF_UNWIND) += tests/
+perf-y += tests/
diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build
index 41707fea74b3..a61c06bdb757 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,4 @@
 perf-y += regs_load.o
-perf-y += dwarf-unwind.o
+perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
 perf-y += arch-tests.o
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/6] perf: arm64: Add test to check userspace access to hardware counters.
  2019-05-16 13:21 [RFC 0/6] arm64: Enable access to pmu registers by user-space Raphael Gault
  2019-05-16 13:21 ` [PATCH 1/6] perf: arm64: Compile tests unconditionally Raphael Gault
@ 2019-05-16 13:21 ` Raphael Gault
  2019-05-16 13:21 ` [PATCH 3/6] arm64: pmu: Add function implementation to update event index in userpage Raphael Gault
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Raphael Gault @ 2019-05-16 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mingo, peterz, catalin.marinas, will.deacon, acme, mark.rutland,
	Raphael Gault

This test relies on the fact that the PMU registers are accessible
from userspace. It then uses the perf_event_mmap_page to retrieve
the counter index and access the underlying register.

This test uses sched_setaffinity(2) in order to run on all CPU and thus
check the behaviour of the PMU of all cpus in a big.LITTLE environment.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 tools/perf/arch/arm64/include/arch-tests.h |   6 +
 tools/perf/arch/arm64/tests/Build          |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
 tools/perf/arch/arm64/tests/user-events.c  | 255 +++++++++++++++++++++
 4 files changed, 266 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..a9b17ae0560b 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,17 @@
 #ifndef ARCH_TESTS_H
 #define ARCH_TESTS_H
 
+#define __maybe_unused	__attribute__((unused))
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
+int test__arch_unwind_sample(struct perf_sample *sample,
+			     struct thread *thread);
 #endif
 
 extern struct test arch_tests[];
+int test__rd_pmevcntr(struct test *test __maybe_unused,
+		      int subtest __maybe_unused);
+
 
 #endif
diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build
index a61c06bdb757..3f9a20c17fc6 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,5 @@
 perf-y += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
+perf-y += user-events.o
 perf-y += arch-tests.o
diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c
index 5b1543c98022..57df9b89dede 100644
--- a/tools/perf/arch/arm64/tests/arch-tests.c
+++ b/tools/perf/arch/arm64/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
 		.func = test__dwarf_unwind,
 	},
 #endif
+	{
+		.desc = "User counter access",
+		.func = test__rd_pmevcntr,
+	},
 	{
 		.func = NULL,
 	},
diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c
new file mode 100644
index 000000000000..958e4cd000c1
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/user-events.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <asm/bug.h>
+#include <errno.h>
+#include <unistd.h>
+#include <sched.h>
+#include <stdlib.h>
+#include <signal.h>
+#include <sys/mman.h>
+#include <sys/sysinfo.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <linux/types.h>
+#include "perf.h"
+#include "debug.h"
+#include "tests/tests.h"
+#include "cloexec.h"
+#include "util.h"
+#include "arch-tests.h"
+
+/*
+ * ARMv8 ARM reserves the following encoding for system registers:
+ * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview",
+ *  C5.2, version:ARM DDI 0487A.f)
+ *      [20-19] : Op0
+ *      [18-16] : Op1
+ *      [15-12] : CRn
+ *      [11-8]  : CRm
+ *      [7-5]   : Op2
+ */
+#define Op0_shift       19
+#define Op0_mask        0x3
+#define Op1_shift       16
+#define Op1_mask        0x7
+#define CRn_shift       12
+#define CRn_mask        0xf
+#define CRm_shift       8
+#define CRm_mask        0xf
+#define Op2_shift       5
+#define Op2_mask        0x7
+
+#define __stringify(x)	#x
+
+#define read_sysreg(r) ({						\
+	u64 __val;							\
+	asm volatile("mrs %0, " __stringify(r) : "=r" (__val));		\
+	__val;								\
+})
+
+#define PMEVCNTR_READ_CASE(idx)					\
+	case idx:						\
+		return read_sysreg(pmevcntr##idx##_el0)
+
+#define PMEVCNTR_CASES(readwrite)		\
+	PMEVCNTR_READ_CASE(0);			\
+	PMEVCNTR_READ_CASE(1);			\
+	PMEVCNTR_READ_CASE(2);			\
+	PMEVCNTR_READ_CASE(3);			\
+	PMEVCNTR_READ_CASE(4);			\
+	PMEVCNTR_READ_CASE(5);			\
+	PMEVCNTR_READ_CASE(6);			\
+	PMEVCNTR_READ_CASE(7);			\
+	PMEVCNTR_READ_CASE(8);			\
+	PMEVCNTR_READ_CASE(9);			\
+	PMEVCNTR_READ_CASE(10);			\
+	PMEVCNTR_READ_CASE(11);			\
+	PMEVCNTR_READ_CASE(12);			\
+	PMEVCNTR_READ_CASE(13);			\
+	PMEVCNTR_READ_CASE(14);			\
+	PMEVCNTR_READ_CASE(15);			\
+	PMEVCNTR_READ_CASE(16);			\
+	PMEVCNTR_READ_CASE(17);			\
+	PMEVCNTR_READ_CASE(18);			\
+	PMEVCNTR_READ_CASE(19);			\
+	PMEVCNTR_READ_CASE(20);			\
+	PMEVCNTR_READ_CASE(21);			\
+	PMEVCNTR_READ_CASE(22);			\
+	PMEVCNTR_READ_CASE(23);			\
+	PMEVCNTR_READ_CASE(24);			\
+	PMEVCNTR_READ_CASE(25);			\
+	PMEVCNTR_READ_CASE(26);			\
+	PMEVCNTR_READ_CASE(27);			\
+	PMEVCNTR_READ_CASE(28);			\
+	PMEVCNTR_READ_CASE(29);			\
+	PMEVCNTR_READ_CASE(30)
+
+/*
+ * Read a value direct from PMEVCNTR<idx>
+ */
+static u64 read_evcnt_direct(int idx)
+{
+	switch (idx) {
+	PMEVCNTR_CASES(READ);
+	default:
+		WARN_ON(1);
+	}
+
+	return 0;
+}
+
+static u64 mmap_read_self(void *addr)
+{
+	struct perf_event_mmap_page *pc = addr;
+	u32 seq, idx, time_mult = 0, time_shift = 0;
+	u64 count, cyc = 0, time_offset = 0, enabled, running, delta;
+
+	do {
+		seq = READ_ONCE(pc->lock);
+		barrier();
+
+		enabled = READ_ONCE(pc->time_enabled);
+		running = READ_ONCE(pc->time_running);
+
+		if (enabled != running) {
+			cyc = read_sysreg(cntvct_el0);
+			time_mult = READ_ONCE(pc->time_mult);
+			time_shift = READ_ONCE(pc->time_shift);
+			time_offset = READ_ONCE(pc->time_offset);
+		}
+
+		idx = READ_ONCE(pc->index);
+		count = READ_ONCE(pc->offset);
+		if (idx)
+			count += read_evcnt_direct(idx - 1);
+
+		barrier();
+	} while (READ_ONCE(pc->lock) != seq);
+
+	if (enabled != running) {
+		u64 quot, rem;
+
+		quot = (cyc >> time_shift);
+		rem = cyc & (((u64)1 << time_shift) - 1);
+		delta = time_offset + quot * time_mult +
+			((rem * time_mult) >> time_shift);
+
+		enabled += delta;
+		if (idx)
+			running += delta;
+
+		quot = count / running;
+		rem = count % running;
+		count = quot * enabled + (rem * enabled) / running;
+	}
+
+	return count;
+}
+
+static int __test__rd_pmevcntr(void)
+{
+	volatile int tmp = 0;
+	u64 i, loops = 1000;
+	int n;
+	int fd;
+	void *addr;
+	struct perf_event_attr attr = {
+		.type = PERF_TYPE_HARDWARE,
+		.config = PERF_COUNT_HW_INSTRUCTIONS,
+		.exclude_kernel = 1,
+	};
+	u64 delta_sum = 0;
+	char sbuf[STRERR_BUFSIZE];
+
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
+	if (fd < 0) {
+		pr_err("Error: sys_perf_event_open() syscall returned "
+		       "with %d (%s)\n", fd,
+		       str_error_r(errno, sbuf, sizeof(sbuf)));
+		return -1;
+	}
+
+	addr = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
+	if (addr == (void *)(-1)) {
+		pr_err("Error: mmap() syscall returned with (%s)\n",
+		       str_error_r(errno, sbuf, sizeof(sbuf)));
+		goto out_close;
+	}
+
+	for (n = 0; n < 6; n++) {
+		u64 stamp, now, delta;
+
+		stamp = mmap_read_self(addr);
+
+		for (i = 0; i < loops; i++)
+			tmp++;
+
+		now = mmap_read_self(addr);
+		loops *= 10;
+
+		delta = now - stamp;
+		pr_debug("%14d: %14llu\n", n, (long long)delta);
+
+		delta_sum += delta;
+	}
+
+	munmap(addr, page_size);
+	pr_debug("   ");
+
+out_close:
+	close(fd);
+
+	if (!delta_sum)
+		return -1;
+
+	return 0;
+}
+
+int test__rd_pmevcntr(struct test __maybe_unused *test,
+		      int __maybe_unused subtest)
+{
+	int status = 0;
+	int wret = 0;
+	int ret = 0;
+	int pid;
+	int cpu;
+	cpu_set_t cpu_set;
+
+	pid = fork();
+	if (pid < 0)
+		return -1;
+
+	if (!pid) {
+		for (cpu = 0; cpu < get_nprocs(); cpu++) {
+			pr_info("setting affinity to cpu: %d\n", cpu);
+			CPU_ZERO(&cpu_set);
+			CPU_SET(cpu, &cpu_set);
+			if (sched_setaffinity(getpid(),
+					      sizeof(cpu_set),
+					      &cpu_set) == -1) {
+				pr_err("Error: impossible to set cpu (%d) affinity\n",
+				       cpu);
+				continue;
+			}
+			ret = __test__rd_pmevcntr();
+		}
+		exit(ret);
+	}
+
+	wret = waitpid(pid, &status, 0);
+	if (wret < 0)
+		return -1;
+
+	if (WIFSIGNALED(status)) {
+		pr_err("Error: the child process was interrupted by a signal\n");
+		return -1;
+	}
+
+	if (WIFEXITED(status) && WEXITSTATUS(status)) {
+		pr_err("Error: the child process exited with: %d\n",
+		       WEXITSTATUS(status));
+		return -1;
+	}
+
+	return 0;
+}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/6] arm64: pmu: Add function implementation to update event index in userpage.
  2019-05-16 13:21 [RFC 0/6] arm64: Enable access to pmu registers by user-space Raphael Gault
  2019-05-16 13:21 ` [PATCH 1/6] perf: arm64: Compile tests unconditionally Raphael Gault
  2019-05-16 13:21 ` [PATCH 2/6] perf: arm64: Add test to check userspace access to hardware counters Raphael Gault
@ 2019-05-16 13:21 ` Raphael Gault
  2019-05-17 13:21   ` Mark Rutland
  2019-05-16 13:21 ` [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions Raphael Gault
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Raphael Gault @ 2019-05-16 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mingo, peterz, catalin.marinas, will.deacon, acme, mark.rutland,
	Raphael Gault

In order to be able to access the counter directly for userspace,
we need to provide the index of the counter using the userpage.
We thus need to override the event_idx function to retrieve and
convert the perf_event index to armv8 hardware index.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 arch/arm64/kernel/perf_event.c |  4 ++++
 drivers/perf/arm_pmu.c         | 10 ++++++++++
 include/linux/perf/arm_pmu.h   |  2 ++
 3 files changed, 16 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 6164d389eed6..e6316f99f66b 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -890,6 +890,8 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
 	if (armv8pmu_event_is_64bit(event))
 		event->hw.flags |= ARMPMU_EVT_64BIT;
 
+	event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
 	/* Only expose micro/arch events supported by this PMU */
 	if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
 	    && test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1188,6 +1190,8 @@ void arch_perf_update_userpage(struct perf_event *event,
 	 */
 	freq = arch_timer_get_rate();
 	userpg->cap_user_time = 1;
+	userpg->cap_user_rdpmc =
+		!!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
 
 	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
 			NSEC_PER_SEC, 0);
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index eec75b97e7ea..3f4c2ec7ff89 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -777,6 +777,15 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
 					    &cpu_pmu->node);
 }
 
+
+static int armpmu_event_idx(struct perf_event *event)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return 0;
+
+	return event->hw.idx;
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
 	struct arm_pmu *pmu;
@@ -803,6 +812,7 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 		.start		= armpmu_start,
 		.stop		= armpmu_stop,
 		.read		= armpmu_read,
+		.event_idx	= armpmu_event_idx,
 		.filter_match	= armpmu_filter_match,
 		.attr_groups	= pmu->attr_groups,
 		/*
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 4641e850b204..3bef390c1069 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -30,6 +30,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT		1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR		2
 
 #define HW_OP_UNSUPPORTED		0xFFFF
 #define C(_x)				PERF_COUNT_HW_CACHE_##_x
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions
  2019-05-16 13:21 [RFC 0/6] arm64: Enable access to pmu registers by user-space Raphael Gault
                   ` (2 preceding siblings ...)
  2019-05-16 13:21 ` [PATCH 3/6] arm64: pmu: Add function implementation to update event index in userpage Raphael Gault
@ 2019-05-16 13:21 ` Raphael Gault
  2019-05-17  7:10   ` Peter Zijlstra
  2019-05-16 13:21 ` [PATCH 5/6] arm64: perf: Enable pmu counter direct access for perf event on armv8 Raphael Gault
  2019-05-16 13:21 ` [PATCH 6/6] Documentation: arm64: Document PMU counters access from userspace Raphael Gault
  5 siblings, 1 reply; 13+ messages in thread
From: Raphael Gault @ 2019-05-16 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mingo, peterz, catalin.marinas, will.deacon, acme, mark.rutland,
	Raphael Gault

In order to prevent the userspace processes which are trying to access
the registers from the pmu registers on a big.LITTLE environment we
introduce a hook to handle undefined instructions.

The goal here is to prevent the process to be interrupted by a signal
when the error is caused by the task being scheduled while accessing
a counter, causing the counter access to be invalid. As we are not able
to know efficiently the number of counters available physically on both
pmu in that context we consider that any faulting access to a counter
which is architecturally correct should not cause a SIGILL signal if
the permissions are set accordingly.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 arch/arm64/kernel/perf_event.c | 68 ++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index e6316f99f66b..760c947b58dd 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,9 +19,11 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <asm/cpu.h>
 #include <asm/irq_regs.h>
 #include <asm/perf_event.h>
 #include <asm/sysreg.h>
+#include <asm/traps.h>
 #include <asm/virt.h>
 
 #include <linux/acpi.h>
@@ -993,6 +995,72 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
 	return probe.present ? 0 : -ENODEV;
 }
 
+static bool is_evcntr(u32 sys_reg)
+{
+	u32 CRn, Op0, Op1, CRm;
+
+	CRn = sys_reg_CRn(sys_reg);
+	CRm = sys_reg_CRm(sys_reg);
+	Op0 = sys_reg_Op0(sys_reg);
+	Op1 = sys_reg_Op1(sys_reg);
+
+	return (CRn == 0xE &&
+		(CRm & 0xc) == 0x8 &&
+		Op1 == 0x3 &&
+		Op0 == 0x3);
+}
+
+static int emulate_pmu(struct pt_regs *regs, u32 insn)
+{
+	u32 sys_reg, rt;
+	u32 pmuserenr;
+
+	sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) << 5;
+	rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+	pmuserenr = read_sysreg(pmuserenr_el0);
+
+	if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) !=
+	    (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR))
+		return -EINVAL;
+
+	if (sys_reg != SYS_PMXEVCNTR_EL0 &&
+	    !is_evcntr(sys_reg))
+		return -EINVAL;
+
+	/*
+	 * We put 0 in the target register if we
+	 * are reading from pmu register. If we are
+	 * writing, we do nothing.
+	 */
+	if ((insn & 0xfff00000) == 0xd5300000)
+		pt_regs_write_reg(regs, rt, 0);
+	else if (sys_reg != SYS_PMSELR_EL0)
+		return -EINVAL;
+
+	arm64_skip_faulting_instruction(regs, 4);
+	return 0;
+}
+
+/*
+ * This hook will only be triggered by mrs
+ * instructions on PMU registers. This is mandatory
+ * in order to have a consistent behaviour even on
+ * big.LITTLE systems.
+ */
+static struct undef_hook pmu_hook = {
+	.instr_mask = 0xffff8800,
+	.instr_val  = 0xd53b8800,
+	.fn = emulate_pmu,
+};
+
+static int __init enable_pmu_emulation(void)
+{
+	register_undef_hook(&pmu_hook);
+	return 0;
+}
+
+core_initcall(enable_pmu_emulation);
+
 static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
 {
 	int ret = armv8pmu_probe_pmu(cpu_pmu);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/6] arm64: perf: Enable pmu counter direct access for perf event on armv8
  2019-05-16 13:21 [RFC 0/6] arm64: Enable access to pmu registers by user-space Raphael Gault
                   ` (3 preceding siblings ...)
  2019-05-16 13:21 ` [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions Raphael Gault
@ 2019-05-16 13:21 ` Raphael Gault
  2019-05-16 13:21 ` [PATCH 6/6] Documentation: arm64: Document PMU counters access from userspace Raphael Gault
  5 siblings, 0 replies; 13+ messages in thread
From: Raphael Gault @ 2019-05-16 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mingo, peterz, catalin.marinas, will.deacon, acme, mark.rutland,
	Raphael Gault

Keep track of event opened with direct access to the hardware counters
and modify permissions while they are open.

The strategy used here is the same which x86 uses: everytime an event
is mapped, the permissions are set if required. The atomic field added
in the mm_context helps keep track of the different event opened and
de-activate the permissions when all are unmapped.
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 arch/arm64/include/asm/mmu.h         |  6 +++++
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 ++++++++++
 drivers/perf/arm_pmu.c               | 38 ++++++++++++++++++++++++++++
 4 files changed, 60 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 67ef25d037ea..9de4cf0b17c7 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -29,6 +29,12 @@
 
 typedef struct {
 	atomic64_t	id;
+
+	/*
+	 * non-zero if userspace have access to hardware
+	 * counters directly.
+	 */
+	atomic_t	pmu_direct_access;
 	void		*vdso;
 	unsigned long	flags;
 } mm_context_t;
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 2da3e478fd8f..33494af613d8 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -32,6 +32,7 @@
 #include <asm-generic/mm_hooks.h>
 #include <asm/cputype.h>
 #include <asm/pgtable.h>
+#include <asm/perf_event.h>
 #include <asm/sysreg.h>
 #include <asm/tlbflush.h>
 
@@ -235,6 +236,7 @@ static inline void __switch_mm(struct mm_struct *next)
 	}
 
 	check_and_switch_context(next, cpu);
+	perf_switch_user_access(next);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index c593761ba61c..32a6d604bb3b 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -19,6 +19,7 @@
 
 #include <asm/stack_pointer.h>
 #include <asm/ptrace.h>
+#include <linux/mm_types.h>
 
 #define	ARMV8_PMU_MAX_COUNTERS	32
 #define	ARMV8_PMU_COUNTER_MASK	(ARMV8_PMU_MAX_COUNTERS - 1)
@@ -234,4 +235,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 	(regs)->pstate = PSR_MODE_EL1h;	\
 }
 
+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+	if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+		return;
+
+	if (atomic_read(&mm->context.pmu_direct_access)) {
+		write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+			     pmuserenr_el0);
+	} else {
+		write_sysreg(0, pmuserenr_el0);
+	}
+}
+
 #endif
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 3f4c2ec7ff89..45a64f942864 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -24,6 +24,7 @@
 #include <linux/irqdesc.h>
 
 #include <asm/irq_regs.h>
+#include <asm/mmu_context.h>
 
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
@@ -786,6 +787,41 @@ static int armpmu_event_idx(struct perf_event *event)
 	return event->hw.idx;
 }
 
+static void refresh_pmuserenr(void *mm)
+{
+	perf_switch_user_access(mm);
+}
+
+static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return;
+
+	/*
+	 * This function relies on not being called concurrently in two
+	 * tasks in the same mm.  Otherwise one task could observe
+	 * pmu_direct_access > 1 and return all the way back to
+	 * userspace with user access disabled while another task is still
+	 * doing on_each_cpu_mask() to enable user access.
+	 *
+	 * For now, this can't happen because all callers hold mmap_sem
+	 * for write.  If this changes, we'll need a different solution.
+	 */
+	lockdep_assert_held_exclusive(&mm->mmap_sem);
+
+	if (atomic_inc_return(&mm->context.pmu_direct_access) == 1)
+		on_each_cpu(refresh_pmuserenr, mm, 1);
+}
+
+static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return;
+
+	if (atomic_dec_and_test(&mm->context.pmu_direct_access))
+		on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
 	struct arm_pmu *pmu;
@@ -807,6 +843,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 		.pmu_enable	= armpmu_enable,
 		.pmu_disable	= armpmu_disable,
 		.event_init	= armpmu_event_init,
+		.event_mapped	= armpmu_event_mapped,
+		.event_unmapped	= armpmu_event_unmapped,
 		.add		= armpmu_add,
 		.del		= armpmu_del,
 		.start		= armpmu_start,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/6] Documentation: arm64: Document PMU counters access from userspace
  2019-05-16 13:21 [RFC 0/6] arm64: Enable access to pmu registers by user-space Raphael Gault
                   ` (4 preceding siblings ...)
  2019-05-16 13:21 ` [PATCH 5/6] arm64: perf: Enable pmu counter direct access for perf event on armv8 Raphael Gault
@ 2019-05-16 13:21 ` Raphael Gault
  5 siblings, 0 replies; 13+ messages in thread
From: Raphael Gault @ 2019-05-16 13:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mingo, peterz, catalin.marinas, will.deacon, acme, mark.rutland,
	Raphael Gault

Add a documentation file to describe the access to the pmu hardware
counters from userspace

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 .../arm64/pmu_counter_user_access.txt         | 42 +++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt

diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt
new file mode 100644
index 000000000000..bccf5edbf7f5
--- /dev/null
+++ b/Documentation/arm64/pmu_counter_user_access.txt
@@ -0,0 +1,42 @@
+Access to PMU hardware counter from userspace
+=============================================
+
+Overview
+--------
+The perf user-space tool relies on the PMU to monitor events. It offers an
+abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to have access to the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the overhead
+by directly accessing the registers without having to go through the kernel.
+
+How-to
+------
+The focus is set on the armv8 pmuv3 which makes sure that the access to the pmu
+registers is enable and that the userspace have access to the relevent
+information in order to use them. 
+
+In order to have access to the hardware counter it is necessary to open the event
+using the perf tool interface: the sys_perf_event_open syscall returns a fd which
+can subsequently be used with the mmap syscall in order to retrieve a page of memory
+containing information about the event.
+The PMU driver uses this page to expose to the user the hardware counter's
+index. Using this index enables the user to access the PMU registers using the
+`mrs` instruction.
+
+Have a look `at tools/perf/arch/arm64/tests/user-events.c` for an example. It can be
+run using the perf tool to check that the access to the registers works
+correctly from userspace:
+
+./perf test -v
+
+About chained events
+--------------------
+When the user requests for an event to be counted on 64 bits, two hardware
+counters are used and need to be combined to retrieve the correct value:
+
+val = read_counter(idx);
+if ((event.attr.config1 & 0x1))
+	val = (val << 32) | read_counter(idx - 1);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions
  2019-05-16 13:21 ` [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions Raphael Gault
@ 2019-05-17  7:10   ` Peter Zijlstra
  2019-05-17  7:35     ` Raphael Gault
  2019-05-17  8:04     ` Mark Rutland
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Zijlstra @ 2019-05-17  7:10 UTC (permalink / raw)
  To: Raphael Gault
  Cc: linux-arm-kernel, linux-kernel, mingo, catalin.marinas,
	will.deacon, acme, mark.rutland

On Thu, May 16, 2019 at 02:21:46PM +0100, Raphael Gault wrote:
> In order to prevent the userspace processes which are trying to access
> the registers from the pmu registers on a big.LITTLE environment we
> introduce a hook to handle undefined instructions.
> 
> The goal here is to prevent the process to be interrupted by a signal
> when the error is caused by the task being scheduled while accessing
> a counter, causing the counter access to be invalid. As we are not able
> to know efficiently the number of counters available physically on both
> pmu in that context we consider that any faulting access to a counter
> which is architecturally correct should not cause a SIGILL signal if
> the permissions are set accordingly.

The other approach is using rseq for this; with that you can guarantee
it will never issue the instruction on a wrong CPU.

That said; emulating the thing isn't horrible either.

> +	/*
> +	 * We put 0 in the target register if we
> +	 * are reading from pmu register. If we are
> +	 * writing, we do nothing.
> +	 */

Wait _what_ ?!? userspace can _WRITE_ to these registers?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions
  2019-05-17  7:10   ` Peter Zijlstra
@ 2019-05-17  7:35     ` Raphael Gault
  2019-05-17  8:04     ` Mark Rutland
  1 sibling, 0 replies; 13+ messages in thread
From: Raphael Gault @ 2019-05-17  7:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arm-kernel, linux-kernel, mingo, Catalin Marinas,
	Will Deacon, acme, Mark Rutland

Hi,

On 5/17/19 8:10 AM, Peter Zijlstra wrote:
> On Thu, May 16, 2019 at 02:21:46PM +0100, Raphael Gault wrote:
>> In order to prevent the userspace processes which are trying to access
>> the registers from the pmu registers on a big.LITTLE environment we
>> introduce a hook to handle undefined instructions.
>>
>> The goal here is to prevent the process to be interrupted by a signal
>> when the error is caused by the task being scheduled while accessing
>> a counter, causing the counter access to be invalid. As we are not able
>> to know efficiently the number of counters available physically on both
>> pmu in that context we consider that any faulting access to a counter
>> which is architecturally correct should not cause a SIGILL signal if
>> the permissions are set accordingly.
>
> The other approach is using rseq for this; with that you can guarantee
> it will never issue the instruction on a wrong CPU.
>
> That said; emulating the thing isn't horrible either.
>
>> +/*
>> + * We put 0 in the target register if we
>> + * are reading from pmu register. If we are
>> + * writing, we do nothing.
>> + */
>
> Wait _what_ ?!? userspace can _WRITE_ to these registers?
>

The user can write to some pmu registers but those are not the ones that
interest us here. My comment was ill formed, indeed this hook can only
be triggered by reads in this case.
Sorry about that.

Thanks,

--
Raphael Gault
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions
  2019-05-17  7:10   ` Peter Zijlstra
  2019-05-17  7:35     ` Raphael Gault
@ 2019-05-17  8:04     ` Mark Rutland
  2019-05-17  8:26       ` Peter Zijlstra
  1 sibling, 1 reply; 13+ messages in thread
From: Mark Rutland @ 2019-05-17  8:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Raphael Gault, linux-arm-kernel, linux-kernel, mingo,
	catalin.marinas, will.deacon, acme

On Fri, May 17, 2019 at 09:10:18AM +0200, Peter Zijlstra wrote:
> On Thu, May 16, 2019 at 02:21:46PM +0100, Raphael Gault wrote:
> > In order to prevent the userspace processes which are trying to access
> > the registers from the pmu registers on a big.LITTLE environment we
> > introduce a hook to handle undefined instructions.
> > 
> > The goal here is to prevent the process to be interrupted by a signal
> > when the error is caused by the task being scheduled while accessing
> > a counter, causing the counter access to be invalid. As we are not able
> > to know efficiently the number of counters available physically on both
> > pmu in that context we consider that any faulting access to a counter
> > which is architecturally correct should not cause a SIGILL signal if
> > the permissions are set accordingly.
> 
> The other approach is using rseq for this; with that you can guarantee
> it will never issue the instruction on a wrong CPU.
> 
> That said; emulating the thing isn't horrible either.

Yup. Attempting to use rseq is on the todo list.

> > +	/*
> > +	 * We put 0 in the target register if we
> > +	 * are reading from pmu register. If we are
> > +	 * writing, we do nothing.
> > +	 */
> 
> Wait _what_ ?!? userspace can _WRITE_ to these registers?

Remember that this is in an undefined (trap) handler.

If userspace _attempts_ to write to the registers, the CPU will trap to the
kernel. The comment is perhaps misleading; when we "do nothing", the common
trap handling code will send a SIGILL to userspace.

It would probably be better to say something like:

	/*
	 * If userspace is tries to read a counter that doesn't exist on this
	 * CPU, we emulate it as reading as zero. This happens if userspace is
	 * preempted between reading the idx and actually reading the counter,
	 * and the seqlock and idx have already changed, so it's as-if the
	 * counter has been reprogrammed with a different event.
	 *
	 * We don't permit userspace to write to these registers, and will
	 * inject a SIGILL.
	 */

There is one caveat: userspace can write to PMSELR without trapping, so we will
have to context-switch with the task. That only affects indirect addressing of
PMU registers, and doesn't have a functional effect on the behaviour of the
PMU, so that's benign from the PoV of perf.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions
  2019-05-17  8:04     ` Mark Rutland
@ 2019-05-17  8:26       ` Peter Zijlstra
  2019-05-17  9:07         ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2019-05-17  8:26 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Raphael Gault, linux-arm-kernel, linux-kernel, mingo,
	catalin.marinas, will.deacon, acme

On Fri, May 17, 2019 at 09:04:20AM +0100, Mark Rutland wrote:

> Remember that this is in an undefined (trap) handler.
> 
> If userspace _attempts_ to write to the registers, the CPU will trap to the
> kernel. The comment is perhaps misleading; when we "do nothing", the common
> trap handling code will send a SIGILL to userspace.
> 
> It would probably be better to say something like:
> 
> 	/*
> 	 * If userspace is tries to read a counter that doesn't exist on this
> 	 * CPU, we emulate it as reading as zero. This happens if userspace is
> 	 * preempted between reading the idx and actually reading the counter,
> 	 * and the seqlock and idx have already changed, so it's as-if the
> 	 * counter has been reprogrammed with a different event.

Might be good to mention that userspace will/should discard the value it
reads, and therefore any value is good (including 0).

> 	 * We don't permit userspace to write to these registers, and will
> 	 * inject a SIGILL.
> 	 */
> 
> There is one caveat: userspace can write to PMSELR without trapping, so we will
> have to context-switch with the task. That only affects indirect addressing of
> PMU registers, and doesn't have a functional effect on the behaviour of the
> PMU, so that's benign from the PoV of perf.

Sad though; ideally you'd state that indirect addressing is
out-of-bounds and they get to keep the pieces. But I suspect you're
right that people will do it anyway and complain once it comes apart.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions
  2019-05-17  8:26       ` Peter Zijlstra
@ 2019-05-17  9:07         ` Peter Zijlstra
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2019-05-17  9:07 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Raphael Gault, linux-arm-kernel, linux-kernel, mingo,
	catalin.marinas, will.deacon, acme

On Fri, May 17, 2019 at 10:26:55AM +0200, Peter Zijlstra wrote:
> On Fri, May 17, 2019 at 09:04:20AM +0100, Mark Rutland wrote:
> 
> > Remember that this is in an undefined (trap) handler.
> > 
> > If userspace _attempts_ to write to the registers, the CPU will trap to the
> > kernel. The comment is perhaps misleading; when we "do nothing", the common
> > trap handling code will send a SIGILL to userspace.
> > 
> > It would probably be better to say something like:
> > 
> > 	/*
> > 	 * If userspace is tries to read a counter that doesn't exist on this
> > 	 * CPU, we emulate it as reading as zero. This happens if userspace is
> > 	 * preempted between reading the idx and actually reading the counter,
> > 	 * and the seqlock and idx have already changed, so it's as-if the
> > 	 * counter has been reprogrammed with a different event.
> 
> Might be good to mention that userspace will/should discard the value it
> reads, and therefore any value is good (including 0).
> 
> > 	 * We don't permit userspace to write to these registers, and will
> > 	 * inject a SIGILL.
> > 	 */
> > 
> > There is one caveat: userspace can write to PMSELR without trapping, so we will
> > have to context-switch with the task. That only affects indirect addressing of
> > PMU registers, and doesn't have a functional effect on the behaviour of the
> > PMU, so that's benign from the PoV of perf.
> 
> Sad though; ideally you'd state that indirect addressing is
> out-of-bounds and they get to keep the pieces. But I suspect you're
> right that people will do it anyway and complain once it comes apart.

I'm still not entirely convinced you need that context switching. If we
sched-out, the seqcount value will change, idem when we sched-in. So
under no circumstance (even if we stay on the same CPU), will the
seqcount match when we get back on.

So why preserve that register?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/6] arm64: pmu: Add function implementation to update event index in userpage.
  2019-05-16 13:21 ` [PATCH 3/6] arm64: pmu: Add function implementation to update event index in userpage Raphael Gault
@ 2019-05-17 13:21   ` Mark Rutland
  0 siblings, 0 replies; 13+ messages in thread
From: Mark Rutland @ 2019-05-17 13:21 UTC (permalink / raw)
  To: Raphael Gault
  Cc: linux-arm-kernel, linux-kernel, mingo, peterz, catalin.marinas,
	will.deacon, acme

On Thu, May 16, 2019 at 02:21:45PM +0100, Raphael Gault wrote:
> In order to be able to access the counter directly for userspace,
> we need to provide the index of the counter using the userpage.
> We thus need to override the event_idx function to retrieve and
> convert the perf_event index to armv8 hardware index.

It would be worth noting that since the arm_pmu framework can be used
with other versions of the PMU architecture which could not permit safe
userspace access, we allow the arch code to opt-in with the
ARMPMU_EL0_RD_CNTR flag.

> Signed-off-by: Raphael Gault <raphael.gault@arm.com>
> ---
>  arch/arm64/kernel/perf_event.c |  4 ++++
>  drivers/perf/arm_pmu.c         | 10 ++++++++++
>  include/linux/perf/arm_pmu.h   |  2 ++
>  3 files changed, 16 insertions(+)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 6164d389eed6..e6316f99f66b 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -890,6 +890,8 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
>  	if (armv8pmu_event_is_64bit(event))
>  		event->hw.flags |= ARMPMU_EVT_64BIT;
>  
> +	event->hw.flags |= ARMPMU_EL0_RD_CNTR;
> +
>  	/* Only expose micro/arch events supported by this PMU */
>  	if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
>  	    && test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
> @@ -1188,6 +1190,8 @@ void arch_perf_update_userpage(struct perf_event *event,
>  	 */
>  	freq = arch_timer_get_rate();
>  	userpg->cap_user_time = 1;
> +	userpg->cap_user_rdpmc =
> +		!!(event->hw.flags & ARMPMU_EL0_RD_CNTR);

This is under 80 columns when placed on a single line, so it doesn't
need to be split here.

>  
>  	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
>  			NSEC_PER_SEC, 0);
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index eec75b97e7ea..3f4c2ec7ff89 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -777,6 +777,15 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
>  					    &cpu_pmu->node);
>  }
>  
> +
> +static int armpmu_event_idx(struct perf_event *event)
> +{
> +	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
> +		return 0;
> +
> +	return event->hw.idx;

I think this needs to remap ARMV8_IDX_CYCLE_COUNTER to 32, to match the
offset applie to the rest of counter indices.

Otherwise, this looks good to me.

Thanks,
Mark.

> +}
> +
>  static struct arm_pmu *__armpmu_alloc(gfp_t flags)
>  {
>  	struct arm_pmu *pmu;
> @@ -803,6 +812,7 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
>  		.start		= armpmu_start,
>  		.stop		= armpmu_stop,
>  		.read		= armpmu_read,
> +		.event_idx	= armpmu_event_idx,
>  		.filter_match	= armpmu_filter_match,
>  		.attr_groups	= pmu->attr_groups,
>  		/*
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 4641e850b204..3bef390c1069 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -30,6 +30,8 @@
>   */
>  /* Event uses a 64bit counter */
>  #define ARMPMU_EVT_64BIT		1
> +/* Allow access to hardware counter from userspace */
> +#define ARMPMU_EL0_RD_CNTR		2
>  
>  #define HW_OP_UNSUPPORTED		0xFFFF
>  #define C(_x)				PERF_COUNT_HW_CACHE_##_x
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-05-17 13:21 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-16 13:21 [RFC 0/6] arm64: Enable access to pmu registers by user-space Raphael Gault
2019-05-16 13:21 ` [PATCH 1/6] perf: arm64: Compile tests unconditionally Raphael Gault
2019-05-16 13:21 ` [PATCH 2/6] perf: arm64: Add test to check userspace access to hardware counters Raphael Gault
2019-05-16 13:21 ` [PATCH 3/6] arm64: pmu: Add function implementation to update event index in userpage Raphael Gault
2019-05-17 13:21   ` Mark Rutland
2019-05-16 13:21 ` [PATCH 4/6] arm64: pmu: Add hook to handle pmu-related undefined instructions Raphael Gault
2019-05-17  7:10   ` Peter Zijlstra
2019-05-17  7:35     ` Raphael Gault
2019-05-17  8:04     ` Mark Rutland
2019-05-17  8:26       ` Peter Zijlstra
2019-05-17  9:07         ` Peter Zijlstra
2019-05-16 13:21 ` [PATCH 5/6] arm64: perf: Enable pmu counter direct access for perf event on armv8 Raphael Gault
2019-05-16 13:21 ` [PATCH 6/6] Documentation: arm64: Document PMU counters access from userspace Raphael Gault

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).