linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v4 0/7] arm64: Enable access to pmu registers by user-space
@ 2019-08-22 14:42 Raphael Gault
  2019-08-22 14:42 ` [PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters Raphael Gault
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Raphael Gault @ 2019-08-22 14:42 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, acme, Raphael Gault, mingo

Hi,

Changes since v3:
* Rebased on will/for-next/perf in order to include this patch [1]
* Re-introduce `mrs` hook used in previous versions
* Invert cpu feature to track homogeneity instead of heterogeneity
* Introduce accessor for boot_cpu_data (see second commit for more info)
* Address Mark Rutland's review comments


The perf user-space tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is CPU-dependent. We want to allow userspace tools to
access the registers storing the hardware counters' values directly.
This specifically targets self-monitoring tasks: the overhead is reduced
by reading the registers directly rather than going through the kernel.
In order to do this we need to set up the PMU so that it exposes its
registers to userspace access.
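
For illustration, here is a minimal sketch of the userspace flow this
series enables (error handling omitted; the field names come from
struct perf_event_mmap_page in linux/perf_event.h, and the full
counter-read loop is shown in the first patch's test):

	#include <linux/perf_event.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>
	#include <sys/types.h>
	#include <unistd.h>

	static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
				    int cpu, int group_fd, unsigned long flags)
	{
		return syscall(__NR_perf_event_open, attr, pid, cpu,
			       group_fd, flags);
	}

	int main(void)
	{
		struct perf_event_attr attr = {
			.size = sizeof(struct perf_event_attr),
			.type = PERF_TYPE_HARDWARE,
			.config = PERF_COUNT_HW_INSTRUCTIONS,
			.exclude_kernel = 1,
		};
		int fd = perf_event_open(&attr, 0, -1, -1, 0);
		struct perf_event_mmap_page *pc =
			mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ,
			     MAP_SHARED, fd, 0);

		/* cap_user_rdpmc and a non-zero index are only set once
		 * this series is applied and the PMU is homogeneous */
		if (pc->cap_user_rdpmc && pc->index) {
			/* read PMEVCNTR<pc->index - 1>_EL0 via mrs */
		}
		return 0;
	}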

The first patch adds a test to the perf tool so that we can check that
access to the registers works correctly from userspace.

The second patch introduces an accessor for `boot_cpu_data`, which is
static. Including cpu.h turned out to pull in a chain of dependencies,
so I opted for the accessor since boot_cpu_data is not used much.

The third patch adds a capability to the arm64 cpufeature framework in
order to detect whether we are running on a homogeneous system.

The fourth patch re-introduces the hook handling undefined instruction
exceptions for `mrs` instructions on PMU-related registers.

The fifth patch focuses on the armv8 pmuv3 PMU support and makes sure
that access to the PMU registers is enabled and that userspace has
access to the relevant information in order to use them.

The sixth patch puts in place callbacks to enable access to the
hardware counters from userspace when a compatible event is opened
using the perf API.

The seventh patch adds short documentation about direct access to the
PMU counters from userspace.

[1]: https://lkml.org/lkml/2019/8/20/875

Raphael Gault (7):
  perf: arm64: Add test to check userspace access to hardware counters.
  arm64: cpu: Add accessor for boot_cpu_data
  arm64: cpufeature: Add feature to detect homogeneous systems
  arm64: pmu: Add hook to handle pmu-related undefined instructions
  arm64: pmu: Add function implementation to update event index in
    userpage.
  arm64: perf: Enable pmu counter direct access for perf event on armv8
  Documentation: arm64: Document PMU counters access from userspace

 .../arm64/pmu_counter_user_access.txt         |  42 +++
 arch/arm64/include/asm/cpu.h                  |   2 +-
 arch/arm64/include/asm/cpucaps.h              |   3 +-
 arch/arm64/include/asm/cpufeature.h           |  10 +
 arch/arm64/include/asm/mmu.h                  |   6 +
 arch/arm64/include/asm/mmu_context.h          |   2 +
 arch/arm64/include/asm/perf_event.h           |  14 +
 arch/arm64/kernel/cpufeature.c                |  32 ++-
 arch/arm64/kernel/cpuinfo.c                   |   7 +-
 arch/arm64/kernel/perf_event.c                |  77 ++++++
 drivers/perf/arm_pmu.c                        |  54 ++++
 include/linux/perf/arm_pmu.h                  |   2 +
 tools/perf/arch/arm64/include/arch-tests.h    |   7 +
 tools/perf/arch/arm64/tests/Build             |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c      |   4 +
 tools/perf/arch/arm64/tests/user-events.c     | 254 ++++++++++++++++++
 16 files changed, 512 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

-- 
2.17.1



* [PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters.
  2019-08-22 14:42 [PATCH v4 0/7] arm64: Enable access to pmu registers by user-space Raphael Gault
@ 2019-08-22 14:42 ` Raphael Gault
  2019-08-27 11:17   ` Jonathan Cameron
  2019-08-22 14:42 ` [PATCH v4 2/7] arm64: cpu: Add accessor for boot_cpu_data Raphael Gault
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Raphael Gault @ 2019-08-22 14:42 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, acme, Raphael Gault, mingo

This test relies on the PMU registers being accessible from userspace.
It uses the perf_event_mmap_page to retrieve the counter index and
accesses the underlying register.

This test uses sched_setaffinity(2) in order to run on all CPUs and
thus check the behaviour of the PMU of every CPU in a big.LITTLE
environment.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 tools/perf/arch/arm64/include/arch-tests.h |   7 +
 tools/perf/arch/arm64/tests/Build          |   1 +
 tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
 tools/perf/arch/arm64/tests/user-events.c  | 254 +++++++++++++++++++++
 4 files changed, 266 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/user-events.c

diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h
index 90ec4c8cb880..6a8483de1015 100644
--- a/tools/perf/arch/arm64/include/arch-tests.h
+++ b/tools/perf/arch/arm64/include/arch-tests.h
@@ -2,11 +2,18 @@
 #ifndef ARCH_TESTS_H
 #define ARCH_TESTS_H
 
+#include <linux/compiler.h>
+
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
+int test__arch_unwind_sample(struct perf_sample *sample,
+			     struct thread *thread);
 #endif
 
 extern struct test arch_tests[];
+int test__rd_pmevcntr(struct test *test __maybe_unused,
+		      int subtest __maybe_unused);
+
 
 #endif
diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build
index a61c06bdb757..3f9a20c17fc6 100644
--- a/tools/perf/arch/arm64/tests/Build
+++ b/tools/perf/arch/arm64/tests/Build
@@ -1,4 +1,5 @@
 perf-y += regs_load.o
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
 
+perf-y += user-events.o
 perf-y += arch-tests.o
diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c
index 5b1543c98022..57df9b89dede 100644
--- a/tools/perf/arch/arm64/tests/arch-tests.c
+++ b/tools/perf/arch/arm64/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
 		.func = test__dwarf_unwind,
 	},
 #endif
+	{
+		.desc = "User counter access",
+		.func = test__rd_pmevcntr,
+	},
 	{
 		.func = NULL,
 	},
diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c
new file mode 100644
index 000000000000..b048d7e392bc
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/user-events.c
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <asm/bug.h>
+#include <errno.h>
+#include <unistd.h>
+#include <sched.h>
+#include <stdlib.h>
+#include <signal.h>
+#include <sys/mman.h>
+#include <sys/sysinfo.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <linux/types.h>
+#include "perf.h"
+#include "debug.h"
+#include "tests/tests.h"
+#include "cloexec.h"
+#include "util.h"
+#include "arch-tests.h"
+
+/*
+ * ARMv8 ARM reserves the following encoding for system registers:
+ * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview",
+ *  C5.2, version:ARM DDI 0487A.f)
+ *      [20-19] : Op0
+ *      [18-16] : Op1
+ *      [15-12] : CRn
+ *      [11-8]  : CRm
+ *      [7-5]   : Op2
+ */
+#define Op0_shift       19
+#define Op0_mask        0x3
+#define Op1_shift       16
+#define Op1_mask        0x7
+#define CRn_shift       12
+#define CRn_mask        0xf
+#define CRm_shift       8
+#define CRm_mask        0xf
+#define Op2_shift       5
+#define Op2_mask        0x7
+
+#define __stringify(x)	#x
+
+#define read_sysreg(r) ({						\
+	u64 __val;							\
+	asm volatile("mrs %0, " __stringify(r) : "=r" (__val));		\
+	__val;								\
+})
+
+#define PMEVCNTR_READ_CASE(idx)					\
+	case idx:						\
+		return read_sysreg(pmevcntr##idx##_el0)
+
+#define PMEVCNTR_CASES(readwrite)		\
+	PMEVCNTR_READ_CASE(0);			\
+	PMEVCNTR_READ_CASE(1);			\
+	PMEVCNTR_READ_CASE(2);			\
+	PMEVCNTR_READ_CASE(3);			\
+	PMEVCNTR_READ_CASE(4);			\
+	PMEVCNTR_READ_CASE(5);			\
+	PMEVCNTR_READ_CASE(6);			\
+	PMEVCNTR_READ_CASE(7);			\
+	PMEVCNTR_READ_CASE(8);			\
+	PMEVCNTR_READ_CASE(9);			\
+	PMEVCNTR_READ_CASE(10);			\
+	PMEVCNTR_READ_CASE(11);			\
+	PMEVCNTR_READ_CASE(12);			\
+	PMEVCNTR_READ_CASE(13);			\
+	PMEVCNTR_READ_CASE(14);			\
+	PMEVCNTR_READ_CASE(15);			\
+	PMEVCNTR_READ_CASE(16);			\
+	PMEVCNTR_READ_CASE(17);			\
+	PMEVCNTR_READ_CASE(18);			\
+	PMEVCNTR_READ_CASE(19);			\
+	PMEVCNTR_READ_CASE(20);			\
+	PMEVCNTR_READ_CASE(21);			\
+	PMEVCNTR_READ_CASE(22);			\
+	PMEVCNTR_READ_CASE(23);			\
+	PMEVCNTR_READ_CASE(24);			\
+	PMEVCNTR_READ_CASE(25);			\
+	PMEVCNTR_READ_CASE(26);			\
+	PMEVCNTR_READ_CASE(27);			\
+	PMEVCNTR_READ_CASE(28);			\
+	PMEVCNTR_READ_CASE(29);			\
+	PMEVCNTR_READ_CASE(30)
+
+/*
+ * Read a value direct from PMEVCNTR<idx>
+ */
+static u64 read_evcnt_direct(int idx)
+{
+	switch (idx) {
+	PMEVCNTR_CASES(READ);
+	default:
+		WARN_ON(1);
+	}
+
+	return 0;
+}
+
+static u64 mmap_read_self(void *addr)
+{
+	struct perf_event_mmap_page *pc = addr;
+	u32 seq, idx, time_mult = 0, time_shift = 0;
+	u64 count, cyc = 0, time_offset = 0, enabled, running, delta;
+
+	do {
+		seq = READ_ONCE(pc->lock);
+		barrier();
+
+		enabled = READ_ONCE(pc->time_enabled);
+		running = READ_ONCE(pc->time_running);
+
+		if (enabled != running) {
+			cyc = read_sysreg(cntvct_el0);
+			time_mult = READ_ONCE(pc->time_mult);
+			time_shift = READ_ONCE(pc->time_shift);
+			time_offset = READ_ONCE(pc->time_offset);
+		}
+
+		idx = READ_ONCE(pc->index);
+		count = READ_ONCE(pc->offset);
+		if (idx)
+			count += read_evcnt_direct(idx - 1);
+
+		barrier();
+	} while (READ_ONCE(pc->lock) != seq);
+
+	if (enabled != running) {
+		u64 quot, rem;
+
+		quot = (cyc >> time_shift);
+		rem = cyc & (((u64)1 << time_shift) - 1);
+		delta = time_offset + quot * time_mult +
+			((rem * time_mult) >> time_shift);
+
+		enabled += delta;
+		if (idx)
+			running += delta;
+
+		quot = count / running;
+		rem = count % running;
+		count = quot * enabled + (rem * enabled) / running;
+	}
+
+	return count;
+}
+
+static int __test__rd_pmevcntr(void)
+{
+	volatile int tmp = 0;
+	u64 i, loops = 1000;
+	int n;
+	int fd;
+	void *addr;
+	struct perf_event_attr attr = {
+		.type = PERF_TYPE_HARDWARE,
+		.config = PERF_COUNT_HW_INSTRUCTIONS,
+		.exclude_kernel = 1,
+	};
+	u64 delta_sum = 0;
+	char sbuf[STRERR_BUFSIZE];
+
+	fd = sys_perf_event_open(&attr, 0, -1, -1,
+				 perf_event_open_cloexec_flag());
+	if (fd < 0) {
+		pr_err("Error: sys_perf_event_open() syscall returned with %d (%s)\n", fd,
+		       str_error_r(errno, sbuf, sizeof(sbuf)));
+		return -1;
+	}
+
+	addr = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
+	if (addr == (void *)(-1)) {
+		pr_err("Error: mmap() syscall returned with (%s)\n",
+		       str_error_r(errno, sbuf, sizeof(sbuf)));
+		goto out_close;
+	}
+
+	for (n = 0; n < 6; n++) {
+		u64 stamp, now, delta;
+
+		stamp = mmap_read_self(addr);
+
+		for (i = 0; i < loops; i++)
+			tmp++;
+
+		now = mmap_read_self(addr);
+		loops *= 10;
+
+		delta = now - stamp;
+		pr_debug("%14d: %14llu\n", n, (long long)delta);
+
+		delta_sum += delta;
+	}
+
+	munmap(addr, page_size);
+	pr_debug("   ");
+
+out_close:
+	close(fd);
+
+	if (!delta_sum)
+		return -1;
+
+	return 0;
+}
+
+int test__rd_pmevcntr(struct test __maybe_unused *test,
+		      int __maybe_unused subtest)
+{
+	int status = 0;
+	int wret = 0;
+	int ret = 0;
+	int pid;
+	int cpu;
+	cpu_set_t cpu_set;
+
+	pid = fork();
+	if (pid < 0)
+		return -1;
+
+	if (!pid) {
+		for (cpu = 0; cpu < get_nprocs(); cpu++) {
+			pr_info("setting affinity to cpu: %d\n", cpu);
+			CPU_ZERO(&cpu_set);
+			CPU_SET(cpu, &cpu_set);
+			if (sched_setaffinity(getpid(),
+					      sizeof(cpu_set),
+					      &cpu_set) == -1) {
+				pr_err("Error: impossible to set cpu (%d) affinity\n",
+				       cpu);
+				continue;
+			}
+			ret = __test__rd_pmevcntr();
+		}
+		exit(ret);
+	}
+
+	wret = waitpid(pid, &status, 0);
+	if (wret < 0)
+		return -1;
+
+	if (WIFSIGNALED(status)) {
+		pr_err("Error: the child process was interrupted by a signal\n");
+		return -1;
+	}
+
+	if (WIFEXITED(status) && WEXITSTATUS(status)) {
+		pr_err("Error: the child process exited with: %d\n",
+		       WEXITSTATUS(status));
+		return -1;
+	}
+
+	return 0;
+}
-- 
2.17.1



* [PATCH v4 2/7] arm64: cpu: Add accessor for boot_cpu_data
  2019-08-22 14:42 [PATCH v4 0/7] arm64: Enable access to pmu registers by user-space Raphael Gault
  2019-08-22 14:42 ` [PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters Raphael Gault
@ 2019-08-22 14:42 ` Raphael Gault
  2019-08-22 14:42 ` [PATCH v4 3/7] arm64: cpufeature: Add feature to detect homogeneous systems Raphael Gault
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Raphael Gault @ 2019-08-22 14:42 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, acme, Raphael Gault, mingo

Mark boot_cpu_data as read-only after initialization.
Define an accessor so that boot_cpu_data can be read from outside
cpuinfo.c.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 arch/arm64/include/asm/cpu.h | 2 +-
 arch/arm64/kernel/cpuinfo.c  | 7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index d72d995b7e25..6abc2faf1a64 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -62,5 +62,5 @@ void __init cpuinfo_store_boot_cpu(void);
 void __init init_cpu_features(struct cpuinfo_arm64 *info);
 void update_cpu_features(int cpu, struct cpuinfo_arm64 *info,
 				 struct cpuinfo_arm64 *boot);
-
+struct cpuinfo_arm64 *get_boot_cpu_data(void);
 #endif /* __ASM_CPU_H */
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 876055e37352..ffa00b3a148b 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -31,7 +31,7 @@
  * values depending on configuration at or after reset.
  */
 DEFINE_PER_CPU(struct cpuinfo_arm64, cpu_data);
-static struct cpuinfo_arm64 boot_cpu_data;
+static struct cpuinfo_arm64 boot_cpu_data __ro_after_init;
 
 static char *icache_policy_str[] = {
 	[0 ... ICACHE_POLICY_PIPT]	= "RESERVED/UNKNOWN",
@@ -395,4 +395,9 @@ void __init cpuinfo_store_boot_cpu(void)
 	init_cpu_features(&boot_cpu_data);
 }
 
+struct cpuinfo_arm64 *get_boot_cpu_data(void)
+{
+	return &boot_cpu_data;
+}
+
 device_initcall(cpuinfo_regs_init);
-- 
2.17.1



* [PATCH v4 3/7] arm64: cpufeature: Add feature to detect homogeneous systems
  2019-08-22 14:42 [PATCH v4 0/7] arm64: Enable access to pmu registers by user-space Raphael Gault
  2019-08-22 14:42 ` [PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters Raphael Gault
  2019-08-22 14:42 ` [PATCH v4 2/7] arm64: cpu: Add accessor for boot_cpu_data Raphael Gault
@ 2019-08-22 14:42 ` Raphael Gault
  2019-08-22 14:42 ` [PATCH v4 4/7] arm64: pmu: Add hook to handle pmu-related undefined instructions Raphael Gault
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Raphael Gault @ 2019-08-22 14:42 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, acme, Raphael Gault, mingo

This feature is required in order to enable direct userspace access to
the PMU counters only when the system is homogeneous.
The feature checks the model of each CPU brought online and compares it
to the boot CPU. If they differ, the system is heterogeneous.

This CPU feature doesn't prevent CPUs of a different model from being
hotplugged in later; however, if that happens, the feature is turned
off. The feature cannot be turned back on by hotplugging those CPUs
off again, though.
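
A consumer can then simply gate userspace access on this capability;
for instance, patch 5 of this series does exactly this when mapping
events:

	if (cpus_have_cap(ARM64_HAS_HOMOGENEOUS_PMU))
		event->hw.flags |= ARMPMU_EL0_RD_CNTR;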

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 arch/arm64/include/asm/cpucaps.h    |  3 ++-
 arch/arm64/include/asm/cpufeature.h | 10 ++++++++++
 arch/arm64/kernel/cpufeature.c      | 28 ++++++++++++++++++++++++++++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index f19fe4b9acc4..1cd73cf46116 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -52,7 +52,8 @@
 #define ARM64_HAS_IRQ_PRIO_MASKING		42
 #define ARM64_HAS_DCPODP			43
 #define ARM64_WORKAROUND_1463225		44
+#define ARM64_HAS_HOMOGENEOUS_PMU		45
 
-#define ARM64_NCAPS				45
+#define ARM64_NCAPS				46
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 407e2bf23676..c54a87896bbd 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -430,6 +430,16 @@ static inline void cpus_set_cap(unsigned int num)
 	}
 }
 
+static inline void cpus_unset_cap(unsigned int num)
+{
+	if (num >= ARM64_NCAPS) {
+		pr_warn("Attempt to unset an illegal CPU capability (%d >= %d)\n",
+			num, ARM64_NCAPS);
+	} else {
+		clear_bit(num, cpu_hwcaps);
+	}
+}
+
 static inline int __attribute_const__
 cpuid_feature_extract_signed_field_width(u64 features, int field, int width)
 {
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index f29f36a65175..07be444c1e31 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1248,6 +1248,23 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
 }
 #endif
 
+static bool has_homogeneous_pmu(const struct arm64_cpu_capabilities *entry,
+				  int scope)
+{
+	u32 model = read_cpuid_id() & MIDR_CPU_MODEL_MASK;
+	struct cpuinfo_arm64 *boot = get_boot_cpu_data();
+
+	return  (boot->reg_midr & MIDR_CPU_MODEL_MASK) == model;
+}
+
+static void disable_homogeneous_cap(const struct arm64_cpu_capabilities *entry)
+{
+	if (!has_homogeneous_pmu(entry, entry->type)) {
+		pr_info("Disabling Homogeneous PMU (%d)", entry->capability);
+		cpus_unset_cap(entry->capability);
+	}
+}
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		.desc = "GIC system register CPU interface",
@@ -1548,6 +1565,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.min_field_value = 1,
 	},
 #endif
+	{
+		/*
+		 * Detect whether the system is heterogeneous or
+		 * homogeneous
+		 */
+		.desc = "Homogeneous CPUs",
+		.capability = ARM64_HAS_HOMOGENEOUS_PMU,
+		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+		.matches = has_homogeneous_pmu,
+		.cpu_enable = disable_homogeneous_cap,
+	},
 	{},
 };
 
-- 
2.17.1



* [PATCH v4 4/7] arm64: pmu: Add hook to handle pmu-related undefined instructions
  2019-08-22 14:42 [PATCH v4 0/7] arm64: Enable access to pmu registers by user-space Raphael Gault
                   ` (2 preceding siblings ...)
  2019-08-22 14:42 ` [PATCH v4 3/7] arm64: cpufeature: Add feature to detect homogeneous systems Raphael Gault
@ 2019-08-22 14:42 ` Raphael Gault
  2019-08-22 14:42 ` [PATCH v4 5/7] arm64: pmu: Add function implementation to update event index in userpage Raphael Gault
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Raphael Gault @ 2019-08-22 14:42 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, acme, Raphael Gault, mingo

This patch introduces a protection for userspace processes trying to
access the PMU registers on a big.LITTLE environment. It introduces a
hook to handle undefined instruction exceptions.

The goal here is to prevent the process from being interrupted by a
signal when the error is caused by the task being migrated while
accessing a counter, which makes the counter access invalid. As we
cannot efficiently know, in that context, the number of counters
physically available on both PMUs, we consider that any faulting access
to a counter which is architecturally valid should not raise a SIGILL
signal if the permissions are set accordingly.
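
To make the race concrete, below is the seqlock-style read loop that
userspace is expected to use (it is what mmap_read_self() in the first
patch implements, abbreviated here; read_pmevcntr() stands in for the
actual mrs read). A 0 emulated after a migration is harmless because
pc->lock will have changed and the loop retries:

	do {
		seq = READ_ONCE(pc->lock);
		barrier();
		idx = READ_ONCE(pc->index);
		count = READ_ONCE(pc->offset);
		if (idx)
			/* may fault and be emulated as 0 if migrated */
			count += read_pmevcntr(idx - 1);
		barrier();
	} while (READ_ONCE(pc->lock) != seq);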

This commit also modifies the mask of the mrs_hook declared in
arch/arm64/kernel/cpufeature.c, which emulates only feature-register
accesses. This is necessary because that hook's mask was too broad and
thus matched any mrs instruction, even those unrelated to the emulated
registers, which made the PMU emulation inefficient.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 arch/arm64/kernel/cpufeature.c |  4 +--
 arch/arm64/kernel/perf_event.c | 55 ++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 07be444c1e31..3a6285d0b2c0 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2186,8 +2186,8 @@ static int emulate_mrs(struct pt_regs *regs, u32 insn)
 }
 
 static struct undef_hook mrs_hook = {
-	.instr_mask = 0xfff00000,
-	.instr_val  = 0xd5300000,
+	.instr_mask = 0xffff0000,
+	.instr_val  = 0xd5380000,
 	.pstate_mask = PSR_AA32_MODE_MASK,
 	.pstate_val = PSR_MODE_EL0t,
 	.fn = emulate_mrs,
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index a0b4f1bca491..64ca09c9ea65 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -8,9 +8,11 @@
  * This code is based heavily on the ARMv7 perf event code.
  */
 
+#include <asm/cpu.h>
 #include <asm/irq_regs.h>
 #include <asm/perf_event.h>
 #include <asm/sysreg.h>
+#include <asm/traps.h>
 #include <asm/virt.h>
 
 #include <linux/acpi.h>
@@ -1012,6 +1014,59 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
 	return probe.present ? 0 : -ENODEV;
 }
 
+static int emulate_pmu(struct pt_regs *regs, u32 insn)
+{
+	u32 sys_reg, rt;
+	u32 pmuserenr;
+
+	sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn) << 5;
+	rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+	pmuserenr = read_sysreg(pmuserenr_el0);
+
+	if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) !=
+	    (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR))
+		return -EINVAL;
+
+
+	/*
+	 * Userspace is expected to only use this in the context of the scheme
+	 * described in the struct perf_event_mmap_page comments.
+	 *
+	 * Given that context, we can only get here if we got migrated between
+	 * getting the register index and doing the MSR read.  This in turn
+	 * implies we'll fail the sequence and retry, so any value returned is
+	 * 'good', all we need is to be non-fatal.
+	 *
+	 * The choice of the value 0 comes from the fact that when
+	 * accessing a register which is not counting events but is
+	 * accessible, we get 0.
+	 */
+	pt_regs_write_reg(regs, rt, 0);
+
+	arm64_skip_faulting_instruction(regs, 4);
+	return 0;
+}
+
+/*
+ * This hook will only be triggered by mrs
+ * instructions on PMU registers. This is mandatory
+ * in order to have a consistent behaviour even on
+ * big.LITTLE systems.
+ */
+static struct undef_hook pmu_hook = {
+	.instr_mask = 0xffff8800,
+	.instr_val  = 0xd53b8800,
+	.fn = emulate_pmu,
+};
+
+static int __init enable_pmu_emulation(void)
+{
+	register_undef_hook(&pmu_hook);
+	return 0;
+}
+
+core_initcall(enable_pmu_emulation);
+
 static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
 {
 	int ret = armv8pmu_probe_pmu(cpu_pmu);
-- 
2.17.1



* [PATCH v4 5/7] arm64: pmu: Add function implementation to update event index in userpage.
  2019-08-22 14:42 [PATCH v4 0/7] arm64: Enable access to pmu registers by user-space Raphael Gault
                   ` (3 preceding siblings ...)
  2019-08-22 14:42 ` [PATCH v4 4/7] arm64: pmu: Add hook to handle pmu-related undefined instructions Raphael Gault
@ 2019-08-22 14:42 ` Raphael Gault
  2019-08-22 14:42 ` [PATCH v4 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8 Raphael Gault
  2019-08-22 14:42 ` [PATCH v4 7/7] Documentation: arm64: Document PMU counters access from userspace Raphael Gault
  6 siblings, 0 replies; 11+ messages in thread
From: Raphael Gault @ 2019-08-22 14:42 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, acme, Raphael Gault, mingo

In order to be able to access the counter directly from userspace, we
need to provide the index of the counter using the userpage. We thus
need to override the event_idx function to retrieve and convert the
perf_event index to the armv8 hardware index.

Since the arm_pmu driver can be used by any implementation, even if not
armv8, two components play a role in making sure the behaviour is
correct and consistent with the PMU capabilities:

* the ARMPMU_EL0_RD_CNTR flag, which denotes the capability to access a
counter from userspace;
* the event_idx callback, which is implemented and initialized by the
PMU implementation: if no callback is provided, the default behaviour
applies, returning 0 as the index value.
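
As an illustration of the resulting index convention, a userspace
reader would do something like the sketch below, where idx is the
value of pc->index from the userpage (0 meaning no direct access) and
read_pmccntr()/read_pmevcntr() are hypothetical helpers wrapping the
corresponding mrs instructions:

	static u64 read_counter_direct(u32 idx)
	{
		if (idx == 32)			/* remapped cycle counter */
			return read_pmccntr();	/* PMCCNTR_EL0 */
		return read_pmevcntr(idx - 1);	/* PMEVCNTR<idx-1>_EL0 */
	}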

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 arch/arm64/kernel/perf_event.c | 21 +++++++++++++++++++++
 include/linux/perf/arm_pmu.h   |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 64ca09c9ea65..de9b001e8b7c 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -820,6 +820,22 @@ static void armv8pmu_clear_event_idx(struct pmu_hw_events *cpuc,
 		clear_bit(idx - 1, cpuc->used_mask);
 }
 
+static int armv8pmu_access_event_idx(struct perf_event *event)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return 0;
+
+	/*
+	 * We remap the cycle counter index to 32 to
+	 * match the offset applied to the rest of
+	 * the counter indices.
+	 */
+	if (event->hw.idx == ARMV8_IDX_CYCLE_COUNTER)
+		return 32;
+
+	return event->hw.idx;
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -913,6 +929,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
 	if (armv8pmu_event_is_64bit(event))
 		event->hw.flags |= ARMPMU_EVT_64BIT;
 
+	if (cpus_have_cap(ARM64_HAS_HOMOGENEOUS_PMU))
+		event->hw.flags |= ARMPMU_EL0_RD_CNTR;
+
 	/* Only expose micro/arch events supported by this PMU */
 	if ((hw_event_id > 0) && (hw_event_id < ARMV8_PMUV3_MAX_COMMON_EVENTS)
 	    && test_bit(hw_event_id, armpmu->pmceid_bitmap)) {
@@ -1086,6 +1105,8 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->set_event_filter	= armv8pmu_set_event_filter;
 	cpu_pmu->filter_match		= armv8pmu_filter_match;
 
+	cpu_pmu->pmu.event_idx		= armv8pmu_access_event_idx;
+
 	return 0;
 }
 
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 71f525a35ac2..1106a9ac00fd 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -26,6 +26,8 @@
  */
 /* Event uses a 64bit counter */
 #define ARMPMU_EVT_64BIT		1
+/* Allow access to hardware counter from userspace */
+#define ARMPMU_EL0_RD_CNTR		2
 
 #define HW_OP_UNSUPPORTED		0xFFFF
 #define C(_x)				PERF_COUNT_HW_CACHE_##_x
-- 
2.17.1



* [PATCH v4 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8
  2019-08-22 14:42 [PATCH v4 0/7] arm64: Enable access to pmu registers by user-space Raphael Gault
                   ` (4 preceding siblings ...)
  2019-08-22 14:42 ` [PATCH v4 5/7] arm64: pmu: Add function implementation to update event index in userpage Raphael Gault
@ 2019-08-22 14:42 ` Raphael Gault
  2019-08-22 16:39   ` Jonathan Cameron
  2019-08-22 14:42 ` [PATCH v4 7/7] Documentation: arm64: Document PMU counters access from userspace Raphael Gault
  6 siblings, 1 reply; 11+ messages in thread
From: Raphael Gault @ 2019-08-22 14:42 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, acme, Raphael Gault, mingo

Keep track of events opened with direct access to the hardware
counters and modify permissions while they are open.

The strategy used here is the same one x86 uses: every time an event is
mapped, the permissions are set if required. The atomic field added in
the mm_context helps keep track of the events opened and de-activates
the permissions when all are unmapped.
We also need to update the permissions in the context switch code so
that tasks keep the right permissions.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 arch/arm64/include/asm/mmu.h         |  6 ++++
 arch/arm64/include/asm/mmu_context.h |  2 ++
 arch/arm64/include/asm/perf_event.h  | 14 ++++++++
 arch/arm64/kernel/perf_event.c       |  1 +
 drivers/perf/arm_pmu.c               | 54 ++++++++++++++++++++++++++++
 5 files changed, 77 insertions(+)

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index fd6161336653..88ed4466bd06 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -18,6 +18,12 @@
 
 typedef struct {
 	atomic64_t	id;
+
+	/*
+	 * non-zero if userspace have access to hardware
+	 * counters directly.
+	 */
+	atomic_t	pmu_direct_access;
 	void		*vdso;
 	unsigned long	flags;
 } mm_context_t;
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 7ed0adb187a8..6e66ff940494 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -21,6 +21,7 @@
 #include <asm-generic/mm_hooks.h>
 #include <asm/cputype.h>
 #include <asm/pgtable.h>
+#include <asm/perf_event.h>
 #include <asm/sysreg.h>
 #include <asm/tlbflush.h>
 
@@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next)
 	}
 
 	check_and_switch_context(next, cpu);
+	perf_switch_user_access(next);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 2bdbc79bbd01..ba58fa726631 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -8,6 +8,7 @@
 
 #include <asm/stack_pointer.h>
 #include <asm/ptrace.h>
+#include <linux/mm_types.h>
 
 #define	ARMV8_PMU_MAX_COUNTERS	32
 #define	ARMV8_PMU_COUNTER_MASK	(ARMV8_PMU_MAX_COUNTERS - 1)
@@ -223,4 +224,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 	(regs)->pstate = PSR_MODE_EL1h;	\
 }
 
+static inline void perf_switch_user_access(struct mm_struct *mm)
+{
+	if (!IS_ENABLED(CONFIG_PERF_EVENTS))
+		return;
+
+	if (atomic_read(&mm->context.pmu_direct_access)) {
+		write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
+			     pmuserenr_el0);
+	} else {
+		write_sysreg(0, pmuserenr_el0);
+	}
+}
+
 #endif
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index de9b001e8b7c..7de56f22d038 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1285,6 +1285,7 @@ void arch_perf_update_userpage(struct perf_event *event,
 	 */
 	freq = arch_timer_get_rate();
 	userpg->cap_user_time = 1;
+	userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
 
 	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
 			NSEC_PER_SEC, 0);
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2d06b8095a19..d0d3e523a4c4 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -25,6 +25,7 @@
 #include <linux/irqdesc.h>
 
 #include <asm/irq_regs.h>
+#include <asm/mmu_context.h>
 
 static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
 static DEFINE_PER_CPU(int, cpu_irq);
@@ -778,6 +779,57 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
 					    &cpu_pmu->node);
 }
 
+static void refresh_pmuserenr(void *mm)
+{
+	perf_switch_user_access(mm);
+}
+
+static int check_homogeneous_cap(struct perf_event *event, struct mm_struct *mm)
+{
+	pr_info("checking HAS_HOMOGENEOUS_PMU");
+	if (!cpus_have_cap(ARM64_HAS_HOMOGENEOUS_PMU)) {
+		pr_info("Disable direct access (!HAS_HOMOGENEOUS_PMU)");
+		atomic_set(&mm->context.pmu_direct_access, 0);
+		on_each_cpu(refresh_pmuserenr, mm, 1);
+		event->hw.flags &= ~ARMPMU_EL0_RD_CNTR;
+		return 0;
+	}
+
+	return 1;
+}
+
+static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return;
+
+	/*
+	 * This function relies on not being called concurrently in two
+	 * tasks in the same mm.  Otherwise one task could observe
+	 * pmu_direct_access > 1 and return all the way back to
+	 * userspace with user access disabled while another task is still
+	 * doing on_each_cpu_mask() to enable user access.
+	 *
+	 * For now, this can't happen because all callers hold mmap_sem
+	 * for write.  If this changes, we'll need a different solution.
+	 */
+	lockdep_assert_held_write(&mm->mmap_sem);
+
+	if (check_homogeneous_cap(event, mm) &&
+	    atomic_inc_return(&mm->context.pmu_direct_access) == 1)
+		on_each_cpu(refresh_pmuserenr, mm, 1);
+}
+
+static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
+{
+	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
+		return;
+
+	if (check_homogeneous_cap(event, mm) &&
+	    atomic_dec_and_test(&mm->context.pmu_direct_access))
+		on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
+}
+
 static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 {
 	struct arm_pmu *pmu;
@@ -799,6 +851,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 		.pmu_enable	= armpmu_enable,
 		.pmu_disable	= armpmu_disable,
 		.event_init	= armpmu_event_init,
+		.event_mapped	= armpmu_event_mapped,
+		.event_unmapped	= armpmu_event_unmapped,
 		.add		= armpmu_add,
 		.del		= armpmu_del,
 		.start		= armpmu_start,
-- 
2.17.1



* [PATCH v4 7/7] Documentation: arm64: Document PMU counters access from userspace
  2019-08-22 14:42 [PATCH v4 0/7] arm64: Enable access to pmu registers by user-space Raphael Gault
                   ` (5 preceding siblings ...)
  2019-08-22 14:42 ` [PATCH v4 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8 Raphael Gault
@ 2019-08-22 14:42 ` Raphael Gault
  6 siblings, 0 replies; 11+ messages in thread
From: Raphael Gault @ 2019-08-22 14:42 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, acme, Raphael Gault, mingo

Add a documentation file describing access to the PMU hardware
counters from userspace.

Signed-off-by: Raphael Gault <raphael.gault@arm.com>
---
 .../arm64/pmu_counter_user_access.txt         | 42 +++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100644 Documentation/arm64/pmu_counter_user_access.txt

diff --git a/Documentation/arm64/pmu_counter_user_access.txt b/Documentation/arm64/pmu_counter_user_access.txt
new file mode 100644
index 000000000000..6788b1107381
--- /dev/null
+++ b/Documentation/arm64/pmu_counter_user_access.txt
@@ -0,0 +1,42 @@
+Access to PMU hardware counter from userspace
+=============================================
+
+Overview
+--------
+The perf user-space tool relies on the PMU to monitor events. It offers an
+abstraction layer over the hardware counters since the underlying
+implementation is cpu-dependent.
+Arm64 allows userspace tools to have access to the registers storing the
+hardware counters' values directly.
+
+This targets specifically self-monitoring tasks in order to reduce the overhead
+by directly accessing the registers without having to go through the kernel.
+
+How-to
+------
+The focus is on the armv8 pmuv3, which makes sure that access to the PMU
+registers is enabled and that userspace has access to the relevant
+information in order to use them.
+
+In order to have access to the hardware counter it is necessary to open the event
+using the perf tool interface: the sys_perf_event_open syscall returns a fd which
+can subsequently be used with the mmap syscall in order to retrieve a page of memory
+containing information about the event.
+The PMU driver uses this page to expose to the user the hardware counter's
+index. Using this index enables the user to access the PMU registers using the
+`mrs` instruction.
+
+Have a look at `tools/perf/arch/arm64/tests/user-events.c` for an example. It can be
+run using the perf tool to check that the access to the registers works
+correctly from userspace:
+
+./perf test -v
+
+About chained events
+--------------------
+When the user requests that an event be counted on 64 bits, two hardware
+counters are used and need to be combined to retrieve the correct value:
+
+val = read_counter(idx);
+if ((event.attr.config1 & 0x1))
+	val = (val << 32) | read_counter(idx - 1);
-- 
2.17.1



* Re: [PATCH v4 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8
  2019-08-22 14:42 ` [PATCH v4 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8 Raphael Gault
@ 2019-08-22 16:39   ` Jonathan Cameron
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Cameron @ 2019-08-22 16:39 UTC (permalink / raw)
  To: Raphael Gault
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, linux-kernel, acme, mingo, linux-arm-kernel

On Thu, 22 Aug 2019 15:42:19 +0100
Raphael Gault <raphael.gault@arm.com> wrote:

> Keep track of events opened with direct access to the hardware
> counters and modify permissions while they are open.
> 
> The strategy used here is the same one x86 uses: every time an event is
> mapped, the permissions are set if required. The atomic field added in
> the mm_context helps keep track of the events opened and de-activates
> the permissions when all are unmapped.
> We also need to update the permissions in the context switch code so
> that tasks keep the right permissions.
> 
> Signed-off-by: Raphael Gault <raphael.gault@arm.com>

Hi Raphael,

One trivial comment inline.

Thanks,

Jonathan


> ---
>  arch/arm64/include/asm/mmu.h         |  6 ++++
>  arch/arm64/include/asm/mmu_context.h |  2 ++
>  arch/arm64/include/asm/perf_event.h  | 14 ++++++++
>  arch/arm64/kernel/perf_event.c       |  1 +
>  drivers/perf/arm_pmu.c               | 54 ++++++++++++++++++++++++++++
>  5 files changed, 77 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> index fd6161336653..88ed4466bd06 100644
> --- a/arch/arm64/include/asm/mmu.h
> +++ b/arch/arm64/include/asm/mmu.h
> @@ -18,6 +18,12 @@
>  
>  typedef struct {
>  	atomic64_t	id;
> +
> +	/*
> +	 * non-zero if userspace have access to hardware
> +	 * counters directly.
> +	 */
> +	atomic_t	pmu_direct_access;
>  	void		*vdso;
>  	unsigned long	flags;
>  } mm_context_t;
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index 7ed0adb187a8..6e66ff940494 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -21,6 +21,7 @@
>  #include <asm-generic/mm_hooks.h>
>  #include <asm/cputype.h>
>  #include <asm/pgtable.h>
> +#include <asm/perf_event.h>
>  #include <asm/sysreg.h>
>  #include <asm/tlbflush.h>
>  
> @@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next)
>  	}
>  
>  	check_and_switch_context(next, cpu);
> +	perf_switch_user_access(next);
>  }
>  
>  static inline void
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index 2bdbc79bbd01..ba58fa726631 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -8,6 +8,7 @@
>  
>  #include <asm/stack_pointer.h>
>  #include <asm/ptrace.h>
> +#include <linux/mm_types.h>
>  
>  #define	ARMV8_PMU_MAX_COUNTERS	32
>  #define	ARMV8_PMU_COUNTER_MASK	(ARMV8_PMU_MAX_COUNTERS - 1)
> @@ -223,4 +224,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
>  	(regs)->pstate = PSR_MODE_EL1h;	\
>  }
>  
> +static inline void perf_switch_user_access(struct mm_struct *mm)
> +{
> +	if (!IS_ENABLED(CONFIG_PERF_EVENTS))
> +		return;
> +
> +	if (atomic_read(&mm->context.pmu_direct_access)) {
> +		write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
> +			     pmuserenr_el0);
> +	} else {
> +		write_sysreg(0, pmuserenr_el0);
> +	}
> +}
> +
>  #endif
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index de9b001e8b7c..7de56f22d038 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -1285,6 +1285,7 @@ void arch_perf_update_userpage(struct perf_event *event,
>  	 */
>  	freq = arch_timer_get_rate();
>  	userpg->cap_user_time = 1;
> +	userpg->cap_user_rdpmc = !!(event->hw.flags & ARMPMU_EL0_RD_CNTR);
>  
>  	clocks_calc_mult_shift(&userpg->time_mult, &shift, freq,
>  			NSEC_PER_SEC, 0);
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 2d06b8095a19..d0d3e523a4c4 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -25,6 +25,7 @@
>  #include <linux/irqdesc.h>
>  
>  #include <asm/irq_regs.h>
> +#include <asm/mmu_context.h>
>  
>  static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
>  static DEFINE_PER_CPU(int, cpu_irq);
> @@ -778,6 +779,57 @@ static void cpu_pmu_destroy(struct arm_pmu *cpu_pmu)
>  					    &cpu_pmu->node);
>  }
>  
> +static void refresh_pmuserenr(void *mm)
> +{
> +	perf_switch_user_access(mm);
> +}
> +
> +static int check_homogeneous_cap(struct perf_event *event, struct mm_struct *mm)
> +{
> +	pr_info("checking HAS_HOMOGENEOUS_PMU");

Can we drop this spam from the good path? It makes a bit of a mess of
my terminal when running the test ;)

> +	if (!cpus_have_cap(ARM64_HAS_HOMOGENEOUS_PMU)) {
> +		pr_info("Disable direct access (!HAS_HOMOGENEOUS_PMU)");
> +		atomic_set(&mm->context.pmu_direct_access, 0);
> +		on_each_cpu(refresh_pmuserenr, mm, 1);
> +		event->hw.flags &= ~ARMPMU_EL0_RD_CNTR;
> +		return 0;
> +	}
> +
> +	return 1;
> +}
> +
> +static void armpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
> +{
> +	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
> +		return;
> +
> +	/*
> +	 * This function relies on not being called concurrently in two
> +	 * tasks in the same mm.  Otherwise one task could observe
> +	 * pmu_direct_access > 1 and return all the way back to
> +	 * userspace with user access disabled while another task is still
> +	 * doing on_each_cpu_mask() to enable user access.
> +	 *
> +	 * For now, this can't happen because all callers hold mmap_sem
> +	 * for write.  If this changes, we'll need a different solution.
> +	 */
> +	lockdep_assert_held_write(&mm->mmap_sem);
> +
> +	if (check_homogeneous_cap(event, mm) &&
> +	    atomic_inc_return(&mm->context.pmu_direct_access) == 1)
> +		on_each_cpu(refresh_pmuserenr, mm, 1);
> +}
> +
> +static void armpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
> +{
> +	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
> +		return;
> +
> +	if (check_homogeneous_cap(event, mm) &&
> +	    atomic_dec_and_test(&mm->context.pmu_direct_access))
> +		on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
> +}
> +
>  static struct arm_pmu *__armpmu_alloc(gfp_t flags)
>  {
>  	struct arm_pmu *pmu;
> @@ -799,6 +851,8 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
>  		.pmu_enable	= armpmu_enable,
>  		.pmu_disable	= armpmu_disable,
>  		.event_init	= armpmu_event_init,
> +		.event_mapped	= armpmu_event_mapped,
> +		.event_unmapped	= armpmu_event_unmapped,
>  		.add		= armpmu_add,
>  		.del		= armpmu_del,
>  		.start		= armpmu_start,




* Re: [PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters.
  2019-08-22 14:42 ` [PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters Raphael Gault
@ 2019-08-27 11:17   ` Jonathan Cameron
  2020-06-03 16:19     ` Rob Herring
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Cameron @ 2019-08-27 11:17 UTC (permalink / raw)
  To: Raphael Gault
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, linux-kernel, acme, mingo, linux-arm-kernel

On Thu, 22 Aug 2019 15:42:14 +0100
Raphael Gault <raphael.gault@arm.com> wrote:

> This test relies on the PMU registers being accessible from userspace.
> It uses the perf_event_mmap_page to retrieve the counter index and
> accesses the underlying register.
> 
> This test uses sched_setaffinity(2) in order to run on all CPUs and
> thus check the behaviour of the PMU of every CPU in a big.LITTLE
> environment.
> 
> Signed-off-by: Raphael Gault <raphael.gault@arm.com>

Hi Raphael,

I just tested this on 1620 and it works fairly nicely with one exception...

The test will run and generate garbage numbers if the rest of the
series isn't yet applied to the kernel.  Is there anything we can do
to prevent that?

It's a slightly silly complaint, but this also takes a while compared
to all the other tests if you have lots of cores, so maybe a slightly
shorter test?

Thanks,

Jonathan

> ---
>  tools/perf/arch/arm64/include/arch-tests.h |   7 +
>  tools/perf/arch/arm64/tests/Build          |   1 +
>  tools/perf/arch/arm64/tests/arch-tests.c   |   4 +
>  tools/perf/arch/arm64/tests/user-events.c  | 254 +++++++++++++++++++++
>  4 files changed, 266 insertions(+)
>  create mode 100644 tools/perf/arch/arm64/tests/user-events.c
> 
> diff --git a/tools/perf/arch/arm64/include/arch-tests.h b/tools/perf/arch/arm64/include/arch-tests.h
> index 90ec4c8cb880..6a8483de1015 100644
> --- a/tools/perf/arch/arm64/include/arch-tests.h
> +++ b/tools/perf/arch/arm64/include/arch-tests.h
> @@ -2,11 +2,18 @@
>  #ifndef ARCH_TESTS_H
>  #define ARCH_TESTS_H
>  
> +#include <linux/compiler.h>
> +
>  #ifdef HAVE_DWARF_UNWIND_SUPPORT
>  struct thread;
>  struct perf_sample;
> +int test__arch_unwind_sample(struct perf_sample *sample,
> +			     struct thread *thread);
>  #endif
>  
>  extern struct test arch_tests[];
> +int test__rd_pmevcntr(struct test *test __maybe_unused,
> +		      int subtest __maybe_unused);
> +
>  
>  #endif
> diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build
> index a61c06bdb757..3f9a20c17fc6 100644
> --- a/tools/perf/arch/arm64/tests/Build
> +++ b/tools/perf/arch/arm64/tests/Build
> @@ -1,4 +1,5 @@
>  perf-y += regs_load.o
>  perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
>  
> +perf-y += user-events.o
>  perf-y += arch-tests.o
> diff --git a/tools/perf/arch/arm64/tests/arch-tests.c b/tools/perf/arch/arm64/tests/arch-tests.c
> index 5b1543c98022..57df9b89dede 100644
> --- a/tools/perf/arch/arm64/tests/arch-tests.c
> +++ b/tools/perf/arch/arm64/tests/arch-tests.c
> @@ -10,6 +10,10 @@ struct test arch_tests[] = {
>  		.func = test__dwarf_unwind,
>  	},
>  #endif
> +	{
> +		.desc = "User counter access",
> +		.func = test__rd_pmevcntr,
> +	},
>  	{
>  		.func = NULL,
>  	},
> diff --git a/tools/perf/arch/arm64/tests/user-events.c b/tools/perf/arch/arm64/tests/user-events.c
> new file mode 100644
> index 000000000000..b048d7e392bc
> --- /dev/null
> +++ b/tools/perf/arch/arm64/tests/user-events.c
> @@ -0,0 +1,254 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <asm/bug.h>
> +#include <errno.h>
> +#include <unistd.h>
> +#include <sched.h>
> +#include <stdlib.h>
> +#include <signal.h>
> +#include <sys/mman.h>
> +#include <sys/sysinfo.h>
> +#include <sys/types.h>
> +#include <sys/wait.h>
> +#include <linux/types.h>
> +#include "perf.h"
> +#include "debug.h"
> +#include "tests/tests.h"
> +#include "cloexec.h"
> +#include "util.h"
> +#include "arch-tests.h"
> +
> +/*
> + * ARMv8 ARM reserves the following encoding for system registers:
> + * (Ref: ARMv8 ARM, Section: "System instruction class encoding overview",
> + *  C5.2, version:ARM DDI 0487A.f)
> + *      [20-19] : Op0
> + *      [18-16] : Op1
> + *      [15-12] : CRn
> + *      [11-8]  : CRm
> + *      [7-5]   : Op2
> + */
> +#define Op0_shift       19
> +#define Op0_mask        0x3
> +#define Op1_shift       16
> +#define Op1_mask        0x7
> +#define CRn_shift       12
> +#define CRn_mask        0xf
> +#define CRm_shift       8
> +#define CRm_mask        0xf
> +#define Op2_shift       5
> +#define Op2_mask        0x7
> +
> +#define __stringify(x)	#x
> +
> +#define read_sysreg(r) ({						\
> +	u64 __val;							\
> +	asm volatile("mrs %0, " __stringify(r) : "=r" (__val));		\
> +	__val;								\
> +})
> +
> +#define PMEVCNTR_READ_CASE(idx)					\
> +	case idx:						\
> +		return read_sysreg(pmevcntr##idx##_el0)
> +
> +#define PMEVCNTR_CASES(readwrite)		\
> +	PMEVCNTR_READ_CASE(0);			\
> +	PMEVCNTR_READ_CASE(1);			\
> +	PMEVCNTR_READ_CASE(2);			\
> +	PMEVCNTR_READ_CASE(3);			\
> +	PMEVCNTR_READ_CASE(4);			\
> +	PMEVCNTR_READ_CASE(5);			\
> +	PMEVCNTR_READ_CASE(6);			\
> +	PMEVCNTR_READ_CASE(7);			\
> +	PMEVCNTR_READ_CASE(8);			\
> +	PMEVCNTR_READ_CASE(9);			\
> +	PMEVCNTR_READ_CASE(10);			\
> +	PMEVCNTR_READ_CASE(11);			\
> +	PMEVCNTR_READ_CASE(12);			\
> +	PMEVCNTR_READ_CASE(13);			\
> +	PMEVCNTR_READ_CASE(14);			\
> +	PMEVCNTR_READ_CASE(15);			\
> +	PMEVCNTR_READ_CASE(16);			\
> +	PMEVCNTR_READ_CASE(17);			\
> +	PMEVCNTR_READ_CASE(18);			\
> +	PMEVCNTR_READ_CASE(19);			\
> +	PMEVCNTR_READ_CASE(20);			\
> +	PMEVCNTR_READ_CASE(21);			\
> +	PMEVCNTR_READ_CASE(22);			\
> +	PMEVCNTR_READ_CASE(23);			\
> +	PMEVCNTR_READ_CASE(24);			\
> +	PMEVCNTR_READ_CASE(25);			\
> +	PMEVCNTR_READ_CASE(26);			\
> +	PMEVCNTR_READ_CASE(27);			\
> +	PMEVCNTR_READ_CASE(28);			\
> +	PMEVCNTR_READ_CASE(29);			\
> +	PMEVCNTR_READ_CASE(30)
> +
> +/*
> + * Read a value direct from PMEVCNTR<idx>
> + */
> +static u64 read_evcnt_direct(int idx)
> +{
> +	switch (idx) {
> +	PMEVCNTR_CASES(READ);
> +	default:
> +		WARN_ON(1);
> +	}
> +
> +	return 0;
> +}
> +
> +static u64 mmap_read_self(void *addr)
> +{
> +	struct perf_event_mmap_page *pc = addr;
> +	u32 seq, idx, time_mult = 0, time_shift = 0;
> +	u64 count, cyc = 0, time_offset = 0, enabled, running, delta;
> +
> +	do {
> +		seq = READ_ONCE(pc->lock);
> +		barrier();
> +
> +		enabled = READ_ONCE(pc->time_enabled);
> +		running = READ_ONCE(pc->time_running);
> +
> +		if (enabled != running) {
> +			cyc = read_sysreg(cntvct_el0);
> +			time_mult = READ_ONCE(pc->time_mult);
> +			time_shift = READ_ONCE(pc->time_shift);
> +			time_offset = READ_ONCE(pc->time_offset);
> +		}
> +
> +		idx = READ_ONCE(pc->index);
> +		count = READ_ONCE(pc->offset);
> +		if (idx)
> +			count += read_evcnt_direct(idx - 1);
> +
> +		barrier();
> +	} while (READ_ONCE(pc->lock) != seq);
> +
> +	if (enabled != running) {
> +		u64 quot, rem;
> +
> +		quot = (cyc >> time_shift);
> +		rem = cyc & (((u64)1 << time_shift) - 1);
> +		delta = time_offset + quot * time_mult +
> +			((rem * time_mult) >> time_shift);
> +
> +		enabled += delta;
> +		if (idx)
> +			running += delta;
> +
> +		quot = count / running;
> +		rem = count % running;
> +		count = quot * enabled + (rem * enabled) / running;
> +	}
> +
> +	return count;
> +}
> +
> +static int __test__rd_pmevcntr(void)
> +{
> +	volatile int tmp = 0;
> +	u64 i, loops = 1000;
> +	int n;
> +	int fd;
> +	void *addr;
> +	struct perf_event_attr attr = {
> +		.type = PERF_TYPE_HARDWARE,
> +		.config = PERF_COUNT_HW_INSTRUCTIONS,
> +		.exclude_kernel = 1,
> +	};
> +	u64 delta_sum = 0;
> +	char sbuf[STRERR_BUFSIZE];
> +
> +	fd = sys_perf_event_open(&attr, 0, -1, -1,
> +				 perf_event_open_cloexec_flag());
> +	if (fd < 0) {
> +		pr_err("Error: sys_perf_event_open() syscall returned with %d (%s)\n", fd,
> +		       str_error_r(errno, sbuf, sizeof(sbuf)));
> +		return -1;
> +	}
> +
> +	addr = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
> +	if (addr == (void *)(-1)) {
> +		pr_err("Error: mmap() syscall returned with (%s)\n",
> +		       str_error_r(errno, sbuf, sizeof(sbuf)));
> +		goto out_close;
> +	}
> +
> +	for (n = 0; n < 6; n++) {
> +		u64 stamp, now, delta;
> +
> +		stamp = mmap_read_self(addr);
> +
> +		for (i = 0; i < loops; i++)
> +			tmp++;
> +
> +		now = mmap_read_self(addr);
> +		loops *= 10;
> +
> +		delta = now - stamp;
> +		pr_debug("%14d: %14llu\n", n, (long long)delta);
> +
> +		delta_sum += delta;
> +	}
> +
> +	munmap(addr, page_size);
> +	pr_debug("   ");
> +
> +out_close:
> +	close(fd);
> +
> +	if (!delta_sum)
> +		return -1;
> +
> +	return 0;
> +}
> +
> +int test__rd_pmevcntr(struct test __maybe_unused *test,
> +		      int __maybe_unused subtest)
> +{
> +	int status = 0;
> +	int wret = 0;
> +	int ret = 0;
> +	int pid;
> +	int cpu;
> +	cpu_set_t cpu_set;
> +
> +	pid = fork();
> +	if (pid < 0)
> +		return -1;
> +
> +	if (!pid) {
> +		for (cpu = 0; cpu < get_nprocs(); cpu++) {
> +			pr_info("setting affinity to cpu: %d\n", cpu);
> +			CPU_ZERO(&cpu_set);
> +			CPU_SET(cpu, &cpu_set);
> +			if (sched_setaffinity(getpid(),
> +					      sizeof(cpu_set),
> +					      &cpu_set) == -1) {
> +				pr_err("Error: impossible to set cpu (%d) affinity\n",
> +				       cpu);
> +				continue;
> +			}
> +			ret = __test__rd_pmevcntr();
> +		}
> +		exit(ret);
> +	}
> +
> +	wret = waitpid(pid, &status, 0);
> +	if (wret < 0)
> +		return -1;
> +
> +	if (WIFSIGNALED(status)) {
> +		pr_err("Error: the child process was interrupted by a signal\n");
> +		return -1;
> +	}
> +
> +	if (WIFEXITED(status) && WEXITSTATUS(status)) {
> +		pr_err("Error: the child process exited with: %d\n",
> +		       WEXITSTATUS(status));
> +		return -1;
> +	}
> +
> +	return 0;
> +}




* Re: [PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters.
  2019-08-27 11:17   ` Jonathan Cameron
@ 2020-06-03 16:19     ` Rob Herring
  0 siblings, 0 replies; 11+ messages in thread
From: Rob Herring @ 2020-06-03 16:19 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: mark.rutland, raph.gault+kdev, peterz, catalin.marinas,
	will.deacon, linux-kernel, acme, Raphael Gault, mingo,
	linux-arm-kernel

On Tue, Aug 27, 2019 at 07:17:55PM +0800, Jonathan Cameron wrote:
> On Thu, 22 Aug 2019 15:42:14 +0100
> Raphael Gault <raphael.gault@arm.com> wrote:
> 
> > This test relies on the PMU registers being accessible from userspace.
> > It uses the perf_event_mmap_page to retrieve the counter index and
> > accesses the underlying register.
> > 
> > This test uses sched_setaffinity(2) in order to run on all CPUs and
> > thus check the behaviour of the PMU of every CPU in a big.LITTLE
> > environment.
> > 
> > Signed-off-by: Raphael Gault <raphael.gault@arm.com>
> 
> Hi Raphael,
> 
> I just tested this on 1620 and it works fairly nicely with one exception...

I'm working on reviving this series.

> The test will run and generate garbage numbers if the rest of the
> series isn't yet applied to the kernel.  Is there anything we can do
> to prevent that?

I've added a check that user access is enabled, which should prevent
that. It also validates that pmc_width is set, which was missing in
this series.

> It's a slightly silly complaint, but this also take a while compared to all 
> the other tests if you have lots of cores, so maybe a slightly shorter
> test?

I'm not sure what the value of running on every core was supposed to be. 
If we want to check big.LITTLE, then the test should detect that and 
pass if user access is disabled on all cores. If we're not on 
big.LITTLE, then I don't see the point in this test running on every 
core.

Rob


Thread overview: 11+ messages
2019-08-22 14:42 [PATCH v4 0/7] arm64: Enable access to pmu registers by user-space Raphael Gault
2019-08-22 14:42 ` [PATCH v4 1/7] perf: arm64: Add test to check userspace access to hardware counters Raphael Gault
2019-08-27 11:17   ` Jonathan Cameron
2020-06-03 16:19     ` Rob Herring
2019-08-22 14:42 ` [PATCH v4 2/7] arm64: cpu: Add accessor for boot_cpu_data Raphael Gault
2019-08-22 14:42 ` [PATCH v4 3/7] arm64: cpufeature: Add feature to detect homogeneous systems Raphael Gault
2019-08-22 14:42 ` [PATCH v4 4/7] arm64: pmu: Add hook to handle pmu-related undefined instructions Raphael Gault
2019-08-22 14:42 ` [PATCH v4 5/7] arm64: pmu: Add function implementation to update event index in userpage Raphael Gault
2019-08-22 14:42 ` [PATCH v4 6/7] arm64: perf: Enable pmu counter direct access for perf event on armv8 Raphael Gault
2019-08-22 16:39   ` Jonathan Cameron
2019-08-22 14:42 ` [PATCH v4 7/7] Documentation: arm64: Document PMU counters access from userspace Raphael Gault
