linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v3 0/5] arm64: vdso: getcpu() support
@ 2020-08-19 12:13 Mark Brown
  2020-08-19 12:13 ` [PATCH v3 1/5] arm64: vdso: Provide a define when building the vDSO Mark Brown
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Mark Brown @ 2020-08-19 12:13 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Brown, Vincenzo Frascino, Shuah Khan, linux-kselftest,
	linux-arm-kernel

Some applications, especially tracing ones, benefit from avoiding the
syscall overhead for getcpu() so it is common for architectures to have
vDSO implementations. Add one for arm64, using TPIDRRO_EL0 to pass a
pointer to per-CPU data rather than just store the immediate value in
order to allow for future extensibility.

It is questionable if something TPIDRRO_EL0 based is worthwhile at all
on current kernels. Since v4.18 we have had support for restartable
sequences, which can be used to provide a sched_getcpu() implementation
with generally better performance than the vDSO approach on
architectures which have that[1]. Work is ongoing to implement this for
glibc:

    https://lore.kernel.org/lkml/20200527185130.5604-3-mathieu.desnoyers@efficios.com/

but is not yet merged and will need similar work for other userspaces.
The main advantages for the vDSO implementation are the node parameter
(though this is a static mapping to CPU number so could be looked up
separately when processing data if it's needed, it shouldn't need to be
in the hot path) and ease of implementation for users.
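
For reference, the syscall path that a vDSO getcpu() would avoid can be
exercised from userspace along these lines. This is only a minimal
sketch and not part of the series: the helper name getcpu_via_syscall()
is made up for illustration, and the third (cache) argument is simply
passed as NULL.

```c
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical helper (not from this series): ask the kernel for the
 * current CPU and NUMA node using the raw getcpu syscall. Either
 * pointer may be NULL if the caller does not need that value.
 */
static int getcpu_via_syscall(unsigned int *cpu, unsigned int *node)
{
	return (int)syscall(SYS_getcpu, cpu, node, NULL);
}
```

The result is only a snapshot, since the thread may migrate right after
the call returns. That is equally true of a vDSO implementation.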

This is currently not compatible with KPTI due to the use of TPIDRRO_EL0
by the KPTI trampoline. This could be addressed by reinitializing that
system register in the return path, but I have found it hard to justify
adding that overhead for all users for something that is essentially a
profiling optimization which is likely to get superseded by a more
modern implementation. If there are other uses for the per-CPU data
then the balance might change here.

This builds on work done by Kristina Martsenko some time ago but is a
new implementation.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d7822b1e24f2df5df98c76f0e94a5416349ff759

v3:
 - Rebase on v5.9-rc1.
 - Drop in progress portions of the series.
v2:
 - Rebase on v5.8-rc3.
 - Add further cleanup patches & a first draft of multi-page support.

Mark Brown (5):
  arm64: vdso: Provide a define when building the vDSO
  arm64: vdso: Add per-CPU data
  arm64: vdso: Initialise the per-CPU vDSO data
  arm64: vdso: Add getcpu() implementation
  selftests: vdso: Support arm64 in getcpu() test

 arch/arm64/include/asm/processor.h            | 12 +----
 arch/arm64/include/asm/vdso/datapage.h        | 54 +++++++++++++++++++
 arch/arm64/kernel/process.c                   | 26 ++++++++-
 arch/arm64/kernel/vdso.c                      | 33 +++++++++++-
 arch/arm64/kernel/vdso/Makefile               |  4 +-
 arch/arm64/kernel/vdso/vdso.lds.S             |  1 +
 arch/arm64/kernel/vdso/vgetcpu.c              | 48 +++++++++++++++++
 .../testing/selftests/vDSO/vdso_test_getcpu.c | 10 ++++
 8 files changed, 172 insertions(+), 16 deletions(-)
 create mode 100644 arch/arm64/include/asm/vdso/datapage.h
 create mode 100644 arch/arm64/kernel/vdso/vgetcpu.c

-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [PATCH v3 1/5] arm64: vdso: Provide a define when building the vDSO
  2020-08-19 12:13 [PATCH v3 0/5] arm64: vdso: getcpu() support Mark Brown
@ 2020-08-19 12:13 ` Mark Brown
  2020-08-19 12:13 ` [PATCH v3 2/5] arm64: vdso: Add per-CPU data Mark Brown
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2020-08-19 12:13 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Brown, Vincenzo Frascino, Shuah Khan, linux-kselftest,
	linux-arm-kernel

Provide a define identifying if code is being built for the vDSO to help
with writing headers that are shared between the kernel and the vDSO.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kernel/vdso/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
index 45d5cfe46429..88cf0f0b91ed 100644
--- a/arch/arm64/kernel/vdso/Makefile
+++ b/arch/arm64/kernel/vdso/Makefile
@@ -28,7 +28,7 @@ ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 --hash-style=sysv	\
 	     $(btildflags-y) -T
 
 ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
-ccflags-y += -DDISABLE_BRANCH_PROFILING
+ccflags-y += -DDISABLE_BRANCH_PROFILING -D__VDSO__
 
 CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) $(GCC_PLUGINS_CFLAGS)
 KBUILD_CFLAGS			+= $(DISABLE_LTO)
-- 
2.20.1



* [PATCH v3 2/5] arm64: vdso: Add per-CPU data
  2020-08-19 12:13 [PATCH v3 0/5] arm64: vdso: getcpu() support Mark Brown
  2020-08-19 12:13 ` [PATCH v3 1/5] arm64: vdso: Provide a define when building the vDSO Mark Brown
@ 2020-08-19 12:13 ` Mark Brown
  2020-08-19 12:13 ` [PATCH v3 3/5] arm64: vdso: Initialise the per-CPU vDSO data Mark Brown
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2020-08-19 12:13 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Brown, Vincenzo Frascino, Shuah Khan, linux-kselftest,
	linux-arm-kernel

In order to support a vDSO getcpu() implementation add per-CPU data to
the vDSO data page. Do this by wrapping the generic vdso_data struct in
an arm64 specific one with an array of per-CPU data. The offset of the
per-CPU data applying to a CPU will be stored in TPIDRRO_EL0, which
allows us to get to the per-CPU data without doing any multiplications.

Since we currently only map a single data page for the vDSO but support
very large numbers of CPUs, TPIDRRO_EL0 may be set to zero for CPUs which
don't fit in the data page. This will also happen when KPTI is active,
since kernel_ventry uses TPIDRRO_EL0 as a scratch register in that case;
add a comment to the code explaining this.

Accessors for the data are provided in the header since they will be needed
in multiple files and it seems neater to keep things together.
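
The bounds check in vdso_cpu_offset() can be tried out in isolation.
The sketch below mirrors that arithmetic in plain userspace C. Here
header_size stands in for offsetof(struct arm64_vdso_data, cpu_data),
and both cpu_entry_offset() and the sizes used with it are illustrative
assumptions rather than the real kernel values.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as struct vdso_cpu_data in this patch. */
struct cpu_entry {
	unsigned int cpu;
	unsigned int node;
};

/*
 * Userspace model of vdso_cpu_offset(). header_size is a stand-in for
 * offsetof(struct arm64_vdso_data, cpu_data) and must be non-zero so
 * that a return value of 0 can only mean "does not fit in the page".
 */
static size_t cpu_entry_offset(size_t header_size, unsigned int cpu,
			       size_t page_size)
{
	size_t offset, data_end;

	assert(header_size != 0);

	offset = header_size + (size_t)cpu * sizeof(struct cpu_entry);
	data_end = offset + sizeof(struct cpu_entry) + 1; /* as in the patch */

	/* Only a single data page is modelled, as in the patch. */
	if (data_end > page_size)
		return 0;

	return offset;
}
```

With a made-up 512 byte header and 4K pages, CPU 446 gets offset 4080
while CPU 447 gets 0, which is the "no per-CPU data" case.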

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/processor.h     | 12 +-----
 arch/arm64/include/asm/vdso/datapage.h | 54 ++++++++++++++++++++++++++
 arch/arm64/kernel/process.c            | 26 ++++++++++++-
 arch/arm64/kernel/vdso.c               |  5 ++-
 4 files changed, 83 insertions(+), 14 deletions(-)
 create mode 100644 arch/arm64/include/asm/vdso/datapage.h

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 240fe5e5b720..db7a804030b3 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -207,17 +207,7 @@ static inline void set_compat_ssbs_bit(struct pt_regs *regs)
 	regs->pstate |= PSR_AA32_SSBS_BIT;
 }
 
-static inline void start_thread(struct pt_regs *regs, unsigned long pc,
-				unsigned long sp)
-{
-	start_thread_common(regs, pc);
-	regs->pstate = PSR_MODE_EL0t;
-
-	if (arm64_get_ssbd_state() != ARM64_SSBD_FORCE_ENABLE)
-		set_ssbs_bit(regs);
-
-	regs->sp = sp;
-}
+void start_thread(struct pt_regs *regs, unsigned long pc, unsigned long sp);
 
 static inline bool is_ttbr0_addr(unsigned long addr)
 {
diff --git a/arch/arm64/include/asm/vdso/datapage.h b/arch/arm64/include/asm/vdso/datapage.h
new file mode 100644
index 000000000000..e88d97238c52
--- /dev/null
+++ b/arch/arm64/include/asm/vdso/datapage.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020 ARM Limited
+ */
+#ifndef __ASM_VDSO_DATAPAGE_H
+#define __ASM_VDSO_DATAPAGE_H
+
+#include <vdso/datapage.h>
+
+struct vdso_cpu_data {
+	unsigned int cpu;
+	unsigned int node;
+};
+
+struct arm64_vdso_data {
+	/* Must be first in struct, we cast to vdso_data */
+	struct vdso_data data[CS_BASES];
+	struct vdso_cpu_data cpu_data[];
+};
+
+#ifdef __VDSO__
+static inline struct vdso_cpu_data *__vdso_cpu_data(void)
+{
+	unsigned long offset;
+
+	asm volatile(
+		"mrs %0, tpidrro_el0\n"
+	: "=r" (offset)
+	:
+	: "cc");
+
+	if (offset)
+		return (void *)(_vdso_data) + offset;
+
+	return NULL;
+}
+#else
+static inline size_t vdso_cpu_offset(void)
+{
+	size_t offset, data_end;
+
+	offset = offsetof(struct arm64_vdso_data, cpu_data) +
+		smp_processor_id() * sizeof(struct vdso_cpu_data);
+	data_end = offset + sizeof(struct vdso_cpu_data) + 1;
+
+	/* We only map a single page for vDSO data currently */
+	if (data_end > PAGE_SIZE)
+		return 0;
+
+	return offset;
+}
+#endif
+
+#endif
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 84ec630b8ab5..89b400f9397d 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -55,6 +55,7 @@
 #include <asm/processor.h>
 #include <asm/pointer_auth.h>
 #include <asm/stacktrace.h>
+#include <asm/vdso/datapage.h>
 
 #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
 #include <linux/stackprotector.h>
@@ -309,6 +310,28 @@ void show_regs(struct pt_regs * regs)
 	dump_backtrace(regs, NULL, KERN_DEFAULT);
 }
 
+void start_thread(struct pt_regs *regs, unsigned long pc, unsigned long sp)
+{
+	start_thread_common(regs, pc);
+	regs->pstate = PSR_MODE_EL0t;
+
+	if (arm64_get_ssbd_state() != ARM64_SSBD_FORCE_ENABLE)
+		set_ssbs_bit(regs);
+
+	regs->sp = sp;
+
+	/*
+	 * Store the vDSO per-CPU offset if supported. Disable
+	 * preemption to make sure we read the CPU offset on the CPU
+	 * we write it on.
+	 */
+	if (!arm64_kernel_unmapped_at_el0()) {
+		preempt_disable();
+		write_sysreg(vdso_cpu_offset(), tpidrro_el0);
+		preempt_enable();
+	}
+}
+
 static void tls_thread_flush(void)
 {
 	write_sysreg(0, tpidr_el0);
@@ -452,7 +475,8 @@ static void tls_thread_switch(struct task_struct *next)
 	if (is_compat_thread(task_thread_info(next)))
 		write_sysreg(next->thread.uw.tp_value, tpidrro_el0);
 	else if (!arm64_kernel_unmapped_at_el0())
-		write_sysreg(0, tpidrro_el0);
+		/* Used as scratch in KPTI trampoline so don't set here. */
+		write_sysreg(vdso_cpu_offset(), tpidrro_el0);
 
 	write_sysreg(*task_user_tls(next), tpidr_el0);
 }
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index d4202a32abc9..2a8d7ab76bee 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -28,6 +28,7 @@
 #include <asm/cacheflush.h>
 #include <asm/signal32.h>
 #include <asm/vdso.h>
+#include <asm/vdso/datapage.h>
 
 extern char vdso_start[], vdso_end[];
 #ifdef CONFIG_COMPAT_VDSO
@@ -77,10 +78,10 @@ static struct vdso_abi_info vdso_info[] __ro_after_init = {
  * The vDSO data page.
  */
 static union {
-	struct vdso_data	data[CS_BASES];
+	struct arm64_vdso_data	data;
 	u8			page[PAGE_SIZE];
 } vdso_data_store __page_aligned_data;
-struct vdso_data *vdso_data = vdso_data_store.data;
+struct vdso_data *vdso_data = vdso_data_store.data.data;
 
 static int __vdso_remap(enum vdso_abi abi,
 			const struct vm_special_mapping *sm,
-- 
2.20.1



* [PATCH v3 3/5] arm64: vdso: Initialise the per-CPU vDSO data
  2020-08-19 12:13 [PATCH v3 0/5] arm64: vdso: getcpu() support Mark Brown
  2020-08-19 12:13 ` [PATCH v3 1/5] arm64: vdso: Provide a define when building the vDSO Mark Brown
  2020-08-19 12:13 ` [PATCH v3 2/5] arm64: vdso: Add per-CPU data Mark Brown
@ 2020-08-19 12:13 ` Mark Brown
  2020-08-19 12:13 ` [PATCH v3 4/5] arm64: vdso: Add getcpu() implementation Mark Brown
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2020-08-19 12:13 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Brown, Vincenzo Frascino, Shuah Khan, linux-kselftest,
	linux-arm-kernel

Register with the CPU hotplug system to initialise the per-CPU data for
getcpu().

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kernel/vdso.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index 2a8d7ab76bee..d9743c659341 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -9,6 +9,7 @@
 
 #include <linux/cache.h>
 #include <linux/clocksource.h>
+#include <linux/cpuhotplug.h>
 #include <linux/elf.h>
 #include <linux/err.h>
 #include <linux/errno.h>
@@ -18,6 +19,7 @@
 #include <linux/sched.h>
 #include <linux/signal.h>
 #include <linux/slab.h>
+#include <linux/smp.h>
 #include <linux/time_namespace.h>
 #include <linux/timekeeper_internal.h>
 #include <linux/vmalloc.h>
@@ -466,6 +468,26 @@ int aarch32_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 }
 #endif /* CONFIG_COMPAT */
 
+static void vdso_cpu_init(void *p)
+{
+	struct arm64_vdso_data *data = (struct arm64_vdso_data *)vdso_data;
+	unsigned int cpu;
+
+	if (vdso_cpu_offset()) {
+		cpu = smp_processor_id();
+
+		data->cpu_data[cpu].cpu = cpu;
+		data->cpu_data[cpu].node = cpu_to_node(cpu);
+	}
+}
+
+static int vdso_cpu_online(unsigned int cpu)
+{
+	smp_call_function_single(cpu, vdso_cpu_init, NULL, 1);
+
+	return 0;
+}
+
 static int vdso_mremap(const struct vm_special_mapping *sm,
 		struct vm_area_struct *new_vma)
 {
@@ -494,6 +516,12 @@ static int __init vdso_init(void)
 	vdso_info[VDSO_ABI_AA64].dm = &aarch64_vdso_maps[AA64_MAP_VVAR];
 	vdso_info[VDSO_ABI_AA64].cm = &aarch64_vdso_maps[AA64_MAP_VDSO];
 
+	/*
+	 * Initialize per-CPU data, callback runs for all current and
+	 * future CPUs.
+	 */
+	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "vdso", vdso_cpu_online, NULL);
+
 	return __vdso_init(VDSO_ABI_AA64);
 }
 arch_initcall(vdso_init);
-- 
2.20.1



* [PATCH v3 4/5] arm64: vdso: Add getcpu() implementation
  2020-08-19 12:13 [PATCH v3 0/5] arm64: vdso: getcpu() support Mark Brown
                   ` (2 preceding siblings ...)
  2020-08-19 12:13 ` [PATCH v3 3/5] arm64: vdso: Initialise the per-CPU vDSO data Mark Brown
@ 2020-08-19 12:13 ` Mark Brown
  2020-08-19 12:13 ` [PATCH v3 5/5] selftests: vdso: Support arm64 in getcpu() test Mark Brown
  2020-08-31 21:47 ` [PATCH v3 0/5] arm64: vdso: getcpu() support Shuah Khan
  5 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2020-08-19 12:13 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Brown, Vincenzo Frascino, Shuah Khan, linux-kselftest,
	linux-arm-kernel

Some applications, especially tracing ones, benefit from avoiding the syscall
overhead on getcpu() calls, so provide a vDSO implementation of it.
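
The control flow of __kernel_getcpu() can be modelled in userspace as
follows. This is a minimal sketch under stated assumptions: the per-CPU
entry is passed in as a parameter rather than being located via
TPIDRRO_EL0, and getcpu_model() and struct cpu_entry are hypothetical
names used only for illustration.

```c
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Same fields as struct vdso_cpu_data in this series. */
struct cpu_entry {
	unsigned int cpu;
	unsigned int node;
};

/*
 * Model of the fast path and fallback: use the per-CPU entry when one
 * is available, otherwise make the real getcpu syscall.
 */
static int getcpu_model(const struct cpu_entry *entry, unsigned int *cpu,
			unsigned int *node)
{
	if (entry) {
		if (cpu)
			*cpu = entry->cpu;
		if (node)
			*node = entry->node;

		return 0;
	}

	/* No per-CPU data (offset was zero): take the slow path. */
	return (int)syscall(SYS_getcpu, cpu, node, NULL);
}
```

Passing NULL for entry corresponds to the case where TPIDRRO_EL0 reads
as zero, either because the CPU does not fit in the data page or
because KPTI is in use.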

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kernel/vdso/Makefile   |  2 +-
 arch/arm64/kernel/vdso/vdso.lds.S |  1 +
 arch/arm64/kernel/vdso/vgetcpu.c  | 48 +++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/vdso/vgetcpu.c

diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
index 88cf0f0b91ed..ff350e69b8b6 100644
--- a/arch/arm64/kernel/vdso/Makefile
+++ b/arch/arm64/kernel/vdso/Makefile
@@ -11,7 +11,7 @@
 ARCH_REL_TYPE_ABS := R_AARCH64_JUMP_SLOT|R_AARCH64_GLOB_DAT|R_AARCH64_ABS64
 include $(srctree)/lib/vdso/Makefile
 
-obj-vdso := vgettimeofday.o note.o sigreturn.o
+obj-vdso := vgettimeofday.o note.o sigreturn.o vgetcpu.o
 
 # Build rules
 targets := $(obj-vdso) vdso.so vdso.so.dbg
diff --git a/arch/arm64/kernel/vdso/vdso.lds.S b/arch/arm64/kernel/vdso/vdso.lds.S
index d808ad31e01f..ef3fb80e0349 100644
--- a/arch/arm64/kernel/vdso/vdso.lds.S
+++ b/arch/arm64/kernel/vdso/vdso.lds.S
@@ -80,6 +80,7 @@ VERSION
 		__kernel_gettimeofday;
 		__kernel_clock_gettime;
 		__kernel_clock_getres;
+		__kernel_getcpu;
 	local: *;
 	};
 }
diff --git a/arch/arm64/kernel/vdso/vgetcpu.c b/arch/arm64/kernel/vdso/vgetcpu.c
new file mode 100644
index 000000000000..e8972e561e08
--- /dev/null
+++ b/arch/arm64/kernel/vdso/vgetcpu.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM64 userspace implementations of getcpu()
+ *
+ * Copyright (C) 2020 ARM Limited
+ *
+ */
+
+#include <asm/unistd.h>
+#include <asm/vdso/datapage.h>
+
+struct getcpucache;
+
+static __always_inline
+int getcpu_fallback(unsigned int *_cpu, unsigned int *_node,
+		    struct getcpucache *_c)
+{
+	register unsigned int *cpu asm("x0") = _cpu;
+	register unsigned int *node asm("x1") = _node;
+	register struct getcpucache *c asm("x2") = _c;
+	register long ret asm ("x0");
+	register long nr asm("x8") = __NR_getcpu;
+
+	asm volatile(
+	"       svc #0\n"
+	: "=r" (ret)
+	: "r" (cpu), "r" (node), "r" (c), "r" (nr)
+	: "memory");
+
+	return ret;
+}
+
+int __kernel_getcpu(unsigned int *cpu, unsigned int *node,
+		    struct getcpucache *c)
+{
+	struct vdso_cpu_data *cpu_data = __vdso_cpu_data();
+
+	if (cpu_data) {
+		if (cpu)
+			*cpu = cpu_data->cpu;
+		if (node)
+			*node = cpu_data->node;
+
+		return 0;
+	}
+
+	return getcpu_fallback(cpu, node, c);
+}
-- 
2.20.1



* [PATCH v3 5/5] selftests: vdso: Support arm64 in getcpu() test
  2020-08-19 12:13 [PATCH v3 0/5] arm64: vdso: getcpu() support Mark Brown
                   ` (3 preceding siblings ...)
  2020-08-19 12:13 ` [PATCH v3 4/5] arm64: vdso: Add getcpu() implementation Mark Brown
@ 2020-08-19 12:13 ` Mark Brown
  2020-08-31 21:47 ` [PATCH v3 0/5] arm64: vdso: getcpu() support Shuah Khan
  5 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2020-08-19 12:13 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: Mark Brown, Vincenzo Frascino, Shuah Khan, linux-kselftest,
	linux-arm-kernel

arm64 exports the vDSO ABI with a version of LINUX_2.6.39 and symbols
prefixed with __kernel rather than __vdso. Update the getcpu() test to
handle this.
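
The per-architecture selection amounts to something like the following.
This is a sketch only: struct vdso_symbol and getcpu_symbol() are
hypothetical names and not anything in the selftest, which keeps the
two strings in plain globals.

```c
/* Version and symbol name to look up in the vDSO. */
struct vdso_symbol {
	const char *version;
	const char *name;
};

/* Pick the getcpu() export for the architecture being built. */
static struct vdso_symbol getcpu_symbol(void)
{
#if defined(__aarch64__)
	/* arm64 uses the __kernel prefix and a LINUX_2.6.39 version. */
	return (struct vdso_symbol){ "LINUX_2.6.39", "__kernel_getcpu" };
#else
	return (struct vdso_symbol){ "LINUX_2.6", "__vdso_getcpu" };
#endif
}
```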

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 tools/testing/selftests/vDSO/vdso_test_getcpu.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/testing/selftests/vDSO/vdso_test_getcpu.c b/tools/testing/selftests/vDSO/vdso_test_getcpu.c
index fc25ede131b8..4aeb65012b81 100644
--- a/tools/testing/selftests/vDSO/vdso_test_getcpu.c
+++ b/tools/testing/selftests/vDSO/vdso_test_getcpu.c
@@ -14,8 +14,18 @@
 #include "../kselftest.h"
 #include "parse_vdso.h"
 
+/*
+ * ARM64's vDSO exports its getcpu() implementation with a different
+ * name and version from other architectures, so we need to handle it
+ * as a special case.
+ */
+#if defined(__aarch64__)
+const char *version = "LINUX_2.6.39";
+const char *name = "__kernel_getcpu";
+#else
 const char *version = "LINUX_2.6";
 const char *name = "__vdso_getcpu";
+#endif
 
 struct getcpu_cache;
 typedef long (*getcpu_t)(unsigned int *, unsigned int *,
-- 
2.20.1



* Re: [PATCH v3 0/5] arm64: vdso: getcpu() support
  2020-08-19 12:13 [PATCH v3 0/5] arm64: vdso: getcpu() support Mark Brown
                   ` (4 preceding siblings ...)
  2020-08-19 12:13 ` [PATCH v3 5/5] selftests: vdso: Support arm64 in getcpu() test Mark Brown
@ 2020-08-31 21:47 ` Shuah Khan
  2020-09-01  9:25   ` Catalin Marinas
  5 siblings, 1 reply; 9+ messages in thread
From: Shuah Khan @ 2020-08-31 21:47 UTC (permalink / raw)
  To: Mark Brown, Catalin Marinas, Will Deacon
  Cc: Vincenzo Frascino, Shuah Khan, linux-kselftest, linux-arm-kernel,
	skh >> Shuah Khan

On 8/19/20 6:13 AM, Mark Brown wrote:
> Some applications, especially tracing ones, benefit from avoiding the
> syscall overhead for getcpu() so it is common for architectures to have
> vDSO implementations. Add one for arm64, using TPIDRRO_EL0 to pass a
> pointer to per-CPU data rather than just store the immediate value in
> order to allow for future extensibility.
> 
> It is questionable if something TPIDRRO_EL0 based is worthwhile at all
> on current kernels, since v4.18 we have had support for restartable
> sequences which can be used to provide a sched_getcpu() implementation
> with generally better performance than the vDSO approach on
> architectures which have that[1]. Work is ongoing to implement this for
> glibc:
> 
>      https://lore.kernel.org/lkml/20200527185130.5604-3-mathieu.desnoyers@efficios.com/
> 
> but is not yet merged and will need similar work for other userspaces.
> The main advantages for the vDSO implementation are the node parameter
> (though this is a static mapping to CPU number so could be looked up
> separately when processing data if it's needed, it shouldn't need to be
> in the hot path) and ease of implementation for users.
> 
> This is currently not compatible with KPTI due to the use of TPIDRRO_EL0
> by the KPTI trampoline, this could be addressed by reinitializing that
> system register in the return path but I have found it hard to justify
> adding that overhead for all users for something that is essentially a
> profiling optimization which is likely to get superseded by a more
> modern implementation - if there are other uses for the per-CPU data
> then the balance might change here.
> 
> This builds on work done by Kristina Martsenko some time ago but is a
> new implementation.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d7822b1e24f2df5df98c76f0e94a5416349ff759
> 
> v3:
>   - Rebase on v5.9-rc1.
>   - Drop in progress portions of the series.
> v2:
>   - Rebase on v5.8-rc3.
>   - Add further cleanup patches & a first draft of multi-page support.
> 
> Mark Brown (5):
>    arm64: vdso: Provide a define when building the vDSO
>    arm64: vdso: Add per-CPU data
>    arm64: vdso: Initialise the per-CPU vDSO data
>    arm64: vdso: Add getcpu() implementation
>    selftests: vdso: Support arm64 in getcpu() test
> 
>   arch/arm64/include/asm/processor.h            | 12 +----
>   arch/arm64/include/asm/vdso/datapage.h        | 54 +++++++++++++++++++
>   arch/arm64/kernel/process.c                   | 26 ++++++++-
>   arch/arm64/kernel/vdso.c                      | 33 +++++++++++-
>   arch/arm64/kernel/vdso/Makefile               |  4 +-
>   arch/arm64/kernel/vdso/vdso.lds.S             |  1 +
>   arch/arm64/kernel/vdso/vgetcpu.c              | 48 +++++++++++++++++
>   .../testing/selftests/vDSO/vdso_test_getcpu.c | 10 ++++
>   8 files changed, 172 insertions(+), 16 deletions(-)
>   create mode 100644 arch/arm64/include/asm/vdso/datapage.h
>   create mode 100644 arch/arm64/kernel/vdso/vgetcpu.c
> 

Patches look good to me from selftests perspective. Here is my Acked-by
for these patches to go through arm64.

Acked-by: Shuah Khan <skhan@linuxfoundation.org>

If you would like me to take these through kselftest tree, give
me your Acks. I can queue these up for 5.10-rc1

thanks,
-- Shuah


* Re: [PATCH v3 0/5] arm64: vdso: getcpu() support
  2020-08-31 21:47 ` [PATCH v3 0/5] arm64: vdso: getcpu() support Shuah Khan
@ 2020-09-01  9:25   ` Catalin Marinas
  2020-09-01 10:46     ` Mark Brown
  0 siblings, 1 reply; 9+ messages in thread
From: Catalin Marinas @ 2020-09-01  9:25 UTC (permalink / raw)
  To: Shuah Khan
  Cc: Will Deacon, Mark Brown, linux-kselftest, Vincenzo Frascino,
	Shuah Khan, linux-arm-kernel

On Mon, Aug 31, 2020 at 03:47:17PM -0600, Shuah Khan wrote:
> On 8/19/20 6:13 AM, Mark Brown wrote:
> > Some applications, especially tracing ones, benefit from avoiding the
> > syscall overhead for getcpu() so it is common for architectures to have
> > vDSO implementations. Add one for arm64, using TPIDRRO_EL0 to pass a
> > pointer to per-CPU data rather than just store the immediate value in
> > order to allow for future extensibility.
> > 
> > It is questionable if something TPIDRRO_EL0 based is worthwhile at all
> > on current kernels, since v4.18 we have had support for restartable
> > sequences which can be used to provide a sched_getcpu() implementation
> > with generally better performance than the vDSO approach on
> > architectures which have that[1]. Work is ongoing to implement this for
> > glibc:
> > 
> >      https://lore.kernel.org/lkml/20200527185130.5604-3-mathieu.desnoyers@efficios.com/
> > 
> > but is not yet merged and will need similar work for other userspaces.
> > The main advantages for the vDSO implementation are the node parameter
> > (though this is a static mapping to CPU number so could be looked up
> > separately when processing data if it's needed, it shouldn't need to be
> > in the hot path) and ease of implementation for users.
> > 
> > This is currently not compatible with KPTI due to the use of TPIDRRO_EL0
> > by the KPTI trampoline, this could be addressed by reinitializing that
> > system register in the return path but I have found it hard to justify
> > adding that overhead for all users for something that is essentially a
> > profiling optimization which is likely to get superseded by a more
> > modern implementation - if there are other uses for the per-CPU data
> > then the balance might change here.
> > 
> > This builds on work done by Kristina Martsenko some time ago but is a
> > new implementation.
> > 
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d7822b1e24f2df5df98c76f0e94a5416349ff759
> > 
> > v3:
> >   - Rebase on v5.9-rc1.
> >   - Drop in progress portions of the series.
> > v2:
> >   - Rebase on v5.8-rc3.
> >   - Add further cleanup patches & a first draft of multi-page support.
> > 
> > Mark Brown (5):
> >    arm64: vdso: Provide a define when building the vDSO
> >    arm64: vdso: Add per-CPU data
> >    arm64: vdso: Initialise the per-CPU vDSO data
> >    arm64: vdso: Add getcpu() implementation
> >    selftests: vdso: Support arm64 in getcpu() test
> > 
> >   arch/arm64/include/asm/processor.h            | 12 +----
> >   arch/arm64/include/asm/vdso/datapage.h        | 54 +++++++++++++++++++
> >   arch/arm64/kernel/process.c                   | 26 ++++++++-
> >   arch/arm64/kernel/vdso.c                      | 33 +++++++++++-
> >   arch/arm64/kernel/vdso/Makefile               |  4 +-
> >   arch/arm64/kernel/vdso/vdso.lds.S             |  1 +
> >   arch/arm64/kernel/vdso/vgetcpu.c              | 48 +++++++++++++++++
> >   .../testing/selftests/vDSO/vdso_test_getcpu.c | 10 ++++
> >   8 files changed, 172 insertions(+), 16 deletions(-)
> >   create mode 100644 arch/arm64/include/asm/vdso/datapage.h
> >   create mode 100644 arch/arm64/kernel/vdso/vgetcpu.c
> > 
> 
> Patches look good to me from selftests perspective. Here is my Acked-by
> for these patches to go through arm64.
> 
> Acked-by: Shuah Khan <skhan@linuxfoundation.org>
> 
> If you would like me to take these through kselftest tree, give
> me your Acks. I can queue these up for 5.10-rc1

Thanks Shuah for the ack. We are still pondering whether to merge these
patches as they have some limitations (the per-CPU data structures may
not fit in the sole data vDSO page).

-- 
Catalin


* Re: [PATCH v3 0/5] arm64: vdso: getcpu() support
  2020-09-01  9:25   ` Catalin Marinas
@ 2020-09-01 10:46     ` Mark Brown
  0 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2020-09-01 10:46 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, linux-kselftest, Shuah Khan, Vincenzo Frascino,
	Shuah Khan, linux-arm-kernel


On Tue, Sep 01, 2020 at 10:25:52AM +0100, Catalin Marinas wrote:

> Thanks Shuah for the ack. We are still pondering whether to merge these
> patches as they have some limitations (the per-CPU data structures may
> not fit in the sole data vDSO page).

They definitely don't fit. I did have some half-written proof of concept
patches that I posted that extend this, but I was waiting to see if there
was any interest in a vDSO getcpu() at all before taking it further.
Vincenzo's work on doing the multipage user data that he announced at
Plumbers would cover it as well; I hadn't been aware of that.

