linux-api.vger.kernel.org archive mirror
* [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions
@ 2022-09-21 19:24 Mathieu Desnoyers
  2022-09-21 19:24 ` [RFC PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries Mathieu Desnoyers
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Mathieu Desnoyers @ 2022-09-21 19:24 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, David.Laight, carlos, Peter Oskolkov,
	Alexander Mikhalitsyn, Mathieu Desnoyers

Extend the rseq ABI to expose a NUMA node ID and a vm_vcpu_id field.

The NUMA node ID field allows implementing a faster getcpu(2) in libc.

The virtual cpu id allows ideal scaling (down or up) of user-space
per-cpu data structures. The virtual cpu ids allocated within a memory
space are tracked by the scheduler, which takes into account the number
of concurrently running threads, thus implicitly considering the number
of threads, the cpu affinity, the cpusets applying to those threads, and
the number of logical cores on the system.

This series is based on the v5.19 tag.

Thanks,

Mathieu

Mathieu Desnoyers (25):
  rseq: Introduce feature size and alignment ELF auxiliary vector
    entries
  rseq: Introduce extensible rseq ABI
  rseq: Extend struct rseq with numa node id
  selftests/rseq: Use ELF auxiliary vector for extensible rseq
  selftests/rseq: Implement rseq numa node id field selftest
  lib: Invert _find_next_bit source arguments
  lib: Implement find_{first,next}_{zero,one}_and_zero_bit
  cpumask: Implement cpumask_{first,next}_{zero,one}_and_zero
  sched: Introduce per memory space current virtual cpu id
  rseq: Extend struct rseq with per memory space vcpu id
  selftests/rseq: Remove RSEQ_SKIP_FASTPATH code
  selftests/rseq: Implement rseq vm_vcpu_id field support
  selftests/rseq: x86: Template memory ordering and percpu access mode
  selftests/rseq: arm: Template memory ordering and percpu access mode
  selftests/rseq: arm64: Template memory ordering and percpu access mode
  selftests/rseq: mips: Template memory ordering and percpu access mode
  selftests/rseq: ppc: Template memory ordering and percpu access mode
  selftests/rseq: s390: Template memory ordering and percpu access mode
  selftests/rseq: riscv: Template memory ordering and percpu access mode
  selftests/rseq: Implement basic percpu ops vm_vcpu_id test
  selftests/rseq: Implement parametrized vm_vcpu_id test
  selftests/rseq: x86: Implement rseq_load_u32_u32
  selftests/rseq: Implement numa node id vs vm_vcpu_id invariant test
  selftests/rseq: parametrized test: Report/abort on negative cpu id
  tracing/rseq: Add mm_vcpu_id field to rseq_update

 fs/binfmt_elf.c                               |    5 +
 fs/exec.c                                     |    6 +
 include/linux/cpumask.h                       |   86 ++
 include/linux/find.h                          |  123 +-
 include/linux/mm.h                            |   25 +
 include/linux/mm_types.h                      |  110 +-
 include/linux/sched.h                         |    9 +
 include/trace/events/rseq.h                   |    7 +-
 include/uapi/linux/auxvec.h                   |    2 +
 include/uapi/linux/rseq.h                     |   22 +
 init/Kconfig                                  |    4 +
 kernel/fork.c                                 |   11 +-
 kernel/ptrace.c                               |    2 +-
 kernel/rseq.c                                 |   61 +-
 kernel/sched/core.c                           |   49 +
 kernel/sched/sched.h                          |  166 +++
 kernel/signal.c                               |    2 +
 lib/find_bit.c                                |   17 +-
 tools/include/linux/find.h                    |    9 +-
 tools/lib/find_bit.c                          |   17 +-
 tools/testing/selftests/rseq/.gitignore       |    5 +
 tools/testing/selftests/rseq/Makefile         |   20 +-
 .../testing/selftests/rseq/basic_numa_test.c  |  117 ++
 .../selftests/rseq/basic_percpu_ops_test.c    |   46 +-
 tools/testing/selftests/rseq/basic_test.c     |    4 +
 tools/testing/selftests/rseq/compiler.h       |    6 +
 tools/testing/selftests/rseq/param_test.c     |  157 ++-
 tools/testing/selftests/rseq/rseq-abi.h       |   22 +
 tools/testing/selftests/rseq/rseq-arm-bits.h  |  505 +++++++
 tools/testing/selftests/rseq/rseq-arm.h       |  701 +---------
 .../testing/selftests/rseq/rseq-arm64-bits.h  |  392 ++++++
 tools/testing/selftests/rseq/rseq-arm64.h     |  520 +------
 .../testing/selftests/rseq/rseq-bits-reset.h  |   10 +
 .../selftests/rseq/rseq-bits-template.h       |   39 +
 tools/testing/selftests/rseq/rseq-mips-bits.h |  462 +++++++
 tools/testing/selftests/rseq/rseq-mips.h      |  646 +--------
 tools/testing/selftests/rseq/rseq-ppc-bits.h  |  454 +++++++
 tools/testing/selftests/rseq/rseq-ppc.h       |  617 +--------
 .../testing/selftests/rseq/rseq-riscv-bits.h  |  410 ++++++
 tools/testing/selftests/rseq/rseq-riscv.h     |  529 +-------
 tools/testing/selftests/rseq/rseq-s390-bits.h |  474 +++++++
 tools/testing/selftests/rseq/rseq-s390.h      |  495 +------
 tools/testing/selftests/rseq/rseq-skip.h      |   65 -
 tools/testing/selftests/rseq/rseq-x86-bits.h  | 1036 ++++++++++++++
 tools/testing/selftests/rseq/rseq-x86.h       | 1193 +----------------
 tools/testing/selftests/rseq/rseq.c           |   86 +-
 tools/testing/selftests/rseq/rseq.h           |  229 +++-
 .../testing/selftests/rseq/run_param_test.sh  |    5 +
 48 files changed, 5286 insertions(+), 4692 deletions(-)
 create mode 100644 tools/testing/selftests/rseq/basic_numa_test.c
 create mode 100644 tools/testing/selftests/rseq/rseq-arm-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-arm64-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-bits-reset.h
 create mode 100644 tools/testing/selftests/rseq/rseq-bits-template.h
 create mode 100644 tools/testing/selftests/rseq/rseq-mips-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-ppc-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-riscv-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-s390-bits.h
 delete mode 100644 tools/testing/selftests/rseq/rseq-skip.h
 create mode 100644 tools/testing/selftests/rseq/rseq-x86-bits.h

-- 
2.25.1



* [RFC PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries
  2022-09-21 19:24 [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
@ 2022-09-21 19:24 ` Mathieu Desnoyers
  2022-09-21 19:24 ` [RFC PATCH v4 02/25] rseq: Introduce extensible rseq ABI Mathieu Desnoyers
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Mathieu Desnoyers @ 2022-09-21 19:24 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, David.Laight, carlos, Peter Oskolkov,
	Alexander Mikhalitsyn, Mathieu Desnoyers

Export the rseq feature size supported by the kernel as well as the
required allocation alignment for the rseq per-thread area to user-space
through ELF auxiliary vector entries.
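
As a rough illustration of how user-space might consume these entries, here is
a minimal sketch. It assumes the AT_* values proposed in this patch (27 and 28)
and falls back to the original 20-byte feature size and 32-byte alignment when
the kernel does not export them. The struct and function names
(rseq_auxv_info, query_rseq_auxv) and the SKETCH_* macros are hypothetical and
not part of the series.

```c
#include <assert.h>
#include <sys/auxv.h>

/* Values proposed by this patch; define them if the libc headers lack them. */
#ifndef AT_RSEQ_FEATURE_SIZE
#define AT_RSEQ_FEATURE_SIZE	27
#endif
#ifndef AT_RSEQ_ALIGN
#define AT_RSEQ_ALIGN		28
#endif

/* Assumed fallbacks for kernels without the new entries (hypothetical names). */
#define SKETCH_ORIG_FEATURE_SIZE	20
#define SKETCH_ORIG_ALIGN		32

struct rseq_auxv_info {
	unsigned long feature_size;	/* bytes of struct rseq the kernel fills in */
	unsigned long align;		/* required alignment of the rseq area */
};

static struct rseq_auxv_info query_rseq_auxv(void)
{
	struct rseq_auxv_info info;

	/* getauxval() returns 0 when the entry is absent from the vector. */
	info.feature_size = getauxval(AT_RSEQ_FEATURE_SIZE);
	if (!info.feature_size)
		info.feature_size = SKETCH_ORIG_FEATURE_SIZE;
	info.align = getauxval(AT_RSEQ_ALIGN);
	if (!info.align)
		info.align = SKETCH_ORIG_ALIGN;
	return info;
}
```

A caller would allocate the per-thread area with at least info.align alignment
and a size large enough to hold info.feature_size bytes.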

This is part of the extensible rseq ABI.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 fs/binfmt_elf.c             | 5 +++++
 include/uapi/linux/auxvec.h | 2 ++
 include/uapi/linux/rseq.h   | 5 +++++
 3 files changed, 12 insertions(+)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 63c7ebb0da89..04fca1e4cbd2 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -46,6 +46,7 @@
 #include <linux/cred.h>
 #include <linux/dax.h>
 #include <linux/uaccess.h>
+#include <linux/rseq.h>
 #include <asm/param.h>
 #include <asm/page.h>
 
@@ -288,6 +289,10 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
 	if (bprm->have_execfd) {
 		NEW_AUX_ENT(AT_EXECFD, bprm->execfd);
 	}
+#ifdef CONFIG_RSEQ
+	NEW_AUX_ENT(AT_RSEQ_FEATURE_SIZE, offsetof(struct rseq, end));
+	NEW_AUX_ENT(AT_RSEQ_ALIGN, __alignof__(struct rseq));
+#endif
 #undef NEW_AUX_ENT
 	/* AT_NULL is zero; clear the rest too */
 	memset(elf_info, 0, (char *)mm->saved_auxv +
diff --git a/include/uapi/linux/auxvec.h b/include/uapi/linux/auxvec.h
index c7e502bf5a6f..6991c4b8ab18 100644
--- a/include/uapi/linux/auxvec.h
+++ b/include/uapi/linux/auxvec.h
@@ -30,6 +30,8 @@
 				 * differ from AT_PLATFORM. */
 #define AT_RANDOM 25	/* address of 16 random bytes */
 #define AT_HWCAP2 26	/* extension of AT_HWCAP */
+#define AT_RSEQ_FEATURE_SIZE	27	/* rseq supported feature size */
+#define AT_RSEQ_ALIGN		28	/* rseq allocation alignment */
 
 #define AT_EXECFN  31	/* filename of program */
 
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 77ee207623a9..05d3c4cdeb40 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -130,6 +130,11 @@ struct rseq {
 	 *     this thread.
 	 */
 	__u32 flags;
+
+	/*
+	 * Flexible array member at end of structure, after last feature field.
+	 */
+	char end[];
 } __attribute__((aligned(4 * sizeof(__u64))));
 
 #endif /* _UAPI_LINUX_RSEQ_H */
-- 
2.25.1



* [RFC PATCH v4 02/25] rseq: Introduce extensible rseq ABI
  2022-09-21 19:24 [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
  2022-09-21 19:24 ` [RFC PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries Mathieu Desnoyers
@ 2022-09-21 19:24 ` Mathieu Desnoyers
  2022-09-21 19:24 ` [RFC PATCH v4 03/25] rseq: Extend struct rseq with numa node id Mathieu Desnoyers
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Mathieu Desnoyers @ 2022-09-21 19:24 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, David.Laight, carlos, Peter Oskolkov,
	Alexander Mikhalitsyn, Mathieu Desnoyers

Introduce the extensible rseq ABI, where the feature size supported by
the kernel and the required alignment are communicated to user-space
through ELF auxiliary vectors.

This allows user-space to call rseq registration with a rseq_len of
either 32 bytes for the original struct rseq size (which includes
padding), or larger.

If rseq_len is larger than 32 bytes, then it must be large enough to
contain the feature size communicated to user-space through ELF
auxiliary vectors.
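
The length rule above can be sketched as a stand-alone predicate. This is only
an illustration of the check as described here: feature_size stands in for the
kernel's offsetof(struct rseq, end), and the names SKETCH_ORIG_RSEQ_SIZE and
rseq_len_is_valid are hypothetical. The pointer alignment check is not covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original struct rseq size, padding included. */
#define SKETCH_ORIG_RSEQ_SIZE	32

/*
 * Return true if rseq_len would pass the length part of registration:
 * either exactly the original size, or at least as large as the
 * feature size advertised by the kernel.
 */
static bool rseq_len_is_valid(uint32_t rseq_len, uint32_t feature_size)
{
	if (rseq_len < SKETCH_ORIG_RSEQ_SIZE)
		return false;
	if (rseq_len != SKETCH_ORIG_RSEQ_SIZE && rseq_len < feature_size)
		return false;
	return true;
}
```

With a feature size of 40, for example, lengths 32 and 40 are accepted while 36
is rejected.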

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 include/linux/sched.h |  4 ++++
 kernel/ptrace.c       |  2 +-
 kernel/rseq.c         | 33 +++++++++++++++++++++++++++------
 3 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index c46f3a63b758..6a80ce113d0e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1291,6 +1291,7 @@ struct task_struct {
 
 #ifdef CONFIG_RSEQ
 	struct rseq __user *rseq;
+	u32 rseq_len;
 	u32 rseq_sig;
 	/*
 	 * RmW on rseq_event_mask must be performed atomically
@@ -2324,10 +2325,12 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags)
 {
 	if (clone_flags & CLONE_VM) {
 		t->rseq = NULL;
+		t->rseq_len = 0;
 		t->rseq_sig = 0;
 		t->rseq_event_mask = 0;
 	} else {
 		t->rseq = current->rseq;
+		t->rseq_len = current->rseq_len;
 		t->rseq_sig = current->rseq_sig;
 		t->rseq_event_mask = current->rseq_event_mask;
 	}
@@ -2336,6 +2339,7 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags)
 static inline void rseq_execve(struct task_struct *t)
 {
 	t->rseq = NULL;
+	t->rseq_len = 0;
 	t->rseq_sig = 0;
 	t->rseq_event_mask = 0;
 }
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 1893d909e45c..90de1ea51088 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -813,7 +813,7 @@ static long ptrace_get_rseq_configuration(struct task_struct *task,
 {
 	struct ptrace_rseq_configuration conf = {
 		.rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
-		.rseq_abi_size = sizeof(*task->rseq),
+		.rseq_abi_size = task->rseq_len,
 		.signature = task->rseq_sig,
 		.flags = 0,
 	};
diff --git a/kernel/rseq.c b/kernel/rseq.c
index 97ac20b4f738..46dc5c2ce2b7 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -18,6 +18,9 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/rseq.h>
 
+/* The original rseq structure size (including padding) is 32 bytes. */
+#define ORIG_RSEQ_SIZE		32
+
 #define RSEQ_CS_PREEMPT_MIGRATE_FLAGS (RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE | \
 				       RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT)
 
@@ -86,10 +89,15 @@ static int rseq_update_cpu_id(struct task_struct *t)
 	u32 cpu_id = raw_smp_processor_id();
 	struct rseq __user *rseq = t->rseq;
 
-	if (!user_write_access_begin(rseq, sizeof(*rseq)))
+	if (!user_write_access_begin(rseq, t->rseq_len))
 		goto efault;
 	unsafe_put_user(cpu_id, &rseq->cpu_id_start, efault_end);
 	unsafe_put_user(cpu_id, &rseq->cpu_id, efault_end);
+	/*
+	 * Additional feature fields added after ORIG_RSEQ_SIZE
+	 * need to be conditionally updated only if
+	 * t->rseq_len != ORIG_RSEQ_SIZE.
+	 */
 	user_write_access_end();
 	trace_rseq_update(t);
 	return 0;
@@ -116,6 +124,11 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
 	 */
 	if (put_user(cpu_id, &t->rseq->cpu_id))
 		return -EFAULT;
+	/*
+	 * Additional feature fields added after ORIG_RSEQ_SIZE
+	 * need to be conditionally reset only if
+	 * t->rseq_len != ORIG_RSEQ_SIZE.
+	 */
 	return 0;
 }
 
@@ -336,7 +349,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		/* Unregister rseq for current thread. */
 		if (current->rseq != rseq || !current->rseq)
 			return -EINVAL;
-		if (rseq_len != sizeof(*rseq))
+		if (rseq_len != current->rseq_len)
 			return -EINVAL;
 		if (current->rseq_sig != sig)
 			return -EPERM;
@@ -345,6 +358,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 			return ret;
 		current->rseq = NULL;
 		current->rseq_sig = 0;
+		current->rseq_len = 0;
 		return 0;
 	}
 
@@ -357,7 +371,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		 * the provided address differs from the prior
 		 * one.
 		 */
-		if (current->rseq != rseq || rseq_len != sizeof(*rseq))
+		if (current->rseq != rseq || rseq_len != current->rseq_len)
 			return -EINVAL;
 		if (current->rseq_sig != sig)
 			return -EPERM;
@@ -366,15 +380,22 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 	}
 
 	/*
-	 * If there was no rseq previously registered,
-	 * ensure the provided rseq is properly aligned and valid.
+	 * If there was no rseq previously registered, ensure the provided rseq
+	 * is properly aligned, as communicated to user-space through the ELF
+	 * auxiliary vector AT_RSEQ_ALIGN.
+	 *
+	 * In order to be valid, rseq_len is either the original rseq size, or
+	 * large enough to contain all supported fields, as communicated to
+	 * user-space through the ELF auxiliary vector AT_RSEQ_FEATURE_SIZE.
 	 */
 	if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
-	    rseq_len != sizeof(*rseq))
+	    rseq_len < ORIG_RSEQ_SIZE ||
+	    (rseq_len != ORIG_RSEQ_SIZE && rseq_len < offsetof(struct rseq, end)))
 		return -EINVAL;
 	if (!access_ok(rseq, rseq_len))
 		return -EFAULT;
 	current->rseq = rseq;
+	current->rseq_len = rseq_len;
 	current->rseq_sig = sig;
 	/*
 	 * If rseq was previously inactive, and has just been
-- 
2.25.1



* [RFC PATCH v4 03/25] rseq: Extend struct rseq with numa node id
  2022-09-21 19:24 [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
  2022-09-21 19:24 ` [RFC PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries Mathieu Desnoyers
  2022-09-21 19:24 ` [RFC PATCH v4 02/25] rseq: Introduce extensible rseq ABI Mathieu Desnoyers
@ 2022-09-21 19:24 ` Mathieu Desnoyers
  2022-09-21 19:24 ` [RFC PATCH v4 04/25] selftests/rseq: Use ELF auxiliary vector for extensible rseq Mathieu Desnoyers
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Mathieu Desnoyers @ 2022-09-21 19:24 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, David.Laight, carlos, Peter Oskolkov,
	Alexander Mikhalitsyn, Mathieu Desnoyers

Adding the NUMA node id to struct rseq is a straightforward thing to do,
and a good way to figure out if anything in the user-space ecosystem
prevents extending struct rseq.

This NUMA node id field allows memory allocators such as tcmalloc to
take advantage of fast access to the current NUMA node id to perform
NUMA-aware memory allocation.

It can also be useful for implementing fast-paths for NUMA-aware
user-space mutexes.

It also allows implementing getcpu(2) purely in user-space.
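
To make the getcpu(2) idea concrete, here is a rough sketch of what such a
user-space fast path could look like. It is not the libc implementation. The
struct rseq_sketch type mirrors the field layout after this patch, and
sketch_getcpu and SKETCH_OFFSETOFEND are hypothetical names. The sketch assumes
the area passed in is already registered and that the caller obtained
feature_size from AT_RSEQ_FEATURE_SIZE. Re-reading cpu_id narrows, but does not
fully close, the window in which a migration could make the two values
disagree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct rseq with the node_id field added. */
struct rseq_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t node_id;
} __attribute__((aligned(32)));

#define SKETCH_OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/*
 * Fill *cpu and *node from the rseq area. Returns -1 when the kernel
 * does not populate node_id, in which case the caller is expected to
 * fall back to the getcpu(2) system call.
 */
static int sketch_getcpu(const volatile struct rseq_sketch *area,
			 size_t feature_size,
			 unsigned int *cpu, unsigned int *node)
{
	uint32_t c, n;

	if (feature_size < SKETCH_OFFSETOFEND(struct rseq_sketch, node_id))
		return -1;
	do {
		c = area->cpu_id;
		n = area->node_id;
	} while (c != area->cpu_id);	/* retry if cpu_id changed meanwhile */
	if (cpu)
		*cpu = c;
	if (node)
		*node = n;
	return 0;
}
```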

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 include/trace/events/rseq.h |  4 +++-
 include/uapi/linux/rseq.h   |  8 ++++++++
 kernel/rseq.c               | 19 +++++++++++++------
 3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/rseq.h b/include/trace/events/rseq.h
index a04a64bc1a00..6bd442697354 100644
--- a/include/trace/events/rseq.h
+++ b/include/trace/events/rseq.h
@@ -16,13 +16,15 @@ TRACE_EVENT(rseq_update,
 
 	TP_STRUCT__entry(
 		__field(s32, cpu_id)
+		__field(s32, node_id)
 	),
 
 	TP_fast_assign(
 		__entry->cpu_id = raw_smp_processor_id();
+		__entry->node_id = cpu_to_node(raw_smp_processor_id());
 	),
 
-	TP_printk("cpu_id=%d", __entry->cpu_id)
+	TP_printk("cpu_id=%d node_id=%d", __entry->cpu_id, __entry->node_id)
 );
 
 TRACE_EVENT(rseq_ip_fixup,
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 05d3c4cdeb40..1cb90a435c5c 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -131,6 +131,14 @@ struct rseq {
 	 */
 	__u32 flags;
 
+	/*
+	 * Restartable sequences node_id field. Updated by the kernel. Read by
+	 * user-space with single-copy atomicity semantics. This field should
+	 * only be read by the thread which registered this data structure.
+	 * Aligned on 32-bit. Contains the current NUMA node ID.
+	 */
+	__u32 node_id;
+
 	/*
 	 * Flexible array member at end of structure, after last feature field.
 	 */
diff --git a/kernel/rseq.c b/kernel/rseq.c
index 46dc5c2ce2b7..cb7d8a5afc82 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -84,15 +84,17 @@
  *   F1. <failure>
  */
 
-static int rseq_update_cpu_id(struct task_struct *t)
+static int rseq_update_cpu_node_id(struct task_struct *t)
 {
-	u32 cpu_id = raw_smp_processor_id();
 	struct rseq __user *rseq = t->rseq;
+	u32 cpu_id = raw_smp_processor_id();
+	u32 node_id = cpu_to_node(cpu_id);
 
 	if (!user_write_access_begin(rseq, t->rseq_len))
 		goto efault;
 	unsafe_put_user(cpu_id, &rseq->cpu_id_start, efault_end);
 	unsafe_put_user(cpu_id, &rseq->cpu_id, efault_end);
+	unsafe_put_user(node_id, &rseq->node_id, efault_end);
 	/*
 	 * Additional feature fields added after ORIG_RSEQ_SIZE
 	 * need to be conditionally updated only if
@@ -108,9 +110,9 @@ static int rseq_update_cpu_id(struct task_struct *t)
 	return -EFAULT;
 }
 
-static int rseq_reset_rseq_cpu_id(struct task_struct *t)
+static int rseq_reset_rseq_cpu_node_id(struct task_struct *t)
 {
-	u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED;
+	u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED, node_id = 0;
 
 	/*
 	 * Reset cpu_id_start to its initial state (0).
@@ -124,6 +126,11 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
 	 */
 	if (put_user(cpu_id, &t->rseq->cpu_id))
 		return -EFAULT;
+	/*
+	 * Reset node_id to its initial state (0).
+	 */
+	if (put_user(node_id, &t->rseq->node_id))
+		return -EFAULT;
 	/*
 	 * Additional feature fields added after ORIG_RSEQ_SIZE
 	 * need to be conditionally reset only if
@@ -306,7 +313,7 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
 		if (unlikely(ret < 0))
 			goto error;
 	}
-	if (unlikely(rseq_update_cpu_id(t)))
+	if (unlikely(rseq_update_cpu_node_id(t)))
 		goto error;
 	return;
 
@@ -353,7 +360,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 			return -EINVAL;
 		if (current->rseq_sig != sig)
 			return -EPERM;
-		ret = rseq_reset_rseq_cpu_id(current);
+		ret = rseq_reset_rseq_cpu_node_id(current);
 		if (ret)
 			return ret;
 		current->rseq = NULL;
-- 
2.25.1



* [RFC PATCH v4 04/25] selftests/rseq: Use ELF auxiliary vector for extensible rseq
  2022-09-21 19:24 [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
                   ` (2 preceding siblings ...)
  2022-09-21 19:24 ` [RFC PATCH v4 03/25] rseq: Extend struct rseq with numa node id Mathieu Desnoyers
@ 2022-09-21 19:24 ` Mathieu Desnoyers
  2022-09-21 19:24 ` [RFC PATCH v4 05/25] selftests/rseq: Implement rseq numa node id field selftest Mathieu Desnoyers
  2022-09-21 19:54 ` [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
  5 siblings, 0 replies; 9+ messages in thread
From: Mathieu Desnoyers @ 2022-09-21 19:24 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, David.Laight, carlos, Peter Oskolkov,
	Alexander Mikhalitsyn, Mathieu Desnoyers

Use the ELF auxiliary vector AT_RSEQ_FEATURE_SIZE to detect the RSEQ
features supported by the kernel.
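
One detail worth spelling out is the case where libc owns the registration:
the usable feature size is then bounded by the size libc registered. A minimal
sketch of that clamping follows, assuming a raw auxv value of 0 means the entry
is absent. The function name effective_feature_size and the
SKETCH_ORIG_FEATURE_SIZE macro are hypothetical.

```c
#include <assert.h>

/* Original struct rseq feature size: fields up to and including flags. */
#define SKETCH_ORIG_FEATURE_SIZE	20

/*
 * Combine the raw AT_RSEQ_FEATURE_SIZE value (0 if absent) with the
 * size of the area that was actually registered, and return the
 * number of bytes of rseq fields that may be relied upon.
 */
static unsigned int effective_feature_size(unsigned long auxv_feature_size,
					   unsigned int registered_size)
{
	unsigned int feature_size;

	if (auxv_feature_size)
		feature_size = (unsigned int)auxv_feature_size;
	else
		feature_size = SKETCH_ORIG_FEATURE_SIZE;
	/* Fields beyond the registered area are never written by the kernel. */
	if (feature_size > registered_size)
		feature_size = registered_size;
	return feature_size;
}
```

A field is then considered available when its end offset within the structure
does not exceed the returned value.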

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 tools/testing/selftests/rseq/rseq-abi.h |  5 ++
 tools/testing/selftests/rseq/rseq.c     | 68 ++++++++++++++++++++++---
 tools/testing/selftests/rseq/rseq.h     | 18 +++++--
 3 files changed, 79 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/rseq/rseq-abi.h b/tools/testing/selftests/rseq/rseq-abi.h
index a8c44d9af71f..00ac846d85b0 100644
--- a/tools/testing/selftests/rseq/rseq-abi.h
+++ b/tools/testing/selftests/rseq/rseq-abi.h
@@ -146,6 +146,11 @@ struct rseq_abi {
 	 *     this thread.
 	 */
 	__u32 flags;
+
+	/*
+	 * Flexible array member at end of structure, after last feature field.
+	 */
+	char end[];
 } __attribute__((aligned(4 * sizeof(__u64))));
 
 #endif /* _RSEQ_ABI_H */
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index 986b9458efb2..20ea536d1012 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -28,6 +28,8 @@
 #include <limits.h>
 #include <dlfcn.h>
 #include <stddef.h>
+#include <sys/auxv.h>
+#include <linux/auxvec.h>
 
 #include "../kselftest.h"
 #include "rseq.h"
@@ -36,20 +38,38 @@ static const ptrdiff_t *libc_rseq_offset_p;
 static const unsigned int *libc_rseq_size_p;
 static const unsigned int *libc_rseq_flags_p;
 
-/* Offset from the thread pointer to the rseq area.  */
+/* Offset from the thread pointer to the rseq area. */
 ptrdiff_t rseq_offset;
 
-/* Size of the registered rseq area.  0 if the registration was
-   unsuccessful.  */
+/*
+ * Size of the registered rseq area. 0 if the registration was
+ * unsuccessful.
+ */
 unsigned int rseq_size = -1U;
 
 /* Flags used during rseq registration.  */
 unsigned int rseq_flags;
 
+/*
+ * rseq feature size supported by the kernel. 0 if the registration was
+ * unsuccessful.
+ */
+unsigned int rseq_feature_size = -1U;
+
 static int rseq_ownership;
+static int rseq_reg_success;	/* At least one rseq registration has succeeded. */
+
+/* Allocate a large area for the TLS. */
+#define RSEQ_THREAD_AREA_ALLOC_SIZE	1024
+
+/* Original struct rseq feature size is 20 bytes. */
+#define ORIG_RSEQ_FEATURE_SIZE		20
+
+/* Original struct rseq allocation size is 32 bytes. */
+#define ORIG_RSEQ_ALLOC_SIZE		32
 
 static
-__thread struct rseq_abi __rseq_abi __attribute__((tls_model("initial-exec"))) = {
+__thread struct rseq_abi __rseq_abi __attribute__((tls_model("initial-exec"), aligned(RSEQ_THREAD_AREA_ALLOC_SIZE))) = {
 	.cpu_id = RSEQ_ABI_CPU_ID_UNINITIALIZED,
 };
 
@@ -84,10 +104,18 @@ int rseq_register_current_thread(void)
 		/* Treat libc's ownership as a successful registration. */
 		return 0;
 	}
-	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq_abi), 0, RSEQ_SIG);
-	if (rc)
+	rc = sys_rseq(&__rseq_abi, rseq_size, 0, RSEQ_SIG);
+	if (rc) {
+		if (RSEQ_READ_ONCE(rseq_reg_success)) {
+			/* Incoherent success/failure within process. */
+			abort();
+		}
+		rseq_size = 0;
+		rseq_feature_size = 0;
 		return -1;
+	}
 	assert(rseq_current_cpu_raw() >= 0);
+	RSEQ_WRITE_ONCE(rseq_reg_success, 1);
 	return 0;
 }
 
@@ -99,12 +127,28 @@ int rseq_unregister_current_thread(void)
 		/* Treat libc's ownership as a successful unregistration. */
 		return 0;
 	}
-	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq_abi), RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG);
+	rc = sys_rseq(&__rseq_abi, rseq_size, RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG);
 	if (rc)
 		return -1;
 	return 0;
 }
 
+static
+unsigned int get_rseq_feature_size(void)
+{
+	unsigned long auxv_rseq_feature_size, auxv_rseq_align;
+
+	auxv_rseq_align = getauxval(AT_RSEQ_ALIGN);
+	assert(!auxv_rseq_align || auxv_rseq_align <= RSEQ_THREAD_AREA_ALLOC_SIZE);
+
+	auxv_rseq_feature_size = getauxval(AT_RSEQ_FEATURE_SIZE);
+	assert(!auxv_rseq_feature_size || auxv_rseq_feature_size <= RSEQ_THREAD_AREA_ALLOC_SIZE);
+	if (auxv_rseq_feature_size)
+		return auxv_rseq_feature_size;
+	else
+		return ORIG_RSEQ_FEATURE_SIZE;
+}
+
 static __attribute__((constructor))
 void rseq_init(void)
 {
@@ -116,14 +160,21 @@ void rseq_init(void)
 		rseq_offset = *libc_rseq_offset_p;
 		rseq_size = *libc_rseq_size_p;
 		rseq_flags = *libc_rseq_flags_p;
+		rseq_feature_size = get_rseq_feature_size();
+		if (rseq_feature_size > rseq_size)
+			rseq_feature_size = rseq_size;
 		return;
 	}
 	if (!rseq_available())
 		return;
 	rseq_ownership = 1;
 	rseq_offset = (void *)&__rseq_abi - rseq_thread_pointer();
-	rseq_size = sizeof(struct rseq_abi);
 	rseq_flags = 0;
+	rseq_feature_size = get_rseq_feature_size();
+	if (rseq_feature_size == ORIG_RSEQ_FEATURE_SIZE)
+		rseq_size = ORIG_RSEQ_ALLOC_SIZE;
+	else
+		rseq_size = RSEQ_THREAD_AREA_ALLOC_SIZE;
 }
 
 static __attribute__((destructor))
@@ -133,6 +184,7 @@ void rseq_exit(void)
 		return;
 	rseq_offset = 0;
 	rseq_size = -1U;
+	rseq_feature_size = -1U;
 	rseq_ownership = 0;
 }
 
diff --git a/tools/testing/selftests/rseq/rseq.h b/tools/testing/selftests/rseq/rseq.h
index 6f7513384bf5..95adc1e1b0db 100644
--- a/tools/testing/selftests/rseq/rseq.h
+++ b/tools/testing/selftests/rseq/rseq.h
@@ -47,14 +47,24 @@
 
 #include "rseq-thread-pointer.h"
 
-/* Offset from the thread pointer to the rseq area.  */
+/* Offset from the thread pointer to the rseq area. */
 extern ptrdiff_t rseq_offset;
-/* Size of the registered rseq area.  0 if the registration was
-   unsuccessful.  */
+
+/*
+ * Size of the registered rseq area. 0 if the registration was
+ * unsuccessful.
+ */
 extern unsigned int rseq_size;
-/* Flags used during rseq registration.  */
+
+/* Flags used during rseq registration. */
 extern unsigned int rseq_flags;
 
+/*
+ * rseq feature size supported by the kernel. 0 if the registration was
+ * unsuccessful.
+ */
+extern unsigned int rseq_feature_size;
+
 static inline struct rseq_abi *rseq_get_abi(void)
 {
 	return (struct rseq_abi *) ((uintptr_t) rseq_thread_pointer() + rseq_offset);
-- 
2.25.1



* [RFC PATCH v4 05/25] selftests/rseq: Implement rseq numa node id field selftest
  2022-09-21 19:24 [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
                   ` (3 preceding siblings ...)
  2022-09-21 19:24 ` [RFC PATCH v4 04/25] selftests/rseq: Use ELF auxiliary vector for extensible rseq Mathieu Desnoyers
@ 2022-09-21 19:24 ` Mathieu Desnoyers
  2022-09-21 19:54 ` [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
  5 siblings, 0 replies; 9+ messages in thread
From: Mathieu Desnoyers @ 2022-09-21 19:24 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, David.Laight, carlos, Peter Oskolkov,
	Alexander Mikhalitsyn, Mathieu Desnoyers

Test the NUMA node id extension rseq field. Compare it against the value
returned by the getcpu(2) system call while pinned on a specific core.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 tools/testing/selftests/rseq/basic_test.c |  4 ++++
 tools/testing/selftests/rseq/rseq-abi.h   |  8 +++++++
 tools/testing/selftests/rseq/rseq.c       | 18 +++++++++++++++
 tools/testing/selftests/rseq/rseq.h       | 28 +++++++++++++++++++++++
 4 files changed, 58 insertions(+)

diff --git a/tools/testing/selftests/rseq/basic_test.c b/tools/testing/selftests/rseq/basic_test.c
index d8efbfb89193..295eea16466f 100644
--- a/tools/testing/selftests/rseq/basic_test.c
+++ b/tools/testing/selftests/rseq/basic_test.c
@@ -22,6 +22,8 @@ void test_cpu_pointer(void)
 	CPU_ZERO(&test_affinity);
 	for (i = 0; i < CPU_SETSIZE; i++) {
 		if (CPU_ISSET(i, &affinity)) {
+			int node;
+
 			CPU_SET(i, &test_affinity);
 			sched_setaffinity(0, sizeof(test_affinity),
 					&test_affinity);
@@ -29,6 +31,8 @@ void test_cpu_pointer(void)
 			assert(rseq_current_cpu() == i);
 			assert(rseq_current_cpu_raw() == i);
 			assert(rseq_cpu_start() == i);
+			node = rseq_fallback_current_node();
+			assert(rseq_current_node_id() == node);
 			CPU_CLR(i, &test_affinity);
 		}
 	}
diff --git a/tools/testing/selftests/rseq/rseq-abi.h b/tools/testing/selftests/rseq/rseq-abi.h
index 00ac846d85b0..a1faa9162d52 100644
--- a/tools/testing/selftests/rseq/rseq-abi.h
+++ b/tools/testing/selftests/rseq/rseq-abi.h
@@ -147,6 +147,14 @@ struct rseq_abi {
 	 */
 	__u32 flags;
 
+	/*
+	 * Restartable sequences node_id field. Updated by the kernel. Read by
+	 * user-space with single-copy atomicity semantics. This field should
+	 * only be read by the thread which registered this data structure.
+	 * Aligned on 32-bit. Contains the current NUMA node ID.
+	 */
+	__u32 node_id;
+
 	/*
 	 * Flexible array member at end of structure, after last feature field.
 	 */
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index 20ea536d1012..0a96c3c779cd 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -79,6 +79,11 @@ static int sys_rseq(struct rseq_abi *rseq_abi, uint32_t rseq_len,
 	return syscall(__NR_rseq, rseq_abi, rseq_len, flags, sig);
 }
 
+static int sys_getcpu(unsigned *cpu, unsigned *node)
+{
+	return syscall(__NR_getcpu, cpu, node, NULL);
+}
+
 int rseq_available(void)
 {
 	int rc;
@@ -199,3 +204,16 @@ int32_t rseq_fallback_current_cpu(void)
 	}
 	return cpu;
 }
+
+int32_t rseq_fallback_current_node(void)
+{
+	uint32_t cpu_id, node_id;
+	int ret;
+
+	ret = sys_getcpu(&cpu_id, &node_id);
+	if (ret) {
+		perror("sys_getcpu()");
+		return ret;
+	}
+	return (int32_t) node_id;
+}
diff --git a/tools/testing/selftests/rseq/rseq.h b/tools/testing/selftests/rseq/rseq.h
index 95adc1e1b0db..fd17d0e54a1b 100644
--- a/tools/testing/selftests/rseq/rseq.h
+++ b/tools/testing/selftests/rseq/rseq.h
@@ -20,6 +20,15 @@
 #include "rseq-abi.h"
 #include "compiler.h"
 
+#ifndef rseq_sizeof_field
+#define rseq_sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
+#endif
+
+#ifndef rseq_offsetofend
+#define rseq_offsetofend(TYPE, MEMBER) \
+	(offsetof(TYPE, MEMBER)	+ rseq_sizeof_field(TYPE, MEMBER))
+#endif
+
 /*
  * Empty code injection macros, override when testing.
  * It is important to consider that the ASM injection macros need to be
@@ -128,6 +137,11 @@ int rseq_unregister_current_thread(void);
  */
 int32_t rseq_fallback_current_cpu(void);
 
+/*
+ * Restartable sequence fallback for reading the current node number.
+ */
+int32_t rseq_fallback_current_node(void);
+
 /*
  * Values returned can be either the current CPU number, -1 (rseq is
  * uninitialized), or -2 (rseq initialization has failed).
@@ -163,6 +177,20 @@ static inline uint32_t rseq_current_cpu(void)
 	return cpu;
 }
 
+static inline bool rseq_node_id_available(void)
+{
+	return (int) rseq_feature_size >= rseq_offsetofend(struct rseq_abi, node_id);
+}
+
+/*
+ * Current NUMA node number.
+ */
+static inline uint32_t rseq_current_node_id(void)
+{
+	assert(rseq_node_id_available());
+	return RSEQ_ACCESS_ONCE(rseq_get_abi()->node_id);
+}
+
 static inline void rseq_clear_rseq_cs(void)
 {
 	RSEQ_WRITE_ONCE(rseq_get_abi()->rseq_cs.arch.ptr, 0);
-- 
2.25.1
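
The feature check and the getcpu(2) fallback in the patch above can be
sketched in isolation as follows. This is a minimal illustration under
stated assumptions, not the selftest code itself: struct demo_abi and
every demo_* name are hypothetical stand-ins, and the field layout is
invented for the example rather than copied from struct rseq_abi. The
idea it shows is that a field may be read only when the feature size
advertised by the kernel covers the last byte of that field, and that a
caller can otherwise fall back to the getcpu system call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical stand-in for an extensible ABI structure.
 * This is NOT the real struct rseq_abi; the layout is illustrative only.
 */
struct demo_abi {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t node_id;	/* field added by a later ABI revision */
};

/* Size of one member, computed without needing an instance. */
#define demo_sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Offset of the first byte past MEMBER within TYPE. */
#define demo_offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + demo_sizeof_field(TYPE, MEMBER))

/*
 * A field is usable only if the advertised feature size (assumed here
 * to be passed in by the caller) reaches the end of that field.
 */
static bool demo_node_id_available(size_t feature_size)
{
	return feature_size >= demo_offsetofend(struct demo_abi, node_id);
}

/*
 * Slow path: ask the kernel directly through getcpu.
 * Returns the NUMA node number, or -1 if the system call fails.
 */
static int32_t demo_fallback_current_node(void)
{
	unsigned int cpu, node;

	if (syscall(SYS_getcpu, &cpu, &node, NULL))
		return -1;
	return (int32_t) node;
}
```

A caller would typically test demo_node_id_available() once with the
size obtained at startup, then choose between the fast field read and
demo_fallback_current_node() accordingly.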



* Re: [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions
  2022-09-21 19:24 [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
                   ` (4 preceding siblings ...)
  2022-09-21 19:24 ` [RFC PATCH v4 05/25] selftests/rseq: Implement rseq numa node id field selftest Mathieu Desnoyers
@ 2022-09-21 19:54 ` Mathieu Desnoyers
  2022-09-22  8:10   ` Peter Zijlstra
  5 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2022-09-21 19:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, David.Laight, carlos, Peter Oskolkov,
	Alexander Mikhalitsyn

On 2022-09-21 15:24, Mathieu Desnoyers wrote:
> Extend the rseq ABI to expose a NUMA node ID and a vm_vcpu_id field.
> 
> The NUMA node ID field allows implementing a faster getcpu(2) in libc.
> 
> The virtual cpu id allows ideal scaling (down or up) of user-space
> per-cpu data structures. The virtual cpu ids allocated within a memory
> space are tracked by the scheduler, which takes into account the number
> of concurrently running threads, thus implicitly considering the number
> of threads, the cpu affinity, the cpusets applying to those threads, and
> the number of logical cores on the system.
> 
> This series is based on the v5.19 tag.

Hi Peter,

I'm having MTA issues at the moment. I will resend the series as soon as 
I can get hold of my sysadmin.

Sorry about that.

Thanks,

Mathieu

> 
> Thanks,
> 
> Mathieu
> 
> Mathieu Desnoyers (25):
>    rseq: Introduce feature size and alignment ELF auxiliary vector
>      entries
>    rseq: Introduce extensible rseq ABI
>    rseq: Extend struct rseq with numa node id
>    selftests/rseq: Use ELF auxiliary vector for extensible rseq
>    selftests/rseq: Implement rseq numa node id field selftest
>    lib: Invert _find_next_bit source arguments
>    lib: Implement find_{first,next}_{zero,one}_and_zero_bit
>    cpumask: Implement cpumask_{first,next}_{zero,one}_and_zero
>    sched: Introduce per memory space current virtual cpu id
>    rseq: Extend struct rseq with per memory space vcpu id
>    selftests/rseq: Remove RSEQ_SKIP_FASTPATH code
>    selftests/rseq: Implement rseq vm_vcpu_id field support
>    selftests/rseq: x86: Template memory ordering and percpu access mode
>    selftests/rseq: arm: Template memory ordering and percpu access mode
>    selftests/rseq: arm64: Template memory ordering and percpu access mode
>    selftests/rseq: mips: Template memory ordering and percpu access mode
>    selftests/rseq: ppc: Template memory ordering and percpu access mode
>    selftests/rseq: s390: Template memory ordering and percpu access mode
>    selftests/rseq: riscv: Template memory ordering and percpu access mode
>    selftests/rseq: Implement basic percpu ops vm_vcpu_id test
>    selftests/rseq: Implement parametrized vm_vcpu_id test
>    selftests/rseq: x86: Implement rseq_load_u32_u32
>    selftests/rseq: Implement numa node id vs vm_vcpu_id invariant test
>    selftests/rseq: parametrized test: Report/abort on negative cpu id
>    tracing/rseq: Add mm_vcpu_id field to rseq_update
> 
>   fs/binfmt_elf.c                               |    5 +
>   fs/exec.c                                     |    6 +
>   include/linux/cpumask.h                       |   86 ++
>   include/linux/find.h                          |  123 +-
>   include/linux/mm.h                            |   25 +
>   include/linux/mm_types.h                      |  110 +-
>   include/linux/sched.h                         |    9 +
>   include/trace/events/rseq.h                   |    7 +-
>   include/uapi/linux/auxvec.h                   |    2 +
>   include/uapi/linux/rseq.h                     |   22 +
>   init/Kconfig                                  |    4 +
>   kernel/fork.c                                 |   11 +-
>   kernel/ptrace.c                               |    2 +-
>   kernel/rseq.c                                 |   61 +-
>   kernel/sched/core.c                           |   49 +
>   kernel/sched/sched.h                          |  166 +++
>   kernel/signal.c                               |    2 +
>   lib/find_bit.c                                |   17 +-
>   tools/include/linux/find.h                    |    9 +-
>   tools/lib/find_bit.c                          |   17 +-
>   tools/testing/selftests/rseq/.gitignore       |    5 +
>   tools/testing/selftests/rseq/Makefile         |   20 +-
>   .../testing/selftests/rseq/basic_numa_test.c  |  117 ++
>   .../selftests/rseq/basic_percpu_ops_test.c    |   46 +-
>   tools/testing/selftests/rseq/basic_test.c     |    4 +
>   tools/testing/selftests/rseq/compiler.h       |    6 +
>   tools/testing/selftests/rseq/param_test.c     |  157 ++-
>   tools/testing/selftests/rseq/rseq-abi.h       |   22 +
>   tools/testing/selftests/rseq/rseq-arm-bits.h  |  505 +++++++
>   tools/testing/selftests/rseq/rseq-arm.h       |  701 +---------
>   .../testing/selftests/rseq/rseq-arm64-bits.h  |  392 ++++++
>   tools/testing/selftests/rseq/rseq-arm64.h     |  520 +------
>   .../testing/selftests/rseq/rseq-bits-reset.h  |   10 +
>   .../selftests/rseq/rseq-bits-template.h       |   39 +
>   tools/testing/selftests/rseq/rseq-mips-bits.h |  462 +++++++
>   tools/testing/selftests/rseq/rseq-mips.h      |  646 +--------
>   tools/testing/selftests/rseq/rseq-ppc-bits.h  |  454 +++++++
>   tools/testing/selftests/rseq/rseq-ppc.h       |  617 +--------
>   .../testing/selftests/rseq/rseq-riscv-bits.h  |  410 ++++++
>   tools/testing/selftests/rseq/rseq-riscv.h     |  529 +-------
>   tools/testing/selftests/rseq/rseq-s390-bits.h |  474 +++++++
>   tools/testing/selftests/rseq/rseq-s390.h      |  495 +------
>   tools/testing/selftests/rseq/rseq-skip.h      |   65 -
>   tools/testing/selftests/rseq/rseq-x86-bits.h  | 1036 ++++++++++++++
>   tools/testing/selftests/rseq/rseq-x86.h       | 1193 +----------------
>   tools/testing/selftests/rseq/rseq.c           |   86 +-
>   tools/testing/selftests/rseq/rseq.h           |  229 +++-
>   .../testing/selftests/rseq/run_param_test.sh  |    5 +
>   48 files changed, 5286 insertions(+), 4692 deletions(-)
>   create mode 100644 tools/testing/selftests/rseq/basic_numa_test.c
>   create mode 100644 tools/testing/selftests/rseq/rseq-arm-bits.h
>   create mode 100644 tools/testing/selftests/rseq/rseq-arm64-bits.h
>   create mode 100644 tools/testing/selftests/rseq/rseq-bits-reset.h
>   create mode 100644 tools/testing/selftests/rseq/rseq-bits-template.h
>   create mode 100644 tools/testing/selftests/rseq/rseq-mips-bits.h
>   create mode 100644 tools/testing/selftests/rseq/rseq-ppc-bits.h
>   create mode 100644 tools/testing/selftests/rseq/rseq-riscv-bits.h
>   create mode 100644 tools/testing/selftests/rseq/rseq-s390-bits.h
>   delete mode 100644 tools/testing/selftests/rseq/rseq-skip.h
>   create mode 100644 tools/testing/selftests/rseq/rseq-x86-bits.h
> 


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


* Re: [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions
  2022-09-21 19:54 ` [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
@ 2022-09-22  8:10   ` Peter Zijlstra
  2022-09-22 10:59     ` Mathieu Desnoyers
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2022-09-22  8:10 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, David.Laight, carlos, Peter Oskolkov,
	Alexander Mikhalitsyn

On Wed, Sep 21, 2022 at 03:54:18PM -0400, Mathieu Desnoyers wrote:
> On 2022-09-21 15:24, Mathieu Desnoyers wrote:
> > Extend the rseq ABI to expose a NUMA node ID and a vm_vcpu_id field.
> > 
> > The NUMA node ID field allows implementing a faster getcpu(2) in libc.
> > 
> > The virtual cpu id allows ideal scaling (down or up) of user-space
> > per-cpu data structures. The virtual cpu ids allocated within a memory
> > space are tracked by the scheduler, which takes into account the number
> > of concurrently running threads, thus implicitly considering the number
> > of threads, the cpu affinity, the cpusets applying to those threads, and
> > the number of logical cores on the system.
> > 
> > This series is based on the v5.19 tag.
> 
> Hi Peter,
> 
> I'm having MTA issues at the moment. I will resend the series as soon as I
> can get hold of my sysadmin.

It landed in my inbox and Lore seems to have received a copy too; as
per:

  https://lkml.kernel.org/r/14ba275f-8ddc-33fc-2669-1c336436f473@efficios.com

So I'm thinking you did manage to send out mail and all is well.

I'll try and have a look later today.


* Re: [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions
  2022-09-22  8:10   ` Peter Zijlstra
@ 2022-09-22 10:59     ` Mathieu Desnoyers
  0 siblings, 0 replies; 9+ messages in thread
From: Mathieu Desnoyers @ 2022-09-22 10:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, David.Laight, carlos, Peter Oskolkov,
	Alexander Mikhalitsyn

On 2022-09-22 04:10, Peter Zijlstra wrote:
> On Wed, Sep 21, 2022 at 03:54:18PM -0400, Mathieu Desnoyers wrote:
[...]
>> Hi Peter,
>>
>> I'm having MTA issues at the moment. I will resend the series as soon as I
>> can get hold of my sysadmin.
> 
> It landed in my inbox and Lore seems to have received a copy too; as
> per:
> 
>    https://lkml.kernel.org/r/14ba275f-8ddc-33fc-2669-1c336436f473@efficios.com
> 
> So I'm thinking you did manage to send out mail and all is well.
> 
> I'll try and have a look later today.

AFAIU my ISP's MTA only sent the first 7 emails in the series, and the 
rest are nowhere to be seen.

I've hopefully managed to fix my issues now. Let me try to resend the 
whole series. (without RFC tag this time)

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Thread overview: 9+ messages
2022-09-21 19:24 [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
2022-09-21 19:24 ` [RFC PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries Mathieu Desnoyers
2022-09-21 19:24 ` [RFC PATCH v4 02/25] rseq: Introduce extensible rseq ABI Mathieu Desnoyers
2022-09-21 19:24 ` [RFC PATCH v4 03/25] rseq: Extend struct rseq with numa node id Mathieu Desnoyers
2022-09-21 19:24 ` [RFC PATCH v4 04/25] selftests/rseq: Use ELF auxiliary vector for extensible rseq Mathieu Desnoyers
2022-09-21 19:24 ` [RFC PATCH v4 05/25] selftests/rseq: Implement rseq numa node id field selftest Mathieu Desnoyers
2022-09-21 19:54 ` [RFC PATCH v4 00/25] RSEQ node id and virtual cpu id extensions Mathieu Desnoyers
2022-09-22  8:10   ` Peter Zijlstra
2022-09-22 10:59     ` Mathieu Desnoyers
