linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] rseq: Introduce extensible struct rseq
@ 2020-07-14  3:03 Mathieu Desnoyers
  2020-07-14  3:03 ` [RFC PATCH 1/4] selftests: rseq: Use fixed value as rseq_len parameter Mathieu Desnoyers
                   ` (4 more replies)
  0 siblings, 5 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-14  3:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, carlos, Mathieu Desnoyers

Recent discussion led to a solution for extending struct rseq. This is
an implementation of the proposed solution.

Now is a good time to agree on this scheme before the release of glibc
2.32, just in case there are small details to fix on the user-space
side in order to allow extending struct rseq.

Thanks,

Mathieu

Mathieu Desnoyers (4):
  selftests: rseq: Use fixed value as rseq_len parameter
  rseq: Allow extending struct rseq
  selftests: rseq: define __rseq_abi with extensible size
  selftests: rseq: print rseq extensible size in basic test

 include/linux/sched.h                     |  4 +++
 include/uapi/linux/rseq.h                 | 42 ++++++++++++++++++++--
 kernel/rseq.c                             | 44 +++++++++++++++++++----
 tools/testing/selftests/rseq/basic_test.c | 15 ++++++++
 tools/testing/selftests/rseq/rseq.c       |  8 +++--
 5 files changed, 101 insertions(+), 12 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [RFC PATCH 1/4] selftests: rseq: Use fixed value as rseq_len parameter
  2020-07-14  3:03 [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Mathieu Desnoyers
@ 2020-07-14  3:03 ` Mathieu Desnoyers
  2020-07-14  3:03 ` [RFC PATCH 2/4] rseq: Allow extending struct rseq Mathieu Desnoyers
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-14  3:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, carlos, Mathieu Desnoyers

The rseq registration and unregistration expect a fixed-size length
(32 bytes). In preparation to extend struct rseq, pass a fixed value
rather than the size of the rseq structure.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 include/uapi/linux/rseq.h           | 5 +++++
 tools/testing/selftests/rseq/rseq.c | 5 ++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 9a402fdb60e9..e11d9df5e564 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -37,6 +37,11 @@ enum rseq_cs_flags {
 		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
 };
 
+/* The rseq_len expected by rseq registration is always 32 bytes. */
+enum rseq_len_expected {
+	RSEQ_LEN_EXPECTED = 32,
+};
+
 /*
  * struct rseq_cs is aligned on 4 * 8 bytes to ensure it is always
  * contained within a single cache-line. It is usually declared as
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index 7159eb777fd3..da2264c679b9 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -87,7 +87,7 @@ int rseq_register_current_thread(void)
 	}
 	if (__rseq_refcount++)
 		goto end;
-	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq), 0, RSEQ_SIG);
+	rc = sys_rseq(&__rseq_abi, RSEQ_LEN_EXPECTED, 0, RSEQ_SIG);
 	if (!rc) {
 		assert(rseq_current_cpu_raw() >= 0);
 		goto end;
@@ -115,8 +115,7 @@ int rseq_unregister_current_thread(void)
 	}
 	if (--__rseq_refcount)
 		goto end;
-	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq),
-		      RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
+	rc = sys_rseq(&__rseq_abi, RSEQ_LEN_EXPECTED, RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
 	if (!rc)
 		goto end;
 	__rseq_refcount = 1;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14  3:03 [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Mathieu Desnoyers
  2020-07-14  3:03 ` [RFC PATCH 1/4] selftests: rseq: Use fixed value as rseq_len parameter Mathieu Desnoyers
@ 2020-07-14  3:03 ` Mathieu Desnoyers
  2020-07-14  9:58   ` Florian Weimer
                     ` (2 more replies)
  2020-07-14  3:03 ` [RFC PATCH 3/4] selftests: rseq: define __rseq_abi with extensible size Mathieu Desnoyers
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-14  3:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, carlos, Mathieu Desnoyers

Add a __rseq_abi.flags "RSEQ_TLS_FLAG_SIZE", which indicates support for
extending struct rseq. This adds two new fields to struct rseq:
user_size and kernel_size.

The user_size field allows the size of the __rseq_abi definition (which
can be overridden by symbol interposition either by a preloaded library
or by the application) to be handed over to the kernel at registration.
This registration can be performed by a library, e.g. glibc, which does
not know there is interposition taking place.

The kernel_size is populated by the kernel when the "RSEQ_TLS_FLAG_SIZE"
flag is set in __rseq_abi.flags to the minimum between user_size and
the offset of the "end" field of struct rseq as known by the kernel.
This allows user-space to query which fields are effectively populated
by the kernel.

A rseq_size field is added to the task struct to keep track of the
"kernel_size" effective for each thread.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 include/linux/sched.h     |  4 ++++
 include/uapi/linux/rseq.h | 37 ++++++++++++++++++++++++++++++++--
 kernel/rseq.c             | 42 +++++++++++++++++++++++++++++++++------
 3 files changed, 75 insertions(+), 8 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 692e327d7455..5d61a3197987 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1147,6 +1147,7 @@ struct task_struct {
 #ifdef CONFIG_RSEQ
 	struct rseq __user *rseq;
 	u32 rseq_sig;
+	u32 rseq_size;
 	/*
 	 * RmW on rseq_event_mask must be performed atomically
 	 * with respect to preemption.
@@ -1976,10 +1977,12 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags)
 	if (clone_flags & CLONE_VM) {
 		t->rseq = NULL;
 		t->rseq_sig = 0;
+		t->rseq_size = 0;
 		t->rseq_event_mask = 0;
 	} else {
 		t->rseq = current->rseq;
 		t->rseq_sig = current->rseq_sig;
+		t->rseq_size = current->rseq_size;
 		t->rseq_event_mask = current->rseq_event_mask;
 	}
 }
@@ -1988,6 +1991,7 @@ static inline void rseq_execve(struct task_struct *t)
 {
 	t->rseq = NULL;
 	t->rseq_sig = 0;
+	t->rseq_size = 0;
 	t->rseq_event_mask = 0;
 }
 
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index e11d9df5e564..03c0b5e9a859 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -37,6 +37,15 @@ enum rseq_cs_flags {
 		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
 };
 
+enum rseq_tls_flags_bit {
+	/* enum rseq_cs_flags reserves bits 0-2. */
+	RSEQ_TLS_FLAG_SIZE_BIT = 3,
+};
+
+enum rseq_tls_flags {
+	RSEQ_TLS_FLAG_SIZE = (1U << RSEQ_TLS_FLAG_SIZE_BIT),
+};
+
 /* The rseq_len expected by rseq registration is always 32 bytes. */
 enum rseq_len_expected {
 	RSEQ_LEN_EXPECTED = 32,
@@ -133,8 +142,9 @@ struct rseq {
 	 *
 	 * This field should only be updated by the thread which
 	 * registered this data structure. Read by the kernel.
-	 * Mainly used for single-stepping through rseq critical sections
-	 * with debuggers.
+	 *
+	 * The RSEQ_CS flags are mainly used for single-stepping through rseq
+	 * critical sections with debuggers.
 	 *
 	 * - RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
 	 *     Inhibit instruction sequence block restart on preemption
@@ -145,8 +155,31 @@ struct rseq {
 	 * - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
 	 *     Inhibit instruction sequence block restart on migration for
 	 *     this thread.
+	 *
+	 * - RSEQ_TLS_FLAG_SIZE
+	 *     Extensible struct rseq ABI. This flag should be statically
+	 *     initialized.
 	 */
 	__u32 flags;
+	/*
+	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, user_size should be
+	 * statically initialized to offsetof(struct rseq, end).
+	 */
+	__u16 user_size;
+	/*
+	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, if the kernel supports
+	 * extensible struct rseq ABI, the kernel_size field is populated by
+	 * the kernel to the minimum between user_size and the offset of the
+	 * "end" field within the struct rseq supported by the kernel on
+	 * successful registration. Should be initialized to 0.
+	 */
+	__u16 kernel_size;
+
+	/*
+	 * Very last field of the structure, to calculate size excluding padding
+	 * with offsetof().
+	 */
+	char end[];
 } __attribute__((aligned(4 * sizeof(__u64))));
 
 #endif /* _UAPI_LINUX_RSEQ_H */
diff --git a/kernel/rseq.c b/kernel/rseq.c
index a4f86a9d6937..bbc57fc18573 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -96,6 +96,7 @@ static int rseq_update_cpu_id(struct task_struct *t)
 static int rseq_reset_rseq_cpu_id(struct task_struct *t)
 {
 	u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED;
+	u16 kernel_size = 0;
 
 	/*
 	 * Reset cpu_id_start to its initial state (0).
@@ -109,6 +110,11 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
 	 */
 	if (put_user(cpu_id, &t->rseq->cpu_id))
 		return -EFAULT;
+	/*
+	 * Reset kernel_size to its initial state (0).
+	 */
+	if (put_user(kernel_size, &t->rseq->kernel_size))
+		return -EFAULT;
 	return 0;
 }
 
@@ -266,7 +272,7 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
 
 	if (unlikely(t->flags & PF_EXITING))
 		return;
-	if (unlikely(!access_ok(t->rseq, sizeof(*t->rseq))))
+	if (unlikely(!access_ok(t->rseq, t->rseq_size)))
 		goto error;
 	ret = rseq_ip_fixup(regs);
 	if (unlikely(ret < 0))
@@ -294,7 +300,7 @@ void rseq_syscall(struct pt_regs *regs)
 
 	if (!t->rseq)
 		return;
-	if (!access_ok(t->rseq, sizeof(*t->rseq)) ||
+	if (!access_ok(t->rseq, t->rseq_size) ||
 	    rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
 		force_sig(SIGSEGV);
 }
@@ -308,6 +314,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		int, flags, u32, sig)
 {
 	int ret;
+	u32 tls_flags;
 
 	if (flags & RSEQ_FLAG_UNREGISTER) {
 		if (flags & ~RSEQ_FLAG_UNREGISTER)
@@ -315,7 +322,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		/* Unregister rseq for current thread. */
 		if (current->rseq != rseq || !current->rseq)
 			return -EINVAL;
-		if (rseq_len != sizeof(*rseq))
+		if (rseq_len != RSEQ_LEN_EXPECTED)
 			return -EINVAL;
 		if (current->rseq_sig != sig)
 			return -EPERM;
@@ -323,6 +330,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		if (ret)
 			return ret;
 		current->rseq = NULL;
+		current->rseq_size = 0;
 		current->rseq_sig = 0;
 		return 0;
 	}
@@ -336,7 +344,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		 * the provided address differs from the prior
 		 * one.
 		 */
-		if (current->rseq != rseq || rseq_len != sizeof(*rseq))
+		if (current->rseq != rseq || rseq_len != RSEQ_LEN_EXPECTED)
 			return -EINVAL;
 		if (current->rseq_sig != sig)
 			return -EPERM;
@@ -349,10 +357,32 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 	 * ensure the provided rseq is properly aligned and valid.
 	 */
 	if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
-	    rseq_len != sizeof(*rseq))
+	    rseq_len != RSEQ_LEN_EXPECTED)
 		return -EINVAL;
-	if (!access_ok(rseq, rseq_len))
+	if (!access_ok(rseq, RSEQ_LEN_EXPECTED))
 		return -EFAULT;
+
+	/* Handle extensible struct rseq ABI. */
+	ret = get_user(tls_flags, &rseq->flags);
+	if (ret)
+		return ret;
+	if (tls_flags & RSEQ_TLS_FLAG_SIZE) {
+		u16 user_size, kernel_size;
+
+		ret = get_user(user_size, &rseq->user_size);
+		if (ret)
+			return ret;
+		if (user_size < offsetof(struct rseq, kernel_size) + sizeof(u16))
+			return -EINVAL;
+		kernel_size = min_t(u16, user_size, offsetof(struct rseq, end));
+		ret = put_user(kernel_size, &rseq->kernel_size);
+		if (ret)
+			return ret;
+		current->rseq_size = kernel_size;
+	} else {
+		current->rseq_size = offsetof(struct rseq, user_size);
+	}
+
 	current->rseq = rseq;
 	current->rseq_sig = sig;
 	/*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC PATCH 3/4] selftests: rseq: define __rseq_abi with extensible size
  2020-07-14  3:03 [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Mathieu Desnoyers
  2020-07-14  3:03 ` [RFC PATCH 1/4] selftests: rseq: Use fixed value as rseq_len parameter Mathieu Desnoyers
  2020-07-14  3:03 ` [RFC PATCH 2/4] rseq: Allow extending struct rseq Mathieu Desnoyers
@ 2020-07-14  3:03 ` Mathieu Desnoyers
  2020-07-14  3:03 ` [RFC PATCH 4/4] selftests: rseq: print rseq extensible size in basic test Mathieu Desnoyers
  2020-07-14 20:55 ` [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Carlos O'Donell
  4 siblings, 0 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-14  3:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, carlos, Mathieu Desnoyers

Define the __rseq_abi with the extensible size feature.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 tools/testing/selftests/rseq/rseq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index da2264c679b9..2c29215d4790 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -26,6 +26,7 @@
 #include <assert.h>
 #include <signal.h>
 #include <limits.h>
+#include <stddef.h>
 
 #include "rseq.h"
 
@@ -33,6 +34,8 @@
 
 __thread volatile struct rseq __rseq_abi = {
 	.cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
+	.flags = RSEQ_TLS_FLAG_SIZE,
+	.user_size = offsetof(struct rseq, end),
 };
 
 /*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC PATCH 4/4] selftests: rseq: print rseq extensible size in basic test
  2020-07-14  3:03 [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Mathieu Desnoyers
                   ` (2 preceding siblings ...)
  2020-07-14  3:03 ` [RFC PATCH 3/4] selftests: rseq: define __rseq_abi with extensible size Mathieu Desnoyers
@ 2020-07-14  3:03 ` Mathieu Desnoyers
  2020-07-14 20:55 ` [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Carlos O'Donell
  4 siblings, 0 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-14  3:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer, carlos, Mathieu Desnoyers

Print whether extensible size feature is supported by the kernel
and __rseq_abi definition, along with the contents of the kernel_size
field if it is available.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 tools/testing/selftests/rseq/basic_test.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/testing/selftests/rseq/basic_test.c b/tools/testing/selftests/rseq/basic_test.c
index d8efbfb89193..976e040574a1 100644
--- a/tools/testing/selftests/rseq/basic_test.c
+++ b/tools/testing/selftests/rseq/basic_test.c
@@ -10,6 +10,7 @@
 #include <stdio.h>
 #include <string.h>
 #include <sys/time.h>
+#include <inttypes.h>
 
 #include "rseq.h"
 
@@ -35,6 +36,17 @@ void test_cpu_pointer(void)
 	sched_setaffinity(0, sizeof(affinity), &affinity);
 }
 
+static void print_rseq_size(void)
+{
+	bool supported = (__rseq_abi.flags & RSEQ_TLS_FLAG_SIZE) && __rseq_abi.kernel_size != 0;
+
+	printf("extensible struct rseq supported: %s",
+		supported ? "yes" : "no");
+	if (supported)
+		printf(" (kernel_size=%" PRIu16 ")", __rseq_abi.kernel_size);
+	printf("\n");
+}
+
 int main(int argc, char **argv)
 {
 	if (rseq_register_current_thread()) {
@@ -42,6 +54,9 @@ int main(int argc, char **argv)
 			errno, strerror(errno));
 		goto init_thread_error;
 	}
+
+	print_rseq_size();
+
 	printf("testing current cpu\n");
 	test_cpu_pointer();
 	if (rseq_unregister_current_thread()) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14  3:03 ` [RFC PATCH 2/4] rseq: Allow extending struct rseq Mathieu Desnoyers
@ 2020-07-14  9:58   ` Florian Weimer
  2020-07-14 12:50     ` Mathieu Desnoyers
  2020-07-14 17:24   ` Peter Oskolkov
  2020-07-15 11:38   ` Christian Brauner
  2 siblings, 1 reply; 37+ messages in thread
From: Florian Weimer @ 2020-07-14  9:58 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Paul E . McKenney,
	Boqun Feng, H . Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

* Mathieu Desnoyers:

> +	/*
> +	 * Very last field of the structure, to calculate size excluding padding
> +	 * with offsetof().
> +	 */
> +	char end[];
>  } __attribute__((aligned(4 * sizeof(__u64))));

This makes the header incompatible with standard C++.

How are extensions going to affect the definition of struct rseq,
including its alignment?

As things stand now, glibc 2.32 will make the size and alignment of
struct rseq part of its ABI, so it can't really change after that.

With a different approach, we can avoid making the symbol size part of
the ABI, but then we cannot use the __rseq_abi TLS symbol.  As a result,
interoperability with early adopters would be lost.

One way to avoid this problem would be for every library to register its
own rseq area, of the appropriate size.  Then process-wide coordination
in userspace would not be needed.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14  9:58   ` Florian Weimer
@ 2020-07-14 12:50     ` Mathieu Desnoyers
  2020-07-14 13:00       ` Florian Weimer
  0 siblings, 1 reply; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-14 12:50 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jul 14, 2020, at 5:58 AM, Florian Weimer fweimer@redhat.com wrote:

> * Mathieu Desnoyers:
> 
>> +	/*
>> +	 * Very last field of the structure, to calculate size excluding padding
>> +	 * with offsetof().
>> +	 */
>> +	char end[];
>>  } __attribute__((aligned(4 * sizeof(__u64))));
> 
> This makes the header incompatible with standard C++.

One alternative would be to add a helper to compute the effective size on c++, e.g.:

/* Always updated with struct rseq_cs declaration.  */
#define rseq_last_field kernel_size

static inline size_t rseq_effective_size(void)
{
    return offsetof(struct rseq, rseq_last_field) + sizeof(((struct rseq *)NULL)->rseq_last_field);
}

> 
> How are extensions going to affect the definition of struct rseq,
> including its alignment?

The alignment will never decrease. If the structure becomes large enough
its alignment could theoretically increase. Would that be an issue ?


> As things stand now, glibc 2.32 will make the size and alignment of
> struct rseq part of its ABI, so it can't really change after that.

Can the size and alignment of a structure be defined as minimum alignment
and size values ? For instance, those would be invariant for a given glibc
version (if we always use the internal struct rseq declaration), but could
be increased in future versions.

> With a different approach, we can avoid making the symbol size part of
> the ABI, but then we cannot use the __rseq_abi TLS symbol.  As a result,
> interoperability with early adopters would be lost.

Do you mean with a function "getter", and then keeping that pointer around
in a per-user TLS ? I would prefer to avoid that because it adds an extra
pointer dereference on a fast path.

> One way to avoid this problem would be for every library to register its
> own rseq area, of the appropriate size.  Then process-wide coordination
> in userspace would not be needed.

I did propose the code to do just that in my initial rseq implementations, but
the idea was shutdown by kernel maintainers because it required the kernel to
handle a linked-list of rseq areas per thread, which was more complex within
the kernel.

Thanks,

Mathieu

> 
> Thanks,
> Florian

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14 12:50     ` Mathieu Desnoyers
@ 2020-07-14 13:00       ` Florian Weimer
  2020-07-14 13:19         ` Mathieu Desnoyers
  0 siblings, 1 reply; 37+ messages in thread
From: Florian Weimer @ 2020-07-14 13:00 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

* Mathieu Desnoyers:

>> How are extensions going to affect the definition of struct rseq,
>> including its alignment?
>
> The alignment will never decrease. If the structure becomes large enough
> its alignment could theoretically increase. Would that be an issue ?

Telling the compiler that struct is larger than it actually is, or that
it has more alignment than in memory, results in undefined behavior,
even if only fields are accessed in the smaller struct region.

An increase in alignment from 32 to 64 is perhaps not likely to have
this effect.  But the undefined behavior is still there, and has been
observed for mismatches like 8 vs 16.

>> As things stand now, glibc 2.32 will make the size and alignment of
>> struct rseq part of its ABI, so it can't really change after that.
>
> Can the size and alignment of a structure be defined as minimum alignment
> and size values ? For instance, those would be invariant for a given glibc
> version (if we always use the internal struct rseq declaration), but could
> be increased in future versions.

Not if we are talking about a global (TLS) data symbol.  No such changes
are possible there.  We have some workarounds for symbols that live
exclusively within glibc, but they don't work if there are libraries out
there which interpose the symbol.

>> With a different approach, we can avoid making the symbol size part of
>> the ABI, but then we cannot use the __rseq_abi TLS symbol.  As a result,
>> interoperability with early adopters would be lost.
>
> Do you mean with a function "getter", and then keeping that pointer around
> in a per-user TLS ? I would prefer to avoid that because it adds an extra
> pointer dereference on a fast path.

My choice would have been a function that returns the offset from the
thread pointer (which has to be unchanged regarding all threads).

Thanks,
Florian


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14 13:00       ` Florian Weimer
@ 2020-07-14 13:19         ` Mathieu Desnoyers
  2020-07-14 21:30           ` Carlos O'Donell
  0 siblings, 1 reply; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-14 13:19 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

----- On Jul 14, 2020, at 9:00 AM, Florian Weimer fweimer@redhat.com wrote:

> * Mathieu Desnoyers:
> 
>>> How are extensions going to affect the definition of struct rseq,
>>> including its alignment?
>>
>> The alignment will never decrease. If the structure becomes large enough
>> its alignment could theoretically increase. Would that be an issue ?
> 
> Telling the compiler that struct is larger than it actually is, or that
> it has more alignment than in memory, results in undefined behavior,
> even if only fields are accessed in the smaller struct region.
> 
> An increase in alignment from 32 to 64 is perhaps not likely to have
> this effect.  But the undefined behavior is still there, and has been
> observed for mismatches like 8 vs 16.

Good points.

> 
>>> As things stand now, glibc 2.32 will make the size and alignment of
>>> struct rseq part of its ABI, so it can't really change after that.
>>
>> Can the size and alignment of a structure be defined as minimum alignment
>> and size values ? For instance, those would be invariant for a given glibc
>> version (if we always use the internal struct rseq declaration), but could
>> be increased in future versions.
> 
> Not if we are talking about a global (TLS) data symbol.  No such changes
> are possible there.  We have some workarounds for symbols that live
> exclusively within glibc, but they don't work if there are libraries out
> there which interpose the symbol.

OK

> 
>>> With a different approach, we can avoid making the symbol size part of
>>> the ABI, but then we cannot use the __rseq_abi TLS symbol.  As a result,
>>> interoperability with early adopters would be lost.
>>
>> Do you mean with a function "getter", and then keeping that pointer around
>> in a per-user TLS ? I would prefer to avoid that because it adds an extra
>> pointer dereference on a fast path.
> 
> My choice would have been a function that returns the offset from the
> thread pointer (which has to be unchanged regarding all threads).

So AFAIU we would have glibc expose a symbol, e.g.:

off_t rseq_tls_offset(void);

Which would be typically called by user libraries and applications at initialization
to get the offset of the struct rseq. They should store it in a static variable so
rseq critical sections can use that offset.

Is there an arch-agnostic way to get the thread pointer from user-space code ? That
would be needed by all rseq critical section implementations.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14  3:03 ` [RFC PATCH 2/4] rseq: Allow extending struct rseq Mathieu Desnoyers
  2020-07-14  9:58   ` Florian Weimer
@ 2020-07-14 17:24   ` Peter Oskolkov
  2020-07-14 17:43     ` Mathieu Desnoyers
  2020-07-15 11:38   ` Christian Brauner
  2 siblings, 1 reply; 37+ messages in thread
From: Peter Oskolkov @ 2020-07-14 17:24 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, Linux Kernel Mailing List, Thomas Gleixner,
	Paul E . McKenney, Boqun Feng, H . Peter Anvin, Paul Turner,
	linux-api, Christian Brauner, Florian Weimer, carlos,
	Peter Oskolkov

At Google, we actually extended struct rseq (I will post the patches
here once they are fully deployed and we have specific
benefits/improvements to report). We did this by adding several fields
below __u32 flags (the last field currently), and correspondingly
increasing rseq_len in rseq() syscall. If the kernel does not know of
this extension, it will return -EINVAL due to an unexpected rseq_len;
then the application can either fall-back to the standard/upstream
rseq, or bail. If the kernel does know of this extension, it accepts
it. If the application passes the old rseq_len (32), the kernel knows
that this is an old application and treats it as such.

I looked through the archives, but I did not find specifically why the
pretty standard approach described above is considered inferior to the
one taken in this patch (freeze rseq_len at 32, add additional length
fields to struct rseq). Can these be summarized?

Thanks,
Peter

On Mon, Jul 13, 2020 at 8:04 PM Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
>
> Add a __rseq_abi.flags "RSEQ_TLS_FLAG_SIZE", which indicates support for
> extending struct rseq. This adds two new fields to struct rseq:
> user_size and kernel_size.
>
> The user_size field allows the size of the __rseq_abi definition (which
> can be overridden by symbol interposition either by a preloaded library
> or by the application) to be handed over to the kernel at registration.
> This registration can be performed by a library, e.g. glibc, which does
> not know there is interposition taking place.
>
> The kernel_size is populated by the kernel when the "RSEQ_TLS_FLAG_SIZE"
> flag is set in __rseq_abi.flags to the minimum between user_size and
> the offset of the "end" field of struct rseq as known by the kernel.
> This allows user-space to query which fields are effectively populated
> by the kernel.
>
> A rseq_size field is added to the task struct to keep track of the
> "kernel_size" effective for each thread.
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> ---
>  include/linux/sched.h     |  4 ++++
>  include/uapi/linux/rseq.h | 37 ++++++++++++++++++++++++++++++++--
>  kernel/rseq.c             | 42 +++++++++++++++++++++++++++++++++------
>  3 files changed, 75 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 692e327d7455..5d61a3197987 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1147,6 +1147,7 @@ struct task_struct {
>  #ifdef CONFIG_RSEQ
>         struct rseq __user *rseq;
>         u32 rseq_sig;
> +       u32 rseq_size;
>         /*
>          * RmW on rseq_event_mask must be performed atomically
>          * with respect to preemption.
> @@ -1976,10 +1977,12 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags)
>         if (clone_flags & CLONE_VM) {
>                 t->rseq = NULL;
>                 t->rseq_sig = 0;
> +               t->rseq_size = 0;
>                 t->rseq_event_mask = 0;
>         } else {
>                 t->rseq = current->rseq;
>                 t->rseq_sig = current->rseq_sig;
> +               t->rseq_size = current->rseq_size;
>                 t->rseq_event_mask = current->rseq_event_mask;
>         }
>  }
> @@ -1988,6 +1991,7 @@ static inline void rseq_execve(struct task_struct *t)
>  {
>         t->rseq = NULL;
>         t->rseq_sig = 0;
> +       t->rseq_size = 0;
>         t->rseq_event_mask = 0;
>  }
>
> diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
> index e11d9df5e564..03c0b5e9a859 100644
> --- a/include/uapi/linux/rseq.h
> +++ b/include/uapi/linux/rseq.h
> @@ -37,6 +37,15 @@ enum rseq_cs_flags {
>                 (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
>  };
>
> +enum rseq_tls_flags_bit {
> +       /* enum rseq_cs_flags reserves bits 0-2. */
> +       RSEQ_TLS_FLAG_SIZE_BIT = 3,
> +};
> +
> +enum rseq_tls_flags {
> +       RSEQ_TLS_FLAG_SIZE = (1U << RSEQ_TLS_FLAG_SIZE_BIT),
> +};
> +
>  /* The rseq_len expected by rseq registration is always 32 bytes. */
>  enum rseq_len_expected {
>         RSEQ_LEN_EXPECTED = 32,
> @@ -133,8 +142,9 @@ struct rseq {
>          *
>          * This field should only be updated by the thread which
>          * registered this data structure. Read by the kernel.
> -        * Mainly used for single-stepping through rseq critical sections
> -        * with debuggers.
> +        *
> +        * The RSEQ_CS flags are mainly used for single-stepping through rseq
> +        * critical sections with debuggers.
>          *
>          * - RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
>          *     Inhibit instruction sequence block restart on preemption
> @@ -145,8 +155,31 @@ struct rseq {
>          * - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
>          *     Inhibit instruction sequence block restart on migration for
>          *     this thread.
> +        *
> +        * - RSEQ_TLS_FLAG_SIZE
> +        *     Extensible struct rseq ABI. This flag should be statically
> +        *     initialized.
>          */
>         __u32 flags;
> +       /*
> +        * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, user_size should be
> +        * statically initialized to offsetof(struct rseq, end).
> +        */
> +       __u16 user_size;
> +       /*
> +        * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, if the kernel supports
> +        * extensible struct rseq ABI, the kernel_size field is populated by
> +        * the kernel to the minimum between user_size and the offset of the
> +        * "end" field within the struct rseq supported by the kernel on
> +        * successful registration. Should be initialized to 0.
> +        */
> +       __u16 kernel_size;
> +
> +       /*
> +        * Very last field of the structure, to calculate size excluding padding
> +        * with offsetof().
> +        */
> +       char end[];
>  } __attribute__((aligned(4 * sizeof(__u64))));
>
>  #endif /* _UAPI_LINUX_RSEQ_H */
> diff --git a/kernel/rseq.c b/kernel/rseq.c
> index a4f86a9d6937..bbc57fc18573 100644
> --- a/kernel/rseq.c
> +++ b/kernel/rseq.c
> @@ -96,6 +96,7 @@ static int rseq_update_cpu_id(struct task_struct *t)
>  static int rseq_reset_rseq_cpu_id(struct task_struct *t)
>  {
>         u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED;
> +       u16 kernel_size = 0;
>
>         /*
>          * Reset cpu_id_start to its initial state (0).
> @@ -109,6 +110,11 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
>          */
>         if (put_user(cpu_id, &t->rseq->cpu_id))
>                 return -EFAULT;
> +       /*
> +        * Reset kernel_size to its initial state (0).
> +        */
> +       if (put_user(kernel_size, &t->rseq->kernel_size))
> +               return -EFAULT;
>         return 0;
>  }
>
> @@ -266,7 +272,7 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
>
>         if (unlikely(t->flags & PF_EXITING))
>                 return;
> -       if (unlikely(!access_ok(t->rseq, sizeof(*t->rseq))))
> +       if (unlikely(!access_ok(t->rseq, t->rseq_size)))
>                 goto error;
>         ret = rseq_ip_fixup(regs);
>         if (unlikely(ret < 0))
> @@ -294,7 +300,7 @@ void rseq_syscall(struct pt_regs *regs)
>
>         if (!t->rseq)
>                 return;
> -       if (!access_ok(t->rseq, sizeof(*t->rseq)) ||
> +       if (!access_ok(t->rseq, t->rseq_size) ||
>             rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
>                 force_sig(SIGSEGV);
>  }
> @@ -308,6 +314,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>                 int, flags, u32, sig)
>  {
>         int ret;
> +       u32 tls_flags;
>
>         if (flags & RSEQ_FLAG_UNREGISTER) {
>                 if (flags & ~RSEQ_FLAG_UNREGISTER)
> @@ -315,7 +322,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>                 /* Unregister rseq for current thread. */
>                 if (current->rseq != rseq || !current->rseq)
>                         return -EINVAL;
> -               if (rseq_len != sizeof(*rseq))
> +               if (rseq_len != RSEQ_LEN_EXPECTED)
>                         return -EINVAL;
>                 if (current->rseq_sig != sig)
>                         return -EPERM;
> @@ -323,6 +330,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>                 if (ret)
>                         return ret;
>                 current->rseq = NULL;
> +               current->rseq_size = 0;
>                 current->rseq_sig = 0;
>                 return 0;
>         }
> @@ -336,7 +344,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>                  * the provided address differs from the prior
>                  * one.
>                  */
> -               if (current->rseq != rseq || rseq_len != sizeof(*rseq))
> +               if (current->rseq != rseq || rseq_len != RSEQ_LEN_EXPECTED)
>                         return -EINVAL;
>                 if (current->rseq_sig != sig)
>                         return -EPERM;
> @@ -349,10 +357,32 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>          * ensure the provided rseq is properly aligned and valid.
>          */
>         if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
> -           rseq_len != sizeof(*rseq))
> +           rseq_len != RSEQ_LEN_EXPECTED)
>                 return -EINVAL;
> -       if (!access_ok(rseq, rseq_len))
> +       if (!access_ok(rseq, RSEQ_LEN_EXPECTED))
>                 return -EFAULT;
> +
> +       /* Handle extensible struct rseq ABI. */
> +       ret = get_user(tls_flags, &rseq->flags);
> +       if (ret)
> +               return ret;
> +       if (tls_flags & RSEQ_TLS_FLAG_SIZE) {
> +               u16 user_size, kernel_size;
> +
> +               ret = get_user(user_size, &rseq->user_size);
> +               if (ret)
> +                       return ret;
> +               if (user_size < offsetof(struct rseq, kernel_size) + sizeof(u16))
> +                       return -EINVAL;
> +               kernel_size = min_t(u16, user_size, offsetof(struct rseq, end));
> +               ret = put_user(kernel_size, &rseq->kernel_size);
> +               if (ret)
> +                       return ret;
> +               current->rseq_size = kernel_size;
> +       } else {
> +               current->rseq_size = offsetof(struct rseq, user_size);
> +       }
> +
>         current->rseq = rseq;
>         current->rseq_sig = sig;
>         /*
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14 17:24   ` Peter Oskolkov
@ 2020-07-14 17:43     ` Mathieu Desnoyers
  2020-07-14 18:33       ` Peter Oskolkov
  0 siblings, 1 reply; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-14 17:43 UTC (permalink / raw)
  To: Peter Oskolkov
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, Florian Weimer, carlos, Peter Oskolkov

----- On Jul 14, 2020, at 1:24 PM, Peter Oskolkov posk@posk.io wrote:

> At Google, we actually extended struct rseq (I will post the patches
> here once they are fully deployed and we have specific
> benefits/improvements to report). We did this by adding several fields
> below __u32 flags (the last field currently), and correspondingly
> increasing rseq_len in rseq() syscall. If the kernel does not know of
> this extension, it will return -EINVAL due to an unexpected rseq_len;
> then the application can either fall-back to the standard/upstream
> rseq, or bail. If the kernel does know of this extension, it accepts
> it. If the application passes the old rseq_len (32), the kernel knows
> that this is an old application and treats it as such.
> 
> I looked through the archives, but I did not find specifically why the
> pretty standard approach described above is considered inferior to the
> one taken in this patch (freeze rseq_len at 32, add additional length
> fields to struct rseq). Can these be summarized?

I think you don't face the issues I'm facing with libc rseq integration
because you control the entire user-space software ecosystem at Google.

The main issue we face is that the library responsible for registering
rseq (either glibc 2.32+, an early-adopter librseq library, or the
application) may very well not be the same library defining the __rseq_abi
symbol used in the global symbol table. Interposition with ld preload or
by defining the __rseq_abi in the program's executable are good examples
of this kind of scenario, and those use-cases are supported.

So the size of the __rseq_abi structure may be larger than the struct
rseq known by glibc (and eventually smaller, if future glibc versions
extend their __rseq_abi size but is loaded with an older program/library
doing __rseq_abi interposition).

So we need some way to allow code defining the __rseq_abi to let the kernel
know how much room is available, without necessarily requiring the code
responsible for rseq registration to be aware of that extended layout.
This is the purpose of the __rseq_abi.flags RSEQ_FLAG_TLS_SIZE and field
__rseq_abi.user_size.

And we need some way to allow the kernel to let user-space rseq critical
sections (user code) know how much of those fields are actually populated
by the kernel. This is the purpose of __rseq_abi.flags RSEQ_FLAG_TLS_SIZE
with __rseq_abi.kernel_size.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14 17:43     ` Mathieu Desnoyers
@ 2020-07-14 18:33       ` Peter Oskolkov
  2020-07-15  2:34         ` Chris Kennelly
  0 siblings, 1 reply; 37+ messages in thread
From: Peter Oskolkov @ 2020-07-14 18:33 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Oskolkov, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	paulmck, Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, Florian Weimer, carlos, Chris Kennelly

On Tue, Jul 14, 2020 at 10:43 AM Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
>
> ----- On Jul 14, 2020, at 1:24 PM, Peter Oskolkov posk@posk.io wrote:
>
> > At Google, we actually extended struct rseq (I will post the patches
> > here once they are fully deployed and we have specific
> > benefits/improvements to report). We did this by adding several fields
> > below __u32 flags (the last field currently), and correspondingly
> > increasing rseq_len in rseq() syscall. If the kernel does not know of
> > this extension, it will return -EINVAL due to an unexpected rseq_len;
> > then the application can either fall-back to the standard/upstream
> > rseq, or bail. If the kernel does know of this extension, it accepts
> > it. If the application passes the old rseq_len (32), the kernel knows
> > that this is an old application and treats it as such.
> >
> > I looked through the archives, but I did not find specifically why the
> > pretty standard approach described above is considered inferior to the
> > one taken in this patch (freeze rseq_len at 32, add additional length
> > fields to struct rseq). Can these be summarized?
>
> I think you don't face the issues I'm facing with libc rseq integration
> because you control the entire user-space software ecosystem at Google.
>
> The main issue we face is that the library responsible for registering
> rseq (either glibc 2.32+, an early-adopter librseq library, or the
> application) may very well not be the same library defining the __rseq_abi
> symbol used in the global symbol table. Interposition with ld preload or
> by defining the __rseq_abi in the program's executable are good examples
> of this kind of scenario, and those use-cases are supported.
>
> So the size of the __rseq_abi structure may be larger than the struct
> rseq known by glibc (and eventually smaller, if future glibc versions
> extend their __rseq_abi size but is loaded with an older program/library
> doing __rseq_abi interposition).
>
> So we need some way to allow code defining the __rseq_abi to let the kernel
> know how much room is available, without necessarily requiring the code
> responsible for rseq registration to be aware of that extended layout.
> This is the purpose of the __rseq_abi.flags RSEQ_FLAG_TLS_SIZE and field
> __rseq_abi.user_size.
>
> And we need some way to allow the kernel to let user-space rseq critical
> sections (user code) know how much of those fields are actually populated
> by the kernel. This is the purpose of __rseq_abi.flags RSEQ_FLAG_TLS_SIZE
> with __rseq_abi.kernel_size.

Thanks, Mathieu, for the explanation. Yes, multiple unrelated
libraries having to share struct rseq complicates matters. Your
approach appears to be a way to reconcile the issues you outlined
above.

Thanks,
Peter

>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 0/4] rseq: Introduce extensible struct rseq
  2020-07-14  3:03 [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Mathieu Desnoyers
                   ` (3 preceding siblings ...)
  2020-07-14  3:03 ` [RFC PATCH 4/4] selftests: rseq: print rseq extensible size in basic test Mathieu Desnoyers
@ 2020-07-14 20:55 ` Carlos O'Donell
  2020-07-15 13:02   ` Mathieu Desnoyers
  2020-07-15 15:12   ` Florian Weimer
  4 siblings, 2 replies; 37+ messages in thread
From: Carlos O'Donell @ 2020-07-14 20:55 UTC (permalink / raw)
  To: Mathieu Desnoyers, Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Paul E . McKenney, Boqun Feng,
	H . Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	Florian Weimer

On 7/13/20 11:03 PM, Mathieu Desnoyers wrote:
> Recent discussion led to a solution for extending struct rseq. This is
> an implementation of the proposed solution.
> 
> Now is a good time to agree on this scheme before the release of glibc
> 2.32, just in case there are small details to fix on the user-space
> side in order to allow extending struct rseq.

Adding extensibility to the rseq registration process would be great,
but we are out of time for the glibc 2.32 release.

Should we revert rseq for glibc 2.32 and spend quality time discussing
the implications of an extensible design, something that Google already
says they are doing?

We can, with a clear head, and an agreed upon extension mechanism
include rseq in glibc 2.33 (release scheduled for Feburary 1st 2021).
We release time boxed every 6 months, no deviation, so you know when
your next merge window will be.

We have already done the hard work of fixing the nesting signal
handler issues, and glibc integration. If we revert today that will 
also give time for Firefox and Chrome to adjust their sandboxes.

Do you wish to go forward with rseq as we have it in glibc 2.32,
or do you wish to revert rseq from glibc 2.32, discuss the extension
mechanism, and put it back into glibc 2.33 with adjustments?

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14 13:19         ` Mathieu Desnoyers
@ 2020-07-14 21:30           ` Carlos O'Donell
  2020-07-15 13:12             ` Mathieu Desnoyers
  0 siblings, 1 reply; 37+ messages in thread
From: Carlos O'Donell @ 2020-07-14 21:30 UTC (permalink / raw)
  To: Mathieu Desnoyers, Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner

On 7/14/20 9:19 AM, Mathieu Desnoyers wrote:
> Is there an arch-agnostic way to get the thread pointer from user-space code ? That
> would be needed by all rseq critical section implementations.

Yes, and no. We have void *__builtin_thread_pointer (void), but
few architectures implement the builtin so we'd have to go through
a round of compiler updates and backports. All targets know how to
access the thread pointer because the compiler has to generate
IE-mode accesses to the TLS variables.

I have filed an enhancement request:
Bug 96200 - Implement __builtin_thread_pointer() and 
            __builtin_set_thread_pointer() if TLS is supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96200

We have glibc internal macro APIs to access the thread pointer,
but I would rather the compiler handle the access since it can
schedule the resulting sequence better.

On some arches setting the therad pointer needs a syscall or
equivalent operation (hppa), and for some arches there is no
fixed register (arm) hence the need for __builtin_thread_pointer()
to force the compiler to place the pointer into a register for
function return.

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14 18:33       ` Peter Oskolkov
@ 2020-07-15  2:34         ` Chris Kennelly
  2020-07-15  6:31           ` Florian Weimer
  2020-07-15 14:50           ` Mathieu Desnoyers
  0 siblings, 2 replies; 37+ messages in thread
From: Chris Kennelly @ 2020-07-15  2:34 UTC (permalink / raw)
  To: Peter Oskolkov
  Cc: Mathieu Desnoyers, Peter Oskolkov, Peter Zijlstra, linux-kernel,
	Thomas Gleixner, paulmck, Boqun Feng, H. Peter Anvin,
	Paul Turner, linux-api, Christian Brauner, Florian Weimer,
	carlos

On Tue, Jul 14, 2020 at 2:33 PM Peter Oskolkov <posk@google.com> wrote:
>
> On Tue, Jul 14, 2020 at 10:43 AM Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> >
> > ----- On Jul 14, 2020, at 1:24 PM, Peter Oskolkov posk@posk.io wrote:
> >
> > > At Google, we actually extended struct rseq (I will post the patches
> > > here once they are fully deployed and we have specific
> > > benefits/improvements to report). We did this by adding several fields
> > > below __u32 flags (the last field currently), and correspondingly
> > > increasing rseq_len in rseq() syscall. If the kernel does not know of
> > > this extension, it will return -EINVAL due to an unexpected rseq_len;
> > > then the application can either fall-back to the standard/upstream
> > > rseq, or bail. If the kernel does know of this extension, it accepts
> > > it. If the application passes the old rseq_len (32), the kernel knows
> > > that this is an old application and treats it as such.
> > >
> > > I looked through the archives, but I did not find specifically why the
> > > pretty standard approach described above is considered inferior to the
> > > one taken in this patch (freeze rseq_len at 32, add additional length
> > > fields to struct rseq). Can these be summarized?
> >
> > I think you don't face the issues I'm facing with libc rseq integration
> > because you control the entire user-space software ecosystem at Google.
> >
> > The main issue we face is that the library responsible for registering
> > rseq (either glibc 2.32+, an early-adopter librseq library, or the
> > application) may very well not be the same library defining the __rseq_abi
> > symbol used in the global symbol table. Interposition with ld preload or
> > by defining the __rseq_abi in the program's executable are good examples
> > of this kind of scenario, and those use-cases are supported.

Does this work if/when we run out of bytes in the current sizeof(__rseq_abi)?

Which library provides the TLS symbol (and N bytes of storage) seems
sensitive to the choices the linker makes for us, once the symbol
sizes diverge.

> > So the size of the __rseq_abi structure may be larger than the struct
> > rseq known by glibc (and eventually smaller, if future glibc versions
> > extend their __rseq_abi size but is loaded with an older program/library
> > doing __rseq_abi interposition).

When glibc provides registration, is the anticipated use case that a
library would unregister and reregister each thread to "upgrade" it to
the most modern version of interface it knows about provided by the
kernel?

> > So we need some way to allow code defining the __rseq_abi to let the kernel
> > know how much room is available, without necessarily requiring the code
> > responsible for rseq registration to be aware of that extended layout.
> > This is the purpose of the __rseq_abi.flags RSEQ_FLAG_TLS_SIZE and field
> > __rseq_abi.user_size.
> >
> > And we need some way to allow the kernel to let user-space rseq critical
> > sections (user code) know how much of those fields are actually populated
> > by the kernel. This is the purpose of __rseq_abi.flags RSEQ_FLAG_TLS_SIZE
> > with __rseq_abi.kernel_size.

I authored the userspace component
(https://github.com/google/tcmalloc/commit/ad136d45f75a273b934446699cef8b278c34ec6e)
that consumes the extensions Peter mentions and found that minimizing
the performance impact of their potential absence was a bit of a
challenge.

There, I could assume an all-or-nothing registration of the new
feature--limited only by kernel availability for thread
homogeneity--but inconsistencies across early adopter libraries would
mean each thread would have to examine its own TLS to determine if a
feature were available.

Chris

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15  2:34         ` Chris Kennelly
@ 2020-07-15  6:31           ` Florian Weimer
  2020-07-15 10:59             ` Christian Brauner
  2020-07-15 14:38             ` Mathieu Desnoyers
  2020-07-15 14:50           ` Mathieu Desnoyers
  1 sibling, 2 replies; 37+ messages in thread
From: Florian Weimer @ 2020-07-15  6:31 UTC (permalink / raw)
  To: Chris Kennelly
  Cc: Peter Oskolkov, Mathieu Desnoyers, Peter Oskolkov,
	Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, carlos

* Chris Kennelly:

> When glibc provides registration, is the anticipated use case that a
> library would unregister and reregister each thread to "upgrade" it to
> the most modern version of interface it knows about provided by the
> kernel?

Absolutely not, that is likely to break other consumers because an
expected rseq area becomes dormant instead.

> There, I could assume an all-or-nothing registration of the new
> feature--limited only by kernel availability for thread
> homogeneity--but inconsistencies across early adopter libraries would
> mean each thread would have to examine its own TLS to determine if a
> feature were available.

Exactly.  Certain uses of seccomp can also have this effect,
presenting a non-homogeneous view.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15  6:31           ` Florian Weimer
@ 2020-07-15 10:59             ` Christian Brauner
  2020-07-15 14:38             ` Mathieu Desnoyers
  1 sibling, 0 replies; 37+ messages in thread
From: Christian Brauner @ 2020-07-15 10:59 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Chris Kennelly, Peter Oskolkov, Mathieu Desnoyers,
	Peter Oskolkov, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	paulmck, Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	carlos

On Wed, Jul 15, 2020 at 08:31:05AM +0200, Florian Weimer wrote:
> * Chris Kennelly:
> 
> > When glibc provides registration, is the anticipated use case that a
> > library would unregister and reregister each thread to "upgrade" it to
> > the most modern version of interface it knows about provided by the
> > kernel?
> 
> Absolutely not, that is likely to break other consumers because an
> expected rseq area becomes dormant instead.
> 
> > There, I could assume an all-or-nothing registration of the new
> > feature--limited only by kernel availability for thread
> > homogeneity--but inconsistencies across early adopter libraries would
> > mean each thread would have to examine its own TLS to determine if a
> > feature were available.

Fwiw, I pointed this out in the discussions that led up to this
patchset. I don't see how this can work if threads don't check for their
feature set.

> 
> Exactly.  Certain uses of seccomp can also have this effect,
> presenting a non-homogeneous view.

Good point. There might be threads with a seccomp filter that would
block rseq features is what you mean, I assume.

Christian

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14  3:03 ` [RFC PATCH 2/4] rseq: Allow extending struct rseq Mathieu Desnoyers
  2020-07-14  9:58   ` Florian Weimer
  2020-07-14 17:24   ` Peter Oskolkov
@ 2020-07-15 11:38   ` Christian Brauner
  2020-07-15 12:33     ` Christian Brauner
  2 siblings, 1 reply; 37+ messages in thread
From: Christian Brauner @ 2020-07-15 11:38 UTC (permalink / raw)
  To: Mathieu Desnoyers, Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Paul E . McKenney,
	Boqun Feng, H . Peter Anvin, Paul Turner, linux-api, carlos

On Mon, Jul 13, 2020 at 11:03:46PM -0400, Mathieu Desnoyers wrote:
> Add a __rseq_abi.flags "RSEQ_TLS_FLAG_SIZE", which indicates support for
> extending struct rseq. This adds two new fields to struct rseq:
> user_size and kernel_size.
> 
> The user_size field allows the size of the __rseq_abi definition (which
> can be overridden by symbol interposition either by a preloaded library
> or by the application) to be handed over to the kernel at registration.
> This registration can be performed by a library, e.g. glibc, which does
> not know there is interposition taking place.
> 
> The kernel_size is populated by the kernel when the "RSEQ_TLS_FLAG_SIZE"
> flag is set in __rseq_abi.flags to the minimum between user_size and
> the offset of the "end" field of struct rseq as known by the kernel.
> This allows user-space to query which fields are effectively populated
> by the kernel.
> 
> A rseq_size field is added to the task struct to keep track of the
> "kernel_size" effective for each thread.
> 
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> ---
>  include/linux/sched.h     |  4 ++++
>  include/uapi/linux/rseq.h | 37 ++++++++++++++++++++++++++++++++--
>  kernel/rseq.c             | 42 +++++++++++++++++++++++++++++++++------
>  3 files changed, 75 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 692e327d7455..5d61a3197987 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1147,6 +1147,7 @@ struct task_struct {
>  #ifdef CONFIG_RSEQ
>  	struct rseq __user *rseq;
>  	u32 rseq_sig;
> +	u32 rseq_size;
>  	/*
>  	 * RmW on rseq_event_mask must be performed atomically
>  	 * with respect to preemption.
> @@ -1976,10 +1977,12 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags)
>  	if (clone_flags & CLONE_VM) {
>  		t->rseq = NULL;
>  		t->rseq_sig = 0;
> +		t->rseq_size = 0;
>  		t->rseq_event_mask = 0;
>  	} else {
>  		t->rseq = current->rseq;
>  		t->rseq_sig = current->rseq_sig;
> +		t->rseq_size = current->rseq_size;
>  		t->rseq_event_mask = current->rseq_event_mask;
>  	}
>  }
> @@ -1988,6 +1991,7 @@ static inline void rseq_execve(struct task_struct *t)
>  {
>  	t->rseq = NULL;
>  	t->rseq_sig = 0;
> +	t->rseq_size = 0;
>  	t->rseq_event_mask = 0;
>  }
>  
> diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
> index e11d9df5e564..03c0b5e9a859 100644
> --- a/include/uapi/linux/rseq.h
> +++ b/include/uapi/linux/rseq.h
> @@ -37,6 +37,15 @@ enum rseq_cs_flags {
>  		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
>  };
>  
> +enum rseq_tls_flags_bit {
> +	/* enum rseq_cs_flags reserves bits 0-2. */
> +	RSEQ_TLS_FLAG_SIZE_BIT = 3,
> +};
> +
> +enum rseq_tls_flags {
> +	RSEQ_TLS_FLAG_SIZE = (1U << RSEQ_TLS_FLAG_SIZE_BIT),
> +};
> +
>  /* The rseq_len expected by rseq registration is always 32 bytes. */
>  enum rseq_len_expected {
>  	RSEQ_LEN_EXPECTED = 32,
> @@ -133,8 +142,9 @@ struct rseq {
>  	 *
>  	 * This field should only be updated by the thread which
>  	 * registered this data structure. Read by the kernel.
> -	 * Mainly used for single-stepping through rseq critical sections
> -	 * with debuggers.
> +	 *
> +	 * The RSEQ_CS flags are mainly used for single-stepping through rseq
> +	 * critical sections with debuggers.
>  	 *
>  	 * - RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
>  	 *     Inhibit instruction sequence block restart on preemption
> @@ -145,8 +155,31 @@ struct rseq {
>  	 * - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
>  	 *     Inhibit instruction sequence block restart on migration for
>  	 *     this thread.
> +	 *
> +	 * - RSEQ_TLS_FLAG_SIZE
> +	 *     Extensible struct rseq ABI. This flag should be statically
> +	 *     initialized.
>  	 */
>  	__u32 flags;
> +	/*
> +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, user_size should be
> +	 * statically initialized to offsetof(struct rseq, end).
> +	 */
> +	__u16 user_size;
> +	/*
> +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, if the kernel supports
> +	 * extensible struct rseq ABI, the kernel_size field is populated by
> +	 * the kernel to the minimum between user_size and the offset of the
> +	 * "end" field within the struct rseq supported by the kernel on
> +	 * successful registration. Should be initialized to 0.
> +	 */
> +	__u16 kernel_size;

(Btw, this what I suggested - minus the user_size part - when I said
"expose the size of struct rseq" the kernel knows about. The approach
here is of course more general.)

It's pretty uncommon to use __u16 for sizes at least in public facing
structs. I'd suggest to use __u32 user_size and __u32 kernel_size and if
needed, insert padding. Seems you have done this in your union above
already.

> +
> +	/*
> +	 * Very last field of the structure, to calculate size excluding padding
> +	 * with offsetof().
> +	 */
> +	char end[];

Hm, could this mess with alignment or break making the struct
extensible? Feels like you're adding new members always before this
which is also pretty non-standard in terms of how we'd usually extend
structs.

>  } __attribute__((aligned(4 * sizeof(__u64))));
>  
>  #endif /* _UAPI_LINUX_RSEQ_H */
> diff --git a/kernel/rseq.c b/kernel/rseq.c
> index a4f86a9d6937..bbc57fc18573 100644
> --- a/kernel/rseq.c
> +++ b/kernel/rseq.c
> @@ -96,6 +96,7 @@ static int rseq_update_cpu_id(struct task_struct *t)
>  static int rseq_reset_rseq_cpu_id(struct task_struct *t)
>  {
>  	u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED;
> +	u16 kernel_size = 0;
>  
>  	/*
>  	 * Reset cpu_id_start to its initial state (0).
> @@ -109,6 +110,11 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
>  	 */
>  	if (put_user(cpu_id, &t->rseq->cpu_id))
>  		return -EFAULT;
> +	/*
> +	 * Reset kernel_size to its initial state (0).
> +	 */
> +	if (put_user(kernel_size, &t->rseq->kernel_size))
> +		return -EFAULT;
>  	return 0;
>  }
>  
> @@ -266,7 +272,7 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
>  
>  	if (unlikely(t->flags & PF_EXITING))
>  		return;
> -	if (unlikely(!access_ok(t->rseq, sizeof(*t->rseq))))
> +	if (unlikely(!access_ok(t->rseq, t->rseq_size)))
>  		goto error;
>  	ret = rseq_ip_fixup(regs);
>  	if (unlikely(ret < 0))
> @@ -294,7 +300,7 @@ void rseq_syscall(struct pt_regs *regs)
>  
>  	if (!t->rseq)
>  		return;
> -	if (!access_ok(t->rseq, sizeof(*t->rseq)) ||
> +	if (!access_ok(t->rseq, t->rseq_size) ||
>  	    rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
>  		force_sig(SIGSEGV);
>  }
> @@ -308,6 +314,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>  		int, flags, u32, sig)
>  {
>  	int ret;
> +	u32 tls_flags;
>  
>  	if (flags & RSEQ_FLAG_UNREGISTER) {
>  		if (flags & ~RSEQ_FLAG_UNREGISTER)
> @@ -315,7 +322,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>  		/* Unregister rseq for current thread. */
>  		if (current->rseq != rseq || !current->rseq)
>  			return -EINVAL;
> -		if (rseq_len != sizeof(*rseq))
> +		if (rseq_len != RSEQ_LEN_EXPECTED)

So I have to say that I think it's not a great to fix the length of the
rseq_len argument basically making it somewhat a nop. If I recall
correctly Florian said something about the rseq_len becoming part of the
glibc abi and that's why it can't be changed?
Is there any way we can avoid that so we can use the rseq_len argument
to have userspace pass down the size of struct rseq they know about?

It's really unintuitive to pass down an extensible struct but the length
argument associated with it is fixed.

I also think there should be some compile-time sanity checks here
similar to what we do in other places see e.g.

	BUILD_BUG_ON(sizeof(struct clone_args) != CLONE_ARGS_SIZE_VER2);

So here should at least be sm like:

	BUILD_BUG_ON(sizeof(struct rseq) != RSEQ_LEN_EXPECTED);



>  			return -EINVAL;
>  		if (current->rseq_sig != sig)
>  			return -EPERM;
> @@ -323,6 +330,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>  		if (ret)
>  			return ret;
>  		current->rseq = NULL;
> +		current->rseq_size = 0;
>  		current->rseq_sig = 0;
>  		return 0;
>  	}
> @@ -336,7 +344,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>  		 * the provided address differs from the prior
>  		 * one.
>  		 */
> -		if (current->rseq != rseq || rseq_len != sizeof(*rseq))
> +		if (current->rseq != rseq || rseq_len != RSEQ_LEN_EXPECTED)
>  			return -EINVAL;
>  		if (current->rseq_sig != sig)
>  			return -EPERM;
> @@ -349,10 +357,32 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
>  	 * ensure the provided rseq is properly aligned and valid.
>  	 */
>  	if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
> -	    rseq_len != sizeof(*rseq))
> +	    rseq_len != RSEQ_LEN_EXPECTED)
>  		return -EINVAL;
> -	if (!access_ok(rseq, rseq_len))
> +	if (!access_ok(rseq, RSEQ_LEN_EXPECTED))
>  		return -EFAULT;
> +
> +	/* Handle extensible struct rseq ABI. */
> +	ret = get_user(tls_flags, &rseq->flags);
> +	if (ret)
> +		return ret;
> +	if (tls_flags & RSEQ_TLS_FLAG_SIZE) {
> +		u16 user_size, kernel_size;
> +
> +		ret = get_user(user_size, &rseq->user_size);
> +		if (ret)
> +			return ret;
> +		if (user_size < offsetof(struct rseq, kernel_size) + sizeof(u16))
> +			return -EINVAL;
> +		kernel_size = min_t(u16, user_size, offsetof(struct rseq, end));
> +		ret = put_user(kernel_size, &rseq->kernel_size);
> +		if (ret)
> +			return ret;
> +		current->rseq_size = kernel_size;
> +	} else {
> +		current->rseq_size = offsetof(struct rseq, user_size);
> +	}
> +
>  	current->rseq = rseq;
>  	current->rseq_sig = sig;
>  	/*
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 11:38   ` Christian Brauner
@ 2020-07-15 12:33     ` Christian Brauner
  2020-07-15 15:10       ` Mathieu Desnoyers
  0 siblings, 1 reply; 37+ messages in thread
From: Christian Brauner @ 2020-07-15 12:33 UTC (permalink / raw)
  To: Mathieu Desnoyers, Florian Weimer
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Paul E . McKenney,
	Boqun Feng, H . Peter Anvin, Paul Turner, linux-api, carlos

On Wed, Jul 15, 2020 at 01:38:51PM +0200, Christian Brauner wrote:
> On Mon, Jul 13, 2020 at 11:03:46PM -0400, Mathieu Desnoyers wrote:
> > Add a __rseq_abi.flags "RSEQ_TLS_FLAG_SIZE", which indicates support for
> > extending struct rseq. This adds two new fields to struct rseq:
> > user_size and kernel_size.
> > 
> > The user_size field allows the size of the __rseq_abi definition (which
> > can be overridden by symbol interposition either by a preloaded library
> > or by the application) to be handed over to the kernel at registration.
> > This registration can be performed by a library, e.g. glibc, which does
> > not know there is interposition taking place.
> > 
> > The kernel_size is populated by the kernel when the "RSEQ_TLS_FLAG_SIZE"
> > flag is set in __rseq_abi.flags to the minimum between user_size and
> > the offset of the "end" field of struct rseq as known by the kernel.
> > This allows user-space to query which fields are effectively populated
> > by the kernel.
> > 
> > A rseq_size field is added to the task struct to keep track of the
> > "kernel_size" effective for each thread.
> > 
> > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > ---
> >  include/linux/sched.h     |  4 ++++
> >  include/uapi/linux/rseq.h | 37 ++++++++++++++++++++++++++++++++--
> >  kernel/rseq.c             | 42 +++++++++++++++++++++++++++++++++------
> >  3 files changed, 75 insertions(+), 8 deletions(-)
> > 
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 692e327d7455..5d61a3197987 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1147,6 +1147,7 @@ struct task_struct {
> >  #ifdef CONFIG_RSEQ
> >  	struct rseq __user *rseq;
> >  	u32 rseq_sig;
> > +	u32 rseq_size;
> >  	/*
> >  	 * RmW on rseq_event_mask must be performed atomically
> >  	 * with respect to preemption.
> > @@ -1976,10 +1977,12 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags)
> >  	if (clone_flags & CLONE_VM) {
> >  		t->rseq = NULL;
> >  		t->rseq_sig = 0;
> > +		t->rseq_size = 0;
> >  		t->rseq_event_mask = 0;
> >  	} else {
> >  		t->rseq = current->rseq;
> >  		t->rseq_sig = current->rseq_sig;
> > +		t->rseq_size = current->rseq_size;
> >  		t->rseq_event_mask = current->rseq_event_mask;
> >  	}
> >  }
> > @@ -1988,6 +1991,7 @@ static inline void rseq_execve(struct task_struct *t)
> >  {
> >  	t->rseq = NULL;
> >  	t->rseq_sig = 0;
> > +	t->rseq_size = 0;
> >  	t->rseq_event_mask = 0;
> >  }
> >  
> > diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
> > index e11d9df5e564..03c0b5e9a859 100644
> > --- a/include/uapi/linux/rseq.h
> > +++ b/include/uapi/linux/rseq.h
> > @@ -37,6 +37,15 @@ enum rseq_cs_flags {
> >  		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
> >  };
> >  
> > +enum rseq_tls_flags_bit {
> > +	/* enum rseq_cs_flags reserves bits 0-2. */
> > +	RSEQ_TLS_FLAG_SIZE_BIT = 3,
> > +};
> > +
> > +enum rseq_tls_flags {
> > +	RSEQ_TLS_FLAG_SIZE = (1U << RSEQ_TLS_FLAG_SIZE_BIT),
> > +};
> > +
> >  /* The rseq_len expected by rseq registration is always 32 bytes. */
> >  enum rseq_len_expected {
> >  	RSEQ_LEN_EXPECTED = 32,
> > @@ -133,8 +142,9 @@ struct rseq {
> >  	 *
> >  	 * This field should only be updated by the thread which
> >  	 * registered this data structure. Read by the kernel.
> > -	 * Mainly used for single-stepping through rseq critical sections
> > -	 * with debuggers.
> > +	 *
> > +	 * The RSEQ_CS flags are mainly used for single-stepping through rseq
> > +	 * critical sections with debuggers.
> >  	 *
> >  	 * - RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
> >  	 *     Inhibit instruction sequence block restart on preemption
> > @@ -145,8 +155,31 @@ struct rseq {
> >  	 * - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
> >  	 *     Inhibit instruction sequence block restart on migration for
> >  	 *     this thread.
> > +	 *
> > +	 * - RSEQ_TLS_FLAG_SIZE
> > +	 *     Extensible struct rseq ABI. This flag should be statically
> > +	 *     initialized.
> >  	 */
> >  	__u32 flags;
> > +	/*
> > +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, user_size should be
> > +	 * statically initialized to offsetof(struct rseq, end).
> > +	 */
> > +	__u16 user_size;
> > +	/*
> > +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, if the kernel supports
> > +	 * extensible struct rseq ABI, the kernel_size field is populated by
> > +	 * the kernel to the minimum between user_size and the offset of the
> > +	 * "end" field within the struct rseq supported by the kernel on
> > +	 * successful registration. Should be initialized to 0.
> > +	 */
> > +	__u16 kernel_size;
> 
> (Btw, this what I suggested - minus the user_size part - when I said
> "expose the size of struct rseq" the kernel knows about. The approach
> here is of course more general.)
> 
> It's pretty uncommon to use __u16 for sizes at least in public facing
> structs. I'd suggest to use __u32 user_size and __u32 kernel_size and if
> needed, insert padding. Seems you have done this in your union above
> already.
> 
> > +
> > +	/*
> > +	 * Very last field of the structure, to calculate size excluding padding
> > +	 * with offsetof().
> > +	 */
> > +	char end[];
> 
> Hm, could this mess with alignment or break making the struct
> extensible? Feels like you're adding new members always before this
> which is also pretty non-standard in terms of how we'd usually extend
> structs.
> 
> >  } __attribute__((aligned(4 * sizeof(__u64))));
> >  
> >  #endif /* _UAPI_LINUX_RSEQ_H */
> > diff --git a/kernel/rseq.c b/kernel/rseq.c
> > index a4f86a9d6937..bbc57fc18573 100644
> > --- a/kernel/rseq.c
> > +++ b/kernel/rseq.c
> > @@ -96,6 +96,7 @@ static int rseq_update_cpu_id(struct task_struct *t)
> >  static int rseq_reset_rseq_cpu_id(struct task_struct *t)
> >  {
> >  	u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED;
> > +	u16 kernel_size = 0;
> >  
> >  	/*
> >  	 * Reset cpu_id_start to its initial state (0).
> > @@ -109,6 +110,11 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
> >  	 */
> >  	if (put_user(cpu_id, &t->rseq->cpu_id))
> >  		return -EFAULT;
> > +	/*
> > +	 * Reset kernel_size to its initial state (0).
> > +	 */
> > +	if (put_user(kernel_size, &t->rseq->kernel_size))
> > +		return -EFAULT;
> >  	return 0;
> >  }
> >  
> > @@ -266,7 +272,7 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
> >  
> >  	if (unlikely(t->flags & PF_EXITING))
> >  		return;
> > -	if (unlikely(!access_ok(t->rseq, sizeof(*t->rseq))))
> > +	if (unlikely(!access_ok(t->rseq, t->rseq_size)))
> >  		goto error;
> >  	ret = rseq_ip_fixup(regs);
> >  	if (unlikely(ret < 0))
> > @@ -294,7 +300,7 @@ void rseq_syscall(struct pt_regs *regs)
> >  
> >  	if (!t->rseq)
> >  		return;
> > -	if (!access_ok(t->rseq, sizeof(*t->rseq)) ||
> > +	if (!access_ok(t->rseq, t->rseq_size) ||
> >  	    rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
> >  		force_sig(SIGSEGV);
> >  }
> > @@ -308,6 +314,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
> >  		int, flags, u32, sig)
> >  {
> >  	int ret;
> > +	u32 tls_flags;
> >  
> >  	if (flags & RSEQ_FLAG_UNREGISTER) {
> >  		if (flags & ~RSEQ_FLAG_UNREGISTER)
> > @@ -315,7 +322,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
> >  		/* Unregister rseq for current thread. */
> >  		if (current->rseq != rseq || !current->rseq)
> >  			return -EINVAL;
> > -		if (rseq_len != sizeof(*rseq))
> > +		if (rseq_len != RSEQ_LEN_EXPECTED)
> 
> So I have to say that I think it's not a great to fix the length of the
> rseq_len argument basically making it somewhat a nop. If I recall
> correctly Florian said something about the rseq_len becoming part of the
> glibc abi and that's why it can't be changed?
> Is there any way we can avoid that so we can use the rseq_len argument
> to have userspace pass down the size of struct rseq they know about?
> 
> It's really unintuitive to pass down an extensible struct but the length
> argument associated with it is fixed.
> 
> I also think there should be some compile-time sanity checks here
> similar to what we do in other places see e.g.
> 
> 	BUILD_BUG_ON(sizeof(struct clone_args) != CLONE_ARGS_SIZE_VER2);
> 
> So here should at least be sm like:
> 
> 	BUILD_BUG_ON(sizeof(struct rseq) != RSEQ_LEN_EXPECTED);
> 

So here's a very free-wheeling draft of roughly what I had in mind. Not
even compile-tested just to illustrate what I'd change and sorry if that
code will make you sob in your hands:

From 2879e3c30dbe6ba0fc53884b1c41deaa444924a8 Mon Sep 17 00:00:00 2001
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date: Mon, 13 Jul 2020 23:03:46 -0400
Subject: [PATCH] [UNTESTED] rseq: Allow extending struct rseq

Add a __rseq_abi.flags "RSEQ_TLS_FLAG_SIZE", which indicates support for
extending struct rseq. This adds two new fields to struct rseq:
user_size and kernel_size.

The user_size field allows the size of the __rseq_abi definition (which
can be overridden by symbol interposition either by a preloaded library
or by the application) to be handed over to the kernel at registration.
This registration can be performed by a library, e.g. glibc, which does
not know there is interposition taking place.

The kernel_size is populated by the kernel when the "RSEQ_TLS_FLAG_SIZE"
flag is set in __rseq_abi.flags to the minimum between user_size and
the offset of the "end" field of struct rseq as known by the kernel.
This allows user-space to query which fields are effectively populated
by the kernel.

A rseq_size field is added to the task struct to keep track of the
"kernel_size" effective for each thread.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 include/linux/sched.h     |  4 +++
 include/uapi/linux/rseq.h | 36 ++++++++++++++++++--
 kernel/rseq.c             | 72 ++++++++++++++++++++++++++++++++++++---
 3 files changed, 105 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 692e327d7455..5d61a3197987 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1147,6 +1147,7 @@ struct task_struct {
 #ifdef CONFIG_RSEQ
 	struct rseq __user *rseq;
 	u32 rseq_sig;
+	u32 rseq_size;
 	/*
 	 * RmW on rseq_event_mask must be performed atomically
 	 * with respect to preemption.
@@ -1976,10 +1977,12 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags)
 	if (clone_flags & CLONE_VM) {
 		t->rseq = NULL;
 		t->rseq_sig = 0;
+		t->rseq_size = 0;
 		t->rseq_event_mask = 0;
 	} else {
 		t->rseq = current->rseq;
 		t->rseq_sig = current->rseq_sig;
+		t->rseq_size = current->rseq_size;
 		t->rseq_event_mask = current->rseq_event_mask;
 	}
 }
@@ -1988,6 +1991,7 @@ static inline void rseq_execve(struct task_struct *t)
 {
 	t->rseq = NULL;
 	t->rseq_sig = 0;
+	t->rseq_size = 0;
 	t->rseq_event_mask = 0;
 }
 
diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 9a402fdb60e9..f10ce38d9712 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -37,6 +37,15 @@ enum rseq_cs_flags {
 		(1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
 };
 
+enum rseq_tls_flags_bit {
+	/* enum rseq_cs_flags reserves bits 0-2. */
+	RSEQ_TLS_FLAG_SIZE_BIT = 3,
+};
+
+enum rseq_tls_flags {
+	RSEQ_TLS_FLAG_SIZE = (1U << RSEQ_TLS_FLAG_SIZE_BIT),
+};
+
 /*
  * struct rseq_cs is aligned on 4 * 8 bytes to ensure it is always
  * contained within a single cache-line. It is usually declared as
@@ -128,8 +137,9 @@ struct rseq {
 	 *
 	 * This field should only be updated by the thread which
 	 * registered this data structure. Read by the kernel.
-	 * Mainly used for single-stepping through rseq critical sections
-	 * with debuggers.
+	 *
+	 * The RSEQ_CS flags are mainly used for single-stepping through rseq
+	 * critical sections with debuggers.
 	 *
 	 * - RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
 	 *     Inhibit instruction sequence block restart on preemption
@@ -140,8 +150,30 @@ struct rseq {
 	 * - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
 	 *     Inhibit instruction sequence block restart on migration for
 	 *     this thread.
+	 *
+	 * - RSEQ_TLS_FLAG_SIZE
+	 *     Extensible struct rseq ABI. This flag should be statically
+	 *     initialized.
 	 */
 	__u32 flags;
+	/*
+	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, user_size should be
+	 * statically initialized to offsetof(struct rseq, end).
+	 */
+	__u32 user_size;
+	/*
+	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, if the kernel supports
+	 * extensible struct rseq ABI, the kernel_size field is populated by
+	 * the kernel to the minimum between user_size and the offset of the
+	 * "end" field within the struct rseq supported by the kernel on
+	 * successful registration. Should be initialized to 0.
+	 */
+	__u32 kernel_size;
+	__u32 active_size;
 } __attribute__((aligned(4 * sizeof(__u64))));
 
+#define RSEQ_SIZE_VER0 24 /* sizeof first published struct */
+#define RSEQ_SIZE_VER1 32 /* sizeof second published struct */
+#define RSEQ_SIZE_LATEST RSEQ_SIZE_VER1 /* sizeof last published struct */
+
 #endif /* _UAPI_LINUX_RSEQ_H */
diff --git a/kernel/rseq.c b/kernel/rseq.c
index a4f86a9d6937..9d28b41a99cd 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -96,6 +96,7 @@ static int rseq_update_cpu_id(struct task_struct *t)
 static int rseq_reset_rseq_cpu_id(struct task_struct *t)
 {
 	u32 cpu_id_start = 0, cpu_id = RSEQ_CPU_ID_UNINITIALIZED;
+	u16 kernel_size = 0;
 
 	/*
 	 * Reset cpu_id_start to its initial state (0).
@@ -109,6 +110,11 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
 	 */
 	if (put_user(cpu_id, &t->rseq->cpu_id))
 		return -EFAULT;
+	/*
+	 * Reset kernel_size to its initial state (0).
+	 */
+	if (put_user(kernel_size, &t->rseq->kernel_size))
+		return -EFAULT;
 	return 0;
 }
 
@@ -266,7 +272,7 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
 
 	if (unlikely(t->flags & PF_EXITING))
 		return;
-	if (unlikely(!access_ok(t->rseq, sizeof(*t->rseq))))
+	if (unlikely(!access_ok(t->rseq, t->rseq_size)))
 		goto error;
 	ret = rseq_ip_fixup(regs);
 	if (unlikely(ret < 0))
@@ -294,7 +300,7 @@ void rseq_syscall(struct pt_regs *regs)
 
 	if (!t->rseq)
 		return;
-	if (!access_ok(t->rseq, sizeof(*t->rseq)) ||
+	if (!access_ok(t->rseq, t->rseq_size) ||
 	    rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
 		force_sig(SIGSEGV);
 }
@@ -308,6 +314,11 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		int, flags, u32, sig)
 {
 	int ret;
+	u32 tls_flags;
+
+	BUILD_BUG_ON(offsetofend(struct rseq, flags) != RSEQ_SIZE_VER0);
+	BUILD_BUG_ON(offsetofend(struct rseq, active_size) != RSEQ_SIZE_VER1);
+	BUILD_BUG_ON(sizeof(struct rseq) != RSEQ_SIZE_LATEST);
 
 	if (flags & RSEQ_FLAG_UNREGISTER) {
 		if (flags & ~RSEQ_FLAG_UNREGISTER)
@@ -315,14 +326,17 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		/* Unregister rseq for current thread. */
 		if (current->rseq != rseq || !current->rseq)
 			return -EINVAL;
-		if (rseq_len != sizeof(*rseq))
+		if (rseq_len < RSEQ_SIZE_VER0)
 			return -EINVAL;
+		if (!access_ok(rseq, rseq_len))
+			return -EFAULT;
 		if (current->rseq_sig != sig)
 			return -EPERM;
 		ret = rseq_reset_rseq_cpu_id(current);
 		if (ret)
 			return ret;
 		current->rseq = NULL;
+		current->rseq_size = 0;
 		current->rseq_sig = 0;
 		return 0;
 	}
@@ -336,7 +350,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		 * the provided address differs from the prior
 		 * one.
 		 */
-		if (current->rseq != rseq || rseq_len != sizeof(*rseq))
+		if (current->rseq != rseq || rseq_len != RSEQ_LEN_EXPECTED)
 			return -EINVAL;
 		if (current->rseq_sig != sig)
 			return -EPERM;
@@ -349,10 +363,58 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 	 * ensure the provided rseq is properly aligned and valid.
 	 */
 	if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
-	    rseq_len != sizeof(*rseq))
+	    rseq_len < RSEQ_SIZE_VER0)
 		return -EINVAL;
 	if (!access_ok(rseq, rseq_len))
 		return -EFAULT;
+
+	/* Handle extensible struct rseq ABI. */
+	ret = get_user(tls_flags, &rseq->flags);
+	if (ret)
+		return ret;
+	if (tls_flags & RSEQ_TLS_FLAG_SIZE) {
+		u32 user_size, kernel_size, active_size;
+
+		/* Can probably be made nicer by using check_zeroed_user(). */
+		ret = get_user(user_size, &rseq->user_size);
+		if (ret)
+			return ret;
+		if (user_size != 0)
+			return -EINVAL;
+
+		ret = get_user(active_size, &rseq->active_size);
+		if (ret)
+			return ret;
+		if (active_size != 0)
+			return -EINVAL;
+
+		ret = get_user(active_size, &rseq->kernel_size);
+		if (ret)
+			return ret;
+		if (kernel_size != 0)
+			return -EINVAL;
+
+		/* Calculate the useable size. */
+		active_size = min_t(u32, rseq_len, RSEQ_SIZE_LATEST);
+		ret = put_user(active_size, &rseq->active_size);
+		if (ret)
+			return ret;
+
+		/* Let other users know what userspace used to register. */
+		ret = put_user(rseq_len, &rseq->user_size);
+		if (ret)
+			return -EFAULT;
+
+		/* Let other users know what size the kernel supports. */
+		ret = put_user(RSEQ_SIZE_LATEST, &rseq->kernel_size);
+		if (ret)
+			return -EFAULT;
+
+		current->rseq_size = active_size;
+	} else {
+		current->rseq_size = RSEQ_SIZE_VER0;
+	}
+
 	current->rseq = rseq;
 	current->rseq_sig = sig;
 	/*
-- 
2.27.0




^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 0/4] rseq: Introduce extensible struct rseq
  2020-07-14 20:55 ` [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Carlos O'Donell
@ 2020-07-15 13:02   ` Mathieu Desnoyers
  2020-07-16 13:39     ` Carlos O'Donell
  2020-07-15 15:12   ` Florian Weimer
  1 sibling, 1 reply; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 13:02 UTC (permalink / raw)
  To: carlos
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, Florian Weimer, Chris Kennelly,
	Linus Torvalds

----- On Jul 14, 2020, at 4:55 PM, carlos carlos@redhat.com wrote:

> On 7/13/20 11:03 PM, Mathieu Desnoyers wrote:
>> Recent discussion led to a solution for extending struct rseq. This is
>> an implementation of the proposed solution.
>> 
>> Now is a good time to agree on this scheme before the release of glibc
>> 2.32, just in case there are small details to fix on the user-space
>> side in order to allow extending struct rseq.
> 
> Adding extensibility to the rseq registration process would be great,
> but we are out of time for the glibc 2.32 release.

Of course, and my goal is not to add this support for extensibility
before glibc 2.32, but merely to see if we need to change anything in
the way it uses rseq today (before the release) in order to facilitate
extensibility in the future.

> Should we revert rseq for glibc 2.32 and spend quality time discussing
> the implications of an extensible design, something that Google already
> says they are doing?

Google's approach is limited to contexts simpler than multiple unrelated
libraries scenarios. Peter Oskolkov stated as a follow-up that my
extension approach would be one way to deal with problems associated
with sharing __rseq_abi between unrelated libraries:

https://lore.kernel.org/lkml/CAPNVh5fiCCJpyeLj_ciWzFrO4fasVXZNhpfKXJhJWJirXdJOjQ@mail.gmail.com/

The fact that Google already have their own rseq extensions internally
confirms that planning for extensibility is needed.

> We can, with a clear head, and an agreed upon extension mechanism
> include rseq in glibc 2.33 (release scheduled for Feburary 1st 2021).
> We release time boxed every 6 months, no deviation, so you know when
> your next merge window will be.
> 
> We have already done the hard work of fixing the nesting signal
> handler issues, and glibc integration. If we revert today that will
> also give time for Firefox and Chrome to adjust their sandboxes.
> 
> Do you wish to go forward with rseq as we have it in glibc 2.32,
> or do you wish to revert rseq from glibc 2.32, discuss the extension
> mechanism, and put it back into glibc 2.33 with adjustments?

So here we have a catch-22 situation. Linus wants to see how rseq
is being used before accepting additional features (ref.
https://lore.kernel.org/lkml/CAHk-=wjk-2c4XvWjdzc-bs9Hbgvy-p7ASSnKKphggr5qDoXRDQ@mail.gmail.com/).
This lack of ability to allow user-space to make any large-scale use
of the rseq system call in a coordinated fashion blocks wide use of rseq.
This coordination is supposed to be done by glibc, and I told
every user-space project maintainer who contacted me to hold off
using rseq until it is integrated into glibc. "tcmalloc" from Google
is the exception because they do not care about ABI compatibility with
other libraries (they are OK with a breakage and requiring upgrade).

The process I'm going through right now is checking what are our
options for extending rseq starting from the current ABI, just
to see if we are painting ourselves in a corner with the current
glibc integration. However, if we postpone integration of rseq
into glibc because of possible future extensibility features, those
may never happen because of the lack of usage feedback, due of lack
of users, due to lack of coordinated ABI registration.

At this point, the main question I would like answered is whether
it would be acceptable to increase the size and alignment of
the __rseq_abi symbol (which will be exposed by glibc) between
e.g. glibc 2.32 and 2.33. If it's not possible, then we can
find other solutions, for instance using an indirection with
a pointer to an extended structure, but this appears to be
slightly less efficient.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-14 21:30           ` Carlos O'Donell
@ 2020-07-15 13:12             ` Mathieu Desnoyers
  2020-07-15 13:22               ` Florian Weimer
  0 siblings, 1 reply; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 13:12 UTC (permalink / raw)
  To: carlos
  Cc: Florian Weimer, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	paulmck, Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner

----- On Jul 14, 2020, at 5:30 PM, carlos carlos@redhat.com wrote:

> On 7/14/20 9:19 AM, Mathieu Desnoyers wrote:
>> Is there an arch-agnostic way to get the thread pointer from user-space code ?
>> That
>> would be needed by all rseq critical section implementations.
> 
> Yes, and no. We have void *__builtin_thread_pointer (void), but
> few architectures implement the builtin so we'd have to go through
> a round of compiler updates and backports. All targets know how to
> access the thread pointer because the compiler has to generate
> IE-mode accesses to the TLS variables.

Practically speaking, I suspect this would mean postponing availability of
rseq for widely deployed applications for a few more years ? I can very well
see end users upgrading their kernel and using an early-adoption library
to use rseq today, but requiring to upgrade the entire toolchain will likely
postpone adoption to many years from now.

It would be good to start getting feedback from rseq users so we can progress
on the system call feature development. Unfortunately everything has been in
a stand-still for the past years due to lack of rseq registration coordination
in user-space.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 13:12             ` Mathieu Desnoyers
@ 2020-07-15 13:22               ` Florian Weimer
  2020-07-15 13:31                 ` Mathieu Desnoyers
  0 siblings, 1 reply; 37+ messages in thread
From: Florian Weimer @ 2020-07-15 13:22 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: carlos, Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner

* Mathieu Desnoyers:

> Practically speaking, I suspect this would mean postponing availability of
> rseq for widely deployed applications for a few more years ?

There is no rseq support in GCC today, so you have to write assembler
code anyway.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 13:22               ` Florian Weimer
@ 2020-07-15 13:31                 ` Mathieu Desnoyers
  2020-07-15 13:42                   ` Florian Weimer
  0 siblings, 1 reply; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 13:31 UTC (permalink / raw)
  To: Florian Weimer
  Cc: carlos, Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner

----- On Jul 15, 2020, at 9:22 AM, Florian Weimer fweimer@redhat.com wrote:

> * Mathieu Desnoyers:
> 
>> Practically speaking, I suspect this would mean postponing availability of
>> rseq for widely deployed applications for a few more years ?
> 
> There is no rseq support in GCC today, so you have to write assembler
> code anyway.

Assembler code is only needed for the rseq critical sections, not for accessing
the __rseq_abi. Use-cases like getting the current cpu number do not currently
require any assembler for instance.

So indeed it could be done today without upgrading the toolchains by writing
custom assembler for each architecture to get the thread's struct rseq. AFAIU
the ABI to access the thread pointer is fixed for each architecture, right ?

How would this allow early-rseq-adopter libraries to interact with glibc ?

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 13:31                 ` Mathieu Desnoyers
@ 2020-07-15 13:42                   ` Florian Weimer
  2020-07-15 13:55                     ` Christian Brauner
  2020-07-15 14:54                     ` Mathieu Desnoyers
  0 siblings, 2 replies; 37+ messages in thread
From: Florian Weimer @ 2020-07-15 13:42 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: carlos, Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner

* Mathieu Desnoyers:

> So indeed it could be done today without upgrading the toolchains by
> writing custom assembler for each architecture to get the thread's
> struct rseq. AFAIU the ABI to access the thread pointer is fixed for
> each architecture, right ?

Yes, determining the thread pointer and access initial-exec TLS
variables is baked into the ABI.

> How would this allow early-rseq-adopter libraries to interact with
> glibc ?

Under all extension proposals I've seen so far, early adopters are
essentially incompatible with glibc rseq registration.  I don't think
you can have it both ways.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 13:42                   ` Florian Weimer
@ 2020-07-15 13:55                     ` Christian Brauner
  2020-07-15 14:20                       ` Mathieu Desnoyers
  2020-07-15 14:54                     ` Mathieu Desnoyers
  1 sibling, 1 reply; 37+ messages in thread
From: Christian Brauner @ 2020-07-15 13:55 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Mathieu Desnoyers, carlos, Peter Zijlstra, linux-kernel,
	Thomas Gleixner, paulmck, Boqun Feng, H. Peter Anvin,
	Paul Turner, linux-api

On Wed, Jul 15, 2020 at 03:42:11PM +0200, Florian Weimer wrote:
> * Mathieu Desnoyers:
> 
> > So indeed it could be done today without upgrading the toolchains by
> > writing custom assembler for each architecture to get the thread's
> > struct rseq. AFAIU the ABI to access the thread pointer is fixed for
> > each architecture, right ?
> 
> Yes, determining the thread pointer and access initial-exec TLS
> variables is baked into the ABI.
> 
> > How would this allow early-rseq-adopter libraries to interact with
> > glibc ?
> 
> Under all extension proposals I've seen so far, early adopters are
> essentially incompatible with glibc rseq registration.  I don't think
> you can have it both ways.

Who are the early adopters? And if we aren't being compatible with them
under the extensible schemes proposed we should be able to achieve
compatibility with non-early adopters, right? Which I guess is more
important. (I still struggle to make sense what qualifies as an early
adopter/what the difference to a non-early adopter is.)

Christian

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 13:55                     ` Christian Brauner
@ 2020-07-15 14:20                       ` Mathieu Desnoyers
  0 siblings, 0 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 14:20 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Florian Weimer, carlos, Peter Zijlstra, linux-kernel,
	Thomas Gleixner, paulmck, Boqun Feng, H. Peter Anvin,
	Paul Turner, linux-api

----- On Jul 15, 2020, at 9:55 AM, Christian Brauner christian.brauner@ubuntu.com wrote:

> On Wed, Jul 15, 2020 at 03:42:11PM +0200, Florian Weimer wrote:
>> * Mathieu Desnoyers:
>> 
>> > So indeed it could be done today without upgrading the toolchains by
>> > writing custom assembler for each architecture to get the thread's
>> > struct rseq. AFAIU the ABI to access the thread pointer is fixed for
>> > each architecture, right ?
>> 
>> Yes, determining the thread pointer and access initial-exec TLS
>> variables is baked into the ABI.
>> 
>> > How would this allow early-rseq-adopter libraries to interact with
>> > glibc ?
>> 
>> Under all extension proposals I've seen so far, early adopters are
>> essentially incompatible with glibc rseq registration.  I don't think
>> you can have it both ways.
> 
> Who are the early adopters? And if we aren't being compatible with them
> under the extensible schemes proposed we should be able to achieve
> compatibility with non-early adopters, right? Which I guess is more
> important. (I still struggle to make sense what qualifies as an early
> adopter/what the difference to a non-early adopter is.)

Early adopter libraries and applications are meant to be able to use rseq
without requiring upgrade of the entire environment to a newer glibc.

I maintain early adopter projects (liburcu, lttng-ust) which postpone using
rseq outside of prototype branches until we agree on an ABI to share
__rseq_abi between glibc and early adopter libraries. The last thing I
want is for those projects to break when an end-user upgrades their
glibc. tcmalloc is another early adopter which have less strict
compatibility requirements: they are OK with breaking changes requiring
upgrading and rebuilding tcmalloc.

Indeed, until we cast in stone the layout of struct rseq as exposed by
glibc, I think we have some freedom in our definition of "early adopter",
because pretty much every relevant open source project which want to use
rseq is waiting on glibc to define that ABI, to use rseq either as an
early-adopter or through a dependency on newer glibc.

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15  6:31           ` Florian Weimer
  2020-07-15 10:59             ` Christian Brauner
@ 2020-07-15 14:38             ` Mathieu Desnoyers
  1 sibling, 0 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 14:38 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Chris Kennelly, Peter Oskolkov, Peter Oskolkov, Peter Zijlstra,
	linux-kernel, Thomas Gleixner, paulmck, Boqun Feng,
	H. Peter Anvin, Paul Turner, linux-api, Christian Brauner,
	carlos

----- On Jul 15, 2020, at 2:31 AM, Florian Weimer fw@deneb.enyo.de wrote:

> * Chris Kennelly:
> 
>> When glibc provides registration, is the anticipated use case that a
>> library would unregister and reregister each thread to "upgrade" it to
>> the most modern version of interface it knows about provided by the
>> kernel?
> 
> Absolutely not, that is likely to break other consumers because an
> expected rseq area becomes dormant instead.

Indeed.

> 
>> There, I could assume an all-or-nothing registration of the new
>> feature--limited only by kernel availability for thread
>> homogeneity--but inconsistencies across early adopter libraries would
>> mean each thread would have to examine its own TLS to determine if a
>> feature were available.
> 
> Exactly.  Certain uses of seccomp can also have this effect,
> presenting a non-homogeneous view.

The nice thing about having a consistent feature-set for a given
thread group is that it allows specializing the code at thread
group startup, rather than requiring to dynamically check for
feature availability at runtime in fast-paths.

I wonder whether this kind of non-homogeneous view scenario
caused by seccomp is something we should support, or something
that should be documented as incompatible with rseq ?

Thanks,

Mathieu



-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15  2:34         ` Chris Kennelly
  2020-07-15  6:31           ` Florian Weimer
@ 2020-07-15 14:50           ` Mathieu Desnoyers
  1 sibling, 0 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 14:50 UTC (permalink / raw)
  To: Chris Kennelly
  Cc: Peter Oskolkov, Peter Oskolkov, Peter Zijlstra, linux-kernel,
	Thomas Gleixner, paulmck, Boqun Feng, H. Peter Anvin,
	Paul Turner, linux-api, Christian Brauner, Florian Weimer,
	carlos

----- On Jul 14, 2020, at 10:34 PM, Chris Kennelly ckennelly@google.com wrote:

> On Tue, Jul 14, 2020 at 2:33 PM Peter Oskolkov <posk@google.com> wrote:
>>
>> On Tue, Jul 14, 2020 at 10:43 AM Mathieu Desnoyers
>> <mathieu.desnoyers@efficios.com> wrote:
>> >
>> > ----- On Jul 14, 2020, at 1:24 PM, Peter Oskolkov posk@posk.io wrote:
>> >
>> > > At Google, we actually extended struct rseq (I will post the patches
>> > > here once they are fully deployed and we have specific
>> > > benefits/improvements to report). We did this by adding several fields
>> > > below __u32 flags (the last field currently), and correspondingly
>> > > increasing rseq_len in rseq() syscall. If the kernel does not know of
>> > > this extension, it will return -EINVAL due to an unexpected rseq_len;
>> > > then the application can either fall-back to the standard/upstream
>> > > rseq, or bail. If the kernel does know of this extension, it accepts
>> > > it. If the application passes the old rseq_len (32), the kernel knows
>> > > that this is an old application and treats it as such.
>> > >
>> > > I looked through the archives, but I did not find specifically why the
>> > > pretty standard approach described above is considered inferior to the
>> > > one taken in this patch (freeze rseq_len at 32, add additional length
>> > > fields to struct rseq). Can these be summarized?
>> >
>> > I think you don't face the issues I'm facing with libc rseq integration
>> > because you control the entire user-space software ecosystem at Google.
>> >
>> > The main issue we face is that the library responsible for registering
>> > rseq (either glibc 2.32+, an early-adopter librseq library, or the
>> > application) may very well not be the same library defining the __rseq_abi
>> > symbol used in the global symbol table. Interposition with ld preload or
>> > by defining the __rseq_abi in the program's executable are good examples
>> > of this kind of scenario, and those use-cases are supported.
> 
> Does this work if/when we run out of bytes in the current sizeof(__rseq_abi)?

Only if all libraries/programs involved (including glibc) expect that the size
of the __rseq_abi can be the smallest possible subset, and only consider it
to be "extended" if specific information in the ABI tells them it is the case.

> 
> Which library provides the TLS symbol (and N bytes of storage) seems
> sensitive to the choices the linker makes for us, once the symbol
> sizes diverge.

AFAIU, a symbol defined in the main executable will have precedence over
a preloaded library, which has precedence over shared library dependencies,
e.g. glibc.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 13:42                   ` Florian Weimer
  2020-07-15 13:55                     ` Christian Brauner
@ 2020-07-15 14:54                     ` Mathieu Desnoyers
  2020-07-15 14:58                       ` Florian Weimer
  1 sibling, 1 reply; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 14:54 UTC (permalink / raw)
  To: Florian Weimer
  Cc: carlos, Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner

----- On Jul 15, 2020, at 9:42 AM, Florian Weimer fweimer@redhat.com wrote:
> * Mathieu Desnoyers:
> 
[...]
>> How would this allow early-rseq-adopter libraries to interact with
>> glibc ?
> 
> Under all extension proposals I've seen so far, early adopters are
> essentially incompatible with glibc rseq registration.  I don't think
> you can have it both ways.

The basic question I'm not sure about is whether we are allowed to increase
the size and alignement of __rseq_abi from e.g. glibc 2.32 to glibc 2.33.
If not, then we just need to find another way to extend struct rseq, e.g. by
adding a pointer to another extended structure in the padding space we
have at the end of struct rseq.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 14:54                     ` Mathieu Desnoyers
@ 2020-07-15 14:58                       ` Florian Weimer
  2020-07-15 15:26                         ` Mathieu Desnoyers
  0 siblings, 1 reply; 37+ messages in thread
From: Florian Weimer @ 2020-07-15 14:58 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: carlos, Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner

* Mathieu Desnoyers:

> ----- On Jul 15, 2020, at 9:42 AM, Florian Weimer fweimer@redhat.com wrote:
>> * Mathieu Desnoyers:
>> 
> [...]
>>> How would this allow early-rseq-adopter libraries to interact with
>>> glibc ?
>> 
>> Under all extension proposals I've seen so far, early adopters are
>> essentially incompatible with glibc rseq registration.  I don't think
>> you can have it both ways.
>
> The basic question I'm not sure about is whether we are allowed to increase
> the size and alignement of __rseq_abi from e.g. glibc 2.32 to glibc 2.33.

With the current mechanism (global TLS data symbol), we can do that
using symbol versioning.  That means that we can only do this on a
release boundary, and that it's incompatible with other libraries which
use an interposing unversioned symbol.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 12:33     ` Christian Brauner
@ 2020-07-15 15:10       ` Mathieu Desnoyers
  2020-07-15 15:33         ` Christian Brauner
  0 siblings, 1 reply; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 15:10 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Florian Weimer, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	paulmck, Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	carlos

----- On Jul 15, 2020, at 8:33 AM, Christian Brauner christian.brauner@ubuntu.com wrote:
[...]
> 
> So here's a very free-wheeling draft of roughly what I had in mind. Not
> even compile-tested just to illustrate what I'd change and sorry if that
> code will make you sob in your hands:
> 
[...]
> +	/*
> +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, user_size should be
> +	 * statically initialized to offsetof(struct rseq, end).
> +	 */
> +	__u32 user_size;
> +	/*
> +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, if the kernel supports
> +	 * extensible struct rseq ABI, the kernel_size field is populated by
> +	 * the kernel to the minimum between user_size and the offset of the
> +	 * "end" field within the struct rseq supported by the kernel on
> +	 * successful registration. Should be initialized to 0.
> +	 */
> +	__u32 kernel_size;

Moving from __u16 to __u32 for both fields don't achieve much, and increase
the size of struct rseq (excluding padding) from 24 bytes to 28 bytes.

Note that the struct rseq alignment is 32 bytes. At 24 bytes, it leaves room
for exactly one 8 bytes pointer, which can be useful for future extensions.
If the size is increased to 28 bytes or more, then we're done and cannot
add a pointer.

> +	__u32 active_size;

This additional field takes the very last bytes of padding we have in the
current layout.

> } __attribute__((aligned(4 * sizeof(__u64))));
> 
> +#define RSEQ_SIZE_VER0 24 /* sizeof first published struct */

This is incorrect. The sizeof(struct_rseq) with its 32 bytes alignment is 32,
not 24. The padding at the end of the structure is considered as part of its
size, but we cannot rely on its content being zero-initialized based on the
C standard.

> +#define RSEQ_SIZE_VER1 32 /* sizeof second published struct */
> +#define RSEQ_SIZE_LATEST RSEQ_SIZE_VER1 /* sizeof last published struct */
> +

[...]

> @@ -349,10 +363,58 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32,
> rseq_len,
> 	 * ensure the provided rseq is properly aligned and valid.
> 	 */
> 	if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
> -	    rseq_len != sizeof(*rseq))
> +	    rseq_len < RSEQ_SIZE_VER0)

This could perhaps be changed for future kernels, but will break for existing
kernels as soon as rseq_len is increased. This is something we should have
planned for in the initial implementation of the system call, but here we are.

How do you envision that userspace would handle this failure from older kernels ?
Try again with a second system call passing RSEQ_SIZE_VER0 as argument ?

> 		return -EINVAL;
> 	if (!access_ok(rseq, rseq_len))
> 		return -EFAULT;
> +
> +	/* Handle extensible struct rseq ABI. */
> +	ret = get_user(tls_flags, &rseq->flags);
> +	if (ret)
> +		return ret;
> +	if (tls_flags & RSEQ_TLS_FLAG_SIZE) {
> +		u32 user_size, kernel_size, active_size;
> +
> +		/* Can probably be made nicer by using check_zeroed_user(). */
> +		ret = get_user(user_size, &rseq->user_size);
> +		if (ret)
> +			return ret;
> +		if (user_size != 0)
> +			return -EINVAL;
> +
> +		ret = get_user(active_size, &rseq->active_size);
> +		if (ret)
> +			return ret;
> +		if (active_size != 0)
> +			return -EINVAL;
> +
> +		ret = get_user(active_size, &rseq->kernel_size);

I guess you mean kernel_size here.

> +		if (ret)
> +			return ret;
> +		if (kernel_size != 0)
> +			return -EINVAL;
> +
> +		/* Calculate the useable size. */
> +		active_size = min_t(u32, rseq_len, RSEQ_SIZE_LATEST);

Where is the rseq_len supposed to come from in userspace ? Should it be
that the code doing the registration uses sizeof(struct rseq), or offsetof(struct rseq, end),
or should it read the content of __rseq_abi.user_size ?

> +		ret = put_user(active_size, &rseq->active_size);
> +		if (ret)
> +			return ret;
> +
> +		/* Let other users know what userspace used to register. */
> +		ret = put_user(rseq_len, &rseq->user_size);
> +		if (ret)
> +			return -EFAULT;
> +
> +		/* Let other users know what size the kernel supports. */

I am not sure what those 3 __u32 fields (user_size, kernel_size, and active_size),
plus use of the rseq_len syscall parameter, accomplish which was not accomplished
by my __u16 user_size + kernel_size approach ? If anything, it seems to make support
of older kernels which do not support an extended rseq_len parameter more complex.

Thanks,

Mathieu


> +		ret = put_user(RSEQ_SIZE_LATEST, &rseq->kernel_size);
> +		if (ret)
> +			return -EFAULT;
> +
> +		current->rseq_size = active_size;
> +	} else {
> +		current->rseq_size = RSEQ_SIZE_VER0;
> +	}
> +
> 	current->rseq = rseq;
> 	current->rseq_sig = sig;
> 	/*
> --
> 2.27.0

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 0/4] rseq: Introduce extensible struct rseq
  2020-07-14 20:55 ` [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Carlos O'Donell
  2020-07-15 13:02   ` Mathieu Desnoyers
@ 2020-07-15 15:12   ` Florian Weimer
  2020-07-15 15:32     ` Mathieu Desnoyers
  1 sibling, 1 reply; 37+ messages in thread
From: Florian Weimer @ 2020-07-15 15:12 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Mathieu Desnoyers, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	Paul E . McKenney, Boqun Feng, H . Peter Anvin, Paul Turner,
	linux-api, Christian Brauner, Florian Weimer

* Carlos O'Donell:

> On 7/13/20 11:03 PM, Mathieu Desnoyers wrote:
>> Recent discussion led to a solution for extending struct rseq. This is
>> an implementation of the proposed solution.
>> 
>> Now is a good time to agree on this scheme before the release of glibc
>> 2.32, just in case there are small details to fix on the user-space
>> side in order to allow extending struct rseq.
>
> Adding extensibility to the rseq registration process would be great,
> but we are out of time for the glibc 2.32 release.
>
> Should we revert rseq for glibc 2.32 and spend quality time discussing
> the implications of an extensible design, something that Google already
> says they are doing?
>
> We can, with a clear head, and an agreed upon extension mechanism
> include rseq in glibc 2.33 (release scheduled for Feburary 1st 2021).
> We release time boxed every 6 months, no deviation, so you know when
> your next merge window will be.
>
> We have already done the hard work of fixing the nesting signal
> handler issues, and glibc integration. If we revert today that will 
> also give time for Firefox and Chrome to adjust their sandboxes.
>
> Do you wish to go forward with rseq as we have it in glibc 2.32,
> or do you wish to revert rseq from glibc 2.32, discuss the extension
> mechanism, and put it back into glibc 2.33 with adjustments?

I posted the glibc revert:

  <https://sourceware.org/pipermail/libc-alpha/2020-July/116368.html>

I do not think we have any other choice at this point.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 14:58                       ` Florian Weimer
@ 2020-07-15 15:26                         ` Mathieu Desnoyers
  0 siblings, 0 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 15:26 UTC (permalink / raw)
  To: Florian Weimer
  Cc: carlos, Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner

----- On Jul 15, 2020, at 10:58 AM, Florian Weimer fweimer@redhat.com wrote:

> * Mathieu Desnoyers:
> 
>> ----- On Jul 15, 2020, at 9:42 AM, Florian Weimer fweimer@redhat.com wrote:
>>> * Mathieu Desnoyers:
>>> 
>> [...]
>>>> How would this allow early-rseq-adopter libraries to interact with
>>>> glibc ?
>>> 
>>> Under all extension proposals I've seen so far, early adopters are
>>> essentially incompatible with glibc rseq registration.  I don't think
>>> you can have it both ways.
>>
>> The basic question I'm not sure about is whether we are allowed to increase
>> the size and alignement of __rseq_abi from e.g. glibc 2.32 to glibc 2.33.
> 
> With the current mechanism (global TLS data symbol), we can do that
> using symbol versioning.  That means that we can only do this on a
> release boundary,

That should not be a problem.

> and that it's incompatible with other libraries which
> use an interposing unversioned symbol.

We have the freedom to define the ABI of this shared __rseq_abi symbol
right now. Maybe it's not such a good thing to let early adopters use
unversioned __rseq_abi symbols.

Let me wrap my head around this scenario then, please let me know if
I'm misunderstanding something:

1) glibc 2.32 exposes:
   __rseq_abi (GLIBC_2.32) with size == 32.
   __rseq_abi with size == 32 is available as a private symbol within glibc
   - both symbols alias the same contents.

2) glibc 2.33 exposes:
   __rseq_abi (GLIBC_2.32) with size == 32.
   __rseq_abi (GLIBC_2.33) with size == 64.
   __rseq_abi with size == 64 is available as a private symbol within glibc
   - the three symbols alias the same contents.

Then what happens if we have a program or preloaded library defining
__rseq_abi (without version) with size == 32 loaded with a glibc 2.33 ?

Or what happens if we have a program or preloaded libary defining
__rseq_abi (GLIBC_2.32) with size == 32 loaded with a glibc 2.33 ?

I wonder if "GLIBC_*" is the right version namespace for this. Considering
that the layout of this structure is defined by the Linux kernel UAPI, maybe
we'd want version named as "RSEQ_1.0", "RSEQ_2.0" or something similar.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 0/4] rseq: Introduce extensible struct rseq
  2020-07-15 15:12   ` Florian Weimer
@ 2020-07-15 15:32     ` Mathieu Desnoyers
  0 siblings, 0 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-15 15:32 UTC (permalink / raw)
  To: Florian Weimer
  Cc: carlos, Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, Florian Weimer, Linus Torvalds

----- On Jul 15, 2020, at 11:12 AM, Florian Weimer fweimer@redhat.com wrote:

> * Carlos O'Donell:
> 
>> On 7/13/20 11:03 PM, Mathieu Desnoyers wrote:
>>> Recent discussion led to a solution for extending struct rseq. This is
>>> an implementation of the proposed solution.
>>> 
>>> Now is a good time to agree on this scheme before the release of glibc
>>> 2.32, just in case there are small details to fix on the user-space
>>> side in order to allow extending struct rseq.
>>
>> Adding extensibility to the rseq registration process would be great,
>> but we are out of time for the glibc 2.32 release.
>>
>> Should we revert rseq for glibc 2.32 and spend quality time discussing
>> the implications of an extensible design, something that Google already
>> says they are doing?
>>
>> We can, with a clear head, and an agreed upon extension mechanism
>> include rseq in glibc 2.33 (release scheduled for Feburary 1st 2021).
>> We release time boxed every 6 months, no deviation, so you know when
>> your next merge window will be.
>>
>> We have already done the hard work of fixing the nesting signal
>> handler issues, and glibc integration. If we revert today that will
>> also give time for Firefox and Chrome to adjust their sandboxes.
>>
>> Do you wish to go forward with rseq as we have it in glibc 2.32,
>> or do you wish to revert rseq from glibc 2.32, discuss the extension
>> mechanism, and put it back into glibc 2.33 with adjustments?
> 
> I posted the glibc revert:
> 
>  <https://sourceware.org/pipermail/libc-alpha/2020-July/116368.html>
> 
> I do not think we have any other choice at this point.

This is indeed the safe course of action.

Let's hope the overall interest about rseq will be sufficient to justify
integrating extensibility support in the rseq system call ABI, otherwise we
have a catch-22 situation where everything is stalled again, due to further
progress on rseq kernel features awaiting user feedback on the existing
implementation, which will never come because the integration of coordinated
use across libraries is awaiting further development at the kernel level.

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 2/4] rseq: Allow extending struct rseq
  2020-07-15 15:10       ` Mathieu Desnoyers
@ 2020-07-15 15:33         ` Christian Brauner
  0 siblings, 0 replies; 37+ messages in thread
From: Christian Brauner @ 2020-07-15 15:33 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Florian Weimer, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	paulmck, Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	carlos

On Wed, Jul 15, 2020 at 11:10:47AM -0400, Mathieu Desnoyers wrote:
> ----- On Jul 15, 2020, at 8:33 AM, Christian Brauner christian.brauner@ubuntu.com wrote:
> [...]
> > 
> > So here's a very free-wheeling draft of roughly what I had in mind. Not
> > even compile-tested just to illustrate what I'd change and sorry if that
> > code will make you sob in your hands:
> > 
> [...]
> > +	/*
> > +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, user_size should be
> > +	 * statically initialized to offsetof(struct rseq, end).
> > +	 */
> > +	__u32 user_size;
> > +	/*
> > +	 * With __rseq_abi.flags RSEQ_TLS_FLAG_SIZE set, if the kernel supports
> > +	 * extensible struct rseq ABI, the kernel_size field is populated by
> > +	 * the kernel to the minimum between user_size and the offset of the
> > +	 * "end" field within the struct rseq supported by the kernel on
> > +	 * successful registration. Should be initialized to 0.
> > +	 */
> > +	__u32 kernel_size;
> 
> Moving from __u16 to __u32 for both fields don't achieve much, and increase
> the size of struct rseq (excluding padding) from 24 bytes to 28 bytes.
> 
> Note that the struct rseq alignment is 32 bytes. At 24 bytes, it leaves room
> for exactly one 8 bytes pointer, which can be useful for future extensions.
> If the size is increased to 28 bytes or more, then we're done and cannot
> add a pointer.
> 
> > +	__u32 active_size;
> 
> This additional field takes the very last bytes of padding we have in the
> current layout.
> 
> > } __attribute__((aligned(4 * sizeof(__u64))));
> > 
> > +#define RSEQ_SIZE_VER0 24 /* sizeof first published struct */
> 
> This is incorrect. The sizeof(struct_rseq) with its 32 bytes alignment is 32,
> not 24. The padding at the end of the structure is considered as part of its
> size, but we cannot rely on its content being zero-initialized based on the
> C standard.
> 
> > +#define RSEQ_SIZE_VER1 32 /* sizeof second published struct */
> > +#define RSEQ_SIZE_LATEST RSEQ_SIZE_VER1 /* sizeof last published struct */
> > +
> 
> [...]
> 
> > @@ -349,10 +363,58 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32,
> > rseq_len,
> > 	 * ensure the provided rseq is properly aligned and valid.
> > 	 */
> > 	if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
> > -	    rseq_len != sizeof(*rseq))
> > +	    rseq_len < RSEQ_SIZE_VER0)
> 
> This could perhaps be changed for future kernels, but will break for existing
> kernels as soon as rseq_len is increased. This is something we should have
> planned for in the initial implementation of the system call, but here we are.
> 
> How do you envision that userspace would handle this failure from older kernels ?
> Try again with a second system call passing RSEQ_SIZE_VER0 as argument ?
> 
> > 		return -EINVAL;
> > 	if (!access_ok(rseq, rseq_len))
> > 		return -EFAULT;
> > +
> > +	/* Handle extensible struct rseq ABI. */
> > +	ret = get_user(tls_flags, &rseq->flags);
> > +	if (ret)
> > +		return ret;
> > +	if (tls_flags & RSEQ_TLS_FLAG_SIZE) {
> > +		u32 user_size, kernel_size, active_size;
> > +
> > +		/* Can probably be made nicer by using check_zeroed_user(). */
> > +		ret = get_user(user_size, &rseq->user_size);
> > +		if (ret)
> > +			return ret;
> > +		if (user_size != 0)
> > +			return -EINVAL;
> > +
> > +		ret = get_user(active_size, &rseq->active_size);
> > +		if (ret)
> > +			return ret;
> > +		if (active_size != 0)
> > +			return -EINVAL;
> > +
> > +		ret = get_user(active_size, &rseq->kernel_size);
> 
> I guess you mean kernel_size here.
> 
> > +		if (ret)
> > +			return ret;
> > +		if (kernel_size != 0)
> > +			return -EINVAL;
> > +
> > +		/* Calculate the useable size. */
> > +		active_size = min_t(u32, rseq_len, RSEQ_SIZE_LATEST);
> 
> Where is the rseq_len supposed to come from in userspace ? Should it be
> that the code doing the registration uses sizeof(struct rseq), or offsetof(struct rseq, end),
> or should it read the content of __rseq_abi.user_size ?
> 
> > +		ret = put_user(active_size, &rseq->active_size);
> > +		if (ret)
> > +			return ret;
> > +
> > +		/* Let other users know what userspace used to register. */
> > +		ret = put_user(rseq_len, &rseq->user_size);
> > +		if (ret)
> > +			return -EFAULT;
> > +
> > +		/* Let other users know what size the kernel supports. */
> 
> I am not sure what those 3 __u32 fields (user_size, kernel_size, and active_size),
> plus use of the rseq_len syscall parameter, accomplish which was not accomplished
> by my __u16 user_size + kernel_size approach ? If anything, it seems to make support
> of older kernels which do not support an extended rseq_len parameter more complex.

Yeah, fair point. I really just sketched this.
It seemed to me that what you might want to expose all three sizes in
some form, i.e.  the size the kernel knows about, the size that
userspace knows about and the size in use. The advantage of exposing the
size the kernel itself knows about and the size that userspace knows
about is that you can infer the used size and you don't loose any
information. When you only register the kernel used size and the size
userspace knows about you can't necessarily infer what size the kernel
supports. But I suppose there's no obvious case where this is needed rn.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 0/4] rseq: Introduce extensible struct rseq
  2020-07-15 13:02   ` Mathieu Desnoyers
@ 2020-07-16 13:39     ` Carlos O'Donell
  2020-07-16 14:45       ` Mathieu Desnoyers
  0 siblings, 1 reply; 37+ messages in thread
From: Carlos O'Donell @ 2020-07-16 13:39 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, Florian Weimer, Chris Kennelly,
	Linus Torvalds

On 7/15/20 9:02 AM, Mathieu Desnoyers wrote:
> At this point, the main question I would like answered is whether
> it would be acceptable to increase the size and alignment of
> the __rseq_abi symbol (which will be exposed by glibc) between
> e.g. glibc 2.32 and 2.33. If it's not possible, then we can
> find other solutions, for instance using an indirection with
> a pointer to an extended structure, but this appears to be
> slightly less efficient.

The answer is always a soft "maybe" because it depends exactly
on how we do it and what consequences we are willing to accept
in the design.

For example, static applications that call dlopen will fail if
we increase the alignment beyond 32 because we had to special
case this scenario. Why did we have to special case it? Because
the "static" part of the runtime needs to create the initial
thread's static TLS space, and since it doesn't know apriori
what will be loaded in the shared library, it needs to make a
"best guess" at the alignment requirement at startup.
We need to discuss this and agree that it's OK. We already want
to deprecate dynamic loading from static applications, so this
may not be a problem in general, but I hope you see my point.
That there are corner cases to be considered and ironed out.

I want to see a detailed design document explaining the various
compatibility issues and how we solve them along with the way
the extension mechanism would work and how it would be compliant
with C/C++ language rules in userspace without adding undue burden
of potentially having to use atomic instructions all the time.
This includes discussing how the headers change. We should also
talk out the options for symbol versioning and their consequences.
  
I haven't seen enough details, and there isn't really enough
time to discuss this. I think it is *great* that we are discussing
it, but it's safest if we revert rseq, finish the discussion,
and then finalize the inclusion for 2.33 with these details
ironed out.

I feel like we've made all the technical process we need to actually
include rseq in glibc, but this discussion, and the google example
(even if it doesn't match our use case) shows that if we spend another
month hammering out the extension details could yield something we
can use for years to come while we work out other details e.g. cpu_opv.

I can set aside time in the next month to write up such a document
and discuss these issues with you and Florian. The text would form
even more of the language we'd have to include in the man page for
the feature.

In the meantime I think we should revert rseq in glibc and take
our time to hash this out without the looming deadline of August 1st
for the ABI going out the door.

I know this is disappointing, but I think in a month you'll look
back at this, we'll have Fedora Rawhide using the new extensible
version (and you'll be able to point people at that), and we'll
only be 5 months away from an official release with extensible
rseq.

Could you please respond to Florian's request to revert here?
https://sourceware.org/pipermail/libc-alpha/2020-July/116368.html

I'm looking for a Signed-off-by from you that you're OK with
reverting.

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH 0/4] rseq: Introduce extensible struct rseq
  2020-07-16 13:39     ` Carlos O'Donell
@ 2020-07-16 14:45       ` Mathieu Desnoyers
  0 siblings, 0 replies; 37+ messages in thread
From: Mathieu Desnoyers @ 2020-07-16 14:45 UTC (permalink / raw)
  To: carlos
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck,
	Boqun Feng, H. Peter Anvin, Paul Turner, linux-api,
	Christian Brauner, Florian Weimer, Chris Kennelly,
	Linus Torvalds

----- On Jul 16, 2020, at 9:39 AM, carlos carlos@redhat.com wrote:

> On 7/15/20 9:02 AM, Mathieu Desnoyers wrote:
>> At this point, the main question I would like answered is whether
>> it would be acceptable to increase the size and alignment of
>> the __rseq_abi symbol (which will be exposed by glibc) between
>> e.g. glibc 2.32 and 2.33. If it's not possible, then we can
>> find other solutions, for instance using an indirection with
>> a pointer to an extended structure, but this appears to be
>> slightly less efficient.
> 
> The answer is always a soft "maybe" because it depends exactly
> on how we do it and what consequences we are willing to accept
> in the design.
> 
> For example, static applications that call dlopen will fail if
> we increase the alignment beyond 32 because we had to special
> case this scenario. Why did we have to special case it? Because
> the "static" part of the runtime needs to create the initial
> thread's static TLS space, and since it doesn't know apriori
> what will be loaded in the shared library, it needs to make a
> "best guess" at the alignment requirement at startup.
> We need to discuss this and agree that it's OK. We already want
> to deprecate dynamic loading from static applications, so this
> may not be a problem in general, but I hope you see my point.
> That there are corner cases to be considered and ironed out.

Note that I don't foresee we will explicitly need to increase
the alignment value for __rseq_abi beyond 32, but I was merely
asking this for sake of completeness, in case extending struct rseq
beyond a certain limit ever happens to increase the minimum
alignment.

> 
> I want to see a detailed design document explaining the various
> compatibility issues and how we solve them along with the way
> the extension mechanism would work and how it would be compliant
> with C/C++ language rules in userspace without adding undue burden
> of potentially having to use atomic instructions all the time.
> This includes discussing how the headers change. We should also
> talk out the options for symbol versioning and their consequences.
>  
> I haven't seen enough details, and there isn't really enough
> time to discuss this. I think it is *great* that we are discussing
> it, but it's safest if we revert rseq, finish the discussion,
> and then finalize the inclusion for 2.33 with these details
> ironed out.

Yes, absolutely.

> 
> I feel like we've made all the technical process we need to actually
> include rseq in glibc, but this discussion, and the google example
> (even if it doesn't match our use case) shows that if we spend another
> month hammering out the extension details could yield something we
> can use for years to come while we work out other details e.g. cpu_opv.

Indeed. Note that the current approach proposed to replace cpu_opv
is "sched_pair_cpu", ref. https://lore.kernel.org/lkml/20200619202516.7109-1-mathieu.desnoyers@efficios.com/

> I can set aside time in the next month to write up such a document
> and discuss these issues with you and Florian. The text would form
> even more of the language we'd have to include in the man page for
> the feature.

I'll do my best to secure some time to work with you on this in the
next month, but I will really have to focus on other projects which
I had to delay to make sure the rseq integration was ready for glibc
2.32.

> In the meantime I think we should revert rseq in glibc and take
> our time to hash this out without the looming deadline of August 1st
> for the ABI going out the door.
> 
> I know this is disappointing, but I think in a month you'll look
> back at this, we'll have Fedora Rawhide using the new extensible
> version (and you'll be able to point people at that), and we'll
> only be 5 months away from an official release with extensible
> rseq.

If this delay gives us a future-proof extensible rseq ABI, I'm absolutely
for it!

> Could you please respond to Florian's request to revert here?
> https://sourceware.org/pipermail/libc-alpha/2020-July/116368.html
> 
> I'm looking for a Signed-off-by from you that you're OK with
> reverting.

Will do, thanks!

Mathieu


> 
> --
> Cheers,
> Carlos.

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2020-07-16 14:45 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-14  3:03 [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Mathieu Desnoyers
2020-07-14  3:03 ` [RFC PATCH 1/4] selftests: rseq: Use fixed value as rseq_len parameter Mathieu Desnoyers
2020-07-14  3:03 ` [RFC PATCH 2/4] rseq: Allow extending struct rseq Mathieu Desnoyers
2020-07-14  9:58   ` Florian Weimer
2020-07-14 12:50     ` Mathieu Desnoyers
2020-07-14 13:00       ` Florian Weimer
2020-07-14 13:19         ` Mathieu Desnoyers
2020-07-14 21:30           ` Carlos O'Donell
2020-07-15 13:12             ` Mathieu Desnoyers
2020-07-15 13:22               ` Florian Weimer
2020-07-15 13:31                 ` Mathieu Desnoyers
2020-07-15 13:42                   ` Florian Weimer
2020-07-15 13:55                     ` Christian Brauner
2020-07-15 14:20                       ` Mathieu Desnoyers
2020-07-15 14:54                     ` Mathieu Desnoyers
2020-07-15 14:58                       ` Florian Weimer
2020-07-15 15:26                         ` Mathieu Desnoyers
2020-07-14 17:24   ` Peter Oskolkov
2020-07-14 17:43     ` Mathieu Desnoyers
2020-07-14 18:33       ` Peter Oskolkov
2020-07-15  2:34         ` Chris Kennelly
2020-07-15  6:31           ` Florian Weimer
2020-07-15 10:59             ` Christian Brauner
2020-07-15 14:38             ` Mathieu Desnoyers
2020-07-15 14:50           ` Mathieu Desnoyers
2020-07-15 11:38   ` Christian Brauner
2020-07-15 12:33     ` Christian Brauner
2020-07-15 15:10       ` Mathieu Desnoyers
2020-07-15 15:33         ` Christian Brauner
2020-07-14  3:03 ` [RFC PATCH 3/4] selftests: rseq: define __rseq_abi with extensible size Mathieu Desnoyers
2020-07-14  3:03 ` [RFC PATCH 4/4] selftests: rseq: print rseq extensible size in basic test Mathieu Desnoyers
2020-07-14 20:55 ` [RFC PATCH 0/4] rseq: Introduce extensible struct rseq Carlos O'Donell
2020-07-15 13:02   ` Mathieu Desnoyers
2020-07-16 13:39     ` Carlos O'Donell
2020-07-16 14:45       ` Mathieu Desnoyers
2020-07-15 15:12   ` Florian Weimer
2020-07-15 15:32     ` Mathieu Desnoyers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).