linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU
@ 2020-08-06  0:08 Peter Oskolkov
  2020-08-06  0:08 ` [PATCH 2/2] selftests/rseq: test MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU Peter Oskolkov
  2020-08-06 13:48 ` [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU peterz
  0 siblings, 2 replies; 8+ messages in thread
From: Peter Oskolkov @ 2020-08-06  0:08 UTC (permalink / raw)
  To: Mathieu Desnoyers, Paul E . McKenney, linux-kernel
  Cc: Paul Turner, Chris Kennelly, Peter Oskolkov, Peter Oskolkov

This patchset is based on Google-internal RSEQ
work done by Paul Turner and Andrew Hunter.

When working with per-CPU RSEQ-based memory allocations,
it is sometimes important to make sure that a global
memory location is no longer accessed from RSEQ critical
sections. For example, there can be two per-CPU lists,
one is "active" and accessed per-CPU, while another one
is inactive and worked on asynchronously "off CPU" (e.g.
garbage collection is performed). Then at some point
the two lists are swapped, and a fast RCU-like mechanism
is required to make sure that the previously active
list is no longer accessed.

This patch introduces such a mechanism: in short,
membarrier() syscall issues an IPI to a CPU, restarting
any potentially active RSEQ critical sections on the CPU,
with an option to restart RSEQ CSs on all CPUs.

The second patch in the patchset adds a selftest
of this feature.

Signed-off-by: Peter Oskolkov <posk@google.com>
---
 include/uapi/linux/membarrier.h |  8 ++++++
 kernel/sched/membarrier.c       | 51 +++++++++++++++++++++++++++++++--
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
index 5891d7614c8c..169ffb613397 100644
--- a/include/uapi/linux/membarrier.h
+++ b/include/uapi/linux/membarrier.h
@@ -114,6 +114,13 @@
  *                          If this command is not implemented by an
  *                          architecture, -EINVAL is returned.
  *                          Returns 0 on success.
+ * @MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU:
+ *                          If a thread belonging to the current process
+ *                          is currently in an RSEQ critical section on the
+ *                          CPU identified by flags parameter, restart it.
+ *                          @flags: if @flags >= 0, identifies the CPU to
+ *                                  restart RSEQ CS on; if == -1, restarts
+ *                                  RSEQ CSs on all CPUs.
  * @MEMBARRIER_CMD_SHARED:
  *                          Alias to MEMBARRIER_CMD_GLOBAL. Provided for
  *                          header backward compatibility.
@@ -131,6 +138,7 @@ enum membarrier_cmd {
 	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED		= (1 << 4),
 	MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE		= (1 << 5),
 	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE	= (1 << 6),
+	MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU		= (1 << 7),
 
 	/* Alias for header backward compatibility. */
 	MEMBARRIER_CMD_SHARED			= MEMBARRIER_CMD_GLOBAL,
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 168479a7d61b..edcc1386daf5 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -18,11 +18,19 @@
 #define MEMBARRIER_PRIVATE_EXPEDITED_SYNC_CORE_BITMASK	0
 #endif
 
+#ifdef CONFIG_RSEQ
+#define MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU_BITMASK		\
+	MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU
+#else
+#define MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU_BITMASK	0
+#endif
+
 #define MEMBARRIER_CMD_BITMASK						\
 	(MEMBARRIER_CMD_GLOBAL | MEMBARRIER_CMD_GLOBAL_EXPEDITED	\
 	| MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED			\
 	| MEMBARRIER_CMD_PRIVATE_EXPEDITED				\
 	| MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED			\
+	| MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU_BITMASK		\
 	| MEMBARRIER_PRIVATE_EXPEDITED_SYNC_CORE_BITMASK)
 
 static void ipi_mb(void *info)
@@ -308,10 +316,47 @@ static int membarrier_register_private_expedited(int flags)
 	return 0;
 }
 
+#ifdef CONFIG_RSEQ
+static void membarrier_rseq_ipi(void *arg)
+{
+	struct task_struct *leader = (struct task_struct *)arg;
+
+	if (current->group_leader != leader)  /* Not our process. */
+		return;
+	if (!current->rseq)  /* RSEQ not set up for current task/thread. */
+		return;
+
+	rseq_preempt(current);
+}
+#endif
+
+static int membarrier_private_restart_rseq_on_cpu(int cpu_id)
+{
+#ifdef CONFIG_RSEQ
+	/* syscalls are not allowed inside rseq critical sections. */
+	if (cpu_id == raw_smp_processor_id())
+		return 0;
+
+	if (cpu_id >= 0) {
+		return smp_call_function_single(cpu_id, membarrier_rseq_ipi,
+						current->group_leader, true);
+	} else if (cpu_id == -1) {
+		on_each_cpu(membarrier_rseq_ipi,
+			    current->group_leader, true);
+	} else {
+		return -EINVAL;
+	}
+#endif
+	return 0;
+}
+
 /**
  * sys_membarrier - issue memory barriers on a set of threads
  * @cmd:   Takes command values defined in enum membarrier_cmd.
- * @flags: Currently needs to be 0. For future extensions.
+ * @flags: Currently needs to be 0 for all commands other than
+ *         MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU: in the latter
+ *         case it indicates the CPU on which to interrupt (= restart)
+ *         the RSEQ critical section, or -1 to restart RSEQ CSs on all CPUs.
  *
  * If this system call is not implemented, -ENOSYS is returned. If the
  * command specified does not exist, not available on the running
@@ -339,7 +384,7 @@ static int membarrier_register_private_expedited(int flags)
  */
 SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
 {
-	if (unlikely(flags))
+	if (unlikely(flags && cmd != MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU))
 		return -EINVAL;
 	switch (cmd) {
 	case MEMBARRIER_CMD_QUERY:
@@ -369,6 +414,8 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
 		return membarrier_private_expedited(MEMBARRIER_FLAG_SYNC_CORE);
 	case MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE:
 		return membarrier_register_private_expedited(MEMBARRIER_FLAG_SYNC_CORE);
+	case MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU:
+		return membarrier_private_restart_rseq_on_cpu(flags);
 	default:
 		return -EINVAL;
 	}
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] selftests/rseq: test MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU
  2020-08-06  0:08 [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU Peter Oskolkov
@ 2020-08-06  0:08 ` Peter Oskolkov
  2020-08-06 13:48 ` [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU peterz
  1 sibling, 0 replies; 8+ messages in thread
From: Peter Oskolkov @ 2020-08-06  0:08 UTC (permalink / raw)
  To: Mathieu Desnoyers, Paul E . McKenney, linux-kernel
  Cc: Paul Turner, Chris Kennelly, Peter Oskolkov, Peter Oskolkov

Based on Google-internal RSEQ work done by
Paul Turner and Andrew Hunter.

This patch adds a selftest for MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU.
The test quite often fails without the previous patch in this patchset,
but consistently passes with it.

Signed-off-by: Peter Oskolkov <posk@google.com>
---
 .../selftests/rseq/basic_percpu_ops_test.c    | 181 ++++++++++++++++++
 1 file changed, 181 insertions(+)

diff --git a/tools/testing/selftests/rseq/basic_percpu_ops_test.c b/tools/testing/selftests/rseq/basic_percpu_ops_test.c
index eb3f6db36d36..147c80deac19 100644
--- a/tools/testing/selftests/rseq/basic_percpu_ops_test.c
+++ b/tools/testing/selftests/rseq/basic_percpu_ops_test.c
@@ -3,16 +3,21 @@
 #include <assert.h>
 #include <pthread.h>
 #include <sched.h>
+#include <stdatomic.h>
 #include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <stddef.h>
+#include <syscall.h>
+#include <unistd.h>
 
 #include "rseq.h"
 
 #define ARRAY_SIZE(arr)	(sizeof(arr) / sizeof((arr)[0]))
 
+#define MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU	(1<<7)
+
 struct percpu_lock_entry {
 	intptr_t v;
 } __attribute__((aligned(128)));
@@ -289,6 +294,180 @@ void test_percpu_list(void)
 	assert(sum == expected_sum);
 }
 
+struct test_membarrier_thread_args {
+	int stop;
+	intptr_t percpu_list_ptr;
+};
+
+/* Worker threads modify data in their "active" percpu lists. */
+void *test_membarrier_worker_thread(void *arg)
+{
+	struct test_membarrier_thread_args *args =
+		(struct test_membarrier_thread_args *)arg;
+	const int iters = 10 * 1000 * 1000;
+	int i;
+
+	if (rseq_register_current_thread()) {
+		fprintf(stderr, "Error: rseq_register_current_thread(...) failed(%d): %s\n",
+			errno, strerror(errno));
+		abort();
+	}
+
+	for (i = 0; i < iters; ++i) {
+		while (true) {
+			int cpu, ret;
+			struct percpu_list *list_ptr = (struct percpu_list *)
+				atomic_load(&args->percpu_list_ptr);
+
+			if (!list_ptr)
+				continue;  /* Not yet initialized. */
+
+			cpu = rseq_cpu_start();
+			struct percpu_list_node *node = list_ptr->c[cpu].head;
+			const intptr_t prev = node->data;
+
+			ret = rseq_cmpeqv_cmpeqv_storev(&node->data, prev,
+					&args->percpu_list_ptr,
+					(intptr_t)list_ptr, prev + 1, cpu);
+			if (!ret)
+				break;  /* Success. */
+		}
+	}
+
+	if (rseq_unregister_current_thread()) {
+		fprintf(stderr, "Error: rseq_unregister_current_thread(...) failed(%d): %s\n",
+			errno, strerror(errno));
+		abort();
+	}
+	return NULL;
+}
+
+void test_membarrier_init_percpu_list(struct percpu_list *list)
+{
+	int i;
+
+	memset(list, 0, sizeof(*list));
+	for (i = 0; i < CPU_SETSIZE; i++) {
+		struct percpu_list_node *node;
+
+		node = malloc(sizeof(*node));
+		assert(node);
+		node->data = 0;
+		node->next = NULL;
+		list->c[i].head = node;
+	}
+}
+
+void test_membarrier_free_percpu_list(struct percpu_list *list)
+{
+	int i;
+
+	for (i = 0; i < CPU_SETSIZE; i++)
+		free(list->c[i].head);
+}
+
+static int sys_membarrier(int cmd, int flags)
+{
+	return syscall(__NR_membarrier, cmd, flags);
+}
+
+/*
+ * The manager thread swaps per-cpu lists that worker threads see,
+ * and validates that there are no unexpected modifications.
+ */
+void *test_membarrier_manager_thread(void *arg)
+{
+	struct test_membarrier_thread_args *args =
+		(struct test_membarrier_thread_args *)arg;
+	struct percpu_list list_a, list_b;
+	intptr_t expect_a = 0, expect_b = 0;
+	int cpu_a = 0, cpu_b = 0;
+
+	if (rseq_register_current_thread()) {
+		fprintf(stderr, "Error: rseq_register_current_thread(...) failed(%d): %s\n",
+			errno, strerror(errno));
+		abort();
+	}
+
+	/* Init lists. */
+	test_membarrier_init_percpu_list(&list_a);
+	test_membarrier_init_percpu_list(&list_b);
+
+	atomic_store(&args->percpu_list_ptr, (intptr_t)&list_a);
+
+	while (!atomic_load(&args->stop)) {
+		/* list_a is "active". */
+		cpu_a = rand() % CPU_SETSIZE;
+		/*
+		 * As list_b is "inactive", we should never see changes
+		 * to list_b.
+		 */
+		if (expect_b != atomic_load(&list_b.c[cpu_b].head->data)) {
+			fprintf(stderr, "Membarrier test failed\n");
+			abort();
+		}
+
+		/* Make list_b "active". */
+		atomic_store(&args->percpu_list_ptr, (intptr_t)&list_b);
+		sys_membarrier(MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU, cpu_a);
+		/*
+		 * Cpu A should now only modify list_b, so we values
+		 * in list_a should be stable.
+		 */
+		expect_a = atomic_load(&list_a.c[cpu_a].head->data);
+
+		cpu_b = rand() % CPU_SETSIZE;
+		/*
+		 * As list_a is "inactive", we should never see changes
+		 * to list_a.
+		 */
+		if (expect_a != atomic_load(&list_a.c[cpu_a].head->data)) {
+			fprintf(stderr, "Membarrier test failed\n");
+			abort();
+		}
+
+		/* Make list_a "active". */
+		atomic_store(&args->percpu_list_ptr, (intptr_t)&list_a);
+		sys_membarrier(MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU, cpu_b);
+		/* Remember a value from list_b. */
+		expect_b = atomic_load(&list_b.c[cpu_b].head->data);
+	}
+
+	test_membarrier_free_percpu_list(&list_a);
+	test_membarrier_free_percpu_list(&list_b);
+
+	if (rseq_unregister_current_thread()) {
+		fprintf(stderr, "Error: rseq_unregister_current_thread(...) failed(%d): %s\n",
+			errno, strerror(errno));
+		abort();
+	}
+	return NULL;
+}
+
+/* Test MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU membarrier command. */
+void test_membarrier(void)
+{
+	struct test_membarrier_thread_args thread_args;
+	pthread_t worker_threads[CPU_SETSIZE];
+	pthread_t manager_thread;
+	int i;
+
+	thread_args.stop = 0;
+	thread_args.percpu_list_ptr = 0;
+	pthread_create(&manager_thread, NULL,
+		       test_membarrier_manager_thread, &thread_args);
+
+	for (i = 0; i < CPU_SETSIZE; i++)
+		pthread_create(&worker_threads[i], NULL,
+		       test_membarrier_worker_thread, &thread_args);
+
+	for (i = 0; i < CPU_SETSIZE; i++)
+		pthread_join(worker_threads[i], NULL);
+
+	atomic_store(&thread_args.stop, 1);
+	pthread_join(manager_thread, NULL);
+}
+
 int main(int argc, char **argv)
 {
 	if (rseq_register_current_thread()) {
@@ -300,6 +479,8 @@ int main(int argc, char **argv)
 	test_percpu_spinlock();
 	printf("percpu_list\n");
 	test_percpu_list();
+	printf("membarrier\n");
+	test_membarrier();
 	if (rseq_unregister_current_thread()) {
 		fprintf(stderr, "Error: rseq_unregister_current_thread(...) failed(%d): %s\n",
 			errno, strerror(errno));
-- 
2.28.0.163.g6104cc2f0b6-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU
  2020-08-06  0:08 [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU Peter Oskolkov
  2020-08-06  0:08 ` [PATCH 2/2] selftests/rseq: test MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU Peter Oskolkov
@ 2020-08-06 13:48 ` peterz
  2020-08-06 17:07   ` Peter Oskolkov
  1 sibling, 1 reply; 8+ messages in thread
From: peterz @ 2020-08-06 13:48 UTC (permalink / raw)
  To: Peter Oskolkov
  Cc: Mathieu Desnoyers, Paul E . McKenney, linux-kernel, Paul Turner,
	Chris Kennelly, Peter Oskolkov

On Wed, Aug 05, 2020 at 05:08:58PM -0700, Peter Oskolkov wrote:

Thanks for the Cc!

> + * @MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU:
> + *                          If a thread belonging to the current process
> + *                          is currently in an RSEQ critical section on the
> + *                          CPU identified by flags parameter, restart it.
> + *                          @flags: if @flags >= 0, identifies the CPU to
> + *                                  restart RSEQ CS on; if == -1, restarts
> + *                                  RSEQ CSs on all CPUs.

> +	} else if (cpu_id == -1) {
> +		on_each_cpu(membarrier_rseq_ipi,
> +			    current->group_leader, true);

This is an unpriv IPI the world. That's a big no-no.

Double so because all you want to target is the current process, which
you're defining as CLONE_THREAD, where the rest of this file uses
CLONE_VM to define a process.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU
  2020-08-06 13:48 ` [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU peterz
@ 2020-08-06 17:07   ` Peter Oskolkov
  2020-08-06 17:37     ` Mathieu Desnoyers
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Oskolkov @ 2020-08-06 17:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Peter Oskolkov, Mathieu Desnoyers, Paul E . McKenney,
	Linux Kernel Mailing List, Paul Turner, Chris Kennelly

On Thu, Aug 6, 2020 at 6:48 AM <peterz@infradead.org> wrote:
>
> On Wed, Aug 05, 2020 at 05:08:58PM -0700, Peter Oskolkov wrote:
>
> Thanks for the Cc!

Always a pleasure!

(Sorry, included only membarrier maintainers in v1; in v2 included
both membarrier and rseq maintainers).

>
> > + * @MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU:
> > + *                          If a thread belonging to the current process
> > + *                          is currently in an RSEQ critical section on the
> > + *                          CPU identified by flags parameter, restart it.
> > + *                          @flags: if @flags >= 0, identifies the CPU to
> > + *                                  restart RSEQ CS on; if == -1, restarts
> > + *                                  RSEQ CSs on all CPUs.
>
> > +     } else if (cpu_id == -1) {
> > +             on_each_cpu(membarrier_rseq_ipi,
> > +                         current->group_leader, true);
>
> This is an unpriv IPI the world. That's a big no-no.

removed in v2.

>
> Double so because all you want to target is the current process, which
> you're defining as CLONE_THREAD, where the rest of this file uses
> CLONE_VM to define a process.

Use current->mm in v2 instead of current->group_leader. Is it better this way?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU
  2020-08-06 17:07   ` Peter Oskolkov
@ 2020-08-06 17:37     ` Mathieu Desnoyers
  2020-08-07 17:48       ` Peter Oskolkov
  0 siblings, 1 reply; 8+ messages in thread
From: Mathieu Desnoyers @ 2020-08-06 17:37 UTC (permalink / raw)
  To: Peter Oskolkov
  Cc: Peter Zijlstra, Peter Oskolkov, paulmck, linux-kernel,
	Paul Turner, Chris Kennelly

----- On Aug 6, 2020, at 1:07 PM, Peter Oskolkov posk@posk.io wrote:

> On Thu, Aug 6, 2020 at 6:48 AM <peterz@infradead.org> wrote:
>>
>> On Wed, Aug 05, 2020 at 05:08:58PM -0700, Peter Oskolkov wrote:
>>
>> Thanks for the Cc!
> 
> Always a pleasure!
> 
> (Sorry, included only membarrier maintainers in v1; in v2 included
> both membarrier and rseq maintainers).
> 
>>
>> > + * @MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU:
>> > + *                          If a thread belonging to the current process
>> > + *                          is currently in an RSEQ critical section on the
>> > + *                          CPU identified by flags parameter, restart it.
>> > + *                          @flags: if @flags >= 0, identifies the CPU to
>> > + *                                  restart RSEQ CS on; if == -1, restarts
>> > + *                                  RSEQ CSs on all CPUs.
>>
>> > +     } else if (cpu_id == -1) {
>> > +             on_each_cpu(membarrier_rseq_ipi,
>> > +                         current->group_leader, true);
>>
>> This is an unpriv IPI the world. That's a big no-no.
> 
> removed in v2.

I don't think the feature must be removed, but its implementation needs adjustment.

How about we simply piggy-back on the membarrier schemes we already have, and
implement:

membarrier_register_private_expedited(MEMBARRIER_FLAG_RSEQ)
membarrier_private_expedited(MEMBARRIER_FLAG_RSEQ)

All the logic is there to prevent sending IPIs to runqueues which are not running
threads associated with the same mm. Considering that preemption does an rseq abort,
running a thread belonging to a different mm should mean that this CPU is not
currently executing an rseq critical section, or if it was, it has already been
aborted, so it is quiescent.

Then you'll probably want to change membarrier_private_expedited so it takes an
extra "cpu" argument. If cpu=-1, iterate on all runqueues like we currently do.
If cpu >= 0, only IPI that CPU if the thread currently running has the same mm.

Also, should this belong to the membarrier or the rseq system call ? It just
looks like the membarrier happens to implement very similar things for barriers,
but arguably this is really about rseq. I wonder if we should expose this through
rseq instead, even if we end up using membarrier code.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU
  2020-08-06 17:37     ` Mathieu Desnoyers
@ 2020-08-07 17:48       ` Peter Oskolkov
  2020-08-07 18:07         ` Mathieu Desnoyers
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Oskolkov @ 2020-08-07 17:48 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, Peter Oskolkov, paulmck, linux-kernel,
	Paul Turner, Chris Kennelly

On Thu, Aug 6, 2020 at 10:37 AM Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
>

> >>
> >> This is an unpriv IPI the world. That's a big no-no.
> >
> > removed in v2.
>
> I don't think the feature must be removed, but its implementation needs adjustment.
>
> How about we simply piggy-back on the membarrier schemes we already have, and
> implement:
>
> membarrier_register_private_expedited(MEMBARRIER_FLAG_RSEQ)
> membarrier_private_expedited(MEMBARRIER_FLAG_RSEQ)
>
> All the logic is there to prevent sending IPIs to runqueues which are not running
> threads associated with the same mm. Considering that preemption does an rseq abort,
> running a thread belonging to a different mm should mean that this CPU is not
> currently executing an rseq critical section, or if it was, it has already been
> aborted, so it is quiescent.
>
> Then you'll probably want to change membarrier_private_expedited so it takes an
> extra "cpu" argument. If cpu=-1, iterate on all runqueues like we currently do.
> If cpu >= 0, only IPI that CPU if the thread currently running has the same mm.
>

Thanks, Mathieu! I'll prepare something based on your and Peter's feedback.

> Also, should this belong to the membarrier or the rseq system call ? It just
> looks like the membarrier happens to implement very similar things for barriers,
> but arguably this is really about rseq. I wonder if we should expose this through
> rseq instead, even if we end up using membarrier code.

Yes, this is more about rseq; on the other hand, the high-level API/behavior
looks closer to that membarrier, and a lot of code will be shared.

As you are the maintainer for both rseq and membarrier, this is for
you to decide, I guess... :)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU
  2020-08-07 17:48       ` Peter Oskolkov
@ 2020-08-07 18:07         ` Mathieu Desnoyers
  2020-08-07 19:02           ` peterz
  0 siblings, 1 reply; 8+ messages in thread
From: Mathieu Desnoyers @ 2020-08-07 18:07 UTC (permalink / raw)
  To: Peter Oskolkov
  Cc: Peter Zijlstra, Peter Oskolkov, paulmck, linux-kernel,
	Paul Turner, Chris Kennelly

----- On Aug 7, 2020, at 1:48 PM, Peter Oskolkov posk@posk.io wrote:

> On Thu, Aug 6, 2020 at 10:37 AM Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
>>
[...]
>> Also, should this belong to the membarrier or the rseq system call ? It just
>> looks like the membarrier happens to implement very similar things for barriers,
>> but arguably this is really about rseq. I wonder if we should expose this
>> through
>> rseq instead, even if we end up using membarrier code.
> 
> Yes, this is more about rseq; on the other hand, the high-level API/behavior
> looks closer to that membarrier, and a lot of code will be shared.
> 
> As you are the maintainer for both rseq and membarrier, this is for
> you to decide, I guess... :)

Considering that membarrier has been made extensible with the cmd
argument, and on the other hand rseq can be extended with "flags", but is
currently only about registration/unregistration, I think adding a command
to membarrier is indeed a natural approach.

I am not very fond on re-purposing the membarrier flags parameter into a
cpu number though. Maybe we should tweak the membarrier system call so it
can expect 3 arguments instead ?

  int membarrier(int cmd, int flags, int cpu);

where cpu is only used for specific commands.

One thing I find weird about Peter's patch is that it adds a
MEMBERRIER_CMD_PRIVATE_EXPEDITED_RSEQ without a corresponding
MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ. Considering that
the SYNC_CORE variant already has its own register command, I
find it weird that the RSEQ counterpart does not have one.

Also, do we want to allow a RSEQ | SYNC_CORE private expedited
membarrier as well ? If that is the case, then we might want to
investigate exposing RSEQ-membarrier as a new membarrier flag
rather than as a stand-alone command.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU
  2020-08-07 18:07         ` Mathieu Desnoyers
@ 2020-08-07 19:02           ` peterz
  0 siblings, 0 replies; 8+ messages in thread
From: peterz @ 2020-08-07 19:02 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Oskolkov, Peter Oskolkov, paulmck, linux-kernel,
	Paul Turner, Chris Kennelly

On Fri, Aug 07, 2020 at 02:07:59PM -0400, Mathieu Desnoyers wrote:
> One thing I find weird about Peter's patch is that it adds a
> MEMBERRIER_CMD_PRIVATE_EXPEDITED_RSEQ without a corresponding
> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ. Considering that
> the SYNC_CORE variant already has its own register command, I
> find it weird that the RSEQ counterpart does not have one.

I thought the register thing was about global, not private membarriers.

Anyway, it was just a quick pseudo thing to show how one can go about
adding the rseq to the mm iteration we already have.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-08-07 19:02 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-06  0:08 [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU Peter Oskolkov
2020-08-06  0:08 ` [PATCH 2/2] selftests/rseq: test MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU Peter Oskolkov
2020-08-06 13:48 ` [PATCH 1/2] membarrier: add MEMBARRIER_CMD_PRIVATE_RESTART_RSEQ_ON_CPU peterz
2020-08-06 17:07   ` Peter Oskolkov
2020-08-06 17:37     ` Mathieu Desnoyers
2020-08-07 17:48       ` Peter Oskolkov
2020-08-07 18:07         ` Mathieu Desnoyers
2020-08-07 19:02           ` peterz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).