* [RFC PATCH 1/2] membarrier: speed up single-threaded private cmd
@ 2017-08-27 19:54 Mathieu Desnoyers
2017-08-27 19:54 ` [RFC PATCH 2/2] membarrier: provide register sync core cmd Mathieu Desnoyers
0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2017-08-27 19:54 UTC (permalink / raw)
To: Paul E . McKenney, Peter Zijlstra
Cc: linux-kernel, Mathieu Desnoyers, Boqun Feng, Andrew Hunter,
Maged Michael, gromer, Avi Kivity, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dave Watson
Single-threaded processes don't need to issue barriers, since the
thread's current CPU is the only one that matters.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Andrew Hunter <ahh@google.com>
CC: Maged Michael <maged.michael@gmail.com>
CC: gromer@google.com
CC: Avi Kivity <avi@scylladb.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Dave Watson <davejwatson@fb.com>
---
kernel/sched/membarrier.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index a92fddc22747..7eec6914d2d2 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -39,7 +39,7 @@ static void membarrier_private_expedited(void)
bool fallback = false;
cpumask_var_t tmpmask;
- if (num_online_cpus() == 1)
+ if (num_online_cpus() == 1 || get_nr_threads(current) == 1)
return;
/*
--
2.11.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [RFC PATCH 2/2] membarrier: provide register sync core cmd
2017-08-27 19:54 [RFC PATCH 1/2] membarrier: speed up single-threaded private cmd Mathieu Desnoyers
@ 2017-08-27 19:54 ` Mathieu Desnoyers
2017-08-27 20:35 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2017-08-27 19:54 UTC (permalink / raw)
To: Paul E . McKenney, Peter Zijlstra
Cc: linux-kernel, Mathieu Desnoyers, Boqun Feng, Andrew Hunter,
Maged Michael, gromer, Avi Kivity, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dave Watson, Andy Lutomirski,
Will Deacon, Hans Boehm
Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
system call. It allows processes to register their intent to have their
threads issue core serializing barriers in addition to memory barriers
whenever a membarrier command is performed.
It is relevant for reclaim of JIT code, which requires to issue core
serializing barriers on all threads running on behalf of a process
after ensuring the old code is not visible anymore, before re-using
memory for new code.
When a processes returns from a MEMBARRIER_CMD_REGISTER_SYNC_CORE
command, it is guaranteed that all following MEMBARRIER_CMD_SHARED and
MEMBARRIER_CMD_PRIVATE_EXPEDITED issue core serializing barriers, in
addition to the memory barriers, on all of its running threads.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Andrew Hunter <ahh@google.com>
CC: Maged Michael <maged.michael@gmail.com>
CC: gromer@google.com
CC: Avi Kivity <avi@scylladb.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Dave Watson <davejwatson@fb.com>
CC: Andy Lutomirski <luto@kernel.org>
CC: Will Deacon <will.deacon@arm.com>
CC: Hans Boehm <hboehm@google.com>
---
fs/exec.c | 1 +
include/linux/sched.h | 52 +++++++++++++++++++++++++++++++++++++++++
include/uapi/linux/membarrier.h | 1 +
kernel/fork.c | 2 ++
kernel/sched/core.c | 3 +++
kernel/sched/membarrier.c | 37 ++++++++++++++++++++++++++---
6 files changed, 93 insertions(+), 3 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 62175cbcc801..a4ab3253bac7 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1794,6 +1794,7 @@ static int do_execveat_common(int fd, struct filename *filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
+ membarrier_execve(current);
acct_update_integrals(current);
task_numa_free(current);
free_bprm(bprm);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8337e2db0bb2..b1ecdc4e8b84 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1086,6 +1086,9 @@ struct task_struct {
/* Used by LSM modules for access restriction: */
void *security;
#endif
+#ifdef CONFIG_MEMBARRIER
+ int membarrier_sync_core;
+#endif
/*
* New fields for task_struct should be added above here, so that
@@ -1623,4 +1626,53 @@ extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
#define TASK_SIZE_OF(tsk) TASK_SIZE
#endif
+#ifdef CONFIG_MEMBARRIER
+static inline void membarrier_fork(struct task_struct *t,
+ unsigned long clone_flags)
+{
+ /*
+ * Coherence of membarrier_sync_core against thread fork is
+ * protected by siglock. membarrier_fork is called with siglock
+ * held.
+ */
+ t->membarrier_sync_core = current->membarrier_sync_core;
+}
+static inline void membarrier_execve(struct task_struct *t)
+{
+ t->membarrier_sync_core = 0;
+}
+static inline void membarrier_sched_out(struct task_struct *t)
+{
+ /*
+ * Core serialization is performed before the memory barrier
+ * preceding the store to rq->curr.
+ */
+ if (unlikely(READ_ONCE(t->membarrier_sync_core)))
+ sync_core();
+}
+static inline void membarrier_sched_in(struct task_struct *t)
+{
+ /*
+ * Core serialization is performed after the memory barrier
+ * following the store to rq->curr.
+ */
+ if (unlikely(READ_ONCE(t->membarrier_sync_core)))
+ sync_core();
+}
+#else
+static inline void membarrier_fork(struct task_struct *t,
+ unsigned long clone_flags)
+{
+}
+static inline void membarrier_execve(struct task_struct *t)
+{
+}
+static inline void membarrier_sched_out(struct task_struct *t)
+{
+}
+static inline void membarrier_sched_in(struct task_struct *t)
+{
+}
+#endif
+
#endif
diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
index 6d47b3249d8a..bc26fcf8fb7b 100644
--- a/include/uapi/linux/membarrier.h
+++ b/include/uapi/linux/membarrier.h
@@ -67,6 +67,7 @@ enum membarrier_cmd {
/* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */
/* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */
MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3),
+ MEMBARRIER_CMD_REGISTER_SYNC_CORE = (1 << 4),
};
#endif /* _UAPI_LINUX_MEMBARRIER_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 17921b0390b4..713b3c932671 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1840,6 +1840,8 @@ static __latent_entropy struct task_struct *copy_process(
*/
copy_seccomp(p);
+ membarrier_fork(p, clone_flags);
+
/*
* Process group and session signals need to be delivered to just the
* parent before the fork or both the parent and the child after the
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0e36d9960d91..3ca27d066950 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3292,6 +3292,8 @@ static void __sched notrace __schedule(bool preempt)
local_irq_disable();
rcu_note_context_switch(preempt);
+ membarrier_sched_out(prev);
+
/*
* Make sure that signal_pending_state()->signal_pending() below
* can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
@@ -3364,6 +3366,7 @@ static void __sched notrace __schedule(bool preempt)
/* Also unlocks the rq: */
rq = context_switch(rq, prev, next, &rf);
+ membarrier_sched_in(next);
} else {
rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
rq_unlock_irq(rq, &rf);
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 7eec6914d2d2..f128caf64b49 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -25,12 +25,16 @@
* Bitmask made from a "or" of all commands within enum membarrier_cmd,
* except MEMBARRIER_CMD_QUERY.
*/
-#define MEMBARRIER_CMD_BITMASK \
- (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED)
+#define MEMBARRIER_CMD_BITMASK \
+ (MEMBARRIER_CMD_SHARED \
+ | MEMBARRIER_CMD_PRIVATE_EXPEDITED \
+ | MEMBARRIER_CMD_REGISTER_SYNC_CORE)
static void ipi_mb(void *info)
{
- smp_mb(); /* IPIs should be serializing but paranoid. */
+ /* IPIs should be serializing but paranoid. */
+ smp_mb();
+ sync_core();
}
static void membarrier_private_expedited(void)
@@ -96,6 +100,30 @@ static void membarrier_private_expedited(void)
smp_mb(); /* exit from system call is not a mb */
}
+static void membarrier_register_sync_core(void)
+{
+ struct task_struct *p = current, *t;
+
+ if (get_nr_threads(p) == 1) {
+ p->membarrier_sync_core = 1;
+ return;
+ }
+
+ /*
+ * Coherence of membarrier_sync_core against thread fork is
+ * protected by siglock.
+ */
+ spin_lock(&p->sighand->siglock);
+ for_each_thread(p, t)
+ WRITE_ONCE(t->membarrier_sync_core, 1);
+ spin_unlock(&p->sighand->siglock);
+ /*
+ * Ensure all future scheduler execution will observe the new
+ * membarrier_sync_core state for this process.
+ */
+ synchronize_sched();
+}
+
/**
* sys_membarrier - issue memory barriers on a set of threads
* @cmd: Takes command values defined in enum membarrier_cmd.
@@ -146,6 +174,9 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
membarrier_private_expedited();
return 0;
+ case MEMBARRIER_CMD_REGISTER_SYNC_CORE:
+ membarrier_register_sync_core();
+ return 0;
default:
return -EINVAL;
}
--
2.11.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] membarrier: provide register sync core cmd
2017-08-27 19:54 ` [RFC PATCH 2/2] membarrier: provide register sync core cmd Mathieu Desnoyers
@ 2017-08-27 20:35 ` Paul E. McKenney
2017-08-27 20:52 ` Mathieu Desnoyers
0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2017-08-27 20:35 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter,
Maged Michael, gromer, Avi Kivity, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dave Watson, Andy Lutomirski,
Will Deacon, Hans Boehm
On Sun, Aug 27, 2017 at 12:54:04PM -0700, Mathieu Desnoyers wrote:
> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
> system call. It allows processes to register their intent to have their
> threads issue core serializing barriers in addition to memory barriers
> whenever a membarrier command is performed.
>
> It is relevant for reclaim of JIT code, which requires to issue core
> serializing barriers on all threads running on behalf of a process
> after ensuring the old code is not visible anymore, before re-using
> memory for new code.
>
> When a processes returns from a MEMBARRIER_CMD_REGISTER_SYNC_CORE
> command, it is guaranteed that all following MEMBARRIER_CMD_SHARED and
> MEMBARRIER_CMD_PRIVATE_EXPEDITED issue core serializing barriers, in
> addition to the memory barriers, on all of its running threads.
I have queued both of these for testing and review:
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
f54cf5123eb9 ("membarrier: Speed up single-threaded private cmd")
0d6eb99818da ("membarrier: Provide register sync core cmd")
Could the people who asked for this please test it? The second commit
is non-obvious enough to need some careful review and intensive testing
before I can in good conscious pass it upstream.
Thanx, Paul
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> CC: Peter Zijlstra <peterz@infradead.org>
> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> CC: Boqun Feng <boqun.feng@gmail.com>
> CC: Andrew Hunter <ahh@google.com>
> CC: Maged Michael <maged.michael@gmail.com>
> CC: gromer@google.com
> CC: Avi Kivity <avi@scylladb.com>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Paul Mackerras <paulus@samba.org>
> CC: Michael Ellerman <mpe@ellerman.id.au>
> CC: Dave Watson <davejwatson@fb.com>
> CC: Andy Lutomirski <luto@kernel.org>
> CC: Will Deacon <will.deacon@arm.com>
> CC: Hans Boehm <hboehm@google.com>
> ---
> fs/exec.c | 1 +
> include/linux/sched.h | 52 +++++++++++++++++++++++++++++++++++++++++
> include/uapi/linux/membarrier.h | 1 +
> kernel/fork.c | 2 ++
> kernel/sched/core.c | 3 +++
> kernel/sched/membarrier.c | 37 ++++++++++++++++++++++++++---
> 6 files changed, 93 insertions(+), 3 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 62175cbcc801..a4ab3253bac7 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1794,6 +1794,7 @@ static int do_execveat_common(int fd, struct filename *filename,
> /* execve succeeded */
> current->fs->in_exec = 0;
> current->in_execve = 0;
> + membarrier_execve(current);
> acct_update_integrals(current);
> task_numa_free(current);
> free_bprm(bprm);
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8337e2db0bb2..b1ecdc4e8b84 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1086,6 +1086,9 @@ struct task_struct {
> /* Used by LSM modules for access restriction: */
> void *security;
> #endif
> +#ifdef CONFIG_MEMBARRIER
> + int membarrier_sync_core;
> +#endif
>
> /*
> * New fields for task_struct should be added above here, so that
> @@ -1623,4 +1626,53 @@ extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
> #define TASK_SIZE_OF(tsk) TASK_SIZE
> #endif
>
> +#ifdef CONFIG_MEMBARRIER
> +static inline void membarrier_fork(struct task_struct *t,
> + unsigned long clone_flags)
> +{
> + /*
> + * Coherence of membarrier_sync_core against thread fork is
> + * protected by siglock. membarrier_fork is called with siglock
> + * held.
> + */
> + t->membarrier_sync_core = current->membarrier_sync_core;
> +}
> +static inline void membarrier_execve(struct task_struct *t)
> +{
> + t->membarrier_sync_core = 0;
> +}
> +static inline void membarrier_sched_out(struct task_struct *t)
> +{
> + /*
> + * Core serialization is performed before the memory barrier
> + * preceding the store to rq->curr.
> + */
> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
> + sync_core();
> +}
> +static inline void membarrier_sched_in(struct task_struct *t)
> +{
> + /*
> + * Core serialization is performed after the memory barrier
> + * following the store to rq->curr.
> + */
> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
> + sync_core();
> +}
> +#else
> +static inline void membarrier_fork(struct task_struct *t,
> + unsigned long clone_flags)
> +{
> +}
> +static inline void membarrier_execve(struct task_struct *t)
> +{
> +}
> +static inline void membarrier_sched_out(struct task_struct *t)
> +{
> +}
> +static inline void membarrier_sched_in(struct task_struct *t)
> +{
> +}
> +#endif
> +
> #endif
> diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
> index 6d47b3249d8a..bc26fcf8fb7b 100644
> --- a/include/uapi/linux/membarrier.h
> +++ b/include/uapi/linux/membarrier.h
> @@ -67,6 +67,7 @@ enum membarrier_cmd {
> /* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */
> /* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */
> MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3),
> + MEMBARRIER_CMD_REGISTER_SYNC_CORE = (1 << 4),
> };
>
> #endif /* _UAPI_LINUX_MEMBARRIER_H */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 17921b0390b4..713b3c932671 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1840,6 +1840,8 @@ static __latent_entropy struct task_struct *copy_process(
> */
> copy_seccomp(p);
>
> + membarrier_fork(p, clone_flags);
> +
> /*
> * Process group and session signals need to be delivered to just the
> * parent before the fork or both the parent and the child after the
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0e36d9960d91..3ca27d066950 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3292,6 +3292,8 @@ static void __sched notrace __schedule(bool preempt)
> local_irq_disable();
> rcu_note_context_switch(preempt);
>
> + membarrier_sched_out(prev);
> +
> /*
> * Make sure that signal_pending_state()->signal_pending() below
> * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
> @@ -3364,6 +3366,7 @@ static void __sched notrace __schedule(bool preempt)
>
> /* Also unlocks the rq: */
> rq = context_switch(rq, prev, next, &rf);
> + membarrier_sched_in(next);
> } else {
> rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
> rq_unlock_irq(rq, &rf);
> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
> index 7eec6914d2d2..f128caf64b49 100644
> --- a/kernel/sched/membarrier.c
> +++ b/kernel/sched/membarrier.c
> @@ -25,12 +25,16 @@
> * Bitmask made from a "or" of all commands within enum membarrier_cmd,
> * except MEMBARRIER_CMD_QUERY.
> */
> -#define MEMBARRIER_CMD_BITMASK \
> - (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED)
> +#define MEMBARRIER_CMD_BITMASK \
> + (MEMBARRIER_CMD_SHARED \
> + | MEMBARRIER_CMD_PRIVATE_EXPEDITED \
> + | MEMBARRIER_CMD_REGISTER_SYNC_CORE)
>
> static void ipi_mb(void *info)
> {
> - smp_mb(); /* IPIs should be serializing but paranoid. */
> + /* IPIs should be serializing but paranoid. */
> + smp_mb();
> + sync_core();
> }
>
> static void membarrier_private_expedited(void)
> @@ -96,6 +100,30 @@ static void membarrier_private_expedited(void)
> smp_mb(); /* exit from system call is not a mb */
> }
>
> +static void membarrier_register_sync_core(void)
> +{
> + struct task_struct *p = current, *t;
> +
> + if (get_nr_threads(p) == 1) {
> + p->membarrier_sync_core = 1;
> + return;
> + }
> +
> + /*
> + * Coherence of membarrier_sync_core against thread fork is
> + * protected by siglock.
> + */
> + spin_lock(&p->sighand->siglock);
> + for_each_thread(p, t)
> + WRITE_ONCE(t->membarrier_sync_core, 1);
> + spin_unlock(&p->sighand->siglock);
> + /*
> + * Ensure all future scheduler execution will observe the new
> + * membarrier_sync_core state for this process.
> + */
> + synchronize_sched();
> +}
> +
> /**
> * sys_membarrier - issue memory barriers on a set of threads
> * @cmd: Takes command values defined in enum membarrier_cmd.
> @@ -146,6 +174,9 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
> membarrier_private_expedited();
> return 0;
> + case MEMBARRIER_CMD_REGISTER_SYNC_CORE:
> + membarrier_register_sync_core();
> + return 0;
> default:
> return -EINVAL;
> }
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] membarrier: provide register sync core cmd
2017-08-27 20:35 ` Paul E. McKenney
@ 2017-08-27 20:52 ` Mathieu Desnoyers
2017-08-27 21:00 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2017-08-27 20:52 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter,
maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dave Watson, Andy Lutomirski,
Will Deacon, Hans Boehm
----- On Aug 27, 2017, at 1:35 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com wrote:
> On Sun, Aug 27, 2017 at 12:54:04PM -0700, Mathieu Desnoyers wrote:
>> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
>> system call. It allows processes to register their intent to have their
>> threads issue core serializing barriers in addition to memory barriers
>> whenever a membarrier command is performed.
>>
>> It is relevant for reclaim of JIT code, which requires to issue core
>> serializing barriers on all threads running on behalf of a process
>> after ensuring the old code is not visible anymore, before re-using
>> memory for new code.
>>
>> When a processes returns from a MEMBARRIER_CMD_REGISTER_SYNC_CORE
>> command, it is guaranteed that all following MEMBARRIER_CMD_SHARED and
>> MEMBARRIER_CMD_PRIVATE_EXPEDITED issue core serializing barriers, in
>> addition to the memory barriers, on all of its running threads.
>
> I have queued both of these for testing and review:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> f54cf5123eb9 ("membarrier: Speed up single-threaded private cmd")
> 0d6eb99818da ("membarrier: Provide register sync core cmd")
>
> Could the people who asked for this please test it? The second commit
> is non-obvious enough to need some careful review and intensive testing
> before I can in good conscious pass it upstream.
Thanks Paul!
Can you pick up my v2 instead ? I added benchmarks in the changelog and
documented the new command in the membarrier public header. No changes
otherwise.
Thanks,
Mathieu
>
> Thanx, Paul
>
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> CC: Peter Zijlstra <peterz@infradead.org>
>> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> CC: Boqun Feng <boqun.feng@gmail.com>
>> CC: Andrew Hunter <ahh@google.com>
>> CC: Maged Michael <maged.michael@gmail.com>
>> CC: gromer@google.com
>> CC: Avi Kivity <avi@scylladb.com>
>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> CC: Paul Mackerras <paulus@samba.org>
>> CC: Michael Ellerman <mpe@ellerman.id.au>
>> CC: Dave Watson <davejwatson@fb.com>
>> CC: Andy Lutomirski <luto@kernel.org>
>> CC: Will Deacon <will.deacon@arm.com>
>> CC: Hans Boehm <hboehm@google.com>
>> ---
>> fs/exec.c | 1 +
>> include/linux/sched.h | 52 +++++++++++++++++++++++++++++++++++++++++
>> include/uapi/linux/membarrier.h | 1 +
>> kernel/fork.c | 2 ++
>> kernel/sched/core.c | 3 +++
>> kernel/sched/membarrier.c | 37 ++++++++++++++++++++++++++---
>> 6 files changed, 93 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index 62175cbcc801..a4ab3253bac7 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1794,6 +1794,7 @@ static int do_execveat_common(int fd, struct filename
>> *filename,
>> /* execve succeeded */
>> current->fs->in_exec = 0;
>> current->in_execve = 0;
>> + membarrier_execve(current);
>> acct_update_integrals(current);
>> task_numa_free(current);
>> free_bprm(bprm);
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 8337e2db0bb2..b1ecdc4e8b84 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1086,6 +1086,9 @@ struct task_struct {
>> /* Used by LSM modules for access restriction: */
>> void *security;
>> #endif
>> +#ifdef CONFIG_MEMBARRIER
>> + int membarrier_sync_core;
>> +#endif
>>
>> /*
>> * New fields for task_struct should be added above here, so that
>> @@ -1623,4 +1626,53 @@ extern long sched_getaffinity(pid_t pid, struct cpumask
>> *mask);
>> #define TASK_SIZE_OF(tsk) TASK_SIZE
>> #endif
>>
>> +#ifdef CONFIG_MEMBARRIER
>> +static inline void membarrier_fork(struct task_struct *t,
>> + unsigned long clone_flags)
>> +{
>> + /*
>> + * Coherence of membarrier_sync_core against thread fork is
>> + * protected by siglock. membarrier_fork is called with siglock
>> + * held.
>> + */
>> + t->membarrier_sync_core = current->membarrier_sync_core;
>> +}
>> +static inline void membarrier_execve(struct task_struct *t)
>> +{
>> + t->membarrier_sync_core = 0;
>> +}
>> +static inline void membarrier_sched_out(struct task_struct *t)
>> +{
>> + /*
>> + * Core serialization is performed before the memory barrier
>> + * preceding the store to rq->curr.
>> + */
>> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
>> + sync_core();
>> +}
>> +static inline void membarrier_sched_in(struct task_struct *t)
>> +{
>> + /*
>> + * Core serialization is performed after the memory barrier
>> + * following the store to rq->curr.
>> + */
>> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
>> + sync_core();
>> +}
>> +#else
>> +static inline void membarrier_fork(struct task_struct *t,
>> + unsigned long clone_flags)
>> +{
>> +}
>> +static inline void membarrier_execve(struct task_struct *t)
>> +{
>> +}
>> +static inline void membarrier_sched_out(struct task_struct *t)
>> +{
>> +}
>> +static inline void membarrier_sched_in(struct task_struct *t)
>> +{
>> +}
>> +#endif
>> +
>> #endif
>> diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
>> index 6d47b3249d8a..bc26fcf8fb7b 100644
>> --- a/include/uapi/linux/membarrier.h
>> +++ b/include/uapi/linux/membarrier.h
>> @@ -67,6 +67,7 @@ enum membarrier_cmd {
>> /* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */
>> /* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */
>> MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3),
>> + MEMBARRIER_CMD_REGISTER_SYNC_CORE = (1 << 4),
>> };
>>
>> #endif /* _UAPI_LINUX_MEMBARRIER_H */
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index 17921b0390b4..713b3c932671 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -1840,6 +1840,8 @@ static __latent_entropy struct task_struct *copy_process(
>> */
>> copy_seccomp(p);
>>
>> + membarrier_fork(p, clone_flags);
>> +
>> /*
>> * Process group and session signals need to be delivered to just the
>> * parent before the fork or both the parent and the child after the
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 0e36d9960d91..3ca27d066950 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -3292,6 +3292,8 @@ static void __sched notrace __schedule(bool preempt)
>> local_irq_disable();
>> rcu_note_context_switch(preempt);
>>
>> + membarrier_sched_out(prev);
>> +
>> /*
>> * Make sure that signal_pending_state()->signal_pending() below
>> * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
>> @@ -3364,6 +3366,7 @@ static void __sched notrace __schedule(bool preempt)
>>
>> /* Also unlocks the rq: */
>> rq = context_switch(rq, prev, next, &rf);
>> + membarrier_sched_in(next);
>> } else {
>> rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
>> rq_unlock_irq(rq, &rf);
>> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
>> index 7eec6914d2d2..f128caf64b49 100644
>> --- a/kernel/sched/membarrier.c
>> +++ b/kernel/sched/membarrier.c
>> @@ -25,12 +25,16 @@
>> * Bitmask made from a "or" of all commands within enum membarrier_cmd,
>> * except MEMBARRIER_CMD_QUERY.
>> */
>> -#define MEMBARRIER_CMD_BITMASK \
>> - (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED)
>> +#define MEMBARRIER_CMD_BITMASK \
>> + (MEMBARRIER_CMD_SHARED \
>> + | MEMBARRIER_CMD_PRIVATE_EXPEDITED \
>> + | MEMBARRIER_CMD_REGISTER_SYNC_CORE)
>>
>> static void ipi_mb(void *info)
>> {
>> - smp_mb(); /* IPIs should be serializing but paranoid. */
>> + /* IPIs should be serializing but paranoid. */
>> + smp_mb();
>> + sync_core();
>> }
>>
>> static void membarrier_private_expedited(void)
>> @@ -96,6 +100,30 @@ static void membarrier_private_expedited(void)
>> smp_mb(); /* exit from system call is not a mb */
>> }
>>
>> +static void membarrier_register_sync_core(void)
>> +{
>> + struct task_struct *p = current, *t;
>> +
>> + if (get_nr_threads(p) == 1) {
>> + p->membarrier_sync_core = 1;
>> + return;
>> + }
>> +
>> + /*
>> + * Coherence of membarrier_sync_core against thread fork is
>> + * protected by siglock.
>> + */
>> + spin_lock(&p->sighand->siglock);
>> + for_each_thread(p, t)
>> + WRITE_ONCE(t->membarrier_sync_core, 1);
>> + spin_unlock(&p->sighand->siglock);
>> + /*
>> + * Ensure all future scheduler execution will observe the new
>> + * membarrier_sync_core state for this process.
>> + */
>> + synchronize_sched();
>> +}
>> +
>> /**
>> * sys_membarrier - issue memory barriers on a set of threads
>> * @cmd: Takes command values defined in enum membarrier_cmd.
>> @@ -146,6 +174,9 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
>> case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
>> membarrier_private_expedited();
>> return 0;
>> + case MEMBARRIER_CMD_REGISTER_SYNC_CORE:
>> + membarrier_register_sync_core();
>> + return 0;
>> default:
>> return -EINVAL;
>> }
>> --
>> 2.11.0
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] membarrier: provide register sync core cmd
2017-08-27 20:52 ` Mathieu Desnoyers
@ 2017-08-27 21:00 ` Paul E. McKenney
2017-08-27 21:39 ` Mathieu Desnoyers
0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2017-08-27 21:00 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter,
maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dave Watson, Andy Lutomirski,
Will Deacon, Hans Boehm
On Sun, Aug 27, 2017 at 08:52:58PM +0000, Mathieu Desnoyers wrote:
> ----- On Aug 27, 2017, at 1:35 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com wrote:
>
> > On Sun, Aug 27, 2017 at 12:54:04PM -0700, Mathieu Desnoyers wrote:
> >> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
> >> system call. It allows processes to register their intent to have their
> >> threads issue core serializing barriers in addition to memory barriers
> >> whenever a membarrier command is performed.
> >>
> >> It is relevant for reclaim of JIT code, which requires to issue core
> >> serializing barriers on all threads running on behalf of a process
> >> after ensuring the old code is not visible anymore, before re-using
> >> memory for new code.
> >>
> >> When a processes returns from a MEMBARRIER_CMD_REGISTER_SYNC_CORE
> >> command, it is guaranteed that all following MEMBARRIER_CMD_SHARED and
> >> MEMBARRIER_CMD_PRIVATE_EXPEDITED issue core serializing barriers, in
> >> addition to the memory barriers, on all of its running threads.
> >
> > I have queued both of these for testing and review:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> > f54cf5123eb9 ("membarrier: Speed up single-threaded private cmd")
> > 0d6eb99818da ("membarrier: Provide register sync core cmd")
> >
> > Could the people who asked for this please test it? The second commit
> > is non-obvious enough to need some careful review and intensive testing
> > before I can in good conscious pass it upstream.
>
> Thanks Paul!
>
> Can you pick up my v2 instead ? I added benchmarks in the changelog and
> documented the new command in the membarrier public header. No changes
> otherwise.
It looks like 0day Test Robot also wants some additional header files.
Could you please send me a v3 with its complaints addressed?
Thanx, Paul
> Thanks,
>
> Mathieu
>
> >
> > Thanx, Paul
> >
> >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> >> CC: Peter Zijlstra <peterz@infradead.org>
> >> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >> CC: Boqun Feng <boqun.feng@gmail.com>
> >> CC: Andrew Hunter <ahh@google.com>
> >> CC: Maged Michael <maged.michael@gmail.com>
> >> CC: gromer@google.com
> >> CC: Avi Kivity <avi@scylladb.com>
> >> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> >> CC: Paul Mackerras <paulus@samba.org>
> >> CC: Michael Ellerman <mpe@ellerman.id.au>
> >> CC: Dave Watson <davejwatson@fb.com>
> >> CC: Andy Lutomirski <luto@kernel.org>
> >> CC: Will Deacon <will.deacon@arm.com>
> >> CC: Hans Boehm <hboehm@google.com>
> >> ---
> >> fs/exec.c | 1 +
> >> include/linux/sched.h | 52 +++++++++++++++++++++++++++++++++++++++++
> >> include/uapi/linux/membarrier.h | 1 +
> >> kernel/fork.c | 2 ++
> >> kernel/sched/core.c | 3 +++
> >> kernel/sched/membarrier.c | 37 ++++++++++++++++++++++++++---
> >> 6 files changed, 93 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/fs/exec.c b/fs/exec.c
> >> index 62175cbcc801..a4ab3253bac7 100644
> >> --- a/fs/exec.c
> >> +++ b/fs/exec.c
> >> @@ -1794,6 +1794,7 @@ static int do_execveat_common(int fd, struct filename
> >> *filename,
> >> /* execve succeeded */
> >> current->fs->in_exec = 0;
> >> current->in_execve = 0;
> >> + membarrier_execve(current);
> >> acct_update_integrals(current);
> >> task_numa_free(current);
> >> free_bprm(bprm);
> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
> >> index 8337e2db0bb2..b1ecdc4e8b84 100644
> >> --- a/include/linux/sched.h
> >> +++ b/include/linux/sched.h
> >> @@ -1086,6 +1086,9 @@ struct task_struct {
> >> /* Used by LSM modules for access restriction: */
> >> void *security;
> >> #endif
> >> +#ifdef CONFIG_MEMBARRIER
> >> + int membarrier_sync_core;
> >> +#endif
> >>
> >> /*
> >> * New fields for task_struct should be added above here, so that
> >> @@ -1623,4 +1626,53 @@ extern long sched_getaffinity(pid_t pid, struct cpumask
> >> *mask);
> >> #define TASK_SIZE_OF(tsk) TASK_SIZE
> >> #endif
> >>
> >> +#ifdef CONFIG_MEMBARRIER
> >> +static inline void membarrier_fork(struct task_struct *t,
> >> + unsigned long clone_flags)
> >> +{
> >> + /*
> >> + * Coherence of membarrier_sync_core against thread fork is
> >> + * protected by siglock. membarrier_fork is called with siglock
> >> + * held.
> >> + */
> >> + t->membarrier_sync_core = current->membarrier_sync_core;
> >> +}
> >> +static inline void membarrier_execve(struct task_struct *t)
> >> +{
> >> + t->membarrier_sync_core = 0;
> >> +}
> >> +static inline void membarrier_sched_out(struct task_struct *t)
> >> +{
> >> + /*
> >> + * Core serialization is performed before the memory barrier
> >> + * preceding the store to rq->curr.
> >> + */
> >> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
> >> + sync_core();
> >> +}
> >> +static inline void membarrier_sched_in(struct task_struct *t)
> >> +{
> >> + /*
> >> + * Core serialization is performed after the memory barrier
> >> + * following the store to rq->curr.
> >> + */
> >> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
> >> + sync_core();
> >> +}
> >> +#else
> >> +static inline void membarrier_fork(struct task_struct *t,
> >> + unsigned long clone_flags)
> >> +{
> >> +}
> >> +static inline void membarrier_execve(struct task_struct *t)
> >> +{
> >> +}
> >> +static inline void membarrier_sched_out(struct task_struct *t)
> >> +{
> >> +}
> >> +static inline void membarrier_sched_in(struct task_struct *t)
> >> +{
> >> +}
> >> +#endif
> >> +
> >> #endif
> >> diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
> >> index 6d47b3249d8a..bc26fcf8fb7b 100644
> >> --- a/include/uapi/linux/membarrier.h
> >> +++ b/include/uapi/linux/membarrier.h
> >> @@ -67,6 +67,7 @@ enum membarrier_cmd {
> >> /* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */
> >> /* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */
> >> MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3),
> >> + MEMBARRIER_CMD_REGISTER_SYNC_CORE = (1 << 4),
> >> };
> >>
> >> #endif /* _UAPI_LINUX_MEMBARRIER_H */
> >> diff --git a/kernel/fork.c b/kernel/fork.c
> >> index 17921b0390b4..713b3c932671 100644
> >> --- a/kernel/fork.c
> >> +++ b/kernel/fork.c
> >> @@ -1840,6 +1840,8 @@ static __latent_entropy struct task_struct *copy_process(
> >> */
> >> copy_seccomp(p);
> >>
> >> + membarrier_fork(p, clone_flags);
> >> +
> >> /*
> >> * Process group and session signals need to be delivered to just the
> >> * parent before the fork or both the parent and the child after the
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index 0e36d9960d91..3ca27d066950 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -3292,6 +3292,8 @@ static void __sched notrace __schedule(bool preempt)
> >> local_irq_disable();
> >> rcu_note_context_switch(preempt);
> >>
> >> + membarrier_sched_out(prev);
> >> +
> >> /*
> >> * Make sure that signal_pending_state()->signal_pending() below
> >> * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
> >> @@ -3364,6 +3366,7 @@ static void __sched notrace __schedule(bool preempt)
> >>
> >> /* Also unlocks the rq: */
> >> rq = context_switch(rq, prev, next, &rf);
> >> + membarrier_sched_in(next);
> >> } else {
> >> rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
> >> rq_unlock_irq(rq, &rf);
> >> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
> >> index 7eec6914d2d2..f128caf64b49 100644
> >> --- a/kernel/sched/membarrier.c
> >> +++ b/kernel/sched/membarrier.c
> >> @@ -25,12 +25,16 @@
> >> * Bitmask made from a "or" of all commands within enum membarrier_cmd,
> >> * except MEMBARRIER_CMD_QUERY.
> >> */
> >> -#define MEMBARRIER_CMD_BITMASK \
> >> - (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED)
> >> +#define MEMBARRIER_CMD_BITMASK \
> >> + (MEMBARRIER_CMD_SHARED \
> >> + | MEMBARRIER_CMD_PRIVATE_EXPEDITED \
> >> + | MEMBARRIER_CMD_REGISTER_SYNC_CORE)
> >>
> >> static void ipi_mb(void *info)
> >> {
> >> - smp_mb(); /* IPIs should be serializing but paranoid. */
> >> + /* IPIs should be serializing but paranoid. */
> >> + smp_mb();
> >> + sync_core();
> >> }
> >>
> >> static void membarrier_private_expedited(void)
> >> @@ -96,6 +100,30 @@ static void membarrier_private_expedited(void)
> >> smp_mb(); /* exit from system call is not a mb */
> >> }
> >>
> >> +static void membarrier_register_sync_core(void)
> >> +{
> >> + struct task_struct *p = current, *t;
> >> +
> >> + if (get_nr_threads(p) == 1) {
> >> + p->membarrier_sync_core = 1;
> >> + return;
> >> + }
> >> +
> >> + /*
> >> + * Coherence of membarrier_sync_core against thread fork is
> >> + * protected by siglock.
> >> + */
> >> + spin_lock(&p->sighand->siglock);
> >> + for_each_thread(p, t)
> >> + WRITE_ONCE(t->membarrier_sync_core, 1);
> >> + spin_unlock(&p->sighand->siglock);
> >> + /*
> >> + * Ensure all future scheduler execution will observe the new
> >> + * membarrier_sync_core state for this process.
> >> + */
> >> + synchronize_sched();
> >> +}
> >> +
> >> /**
> >> * sys_membarrier - issue memory barriers on a set of threads
> >> * @cmd: Takes command values defined in enum membarrier_cmd.
> >> @@ -146,6 +174,9 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> >> case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
> >> membarrier_private_expedited();
> >> return 0;
> >> + case MEMBARRIER_CMD_REGISTER_SYNC_CORE:
> >> + membarrier_register_sync_core();
> >> + return 0;
> >> default:
> >> return -EINVAL;
> >> }
> >> --
> >> 2.11.0
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] membarrier: provide register sync core cmd
2017-08-27 21:00 ` Paul E. McKenney
@ 2017-08-27 21:39 ` Mathieu Desnoyers
2017-08-28 17:49 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2017-08-27 21:39 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter,
maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dave Watson, Andy Lutomirski,
Will Deacon, Hans Boehm
----- On Aug 27, 2017, at 2:00 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com wrote:
> On Sun, Aug 27, 2017 at 08:52:58PM +0000, Mathieu Desnoyers wrote:
>> ----- On Aug 27, 2017, at 1:35 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com
>> wrote:
>>
>> > On Sun, Aug 27, 2017 at 12:54:04PM -0700, Mathieu Desnoyers wrote:
>> >> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
>> >> system call. It allows processes to register their intent to have their
>> >> threads issue core serializing barriers in addition to memory barriers
>> >> whenever a membarrier command is performed.
>> >>
>> >> It is relevant for reclaim of JIT code, which requires to issue core
>> >> serializing barriers on all threads running on behalf of a process
>> >> after ensuring the old code is not visible anymore, before re-using
>> >> memory for new code.
>> >>
>> >> When a processes returns from a MEMBARRIER_CMD_REGISTER_SYNC_CORE
>> >> command, it is guaranteed that all following MEMBARRIER_CMD_SHARED and
>> >> MEMBARRIER_CMD_PRIVATE_EXPEDITED issue core serializing barriers, in
>> >> addition to the memory barriers, on all of its running threads.
>> >
>> > I have queued both of these for testing and review:
>> >
>> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>> > f54cf5123eb9 ("membarrier: Speed up single-threaded private cmd")
>> > 0d6eb99818da ("membarrier: Provide register sync core cmd")
>> >
>> > Could the people who asked for this please test it? The second commit
>> > is non-obvious enough to need some careful review and intensive testing
>> > before I can in good conscious pass it upstream.
>>
>> Thanks Paul!
>>
>> Can you pick up my v2 instead ? I added benchmarks in the changelog and
>> documented the new command in the membarrier public header. No changes
>> otherwise.
>
> It looks like 0day Test Robot also wants some additional header files.
> Could you please send me a v3 with its complaints addressed?
I sent a separate patch implementing sync_core() on xtensa. It should
be applied before this patch.
Hopefully we can get some testing from people with xtensa testing
environments.
Thanks,
Mathieu
>
> Thanx, Paul
>
>> Thanks,
>>
>> Mathieu
>>
>> >
>> > Thanx, Paul
>> >
>> >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> >> CC: Peter Zijlstra <peterz@infradead.org>
>> >> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> >> CC: Boqun Feng <boqun.feng@gmail.com>
>> >> CC: Andrew Hunter <ahh@google.com>
>> >> CC: Maged Michael <maged.michael@gmail.com>
>> >> CC: gromer@google.com
>> >> CC: Avi Kivity <avi@scylladb.com>
>> >> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> >> CC: Paul Mackerras <paulus@samba.org>
>> >> CC: Michael Ellerman <mpe@ellerman.id.au>
>> >> CC: Dave Watson <davejwatson@fb.com>
>> >> CC: Andy Lutomirski <luto@kernel.org>
>> >> CC: Will Deacon <will.deacon@arm.com>
>> >> CC: Hans Boehm <hboehm@google.com>
>> >> ---
>> >> fs/exec.c | 1 +
>> >> include/linux/sched.h | 52 +++++++++++++++++++++++++++++++++++++++++
>> >> include/uapi/linux/membarrier.h | 1 +
>> >> kernel/fork.c | 2 ++
>> >> kernel/sched/core.c | 3 +++
>> >> kernel/sched/membarrier.c | 37 ++++++++++++++++++++++++++---
>> >> 6 files changed, 93 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/fs/exec.c b/fs/exec.c
>> >> index 62175cbcc801..a4ab3253bac7 100644
>> >> --- a/fs/exec.c
>> >> +++ b/fs/exec.c
>> >> @@ -1794,6 +1794,7 @@ static int do_execveat_common(int fd, struct filename
>> >> *filename,
>> >> /* execve succeeded */
>> >> current->fs->in_exec = 0;
>> >> current->in_execve = 0;
>> >> + membarrier_execve(current);
>> >> acct_update_integrals(current);
>> >> task_numa_free(current);
>> >> free_bprm(bprm);
>> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> >> index 8337e2db0bb2..b1ecdc4e8b84 100644
>> >> --- a/include/linux/sched.h
>> >> +++ b/include/linux/sched.h
>> >> @@ -1086,6 +1086,9 @@ struct task_struct {
>> >> /* Used by LSM modules for access restriction: */
>> >> void *security;
>> >> #endif
>> >> +#ifdef CONFIG_MEMBARRIER
>> >> + int membarrier_sync_core;
>> >> +#endif
>> >>
>> >> /*
>> >> * New fields for task_struct should be added above here, so that
>> >> @@ -1623,4 +1626,53 @@ extern long sched_getaffinity(pid_t pid, struct cpumask
>> >> *mask);
>> >> #define TASK_SIZE_OF(tsk) TASK_SIZE
>> >> #endif
>> >>
>> >> +#ifdef CONFIG_MEMBARRIER
>> >> +static inline void membarrier_fork(struct task_struct *t,
>> >> + unsigned long clone_flags)
>> >> +{
>> >> + /*
>> >> + * Coherence of membarrier_sync_core against thread fork is
>> >> + * protected by siglock. membarrier_fork is called with siglock
>> >> + * held.
>> >> + */
>> >> + t->membarrier_sync_core = current->membarrier_sync_core;
>> >> +}
>> >> +static inline void membarrier_execve(struct task_struct *t)
>> >> +{
>> >> + t->membarrier_sync_core = 0;
>> >> +}
>> >> +static inline void membarrier_sched_out(struct task_struct *t)
>> >> +{
>> >> + /*
>> >> + * Core serialization is performed before the memory barrier
>> >> + * preceding the store to rq->curr.
>> >> + */
>> >> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
>> >> + sync_core();
>> >> +}
>> >> +static inline void membarrier_sched_in(struct task_struct *t)
>> >> +{
>> >> + /*
>> >> + * Core serialization is performed after the memory barrier
>> >> + * following the store to rq->curr.
>> >> + */
>> >> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
>> >> + sync_core();
>> >> +}
>> >> +#else
>> >> +static inline void membarrier_fork(struct task_struct *t,
>> >> + unsigned long clone_flags)
>> >> +{
>> >> +}
>> >> +static inline void membarrier_execve(struct task_struct *t)
>> >> +{
>> >> +}
>> >> +static inline void membarrier_sched_out(struct task_struct *t)
>> >> +{
>> >> +}
>> >> +static inline void membarrier_sched_in(struct task_struct *t)
>> >> +{
>> >> +}
>> >> +#endif
>> >> +
>> >> #endif
>> >> diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
>> >> index 6d47b3249d8a..bc26fcf8fb7b 100644
>> >> --- a/include/uapi/linux/membarrier.h
>> >> +++ b/include/uapi/linux/membarrier.h
>> >> @@ -67,6 +67,7 @@ enum membarrier_cmd {
>> >> /* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */
>> >> /* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */
>> >> MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3),
>> >> + MEMBARRIER_CMD_REGISTER_SYNC_CORE = (1 << 4),
>> >> };
>> >>
>> >> #endif /* _UAPI_LINUX_MEMBARRIER_H */
>> >> diff --git a/kernel/fork.c b/kernel/fork.c
>> >> index 17921b0390b4..713b3c932671 100644
>> >> --- a/kernel/fork.c
>> >> +++ b/kernel/fork.c
>> >> @@ -1840,6 +1840,8 @@ static __latent_entropy struct task_struct *copy_process(
>> >> */
>> >> copy_seccomp(p);
>> >>
>> >> + membarrier_fork(p, clone_flags);
>> >> +
>> >> /*
>> >> * Process group and session signals need to be delivered to just the
>> >> * parent before the fork or both the parent and the child after the
>> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> >> index 0e36d9960d91..3ca27d066950 100644
>> >> --- a/kernel/sched/core.c
>> >> +++ b/kernel/sched/core.c
>> >> @@ -3292,6 +3292,8 @@ static void __sched notrace __schedule(bool preempt)
>> >> local_irq_disable();
>> >> rcu_note_context_switch(preempt);
>> >>
>> >> + membarrier_sched_out(prev);
>> >> +
>> >> /*
>> >> * Make sure that signal_pending_state()->signal_pending() below
>> >> * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
>> >> @@ -3364,6 +3366,7 @@ static void __sched notrace __schedule(bool preempt)
>> >>
>> >> /* Also unlocks the rq: */
>> >> rq = context_switch(rq, prev, next, &rf);
>> >> + membarrier_sched_in(next);
>> >> } else {
>> >> rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
>> >> rq_unlock_irq(rq, &rf);
>> >> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
>> >> index 7eec6914d2d2..f128caf64b49 100644
>> >> --- a/kernel/sched/membarrier.c
>> >> +++ b/kernel/sched/membarrier.c
>> >> @@ -25,12 +25,16 @@
>> >> * Bitmask made from a "or" of all commands within enum membarrier_cmd,
>> >> * except MEMBARRIER_CMD_QUERY.
>> >> */
>> >> -#define MEMBARRIER_CMD_BITMASK \
>> >> - (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED)
>> >> +#define MEMBARRIER_CMD_BITMASK \
>> >> + (MEMBARRIER_CMD_SHARED \
>> >> + | MEMBARRIER_CMD_PRIVATE_EXPEDITED \
>> >> + | MEMBARRIER_CMD_REGISTER_SYNC_CORE)
>> >>
>> >> static void ipi_mb(void *info)
>> >> {
>> >> - smp_mb(); /* IPIs should be serializing but paranoid. */
>> >> + /* IPIs should be serializing but paranoid. */
>> >> + smp_mb();
>> >> + sync_core();
>> >> }
>> >>
>> >> static void membarrier_private_expedited(void)
>> >> @@ -96,6 +100,30 @@ static void membarrier_private_expedited(void)
>> >> smp_mb(); /* exit from system call is not a mb */
>> >> }
>> >>
>> >> +static void membarrier_register_sync_core(void)
>> >> +{
>> >> + struct task_struct *p = current, *t;
>> >> +
>> >> + if (get_nr_threads(p) == 1) {
>> >> + p->membarrier_sync_core = 1;
>> >> + return;
>> >> + }
>> >> +
>> >> + /*
>> >> + * Coherence of membarrier_sync_core against thread fork is
>> >> + * protected by siglock.
>> >> + */
>> >> + spin_lock(&p->sighand->siglock);
>> >> + for_each_thread(p, t)
>> >> + WRITE_ONCE(t->membarrier_sync_core, 1);
>> >> + spin_unlock(&p->sighand->siglock);
>> >> + /*
>> >> + * Ensure all future scheduler execution will observe the new
>> >> + * membarrier_sync_core state for this process.
>> >> + */
>> >> + synchronize_sched();
>> >> +}
>> >> +
>> >> /**
>> >> * sys_membarrier - issue memory barriers on a set of threads
>> >> * @cmd: Takes command values defined in enum membarrier_cmd.
>> >> @@ -146,6 +174,9 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
>> >> case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
>> >> membarrier_private_expedited();
>> >> return 0;
>> >> + case MEMBARRIER_CMD_REGISTER_SYNC_CORE:
>> >> + membarrier_register_sync_core();
>> >> + return 0;
>> >> default:
>> >> return -EINVAL;
>> >> }
>> >> --
>> >> 2.11.0
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] membarrier: provide register sync core cmd
2017-08-27 21:39 ` Mathieu Desnoyers
@ 2017-08-28 17:49 ` Paul E. McKenney
2017-08-30 17:53 ` Mathieu Desnoyers
0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2017-08-28 17:49 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter,
maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dave Watson, Andy Lutomirski,
Will Deacon, Hans Boehm
On Sun, Aug 27, 2017 at 09:39:54PM +0000, Mathieu Desnoyers wrote:
> ----- On Aug 27, 2017, at 2:00 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com wrote:
>
> > On Sun, Aug 27, 2017 at 08:52:58PM +0000, Mathieu Desnoyers wrote:
> >> ----- On Aug 27, 2017, at 1:35 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com
> >> wrote:
> >>
> >> > On Sun, Aug 27, 2017 at 12:54:04PM -0700, Mathieu Desnoyers wrote:
> >> >> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
> >> >> system call. It allows processes to register their intent to have their
> >> >> threads issue core serializing barriers in addition to memory barriers
> >> >> whenever a membarrier command is performed.
> >> >>
> >> >> It is relevant for reclaim of JIT code, which requires to issue core
> >> >> serializing barriers on all threads running on behalf of a process
> >> >> after ensuring the old code is not visible anymore, before re-using
> >> >> memory for new code.
> >> >>
> >> >> When a processes returns from a MEMBARRIER_CMD_REGISTER_SYNC_CORE
> >> >> command, it is guaranteed that all following MEMBARRIER_CMD_SHARED and
> >> >> MEMBARRIER_CMD_PRIVATE_EXPEDITED issue core serializing barriers, in
> >> >> addition to the memory barriers, on all of its running threads.
> >> >
> >> > I have queued both of these for testing and review:
> >> >
> >> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> >> > f54cf5123eb9 ("membarrier: Speed up single-threaded private cmd")
> >> > 0d6eb99818da ("membarrier: Provide register sync core cmd")
> >> >
> >> > Could the people who asked for this please test it? The second commit
> >> > is non-obvious enough to need some careful review and intensive testing
> >> > before I can in good conscious pass it upstream.
> >>
> >> Thanks Paul!
> >>
> >> Can you pick up my v2 instead ? I added benchmarks in the changelog and
> >> documented the new command in the membarrier public header. No changes
> >> otherwise.
> >
> > It looks like 0day Test Robot also wants some additional header files.
> > Could you please send me a v3 with its complaints addressed?
>
> I sent a separate patch implementing sync_core() on xtensa. It should
> be applied before this patch.
And it looks like we also need a sync_core() on every CPU family other
than x86, which already has one. Do we want to make this capability
depend on an ARCH_ kconfig option, or should we just bit the bullet and
implement sync_core() everywhere?
(I had to drop this commit due to downstream complaints. Left to myself,
I would take the ARCH_ approach, then add it back in, but please let me
know how you would like to proceed.)
Thanx, Paul
> Hopefully we can get some testing from people with xtensa testing
> environments.
>
> Thanks,
>
> Mathieu
>
> >
> > Thanx, Paul
> >
> >> Thanks,
> >>
> >> Mathieu
> >>
> >> >
> >> > Thanx, Paul
> >> >
> >> >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> >> >> CC: Peter Zijlstra <peterz@infradead.org>
> >> >> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >> >> CC: Boqun Feng <boqun.feng@gmail.com>
> >> >> CC: Andrew Hunter <ahh@google.com>
> >> >> CC: Maged Michael <maged.michael@gmail.com>
> >> >> CC: gromer@google.com
> >> >> CC: Avi Kivity <avi@scylladb.com>
> >> >> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> >> >> CC: Paul Mackerras <paulus@samba.org>
> >> >> CC: Michael Ellerman <mpe@ellerman.id.au>
> >> >> CC: Dave Watson <davejwatson@fb.com>
> >> >> CC: Andy Lutomirski <luto@kernel.org>
> >> >> CC: Will Deacon <will.deacon@arm.com>
> >> >> CC: Hans Boehm <hboehm@google.com>
> >> >> ---
> >> >> fs/exec.c | 1 +
> >> >> include/linux/sched.h | 52 +++++++++++++++++++++++++++++++++++++++++
> >> >> include/uapi/linux/membarrier.h | 1 +
> >> >> kernel/fork.c | 2 ++
> >> >> kernel/sched/core.c | 3 +++
> >> >> kernel/sched/membarrier.c | 37 ++++++++++++++++++++++++++---
> >> >> 6 files changed, 93 insertions(+), 3 deletions(-)
> >> >>
> >> >> diff --git a/fs/exec.c b/fs/exec.c
> >> >> index 62175cbcc801..a4ab3253bac7 100644
> >> >> --- a/fs/exec.c
> >> >> +++ b/fs/exec.c
> >> >> @@ -1794,6 +1794,7 @@ static int do_execveat_common(int fd, struct filename
> >> >> *filename,
> >> >> /* execve succeeded */
> >> >> current->fs->in_exec = 0;
> >> >> current->in_execve = 0;
> >> >> + membarrier_execve(current);
> >> >> acct_update_integrals(current);
> >> >> task_numa_free(current);
> >> >> free_bprm(bprm);
> >> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
> >> >> index 8337e2db0bb2..b1ecdc4e8b84 100644
> >> >> --- a/include/linux/sched.h
> >> >> +++ b/include/linux/sched.h
> >> >> @@ -1086,6 +1086,9 @@ struct task_struct {
> >> >> /* Used by LSM modules for access restriction: */
> >> >> void *security;
> >> >> #endif
> >> >> +#ifdef CONFIG_MEMBARRIER
> >> >> + int membarrier_sync_core;
> >> >> +#endif
> >> >>
> >> >> /*
> >> >> * New fields for task_struct should be added above here, so that
> >> >> @@ -1623,4 +1626,53 @@ extern long sched_getaffinity(pid_t pid, struct cpumask
> >> >> *mask);
> >> >> #define TASK_SIZE_OF(tsk) TASK_SIZE
> >> >> #endif
> >> >>
> >> >> +#ifdef CONFIG_MEMBARRIER
> >> >> +static inline void membarrier_fork(struct task_struct *t,
> >> >> + unsigned long clone_flags)
> >> >> +{
> >> >> + /*
> >> >> + * Coherence of membarrier_sync_core against thread fork is
> >> >> + * protected by siglock. membarrier_fork is called with siglock
> >> >> + * held.
> >> >> + */
> >> >> + t->membarrier_sync_core = current->membarrier_sync_core;
> >> >> +}
> >> >> +static inline void membarrier_execve(struct task_struct *t)
> >> >> +{
> >> >> + t->membarrier_sync_core = 0;
> >> >> +}
> >> >> +static inline void membarrier_sched_out(struct task_struct *t)
> >> >> +{
> >> >> + /*
> >> >> + * Core serialization is performed before the memory barrier
> >> >> + * preceding the store to rq->curr.
> >> >> + */
> >> >> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
> >> >> + sync_core();
> >> >> +}
> >> >> +static inline void membarrier_sched_in(struct task_struct *t)
> >> >> +{
> >> >> + /*
> >> >> + * Core serialization is performed after the memory barrier
> >> >> + * following the store to rq->curr.
> >> >> + */
> >> >> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
> >> >> + sync_core();
> >> >> +}
> >> >> +#else
> >> >> +static inline void membarrier_fork(struct task_struct *t,
> >> >> + unsigned long clone_flags)
> >> >> +{
> >> >> +}
> >> >> +static inline void membarrier_execve(struct task_struct *t)
> >> >> +{
> >> >> +}
> >> >> +static inline void membarrier_sched_out(struct task_struct *t)
> >> >> +{
> >> >> +}
> >> >> +static inline void membarrier_sched_in(struct task_struct *t)
> >> >> +{
> >> >> +}
> >> >> +#endif
> >> >> +
> >> >> #endif
> >> >> diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
> >> >> index 6d47b3249d8a..bc26fcf8fb7b 100644
> >> >> --- a/include/uapi/linux/membarrier.h
> >> >> +++ b/include/uapi/linux/membarrier.h
> >> >> @@ -67,6 +67,7 @@ enum membarrier_cmd {
> >> >> /* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */
> >> >> /* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */
> >> >> MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3),
> >> >> + MEMBARRIER_CMD_REGISTER_SYNC_CORE = (1 << 4),
> >> >> };
> >> >>
> >> >> #endif /* _UAPI_LINUX_MEMBARRIER_H */
> >> >> diff --git a/kernel/fork.c b/kernel/fork.c
> >> >> index 17921b0390b4..713b3c932671 100644
> >> >> --- a/kernel/fork.c
> >> >> +++ b/kernel/fork.c
> >> >> @@ -1840,6 +1840,8 @@ static __latent_entropy struct task_struct *copy_process(
> >> >> */
> >> >> copy_seccomp(p);
> >> >>
> >> >> + membarrier_fork(p, clone_flags);
> >> >> +
> >> >> /*
> >> >> * Process group and session signals need to be delivered to just the
> >> >> * parent before the fork or both the parent and the child after the
> >> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> >> index 0e36d9960d91..3ca27d066950 100644
> >> >> --- a/kernel/sched/core.c
> >> >> +++ b/kernel/sched/core.c
> >> >> @@ -3292,6 +3292,8 @@ static void __sched notrace __schedule(bool preempt)
> >> >> local_irq_disable();
> >> >> rcu_note_context_switch(preempt);
> >> >>
> >> >> + membarrier_sched_out(prev);
> >> >> +
> >> >> /*
> >> >> * Make sure that signal_pending_state()->signal_pending() below
> >> >> * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
> >> >> @@ -3364,6 +3366,7 @@ static void __sched notrace __schedule(bool preempt)
> >> >>
> >> >> /* Also unlocks the rq: */
> >> >> rq = context_switch(rq, prev, next, &rf);
> >> >> + membarrier_sched_in(next);
> >> >> } else {
> >> >> rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
> >> >> rq_unlock_irq(rq, &rf);
> >> >> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
> >> >> index 7eec6914d2d2..f128caf64b49 100644
> >> >> --- a/kernel/sched/membarrier.c
> >> >> +++ b/kernel/sched/membarrier.c
> >> >> @@ -25,12 +25,16 @@
> >> >> * Bitmask made from a "or" of all commands within enum membarrier_cmd,
> >> >> * except MEMBARRIER_CMD_QUERY.
> >> >> */
> >> >> -#define MEMBARRIER_CMD_BITMASK \
> >> >> - (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED)
> >> >> +#define MEMBARRIER_CMD_BITMASK \
> >> >> + (MEMBARRIER_CMD_SHARED \
> >> >> + | MEMBARRIER_CMD_PRIVATE_EXPEDITED \
> >> >> + | MEMBARRIER_CMD_REGISTER_SYNC_CORE)
> >> >>
> >> >> static void ipi_mb(void *info)
> >> >> {
> >> >> - smp_mb(); /* IPIs should be serializing but paranoid. */
> >> >> + /* IPIs should be serializing but paranoid. */
> >> >> + smp_mb();
> >> >> + sync_core();
> >> >> }
> >> >>
> >> >> static void membarrier_private_expedited(void)
> >> >> @@ -96,6 +100,30 @@ static void membarrier_private_expedited(void)
> >> >> smp_mb(); /* exit from system call is not a mb */
> >> >> }
> >> >>
> >> >> +static void membarrier_register_sync_core(void)
> >> >> +{
> >> >> + struct task_struct *p = current, *t;
> >> >> +
> >> >> + if (get_nr_threads(p) == 1) {
> >> >> + p->membarrier_sync_core = 1;
> >> >> + return;
> >> >> + }
> >> >> +
> >> >> + /*
> >> >> + * Coherence of membarrier_sync_core against thread fork is
> >> >> + * protected by siglock.
> >> >> + */
> >> >> + spin_lock(&p->sighand->siglock);
> >> >> + for_each_thread(p, t)
> >> >> + WRITE_ONCE(t->membarrier_sync_core, 1);
> >> >> + spin_unlock(&p->sighand->siglock);
> >> >> + /*
> >> >> + * Ensure all future scheduler execution will observe the new
> >> >> + * membarrier_sync_core state for this process.
> >> >> + */
> >> >> + synchronize_sched();
> >> >> +}
> >> >> +
> >> >> /**
> >> >> * sys_membarrier - issue memory barriers on a set of threads
> >> >> * @cmd: Takes command values defined in enum membarrier_cmd.
> >> >> @@ -146,6 +174,9 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> >> >> case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
> >> >> membarrier_private_expedited();
> >> >> return 0;
> >> >> + case MEMBARRIER_CMD_REGISTER_SYNC_CORE:
> >> >> + membarrier_register_sync_core();
> >> >> + return 0;
> >> >> default:
> >> >> return -EINVAL;
> >> >> }
> >> >> --
> >> >> 2.11.0
> >>
> >> --
> >> Mathieu Desnoyers
> >> EfficiOS Inc.
> >> http://www.efficios.com
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] membarrier: provide register sync core cmd
2017-08-28 17:49 ` Paul E. McKenney
@ 2017-08-30 17:53 ` Mathieu Desnoyers
2017-08-30 18:01 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Mathieu Desnoyers @ 2017-08-30 17:53 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter,
maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dave Watson, Andy Lutomirski,
Will Deacon, Hans Boehm
----- On Aug 28, 2017, at 1:49 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com wrote:
> On Sun, Aug 27, 2017 at 09:39:54PM +0000, Mathieu Desnoyers wrote:
>> ----- On Aug 27, 2017, at 2:00 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com
>> wrote:
>>
>> > On Sun, Aug 27, 2017 at 08:52:58PM +0000, Mathieu Desnoyers wrote:
>> >> ----- On Aug 27, 2017, at 1:35 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com
>> >> wrote:
>> >>
>> >> > On Sun, Aug 27, 2017 at 12:54:04PM -0700, Mathieu Desnoyers wrote:
>> >> >> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
>> >> >> system call. It allows processes to register their intent to have their
>> >> >> threads issue core serializing barriers in addition to memory barriers
>> >> >> whenever a membarrier command is performed.
>> >> >>
>> >> >> It is relevant for reclaim of JIT code, which requires to issue core
>> >> >> serializing barriers on all threads running on behalf of a process
>> >> >> after ensuring the old code is not visible anymore, before re-using
>> >> >> memory for new code.
>> >> >>
>> >> >> When a processes returns from a MEMBARRIER_CMD_REGISTER_SYNC_CORE
>> >> >> command, it is guaranteed that all following MEMBARRIER_CMD_SHARED and
>> >> >> MEMBARRIER_CMD_PRIVATE_EXPEDITED issue core serializing barriers, in
>> >> >> addition to the memory barriers, on all of its running threads.
>> >> >
>> >> > I have queued both of these for testing and review:
>> >> >
>> >> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>> >> > f54cf5123eb9 ("membarrier: Speed up single-threaded private cmd")
>> >> > 0d6eb99818da ("membarrier: Provide register sync core cmd")
>> >> >
>> >> > Could the people who asked for this please test it? The second commit
>> >> > is non-obvious enough to need some careful review and intensive testing
>> >> > before I can in good conscious pass it upstream.
>> >>
>> >> Thanks Paul!
>> >>
>> >> Can you pick up my v2 instead ? I added benchmarks in the changelog and
>> >> documented the new command in the membarrier public header. No changes
>> >> otherwise.
>> >
>> > It looks like 0day Test Robot also wants some additional header files.
>> > Could you please send me a v3 with its complaints addressed?
>>
>> I sent a separate patch implementing sync_core() on xtensa. It should
>> be applied before this patch.
>
> And it looks like we also need a sync_core() on every CPU family other
> than x86, which already has one. Do we want to make this capability
> depend on an ARCH_ kconfig option, or should we just bit the bullet and
> implement sync_core() everywhere?
>
> (I had to drop this commit due to downstream complaints. Left to myself,
> I would take the ARCH_ approach, then add it back in, but please let me
> know how you would like to proceed.)
I plan to go with the ARCH_ approach indeed.
Thanks,
Mathieu
>
> Thanx, Paul
>
>> Hopefully we can get some testing from people with xtensa testing
>> environments.
>>
>> Thanks,
>>
>> Mathieu
>>
>> >
>> > Thanx, Paul
>> >
>> >> Thanks,
>> >>
>> >> Mathieu
>> >>
>> >> >
>> >> > Thanx, Paul
>> >> >
>> >> >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> >> >> CC: Peter Zijlstra <peterz@infradead.org>
>> >> >> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> >> >> CC: Boqun Feng <boqun.feng@gmail.com>
>> >> >> CC: Andrew Hunter <ahh@google.com>
>> >> >> CC: Maged Michael <maged.michael@gmail.com>
>> >> >> CC: gromer@google.com
>> >> >> CC: Avi Kivity <avi@scylladb.com>
>> >> >> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> >> >> CC: Paul Mackerras <paulus@samba.org>
>> >> >> CC: Michael Ellerman <mpe@ellerman.id.au>
>> >> >> CC: Dave Watson <davejwatson@fb.com>
>> >> >> CC: Andy Lutomirski <luto@kernel.org>
>> >> >> CC: Will Deacon <will.deacon@arm.com>
>> >> >> CC: Hans Boehm <hboehm@google.com>
>> >> >> ---
>> >> >> fs/exec.c | 1 +
>> >> >> include/linux/sched.h | 52 +++++++++++++++++++++++++++++++++++++++++
>> >> >> include/uapi/linux/membarrier.h | 1 +
>> >> >> kernel/fork.c | 2 ++
>> >> >> kernel/sched/core.c | 3 +++
>> >> >> kernel/sched/membarrier.c | 37 ++++++++++++++++++++++++++---
>> >> >> 6 files changed, 93 insertions(+), 3 deletions(-)
>> >> >>
>> >> >> diff --git a/fs/exec.c b/fs/exec.c
>> >> >> index 62175cbcc801..a4ab3253bac7 100644
>> >> >> --- a/fs/exec.c
>> >> >> +++ b/fs/exec.c
>> >> >> @@ -1794,6 +1794,7 @@ static int do_execveat_common(int fd, struct filename
>> >> >> *filename,
>> >> >> /* execve succeeded */
>> >> >> current->fs->in_exec = 0;
>> >> >> current->in_execve = 0;
>> >> >> + membarrier_execve(current);
>> >> >> acct_update_integrals(current);
>> >> >> task_numa_free(current);
>> >> >> free_bprm(bprm);
>> >> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> >> >> index 8337e2db0bb2..b1ecdc4e8b84 100644
>> >> >> --- a/include/linux/sched.h
>> >> >> +++ b/include/linux/sched.h
>> >> >> @@ -1086,6 +1086,9 @@ struct task_struct {
>> >> >> /* Used by LSM modules for access restriction: */
>> >> >> void *security;
>> >> >> #endif
>> >> >> +#ifdef CONFIG_MEMBARRIER
>> >> >> + int membarrier_sync_core;
>> >> >> +#endif
>> >> >>
>> >> >> /*
>> >> >> * New fields for task_struct should be added above here, so that
>> >> >> @@ -1623,4 +1626,53 @@ extern long sched_getaffinity(pid_t pid, struct cpumask
>> >> >> *mask);
>> >> >> #define TASK_SIZE_OF(tsk) TASK_SIZE
>> >> >> #endif
>> >> >>
>> >> >> +#ifdef CONFIG_MEMBARRIER
>> >> >> +static inline void membarrier_fork(struct task_struct *t,
>> >> >> + unsigned long clone_flags)
>> >> >> +{
>> >> >> + /*
>> >> >> + * Coherence of membarrier_sync_core against thread fork is
>> >> >> + * protected by siglock. membarrier_fork is called with siglock
>> >> >> + * held.
>> >> >> + */
>> >> >> + t->membarrier_sync_core = current->membarrier_sync_core;
>> >> >> +}
>> >> >> +static inline void membarrier_execve(struct task_struct *t)
>> >> >> +{
>> >> >> + t->membarrier_sync_core = 0;
>> >> >> +}
>> >> >> +static inline void membarrier_sched_out(struct task_struct *t)
>> >> >> +{
>> >> >> + /*
>> >> >> + * Core serialization is performed before the memory barrier
>> >> >> + * preceding the store to rq->curr.
>> >> >> + */
>> >> >> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
>> >> >> + sync_core();
>> >> >> +}
>> >> >> +static inline void membarrier_sched_in(struct task_struct *t)
>> >> >> +{
>> >> >> + /*
>> >> >> + * Core serialization is performed after the memory barrier
>> >> >> + * following the store to rq->curr.
>> >> >> + */
>> >> >> + if (unlikely(READ_ONCE(t->membarrier_sync_core)))
>> >> >> + sync_core();
>> >> >> +}
>> >> >> +#else
>> >> >> +static inline void membarrier_fork(struct task_struct *t,
>> >> >> + unsigned long clone_flags)
>> >> >> +{
>> >> >> +}
>> >> >> +static inline void membarrier_execve(struct task_struct *t)
>> >> >> +{
>> >> >> +}
>> >> >> +static inline void membarrier_sched_out(struct task_struct *t)
>> >> >> +{
>> >> >> +}
>> >> >> +static inline void membarrier_sched_in(struct task_struct *t)
>> >> >> +{
>> >> >> +}
>> >> >> +#endif
>> >> >> +
>> >> >> #endif
>> >> >> diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
>> >> >> index 6d47b3249d8a..bc26fcf8fb7b 100644
>> >> >> --- a/include/uapi/linux/membarrier.h
>> >> >> +++ b/include/uapi/linux/membarrier.h
>> >> >> @@ -67,6 +67,7 @@ enum membarrier_cmd {
>> >> >> /* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */
>> >> >> /* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */
>> >> >> MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3),
>> >> >> + MEMBARRIER_CMD_REGISTER_SYNC_CORE = (1 << 4),
>> >> >> };
>> >> >>
>> >> >> #endif /* _UAPI_LINUX_MEMBARRIER_H */
>> >> >> diff --git a/kernel/fork.c b/kernel/fork.c
>> >> >> index 17921b0390b4..713b3c932671 100644
>> >> >> --- a/kernel/fork.c
>> >> >> +++ b/kernel/fork.c
>> >> >> @@ -1840,6 +1840,8 @@ static __latent_entropy struct task_struct *copy_process(
>> >> >> */
>> >> >> copy_seccomp(p);
>> >> >>
>> >> >> + membarrier_fork(p, clone_flags);
>> >> >> +
>> >> >> /*
>> >> >> * Process group and session signals need to be delivered to just the
>> >> >> * parent before the fork or both the parent and the child after the
>> >> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> >> >> index 0e36d9960d91..3ca27d066950 100644
>> >> >> --- a/kernel/sched/core.c
>> >> >> +++ b/kernel/sched/core.c
>> >> >> @@ -3292,6 +3292,8 @@ static void __sched notrace __schedule(bool preempt)
>> >> >> local_irq_disable();
>> >> >> rcu_note_context_switch(preempt);
>> >> >>
>> >> >> + membarrier_sched_out(prev);
>> >> >> +
>> >> >> /*
>> >> >> * Make sure that signal_pending_state()->signal_pending() below
>> >> >> * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
>> >> >> @@ -3364,6 +3366,7 @@ static void __sched notrace __schedule(bool preempt)
>> >> >>
>> >> >> /* Also unlocks the rq: */
>> >> >> rq = context_switch(rq, prev, next, &rf);
>> >> >> + membarrier_sched_in(next);
>> >> >> } else {
>> >> >> rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
>> >> >> rq_unlock_irq(rq, &rf);
>> >> >> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
>> >> >> index 7eec6914d2d2..f128caf64b49 100644
>> >> >> --- a/kernel/sched/membarrier.c
>> >> >> +++ b/kernel/sched/membarrier.c
>> >> >> @@ -25,12 +25,16 @@
>> >> >> * Bitmask made from a "or" of all commands within enum membarrier_cmd,
>> >> >> * except MEMBARRIER_CMD_QUERY.
>> >> >> */
>> >> >> -#define MEMBARRIER_CMD_BITMASK \
>> >> >> - (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED)
>> >> >> +#define MEMBARRIER_CMD_BITMASK \
>> >> >> + (MEMBARRIER_CMD_SHARED \
>> >> >> + | MEMBARRIER_CMD_PRIVATE_EXPEDITED \
>> >> >> + | MEMBARRIER_CMD_REGISTER_SYNC_CORE)
>> >> >>
>> >> >> static void ipi_mb(void *info)
>> >> >> {
>> >> >> - smp_mb(); /* IPIs should be serializing but paranoid. */
>> >> >> + /* IPIs should be serializing but paranoid. */
>> >> >> + smp_mb();
>> >> >> + sync_core();
>> >> >> }
>> >> >>
>> >> >> static void membarrier_private_expedited(void)
>> >> >> @@ -96,6 +100,30 @@ static void membarrier_private_expedited(void)
>> >> >> smp_mb(); /* exit from system call is not a mb */
>> >> >> }
>> >> >>
>> >> >> +static void membarrier_register_sync_core(void)
>> >> >> +{
>> >> >> + struct task_struct *p = current, *t;
>> >> >> +
>> >> >> + if (get_nr_threads(p) == 1) {
>> >> >> + p->membarrier_sync_core = 1;
>> >> >> + return;
>> >> >> + }
>> >> >> +
>> >> >> + /*
>> >> >> + * Coherence of membarrier_sync_core against thread fork is
>> >> >> + * protected by siglock.
>> >> >> + */
>> >> >> + spin_lock(&p->sighand->siglock);
>> >> >> + for_each_thread(p, t)
>> >> >> + WRITE_ONCE(t->membarrier_sync_core, 1);
>> >> >> + spin_unlock(&p->sighand->siglock);
>> >> >> + /*
>> >> >> + * Ensure all future scheduler execution will observe the new
>> >> >> + * membarrier_sync_core state for this process.
>> >> >> + */
>> >> >> + synchronize_sched();
>> >> >> +}
>> >> >> +
>> >> >> /**
>> >> >> * sys_membarrier - issue memory barriers on a set of threads
>> >> >> * @cmd: Takes command values defined in enum membarrier_cmd.
>> >> >> @@ -146,6 +174,9 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
>> >> >> case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
>> >> >> membarrier_private_expedited();
>> >> >> return 0;
>> >> >> + case MEMBARRIER_CMD_REGISTER_SYNC_CORE:
>> >> >> + membarrier_register_sync_core();
>> >> >> + return 0;
>> >> >> default:
>> >> >> return -EINVAL;
>> >> >> }
>> >> >> --
>> >> >> 2.11.0
>> >>
>> >> --
>> >> Mathieu Desnoyers
>> >> EfficiOS Inc.
>> >> http://www.efficios.com
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] membarrier: provide register sync core cmd
2017-08-30 17:53 ` Mathieu Desnoyers
@ 2017-08-30 18:01 ` Paul E. McKenney
0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2017-08-30 18:01 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter,
maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt,
Paul Mackerras, Michael Ellerman, Dave Watson, Andy Lutomirski,
Will Deacon, Hans Boehm
On Wed, Aug 30, 2017 at 05:53:53PM +0000, Mathieu Desnoyers wrote:
> ----- On Aug 28, 2017, at 1:49 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com wrote:
>
> > On Sun, Aug 27, 2017 at 09:39:54PM +0000, Mathieu Desnoyers wrote:
> >> ----- On Aug 27, 2017, at 2:00 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com
> >> wrote:
> >>
> >> > On Sun, Aug 27, 2017 at 08:52:58PM +0000, Mathieu Desnoyers wrote:
> >> >> ----- On Aug 27, 2017, at 1:35 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com
> >> >> wrote:
> >> >>
> >> >> > On Sun, Aug 27, 2017 at 12:54:04PM -0700, Mathieu Desnoyers wrote:
> >> >> >> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
> >> >> >> system call. It allows processes to register their intent to have their
> >> >> >> threads issue core serializing barriers in addition to memory barriers
> >> >> >> whenever a membarrier command is performed.
> >> >> >>
> >> >> >> It is relevant for reclaim of JIT code, which requires to issue core
> >> >> >> serializing barriers on all threads running on behalf of a process
> >> >> >> after ensuring the old code is not visible anymore, before re-using
> >> >> >> memory for new code.
> >> >> >>
> >> >> >> When a processes returns from a MEMBARRIER_CMD_REGISTER_SYNC_CORE
> >> >> >> command, it is guaranteed that all following MEMBARRIER_CMD_SHARED and
> >> >> >> MEMBARRIER_CMD_PRIVATE_EXPEDITED issue core serializing barriers, in
> >> >> >> addition to the memory barriers, on all of its running threads.
> >> >> >
> >> >> > I have queued both of these for testing and review:
> >> >> >
> >> >> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> >> >> > f54cf5123eb9 ("membarrier: Speed up single-threaded private cmd")
> >> >> > 0d6eb99818da ("membarrier: Provide register sync core cmd")
> >> >> >
> >> >> > Could the people who asked for this please test it? The second commit
> >> >> > is non-obvious enough to need some careful review and intensive testing
> >> >> > before I can in good conscious pass it upstream.
> >> >>
> >> >> Thanks Paul!
> >> >>
> >> >> Can you pick up my v2 instead ? I added benchmarks in the changelog and
> >> >> documented the new command in the membarrier public header. No changes
> >> >> otherwise.
> >> >
> >> > It looks like 0day Test Robot also wants some additional header files.
> >> > Could you please send me a v3 with its complaints addressed?
> >>
> >> I sent a separate patch implementing sync_core() on xtensa. It should
> >> be applied before this patch.
> >
> > And it looks like we also need a sync_core() on every CPU family other
> > than x86, which already has one. Do we want to make this capability
> > depend on an ARCH_ kconfig option, or should we just bit the bullet and
> > implement sync_core() everywhere?
> >
> > (I had to drop this commit due to downstream complaints. Left to myself,
> > I would take the ARCH_ approach, then add it back in, but please let me
> > know how you would like to proceed.)
>
> I plan to go with the ARCH_ approach indeed.
Sounds good, looking forward to seeing it. ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-08-30 18:01 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-27 19:54 [RFC PATCH 1/2] membarrier: speed up single-threaded private cmd Mathieu Desnoyers
2017-08-27 19:54 ` [RFC PATCH 2/2] membarrier: provide register sync core cmd Mathieu Desnoyers
2017-08-27 20:35 ` Paul E. McKenney
2017-08-27 20:52 ` Mathieu Desnoyers
2017-08-27 21:00 ` Paul E. McKenney
2017-08-27 21:39 ` Mathieu Desnoyers
2017-08-28 17:49 ` Paul E. McKenney
2017-08-30 17:53 ` Mathieu Desnoyers
2017-08-30 18:01 ` Paul E. McKenney
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.