* [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command @ 2017-09-26 17:51 Mathieu Desnoyers 2017-09-26 17:51 ` [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd Mathieu Desnoyers ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-26 17:51 UTC (permalink / raw) To: Paul E . McKenney, Peter Zijlstra Cc: linux-kernel, Mathieu Desnoyers, Boqun Feng, Andrew Hunter, Maged Michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, Alan Stern, Will Deacon, Andy Lutomirski, linux-arch Provide a new command allowing processes to register their intent to use the private expedited command. This allows PowerPC to skip the full memory barrier in switch_mm(), and only issue the barrier when scheduling into a task belonging to a process that has registered to use expedited private. Processes are now required to register before using MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM. Changes since v1: - Use test_ti_thread_flag(next, ...) instead of test_thread_flag() in powerpc membarrier_arch_sched_in(), given that we want to specifically check the next thread state. - Add missing ARCH_HAS_MEMBARRIER_HOOKS in Kconfig. - Use task_thread_info() to pass thread_info from task to *_ti_thread_flag(). Changes since v2: - Move membarrier_arch_sched_in() call to finish_task_switch(). - Check for NULL t->mm in membarrier_arch_fork(). - Use membarrier_sched_in() in generic code, which invokes the arch-specific membarrier_arch_sched_in(). This fixes allnoconfig build on PowerPC. - Move asm/membarrier.h include under CONFIG_MEMBARRIER, fixing allnoconfig build on PowerPC. - Build and runtime tested on PowerPC. Changes since v3: - Simply rely on copy_mm() to copy the membarrier_private_expedited mm field on fork. - powerpc: test thread flag instead of reading membarrier_private_expedited in membarrier_arch_fork(). 
- powerpc: skip memory barrier in membarrier_arch_sched_in() if coming from kernel thread, since mmdrop() implies a full barrier. - Set membarrier_private_expedited to 1 only after arch registration code, thus eliminating a race where concurrent commands could succeed when they should fail if issued concurrently with process registration. - Use READ_ONCE() for membarrier_private_expedited field access in membarrier_private_expedited. Matches WRITE_ONCE() performed in process registration. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Boqun Feng <boqun.feng@gmail.com> CC: Andrew Hunter <ahh@google.com> CC: Maged Michael <maged.michael@gmail.com> CC: gromer@google.com CC: Avi Kivity <avi@scylladb.com> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Michael Ellerman <mpe@ellerman.id.au> CC: Dave Watson <davejwatson@fb.com> CC: Alan Stern <stern@rowland.harvard.edu> CC: Will Deacon <will.deacon@arm.com> CC: Andy Lutomirski <luto@kernel.org> CC: linux-arch@vger.kernel.org --- MAINTAINERS | 2 ++ arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/membarrier.h | 44 +++++++++++++++++++++++++++ arch/powerpc/include/asm/thread_info.h | 3 ++ arch/powerpc/kernel/Makefile | 2 ++ arch/powerpc/kernel/membarrier.c | 45 ++++++++++++++++++++++++++++ fs/exec.c | 1 + include/linux/mm_types.h | 3 ++ include/linux/sched/mm.h | 55 ++++++++++++++++++++++++++++++++++ include/uapi/linux/membarrier.h | 23 +++++++++----- init/Kconfig | 3 ++ kernel/fork.c | 2 ++ kernel/sched/core.c | 11 +------ kernel/sched/membarrier.c | 25 +++++++++++++--- 14 files changed, 199 insertions(+), 21 deletions(-) create mode 100644 arch/powerpc/include/asm/membarrier.h create mode 100644 arch/powerpc/kernel/membarrier.c diff --git a/MAINTAINERS b/MAINTAINERS index 6671f375f7fc..f11d8aece00d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8816,6 +8816,8 
@@ L: linux-kernel@vger.kernel.org S: Supported F: kernel/sched/membarrier.c F: include/uapi/linux/membarrier.h +F: arch/powerpc/kernel/membarrier.c +F: arch/powerpc/include/asm/membarrier.h MEMORY MANAGEMENT L: linux-mm@kvack.org diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 809c468edab1..6f44c5f74f71 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -138,6 +138,7 @@ config PPC select ARCH_HAS_ELF_RANDOMIZE select ARCH_HAS_FORTIFY_SOURCE select ARCH_HAS_GCOV_PROFILE_ALL + select ARCH_HAS_MEMBARRIER_HOOKS select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE select ARCH_HAS_SG_CHAIN select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST diff --git a/arch/powerpc/include/asm/membarrier.h b/arch/powerpc/include/asm/membarrier.h new file mode 100644 index 000000000000..588154c1cf57 --- /dev/null +++ b/arch/powerpc/include/asm/membarrier.h @@ -0,0 +1,44 @@ +#ifndef _ASM_POWERPC_MEMBARRIER_H +#define _ASM_POWERPC_MEMBARRIER_H + +static inline void membarrier_arch_sched_in(struct task_struct *prev, + struct task_struct *next) +{ + /* + * Only need the full barrier when switching between processes. + * Barrier when switching from kernel to userspace is not + * required here, given that it is implied by mmdrop(). Barrier + * when switching from userspace to kernel is not needed after + * store to rq->curr. + */ + if (likely(!test_ti_thread_flag(task_thread_info(next), + TIF_MEMBARRIER_PRIVATE_EXPEDITED) + || !prev->mm || prev->mm == next->mm)) + return; + + /* + * The membarrier system call requires a full memory barrier + * after storing to rq->curr, before going back to user-space. + */ + smp_mb(); +} +static inline void membarrier_arch_fork(struct task_struct *t, + unsigned long clone_flags) +{ + /* + * Coherence of TIF_MEMBARRIER_PRIVATE_EXPEDITED against thread + * fork is protected by siglock. membarrier_arch_fork is called + * with siglock held. 
+ */ + if (test_thread_flag(TIF_MEMBARRIER_PRIVATE_EXPEDITED)) + set_ti_thread_flag(task_thread_info(t), + TIF_MEMBARRIER_PRIVATE_EXPEDITED); +} +static inline void membarrier_arch_execve(struct task_struct *t) +{ + clear_ti_thread_flag(task_thread_info(t), + TIF_MEMBARRIER_PRIVATE_EXPEDITED); +} +void membarrier_arch_register_private_expedited(struct task_struct *t); + +#endif /* _ASM_POWERPC_MEMBARRIER_H */ diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h index a941cc6fc3e9..2a208487724b 100644 --- a/arch/powerpc/include/asm/thread_info.h +++ b/arch/powerpc/include/asm/thread_info.h @@ -100,6 +100,7 @@ static inline struct thread_info *current_thread_info(void) #if defined(CONFIG_PPC64) #define TIF_ELF2ABI 18 /* function descriptors must die! */ #endif +#define TIF_MEMBARRIER_PRIVATE_EXPEDITED 19 /* membarrier */ /* as above, but as bit values */ #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE) @@ -119,6 +120,8 @@ static inline struct thread_info *current_thread_info(void) #define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT) #define _TIF_EMULATE_STACK_STORE (1<<TIF_EMULATE_STACK_STORE) #define _TIF_NOHZ (1<<TIF_NOHZ) +#define _TIF_MEMBARRIER_PRIVATE_EXPEDITED \ + (1<<TIF_MEMBARRIER_PRIVATE_EXPEDITED) #define _TIF_SYSCALL_DOTRACE (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \ _TIF_NOHZ) diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 91960f83039c..2dd4b9e3313a 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -135,6 +135,8 @@ endif obj-$(CONFIG_EPAPR_PARAVIRT) += epapr_paravirt.o epapr_hcalls.o obj-$(CONFIG_KVM_GUEST) += kvm.o kvm_emul.o +obj-$(CONFIG_MEMBARRIER) += membarrier.o + # Disable GCOV & sanitizers in odd or sensitive code GCOV_PROFILE_prom_init.o := n UBSAN_SANITIZE_prom_init.o := n diff --git a/arch/powerpc/kernel/membarrier.c b/arch/powerpc/kernel/membarrier.c new file mode 100644 index 
000000000000..b0d79a5f5981 --- /dev/null +++ b/arch/powerpc/kernel/membarrier.c @@ -0,0 +1,45 @@ +/* + * Copyright (C) 2010-2017 Mathieu Desnoyers <mathieu.desnoyers@efficios.com> + * + * membarrier system call - PowerPC architecture code + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include <linux/sched/mm.h> +#include <linux/sched/signal.h> +#include <linux/thread_info.h> +#include <linux/spinlock.h> +#include <linux/rcupdate.h> + +void membarrier_arch_register_private_expedited(struct task_struct *p) +{ + struct task_struct *t; + + if (get_nr_threads(p) == 1) { + set_thread_flag(TIF_MEMBARRIER_PRIVATE_EXPEDITED); + return; + } + /* + * Coherence of TIF_MEMBARRIER_PRIVATE_EXPEDITED against thread + * fork is protected by siglock. + */ + spin_lock(&p->sighand->siglock); + for_each_thread(p, t) + set_ti_thread_flag(task_thread_info(t), + TIF_MEMBARRIER_PRIVATE_EXPEDITED); + spin_unlock(&p->sighand->siglock); + /* + * Ensure all future scheduler executions will observe the new + * thread flag state for this process. 
+ */ + synchronize_sched(); +} diff --git a/fs/exec.c b/fs/exec.c index ac34d9724684..b2448f2731b3 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1802,6 +1802,7 @@ static int do_execveat_common(int fd, struct filename *filename, /* execve succeeded */ current->fs->in_exec = 0; current->in_execve = 0; + membarrier_execve(current); acct_update_integrals(current); task_numa_free(current); free_bprm(bprm); diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 46f4ecf5479a..5e0fe8ce053b 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -445,6 +445,9 @@ struct mm_struct { unsigned long flags; /* Must use atomic bitops to access the bits */ struct core_state *core_state; /* coredumping support */ +#ifdef CONFIG_MEMBARRIER + int membarrier_private_expedited; +#endif #ifdef CONFIG_AIO spinlock_t ioctx_lock; struct kioctx_table __rcu *ioctx_table; diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index 3a19c253bdb1..d3b81e48784d 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -205,4 +205,59 @@ static inline void memalloc_noreclaim_restore(unsigned int flags) current->flags = (current->flags & ~PF_MEMALLOC) | flags; } +#ifdef CONFIG_MEMBARRIER + +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_HOOKS +#include <asm/membarrier.h> +#else +static inline void membarrier_arch_sched_in(struct task_struct *prev, + struct task_struct *next) +{ +} +static inline void membarrier_arch_fork(struct task_struct *t, + unsigned long clone_flags) +{ +} +static inline void membarrier_arch_execve(struct task_struct *t) +{ +} +static inline void membarrier_arch_register_private_expedited( + struct task_struct *p) +{ +} +#endif + +static inline void membarrier_sched_in(struct task_struct *prev, + struct task_struct *next) +{ + membarrier_arch_sched_in(prev, next); +} +static inline void membarrier_fork(struct task_struct *t, + unsigned long clone_flags) +{ + /* + * Prior copy_mm() copies the membarrier_private_expedited field + * from 
current->mm to t->mm. + */ + membarrier_arch_fork(t, clone_flags); +} +static inline void membarrier_execve(struct task_struct *t) +{ + t->mm->membarrier_private_expedited = 0; + membarrier_arch_execve(t); +} +#else +static inline void membarrier_sched_in(struct task_struct *prev, + struct task_struct *next) +{ +} +static inline void membarrier_fork(struct task_struct *t, + unsigned long clone_flags) +{ +} +static inline void membarrier_execve(struct task_struct *t) +{ +} +#endif + #endif /* _LINUX_SCHED_MM_H */ diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h index 6d47b3249d8a..4e01ad7ffe98 100644 --- a/include/uapi/linux/membarrier.h +++ b/include/uapi/linux/membarrier.h @@ -52,21 +52,30 @@ * (non-running threads are de facto in such a * state). This only covers threads from the * same processes as the caller thread. This - * command returns 0. The "expedited" commands - * complete faster than the non-expedited ones, - * they never block, but have the downside of - * causing extra overhead. + * command returns 0 on success. The + * "expedited" commands complete faster than + * the non-expedited ones, they never block, + * but have the downside of causing extra + * overhead. A process needs to register its + * intent to use the private expedited command + * prior to using it, otherwise this command + * returns -EPERM. + * @MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED: + * Register the process intent to use + * MEMBARRIER_CMD_PRIVATE_EXPEDITED. Always + * returns 0. * * Command to be passed to the membarrier system call. The commands need to * be a single bit each, except for MEMBARRIER_CMD_QUERY which is assigned to * the value 0. 
*/ enum membarrier_cmd { - MEMBARRIER_CMD_QUERY = 0, - MEMBARRIER_CMD_SHARED = (1 << 0), + MEMBARRIER_CMD_QUERY = 0, + MEMBARRIER_CMD_SHARED = (1 << 0), /* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */ /* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */ - MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3), + MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3), + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED = (1 << 4), }; #endif /* _UAPI_LINUX_MEMBARRIER_H */ diff --git a/init/Kconfig b/init/Kconfig index 78cb2461012e..a3dc6a66f0d1 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1400,6 +1400,9 @@ config MEMBARRIER If unsure, say Y. +config ARCH_HAS_MEMBARRIER_HOOKS + bool + config EMBEDDED bool "Embedded system" option allnoconfig_y diff --git a/kernel/fork.c b/kernel/fork.c index 10646182440f..bd4a93915e08 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1840,6 +1840,8 @@ static __latent_entropy struct task_struct *copy_process( */ copy_seccomp(p); + membarrier_fork(p, clone_flags); + /* * Process group and session signals need to be delivered to just the * parent before the fork or both the parent and the child after the diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 18a6966567da..b9d731283946 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2643,17 +2643,8 @@ static struct rq *finish_task_switch(struct task_struct *prev) */ prev_state = prev->state; vtime_task_switch(prev); + membarrier_sched_in(prev, current); perf_event_task_sched_in(prev, current); - /* - * The membarrier system call requires a full memory barrier - * after storing to rq->curr, before going back to user-space. - * - * TODO: This smp_mb__after_unlock_lock can go away if PPC end - * up adding a full barrier to switch_mm(), or we should figure - * out if a smp_mb__after_unlock_lock is really the proper API - * to use. 
- */ - smp_mb__after_unlock_lock(); finish_lock_switch(rq, prev); finish_arch_post_lock_switch(); diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c index a92fddc22747..f06949a279ca 100644 --- a/kernel/sched/membarrier.c +++ b/kernel/sched/membarrier.c @@ -26,21 +26,25 @@ * except MEMBARRIER_CMD_QUERY. */ #define MEMBARRIER_CMD_BITMASK \ - (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED) + (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED \ + | MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED) static void ipi_mb(void *info) { smp_mb(); /* IPIs should be serializing but paranoid. */ } -static void membarrier_private_expedited(void) +static int membarrier_private_expedited(void) { int cpu; bool fallback = false; cpumask_var_t tmpmask; + if (!READ_ONCE(current->mm->membarrier_private_expedited)) + return -EPERM; + if (num_online_cpus() == 1) - return; + return 0; /* * Matches memory barriers around rq->curr modification in @@ -94,6 +98,17 @@ static void membarrier_private_expedited(void) * rq->curr modification in scheduler. */ smp_mb(); /* exit from system call is not a mb */ + return 0; +} + +static void membarrier_register_private_expedited(void) +{ + struct task_struct *p = current; + + if (READ_ONCE(p->mm->membarrier_private_expedited)) + return; + membarrier_arch_register_private_expedited(p); + WRITE_ONCE(p->mm->membarrier_private_expedited, 1); } /** @@ -144,7 +159,9 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags) synchronize_sched(); return 0; case MEMBARRIER_CMD_PRIVATE_EXPEDITED: - membarrier_private_expedited(); + return membarrier_private_expedited(); + case MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED: + membarrier_register_private_expedited(); return 0; default: return -EINVAL; -- 2.11.0 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd 2017-09-26 17:51 [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command Mathieu Desnoyers @ 2017-09-26 17:51 ` Mathieu Desnoyers 2017-09-26 19:41 ` Shuah Khan 2017-09-26 17:51 ` [PATCH v4 for 4.14 3/3] membarrier: Document scheduler barrier requirements Mathieu Desnoyers 2017-09-26 20:43 ` [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command Mathieu Desnoyers 2 siblings, 1 reply; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-26 17:51 UTC (permalink / raw) To: Paul E . McKenney, Peter Zijlstra Cc: linux-kernel, Mathieu Desnoyers, Boqun Feng, Andrew Hunter, Maged Michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, Alan Stern, Will Deacon, Andy Lutomirski, linux-arch Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED and MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED commands. Add checks expecting specific error values on system calls expected to fail. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Paul E. 
McKenney <paulmck@linux.vnet.ibm.com> CC: Boqun Feng <boqun.feng@gmail.com> CC: Andrew Hunter <ahh@google.com> CC: Maged Michael <maged.michael@gmail.com> CC: gromer@google.com CC: Avi Kivity <avi@scylladb.com> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Michael Ellerman <mpe@ellerman.id.au> CC: Dave Watson <davejwatson@fb.com> CC: Alan Stern <stern@rowland.harvard.edu> CC: Will Deacon <will.deacon@arm.com> CC: Andy Lutomirski <luto@kernel.org> CC: linux-arch@vger.kernel.org --- .../testing/selftests/membarrier/membarrier_test.c | 109 ++++++++++++++++++--- 1 file changed, 94 insertions(+), 15 deletions(-) diff --git a/tools/testing/selftests/membarrier/membarrier_test.c b/tools/testing/selftests/membarrier/membarrier_test.c index 21399fcf1a59..f85657374b59 100644 --- a/tools/testing/selftests/membarrier/membarrier_test.c +++ b/tools/testing/selftests/membarrier/membarrier_test.c @@ -15,49 +15,119 @@ static int sys_membarrier(int cmd, int flags) static int test_membarrier_cmd_fail(void) { int cmd = -1, flags = 0; + const char *test_name = "sys membarrier invalid command"; if (sys_membarrier(cmd, flags) != -1) { ksft_exit_fail_msg( - "sys membarrier invalid command test: command = %d, flags = %d. Should fail, but passed\n", - cmd, flags); + "%s test: command = %d, flags = %d. Should fail, but passed\n", + test_name, cmd, flags); + } + if (errno != EINVAL) { + ksft_exit_fail_msg( + "%s test: flags = %d. Should return (%d: \"%s\"), but returned (%d: \"%s\").\n", + test_name, flags, EINVAL, strerror(EINVAL), + errno, strerror(errno)); } ksft_test_result_pass( - "sys membarrier invalid command test: command = %d, flags = %d. Failed as expected\n", - cmd, flags); + "%s test: command = %d, flags = %d, errno = %d. 
Failed as expected\n", + test_name, cmd, flags, errno); return 0; } static int test_membarrier_flags_fail(void) { int cmd = MEMBARRIER_CMD_QUERY, flags = 1; + const char *test_name = "sys membarrier MEMBARRIER_CMD_QUERY invalid flags"; if (sys_membarrier(cmd, flags) != -1) { ksft_exit_fail_msg( - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: flags = %d. Should fail, but passed\n", - flags); + "%s test: flags = %d. Should fail, but passed\n", + test_name, flags); + } + if (errno != EINVAL) { + ksft_exit_fail_msg( + "%s test: flags = %d. Should return (%d: \"%s\"), but returned (%d: \"%s\").\n", + test_name, flags, EINVAL, strerror(EINVAL), + errno, strerror(errno)); } ksft_test_result_pass( - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: flags = %d. Failed as expected\n", - flags); + "%s test: flags = %d, errno = %d. Failed as expected\n", + test_name, flags, errno); return 0; } -static int test_membarrier_success(void) +static int test_membarrier_shared_success(void) { int cmd = MEMBARRIER_CMD_SHARED, flags = 0; - const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED\n"; + const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED"; + + if (sys_membarrier(cmd, flags) != 0) { + ksft_exit_fail_msg( + "%s test: flags = %d, errno = %d\n", + test_name, flags, errno); + } + + ksft_test_result_pass( + "%s test: flags = %d\n", test_name, flags); + return 0; +} + +static int test_membarrier_private_expedited_fail(void) +{ + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; + const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED not registered failure"; + + if (sys_membarrier(cmd, flags) != -1) { + ksft_exit_fail_msg( + "%s test: flags = %d. Should fail, but passed\n", + test_name, flags); + } + if (errno != EPERM) { + ksft_exit_fail_msg( + "%s test: flags = %d. 
Should return (%d: \"%s\"), but returned (%d: \"%s\").\n", + test_name, flags, EPERM, strerror(EPERM), + errno, strerror(errno)); + } + + ksft_test_result_pass( + "%s test: flags = %d, errno = %d\n", + test_name, flags, errno); + return 0; +} + +static int test_membarrier_register_private_expedited_success(void) +{ + int cmd = MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, flags = 0; + const char *test_name = "sys membarrier MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED"; if (sys_membarrier(cmd, flags) != 0) { ksft_exit_fail_msg( - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = %d\n", - flags); + "%s test: flags = %d, errno = %d\n", + test_name, flags, errno); } ksft_test_result_pass( - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = %d\n", - flags); + "%s test: flags = %d\n", + test_name, flags); + return 0; +} + +static int test_membarrier_private_expedited_success(void) +{ + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; + const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED"; + + if (sys_membarrier(cmd, flags) != 0) { + ksft_exit_fail_msg( + "%s test: flags = %d, errno = %d\n", + test_name, flags, errno); + } + + ksft_test_result_pass( + "%s test: flags = %d\n", + test_name, flags); return 0; } @@ -71,7 +141,16 @@ static int test_membarrier(void) status = test_membarrier_flags_fail(); if (status) return status; - status = test_membarrier_success(); + status = test_membarrier_shared_success(); + if (status) + return status; + status = test_membarrier_private_expedited_fail(); + if (status) + return status; + status = test_membarrier_register_private_expedited_success(); + if (status) + return status; + status = test_membarrier_private_expedited_success(); if (status) return status; return 0; -- 2.11.0 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd
From: Shuah Khan @ 2017-09-26 19:41 UTC
To: Mathieu Desnoyers, shuah
Cc: Paul E. McKenney, Peter Zijlstra, LKML, Boqun Feng, Andrew Hunter, Maged Michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, Alan Stern, Will Deacon, Andy Lutomirski, linux-arch

Hi Mathieu,

On Tue, Sep 26, 2017 at 11:51 AM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED and
> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED commands.
>
> Add checks expecting specific error values on system calls expected to
> fail.
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> CC: Peter Zijlstra <peterz@infradead.org>
> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> CC: Boqun Feng <boqun.feng@gmail.com>
> CC: Andrew Hunter <ahh@google.com>
> CC: Maged Michael <maged.michael@gmail.com>
> CC: gromer@google.com
> CC: Avi Kivity <avi@scylladb.com>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Paul Mackerras <paulus@samba.org>
> CC: Michael Ellerman <mpe@ellerman.id.au>
> CC: Dave Watson <davejwatson@fb.com>
> CC: Alan Stern <stern@rowland.harvard.edu>
> CC: Will Deacon <will.deacon@arm.com>
> CC: Andy Lutomirski <luto@kernel.org>
> CC: linux-arch@vger.kernel.org
> ---
> .../testing/selftests/membarrier/membarrier_test.c | 109

Did you run the get_maintainers script on this patch?
I am curious why get_maintainers didn't include linux-kselftest@vger.kernel.org and shuah@kernel.org thanks, -- Shuah Please make sure you send the patches to ++++++++++++++++++--- > 1 file changed, 94 insertions(+), 15 deletions(-) > > diff --git a/tools/testing/selftests/membarrier/membarrier_test.c b/tools/testing/selftests/membarrier/membarrier_test.c > index 21399fcf1a59..f85657374b59 100644 > --- a/tools/testing/selftests/membarrier/membarrier_test.c > +++ b/tools/testing/selftests/membarrier/membarrier_test.c > @@ -15,49 +15,119 @@ static int sys_membarrier(int cmd, int flags) > static int test_membarrier_cmd_fail(void) > { > int cmd = -1, flags = 0; > + const char *test_name = "sys membarrier invalid command"; > > if (sys_membarrier(cmd, flags) != -1) { > ksft_exit_fail_msg( > - "sys membarrier invalid command test: command = %d, flags = %d. Should fail, but passed\n", > - cmd, flags); > + "%s test: command = %d, flags = %d. Should fail, but passed\n", > + test_name, cmd, flags); > + } > + if (errno != EINVAL) { > + ksft_exit_fail_msg( > + "%s test: flags = %d. Should return (%d: \"%s\"), but returned (%d: \"%s\").\n", > + test_name, flags, EINVAL, strerror(EINVAL), > + errno, strerror(errno)); > } > > ksft_test_result_pass( > - "sys membarrier invalid command test: command = %d, flags = %d. Failed as expected\n", > - cmd, flags); > + "%s test: command = %d, flags = %d, errno = %d. Failed as expected\n", > + test_name, cmd, flags, errno); > return 0; > } > > static int test_membarrier_flags_fail(void) > { > int cmd = MEMBARRIER_CMD_QUERY, flags = 1; > + const char *test_name = "sys membarrier MEMBARRIER_CMD_QUERY invalid flags"; > > if (sys_membarrier(cmd, flags) != -1) { > ksft_exit_fail_msg( > - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: flags = %d. Should fail, but passed\n", > - flags); > + "%s test: flags = %d. 
Should fail, but passed\n", > + test_name, flags); > + } > + if (errno != EINVAL) { > + ksft_exit_fail_msg( > + "%s test: flags = %d. Should return (%d: \"%s\"), but returned (%d: \"%s\").\n", > + test_name, flags, EINVAL, strerror(EINVAL), > + errno, strerror(errno)); > } > > ksft_test_result_pass( > - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: flags = %d. Failed as expected\n", > - flags); > + "%s test: flags = %d, errno = %d. Failed as expected\n", > + test_name, flags, errno); > return 0; > } > > -static int test_membarrier_success(void) > +static int test_membarrier_shared_success(void) > { > int cmd = MEMBARRIER_CMD_SHARED, flags = 0; > - const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED\n"; > + const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED"; > + > + if (sys_membarrier(cmd, flags) != 0) { > + ksft_exit_fail_msg( > + "%s test: flags = %d, errno = %d\n", > + test_name, flags, errno); > + } > + > + ksft_test_result_pass( > + "%s test: flags = %d\n", test_name, flags); > + return 0; > +} > + > +static int test_membarrier_private_expedited_fail(void) > +{ > + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; > + const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED not registered failure"; > + > + if (sys_membarrier(cmd, flags) != -1) { > + ksft_exit_fail_msg( > + "%s test: flags = %d. Should fail, but passed\n", > + test_name, flags); > + } > + if (errno != EPERM) { > + ksft_exit_fail_msg( > + "%s test: flags = %d. 
Should return (%d: \"%s\"), but returned (%d: \"%s\").\n", > + test_name, flags, EPERM, strerror(EPERM), > + errno, strerror(errno)); > + } > + > + ksft_test_result_pass( > + "%s test: flags = %d, errno = %d\n", > + test_name, flags, errno); > + return 0; > +} > + > +static int test_membarrier_register_private_expedited_success(void) > +{ > + int cmd = MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, flags = 0; > + const char *test_name = "sys membarrier MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED"; > > if (sys_membarrier(cmd, flags) != 0) { > ksft_exit_fail_msg( > - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = %d\n", > - flags); > + "%s test: flags = %d, errno = %d\n", > + test_name, flags, errno); > } > > ksft_test_result_pass( > - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = %d\n", > - flags); > + "%s test: flags = %d\n", > + test_name, flags); > + return 0; > +} > + > +static int test_membarrier_private_expedited_success(void) > +{ > + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; > + const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED"; > + > + if (sys_membarrier(cmd, flags) != 0) { > + ksft_exit_fail_msg( > + "%s test: flags = %d, errno = %d\n", > + test_name, flags, errno); > + } > + > + ksft_test_result_pass( > + "%s test: flags = %d\n", > + test_name, flags); > return 0; > } > > @@ -71,7 +141,16 @@ static int test_membarrier(void) > status = test_membarrier_flags_fail(); > if (status) > return status; > - status = test_membarrier_success(); > + status = test_membarrier_shared_success(); > + if (status) > + return status; > + status = test_membarrier_private_expedited_fail(); > + if (status) > + return status; > + status = test_membarrier_register_private_expedited_success(); > + if (status) > + return status; > + status = test_membarrier_private_expedited_success(); > if (status) > return status; > return 0; > -- > 2.11.0 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd 2017-09-26 19:41 ` Shuah Khan @ 2017-09-26 19:55 ` Mathieu Desnoyers 2017-09-26 20:08 ` Shuah Khan 0 siblings, 1 reply; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-26 19:55 UTC (permalink / raw) To: Shuah Khan, linux-kselftest Cc: shuah, Paul E. McKenney, Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter, maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, Alan Stern, Will Deacon, Andy Lutomirski, linux-arch ----- On Sep 26, 2017, at 3:41 PM, Shuah Khan shuahkhan@gmail.com wrote: > Hi Mathew, > > On Tue, Sep 26, 2017 at 11:51 AM, Mathieu Desnoyers > <mathieu.desnoyers@efficios.com> wrote: >> Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED and >> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED commands. >> >> Add checks expecting specific error values on system calls expected to >> fail. >> >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >> CC: Peter Zijlstra <peterz@infradead.org> >> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> >> CC: Boqun Feng <boqun.feng@gmail.com> >> CC: Andrew Hunter <ahh@google.com> >> CC: Maged Michael <maged.michael@gmail.com> >> CC: gromer@google.com >> CC: Avi Kivity <avi@scylladb.com> >> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> >> CC: Paul Mackerras <paulus@samba.org> >> CC: Michael Ellerman <mpe@ellerman.id.au> >> CC: Dave Watson <davejwatson@fb.com> >> CC: Alan Stern <stern@rowland.harvard.edu> >> CC: Will Deacon <will.deacon@arm.com> >> CC: Andy Lutomirski <luto@kernel.org> >> CC: linux-arch@vger.kernel.org >> --- >> .../testing/selftests/membarrier/membarrier_test.c | 109 > > Did you run get_maintainers script on this patch? I am curious why > get_maintainers didn't include linux-kselftest@vger.kernel.org and > shuah@kernel.org My mistake. I'll add this script to my checklist before I send out patches. If OK with you, I can just add the ML in CC here. 
Thanks, Mathieu > > thanks, > -- Shuah > > Please make sure you send the patches to > ++++++++++++++++++--- >> 1 file changed, 94 insertions(+), 15 deletions(-) >> >> diff --git a/tools/testing/selftests/membarrier/membarrier_test.c >> b/tools/testing/selftests/membarrier/membarrier_test.c >> index 21399fcf1a59..f85657374b59 100644 >> --- a/tools/testing/selftests/membarrier/membarrier_test.c >> +++ b/tools/testing/selftests/membarrier/membarrier_test.c >> @@ -15,49 +15,119 @@ static int sys_membarrier(int cmd, int flags) >> static int test_membarrier_cmd_fail(void) >> { >> int cmd = -1, flags = 0; >> + const char *test_name = "sys membarrier invalid command"; >> >> if (sys_membarrier(cmd, flags) != -1) { >> ksft_exit_fail_msg( >> - "sys membarrier invalid command test: command = %d, >> flags = %d. Should fail, but passed\n", >> - cmd, flags); >> + "%s test: command = %d, flags = %d. Should fail, but >> passed\n", >> + test_name, cmd, flags); >> + } >> + if (errno != EINVAL) { >> + ksft_exit_fail_msg( >> + "%s test: flags = %d. Should return (%d: \"%s\"), but >> returned (%d: \"%s\").\n", >> + test_name, flags, EINVAL, strerror(EINVAL), >> + errno, strerror(errno)); >> } >> >> ksft_test_result_pass( >> - "sys membarrier invalid command test: command = %d, flags = %d. >> Failed as expected\n", >> - cmd, flags); >> + "%s test: command = %d, flags = %d, errno = %d. Failed as >> expected\n", >> + test_name, cmd, flags, errno); >> return 0; >> } >> >> static int test_membarrier_flags_fail(void) >> { >> int cmd = MEMBARRIER_CMD_QUERY, flags = 1; >> + const char *test_name = "sys membarrier MEMBARRIER_CMD_QUERY invalid >> flags"; >> >> if (sys_membarrier(cmd, flags) != -1) { >> ksft_exit_fail_msg( >> - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: >> flags = %d. Should fail, but passed\n", >> - flags); >> + "%s test: flags = %d. 
Should fail, but passed\n", >> + test_name, flags); >> + } >> + if (errno != EINVAL) { >> + ksft_exit_fail_msg( >> + "%s test: flags = %d. Should return (%d: \"%s\"), but >> returned (%d: \"%s\").\n", >> + test_name, flags, EINVAL, strerror(EINVAL), >> + errno, strerror(errno)); >> } >> >> ksft_test_result_pass( >> - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: flags = >> %d. Failed as expected\n", >> - flags); >> + "%s test: flags = %d, errno = %d. Failed as expected\n", >> + test_name, flags, errno); >> return 0; >> } >> >> -static int test_membarrier_success(void) >> +static int test_membarrier_shared_success(void) >> { >> int cmd = MEMBARRIER_CMD_SHARED, flags = 0; >> - const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED\n"; >> + const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED"; >> + >> + if (sys_membarrier(cmd, flags) != 0) { >> + ksft_exit_fail_msg( >> + "%s test: flags = %d, errno = %d\n", >> + test_name, flags, errno); >> + } >> + >> + ksft_test_result_pass( >> + "%s test: flags = %d\n", test_name, flags); >> + return 0; >> +} >> + >> +static int test_membarrier_private_expedited_fail(void) >> +{ >> + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; >> + const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED >> not registered failure"; >> + >> + if (sys_membarrier(cmd, flags) != -1) { >> + ksft_exit_fail_msg( >> + "%s test: flags = %d. Should fail, but passed\n", >> + test_name, flags); >> + } >> + if (errno != EPERM) { >> + ksft_exit_fail_msg( >> + "%s test: flags = %d. 
Should return (%d: \"%s\"), but >> returned (%d: \"%s\").\n", >> + test_name, flags, EPERM, strerror(EPERM), >> + errno, strerror(errno)); >> + } >> + >> + ksft_test_result_pass( >> + "%s test: flags = %d, errno = %d\n", >> + test_name, flags, errno); >> + return 0; >> +} >> + >> +static int test_membarrier_register_private_expedited_success(void) >> +{ >> + int cmd = MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, flags = 0; >> + const char *test_name = "sys membarrier >> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED"; >> >> if (sys_membarrier(cmd, flags) != 0) { >> ksft_exit_fail_msg( >> - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = >> %d\n", >> - flags); >> + "%s test: flags = %d, errno = %d\n", >> + test_name, flags, errno); >> } >> >> ksft_test_result_pass( >> - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = %d\n", >> - flags); >> + "%s test: flags = %d\n", >> + test_name, flags); >> + return 0; >> +} >> + >> +static int test_membarrier_private_expedited_success(void) >> +{ >> + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; >> + const char *test_name = "sys membarrier >> MEMBARRIER_CMD_PRIVATE_EXPEDITED"; >> + >> + if (sys_membarrier(cmd, flags) != 0) { >> + ksft_exit_fail_msg( >> + "%s test: flags = %d, errno = %d\n", >> + test_name, flags, errno); >> + } >> + >> + ksft_test_result_pass( >> + "%s test: flags = %d\n", >> + test_name, flags); >> return 0; >> } >> >> @@ -71,7 +141,16 @@ static int test_membarrier(void) >> status = test_membarrier_flags_fail(); >> if (status) >> return status; >> - status = test_membarrier_success(); >> + status = test_membarrier_shared_success(); >> + if (status) >> + return status; >> + status = test_membarrier_private_expedited_fail(); >> + if (status) >> + return status; >> + status = test_membarrier_register_private_expedited_success(); >> + if (status) >> + return status; >> + status = test_membarrier_private_expedited_success(); >> if (status) >> return status; >> return 0; >> -- >> 2.11.0 -- Mathieu Desnoyers 
EfficiOS Inc. http://www.efficios.com
* Re: [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd 2017-09-26 19:55 ` Mathieu Desnoyers @ 2017-09-26 20:08 ` Shuah Khan 2017-09-26 20:15 ` Mathieu Desnoyers 0 siblings, 1 reply; 23+ messages in thread From: Shuah Khan @ 2017-09-26 20:08 UTC (permalink / raw) To: Mathieu Desnoyers Cc: linux-kselftest, shuah, Paul E. McKenney, Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter, maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, Alan Stern, Will Deacon, Andy Lutomirski, linux-arch, shuahkh On Tue, Sep 26, 2017 at 1:55 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > ----- On Sep 26, 2017, at 3:41 PM, Shuah Khan shuahkhan@gmail.com wrote: > >> Hi Mathew, >> >> On Tue, Sep 26, 2017 at 11:51 AM, Mathieu Desnoyers >> <mathieu.desnoyers@efficios.com> wrote: >>> Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED and >>> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED commands. >>> >>> Add checks expecting specific error values on system calls expected to >>> fail. >>> >>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >>> CC: Peter Zijlstra <peterz@infradead.org> >>> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> >>> CC: Boqun Feng <boqun.feng@gmail.com> >>> CC: Andrew Hunter <ahh@google.com> >>> CC: Maged Michael <maged.michael@gmail.com> >>> CC: gromer@google.com >>> CC: Avi Kivity <avi@scylladb.com> >>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> >>> CC: Paul Mackerras <paulus@samba.org> >>> CC: Michael Ellerman <mpe@ellerman.id.au> >>> CC: Dave Watson <davejwatson@fb.com> >>> CC: Alan Stern <stern@rowland.harvard.edu> >>> CC: Will Deacon <will.deacon@arm.com> >>> CC: Andy Lutomirski <luto@kernel.org> >>> CC: linux-arch@vger.kernel.org >>> --- >>> .../testing/selftests/membarrier/membarrier_test.c | 109 >> >> Did you run get_maintainers script on this patch? 
I am curious why >> get_maintainers didn't include linux-kselftest@vger.kernel.org and >> shuah@kernel.org > > My mistake. I'll add this script to my checklist before I send out > patches. > > If OK with you, I can just add the ML in CC here. > Please add everybody get_maintainers suggest for this patch as well. If this patch is going through linux-kselftest, you will have to resend the patch to me at some point, so I can pull it in. I am guessing this has dependency on the other patches in the series and will go through the primary tree. In that case CC is just fine and I will review it and Ack it. thanks, -- Shuah >> >> Please make sure you send the patches to >> ++++++++++++++++++--- >>> 1 file changed, 94 insertions(+), 15 deletions(-) >>> >>> diff --git a/tools/testing/selftests/membarrier/membarrier_test.c >>> b/tools/testing/selftests/membarrier/membarrier_test.c >>> index 21399fcf1a59..f85657374b59 100644 >>> --- a/tools/testing/selftests/membarrier/membarrier_test.c >>> +++ b/tools/testing/selftests/membarrier/membarrier_test.c >>> @@ -15,49 +15,119 @@ static int sys_membarrier(int cmd, int flags) >>> static int test_membarrier_cmd_fail(void) >>> { >>> int cmd = -1, flags = 0; >>> + const char *test_name = "sys membarrier invalid command"; >>> >>> if (sys_membarrier(cmd, flags) != -1) { >>> ksft_exit_fail_msg( >>> - "sys membarrier invalid command test: command = %d, >>> flags = %d. Should fail, but passed\n", >>> - cmd, flags); >>> + "%s test: command = %d, flags = %d. Should fail, but >>> passed\n", >>> + test_name, cmd, flags); >>> + } >>> + if (errno != EINVAL) { >>> + ksft_exit_fail_msg( >>> + "%s test: flags = %d. Should return (%d: \"%s\"), but >>> returned (%d: \"%s\").\n", >>> + test_name, flags, EINVAL, strerror(EINVAL), >>> + errno, strerror(errno)); >>> } >>> >>> ksft_test_result_pass( >>> - "sys membarrier invalid command test: command = %d, flags = %d. 
>>> Failed as expected\n", >>> - cmd, flags); >>> + "%s test: command = %d, flags = %d, errno = %d. Failed as >>> expected\n", >>> + test_name, cmd, flags, errno); >>> return 0; >>> } >>> >>> static int test_membarrier_flags_fail(void) >>> { >>> int cmd = MEMBARRIER_CMD_QUERY, flags = 1; >>> + const char *test_name = "sys membarrier MEMBARRIER_CMD_QUERY invalid >>> flags"; >>> >>> if (sys_membarrier(cmd, flags) != -1) { >>> ksft_exit_fail_msg( >>> - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: >>> flags = %d. Should fail, but passed\n", >>> - flags); >>> + "%s test: flags = %d. Should fail, but passed\n", >>> + test_name, flags); >>> + } >>> + if (errno != EINVAL) { >>> + ksft_exit_fail_msg( >>> + "%s test: flags = %d. Should return (%d: \"%s\"), but >>> returned (%d: \"%s\").\n", >>> + test_name, flags, EINVAL, strerror(EINVAL), >>> + errno, strerror(errno)); >>> } >>> >>> ksft_test_result_pass( >>> - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: flags = >>> %d. Failed as expected\n", >>> - flags); >>> + "%s test: flags = %d, errno = %d. 
Failed as expected\n", >>> + test_name, flags, errno); >>> return 0; >>> } >>> >>> -static int test_membarrier_success(void) >>> +static int test_membarrier_shared_success(void) >>> { >>> int cmd = MEMBARRIER_CMD_SHARED, flags = 0; >>> - const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED\n"; >>> + const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED"; >>> + >>> + if (sys_membarrier(cmd, flags) != 0) { >>> + ksft_exit_fail_msg( >>> + "%s test: flags = %d, errno = %d\n", >>> + test_name, flags, errno); >>> + } >>> + >>> + ksft_test_result_pass( >>> + "%s test: flags = %d\n", test_name, flags); >>> + return 0; >>> +} >>> + >>> +static int test_membarrier_private_expedited_fail(void) >>> +{ >>> + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; >>> + const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED >>> not registered failure"; >>> + >>> + if (sys_membarrier(cmd, flags) != -1) { >>> + ksft_exit_fail_msg( >>> + "%s test: flags = %d. Should fail, but passed\n", >>> + test_name, flags); >>> + } >>> + if (errno != EPERM) { >>> + ksft_exit_fail_msg( >>> + "%s test: flags = %d. 
Should return (%d: \"%s\"), but >>> returned (%d: \"%s\").\n", >>> + test_name, flags, EPERM, strerror(EPERM), >>> + errno, strerror(errno)); >>> + } >>> + >>> + ksft_test_result_pass( >>> + "%s test: flags = %d, errno = %d\n", >>> + test_name, flags, errno); >>> + return 0; >>> +} >>> + >>> +static int test_membarrier_register_private_expedited_success(void) >>> +{ >>> + int cmd = MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, flags = 0; >>> + const char *test_name = "sys membarrier >>> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED"; >>> >>> if (sys_membarrier(cmd, flags) != 0) { >>> ksft_exit_fail_msg( >>> - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = >>> %d\n", >>> - flags); >>> + "%s test: flags = %d, errno = %d\n", >>> + test_name, flags, errno); >>> } >>> >>> ksft_test_result_pass( >>> - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = %d\n", >>> - flags); >>> + "%s test: flags = %d\n", >>> + test_name, flags); >>> + return 0; >>> +} >>> + >>> +static int test_membarrier_private_expedited_success(void) >>> +{ >>> + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; >>> + const char *test_name = "sys membarrier >>> MEMBARRIER_CMD_PRIVATE_EXPEDITED"; >>> + >>> + if (sys_membarrier(cmd, flags) != 0) { >>> + ksft_exit_fail_msg( >>> + "%s test: flags = %d, errno = %d\n", >>> + test_name, flags, errno); >>> + } >>> + >>> + ksft_test_result_pass( >>> + "%s test: flags = %d\n", >>> + test_name, flags); >>> return 0; >>> } >>> >>> @@ -71,7 +141,16 @@ static int test_membarrier(void) >>> status = test_membarrier_flags_fail(); >>> if (status) >>> return status; >>> - status = test_membarrier_success(); >>> + status = test_membarrier_shared_success(); >>> + if (status) >>> + return status; >>> + status = test_membarrier_private_expedited_fail(); >>> + if (status) >>> + return status; >>> + status = test_membarrier_register_private_expedited_success(); >>> + if (status) >>> + return status; >>> + status = test_membarrier_private_expedited_success(); >>> if (status) 
>>> return status; >>> return 0; >>> -- >>> 2.11.0 > > -- > Mathieu Desnoyers > EfficiOS Inc. > http://www.efficios.com
* Re: [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd 2017-09-26 20:08 ` Shuah Khan @ 2017-09-26 20:15 ` Mathieu Desnoyers 2017-09-26 20:34 ` Shuah Khan 2017-09-26 21:15 ` Greg Kroah-Hartman 0 siblings, 2 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-26 20:15 UTC (permalink / raw) To: Shuah Khan, Greg Kroah-Hartman, Alice Ferrazzi, Paul Elder, Paul E. McKenney Cc: linux-kselftest, shuah, Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter, maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, Alan Stern, Will Deacon, Andy Lutomirski, linux-arch, Shuah Khan ----- On Sep 26, 2017, at 4:08 PM, Shuah Khan shuahkhan@gmail.com wrote: > On Tue, Sep 26, 2017 at 1:55 PM, Mathieu Desnoyers > <mathieu.desnoyers@efficios.com> wrote: >> ----- On Sep 26, 2017, at 3:41 PM, Shuah Khan shuahkhan@gmail.com wrote: >> >>> Hi Mathew, >>> >>> On Tue, Sep 26, 2017 at 11:51 AM, Mathieu Desnoyers >>> <mathieu.desnoyers@efficios.com> wrote: >>>> Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED and >>>> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED commands. >>>> >>>> Add checks expecting specific error values on system calls expected to >>>> fail. >>>> >>>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >>>> CC: Peter Zijlstra <peterz@infradead.org> >>>> CC: Paul E. 
McKenney <paulmck@linux.vnet.ibm.com> >>>> CC: Boqun Feng <boqun.feng@gmail.com> >>>> CC: Andrew Hunter <ahh@google.com> >>>> CC: Maged Michael <maged.michael@gmail.com> >>>> CC: gromer@google.com >>>> CC: Avi Kivity <avi@scylladb.com> >>>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> >>>> CC: Paul Mackerras <paulus@samba.org> >>>> CC: Michael Ellerman <mpe@ellerman.id.au> >>>> CC: Dave Watson <davejwatson@fb.com> >>>> CC: Alan Stern <stern@rowland.harvard.edu> >>>> CC: Will Deacon <will.deacon@arm.com> >>>> CC: Andy Lutomirski <luto@kernel.org> >>>> CC: linux-arch@vger.kernel.org >>>> --- >>>> .../testing/selftests/membarrier/membarrier_test.c | 109 >>> >>> Did you run get_maintainers script on this patch? I am curious why >>> get_maintainers didn't include linux-kselftest@vger.kernel.org and >>> shuah@kernel.org >> >> My mistake. I'll add this script to my checklist before I send out >> patches. >> >> If OK with you, I can just add the ML in CC here. >> > > Please add everybody get_maintainers suggest for this patch as well. > If this patch is going through linux-kselftest, you will have to resend the > patch to me at some point, so I can pull it in. > > I am guessing this has dependency on the other patches in the series > and will go through the primary tree. In that case CC is just fine and I will > review it and Ack it. Indeed, it has a dependency on another patch part of this series, which is aimed to go through Paul E. McKenney's tree. Adding Greg, Alice, and Paul Elder in CC as suggested by get_maintainers. Do you need me to re-send the series, or is this thread ok ? 
Thanks, Mathieu > > thanks, > -- Shuah >>> >>> Please make sure you send the patches to >>> ++++++++++++++++++--- >>>> 1 file changed, 94 insertions(+), 15 deletions(-) >>>> >>>> diff --git a/tools/testing/selftests/membarrier/membarrier_test.c >>>> b/tools/testing/selftests/membarrier/membarrier_test.c >>>> index 21399fcf1a59..f85657374b59 100644 >>>> --- a/tools/testing/selftests/membarrier/membarrier_test.c >>>> +++ b/tools/testing/selftests/membarrier/membarrier_test.c >>>> @@ -15,49 +15,119 @@ static int sys_membarrier(int cmd, int flags) >>>> static int test_membarrier_cmd_fail(void) >>>> { >>>> int cmd = -1, flags = 0; >>>> + const char *test_name = "sys membarrier invalid command"; >>>> >>>> if (sys_membarrier(cmd, flags) != -1) { >>>> ksft_exit_fail_msg( >>>> - "sys membarrier invalid command test: command = %d, >>>> flags = %d. Should fail, but passed\n", >>>> - cmd, flags); >>>> + "%s test: command = %d, flags = %d. Should fail, but >>>> passed\n", >>>> + test_name, cmd, flags); >>>> + } >>>> + if (errno != EINVAL) { >>>> + ksft_exit_fail_msg( >>>> + "%s test: flags = %d. Should return (%d: \"%s\"), but >>>> returned (%d: \"%s\").\n", >>>> + test_name, flags, EINVAL, strerror(EINVAL), >>>> + errno, strerror(errno)); >>>> } >>>> >>>> ksft_test_result_pass( >>>> - "sys membarrier invalid command test: command = %d, flags = %d. >>>> Failed as expected\n", >>>> - cmd, flags); >>>> + "%s test: command = %d, flags = %d, errno = %d. Failed as >>>> expected\n", >>>> + test_name, cmd, flags, errno); >>>> return 0; >>>> } >>>> >>>> static int test_membarrier_flags_fail(void) >>>> { >>>> int cmd = MEMBARRIER_CMD_QUERY, flags = 1; >>>> + const char *test_name = "sys membarrier MEMBARRIER_CMD_QUERY invalid >>>> flags"; >>>> >>>> if (sys_membarrier(cmd, flags) != -1) { >>>> ksft_exit_fail_msg( >>>> - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: >>>> flags = %d. Should fail, but passed\n", >>>> - flags); >>>> + "%s test: flags = %d. 
Should fail, but passed\n", >>>> + test_name, flags); >>>> + } >>>> + if (errno != EINVAL) { >>>> + ksft_exit_fail_msg( >>>> + "%s test: flags = %d. Should return (%d: \"%s\"), but >>>> returned (%d: \"%s\").\n", >>>> + test_name, flags, EINVAL, strerror(EINVAL), >>>> + errno, strerror(errno)); >>>> } >>>> >>>> ksft_test_result_pass( >>>> - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: flags = >>>> %d. Failed as expected\n", >>>> - flags); >>>> + "%s test: flags = %d, errno = %d. Failed as expected\n", >>>> + test_name, flags, errno); >>>> return 0; >>>> } >>>> >>>> -static int test_membarrier_success(void) >>>> +static int test_membarrier_shared_success(void) >>>> { >>>> int cmd = MEMBARRIER_CMD_SHARED, flags = 0; >>>> - const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED\n"; >>>> + const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED"; >>>> + >>>> + if (sys_membarrier(cmd, flags) != 0) { >>>> + ksft_exit_fail_msg( >>>> + "%s test: flags = %d, errno = %d\n", >>>> + test_name, flags, errno); >>>> + } >>>> + >>>> + ksft_test_result_pass( >>>> + "%s test: flags = %d\n", test_name, flags); >>>> + return 0; >>>> +} >>>> + >>>> +static int test_membarrier_private_expedited_fail(void) >>>> +{ >>>> + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; >>>> + const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED >>>> not registered failure"; >>>> + >>>> + if (sys_membarrier(cmd, flags) != -1) { >>>> + ksft_exit_fail_msg( >>>> + "%s test: flags = %d. Should fail, but passed\n", >>>> + test_name, flags); >>>> + } >>>> + if (errno != EPERM) { >>>> + ksft_exit_fail_msg( >>>> + "%s test: flags = %d. 
Should return (%d: \"%s\"), but >>>> returned (%d: \"%s\").\n", >>>> + test_name, flags, EPERM, strerror(EPERM), >>>> + errno, strerror(errno)); >>>> + } >>>> + >>>> + ksft_test_result_pass( >>>> + "%s test: flags = %d, errno = %d\n", >>>> + test_name, flags, errno); >>>> + return 0; >>>> +} >>>> + >>>> +static int test_membarrier_register_private_expedited_success(void) >>>> +{ >>>> + int cmd = MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, flags = 0; >>>> + const char *test_name = "sys membarrier >>>> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED"; >>>> >>>> if (sys_membarrier(cmd, flags) != 0) { >>>> ksft_exit_fail_msg( >>>> - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = >>>> %d\n", >>>> - flags); >>>> + "%s test: flags = %d, errno = %d\n", >>>> + test_name, flags, errno); >>>> } >>>> >>>> ksft_test_result_pass( >>>> - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = %d\n", >>>> - flags); >>>> + "%s test: flags = %d\n", >>>> + test_name, flags); >>>> + return 0; >>>> +} >>>> + >>>> +static int test_membarrier_private_expedited_success(void) >>>> +{ >>>> + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; >>>> + const char *test_name = "sys membarrier >>>> MEMBARRIER_CMD_PRIVATE_EXPEDITED"; >>>> + >>>> + if (sys_membarrier(cmd, flags) != 0) { >>>> + ksft_exit_fail_msg( >>>> + "%s test: flags = %d, errno = %d\n", >>>> + test_name, flags, errno); >>>> + } >>>> + >>>> + ksft_test_result_pass( >>>> + "%s test: flags = %d\n", >>>> + test_name, flags); >>>> return 0; >>>> } >>>> >>>> @@ -71,7 +141,16 @@ static int test_membarrier(void) >>>> status = test_membarrier_flags_fail(); >>>> if (status) >>>> return status; >>>> - status = test_membarrier_success(); >>>> + status = test_membarrier_shared_success(); >>>> + if (status) >>>> + return status; >>>> + status = test_membarrier_private_expedited_fail(); >>>> + if (status) >>>> + return status; >>>> + status = test_membarrier_register_private_expedited_success(); >>>> + if (status) >>>> + return status; >>>> + 
status = test_membarrier_private_expedited_success(); >>>> if (status) >>>> return status; >>>> return 0; >>>> -- >>>> 2.11.0 >> >> -- >> Mathieu Desnoyers >> EfficiOS Inc. > > http://www.efficios.com -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com
* Re: [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd 2017-09-26 20:15 ` Mathieu Desnoyers @ 2017-09-26 20:34 ` Shuah Khan 2017-09-26 21:15 ` Greg Kroah-Hartman 1 sibling, 0 replies; 23+ messages in thread From: Shuah Khan @ 2017-09-26 20:34 UTC (permalink / raw) To: Mathieu Desnoyers, Shuah Khan, Greg Kroah-Hartman, Alice Ferrazzi, Paul Elder, Paul E. McKenney Cc: linux-kselftest, Peter Zijlstra, Shuah Khan, Boqun Feng, Andrew Hunter, maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, Alan Stern, Will Deacon, Andy Lutomirski, linux-arch On 09/26/2017 02:15 PM, Mathieu Desnoyers wrote: > > > ----- On Sep 26, 2017, at 4:08 PM, Shuah Khan shuahkhan@gmail.com wrote: > >> On Tue, Sep 26, 2017 at 1:55 PM, Mathieu Desnoyers >> <mathieu.desnoyers@efficios.com> wrote: >>> ----- On Sep 26, 2017, at 3:41 PM, Shuah Khan shuahkhan@gmail.com wrote: >>> >>>> Hi Mathew, >>>> >>>> On Tue, Sep 26, 2017 at 11:51 AM, Mathieu Desnoyers >>>> <mathieu.desnoyers@efficios.com> wrote: >>>>> Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED and >>>>> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED commands. >>>>> >>>>> Add checks expecting specific error values on system calls expected to >>>>> fail. >>>>> >>>>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >>>>> CC: Peter Zijlstra <peterz@infradead.org> >>>>> CC: Paul E. 
McKenney <paulmck@linux.vnet.ibm.com> >>>>> CC: Boqun Feng <boqun.feng@gmail.com> >>>>> CC: Andrew Hunter <ahh@google.com> >>>>> CC: Maged Michael <maged.michael@gmail.com> >>>>> CC: gromer@google.com >>>>> CC: Avi Kivity <avi@scylladb.com> >>>>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> >>>>> CC: Paul Mackerras <paulus@samba.org> >>>>> CC: Michael Ellerman <mpe@ellerman.id.au> >>>>> CC: Dave Watson <davejwatson@fb.com> >>>>> CC: Alan Stern <stern@rowland.harvard.edu> >>>>> CC: Will Deacon <will.deacon@arm.com> >>>>> CC: Andy Lutomirski <luto@kernel.org> >>>>> CC: linux-arch@vger.kernel.org >>>>> --- >>>>> .../testing/selftests/membarrier/membarrier_test.c | 109 >>>> >>>> Did you run get_maintainers script on this patch? I am curious why >>>> get_maintainers didn't include linux-kselftest@vger.kernel.org and >>>> shuah@kernel.org >>> >>> My mistake. I'll add this script to my checklist before I send out >>> patches. >>> >>> If OK with you, I can just add the ML in CC here. >>> >> >> Please add everybody get_maintainers suggest for this patch as well. >> If this patch is going through linux-kselftest, you will have to resend the >> patch to me at some point, so I can pull it in. >> >> I am guessing this has dependency on the other patches in the series >> and will go through the primary tree. In that case CC is just fine and I will >> review it and Ack it. > > Indeed, it has a dependency on another patch part of this > series, which is aimed to go through Paul E. McKenney's tree. > > Adding Greg, Alice, and Paul Elder in CC as suggested by get_maintainers. > > Do you need me to re-send the series, or is this thread ok ? > Okay with me. Looks good to me. 
Acked-by: Shuah Khan <shuahkh@osg.samsung.com> thanks, -- Shuah > >> >> thanks, >> -- Shuah >>>> >>>> Please make sure you send the patches to >>>> ++++++++++++++++++--- >>>>> 1 file changed, 94 insertions(+), 15 deletions(-) >>>>> >>>>> diff --git a/tools/testing/selftests/membarrier/membarrier_test.c >>>>> b/tools/testing/selftests/membarrier/membarrier_test.c >>>>> index 21399fcf1a59..f85657374b59 100644 >>>>> --- a/tools/testing/selftests/membarrier/membarrier_test.c >>>>> +++ b/tools/testing/selftests/membarrier/membarrier_test.c >>>>> @@ -15,49 +15,119 @@ static int sys_membarrier(int cmd, int flags) >>>>> static int test_membarrier_cmd_fail(void) >>>>> { >>>>> int cmd = -1, flags = 0; >>>>> + const char *test_name = "sys membarrier invalid command"; >>>>> >>>>> if (sys_membarrier(cmd, flags) != -1) { >>>>> ksft_exit_fail_msg( >>>>> - "sys membarrier invalid command test: command = %d, >>>>> flags = %d. Should fail, but passed\n", >>>>> - cmd, flags); >>>>> + "%s test: command = %d, flags = %d. Should fail, but >>>>> passed\n", >>>>> + test_name, cmd, flags); >>>>> + } >>>>> + if (errno != EINVAL) { >>>>> + ksft_exit_fail_msg( >>>>> + "%s test: flags = %d. Should return (%d: \"%s\"), but >>>>> returned (%d: \"%s\").\n", >>>>> + test_name, flags, EINVAL, strerror(EINVAL), >>>>> + errno, strerror(errno)); >>>>> } >>>>> >>>>> ksft_test_result_pass( >>>>> - "sys membarrier invalid command test: command = %d, flags = %d. >>>>> Failed as expected\n", >>>>> - cmd, flags); >>>>> + "%s test: command = %d, flags = %d, errno = %d. 
Failed as >>>>> expected\n", >>>>> + test_name, cmd, flags, errno); >>>>> return 0; >>>>> } >>>>> >>>>> static int test_membarrier_flags_fail(void) >>>>> { >>>>> int cmd = MEMBARRIER_CMD_QUERY, flags = 1; >>>>> + const char *test_name = "sys membarrier MEMBARRIER_CMD_QUERY invalid >>>>> flags"; >>>>> >>>>> if (sys_membarrier(cmd, flags) != -1) { >>>>> ksft_exit_fail_msg( >>>>> - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: >>>>> flags = %d. Should fail, but passed\n", >>>>> - flags); >>>>> + "%s test: flags = %d. Should fail, but passed\n", >>>>> + test_name, flags); >>>>> + } >>>>> + if (errno != EINVAL) { >>>>> + ksft_exit_fail_msg( >>>>> + "%s test: flags = %d. Should return (%d: \"%s\"), but >>>>> returned (%d: \"%s\").\n", >>>>> + test_name, flags, EINVAL, strerror(EINVAL), >>>>> + errno, strerror(errno)); >>>>> } >>>>> >>>>> ksft_test_result_pass( >>>>> - "sys membarrier MEMBARRIER_CMD_QUERY invalid flags test: flags = >>>>> %d. Failed as expected\n", >>>>> - flags); >>>>> + "%s test: flags = %d, errno = %d. 
Failed as expected\n", >>>>> + test_name, flags, errno); >>>>> return 0; >>>>> } >>>>> >>>>> -static int test_membarrier_success(void) >>>>> +static int test_membarrier_shared_success(void) >>>>> { >>>>> int cmd = MEMBARRIER_CMD_SHARED, flags = 0; >>>>> - const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED\n"; >>>>> + const char *test_name = "sys membarrier MEMBARRIER_CMD_SHARED"; >>>>> + >>>>> + if (sys_membarrier(cmd, flags) != 0) { >>>>> + ksft_exit_fail_msg( >>>>> + "%s test: flags = %d, errno = %d\n", >>>>> + test_name, flags, errno); >>>>> + } >>>>> + >>>>> + ksft_test_result_pass( >>>>> + "%s test: flags = %d\n", test_name, flags); >>>>> + return 0; >>>>> +} >>>>> + >>>>> +static int test_membarrier_private_expedited_fail(void) >>>>> +{ >>>>> + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; >>>>> + const char *test_name = "sys membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED >>>>> not registered failure"; >>>>> + >>>>> + if (sys_membarrier(cmd, flags) != -1) { >>>>> + ksft_exit_fail_msg( >>>>> + "%s test: flags = %d. Should fail, but passed\n", >>>>> + test_name, flags); >>>>> + } >>>>> + if (errno != EPERM) { >>>>> + ksft_exit_fail_msg( >>>>> + "%s test: flags = %d. 
Should return (%d: \"%s\"), but >>>>> returned (%d: \"%s\").\n", >>>>> + test_name, flags, EPERM, strerror(EPERM), >>>>> + errno, strerror(errno)); >>>>> + } >>>>> + >>>>> + ksft_test_result_pass( >>>>> + "%s test: flags = %d, errno = %d\n", >>>>> + test_name, flags, errno); >>>>> + return 0; >>>>> +} >>>>> + >>>>> +static int test_membarrier_register_private_expedited_success(void) >>>>> +{ >>>>> + int cmd = MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, flags = 0; >>>>> + const char *test_name = "sys membarrier >>>>> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED"; >>>>> >>>>> if (sys_membarrier(cmd, flags) != 0) { >>>>> ksft_exit_fail_msg( >>>>> - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = >>>>> %d\n", >>>>> - flags); >>>>> + "%s test: flags = %d, errno = %d\n", >>>>> + test_name, flags, errno); >>>>> } >>>>> >>>>> ksft_test_result_pass( >>>>> - "sys membarrier MEMBARRIER_CMD_SHARED test: flags = %d\n", >>>>> - flags); >>>>> + "%s test: flags = %d\n", >>>>> + test_name, flags); >>>>> + return 0; >>>>> +} >>>>> + >>>>> +static int test_membarrier_private_expedited_success(void) >>>>> +{ >>>>> + int cmd = MEMBARRIER_CMD_PRIVATE_EXPEDITED, flags = 0; >>>>> + const char *test_name = "sys membarrier >>>>> MEMBARRIER_CMD_PRIVATE_EXPEDITED"; >>>>> + >>>>> + if (sys_membarrier(cmd, flags) != 0) { >>>>> + ksft_exit_fail_msg( >>>>> + "%s test: flags = %d, errno = %d\n", >>>>> + test_name, flags, errno); >>>>> + } >>>>> + >>>>> + ksft_test_result_pass( >>>>> + "%s test: flags = %d\n", >>>>> + test_name, flags); >>>>> return 0; >>>>> } >>>>> >>>>> @@ -71,7 +141,16 @@ static int test_membarrier(void) >>>>> status = test_membarrier_flags_fail(); >>>>> if (status) >>>>> return status; >>>>> - status = test_membarrier_success(); >>>>> + status = test_membarrier_shared_success(); >>>>> + if (status) >>>>> + return status; >>>>> + status = test_membarrier_private_expedited_fail(); >>>>> + if (status) >>>>> + return status; >>>>> + status = 
test_membarrier_register_private_expedited_success(); >>>>> + if (status) >>>>> + return status; >>>>> + status = test_membarrier_private_expedited_success(); >>>>> if (status) >>>>> return status; >>>>> return 0; >>>>> -- >>>>> 2.11.0 >>> >>> -- >>> Mathieu Desnoyers >>> EfficiOS Inc. >>> http://www.efficios.com >
* Re: [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd 2017-09-26 20:15 ` Mathieu Desnoyers @ 2017-09-26 21:15 ` Greg Kroah-Hartman 2017-09-26 21:15 ` Greg Kroah-Hartman 1 sibling, 0 replies; 23+ messages in thread From: Greg Kroah-Hartman @ 2017-09-26 21:15 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Shuah Khan, Alice Ferrazzi, Paul Elder, Paul E. McKenney, linux-kselftest, shuah, Peter Zijlstra, linux-kernel, Boqun Feng, Andrew Hunter, maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, Alan Stern, Will Deacon, Andy Lutomirski, linux-arch, Shuah Khan On Tue, Sep 26, 2017 at 08:15:25PM +0000, Mathieu Desnoyers wrote: > > > ----- On Sep 26, 2017, at 4:08 PM, Shuah Khan shuahkhan@gmail.com wrote: > > > On Tue, Sep 26, 2017 at 1:55 PM, Mathieu Desnoyers > > <mathieu.desnoyers@efficios.com> wrote: > >> ----- On Sep 26, 2017, at 3:41 PM, Shuah Khan shuahkhan@gmail.com wrote: > >> > >>> Hi Mathew, > >>> > >>> On Tue, Sep 26, 2017 at 11:51 AM, Mathieu Desnoyers > >>> <mathieu.desnoyers@efficios.com> wrote: > >>>> Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED and > >>>> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED commands. > >>>> > >>>> Add checks expecting specific error values on system calls expected to > >>>> fail. > >>>> > >>>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > >>>> CC: Peter Zijlstra <peterz@infradead.org> > >>>> CC: Paul E. 
McKenney <paulmck@linux.vnet.ibm.com> > >>>> CC: Boqun Feng <boqun.feng@gmail.com> > >>>> CC: Andrew Hunter <ahh@google.com> > >>>> CC: Maged Michael <maged.michael@gmail.com> > >>>> CC: gromer@google.com > >>>> CC: Avi Kivity <avi@scylladb.com> > >>>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> > >>>> CC: Paul Mackerras <paulus@samba.org> > >>>> CC: Michael Ellerman <mpe@ellerman.id.au> > >>>> CC: Dave Watson <davejwatson@fb.com> > >>>> CC: Alan Stern <stern@rowland.harvard.edu> > >>>> CC: Will Deacon <will.deacon@arm.com> > >>>> CC: Andy Lutomirski <luto@kernel.org> > >>>> CC: linux-arch@vger.kernel.org > >>>> --- > >>>> .../testing/selftests/membarrier/membarrier_test.c | 109 > >>> > >>> Did you run get_maintainers script on this patch? I am curious why > >>> get_maintainers didn't include linux-kselftest@vger.kernel.org and > >>> shuah@kernel.org > >> > >> My mistake. I'll add this script to my checklist before I send out > >> patches. > >> > >> If OK with you, I can just add the ML in CC here. > >> > > > > Please add everybody get_maintainers suggest for this patch as well. > > If this patch is going through linux-kselftest, you will have to resend the > > patch to me at some point, so I can pull it in. > > > > I am guessing this has dependency on the other patches in the series > > and will go through the primary tree. In that case CC is just fine and I will > > review it and Ack it. > > Indeed, it has a dependency on another patch part of this > series, which is aimed to go through Paul E. McKenney's tree. > > Adding Greg, Alice, and Paul Elder in CC as suggested by get_maintainers. > > Do you need me to re-send the series, or is this thread ok ? Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v4 for 4.14 3/3] membarrier: Document scheduler barrier requirements 2017-09-26 17:51 [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command Mathieu Desnoyers 2017-09-26 17:51 ` [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd Mathieu Desnoyers @ 2017-09-26 17:51 ` Mathieu Desnoyers 2017-09-26 20:46 ` Mathieu Desnoyers 2017-09-26 20:43 ` [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command Mathieu Desnoyers 2 siblings, 1 reply; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-26 17:51 UTC (permalink / raw) To: Paul E. McKenney, Peter Zijlstra Cc: linux-kernel, Mathieu Desnoyers, Boqun Feng, Andrew Hunter, Maged Michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson Document the membarrier requirement on having a full memory barrier in __schedule() after coming from user-space, before storing to rq->curr. It is provided by smp_mb__after_spinlock() in __schedule(). Document that membarrier requires a full barrier on transition from kernel thread to userspace thread. We currently have an implicit barrier from atomic_dec_and_test() in mmdrop() that ensures this. The x86 switch_mm_irqs_off() full barrier is currently provided by many cpumask update operations as well as write_cr3(). Document that write_cr3() provides this barrier. Changes since v1: - Update comments to match reality for code paths which are after storing to rq->curr, before returning to user-space. Changes since v2: - Update changelog (smp_mb__before_spinlock -> smp_mb__after_spinlock). Changes since v3: - Clarify comments following feedback from Peter Zijlstra. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Paul E.
McKenney <paulmck@linux.vnet.ibm.com> CC: Boqun Feng <boqun.feng@gmail.com> CC: Andrew Hunter <ahh@google.com> CC: Maged Michael <maged.michael@gmail.com> CC: gromer@google.com CC: Avi Kivity <avi@scylladb.com> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Michael Ellerman <mpe@ellerman.id.au> CC: Dave Watson <davejwatson@fb.com> --- arch/x86/mm/tlb.c | 5 +++++ include/linux/sched/mm.h | 5 +++++ kernel/sched/core.c | 38 +++++++++++++++++++++++++++----------- 3 files changed, 37 insertions(+), 11 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 93fe97cce581..5ba86b85953b 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -143,6 +143,11 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, } #endif + /* + * The membarrier system call requires a full memory barrier + * before returning to user-space, after storing to rq->curr. + * Writing to CR3 provides that full memory barrier. + */ if (real_prev == next) { VM_BUG_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) != next->context.ctx_id); diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index d3b81e48784d..1bd10c2c0893 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -38,6 +38,11 @@ static inline void mmgrab(struct mm_struct *mm) extern void __mmdrop(struct mm_struct *); static inline void mmdrop(struct mm_struct *mm) { + /* + * The implicit full barrier implied by atomic_dec_and_test is + * required by the membarrier system call before returning to + * user-space, after storing to rq->curr. 
+ */ if (unlikely(atomic_dec_and_test(&mm->mm_count))) __mmdrop(mm); } diff --git a/kernel/sched/core.c b/kernel/sched/core.c index b9d731283946..6254f87645de 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2649,6 +2649,12 @@ static struct rq *finish_task_switch(struct task_struct *prev) finish_arch_post_lock_switch(); fire_sched_in_preempt_notifiers(current); + /* + * When transitioning from a kernel thread to a userspace + * thread, mmdrop()'s implicit full barrier is required by the + * membarrier system call, because the current active_mm can + * become the current mm without going through switch_mm(). + */ if (mm) mmdrop(mm); if (unlikely(prev_state == TASK_DEAD)) { @@ -2754,6 +2760,13 @@ context_switch(struct rq *rq, struct task_struct *prev, */ arch_start_context_switch(prev); + /* + * If mm is non-NULL, we pass through switch_mm(). If mm is + * NULL, we will pass through mmdrop() in finish_task_switch(). + * Both of these contain the full memory barrier required by + * membarrier after storing to rq->curr, before returning to + * user-space. + */ if (!mm) { next->active_mm = oldmm; mmgrab(oldmm); @@ -3290,6 +3303,9 @@ static void __sched notrace __schedule(bool preempt) * Make sure that signal_pending_state()->signal_pending() below * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE) * done by the caller to avoid the race with signal_wake_up(). + * + * The membarrier system call requires a full memory barrier + * after coming from user-space, before storing to rq->curr. */ rq_lock(rq, &rf); smp_mb__after_spinlock(); @@ -3337,17 +3353,17 @@ static void __sched notrace __schedule(bool preempt) /* * The membarrier system call requires each architecture * to have a full memory barrier after updating - * rq->curr, before returning to user-space. For TSO - * (e.g. x86), the architecture must provide its own - * barrier in switch_mm(). 
For weakly ordered machines - * for which spin_unlock() acts as a full memory - * barrier, finish_lock_switch() in common code takes - * care of this barrier. For weakly ordered machines for - * which spin_unlock() acts as a RELEASE barrier (only - * arm64 and PowerPC), arm64 has a full barrier in - * switch_to(), and PowerPC has - * smp_mb__after_unlock_lock() before - * finish_lock_switch(). + * rq->curr, before returning to user-space. + * + * Here are the schemes providing that barrier on the + * various architectures: + * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, + * - finish_lock_switch() for weakly-ordered + * architectures where spin_unlock is a full barrier, + * - switch_to() for arm64 (weakly-ordered, spin_unlock + * is a RELEASE barrier), + * - membarrier_arch_sched_in() for PowerPC, + * (weakly-ordered, spin_unlock is a RELEASE barrier). */ ++*switch_count; -- 2.11.0 ^ permalink raw reply related [flat|nested] 23+ messages in thread
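The scheduler-side barriers documented in this patch matter because of the user-space pairing membarrier enables: a process can drop the fence from a hot fast path and let the expedited syscall, combined with the scheduler's full barriers around rq->curr updates, supply it. A hedged sketch of that classic asymmetric-fence pattern follows (illustrative names using C11 atomics, not code from the patch):

```c
/* Illustrative asymmetric-fence pattern relying on the scheduler
 * barriers documented above (e.g. user-space RCU style fast/slow
 * paths).  Variable names are hypothetical. */
#include <stdatomic.h>

static _Atomic int reader_active;	/* per-reader fast-path flag */
static _Atomic int gp;			/* slow-path generation counter */

void reader_fast_path(void)
{
	atomic_store_explicit(&reader_active, 1, memory_order_relaxed);
	/* No local smp_mb() equivalent here: the writer's membarrier()
	 * call, plus the full barrier each CPU executes when the
	 * scheduler stores to rq->curr, orders this store against the
	 * subsequent load as observed by the writer. */
	(void)atomic_load_explicit(&gp, memory_order_relaxed);
}

void writer_slow_path(void)
{
	atomic_fetch_add_explicit(&gp, 1, memory_order_relaxed);
	/* sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0) would go
	 * here: when it returns, every currently running thread of the
	 * process has executed a full memory barrier, so the writer can
	 * now safely read reader_active. */
	(void)atomic_load_explicit(&reader_active, memory_order_relaxed);
}
```

The win is that the fence cost moves entirely onto the infrequent writer path; the reader pays only two relaxed accesses.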
* Re: [PATCH v4 for 4.14 3/3] membarrier: Document scheduler barrier requirements 2017-09-26 17:51 ` [PATCH v4 for 4.14 3/3] membarrier: Document scheduler barrier requirements Mathieu Desnoyers @ 2017-09-26 20:46 ` Mathieu Desnoyers 0 siblings, 0 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-26 20:46 UTC (permalink / raw) To: Paul E. McKenney, Peter Zijlstra, Ingo Molnar, Thomas Gleixner, H. Peter Anvin Cc: linux-kernel, Boqun Feng, Andrew Hunter, maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, x86 ----- On Sep 26, 2017, at 1:51 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote: > Document the membarrier requirement on having a full memory barrier in > __schedule() after coming from user-space, before storing to rq->curr. > It is provided by smp_mb__after_spinlock() in __schedule(). Missed a few maintainers that should have been CC'd. Adding them now. This patch is aimed to go through Paul E. McKenney's tree. Thanks, Mathieu > > Document that membarrier requires a full barrier on transition from > kernel thread to userspace thread. We currently have an implicit barrier > from atomic_dec_and_test() in mmdrop() that ensures this. > > The x86 switch_mm_irqs_off() full barrier is currently provided by many > cpumask update operations as well as write_cr3(). Document that > write_cr3() provides this barrier. > > Changes since v1: > - Update comments to match reality for code paths which are after > storing to rq->curr, before returning to user-space. > Changes since v2: > - Update changelog (smp_mb__before_spinlock -> smp_mb__after_spinlock). > Changes since v3: > - Clarify comments following feeback from Peter Zijlstra. > > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > CC: Peter Zijlstra <peterz@infradead.org> > CC: Paul E. 
McKenney <paulmck@linux.vnet.ibm.com> > CC: Boqun Feng <boqun.feng@gmail.com> > CC: Andrew Hunter <ahh@google.com> > CC: Maged Michael <maged.michael@gmail.com> > CC: gromer@google.com > CC: Avi Kivity <avi@scylladb.com> > CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> > CC: Paul Mackerras <paulus@samba.org> > CC: Michael Ellerman <mpe@ellerman.id.au> > CC: Dave Watson <davejwatson@fb.com> > --- > arch/x86/mm/tlb.c | 5 +++++ > include/linux/sched/mm.h | 5 +++++ > kernel/sched/core.c | 38 +++++++++++++++++++++++++++----------- > 3 files changed, 37 insertions(+), 11 deletions(-) > > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c > index 93fe97cce581..5ba86b85953b 100644 > --- a/arch/x86/mm/tlb.c > +++ b/arch/x86/mm/tlb.c > @@ -143,6 +143,11 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct > mm_struct *next, > } > #endif > > + /* > + * The membarrier system call requires a full memory barrier > + * before returning to user-space, after storing to rq->curr. > + * Writing to CR3 provides that full memory barrier. > + */ > if (real_prev == next) { > VM_BUG_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) != > next->context.ctx_id); > diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h > index d3b81e48784d..1bd10c2c0893 100644 > --- a/include/linux/sched/mm.h > +++ b/include/linux/sched/mm.h > @@ -38,6 +38,11 @@ static inline void mmgrab(struct mm_struct *mm) > extern void __mmdrop(struct mm_struct *); > static inline void mmdrop(struct mm_struct *mm) > { > + /* > + * The implicit full barrier implied by atomic_dec_and_test is > + * required by the membarrier system call before returning to > + * user-space, after storing to rq->curr. 
> + */ > if (unlikely(atomic_dec_and_test(&mm->mm_count))) > __mmdrop(mm); > } > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index b9d731283946..6254f87645de 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -2649,6 +2649,12 @@ static struct rq *finish_task_switch(struct task_struct > *prev) > finish_arch_post_lock_switch(); > > fire_sched_in_preempt_notifiers(current); > + /* > + * When transitioning from a kernel thread to a userspace > + * thread, mmdrop()'s implicit full barrier is required by the > + * membarrier system call, because the current active_mm can > + * become the current mm without going through switch_mm(). > + */ > if (mm) > mmdrop(mm); > if (unlikely(prev_state == TASK_DEAD)) { > @@ -2754,6 +2760,13 @@ context_switch(struct rq *rq, struct task_struct *prev, > */ > arch_start_context_switch(prev); > > + /* > + * If mm is non-NULL, we pass through switch_mm(). If mm is > + * NULL, we will pass through mmdrop() in finish_task_switch(). > + * Both of these contain the full memory barrier required by > + * membarrier after storing to rq->curr, before returning to > + * user-space. > + */ > if (!mm) { > next->active_mm = oldmm; > mmgrab(oldmm); > @@ -3290,6 +3303,9 @@ static void __sched notrace __schedule(bool preempt) > * Make sure that signal_pending_state()->signal_pending() below > * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE) > * done by the caller to avoid the race with signal_wake_up(). > + * > + * The membarrier system call requires a full memory barrier > + * after coming from user-space, before storing to rq->curr. > */ > rq_lock(rq, &rf); > smp_mb__after_spinlock(); > @@ -3337,17 +3353,17 @@ static void __sched notrace __schedule(bool preempt) > /* > * The membarrier system call requires each architecture > * to have a full memory barrier after updating > - * rq->curr, before returning to user-space. For TSO > - * (e.g. 
x86), the architecture must provide its own > - * barrier in switch_mm(). For weakly ordered machines > - * for which spin_unlock() acts as a full memory > - * barrier, finish_lock_switch() in common code takes > - * care of this barrier. For weakly ordered machines for > - * which spin_unlock() acts as a RELEASE barrier (only > - * arm64 and PowerPC), arm64 has a full barrier in > - * switch_to(), and PowerPC has > - * smp_mb__after_unlock_lock() before > - * finish_lock_switch(). > + * rq->curr, before returning to user-space. > + * > + * Here are the schemes providing that barrier on the > + * various architectures: > + * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, > + * - finish_lock_switch() for weakly-ordered > + * architectures where spin_unlock is a full barrier, > + * - switch_to() for arm64 (weakly-ordered, spin_unlock > + * is a RELEASE barrier), > + * - membarrier_arch_sched_in() for PowerPC, > + * (weakly-ordered, spin_unlock is a RELEASE barrier). > */ > ++*switch_count; > > -- > 2.11.0 -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command 2017-09-26 17:51 [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command Mathieu Desnoyers 2017-09-26 17:51 ` [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd Mathieu Desnoyers 2017-09-26 17:51 ` [PATCH v4 for 4.14 3/3] membarrier: Document scheduler barrier requirements Mathieu Desnoyers @ 2017-09-26 20:43 ` Mathieu Desnoyers 2017-09-27 13:04 ` Nicholas Piggin 2 siblings, 1 reply; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-26 20:43 UTC (permalink / raw) To: Paul E. McKenney, Peter Zijlstra, Ingo Molnar, Alexander Viro Cc: linux-kernel, Boqun Feng, Andrew Hunter, maged michael, gromer, Avi Kivity, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson, Alan Stern, Will Deacon, Andy Lutomirski, linux-arch, linuxppc-dev ----- On Sep 26, 2017, at 1:51 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote: > Provide a new command allowing processes to register their intent to use > the private expedited command. > I missed a few maintainers that should have been CC'd. Adding them now. This patch is aimed to go through Paul E. McKenney's tree. Thanks, Mathieu > This allows PowerPC to skip the full memory barrier in switch_mm(), and > only issue the barrier when scheduling into a task belonging to a > process that has registered to use expedited private. > > Processes are now required to register before using > MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM. > > Changes since v1: > - Use test_ti_thread_flag(next, ...) instead of test_thread_flag() in > powerpc membarrier_arch_sched_in(), given that we want to specifically > check the next thread state. > - Add missing ARCH_HAS_MEMBARRIER_HOOKS in Kconfig. > - Use task_thread_info() to pass thread_info from task to > *_ti_thread_flag(). > > Changes since v2: > - Move membarrier_arch_sched_in() call to finish_task_switch(). 
> - Check for NULL t->mm in membarrier_arch_fork(). > - Use membarrier_sched_in() in generic code, which invokes the > arch-specific membarrier_arch_sched_in(). This fixes allnoconfig > build on PowerPC. > - Move asm/membarrier.h include under CONFIG_MEMBARRIER, fixing > allnoconfig build on PowerPC. > - Build and runtime tested on PowerPC. > > Changes since v3: > - Simply rely on copy_mm() to copy the membarrier_private_expedited mm > field on fork. > - powerpc: test thread flag instead of reading > membarrier_private_expedited in membarrier_arch_fork(). > - powerpc: skip memory barrier in membarrier_arch_sched_in() if coming > from kernel thread, since mmdrop() implies a full barrier. > - Set membarrier_private_expedited to 1 only after arch registration > code, thus eliminating a race where concurrent commands could succeed > when they should fail if issued concurrently with process > registration. > - Use READ_ONCE() for membarrier_private_expedited field access in > membarrier_private_expedited. Matches WRITE_ONCE() performed in > process registration. > > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > CC: Peter Zijlstra <peterz@infradead.org> > CC: Paul E. 
McKenney <paulmck@linux.vnet.ibm.com> > CC: Boqun Feng <boqun.feng@gmail.com> > CC: Andrew Hunter <ahh@google.com> > CC: Maged Michael <maged.michael@gmail.com> > CC: gromer@google.com > CC: Avi Kivity <avi@scylladb.com> > CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> > CC: Paul Mackerras <paulus@samba.org> > CC: Michael Ellerman <mpe@ellerman.id.au> > CC: Dave Watson <davejwatson@fb.com> > CC: Alan Stern <stern@rowland.harvard.edu> > CC: Will Deacon <will.deacon@arm.com> > CC: Andy Lutomirski <luto@kernel.org> > CC: linux-arch@vger.kernel.org > --- > MAINTAINERS | 2 ++ > arch/powerpc/Kconfig | 1 + > arch/powerpc/include/asm/membarrier.h | 44 +++++++++++++++++++++++++++ > arch/powerpc/include/asm/thread_info.h | 3 ++ > arch/powerpc/kernel/Makefile | 2 ++ > arch/powerpc/kernel/membarrier.c | 45 ++++++++++++++++++++++++++++ > fs/exec.c | 1 + > include/linux/mm_types.h | 3 ++ > include/linux/sched/mm.h | 55 ++++++++++++++++++++++++++++++++++ > include/uapi/linux/membarrier.h | 23 +++++++++----- > init/Kconfig | 3 ++ > kernel/fork.c | 2 ++ > kernel/sched/core.c | 11 +------ > kernel/sched/membarrier.c | 25 +++++++++++++--- > 14 files changed, 199 insertions(+), 21 deletions(-) > create mode 100644 arch/powerpc/include/asm/membarrier.h > create mode 100644 arch/powerpc/kernel/membarrier.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index 6671f375f7fc..f11d8aece00d 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -8816,6 +8816,8 @@ L: linux-kernel@vger.kernel.org > S: Supported > F: kernel/sched/membarrier.c > F: include/uapi/linux/membarrier.h > +F: arch/powerpc/kernel/membarrier.c > +F: arch/powerpc/include/asm/membarrier.h > > MEMORY MANAGEMENT > L: linux-mm@kvack.org > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 809c468edab1..6f44c5f74f71 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -138,6 +138,7 @@ config PPC > select ARCH_HAS_ELF_RANDOMIZE > select ARCH_HAS_FORTIFY_SOURCE > select ARCH_HAS_GCOV_PROFILE_ALL > 
+ select ARCH_HAS_MEMBARRIER_HOOKS > select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE > select ARCH_HAS_SG_CHAIN > select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST > diff --git a/arch/powerpc/include/asm/membarrier.h > b/arch/powerpc/include/asm/membarrier.h > new file mode 100644 > index 000000000000..588154c1cf57 > --- /dev/null > +++ b/arch/powerpc/include/asm/membarrier.h > @@ -0,0 +1,44 @@ > +#ifndef _ASM_POWERPC_MEMBARRIER_H > +#define _ASM_POWERPC_MEMBARRIER_H > + > +static inline void membarrier_arch_sched_in(struct task_struct *prev, > + struct task_struct *next) > +{ > + /* > + * Only need the full barrier when switching between processes. > + * Barrier when switching from kernel to userspace is not > + * required here, given that it is implied by mmdrop(). Barrier > + * when switching from userspace to kernel is not needed after > + * store to rq->curr. > + */ > + if (likely(!test_ti_thread_flag(task_thread_info(next), > + TIF_MEMBARRIER_PRIVATE_EXPEDITED) > + || !prev->mm || prev->mm == next->mm)) > + return; > + > + /* > + * The membarrier system call requires a full memory barrier > + * after storing to rq->curr, before going back to user-space. > + */ > + smp_mb(); > +} > +static inline void membarrier_arch_fork(struct task_struct *t, > + unsigned long clone_flags) > +{ > + /* > + * Coherence of TIF_MEMBARRIER_PRIVATE_EXPEDITED against thread > + * fork is protected by siglock. membarrier_arch_fork is called > + * with siglock held. 
> + */ > + if (test_thread_flag(TIF_MEMBARRIER_PRIVATE_EXPEDITED)) > + set_ti_thread_flag(task_thread_info(t), > + TIF_MEMBARRIER_PRIVATE_EXPEDITED); > +} > +static inline void membarrier_arch_execve(struct task_struct *t) > +{ > + clear_ti_thread_flag(task_thread_info(t), > + TIF_MEMBARRIER_PRIVATE_EXPEDITED); > +} > +void membarrier_arch_register_private_expedited(struct task_struct *t); > + > +#endif /* _ASM_POWERPC_MEMBARRIER_H */ > diff --git a/arch/powerpc/include/asm/thread_info.h > b/arch/powerpc/include/asm/thread_info.h > index a941cc6fc3e9..2a208487724b 100644 > --- a/arch/powerpc/include/asm/thread_info.h > +++ b/arch/powerpc/include/asm/thread_info.h > @@ -100,6 +100,7 @@ static inline struct thread_info *current_thread_info(void) > #if defined(CONFIG_PPC64) > #define TIF_ELF2ABI 18 /* function descriptors must die! */ > #endif > +#define TIF_MEMBARRIER_PRIVATE_EXPEDITED 19 /* membarrier */ > > /* as above, but as bit values */ > #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE) > @@ -119,6 +120,8 @@ static inline struct thread_info *current_thread_info(void) > #define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT) > #define _TIF_EMULATE_STACK_STORE (1<<TIF_EMULATE_STACK_STORE) > #define _TIF_NOHZ (1<<TIF_NOHZ) > +#define _TIF_MEMBARRIER_PRIVATE_EXPEDITED \ > + (1<<TIF_MEMBARRIER_PRIVATE_EXPEDITED) > #define _TIF_SYSCALL_DOTRACE (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ > _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \ > _TIF_NOHZ) > diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile > index 91960f83039c..2dd4b9e3313a 100644 > --- a/arch/powerpc/kernel/Makefile > +++ b/arch/powerpc/kernel/Makefile > @@ -135,6 +135,8 @@ endif > obj-$(CONFIG_EPAPR_PARAVIRT) += epapr_paravirt.o epapr_hcalls.o > obj-$(CONFIG_KVM_GUEST) += kvm.o kvm_emul.o > > +obj-$(CONFIG_MEMBARRIER) += membarrier.o > + > # Disable GCOV & sanitizers in odd or sensitive code > GCOV_PROFILE_prom_init.o := n > UBSAN_SANITIZE_prom_init.o := n > diff --git 
a/arch/powerpc/kernel/membarrier.c b/arch/powerpc/kernel/membarrier.c > new file mode 100644 > index 000000000000..b0d79a5f5981 > --- /dev/null > +++ b/arch/powerpc/kernel/membarrier.c > @@ -0,0 +1,45 @@ > +/* > + * Copyright (C) 2010-2017 Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > + * > + * membarrier system call - PowerPC architecture code > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + */ > + > +#include <linux/sched/mm.h> > +#include <linux/sched/signal.h> > +#include <linux/thread_info.h> > +#include <linux/spinlock.h> > +#include <linux/rcupdate.h> > + > +void membarrier_arch_register_private_expedited(struct task_struct *p) > +{ > + struct task_struct *t; > + > + if (get_nr_threads(p) == 1) { > + set_thread_flag(TIF_MEMBARRIER_PRIVATE_EXPEDITED); > + return; > + } > + /* > + * Coherence of TIF_MEMBARRIER_PRIVATE_EXPEDITED against thread > + * fork is protected by siglock. > + */ > + spin_lock(&p->sighand->siglock); > + for_each_thread(p, t) > + set_ti_thread_flag(task_thread_info(t), > + TIF_MEMBARRIER_PRIVATE_EXPEDITED); > + spin_unlock(&p->sighand->siglock); > + /* > + * Ensure all future scheduler executions will observe the new > + * thread flag state for this process. 
> + */ > + synchronize_sched(); > +} > diff --git a/fs/exec.c b/fs/exec.c > index ac34d9724684..b2448f2731b3 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1802,6 +1802,7 @@ static int do_execveat_common(int fd, struct filename > *filename, > /* execve succeeded */ > current->fs->in_exec = 0; > current->in_execve = 0; > + membarrier_execve(current); > acct_update_integrals(current); > task_numa_free(current); > free_bprm(bprm); > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > index 46f4ecf5479a..5e0fe8ce053b 100644 > --- a/include/linux/mm_types.h > +++ b/include/linux/mm_types.h > @@ -445,6 +445,9 @@ struct mm_struct { > unsigned long flags; /* Must use atomic bitops to access the bits */ > > struct core_state *core_state; /* coredumping support */ > +#ifdef CONFIG_MEMBARRIER > + int membarrier_private_expedited; > +#endif > #ifdef CONFIG_AIO > spinlock_t ioctx_lock; > struct kioctx_table __rcu *ioctx_table; > diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h > index 3a19c253bdb1..d3b81e48784d 100644 > --- a/include/linux/sched/mm.h > +++ b/include/linux/sched/mm.h > @@ -205,4 +205,59 @@ static inline void memalloc_noreclaim_restore(unsigned int > flags) > current->flags = (current->flags & ~PF_MEMALLOC) | flags; > } > > +#ifdef CONFIG_MEMBARRIER > + > +#ifdef CONFIG_ARCH_HAS_MEMBARRIER_HOOKS > +#include <asm/membarrier.h> > +#else > +static inline void membarrier_arch_sched_in(struct task_struct *prev, > + struct task_struct *next) > +{ > +} > +static inline void membarrier_arch_fork(struct task_struct *t, > + unsigned long clone_flags) > +{ > +} > +static inline void membarrier_arch_execve(struct task_struct *t) > +{ > +} > +static inline void membarrier_arch_register_private_expedited( > + struct task_struct *p) > +{ > +} > +#endif > + > +static inline void membarrier_sched_in(struct task_struct *prev, > + struct task_struct *next) > +{ > + membarrier_arch_sched_in(prev, next); > +} > +static inline void 
membarrier_fork(struct task_struct *t, > + unsigned long clone_flags) > +{ > + /* > + * Prior copy_mm() copies the membarrier_private_expedited field > + * from current->mm to t->mm. > + */ > + membarrier_arch_fork(t, clone_flags); > +} > +static inline void membarrier_execve(struct task_struct *t) > +{ > + t->mm->membarrier_private_expedited = 0; > + membarrier_arch_execve(t); > +} > +#else > +static inline void membarrier_sched_in(struct task_struct *prev, > + struct task_struct *next) > +{ > +} > +static inline void membarrier_fork(struct task_struct *t, > + unsigned long clone_flags) > +{ > +} > +static inline void membarrier_execve(struct task_struct *t) > +{ > +} > +#endif > + > #endif /* _LINUX_SCHED_MM_H */ > diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h > index 6d47b3249d8a..4e01ad7ffe98 100644 > --- a/include/uapi/linux/membarrier.h > +++ b/include/uapi/linux/membarrier.h > @@ -52,21 +52,30 @@ > * (non-running threads are de facto in such a > * state). This only covers threads from the > * same processes as the caller thread. This > - * command returns 0. The "expedited" commands > - * complete faster than the non-expedited ones, > - * they never block, but have the downside of > - * causing extra overhead. > + * command returns 0 on success. The > + * "expedited" commands complete faster than > + * the non-expedited ones, they never block, > + * but have the downside of causing extra > + * overhead. A process needs to register its > + * intent to use the private expedited command > + * prior to using it, otherwise this command > + * returns -EPERM. > + * @MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED: > + * Register the process intent to use > + * MEMBARRIER_CMD_PRIVATE_EXPEDITED. Always > + * returns 0. > * > * Command to be passed to the membarrier system call. The commands need to > * be a single bit each, except for MEMBARRIER_CMD_QUERY which is assigned to > * the value 0. 
> */ > enum membarrier_cmd { > - MEMBARRIER_CMD_QUERY = 0, > - MEMBARRIER_CMD_SHARED = (1 << 0), > + MEMBARRIER_CMD_QUERY = 0, > + MEMBARRIER_CMD_SHARED = (1 << 0), > /* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */ > /* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */ > - MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3), > + MEMBARRIER_CMD_PRIVATE_EXPEDITED = (1 << 3), > + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED = (1 << 4), > }; > > #endif /* _UAPI_LINUX_MEMBARRIER_H */ > diff --git a/init/Kconfig b/init/Kconfig > index 78cb2461012e..a3dc6a66f0d1 100644 > --- a/init/Kconfig > +++ b/init/Kconfig > @@ -1400,6 +1400,9 @@ config MEMBARRIER > > If unsure, say Y. > > +config ARCH_HAS_MEMBARRIER_HOOKS > + bool > + > config EMBEDDED > bool "Embedded system" > option allnoconfig_y > diff --git a/kernel/fork.c b/kernel/fork.c > index 10646182440f..bd4a93915e08 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -1840,6 +1840,8 @@ static __latent_entropy struct task_struct *copy_process( > */ > copy_seccomp(p); > > + membarrier_fork(p, clone_flags); > + > /* > * Process group and session signals need to be delivered to just the > * parent before the fork or both the parent and the child after the > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 18a6966567da..b9d731283946 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -2643,17 +2643,8 @@ static struct rq *finish_task_switch(struct task_struct > *prev) > */ > prev_state = prev->state; > vtime_task_switch(prev); > + membarrier_sched_in(prev, current); > perf_event_task_sched_in(prev, current); > - /* > - * The membarrier system call requires a full memory barrier > - * after storing to rq->curr, before going back to user-space. > - * > - * TODO: This smp_mb__after_unlock_lock can go away if PPC end > - * up adding a full barrier to switch_mm(), or we should figure > - * out if a smp_mb__after_unlock_lock is really the proper API > - * to use. 
> - */ > - smp_mb__after_unlock_lock(); > finish_lock_switch(rq, prev); > finish_arch_post_lock_switch(); > > diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c > index a92fddc22747..f06949a279ca 100644 > --- a/kernel/sched/membarrier.c > +++ b/kernel/sched/membarrier.c > @@ -26,21 +26,25 @@ > * except MEMBARRIER_CMD_QUERY. > */ > #define MEMBARRIER_CMD_BITMASK \ > - (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED) > + (MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED \ > + | MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED) > > static void ipi_mb(void *info) > { > smp_mb(); /* IPIs should be serializing but paranoid. */ > } > > -static void membarrier_private_expedited(void) > +static int membarrier_private_expedited(void) > { > int cpu; > bool fallback = false; > cpumask_var_t tmpmask; > > + if (!READ_ONCE(current->mm->membarrier_private_expedited)) > + return -EPERM; > + > if (num_online_cpus() == 1) > - return; > + return 0; > > /* > * Matches memory barriers around rq->curr modification in > @@ -94,6 +98,17 @@ static void membarrier_private_expedited(void) > * rq->curr modification in scheduler. > */ > smp_mb(); /* exit from system call is not a mb */ > + return 0; > +} > + > +static void membarrier_register_private_expedited(void) > +{ > + struct task_struct *p = current; > + > + if (READ_ONCE(p->mm->membarrier_private_expedited)) > + return; > + membarrier_arch_register_private_expedited(p); > + WRITE_ONCE(p->mm->membarrier_private_expedited, 1); > } > > /** > @@ -144,7 +159,9 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags) > synchronize_sched(); > return 0; > case MEMBARRIER_CMD_PRIVATE_EXPEDITED: > - membarrier_private_expedited(); > + return membarrier_private_expedited(); > + case MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED: > + membarrier_register_private_expedited(); > return 0; > default: > return -EINVAL; > -- > 2.11.0 -- Mathieu Desnoyers EfficiOS Inc. 
http://www.efficios.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command 2017-09-26 20:43 ` [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command Mathieu Desnoyers @ 2017-09-27 13:04 ` Nicholas Piggin 2017-09-28 13:31 ` Mathieu Desnoyers 0 siblings, 1 reply; 23+ messages in thread From: Nicholas Piggin @ 2017-09-27 13:04 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Paul E. McKenney, Peter Zijlstra, Ingo Molnar, Alexander Viro, linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson, Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras, Andy Lutomirski, Alan Stern, linuxppc-dev, gromer On Tue, 26 Sep 2017 20:43:28 +0000 (UTC) Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > ----- On Sep 26, 2017, at 1:51 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote: > > > Provide a new command allowing processes to register their intent to use > > the private expedited command. > > > > I missed a few maintainers that should have been CC'd. Adding them now. > This patch is aimed to go through Paul E. McKenney's tree. Honestly this is pretty ugly new user API and fairly large amount of complexity just to avoid the powerpc barrier. And you end up with arch specific hooks anyway! So my plan was to add an arch-overridable loop primitive that iterates over all running threads for an mm. powerpc will use its mm_cpumask for iterating and use runqueue locks to avoid the barrier. x86 will most likely want to use its mm_cpumask to iterate. For the powerpc approach, yes there is some controversy about using runqueue locks even for cpus that we already can interfere with, but I think we have a lot of options we could look at *after* it ever shows up as a problem. Thanks, Nick ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command 2017-09-27 13:04 ` Nicholas Piggin @ 2017-09-28 13:31 ` Mathieu Desnoyers 2017-09-28 15:01 ` Nicholas Piggin 0 siblings, 1 reply; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-28 13:31 UTC (permalink / raw) To: Nicholas Piggin Cc: Paul E. McKenney, Peter Zijlstra, Ingo Molnar, Alexander Viro, linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson, Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras, Andy Lutomirski, Alan Stern, linuxppc-dev, gromer ----- On Sep 27, 2017, at 9:04 AM, Nicholas Piggin npiggin@gmail.com wrote: > On Tue, 26 Sep 2017 20:43:28 +0000 (UTC) > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > >> ----- On Sep 26, 2017, at 1:51 PM, Mathieu Desnoyers >> mathieu.desnoyers@efficios.com wrote: >> >> > Provide a new command allowing processes to register their intent to use >> > the private expedited command. >> > >> >> I missed a few maintainers that should have been CC'd. Adding them now. >> This patch is aimed to go through Paul E. McKenney's tree. > > Honestly this is pretty ugly new user API and fairly large amount of > complexity just to avoid the powerpc barrier. And you end up with arch > specific hooks anyway! > > So my plan was to add an arch-overridable loop primitive that iterates > over all running threads for an mm. Iterating over all threads for an mm is one possible solution that has been listed at the membarrier BOF at LPC. We ended up dismissing this solution because it would be inefficient for processes which have lots of threads (e.g. Java). > powerpc will use its mm_cpumask for > iterating and use runqueue locks to avoid the barrier. This is another solution which has been rejected during the LPC BOF. What I gather from past threads is that the mm_cpumask's bits on powerpc are pretty much only being set, never much cleared.
Therefore, over the lifetime of a thread which is not affined to specific processors, chances are that this cpumask will end up having all cores on the system. Therefore, you end up with the same rq lock disruption as if you would iterate on all online CPUs. If userspace does that in a loop, you end up, in PeterZ's words, with an Insta-DoS. The name may sound cool, but I gather that this is not a service the kernel willingly wants to provide to userspace. A cunning process could easily find a way to fill its mm_cpumask and then issue membarrier in a loop to bring a large system to its knees. > x86 will most > likely want to use its mm_cpumask to iterate. Iterating on mm_cpumask rather than online cpus adds complexity wrt memory barriers (unless we go for rq locks approach). We'd need, in addition to ensure that we have the proper barriers before/after store to rq->curr, to also ensure that we have the proper barriers between mm_cpumask updates and user-space loads/stores. Considering that we're not aiming at taking the rq locks anyway, iteration over all online cpus seems less complex than iterating on mm_cpumask on the architectures that keep track of it. > > For the powerpc approach, yes there is some controversy about using > runqueue locks even for cpus that we already can interfere with, but I > think we have a lot of options we could look at *after* it ever shows > up as a problem. The DoS argument from Peter seems to be a strong opposition to grabbing the rq locks. Here is another point in favor of having a register command for the private membarrier: This gives us greater flexibility to improve the kernel scheduler and return-to-userspace barriers if need be in the future. For instance, I plan to propose a "MEMBARRIER_FLAG_SYNC_CORE" flag that will also provide guarantees about context synchronization of all cores for memory reclaim performed by JIT for the next merge window. 
So far, the following architectures seem to have the proper core serializing instructions already in place when returning to user-space: x86 (iret), powerpc (rfi), arm32/64 (return from exception, eret), s390/x (lpswe), ia64 (rfi), parisc (issue at least 7 instructions while singing around a bonfire), and mips SMP (eret). So far, AFAIU, only x86 (eventually going through sysexit), alpha (appears to require an explicit imb), and sparc (explicit flush + 5 instructions around similar bonfire as parisc) appear to require special handling. I therefore plan to use the registration step with a MEMBARRIER_FLAG_SYNC_CORE flag set to set TIF flags and add the required context synchronizing barriers on sched_in() only for processes wishing to use private expedited membarrier. So I don't see much point in trying to remove that registration step. Thanks, Mathieu > > Thanks, > Nick -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command 2017-09-28 13:31 ` Mathieu Desnoyers @ 2017-09-28 15:01 ` Nicholas Piggin 2017-09-28 15:29 ` Mathieu Desnoyers 2017-09-28 15:51 ` Peter Zijlstra 0 siblings, 2 replies; 23+ messages in thread From: Nicholas Piggin @ 2017-09-28 15:01 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Paul E. McKenney, Peter Zijlstra, Ingo Molnar, Alexander Viro, linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson, Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras, Andy Lutomirski, Alan Stern, linuxppc-dev, gromer On Thu, 28 Sep 2017 13:31:36 +0000 (UTC) Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > ----- On Sep 27, 2017, at 9:04 AM, Nicholas Piggin npiggin@gmail.com wrote: > > > On Tue, 26 Sep 2017 20:43:28 +0000 (UTC) > > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > > > >> ----- On Sep 26, 2017, at 1:51 PM, Mathieu Desnoyers > >> mathieu.desnoyers@efficios.com wrote: > >> > >> > Provide a new command allowing processes to register their intent to use > >> > the private expedited command. > >> > > >> > >> I missed a few maintainers that should have been CC'd. Adding them now. > >> This patch is aimed to go through Paul E. McKenney's tree. > > > > Honestly this is pretty ugly new user API and fairly large amount of > > complexity just to avoid the powerpc barrier. And you end up with arch > > specific hooks anyway! > > > > So my plan was to add an arch-overridable loop primitive that iterates > > over all running threads for an mm. > > Iterating over all threads for an mm is one possible solution that has > been listed at the membarrier BOF at LPC. We ended up dismissing this > solution because it would not inefficient for processes which have > lots of threads (e.g. Java). Sorry I meant hardware threads, I should have said "CPUs". > > > powerpc will use its mm_cpumask for > > iterating and use runqueue locks to avoid the barrier. 
> > This is another solution which has been rejected during the LPC BOF. > > What I gather from past threads is that the mm_cpumask's bits on powerpc > are pretty much only being set, never much cleared. Therefore, over the > lifetime of a thread which is not affined to specific processors, chances > are that this cpumask will end up having all cores on the system. That's fine. If a user is not bound to a subset of CPUs, they could also cause disturbances with other syscalls and faults, taking locks, causing tlb flushes and IPIs and things. > Therefore, > you end up with the same rq lock disruption as if you would iterate on all > online CPUs. If userspace does that in a loop, you end up, in PeterZ's words, > with an Insta-DoS. I really don't see how that can be true. spinlock by definition is for sharing of resources, it's not an insta-DoS just because you take shared spinlocks! You can contend other common locks or resources too. Linux simply does not have guaranteed strict isolation particularly when you allow threads to be scheduled together on CPUs, so this can't be used arbitrarily. If it was taking all locks at once that's one thing which has always been good policy to avoid. But it's not, any single rq lock will only be taken and released for a very short time, far shorter than a normal context switch. And the entire operation will be moderated by the cost of the syscall, and the number of runqueues it has to iterate. There's better ways to cause DoS. Start lots of threads and burn cycles, bounce between CPUs with affinty, sleep and wake one another between remote CPUs etc. Run out of memory, cause hash collisions on various hashes that are easy to control, cause lots of TLB flush IPIs... The runqueue lock is not really special. Might equally complain about a zone page allocator or lru lock, or an inode mutex or common dentry lock. > The name may sound cool, but I gather that this is not > a service the kernel willingly wants to provide to userspace. 
> > A cunning process could easily find a way to fill its mm_cpumask and then > issue membarrier in a loop to bring a large system to its knees. I still don't see how. Nothing that you couldn't do with other resources or configurations of threads and system calls. Sure you might cause a disturbance and admin might notice and kill it. > > > x86 will most > > likely want to use its mm_cpumask to iterate. > > Iterating on mm_cpumask rather than online cpus adds complexity wrt memory > barriers (unless we go for rq locks approach). We'd need, in addition to > ensure that we have the proper barriers before/after store to rq->curr, > to also ensure that we have the proper barriers between mm_cpumask > updates and user-space loads/stores. Considering that we're not aiming > at taking the rq locks anyway, iteration over all online cpus seems > less complex than iterating on mm_cpumask on the architectures that > keep track of it. Well x86 does not *have* to implement it. I thought it probably would like to, and I didn't think its barriers would have been all that complex when I last looked, but didn't look too closely. > > > > > For the powerpc approach, yes there is some controversy about using > > runqueue locks even for cpus that we already can interfere with, but I > > think we have a lot of options we could look at *after* it ever shows > > up as a problem. > > The DoS argument from Peter seems to be a strong opposition to grabbing > the rq locks. Well if I still can't unconvince you, then we should try testing that theory. > > Here is another point in favor of having a register command for the > private membarrier: This gives us greater flexibility to improve the > kernel scheduler and return-to-userspace barriers if need be in the > future. > > For instance, I plan to propose a "MEMBARRIER_FLAG_SYNC_CORE" flag > that will also provide guarantees about context synchronization of > all cores for memory reclaim performed by JIT for the next merge > window. 
So far, the following architectures seems to have the proper > core serializing instructions already in place when returning to > user-space: x86 (iret), powerpc (rfi), arm32/64 (return from exception, > eret), s390/x (lpswe), ia64 (rfi), parisc (issue at least 7 instructions > while signing around a bonfire), and mips SMP (eret). > > So far, AFAIU, only x86 (eventually going through sysexit), alpha > (appears to require an explicit imb), and sparc (explicit flush + 5 > instructions around similar bonfire as parisc) appear to require special > handling. > > I therefore plan to use the registration step with a > MEMBARRIER_FLAG_SYNC_CORE flag set to set TIF flags and add the > required context synchronizing barriers on sched_in() only for > processes wishing to use private expedited membarrier. > > So I don't see much point in trying to remove that registration step. I don't follow you. You are talking about the concept of registering intention to use a different function? And the registration API is not merged yet? Let me say I'm not completely against the idea of a registration API. But don't think registration for this expedited command is necessary. But (aside) let's say a tif flag turns out to be a good idea for your second case, why not just check the flag in the membarrier sys call and do the registration the first time it uses it? Thanks, Nick ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command 2017-09-28 15:01 ` Nicholas Piggin @ 2017-09-28 15:29 ` Mathieu Desnoyers 2017-09-28 16:16 ` Nicholas Piggin 2017-09-28 15:51 ` Peter Zijlstra 1 sibling, 1 reply; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-28 15:29 UTC (permalink / raw) To: Nicholas Piggin Cc: Paul E. McKenney, Peter Zijlstra, Ingo Molnar, Alexander Viro, linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson, Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras, Andy Lutomirski, Alan Stern, linuxppc-dev, gromer ----- On Sep 28, 2017, at 11:01 AM, Nicholas Piggin npiggin@gmail.com wrote: > On Thu, 28 Sep 2017 13:31:36 +0000 (UTC) > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > >> ----- On Sep 27, 2017, at 9:04 AM, Nicholas Piggin npiggin@gmail.com wrote: >> >> > On Tue, 26 Sep 2017 20:43:28 +0000 (UTC) >> > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: >> > >> >> ----- On Sep 26, 2017, at 1:51 PM, Mathieu Desnoyers >> >> mathieu.desnoyers@efficios.com wrote: >> >> [...] >> Therefore, >> you end up with the same rq lock disruption as if you would iterate on all >> online CPUs. If userspace does that in a loop, you end up, in PeterZ's words, >> with an Insta-DoS. > > I really don't see how that can be true. spinlock by definition is for > sharing of resources, it's not an insta-DoS just because you take shared > spinlocks! [...] >> >> > >> > For the powerpc approach, yes there is some controversy about using >> > runqueue locks even for cpus that we already can interfere with, but I >> > think we have a lot of options we could look at *after* it ever shows >> > up as a problem. >> >> The DoS argument from Peter seems to be a strong opposition to grabbing >> the rq locks. > > Well if I still can't unconvince you, then we should try testing that > theory. 
[ I'll let PeterZ pitch in on this part of the discussion ] > >> >> Here is another point in favor of having a register command for the >> private membarrier: This gives us greater flexibility to improve the >> kernel scheduler and return-to-userspace barriers if need be in the >> future. >> >> For instance, I plan to propose a "MEMBARRIER_FLAG_SYNC_CORE" flag >> that will also provide guarantees about context synchronization of >> all cores for memory reclaim performed by JIT for the next merge >> window. So far, the following architectures seems to have the proper >> core serializing instructions already in place when returning to >> user-space: x86 (iret), powerpc (rfi), arm32/64 (return from exception, >> eret), s390/x (lpswe), ia64 (rfi), parisc (issue at least 7 instructions >> while signing around a bonfire), and mips SMP (eret). >> >> So far, AFAIU, only x86 (eventually going through sysexit), alpha >> (appears to require an explicit imb), and sparc (explicit flush + 5 >> instructions around similar bonfire as parisc) appear to require special >> handling. >> >> I therefore plan to use the registration step with a >> MEMBARRIER_FLAG_SYNC_CORE flag set to set TIF flags and add the >> required context synchronizing barriers on sched_in() only for >> processes wishing to use private expedited membarrier. >> >> So I don't see much point in trying to remove that registration step. > > I don't follow you. You are talking about the concept of registering > intention to use a different function? And the registration API is not > merged yet? Yes, I'm talking about requiring processes to invoke membarrier cmd MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED before they can successfully invoke membarrier cmd MEMBARRIER_CMD_PRIVATE_EXPEDITED. > Let me say I'm not completely against the idea of a registration API. But > don't think registration for this expedited command is necessary. 
Given that we have the powerpc lack-of-full-barrier-on-return-to-userspace case now, and we foresee x86-sysexit, sparc, and alpha also requiring special treatment when we introduce the MEMBARRIER_FLAG_SYNC_CORE behavior in the next release, it seems that we'll have a hard time handling architecture special cases efficiently if we don't expose the registration API right away. > > But (aside) let's say a tif flag turns out to be a good diea for your > second case, why not just check the flag in the membarrier sys call and > do the registration the first time it uses it? We also considered that option. It's mainly about guaranteeing that an expedited membarrier command never blocks. If we introduce this "lazy auto-registration" behavior, we end up blocking the process at a random point in its execution so we can issue a synchronize_sched(). By exposing an explicit registration, we can control where this delay occurs, and even allow library constructors to invoke the registration while the process is still single-threaded, therefore allowing us to completely skip synchronize_sched(). Thanks, Mathieu > > Thanks, > Nick -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command 2017-09-28 15:29 ` Mathieu Desnoyers @ 2017-09-28 16:16 ` Nicholas Piggin 2017-09-28 18:28 ` Mathieu Desnoyers 0 siblings, 1 reply; 23+ messages in thread From: Nicholas Piggin @ 2017-09-28 16:16 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Paul E. McKenney, Peter Zijlstra, Ingo Molnar, Alexander Viro, linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson, Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras, Andy Lutomirski, Alan Stern, linuxppc-dev, gromer On Thu, 28 Sep 2017 15:29:50 +0000 (UTC) Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > ----- On Sep 28, 2017, at 11:01 AM, Nicholas Piggin npiggin@gmail.com wrote: > > > On Thu, 28 Sep 2017 13:31:36 +0000 (UTC) > > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > > > >> ----- On Sep 27, 2017, at 9:04 AM, Nicholas Piggin npiggin@gmail.com wrote: > >> [snip] > >> So I don't see much point in trying to remove that registration step. > > > > I don't follow you. You are talking about the concept of registering > > intention to use a different function? And the registration API is not > > merged yet? > > Yes, I'm talking about requiring processes to invoke membarrier cmd > MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED before they can successfully > invoke membarrier cmd MEMBARRIER_CMD_PRIVATE_EXPEDITED. > > > Let me say I'm not completely against the idea of a registration API. But > > don't think registration for this expedited command is necessary. > > Given that we have the powerpc lack-of-full-barrier-on-return-to-userspace > case now, and we foresee x86-sysexit, sparc, and alpha also requiring > special treatment when we introduce the MEMBARRIER_FLAG_SYNC_CORE behavior > in the next release, it seems that we'll have a hard time handling > architecture special cases efficiently if we don't expose the registration > API right away. But SYNC_CORE is a different functionality, right? 
You can add the registration API for it when that goes in. > > But (aside) let's say a tif flag turns out to be a good diea for your > > second case, why not just check the flag in the membarrier sys call and > > do the registration the first time it uses it? > > We also considered that option. It's mainly about guaranteeing that > an expedited membarrier command never blocks. If we introduce this > "lazy auto-registration" behavior, we end up blocking the process > at a random point in its execution so we can issue a synchronize_sched(). > By exposing an explicit registration, we can control where this delay > occurs, and even allow library constructors to invoke the registration > while the process is a single threaded, therefore allowing us to completely > skip synchronize_sched(). Okay I guess that could be a good reason. As I said I'm not opposed to the concept. I suppose you could even have a registration for expedited private even if it's a no-op on all architectures, just in case some new ways of implementing it can be done in future. I suppose I'm more objecting to the added complexity for powerpc, and more code in the fastpath to make the slowpath faster. Thanks, Nick ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command 2017-09-28 16:16 ` Nicholas Piggin @ 2017-09-28 18:28 ` Mathieu Desnoyers 0 siblings, 0 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2017-09-28 18:28 UTC (permalink / raw) To: Nicholas Piggin Cc: Paul E. McKenney, Peter Zijlstra, Ingo Molnar, Alexander Viro, linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson, Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras, Andy Lutomirski, Alan Stern, linuxppc-dev, gromer ----- On Sep 28, 2017, at 12:16 PM, Nicholas Piggin npiggin@gmail.com wrote: > On Thu, 28 Sep 2017 15:29:50 +0000 (UTC) > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > >> ----- On Sep 28, 2017, at 11:01 AM, Nicholas Piggin npiggin@gmail.com wrote: >> >> > On Thu, 28 Sep 2017 13:31:36 +0000 (UTC) >> > Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: >> > >> >> ----- On Sep 27, 2017, at 9:04 AM, Nicholas Piggin npiggin@gmail.com wrote: >> >> > > [snip] > >> >> So I don't see much point in trying to remove that registration step. >> > >> > I don't follow you. You are talking about the concept of registering >> > intention to use a different function? And the registration API is not >> > merged yet? >> >> Yes, I'm talking about requiring processes to invoke membarrier cmd >> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED before they can successfully >> invoke membarrier cmd MEMBARRIER_CMD_PRIVATE_EXPEDITED. >> >> > Let me say I'm not completely against the idea of a registration API. But >> > don't think registration for this expedited command is necessary. 
>> >> Given that we have the powerpc lack-of-full-barrier-on-return-to-userspace >> case now, and we foresee x86-sysexit, sparc, and alpha also requiring >> special treatment when we introduce the MEMBARRIER_FLAG_SYNC_CORE behavior >> in the next release, it seems that we'll have a hard time handling >> architecture special cases efficiently if we don't expose the registration >> API right away. > > But SYNC_CORE is a different functionality, right? You can add the > registration API for it when that goes in. Sure, I could. However, I was hoping to re-use the same command, with a "SYNC_CORE" flag, and I would have liked to have consistent behavior for same commands used with different flags. > >> > But (aside) let's say a tif flag turns out to be a good diea for your >> > second case, why not just check the flag in the membarrier sys call and >> > do the registration the first time it uses it? >> >> We also considered that option. It's mainly about guaranteeing that >> an expedited membarrier command never blocks. If we introduce this >> "lazy auto-registration" behavior, we end up blocking the process >> at a random point in its execution so we can issue a synchronize_sched(). >> By exposing an explicit registration, we can control where this delay >> occurs, and even allow library constructors to invoke the registration >> while the process is a single threaded, therefore allowing us to completely >> skip synchronize_sched(). > > Okay I guess that could be a good reason. As I said I'm not opposed to > the concept. I suppose you could even have a registration for expedited > private even if it's a no-op on all architectures, just in case some new > ways of implementing it can be done in future. That's an approach I would be OK with too. Mandating explicit registration will give us much more flexibility. > I suppose I'm more objecting to the added complexity for powerpc, and > more code in the fastpath to make the slowpath faster. 
Just to make sure I understand your concern here. The "fastpath" you refer to is the TIF flag test in membarrier_sched_in() within finish_task_switch(), and the "slowpath" is switch_mm() which lacks the required full barrier now, am I correct ? Would it help if we invoke the membarrier hook from switch_mm() instead ? We'd therefore only add the TIF flag test in switch_mm(), rather than for every context switch. Thanks, Mathieu > > Thanks, > Nick -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command 2017-09-28 15:01 ` Nicholas Piggin 2017-09-28 15:29 ` Mathieu Desnoyers @ 2017-09-28 15:51 ` Peter Zijlstra 2017-09-28 16:27 ` Nicholas Piggin 1 sibling, 1 reply; 23+ messages in thread From: Peter Zijlstra @ 2017-09-28 15:51 UTC (permalink / raw) To: Nicholas Piggin Cc: Mathieu Desnoyers, Paul E. McKenney, Ingo Molnar, Alexander Viro, linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson, Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras, Andy Lutomirski, Alan Stern, linuxppc-dev, gromer On Fri, Sep 29, 2017 at 01:01:12AM +1000, Nicholas Piggin wrote: > That's fine. If a user is not bound to a subset of CPUs, they could > also cause disturbances with other syscalls and faults, taking locks, > causing tlb flushes and IPIs and things. So on the big SGI class machines we've had trouble with for_each_cpu() loops before, and IIRC the biggest Power box is not too far from that 1-2K CPUs. Bouncing that lock across the machine is *painful*, I have vague memories of cases where the lock ping-pong was most of the time spent. But only Power needs this, all the other architectures are fine with the lockless approach for MEMBAR_EXPEDITED_PRIVATE. The ISYNC variant of the same however appears to want TIF flags or something to aid a number of archs, the rq->lock will not help there. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command
  2017-09-28 15:51 ` Peter Zijlstra
@ 2017-09-28 16:27 ` Nicholas Piggin
  2017-09-29 10:31   ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Nicholas Piggin @ 2017-09-28 16:27 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Mathieu Desnoyers, Paul E. McKenney, Ingo Molnar, Alexander Viro,
    linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson,
    Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras,
    Andy Lutomirski, Alan Stern, linuxppc-dev, gromer

On Thu, 28 Sep 2017 17:51:15 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Sep 29, 2017 at 01:01:12AM +1000, Nicholas Piggin wrote:
> > That's fine. If a user is not bound to a subset of CPUs, they could
> > also cause disturbances with other syscalls and faults, taking locks,
> > causing tlb flushes and IPIs and things.
>
> So on the big SGI class machines we've had trouble with
> for_each_cpu() loops before, and IIRC the biggest Power box is not too
> far from that 1-2K CPUs.

This is a loop in process context, interrupts on, can reschedule, no
locks held, though. The biggest Power boxes are more tightly coupled
than those big SGI systems, but even so, just plodding along taking and
releasing locks in turn would be fine on those SGI ones as well,
really. Not DoS level. This is not a single mega-hot cache line or lock
that is bouncing over the entire machine, but one process grabbing a
line and lock from each of 1000 CPUs.

Slight disturbance, sure, but each individual CPU will see it as
1/1000th of a disturbance; most of the cost will be concentrated in the
syscall caller.

> Bouncing that lock across the machine is *painful*; I have vague
> memories of cases where the lock ping-pong was most of the time spent.
>
> But only Power needs this; all the other architectures are fine with
> the lockless approach for MEMBAR_EXPEDITED_PRIVATE.

Yes, we can add an iterator function that power can override in a few
lines. Less arch-specific code than this proposal.

> The ISYNC variant of the same, however, appears to want TIF flags or
> something to aid a number of archs; the rq->lock will not help there.

The SYNC_CORE variant? Yes, it seems different. I think that's another
issue, though.

Thanks,
Nick
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command
  2017-09-28 16:27 ` Nicholas Piggin
@ 2017-09-29 10:31 ` Peter Zijlstra
  2017-09-29 11:38   ` Nicholas Piggin
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2017-09-29 10:31 UTC (permalink / raw)
To: Nicholas Piggin
Cc: Mathieu Desnoyers, Paul E. McKenney, Ingo Molnar, Alexander Viro,
    linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson,
    Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras,
    Andy Lutomirski, Alan Stern, linuxppc-dev, gromer

On Fri, Sep 29, 2017 at 02:27:57AM +1000, Nicholas Piggin wrote:
> The biggest Power boxes are more tightly coupled than those big
> SGI systems, but even so, just plodding along taking and releasing
> locks in turn would be fine on those SGI ones as well, really. Not DoS
> level. This is not a single mega-hot cache line or lock that is
> bouncing over the entire machine, but one process grabbing a line and
> lock from each of 1000 CPUs.
>
> Slight disturbance, sure, but each individual CPU will see it as
> 1/1000th of a disturbance; most of the cost will be concentrated in
> the syscall caller.

But once the:

	while (1)
		sys_membarrier();

thread has all those (lock) lines in M state locally, it will become
very hard for the remote CPUs to claim them back, because it's
constantly touching them. Sure, it will touch 1000 other lines before
it's back to this one, but if they're all local that's fairly quick.

But you're right, your big machines have far smaller NUMA factors.

> > Bouncing that lock across the machine is *painful*; I have vague
> > memories of cases where the lock ping-pong was most of the time
> > spent.
> >
> > But only Power needs this; all the other architectures are fine with
> > the lockless approach for MEMBAR_EXPEDITED_PRIVATE.
>
> Yes, we can add an iterator function that power can override in a few
> lines. Less arch-specific code than this proposal.

A semi-related issue: I suppose we can do an arch upcall to
flush_tlb_mm and reset the mm_cpumask when we change cpuset groups.
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command
  2017-09-29 10:31 ` Peter Zijlstra
@ 2017-09-29 11:38 ` Nicholas Piggin
  2017-09-29 11:45   ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Nicholas Piggin @ 2017-09-29 11:38 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Mathieu Desnoyers, Paul E. McKenney, Ingo Molnar, Alexander Viro,
    linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson,
    Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras,
    Andy Lutomirski, Alan Stern, linuxppc-dev, gromer

On Fri, 29 Sep 2017 12:31:31 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Sep 29, 2017 at 02:27:57AM +1000, Nicholas Piggin wrote:
> > The biggest Power boxes are more tightly coupled than those big
> > SGI systems, but even so, just plodding along taking and releasing
> > locks in turn would be fine on those SGI ones as well, really. Not
> > DoS level. This is not a single mega-hot cache line or lock that is
> > bouncing over the entire machine, but one process grabbing a line
> > and lock from each of 1000 CPUs.
> >
> > Slight disturbance, sure, but each individual CPU will see it as
> > 1/1000th of a disturbance; most of the cost will be concentrated in
> > the syscall caller.
>
> But once the:
>
> 	while (1)
> 		sys_membarrier();
>
> thread has all those (lock) lines in M state locally, it will become
> very hard for the remote CPUs to claim them back, because it's
> constantly

Not really. There is some ability to hold onto a line for a time, but
there is no way to starve them, let alone starve hundreds of other
CPUs. They will request the cacheline exclusive and eventually get it.
Then the membarrier CPU has to pay to get it back. If there is a lot of
activity on the locks, the membarrier will have a difficult time taking
each one. I'm not saying there is zero cost, or that it can't interfere
with others, only that it does not seem particularly bad compared with
other things.

Once you restrict it to mm_cpumask, it's quite partitionable.

I would really prefer to go this way on powerpc first. We could add the
registration APIs as basically no-ops, but in a way that would allow
the locking approach to be changed if we find it causes issues. I'll
try to find some time and a big system when I can.

> touching them. Sure, it will touch 1000 other lines before it's back
> to this one, but if they're all local that's fairly quick.
>
> But you're right, your big machines have far smaller NUMA factors.
>
> > > Bouncing that lock across the machine is *painful*; I have vague
> > > memories of cases where the lock ping-pong was most of the time
> > > spent.
> > >
> > > But only Power needs this; all the other architectures are fine
> > > with the lockless approach for MEMBAR_EXPEDITED_PRIVATE.
> >
> > Yes, we can add an iterator function that power can override in a
> > few lines. Less arch-specific code than this proposal.
>
> A semi-related issue: I suppose we can do an arch upcall to
> flush_tlb_mm and reset the mm_cpumask when we change cpuset groups.

For powerpc we have been looking at how mm_cpumask can be improved. It
has real drawbacks even when you don't consider this new syscall.

Thanks,
Nick
* Re: [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command
  2017-09-29 11:38 ` Nicholas Piggin
@ 2017-09-29 11:45 ` Peter Zijlstra
  0 siblings, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2017-09-29 11:45 UTC (permalink / raw)
To: Nicholas Piggin
Cc: Mathieu Desnoyers, Paul E. McKenney, Ingo Molnar, Alexander Viro,
    linux-arch, Avi Kivity, maged michael, Boqun Feng, Dave Watson,
    Will Deacon, linux-kernel, Andrew Hunter, Paul Mackerras,
    Andy Lutomirski, Alan Stern, linuxppc-dev, gromer

On Fri, Sep 29, 2017 at 09:38:53PM +1000, Nicholas Piggin wrote:
> Not really. There is some ability to hold onto a line for a time, but
> there is no way to starve them, let alone starve hundreds of other
> CPUs. They will request the cacheline exclusive and eventually get it.

OK, hardware fairness there is nice.

> I would really prefer to go this way on powerpc first. We could add
> the registration APIs as basically no-ops, but in a way that would
> allow the locking approach to be changed if we find it causes issues.
> I'll try to find some time and a big system when I can.

Fair enough, I suppose.

> > A semi-related issue: I suppose we can do an arch upcall to
> > flush_tlb_mm and reset the mm_cpumask when we change cpuset groups.
>
> For powerpc we have been looking at how mm_cpumask can be improved.
> It has real drawbacks even when you don't consider this new syscall.

What else do you use mm_cpumask for?
end of thread, other threads:[~2017-09-29 11:45 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-26 17:51 [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command Mathieu Desnoyers
2017-09-26 17:51 ` [PATCH for 4.14 2/3] membarrier: selftest: Test private expedited cmd Mathieu Desnoyers
2017-09-26 19:41   ` Shuah Khan
2017-09-26 19:55     ` Mathieu Desnoyers
2017-09-26 20:08       ` Shuah Khan
2017-09-26 20:15         ` Mathieu Desnoyers
2017-09-26 20:34           ` Shuah Khan
2017-09-26 21:15   ` Greg Kroah-Hartman
2017-09-26 21:15     ` Greg Kroah-Hartman
2017-09-26 17:51 ` [PATCH v4 for 4.14 3/3] membarrier: Document scheduler barrier requirements Mathieu Desnoyers
2017-09-26 20:46   ` Mathieu Desnoyers
2017-09-26 20:43 ` [PATCH v4 for 4.14 1/3] membarrier: Provide register expedited private command Mathieu Desnoyers
2017-09-27 13:04 ` Nicholas Piggin
2017-09-28 13:31   ` Mathieu Desnoyers
2017-09-28 15:01     ` Nicholas Piggin
2017-09-28 15:29       ` Mathieu Desnoyers
2017-09-28 16:16         ` Nicholas Piggin
2017-09-28 18:28           ` Mathieu Desnoyers
2017-09-28 15:51     ` Peter Zijlstra
2017-09-28 16:27       ` Nicholas Piggin
2017-09-29 10:31         ` Peter Zijlstra
2017-09-29 11:38           ` Nicholas Piggin
2017-09-29 11:45             ` Peter Zijlstra