linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
@ 2015-04-17 15:06 Mathieu Desnoyers
  2015-04-21 11:47 ` Thomas Gleixner
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Mathieu Desnoyers @ 2015-04-17 15:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mathieu Desnoyers, Josh Triplett, KOSAKI Motohiro,
	Steven Rostedt, Nicholas Miell, Linus Torvalds, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Stephen Hemminger, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, David Howells, Pranith Kumar,
	Michael Kerrisk

[ Not in this patch: tests (Pranith Kumar has a patch for this), man page. ]

Here is an implementation of a new system call, sys_membarrier(), which
executes a memory barrier on all threads running on the system. It is
implemented by calling synchronize_sched(). It can be used to distribute
the cost of user-space memory barriers asymmetrically by transforming
pairs of memory barriers into pairs consisting of sys_membarrier() and a
compiler barrier. For synchronization primitives that distinguish
between read-side and write-side (e.g. userspace RCU [1], rwlocks), the
read-side can be accelerated significantly by moving the bulk of the
memory barrier overhead to the write-side.

It is based on kernel v4.0.

To explain the benefit of this scheme, let's introduce two example threads:

Thread A (non-frequent, e.g. executing liburcu synchronize_rcu())
Thread B (frequent, e.g. executing liburcu
rcu_read_lock()/rcu_read_unlock())

In a scheme where all smp_mb() in thread A are ordering memory accesses
with respect to smp_mb() present in Thread B, we can change each
smp_mb() within Thread A into calls to sys_membarrier() and each
smp_mb() within Thread B into compiler barriers "barrier()".

Before the change, we had, for each smp_mb() pairs:

Thread A                    Thread B
previous mem accesses       previous mem accesses
smp_mb()                    smp_mb()
following mem accesses      following mem accesses

After the change, these pairs become:

Thread A                    Thread B
prev mem accesses           prev mem accesses
sys_membarrier()            barrier()
follow mem accesses         follow mem accesses

As we can see, there are two possible scenarios: either Thread B memory
accesses do not happen concurrently with Thread A accesses (1), or they
do (2).

1) Non-concurrent Thread A vs Thread B accesses:

Thread A                    Thread B
prev mem accesses
sys_membarrier()
follow mem accesses
                            prev mem accesses
                            barrier()
                            follow mem accesses

In this case, thread B accesses will be weakly ordered. This is OK,
because at that point, thread A is not particularly interested in
ordering them with respect to its own accesses.

2) Concurrent Thread A vs Thread B accesses

Thread A                    Thread B
prev mem accesses           prev mem accesses
sys_membarrier()            barrier()
follow mem accesses         follow mem accesses

In this case, thread B accesses, which are ensured to be in program
order thanks to the compiler barrier, will be "upgraded" to full
smp_mb() by synchronize_sched().

* Benchmarks

On Intel Xeon E5405 (8 cores)
(one thread is calling sys_membarrier, the other 7 threads are busy
looping)

1000 non-expedited sys_membarrier calls in 33s = 33 milliseconds/call.

* User-space user of this system call: Userspace RCU library

Both the signal-based and the sys_membarrier userspace RCU schemes
permit us to remove the memory barrier from the userspace RCU
rcu_read_lock() and rcu_read_unlock() primitives, thus significantly
accelerating them. These memory barriers are replaced by compiler
barriers on the read-side, and all matching memory barriers on the
write-side are turned into an invocation of a memory barrier on all
active threads in the process. By letting the kernel perform this
synchronization rather than dumbly sending a signal to every process
threads (as we currently do), we diminish the number of unnecessary wake
ups and only issue the memory barriers on active threads. Non-running
threads do not need to execute such barrier anyway, because these are
implied by the scheduler context switches.

Results in liburcu:

Operations in 10s, 6 readers, 2 writers:

memory barriers in reader:    1701557485 reads, 3129842 writes
signal-based scheme:          9825306874 reads,    5386 writes
sys_membarrier:               7992076602 reads,     220 writes

The dynamic sys_membarrier availability check adds some overhead to
the read-side compared to the signal-based scheme, but besides that,
with the expedited scheme, we can see that we are close to the read-side
performance of the signal-based scheme. However, this non-expedited
sys_membarrier implementation has a much slower grace period than signal
and memory barrier schemes.

An expedited version of this system call can be added later on to speed
up the grace period. Its implementation will likely depend on reading
the cpu_curr()->mm without holding each CPU's rq lock.

This patch adds the system call to x86 and to asm-generic.

[1] http://urcu.so

Changes since v15:
- Add flags argument in addition to cmd.
- Update documentation.

Changes since v14:
- Take care of Thomas Gleixner's comments.

Changes since v13:
- Move to kernel/membarrier.c.
- Remove MEMBARRIER_PRIVATE flag.
- Add MAINTAINERS file entry.

Changes since v12:
- Remove _FLAG suffix from uapi flags.
- Add Expert menuconfig option CONFIG_MEMBARRIER (default=y).
- Remove EXPEDITED mode. Only implement non-expedited for now, until
  reading the cpu_curr()->mm can be done without holding the CPU's rq
  lock.

Changes since v11:
- 5 years have passed.
- Rebase on v3.19 kernel.
- Add futex-alike PRIVATE vs SHARED semantic: private for per-process
  barriers, non-private for memory mappings shared between processes.
- Simplify user API.
- Code refactoring.

Changes since v10:
- Apply Randy's comments.
- Rebase on 2.6.34-rc4 -tip.

Changes since v9:
- Clean up #ifdef CONFIG_SMP.

Changes since v8:
- Go back to rq spin locks taken by sys_membarrier() rather than adding
  memory barriers to the scheduler. It implies a potential RoS
  (reduction of service) if sys_membarrier() is executed in a busy-loop
  by a user, but nothing more than what is already possible with other
  existing system calls, but saves memory barriers in the scheduler fast
  path.
- re-add the memory barrier comments to x86 switch_mm() as an example to
  other architectures.
- Update documentation of the memory barriers in sys_membarrier and
  switch_mm().
- Append execution scenarios to the changelog showing the purpose of
  each memory barrier.

Changes since v7:
- Move spinlock-mb and scheduler related changes to separate patches.
- Add support for sys_membarrier on x86_32.
- Only x86 32/64 system calls are reserved in this patch. It is planned
  to incrementally reserve syscall IDs on other architectures as these
  are tested.

Changes since v6:
- Remove some unlikely() not so unlikely.
- Add the proper scheduler memory barriers needed to only use the RCU
  read lock in sys_membarrier rather than take each runqueue spinlock:
- Move memory barriers from per-architecture switch_mm() to schedule()
  and finish_lock_switch(), where they clearly document that all data
  protected by the rq lock is guaranteed to have memory barriers issued
  between the scheduler update and the task execution. Replacing the
  spin lock acquire/release barriers with these memory barriers imply
  either no overhead (x86 spinlock atomic instruction already implies a
  full mb) or some hopefully small overhead caused by the upgrade of the
  spinlock acquire/release barriers to more heavyweight smp_mb().
- The "generic" version of spinlock-mb.h declares both a mapping to
  standard spinlocks and full memory barriers. Each architecture can
  specialize this header following their own need and declare
  CONFIG_HAVE_SPINLOCK_MB to use their own spinlock-mb.h.
- Note: benchmarks of scheduler overhead with specialized spinlock-mb.h
  implementations on a wide range of architecture would be welcome.

Changes since v5:
- Plan ahead for extensibility by introducing mandatory/optional masks
  to the "flags" system call parameter. Past experience with accept4(),
  signalfd4(), eventfd2(), epoll_create1(), dup3(), pipe2(), and
  inotify_init1() indicates that this is the kind of thing we want to
  plan for. Return -EINVAL if the mandatory flags received are unknown.
- Create include/linux/membarrier.h to define these flags.
- Add MEMBARRIER_QUERY optional flag.

Changes since v4:
- Add "int expedited" parameter, use synchronize_sched() in the
  non-expedited case. Thanks to Lai Jiangshan for making us consider
  seriously using synchronize_sched() to provide the low-overhead
  membarrier scheme.
- Check num_online_cpus() == 1, quickly return without doing nothing.

Changes since v3a:
- Confirm that each CPU indeed runs the current task's ->mm before
  sending an IPI. Ensures that we do not disturb RT tasks in the
  presence of lazy TLB shootdown.
- Document memory barriers needed in switch_mm().
- Surround helper functions with #ifdef CONFIG_SMP.

Changes since v2:
- simply send-to-many to the mm_cpumask. It contains the list of
  processors we have to IPI to (which use the mm), and this mask is
  updated atomically.

Changes since v1:
- Only perform the IPI in CONFIG_SMP.
- Only perform the IPI if the process has more than one thread.
- Only send IPIs to CPUs involved with threads belonging to our process.
- Adaptative IPI scheme (single vs many IPI with threshold).
- Issue smp_mb() at the beginning and end of the system call.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Nicholas Miell <nmiell@comcast.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Peter Zijlstra <peterz@infradead.org>
CC: David Howells <dhowells@redhat.com>
CC: "Pranith Kumar" <bobby.prani@gmail.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
---
 MAINTAINERS                       |    8 ++++
 arch/x86/syscalls/syscall_32.tbl  |    1 +
 arch/x86/syscalls/syscall_64.tbl  |    1 +
 include/linux/syscalls.h          |    2 +
 include/uapi/asm-generic/unistd.h |    4 +-
 include/uapi/linux/Kbuild         |    1 +
 include/uapi/linux/membarrier.h   |   47 +++++++++++++++++++++++
 init/Kconfig                      |   13 ++++++
 kernel/Makefile                   |    1 +
 kernel/membarrier.c               |   74 +++++++++++++++++++++++++++++++++++++
 kernel/sys_ni.c                   |    3 +
 11 files changed, 154 insertions(+), 1 deletions(-)
 create mode 100644 include/uapi/linux/membarrier.h
 create mode 100644 kernel/membarrier.c

diff --git a/MAINTAINERS b/MAINTAINERS
index efbcb50..a94cc7b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6328,6 +6328,14 @@ W:	http://www.mellanox.com
 Q:	http://patchwork.ozlabs.org/project/netdev/list/
 F:	drivers/net/ethernet/mellanox/mlx4/en_*
 
+MEMBARRIER SUPPORT
+M:	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+M:	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
+L:	linux-kernel@vger.kernel.org
+S:	Supported
+F:	kernel/membarrier.c
+F:	include/uapi/linux/membarrier.h
+
 MEMORY MANAGEMENT
 L:	linux-mm@kvack.org
 W:	http://www.linux-mm.org
diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
index b3560ec..439415f 100644
--- a/arch/x86/syscalls/syscall_32.tbl
+++ b/arch/x86/syscalls/syscall_32.tbl
@@ -365,3 +365,4 @@
 356	i386	memfd_create		sys_memfd_create
 357	i386	bpf			sys_bpf
 358	i386	execveat		sys_execveat			stub32_execveat
+359	i386	membarrier		sys_membarrier
diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index 8d656fb..823130d 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -329,6 +329,7 @@
 320	common	kexec_file_load		sys_kexec_file_load
 321	common	bpf			sys_bpf
 322	64	execveat		stub_execveat
+323	common	membarrier		sys_membarrier
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 76d1e38..51a9054 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -884,4 +884,6 @@ asmlinkage long sys_execveat(int dfd, const char __user *filename,
 			const char __user *const __user *argv,
 			const char __user *const __user *envp, int flags);
 
+asmlinkage long sys_membarrier(int cmd, int flags);
+
 #endif
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index e016bd9..8da542a 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -709,9 +709,11 @@ __SYSCALL(__NR_memfd_create, sys_memfd_create)
 __SYSCALL(__NR_bpf, sys_bpf)
 #define __NR_execveat 281
 __SC_COMP(__NR_execveat, sys_execveat, compat_sys_execveat)
+#define __NR_membarrier 282
+__SYSCALL(__NR_membarrier, sys_membarrier)
 
 #undef __NR_syscalls
-#define __NR_syscalls 282
+#define __NR_syscalls 283
 
 /*
  * All syscalls below here should go away really,
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 68ceb97..4684a8d 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -249,6 +249,7 @@ header-y += mdio.h
 header-y += media.h
 header-y += media-bus-format.h
 header-y += mei.h
+header-y += membarrier.h
 header-y += memfd.h
 header-y += mempolicy.h
 header-y += meye.h
diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
new file mode 100644
index 0000000..1021be5
--- /dev/null
+++ b/include/uapi/linux/membarrier.h
@@ -0,0 +1,47 @@
+#ifndef _UAPI_LINUX_MEMBARRIER_H
+#define _UAPI_LINUX_MEMBARRIER_H
+
+/*
+ * linux/membarrier.h
+ *
+ * membarrier system call API
+ *
+ * Copyright (c) 2010, 2015 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/**
+ * enum membarrier_cmd - membarrier system call command
+ *
+ * Command to be passed to the membarrier system call.
+ */
+enum membarrier_cmd {
+	/*
+	 * Query the set of supported commands. It returns a bitmask of valid
+	 * commands.
+	 */
+	MEMBARRIER_CMD_QUERY = 0,
+	/*
+	 * Execute a memory barrier on all running threads.
+	 */
+	MEMBARRIER_CMD_SHARED = (1 << 0),
+};
+
+#endif /* _UAPI_LINUX_MEMBARRIER_H */
diff --git a/init/Kconfig b/init/Kconfig
index f5dbc6d..89bad6a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1559,6 +1559,19 @@ config PCI_QUIRKS
 	  bugs/quirks. Disable this only if your target machine is
 	  unaffected by PCI quirks.
 
+config MEMBARRIER
+	bool "Enable membarrier() system call" if EXPERT
+	default y
+	depends on SMP
+	help
+	  Enable the membarrier() system call that allows issuing memory
+	  barriers across all running threads, which can be used to distribute
+	  the cost of user-space memory barriers asymmetrically by transforming
+	  pairs of memory barriers into pairs consisting of membarrier() and a
+	  compiler barrier.
+
+	  If unsure, say Y.
+
 config EMBEDDED
 	bool "Embedded system"
 	option allnoconfig_y
diff --git a/kernel/Makefile b/kernel/Makefile
index 1408b33..1cafa11 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
 obj-$(CONFIG_JUMP_LABEL) += jump_label.o
 obj-$(CONFIG_CONTEXT_TRACKING) += context_tracking.o
 obj-$(CONFIG_TORTURE_TEST) += torture.o
+obj-$(CONFIG_MEMBARRIER) += membarrier.o
 
 $(obj)/configs.o: $(obj)/config_data.h
 
diff --git a/kernel/membarrier.c b/kernel/membarrier.c
new file mode 100644
index 0000000..561fe64
--- /dev/null
+++ b/kernel/membarrier.c
@@ -0,0 +1,74 @@
+/*
+ * Copyright (C) 2010, 2015 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+ *
+ * membarrier system call
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/syscalls.h>
+#include <linux/membarrier.h>
+
+/*
+ * Bitmask made from a "or" of all commands within enum membarrier_cmd,
+ * except MEMBARRIER_CMD_QUERY.
+ */
+#define MEMBARRIER_CMD_BITMASK	(MEMBARRIER_CMD_SHARED)
+
+/**
+ * sys_membarrier - issue memory barriers on a set of threads
+ * @cmd:   MEMBARRIER_CMD_QUERY:
+ *             Query the set of supported commands. It returns a bitmask of
+ *             supported commands.
+ *         MEMBARRIER_CMD_SHARED:
+ *             Execute a memory barrier on all running threads. Upon
+ *             return from system call, the caller thread is ensured that
+ *             all running threads have passed through a state where all
+ *             memory accesses to user-space addresses match program order
+ *             between entry to and return from the system call (non-running
+ *             threads are de facto in such a state). This covers threads
+ *             from all processes running on the system. This command
+ *             returns 0.
+ * @flags: Currently unused.
+ *
+ * If this system call is not implemented, -ENOSYS is returned. If the
+ * command specified does not exist, or if the command argument is invalid,
+ * this system call returns -EINVAL. For a given command, this system call
+ * is guaranteed to always return the same value until reboot.
+ *
+ * All memory accesses performed in program order from each targeted thread
+ * is guaranteed to be ordered with respect to sys_membarrier(). If we use
+ * the semantic "barrier()" to represent a compiler barrier forcing memory
+ * accesses to be performed in program order across the barrier, and
+ * smp_mb() to represent explicit memory barriers forcing full memory
+ * ordering across the barrier, we have the following ordering table for
+ * each pair of barrier(), sys_membarrier() and smp_mb():
+ *
+ * The pair ordering is detailed as (O: ordered, X: not ordered):
+ *
+ *                        barrier()   smp_mb() sys_membarrier()
+ *        barrier()          X           X            O
+ *        smp_mb()           X           O            O
+ *        sys_membarrier()   O           O            O
+ */
+SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
+{
+	switch (cmd) {
+	case MEMBARRIER_CMD_QUERY:
+		return MEMBARRIER_CMD_BITMASK;
+	case MEMBARRIER_CMD_SHARED:
+		if (num_online_cpus() > 1)
+			synchronize_sched();
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 5adcb0a..5913b84 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -229,3 +229,6 @@ cond_syscall(sys_bpf);
 
 /* execveat */
 cond_syscall(sys_execveat);
+
+/* membarrier */
+cond_syscall(sys_membarrier);
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-17 15:06 [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86) Mathieu Desnoyers
@ 2015-04-21 11:47 ` Thomas Gleixner
  2015-04-22 20:36   ` Mathieu Desnoyers
  2015-04-21 16:29 ` Davidlohr Bueso
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2015-04-21 11:47 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Josh Triplett, KOSAKI Motohiro, Steven Rostedt,
	Nicholas Miell, Linus Torvalds, Ingo Molnar, Alan Cox,
	Lai Jiangshan, Stephen Hemminger, Andrew Morton, Peter Zijlstra,
	David Howells, Pranith Kumar, Michael Kerrisk

On Fri, 17 Apr 2015, Mathieu Desnoyers wrote:
> +/**
> + * enum membarrier_cmd - membarrier system call command
> + *

    * @MEMBARRIER_CMD_QUERY:	Explanatory blurb......

Hint: Create a kerneldoc document and build it.

> + * Command to be passed to the membarrier system call.

Please epxlain, that the commands need to be a single bit each.

> +config MEMBARRIER
> +	bool "Enable membarrier() system call" if EXPERT
> +	default y
> +	depends on SMP
> +	help
> +	  Enable the membarrier() system call that allows issuing memory
> +	  barriers across all running threads, which can be used to distribute
> +	  the cost of user-space memory barriers asymmetrically by transforming
> +	  pairs of memory barriers into pairs consisting of membarrier() and a
> +	  compiler barrier.
> +
> +	  If unsure, say Y.

Is it really worth to make this configurable?

> +/**
> + * sys_membarrier - issue memory barriers on a set of threads
> + * @cmd:   MEMBARRIER_CMD_QUERY:
> + *             Query the set of supported commands. It returns a bitmask of
> + *             supported commands.
> + *         MEMBARRIER_CMD_SHARED:
> + *             Execute a memory barrier on all running threads. Upon
> + *             return from system call, the caller thread is ensured that
> + *             all running threads have passed through a state where all
> + *             memory accesses to user-space addresses match program order
> + *             between entry to and return from the system call (non-running
> + *             threads are de facto in such a state). This covers threads
> + *             from all processes running on the system. This command
> + *             returns 0.

I think the explanation for the commands should be in the enum
documentation. This here should explain that @cmd takes command values
defined in the enum.

> +SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> +{
> +	switch (cmd) {
> +	case MEMBARRIER_CMD_QUERY:
> +		return MEMBARRIER_CMD_BITMASK;
> +	case MEMBARRIER_CMD_SHARED:
> +		if (num_online_cpus() > 1)
> +			synchronize_sched();
> +		return 0;
> +	default:
> +		return -EINVAL;
> +	}
> +}

This looks way cleaner now :)

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-17 15:06 [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86) Mathieu Desnoyers
  2015-04-21 11:47 ` Thomas Gleixner
@ 2015-04-21 16:29 ` Davidlohr Bueso
  2015-04-22 19:43   ` Mathieu Desnoyers
  2015-04-22 20:24 ` Pranith Kumar
  2015-04-23 10:33 ` Rasmus Villemoes
  3 siblings, 1 reply; 16+ messages in thread
From: Davidlohr Bueso @ 2015-04-21 16:29 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Josh Triplett, KOSAKI Motohiro, Steven Rostedt,
	Nicholas Miell, Linus Torvalds, Ingo Molnar, Alan Cox,
	Lai Jiangshan, Stephen Hemminger, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Pranith Kumar, Michael Kerrisk

On Fri, 2015-04-17 at 11:06 -0400, Mathieu Desnoyers wrote:
> [ Not in this patch: tests (Pranith Kumar has a patch for this), man page. ]

Actually, we had discussed that for new syscalls, the changelog _would_
be the manpage -- at least included in it.

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-21 16:29 ` Davidlohr Bueso
@ 2015-04-22 19:43   ` Mathieu Desnoyers
  2015-04-22 20:01     ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 16+ messages in thread
From: Mathieu Desnoyers @ 2015-04-22 19:43 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-kernel, Josh Triplett, KOSAKI Motohiro, Steven Rostedt,
	Nicholas Miell, Linus Torvalds, Ingo Molnar, Alan Cox,
	Lai Jiangshan, Stephen Hemminger, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Pranith Kumar, Michael Kerrisk

----- Original Message -----
> On Fri, 2015-04-17 at 11:06 -0400, Mathieu Desnoyers wrote:
> > [ Not in this patch: tests (Pranith Kumar has a patch for this), man page.
> > ]
> 
> Actually, we had discussed that for new syscalls, the changelog _would_
> be the manpage -- at least included in it.

Do you want the formatted content, or just the text ? Putting
manpage formatting in the changelog seems a bit odd.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-22 19:43   ` Mathieu Desnoyers
@ 2015-04-22 20:01     ` Michael Kerrisk (man-pages)
  2015-04-22 21:01       ` Mathieu Desnoyers
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-04-22 20:01 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Davidlohr Bueso, lkml, Josh Triplett, KOSAKI Motohiro,
	Steven Rostedt, Nicholas Miell, Linus Torvalds, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Stephen Hemminger, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, David Howells, Pranith Kumar

On 22 April 2015 at 21:43, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> ----- Original Message -----
>> On Fri, 2015-04-17 at 11:06 -0400, Mathieu Desnoyers wrote:
>> > [ Not in this patch: tests (Pranith Kumar has a patch for this), man page.
>> > ]
>>
>> Actually, we had discussed that for new syscalls, the changelog _would_
>> be the manpage -- at least included in it.
>
> Do you want the formatted content, or just the text ? Putting
> manpage formatting in the changelog seems a bit odd.


Plain text works for me. Just divide it up into the usuual man-pages
sections (see man-pages(7), and I'll take care of the rest).

Cheers,

Michael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-17 15:06 [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86) Mathieu Desnoyers
  2015-04-21 11:47 ` Thomas Gleixner
  2015-04-21 16:29 ` Davidlohr Bueso
@ 2015-04-22 20:24 ` Pranith Kumar
  2015-04-22 20:40   ` Mathieu Desnoyers
  2015-04-23 10:33 ` Rasmus Villemoes
  3 siblings, 1 reply; 16+ messages in thread
From: Pranith Kumar @ 2015-04-22 20:24 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: LKML, Josh Triplett, KOSAKI Motohiro, Steven Rostedt,
	Nicholas Miell, Linus Torvalds, Ingo Molnar, Alan Cox,
	Lai Jiangshan, Stephen Hemminger, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Michael Kerrisk

On Fri, Apr 17, 2015 at 11:06 AM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> diff --git a/init/Kconfig b/init/Kconfig
> index f5dbc6d..89bad6a 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1559,6 +1559,19 @@ config PCI_QUIRKS
>           bugs/quirks. Disable this only if your target machine is
>           unaffected by PCI quirks.
>
> +config MEMBARRIER
> +       bool "Enable membarrier() system call" if EXPERT
> +       default y
> +       depends on SMP
> +       help
> +         Enable the membarrier() system call that allows issuing memory
> +         barriers across all running threads, which can be used to distribute
> +         the cost of user-space memory barriers asymmetrically by transforming
> +         pairs of memory barriers into pairs consisting of membarrier() and a
> +         compiler barrier.
> +
> +         If unsure, say Y.
> +

I understand why this syscall makes sense on SMP only, but you are
anyways checking num_online_cpus() and returning if it is only one. Is
this limitation necessary then? How do !SMP systems handle this
syscall? (I am guessing glibc wrapper?)
...

> +SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> +{
> +       switch (cmd) {
> +       case MEMBARRIER_CMD_QUERY:
> +               return MEMBARRIER_CMD_BITMASK;
> +       case MEMBARRIER_CMD_SHARED:
> +               if (num_online_cpus() > 1)
> +                       synchronize_sched();
> +               return 0;
> +       default:
> +               return -EINVAL;
> +       }
> +}



-- 
Pranith

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-21 11:47 ` Thomas Gleixner
@ 2015-04-22 20:36   ` Mathieu Desnoyers
  0 siblings, 0 replies; 16+ messages in thread
From: Mathieu Desnoyers @ 2015-04-22 20:36 UTC (permalink / raw)
  To: Thomas Gleixner, Josh Triplett
  Cc: linux-kernel, KOSAKI Motohiro, Steven Rostedt, Nicholas Miell,
	Linus Torvalds, Ingo Molnar, Alan Cox, Lai Jiangshan,
	Stephen Hemminger, Andrew Morton, Peter Zijlstra, David Howells,
	Pranith Kumar, Michael Kerrisk

----- Original Message -----
> On Fri, 17 Apr 2015, Mathieu Desnoyers wrote:
> > +/**
> > + * enum membarrier_cmd - membarrier system call command
> > + *
> 
>     * @MEMBARRIER_CMD_QUERY:	Explanatory blurb......
> 
> Hint: Create a kerneldoc document and build it.

Oops, fixed, and figured out how to build it. Next version
should be ok.

> 
> > + * Command to be passed to the membarrier system call.
> 
> Please epxlain, that the commands need to be a single bit each.

OK

> 
> > +config MEMBARRIER
> > +	bool "Enable membarrier() system call" if EXPERT
> > +	default y
> > +	depends on SMP
> > +	help
> > +	  Enable the membarrier() system call that allows issuing memory
> > +	  barriers across all running threads, which can be used to distribute
> > +	  the cost of user-space memory barriers asymmetrically by transforming
> > +	  pairs of memory barriers into pairs consisting of membarrier() and a
> > +	  compiler barrier.
> > +
> > +	  If unsure, say Y.
> 
> Is it really worth to make this configurable?

Josh Triplett asked for it to be made configurable. I don't have any strong
opinion one way or the other.


> 
> > +/**
> > + * sys_membarrier - issue memory barriers on a set of threads
> > + * @cmd:   MEMBARRIER_CMD_QUERY:
> > + *             Query the set of supported commands. It returns a bitmask
> > of
> > + *             supported commands.
> > + *         MEMBARRIER_CMD_SHARED:
> > + *             Execute a memory barrier on all running threads. Upon
> > + *             return from system call, the caller thread is ensured that
> > + *             all running threads have passed through a state where all
> > + *             memory accesses to user-space addresses match program order
> > + *             between entry to and return from the system call
> > (non-running
> > + *             threads are de facto in such a state). This covers threads
> > + *             from all processes running on the system. This command
> > + *             returns 0.
> 
> I think the explanation for the commands should be in the enum
> documentation. This here should explain that @cmd takes command values
> defined in the enum.

OK

> 
> > +SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> > +{
> > +	switch (cmd) {
> > +	case MEMBARRIER_CMD_QUERY:
> > +		return MEMBARRIER_CMD_BITMASK;
> > +	case MEMBARRIER_CMD_SHARED:
> > +		if (num_online_cpus() > 1)
> > +			synchronize_sched();
> > +		return 0;
> > +	default:
> > +		return -EINVAL;
> > +	}
> > +}
> 
> This looks way cleaner now :)

Much smaller too :)

Thanks,

Mathieu

> 
> Thanks,
> 
> 	tglx
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-22 20:24 ` Pranith Kumar
@ 2015-04-22 20:40   ` Mathieu Desnoyers
  2015-04-23  0:37     ` Pranith Kumar
  0 siblings, 1 reply; 16+ messages in thread
From: Mathieu Desnoyers @ 2015-04-22 20:40 UTC (permalink / raw)
  To: Pranith Kumar
  Cc: LKML, Josh Triplett, KOSAKI Motohiro, Steven Rostedt,
	Nicholas Miell, Linus Torvalds, Ingo Molnar, Alan Cox,
	Lai Jiangshan, Stephen Hemminger, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Michael Kerrisk

----- Original Message -----
> On Fri, Apr 17, 2015 at 11:06 AM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> > diff --git a/init/Kconfig b/init/Kconfig
> > index f5dbc6d..89bad6a 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -1559,6 +1559,19 @@ config PCI_QUIRKS
> >           bugs/quirks. Disable this only if your target machine is
> >           unaffected by PCI quirks.
> >
> > +config MEMBARRIER
> > +       bool "Enable membarrier() system call" if EXPERT
> > +       default y
> > +       depends on SMP
> > +       help
> > +         Enable the membarrier() system call that allows issuing memory
> > +         barriers across all running threads, which can be used to
> > distribute
> > +         the cost of user-space memory barriers asymmetrically by
> > transforming
> > +         pairs of memory barriers into pairs consisting of membarrier()
> > and a
> > +         compiler barrier.
> > +
> > +         If unsure, say Y.
> > +
> 
> I understand why this syscall makes sense on SMP only, but you are
> anyways checking num_online_cpus() and returning if it is only one. Is
> this limitation necessary then? How do !SMP systems handle this
> syscall? (I am guessing glibc wrapper?)

For !SMP, this system call is not implemented (returns -ENOSYS).
Userspace libs are expected to query sysconf(_SC_NPROCESSORS_CONF)
and check whether the system supports multiprocessor at all. If
only a single processor is supported by the kernel, then userspace
can skip the calls to sys_membarrier altogether, because they are
not even needed.

Do you think this kind of information belongs in a man page ?

Should we instead just implement the system call in !SMP, and
return 0 without any side-effect ? This would be a bit inefficient
to let userspace call a system call that has no effect whatsoever.

Thanks,

Mathieu

> ...
> 
> > +SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> > +{
> > +       switch (cmd) {
> > +       case MEMBARRIER_CMD_QUERY:
> > +               return MEMBARRIER_CMD_BITMASK;
> > +       case MEMBARRIER_CMD_SHARED:
> > +               if (num_online_cpus() > 1)
> > +                       synchronize_sched();
> > +               return 0;
> > +       default:
> > +               return -EINVAL;
> > +       }
> > +}
> 
> 
> 
> --
> Pranith
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-22 20:01     ` Michael Kerrisk (man-pages)
@ 2015-04-22 21:01       ` Mathieu Desnoyers
  2015-04-23 11:38         ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 16+ messages in thread
From: Mathieu Desnoyers @ 2015-04-22 21:01 UTC (permalink / raw)
  To: mtk manpages
  Cc: Davidlohr Bueso, lkml, Josh Triplett, KOSAKI Motohiro,
	Steven Rostedt, Nicholas Miell, Linus Torvalds, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Stephen Hemminger, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, David Howells, Pranith Kumar

----- Original Message -----
> On 22 April 2015 at 21:43, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> > ----- Original Message -----
> >> On Fri, 2015-04-17 at 11:06 -0400, Mathieu Desnoyers wrote:
> >> > [ Not in this patch: tests (Pranith Kumar has a patch for this), man
> >> > page.
> >> > ]
> >>
> >> Actually, we had discussed that for new syscalls, the changelog _would_
> >> be the manpage -- at least included in it.
> >
> > Do you want the formatted content, or just the text ? Putting
> > manpage formatting in the changelog seems a bit odd.
> 
> 
> Plain text works for me. Just divide it up into the usuual man-pages
> sections (see man-pages(7), and I'll take care of the rest).

OK. Should I do the writeup as a man(2) from the POV of a glibc
user ? It's a bit different when we get to return values and
error codes if we document kernel<->glibc or glibc<->user.

Moreover, how can we get this into glibc ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-22 20:40   ` Mathieu Desnoyers
@ 2015-04-23  0:37     ` Pranith Kumar
  2015-04-23  0:40       ` Stephen Hemminger
  0 siblings, 1 reply; 16+ messages in thread
From: Pranith Kumar @ 2015-04-23  0:37 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: LKML, Josh Triplett, KOSAKI Motohiro, Steven Rostedt,
	Nicholas Miell, Linus Torvalds, Ingo Molnar, Alan Cox,
	Lai Jiangshan, Stephen Hemminger, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Michael Kerrisk

On Wed, Apr 22, 2015 at 4:40 PM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> ----- Original Message -----
>> On Fri, Apr 17, 2015 at 11:06 AM, Mathieu Desnoyers
>> <mathieu.desnoyers@efficios.com> wrote:
>> > diff --git a/init/Kconfig b/init/Kconfig
>> > index f5dbc6d..89bad6a 100644
>> > --- a/init/Kconfig
>> > +++ b/init/Kconfig
>> > @@ -1559,6 +1559,19 @@ config PCI_QUIRKS
>> >           bugs/quirks. Disable this only if your target machine is
>> >           unaffected by PCI quirks.
>> >
>> > +config MEMBARRIER
>> > +       bool "Enable membarrier() system call" if EXPERT
>> > +       default y
>> > +       depends on SMP
>> > +       help
>> > +         Enable the membarrier() system call that allows issuing memory
>> > +         barriers across all running threads, which can be used to
>> > distribute
>> > +         the cost of user-space memory barriers asymmetrically by
>> > transforming
>> > +         pairs of memory barriers into pairs consisting of membarrier()
>> > and a
>> > +         compiler barrier.
>> > +
>> > +         If unsure, say Y.
>> > +
>>
>> I understand why this syscall makes sense on SMP only, but you are
>> anyways checking num_online_cpus() and returning if it is only one. Is
>> this limitation necessary then? How do !SMP systems handle this
>> syscall? (I am guessing glibc wrapper?)
>
> For !SMP, this system call is not implemented (returns -ENOSYS).
> Userspace libs are expected to query sysconf(_SC_NPROCESSORS_CONF)
> and check whether the system supports multiprocessor at all. If
> only a single processor is supported by the kernel, then userspace
> can skip the calls to sys_membarrier altogether, because they are
> not even needed.
>
> Do you think this kind of information belongs in a man page ?
>
> Should we instead just implement the system call in !SMP, and
> return 0 without any side-effect ? This would be a bit inefficient
> to let userspace call a system call that has no effect whatsoever.
>

Are there any other SMP-only system calls like this? I am not really
sure what is the right way but documenting it would be good.

-- 
Pranith

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-23  0:37     ` Pranith Kumar
@ 2015-04-23  0:40       ` Stephen Hemminger
  2015-04-23  2:48         ` Steven Rostedt
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2015-04-23  0:40 UTC (permalink / raw)
  To: Pranith Kumar
  Cc: Mathieu Desnoyers, LKML, Josh Triplett, KOSAKI Motohiro,
	Steven Rostedt, Nicholas Miell, Linus Torvalds, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Michael Kerrisk

On Wed, 22 Apr 2015 20:37:08 -0400
Pranith Kumar <bobby.prani@gmail.com> wrote:

> >> I understand why this syscall makes sense on SMP only, but you are
> >> anyways checking num_online_cpus() and returning if it is only one. Is
> >> this limitation necessary then? How do !SMP systems handle this
> >> syscall? (I am guessing glibc wrapper?)  
> >
> > For !SMP, this system call is not implemented (returns -ENOSYS).
> > Userspace libs are expected to query sysconf(_SC_NPROCESSORS_CONF)
> > and check whether the system supports multiprocessor at all. If
> > only a single processor is supported by the kernel, then userspace
> > can skip the calls to sys_membarrier altogether, because they are
> > not even needed.
> >
> > Do you think this kind of information belongs in a man page ?
> >
> > Should we instead just implement the system call in !SMP, and
> > return 0 without any side-effect ? This would be a bit inefficient
> > to let userspace call a system call that has no effect whatsoever.
> >  
> 
> Are there any other SMP-only system calls like this? I am not really
> sure what is the right way but documenting it would be good.


The syscall should just return 0.
Let the application not worry about how many CPU's are present

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-23  0:40       ` Stephen Hemminger
@ 2015-04-23  2:48         ` Steven Rostedt
  2015-04-23 13:45           ` Mathieu Desnoyers
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2015-04-23  2:48 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Pranith Kumar, Mathieu Desnoyers, LKML, Josh Triplett,
	KOSAKI Motohiro, Nicholas Miell, Linus Torvalds, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Michael Kerrisk

On Wed, 22 Apr 2015 17:40:51 -0700
Stephen Hemminger <stephen@networkplumber.org> wrote:

> The syscall should just return 0.
> Let the application not worry about how many CPU's are present

+1

-- Steve


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-17 15:06 [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86) Mathieu Desnoyers
                   ` (2 preceding siblings ...)
  2015-04-22 20:24 ` Pranith Kumar
@ 2015-04-23 10:33 ` Rasmus Villemoes
  2015-04-23 14:04   ` Mathieu Desnoyers
  3 siblings, 1 reply; 16+ messages in thread
From: Rasmus Villemoes @ 2015-04-23 10:33 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Josh Triplett, KOSAKI Motohiro, Steven Rostedt,
	Nicholas Miell, Linus Torvalds, Ingo Molnar, Alan Cox,
	Lai Jiangshan, Stephen Hemminger, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Pranith Kumar, Michael Kerrisk

On Fri, Apr 17 2015, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> + */
> +SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> +{
> +	switch (cmd) {
> +	case MEMBARRIER_CMD_QUERY:
> +		return MEMBARRIER_CMD_BITMASK;
> +	case MEMBARRIER_CMD_SHARED:
> +		if (num_online_cpus() > 1)
> +			synchronize_sched();
> +		return 0;
> +	default:
> +		return -EINVAL;
> +	}
> +}

Shouldn't flags be enforced 0, to actually make future extensions
possible without risk of breaking some sloppy userspace? I think that is
or should be part of "make sure new syscalls take a flags parameter".

> + * If this system call is not implemented, -ENOSYS is returned. If the
> + * command specified does not exist, or if the command argument is invalid,
> + * this system call returns -EINVAL. For a given command, this system call
> + * is guaranteed to always return the same value until reboot.

I like that guarantee, but it may be a bit much to promise for any and
all possible future flags. So maybe weaken it to 'For a given command
and flags==0, this ...'.


Rasmus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-22 21:01       ` Mathieu Desnoyers
@ 2015-04-23 11:38         ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-04-23 11:38 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Davidlohr Bueso, lkml, Josh Triplett, KOSAKI Motohiro,
	Steven Rostedt, Nicholas Miell, Linus Torvalds, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Stephen Hemminger, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, David Howells, Pranith Kumar,
	Carlos O'Donell

Hi Mathieu,

On 22 April 2015 at 23:01, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> ----- Original Message -----
>> On 22 April 2015 at 21:43, Mathieu Desnoyers
>> <mathieu.desnoyers@efficios.com> wrote:
>> > ----- Original Message -----
>> >> On Fri, 2015-04-17 at 11:06 -0400, Mathieu Desnoyers wrote:
>> >> > [ Not in this patch: tests (Pranith Kumar has a patch for this), man
>> >> > page.
>> >> > ]
>> >>
>> >> Actually, we had discussed that for new syscalls, the changelog _would_
>> >> be the manpage -- at least included in it.
>> >
>> > Do you want the formatted content, or just the text ? Putting
>> > manpage formatting in the changelog seems a bit odd.
>>
>>
>> Plain text works for me. Just divide it up into the usuual man-pages
>> sections (see man-pages(7), and I'll take care of the rest).
>
> OK. Should I do the writeup as a man(2) from the POV of a glibc
> user ? It's a bit different when we get to return values and
> error codes if we document kernel<->glibc or glibc<->user.

Ideally, from glibc POV, since that is the POV od user space consumers.

> Moreover, how can we get this into glibc ?

An eternal question...

https://lwn.net/Articles/534682/

I added Carlos, who can probably best summarize the current state of
things. I think there's a wiki discussion on this topic, but I could
not find a URL.

Cheers,

Michael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-23  2:48         ` Steven Rostedt
@ 2015-04-23 13:45           ` Mathieu Desnoyers
  0 siblings, 0 replies; 16+ messages in thread
From: Mathieu Desnoyers @ 2015-04-23 13:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Stephen Hemminger, Pranith Kumar, LKML, Josh Triplett,
	KOSAKI Motohiro, Nicholas Miell, Linus Torvalds, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Michael Kerrisk

----- Original Message -----
> On Wed, 22 Apr 2015 17:40:51 -0700
> Stephen Hemminger <stephen@networkplumber.org> wrote:
> 
> > The syscall should just return 0.
> > Let the application not worry about how many CPU's are present
> 
> +1

This is indeed how I implemented it initially. The nice thing
about this approach is that if the application don't care much
about the overhead of calling sys_membarrier on !SMP, returning
0 tells the application that sys_membarrier is indeed supported,
and that the application don't need to issue memory barriers on
the other target threads (compiler barrier is then sufficient),
which is correct.

I'll update the patch accordingly.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86)
  2015-04-23 10:33 ` Rasmus Villemoes
@ 2015-04-23 14:04   ` Mathieu Desnoyers
  0 siblings, 0 replies; 16+ messages in thread
From: Mathieu Desnoyers @ 2015-04-23 14:04 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: linux-kernel, Josh Triplett, KOSAKI Motohiro, Steven Rostedt,
	Nicholas Miell, Linus Torvalds, Ingo Molnar, Alan Cox,
	Lai Jiangshan, Stephen Hemminger, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Howells, Pranith Kumar, Michael Kerrisk

----- Original Message -----
> On Fri, Apr 17 2015, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> wrote:
> 
> > + */
> > +SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
> > +{
> > +	switch (cmd) {
> > +	case MEMBARRIER_CMD_QUERY:
> > +		return MEMBARRIER_CMD_BITMASK;
> > +	case MEMBARRIER_CMD_SHARED:
> > +		if (num_online_cpus() > 1)
> > +			synchronize_sched();
> > +		return 0;
> > +	default:
> > +		return -EINVAL;
> > +	}
> > +}
> 
> Shouldn't flags be enforced 0, to actually make future extensions
> possible without risk of breaking some sloppy userspace? I think that is
> or should be part of "make sure new syscalls take a flags parameter".

Very good point! I will update the code to check this, and the documentation,
with the wording:

(in membarrier.c:)
"@flags: Currently needs to be 0. For future extensions."
(in man page)
"The flags argument needs to be 0. For future extensions."

> 
> > + * If this system call is not implemented, -ENOSYS is returned. If the
> > + * command specified does not exist, or if the command argument is
> > invalid,
> > + * this system call returns -EINVAL. For a given command, this system call
> > + * is guaranteed to always return the same value until reboot.
> 
> I like that guarantee, but it may be a bit much to promise for any and
> all possible future flags. So maybe weaken it to 'For a given command
> and flags==0, this ...'.

This makes tons of sense, updating the doc with this too, with the
wording:

"For a given command, with flags argument set to 0, this system call
is guaranteed to always return the same value until reboot."

Thanks!

Mathieu

> 
> 
> Rasmus
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-04-23 14:04 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-17 15:06 [PATCH v16] sys_membarrier(): system-wide memory barrier (generic, x86) Mathieu Desnoyers
2015-04-21 11:47 ` Thomas Gleixner
2015-04-22 20:36   ` Mathieu Desnoyers
2015-04-21 16:29 ` Davidlohr Bueso
2015-04-22 19:43   ` Mathieu Desnoyers
2015-04-22 20:01     ` Michael Kerrisk (man-pages)
2015-04-22 21:01       ` Mathieu Desnoyers
2015-04-23 11:38         ` Michael Kerrisk (man-pages)
2015-04-22 20:24 ` Pranith Kumar
2015-04-22 20:40   ` Mathieu Desnoyers
2015-04-23  0:37     ` Pranith Kumar
2015-04-23  0:40       ` Stephen Hemminger
2015-04-23  2:48         ` Steven Rostedt
2015-04-23 13:45           ` Mathieu Desnoyers
2015-04-23 10:33 ` Rasmus Villemoes
2015-04-23 14:04   ` Mathieu Desnoyers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).