* [RFC PATCH man-pages] Add membarrier system call man page
@ 2015-12-13 13:17 Mathieu Desnoyers
  2015-12-15 16:14   ` Michael Kerrisk (man-pages)
  2015-12-18 18:40   ` Davidlohr Bueso
  0 siblings, 2 replies; 7+ messages in thread
From: Mathieu Desnoyers @ 2015-12-13 13:17 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: linux-kernel, Mathieu Desnoyers, Paul E. McKenney, Josh Triplett,
	KOSAKI Motohiro, Steven Rostedt, Nicholas Miell, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Stephen Hemminger, Thomas Gleixner,
	Peter Zijlstra, David Howells, Pranith Kumar, Shuah Khan,
	Andrew Morton, Linus Torvalds, linux-api

[ Updated following feedback from Michael Kerrisk. Not sure what to put
  in SEE ALSO section ? Also, the example uses the syscall() macro.
  Should we target this, or some API eventually exposed by glibc ? ]

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nicholas Miell <nmiell@comcast.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Pranith Kumar <bobby.prani@gmail.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
CC: linux-api@vger.kernel.org
---
 man2/membarrier.2 | 269 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 269 insertions(+)
 create mode 100644 man2/membarrier.2

diff --git a/man2/membarrier.2 b/man2/membarrier.2
new file mode 100644
index 0000000..552d817
--- /dev/null
+++ b/man2/membarrier.2
@@ -0,0 +1,269 @@
+.\" Copyright 2015 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date.  The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein.  The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH MEMBARRIER 2 2015-04-15 "Linux" "Linux Programmer's Manual"
+.SH NAME
+membarrier \- issue memory barriers on a set of threads
+.SH SYNOPSIS
+.B #include <linux/membarrier.h>
+.sp
+.BI "int membarrier(int " cmd ", int " flags ");
+.sp
+.SH DESCRIPTION
+The membarrier system call helps reduce the overhead of the memory barrier
+instructions required to order memory accesses on multi-core systems.
+However, this system call is heavier than a memory barrier, so using it
+effectively is
+.B not
+as simple as replacing memory barriers with this
+system call, but requires understanding the following:
+
+Use of memory barriers must take into account that a memory barrier
+always needs either to be matched with its memory barrier counterparts,
+or to rely on the architecture's memory model not requiring the
+matching barriers.
+
+There are cases where one side of the matching barriers (which we will
+refer to as "fast side") is executed much more often than the other
+(which we will refer to as "slow side"). This is a prime target for the
+membarrier system call. The key idea is to replace, for these matching
+barriers, the fast side memory barriers by simple compiler barriers,
+e.g.:
+
+  asm volatile ("" : : : "memory")
+
+and replace the slow side memory barriers by the membarrier system call.
+
+This will add overhead to the slow side, and remove overhead from the
+fast side, thus resulting in an overall performance increase as long as
+the slow side is infrequent enough that the membarrier system call
+overhead does not outweigh the performance gain on the fast side.
+
+Examples where this system call can be useful include implementations
+of Read-Copy-Update libraries and garbage collectors.
+
+The
+.I cmd
+argument is one of the following:
+
+.TP
+.B MEMBARRIER_CMD_QUERY
+Query the set of supported commands. It returns a bitmask of supported
+commands.
+.TP
+.B MEMBARRIER_CMD_SHARED
+Ensure that all threads from all processes on the system pass through a
+state where all memory accesses to user-space addresses match program
+order between entry to and return from the membarrier system call.
+All threads on the system are targeted by this command. This command
+returns 0.
+
+.PP
+The
+.I cmd
+argument expects a one-hot bit of a bitmask, except for the
+.B MEMBARRIER_CMD_QUERY
+command which has the value 0. This query command is always supported,
+even though it is not part of the bitmask.
+
+.PP
+The
+.I flags
+argument is currently unused.
+
+.PP
+All memory accesses performed in program order from each targeted thread
+are guaranteed to be ordered with respect to sys_membarrier(). If we use
+the notation "barrier()" to represent a compiler barrier forcing memory
+accesses to be performed in program order across the barrier, and
+smp_mb() to represent explicit memory barriers forcing full memory
+ordering across the barrier, we have the following ordering table for
+each pair of barrier(), sys_membarrier() and smp_mb():
+
+The pair ordering is detailed as (O: ordered, X: not ordered):
+
+                       barrier()   smp_mb() sys_membarrier()
+       barrier()          X           X            O
+       smp_mb()           X           O            O
+       sys_membarrier()   O           O            O
+
+.SH RETURN VALUE
+On success, this system call returns zero.  On error, \-1 is returned,
+and
+.I errno
+is set appropriately.
+For a given command, with the
+.I flags
+argument set to 0, this system call is guaranteed to always return the
+same value until reboot.  Further calls with the same arguments will
+lead to the same result.  Therefore, it is sufficient to handle errors
+for the first call to
+.BR membarrier ()
+in a program or library initialization function.
+
+.SH ERRORS
+.TP
+.B ENOSYS
+System call is not implemented.
+.TP
+.B EINVAL
+.I cmd
+is invalid or
+.I flags
+is non-zero.
+
+.SH VERSIONS
+The membarrier system call was added in Linux 4.3.
+
+.SH CONFORMING TO
+.BR membarrier ()
+is Linux-specific.
+
+.SH NOTES
+
+A memory barrier instruction is part of the instruction set of
+architectures with weakly-ordered memory models. It orders memory
+accesses prior to the barrier and after the barrier with respect to
+matching barriers on other cores. For instance, a load fence can order
+loads prior to and following that fence with respect to stores ordered
+by store fences.
+
+Program order is the order in which instructions are ordered in the
+program assembly code.
+
+.SH EXAMPLE
+
+Assuming a multithreaded application where
+"fast_path()" is executed very frequently,
+and where "slow_path()" is executed infrequently,
+the following code (x86) can be transformed using
+.BR membarrier ():
+
+.nf
+#include <stdlib.h>
+
+static volatile int a, b;
+
+static void fast_path(void)
+{
+	int read_a, read_b;
+
+	read_b = b;
+	asm volatile ("mfence" : : : "memory");
+	read_a = a;
+	/* read_b == 1 implies read_a == 1. */
+	if (read_b == 1 && read_a == 0)
+		abort();
+}
+
+static void slow_path(void)
+{
+	a = 1;
+	asm volatile ("mfence" : : : "memory");
+	b = 1;
+}
+
+int main(int argc, char **argv)
+{
+	/*
+	 * Real applications would call fast_path() and slow_path() from
+	 * different threads. Call those from main() to keep this
+	 * example short.
+	 */
+	slow_path();
+	fast_path();
+	exit(EXIT_SUCCESS);
+}
+.fi
+
+The code above transformed to use the
+.BR membarrier ()
+system call becomes:
+
+.nf
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <linux/membarrier.h>
+
+static volatile int a, b;
+
+static int membarrier(int cmd, int flags)
+{
+	return syscall(__NR_membarrier, cmd, flags);
+}
+
+static int init_membarrier(void)
+{
+	int ret;
+
+	/* Ensure that membarrier is supported. */
+	ret = membarrier(MEMBARRIER_CMD_QUERY, 0);
+	if (ret < 0) {
+		perror("membarrier");
+		return -1;
+	}
+	if (!(ret & MEMBARRIER_CMD_SHARED)) {
+		fprintf(stderr,
+			"membarrier does not support MEMBARRIER_CMD_SHARED.\\n");
+		return -1;
+	}
+	return 0;
+}
+
+static void fast_path(void)
+{
+	int read_a, read_b;
+
+	read_b = b;
+	asm volatile ("" : : : "memory");
+	read_a = a;
+	/* read_b == 1 implies read_a == 1. */
+	if (read_b == 1 && read_a == 0)
+		abort();
+}
+
+static void slow_path(void)
+{
+	a = 1;
+	membarrier(MEMBARRIER_CMD_SHARED, 0);
+	b = 1;
+}
+
+int main(int argc, char **argv)
+{
+	if (init_membarrier())
+		exit(EXIT_FAILURE);
+	/*
+	 * Real applications would call fast_path() and slow_path() from
+	 * different threads. Call those from main() to keep this
+	 * example short.
+	 */
+	slow_path();
+	fast_path();
+	exit(EXIT_SUCCESS);
+}
+.fi
-- 
2.1.4
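
As a follow-up to the RETURN VALUE text above (errors only need handling
once, at initialization), here is a minimal sketch of how a library might
cache the query result and fall back to ordinary full barriers on both
sides when MEMBARRIER_CMD_SHARED is unavailable. All sketch_* names and
SKETCH_* macros are hypothetical and not part of the patch; the command
values are the ones the page describes (query is 0, shared is a single
bit), and __sync_synchronize() is assumed to be available as a GCC/Clang
builtin.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Command values as described in the page: QUERY is 0, SHARED is one bit.
 * Redefined under hypothetical SKETCH_ names so that the sketch does not
 * depend on <linux/membarrier.h> being installed.
 */
#define SKETCH_CMD_QUERY	0
#define SKETCH_CMD_SHARED	(1 << 0)

/* 1 if MEMBARRIER_CMD_SHARED is usable, 0 otherwise; set once at init. */
static int sketch_has_membarrier;

static int sketch_membarrier(int cmd, int flags)
{
#ifdef __NR_membarrier
	return (int) syscall(__NR_membarrier, cmd, flags);
#else
	errno = ENOSYS;		/* headers predate the system call */
	return -1;
#endif
}

/* Call once, before any thread uses the barriers below. */
static int sketch_init(void)
{
	int mask = sketch_membarrier(SKETCH_CMD_QUERY, 0);

	sketch_has_membarrier = (mask > 0) && (mask & SKETCH_CMD_SHARED);
	return sketch_has_membarrier;
}

/* Fast side: compiler barrier only if the slow side can use membarrier. */
static inline void sketch_fast_barrier(void)
{
	if (sketch_has_membarrier)
		__asm__ __volatile__ ("" : : : "memory");
	else
		__sync_synchronize();	/* full memory barrier */
}

/* Slow side: pairs with sketch_fast_barrier() in either mode. */
static inline void sketch_slow_barrier(void)
{
	if (sketch_has_membarrier)
		(void) sketch_membarrier(SKETCH_CMD_SHARED, 0);
	else
		__sync_synchronize();
}
```

The point of the design is that both sides switch together: the fast side
may only drop to a compiler barrier when the slow side really issues the
system call. A caller would run sketch_init() once at start-up and then
use sketch_fast_barrier() and sketch_slow_barrier() where the example uses
its barriers.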



* Re: [RFC PATCH man-pages] Add membarrier system call man page
@ 2015-12-15 16:14   ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-12-15 16:14 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: mtk.manpages, linux-kernel, Paul E. McKenney, Josh Triplett,
	KOSAKI Motohiro, Steven Rostedt, Nicholas Miell, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Stephen Hemminger, Thomas Gleixner,
	Peter Zijlstra, David Howells, Pranith Kumar, Shuah Khan,
	Andrew Morton, Linus Torvalds, linux-api

Hi Mathieu,

On 12/13/2015 02:17 PM, Mathieu Desnoyers wrote:
> [ Updated following feedback from Michael Kerrisk. Not sure what to put
>   in SEE ALSO section ? 

Maybe we'll think of something later.

>   Also, the example uses the syscall() macro.
>   Should we target this, or some API eventually exposed by glibc ? ]

I think it's okay.

I've applied this patch, made some light edits, and pushed to
the public Git.

Thanks for the much better page, Mathieu!

Cheers,

Michael




-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [RFC PATCH man-pages] Add membarrier system call man page
@ 2015-12-15 16:14   ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-12-15 16:14 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Paul E. McKenney,
	Josh Triplett, KOSAKI Motohiro, Steven Rostedt, Nicholas Miell,
	Ingo Molnar, Alan Cox, Lai Jiangshan, Stephen Hemminger,
	Thomas Gleixner, Peter Zijlstra, David Howells, Pranith Kumar,
	Shuah Khan, Andrew Morton, Linus Torvalds,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi Mathieu,

On 12/13/2015 02:17 PM, Mathieu Desnoyers wrote:
> [ Updated following feedback from Michael Kerrisk. Not sure what to put
>   in SEE ALSO section ? 

Maybe we think of something later.

>   Also, the example uses the syscall() macro.
>   Should we target this, or some API eventually exposed by glibc ? ]

I think it's okay.

I've applied this patch, made some light edits, and pushed to
the public Git.

Thanks for the much better page, Mathieu!

Cheers,

Michael


> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
> Cc: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Paul E. McKenney <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> Cc: Josh Triplett <josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org>
> Cc: KOSAKI Motohiro <kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>
> Cc: Nicholas Miell <nmiell-Wuw85uim5zDR7s880joybQ@public.gmane.org>
> Cc: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Alan Cox <gnomes-qBU/x9rampVanCEyBjwyrvXRex20P6io@public.gmane.org>
> Cc: Lai Jiangshan <laijs-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> Cc: Stephen Hemminger <stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ@public.gmane.org>
> Cc: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> Cc: David Howells <dhowells-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Pranith Kumar <bobby.prani-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: Shuah Khan <shuahkh-JPH+aEBZ4P+UEJcrhfAQsw@public.gmane.org>
> Cc: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
> Cc: Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
> CC: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> ---
>  man2/membarrier.2 | 269 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 269 insertions(+)
>  create mode 100644 man2/membarrier.2
> 
> diff --git a/man2/membarrier.2 b/man2/membarrier.2
> new file mode 100644
> index 0000000..552d817
> --- /dev/null
> +++ b/man2/membarrier.2
> @@ -0,0 +1,269 @@
> +.\" Copyright 2015 Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of this
> +.\" manual under the conditions for verbatim copying, provided that the
> +.\" entire resulting derived work is distributed under the terms of a
> +.\" permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date.  The author(s) assume no
> +.\" responsibility for errors or omissions, or for damages resulting from
> +.\" the use of the information contained herein.  The author(s) may not
> +.\" have taken the same level of care in the production of this manual,
> +.\" which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH MEMBARRIER 2 2015-04-15 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +membarrier \- issue memory barriers on a set of threads
> +.SH SYNOPSIS
> +.B #include <linux/membarrier.h>
> +.sp
> +.BI "int membarrier(int " cmd ", int " flags ");
> +.sp
> +.SH DESCRIPTION
> +The membarrier system call helps reducing overhead of memory barrier
> +instructions required to order memory accesses on multi-core systems.
> +However, this system call is heavier than a memory barrier, so using it
> +effectively is
> +.B not
> +as simple as replacing memory barriers with this
> +system call, but requires understanding the following:
> +
> +Use of memory barriers needs to be done taking into account that a
> +memory barrier always needs to be either matched with its memory barrier
> +counterparts, or that the architecture's memory model don't require the
> +matching barriers.
> +
> +There are cases where one side of the matching barriers (which we will
> +refer to as "fast side") is executed much more often than the other
> +(which we will refer to as "slow side"). This is a prime target for the
> +membarrier system call. The key idea is to replace, for these matching
> +barriers, the fast side memory barriers by simple compiler barriers,
> +e.g.:
> +
> +  asm volatile ("" : : : "memory")
> +
> +and replace the slow side memory barriers by the membarrier system call.
> +
> +This will add overhead to the slow side, and remove overhead from the
> +fast side, thus resulting in an overall performance increase as long as
> +the slow side is infrequent enough that the membarrier system call
> +overhead does not counterweight the performance gain on the fast side.
> +
> +Examples where this system call can be useful includes implementations
> +of Ready-Copy Update librarires, and garbage collectors.
> +
> +The
> +.I cmd
> +argument is one of the following:
> +
> +.TP
> +.B MEMBARRIER_CMD_QUERY
> +Query the set of supported commands. It returns a bitmask of supported
> +commands.
> +.TP
> +.B MEMBARRIER_CMD_SHARED
> +Ensure that all threads from all processes on the system pass through a
> +state where all memory accesses to user-space addresses match program
> +order between entry to and return from the membarrier system call.
> +All threads on the system are targeted by this command. This command
> +returns 0.
> +
> +.PP
> +The
> +.I cmd
> +argument expects a one-hot bit of a bitmask, except for the
> +.B MEMBARRIER_CMD_QUERY
> +command which has the value 0. This query command is always supported,
> +even though it is not part of the bitmask.
> +
> +.PP
> +The
> +.I flags
> +argument is currently unused.
> +
> +.PP
> +All memory accesses performed in program order from each targeted thread
> +is guaranteed to be ordered with respect to sys_membarrier(). If we use
> +the semantic "barrier()" to represent a compiler barrier forcing memory
> +accesses to be performed in program order across the barrier, and
> +smp_mb() to represent explicit memory barriers forcing full memory
> +ordering across the barrier, we have the following ordering table for
> +each pair of barrier(), sys_membarrier() and smp_mb():
> +
> +The pair ordering is detailed as (O: ordered, X: not ordered):
> +
> +                       barrier()   smp_mb() sys_membarrier()
> +       barrier()          X           X            O
> +       smp_mb()           X           O            O
> +       sys_membarrier()   O           O            O
> +
> +.SH RETURN VALUE
> +On success, this system call returns zero.  On error, \-1 is returned,
> +and
> +.I errno
> +is set appropriately.
> +For a given command, with flags argument set to 0, this system call is
> +guaranteed to always return the same value until reboot. Therefore, it
> +is sufficient to handle errors in a program or library initialization
> +function. Further calls with the same parameters will lead to the same
> +result. Therefore, for flag argument set to 0, error handling is only
> +required for the first calls to the
> +.BR membarrier ()
> +system call in an application.
> +
> +.SH ERRORS
> +.TP
> +.B ENOSYS
> +System call is not implemented.
> +.TP
> +.B EINVAL
> +.I cmd
> +is invalid or
> +.I flags
> +is non-zero.
> +
> +.SH VERSIONS
> +The membarrier system call was added in Linux 4.3.
> +
> +.SH CONFORMING TO
> +.BR membarrier ()
> +is Linux-specific.
> +
> +.SH NOTES
> +
> +A memory barrier instruction is part of the instruction set of
> +architectures with weakly-ordered memory models. It orders memory
> +accesses prior to the barrier and after the barrier with respect to
> +matching barriers on other cores. For instance, a load fence can order
> +loads prior to and following that fence with respect to stores ordered
> +by store fences.
> +
> +Program order is the order in which instructions are ordered in the
> +program assembly code.
> +
> +.SH EXAMPLE
> +
> +Assuming a multithreaded application where "fast_path()" is executed
> +very frequently, and where "slow_path()" is executed infrequently, the
> +following code (x86) can be transformed using
> +.BR membarrier()
> +:
> +
> +.nf
> +#include <stdlib.h>
> +
> +static volatile int a, b;
> +
> +static void fast_path(void)
> +{
> +	int read_a, read_b;
> +
> +	read_b = b;
> +	asm volatile ("mfence" : : : "memory");
> +	read_a = a;
> +	/* read_b == 1 implies read_a == 1. */
> +	if (read_b == 1 && read_a == 0)
> +		abort();
> +}
> +
> +static void slow_path(void)
> +{
> +	a = 1;
> +	asm volatile ("mfence" : : : "memory");
> +	b = 1;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	/*
> +	 * Real applications would call fast_path() and slow_path() from
> +	 * different threads. Call those from main() to keep this
> +	 * example short.
> +	 */
> +	slow_path();
> +	fast_path();
> +	exit(EXIT_SUCCESS);
> +}
> +.fi
> +
> +The code above transformed to use the
> +.BR membarrier()
> +system call becomes:
> +
> +.nf
> +#define _GNU_SOURCE
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <sys/syscall.h>
> +#include <linux/membarrier.h>
> +
> +static volatile int a, b;
> +
> +static int membarrier(int cmd, int flags)
> +{
> +	return syscall(__NR_membarrier, cmd, flags);
> +}
> +
> +static int init_membarrier(void)
> +{
> +	int ret;
> +
> +	/* Ensure that membarrier is supported. */
> +	ret = membarrier(MEMBARRIER_CMD_QUERY, 0);
> +	if (ret < 0) {
> +		perror("membarrier");
> +		return -1;
> +	}
> +	if (!(ret & MEMBARRIER_CMD_SHARED)) {
> +		fprintf(stderr,
> +			"membarrier does not support MEMBARRIER_CMD_SHARED.\\n");
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +static void fast_path(void)
> +{
> +	int read_a, read_b;
> +
> +	read_b = b;
> +	asm volatile ("" : : : "memory");
> +	read_a = a;
> +	/* read_b == 1 implies read_a == 1. */
> +	if (read_b == 1 && read_a == 0)
> +		abort();
> +}
> +
> +static void slow_path(void)
> +{
> +	a = 1;
> +	membarrier(MEMBARRIER_CMD_SHARED, 0);
> +	b = 1;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	if (init_membarrier())
> +		exit(EXIT_FAILURE);
> +	/*
> +	 * Real applications would call fast_path() and slow_path() from
> +	 * different threads. Call those from main() to keep this
> +	 * example short.
> +	 */
> +	slow_path();
> +	fast_path();
> +	exit(EXIT_SUCCESS);
> +}
> +.fi
> 
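
The comment in main() notes that real applications would call fast_path()
and slow_path() from different threads. A minimal sketch of that
arrangement follows, assuming POSIX threads and a GCC-compatible compiler.
The names reader_thread(), run_threaded_example() and the stop flag are
hypothetical and do not appear in the patch, and slow_path() is changed
here to return the membarrier() result. The compiler barrier is spelled
__asm__ __volatile__ so that it also builds in strict ISO C modes.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/membarrier.h>

static volatile int a, b;
static volatile int stop;	/* hypothetical flag: tells the reader to exit */

static int membarrier(int cmd, int flags)
{
	return (int) syscall(__NR_membarrier, cmd, flags);
}

/* Reader side: only a compiler barrier separates the two loads. */
static void fast_path(void)
{
	int read_a, read_b;

	read_b = b;
	__asm__ __volatile__ ("" : : : "memory");
	read_a = a;
	/* read_b == 1 implies read_a == 1. */
	if (read_b == 1 && read_a == 0)
		abort();
}

/* Writer side: membarrier() orders the store to a before the store to b. */
static int slow_path(void)
{
	a = 1;
	if (membarrier(MEMBARRIER_CMD_SHARED, 0) < 0)
		return -1;	/* b is left at 0, so the invariant still holds */
	b = 1;
	return 0;
}

/* Hypothetical reader thread: runs fast_path() until told to stop. */
static void *reader_thread(void *arg)
{
	(void) arg;
	while (!stop)
		fast_path();
	fast_path();	/* one last check after the writer has finished */
	return NULL;
}

/* Returns 0 if the writer completed, -1 if membarrier() is unusable. */
static int run_threaded_example(void)
{
	pthread_t reader;
	int ret;

	ret = membarrier(MEMBARRIER_CMD_QUERY, 0);
	if (ret < 0 || !(ret & MEMBARRIER_CMD_SHARED))
		return -1;
	if (pthread_create(&reader, NULL, reader_thread, NULL) != 0)
		return -1;
	ret = slow_path();
	stop = 1;
	pthread_join(reader, NULL);
	return ret;
}
```

A caller would invoke run_threaded_example() once from main() and treat a
return value of -1 as "membarrier() unavailable or failed". The reader
keeps its cheap compiler barrier; the cost of ordering is paid only by the
infrequent writer.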


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [RFC PATCH man-pages] Add membarrier system call man page
@ 2015-12-18 18:40   ` Davidlohr Bueso
  0 siblings, 0 replies; 7+ messages in thread
From: Davidlohr Bueso @ 2015-12-18 18:40 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Michael Kerrisk, linux-kernel, Paul E. McKenney, Josh Triplett,
	KOSAKI Motohiro, Steven Rostedt, Nicholas Miell, Ingo Molnar,
	Alan Cox, Lai Jiangshan, Stephen Hemminger, Thomas Gleixner,
	Peter Zijlstra, David Howells, Pranith Kumar, Shuah Khan,
	Andrew Morton, Linus Torvalds, linux-api

On Sun, 13 Dec 2015, Mathieu Desnoyers wrote:
>+.SH RETURN VALUE
>+On success, this system call returns zero.  On error, \-1 is returned,
>+and

For the zero return, would it make sense to specify that it is also the case
for MEMBARRIER_CMD_SHARED under UP? It's pretty obvious it should be a no-op,
but it wouldn't hurt to make it explicit.

Thanks,
Davidlohr


* Re: [RFC PATCH man-pages] Add membarrier system call man page
@ 2015-12-18 19:37     ` Mathieu Desnoyers
  0 siblings, 0 replies; 7+ messages in thread
From: Mathieu Desnoyers @ 2015-12-18 19:37 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Michael Kerrisk, linux-kernel, Paul E. McKenney, Josh Triplett,
	KOSAKI Motohiro, rostedt, Nicholas Miell, Ingo Molnar,
	One Thousand Gnomes, Lai Jiangshan, Stephen Hemminger,
	Thomas Gleixner, Peter Zijlstra, David Howells, Pranith Kumar,
	Shuah Khan, Andrew Morton, Linus Torvalds, linux-api

----- On Dec 18, 2015, at 1:40 PM, Davidlohr Bueso dave@stgolabs.net wrote:

> On Sun, 13 Dec 2015, Mathieu Desnoyers wrote:
>>+.SH RETURN VALUE
>>+On success, this system call returns zero.  On error, \-1 is returned,
>>+and
> 
> For the zero return, would it make sense to specify that it is also the case
> for MEMBARRIER_CMD_SHARED under UP? It's pretty obvious it should be a no-op,
> but it wouldn't hurt to make it explicit.

My understanding is that man pages should not document the internal behavior
of the system call. What matters from a user-space perspective, independently
of UP vs SMP, is that membarrier() with the MEMBARRIER_CMD_SHARED command
either succeeds (returns 0) or fails (returns -1 with errno set, e.g. to
ENOSYS or EINVAL).
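
As an illustration of that contract, here is a minimal sketch of a caller
that handles only the two documented outcomes and contains no UP or SMP
special case. The helper name shared_barrier_checked() is hypothetical;
the membarrier() wrapper is the same syscall()-based one used in the man
page example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/membarrier.h>

static int membarrier(int cmd, int flags)
{
	return (int) syscall(__NR_membarrier, cmd, flags);
}

/*
 * Hypothetical helper: issue MEMBARRIER_CMD_SHARED and report failure.
 * Returns 0 on success, or -1 with errno preserved from the system call.
 */
static int shared_barrier_checked(void)
{
	int err;

	if (membarrier(MEMBARRIER_CMD_SHARED, 0) == 0)
		return 0;

	err = errno;	/* save errno: fprintf() may overwrite it */
	switch (err) {
	case ENOSYS:
		fprintf(stderr, "membarrier: not implemented by this kernel\n");
		break;
	case EINVAL:
		fprintf(stderr, "membarrier: command or flags not supported\n");
		break;
	default:
		fprintf(stderr, "membarrier: %s\n", strerror(err));
		break;
	}
	errno = err;
	return -1;
}
```

The caller never asks how many CPUs are online: whether the kernel treats
the command as a no-op is invisible at this level, and only the return
value and errno are inspected.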

By the way, the updated man page text now has this description for
MEMBARRIER_CMD_SHARED:

"      MEMBARRIER_CMD_SHARED
              Ensure that all threads from all processes on  the  system  pass
              through   a  state  where  all  memory  accesses  to  user-space
              addresses match program order between entry to and  return  from
              the  membarrier()  system  call.   All threads on the system are
              targeted by this command."

The text above is true both on UP and SMP.

I fear that calling out details about UP vs SMP in the man page might confuse
users, leading them to think they need to do special handling of UP, even
though this is something about which they really should not have to worry.

Thanks,

Mathieu

> 
> Thanks,
> Davidlohr

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


end of thread, other threads:[~2015-12-18 19:38 UTC | newest]

Thread overview: 7+ messages
2015-12-13 13:17 [RFC PATCH man-pages] Add membarrier system call man page Mathieu Desnoyers
2015-12-15 16:14 ` Michael Kerrisk (man-pages)
2015-12-15 16:14   ` Michael Kerrisk (man-pages)
2015-12-18 18:40 ` Davidlohr Bueso
2015-12-18 18:40   ` Davidlohr Bueso
2015-12-18 19:37   ` Mathieu Desnoyers
2015-12-18 19:37     ` Mathieu Desnoyers
