linux-api.vger.kernel.org archive mirror
* [RFC PATCH manpages 0/3] rseq, cpu_opv, membarrier man pages updates
@ 2017-11-15 19:13 Mathieu Desnoyers
       [not found] ` <20171115191316.828-1-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Mathieu Desnoyers @ 2017-11-15 19:13 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Peter Zijlstra,
	Paul E . McKenney, Boqun Feng, Andy Lutomirski, Dave Watson,
	Paul Turner, Andrew Morton, Russell King, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Andrew Hunter, Andi Kleen,
	Chris Lameter, Ben Maurer, Steven Rostedt, Josh Triplett,
	Linus Torvalds, Catalin Marinas

Hi Michael,

Here are the new man pages for the Restartable Sequences (rseq) and
cpu_opv system calls (proposed for 4.15), and an update to the
membarrier man page, which covers the private expedited commands
introduced in 4.14, as well as the shared expedited and core serializing
private expedited commands (proposed for 4.15).

Feedback is welcome!

Thanks,

Mathieu

Mathieu Desnoyers (3):
  Add cpu_opv system call manpage
  Add rseq manpage
  Update membarrier manpage for 4.14, 4.15

 man2/cpu_opv.2    | 297 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 man2/membarrier.2 |  98 +++++++++++++++---
 man2/rseq.2       | 268 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 649 insertions(+), 14 deletions(-)
 create mode 100644 man2/cpu_opv.2
 create mode 100644 man2/rseq.2

-- 
2.11.0


* [RFC PATCH manpages 1/3] Add cpu_opv system call manpage
       [not found] ` <20171115191316.828-1-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
@ 2017-11-15 19:13   ` Mathieu Desnoyers
  2017-11-15 19:13   ` [RFC PATCH manpages 2/3] Add rseq manpage Mathieu Desnoyers
  2017-11-15 19:13   ` [RFC PATCH manpages 3/3] Update membarrier manpage for 4.14, 4.15 Mathieu Desnoyers
  2 siblings, 0 replies; 4+ messages in thread
From: Mathieu Desnoyers @ 2017-11-15 19:13 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Peter Zijlstra,
	Paul E . McKenney, Boqun Feng, Andy Lutomirski, Dave Watson,
	Paul Turner, Andrew Morton, Russell King, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Andrew Hunter, Andi Kleen,
	Chris Lameter, Ben Maurer, Steven Rostedt, Josh Triplett,
	Linus Torvalds, Catalin Marinas

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
CC: "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
CC: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
CC: Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
CC: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
CC: Andrew Hunter <ahh-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
CC: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
CC: Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>
CC: Dave Watson <davejwatson-b10kYP2dOMg@public.gmane.org>
CC: Chris Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>
CC: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
CC: "H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
CC: Ben Maurer <bmaurer-b10kYP2dOMg@public.gmane.org>
CC: Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>
CC: Josh Triplett <josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org>
CC: Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
CC: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
CC: Russell King <linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>
CC: Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>
CC: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
CC: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
CC: Boqun Feng <boqun.feng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
CC: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
 man2/cpu_opv.2 | 297 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 297 insertions(+)
 create mode 100644 man2/cpu_opv.2

diff --git a/man2/cpu_opv.2 b/man2/cpu_opv.2
new file mode 100644
index 000000000..3d998dcbf
--- /dev/null
+++ b/man2/cpu_opv.2
@@ -0,0 +1,297 @@
+.\" Copyright 2017 Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date.  The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein.  The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH CPU_OPV 2 2017-11-10 "Linux" "Linux Programmer's Manual"
+.SH NAME
+cpu_opv \- CPU preempt-off operation vector system call
+.SH SYNOPSIS
+.nf
+.B #include <linux/cpu_opv.h>
+.sp
+.BI "int cpu_opv(struct cpu_op * " cpu_opv ", int " cpuopcnt ", int " cpu ", int " flags ");"
+.sp
+.SH DESCRIPTION
+The cpu_opv system call executes a vector of operations on behalf of
+user-space on a specific CPU with preemption disabled.
+
+The operations available are: comparison, memcpy, add, or, and, xor,
+left shift, right shift, and memory barrier. The system call receives a
+CPU number from user-space as argument, which is the CPU on which those
+operations need to be performed. All preparation steps such as loading
+pointers, and applying offsets to arrays, need to be performed by
+user-space before invoking the system call. The "comparison" operation
+can be used to check that the data used in the preparation step did not
+change between preparation of system call inputs and operation execution
+within the preempt-off critical section.
+
+A maximum of 16 operations per cpu_opv syscall invocation is enforced,
+along with an overall maximum on the sum of operation lengths, so that
+user-space cannot generate an overly long preempt-off critical section.
+Each operation is also limited to a length of PAGE_SIZE bytes, meaning
+that an operation can touch a maximum of 4 pages.
+
+If the thread is not running on the requested CPU, an attempt is made to
+migrate it to the requested CPU. If the requested CPU is not part of
+the cpus allowed mask of the thread, the system call fails with EINVAL.
+The system call will fail with EAGAIN if the scheduler migrated the
+thread away from the requested CPU between its migration and following
+execution with disabled preemption. User-space is then free to retry
+either with the same or with a different CPU number, depending on its
+algorithmic constraints.
+
+.PP
+The layout of
+.B struct cpu_op
+is as follows:
+.TP
+.B Fields
+
+.TP
+.in +4n
+.I op
+Operation of type
+.B enum cpu_op_type
+to perform. This operation type selects the associated "u" union field.
+.in
+.TP
+.in +4n
+.I len
+Length (in bytes) of data to consider for this operation.
+.in
+.TP
+.in +4n
+.I u.compare_op
+For a
+.BR CPU_COMPARE_EQ_OP " or"
+.BR CPU_COMPARE_NE_OP ", contains the"
+.B a
+and
+.B b
+pointers to compare. The
+.B expect_fault_a
+and
+.B expect_fault_b
+fields indicate whether a page fault should be expected for each of
+those pointers.
+If
+.B expect_fault_a , or
+.B expect_fault_b
+is set, EAGAIN is returned on fault, else EFAULT is returned. The
+.B len
+field is allowed to take values from 0 to PAGE_SIZE for comparison
+operations.
+.in
+.TP
+.in +4n
+.I u.memcpy_op
+For a
+.BR CPU_MEMCPY_OP ,
+contains the
+.B dst
+and
+.B src
+pointers, expressing a copy of
+.B src
+into
+.BR dst ". The"
+.B expect_fault_dst
+and
+.B expect_fault_src
+fields indicate whether a page fault should be expected for each of
+those pointers.
+If
+.B expect_fault_dst , or
+.B expect_fault_src
+is set, EAGAIN is returned on fault, else EFAULT is returned. The
+.B len
+field is allowed to take values from 0 to PAGE_SIZE for memcpy
+operations.
+.in
+.TP
+.in +4n
+.I u.arithmetic_op
+For a
+.B CPU_ADD_OP ,
+contains the
+.B p ,
+.B count , and
+.B expect_fault_p
+fields, which are respectively a pointer to the memory location to
+increment, the 64-bit signed integer value to add, and whether a page
+fault should be expected for
+.B p .
+If
+.B expect_fault_p
+is set, EAGAIN is returned on fault, else EFAULT is returned. The
+.B len
+field is allowed to take values of 1, 2, 4, 8 bytes for arithmetic
+operations.
+.in
+.TP
+.in +4n
+.I u.bitwise_op
+For a
+.BR CPU_OR_OP ,
+.BR CPU_AND_OP ", or"
+.BR CPU_XOR_OP ,
+contains the
+.B p ,
+.B mask , and
+.B expect_fault_p
+fields, which are respectively a pointer to the memory location to
+target, the mask to apply, and whether a page fault should be
+expected for
+.B p .
+If
+.B expect_fault_p
+is set, EAGAIN is returned on fault, else EFAULT is returned. The
+.B len
+field is allowed to take values of 1, 2, 4, 8 bytes for bitwise
+operations.
+.in
+.TP
+.in +4n
+.I u.shift_op
+For a
+.BR CPU_LSHIFT_OP " or"
+.BR CPU_RSHIFT_OP ,
+contains the
+.B p ,
+.B bits , and
+.B expect_fault_p
+fields, which are respectively a pointer to the memory location to
+target, the number of bits to shift either left or right, and whether a
+page fault should be expected for
+.B p .
+If
+.B expect_fault_p
+is set, EAGAIN is returned on fault, else EFAULT is returned. The
+.B len
+field is allowed to take values of 1, 2, 4, 8 bytes for shift
+operations. The
+.B bits
+field is allowed to take values between 0 and 63.
+.in
+
+.PP
+The enum cpu_op_type contains the following operations:
+.IP \[bu] 2
+CPU_COMPARE_EQ_OP: Compare whether two memory locations are equal,
+.IP \[bu] 2
+CPU_COMPARE_NE_OP: Compare whether two memory locations differ,
+.IP \[bu] 2
+CPU_MEMCPY_OP: Copy a source memory location into a destination,
+.IP \[bu] 2
+CPU_ADD_OP: Increment a target memory location by a given count,
+.IP \[bu] 2
+CPU_OR_OP: Apply an "or" mask to a memory location,
+.IP \[bu] 2
+CPU_AND_OP: Apply an "and" mask to a memory location,
+.IP \[bu] 2
+CPU_XOR_OP: Apply an "xor" mask to a memory location,
+.IP \[bu] 2
+CPU_LSHIFT_OP: Shift a memory location left by a given number of bits,
+.IP \[bu] 2
+CPU_RSHIFT_OP: Shift a memory location right by a given number of bits,
+.IP \[bu] 2
+CPU_MB_OP: Issue a memory barrier.
+
+All of the operations above provide single-copy atomicity guarantees for
+word-sized, word-aligned target pointers, for both loads and stores.
+
+.PP
+The
+.I cpuopcnt
+argument is the number of elements in the cpu_opv array. It can take
+values from 0 to 16.
+
+.PP
+The
+.I cpu
+argument is the CPU number on which the operation sequence needs to be
+executed.
+
+.PP
+The
+.I flags
+argument is expected to be 0.
+
+.SH RETURN VALUE
+A return value of 0 indicates success. On error, \-1 is returned, and
+.I errno
+is set appropriately.
+
+.SH ERRORS
+.TP
+.B EAGAIN
+The thread was migrated away from the requested CPU, or an expected
+page fault occurred; the system call should be attempted again.
+.TP
+.B EINVAL
+Either
+.I flags
+contains an invalid value, or
+.I cpu
+contains an invalid value or a value not allowed by the current thread's
+allowed cpu mask, or
+.I cpuopcnt
+contains an invalid value, or the
+.I cpu_opv
+operation vector contains an invalid
+.I op
+value, or the
+.I cpu_opv
+operation vector contains an invalid
+.I len
+value, or the
+.I cpu_opv
+operation vector sum of
+.I len
+values is too large.
+
+.TP
+.B ENOSYS
+The
+.BR cpu_opv ()
+system call is not implemented by this kernel.
+.TP
+.B EFAULT
+.I cpu_opv
+is an invalid address, or a pointer contained within an operation
+is invalid (and a fault is not expected for that pointer).
+
+.SH VERSIONS
+The
+.BR cpu_opv ()
+system call was added in Linux 4.X
+.BR (TODO).
+
+.SH CONFORMING TO
+.BR cpu_opv ()
+is Linux-specific.
+
+.in
+.SH SEE ALSO
+.BR membarrier (2),
+.BR rseq (2)
-- 
2.11.0
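
As an illustration of the "prepare in user-space, then compare and
commit" pattern described in the DESCRIPTION above, here is a minimal
user-space model of an operation vector. It is only a sketch: the names
(struct model_op, model_run, MODEL_*) are hypothetical and do not
reproduce the layout of struct cpu_op, nothing here disables preemption
or pins a CPU, and the return value of 1 on a failed comparison is an
assumption (the page does not state what the system call returns in that
case).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model types: NOT the layout of struct cpu_op. */
enum model_op_type { MODEL_COMPARE_EQ, MODEL_MEMCPY, MODEL_ADD };

struct model_op {
    enum model_op_type op;
    size_t len;     /* bytes touched by this operation */
    void *a;        /* compare: a; memcpy: dst; add: p */
    const void *b;  /* compare: b; memcpy: src; add: unused */
    int64_t count;  /* add: value to add */
};

#define MODEL_MAX_OPS 16 /* per-call limit stated in the page */

/* Add count to the suitably aligned 1, 2, 4 or 8 byte integer at p. */
static void model_add(void *p, size_t len, int64_t count)
{
    switch (len) {
    case 1: *(uint8_t *)p += (uint8_t)count; break;
    case 2: *(uint16_t *)p += (uint16_t)count; break;
    case 4: *(uint32_t *)p += (uint32_t)count; break;
    case 8: *(uint64_t *)p += (uint64_t)count; break;
    default: break; /* rejected earlier by model_run() */
    }
}

/*
 * Run the vector in order.
 * Returns 0 if every operation ran, 1 if a comparison failed (assumed
 * convention), -1 if the vector is invalid (models EINVAL).
 */
static int model_run(const struct model_op *v, int cnt)
{
    int i;

    if (cnt < 0 || cnt > MODEL_MAX_OPS)
        return -1;
    /* Validate first, so that an invalid vector has no side effects. */
    for (i = 0; i < cnt; i++) {
        switch (v[i].op) {
        case MODEL_COMPARE_EQ:
        case MODEL_MEMCPY:
            break;
        case MODEL_ADD:
            if (v[i].len != 1 && v[i].len != 2 &&
                v[i].len != 4 && v[i].len != 8)
                return -1;
            break;
        default:
            return -1;
        }
    }
    for (i = 0; i < cnt; i++) {
        switch (v[i].op) {
        case MODEL_COMPARE_EQ:
            /* Inputs changed since preparation: stop before committing. */
            if (memcmp(v[i].a, v[i].b, v[i].len) != 0)
                return 1;
            break;
        case MODEL_MEMCPY:
            memcpy(v[i].a, v[i].b, v[i].len);
            break;
        case MODEL_ADD:
            model_add(v[i].a, v[i].len, v[i].count);
            break;
        }
    }
    return 0;
}
```

A caller would typically place the comparison first, so that the
following memcpy or add operations only take effect when the value read
during preparation is still current.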


* [RFC PATCH manpages 2/3] Add rseq manpage
       [not found] ` <20171115191316.828-1-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
  2017-11-15 19:13   ` [RFC PATCH manpages 1/3] Add cpu_opv system call manpage Mathieu Desnoyers
@ 2017-11-15 19:13   ` Mathieu Desnoyers
  2017-11-15 19:13   ` [RFC PATCH manpages 3/3] Update membarrier manpage for 4.14, 4.15 Mathieu Desnoyers
  2 siblings, 0 replies; 4+ messages in thread
From: Mathieu Desnoyers @ 2017-11-15 19:13 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Peter Zijlstra,
	Paul E . McKenney, Boqun Feng, Andy Lutomirski, Dave Watson,
	Paul Turner, Andrew Morton, Russell King, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Andrew Hunter, Andi Kleen,
	Chris Lameter, Ben Maurer, Steven Rostedt, Josh Triplett,
	Linus Torvalds, Catalin Marinas

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
CC: "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
CC: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
CC: Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
CC: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
CC: Andrew Hunter <ahh-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
CC: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
CC: Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>
CC: Dave Watson <davejwatson-b10kYP2dOMg@public.gmane.org>
CC: Chris Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>
CC: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
CC: "H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
CC: Ben Maurer <bmaurer-b10kYP2dOMg@public.gmane.org>
CC: Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>
CC: Josh Triplett <josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org>
CC: Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
CC: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
CC: Russell King <linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>
CC: Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>
CC: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
CC: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
CC: Boqun Feng <boqun.feng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
CC: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
 man2/rseq.2 | 268 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 268 insertions(+)
 create mode 100644 man2/rseq.2

diff --git a/man2/rseq.2 b/man2/rseq.2
new file mode 100644
index 000000000..2877c618c
--- /dev/null
+++ b/man2/rseq.2
@@ -0,0 +1,268 @@
+.\" Copyright 2015-2017 Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date.  The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein.  The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH RSEQ 2 2017-11-10 "Linux" "Linux Programmer's Manual"
+.SH NAME
+rseq \- Restartable sequences and cpu number cache
+.SH SYNOPSIS
+.nf
+.B #include <linux/rseq.h>
+.sp
+.BI "int rseq(struct rseq * " rseq ", uint32_t " rseq_len ", int " flags ", uint32_t " sig ");"
+.sp
+.SH DESCRIPTION
+The
+.BR rseq ()
+ABI accelerates user-space operations on per-cpu data by defining a
+shared data structure ABI between each user-space thread and the kernel.
+
+It allows user-space to perform update operations on per-cpu data
+without requiring heavy-weight atomic operations.
+
+Restartable sequences are atomic with respect to preemption (making them
+atomic with respect to other threads running on the same CPU), as well
+as signal delivery (user-space execution contexts nested over the same
+thread).
+
+It is suited for update operations on per-cpu data.
+
+It can be used on data structures shared between threads within a
+process, and on data structures shared between threads across different
+processes.
+
+.PP
+Some examples of operations that can be accelerated or improved
+by this ABI:
+.IP \[bu] 2
+Memory allocator per-cpu free-lists,
+.IP \[bu] 2
+Querying the current CPU number,
+.IP \[bu] 2
+Incrementing per-CPU counters,
+.IP \[bu] 2
+Modifying data protected by per-CPU spinlocks,
+.IP \[bu] 2
+Inserting/removing elements in per-CPU linked-lists,
+.IP \[bu] 2
+Writing/reading per-CPU ring buffer contents,
+.IP \[bu] 2
+Accurately reading performance monitoring unit counters
+with respect to thread migration.
+
+.PP
+The
+.I rseq
+argument is a pointer to the thread-local rseq structure to be shared
+between kernel and user-space. A NULL
+.I rseq
+value unregisters the current thread rseq structure.
+
+.PP
+The layout of
+.B struct rseq
+is as follows:
+.TP
+.B Structure alignment
+This structure is aligned on multiples of 32 bytes.
+.TP
+.B Structure size
+This structure is extensible. Its size is passed as parameter to the
+rseq system call.
+.TP
+.B Fields
+
+.TP
+.in +4n
+.I cpu_id_start
+Optimistic cache of the CPU number on which the current thread is
+running. Its value is guaranteed to always be a possible CPU number,
+even when rseq is not initialized. The value it contains should always
+be confirmed by reading the cpu_id field.
+.in
+.TP
+.in +4n
+.I cpu_id
+Cache of the CPU number on which the current thread is running.
+-1 if uninitialized.
+.in
+.TP
+.in +4n
+.I rseq_cs
+The rseq_cs field is a pointer to a struct rseq_cs. It is NULL when no
+rseq assembly block critical section is active for the current thread.
+Setting it to point to a critical section descriptor (struct rseq_cs)
+marks the beginning of the critical section.
+.in
+.TP
+.in +4n
+.I flags
+Flags indicating the restart behavior for the current thread. This is
+mainly used for debugging purposes. Can be either:
+.IP \[bu]
+RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
+.IP \[bu]
+RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL
+.IP \[bu]
+RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
+.in
+
+.PP
+The layout of
+.B struct rseq_cs
+version 0 is as follows:
+.TP
+.B Structure alignment
+This structure is aligned on multiples of 32 bytes.
+.TP
+.B Structure size
+This structure has a fixed size of 32 bytes.
+.TP
+.B Fields
+
+.TP
+.in +4n
+.I version
+Version of this structure.
+.in
+.TP
+.in +4n
+.I flags
+Flags indicating the restart behavior of this structure. Can be
+either:
+.IP \[bu]
+RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
+.IP \[bu]
+RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL
+.IP \[bu]
+RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
+.TP
+.in +4n
+.I start_ip
+Instruction pointer address of the first instruction of the sequence of
+consecutive assembly instructions.
+.in
+.TP
+.in +4n
+.I post_commit_offset
+Offset (from start_ip address) of the address after the last instruction
+of the sequence of consecutive assembly instructions.
+.in
+.TP
+.in +4n
+.I abort_ip
+Instruction pointer address where to move the execution flow in case of
+abort of the sequence of consecutive assembly instructions.
+.in
+
+.PP
+The
+.I rseq_len
+argument is the size of the
+.I struct rseq
+to register.
+
+.PP
+The
+.I flags
+argument is 0 for registration, and
+.IR RSEQ_FLAG_UNREGISTER
+for unregistration.
+
+.PP
+The
+.I sig
+argument is the 32-bit signature to be expected before the abort
+handler code.
+
+.PP
+A single library per process should keep the rseq structure in a
+thread-local storage variable.
+The
+.I cpu_id
+field should be initialized to -1, and the
+.I cpu_id_start
+field should be initialized to a possible CPU value (typically 0).
+
+.PP
+Each thread is responsible for registering and unregistering its rseq
+structure. No more than one rseq structure address can be registered
+per thread at a given time.
+
+.PP
+In a typical usage scenario, the thread registering the rseq
+structure will be performing loads and stores from/to that structure. It
+is however also allowed to read that structure from other threads.
+The rseq field updates performed by the kernel provide relaxed atomicity
+semantics, which guarantee that other threads performing relaxed atomic
+reads of the cpu number cache will always observe a consistent value.
+
+.SH RETURN VALUE
+A return value of 0 indicates success. On error, \-1 is returned, and
+.I errno
+is set appropriately.
+
+.SH ERRORS
+.TP
+.B EINVAL
+Either
+.I flags
+contains an invalid value, or
+.I rseq
+contains an address which is not appropriately aligned, or
+.I rseq_len
+contains a size that does not match the size received on registration.
+.TP
+.B ENOSYS
+The
+.BR rseq ()
+system call is not implemented by this kernel.
+.TP
+.B EFAULT
+.I rseq
+is an invalid address.
+.TP
+.B EBUSY
+Restartable sequence is already registered for this thread.
+.TP
+.B EPERM
+The
+.I sig
+argument on unregistration does not match the signature received
+on registration.
+
+.SH VERSIONS
+The
+.BR rseq ()
+system call was added in Linux 4.X
+.BR (TODO).
+
+.SH CONFORMING TO
+.BR rseq ()
+is Linux-specific.
+
+.in
+.SH SEE ALSO
+.BR sched_getcpu (3),
+.BR cpu_opv (2),
+.BR membarrier (2)
-- 
2.11.0
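
To show how the start_ip, post_commit_offset and abort_ip fields of
struct rseq_cs described above fit together, here is a small user-space
model of the restart decision they imply. It is a sketch under
assumptions: struct rseq_cs_model and rseq_model_resume_ip() are
hypothetical names, the 64-bit field widths are assumed rather than
taken from the page, and the kernel implementation may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the descriptor fields; widths are assumed. */
struct rseq_cs_model {
    uint64_t start_ip;            /* first instruction of the sequence */
    uint64_t post_commit_offset;  /* offset of the address after the last one */
    uint64_t abort_ip;            /* where to resume on abort */
};

/*
 * Given the instruction pointer at which a thread was interrupted
 * (preemption, signal delivery or migration), return the address at
 * which it should resume. cs == NULL models a NULL rseq_cs pointer,
 * meaning no critical section is active.
 */
static uint64_t rseq_model_resume_ip(const struct rseq_cs_model *cs,
                                     uint64_t ip)
{
    if (cs == NULL)
        return ip;
    /*
     * The sequence covers [start_ip, start_ip + post_commit_offset).
     * The unsigned subtraction wraps for ip < start_ip, so a single
     * comparison checks both bounds.
     */
    if (ip - cs->start_ip < cs->post_commit_offset)
        return cs->abort_ip;
    return ip;
}
```

An interruption at or past the post-commit address leaves the
instruction pointer untouched, since the sequence has already completed
at that point.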


* [RFC PATCH manpages 3/3] Update membarrier manpage for 4.14, 4.15
       [not found] ` <20171115191316.828-1-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
  2017-11-15 19:13   ` [RFC PATCH manpages 1/3] Add cpu_opv system call manpage Mathieu Desnoyers
  2017-11-15 19:13   ` [RFC PATCH manpages 2/3] Add rseq manpage Mathieu Desnoyers
@ 2017-11-15 19:13   ` Mathieu Desnoyers
  2 siblings, 0 replies; 4+ messages in thread
From: Mathieu Desnoyers @ 2017-11-15 19:13 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Peter Zijlstra,
	Paul E . McKenney, Boqun Feng, Andy Lutomirski, Dave Watson,
	Paul Turner, Andrew Morton, Russell King, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Andrew Hunter, Andi Kleen,
	Chris Lameter, Ben Maurer, Steven Rostedt, Josh Triplett,
	Linus Torvalds, Catalin Marinas

Add documentation for the following new membarrier commands:

New in 4.14:
        MEMBARRIER_CMD_PRIVATE_EXPEDITED
        MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED

Adapt the MEMBARRIER_CMD_SHARED return value documentation to reflect
that it now returns -EINVAL when issued on a system configured for
nohz_full.

New in 4.15:
        MEMBARRIER_CMD_SHARED_EXPEDITED
        MEMBARRIER_CMD_REGISTER_SHARED_EXPEDITED
        MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE
        MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
CC: "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
CC: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
CC: Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
CC: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
CC: Andrew Hunter <ahh-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
CC: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
CC: Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>
CC: Dave Watson <davejwatson-b10kYP2dOMg@public.gmane.org>
CC: Chris Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>
CC: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
CC: "H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
CC: Ben Maurer <bmaurer-b10kYP2dOMg@public.gmane.org>
CC: Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>
CC: Josh Triplett <josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org>
CC: Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
CC: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
CC: Russell King <linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>
CC: Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>
CC: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
CC: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
CC: Boqun Feng <boqun.feng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
CC: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
 man2/membarrier.2 | 98 +++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 84 insertions(+), 14 deletions(-)

diff --git a/man2/membarrier.2 b/man2/membarrier.2
index bbf611e10..6720d20f3 100644
--- a/man2/membarrier.2
+++ b/man2/membarrier.2
@@ -1,4 +1,4 @@
-.\" Copyright 2015 Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
+.\" Copyright 2015-2017 Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
 .\"
 .\" %%%LICENSE_START(VERBATIM)
 .\" Permission is granted to make and distribute verbatim copies of this
@@ -22,7 +22,7 @@
 .\" the source, must acknowledge the copyright and authors of this work.
 .\" %%%LICENSE_END
 .\"
-.TH MEMBARRIER 2 2017-09-15 "Linux" "Linux Programmer's Manual"
+.TH MEMBARRIER 2 2017-11-15 "Linux" "Linux Programmer's Manual"
 .SH NAME
 membarrier \- issue memory barriers on a set of threads
 .SH SYNOPSIS
@@ -87,6 +87,60 @@ order between entry to and return from the
 .BR membarrier ()
 system call.
 All threads on the system are targeted by this command.
+.TP
+.B MEMBARRIER_CMD_SHARED_EXPEDITED
+Execute a memory barrier on all running threads of all processes which
+previously registered with
+.BR MEMBARRIER_CMD_REGISTER_SHARED_EXPEDITED .
+Upon return from the system call, the calling thread is guaranteed that
+all running threads have passed through a state where all memory accesses
+to user-space addresses match program order between entry to and return
+from the system call (non-running threads are de facto in such a state).
+This only covers threads from processes which registered with
+.BR MEMBARRIER_CMD_REGISTER_SHARED_EXPEDITED .
+Given that registration expresses the intent to receive the barriers,
+it is valid to invoke
+.B MEMBARRIER_CMD_SHARED_EXPEDITED
+from a non-registered process.
+.TP
+.B MEMBARRIER_CMD_REGISTER_SHARED_EXPEDITED
+Register the process's intent to receive
+.BR MEMBARRIER_CMD_SHARED_EXPEDITED
+memory barriers.
+.TP
+.B MEMBARRIER_CMD_PRIVATE_EXPEDITED
+Execute a memory barrier on each running thread belonging to the same
+process as the current thread. Upon return from the system call, the
+calling thread is guaranteed that all its running sibling threads have
+passed through a state where all memory accesses to user-space addresses
+match program order between entry to and return from the system call
+(non-running threads are de facto in such a state). This only covers
+threads from the same process as the calling thread. The "expedited"
+commands complete faster than the non-expedited ones; they never block,
+but have the downside of causing extra overhead. A process needs to
+register its intent to use the private expedited command prior to using
+it.
+.TP
+.B MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED
+Register the process's intent to use
+.BR MEMBARRIER_CMD_PRIVATE_EXPEDITED .
+.TP
+.B MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE
+In addition to providing the memory ordering guarantees described in
+.BR MEMBARRIER_CMD_PRIVATE_EXPEDITED ,
+guarantee to the calling thread, upon return from the system call, that
+all its running sibling threads have executed a core serializing
+instruction (architectures are required to guarantee that non-running
+threads issue core serializing instructions before they resume user-space
+execution). This only covers threads from the same process as the calling
+thread. The "expedited" commands complete faster than the non-expedited
+ones; they never block, but have the downside of causing extra overhead.
+A process needs to register its intent to use the private expedited sync
+core command prior to using it.
+.TP
+.B MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE
+Register the process's intent to use
+.BR MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE .
 .PP
 The
 .I flags
@@ -117,9 +171,16 @@ The pair ordering is detailed as (O: ordered, X: not ordered):
 .SH RETURN VALUE
 On success, the
 .B MEMBARRIER_CMD_QUERY
-operation returns a bit mask of supported commands and the
-.B MEMBARRIER_CMD_SHARED
-operation returns zero.
+operation returns a bit mask of supported commands, and the
+.B MEMBARRIER_CMD_SHARED ,
+.B MEMBARRIER_CMD_SHARED_EXPEDITED ,
+.B MEMBARRIER_CMD_REGISTER_SHARED_EXPEDITED ,
+.B MEMBARRIER_CMD_PRIVATE_EXPEDITED ,
+.B MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED ,
+.B MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE ,
+and
+.B MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE
+operations return zero.
 On error, \-1 is returned,
 and
 .I errno
@@ -138,22 +199,27 @@ set to 0, error handling is required only for the first call to
 .TP
 .B EINVAL
 .I cmd
-is invalid or
+is invalid, or
 .I flags
-is non-zero.
+is non-zero, or
+the architecture does not implement the
+.BR MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE
+and
+.BR MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE
+commands, or the
+.BR MEMBARRIER_CMD_SHARED
+command is disabled because the
+.I nohz_full
+CPU parameter has been set.
 .TP
 .B ENOSYS
 The
 .BR membarrier ()
 system call is not implemented by this kernel.
 .TP
-.BR ENOSYS " (since Linux 4.11)"
-.\" 907565337ebf998a68cb5c5b2174ce5e5da065eb
-The
-.BR membarrier ()
-system call is disabled because the
-.I nohz_full
-CPU parameter has been set.
+.B EPERM
+The current process was not registered prior to using private expedited
+commands.
 .SH VERSIONS
 The
 .BR membarrier ()
@@ -162,6 +228,10 @@ system call was added in Linux 4.3.
 .SH CONFORMING TO
 .BR membarrier ()
 is Linux-specific.
+.in
+.SH SEE ALSO
+.BR cpu_opv (2),
+.BR rseq (2)
 .SH NOTES
 A memory barrier instruction is part of the instruction set of
 architectures with weakly-ordered memory models.
-- 
2.11.0
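
The register-then-use requirement documented above for the private
expedited command can be sketched as follows. This is a minimal example,
not a reference implementation: sys_membarrier() and
private_expedited_barrier() are hypothetical helper names, the return
convention (0, 1, -1) is chosen for this example only, and the code
assumes kernel headers that define the 4.14 commands.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw system call (cmd, flags). */
static int sys_membarrier(int cmd, int flags)
{
    return (int)syscall(__NR_membarrier, cmd, flags);
}

/*
 * Issue a private expedited barrier.
 * Returns 0 if the barrier was issued, 1 if the system call or the
 * command is not available, -1 on any other error (errno is set).
 */
static int private_expedited_barrier(void)
{
    int mask = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);

    if (mask < 0)
        return errno == ENOSYS ? 1 : -1;
    if (!(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED) ||
        !(mask & MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED))
        return 1;
    /*
     * Registration must precede use, otherwise the barrier command
     * fails with EPERM. A real program would register once at startup
     * rather than on every call.
     */
    if (sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) < 0)
        return -1;
    if (sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0) < 0)
        return -1;
    return 0;
}
```

Checking the MEMBARRIER_CMD_QUERY mask first lets the caller fall back
to another synchronization scheme on kernels that lack the expedited
commands.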
