* [PATCH 0/6] arm64: add instrumented atomics
@ 2018-05-04 17:39 ` Mark Rutland
  0 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-04 17:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, aryabinin, boqun.feng, catalin.marinas, dvyukov,
	mark.rutland, mingo, peterz, will.deacon

This series (based on v4.17-rc3) allows arm64's atomics to be
instrumented, which should make it easier to catch bugs where atomics
are used on erroneous memory locations.
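
As a contrived example (not taken from this series; "struct foo" and
"uaf_demo()" are made up for illustration), the kind of bug this catches is
an atomic operation on memory KASAN already knows to be bad, e.g. a
use-after-free:

	#include <linux/atomic.h>
	#include <linux/slab.h>

	struct foo {
		atomic_t refcnt;
	};

	static void uaf_demo(void)
	{
		struct foo *f = kmalloc(sizeof(*f), GFP_KERNEL);

		if (!f)
			return;

		atomic_set(&f->refcnt, 1);
		kfree(f);

		/*
		 * With instrumented atomics, the kasan_check_write() in
		 * atomic_inc() reports this use-after-free; the raw arch_
		 * atomics are invisible to KASAN.
		 */
		atomic_inc(&f->refcnt);
	}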

The bulk of the diffstat is teaching the generic instrumentation about
the acquire/release/relaxed variants of each atomic, along with some
optional atomics which x86 doesn't implement directly.
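
For example (illustrative usage only, not code from this series; the
"ordering_demo()" helper is made up), a single operation such as cmpxchg has
all four orderings instrumented where the architecture provides them, as do
optional operations like atomic_andnot() which have no direct x86
counterpart:

	#include <linux/atomic.h>

	static void ordering_demo(atomic_t *v)
	{
		int old;

		old = atomic_cmpxchg(v, 0, 1);		/* fully ordered */
		old = atomic_cmpxchg_relaxed(v, 1, 2);	/* no ordering */
		old = atomic_cmpxchg_acquire(v, 2, 3);	/* acquire */
		old = atomic_cmpxchg_release(v, 3, 4);	/* release */
		(void)old;				/* silence set-but-unused warning */

		atomic_andnot(0x1, v);			/* no direct x86 equivalent */
	}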

To build an arm64 defconfig, one additional patch [1] is required, which
fixes an include in the SUNRPC code. I've pushed the series, along with
that patch, to my arm64/atomic-instrumentation branch [2].

This has seen basic testing on a Juno R1 machine so far.

Thanks,
Mark.

[1] https://lkml.kernel.org/r/1489574142-20856-1-git-send-email-mark.rutland@arm.com
[2] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/atomic-instrumentation

Mark Rutland (6):
  locking/atomic, asm-generic: instrument ordering variants
  locking/atomic, asm-generic: instrument atomic*andnot*()
  arm64: use <linux/atomic.h> for cmpxchg
  arm64: fix assembly constraints for cmpxchg
  arm64: use instrumented atomics
  arm64: instrument smp_{load_acquire,store_release}

 arch/arm64/include/asm/atomic.h           |  299 +++----
 arch/arm64/include/asm/atomic_ll_sc.h     |   30 +-
 arch/arm64/include/asm/atomic_lse.h       |   43 +-
 arch/arm64/include/asm/barrier.h          |   22 +-
 arch/arm64/include/asm/cmpxchg.h          |   25 +-
 arch/arm64/include/asm/pgtable.h          |    2 +-
 arch/arm64/include/asm/sync_bitops.h      |    3 +-
 arch/arm64/mm/fault.c                     |    2 +-
 include/asm-generic/atomic-instrumented.h | 1305 +++++++++++++++++++++++++----
 9 files changed, 1339 insertions(+), 392 deletions(-)

-- 
2.11.0

* [PATCH 1/6] locking/atomic, asm-generic: instrument ordering variants
  2018-05-04 17:39 ` Mark Rutland
@ 2018-05-04 17:39   ` Mark Rutland
  -1 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-04 17:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, aryabinin, boqun.feng, catalin.marinas, dvyukov,
	mark.rutland, mingo, peterz, will.deacon

Currently <asm-generic/atomic-instrumented.h> only instruments the fully
ordered variants of atomic functions, ignoring the {relaxed,acquire,release}
ordering variants.

This patch reworks the header to instrument all ordering variants of the atomic
functions, so that architectures implementing these are instrumented
appropriately.

To minimise repetition, a macro is used to generate each variant from a common
template. The {full,relaxed,acquire,release} ordering variants are then built
from this template wherever the architecture provides an implementation.

To stick to an 80 column limit while keeping the templates legible, the return
type and function name of each template are split over two lines. For
consistency, this is done even when not strictly necessary.
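
As an illustration (this expansion is not part of the patch text itself),
instantiating the xchg template below with the _acquire ordering produces:

	static __always_inline int
	atomic_xchg_acquire(atomic_t *v, int i)
	{
		kasan_check_write(v, sizeof(*v));
		return arch_atomic_xchg_acquire(v, i);
	}

i.e. each generated variant simply adds the KASAN check before deferring to
the corresponding arch_ implementation.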

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++-----
 1 file changed, 1008 insertions(+), 187 deletions(-)

diff --git a/include/asm-generic/atomic-instrumented.h b/include/asm-generic/atomic-instrumented.h
index ec07f23678ea..26f0e3098442 100644
--- a/include/asm-generic/atomic-instrumented.h
+++ b/include/asm-generic/atomic-instrumented.h
@@ -40,171 +40,664 @@ static __always_inline void atomic64_set(atomic64_t *v, s64 i)
 	arch_atomic64_set(v, i);
 }
 
-static __always_inline int atomic_xchg(atomic_t *v, int i)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_xchg(v, i);
+#define INSTR_ATOMIC_XCHG(order)					\
+static __always_inline int						\
+atomic_xchg##order(atomic_t *v, int i)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_xchg##order(v, i);				\
 }
 
-static __always_inline s64 atomic64_xchg(atomic64_t *v, s64 i)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_xchg(v, i);
+INSTR_ATOMIC_XCHG()
+
+#ifdef arch_atomic_xchg_relaxed
+INSTR_ATOMIC_XCHG(_relaxed)
+#define atomic_xchg_relaxed atomic_xchg_relaxed
+#endif
+
+#ifdef arch_atomic_xchg_acquire
+INSTR_ATOMIC_XCHG(_acquire)
+#define atomic_xchg_acquire atomic_xchg_acquire
+#endif
+
+#ifdef arch_atomic_xchg_release
+INSTR_ATOMIC_XCHG(_release)
+#define atomic_xchg_release atomic_xchg_release
+#endif
+
+#define INSTR_ATOMIC64_XCHG(order)					\
+static __always_inline s64						\
+atomic64_xchg##order(atomic64_t *v, s64 i)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_xchg##order(v, i);				\
 }
 
-static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_cmpxchg(v, old, new);
+INSTR_ATOMIC64_XCHG()
+
+#ifdef arch_atomic64_xchg_relaxed
+INSTR_ATOMIC64_XCHG(_relaxed)
+#define atomic64_xchg_relaxed atomic64_xchg_relaxed
+#endif
+
+#ifdef arch_atomic64_xchg_acquire
+INSTR_ATOMIC64_XCHG(_acquire)
+#define atomic64_xchg_acquire atomic64_xchg_acquire
+#endif
+
+#ifdef arch_atomic64_xchg_release
+INSTR_ATOMIC64_XCHG(_release)
+#define atomic64_xchg_release atomic64_xchg_release
+#endif
+
+#define INSTR_ATOMIC_CMPXCHG(order)					\
+static __always_inline int						\
+atomic_cmpxchg##order(atomic_t *v, int old, int new)			\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_cmpxchg##order(v, old, new);			\
 }
 
-static __always_inline s64 atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_cmpxchg(v, old, new);
+INSTR_ATOMIC_CMPXCHG()
+
+#ifdef arch_atomic_cmpxchg_relaxed
+INSTR_ATOMIC_CMPXCHG(_relaxed)
+#define atomic_cmpxchg_relaxed atomic_cmpxchg_relaxed
+#endif
+
+#ifdef arch_atomic_cmpxchg_acquire
+INSTR_ATOMIC_CMPXCHG(_acquire)
+#define atomic_cmpxchg_acquire atomic_cmpxchg_acquire
+#endif
+
+#ifdef arch_atomic_cmpxchg_release
+INSTR_ATOMIC_CMPXCHG(_release)
+#define atomic_cmpxchg_release atomic_cmpxchg_release
+#endif
+
+#define INSTR_ATOMIC64_CMPXCHG(order)					\
+static __always_inline s64						\
+atomic64_cmpxchg##order(atomic64_t *v, s64 old, s64 new)		\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_cmpxchg##order(v, old, new);		\
+}
+
+INSTR_ATOMIC64_CMPXCHG()
+
+#ifdef arch_atomic64_cmpxchg_relaxed
+INSTR_ATOMIC64_CMPXCHG(_relaxed)
+#define atomic64_cmpxchg_relaxed atomic64_cmpxchg_relaxed
+#endif
+
+#ifdef arch_atomic64_cmpxchg_acquire
+INSTR_ATOMIC64_CMPXCHG(_acquire)
+#define atomic64_cmpxchg_acquire atomic64_cmpxchg_acquire
+#endif
+
+#ifdef arch_atomic64_cmpxchg_release
+INSTR_ATOMIC64_CMPXCHG(_release)
+#define atomic64_cmpxchg_release atomic64_cmpxchg_release
+#endif
+
+#define INSTR_ATOMIC_TRY_CMPXCHG(order)					\
+static __always_inline bool						\
+atomic_try_cmpxchg##order(atomic_t *v, int *old, int new)		\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	kasan_check_read(old, sizeof(*old));				\
+	return arch_atomic_try_cmpxchg##order(v, old, new);		\
 }
 
 #ifdef arch_atomic_try_cmpxchg
+INSTR_ATOMIC_TRY_CMPXCHG()
 #define atomic_try_cmpxchg atomic_try_cmpxchg
-static __always_inline bool atomic_try_cmpxchg(atomic_t *v, int *old, int new)
-{
-	kasan_check_write(v, sizeof(*v));
-	kasan_check_read(old, sizeof(*old));
-	return arch_atomic_try_cmpxchg(v, old, new);
-}
 #endif
 
-#ifdef arch_atomic64_try_cmpxchg
-#define atomic64_try_cmpxchg atomic64_try_cmpxchg
-static __always_inline bool atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new)
-{
-	kasan_check_write(v, sizeof(*v));
-	kasan_check_read(old, sizeof(*old));
-	return arch_atomic64_try_cmpxchg(v, old, new);
+#ifdef arch_atomic_try_cmpxchg_relaxed
+INSTR_ATOMIC_TRY_CMPXCHG(_relaxed)
+#define atomic_try_cmpxchg_relaxed atomic_try_cmpxchg_relaxed
+#endif
+
+#ifdef arch_atomic_try_cmpxchg_acquire
+INSTR_ATOMIC_TRY_CMPXCHG(_acquire)
+#define atomic_try_cmpxchg_acquire atomic_try_cmpxchg_acquire
+#endif
+
+#ifdef arch_atomic_try_cmpxchg_release
+INSTR_ATOMIC_TRY_CMPXCHG(_release)
+#define atomic_try_cmpxchg_release atomic_try_cmpxchg_release
+#endif
+
+#define INSTR_ATOMIC64_TRY_CMPXCHG(order)				\
+static __always_inline bool						\
+atomic64_try_cmpxchg##order(atomic64_t *v, s64 *old, s64 new)		\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	kasan_check_read(old, sizeof(*old));				\
+	return arch_atomic64_try_cmpxchg##order(v, old, new);		\
 }
+
+#ifdef arch_atomic64_try_cmpxchg
+INSTR_ATOMIC64_TRY_CMPXCHG()
+#define atomic64_try_cmpxchg atomic64_try_cmpxchg
 #endif
 
-static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
-{
-	kasan_check_write(v, sizeof(*v));
-	return __arch_atomic_add_unless(v, a, u);
+#ifdef arch_atomic64_try_cmpxchg_relaxed
+INSTR_ATOMIC64_TRY_CMPXCHG(_relaxed)
+#define atomic64_try_cmpxchg_relaxed atomic64_try_cmpxchg_relaxed
+#endif
+
+#ifdef arch_atomic64_try_cmpxchg_acquire
+INSTR_ATOMIC64_TRY_CMPXCHG(_acquire)
+#define atomic64_try_cmpxchg_acquire atomic64_try_cmpxchg_acquire
+#endif
+
+#ifdef arch_atomic64_try_cmpxchg_release
+INSTR_ATOMIC64_TRY_CMPXCHG(_release)
+#define atomic64_try_cmpxchg_release atomic64_try_cmpxchg_release
+#endif
+
+#define __INSTR_ATOMIC_ADD_UNLESS(order)				\
+static __always_inline int						\
+__atomic_add_unless##order(atomic_t *v, int a, int u)			\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return __arch_atomic_add_unless##order(v, a, u);		\
 }
 
+__INSTR_ATOMIC_ADD_UNLESS()
 
-static __always_inline bool atomic64_add_unless(atomic64_t *v, s64 a, s64 u)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_add_unless(v, a, u);
+#ifdef __arch_atomic_add_unless_relaxed
+__INSTR_ATOMIC_ADD_UNLESS(_relaxed)
+#define __atomic_add_unless_relaxed __atomic_add_unless_relaxed
+#endif
+
+#ifdef __arch_atomic_add_unless_acquire
+__INSTR_ATOMIC_ADD_UNLESS(_acquire)
+#define __atomic_add_unless_acquire __atomic_add_unless_acquire
+#endif
+
+#ifdef __arch_atomic_add_unless_release
+__INSTR_ATOMIC_ADD_UNLESS(_release)
+#define __atomic_add_unless_release __atomic_add_unless_release
+#endif
+
+#define INSTR_ATOMIC64_ADD_UNLESS(order)				\
+static __always_inline bool						\
+atomic64_add_unless##order(atomic64_t *v, s64 a, s64 u)			\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_add_unless##order(v, a, u);		\
 }
 
-static __always_inline void atomic_inc(atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic_inc(v);
+INSTR_ATOMIC64_ADD_UNLESS()
+
+#ifdef arch_atomic64_add_unless_relaxed
+INSTR_ATOMIC64_ADD_UNLESS(_relaxed)
+#define atomic64_add_unless_relaxed atomic64_add_unless_relaxed
+#endif
+
+#ifdef arch_atomic64_add_unless_acquire
+INSTR_ATOMIC64_ADD_UNLESS(_acquire)
+#define atomic64_add_unless_acquire atomic64_add_unless_acquire
+#endif
+
+#ifdef arch_atomic64_add_unless_release
+INSTR_ATOMIC64_ADD_UNLESS(_release)
+#define atomic64_add_unless_release atomic64_add_unless_release
+#endif
+
+#define INSTR_ATOMIC_INC(order)						\
+static __always_inline void						\
+atomic_inc##order(atomic_t *v)						\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic_inc##order(v);					\
 }
 
-static __always_inline void atomic64_inc(atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic64_inc(v);
+INSTR_ATOMIC_INC()
+
+#ifdef arch_atomic_inc_relaxed
+INSTR_ATOMIC_INC(_relaxed)
+#define atomic_inc_relaxed atomic_inc_relaxed
+#endif
+
+#ifdef arch_atomic_inc_acquire
+INSTR_ATOMIC_INC(_acquire)
+#define atomic_inc_acquire atomic_inc_acquire
+#endif
+
+#ifdef arch_atomic_inc_release
+INSTR_ATOMIC_INC(_release)
+#define atomic_inc_release atomic_inc_release
+#endif
+
+#define INSTR_ATOMIC64_INC(order)					\
+static __always_inline void						\
+atomic64_inc##order(atomic64_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_inc##order(v);					\
 }
 
-static __always_inline void atomic_dec(atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic_dec(v);
+INSTR_ATOMIC64_INC()
+
+#ifdef arch_atomic64_inc_relaxed
+INSTR_ATOMIC64_INC(_relaxed)
+#define atomic64_inc_relaxed atomic64_inc_relaxed
+#endif
+
+#ifdef arch_atomic64_inc_acquire
+INSTR_ATOMIC64_INC(_acquire)
+#define atomic64_inc_acquire atomic64_inc_acquire
+#endif
+
+#ifdef arch_atomic64_inc_release
+INSTR_ATOMIC64_INC(_release)
+#define atomic64_inc_release atomic64_inc_release
+#endif
+
+#define INSTR_ATOMIC_DEC(order)						\
+static __always_inline void						\
+atomic_dec##order(atomic_t *v)						\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic_dec##order(v);					\
 }
 
-static __always_inline void atomic64_dec(atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic64_dec(v);
+INSTR_ATOMIC_DEC()
+
+#ifdef arch_atomic_dec_relaxed
+INSTR_ATOMIC_DEC(_relaxed)
+#define atomic_dec_relaxed atomic_dec_relaxed
+#endif
+
+#ifdef arch_atomic_dec_acquire
+INSTR_ATOMIC_DEC(_acquire)
+#define atomic_dec_acquire atomic_dec_acquire
+#endif
+
+#ifdef arch_atomic_dec_release
+INSTR_ATOMIC_DEC(_release)
+#define atomic_dec_release atomic_dec_release
+#endif
+
+#define INSTR_ATOMIC64_DEC(order)					\
+static __always_inline void						\
+atomic64_dec##order(atomic64_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_dec##order(v);					\
 }
 
-static __always_inline void atomic_add(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic_add(i, v);
+INSTR_ATOMIC64_DEC()
+
+#ifdef arch_atomic64_dec_relaxed
+INSTR_ATOMIC64_DEC(_relaxed)
+#define atomic64_dec_relaxed atomic64_dec_relaxed
+#endif
+
+#ifdef arch_atomic64_dec_acquire
+INSTR_ATOMIC64_DEC(_acquire)
+#define atomic64_dec_acquire atomic64_dec_acquire
+#endif
+
+#ifdef arch_atomic64_dec_release
+INSTR_ATOMIC64_DEC(_release)
+#define atomic64_dec_release atomic64_dec_release
+#endif
+
+#define INSTR_ATOMIC_ADD(order)						\
+static __always_inline void						\
+atomic_add##order(int i, atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic_add##order(i, v);					\
 }
 
-static __always_inline void atomic64_add(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic64_add(i, v);
+INSTR_ATOMIC_ADD()
+
+#ifdef arch_atomic_add_relaxed
+INSTR_ATOMIC_ADD(_relaxed)
+#define atomic_add_relaxed atomic_add_relaxed
+#endif
+
+#ifdef arch_atomic_add_acquire
+INSTR_ATOMIC_ADD(_acquire)
+#define atomic_add_acquire atomic_add_acquire
+#endif
+
+#ifdef arch_atomic_add_release
+INSTR_ATOMIC_ADD(_release)
+#define atomic_add_release atomic_add_release
+#endif
+
+#define INSTR_ATOMIC64_ADD(order)					\
+static __always_inline void						\
+atomic64_add##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_add##order(i, v);					\
 }
 
-static __always_inline void atomic_sub(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic_sub(i, v);
+INSTR_ATOMIC64_ADD()
+
+#ifdef arch_atomic64_add_relaxed
+INSTR_ATOMIC64_ADD(_relaxed)
+#define atomic64_add_relaxed atomic64_add_relaxed
+#endif
+
+#ifdef arch_atomic64_add_acquire
+INSTR_ATOMIC64_ADD(_acquire)
+#define atomic64_add_acquire atomic64_add_acquire
+#endif
+
+#ifdef arch_atomic64_add_release
+INSTR_ATOMIC64_ADD(_release)
+#define atomic64_add_release atomic64_add_release
+#endif
+
+#define INSTR_ATOMIC_SUB(order)						\
+static __always_inline void						\
+atomic_sub##order(int i, atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic_sub##order(i, v);					\
 }
 
-static __always_inline void atomic64_sub(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic64_sub(i, v);
+INSTR_ATOMIC_SUB()
+
+#ifdef arch_atomic_sub_relaxed
+INSTR_ATOMIC_SUB(_relaxed)
+#define atomic_sub_relaxed atomic_sub_relaxed
+#endif
+
+#ifdef arch_atomic_sub_acquire
+INSTR_ATOMIC_SUB(_acquire)
+#define atomic_sub_acquire atomic_sub_acquire
+#endif
+
+#ifdef arch_atomic_sub_release
+INSTR_ATOMIC_SUB(_release)
+#define atomic_sub_release atomic_sub_release
+#endif
+
+#define INSTR_ATOMIC64_SUB(order)					\
+static __always_inline void						\
+atomic64_sub##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_sub##order(i, v);					\
 }
 
-static __always_inline void atomic_and(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic_and(i, v);
+INSTR_ATOMIC64_SUB()
+
+#ifdef arch_atomic64_sub_relaxed
+INSTR_ATOMIC64_SUB(_relaxed)
+#define atomic64_sub_relaxed atomic64_sub_relaxed
+#endif
+
+#ifdef arch_atomic64_sub_acquire
+INSTR_ATOMIC64_SUB(_acquire)
+#define atomic64_sub_acquire atomic64_sub_acquire
+#endif
+
+#ifdef arch_atomic64_sub_release
+INSTR_ATOMIC64_SUB(_release)
+#define atomic64_sub_release atomic64_sub_release
+#endif
+
+#define INSTR_ATOMIC_AND(order)						\
+static __always_inline void						\
+atomic_and##order(int i, atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic_and##order(i, v);					\
 }
 
-static __always_inline void atomic64_and(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic64_and(i, v);
+INSTR_ATOMIC_AND()
+
+#ifdef arch_atomic_and_relaxed
+INSTR_ATOMIC_AND(_relaxed)
+#define atomic_and_relaxed atomic_and_relaxed
+#endif
+
+#ifdef arch_atomic_and_acquire
+INSTR_ATOMIC_AND(_acquire)
+#define atomic_and_acquire atomic_and_acquire
+#endif
+
+#ifdef arch_atomic_and_release
+INSTR_ATOMIC_AND(_release)
+#define atomic_and_release atomic_and_release
+#endif
+
+#define INSTR_ATOMIC64_AND(order)					\
+static __always_inline void						\
+atomic64_and##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_and##order(i, v);					\
 }
 
-static __always_inline void atomic_or(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic_or(i, v);
+INSTR_ATOMIC64_AND()
+
+#ifdef arch_atomic64_and_relaxed
+INSTR_ATOMIC64_AND(_relaxed)
+#define atomic64_and_relaxed atomic64_and_relaxed
+#endif
+
+#ifdef arch_atomic64_and_acquire
+INSTR_ATOMIC64_AND(_acquire)
+#define atomic64_and_acquire atomic64_and_acquire
+#endif
+
+#ifdef arch_atomic64_and_release
+INSTR_ATOMIC64_AND(_release)
+#define atomic64_and_release atomic64_and_release
+#endif
+
+#define INSTR_ATOMIC_OR(order)						\
+static __always_inline void						\
+atomic_or##order(int i, atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic_or##order(i, v);					\
 }
 
-static __always_inline void atomic64_or(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic64_or(i, v);
+INSTR_ATOMIC_OR()
+
+#ifdef arch_atomic_or_relaxed
+INSTR_ATOMIC_OR(_relaxed)
+#define atomic_or_relaxed atomic_or_relaxed
+#endif
+
+#ifdef arch_atomic_or_acquire
+INSTR_ATOMIC_OR(_acquire)
+#define atomic_or_acquire atomic_or_acquire
+#endif
+
+#ifdef arch_atomic_or_release
+INSTR_ATOMIC_OR(_release)
+#define atomic_or_release atomic_or_release
+#endif
+
+#define INSTR_ATOMIC64_OR(order)					\
+static __always_inline void						\
+atomic64_or##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_or##order(i, v);					\
 }
 
-static __always_inline void atomic_xor(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic_xor(i, v);
+INSTR_ATOMIC64_OR()
+
+#ifdef arch_atomic64_or_relaxed
+INSTR_ATOMIC64_OR(_relaxed)
+#define atomic64_or_relaxed atomic64_or_relaxed
+#endif
+
+#ifdef arch_atomic64_or_acquire
+INSTR_ATOMIC64_OR(_acquire)
+#define atomic64_or_acquire atomic64_or_acquire
+#endif
+
+#ifdef arch_atomic64_or_release
+INSTR_ATOMIC64_OR(_release)
+#define atomic64_or_release atomic64_or_release
+#endif
+
+#define INSTR_ATOMIC_XOR(order)						\
+static __always_inline void						\
+atomic_xor##order(int i, atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic_xor##order(i, v);					\
 }
 
-static __always_inline void atomic64_xor(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic64_xor(i, v);
+INSTR_ATOMIC_XOR()
+
+#ifdef arch_atomic_xor_relaxed
+INSTR_ATOMIC_XOR(_relaxed)
+#define atomic_xor_relaxed atomic_xor_relaxed
+#endif
+
+#ifdef arch_atomic_xor_acquire
+INSTR_ATOMIC_XOR(_acquire)
+#define atomic_xor_acquire atomic_xor_acquire
+#endif
+
+#ifdef arch_atomic_xor_release
+INSTR_ATOMIC_XOR(_release)
+#define atomic_xor_release atomic_xor_release
+#endif
+
+#define INSTR_ATOMIC64_XOR(order)					\
+static __always_inline void						\
+atomic64_xor##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_xor##order(i, v);					\
 }
 
-static __always_inline int atomic_inc_return(atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_inc_return(v);
+INSTR_ATOMIC64_XOR()
+
+#ifdef arch_atomic64_xor_relaxed
+INSTR_ATOMIC64_XOR(_relaxed)
+#define atomic64_xor_relaxed atomic64_xor_relaxed
+#endif
+
+#ifdef arch_atomic64_xor_acquire
+INSTR_ATOMIC64_XOR(_acquire)
+#define atomic64_xor_acquire atomic64_xor_acquire
+#endif
+
+#ifdef arch_atomic64_xor_release
+INSTR_ATOMIC64_XOR(_release)
+#define atomic64_xor_release atomic64_xor_release
+#endif
+
+#define INSTR_ATOMIC_INC_RETURN(order)					\
+static __always_inline int						\
+atomic_inc_return##order(atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_inc_return##order(v);			\
 }
 
-static __always_inline s64 atomic64_inc_return(atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_inc_return(v);
+INSTR_ATOMIC_INC_RETURN()
+
+#ifdef arch_atomic_inc_return_relaxed
+INSTR_ATOMIC_INC_RETURN(_relaxed)
+#define atomic_inc_return_relaxed atomic_inc_return_relaxed
+#endif
+
+#ifdef arch_atomic_inc_return_acquire
+INSTR_ATOMIC_INC_RETURN(_acquire)
+#define atomic_inc_return_acquire atomic_inc_return_acquire
+#endif
+
+#ifdef arch_atomic_inc_return_release
+INSTR_ATOMIC_INC_RETURN(_release)
+#define atomic_inc_return_release atomic_inc_return_release
+#endif
+
+#define INSTR_ATOMIC64_INC_RETURN(order)				\
+static __always_inline s64						\
+atomic64_inc_return##order(atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_inc_return##order(v);			\
 }
 
-static __always_inline int atomic_dec_return(atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_dec_return(v);
+INSTR_ATOMIC64_INC_RETURN()
+
+#ifdef arch_atomic64_inc_return_relaxed
+INSTR_ATOMIC64_INC_RETURN(_relaxed)
+#define atomic64_inc_return_relaxed atomic64_inc_return_relaxed
+#endif
+
+#ifdef arch_atomic64_inc_return_acquire
+INSTR_ATOMIC64_INC_RETURN(_acquire)
+#define atomic64_inc_return_acquire atomic64_inc_return_acquire
+#endif
+
+#ifdef arch_atomic64_inc_return_release
+INSTR_ATOMIC64_INC_RETURN(_release)
+#define atomic64_inc_return_release atomic64_inc_return_release
+#endif
+
+#define INSTR_ATOMIC_DEC_RETURN(order)					\
+static __always_inline int						\
+atomic_dec_return##order(atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_dec_return##order(v);			\
 }
 
-static __always_inline s64 atomic64_dec_return(atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_dec_return(v);
+INSTR_ATOMIC_DEC_RETURN()
+
+#ifdef arch_atomic_dec_return_relaxed
+INSTR_ATOMIC_DEC_RETURN(_relaxed)
+#define atomic_dec_return_relaxed atomic_dec_return_relaxed
+#endif
+
+#ifdef arch_atomic_dec_return_acquire
+INSTR_ATOMIC_DEC_RETURN(_acquire)
+#define atomic_dec_return_acquire atomic_dec_return_acquire
+#endif
+
+#ifdef arch_atomic_dec_return_release
+INSTR_ATOMIC_DEC_RETURN(_release)
+#define atomic_dec_return_release atomic_dec_return_release
+#endif
+
+#define INSTR_ATOMIC64_DEC_RETURN(order)				\
+static __always_inline s64						\
+atomic64_dec_return##order(atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_dec_return##order(v);			\
 }
 
+INSTR_ATOMIC64_DEC_RETURN()
+
+#ifdef arch_atomic64_dec_return_relaxed
+INSTR_ATOMIC64_DEC_RETURN(_relaxed)
+#define atomic64_dec_return_relaxed atomic64_dec_return_relaxed
+#endif
+
+#ifdef arch_atomic64_dec_return_acquire
+INSTR_ATOMIC64_DEC_RETURN(_acquire)
+#define atomic64_dec_return_acquire atomic64_dec_return_acquire
+#endif
+
+#ifdef arch_atomic64_dec_return_release
+INSTR_ATOMIC64_DEC_RETURN(_release)
+#define atomic64_dec_return_release atomic64_dec_return_release
+#endif
+
 static __always_inline s64 atomic64_inc_not_zero(atomic64_t *v)
 {
 	kasan_check_write(v, sizeof(*v));
@@ -241,90 +734,356 @@ static __always_inline bool atomic64_inc_and_test(atomic64_t *v)
 	return arch_atomic64_inc_and_test(v);
 }
 
-static __always_inline int atomic_add_return(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_add_return(i, v);
+#define INSTR_ATOMIC_ADD_RETURN(order)					\
+static __always_inline int						\
+atomic_add_return##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_add_return##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_add_return(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_add_return(i, v);
+INSTR_ATOMIC_ADD_RETURN()
+
+#ifdef arch_atomic_add_return_relaxed
+INSTR_ATOMIC_ADD_RETURN(_relaxed)
+#define atomic_add_return_relaxed atomic_add_return_relaxed
+#endif
+
+#ifdef arch_atomic_add_return_acquire
+INSTR_ATOMIC_ADD_RETURN(_acquire)
+#define atomic_add_return_acquire atomic_add_return_acquire
+#endif
+
+#ifdef arch_atomic_add_return_release
+INSTR_ATOMIC_ADD_RETURN(_release)
+#define atomic_add_return_release atomic_add_return_release
+#endif
+
+#define INSTR_ATOMIC64_ADD_RETURN(order)				\
+static __always_inline s64						\
+atomic64_add_return##order(s64 i, atomic64_t *v)			\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_add_return##order(i, v);			\
 }
 
-static __always_inline int atomic_sub_return(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_sub_return(i, v);
+INSTR_ATOMIC64_ADD_RETURN()
+
+#ifdef arch_atomic64_add_return_relaxed
+INSTR_ATOMIC64_ADD_RETURN(_relaxed)
+#define atomic64_add_return_relaxed atomic64_add_return_relaxed
+#endif
+
+#ifdef arch_atomic64_add_return_acquire
+INSTR_ATOMIC64_ADD_RETURN(_acquire)
+#define atomic64_add_return_acquire atomic64_add_return_acquire
+#endif
+
+#ifdef arch_atomic64_add_return_release
+INSTR_ATOMIC64_ADD_RETURN(_release)
+#define atomic64_add_return_release atomic64_add_return_release
+#endif
+
+#define INSTR_ATOMIC_SUB_RETURN(order)					\
+static __always_inline int						\
+atomic_sub_return##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_sub_return##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_sub_return(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_sub_return(i, v);
+INSTR_ATOMIC_SUB_RETURN()
+
+#ifdef arch_atomic_sub_return_relaxed
+INSTR_ATOMIC_SUB_RETURN(_relaxed)
+#define atomic_sub_return_relaxed atomic_sub_return_relaxed
+#endif
+
+#ifdef arch_atomic_sub_return_acquire
+INSTR_ATOMIC_SUB_RETURN(_acquire)
+#define atomic_sub_return_acquire atomic_sub_return_acquire
+#endif
+
+#ifdef arch_atomic_sub_return_release
+INSTR_ATOMIC_SUB_RETURN(_release)
+#define atomic_sub_return_release atomic_sub_return_release
+#endif
+
+#define INSTR_ATOMIC64_SUB_RETURN(order)				\
+static __always_inline s64						\
+atomic64_sub_return##order(s64 i, atomic64_t *v)			\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_sub_return##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_add(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_add(i, v);
+INSTR_ATOMIC64_SUB_RETURN()
+
+#ifdef arch_atomic64_sub_return_relaxed
+INSTR_ATOMIC64_SUB_RETURN(_relaxed)
+#define atomic64_sub_return_relaxed atomic64_sub_return_relaxed
+#endif
+
+#ifdef arch_atomic64_sub_return_acquire
+INSTR_ATOMIC64_SUB_RETURN(_acquire)
+#define atomic64_sub_return_acquire atomic64_sub_return_acquire
+#endif
+
+#ifdef arch_atomic64_sub_return_release
+INSTR_ATOMIC64_SUB_RETURN(_release)
+#define atomic64_sub_return_release atomic64_sub_return_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_ADD(order)					\
+static __always_inline int						\
+atomic_fetch_add##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_add##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_add(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_add(i, v);
+INSTR_ATOMIC_FETCH_ADD()
+
+#ifdef arch_atomic_fetch_add_relaxed
+INSTR_ATOMIC_FETCH_ADD(_relaxed)
+#define atomic_fetch_add_relaxed atomic_fetch_add_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_add_acquire
+INSTR_ATOMIC_FETCH_ADD(_acquire)
+#define atomic_fetch_add_acquire atomic_fetch_add_acquire
+#endif
+
+#ifdef arch_atomic_fetch_add_release
+INSTR_ATOMIC_FETCH_ADD(_release)
+#define atomic_fetch_add_release atomic_fetch_add_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_ADD(order)					\
+static __always_inline s64						\
+atomic64_fetch_add##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_add##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_sub(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_sub(i, v);
+INSTR_ATOMIC64_FETCH_ADD()
+
+#ifdef arch_atomic64_fetch_add_relaxed
+INSTR_ATOMIC64_FETCH_ADD(_relaxed)
+#define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_add_acquire
+INSTR_ATOMIC64_FETCH_ADD(_acquire)
+#define atomic64_fetch_add_acquire atomic64_fetch_add_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_add_release
+INSTR_ATOMIC64_FETCH_ADD(_release)
+#define atomic64_fetch_add_release atomic64_fetch_add_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_SUB(order)					\
+static __always_inline int						\
+atomic_fetch_sub##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_sub##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_sub(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_sub(i, v);
+INSTR_ATOMIC_FETCH_SUB()
+
+#ifdef arch_atomic_fetch_sub_relaxed
+INSTR_ATOMIC_FETCH_SUB(_relaxed)
+#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_sub_acquire
+INSTR_ATOMIC_FETCH_SUB(_acquire)
+#define atomic_fetch_sub_acquire atomic_fetch_sub_acquire
+#endif
+
+#ifdef arch_atomic_fetch_sub_release
+INSTR_ATOMIC_FETCH_SUB(_release)
+#define atomic_fetch_sub_release atomic_fetch_sub_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_SUB(order)					\
+static __always_inline s64						\
+atomic64_fetch_sub##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_sub##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_and(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_and(i, v);
+INSTR_ATOMIC64_FETCH_SUB()
+
+#ifdef arch_atomic64_fetch_sub_relaxed
+INSTR_ATOMIC64_FETCH_SUB(_relaxed)
+#define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_sub_acquire
+INSTR_ATOMIC64_FETCH_SUB(_acquire)
+#define atomic64_fetch_sub_acquire atomic64_fetch_sub_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_sub_release
+INSTR_ATOMIC64_FETCH_SUB(_release)
+#define atomic64_fetch_sub_release atomic64_fetch_sub_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_AND(order)					\
+static __always_inline int						\
+atomic_fetch_and##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_and##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_and(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_and(i, v);
+INSTR_ATOMIC_FETCH_AND()
+
+#ifdef arch_atomic_fetch_and_relaxed
+INSTR_ATOMIC_FETCH_AND(_relaxed)
+#define atomic_fetch_and_relaxed atomic_fetch_and_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_and_acquire
+INSTR_ATOMIC_FETCH_AND(_acquire)
+#define atomic_fetch_and_acquire atomic_fetch_and_acquire
+#endif
+
+#ifdef arch_atomic_fetch_and_release
+INSTR_ATOMIC_FETCH_AND(_release)
+#define atomic_fetch_and_release atomic_fetch_and_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_AND(order)					\
+static __always_inline s64						\
+atomic64_fetch_and##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_and##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_or(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_or(i, v);
+INSTR_ATOMIC64_FETCH_AND()
+
+#ifdef arch_atomic64_fetch_and_relaxed
+INSTR_ATOMIC64_FETCH_AND(_relaxed)
+#define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_and_acquire
+INSTR_ATOMIC64_FETCH_AND(_acquire)
+#define atomic64_fetch_and_acquire atomic64_fetch_and_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_and_release
+INSTR_ATOMIC64_FETCH_AND(_release)
+#define atomic64_fetch_and_release atomic64_fetch_and_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_OR(order)					\
+static __always_inline int						\
+atomic_fetch_or##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_or##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_or(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_or(i, v);
+INSTR_ATOMIC_FETCH_OR()
+
+#ifdef arch_atomic_fetch_or_relaxed
+INSTR_ATOMIC_FETCH_OR(_relaxed)
+#define atomic_fetch_or_relaxed atomic_fetch_or_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_or_acquire
+INSTR_ATOMIC_FETCH_OR(_acquire)
+#define atomic_fetch_or_acquire atomic_fetch_or_acquire
+#endif
+
+#ifdef arch_atomic_fetch_or_release
+INSTR_ATOMIC_FETCH_OR(_release)
+#define atomic_fetch_or_release atomic_fetch_or_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_OR(order)					\
+static __always_inline s64						\
+atomic64_fetch_or##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_or##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_xor(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_xor(i, v);
+INSTR_ATOMIC64_FETCH_OR()
+
+#ifdef arch_atomic64_fetch_or_relaxed
+INSTR_ATOMIC64_FETCH_OR(_relaxed)
+#define atomic64_fetch_or_relaxed atomic64_fetch_or_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_or_acquire
+INSTR_ATOMIC64_FETCH_OR(_acquire)
+#define atomic64_fetch_or_acquire atomic64_fetch_or_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_or_release
+INSTR_ATOMIC64_FETCH_OR(_release)
+#define atomic64_fetch_or_release atomic64_fetch_or_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_XOR(order)					\
+static __always_inline int						\
+atomic_fetch_xor##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_xor##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_xor(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_xor(i, v);
+INSTR_ATOMIC_FETCH_XOR()
+
+#ifdef arch_atomic_fetch_xor_relaxed
+INSTR_ATOMIC_FETCH_XOR(_relaxed)
+#define atomic_fetch_xor_relaxed atomic_fetch_xor_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_xor_acquire
+INSTR_ATOMIC_FETCH_XOR(_acquire)
+#define atomic_fetch_xor_acquire atomic_fetch_xor_acquire
+#endif
+
+#ifdef arch_atomic_fetch_xor_release
+INSTR_ATOMIC_FETCH_XOR(_release)
+#define atomic_fetch_xor_release atomic_fetch_xor_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_XOR(xorder)				\
+static __always_inline s64						\
+atomic64_fetch_xor##xorder(s64 i, atomic64_t *v)			\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_xor##xorder(i, v);			\
 }
 
+INSTR_ATOMIC64_FETCH_XOR()
+
+#ifdef arch_atomic64_fetch_xor_relaxed
+INSTR_ATOMIC64_FETCH_XOR(_relaxed)
+#define atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_xor_acquire
+INSTR_ATOMIC64_FETCH_XOR(_acquire)
+#define atomic64_fetch_xor_acquire atomic64_fetch_xor_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_xor_release
+INSTR_ATOMIC64_FETCH_XOR(_release)
+#define atomic64_fetch_xor_release atomic64_fetch_xor_release
+#endif
+
 static __always_inline bool atomic_sub_and_test(int i, atomic_t *v)
 {
 	kasan_check_write(v, sizeof(*v));
@@ -349,31 +1108,64 @@ static __always_inline bool atomic64_add_negative(s64 i, atomic64_t *v)
 	return arch_atomic64_add_negative(i, v);
 }
 
-static __always_inline unsigned long
-cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new, int size)
-{
-	kasan_check_write(ptr, size);
-	switch (size) {
-	case 1:
-		return arch_cmpxchg((u8 *)ptr, (u8)old, (u8)new);
-	case 2:
-		return arch_cmpxchg((u16 *)ptr, (u16)old, (u16)new);
-	case 4:
-		return arch_cmpxchg((u32 *)ptr, (u32)old, (u32)new);
-	case 8:
-		BUILD_BUG_ON(sizeof(unsigned long) != 8);
-		return arch_cmpxchg((u64 *)ptr, (u64)old, (u64)new);
-	}
-	BUILD_BUG();
-	return 0;
+#define INSTR_CMPXCHG(order)							\
+static __always_inline unsigned long						\
+cmpxchg##order##_size(volatile void *ptr, unsigned long old,			\
+		       unsigned long new, int size)				\
+{										\
+	kasan_check_write(ptr, size);						\
+	switch (size) {								\
+	case 1:									\
+		return arch_cmpxchg##order((u8 *)ptr, (u8)old, (u8)new);	\
+	case 2:									\
+		return arch_cmpxchg##order((u16 *)ptr, (u16)old, (u16)new);	\
+	case 4:									\
+		return arch_cmpxchg##order((u32 *)ptr, (u32)old, (u32)new);	\
+	case 8:									\
+		BUILD_BUG_ON(sizeof(unsigned long) != 8);			\
+		return arch_cmpxchg##order((u64 *)ptr, (u64)old, (u64)new);	\
+	}									\
+	BUILD_BUG();								\
+	return 0;								\
 }
 
+INSTR_CMPXCHG()
 #define cmpxchg(ptr, old, new)						\
 ({									\
 	((__typeof__(*(ptr)))cmpxchg_size((ptr), (unsigned long)(old),	\
 		(unsigned long)(new), sizeof(*(ptr))));			\
 })
 
+#ifdef arch_cmpxchg_relaxed
+INSTR_CMPXCHG(_relaxed)
+#define cmpxchg_relaxed(ptr, old, new)					\
+({									\
+	((__typeof__(*(ptr)))cmpxchg_relaxed_size((ptr),		\
+		(unsigned long)(old), (unsigned long)(new), 		\
+		sizeof(*(ptr))));					\
+})
+#endif
+
+#ifdef arch_cmpxchg_acquire
+INSTR_CMPXCHG(_acquire)
+#define cmpxchg_acquire(ptr, old, new)					\
+({									\
+	((__typeof__(*(ptr)))cmpxchg_acquire_size((ptr),		\
+		(unsigned long)(old), (unsigned long)(new), 		\
+		sizeof(*(ptr))));					\
+})
+#endif
+
+#ifdef arch_cmpxchg_release
+INSTR_CMPXCHG(_release)
+#define cmpxchg_release(ptr, old, new)					\
+({									\
+	((__typeof__(*(ptr)))cmpxchg_release_size((ptr),		\
+		(unsigned long)(old), (unsigned long)(new), 		\
+		sizeof(*(ptr))));					\
+})
+#endif
+
 static __always_inline unsigned long
 sync_cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new,
 		  int size)
@@ -428,19 +1220,48 @@ cmpxchg_local_size(volatile void *ptr, unsigned long old, unsigned long new,
 		sizeof(*(ptr))));					\
 })
 
-static __always_inline u64
-cmpxchg64_size(volatile u64 *ptr, u64 old, u64 new)
-{
-	kasan_check_write(ptr, sizeof(*ptr));
-	return arch_cmpxchg64(ptr, old, new);
+#define INSTR_CMPXCHG64(order)						\
+static __always_inline u64						\
+cmpxchg64##order##_size(volatile u64 *ptr, u64 old, u64 new)		\
+{									\
+	kasan_check_write(ptr, sizeof(*ptr));				\
+	return arch_cmpxchg64##order(ptr, old, new);			\
 }
 
+INSTR_CMPXCHG64()
 #define cmpxchg64(ptr, old, new)					\
 ({									\
 	((__typeof__(*(ptr)))cmpxchg64_size((ptr), (u64)(old),		\
 		(u64)(new)));						\
 })
 
+#ifdef arch_cmpxchg64_relaxed
+INSTR_CMPXCHG64(_relaxed)
+#define cmpxchg64_relaxed(ptr, old, new)				\
+({									\
+	((__typeof__(*(ptr)))cmpxchg64_relaxed_size((ptr), (u64)(old),	\
+		(u64)(new)));						\
+})
+#endif
+
+#ifdef arch_cmpxchg64_acquire
+INSTR_CMPXCHG64(_acquire)
+#define cmpxchg64_acquire(ptr, old, new)				\
+({									\
+	((__typeof__(*(ptr)))cmpxchg64_acquire_size((ptr), (u64)(old),	\
+		(u64)(new)));						\
+})
+#endif
+
+#ifdef arch_cmpxchg64_release
+INSTR_CMPXCHG64(_release)
+#define cmpxchg64_release(ptr, old, new)				\
+({									\
+	((__typeof__(*(ptr)))cmpxchg64_release_size((ptr), (u64)(old),	\
+		(u64)(new)));						\
+})
+#endif
+
 static __always_inline u64
 cmpxchg64_local_size(volatile u64 *ptr, u64 old, u64 new)
 {
-- 
2.11.0

+
+#define INSTR_ATOMIC64_OR(order)					\
+static __always_inline void						\
+atomic64_or##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_or##order(i, v);					\
 }
 
-static __always_inline void atomic_xor(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic_xor(i, v);
+INSTR_ATOMIC64_OR()
+
+#ifdef arch_atomic64_or_relaxed
+INSTR_ATOMIC64_OR(_relaxed)
+#define atomic64_or_relaxed atomic64_or_relaxed
+#endif
+
+#ifdef arch_atomic64_or_acquire
+INSTR_ATOMIC64_OR(_acquire)
+#define atomic64_or_acquire atomic64_or_acquire
+#endif
+
+#ifdef arch_atomic64_or_release
+INSTR_ATOMIC64_OR(_release)
+#define atomic64_or_release atomic64_or_release
+#endif
+
+#define INSTR_ATOMIC_XOR(order)						\
+static __always_inline void						\
+atomic_xor##order(int i, atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic_xor##order(i, v);					\
 }
 
-static __always_inline void atomic64_xor(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	arch_atomic64_xor(i, v);
+INSTR_ATOMIC_XOR()
+
+#ifdef arch_atomic_xor_relaxed
+INSTR_ATOMIC_XOR(_relaxed)
+#define atomic_xor_relaxed atomic_xor_relaxed
+#endif
+
+#ifdef arch_atomic_xor_acquire
+INSTR_ATOMIC_XOR(_acquire)
+#define atomic_xor_acquire atomic_xor_acquire
+#endif
+
+#ifdef arch_atomic_xor_release
+INSTR_ATOMIC_XOR(_release)
+#define atomic_xor_release atomic_xor_release
+#endif
+
+#define INSTR_ATOMIC64_XOR(order)					\
+static __always_inline void						\
+atomic64_xor##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_xor##order(i, v);					\
 }
 
-static __always_inline int atomic_inc_return(atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_inc_return(v);
+INSTR_ATOMIC64_XOR()
+
+#ifdef arch_atomic64_xor_relaxed
+INSTR_ATOMIC64_XOR(_relaxed)
+#define atomic64_xor_relaxed atomic64_xor_relaxed
+#endif
+
+#ifdef arch_atomic64_xor_acquire
+INSTR_ATOMIC64_XOR(_acquire)
+#define atomic64_xor_acquire atomic64_xor_acquire
+#endif
+
+#ifdef arch_atomic64_xor_release
+INSTR_ATOMIC64_XOR(_release)
+#define atomic64_xor_release atomic64_xor_release
+#endif
+
+#define INSTR_ATOMIC_INC_RETURN(order)					\
+static __always_inline int						\
+atomic_inc_return##order(atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_inc_return##order(v);			\
 }
 
-static __always_inline s64 atomic64_inc_return(atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_inc_return(v);
+INSTR_ATOMIC_INC_RETURN()
+
+#ifdef arch_atomic_inc_return_relaxed
+INSTR_ATOMIC_INC_RETURN(_relaxed)
+#define atomic_inc_return_relaxed atomic_inc_return_relaxed
+#endif
+
+#ifdef arch_atomic_inc_return_acquire
+INSTR_ATOMIC_INC_RETURN(_acquire)
+#define atomic_inc_return_acquire atomic_inc_return_acquire
+#endif
+
+#ifdef arch_atomic_inc_return_release
+INSTR_ATOMIC_INC_RETURN(_release)
+#define atomic_inc_return_release atomic_inc_return_release
+#endif
+
+#define INSTR_ATOMIC64_INC_RETURN(order)				\
+static __always_inline s64						\
+atomic64_inc_return##order(atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_inc_return##order(v);			\
 }
 
-static __always_inline int atomic_dec_return(atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_dec_return(v);
+INSTR_ATOMIC64_INC_RETURN()
+
+#ifdef arch_atomic64_inc_return_relaxed
+INSTR_ATOMIC64_INC_RETURN(_relaxed)
+#define atomic64_inc_return_relaxed atomic64_inc_return_relaxed
+#endif
+
+#ifdef arch_atomic64_inc_return_acquire
+INSTR_ATOMIC64_INC_RETURN(_acquire)
+#define atomic64_inc_return_acquire atomic64_inc_return_acquire
+#endif
+
+#ifdef arch_atomic64_inc_return_release
+INSTR_ATOMIC64_INC_RETURN(_release)
+#define atomic64_inc_return_release atomic64_inc_return_release
+#endif
+
+#define INSTR_ATOMIC_DEC_RETURN(order)					\
+static __always_inline int						\
+atomic_dec_return##order(atomic_t *v)					\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_dec_return##order(v);			\
 }
 
-static __always_inline s64 atomic64_dec_return(atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_dec_return(v);
+INSTR_ATOMIC_DEC_RETURN()
+
+#ifdef arch_atomic_dec_return_relaxed
+INSTR_ATOMIC_DEC_RETURN(_relaxed)
+#define atomic_dec_return_relaxed atomic_dec_return_relaxed
+#endif
+
+#ifdef arch_atomic_dec_return_acquire
+INSTR_ATOMIC_DEC_RETURN(_acquire)
+#define atomic_dec_return_acquire atomic_dec_return_acquire
+#endif
+
+#ifdef arch_atomic_dec_return_release
+INSTR_ATOMIC_DEC_RETURN(_release)
+#define atomic_dec_return_release atomic_dec_return_release
+#endif
+
+#define INSTR_ATOMIC64_DEC_RETURN(order)				\
+static __always_inline s64						\
+atomic64_dec_return##order(atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_dec_return##order(v);			\
 }
 
+INSTR_ATOMIC64_DEC_RETURN()
+
+#ifdef arch_atomic64_dec_return_relaxed
+INSTR_ATOMIC64_DEC_RETURN(_relaxed)
+#define atomic64_dec_return_relaxed atomic64_dec_return_relaxed
+#endif
+
+#ifdef arch_atomic64_dec_return_acquire
+INSTR_ATOMIC64_DEC_RETURN(_acquire)
+#define atomic64_dec_return_acquire atomic64_dec_return_acquire
+#endif
+
+#ifdef arch_atomic64_dec_return_release
+INSTR_ATOMIC64_DEC_RETURN(_release)
+#define atomic64_dec_return_release atomic64_dec_return_release
+#endif
+
 static __always_inline s64 atomic64_inc_not_zero(atomic64_t *v)
 {
 	kasan_check_write(v, sizeof(*v));
@@ -241,90 +734,356 @@ static __always_inline bool atomic64_inc_and_test(atomic64_t *v)
 	return arch_atomic64_inc_and_test(v);
 }
 
-static __always_inline int atomic_add_return(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_add_return(i, v);
+#define INSTR_ATOMIC_ADD_RETURN(order)					\
+static __always_inline int						\
+atomic_add_return##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_add_return##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_add_return(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_add_return(i, v);
+INSTR_ATOMIC_ADD_RETURN()
+
+#ifdef arch_atomic_add_return_relaxed
+INSTR_ATOMIC_ADD_RETURN(_relaxed)
+#define atomic_add_return_relaxed atomic_add_return_relaxed
+#endif
+
+#ifdef arch_atomic_add_return_acquire
+INSTR_ATOMIC_ADD_RETURN(_acquire)
+#define atomic_add_return_acquire atomic_add_return_acquire
+#endif
+
+#ifdef arch_atomic_add_return_release
+INSTR_ATOMIC_ADD_RETURN(_release)
+#define atomic_add_return_release atomic_add_return_release
+#endif
+
+#define INSTR_ATOMIC64_ADD_RETURN(order)				\
+static __always_inline s64						\
+atomic64_add_return##order(s64 i, atomic64_t *v)			\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_add_return##order(i, v);			\
 }
 
-static __always_inline int atomic_sub_return(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_sub_return(i, v);
+INSTR_ATOMIC64_ADD_RETURN()
+
+#ifdef arch_atomic64_add_return_relaxed
+INSTR_ATOMIC64_ADD_RETURN(_relaxed)
+#define atomic64_add_return_relaxed atomic64_add_return_relaxed
+#endif
+
+#ifdef arch_atomic64_add_return_acquire
+INSTR_ATOMIC64_ADD_RETURN(_acquire)
+#define atomic64_add_return_acquire atomic64_add_return_acquire
+#endif
+
+#ifdef arch_atomic64_add_return_release
+INSTR_ATOMIC64_ADD_RETURN(_release)
+#define atomic64_add_return_release atomic64_add_return_release
+#endif
+
+#define INSTR_ATOMIC_SUB_RETURN(order)					\
+static __always_inline int						\
+atomic_sub_return##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_sub_return##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_sub_return(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_sub_return(i, v);
+INSTR_ATOMIC_SUB_RETURN()
+
+#ifdef arch_atomic_sub_return_relaxed
+INSTR_ATOMIC_SUB_RETURN(_relaxed)
+#define atomic_sub_return_relaxed atomic_sub_return_relaxed
+#endif
+
+#ifdef arch_atomic_sub_return_acquire
+INSTR_ATOMIC_SUB_RETURN(_acquire)
+#define atomic_sub_return_acquire atomic_sub_return_acquire
+#endif
+
+#ifdef arch_atomic_sub_return_release
+INSTR_ATOMIC_SUB_RETURN(_release)
+#define atomic_sub_return_release atomic_sub_return_release
+#endif
+
+#define INSTR_ATOMIC64_SUB_RETURN(order)				\
+static __always_inline s64						\
+atomic64_sub_return##order(s64 i, atomic64_t *v)			\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_sub_return##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_add(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_add(i, v);
+INSTR_ATOMIC64_SUB_RETURN()
+
+#ifdef arch_atomic64_sub_return_relaxed
+INSTR_ATOMIC64_SUB_RETURN(_relaxed)
+#define atomic64_sub_return_relaxed atomic64_sub_return_relaxed
+#endif
+
+#ifdef arch_atomic64_sub_return_acquire
+INSTR_ATOMIC64_SUB_RETURN(_acquire)
+#define atomic64_sub_return_acquire atomic64_sub_return_acquire
+#endif
+
+#ifdef arch_atomic64_sub_return_release
+INSTR_ATOMIC64_SUB_RETURN(_release)
+#define atomic64_sub_return_release atomic64_sub_return_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_ADD(order)					\
+static __always_inline int						\
+atomic_fetch_add##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_add##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_add(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_add(i, v);
+INSTR_ATOMIC_FETCH_ADD()
+
+#ifdef arch_atomic_fetch_add_relaxed
+INSTR_ATOMIC_FETCH_ADD(_relaxed)
+#define atomic_fetch_add_relaxed atomic_fetch_add_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_add_acquire
+INSTR_ATOMIC_FETCH_ADD(_acquire)
+#define atomic_fetch_add_acquire atomic_fetch_add_acquire
+#endif
+
+#ifdef arch_atomic_fetch_add_release
+INSTR_ATOMIC_FETCH_ADD(_release)
+#define atomic_fetch_add_release atomic_fetch_add_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_ADD(order)					\
+static __always_inline s64						\
+atomic64_fetch_add##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_add##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_sub(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_sub(i, v);
+INSTR_ATOMIC64_FETCH_ADD()
+
+#ifdef arch_atomic64_fetch_add_relaxed
+INSTR_ATOMIC64_FETCH_ADD(_relaxed)
+#define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_add_acquire
+INSTR_ATOMIC64_FETCH_ADD(_acquire)
+#define atomic64_fetch_add_acquire atomic64_fetch_add_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_add_release
+INSTR_ATOMIC64_FETCH_ADD(_release)
+#define atomic64_fetch_add_release atomic64_fetch_add_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_SUB(order)					\
+static __always_inline int						\
+atomic_fetch_sub##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_sub##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_sub(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_sub(i, v);
+INSTR_ATOMIC_FETCH_SUB()
+
+#ifdef arch_atomic_fetch_sub_relaxed
+INSTR_ATOMIC_FETCH_SUB(_relaxed)
+#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_sub_acquire
+INSTR_ATOMIC_FETCH_SUB(_acquire)
+#define atomic_fetch_sub_acquire atomic_fetch_sub_acquire
+#endif
+
+#ifdef arch_atomic_fetch_sub_release
+INSTR_ATOMIC_FETCH_SUB(_release)
+#define atomic_fetch_sub_release atomic_fetch_sub_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_SUB(order)					\
+static __always_inline s64						\
+atomic64_fetch_sub##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_sub##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_and(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_and(i, v);
+INSTR_ATOMIC64_FETCH_SUB()
+
+#ifdef arch_atomic64_fetch_sub_relaxed
+INSTR_ATOMIC64_FETCH_SUB(_relaxed)
+#define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_sub_acquire
+INSTR_ATOMIC64_FETCH_SUB(_acquire)
+#define atomic64_fetch_sub_acquire atomic64_fetch_sub_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_sub_release
+INSTR_ATOMIC64_FETCH_SUB(_release)
+#define atomic64_fetch_sub_release atomic64_fetch_sub_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_AND(order)					\
+static __always_inline int						\
+atomic_fetch_and##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_and##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_and(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_and(i, v);
+INSTR_ATOMIC_FETCH_AND()
+
+#ifdef arch_atomic_fetch_and_relaxed
+INSTR_ATOMIC_FETCH_AND(_relaxed)
+#define atomic_fetch_and_relaxed atomic_fetch_and_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_and_acquire
+INSTR_ATOMIC_FETCH_AND(_acquire)
+#define atomic_fetch_and_acquire atomic_fetch_and_acquire
+#endif
+
+#ifdef arch_atomic_fetch_and_release
+INSTR_ATOMIC_FETCH_AND(_release)
+#define atomic_fetch_and_release atomic_fetch_and_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_AND(order)					\
+static __always_inline s64						\
+atomic64_fetch_and##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_and##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_or(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_or(i, v);
+INSTR_ATOMIC64_FETCH_AND()
+
+#ifdef arch_atomic64_fetch_and_relaxed
+INSTR_ATOMIC64_FETCH_AND(_relaxed)
+#define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_and_acquire
+INSTR_ATOMIC64_FETCH_AND(_acquire)
+#define atomic64_fetch_and_acquire atomic64_fetch_and_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_and_release
+INSTR_ATOMIC64_FETCH_AND(_release)
+#define atomic64_fetch_and_release atomic64_fetch_and_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_OR(order)					\
+static __always_inline int						\
+atomic_fetch_or##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_or##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_or(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_or(i, v);
+INSTR_ATOMIC_FETCH_OR()
+
+#ifdef arch_atomic_fetch_or_relaxed
+INSTR_ATOMIC_FETCH_OR(_relaxed)
+#define atomic_fetch_or_relaxed atomic_fetch_or_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_or_acquire
+INSTR_ATOMIC_FETCH_OR(_acquire)
+#define atomic_fetch_or_acquire atomic_fetch_or_acquire
+#endif
+
+#ifdef arch_atomic_fetch_or_release
+INSTR_ATOMIC_FETCH_OR(_release)
+#define atomic_fetch_or_release atomic_fetch_or_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_OR(order)					\
+static __always_inline s64						\
+atomic64_fetch_or##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_or##order(i, v);			\
 }
 
-static __always_inline int atomic_fetch_xor(int i, atomic_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic_fetch_xor(i, v);
+INSTR_ATOMIC64_FETCH_OR()
+
+#ifdef arch_atomic64_fetch_or_relaxed
+INSTR_ATOMIC64_FETCH_OR(_relaxed)
+#define atomic64_fetch_or_relaxed atomic64_fetch_or_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_or_acquire
+INSTR_ATOMIC64_FETCH_OR(_acquire)
+#define atomic64_fetch_or_acquire atomic64_fetch_or_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_or_release
+INSTR_ATOMIC64_FETCH_OR(_release)
+#define atomic64_fetch_or_release atomic64_fetch_or_release
+#endif
+
+#define INSTR_ATOMIC_FETCH_XOR(order)					\
+static __always_inline int						\
+atomic_fetch_xor##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_xor##order(i, v);			\
 }
 
-static __always_inline s64 atomic64_fetch_xor(s64 i, atomic64_t *v)
-{
-	kasan_check_write(v, sizeof(*v));
-	return arch_atomic64_fetch_xor(i, v);
+INSTR_ATOMIC_FETCH_XOR()
+
+#ifdef arch_atomic_fetch_xor_relaxed
+INSTR_ATOMIC_FETCH_XOR(_relaxed)
+#define atomic_fetch_xor_relaxed atomic_fetch_xor_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_xor_acquire
+INSTR_ATOMIC_FETCH_XOR(_acquire)
+#define atomic_fetch_xor_acquire atomic_fetch_xor_acquire
+#endif
+
+#ifdef arch_atomic_fetch_xor_release
+INSTR_ATOMIC_FETCH_XOR(_release)
+#define atomic_fetch_xor_release atomic_fetch_xor_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_XOR(order)					\
+static __always_inline s64						\
+atomic64_fetch_xor##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_xor##order(i, v);			\
 }
 
+INSTR_ATOMIC64_FETCH_XOR()
+
+#ifdef arch_atomic64_fetch_xor_relaxed
+INSTR_ATOMIC64_FETCH_XOR(_relaxed)
+#define atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_xor_acquire
+INSTR_ATOMIC64_FETCH_XOR(_acquire)
+#define atomic64_fetch_xor_acquire atomic64_fetch_xor_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_xor_release
+INSTR_ATOMIC64_FETCH_XOR(_release)
+#define atomic64_fetch_xor_release atomic64_fetch_xor_release
+#endif
+
 static __always_inline bool atomic_sub_and_test(int i, atomic_t *v)
 {
 	kasan_check_write(v, sizeof(*v));
@@ -349,31 +1108,64 @@ static __always_inline bool atomic64_add_negative(s64 i, atomic64_t *v)
 	return arch_atomic64_add_negative(i, v);
 }
 
-static __always_inline unsigned long
-cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new, int size)
-{
-	kasan_check_write(ptr, size);
-	switch (size) {
-	case 1:
-		return arch_cmpxchg((u8 *)ptr, (u8)old, (u8)new);
-	case 2:
-		return arch_cmpxchg((u16 *)ptr, (u16)old, (u16)new);
-	case 4:
-		return arch_cmpxchg((u32 *)ptr, (u32)old, (u32)new);
-	case 8:
-		BUILD_BUG_ON(sizeof(unsigned long) != 8);
-		return arch_cmpxchg((u64 *)ptr, (u64)old, (u64)new);
-	}
-	BUILD_BUG();
-	return 0;
+#define INSTR_CMPXCHG(order)							\
+static __always_inline unsigned long						\
+cmpxchg##order##_size(volatile void *ptr, unsigned long old,			\
+		       unsigned long new, int size)				\
+{										\
+	kasan_check_write(ptr, size);						\
+	switch (size) {								\
+	case 1:									\
+		return arch_cmpxchg##order((u8 *)ptr, (u8)old, (u8)new);	\
+	case 2:									\
+		return arch_cmpxchg##order((u16 *)ptr, (u16)old, (u16)new);	\
+	case 4:									\
+		return arch_cmpxchg##order((u32 *)ptr, (u32)old, (u32)new);	\
+	case 8:									\
+		BUILD_BUG_ON(sizeof(unsigned long) != 8);			\
+		return arch_cmpxchg##order((u64 *)ptr, (u64)old, (u64)new);	\
+	}									\
+	BUILD_BUG();								\
+	return 0;								\
 }
 
+INSTR_CMPXCHG()
 #define cmpxchg(ptr, old, new)						\
 ({									\
 	((__typeof__(*(ptr)))cmpxchg_size((ptr), (unsigned long)(old),	\
 		(unsigned long)(new), sizeof(*(ptr))));			\
 })
 
+#ifdef arch_cmpxchg_relaxed
+INSTR_CMPXCHG(_relaxed)
+#define cmpxchg_relaxed(ptr, old, new)					\
+({									\
+	((__typeof__(*(ptr)))cmpxchg_relaxed_size((ptr),		\
+		(unsigned long)(old), (unsigned long)(new), 		\
+		sizeof(*(ptr))));					\
+})
+#endif
+
+#ifdef arch_cmpxchg_acquire
+INSTR_CMPXCHG(_acquire)
+#define cmpxchg_acquire(ptr, old, new)					\
+({									\
+	((__typeof__(*(ptr)))cmpxchg_acquire_size((ptr),		\
+		(unsigned long)(old), (unsigned long)(new), 		\
+		sizeof(*(ptr))));					\
+})
+#endif
+
+#ifdef arch_cmpxchg_release
+INSTR_CMPXCHG(_release)
+#define cmpxchg_release(ptr, old, new)					\
+({									\
+	((__typeof__(*(ptr)))cmpxchg_release_size((ptr),		\
+		(unsigned long)(old), (unsigned long)(new), 		\
+		sizeof(*(ptr))));					\
+})
+#endif
+
 static __always_inline unsigned long
 sync_cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new,
 		  int size)
@@ -428,19 +1220,48 @@ cmpxchg_local_size(volatile void *ptr, unsigned long old, unsigned long new,
 		sizeof(*(ptr))));					\
 })
 
-static __always_inline u64
-cmpxchg64_size(volatile u64 *ptr, u64 old, u64 new)
-{
-	kasan_check_write(ptr, sizeof(*ptr));
-	return arch_cmpxchg64(ptr, old, new);
+#define INSTR_CMPXCHG64(order)						\
+static __always_inline u64						\
+cmpxchg64##order##_size(volatile u64 *ptr, u64 old, u64 new)		\
+{									\
+	kasan_check_write(ptr, sizeof(*ptr));				\
+	return arch_cmpxchg64##order(ptr, old, new);			\
 }
 
+INSTR_CMPXCHG64()
 #define cmpxchg64(ptr, old, new)					\
 ({									\
 	((__typeof__(*(ptr)))cmpxchg64_size((ptr), (u64)(old),		\
 		(u64)(new)));						\
 })
 
+#ifdef arch_cmpxchg64_relaxed
+INSTR_CMPXCHG64(_relaxed)
+#define cmpxchg64_relaxed(ptr, old, new)				\
+({									\
+	((__typeof__(*(ptr)))cmpxchg64_relaxed_size((ptr), (u64)(old),	\
+		(u64)(new)));						\
+})
+#endif
+
+#ifdef arch_cmpxchg64_acquire
+INSTR_CMPXCHG64(_acquire)
+#define cmpxchg64_acquire(ptr, old, new)				\
+({									\
+	((__typeof__(*(ptr)))cmpxchg64_acquire_size((ptr), (u64)(old),	\
+		(u64)(new)));						\
+})
+#endif
+
+#ifdef arch_cmpxchg64_release
+INSTR_CMPXCHG64(_release)
+#define cmpxchg64_release(ptr, old, new)				\
+({									\
+	((__typeof__(*(ptr)))cmpxchg64_release_size((ptr), (u64)(old),	\
+		(u64)(new)));						\
+})
+#endif
+
 static __always_inline u64
 cmpxchg64_local_size(volatile u64 *ptr, u64 old, u64 new)
 {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 103+ messages in thread
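
For context, a hedged usage sketch of the cmpxchg()/cmpxchg_size() pattern in
the hunk above: the callers and variable names below are hypothetical, and
only the cmpxchg() macro itself comes from the patch.

#include <linux/atomic.h>
#include <linux/types.h>

/*
 * Hypothetical callers: the __typeof__ cast in the cmpxchg() macro keeps each
 * caller's own type, while cmpxchg_size() dispatches on sizeof(*(ptr)) and
 * performs the KASAN check for the full access width before calling the
 * arch_cmpxchg() implementation.
 */
static u32 claim_u32(u32 *flag)
{
	return cmpxchg(flag, 0, 1);	/* hits the size-4 case */
}

static u64 claim_u64(u64 *flag)
{
	return cmpxchg(flag, 0, 1);	/* hits the size-8 case */
}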

* [PATCH 2/6] locking/atomic, asm-generic: instrument atomic*andnot*()
  2018-05-04 17:39 ` Mark Rutland
@ 2018-05-04 17:39   ` Mark Rutland
  -1 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-04 17:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, aryabinin, boqun.feng, catalin.marinas, dvyukov,
	mark.rutland, mingo, peterz, will.deacon

We don't currently define instrumentation wrappers for the various forms
of atomic*andnot*(), as these aren't implemented directly by x86.

So that we can instrument architectures which provide these, let's
define wrappers for all the variants of these atomics.
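
For illustration (this expansion is implied by the patch below rather than
added by it), instantiating the wrapper for one ordering variant produces
roughly:

/*
 * Rough expansion of INSTR_ATOMIC_ANDNOT(_relaxed), assuming the
 * architecture provides arch_atomic_andnot_relaxed(); the #ifdef guard
 * means no wrapper is emitted on architectures that lack the variant.
 */
static __always_inline void
atomic_andnot_relaxed(int i, atomic_t *v)
{
	kasan_check_write(v, sizeof(*v));
	arch_atomic_andnot_relaxed(i, v);
}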

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 include/asm-generic/atomic-instrumented.h | 112 ++++++++++++++++++++++++++++++
 1 file changed, 112 insertions(+)

diff --git a/include/asm-generic/atomic-instrumented.h b/include/asm-generic/atomic-instrumented.h
index 26f0e3098442..b1920f0f64ab 100644
--- a/include/asm-generic/atomic-instrumented.h
+++ b/include/asm-generic/atomic-instrumented.h
@@ -498,6 +498,62 @@ INSTR_ATOMIC64_AND(_release)
 #define atomic64_and_release atomic64_and_release
 #endif
 
+#define INSTR_ATOMIC_ANDNOT(order)					\
+static __always_inline void						\
+atomic_andnot##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic_andnot##order(i, v);				\
+}
+
+#ifdef arch_atomic_andnot
+INSTR_ATOMIC_ANDNOT()
+#define atomic_andnot atomic_andnot
+#endif
+
+#ifdef arch_atomic_andnot_relaxed
+INSTR_ATOMIC_ANDNOT(_relaxed)
+#define atomic_andnot_relaxed atomic_andnot_relaxed
+#endif
+
+#ifdef arch_atomic_andnot_acquire
+INSTR_ATOMIC_ANDNOT(_acquire)
+#define atomic_andnot_acquire atomic_andnot_acquire
+#endif
+
+#ifdef arch_atomic_andnot_release
+INSTR_ATOMIC_ANDNOT(_release)
+#define atomic_andnot_release atomic_andnot_release
+#endif
+
+#define INSTR_ATOMIC64_ANDNOT(order)					\
+static __always_inline void						\
+atomic64_andnot##order(s64 i, atomic64_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic64_andnot##order(i, v);				\
+}
+
+#ifdef arch_atomic64_andnot
+INSTR_ATOMIC64_ANDNOT()
+#define atomic64_andnot atomic64_andnot
+#endif
+
+#ifdef arch_atomic64_andnot_relaxed
+INSTR_ATOMIC64_ANDNOT(_relaxed)
+#define atomic64_andnot_relaxed atomic64_andnot_relaxed
+#endif
+
+#ifdef arch_atomic64_andnot_acquire
+INSTR_ATOMIC64_ANDNOT(_acquire)
+#define atomic64_andnot_acquire atomic64_andnot_acquire
+#endif
+
+#ifdef arch_atomic64_andnot_release
+INSTR_ATOMIC64_ANDNOT(_release)
+#define atomic64_andnot_release atomic64_andnot_release
+#endif
+
 #define INSTR_ATOMIC_OR(order)						\
 static __always_inline void						\
 atomic_or##order(int i, atomic_t *v)					\
@@ -984,6 +1040,62 @@ INSTR_ATOMIC64_FETCH_AND(_release)
 #define atomic64_fetch_and_release atomic64_fetch_and_release
 #endif
 
+#define INSTR_ATOMIC_FETCH_ANDNOT(order)				\
+static __always_inline int						\
+atomic_fetch_andnot##order(int i, atomic_t *v)				\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic_fetch_andnot##order(i, v);			\
+}
+
+#ifdef arch_atomic_fetch_andnot
+INSTR_ATOMIC_FETCH_ANDNOT()
+#define atomic_fetch_andnot atomic_fetch_andnot
+#endif
+
+#ifdef arch_atomic_fetch_andnot_relaxed
+INSTR_ATOMIC_FETCH_ANDNOT(_relaxed)
+#define atomic_fetch_andnot_relaxed atomic_fetch_andnot_relaxed
+#endif
+
+#ifdef arch_atomic_fetch_andnot_acquire
+INSTR_ATOMIC_FETCH_ANDNOT(_acquire)
+#define atomic_fetch_andnot_acquire atomic_fetch_andnot_acquire
+#endif
+
+#ifdef arch_atomic_fetch_andnot_release
+INSTR_ATOMIC_FETCH_ANDNOT(_release)
+#define atomic_fetch_andnot_release atomic_fetch_andnot_release
+#endif
+
+#define INSTR_ATOMIC64_FETCH_ANDNOT(order)				\
+static __always_inline s64						\
+atomic64_fetch_andnot##order(s64 i, atomic64_t *v)			\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic64_fetch_andnot##order(i, v);			\
+}
+
+#ifdef arch_atomic64_fetch_andnot
+INSTR_ATOMIC64_FETCH_ANDNOT()
+#define atomic64_fetch_andnot atomic64_fetch_andnot
+#endif
+
+#ifdef arch_atomic64_fetch_andnot_relaxed
+INSTR_ATOMIC64_FETCH_ANDNOT(_relaxed)
+#define atomic64_fetch_andnot_relaxed atomic64_fetch_andnot_relaxed
+#endif
+
+#ifdef arch_atomic64_fetch_andnot_acquire
+INSTR_ATOMIC64_FETCH_ANDNOT(_acquire)
+#define atomic64_fetch_andnot_acquire atomic64_fetch_andnot_acquire
+#endif
+
+#ifdef arch_atomic64_fetch_andnot_release
+INSTR_ATOMIC64_FETCH_ANDNOT(_release)
+#define atomic64_fetch_andnot_release atomic64_fetch_andnot_release
+#endif
+
 #define INSTR_ATOMIC_FETCH_OR(order)					\
 static __always_inline int						\
 atomic_fetch_or##order(int i, atomic_t *v)				\
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH 3/6] arm64: use <linux/atomic.h> for cmpxchg
  2018-05-04 17:39 ` Mark Rutland
@ 2018-05-04 17:39   ` Mark Rutland
  -1 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-04 17:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, aryabinin, boqun.feng, catalin.marinas, dvyukov,
	mark.rutland, mingo, peterz, will.deacon

Currently a number of arm64-specific files include <asm/cmpxchg.h> for
the definition of the cmpxchg helpers. This works fine today, but won't
when we switch over to instrumented atomics, and as noted in
Documentation/core-api/atomic_ops.rst:

  If someone wants to use xchg(), cmpxchg() and their variants,
  linux/atomic.h should be included rather than asm/cmpxchg.h, unless
  the code is in arch/* and can take care of itself.

... so let's switch to <linux/atomic.h> for these definitions.
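
As a hypothetical example of the rule quoted above (not taken from the tree),
code outside arch/ that wants the cmpxchg helpers would simply do:

#include <linux/atomic.h>	/* rather than <asm/cmpxchg.h> */
#include <linux/types.h>

static bool try_take(unsigned long *lock)
{
	/* xchg()/cmpxchg() and their variants come in via <linux/atomic.h> */
	return cmpxchg(lock, 0, 1) == 0;
}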

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/pgtable.h     | 2 +-
 arch/arm64/include/asm/sync_bitops.h | 2 +-
 arch/arm64/mm/fault.c                | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 7c4c8f318ba9..c797c0fbbce2 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -39,8 +39,8 @@
 
 #ifndef __ASSEMBLY__
 
-#include <asm/cmpxchg.h>
 #include <asm/fixmap.h>
+#include <linux/atomic.h>
 #include <linux/mmdebug.h>
 #include <linux/mm_types.h>
 #include <linux/sched.h>
diff --git a/arch/arm64/include/asm/sync_bitops.h b/arch/arm64/include/asm/sync_bitops.h
index eee31a9f72a5..24ed8f445b8b 100644
--- a/arch/arm64/include/asm/sync_bitops.h
+++ b/arch/arm64/include/asm/sync_bitops.h
@@ -3,7 +3,7 @@
 #define __ASM_SYNC_BITOPS_H__
 
 #include <asm/bitops.h>
-#include <asm/cmpxchg.h>
+#include <linux/atomic.h>
 
 /* sync_bitops functions are equivalent to the SMP implementation of the
  * original functions, independently from CONFIG_SMP being defined.
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 4165485e8b6e..bfbc695e2ea2 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -18,6 +18,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/atomic.h>
 #include <linux/extable.h>
 #include <linux/signal.h>
 #include <linux/mm.h>
@@ -34,7 +35,6 @@
 #include <linux/hugetlb.h>
 
 #include <asm/bug.h>
-#include <asm/cmpxchg.h>
 #include <asm/cpufeature.h>
 #include <asm/exception.h>
 #include <asm/debug-monitors.h>
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH 4/6] arm64: fix assembly constraints for cmpxchg
  2018-05-04 17:39 ` Mark Rutland
@ 2018-05-04 17:39   ` Mark Rutland
  -1 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-04 17:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, aryabinin, boqun.feng, catalin.marinas, dvyukov,
	mark.rutland, mingo, peterz, will.deacon

Our LL/SC cmpxchg assembly uses "Lr" as the constraint for old, which
allows either an integer constant suitable for a 64-bit logical
operation, or a register.

However, this assembly is also used for 32-bit cases (where we
explicitly add a 'w' prefix to the output format), where the set of
valid immediates differs, and we should use a 'Kr' constraint.

In some cases, this can result in build failures, when GCC selects an
immediate which is valid for a 64-bit logical operation, but we try to
assemble a 32-bit logical operation:

[mark@lakrids:~/src/linux]% uselinaro 17.05 make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- net/sunrpc/auth_gss/svcauth_gss.o
  CHK     include/config/kernel.release
  CHK     include/generated/uapi/linux/version.h
  CHK     include/generated/utsrelease.h
  CHK     include/generated/bounds.h
  CHK     include/generated/timeconst.h
  CHK     include/generated/asm-offsets.h
  CALL    scripts/checksyscalls.sh
  CHK     scripts/mod/devicetable-offsets.h
  CC      net/sunrpc/auth_gss/svcauth_gss.o
/tmp/ccj04KVh.s: Assembler messages:
/tmp/ccj04KVh.s:325: Error: immediate out of range at operand 3 -- `eor w2,w1,4294967295'
scripts/Makefile.build:324: recipe for target 'net/sunrpc/auth_gss/svcauth_gss.o' failed
make[1]: *** [net/sunrpc/auth_gss/svcauth_gss.o] Error 1
Makefile:1704: recipe for target 'net/sunrpc/auth_gss/svcauth_gss.o' failed
make: *** [net/sunrpc/auth_gss/svcauth_gss.o] Error 2

Note that today we largely avoid the specific failure above because GCC
happens to already have the value in a register, and in most cases uses
that rather than generating the immediate. The following code added to
an arbitrary file will cause the same failure:

unsigned int test_cmpxchg(unsigned int *l)
{
       return cmpxchg(l, -1, 0);
}

While it would seem that we could conditionally use the 'K' constraint,
this seems to be handled erroneously by GCC (at least versions 6.3 and
7.1), with the same immediates being used, despite not being permitted
for 32-bit logical operations.

Thus we must avoid the use of an immediate in order to prevent failures
as above.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/atomic_ll_sc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/atomic_ll_sc.h b/arch/arm64/include/asm/atomic_ll_sc.h
index f5a2d09afb38..3175f4982682 100644
--- a/arch/arm64/include/asm/atomic_ll_sc.h
+++ b/arch/arm64/include/asm/atomic_ll_sc.h
@@ -267,7 +267,7 @@ __LL_SC_PREFIX(__cmpxchg_case_##name(volatile void *ptr,		\
 	"2:"								\
 	: [tmp] "=&r" (tmp), [oldval] "=&r" (oldval),			\
 	  [v] "+Q" (*(unsigned long *)ptr)				\
-	: [old] "Lr" (old), [new] "r" (new)				\
+	: [old] "r" (old), [new] "r" (new)				\
 	: cl);								\
 									\
 	return oldval;							\
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH 5/6] arm64: use instrumented atomics
  2018-05-04 17:39 ` Mark Rutland
@ 2018-05-04 17:39   ` Mark Rutland
  -1 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-04 17:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, aryabinin, boqun.feng, catalin.marinas, dvyukov,
	mark.rutland, mingo, peterz, will.deacon

As our atomics are written in inline assembly, they don't get
instrumented when we enable KASAN, and thus we can miss when they are
used on erroneous memory locations.

As with x86, let's use atomic-instrumented.h to give arm64 instrumented
atomics. This requires that we add an arch_ prefix to our atomic names,
but other than naming, no changes are made to the atomics themselves.

Due to include dependencies, we must move our definition of sync_cmpxchg
into <asm/cmpxchg.h>, but this is not harmful.

There should be no functional change as a result of this patch when
CONFIG_KASAN is not selected.
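
As a hedged illustration of what the instrumentation catches (the structure
and call site below are made up; only the atomic API is real), a bug along
these lines should now trigger a KASAN report on arm64:

#include <linux/atomic.h>
#include <linux/slab.h>

struct foo {
	atomic_t refcount;
};

static void demo(void)
{
	struct foo *f = kmalloc(sizeof(*f), GFP_KERNEL);

	if (!f)
		return;

	atomic_set(&f->refcount, 1);
	kfree(f);

	/*
	 * Use-after-free: atomic_inc() now performs kasan_check_write() on
	 * &f->refcount before the arch operation, so KASAN can flag the
	 * access to freed memory.
	 */
	atomic_inc(&f->refcount);
}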

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/atomic.h       | 299 +++++++++++++++++-----------------
 arch/arm64/include/asm/atomic_ll_sc.h |  28 ++--
 arch/arm64/include/asm/atomic_lse.h   |  43 ++---
 arch/arm64/include/asm/cmpxchg.h      |  25 +--
 arch/arm64/include/asm/sync_bitops.h  |   1 -
 5 files changed, 202 insertions(+), 194 deletions(-)

diff --git a/arch/arm64/include/asm/atomic.h b/arch/arm64/include/asm/atomic.h
index c0235e0ff849..aefdce33f81a 100644
--- a/arch/arm64/include/asm/atomic.h
+++ b/arch/arm64/include/asm/atomic.h
@@ -53,158 +53,161 @@
 
 #define ATOMIC_INIT(i)	{ (i) }
 
-#define atomic_read(v)			READ_ONCE((v)->counter)
-#define atomic_set(v, i)		WRITE_ONCE(((v)->counter), (i))
-
-#define atomic_add_return_relaxed	atomic_add_return_relaxed
-#define atomic_add_return_acquire	atomic_add_return_acquire
-#define atomic_add_return_release	atomic_add_return_release
-#define atomic_add_return		atomic_add_return
-
-#define atomic_inc_return_relaxed(v)	atomic_add_return_relaxed(1, (v))
-#define atomic_inc_return_acquire(v)	atomic_add_return_acquire(1, (v))
-#define atomic_inc_return_release(v)	atomic_add_return_release(1, (v))
-#define atomic_inc_return(v)		atomic_add_return(1, (v))
-
-#define atomic_sub_return_relaxed	atomic_sub_return_relaxed
-#define atomic_sub_return_acquire	atomic_sub_return_acquire
-#define atomic_sub_return_release	atomic_sub_return_release
-#define atomic_sub_return		atomic_sub_return
-
-#define atomic_dec_return_relaxed(v)	atomic_sub_return_relaxed(1, (v))
-#define atomic_dec_return_acquire(v)	atomic_sub_return_acquire(1, (v))
-#define atomic_dec_return_release(v)	atomic_sub_return_release(1, (v))
-#define atomic_dec_return(v)		atomic_sub_return(1, (v))
-
-#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
-#define atomic_fetch_add_acquire	atomic_fetch_add_acquire
-#define atomic_fetch_add_release	atomic_fetch_add_release
-#define atomic_fetch_add		atomic_fetch_add
-
-#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
-#define atomic_fetch_sub_acquire	atomic_fetch_sub_acquire
-#define atomic_fetch_sub_release	atomic_fetch_sub_release
-#define atomic_fetch_sub		atomic_fetch_sub
-
-#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
-#define atomic_fetch_and_acquire	atomic_fetch_and_acquire
-#define atomic_fetch_and_release	atomic_fetch_and_release
-#define atomic_fetch_and		atomic_fetch_and
-
-#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot_relaxed
-#define atomic_fetch_andnot_acquire	atomic_fetch_andnot_acquire
-#define atomic_fetch_andnot_release	atomic_fetch_andnot_release
-#define atomic_fetch_andnot		atomic_fetch_andnot
-
-#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
-#define atomic_fetch_or_acquire		atomic_fetch_or_acquire
-#define atomic_fetch_or_release		atomic_fetch_or_release
-#define atomic_fetch_or			atomic_fetch_or
-
-#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
-#define atomic_fetch_xor_acquire	atomic_fetch_xor_acquire
-#define atomic_fetch_xor_release	atomic_fetch_xor_release
-#define atomic_fetch_xor		atomic_fetch_xor
-
-#define atomic_xchg_relaxed(v, new)	xchg_relaxed(&((v)->counter), (new))
-#define atomic_xchg_acquire(v, new)	xchg_acquire(&((v)->counter), (new))
-#define atomic_xchg_release(v, new)	xchg_release(&((v)->counter), (new))
-#define atomic_xchg(v, new)		xchg(&((v)->counter), (new))
-
-#define atomic_cmpxchg_relaxed(v, old, new)				\
-	cmpxchg_relaxed(&((v)->counter), (old), (new))
-#define atomic_cmpxchg_acquire(v, old, new)				\
-	cmpxchg_acquire(&((v)->counter), (old), (new))
-#define atomic_cmpxchg_release(v, old, new)				\
-	cmpxchg_release(&((v)->counter), (old), (new))
-#define atomic_cmpxchg(v, old, new)	cmpxchg(&((v)->counter), (old), (new))
-
-#define atomic_inc(v)			atomic_add(1, (v))
-#define atomic_dec(v)			atomic_sub(1, (v))
-#define atomic_inc_and_test(v)		(atomic_inc_return(v) == 0)
-#define atomic_dec_and_test(v)		(atomic_dec_return(v) == 0)
-#define atomic_sub_and_test(i, v)	(atomic_sub_return((i), (v)) == 0)
-#define atomic_add_negative(i, v)	(atomic_add_return((i), (v)) < 0)
-#define __atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
-#define atomic_andnot			atomic_andnot
+#define arch_atomic_read(v)			READ_ONCE((v)->counter)
+#define arch_atomic_set(v, i)			WRITE_ONCE(((v)->counter), (i))
+
+#define arch_atomic_add_return_relaxed		arch_atomic_add_return_relaxed
+#define arch_atomic_add_return_acquire		arch_atomic_add_return_acquire
+#define arch_atomic_add_return_release		arch_atomic_add_return_release
+#define arch_atomic_add_return			arch_atomic_add_return
+
+#define arch_atomic_inc_return_relaxed(v)	arch_atomic_add_return_relaxed(1, (v))
+#define arch_atomic_inc_return_acquire(v)	arch_atomic_add_return_acquire(1, (v))
+#define arch_atomic_inc_return_release(v)	arch_atomic_add_return_release(1, (v))
+#define arch_atomic_inc_return(v)		arch_atomic_add_return(1, (v))
+
+#define arch_atomic_sub_return_relaxed		arch_atomic_sub_return_relaxed
+#define arch_atomic_sub_return_acquire		arch_atomic_sub_return_acquire
+#define arch_atomic_sub_return_release		arch_atomic_sub_return_release
+#define arch_atomic_sub_return			arch_atomic_sub_return
+
+#define arch_atomic_dec_return_relaxed(v)	arch_atomic_sub_return_relaxed(1, (v))
+#define arch_atomic_dec_return_acquire(v)	arch_atomic_sub_return_acquire(1, (v))
+#define arch_atomic_dec_return_release(v)	arch_atomic_sub_return_release(1, (v))
+#define arch_atomic_dec_return(v)		arch_atomic_sub_return(1, (v))
+
+#define arch_atomic_fetch_add_relaxed		arch_atomic_fetch_add_relaxed
+#define arch_atomic_fetch_add_acquire		arch_atomic_fetch_add_acquire
+#define arch_atomic_fetch_add_release		arch_atomic_fetch_add_release
+#define arch_atomic_fetch_add			arch_atomic_fetch_add
+
+#define arch_atomic_fetch_sub_relaxed		arch_atomic_fetch_sub_relaxed
+#define arch_atomic_fetch_sub_acquire		arch_atomic_fetch_sub_acquire
+#define arch_atomic_fetch_sub_release		arch_atomic_fetch_sub_release
+#define arch_atomic_fetch_sub			arch_atomic_fetch_sub
+
+#define arch_atomic_fetch_and_relaxed		arch_atomic_fetch_and_relaxed
+#define arch_atomic_fetch_and_acquire		arch_atomic_fetch_and_acquire
+#define arch_atomic_fetch_and_release		arch_atomic_fetch_and_release
+#define arch_atomic_fetch_and			arch_atomic_fetch_and
+
+#define arch_atomic_fetch_andnot_relaxed	arch_atomic_fetch_andnot_relaxed
+#define arch_atomic_fetch_andnot_acquire	arch_atomic_fetch_andnot_acquire
+#define arch_atomic_fetch_andnot_release	arch_atomic_fetch_andnot_release
+#define arch_atomic_fetch_andnot		arch_atomic_fetch_andnot
+
+#define arch_atomic_fetch_or_relaxed		arch_atomic_fetch_or_relaxed
+#define arch_atomic_fetch_or_acquire		arch_atomic_fetch_or_acquire
+#define arch_atomic_fetch_or_release		arch_atomic_fetch_or_release
+#define arch_atomic_fetch_or			arch_atomic_fetch_or
+
+#define arch_atomic_fetch_xor_relaxed		arch_atomic_fetch_xor_relaxed
+#define arch_atomic_fetch_xor_acquire		arch_atomic_fetch_xor_acquire
+#define arch_atomic_fetch_xor_release		arch_atomic_fetch_xor_release
+#define arch_atomic_fetch_xor			arch_atomic_fetch_xor
+
+#define arch_atomic_xchg_relaxed(v, new)	xchg_relaxed(&((v)->counter), (new))
+#define arch_atomic_xchg_acquire(v, new)	xchg_acquire(&((v)->counter), (new))
+#define arch_atomic_xchg_release(v, new)	xchg_release(&((v)->counter), (new))
+#define arch_atomic_xchg(v, new)		xchg(&((v)->counter), (new))
+
+#define arch_atomic_cmpxchg_relaxed(v, old, new)			\
+	arch_cmpxchg_relaxed(&((v)->counter), (old), (new))
+#define arch_atomic_cmpxchg_acquire(v, old, new)			\
+	arch_cmpxchg_acquire(&((v)->counter), (old), (new))
+#define arch_atomic_cmpxchg_release(v, old, new)			\
+	arch_cmpxchg_release(&((v)->counter), (old), (new))
+#define arch_atomic_cmpxchg(v, old, new)				\
+	arch_cmpxchg(&((v)->counter), (old), (new))
+
+#define arch_atomic_inc(v)			arch_atomic_add(1, (v))
+#define arch_atomic_dec(v)			arch_atomic_sub(1, (v))
+#define arch_atomic_inc_and_test(v)		(arch_atomic_inc_return(v) == 0)
+#define arch_atomic_dec_and_test(v)		(arch_atomic_dec_return(v) == 0)
+#define arch_atomic_sub_and_test(i, v)		(arch_atomic_sub_return((i), (v)) == 0)
+#define arch_atomic_add_negative(i, v)		(arch_atomic_add_return((i), (v)) < 0)
+#define __arch_atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
+#define arch_atomic_andnot			arch_atomic_andnot
 
 /*
  * 64-bit atomic operations.
  */
-#define ATOMIC64_INIT			ATOMIC_INIT
-#define atomic64_read			atomic_read
-#define atomic64_set			atomic_set
-
-#define atomic64_add_return_relaxed	atomic64_add_return_relaxed
-#define atomic64_add_return_acquire	atomic64_add_return_acquire
-#define atomic64_add_return_release	atomic64_add_return_release
-#define atomic64_add_return		atomic64_add_return
-
-#define atomic64_inc_return_relaxed(v)	atomic64_add_return_relaxed(1, (v))
-#define atomic64_inc_return_acquire(v)	atomic64_add_return_acquire(1, (v))
-#define atomic64_inc_return_release(v)	atomic64_add_return_release(1, (v))
-#define atomic64_inc_return(v)		atomic64_add_return(1, (v))
-
-#define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
-#define atomic64_sub_return_acquire	atomic64_sub_return_acquire
-#define atomic64_sub_return_release	atomic64_sub_return_release
-#define atomic64_sub_return		atomic64_sub_return
-
-#define atomic64_dec_return_relaxed(v)	atomic64_sub_return_relaxed(1, (v))
-#define atomic64_dec_return_acquire(v)	atomic64_sub_return_acquire(1, (v))
-#define atomic64_dec_return_release(v)	atomic64_sub_return_release(1, (v))
-#define atomic64_dec_return(v)		atomic64_sub_return(1, (v))
-
-#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
-#define atomic64_fetch_add_acquire	atomic64_fetch_add_acquire
-#define atomic64_fetch_add_release	atomic64_fetch_add_release
-#define atomic64_fetch_add		atomic64_fetch_add
-
-#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
-#define atomic64_fetch_sub_acquire	atomic64_fetch_sub_acquire
-#define atomic64_fetch_sub_release	atomic64_fetch_sub_release
-#define atomic64_fetch_sub		atomic64_fetch_sub
-
-#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
-#define atomic64_fetch_and_acquire	atomic64_fetch_and_acquire
-#define atomic64_fetch_and_release	atomic64_fetch_and_release
-#define atomic64_fetch_and		atomic64_fetch_and
-
-#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot_relaxed
-#define atomic64_fetch_andnot_acquire	atomic64_fetch_andnot_acquire
-#define atomic64_fetch_andnot_release	atomic64_fetch_andnot_release
-#define atomic64_fetch_andnot		atomic64_fetch_andnot
-
-#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
-#define atomic64_fetch_or_acquire	atomic64_fetch_or_acquire
-#define atomic64_fetch_or_release	atomic64_fetch_or_release
-#define atomic64_fetch_or		atomic64_fetch_or
-
-#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
-#define atomic64_fetch_xor_acquire	atomic64_fetch_xor_acquire
-#define atomic64_fetch_xor_release	atomic64_fetch_xor_release
-#define atomic64_fetch_xor		atomic64_fetch_xor
-
-#define atomic64_xchg_relaxed		atomic_xchg_relaxed
-#define atomic64_xchg_acquire		atomic_xchg_acquire
-#define atomic64_xchg_release		atomic_xchg_release
-#define atomic64_xchg			atomic_xchg
-
-#define atomic64_cmpxchg_relaxed	atomic_cmpxchg_relaxed
-#define atomic64_cmpxchg_acquire	atomic_cmpxchg_acquire
-#define atomic64_cmpxchg_release	atomic_cmpxchg_release
-#define atomic64_cmpxchg		atomic_cmpxchg
-
-#define atomic64_inc(v)			atomic64_add(1, (v))
-#define atomic64_dec(v)			atomic64_sub(1, (v))
-#define atomic64_inc_and_test(v)	(atomic64_inc_return(v) == 0)
-#define atomic64_dec_and_test(v)	(atomic64_dec_return(v) == 0)
-#define atomic64_sub_and_test(i, v)	(atomic64_sub_return((i), (v)) == 0)
-#define atomic64_add_negative(i, v)	(atomic64_add_return((i), (v)) < 0)
-#define atomic64_add_unless(v, a, u)	(___atomic_add_unless(v, a, u, 64) != u)
-#define atomic64_andnot			atomic64_andnot
-
-#define atomic64_inc_not_zero(v)	atomic64_add_unless((v), 1, 0)
+#define ATOMIC64_INIT				ATOMIC_INIT
+#define arch_atomic64_read			arch_atomic_read
+#define arch_atomic64_set			arch_atomic_set
+
+#define arch_atomic64_add_return_relaxed	arch_atomic64_add_return_relaxed
+#define arch_atomic64_add_return_acquire	arch_atomic64_add_return_acquire
+#define arch_atomic64_add_return_release	arch_atomic64_add_return_release
+#define arch_atomic64_add_return		arch_atomic64_add_return
+
+#define arch_atomic64_inc_return_relaxed(v)	arch_atomic64_add_return_relaxed(1, (v))
+#define arch_atomic64_inc_return_acquire(v)	arch_atomic64_add_return_acquire(1, (v))
+#define arch_atomic64_inc_return_release(v)	arch_atomic64_add_return_release(1, (v))
+#define arch_atomic64_inc_return(v)		arch_atomic64_add_return(1, (v))
+
+#define arch_atomic64_sub_return_relaxed	arch_atomic64_sub_return_relaxed
+#define arch_atomic64_sub_return_acquire	arch_atomic64_sub_return_acquire
+#define arch_atomic64_sub_return_release	arch_atomic64_sub_return_release
+#define arch_atomic64_sub_return		arch_atomic64_sub_return
+
+#define arch_atomic64_dec_return_relaxed(v)	arch_atomic64_sub_return_relaxed(1, (v))
+#define arch_atomic64_dec_return_acquire(v)	arch_atomic64_sub_return_acquire(1, (v))
+#define arch_atomic64_dec_return_release(v)	arch_atomic64_sub_return_release(1, (v))
+#define arch_atomic64_dec_return(v)		arch_atomic64_sub_return(1, (v))
+
+#define arch_atomic64_fetch_add_relaxed		arch_atomic64_fetch_add_relaxed
+#define arch_atomic64_fetch_add_acquire		arch_atomic64_fetch_add_acquire
+#define arch_atomic64_fetch_add_release		arch_atomic64_fetch_add_release
+#define arch_atomic64_fetch_add			arch_atomic64_fetch_add
+
+#define arch_atomic64_fetch_sub_relaxed		arch_atomic64_fetch_sub_relaxed
+#define arch_atomic64_fetch_sub_acquire		arch_atomic64_fetch_sub_acquire
+#define arch_atomic64_fetch_sub_release		arch_atomic64_fetch_sub_release
+#define arch_atomic64_fetch_sub			arch_atomic64_fetch_sub
+
+#define arch_atomic64_fetch_and_relaxed		arch_atomic64_fetch_and_relaxed
+#define arch_atomic64_fetch_and_acquire		arch_atomic64_fetch_and_acquire
+#define arch_atomic64_fetch_and_release		arch_atomic64_fetch_and_release
+#define arch_atomic64_fetch_and			arch_atomic64_fetch_and
+
+#define arch_atomic64_fetch_andnot_relaxed	arch_atomic64_fetch_andnot_relaxed
+#define arch_atomic64_fetch_andnot_acquire	arch_atomic64_fetch_andnot_acquire
+#define arch_atomic64_fetch_andnot_release	arch_atomic64_fetch_andnot_release
+#define arch_atomic64_fetch_andnot		arch_atomic64_fetch_andnot
+
+#define arch_atomic64_fetch_or_relaxed		arch_atomic64_fetch_or_relaxed
+#define arch_atomic64_fetch_or_acquire		arch_atomic64_fetch_or_acquire
+#define arch_atomic64_fetch_or_release		arch_atomic64_fetch_or_release
+#define arch_atomic64_fetch_or			arch_atomic64_fetch_or
+
+#define arch_atomic64_fetch_xor_relaxed		arch_atomic64_fetch_xor_relaxed
+#define arch_atomic64_fetch_xor_acquire		arch_atomic64_fetch_xor_acquire
+#define arch_atomic64_fetch_xor_release		arch_atomic64_fetch_xor_release
+#define arch_atomic64_fetch_xor			arch_atomic64_fetch_xor
+
+#define arch_atomic64_xchg_relaxed		arch_atomic_xchg_relaxed
+#define arch_atomic64_xchg_acquire		arch_atomic_xchg_acquire
+#define arch_atomic64_xchg_release		arch_atomic_xchg_release
+#define arch_atomic64_xchg			arch_atomic_xchg
+
+#define arch_atomic64_cmpxchg_relaxed		arch_atomic_cmpxchg_relaxed
+#define arch_atomic64_cmpxchg_acquire		arch_atomic_cmpxchg_acquire
+#define arch_atomic64_cmpxchg_release		arch_atomic_cmpxchg_release
+#define arch_atomic64_cmpxchg			arch_atomic_cmpxchg
+
+#define arch_atomic64_inc(v)			arch_atomic64_add(1, (v))
+#define arch_atomic64_dec(v)			arch_atomic64_sub(1, (v))
+#define arch_atomic64_inc_and_test(v)		(arch_atomic64_inc_return(v) == 0)
+#define arch_atomic64_dec_and_test(v)		(arch_atomic64_dec_return(v) == 0)
+#define arch_atomic64_sub_and_test(i, v)	(arch_atomic64_sub_return((i), (v)) == 0)
+#define arch_atomic64_add_negative(i, v)	(arch_atomic64_add_return((i), (v)) < 0)
+#define arch_atomic64_add_unless(v, a, u)	(___atomic_add_unless(v, a, u, 64) != u)
+#define arch_atomic64_andnot			arch_atomic64_andnot
+
+#define arch_atomic64_inc_not_zero(v)		arch_atomic64_add_unless((v), 1, 0)
+
+#include <asm-generic/atomic-instrumented.h>
 
 #endif
 #endif
diff --git a/arch/arm64/include/asm/atomic_ll_sc.h b/arch/arm64/include/asm/atomic_ll_sc.h
index 3175f4982682..c28d5a824104 100644
--- a/arch/arm64/include/asm/atomic_ll_sc.h
+++ b/arch/arm64/include/asm/atomic_ll_sc.h
@@ -39,7 +39,7 @@
 
 #define ATOMIC_OP(op, asm_op)						\
 __LL_SC_INLINE void							\
-__LL_SC_PREFIX(atomic_##op(int i, atomic_t *v))				\
+__LL_SC_PREFIX(arch_atomic_##op(int i, atomic_t *v))			\
 {									\
 	unsigned long tmp;						\
 	int result;							\
@@ -53,11 +53,11 @@ __LL_SC_PREFIX(atomic_##op(int i, atomic_t *v))				\
 	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)		\
 	: "Ir" (i));							\
 }									\
-__LL_SC_EXPORT(atomic_##op);
+__LL_SC_EXPORT(arch_atomic_##op);
 
 #define ATOMIC_OP_RETURN(name, mb, acq, rel, cl, op, asm_op)		\
 __LL_SC_INLINE int							\
-__LL_SC_PREFIX(atomic_##op##_return##name(int i, atomic_t *v))		\
+__LL_SC_PREFIX(arch_atomic_##op##_return##name(int i, atomic_t *v))	\
 {									\
 	unsigned long tmp;						\
 	int result;							\
@@ -75,11 +75,11 @@ __LL_SC_PREFIX(atomic_##op##_return##name(int i, atomic_t *v))		\
 									\
 	return result;							\
 }									\
-__LL_SC_EXPORT(atomic_##op##_return##name);
+__LL_SC_EXPORT(arch_atomic_##op##_return##name);
 
 #define ATOMIC_FETCH_OP(name, mb, acq, rel, cl, op, asm_op)		\
 __LL_SC_INLINE int							\
-__LL_SC_PREFIX(atomic_fetch_##op##name(int i, atomic_t *v))		\
+__LL_SC_PREFIX(arch_atomic_fetch_##op##name(int i, atomic_t *v))	\
 {									\
 	unsigned long tmp;						\
 	int val, result;						\
@@ -97,7 +97,7 @@ __LL_SC_PREFIX(atomic_fetch_##op##name(int i, atomic_t *v))		\
 									\
 	return result;							\
 }									\
-__LL_SC_EXPORT(atomic_fetch_##op##name);
+__LL_SC_EXPORT(arch_atomic_fetch_##op##name);
 
 #define ATOMIC_OPS(...)							\
 	ATOMIC_OP(__VA_ARGS__)						\
@@ -133,7 +133,7 @@ ATOMIC_OPS(xor, eor)
 
 #define ATOMIC64_OP(op, asm_op)						\
 __LL_SC_INLINE void							\
-__LL_SC_PREFIX(atomic64_##op(long i, atomic64_t *v))			\
+__LL_SC_PREFIX(arch_atomic64_##op(long i, atomic64_t *v))		\
 {									\
 	long result;							\
 	unsigned long tmp;						\
@@ -147,11 +147,11 @@ __LL_SC_PREFIX(atomic64_##op(long i, atomic64_t *v))			\
 	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)		\
 	: "Ir" (i));							\
 }									\
-__LL_SC_EXPORT(atomic64_##op);
+__LL_SC_EXPORT(arch_atomic64_##op);
 
 #define ATOMIC64_OP_RETURN(name, mb, acq, rel, cl, op, asm_op)		\
 __LL_SC_INLINE long							\
-__LL_SC_PREFIX(atomic64_##op##_return##name(long i, atomic64_t *v))	\
+__LL_SC_PREFIX(arch_atomic64_##op##_return##name(long i, atomic64_t *v))\
 {									\
 	long result;							\
 	unsigned long tmp;						\
@@ -169,11 +169,11 @@ __LL_SC_PREFIX(atomic64_##op##_return##name(long i, atomic64_t *v))	\
 									\
 	return result;							\
 }									\
-__LL_SC_EXPORT(atomic64_##op##_return##name);
+__LL_SC_EXPORT(arch_atomic64_##op##_return##name);
 
 #define ATOMIC64_FETCH_OP(name, mb, acq, rel, cl, op, asm_op)		\
 __LL_SC_INLINE long							\
-__LL_SC_PREFIX(atomic64_fetch_##op##name(long i, atomic64_t *v))	\
+__LL_SC_PREFIX(arch_atomic64_fetch_##op##name(long i, atomic64_t *v))	\
 {									\
 	long result, val;						\
 	unsigned long tmp;						\
@@ -191,7 +191,7 @@ __LL_SC_PREFIX(atomic64_fetch_##op##name(long i, atomic64_t *v))	\
 									\
 	return result;							\
 }									\
-__LL_SC_EXPORT(atomic64_fetch_##op##name);
+__LL_SC_EXPORT(arch_atomic64_fetch_##op##name);
 
 #define ATOMIC64_OPS(...)						\
 	ATOMIC64_OP(__VA_ARGS__)					\
@@ -226,7 +226,7 @@ ATOMIC64_OPS(xor, eor)
 #undef ATOMIC64_OP
 
 __LL_SC_INLINE long
-__LL_SC_PREFIX(atomic64_dec_if_positive(atomic64_t *v))
+__LL_SC_PREFIX(arch_atomic64_dec_if_positive(atomic64_t *v))
 {
 	long result;
 	unsigned long tmp;
@@ -246,7 +246,7 @@ __LL_SC_PREFIX(atomic64_dec_if_positive(atomic64_t *v))
 
 	return result;
 }
-__LL_SC_EXPORT(atomic64_dec_if_positive);
+__LL_SC_EXPORT(arch_atomic64_dec_if_positive);
 
 #define __CMPXCHG_CASE(w, sz, name, mb, acq, rel, cl)			\
 __LL_SC_INLINE unsigned long						\
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9ef0797380cb..9a071f71c521 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -25,9 +25,9 @@
 #error "please don't include this file directly"
 #endif
 
-#define __LL_SC_ATOMIC(op)	__LL_SC_CALL(atomic_##op)
+#define __LL_SC_ATOMIC(op)	__LL_SC_CALL(arch_atomic_##op)
 #define ATOMIC_OP(op, asm_op)						\
-static inline void atomic_##op(int i, atomic_t *v)			\
+static inline void arch_atomic_##op(int i, atomic_t *v)			\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -47,7 +47,7 @@ ATOMIC_OP(add, stadd)
 #undef ATOMIC_OP
 
 #define ATOMIC_FETCH_OP(name, mb, op, asm_op, cl...)			\
-static inline int atomic_fetch_##op##name(int i, atomic_t *v)		\
+static inline int arch_atomic_fetch_##op##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -79,7 +79,7 @@ ATOMIC_FETCH_OPS(add, ldadd)
 #undef ATOMIC_FETCH_OPS
 
 #define ATOMIC_OP_ADD_RETURN(name, mb, cl...)				\
-static inline int atomic_add_return##name(int i, atomic_t *v)		\
+static inline int arch_atomic_add_return##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -105,7 +105,7 @@ ATOMIC_OP_ADD_RETURN(        , al, "memory")
 
 #undef ATOMIC_OP_ADD_RETURN
 
-static inline void atomic_and(int i, atomic_t *v)
+static inline void arch_atomic_and(int i, atomic_t *v)
 {
 	register int w0 asm ("w0") = i;
 	register atomic_t *x1 asm ("x1") = v;
@@ -123,7 +123,7 @@ static inline void atomic_and(int i, atomic_t *v)
 }
 
 #define ATOMIC_FETCH_OP_AND(name, mb, cl...)				\
-static inline int atomic_fetch_and##name(int i, atomic_t *v)		\
+static inline int arch_atomic_fetch_and##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -149,7 +149,7 @@ ATOMIC_FETCH_OP_AND(        , al, "memory")
 
 #undef ATOMIC_FETCH_OP_AND
 
-static inline void atomic_sub(int i, atomic_t *v)
+static inline void arch_atomic_sub(int i, atomic_t *v)
 {
 	register int w0 asm ("w0") = i;
 	register atomic_t *x1 asm ("x1") = v;
@@ -167,7 +167,7 @@ static inline void atomic_sub(int i, atomic_t *v)
 }
 
 #define ATOMIC_OP_SUB_RETURN(name, mb, cl...)				\
-static inline int atomic_sub_return##name(int i, atomic_t *v)		\
+static inline int arch_atomic_sub_return##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -195,7 +195,7 @@ ATOMIC_OP_SUB_RETURN(        , al, "memory")
 #undef ATOMIC_OP_SUB_RETURN
 
 #define ATOMIC_FETCH_OP_SUB(name, mb, cl...)				\
-static inline int atomic_fetch_sub##name(int i, atomic_t *v)		\
+static inline int arch_atomic_fetch_sub##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -222,9 +222,9 @@ ATOMIC_FETCH_OP_SUB(        , al, "memory")
 #undef ATOMIC_FETCH_OP_SUB
 #undef __LL_SC_ATOMIC
 
-#define __LL_SC_ATOMIC64(op)	__LL_SC_CALL(atomic64_##op)
+#define __LL_SC_ATOMIC64(op)	__LL_SC_CALL(arch_atomic64_##op)
 #define ATOMIC64_OP(op, asm_op)						\
-static inline void atomic64_##op(long i, atomic64_t *v)			\
+static inline void arch_atomic64_##op(long i, atomic64_t *v)		\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -244,7 +244,8 @@ ATOMIC64_OP(add, stadd)
 #undef ATOMIC64_OP
 
 #define ATOMIC64_FETCH_OP(name, mb, op, asm_op, cl...)			\
-static inline long atomic64_fetch_##op##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_fetch_##op##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -276,7 +277,8 @@ ATOMIC64_FETCH_OPS(add, ldadd)
 #undef ATOMIC64_FETCH_OPS
 
 #define ATOMIC64_OP_ADD_RETURN(name, mb, cl...)				\
-static inline long atomic64_add_return##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_add_return##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -302,7 +304,7 @@ ATOMIC64_OP_ADD_RETURN(        , al, "memory")
 
 #undef ATOMIC64_OP_ADD_RETURN
 
-static inline void atomic64_and(long i, atomic64_t *v)
+static inline void arch_atomic64_and(long i, atomic64_t *v)
 {
 	register long x0 asm ("x0") = i;
 	register atomic64_t *x1 asm ("x1") = v;
@@ -320,7 +322,8 @@ static inline void atomic64_and(long i, atomic64_t *v)
 }
 
 #define ATOMIC64_FETCH_OP_AND(name, mb, cl...)				\
-static inline long atomic64_fetch_and##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_fetch_and##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -346,7 +349,7 @@ ATOMIC64_FETCH_OP_AND(        , al, "memory")
 
 #undef ATOMIC64_FETCH_OP_AND
 
-static inline void atomic64_sub(long i, atomic64_t *v)
+static inline void arch_atomic64_sub(long i, atomic64_t *v)
 {
 	register long x0 asm ("x0") = i;
 	register atomic64_t *x1 asm ("x1") = v;
@@ -364,7 +367,8 @@ static inline void atomic64_sub(long i, atomic64_t *v)
 }
 
 #define ATOMIC64_OP_SUB_RETURN(name, mb, cl...)				\
-static inline long atomic64_sub_return##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_sub_return##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -392,7 +396,8 @@ ATOMIC64_OP_SUB_RETURN(        , al, "memory")
 #undef ATOMIC64_OP_SUB_RETURN
 
 #define ATOMIC64_FETCH_OP_SUB(name, mb, cl...)				\
-static inline long atomic64_fetch_sub##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_fetch_sub##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -418,7 +423,7 @@ ATOMIC64_FETCH_OP_SUB(        , al, "memory")
 
 #undef ATOMIC64_FETCH_OP_SUB
 
-static inline long atomic64_dec_if_positive(atomic64_t *v)
+static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
 	register long x0 asm ("x0") = (long)v;
 
diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
index 4f5fd2a36e6e..0f470ffd2d59 100644
--- a/arch/arm64/include/asm/cmpxchg.h
+++ b/arch/arm64/include/asm/cmpxchg.h
@@ -154,18 +154,19 @@ __CMPXCHG_GEN(_mb)
 })
 
 /* cmpxchg */
-#define cmpxchg_relaxed(...)	__cmpxchg_wrapper(    , __VA_ARGS__)
-#define cmpxchg_acquire(...)	__cmpxchg_wrapper(_acq, __VA_ARGS__)
-#define cmpxchg_release(...)	__cmpxchg_wrapper(_rel, __VA_ARGS__)
-#define cmpxchg(...)		__cmpxchg_wrapper( _mb, __VA_ARGS__)
-#define cmpxchg_local		cmpxchg_relaxed
+#define arch_cmpxchg_relaxed(...)	__cmpxchg_wrapper(    , __VA_ARGS__)
+#define arch_cmpxchg_acquire(...)	__cmpxchg_wrapper(_acq, __VA_ARGS__)
+#define arch_cmpxchg_release(...)	__cmpxchg_wrapper(_rel, __VA_ARGS__)
+#define arch_cmpxchg(...)		__cmpxchg_wrapper( _mb, __VA_ARGS__)
+#define arch_cmpxchg_local		arch_cmpxchg_relaxed
+#define arch_sync_cmpxchg		arch_cmpxchg
 
 /* cmpxchg64 */
-#define cmpxchg64_relaxed	cmpxchg_relaxed
-#define cmpxchg64_acquire	cmpxchg_acquire
-#define cmpxchg64_release	cmpxchg_release
-#define cmpxchg64		cmpxchg
-#define cmpxchg64_local		cmpxchg_local
+#define arch_cmpxchg64_relaxed		arch_cmpxchg_relaxed
+#define arch_cmpxchg64_acquire		arch_cmpxchg_acquire
+#define arch_cmpxchg64_release		arch_cmpxchg_release
+#define arch_cmpxchg64			arch_cmpxchg
+#define arch_cmpxchg64_local		arch_cmpxchg_local
 
 /* cmpxchg_double */
 #define system_has_cmpxchg_double()     1
@@ -177,7 +178,7 @@ __CMPXCHG_GEN(_mb)
 	VM_BUG_ON((unsigned long *)(ptr2) - (unsigned long *)(ptr1) != 1);	\
 })
 
-#define cmpxchg_double(ptr1, ptr2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double(ptr1, ptr2, o1, o2, n1, n2) \
 ({\
 	int __ret;\
 	__cmpxchg_double_check(ptr1, ptr2); \
@@ -187,7 +188,7 @@ __CMPXCHG_GEN(_mb)
 	__ret; \
 })
 
-#define cmpxchg_double_local(ptr1, ptr2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double_local(ptr1, ptr2, o1, o2, n1, n2) \
 ({\
 	int __ret;\
 	__cmpxchg_double_check(ptr1, ptr2); \
diff --git a/arch/arm64/include/asm/sync_bitops.h b/arch/arm64/include/asm/sync_bitops.h
index 24ed8f445b8b..e42de14627f2 100644
--- a/arch/arm64/include/asm/sync_bitops.h
+++ b/arch/arm64/include/asm/sync_bitops.h
@@ -22,6 +22,5 @@
 #define sync_test_and_clear_bit(nr, p) test_and_clear_bit(nr, p)
 #define sync_test_and_change_bit(nr, p)        test_and_change_bit(nr, p)
 #define sync_test_bit(nr, addr)                test_bit(nr, addr)
-#define sync_cmpxchg                   cmpxchg
 
 #endif
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH 5/6] arm64: use instrumented atomics
@ 2018-05-04 17:39   ` Mark Rutland
  0 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-04 17:39 UTC (permalink / raw)
  To: linux-arm-kernel

As our atomics are written in inline assembly, they don't get
instrumented when KASAN is enabled, and so we can miss cases where they
are used on erroneous memory locations.

As with x86, let's use atomic-instrumented.h to give arm64 instrumented
atomics. This requires that we add an arch_ prefix to our atomic names,
but other than naming, no changes are made to the atomics themselves.
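
As a rough sketch (illustrative only, not code from this patch), the
generic header can then wrap each arch_-prefixed operation with a KASAN
check along these lines:

  /* Sketch of an instrumented wrapper; kasan_check_write() comes from
   * <linux/kasan-checks.h>. */
  static __always_inline void atomic_add(int i, atomic_t *v)
  {
          kasan_check_write(v, sizeof(*v));
          arch_atomic_add(i, v);
  }

  static __always_inline int atomic_add_return(int i, atomic_t *v)
  {
          kasan_check_write(v, sizeof(*v));
          return arch_atomic_add_return(i, v);
  }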

Due to include dependencies, we must move our definition of sync_cmpxchg
into <asm/cmpxchg.h>, but this is not harmful.

There should be no functional change as a result of this patch when
CONFIG_KASAN is not selected.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/atomic.h       | 299 +++++++++++++++++-----------------
 arch/arm64/include/asm/atomic_ll_sc.h |  28 ++--
 arch/arm64/include/asm/atomic_lse.h   |  43 ++---
 arch/arm64/include/asm/cmpxchg.h      |  25 +--
 arch/arm64/include/asm/sync_bitops.h  |   1 -
 5 files changed, 202 insertions(+), 194 deletions(-)

diff --git a/arch/arm64/include/asm/atomic.h b/arch/arm64/include/asm/atomic.h
index c0235e0ff849..aefdce33f81a 100644
--- a/arch/arm64/include/asm/atomic.h
+++ b/arch/arm64/include/asm/atomic.h
@@ -53,158 +53,161 @@
 
 #define ATOMIC_INIT(i)	{ (i) }
 
-#define atomic_read(v)			READ_ONCE((v)->counter)
-#define atomic_set(v, i)		WRITE_ONCE(((v)->counter), (i))
-
-#define atomic_add_return_relaxed	atomic_add_return_relaxed
-#define atomic_add_return_acquire	atomic_add_return_acquire
-#define atomic_add_return_release	atomic_add_return_release
-#define atomic_add_return		atomic_add_return
-
-#define atomic_inc_return_relaxed(v)	atomic_add_return_relaxed(1, (v))
-#define atomic_inc_return_acquire(v)	atomic_add_return_acquire(1, (v))
-#define atomic_inc_return_release(v)	atomic_add_return_release(1, (v))
-#define atomic_inc_return(v)		atomic_add_return(1, (v))
-
-#define atomic_sub_return_relaxed	atomic_sub_return_relaxed
-#define atomic_sub_return_acquire	atomic_sub_return_acquire
-#define atomic_sub_return_release	atomic_sub_return_release
-#define atomic_sub_return		atomic_sub_return
-
-#define atomic_dec_return_relaxed(v)	atomic_sub_return_relaxed(1, (v))
-#define atomic_dec_return_acquire(v)	atomic_sub_return_acquire(1, (v))
-#define atomic_dec_return_release(v)	atomic_sub_return_release(1, (v))
-#define atomic_dec_return(v)		atomic_sub_return(1, (v))
-
-#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
-#define atomic_fetch_add_acquire	atomic_fetch_add_acquire
-#define atomic_fetch_add_release	atomic_fetch_add_release
-#define atomic_fetch_add		atomic_fetch_add
-
-#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
-#define atomic_fetch_sub_acquire	atomic_fetch_sub_acquire
-#define atomic_fetch_sub_release	atomic_fetch_sub_release
-#define atomic_fetch_sub		atomic_fetch_sub
-
-#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
-#define atomic_fetch_and_acquire	atomic_fetch_and_acquire
-#define atomic_fetch_and_release	atomic_fetch_and_release
-#define atomic_fetch_and		atomic_fetch_and
-
-#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot_relaxed
-#define atomic_fetch_andnot_acquire	atomic_fetch_andnot_acquire
-#define atomic_fetch_andnot_release	atomic_fetch_andnot_release
-#define atomic_fetch_andnot		atomic_fetch_andnot
-
-#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
-#define atomic_fetch_or_acquire		atomic_fetch_or_acquire
-#define atomic_fetch_or_release		atomic_fetch_or_release
-#define atomic_fetch_or			atomic_fetch_or
-
-#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
-#define atomic_fetch_xor_acquire	atomic_fetch_xor_acquire
-#define atomic_fetch_xor_release	atomic_fetch_xor_release
-#define atomic_fetch_xor		atomic_fetch_xor
-
-#define atomic_xchg_relaxed(v, new)	xchg_relaxed(&((v)->counter), (new))
-#define atomic_xchg_acquire(v, new)	xchg_acquire(&((v)->counter), (new))
-#define atomic_xchg_release(v, new)	xchg_release(&((v)->counter), (new))
-#define atomic_xchg(v, new)		xchg(&((v)->counter), (new))
-
-#define atomic_cmpxchg_relaxed(v, old, new)				\
-	cmpxchg_relaxed(&((v)->counter), (old), (new))
-#define atomic_cmpxchg_acquire(v, old, new)				\
-	cmpxchg_acquire(&((v)->counter), (old), (new))
-#define atomic_cmpxchg_release(v, old, new)				\
-	cmpxchg_release(&((v)->counter), (old), (new))
-#define atomic_cmpxchg(v, old, new)	cmpxchg(&((v)->counter), (old), (new))
-
-#define atomic_inc(v)			atomic_add(1, (v))
-#define atomic_dec(v)			atomic_sub(1, (v))
-#define atomic_inc_and_test(v)		(atomic_inc_return(v) == 0)
-#define atomic_dec_and_test(v)		(atomic_dec_return(v) == 0)
-#define atomic_sub_and_test(i, v)	(atomic_sub_return((i), (v)) == 0)
-#define atomic_add_negative(i, v)	(atomic_add_return((i), (v)) < 0)
-#define __atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
-#define atomic_andnot			atomic_andnot
+#define arch_atomic_read(v)			READ_ONCE((v)->counter)
+#define arch_atomic_set(v, i)			WRITE_ONCE(((v)->counter), (i))
+
+#define arch_atomic_add_return_relaxed		arch_atomic_add_return_relaxed
+#define arch_atomic_add_return_acquire		arch_atomic_add_return_acquire
+#define arch_atomic_add_return_release		arch_atomic_add_return_release
+#define arch_atomic_add_return			arch_atomic_add_return
+
+#define arch_atomic_inc_return_relaxed(v)	arch_atomic_add_return_relaxed(1, (v))
+#define arch_atomic_inc_return_acquire(v)	arch_atomic_add_return_acquire(1, (v))
+#define arch_atomic_inc_return_release(v)	arch_atomic_add_return_release(1, (v))
+#define arch_atomic_inc_return(v)		arch_atomic_add_return(1, (v))
+
+#define arch_atomic_sub_return_relaxed		arch_atomic_sub_return_relaxed
+#define arch_atomic_sub_return_acquire		arch_atomic_sub_return_acquire
+#define arch_atomic_sub_return_release		arch_atomic_sub_return_release
+#define arch_atomic_sub_return			arch_atomic_sub_return
+
+#define arch_atomic_dec_return_relaxed(v)	arch_atomic_sub_return_relaxed(1, (v))
+#define arch_atomic_dec_return_acquire(v)	arch_atomic_sub_return_acquire(1, (v))
+#define arch_atomic_dec_return_release(v)	arch_atomic_sub_return_release(1, (v))
+#define arch_atomic_dec_return(v)		arch_atomic_sub_return(1, (v))
+
+#define arch_atomic_fetch_add_relaxed		arch_atomic_fetch_add_relaxed
+#define arch_atomic_fetch_add_acquire		arch_atomic_fetch_add_acquire
+#define arch_atomic_fetch_add_release		arch_atomic_fetch_add_release
+#define arch_atomic_fetch_add			arch_atomic_fetch_add
+
+#define arch_atomic_fetch_sub_relaxed		arch_atomic_fetch_sub_relaxed
+#define arch_atomic_fetch_sub_acquire		arch_atomic_fetch_sub_acquire
+#define arch_atomic_fetch_sub_release		arch_atomic_fetch_sub_release
+#define arch_atomic_fetch_sub			arch_atomic_fetch_sub
+
+#define arch_atomic_fetch_and_relaxed		arch_atomic_fetch_and_relaxed
+#define arch_atomic_fetch_and_acquire		arch_atomic_fetch_and_acquire
+#define arch_atomic_fetch_and_release		arch_atomic_fetch_and_release
+#define arch_atomic_fetch_and			arch_atomic_fetch_and
+
+#define arch_atomic_fetch_andnot_relaxed	arch_atomic_fetch_andnot_relaxed
+#define arch_atomic_fetch_andnot_acquire	arch_atomic_fetch_andnot_acquire
+#define arch_atomic_fetch_andnot_release	arch_atomic_fetch_andnot_release
+#define arch_atomic_fetch_andnot		arch_atomic_fetch_andnot
+
+#define arch_atomic_fetch_or_relaxed		arch_atomic_fetch_or_relaxed
+#define arch_atomic_fetch_or_acquire		arch_atomic_fetch_or_acquire
+#define arch_atomic_fetch_or_release		arch_atomic_fetch_or_release
+#define arch_atomic_fetch_or			arch_atomic_fetch_or
+
+#define arch_atomic_fetch_xor_relaxed		arch_atomic_fetch_xor_relaxed
+#define arch_atomic_fetch_xor_acquire		arch_atomic_fetch_xor_acquire
+#define arch_atomic_fetch_xor_release		arch_atomic_fetch_xor_release
+#define arch_atomic_fetch_xor			arch_atomic_fetch_xor
+
+#define arch_atomic_xchg_relaxed(v, new)	xchg_relaxed(&((v)->counter), (new))
+#define arch_atomic_xchg_acquire(v, new)	xchg_acquire(&((v)->counter), (new))
+#define arch_atomic_xchg_release(v, new)	xchg_release(&((v)->counter), (new))
+#define arch_atomic_xchg(v, new)		xchg(&((v)->counter), (new))
+
+#define arch_atomic_cmpxchg_relaxed(v, old, new)			\
+	arch_cmpxchg_relaxed(&((v)->counter), (old), (new))
+#define arch_atomic_cmpxchg_acquire(v, old, new)			\
+	arch_cmpxchg_acquire(&((v)->counter), (old), (new))
+#define arch_atomic_cmpxchg_release(v, old, new)			\
+	arch_cmpxchg_release(&((v)->counter), (old), (new))
+#define arch_atomic_cmpxchg(v, old, new)				\
+	arch_cmpxchg(&((v)->counter), (old), (new))
+
+#define arch_atomic_inc(v)			arch_atomic_add(1, (v))
+#define arch_atomic_dec(v)			arch_atomic_sub(1, (v))
+#define arch_atomic_inc_and_test(v)		(arch_atomic_inc_return(v) == 0)
+#define arch_atomic_dec_and_test(v)		(arch_atomic_dec_return(v) == 0)
+#define arch_atomic_sub_and_test(i, v)		(arch_atomic_sub_return((i), (v)) == 0)
+#define arch_atomic_add_negative(i, v)		(arch_atomic_add_return((i), (v)) < 0)
+#define __arch_atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
+#define arch_atomic_andnot			arch_atomic_andnot
 
 /*
  * 64-bit atomic operations.
  */
-#define ATOMIC64_INIT			ATOMIC_INIT
-#define atomic64_read			atomic_read
-#define atomic64_set			atomic_set
-
-#define atomic64_add_return_relaxed	atomic64_add_return_relaxed
-#define atomic64_add_return_acquire	atomic64_add_return_acquire
-#define atomic64_add_return_release	atomic64_add_return_release
-#define atomic64_add_return		atomic64_add_return
-
-#define atomic64_inc_return_relaxed(v)	atomic64_add_return_relaxed(1, (v))
-#define atomic64_inc_return_acquire(v)	atomic64_add_return_acquire(1, (v))
-#define atomic64_inc_return_release(v)	atomic64_add_return_release(1, (v))
-#define atomic64_inc_return(v)		atomic64_add_return(1, (v))
-
-#define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
-#define atomic64_sub_return_acquire	atomic64_sub_return_acquire
-#define atomic64_sub_return_release	atomic64_sub_return_release
-#define atomic64_sub_return		atomic64_sub_return
-
-#define atomic64_dec_return_relaxed(v)	atomic64_sub_return_relaxed(1, (v))
-#define atomic64_dec_return_acquire(v)	atomic64_sub_return_acquire(1, (v))
-#define atomic64_dec_return_release(v)	atomic64_sub_return_release(1, (v))
-#define atomic64_dec_return(v)		atomic64_sub_return(1, (v))
-
-#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
-#define atomic64_fetch_add_acquire	atomic64_fetch_add_acquire
-#define atomic64_fetch_add_release	atomic64_fetch_add_release
-#define atomic64_fetch_add		atomic64_fetch_add
-
-#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
-#define atomic64_fetch_sub_acquire	atomic64_fetch_sub_acquire
-#define atomic64_fetch_sub_release	atomic64_fetch_sub_release
-#define atomic64_fetch_sub		atomic64_fetch_sub
-
-#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
-#define atomic64_fetch_and_acquire	atomic64_fetch_and_acquire
-#define atomic64_fetch_and_release	atomic64_fetch_and_release
-#define atomic64_fetch_and		atomic64_fetch_and
-
-#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot_relaxed
-#define atomic64_fetch_andnot_acquire	atomic64_fetch_andnot_acquire
-#define atomic64_fetch_andnot_release	atomic64_fetch_andnot_release
-#define atomic64_fetch_andnot		atomic64_fetch_andnot
-
-#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
-#define atomic64_fetch_or_acquire	atomic64_fetch_or_acquire
-#define atomic64_fetch_or_release	atomic64_fetch_or_release
-#define atomic64_fetch_or		atomic64_fetch_or
-
-#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
-#define atomic64_fetch_xor_acquire	atomic64_fetch_xor_acquire
-#define atomic64_fetch_xor_release	atomic64_fetch_xor_release
-#define atomic64_fetch_xor		atomic64_fetch_xor
-
-#define atomic64_xchg_relaxed		atomic_xchg_relaxed
-#define atomic64_xchg_acquire		atomic_xchg_acquire
-#define atomic64_xchg_release		atomic_xchg_release
-#define atomic64_xchg			atomic_xchg
-
-#define atomic64_cmpxchg_relaxed	atomic_cmpxchg_relaxed
-#define atomic64_cmpxchg_acquire	atomic_cmpxchg_acquire
-#define atomic64_cmpxchg_release	atomic_cmpxchg_release
-#define atomic64_cmpxchg		atomic_cmpxchg
-
-#define atomic64_inc(v)			atomic64_add(1, (v))
-#define atomic64_dec(v)			atomic64_sub(1, (v))
-#define atomic64_inc_and_test(v)	(atomic64_inc_return(v) == 0)
-#define atomic64_dec_and_test(v)	(atomic64_dec_return(v) == 0)
-#define atomic64_sub_and_test(i, v)	(atomic64_sub_return((i), (v)) == 0)
-#define atomic64_add_negative(i, v)	(atomic64_add_return((i), (v)) < 0)
-#define atomic64_add_unless(v, a, u)	(___atomic_add_unless(v, a, u, 64) != u)
-#define atomic64_andnot			atomic64_andnot
-
-#define atomic64_inc_not_zero(v)	atomic64_add_unless((v), 1, 0)
+#define ATOMIC64_INIT				ATOMIC_INIT
+#define arch_atomic64_read			arch_atomic_read
+#define arch_atomic64_set			arch_atomic_set
+
+#define arch_atomic64_add_return_relaxed	arch_atomic64_add_return_relaxed
+#define arch_atomic64_add_return_acquire	arch_atomic64_add_return_acquire
+#define arch_atomic64_add_return_release	arch_atomic64_add_return_release
+#define arch_atomic64_add_return		arch_atomic64_add_return
+
+#define arch_atomic64_inc_return_relaxed(v)	arch_atomic64_add_return_relaxed(1, (v))
+#define arch_atomic64_inc_return_acquire(v)	arch_atomic64_add_return_acquire(1, (v))
+#define arch_atomic64_inc_return_release(v)	arch_atomic64_add_return_release(1, (v))
+#define arch_atomic64_inc_return(v)		arch_atomic64_add_return(1, (v))
+
+#define arch_atomic64_sub_return_relaxed	arch_atomic64_sub_return_relaxed
+#define arch_atomic64_sub_return_acquire	arch_atomic64_sub_return_acquire
+#define arch_atomic64_sub_return_release	arch_atomic64_sub_return_release
+#define arch_atomic64_sub_return		arch_atomic64_sub_return
+
+#define arch_atomic64_dec_return_relaxed(v)	arch_atomic64_sub_return_relaxed(1, (v))
+#define arch_atomic64_dec_return_acquire(v)	arch_atomic64_sub_return_acquire(1, (v))
+#define arch_atomic64_dec_return_release(v)	arch_atomic64_sub_return_release(1, (v))
+#define arch_atomic64_dec_return(v)		arch_atomic64_sub_return(1, (v))
+
+#define arch_atomic64_fetch_add_relaxed		arch_atomic64_fetch_add_relaxed
+#define arch_atomic64_fetch_add_acquire		arch_atomic64_fetch_add_acquire
+#define arch_atomic64_fetch_add_release		arch_atomic64_fetch_add_release
+#define arch_atomic64_fetch_add			arch_atomic64_fetch_add
+
+#define arch_atomic64_fetch_sub_relaxed		arch_atomic64_fetch_sub_relaxed
+#define arch_atomic64_fetch_sub_acquire		arch_atomic64_fetch_sub_acquire
+#define arch_atomic64_fetch_sub_release		arch_atomic64_fetch_sub_release
+#define arch_atomic64_fetch_sub			arch_atomic64_fetch_sub
+
+#define arch_atomic64_fetch_and_relaxed		arch_atomic64_fetch_and_relaxed
+#define arch_atomic64_fetch_and_acquire		arch_atomic64_fetch_and_acquire
+#define arch_atomic64_fetch_and_release		arch_atomic64_fetch_and_release
+#define arch_atomic64_fetch_and			arch_atomic64_fetch_and
+
+#define arch_atomic64_fetch_andnot_relaxed	arch_atomic64_fetch_andnot_relaxed
+#define arch_atomic64_fetch_andnot_acquire	arch_atomic64_fetch_andnot_acquire
+#define arch_atomic64_fetch_andnot_release	arch_atomic64_fetch_andnot_release
+#define arch_atomic64_fetch_andnot		arch_atomic64_fetch_andnot
+
+#define arch_atomic64_fetch_or_relaxed		arch_atomic64_fetch_or_relaxed
+#define arch_atomic64_fetch_or_acquire		arch_atomic64_fetch_or_acquire
+#define arch_atomic64_fetch_or_release		arch_atomic64_fetch_or_release
+#define arch_atomic64_fetch_or			arch_atomic64_fetch_or
+
+#define arch_atomic64_fetch_xor_relaxed		arch_atomic64_fetch_xor_relaxed
+#define arch_atomic64_fetch_xor_acquire		arch_atomic64_fetch_xor_acquire
+#define arch_atomic64_fetch_xor_release		arch_atomic64_fetch_xor_release
+#define arch_atomic64_fetch_xor			arch_atomic64_fetch_xor
+
+#define arch_atomic64_xchg_relaxed		arch_atomic_xchg_relaxed
+#define arch_atomic64_xchg_acquire		arch_atomic_xchg_acquire
+#define arch_atomic64_xchg_release		arch_atomic_xchg_release
+#define arch_atomic64_xchg			arch_atomic_xchg
+
+#define arch_atomic64_cmpxchg_relaxed		arch_atomic_cmpxchg_relaxed
+#define arch_atomic64_cmpxchg_acquire		arch_atomic_cmpxchg_acquire
+#define arch_atomic64_cmpxchg_release		arch_atomic_cmpxchg_release
+#define arch_atomic64_cmpxchg			arch_atomic_cmpxchg
+
+#define arch_atomic64_inc(v)			arch_atomic64_add(1, (v))
+#define arch_atomic64_dec(v)			arch_atomic64_sub(1, (v))
+#define arch_atomic64_inc_and_test(v)		(arch_atomic64_inc_return(v) == 0)
+#define arch_atomic64_dec_and_test(v)		(arch_atomic64_dec_return(v) == 0)
+#define arch_atomic64_sub_and_test(i, v)	(arch_atomic64_sub_return((i), (v)) == 0)
+#define arch_atomic64_add_negative(i, v)	(arch_atomic64_add_return((i), (v)) < 0)
+#define arch_atomic64_add_unless(v, a, u)	(___atomic_add_unless(v, a, u, 64) != u)
+#define arch_atomic64_andnot			arch_atomic64_andnot
+
+#define arch_atomic64_inc_not_zero(v)		arch_atomic64_add_unless((v), 1, 0)
+
+#include <asm-generic/atomic-instrumented.h>
 
 #endif
 #endif
diff --git a/arch/arm64/include/asm/atomic_ll_sc.h b/arch/arm64/include/asm/atomic_ll_sc.h
index 3175f4982682..c28d5a824104 100644
--- a/arch/arm64/include/asm/atomic_ll_sc.h
+++ b/arch/arm64/include/asm/atomic_ll_sc.h
@@ -39,7 +39,7 @@
 
 #define ATOMIC_OP(op, asm_op)						\
 __LL_SC_INLINE void							\
-__LL_SC_PREFIX(atomic_##op(int i, atomic_t *v))				\
+__LL_SC_PREFIX(arch_atomic_##op(int i, atomic_t *v))			\
 {									\
 	unsigned long tmp;						\
 	int result;							\
@@ -53,11 +53,11 @@ __LL_SC_PREFIX(atomic_##op(int i, atomic_t *v))				\
 	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)		\
 	: "Ir" (i));							\
 }									\
-__LL_SC_EXPORT(atomic_##op);
+__LL_SC_EXPORT(arch_atomic_##op);
 
 #define ATOMIC_OP_RETURN(name, mb, acq, rel, cl, op, asm_op)		\
 __LL_SC_INLINE int							\
-__LL_SC_PREFIX(atomic_##op##_return##name(int i, atomic_t *v))		\
+__LL_SC_PREFIX(arch_atomic_##op##_return##name(int i, atomic_t *v))	\
 {									\
 	unsigned long tmp;						\
 	int result;							\
@@ -75,11 +75,11 @@ __LL_SC_PREFIX(atomic_##op##_return##name(int i, atomic_t *v))		\
 									\
 	return result;							\
 }									\
-__LL_SC_EXPORT(atomic_##op##_return##name);
+__LL_SC_EXPORT(arch_atomic_##op##_return##name);
 
 #define ATOMIC_FETCH_OP(name, mb, acq, rel, cl, op, asm_op)		\
 __LL_SC_INLINE int							\
-__LL_SC_PREFIX(atomic_fetch_##op##name(int i, atomic_t *v))		\
+__LL_SC_PREFIX(arch_atomic_fetch_##op##name(int i, atomic_t *v))	\
 {									\
 	unsigned long tmp;						\
 	int val, result;						\
@@ -97,7 +97,7 @@ __LL_SC_PREFIX(atomic_fetch_##op##name(int i, atomic_t *v))		\
 									\
 	return result;							\
 }									\
-__LL_SC_EXPORT(atomic_fetch_##op##name);
+__LL_SC_EXPORT(arch_atomic_fetch_##op##name);
 
 #define ATOMIC_OPS(...)							\
 	ATOMIC_OP(__VA_ARGS__)						\
@@ -133,7 +133,7 @@ ATOMIC_OPS(xor, eor)
 
 #define ATOMIC64_OP(op, asm_op)						\
 __LL_SC_INLINE void							\
-__LL_SC_PREFIX(atomic64_##op(long i, atomic64_t *v))			\
+__LL_SC_PREFIX(arch_atomic64_##op(long i, atomic64_t *v))		\
 {									\
 	long result;							\
 	unsigned long tmp;						\
@@ -147,11 +147,11 @@ __LL_SC_PREFIX(atomic64_##op(long i, atomic64_t *v))			\
 	: "=&r" (result), "=&r" (tmp), "+Q" (v->counter)		\
 	: "Ir" (i));							\
 }									\
-__LL_SC_EXPORT(atomic64_##op);
+__LL_SC_EXPORT(arch_atomic64_##op);
 
 #define ATOMIC64_OP_RETURN(name, mb, acq, rel, cl, op, asm_op)		\
 __LL_SC_INLINE long							\
-__LL_SC_PREFIX(atomic64_##op##_return##name(long i, atomic64_t *v))	\
+__LL_SC_PREFIX(arch_atomic64_##op##_return##name(long i, atomic64_t *v))\
 {									\
 	long result;							\
 	unsigned long tmp;						\
@@ -169,11 +169,11 @@ __LL_SC_PREFIX(atomic64_##op##_return##name(long i, atomic64_t *v))	\
 									\
 	return result;							\
 }									\
-__LL_SC_EXPORT(atomic64_##op##_return##name);
+__LL_SC_EXPORT(arch_atomic64_##op##_return##name);
 
 #define ATOMIC64_FETCH_OP(name, mb, acq, rel, cl, op, asm_op)		\
 __LL_SC_INLINE long							\
-__LL_SC_PREFIX(atomic64_fetch_##op##name(long i, atomic64_t *v))	\
+__LL_SC_PREFIX(arch_atomic64_fetch_##op##name(long i, atomic64_t *v))	\
 {									\
 	long result, val;						\
 	unsigned long tmp;						\
@@ -191,7 +191,7 @@ __LL_SC_PREFIX(atomic64_fetch_##op##name(long i, atomic64_t *v))	\
 									\
 	return result;							\
 }									\
-__LL_SC_EXPORT(atomic64_fetch_##op##name);
+__LL_SC_EXPORT(arch_atomic64_fetch_##op##name);
 
 #define ATOMIC64_OPS(...)						\
 	ATOMIC64_OP(__VA_ARGS__)					\
@@ -226,7 +226,7 @@ ATOMIC64_OPS(xor, eor)
 #undef ATOMIC64_OP
 
 __LL_SC_INLINE long
-__LL_SC_PREFIX(atomic64_dec_if_positive(atomic64_t *v))
+__LL_SC_PREFIX(arch_atomic64_dec_if_positive(atomic64_t *v))
 {
 	long result;
 	unsigned long tmp;
@@ -246,7 +246,7 @@ __LL_SC_PREFIX(atomic64_dec_if_positive(atomic64_t *v))
 
 	return result;
 }
-__LL_SC_EXPORT(atomic64_dec_if_positive);
+__LL_SC_EXPORT(arch_atomic64_dec_if_positive);
 
 #define __CMPXCHG_CASE(w, sz, name, mb, acq, rel, cl)			\
 __LL_SC_INLINE unsigned long						\
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9ef0797380cb..9a071f71c521 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -25,9 +25,9 @@
 #error "please don't include this file directly"
 #endif
 
-#define __LL_SC_ATOMIC(op)	__LL_SC_CALL(atomic_##op)
+#define __LL_SC_ATOMIC(op)	__LL_SC_CALL(arch_atomic_##op)
 #define ATOMIC_OP(op, asm_op)						\
-static inline void atomic_##op(int i, atomic_t *v)			\
+static inline void arch_atomic_##op(int i, atomic_t *v)			\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -47,7 +47,7 @@ ATOMIC_OP(add, stadd)
 #undef ATOMIC_OP
 
 #define ATOMIC_FETCH_OP(name, mb, op, asm_op, cl...)			\
-static inline int atomic_fetch_##op##name(int i, atomic_t *v)		\
+static inline int arch_atomic_fetch_##op##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -79,7 +79,7 @@ ATOMIC_FETCH_OPS(add, ldadd)
 #undef ATOMIC_FETCH_OPS
 
 #define ATOMIC_OP_ADD_RETURN(name, mb, cl...)				\
-static inline int atomic_add_return##name(int i, atomic_t *v)		\
+static inline int arch_atomic_add_return##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -105,7 +105,7 @@ ATOMIC_OP_ADD_RETURN(        , al, "memory")
 
 #undef ATOMIC_OP_ADD_RETURN
 
-static inline void atomic_and(int i, atomic_t *v)
+static inline void arch_atomic_and(int i, atomic_t *v)
 {
 	register int w0 asm ("w0") = i;
 	register atomic_t *x1 asm ("x1") = v;
@@ -123,7 +123,7 @@ static inline void atomic_and(int i, atomic_t *v)
 }
 
 #define ATOMIC_FETCH_OP_AND(name, mb, cl...)				\
-static inline int atomic_fetch_and##name(int i, atomic_t *v)		\
+static inline int arch_atomic_fetch_and##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -149,7 +149,7 @@ ATOMIC_FETCH_OP_AND(        , al, "memory")
 
 #undef ATOMIC_FETCH_OP_AND
 
-static inline void atomic_sub(int i, atomic_t *v)
+static inline void arch_atomic_sub(int i, atomic_t *v)
 {
 	register int w0 asm ("w0") = i;
 	register atomic_t *x1 asm ("x1") = v;
@@ -167,7 +167,7 @@ static inline void atomic_sub(int i, atomic_t *v)
 }
 
 #define ATOMIC_OP_SUB_RETURN(name, mb, cl...)				\
-static inline int atomic_sub_return##name(int i, atomic_t *v)		\
+static inline int arch_atomic_sub_return##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -195,7 +195,7 @@ ATOMIC_OP_SUB_RETURN(        , al, "memory")
 #undef ATOMIC_OP_SUB_RETURN
 
 #define ATOMIC_FETCH_OP_SUB(name, mb, cl...)				\
-static inline int atomic_fetch_sub##name(int i, atomic_t *v)		\
+static inline int arch_atomic_fetch_sub##name(int i, atomic_t *v)	\
 {									\
 	register int w0 asm ("w0") = i;					\
 	register atomic_t *x1 asm ("x1") = v;				\
@@ -222,9 +222,9 @@ ATOMIC_FETCH_OP_SUB(        , al, "memory")
 #undef ATOMIC_FETCH_OP_SUB
 #undef __LL_SC_ATOMIC
 
-#define __LL_SC_ATOMIC64(op)	__LL_SC_CALL(atomic64_##op)
+#define __LL_SC_ATOMIC64(op)	__LL_SC_CALL(arch_atomic64_##op)
 #define ATOMIC64_OP(op, asm_op)						\
-static inline void atomic64_##op(long i, atomic64_t *v)			\
+static inline void arch_atomic64_##op(long i, atomic64_t *v)		\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -244,7 +244,8 @@ ATOMIC64_OP(add, stadd)
 #undef ATOMIC64_OP
 
 #define ATOMIC64_FETCH_OP(name, mb, op, asm_op, cl...)			\
-static inline long atomic64_fetch_##op##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_fetch_##op##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -276,7 +277,8 @@ ATOMIC64_FETCH_OPS(add, ldadd)
 #undef ATOMIC64_FETCH_OPS
 
 #define ATOMIC64_OP_ADD_RETURN(name, mb, cl...)				\
-static inline long atomic64_add_return##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_add_return##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -302,7 +304,7 @@ ATOMIC64_OP_ADD_RETURN(        , al, "memory")
 
 #undef ATOMIC64_OP_ADD_RETURN
 
-static inline void atomic64_and(long i, atomic64_t *v)
+static inline void arch_atomic64_and(long i, atomic64_t *v)
 {
 	register long x0 asm ("x0") = i;
 	register atomic64_t *x1 asm ("x1") = v;
@@ -320,7 +322,8 @@ static inline void atomic64_and(long i, atomic64_t *v)
 }
 
 #define ATOMIC64_FETCH_OP_AND(name, mb, cl...)				\
-static inline long atomic64_fetch_and##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_fetch_and##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -346,7 +349,7 @@ ATOMIC64_FETCH_OP_AND(        , al, "memory")
 
 #undef ATOMIC64_FETCH_OP_AND
 
-static inline void atomic64_sub(long i, atomic64_t *v)
+static inline void arch_atomic64_sub(long i, atomic64_t *v)
 {
 	register long x0 asm ("x0") = i;
 	register atomic64_t *x1 asm ("x1") = v;
@@ -364,7 +367,8 @@ static inline void atomic64_sub(long i, atomic64_t *v)
 }
 
 #define ATOMIC64_OP_SUB_RETURN(name, mb, cl...)				\
-static inline long atomic64_sub_return##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_sub_return##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -392,7 +396,8 @@ ATOMIC64_OP_SUB_RETURN(        , al, "memory")
 #undef ATOMIC64_OP_SUB_RETURN
 
 #define ATOMIC64_FETCH_OP_SUB(name, mb, cl...)				\
-static inline long atomic64_fetch_sub##name(long i, atomic64_t *v)	\
+static inline long							\
+arch_atomic64_fetch_sub##name(long i, atomic64_t *v)			\
 {									\
 	register long x0 asm ("x0") = i;				\
 	register atomic64_t *x1 asm ("x1") = v;				\
@@ -418,7 +423,7 @@ ATOMIC64_FETCH_OP_SUB(        , al, "memory")
 
 #undef ATOMIC64_FETCH_OP_SUB
 
-static inline long atomic64_dec_if_positive(atomic64_t *v)
+static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
 	register long x0 asm ("x0") = (long)v;
 
diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
index 4f5fd2a36e6e..0f470ffd2d59 100644
--- a/arch/arm64/include/asm/cmpxchg.h
+++ b/arch/arm64/include/asm/cmpxchg.h
@@ -154,18 +154,19 @@ __CMPXCHG_GEN(_mb)
 })
 
 /* cmpxchg */
-#define cmpxchg_relaxed(...)	__cmpxchg_wrapper(    , __VA_ARGS__)
-#define cmpxchg_acquire(...)	__cmpxchg_wrapper(_acq, __VA_ARGS__)
-#define cmpxchg_release(...)	__cmpxchg_wrapper(_rel, __VA_ARGS__)
-#define cmpxchg(...)		__cmpxchg_wrapper( _mb, __VA_ARGS__)
-#define cmpxchg_local		cmpxchg_relaxed
+#define arch_cmpxchg_relaxed(...)	__cmpxchg_wrapper(    , __VA_ARGS__)
+#define arch_cmpxchg_acquire(...)	__cmpxchg_wrapper(_acq, __VA_ARGS__)
+#define arch_cmpxchg_release(...)	__cmpxchg_wrapper(_rel, __VA_ARGS__)
+#define arch_cmpxchg(...)		__cmpxchg_wrapper( _mb, __VA_ARGS__)
+#define arch_cmpxchg_local		arch_cmpxchg_relaxed
+#define arch_sync_cmpxchg		arch_cmpxchg
 
 /* cmpxchg64 */
-#define cmpxchg64_relaxed	cmpxchg_relaxed
-#define cmpxchg64_acquire	cmpxchg_acquire
-#define cmpxchg64_release	cmpxchg_release
-#define cmpxchg64		cmpxchg
-#define cmpxchg64_local		cmpxchg_local
+#define arch_cmpxchg64_relaxed		arch_cmpxchg_relaxed
+#define arch_cmpxchg64_acquire		arch_cmpxchg_acquire
+#define arch_cmpxchg64_release		arch_cmpxchg_release
+#define arch_cmpxchg64			arch_cmpxchg
+#define arch_cmpxchg64_local		arch_cmpxchg_local
 
 /* cmpxchg_double */
 #define system_has_cmpxchg_double()     1
@@ -177,7 +178,7 @@ __CMPXCHG_GEN(_mb)
 	VM_BUG_ON((unsigned long *)(ptr2) - (unsigned long *)(ptr1) != 1);	\
 })
 
-#define cmpxchg_double(ptr1, ptr2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double(ptr1, ptr2, o1, o2, n1, n2) \
 ({\
 	int __ret;\
 	__cmpxchg_double_check(ptr1, ptr2); \
@@ -187,7 +188,7 @@ __CMPXCHG_GEN(_mb)
 	__ret; \
 })
 
-#define cmpxchg_double_local(ptr1, ptr2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double_local(ptr1, ptr2, o1, o2, n1, n2) \
 ({\
 	int __ret;\
 	__cmpxchg_double_check(ptr1, ptr2); \
diff --git a/arch/arm64/include/asm/sync_bitops.h b/arch/arm64/include/asm/sync_bitops.h
index 24ed8f445b8b..e42de14627f2 100644
--- a/arch/arm64/include/asm/sync_bitops.h
+++ b/arch/arm64/include/asm/sync_bitops.h
@@ -22,6 +22,5 @@
 #define sync_test_and_clear_bit(nr, p) test_and_clear_bit(nr, p)
 #define sync_test_and_change_bit(nr, p)        test_and_change_bit(nr, p)
 #define sync_test_bit(nr, addr)                test_bit(nr, addr)
-#define sync_cmpxchg                   cmpxchg
 
 #endif
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH 6/6] arm64: instrument smp_{load_acquire,store_release}
  2018-05-04 17:39 ` Mark Rutland
@ 2018-05-04 17:39   ` Mark Rutland
  -1 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-04 17:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, aryabinin, boqun.feng, catalin.marinas, dvyukov,
	mark.rutland, mingo, peterz, will.deacon

Our __smp_store_release() and __smp_load_acquire() macros use inline
assembly, which is opaque to KASAN. This means that KASAN can't catch
erroneous uses of these macros.

This patch adds KASAN instrumentation to both.

It might be better to turn these into __arch_* variants, as we do for
the atomics, but this works for the time being.
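
As a purely hypothetical example (not from this patch), this lets KASAN
report a use-after-free through these macros, via the __smp_* variants
that smp_store_release()/smp_load_acquire() expand to:

  int example(void)
  {
          int *p = kmalloc(sizeof(*p), GFP_KERNEL);

          if (!p)
                  return -ENOMEM;

          smp_store_release(p, 1);        /* fine: p is live */
          kfree(p);

          /* KASAN now flags this acquire load as a use-after-free. */
          return smp_load_acquire(p);
  }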

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/barrier.h | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index f11518af96a9..1a9c601619e5 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -20,6 +20,8 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/kasan-checks.h>
+
 #define __nops(n)	".rept	" #n "\nnop\n.endr\n"
 #define nops(n)		asm volatile(__nops(n))
 
@@ -68,31 +70,33 @@ static inline unsigned long array_index_mask_nospec(unsigned long idx,
 
 #define __smp_store_release(p, v)					\
 do {									\
+	typeof(p) __p = (p);						\
 	union { typeof(*p) __val; char __c[1]; } __u =			\
 		{ .__val = (__force typeof(*p)) (v) }; 			\
 	compiletime_assert_atomic_type(*p);				\
+	kasan_check_write(__p, sizeof(*__p));				\
 	switch (sizeof(*p)) {						\
 	case 1:								\
 		asm volatile ("stlrb %w1, %0"				\
-				: "=Q" (*p)				\
+				: "=Q" (*__p)				\
 				: "r" (*(__u8 *)__u.__c)		\
 				: "memory");				\
 		break;							\
 	case 2:								\
 		asm volatile ("stlrh %w1, %0"				\
-				: "=Q" (*p)				\
+				: "=Q" (*__p)				\
 				: "r" (*(__u16 *)__u.__c)		\
 				: "memory");				\
 		break;							\
 	case 4:								\
 		asm volatile ("stlr %w1, %0"				\
-				: "=Q" (*p)				\
+				: "=Q" (*__p)				\
 				: "r" (*(__u32 *)__u.__c)		\
 				: "memory");				\
 		break;							\
 	case 8:								\
 		asm volatile ("stlr %1, %0"				\
-				: "=Q" (*p)				\
+				: "=Q" (*__p)				\
 				: "r" (*(__u64 *)__u.__c)		\
 				: "memory");				\
 		break;							\
@@ -102,27 +106,29 @@ do {									\
 #define __smp_load_acquire(p)						\
 ({									\
 	union { typeof(*p) __val; char __c[1]; } __u;			\
+	typeof(p) __p = (p);						\
 	compiletime_assert_atomic_type(*p);				\
+	kasan_check_read(__p, sizeof(*__p));				\
 	switch (sizeof(*p)) {						\
 	case 1:								\
 		asm volatile ("ldarb %w0, %1"				\
 			: "=r" (*(__u8 *)__u.__c)			\
-			: "Q" (*p) : "memory");				\
+			: "Q" (*__p) : "memory");			\
 		break;							\
 	case 2:								\
 		asm volatile ("ldarh %w0, %1"				\
 			: "=r" (*(__u16 *)__u.__c)			\
-			: "Q" (*p) : "memory");				\
+			: "Q" (*__p) : "memory");			\
 		break;							\
 	case 4:								\
 		asm volatile ("ldar %w0, %1"				\
 			: "=r" (*(__u32 *)__u.__c)			\
-			: "Q" (*p) : "memory");				\
+			: "Q" (*__p) : "memory");			\
 		break;							\
 	case 8:								\
 		asm volatile ("ldar %0, %1"				\
 			: "=r" (*(__u64 *)__u.__c)			\
-			: "Q" (*p) : "memory");				\
+			: "Q" (*__p) : "memory");			\
 		break;							\
 	}								\
 	__u.__val;							\
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [PATCH 1/6] locking/atomic, asm-generic: instrument ordering variants
  2018-05-04 17:39   ` Mark Rutland
@ 2018-05-04 18:01     ` Peter Zijlstra
  -1 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-04 18:01 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, aryabinin, boqun.feng,
	catalin.marinas, dvyukov, mingo, will.deacon

On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote:
> Currently <asm-generic/atomic-instrumented.h> only instruments the fully
> ordered variants of atomic functions, ignoring the {relaxed,acquire,release}
> ordering variants.
> 
> This patch reworks the header to instrument all ordering variants of the atomic
> functions, so that architectures implementing these are instrumented
> appropriately.
> 
> To minimise repetition, a macro is used to generate each variant from a common
> template. The {full,relaxed,acquire,release} order variants respectively are
> then built using this template, where the architecture provides an
> implementation.
> 
> To stick to an 80 column limit while keeping the templates legible, the return
> type and function name of each template are split over two lines. For
> consistency, this is done even when not strictly necessary.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++-----
>  1 file changed, 1008 insertions(+), 187 deletions(-)

Is there really no way to either generate or further macro compress this?

This is stupid repetitive, we just got rid of all that endless copy
paste crap in atomic implementations and now we're going back to that.

Adding or changing atomic bits becomes horrifically painful because of this.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 1/6] locking/atomic, asm-generic: instrument ordering variants
  2018-05-04 18:01     ` Peter Zijlstra
@ 2018-05-04 18:09       ` Mark Rutland
  -1 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-04 18:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arm-kernel, linux-kernel, aryabinin, boqun.feng,
	catalin.marinas, dvyukov, mingo, will.deacon

On Fri, May 04, 2018 at 08:01:05PM +0200, Peter Zijlstra wrote:
> On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote:
> > Currently <asm-generic/atomic-instrumented.h> only instruments the fully
> > ordered variants of atomic functions, ignoring the {relaxed,acquire,release}
> > ordering variants.
> > 
> > This patch reworks the header to instrument all ordering variants of the atomic
> > functions, so that architectures implementing these are instrumented
> > appropriately.
> > 
> > To minimise repetition, a macro is used to generate each variant from a common
> > template. The {full,relaxed,acquire,release} order variants respectively are
> > then built using this template, where the architecture provides an
> > implementation.

> >  include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++-----
> >  1 file changed, 1008 insertions(+), 187 deletions(-)
> 
> Is there really no way to either generate or further macro compress this?

I can definitely macro compress this somewhat, but the bulk of the
repetition will be the ifdeffery, which can't be macro'd away IIUC.
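
To illustrate -- a rough sketch rather than the code in the patch, and
assuming the arch_atomic_*() naming that <asm-generic/atomic-instrumented.h>
wraps -- a single template can generate each ordering variant, but every
instantiation still needs its own #ifdef guard:

 /* Template: KASAN-check the location, then defer to the arch op. */
 #define INSTR_ATOMIC_ADD_RETURN(order)					\
 static __always_inline int						\
 atomic_add_return##order(int i, atomic_t *v)				\
 {									\
 	kasan_check_write(v, sizeof(*v));				\
 	return arch_atomic_add_return##order(i, v);			\
 }

 INSTR_ATOMIC_ADD_RETURN()

 #ifdef arch_atomic_add_return_relaxed
 INSTR_ATOMIC_ADD_RETURN(_relaxed)
 #endif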

Generating this with a script is possible -- do we do anything like that
elsewhere?

> This is stupid repetitive, we just got rid of all that endless copy
> paste crap in atomic implementations and now we're going back to that.
> 
> Adding or changing atomic bits becomes horrifically painful because of this.

Sure thing; mangling it into its current state was painful enough.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 1/6] locking/atomic, asm-generic: instrument ordering variants
  2018-05-04 18:09       ` Mark Rutland
@ 2018-05-04 18:24         ` Peter Zijlstra
  -1 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-04 18:24 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, aryabinin, boqun.feng,
	catalin.marinas, dvyukov, mingo, will.deacon

On Fri, May 04, 2018 at 07:09:09PM +0100, Mark Rutland wrote:
> On Fri, May 04, 2018 at 08:01:05PM +0200, Peter Zijlstra wrote:
> > On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote:

> > >  include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++-----
> > >  1 file changed, 1008 insertions(+), 187 deletions(-)
> > 
> > Is there really no way to either generate or further macro compress this?
> 
> I can definitely macro compress this somewhat, but the bulk of the
> repetition will be the ifdeffery, which can't be macro'd away IIUC.

Right, much like what we already have in linux/atomic.h I suspect;
having to duplicate that isn't brilliant either.

> Generating this with a script is possible -- do we do anything like that
> elsewhere?

There's include/generated/ in your build directory. But nothing on this
scale I think.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-04 18:09       ` Mark Rutland
@ 2018-05-05  8:11         ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  8:11 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon


* Mark Rutland <mark.rutland@arm.com> wrote:

> On Fri, May 04, 2018 at 08:01:05PM +0200, Peter Zijlstra wrote:
> > On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote:
> > > Currently <asm-generic/atomic-instrumented.h> only instruments the fully
> > > ordered variants of atomic functions, ignoring the {relaxed,acquire,release}
> > > ordering variants.
> > > 
> > > This patch reworks the header to instrument all ordering variants of the atomic
> > > functions, so that architectures implementing these are instrumented
> > > appropriately.
> > > 
> > > To minimise repetition, a macro is used to generate each variant from a common
> > > template. The {full,relaxed,acquire,release} order variants respectively are
> > > then built using this template, where the architecture provides an
> > > implementation.
> 
> > >  include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++-----
> > >  1 file changed, 1008 insertions(+), 187 deletions(-)
> > 
> > Is there really no way to either generate or further macro compress this?
> 
> I can definitely macro compress this somewhat, but the bulk of the
> repetition will be the ifdeffery, which can't be macro'd away IIUC.

The thing is, the existing #ifdeffery is suboptimal to begin with.

I just did the following cleanups (patch attached):

 include/linux/atomic.h | 1275 +++++++++++++++++++++---------------------------
 1 file changed, 543 insertions(+), 732 deletions(-)

The gist of the changes is the following simplification of the main construct:

Before:

 #ifndef atomic_fetch_dec_relaxed

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
 #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
 #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
 #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
 #else /* atomic_fetch_dec */
 #define atomic_fetch_dec_relaxed	atomic_fetch_dec
 #define atomic_fetch_dec_acquire	atomic_fetch_dec
 #define atomic_fetch_dec_release	atomic_fetch_dec
 #endif /* atomic_fetch_dec */

 #else /* atomic_fetch_dec_relaxed */

 #ifndef atomic_fetch_dec_acquire
 #define atomic_fetch_dec_acquire(...)					\
	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec_release
 #define atomic_fetch_dec_release(...)					\
	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(...)						\
	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #endif
 #endif /* atomic_fetch_dec_relaxed */

After:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec_acquire
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec_release
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

The new variant is readable at a glance, and the hierarchy of defines is very 
obvious as well.

And I think we could do even better - there's absolutely no reason why _every_ 
operation has to be made conditional at such a fine-grained level - they are 
overridden in API groups. In fact, allowing individual overrides is arguably a 
source of fragility.

So we could do the following simplification on top of that:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

Note how the grouping of APIs based on 'atomic_fetch_dec' is already an assumption 
in the primary !atomic_fetch_dec_relaxed branch.

I much prefer such clear constructs of API mapping versus magic macros.
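
(To spell out how the construct resolves: an architecture that implements
only atomic_fetch_dec_relaxed() takes the #else branch, where the acquire,
release and fully ordered forms are built from the relaxed one, e.g.:

 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)

while an architecture that implements none of the variants takes the first
branch and gets all four mapped onto the corresponding atomic_fetch_sub()
forms.)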

Thanks,

	Ingo

=============================>
From 0171d4ed840d25befaedcf03e834bb76acb400c0 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@kernel.org>
Date: Sat, 5 May 2018 09:57:02 +0200
Subject: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines

Use structured defines to make it all much more readable.

Before:

 #ifndef atomic_fetch_dec_relaxed

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
 #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
 #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
 #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
 #else /* atomic_fetch_dec */
 #define atomic_fetch_dec_relaxed	atomic_fetch_dec
 #define atomic_fetch_dec_acquire	atomic_fetch_dec
 #define atomic_fetch_dec_release	atomic_fetch_dec
 #endif /* atomic_fetch_dec */

 #else /* atomic_fetch_dec_relaxed */

 #ifndef atomic_fetch_dec_acquire
 #define atomic_fetch_dec_acquire(...)					\
	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec_release
 #define atomic_fetch_dec_release(...)					\
	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(...)						\
	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #endif
 #endif /* atomic_fetch_dec_relaxed */

After:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec_acquire
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec_release
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

Beyond the line-count reduction, this also makes it easier to follow
the various conditions.

Also clean up a few other minor details and make the code more
consistent throughout.

No change in functionality.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 1275 +++++++++++++++++++++---------------------------
 1 file changed, 543 insertions(+), 732 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 01ce3997cb42..dc157c092ae5 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -24,11 +24,11 @@
  */
 
 #ifndef atomic_read_acquire
-#define  atomic_read_acquire(v)		smp_load_acquire(&(v)->counter)
+# define atomic_read_acquire(v)			smp_load_acquire(&(v)->counter)
 #endif
 
 #ifndef atomic_set_release
-#define  atomic_set_release(v, i)	smp_store_release(&(v)->counter, (i))
+# define atomic_set_release(v, i)		smp_store_release(&(v)->counter, (i))
 #endif
 
 /*
@@ -71,454 +71,351 @@
 })
 #endif
 
-/* atomic_add_return_relaxed */
-#ifndef atomic_add_return_relaxed
-#define  atomic_add_return_relaxed	atomic_add_return
-#define  atomic_add_return_acquire	atomic_add_return
-#define  atomic_add_return_release	atomic_add_return
-
-#else /* atomic_add_return_relaxed */
-
-#ifndef atomic_add_return_acquire
-#define  atomic_add_return_acquire(...)					\
-	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
-#endif
+/* atomic_add_return_relaxed() et al: */
 
-#ifndef atomic_add_return_release
-#define  atomic_add_return_release(...)					\
-	__atomic_op_release(atomic_add_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_add_return
-#define  atomic_add_return(...)						\
-	__atomic_op_fence(atomic_add_return, __VA_ARGS__)
-#endif
-#endif /* atomic_add_return_relaxed */
+#ifndef atomic_add_return_relaxed
+# define atomic_add_return_relaxed		atomic_add_return
+# define atomic_add_return_acquire		atomic_add_return
+# define atomic_add_return_release		atomic_add_return
+#else
+# ifndef atomic_add_return_acquire
+#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic_add_return_release
+#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic_add_return
+#  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_inc_return_relaxed() et al: */
 
-/* atomic_inc_return_relaxed */
 #ifndef atomic_inc_return_relaxed
-#define  atomic_inc_return_relaxed	atomic_inc_return
-#define  atomic_inc_return_acquire	atomic_inc_return
-#define  atomic_inc_return_release	atomic_inc_return
-
-#else /* atomic_inc_return_relaxed */
-
-#ifndef atomic_inc_return_acquire
-#define  atomic_inc_return_acquire(...)					\
-	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_inc_return_release
-#define  atomic_inc_return_release(...)					\
-	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_inc_return
-#define  atomic_inc_return(...)						\
-	__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
-#endif
-#endif /* atomic_inc_return_relaxed */
+# define atomic_inc_return_relaxed		atomic_inc_return
+# define atomic_inc_return_acquire		atomic_inc_return
+# define atomic_inc_return_release		atomic_inc_return
+#else
+# ifndef atomic_inc_return_acquire
+#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic_inc_return_release
+#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic_inc_return
+#  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_sub_return_relaxed() et al: */
 
-/* atomic_sub_return_relaxed */
 #ifndef atomic_sub_return_relaxed
-#define  atomic_sub_return_relaxed	atomic_sub_return
-#define  atomic_sub_return_acquire	atomic_sub_return
-#define  atomic_sub_return_release	atomic_sub_return
-
-#else /* atomic_sub_return_relaxed */
-
-#ifndef atomic_sub_return_acquire
-#define  atomic_sub_return_acquire(...)					\
-	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_sub_return_release
-#define  atomic_sub_return_release(...)					\
-	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_sub_return
-#define  atomic_sub_return(...)						\
-	__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
-#endif
-#endif /* atomic_sub_return_relaxed */
+# define atomic_sub_return_relaxed		atomic_sub_return
+# define atomic_sub_return_acquire		atomic_sub_return
+# define atomic_sub_return_release		atomic_sub_return
+#else
+# ifndef atomic_sub_return_acquire
+#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic_sub_return_release
+#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic_sub_return
+#  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_dec_return_relaxed() et al: */
 
-/* atomic_dec_return_relaxed */
 #ifndef atomic_dec_return_relaxed
-#define  atomic_dec_return_relaxed	atomic_dec_return
-#define  atomic_dec_return_acquire	atomic_dec_return
-#define  atomic_dec_return_release	atomic_dec_return
-
-#else /* atomic_dec_return_relaxed */
-
-#ifndef atomic_dec_return_acquire
-#define  atomic_dec_return_acquire(...)					\
-	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
-#endif
+# define atomic_dec_return_relaxed		atomic_dec_return
+# define atomic_dec_return_acquire		atomic_dec_return
+# define atomic_dec_return_release		atomic_dec_return
+#else
+# ifndef atomic_dec_return_acquire
+#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic_dec_return_release
+#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic_dec_return
+#  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_add_relaxed() et al: */
 
-#ifndef atomic_dec_return_release
-#define  atomic_dec_return_release(...)					\
-	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_dec_return
-#define  atomic_dec_return(...)						\
-	__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
-#endif
-#endif /* atomic_dec_return_relaxed */
-
-
-/* atomic_fetch_add_relaxed */
 #ifndef atomic_fetch_add_relaxed
-#define atomic_fetch_add_relaxed	atomic_fetch_add
-#define atomic_fetch_add_acquire	atomic_fetch_add
-#define atomic_fetch_add_release	atomic_fetch_add
-
-#else /* atomic_fetch_add_relaxed */
-
-#ifndef atomic_fetch_add_acquire
-#define atomic_fetch_add_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_add_release
-#define atomic_fetch_add_release(...)					\
-	__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
-#endif
+# define atomic_fetch_add_relaxed		atomic_fetch_add
+# define atomic_fetch_add_acquire		atomic_fetch_add
+# define atomic_fetch_add_release		atomic_fetch_add
+#else
+# ifndef atomic_fetch_add_acquire
+#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_add_release
+#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_add
+#  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_inc_relaxed() et al: */
 
-#ifndef atomic_fetch_add
-#define atomic_fetch_add(...)						\
-	__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_add_relaxed */
-
-/* atomic_fetch_inc_relaxed */
 #ifndef atomic_fetch_inc_relaxed
+# ifndef atomic_fetch_inc
+#  define atomic_fetch_inc(v)			atomic_fetch_add(1, (v))
+#  define atomic_fetch_inc_relaxed(v)		atomic_fetch_add_relaxed(1, (v))
+#  define atomic_fetch_inc_acquire(v)		atomic_fetch_add_acquire(1, (v))
+#  define atomic_fetch_inc_release(v)		atomic_fetch_add_release(1, (v))
+# else
+#  define atomic_fetch_inc_relaxed		atomic_fetch_inc
+#  define atomic_fetch_inc_acquire		atomic_fetch_inc
+#  define atomic_fetch_inc_release		atomic_fetch_inc
+# endif
+#else
+# ifndef atomic_fetch_inc_acquire
+#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_inc_release
+#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_inc
+#  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_sub_relaxed() et al: */
 
-#ifndef atomic_fetch_inc
-#define atomic_fetch_inc(v)	        atomic_fetch_add(1, (v))
-#define atomic_fetch_inc_relaxed(v)	atomic_fetch_add_relaxed(1, (v))
-#define atomic_fetch_inc_acquire(v)	atomic_fetch_add_acquire(1, (v))
-#define atomic_fetch_inc_release(v)	atomic_fetch_add_release(1, (v))
-#else /* atomic_fetch_inc */
-#define atomic_fetch_inc_relaxed	atomic_fetch_inc
-#define atomic_fetch_inc_acquire	atomic_fetch_inc
-#define atomic_fetch_inc_release	atomic_fetch_inc
-#endif /* atomic_fetch_inc */
-
-#else /* atomic_fetch_inc_relaxed */
-
-#ifndef atomic_fetch_inc_acquire
-#define atomic_fetch_inc_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_inc_release
-#define atomic_fetch_inc_release(...)					\
-	__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_inc
-#define atomic_fetch_inc(...)						\
-	__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_inc_relaxed */
-
-/* atomic_fetch_sub_relaxed */
 #ifndef atomic_fetch_sub_relaxed
-#define atomic_fetch_sub_relaxed	atomic_fetch_sub
-#define atomic_fetch_sub_acquire	atomic_fetch_sub
-#define atomic_fetch_sub_release	atomic_fetch_sub
-
-#else /* atomic_fetch_sub_relaxed */
-
-#ifndef atomic_fetch_sub_acquire
-#define atomic_fetch_sub_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
-#endif
+# define atomic_fetch_sub_relaxed		atomic_fetch_sub
+# define atomic_fetch_sub_acquire		atomic_fetch_sub
+# define atomic_fetch_sub_release		atomic_fetch_sub
+#else
+# ifndef atomic_fetch_sub_acquire
+#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_sub_release
+#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_sub
+#  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_dec_relaxed() et al: */
 
-#ifndef atomic_fetch_sub_release
-#define atomic_fetch_sub_release(...)					\
-	__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_sub
-#define atomic_fetch_sub(...)						\
-	__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_sub_relaxed */
-
-/* atomic_fetch_dec_relaxed */
 #ifndef atomic_fetch_dec_relaxed
+# ifndef atomic_fetch_dec
+#  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
+#  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
+#  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
+#  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
+# else
+#  define atomic_fetch_dec_relaxed		atomic_fetch_dec
+#  define atomic_fetch_dec_acquire		atomic_fetch_dec
+#  define atomic_fetch_dec_release		atomic_fetch_dec
+# endif
+#else
+# ifndef atomic_fetch_dec_acquire
+#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_dec_release
+#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_dec
+#  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_or_relaxed() et al: */
 
-#ifndef atomic_fetch_dec
-#define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
-#define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
-#define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
-#define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
-#else /* atomic_fetch_dec */
-#define atomic_fetch_dec_relaxed	atomic_fetch_dec
-#define atomic_fetch_dec_acquire	atomic_fetch_dec
-#define atomic_fetch_dec_release	atomic_fetch_dec
-#endif /* atomic_fetch_dec */
-
-#else /* atomic_fetch_dec_relaxed */
-
-#ifndef atomic_fetch_dec_acquire
-#define atomic_fetch_dec_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_dec_release
-#define atomic_fetch_dec_release(...)					\
-	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_dec
-#define atomic_fetch_dec(...)						\
-	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_dec_relaxed */
-
-/* atomic_fetch_or_relaxed */
 #ifndef atomic_fetch_or_relaxed
-#define atomic_fetch_or_relaxed	atomic_fetch_or
-#define atomic_fetch_or_acquire	atomic_fetch_or
-#define atomic_fetch_or_release	atomic_fetch_or
-
-#else /* atomic_fetch_or_relaxed */
-
-#ifndef atomic_fetch_or_acquire
-#define atomic_fetch_or_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_or_release
-#define atomic_fetch_or_release(...)					\
-	__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_or
-#define atomic_fetch_or(...)						\
-	__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_or_relaxed */
+# define atomic_fetch_or_relaxed		atomic_fetch_or
+# define atomic_fetch_or_acquire		atomic_fetch_or
+# define atomic_fetch_or_release		atomic_fetch_or
+#else
+# ifndef atomic_fetch_or_acquire
+#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_or_release
+#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_or
+#  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_and_relaxed() et al: */
 
-/* atomic_fetch_and_relaxed */
 #ifndef atomic_fetch_and_relaxed
-#define atomic_fetch_and_relaxed	atomic_fetch_and
-#define atomic_fetch_and_acquire	atomic_fetch_and
-#define atomic_fetch_and_release	atomic_fetch_and
-
-#else /* atomic_fetch_and_relaxed */
-
-#ifndef atomic_fetch_and_acquire
-#define atomic_fetch_and_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_and_release
-#define atomic_fetch_and_release(...)					\
-	__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_and
-#define atomic_fetch_and(...)						\
-	__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+# define atomic_fetch_and_relaxed		atomic_fetch_and
+# define atomic_fetch_and_acquire		atomic_fetch_and
+# define atomic_fetch_and_release		atomic_fetch_and
+#else
+# ifndef atomic_fetch_and_acquire
+#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_and_release
+#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_and
+#  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+# endif
 #endif
-#endif /* atomic_fetch_and_relaxed */
 
 #ifdef atomic_andnot
-/* atomic_fetch_andnot_relaxed */
-#ifndef atomic_fetch_andnot_relaxed
-#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot
-#define atomic_fetch_andnot_acquire	atomic_fetch_andnot
-#define atomic_fetch_andnot_release	atomic_fetch_andnot
-
-#else /* atomic_fetch_andnot_relaxed */
 
-#ifndef atomic_fetch_andnot_acquire
-#define atomic_fetch_andnot_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-#endif
+/* atomic_fetch_andnot_relaxed() et al: */
 
-#ifndef atomic_fetch_andnot_release
-#define atomic_fetch_andnot_release(...)					\
-	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+#ifndef atomic_fetch_andnot_relaxed
+# define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
+# define atomic_fetch_andnot_acquire		atomic_fetch_andnot
+# define atomic_fetch_andnot_release		atomic_fetch_andnot
+#else
+# ifndef atomic_fetch_andnot_acquire
+#  define atomic_fetch_andnot_acquire(...)	 __atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_andnot_release
+#  define atomic_fetch_andnot_release(...)	 __atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_andnot
+#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic_fetch_andnot
-#define atomic_fetch_andnot(...)						\
-	__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_andnot_relaxed */
 #endif /* atomic_andnot */
 
-/* atomic_fetch_xor_relaxed */
-#ifndef atomic_fetch_xor_relaxed
-#define atomic_fetch_xor_relaxed	atomic_fetch_xor
-#define atomic_fetch_xor_acquire	atomic_fetch_xor
-#define atomic_fetch_xor_release	atomic_fetch_xor
-
-#else /* atomic_fetch_xor_relaxed */
+/* atomic_fetch_xor_relaxed() et al: */
 
-#ifndef atomic_fetch_xor_acquire
-#define atomic_fetch_xor_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_xor_release
-#define atomic_fetch_xor_release(...)					\
-	__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+#ifndef atomic_fetch_xor_relaxed
+# define atomic_fetch_xor_relaxed		atomic_fetch_xor
+# define atomic_fetch_xor_acquire		atomic_fetch_xor
+# define atomic_fetch_xor_release		atomic_fetch_xor
+#else
+# ifndef atomic_fetch_xor_acquire
+#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_xor_release
+#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_xor
+#  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic_fetch_xor
-#define atomic_fetch_xor(...)						\
-	__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_xor_relaxed */
 
+/* atomic_xchg_relaxed() et al: */
 
-/* atomic_xchg_relaxed */
 #ifndef atomic_xchg_relaxed
-#define  atomic_xchg_relaxed		atomic_xchg
-#define  atomic_xchg_acquire		atomic_xchg
-#define  atomic_xchg_release		atomic_xchg
-
-#else /* atomic_xchg_relaxed */
-
-#ifndef atomic_xchg_acquire
-#define  atomic_xchg_acquire(...)					\
-	__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic_xchg_release
-#define  atomic_xchg_release(...)					\
-	__atomic_op_release(atomic_xchg, __VA_ARGS__)
-#endif
+#define atomic_xchg_relaxed			atomic_xchg
+#define atomic_xchg_acquire			atomic_xchg
+#define atomic_xchg_release			atomic_xchg
+#else
+# ifndef atomic_xchg_acquire
+#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic_xchg_release
+#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic_xchg
+#  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_cmpxchg_relaxed() et al: */
 
-#ifndef atomic_xchg
-#define  atomic_xchg(...)						\
-	__atomic_op_fence(atomic_xchg, __VA_ARGS__)
-#endif
-#endif /* atomic_xchg_relaxed */
-
-/* atomic_cmpxchg_relaxed */
 #ifndef atomic_cmpxchg_relaxed
-#define  atomic_cmpxchg_relaxed		atomic_cmpxchg
-#define  atomic_cmpxchg_acquire		atomic_cmpxchg
-#define  atomic_cmpxchg_release		atomic_cmpxchg
-
-#else /* atomic_cmpxchg_relaxed */
-
-#ifndef atomic_cmpxchg_acquire
-#define  atomic_cmpxchg_acquire(...)					\
-	__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
+# define atomic_cmpxchg_relaxed			atomic_cmpxchg
+# define atomic_cmpxchg_acquire			atomic_cmpxchg
+# define atomic_cmpxchg_release			atomic_cmpxchg
+#else
+# ifndef atomic_cmpxchg_acquire
+#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic_cmpxchg_release
+#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic_cmpxchg
+#  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic_cmpxchg_release
-#define  atomic_cmpxchg_release(...)					\
-	__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic_cmpxchg
-#define  atomic_cmpxchg(...)						\
-	__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
-#endif
-#endif /* atomic_cmpxchg_relaxed */
-
 #ifndef atomic_try_cmpxchg
-
-#define __atomic_try_cmpxchg(type, _p, _po, _n)				\
-({									\
+# define __atomic_try_cmpxchg(type, _p, _po, _n)			\
+  ({									\
 	typeof(_po) __po = (_po);					\
 	typeof(*(_po)) __r, __o = *__po;				\
 	__r = atomic_cmpxchg##type((_p), __o, (_n));			\
 	if (unlikely(__r != __o))					\
 		*__po = __r;						\
 	likely(__r == __o);						\
-})
-
-#define atomic_try_cmpxchg(_p, _po, _n)		__atomic_try_cmpxchg(, _p, _po, _n)
-#define atomic_try_cmpxchg_relaxed(_p, _po, _n)	__atomic_try_cmpxchg(_relaxed, _p, _po, _n)
-#define atomic_try_cmpxchg_acquire(_p, _po, _n)	__atomic_try_cmpxchg(_acquire, _p, _po, _n)
-#define atomic_try_cmpxchg_release(_p, _po, _n)	__atomic_try_cmpxchg(_release, _p, _po, _n)
-
-#else /* atomic_try_cmpxchg */
-#define atomic_try_cmpxchg_relaxed	atomic_try_cmpxchg
-#define atomic_try_cmpxchg_acquire	atomic_try_cmpxchg
-#define atomic_try_cmpxchg_release	atomic_try_cmpxchg
-#endif /* atomic_try_cmpxchg */
-
-/* cmpxchg_relaxed */
-#ifndef cmpxchg_relaxed
-#define  cmpxchg_relaxed		cmpxchg
-#define  cmpxchg_acquire		cmpxchg
-#define  cmpxchg_release		cmpxchg
-
-#else /* cmpxchg_relaxed */
-
-#ifndef cmpxchg_acquire
-#define  cmpxchg_acquire(...)						\
-	__atomic_op_acquire(cmpxchg, __VA_ARGS__)
+  })
+# define atomic_try_cmpxchg(_p, _po, _n)	 __atomic_try_cmpxchg(, _p, _po, _n)
+# define atomic_try_cmpxchg_relaxed(_p, _po, _n) __atomic_try_cmpxchg(_relaxed, _p, _po, _n)
+# define atomic_try_cmpxchg_acquire(_p, _po, _n) __atomic_try_cmpxchg(_acquire, _p, _po, _n)
+# define atomic_try_cmpxchg_release(_p, _po, _n) __atomic_try_cmpxchg(_release, _p, _po, _n)
+#else
+# define atomic_try_cmpxchg_relaxed		atomic_try_cmpxchg
+# define atomic_try_cmpxchg_acquire		atomic_try_cmpxchg
+# define atomic_try_cmpxchg_release		atomic_try_cmpxchg
 #endif
 
-#ifndef cmpxchg_release
-#define  cmpxchg_release(...)						\
-	__atomic_op_release(cmpxchg, __VA_ARGS__)
-#endif
+/* cmpxchg_relaxed() et al: */
 
-#ifndef cmpxchg
-#define  cmpxchg(...)							\
-	__atomic_op_fence(cmpxchg, __VA_ARGS__)
-#endif
-#endif /* cmpxchg_relaxed */
+#ifndef cmpxchg_relaxed
+# define cmpxchg_relaxed			cmpxchg
+# define cmpxchg_acquire			cmpxchg
+# define cmpxchg_release			cmpxchg
+#else
+# ifndef cmpxchg_acquire
+#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
+# endif
+# ifndef cmpxchg_release
+#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
+# endif
+# ifndef cmpxchg
+#  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
+# endif
+#endif
+
+/* cmpxchg64_relaxed() et al: */
 
-/* cmpxchg64_relaxed */
 #ifndef cmpxchg64_relaxed
-#define  cmpxchg64_relaxed		cmpxchg64
-#define  cmpxchg64_acquire		cmpxchg64
-#define  cmpxchg64_release		cmpxchg64
-
-#else /* cmpxchg64_relaxed */
-
-#ifndef cmpxchg64_acquire
-#define  cmpxchg64_acquire(...)						\
-	__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
-#endif
-
-#ifndef cmpxchg64_release
-#define  cmpxchg64_release(...)						\
-	__atomic_op_release(cmpxchg64, __VA_ARGS__)
-#endif
-
-#ifndef cmpxchg64
-#define  cmpxchg64(...)							\
-	__atomic_op_fence(cmpxchg64, __VA_ARGS__)
-#endif
-#endif /* cmpxchg64_relaxed */
+# define cmpxchg64_relaxed			cmpxchg64
+# define cmpxchg64_acquire			cmpxchg64
+# define cmpxchg64_release			cmpxchg64
+#else
+# ifndef cmpxchg64_acquire
+#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
+# endif
+# ifndef cmpxchg64_release
+#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
+# endif
+# ifndef cmpxchg64
+#  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
+# endif
+#endif
+
+/* xchg_relaxed() et al: */
 
-/* xchg_relaxed */
 #ifndef xchg_relaxed
-#define  xchg_relaxed			xchg
-#define  xchg_acquire			xchg
-#define  xchg_release			xchg
-
-#else /* xchg_relaxed */
-
-#ifndef xchg_acquire
-#define  xchg_acquire(...)		__atomic_op_acquire(xchg, __VA_ARGS__)
-#endif
-
-#ifndef xchg_release
-#define  xchg_release(...)		__atomic_op_release(xchg, __VA_ARGS__)
+# define xchg_relaxed				xchg
+# define xchg_acquire				xchg
+# define xchg_release				xchg
+#else
+# ifndef xchg_acquire
+#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
+# endif
+# ifndef xchg_release
+#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
+# endif
+# ifndef xchg
+#  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef xchg
-#define  xchg(...)			__atomic_op_fence(xchg, __VA_ARGS__)
-#endif
-#endif /* xchg_relaxed */
-
 /**
  * atomic_add_unless - add unless the number is already a given value
  * @v: pointer of type atomic_t
@@ -541,7 +438,7 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
  * Returns non-zero if @v was non-zero, and zero otherwise.
  */
 #ifndef atomic_inc_not_zero
-#define atomic_inc_not_zero(v)		atomic_add_unless((v), 1, 0)
+# define atomic_inc_not_zero(v)			atomic_add_unless((v), 1, 0)
 #endif
 
 #ifndef atomic_andnot
@@ -607,6 +504,7 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
 static inline int atomic_inc_unless_negative(atomic_t *p)
 {
 	int v, v1;
+
 	for (v = 0; v >= 0; v = v1) {
 		v1 = atomic_cmpxchg(p, v, v + 1);
 		if (likely(v1 == v))
@@ -620,6 +518,7 @@ static inline int atomic_inc_unless_negative(atomic_t *p)
 static inline int atomic_dec_unless_positive(atomic_t *p)
 {
 	int v, v1;
+
 	for (v = 0; v <= 0; v = v1) {
 		v1 = atomic_cmpxchg(p, v, v - 1);
 		if (likely(v1 == v))
@@ -640,6 +539,7 @@ static inline int atomic_dec_unless_positive(atomic_t *p)
 static inline int atomic_dec_if_positive(atomic_t *v)
 {
 	int c, old, dec;
+
 	c = atomic_read(v);
 	for (;;) {
 		dec = c - 1;
@@ -654,400 +554,311 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 }
 #endif
 
-#define atomic_cond_read_relaxed(v, c)	smp_cond_load_relaxed(&(v)->counter, (c))
-#define atomic_cond_read_acquire(v, c)	smp_cond_load_acquire(&(v)->counter, (c))
+#define atomic_cond_read_relaxed(v, c)		smp_cond_load_relaxed(&(v)->counter, (c))
+#define atomic_cond_read_acquire(v, c)		smp_cond_load_acquire(&(v)->counter, (c))
 
 #ifdef CONFIG_GENERIC_ATOMIC64
 #include <asm-generic/atomic64.h>
 #endif
 
 #ifndef atomic64_read_acquire
-#define  atomic64_read_acquire(v)	smp_load_acquire(&(v)->counter)
+# define atomic64_read_acquire(v)		smp_load_acquire(&(v)->counter)
 #endif
 
 #ifndef atomic64_set_release
-#define  atomic64_set_release(v, i)	smp_store_release(&(v)->counter, (i))
-#endif
-
-/* atomic64_add_return_relaxed */
-#ifndef atomic64_add_return_relaxed
-#define  atomic64_add_return_relaxed	atomic64_add_return
-#define  atomic64_add_return_acquire	atomic64_add_return
-#define  atomic64_add_return_release	atomic64_add_return
-
-#else /* atomic64_add_return_relaxed */
-
-#ifndef atomic64_add_return_acquire
-#define  atomic64_add_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+# define atomic64_set_release(v, i)		smp_store_release(&(v)->counter, (i))
 #endif
 
-#ifndef atomic64_add_return_release
-#define  atomic64_add_return_release(...)				\
-	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
-#endif
+/* atomic64_add_return_relaxed() et al: */
 
-#ifndef atomic64_add_return
-#define  atomic64_add_return(...)					\
-	__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_add_return_relaxed */
+#ifndef atomic64_add_return_relaxed
+# define atomic64_add_return_relaxed		atomic64_add_return
+# define atomic64_add_return_acquire		atomic64_add_return
+# define atomic64_add_return_release		atomic64_add_return
+#else
+# ifndef atomic64_add_return_acquire
+#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_add_return_release
+#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_add_return
+#  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_inc_return_relaxed() et al: */
 
-/* atomic64_inc_return_relaxed */
 #ifndef atomic64_inc_return_relaxed
-#define  atomic64_inc_return_relaxed	atomic64_inc_return
-#define  atomic64_inc_return_acquire	atomic64_inc_return
-#define  atomic64_inc_return_release	atomic64_inc_return
-
-#else /* atomic64_inc_return_relaxed */
-
-#ifndef atomic64_inc_return_acquire
-#define  atomic64_inc_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return_release
-#define  atomic64_inc_return_release(...)				\
-	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return
-#define  atomic64_inc_return(...)					\
-	__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_inc_return_relaxed */
-
+# define atomic64_inc_return_relaxed		atomic64_inc_return
+# define atomic64_inc_return_acquire		atomic64_inc_return
+# define atomic64_inc_return_release		atomic64_inc_return
+#else
+# ifndef atomic64_inc_return_acquire
+#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_inc_return_release
+#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_inc_return
+#  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_sub_return_relaxed() et al: */
 
-/* atomic64_sub_return_relaxed */
 #ifndef atomic64_sub_return_relaxed
-#define  atomic64_sub_return_relaxed	atomic64_sub_return
-#define  atomic64_sub_return_acquire	atomic64_sub_return
-#define  atomic64_sub_return_release	atomic64_sub_return
+# define atomic64_sub_return_relaxed		atomic64_sub_return
+# define atomic64_sub_return_acquire		atomic64_sub_return
+# define atomic64_sub_return_release		atomic64_sub_return
+#else
+# ifndef atomic64_sub_return_acquire
+#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_sub_return_release
+#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_sub_return
+#  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_dec_return_relaxed() et al: */
 
-#else /* atomic64_sub_return_relaxed */
-
-#ifndef atomic64_sub_return_acquire
-#define  atomic64_sub_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return_release
-#define  atomic64_sub_return_release(...)				\
-	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return
-#define  atomic64_sub_return(...)					\
-	__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_sub_return_relaxed */
-
-/* atomic64_dec_return_relaxed */
 #ifndef atomic64_dec_return_relaxed
-#define  atomic64_dec_return_relaxed	atomic64_dec_return
-#define  atomic64_dec_return_acquire	atomic64_dec_return
-#define  atomic64_dec_return_release	atomic64_dec_return
-
-#else /* atomic64_dec_return_relaxed */
-
-#ifndef atomic64_dec_return_acquire
-#define  atomic64_dec_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return_release
-#define  atomic64_dec_return_release(...)				\
-	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return
-#define  atomic64_dec_return(...)					\
-	__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_dec_return_relaxed */
+# define atomic64_dec_return_relaxed		atomic64_dec_return
+# define atomic64_dec_return_acquire		atomic64_dec_return
+# define atomic64_dec_return_release		atomic64_dec_return
+#else
+# ifndef atomic64_dec_return_acquire
+#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_dec_return_release
+#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_dec_return
+#  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_add_relaxed() et al: */
 
-
-/* atomic64_fetch_add_relaxed */
 #ifndef atomic64_fetch_add_relaxed
-#define atomic64_fetch_add_relaxed	atomic64_fetch_add
-#define atomic64_fetch_add_acquire	atomic64_fetch_add
-#define atomic64_fetch_add_release	atomic64_fetch_add
-
-#else /* atomic64_fetch_add_relaxed */
-
-#ifndef atomic64_fetch_add_acquire
-#define atomic64_fetch_add_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
-#endif
+# define atomic64_fetch_add_relaxed		atomic64_fetch_add
+# define atomic64_fetch_add_acquire		atomic64_fetch_add
+# define atomic64_fetch_add_release		atomic64_fetch_add
+#else
+# ifndef atomic64_fetch_add_acquire
+#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_add_release
+#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_add
+#  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_inc_relaxed() et al: */
 
-#ifndef atomic64_fetch_add_release
-#define atomic64_fetch_add_release(...)					\
-	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_add
-#define atomic64_fetch_add(...)						\
-	__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_add_relaxed */
-
-/* atomic64_fetch_inc_relaxed */
 #ifndef atomic64_fetch_inc_relaxed
+# ifndef atomic64_fetch_inc
+#  define atomic64_fetch_inc(v)			atomic64_fetch_add(1, (v))
+#  define atomic64_fetch_inc_relaxed(v)		atomic64_fetch_add_relaxed(1, (v))
+#  define atomic64_fetch_inc_acquire(v)		atomic64_fetch_add_acquire(1, (v))
+#  define atomic64_fetch_inc_release(v)		atomic64_fetch_add_release(1, (v))
+# else
+#  define atomic64_fetch_inc_relaxed		atomic64_fetch_inc
+#  define atomic64_fetch_inc_acquire		atomic64_fetch_inc
+#  define atomic64_fetch_inc_release		atomic64_fetch_inc
+# endif
+#else
+# ifndef atomic64_fetch_inc_acquire
+#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_inc_release
+#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_inc
+#  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_sub_relaxed() et al: */
 
-#ifndef atomic64_fetch_inc
-#define atomic64_fetch_inc(v)		atomic64_fetch_add(1, (v))
-#define atomic64_fetch_inc_relaxed(v)	atomic64_fetch_add_relaxed(1, (v))
-#define atomic64_fetch_inc_acquire(v)	atomic64_fetch_add_acquire(1, (v))
-#define atomic64_fetch_inc_release(v)	atomic64_fetch_add_release(1, (v))
-#else /* atomic64_fetch_inc */
-#define atomic64_fetch_inc_relaxed	atomic64_fetch_inc
-#define atomic64_fetch_inc_acquire	atomic64_fetch_inc
-#define atomic64_fetch_inc_release	atomic64_fetch_inc
-#endif /* atomic64_fetch_inc */
-
-#else /* atomic64_fetch_inc_relaxed */
-
-#ifndef atomic64_fetch_inc_acquire
-#define atomic64_fetch_inc_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_inc_release
-#define atomic64_fetch_inc_release(...)					\
-	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_inc
-#define atomic64_fetch_inc(...)						\
-	__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_inc_relaxed */
-
-/* atomic64_fetch_sub_relaxed */
 #ifndef atomic64_fetch_sub_relaxed
-#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub
-#define atomic64_fetch_sub_acquire	atomic64_fetch_sub
-#define atomic64_fetch_sub_release	atomic64_fetch_sub
-
-#else /* atomic64_fetch_sub_relaxed */
-
-#ifndef atomic64_fetch_sub_acquire
-#define atomic64_fetch_sub_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_sub_release
-#define atomic64_fetch_sub_release(...)					\
-	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_sub
-#define atomic64_fetch_sub(...)						\
-	__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_sub_relaxed */
+# define atomic64_fetch_sub_relaxed		atomic64_fetch_sub
+# define atomic64_fetch_sub_acquire		atomic64_fetch_sub
+# define atomic64_fetch_sub_release		atomic64_fetch_sub
+#else
+# ifndef atomic64_fetch_sub_acquire
+#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_sub_release
+#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_sub
+#  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_dec_relaxed() et al: */
 
-/* atomic64_fetch_dec_relaxed */
 #ifndef atomic64_fetch_dec_relaxed
+# ifndef atomic64_fetch_dec
+#  define atomic64_fetch_dec(v)			atomic64_fetch_sub(1, (v))
+#  define atomic64_fetch_dec_relaxed(v)		atomic64_fetch_sub_relaxed(1, (v))
+#  define atomic64_fetch_dec_acquire(v)		atomic64_fetch_sub_acquire(1, (v))
+#  define atomic64_fetch_dec_release(v)		atomic64_fetch_sub_release(1, (v))
+# else
+#  define atomic64_fetch_dec_relaxed		atomic64_fetch_dec
+#  define atomic64_fetch_dec_acquire		atomic64_fetch_dec
+#  define atomic64_fetch_dec_release		atomic64_fetch_dec
+# endif
+#else
+# ifndef atomic64_fetch_dec_acquire
+#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_dec_release
+#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_dec
+#  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_or_relaxed() et al: */
 
-#ifndef atomic64_fetch_dec
-#define atomic64_fetch_dec(v)		atomic64_fetch_sub(1, (v))
-#define atomic64_fetch_dec_relaxed(v)	atomic64_fetch_sub_relaxed(1, (v))
-#define atomic64_fetch_dec_acquire(v)	atomic64_fetch_sub_acquire(1, (v))
-#define atomic64_fetch_dec_release(v)	atomic64_fetch_sub_release(1, (v))
-#else /* atomic64_fetch_dec */
-#define atomic64_fetch_dec_relaxed	atomic64_fetch_dec
-#define atomic64_fetch_dec_acquire	atomic64_fetch_dec
-#define atomic64_fetch_dec_release	atomic64_fetch_dec
-#endif /* atomic64_fetch_dec */
-
-#else /* atomic64_fetch_dec_relaxed */
-
-#ifndef atomic64_fetch_dec_acquire
-#define atomic64_fetch_dec_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_dec_release
-#define atomic64_fetch_dec_release(...)					\
-	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_dec
-#define atomic64_fetch_dec(...)						\
-	__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_dec_relaxed */
-
-/* atomic64_fetch_or_relaxed */
 #ifndef atomic64_fetch_or_relaxed
-#define atomic64_fetch_or_relaxed	atomic64_fetch_or
-#define atomic64_fetch_or_acquire	atomic64_fetch_or
-#define atomic64_fetch_or_release	atomic64_fetch_or
-
-#else /* atomic64_fetch_or_relaxed */
-
-#ifndef atomic64_fetch_or_acquire
-#define atomic64_fetch_or_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+# define atomic64_fetch_or_relaxed		atomic64_fetch_or
+# define atomic64_fetch_or_acquire		atomic64_fetch_or
+# define atomic64_fetch_or_release		atomic64_fetch_or
+#else
+# ifndef atomic64_fetch_or_acquire
+#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_or_release
+#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_or
+#  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic64_fetch_or_release
-#define atomic64_fetch_or_release(...)					\
-	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
-#endif
 
-#ifndef atomic64_fetch_or
-#define atomic64_fetch_or(...)						\
-	__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_or_relaxed */
+/* atomic64_fetch_and_relaxed() et al: */
 
-/* atomic64_fetch_and_relaxed */
 #ifndef atomic64_fetch_and_relaxed
-#define atomic64_fetch_and_relaxed	atomic64_fetch_and
-#define atomic64_fetch_and_acquire	atomic64_fetch_and
-#define atomic64_fetch_and_release	atomic64_fetch_and
-
-#else /* atomic64_fetch_and_relaxed */
-
-#ifndef atomic64_fetch_and_acquire
-#define atomic64_fetch_and_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+# define atomic64_fetch_and_relaxed		atomic64_fetch_and
+# define atomic64_fetch_and_acquire		atomic64_fetch_and
+# define atomic64_fetch_and_release		atomic64_fetch_and
+#else
+# ifndef atomic64_fetch_and_acquire
+#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_and_release
+#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_and
+#  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic64_fetch_and_release
-#define atomic64_fetch_and_release(...)					\
-	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_and
-#define atomic64_fetch_and(...)						\
-	__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_and_relaxed */
-
 #ifdef atomic64_andnot
-/* atomic64_fetch_andnot_relaxed */
-#ifndef atomic64_fetch_andnot_relaxed
-#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot
-#define atomic64_fetch_andnot_acquire	atomic64_fetch_andnot
-#define atomic64_fetch_andnot_release	atomic64_fetch_andnot
-
-#else /* atomic64_fetch_andnot_relaxed */
 
-#ifndef atomic64_fetch_andnot_acquire
-#define atomic64_fetch_andnot_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-#endif
+/* atomic64_fetch_andnot_relaxed() et al: */
 
-#ifndef atomic64_fetch_andnot_release
-#define atomic64_fetch_andnot_release(...)					\
-	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+#ifndef atomic64_fetch_andnot_relaxed
+# define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_release		atomic64_fetch_andnot
+#else
+# ifndef atomic64_fetch_andnot_acquire
+#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_andnot_release
+#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_andnot
+#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic64_fetch_andnot
-#define atomic64_fetch_andnot(...)						\
-	__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_andnot_relaxed */
 #endif /* atomic64_andnot */
 
-/* atomic64_fetch_xor_relaxed */
-#ifndef atomic64_fetch_xor_relaxed
-#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor
-#define atomic64_fetch_xor_acquire	atomic64_fetch_xor
-#define atomic64_fetch_xor_release	atomic64_fetch_xor
-
-#else /* atomic64_fetch_xor_relaxed */
+/* atomic64_fetch_xor_relaxed() et al: */
 
-#ifndef atomic64_fetch_xor_acquire
-#define atomic64_fetch_xor_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_xor_release
-#define atomic64_fetch_xor_release(...)					\
-	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+#ifndef atomic64_fetch_xor_relaxed
+# define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
+# define atomic64_fetch_xor_acquire		atomic64_fetch_xor
+# define atomic64_fetch_xor_release		atomic64_fetch_xor
+#else
+# ifndef atomic64_fetch_xor_acquire
+#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_xor_release
+#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_xor
+#  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
 #endif
-
-#ifndef atomic64_fetch_xor
-#define atomic64_fetch_xor(...)						\
-	__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
 #endif
-#endif /* atomic64_fetch_xor_relaxed */
 
+/* atomic64_xchg_relaxed() et al: */
 
-/* atomic64_xchg_relaxed */
 #ifndef atomic64_xchg_relaxed
-#define  atomic64_xchg_relaxed		atomic64_xchg
-#define  atomic64_xchg_acquire		atomic64_xchg
-#define  atomic64_xchg_release		atomic64_xchg
-
-#else /* atomic64_xchg_relaxed */
-
-#ifndef atomic64_xchg_acquire
-#define  atomic64_xchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-#endif
+# define atomic64_xchg_relaxed			atomic64_xchg
+# define atomic64_xchg_acquire			atomic64_xchg
+# define atomic64_xchg_release			atomic64_xchg
+#else
+# ifndef atomic64_xchg_acquire
+#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_xchg_release
+#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_xchg
+#  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_cmpxchg_relaxed() et al: */
 
-#ifndef atomic64_xchg_release
-#define  atomic64_xchg_release(...)					\
-	__atomic_op_release(atomic64_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_xchg
-#define  atomic64_xchg(...)						\
-	__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
-#endif
-#endif /* atomic64_xchg_relaxed */
-
-/* atomic64_cmpxchg_relaxed */
 #ifndef atomic64_cmpxchg_relaxed
-#define  atomic64_cmpxchg_relaxed	atomic64_cmpxchg
-#define  atomic64_cmpxchg_acquire	atomic64_cmpxchg
-#define  atomic64_cmpxchg_release	atomic64_cmpxchg
-
-#else /* atomic64_cmpxchg_relaxed */
-
-#ifndef atomic64_cmpxchg_acquire
-#define  atomic64_cmpxchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg_release
-#define  atomic64_cmpxchg_release(...)					\
-	__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg
-#define  atomic64_cmpxchg(...)						\
-	__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+# define atomic64_cmpxchg_relaxed		atomic64_cmpxchg
+# define atomic64_cmpxchg_acquire		atomic64_cmpxchg
+# define atomic64_cmpxchg_release		atomic64_cmpxchg
+#else
+# ifndef atomic64_cmpxchg_acquire
+#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_cmpxchg_release
+#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_cmpxchg
+#  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+# endif
 #endif
-#endif /* atomic64_cmpxchg_relaxed */
 
 #ifndef atomic64_try_cmpxchg
-
-#define __atomic64_try_cmpxchg(type, _p, _po, _n)			\
-({									\
+# define __atomic64_try_cmpxchg(type, _p, _po, _n)			\
+  ({									\
 	typeof(_po) __po = (_po);					\
 	typeof(*(_po)) __r, __o = *__po;				\
 	__r = atomic64_cmpxchg##type((_p), __o, (_n));			\
 	if (unlikely(__r != __o))					\
 		*__po = __r;						\
 	likely(__r == __o);						\
-})
-
-#define atomic64_try_cmpxchg(_p, _po, _n)		__atomic64_try_cmpxchg(, _p, _po, _n)
-#define atomic64_try_cmpxchg_relaxed(_p, _po, _n)	__atomic64_try_cmpxchg(_relaxed, _p, _po, _n)
-#define atomic64_try_cmpxchg_acquire(_p, _po, _n)	__atomic64_try_cmpxchg(_acquire, _p, _po, _n)
-#define atomic64_try_cmpxchg_release(_p, _po, _n)	__atomic64_try_cmpxchg(_release, _p, _po, _n)
-
-#else /* atomic64_try_cmpxchg */
-#define atomic64_try_cmpxchg_relaxed	atomic64_try_cmpxchg
-#define atomic64_try_cmpxchg_acquire	atomic64_try_cmpxchg
-#define atomic64_try_cmpxchg_release	atomic64_try_cmpxchg
-#endif /* atomic64_try_cmpxchg */
+  })
+# define atomic64_try_cmpxchg(_p, _po, _n)	   __atomic64_try_cmpxchg(, _p, _po, _n)
+# define atomic64_try_cmpxchg_relaxed(_p, _po, _n) __atomic64_try_cmpxchg(_relaxed, _p, _po, _n)
+# define atomic64_try_cmpxchg_acquire(_p, _po, _n) __atomic64_try_cmpxchg(_acquire, _p, _po, _n)
+# define atomic64_try_cmpxchg_release(_p, _po, _n) __atomic64_try_cmpxchg(_release, _p, _po, _n)
+#else
+# define atomic64_try_cmpxchg_relaxed		atomic64_try_cmpxchg
+# define atomic64_try_cmpxchg_acquire		atomic64_try_cmpxchg
+# define atomic64_try_cmpxchg_release		atomic64_try_cmpxchg
+#endif
 
 #ifndef atomic64_andnot
 static inline void atomic64_andnot(long long i, atomic64_t *v)

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
@ 2018-05-05  8:11         ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  8:11 UTC (permalink / raw)
  To: linux-arm-kernel


* Mark Rutland <mark.rutland@arm.com> wrote:

> On Fri, May 04, 2018 at 08:01:05PM +0200, Peter Zijlstra wrote:
> > On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote:
> > > Currently <asm-generic/atomic-instrumented.h> only instruments the fully
> > > ordered variants of atomic functions, ignoring the {relaxed,acquire,release}
> > > ordering variants.
> > > 
> > > This patch reworks the header to instrument all ordering variants of the atomic
> > > functions, so that architectures implementing these are instrumented
> > > appropriately.
> > > 
> > > To minimise repetition, a macro is used to generate each variant from a common
> > > template. The {full,relaxed,acquire,release} order variants respectively are
> > > then built using this template, where the architecture provides an
> > > implementation.
> 
> > >  include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++-----
> > >  1 file changed, 1008 insertions(+), 187 deletions(-)
> > 
> > Is there really no way to either generate or further macro compress this?
> 
> I can definitely macro compress this somewhat, but the bulk of the
> repetition will be the ifdeffery, which can't be macro'd away IIUC.

The thing is, the existing #ifdeffery is suboptimal to begin with.

I just did the following cleanups (patch attached):

 include/linux/atomic.h | 1275 +++++++++++++++++++++---------------------------
 1 file changed, 543 insertions(+), 732 deletions(-)

The gist of the changes is the following simplification of the main construct:

Before:

 #ifndef atomic_fetch_dec_relaxed

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
 #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
 #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
 #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
 #else /* atomic_fetch_dec */
 #define atomic_fetch_dec_relaxed	atomic_fetch_dec
 #define atomic_fetch_dec_acquire	atomic_fetch_dec
 #define atomic_fetch_dec_release	atomic_fetch_dec
 #endif /* atomic_fetch_dec */

 #else /* atomic_fetch_dec_relaxed */

 #ifndef atomic_fetch_dec_acquire
 #define atomic_fetch_dec_acquire(...)					\
	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec_release
 #define atomic_fetch_dec_release(...)					\
	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(...)						\
	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #endif
 #endif /* atomic_fetch_dec_relaxed */

After:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec_acquire
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec_release
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

The new variant is readable at a glance, and the hierarchy of defines is very 
obvious as well.
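
For reference, the __atomic_op_*() helpers that all of the above expands to
simply build each ordering variant from the _relaxed form plus explicit
barriers - roughly (from memory, so treat the details as approximate):

 /* Build the {acquire,release,fence} variants from op##_relaxed() + barriers: */
 #define __atomic_op_acquire(op, args...)				\
 ({									\
 	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
 	smp_mb__after_atomic();						\
 	__ret;								\
 })

 #define __atomic_op_release(op, args...)				\
 ({									\
 	smp_mb__before_atomic();					\
 	op##_relaxed(args);						\
 })

 #define __atomic_op_fence(op, args...)					\
 ({									\
 	typeof(op##_relaxed(args)) __ret;				\
 	smp_mb__before_atomic();					\
 	__ret = op##_relaxed(args);					\
 	smp_mb__after_atomic();						\
 	__ret;								\
 })

So an architecture only has to provide the _relaxed primitive and the
acquire/release/fully-ordered variants fall out of it.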

And I think we could do even better - there's absolutely no reason why _every_ 
operation has to be made conditional on a fine-grained level - they are overridden 
in API groups. In fact, allowing individual overrides is arguably a fragility.

So we could do the following simplification on top of that:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

Note how the grouping of APIs based on 'atomic_fetch_dec' is already an assumption 
in the primary !atomic_fetch_dec_relaxed branch.

I much prefer such clear API-mapping constructs to magic macros.
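
To make that concrete, here's a stand-alone toy (userspace, made-up names,
not kernel code) showing how providing only the _relaxed primitive is enough
to generate the other orderings via the same #ifndef layering:

 /* toy.c: build acquire/fence forms from a single relaxed primitive */
 #include <stdio.h>

 /* The "architecture" provides only the relaxed form: */
 static int fetch_dec_relaxed(int *v) { return (*v)--; }
 #define fetch_dec_relaxed fetch_dec_relaxed

 /* Toy stand-ins for the real barriers, so the ordering is visible: */
 #define mb_before()	printf("  <barrier before>\n")
 #define mb_after()	printf("  <barrier after>\n")

 #define __op_acquire(op, args...)				\
 ({								\
 	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);	\
 	mb_after();						\
 	__ret;							\
 })
 #define __op_fence(op, args...)				\
 ({								\
 	typeof(op##_relaxed(args)) __ret;			\
 	mb_before();						\
 	__ret = op##_relaxed(args);				\
 	mb_after();						\
 	__ret;							\
 })

 /* Same shape as the construct above: */
 #ifndef fetch_dec_acquire
 # define fetch_dec_acquire(...)	__op_acquire(fetch_dec, __VA_ARGS__)
 #endif
 #ifndef fetch_dec
 # define fetch_dec(...)		__op_fence(fetch_dec, __VA_ARGS__)
 #endif

 int main(void)
 {
 	int v = 2, old;

 	old = fetch_dec_acquire(&v);	/* relaxed op, then "barrier" */
 	printf("acquire: old=%d new=%d\n", old, v);

 	old = fetch_dec(&v);		/* "barrier", relaxed op, "barrier" */
 	printf("fence:   old=%d new=%d\n", old, v);
 	return 0;
 }

Overriding a whole API group would then just mean replacing the relaxed
primitive (or the fully ordered one) in a single place.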

Thanks,

	Ingo

=============================>
From 0171d4ed840d25befaedcf03e834bb76acb400c0 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@kernel.org>
Date: Sat, 5 May 2018 09:57:02 +0200
Subject: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines

Use structured defines to make it all much more readable.

Before:

 #ifndef atomic_fetch_dec_relaxed

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
 #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
 #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
 #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
 #else /* atomic_fetch_dec */
 #define atomic_fetch_dec_relaxed	atomic_fetch_dec
 #define atomic_fetch_dec_acquire	atomic_fetch_dec
 #define atomic_fetch_dec_release	atomic_fetch_dec
 #endif /* atomic_fetch_dec */

 #else /* atomic_fetch_dec_relaxed */

 #ifndef atomic_fetch_dec_acquire
 #define atomic_fetch_dec_acquire(...)					\
	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec_release
 #define atomic_fetch_dec_release(...)					\
	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(...)						\
	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #endif
 #endif /* atomic_fetch_dec_relaxed */

After:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec_acquire
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec_release
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

Beyond the line-count reduction, this also makes it easier to follow
the various conditions.

Also clean up a few other minor details and make the code more
consistent throughout.

No change in functionality.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 1275 +++++++++++++++++++++---------------------------
 1 file changed, 543 insertions(+), 732 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 01ce3997cb42..dc157c092ae5 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -24,11 +24,11 @@
  */
 
 #ifndef atomic_read_acquire
-#define  atomic_read_acquire(v)		smp_load_acquire(&(v)->counter)
+# define atomic_read_acquire(v)			smp_load_acquire(&(v)->counter)
 #endif
 
 #ifndef atomic_set_release
-#define  atomic_set_release(v, i)	smp_store_release(&(v)->counter, (i))
+# define atomic_set_release(v, i)		smp_store_release(&(v)->counter, (i))
 #endif
 
 /*
@@ -71,454 +71,351 @@
 })
 #endif
 
-/* atomic_add_return_relaxed */
-#ifndef atomic_add_return_relaxed
-#define  atomic_add_return_relaxed	atomic_add_return
-#define  atomic_add_return_acquire	atomic_add_return
-#define  atomic_add_return_release	atomic_add_return
-
-#else /* atomic_add_return_relaxed */
-
-#ifndef atomic_add_return_acquire
-#define  atomic_add_return_acquire(...)					\
-	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
-#endif
+/* atomic_add_return_relaxed() et al: */
 
-#ifndef atomic_add_return_release
-#define  atomic_add_return_release(...)					\
-	__atomic_op_release(atomic_add_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_add_return
-#define  atomic_add_return(...)						\
-	__atomic_op_fence(atomic_add_return, __VA_ARGS__)
-#endif
-#endif /* atomic_add_return_relaxed */
+#ifndef atomic_add_return_relaxed
+# define atomic_add_return_relaxed		atomic_add_return
+# define atomic_add_return_acquire		atomic_add_return
+# define atomic_add_return_release		atomic_add_return
+#else
+# ifndef atomic_add_return_acquire
+#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic_add_return_release
+#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic_add_return
+#  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_inc_return_relaxed() et al: */
 
-/* atomic_inc_return_relaxed */
 #ifndef atomic_inc_return_relaxed
-#define  atomic_inc_return_relaxed	atomic_inc_return
-#define  atomic_inc_return_acquire	atomic_inc_return
-#define  atomic_inc_return_release	atomic_inc_return
-
-#else /* atomic_inc_return_relaxed */
-
-#ifndef atomic_inc_return_acquire
-#define  atomic_inc_return_acquire(...)					\
-	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_inc_return_release
-#define  atomic_inc_return_release(...)					\
-	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_inc_return
-#define  atomic_inc_return(...)						\
-	__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
-#endif
-#endif /* atomic_inc_return_relaxed */
+# define atomic_inc_return_relaxed		atomic_inc_return
+# define atomic_inc_return_acquire		atomic_inc_return
+# define atomic_inc_return_release		atomic_inc_return
+#else
+# ifndef atomic_inc_return_acquire
+#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic_inc_return_release
+#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic_inc_return
+#  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_sub_return_relaxed() et al: */
 
-/* atomic_sub_return_relaxed */
 #ifndef atomic_sub_return_relaxed
-#define  atomic_sub_return_relaxed	atomic_sub_return
-#define  atomic_sub_return_acquire	atomic_sub_return
-#define  atomic_sub_return_release	atomic_sub_return
-
-#else /* atomic_sub_return_relaxed */
-
-#ifndef atomic_sub_return_acquire
-#define  atomic_sub_return_acquire(...)					\
-	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_sub_return_release
-#define  atomic_sub_return_release(...)					\
-	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_sub_return
-#define  atomic_sub_return(...)						\
-	__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
-#endif
-#endif /* atomic_sub_return_relaxed */
+# define atomic_sub_return_relaxed		atomic_sub_return
+# define atomic_sub_return_acquire		atomic_sub_return
+# define atomic_sub_return_release		atomic_sub_return
+#else
+# ifndef atomic_sub_return_acquire
+#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic_sub_return_release
+#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic_sub_return
+#  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_dec_return_relaxed() et al: */
 
-/* atomic_dec_return_relaxed */
 #ifndef atomic_dec_return_relaxed
-#define  atomic_dec_return_relaxed	atomic_dec_return
-#define  atomic_dec_return_acquire	atomic_dec_return
-#define  atomic_dec_return_release	atomic_dec_return
-
-#else /* atomic_dec_return_relaxed */
-
-#ifndef atomic_dec_return_acquire
-#define  atomic_dec_return_acquire(...)					\
-	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
-#endif
+# define atomic_dec_return_relaxed		atomic_dec_return
+# define atomic_dec_return_acquire		atomic_dec_return
+# define atomic_dec_return_release		atomic_dec_return
+#else
+# ifndef atomic_dec_return_acquire
+#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic_dec_return_release
+#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic_dec_return
+#  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_add_relaxed() et al: */
 
-#ifndef atomic_dec_return_release
-#define  atomic_dec_return_release(...)					\
-	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_dec_return
-#define  atomic_dec_return(...)						\
-	__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
-#endif
-#endif /* atomic_dec_return_relaxed */
-
-
-/* atomic_fetch_add_relaxed */
 #ifndef atomic_fetch_add_relaxed
-#define atomic_fetch_add_relaxed	atomic_fetch_add
-#define atomic_fetch_add_acquire	atomic_fetch_add
-#define atomic_fetch_add_release	atomic_fetch_add
-
-#else /* atomic_fetch_add_relaxed */
-
-#ifndef atomic_fetch_add_acquire
-#define atomic_fetch_add_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_add_release
-#define atomic_fetch_add_release(...)					\
-	__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
-#endif
+# define atomic_fetch_add_relaxed		atomic_fetch_add
+# define atomic_fetch_add_acquire		atomic_fetch_add
+# define atomic_fetch_add_release		atomic_fetch_add
+#else
+# ifndef atomic_fetch_add_acquire
+#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_add_release
+#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_add
+#  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_inc_relaxed() et al: */
 
-#ifndef atomic_fetch_add
-#define atomic_fetch_add(...)						\
-	__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_add_relaxed */
-
-/* atomic_fetch_inc_relaxed */
 #ifndef atomic_fetch_inc_relaxed
+# ifndef atomic_fetch_inc
+#  define atomic_fetch_inc(v)			atomic_fetch_add(1, (v))
+#  define atomic_fetch_inc_relaxed(v)		atomic_fetch_add_relaxed(1, (v))
+#  define atomic_fetch_inc_acquire(v)		atomic_fetch_add_acquire(1, (v))
+#  define atomic_fetch_inc_release(v)		atomic_fetch_add_release(1, (v))
+# else
+#  define atomic_fetch_inc_relaxed		atomic_fetch_inc
+#  define atomic_fetch_inc_acquire		atomic_fetch_inc
+#  define atomic_fetch_inc_release		atomic_fetch_inc
+# endif
+#else
+# ifndef atomic_fetch_inc_acquire
+#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_inc_release
+#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_inc
+#  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_sub_relaxed() et al: */
 
-#ifndef atomic_fetch_inc
-#define atomic_fetch_inc(v)	        atomic_fetch_add(1, (v))
-#define atomic_fetch_inc_relaxed(v)	atomic_fetch_add_relaxed(1, (v))
-#define atomic_fetch_inc_acquire(v)	atomic_fetch_add_acquire(1, (v))
-#define atomic_fetch_inc_release(v)	atomic_fetch_add_release(1, (v))
-#else /* atomic_fetch_inc */
-#define atomic_fetch_inc_relaxed	atomic_fetch_inc
-#define atomic_fetch_inc_acquire	atomic_fetch_inc
-#define atomic_fetch_inc_release	atomic_fetch_inc
-#endif /* atomic_fetch_inc */
-
-#else /* atomic_fetch_inc_relaxed */
-
-#ifndef atomic_fetch_inc_acquire
-#define atomic_fetch_inc_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_inc_release
-#define atomic_fetch_inc_release(...)					\
-	__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_inc
-#define atomic_fetch_inc(...)						\
-	__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_inc_relaxed */
-
-/* atomic_fetch_sub_relaxed */
 #ifndef atomic_fetch_sub_relaxed
-#define atomic_fetch_sub_relaxed	atomic_fetch_sub
-#define atomic_fetch_sub_acquire	atomic_fetch_sub
-#define atomic_fetch_sub_release	atomic_fetch_sub
-
-#else /* atomic_fetch_sub_relaxed */
-
-#ifndef atomic_fetch_sub_acquire
-#define atomic_fetch_sub_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
-#endif
+# define atomic_fetch_sub_relaxed		atomic_fetch_sub
+# define atomic_fetch_sub_acquire		atomic_fetch_sub
+# define atomic_fetch_sub_release		atomic_fetch_sub
+#else
+# ifndef atomic_fetch_sub_acquire
+#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_sub_release
+#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_sub
+#  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_dec_relaxed() et al: */
 
-#ifndef atomic_fetch_sub_release
-#define atomic_fetch_sub_release(...)					\
-	__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_sub
-#define atomic_fetch_sub(...)						\
-	__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_sub_relaxed */
-
-/* atomic_fetch_dec_relaxed */
 #ifndef atomic_fetch_dec_relaxed
+# ifndef atomic_fetch_dec
+#  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
+#  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
+#  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
+#  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
+# else
+#  define atomic_fetch_dec_relaxed		atomic_fetch_dec
+#  define atomic_fetch_dec_acquire		atomic_fetch_dec
+#  define atomic_fetch_dec_release		atomic_fetch_dec
+# endif
+#else
+# ifndef atomic_fetch_dec_acquire
+#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_dec_release
+#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_dec
+#  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_or_relaxed() et al: */
 
-#ifndef atomic_fetch_dec
-#define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
-#define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
-#define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
-#define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
-#else /* atomic_fetch_dec */
-#define atomic_fetch_dec_relaxed	atomic_fetch_dec
-#define atomic_fetch_dec_acquire	atomic_fetch_dec
-#define atomic_fetch_dec_release	atomic_fetch_dec
-#endif /* atomic_fetch_dec */
-
-#else /* atomic_fetch_dec_relaxed */
-
-#ifndef atomic_fetch_dec_acquire
-#define atomic_fetch_dec_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_dec_release
-#define atomic_fetch_dec_release(...)					\
-	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_dec
-#define atomic_fetch_dec(...)						\
-	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_dec_relaxed */
-
-/* atomic_fetch_or_relaxed */
 #ifndef atomic_fetch_or_relaxed
-#define atomic_fetch_or_relaxed	atomic_fetch_or
-#define atomic_fetch_or_acquire	atomic_fetch_or
-#define atomic_fetch_or_release	atomic_fetch_or
-
-#else /* atomic_fetch_or_relaxed */
-
-#ifndef atomic_fetch_or_acquire
-#define atomic_fetch_or_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_or_release
-#define atomic_fetch_or_release(...)					\
-	__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_or
-#define atomic_fetch_or(...)						\
-	__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_or_relaxed */
+# define atomic_fetch_or_relaxed		atomic_fetch_or
+# define atomic_fetch_or_acquire		atomic_fetch_or
+# define atomic_fetch_or_release		atomic_fetch_or
+#else
+# ifndef atomic_fetch_or_acquire
+#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_or_release
+#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_or
+#  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_and_relaxed() et al: */
 
-/* atomic_fetch_and_relaxed */
 #ifndef atomic_fetch_and_relaxed
-#define atomic_fetch_and_relaxed	atomic_fetch_and
-#define atomic_fetch_and_acquire	atomic_fetch_and
-#define atomic_fetch_and_release	atomic_fetch_and
-
-#else /* atomic_fetch_and_relaxed */
-
-#ifndef atomic_fetch_and_acquire
-#define atomic_fetch_and_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_and_release
-#define atomic_fetch_and_release(...)					\
-	__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_and
-#define atomic_fetch_and(...)						\
-	__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+# define atomic_fetch_and_relaxed		atomic_fetch_and
+# define atomic_fetch_and_acquire		atomic_fetch_and
+# define atomic_fetch_and_release		atomic_fetch_and
+#else
+# ifndef atomic_fetch_and_acquire
+#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_and_release
+#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_and
+#  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+# endif
 #endif
-#endif /* atomic_fetch_and_relaxed */
 
 #ifdef atomic_andnot
-/* atomic_fetch_andnot_relaxed */
-#ifndef atomic_fetch_andnot_relaxed
-#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot
-#define atomic_fetch_andnot_acquire	atomic_fetch_andnot
-#define atomic_fetch_andnot_release	atomic_fetch_andnot
-
-#else /* atomic_fetch_andnot_relaxed */
 
-#ifndef atomic_fetch_andnot_acquire
-#define atomic_fetch_andnot_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-#endif
+/* atomic_fetch_andnot_relaxed() et al: */
 
-#ifndef atomic_fetch_andnot_release
-#define atomic_fetch_andnot_release(...)					\
-	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+#ifndef atomic_fetch_andnot_relaxed
+# define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
+# define atomic_fetch_andnot_acquire		atomic_fetch_andnot
+# define atomic_fetch_andnot_release		atomic_fetch_andnot
+#else
+# ifndef atomic_fetch_andnot_acquire
+#  define atomic_fetch_andnot_acquire(...)	 __atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_andnot_release
+#  define atomic_fetch_andnot_release(...)	 __atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_andnot
+#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic_fetch_andnot
-#define atomic_fetch_andnot(...)						\
-	__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_andnot_relaxed */
 #endif /* atomic_andnot */
 
-/* atomic_fetch_xor_relaxed */
-#ifndef atomic_fetch_xor_relaxed
-#define atomic_fetch_xor_relaxed	atomic_fetch_xor
-#define atomic_fetch_xor_acquire	atomic_fetch_xor
-#define atomic_fetch_xor_release	atomic_fetch_xor
-
-#else /* atomic_fetch_xor_relaxed */
+/* atomic_fetch_xor_relaxed() et al: */
 
-#ifndef atomic_fetch_xor_acquire
-#define atomic_fetch_xor_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_xor_release
-#define atomic_fetch_xor_release(...)					\
-	__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+#ifndef atomic_fetch_xor_relaxed
+# define atomic_fetch_xor_relaxed		atomic_fetch_xor
+# define atomic_fetch_xor_acquire		atomic_fetch_xor
+# define atomic_fetch_xor_release		atomic_fetch_xor
+#else
+# ifndef atomic_fetch_xor_acquire
+#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_xor_release
+#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_xor
+#  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic_fetch_xor
-#define atomic_fetch_xor(...)						\
-	__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_xor_relaxed */
 
+/* atomic_xchg_relaxed() et al: */
 
-/* atomic_xchg_relaxed */
 #ifndef atomic_xchg_relaxed
-#define  atomic_xchg_relaxed		atomic_xchg
-#define  atomic_xchg_acquire		atomic_xchg
-#define  atomic_xchg_release		atomic_xchg
-
-#else /* atomic_xchg_relaxed */
-
-#ifndef atomic_xchg_acquire
-#define  atomic_xchg_acquire(...)					\
-	__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic_xchg_release
-#define  atomic_xchg_release(...)					\
-	__atomic_op_release(atomic_xchg, __VA_ARGS__)
-#endif
+#define atomic_xchg_relaxed			atomic_xchg
+#define atomic_xchg_acquire			atomic_xchg
+#define atomic_xchg_release			atomic_xchg
+#else
+# ifndef atomic_xchg_acquire
+#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic_xchg_release
+#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic_xchg
+#  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_cmpxchg_relaxed() et al: */
 
-#ifndef atomic_xchg
-#define  atomic_xchg(...)						\
-	__atomic_op_fence(atomic_xchg, __VA_ARGS__)
-#endif
-#endif /* atomic_xchg_relaxed */
-
-/* atomic_cmpxchg_relaxed */
 #ifndef atomic_cmpxchg_relaxed
-#define  atomic_cmpxchg_relaxed		atomic_cmpxchg
-#define  atomic_cmpxchg_acquire		atomic_cmpxchg
-#define  atomic_cmpxchg_release		atomic_cmpxchg
-
-#else /* atomic_cmpxchg_relaxed */
-
-#ifndef atomic_cmpxchg_acquire
-#define  atomic_cmpxchg_acquire(...)					\
-	__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
+# define atomic_cmpxchg_relaxed			atomic_cmpxchg
+# define atomic_cmpxchg_acquire			atomic_cmpxchg
+# define atomic_cmpxchg_release			atomic_cmpxchg
+#else
+# ifndef atomic_cmpxchg_acquire
+#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic_cmpxchg_release
+#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic_cmpxchg
+#  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic_cmpxchg_release
-#define  atomic_cmpxchg_release(...)					\
-	__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic_cmpxchg
-#define  atomic_cmpxchg(...)						\
-	__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
-#endif
-#endif /* atomic_cmpxchg_relaxed */
-
 #ifndef atomic_try_cmpxchg
-
-#define __atomic_try_cmpxchg(type, _p, _po, _n)				\
-({									\
+# define __atomic_try_cmpxchg(type, _p, _po, _n)			\
+  ({									\
 	typeof(_po) __po = (_po);					\
 	typeof(*(_po)) __r, __o = *__po;				\
 	__r = atomic_cmpxchg##type((_p), __o, (_n));			\
 	if (unlikely(__r != __o))					\
 		*__po = __r;						\
 	likely(__r == __o);						\
-})
-
-#define atomic_try_cmpxchg(_p, _po, _n)		__atomic_try_cmpxchg(, _p, _po, _n)
-#define atomic_try_cmpxchg_relaxed(_p, _po, _n)	__atomic_try_cmpxchg(_relaxed, _p, _po, _n)
-#define atomic_try_cmpxchg_acquire(_p, _po, _n)	__atomic_try_cmpxchg(_acquire, _p, _po, _n)
-#define atomic_try_cmpxchg_release(_p, _po, _n)	__atomic_try_cmpxchg(_release, _p, _po, _n)
-
-#else /* atomic_try_cmpxchg */
-#define atomic_try_cmpxchg_relaxed	atomic_try_cmpxchg
-#define atomic_try_cmpxchg_acquire	atomic_try_cmpxchg
-#define atomic_try_cmpxchg_release	atomic_try_cmpxchg
-#endif /* atomic_try_cmpxchg */
-
-/* cmpxchg_relaxed */
-#ifndef cmpxchg_relaxed
-#define  cmpxchg_relaxed		cmpxchg
-#define  cmpxchg_acquire		cmpxchg
-#define  cmpxchg_release		cmpxchg
-
-#else /* cmpxchg_relaxed */
-
-#ifndef cmpxchg_acquire
-#define  cmpxchg_acquire(...)						\
-	__atomic_op_acquire(cmpxchg, __VA_ARGS__)
+  })
+# define atomic_try_cmpxchg(_p, _po, _n)	 __atomic_try_cmpxchg(, _p, _po, _n)
+# define atomic_try_cmpxchg_relaxed(_p, _po, _n) __atomic_try_cmpxchg(_relaxed, _p, _po, _n)
+# define atomic_try_cmpxchg_acquire(_p, _po, _n) __atomic_try_cmpxchg(_acquire, _p, _po, _n)
+# define atomic_try_cmpxchg_release(_p, _po, _n) __atomic_try_cmpxchg(_release, _p, _po, _n)
+#else
+# define atomic_try_cmpxchg_relaxed		atomic_try_cmpxchg
+# define atomic_try_cmpxchg_acquire		atomic_try_cmpxchg
+# define atomic_try_cmpxchg_release		atomic_try_cmpxchg
 #endif
 
-#ifndef cmpxchg_release
-#define  cmpxchg_release(...)						\
-	__atomic_op_release(cmpxchg, __VA_ARGS__)
-#endif
+/* cmpxchg_relaxed() et al: */
 
-#ifndef cmpxchg
-#define  cmpxchg(...)							\
-	__atomic_op_fence(cmpxchg, __VA_ARGS__)
-#endif
-#endif /* cmpxchg_relaxed */
+#ifndef cmpxchg_relaxed
+# define cmpxchg_relaxed			cmpxchg
+# define cmpxchg_acquire			cmpxchg
+# define cmpxchg_release			cmpxchg
+#else
+# ifndef cmpxchg_acquire
+#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
+# endif
+# ifndef cmpxchg_release
+#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
+# endif
+# ifndef cmpxchg
+#  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
+# endif
+#endif
+
+/* cmpxchg64_relaxed() et al: */
 
-/* cmpxchg64_relaxed */
 #ifndef cmpxchg64_relaxed
-#define  cmpxchg64_relaxed		cmpxchg64
-#define  cmpxchg64_acquire		cmpxchg64
-#define  cmpxchg64_release		cmpxchg64
-
-#else /* cmpxchg64_relaxed */
-
-#ifndef cmpxchg64_acquire
-#define  cmpxchg64_acquire(...)						\
-	__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
-#endif
-
-#ifndef cmpxchg64_release
-#define  cmpxchg64_release(...)						\
-	__atomic_op_release(cmpxchg64, __VA_ARGS__)
-#endif
-
-#ifndef cmpxchg64
-#define  cmpxchg64(...)							\
-	__atomic_op_fence(cmpxchg64, __VA_ARGS__)
-#endif
-#endif /* cmpxchg64_relaxed */
+# define cmpxchg64_relaxed			cmpxchg64
+# define cmpxchg64_acquire			cmpxchg64
+# define cmpxchg64_release			cmpxchg64
+#else
+# ifndef cmpxchg64_acquire
+#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
+# endif
+# ifndef cmpxchg64_release
+#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
+# endif
+# ifndef cmpxchg64
+#  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
+# endif
+#endif
+
+/* xchg_relaxed() et al: */
 
-/* xchg_relaxed */
 #ifndef xchg_relaxed
-#define  xchg_relaxed			xchg
-#define  xchg_acquire			xchg
-#define  xchg_release			xchg
-
-#else /* xchg_relaxed */
-
-#ifndef xchg_acquire
-#define  xchg_acquire(...)		__atomic_op_acquire(xchg, __VA_ARGS__)
-#endif
-
-#ifndef xchg_release
-#define  xchg_release(...)		__atomic_op_release(xchg, __VA_ARGS__)
+# define xchg_relaxed				xchg
+# define xchg_acquire				xchg
+# define xchg_release				xchg
+#else
+# ifndef xchg_acquire
+#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
+# endif
+# ifndef xchg_release
+#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
+# endif
+# ifndef xchg
+#  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef xchg
-#define  xchg(...)			__atomic_op_fence(xchg, __VA_ARGS__)
-#endif
-#endif /* xchg_relaxed */
-
 /**
  * atomic_add_unless - add unless the number is already a given value
  * @v: pointer of type atomic_t
@@ -541,7 +438,7 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
  * Returns non-zero if @v was non-zero, and zero otherwise.
  */
 #ifndef atomic_inc_not_zero
-#define atomic_inc_not_zero(v)		atomic_add_unless((v), 1, 0)
+# define atomic_inc_not_zero(v)			atomic_add_unless((v), 1, 0)
 #endif
 
 #ifndef atomic_andnot
@@ -607,6 +504,7 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
 static inline int atomic_inc_unless_negative(atomic_t *p)
 {
 	int v, v1;
+
 	for (v = 0; v >= 0; v = v1) {
 		v1 = atomic_cmpxchg(p, v, v + 1);
 		if (likely(v1 == v))
@@ -620,6 +518,7 @@ static inline int atomic_inc_unless_negative(atomic_t *p)
 static inline int atomic_dec_unless_positive(atomic_t *p)
 {
 	int v, v1;
+
 	for (v = 0; v <= 0; v = v1) {
 		v1 = atomic_cmpxchg(p, v, v - 1);
 		if (likely(v1 == v))
@@ -640,6 +539,7 @@ static inline int atomic_dec_unless_positive(atomic_t *p)
 static inline int atomic_dec_if_positive(atomic_t *v)
 {
 	int c, old, dec;
+
 	c = atomic_read(v);
 	for (;;) {
 		dec = c - 1;
@@ -654,400 +554,311 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 }
 #endif
 
-#define atomic_cond_read_relaxed(v, c)	smp_cond_load_relaxed(&(v)->counter, (c))
-#define atomic_cond_read_acquire(v, c)	smp_cond_load_acquire(&(v)->counter, (c))
+#define atomic_cond_read_relaxed(v, c)		smp_cond_load_relaxed(&(v)->counter, (c))
+#define atomic_cond_read_acquire(v, c)		smp_cond_load_acquire(&(v)->counter, (c))
 
 #ifdef CONFIG_GENERIC_ATOMIC64
 #include <asm-generic/atomic64.h>
 #endif
 
 #ifndef atomic64_read_acquire
-#define  atomic64_read_acquire(v)	smp_load_acquire(&(v)->counter)
+# define atomic64_read_acquire(v)		smp_load_acquire(&(v)->counter)
 #endif
 
 #ifndef atomic64_set_release
-#define  atomic64_set_release(v, i)	smp_store_release(&(v)->counter, (i))
-#endif
-
-/* atomic64_add_return_relaxed */
-#ifndef atomic64_add_return_relaxed
-#define  atomic64_add_return_relaxed	atomic64_add_return
-#define  atomic64_add_return_acquire	atomic64_add_return
-#define  atomic64_add_return_release	atomic64_add_return
-
-#else /* atomic64_add_return_relaxed */
-
-#ifndef atomic64_add_return_acquire
-#define  atomic64_add_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+# define atomic64_set_release(v, i)		smp_store_release(&(v)->counter, (i))
 #endif
 
-#ifndef atomic64_add_return_release
-#define  atomic64_add_return_release(...)				\
-	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
-#endif
+/* atomic64_add_return_relaxed() et al: */
 
-#ifndef atomic64_add_return
-#define  atomic64_add_return(...)					\
-	__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_add_return_relaxed */
+#ifndef atomic64_add_return_relaxed
+# define atomic64_add_return_relaxed		atomic64_add_return
+# define atomic64_add_return_acquire		atomic64_add_return
+# define atomic64_add_return_release		atomic64_add_return
+#else
+# ifndef atomic64_add_return_acquire
+#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_add_return_release
+#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_add_return
+#  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_inc_return_relaxed() et al: */
 
-/* atomic64_inc_return_relaxed */
 #ifndef atomic64_inc_return_relaxed
-#define  atomic64_inc_return_relaxed	atomic64_inc_return
-#define  atomic64_inc_return_acquire	atomic64_inc_return
-#define  atomic64_inc_return_release	atomic64_inc_return
-
-#else /* atomic64_inc_return_relaxed */
-
-#ifndef atomic64_inc_return_acquire
-#define  atomic64_inc_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return_release
-#define  atomic64_inc_return_release(...)				\
-	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return
-#define  atomic64_inc_return(...)					\
-	__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_inc_return_relaxed */
-
+# define atomic64_inc_return_relaxed		atomic64_inc_return
+# define atomic64_inc_return_acquire		atomic64_inc_return
+# define atomic64_inc_return_release		atomic64_inc_return
+#else
+# ifndef atomic64_inc_return_acquire
+#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_inc_return_release
+#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_inc_return
+#  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_sub_return_relaxed() et al: */
 
-/* atomic64_sub_return_relaxed */
 #ifndef atomic64_sub_return_relaxed
-#define  atomic64_sub_return_relaxed	atomic64_sub_return
-#define  atomic64_sub_return_acquire	atomic64_sub_return
-#define  atomic64_sub_return_release	atomic64_sub_return
+# define atomic64_sub_return_relaxed		atomic64_sub_return
+# define atomic64_sub_return_acquire		atomic64_sub_return
+# define atomic64_sub_return_release		atomic64_sub_return
+#else
+# ifndef atomic64_sub_return_acquire
+#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_sub_return_release
+#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_sub_return
+#  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_dec_return_relaxed() et al: */
 
-#else /* atomic64_sub_return_relaxed */
-
-#ifndef atomic64_sub_return_acquire
-#define  atomic64_sub_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return_release
-#define  atomic64_sub_return_release(...)				\
-	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return
-#define  atomic64_sub_return(...)					\
-	__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_sub_return_relaxed */
-
-/* atomic64_dec_return_relaxed */
 #ifndef atomic64_dec_return_relaxed
-#define  atomic64_dec_return_relaxed	atomic64_dec_return
-#define  atomic64_dec_return_acquire	atomic64_dec_return
-#define  atomic64_dec_return_release	atomic64_dec_return
-
-#else /* atomic64_dec_return_relaxed */
-
-#ifndef atomic64_dec_return_acquire
-#define  atomic64_dec_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return_release
-#define  atomic64_dec_return_release(...)				\
-	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return
-#define  atomic64_dec_return(...)					\
-	__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_dec_return_relaxed */
+# define atomic64_dec_return_relaxed		atomic64_dec_return
+# define atomic64_dec_return_acquire		atomic64_dec_return
+# define atomic64_dec_return_release		atomic64_dec_return
+#else
+# ifndef atomic64_dec_return_acquire
+#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_dec_return_release
+#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_dec_return
+#  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_add_relaxed() et al: */
 
-
-/* atomic64_fetch_add_relaxed */
 #ifndef atomic64_fetch_add_relaxed
-#define atomic64_fetch_add_relaxed	atomic64_fetch_add
-#define atomic64_fetch_add_acquire	atomic64_fetch_add
-#define atomic64_fetch_add_release	atomic64_fetch_add
-
-#else /* atomic64_fetch_add_relaxed */
-
-#ifndef atomic64_fetch_add_acquire
-#define atomic64_fetch_add_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
-#endif
+# define atomic64_fetch_add_relaxed		atomic64_fetch_add
+# define atomic64_fetch_add_acquire		atomic64_fetch_add
+# define atomic64_fetch_add_release		atomic64_fetch_add
+#else
+# ifndef atomic64_fetch_add_acquire
+#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_add_release
+#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_add
+#  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_inc_relaxed() et al: */
 
-#ifndef atomic64_fetch_add_release
-#define atomic64_fetch_add_release(...)					\
-	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_add
-#define atomic64_fetch_add(...)						\
-	__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_add_relaxed */
-
-/* atomic64_fetch_inc_relaxed */
 #ifndef atomic64_fetch_inc_relaxed
+# ifndef atomic64_fetch_inc
+#  define atomic64_fetch_inc(v)			atomic64_fetch_add(1, (v))
+#  define atomic64_fetch_inc_relaxed(v)		atomic64_fetch_add_relaxed(1, (v))
+#  define atomic64_fetch_inc_acquire(v)		atomic64_fetch_add_acquire(1, (v))
+#  define atomic64_fetch_inc_release(v)		atomic64_fetch_add_release(1, (v))
+# else
+#  define atomic64_fetch_inc_relaxed		atomic64_fetch_inc
+#  define atomic64_fetch_inc_acquire		atomic64_fetch_inc
+#  define atomic64_fetch_inc_release		atomic64_fetch_inc
+# endif
+#else
+# ifndef atomic64_fetch_inc_acquire
+#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_inc_release
+#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_inc
+#  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_sub_relaxed() et al: */
 
-#ifndef atomic64_fetch_inc
-#define atomic64_fetch_inc(v)		atomic64_fetch_add(1, (v))
-#define atomic64_fetch_inc_relaxed(v)	atomic64_fetch_add_relaxed(1, (v))
-#define atomic64_fetch_inc_acquire(v)	atomic64_fetch_add_acquire(1, (v))
-#define atomic64_fetch_inc_release(v)	atomic64_fetch_add_release(1, (v))
-#else /* atomic64_fetch_inc */
-#define atomic64_fetch_inc_relaxed	atomic64_fetch_inc
-#define atomic64_fetch_inc_acquire	atomic64_fetch_inc
-#define atomic64_fetch_inc_release	atomic64_fetch_inc
-#endif /* atomic64_fetch_inc */
-
-#else /* atomic64_fetch_inc_relaxed */
-
-#ifndef atomic64_fetch_inc_acquire
-#define atomic64_fetch_inc_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_inc_release
-#define atomic64_fetch_inc_release(...)					\
-	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_inc
-#define atomic64_fetch_inc(...)						\
-	__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_inc_relaxed */
-
-/* atomic64_fetch_sub_relaxed */
 #ifndef atomic64_fetch_sub_relaxed
-#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub
-#define atomic64_fetch_sub_acquire	atomic64_fetch_sub
-#define atomic64_fetch_sub_release	atomic64_fetch_sub
-
-#else /* atomic64_fetch_sub_relaxed */
-
-#ifndef atomic64_fetch_sub_acquire
-#define atomic64_fetch_sub_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_sub_release
-#define atomic64_fetch_sub_release(...)					\
-	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_sub
-#define atomic64_fetch_sub(...)						\
-	__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_sub_relaxed */
+# define atomic64_fetch_sub_relaxed		atomic64_fetch_sub
+# define atomic64_fetch_sub_acquire		atomic64_fetch_sub
+# define atomic64_fetch_sub_release		atomic64_fetch_sub
+#else
+# ifndef atomic64_fetch_sub_acquire
+#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_sub_release
+#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_sub
+#  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_dec_relaxed() et al: */
 
-/* atomic64_fetch_dec_relaxed */
 #ifndef atomic64_fetch_dec_relaxed
+# ifndef atomic64_fetch_dec
+#  define atomic64_fetch_dec(v)			atomic64_fetch_sub(1, (v))
+#  define atomic64_fetch_dec_relaxed(v)		atomic64_fetch_sub_relaxed(1, (v))
+#  define atomic64_fetch_dec_acquire(v)		atomic64_fetch_sub_acquire(1, (v))
+#  define atomic64_fetch_dec_release(v)		atomic64_fetch_sub_release(1, (v))
+# else
+#  define atomic64_fetch_dec_relaxed		atomic64_fetch_dec
+#  define atomic64_fetch_dec_acquire		atomic64_fetch_dec
+#  define atomic64_fetch_dec_release		atomic64_fetch_dec
+# endif
+#else
+# ifndef atomic64_fetch_dec_acquire
+#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_dec_release
+#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_dec
+#  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_or_relaxed() et al: */
 
-#ifndef atomic64_fetch_dec
-#define atomic64_fetch_dec(v)		atomic64_fetch_sub(1, (v))
-#define atomic64_fetch_dec_relaxed(v)	atomic64_fetch_sub_relaxed(1, (v))
-#define atomic64_fetch_dec_acquire(v)	atomic64_fetch_sub_acquire(1, (v))
-#define atomic64_fetch_dec_release(v)	atomic64_fetch_sub_release(1, (v))
-#else /* atomic64_fetch_dec */
-#define atomic64_fetch_dec_relaxed	atomic64_fetch_dec
-#define atomic64_fetch_dec_acquire	atomic64_fetch_dec
-#define atomic64_fetch_dec_release	atomic64_fetch_dec
-#endif /* atomic64_fetch_dec */
-
-#else /* atomic64_fetch_dec_relaxed */
-
-#ifndef atomic64_fetch_dec_acquire
-#define atomic64_fetch_dec_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_dec_release
-#define atomic64_fetch_dec_release(...)					\
-	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_dec
-#define atomic64_fetch_dec(...)						\
-	__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_dec_relaxed */
-
-/* atomic64_fetch_or_relaxed */
 #ifndef atomic64_fetch_or_relaxed
-#define atomic64_fetch_or_relaxed	atomic64_fetch_or
-#define atomic64_fetch_or_acquire	atomic64_fetch_or
-#define atomic64_fetch_or_release	atomic64_fetch_or
-
-#else /* atomic64_fetch_or_relaxed */
-
-#ifndef atomic64_fetch_or_acquire
-#define atomic64_fetch_or_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+# define atomic64_fetch_or_relaxed		atomic64_fetch_or
+# define atomic64_fetch_or_acquire		atomic64_fetch_or
+# define atomic64_fetch_or_release		atomic64_fetch_or
+#else
+# ifndef atomic64_fetch_or_acquire
+#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_or_release
+#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_or
+#  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic64_fetch_or_release
-#define atomic64_fetch_or_release(...)					\
-	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
-#endif
 
-#ifndef atomic64_fetch_or
-#define atomic64_fetch_or(...)						\
-	__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_or_relaxed */
+/* atomic64_fetch_and_relaxed() et al: */
 
-/* atomic64_fetch_and_relaxed */
 #ifndef atomic64_fetch_and_relaxed
-#define atomic64_fetch_and_relaxed	atomic64_fetch_and
-#define atomic64_fetch_and_acquire	atomic64_fetch_and
-#define atomic64_fetch_and_release	atomic64_fetch_and
-
-#else /* atomic64_fetch_and_relaxed */
-
-#ifndef atomic64_fetch_and_acquire
-#define atomic64_fetch_and_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+# define atomic64_fetch_and_relaxed		atomic64_fetch_and
+# define atomic64_fetch_and_acquire		atomic64_fetch_and
+# define atomic64_fetch_and_release		atomic64_fetch_and
+#else
+# ifndef atomic64_fetch_and_acquire
+#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_and_release
+#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_and
+#  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic64_fetch_and_release
-#define atomic64_fetch_and_release(...)					\
-	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_and
-#define atomic64_fetch_and(...)						\
-	__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_and_relaxed */
-
 #ifdef atomic64_andnot
-/* atomic64_fetch_andnot_relaxed */
-#ifndef atomic64_fetch_andnot_relaxed
-#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot
-#define atomic64_fetch_andnot_acquire	atomic64_fetch_andnot
-#define atomic64_fetch_andnot_release	atomic64_fetch_andnot
-
-#else /* atomic64_fetch_andnot_relaxed */
 
-#ifndef atomic64_fetch_andnot_acquire
-#define atomic64_fetch_andnot_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-#endif
+/* atomic64_fetch_andnot_relaxed() et al: */
 
-#ifndef atomic64_fetch_andnot_release
-#define atomic64_fetch_andnot_release(...)					\
-	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+#ifndef atomic64_fetch_andnot_relaxed
+# define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_release		atomic64_fetch_andnot
+#else
+# ifndef atomic64_fetch_andnot_acquire
+#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_andnot_release
+#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_andnot
+#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic64_fetch_andnot
-#define atomic64_fetch_andnot(...)						\
-	__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_andnot_relaxed */
 #endif /* atomic64_andnot */
 
-/* atomic64_fetch_xor_relaxed */
-#ifndef atomic64_fetch_xor_relaxed
-#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor
-#define atomic64_fetch_xor_acquire	atomic64_fetch_xor
-#define atomic64_fetch_xor_release	atomic64_fetch_xor
-
-#else /* atomic64_fetch_xor_relaxed */
+/* atomic64_fetch_xor_relaxed() et al: */
 
-#ifndef atomic64_fetch_xor_acquire
-#define atomic64_fetch_xor_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_xor_release
-#define atomic64_fetch_xor_release(...)					\
-	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+#ifndef atomic64_fetch_xor_relaxed
+# define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
+# define atomic64_fetch_xor_acquire		atomic64_fetch_xor
+# define atomic64_fetch_xor_release		atomic64_fetch_xor
+#else
+# ifndef atomic64_fetch_xor_acquire
+#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_xor_release
+#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_xor
+#  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
 #endif
-
-#ifndef atomic64_fetch_xor
-#define atomic64_fetch_xor(...)						\
-	__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
 #endif
-#endif /* atomic64_fetch_xor_relaxed */
 
+/* atomic64_xchg_relaxed() et al: */
 
-/* atomic64_xchg_relaxed */
 #ifndef atomic64_xchg_relaxed
-#define  atomic64_xchg_relaxed		atomic64_xchg
-#define  atomic64_xchg_acquire		atomic64_xchg
-#define  atomic64_xchg_release		atomic64_xchg
-
-#else /* atomic64_xchg_relaxed */
-
-#ifndef atomic64_xchg_acquire
-#define  atomic64_xchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-#endif
+# define atomic64_xchg_relaxed			atomic64_xchg
+# define atomic64_xchg_acquire			atomic64_xchg
+# define atomic64_xchg_release			atomic64_xchg
+#else
+# ifndef atomic64_xchg_acquire
+#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_xchg_release
+#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_xchg
+#  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_cmpxchg_relaxed() et al: */
 
-#ifndef atomic64_xchg_release
-#define  atomic64_xchg_release(...)					\
-	__atomic_op_release(atomic64_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_xchg
-#define  atomic64_xchg(...)						\
-	__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
-#endif
-#endif /* atomic64_xchg_relaxed */
-
-/* atomic64_cmpxchg_relaxed */
 #ifndef atomic64_cmpxchg_relaxed
-#define  atomic64_cmpxchg_relaxed	atomic64_cmpxchg
-#define  atomic64_cmpxchg_acquire	atomic64_cmpxchg
-#define  atomic64_cmpxchg_release	atomic64_cmpxchg
-
-#else /* atomic64_cmpxchg_relaxed */
-
-#ifndef atomic64_cmpxchg_acquire
-#define  atomic64_cmpxchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg_release
-#define  atomic64_cmpxchg_release(...)					\
-	__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg
-#define  atomic64_cmpxchg(...)						\
-	__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+# define atomic64_cmpxchg_relaxed		atomic64_cmpxchg
+# define atomic64_cmpxchg_acquire		atomic64_cmpxchg
+# define atomic64_cmpxchg_release		atomic64_cmpxchg
+#else
+# ifndef atomic64_cmpxchg_acquire
+#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_cmpxchg_release
+#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_cmpxchg
+#  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+# endif
 #endif
-#endif /* atomic64_cmpxchg_relaxed */
 
 #ifndef atomic64_try_cmpxchg
-
-#define __atomic64_try_cmpxchg(type, _p, _po, _n)			\
-({									\
+# define __atomic64_try_cmpxchg(type, _p, _po, _n)			\
+  ({									\
 	typeof(_po) __po = (_po);					\
 	typeof(*(_po)) __r, __o = *__po;				\
 	__r = atomic64_cmpxchg##type((_p), __o, (_n));			\
 	if (unlikely(__r != __o))					\
 		*__po = __r;						\
 	likely(__r == __o);						\
-})
-
-#define atomic64_try_cmpxchg(_p, _po, _n)		__atomic64_try_cmpxchg(, _p, _po, _n)
-#define atomic64_try_cmpxchg_relaxed(_p, _po, _n)	__atomic64_try_cmpxchg(_relaxed, _p, _po, _n)
-#define atomic64_try_cmpxchg_acquire(_p, _po, _n)	__atomic64_try_cmpxchg(_acquire, _p, _po, _n)
-#define atomic64_try_cmpxchg_release(_p, _po, _n)	__atomic64_try_cmpxchg(_release, _p, _po, _n)
-
-#else /* atomic64_try_cmpxchg */
-#define atomic64_try_cmpxchg_relaxed	atomic64_try_cmpxchg
-#define atomic64_try_cmpxchg_acquire	atomic64_try_cmpxchg
-#define atomic64_try_cmpxchg_release	atomic64_try_cmpxchg
-#endif /* atomic64_try_cmpxchg */
+  })
+# define atomic64_try_cmpxchg(_p, _po, _n)	   __atomic64_try_cmpxchg(, _p, _po, _n)
+# define atomic64_try_cmpxchg_relaxed(_p, _po, _n) __atomic64_try_cmpxchg(_relaxed, _p, _po, _n)
+# define atomic64_try_cmpxchg_acquire(_p, _po, _n) __atomic64_try_cmpxchg(_acquire, _p, _po, _n)
+# define atomic64_try_cmpxchg_release(_p, _po, _n) __atomic64_try_cmpxchg(_release, _p, _po, _n)
+#else
+# define atomic64_try_cmpxchg_relaxed		atomic64_try_cmpxchg
+# define atomic64_try_cmpxchg_acquire		atomic64_try_cmpxchg
+# define atomic64_try_cmpxchg_release		atomic64_try_cmpxchg
+#endif
 
 #ifndef atomic64_andnot
 static inline void atomic64_andnot(long long i, atomic64_t *v)

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-05  8:11         ` Ingo Molnar
@ 2018-05-05  8:36           ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  8:36 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon,
	Linus Torvalds, Andrew Morton, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner


* Ingo Molnar <mingo@kernel.org> wrote:

> Before:
> 
>  #ifndef atomic_fetch_dec_relaxed
> 
>  #ifndef atomic_fetch_dec
>  #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
>  #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
>  #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
>  #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
>  #else /* atomic_fetch_dec */
>  #define atomic_fetch_dec_relaxed	atomic_fetch_dec
>  #define atomic_fetch_dec_acquire	atomic_fetch_dec
>  #define atomic_fetch_dec_release	atomic_fetch_dec
>  #endif /* atomic_fetch_dec */
> 
>  #else /* atomic_fetch_dec_relaxed */
> 
>  #ifndef atomic_fetch_dec_acquire
>  #define atomic_fetch_dec_acquire(...)					\
> 	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #endif
> 
>  #ifndef atomic_fetch_dec_release
>  #define atomic_fetch_dec_release(...)					\
> 	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  #endif
> 
>  #ifndef atomic_fetch_dec
>  #define atomic_fetch_dec(...)						\
> 	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #endif
>  #endif /* atomic_fetch_dec_relaxed */
> 
> After:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec_acquire
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec_release
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> The new variant is readable at a glance, and the hierarchy of defines is very 
> obvious as well.
> 
> And I think we could do even better - there's absolutely no reason why _every_
> operation has to be made conditional at such a fine-grained level - they are
> overridden in API groups. In fact, allowing individual overrides is arguably a
> source of fragility.
> 
> So we could do the following simplification on top of that:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif

The attached patch implements this, which gives us another healthy simplification:

 include/linux/atomic.h | 312 ++++++++++---------------------------------------
 1 file changed, 62 insertions(+), 250 deletions(-)

Note that the simplest definition block is now:

#ifndef atomic_cmpxchg_relaxed
# define atomic_cmpxchg_relaxed			atomic_cmpxchg
# define atomic_cmpxchg_acquire			atomic_cmpxchg
# define atomic_cmpxchg_release			atomic_cmpxchg
#else
# ifndef atomic_cmpxchg
#  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
# endif
#endif

... which is very readable!
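
For reference, the __atomic_op_*() wrappers used in the #else branches build
each ordered variant out of the architecture's _relaxed primitive plus full
barriers - roughly the following (a simplified sketch of the existing
<linux/atomic.h> helpers, not something these patches change):

/* acquire: do the relaxed op, then order later accesses after it */
#define __atomic_op_acquire(op, args...)				\
({									\
	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
	smp_mb__after_atomic();						\
	__ret;								\
})

/* release: order earlier accesses, then do the relaxed op */
#define __atomic_op_release(op, args...)				\
({									\
	smp_mb__before_atomic();					\
	op##_relaxed(args);						\
})

/* fully ordered: full barriers on both sides of the relaxed op */
#define __atomic_op_fence(op, args...)					\
({									\
	typeof(op##_relaxed(args)) __ret;				\
	smp_mb__before_atomic();					\
	__ret = op##_relaxed(args);					\
	smp_mb__after_atomic();						\
	__ret;								\
})

So an architecture only has to provide e.g. atomic_cmpxchg_relaxed() and the
block above generates the acquire, release and fully ordered variants from it.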

The total line-count reduction of the two patches is pretty significant as well:

 include/linux/atomic.h | 1063 ++++++++++++++++--------------------------------
 1 file changed, 343 insertions(+), 720 deletions(-)

Note that I kept the second patch separate, because technically it changes the way 
we use the defines - it should not break anything, unless I missed some detail.
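
To make the "changes the way we use the defines" point concrete, the only
case that behaves differently is a partial override like the following
(hypothetical arch code, purely for illustration - the arch_cmpxchg_*()
names are made up and do not exist in this series):

/* hypothetical arch header, for illustration only: */
#define atomic_cmpxchg_relaxed(p, o, n)		arch_cmpxchg_relaxed(p, o, n)
#define atomic_cmpxchg_acquire(p, o, n)		arch_cmpxchg_acquire(p, o, n)	/* partial override */

The old per-variant #ifndef guards kept such an acquire override and only
filled in the release and fully ordered forms; with the grouped layout all
three are generated whenever atomic_cmpxchg itself is left undefined, so the
override above would now be redefined by the generic wrappers. As far as I
can see no in-tree architecture relies on such a partial override, which is
why this should be safe.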

Please keep this kind of clarity and simplicity in new instrumentation patches!

Thanks,

	Ingo

==================>
From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@kernel.org>
Date: Sat, 5 May 2018 10:23:23 +0200
Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more

Before:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec_acquire
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec_release
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

After:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

The idea is that, because we already group these APIs under key defines
such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
branches, we can do the same in the secondary branch as well.

( Also remove some unnecessarily duplicated comments, as the API
  group defines are now pretty much self-documenting. )

No change in functionality.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 312 ++++++++++---------------------------------------
 1 file changed, 62 insertions(+), 250 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 67aaafba256b..352ecc72d7f5 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -71,98 +71,66 @@
 })
 #endif
 
-/* atomic_add_return_relaxed() et al: */
-
 #ifndef atomic_add_return_relaxed
 # define atomic_add_return_relaxed		atomic_add_return
 # define atomic_add_return_acquire		atomic_add_return
 # define atomic_add_return_release		atomic_add_return
 #else
-# ifndef atomic_add_return_acquire
-#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
-# endif
-# ifndef atomic_add_return_release
-#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
-# endif
 # ifndef atomic_add_return
 #  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_inc_return_relaxed() et al: */
-
 #ifndef atomic_inc_return_relaxed
 # define atomic_inc_return_relaxed		atomic_inc_return
 # define atomic_inc_return_acquire		atomic_inc_return
 # define atomic_inc_return_release		atomic_inc_return
 #else
-# ifndef atomic_inc_return_acquire
-#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
-# endif
-# ifndef atomic_inc_return_release
-#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
-# endif
 # ifndef atomic_inc_return
 #  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_sub_return_relaxed() et al: */
-
 #ifndef atomic_sub_return_relaxed
 # define atomic_sub_return_relaxed		atomic_sub_return
 # define atomic_sub_return_acquire		atomic_sub_return
 # define atomic_sub_return_release		atomic_sub_return
 #else
-# ifndef atomic_sub_return_acquire
-#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
-# endif
-# ifndef atomic_sub_return_release
-#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
-# endif
 # ifndef atomic_sub_return
 #  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_dec_return_relaxed() et al: */
-
 #ifndef atomic_dec_return_relaxed
 # define atomic_dec_return_relaxed		atomic_dec_return
 # define atomic_dec_return_acquire		atomic_dec_return
 # define atomic_dec_return_release		atomic_dec_return
 #else
-# ifndef atomic_dec_return_acquire
-#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
-# endif
-# ifndef atomic_dec_return_release
-#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
-# endif
 # ifndef atomic_dec_return
 #  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_add_relaxed() et al: */
-
 #ifndef atomic_fetch_add_relaxed
 # define atomic_fetch_add_relaxed		atomic_fetch_add
 # define atomic_fetch_add_acquire		atomic_fetch_add
 # define atomic_fetch_add_release		atomic_fetch_add
 #else
-# ifndef atomic_fetch_add_acquire
-#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_add_release
-#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_add
 #  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_inc_relaxed() et al: */
-
 #ifndef atomic_fetch_inc_relaxed
 # ifndef atomic_fetch_inc
 #  define atomic_fetch_inc(v)			atomic_fetch_add(1, (v))
@@ -175,37 +143,25 @@
 #  define atomic_fetch_inc_release		atomic_fetch_inc
 # endif
 #else
-# ifndef atomic_fetch_inc_acquire
-#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_inc_release
-#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_inc
 #  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_sub_relaxed() et al: */
-
 #ifndef atomic_fetch_sub_relaxed
 # define atomic_fetch_sub_relaxed		atomic_fetch_sub
 # define atomic_fetch_sub_acquire		atomic_fetch_sub
 # define atomic_fetch_sub_release		atomic_fetch_sub
 #else
-# ifndef atomic_fetch_sub_acquire
-#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_sub_release
-#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_sub
 #  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_dec_relaxed() et al: */
-
 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
@@ -218,127 +174,86 @@
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
-# ifndef atomic_fetch_dec_acquire
-#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_dec_release
-#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_or_relaxed() et al: */
-
 #ifndef atomic_fetch_or_relaxed
 # define atomic_fetch_or_relaxed		atomic_fetch_or
 # define atomic_fetch_or_acquire		atomic_fetch_or
 # define atomic_fetch_or_release		atomic_fetch_or
 #else
-# ifndef atomic_fetch_or_acquire
-#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_or_release
-#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_or
 #  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_and_relaxed() et al: */
-
 #ifndef atomic_fetch_and_relaxed
 # define atomic_fetch_and_relaxed		atomic_fetch_and
 # define atomic_fetch_and_acquire		atomic_fetch_and
 # define atomic_fetch_and_release		atomic_fetch_and
 #else
-# ifndef atomic_fetch_and_acquire
-#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_and_release
-#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_and
 #  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
 #ifdef atomic_andnot
 
-/* atomic_fetch_andnot_relaxed() et al: */
-
 #ifndef atomic_fetch_andnot_relaxed
 # define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
 # define atomic_fetch_andnot_acquire		atomic_fetch_andnot
 # define atomic_fetch_andnot_release		atomic_fetch_andnot
 #else
-# ifndef atomic_fetch_andnot_acquire
-#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_andnot_release
-#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_andnot
 #  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 
 #endif /* atomic_andnot */
 
-/* atomic_fetch_xor_relaxed() et al: */
-
 #ifndef atomic_fetch_xor_relaxed
 # define atomic_fetch_xor_relaxed		atomic_fetch_xor
 # define atomic_fetch_xor_acquire		atomic_fetch_xor
 # define atomic_fetch_xor_release		atomic_fetch_xor
 #else
-# ifndef atomic_fetch_xor_acquire
-#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_xor_release
-#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_xor
 #  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
-
-/* atomic_xchg_relaxed() et al: */
-
 #ifndef atomic_xchg_relaxed
 #define atomic_xchg_relaxed			atomic_xchg
 #define atomic_xchg_acquire			atomic_xchg
 #define atomic_xchg_release			atomic_xchg
 #else
-# ifndef atomic_xchg_acquire
-#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
-# endif
-# ifndef atomic_xchg_release
-#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
-# endif
 # ifndef atomic_xchg
 #  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_cmpxchg_relaxed() et al: */
-
 #ifndef atomic_cmpxchg_relaxed
 # define atomic_cmpxchg_relaxed			atomic_cmpxchg
 # define atomic_cmpxchg_acquire			atomic_cmpxchg
 # define atomic_cmpxchg_release			atomic_cmpxchg
 #else
-# ifndef atomic_cmpxchg_acquire
-#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
-# endif
-# ifndef atomic_cmpxchg_release
-#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
-# endif
 # ifndef atomic_cmpxchg
 #  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -362,57 +277,39 @@
 # define atomic_try_cmpxchg_release		atomic_try_cmpxchg
 #endif
 
-/* cmpxchg_relaxed() et al: */
-
 #ifndef cmpxchg_relaxed
 # define cmpxchg_relaxed			cmpxchg
 # define cmpxchg_acquire			cmpxchg
 # define cmpxchg_release			cmpxchg
 #else
-# ifndef cmpxchg_acquire
-#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
-# endif
-# ifndef cmpxchg_release
-#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
-# endif
 # ifndef cmpxchg
 #  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
-/* cmpxchg64_relaxed() et al: */
-
 #ifndef cmpxchg64_relaxed
 # define cmpxchg64_relaxed			cmpxchg64
 # define cmpxchg64_acquire			cmpxchg64
 # define cmpxchg64_release			cmpxchg64
 #else
-# ifndef cmpxchg64_acquire
-#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
-# endif
-# ifndef cmpxchg64_release
-#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
-# endif
 # ifndef cmpxchg64
 #  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
 # endif
 #endif
 
-/* xchg_relaxed() et al: */
-
 #ifndef xchg_relaxed
 # define xchg_relaxed				xchg
 # define xchg_acquire				xchg
 # define xchg_release				xchg
 #else
-# ifndef xchg_acquire
-#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
-# endif
-# ifndef xchg_release
-#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
-# endif
 # ifndef xchg
 #  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
+#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
+#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -569,98 +466,66 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_set_release(v, i)		smp_store_release(&(v)->counter, (i))
 #endif
 
-/* atomic64_add_return_relaxed() et al: */
-
 #ifndef atomic64_add_return_relaxed
 # define atomic64_add_return_relaxed		atomic64_add_return
 # define atomic64_add_return_acquire		atomic64_add_return
 # define atomic64_add_return_release		atomic64_add_return
 #else
-# ifndef atomic64_add_return_acquire
-#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_add_return_release
-#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_add_return
 #  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_inc_return_relaxed() et al: */
-
 #ifndef atomic64_inc_return_relaxed
 # define atomic64_inc_return_relaxed		atomic64_inc_return
 # define atomic64_inc_return_acquire		atomic64_inc_return
 # define atomic64_inc_return_release		atomic64_inc_return
 #else
-# ifndef atomic64_inc_return_acquire
-#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_inc_return_release
-#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_inc_return
 #  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_sub_return_relaxed() et al: */
-
 #ifndef atomic64_sub_return_relaxed
 # define atomic64_sub_return_relaxed		atomic64_sub_return
 # define atomic64_sub_return_acquire		atomic64_sub_return
 # define atomic64_sub_return_release		atomic64_sub_return
 #else
-# ifndef atomic64_sub_return_acquire
-#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_sub_return_release
-#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_sub_return
 #  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_dec_return_relaxed() et al: */
-
 #ifndef atomic64_dec_return_relaxed
 # define atomic64_dec_return_relaxed		atomic64_dec_return
 # define atomic64_dec_return_acquire		atomic64_dec_return
 # define atomic64_dec_return_release		atomic64_dec_return
 #else
-# ifndef atomic64_dec_return_acquire
-#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_dec_return_release
-#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_dec_return
 #  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_add_relaxed() et al: */
-
 #ifndef atomic64_fetch_add_relaxed
 # define atomic64_fetch_add_relaxed		atomic64_fetch_add
 # define atomic64_fetch_add_acquire		atomic64_fetch_add
 # define atomic64_fetch_add_release		atomic64_fetch_add
 #else
-# ifndef atomic64_fetch_add_acquire
-#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_add_release
-#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_add
 #  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_inc_relaxed() et al: */
-
 #ifndef atomic64_fetch_inc_relaxed
 # ifndef atomic64_fetch_inc
 #  define atomic64_fetch_inc(v)			atomic64_fetch_add(1, (v))
@@ -673,37 +538,25 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 #  define atomic64_fetch_inc_release		atomic64_fetch_inc
 # endif
 #else
-# ifndef atomic64_fetch_inc_acquire
-#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_inc_release
-#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_inc
 #  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_sub_relaxed() et al: */
-
 #ifndef atomic64_fetch_sub_relaxed
 # define atomic64_fetch_sub_relaxed		atomic64_fetch_sub
 # define atomic64_fetch_sub_acquire		atomic64_fetch_sub
 # define atomic64_fetch_sub_release		atomic64_fetch_sub
 #else
-# ifndef atomic64_fetch_sub_acquire
-#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_sub_release
-#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_sub
 #  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_dec_relaxed() et al: */
-
 #ifndef atomic64_fetch_dec_relaxed
 # ifndef atomic64_fetch_dec
 #  define atomic64_fetch_dec(v)			atomic64_fetch_sub(1, (v))
@@ -716,127 +569,86 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 #  define atomic64_fetch_dec_release		atomic64_fetch_dec
 # endif
 #else
-# ifndef atomic64_fetch_dec_acquire
-#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_dec_release
-#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_dec
 #  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_or_relaxed() et al: */
-
 #ifndef atomic64_fetch_or_relaxed
 # define atomic64_fetch_or_relaxed		atomic64_fetch_or
 # define atomic64_fetch_or_acquire		atomic64_fetch_or
 # define atomic64_fetch_or_release		atomic64_fetch_or
 #else
-# ifndef atomic64_fetch_or_acquire
-#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_or_release
-#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_or
 #  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
-
-/* atomic64_fetch_and_relaxed() et al: */
-
 #ifndef atomic64_fetch_and_relaxed
 # define atomic64_fetch_and_relaxed		atomic64_fetch_and
 # define atomic64_fetch_and_acquire		atomic64_fetch_and
 # define atomic64_fetch_and_release		atomic64_fetch_and
 #else
-# ifndef atomic64_fetch_and_acquire
-#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_and_release
-#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_and
 #  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
 #ifdef atomic64_andnot
 
-/* atomic64_fetch_andnot_relaxed() et al: */
-
 #ifndef atomic64_fetch_andnot_relaxed
 # define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
 # define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
 # define atomic64_fetch_andnot_release		atomic64_fetch_andnot
 #else
-# ifndef atomic64_fetch_andnot_acquire
-#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_andnot_release
-#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_andnot
 #  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 
 #endif /* atomic64_andnot */
 
-/* atomic64_fetch_xor_relaxed() et al: */
-
 #ifndef atomic64_fetch_xor_relaxed
 # define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
 # define atomic64_fetch_xor_acquire		atomic64_fetch_xor
 # define atomic64_fetch_xor_release		atomic64_fetch_xor
 #else
-# ifndef atomic64_fetch_xor_acquire
-#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_xor_release
-#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_xor
 #  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_xchg_relaxed() et al: */
-
 #ifndef atomic64_xchg_relaxed
 # define atomic64_xchg_relaxed			atomic64_xchg
 # define atomic64_xchg_acquire			atomic64_xchg
 # define atomic64_xchg_release			atomic64_xchg
 #else
-# ifndef atomic64_xchg_acquire
-#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-# endif
-# ifndef atomic64_xchg_release
-#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
-# endif
 # ifndef atomic64_xchg
 #  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_cmpxchg_relaxed() et al: */
-
 #ifndef atomic64_cmpxchg_relaxed
 # define atomic64_cmpxchg_relaxed		atomic64_cmpxchg
 # define atomic64_cmpxchg_acquire		atomic64_cmpxchg
 # define atomic64_cmpxchg_release		atomic64_cmpxchg
 #else
-# ifndef atomic64_cmpxchg_acquire
-#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-# endif
-# ifndef atomic64_cmpxchg_release
-#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
-# endif
 # ifndef atomic64_cmpxchg
 #  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
@ 2018-05-05  8:36           ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  8:36 UTC (permalink / raw)
  To: linux-arm-kernel


* Ingo Molnar <mingo@kernel.org> wrote:

> Before:
> 
>  #ifndef atomic_fetch_dec_relaxed
> 
>  #ifndef atomic_fetch_dec
>  #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
>  #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
>  #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
>  #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
>  #else /* atomic_fetch_dec */
>  #define atomic_fetch_dec_relaxed	atomic_fetch_dec
>  #define atomic_fetch_dec_acquire	atomic_fetch_dec
>  #define atomic_fetch_dec_release	atomic_fetch_dec
>  #endif /* atomic_fetch_dec */
> 
>  #else /* atomic_fetch_dec_relaxed */
> 
>  #ifndef atomic_fetch_dec_acquire
>  #define atomic_fetch_dec_acquire(...)					\
> 	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #endif
> 
>  #ifndef atomic_fetch_dec_release
>  #define atomic_fetch_dec_release(...)					\
> 	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  #endif
> 
>  #ifndef atomic_fetch_dec
>  #define atomic_fetch_dec(...)						\
> 	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #endif
>  #endif /* atomic_fetch_dec_relaxed */
> 
> After:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec_acquire
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec_release
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> The new variant is readable at a glance, and the hierarchy of defines is very 
> obvious as well.
> 
> And I think we could do even better - there's absolutely no reason why _every_
> operation has to be made conditional at such a fine-grained level - they are
> overridden in API groups. In fact, allowing individual overrides is arguably a
> source of fragility.
> 
> So we could do the following simplification on top of that:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif

The attached patch implements this, which gives us another healthy simplification:

 include/linux/atomic.h | 312 ++++++++++---------------------------------------
 1 file changed, 62 insertions(+), 250 deletions(-)

Note that the simplest definition block is now:

#ifndef atomic_cmpxchg_relaxed
# define atomic_cmpxchg_relaxed			atomic_cmpxchg
# define atomic_cmpxchg_acquire			atomic_cmpxchg
# define atomic_cmpxchg_release			atomic_cmpxchg
#else
# ifndef atomic_cmpxchg
#  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
# endif
#endif

... which is very readable!

The total line-count reduction of the two patches is pretty significant as well:

 include/linux/atomic.h | 1063 ++++++++++++++++--------------------------------
 1 file changed, 343 insertions(+), 720 deletions(-)

Note that I kept the second patch separate, because technically it changes the way 
we use the defines - it should not break anything, unless I missed some detail.

Please keep this kind of clarity and simplicity in new instrumentation patches!

Thanks,

	Ingo

==================>
From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@kernel.org>
Date: Sat, 5 May 2018 10:23:23 +0200
Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more

Before:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec_acquire
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec_release
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

After:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

The idea is that, because we already group these APIs under key defines
such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
branches, we can do the same in the secondary branch as well.

( Also remove some unnecessarily duplicated comments, as the API
  group defines are now pretty much self-documenting. )

No change in functionality.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel at vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 312 ++++++++++---------------------------------------
 1 file changed, 62 insertions(+), 250 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 67aaafba256b..352ecc72d7f5 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -71,98 +71,66 @@
 })
 #endif
 
-/* atomic_add_return_relaxed() et al: */
-
 #ifndef atomic_add_return_relaxed
 # define atomic_add_return_relaxed		atomic_add_return
 # define atomic_add_return_acquire		atomic_add_return
 # define atomic_add_return_release		atomic_add_return
 #else
-# ifndef atomic_add_return_acquire
-#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
-# endif
-# ifndef atomic_add_return_release
-#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
-# endif
 # ifndef atomic_add_return
 #  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_inc_return_relaxed() et al: */
-
 #ifndef atomic_inc_return_relaxed
 # define atomic_inc_return_relaxed		atomic_inc_return
 # define atomic_inc_return_acquire		atomic_inc_return
 # define atomic_inc_return_release		atomic_inc_return
 #else
-# ifndef atomic_inc_return_acquire
-#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
-# endif
-# ifndef atomic_inc_return_release
-#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
-# endif
 # ifndef atomic_inc_return
 #  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_sub_return_relaxed() et al: */
-
 #ifndef atomic_sub_return_relaxed
 # define atomic_sub_return_relaxed		atomic_sub_return
 # define atomic_sub_return_acquire		atomic_sub_return
 # define atomic_sub_return_release		atomic_sub_return
 #else
-# ifndef atomic_sub_return_acquire
-#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
-# endif
-# ifndef atomic_sub_return_release
-#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
-# endif
 # ifndef atomic_sub_return
 #  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_dec_return_relaxed() et al: */
-
 #ifndef atomic_dec_return_relaxed
 # define atomic_dec_return_relaxed		atomic_dec_return
 # define atomic_dec_return_acquire		atomic_dec_return
 # define atomic_dec_return_release		atomic_dec_return
 #else
-# ifndef atomic_dec_return_acquire
-#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
-# endif
-# ifndef atomic_dec_return_release
-#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
-# endif
 # ifndef atomic_dec_return
 #  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_add_relaxed() et al: */
-
 #ifndef atomic_fetch_add_relaxed
 # define atomic_fetch_add_relaxed		atomic_fetch_add
 # define atomic_fetch_add_acquire		atomic_fetch_add
 # define atomic_fetch_add_release		atomic_fetch_add
 #else
-# ifndef atomic_fetch_add_acquire
-#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_add_release
-#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_add
 #  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_inc_relaxed() et al: */
-
 #ifndef atomic_fetch_inc_relaxed
 # ifndef atomic_fetch_inc
 #  define atomic_fetch_inc(v)			atomic_fetch_add(1, (v))
@@ -175,37 +143,25 @@
 #  define atomic_fetch_inc_release		atomic_fetch_inc
 # endif
 #else
-# ifndef atomic_fetch_inc_acquire
-#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_inc_release
-#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_inc
 #  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_sub_relaxed() et al: */
-
 #ifndef atomic_fetch_sub_relaxed
 # define atomic_fetch_sub_relaxed		atomic_fetch_sub
 # define atomic_fetch_sub_acquire		atomic_fetch_sub
 # define atomic_fetch_sub_release		atomic_fetch_sub
 #else
-# ifndef atomic_fetch_sub_acquire
-#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_sub_release
-#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_sub
 #  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_dec_relaxed() et al: */
-
 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
@@ -218,127 +174,86 @@
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
-# ifndef atomic_fetch_dec_acquire
-#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_dec_release
-#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_or_relaxed() et al: */
-
 #ifndef atomic_fetch_or_relaxed
 # define atomic_fetch_or_relaxed		atomic_fetch_or
 # define atomic_fetch_or_acquire		atomic_fetch_or
 # define atomic_fetch_or_release		atomic_fetch_or
 #else
-# ifndef atomic_fetch_or_acquire
-#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_or_release
-#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_or
 #  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_and_relaxed() et al: */
-
 #ifndef atomic_fetch_and_relaxed
 # define atomic_fetch_and_relaxed		atomic_fetch_and
 # define atomic_fetch_and_acquire		atomic_fetch_and
 # define atomic_fetch_and_release		atomic_fetch_and
 #else
-# ifndef atomic_fetch_and_acquire
-#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_and_release
-#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_and
 #  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
 #ifdef atomic_andnot
 
-/* atomic_fetch_andnot_relaxed() et al: */
-
 #ifndef atomic_fetch_andnot_relaxed
 # define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
 # define atomic_fetch_andnot_acquire		atomic_fetch_andnot
 # define atomic_fetch_andnot_release		atomic_fetch_andnot
 #else
-# ifndef atomic_fetch_andnot_acquire
-#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_andnot_release
-#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_andnot
 #  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 
 #endif /* atomic_andnot */
 
-/* atomic_fetch_xor_relaxed() et al: */
-
 #ifndef atomic_fetch_xor_relaxed
 # define atomic_fetch_xor_relaxed		atomic_fetch_xor
 # define atomic_fetch_xor_acquire		atomic_fetch_xor
 # define atomic_fetch_xor_release		atomic_fetch_xor
 #else
-# ifndef atomic_fetch_xor_acquire
-#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_xor_release
-#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_xor
 #  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
-
-/* atomic_xchg_relaxed() et al: */
-
 #ifndef atomic_xchg_relaxed
 #define atomic_xchg_relaxed			atomic_xchg
 #define atomic_xchg_acquire			atomic_xchg
 #define atomic_xchg_release			atomic_xchg
 #else
-# ifndef atomic_xchg_acquire
-#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
-# endif
-# ifndef atomic_xchg_release
-#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
-# endif
 # ifndef atomic_xchg
 #  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_cmpxchg_relaxed() et al: */
-
 #ifndef atomic_cmpxchg_relaxed
 # define atomic_cmpxchg_relaxed			atomic_cmpxchg
 # define atomic_cmpxchg_acquire			atomic_cmpxchg
 # define atomic_cmpxchg_release			atomic_cmpxchg
 #else
-# ifndef atomic_cmpxchg_acquire
-#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
-# endif
-# ifndef atomic_cmpxchg_release
-#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
-# endif
 # ifndef atomic_cmpxchg
 #  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -362,57 +277,39 @@
 # define atomic_try_cmpxchg_release		atomic_try_cmpxchg
 #endif
 
-/* cmpxchg_relaxed() et al: */
-
 #ifndef cmpxchg_relaxed
 # define cmpxchg_relaxed			cmpxchg
 # define cmpxchg_acquire			cmpxchg
 # define cmpxchg_release			cmpxchg
 #else
-# ifndef cmpxchg_acquire
-#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
-# endif
-# ifndef cmpxchg_release
-#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
-# endif
 # ifndef cmpxchg
 #  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
-/* cmpxchg64_relaxed() et al: */
-
 #ifndef cmpxchg64_relaxed
 # define cmpxchg64_relaxed			cmpxchg64
 # define cmpxchg64_acquire			cmpxchg64
 # define cmpxchg64_release			cmpxchg64
 #else
-# ifndef cmpxchg64_acquire
-#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
-# endif
-# ifndef cmpxchg64_release
-#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
-# endif
 # ifndef cmpxchg64
 #  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
 # endif
 #endif
 
-/* xchg_relaxed() et al: */
-
 #ifndef xchg_relaxed
 # define xchg_relaxed				xchg
 # define xchg_acquire				xchg
 # define xchg_release				xchg
 #else
-# ifndef xchg_acquire
-#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
-# endif
-# ifndef xchg_release
-#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
-# endif
 # ifndef xchg
 #  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
+#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
+#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -569,98 +466,66 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_set_release(v, i)		smp_store_release(&(v)->counter, (i))
 #endif
 
-/* atomic64_add_return_relaxed() et al: */
-
 #ifndef atomic64_add_return_relaxed
 # define atomic64_add_return_relaxed		atomic64_add_return
 # define atomic64_add_return_acquire		atomic64_add_return
 # define atomic64_add_return_release		atomic64_add_return
 #else
-# ifndef atomic64_add_return_acquire
-#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_add_return_release
-#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_add_return
 #  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_inc_return_relaxed() et al: */
-
 #ifndef atomic64_inc_return_relaxed
 # define atomic64_inc_return_relaxed		atomic64_inc_return
 # define atomic64_inc_return_acquire		atomic64_inc_return
 # define atomic64_inc_return_release		atomic64_inc_return
 #else
-# ifndef atomic64_inc_return_acquire
-#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_inc_return_release
-#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_inc_return
 #  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_sub_return_relaxed() et al: */
-
 #ifndef atomic64_sub_return_relaxed
 # define atomic64_sub_return_relaxed		atomic64_sub_return
 # define atomic64_sub_return_acquire		atomic64_sub_return
 # define atomic64_sub_return_release		atomic64_sub_return
 #else
-# ifndef atomic64_sub_return_acquire
-#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_sub_return_release
-#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_sub_return
 #  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_dec_return_relaxed() et al: */
-
 #ifndef atomic64_dec_return_relaxed
 # define atomic64_dec_return_relaxed		atomic64_dec_return
 # define atomic64_dec_return_acquire		atomic64_dec_return
 # define atomic64_dec_return_release		atomic64_dec_return
 #else
-# ifndef atomic64_dec_return_acquire
-#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_dec_return_release
-#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_dec_return
 #  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_add_relaxed() et al: */
-
 #ifndef atomic64_fetch_add_relaxed
 # define atomic64_fetch_add_relaxed		atomic64_fetch_add
 # define atomic64_fetch_add_acquire		atomic64_fetch_add
 # define atomic64_fetch_add_release		atomic64_fetch_add
 #else
-# ifndef atomic64_fetch_add_acquire
-#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_add_release
-#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_add
 #  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_inc_relaxed() et al: */
-
 #ifndef atomic64_fetch_inc_relaxed
 # ifndef atomic64_fetch_inc
 #  define atomic64_fetch_inc(v)			atomic64_fetch_add(1, (v))
@@ -673,37 +538,25 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 #  define atomic64_fetch_inc_release		atomic64_fetch_inc
 # endif
 #else
-# ifndef atomic64_fetch_inc_acquire
-#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_inc_release
-#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_inc
 #  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_sub_relaxed() et al: */
-
 #ifndef atomic64_fetch_sub_relaxed
 # define atomic64_fetch_sub_relaxed		atomic64_fetch_sub
 # define atomic64_fetch_sub_acquire		atomic64_fetch_sub
 # define atomic64_fetch_sub_release		atomic64_fetch_sub
 #else
-# ifndef atomic64_fetch_sub_acquire
-#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_sub_release
-#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_sub
 #  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_dec_relaxed() et al: */
-
 #ifndef atomic64_fetch_dec_relaxed
 # ifndef atomic64_fetch_dec
 #  define atomic64_fetch_dec(v)			atomic64_fetch_sub(1, (v))
@@ -716,127 +569,86 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 #  define atomic64_fetch_dec_release		atomic64_fetch_dec
 # endif
 #else
-# ifndef atomic64_fetch_dec_acquire
-#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_dec_release
-#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_dec
 #  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_or_relaxed() et al: */
-
 #ifndef atomic64_fetch_or_relaxed
 # define atomic64_fetch_or_relaxed		atomic64_fetch_or
 # define atomic64_fetch_or_acquire		atomic64_fetch_or
 # define atomic64_fetch_or_release		atomic64_fetch_or
 #else
-# ifndef atomic64_fetch_or_acquire
-#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_or_release
-#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_or
 #  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
-
-/* atomic64_fetch_and_relaxed() et al: */
-
 #ifndef atomic64_fetch_and_relaxed
 # define atomic64_fetch_and_relaxed		atomic64_fetch_and
 # define atomic64_fetch_and_acquire		atomic64_fetch_and
 # define atomic64_fetch_and_release		atomic64_fetch_and
 #else
-# ifndef atomic64_fetch_and_acquire
-#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_and_release
-#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_and
 #  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
 #ifdef atomic64_andnot
 
-/* atomic64_fetch_andnot_relaxed() et al: */
-
 #ifndef atomic64_fetch_andnot_relaxed
 # define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
 # define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
 # define atomic64_fetch_andnot_release		atomic64_fetch_andnot
 #else
-# ifndef atomic64_fetch_andnot_acquire
-#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_andnot_release
-#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_andnot
 #  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 
 #endif /* atomic64_andnot */
 
-/* atomic64_fetch_xor_relaxed() et al: */
-
 #ifndef atomic64_fetch_xor_relaxed
 # define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
 # define atomic64_fetch_xor_acquire		atomic64_fetch_xor
 # define atomic64_fetch_xor_release		atomic64_fetch_xor
 #else
-# ifndef atomic64_fetch_xor_acquire
-#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_xor_release
-#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_xor
 #  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_xchg_relaxed() et al: */
-
 #ifndef atomic64_xchg_relaxed
 # define atomic64_xchg_relaxed			atomic64_xchg
 # define atomic64_xchg_acquire			atomic64_xchg
 # define atomic64_xchg_release			atomic64_xchg
 #else
-# ifndef atomic64_xchg_acquire
-#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-# endif
-# ifndef atomic64_xchg_release
-#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
-# endif
 # ifndef atomic64_xchg
 #  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_cmpxchg_relaxed() et al: */
-
 #ifndef atomic64_cmpxchg_relaxed
 # define atomic64_cmpxchg_relaxed		atomic64_cmpxchg
 # define atomic64_cmpxchg_acquire		atomic64_cmpxchg
 # define atomic64_cmpxchg_release		atomic64_cmpxchg
 #else
-# ifndef atomic64_cmpxchg_acquire
-#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-# endif
-# ifndef atomic64_cmpxchg_release
-#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
-# endif
 # ifndef atomic64_cmpxchg
 #  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  8:11         ` Ingo Molnar
@ 2018-05-05  8:47           ` Peter Zijlstra
  -1 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-05  8:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon

On Sat, May 05, 2018 at 10:11:00AM +0200, Ingo Molnar wrote:

> Before:
> 
>  #ifndef atomic_fetch_dec_relaxed
> 
>  #ifndef atomic_fetch_dec
>  #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
>  #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
>  #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
>  #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
>  #else /* atomic_fetch_dec */
>  #define atomic_fetch_dec_relaxed	atomic_fetch_dec
>  #define atomic_fetch_dec_acquire	atomic_fetch_dec
>  #define atomic_fetch_dec_release	atomic_fetch_dec
>  #endif /* atomic_fetch_dec */
> 
>  #else /* atomic_fetch_dec_relaxed */
> 
>  #ifndef atomic_fetch_dec_acquire
>  #define atomic_fetch_dec_acquire(...)					\
> 	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #endif
> 
>  #ifndef atomic_fetch_dec_release
>  #define atomic_fetch_dec_release(...)					\
> 	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  #endif
> 
>  #ifndef atomic_fetch_dec
>  #define atomic_fetch_dec(...)						\
> 	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #endif
>  #endif /* atomic_fetch_dec_relaxed */
> 
> After:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec_acquire
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec_release
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> The new variant is readable at a glance, and the hierarchy of defines is very 
> obvious as well.

It wraps and looks hideous in my normal setup. And I do detest that indent
after # thing.

> And I think we could do even better - there's absolutely no reason why _every_ 
> operation has to be made conditional on a fine-grained level - they are overridden 
> in API groups. In fact allowing individual override is arguably a fragility.
> 
> So we could do the following simplification on top of that:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif

This would prevent an architecture from overriding just fetch_dec_release,
for instance.

I don't think there currently is any architecture that does that, but the
intent was to allow it to override anything and only provide defaults where it
does not.
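
To make that concrete (purely hypothetical, the arch/foo path and
my_arch_*() names below are made up - no in-tree arch does this today):

 /* arch/foo/include/asm/atomic.h (arch provides its own relaxed op): */
 #define atomic_fetch_dec_relaxed	my_arch_fetch_dec_relaxed
 #define atomic_fetch_dec_release	my_arch_fetch_dec_release

With the per-op "#ifndef atomic_fetch_dec_release" check the override is
kept, and only _acquire and the fully ordered op fall back to the
__atomic_op_*() wrappers; with a single "#ifndef atomic_fetch_dec" guard
the generic block would also (re)define _release, clobbering the override
(and triggering a macro redefinition warning).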

None of this takes away the giant trainwreck that is the annotated atomic stuff
though.

And I seriously hate this one:

  ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")

and will likely undo that the moment I need to change anything there.

So no, don't like.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Combine the atomic_andnot() and atomic64_andnot() API definitions
  2018-05-05  8:36           ` Ingo Molnar
@ 2018-05-05  8:54             ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  8:54 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon,
	Linus Torvalds, Andrew Morton, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner


* Ingo Molnar <mingo@kernel.org> wrote:

> Note that the simplest definition block is now:
> 
> #ifndef atomic_cmpxchg_relaxed
> # define atomic_cmpxchg_relaxed			atomic_cmpxchg
> # define atomic_cmpxchg_acquire			atomic_cmpxchg
> # define atomic_cmpxchg_release			atomic_cmpxchg
> #else
> # ifndef atomic_cmpxchg
> #  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
> #  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
> #  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> # endif
> #endif
> 
> ... which is very readable!
> 
> The total linecount reduction of the two patches is pretty significant as well:
> 
>  include/linux/atomic.h | 1063 ++++++++++++++++--------------------------------
>  1 file changed, 343 insertions(+), 720 deletions(-)

BTW., I noticed two asymmetries while cleaning up this code:

==============>

#ifdef atomic_andnot

#ifndef atomic_fetch_andnot_relaxed
# define atomic_fetch_andnot_relaxed            atomic_fetch_andnot
# define atomic_fetch_andnot_acquire            atomic_fetch_andnot
# define atomic_fetch_andnot_release            atomic_fetch_andnot
#else
# ifndef atomic_fetch_andnot
#  define atomic_fetch_andnot(...)              __atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
#  define atomic_fetch_andnot_acquire(...)      __atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
#  define atomic_fetch_andnot_release(...)      __atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
# endif
#endif

#endif /* atomic_andnot */

...

#ifdef atomic64_andnot

#ifndef atomic64_fetch_andnot_relaxed
# define atomic64_fetch_andnot_relaxed          atomic64_fetch_andnot
# define atomic64_fetch_andnot_acquire          atomic64_fetch_andnot
# define atomic64_fetch_andnot_release          atomic64_fetch_andnot
#else
# ifndef atomic64_fetch_andnot
#  define atomic64_fetch_andnot(...)            __atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
#  define atomic64_fetch_andnot_acquire(...)    __atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
#  define atomic64_fetch_andnot_release(...)    __atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
# endif
#endif

#endif /* atomic64_andnot */

<==============

Why do these two API groups have an outer condition, i.e.:

 #ifdef atomic_andnot
 ...
 #endif /* atomic_andnot */
 ...
 #ifdef atomic64_andnot
 ...
 #endif /* atomic64_andnot */

because the base APIs themselves are optional and have a default implementation:

 #ifndef atomic_andnot
 ...
 #endif
 ...
 #ifndef atomic64_andnot
 ...
 #endif

I think it's overall cleaner if we combine them into continuous blocks, defining 
all variants of an API group in a single place:

 #ifdef atomic_andnot
 #else
 #endif

etc.

The patch below implements this.

Thanks,

	Ingo

===================>
From f5efafa83af8c46b9e81b010b46caeeadb450179 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@kernel.org>
Date: Sat, 5 May 2018 10:46:41 +0200
Subject: [PATCH] locking/atomics: Combine the atomic_andnot() and atomic64_andnot() API definitions

The atomic_andnot() and atomic64_andnot() APIs are defined in 4 separate
groups spread out in the atomic.h header:

 #ifdef atomic_andnot
 ...
 #endif /* atomic_andnot */
 ...
 #ifndef atomic_andnot
 ...
 #endif
 ...
 #ifdef atomic64_andnot
 ...
 #endif /* atomic64_andnot */
 ...
 #ifndef atomic64_andnot
 ...
 #endif

Combine them into two groups:

 #ifdef atomic_andnot
 #else
 #endif

 ...

 #ifdef atomic64_andnot
 #else
 #endif

This way each API group is defined in a single place within the header.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 72 +++++++++++++++++++++++++-------------------------
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 352ecc72d7f5..1176cf7c6f03 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -205,22 +205,6 @@
 # endif
 #endif
 
-#ifdef atomic_andnot
-
-#ifndef atomic_fetch_andnot_relaxed
-# define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
-# define atomic_fetch_andnot_acquire		atomic_fetch_andnot
-# define atomic_fetch_andnot_release		atomic_fetch_andnot
-#else
-# ifndef atomic_fetch_andnot
-#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
-# endif
-#endif
-
-#endif /* atomic_andnot */
-
 #ifndef atomic_fetch_xor_relaxed
 # define atomic_fetch_xor_relaxed		atomic_fetch_xor
 # define atomic_fetch_xor_acquire		atomic_fetch_xor
@@ -338,7 +322,22 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
 # define atomic_inc_not_zero(v)			atomic_add_unless((v), 1, 0)
 #endif
 
-#ifndef atomic_andnot
+#ifdef atomic_andnot
+
+#ifndef atomic_fetch_andnot_relaxed
+# define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
+# define atomic_fetch_andnot_acquire		atomic_fetch_andnot
+# define atomic_fetch_andnot_release		atomic_fetch_andnot
+#else
+# ifndef atomic_fetch_andnot
+#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+# endif
+#endif
+
+#else /* !atomic_andnot: */
+
 static inline void atomic_andnot(int i, atomic_t *v)
 {
 	atomic_and(~i, v);
@@ -363,7 +362,8 @@ static inline int atomic_fetch_andnot_release(int i, atomic_t *v)
 {
 	return atomic_fetch_and_release(~i, v);
 }
-#endif
+
+#endif /* !atomic_andnot */
 
 /**
  * atomic_inc_not_zero_hint - increment if not null
@@ -600,22 +600,6 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # endif
 #endif
 
-#ifdef atomic64_andnot
-
-#ifndef atomic64_fetch_andnot_relaxed
-# define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
-# define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
-# define atomic64_fetch_andnot_release		atomic64_fetch_andnot
-#else
-# ifndef atomic64_fetch_andnot
-#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
-# endif
-#endif
-
-#endif /* atomic64_andnot */
-
 #ifndef atomic64_fetch_xor_relaxed
 # define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
 # define atomic64_fetch_xor_acquire		atomic64_fetch_xor
@@ -672,7 +656,22 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_try_cmpxchg_release		atomic64_try_cmpxchg
 #endif
 
-#ifndef atomic64_andnot
+#ifdef atomic64_andnot
+
+#ifndef atomic64_fetch_andnot_relaxed
+# define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_release		atomic64_fetch_andnot
+#else
+# ifndef atomic64_fetch_andnot
+#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
+#endif
+
+#else /* !atomic64_andnot: */
+
 static inline void atomic64_andnot(long long i, atomic64_t *v)
 {
 	atomic64_and(~i, v);
@@ -697,7 +696,8 @@ static inline long long atomic64_fetch_andnot_release(long long i, atomic64_t *v
 {
 	return atomic64_fetch_and_release(~i, v);
 }
-#endif
+
+#endif /* !atomic64_andnot */
 
 #define atomic64_cond_read_relaxed(v, c)	smp_cond_load_relaxed(&(v)->counter, (c))
 #define atomic64_cond_read_acquire(v, c)	smp_cond_load_acquire(&(v)->counter, (c))

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  8:47           ` Peter Zijlstra
@ 2018-05-05  9:04             ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  9:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon


* Peter Zijlstra <peterz@infradead.org> wrote:

> > So we could do the following simplification on top of that:
> > 
> >  #ifndef atomic_fetch_dec_relaxed
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
> >  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> >  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> >  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> >  # else
> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> >  # endif
> >  #else
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  #endif
> 
> This would disallow an architecture to override just fetch_dec_release for
> instance.

Couldn't such a crazy arch just define _all_ 3 APIs in this group?
That's really a small price, and it makes the place that does the
weirdness pay the complexity price...

> I don't think there currently is any architecture that does that, but the
> intent was to allow it to override anything and only provide defaults where it
> does not.

I'd argue that if a new arch only defines one of these APIs that's probably a bug. 
If they absolutely want to do it, they still can - by defining all 3 APIs.

So there's no loss in arch flexibility.
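
I.e. such an arch would just provide the whole group itself, something like
(hypothetical, the my_fetch_dec*() names are made up):

 /* arch that wants a special _release implementation: */
 #define atomic_fetch_dec_relaxed(v)	my_fetch_dec_relaxed(v)
 #define atomic_fetch_dec_acquire(v)	my_fetch_dec_acquire(v)
 #define atomic_fetch_dec_release(v)	my_fetch_dec_release(v)
 #define atomic_fetch_dec(v)		my_fetch_dec(v)

With atomic_fetch_dec defined, the grouped "# ifndef atomic_fetch_dec" block
is skipped entirely, so nothing gets clobbered.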

> None of this takes away the giant trainwreck that is the annotated atomic stuff
> though.
> 
> And I seriously hate this one:
> 
>   ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")
> 
> and will likely undo that the moment I need to change anything there.

If it makes the code more readable then I don't object - the problem was that the 
instrumentation indirection made all that code much harder to follow.
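
The trade-off there looks roughly like this (illustrative sketch only, not
the actual code touched by that commit):

 /* Macro-generated family: compact, but awkward to annotate/instrument: */
 #define ATOMIC_OP(op)							\
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
 	asm volatile(LOCK_PREFIX #op "l %1,%0"				\
 		     : "+m" (v->counter) : "ir" (i) : "memory");	\
 }

 ATOMIC_OP(and)
 ATOMIC_OP(or)
 ATOMIC_OP(xor)

 /*
  * Un-macro-ified: each op spelled out by hand - more lines, but every
  * function is greppable and can be annotated individually:
  */
 static inline void atomic_and(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "andl %1,%0"
 		     : "+m" (v->counter) : "ir" (i) : "memory");
 }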

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  8:47           ` Peter Zijlstra
@ 2018-05-05  9:05             ` Dmitry Vyukov
  -1 siblings, 0 replies; 103+ messages in thread
From: Dmitry Vyukov @ 2018-05-05  9:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Linux ARM, LKML, Andrey Ryabinin,
	Boqun Feng, Catalin Marinas, Will Deacon

On Sat, May 5, 2018 at 10:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Sat, May 05, 2018 at 10:11:00AM +0200, Ingo Molnar wrote:
>
>> Before:
>>
>>  #ifndef atomic_fetch_dec_relaxed
>>
>>  #ifndef atomic_fetch_dec
>>  #define atomic_fetch_dec(v)          atomic_fetch_sub(1, (v))
>>  #define atomic_fetch_dec_relaxed(v)  atomic_fetch_sub_relaxed(1, (v))
>>  #define atomic_fetch_dec_acquire(v)  atomic_fetch_sub_acquire(1, (v))
>>  #define atomic_fetch_dec_release(v)  atomic_fetch_sub_release(1, (v))
>>  #else /* atomic_fetch_dec */
>>  #define atomic_fetch_dec_relaxed     atomic_fetch_dec
>>  #define atomic_fetch_dec_acquire     atomic_fetch_dec
>>  #define atomic_fetch_dec_release     atomic_fetch_dec
>>  #endif /* atomic_fetch_dec */
>>
>>  #else /* atomic_fetch_dec_relaxed */
>>
>>  #ifndef atomic_fetch_dec_acquire
>>  #define atomic_fetch_dec_acquire(...)                                        \
>>       __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>>  #endif
>>
>>  #ifndef atomic_fetch_dec_release
>>  #define atomic_fetch_dec_release(...)                                        \
>>       __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>>  #endif
>>
>>  #ifndef atomic_fetch_dec
>>  #define atomic_fetch_dec(...)                                                \
>>       __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>>  #endif
>>  #endif /* atomic_fetch_dec_relaxed */
>>
>> After:
>>
>>  #ifndef atomic_fetch_dec_relaxed
>>  # ifndef atomic_fetch_dec
>>  #  define atomic_fetch_dec(v)                        atomic_fetch_sub(1, (v))
>>  #  define atomic_fetch_dec_relaxed(v)                atomic_fetch_sub_relaxed(1, (v))
>>  #  define atomic_fetch_dec_acquire(v)                atomic_fetch_sub_acquire(1, (v))
>>  #  define atomic_fetch_dec_release(v)                atomic_fetch_sub_release(1, (v))
>>  # else
>>  #  define atomic_fetch_dec_relaxed           atomic_fetch_dec
>>  #  define atomic_fetch_dec_acquire           atomic_fetch_dec
>>  #  define atomic_fetch_dec_release           atomic_fetch_dec
>>  # endif
>>  #else
>>  # ifndef atomic_fetch_dec_acquire
>>  #  define atomic_fetch_dec_acquire(...)      __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>>  # endif
>>  # ifndef atomic_fetch_dec_release
>>  #  define atomic_fetch_dec_release(...)      __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>>  # endif
>>  # ifndef atomic_fetch_dec
>>  #  define atomic_fetch_dec(...)              __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>>  # endif
>>  #endif
>>
>> The new variant is readable at a glance, and the hierarchy of defines is very
>> obvious as well.
>
> It wraps and looks hideous in my normal setup. And I do detest that indent
> after # thing.
>
>> And I think we could do even better - there's absolutely no reason why _every_
>> operation has to be made conditional on a finegrained level - they are overriden
>> in API groups. In fact allowing individual override is arguably a fragility.
>>
>> So we could do the following simplification on top of that:
>>
>>  #ifndef atomic_fetch_dec_relaxed
>>  # ifndef atomic_fetch_dec
>>  #  define atomic_fetch_dec(v)                        atomic_fetch_sub(1, (v))
>>  #  define atomic_fetch_dec_relaxed(v)                atomic_fetch_sub_relaxed(1, (v))
>>  #  define atomic_fetch_dec_acquire(v)                atomic_fetch_sub_acquire(1, (v))
>>  #  define atomic_fetch_dec_release(v)                atomic_fetch_sub_release(1, (v))
>>  # else
>>  #  define atomic_fetch_dec_relaxed           atomic_fetch_dec
>>  #  define atomic_fetch_dec_acquire           atomic_fetch_dec
>>  #  define atomic_fetch_dec_release           atomic_fetch_dec
>>  # endif
>>  #else
>>  # ifndef atomic_fetch_dec
>>  #  define atomic_fetch_dec(...)              __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>>  #  define atomic_fetch_dec_acquire(...)      __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>>  #  define atomic_fetch_dec_release(...)      __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>>  # endif
>>  #endif
>
> This would disallow an architecture to override just fetch_dec_release for
> instance.
>
> I don't think there currently is any architecture that does that, but the
> intent was to allow it to override anything and only provide defaults where it
> does not.
>
> None of this takes away the giant trainwreck that is the annotated atomic stuff
> though.
>
> And I seriously hate this one:
>
>   ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")
>
> and will likely undo that the moment I need to change anything there.
>
> So no, don't like.

That was asked by Ingo:
https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/Xz1uVWaaAAAJ

I think in the end all of the current options suck in one way or another,
so we are just going in circles.
We either need something different (e.g. codegen), or we need to settle on
one option for doing it.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
@ 2018-05-05  9:05             ` Dmitry Vyukov
  0 siblings, 0 replies; 103+ messages in thread
From: Dmitry Vyukov @ 2018-05-05  9:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, May 5, 2018 at 10:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Sat, May 05, 2018 at 10:11:00AM +0200, Ingo Molnar wrote:
>
>> Before:
>>
>>  #ifndef atomic_fetch_dec_relaxed
>>
>>  #ifndef atomic_fetch_dec
>>  #define atomic_fetch_dec(v)          atomic_fetch_sub(1, (v))
>>  #define atomic_fetch_dec_relaxed(v)  atomic_fetch_sub_relaxed(1, (v))
>>  #define atomic_fetch_dec_acquire(v)  atomic_fetch_sub_acquire(1, (v))
>>  #define atomic_fetch_dec_release(v)  atomic_fetch_sub_release(1, (v))
>>  #else /* atomic_fetch_dec */
>>  #define atomic_fetch_dec_relaxed     atomic_fetch_dec
>>  #define atomic_fetch_dec_acquire     atomic_fetch_dec
>>  #define atomic_fetch_dec_release     atomic_fetch_dec
>>  #endif /* atomic_fetch_dec */
>>
>>  #else /* atomic_fetch_dec_relaxed */
>>
>>  #ifndef atomic_fetch_dec_acquire
>>  #define atomic_fetch_dec_acquire(...)                                        \
>>       __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>>  #endif
>>
>>  #ifndef atomic_fetch_dec_release
>>  #define atomic_fetch_dec_release(...)                                        \
>>       __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>>  #endif
>>
>>  #ifndef atomic_fetch_dec
>>  #define atomic_fetch_dec(...)                                                \
>>       __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>>  #endif
>>  #endif /* atomic_fetch_dec_relaxed */
>>
>> After:
>>
>>  #ifndef atomic_fetch_dec_relaxed
>>  # ifndef atomic_fetch_dec
>>  #  define atomic_fetch_dec(v)                        atomic_fetch_sub(1, (v))
>>  #  define atomic_fetch_dec_relaxed(v)                atomic_fetch_sub_relaxed(1, (v))
>>  #  define atomic_fetch_dec_acquire(v)                atomic_fetch_sub_acquire(1, (v))
>>  #  define atomic_fetch_dec_release(v)                atomic_fetch_sub_release(1, (v))
>>  # else
>>  #  define atomic_fetch_dec_relaxed           atomic_fetch_dec
>>  #  define atomic_fetch_dec_acquire           atomic_fetch_dec
>>  #  define atomic_fetch_dec_release           atomic_fetch_dec
>>  # endif
>>  #else
>>  # ifndef atomic_fetch_dec_acquire
>>  #  define atomic_fetch_dec_acquire(...)      __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>>  # endif
>>  # ifndef atomic_fetch_dec_release
>>  #  define atomic_fetch_dec_release(...)      __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>>  # endif
>>  # ifndef atomic_fetch_dec
>>  #  define atomic_fetch_dec(...)              __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>>  # endif
>>  #endif
>>
>> The new variant is readable at a glance, and the hierarchy of defines is very
>> obvious as well.
>
> It wraps and looks hideous in my normal setup. And I do detest that indent
> after # thing.
>
>> And I think we could do even better - there's absolutely no reason why _every_
>> operation has to be made conditional on a finegrained level - they are overriden
>> in API groups. In fact allowing individual override is arguably a fragility.
>>
>> So we could do the following simplification on top of that:
>>
>>  #ifndef atomic_fetch_dec_relaxed
>>  # ifndef atomic_fetch_dec
>>  #  define atomic_fetch_dec(v)                        atomic_fetch_sub(1, (v))
>>  #  define atomic_fetch_dec_relaxed(v)                atomic_fetch_sub_relaxed(1, (v))
>>  #  define atomic_fetch_dec_acquire(v)                atomic_fetch_sub_acquire(1, (v))
>>  #  define atomic_fetch_dec_release(v)                atomic_fetch_sub_release(1, (v))
>>  # else
>>  #  define atomic_fetch_dec_relaxed           atomic_fetch_dec
>>  #  define atomic_fetch_dec_acquire           atomic_fetch_dec
>>  #  define atomic_fetch_dec_release           atomic_fetch_dec
>>  # endif
>>  #else
>>  # ifndef atomic_fetch_dec
>>  #  define atomic_fetch_dec(...)              __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>>  #  define atomic_fetch_dec_acquire(...)      __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>>  #  define atomic_fetch_dec_release(...)      __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>>  # endif
>>  #endif
>
> This would disallow an architecture to override just fetch_dec_release for
> instance.
>
> I don't think there currently is any architecture that does that, but the
> intent was to allow it to override anything and only provide defaults where it
> does not.
>
> None of this takes away the giant trainwreck that is the annotated atomic stuff
> though.
>
> And I seriously hate this one:
>
>   ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")
>
> and will likely undo that the moment I need to change anything there.
>
> So no, don't like.

That was asked by Ingo:
https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/Xz1uVWaaAAAJ

I think in the end all of the current options suck in one way or another,
so we are just going in circles.
We either need something different (e.g. codegen), or we need to settle on
one option for doing it.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  8:47           ` Peter Zijlstra
@ 2018-05-05  9:09             ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  9:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Sat, May 05, 2018 at 10:11:00AM +0200, Ingo Molnar wrote:
> 
> > Before:
> > 
> >  #ifndef atomic_fetch_dec_relaxed
> > 
> >  #ifndef atomic_fetch_dec
> >  #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
> >  #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> >  #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> >  #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> >  #else /* atomic_fetch_dec */
> >  #define atomic_fetch_dec_relaxed	atomic_fetch_dec
> >  #define atomic_fetch_dec_acquire	atomic_fetch_dec
> >  #define atomic_fetch_dec_release	atomic_fetch_dec
> >  #endif /* atomic_fetch_dec */
> > 
> >  #else /* atomic_fetch_dec_relaxed */
> > 
> >  #ifndef atomic_fetch_dec_acquire
> >  #define atomic_fetch_dec_acquire(...)					\
> > 	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> >  #endif
> > 
> >  #ifndef atomic_fetch_dec_release
> >  #define atomic_fetch_dec_release(...)					\
> > 	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> >  #endif
> > 
> >  #ifndef atomic_fetch_dec
> >  #define atomic_fetch_dec(...)						\
> > 	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> >  #endif
> >  #endif /* atomic_fetch_dec_relaxed */
> > 
> > After:
> > 
> >  #ifndef atomic_fetch_dec_relaxed
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> >  # else
> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> >  # endif
> >  #else
> >  # ifndef atomic_fetch_dec_acquire
> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  # ifndef atomic_fetch_dec_release
> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  #endif
> > 
> > The new variant is readable at a glance, and the hierarchy of defines is very 
> > obvious as well.
> 
> It wraps and looks hideous in my normal setup. And I do detest that indent
> after # thing.

You should use wider terminals if you take a look at such code - there are already 
numerous areas of the kernel that are not readable on 80x25 terminals.

_Please_ try the following experiment, for me:

Enter the 21st century temporarily and widen two of your terminals from 80 cols to 
100 cols - it's only ~20% wider.

Apply the 3 patches I sent and then open the new and the old atomic.h in the two 
terminals and compare them visually.

The new structure is _much_ more compact, it is nicer looking and much more 
readable.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
@ 2018-05-05  9:09             ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  9:09 UTC (permalink / raw)
  To: linux-arm-kernel


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Sat, May 05, 2018 at 10:11:00AM +0200, Ingo Molnar wrote:
> 
> > Before:
> > 
> >  #ifndef atomic_fetch_dec_relaxed
> > 
> >  #ifndef atomic_fetch_dec
> >  #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
> >  #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> >  #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> >  #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> >  #else /* atomic_fetch_dec */
> >  #define atomic_fetch_dec_relaxed	atomic_fetch_dec
> >  #define atomic_fetch_dec_acquire	atomic_fetch_dec
> >  #define atomic_fetch_dec_release	atomic_fetch_dec
> >  #endif /* atomic_fetch_dec */
> > 
> >  #else /* atomic_fetch_dec_relaxed */
> > 
> >  #ifndef atomic_fetch_dec_acquire
> >  #define atomic_fetch_dec_acquire(...)					\
> > 	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> >  #endif
> > 
> >  #ifndef atomic_fetch_dec_release
> >  #define atomic_fetch_dec_release(...)					\
> > 	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> >  #endif
> > 
> >  #ifndef atomic_fetch_dec
> >  #define atomic_fetch_dec(...)						\
> > 	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> >  #endif
> >  #endif /* atomic_fetch_dec_relaxed */
> > 
> > After:
> > 
> >  #ifndef atomic_fetch_dec_relaxed
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> >  # else
> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> >  # endif
> >  #else
> >  # ifndef atomic_fetch_dec_acquire
> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  # ifndef atomic_fetch_dec_release
> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  #endif
> > 
> > The new variant is readable at a glance, and the hierarchy of defines is very 
> > obvious as well.
> 
> It wraps and looks hideous in my normal setup. And I do detest that indent
> after # thing.

You should use wider terminals if you take a look at such code - there are already 
numerous areas of the kernel that are not readable on 80x25 terminals.

_Please_ try the following experiment, for me:

Enter the 21st century temporarily and widen two of your terminals from 80 cols to 
100 cols - it's only ~20% wider.

Apply the 3 patches I sent and then open the new and the old atomic.h in the two 
terminals and compare them visually.

The new structure is _much_ more compact, it is nicer looking and much more 
readable.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH 1/6] locking/atomic, asm-generic: instrument ordering variants
  2018-05-04 18:24         ` Peter Zijlstra
@ 2018-05-05  9:12           ` Mark Rutland
  -1 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-05  9:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arm-kernel, linux-kernel, aryabinin, boqun.feng,
	catalin.marinas, dvyukov, mingo, will.deacon

On Fri, May 04, 2018 at 08:24:22PM +0200, Peter Zijlstra wrote:
> On Fri, May 04, 2018 at 07:09:09PM +0100, Mark Rutland wrote:
> > On Fri, May 04, 2018 at 08:01:05PM +0200, Peter Zijlstra wrote:
> > > On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote:
> 
> > > >  include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++-----
> > > >  1 file changed, 1008 insertions(+), 187 deletions(-)
> > > 
> > > Is there really no way to either generate or further macro compress this?
> > 
> > I can definitely macro compress this somewhat, but the bulk of the
> > repetition will be the ifdeffery, which can't be macro'd away IIUC.
> 
> Right, much like what we already have in linux/atomic.h I suspect,
> having to duplicate that isn't brilliant either.
> 
> > Generating this with a script is possible -- do we do anything like that
> > elsewhere?
> 
> There's include/generated/ in your build directory. But nothing on this
> scale I think.

Sure. I'm not familiar with how we generate those, so I'll go digging through
the build infrastructure.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH 1/6] locking/atomic, asm-generic: instrument ordering variants
@ 2018-05-05  9:12           ` Mark Rutland
  0 siblings, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-05  9:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 04, 2018 at 08:24:22PM +0200, Peter Zijlstra wrote:
> On Fri, May 04, 2018 at 07:09:09PM +0100, Mark Rutland wrote:
> > On Fri, May 04, 2018 at 08:01:05PM +0200, Peter Zijlstra wrote:
> > > On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote:
> 
> > > >  include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++-----
> > > >  1 file changed, 1008 insertions(+), 187 deletions(-)
> > > 
> > > Is there really no way to either generate or further macro compress this?
> > 
> > I can definitely macro compress this somewhat, but the bulk of the
> > repetition will be the ifdeffery, which can't be macro'd away IIUC.
> 
> Right, much like what we already have in linux/atomic.h I suspect,
> having to duplicate that isn't brilliant either.
> 
> > Generating this with a script is possible -- do we do anything like that
> > elsewhere?
> 
> There's include/generated/ in your build directory. But nothing on this
> scale I think.

Sure. I'm not familiar with how we generate those, so I'll go digging through
the build infrastructure.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  9:04             ` Ingo Molnar
@ 2018-05-05  9:24               ` Peter Zijlstra
  -1 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-05  9:24 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon

On Sat, May 05, 2018 at 11:04:03AM +0200, Ingo Molnar wrote:
> 
> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > So we could do the following simplification on top of that:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > 
> > This would disallow an architecture to override just fetch_dec_release for
> > instance.
> 
> Couldn't such a crazy arch just define _all_ the 3 APIs in this group?
> That's really a small price and makes the place pay the complexity
> price that does the weirdness...

I would expect the pattern where an arch can special-case all 'release' and/or
all 'acquire' variants but cannot use the __atomic_op_*() wrappery.
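For reference, the wrappery in question lives in <linux/atomic.h> and builds
the ordered variants out of the _relaxed op plus generic barriers, roughly
(sketch from memory, modulo details):

  #define __atomic_op_acquire(op, args...)				\
  ({									\
	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
	smp_mb__after_atomic();						\
	__ret;								\
  })

  #define __atomic_op_fence(op, args...)				\
  ({									\
	typeof(op##_relaxed(args)) __ret;				\
	smp_mb__before_atomic();					\
	__ret = op##_relaxed(args);					\
	smp_mb__after_atomic();						\
	__ret;								\
  })

An arch with native acquire/release instructions may well want its own
sequences instead of these generic-barrier constructions.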

> > I don't think there currently is any architecture that does that, but the
> > intent was to allow it to override anything and only provide defaults where it
> > does not.
> 
> I'd argue that if a new arch only defines one of these APIs that's probably a bug. 
> If they absolutely want to do it, they still can - by defining all 3 APIs.
> 
> So there's no loss in arch flexibility.

Ideally we'd generate the whole mess.. and then allowing these extra few
overrides is not a problem at all.

> > None of this takes away the giant trainwreck that is the annotated atomic stuff
> > though.
> > 
> > And I seriously hate this one:
> > 
> >   ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")
> > 
> > and will likely undo that the moment I need to change anything there.
> 
> If it makes the code more readable then I don't object - the problem was that the 
> instrumentation indirection made all that code much harder to follow.

Thing is, it is all the exact same loop, and bitrot mandates they drift
over time. When I cleaned up all the architectures I found plenty of cases
where there were spurious differences between things.
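E.g. the x86 atomic_fetch_{and,or,xor}() family is the same cmpxchg loop over
and over; in macro form that is a single template, roughly (a sketch of the
pre-ba1c9f83f633 shape, not the exact code):

  #define ATOMIC_FETCH_OP(op, c_op)					\
  static inline int atomic_fetch_##op(int i, atomic_t *v)		\
  {									\
	int old, val = atomic_read(v);					\
	for (;;) {							\
		old = atomic_cmpxchg(v, val, val c_op i);		\
		if (old == val)						\
			break;						\
		val = old;						\
	}								\
	return old;							\
  }

  ATOMIC_FETCH_OP(and, &)
  ATOMIC_FETCH_OP(or, |)
  ATOMIC_FETCH_OP(xor, ^)

Once that is unrolled into per-op copies, nothing stops the copies from
quietly diverging.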

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
@ 2018-05-05  9:24               ` Peter Zijlstra
  0 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-05  9:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, May 05, 2018 at 11:04:03AM +0200, Ingo Molnar wrote:
> 
> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > So we could do the following simplification on top of that:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > 
> > This would disallow an architecture to override just fetch_dec_release for
> > instance.
> 
> Couldn't such a crazy arch just define _all_ the 3 APIs in this group?
> That's really a small price and makes the place pay the complexity
> price that does the weirdness...

I would expect the pattern where an arch can special-case all 'release' and/or
all 'acquire' variants but cannot use the __atomic_op_*() wrappery.

> > I don't think there currently is any architecture that does that, but the
> > intent was to allow it to override anything and only provide defaults where it
> > does not.
> 
> I'd argue that if a new arch only defines one of these APIs that's probably a bug. 
> If they absolutely want to do it, they still can - by defining all 3 APIs.
> 
> So there's no loss in arch flexibility.

Ideally we'd generate the whole mess.. and then allowing these extra few
overrides is not a problem at all.

> > None of this takes away the giant trainwreck that is the annotated atomic stuff
> > though.
> > 
> > And I seriously hate this one:
> > 
> >   ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")
> > 
> > and will likely undo that the moment I need to change anything there.
> 
> If it makes the code more readable then I don't object - the problem was that the 
> instrumentation indirection made all that code much harder to follow.

Thing is, it is all the exact same loop, and bitrot mandates they drift
over time. When I cleaned up all the architectures I found plenty of cases
where there were spurious differences between things.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  9:09             ` Ingo Molnar
@ 2018-05-05  9:29               ` Peter Zijlstra
  -1 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-05  9:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon

On Sat, May 05, 2018 at 11:09:03AM +0200, Ingo Molnar wrote:
> > >  # ifndef atomic_fetch_dec_acquire
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec_release
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > > 
> > > The new variant is readable at a glance, and the hierarchy of defines is very 
> > > obvious as well.
> > 
> > It wraps and looks hideous in my normal setup. And I do detest that indent
> > after # thing.
> 
> You should use wider terminals if you take a look at such code - there's already 
> numerous areas of the kernel that are not readable on 80x25 terminals.
> 
> _Please_ try the following experiment, for me:
> 
> Enter the 21st century temporarily and widen two of your terminals from 80 cols to 
> 100 cols - it's only ~20% wider.

Doesn't work that way. The only way I get more columns is if I shrink my
font further. I work with tiles per monitor (left/right obv.) and use
two columns per editor. This gets me a total of 4 columns.

On my desktop that is slightly over 100 characters per column, on my
laptop that is slightly below 100 -- mostly because I'm pixel limited on
fontsize on that thing (FullHD sucks).

If it wraps it wraps.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
@ 2018-05-05  9:29               ` Peter Zijlstra
  0 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-05  9:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, May 05, 2018 at 11:09:03AM +0200, Ingo Molnar wrote:
> > >  # ifndef atomic_fetch_dec_acquire
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec_release
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > > 
> > > The new variant is readable at a glance, and the hierarchy of defines is very 
> > > obvious as well.
> > 
> > It wraps and looks hideous in my normal setup. And I do detest that indent
> > after # thing.
> 
> You should use wider terminals if you take a look at such code - there's already 
> numerous areas of the kernel that are not readable on 80x25 terminals.
> 
> _Please_ try the following experiment, for me:
> 
> Enter the 21st century temporarily and widen two of your terminals from 80 cols to 
> 100 cols - it's only ~20% wider.

Doesn't work that way. The only way I get more columns is if I shrink my
font further. I work with tiles per monitor (left/right obv.) and use
two columns per editor. This gets me a total of 4 columns.

On my desktop that is slightly over 100 characters per column, on my
laptop that is slightly below 100 -- mostly because I'm pixel limited on
fontsize on that thing (FullHD sucks).

If it wraps it wraps.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  9:05             ` Dmitry Vyukov
@ 2018-05-05  9:32               ` Peter Zijlstra
  -1 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-05  9:32 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Ingo Molnar, Mark Rutland, Linux ARM, LKML, Andrey Ryabinin,
	Boqun Feng, Catalin Marinas, Will Deacon

On Sat, May 05, 2018 at 11:05:51AM +0200, Dmitry Vyukov wrote:
> On Sat, May 5, 2018 at 10:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > And I seriously hate this one:
> >
> >   ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")
> >
> > and will likely undo that the moment I need to change anything there.

> That was asked by Ingo:
> https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/Xz1uVWaaAAAJ
> 
> I think in the end all of current options suck in one way or another,
> so we are just going in circles.

Yeah, and I disagree with him, but didn't have the energy to fight at
that time (and still don't really, I'm just complaining).

> We either need something different (e.g. codegen), or settle on one
> option for doing it.

Codegen I think is the only sensible option at this point for all the
wrappers. The existing ones (without the annotation muck) were already
cumbersome; the annotation stuff just makes it completely horrid.
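The shape of what would be generated is simple enough - each instrumented
wrapper in atomic-instrumented.h is a variation of (roughly):

  static __always_inline int atomic_fetch_dec_release(atomic_t *v)
  {
	kasan_check_write(v, sizeof(*v));
	return arch_atomic_fetch_dec_release(v);
  }

It's the hundreds of near-identical copies of that, plus the #ifdef
scaffolding around them, that make maintaining it by hand so unattractive.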

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
@ 2018-05-05  9:32               ` Peter Zijlstra
  0 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-05  9:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, May 05, 2018 at 11:05:51AM +0200, Dmitry Vyukov wrote:
> On Sat, May 5, 2018 at 10:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > And I seriously hate this one:
> >
> >   ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")
> >
> > and will likely undo that the moment I need to change anything there.

> That was asked by Ingo:
> https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/Xz1uVWaaAAAJ
> 
> I think in the end all of current options suck in one way or another,
> so we are just going in circles.

Yeah, and I disagree with him, but didn't have the energy to fight at
that time (and still don't really, I'm just complaining).

> We either need something different (e.g. codegen), or settle on one
> option for doing it.

Codegen I think is the only sensible option at this point for all the
wrappers. The existing ones (without the annotation muck) were already
cumbersome; the annotation stuff just makes it completely horrid.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  9:04             ` Ingo Molnar
@ 2018-05-05  9:38               ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  9:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon


* Ingo Molnar <mingo@kernel.org> wrote:

> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > So we could do the following simplification on top of that:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > 
> > This would disallow an architecture to override just fetch_dec_release for
> > instance.
> 
> Couldn't such a crazy arch just define _all_ the 3 APIs in this group?
> That's really a small price and makes the place pay the complexity
> price that does the weirdness...
> 
> > I don't think there currently is any architecture that does that, but the
> > intent was to allow it to override anything and only provide defaults where it
> > does not.
> 
> I'd argue that if a new arch only defines one of these APIs that's probably a bug. 
> If they absolutely want to do it, they still can - by defining all 3 APIs.
> 
> So there's no loss in arch flexibility.

BTW., PowerPC for example is already in such a situation, it does not define 
atomic_cmpxchg_release(), only the other APIs:

#define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
#define atomic_cmpxchg_relaxed(v, o, n) \
	cmpxchg_relaxed(&((v)->counter), (o), (n))
#define atomic_cmpxchg_acquire(v, o, n) \
	cmpxchg_acquire(&((v)->counter), (o), (n))

Was it really the intention on the PowerPC side that the generic code falls back 
to cmpxchg(), i.e.:

#  define atomic_cmpxchg_release(...)           __atomic_op_release(atomic_cmpxchg, __VA_ARGS__)

Which after macro expansion becomes:

	smp_mb__before_atomic();
	atomic_cmpxchg_relaxed(v, o, n);

smp_mb__before_atomic() on PowerPC falls back to the generic __smp_mb(), which 
falls back to mb(), which on PowerPC is the 'sync' instruction.
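(For completeness, the generic wrapper producing that expansion is roughly:

  #define __atomic_op_release(op, args...)				\
  ({									\
	smp_mb__before_atomic();	/* 'sync' on PowerPC */		\
	op##_relaxed(args);						\
  })

i.e. release ordering is synthesized with a full barrier in front of the
relaxed op.)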

Isn't this an inefficiency bug?

While I'm pretty clueless about PowerPC low level cmpxchg atomics, they appear to 
have the following basic structure:

full cmpxchg():

	PPC_ATOMIC_ENTRY_BARRIER # sync
	ldarx + stdcx
	PPC_ATOMIC_EXIT_BARRIER  # sync

cmpxchg_relaxed():

	ldarx + stdcx

cmpxchg_acquire():

	ldarx + stdcx
	PPC_ACQUIRE_BARRIER      # lwsync

The logical extension for cmpxchg_release() would be:

cmpxchg_release():

	PPC_RELEASE_BARRIER      # lwsync
	ldarx + stdcx

But instead we silently get the generic fallback, which does:

	smp_mb__before_atomic();
	atomic_cmpxchg_relaxed(v, o, n);

Which maps to:

	sync
	ldarx + stdcx

Note that it uses a full barrier instead of lwsync (does that stand for 
'lightweight sync'?).

Even if it turns out we need the full barrier, with the overly finegrained 
structure of the atomics this detail is totally undocumented and non-obvious.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
@ 2018-05-05  9:38               ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05  9:38 UTC (permalink / raw)
  To: linux-arm-kernel


* Ingo Molnar <mingo@kernel.org> wrote:

> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > So we could do the following simplification on top of that:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > 
> > This would disallow an architecture to override just fetch_dec_release for
> > instance.
> 
> Couldn't such a crazy arch just define _all_ the 3 APIs in this group?
> That's really a small price and makes the place pay the complexity
> price that does the weirdness...
> 
> > I don't think there currently is any architecture that does that, but the
> > intent was to allow it to override anything and only provide defaults where it
> > does not.
> 
> I'd argue that if a new arch only defines one of these APIs that's probably a bug. 
> If they absolutely want to do it, they still can - by defining all 3 APIs.
> 
> So there's no loss in arch flexibility.

BTW., PowerPC for example is already in such a situation, it does not define 
atomic_cmpxchg_release(), only the other APIs:

#define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
#define atomic_cmpxchg_relaxed(v, o, n) \
	cmpxchg_relaxed(&((v)->counter), (o), (n))
#define atomic_cmpxchg_acquire(v, o, n) \
	cmpxchg_acquire(&((v)->counter), (o), (n))

Was it really the intention on the PowerPC side that the generic code falls back 
to cmpxchg(), i.e.:

#  define atomic_cmpxchg_release(...)           __atomic_op_release(atomic_cmpxchg, __VA_ARGS__)

Which after macro expansion becomes:

	smp_mb__before_atomic();
	atomic_cmpxchg_relaxed(v, o, n);

smp_mb__before_atomic() on PowerPC falls back to the generic __smp_mb(), which 
falls back to mb(), which on PowerPC is the 'sync' instruction.

Isn't this an inefficiency bug?

While I'm pretty clueless about PowerPC low level cmpxchg atomics, they appear to 
have the following basic structure:

full cmpxchg():

	PPC_ATOMIC_ENTRY_BARRIER # sync
	ldarx + stdcx
	PPC_ATOMIC_EXIT_BARRIER  # sync

cmpxchg_relaxed():

	ldarx + stdcx

cmpxchg_acquire():

	ldarx + stdcx
	PPC_ACQUIRE_BARRIER      # lwsync

The logical extension for cmpxchg_release() would be:

cmpxchg_release():

	PPC_RELEASE_BARRIER      # lwsync
	ldarx + stdcx

But instead we silently get the generic fallback, which does:

	smp_mb__before_atomic();
	atomic_cmpxchg_relaxed(v, o, n);

Which maps to:

	sync
	ldarx + stdcx

Note that it uses a full barrier instead of lwsync (does that stand for 
'lightweight sync'?).

Even if it turns out we need the full barrier, with the overly finegrained 
structure of the atomics this detail is totally undocumented and non-obvious.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [RFC PATCH] locking/atomics/powerpc: Introduce optimized cmpxchg_release() family of APIs for PowerPC
  2018-05-05  9:38               ` Ingo Molnar
@ 2018-05-05 10:00                 ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05 10:00 UTC (permalink / raw)
  To: Peter Zijlstra, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon


* Ingo Molnar <mingo@kernel.org> wrote:

> > So there's no loss in arch flexibility.
> 
> BTW., PowerPC for example is already in such a situation, it does not define 
> atomic_cmpxchg_release(), only the other APIs:
> 
> #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
> #define atomic_cmpxchg_relaxed(v, o, n) \
> 	cmpxchg_relaxed(&((v)->counter), (o), (n))
> #define atomic_cmpxchg_acquire(v, o, n) \
> 	cmpxchg_acquire(&((v)->counter), (o), (n))
> 
> Was it really the intention on the PowerPC side that the generic code falls back 
> to cmpxchg(), i.e.:
> 
> #  define atomic_cmpxchg_release(...)           __atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> 
> Which after macro expansion becomes:
> 
> 	smp_mb__before_atomic();
> 	atomic_cmpxchg_relaxed(v, o, n);
> 
> smp_mb__before_atomic() on PowerPC falls back to the generic __smp_mb(), which 
> falls back to mb(), which on PowerPC is the 'sync' instruction.
> 
> Isn't this an inefficiency bug?
> 
> While I'm pretty clueless about PowerPC low level cmpxchg atomics, they appear to 
> have the following basic structure:
> 
> full cmpxchg():
> 
> 	PPC_ATOMIC_ENTRY_BARRIER # sync
> 	ldarx + stdcx
> 	PPC_ATOMIC_EXIT_BARRIER  # sync
> 
> cmpxchg_relaxed():
> 
> 	ldarx + stdcx
> 
> cmpxchg_acquire():
> 
> 	ldarx + stdcx
> 	PPC_ACQUIRE_BARRIER      # lwsync
> 
> The logical extension for cmpxchg_release() would be:
> 
> cmpxchg_release():
> 
> 	PPC_RELEASE_BARRIER      # lwsync
> 	ldarx + stdcx
> 
> But instead we silently get the generic fallback, which does:
> 
> 	smp_mb__before_atomic();
> 	atomic_cmpxchg_relaxed(v, o, n);
> 
> Which maps to:
> 
> 	sync
> 	ldarx + stdcx
> 
> Note that it uses a full barrier instead of lwsync (does that stand for 
> 'lightweight sync'?).
> 
> Even if it turns out we need the full barrier, with the overly finegrained 
> structure of the atomics this detail is totally undocumented and non-obvious.

The patch below fills in those bits and implements the optimized cmpxchg_release() 
family of APIs. The end effect should be that cmpxchg_release() will now use 
'lwsync' instead of 'sync' on PowerPC, for the following APIs:

  cmpxchg_release()
  cmpxchg64_release()
  atomic_cmpxchg_release()
  atomic64_cmpxchg_release()

I based this choice of the release barrier on an existing bitops low level PowerPC 
method:

   DEFINE_BITOP(clear_bits_unlock, andc, PPC_RELEASE_BARRIER)

This clearly suggests that PPC_RELEASE_BARRIER is in active use and 'lwsync' is 
the 'release barrier' instruction, if I interpreted that right.
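The practical difference shows up in any release-style hand-over, e.g.
(made-up caller, purely to show where the barrier strength matters):

	/* publish the payload, then release ownership of the slot: */
	slot->data = payload;
	old = atomic_cmpxchg_release(&slot->owner, me, SLOT_FREE);

With the patch the release ordering here comes from lwsync + the ll/sc loop
rather than a full sync + the ll/sc loop - assuming lwsync is indeed the
right release barrier on PowerPC.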

But I know very little about PowerPC so this might be spectacularly wrong. It's
totally untested as well. I'm also pretty sick today, so my mental capabilities
are significantly reduced ...

So not signed off and such.

Thanks,

	Ingo

---
 arch/powerpc/include/asm/atomic.h  |  4 ++
 arch/powerpc/include/asm/cmpxchg.h | 81 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..f7a6f29acb12 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -213,6 +213,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
@@ -519,6 +521,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic64_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..6e46310b1833 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -213,10 +213,12 @@ __xchg_relaxed(void *ptr, unsigned long x, unsigned int size)
 CMPXCHG_GEN(u8, , PPC_ATOMIC_ENTRY_BARRIER, PPC_ATOMIC_EXIT_BARRIER, "memory");
 CMPXCHG_GEN(u8, _local, , , "memory");
 CMPXCHG_GEN(u8, _acquire, , PPC_ACQUIRE_BARRIER, "memory");
+CMPXCHG_GEN(u8, _release, PPC_RELEASE_BARRIER, , "memory");
 CMPXCHG_GEN(u8, _relaxed, , , "cc");
 CMPXCHG_GEN(u16, , PPC_ATOMIC_ENTRY_BARRIER, PPC_ATOMIC_EXIT_BARRIER, "memory");
 CMPXCHG_GEN(u16, _local, , , "memory");
 CMPXCHG_GEN(u16, _acquire, , PPC_ACQUIRE_BARRIER, "memory");
+CMPXCHG_GEN(u16, _release, PPC_RELEASE_BARRIER, , "memory");
 CMPXCHG_GEN(u16, _relaxed, , , "cc");
 
 static __always_inline unsigned long
@@ -314,6 +316,29 @@ __cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new)
 	return prev;
 }
 
+static __always_inline unsigned long
+__cmpxchg_u32_release(u32 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+	PPC_RELEASE_BARRIER
+"1:	lwarx	%0,0,%2		# __cmpxchg_u32_release\n"
+"	cmpw	0,%0,%3\n"
+"	bne-	2f\n"
+	PPC405_ERR77(0, %2)
+"	stwcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
+
+
 #ifdef CONFIG_PPC64
 static __always_inline unsigned long
 __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
@@ -397,6 +422,27 @@ __cmpxchg_u64_acquire(u64 *p, unsigned long old, unsigned long new)
 
 	return prev;
 }
+
+static __always_inline unsigned long
+__cmpxchg_u64_release(u64 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+	PPC_RELEASE_BARRIER
+"1:	ldarx	%0,0,%2		# __cmpxchg_u64_release\n"
+"	cmpd	0,%0,%3\n"
+"	bne-	2f\n"
+"	stdcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
 #endif
 
 static __always_inline unsigned long
@@ -478,6 +524,27 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON_MSG(1, "Unsupported size for __cmpxchg_acquire");
 	return old;
 }
+
+static __always_inline unsigned long
+__cmpxchg_release(void *ptr, unsigned long old, unsigned long new,
+		  unsigned int size)
+{
+	switch (size) {
+	case 1:
+		return __cmpxchg_u8_release(ptr, old, new);
+	case 2:
+		return __cmpxchg_u16_release(ptr, old, new);
+	case 4:
+		return __cmpxchg_u32_release(ptr, old, new);
+#ifdef CONFIG_PPC64
+	case 8:
+		return __cmpxchg_u64_release(ptr, old, new);
+#endif
+	}
+	BUILD_BUG_ON_MSG(1, "Unsupported size for __cmpxchg_release");
+	return old;
+}
+
 #define cmpxchg(ptr, o, n)						 \
   ({									 \
      __typeof__(*(ptr)) _o_ = (o);					 \
@@ -512,6 +579,15 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+#define cmpxchg_release(ptr, o, n)					\
+({									\
+	__typeof__(*(ptr)) _o_ = (o);					\
+	__typeof__(*(ptr)) _n_ = (n);					\
+	(__typeof__(*(ptr))) __cmpxchg_release((ptr),			\
+			(unsigned long)_o_, (unsigned long)_n_,		\
+			sizeof(*(ptr)));				\
+})
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +609,11 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+#define cmpxchg64_release(ptr, o, n)					\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	cmpxchg_release((ptr), (o), (n));				\
+})
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [RFC PATCH] locking/atomics/powerpc: Introduce optimized cmpxchg_release() family of APIs for PowerPC
@ 2018-05-05 10:00                 ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05 10:00 UTC (permalink / raw)
  To: linux-arm-kernel


* Ingo Molnar <mingo@kernel.org> wrote:

> > So there's no loss in arch flexibility.
> 
> BTW., PowerPC for example is already in such a situation, it does not define 
> atomic_cmpxchg_release(), only the other APIs:
> 
> #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
> #define atomic_cmpxchg_relaxed(v, o, n) \
> 	cmpxchg_relaxed(&((v)->counter), (o), (n))
> #define atomic_cmpxchg_acquire(v, o, n) \
> 	cmpxchg_acquire(&((v)->counter), (o), (n))
> 
> Was it really the intention on the PowerPC side that the generic code falls back 
> to cmpxchg(), i.e.:
> 
> #  define atomic_cmpxchg_release(...)           __atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> 
> Which after macro expansion becomes:
> 
> 	smp_mb__before_atomic();
> 	atomic_cmpxchg_relaxed(v, o, n);
> 
> smp_mb__before_atomic() on PowerPC falls back to the generic __smp_mb(), which 
> falls back to mb(), which on PowerPC is the 'sync' instruction.
> 
> Isn't this an inefficiency bug?
> 
> While I'm pretty clueless about PowerPC low level cmpxchg atomics, they appear to 
> have the following basic structure:
> 
> full cmpxchg():
> 
> 	PPC_ATOMIC_ENTRY_BARRIER # sync
> 	ldarx + stdcx
> 	PPC_ATOMIC_EXIT_BARRIER  # sync
> 
> cmpxchg_relaxed():
> 
> 	ldarx + stdcx
> 
> cmpxchg_acquire():
> 
> 	ldarx + stdcx
> 	PPC_ACQUIRE_BARRIER      # lwsync
> 
> The logical extension for cmpxchg_release() would be:
> 
> cmpxchg_release():
> 
> 	PPC_RELEASE_BARRIER      # lwsync
> 	ldarx + stdcx
> 
> But instead we silently get the generic fallback, which does:
> 
> 	smp_mb__before_atomic();
> 	atomic_cmpxchg_relaxed(v, o, n);
> 
> Which maps to:
> 
> 	sync
> 	ldarx + stdcx
> 
> Note that it uses a full barrier instead of lwsync (does that stand for 
> 'lightweight sync'?).
> 
> Even if it turns out we need the full barrier, with the overly finegrained 
> structure of the atomics this detail is totally undocumented and non-obvious.

The patch below fills in those bits and implements the optimized cmpxchg_release() 
family of APIs. The end effect should be that cmpxchg_release() will now use 
'lwsync' instead of 'sync' on PowerPC, for the following APIs:

  cmpxchg_release()
  cmpxchg64_release()
  atomic_cmpxchg_release()
  atomic64_cmpxchg_release()

I based this choice of the release barrier on an existing bitops low level PowerPC 
method:

   DEFINE_BITOP(clear_bits_unlock, andc, PPC_RELEASE_BARRIER)

This clearly suggests that PPC_RELEASE_BARRIER is in active use and 'lwsync' is 
the 'release barrier' instruction, if I interpreted that right.

But I know very little about PowerPC so this might be spectacularly wrong. It's
totally untested as well. I'm also pretty sick today, so my mental capabilities
are significantly reduced ...

So not signed off and such.

Thanks,

	Ingo

---
 arch/powerpc/include/asm/atomic.h  |  4 ++
 arch/powerpc/include/asm/cmpxchg.h | 81 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..f7a6f29acb12 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -213,6 +213,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
@@ -519,6 +521,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic64_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..6e46310b1833 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -213,10 +213,12 @@ __xchg_relaxed(void *ptr, unsigned long x, unsigned int size)
 CMPXCHG_GEN(u8, , PPC_ATOMIC_ENTRY_BARRIER, PPC_ATOMIC_EXIT_BARRIER, "memory");
 CMPXCHG_GEN(u8, _local, , , "memory");
 CMPXCHG_GEN(u8, _acquire, , PPC_ACQUIRE_BARRIER, "memory");
+CMPXCHG_GEN(u8, _release, PPC_RELEASE_BARRIER, , "memory");
 CMPXCHG_GEN(u8, _relaxed, , , "cc");
 CMPXCHG_GEN(u16, , PPC_ATOMIC_ENTRY_BARRIER, PPC_ATOMIC_EXIT_BARRIER, "memory");
 CMPXCHG_GEN(u16, _local, , , "memory");
 CMPXCHG_GEN(u16, _acquire, , PPC_ACQUIRE_BARRIER, "memory");
+CMPXCHG_GEN(u16, _release, PPC_RELEASE_BARRIER, , "memory");
 CMPXCHG_GEN(u16, _relaxed, , , "cc");
 
 static __always_inline unsigned long
@@ -314,6 +316,29 @@ __cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new)
 	return prev;
 }
 
+static __always_inline unsigned long
+__cmpxchg_u32_release(u32 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+	PPC_RELEASE_BARRIER
+"1:	lwarx	%0,0,%2		# __cmpxchg_u32_release\n"
+"	cmpw	0,%0,%3\n"
+"	bne-	2f\n"
+	PPC405_ERR77(0, %2)
+"	stwcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
+
+
 #ifdef CONFIG_PPC64
 static __always_inline unsigned long
 __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
@@ -397,6 +422,27 @@ __cmpxchg_u64_acquire(u64 *p, unsigned long old, unsigned long new)
 
 	return prev;
 }
+
+static __always_inline unsigned long
+__cmpxchg_u64_release(u64 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+	PPC_RELEASE_BARRIER
+"1:	ldarx	%0,0,%2		# __cmpxchg_u64_release\n"
+"	cmpd	0,%0,%3\n"
+"	bne-	2f\n"
+"	stdcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
 #endif
 
 static __always_inline unsigned long
@@ -478,6 +524,27 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON_MSG(1, "Unsupported size for __cmpxchg_acquire");
 	return old;
 }
+
+static __always_inline unsigned long
+__cmpxchg_release(void *ptr, unsigned long old, unsigned long new,
+		  unsigned int size)
+{
+	switch (size) {
+	case 1:
+		return __cmpxchg_u8_release(ptr, old, new);
+	case 2:
+		return __cmpxchg_u16_release(ptr, old, new);
+	case 4:
+		return __cmpxchg_u32_release(ptr, old, new);
+#ifdef CONFIG_PPC64
+	case 8:
+		return __cmpxchg_u64_release(ptr, old, new);
+#endif
+	}
+	BUILD_BUG_ON_MSG(1, "Unsupported size for __cmpxchg_release");
+	return old;
+}
+
 #define cmpxchg(ptr, o, n)						 \
   ({									 \
      __typeof__(*(ptr)) _o_ = (o);					 \
@@ -512,6 +579,15 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+#define cmpxchg_release(ptr, o, n)					\
+({									\
+	__typeof__(*(ptr)) _o_ = (o);					\
+	__typeof__(*(ptr)) _n_ = (n);					\
+	(__typeof__(*(ptr))) __cmpxchg_release((ptr),			\
+			(unsigned long)_o_, (unsigned long)_n_,		\
+			sizeof(*(ptr)));				\
+})
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +609,11 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+#define cmpxchg64_release(ptr, o, n)					\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	cmpxchg_release((ptr), (o), (n));				\
+})
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  9:38               ` Ingo Molnar
@ 2018-05-05 10:16                 ` Boqun Feng
  -1 siblings, 0 replies; 103+ messages in thread
From: Boqun Feng @ 2018-05-05 10:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Mark Rutland, linux-arm-kernel, linux-kernel,
	aryabinin, catalin.marinas, dvyukov, will.deacon

On Sat, May 05, 2018 at 11:38:29AM +0200, Ingo Molnar wrote:
> 
> * Ingo Molnar <mingo@kernel.org> wrote:
> 
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > > > So we could do the following simplification on top of that:
> > > > 
> > > >  #ifndef atomic_fetch_dec_relaxed
> > > >  # ifndef atomic_fetch_dec
> > > >  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
> > > >  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> > > >  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> > > >  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> > > >  # else
> > > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > > >  # endif
> > > >  #else
> > > >  # ifndef atomic_fetch_dec
> > > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > > >  # endif
> > > >  #endif
> > > 
> > > This would disallow an architecture to override just fetch_dec_release for
> > > instance.
> > 
> > Couldn't such a crazy arch just define _all_ the 3 APIs in this group?
> > That's really a small price, and it makes the place that does the
> > weirdness pay the complexity price...
> > 
> > > I don't think there currently is any architecture that does that, but the
> > > intent was to allow it to override anything and only provide defaults where it
> > > does not.
> > 
> > I'd argue that if a new arch only defines one of these APIs that's probably a bug. 
> > If they absolutely want to do it, they still can - by defining all 3 APIs.
> > 
> > So there's no loss in arch flexibility.
> 
> BTW., PowerPC for example is already in such a situation, it does not define 
> atomic_cmpxchg_release(), only the other APIs:
> 
> #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
> #define atomic_cmpxchg_relaxed(v, o, n) \
> 	cmpxchg_relaxed(&((v)->counter), (o), (n))
> #define atomic_cmpxchg_acquire(v, o, n) \
> 	cmpxchg_acquire(&((v)->counter), (o), (n))
> 
> Was it really the intention on the PowerPC side that the generic code falls back 
> to cmpxchg(), i.e.:
> 
> #  define atomic_cmpxchg_release(...)           __atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> 

So ppc has its own definition of __atomic_op_release() in
arch/powerpc/include/asm/atomic.h:

	#define __atomic_op_release(op, args...)				\
	({									\
		__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
		op##_relaxed(args);						\
	})

, and PPC_RELEASE_BARRIER is lwsync, so we map to

	lwsync();
	atomic_cmpxchg_relaxed(v, o, n);

And the reason why we define atomic_cmpxchg_acquire() but not
atomic_cmpxchg_release() is that atomic_cmpxchg_*() provides no ordering
guarantee if the cmp fails. We exploit that for atomic_cmpxchg_acquire()
(the barrier is skipped on failure), but not for atomic_cmpxchg_release(),
because doing the same there would put a memory barrier inside the ll/sc
critical section; please see the comment before __cmpxchg_u32_acquire() in
arch/powerpc/include/asm/cmpxchg.h:

	/*
	 * cmpxchg family don't have order guarantee if cmp part fails, therefore we
	 * can avoid superfluous barriers if we use assembly code to implement
	 * cmpxchg() and cmpxchg_acquire(), however we don't do the similar for
	 * cmpxchg_release() because that will result in putting a barrier in the
	 * middle of a ll/sc loop, which is probably a bad idea. For example, this
	 * might cause the conditional store more likely to fail.
	 */
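
To make the barrier placement concrete, here is a small portable C sketch
(toy code only, not the kernel implementation or PowerPC asm:
smp_lwsync(), toy_cmpxchg_acquire() and toy_cmpxchg_release() are made-up
stand-ins, and a C11 fence merely marks where lwsync would sit relative to
the retry loop):

#include <stdio.h>
#include <stdatomic.h>

#define smp_lwsync()	atomic_thread_fence(memory_order_seq_cst)	/* stand-in for lwsync */

static unsigned int toy_cmpxchg_acquire(_Atomic unsigned int *p,
					unsigned int old, unsigned int new)
{
	unsigned int prev = old;

	/* the relaxed CAS loop stands in for the lwarx/stwcx. sequence */
	while (!atomic_compare_exchange_weak_explicit(p, &prev, new,
						      memory_order_relaxed,
						      memory_order_relaxed)) {
		if (prev != old)
			return prev;	/* cmp failed: no barrier at all */
	}
	smp_lwsync();			/* acquire barrier: after the loop, success only */
	return prev;
}

static unsigned int toy_cmpxchg_release(_Atomic unsigned int *p,
					unsigned int old, unsigned int new)
{
	unsigned int prev = old;

	/*
	 * The release barrier must order earlier accesses before the store,
	 * so it sits before the whole loop; a "success only" release would
	 * have to move it inside the loop, before every store attempt.
	 */
	smp_lwsync();
	while (!atomic_compare_exchange_weak_explicit(p, &prev, new,
						      memory_order_relaxed,
						      memory_order_relaxed)) {
		if (prev != old)
			return prev;	/* cmp failed: no ordering guarantee */
	}
	return prev;
}

int main(void)
{
	_Atomic unsigned int v = 1;

	printf("acquire: prev=%u\n", toy_cmpxchg_acquire(&v, 1, 2));
	printf("release: prev=%u\n", toy_cmpxchg_release(&v, 2, 3));
	return 0;
}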

Regards,
Boqun


> Which after macro expansion becomes:
> 
> 	smp_mb__before_atomic();
> 	atomic_cmpxchg_relaxed(v, o, n);
> 
> smp_mb__before_atomic() on PowerPC falls back to the generic __smp_mb(), which 
> falls back to mb(), which on PowerPC is the 'sync' instruction.
> 
> Isn't this an inefficiency bug?
> 
> While I'm pretty clueless about PowerPC low level cmpxchg atomics, they appear to 
> have the following basic structure:
> 
> full cmpxchg():
> 
> 	PPC_ATOMIC_ENTRY_BARRIER # sync
> 	ldarx + stdcx
> 	PPC_ATOMIC_EXIT_BARRIER  # sync
> 
> cmpxchg_relaxed():
> 
> 	ldarx + stdcx
> 
> cmpxchg_acquire():
> 
> 	ldarx + stdcx
> 	PPC_ACQUIRE_BARRIER      # lwsync
> 
> The logical extension for cmpxchg_release() would be:
> 
> cmpxchg_release():
> 
> 	PPC_RELEASE_BARRIER      # lwsync
> 	ldarx + stdcx
> 
> But instead we silently get the generic fallback, which does:
> 
> 	smp_mb__before_atomic();
> 	atomic_cmpxchg_relaxed(v, o, n);
> 
> Which maps to:
> 
> 	sync
> 	ldarx + stdcx
> 
> Note that it uses a full barrier instead of lwsync (does that stand for 
> 'lightweight sync'?).
> 
> Even if it turns out we need the full barrier, with the overly fine-grained 
> structure of the atomics this detail is totally undocumented and non-obvious.
> 
> Thanks,
> 
> 	Ingo


^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [RFC PATCH] locking/atomics/powerpc: Introduce optimized cmpxchg_release() family of APIs for PowerPC
  2018-05-05 10:00                 ` Ingo Molnar
@ 2018-05-05 10:26                   ` Boqun Feng
  -1 siblings, 0 replies; 103+ messages in thread
From: Boqun Feng @ 2018-05-05 10:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Mark Rutland, linux-arm-kernel, linux-kernel,
	aryabinin, catalin.marinas, dvyukov, will.deacon

Hi Ingo,

On Sat, May 05, 2018 at 12:00:55PM +0200, Ingo Molnar wrote:
> 
> * Ingo Molnar <mingo@kernel.org> wrote:
> 
> > > So there's no loss in arch flexibility.
> > 
> > BTW., PowerPC for example is already in such a situation, it does not define 
> > atomic_cmpxchg_release(), only the other APIs:
> > 
> > #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
> > #define atomic_cmpxchg_relaxed(v, o, n) \
> > 	cmpxchg_relaxed(&((v)->counter), (o), (n))
> > #define atomic_cmpxchg_acquire(v, o, n) \
> > 	cmpxchg_acquire(&((v)->counter), (o), (n))
> > 
> > Was it really the intention on the PowerPC side that the generic code falls back 
> > to cmpxchg(), i.e.:
> > 
> > #  define atomic_cmpxchg_release(...)           __atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> > 
> > Which after macro expansion becomes:
> > 
> > 	smp_mb__before_atomic();
> > 	atomic_cmpxchg_relaxed(v, o, n);
> > 
> > smp_mb__before_atomic() on PowerPC falls back to the generic __smp_mb(), which 
> > falls back to mb(), which on PowerPC is the 'sync' instruction.
> > 
> > Isn't this an inefficiency bug?
> > 
> > While I'm pretty clueless about PowerPC low level cmpxchg atomics, they appear to 
> > have the following basic structure:
> > 
> > full cmpxchg():
> > 
> > 	PPC_ATOMIC_ENTRY_BARRIER # sync
> > 	ldarx + stdcx
> > 	PPC_ATOMIC_EXIT_BARRIER  # sync
> > 
> > cmpxchg_relaxed():
> > 
> > 	ldarx + stdcx
> > 
> > cmpxchg_acquire():
> > 
> > 	ldarx + stdcx
> > 	PPC_ACQUIRE_BARRIER      # lwsync
> > 
> > The logical extension for cmpxchg_release() would be:
> > 
> > cmpxchg_release():
> > 
> > 	PPC_RELEASE_BARRIER      # lwsync
> > 	ldarx + stdcx
> > 
> > But instead we silently get the generic fallback, which does:
> > 
> > 	smp_mb__before_atomic();
> > 	atomic_cmpxchg_relaxed(v, o, n);
> > 
> > Which maps to:
> > 
> > 	sync
> > 	ldarx + stdcx
> > 
> > Note that it uses a full barrier instead of lwsync (does that stand for 
> > 'lightweight sync'?).
> > 
> > Even if it turns out we need the full barrier, with the overly fine-grained 
> > structure of the atomics this detail is totally undocumented and non-obvious.
> 
> The patch below fills in those bits and implements the optimized cmpxchg_release() 
> family of APIs. The end effect should be that cmpxchg_release() will now use 
> 'lwsync' instead of 'sync' on PowerPC, for the following APIs:
> 
>   cmpxchg_release()
>   cmpxchg64_release()
>   atomic_cmpxchg_release()
>   atomic64_cmpxchg_release()
> 
> I based this choice of the release barrier on an existing bitops low level PowerPC 
> method:
> 
>    DEFINE_BITOP(clear_bits_unlock, andc, PPC_RELEASE_BARRIER)
> 
> This clearly suggests that PPC_RELEASE_BARRIER is in active use and 'lwsync' is 
> the 'release barrier' instruction, if I interpreted that right.
> 

Thanks for looking into this, but as I said in the other email:

	https://marc.info/?l=linux-kernel&m=152551511324210&w=2

, we actually generate lightweight barriers for the cmpxchg_release()
family.
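
As a stand-alone illustration of that wrapper mechanism (the arch
overriding __atomic_op_release() so the _release variants get a
lightweight barrier plus the relaxed op), here is a toy model - compileable
with GCC/Clang, but my_op_release(), my_cmpxchg_relaxed() and friends are
made-up names and a printf() stands in for the lwsync barrier:

#include <stdio.h>

#define my_release_barrier()	printf("lwsync\t\t(release barrier)\n")

/* arch-style override: emit the release barrier, then call the relaxed op */
#define my_op_release(op, args...)		\
({						\
	my_release_barrier();			\
	op##_relaxed(args);			\
})

static int my_cmpxchg_relaxed(int *p, int old, int new)
{
	int prev = *p;

	if (prev == old)
		*p = new;
	printf("cmpxchg_relaxed\t(prev=%d)\n", prev);
	return prev;
}

/* the _release variant is built by wrapping the relaxed op */
#define my_cmpxchg_release(p, o, n)	my_op_release(my_cmpxchg, p, o, n)

int main(void)
{
	int v = 1;

	my_cmpxchg_release(&v, 1, 2);	/* prints the barrier first, then the op */
	return 0;
}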

The reason for the asymmetry between cmpxchg_acquire() and
cmpxchg_release() is that we want to save a barrier for
cmpxchg_acquire() if the cmp fails, but doing the same for
cmpxchg_release() would put a barrier inside the ll/sc loop, which may
be a bad idea.

> But I know very little about PowerPC so this might be spectacularly wrong. It's 
> totally untested as well. I'm also pretty sick today so my mental capabilities are 
> significantly reduced ...
> 

Sorry to hear that - hope you get well soon!

Please let me know if you think I should add more documentation to make
this clearer.

Regards,
Boqun

> So not signed off and such.
> 
> Thanks,
> 
> 	Ingo
> 
> ---
>  arch/powerpc/include/asm/atomic.h  |  4 ++
>  arch/powerpc/include/asm/cmpxchg.h | 81 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 85 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
> index 682b3e6a1e21..f7a6f29acb12 100644
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -213,6 +213,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
>  	cmpxchg_relaxed(&((v)->counter), (o), (n))
>  #define atomic_cmpxchg_acquire(v, o, n) \
>  	cmpxchg_acquire(&((v)->counter), (o), (n))
> +#define atomic_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
>  
>  #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
>  #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
> @@ -519,6 +521,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
>  	cmpxchg_relaxed(&((v)->counter), (o), (n))
>  #define atomic64_cmpxchg_acquire(v, o, n) \
>  	cmpxchg_acquire(&((v)->counter), (o), (n))
> +#define atomic64_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
>  
>  #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
>  #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
> diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
> index 9b001f1f6b32..6e46310b1833 100644
> --- a/arch/powerpc/include/asm/cmpxchg.h
> +++ b/arch/powerpc/include/asm/cmpxchg.h
> @@ -213,10 +213,12 @@ __xchg_relaxed(void *ptr, unsigned long x, unsigned int size)
>  CMPXCHG_GEN(u8, , PPC_ATOMIC_ENTRY_BARRIER, PPC_ATOMIC_EXIT_BARRIER, "memory");
>  CMPXCHG_GEN(u8, _local, , , "memory");
>  CMPXCHG_GEN(u8, _acquire, , PPC_ACQUIRE_BARRIER, "memory");
> +CMPXCHG_GEN(u8, _release, PPC_RELEASE_BARRIER, , "memory");
>  CMPXCHG_GEN(u8, _relaxed, , , "cc");
>  CMPXCHG_GEN(u16, , PPC_ATOMIC_ENTRY_BARRIER, PPC_ATOMIC_EXIT_BARRIER, "memory");
>  CMPXCHG_GEN(u16, _local, , , "memory");
>  CMPXCHG_GEN(u16, _acquire, , PPC_ACQUIRE_BARRIER, "memory");
> +CMPXCHG_GEN(u16, _release, PPC_RELEASE_BARRIER, , "memory");
>  CMPXCHG_GEN(u16, _relaxed, , , "cc");
>  
>  static __always_inline unsigned long
> @@ -314,6 +316,29 @@ __cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new)
>  	return prev;
>  }
>  
> +static __always_inline unsigned long
> +__cmpxchg_u32_release(u32 *p, unsigned long old, unsigned long new)
> +{
> +	unsigned long prev;
> +
> +	__asm__ __volatile__ (
> +	PPC_RELEASE_BARRIER
> +"1:	lwarx	%0,0,%2		# __cmpxchg_u32_release\n"
> +"	cmpw	0,%0,%3\n"
> +"	bne-	2f\n"
> +	PPC405_ERR77(0, %2)
> +"	stwcx.	%4,0,%2\n"
> +"	bne-	1b\n"
> +	"\n"
> +"2:"
> +	: "=&r" (prev), "+m" (*p)
> +	: "r" (p), "r" (old), "r" (new)
> +	: "cc", "memory");
> +
> +	return prev;
> +}
> +
> +
>  #ifdef CONFIG_PPC64
>  static __always_inline unsigned long
>  __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
> @@ -397,6 +422,27 @@ __cmpxchg_u64_acquire(u64 *p, unsigned long old, unsigned long new)
>  
>  	return prev;
>  }
> +
> +static __always_inline unsigned long
> +__cmpxchg_u64_release(u64 *p, unsigned long old, unsigned long new)
> +{
> +	unsigned long prev;
> +
> +	__asm__ __volatile__ (
> +	PPC_RELEASE_BARRIER
> +"1:	ldarx	%0,0,%2		# __cmpxchg_u64_release\n"
> +"	cmpd	0,%0,%3\n"
> +"	bne-	2f\n"
> +"	stdcx.	%4,0,%2\n"
> +"	bne-	1b\n"
> +	"\n"
> +"2:"
> +	: "=&r" (prev), "+m" (*p)
> +	: "r" (p), "r" (old), "r" (new)
> +	: "cc", "memory");
> +
> +	return prev;
> +}
>  #endif
>  
>  static __always_inline unsigned long
> @@ -478,6 +524,27 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
>  	BUILD_BUG_ON_MSG(1, "Unsupported size for __cmpxchg_acquire");
>  	return old;
>  }
> +
> +static __always_inline unsigned long
> +__cmpxchg_release(void *ptr, unsigned long old, unsigned long new,
> +		  unsigned int size)
> +{
> +	switch (size) {
> +	case 1:
> +		return __cmpxchg_u8_release(ptr, old, new);
> +	case 2:
> +		return __cmpxchg_u16_release(ptr, old, new);
> +	case 4:
> +		return __cmpxchg_u32_release(ptr, old, new);
> +#ifdef CONFIG_PPC64
> +	case 8:
> +		return __cmpxchg_u64_release(ptr, old, new);
> +#endif
> +	}
> +	BUILD_BUG_ON_MSG(1, "Unsupported size for __cmpxchg_release");
> +	return old;
> +}
> +
>  #define cmpxchg(ptr, o, n)						 \
>    ({									 \
>       __typeof__(*(ptr)) _o_ = (o);					 \
> @@ -512,6 +579,15 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
>  			(unsigned long)_o_, (unsigned long)_n_,		\
>  			sizeof(*(ptr)));				\
>  })
> +
> +#define cmpxchg_release(ptr, o, n)					\
> +({									\
> +	__typeof__(*(ptr)) _o_ = (o);					\
> +	__typeof__(*(ptr)) _n_ = (n);					\
> +	(__typeof__(*(ptr))) __cmpxchg_release((ptr),			\
> +			(unsigned long)_o_, (unsigned long)_n_,		\
> +			sizeof(*(ptr)));				\
> +})
>  #ifdef CONFIG_PPC64
>  #define cmpxchg64(ptr, o, n)						\
>    ({									\
> @@ -533,6 +609,11 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
>  	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
>  	cmpxchg_acquire((ptr), (o), (n));				\
>  })
> +#define cmpxchg64_release(ptr, o, n)					\
> +({									\
> +	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
> +	cmpxchg_release((ptr), (o), (n));				\
> +})
>  #else
>  #include <asm-generic/cmpxchg-local.h>
>  #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [RFC PATCH] locking/atomics/powerpc: Clarify why the cmpxchg_release() family of APIs falls back to full cmpxchg()
  2018-05-05 10:16                 ` Boqun Feng
@ 2018-05-05 10:35                   ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05 10:35 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Peter Zijlstra, Mark Rutland, linux-arm-kernel, linux-kernel,
	aryabinin, catalin.marinas, dvyukov, will.deacon


* Boqun Feng <boqun.feng@gmail.com> wrote:

> On Sat, May 05, 2018 at 11:38:29AM +0200, Ingo Molnar wrote:
> > 
> > * Ingo Molnar <mingo@kernel.org> wrote:
> > 
> > > * Peter Zijlstra <peterz@infradead.org> wrote:
> > > 
> > > > > So we could do the following simplification on top of that:
> > > > > 
> > > > >  #ifndef atomic_fetch_dec_relaxed
> > > > >  # ifndef atomic_fetch_dec
> > > > >  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
> > > > >  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> > > > >  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> > > > >  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> > > > >  # else
> > > > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > > > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > > > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > > > >  # endif
> > > > >  #else
> > > > >  # ifndef atomic_fetch_dec
> > > > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > > > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > > > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > > > >  # endif
> > > > >  #endif
> > > > 
> > > > This would disallow an architecture to override just fetch_dec_release for
> > > > instance.
> > > 
> > > Couldn't such a crazy arch just define _all_ the 3 APIs in this group?
> > > That's really a small price, and it makes the place that does the
> > > weirdness pay the complexity price...
> > > 
> > > > I don't think there currently is any architecture that does that, but the
> > > > intent was to allow it to override anything and only provide defaults where it
> > > > does not.
> > > 
> > > I'd argue that if a new arch only defines one of these APIs that's probably a bug. 
> > > If they absolutely want to do it, they still can - by defining all 3 APIs.
> > > 
> > > So there's no loss in arch flexibility.
> > 
> > BTW., PowerPC for example is already in such a situation, it does not define 
> > atomic_cmpxchg_release(), only the other APIs:
> > 
> > #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
> > #define atomic_cmpxchg_relaxed(v, o, n) \
> > 	cmpxchg_relaxed(&((v)->counter), (o), (n))
> > #define atomic_cmpxchg_acquire(v, o, n) \
> > 	cmpxchg_acquire(&((v)->counter), (o), (n))
> > 
> > Was it really the intention on the PowerPC side that the generic code falls back 
> > to cmpxchg(), i.e.:
> > 
> > #  define atomic_cmpxchg_release(...)           __atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> > 
> 
> So ppc has its own definition of __atomic_op_release() in
> arch/powerpc/include/asm/atomic.h:
> 
> 	#define __atomic_op_release(op, args...)				\
> 	({									\
> 		__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
> 		op##_relaxed(args);						\
> 	})
> 
> , and PPC_RELEASE_BARRIER is lwsync, so we map to
> 
> 	lwsync();
> 	atomic_cmpxchg_relaxed(v, o, n);
> 
> And the reason why we define atomic_cmpxchg_acquire() but not
> atomic_cmpxchg_release() is that atomic_cmpxchg_*() provides no ordering
> guarantee if the cmp fails. We exploit that for atomic_cmpxchg_acquire()
> (the barrier is skipped on failure), but not for atomic_cmpxchg_release(),
> because doing the same there would put a memory barrier inside the ll/sc
> critical section; please see the comment before __cmpxchg_u32_acquire() in
> arch/powerpc/include/asm/cmpxchg.h:
> 
> 	/*
> 	 * cmpxchg family don't have order guarantee if cmp part fails, therefore we
> 	 * can avoid superfluous barriers if we use assembly code to implement
> 	 * cmpxchg() and cmpxchg_acquire(), however we don't do the similar for
> 	 * cmpxchg_release() because that will result in putting a barrier in the
> 	 * middle of a ll/sc loop, which is probably a bad idea. For example, this
> 	 * might cause the conditional store more likely to fail.
> 	 */

Makes sense - thanks a lot for the explanation; I missed that comment in the
middle of the assembly functions!

So the patch I sent is buggy - please disregard it.

May I suggest the patch below? No change in functionality, but it documents the 
lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() 
version. (Which the generic code does now in a rather roundabout way.)

Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect 
right now either, but should anyone add an optimized _release() variant in the 
future, with this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() 
will pick that up automatically.
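
As a tiny stand-alone sketch of that indirection (toy names only, none of
these are kernel APIs - the point is just that the wrapper is written
against the _release name, so a later, better _release definition is picked
up without touching the wrapper):

#include <stdio.h>

struct toy_atomic { int counter; };

static int toy_cmpxchg_full(int *p, int o, int n)
{
	int prev = *p;

	printf("sync; cmpxchg loop; sync\n");	/* models the fully ordered op */
	if (prev == o)
		*p = n;
	return prev;
}

/* today: the _release name is only an alias for the fully ordered op ... */
#ifndef toy_cmpxchg_release
# define toy_cmpxchg_release(p, o, n)	toy_cmpxchg_full(p, o, n)
#endif

/* ... and the atomic wrapper uses the _release name, so a future optimized
 * toy_cmpxchg_release() gets picked up automatically. */
#define toy_atomic_cmpxchg_release(v, o, n) \
	toy_cmpxchg_release(&(v)->counter, (o), (n))

int main(void)
{
	struct toy_atomic a = { .counter = 1 };

	toy_atomic_cmpxchg_release(&a, 1, 2);
	printf("counter=%d\n", a.counter);
	return 0;
}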

Would this be acceptable?

Thanks,

	Ingo

---
 arch/powerpc/include/asm/atomic.h  |  4 ++++
 arch/powerpc/include/asm/cmpxchg.h | 13 +++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..f7a6f29acb12 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -213,6 +213,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
@@ -519,6 +521,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic64_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..1f1d35062f3a 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -512,6 +512,13 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+/*
+ * cmpxchg_release() falls back to a full cmpxchg(),
+ * see the comments at __cmpxchg_u32_acquire():
+ */
+#define cmpxchg_release cmpxchg
+
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -538,5 +545,11 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
 #endif
 
+/*
+ * cmpxchg64_release() falls back to a full cmpxchg(),
+ * see the comments at __cmpxchg_u32_acquire():
+ */
+#define cmpxchg64_release cmpxchg64
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_CMPXCHG_H_ */

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Shorten the __atomic_op() defines to __op()
  2018-05-05  9:29               ` Peter Zijlstra
@ 2018-05-05 10:48                 ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05 10:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Sat, May 05, 2018 at 11:09:03AM +0200, Ingo Molnar wrote:
> > > >  # ifndef atomic_fetch_dec_acquire
> > > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > > >  # endif
> > > >  # ifndef atomic_fetch_dec_release
> > > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > > >  # endif
> > > >  # ifndef atomic_fetch_dec
> > > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > > >  # endif
> > > >  #endif
> > > > 
> > > > The new variant is readable at a glance, and the hierarchy of defines is very 
> > > > obvious as well.
> > > 
> > > It wraps and looks hideous in my normal setup. And I do detest that indent
> > > after # thing.
> > 
> > You should use wider terminals if you take a look at such code - there are already 
> > numerous areas of the kernel that are not readable on 80x25 terminals.
> > 
> > _Please_ try the following experiment, for me:
> > 
> > Enter the 21st century temporarily and widen two of your terminals from 80 cols to 
> > 100 cols - it's only ~20% wider.
> 
> Doesn't work that way. The only way I get more columns is if I shrink my
> font further. I work with tiles per monitor (left/right obv.) and use
> two columns per editor. This gets me a total of 4 columns.
> 
> On my desktop that is slightly over 100 characters per column, on my
> laptop that is slightly below 100 -- mostly because I'm pixel limited on
> fontsize on that thing (FullHD sucks).
> 
> If it wraps it wraps.

Out of the 707 lines in atomic.h only 25 are wider than 100 chars - and the max 
length is 104 chars.

If that's still too long then there are a few more things we could do - for 
example the attached patch renames a (very minor) misnomer to a shorter name and 
thus shortens the longest lines; the column histogram now looks like this:

    79 4
    80 7
    81 3
    82 9
    84 4
    85 2
    86 3
    87 1
    88 4
    89 13
    90 7
    91 20
    92 18
    93 12
    94 11
    96 5

I.e. the longest line is down to 96 columns, and 99% of the file is 94 cols or 
shorter.
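
A histogram like the one above can be produced with a small C filter along
these lines (a rough sketch, not the tool actually used: it counts raw
characters per line and does not expand tabs into columns):

/* usage (hypothetical binary name): ./col-histogram < include/linux/atomic.h */
#include <stdio.h>
#include <string.h>

int main(void)
{
	static unsigned int hist[4096];
	char line[4096];
	unsigned int len, max = 0;

	/* count how many input lines there are of each length */
	while (fgets(line, sizeof(line), stdin)) {
		len = strlen(line);
		if (len && line[len - 1] == '\n')
			len--;				/* drop the newline */
		if (len >= sizeof(hist) / sizeof(hist[0]))
			len = sizeof(hist) / sizeof(hist[0]) - 1;
		hist[len]++;
		if (len > max)
			max = len;
	}

	for (len = 0; len <= max; len++)
		if (hist[len])
			printf("%6u %u\n", len, hist[len]);

	return 0;
}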

Is this still too long?

Thanks,

	Ingo

============================>
From: Ingo Molnar <mingo@kernel.org>
Date: Sat, 5 May 2018 12:41:57 +0200
Subject: [PATCH] locking/atomics: Shorten the __atomic_op() defines to __op()

The __atomic prefix is somewhat of a misnomer, because not all
APIs we use with these macros have an atomic_ prefix.

This also reduces the length of the longest lines in the header,
making them more readable on PeterZ's terminals.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aryabinin@virtuozzo.com
Cc: boqun.feng@gmail.com
Cc: catalin.marinas@arm.com
Cc: dvyukov@google.com
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 204 +++++++++++++++++++++++++------------------------
 1 file changed, 103 insertions(+), 101 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 1176cf7c6f03..f32ff6d9e4d2 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -37,33 +37,35 @@
  * variant is already fully ordered, no additional barriers are needed.
  *
  * Besides, if an arch has a special barrier for acquire/release, it could
- * implement its own __atomic_op_* and use the same framework for building
+ * implement its own __op_* and use the same framework for building
  * variants
  *
- * If an architecture overrides __atomic_op_acquire() it will probably want
+ * If an architecture overrides __op_acquire() it will probably want
  * to define smp_mb__after_spinlock().
  */
-#ifndef __atomic_op_acquire
-#define __atomic_op_acquire(op, args...)				\
+#ifndef __op_acquire
+#define __op_acquire(op, args...)					\
 ({									\
 	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+									\
 	smp_mb__after_atomic();						\
 	__ret;								\
 })
 #endif
 
-#ifndef __atomic_op_release
-#define __atomic_op_release(op, args...)				\
+#ifndef __op_release
+#define __op_release(op, args...)					\
 ({									\
 	smp_mb__before_atomic();					\
 	op##_relaxed(args);						\
 })
 #endif
 
-#ifndef __atomic_op_fence
-#define __atomic_op_fence(op, args...)					\
+#ifndef __op_fence
+#define __op_fence(op, args...)						\
 ({									\
 	typeof(op##_relaxed(args)) __ret;				\
+									\
 	smp_mb__before_atomic();					\
 	__ret = op##_relaxed(args);					\
 	smp_mb__after_atomic();						\
@@ -77,9 +79,9 @@
 # define atomic_add_return_release		atomic_add_return
 #else
 # ifndef atomic_add_return
-#  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
-#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
-#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return(...)		__op_fence(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_acquire(...)	__op_acquire(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_release(...)	__op_release(atomic_add_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -89,9 +91,9 @@
 # define atomic_inc_return_release		atomic_inc_return
 #else
 # ifndef atomic_inc_return
-#  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
-#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
-#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return(...)		__op_fence(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_acquire(...)	__op_acquire(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_release(...)	__op_release(atomic_inc_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -101,9 +103,9 @@
 # define atomic_sub_return_release		atomic_sub_return
 #else
 # ifndef atomic_sub_return
-#  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
-#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
-#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return(...)		__op_fence(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_acquire(...)	__op_acquire(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_release(...)	__op_release(atomic_sub_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -113,9 +115,9 @@
 # define atomic_dec_return_release		atomic_dec_return
 #else
 # ifndef atomic_dec_return
-#  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
-#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
-#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return(...)		__op_fence(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_acquire(...)	__op_acquire(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_release(...)	__op_release(atomic_dec_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -125,9 +127,9 @@
 # define atomic_fetch_add_release		atomic_fetch_add
 #else
 # ifndef atomic_fetch_add
-#  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
-#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
-#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add(...)			__op_fence(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_acquire(...)		__op_acquire(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_release(...)		__op_release(atomic_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
@@ -144,9 +146,9 @@
 # endif
 #else
 # ifndef atomic_fetch_inc
-#  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
-#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
-#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc(...)			__op_fence(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_acquire(...)		__op_acquire(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_release(...)		__op_release(atomic_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
@@ -156,9 +158,9 @@
 # define atomic_fetch_sub_release		atomic_fetch_sub
 #else
 # ifndef atomic_fetch_sub
-#  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
-#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
-#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub(...)			__op_fence(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_acquire(...)		__op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_release(...)		__op_release(atomic_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
@@ -175,9 +177,9 @@
 # endif
 #else
 # ifndef atomic_fetch_dec
-#  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
-#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
-#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec(...)			__op_fence(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_acquire(...)		__op_acquire(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_release(...)		__op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
@@ -187,9 +189,9 @@
 # define atomic_fetch_or_release		atomic_fetch_or
 #else
 # ifndef atomic_fetch_or
-#  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
-#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
-#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or(...)			__op_fence(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_acquire(...)		__op_acquire(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_release(...)		__op_release(atomic_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
@@ -199,9 +201,9 @@
 # define atomic_fetch_and_release		atomic_fetch_and
 #else
 # ifndef atomic_fetch_and
-#  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
-#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
-#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and(...)			__op_fence(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_acquire(...)		__op_acquire(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_release(...)		__op_release(atomic_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
@@ -211,9 +213,9 @@
 # define atomic_fetch_xor_release		atomic_fetch_xor
 #else
 # ifndef atomic_fetch_xor
-#  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
-#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
-#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor(...)			__op_fence(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_acquire(...)		__op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_release(...)		__op_release(atomic_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
@@ -223,9 +225,9 @@
 #define atomic_xchg_release			atomic_xchg
 #else
 # ifndef atomic_xchg
-#  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
-#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
-#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg(...)			__op_fence(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_acquire(...)		__op_acquire(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_release(...)		__op_release(atomic_xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -235,9 +237,9 @@
 # define atomic_cmpxchg_release			atomic_cmpxchg
 #else
 # ifndef atomic_cmpxchg
-#  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
-#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
-#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg(...)			__op_fence(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_acquire(...)		__op_acquire(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_release(...)		__op_release(atomic_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -267,9 +269,9 @@
 # define cmpxchg_release			cmpxchg
 #else
 # ifndef cmpxchg
-#  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
-#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
-#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
+#  define cmpxchg(...)				__op_fence(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_acquire(...)			__op_acquire(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_release(...)			__op_release(cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -279,9 +281,9 @@
 # define cmpxchg64_release			cmpxchg64
 #else
 # ifndef cmpxchg64
-#  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
-#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
-#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64(...)			__op_fence(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_acquire(...)		__op_acquire(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_release(...)		__op_release(cmpxchg64, __VA_ARGS__)
 # endif
 #endif
 
@@ -291,9 +293,9 @@
 # define xchg_release				xchg
 #else
 # ifndef xchg
-#  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
-#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
-#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
+#  define xchg(...)				__op_fence(xchg, __VA_ARGS__)
+#  define xchg_acquire(...)			__op_acquire(xchg, __VA_ARGS__)
+#  define xchg_release(...)			__op_release(xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -330,9 +332,9 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
 # define atomic_fetch_andnot_release		atomic_fetch_andnot
 #else
 # ifndef atomic_fetch_andnot
-#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot(...)		__op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_acquire(...)	__op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_release(...)	__op_release(atomic_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 
@@ -472,9 +474,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_add_return_release		atomic64_add_return
 #else
 # ifndef atomic64_add_return
-#  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
-#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
-#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return(...)		__op_fence(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_acquire(...)	__op_acquire(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_release(...)	__op_release(atomic64_add_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -484,9 +486,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_inc_return_release		atomic64_inc_return
 #else
 # ifndef atomic64_inc_return
-#  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
-#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return(...)		__op_fence(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_acquire(...)	__op_acquire(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_release(...)	__op_release(atomic64_inc_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -496,9 +498,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_sub_return_release		atomic64_sub_return
 #else
 # ifndef atomic64_sub_return
-#  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
-#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return(...)		__op_fence(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_acquire(...)	__op_acquire(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_release(...)	__op_release(atomic64_sub_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -508,9 +510,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_dec_return_release		atomic64_dec_return
 #else
 # ifndef atomic64_dec_return
-#  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
-#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return(...)		__op_fence(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_acquire(...)	__op_acquire(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_release(...)	__op_release(atomic64_dec_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -520,9 +522,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_add_release		atomic64_fetch_add
 #else
 # ifndef atomic64_fetch_add
-#  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
-#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
-#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add(...)		__op_fence(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_acquire(...)	__op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_release(...)	__op_release(atomic64_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
@@ -539,9 +541,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # endif
 #else
 # ifndef atomic64_fetch_inc
-#  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
-#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
-#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc(...)		__op_fence(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_acquire(...)	__op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_release(...)	__op_release(atomic64_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
@@ -551,9 +553,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_sub_release		atomic64_fetch_sub
 #else
 # ifndef atomic64_fetch_sub
-#  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
-#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
-#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub(...)		__op_fence(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_acquire(...)	__op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_release(...)	__op_release(atomic64_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
@@ -570,9 +572,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # endif
 #else
 # ifndef atomic64_fetch_dec
-#  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
-#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
-#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec(...)		__op_fence(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_acquire(...)	__op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_release(...)	__op_release(atomic64_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
@@ -582,9 +584,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_or_release		atomic64_fetch_or
 #else
 # ifndef atomic64_fetch_or
-#  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
-#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
-#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or(...)		__op_fence(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_acquire(...)	__op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_release(...)	__op_release(atomic64_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
@@ -594,9 +596,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_and_release		atomic64_fetch_and
 #else
 # ifndef atomic64_fetch_and
-#  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
-#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
-#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and(...)		__op_fence(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_acquire(...)	__op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_release(...)	__op_release(atomic64_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
@@ -606,9 +608,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_xor_release		atomic64_fetch_xor
 #else
 # ifndef atomic64_fetch_xor
-#  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
-#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
-#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor(...)		__op_fence(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_acquire(...)	__op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_release(...)	__op_release(atomic64_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
@@ -618,9 +620,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_xchg_release			atomic64_xchg
 #else
 # ifndef atomic64_xchg
-#  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
-#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg(...)			__op_fence(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_acquire(...)		__op_acquire(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_release(...)		__op_release(atomic64_xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -630,9 +632,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_cmpxchg_release		atomic64_cmpxchg
 #else
 # ifndef atomic64_cmpxchg
-#  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
-#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg(...)			__op_fence(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_acquire(...)		__op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_release(...)		__op_release(atomic64_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -664,9 +666,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_andnot_release		atomic64_fetch_andnot
 #else
 # ifndef atomic64_fetch_andnot
-#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot(...)		__op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_acquire(...)	__op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_release(...)	__op_release(atomic64_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Shorten the __atomic_op() defines to __op()
@ 2018-05-05 10:48                 ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05 10:48 UTC (permalink / raw)
  To: linux-arm-kernel


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Sat, May 05, 2018 at 11:09:03AM +0200, Ingo Molnar wrote:
> > > >  # ifndef atomic_fetch_dec_acquire
> > > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > > >  # endif
> > > >  # ifndef atomic_fetch_dec_release
> > > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > > >  # endif
> > > >  # ifndef atomic_fetch_dec
> > > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > > >  # endif
> > > >  #endif
> > > > 
> > > > The new variant is readable at a glance, and the hierarchy of defines is very 
> > > > obvious as well.
> > > 
> > > It wraps and looks hideous in my normal setup. And I do detest that indent
> > > after # thing.
> > 
> > You should use wider terminals if you take a look at such code - there's already 
> > numerous areas of the kernel that are not readable on 80x25 terminals.
> > 
> > _Please_ try the following experiment, for me:
> > 
> > Enter the 21st century temporarily and widen two of your terminals from 80 cols to 
> > 100 cols - it's only ~20% wider.
> 
> Doesn't work that way. The only way I get more columns is if I shrink my
> font further. I work with tiles per monitor (left/right obv.) and use
> two columns per editor. This gets me a total of 4 columns.
> 
> On my desktop that is slightly over 100 characters per column, on my
> laptop that is slightly below 100 -- mostly because I'm pixel limited on
> fontsize on that thing (FullHD sucks).
> 
> If it wraps it wraps.

Out of the 707 lines in atomic.h only 25 are wider than 100 chars - and the max 
length is 104 chars.

If that's too then there's a few more things we could do - for example the 
attached patch renames a (very minor) misnomer to a shorter name and thus saves on 
the longest lines, the column histogram now looks like this:

    79 4
    80 7
    81 3
    82 9
    84 4
    85 2
    86 3
    87 1
    88 4
    89 13
    90 7
    91 20
    92 18
    93 12
    94 11
    96 5

I.e. the longest line is down to 96 columns, and 99% of the file is 94 cols or 
shorter.

Is this still too long?

Thanks,

	Ingo

============================>
From: Ingo Molnar <mingo@kernel.org>
Date: Sat, 5 May 2018 12:41:57 +0200
Subject: [PATCH] locking/atomics: Shorten the __atomic_op() defines to __op()

The __atomic prefix is somewhat of a misnomer, because not all
APIs we use with these macros have an atomic_ prefix.

This also reduces the length of the longest lines in the header,
making them more readable on PeterZ's terminals.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aryabinin at virtuozzo.com
Cc: boqun.feng at gmail.com
Cc: catalin.marinas at arm.com
Cc: dvyukov at google.com
Cc: linux-arm-kernel at lists.infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 204 +++++++++++++++++++++++++------------------------
 1 file changed, 103 insertions(+), 101 deletions(-)
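
[ Not part of the patch: a minimal illustrative sketch of how these
  helpers compose, assuming a hypothetical architecture that provides
  only atomic_add_return_relaxed(); the expansion follows the
  __op_acquire() definition in the header below, with the typeof()
  result written as plain 'int' for brevity: ]

	atomic_add_return_acquire(i, v)
	  => __op_acquire(atomic_add_return, i, v)
	  => ({
		int __ret = atomic_add_return_relaxed(i, v);	/* relaxed op */
		smp_mb__after_atomic();				/* acquire    */
		__ret;
	     })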

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 1176cf7c6f03..f32ff6d9e4d2 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -37,33 +37,35 @@
  * variant is already fully ordered, no additional barriers are needed.
  *
  * Besides, if an arch has a special barrier for acquire/release, it could
- * implement its own __atomic_op_* and use the same framework for building
+ * implement its own __op_* and use the same framework for building
  * variants
  *
- * If an architecture overrides __atomic_op_acquire() it will probably want
+ * If an architecture overrides __op_acquire() it will probably want
  * to define smp_mb__after_spinlock().
  */
-#ifndef __atomic_op_acquire
-#define __atomic_op_acquire(op, args...)				\
+#ifndef __op_acquire
+#define __op_acquire(op, args...)					\
 ({									\
 	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+									\
 	smp_mb__after_atomic();						\
 	__ret;								\
 })
 #endif
 
-#ifndef __atomic_op_release
-#define __atomic_op_release(op, args...)				\
+#ifndef __op_release
+#define __op_release(op, args...)					\
 ({									\
 	smp_mb__before_atomic();					\
 	op##_relaxed(args);						\
 })
 #endif
 
-#ifndef __atomic_op_fence
-#define __atomic_op_fence(op, args...)					\
+#ifndef __op_fence
+#define __op_fence(op, args...)						\
 ({									\
 	typeof(op##_relaxed(args)) __ret;				\
+									\
 	smp_mb__before_atomic();					\
 	__ret = op##_relaxed(args);					\
 	smp_mb__after_atomic();						\
@@ -77,9 +79,9 @@
 # define atomic_add_return_release		atomic_add_return
 #else
 # ifndef atomic_add_return
-#  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
-#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
-#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return(...)		__op_fence(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_acquire(...)	__op_acquire(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_release(...)	__op_release(atomic_add_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -89,9 +91,9 @@
 # define atomic_inc_return_release		atomic_inc_return
 #else
 # ifndef atomic_inc_return
-#  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
-#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
-#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return(...)		__op_fence(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_acquire(...)	__op_acquire(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_release(...)	__op_release(atomic_inc_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -101,9 +103,9 @@
 # define atomic_sub_return_release		atomic_sub_return
 #else
 # ifndef atomic_sub_return
-#  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
-#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
-#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return(...)		__op_fence(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_acquire(...)	__op_acquire(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_release(...)	__op_release(atomic_sub_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -113,9 +115,9 @@
 # define atomic_dec_return_release		atomic_dec_return
 #else
 # ifndef atomic_dec_return
-#  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
-#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
-#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return(...)		__op_fence(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_acquire(...)	__op_acquire(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_release(...)	__op_release(atomic_dec_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -125,9 +127,9 @@
 # define atomic_fetch_add_release		atomic_fetch_add
 #else
 # ifndef atomic_fetch_add
-#  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
-#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
-#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add(...)			__op_fence(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_acquire(...)		__op_acquire(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_release(...)		__op_release(atomic_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
@@ -144,9 +146,9 @@
 # endif
 #else
 # ifndef atomic_fetch_inc
-#  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
-#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
-#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc(...)			__op_fence(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_acquire(...)		__op_acquire(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_release(...)		__op_release(atomic_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
@@ -156,9 +158,9 @@
 # define atomic_fetch_sub_release		atomic_fetch_sub
 #else
 # ifndef atomic_fetch_sub
-#  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
-#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
-#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub(...)			__op_fence(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_acquire(...)		__op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_release(...)		__op_release(atomic_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
@@ -175,9 +177,9 @@
 # endif
 #else
 # ifndef atomic_fetch_dec
-#  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
-#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
-#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec(...)			__op_fence(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_acquire(...)		__op_acquire(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_release(...)		__op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
@@ -187,9 +189,9 @@
 # define atomic_fetch_or_release		atomic_fetch_or
 #else
 # ifndef atomic_fetch_or
-#  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
-#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
-#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or(...)			__op_fence(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_acquire(...)		__op_acquire(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_release(...)		__op_release(atomic_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
@@ -199,9 +201,9 @@
 # define atomic_fetch_and_release		atomic_fetch_and
 #else
 # ifndef atomic_fetch_and
-#  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
-#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
-#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and(...)			__op_fence(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_acquire(...)		__op_acquire(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_release(...)		__op_release(atomic_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
@@ -211,9 +213,9 @@
 # define atomic_fetch_xor_release		atomic_fetch_xor
 #else
 # ifndef atomic_fetch_xor
-#  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
-#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
-#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor(...)			__op_fence(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_acquire(...)		__op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_release(...)		__op_release(atomic_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
@@ -223,9 +225,9 @@
 #define atomic_xchg_release			atomic_xchg
 #else
 # ifndef atomic_xchg
-#  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
-#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
-#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg(...)			__op_fence(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_acquire(...)		__op_acquire(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_release(...)		__op_release(atomic_xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -235,9 +237,9 @@
 # define atomic_cmpxchg_release			atomic_cmpxchg
 #else
 # ifndef atomic_cmpxchg
-#  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
-#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
-#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg(...)			__op_fence(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_acquire(...)		__op_acquire(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_release(...)		__op_release(atomic_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -267,9 +269,9 @@
 # define cmpxchg_release			cmpxchg
 #else
 # ifndef cmpxchg
-#  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
-#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
-#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
+#  define cmpxchg(...)				__op_fence(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_acquire(...)			__op_acquire(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_release(...)			__op_release(cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -279,9 +281,9 @@
 # define cmpxchg64_release			cmpxchg64
 #else
 # ifndef cmpxchg64
-#  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
-#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
-#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64(...)			__op_fence(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_acquire(...)		__op_acquire(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_release(...)		__op_release(cmpxchg64, __VA_ARGS__)
 # endif
 #endif
 
@@ -291,9 +293,9 @@
 # define xchg_release				xchg
 #else
 # ifndef xchg
-#  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
-#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
-#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
+#  define xchg(...)				__op_fence(xchg, __VA_ARGS__)
+#  define xchg_acquire(...)			__op_acquire(xchg, __VA_ARGS__)
+#  define xchg_release(...)			__op_release(xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -330,9 +332,9 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
 # define atomic_fetch_andnot_release		atomic_fetch_andnot
 #else
 # ifndef atomic_fetch_andnot
-#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot(...)		__op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_acquire(...)	__op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_release(...)	__op_release(atomic_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 
@@ -472,9 +474,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_add_return_release		atomic64_add_return
 #else
 # ifndef atomic64_add_return
-#  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
-#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
-#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return(...)		__op_fence(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_acquire(...)	__op_acquire(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_release(...)	__op_release(atomic64_add_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -484,9 +486,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_inc_return_release		atomic64_inc_return
 #else
 # ifndef atomic64_inc_return
-#  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
-#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return(...)		__op_fence(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_acquire(...)	__op_acquire(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_release(...)	__op_release(atomic64_inc_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -496,9 +498,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_sub_return_release		atomic64_sub_return
 #else
 # ifndef atomic64_sub_return
-#  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
-#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return(...)		__op_fence(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_acquire(...)	__op_acquire(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_release(...)	__op_release(atomic64_sub_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -508,9 +510,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_dec_return_release		atomic64_dec_return
 #else
 # ifndef atomic64_dec_return
-#  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
-#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return(...)		__op_fence(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_acquire(...)	__op_acquire(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_release(...)	__op_release(atomic64_dec_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -520,9 +522,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_add_release		atomic64_fetch_add
 #else
 # ifndef atomic64_fetch_add
-#  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
-#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
-#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add(...)		__op_fence(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_acquire(...)	__op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_release(...)	__op_release(atomic64_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
@@ -539,9 +541,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # endif
 #else
 # ifndef atomic64_fetch_inc
-#  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
-#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
-#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc(...)		__op_fence(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_acquire(...)	__op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_release(...)	__op_release(atomic64_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
@@ -551,9 +553,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_sub_release		atomic64_fetch_sub
 #else
 # ifndef atomic64_fetch_sub
-#  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
-#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
-#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub(...)		__op_fence(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_acquire(...)	__op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_release(...)	__op_release(atomic64_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
@@ -570,9 +572,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # endif
 #else
 # ifndef atomic64_fetch_dec
-#  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
-#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
-#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec(...)		__op_fence(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_acquire(...)	__op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_release(...)	__op_release(atomic64_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
@@ -582,9 +584,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_or_release		atomic64_fetch_or
 #else
 # ifndef atomic64_fetch_or
-#  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
-#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
-#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or(...)		__op_fence(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_acquire(...)	__op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_release(...)	__op_release(atomic64_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
@@ -594,9 +596,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_and_release		atomic64_fetch_and
 #else
 # ifndef atomic64_fetch_and
-#  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
-#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
-#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and(...)		__op_fence(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_acquire(...)	__op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_release(...)	__op_release(atomic64_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
@@ -606,9 +608,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_xor_release		atomic64_fetch_xor
 #else
 # ifndef atomic64_fetch_xor
-#  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
-#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
-#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor(...)		__op_fence(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_acquire(...)	__op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_release(...)	__op_release(atomic64_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
@@ -618,9 +620,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_xchg_release			atomic64_xchg
 #else
 # ifndef atomic64_xchg
-#  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
-#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg(...)			__op_fence(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_acquire(...)		__op_acquire(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_release(...)		__op_release(atomic64_xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -630,9 +632,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_cmpxchg_release		atomic64_cmpxchg
 #else
 # ifndef atomic64_cmpxchg
-#  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
-#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg(...)			__op_fence(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_acquire(...)		__op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_release(...)		__op_release(atomic64_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -664,9 +666,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_andnot_release		atomic64_fetch_andnot
 #else
 # ifndef atomic64_fetch_andnot
-#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot(...)		__op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_acquire(...)	__op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_release(...)	__op_release(atomic64_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Shorten the __atomic_op() defines to __op()
  2018-05-05 10:48                 ` Ingo Molnar
@ 2018-05-05 10:59                   ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05 10:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon


* Ingo Molnar <mingo@kernel.org> wrote:

> If that's too then there's a few more things we could do - for example the 
              ^--too much
> attached patch renames a (very minor) misnomer to a shorter name and thus saves on 
> the longest lines, the column histogram now looks like this:

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Shorten the __atomic_op() defines to __op()
@ 2018-05-05 10:59                   ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05 10:59 UTC (permalink / raw)
  To: linux-arm-kernel


* Ingo Molnar <mingo@kernel.org> wrote:

> If that's too then there's a few more things we could do - for example the 
              ^--too much
> attached patch renames a (very minor) misnomer to a shorter name and thus saves on 
> the longest lines, the column histogram now looks like this:

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [RFC PATCH] locking/atomics/powerpc: Clarify why the cmpxchg_relaxed() family of APIs falls back to full cmpxchg()
  2018-05-05 10:35                   ` Ingo Molnar
@ 2018-05-05 11:28                     ` Boqun Feng
  -1 siblings, 0 replies; 103+ messages in thread
From: Boqun Feng @ 2018-05-05 11:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Mark Rutland, linux-arm-kernel, linux-kernel,
	aryabinin, catalin.marinas, dvyukov, will.deacon

[-- Attachment #1: Type: text/plain, Size: 8998 bytes --]

On Sat, May 05, 2018 at 12:35:50PM +0200, Ingo Molnar wrote:
> 
> * Boqun Feng <boqun.feng@gmail.com> wrote:
> 
> > On Sat, May 05, 2018 at 11:38:29AM +0200, Ingo Molnar wrote:
> > > 
> > > * Ingo Molnar <mingo@kernel.org> wrote:
> > > 
> > > > * Peter Zijlstra <peterz@infradead.org> wrote:
> > > > 
> > > > > > So we could do the following simplification on top of that:
> > > > > > 
> > > > > >  #ifndef atomic_fetch_dec_relaxed
> > > > > >  # ifndef atomic_fetch_dec
> > > > > >  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
> > > > > >  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> > > > > >  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> > > > > >  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> > > > > >  # else
> > > > > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > > > > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > > > > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > > > > >  # endif
> > > > > >  #else
> > > > > >  # ifndef atomic_fetch_dec
> > > > > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > > > > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > > > > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > > > > >  # endif
> > > > > >  #endif
> > > > > 
> > > > > This would disallow an architecture to override just fetch_dec_release for
> > > > > instance.
> > > > 
> > > > Couldn't such a crazy arch just define _all_ the 3 APIs in this group?
> > > > That's really a small price and makes the place pay the complexity
> > > > price that does the weirdness...
> > > > 
> > > > > I don't think there currently is any architecture that does that, but the
> > > > > intent was to allow it to override anything and only provide defaults where it
> > > > > does not.
> > > > 
> > > > I'd argue that if a new arch only defines one of these APIs that's probably a bug. 
> > > > If they absolutely want to do it, they still can - by defining all 3 APIs.
> > > > 
> > > > So there's no loss in arch flexibility.
> > > 
> > > BTW., PowerPC for example is already in such a situation, it does not define 
> > > atomic_cmpxchg_release(), only the other APIs:
> > > 
> > > #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
> > > #define atomic_cmpxchg_relaxed(v, o, n) \
> > > 	cmpxchg_relaxed(&((v)->counter), (o), (n))
> > > #define atomic_cmpxchg_acquire(v, o, n) \
> > > 	cmpxchg_acquire(&((v)->counter), (o), (n))
> > > 
> > > Was it really the intention on the PowerPC side that the generic code falls back 
> > > to cmpxchg(), i.e.:
> > > 
> > > #  define atomic_cmpxchg_release(...)           __atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> > > 
> > 
> > So ppc has its own definition __atomic_op_release() in
> > arch/powerpc/include/asm/atomic.h:
> > 
> > 	#define __atomic_op_release(op, args...)				\
> > 	({									\
> > 		__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
> > 		op##_relaxed(args);						\
> > 	})
> > 
> > , and PPC_RELEASE_BARRIER is lwsync, so we map to
> > 
> > 	lwsync();
> > 	atomic_cmpxchg_relaxed(v, o, n);
> > 
> > And the reason, why we don't define atomic_cmpxchg_release() but define
> > atomic_cmpxchg_acquire() is that, atomic_cmpxchg_*() could provide no
> > ordering guarantee if the cmp fails, we did this for
> > atomic_cmpxchg_acquire() but not for atomic_cmpxchg_release(), because
> > doing so may introduce a memory barrier inside a ll/sc critical section,
> > please see the comment before __cmpxchg_u32_acquire() in
> > arch/powerpc/include/asm/cmpxchg.h:
> > 
> > 	/*
> > 	 * cmpxchg family don't have order guarantee if cmp part fails, therefore we
> > 	 * can avoid superfluous barriers if we use assembly code to implement
> > 	 * cmpxchg() and cmpxchg_acquire(), however we don't do the similar for
> > 	 * cmpxchg_release() because that will result in putting a barrier in the
> > 	 * middle of a ll/sc loop, which is probably a bad idea. For example, this
> > 	 * might cause the conditional store more likely to fail.
> > 	 */
> 
> Makes sense, thanks a lot for the explanation, missed that comment in the middle 
> of the assembly functions!
> 

;-) I could move it somewhere else in the future.

> So the patch I sent is buggy, please disregard it.
> 
> May I suggest the patch below? No change in functionality, but it documents the 
> lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() 
> version. (Which the generic code does now in a rather roundabout way.)
> 

Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but
you just make it sync() + cmpxchg_relaxed() + sync() with the fallback,
and sync() is much heavier, so I don't think the fallback is correct.
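
For illustration, the two expansions side by side (a pseudo-C sketch;
lwsync()/sync() here just stand for the barrier instructions named
above, they are not real C functions):

	/* ppc's intended cmpxchg_release(): release barrier + relaxed op */
	lwsync();
	cmpxchg_relaxed(ptr, old, new);

	/* the proposed fallback (full cmpxchg()): full barriers on both sides */
	sync();
	cmpxchg_relaxed(ptr, old, new);
	sync();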

I think maybe you can move powerpc's __atomic_op_{acquire,release}()
from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and

	#define cmpxchg_release(...)	__atomic_op_release(cmpxchg, __VA_ARGS__)
	#define cmpxchg64_release(...)	__atomic_op_release(cmpxchg64, __VA_ARGS__)

I put a diff below to say what I mean (untested).

> Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect 
> right now either, but should anyone add a _relaxed() variant in the future, with 
> this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that 
> up automatically.
> 

You mean with your other modification in include/linux/atomic.h, right?
Because with the unmodified include/linux/atomic.h, we already pick that
automatically. If so, I think that's fine.

Here is the diff for the modification for cmpxchg_release(); the idea
is that we generate these in asm/cmpxchg.h rather than in linux/atomic.h
for ppc, so we keep the new linux/atomic.h working. Because, if I
understand correctly, the next linux/atomic.h only accepts that

1)	architecture only defines fully ordered primitives

or

2)	architecture only defines _relaxed primitives

or

3)	architecture defines all four (fully, _relaxed, _acquire,
	_release) primitives

So powerpc needs to define all four primitives in its own
asm/cmpxchg.h.

Regards,
Boqun

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..0136be11c84f 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -13,24 +13,6 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
-/*
- * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
- * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
- * on the platform without lwsync.
- */
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
-	op##_relaxed(args);						\
-})
-
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..9e20a942aff9 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -8,6 +8,24 @@
 #include <asm/asm-compat.h>
 #include <linux/bug.h>
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
+ * on the platform without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
+	op##_relaxed(args);						\
+})
+
 #ifdef __BIG_ENDIAN
 #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
 #else
@@ -512,6 +530,8 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+#define cmpxchg_release(ptr, o, n) __atomic_op_release(cmpxchg, (ptr), (o), (n))
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +553,7 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+#define cmpxchg64_release(ptr, o, n) __atomic_op_release(cmpxchg64, (ptr), (o), (n))
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [RFC PATCH] locking/atomics/powerpc: Clarify why the cmpxchg_relaxed() family of APIs falls back to full cmpxchg()
@ 2018-05-05 11:28                     ` Boqun Feng
  0 siblings, 0 replies; 103+ messages in thread
From: Boqun Feng @ 2018-05-05 11:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, May 05, 2018 at 12:35:50PM +0200, Ingo Molnar wrote:
> 
> * Boqun Feng <boqun.feng@gmail.com> wrote:
> 
> > On Sat, May 05, 2018 at 11:38:29AM +0200, Ingo Molnar wrote:
> > > 
> > > * Ingo Molnar <mingo@kernel.org> wrote:
> > > 
> > > > * Peter Zijlstra <peterz@infradead.org> wrote:
> > > > 
> > > > > > So we could do the following simplification on top of that:
> > > > > > 
> > > > > >  #ifndef atomic_fetch_dec_relaxed
> > > > > >  # ifndef atomic_fetch_dec
> > > > > >  #  define atomic_fetch_dec(v)		atomic_fetch_sub(1, (v))
> > > > > >  #  define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
> > > > > >  #  define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
> > > > > >  #  define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
> > > > > >  # else
> > > > > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > > > > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > > > > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > > > > >  # endif
> > > > > >  #else
> > > > > >  # ifndef atomic_fetch_dec
> > > > > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > > > > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > > > > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > > > > >  # endif
> > > > > >  #endif
> > > > > 
> > > > > This would disallow an architecture to override just fetch_dec_release for
> > > > > instance.
> > > > 
> > > > Couldn't such a crazy arch just define _all_ the 3 APIs in this group?
> > > > That's really a small price and makes the place pay the complexity
> > > > price that does the weirdness...
> > > > 
> > > > > I don't think there currently is any architecture that does that, but the
> > > > > intent was to allow it to override anything and only provide defaults where it
> > > > > does not.
> > > > 
> > > > I'd argue that if a new arch only defines one of these APIs that's probably a bug. 
> > > > If they absolutely want to do it, they still can - by defining all 3 APIs.
> > > > 
> > > > So there's no loss in arch flexibility.
> > > 
> > > BTW., PowerPC for example is already in such a situation, it does not define 
> > > atomic_cmpxchg_release(), only the other APIs:
> > > 
> > > #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
> > > #define atomic_cmpxchg_relaxed(v, o, n) \
> > > 	cmpxchg_relaxed(&((v)->counter), (o), (n))
> > > #define atomic_cmpxchg_acquire(v, o, n) \
> > > 	cmpxchg_acquire(&((v)->counter), (o), (n))
> > > 
> > > Was it really the intention on the PowerPC side that the generic code falls back 
> > > to cmpxchg(), i.e.:
> > > 
> > > #  define atomic_cmpxchg_release(...)           __atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> > > 
> > 
> > So ppc has its own definition __atomic_op_release() in
> > arch/powerpc/include/asm/atomic.h:
> > 
> > 	#define __atomic_op_release(op, args...)				\
> > 	({									\
> > 		__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
> > 		op##_relaxed(args);						\
> > 	})
> > 
> > , and PPC_RELEASE_BARRIER is lwsync, so we map to
> > 
> > 	lwsync();
> > 	atomic_cmpxchg_relaxed(v, o, n);
> > 
> > And the reason, why we don't define atomic_cmpxchg_release() but define
> > atomic_cmpxchg_acquire() is that, atomic_cmpxchg_*() could provide no
> > ordering guarantee if the cmp fails, we did this for
> > atomic_cmpxchg_acquire() but not for atomic_cmpxchg_release(), because
> > doing so may introduce a memory barrier inside a ll/sc critical section,
> > please see the comment before __cmpxchg_u32_acquire() in
> > arch/powerpc/include/asm/cmpxchg.h:
> > 
> > 	/*
> > 	 * cmpxchg family don't have order guarantee if cmp part fails, therefore we
> > 	 * can avoid superfluous barriers if we use assembly code to implement
> > 	 * cmpxchg() and cmpxchg_acquire(), however we don't do the similar for
> > 	 * cmpxchg_release() because that will result in putting a barrier in the
> > 	 * middle of a ll/sc loop, which is probably a bad idea. For example, this
> > 	 * might cause the conditional store more likely to fail.
> > 	 */
> 
> Makes sense, thanks a lot for the explanation, missed that comment in the middle 
> of the assembly functions!
> 

;-) I could move it somewhere else in the future.

> So the patch I sent is buggy, please disregard it.
> 
> May I suggest the patch below? No change in functionality, but it documents the 
> lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() 
> version. (Which the generic code does now in a rather roundabout way.)
> 

Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but
you just make it sync() + cmpxchg_relaxed() + sync() with the fallback,
and sync() is much heavier, so I don't think the fallback is correct.

I think maybe you can move powerpc's __atomic_op_{acquire,release}()
from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and

	#define cmpxchg_release(...)	__atomic_op_release(cmpxchg, __VA_ARGS__)
	#define cmpxchg64_release(...)	__atomic_op_release(cmpxchg64, __VA_ARGS__)

I put a diff below to say what I mean (untested).

> Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect 
> right now either, but should anyone add a _relaxed() variant in the future, with 
> this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that 
> up automatically.
> 

You mean with your other modification in include/linux/atomic.h, right?
Because with the unmodified include/linux/atomic.h, we already pick that up
automatically. If so, I think that's fine.

Here is the diff for the cmpxchg_release() modification; the idea is that
we generate these in asm/cmpxchg.h rather than linux/atomic.h for ppc, so
we keep the new linux/atomic.h working. Because, if I understand
correctly, the next linux/atomic.h only accepts one of the following:

1)	architecture only defines fully ordered primitives

or

2)	architecture only defines _relaxed primitives

or

3)	architecture defines all four (fully, _relaxed, _acquire,
	_release) primitives

So powerpc needs to define all four primitives in its own
asm/cmpxchg.h.
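
Schematically, that generic-header logic looks something like this per
operation (a simplified sketch modelled on the existing linux/atomic.h
fallbacks, with cmpxchg as the example; not the exact new code):

	#ifndef cmpxchg_relaxed
	/* case 1: only the fully ordered op is provided */
	# define cmpxchg_relaxed		cmpxchg
	# define cmpxchg_acquire		cmpxchg
	# define cmpxchg_release		cmpxchg
	#else
	/* case 2/3: _relaxed is provided; build the rest unless overridden */
	# ifndef cmpxchg_acquire
	#  define cmpxchg_acquire(...)	__atomic_op_acquire(cmpxchg, __VA_ARGS__)
	# endif
	# ifndef cmpxchg_release
	#  define cmpxchg_release(...)	__atomic_op_release(cmpxchg, __VA_ARGS__)
	# endif
	# ifndef cmpxchg
	#  define cmpxchg(...)		__atomic_op_fence(cmpxchg, __VA_ARGS__)
	# endif
	#endif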

Regards,
Boqun

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..0136be11c84f 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -13,24 +13,6 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
-/*
- * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
- * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
- * on the platform without lwsync.
- */
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
-	op##_relaxed(args);						\
-})
-
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..9e20a942aff9 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -8,6 +8,24 @@
 #include <asm/asm-compat.h>
 #include <linux/bug.h>
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
+ * on the platform without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
+	op##_relaxed(args);						\
+})
+
 #ifdef __BIG_ENDIAN
 #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
 #else
@@ -512,6 +530,8 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
 +#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +553,7 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
  2018-05-05 11:28                     ` Boqun Feng
@ 2018-05-05 13:27                       ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05 13:27 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Peter Zijlstra, Mark Rutland, linux-arm-kernel, linux-kernel,
	aryabinin, catalin.marinas, dvyukov, will.deacon


* Boqun Feng <boqun.feng@gmail.com> wrote:

> > May I suggest the patch below? No change in functionality, but it documents the 
> > lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() 
> > version. (Which the generic code does now in a rather roundabout way.)
> > 
> 
> Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but
> you just make it sync() + cmpxchg_relaxed() + sync() with the fallback,
> and sync() is much heavier, so I don't think the fallback is correct.

Indeed!

The bit I missed previously is that PowerPC provides its own __atomic_op_release() 
method:

   #define __atomic_op_release(op, args...)                                \
   ({                                                                      \
           __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");    \
           op##_relaxed(args);                                             \
   })

... which maps to LWSYNC as you say, and my patch made that worse.

> I think maybe you can move powerpc's __atomic_op_{acquire,release}()
> from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and
> 
> 	#define cmpxchg_release __atomic_op_release(cmpxchg, __VA_ARGS__);
> 	#define cmpxchg64_release __atomic_op_release(cmpxchg64, __VA_ARGS__);
> 
> I put a diff below to say what I mean (untested).
> 
> > Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect 
> > right now either, but should anyone add a _relaxed() variant in the future, with 
> > this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that 
> > up automatically.
> > 
> 
> You mean with your other modification in include/linux/atomic.h, right?
> Because with the unmodified include/linux/atomic.h, we already pick that up
> automatically. If so, I think that's fine.
> 
> Here is the diff for the cmpxchg_release() modification; the idea is that
> we generate these in asm/cmpxchg.h rather than linux/atomic.h for ppc, so
> we keep the new linux/atomic.h working. Because, if I understand
> correctly, the next linux/atomic.h only accepts one of the following:
> 
> 1)	architecture only defines fully ordered primitives
> 
> or
> 
> 2)	architecture only defines _relaxed primitives
> 
> or
> 
> 3)	architecture defines all four (fully, _relaxed, _acquire,
> 	_release) primitives
> 
> So powerpc needs to define all four primitives in its own
> asm/cmpxchg.h.

Correct, although the new logic is still RFC; PeterZ didn't like the first
version I proposed and might NAK them.

Thanks for the patch - I have created the patch below from it and added your 
Signed-off-by.

The only change I made beyond a trivial build fix is that I also added the release 
atomics variants explicitly:

+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))

It has passed a PowerPC cross-build test here, but no runtime tests.

Does this patch look good to you?

(Still subject to PeterZ's Ack/NAK.)

Thanks,

	Ingo

======================>
From: Boqun Feng <boqun.feng@gmail.com>
Date: Sat, 5 May 2018 19:28:17 +0800
Subject: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs

Move PowerPC's __op_{acquire,release}() from atomic.h to
cmpxchg.h (in arch/powerpc/include/asm), plus use them to
define these two methods:

	#define cmpxchg_release __op_release(cmpxchg, __VA_ARGS__);
	#define cmpxchg64_release __op_release(cmpxchg64, __VA_ARGS__);

... the idea is to generate all these methods in cmpxchg.h and to define the full
array of atomic primitives, including the cmpxchg_release() methods which were
defined by the generic code before.

Also define the atomic[64]_cmpxchg_release() variants explicitly.

This ensures that all these low level cmpxchg APIs are defined in
PowerPC headers, with no generic header fallbacks.

No change in functionality or code generation.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: aryabinin@virtuozzo.com
Cc: catalin.marinas@arm.com
Cc: dvyukov@google.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/20180505112817.ihrb726i37bwm4cj@tardis
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/powerpc/include/asm/atomic.h  | 22 ++++------------------
 arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..4e06955ec10f 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -13,24 +13,6 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
-/*
- * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
- * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
- * on the platform without lwsync.
- */
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
-	op##_relaxed(args);						\
-})
-
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
@@ -213,6 +195,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
@@ -519,6 +503,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic64_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..e27a612b957f 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -8,6 +8,24 @@
 #include <asm/asm-compat.h>
 #include <linux/bug.h>
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
+ * on the platform without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
+	op##_relaxed(args);						\
+})
+
 #ifdef __BIG_ENDIAN
 #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
 #else
@@ -512,6 +530,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
+
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +554,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+
+#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
+
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
@ 2018-05-05 13:27                       ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-05 13:27 UTC (permalink / raw)
  To: linux-arm-kernel


* Boqun Feng <boqun.feng@gmail.com> wrote:

> > May I suggest the patch below? No change in functionality, but it documents the 
> > lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() 
> > version. (Which the generic code does now in a rather roundabout way.)
> > 
> 
> Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but
> you just make it sync() + cmpxchg_relaxed() + sync() with the fallback,
> and sync() is much heavier, so I don't think the fallback is correct.

Indeed!

The bit I missed previously is that PowerPC provides its own __atomic_op_release() 
method:

   #define __atomic_op_release(op, args...)                                \
   ({                                                                      \
           __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");    \
           op##_relaxed(args);                                             \
   })

... which maps to LWSYNC as you say, and my patch made that worse.

> I think maybe you can move powerpc's __atomic_op_{acquire,release}()
> from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and
> 
> 	#define cmpxchg_release __atomic_op_release(cmpxchg, __VA_ARGS__);
> 	#define cmpxchg64_release __atomic_op_release(cmpxchg64, __VA_ARGS__);
> 
> I put a diff below to say what I mean (untested).
> 
> > Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect 
> > right now either, but should anyone add a _relaxed() variant in the future, with 
> > this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that 
> > up automatically.
> > 
> 
> You mean with your other modification in include/linux/atomic.h, right?
> Because with the unmodified include/linux/atomic.h, we already pick that up
> automatically. If so, I think that's fine.
> 
> Here is the diff for the cmpxchg_release() modification; the idea is that
> we generate these in asm/cmpxchg.h rather than linux/atomic.h for ppc, so
> we keep the new linux/atomic.h working. Because, if I understand
> correctly, the next linux/atomic.h only accepts one of the following:
> 
> 1)	architecture only defines fully ordered primitives
> 
> or
> 
> 2)	architecture only defines _relaxed primitives
> 
> or
> 
> 3)	architecture defines all four (fully, _relaxed, _acquire,
> 	_release) primitives
> 
> So powerpc needs to define all four primitives in its own
> asm/cmpxchg.h.

Correct, although the new logic is still RFC; PeterZ didn't like the first
version I proposed and might NAK them.

Thanks for the patch - I have created the patch below from it and added your 
Signed-off-by.

The only change I made beyond a trivial build fix is that I also added the release 
atomics variants explicitly:

+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))

It has passed a PowerPC cross-build test here, but no runtime tests.

Does this patch look good to you?

(Still subject to PeterZ's Ack/NAK.)

Thanks,

	Ingo

======================>
From: Boqun Feng <boqun.feng@gmail.com>
Date: Sat, 5 May 2018 19:28:17 +0800
Subject: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs

Move PowerPC's __op_{acquire,release}() from atomic.h to
cmpxchg.h (in arch/powerpc/include/asm), plus use them to
define these two methods:

	#define cmpxchg_release __op_release(cmpxchg, __VA_ARGS__);
	#define cmpxchg64_release __op_release(cmpxchg64, __VA_ARGS__);

... the idea is to generate all these methods in cmpxchg.h and to define the full
array of atomic primitives, including the cmpxchg_release() methods which were
defined by the generic code before.

Also define the atomic[64]_cmpxchg_release() variants explicitly.

This ensures that all these low level cmpxchg APIs are defined in
PowerPC headers, with no generic header fallbacks.

No change in functionality or code generation.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: aryabinin at virtuozzo.com
Cc: catalin.marinas at arm.com
Cc: dvyukov at google.com
Cc: linux-arm-kernel at lists.infradead.org
Cc: will.deacon at arm.com
Link: http://lkml.kernel.org/r/20180505112817.ihrb726i37bwm4cj at tardis
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/powerpc/include/asm/atomic.h  | 22 ++++------------------
 arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..4e06955ec10f 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -13,24 +13,6 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
-/*
- * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
- * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
- * on the platform without lwsync.
- */
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
-	op##_relaxed(args);						\
-})
-
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
@@ -213,6 +195,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
@@ -519,6 +503,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic64_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..e27a612b957f 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -8,6 +8,24 @@
 #include <asm/asm-compat.h>
 #include <linux/bug.h>
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
+ * on the platform without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
+	op##_relaxed(args);						\
+})
+
 #ifdef __BIG_ENDIAN
 #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
 #else
@@ -512,6 +530,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
+
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +554,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+
+#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
+
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
  2018-05-05 13:27                       ` Ingo Molnar
@ 2018-05-05 14:03                         ` Boqun Feng
  -1 siblings, 0 replies; 103+ messages in thread
From: Boqun Feng @ 2018-05-05 14:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Mark Rutland, linux-arm-kernel, linux-kernel,
	aryabinin, catalin.marinas, dvyukov, will.deacon


On Sat, May 05, 2018 at 03:27:51PM +0200, Ingo Molnar wrote:
> 
> * Boqun Feng <boqun.feng@gmail.com> wrote:
> 
> > > May I suggest the patch below? No change in functionality, but it documents the 
> > > lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() 
> > > version. (Which the generic code does now in a rather roundabout way.)
> > > 
> > 
> > Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but
> > you just make it sync() + cmpxchg_relaxed() + sync() with the fallback,
> > and sync() is much heavier, so I don't think the fallback is correct.
> 
> Indeed!
> 
> The bit I missed previously is that PowerPC provides its own __atomic_op_release() 
> method:
> 
>    #define __atomic_op_release(op, args...)                                \
>    ({                                                                      \
>            __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");    \
>            op##_relaxed(args);                                             \
>    })
> 
> ... which maps to LWSYNC as you say, and my patch made that worse.
> 
> > I think maybe you can move powerpc's __atomic_op_{acquire,release}()
> > from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and
> > 
> > 	#define cmpxchg_release __atomic_op_release(cmpxchg, __VA_ARGS__);
> > 	#define cmpxchg64_release __atomic_op_release(cmpxchg64, __VA_ARGS__);
> > 
> > I put a diff below to say what I mean (untested).
> > 
> > > Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect 
> > > right now either, but should anyone add a _relaxed() variant in the future, with 
> > > this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that 
> > > up automatically.
> > > 
> > 
> > You mean with your other modification in include/linux/atomic.h, right?
> > Because with the unmodified include/linux/atomic.h, we already pick that up
> > automatically. If so, I think that's fine.
> > 
> > Here is the diff for the cmpxchg_release() modification; the idea is that
> > we generate these in asm/cmpxchg.h rather than linux/atomic.h for ppc, so
> > we keep the new linux/atomic.h working. Because, if I understand
> > correctly, the next linux/atomic.h only accepts one of the following:
> > 
> > 1)	architecture only defines fully ordered primitives
> > 
> > or
> > 
> > 2)	architecture only defines _relaxed primitives
> > 
> > or
> > 
> > 3)	architecture defines all four (fully, _relaxed, _acquire,
> > 	_release) primitives
> > 
> > So powerpc needs to define all four primitives in its own
> > asm/cmpxchg.h.
> 
> Correct, although the new logic is still RFC; PeterZ didn't like the first
> version I proposed and might NAK them.
> 

Understood. From my side, I don't have strong feelings either way.
But since powerpc is affected by the new logic, I'm glad I could
help.

> Thanks for the patch - I have created the patch below from it and added your 
> Signed-off-by.
> 

Thanks ;-)

> The only change I made beyond a trivial build fix is that I also added the release 
> atomics variants explicitly:
> 
> +#define atomic_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
> +#define atomic64_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
> 
> It has passed a PowerPC cross-build test here, but no runtime tests.
> 

Do you have the commit in any branch of the tip tree? I could pull it,
cross-build, and check the assembly code of lib/atomic64_test.c; that way
I could verify whether we messed anything up.

> Does this patch look good to you?
> 

Yep!

Regards,
Boqun

> (Still subject to PeterZ's Ack/NAK.)
> 
> Thanks,
> 
> 	Ingo
> 
> ======================>
> From: Boqun Feng <boqun.feng@gmail.com>
> Date: Sat, 5 May 2018 19:28:17 +0800
> Subject: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
> 
> Move PowerPC's __op_{acquire,release}() from atomic.h to
> cmpxchg.h (in arch/powerpc/include/asm), plus use them to
> define these two methods:
> 
> 	#define cmpxchg_release __op_release(cmpxchg, __VA_ARGS__);
> 	#define cmpxchg64_release __op_release(cmpxchg64, __VA_ARGS__);
> 
> ... the idea is to generate all these methods in cmpxchg.h and to define the full
> array of atomic primitives, including the cmpxchg_release() methods which were
> defined by the generic code before.
> 
> Also define the atomic[64]_cmpxchg_release() variants explicitly.
> 
> This ensures that all these low level cmpxchg APIs are defined in
> PowerPC headers, with no generic header fallbacks.
> 
> No change in functionality or code generation.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: aryabinin@virtuozzo.com
> Cc: catalin.marinas@arm.com
> Cc: dvyukov@google.com
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: will.deacon@arm.com
> Link: http://lkml.kernel.org/r/20180505112817.ihrb726i37bwm4cj@tardis
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/powerpc/include/asm/atomic.h  | 22 ++++------------------
>  arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
>  2 files changed, 28 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
> index 682b3e6a1e21..4e06955ec10f 100644
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -13,24 +13,6 @@
>  
>  #define ATOMIC_INIT(i)		{ (i) }
>  
> -/*
> - * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
> - * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> - * on the platform without lwsync.
> - */
> -#define __atomic_op_acquire(op, args...)				\
> -({									\
> -	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
> -	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
> -	__ret;								\
> -})
> -
> -#define __atomic_op_release(op, args...)				\
> -({									\
> -	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
> -	op##_relaxed(args);						\
> -})
> -
>  static __inline__ int atomic_read(const atomic_t *v)
>  {
>  	int t;
> @@ -213,6 +195,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
>  	cmpxchg_relaxed(&((v)->counter), (o), (n))
>  #define atomic_cmpxchg_acquire(v, o, n) \
>  	cmpxchg_acquire(&((v)->counter), (o), (n))
> +#define atomic_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
>  
>  #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
>  #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
> @@ -519,6 +503,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
>  	cmpxchg_relaxed(&((v)->counter), (o), (n))
>  #define atomic64_cmpxchg_acquire(v, o, n) \
>  	cmpxchg_acquire(&((v)->counter), (o), (n))
> +#define atomic64_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
>  
>  #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
>  #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
> diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
> index 9b001f1f6b32..e27a612b957f 100644
> --- a/arch/powerpc/include/asm/cmpxchg.h
> +++ b/arch/powerpc/include/asm/cmpxchg.h
> @@ -8,6 +8,24 @@
>  #include <asm/asm-compat.h>
>  #include <linux/bug.h>
>  
> +/*
> + * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
> + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> + * on the platform without lwsync.
> + */
> +#define __atomic_op_acquire(op, args...)				\
> +({									\
> +	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
> +	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
> +	__ret;								\
> +})
> +
> +#define __atomic_op_release(op, args...)				\
> +({									\
> +	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
> +	op##_relaxed(args);						\
> +})
> +
>  #ifdef __BIG_ENDIAN
>  #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
>  #else
> @@ -512,6 +530,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
>  			(unsigned long)_o_, (unsigned long)_n_,		\
>  			sizeof(*(ptr)));				\
>  })
> +
> +#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
> +
>  #ifdef CONFIG_PPC64
>  #define cmpxchg64(ptr, o, n)						\
>    ({									\
> @@ -533,6 +554,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
>  	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
>  	cmpxchg_acquire((ptr), (o), (n));				\
>  })
> +
> +#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
> +
>  #else
>  #include <asm-generic/cmpxchg-local.h>
>  #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))


^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
@ 2018-05-05 14:03                         ` Boqun Feng
  0 siblings, 0 replies; 103+ messages in thread
From: Boqun Feng @ 2018-05-05 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, May 05, 2018 at 03:27:51PM +0200, Ingo Molnar wrote:
> 
> * Boqun Feng <boqun.feng@gmail.com> wrote:
> 
> > > May I suggest the patch below? No change in functionality, but it documents the 
> > > lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() 
> > > version. (Which the generic code does now in a rather roundabout way.)
> > > 
> > 
> > Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but
> > you just make it sync() + cmpxchg_relaxed() + sync() with the fallback,
> > and sync() is much heavier, so I don't think the fallback is correct.
> 
> Indeed!
> 
> The bit I missed previously is that PowerPC provides its own __atomic_op_release() 
> method:
> 
>    #define __atomic_op_release(op, args...)                                \
>    ({                                                                      \
>            __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");    \
>            op##_relaxed(args);                                             \
>    })
> 
> ... which maps to LWSYNC as you say, and my patch made that worse.
> 
> > I think maybe you can move powerpc's __atomic_op_{acquire,release}()
> > from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and
> > 
> > 	#define cmpxchg_release __atomic_op_release(cmpxchg, __VA_ARGS__);
> > 	#define cmpxchg64_release __atomic_op_release(cmpxchg64, __VA_ARGS__);
> > 
> > I put a diff below to say what I mean (untested).
> > 
> > > Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect 
> > > right now either, but should anyone add a _relaxed() variant in the future, with 
> > > this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that 
> > > up automatically.
> > > 
> > 
> > You mean with your other modification in include/linux/atomic.h, right?
> > Because with the unmodified include/linux/atomic.h, we already pick that up
> > automatically. If so, I think that's fine.
> > 
> > Here is the diff for the cmpxchg_release() modification; the idea is that
> > we generate these in asm/cmpxchg.h rather than linux/atomic.h for ppc, so
> > we keep the new linux/atomic.h working. Because, if I understand
> > correctly, the next linux/atomic.h only accepts one of the following:
> > 
> > 1)	architecture only defines fully ordered primitives
> > 
> > or
> > 
> > 2)	architecture only defines _relaxed primitives
> > 
> > or
> > 
> > 3)	architecture defines all four (fully, _relaxed, _acquire,
> > 	_release) primitives
> > 
> > So powerpc needs to define all four primitives in its own
> > asm/cmpxchg.h.
> 
> Correct, although the new logic is still RFC; PeterZ didn't like the first
> version I proposed and might NAK them.
> 

Understood. From my side, I don't have strong feelings either way.
But since powerpc is affected by the new logic, I'm glad I could
help.

> Thanks for the patch - I have created the patch below from it and added your 
> Signed-off-by.
> 

Thanks ;-)

> The only change I made beyond a trivial build fix is that I also added the release 
> atomics variants explicitly:
> 
> +#define atomic_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
> +#define atomic64_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
> 
> It has passed a PowerPC cross-build test here, but no runtime tests.
> 

Do you have the commit in any branch of the tip tree? I could pull it,
cross-build, and check the assembly code of lib/atomic64_test.c; that way
I could verify whether we messed anything up.

> Does this patch look good to you?
> 

Yep!

Regards,
Boqun

> (Still subject to PeterZ's Ack/NAK.)
> 
> Thanks,
> 
> 	Ingo
> 
> ======================>
> From: Boqun Feng <boqun.feng@gmail.com>
> Date: Sat, 5 May 2018 19:28:17 +0800
> Subject: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
> 
> Move PowerPC's __op_{acquire,release}() from atomic.h to
> cmpxchg.h (in arch/powerpc/include/asm), plus use them to
> define these two methods:
> 
> 	#define cmpxchg_release __op_release(cmpxchg, __VA_ARGS__);
> 	#define cmpxchg64_release __op_release(cmpxchg64, __VA_ARGS__);
> 
> ... the idea is to generate all these methods in cmpxchg.h and to define the full
> array of atomic primitives, including the cmpxchg_release() methods which were
> defined by the generic code before.
> 
> Also define the atomic[64]_cmpxchg_release() variants explicitly.
> 
> This ensures that all these low level cmpxchg APIs are defined in
> PowerPC headers, with no generic header fallbacks.
> 
> No change in functionality or code generation.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: aryabinin at virtuozzo.com
> Cc: catalin.marinas at arm.com
> Cc: dvyukov at google.com
> Cc: linux-arm-kernel at lists.infradead.org
> Cc: will.deacon at arm.com
> Link: http://lkml.kernel.org/r/20180505112817.ihrb726i37bwm4cj at tardis
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  arch/powerpc/include/asm/atomic.h  | 22 ++++------------------
>  arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
>  2 files changed, 28 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
> index 682b3e6a1e21..4e06955ec10f 100644
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -13,24 +13,6 @@
>  
>  #define ATOMIC_INIT(i)		{ (i) }
>  
> -/*
> - * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
> - * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> - * on the platform without lwsync.
> - */
> -#define __atomic_op_acquire(op, args...)				\
> -({									\
> -	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
> -	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
> -	__ret;								\
> -})
> -
> -#define __atomic_op_release(op, args...)				\
> -({									\
> -	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
> -	op##_relaxed(args);						\
> -})
> -
>  static __inline__ int atomic_read(const atomic_t *v)
>  {
>  	int t;
> @@ -213,6 +195,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
>  	cmpxchg_relaxed(&((v)->counter), (o), (n))
>  #define atomic_cmpxchg_acquire(v, o, n) \
>  	cmpxchg_acquire(&((v)->counter), (o), (n))
> +#define atomic_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
>  
>  #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
>  #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
> @@ -519,6 +503,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
>  	cmpxchg_relaxed(&((v)->counter), (o), (n))
>  #define atomic64_cmpxchg_acquire(v, o, n) \
>  	cmpxchg_acquire(&((v)->counter), (o), (n))
> +#define atomic64_cmpxchg_release(v, o, n) \
> +	cmpxchg_release(&((v)->counter), (o), (n))
>  
>  #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
>  #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
> diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
> index 9b001f1f6b32..e27a612b957f 100644
> --- a/arch/powerpc/include/asm/cmpxchg.h
> +++ b/arch/powerpc/include/asm/cmpxchg.h
> @@ -8,6 +8,24 @@
>  #include <asm/asm-compat.h>
>  #include <linux/bug.h>
>  
> +/*
> + * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
> + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> + * on the platform without lwsync.
> + */
> +#define __atomic_op_acquire(op, args...)				\
> +({									\
> +	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
> +	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
> +	__ret;								\
> +})
> +
> +#define __atomic_op_release(op, args...)				\
> +({									\
> +	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
> +	op##_relaxed(args);						\
> +})
> +
>  #ifdef __BIG_ENDIAN
>  #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
>  #else
> @@ -512,6 +530,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
>  			(unsigned long)_o_, (unsigned long)_n_,		\
>  			sizeof(*(ptr)));				\
>  })
> +
> +#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
> +
>  #ifdef CONFIG_PPC64
>  #define cmpxchg64(ptr, o, n)						\
>    ({									\
> @@ -533,6 +554,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
>  	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
>  	cmpxchg_acquire((ptr), (o), (n));				\
>  })
> +
> +#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
> +
>  #else
>  #include <asm-generic/cmpxchg-local.h>
>  #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [RFC PATCH] locking/atomics/powerpc: Introduce optimized cmpxchg_release() family of APIs for PowerPC
  2018-05-05 10:00                 ` Ingo Molnar
@ 2018-05-06  1:56                   ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 103+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-06  1:56 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Paul Mackerras, Michael Ellerman
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, aryabinin,
	boqun.feng, catalin.marinas, dvyukov, will.deacon

On Sat, 2018-05-05 at 12:00 +0200, Ingo Molnar wrote:
> This clearly suggests that PPC_RELEASE_BARRIER is in active use and 'lwsync' is 
> the 'release barrier' instruction, if I interpreted that right.

The closest to one we got.

The semantics are that it orders all load/store pairs to cachable
storage except store+load.
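
In table form (an informal summary of the above, not an authoritative ISA
statement):

	/*
	 * lwsync ordering, cachable storage only:
	 *
	 *	load  -> load		ordered
	 *	load  -> store		ordered
	 *	store -> store		ordered
	 *	store -> load		NOT ordered (needs sync)
	 */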

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [RFC PATCH] locking/atomics/powerpc: Introduce optimized cmpxchg_release() family of APIs for PowerPC
@ 2018-05-06  1:56                   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 103+ messages in thread
From: Benjamin Herrenschmidt @ 2018-05-06  1:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, 2018-05-05 at 12:00 +0200, Ingo Molnar wrote:
> This clearly suggests that PPC_RELEASE_BARRIER is in active use and 'lwsync' is 
> the 'release barrier' instruction, if I interpreted that right.

The closest to one we got.

The semantics are that it orders all load/store pairs to cachable
storage except store+load.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
  2018-05-05 14:03                         ` Boqun Feng
@ 2018-05-06 12:11                           ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-06 12:11 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Peter Zijlstra, Mark Rutland, linux-arm-kernel, linux-kernel,
	aryabinin, catalin.marinas, dvyukov, will.deacon


* Boqun Feng <boqun.feng@gmail.com> wrote:

> > The only change I made beyond a trivial build fix is that I also added the release 
> > atomics variants explicitly:
> > 
> > +#define atomic_cmpxchg_release(v, o, n) \
> > +	cmpxchg_release(&((v)->counter), (o), (n))
> > +#define atomic64_cmpxchg_release(v, o, n) \
> > +	cmpxchg_release(&((v)->counter), (o), (n))
> > 
> > It has passed a PowerPC cross-build test here, but no runtime tests.
> > 
> 
> Do you have the commit in any branch of the tip tree? I could pull it,
> cross-build, and check the assembly code of lib/atomic64_test.c; that way
> I could verify whether we messed anything up.
> 
> > Does this patch look good to you?
> > 
> 
> Yep!

Great - I have pushed the commits out into the locking tree; they can be found in:

  git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core

The PowerPC preparatory commit from you is:

  0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
@ 2018-05-06 12:11                           ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-06 12:11 UTC (permalink / raw)
  To: linux-arm-kernel


* Boqun Feng <boqun.feng@gmail.com> wrote:

> > The only change I made beyond a trivial build fix is that I also added the release 
> > atomics variants explicitly:
> > 
> > +#define atomic_cmpxchg_release(v, o, n) \
> > +	cmpxchg_release(&((v)->counter), (o), (n))
> > +#define atomic64_cmpxchg_release(v, o, n) \
> > +	cmpxchg_release(&((v)->counter), (o), (n))
> > 
> > It has passed a PowerPC cross-build test here, but no runtime tests.
> > 
> 
> Do you have the commit in any branch of the tip tree? I could pull it,
> cross-build, and check the assembly code of lib/atomic64_test.c; that way
> I could verify whether we messed anything up.
> 
> > Does this patch look good to you?
> > 
> 
> Yep!

Great - I have pushed the commits out into the locking tree; they can be found in:

  git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core

The PowerPC preparatory commit from you is:

  0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [tip:locking/core] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
  2018-05-05 11:28                     ` Boqun Feng
  (?)
  (?)
@ 2018-05-06 12:13                     ` tip-bot for Boqun Feng
  2018-05-07 13:31                         ` Boqun Feng
  -1 siblings, 1 reply; 103+ messages in thread
From: tip-bot for Boqun Feng @ 2018-05-06 12:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, hpa, linux-kernel, mark.rutland, boqun.feng, peterz, mingo,
	torvalds

Commit-ID:  0476a632cb3aa88c03cefc294050a9a86760e88d
Gitweb:     https://git.kernel.org/tip/0476a632cb3aa88c03cefc294050a9a86760e88d
Author:     Boqun Feng <boqun.feng@gmail.com>
AuthorDate: Sat, 5 May 2018 19:28:17 +0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 5 May 2018 15:22:20 +0200

locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs

Move PowerPC's __op_{acquire,release}() from atomic.h to
cmpxchg.h (in arch/powerpc/include/asm), plus use them to
define these two methods:

	#define cmpxchg_release __op_release(cmpxchg, __VA_ARGS__);
	#define cmpxchg64_release __op_release(cmpxchg64, __VA_ARGS__);

... the idea is to generate all these methods in cmpxchg.h and to define the full
array of atomic primitives, including the cmpxchg_release() methods which were
defined by the generic code before.

Also define the atomic[64]_cmpxchg_release() variants explicitly.

This ensures that all these low level cmpxchg APIs are defined in
PowerPC headers, with no generic header fallbacks.

No change in functionality or code generation.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: aryabinin@virtuozzo.com
Cc: catalin.marinas@arm.com
Cc: dvyukov@google.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/20180505112817.ihrb726i37bwm4cj@tardis
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/powerpc/include/asm/atomic.h  | 22 ++++------------------
 arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..4e06955ec10f 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -13,24 +13,6 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
-/*
- * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
- * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
- * on the platform without lwsync.
- */
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
-	op##_relaxed(args);						\
-})
-
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
@@ -213,6 +195,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
@@ -519,6 +503,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic64_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..e27a612b957f 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -8,6 +8,24 @@
 #include <asm/asm-compat.h>
 #include <linux/bug.h>
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
+ * on the platform without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
+	op##_relaxed(args);						\
+})
+
 #ifdef __BIG_ENDIAN
 #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
 #else
@@ -512,6 +530,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
+
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +554,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+
+#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
+
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [tip:locking/core] locking/atomics: Clean up the atomic.h maze of #defines
  2018-05-05  8:11         ` Ingo Molnar
                           ` (2 preceding siblings ...)
  (?)
@ 2018-05-06 12:14         ` tip-bot for Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: tip-bot for Ingo Molnar @ 2018-05-06 12:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, akpm, mingo, mark.rutland, torvalds, will.deacon, hpa,
	tglx, paulmck, linux-kernel

Commit-ID:  a2d636a4bfd5e9b31215e5d1913e7fe0d0c0970a
Gitweb:     https://git.kernel.org/tip/a2d636a4bfd5e9b31215e5d1913e7fe0d0c0970a
Author:     Ingo Molnar <mingo@kernel.org>
AuthorDate: Sat, 5 May 2018 10:11:00 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 5 May 2018 15:22:44 +0200

locking/atomics: Clean up the atomic.h maze of #defines

Use structured defines to make it all much more readable.

Before:

 #ifndef atomic_fetch_dec_relaxed

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
 #define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
 #define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
 #define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
 #else /* atomic_fetch_dec */
 #define atomic_fetch_dec_relaxed	atomic_fetch_dec
 #define atomic_fetch_dec_acquire	atomic_fetch_dec
 #define atomic_fetch_dec_release	atomic_fetch_dec
 #endif /* atomic_fetch_dec */

 #else /* atomic_fetch_dec_relaxed */

 #ifndef atomic_fetch_dec_acquire
 #define atomic_fetch_dec_acquire(...)					\
	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec_release
 #define atomic_fetch_dec_release(...)					\
	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 #endif

 #ifndef atomic_fetch_dec
 #define atomic_fetch_dec(...)						\
	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #endif
 #endif /* atomic_fetch_dec_relaxed */

After:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec_acquire
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec_release
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

Beyond the linecount reduction, this also makes it easier to follow
the various conditions.

Also clean up a few other minor details and make the code more
consistent throughout.

No change in functionality.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aryabinin@virtuozzo.com
Cc: boqun.feng@gmail.com
Cc: catalin.marinas@arm.com
Cc: dvyukov@google.com
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20180505081100.nsyrqrpzq2vd27bk@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 1275 +++++++++++++++++++++---------------------------
 1 file changed, 543 insertions(+), 732 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 01ce3997cb42..12f4ad559ab1 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -24,11 +24,11 @@
  */
 
 #ifndef atomic_read_acquire
-#define  atomic_read_acquire(v)		smp_load_acquire(&(v)->counter)
+# define atomic_read_acquire(v)			smp_load_acquire(&(v)->counter)
 #endif
 
 #ifndef atomic_set_release
-#define  atomic_set_release(v, i)	smp_store_release(&(v)->counter, (i))
+# define atomic_set_release(v, i)		smp_store_release(&(v)->counter, (i))
 #endif
 
 /*
@@ -71,454 +71,351 @@
 })
 #endif
 
-/* atomic_add_return_relaxed */
-#ifndef atomic_add_return_relaxed
-#define  atomic_add_return_relaxed	atomic_add_return
-#define  atomic_add_return_acquire	atomic_add_return
-#define  atomic_add_return_release	atomic_add_return
-
-#else /* atomic_add_return_relaxed */
-
-#ifndef atomic_add_return_acquire
-#define  atomic_add_return_acquire(...)					\
-	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
-#endif
+/* atomic_add_return_relaxed() et al: */
 
-#ifndef atomic_add_return_release
-#define  atomic_add_return_release(...)					\
-	__atomic_op_release(atomic_add_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_add_return
-#define  atomic_add_return(...)						\
-	__atomic_op_fence(atomic_add_return, __VA_ARGS__)
-#endif
-#endif /* atomic_add_return_relaxed */
+#ifndef atomic_add_return_relaxed
+# define atomic_add_return_relaxed		atomic_add_return
+# define atomic_add_return_acquire		atomic_add_return
+# define atomic_add_return_release		atomic_add_return
+#else
+# ifndef atomic_add_return_acquire
+#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic_add_return_release
+#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic_add_return
+#  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_inc_return_relaxed() et al: */
 
-/* atomic_inc_return_relaxed */
 #ifndef atomic_inc_return_relaxed
-#define  atomic_inc_return_relaxed	atomic_inc_return
-#define  atomic_inc_return_acquire	atomic_inc_return
-#define  atomic_inc_return_release	atomic_inc_return
-
-#else /* atomic_inc_return_relaxed */
-
-#ifndef atomic_inc_return_acquire
-#define  atomic_inc_return_acquire(...)					\
-	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_inc_return_release
-#define  atomic_inc_return_release(...)					\
-	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_inc_return
-#define  atomic_inc_return(...)						\
-	__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
-#endif
-#endif /* atomic_inc_return_relaxed */
+# define atomic_inc_return_relaxed		atomic_inc_return
+# define atomic_inc_return_acquire		atomic_inc_return
+# define atomic_inc_return_release		atomic_inc_return
+#else
+# ifndef atomic_inc_return_acquire
+#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic_inc_return_release
+#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic_inc_return
+#  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_sub_return_relaxed() et al: */
 
-/* atomic_sub_return_relaxed */
 #ifndef atomic_sub_return_relaxed
-#define  atomic_sub_return_relaxed	atomic_sub_return
-#define  atomic_sub_return_acquire	atomic_sub_return
-#define  atomic_sub_return_release	atomic_sub_return
-
-#else /* atomic_sub_return_relaxed */
-
-#ifndef atomic_sub_return_acquire
-#define  atomic_sub_return_acquire(...)					\
-	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_sub_return_release
-#define  atomic_sub_return_release(...)					\
-	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_sub_return
-#define  atomic_sub_return(...)						\
-	__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
-#endif
-#endif /* atomic_sub_return_relaxed */
+# define atomic_sub_return_relaxed		atomic_sub_return
+# define atomic_sub_return_acquire		atomic_sub_return
+# define atomic_sub_return_release		atomic_sub_return
+#else
+# ifndef atomic_sub_return_acquire
+#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic_sub_return_release
+#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic_sub_return
+#  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_dec_return_relaxed() et al: */
 
-/* atomic_dec_return_relaxed */
 #ifndef atomic_dec_return_relaxed
-#define  atomic_dec_return_relaxed	atomic_dec_return
-#define  atomic_dec_return_acquire	atomic_dec_return
-#define  atomic_dec_return_release	atomic_dec_return
-
-#else /* atomic_dec_return_relaxed */
-
-#ifndef atomic_dec_return_acquire
-#define  atomic_dec_return_acquire(...)					\
-	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
-#endif
+# define atomic_dec_return_relaxed		atomic_dec_return
+# define atomic_dec_return_acquire		atomic_dec_return
+# define atomic_dec_return_release		atomic_dec_return
+#else
+# ifndef atomic_dec_return_acquire
+#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic_dec_return_release
+#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic_dec_return
+#  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_add_relaxed() et al: */
 
-#ifndef atomic_dec_return_release
-#define  atomic_dec_return_release(...)					\
-	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic_dec_return
-#define  atomic_dec_return(...)						\
-	__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
-#endif
-#endif /* atomic_dec_return_relaxed */
-
-
-/* atomic_fetch_add_relaxed */
 #ifndef atomic_fetch_add_relaxed
-#define atomic_fetch_add_relaxed	atomic_fetch_add
-#define atomic_fetch_add_acquire	atomic_fetch_add
-#define atomic_fetch_add_release	atomic_fetch_add
-
-#else /* atomic_fetch_add_relaxed */
-
-#ifndef atomic_fetch_add_acquire
-#define atomic_fetch_add_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_add_release
-#define atomic_fetch_add_release(...)					\
-	__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
-#endif
+# define atomic_fetch_add_relaxed		atomic_fetch_add
+# define atomic_fetch_add_acquire		atomic_fetch_add
+# define atomic_fetch_add_release		atomic_fetch_add
+#else
+# ifndef atomic_fetch_add_acquire
+#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_add_release
+#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_add
+#  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_inc_relaxed() et al: */
 
-#ifndef atomic_fetch_add
-#define atomic_fetch_add(...)						\
-	__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_add_relaxed */
-
-/* atomic_fetch_inc_relaxed */
 #ifndef atomic_fetch_inc_relaxed
+# ifndef atomic_fetch_inc
+#  define atomic_fetch_inc(v)			atomic_fetch_add(1, (v))
+#  define atomic_fetch_inc_relaxed(v)		atomic_fetch_add_relaxed(1, (v))
+#  define atomic_fetch_inc_acquire(v)		atomic_fetch_add_acquire(1, (v))
+#  define atomic_fetch_inc_release(v)		atomic_fetch_add_release(1, (v))
+# else
+#  define atomic_fetch_inc_relaxed		atomic_fetch_inc
+#  define atomic_fetch_inc_acquire		atomic_fetch_inc
+#  define atomic_fetch_inc_release		atomic_fetch_inc
+# endif
+#else
+# ifndef atomic_fetch_inc_acquire
+#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_inc_release
+#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_inc
+#  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_sub_relaxed() et al: */
 
-#ifndef atomic_fetch_inc
-#define atomic_fetch_inc(v)	        atomic_fetch_add(1, (v))
-#define atomic_fetch_inc_relaxed(v)	atomic_fetch_add_relaxed(1, (v))
-#define atomic_fetch_inc_acquire(v)	atomic_fetch_add_acquire(1, (v))
-#define atomic_fetch_inc_release(v)	atomic_fetch_add_release(1, (v))
-#else /* atomic_fetch_inc */
-#define atomic_fetch_inc_relaxed	atomic_fetch_inc
-#define atomic_fetch_inc_acquire	atomic_fetch_inc
-#define atomic_fetch_inc_release	atomic_fetch_inc
-#endif /* atomic_fetch_inc */
-
-#else /* atomic_fetch_inc_relaxed */
-
-#ifndef atomic_fetch_inc_acquire
-#define atomic_fetch_inc_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_inc_release
-#define atomic_fetch_inc_release(...)					\
-	__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_inc
-#define atomic_fetch_inc(...)						\
-	__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_inc_relaxed */
-
-/* atomic_fetch_sub_relaxed */
 #ifndef atomic_fetch_sub_relaxed
-#define atomic_fetch_sub_relaxed	atomic_fetch_sub
-#define atomic_fetch_sub_acquire	atomic_fetch_sub
-#define atomic_fetch_sub_release	atomic_fetch_sub
-
-#else /* atomic_fetch_sub_relaxed */
-
-#ifndef atomic_fetch_sub_acquire
-#define atomic_fetch_sub_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
-#endif
+# define atomic_fetch_sub_relaxed		atomic_fetch_sub
+# define atomic_fetch_sub_acquire		atomic_fetch_sub
+# define atomic_fetch_sub_release		atomic_fetch_sub
+#else
+# ifndef atomic_fetch_sub_acquire
+#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_sub_release
+#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_sub
+#  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_dec_relaxed() et al: */
 
-#ifndef atomic_fetch_sub_release
-#define atomic_fetch_sub_release(...)					\
-	__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_sub
-#define atomic_fetch_sub(...)						\
-	__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_sub_relaxed */
-
-/* atomic_fetch_dec_relaxed */
 #ifndef atomic_fetch_dec_relaxed
+# ifndef atomic_fetch_dec
+#  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
+#  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
+#  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
+#  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
+# else
+#  define atomic_fetch_dec_relaxed		atomic_fetch_dec
+#  define atomic_fetch_dec_acquire		atomic_fetch_dec
+#  define atomic_fetch_dec_release		atomic_fetch_dec
+# endif
+#else
+# ifndef atomic_fetch_dec_acquire
+#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_dec_release
+#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_dec
+#  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_or_relaxed() et al: */
 
-#ifndef atomic_fetch_dec
-#define atomic_fetch_dec(v)	        atomic_fetch_sub(1, (v))
-#define atomic_fetch_dec_relaxed(v)	atomic_fetch_sub_relaxed(1, (v))
-#define atomic_fetch_dec_acquire(v)	atomic_fetch_sub_acquire(1, (v))
-#define atomic_fetch_dec_release(v)	atomic_fetch_sub_release(1, (v))
-#else /* atomic_fetch_dec */
-#define atomic_fetch_dec_relaxed	atomic_fetch_dec
-#define atomic_fetch_dec_acquire	atomic_fetch_dec
-#define atomic_fetch_dec_release	atomic_fetch_dec
-#endif /* atomic_fetch_dec */
-
-#else /* atomic_fetch_dec_relaxed */
-
-#ifndef atomic_fetch_dec_acquire
-#define atomic_fetch_dec_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_dec_release
-#define atomic_fetch_dec_release(...)					\
-	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_dec
-#define atomic_fetch_dec(...)						\
-	__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_dec_relaxed */
-
-/* atomic_fetch_or_relaxed */
 #ifndef atomic_fetch_or_relaxed
-#define atomic_fetch_or_relaxed	atomic_fetch_or
-#define atomic_fetch_or_acquire	atomic_fetch_or
-#define atomic_fetch_or_release	atomic_fetch_or
-
-#else /* atomic_fetch_or_relaxed */
-
-#ifndef atomic_fetch_or_acquire
-#define atomic_fetch_or_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_or_release
-#define atomic_fetch_or_release(...)					\
-	__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_or
-#define atomic_fetch_or(...)						\
-	__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_or_relaxed */
+# define atomic_fetch_or_relaxed		atomic_fetch_or
+# define atomic_fetch_or_acquire		atomic_fetch_or
+# define atomic_fetch_or_release		atomic_fetch_or
+#else
+# ifndef atomic_fetch_or_acquire
+#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_or_release
+#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_or
+#  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_fetch_and_relaxed() et al: */
 
-/* atomic_fetch_and_relaxed */
 #ifndef atomic_fetch_and_relaxed
-#define atomic_fetch_and_relaxed	atomic_fetch_and
-#define atomic_fetch_and_acquire	atomic_fetch_and
-#define atomic_fetch_and_release	atomic_fetch_and
-
-#else /* atomic_fetch_and_relaxed */
-
-#ifndef atomic_fetch_and_acquire
-#define atomic_fetch_and_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_and_release
-#define atomic_fetch_and_release(...)					\
-	__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_and
-#define atomic_fetch_and(...)						\
-	__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+# define atomic_fetch_and_relaxed		atomic_fetch_and
+# define atomic_fetch_and_acquire		atomic_fetch_and
+# define atomic_fetch_and_release		atomic_fetch_and
+#else
+# ifndef atomic_fetch_and_acquire
+#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_and_release
+#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_and
+#  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+# endif
 #endif
-#endif /* atomic_fetch_and_relaxed */
 
 #ifdef atomic_andnot
-/* atomic_fetch_andnot_relaxed */
-#ifndef atomic_fetch_andnot_relaxed
-#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot
-#define atomic_fetch_andnot_acquire	atomic_fetch_andnot
-#define atomic_fetch_andnot_release	atomic_fetch_andnot
-
-#else /* atomic_fetch_andnot_relaxed */
 
-#ifndef atomic_fetch_andnot_acquire
-#define atomic_fetch_andnot_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-#endif
+/* atomic_fetch_andnot_relaxed() et al: */
 
-#ifndef atomic_fetch_andnot_release
-#define atomic_fetch_andnot_release(...)					\
-	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+#ifndef atomic_fetch_andnot_relaxed
+# define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
+# define atomic_fetch_andnot_acquire		atomic_fetch_andnot
+# define atomic_fetch_andnot_release		atomic_fetch_andnot
+#else
+# ifndef atomic_fetch_andnot_acquire
+#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_andnot_release
+#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_andnot
+#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic_fetch_andnot
-#define atomic_fetch_andnot(...)						\
-	__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_andnot_relaxed */
 #endif /* atomic_andnot */
 
-/* atomic_fetch_xor_relaxed */
-#ifndef atomic_fetch_xor_relaxed
-#define atomic_fetch_xor_relaxed	atomic_fetch_xor
-#define atomic_fetch_xor_acquire	atomic_fetch_xor
-#define atomic_fetch_xor_release	atomic_fetch_xor
-
-#else /* atomic_fetch_xor_relaxed */
+/* atomic_fetch_xor_relaxed() et al: */
 
-#ifndef atomic_fetch_xor_acquire
-#define atomic_fetch_xor_acquire(...)					\
-	__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
-#endif
-
-#ifndef atomic_fetch_xor_release
-#define atomic_fetch_xor_release(...)					\
-	__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+#ifndef atomic_fetch_xor_relaxed
+# define atomic_fetch_xor_relaxed		atomic_fetch_xor
+# define atomic_fetch_xor_acquire		atomic_fetch_xor
+# define atomic_fetch_xor_release		atomic_fetch_xor
+#else
+# ifndef atomic_fetch_xor_acquire
+#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_xor_release
+#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic_fetch_xor
+#  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic_fetch_xor
-#define atomic_fetch_xor(...)						\
-	__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
-#endif
-#endif /* atomic_fetch_xor_relaxed */
 
+/* atomic_xchg_relaxed() et al: */
 
-/* atomic_xchg_relaxed */
 #ifndef atomic_xchg_relaxed
-#define  atomic_xchg_relaxed		atomic_xchg
-#define  atomic_xchg_acquire		atomic_xchg
-#define  atomic_xchg_release		atomic_xchg
-
-#else /* atomic_xchg_relaxed */
-
-#ifndef atomic_xchg_acquire
-#define  atomic_xchg_acquire(...)					\
-	__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic_xchg_release
-#define  atomic_xchg_release(...)					\
-	__atomic_op_release(atomic_xchg, __VA_ARGS__)
-#endif
+#define atomic_xchg_relaxed			atomic_xchg
+#define atomic_xchg_acquire			atomic_xchg
+#define atomic_xchg_release			atomic_xchg
+#else
+# ifndef atomic_xchg_acquire
+#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic_xchg_release
+#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic_xchg
+#  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic_cmpxchg_relaxed() et al: */
 
-#ifndef atomic_xchg
-#define  atomic_xchg(...)						\
-	__atomic_op_fence(atomic_xchg, __VA_ARGS__)
-#endif
-#endif /* atomic_xchg_relaxed */
-
-/* atomic_cmpxchg_relaxed */
 #ifndef atomic_cmpxchg_relaxed
-#define  atomic_cmpxchg_relaxed		atomic_cmpxchg
-#define  atomic_cmpxchg_acquire		atomic_cmpxchg
-#define  atomic_cmpxchg_release		atomic_cmpxchg
-
-#else /* atomic_cmpxchg_relaxed */
-
-#ifndef atomic_cmpxchg_acquire
-#define  atomic_cmpxchg_acquire(...)					\
-	__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
+# define atomic_cmpxchg_relaxed			atomic_cmpxchg
+# define atomic_cmpxchg_acquire			atomic_cmpxchg
+# define atomic_cmpxchg_release			atomic_cmpxchg
+#else
+# ifndef atomic_cmpxchg_acquire
+#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic_cmpxchg_release
+#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic_cmpxchg
+#  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic_cmpxchg_release
-#define  atomic_cmpxchg_release(...)					\
-	__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic_cmpxchg
-#define  atomic_cmpxchg(...)						\
-	__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
-#endif
-#endif /* atomic_cmpxchg_relaxed */
-
 #ifndef atomic_try_cmpxchg
-
-#define __atomic_try_cmpxchg(type, _p, _po, _n)				\
-({									\
+# define __atomic_try_cmpxchg(type, _p, _po, _n)			\
+  ({									\
 	typeof(_po) __po = (_po);					\
 	typeof(*(_po)) __r, __o = *__po;				\
 	__r = atomic_cmpxchg##type((_p), __o, (_n));			\
 	if (unlikely(__r != __o))					\
 		*__po = __r;						\
 	likely(__r == __o);						\
-})
-
-#define atomic_try_cmpxchg(_p, _po, _n)		__atomic_try_cmpxchg(, _p, _po, _n)
-#define atomic_try_cmpxchg_relaxed(_p, _po, _n)	__atomic_try_cmpxchg(_relaxed, _p, _po, _n)
-#define atomic_try_cmpxchg_acquire(_p, _po, _n)	__atomic_try_cmpxchg(_acquire, _p, _po, _n)
-#define atomic_try_cmpxchg_release(_p, _po, _n)	__atomic_try_cmpxchg(_release, _p, _po, _n)
-
-#else /* atomic_try_cmpxchg */
-#define atomic_try_cmpxchg_relaxed	atomic_try_cmpxchg
-#define atomic_try_cmpxchg_acquire	atomic_try_cmpxchg
-#define atomic_try_cmpxchg_release	atomic_try_cmpxchg
-#endif /* atomic_try_cmpxchg */
-
-/* cmpxchg_relaxed */
-#ifndef cmpxchg_relaxed
-#define  cmpxchg_relaxed		cmpxchg
-#define  cmpxchg_acquire		cmpxchg
-#define  cmpxchg_release		cmpxchg
-
-#else /* cmpxchg_relaxed */
-
-#ifndef cmpxchg_acquire
-#define  cmpxchg_acquire(...)						\
-	__atomic_op_acquire(cmpxchg, __VA_ARGS__)
+  })
+# define atomic_try_cmpxchg(_p, _po, _n)	 __atomic_try_cmpxchg(, _p, _po, _n)
+# define atomic_try_cmpxchg_relaxed(_p, _po, _n) __atomic_try_cmpxchg(_relaxed, _p, _po, _n)
+# define atomic_try_cmpxchg_acquire(_p, _po, _n) __atomic_try_cmpxchg(_acquire, _p, _po, _n)
+# define atomic_try_cmpxchg_release(_p, _po, _n) __atomic_try_cmpxchg(_release, _p, _po, _n)
+#else
+# define atomic_try_cmpxchg_relaxed		atomic_try_cmpxchg
+# define atomic_try_cmpxchg_acquire		atomic_try_cmpxchg
+# define atomic_try_cmpxchg_release		atomic_try_cmpxchg
 #endif
 
-#ifndef cmpxchg_release
-#define  cmpxchg_release(...)						\
-	__atomic_op_release(cmpxchg, __VA_ARGS__)
-#endif
+/* cmpxchg_relaxed() et al: */
 
-#ifndef cmpxchg
-#define  cmpxchg(...)							\
-	__atomic_op_fence(cmpxchg, __VA_ARGS__)
-#endif
-#endif /* cmpxchg_relaxed */
+#ifndef cmpxchg_relaxed
+# define cmpxchg_relaxed			cmpxchg
+# define cmpxchg_acquire			cmpxchg
+# define cmpxchg_release			cmpxchg
+#else
+# ifndef cmpxchg_acquire
+#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
+# endif
+# ifndef cmpxchg_release
+#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
+# endif
+# ifndef cmpxchg
+#  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
+# endif
+#endif
+
+/* cmpxchg64_relaxed() et al: */
 
-/* cmpxchg64_relaxed */
 #ifndef cmpxchg64_relaxed
-#define  cmpxchg64_relaxed		cmpxchg64
-#define  cmpxchg64_acquire		cmpxchg64
-#define  cmpxchg64_release		cmpxchg64
-
-#else /* cmpxchg64_relaxed */
-
-#ifndef cmpxchg64_acquire
-#define  cmpxchg64_acquire(...)						\
-	__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
-#endif
-
-#ifndef cmpxchg64_release
-#define  cmpxchg64_release(...)						\
-	__atomic_op_release(cmpxchg64, __VA_ARGS__)
-#endif
-
-#ifndef cmpxchg64
-#define  cmpxchg64(...)							\
-	__atomic_op_fence(cmpxchg64, __VA_ARGS__)
-#endif
-#endif /* cmpxchg64_relaxed */
+# define cmpxchg64_relaxed			cmpxchg64
+# define cmpxchg64_acquire			cmpxchg64
+# define cmpxchg64_release			cmpxchg64
+#else
+# ifndef cmpxchg64_acquire
+#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
+# endif
+# ifndef cmpxchg64_release
+#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
+# endif
+# ifndef cmpxchg64
+#  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
+# endif
+#endif
+
+/* xchg_relaxed() et al: */
 
-/* xchg_relaxed */
 #ifndef xchg_relaxed
-#define  xchg_relaxed			xchg
-#define  xchg_acquire			xchg
-#define  xchg_release			xchg
-
-#else /* xchg_relaxed */
-
-#ifndef xchg_acquire
-#define  xchg_acquire(...)		__atomic_op_acquire(xchg, __VA_ARGS__)
-#endif
-
-#ifndef xchg_release
-#define  xchg_release(...)		__atomic_op_release(xchg, __VA_ARGS__)
+# define xchg_relaxed				xchg
+# define xchg_acquire				xchg
+# define xchg_release				xchg
+#else
+# ifndef xchg_acquire
+#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
+# endif
+# ifndef xchg_release
+#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
+# endif
+# ifndef xchg
+#  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef xchg
-#define  xchg(...)			__atomic_op_fence(xchg, __VA_ARGS__)
-#endif
-#endif /* xchg_relaxed */
-
 /**
  * atomic_add_unless - add unless the number is already a given value
  * @v: pointer of type atomic_t
@@ -541,7 +438,7 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
  * Returns non-zero if @v was non-zero, and zero otherwise.
  */
 #ifndef atomic_inc_not_zero
-#define atomic_inc_not_zero(v)		atomic_add_unless((v), 1, 0)
+# define atomic_inc_not_zero(v)			atomic_add_unless((v), 1, 0)
 #endif
 
 #ifndef atomic_andnot
@@ -607,6 +504,7 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
 static inline int atomic_inc_unless_negative(atomic_t *p)
 {
 	int v, v1;
+
 	for (v = 0; v >= 0; v = v1) {
 		v1 = atomic_cmpxchg(p, v, v + 1);
 		if (likely(v1 == v))
@@ -620,6 +518,7 @@ static inline int atomic_inc_unless_negative(atomic_t *p)
 static inline int atomic_dec_unless_positive(atomic_t *p)
 {
 	int v, v1;
+
 	for (v = 0; v <= 0; v = v1) {
 		v1 = atomic_cmpxchg(p, v, v - 1);
 		if (likely(v1 == v))
@@ -640,6 +539,7 @@ static inline int atomic_dec_unless_positive(atomic_t *p)
 static inline int atomic_dec_if_positive(atomic_t *v)
 {
 	int c, old, dec;
+
 	c = atomic_read(v);
 	for (;;) {
 		dec = c - 1;
@@ -654,400 +554,311 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 }
 #endif
 
-#define atomic_cond_read_relaxed(v, c)	smp_cond_load_relaxed(&(v)->counter, (c))
-#define atomic_cond_read_acquire(v, c)	smp_cond_load_acquire(&(v)->counter, (c))
+#define atomic_cond_read_relaxed(v, c)		smp_cond_load_relaxed(&(v)->counter, (c))
+#define atomic_cond_read_acquire(v, c)		smp_cond_load_acquire(&(v)->counter, (c))
 
 #ifdef CONFIG_GENERIC_ATOMIC64
 #include <asm-generic/atomic64.h>
 #endif
 
 #ifndef atomic64_read_acquire
-#define  atomic64_read_acquire(v)	smp_load_acquire(&(v)->counter)
+# define atomic64_read_acquire(v)		smp_load_acquire(&(v)->counter)
 #endif
 
 #ifndef atomic64_set_release
-#define  atomic64_set_release(v, i)	smp_store_release(&(v)->counter, (i))
-#endif
-
-/* atomic64_add_return_relaxed */
-#ifndef atomic64_add_return_relaxed
-#define  atomic64_add_return_relaxed	atomic64_add_return
-#define  atomic64_add_return_acquire	atomic64_add_return
-#define  atomic64_add_return_release	atomic64_add_return
-
-#else /* atomic64_add_return_relaxed */
-
-#ifndef atomic64_add_return_acquire
-#define  atomic64_add_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+# define atomic64_set_release(v, i)		smp_store_release(&(v)->counter, (i))
 #endif
 
-#ifndef atomic64_add_return_release
-#define  atomic64_add_return_release(...)				\
-	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
-#endif
+/* atomic64_add_return_relaxed() et al: */
 
-#ifndef atomic64_add_return
-#define  atomic64_add_return(...)					\
-	__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_add_return_relaxed */
+#ifndef atomic64_add_return_relaxed
+# define atomic64_add_return_relaxed		atomic64_add_return
+# define atomic64_add_return_acquire		atomic64_add_return
+# define atomic64_add_return_release		atomic64_add_return
+#else
+# ifndef atomic64_add_return_acquire
+#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_add_return_release
+#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_add_return
+#  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_inc_return_relaxed() et al: */
 
-/* atomic64_inc_return_relaxed */
 #ifndef atomic64_inc_return_relaxed
-#define  atomic64_inc_return_relaxed	atomic64_inc_return
-#define  atomic64_inc_return_acquire	atomic64_inc_return
-#define  atomic64_inc_return_release	atomic64_inc_return
-
-#else /* atomic64_inc_return_relaxed */
-
-#ifndef atomic64_inc_return_acquire
-#define  atomic64_inc_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return_release
-#define  atomic64_inc_return_release(...)				\
-	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return
-#define  atomic64_inc_return(...)					\
-	__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_inc_return_relaxed */
-
+# define atomic64_inc_return_relaxed		atomic64_inc_return
+# define atomic64_inc_return_acquire		atomic64_inc_return
+# define atomic64_inc_return_release		atomic64_inc_return
+#else
+# ifndef atomic64_inc_return_acquire
+#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_inc_return_release
+#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_inc_return
+#  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_sub_return_relaxed() et al: */
 
-/* atomic64_sub_return_relaxed */
 #ifndef atomic64_sub_return_relaxed
-#define  atomic64_sub_return_relaxed	atomic64_sub_return
-#define  atomic64_sub_return_acquire	atomic64_sub_return
-#define  atomic64_sub_return_release	atomic64_sub_return
+# define atomic64_sub_return_relaxed		atomic64_sub_return
+# define atomic64_sub_return_acquire		atomic64_sub_return
+# define atomic64_sub_return_release		atomic64_sub_return
+#else
+# ifndef atomic64_sub_return_acquire
+#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_sub_return_release
+#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_sub_return
+#  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_dec_return_relaxed() et al: */
 
-#else /* atomic64_sub_return_relaxed */
-
-#ifndef atomic64_sub_return_acquire
-#define  atomic64_sub_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return_release
-#define  atomic64_sub_return_release(...)				\
-	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return
-#define  atomic64_sub_return(...)					\
-	__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_sub_return_relaxed */
-
-/* atomic64_dec_return_relaxed */
 #ifndef atomic64_dec_return_relaxed
-#define  atomic64_dec_return_relaxed	atomic64_dec_return
-#define  atomic64_dec_return_acquire	atomic64_dec_return
-#define  atomic64_dec_return_release	atomic64_dec_return
-
-#else /* atomic64_dec_return_relaxed */
-
-#ifndef atomic64_dec_return_acquire
-#define  atomic64_dec_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return_release
-#define  atomic64_dec_return_release(...)				\
-	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return
-#define  atomic64_dec_return(...)					\
-	__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_dec_return_relaxed */
+# define atomic64_dec_return_relaxed		atomic64_dec_return
+# define atomic64_dec_return_acquire		atomic64_dec_return
+# define atomic64_dec_return_release		atomic64_dec_return
+#else
+# ifndef atomic64_dec_return_acquire
+#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_dec_return_release
+#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
+# endif
+# ifndef atomic64_dec_return
+#  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_add_relaxed() et al: */
 
-
-/* atomic64_fetch_add_relaxed */
 #ifndef atomic64_fetch_add_relaxed
-#define atomic64_fetch_add_relaxed	atomic64_fetch_add
-#define atomic64_fetch_add_acquire	atomic64_fetch_add
-#define atomic64_fetch_add_release	atomic64_fetch_add
-
-#else /* atomic64_fetch_add_relaxed */
-
-#ifndef atomic64_fetch_add_acquire
-#define atomic64_fetch_add_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
-#endif
+# define atomic64_fetch_add_relaxed		atomic64_fetch_add
+# define atomic64_fetch_add_acquire		atomic64_fetch_add
+# define atomic64_fetch_add_release		atomic64_fetch_add
+#else
+# ifndef atomic64_fetch_add_acquire
+#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_add_release
+#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_add
+#  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_inc_relaxed() et al: */
 
-#ifndef atomic64_fetch_add_release
-#define atomic64_fetch_add_release(...)					\
-	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_add
-#define atomic64_fetch_add(...)						\
-	__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_add_relaxed */
-
-/* atomic64_fetch_inc_relaxed */
 #ifndef atomic64_fetch_inc_relaxed
+# ifndef atomic64_fetch_inc
+#  define atomic64_fetch_inc(v)			atomic64_fetch_add(1, (v))
+#  define atomic64_fetch_inc_relaxed(v)		atomic64_fetch_add_relaxed(1, (v))
+#  define atomic64_fetch_inc_acquire(v)		atomic64_fetch_add_acquire(1, (v))
+#  define atomic64_fetch_inc_release(v)		atomic64_fetch_add_release(1, (v))
+# else
+#  define atomic64_fetch_inc_relaxed		atomic64_fetch_inc
+#  define atomic64_fetch_inc_acquire		atomic64_fetch_inc
+#  define atomic64_fetch_inc_release		atomic64_fetch_inc
+# endif
+#else
+# ifndef atomic64_fetch_inc_acquire
+#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_inc_release
+#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_inc
+#  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_sub_relaxed() et al: */
 
-#ifndef atomic64_fetch_inc
-#define atomic64_fetch_inc(v)		atomic64_fetch_add(1, (v))
-#define atomic64_fetch_inc_relaxed(v)	atomic64_fetch_add_relaxed(1, (v))
-#define atomic64_fetch_inc_acquire(v)	atomic64_fetch_add_acquire(1, (v))
-#define atomic64_fetch_inc_release(v)	atomic64_fetch_add_release(1, (v))
-#else /* atomic64_fetch_inc */
-#define atomic64_fetch_inc_relaxed	atomic64_fetch_inc
-#define atomic64_fetch_inc_acquire	atomic64_fetch_inc
-#define atomic64_fetch_inc_release	atomic64_fetch_inc
-#endif /* atomic64_fetch_inc */
-
-#else /* atomic64_fetch_inc_relaxed */
-
-#ifndef atomic64_fetch_inc_acquire
-#define atomic64_fetch_inc_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_inc_release
-#define atomic64_fetch_inc_release(...)					\
-	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_inc
-#define atomic64_fetch_inc(...)						\
-	__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_inc_relaxed */
-
-/* atomic64_fetch_sub_relaxed */
 #ifndef atomic64_fetch_sub_relaxed
-#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub
-#define atomic64_fetch_sub_acquire	atomic64_fetch_sub
-#define atomic64_fetch_sub_release	atomic64_fetch_sub
-
-#else /* atomic64_fetch_sub_relaxed */
-
-#ifndef atomic64_fetch_sub_acquire
-#define atomic64_fetch_sub_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_sub_release
-#define atomic64_fetch_sub_release(...)					\
-	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_sub
-#define atomic64_fetch_sub(...)						\
-	__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_sub_relaxed */
+# define atomic64_fetch_sub_relaxed		atomic64_fetch_sub
+# define atomic64_fetch_sub_acquire		atomic64_fetch_sub
+# define atomic64_fetch_sub_release		atomic64_fetch_sub
+#else
+# ifndef atomic64_fetch_sub_acquire
+#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_sub_release
+#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_sub
+#  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_dec_relaxed() et al: */
 
-/* atomic64_fetch_dec_relaxed */
 #ifndef atomic64_fetch_dec_relaxed
+# ifndef atomic64_fetch_dec
+#  define atomic64_fetch_dec(v)			atomic64_fetch_sub(1, (v))
+#  define atomic64_fetch_dec_relaxed(v)		atomic64_fetch_sub_relaxed(1, (v))
+#  define atomic64_fetch_dec_acquire(v)		atomic64_fetch_sub_acquire(1, (v))
+#  define atomic64_fetch_dec_release(v)		atomic64_fetch_sub_release(1, (v))
+# else
+#  define atomic64_fetch_dec_relaxed		atomic64_fetch_dec
+#  define atomic64_fetch_dec_acquire		atomic64_fetch_dec
+#  define atomic64_fetch_dec_release		atomic64_fetch_dec
+# endif
+#else
+# ifndef atomic64_fetch_dec_acquire
+#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_dec_release
+#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_dec
+#  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_fetch_or_relaxed() et al: */
 
-#ifndef atomic64_fetch_dec
-#define atomic64_fetch_dec(v)		atomic64_fetch_sub(1, (v))
-#define atomic64_fetch_dec_relaxed(v)	atomic64_fetch_sub_relaxed(1, (v))
-#define atomic64_fetch_dec_acquire(v)	atomic64_fetch_sub_acquire(1, (v))
-#define atomic64_fetch_dec_release(v)	atomic64_fetch_sub_release(1, (v))
-#else /* atomic64_fetch_dec */
-#define atomic64_fetch_dec_relaxed	atomic64_fetch_dec
-#define atomic64_fetch_dec_acquire	atomic64_fetch_dec
-#define atomic64_fetch_dec_release	atomic64_fetch_dec
-#endif /* atomic64_fetch_dec */
-
-#else /* atomic64_fetch_dec_relaxed */
-
-#ifndef atomic64_fetch_dec_acquire
-#define atomic64_fetch_dec_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_dec_release
-#define atomic64_fetch_dec_release(...)					\
-	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_dec
-#define atomic64_fetch_dec(...)						\
-	__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_dec_relaxed */
-
-/* atomic64_fetch_or_relaxed */
 #ifndef atomic64_fetch_or_relaxed
-#define atomic64_fetch_or_relaxed	atomic64_fetch_or
-#define atomic64_fetch_or_acquire	atomic64_fetch_or
-#define atomic64_fetch_or_release	atomic64_fetch_or
-
-#else /* atomic64_fetch_or_relaxed */
-
-#ifndef atomic64_fetch_or_acquire
-#define atomic64_fetch_or_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+# define atomic64_fetch_or_relaxed		atomic64_fetch_or
+# define atomic64_fetch_or_acquire		atomic64_fetch_or
+# define atomic64_fetch_or_release		atomic64_fetch_or
+#else
+# ifndef atomic64_fetch_or_acquire
+#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_or_release
+#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_or
+#  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic64_fetch_or_release
-#define atomic64_fetch_or_release(...)					\
-	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
-#endif
 
-#ifndef atomic64_fetch_or
-#define atomic64_fetch_or(...)						\
-	__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_or_relaxed */
+/* atomic64_fetch_and_relaxed() et al: */
 
-/* atomic64_fetch_and_relaxed */
 #ifndef atomic64_fetch_and_relaxed
-#define atomic64_fetch_and_relaxed	atomic64_fetch_and
-#define atomic64_fetch_and_acquire	atomic64_fetch_and
-#define atomic64_fetch_and_release	atomic64_fetch_and
-
-#else /* atomic64_fetch_and_relaxed */
-
-#ifndef atomic64_fetch_and_acquire
-#define atomic64_fetch_and_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+# define atomic64_fetch_and_relaxed		atomic64_fetch_and
+# define atomic64_fetch_and_acquire		atomic64_fetch_and
+# define atomic64_fetch_and_release		atomic64_fetch_and
+#else
+# ifndef atomic64_fetch_and_acquire
+#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_and_release
+#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_and
+#  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic64_fetch_and_release
-#define atomic64_fetch_and_release(...)					\
-	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_and
-#define atomic64_fetch_and(...)						\
-	__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_and_relaxed */
-
 #ifdef atomic64_andnot
-/* atomic64_fetch_andnot_relaxed */
-#ifndef atomic64_fetch_andnot_relaxed
-#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot
-#define atomic64_fetch_andnot_acquire	atomic64_fetch_andnot
-#define atomic64_fetch_andnot_release	atomic64_fetch_andnot
-
-#else /* atomic64_fetch_andnot_relaxed */
 
-#ifndef atomic64_fetch_andnot_acquire
-#define atomic64_fetch_andnot_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-#endif
+/* atomic64_fetch_andnot_relaxed() et al: */
 
-#ifndef atomic64_fetch_andnot_release
-#define atomic64_fetch_andnot_release(...)					\
-	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+#ifndef atomic64_fetch_andnot_relaxed
+# define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_release		atomic64_fetch_andnot
+#else
+# ifndef atomic64_fetch_andnot_acquire
+#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_andnot_release
+#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_andnot
+#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
 #endif
 
-#ifndef atomic64_fetch_andnot
-#define atomic64_fetch_andnot(...)						\
-	__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
-#endif
-#endif /* atomic64_fetch_andnot_relaxed */
 #endif /* atomic64_andnot */
 
-/* atomic64_fetch_xor_relaxed */
-#ifndef atomic64_fetch_xor_relaxed
-#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor
-#define atomic64_fetch_xor_acquire	atomic64_fetch_xor
-#define atomic64_fetch_xor_release	atomic64_fetch_xor
-
-#else /* atomic64_fetch_xor_relaxed */
+/* atomic64_fetch_xor_relaxed() et al: */
 
-#ifndef atomic64_fetch_xor_acquire
-#define atomic64_fetch_xor_acquire(...)					\
-	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_fetch_xor_release
-#define atomic64_fetch_xor_release(...)					\
-	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+#ifndef atomic64_fetch_xor_relaxed
+# define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
+# define atomic64_fetch_xor_acquire		atomic64_fetch_xor
+# define atomic64_fetch_xor_release		atomic64_fetch_xor
+#else
+# ifndef atomic64_fetch_xor_acquire
+#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_xor_release
+#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+# endif
+# ifndef atomic64_fetch_xor
+#  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
 #endif
-
-#ifndef atomic64_fetch_xor
-#define atomic64_fetch_xor(...)						\
-	__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
 #endif
-#endif /* atomic64_fetch_xor_relaxed */
 
+/* atomic64_xchg_relaxed() et al: */
 
-/* atomic64_xchg_relaxed */
 #ifndef atomic64_xchg_relaxed
-#define  atomic64_xchg_relaxed		atomic64_xchg
-#define  atomic64_xchg_acquire		atomic64_xchg
-#define  atomic64_xchg_release		atomic64_xchg
-
-#else /* atomic64_xchg_relaxed */
-
-#ifndef atomic64_xchg_acquire
-#define  atomic64_xchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-#endif
+# define atomic64_xchg_relaxed			atomic64_xchg
+# define atomic64_xchg_acquire			atomic64_xchg
+# define atomic64_xchg_release			atomic64_xchg
+#else
+# ifndef atomic64_xchg_acquire
+#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_xchg_release
+#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_xchg
+#  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
+# endif
+#endif
+
+/* atomic64_cmpxchg_relaxed() et al: */
 
-#ifndef atomic64_xchg_release
-#define  atomic64_xchg_release(...)					\
-	__atomic_op_release(atomic64_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_xchg
-#define  atomic64_xchg(...)						\
-	__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
-#endif
-#endif /* atomic64_xchg_relaxed */
-
-/* atomic64_cmpxchg_relaxed */
 #ifndef atomic64_cmpxchg_relaxed
-#define  atomic64_cmpxchg_relaxed	atomic64_cmpxchg
-#define  atomic64_cmpxchg_acquire	atomic64_cmpxchg
-#define  atomic64_cmpxchg_release	atomic64_cmpxchg
-
-#else /* atomic64_cmpxchg_relaxed */
-
-#ifndef atomic64_cmpxchg_acquire
-#define  atomic64_cmpxchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg_release
-#define  atomic64_cmpxchg_release(...)					\
-	__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg
-#define  atomic64_cmpxchg(...)						\
-	__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+# define atomic64_cmpxchg_relaxed		atomic64_cmpxchg
+# define atomic64_cmpxchg_acquire		atomic64_cmpxchg
+# define atomic64_cmpxchg_release		atomic64_cmpxchg
+#else
+# ifndef atomic64_cmpxchg_acquire
+#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_cmpxchg_release
+#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
+# endif
+# ifndef atomic64_cmpxchg
+#  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+# endif
 #endif
-#endif /* atomic64_cmpxchg_relaxed */
 
 #ifndef atomic64_try_cmpxchg
-
-#define __atomic64_try_cmpxchg(type, _p, _po, _n)			\
-({									\
+# define __atomic64_try_cmpxchg(type, _p, _po, _n)			\
+  ({									\
 	typeof(_po) __po = (_po);					\
 	typeof(*(_po)) __r, __o = *__po;				\
 	__r = atomic64_cmpxchg##type((_p), __o, (_n));			\
 	if (unlikely(__r != __o))					\
 		*__po = __r;						\
 	likely(__r == __o);						\
-})
-
-#define atomic64_try_cmpxchg(_p, _po, _n)		__atomic64_try_cmpxchg(, _p, _po, _n)
-#define atomic64_try_cmpxchg_relaxed(_p, _po, _n)	__atomic64_try_cmpxchg(_relaxed, _p, _po, _n)
-#define atomic64_try_cmpxchg_acquire(_p, _po, _n)	__atomic64_try_cmpxchg(_acquire, _p, _po, _n)
-#define atomic64_try_cmpxchg_release(_p, _po, _n)	__atomic64_try_cmpxchg(_release, _p, _po, _n)
-
-#else /* atomic64_try_cmpxchg */
-#define atomic64_try_cmpxchg_relaxed	atomic64_try_cmpxchg
-#define atomic64_try_cmpxchg_acquire	atomic64_try_cmpxchg
-#define atomic64_try_cmpxchg_release	atomic64_try_cmpxchg
-#endif /* atomic64_try_cmpxchg */
+  })
+# define atomic64_try_cmpxchg(_p, _po, _n)	   __atomic64_try_cmpxchg(, _p, _po, _n)
+# define atomic64_try_cmpxchg_relaxed(_p, _po, _n) __atomic64_try_cmpxchg(_relaxed, _p, _po, _n)
+# define atomic64_try_cmpxchg_acquire(_p, _po, _n) __atomic64_try_cmpxchg(_acquire, _p, _po, _n)
+# define atomic64_try_cmpxchg_release(_p, _po, _n) __atomic64_try_cmpxchg(_release, _p, _po, _n)
+#else
+# define atomic64_try_cmpxchg_relaxed		atomic64_try_cmpxchg
+# define atomic64_try_cmpxchg_acquire		atomic64_try_cmpxchg
+# define atomic64_try_cmpxchg_release		atomic64_try_cmpxchg
+#endif
 
 #ifndef atomic64_andnot
 static inline void atomic64_andnot(long long i, atomic64_t *v)

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-05  8:36           ` Ingo Molnar
@ 2018-05-06 12:14           ` tip-bot for Ingo Molnar
  2018-05-09  7:33             ` Peter Zijlstra
  -1 siblings, 1 reply; 103+ messages in thread
From: tip-bot for Ingo Molnar @ 2018-05-06 12:14 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, akpm, will.deacon, mark.rutland, paulmck, torvalds,
	a.p.zijlstra, tglx, hpa, mingo

Commit-ID:  87d655a48dfe74293f72dc001ed042142cf00d44
Gitweb:     https://git.kernel.org/tip/87d655a48dfe74293f72dc001ed042142cf00d44
Author:     Ingo Molnar <mingo@kernel.org>
AuthorDate: Sat, 5 May 2018 10:36:35 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 5 May 2018 15:22:44 +0200

locking/atomics: Simplify the op definitions in atomic.h some more

Before:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec_acquire
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec_release
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

After:

 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
 #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
 #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
 #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
 # else
 #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
 #  define atomic_fetch_dec_acquire		atomic_fetch_dec
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
 #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif

The idea is that, because we already group these APIs by certain defines
such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
branches, we can do the same in the secondary branch as well.
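
Concretely (an illustrative sketch, not part of the patch): on an
architecture that supplies only atomic_fetch_dec_relaxed(), the grouped
fallback defines all three ordered forms from that one primitive, so a
call like atomic_fetch_dec(v) expands to roughly:

 	smp_mb__before_atomic();		/* order accesses before the RMW */
 	ret = atomic_fetch_dec_relaxed(v);	/* arch-provided relaxed primitive */
 	smp_mb__after_atomic();			/* order accesses after the RMW */

(Here 'ret' is just a local used for illustration; the sketch also assumes
an architecture provides either all of the ordered variants itself or only
the relaxed one.)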

( Also remove some unnecessarily duplicated comments, as the API
  group defines are now pretty much self-documenting. )

No change in functionality.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aryabinin@virtuozzo.com
Cc: boqun.feng@gmail.com
Cc: catalin.marinas@arm.com
Cc: dvyukov@google.com
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20180505083635.622xmcvb42dw5xxh@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 312 ++++++++++---------------------------------------
 1 file changed, 62 insertions(+), 250 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 12f4ad559ab1..352ecc72d7f5 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -71,98 +71,66 @@
 })
 #endif
 
-/* atomic_add_return_relaxed() et al: */
-
 #ifndef atomic_add_return_relaxed
 # define atomic_add_return_relaxed		atomic_add_return
 # define atomic_add_return_acquire		atomic_add_return
 # define atomic_add_return_release		atomic_add_return
 #else
-# ifndef atomic_add_return_acquire
-#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
-# endif
-# ifndef atomic_add_return_release
-#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
-# endif
 # ifndef atomic_add_return
 #  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_inc_return_relaxed() et al: */
-
 #ifndef atomic_inc_return_relaxed
 # define atomic_inc_return_relaxed		atomic_inc_return
 # define atomic_inc_return_acquire		atomic_inc_return
 # define atomic_inc_return_release		atomic_inc_return
 #else
-# ifndef atomic_inc_return_acquire
-#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
-# endif
-# ifndef atomic_inc_return_release
-#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
-# endif
 # ifndef atomic_inc_return
 #  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_sub_return_relaxed() et al: */
-
 #ifndef atomic_sub_return_relaxed
 # define atomic_sub_return_relaxed		atomic_sub_return
 # define atomic_sub_return_acquire		atomic_sub_return
 # define atomic_sub_return_release		atomic_sub_return
 #else
-# ifndef atomic_sub_return_acquire
-#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
-# endif
-# ifndef atomic_sub_return_release
-#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
-# endif
 # ifndef atomic_sub_return
 #  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_dec_return_relaxed() et al: */
-
 #ifndef atomic_dec_return_relaxed
 # define atomic_dec_return_relaxed		atomic_dec_return
 # define atomic_dec_return_acquire		atomic_dec_return
 # define atomic_dec_return_release		atomic_dec_return
 #else
-# ifndef atomic_dec_return_acquire
-#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
-# endif
-# ifndef atomic_dec_return_release
-#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
-# endif
 # ifndef atomic_dec_return
 #  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_add_relaxed() et al: */
-
 #ifndef atomic_fetch_add_relaxed
 # define atomic_fetch_add_relaxed		atomic_fetch_add
 # define atomic_fetch_add_acquire		atomic_fetch_add
 # define atomic_fetch_add_release		atomic_fetch_add
 #else
-# ifndef atomic_fetch_add_acquire
-#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_add_release
-#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_add
 #  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_inc_relaxed() et al: */
-
 #ifndef atomic_fetch_inc_relaxed
 # ifndef atomic_fetch_inc
 #  define atomic_fetch_inc(v)			atomic_fetch_add(1, (v))
@@ -175,37 +143,25 @@
 #  define atomic_fetch_inc_release		atomic_fetch_inc
 # endif
 #else
-# ifndef atomic_fetch_inc_acquire
-#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_inc_release
-#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_inc
 #  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_sub_relaxed() et al: */
-
 #ifndef atomic_fetch_sub_relaxed
 # define atomic_fetch_sub_relaxed		atomic_fetch_sub
 # define atomic_fetch_sub_acquire		atomic_fetch_sub
 # define atomic_fetch_sub_release		atomic_fetch_sub
 #else
-# ifndef atomic_fetch_sub_acquire
-#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_sub_release
-#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_sub
 #  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_dec_relaxed() et al: */
-
 #ifndef atomic_fetch_dec_relaxed
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
@@ -218,127 +174,86 @@
 #  define atomic_fetch_dec_release		atomic_fetch_dec
 # endif
 #else
-# ifndef atomic_fetch_dec_acquire
-#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_dec_release
-#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_dec
 #  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_or_relaxed() et al: */
-
 #ifndef atomic_fetch_or_relaxed
 # define atomic_fetch_or_relaxed		atomic_fetch_or
 # define atomic_fetch_or_acquire		atomic_fetch_or
 # define atomic_fetch_or_release		atomic_fetch_or
 #else
-# ifndef atomic_fetch_or_acquire
-#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_or_release
-#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_or
 #  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_fetch_and_relaxed() et al: */
-
 #ifndef atomic_fetch_and_relaxed
 # define atomic_fetch_and_relaxed		atomic_fetch_and
 # define atomic_fetch_and_acquire		atomic_fetch_and
 # define atomic_fetch_and_release		atomic_fetch_and
 #else
-# ifndef atomic_fetch_and_acquire
-#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_and_release
-#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_and
 #  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
 #ifdef atomic_andnot
 
-/* atomic_fetch_andnot_relaxed() et al: */
-
 #ifndef atomic_fetch_andnot_relaxed
 # define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
 # define atomic_fetch_andnot_acquire		atomic_fetch_andnot
 # define atomic_fetch_andnot_release		atomic_fetch_andnot
 #else
-# ifndef atomic_fetch_andnot_acquire
-#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_andnot_release
-#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_andnot
 #  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 
 #endif /* atomic_andnot */
 
-/* atomic_fetch_xor_relaxed() et al: */
-
 #ifndef atomic_fetch_xor_relaxed
 # define atomic_fetch_xor_relaxed		atomic_fetch_xor
 # define atomic_fetch_xor_acquire		atomic_fetch_xor
 # define atomic_fetch_xor_release		atomic_fetch_xor
 #else
-# ifndef atomic_fetch_xor_acquire
-#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
-# endif
-# ifndef atomic_fetch_xor_release
-#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
-# endif
 # ifndef atomic_fetch_xor
 #  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
-
-/* atomic_xchg_relaxed() et al: */
-
 #ifndef atomic_xchg_relaxed
 #define atomic_xchg_relaxed			atomic_xchg
 #define atomic_xchg_acquire			atomic_xchg
 #define atomic_xchg_release			atomic_xchg
 #else
-# ifndef atomic_xchg_acquire
-#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
-# endif
-# ifndef atomic_xchg_release
-#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
-# endif
 # ifndef atomic_xchg
 #  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic_cmpxchg_relaxed() et al: */
-
 #ifndef atomic_cmpxchg_relaxed
 # define atomic_cmpxchg_relaxed			atomic_cmpxchg
 # define atomic_cmpxchg_acquire			atomic_cmpxchg
 # define atomic_cmpxchg_release			atomic_cmpxchg
 #else
-# ifndef atomic_cmpxchg_acquire
-#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
-# endif
-# ifndef atomic_cmpxchg_release
-#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
-# endif
 # ifndef atomic_cmpxchg
 #  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -362,57 +277,39 @@
 # define atomic_try_cmpxchg_release		atomic_try_cmpxchg
 #endif
 
-/* cmpxchg_relaxed() et al: */
-
 #ifndef cmpxchg_relaxed
 # define cmpxchg_relaxed			cmpxchg
 # define cmpxchg_acquire			cmpxchg
 # define cmpxchg_release			cmpxchg
 #else
-# ifndef cmpxchg_acquire
-#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
-# endif
-# ifndef cmpxchg_release
-#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
-# endif
 # ifndef cmpxchg
 #  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
-/* cmpxchg64_relaxed() et al: */
-
 #ifndef cmpxchg64_relaxed
 # define cmpxchg64_relaxed			cmpxchg64
 # define cmpxchg64_acquire			cmpxchg64
 # define cmpxchg64_release			cmpxchg64
 #else
-# ifndef cmpxchg64_acquire
-#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
-# endif
-# ifndef cmpxchg64_release
-#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
-# endif
 # ifndef cmpxchg64
 #  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
 # endif
 #endif
 
-/* xchg_relaxed() et al: */
-
 #ifndef xchg_relaxed
 # define xchg_relaxed				xchg
 # define xchg_acquire				xchg
 # define xchg_release				xchg
 #else
-# ifndef xchg_acquire
-#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
-# endif
-# ifndef xchg_release
-#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
-# endif
 # ifndef xchg
 #  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
+#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
+#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -569,98 +466,66 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_set_release(v, i)		smp_store_release(&(v)->counter, (i))
 #endif
 
-/* atomic64_add_return_relaxed() et al: */
-
 #ifndef atomic64_add_return_relaxed
 # define atomic64_add_return_relaxed		atomic64_add_return
 # define atomic64_add_return_acquire		atomic64_add_return
 # define atomic64_add_return_release		atomic64_add_return
 #else
-# ifndef atomic64_add_return_acquire
-#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_add_return_release
-#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_add_return
 #  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_inc_return_relaxed() et al: */
-
 #ifndef atomic64_inc_return_relaxed
 # define atomic64_inc_return_relaxed		atomic64_inc_return
 # define atomic64_inc_return_acquire		atomic64_inc_return
 # define atomic64_inc_return_release		atomic64_inc_return
 #else
-# ifndef atomic64_inc_return_acquire
-#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_inc_return_release
-#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_inc_return
 #  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_sub_return_relaxed() et al: */
-
 #ifndef atomic64_sub_return_relaxed
 # define atomic64_sub_return_relaxed		atomic64_sub_return
 # define atomic64_sub_return_acquire		atomic64_sub_return
 # define atomic64_sub_return_release		atomic64_sub_return
 #else
-# ifndef atomic64_sub_return_acquire
-#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_sub_return_release
-#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_sub_return
 #  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_dec_return_relaxed() et al: */
-
 #ifndef atomic64_dec_return_relaxed
 # define atomic64_dec_return_relaxed		atomic64_dec_return
 # define atomic64_dec_return_acquire		atomic64_dec_return
 # define atomic64_dec_return_release		atomic64_dec_return
 #else
-# ifndef atomic64_dec_return_acquire
-#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-# endif
-# ifndef atomic64_dec_return_release
-#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
-# endif
 # ifndef atomic64_dec_return
 #  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_add_relaxed() et al: */
-
 #ifndef atomic64_fetch_add_relaxed
 # define atomic64_fetch_add_relaxed		atomic64_fetch_add
 # define atomic64_fetch_add_acquire		atomic64_fetch_add
 # define atomic64_fetch_add_release		atomic64_fetch_add
 #else
-# ifndef atomic64_fetch_add_acquire
-#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_add_release
-#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_add
 #  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_inc_relaxed() et al: */
-
 #ifndef atomic64_fetch_inc_relaxed
 # ifndef atomic64_fetch_inc
 #  define atomic64_fetch_inc(v)			atomic64_fetch_add(1, (v))
@@ -673,37 +538,25 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 #  define atomic64_fetch_inc_release		atomic64_fetch_inc
 # endif
 #else
-# ifndef atomic64_fetch_inc_acquire
-#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_inc_release
-#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_inc
 #  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_sub_relaxed() et al: */
-
 #ifndef atomic64_fetch_sub_relaxed
 # define atomic64_fetch_sub_relaxed		atomic64_fetch_sub
 # define atomic64_fetch_sub_acquire		atomic64_fetch_sub
 # define atomic64_fetch_sub_release		atomic64_fetch_sub
 #else
-# ifndef atomic64_fetch_sub_acquire
-#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_sub_release
-#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_sub
 #  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_dec_relaxed() et al: */
-
 #ifndef atomic64_fetch_dec_relaxed
 # ifndef atomic64_fetch_dec
 #  define atomic64_fetch_dec(v)			atomic64_fetch_sub(1, (v))
@@ -716,127 +569,86 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 #  define atomic64_fetch_dec_release		atomic64_fetch_dec
 # endif
 #else
-# ifndef atomic64_fetch_dec_acquire
-#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_dec_release
-#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_dec
 #  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_fetch_or_relaxed() et al: */
-
 #ifndef atomic64_fetch_or_relaxed
 # define atomic64_fetch_or_relaxed		atomic64_fetch_or
 # define atomic64_fetch_or_acquire		atomic64_fetch_or
 # define atomic64_fetch_or_release		atomic64_fetch_or
 #else
-# ifndef atomic64_fetch_or_acquire
-#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_or_release
-#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_or
 #  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
-
-/* atomic64_fetch_and_relaxed() et al: */
-
 #ifndef atomic64_fetch_and_relaxed
 # define atomic64_fetch_and_relaxed		atomic64_fetch_and
 # define atomic64_fetch_and_acquire		atomic64_fetch_and
 # define atomic64_fetch_and_release		atomic64_fetch_and
 #else
-# ifndef atomic64_fetch_and_acquire
-#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_and_release
-#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_and
 #  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
 #ifdef atomic64_andnot
 
-/* atomic64_fetch_andnot_relaxed() et al: */
-
 #ifndef atomic64_fetch_andnot_relaxed
 # define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
 # define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
 # define atomic64_fetch_andnot_release		atomic64_fetch_andnot
 #else
-# ifndef atomic64_fetch_andnot_acquire
-#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_andnot_release
-#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
-# endif
 # ifndef atomic64_fetch_andnot
 #  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 
 #endif /* atomic64_andnot */
 
-/* atomic64_fetch_xor_relaxed() et al: */
-
 #ifndef atomic64_fetch_xor_relaxed
 # define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
 # define atomic64_fetch_xor_acquire		atomic64_fetch_xor
 # define atomic64_fetch_xor_release		atomic64_fetch_xor
 #else
-# ifndef atomic64_fetch_xor_acquire
+# ifndef atomic64_fetch_xor
+#  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
 #  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
-# endif
-# ifndef atomic64_fetch_xor_release
 #  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
 # endif
-# ifndef atomic64_fetch_xor
-#  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
-#endif
 #endif
 
-/* atomic64_xchg_relaxed() et al: */
-
 #ifndef atomic64_xchg_relaxed
 # define atomic64_xchg_relaxed			atomic64_xchg
 # define atomic64_xchg_acquire			atomic64_xchg
 # define atomic64_xchg_release			atomic64_xchg
 #else
-# ifndef atomic64_xchg_acquire
-#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-# endif
-# ifndef atomic64_xchg_release
-#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
-# endif
 # ifndef atomic64_xchg
 #  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
 # endif
 #endif
 
-/* atomic64_cmpxchg_relaxed() et al: */
-
 #ifndef atomic64_cmpxchg_relaxed
 # define atomic64_cmpxchg_relaxed		atomic64_cmpxchg
 # define atomic64_cmpxchg_acquire		atomic64_cmpxchg
 # define atomic64_cmpxchg_release		atomic64_cmpxchg
 #else
-# ifndef atomic64_cmpxchg_acquire
-#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-# endif
-# ifndef atomic64_cmpxchg_release
-#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
-# endif
 # ifndef atomic64_cmpxchg
 #  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [tip:locking/core] locking/atomics: Combine the atomic_andnot() and atomic64_andnot() API definitions
  2018-05-05  8:54             ` Ingo Molnar
@ 2018-05-06 12:15             ` tip-bot for Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: tip-bot for Ingo Molnar @ 2018-05-06 12:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, torvalds, will.deacon, akpm, peterz, tglx, paulmck, hpa,
	mark.rutland, linux-kernel

Commit-ID:  7b9b2e57c7edaeac5404f39c5974ff227540d41e
Gitweb:     https://git.kernel.org/tip/7b9b2e57c7edaeac5404f39c5974ff227540d41e
Author:     Ingo Molnar <mingo@kernel.org>
AuthorDate: Sat, 5 May 2018 10:54:45 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 5 May 2018 15:22:45 +0200

locking/atomics: Combine the atomic_andnot() and atomic64_andnot() API definitions

The atomic_andnot() and atomic64_andnot() APIs are defined in 4 separate
groups spread out across the atomic.h header:

 #ifdef atomic_andnot
 ...
 #endif /* atomic_andnot */
 ...
 #ifndef atomic_andnot
 ...
 #endif
 ...
 #ifdef atomic64_andnot
 ...
 #endif /* atomic64_andnot */
 ...
 #ifndef atomic64_andnot
 ...
 #endif

Unify them into two groups:

 #ifdef atomic_andnot
 #else
 #endif

 ...

 #ifdef atomic64_andnot
 #else
 #endif

This way, each API group is defined in a single place within the header.
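
For illustration, the resulting single group for atomic_andnot() looks
roughly like this (condensed from the diff below; the atomic64_*() group
is analogous):

 #ifdef atomic_andnot

 /* The arch provides atomic_andnot(); fill in missing ordering variants: */
 #ifndef atomic_fetch_andnot_relaxed
 # define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
 # define atomic_fetch_andnot_acquire		atomic_fetch_andnot
 # define atomic_fetch_andnot_release		atomic_fetch_andnot
 #else
 # ifndef atomic_fetch_andnot
 #  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
 #  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
 #  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
 # endif
 #endif

 #else /* !atomic_andnot: */

 /* No arch atomic_andnot(); fall back to atomic_and() with an inverted mask: */
 static inline void atomic_andnot(int i, atomic_t *v)
 {
 	atomic_and(~i, v);
 }

 #endif /* !atomic_andnot */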

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aryabinin@virtuozzo.com
Cc: boqun.feng@gmail.com
Cc: catalin.marinas@arm.com
Cc: dvyukov@google.com
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20180505085445.cmdnqh6xpnpfoqzb@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/atomic.h | 72 +++++++++++++++++++++++++-------------------------
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 352ecc72d7f5..1176cf7c6f03 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -205,22 +205,6 @@
 # endif
 #endif
 
-#ifdef atomic_andnot
-
-#ifndef atomic_fetch_andnot_relaxed
-# define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
-# define atomic_fetch_andnot_acquire		atomic_fetch_andnot
-# define atomic_fetch_andnot_release		atomic_fetch_andnot
-#else
-# ifndef atomic_fetch_andnot
-#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
-# endif
-#endif
-
-#endif /* atomic_andnot */
-
 #ifndef atomic_fetch_xor_relaxed
 # define atomic_fetch_xor_relaxed		atomic_fetch_xor
 # define atomic_fetch_xor_acquire		atomic_fetch_xor
@@ -338,7 +322,22 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
 # define atomic_inc_not_zero(v)			atomic_add_unless((v), 1, 0)
 #endif
 
-#ifndef atomic_andnot
+#ifdef atomic_andnot
+
+#ifndef atomic_fetch_andnot_relaxed
+# define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
+# define atomic_fetch_andnot_acquire		atomic_fetch_andnot
+# define atomic_fetch_andnot_release		atomic_fetch_andnot
+#else
+# ifndef atomic_fetch_andnot
+#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+# endif
+#endif
+
+#else /* !atomic_andnot: */
+
 static inline void atomic_andnot(int i, atomic_t *v)
 {
 	atomic_and(~i, v);
@@ -363,7 +362,8 @@ static inline int atomic_fetch_andnot_release(int i, atomic_t *v)
 {
 	return atomic_fetch_and_release(~i, v);
 }
-#endif
+
+#endif /* !atomic_andnot */
 
 /**
  * atomic_inc_not_zero_hint - increment if not null
@@ -600,22 +600,6 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # endif
 #endif
 
-#ifdef atomic64_andnot
-
-#ifndef atomic64_fetch_andnot_relaxed
-# define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
-# define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
-# define atomic64_fetch_andnot_release		atomic64_fetch_andnot
-#else
-# ifndef atomic64_fetch_andnot
-#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
-# endif
-#endif
-
-#endif /* atomic64_andnot */
-
 #ifndef atomic64_fetch_xor_relaxed
 # define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
 # define atomic64_fetch_xor_acquire		atomic64_fetch_xor
@@ -672,7 +656,22 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_try_cmpxchg_release		atomic64_try_cmpxchg
 #endif
 
-#ifndef atomic64_andnot
+#ifdef atomic64_andnot
+
+#ifndef atomic64_fetch_andnot_relaxed
+# define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
+# define atomic64_fetch_andnot_release		atomic64_fetch_andnot
+#else
+# ifndef atomic64_fetch_andnot
+#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+# endif
+#endif
+
+#else /* !atomic64_andnot: */
+
 static inline void atomic64_andnot(long long i, atomic64_t *v)
 {
 	atomic64_and(~i, v);
@@ -697,7 +696,8 @@ static inline long long atomic64_fetch_andnot_release(long long i, atomic64_t *v
 {
 	return atomic64_fetch_and_release(~i, v);
 }
-#endif
+
+#endif /* !atomic64_andnot */
 
 #define atomic64_cond_read_relaxed(v, c)	smp_cond_load_relaxed(&(v)->counter, (c))
 #define atomic64_cond_read_acquire(v, c)	smp_cond_load_acquire(&(v)->counter, (c))

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [tip:locking/core] locking/atomics: Shorten the __atomic_op() defines to __op()
  2018-05-05 10:48                 ` Ingo Molnar
@ 2018-05-06 12:15                 ` tip-bot for Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: tip-bot for Ingo Molnar @ 2018-05-06 12:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, will.deacon, linux-kernel, torvalds, mark.rutland, hpa,
	mingo, tglx, paulmck, akpm

Commit-ID:  ad6812db385540eb2457c945a8e95fc9095b706c
Gitweb:     https://git.kernel.org/tip/ad6812db385540eb2457c945a8e95fc9095b706c
Author:     Ingo Molnar <mingo@kernel.org>
AuthorDate: Sat, 5 May 2018 12:48:58 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 5 May 2018 15:23:55 +0200

locking/atomics: Shorten the __atomic_op() defines to __op()

The __atomic prefix is somewhat of a misnomer, because not all
APIs we use with these macros have an atomic_ prefix.

This also reduces the length of the longest lines in the header,
making them more readable on PeterZ's terminals.

No change in functionality.
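
For reference, a condensed sketch of the renamed helpers, taken from the
generic fallback definitions in the include/linux/atomic.h hunk below; each
ordered variant is still built from the arch's _relaxed op plus the generic
barriers:

 #define __op_fence(op, args...)						\
 ({									\
 	typeof(op##_relaxed(args)) __ret;				\
 									\
 	smp_mb__before_atomic();					\
 	__ret = op##_relaxed(args);					\
 	smp_mb__after_atomic();						\
 	__ret;								\
 })

 /* so that, e.g.: */
 #define xchg(...)				__op_fence(xchg, __VA_ARGS__)
 #define xchg_acquire(...)			__op_acquire(xchg, __VA_ARGS__)
 #define xchg_release(...)			__op_release(xchg, __VA_ARGS__)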

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aryabinin@virtuozzo.com
Cc: boqun.feng@gmail.com
Cc: catalin.marinas@arm.com
Cc: dvyukov@google.com
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20180505104858.ap4bfv6ip2vprzyj@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/alpha/include/asm/atomic.h    |   4 +-
 arch/powerpc/include/asm/cmpxchg.h |   8 +-
 arch/riscv/include/asm/atomic.h    |   4 +-
 include/linux/atomic.h             | 204 +++++++++++++++++++------------------
 4 files changed, 111 insertions(+), 109 deletions(-)

diff --git a/arch/alpha/include/asm/atomic.h b/arch/alpha/include/asm/atomic.h
index 767bfdd42992..786edb5f16c4 100644
--- a/arch/alpha/include/asm/atomic.h
+++ b/arch/alpha/include/asm/atomic.h
@@ -21,8 +21,8 @@
  * barriered versions. To avoid redundant back-to-back fences, we can
  * define the _acquire and _fence versions explicitly.
  */
-#define __atomic_op_acquire(op, args...)	op##_relaxed(args)
-#define __atomic_op_fence			__atomic_op_release
+#define __op_acquire(op, args...)	op##_relaxed(args)
+#define __op_fence			__op_release
 
 #define ATOMIC_INIT(i)		{ (i) }
 #define ATOMIC64_INIT(i)	{ (i) }
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index e27a612b957f..dc5a5426d683 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -13,14 +13,14 @@
  * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
  * on the platform without lwsync.
  */
-#define __atomic_op_acquire(op, args...)				\
+#define __op_acquire(op, args...)					\
 ({									\
 	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
 	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
 	__ret;								\
 })
 
-#define __atomic_op_release(op, args...)				\
+#define __op_release(op, args...)					\
 ({									\
 	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
 	op##_relaxed(args);						\
@@ -531,7 +531,7 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			sizeof(*(ptr)));				\
 })
 
-#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
+#define cmpxchg_release(...) __op_release(cmpxchg, __VA_ARGS__)
 
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
@@ -555,7 +555,7 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
 
-#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
+#define cmpxchg64_release(...) __op_release(cmpxchg64, __VA_ARGS__)
 
 #else
 #include <asm-generic/cmpxchg-local.h>
diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 855115ace98c..992c0aff9554 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -25,14 +25,14 @@
 
 #define ATOMIC_INIT(i)	{ (i) }
 
-#define __atomic_op_acquire(op, args...)				\
+#define __op_acquire(op, args...)					\
 ({									\
 	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
 	__asm__ __volatile__(RISCV_ACQUIRE_BARRIER "" ::: "memory");	\
 	__ret;								\
 })
 
-#define __atomic_op_release(op, args...)				\
+#define __op_release(op, args...)					\
 ({									\
 	__asm__ __volatile__(RISCV_RELEASE_BARRIER "" ::: "memory");	\
 	op##_relaxed(args);						\
diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 1176cf7c6f03..f32ff6d9e4d2 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -37,33 +37,35 @@
  * variant is already fully ordered, no additional barriers are needed.
  *
  * Besides, if an arch has a special barrier for acquire/release, it could
- * implement its own __atomic_op_* and use the same framework for building
+ * implement its own __op_* and use the same framework for building
  * variants
  *
- * If an architecture overrides __atomic_op_acquire() it will probably want
+ * If an architecture overrides __op_acquire() it will probably want
  * to define smp_mb__after_spinlock().
  */
-#ifndef __atomic_op_acquire
-#define __atomic_op_acquire(op, args...)				\
+#ifndef __op_acquire
+#define __op_acquire(op, args...)					\
 ({									\
 	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+									\
 	smp_mb__after_atomic();						\
 	__ret;								\
 })
 #endif
 
-#ifndef __atomic_op_release
-#define __atomic_op_release(op, args...)				\
+#ifndef __op_release
+#define __op_release(op, args...)					\
 ({									\
 	smp_mb__before_atomic();					\
 	op##_relaxed(args);						\
 })
 #endif
 
-#ifndef __atomic_op_fence
-#define __atomic_op_fence(op, args...)					\
+#ifndef __op_fence
+#define __op_fence(op, args...)						\
 ({									\
 	typeof(op##_relaxed(args)) __ret;				\
+									\
 	smp_mb__before_atomic();					\
 	__ret = op##_relaxed(args);					\
 	smp_mb__after_atomic();						\
@@ -77,9 +79,9 @@
 # define atomic_add_return_release		atomic_add_return
 #else
 # ifndef atomic_add_return
-#  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
-#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
-#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return(...)		__op_fence(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_acquire(...)	__op_acquire(atomic_add_return, __VA_ARGS__)
+#  define atomic_add_return_release(...)	__op_release(atomic_add_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -89,9 +91,9 @@
 # define atomic_inc_return_release		atomic_inc_return
 #else
 # ifndef atomic_inc_return
-#  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
-#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
-#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return(...)		__op_fence(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_acquire(...)	__op_acquire(atomic_inc_return, __VA_ARGS__)
+#  define atomic_inc_return_release(...)	__op_release(atomic_inc_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -101,9 +103,9 @@
 # define atomic_sub_return_release		atomic_sub_return
 #else
 # ifndef atomic_sub_return
-#  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
-#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
-#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return(...)		__op_fence(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_acquire(...)	__op_acquire(atomic_sub_return, __VA_ARGS__)
+#  define atomic_sub_return_release(...)	__op_release(atomic_sub_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -113,9 +115,9 @@
 # define atomic_dec_return_release		atomic_dec_return
 #else
 # ifndef atomic_dec_return
-#  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
-#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
-#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return(...)		__op_fence(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_acquire(...)	__op_acquire(atomic_dec_return, __VA_ARGS__)
+#  define atomic_dec_return_release(...)	__op_release(atomic_dec_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -125,9 +127,9 @@
 # define atomic_fetch_add_release		atomic_fetch_add
 #else
 # ifndef atomic_fetch_add
-#  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
-#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
-#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add(...)			__op_fence(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_acquire(...)		__op_acquire(atomic_fetch_add, __VA_ARGS__)
+#  define atomic_fetch_add_release(...)		__op_release(atomic_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
@@ -144,9 +146,9 @@
 # endif
 #else
 # ifndef atomic_fetch_inc
-#  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
-#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
-#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc(...)			__op_fence(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_acquire(...)		__op_acquire(atomic_fetch_inc, __VA_ARGS__)
+#  define atomic_fetch_inc_release(...)		__op_release(atomic_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
@@ -156,9 +158,9 @@
 # define atomic_fetch_sub_release		atomic_fetch_sub
 #else
 # ifndef atomic_fetch_sub
-#  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
-#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
-#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub(...)			__op_fence(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_acquire(...)		__op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#  define atomic_fetch_sub_release(...)		__op_release(atomic_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
@@ -175,9 +177,9 @@
 # endif
 #else
 # ifndef atomic_fetch_dec
-#  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
-#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
-#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec(...)			__op_fence(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_acquire(...)		__op_acquire(atomic_fetch_dec, __VA_ARGS__)
+#  define atomic_fetch_dec_release(...)		__op_release(atomic_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
@@ -187,9 +189,9 @@
 # define atomic_fetch_or_release		atomic_fetch_or
 #else
 # ifndef atomic_fetch_or
-#  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
-#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
-#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or(...)			__op_fence(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_acquire(...)		__op_acquire(atomic_fetch_or, __VA_ARGS__)
+#  define atomic_fetch_or_release(...)		__op_release(atomic_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
@@ -199,9 +201,9 @@
 # define atomic_fetch_and_release		atomic_fetch_and
 #else
 # ifndef atomic_fetch_and
-#  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
-#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
-#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and(...)			__op_fence(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_acquire(...)		__op_acquire(atomic_fetch_and, __VA_ARGS__)
+#  define atomic_fetch_and_release(...)		__op_release(atomic_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
@@ -211,9 +213,9 @@
 # define atomic_fetch_xor_release		atomic_fetch_xor
 #else
 # ifndef atomic_fetch_xor
-#  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
-#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
-#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor(...)			__op_fence(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_acquire(...)		__op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#  define atomic_fetch_xor_release(...)		__op_release(atomic_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
@@ -223,9 +225,9 @@
 #define atomic_xchg_release			atomic_xchg
 #else
 # ifndef atomic_xchg
-#  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
-#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
-#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg(...)			__op_fence(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_acquire(...)		__op_acquire(atomic_xchg, __VA_ARGS__)
+#  define atomic_xchg_release(...)		__op_release(atomic_xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -235,9 +237,9 @@
 # define atomic_cmpxchg_release			atomic_cmpxchg
 #else
 # ifndef atomic_cmpxchg
-#  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
-#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
-#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg(...)			__op_fence(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_acquire(...)		__op_acquire(atomic_cmpxchg, __VA_ARGS__)
+#  define atomic_cmpxchg_release(...)		__op_release(atomic_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -267,9 +269,9 @@
 # define cmpxchg_release			cmpxchg
 #else
 # ifndef cmpxchg
-#  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
-#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
-#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
+#  define cmpxchg(...)				__op_fence(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_acquire(...)			__op_acquire(cmpxchg, __VA_ARGS__)
+#  define cmpxchg_release(...)			__op_release(cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -279,9 +281,9 @@
 # define cmpxchg64_release			cmpxchg64
 #else
 # ifndef cmpxchg64
-#  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
-#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
-#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64(...)			__op_fence(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_acquire(...)		__op_acquire(cmpxchg64, __VA_ARGS__)
+#  define cmpxchg64_release(...)		__op_release(cmpxchg64, __VA_ARGS__)
 # endif
 #endif
 
@@ -291,9 +293,9 @@
 # define xchg_release				xchg
 #else
 # ifndef xchg
-#  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
-#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
-#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
+#  define xchg(...)				__op_fence(xchg, __VA_ARGS__)
+#  define xchg_acquire(...)			__op_acquire(xchg, __VA_ARGS__)
+#  define xchg_release(...)			__op_release(xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -330,9 +332,9 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
 # define atomic_fetch_andnot_release		atomic_fetch_andnot
 #else
 # ifndef atomic_fetch_andnot
-#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
-#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot(...)		__op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_acquire(...)	__op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#  define atomic_fetch_andnot_release(...)	__op_release(atomic_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 
@@ -472,9 +474,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_add_return_release		atomic64_add_return
 #else
 # ifndef atomic64_add_return
-#  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
-#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
-#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return(...)		__op_fence(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_acquire(...)	__op_acquire(atomic64_add_return, __VA_ARGS__)
+#  define atomic64_add_return_release(...)	__op_release(atomic64_add_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -484,9 +486,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_inc_return_release		atomic64_inc_return
 #else
 # ifndef atomic64_inc_return
-#  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
-#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return(...)		__op_fence(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_acquire(...)	__op_acquire(atomic64_inc_return, __VA_ARGS__)
+#  define atomic64_inc_return_release(...)	__op_release(atomic64_inc_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -496,9 +498,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_sub_return_release		atomic64_sub_return
 #else
 # ifndef atomic64_sub_return
-#  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
-#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return(...)		__op_fence(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_acquire(...)	__op_acquire(atomic64_sub_return, __VA_ARGS__)
+#  define atomic64_sub_return_release(...)	__op_release(atomic64_sub_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -508,9 +510,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_dec_return_release		atomic64_dec_return
 #else
 # ifndef atomic64_dec_return
-#  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
-#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return(...)		__op_fence(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_acquire(...)	__op_acquire(atomic64_dec_return, __VA_ARGS__)
+#  define atomic64_dec_return_release(...)	__op_release(atomic64_dec_return, __VA_ARGS__)
 # endif
 #endif
 
@@ -520,9 +522,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_add_release		atomic64_fetch_add
 #else
 # ifndef atomic64_fetch_add
-#  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
-#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
-#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add(...)		__op_fence(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_acquire(...)	__op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#  define atomic64_fetch_add_release(...)	__op_release(atomic64_fetch_add, __VA_ARGS__)
 # endif
 #endif
 
@@ -539,9 +541,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # endif
 #else
 # ifndef atomic64_fetch_inc
-#  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
-#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
-#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc(...)		__op_fence(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_acquire(...)	__op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+#  define atomic64_fetch_inc_release(...)	__op_release(atomic64_fetch_inc, __VA_ARGS__)
 # endif
 #endif
 
@@ -551,9 +553,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_sub_release		atomic64_fetch_sub
 #else
 # ifndef atomic64_fetch_sub
-#  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
-#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
-#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub(...)		__op_fence(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_acquire(...)	__op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#  define atomic64_fetch_sub_release(...)	__op_release(atomic64_fetch_sub, __VA_ARGS__)
 # endif
 #endif
 
@@ -570,9 +572,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # endif
 #else
 # ifndef atomic64_fetch_dec
-#  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
-#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
-#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec(...)		__op_fence(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_acquire(...)	__op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+#  define atomic64_fetch_dec_release(...)	__op_release(atomic64_fetch_dec, __VA_ARGS__)
 # endif
 #endif
 
@@ -582,9 +584,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_or_release		atomic64_fetch_or
 #else
 # ifndef atomic64_fetch_or
-#  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
-#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
-#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or(...)		__op_fence(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_acquire(...)	__op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#  define atomic64_fetch_or_release(...)	__op_release(atomic64_fetch_or, __VA_ARGS__)
 # endif
 #endif
 
@@ -594,9 +596,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_and_release		atomic64_fetch_and
 #else
 # ifndef atomic64_fetch_and
-#  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
-#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
-#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and(...)		__op_fence(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_acquire(...)	__op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#  define atomic64_fetch_and_release(...)	__op_release(atomic64_fetch_and, __VA_ARGS__)
 # endif
 #endif
 
@@ -606,9 +608,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_xor_release		atomic64_fetch_xor
 #else
 # ifndef atomic64_fetch_xor
-#  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
-#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
-#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor(...)		__op_fence(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_acquire(...)	__op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+#  define atomic64_fetch_xor_release(...)	__op_release(atomic64_fetch_xor, __VA_ARGS__)
 # endif
 #endif
 
@@ -618,9 +620,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_xchg_release			atomic64_xchg
 #else
 # ifndef atomic64_xchg
-#  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
-#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg(...)			__op_fence(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_acquire(...)		__op_acquire(atomic64_xchg, __VA_ARGS__)
+#  define atomic64_xchg_release(...)		__op_release(atomic64_xchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -630,9 +632,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_cmpxchg_release		atomic64_cmpxchg
 #else
 # ifndef atomic64_cmpxchg
-#  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
-#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg(...)			__op_fence(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_acquire(...)		__op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+#  define atomic64_cmpxchg_release(...)		__op_release(atomic64_cmpxchg, __VA_ARGS__)
 # endif
 #endif
 
@@ -664,9 +666,9 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 # define atomic64_fetch_andnot_release		atomic64_fetch_andnot
 #else
 # ifndef atomic64_fetch_andnot
-#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
-#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot(...)		__op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_acquire(...)	__op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#  define atomic64_fetch_andnot_release(...)	__op_release(atomic64_fetch_andnot, __VA_ARGS__)
 # endif
 #endif
 

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-05  8:36           ` Ingo Molnar
@ 2018-05-06 14:12             ` Andrea Parri
  -1 siblings, 0 replies; 103+ messages in thread
From: Andrea Parri @ 2018-05-06 14:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, Peter Zijlstra, linux-arm-kernel, linux-kernel,
	aryabinin, boqun.feng, catalin.marinas, dvyukov, will.deacon,
	Linus Torvalds, Andrew Morton, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner

Hi Ingo,

> From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@kernel.org>
> Date: Sat, 5 May 2018 10:23:23 +0200
> Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
> 
> Before:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec_acquire
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec_release
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> After:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> The idea is that because we already group these APIs by certain defines,
> such as atomic_fetch_dec_relaxed and atomic_fetch_dec, in the primary
> branches, we can do the same in the secondary branch as well.
> 
> ( Also remove some unnecessarily duplicate comments, as the API
>   group defines are now pretty much self-documenting. )
> 
> No change in functionality.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Ingo Molnar <mingo@kernel.org>

This breaks compilation on RISC-V. (For some of its atomics, the arch
currently defines only the _relaxed and the full variants, and it relies
on the generic definitions for the _acquire and the _release variants.)
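
To illustrate, here is a minimal, self-contained sketch of the failure
mode; the arch-side defines are hypothetical stand-ins rather than the
actual RISC-V code, while the fallback block is copied from the patch
above:

  /* What such an arch provides: only the _relaxed and the full variant. */
  #define atomic_fetch_add_relaxed(i, v)	arch_fetch_add_relaxed(i, v)	/* stand-in */
  #define atomic_fetch_add(i, v)		arch_fetch_add_fence(i, v)	/* stand-in */

  /* The generic fallback after this patch: */
  #ifndef atomic_fetch_add_relaxed
  # define atomic_fetch_add_relaxed		atomic_fetch_add
  # define atomic_fetch_add_acquire		atomic_fetch_add
  # define atomic_fetch_add_release		atomic_fetch_add
  #else
  # ifndef atomic_fetch_add	/* false here, so nothing below gets defined */
  #  define atomic_fetch_add(...)		__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
  #  define atomic_fetch_add_acquire(...)	__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
  #  define atomic_fetch_add_release(...)	__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
  # endif
  #endif

  /* Any user of the _acquire/_release variants now fails to build: */
  #ifndef atomic_fetch_add_acquire
  # error "atomic_fetch_add_acquire() is left undefined"
  #endif

With the previous fallbacks, the standalone "# ifndef atomic_fetch_add_acquire"
block would still have built the _acquire/_release wrappers on top of the
_relaxed variant.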

  Andrea


> ---
>  include/linux/atomic.h | 312 ++++++++++---------------------------------------
>  1 file changed, 62 insertions(+), 250 deletions(-)
> 
> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> index 67aaafba256b..352ecc72d7f5 100644
> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -71,98 +71,66 @@
>  })
>  #endif
>  
> -/* atomic_add_return_relaxed() et al: */
> -
>  #ifndef atomic_add_return_relaxed
>  # define atomic_add_return_relaxed		atomic_add_return
>  # define atomic_add_return_acquire		atomic_add_return
>  # define atomic_add_return_release		atomic_add_return
>  #else
> -# ifndef atomic_add_return_acquire
> -#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_add_return_release
> -#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic_add_return
>  #  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
> +#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
> +#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_inc_return_relaxed() et al: */
> -
>  #ifndef atomic_inc_return_relaxed
>  # define atomic_inc_return_relaxed		atomic_inc_return
>  # define atomic_inc_return_acquire		atomic_inc_return
>  # define atomic_inc_return_release		atomic_inc_return
>  #else
> -# ifndef atomic_inc_return_acquire
> -#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_inc_return_release
> -#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic_inc_return
>  #  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
> +#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
> +#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_sub_return_relaxed() et al: */
> -
>  #ifndef atomic_sub_return_relaxed
>  # define atomic_sub_return_relaxed		atomic_sub_return
>  # define atomic_sub_return_acquire		atomic_sub_return
>  # define atomic_sub_return_release		atomic_sub_return
>  #else
> -# ifndef atomic_sub_return_acquire
> -#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_sub_return_release
> -#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic_sub_return
>  #  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
> +#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
> +#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_dec_return_relaxed() et al: */
> -
>  #ifndef atomic_dec_return_relaxed
>  # define atomic_dec_return_relaxed		atomic_dec_return
>  # define atomic_dec_return_acquire		atomic_dec_return
>  # define atomic_dec_return_release		atomic_dec_return
>  #else
> -# ifndef atomic_dec_return_acquire
> -#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_dec_return_release
> -#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic_dec_return
>  #  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
> +#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
> +#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_add_relaxed() et al: */
> -
>  #ifndef atomic_fetch_add_relaxed
>  # define atomic_fetch_add_relaxed		atomic_fetch_add
>  # define atomic_fetch_add_acquire		atomic_fetch_add
>  # define atomic_fetch_add_release		atomic_fetch_add
>  #else
> -# ifndef atomic_fetch_add_acquire
> -#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_add_release
> -#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_add
>  #  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
> +#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
> +#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_inc_relaxed() et al: */
> -
>  #ifndef atomic_fetch_inc_relaxed
>  # ifndef atomic_fetch_inc
>  #  define atomic_fetch_inc(v)			atomic_fetch_add(1, (v))
> @@ -175,37 +143,25 @@
>  #  define atomic_fetch_inc_release		atomic_fetch_inc
>  # endif
>  #else
> -# ifndef atomic_fetch_inc_acquire
> -#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_inc_release
> -#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_inc
>  #  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
> +#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
> +#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_sub_relaxed() et al: */
> -
>  #ifndef atomic_fetch_sub_relaxed
>  # define atomic_fetch_sub_relaxed		atomic_fetch_sub
>  # define atomic_fetch_sub_acquire		atomic_fetch_sub
>  # define atomic_fetch_sub_release		atomic_fetch_sub
>  #else
> -# ifndef atomic_fetch_sub_acquire
> -#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_sub_release
> -#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_sub
>  #  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
> +#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
> +#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_dec_relaxed() et al: */
> -
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> @@ -218,127 +174,86 @@
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
> -# ifndef atomic_fetch_dec_acquire
> -#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_dec_release
> -#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> +#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> +#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_or_relaxed() et al: */
> -
>  #ifndef atomic_fetch_or_relaxed
>  # define atomic_fetch_or_relaxed		atomic_fetch_or
>  # define atomic_fetch_or_acquire		atomic_fetch_or
>  # define atomic_fetch_or_release		atomic_fetch_or
>  #else
> -# ifndef atomic_fetch_or_acquire
> -#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_or_release
> -#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_or
>  #  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
> +#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
> +#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_and_relaxed() et al: */
> -
>  #ifndef atomic_fetch_and_relaxed
>  # define atomic_fetch_and_relaxed		atomic_fetch_and
>  # define atomic_fetch_and_acquire		atomic_fetch_and
>  # define atomic_fetch_and_release		atomic_fetch_and
>  #else
> -# ifndef atomic_fetch_and_acquire
> -#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_and_release
> -#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_and
>  #  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
> +#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
> +#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
>  # endif
>  #endif
>  
>  #ifdef atomic_andnot
>  
> -/* atomic_fetch_andnot_relaxed() et al: */
> -
>  #ifndef atomic_fetch_andnot_relaxed
>  # define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
>  # define atomic_fetch_andnot_acquire		atomic_fetch_andnot
>  # define atomic_fetch_andnot_release		atomic_fetch_andnot
>  #else
> -# ifndef atomic_fetch_andnot_acquire
> -#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_andnot_release
> -#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_andnot
>  #  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
> +#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
> +#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
>  # endif
>  #endif
>  
>  #endif /* atomic_andnot */
>  
> -/* atomic_fetch_xor_relaxed() et al: */
> -
>  #ifndef atomic_fetch_xor_relaxed
>  # define atomic_fetch_xor_relaxed		atomic_fetch_xor
>  # define atomic_fetch_xor_acquire		atomic_fetch_xor
>  # define atomic_fetch_xor_release		atomic_fetch_xor
>  #else
> -# ifndef atomic_fetch_xor_acquire
> -#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_xor_release
> -#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_xor
>  #  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
> +#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
> +#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
>  # endif
>  #endif
>  
> -
> -/* atomic_xchg_relaxed() et al: */
> -
>  #ifndef atomic_xchg_relaxed
>  #define atomic_xchg_relaxed			atomic_xchg
>  #define atomic_xchg_acquire			atomic_xchg
>  #define atomic_xchg_release			atomic_xchg
>  #else
> -# ifndef atomic_xchg_acquire
> -#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
> -# endif
> -# ifndef atomic_xchg_release
> -#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
> -# endif
>  # ifndef atomic_xchg
>  #  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
> +#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
> +#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_cmpxchg_relaxed() et al: */
> -
>  #ifndef atomic_cmpxchg_relaxed
>  # define atomic_cmpxchg_relaxed			atomic_cmpxchg
>  # define atomic_cmpxchg_acquire			atomic_cmpxchg
>  # define atomic_cmpxchg_release			atomic_cmpxchg
>  #else
> -# ifndef atomic_cmpxchg_acquire
> -#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
> -# endif
> -# ifndef atomic_cmpxchg_release
> -#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> -# endif
>  # ifndef atomic_cmpxchg
>  #  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
> +#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
> +#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> @@ -362,57 +277,39 @@
>  # define atomic_try_cmpxchg_release		atomic_try_cmpxchg
>  #endif
>  
> -/* cmpxchg_relaxed() et al: */
> -
>  #ifndef cmpxchg_relaxed
>  # define cmpxchg_relaxed			cmpxchg
>  # define cmpxchg_acquire			cmpxchg
>  # define cmpxchg_release			cmpxchg
>  #else
> -# ifndef cmpxchg_acquire
> -#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
> -# endif
> -# ifndef cmpxchg_release
> -#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
> -# endif
>  # ifndef cmpxchg
>  #  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
> +#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
> +#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* cmpxchg64_relaxed() et al: */
> -
>  #ifndef cmpxchg64_relaxed
>  # define cmpxchg64_relaxed			cmpxchg64
>  # define cmpxchg64_acquire			cmpxchg64
>  # define cmpxchg64_release			cmpxchg64
>  #else
> -# ifndef cmpxchg64_acquire
> -#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
> -# endif
> -# ifndef cmpxchg64_release
> -#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
> -# endif
>  # ifndef cmpxchg64
>  #  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
> +#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
> +#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* xchg_relaxed() et al: */
> -
>  #ifndef xchg_relaxed
>  # define xchg_relaxed				xchg
>  # define xchg_acquire				xchg
>  # define xchg_release				xchg
>  #else
> -# ifndef xchg_acquire
> -#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
> -# endif
> -# ifndef xchg_release
> -#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
> -# endif
>  # ifndef xchg
>  #  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
> +#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
> +#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> @@ -569,98 +466,66 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  # define atomic64_set_release(v, i)		smp_store_release(&(v)->counter, (i))
>  #endif
>  
> -/* atomic64_add_return_relaxed() et al: */
> -
>  #ifndef atomic64_add_return_relaxed
>  # define atomic64_add_return_relaxed		atomic64_add_return
>  # define atomic64_add_return_acquire		atomic64_add_return
>  # define atomic64_add_return_release		atomic64_add_return
>  #else
> -# ifndef atomic64_add_return_acquire
> -#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_add_return_release
> -#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_add_return
>  #  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
> +#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
> +#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_inc_return_relaxed() et al: */
> -
>  #ifndef atomic64_inc_return_relaxed
>  # define atomic64_inc_return_relaxed		atomic64_inc_return
>  # define atomic64_inc_return_acquire		atomic64_inc_return
>  # define atomic64_inc_return_release		atomic64_inc_return
>  #else
> -# ifndef atomic64_inc_return_acquire
> -#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_inc_return_release
> -#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_inc_return
>  #  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
> +#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
> +#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_sub_return_relaxed() et al: */
> -
>  #ifndef atomic64_sub_return_relaxed
>  # define atomic64_sub_return_relaxed		atomic64_sub_return
>  # define atomic64_sub_return_acquire		atomic64_sub_return
>  # define atomic64_sub_return_release		atomic64_sub_return
>  #else
> -# ifndef atomic64_sub_return_acquire
> -#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_sub_return_release
> -#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_sub_return
>  #  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
> +#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
> +#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_dec_return_relaxed() et al: */
> -
>  #ifndef atomic64_dec_return_relaxed
>  # define atomic64_dec_return_relaxed		atomic64_dec_return
>  # define atomic64_dec_return_acquire		atomic64_dec_return
>  # define atomic64_dec_return_release		atomic64_dec_return
>  #else
> -# ifndef atomic64_dec_return_acquire
> -#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_dec_return_release
> -#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_dec_return
>  #  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
> +#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
> +#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_add_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_add_relaxed
>  # define atomic64_fetch_add_relaxed		atomic64_fetch_add
>  # define atomic64_fetch_add_acquire		atomic64_fetch_add
>  # define atomic64_fetch_add_release		atomic64_fetch_add
>  #else
> -# ifndef atomic64_fetch_add_acquire
> -#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_add_release
> -#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_add
>  #  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
> +#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
> +#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_inc_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_inc_relaxed
>  # ifndef atomic64_fetch_inc
>  #  define atomic64_fetch_inc(v)			atomic64_fetch_add(1, (v))
> @@ -673,37 +538,25 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  #  define atomic64_fetch_inc_release		atomic64_fetch_inc
>  # endif
>  #else
> -# ifndef atomic64_fetch_inc_acquire
> -#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_inc_release
> -#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_inc
>  #  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
> +#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
> +#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_sub_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_sub_relaxed
>  # define atomic64_fetch_sub_relaxed		atomic64_fetch_sub
>  # define atomic64_fetch_sub_acquire		atomic64_fetch_sub
>  # define atomic64_fetch_sub_release		atomic64_fetch_sub
>  #else
> -# ifndef atomic64_fetch_sub_acquire
> -#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_sub_release
> -#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_sub
>  #  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
> +#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
> +#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_dec_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_dec_relaxed
>  # ifndef atomic64_fetch_dec
>  #  define atomic64_fetch_dec(v)			atomic64_fetch_sub(1, (v))
> @@ -716,127 +569,86 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  #  define atomic64_fetch_dec_release		atomic64_fetch_dec
>  # endif
>  #else
> -# ifndef atomic64_fetch_dec_acquire
> -#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_dec_release
> -#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_dec
>  #  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
> +#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
> +#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_or_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_or_relaxed
>  # define atomic64_fetch_or_relaxed		atomic64_fetch_or
>  # define atomic64_fetch_or_acquire		atomic64_fetch_or
>  # define atomic64_fetch_or_release		atomic64_fetch_or
>  #else
> -# ifndef atomic64_fetch_or_acquire
> -#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_or_release
> -#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_or
>  #  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
> +#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
> +#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
>  # endif
>  #endif
>  
> -
> -/* atomic64_fetch_and_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_and_relaxed
>  # define atomic64_fetch_and_relaxed		atomic64_fetch_and
>  # define atomic64_fetch_and_acquire		atomic64_fetch_and
>  # define atomic64_fetch_and_release		atomic64_fetch_and
>  #else
> -# ifndef atomic64_fetch_and_acquire
> -#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_and_release
> -#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_and
>  #  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
> +#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
> +#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
>  # endif
>  #endif
>  
>  #ifdef atomic64_andnot
>  
> -/* atomic64_fetch_andnot_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_andnot_relaxed
>  # define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
>  # define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
>  # define atomic64_fetch_andnot_release		atomic64_fetch_andnot
>  #else
> -# ifndef atomic64_fetch_andnot_acquire
> -#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_andnot_release
> -#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_andnot
>  #  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
> +#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
> +#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
>  # endif
>  #endif
>  
>  #endif /* atomic64_andnot */
>  
> -/* atomic64_fetch_xor_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_xor_relaxed
>  # define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
>  # define atomic64_fetch_xor_acquire		atomic64_fetch_xor
>  # define atomic64_fetch_xor_release		atomic64_fetch_xor
>  #else
> -# ifndef atomic64_fetch_xor_acquire
> -#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_xor_release
> -#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_xor
>  #  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
> +#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
> +#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_xchg_relaxed() et al: */
> -
>  #ifndef atomic64_xchg_relaxed
>  # define atomic64_xchg_relaxed			atomic64_xchg
>  # define atomic64_xchg_acquire			atomic64_xchg
>  # define atomic64_xchg_release			atomic64_xchg
>  #else
> -# ifndef atomic64_xchg_acquire
> -#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_xchg_release
> -#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_xchg
>  #  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
> +#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
> +#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_cmpxchg_relaxed() et al: */
> -
>  #ifndef atomic64_cmpxchg_relaxed
>  # define atomic64_cmpxchg_relaxed		atomic64_cmpxchg
>  # define atomic64_cmpxchg_acquire		atomic64_cmpxchg
>  # define atomic64_cmpxchg_release		atomic64_cmpxchg
>  #else
> -# ifndef atomic64_cmpxchg_acquire
> -#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_cmpxchg_release
> -#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_cmpxchg
>  #  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
> +#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
> +#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
>  # endif
>  #endif
>  

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
@ 2018-05-06 14:12             ` Andrea Parri
  0 siblings, 0 replies; 103+ messages in thread
From: Andrea Parri @ 2018-05-06 14:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Ingo,

> From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@kernel.org>
> Date: Sat, 5 May 2018 10:23:23 +0200
> Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
> 
> Before:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec_acquire
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec_release
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> After:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> The idea is that because we already group these APIs by certain defines
> such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
> branches - we can do the same in the secondary branch as well.
> 
> ( Also remove some unnecessarily duplicate comments, as the API
>   group defines are now pretty much self-documenting. )
> 
> No change in functionality.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: linux-kernel at vger.kernel.org
> Signed-off-by: Ingo Molnar <mingo@kernel.org>

This breaks compilation on RISC-V. (For some of its atomics, the arch
currently defines the _relaxed and the full variants and it relies on
the generic definitions for the _acquire and the _release variants.)

  Andrea


> ---
>  include/linux/atomic.h | 312 ++++++++++---------------------------------------
>  1 file changed, 62 insertions(+), 250 deletions(-)
> 
> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> index 67aaafba256b..352ecc72d7f5 100644
> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -71,98 +71,66 @@
>  })
>  #endif
>  
> -/* atomic_add_return_relaxed() et al: */
> -
>  #ifndef atomic_add_return_relaxed
>  # define atomic_add_return_relaxed		atomic_add_return
>  # define atomic_add_return_acquire		atomic_add_return
>  # define atomic_add_return_release		atomic_add_return
>  #else
> -# ifndef atomic_add_return_acquire
> -#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_add_return_release
> -#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic_add_return
>  #  define atomic_add_return(...)		__atomic_op_fence(atomic_add_return, __VA_ARGS__)
> +#  define atomic_add_return_acquire(...)	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
> +#  define atomic_add_return_release(...)	__atomic_op_release(atomic_add_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_inc_return_relaxed() et al: */
> -
>  #ifndef atomic_inc_return_relaxed
>  # define atomic_inc_return_relaxed		atomic_inc_return
>  # define atomic_inc_return_acquire		atomic_inc_return
>  # define atomic_inc_return_release		atomic_inc_return
>  #else
> -# ifndef atomic_inc_return_acquire
> -#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_inc_return_release
> -#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic_inc_return
>  #  define atomic_inc_return(...)		__atomic_op_fence(atomic_inc_return, __VA_ARGS__)
> +#  define atomic_inc_return_acquire(...)	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
> +#  define atomic_inc_return_release(...)	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_sub_return_relaxed() et al: */
> -
>  #ifndef atomic_sub_return_relaxed
>  # define atomic_sub_return_relaxed		atomic_sub_return
>  # define atomic_sub_return_acquire		atomic_sub_return
>  # define atomic_sub_return_release		atomic_sub_return
>  #else
> -# ifndef atomic_sub_return_acquire
> -#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_sub_return_release
> -#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic_sub_return
>  #  define atomic_sub_return(...)		__atomic_op_fence(atomic_sub_return, __VA_ARGS__)
> +#  define atomic_sub_return_acquire(...)	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
> +#  define atomic_sub_return_release(...)	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_dec_return_relaxed() et al: */
> -
>  #ifndef atomic_dec_return_relaxed
>  # define atomic_dec_return_relaxed		atomic_dec_return
>  # define atomic_dec_return_acquire		atomic_dec_return
>  # define atomic_dec_return_release		atomic_dec_return
>  #else
> -# ifndef atomic_dec_return_acquire
> -#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic_dec_return_release
> -#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic_dec_return
>  #  define atomic_dec_return(...)		__atomic_op_fence(atomic_dec_return, __VA_ARGS__)
> +#  define atomic_dec_return_acquire(...)	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
> +#  define atomic_dec_return_release(...)	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_add_relaxed() et al: */
> -
>  #ifndef atomic_fetch_add_relaxed
>  # define atomic_fetch_add_relaxed		atomic_fetch_add
>  # define atomic_fetch_add_acquire		atomic_fetch_add
>  # define atomic_fetch_add_release		atomic_fetch_add
>  #else
> -# ifndef atomic_fetch_add_acquire
> -#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_add_release
> -#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_add
>  #  define atomic_fetch_add(...)			__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
> +#  define atomic_fetch_add_acquire(...)		__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
> +#  define atomic_fetch_add_release(...)		__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_inc_relaxed() et al: */
> -
>  #ifndef atomic_fetch_inc_relaxed
>  # ifndef atomic_fetch_inc
>  #  define atomic_fetch_inc(v)			atomic_fetch_add(1, (v))
> @@ -175,37 +143,25 @@
>  #  define atomic_fetch_inc_release		atomic_fetch_inc
>  # endif
>  #else
> -# ifndef atomic_fetch_inc_acquire
> -#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_inc_release
> -#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_inc
>  #  define atomic_fetch_inc(...)			__atomic_op_fence(atomic_fetch_inc, __VA_ARGS__)
> +#  define atomic_fetch_inc_acquire(...)		__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
> +#  define atomic_fetch_inc_release(...)		__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_sub_relaxed() et al: */
> -
>  #ifndef atomic_fetch_sub_relaxed
>  # define atomic_fetch_sub_relaxed		atomic_fetch_sub
>  # define atomic_fetch_sub_acquire		atomic_fetch_sub
>  # define atomic_fetch_sub_release		atomic_fetch_sub
>  #else
> -# ifndef atomic_fetch_sub_acquire
> -#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_sub_release
> -#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_sub
>  #  define atomic_fetch_sub(...)			__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
> +#  define atomic_fetch_sub_acquire(...)		__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
> +#  define atomic_fetch_sub_release(...)		__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_dec_relaxed() et al: */
> -
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> @@ -218,127 +174,86 @@
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
> -# ifndef atomic_fetch_dec_acquire
> -#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_dec_release
> -#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)			__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> +#  define atomic_fetch_dec_acquire(...)		__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> +#  define atomic_fetch_dec_release(...)		__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_or_relaxed() et al: */
> -
>  #ifndef atomic_fetch_or_relaxed
>  # define atomic_fetch_or_relaxed		atomic_fetch_or
>  # define atomic_fetch_or_acquire		atomic_fetch_or
>  # define atomic_fetch_or_release		atomic_fetch_or
>  #else
> -# ifndef atomic_fetch_or_acquire
> -#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_or_release
> -#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_or
>  #  define atomic_fetch_or(...)			__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
> +#  define atomic_fetch_or_acquire(...)		__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
> +#  define atomic_fetch_or_release(...)		__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_fetch_and_relaxed() et al: */
> -
>  #ifndef atomic_fetch_and_relaxed
>  # define atomic_fetch_and_relaxed		atomic_fetch_and
>  # define atomic_fetch_and_acquire		atomic_fetch_and
>  # define atomic_fetch_and_release		atomic_fetch_and
>  #else
> -# ifndef atomic_fetch_and_acquire
> -#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_and_release
> -#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_and
>  #  define atomic_fetch_and(...)			__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
> +#  define atomic_fetch_and_acquire(...)		__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
> +#  define atomic_fetch_and_release(...)		__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
>  # endif
>  #endif
>  
>  #ifdef atomic_andnot
>  
> -/* atomic_fetch_andnot_relaxed() et al: */
> -
>  #ifndef atomic_fetch_andnot_relaxed
>  # define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
>  # define atomic_fetch_andnot_acquire		atomic_fetch_andnot
>  # define atomic_fetch_andnot_release		atomic_fetch_andnot
>  #else
> -# ifndef atomic_fetch_andnot_acquire
> -#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_andnot_release
> -#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_andnot
>  #  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
> +#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
> +#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
>  # endif
>  #endif
>  
>  #endif /* atomic_andnot */
>  
> -/* atomic_fetch_xor_relaxed() et al: */
> -
>  #ifndef atomic_fetch_xor_relaxed
>  # define atomic_fetch_xor_relaxed		atomic_fetch_xor
>  # define atomic_fetch_xor_acquire		atomic_fetch_xor
>  # define atomic_fetch_xor_release		atomic_fetch_xor
>  #else
> -# ifndef atomic_fetch_xor_acquire
> -#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
> -# endif
> -# ifndef atomic_fetch_xor_release
> -#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
> -# endif
>  # ifndef atomic_fetch_xor
>  #  define atomic_fetch_xor(...)			__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
> +#  define atomic_fetch_xor_acquire(...)		__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
> +#  define atomic_fetch_xor_release(...)		__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
>  # endif
>  #endif
>  
> -
> -/* atomic_xchg_relaxed() et al: */
> -
>  #ifndef atomic_xchg_relaxed
>  #define atomic_xchg_relaxed			atomic_xchg
>  #define atomic_xchg_acquire			atomic_xchg
>  #define atomic_xchg_release			atomic_xchg
>  #else
> -# ifndef atomic_xchg_acquire
> -#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
> -# endif
> -# ifndef atomic_xchg_release
> -#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
> -# endif
>  # ifndef atomic_xchg
>  #  define atomic_xchg(...)			__atomic_op_fence(atomic_xchg, __VA_ARGS__)
> +#  define atomic_xchg_acquire(...)		__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
> +#  define atomic_xchg_release(...)		__atomic_op_release(atomic_xchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic_cmpxchg_relaxed() et al: */
> -
>  #ifndef atomic_cmpxchg_relaxed
>  # define atomic_cmpxchg_relaxed			atomic_cmpxchg
>  # define atomic_cmpxchg_acquire			atomic_cmpxchg
>  # define atomic_cmpxchg_release			atomic_cmpxchg
>  #else
> -# ifndef atomic_cmpxchg_acquire
> -#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
> -# endif
> -# ifndef atomic_cmpxchg_release
> -#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
> -# endif
>  # ifndef atomic_cmpxchg
>  #  define atomic_cmpxchg(...)			__atomic_op_fence(atomic_cmpxchg, __VA_ARGS__)
> +#  define atomic_cmpxchg_acquire(...)		__atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__)
> +#  define atomic_cmpxchg_release(...)		__atomic_op_release(atomic_cmpxchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> @@ -362,57 +277,39 @@
>  # define atomic_try_cmpxchg_release		atomic_try_cmpxchg
>  #endif
>  
> -/* cmpxchg_relaxed() et al: */
> -
>  #ifndef cmpxchg_relaxed
>  # define cmpxchg_relaxed			cmpxchg
>  # define cmpxchg_acquire			cmpxchg
>  # define cmpxchg_release			cmpxchg
>  #else
> -# ifndef cmpxchg_acquire
> -#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
> -# endif
> -# ifndef cmpxchg_release
> -#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
> -# endif
>  # ifndef cmpxchg
>  #  define cmpxchg(...)				__atomic_op_fence(cmpxchg, __VA_ARGS__)
> +#  define cmpxchg_acquire(...)			__atomic_op_acquire(cmpxchg, __VA_ARGS__)
> +#  define cmpxchg_release(...)			__atomic_op_release(cmpxchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* cmpxchg64_relaxed() et al: */
> -
>  #ifndef cmpxchg64_relaxed
>  # define cmpxchg64_relaxed			cmpxchg64
>  # define cmpxchg64_acquire			cmpxchg64
>  # define cmpxchg64_release			cmpxchg64
>  #else
> -# ifndef cmpxchg64_acquire
> -#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
> -# endif
> -# ifndef cmpxchg64_release
> -#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
> -# endif
>  # ifndef cmpxchg64
>  #  define cmpxchg64(...)			__atomic_op_fence(cmpxchg64, __VA_ARGS__)
> +#  define cmpxchg64_acquire(...)		__atomic_op_acquire(cmpxchg64, __VA_ARGS__)
> +#  define cmpxchg64_release(...)		__atomic_op_release(cmpxchg64, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* xchg_relaxed() et al: */
> -
>  #ifndef xchg_relaxed
>  # define xchg_relaxed				xchg
>  # define xchg_acquire				xchg
>  # define xchg_release				xchg
>  #else
> -# ifndef xchg_acquire
> -#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
> -# endif
> -# ifndef xchg_release
> -#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
> -# endif
>  # ifndef xchg
>  #  define xchg(...)				__atomic_op_fence(xchg, __VA_ARGS__)
> +#  define xchg_acquire(...)			__atomic_op_acquire(xchg, __VA_ARGS__)
> +#  define xchg_release(...)			__atomic_op_release(xchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> @@ -569,98 +466,66 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  # define atomic64_set_release(v, i)		smp_store_release(&(v)->counter, (i))
>  #endif
>  
> -/* atomic64_add_return_relaxed() et al: */
> -
>  #ifndef atomic64_add_return_relaxed
>  # define atomic64_add_return_relaxed		atomic64_add_return
>  # define atomic64_add_return_acquire		atomic64_add_return
>  # define atomic64_add_return_release		atomic64_add_return
>  #else
> -# ifndef atomic64_add_return_acquire
> -#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_add_return_release
> -#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_add_return
>  #  define atomic64_add_return(...)		__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
> +#  define atomic64_add_return_acquire(...)	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
> +#  define atomic64_add_return_release(...)	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_inc_return_relaxed() et al: */
> -
>  #ifndef atomic64_inc_return_relaxed
>  # define atomic64_inc_return_relaxed		atomic64_inc_return
>  # define atomic64_inc_return_acquire		atomic64_inc_return
>  # define atomic64_inc_return_release		atomic64_inc_return
>  #else
> -# ifndef atomic64_inc_return_acquire
> -#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_inc_return_release
> -#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_inc_return
>  #  define atomic64_inc_return(...)		__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
> +#  define atomic64_inc_return_acquire(...)	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
> +#  define atomic64_inc_return_release(...)	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_sub_return_relaxed() et al: */
> -
>  #ifndef atomic64_sub_return_relaxed
>  # define atomic64_sub_return_relaxed		atomic64_sub_return
>  # define atomic64_sub_return_acquire		atomic64_sub_return
>  # define atomic64_sub_return_release		atomic64_sub_return
>  #else
> -# ifndef atomic64_sub_return_acquire
> -#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_sub_return_release
> -#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_sub_return
>  #  define atomic64_sub_return(...)		__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
> +#  define atomic64_sub_return_acquire(...)	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
> +#  define atomic64_sub_return_release(...)	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_dec_return_relaxed() et al: */
> -
>  #ifndef atomic64_dec_return_relaxed
>  # define atomic64_dec_return_relaxed		atomic64_dec_return
>  # define atomic64_dec_return_acquire		atomic64_dec_return
>  # define atomic64_dec_return_release		atomic64_dec_return
>  #else
> -# ifndef atomic64_dec_return_acquire
> -#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_dec_return_release
> -#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_dec_return
>  #  define atomic64_dec_return(...)		__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
> +#  define atomic64_dec_return_acquire(...)	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
> +#  define atomic64_dec_return_release(...)	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_add_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_add_relaxed
>  # define atomic64_fetch_add_relaxed		atomic64_fetch_add
>  # define atomic64_fetch_add_acquire		atomic64_fetch_add
>  # define atomic64_fetch_add_release		atomic64_fetch_add
>  #else
> -# ifndef atomic64_fetch_add_acquire
> -#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_add_release
> -#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_add
>  #  define atomic64_fetch_add(...)		__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
> +#  define atomic64_fetch_add_acquire(...)	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
> +#  define atomic64_fetch_add_release(...)	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_inc_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_inc_relaxed
>  # ifndef atomic64_fetch_inc
>  #  define atomic64_fetch_inc(v)			atomic64_fetch_add(1, (v))
> @@ -673,37 +538,25 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  #  define atomic64_fetch_inc_release		atomic64_fetch_inc
>  # endif
>  #else
> -# ifndef atomic64_fetch_inc_acquire
> -#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_inc_release
> -#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_inc
>  #  define atomic64_fetch_inc(...)		__atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__)
> +#  define atomic64_fetch_inc_acquire(...)	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
> +#  define atomic64_fetch_inc_release(...)	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_sub_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_sub_relaxed
>  # define atomic64_fetch_sub_relaxed		atomic64_fetch_sub
>  # define atomic64_fetch_sub_acquire		atomic64_fetch_sub
>  # define atomic64_fetch_sub_release		atomic64_fetch_sub
>  #else
> -# ifndef atomic64_fetch_sub_acquire
> -#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_sub_release
> -#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_sub
>  #  define atomic64_fetch_sub(...)		__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
> +#  define atomic64_fetch_sub_acquire(...)	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
> +#  define atomic64_fetch_sub_release(...)	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_dec_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_dec_relaxed
>  # ifndef atomic64_fetch_dec
>  #  define atomic64_fetch_dec(v)			atomic64_fetch_sub(1, (v))
> @@ -716,127 +569,86 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  #  define atomic64_fetch_dec_release		atomic64_fetch_dec
>  # endif
>  #else
> -# ifndef atomic64_fetch_dec_acquire
> -#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_dec_release
> -#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_dec
>  #  define atomic64_fetch_dec(...)		__atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__)
> +#  define atomic64_fetch_dec_acquire(...)	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
> +#  define atomic64_fetch_dec_release(...)	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_fetch_or_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_or_relaxed
>  # define atomic64_fetch_or_relaxed		atomic64_fetch_or
>  # define atomic64_fetch_or_acquire		atomic64_fetch_or
>  # define atomic64_fetch_or_release		atomic64_fetch_or
>  #else
> -# ifndef atomic64_fetch_or_acquire
> -#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_or_release
> -#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_or
>  #  define atomic64_fetch_or(...)		__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
> +#  define atomic64_fetch_or_acquire(...)	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
> +#  define atomic64_fetch_or_release(...)	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
>  # endif
>  #endif
>  
> -
> -/* atomic64_fetch_and_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_and_relaxed
>  # define atomic64_fetch_and_relaxed		atomic64_fetch_and
>  # define atomic64_fetch_and_acquire		atomic64_fetch_and
>  # define atomic64_fetch_and_release		atomic64_fetch_and
>  #else
> -# ifndef atomic64_fetch_and_acquire
> -#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_and_release
> -#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_and
>  #  define atomic64_fetch_and(...)		__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
> +#  define atomic64_fetch_and_acquire(...)	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
> +#  define atomic64_fetch_and_release(...)	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
>  # endif
>  #endif
>  
>  #ifdef atomic64_andnot
>  
> -/* atomic64_fetch_andnot_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_andnot_relaxed
>  # define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
>  # define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
>  # define atomic64_fetch_andnot_release		atomic64_fetch_andnot
>  #else
> -# ifndef atomic64_fetch_andnot_acquire
> -#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_andnot_release
> -#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_andnot
>  #  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
> +#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
> +#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
>  # endif
>  #endif
>  
>  #endif /* atomic64_andnot */
>  
> -/* atomic64_fetch_xor_relaxed() et al: */
> -
>  #ifndef atomic64_fetch_xor_relaxed
>  # define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
>  # define atomic64_fetch_xor_acquire		atomic64_fetch_xor
>  # define atomic64_fetch_xor_release		atomic64_fetch_xor
>  #else
> -# ifndef atomic64_fetch_xor_acquire
> -#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_fetch_xor_release
> -#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_fetch_xor
>  #  define atomic64_fetch_xor(...)		__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
> +#  define atomic64_fetch_xor_acquire(...)	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
> +#  define atomic64_fetch_xor_release(...)	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_xchg_relaxed() et al: */
> -
>  #ifndef atomic64_xchg_relaxed
>  # define atomic64_xchg_relaxed			atomic64_xchg
>  # define atomic64_xchg_acquire			atomic64_xchg
>  # define atomic64_xchg_release			atomic64_xchg
>  #else
> -# ifndef atomic64_xchg_acquire
> -#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_xchg_release
> -#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_xchg
>  #  define atomic64_xchg(...)			__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
> +#  define atomic64_xchg_acquire(...)		__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
> +#  define atomic64_xchg_release(...)		__atomic_op_release(atomic64_xchg, __VA_ARGS__)
>  # endif
>  #endif
>  
> -/* atomic64_cmpxchg_relaxed() et al: */
> -
>  #ifndef atomic64_cmpxchg_relaxed
>  # define atomic64_cmpxchg_relaxed		atomic64_cmpxchg
>  # define atomic64_cmpxchg_acquire		atomic64_cmpxchg
>  # define atomic64_cmpxchg_release		atomic64_cmpxchg
>  #else
> -# ifndef atomic64_cmpxchg_acquire
> -#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
> -# endif
> -# ifndef atomic64_cmpxchg_release
> -#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
> -# endif
>  # ifndef atomic64_cmpxchg
>  #  define atomic64_cmpxchg(...)			__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
> +#  define atomic64_cmpxchg_acquire(...)		__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
> +#  define atomic64_cmpxchg_release(...)		__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
>  # endif
>  #endif
>  

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Combine the atomic_andnot() and atomic64_andnot() API definitions
  2018-05-05  8:54             ` Ingo Molnar
@ 2018-05-06 14:15               ` Andrea Parri
  -1 siblings, 0 replies; 103+ messages in thread
From: Andrea Parri @ 2018-05-06 14:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, Peter Zijlstra, linux-arm-kernel, linux-kernel,
	aryabinin, boqun.feng, catalin.marinas, dvyukov, will.deacon,
	Linus Torvalds, Andrew Morton, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner

Hi Ingo,

> From f5efafa83af8c46b9e81b010b46caeeadb450179 Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@kernel.org>
> Date: Sat, 5 May 2018 10:46:41 +0200
> Subject: [PATCH] locking/atomics: Combine the atomic_andnot() and atomic64_andnot() API definitions
> 
> The atomic_andnot() and atomic64_andnot() are defined in 4 separate groups
> spread out in the atomic.h header:
> 
>  #ifdef atomic_andnot
>  ...
>  #endif /* atomic_andnot */
>  ...
>  #ifndef atomic_andnot
>  ...
>  #endif
>  ...
>  #ifdef atomic64_andnot
>  ...
>  #endif /* atomic64_andnot */
>  ...
>  #ifndef atomic64_andnot
>  ...
>  #endif
> 
> Combine them into unify them into two groups:

Nit: "Combine them into unify them into"

  Andrea


> 
>  #ifdef atomic_andnot
>  #else
>  #endif
> 
>  ...
> 
>  #ifdef atomic64_andnot
>  #else
>  #endif
> 
> So that one API group is defined in a single place within the header.
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>  include/linux/atomic.h | 72 +++++++++++++++++++++++++-------------------------
>  1 file changed, 36 insertions(+), 36 deletions(-)
> 
> diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> index 352ecc72d7f5..1176cf7c6f03 100644
> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -205,22 +205,6 @@
>  # endif
>  #endif
>  
> -#ifdef atomic_andnot
> -
> -#ifndef atomic_fetch_andnot_relaxed
> -# define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
> -# define atomic_fetch_andnot_acquire		atomic_fetch_andnot
> -# define atomic_fetch_andnot_release		atomic_fetch_andnot
> -#else
> -# ifndef atomic_fetch_andnot
> -#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
> -#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
> -#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
> -# endif
> -#endif
> -
> -#endif /* atomic_andnot */
> -
>  #ifndef atomic_fetch_xor_relaxed
>  # define atomic_fetch_xor_relaxed		atomic_fetch_xor
>  # define atomic_fetch_xor_acquire		atomic_fetch_xor
> @@ -338,7 +322,22 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
>  # define atomic_inc_not_zero(v)			atomic_add_unless((v), 1, 0)
>  #endif
>  
> -#ifndef atomic_andnot
> +#ifdef atomic_andnot
> +
> +#ifndef atomic_fetch_andnot_relaxed
> +# define atomic_fetch_andnot_relaxed		atomic_fetch_andnot
> +# define atomic_fetch_andnot_acquire		atomic_fetch_andnot
> +# define atomic_fetch_andnot_release		atomic_fetch_andnot
> +#else
> +# ifndef atomic_fetch_andnot
> +#  define atomic_fetch_andnot(...)		__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
> +#  define atomic_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
> +#  define atomic_fetch_andnot_release(...)	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
> +# endif
> +#endif
> +
> +#else /* !atomic_andnot: */
> +
>  static inline void atomic_andnot(int i, atomic_t *v)
>  {
>  	atomic_and(~i, v);
> @@ -363,7 +362,8 @@ static inline int atomic_fetch_andnot_release(int i, atomic_t *v)
>  {
>  	return atomic_fetch_and_release(~i, v);
>  }
> -#endif
> +
> +#endif /* !atomic_andnot */
>  
>  /**
>   * atomic_inc_not_zero_hint - increment if not null
> @@ -600,22 +600,6 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  # endif
>  #endif
>  
> -#ifdef atomic64_andnot
> -
> -#ifndef atomic64_fetch_andnot_relaxed
> -# define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
> -# define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
> -# define atomic64_fetch_andnot_release		atomic64_fetch_andnot
> -#else
> -# ifndef atomic64_fetch_andnot
> -#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
> -#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
> -#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
> -# endif
> -#endif
> -
> -#endif /* atomic64_andnot */
> -
>  #ifndef atomic64_fetch_xor_relaxed
>  # define atomic64_fetch_xor_relaxed		atomic64_fetch_xor
>  # define atomic64_fetch_xor_acquire		atomic64_fetch_xor
> @@ -672,7 +656,22 @@ static inline int atomic_dec_if_positive(atomic_t *v)
>  # define atomic64_try_cmpxchg_release		atomic64_try_cmpxchg
>  #endif
>  
> -#ifndef atomic64_andnot
> +#ifdef atomic64_andnot
> +
> +#ifndef atomic64_fetch_andnot_relaxed
> +# define atomic64_fetch_andnot_relaxed		atomic64_fetch_andnot
> +# define atomic64_fetch_andnot_acquire		atomic64_fetch_andnot
> +# define atomic64_fetch_andnot_release		atomic64_fetch_andnot
> +#else
> +# ifndef atomic64_fetch_andnot
> +#  define atomic64_fetch_andnot(...)		__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
> +#  define atomic64_fetch_andnot_acquire(...)	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
> +#  define atomic64_fetch_andnot_release(...)	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
> +# endif
> +#endif
> +
> +#else /* !atomic64_andnot: */
> +
>  static inline void atomic64_andnot(long long i, atomic64_t *v)
>  {
>  	atomic64_and(~i, v);
> @@ -697,7 +696,8 @@ static inline long long atomic64_fetch_andnot_release(long long i, atomic64_t *v
>  {
>  	return atomic64_fetch_and_release(~i, v);
>  }
> -#endif
> +
> +#endif /* !atomic64_andnot */
>  
>  #define atomic64_cond_read_relaxed(v, c)	smp_cond_load_relaxed(&(v)->counter, (c))
>  #define atomic64_cond_read_acquire(v, c)	smp_cond_load_acquire(&(v)->counter, (c))
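
Schematically, the resulting layout is (condensed; '...' stands for the per-variant
defines visible in the hunks above):

  #ifdef atomic_andnot
  /* arch provides atomic_andnot(); fill in any missing ordering variants */
  # ifndef atomic_fetch_andnot_relaxed
  ...
  # endif
  #else /* !atomic_andnot: */
  /* generic fallbacks, implemented via atomic_and(~i, v) */
  static inline void atomic_andnot(int i, atomic_t *v)
  {
  	atomic_and(~i, v);
  }
  ...
  #endif /* !atomic_andnot */

and likewise for the atomic64_andnot() group.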

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-06 14:12             ` Andrea Parri
@ 2018-05-06 14:57               ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-06 14:57 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Mark Rutland, Peter Zijlstra, linux-arm-kernel, linux-kernel,
	aryabinin, boqun.feng, catalin.marinas, dvyukov, will.deacon,
	Linus Torvalds, Andrew Morton, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner


* Andrea Parri <andrea.parri@amarulasolutions.com> wrote:

> Hi Ingo,
> 
> > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
> > From: Ingo Molnar <mingo@kernel.org>
> > Date: Sat, 5 May 2018 10:23:23 +0200
> > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
> > 
> > Before:
> > 
> >  #ifndef atomic_fetch_dec_relaxed
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> >  # else
> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> >  # endif
> >  #else
> >  # ifndef atomic_fetch_dec_acquire
> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  # ifndef atomic_fetch_dec_release
> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  #endif
> > 
> > After:
> > 
> >  #ifndef atomic_fetch_dec_relaxed
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> >  # else
> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> >  # endif
> >  #else
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  #endif
> > 
> > The idea is that because we already group these APIs by certain defines
> > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
> > branches - we can do the same in the secondary branch as well.
> > 
> > ( Also remove some unnecessarily duplicate comments, as the API
> >   group defines are now pretty much self-documenting. )
> > 
> > No change in functionality.
> > 
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> 
> This breaks compilation on RISC-V. (For some of its atomics, the arch
> currently defines the _relaxed and the full variants and it relies on
> the generic definitions for the _acquire and the _release variants.)
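
To illustrate with a hypothetical fragment (not the actual riscv header), suppose an
arch provides only:

  #define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed	/* arch asm */
  #define atomic_fetch_add		atomic_fetch_add		/* arch asm, fully ordered */

The old generic code still generated atomic_fetch_add_acquire() and
atomic_fetch_add_release() from the _relaxed op via their own #ifndef guards; in the
regrouped version they sit under "# ifndef atomic_fetch_add", which is false here, so
they are never defined and any user of the _acquire/_release variants fails to build.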

I don't have a cross-compilation setup for RISC-V, which is a relatively new arch.
(Is there any set of RISC-V cross-compilation tools on kernel.org somewhere?)

Could you please send a patch that defines those variants against Linus's tree, 
like the PowerPC patch that does something similar:

  0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs

?

... and I'll integrate it into the proper place to make it all bisectable, etc.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
  2018-05-06 12:11                           ` Ingo Molnar
@ 2018-05-07  1:04                             ` Boqun Feng
  -1 siblings, 0 replies; 103+ messages in thread
From: Boqun Feng @ 2018-05-07  1:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Mark Rutland, linux-arm-kernel, linux-kernel,
	aryabinin, catalin.marinas, dvyukov, will.deacon



On Sun, May 6, 2018, at 8:11 PM, Ingo Molnar wrote:
> 
> * Boqun Feng <boqun.feng@gmail.com> wrote:
> 
> > > The only change I made beyond a trivial build fix is that I also added the release 
> > > atomics variants explicitly:
> > > 
> > > +#define atomic_cmpxchg_release(v, o, n) \
> > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > +#define atomic64_cmpxchg_release(v, o, n) \
> > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > 
> > > It has passed a PowerPC cross-build test here, but no runtime tests.
> > > 
> > 
> > Do you have the commit at any branch in tip tree? I could pull it and
> > cross-build and check the assembly code of lib/atomic64_test.c, that way
> > I could verify whether we mess something up.
> > 
> > > Does this patch look good to you?
> > > 
> > 
> > Yep!
> 
> Great - I have pushed the commits out into the locking tree, they can be 
> found in:
> 
>   git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core
> 

Thanks! My compile test told me that we need to remove the definitions of 
atomic_xchg and atomic64_xchg in ppc's asm/atomic.h: they are now
duplicates, and will prevent the generation of _release and _acquire in the
new logic.
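
Roughly (a sketch of the offending pattern, not an exact quote of the ppc header):

	#define atomic_xchg(v, new)		(xchg(&((v)->counter), new))
	#define atomic_xchg_relaxed(v, new)	xchg_relaxed(&((v)->counter), new)

With atomic_xchg() already defined by the arch, the generic "# ifndef atomic_xchg"
group is skipped, so atomic_xchg_acquire() and atomic_xchg_release() are never
generated from atomic_xchg_relaxed(); dropping the full-barrier #define lets the
generic code build all three from the _relaxed variant.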

If you need an updated patch for this from me, I could send one later today.
(I don't have a handy environment for patch sending right now, so...)

Other than this, the modification looks fine, the lib/atomic64_test.c
generated the same asm before and after the patches.
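
For reference, one way to do that comparison (the toolchain prefix and output names
here are illustrative, and the tree needs to be configured with
CONFIG_ATOMIC64_SELFTEST enabled):

	make ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- lib/atomic64_test.o
	powerpc64le-linux-gnu-objdump -d lib/atomic64_test.o > after.dis
	# rebuild the same object on the base commit into before.dis, then:
	diff -u before.dis after.dis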

Regards,
Boqun

> The PowerPC preparatory commit from you is:
> 
>   0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
> 
> Thanks,
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [RFC PATCH] locking/atomics/x86/64: Clean up and fix details of <asm/atomic64_64.h>
  2018-05-05  9:32               ` Peter Zijlstra
@ 2018-05-07  6:43                 ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-07  6:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dmitry Vyukov, Mark Rutland, Linux ARM, LKML, Andrey Ryabinin,
	Boqun Feng, Catalin Marinas, Will Deacon, Thomas Gleixner


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Sat, May 05, 2018 at 11:05:51AM +0200, Dmitry Vyukov wrote:
> > On Sat, May 5, 2018 at 10:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > And I seriously hate this one:
> > >
> > >   ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")
> > >
> > > and will likely undo that the moment I need to change anything there.
> 
> > That was asked by Ingo:
> > https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/Xz1uVWaaAAAJ
> > 
> > I think in the end all of current options suck in one way or another,
> > so we are just going in circles.
> 
> Yeah, and I disagree with him, but didn't have the energy to fight at
> that time (and still don't really, I'm just complaining).

Ok, this negative side effect of the un-macro-ifying still bothers me, so I tried 
to do something about it:

  1 file changed, 81 insertions(+), 172 deletions(-)

That's actually better than what we had before ba1c9f83f633, which did:

  3 files changed, 147 insertions(+), 70 deletions(-)

( Also consider that my patch adds new DocBook comments and only touches the 
  64-bit side, so if we do the same to the 32-bit header as well we'll gain even 
  more. )

But I don't think you'll like my solution: it requires twice as wide terminals to 
look at those particular functions...

The trick is to merge the C functions into a single line: this makes it look a bit 
weird, but it's still very readable (because the functions are simple), and, most 
importantly, the _differences_ are now easily visible.

This is how that part looks on my terminal:

 ... do { } while (!arch_atomic64_try_cmpxchg(v, &val, val & i)); return val; }
 ... do { } while (!arch_atomic64_try_cmpxchg(v, &val, val | i)); return val; }
 ... do { } while (!arch_atomic64_try_cmpxchg(v, &val, val ^ i)); return val; }
 ...
 ... { asm(LOCK_PREFIX "orq  %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
 ... { asm(LOCK_PREFIX "xorq %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
 ... { asm(LOCK_PREFIX "andq %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }

Which makes the '&|^' and orq/xorq/andq differences easily visible. I kept all 
those long lines at the end of the header, to make them easy to skip,
unless someone wants to actively change or understand the code.

The prototype/parameters are visible even with col80 terminals.

But please give it a try and at least _look_ at the result in a double-wide 
terminal.

There's also a ton of other small fixes and enhancements I did to the header file 
- see the changelog.

But I won't commit this without your explicit Acked-by.

Thanks,

	Ingo

================>
From: Ingo Molnar <mingo@kernel.org>
Date: Mon, 7 May 2018 07:59:12 +0200
Subject: [PATCH] locking/atomics/x86/64: Clean up and fix details of <asm/atomic64_64.h>

PeterZ complained that the following macro elimination:

  ba1c9f83f633: locking/atomic/x86: Un-macro-ify atomic ops implementation

bloated the header and made it less readable/maintainable.

Try to undo some of that damage, without reintroducing a macro mess:

 - Merge the previously macro-generated C inline functions into single lines.
   While this looks weird on smaller terminals because of line wraps
   (and violates the col80 style rule brutally), it's surprisingly readable
   on larger terminals. An advantage with this format is that the repeated
   patterns are obviously visible, while the actual differences stick out
   very clearly. To me this is much more readable than the old macro solution.

While at it, also do a take-no-prisoners cleanup pass of all the code and comments
in this file:

 - Fix the DocBook comment of arch_atomic64_add_unless(), which incorrectly claimed:

     "Returns the old value of @v."

   In reality the function returns true if the operation was performed, or false otherwise.

 - Fix __always_inline use: where one member of a 'pair' of an operation has it,
   add it to both. There's no reason for them to ever deviate.

 - Fix/clarify the arch_atomic64_read() documentation: it not only reads but also
   returns the value.

 - Fix DocBook comments that referred to 'i' and 'v' instead of '@i' and '@v'.

 - Remove unnecessary parentheses from arch_atomic64_read() that were probably
   inherited from ancient macro versions.

 - Add DocBook description for arch_atomic64_[add|sub]_return() and
   arch_atomic64_fetch_[add|sub]().

 - Harmonize to a single variant of referring to atomic64_t pointers in
   comments, instead of these two variants:

      "@v: pointer of type atomic64_t"
      "@v: pointer to type atomic64_t"

   Use a single, shorter variant:

      "@v: pointer to atomic64_t"

   (Because the _t already implies that this is a type - no need to repeat that.)

 - Harmonize local variable naming, from a sometimes inconsistent selection of ad-hoc
   and sometimes cryptic variable names ('c', 'val', 'dec', 'val_old', etc.), to the
   following set of standardized local variable names:

     'i' for integer arguments
     'val_old' for old atomic values
     'val_new' for new atomic values

   From now on the name of the local variable is "obviously descriptive" of its
   role in the code.

 - Add newlines after local variable definitions and before return statements, consistently.

 - Change weird "@i: required value" phrase (all arguments to function calls are
   required, what does 'required' mean here?) to '@i: new value'

 - Add "Doesn't imply a write memory barrier" to arch_atomic64_set(), to
   mirror the read barrier comment of arch_atomic64_read().

 - Change the weird "or false for all other cases" phrase to the simpler "or false otherwise"
   formulation that is standard in DocBook comments.

 - Harmonize the punctuation of DocBook comments: detailed descriptions always end with a period.

 - Change the order of addition in the arch_atomic64_[add|sub]_return() code,
   to make it match the order in the description. (No change in functionality,
   addition is commutative.)

 - Remove unnecessary double newlines.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/atomic64_64.h | 253 ++++++++++++-------------------------
 1 file changed, 81 insertions(+), 172 deletions(-)

diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index 6106b59d3260..1c451871b391 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -12,22 +12,23 @@
 
 /**
  * arch_atomic64_read - read atomic64 variable
- * @v: pointer of type atomic64_t
+ * @v: pointer to atomic64_t
  *
- * Atomically reads the value of @v.
+ * Atomically reads and returns the value of @v.
  * Doesn't imply a read memory barrier.
  */
 static inline long arch_atomic64_read(const atomic64_t *v)
 {
-	return READ_ONCE((v)->counter);
+	return READ_ONCE(v->counter);
 }
 
 /**
  * arch_atomic64_set - set atomic64 variable
- * @v: pointer to type atomic64_t
- * @i: required value
+ * @v: pointer to atomic64_t
+ * @i: new value
  *
  * Atomically sets the value of @v to @i.
+ * Doesn't imply a write memory barrier.
  */
 static inline void arch_atomic64_set(atomic64_t *v, long i)
 {
@@ -35,41 +36,22 @@ static inline void arch_atomic64_set(atomic64_t *v, long i)
 }
 
 /**
- * arch_atomic64_add - add integer to atomic64 variable
- * @i: integer value to add
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[add|sub] - add|subtract integer to/from atomic64 variable
+ * @i: integer value to add/subtract
+ * @v: pointer to atomic64_t
  *
- * Atomically adds @i to @v.
+ * Atomically adds/subtracts @i to/from @v.
  */
-static __always_inline void arch_atomic64_add(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "addq %1,%0"
-		     : "=m" (v->counter)
-		     : "er" (i), "m" (v->counter));
-}
-
-/**
- * arch_atomic64_sub - subtract the atomic64 variable
- * @i: integer value to subtract
- * @v: pointer to type atomic64_t
- *
- * Atomically subtracts @i from @v.
- */
-static inline void arch_atomic64_sub(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "subq %1,%0"
-		     : "=m" (v->counter)
-		     : "er" (i), "m" (v->counter));
-}
+static __always_inline void arch_atomic64_sub(long i, atomic64_t *v) { asm(LOCK_PREFIX "subq %1,%0" : "=m" (v->counter) : "er" (i), "m" (v->counter)); }
+static __always_inline void arch_atomic64_add(long i, atomic64_t *v) { asm(LOCK_PREFIX "addq %1,%0" : "=m" (v->counter) : "er" (i), "m" (v->counter)); }
 
 /**
  * arch_atomic64_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
- * @v: pointer to type atomic64_t
+ * @v: pointer to atomic64_t
  *
  * Atomically subtracts @i from @v and returns
- * true if the result is zero, or false for all
- * other cases.
+ * true if the result is zero, or false otherwise.
  */
 static inline bool arch_atomic64_sub_and_test(long i, atomic64_t *v)
 {
@@ -77,133 +59,101 @@ static inline bool arch_atomic64_sub_and_test(long i, atomic64_t *v)
 }
 
 /**
- * arch_atomic64_inc - increment atomic64 variable
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[inc|dec] - increment/decrement atomic64 variable
+ * @v: pointer to atomic64_t
  *
- * Atomically increments @v by 1.
+ * Atomically increments/decrements @v by 1.
  */
-static __always_inline void arch_atomic64_inc(atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "incq %0"
-		     : "=m" (v->counter)
-		     : "m" (v->counter));
-}
+static __always_inline void arch_atomic64_inc(atomic64_t *v) { asm(LOCK_PREFIX "incq %0" : "=m" (v->counter) : "m" (v->counter)); }
+static __always_inline void arch_atomic64_dec(atomic64_t *v) { asm(LOCK_PREFIX "decq %0" : "=m" (v->counter) : "m" (v->counter)); }
 
 /**
- * arch_atomic64_dec - decrement atomic64 variable
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[inc|dec]_and_test - increment/decrement and test
+ * @v: pointer to atomic64_t
  *
- * Atomically decrements @v by 1.
+ * Atomically increments/decrements @v by 1 and
+ * returns true if the result is 0, or false otherwise.
  */
-static __always_inline void arch_atomic64_dec(atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "decq %0"
-		     : "=m" (v->counter)
-		     : "m" (v->counter));
-}
+static inline bool arch_atomic64_dec_and_test(atomic64_t *v) { GEN_UNARY_RMWcc(LOCK_PREFIX "decq", v->counter, "%0", e); }
+static inline bool arch_atomic64_inc_and_test(atomic64_t *v) { GEN_UNARY_RMWcc(LOCK_PREFIX "incq", v->counter, "%0", e); }
 
 /**
- * arch_atomic64_dec_and_test - decrement and test
- * @v: pointer to type atomic64_t
+ * arch_atomic64_add_negative - add and test if negative
+ * @i: integer value to add
+ * @v: pointer to atomic64_t
  *
- * Atomically decrements @v by 1 and
- * returns true if the result is 0, or false for all other
- * cases.
+ * Atomically adds @i to @v and returns true
+ * if the result is negative, or false otherwise.
  */
-static inline bool arch_atomic64_dec_and_test(atomic64_t *v)
+static inline bool arch_atomic64_add_negative(long i, atomic64_t *v)
 {
-	GEN_UNARY_RMWcc(LOCK_PREFIX "decq", v->counter, "%0", e);
+	GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, "er", i, "%0", s);
 }
 
 /**
- * arch_atomic64_inc_and_test - increment and test
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[add|sub]_return - add/subtract and return
+ * @i: integer value to add/subtract
+ * @v: pointer to atomic64_t
  *
- * Atomically increments @v by 1
- * and returns true if the result is zero, or false for all
- * other cases.
+ * Atomically adds/subtracts @i to/from @v and returns the new value of @v.
  */
-static inline bool arch_atomic64_inc_and_test(atomic64_t *v)
-{
-	GEN_UNARY_RMWcc(LOCK_PREFIX "incq", v->counter, "%0", e);
-}
+static __always_inline long arch_atomic64_add_return(long i, atomic64_t *v) { return xadd(&v->counter, +i) + i; }
+static __always_inline long arch_atomic64_sub_return(long i, atomic64_t *v) { return xadd(&v->counter, -i) - i; }
 
 /**
- * arch_atomic64_add_negative - add and test if negative
- * @i: integer value to add
- * @v: pointer to type atomic64_t
+ * arch_atomic64_fetch_[add|sub]_return - add/subtract and return old value
+ * @i: integer value to add/subtract
+ * @v: pointer to atomic64_t
  *
- * Atomically adds @i to @v and returns true
- * if the result is negative, or false when
- * result is greater than or equal to zero.
+ * Atomically adds/subtracts @i to/from @v and returns the old value of @v.
  */
-static inline bool arch_atomic64_add_negative(long i, atomic64_t *v)
-{
-	GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, "er", i, "%0", s);
-}
+static __always_inline long arch_atomic64_fetch_add (long i, atomic64_t *v) { return xadd(&v->counter, +i); }
+static __always_inline long arch_atomic64_fetch_sub (long i, atomic64_t *v) { return xadd(&v->counter, -i); }
 
 /**
- * arch_atomic64_add_return - add and return
- * @i: integer value to add
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[inc|dec]_return - increment/decrement and return
+ * @v: pointer to atomic64_t
  *
- * Atomically adds @i to @v and returns @i + @v
+ * Atomically increments/decrements @v and returns the new value of @v.
  */
-static __always_inline long arch_atomic64_add_return(long i, atomic64_t *v)
-{
-	return i + xadd(&v->counter, i);
-}
-
-static inline long arch_atomic64_sub_return(long i, atomic64_t *v)
-{
-	return arch_atomic64_add_return(-i, v);
-}
+#define arch_atomic64_inc_return(v)  arch_atomic64_add_return(1, (v))
+#define arch_atomic64_dec_return(v)  arch_atomic64_sub_return(1, (v))
 
-static inline long arch_atomic64_fetch_add(long i, atomic64_t *v)
+static inline long arch_atomic64_cmpxchg(atomic64_t *v, long val_old, long val_new)
 {
-	return xadd(&v->counter, i);
-}
-
-static inline long arch_atomic64_fetch_sub(long i, atomic64_t *v)
-{
-	return xadd(&v->counter, -i);
-}
-
-#define arch_atomic64_inc_return(v)  (arch_atomic64_add_return(1, (v)))
-#define arch_atomic64_dec_return(v)  (arch_atomic64_sub_return(1, (v)))
-
-static inline long arch_atomic64_cmpxchg(atomic64_t *v, long old, long new)
-{
-	return arch_cmpxchg(&v->counter, old, new);
+	return arch_cmpxchg(&v->counter, val_old, val_new);
 }
 
 #define arch_atomic64_try_cmpxchg arch_atomic64_try_cmpxchg
-static __always_inline bool arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, long new)
+
+static __always_inline bool arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *val_old, long val_new)
 {
-	return try_cmpxchg(&v->counter, old, new);
+	return try_cmpxchg(&v->counter, val_old, val_new);
 }
 
-static inline long arch_atomic64_xchg(atomic64_t *v, long new)
+static inline long arch_atomic64_xchg(atomic64_t *v, long val_new)
 {
-	return xchg(&v->counter, new);
+	return xchg(&v->counter, val_new);
 }
 
 /**
  * arch_atomic64_add_unless - add unless the number is a given value
- * @v: pointer of type atomic64_t
- * @a: the amount to add to v...
- * @u: ...unless v is equal to u.
+ * @v: pointer to atomic64_t
+ * @i: the amount to add to @v...
+ * @u: ...unless @v is equal to @u
  *
- * Atomically adds @a to @v, so long as it was not @u.
- * Returns the old value of @v.
+ * Atomically adds @i to @v, so long as @v was not @u.
+ * Returns true if the operation was performed, or false otherwise.
  */
-static inline bool arch_atomic64_add_unless(atomic64_t *v, long a, long u)
+static inline bool arch_atomic64_add_unless(atomic64_t *v, long i, long u)
 {
-	s64 c = arch_atomic64_read(v);
+	s64 val_old = arch_atomic64_read(v);
+
 	do {
-		if (unlikely(c == u))
+		if (unlikely(val_old == u))
 			return false;
-	} while (!arch_atomic64_try_cmpxchg(v, &c, c + a));
+	} while (!arch_atomic64_try_cmpxchg(v, &val_old, val_old + i));
+
 	return true;
 }
 
@@ -211,71 +161,30 @@ static inline bool arch_atomic64_add_unless(atomic64_t *v, long a, long u)
 
 /*
  * arch_atomic64_dec_if_positive - decrement by 1 if old value positive
- * @v: pointer of type atomic_t
+ * @v: pointer to type atomic_t
  *
  * The function returns the old value of *v minus 1, even if
- * the atomic variable, v, was not decremented.
+ * @v was not decremented.
  */
 static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
-	s64 dec, c = arch_atomic64_read(v);
-	do {
-		dec = c - 1;
-		if (unlikely(dec < 0))
-			break;
-	} while (!arch_atomic64_try_cmpxchg(v, &c, dec));
-	return dec;
-}
-
-static inline void arch_atomic64_and(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "andq %1,%0"
-			: "+m" (v->counter)
-			: "er" (i)
-			: "memory");
-}
-
-static inline long arch_atomic64_fetch_and(long i, atomic64_t *v)
-{
-	s64 val = arch_atomic64_read(v);
-
-	do {
-	} while (!arch_atomic64_try_cmpxchg(v, &val, val & i));
-	return val;
-}
-
-static inline void arch_atomic64_or(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "orq %1,%0"
-			: "+m" (v->counter)
-			: "er" (i)
-			: "memory");
-}
-
-static inline long arch_atomic64_fetch_or(long i, atomic64_t *v)
-{
-	s64 val = arch_atomic64_read(v);
+	s64 val_new, val = arch_atomic64_read(v);
 
 	do {
-	} while (!arch_atomic64_try_cmpxchg(v, &val, val | i));
-	return val;
-}
+		val_new = val - 1;
+		if (unlikely(val_new < 0))
+			break;
+	} while (!arch_atomic64_try_cmpxchg(v, &val, val_new));
 
-static inline void arch_atomic64_xor(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "xorq %1,%0"
-			: "+m" (v->counter)
-			: "er" (i)
-			: "memory");
+	return val_new;
 }
 
-static inline long arch_atomic64_fetch_xor(long i, atomic64_t *v)
-{
-	s64 val = arch_atomic64_read(v);
+static inline long arch_atomic64_fetch_and(long i, atomic64_t *v) { s64 val = arch_atomic64_read(v); do { } while (!arch_atomic64_try_cmpxchg(v, &val, val & i)); return val; }
+static inline long arch_atomic64_fetch_or (long i, atomic64_t *v) { s64 val = arch_atomic64_read(v); do { } while (!arch_atomic64_try_cmpxchg(v, &val, val | i)); return val; }
+static inline long arch_atomic64_fetch_xor(long i, atomic64_t *v) { s64 val = arch_atomic64_read(v); do { } while (!arch_atomic64_try_cmpxchg(v, &val, val ^ i)); return val; }
 
-	do {
-	} while (!arch_atomic64_try_cmpxchg(v, &val, val ^ i));
-	return val;
-}
+static inline void arch_atomic64_or (long i, atomic64_t *v) { asm(LOCK_PREFIX "orq  %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
+static inline void arch_atomic64_xor(long i, atomic64_t *v) { asm(LOCK_PREFIX "xorq %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
+static inline void arch_atomic64_and(long i, atomic64_t *v) { asm(LOCK_PREFIX "andq %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
 
 #endif /* _ASM_X86_ATOMIC64_64_H */

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [RFC PATCH] locking/atomics/x86/64: Clean up and fix details of <asm/atomic64_64.h>
@ 2018-05-07  6:43                 ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-07  6:43 UTC (permalink / raw)
  To: linux-arm-kernel


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Sat, May 05, 2018 at 11:05:51AM +0200, Dmitry Vyukov wrote:
> > On Sat, May 5, 2018 at 10:47 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > And I seriously hate this one:
> > >
> > >   ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation")
> > >
> > > and will likely undo that the moment I need to change anything there.
> 
> > That was asked by Ingo:
> > https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/Xz1uVWaaAAAJ
> > 
> > I think in the end all of current options suck in one way or another,
> > so we are just going in circles.
> 
> Yeah, and I disagree with him, but didn't have the energy to fight at
> that time (and still don't really, I'm just complaining).

Ok, this negative side effect of the un-macro-ifying still bothers me, so I tried 
to do something about it:

  1 file changed, 81 insertions(+), 172 deletions(-)

That's actually better than what we had before ba1c9f83f633, which did:

  3 files changed, 147 insertions(+), 70 deletions(-)

( Also consider that my patch adds new DocBook comments and only touches the 
  64-bit side, so if we do the same to the 32-bit header as well we'll gain even 
  more. )

But I don't think you'll like my solution: it requires twice as wide terminals to 
look at those particular functions...

The trick is to merge the C functions into a single line: this makes it look a bit 
weird, but it's still very readable (because the functions are simple), and, most 
importantly, the _differences_ are now easily visible.

This is how that part looks on my terminal:

 ... do { } while (!arch_atomic64_try_cmpxchg(v, &val, val & i)); return val; }
 ... do { } while (!arch_atomic64_try_cmpxchg(v, &val, val | i)); return val; }
 ... do { } while (!arch_atomic64_try_cmpxchg(v, &val, val ^ i)); return val; }
 ...
 ... { asm(LOCK_PREFIX "orq  %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
 ... { asm(LOCK_PREFIX "xorq %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
 ... { asm(LOCK_PREFIX "andq %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }

Which makes the '&|^' and orq/xorq/andq differences easily visible. I kept all 
those long lines at the end of the header, to make them easy to skip,
unless someone wants to actively change or understand the code.

The prototype/parameters are visible even with col80 terminals.

But please give it a try and at least _look_ at the result in a double-wide 
terminal.

There's also a ton of other small fixes and enhancements I did to the header file 
- see the changelog.

But I won't commit this without your explicit Acked-by.

Thanks,

	Ingo

================>
From: Ingo Molnar <mingo@kernel.org>
Date: Mon, 7 May 2018 07:59:12 +0200
Subject: [PATCH] locking/atomics/x86/64: Clean up and fix details of <asm/atomic64_64.h>

PeterZ complained that the following macro elimination:

  ba1c9f83f633: locking/atomic/x86: Un-macro-ify atomic ops implementation

bloated the header and made it less readable/maintainable.

Try to undo some of that damage, without reintroducing a macro mess:

 - Merge the previously macro-generated C inline functions into single lines.
   While this looks weird on smaller terminals because of line wraps
   (and violates the col80 style rule brutally), it's surprisingly readable
   on larger terminals. An advantage with this format is that the repeated
   patterns are obviously visible, while the actual differences stick out
   very clearly. To me this is much more readable than the old macro solution.

While at it, also do a take-no-prisoners cleanup pass of all the code and comments
in this file:

 - Fix the DocBook comment of arch_atomic64_add_unless(), which incorrectly claimed:

     "Returns the old value of @v."

   In reality the function returns true if the operation was performed, or false otherwise.

 - Fix __always_inline use: where one member of a 'pair' of an operation has it,
   add it to both. There's no reason for them to ever deviate.

 - Fix/clarify the arch_atomic64_read() documentation: it not only reads but also
   returns the value.

 - Fix DocBook comments that referred to 'i' and 'v' instead of '@i' and '@v'.

 - Remove unnecessary parentheses from arch_atomic64_read() that were probably
   inherited from ancient macro versions.

 - Add DocBook description for arch_atomic64_[add|sub]_return() and
   arch_atomic64_fetch_[add|sub]().

 - Harmonize to a single variant of referring to atomic64_t pointers in
   comments, instead of these two variants:

      "@v: pointer of type atomic64_t"
      "@v: pointer to type atomic64_t"

   Use a single, shorter variant:

      "@v: pointer to atomic64_t"

   (Because the _t already implies that this is a type - no need to repeat that.)

 - Harmonize local variable naming, from a sometimes inconsistent selection of ad-hoc
   and sometimes cryptic variable names ('c', 'val', 'dec', 'val_old', etc.), to the
   following set of standardized local variable names:

     'i' for integer arguments
     'val_old' for old atomic values
     'val_new' for new atomic values

   From now on the name of the local variable is "obviously descriptive" of its
   role in the code.

 - Add newlines after local variable definitions and before return statements, consistently.

 - Change weird "@i: required value" phrase (all arguments to function calls are
   required, what does 'required' mean here?) to '@i: new value'

 - Add "Doesn't imply a write memory barrier" to arch_atomic64_set(), to
   mirror the read barrier comment of arch_atomic64_read().

 - Change the weird "or false for all other cases" phrase to the simpler "or false otherwise"
   formulation that is standard in DocBook comments.

 - Harmonize the punctuation of DocBook comments: detailed descriptions always end with a period.

 - Change the order of addition in the arch_atomic64_[add|sub]_return() code,
   to make it match the order in the description. (No change in functionality,
   addition is commutative.)

 - Remove unnecessary double newlines.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/atomic64_64.h | 253 ++++++++++++-------------------------
 1 file changed, 81 insertions(+), 172 deletions(-)

diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index 6106b59d3260..1c451871b391 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -12,22 +12,23 @@
 
 /**
  * arch_atomic64_read - read atomic64 variable
- * @v: pointer of type atomic64_t
+ * @v: pointer to atomic64_t
  *
- * Atomically reads the value of @v.
+ * Atomically reads and returns the value of @v.
  * Doesn't imply a read memory barrier.
  */
 static inline long arch_atomic64_read(const atomic64_t *v)
 {
-	return READ_ONCE((v)->counter);
+	return READ_ONCE(v->counter);
 }
 
 /**
  * arch_atomic64_set - set atomic64 variable
- * @v: pointer to type atomic64_t
- * @i: required value
+ * @v: pointer to atomic64_t
+ * @i: new value
  *
  * Atomically sets the value of @v to @i.
+ * Doesn't imply a write memory barrier.
  */
 static inline void arch_atomic64_set(atomic64_t *v, long i)
 {
@@ -35,41 +36,22 @@ static inline void arch_atomic64_set(atomic64_t *v, long i)
 }
 
 /**
- * arch_atomic64_add - add integer to atomic64 variable
- * @i: integer value to add
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[add|sub] - add|subtract integer to/from atomic64 variable
+ * @i: integer value to add/subtract
+ * @v: pointer to atomic64_t
  *
- * Atomically adds @i to @v.
+ * Atomically adds/subtracts @i to/from @v.
  */
-static __always_inline void arch_atomic64_add(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "addq %1,%0"
-		     : "=m" (v->counter)
-		     : "er" (i), "m" (v->counter));
-}
-
-/**
- * arch_atomic64_sub - subtract the atomic64 variable
- * @i: integer value to subtract
- * @v: pointer to type atomic64_t
- *
- * Atomically subtracts @i from @v.
- */
-static inline void arch_atomic64_sub(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "subq %1,%0"
-		     : "=m" (v->counter)
-		     : "er" (i), "m" (v->counter));
-}
+static __always_inline void arch_atomic64_sub(long i, atomic64_t *v) { asm(LOCK_PREFIX "subq %1,%0" : "=m" (v->counter) : "er" (i), "m" (v->counter)); }
+static __always_inline void arch_atomic64_add(long i, atomic64_t *v) { asm(LOCK_PREFIX "addq %1,%0" : "=m" (v->counter) : "er" (i), "m" (v->counter)); }
 
 /**
  * arch_atomic64_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
- * @v: pointer to type atomic64_t
+ * @v: pointer to atomic64_t
  *
  * Atomically subtracts @i from @v and returns
- * true if the result is zero, or false for all
- * other cases.
+ * true if the result is zero, or false otherwise.
  */
 static inline bool arch_atomic64_sub_and_test(long i, atomic64_t *v)
 {
@@ -77,133 +59,101 @@ static inline bool arch_atomic64_sub_and_test(long i, atomic64_t *v)
 }
 
 /**
- * arch_atomic64_inc - increment atomic64 variable
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[inc|dec] - increment/decrement atomic64 variable
+ * @v: pointer to atomic64_t
  *
- * Atomically increments @v by 1.
+ * Atomically increments/decrements @v by 1.
  */
-static __always_inline void arch_atomic64_inc(atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "incq %0"
-		     : "=m" (v->counter)
-		     : "m" (v->counter));
-}
+static __always_inline void arch_atomic64_inc(atomic64_t *v) { asm(LOCK_PREFIX "incq %0" : "=m" (v->counter) : "m" (v->counter)); }
+static __always_inline void arch_atomic64_dec(atomic64_t *v) { asm(LOCK_PREFIX "decq %0" : "=m" (v->counter) : "m" (v->counter)); }
 
 /**
- * arch_atomic64_dec - decrement atomic64 variable
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[inc|dec]_and_test - increment/decrement and test
+ * @v: pointer to atomic64_t
  *
- * Atomically decrements @v by 1.
+ * Atomically increments/decrements @v by 1 and
+ * returns true if the result is 0, or false otherwise.
  */
-static __always_inline void arch_atomic64_dec(atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "decq %0"
-		     : "=m" (v->counter)
-		     : "m" (v->counter));
-}
+static inline bool arch_atomic64_dec_and_test(atomic64_t *v) { GEN_UNARY_RMWcc(LOCK_PREFIX "decq", v->counter, "%0", e); }
+static inline bool arch_atomic64_inc_and_test(atomic64_t *v) { GEN_UNARY_RMWcc(LOCK_PREFIX "incq", v->counter, "%0", e); }
 
 /**
- * arch_atomic64_dec_and_test - decrement and test
- * @v: pointer to type atomic64_t
+ * arch_atomic64_add_negative - add and test if negative
+ * @i: integer value to add
+ * @v: pointer to atomic64_t
  *
- * Atomically decrements @v by 1 and
- * returns true if the result is 0, or false for all other
- * cases.
+ * Atomically adds @i to @v and returns true
+ * if the result is negative, or false otherwise.
  */
-static inline bool arch_atomic64_dec_and_test(atomic64_t *v)
+static inline bool arch_atomic64_add_negative(long i, atomic64_t *v)
 {
-	GEN_UNARY_RMWcc(LOCK_PREFIX "decq", v->counter, "%0", e);
+	GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, "er", i, "%0", s);
 }
 
 /**
- * arch_atomic64_inc_and_test - increment and test
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[add|sub]_return - add/subtract and return
+ * @i: integer value to add/subtract
+ * @v: pointer to atomic64_t
  *
- * Atomically increments @v by 1
- * and returns true if the result is zero, or false for all
- * other cases.
+ * Atomically adds/subtracts @i to/from @v and returns the new value of @v.
  */
-static inline bool arch_atomic64_inc_and_test(atomic64_t *v)
-{
-	GEN_UNARY_RMWcc(LOCK_PREFIX "incq", v->counter, "%0", e);
-}
+static __always_inline long arch_atomic64_add_return(long i, atomic64_t *v) { return xadd(&v->counter, +i) + i; }
+static __always_inline long arch_atomic64_sub_return(long i, atomic64_t *v) { return xadd(&v->counter, -i) - i; }
 
 /**
- * arch_atomic64_add_negative - add and test if negative
- * @i: integer value to add
- * @v: pointer to type atomic64_t
+ * arch_atomic64_fetch_[add|sub]_return - add/subtract and return old value
+ * @i: integer value to add/subtract
+ * @v: pointer to atomic64_t
  *
- * Atomically adds @i to @v and returns true
- * if the result is negative, or false when
- * result is greater than or equal to zero.
+ * Atomically adds/subtracts @i to/from @v and returns the old value of @v.
  */
-static inline bool arch_atomic64_add_negative(long i, atomic64_t *v)
-{
-	GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, "er", i, "%0", s);
-}
+static __always_inline long arch_atomic64_fetch_add (long i, atomic64_t *v) { return xadd(&v->counter, +i); }
+static __always_inline long arch_atomic64_fetch_sub (long i, atomic64_t *v) { return xadd(&v->counter, -i); }
 
 /**
- * arch_atomic64_add_return - add and return
- * @i: integer value to add
- * @v: pointer to type atomic64_t
+ * arch_atomic64_[inc|dec]_return - increment/decrement and return
+ * @v: pointer to atomic64_t
  *
- * Atomically adds @i to @v and returns @i + @v
+ * Atomically increments/decrements @v and returns the new value of @v.
  */
-static __always_inline long arch_atomic64_add_return(long i, atomic64_t *v)
-{
-	return i + xadd(&v->counter, i);
-}
-
-static inline long arch_atomic64_sub_return(long i, atomic64_t *v)
-{
-	return arch_atomic64_add_return(-i, v);
-}
+#define arch_atomic64_inc_return(v)  arch_atomic64_add_return(1, (v))
+#define arch_atomic64_dec_return(v)  arch_atomic64_sub_return(1, (v))
 
-static inline long arch_atomic64_fetch_add(long i, atomic64_t *v)
+static inline long arch_atomic64_cmpxchg(atomic64_t *v, long val_old, long val_new)
 {
-	return xadd(&v->counter, i);
-}
-
-static inline long arch_atomic64_fetch_sub(long i, atomic64_t *v)
-{
-	return xadd(&v->counter, -i);
-}
-
-#define arch_atomic64_inc_return(v)  (arch_atomic64_add_return(1, (v)))
-#define arch_atomic64_dec_return(v)  (arch_atomic64_sub_return(1, (v)))
-
-static inline long arch_atomic64_cmpxchg(atomic64_t *v, long old, long new)
-{
-	return arch_cmpxchg(&v->counter, old, new);
+	return arch_cmpxchg(&v->counter, val_old, val_new);
 }
 
 #define arch_atomic64_try_cmpxchg arch_atomic64_try_cmpxchg
-static __always_inline bool arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, long new)
+
+static __always_inline bool arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *val_old, long val_new)
 {
-	return try_cmpxchg(&v->counter, old, new);
+	return try_cmpxchg(&v->counter, val_old, val_new);
 }
 
-static inline long arch_atomic64_xchg(atomic64_t *v, long new)
+static inline long arch_atomic64_xchg(atomic64_t *v, long val_new)
 {
-	return xchg(&v->counter, new);
+	return xchg(&v->counter, val_new);
 }
 
 /**
  * arch_atomic64_add_unless - add unless the number is a given value
- * @v: pointer of type atomic64_t
- * @a: the amount to add to v...
- * @u: ...unless v is equal to u.
+ * @v: pointer to atomic64_t
+ * @i: the amount to add to @v...
+ * @u: ...unless @v is equal to @u
  *
- * Atomically adds @a to @v, so long as it was not @u.
- * Returns the old value of @v.
+ * Atomically adds @i to @v, so long as @v was not @u.
+ * Returns true if the operation was performed, or false otherwise.
  */
-static inline bool arch_atomic64_add_unless(atomic64_t *v, long a, long u)
+static inline bool arch_atomic64_add_unless(atomic64_t *v, long i, long u)
 {
-	s64 c = arch_atomic64_read(v);
+	s64 val_old = arch_atomic64_read(v);
+
 	do {
-		if (unlikely(c == u))
+		if (unlikely(val_old == u))
 			return false;
-	} while (!arch_atomic64_try_cmpxchg(v, &c, c + a));
+	} while (!arch_atomic64_try_cmpxchg(v, &val_old, val_old + i));
+
 	return true;
 }
 
@@ -211,71 +161,30 @@ static inline bool arch_atomic64_add_unless(atomic64_t *v, long a, long u)
 
 /*
  * arch_atomic64_dec_if_positive - decrement by 1 if old value positive
- * @v: pointer of type atomic_t
+ * @v: pointer to type atomic_t
  *
  * The function returns the old value of *v minus 1, even if
- * the atomic variable, v, was not decremented.
+ * @v was not decremented.
  */
 static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
-	s64 dec, c = arch_atomic64_read(v);
-	do {
-		dec = c - 1;
-		if (unlikely(dec < 0))
-			break;
-	} while (!arch_atomic64_try_cmpxchg(v, &c, dec));
-	return dec;
-}
-
-static inline void arch_atomic64_and(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "andq %1,%0"
-			: "+m" (v->counter)
-			: "er" (i)
-			: "memory");
-}
-
-static inline long arch_atomic64_fetch_and(long i, atomic64_t *v)
-{
-	s64 val = arch_atomic64_read(v);
-
-	do {
-	} while (!arch_atomic64_try_cmpxchg(v, &val, val & i));
-	return val;
-}
-
-static inline void arch_atomic64_or(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "orq %1,%0"
-			: "+m" (v->counter)
-			: "er" (i)
-			: "memory");
-}
-
-static inline long arch_atomic64_fetch_or(long i, atomic64_t *v)
-{
-	s64 val = arch_atomic64_read(v);
+	s64 val_new, val = arch_atomic64_read(v);
 
 	do {
-	} while (!arch_atomic64_try_cmpxchg(v, &val, val | i));
-	return val;
-}
+		val_new = val - 1;
+		if (unlikely(val_new < 0))
+			break;
+	} while (!arch_atomic64_try_cmpxchg(v, &val, val_new));
 
-static inline void arch_atomic64_xor(long i, atomic64_t *v)
-{
-	asm volatile(LOCK_PREFIX "xorq %1,%0"
-			: "+m" (v->counter)
-			: "er" (i)
-			: "memory");
+	return val_new;
 }
 
-static inline long arch_atomic64_fetch_xor(long i, atomic64_t *v)
-{
-	s64 val = arch_atomic64_read(v);
+static inline long arch_atomic64_fetch_and(long i, atomic64_t *v) { s64 val = arch_atomic64_read(v); do { } while (!arch_atomic64_try_cmpxchg(v, &val, val & i)); return val; }
+static inline long arch_atomic64_fetch_or (long i, atomic64_t *v) { s64 val = arch_atomic64_read(v); do { } while (!arch_atomic64_try_cmpxchg(v, &val, val | i)); return val; }
+static inline long arch_atomic64_fetch_xor(long i, atomic64_t *v) { s64 val = arch_atomic64_read(v); do { } while (!arch_atomic64_try_cmpxchg(v, &val, val ^ i)); return val; }
 
-	do {
-	} while (!arch_atomic64_try_cmpxchg(v, &val, val ^ i));
-	return val;
-}
+static inline void arch_atomic64_or (long i, atomic64_t *v) { asm(LOCK_PREFIX "orq  %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
+static inline void arch_atomic64_xor(long i, atomic64_t *v) { asm(LOCK_PREFIX "xorq %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
+static inline void arch_atomic64_and(long i, atomic64_t *v) { asm(LOCK_PREFIX "andq %1,%0" : "+m" (v->counter) : "er" (i) : "memory"); }
 
 #endif /* _ASM_X86_ATOMIC64_64_H */

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
  2018-05-07  1:04                             ` Boqun Feng
@ 2018-05-07  6:50                               ` Ingo Molnar
  -1 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-07  6:50 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Peter Zijlstra, Mark Rutland, linux-arm-kernel, linux-kernel,
	aryabinin, catalin.marinas, dvyukov, will.deacon


* Boqun Feng <boqun.feng@gmail.com> wrote:

> 
> 
> On Sun, May 6, 2018, at 8:11 PM, Ingo Molnar wrote:
> > 
> > * Boqun Feng <boqun.feng@gmail.com> wrote:
> > 
> > > > The only change I made beyond a trivial build fix is that I also added the release 
> > > > atomics variants explicitly:
> > > > 
> > > > +#define atomic_cmpxchg_release(v, o, n) \
> > > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > > +#define atomic64_cmpxchg_release(v, o, n) \
> > > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > > 
> > > > It has passed a PowerPC cross-build test here, but no runtime tests.
> > > > 
> > > 
> > > Do you have the commit at any branch in tip tree? I could pull it and
> > > cross-build and check the assembly code of lib/atomic64_test.c, that way
> > > I could verify whether we mess something up.
> > > 
> > > > Does this patch look good to you?
> > > > 
> > > 
> > > Yep!
> > 
> > Great - I have pushed the commits out into the locking tree, they can be 
> > found in:
> > 
> >   git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> > locking/core
> > 
> 
> Thanks! My compile test told me that we need to remove the definitions of 
> atomic_xchg and atomic64_xchg in ppc's asm/atomic.h: they are now
> duplicates, and will prevent the generation of _release and _acquire in the
> new logic.
> 
> If you need an updated patch for this from me, I could send it later today.
> (I don't have a handy environment for patch sending now, so...)

That would be cool, thanks! My own cross-build testing didn't trigger that build 
failure.
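
A minimal sketch of the failure mode being described (illustrative only, modelled
on the regrouped fallback shape shown earlier in the thread, here applied to xchg):

  /* Sketch: secondary branch after the regrouping, applied to xchg */
  #ifdef atomic_xchg_relaxed
  # ifndef atomic_xchg
  #  define atomic_xchg(...)		__atomic_op_fence(atomic_xchg, __VA_ARGS__)
  #  define atomic_xchg_acquire(...)	__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
  #  define atomic_xchg_release(...)	__atomic_op_release(atomic_xchg, __VA_ARGS__)
  # endif
  #endif

With ppc defining both atomic_xchg() and atomic_xchg_relaxed(), the inner #ifndef
never fires, so the _acquire/_release variants are left undefined, hence the need
to drop the duplicate definitions.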

> Other than this, the modification looks fine, the lib/atomic64_test.c
> generated the same asm before and after the patches.

Cool, thanks for checking!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
@ 2018-05-07  6:50                               ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-07  6:50 UTC (permalink / raw)
  To: linux-arm-kernel


* Boqun Feng <boqun.feng@gmail.com> wrote:

> 
> 
> On Sun, May 6, 2018, at 8:11 PM, Ingo Molnar wrote:
> > 
> > * Boqun Feng <boqun.feng@gmail.com> wrote:
> > 
> > > > The only change I made beyond a trivial build fix is that I also added the release 
> > > > atomics variants explicitly:
> > > > 
> > > > +#define atomic_cmpxchg_release(v, o, n) \
> > > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > > +#define atomic64_cmpxchg_release(v, o, n) \
> > > > +	cmpxchg_release(&((v)->counter), (o), (n))
> > > > 
> > > > It has passed a PowerPC cross-build test here, but no runtime tests.
> > > > 
> > > 
> > > Do you have the commit at any branch in tip tree? I could pull it and
> > > cross-build and check the assembly code of lib/atomic64_test.c, that way
> > > I could verify whether we mess something up.
> > > 
> > > > Does this patch look good to you?
> > > > 
> > > 
> > > Yep!
> > 
> > Great - I have pushed the commits out into the locking tree, they can be 
> > found in:
> > 
> >   git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> > locking/core
> > 
> 
> Thanks! My compile test told me that we need to remove the definitions of 
> atomic_xchg and atomic64_xchg in ppc's asm/atomic.h: they are now
> duplicates, and will prevent the generation of _release and _acquire in the
> new logic.
> 
> If you need an updated patch for this from me, I could send it later today.
> (I don't have a handy environment for patch sending now, so...)

That would be cool, thanks! My own cross-build testing didn't trigger that build 
failure.

> Other than this, the modification looks fine, the lib/atomic64_test.c
> generated the same asm before and after the patches.

Cool, thanks for checking!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-06 14:57               ` Ingo Molnar
@ 2018-05-07  9:54                 ` Andrea Parri
  -1 siblings, 0 replies; 103+ messages in thread
From: Andrea Parri @ 2018-05-07  9:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, Peter Zijlstra, linux-arm-kernel, linux-kernel,
	aryabinin, boqun.feng, catalin.marinas, dvyukov, will.deacon,
	Linus Torvalds, Andrew Morton, Paul E. McKenney, Peter Zijlstra,
	Thomas Gleixner, Palmer Dabbelt, Albert Ou,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman

On Sun, May 06, 2018 at 04:57:27PM +0200, Ingo Molnar wrote:
> 
> * Andrea Parri <andrea.parri@amarulasolutions.com> wrote:
> 
> > Hi Ingo,
> > 
> > > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
> > > From: Ingo Molnar <mingo@kernel.org>
> > > Date: Sat, 5 May 2018 10:23:23 +0200
> > > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
> > > 
> > > Before:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec_acquire
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec_release
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > > 
> > > After:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > > 
> > > The idea is that because we already group these APIs by certain defines
> > > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
> > > branches - we can do the same in the secondary branch as well.
> > > 
> > > ( Also remove some unnecessarily duplicate comments, as the API
> > >   group defines are now pretty much self-documenting. )
> > > 
> > > No change in functionality.
> > > 
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Cc: Will Deacon <will.deacon@arm.com>
> > > Cc: linux-kernel@vger.kernel.org
> > > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > 
> > This breaks compilation on RISC-V. (For some of its atomics, the arch
> > currently defines the _relaxed and the full variants and it relies on
> > the generic definitions for the _acquire and the _release variants.)
> 
> I don't have cross-compilation for RISC-V, which is a relatively new arch.
> (Is there any RISC-V set of cross-compilation tools on kernel.org somewhere?)

I'm using the toolchain from:

  https://riscv.org/software-tools/

(adding Palmer and Albert in Cc:)


> 
> Could you please send a patch that defines those variants against Linus's tree, 
> like the PowerPC patch that does something similar:
> 
>   0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
> 
> ?

Yes, please see below for a first RFC.

(BTW, get_maintainer.pl says that that patch missed Benjamin, Paul, Michael
 and linuxppc-dev@lists.ozlabs.org: FWIW, I'm Cc-ing the maintainers here.)

  Andrea


From 411f05a44e0b53a435331b977ff864fba7501a95 Mon Sep 17 00:00:00 2001
From: Andrea Parri <andrea.parri@amarulasolutions.com>
Date: Mon, 7 May 2018 10:59:20 +0200
Subject: [RFC PATCH] riscv/atomic: Defines _acquire/_release variants

In preparation for Ingo's renovation of the generic atomic.h header [1],
define the _acquire/_release variants in the arch's header.

No change in code generation.

[1] http://lkml.kernel.org/r/20180505081100.nsyrqrpzq2vd27bk@gmail.com
    http://lkml.kernel.org/r/20180505083635.622xmcvb42dw5xxh@gmail.com

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <albert@sifive.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: linux-riscv@lists.infradead.org
---
 arch/riscv/include/asm/atomic.h | 88 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 855115ace98c8..7cbd8033dfb5d 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -153,22 +153,54 @@ ATOMIC_OPS(sub, add, +, -i)
 
 #define atomic_add_return_relaxed	atomic_add_return_relaxed
 #define atomic_sub_return_relaxed	atomic_sub_return_relaxed
+#define atomic_add_return_acquire(...)					\
+	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
+#define atomic_sub_return_acquire(...)					\
+	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
+#define atomic_add_return_release(...)					\
+	__atomic_op_release(atomic_add_return, __VA_ARGS__)
+#define atomic_sub_return_release(...)					\
+	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
 #define atomic_add_return		atomic_add_return
 #define atomic_sub_return		atomic_sub_return
 
 #define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
 #define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+#define atomic_fetch_add_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+#define atomic_fetch_sub_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#define atomic_fetch_add_release(...)					\
+	__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+#define atomic_fetch_sub_release(...)					\
+	__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
 #define atomic_fetch_add		atomic_fetch_add
 #define atomic_fetch_sub		atomic_fetch_sub
 
 #ifndef CONFIG_GENERIC_ATOMIC64
 #define atomic64_add_return_relaxed	atomic64_add_return_relaxed
 #define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
+#define atomic64_add_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+#define atomic64_sub_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+#define atomic64_add_return_release(...)				\
+	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+#define atomic64_sub_return_release(...)				\
+	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
 #define atomic64_add_return		atomic64_add_return
 #define atomic64_sub_return		atomic64_sub_return
 
 #define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
 #define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+#define atomic64_fetch_add_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#define atomic64_fetch_sub_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#define atomic64_fetch_add_release(...)					\
+	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+#define atomic64_fetch_sub_release(...)					\
+	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
 #define atomic64_fetch_add		atomic64_fetch_add
 #define atomic64_fetch_sub		atomic64_fetch_sub
 #endif
@@ -191,6 +223,18 @@ ATOMIC_OPS(xor, xor, i)
 #define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
 #define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
 #define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
+#define atomic_fetch_and_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+#define atomic_fetch_or_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+#define atomic_fetch_xor_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#define atomic_fetch_and_release(...)					\
+	__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+#define atomic_fetch_or_release(...)					\
+	__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+#define atomic_fetch_xor_release(...)					\
+	__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
 #define atomic_fetch_and		atomic_fetch_and
 #define atomic_fetch_or			atomic_fetch_or
 #define atomic_fetch_xor		atomic_fetch_xor
@@ -199,6 +243,18 @@ ATOMIC_OPS(xor, xor, i)
 #define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
 #define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
 #define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
+#define atomic64_fetch_and_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#define atomic64_fetch_or_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#define atomic64_fetch_xor_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+#define atomic64_fetch_and_release(...)					\
+	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+#define atomic64_fetch_or_release(...)					\
+	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+#define atomic64_fetch_xor_release(...)					\
+	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
 #define atomic64_fetch_and		atomic64_fetch_and
 #define atomic64_fetch_or		atomic64_fetch_or
 #define atomic64_fetch_xor		atomic64_fetch_xor
@@ -290,22 +346,54 @@ ATOMIC_OPS(dec, add, +, -1)
 
 #define atomic_inc_return_relaxed	atomic_inc_return_relaxed
 #define atomic_dec_return_relaxed	atomic_dec_return_relaxed
+#define atomic_inc_return_acquire(...)					\
+	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
+#define atomic_dec_return_acquire(...)					\
+	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
+#define atomic_inc_return_release(...)					\
+	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
+#define atomic_dec_return_release(...)					\
+	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
 #define atomic_inc_return		atomic_inc_return
 #define atomic_dec_return		atomic_dec_return
 
 #define atomic_fetch_inc_relaxed	atomic_fetch_inc_relaxed
 #define atomic_fetch_dec_relaxed	atomic_fetch_dec_relaxed
+#define atomic_fetch_inc_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
+#define atomic_fetch_dec_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
+#define atomic_fetch_inc_release(...)					\
+	__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
+#define atomic_fetch_dec_release(...)					\
+	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 #define atomic_fetch_inc		atomic_fetch_inc
 #define atomic_fetch_dec		atomic_fetch_dec
 
 #ifndef CONFIG_GENERIC_ATOMIC64
 #define atomic64_inc_return_relaxed	atomic64_inc_return_relaxed
 #define atomic64_dec_return_relaxed	atomic64_dec_return_relaxed
+#define atomic64_inc_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+#define atomic64_dec_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+#define atomic64_inc_return_release(...)				\
+	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+#define atomic64_dec_return_release(...)				\
+	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
 #define atomic64_inc_return		atomic64_inc_return
 #define atomic64_dec_return		atomic64_dec_return
 
 #define atomic64_fetch_inc_relaxed	atomic64_fetch_inc_relaxed
 #define atomic64_fetch_dec_relaxed	atomic64_fetch_dec_relaxed
+#define atomic64_fetch_inc_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+#define atomic64_fetch_dec_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+#define atomic64_fetch_inc_release(...)					\
+	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
+#define atomic64_fetch_dec_release(...)					\
+	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
 #define atomic64_fetch_inc		atomic64_fetch_inc
 #define atomic64_fetch_dec		atomic64_fetch_dec
 #endif
-- 
2.7.4



> 
> ... and I'll integrate it into the proper place to make it all bisectable, etc.
> 
> Thanks,
> 
> 	Ingo

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
@ 2018-05-07  9:54                 ` Andrea Parri
  0 siblings, 0 replies; 103+ messages in thread
From: Andrea Parri @ 2018-05-07  9:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, May 06, 2018 at 04:57:27PM +0200, Ingo Molnar wrote:
> 
> * Andrea Parri <andrea.parri@amarulasolutions.com> wrote:
> 
> > Hi Ingo,
> > 
> > > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
> > > From: Ingo Molnar <mingo@kernel.org>
> > > Date: Sat, 5 May 2018 10:23:23 +0200
> > > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
> > > 
> > > Before:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec_acquire
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec_release
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > > 
> > > After:
> > > 
> > >  #ifndef atomic_fetch_dec_relaxed
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> > >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> > >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> > >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> > >  # else
> > >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> > >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> > >  # endif
> > >  #else
> > >  # ifndef atomic_fetch_dec
> > >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> > >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> > >  # endif
> > >  #endif
> > > 
> > > The idea is that because we already group these APIs by certain defines
> > > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
> > > branches - we can do the same in the secondary branch as well.
> > > 
> > > ( Also remove some unnecessarily duplicate comments, as the API
> > >   group defines are now pretty much self-documenting. )
> > > 
> > > No change in functionality.
> > > 
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Cc: Will Deacon <will.deacon@arm.com>
> > > Cc: linux-kernel at vger.kernel.org
> > > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > 
> > This breaks compilation on RISC-V. (For some of its atomics, the arch
> > currently defines the _relaxed and the full variants and it relies on
> > the generic definitions for the _acquire and the _release variants.)
> 
> I don't have cross-compilation for RISC-V, which is a relatively new arch.
> (Is there any RISC-V set of cross-compilation tools on kernel.org somewhere?)

I'm using the toolchain from:

  https://riscv.org/software-tools/

(adding Palmer and Albert in Cc:)


> 
> Could you please send a patch that defines those variants against Linus's tree, 
> like the PowerPC patch that does something similar:
> 
>   0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
> 
> ?

Yes, please see below for a first RFC.

(BTW, get_maintainer.pl says that that patch missed Benjamin, Paul, Michael
 and linuxppc-dev at lists.ozlabs.org: FWIW, I'm Cc-ing the maintainers here.)

  Andrea


From 411f05a44e0b53a435331b977ff864fba7501a95 Mon Sep 17 00:00:00 2001
From: Andrea Parri <andrea.parri@amarulasolutions.com>
Date: Mon, 7 May 2018 10:59:20 +0200
Subject: [RFC PATCH] riscv/atomic: Defines _acquire/_release variants

In preparation for Ingo's renovation of the generic atomic.h header [1],
define the _acquire/_release variants in the arch's header.

No change in code generation.

[1] http://lkml.kernel.org/r/20180505081100.nsyrqrpzq2vd27bk at gmail.com
    http://lkml.kernel.org/r/20180505083635.622xmcvb42dw5xxh at gmail.com

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <albert@sifive.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: linux-riscv at lists.infradead.org
---
 arch/riscv/include/asm/atomic.h | 88 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 855115ace98c8..7cbd8033dfb5d 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -153,22 +153,54 @@ ATOMIC_OPS(sub, add, +, -i)
 
 #define atomic_add_return_relaxed	atomic_add_return_relaxed
 #define atomic_sub_return_relaxed	atomic_sub_return_relaxed
+#define atomic_add_return_acquire(...)					\
+	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
+#define atomic_sub_return_acquire(...)					\
+	__atomic_op_acquire(atomic_sub_return, __VA_ARGS__)
+#define atomic_add_return_release(...)					\
+	__atomic_op_release(atomic_add_return, __VA_ARGS__)
+#define atomic_sub_return_release(...)					\
+	__atomic_op_release(atomic_sub_return, __VA_ARGS__)
 #define atomic_add_return		atomic_add_return
 #define atomic_sub_return		atomic_sub_return
 
 #define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
 #define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+#define atomic_fetch_add_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+#define atomic_fetch_sub_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#define atomic_fetch_add_release(...)					\
+	__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+#define atomic_fetch_sub_release(...)					\
+	__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
 #define atomic_fetch_add		atomic_fetch_add
 #define atomic_fetch_sub		atomic_fetch_sub
 
 #ifndef CONFIG_GENERIC_ATOMIC64
 #define atomic64_add_return_relaxed	atomic64_add_return_relaxed
 #define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
+#define atomic64_add_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+#define atomic64_sub_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+#define atomic64_add_return_release(...)				\
+	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+#define atomic64_sub_return_release(...)				\
+	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
 #define atomic64_add_return		atomic64_add_return
 #define atomic64_sub_return		atomic64_sub_return
 
 #define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
 #define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+#define atomic64_fetch_add_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#define atomic64_fetch_sub_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#define atomic64_fetch_add_release(...)					\
+	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+#define atomic64_fetch_sub_release(...)					\
+	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
 #define atomic64_fetch_add		atomic64_fetch_add
 #define atomic64_fetch_sub		atomic64_fetch_sub
 #endif
@@ -191,6 +223,18 @@ ATOMIC_OPS(xor, xor, i)
 #define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
 #define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
 #define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
+#define atomic_fetch_and_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+#define atomic_fetch_or_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+#define atomic_fetch_xor_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#define atomic_fetch_and_release(...)					\
+	__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+#define atomic_fetch_or_release(...)					\
+	__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+#define atomic_fetch_xor_release(...)					\
+	__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
 #define atomic_fetch_and		atomic_fetch_and
 #define atomic_fetch_or			atomic_fetch_or
 #define atomic_fetch_xor		atomic_fetch_xor
@@ -199,6 +243,18 @@ ATOMIC_OPS(xor, xor, i)
 #define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
 #define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
 #define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
+#define atomic64_fetch_and_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#define atomic64_fetch_or_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#define atomic64_fetch_xor_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+#define atomic64_fetch_and_release(...)					\
+	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+#define atomic64_fetch_or_release(...)					\
+	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+#define atomic64_fetch_xor_release(...)					\
+	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
 #define atomic64_fetch_and		atomic64_fetch_and
 #define atomic64_fetch_or		atomic64_fetch_or
 #define atomic64_fetch_xor		atomic64_fetch_xor
@@ -290,22 +346,54 @@ ATOMIC_OPS(dec, add, +, -1)
 
 #define atomic_inc_return_relaxed	atomic_inc_return_relaxed
 #define atomic_dec_return_relaxed	atomic_dec_return_relaxed
+#define atomic_inc_return_acquire(...)					\
+	__atomic_op_acquire(atomic_inc_return, __VA_ARGS__)
+#define atomic_dec_return_acquire(...)					\
+	__atomic_op_acquire(atomic_dec_return, __VA_ARGS__)
+#define atomic_inc_return_release(...)					\
+	__atomic_op_release(atomic_inc_return, __VA_ARGS__)
+#define atomic_dec_return_release(...)					\
+	__atomic_op_release(atomic_dec_return, __VA_ARGS__)
 #define atomic_inc_return		atomic_inc_return
 #define atomic_dec_return		atomic_dec_return
 
 #define atomic_fetch_inc_relaxed	atomic_fetch_inc_relaxed
 #define atomic_fetch_dec_relaxed	atomic_fetch_dec_relaxed
+#define atomic_fetch_inc_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__)
+#define atomic_fetch_dec_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
+#define atomic_fetch_inc_release(...)					\
+	__atomic_op_release(atomic_fetch_inc, __VA_ARGS__)
+#define atomic_fetch_dec_release(...)					\
+	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
 #define atomic_fetch_inc		atomic_fetch_inc
 #define atomic_fetch_dec		atomic_fetch_dec
 
 #ifndef CONFIG_GENERIC_ATOMIC64
 #define atomic64_inc_return_relaxed	atomic64_inc_return_relaxed
 #define atomic64_dec_return_relaxed	atomic64_dec_return_relaxed
+#define atomic64_inc_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+#define atomic64_dec_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+#define atomic64_inc_return_release(...)				\
+	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+#define atomic64_dec_return_release(...)				\
+	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
 #define atomic64_inc_return		atomic64_inc_return
 #define atomic64_dec_return		atomic64_dec_return
 
 #define atomic64_fetch_inc_relaxed	atomic64_fetch_inc_relaxed
 #define atomic64_fetch_dec_relaxed	atomic64_fetch_dec_relaxed
+#define atomic64_fetch_inc_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__)
+#define atomic64_fetch_dec_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__)
+#define atomic64_fetch_inc_release(...)					\
+	__atomic_op_release(atomic64_fetch_inc, __VA_ARGS__)
+#define atomic64_fetch_dec_release(...)					\
+	__atomic_op_release(atomic64_fetch_dec, __VA_ARGS__)
 #define atomic64_fetch_inc		atomic64_fetch_inc
 #define atomic64_fetch_dec		atomic64_fetch_dec
 #endif
-- 
2.7.4



> 
> ... and I'll integrate it into the proper place to make it all bisectable, etc.
> 
> Thanks,
> 
> 	Ingo

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH v2] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
  2018-05-06 12:13                     ` [tip:locking/core] " tip-bot for Boqun Feng
@ 2018-05-07 13:31                         ` Boqun Feng
  0 siblings, 0 replies; 103+ messages in thread
From: Boqun Feng @ 2018-05-07 13:31 UTC (permalink / raw)
  To: mingo, tglx, hpa, linux-kernel
  Cc: Boqun Feng, Linus Torvalds, Mark Rutland, Peter Zijlstra,
	aryabinin, catalin.marinas, dvyukov, linux-arm-kernel,
	will.deacon

Move PowerPC's __op_{acquire,release}() from atomic.h to
cmpxchg.h (in arch/powerpc/include/asm), plus use them to
define these two methods:

	#define cmpxchg_release __op_release(cmpxchg, __VA_ARGS__);
	#define cmpxchg64_release __op_release(cmpxchg64, __VA_ARGS__);

... the idea is to generate all these methods in cmpxchg.h and to define the full
array of atomic primitives, including the cmpxchg_release() methods which were
defined by the generic code before.

Also define the atomic[64]_() variants explicitly.

This ensures that all these low level cmpxchg APIs are defined in
PowerPC headers, with no generic header fallbacks.

Also remove the duplicate definitions of atomic_xchg() and
atomic64_xchg() in asm/atomic.h: they could be generated via the generic
atomic.h header using _relaxed() primitives. This helps ppc adopt the
upcoming change of the generic atomic.h header.
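
As an illustration only (a sketch of the generic fallbacks, not part of this patch):
with just atomic_xchg_relaxed() defined here, <linux/atomic.h> can provide the other
orderings along the lines of:

	#define atomic_xchg_acquire(...)	__atomic_op_acquire(atomic_xchg, __VA_ARGS__)
	#define atomic_xchg_release(...)	__atomic_op_release(atomic_xchg, __VA_ARGS__)
	#define atomic_xchg(...)		__atomic_op_fence(atomic_xchg, __VA_ARGS__)

where, on PowerPC, __atomic_op_acquire()/__atomic_op_release() wrap the _relaxed()
op with PPC_ACQUIRE_BARRIER/PPC_RELEASE_BARRIER, as moved into cmpxchg.h below.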

No change in functionality or code generation.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: aryabinin@virtuozzo.com
Cc: catalin.marinas@arm.com
Cc: dvyukov@google.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: will.deacon@arm.com
---
v1 --> v2:
	remove duplicate definitions for atomic*_xchg() for future
	change

Ingo,
	I also remove the link and your SoB because I think your bot
	could add them automatically.

 arch/powerpc/include/asm/atomic.h  | 24 ++++--------------------
 arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..583837a41bcc 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -13,24 +13,6 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
-/*
- * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
- * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
- * on the platform without lwsync.
- */
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
-	op##_relaxed(args);						\
-})
-
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
@@ -213,8 +195,9 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
-#define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
 /**
@@ -519,8 +502,9 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic64_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
-#define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
 /**
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..e27a612b957f 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -8,6 +8,24 @@
 #include <asm/asm-compat.h>
 #include <linux/bug.h>
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
+ * on the platform without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
+	op##_relaxed(args);						\
+})
+
 #ifdef __BIG_ENDIAN
 #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
 #else
@@ -512,6 +530,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
+
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +554,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+
+#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
+
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* [PATCH v2] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
@ 2018-05-07 13:31                         ` Boqun Feng
  0 siblings, 0 replies; 103+ messages in thread
From: Boqun Feng @ 2018-05-07 13:31 UTC (permalink / raw)
  To: linux-arm-kernel

Move PowerPC's __op_{acquire,release}() from atomic.h to
cmpxchg.h (in arch/powerpc/include/asm), plus use them to
define these two methods:

	#define cmpxchg_release __op_release(cmpxchg, __VA_ARGS__);
	#define cmpxchg64_release __op_release(cmpxchg64, __VA_ARGS__);

... the idea is to generate all these methods in cmpxchg.h and to define the full
array of atomic primitives, including the cmpxchg_release() methods which were
defined by the generic code before.

Also define the atomic[64]_() variants explicitly.

This ensures that all these low level cmpxchg APIs are defined in
PowerPC headers, with no generic header fallbacks.

Also remove the duplicate definitions of atomic_xchg() and
atomic64_xchg() in asm/atomic.h: they could be generated via the generic
atomic.h header using _relaxed() primitives. This helps ppc adopt the
upcoming change of the generic atomic.h header.

No change in functionality or code generation.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: aryabinin at virtuozzo.com
Cc: catalin.marinas at arm.com
Cc: dvyukov at google.com
Cc: linux-arm-kernel at lists.infradead.org
Cc: will.deacon at arm.com
---
v1 --> v2:
	remove duplicate definitions for atomic*_xchg() for future
	change

Ingo,
	I also remove the link and your SoB because I think your bot
	could add them automatically.

 arch/powerpc/include/asm/atomic.h  | 24 ++++--------------------
 arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 682b3e6a1e21..583837a41bcc 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -13,24 +13,6 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
-/*
- * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
- * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
- * on the platform without lwsync.
- */
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
-	op##_relaxed(args);						\
-})
-
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
@@ -213,8 +195,9 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
-#define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
 /**
@@ -519,8 +502,9 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 	cmpxchg_relaxed(&((v)->counter), (o), (n))
 #define atomic64_cmpxchg_acquire(v, o, n) \
 	cmpxchg_acquire(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_release(v, o, n) \
+	cmpxchg_release(&((v)->counter), (o), (n))
 
-#define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
 /**
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 9b001f1f6b32..e27a612b957f 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -8,6 +8,24 @@
 #include <asm/asm-compat.h>
 #include <linux/bug.h>
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
+ * on the platform without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
+	op##_relaxed(args);						\
+})
+
 #ifdef __BIG_ENDIAN
 #define BITOFF_CAL(size, off)	((sizeof(u32) - size - off) * BITS_PER_BYTE)
 #else
@@ -512,6 +530,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 			(unsigned long)_o_, (unsigned long)_n_,		\
 			sizeof(*(ptr)));				\
 })
+
+#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__)
+
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -533,6 +554,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_acquire((ptr), (o), (n));				\
 })
+
+#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__)
+
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-06 12:14           ` [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more tip-bot for Ingo Molnar
@ 2018-05-09  7:33             ` Peter Zijlstra
  2018-05-09 13:03               ` Will Deacon
  2018-05-15  8:35               ` Ingo Molnar
  0 siblings, 2 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-09  7:33 UTC (permalink / raw)
  To: linux-kernel, akpm, will.deacon, mark.rutland, torvalds, paulmck,
	mingo, tglx, hpa
  Cc: linux-tip-commits

On Sun, May 06, 2018 at 05:14:36AM -0700, tip-bot for Ingo Molnar wrote:
> Commit-ID:  87d655a48dfe74293f72dc001ed042142cf00d44
> Gitweb:     https://git.kernel.org/tip/87d655a48dfe74293f72dc001ed042142cf00d44
> Author:     Ingo Molnar <mingo@kernel.org>
> AuthorDate: Sat, 5 May 2018 10:36:35 +0200
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Sat, 5 May 2018 15:22:44 +0200
> 
> locking/atomics: Simplify the op definitions in atomic.h some more
> 
> Before:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec_acquire
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec_release
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> After:
> 
>  #ifndef atomic_fetch_dec_relaxed
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>  # else
>  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>  #  define atomic_fetch_dec_release		atomic_fetch_dec
>  # endif
>  #else
>  # ifndef atomic_fetch_dec
>  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>  # endif
>  #endif
> 
> The idea is that because we already group these APIs by certain defines
> such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
> branches - we can do the same in the secondary branch as well.
> 

ARGH, why did you merge this? It's pointless wankery, and doesn't solve
anything wrt. the annotated atomic crap.

And if we're going to do codegen, we might as well all generate this
anyway, so all this mucking about is a complete waste of time.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-09  7:33             ` Peter Zijlstra
@ 2018-05-09 13:03               ` Will Deacon
  2018-05-15  8:54                 ` Ingo Molnar
  2018-05-15  8:35               ` Ingo Molnar
  1 sibling, 1 reply; 103+ messages in thread
From: Will Deacon @ 2018-05-09 13:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, akpm, mark.rutland, torvalds, paulmck, mingo, tglx,
	hpa, linux-tip-commits

Hi Ingo, Peter,

[I was at a conference last week and not reading email]

On Wed, May 09, 2018 at 09:33:27AM +0200, Peter Zijlstra wrote:
> On Sun, May 06, 2018 at 05:14:36AM -0700, tip-bot for Ingo Molnar wrote:
> > Commit-ID:  87d655a48dfe74293f72dc001ed042142cf00d44
> > Gitweb:     https://git.kernel.org/tip/87d655a48dfe74293f72dc001ed042142cf00d44
> > Author:     Ingo Molnar <mingo@kernel.org>
> > AuthorDate: Sat, 5 May 2018 10:36:35 +0200
> > Committer:  Ingo Molnar <mingo@kernel.org>
> > CommitDate: Sat, 5 May 2018 15:22:44 +0200
> > 
> > locking/atomics: Simplify the op definitions in atomic.h some more
> > 
> > Before:
> > 
> >  #ifndef atomic_fetch_dec_relaxed
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> >  # else
> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> >  # endif
> >  #else
> >  # ifndef atomic_fetch_dec_acquire
> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  # ifndef atomic_fetch_dec_release
> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  #endif
> > 
> > After:
> > 
> >  #ifndef atomic_fetch_dec_relaxed
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
> >  # else
> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
> >  # endif
> >  #else
> >  # ifndef atomic_fetch_dec
> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
> >  # endif
> >  #endif
> > 
> > The idea is that because we already group these APIs by certain defines
> > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
> > branches - we can do the same in the secondary branch as well.
> > 
> 
> ARGH, why did you merge this? It's pointless wankery, and doesn't solve
> anything wrt. the annotated atomic crap.
> 
> And if we're going to do codegen, we might as well all generate this
> anyway, so all this mucking about is a complete waste of time.

I have to agree here.

The reason we allowed the atomic primitives to be overridden on a
fine-grained basis is because there are often architecture-specific
barriers or instructions which mean that it is common to be able to provide
optimised atomics in some cases, but not others. Given that this stuff is
error-prone, having the generic code provide a safe and correct fallback
for all of the atomics that the architecture doesn't implement seemed like
a good thing to me. New architecture ports basically just have to provide
the weakest thing they have and a fence to get started, then they can
optimise where they can and not have to worry about the rest of the atomics
API.
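
( As a rough sketch of that model - not taken from any particular arch - a
  hypothetical port could provide just the _relaxed form, here via a compiler
  builtin as a stand-in, and let the generic code synthesize the rest: )

static inline int atomic_add_return_relaxed(int i, atomic_t *v)
{
	/* hypothetical arch-specific RMW; GCC builtin used as a stand-in */
	return __atomic_add_fetch(&v->counter, i, __ATOMIC_RELAXED);
}
#define atomic_add_return_relaxed atomic_add_return_relaxed

/*
 * No atomic_add_return{,_acquire,_release} definitions here: the generic
 * header is expected to build those from _relaxed plus full barriers.
 */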

With this patch, we're already seeing arch code (powerpc, risc-v) having
to add what is basically boiler-plate code, and it seems like we're just
doing this to make the generic code more readable! I'd much prefer we kept
the arch code simple, and took on the complexity burden in the generic code
where it can be looked after in one place.

For instrumentation, we're going to need to generate this stuff *anyway*,
so I think the readability argument mostly disappears.

Will

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-09  7:33             ` Peter Zijlstra
  2018-05-09 13:03               ` Will Deacon
@ 2018-05-15  8:35               ` Ingo Molnar
  2018-05-15 11:41                 ` Peter Zijlstra
  1 sibling, 1 reply; 103+ messages in thread
From: Ingo Molnar @ 2018-05-15  8:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, akpm, will.deacon, mark.rutland, torvalds, paulmck,
	tglx, hpa, linux-tip-commits


* Peter Zijlstra <peterz@infradead.org> wrote:

> And if we're going to do codegen, we might as well all generate this
> anyway, so all this mucking about is a complete waste of time.

I'm not yet convinced that it will be cleaner (though I can be convinced in principle), 
but meanwhile the existing code is arguably butt-ugly and bloaty.

Regarding these cleanups, we had this before:

 /* atomic_add_return_relaxed */
 #ifndef atomic_add_return_relaxed
 #define  atomic_add_return_relaxed	atomic_add_return
 #define  atomic_add_return_acquire	atomic_add_return
 #define  atomic_add_return_release	atomic_add_return

 #else /* atomic_add_return_relaxed */

 #ifndef atomic_add_return_acquire
 #define  atomic_add_return_acquire(...)					\
	__atomic_op_acquire(atomic_add_return, __VA_ARGS__)
 #endif

 #ifndef atomic_add_return_release
 #define  atomic_add_return_release(...)					\
	__atomic_op_release(atomic_add_return, __VA_ARGS__)
 #endif

 #ifndef atomic_add_return
 #define  atomic_add_return(...)						\
	__atomic_op_fence(atomic_add_return, __VA_ARGS__)
 #endif
 #endif /* atomic_add_return_relaxed */

Which is 23 lines per definition.

Now we have this much more compact definition:

 #ifndef atomic_add_return_relaxed
 # define atomic_add_return_relaxed		atomic_add_return
 # define atomic_add_return_acquire		atomic_add_return
 # define atomic_add_return_release		atomic_add_return
 #else
 # ifndef atomic_add_return
 #  define atomic_add_return(...)		__op_fence(atomic_add_return, __VA_ARGS__)
 #  define atomic_add_return_acquire(...)	__op_acquire(atomic_add_return, __VA_ARGS__)
 #  define atomic_add_return_release(...)	__op_release(atomic_add_return, __VA_ARGS__)
 # endif
 #endif

Which is just _half_ the linecount.
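
( For reference, the __op_*()/__atomic_op_*() helpers used above are the
  wrappers that build the acquire/release/fence orderings out of the _relaxed
  op plus barriers - a minimal sketch of their assumed shape, not the literal
  <linux/atomic.h> text: )

 #define __atomic_op_acquire(op, args...)				\
 ({									\
	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
	smp_mb__after_atomic();						\
	__ret;								\
 })

 #define __atomic_op_release(op, args...)				\
 ({									\
	smp_mb__before_atomic();					\
	op##_relaxed(args);						\
 })

 #define __atomic_op_fence(op, args...)					\
 ({									\
	typeof(op##_relaxed(args)) __ret;				\
	smp_mb__before_atomic();					\
	__ret = op##_relaxed(args);					\
	smp_mb();							\
	__ret;								\
 })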

Automated code generation might improve this some more, but the net effect on the 
core <linux/atomic.h> code right now is 373 lines removed:

  include/linux/atomic.h | 1109 ++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------------------------------------------------------
  1 file changed, 368 insertions(+), 741 deletions(-)

... <linux/atomic.h> shrunk to just 709 lines.

The arch/x86/include/asm/atomic64_64.h file got smaller as well due to the cleanups:

  arch/x86/include/asm/atomic64_64.h | 216 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------
  1 file changed, 97 insertions(+), 119 deletions(-)

So unless you can clean this up and shrink this even more, these changes are 
obviously justified on their own.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-09 13:03               ` Will Deacon
@ 2018-05-15  8:54                 ` Ingo Molnar
  0 siblings, 0 replies; 103+ messages in thread
From: Ingo Molnar @ 2018-05-15  8:54 UTC (permalink / raw)
  To: Will Deacon
  Cc: Peter Zijlstra, linux-kernel, akpm, mark.rutland, torvalds,
	paulmck, tglx, hpa, linux-tip-commits


* Will Deacon <will.deacon@arm.com> wrote:

> With this patch, we're already seeing arch code (powerpc, risc-v) having
> to add what is basically boiler-plate code, and it seems like we're just
> doing this to make the generic code more readable! I'd much prefer we kept
> the arch code simple, and took on the complexity burden in the generic code
> where it can be looked after in one place.

For PowerPC this is not really true; the patch has this effect:

   2 files changed, 28 insertions(+), 20 deletions(-)

and made it more obvious where the arch gets its low level definitions from.

For RISC-V it's true:

 arch/riscv/include/asm/atomic.h | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

but even before this it was already largely boilerplate:

#define atomic_inc_return_relaxed	atomic_inc_return_relaxed
#define atomic_dec_return_relaxed	atomic_dec_return_relaxed
#define atomic_inc_return		atomic_inc_return
#define atomic_dec_return		atomic_dec_return

#define atomic_fetch_inc_relaxed	atomic_fetch_inc_relaxed
#define atomic_fetch_dec_relaxed	atomic_fetch_dec_relaxed
#define atomic_fetch_inc		atomic_fetch_inc
#define atomic_fetch_dec		atomic_fetch_dec

#ifndef CONFIG_GENERIC_ATOMIC64
#define atomic64_inc_return_relaxed	atomic64_inc_return_relaxed
#define atomic64_dec_return_relaxed	atomic64_dec_return_relaxed
#define atomic64_inc_return		atomic64_inc_return
#define atomic64_dec_return		atomic64_dec_return

#define atomic64_fetch_inc_relaxed	atomic64_fetch_inc_relaxed
#define atomic64_fetch_dec_relaxed	atomic64_fetch_dec_relaxed
#define atomic64_fetch_inc		atomic64_fetch_inc
#define atomic64_fetch_dec		atomic64_fetch_dec
#endif

What this change does is that it makes _all_ the low level ops obviously mapped in 
the low level header, if an arch decides to implement the _relaxed variants 
itself - not half of them in the low level header and half of them in the generic 
header...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15  8:35               ` Ingo Molnar
@ 2018-05-15 11:41                 ` Peter Zijlstra
  2018-05-15 12:13                   ` Peter Zijlstra
  2018-05-15 15:43                   ` Mark Rutland
  0 siblings, 2 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-15 11:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, akpm, will.deacon, mark.rutland, torvalds, paulmck,
	tglx, hpa, linux-tip-commits

On Tue, May 15, 2018 at 10:35:56AM +0200, Ingo Molnar wrote:
> Which is just _half_ the linecount.

It also provides less. I do not believe smaller is better here. The
line count really isn't the problem with this stuff.

The main pain point here is keeping the atomic, atomic64 and atomic_long
crud consistent, typically we tend to forget about atomic_long because
that lives in an entirely different header.

In any case, see:

  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=atomics/generated&id=e54c888b3b9d8f3ef57b1a9c4255a6371cb9977d

which generates the atomic/atomic64 bits but does not yet deal with
atomic_long (I think I would've kept the 'header' thing in the
normal .h file but whatever).

Once we have the atomic_long thing added, we should also have enough
data to do function forwarding, and we should be able to start looking
at the whole annotated stuff.

Now clearly Mark hasn't had time to further work on that. But consider a
table like:

add(i,v)		RF
sub(i,v)		RF
inc(v)			RF
dec(v)			RF
or(i,v)			F
and(i,v)		F
andnot(i,v)		F
xor(i,v)		F
xchg(v,i)		X
cmpxchg(v,i,j)		X
try_cmpxchg(v,I,j)	XB

With the following proglet; that should contain enough to do full
forwarding (seems I forgot to implement 'B').

---
#!/bin/bash

gen_proto() {
	local cnt=0;

	proto=$1; shift;
	ret=$1; shift;
	pfx=$1; shift;
	sfx=$1; shift;

	echo -n "${ret} ";

	name=${proto%(*};
	echo -n "${pfx}${name}${sfx}("

	args=${proto#*\(};
	for arg in ${args//[,)]/ };
	do
		if [ $cnt -gt 0 ]
		then
			echo -n ", ";
		fi
		let cnt++;
		echo -n "${TYPES[$arg]} ${arg}"
	done
	echo ");"
}

gen_proto_order() {
	gen_proto $1 $2 $3 $4
	gen_proto $1 $2 $3 $4_acquire
	gen_proto $1 $2 $3 $4_release
	gen_proto $1 $2 $3 $4_relaxed
}

gen_void_protos() {
	grep -v -e "^$" -e "^#" atomic.tbl | while read proto meta;
	do
		gen_proto ${proto} "void" ${TYPES[pfx]} ""
	done
}

gen_return_protos() {
	grep -v -e "^$" -e "^#" atomic.tbl | while read proto meta;
	do
		if [[ $meta =~ "R" ]]; then
			gen_proto_order ${proto} ${TYPES[i]} ${TYPES[pfx]} "_return"
		fi
	done
}

gen_fetch_protos() {
	grep -v -e "^$" -e "^#" atomic.tbl | while read proto meta;
	do
		if [[ $meta =~ "F" ]]; then
			gen_proto_order ${proto} ${TYPES[i]} "${TYPES[pfx]}fetch_" ""
		fi
	done
}

gen_exchange_protos() {
	grep -v -e "^$" -e "^#" atomic.tbl | while read proto meta;
	do
		if [[ $meta =~ "X" ]]; then
			gen_proto_order ${proto} ${TYPES[i]} ${TYPES[pfx]} ""
		fi
	done
}

gen_protos() {
	gen_void_protos
	gen_return_protos
	gen_fetch_protos
	gen_exchange_protos
}

declare -A TYPES=( [pfx]="atomic_" [v]="atomic_t *" [i]="int" [j]="int" [I]="int *" )

gen_protos

declare -A TYPES=( [pfx]="atomic64_" [v]="atomic64_t *" [i]="s64" [j]="s64" [I]="s64 *" )

gen_protos

declare -A TYPES=( [pfx]="atomic_long_" [v]="atomic_long_t *" [i]="long" [j]="long" [I]="long *" )

gen_protos
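
( Assuming the table above is saved as atomic.tbl next to the script, its
  first entry expands to prototypes along these lines; note the script emits
  all the void forms first, then the _return forms, and so on: )

 void atomic_add(int i, atomic_t * v);
 void atomic_sub(int i, atomic_t * v);
 ...
 int atomic_add_return(int i, atomic_t * v);
 int atomic_add_return_acquire(int i, atomic_t * v);
 int atomic_add_return_release(int i, atomic_t * v);
 int atomic_add_return_relaxed(int i, atomic_t * v);
 ...
 int atomic_fetch_add(int i, atomic_t * v);
 ...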

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15 11:41                 ` Peter Zijlstra
@ 2018-05-15 12:13                   ` Peter Zijlstra
  2018-05-15 15:43                   ` Mark Rutland
  1 sibling, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-15 12:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, akpm, will.deacon, mark.rutland, torvalds, paulmck,
	tglx, hpa, linux-tip-commits

On Tue, May 15, 2018 at 01:41:44PM +0200, Peter Zijlstra wrote:
> #!/bin/bash
> 
> gen_proto() {
> 	local cnt=0;
> 
> 	proto=$1; shift;
> 	ret=$1; shift;
> 	pfx=$1; shift;
> 	sfx=$1; shift;
> 
> 	echo -n "${ret} ";
> 
> 	name=${proto%(*};
> 	echo -n "${pfx}${name}${sfx}("
> 
> 	args=${proto#*\(};
> 	for arg in ${args//[,)]/ };
> 	do
> 		if [ $cnt -gt 0 ]
> 		then
> 			echo -n ", ";
> 		fi
> 		let cnt++;
> 		echo -n "${TYPES[$arg]} ${arg}"
> 	done
> 	echo ");"
> }
> 
> gen_proto_order() {
> 	gen_proto $1 $2 $3 $4
> 	gen_proto $1 $2 $3 $4_acquire
> 	gen_proto $1 $2 $3 $4_release
> 	gen_proto $1 $2 $3 $4_relaxed
> }
> 
> gen_void_protos() {
> 	grep -v -e "^$" -e "^#" atomic.tbl | while read proto meta;
> 	do
> 		gen_proto ${proto} "void" ${TYPES[pfx]} ""
> 	done
> }
> 
> gen_return_protos() {
> 	grep -v -e "^$" -e "^#" atomic.tbl | while read proto meta;
> 	do
> 		if [[ $meta =~ "R" ]]; then
> 			gen_proto_order ${proto} ${TYPES[i]} ${TYPES[pfx]} "_return"
> 		fi
> 	done
> }
> 
> gen_fetch_protos() {
> 	grep -v -e "^$" -e "^#" atomic.tbl | while read proto meta;
> 	do
> 		if [[ $meta =~ "F" ]]; then
> 			gen_proto_order ${proto} ${TYPES[i]} "${TYPES[pfx]}fetch_" ""
> 		fi
> 	done
> }
> 
> gen_exchange_protos() {
> 	grep -v -e "^$" -e "^#" atomic.tbl | while read proto meta;
> 	do
> 		if [[ $meta =~ "X" ]]; then

			ret=${TYPES[i]};
			if [[ $meta =~ "B" ]]; then
				ret="bool"
			fi
			gen_proto_order ${proto} ${ret} ${TYPES[pfx]} ""

> 		fi
> 	done
> }
> 
> gen_protos() {
> 	gen_void_protos
> 	gen_return_protos
> 	gen_fetch_protos
> 	gen_exchange_protos
> }
> 
> declare -A TYPES=( [pfx]="atomic_" [v]="atomic_t *" [i]="int" [j]="int" [I]="int *" )
> 
> gen_protos
> 
> declare -A TYPES=( [pfx]="atomic64_" [v]="atomic64_t *" [i]="s64" [j]="s64" [I]="s64 *" )
> 
> gen_protos
> 
> declare -A TYPES=( [pfx]="atomic_long_" [v]="atomic_long_t *" [i]="long" [j]="long" [I]="long *" )
> 
> gen_protos
> 
> 

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15 11:41                 ` Peter Zijlstra
  2018-05-15 12:13                   ` Peter Zijlstra
@ 2018-05-15 15:43                   ` Mark Rutland
  2018-05-15 17:10                     ` Peter Zijlstra
  1 sibling, 1 reply; 103+ messages in thread
From: Mark Rutland @ 2018-05-15 15:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, akpm, will.deacon, torvalds, paulmck,
	tglx, hpa, linux-tip-commits

On Tue, May 15, 2018 at 01:41:44PM +0200, Peter Zijlstra wrote:
> On Tue, May 15, 2018 at 10:35:56AM +0200, Ingo Molnar wrote:
> > Which is just _half_ the linecount.
> 
> It also provides less. I do not believe smaller is better here. The
> line count really isn't the problem with this stuff.
> 
> The main pain point here is keeping the atomic, atomic64 and atomic_long
> crud consistent, typically we tend to forget about atomic_long because
> that lives in an entirely different header.
> 
> In any case, see:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=atomics/generated&id=e54c888b3b9d8f3ef57b1a9c4255a6371cb9977d
> 
> which generates the atomic/atomic64 bits but does not yet deal with
> atomic_long (I think I would've kept the 'header' thing in the
> normal .h file but whatever).
> 
> Once we have the atomic_long thing added, we should also have enough
> data to do function forwarding, and we should be able to start looking
> at the whole annotated stuff.
> 
> Now clearly Mark hasn't had time to further work on that. But consider a
> table like:
> 
> add(i,v)		RF
> sub(i,v)		RF
> inc(v)			RF
> dec(v)			RF
> or(i,v)			F
> and(i,v)		F
> andnot(i,v)		F
> xor(i,v)		F
> xchg(v,i)		X
> cmpxchg(v,i,j)		X
> try_cmpxchg(v,I,j)	XB
> 
> With the following proglet; that should contain enough to do full
> forwarding (seems I forgot to implement 'B').

I put together the following while trying to avoid bash magic (i.e. the
arrays), and keeping the option of naming the params. My local copy of
dash seems happy with it.

I *think* the table can encode enough info to generate atomic-long.h,
atomic-instrumented.h, and the atomic.h ordering fallbacks. I'll need to
flesh out the table and check that we don't end up clashing with
some of the regular fallbacks.

Thanks,
Mark.

----
# name	meta	args...
#
# Where meta contains a string of:
# * B - bool: returns bool, fully ordered
# * V - void: returns void, fully ordered
# * I - int: returns base type, all orderings
# * R - return: returns base type, all orderings
# * F - fetch: returns base type, all orderings
# * T - try: returns bool, all orderings
#
# Where args contains list of type[:name], where type is:
# * v - pointer to atomic base type (atomic or atomic64)
# * i - base type (int or long)
# * I - pointer to base type (int or long)
#
add		VRF	i	v
sub		VRF	i	v
inc		VRF	v
dec		VRF	v
or		VF	i	v
and		VF	i	v
andnot		VF	i	v
xor		VF	i	v
xchg		I	v	i
cmpxchg		I	v	i:old	i:new
try_cmpxchg	T	v	I:old	i:new
add_and_test	B	i	v
sub_and_test	B	i	v
dec_and_test	B	v
inc_and_test	B	v
----


----
#!/bin/sh

gen_return_type() {
	local meta="$1"; shift
	local basetype="$1"; shift

	expr match "${meta}" "[V]" > /dev/null && printf "void"
	expr match "${meta}" "[BT]" > /dev/null && printf "bool"
	expr match "${meta}" "[IFR]" > /dev/null && printf "${basetype}"
}

gen_param()
{
	local basetype="$1"; shift
	local atomictype="$1"; shift
	local fmt="$1"; shift
	local name="${fmt#*:}"
	local type="${fmt%%:*}"

	[ "${type}" = "i" ] && type="${basetype} "
	[ "${type}" = "I" ] && type="${basetype} *"
	[ "${type}" = "v" ] && type="${atomictype} *"

	printf "%s%s" "${type}" "${name}"
}

gen_params()
{
	local basetype="$1"; shift
	local atomictype="$1"; shift

	while [ "$#" -gt 0 ]; do
		gen_param "${basetype}" "${atomictype}" "$1"
		[ "$#" -gt 1 ] && printf ", "
		shift;
	done
}

gen_proto_return_order_variant()
{
	local meta="$1"; shift;
	local name="$1"; shift
	local pfx="$1"; shift
	local basetype="$1"; shift

	gen_return_type "$meta" "${basetype}"

	printf " %s_%s(" "${pfx}" "${name}"
	gen_params "${basetype}" "${pfx}_t" $@
	printf ");\n"
}

gen_proto_return_order_variants()
{
	local meta="$1"; shift
	local name="$1"; shift
	gen_proto_return_order_variant "${meta}" "${name}" "$@"

	if expr match "${meta}" "[RFXC]" > /dev/null; then
		gen_proto_return_order_variant "${meta}" "${name}_acquire" "$@"
		gen_proto_return_order_variant "${meta}" "${name}_release" "$@"
		gen_proto_return_order_variant "${meta}" "${name}_relaxed" "$@"
	fi
}

gen_proto_variants()
{
	local meta="$1"; shift
	local name="$1"; shift

	[ "${meta}" = "R" ] && name="${name}_return"
	[ "${meta}" = "F" ] && name="fetch_${name}"

	gen_proto_return_order_variants "${meta}" "${name}" "$@"
}

gen_proto() {
	local meta="$1"; shift
	for m in $(echo "${meta}" | fold -w1); do
		gen_proto_variants "${m}" $@
	done
}

grep '^[a-z]' "$1" | while read name meta args; do

	gen_proto "${meta}" "${name}" "atomic" "int" ${args}

	gen_proto "${meta}" "${name}" "atomic64" "s64" ${args}

	gen_proto "${meta}" "${name}" "atomic_long" "long" ${args}

done
----
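
( For illustration - assuming the table is saved as atomics.tbl and the
  script as gen-protos.sh, both names made up here - the first table entry
  comes out as: )

$ sh gen-protos.sh atomics.tbl
void atomic_add(int i, atomic_t *v);
int atomic_add_return(int i, atomic_t *v);
int atomic_add_return_acquire(int i, atomic_t *v);
int atomic_add_return_release(int i, atomic_t *v);
int atomic_add_return_relaxed(int i, atomic_t *v);
int atomic_fetch_add(int i, atomic_t *v);
int atomic_fetch_add_acquire(int i, atomic_t *v);
int atomic_fetch_add_release(int i, atomic_t *v);
int atomic_fetch_add_relaxed(int i, atomic_t *v);
void atomic64_add(s64 i, atomic64_t *v);
...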

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15 15:43                   ` Mark Rutland
@ 2018-05-15 17:10                     ` Peter Zijlstra
  2018-05-15 17:53                       ` Mark Rutland
  0 siblings, 1 reply; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-15 17:10 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ingo Molnar, linux-kernel, akpm, will.deacon, torvalds, paulmck,
	tglx, hpa, linux-tip-commits

On Tue, May 15, 2018 at 04:43:33PM +0100, Mark Rutland wrote:
> I put together the following while trying to avoid bash magic (i.e. the
> arrays), and keeping the option of naming the params. My local copy of
> dash seems happy with it.

Very nice; clearly your sh foo is stronger than mine ;-)

> I *think* the table can encode enough info to generate atomic-long.h,
> atomic-instrumented.h, and the atomic.h ordering fallbacks. I'll need to
> flesh out the table and check that we don't end up clashing with
> some of the regular fallbacks.

Yes, details details ;-)

> # name	meta	args...
> #
> # Where meta contains a string of:
> # * B - bool: returns bool, fully ordered
> # * V - void: returns void, fully ordered

void returns are relaxed

> # * I - int: returns base type, all orderings
> # * R - return: returns base type, all orderings
> # * F - fetch: returns base type, all orderings
> # * T - try: returns bool, all orderings

Little more verbose than mine, I think we can get there with X and XB
instead of I and T, but whatever :-)

> # Where args contains list of type[:name], where type is:
> # * v - pointer to atomic base type (atomic or atomic64)
> # * i - base type (int or long)
> # * I - pointer to base type (int or long)
> #
> add		VRF	i	v
> sub		VRF	i	v
> inc		VRF	v
> dec		VRF	v
> or		VF	i	v
> and		VF	i	v
> andnot		VF	i	v
> xor		VF	i	v
> xchg		I	v	i
> cmpxchg		I	v	i:old	i:new
> try_cmpxchg	T	v	I:old	i:new
> add_and_test	B	i	v
> sub_and_test	B	i	v
> dec_and_test	B	v
> inc_and_test	B	v

Cute, that [:name].

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15 17:10                     ` Peter Zijlstra
@ 2018-05-15 17:53                       ` Mark Rutland
  2018-05-15 18:11                         ` Peter Zijlstra
  0 siblings, 1 reply; 103+ messages in thread
From: Mark Rutland @ 2018-05-15 17:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, akpm, will.deacon, torvalds, paulmck,
	tglx, hpa, linux-tip-commits

On Tue, May 15, 2018 at 07:10:21PM +0200, Peter Zijlstra wrote:
> On Tue, May 15, 2018 at 04:43:33PM +0100, Mark Rutland wrote:
> > I *think* the table can encode enough info to generate atomic-long.h,
> > atomic-instrumented.h, and the atomic.h ordering fallbacks. I'll need to
> > flesh out the table and check that we don't end up clashing with
> > some of the regular fallbacks.
> 
> Yes, details details ;-)
> 
> > # name	meta	args...
> > #
> > # Where meta contains a string of:
> > # * B - bool: returns bool, fully ordered
> > # * V - void: returns void, fully ordered
> 
> void returns are relaxed

How about:

  V - void: returns void, no ordering variants (implicitly relaxed)

> > # * I - int: returns base type, all orderings
> > # * R - return: returns base type, all orderings
> > # * F - fetch: returns base type, all orderings
> > # * T - try: returns bool, all orderings
> 
> Little more verbose than mine, I think we can get there with X and XB
> instead of I and T, but whatever :-)

Mhmm. I found it easier to do switch-like things this way, but it works
regardless.

> > # Where args contains list of type[:name], where type is:
> > # * v - pointer to atomic base type (atomic or atomic64)
> > # * i - base type (int or long)
> > # * I - pointer to base type (int or long)
> > #
> > add		VRF	i	v
> > sub		VRF	i	v
> > inc		VRF	v
> > dec		VRF	v
> > or		VF	i	v
> > and		VF	i	v
> > andnot		VF	i	v
> > xor		VF	i	v
> > xchg		I	v	i
> > cmpxchg		I	v	i:old	i:new
> > try_cmpxchg	T	v	I:old	i:new
> > add_and_test	B	i	v
> > sub_and_test	B	i	v
> > dec_and_test	B	v
> > inc_and_test	B	v
> 
> Cute, that [:name].

:D

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15 17:53                       ` Mark Rutland
@ 2018-05-15 18:11                         ` Peter Zijlstra
  2018-05-15 18:15                           ` Peter Zijlstra
  2018-05-21 17:12                           ` Mark Rutland
  0 siblings, 2 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-15 18:11 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ingo Molnar, linux-kernel, akpm, will.deacon, torvalds, paulmck,
	tglx, hpa, linux-tip-commits

On Tue, May 15, 2018 at 06:53:08PM +0100, Mark Rutland wrote:
> On Tue, May 15, 2018 at 07:10:21PM +0200, Peter Zijlstra wrote:
> > On Tue, May 15, 2018 at 04:43:33PM +0100, Mark Rutland wrote:
> > > I *think* the table can encode enough info to generate atomic-long.h,
> > > atomic-instrumented.h, and the atomic.h ordering fallbacks. I'll need to
> > > flesh out the table and check that we don't end up clashing with
> > > some of the regular fallbacks.
> > 
> > Yes, details details ;-)
> > 
> > > # name	meta	args...
> > > #
> > > # Where meta contains a string of:
> > > # * B - bool: returns bool, fully ordered
> > > # * V - void: returns void, fully ordered
> > 
> > void returns are relaxed
> 
> How about:
> 
>   V - void: returns void, no ordering variants (implicitly relaxed)

Works for me.

> > > # * I - int: returns base type, all orderings
> > > # * R - return: returns base type, all orderings
> > > # * F - fetch: returns base type, all orderings
> > > # * T - try: returns bool, all orderings
> > 
> > Little more verbose than mine, I think we can get there with X and XB
> > instead of I and T, but whatever :-)
> 
> Mhmm. I found it easier to do switch-like things this way, but it works
> regardless.

I'm a minimalist, but yeah whatever ;-)

> > > # Where args contains list of type[:name], where type is:
> > > # * v - pointer to atomic base type (atomic or atomic64)
> > > # * i - base type (int or long)
> > > # * I - pointer to base type (int or long)
> > > #
> > > add		VRF	i	v
> > > sub		VRF	i	v
> > > inc		VRF	v
> > > dec		VRF	v
> > > or		VF	i	v
> > > and		VF	i	v
> > > andnot		VF	i	v
> > > xor		VF	i	v
> > > xchg		I	v	i
> > > cmpxchg		I	v	i:old	i:new
> > > try_cmpxchg	T	v	I:old	i:new
> > > add_and_test	B	i	v
> > > sub_and_test	B	i	v
> > > dec_and_test	B	v
> > > inc_and_test	B	v

we might also want:

set		V	v	i
set_release	V	v	i
read		I	v
read_acquire	I	v

(yes, I did get the set arguments the wrong way around initially)

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15 18:11                         ` Peter Zijlstra
@ 2018-05-15 18:15                           ` Peter Zijlstra
  2018-05-15 18:52                             ` Linus Torvalds
  2018-05-21 17:12                           ` Mark Rutland
  1 sibling, 1 reply; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-15 18:15 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ingo Molnar, linux-kernel, akpm, will.deacon, torvalds, paulmck,
	tglx, hpa, linux-tip-commits

On Tue, May 15, 2018 at 08:11:36PM +0200, Peter Zijlstra wrote:
> On Tue, May 15, 2018 at 06:53:08PM +0100, Mark Rutland wrote:

> > > > # name	meta	args...
> > > > #
> > > > # Where meta contains a string of:
> > > > # * B - bool: returns bool, fully ordered
> > > > # * V - void: returns void, fully ordered
> > > > # * I - int: returns base type, all orderings
> > > > # * R - return: returns base type, all orderings
> > > > # * F - fetch: returns base type, all orderings
> > > > # * T - try: returns bool, all orderings
> > > 
> > > Little more verbose than mine, I think we can get there with X and XB
> > > instead of I and T, but whatever :-)
> > 
> > Mhmm. I found it easier to do switch-like things this way, but it works
> > regardless.
> 
> I'm a minimalist, but yeah whatever ;-)

Although; by having X we only have a single I. You currently have I as
meta and argument type, which is prone to confusion or something.

Alternatively we could use 'p' for the argument pointer thing.

> > > > # Where args contains list of type[:name], where type is:
> > > > # * v - pointer to atomic base type (atomic or atomic64)
> > > > # * i - base type (int or long)
> > > > # * I - pointer to base type (int or long)
> > > > #

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15 18:15                           ` Peter Zijlstra
@ 2018-05-15 18:52                             ` Linus Torvalds
  2018-05-15 19:39                               ` Peter Zijlstra
  0 siblings, 1 reply; 103+ messages in thread
From: Linus Torvalds @ 2018-05-15 18:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, Ingo Molnar, Linux Kernel Mailing List,
	Andrew Morton, Will Deacon, Paul McKenney, Thomas Gleixner,
	Peter Anvin, linux-tip-commits

On Tue, May 15, 2018 at 11:15 AM Peter Zijlstra <peterz@infradead.org>
wrote:

> Alternatively we could use 'p' for the argument pointer thing.

Probably better than having i/I.

But while people are bikeshedding the important stuff, can I please mention
my personal pet peeve with generated code?

If we go down this "generate header files" path, and ever expand to
actually generating some of the definitions too, can we *please* try to
follow three rules:

  (a) make the generated header file not just say "this is generated", but
say exactly *what* generates it, so that when you look at it, you don't
have to search for the generator?

  (b) if at all possible, please aim to make "git grep" find the stuff that
is generated?

  (c) if b is not possible, then generate '#line' things in the generator so
that debug information points back to the original source?

That (b) in particular can be a major pain, because "git grep" will
obviously only look at the _source_ files (including the script that
generates things), but not at the generated file at all.

But when you see a stack trace or oops or something that mentions some
function that you're not intimately familiar with, the first thing I do is
basically some variation of

     git grep function_name

or similar. Maybe it's just me, but I actually really like how fast "git
grep" is with threading and everything to just go the extra mile.

And what often happens with generated functions is that you can't actually
find them with git grep, because the generator will generate the function
names in two or more parts (i.e. in this case, for example, "cmpxchg_relaxed"
would never show up, because it's the "relaxed" version of cmpxchg).

So (b) will likely not work out, but at least try very hard to do (a) and
(c) when (b) fails. So that when people do stack traces, if they have the
debug information, at least it will point to the actual implementation in
the generator.
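
( A minimal sketch of the kind of thing (c) asks for - the helper name is
  made up here - where the generator tags emitted code with the template it
  came from: )

emit_from_template()
{
	local tmpl="$1"

	# Point #line-aware tools and debug info back at the generator's
	# template rather than at the generated header.
	printf '#line 1 "%s"\n' "${tmpl}"
	cat "${tmpl}"
}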

(NOTE! I realize that right now you're just talking about generating the
header file itself, with only declarations, not definitions. So the above
is not an issue. Yet. I'm just waiting for people to generate some of the
code too, and being proactive).

Hmm?

                 Linus

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15 18:52                             ` Linus Torvalds
@ 2018-05-15 19:39                               ` Peter Zijlstra
  0 siblings, 0 replies; 103+ messages in thread
From: Peter Zijlstra @ 2018-05-15 19:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mark Rutland, Ingo Molnar, Linux Kernel Mailing List,
	Andrew Morton, Will Deacon, Paul McKenney, Thomas Gleixner,
	Peter Anvin, linux-tip-commits

On Tue, May 15, 2018 at 11:52:17AM -0700, Linus Torvalds wrote:
>   (b) if at all possible, please aim to make "git grep" find the stuff that
> is generated?

The easiest way to make that happen is to just commit the generated
headers instead of generating them each build.

This would mean moving them to include/linux/atomic-gen-*.h or somesuch,
instead of using include/generated/.

I don't expect they'll change often anyway (famous last words).

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-06 14:57               ` Ingo Molnar
@ 2018-05-18 18:43                 ` Palmer Dabbelt
  -1 siblings, 0 replies; 103+ messages in thread
From: Palmer Dabbelt @ 2018-05-18 18:43 UTC (permalink / raw)
  To: mingo
  Cc: andrea.parri, mark.rutland, peterz, linux-arm-kernel,
	linux-kernel, aryabinin, boqun.feng, catalin.marinas, dvyukov,
	Will Deacon, Linus Torvalds, akpm, paulmck, a.p.zijlstra, tglx

On Sun, 06 May 2018 07:57:27 PDT (-0700), mingo@kernel.org wrote:
>
> * Andrea Parri <andrea.parri@amarulasolutions.com> wrote:
>
>> Hi Ingo,
>>
>> > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
>> > From: Ingo Molnar <mingo@kernel.org>
>> > Date: Sat, 5 May 2018 10:23:23 +0200
>> > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
>> >
>> > Before:
>> >
>> >  #ifndef atomic_fetch_dec_relaxed
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>> >  # else
>> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
>> >  # endif
>> >  #else
>> >  # ifndef atomic_fetch_dec_acquire
>> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  # ifndef atomic_fetch_dec_release
>> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  #endif
>> >
>> > After:
>> >
>> >  #ifndef atomic_fetch_dec_relaxed
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>> >  # else
>> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
>> >  # endif
>> >  #else
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  #endif
>> >
>> > The idea is that because we already group these APIs by certain defines
>> > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
>> > branches - we can do the same in the secondary branch as well.
>> >
>> > ( Also remove some unnecessarily duplicate comments, as the API
>> >   group defines are now pretty much self-documenting. )
>> >
>> > No change in functionality.
>> >
>> > Cc: Peter Zijlstra <peterz@infradead.org>
>> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> > Cc: Andrew Morton <akpm@linux-foundation.org>
>> > Cc: Thomas Gleixner <tglx@linutronix.de>
>> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> > Cc: Will Deacon <will.deacon@arm.com>
>> > Cc: linux-kernel@vger.kernel.org
>> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
>>
>> This breaks compilation on RISC-V. (For some of its atomics, the arch
>> currently defines the _relaxed and the full variants and it relies on
>> the generic definitions for the _acquire and the _release variants.)
>
> I don't have cross-compilation for RISC-V, which is a relatively new arch.
> (Is there any RISC-V set of cross-compilation tools on kernel.org somewhere?)

Arnd added RISC-V to the cross compiler list a month or two ago when he updated 
them all.  I use the "make.cross" script from the Intel test robot, which will 
fetch the cross compilers for you.  It looks like I made a GitHub pull request 
to update the script for RISC-V; it fetches from kernel.org:

    https://github.com/palmer-dabbelt/lkp-tests/blob/e14f4236ccd0572f4b87ffd480fecefee412dedc/sbin/make.cross
    http://cdn.kernel.org/pub/tools/crosstool/files/bin/
    http://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/7.3.0/x86_64-gcc-7.3.0-nolibc_riscv64-linux.tar.gz
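
( A typical invocation - assuming the script has been downloaded, made
  executable, and is run from a kernel tree; the exact toolchain it fetches
  may differ: )

    $ ./make.cross ARCH=riscv defconfig
    $ ./make.cross ARCH=riscv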

> Could you please send a patch that defines those variants against Linus's tree,
> like the PowerPC patch that does something similar:
>
>   0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
>
> ?
>
> ... and I'll integrate it into the proper place to make it all bisectable, etc.

Sorry, I got buried in email again.  Did this get merged, or is there a current 
version of the patch set I should look at?

^ permalink raw reply	[flat|nested] 103+ messages in thread

* [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
@ 2018-05-18 18:43                 ` Palmer Dabbelt
  0 siblings, 0 replies; 103+ messages in thread
From: Palmer Dabbelt @ 2018-05-18 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 06 May 2018 07:57:27 PDT (-0700), mingo at kernel.org wrote:
>
> * Andrea Parri <andrea.parri@amarulasolutions.com> wrote:
>
>> Hi Ingo,
>>
>> > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001
>> > From: Ingo Molnar <mingo@kernel.org>
>> > Date: Sat, 5 May 2018 10:23:23 +0200
>> > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more
>> >
>> > Before:
>> >
>> >  #ifndef atomic_fetch_dec_relaxed
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>> >  # else
>> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
>> >  # endif
>> >  #else
>> >  # ifndef atomic_fetch_dec_acquire
>> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  # ifndef atomic_fetch_dec_release
>> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  #endif
>> >
>> > After:
>> >
>> >  #ifndef atomic_fetch_dec_relaxed
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(v)			atomic_fetch_sub(1, (v))
>> >  #  define atomic_fetch_dec_relaxed(v)		atomic_fetch_sub_relaxed(1, (v))
>> >  #  define atomic_fetch_dec_acquire(v)		atomic_fetch_sub_acquire(1, (v))
>> >  #  define atomic_fetch_dec_release(v)		atomic_fetch_sub_release(1, (v))
>> >  # else
>> >  #  define atomic_fetch_dec_relaxed		atomic_fetch_dec
>> >  #  define atomic_fetch_dec_acquire		atomic_fetch_dec
>> >  #  define atomic_fetch_dec_release		atomic_fetch_dec
>> >  # endif
>> >  #else
>> >  # ifndef atomic_fetch_dec
>> >  #  define atomic_fetch_dec(...)		__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
>> >  #  define atomic_fetch_dec_acquire(...)	__atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
>> >  #  define atomic_fetch_dec_release(...)	__atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
>> >  # endif
>> >  #endif
>> >
>> > The idea is that because we already group these APIs by certain defines
>> > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary
>> > branches - we can do the same in the secondary branch as well.
>> >
>> > ( Also remove some unnecessarily duplicate comments, as the API
>> >   group defines are now pretty much self-documenting. )
>> >
>> > No change in functionality.
>> >
>> > Cc: Peter Zijlstra <peterz@infradead.org>
>> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> > Cc: Andrew Morton <akpm@linux-foundation.org>
>> > Cc: Thomas Gleixner <tglx@linutronix.de>
>> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>> > Cc: Will Deacon <will.deacon@arm.com>
>> > Cc: linux-kernel at vger.kernel.org
>> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
>>
>> This breaks compilation on RISC-V. (For some of its atomics, the arch
>> currently defines the _relaxed and the full variants and it relies on
>> the generic definitions for the _acquire and the _release variants.)
>
> I don't have cross-compilation for RISC-V, which is a relatively new arch.
> (Is there any RISC-V set of cross-compilation tools on kernel.org somewhere?)

Arnd added RISC-V to the cross compiler list a month or two ago when he updated 
them all.  I use the "make.cross" script from the Intel test robot, which will 
fetch the cross compilers for you.  It looks like I made a GitHub pull request 
to update the script for RISC-V; it fetches from kernel.org:

    https://github.com/palmer-dabbelt/lkp-tests/blob/e14f4236ccd0572f4b87ffd480fecefee412dedc/sbin/make.cross
    http://cdn.kernel.org/pub/tools/crosstool/files/bin/
    http://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/7.3.0/x86_64-gcc-7.3.0-nolibc_riscv64-linux.tar.gz

> Could you please send a patch that defines those variants against Linus's tree,
> like the PowerPC patch that does something similar:
>
>   0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
>
> ?
>
> ... and I'll integrate it into the proper place to make it all bisectable, etc.

Sorry, I got buried in email again.  Did this get merged, or is there a current 
version of the patch set I should look at?

^ permalink raw reply	[flat|nested] 103+ messages in thread

* Re: [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more
  2018-05-15 18:11                         ` Peter Zijlstra
  2018-05-15 18:15                           ` Peter Zijlstra
@ 2018-05-21 17:12                           ` Mark Rutland
  1 sibling, 0 replies; 103+ messages in thread
From: Mark Rutland @ 2018-05-21 17:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, akpm, will.deacon, torvalds, paulmck,
	tglx, hpa, linux-tip-commits

On Tue, May 15, 2018 at 08:11:36PM +0200, Peter Zijlstra wrote:
> On Tue, May 15, 2018 at 06:53:08PM +0100, Mark Rutland wrote:
> > On Tue, May 15, 2018 at 07:10:21PM +0200, Peter Zijlstra wrote:
> > > On Tue, May 15, 2018 at 04:43:33PM +0100, Mark Rutland wrote:
> > > > I *think* the table can encode enough info to generate atomic-long.h,
> > > > atomic-instrumented.h, and the atomic.h ordering fallbacks. I'll need to
> > > > flesh out the table and check that we don't end up clashing with
> > > > some of the regular fallbacks.
> > > 
> > > Yes, details details ;-)
> > > 
> > > > # name	meta	args...
> > > > #
> > > > # Where meta contains a string of:
> > > > # * B - bool: returns bool, fully ordered
> > > > # * V - void: returns void, fully ordered
> > > 
> > > void returns are relaxed
> > 
> > How about:
> > 
> >   V - void: returns void, no ordering variants (implicitly relaxed)
> 
> Works for me.
> 
> > > > # * I - int: returns base type, all orderings
> > > > # * R - return: returns base type, all orderings
> > > > # * F - fetch: returns base type, all orderings
> > > > # * T - try: returns bool, all orderings
> > > 
> > > Little more verbose than mine, I think we can get there with X and XB
> > > instead of I and T, but whatever :-)
> > 
> > Mhmm. I found it easier to do switch-like things this way, but it works
> > regardless.
> 
> I'm a minimalist, but yeah whatever ;-)
> 
> > > > # Where args contains list of type[:name], where type is:
> > > > # * v - pointer to atomic base type (atomic or atomic64)
> > > > # * i - base type (int or long)
> > > > # * I - pointer to base type (int or long)
> > > > #
> > > > add		VRF	i	v
> > > > sub		VRF	i	v
> > > > inc		VRF	v
> > > > dec		VRF	v
> > > > or		VF	i	v
> > > > and		VF	i	v
> > > > andnot		VF	i	v
> > > > xor		VF	i	v
> > > > xchg		I	v	i
> > > > cmpxchg		I	v	i:old	i:new
> > > > try_cmpxchg	T	v	I:old	i:new
> > > > add_and_test	B	i	v
> > > > sub_and_test	B	i	v
> > > > dec_and_test	B	v
> > > > inc_and_test	B	v
> 
> we might also want:
> 
> set		V	v	i
> set_release	V	v	i
> read		I	v
> read_acquire	I	v

Indeed!

I concurrently fiddled with this and came to the below. I special-cased
set and read for the purpose of preserving the acquire/release semantics
in the meta table, but either way works.

----
# name	meta	args...
#
# Where meta contains a string of variants to generate.
# Upper-case implies _{acquire,release,relaxed} variants.
# Valid meta values are:
# * B/b	- bool: returns bool
# * v	- void: returns void
# * I/i	- int: returns base type
# * R	- return: returns base type (has _return variants)
# * F	- fetch: returns base type (has fetch_ variants)
# * l	- load: returns base type (has _acquire order variant)
# * s	- store: returns void (has _release order variant)
#
# Where args contains list of type[:name], where type is:
# * cv	- const pointer to atomic base type (atomic_t/atomic64_t/atomic_long_t)
# * v	- pointer to atomic base type (atomic_t/atomic64_t/atomic_long_t)
# * i	- base type (int/s64/long)
# * I	- pointer to base type (int/s64/long)
#
read		l	cv
set		s	v	i
add		vRF	i	v
sub		vRF	i	v
inc		vRF	v
dec		vRF	v
and		vF	i	v
andnot		vF	i	v
or		vF	i	v
xor		vF	i	v
xchg		I	v	i
cmpxchg		I	v	i:old	i:new
try_cmpxchg	B	v	I:old	i:new
sub_and_test	b	i	v
dec_and_test	b	v
inc_and_test	b	v
add_negative	i	i	v
add_unless	i	v	i:a	i:u
inc_not_zero	i	v
----

... which seems to be sufficient to give me a working atomic-long.h
using [1,2].
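
( For a sense of the output, a sketch of the kind of forwarding wrapper
  atomic-long.h ends up with on 64-bit - where atomic_long_t is assumed to
  be a typedef of atomic64_t - rather than the generated text itself: )

static inline long atomic_long_add_return(long i, atomic_long_t *v)
{
	return atomic64_add_return(i, v);
}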

I have *most* of the atomic.h ordering variant generation, and most of
atomic-instrumented.h, modulo a few special cases to be handled.

I'm also not sure which functions are intended to be mandatory. I had
hoped I could just wrap every atomic_foo in an ifdef, but not all arches
have a #define for all functions, so that needs to be handled somehow.

Thanks,
Mark.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=atomics/generated&id=6b44ecd77dbb7c3e518260d2e223a29c64c85740
[2] https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=atomics/generated&id=432d8259040a67b7234c869c9298d042515958a2

^ permalink raw reply	[flat|nested] 103+ messages in thread

end of thread, other threads:[~2018-05-21 17:13 UTC | newest]

Thread overview: 103+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-04 17:39 [PATCH 0/6] arm64: add instrumented atomics Mark Rutland
2018-05-04 17:39 ` Mark Rutland
2018-05-04 17:39 ` [PATCH 1/6] locking/atomic, asm-generic: instrument ordering variants Mark Rutland
2018-05-04 17:39   ` Mark Rutland
2018-05-04 18:01   ` Peter Zijlstra
2018-05-04 18:01     ` Peter Zijlstra
2018-05-04 18:09     ` Mark Rutland
2018-05-04 18:09       ` Mark Rutland
2018-05-04 18:24       ` Peter Zijlstra
2018-05-04 18:24         ` Peter Zijlstra
2018-05-05  9:12         ` Mark Rutland
2018-05-05  9:12           ` Mark Rutland
2018-05-05  8:11       ` [PATCH] locking/atomics: Clean up the atomic.h maze of #defines Ingo Molnar
2018-05-05  8:11         ` Ingo Molnar
2018-05-05  8:36         ` [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more Ingo Molnar
2018-05-05  8:36           ` Ingo Molnar
2018-05-05  8:54           ` [PATCH] locking/atomics: Combine the atomic_andnot() and atomic64_andnot() API definitions Ingo Molnar
2018-05-05  8:54             ` Ingo Molnar
2018-05-06 12:15             ` [tip:locking/core] " tip-bot for Ingo Molnar
2018-05-06 14:15             ` [PATCH] " Andrea Parri
2018-05-06 14:15               ` Andrea Parri
2018-05-06 12:14           ` [tip:locking/core] locking/atomics: Simplify the op definitions in atomic.h some more tip-bot for Ingo Molnar
2018-05-09  7:33             ` Peter Zijlstra
2018-05-09 13:03               ` Will Deacon
2018-05-15  8:54                 ` Ingo Molnar
2018-05-15  8:35               ` Ingo Molnar
2018-05-15 11:41                 ` Peter Zijlstra
2018-05-15 12:13                   ` Peter Zijlstra
2018-05-15 15:43                   ` Mark Rutland
2018-05-15 17:10                     ` Peter Zijlstra
2018-05-15 17:53                       ` Mark Rutland
2018-05-15 18:11                         ` Peter Zijlstra
2018-05-15 18:15                           ` Peter Zijlstra
2018-05-15 18:52                             ` Linus Torvalds
2018-05-15 19:39                               ` Peter Zijlstra
2018-05-21 17:12                           ` Mark Rutland
2018-05-06 14:12           ` [PATCH] " Andrea Parri
2018-05-06 14:12             ` Andrea Parri
2018-05-06 14:57             ` Ingo Molnar
2018-05-06 14:57               ` Ingo Molnar
2018-05-07  9:54               ` Andrea Parri
2018-05-07  9:54                 ` Andrea Parri
2018-05-18 18:43               ` Palmer Dabbelt
2018-05-18 18:43                 ` Palmer Dabbelt
2018-05-05  8:47         ` [PATCH] locking/atomics: Clean up the atomic.h maze of #defines Peter Zijlstra
2018-05-05  8:47           ` Peter Zijlstra
2018-05-05  9:04           ` Ingo Molnar
2018-05-05  9:04             ` Ingo Molnar
2018-05-05  9:24             ` Peter Zijlstra
2018-05-05  9:24               ` Peter Zijlstra
2018-05-05  9:38             ` Ingo Molnar
2018-05-05  9:38               ` Ingo Molnar
2018-05-05 10:00               ` [RFC PATCH] locking/atomics/powerpc: Introduce optimized cmpxchg_release() family of APIs for PowerPC Ingo Molnar
2018-05-05 10:00                 ` Ingo Molnar
2018-05-05 10:26                 ` Boqun Feng
2018-05-05 10:26                   ` Boqun Feng
2018-05-06  1:56                 ` Benjamin Herrenschmidt
2018-05-06  1:56                   ` Benjamin Herrenschmidt
2018-05-05 10:16               ` [PATCH] locking/atomics: Clean up the atomic.h maze of #defines Boqun Feng
2018-05-05 10:16                 ` Boqun Feng
2018-05-05 10:35                 ` [RFC PATCH] locking/atomics/powerpc: Clarify why the cmpxchg_relaxed() family of APIs falls back to full cmpxchg() Ingo Molnar
2018-05-05 10:35                   ` Ingo Molnar
2018-05-05 11:28                   ` Boqun Feng
2018-05-05 11:28                     ` Boqun Feng
2018-05-05 13:27                     ` [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs Ingo Molnar
2018-05-05 13:27                       ` Ingo Molnar
2018-05-05 14:03                       ` Boqun Feng
2018-05-05 14:03                         ` Boqun Feng
2018-05-06 12:11                         ` Ingo Molnar
2018-05-06 12:11                           ` Ingo Molnar
2018-05-07  1:04                           ` Boqun Feng
2018-05-07  1:04                             ` Boqun Feng
2018-05-07  6:50                             ` Ingo Molnar
2018-05-07  6:50                               ` Ingo Molnar
2018-05-06 12:13                     ` [tip:locking/core] " tip-bot for Boqun Feng
2018-05-07 13:31                       ` [PATCH v2] " Boqun Feng
2018-05-07 13:31                         ` Boqun Feng
2018-05-05  9:05           ` [PATCH] locking/atomics: Clean up the atomic.h maze of #defines Dmitry Vyukov
2018-05-05  9:05             ` Dmitry Vyukov
2018-05-05  9:32             ` Peter Zijlstra
2018-05-05  9:32               ` Peter Zijlstra
2018-05-07  6:43               ` [RFC PATCH] locking/atomics/x86/64: Clean up and fix details of <asm/atomic64_64.h> Ingo Molnar
2018-05-07  6:43                 ` Ingo Molnar
2018-05-05  9:09           ` [PATCH] locking/atomics: Clean up the atomic.h maze of #defines Ingo Molnar
2018-05-05  9:09             ` Ingo Molnar
2018-05-05  9:29             ` Peter Zijlstra
2018-05-05  9:29               ` Peter Zijlstra
2018-05-05 10:48               ` [PATCH] locking/atomics: Shorten the __atomic_op() defines to __op() Ingo Molnar
2018-05-05 10:48                 ` Ingo Molnar
2018-05-05 10:59                 ` Ingo Molnar
2018-05-05 10:59                   ` Ingo Molnar
2018-05-06 12:15                 ` [tip:locking/core] " tip-bot for Ingo Molnar
2018-05-06 12:14         ` [tip:locking/core] locking/atomics: Clean up the atomic.h maze of #defines tip-bot for Ingo Molnar
2018-05-04 17:39 ` [PATCH 2/6] locking/atomic, asm-generic: instrument atomic*andnot*() Mark Rutland
2018-05-04 17:39   ` Mark Rutland
2018-05-04 17:39 ` [PATCH 3/6] arm64: use <linux/atomic.h> for cmpxchg Mark Rutland
2018-05-04 17:39   ` Mark Rutland
2018-05-04 17:39 ` [PATCH 4/6] arm64: fix assembly constraints " Mark Rutland
2018-05-04 17:39   ` Mark Rutland
2018-05-04 17:39 ` [PATCH 5/6] arm64: use instrumented atomics Mark Rutland
2018-05-04 17:39   ` Mark Rutland
2018-05-04 17:39 ` [PATCH 6/6] arm64: instrument smp_{load_acquire,store_release} Mark Rutland
2018-05-04 17:39   ` Mark Rutland
