* [PATCH 0/5] Add generic support for relaxed atomics
@ 2015-07-13 12:31 Will Deacon
  2015-07-13 12:31 ` [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations Will Deacon
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Will Deacon @ 2015-07-13 12:31 UTC (permalink / raw)
  To: linux-arch; +Cc: Waiman.Long, peterz, linux-kernel, paulmck, Will Deacon

Hello,

This patch series adds support for a family of relaxed atomics to the
kernel. More specifically:

  - acquire/release/relaxed flavours of xchg, cmpxchg and {add,sub}_return
  - atomic_read_acquire
  - atomic_set_release

This came out of a separate patch series porting the (barrier-heavy)
qrwlock code to arm64. Rather than have arch-specific hooks littered
around the place, it makes more sense to define a core set of relaxed
atomics that can be used regardless of architecture.

For now, the definitions simply fall back to the existing (i.e.
full-barrier) semantics, but arm64 gets a direct mapping onto its
load-acquire/store-release instructions, and even architectures with
explicit memory-barrier instructions (e.g. powerpc, arm) can benefit,
as they already do from the existing smp_load_acquire/smp_store_release
macros.
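
As an example, the generic fallbacks in patch 1 simply alias the new
names to the existing full-barrier operations when an architecture
provides nothing better:

	#ifndef atomic_add_return_acquire
	#define  atomic_add_return_acquire	atomic_add_return
	#endif

so existing architectures see no change in behaviour until they opt in
with their own relaxed definitions.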

The final patch is a proof-of-concept port of the qrwlock over to the
new atomics. It's based on some of the pending patches from me and Waiman,
so it won't apply to mainline but I think it illustrates the usage well
enough.

All feedback welcome,

Will

--->8

Will Deacon (5):
  atomics: add acquire/release/relaxed variants of some atomic
    operations
  asm-generic: rework atomic-long.h to avoid bulk code duplication
  asm-generic: add relaxed/acquire/release variants for atomic_long_t
  lockref: remove homebrew cmpxchg64_relaxed macro definition
  locking/qrwlock: make use of acquire/release/relaxed atomics

 Documentation/atomic_ops.txt      |   4 +-
 include/asm-generic/atomic-long.h | 263 ++++++++++++++------------------------
 include/asm-generic/qrwlock.h     |  13 +-
 include/linux/atomic.h            | 140 ++++++++++++++++++++
 kernel/locking/qrwlock.c          |  12 +-
 lib/lockref.c                     |   8 --
 6 files changed, 248 insertions(+), 192 deletions(-)

-- 
2.1.4



* [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations
  2015-07-13 12:31 [PATCH 0/5] Add generic support for relaxed atomics Will Deacon
@ 2015-07-13 12:31 ` Will Deacon
  2015-07-14 10:25   ` Peter Zijlstra
  2015-07-13 12:31 ` [PATCH 2/5] asm-generic: rework atomic-long.h to avoid bulk code duplication Will Deacon
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2015-07-13 12:31 UTC (permalink / raw)
  To: linux-arch; +Cc: Waiman.Long, peterz, linux-kernel, paulmck, Will Deacon

Whilst porting the generic qrwlock code over to arm64, it became
apparent that any portable locking code needs finer-grained control of
the memory-ordering guarantees provided by our atomic routines.

In particular, xchg, cmpxchg and {add,sub}_return are often used in
situations where full-barrier semantics (currently the only option
available) are not required. For example, when a reader increments the
reader count to obtain a lock and checks the old value to see whether a
writer was present, only acquire semantics are strictly needed.
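
A sketch of such a reader fast path (simplified from the qrwlock code
updated in patch 5):

	cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts);
	if (likely(!(cnts & _QW_WMASK)))
		return;		/* lock held for read, no writer seen */

Only the successful acquisition needs to be ordered before the critical
section, so acquire semantics suffice and a full barrier is wasted work.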

This patch introduces three new ordering semantics for these operations:

  - *_relaxed: No ordering guarantees. This is similar to what we have
               already for the non-return atomics (e.g. atomic_add).

  - *_acquire: ACQUIRE semantics, similar to smp_load_acquire.

  - *_release: RELEASE semantics, similar to smp_store_release.

In memory-ordering speak, this means that the acquire/release semantics
are RCpc as opposed to RCsc. Consequently a RELEASE followed by an
ACQUIRE does not imply a full barrier, as already documented in
memory-barriers.txt.
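
As an illustration (using made-up variables X, Y, S and T):

	WRITE_ONCE(X, 1);
	atomic_set_release(&S, 1);
	r0 = atomic_read_acquire(&T);
	r1 = READ_ONCE(Y);

the store to X is ordered before the release and the load from Y is
ordered after the acquire, but the RELEASE->ACQUIRE pair itself is not
a full barrier, so the store to X may still be reordered after the
load from Y.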

Currently, all of the new macros are conditionally mapped to the
full-barrier variants. One interesting property of the new semantics is
that a failed cmpxchg_acquire is not required to provide any
memory-ordering guarantees.
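
For example, in a typical lock-acquisition loop (sketch, with an
illustrative lock structure):

	do {
		while (atomic_read(&lock->val))
			cpu_relax();
	} while (atomic_cmpxchg_acquire(&lock->val, 0, 1) != 0);

only the final, successful cmpxchg needs to provide acquire ordering;
the failed attempts are free to remain unordered.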

Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 Documentation/atomic_ops.txt |   4 +-
 include/linux/atomic.h       | 140 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 143 insertions(+), 1 deletion(-)

diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
index dab6da3382d9..b19fc34efdb1 100644
--- a/Documentation/atomic_ops.txt
+++ b/Documentation/atomic_ops.txt
@@ -266,7 +266,9 @@ with the given old and new values. Like all atomic_xxx operations,
 atomic_cmpxchg will only satisfy its atomicity semantics as long as all
 other accesses of *v are performed through atomic_xxx operations.
 
-atomic_cmpxchg must provide explicit memory barriers around the operation.
+atomic_cmpxchg must provide explicit memory barriers around the operation,
+although if the comparison fails then no memory ordering guarantees are
+required.
 
 The semantics for atomic_cmpxchg are the same as those defined for 'cas'
 below.
diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 5b08a8540ecf..a78c3704cd51 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -128,4 +128,144 @@ static inline void atomic_or(int i, atomic_t *v)
 #ifdef CONFIG_GENERIC_ATOMIC64
 #include <asm-generic/atomic64.h>
 #endif
+
+/*
+ * Relaxed variants of xchg, cmpxchg and some atomic operations.
+ *
+ * We support four variants:
+ *
+ * - Fully ordered: The default implementation, no suffix required.
+ * - Acquire: Provides ACQUIRE semantics, _acquire suffix.
+ * - Release: Provides RELEASE semantics, _release suffix.
+ * - Relaxed: No ordering guarantees, _relaxed suffix.
+ *
+ * See Documentation/memory-barriers.txt for ACQUIRE/RELEASE definitions.
+ */
+
+#ifndef atomic_read_acquire
+#define  atomic_read_acquire(v)		smp_load_acquire(&(v)->counter)
+#endif
+
+#ifndef atomic_set_release
+#define  atomic_set_release(v, i)	smp_store_release(&(v)->counter, (i))
+#endif
+
+#ifndef atomic_add_return_relaxed
+#define  atomic_add_return_relaxed	atomic_add_return
+#endif
+#ifndef atomic_add_return_acquire
+#define  atomic_add_return_acquire	atomic_add_return
+#endif
+#ifndef atomic_add_return_release
+#define  atomic_add_return_release	atomic_add_return
+#endif
+
+#ifndef atomic_sub_return_relaxed
+#define  atomic_sub_return_relaxed	atomic_sub_return
+#endif
+#ifndef atomic_sub_return_acquire
+#define  atomic_sub_return_acquire	atomic_sub_return
+#endif
+#ifndef atomic_sub_return_release
+#define  atomic_sub_return_release	atomic_sub_return
+#endif
+
+#ifndef atomic_xchg_relaxed
+#define  atomic_xchg_relaxed		atomic_xchg
+#endif
+#ifndef atomic_xchg_acquire
+#define  atomic_xchg_acquire		atomic_xchg
+#endif
+#ifndef atomic_xchg_release
+#define  atomic_xchg_release		atomic_xchg
+#endif
+
+#ifndef atomic_cmpxchg_relaxed
+#define  atomic_cmpxchg_relaxed		atomic_cmpxchg
+#endif
+#ifndef atomic_cmpxchg_acquire
+#define  atomic_cmpxchg_acquire		atomic_cmpxchg
+#endif
+#ifndef atomic_cmpxchg_release
+#define  atomic_cmpxchg_release		atomic_cmpxchg
+#endif
+
+#ifndef atomic64_read_acquire
+#define  atomic64_read_acquire(v)	smp_load_acquire(&(v)->counter)
+#endif
+
+#ifndef atomic64_set_release
+#define  atomic64_set_release(v, i)	smp_store_release(&(v)->counter, (i))
+#endif
+
+#ifndef atomic64_add_return_relaxed
+#define  atomic64_add_return_relaxed	atomic64_add_return
+#endif
+#ifndef atomic64_add_return_acquire
+#define  atomic64_add_return_acquire	atomic64_add_return
+#endif
+#ifndef atomic64_add_return_release
+#define  atomic64_add_return_release	atomic64_add_return
+#endif
+
+#ifndef atomic64_sub_return_relaxed
+#define  atomic64_sub_return_relaxed	atomic64_sub_return
+#endif
+#ifndef atomic64_sub_return_acquire
+#define  atomic64_sub_return_acquire	atomic64_sub_return
+#endif
+#ifndef atomic64_sub_return_release
+#define  atomic64_sub_return_release	atomic64_sub_return
+#endif
+
+#ifndef atomic64_xchg_relaxed
+#define  atomic64_xchg_relaxed		atomic64_xchg
+#endif
+#ifndef atomic64_xchg_acquire
+#define  atomic64_xchg_acquire		atomic64_xchg
+#endif
+#ifndef atomic64_xchg_release
+#define  atomic64_xchg_release		atomic64_xchg
+#endif
+
+#ifndef atomic64_cmpxchg_relaxed
+#define  atomic64_cmpxchg_relaxed	atomic64_cmpxchg
+#endif
+#ifndef atomic64_cmpxchg_acquire
+#define  atomic64_cmpxchg_acquire	atomic64_cmpxchg
+#endif
+#ifndef atomic64_cmpxchg_release
+#define  atomic64_cmpxchg_release	atomic64_cmpxchg
+#endif
+
+#ifndef cmpxchg_relaxed
+#define  cmpxchg_relaxed		cmpxchg
+#endif
+#ifndef cmpxchg_acquire
+#define  cmpxchg_acquire		cmpxchg
+#endif
+#ifndef cmpxchg_release
+#define  cmpxchg_release		cmpxchg
+#endif
+
+#ifndef cmpxchg64_relaxed
+#define  cmpxchg64_relaxed		cmpxchg64
+#endif
+#ifndef cmpxchg64_acquire
+#define  cmpxchg64_acquire		cmpxchg64
+#endif
+#ifndef cmpxchg64_release
+#define  cmpxchg64_release		cmpxchg64
+#endif
+
+#ifndef xchg_relaxed
+#define  xchg_relaxed			xchg
+#endif
+#ifndef xchg_acquire
+#define  xchg_acquire			xchg
+#endif
+#ifndef xchg_release
+#define  xchg_release			xchg
+#endif
+
 #endif /* _LINUX_ATOMIC_H */
-- 
2.1.4



* [PATCH 2/5] asm-generic: rework atomic-long.h to avoid bulk code duplication
  2015-07-13 12:31 [PATCH 0/5] Add generic support for relaxed atomics Will Deacon
  2015-07-13 12:31 ` [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations Will Deacon
@ 2015-07-13 12:31 ` Will Deacon
  2015-07-13 12:31 ` [PATCH 3/5] asm-generic: add relaxed/acquire/release variants for atomic_long_t Will Deacon
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2015-07-13 12:31 UTC (permalink / raw)
  To: linux-arch; +Cc: Waiman.Long, peterz, linux-kernel, paulmck, Will Deacon

We can use some (admittedly ugly) macros to generate both the 32-bit
and 64-bit atomic_long implementations from the same code.
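
For example, with the ATOMIC_LONG_PFX() helper introduced below,

	ATOMIC_LONG_PFX(_add_return)(i, v)

expands to atomic64_add_return(i, v) when BITS_PER_LONG == 64 and to
atomic_add_return(i, v) otherwise, so each wrapper only needs to be
written once.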

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 include/asm-generic/atomic-long.h | 189 ++++++++------------------------------
 1 file changed, 40 insertions(+), 149 deletions(-)

diff --git a/include/asm-generic/atomic-long.h b/include/asm-generic/atomic-long.h
index b7babf0206b8..beaea541adfb 100644
--- a/include/asm-generic/atomic-long.h
+++ b/include/asm-generic/atomic-long.h
@@ -23,236 +23,127 @@
 typedef atomic64_t atomic_long_t;
 
 #define ATOMIC_LONG_INIT(i)	ATOMIC64_INIT(i)
+#define ATOMIC_LONG_PFX(x)	atomic64 ## x
 
-static inline long atomic_long_read(atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return (long)atomic64_read(v);
-}
-
-static inline void atomic_long_set(atomic_long_t *l, long i)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	atomic64_set(v, i);
-}
-
-static inline void atomic_long_inc(atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	atomic64_inc(v);
-}
-
-static inline void atomic_long_dec(atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	atomic64_dec(v);
-}
-
-static inline void atomic_long_add(long i, atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	atomic64_add(i, v);
-}
-
-static inline void atomic_long_sub(long i, atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	atomic64_sub(i, v);
-}
-
-static inline int atomic_long_sub_and_test(long i, atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return atomic64_sub_and_test(i, v);
-}
-
-static inline int atomic_long_dec_and_test(atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return atomic64_dec_and_test(v);
-}
-
-static inline int atomic_long_inc_and_test(atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return atomic64_inc_and_test(v);
-}
-
-static inline int atomic_long_add_negative(long i, atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return atomic64_add_negative(i, v);
-}
-
-static inline long atomic_long_add_return(long i, atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return (long)atomic64_add_return(i, v);
-}
-
-static inline long atomic_long_sub_return(long i, atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return (long)atomic64_sub_return(i, v);
-}
-
-static inline long atomic_long_inc_return(atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return (long)atomic64_inc_return(v);
-}
-
-static inline long atomic_long_dec_return(atomic_long_t *l)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return (long)atomic64_dec_return(v);
-}
-
-static inline long atomic_long_add_unless(atomic_long_t *l, long a, long u)
-{
-	atomic64_t *v = (atomic64_t *)l;
-
-	return (long)atomic64_add_unless(v, a, u);
-}
-
-#define atomic_long_inc_not_zero(l) atomic64_inc_not_zero((atomic64_t *)(l))
-
-#define atomic_long_cmpxchg(l, old, new) \
-	(atomic64_cmpxchg((atomic64_t *)(l), (old), (new)))
-#define atomic_long_xchg(v, new) \
-	(atomic64_xchg((atomic64_t *)(v), (new)))
-
-#else  /*  BITS_PER_LONG == 64  */
+#else
 
 typedef atomic_t atomic_long_t;
 
 #define ATOMIC_LONG_INIT(i)	ATOMIC_INIT(i)
+#define ATOMIC_LONG_PFX(x)	atomic ## x
+
+#endif
+
 static inline long atomic_long_read(atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return (long)atomic_read(v);
+	return (long)ATOMIC_LONG_PFX(_read)(v);
 }
 
 static inline void atomic_long_set(atomic_long_t *l, long i)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	atomic_set(v, i);
+	ATOMIC_LONG_PFX(_set)(v, i);
 }
 
 static inline void atomic_long_inc(atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	atomic_inc(v);
+	ATOMIC_LONG_PFX(_inc)(v);
 }
 
 static inline void atomic_long_dec(atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	atomic_dec(v);
+	ATOMIC_LONG_PFX(_dec)(v);
 }
 
 static inline void atomic_long_add(long i, atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	atomic_add(i, v);
+	ATOMIC_LONG_PFX(_add)(i, v);
 }
 
 static inline void atomic_long_sub(long i, atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	atomic_sub(i, v);
+	ATOMIC_LONG_PFX(_sub)(i, v);
 }
 
 static inline int atomic_long_sub_and_test(long i, atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return atomic_sub_and_test(i, v);
+	return ATOMIC_LONG_PFX(_sub_and_test)(i, v);
 }
 
 static inline int atomic_long_dec_and_test(atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return atomic_dec_and_test(v);
+	return ATOMIC_LONG_PFX(_dec_and_test)(v);
 }
 
 static inline int atomic_long_inc_and_test(atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return atomic_inc_and_test(v);
+	return ATOMIC_LONG_PFX(_inc_and_test)(v);
 }
 
 static inline int atomic_long_add_negative(long i, atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return atomic_add_negative(i, v);
+	return ATOMIC_LONG_PFX(_add_negative)(i, v);
 }
 
 static inline long atomic_long_add_return(long i, atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return (long)atomic_add_return(i, v);
+	return (long)ATOMIC_LONG_PFX(_add_return)(i, v);
 }
 
 static inline long atomic_long_sub_return(long i, atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return (long)atomic_sub_return(i, v);
+	return (long)ATOMIC_LONG_PFX(_sub_return)(i, v);
 }
 
 static inline long atomic_long_inc_return(atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return (long)atomic_inc_return(v);
+	return (long)ATOMIC_LONG_PFX(_inc_return)(v);
 }
 
 static inline long atomic_long_dec_return(atomic_long_t *l)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return (long)atomic_dec_return(v);
+	return (long)ATOMIC_LONG_PFX(_dec_return)(v);
 }
 
 static inline long atomic_long_add_unless(atomic_long_t *l, long a, long u)
 {
-	atomic_t *v = (atomic_t *)l;
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
 
-	return (long)atomic_add_unless(v, a, u);
+	return (long)ATOMIC_LONG_PFX(_add_unless)(v, a, u);
 }
 
-#define atomic_long_inc_not_zero(l) atomic_inc_not_zero((atomic_t *)(l))
-
+#define atomic_long_inc_not_zero(l) \
+	ATOMIC_LONG_PFX(_inc_not_zero)((ATOMIC_LONG_PFX(_t) *)(l))
 #define atomic_long_cmpxchg(l, old, new) \
-	(atomic_cmpxchg((atomic_t *)(l), (old), (new)))
+	(ATOMIC_LONG_PFX(_cmpxchg)((ATOMIC_LONG_PFX(_t) *)(l), (old), (new)))
 #define atomic_long_xchg(v, new) \
-	(atomic_xchg((atomic_t *)(v), (new)))
-
-#endif  /*  BITS_PER_LONG == 64  */
+	(ATOMIC_LONG_PFX(_xchg)((ATOMIC_LONG_PFX(_t) *)(v), (new)))
 
 #endif  /*  _ASM_GENERIC_ATOMIC_LONG_H  */
-- 
2.1.4



* [PATCH 3/5] asm-generic: add relaxed/acquire/release variants for atomic_long_t
  2015-07-13 12:31 [PATCH 0/5] Add generic support for relaxed atomics Will Deacon
  2015-07-13 12:31 ` [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations Will Deacon
  2015-07-13 12:31 ` [PATCH 2/5] asm-generic: rework atomic-long.h to avoid bulk code duplication Will Deacon
@ 2015-07-13 12:31 ` Will Deacon
  2015-07-13 12:31 ` [PATCH 4/5] lockref: remove homebrew cmpxchg64_relaxed macro definition Will Deacon
  2015-07-13 12:31 ` [PATCH 5/5] locking/qrwlock: make use of acquire/release/relaxed atomics Will Deacon
  4 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2015-07-13 12:31 UTC (permalink / raw)
  To: linux-arch; +Cc: Waiman.Long, peterz, linux-kernel, paulmck, Will Deacon

This patch adds atomic_long_t wrappers for the new relaxed atomic
operations.
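
Each wrapper is generated from a macro parameterised by the ordering
suffix; for instance, ATOMIC_LONG_ADD_SUB_OP(add, _acquire) below
expands to (roughly):

	static inline long
	atomic_long_add_return_acquire(long i, atomic_long_t *l)
	{
		ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;

		return (long)ATOMIC_LONG_PFX(_add_return_acquire)(i, v);
	}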

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 include/asm-generic/atomic-long.h | 86 +++++++++++++++++++++++++++------------
 include/linux/atomic.h            | 10 ++---
 2 files changed, 64 insertions(+), 32 deletions(-)

diff --git a/include/asm-generic/atomic-long.h b/include/asm-generic/atomic-long.h
index beaea541adfb..a94cbebbc33d 100644
--- a/include/asm-generic/atomic-long.h
+++ b/include/asm-generic/atomic-long.h
@@ -34,19 +34,69 @@ typedef atomic_t atomic_long_t;
 
 #endif
 
-static inline long atomic_long_read(atomic_long_t *l)
-{
-	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
-
-	return (long)ATOMIC_LONG_PFX(_read)(v);
+#define ATOMIC_LONG_READ_OP(mo)						\
+static inline long atomic_long_read##mo(atomic_long_t *l)		\
+{									\
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;		\
+									\
+	return (long)ATOMIC_LONG_PFX(_read##mo)(v);			\
 }
+ATOMIC_LONG_READ_OP()
+ATOMIC_LONG_READ_OP(_acquire)
 
-static inline void atomic_long_set(atomic_long_t *l, long i)
-{
-	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
+#undef ATOMIC_LONG_READ_OP
 
-	ATOMIC_LONG_PFX(_set)(v, i);
+#define ATOMIC_LONG_SET_OP(mo)						\
+static inline void atomic_long_set##mo(atomic_long_t *l, long i)	\
+{									\
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;		\
+									\
+	ATOMIC_LONG_PFX(_set##mo)(v, i);				\
+}
+ATOMIC_LONG_SET_OP()
+ATOMIC_LONG_SET_OP(_release)
+
+#undef ATOMIC_LONG_SET_OP
+
+#define ATOMIC_LONG_ADD_SUB_OP(op, mo)					\
+static inline long							\
+atomic_long_##op##_return##mo(long i, atomic_long_t *l)			\
+{									\
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;		\
+									\
+	return (long)ATOMIC_LONG_PFX(_##op##_return##mo)(i, v);		\
 }
+ATOMIC_LONG_ADD_SUB_OP(add,)
+ATOMIC_LONG_ADD_SUB_OP(add, _relaxed)
+ATOMIC_LONG_ADD_SUB_OP(add, _acquire)
+ATOMIC_LONG_ADD_SUB_OP(add, _release)
+ATOMIC_LONG_ADD_SUB_OP(sub,)
+ATOMIC_LONG_ADD_SUB_OP(sub, _relaxed)
+ATOMIC_LONG_ADD_SUB_OP(sub, _acquire)
+ATOMIC_LONG_ADD_SUB_OP(sub, _release)
+
+#undef ATOMIC_LONG_ADD_SUB_OP
+
+#define atomic_long_cmpxchg_relaxed(l, old, new) \
+	(ATOMIC_LONG_PFX(_cmpxchg_relaxed)((ATOMIC_LONG_PFX(_t) *)(l), \
+					   (old), (new)))
+#define atomic_long_cmpxchg_acquire(l, old, new) \
+	(ATOMIC_LONG_PFX(_cmpxchg_acquire)((ATOMIC_LONG_PFX(_t) *)(l), \
+					   (old), (new)))
+#define atomic_long_cmpxchg_release(l, old, new) \
+	(ATOMIC_LONG_PFX(_cmpxchg_release)((ATOMIC_LONG_PFX(_t) *)(l), \
+					   (old), (new)))
+#define atomic_long_cmpxchg(l, old, new) \
+	(ATOMIC_LONG_PFX(_cmpxchg)((ATOMIC_LONG_PFX(_t) *)(l), (old), (new)))
+
+#define atomic_long_xchg_relaxed(v, new) \
+	(ATOMIC_LONG_PFX(_xchg_relaxed)((ATOMIC_LONG_PFX(_t) *)(v), (new)))
+#define atomic_long_xchg_acquire(v, new) \
+	(ATOMIC_LONG_PFX(_xchg_acquire)((ATOMIC_LONG_PFX(_t) *)(v), (new)))
+#define atomic_long_xchg_release(v, new) \
+	(ATOMIC_LONG_PFX(_xchg_release)((ATOMIC_LONG_PFX(_t) *)(v), (new)))
+#define atomic_long_xchg(v, new) \
+	(ATOMIC_LONG_PFX(_xchg)((ATOMIC_LONG_PFX(_t) *)(v), (new)))
 
 static inline void atomic_long_inc(atomic_long_t *l)
 {
@@ -104,20 +154,6 @@ static inline int atomic_long_add_negative(long i, atomic_long_t *l)
 	return ATOMIC_LONG_PFX(_add_negative)(i, v);
 }
 
-static inline long atomic_long_add_return(long i, atomic_long_t *l)
-{
-	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
-
-	return (long)ATOMIC_LONG_PFX(_add_return)(i, v);
-}
-
-static inline long atomic_long_sub_return(long i, atomic_long_t *l)
-{
-	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
-
-	return (long)ATOMIC_LONG_PFX(_sub_return)(i, v);
-}
-
 static inline long atomic_long_inc_return(atomic_long_t *l)
 {
 	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;
@@ -141,9 +177,5 @@ static inline long atomic_long_add_unless(atomic_long_t *l, long a, long u)
 
 #define atomic_long_inc_not_zero(l) \
 	ATOMIC_LONG_PFX(_inc_not_zero)((ATOMIC_LONG_PFX(_t) *)(l))
-#define atomic_long_cmpxchg(l, old, new) \
-	(ATOMIC_LONG_PFX(_cmpxchg)((ATOMIC_LONG_PFX(_t) *)(l), (old), (new)))
-#define atomic_long_xchg(v, new) \
-	(ATOMIC_LONG_PFX(_xchg)((ATOMIC_LONG_PFX(_t) *)(v), (new)))
 
 #endif  /*  _ASM_GENERIC_ATOMIC_LONG_H  */
diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index a78c3704cd51..a8b4c68790de 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -124,11 +124,6 @@ static inline void atomic_or(int i, atomic_t *v)
 }
 #endif /* #ifndef CONFIG_ARCH_HAS_ATOMIC_OR */
 
-#include <asm-generic/atomic-long.h>
-#ifdef CONFIG_GENERIC_ATOMIC64
-#include <asm-generic/atomic64.h>
-#endif
-
 /*
  * Relaxed variants of xchg, cmpxchg and some atomic operations.
  *
@@ -268,4 +263,9 @@ static inline void atomic_or(int i, atomic_t *v)
 #define  xchg_release			xchg
 #endif
 
+#include <asm-generic/atomic-long.h>
+#ifdef CONFIG_GENERIC_ATOMIC64
+#include <asm-generic/atomic64.h>
+#endif
+
 #endif /* _LINUX_ATOMIC_H */
-- 
2.1.4



* [PATCH 4/5] lockref: remove homebrew cmpxchg64_relaxed macro definition
  2015-07-13 12:31 [PATCH 0/5] Add generic support for relaxed atomics Will Deacon
                   ` (2 preceding siblings ...)
  2015-07-13 12:31 ` [PATCH 3/5] asm-generic: add relaxed/acquire/release variants for atomic_long_t Will Deacon
@ 2015-07-13 12:31 ` Will Deacon
  2015-07-13 12:31 ` [PATCH 5/5] locking/qrwlock: make use of acquire/release/relaxed atomics Will Deacon
  4 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2015-07-13 12:31 UTC (permalink / raw)
  To: linux-arch; +Cc: Waiman.Long, peterz, linux-kernel, paulmck, Will Deacon

cmpxchg64_relaxed is now defined by linux/atomic.h, so we can remove
our local definition from the lockref code.
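
The generic fallback added in patch 1 covers the same case:

	#ifndef cmpxchg64_relaxed
	#define  cmpxchg64_relaxed		cmpxchg64
	#endif

so architectures without a relaxed cmpxchg64 see no change in
behaviour.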

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 lib/lockref.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/lib/lockref.c b/lib/lockref.c
index 494994bf17c8..5a92189ad711 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -4,14 +4,6 @@
 #if USE_CMPXCHG_LOCKREF
 
 /*
- * Allow weakly-ordered memory architectures to provide barrier-less
- * cmpxchg semantics for lockref updates.
- */
-#ifndef cmpxchg64_relaxed
-# define cmpxchg64_relaxed cmpxchg64
-#endif
-
-/*
  * Note that the "cmpxchg()" reloads the "old" value for the
  * failure case.
  */
-- 
2.1.4



* [PATCH 5/5] locking/qrwlock: make use of acquire/release/relaxed atomics
  2015-07-13 12:31 [PATCH 0/5] Add generic support for relaxed atomics Will Deacon
                   ` (3 preceding siblings ...)
  2015-07-13 12:31 ` [PATCH 4/5] lockref: remove homebrew cmpxchg64_relaxed macro definition Will Deacon
@ 2015-07-13 12:31 ` Will Deacon
  4 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2015-07-13 12:31 UTC (permalink / raw)
  To: linux-arch; +Cc: Waiman.Long, peterz, linux-kernel, paulmck, Will Deacon

The qrwlock implementation is slightly heavy in its use of memory
barriers, mainly through cmpxchg and the _return atomics, both of which
imply full-barrier semantics.

This patch modifies the qrwlock code to use the more relaxed atomic
routines so that we can reduce the unnecessary barrier overhead on
weakly-ordered architectures.
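
The pattern used here (mirroring the hunks below) is acquire ordering
on the lock paths and release ordering on unlock, for example:

	/* write lock fast path */
	if (atomic_cmpxchg_acquire(&lock->cnts, 0, _QW_LOCKED) == 0)
		return;

	/* read unlock */
	(void)atomic_sub_return_release(_QR_BIAS, &lock->cnts);

with _relaxed used where a later acquire or full barrier already
provides the required ordering.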

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 include/asm-generic/qrwlock.h | 13 ++++++-------
 kernel/locking/qrwlock.c      | 12 ++++++------
 2 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 930920501e33..6ab846bf6413 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -68,7 +68,7 @@ static inline int queued_read_trylock(struct qrwlock *lock)
 
 	cnts = atomic_read(&lock->cnts);
 	if (likely(!(cnts & _QW_WMASK))) {
-		cnts = (u32)atomic_add_return(_QR_BIAS, &lock->cnts);
+		cnts = (u32)atomic_add_return_acquire(_QR_BIAS, &lock->cnts);
 		if (likely(!(cnts & _QW_WMASK)))
 			return 1;
 		atomic_sub(_QR_BIAS, &lock->cnts);
@@ -89,8 +89,8 @@ static inline int queued_write_trylock(struct qrwlock *lock)
 	if (unlikely(cnts))
 		return 0;
 
-	return likely(atomic_cmpxchg(&lock->cnts,
-				     cnts, cnts | _QW_LOCKED) == cnts);
+	return likely(atomic_cmpxchg_acquire(&lock->cnts,
+					     cnts, cnts | _QW_LOCKED) == cnts);
 }
 /**
  * queued_read_lock - acquire read lock of a queue rwlock
@@ -100,7 +100,7 @@ static inline void queued_read_lock(struct qrwlock *lock)
 {
 	u32 cnts;
 
-	cnts = atomic_add_return(_QR_BIAS, &lock->cnts);
+	cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts);
 	if (likely(!(cnts & _QW_WMASK)))
 		return;
 
@@ -115,7 +115,7 @@ static inline void queued_read_lock(struct qrwlock *lock)
 static inline void queued_write_lock(struct qrwlock *lock)
 {
 	/* Optimize for the unfair lock case where the fair flag is 0. */
-	if (atomic_cmpxchg(&lock->cnts, 0, _QW_LOCKED) == 0)
+	if (atomic_cmpxchg_acquire(&lock->cnts, 0, _QW_LOCKED) == 0)
 		return;
 
 	queued_write_lock_slowpath(lock);
@@ -130,8 +130,7 @@ static inline void queued_read_unlock(struct qrwlock *lock)
 	/*
 	 * Atomically decrement the reader count
 	 */
-	smp_mb__before_atomic();
-	atomic_sub(_QR_BIAS, &lock->cnts);
+	(void)atomic_sub_return_release(_QR_BIAS, &lock->cnts);
 }
 
 /**
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index a71bb3541880..879c8fab7bea 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -36,7 +36,7 @@ rspin_until_writer_unlock(struct qrwlock *lock, u32 cnts)
 {
 	while ((cnts & _QW_WMASK) == _QW_LOCKED) {
 		cpu_relax_lowlatency();
-		cnts = smp_load_acquire((u32 *)&lock->cnts);
+		cnts = atomic_read_acquire(&lock->cnts);
 	}
 }
 
@@ -78,7 +78,7 @@ void queued_read_lock_slowpath(struct qrwlock *lock, u32 cnts)
 	while (atomic_read(&lock->cnts) & _QW_WMASK)
 		cpu_relax_lowlatency();
 
-	cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
+	cnts = atomic_add_return_relaxed(_QR_BIAS, &lock->cnts) - _QR_BIAS;
 	rspin_until_writer_unlock(lock, cnts);
 
 	/*
@@ -101,7 +101,7 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 
 	/* Try to acquire the lock directly if no reader is present */
 	if (!atomic_read(&lock->cnts) &&
-	    (atomic_cmpxchg(&lock->cnts, 0, _QW_LOCKED) == 0))
+	    (atomic_cmpxchg_acquire(&lock->cnts, 0, _QW_LOCKED) == 0))
 		goto unlock;
 
 	/*
@@ -110,7 +110,7 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 	 */
 	for (;;) {
 		if (!READ_ONCE(lock->wmode) &&
-		   (cmpxchg(&lock->wmode, 0, _QW_WAITING) == 0))
+		   (cmpxchg_relaxed(&lock->wmode, 0, _QW_WAITING) == 0))
 			break;
 
 		cpu_relax_lowlatency();
@@ -120,8 +120,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 	for (;;) {
 		cnts = atomic_read(&lock->cnts);
 		if ((cnts == _QW_WAITING) &&
-		    (atomic_cmpxchg(&lock->cnts, _QW_WAITING,
-				    _QW_LOCKED) == _QW_WAITING))
+		    (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
+					    _QW_LOCKED) == _QW_WAITING))
 			break;
 
 		cpu_relax_lowlatency();
-- 
2.1.4



* Re: [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations
  2015-07-13 12:31 ` [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations Will Deacon
@ 2015-07-14 10:25   ` Peter Zijlstra
  2015-07-14 10:32     ` Will Deacon
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2015-07-14 10:25 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arch, Waiman.Long, linux-kernel, paulmck

On Mon, Jul 13, 2015 at 01:31:23PM +0100, Will Deacon wrote:
> +
> +/*
> + * Relaxed variants of xchg, cmpxchg and some atomic operations.
> + *
> + * We support four variants:
> + *
> + * - Fully ordered: The default implementation, no suffix required.
> + * - Acquire: Provides ACQUIRE semantics, _acquire suffix.
> + * - Release: Provides RELEASE semantics, _release suffix.
> + * - Relaxed: No ordering guarantees, _relaxed suffix.
> + *
> + * See Documentation/memory-barriers.txt for ACQUIRE/RELEASE definitions.
> + */
> +
> +#ifndef atomic_read_acquire
> +#define  atomic_read_acquire(v)		smp_load_acquire(&(v)->counter)
> +#endif
> +
> +#ifndef atomic_set_release
> +#define  atomic_set_release(v, i)	smp_store_release(&(v)->counter, (i))
> +#endif
> +
> +#ifndef atomic_add_return_relaxed
> +#define  atomic_add_return_relaxed	atomic_add_return
> +#endif
> +#ifndef atomic_add_return_acquire
> +#define  atomic_add_return_acquire	atomic_add_return
> +#endif
> +#ifndef atomic_add_return_release
> +#define  atomic_add_return_release	atomic_add_return
> +#endif

Could we not define _{acquire,release} in terms of _relaxed and
smp_mb__{after,before}_atomic() ?

That way most archs really only need to implement _relaxed and be done
with it. ARM is special in that it actually has acquire/release
semantics in hardware (IA64 has it in the ISA but it's not effective).



* Re: [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations
  2015-07-14 10:25   ` Peter Zijlstra
@ 2015-07-14 10:32     ` Will Deacon
  2015-07-14 10:58       ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2015-07-14 10:32 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-arch, Waiman.Long, linux-kernel, paulmck

On Tue, Jul 14, 2015 at 11:25:11AM +0100, Peter Zijlstra wrote:
> On Mon, Jul 13, 2015 at 01:31:23PM +0100, Will Deacon wrote:
> > +
> > +/*
> > + * Relaxed variants of xchg, cmpxchg and some atomic operations.
> > + *
> > + * We support four variants:
> > + *
> > + * - Fully ordered: The default implementation, no suffix required.
> > + * - Acquire: Provides ACQUIRE semantics, _acquire suffix.
> > + * - Release: Provides RELEASE semantics, _release suffix.
> > + * - Relaxed: No ordering guarantees, _relaxed suffix.
> > + *
> > + * See Documentation/memory-barriers.txt for ACQUIRE/RELEASE definitions.
> > + */
> > +
> > +#ifndef atomic_read_acquire
> > +#define  atomic_read_acquire(v)		smp_load_acquire(&(v)->counter)
> > +#endif
> > +
> > +#ifndef atomic_set_release
> > +#define  atomic_set_release(v, i)	smp_store_release(&(v)->counter, (i))
> > +#endif
> > +
> > +#ifndef atomic_add_return_relaxed
> > +#define  atomic_add_return_relaxed	atomic_add_return
> > +#endif
> > +#ifndef atomic_add_return_acquire
> > +#define  atomic_add_return_acquire	atomic_add_return
> > +#endif
> > +#ifndef atomic_add_return_release
> > +#define  atomic_add_return_release	atomic_add_return
> > +#endif
> 
> Could we not define _{acquire,release} in terms of _relaxed and
> smp_mb__{after,before}_atomic() ?

I actually started out with that, but it penalises architectures that
don't have _relaxed implementations of some routines.

For example, cmpxchg_acquire would likely expand to:

	cmpxchg_relaxed(); smp_mb__after_atomic();
->	cmpxchg(); smp_mb__after_atomic();
->	smp_mb(); __cmpxchg(); __smp_mb(); smp_mb();

(where __cmpxchg() is some arch-internal cmpxchg implementation)

Will


* Re: [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations
  2015-07-14 10:32     ` Will Deacon
@ 2015-07-14 10:58       ` Peter Zijlstra
  2015-07-14 11:08         ` Will Deacon
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2015-07-14 10:58 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arch, Waiman.Long, linux-kernel, paulmck

On Tue, Jul 14, 2015 at 11:32:20AM +0100, Will Deacon wrote:
> On Tue, Jul 14, 2015 at 11:25:11AM +0100, Peter Zijlstra wrote:
> > On Mon, Jul 13, 2015 at 01:31:23PM +0100, Will Deacon wrote:
> > > +
> > > +/*
> > > + * Relaxed variants of xchg, cmpxchg and some atomic operations.
> > > + *
> > > + * We support four variants:
> > > + *
> > > + * - Fully ordered: The default implementation, no suffix required.
> > > + * - Acquire: Provides ACQUIRE semantics, _acquire suffix.
> > > + * - Release: Provides RELEASE semantics, _release suffix.
> > > + * - Relaxed: No ordering guarantees, _relaxed suffix.
> > > + *
> > > + * See Documentation/memory-barriers.txt for ACQUIRE/RELEASE definitions.
> > > + */
> > > +
> > > +#ifndef atomic_read_acquire
> > > +#define  atomic_read_acquire(v)		smp_load_acquire(&(v)->counter)
> > > +#endif
> > > +
> > > +#ifndef atomic_set_release
> > > +#define  atomic_set_release(v, i)	smp_store_release(&(v)->counter, (i))
> > > +#endif
> > > +
> > > +#ifndef atomic_add_return_relaxed
> > > +#define  atomic_add_return_relaxed	atomic_add_return
> > > +#endif
> > > +#ifndef atomic_add_return_acquire
> > > +#define  atomic_add_return_acquire	atomic_add_return
> > > +#endif
> > > +#ifndef atomic_add_return_release
> > > +#define  atomic_add_return_release	atomic_add_return
> > > +#endif
> > 
> > Could we not define _{acquire,release} in terms of _relaxed and
> > smp_mb__{after,before}_atomic() ?
> 
> I actually started out with that, but it penalises architectures that
> don't have _relaxed implementations of some routines.

#ifndef atomic_add_return_relaxed

#define atomic_add_return_relaxed	atomic_add_return
/*
 * If one cannot define a more relaxed version,
 * acquire/release are out the window too.
 */
#define  atomic_add_return_acquire	atomic_add_return
#define  atomic_add_return_release	atomic_add_return

#else /* relaxed */

#ifndef atomic_add_return_acquire
#define  atomic_add_return_acquire(args...)	\
do {						\
	atomic_add_return_relaxed(args);	\
	smp_mb__after_atomic();			\
} while (0)
#endif

#ifndef atomic_add_return_release
#define  atomic_add_return_release(args...)	\
do {						\
	smp_mb__before_atomic();		\
	atomic_add_return_relaxed(args);	\
} while (0)
#endif

#endif /* relaxed */



* Re: [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations
  2015-07-14 10:58       ` Peter Zijlstra
@ 2015-07-14 11:08         ` Will Deacon
  2015-07-14 11:24           ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2015-07-14 11:08 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-arch, Waiman.Long, linux-kernel, paulmck

On Tue, Jul 14, 2015 at 11:58:37AM +0100, Peter Zijlstra wrote:
> On Tue, Jul 14, 2015 at 11:32:20AM +0100, Will Deacon wrote:
> > On Tue, Jul 14, 2015 at 11:25:11AM +0100, Peter Zijlstra wrote:
> > > Could we not define _{acquire,release} in terms of _relaxed and
> > > smp_mb__{after,before}_atomic() ?
> > 
> > I actually started out with that, but it penalises architectures that
> > don't have _relaxed implementations of some routines.
> 
> #ifndef atomic_add_return_relaxed
> 
> #define atomic_add_return_relaxed	atomic_add_return
> /*
>  * If one cannot define a more relaxed version,
>  * acquire/release are out the window too.
>  */
> #define  atomic_add_return_acquire	atomic_add_return
> #define  atomic_add_return_release	atomic_add_return
> 
> #else /* relaxed */
> 
> #ifndef atomic_add_return_acquire
> #define  atomic_add_return_acquire(args...)	\
> do {						\
> 	atomic_add_return_relaxed(args);	\
> 	smp_mb__after_atomic();			\
> } while (0)
> #endif
> 
> #ifndef atomic_add_return_release
> #define  atomic_add_return_release(args...)	\
> do {						\
> 	smp_mb__before_atomic();		\
> 	atomic_add_return_relaxed(args);	\
> } while (0)
> #endif
> 
> #endif /* relaxed */

If a picture is worth a thousand words, a patch is worth at least a thousand
pictures.

I'll add this for v2.

Will


* Re: [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations
  2015-07-14 11:08         ` Will Deacon
@ 2015-07-14 11:24           ` Peter Zijlstra
  2015-07-14 11:31             ` Will Deacon
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2015-07-14 11:24 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arch, Waiman.Long, linux-kernel, paulmck

On Tue, Jul 14, 2015 at 12:08:11PM +0100, Will Deacon wrote:
> On Tue, Jul 14, 2015 at 11:58:37AM +0100, Peter Zijlstra wrote:


> > #ifndef atomic_add_return_relaxed
> > 
> > #define atomic_add_return_relaxed	atomic_add_return
> > /*
> >  * If one cannot define a more relaxed version,
> >  * acquire/release are out the window too.
> >  */
> > #define  atomic_add_return_acquire	atomic_add_return
> > #define  atomic_add_return_release	atomic_add_return
> > 
> > #else /* relaxed */
> > 
> > #ifndef atomic_add_return_acquire
> > #define  atomic_add_return_acquire(args...)	\
> > do {						\
> > 	atomic_add_return_relaxed(args);	\
> > 	smp_mb__after_atomic();			\
> > } while (0)
> > #endif
> > 
> > #ifndef atomic_add_return_release
> > #define  atomic_add_return_release(args...)	\
> > do {						\
> > 	smp_mb__before_atomic();		\
> > 	atomic_add_return_relaxed(args);	\
> > } while (0)
> > #endif

One could even take it one step further and go:

#ifndef atomic_add_return
#define atomic_add_return(args...)		\
do {						\
	smp_mb__before_atomic();		\
	atomic_add_return_relaxed(args);	\
	smp_mb__after_atomic();			\
} while (0)

> > 
> > #endif /* relaxed */
> 
> If a picture is worth a thousand words, a patch is worth at least a thousand
> pictures.

:-)


* Re: [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations
  2015-07-14 11:24           ` Peter Zijlstra
@ 2015-07-14 11:31             ` Will Deacon
  2015-07-14 11:38               ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Will Deacon @ 2015-07-14 11:31 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-arch, Waiman.Long, linux-kernel, paulmck

On Tue, Jul 14, 2015 at 12:24:20PM +0100, Peter Zijlstra wrote:
> On Tue, Jul 14, 2015 at 12:08:11PM +0100, Will Deacon wrote:
> > On Tue, Jul 14, 2015 at 11:58:37AM +0100, Peter Zijlstra wrote:
> 
> 
> > > #ifndef atomic_add_return_relaxed
> > > 
> > > #define atomic_add_return_relaxed	atomic_add_return
> > > /*
> > >  * If one cannot define a more relaxed version,
> > >  * acquire/release are out the window too.
> > >  */
> > > #define  atomic_add_return_acquire	atomic_add_return
> > > #define  atomic_add_return_release	atomic_add_return
> > > 
> > > #else /* relaxed */
> > > 
> > > #ifndef atomic_add_return_acquire
> > > #define  atomic_add_return_acquire(args...)	\
> > > do {						\
> > > 	atomic_add_return_relaxed(args);	\
> > > 	smp_mb__after_atomic();			\
> > > } while (0)
> > > #endif
> > > 
> > > #ifndef atomic_add_return_release
> > > #define  atomic_add_return_release(args...)	\
> > > do {						\
> > > 	smp_mb__before_atomic();		\
> > > 	atomic_add_return_relaxed(args);	\
> > > } while (0)
> > > #endif
> 
> One could even take it one step further and go:
> 
> #ifndef atomic_add_return
> #define atomic_add_return(args...)		\
> do {						\
> 	smp_mb__before_atomic();		\
> 	atomic_add_return_relaxed(args);	\
> 	smp_mb__after_atomic();			\
> } while (0)

...and

#ifndef atomic_add
#define atomic_add(args...)	(void)atomic_add_return_relaxed(args);

It would mean a new architecture only has to define a barrier instruction
and a handful of relaxed atomics to get a bare-minimum atomic.h that
avoids spinlocks.

Will


* Re: [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations
  2015-07-14 11:31             ` Will Deacon
@ 2015-07-14 11:38               ` Peter Zijlstra
  2015-07-14 15:55                 ` Will Deacon
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2015-07-14 11:38 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-arch, Waiman.Long, linux-kernel, paulmck

On Tue, Jul 14, 2015 at 12:31:47PM +0100, Will Deacon wrote:
> #ifndef atomic_add
> #define atomic_add(args...)	(void)atomic_add_return_relaxed(args);
> 
> It would mean a new architecture only has to define a barrier instruction
> and a handful of relaxed atomics for a bare-minimum atomic.h avoiding
> spinlocks.

Look at include/asm-generic/atomic.h; all you need is a cmpxchg().

We could easily change that to be cmpxchg_relaxed() and a few barriers.
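
Roughly, the SMP version there builds the ops out of cmpxchg() like so
(sketch from memory):

#define ATOMIC_OP_RETURN(op, c_op)					\
static inline int atomic_##op##_return(int i, atomic_t *v)		\
{									\
	int c, old;							\
									\
	c = v->counter;							\
	while ((old = cmpxchg(&v->counter, c, c c_op i)) != c)		\
		c = old;						\
									\
	return c c_op i;						\
}

so swapping the cmpxchg() for cmpxchg_relaxed() and wrapping it with
the appropriate barriers would give the other variants almost for free.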




* Re: [PATCH 1/5] atomics: add acquire/release/relaxed variants of some atomic operations
  2015-07-14 11:38               ` Peter Zijlstra
@ 2015-07-14 15:55                 ` Will Deacon
  0 siblings, 0 replies; 14+ messages in thread
From: Will Deacon @ 2015-07-14 15:55 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-arch, Waiman.Long, linux-kernel, paulmck

On Tue, Jul 14, 2015 at 12:38:01PM +0100, Peter Zijlstra wrote:
> On Tue, Jul 14, 2015 at 12:31:47PM +0100, Will Deacon wrote:
> > #ifndef atomic_add
> > #define atomic_add(args...)	(void)atomic_add_return_relaxed(args);
> > 
> > It would mean a new architecture only has to define a barrier instruction
> > and a handful of relaxed atomics for a bare-minimum atomic.h avoiding
> > spinlocks.
> 
> Look at include/asm-generic/atomic.h, all you need is a cmpxchg().
> 
> We could easily change that to be cmpxchg_relaxed() and a few barriers.

Ok, I'll leave that part for now and implement your original suggestion
as a starting point.

Will

