* [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics
@ 2015-10-12 14:14 Boqun Feng
  2015-10-12 14:14 ` [PATCH v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier Boqun Feng
                   ` (7 more replies)
  0 siblings, 8 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:14 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, Boqun Feng

Hi,

This is v3 of the series.

Link for v1: https://lkml.org/lkml/2015/8/27/798
Link for v2: https://lkml.org/lkml/2015/9/16/527

Paul, Peter and Will, thank you all for the comments and suggestions;
it has been a lot of fun discussing these with you, and very
enlightening for me ;-)

Changes since v2:

*	reorder the patches to put the fix of cmpxchg, xchg and their
	atomic_ versions first, and Cc stable. (Peter Zijlstra)

*	modify the commit log for the implementation of the cmpxchg family
	to explain why we implement some operations using assembly code
	(Peter Zijlstra)

*	add implementation and test for {inc,dec}_return atomics.
	(Will Deacon)

*	rebase on the current locking/core branch of the tip tree (commit
	00eb4bab69db3)

*	rewrite the macros for generating tests to save some lines of
	code


Relaxed/acquire/release variants of the atomic operations {add,sub}_return
and {cmp,}xchg were introduced by commit:

"atomics: add acquire/release/relaxed variants of some atomic operations"

and those of {inc,dec}_return were introduced by commit:

"locking/asm-generic: Add _{relaxed|acquire|release}() variants for
inc/dec atomics"

Both of these are in the current locking/core branch of the tip tree.

By default, the generic code implements a relaxed variant as the fully
ordered atomic operation, and an acquire/release variant as the relaxed
variant plus the necessary general barrier after or before it.

On powerpc, which has a weak memory model, a relaxed variant can be
implemented in a more lightweight way than a fully ordered one.
Furthermore, release and acquire variants can be implemented with
arch-specific lightweight barriers.
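
For reference, the generic fallbacks build an ordered variant from the
relaxed one roughly as follows (a simplified sketch of the helpers in
include/linux/atomic.h; see patch 3):

	#define atomic_add_return_acquire(...)				\
		__atomic_op_acquire(atomic_add_return, __VA_ARGS__)

	/* which, with the default helper, expands to roughly: */
	__ret = atomic_add_return_relaxed(i, v);
	smp_mb__after_atomic();		/* a full barrier by default */

On powerpc the generic smp_mb__*_atomic() barriers above can be replaced
by lighter lwsync/isync based ones, which is what this series does.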

Besides, in the current PPC implementation, cmpxchg, xchg and their
atomic_ versions are only RELEASE+ACQUIRE rather than full barriers,
which is incorrect according to memory-barriers.txt.

Therefore this patchset fixes the ordering guarantees of cmpxchg, xchg
and their atomic_ versions, and implements the relaxed/acquire/release
variants based on the powerpc memory model and arch-specific barriers.
Some trivial tests for these new variants are also included in this
series: because some of these variants are not used in the kernel for
now, I think it is a good idea to at least generate the code for them
somewhere.

The patchset consists of 6 parts:

1.	Make xchg, cmpxchg and their atomic_ versions a full barrier

2.	Add trivial tests for the new variants in lib/atomic64_test.c

3.	Allow architectures to define their own __atomic_op_*() helpers
	to build other variants based on relaxed.

4.	Implement atomic{,64}_{add,sub,inc,dec}_return_* variants

5.	Implement xchg_* and atomic{,64}_xchg_* variants

6.	Implement cmpxchg_* atomic{,64}_cmpxchg_* variants


This patchset is based on the current locking/core branch of the tip tree;
all patches are build- and boot-tested on little-endian pseries, and also
tested by 0day.


Looking forward to any suggestions, questions and comments ;-)

Regards,
Boqun



* [PATCH v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
  2015-10-12 14:14 [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Boqun Feng
@ 2015-10-12 14:14 ` Boqun Feng
  2015-10-12 14:23   ` Boqun Feng
  2015-10-12 14:14 ` [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants Boqun Feng
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:14 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, Boqun Feng,
	stable

According to memory-barriers.txt, xchg, cmpxchg and their atomic{,64}_
versions all need to imply a full barrier; however, they are currently
only RELEASE+ACQUIRE, which is not a full barrier.
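
As an illustration (not part of this patch; a, b and s are hypothetical
shared variables), a full barrier must forbid the following store->load
reordering across the operation, which RELEASE+ACQUIRE alone does not:

	WRITE_ONCE(a, 1);
	old = xchg(&s, 1);	/* documented to imply a full barrier */
	r = READ_ONCE(b);	/* must not be reordered before the store to a */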

So replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with
PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER, respectively, in
__{cmp,}xchg_{u32,u64} to guarantee full-barrier semantics for
atomic{,64}_{cmp,}xchg() and {cmp,}xchg().

This patch complements commit b97021f85517 ("powerpc: Fix
atomic_xxx_return barrier semantics").

Cc: stable@vger.kernel.org # 3.4.y-
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 arch/powerpc/include/asm/cmpxchg.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index ad6263c..d1a8d93 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -18,12 +18,12 @@ __xchg_u32(volatile void *p, unsigned long val)
 	unsigned long prev;
 
 	__asm__ __volatile__(
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	lwarx	%0,0,%2 \n"
 	PPC405_ERR77(0,%2)
 "	stwcx.	%3,0,%2 \n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (prev), "+m" (*(volatile unsigned int *)p)
 	: "r" (p), "r" (val)
 	: "cc", "memory");
@@ -61,12 +61,12 @@ __xchg_u64(volatile void *p, unsigned long val)
 	unsigned long prev;
 
 	__asm__ __volatile__(
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%2 \n"
 	PPC405_ERR77(0,%2)
 "	stdcx.	%3,0,%2 \n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (prev), "+m" (*(volatile unsigned long *)p)
 	: "r" (p), "r" (val)
 	: "cc", "memory");
@@ -151,14 +151,14 @@ __cmpxchg_u32(volatile unsigned int *p, unsigned long old, unsigned long new)
 	unsigned int prev;
 
 	__asm__ __volatile__ (
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	lwarx	%0,0,%2		# __cmpxchg_u32\n\
 	cmpw	0,%0,%3\n\
 	bne-	2f\n"
 	PPC405_ERR77(0,%2)
 "	stwcx.	%4,0,%2\n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	"\n\
 2:"
 	: "=&r" (prev), "+m" (*p)
@@ -197,13 +197,13 @@ __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
 	unsigned long prev;
 
 	__asm__ __volatile__ (
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%2		# __cmpxchg_u64\n\
 	cmpd	0,%0,%3\n\
 	bne-	2f\n\
 	stdcx.	%4,0,%2\n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	"\n\
 2:"
 	: "=&r" (prev), "+m" (*p)
-- 
2.5.3


* [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants
  2015-10-12 14:14 [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Boqun Feng
  2015-10-12 14:14 ` [PATCH v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier Boqun Feng
@ 2015-10-12 14:14 ` Boqun Feng
       [not found]   ` <201510122205.Uu3yljqf%fengguang.wu@intel.com>
  2015-10-12 14:14 ` [PATCH v3 3/6] atomics: Allow architectures to define their own __atomic_op_* helpers Boqun Feng
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:14 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, Boqun Feng

Some atomic operations now have _{relaxed, acquire, release} variants;
this patch adds some trivial tests for two purposes:

1.	test the behavior of these new operations in a single-CPU
	environment.
2.	have their code generated before we actually use them anywhere,
	so that we can examine their assembly code, as illustrated below.
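
For example (an expansion sketch, not part of the patch),
RETURN_FAMILY_TEST(, add_return, +=, onestwos) expands for the plain op
roughly to:

	atomic_set(&v, v0);
	r = v0;
	r += onestwos;
	BUG_ON(atomic_add_return(onestwos, &v) != r);
	BUG_ON(atomic_read(&v) != r);

and the same sequence is repeated for add_return_acquire,
add_return_release and add_return_relaxed.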

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 lib/atomic64_test.c | 120 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 79 insertions(+), 41 deletions(-)

diff --git a/lib/atomic64_test.c b/lib/atomic64_test.c
index 83c33a5b..18e422b 100644
--- a/lib/atomic64_test.c
+++ b/lib/atomic64_test.c
@@ -27,6 +27,65 @@ do {								\
 		(unsigned long long)r);				\
 } while (0)
 
+/*
+ * Test for a atomic operation family,
+ * @test should be a macro accepting parameters (bit, op, ...)
+ */
+
+#define FAMILY_TEST(test, bit, op, args...)	\
+do {						\
+	test(bit, op, ##args);		\
+	test(bit, op##_acquire, ##args);	\
+	test(bit, op##_release, ##args);	\
+	test(bit, op##_relaxed, ##args);	\
+} while (0)
+
+#define TEST_RETURN(bit, op, c_op, val)				\
+do {								\
+	atomic##bit##_set(&v, v0);				\
+	r = v0;							\
+	r c_op val;						\
+	BUG_ON(atomic##bit##_##op(val, &v) != r);		\
+	BUG_ON(atomic##bit##_read(&v) != r);			\
+} while (0)
+
+#define RETURN_FAMILY_TEST(bit, op, c_op, val)			\
+do {								\
+	FAMILY_TEST(TEST_RETURN, bit, op, c_op, val);		\
+} while (0)
+
+#define TEST_ARGS(bit, op, init, ret, expect, args...)		\
+do {								\
+	atomic##bit##_set(&v, init);				\
+	BUG_ON(atomic##bit##_##op(&v, ##args) != ret);		\
+	BUG_ON(atomic##bit##_read(&v) != expect);		\
+} while (0)
+
+#define XCHG_FAMILY_TEST(bit, init, new)				\
+do {									\
+	FAMILY_TEST(TEST_ARGS, bit, xchg, init, init, new, new);	\
+} while (0)
+
+#define CMPXCHG_FAMILY_TEST(bit, init, new, wrong)			\
+do {									\
+	FAMILY_TEST(TEST_ARGS, bit, cmpxchg, 				\
+			init, init, new, init, new);			\
+	FAMILY_TEST(TEST_ARGS, bit, cmpxchg,				\
+			init, init, init, wrong, new);			\
+} while (0)
+
+#define INC_RETURN_FAMILY_TEST(bit, i)			\
+do {							\
+	FAMILY_TEST(TEST_ARGS, bit, inc_return,		\
+			i, (i) + one, (i) + one);	\
+} while (0)
+
+#define DEC_RETURN_FAMILY_TEST(bit, i)			\
+do {							\
+	FAMILY_TEST(TEST_ARGS, bit, dec_return,		\
+			i, (i) - one, (i) - one);	\
+} while (0)
+
 static __init void test_atomic(void)
 {
 	int v0 = 0xaaa31337;
@@ -45,6 +104,18 @@ static __init void test_atomic(void)
 	TEST(, and, &=, v1);
 	TEST(, xor, ^=, v1);
 	TEST(, andnot, &= ~, v1);
+
+	RETURN_FAMILY_TEST(, add_return, +=, onestwos);
+	RETURN_FAMILY_TEST(, add_return, +=, -one);
+	RETURN_FAMILY_TEST(, sub_return, -=, onestwos);
+	RETURN_FAMILY_TEST(, sub_return, -=, -one);
+
+	INC_RETURN_FAMILY_TEST(, v0);
+	DEC_RETURN_FAMILY_TEST(, v0);
+
+	XCHG_FAMILY_TEST(, v0, v1);
+	CMPXCHG_FAMILY_TEST(, v0, v1, onestwos);
+
 }
 
 #define INIT(c) do { atomic64_set(&v, c); r = c; } while (0)
@@ -74,25 +145,10 @@ static __init void test_atomic64(void)
 	TEST(64, xor, ^=, v1);
 	TEST(64, andnot, &= ~, v1);
 
-	INIT(v0);
-	r += onestwos;
-	BUG_ON(atomic64_add_return(onestwos, &v) != r);
-	BUG_ON(v.counter != r);
-
-	INIT(v0);
-	r += -one;
-	BUG_ON(atomic64_add_return(-one, &v) != r);
-	BUG_ON(v.counter != r);
-
-	INIT(v0);
-	r -= onestwos;
-	BUG_ON(atomic64_sub_return(onestwos, &v) != r);
-	BUG_ON(v.counter != r);
-
-	INIT(v0);
-	r -= -one;
-	BUG_ON(atomic64_sub_return(-one, &v) != r);
-	BUG_ON(v.counter != r);
+	RETURN_FAMILY_TEST(64, add_return, +=, onestwos);
+	RETURN_FAMILY_TEST(64, add_return, +=, -one);
+	RETURN_FAMILY_TEST(64, sub_return, -=, onestwos);
+	RETURN_FAMILY_TEST(64, sub_return, -=, -one);
 
 	INIT(v0);
 	atomic64_inc(&v);
@@ -100,33 +156,15 @@ static __init void test_atomic64(void)
 	BUG_ON(v.counter != r);
 
 	INIT(v0);
-	r += one;
-	BUG_ON(atomic64_inc_return(&v) != r);
-	BUG_ON(v.counter != r);
-
-	INIT(v0);
 	atomic64_dec(&v);
 	r -= one;
 	BUG_ON(v.counter != r);
 
-	INIT(v0);
-	r -= one;
-	BUG_ON(atomic64_dec_return(&v) != r);
-	BUG_ON(v.counter != r);
-
-	INIT(v0);
-	BUG_ON(atomic64_xchg(&v, v1) != v0);
-	r = v1;
-	BUG_ON(v.counter != r);
-
-	INIT(v0);
-	BUG_ON(atomic64_cmpxchg(&v, v0, v1) != v0);
-	r = v1;
-	BUG_ON(v.counter != r);
+	INC_RETURN_FAMILY_TEST(64, v0);
+	DEC_RETURN_FAMILY_TEST(64, v0);
 
-	INIT(v0);
-	BUG_ON(atomic64_cmpxchg(&v, v2, v1) != v0);
-	BUG_ON(v.counter != r);
+	XCHG_FAMILY_TEST(64, v0, v1);
+	CMPXCHG_FAMILY_TEST(64, v0, v1, v2);
 
 	INIT(v0);
 	BUG_ON(atomic64_add_unless(&v, one, v0));
-- 
2.5.3



* [PATCH v3 3/6] atomics: Allow architectures to define their own __atomic_op_* helpers
  2015-10-12 14:14 [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Boqun Feng
  2015-10-12 14:14 ` [PATCH v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier Boqun Feng
  2015-10-12 14:14 ` [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants Boqun Feng
@ 2015-10-12 14:14 ` Boqun Feng
  2015-10-12 14:14   ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{, 64}_*_return_* variants Boqun Feng
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:14 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, Boqun Feng

Some architectures may have special barriers for acquire, release and
fence semantics, for which the general memory barriers
(smp_mb__*_atomic()) in the default __atomic_op_*() helpers may be too
strong. So allow architectures to define their own helpers, which
override the default ones; an example is sketched below.
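
For example (a sketch only; the macro body mirrors the generic helper
and arch_acquire_barrier() is a placeholder name, not a real interface),
an architecture with a lighter acquire barrier could provide, before the
generic definitions are pulled in:

	#define __atomic_op_acquire(op, args...)			\
	({								\
		typeof(op##_relaxed(args)) __ret = op##_relaxed(args);	\
		arch_acquire_barrier();	/* placeholder, arch-specific */\
		__ret;							\
	})

powerpc does exactly this in the next patch, using lwsync/isync based
barriers.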

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 include/linux/atomic.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 27e580d..947c1dc 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -43,20 +43,29 @@ static inline int atomic_read_ctrl(const atomic_t *v)
  * The idea here is to build acquire/release variants by adding explicit
  * barriers on top of the relaxed variant. In the case where the relaxed
  * variant is already fully ordered, no additional barriers are needed.
+ *
+ * Besides, if an arch has a special barrier for acquire/release, it could
+ * implement its own __atomic_op_* and use the same framework for building
+ * variants
  */
+#ifndef __atomic_op_acquire
 #define __atomic_op_acquire(op, args...)				\
 ({									\
 	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
 	smp_mb__after_atomic();						\
 	__ret;								\
 })
+#endif
 
+#ifndef __atomic_op_release
 #define __atomic_op_release(op, args...)				\
 ({									\
 	smp_mb__before_atomic();					\
 	op##_relaxed(args);						\
 })
+#endif
 
+#ifndef __atomic_op_fence
 #define __atomic_op_fence(op, args...)					\
 ({									\
 	typeof(op##_relaxed(args)) __ret;				\
@@ -65,6 +74,7 @@ static inline int atomic_read_ctrl(const atomic_t *v)
 	smp_mb__after_atomic();						\
 	__ret;								\
 })
+#endif
 
 /* atomic_add_return_relaxed */
 #ifndef atomic_add_return_relaxed
-- 
2.5.3



* [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants
  2015-10-12 14:14 [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Boqun Feng
@ 2015-10-12 14:14   ` Boqun Feng
  2015-10-12 14:14 ` [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants Boqun Feng
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:14 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, Boqun Feng

On powerpc, acquire and release semantics can be achieved with
lightweight barriers ("lwsync" and "ctrl+isync"), which can be used to
implement __atomic_op_{acquire,release}.

For release semantics, since we only need to ensure that all memory
accesses issued before the atomic take effect before the -store- part of
the atomic, "lwsync" is all we need. On platforms without "lwsync",
"sync" should be used. Therefore, smp_lwsync() is used here.

For acquire semantics, "lwsync" is all we need for a similar reason.
However, on platforms without "lwsync", we can use "isync" rather than
"sync" as an acquire barrier. So a new barrier,
smp_acquire_barrier__after_atomic(), is introduced; it is barrier() on
UP, "lwsync" if available, and "isync" otherwise.

__atomic_op_fence is defined as smp_lwsync() + _relaxed +
smp_mb__after_atomic() to guarantee a full barrier.

Implement atomic{,64}_{add,sub,inc,dec}_return_relaxed, and build other
variants with these helpers.
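
With these in place, the generic layer builds the other orderings from
the _relaxed versions. As a rough illustration (an expansion sketch, not
code added by this patch), atomic_add_return_acquire() and
atomic_add_return() end up as:

	/* atomic_add_return_acquire(a, v), roughly: */
	__ret = atomic_add_return_relaxed(a, v);	/* ll/sc loop, no barrier */
	smp_acquire_barrier__after_atomic();		/* lwsync, or isync without lwsync */

	/* atomic_add_return(a, v), fully ordered, roughly: */
	smp_lwsync();
	__ret = atomic_add_return_relaxed(a, v);
	smp_mb__after_atomic();				/* full barrier */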

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 arch/powerpc/include/asm/atomic.h | 122 +++++++++++++++++++++++++-------------
 1 file changed, 80 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 55f106e..3143af9 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -12,6 +12,39 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
+/*
+ * Since {add,sub}_return_relaxed and xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
+ * on the platform without lwsync.
+ */
+#ifdef CONFIG_SMP
+#define smp_acquire_barrier__after_atomic() \
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER : : : "memory")
+#else
+#define smp_acquire_barrier__after_atomic() barrier()
+#endif
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
+	smp_acquire_barrier__after_atomic();				\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	smp_lwsync();							\
+	op##_relaxed(args);						\
+})
+
+#define __atomic_op_fence(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret;				\
+	smp_lwsync();							\
+	__ret = op##_relaxed(args);					\
+	smp_mb__after_atomic();						\
+	__ret;								\
+})
+
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
@@ -42,27 +75,27 @@ static __inline__ void atomic_##op(int a, atomic_t *v)			\
 	: "cc");							\
 }									\
 
-#define ATOMIC_OP_RETURN(op, asm_op)					\
-static __inline__ int atomic_##op##_return(int a, atomic_t *v)		\
+#define ATOMIC_OP_RETURN_RELAXED(op, asm_op)				\
+static inline int atomic_##op##_return_relaxed(int a, atomic_t *v)	\
 {									\
 	int t;								\
 									\
 	__asm__ __volatile__(						\
-	PPC_ATOMIC_ENTRY_BARRIER					\
-"1:	lwarx	%0,0,%2		# atomic_" #op "_return\n"		\
-	#asm_op " %0,%1,%0\n"						\
-	PPC405_ERR77(0,%2)						\
-"	stwcx.	%0,0,%2 \n"						\
+"1:	lwarx	%0,0,%3		# atomic_" #op "_return_relaxed\n"	\
+	#asm_op " %0,%2,%0\n"						\
+	PPC405_ERR77(0, %3)						\
+"	stwcx.	%0,0,%3\n"						\
 "	bne-	1b\n"							\
-	PPC_ATOMIC_EXIT_BARRIER						\
-	: "=&r" (t)							\
+	: "=&r" (t), "+m" (v->counter)					\
 	: "r" (a), "r" (&v->counter)					\
-	: "cc", "memory");						\
+	: "cc");							\
 									\
 	return t;							\
 }
 
-#define ATOMIC_OPS(op, asm_op) ATOMIC_OP(op, asm_op) ATOMIC_OP_RETURN(op, asm_op)
+#define ATOMIC_OPS(op, asm_op)						\
+	ATOMIC_OP(op, asm_op)						\
+	ATOMIC_OP_RETURN_RELAXED(op, asm_op)
 
 ATOMIC_OPS(add, add)
 ATOMIC_OPS(sub, subf)
@@ -71,8 +104,11 @@ ATOMIC_OP(and, and)
 ATOMIC_OP(or, or)
 ATOMIC_OP(xor, xor)
 
+#define atomic_add_return_relaxed atomic_add_return_relaxed
+#define atomic_sub_return_relaxed atomic_sub_return_relaxed
+
 #undef ATOMIC_OPS
-#undef ATOMIC_OP_RETURN
+#undef ATOMIC_OP_RETURN_RELAXED
 #undef ATOMIC_OP
 
 #define atomic_add_negative(a, v)	(atomic_add_return((a), (v)) < 0)
@@ -92,21 +128,19 @@ static __inline__ void atomic_inc(atomic_t *v)
 	: "cc", "xer");
 }
 
-static __inline__ int atomic_inc_return(atomic_t *v)
+static __inline__ int atomic_inc_return_relaxed(atomic_t *v)
 {
 	int t;
 
 	__asm__ __volatile__(
-	PPC_ATOMIC_ENTRY_BARRIER
-"1:	lwarx	%0,0,%1		# atomic_inc_return\n\
+"1:	lwarx	%0,0,%1		# atomic_inc_return_relaxed\n\
 	addic	%0,%0,1\n"
 	PPC405_ERR77(0,%1)
 "	stwcx.	%0,0,%1 \n\
 	bne-	1b"
-	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (t)
 	: "r" (&v->counter)
-	: "cc", "xer", "memory");
+	: "cc", "xer");
 
 	return t;
 }
@@ -136,25 +170,26 @@ static __inline__ void atomic_dec(atomic_t *v)
 	: "cc", "xer");
 }
 
-static __inline__ int atomic_dec_return(atomic_t *v)
+static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 {
 	int t;
 
 	__asm__ __volatile__(
-	PPC_ATOMIC_ENTRY_BARRIER
-"1:	lwarx	%0,0,%1		# atomic_dec_return\n\
+"1:	lwarx	%0,0,%1		# atomic_dec_return_relaxed\n\
 	addic	%0,%0,-1\n"
 	PPC405_ERR77(0,%1)
 "	stwcx.	%0,0,%1\n\
 	bne-	1b"
-	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (t)
 	: "r" (&v->counter)
-	: "cc", "xer", "memory");
+	: "cc", "xer");
 
 	return t;
 }
 
+#define atomic_inc_return_relaxed atomic_inc_return_relaxed
+#define atomic_dec_return_relaxed atomic_dec_return_relaxed
+
 #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 
@@ -285,26 +320,27 @@ static __inline__ void atomic64_##op(long a, atomic64_t *v)		\
 	: "cc");							\
 }
 
-#define ATOMIC64_OP_RETURN(op, asm_op)					\
-static __inline__ long atomic64_##op##_return(long a, atomic64_t *v)	\
+#define ATOMIC64_OP_RETURN_RELAXED(op, asm_op)				\
+static inline long							\
+atomic64_##op##_return_relaxed(long a, atomic64_t *v)			\
 {									\
 	long t;								\
 									\
 	__asm__ __volatile__(						\
-	PPC_ATOMIC_ENTRY_BARRIER					\
-"1:	ldarx	%0,0,%2		# atomic64_" #op "_return\n"		\
-	#asm_op " %0,%1,%0\n"						\
-"	stdcx.	%0,0,%2 \n"						\
+"1:	ldarx	%0,0,%3		# atomic64_" #op "_return_relaxed\n"	\
+	#asm_op " %0,%2,%0\n"						\
+"	stdcx.	%0,0,%3\n"						\
 "	bne-	1b\n"							\
-	PPC_ATOMIC_EXIT_BARRIER						\
-	: "=&r" (t)							\
+	: "=&r" (t), "+m" (v->counter)					\
 	: "r" (a), "r" (&v->counter)					\
-	: "cc", "memory");						\
+	: "cc");							\
 									\
 	return t;							\
 }
 
-#define ATOMIC64_OPS(op, asm_op) ATOMIC64_OP(op, asm_op) ATOMIC64_OP_RETURN(op, asm_op)
+#define ATOMIC64_OPS(op, asm_op)					\
+	ATOMIC64_OP(op, asm_op)						\
+	ATOMIC64_OP_RETURN_RELAXED(op, asm_op)
 
 ATOMIC64_OPS(add, add)
 ATOMIC64_OPS(sub, subf)
@@ -312,8 +348,11 @@ ATOMIC64_OP(and, and)
 ATOMIC64_OP(or, or)
 ATOMIC64_OP(xor, xor)
 
-#undef ATOMIC64_OPS
-#undef ATOMIC64_OP_RETURN
+#define atomic64_add_return_relaxed atomic64_add_return_relaxed
+#define atomic64_sub_return_relaxed atomic64_sub_return_relaxed
+
 +#undef ATOMIC64_OPS
+#undef ATOMIC64_OP_RETURN_RELAXED
 #undef ATOMIC64_OP
 
 #define atomic64_add_negative(a, v)	(atomic64_add_return((a), (v)) < 0)
@@ -332,20 +371,18 @@ static __inline__ void atomic64_inc(atomic64_t *v)
 	: "cc", "xer");
 }
 
-static __inline__ long atomic64_inc_return(atomic64_t *v)
+static __inline__ long atomic64_inc_return_relaxed(atomic64_t *v)
 {
 	long t;
 
 	__asm__ __volatile__(
-	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%1		# atomic64_inc_return\n\
 	addic	%0,%0,1\n\
 	stdcx.	%0,0,%1 \n\
 	bne-	1b"
-	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (t)
 	: "r" (&v->counter)
-	: "cc", "xer", "memory");
+	: "cc", "xer");
 
 	return t;
 }
@@ -374,24 +411,25 @@ static __inline__ void atomic64_dec(atomic64_t *v)
 	: "cc", "xer");
 }
 
-static __inline__ long atomic64_dec_return(atomic64_t *v)
+static __inline__ long atomic64_dec_return_relaxed(atomic64_t *v)
 {
 	long t;
 
 	__asm__ __volatile__(
-	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%1		# atomic64_dec_return\n\
 	addic	%0,%0,-1\n\
 	stdcx.	%0,0,%1\n\
 	bne-	1b"
-	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (t)
 	: "r" (&v->counter)
-	: "cc", "xer", "memory");
+	: "cc", "xer");
 
 	return t;
 }
 
+#define atomic64_inc_return_relaxed atomic64_inc_return_relaxed
+#define atomic64_dec_return_relaxed atomic64_dec_return_relaxed
+
 #define atomic64_sub_and_test(a, v)	(atomic64_sub_return((a), (v)) == 0)
 #define atomic64_dec_and_test(v)	(atomic64_dec_return((v)) == 0)
 
-- 
2.5.3


* [PATCH v3 5/6] powerpc: atomic: Implement xchg_* and atomic{,64}_xchg_* variants
  2015-10-12 14:14 [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Boqun Feng
@ 2015-10-12 14:14   ` Boqun Feng
  2015-10-12 14:14 ` [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants Boqun Feng
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:14 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, Boqun Feng

Implement xchg_relaxed and atomic{,64}_xchg_relaxed. Based on these
_relaxed variants, the release/acquire variants and the fully ordered
versions can be built.

Note that xchg_relaxed and atomic{,64}_xchg_relaxed are not compiler
barriers.
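
For illustration (a rough sketch, not code in this patch), the remaining
orderings are then generated from xchg_relaxed by the generic layer and
the powerpc __atomic_op_*() helpers:

	/* xchg_acquire(ptr, x), roughly: */
	__ret = xchg_relaxed(ptr, x);
	smp_acquire_barrier__after_atomic();

	/* xchg(ptr, x), fully ordered, roughly: */
	smp_lwsync();
	__ret = xchg_relaxed(ptr, x);
	smp_mb__after_atomic();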

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 arch/powerpc/include/asm/atomic.h  |  2 ++
 arch/powerpc/include/asm/cmpxchg.h | 69 +++++++++++++++++---------------------
 2 files changed, 32 insertions(+), 39 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 3143af9..8869fb3 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -192,6 +192,7 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 
 #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
+#define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
 /**
  * __atomic_add_unless - add unless the number is a given value
@@ -459,6 +460,7 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 
 #define atomic64_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
+#define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
 /**
  * atomic64_add_unless - add unless the number is a given value
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index d1a8d93..17c7e14 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -9,21 +9,20 @@
 /*
  * Atomic exchange
  *
- * Changes the memory location '*ptr' to be val and returns
+ * Changes the memory location '*p' to be val and returns
  * the previous value stored there.
  */
+
 static __always_inline unsigned long
-__xchg_u32(volatile void *p, unsigned long val)
+__xchg_u32_local(volatile void *p, unsigned long val)
 {
 	unsigned long prev;
 
 	__asm__ __volatile__(
-	PPC_ATOMIC_ENTRY_BARRIER
 "1:	lwarx	%0,0,%2 \n"
 	PPC405_ERR77(0,%2)
 "	stwcx.	%3,0,%2 \n\
 	bne-	1b"
-	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (prev), "+m" (*(volatile unsigned int *)p)
 	: "r" (p), "r" (val)
 	: "cc", "memory");
@@ -31,42 +30,34 @@ __xchg_u32(volatile void *p, unsigned long val)
 	return prev;
 }
 
-/*
- * Atomic exchange
- *
- * Changes the memory location '*ptr' to be val and returns
- * the previous value stored there.
- */
 static __always_inline unsigned long
-__xchg_u32_local(volatile void *p, unsigned long val)
+__xchg_u32_relaxed(u32 *p, unsigned long val)
 {
 	unsigned long prev;
 
 	__asm__ __volatile__(
-"1:	lwarx	%0,0,%2 \n"
-	PPC405_ERR77(0,%2)
-"	stwcx.	%3,0,%2 \n\
-	bne-	1b"
-	: "=&r" (prev), "+m" (*(volatile unsigned int *)p)
+"1:	lwarx	%0,0,%2\n"
+	PPC405_ERR77(0, %2)
+"	stwcx.	%3,0,%2\n"
+"	bne-	1b"
+	: "=&r" (prev), "+m" (*p)
 	: "r" (p), "r" (val)
-	: "cc", "memory");
+	: "cc");
 
 	return prev;
 }
 
 #ifdef CONFIG_PPC64
 static __always_inline unsigned long
-__xchg_u64(volatile void *p, unsigned long val)
+__xchg_u64_local(volatile void *p, unsigned long val)
 {
 	unsigned long prev;
 
 	__asm__ __volatile__(
-	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%2 \n"
 	PPC405_ERR77(0,%2)
 "	stdcx.	%3,0,%2 \n\
 	bne-	1b"
-	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (prev), "+m" (*(volatile unsigned long *)p)
 	: "r" (p), "r" (val)
 	: "cc", "memory");
@@ -75,18 +66,18 @@ __xchg_u64(volatile void *p, unsigned long val)
 }
 
 static __always_inline unsigned long
-__xchg_u64_local(volatile void *p, unsigned long val)
+__xchg_u64_relaxed(u64 *p, unsigned long val)
 {
 	unsigned long prev;
 
 	__asm__ __volatile__(
-"1:	ldarx	%0,0,%2 \n"
-	PPC405_ERR77(0,%2)
-"	stdcx.	%3,0,%2 \n\
-	bne-	1b"
-	: "=&r" (prev), "+m" (*(volatile unsigned long *)p)
+"1:	ldarx	%0,0,%2\n"
+	PPC405_ERR77(0, %2)
+"	stdcx.	%3,0,%2\n"
+"	bne-	1b"
+	: "=&r" (prev), "+m" (*p)
 	: "r" (p), "r" (val)
-	: "cc", "memory");
+	: "cc");
 
 	return prev;
 }
@@ -99,14 +90,14 @@ __xchg_u64_local(volatile void *p, unsigned long val)
 extern void __xchg_called_with_bad_pointer(void);
 
 static __always_inline unsigned long
-__xchg(volatile void *ptr, unsigned long x, unsigned int size)
+__xchg_local(volatile void *ptr, unsigned long x, unsigned int size)
 {
 	switch (size) {
 	case 4:
-		return __xchg_u32(ptr, x);
+		return __xchg_u32_local(ptr, x);
 #ifdef CONFIG_PPC64
 	case 8:
-		return __xchg_u64(ptr, x);
+		return __xchg_u64_local(ptr, x);
 #endif
 	}
 	__xchg_called_with_bad_pointer();
@@ -114,25 +105,19 @@ __xchg(volatile void *ptr, unsigned long x, unsigned int size)
 }
 
 static __always_inline unsigned long
-__xchg_local(volatile void *ptr, unsigned long x, unsigned int size)
+__xchg_relaxed(void *ptr, unsigned long x, unsigned int size)
 {
 	switch (size) {
 	case 4:
-		return __xchg_u32_local(ptr, x);
+		return __xchg_u32_relaxed(ptr, x);
 #ifdef CONFIG_PPC64
 	case 8:
-		return __xchg_u64_local(ptr, x);
+		return __xchg_u64_relaxed(ptr, x);
 #endif
 	}
 	__xchg_called_with_bad_pointer();
 	return x;
 }
-#define xchg(ptr,x)							     \
-  ({									     \
-     __typeof__(*(ptr)) _x_ = (x);					     \
-     (__typeof__(*(ptr))) __xchg((ptr), (unsigned long)_x_, sizeof(*(ptr))); \
-  })
-
 #define xchg_local(ptr,x)						     \
   ({									     \
      __typeof__(*(ptr)) _x_ = (x);					     \
@@ -140,6 +125,12 @@ __xchg_local(volatile void *ptr, unsigned long x, unsigned int size)
      		(unsigned long)_x_, sizeof(*(ptr))); 			     \
   })
 
+#define xchg_relaxed(ptr, x)						\
+({									\
+	__typeof__(*(ptr)) _x_ = (x);					\
+	(__typeof__(*(ptr))) __xchg_relaxed((ptr),			\
+			(unsigned long)_x_, sizeof(*(ptr)));		\
+})
 /*
  * Compare and exchange - if *p == old, set it to new,
  * and return the old value of *p.
-- 
2.5.3


* [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-12 14:14 [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Boqun Feng
@ 2015-10-12 14:14   ` Boqun Feng
  2015-10-12 14:14 ` [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants Boqun Feng
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:14 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, Boqun Feng

Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
which the _release variants can be built.

To avoid superfluous barriers in the _acquire variants, we implement
these operations with assembly code rather than using
__atomic_op_acquire() to build them automatically: in the assembly
versions below, the acquire barrier sits before the exit label and is
therefore skipped when the compare fails.

For the same reason, we keep the assembly implementations of the fully
ordered cmpxchg operations.

Note cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed are not
compiler barriers.
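
For illustration (a rough sketch, not code in this patch), only the
_release variants are left to the generic layer, which builds them from
the _relaxed versions via the powerpc __atomic_op_release():

	/* cmpxchg_release(ptr, o, n), roughly: */
	smp_lwsync();
	cmpxchg_relaxed(ptr, o, n);

The _relaxed, _acquire and fully ordered versions are the assembly
implementations in this patch.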

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 arch/powerpc/include/asm/atomic.h  |  10 +++
 arch/powerpc/include/asm/cmpxchg.h | 141 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 150 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 8869fb3..9f44515 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -191,6 +191,11 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 #define atomic_dec_return_relaxed atomic_dec_return_relaxed
 
 #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
+#define atomic_cmpxchg_relaxed(v, o, n) \
+	cmpxchg_relaxed(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_acquire(v, o, n) \
+	cmpxchg_acquire(&((v)->counter), (o), (n))
+
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
@@ -459,6 +464,11 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 }
 
 #define atomic64_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
+#define atomic64_cmpxchg_relaxed(v, o, n) \
+	cmpxchg_relaxed(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_acquire(v, o, n) \
+	cmpxchg_acquire(&((v)->counter), (o), (n))
+
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 17c7e14..fd4ea04 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -181,6 +181,48 @@ __cmpxchg_u32_local(volatile unsigned int *p, unsigned long old,
 	return prev;
 }
 
+static __always_inline unsigned long
+__cmpxchg_u32_relaxed(u32 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	lwarx	%0,0,%2		# __cmpxchg_u32_relaxed\n"
+"	cmpw	0,%0,%3\n"
+"	bne-	2f\n"
+	PPC405_ERR77(0, %2)
+"	stwcx.	%4,0,%2\n"
+"	bne-	1b\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc");
+
+	return prev;
+}
+
+static __always_inline unsigned long
+__cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	lwarx	%0,0,%2		# __cmpxchg_u32_acquire\n"
+"	cmpw	0,%0,%3\n"
+"	bne-	2f\n"
+	PPC405_ERR77(0, %2)
+"	stwcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	PPC_ACQUIRE_BARRIER
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
+
 #ifdef CONFIG_PPC64
 static __always_inline unsigned long
 __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
@@ -224,6 +266,46 @@ __cmpxchg_u64_local(volatile unsigned long *p, unsigned long old,
 
 	return prev;
 }
+
+static __always_inline unsigned long
+__cmpxchg_u64_relaxed(u64 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	ldarx	%0,0,%2		# __cmpxchg_u64_relaxed\n"
+"	cmpd	0,%0,%3\n"
+"	bne-	2f\n"
+"	stdcx.	%4,0,%2\n"
+"	bne-	1b\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc");
+
+	return prev;
+}
+
+static __always_inline unsigned long
+__cmpxchg_u64_acquire(u64 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	ldarx	%0,0,%2		# __cmpxchg_u64_acquire\n"
+"	cmpd	0,%0,%3\n"
+"	bne-	2f\n"
+"	stdcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	PPC_ACQUIRE_BARRIER
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
 #endif
 
 /* This function doesn't exist, so you'll get a linker error
@@ -262,6 +344,37 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
 	return old;
 }
 
+static __always_inline unsigned long
+__cmpxchg_relaxed(void *ptr, unsigned long old, unsigned long new,
+		  unsigned int size)
+{
+	switch (size) {
+	case 4:
+		return __cmpxchg_u32_relaxed(ptr, old, new);
+#ifdef CONFIG_PPC64
+	case 8:
+		return __cmpxchg_u64_relaxed(ptr, old, new);
+#endif
+	}
+	__cmpxchg_called_with_bad_pointer();
+	return old;
+}
+
+static __always_inline unsigned long
+__cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
+		  unsigned int size)
+{
+	switch (size) {
+	case 4:
+		return __cmpxchg_u32_acquire(ptr, old, new);
+#ifdef CONFIG_PPC64
+	case 8:
+		return __cmpxchg_u64_acquire(ptr, old, new);
+#endif
+	}
+	__cmpxchg_called_with_bad_pointer();
+	return old;
+}
 #define cmpxchg(ptr, o, n)						 \
   ({									 \
      __typeof__(*(ptr)) _o_ = (o);					 \
@@ -279,6 +392,23 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
 				    (unsigned long)_n_, sizeof(*(ptr))); \
   })
 
+#define cmpxchg_relaxed(ptr, o, n)					\
+({									\
+	__typeof__(*(ptr)) _o_ = (o);					\
+	__typeof__(*(ptr)) _n_ = (n);					\
+	(__typeof__(*(ptr))) __cmpxchg_relaxed((ptr),			\
+			(unsigned long)_o_, (unsigned long)_n_,		\
+			sizeof(*(ptr)));				\
+})
+
+#define cmpxchg_acquire(ptr, o, n)					\
+({									\
+	__typeof__(*(ptr)) _o_ = (o);					\
+	__typeof__(*(ptr)) _n_ = (n);					\
+	(__typeof__(*(ptr))) __cmpxchg_acquire((ptr),			\
+			(unsigned long)_o_, (unsigned long)_n_,		\
+			sizeof(*(ptr)));				\
+})
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -290,7 +420,16 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_local((ptr), (o), (n));					\
   })
-#define cmpxchg64_relaxed	cmpxchg64_local
+#define cmpxchg64_relaxed(ptr, o, n)					\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	cmpxchg_relaxed((ptr), (o), (n));				\
+})
+#define cmpxchg64_acquire(ptr, o, n)					\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	cmpxchg_acquire((ptr), (o), (n));				\
+})
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
-- 
2.5.3


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{, 64}_* and atomic{, 64}_cmpxchg_* variants
@ 2015-10-12 14:14   ` Boqun Feng
  0 siblings, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:14 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, Boqun Feng

Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
which _release variants can be built.

To avoid superfluous barriers in _acquire variants, we implement these
operations with assembly code rather use __atomic_op_acquire() to build
them automatically.

For the same reason, we keep the assembly implementation of fully
ordered cmpxchg operations.

Note cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed are not
compiler barriers.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 arch/powerpc/include/asm/atomic.h  |  10 +++
 arch/powerpc/include/asm/cmpxchg.h | 141 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 150 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 8869fb3..9f44515 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -191,6 +191,11 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 #define atomic_dec_return_relaxed atomic_dec_return_relaxed
 
 #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
+#define atomic_cmpxchg_relaxed(v, o, n) \
+	cmpxchg_relaxed(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_acquire(v, o, n) \
+	cmpxchg_acquire(&((v)->counter), (o), (n))
+
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
@@ -459,6 +464,11 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 }
 
 #define atomic64_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
+#define atomic64_cmpxchg_relaxed(v, o, n) \
+	cmpxchg_relaxed(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_acquire(v, o, n) \
+	cmpxchg_acquire(&((v)->counter), (o), (n))
+
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 17c7e14..fd4ea04 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -181,6 +181,48 @@ __cmpxchg_u32_local(volatile unsigned int *p, unsigned long old,
 	return prev;
 }
 
+static __always_inline unsigned long
+__cmpxchg_u32_relaxed(u32 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	lwarx	%0,0,%2		# __cmpxchg_u32_relaxed\n"
+"	cmpw	0,%0,%3\n"
+"	bne-	2f\n"
+	PPC405_ERR77(0, %2)
+"	stwcx.	%4,0,%2\n"
+"	bne-	1b\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc");
+
+	return prev;
+}
+
+static __always_inline unsigned long
+__cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	lwarx	%0,0,%2		# __cmpxchg_u32_acquire\n"
+"	cmpw	0,%0,%3\n"
+"	bne-	2f\n"
+	PPC405_ERR77(0, %2)
+"	stwcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	PPC_ACQUIRE_BARRIER
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
+
 #ifdef CONFIG_PPC64
 static __always_inline unsigned long
 __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
@@ -224,6 +266,46 @@ __cmpxchg_u64_local(volatile unsigned long *p, unsigned long old,
 
 	return prev;
 }
+
+static __always_inline unsigned long
+__cmpxchg_u64_relaxed(u64 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	ldarx	%0,0,%2		# __cmpxchg_u64_relaxed\n"
+"	cmpd	0,%0,%3\n"
+"	bne-	2f\n"
+"	stdcx.	%4,0,%2\n"
+"	bne-	1b\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc");
+
+	return prev;
+}
+
+static __always_inline unsigned long
+__cmpxchg_u64_acquire(u64 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	ldarx	%0,0,%2		# __cmpxchg_u64_acquire\n"
+"	cmpd	0,%0,%3\n"
+"	bne-	2f\n"
+"	stdcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	PPC_ACQUIRE_BARRIER
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
 #endif
 
 /* This function doesn't exist, so you'll get a linker error
@@ -262,6 +344,37 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
 	return old;
 }
 
+static __always_inline unsigned long
+__cmpxchg_relaxed(void *ptr, unsigned long old, unsigned long new,
+		  unsigned int size)
+{
+	switch (size) {
+	case 4:
+		return __cmpxchg_u32_relaxed(ptr, old, new);
+#ifdef CONFIG_PPC64
+	case 8:
+		return __cmpxchg_u64_relaxed(ptr, old, new);
+#endif
+	}
+	__cmpxchg_called_with_bad_pointer();
+	return old;
+}
+
+static __always_inline unsigned long
+__cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
+		  unsigned int size)
+{
+	switch (size) {
+	case 4:
+		return __cmpxchg_u32_acquire(ptr, old, new);
+#ifdef CONFIG_PPC64
+	case 8:
+		return __cmpxchg_u64_acquire(ptr, old, new);
+#endif
+	}
+	__cmpxchg_called_with_bad_pointer();
+	return old;
+}
 #define cmpxchg(ptr, o, n)						 \
   ({									 \
      __typeof__(*(ptr)) _o_ = (o);					 \
@@ -279,6 +392,23 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
 				    (unsigned long)_n_, sizeof(*(ptr))); \
   })
 
+#define cmpxchg_relaxed(ptr, o, n)					\
+({									\
+	__typeof__(*(ptr)) _o_ = (o);					\
+	__typeof__(*(ptr)) _n_ = (n);					\
+	(__typeof__(*(ptr))) __cmpxchg_relaxed((ptr),			\
+			(unsigned long)_o_, (unsigned long)_n_,		\
+			sizeof(*(ptr)));				\
+})
+
+#define cmpxchg_acquire(ptr, o, n)					\
+({									\
+	__typeof__(*(ptr)) _o_ = (o);					\
+	__typeof__(*(ptr)) _n_ = (n);					\
+	(__typeof__(*(ptr))) __cmpxchg_acquire((ptr),			\
+			(unsigned long)_o_, (unsigned long)_n_,		\
+			sizeof(*(ptr)));				\
+})
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -290,7 +420,16 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_local((ptr), (o), (n));					\
   })
-#define cmpxchg64_relaxed	cmpxchg64_local
+#define cmpxchg64_relaxed(ptr, o, n)					\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	cmpxchg_relaxed((ptr), (o), (n));				\
+})
+#define cmpxchg64_acquire(ptr, o, n)					\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	cmpxchg_acquire((ptr), (o), (n));				\
+})
 #else
 #include <asm-generic/cmpxchg-local.h>
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
-- 
2.5.3
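
For reference, a hypothetical caller of the new interface (the helper
name and its logic below are made up purely for illustration): a CAS
retry loop that needs no ordering from the primitive itself, and so can
use the relaxed variant and avoid the heavyweight entry/exit barriers a
fully ordered cmpxchg() executes on powerpc:

	static inline int counter_inc_unless_zero(atomic_t *v)
	{
		int c = atomic_read(v);

		while (c) {
			int old = atomic_cmpxchg_relaxed(v, c, c + 1);

			if (old == c)
				return 1;	/* exchange happened */
			c = old;		/* lost the race, retry */
		}

		return 0;			/* counter was zero */
	}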

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
  2015-10-12 14:14 ` [PATCH v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier Boqun Feng
@ 2015-10-12 14:23   ` Boqun Feng
  0 siblings, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:23 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, stable

Oops.. sorry. I will resend this one with the correct address list.

On Mon, Oct 12, 2015 at 10:14:01PM +0800, Boqun Feng wrote:
> According to memory-barriers.txt, xchg, cmpxchg and their atomic{,64}_
> versions all need to imply a full barrier, however they are now just
> RELEASE+ACQUIRE, which is not a full barrier.
> 
> So replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with
> PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in
> __{cmp,}xchg_{u32,u64} respectively to guarantee a full barrier
> semantics of atomic{,64}_{cmp,}xchg() and {cmp,}xchg().
> 
> This patch is a complement of commit b97021f85517 ("powerpc: Fix
> atomic_xxx_return barrier semantics").
> 
> Cc: stable@vger.kernel.org # 3.4.y-
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  arch/powerpc/include/asm/cmpxchg.h | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
> index ad6263c..d1a8d93 100644
> --- a/arch/powerpc/include/asm/cmpxchg.h
> +++ b/arch/powerpc/include/asm/cmpxchg.h
> @@ -18,12 +18,12 @@ __xchg_u32(volatile void *p, unsigned long val)
>  	unsigned long prev;
>  
>  	__asm__ __volatile__(
> -	PPC_RELEASE_BARRIER
> +	PPC_ATOMIC_ENTRY_BARRIER
>  "1:	lwarx	%0,0,%2 \n"
>  	PPC405_ERR77(0,%2)
>  "	stwcx.	%3,0,%2 \n\
>  	bne-	1b"
> -	PPC_ACQUIRE_BARRIER
> +	PPC_ATOMIC_EXIT_BARRIER
>  	: "=&r" (prev), "+m" (*(volatile unsigned int *)p)
>  	: "r" (p), "r" (val)
>  	: "cc", "memory");
> @@ -61,12 +61,12 @@ __xchg_u64(volatile void *p, unsigned long val)
>  	unsigned long prev;
>  
>  	__asm__ __volatile__(
> -	PPC_RELEASE_BARRIER
> +	PPC_ATOMIC_ENTRY_BARRIER
>  "1:	ldarx	%0,0,%2 \n"
>  	PPC405_ERR77(0,%2)
>  "	stdcx.	%3,0,%2 \n\
>  	bne-	1b"
> -	PPC_ACQUIRE_BARRIER
> +	PPC_ATOMIC_EXIT_BARRIER
>  	: "=&r" (prev), "+m" (*(volatile unsigned long *)p)
>  	: "r" (p), "r" (val)
>  	: "cc", "memory");
> @@ -151,14 +151,14 @@ __cmpxchg_u32(volatile unsigned int *p, unsigned long old, unsigned long new)
>  	unsigned int prev;
>  
>  	__asm__ __volatile__ (
> -	PPC_RELEASE_BARRIER
> +	PPC_ATOMIC_ENTRY_BARRIER
>  "1:	lwarx	%0,0,%2		# __cmpxchg_u32\n\
>  	cmpw	0,%0,%3\n\
>  	bne-	2f\n"
>  	PPC405_ERR77(0,%2)
>  "	stwcx.	%4,0,%2\n\
>  	bne-	1b"
> -	PPC_ACQUIRE_BARRIER
> +	PPC_ATOMIC_EXIT_BARRIER
>  	"\n\
>  2:"
>  	: "=&r" (prev), "+m" (*p)
> @@ -197,13 +197,13 @@ __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
>  	unsigned long prev;
>  
>  	__asm__ __volatile__ (
> -	PPC_RELEASE_BARRIER
> +	PPC_ATOMIC_ENTRY_BARRIER
>  "1:	ldarx	%0,0,%2		# __cmpxchg_u64\n\
>  	cmpd	0,%0,%3\n\
>  	bne-	2f\n\
>  	stdcx.	%4,0,%2\n\
>  	bne-	1b"
> -	PPC_ACQUIRE_BARRIER
> +	PPC_ATOMIC_EXIT_BARRIER
>  	"\n\
>  2:"
>  	: "=&r" (prev), "+m" (*p)
> -- 
> 2.5.3
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
  2015-10-12 14:14 [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Boqun Feng
                   ` (5 preceding siblings ...)
  2015-10-12 14:14   ` [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{, 64}_* and atomic{, 64}_cmpxchg_* variants Boqun Feng
@ 2015-10-12 14:30 ` Boqun Feng
  2015-10-14  0:10   ` Michael Ellerman
  2015-10-13 12:27 ` [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Peter Zijlstra
  7 siblings, 1 reply; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 14:30 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev
  Cc: Peter Zijlstra, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso, Boqun Feng,
	stable

According to memory-barriers.txt, xchg, cmpxchg and their atomic{,64}_
versions all need to imply a full barrier; however, they are currently
only RELEASE+ACQUIRE, which is not a full barrier.

So replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with
PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in
__{cmp,}xchg_{u32,u64} respectively, to guarantee full-barrier
semantics for atomic{,64}_{cmp,}xchg() and {cmp,}xchg().

This patch complements commit b97021f85517 ("powerpc: Fix
atomic_xxx_return barrier semantics").

Cc: <stable@vger.kernel.org> # 3.4.y-
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 arch/powerpc/include/asm/cmpxchg.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index ad6263c..d1a8d93 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -18,12 +18,12 @@ __xchg_u32(volatile void *p, unsigned long val)
 	unsigned long prev;
 
 	__asm__ __volatile__(
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	lwarx	%0,0,%2 \n"
 	PPC405_ERR77(0,%2)
 "	stwcx.	%3,0,%2 \n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (prev), "+m" (*(volatile unsigned int *)p)
 	: "r" (p), "r" (val)
 	: "cc", "memory");
@@ -61,12 +61,12 @@ __xchg_u64(volatile void *p, unsigned long val)
 	unsigned long prev;
 
 	__asm__ __volatile__(
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%2 \n"
 	PPC405_ERR77(0,%2)
 "	stdcx.	%3,0,%2 \n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	: "=&r" (prev), "+m" (*(volatile unsigned long *)p)
 	: "r" (p), "r" (val)
 	: "cc", "memory");
@@ -151,14 +151,14 @@ __cmpxchg_u32(volatile unsigned int *p, unsigned long old, unsigned long new)
 	unsigned int prev;
 
 	__asm__ __volatile__ (
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	lwarx	%0,0,%2		# __cmpxchg_u32\n\
 	cmpw	0,%0,%3\n\
 	bne-	2f\n"
 	PPC405_ERR77(0,%2)
 "	stwcx.	%4,0,%2\n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	"\n\
 2:"
 	: "=&r" (prev), "+m" (*p)
@@ -197,13 +197,13 @@ __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
 	unsigned long prev;
 
 	__asm__ __volatile__ (
-	PPC_RELEASE_BARRIER
+	PPC_ATOMIC_ENTRY_BARRIER
 "1:	ldarx	%0,0,%2		# __cmpxchg_u64\n\
 	cmpd	0,%0,%3\n\
 	bne-	2f\n\
 	stdcx.	%4,0,%2\n\
 	bne-	1b"
-	PPC_ACQUIRE_BARRIER
+	PPC_ATOMIC_EXIT_BARRIER
 	"\n\
 2:"
 	: "=&r" (prev), "+m" (*p)
-- 
2.5.3


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [lkp] [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants
       [not found]     ` <20151012145652.GJ27351@fixme-laptop.cn.ibm.com>
@ 2015-10-12 15:29       ` Fengguang Wu
  2015-10-12 15:42         ` Boqun Feng
  0 siblings, 1 reply; 40+ messages in thread
From: Fengguang Wu @ 2015-10-12 15:29 UTC (permalink / raw)
  To: Boqun Feng; +Cc: kbuild-all, LKML

Hi Boqun,

The base tree detection is based on the whole patchset's

- subjects
- touched files
- TO/CC list

The log shows the files and TO/CC are strongly related to powerpc,
so applying the series to the powerpc tree looked like a natural
choice. In particular, you put "linuxppc-dev@lists.ozlabs.org" in the
TO list while Peter/Ingo are in the CC list -- that looks like a
strong indication for powerpc.

[2015-10-12 22:27:49] patched_files: ["arch/powerpc/include/asm/cmpxchg.h", "lib/atomic64_test.c", "include/linux/atomic.h", "arch/powerpc/include/asm/atomic.h"]
[2015-10-12 22:27:49] bases: ["powerpc/next", "powerpc/next"]

[2015-10-12 22:27:49] lists: ["linux-kernel@vger.kernel.org", "linuxppc-dev@lists.ozlabs.org", "Peter Zijlstra <peterz@infradead.org>", "Ingo Molnar <mingo@kernel.org>"
, "Benjamin Herrenschmidt <benh@kernel.crashing.org>", "Paul Mackerras <paulus@samba.org>", "Michael Ellerman <mpe@ellerman.id.au>", "Thomas Gleixner <tglx@linutronix.d
e>", "Will Deacon <will.deacon@arm.com>", "\"Paul E. McKenney\" <paulmck@linux.vnet.ibm.com>", "Waiman Long <waiman.long@hp.com>", "Davidlohr Bueso <dave@stgolabs.net>"
, "Boqun Feng <boqun.feng@gmail.com>"]
[2015-10-12 22:27:49] bases: ["powerpc/next", "powerpc/next", "powerpc/next", "mpe/next", "mpe/next", "arm64/for-next/core", "arm64/for-next/core"]

A possible improvement would be to let tip:locking/core register
itself in the MAINTAINERS file so that it matches files like *cmpxchg*
and *atomic*.
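
Something along these lines, for instance (the maintainers, tree and
patterns below are only a sketch, not a real MAINTAINERS entry):

	LOCKING PRIMITIVES
	M:	Peter Zijlstra <peterz@infradead.org>
	M:	Ingo Molnar <mingo@kernel.org>
	L:	linux-kernel@vger.kernel.org
	T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core
	F:	include/linux/atomic.h
	N:	atomic
	N:	cmpxchg

With such an entry the robot could map patches touching those files to
the tip locking/core branch instead of guessing from the TO/CC list.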

Thanks,
Fengguang

On Mon, Oct 12, 2015 at 10:56:52PM +0800, Boqun Feng wrote:
> On Mon, Oct 12, 2015 at 10:43:56PM +0800, kbuild test robot wrote:
> > Hi Boqun,
> > 
> > [auto build test ERROR on v4.3-rc5 -- if it's inappropriate base, please suggest rules for selecting the more suitable base]
> > 
> 
> This patch should be tested based on current locking/core branch of tip
> tree. Thank you.
> 
> Regards,
> Boqun
> 
> > url:    https://github.com/0day-ci/linux/commits/Boqun-Feng/atomics-powerpc-Implement-relaxed-acquire-release-variants-of-some-atomics/20151012-222750
> > config: x86_64-randconfig-x016-10121751 (attached as .config)
> > reproduce:
> >         # save the attached .config to linux build tree
> >         make ARCH=x86_64 
> > 
> > All error/warnings (new ones prefixed by >>):
> > 
> >    In file included from include/linux/init.h:4:0,
> >                     from lib/atomic64_test.c:14:
> >    lib/atomic64_test.c: In function 'test_atomic':
> > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_inc_return_acquire' [-Werror=implicit-function-declaration]
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >             ^
> >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> >     # define unlikely(x) __builtin_expect(!!(x), 0)
> >                                              ^
> > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >      ^
> > >> lib/atomic64_test.c:38:2: note: in expansion of macro 'TEST_ARGS'
> >      test(bit, op##_acquire, ##args); \
> >      ^
> > >> lib/atomic64_test.c:79:2: note: in expansion of macro 'FAMILY_TEST'
> >      FAMILY_TEST(TEST_ARGS, bit, inc_return,  \
> >      ^
> > >> lib/atomic64_test.c:113:2: note: in expansion of macro 'INC_RETURN_FAMILY_TEST'
> >      INC_RETURN_FAMILY_TEST(, v0);
> >      ^
> > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_inc_return_release' [-Werror=implicit-function-declaration]
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >             ^
> >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> >     # define unlikely(x) __builtin_expect(!!(x), 0)
> >                                              ^
> > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >      ^
> >    lib/atomic64_test.c:39:2: note: in expansion of macro 'TEST_ARGS'
> >      test(bit, op##_release, ##args); \
> >      ^
> > >> lib/atomic64_test.c:79:2: note: in expansion of macro 'FAMILY_TEST'
> >      FAMILY_TEST(TEST_ARGS, bit, inc_return,  \
> >      ^
> > >> lib/atomic64_test.c:113:2: note: in expansion of macro 'INC_RETURN_FAMILY_TEST'
> >      INC_RETURN_FAMILY_TEST(, v0);
> >      ^
> > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_inc_return_relaxed' [-Werror=implicit-function-declaration]
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >             ^
> >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> >     # define unlikely(x) __builtin_expect(!!(x), 0)
> >                                              ^
> > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >      ^
> >    lib/atomic64_test.c:40:2: note: in expansion of macro 'TEST_ARGS'
> >      test(bit, op##_relaxed, ##args); \
> >      ^
> > >> lib/atomic64_test.c:79:2: note: in expansion of macro 'FAMILY_TEST'
> >      FAMILY_TEST(TEST_ARGS, bit, inc_return,  \
> >      ^
> > >> lib/atomic64_test.c:113:2: note: in expansion of macro 'INC_RETURN_FAMILY_TEST'
> >      INC_RETURN_FAMILY_TEST(, v0);
> >      ^
> > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_dec_return_acquire' [-Werror=implicit-function-declaration]
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >             ^
> >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> >     # define unlikely(x) __builtin_expect(!!(x), 0)
> >                                              ^
> > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >      ^
> > >> lib/atomic64_test.c:38:2: note: in expansion of macro 'TEST_ARGS'
> >      test(bit, op##_acquire, ##args); \
> >      ^
> >    lib/atomic64_test.c:85:2: note: in expansion of macro 'FAMILY_TEST'
> >      FAMILY_TEST(TEST_ARGS, bit, dec_return,  \
> >      ^
> > >> lib/atomic64_test.c:114:2: note: in expansion of macro 'DEC_RETURN_FAMILY_TEST'
> >      DEC_RETURN_FAMILY_TEST(, v0);
> >      ^
> > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_dec_return_release' [-Werror=implicit-function-declaration]
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >             ^
> >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> >     # define unlikely(x) __builtin_expect(!!(x), 0)
> >                                              ^
> > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> >      ^
> >    lib/atomic64_test.c:39:2: note: in expansion of macro 'TEST_ARGS'
> >      test(bit, op##_release, ##args); \
> >      ^
> >    lib/atomic64_test.c:85:2: note: in expansion of macro 'FAMILY_TEST'
> >      FAMILY_TEST(TEST_ARGS, bit, dec_return,  \
> >      ^
> > >> lib/atomic64_test.c:114:2: note: in expansion of macro 'DEC_RETURN_FAMILY_TEST'
> >      DEC_RETURN_FAMILY_TEST(, v0);
> >      ^
> > 
> > vim +/atomic_inc_return_acquire +60 lib/atomic64_test.c
> > 
> >      8	 * the Free Software Foundation; either version 2 of the License, or
> >      9	 * (at your option) any later version.
> >     10	 */
> >     11	
> >     12	#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >     13	
> >   > 14	#include <linux/init.h>
> >     15	#include <linux/bug.h>
> >     16	#include <linux/kernel.h>
> >     17	#include <linux/atomic.h>
> >     18	
> >     19	#define TEST(bit, op, c_op, val)				\
> >     20	do {								\
> >     21		atomic##bit##_set(&v, v0);				\
> >     22		r = v0;							\
> >     23		atomic##bit##_##op(val, &v);				\
> >     24		r c_op val;						\
> >     25		WARN(atomic##bit##_read(&v) != r, "%Lx != %Lx\n",	\
> >     26			(unsigned long long)atomic##bit##_read(&v),	\
> >     27			(unsigned long long)r);				\
> >     28	} while (0)
> >     29	
> >     30	/*
> >     31	 * Test for a atomic operation family,
> >     32	 * @test should be a macro accepting parameters (bit, op, ...)
> >     33	 */
> >     34	
> >     35	#define FAMILY_TEST(test, bit, op, args...)	\
> >     36	do {						\
> >     37		test(bit, op, ##args);		\
> >   > 38		test(bit, op##_acquire, ##args);	\
> >   > 39		test(bit, op##_release, ##args);	\
> >   > 40		test(bit, op##_relaxed, ##args);	\
> >     41	} while (0)
> >     42	
> >     43	#define TEST_RETURN(bit, op, c_op, val)				\
> >     44	do {								\
> >     45		atomic##bit##_set(&v, v0);				\
> >     46		r = v0;							\
> >     47		r c_op val;						\
> >     48		BUG_ON(atomic##bit##_##op(val, &v) != r);		\
> >     49		BUG_ON(atomic##bit##_read(&v) != r);			\
> >     50	} while (0)
> >     51	
> >     52	#define RETURN_FAMILY_TEST(bit, op, c_op, val)			\
> >     53	do {								\
> >     54		FAMILY_TEST(TEST_RETURN, bit, op, c_op, val);		\
> >     55	} while (0)
> >     56	
> >     57	#define TEST_ARGS(bit, op, init, ret, expect, args...)		\
> >     58	do {								\
> >     59		atomic##bit##_set(&v, init);				\
> >   > 60		BUG_ON(atomic##bit##_##op(&v, ##args) != ret);		\
> >     61		BUG_ON(atomic##bit##_read(&v) != expect);		\
> >     62	} while (0)
> >     63	
> >     64	#define XCHG_FAMILY_TEST(bit, init, new)				\
> >     65	do {									\
> >     66		FAMILY_TEST(TEST_ARGS, bit, xchg, init, init, new, new);	\
> >     67	} while (0)
> >     68	
> >     69	#define CMPXCHG_FAMILY_TEST(bit, init, new, wrong)			\
> >     70	do {									\
> >     71		FAMILY_TEST(TEST_ARGS, bit, cmpxchg, 				\
> >     72				init, init, new, init, new);			\
> >     73		FAMILY_TEST(TEST_ARGS, bit, cmpxchg,				\
> >     74				init, init, init, wrong, new);			\
> >     75	} while (0)
> >     76	
> >     77	#define INC_RETURN_FAMILY_TEST(bit, i)			\
> >     78	do {							\
> >   > 79		FAMILY_TEST(TEST_ARGS, bit, inc_return,		\
> >     80				i, (i) + one, (i) + one);	\
> >     81	} while (0)
> >     82	
> >     83	#define DEC_RETURN_FAMILY_TEST(bit, i)			\
> >     84	do {							\
> >   > 85		FAMILY_TEST(TEST_ARGS, bit, dec_return,		\
> >     86				i, (i) - one, (i) - one);	\
> >     87	} while (0)
> >     88	
> >     89	static __init void test_atomic(void)
> >     90	{
> >     91		int v0 = 0xaaa31337;
> >     92		int v1 = 0xdeadbeef;
> >     93		int onestwos = 0x11112222;
> >     94		int one = 1;
> >     95	
> >     96		atomic_t v;
> >     97		int r;
> >     98	
> >     99		TEST(, add, +=, onestwos);
> >    100		TEST(, add, +=, -one);
> >    101		TEST(, sub, -=, onestwos);
> >    102		TEST(, sub, -=, -one);
> >    103		TEST(, or, |=, v1);
> >    104		TEST(, and, &=, v1);
> >    105		TEST(, xor, ^=, v1);
> >    106		TEST(, andnot, &= ~, v1);
> >    107	
> >    108		RETURN_FAMILY_TEST(, add_return, +=, onestwos);
> >    109		RETURN_FAMILY_TEST(, add_return, +=, -one);
> >    110		RETURN_FAMILY_TEST(, sub_return, -=, onestwos);
> >    111		RETURN_FAMILY_TEST(, sub_return, -=, -one);
> >    112	
> >  > 113		INC_RETURN_FAMILY_TEST(, v0);
> >  > 114		DEC_RETURN_FAMILY_TEST(, v0);
> >    115	
> >    116		XCHG_FAMILY_TEST(, v0, v1);
> >    117		CMPXCHG_FAMILY_TEST(, v0, v1, onestwos);
> > 
> > ---
> > 0-DAY kernel test infrastructure                Open Source Technology Center
> > https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> 
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [lkp] [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants
  2015-10-12 15:29       ` [lkp] " Fengguang Wu
@ 2015-10-12 15:42         ` Boqun Feng
  2015-10-12 16:02           ` [kbuild-all] " Fengguang Wu
  0 siblings, 1 reply; 40+ messages in thread
From: Boqun Feng @ 2015-10-12 15:42 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: kbuild-all, LKML

[-- Attachment #1: Type: text/plain, Size: 12520 bytes --]

Hi Fengguang,

On Mon, Oct 12, 2015 at 11:29:14PM +0800, Fengguang Wu wrote:
> Hi Boqun,
> 
> The base tree detection is based on the whole patchset's
> 
> - subjects
> - touched files
> - TO/CC list
> 
> Log shows the files and TO/CC are strongly related to powerpc,
> so it looks a natural choice to apply to it. Especially you put
> "linuxppc-dev@lists.ozlabs.org" in the TO list while Peter/Ingo
> in the CC list -- that looks like a strong indication for powerpc.
> 

Thank you for your explanation. So would modifying the title to:

[PATCH v3 tip/locking/core 2/6] ...

also work?

BTW, does this bot have more tests than 0day? I have pushed this
patchset to my own repo and had it tested by 0day.

Regards,
Boqun

> [2015-10-12 22:27:49] patched_files: ["arch/powerpc/include/asm/cmpxchg.h", "lib/atomic64_test.c", "include/linux/atomic.h", "arch/powerpc/include/asm/atomic.h"]
> [2015-10-12 22:27:49] bases: ["powerpc/next", "powerpc/next"]
> 
> [2015-10-12 22:27:49] lists: ["linux-kernel@vger.kernel.org", "linuxppc-dev@lists.ozlabs.org", "Peter Zijlstra <peterz@infradead.org>", "Ingo Molnar <mingo@kernel.org>"
> , "Benjamin Herrenschmidt <benh@kernel.crashing.org>", "Paul Mackerras <paulus@samba.org>", "Michael Ellerman <mpe@ellerman.id.au>", "Thomas Gleixner <tglx@linutronix.d
> e>", "Will Deacon <will.deacon@arm.com>", "\"Paul E. McKenney\" <paulmck@linux.vnet.ibm.com>", "Waiman Long <waiman.long@hp.com>", "Davidlohr Bueso <dave@stgolabs.net>"
> , "Boqun Feng <boqun.feng@gmail.com>"]
> [2015-10-12 22:27:49] bases: ["powerpc/next", "powerpc/next", "powerpc/next", "mpe/next", "mpe/next", "arm64/for-next/core", "arm64/for-next/core"]
> 
> The possible improvement would be to let tip:locking/core register
> itself in the MAINTAINERS file to match files *cmpxchg* *atomic*.
> 
> Thanks,
> Fengguang
> 
> On Mon, Oct 12, 2015 at 10:56:52PM +0800, Boqun Feng wrote:
> > On Mon, Oct 12, 2015 at 10:43:56PM +0800, kbuild test robot wrote:
> > > Hi Boqun,
> > > 
> > > [auto build test ERROR on v4.3-rc5 -- if it's inappropriate base, please suggest rules for selecting the more suitable base]
> > > 
> > 
> > This patch should be tested based on current locking/core branch of tip
> > tree. Thank you.
> > 
> > Regards,
> > Boqun
> > 
> > > url:    https://github.com/0day-ci/linux/commits/Boqun-Feng/atomics-powerpc-Implement-relaxed-acquire-release-variants-of-some-atomics/20151012-222750
> > > config: x86_64-randconfig-x016-10121751 (attached as .config)
> > > reproduce:
> > >         # save the attached .config to linux build tree
> > >         make ARCH=x86_64 
> > > 
> > > All error/warnings (new ones prefixed by >>):
> > > 
> > >    In file included from include/linux/init.h:4:0,
> > >                     from lib/atomic64_test.c:14:
> > >    lib/atomic64_test.c: In function 'test_atomic':
> > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_inc_return_acquire' [-Werror=implicit-function-declaration]
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >             ^
> > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > >                                              ^
> > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >      ^
> > > >> lib/atomic64_test.c:38:2: note: in expansion of macro 'TEST_ARGS'
> > >      test(bit, op##_acquire, ##args); \
> > >      ^
> > > >> lib/atomic64_test.c:79:2: note: in expansion of macro 'FAMILY_TEST'
> > >      FAMILY_TEST(TEST_ARGS, bit, inc_return,  \
> > >      ^
> > > >> lib/atomic64_test.c:113:2: note: in expansion of macro 'INC_RETURN_FAMILY_TEST'
> > >      INC_RETURN_FAMILY_TEST(, v0);
> > >      ^
> > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_inc_return_release' [-Werror=implicit-function-declaration]
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >             ^
> > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > >                                              ^
> > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >      ^
> > >    lib/atomic64_test.c:39:2: note: in expansion of macro 'TEST_ARGS'
> > >      test(bit, op##_release, ##args); \
> > >      ^
> > > >> lib/atomic64_test.c:79:2: note: in expansion of macro 'FAMILY_TEST'
> > >      FAMILY_TEST(TEST_ARGS, bit, inc_return,  \
> > >      ^
> > > >> lib/atomic64_test.c:113:2: note: in expansion of macro 'INC_RETURN_FAMILY_TEST'
> > >      INC_RETURN_FAMILY_TEST(, v0);
> > >      ^
> > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_inc_return_relaxed' [-Werror=implicit-function-declaration]
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >             ^
> > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > >                                              ^
> > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >      ^
> > >    lib/atomic64_test.c:40:2: note: in expansion of macro 'TEST_ARGS'
> > >      test(bit, op##_relaxed, ##args); \
> > >      ^
> > > >> lib/atomic64_test.c:79:2: note: in expansion of macro 'FAMILY_TEST'
> > >      FAMILY_TEST(TEST_ARGS, bit, inc_return,  \
> > >      ^
> > > >> lib/atomic64_test.c:113:2: note: in expansion of macro 'INC_RETURN_FAMILY_TEST'
> > >      INC_RETURN_FAMILY_TEST(, v0);
> > >      ^
> > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_dec_return_acquire' [-Werror=implicit-function-declaration]
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >             ^
> > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > >                                              ^
> > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >      ^
> > > >> lib/atomic64_test.c:38:2: note: in expansion of macro 'TEST_ARGS'
> > >      test(bit, op##_acquire, ##args); \
> > >      ^
> > >    lib/atomic64_test.c:85:2: note: in expansion of macro 'FAMILY_TEST'
> > >      FAMILY_TEST(TEST_ARGS, bit, dec_return,  \
> > >      ^
> > > >> lib/atomic64_test.c:114:2: note: in expansion of macro 'DEC_RETURN_FAMILY_TEST'
> > >      DEC_RETURN_FAMILY_TEST(, v0);
> > >      ^
> > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_dec_return_release' [-Werror=implicit-function-declaration]
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >             ^
> > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > >                                              ^
> > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > >      ^
> > >    lib/atomic64_test.c:39:2: note: in expansion of macro 'TEST_ARGS'
> > >      test(bit, op##_release, ##args); \
> > >      ^
> > >    lib/atomic64_test.c:85:2: note: in expansion of macro 'FAMILY_TEST'
> > >      FAMILY_TEST(TEST_ARGS, bit, dec_return,  \
> > >      ^
> > > >> lib/atomic64_test.c:114:2: note: in expansion of macro 'DEC_RETURN_FAMILY_TEST'
> > >      DEC_RETURN_FAMILY_TEST(, v0);
> > >      ^
> > > 
> > > vim +/atomic_inc_return_acquire +60 lib/atomic64_test.c
> > > 
> > >      8	 * the Free Software Foundation; either version 2 of the License, or
> > >      9	 * (at your option) any later version.
> > >     10	 */
> > >     11	
> > >     12	#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > >     13	
> > >   > 14	#include <linux/init.h>
> > >     15	#include <linux/bug.h>
> > >     16	#include <linux/kernel.h>
> > >     17	#include <linux/atomic.h>
> > >     18	
> > >     19	#define TEST(bit, op, c_op, val)				\
> > >     20	do {								\
> > >     21		atomic##bit##_set(&v, v0);				\
> > >     22		r = v0;							\
> > >     23		atomic##bit##_##op(val, &v);				\
> > >     24		r c_op val;						\
> > >     25		WARN(atomic##bit##_read(&v) != r, "%Lx != %Lx\n",	\
> > >     26			(unsigned long long)atomic##bit##_read(&v),	\
> > >     27			(unsigned long long)r);				\
> > >     28	} while (0)
> > >     29	
> > >     30	/*
> > >     31	 * Test for a atomic operation family,
> > >     32	 * @test should be a macro accepting parameters (bit, op, ...)
> > >     33	 */
> > >     34	
> > >     35	#define FAMILY_TEST(test, bit, op, args...)	\
> > >     36	do {						\
> > >     37		test(bit, op, ##args);		\
> > >   > 38		test(bit, op##_acquire, ##args);	\
> > >   > 39		test(bit, op##_release, ##args);	\
> > >   > 40		test(bit, op##_relaxed, ##args);	\
> > >     41	} while (0)
> > >     42	
> > >     43	#define TEST_RETURN(bit, op, c_op, val)				\
> > >     44	do {								\
> > >     45		atomic##bit##_set(&v, v0);				\
> > >     46		r = v0;							\
> > >     47		r c_op val;						\
> > >     48		BUG_ON(atomic##bit##_##op(val, &v) != r);		\
> > >     49		BUG_ON(atomic##bit##_read(&v) != r);			\
> > >     50	} while (0)
> > >     51	
> > >     52	#define RETURN_FAMILY_TEST(bit, op, c_op, val)			\
> > >     53	do {								\
> > >     54		FAMILY_TEST(TEST_RETURN, bit, op, c_op, val);		\
> > >     55	} while (0)
> > >     56	
> > >     57	#define TEST_ARGS(bit, op, init, ret, expect, args...)		\
> > >     58	do {								\
> > >     59		atomic##bit##_set(&v, init);				\
> > >   > 60		BUG_ON(atomic##bit##_##op(&v, ##args) != ret);		\
> > >     61		BUG_ON(atomic##bit##_read(&v) != expect);		\
> > >     62	} while (0)
> > >     63	
> > >     64	#define XCHG_FAMILY_TEST(bit, init, new)				\
> > >     65	do {									\
> > >     66		FAMILY_TEST(TEST_ARGS, bit, xchg, init, init, new, new);	\
> > >     67	} while (0)
> > >     68	
> > >     69	#define CMPXCHG_FAMILY_TEST(bit, init, new, wrong)			\
> > >     70	do {									\
> > >     71		FAMILY_TEST(TEST_ARGS, bit, cmpxchg, 				\
> > >     72				init, init, new, init, new);			\
> > >     73		FAMILY_TEST(TEST_ARGS, bit, cmpxchg,				\
> > >     74				init, init, init, wrong, new);			\
> > >     75	} while (0)
> > >     76	
> > >     77	#define INC_RETURN_FAMILY_TEST(bit, i)			\
> > >     78	do {							\
> > >   > 79		FAMILY_TEST(TEST_ARGS, bit, inc_return,		\
> > >     80				i, (i) + one, (i) + one);	\
> > >     81	} while (0)
> > >     82	
> > >     83	#define DEC_RETURN_FAMILY_TEST(bit, i)			\
> > >     84	do {							\
> > >   > 85		FAMILY_TEST(TEST_ARGS, bit, dec_return,		\
> > >     86				i, (i) - one, (i) - one);	\
> > >     87	} while (0)
> > >     88	
> > >     89	static __init void test_atomic(void)
> > >     90	{
> > >     91		int v0 = 0xaaa31337;
> > >     92		int v1 = 0xdeadbeef;
> > >     93		int onestwos = 0x11112222;
> > >     94		int one = 1;
> > >     95	
> > >     96		atomic_t v;
> > >     97		int r;
> > >     98	
> > >     99		TEST(, add, +=, onestwos);
> > >    100		TEST(, add, +=, -one);
> > >    101		TEST(, sub, -=, onestwos);
> > >    102		TEST(, sub, -=, -one);
> > >    103		TEST(, or, |=, v1);
> > >    104		TEST(, and, &=, v1);
> > >    105		TEST(, xor, ^=, v1);
> > >    106		TEST(, andnot, &= ~, v1);
> > >    107	
> > >    108		RETURN_FAMILY_TEST(, add_return, +=, onestwos);
> > >    109		RETURN_FAMILY_TEST(, add_return, +=, -one);
> > >    110		RETURN_FAMILY_TEST(, sub_return, -=, onestwos);
> > >    111		RETURN_FAMILY_TEST(, sub_return, -=, -one);
> > >    112	
> > >  > 113		INC_RETURN_FAMILY_TEST(, v0);
> > >  > 114		DEC_RETURN_FAMILY_TEST(, v0);
> > >    115	
> > >    116		XCHG_FAMILY_TEST(, v0, v1);
> > >    117		CMPXCHG_FAMILY_TEST(, v0, v1, onestwos);
> > > 
> > > ---
> > > 0-DAY kernel test infrastructure                Open Source Technology Center
> > > https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> > 
> > 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [kbuild-all] [lkp] [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants
  2015-10-12 15:42         ` Boqun Feng
@ 2015-10-12 16:02           ` Fengguang Wu
  2015-10-12 16:09             ` Fengguang Wu
  2015-10-13  1:33             ` Boqun Feng
  0 siblings, 2 replies; 40+ messages in thread
From: Fengguang Wu @ 2015-10-12 16:02 UTC (permalink / raw)
  To: Boqun Feng; +Cc: kbuild-all, LKML

On Mon, Oct 12, 2015 at 11:42:24PM +0800, Boqun Feng wrote:
> Hi Fengguang,
> 
> On Mon, Oct 12, 2015 at 11:29:14PM +0800, Fengguang Wu wrote:
> > Hi Boqun,
> > 
> > The base tree detection is based on the whole patchset's
> > 
> > - subjects
> > - touched files
> > - TO/CC list
> > 
> > Log shows the files and TO/CC are strongly related to powerpc,
> > so it looks a natural choice to apply to it. Especially you put
> > "linuxppc-dev@lists.ozlabs.org" in the TO list while Peter/Ingo
> > in the CC list -- that looks like a strong indication for powerpc.
> > 
> 
> Thank you for your explanation, so how about modifying the title to:
> 
> [PATCH v3 tip/locking/core 2/6] ...
> 
> also works?

Yes, it will work -- that'd be the strongest hint.

> BTW, does this bot have more tests than 0day? I have pushed this
> patchset to my own repo and had it tested by 0day.

Yes, it runs git am, and the resulting commits are fed to 0day for
build/boot/performance tests. If you've already pushed the patches via
git, the robot should have skipped such duplicate tests on the emailed
patches -- unless there is a bug. I'll have a check.

Thanks,
Fengguang

> > [2015-10-12 22:27:49] patched_files: ["arch/powerpc/include/asm/cmpxchg.h", "lib/atomic64_test.c", "include/linux/atomic.h", "arch/powerpc/include/asm/atomic.h"]
> > [2015-10-12 22:27:49] bases: ["powerpc/next", "powerpc/next"]
> > 
> > [2015-10-12 22:27:49] lists: ["linux-kernel@vger.kernel.org", "linuxppc-dev@lists.ozlabs.org", "Peter Zijlstra <peterz@infradead.org>", "Ingo Molnar <mingo@kernel.org>"
> > , "Benjamin Herrenschmidt <benh@kernel.crashing.org>", "Paul Mackerras <paulus@samba.org>", "Michael Ellerman <mpe@ellerman.id.au>", "Thomas Gleixner <tglx@linutronix.d
> > e>", "Will Deacon <will.deacon@arm.com>", "\"Paul E. McKenney\" <paulmck@linux.vnet.ibm.com>", "Waiman Long <waiman.long@hp.com>", "Davidlohr Bueso <dave@stgolabs.net>"
> > , "Boqun Feng <boqun.feng@gmail.com>"]
> > [2015-10-12 22:27:49] bases: ["powerpc/next", "powerpc/next", "powerpc/next", "mpe/next", "mpe/next", "arm64/for-next/core", "arm64/for-next/core"]
> > 
> > The possible improvement would be to let tip:locking/core register
> > itself in the MAINTAINERS file to match files *cmpxchg* *atomic*.
> > 
> > Thanks,
> > Fengguang
> > 
> > On Mon, Oct 12, 2015 at 10:56:52PM +0800, Boqun Feng wrote:
> > > On Mon, Oct 12, 2015 at 10:43:56PM +0800, kbuild test robot wrote:
> > > > Hi Boqun,
> > > > 
> > > > [auto build test ERROR on v4.3-rc5 -- if it's inappropriate base, please suggest rules for selecting the more suitable base]
> > > > 
> > > 
> > > This patch should be tested based on current locking/core branch of tip
> > > tree. Thank you.
> > > 
> > > Regards,
> > > Boqun
> > > 
> > > > url:    https://github.com/0day-ci/linux/commits/Boqun-Feng/atomics-powerpc-Implement-relaxed-acquire-release-variants-of-some-atomics/20151012-222750
> > > > config: x86_64-randconfig-x016-10121751 (attached as .config)
> > > > reproduce:
> > > >         # save the attached .config to linux build tree
> > > >         make ARCH=x86_64 
> > > > 
> > > > All error/warnings (new ones prefixed by >>):
> > > > 
> > > >    In file included from include/linux/init.h:4:0,
> > > >                     from lib/atomic64_test.c:14:
> > > >    lib/atomic64_test.c: In function 'test_atomic':
> > > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_inc_return_acquire' [-Werror=implicit-function-declaration]
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >             ^
> > > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > > >                                              ^
> > > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >      ^
> > > > >> lib/atomic64_test.c:38:2: note: in expansion of macro 'TEST_ARGS'
> > > >      test(bit, op##_acquire, ##args); \
> > > >      ^
> > > > >> lib/atomic64_test.c:79:2: note: in expansion of macro 'FAMILY_TEST'
> > > >      FAMILY_TEST(TEST_ARGS, bit, inc_return,  \
> > > >      ^
> > > > >> lib/atomic64_test.c:113:2: note: in expansion of macro 'INC_RETURN_FAMILY_TEST'
> > > >      INC_RETURN_FAMILY_TEST(, v0);
> > > >      ^
> > > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_inc_return_release' [-Werror=implicit-function-declaration]
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >             ^
> > > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > > >                                              ^
> > > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >      ^
> > > >    lib/atomic64_test.c:39:2: note: in expansion of macro 'TEST_ARGS'
> > > >      test(bit, op##_release, ##args); \
> > > >      ^
> > > > >> lib/atomic64_test.c:79:2: note: in expansion of macro 'FAMILY_TEST'
> > > >      FAMILY_TEST(TEST_ARGS, bit, inc_return,  \
> > > >      ^
> > > > >> lib/atomic64_test.c:113:2: note: in expansion of macro 'INC_RETURN_FAMILY_TEST'
> > > >      INC_RETURN_FAMILY_TEST(, v0);
> > > >      ^
> > > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_inc_return_relaxed' [-Werror=implicit-function-declaration]
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >             ^
> > > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > > >                                              ^
> > > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >      ^
> > > >    lib/atomic64_test.c:40:2: note: in expansion of macro 'TEST_ARGS'
> > > >      test(bit, op##_relaxed, ##args); \
> > > >      ^
> > > > >> lib/atomic64_test.c:79:2: note: in expansion of macro 'FAMILY_TEST'
> > > >      FAMILY_TEST(TEST_ARGS, bit, inc_return,  \
> > > >      ^
> > > > >> lib/atomic64_test.c:113:2: note: in expansion of macro 'INC_RETURN_FAMILY_TEST'
> > > >      INC_RETURN_FAMILY_TEST(, v0);
> > > >      ^
> > > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_dec_return_acquire' [-Werror=implicit-function-declaration]
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >             ^
> > > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > > >                                              ^
> > > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >      ^
> > > > >> lib/atomic64_test.c:38:2: note: in expansion of macro 'TEST_ARGS'
> > > >      test(bit, op##_acquire, ##args); \
> > > >      ^
> > > >    lib/atomic64_test.c:85:2: note: in expansion of macro 'FAMILY_TEST'
> > > >      FAMILY_TEST(TEST_ARGS, bit, dec_return,  \
> > > >      ^
> > > > >> lib/atomic64_test.c:114:2: note: in expansion of macro 'DEC_RETURN_FAMILY_TEST'
> > > >      DEC_RETURN_FAMILY_TEST(, v0);
> > > >      ^
> > > > >> lib/atomic64_test.c:60:9: error: implicit declaration of function 'atomic_dec_return_release' [-Werror=implicit-function-declaration]
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >             ^
> > > >    include/linux/compiler.h:166:42: note: in definition of macro 'unlikely'
> > > >     # define unlikely(x) __builtin_expect(!!(x), 0)
> > > >                                              ^
> > > > >> lib/atomic64_test.c:60:2: note: in expansion of macro 'BUG_ON'
> > > >      BUG_ON(atomic##bit##_##op(&v, ##args) != ret);  \
> > > >      ^
> > > >    lib/atomic64_test.c:39:2: note: in expansion of macro 'TEST_ARGS'
> > > >      test(bit, op##_release, ##args); \
> > > >      ^
> > > >    lib/atomic64_test.c:85:2: note: in expansion of macro 'FAMILY_TEST'
> > > >      FAMILY_TEST(TEST_ARGS, bit, dec_return,  \
> > > >      ^
> > > > >> lib/atomic64_test.c:114:2: note: in expansion of macro 'DEC_RETURN_FAMILY_TEST'
> > > >      DEC_RETURN_FAMILY_TEST(, v0);
> > > >      ^
> > > > 
> > > > vim +/atomic_inc_return_acquire +60 lib/atomic64_test.c
> > > > 
> > > >      8	 * the Free Software Foundation; either version 2 of the License, or
> > > >      9	 * (at your option) any later version.
> > > >     10	 */
> > > >     11	
> > > >     12	#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > > >     13	
> > > >   > 14	#include <linux/init.h>
> > > >     15	#include <linux/bug.h>
> > > >     16	#include <linux/kernel.h>
> > > >     17	#include <linux/atomic.h>
> > > >     18	
> > > >     19	#define TEST(bit, op, c_op, val)				\
> > > >     20	do {								\
> > > >     21		atomic##bit##_set(&v, v0);				\
> > > >     22		r = v0;							\
> > > >     23		atomic##bit##_##op(val, &v);				\
> > > >     24		r c_op val;						\
> > > >     25		WARN(atomic##bit##_read(&v) != r, "%Lx != %Lx\n",	\
> > > >     26			(unsigned long long)atomic##bit##_read(&v),	\
> > > >     27			(unsigned long long)r);				\
> > > >     28	} while (0)
> > > >     29	
> > > >     30	/*
> > > >     31	 * Test for a atomic operation family,
> > > >     32	 * @test should be a macro accepting parameters (bit, op, ...)
> > > >     33	 */
> > > >     34	
> > > >     35	#define FAMILY_TEST(test, bit, op, args...)	\
> > > >     36	do {						\
> > > >     37		test(bit, op, ##args);		\
> > > >   > 38		test(bit, op##_acquire, ##args);	\
> > > >   > 39		test(bit, op##_release, ##args);	\
> > > >   > 40		test(bit, op##_relaxed, ##args);	\
> > > >     41	} while (0)
> > > >     42	
> > > >     43	#define TEST_RETURN(bit, op, c_op, val)				\
> > > >     44	do {								\
> > > >     45		atomic##bit##_set(&v, v0);				\
> > > >     46		r = v0;							\
> > > >     47		r c_op val;						\
> > > >     48		BUG_ON(atomic##bit##_##op(val, &v) != r);		\
> > > >     49		BUG_ON(atomic##bit##_read(&v) != r);			\
> > > >     50	} while (0)
> > > >     51	
> > > >     52	#define RETURN_FAMILY_TEST(bit, op, c_op, val)			\
> > > >     53	do {								\
> > > >     54		FAMILY_TEST(TEST_RETURN, bit, op, c_op, val);		\
> > > >     55	} while (0)
> > > >     56	
> > > >     57	#define TEST_ARGS(bit, op, init, ret, expect, args...)		\
> > > >     58	do {								\
> > > >     59		atomic##bit##_set(&v, init);				\
> > > >   > 60		BUG_ON(atomic##bit##_##op(&v, ##args) != ret);		\
> > > >     61		BUG_ON(atomic##bit##_read(&v) != expect);		\
> > > >     62	} while (0)
> > > >     63	
> > > >     64	#define XCHG_FAMILY_TEST(bit, init, new)				\
> > > >     65	do {									\
> > > >     66		FAMILY_TEST(TEST_ARGS, bit, xchg, init, init, new, new);	\
> > > >     67	} while (0)
> > > >     68	
> > > >     69	#define CMPXCHG_FAMILY_TEST(bit, init, new, wrong)			\
> > > >     70	do {									\
> > > >     71		FAMILY_TEST(TEST_ARGS, bit, cmpxchg, 				\
> > > >     72				init, init, new, init, new);			\
> > > >     73		FAMILY_TEST(TEST_ARGS, bit, cmpxchg,				\
> > > >     74				init, init, init, wrong, new);			\
> > > >     75	} while (0)
> > > >     76	
> > > >     77	#define INC_RETURN_FAMILY_TEST(bit, i)			\
> > > >     78	do {							\
> > > >   > 79		FAMILY_TEST(TEST_ARGS, bit, inc_return,		\
> > > >     80				i, (i) + one, (i) + one);	\
> > > >     81	} while (0)
> > > >     82	
> > > >     83	#define DEC_RETURN_FAMILY_TEST(bit, i)			\
> > > >     84	do {							\
> > > >   > 85		FAMILY_TEST(TEST_ARGS, bit, dec_return,		\
> > > >     86				i, (i) - one, (i) - one);	\
> > > >     87	} while (0)
> > > >     88	
> > > >     89	static __init void test_atomic(void)
> > > >     90	{
> > > >     91		int v0 = 0xaaa31337;
> > > >     92		int v1 = 0xdeadbeef;
> > > >     93		int onestwos = 0x11112222;
> > > >     94		int one = 1;
> > > >     95	
> > > >     96		atomic_t v;
> > > >     97		int r;
> > > >     98	
> > > >     99		TEST(, add, +=, onestwos);
> > > >    100		TEST(, add, +=, -one);
> > > >    101		TEST(, sub, -=, onestwos);
> > > >    102		TEST(, sub, -=, -one);
> > > >    103		TEST(, or, |=, v1);
> > > >    104		TEST(, and, &=, v1);
> > > >    105		TEST(, xor, ^=, v1);
> > > >    106		TEST(, andnot, &= ~, v1);
> > > >    107	
> > > >    108		RETURN_FAMILY_TEST(, add_return, +=, onestwos);
> > > >    109		RETURN_FAMILY_TEST(, add_return, +=, -one);
> > > >    110		RETURN_FAMILY_TEST(, sub_return, -=, onestwos);
> > > >    111		RETURN_FAMILY_TEST(, sub_return, -=, -one);
> > > >    112	
> > > >  > 113		INC_RETURN_FAMILY_TEST(, v0);
> > > >  > 114		DEC_RETURN_FAMILY_TEST(, v0);
> > > >    115	
> > > >    116		XCHG_FAMILY_TEST(, v0, v1);
> > > >    117		CMPXCHG_FAMILY_TEST(, v0, v1, onestwos);
> > > > 
> > > > ---
> > > > 0-DAY kernel test infrastructure                Open Source Technology Center
> > > > https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> > > 
> > > 
> > 
> > 



> _______________________________________________
> kbuild-all mailing list
> kbuild-all@lists.01.org
> https://lists.01.org/mailman/listinfo/kbuild-all


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [kbuild-all] [lkp] [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants
  2015-10-12 16:02           ` [kbuild-all] " Fengguang Wu
@ 2015-10-12 16:09             ` Fengguang Wu
  2015-10-13  1:33             ` Boqun Feng
  1 sibling, 0 replies; 40+ messages in thread
From: Fengguang Wu @ 2015-10-12 16:09 UTC (permalink / raw)
  To: Boqun Feng; +Cc: kbuild-all, LKML

> > BTW, does this bot have more tests than 0day? I have pushed this
> > patchset to my own repo and had it tested by 0day.
> 
> Yes, it runs git am, the resulted commits are feed to 0day for
> build/boot/performance tests. If you've already pushed the patches via
> git, the robot should have skip such duplicate tests on the emailed
> patches -- unless there is a bug. I'll have a check.

Ah yes, there was a silly bug... Fixed, thanks!

Fengguang

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [kbuild-all] [lkp] [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants
  2015-10-12 16:02           ` [kbuild-all] " Fengguang Wu
  2015-10-12 16:09             ` Fengguang Wu
@ 2015-10-13  1:33             ` Boqun Feng
  1 sibling, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-13  1:33 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: kbuild-all, LKML

[-- Attachment #1: Type: text/plain, Size: 1489 bytes --]

On Tue, Oct 13, 2015 at 12:02:00AM +0800, Fengguang Wu wrote:
> On Mon, Oct 12, 2015 at 11:42:24PM +0800, Boqun Feng wrote:
> > Hi Fengguang,
> > 
> > On Mon, Oct 12, 2015 at 11:29:14PM +0800, Fengguang Wu wrote:
> > > Hi Boqun,
> > > 
> > > The base tree detection is based on the whole patchset's
> > > 
> > > - subjects
> > > - touched files
> > > - TO/CC list
> > > 
> > > Log shows the files and TO/CC are strongly related to powerpc,
> > > so it looks a natural choice to apply to it. Especially you put
> > > "linuxppc-dev@lists.ozlabs.org" in the TO list while Peter/Ingo
> > > in the CC list -- that looks like a strong indication for powerpc.
> > > 
> > 
> > Thank you for your explanation, so how about modifying the title to:
> > 
> > [PATCH v3 tip/locking/core 2/6] ...
> > 
> > also works?
> 
> Yes it will work -- that'd be the most strong hint.
> 

Great! I will use that hint in the future, thank you ;-)

> > BTW, does this bot have more tests than 0day? I have pushed this
> > patchset to my own repo and had it tested by 0day.
> 
> Yes, it runs git am, the resulted commits are feed to 0day for
> build/boot/performance tests. If you've already pushed the patches via
> git, the robot should have skip such duplicate tests on the emailed
> patches -- unless there is a bug. I'll have a check.
> 

Thank you. I'm not going to resend this v3 patchset, and I will use
the subject hint for any future patchsets ;-)

Regards,
Boqun


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics
  2015-10-12 14:14 [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Boqun Feng
                   ` (6 preceding siblings ...)
  2015-10-12 14:30 ` [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier Boqun Feng
@ 2015-10-13 12:27 ` Peter Zijlstra
  2015-10-13 15:46   ` Paul E. McKenney
  7 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2015-10-13 12:27 UTC (permalink / raw)
  To: Boqun Feng
  Cc: linux-kernel, linuxppc-dev, Ingo Molnar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Thomas Gleixner, Will Deacon,
	Paul E. McKenney, Waiman Long, Davidlohr Bueso

On Mon, Oct 12, 2015 at 10:14:00PM +0800, Boqun Feng wrote:
 
> The patchset consists of 6 parts:
> 
> 1.	Make xchg, cmpxchg and their atomic_ versions a full barrier
> 
> 2.	Add trivial tests for the new variants in lib/atomic64_test.c
> 
> 3.	Allow architectures to define their own __atomic_op_*() helpers
> 	to build other variants based on relaxed.
> 
> 4.	Implement atomic{,64}_{add,sub,inc,dec}_return_* variants
> 
> 5.	Implement xchg_* and atomic{,64}_xchg_* variants
> 
> 6.	Implement cmpxchg_* atomic{,64}_cmpxchg_* variants
> 
> 
> This patchset is based on current locking/core branch of the tip tree
> and all patches are built and boot tested for little endian pseries, and
> also tested by 0day.
> 

I don't see any immediate problems with this series at this point. Will,
Paul?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants
  2015-10-12 14:14   ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{, 64}_*_return_* variants Boqun Feng
@ 2015-10-13 13:21     ` Will Deacon
  -1 siblings, 0 replies; 40+ messages in thread
From: Will Deacon @ 2015-10-13 13:21 UTC (permalink / raw)
  To: Boqun Feng
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

On Mon, Oct 12, 2015 at 10:14:04PM +0800, Boqun Feng wrote:
> On powerpc, acquire and release semantics can be achieved with
> lightweight barriers("lwsync" and "ctrl+isync"), which can be used to
> implement __atomic_op_{acquire,release}.
> 
> For release semantics, since we only need to ensure all memory accesses
> that issue before must take effects before the -store- part of the
> atomics, "lwsync" is what we only need. On the platform without
> "lwsync", "sync" should be used. Therefore, smp_lwsync() is used here.
> 
> For acquire semantics, "lwsync" is what we only need for the similar
> reason.  However on the platform without "lwsync", we can use "isync"
> rather than "sync" as an acquire barrier. So a new kind of barrier
> smp_acquire_barrier__after_atomic() is introduced, which is barrier() on
> UP, "lwsync" if available and "isync" otherwise.
> 
> __atomic_op_fence is defined as smp_lwsync() + _relaxed +
> smp_mb__after_atomic() to guarantee a full barrier.
> 
> Implement atomic{,64}_{add,sub,inc,dec}_return_relaxed, and build other
> variants with these helpers.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  arch/powerpc/include/asm/atomic.h | 122 +++++++++++++++++++++++++-------------
>  1 file changed, 80 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
> index 55f106e..3143af9 100644
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -12,6 +12,39 @@
>  
>  #define ATOMIC_INIT(i)		{ (i) }
>  
> +/*
> + * Since {add,sub}_return_relaxed and xchg_relaxed are implemented with
> + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> + * on the platform without lwsync.
> + */
> +#ifdef CONFIG_SMP
> +#define smp_acquire_barrier__after_atomic() \
> +	__asm__ __volatile__(PPC_ACQUIRE_BARRIER : : : "memory")

I'm not keen on this barrier, as it sounds like it's part of the kernel
memory model, as opposed to an implementation detail on PowerPC (and
we've already got enough of that in the generic code ;).

Can you name it something different please (and maybe #undef it when
you're done)?

Will

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-12 14:14   ` [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{, 64}_* and atomic{, 64}_cmpxchg_* variants Boqun Feng
@ 2015-10-13 13:24   ` Will Deacon
  2015-10-13 14:32     ` Boqun Feng
  -1 siblings, 1 reply; 40+ messages in thread
From: Will Deacon @ 2015-10-13 13:24 UTC (permalink / raw)
  To: Boqun Feng
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

On Mon, Oct 12, 2015 at 10:14:06PM +0800, Boqun Feng wrote:
> Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
> which _release variants can be built.
> 
> To avoid superfluous barriers in _acquire variants, we implement these
> operations with assembly code rather use __atomic_op_acquire() to build
> them automatically.

The "superfluous barriers" are for the case where the cmpxchg fails, right?
And you don't do the same thing for release, because you want to avoid a
barrier in the middle of the critical section?

(just checking I understand your reasoning).

Will

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants
  2015-10-13 13:21     ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{, 64}_*_return_* variants Will Deacon
@ 2015-10-13 13:35       ` Boqun Feng
  -1 siblings, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-13 13:35 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 1320 bytes --]

On Tue, Oct 13, 2015 at 02:21:32PM +0100, Will Deacon wrote:
> On Mon, Oct 12, 2015 at 10:14:04PM +0800, Boqun Feng wrote:
[snip]
> > +/*
> > + * Since {add,sub}_return_relaxed and xchg_relaxed are implemented with
> > + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> > + * on the platform without lwsync.
> > + */
> > +#ifdef CONFIG_SMP
> > +#define smp_acquire_barrier__after_atomic() \
> > +	__asm__ __volatile__(PPC_ACQUIRE_BARRIER : : : "memory")
> 
> I'm not keen on this barrier, as it sounds like it's part of the kernel
> memory model, as opposed to an implementation detail on PowerPC (and
> we've already got enough of that in the generic code ;).
> 

Indeed, but we still have smp_lwsync() ;-)

> Can you name it something different please (and maybe #undef it when
> you're done)?
> 

I've considered #undef-ing it after use, but now I think open-coding this
into PPC's __atomic_op_acquire() is a better idea?


#define __atomic_op_acquire(op, args...)				\
({									\
	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
	__asm__ __volatile__(PPC_ACQUIRE_BARRIER : : : "memory");	\
	__ret;								\
})

PPC_ACQUIRE_BARRIER will be empty if !SMP, so that will become a pure
compiler barrier and just what we need.
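
For completeness, the _release and fully ordered helpers described in the
commit log could then look roughly like this (illustration only, based on
that description rather than on the exact code in the patch: smp_lwsync()
before the relaxed op for release, plus smp_mb__after_atomic() afterwards
for the fully ordered version):

#define __atomic_op_release(op, args...)				\
({									\
	smp_lwsync();							\
	op##_relaxed(args);						\
})

#define __atomic_op_fence(op, args...)					\
({									\
	typeof(op##_relaxed(args)) __ret;				\
	smp_lwsync();							\
	__ret = op##_relaxed(args);					\
	smp_mb__after_atomic();						\
	__ret;								\
})

So e.g. atomic_add_return_release() would become smp_lwsync() followed by
atomic_add_return_relaxed().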

Regards,
Boqun

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-13 13:24   ` [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants Will Deacon
@ 2015-10-13 14:32     ` Boqun Feng
  2015-10-13 14:43       ` Will Deacon
  2015-10-13 14:46       ` Boqun Feng
  0 siblings, 2 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-13 14:32 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 2419 bytes --]

On Tue, Oct 13, 2015 at 02:24:04PM +0100, Will Deacon wrote:
> On Mon, Oct 12, 2015 at 10:14:06PM +0800, Boqun Feng wrote:
> > Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
> > which _release variants can be built.
> > 
> > To avoid superfluous barriers in _acquire variants, we implement these
> > operations with assembly code rather use __atomic_op_acquire() to build
> > them automatically.
> 
> The "superfluous barriers" are for the case where the cmpxchg fails, right?

Yes.

> And you don't do the same thing for release, because you want to avoid a
> barrier in the middle of the critical section?
> 

Mostly because of the comments in include/linux/atomic.h:

 * For compound atomics performing both a load and a store, ACQUIRE
 * semantics apply only to the load and RELEASE semantics only to the
 * store portion of the operation. Note that a failed cmpxchg_acquire
 * does -not- imply any memory ordering constraints.

so I thought only the barrier in cmpxchg_acquire() is conditional, and
the barrier in cmpxchg_release() is not. Maybe we'd better call it out
that cmpxchg *family* doesn't have any order guarantee if cmp fails, as
a complement of

ed2de9f74ecb ("locking/Documentation: Clarify failed cmpxchg() memory ordering semantics")

Because it seems this commit only claims that the barriers in fully
ordered version are conditional.


If cmpxchg_release doesn't have order guarantee when failed, I guess I
can implement it with a barrier in the middle as you mentioned:

	unsigned int prev;

	__asm__ __volatile__ (
"1:	lwarx	%0,0,%2		
	cmpw	0,%0,%3\n\
	bne-	2f\n"
	PPC_RELEASE_BARRIER
"	stwcx.	%4,0,%2\n\
	bne-	1b"
	"\n\
2:"
	: "=&r" (prev), "+m" (*p)
	: "r" (p), "r" (old), "r" (new)
	: "cc", "memory");

	return prev;


However, I need to check whether the architecture allows this and any
other problem exists.

Besides, I don't think it's a good idea to do the "put barrier in the
middle" thing in this patchset, because that seems a premature
optimization and if we go further, I guess we can also replace the
PPC_RELEASE_BARRIER above with a "sync" to implement a fully ordered
version cmpxchg(). Too much needs to investigate then..
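
For the _acquire variants, on the other hand, the superfluous barrier is
avoided simply by placing the acquire barrier after the last conditional
branch, so a failed compare branches past it. Roughly like this (just a
sketch, the actual code in patch 6/6 may differ in details):

	unsigned int prev;

	__asm__ __volatile__ (
"1:	lwarx	%0,0,%2\n"
"	cmpw	0,%0,%3\n"
"	bne-	2f\n"			/* failed compare skips the barrier */
"	stwcx.	%4,0,%2\n"
"	bne-	1b\n"
	PPC_ACQUIRE_BARRIER
"\n"
"2:"
	: "=&r" (prev), "+m" (*p)
	: "r" (p), "r" (old), "r" (new)
	: "cc", "memory");

	return prev;

That way a failed cmpxchg_acquire() costs no more than the _relaxed one.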

> (just checking I understand your reasoning).
> 

That actually helps me find a probably better implementation if allowed,
thank you ;-)

Regards,
Boqun

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-13 14:32     ` Boqun Feng
@ 2015-10-13 14:43       ` Will Deacon
  2015-10-13 14:58         ` Boqun Feng
  2015-10-13 14:46       ` Boqun Feng
  1 sibling, 1 reply; 40+ messages in thread
From: Will Deacon @ 2015-10-13 14:43 UTC (permalink / raw)
  To: Boqun Feng
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

On Tue, Oct 13, 2015 at 10:32:59PM +0800, Boqun Feng wrote:
> On Tue, Oct 13, 2015 at 02:24:04PM +0100, Will Deacon wrote:
> > On Mon, Oct 12, 2015 at 10:14:06PM +0800, Boqun Feng wrote:
> > > Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
> > > which _release variants can be built.
> > > 
> > > To avoid superfluous barriers in _acquire variants, we implement these
> > > operations with assembly code rather use __atomic_op_acquire() to build
> > > them automatically.
> > 
> > The "superfluous barriers" are for the case where the cmpxchg fails, right?
> 
> Yes.
> 
> > And you don't do the same thing for release, because you want to avoid a
> > barrier in the middle of the critical section?
> > 
> 
> Mostly because of the comments in include/linux/atomic.h:
> 
>  * For compound atomics performing both a load and a store, ACQUIRE
>  * semantics apply only to the load and RELEASE semantics only to the
>  * store portion of the operation. Note that a failed cmpxchg_acquire
>  * does -not- imply any memory ordering constraints.
> 
> so I thought only the barrier in cmpxchg_acquire() is conditional, and
> the barrier in cmpxchg_release() is not. Maybe we'd better call it out
> that cmpxchg *family* doesn't have any order guarantee if cmp fails, as
> a complement of
> 
> ed2de9f74ecb ("locking/Documentation: Clarify failed cmpxchg() memory ordering semantics")
> 
> Because it seems this commit only claims that the barriers in fully
> ordered version are conditional.

I didn't think this was ambiguous... A failed cmpxchg_release doesn't
perform a store, so because the RELEASE semantics only apply to the
store portion of the operation, it therefore doesn't have any ordering
guarantees. Acquire is called out as a special case because it *does*
actually perform a load on the failure case.

> If cmpxchg_release doesn't have order guarantee when failed, I guess I
> can implement it with a barrier in the middle as you mentioned:
> 
> 	unsigned int prev;
> 
> 	__asm__ __volatile__ (
> "1:	lwarx	%0,0,%2		
> 	cmpw	0,%0,%3\n\
> 	bne-	2f\n"
> 	PPC_RELEASE_BARRIER
> "	stwcx.	%4,0,%2\n\
> 	bne-	1b"
> 	"\n\
> 2:"
> 	: "=&r" (prev), "+m" (*p)
> 	: "r" (p), "r" (old), "r" (new)
> 	: "cc", "memory");
> 
> 	return prev;
> 
> 
> However, I need to check whether the architecture allows this and any
> other problem exists.
> 
> Besides, I don't think it's a good idea to do the "put barrier in the
> middle" thing in this patchset, because that seems a premature
> optimization and if we go further, I guess we can also replace the
> PPC_RELEASE_BARRIER above with a "sync" to implement a fully ordered
> version cmpxchg(). Too much needs to investigate then..

Putting a barrier in the middle of that critical section is probably a
terrible idea, and that's why I thought you were avoiding it (hence my
original question). Perhaps just add a comment to that effect, since I
fear adding more words to memory-barriers.txt is just likely to create
further confusion.

Will

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-13 14:32     ` Boqun Feng
  2015-10-13 14:43       ` Will Deacon
@ 2015-10-13 14:46       ` Boqun Feng
  1 sibling, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-13 14:46 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 2767 bytes --]

On Tue, Oct 13, 2015 at 10:32:59PM +0800, Boqun Feng wrote:
> On Tue, Oct 13, 2015 at 02:24:04PM +0100, Will Deacon wrote:
> > On Mon, Oct 12, 2015 at 10:14:06PM +0800, Boqun Feng wrote:
> > > Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
> > > which _release variants can be built.
> > > 
> > > To avoid superfluous barriers in _acquire variants, we implement these
> > > operations with assembly code rather use __atomic_op_acquire() to build
> > > them automatically.
> > 
> > The "superfluous barriers" are for the case where the cmpxchg fails, right?
> 
> Yes.
> 
> > And you don't do the same thing for release, because you want to avoid a
> > barrier in the middle of the critical section?
> > 
> 
> Mostly because of the comments in include/linux/atomic.h:
> 
>  * For compound atomics performing both a load and a store, ACQUIRE
>  * semantics apply only to the load and RELEASE semantics only to the
>  * store portion of the operation. Note that a failed cmpxchg_acquire
>  * does -not- imply any memory ordering constraints.
> 
> so I thought only the barrier in cmpxchg_acquire() is conditional, and
> the barrier in cmpxchg_release() is not. Maybe we'd better call it out
> that cmpxchg *family* doesn't have any order guarantee if cmp fails, as
> a complement of
> 
> ed2de9f74ecb ("locking/Documentation: Clarify failed cmpxchg() memory ordering semantics")
> 
> Because it seems this commit only claims that the barriers in fully
> ordered version are conditional.
> 
> 
> If cmpxchg_release doesn't have order guarantee when failed, I guess I
> can implement it with a barrier in the middle as you mentioned:
> 
> 	unsigned int prev;
> 
> 	__asm__ __volatile__ (
> "1:	lwarx	%0,0,%2		
> 	cmpw	0,%0,%3\n\
> 	bne-	2f\n"
> 	PPC_RELEASE_BARRIER
> "	stwcx.	%4,0,%2\n\
> 	bne-	1b"
> 	"\n\
> 2:"
> 	: "=&r" (prev), "+m" (*p)
> 	: "r" (p), "r" (old), "r" (new)
> 	: "cc", "memory");
> 
> 	return prev;
> 
> 
> However, I need to check whether the architecture allows this and any
> other problem exists.
> 
> Besides, I don't think it's a good idea to do the "put barrier in the
> middle" thing in this patchset, because that seems a premature
> optimization and if we go further, I guess we can also replace the
> PPC_RELEASE_BARRIER above with a "sync" to implement a fully ordered

Correction: we can't just put the sync in the middle to implement a
fully ordered version. Sorry for that mistake..

Regards,
Boqun

> version cmpxchg(). Too much needs to investigate then..
> 
> > (just checking I understand your reasoning).
> > 
> 
> That actually helps me find a probably better implementation if allowed,
> thank you ;-)
> 
> Regards,
> Boqun

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-13 14:43       ` Will Deacon
@ 2015-10-13 14:58         ` Boqun Feng
  2015-10-13 15:04           ` Will Deacon
  0 siblings, 1 reply; 40+ messages in thread
From: Boqun Feng @ 2015-10-13 14:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 2967 bytes --]

On Tue, Oct 13, 2015 at 03:43:33PM +0100, Will Deacon wrote:
> On Tue, Oct 13, 2015 at 10:32:59PM +0800, Boqun Feng wrote:
[snip]
> > 
> > Mostly because of the comments in include/linux/atomic.h:
> > 
> >  * For compound atomics performing both a load and a store, ACQUIRE
> >  * semantics apply only to the load and RELEASE semantics only to the
> >  * store portion of the operation. Note that a failed cmpxchg_acquire
> >  * does -not- imply any memory ordering constraints.
> > 
> > so I thought only the barrier in cmpxchg_acquire() is conditional, and
> > the barrier in cmpxchg_release() is not. Maybe we'd better call it out
> > that cmpxchg *family* doesn't have any order guarantee if cmp fails, as
> > a complement of
> > 
> > ed2de9f74ecb ("locking/Documentation: Clarify failed cmpxchg() memory ordering semantics")
> > 
> > Because it seems this commit only claims that the barriers in fully
> > ordered version are conditional.
> 
> I didn't think this was ambiguous... A failed cmpxchg_release doesn't
> perform a store, so because the RELEASE semantics only apply to the
> store portion of the operation, it therefore doesn't have any ordering
> guarantees. Acquire is called out as a special case because it *does*
> actually perform a load on the failure case.
> 

Makes sense.

> > If cmpxchg_release doesn't have order guarantee when failed, I guess I
> > can implement it with a barrier in the middle as you mentioned:
> > 
> > 	unsigned int prev;
> > 
> > 	__asm__ __volatile__ (
> > "1:	lwarx	%0,0,%2		
> > 	cmpw	0,%0,%3\n\
> > 	bne-	2f\n"
> > 	PPC_RELEASE_BARRIER
> > "	stwcx.	%4,0,%2\n\
> > 	bne-	1b"
> > 	"\n\
> > 2:"
> > 	: "=&r" (prev), "+m" (*p)
> > 	: "r" (p), "r" (old), "r" (new)
> > 	: "cc", "memory");
> > 
> > 	return prev;
> > 
> > 
> > However, I need to check whether the architecture allows this and any
> > other problem exists.
> > 
> > Besides, I don't think it's a good idea to do the "put barrier in the
> > middle" thing in this patchset, because that seems a premature
> > optimization and if we go further, I guess we can also replace the
> > PPC_RELEASE_BARRIER above with a "sync" to implement a fully ordered
> > version cmpxchg(). Too much needs to investigate then..
> 
> Putting a barrier in the middle of that critical section is probably a
> terrible idea, and that's why I thought you were avoiding it (hence my

The fact is that I haven't thought of that way to implement
cmpxchg_release before you ask that question ;-) And I'm not going to do
that for now and probably not in the future.

> original question). Perhaps just add a comment to that effect, since I

Are you suggesting that if I put a barrier in the middle I'd better add a
comment, right? So if I don't do that, it's OK to leave this patch as it is.

Regards,
Boqun

> fear adding more words to memory-barriers.txt is just likely to create
> further confusion.
> 
> Will

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-13 14:58         ` Boqun Feng
@ 2015-10-13 15:04           ` Will Deacon
  2015-10-13 15:45             ` Boqun Feng
  2015-10-14  1:47             ` Boqun Feng
  0 siblings, 2 replies; 40+ messages in thread
From: Will Deacon @ 2015-10-13 15:04 UTC (permalink / raw)
  To: Boqun Feng
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

On Tue, Oct 13, 2015 at 10:58:30PM +0800, Boqun Feng wrote:
> On Tue, Oct 13, 2015 at 03:43:33PM +0100, Will Deacon wrote:
> > Putting a barrier in the middle of that critical section is probably a
> > terrible idea, and that's why I thought you were avoiding it (hence my
> 
> The fact is that I haven't thought of that way to implement
> cmpxchg_release before you ask that question ;-) And I'm not going to do
> that for now and probably not in the future.
> 
> > original question). Perhaps just add a comment to that effect, since I
> 
> Are you suggesting that if I put a barrier in the middle I'd better add a
> comment, right? So if I don't do that, it's OK to leave this patch as it is.

No, I mean put a comment in your file to explain the reason why you
override _relaxed and _acquire, but not _release (because overriding
_release would introduce this weird barrier in the middle of the critical
section, which would likely cause the conditional store to fail).

Will

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-13 15:04           ` Will Deacon
@ 2015-10-13 15:45             ` Boqun Feng
  2015-10-14  1:47             ` Boqun Feng
  1 sibling, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-13 15:45 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 1148 bytes --]

On Tue, Oct 13, 2015 at 04:04:27PM +0100, Will Deacon wrote:
> On Tue, Oct 13, 2015 at 10:58:30PM +0800, Boqun Feng wrote:
> > On Tue, Oct 13, 2015 at 03:43:33PM +0100, Will Deacon wrote:
> > > Putting a barrier in the middle of that critical section is probably a
> > > terrible idea, and that's why I thought you were avoiding it (hence my
> > 
> > The fact is that I haven't thought of that way to implement
> > cmpxchg_release before you ask that question ;-) And I'm not going to do
> > that for now and probably not in the future.
> > 
> > > original question). Perhaps just add a comment to that effect, since I
> > 
> > Are you suggesting that if I put a barrier in the middle I'd better add a
> > comment, right? So if I don't do that, it's OK to leave this patch as it is.
> 
> No, I mean put a comment in your file to explain the reason why you
> override _relaxed and _acquire, but not _release (because overriding
> _release would introduce this weird barrier in the middle of the critical
> section, which would likely cause the conditional store to fail).
> 

Good idea, will do that. Thank you ;-)
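
Probably something along these lines (only a sketch of the wording, to be
refined):

/*
 * The cmpxchg family has no ordering guarantee if the compare part fails,
 * so we implement the fully ordered and _acquire versions with assembly
 * code (rather than building them from _relaxed) to avoid superfluous
 * barriers in that case.  We don't do the same for _release, because that
 * would put a barrier in the middle of the ll/sc loop, which would likely
 * cause the conditional store to fail.
 */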

Regards,
Boqun

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics
  2015-10-13 12:27 ` [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Peter Zijlstra
@ 2015-10-13 15:46   ` Paul E. McKenney
  0 siblings, 0 replies; 40+ messages in thread
From: Paul E. McKenney @ 2015-10-13 15:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Boqun Feng, linux-kernel, linuxppc-dev, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Will Deacon, Waiman Long, Davidlohr Bueso

On Tue, Oct 13, 2015 at 02:27:13PM +0200, Peter Zijlstra wrote:
> On Mon, Oct 12, 2015 at 10:14:00PM +0800, Boqun Feng wrote:
> 
> > The patchset consists of 6 parts:
> > 
> > 1.	Make xchg, cmpxchg and their atomic_ versions a full barrier
> > 
> > 2.	Add trivial tests for the new variants in lib/atomic64_test.c
> > 
> > 3.	Allow architectures to define their own __atomic_op_*() helpers
> > 	to build other variants based on relaxed.
> > 
> > 4.	Implement atomic{,64}_{add,sub,inc,dec}_return_* variants
> > 
> > 5.	Implement xchg_* and atomic{,64}_xchg_* variants
> > 
> > 6.	Implement cmpxchg_* atomic{,64}_cmpxchg_* variants
> > 
> > 
> > This patchset is based on current locking/core branch of the tip tree
> > and all patches are built and boot tested for little endian pseries, and
> > also tested by 0day.
> 
> I don't see any immediate problems with this series at this point. Will,
> Paul?

Every time I have gotten ready to take a close look, someone has pointed
out a problem, and I have deferred until the next version.  Looks like
I should take a close look at Boqun's next version regardless.  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
  2015-10-12 14:30 ` [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier Boqun Feng
@ 2015-10-14  0:10   ` Michael Ellerman
  2015-10-14  0:51     ` Boqun Feng
  0 siblings, 1 reply; 40+ messages in thread
From: Michael Ellerman @ 2015-10-14  0:10 UTC (permalink / raw)
  To: Boqun Feng
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Will Deacon, Paul E. McKenney, Waiman Long, Davidlohr Bueso,
	stable

On Mon, 2015-10-12 at 22:30 +0800, Boqun Feng wrote:
> According to memory-barriers.txt, xchg, cmpxchg and their atomic{,64}_
> versions all need to imply a full barrier, however they are now just
> RELEASE+ACQUIRE, which is not a full barrier.
> 
> So replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with
> PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in
> __{cmp,}xchg_{u32,u64} respectively to guarantee a full barrier
> semantics of atomic{,64}_{cmp,}xchg() and {cmp,}xchg().
> 
> This patch is a complement of commit b97021f85517 ("powerpc: Fix
> atomic_xxx_return barrier semantics").
> 
> Cc: <stable@vger.kernel.org> # 3.4.y-
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  arch/powerpc/include/asm/cmpxchg.h | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)

Hi Boqun,

Thanks for fixing this. In future you should send a patch like this as a
separate patch. I've not been paying attention to it because I assumed it was
part of your full series and was still under discussion like the other patches.

I don't think we've seen any crashes caused by this have we? So I guess I'll
put it in next to let it get some wider testing rather than sending it straight
to Linus.

To be clear you're doing:

> -	PPC_RELEASE_BARRIER
> +	PPC_ATOMIC_ENTRY_BARRIER

Which is correct but doesn't actually change anything at the moment, because
both macros turn into LWSYNC.

On the other hand:

> -	PPC_ACQUIRE_BARRIER
> +	PPC_ATOMIC_EXIT_BARRIER

Is replacing an isync (which is then patched to lwsync on some cpus) with a sync.
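
To spell that out, the barriers involved are roughly the following
(simplified for illustration; the real definitions in asm/synch.h are
built with stringify_in_c() and CPU feature fixups):

#define PPC_RELEASE_BARRIER		"lwsync\n"
#define PPC_ATOMIC_ENTRY_BARRIER	"\nlwsync\n"
#define PPC_ACQUIRE_BARRIER		"\nisync\n"	/* patched to lwsync on some cpus */
#define PPC_ATOMIC_EXIT_BARRIER		"\nsync\n"

So the first hunk is a no-op today, while the second one really does
strengthen the exit barrier.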


Also I'm not clear what your stable line means:

> Cc: <stable@vger.kernel.org> # 3.4.y-

Do you mean 3.4 and anything after? I usually write that as 3.4+, but I'm not
sure if that's the correct syntax either.


cheers



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
  2015-10-14  0:10   ` Michael Ellerman
@ 2015-10-14  0:51     ` Boqun Feng
  2015-10-14  8:06       ` Peter Zijlstra
  0 siblings, 1 reply; 40+ messages in thread
From: Boqun Feng @ 2015-10-14  0:51 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Will Deacon, Paul E. McKenney, Waiman Long, Davidlohr Bueso,
	stable

[-- Attachment #1: Type: text/plain, Size: 3330 bytes --]

On Wed, Oct 14, 2015 at 11:10:00AM +1100, Michael Ellerman wrote:
> On Mon, 2015-10-12 at 22:30 +0800, Boqun Feng wrote:
> > According to memory-barriers.txt, xchg, cmpxchg and their atomic{,64}_
> > versions all need to imply a full barrier, however they are now just
> > RELEASE+ACQUIRE, which is not a full barrier.
> > 
> > So replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with
> > PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in
> > __{cmp,}xchg_{u32,u64} respectively to guarantee a full barrier
> > semantics of atomic{,64}_{cmp,}xchg() and {cmp,}xchg().
> > 
> > This patch is a complement of commit b97021f85517 ("powerpc: Fix
> > atomic_xxx_return barrier semantics").
> > 
> > Cc: <stable@vger.kernel.org> # 3.4.y-
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > ---
> >  arch/powerpc/include/asm/cmpxchg.h | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> Hi Boqun,
> 

Hello, Michael

> Thanks for fixing this. In future you should send a patch like this as a
> separate patch. I've not been paying attention to it because I assumed it was

Got it. However, here is the thing: in the previous version, this fix
depended on some of the other patches in this patchset. So to make this
fix apply cleanly, I reordered my patchset to put this patch first, and
the result is that some of the other patches in this patchset now depend
on this one (they need to remove code modified by this patch).

So I guess I'd better stop Cc'ing stable for this one, and wait until
this patchset is merged, then send a separate patch for the -stable
tree. Does that work for you? I think this is what Peter wanted to
suggest when he asked me about this, right, Peter?

> part of your full series and was still under discussion like the other patches.
> 
> I don't think we've seen any crashes caused by this have we? So I guess I'll

No, we haven't seen any.

> put it in next to let it get some wider testing rather than sending it straight
> to Linus.
> 

Good idea, thank you ;-)

> To be clear you're doing:
> 
> > -	PPC_RELEASE_BARRIER
> > +	PPC_ATOMIC_ENTRY_BARRIER
> 
> Which is correct but doesn't actually change anything at the moment, because
> both macros turn into LWSYNC.
> 
> On the other hand:
> 
> > -	PPC_ACQUIRE_BARRIER
> > +	PPC_ATOMIC_EXIT_BARRIER
> 
> Is replacing an isync (which is then patched to lwsync on some cpus) with a sync.
> 

These macros were introduced by commit b97021f85517 ("powerpc: Fix
atomic_xxx_return barrier semantics") to fix a similar problem, so I use
them here to keep the code similar.

> 
> Also I'm not clear what your stable line means:
> 
> > Cc: <stable@vger.kernel.org> # 3.4.y-
> 
> Do you mean 3.4 and anything after? I usually write that as 3.4+, but I'm not
> sure if that's the correct syntax either.
> 

Quote from Documentation/stable_kernel_rules.txt:

"""
Also, some patches may have kernel version prerequisites.  This can be
specified in the following format in the sign-off area:

     Cc:  <stable@vger.kernel.org> # 3.3.x-

   The tag has the meaning of:
     git cherry-pick <this commit>

   For each "-stable" tree starting with the specified version.
"""

But yes, I have seen several people use something like "3.4+"; I'm not
sure which is correct either.

Regards,
Boqun

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants
  2015-10-13 13:35       ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{, 64}_*_return_* variants Boqun Feng
@ 2015-10-14  1:00         ` Boqun Feng
  -1 siblings, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2015-10-14  1:00 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 1538 bytes --]

On Tue, Oct 13, 2015 at 09:35:54PM +0800, Boqun Feng wrote:
> On Tue, Oct 13, 2015 at 02:21:32PM +0100, Will Deacon wrote:
> > On Mon, Oct 12, 2015 at 10:14:04PM +0800, Boqun Feng wrote:
> [snip]
> > > +/*
> > > + * Since {add,sub}_return_relaxed and xchg_relaxed are implemented with
> > > + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> > > + * on the platform without lwsync.
> > > + */
> > > +#ifdef CONFIG_SMP
> > > +#define smp_acquire_barrier__after_atomic() \
> > > +	__asm__ __volatile__(PPC_ACQUIRE_BARRIER : : : "memory")
> > 
> > I'm not keen on this barrier, as it sounds like it's part of the kernel
> > memory model, as opposed to an implementation detail on PowerPC (and
> > we've already got enough of that in the generic code ;).
> > 
> 
> Indeed, but we still have smp_lwsync() ;-)
> 
> > Can you name it something different please (and maybe #undef it when
> > you're done)?
> > 
> 
> I've considered #undef-ing it after use, but now I think open-coding this
> into PPC's __atomic_op_acquire() is a better idea?
> 
> 
> #define __atomic_op_acquire(op, args...)				\
> ({									\
> 	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
> 	__asm__ __volatile__(PPC_ACQUIRE_BARRIER : : : "memory");	\

Should be:
	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");
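
(Without the trailing "" the statement would have no assembler template at
all on !SMP, where PPC_ACQUIRE_BARRIER expands to nothing; with it, the
!SMP case degenerates to

	__asm__ __volatile__("" : : : "memory");

i.e. the pure compiler barrier mentioned below.)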

> 	__ret;								\
> })
> 
> PPC_ACQUIRE_BARRIER will be empty if !SMP, so that will become a pure
> compiler barrier and just what we need.
> 
> Regards,
> Boqun

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-13 15:04           ` Will Deacon
  2015-10-13 15:45             ` Boqun Feng
@ 2015-10-14  1:47             ` Boqun Feng
  2015-10-14  9:40               ` Will Deacon
  1 sibling, 1 reply; 40+ messages in thread
From: Boqun Feng @ 2015-10-14  1:47 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

[-- Attachment #1: Type: text/plain, Size: 1161 bytes --]

On Tue, Oct 13, 2015 at 04:04:27PM +0100, Will Deacon wrote:
> On Tue, Oct 13, 2015 at 10:58:30PM +0800, Boqun Feng wrote:
> > On Tue, Oct 13, 2015 at 03:43:33PM +0100, Will Deacon wrote:
> > > Putting a barrier in the middle of that critical section is probably a
> > > terrible idea, and that's why I thought you were avoiding it (hence my
> > 
> > The fact is that I haven't thought of that way to implement
> > cmpxchg_release before you ask that question ;-) And I'm not going to do
> > that for now and probably not in the future.
> > 
> > > original question). Perhaps just add a comment to that effect, since I
> > 
> > Are you suggesting that if I put a barrier in the middle I'd better add a
> > comment, right? So if I don't do that, it's OK to leave this patch as it is.
> 
> No, I mean put a comment in your file to explain the reason why you
> override _relaxed and _acquire, but not _release (because overriding

You mean overriding _acquire and the fully ordered version, right?

> _release would introduce this weird barrier in the middle of the critical
> section, which would likely cause the conditional store to fail).
> 
> Will

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
  2015-10-14  0:51     ` Boqun Feng
@ 2015-10-14  8:06       ` Peter Zijlstra
  2015-10-14  9:26         ` Boqun Feng
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2015-10-14  8:06 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Michael Ellerman, linux-kernel, linuxppc-dev, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Will Deacon, Paul E. McKenney, Waiman Long, Davidlohr Bueso,
	stable

On Wed, Oct 14, 2015 at 08:51:34AM +0800, Boqun Feng wrote:
> On Wed, Oct 14, 2015 at 11:10:00AM +1100, Michael Ellerman wrote:

> > Thanks for fixing this. In future you should send a patch like this as a
> > separate patch. I've not been paying attention to it because I assumed it was
> 
> Got it. However, here is the thing: in the previous version, this fix
> depended on some of the other patches in this patchset. So to make this
> fix apply cleanly, I reordered my patchset to put this patch first, and
> the result is that some of the other patches in this patchset now depend
> on this one (they need to remove code modified by this patch).
> 
> So I guess I'd better stop Cc'ing stable for this one, and wait until
> this patchset is merged, then send a separate patch for the -stable
> tree. Does that work for you? I think this is what Peter wanted to
> suggest when he asked me about this, right, Peter?

I don't think I had explicit thoughts about any of that, just that it
might make sense to have this patch not depend on the rest such that it
could indeed be stuffed into stable.

I'll leave the details up to Michael since he's PPC maintainer.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
  2015-10-14  8:06       ` Peter Zijlstra
@ 2015-10-14  9:26         ` Boqun Feng
  2015-10-14  9:33           ` Peter Zijlstra
  0 siblings, 1 reply; 40+ messages in thread
From: Boqun Feng @ 2015-10-14  9:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Ellerman, linux-kernel, linuxppc-dev, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Will Deacon, Paul E. McKenney, Waiman Long, Davidlohr Bueso,
	stable

[-- Attachment #1: Type: text/plain, Size: 1601 bytes --]

On Wed, Oct 14, 2015 at 10:06:13AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 14, 2015 at 08:51:34AM +0800, Boqun Feng wrote:
> > On Wed, Oct 14, 2015 at 11:10:00AM +1100, Michael Ellerman wrote:
> 
> > > Thanks for fixing this. In future you should send a patch like this as a
> > > separate patch. I've not been paying attention to it because I assumed it was
> > 
> > Got it. However, here is the thing: in the previous version, this fix
> > depended on some of the other patches in this patchset. So to make this
> > fix apply cleanly, I reordered my patchset to put this patch first, and
> > the result is that some of the other patches in this patchset now depend
> > on this one (they need to remove code modified by this patch).
> > 
> > So I guess I'd better stop Cc'ing stable for this one, and wait until
> > this patchset is merged, then send a separate patch for the -stable
> > tree. Does that work for you? I think this is what Peter wanted to
> > suggest when he asked me about this, right, Peter?
> 
> I don't think I had explicit thoughts about any of that, just that it
> might make sense to have this patch not depend on the rest such that it
> could indeed be stuffed into stable.
> 

Got that. Sorry for misunderstanding you...

> I'll leave the details up to Michael since he's PPC maintainer.

Michael and Peter, the rest of this patchset depends on commits which are
currently in the locking/core branch of the tip tree, so I would like it
to be queued there as a whole. Besides, I will keep this patch Cc'ed to
stable in future versions; does that work for you both?

Regards,
Boqun

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
  2015-10-14  9:26         ` Boqun Feng
@ 2015-10-14  9:33           ` Peter Zijlstra
  2015-10-14  9:43             ` Michael Ellerman
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2015-10-14  9:33 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Michael Ellerman, linux-kernel, linuxppc-dev, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Will Deacon, Paul E. McKenney, Waiman Long, Davidlohr Bueso,
	stable

On Wed, Oct 14, 2015 at 05:26:53PM +0800, Boqun Feng wrote:
> Michael and Peter, the rest of this patchset depends on commits which are
> currently in the locking/core branch of the tip tree, so I would like it
> to be queued there as a whole. Besides, I will keep this patch Cc'ed to
> stable in future versions; does that work for you both?

From my POV having the Cc stable in there is fine if Michael actually
wants them to go there. GregKH will vacuum them up once they hit Linus'
tree and we don't need to think about it anymore.

Alternatively, Michael could put the patch in a separate branch and we
could both merge that.

Or even, seeing how it's a single patch and git mostly does the right
thing, we could just merge it independently in both trees and let git
sort it out at merge time.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
  2015-10-14  1:47             ` Boqun Feng
@ 2015-10-14  9:40               ` Will Deacon
  0 siblings, 0 replies; 40+ messages in thread
From: Will Deacon @ 2015-10-14  9:40 UTC (permalink / raw)
  To: Boqun Feng
  Cc: linux-kernel, linuxppc-dev, Peter Zijlstra, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Paul E. McKenney, Waiman Long, Davidlohr Bueso

On Wed, Oct 14, 2015 at 09:47:35AM +0800, Boqun Feng wrote:
> On Tue, Oct 13, 2015 at 04:04:27PM +0100, Will Deacon wrote:
> > On Tue, Oct 13, 2015 at 10:58:30PM +0800, Boqun Feng wrote:
> > > On Tue, Oct 13, 2015 at 03:43:33PM +0100, Will Deacon wrote:
> > > > Putting a barrier in the middle of that critical section is probably a
> > > > terrible idea, and that's why I thought you were avoiding it (hence my
> > > 
> > > The fact is that I haven't thought of that way to implement
> > > cmpxchg_release before you ask that question ;-) And I'm not going to do
> > > that for now and probably not in the future.
> > > 
> > > > original question). Perhaps just add a comment to that effect, since I
> > > 
> > > Are you suggesting that if I put a barrier in the middle I'd better add a
> > > comment, right? So if I don't do that, it's OK to leave this patch as it is.
> > 
> > No, I mean put a comment in your file to explain the reason why you
> > override _relaxed and _acquire, but not _release (because overriding
> 
> You mean overriding _acquire and the fully ordered version, right?

Yes, my mistake. Sounds like you get my drift, though.

Will

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier
  2015-10-14  9:33           ` Peter Zijlstra
@ 2015-10-14  9:43             ` Michael Ellerman
  0 siblings, 0 replies; 40+ messages in thread
From: Michael Ellerman @ 2015-10-14  9:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Boqun Feng, linux-kernel, linuxppc-dev, Ingo Molnar,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Will Deacon, Paul E. McKenney, Waiman Long, Davidlohr Bueso,
	stable

On Wed, 2015-10-14 at 11:33 +0200, Peter Zijlstra wrote:
> On Wed, Oct 14, 2015 at 05:26:53PM +0800, Boqun Feng wrote:
> > Michael and Peter, the rest of this patchset depends on commits which are
> > currently in the locking/core branch of the tip tree, so I would like it
> > to be queued there as a whole. Besides, I will keep this patch Cc'ed to
> > stable in future versions; does that work for you both?
> 
> From my POV having the Cc stable in there is fine if Michael actually
> wants them to go there. GregKH will vacuum them up once they hit Linus'
> tree and we don't need to think about it anymore.

Yeah that's fine by me. Here's an Ack if you want one:

Acked-by: Michael Ellerman <mpe@ellerman.id.au>

> Alternatively, Michael could put the patch in a separate branch and we
> could both merge that.
> 
> Or even, seeing how its a single patch and git mostly does the right
> thing, we could just merge it independently in both trees and let git
> sort it out at merge time.

That probably would work, but I don't think it's necessary.

My tree doesn't get much (or any) more testing than linux-next, so as long as
locking/core is in linux-next then it will be tested just fine that way.

cheers



^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2015-10-14  9:44 UTC | newest]

Thread overview: 40+ messages
2015-10-12 14:14 [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Boqun Feng
2015-10-12 14:14 ` [PATCH v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier Boqun Feng
2015-10-12 14:23   ` Boqun Feng
2015-10-12 14:14 ` [PATCH v3 2/6] atomics: Add test for atomic operations with _relaxed variants Boqun Feng
     [not found]   ` <201510122205.Uu3yljqf%fengguang.wu@intel.com>
     [not found]     ` <20151012145652.GJ27351@fixme-laptop.cn.ibm.com>
2015-10-12 15:29       ` [lkp] " Fengguang Wu
2015-10-12 15:42         ` Boqun Feng
2015-10-12 16:02           ` [kbuild-all] " Fengguang Wu
2015-10-12 16:09             ` Fengguang Wu
2015-10-13  1:33             ` Boqun Feng
2015-10-12 14:14 ` [PATCH v3 3/6] atomics: Allow architectures to define their own __atomic_op_* helpers Boqun Feng
2015-10-12 14:14 ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants Boqun Feng
2015-10-12 14:14   ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{, 64}_*_return_* variants Boqun Feng
2015-10-13 13:21   ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants Will Deacon
2015-10-13 13:21     ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{, 64}_*_return_* variants Will Deacon
2015-10-13 13:35     ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants Boqun Feng
2015-10-13 13:35       ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{, 64}_*_return_* variants Boqun Feng
2015-10-14  1:00       ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants Boqun Feng
2015-10-14  1:00         ` [PATCH v3 4/6] powerpc: atomic: Implement atomic{, 64}_*_return_* variants Boqun Feng
2015-10-12 14:14 ` [PATCH v3 5/6] powerpc: atomic: Implement xchg_* and atomic{,64}_xchg_* variants Boqun Feng
2015-10-12 14:14   ` [PATCH v3 5/6] powerpc: atomic: Implement xchg_* and atomic{, 64}_xchg_* variants Boqun Feng
2015-10-12 14:14 ` [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants Boqun Feng
2015-10-12 14:14   ` [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{, 64}_* and atomic{, 64}_cmpxchg_* variants Boqun Feng
2015-10-13 13:24   ` [PATCH v3 6/6] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants Will Deacon
2015-10-13 14:32     ` Boqun Feng
2015-10-13 14:43       ` Will Deacon
2015-10-13 14:58         ` Boqun Feng
2015-10-13 15:04           ` Will Deacon
2015-10-13 15:45             ` Boqun Feng
2015-10-14  1:47             ` Boqun Feng
2015-10-14  9:40               ` Will Deacon
2015-10-13 14:46       ` Boqun Feng
2015-10-12 14:30 ` [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier Boqun Feng
2015-10-14  0:10   ` Michael Ellerman
2015-10-14  0:51     ` Boqun Feng
2015-10-14  8:06       ` Peter Zijlstra
2015-10-14  9:26         ` Boqun Feng
2015-10-14  9:33           ` Peter Zijlstra
2015-10-14  9:43             ` Michael Ellerman
2015-10-13 12:27 ` [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics Peter Zijlstra
2015-10-13 15:46   ` Paul E. McKenney
