linux-kernel.vger.kernel.org archive mirror
* [PATCH -v2 00/33] implement atomic_fetch_$op
@ 2016-05-31 10:19 Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 01/33] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
                   ` (34 more replies)
  0 siblings, 35 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

As there have been a few requests for atomic_fetch_$op primitives, most
recently from Linus, I figured I'd go and implement the lot.

The atomic_fetch_$op primitives differ from the existing atomic_$op_return
primitives by returning the old value instead of the new value. This is
especially useful when the operation is irreversible (like bitops), and
allows for things like test-and-set.
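
For example (a sketch only, not part of this series; the helper name and the
choice of bit 0 are made up), the old value is exactly what a test-and-set
needs:

static inline int example_trylock(atomic_t *v)
{
	/* the old value tells us whether bit 0 was already set */
	return !(atomic_fetch_or(1, v) & 1);
}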

This version incorporates all feedback from last time and is now complete thanks
to Will implementing ARMv8.1-LSE versions.

No known build breakage from the build-bot.

Notes:
 - arc asm/atomic.h is a bit of a mess after the eznps merge; I would
   recommend restructuring or splitting that file, but could not find
   the will to do it.
 - arc, metag and tile could convert to _relaxed.

I'm aiming to merge this for v4.8, which should give it a fair few weeks in -next.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 01/33] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 02/33] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
                   ` (33 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-alpha.patch --]
[-- Type: text/plain, Size: 3062 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives, except they return the value of the
atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
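
(Illustration only, not part of the patch: with an irreversible operation
such as andnot, the _return form cannot tell you whether a bit was set
beforehand, while the fetch form can.  PENDING and the helper name below
are made up.)

static inline int example_test_and_clear_pending(atomic_t *flags)
{
	/* the old value is returned, so we still see whether PENDING was set */
	return atomic_fetch_andnot(PENDING, flags) & PENDING;
}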

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/alpha/include/asm/atomic.h |   65 ++++++++++++++++++++++++++++++++++------
 1 file changed, 56 insertions(+), 9 deletions(-)

--- a/arch/alpha/include/asm/atomic.h
+++ b/arch/alpha/include/asm/atomic.h
@@ -65,6 +65,25 @@ static inline int atomic_##op##_return(i
 	return result;							\
 }
 
+#define ATOMIC_FETCH_OP(op, asm_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	long temp, result;						\
+	smp_mb();							\
+	__asm__ __volatile__(						\
+	"1:	ldl_l %2,%1\n"						\
+	"	" #asm_op " %2,%3,%0\n"					\
+	"	stl_c %0,%1\n"						\
+	"	beq %0,2f\n"						\
+	".subsection 2\n"						\
+	"2:	br 1b\n"						\
+	".previous"							\
+	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
+	:"Ir" (i), "m" (v->counter) : "memory");			\
+	smp_mb();							\
+	return result;							\
+}
+
 #define ATOMIC64_OP(op, asm_op)						\
 static __inline__ void atomic64_##op(long i, atomic64_t * v)		\
 {									\
@@ -101,11 +120,32 @@ static __inline__ long atomic64_##op##_r
 	return result;							\
 }
 
+#define ATOMIC64_FETCH_OP(op, asm_op)					\
+static __inline__ long atomic64_fetch_##op(long i, atomic64_t * v)	\
+{									\
+	long temp, result;						\
+	smp_mb();							\
+	__asm__ __volatile__(						\
+	"1:	ldq_l %2,%1\n"						\
+	"	" #asm_op " %2,%3,%0\n"					\
+	"	stq_c %0,%1\n"						\
+	"	beq %0,2f\n"						\
+	".subsection 2\n"						\
+	"2:	br 1b\n"						\
+	".previous"							\
+	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
+	:"Ir" (i), "m" (v->counter) : "memory");			\
+	smp_mb();							\
+	return result;							\
+}
+
 #define ATOMIC_OPS(op)							\
 	ATOMIC_OP(op, op##l)						\
 	ATOMIC_OP_RETURN(op, op##l)					\
+	ATOMIC_FETCH_OP(op, op##l)					\
 	ATOMIC64_OP(op, op##q)						\
-	ATOMIC64_OP_RETURN(op, op##q)
+	ATOMIC64_OP_RETURN(op, op##q)					\
+	ATOMIC64_FETCH_OP(op, op##q)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
@@ -113,18 +153,25 @@ ATOMIC_OPS(sub)
 #define atomic_andnot atomic_andnot
 #define atomic64_andnot atomic64_andnot
 
-ATOMIC_OP(and, and)
-ATOMIC_OP(andnot, bic)
-ATOMIC_OP(or, bis)
-ATOMIC_OP(xor, xor)
-ATOMIC64_OP(and, and)
-ATOMIC64_OP(andnot, bic)
-ATOMIC64_OP(or, bis)
-ATOMIC64_OP(xor, xor)
+#define atomic_fetch_or atomic_fetch_or
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, asm)						\
+	ATOMIC_OP(op, asm)						\
+	ATOMIC_FETCH_OP(op, asm)					\
+	ATOMIC64_OP(op, asm)						\
+	ATOMIC64_FETCH_OP(op, asm)
+
+ATOMIC_OPS(and, and)
+ATOMIC_OPS(andnot, bic)
+ATOMIC_OPS(or, bis)
+ATOMIC_OPS(xor, xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 02/33] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 01/33] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 03/33] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
                   ` (32 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-arc.patch --]
[-- Type: text/plain, Size: 4617 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives, except they return the value of the
atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arc/include/asm/atomic.h |  103 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 94 insertions(+), 9 deletions(-)

--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -104,6 +104,37 @@ static inline int atomic_##op##_return(i
 	return val;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned int val, orig;						\
+	SCOND_FAIL_RETRY_VAR_DEF                                        \
+									\
+	/*								\
+	 * Explicit full memory barrier needed before/after as		\
+	 * LLOCK/SCOND themselves don't provide any such semantics	\
+	 */								\
+	smp_mb();							\
+									\
+	__asm__ __volatile__(						\
+	"1:	llock   %[orig], [%[ctr]]		\n"		\
+	"	" #asm_op " %[val], %[orig], %[i]	\n"		\
+	"	scond   %[val], [%[ctr]]		\n"		\
+	"						\n"		\
+	SCOND_FAIL_RETRY_ASM						\
+									\
+	: [val]	"=&r"	(val),						\
+	  [orig] "=&r" (orig)						\
+	  SCOND_FAIL_RETRY_VARS						\
+	: [ctr]	"r"	(&v->counter),					\
+	  [i]	"ir"	(i)						\
+	: "cc");							\
+									\
+	smp_mb();							\
+									\
+	return orig;							\
+}
+
 #else	/* !CONFIG_ARC_HAS_LLSC */
 
 #ifndef CONFIG_SMP
@@ -166,21 +197,46 @@ static inline int atomic_##op##_return(i
 	return temp;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long flags;						\
+	unsigned long orig;						\
+									\
+	/*								\
+	 * spin lock/unlock provides the needed smp_mb() before/after	\
+	 */								\
+	atomic_ops_lock(flags);						\
+	orig = v->counter;						\
+	v->counter c_op i;						\
+	atomic_ops_unlock(flags);					\
+									\
+	return orig;							\
+}
+
 #endif /* !CONFIG_ARC_HAS_LLSC */
 
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, add)
 ATOMIC_OPS(sub, -=, sub)
 
 #define atomic_andnot atomic_andnot
 
-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(andnot, &= ~, bic)
-ATOMIC_OP(or, |=, or)
-ATOMIC_OP(xor, ^=, xor)
+#define atomic_fetch_or atomic_fetch_or
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					\
+	ATOMIC_OP(op, c_op, asm_op)					\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(andnot, &= ~, bic)
+ATOMIC_OPS(or, |=, or)
+ATOMIC_OPS(xor, ^=, xor)
 
 #undef SCOND_FAIL_RETRY_VAR_DEF
 #undef SCOND_FAIL_RETRY_ASM
@@ -245,22 +301,51 @@ static inline int atomic_##op##_return(i
 	return temp;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned int temp = i;						\
+									\
+	/* Explicit full memory barrier needed before/after */		\
+	smp_mb();							\
+									\
+	__asm__ __volatile__(						\
+	"	mov r2, %0\n"						\
+	"	mov r3, %1\n"						\
+	"       .word %2\n"						\
+	"	mov %0, r2"						\
+	: "+r"(temp)							\
+	: "r"(&v->counter), "i"(asm_op)					\
+	: "r2", "r3", "memory");					\
+									\
+	smp_mb();							\
+									\
+	return temp;							\
+}
+
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, CTOP_INST_AADD_DI_R2_R2_R3)
 #define atomic_sub(i, v) atomic_add(-(i), (v))
 #define atomic_sub_return(i, v) atomic_add_return(-(i), (v))
 
-ATOMIC_OP(and, &=, CTOP_INST_AAND_DI_R2_R2_R3)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					\
+	ATOMIC_OP(op, c_op, asm_op)					\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC_OPS(and, &=, CTOP_INST_AAND_DI_R2_R2_R3)
 #define atomic_andnot(mask, v) atomic_and(~(mask), (v))
-ATOMIC_OP(or, |=, CTOP_INST_AOR_DI_R2_R2_R3)
-ATOMIC_OP(xor, ^=, CTOP_INST_AXOR_DI_R2_R2_R3)
+ATOMIC_OPS(or, |=, CTOP_INST_AOR_DI_R2_R2_R3)
+ATOMIC_OPS(xor, ^=, CTOP_INST_AXOR_DI_R2_R2_R3)
 
 #endif /* CONFIG_ARC_PLAT_EZNPS */
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 03/33] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 01/33] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 02/33] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 04/33] locking,arm64: " Peter Zijlstra
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-arm.patch --]
[-- Type: text/plain, Size: 5312 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives, except they return the value of the
atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
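
(Illustration only, not part of the patch: the _relaxed variants added here
let callers that only need atomicity, not ordering, avoid the full barriers.
The counter helper below is a made-up example.)

static inline int example_count_hit(atomic_t *hits)
{
	/* no ordering requirement, only atomicity: the relaxed variant suffices */
	return atomic_fetch_add_relaxed(1, hits);
}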

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arm/include/asm/atomic.h |  108 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 98 insertions(+), 10 deletions(-)

--- a/arch/arm/include/asm/atomic.h
+++ b/arch/arm/include/asm/atomic.h
@@ -77,8 +77,36 @@ static inline int atomic_##op##_return_r
 	return result;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op##_relaxed(int i, atomic_t *v)	\
+{									\
+	unsigned long tmp;						\
+	int result, val;						\
+									\
+	prefetchw(&v->counter);						\
+									\
+	__asm__ __volatile__("@ atomic_fetch_" #op "\n"			\
+"1:	ldrex	%0, [%4]\n"						\
+"	" #asm_op "	%1, %0, %5\n"					\
+"	strex	%2, %1, [%4]\n"						\
+"	teq	%2, #0\n"						\
+"	bne	1b"							\
+	: "=&r" (result), "=&r" (val), "=&r" (tmp), "+Qo" (v->counter)	\
+	: "r" (&v->counter), "Ir" (i)					\
+	: "cc");							\
+									\
+	return result;							\
+}
+
 #define atomic_add_return_relaxed	atomic_add_return_relaxed
 #define atomic_sub_return_relaxed	atomic_sub_return_relaxed
+#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+
+#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
+#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot_relaxed
+#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
 
 static inline int atomic_cmpxchg_relaxed(atomic_t *ptr, int old, int new)
 {
@@ -159,6 +187,22 @@ static inline int atomic_##op##_return(i
 	return val;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long flags;						\
+	int val;							\
+									\
+	raw_local_irq_save(flags);					\
+	val = v->counter;						\
+	v->counter c_op i;						\
+	raw_local_irq_restore(flags);					\
+									\
+	return val;							\
+}
+
+#define atomic_fetch_or atomic_fetch_or
+
 static inline int atomic_cmpxchg(atomic_t *v, int old, int new)
 {
 	int ret;
@@ -187,19 +231,26 @@ static inline int __atomic_add_unless(at
 
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, add)
 ATOMIC_OPS(sub, -=, sub)
 
 #define atomic_andnot atomic_andnot
 
-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(andnot, &= ~, bic)
-ATOMIC_OP(or,  |=, orr)
-ATOMIC_OP(xor, ^=, eor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					\
+	ATOMIC_OP(op, c_op, asm_op)					\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(andnot, &= ~, bic)
+ATOMIC_OPS(or,  |=, orr)
+ATOMIC_OPS(xor, ^=, eor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -317,24 +368,61 @@ atomic64_##op##_return_relaxed(long long
 	return result;							\
 }
 
+#define ATOMIC64_FETCH_OP(op, op1, op2)					\
+static inline long long							\
+atomic64_fetch_##op##_relaxed(long long i, atomic64_t *v)		\
+{									\
+	long long result, val;						\
+	unsigned long tmp;						\
+									\
+	prefetchw(&v->counter);						\
+									\
+	__asm__ __volatile__("@ atomic64_fetch_" #op "\n"		\
+"1:	ldrexd	%0, %H0, [%4]\n"					\
+"	" #op1 " %Q1, %Q0, %Q5\n"					\
+"	" #op2 " %R1, %R0, %R5\n"					\
+"	strexd	%2, %1, %H1, [%4]\n"					\
+"	teq	%2, #0\n"						\
+"	bne	1b"							\
+	: "=&r" (result), "=&r" (val), "=&r" (tmp), "+Qo" (v->counter)	\
+	: "r" (&v->counter), "r" (i)					\
+	: "cc");							\
+									\
+	return result;							\
+}
+
 #define ATOMIC64_OPS(op, op1, op2)					\
 	ATOMIC64_OP(op, op1, op2)					\
-	ATOMIC64_OP_RETURN(op, op1, op2)
+	ATOMIC64_OP_RETURN(op, op1, op2)				\
+	ATOMIC64_FETCH_OP(op, op1, op2)
 
 ATOMIC64_OPS(add, adds, adc)
 ATOMIC64_OPS(sub, subs, sbc)
 
 #define atomic64_add_return_relaxed	atomic64_add_return_relaxed
 #define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+
+#undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, op1, op2)					\
+	ATOMIC64_OP(op, op1, op2)					\
+	ATOMIC64_FETCH_OP(op, op1, op2)
 
 #define atomic64_andnot atomic64_andnot
 
-ATOMIC64_OP(and, and, and)
-ATOMIC64_OP(andnot, bic, bic)
-ATOMIC64_OP(or,  orr, orr)
-ATOMIC64_OP(xor, eor, eor)
+ATOMIC64_OPS(and, and, and)
+ATOMIC64_OPS(andnot, bic, bic)
+ATOMIC64_OPS(or,  orr, orr)
+ATOMIC64_OPS(xor, eor, eor)
+
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
+#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot_relaxed
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
 
 #undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 04/33] locking,arm64: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (2 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 03/33] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 05/33] arm64: atomic: generate LSE non-return cases using common macros Peter Zijlstra
                   ` (30 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-arm64.patch --]
[-- Type: text/plain, Size: 9201 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives, except they return the value of the
atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

[wildea01: compile fixes for ll/sc]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arm64/include/asm/atomic.h       |   62 +++++++++++++++++++
 arch/arm64/include/asm/atomic_ll_sc.h |  110 ++++++++++++++++++++++++++--------
 2 files changed, 148 insertions(+), 24 deletions(-)

--- a/arch/arm64/include/asm/atomic.h
+++ b/arch/arm64/include/asm/atomic.h
@@ -76,6 +76,36 @@
 #define atomic_dec_return_release(v)	atomic_sub_return_release(1, (v))
 #define atomic_dec_return(v)		atomic_sub_return(1, (v))
 
+#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
+#define atomic_fetch_add_acquire	atomic_fetch_add_acquire
+#define atomic_fetch_add_release	atomic_fetch_add_release
+#define atomic_fetch_add		atomic_fetch_add
+
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+#define atomic_fetch_sub_acquire	atomic_fetch_sub_acquire
+#define atomic_fetch_sub_release	atomic_fetch_sub_release
+#define atomic_fetch_sub		atomic_fetch_sub
+
+#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
+#define atomic_fetch_and_acquire	atomic_fetch_and_acquire
+#define atomic_fetch_and_release	atomic_fetch_and_release
+#define atomic_fetch_and		atomic_fetch_and
+
+#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot_relaxed
+#define atomic_fetch_andnot_acquire	atomic_fetch_andnot_acquire
+#define atomic_fetch_andnot_release	atomic_fetch_andnot_release
+#define atomic_fetch_andnot		atomic_fetch_andnot
+
+#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
+#define atomic_fetch_or_acquire		atomic_fetch_or_acquire
+#define atomic_fetch_or_release		atomic_fetch_or_release
+#define atomic_fetch_or			atomic_fetch_or
+
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
+#define atomic_fetch_xor_acquire	atomic_fetch_xor_acquire
+#define atomic_fetch_xor_release	atomic_fetch_xor_release
+#define atomic_fetch_xor		atomic_fetch_xor
+
 #define atomic_xchg_relaxed(v, new)	xchg_relaxed(&((v)->counter), (new))
 #define atomic_xchg_acquire(v, new)	xchg_acquire(&((v)->counter), (new))
 #define atomic_xchg_release(v, new)	xchg_release(&((v)->counter), (new))
@@ -98,6 +128,8 @@
 #define __atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
 #define atomic_andnot			atomic_andnot
 
+#define atomic_fetch_or atomic_fetch_or
+
 /*
  * 64-bit atomic operations.
  */
@@ -125,6 +157,36 @@
 #define atomic64_dec_return_release(v)	atomic64_sub_return_release(1, (v))
 #define atomic64_dec_return(v)		atomic64_sub_return(1, (v))
 
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
+#define atomic64_fetch_add_acquire	atomic64_fetch_add_acquire
+#define atomic64_fetch_add_release	atomic64_fetch_add_release
+#define atomic64_fetch_add		atomic64_fetch_add
+
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+#define atomic64_fetch_sub_acquire	atomic64_fetch_sub_acquire
+#define atomic64_fetch_sub_release	atomic64_fetch_sub_release
+#define atomic64_fetch_sub		atomic64_fetch_sub
+
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
+#define atomic64_fetch_and_acquire	atomic64_fetch_and_acquire
+#define atomic64_fetch_and_release	atomic64_fetch_and_release
+#define atomic64_fetch_and		atomic64_fetch_and
+
+#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot_relaxed
+#define atomic64_fetch_andnot_acquire	atomic64_fetch_andnot_acquire
+#define atomic64_fetch_andnot_release	atomic64_fetch_andnot_release
+#define atomic64_fetch_andnot		atomic64_fetch_andnot
+
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
+#define atomic64_fetch_or_acquire	atomic64_fetch_or_acquire
+#define atomic64_fetch_or_release	atomic64_fetch_or_release
+#define atomic64_fetch_or		atomic64_fetch_or
+
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
+#define atomic64_fetch_xor_acquire	atomic64_fetch_xor_acquire
+#define atomic64_fetch_xor_release	atomic64_fetch_xor_release
+#define atomic64_fetch_xor		atomic64_fetch_xor
+
 #define atomic64_xchg_relaxed		atomic_xchg_relaxed
 #define atomic64_xchg_acquire		atomic_xchg_acquire
 #define atomic64_xchg_release		atomic_xchg_release
--- a/arch/arm64/include/asm/atomic_ll_sc.h
+++ b/arch/arm64/include/asm/atomic_ll_sc.h
@@ -77,26 +77,57 @@ __LL_SC_PREFIX(atomic_##op##_return##nam
 }									\
 __LL_SC_EXPORT(atomic_##op##_return##name);
 
+#define ATOMIC_FETCH_OP(name, mb, acq, rel, cl, op, asm_op)		\
+__LL_SC_INLINE int							\
+__LL_SC_PREFIX(atomic_fetch_##op##name(int i, atomic_t *v))		\
+{									\
+	unsigned long tmp;						\
+	int val, result;						\
+									\
+	asm volatile("// atomic_fetch_" #op #name "\n"			\
+"	prfm	pstl1strm, %3\n"					\
+"1:	ld" #acq "xr	%w0, %3\n"					\
+"	" #asm_op "	%w1, %w0, %w4\n"				\
+"	st" #rel "xr	%w2, %w1, %3\n"					\
+"	cbnz	%w2, 1b\n"						\
+"	" #mb								\
+	: "=&r" (result), "=&r" (val), "=&r" (tmp), "+Q" (v->counter)	\
+	: "Ir" (i)							\
+	: cl);								\
+									\
+	return result;							\
+}									\
+__LL_SC_EXPORT(atomic_fetch_##op##name);
+
 #define ATOMIC_OPS(...)							\
 	ATOMIC_OP(__VA_ARGS__)						\
-	ATOMIC_OP_RETURN(        , dmb ish,  , l, "memory", __VA_ARGS__)
-
-#define ATOMIC_OPS_RLX(...)						\
-	ATOMIC_OPS(__VA_ARGS__)						\
+	ATOMIC_OP_RETURN(        , dmb ish,  , l, "memory", __VA_ARGS__)\
 	ATOMIC_OP_RETURN(_relaxed,        ,  ,  ,         , __VA_ARGS__)\
 	ATOMIC_OP_RETURN(_acquire,        , a,  , "memory", __VA_ARGS__)\
-	ATOMIC_OP_RETURN(_release,        ,  , l, "memory", __VA_ARGS__)
+	ATOMIC_OP_RETURN(_release,        ,  , l, "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (        , dmb ish,  , l, "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_relaxed,        ,  ,  ,         , __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_acquire,        , a,  , "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_release,        ,  , l, "memory", __VA_ARGS__)
 
-ATOMIC_OPS_RLX(add, add)
-ATOMIC_OPS_RLX(sub, sub)
+ATOMIC_OPS(add, add)
+ATOMIC_OPS(sub, sub)
 
-ATOMIC_OP(and, and)
-ATOMIC_OP(andnot, bic)
-ATOMIC_OP(or, orr)
-ATOMIC_OP(xor, eor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(...)							\
+	ATOMIC_OP(__VA_ARGS__)						\
+	ATOMIC_FETCH_OP (        , dmb ish,  , l, "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_relaxed,        ,  ,  ,         , __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_acquire,        , a,  , "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_release,        ,  , l, "memory", __VA_ARGS__)
+
+ATOMIC_OPS(and, and)
+ATOMIC_OPS(andnot, bic)
+ATOMIC_OPS(or, orr)
+ATOMIC_OPS(xor, eor)
 
-#undef ATOMIC_OPS_RLX
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -140,26 +171,57 @@ __LL_SC_PREFIX(atomic64_##op##_return##n
 }									\
 __LL_SC_EXPORT(atomic64_##op##_return##name);
 
+#define ATOMIC64_FETCH_OP(name, mb, acq, rel, cl, op, asm_op)		\
+__LL_SC_INLINE long							\
+__LL_SC_PREFIX(atomic64_fetch_##op##name(long i, atomic64_t *v))	\
+{									\
+	long result, val;						\
+	unsigned long tmp;						\
+									\
+	asm volatile("// atomic64_fetch_" #op #name "\n"		\
+"	prfm	pstl1strm, %3\n"					\
+"1:	ld" #acq "xr	%0, %3\n"					\
+"	" #asm_op "	%1, %0, %4\n"					\
+"	st" #rel "xr	%w2, %1, %3\n"					\
+"	cbnz	%w2, 1b\n"						\
+"	" #mb								\
+	: "=&r" (result), "=&r" (val), "=&r" (tmp), "+Q" (v->counter)	\
+	: "Ir" (i)							\
+	: cl);								\
+									\
+	return result;							\
+}									\
+__LL_SC_EXPORT(atomic64_fetch_##op##name);
+
 #define ATOMIC64_OPS(...)						\
 	ATOMIC64_OP(__VA_ARGS__)					\
-	ATOMIC64_OP_RETURN(, dmb ish,  , l, "memory", __VA_ARGS__)
-
-#define ATOMIC64_OPS_RLX(...)						\
-	ATOMIC64_OPS(__VA_ARGS__)					\
+	ATOMIC64_OP_RETURN(, dmb ish,  , l, "memory", __VA_ARGS__)	\
 	ATOMIC64_OP_RETURN(_relaxed,,  ,  ,         , __VA_ARGS__)	\
 	ATOMIC64_OP_RETURN(_acquire,, a,  , "memory", __VA_ARGS__)	\
-	ATOMIC64_OP_RETURN(_release,,  , l, "memory", __VA_ARGS__)
+	ATOMIC64_OP_RETURN(_release,,  , l, "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (, dmb ish,  , l, "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_relaxed,,  ,  ,         , __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_acquire,, a,  , "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_release,,  , l, "memory", __VA_ARGS__)
 
-ATOMIC64_OPS_RLX(add, add)
-ATOMIC64_OPS_RLX(sub, sub)
+ATOMIC64_OPS(add, add)
+ATOMIC64_OPS(sub, sub)
 
-ATOMIC64_OP(and, and)
-ATOMIC64_OP(andnot, bic)
-ATOMIC64_OP(or, orr)
-ATOMIC64_OP(xor, eor)
+#undef ATOMIC64_OPS
+#define ATOMIC64_OPS(...)						\
+	ATOMIC64_OP(__VA_ARGS__)					\
+	ATOMIC64_FETCH_OP (, dmb ish,  , l, "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_relaxed,,  ,  ,         , __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_acquire,, a,  , "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_release,,  , l, "memory", __VA_ARGS__)
+
+ATOMIC64_OPS(and, and)
+ATOMIC64_OPS(andnot, bic)
+ATOMIC64_OPS(or, orr)
+ATOMIC64_OPS(xor, eor)
 
-#undef ATOMIC64_OPS_RLX
 #undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 05/33] arm64: atomic: generate LSE non-return cases using common macros
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (3 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 04/33] locking,arm64: " Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 06/33] locking,arm64: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() for LSE instructions Peter Zijlstra
                   ` (29 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: will_deacon-arm64-atomic__generate_lse_non-return_cases_using_common_macros.patch --]
[-- Type: text/plain, Size: 4520 bytes --]

From: Will Deacon <will.deacon@arm.com>

atomic[64]_{add,and,andnot,or,xor} all follow the same patterns, so
generate them using macros, like we do for the LL/SC case already.
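
For instance, ATOMIC_OP(or, stset) expands to (roughly) the open-coded
atomic_or() that this patch removes:

static inline void atomic_or(int i, atomic_t *v)
{
	register int w0 asm ("w0") = i;
	register atomic_t *x1 asm ("x1") = v;

	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC(or),
	"	stset	%w[i], %[v]\n")
	: [i] "+r" (w0), [v] "+Q" (v->counter)
	: "r" (x1)
	: __LL_SC_CLOBBERS);
}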

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461344493-8262-1-git-send-email-will.deacon@arm.com
---
 arch/arm64/include/asm/atomic_lse.h |  122 +++++++++---------------------------
 1 file changed, 32 insertions(+), 90 deletions(-)

--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -26,54 +26,25 @@
 #endif
 
 #define __LL_SC_ATOMIC(op)	__LL_SC_CALL(atomic_##op)
-
-static inline void atomic_andnot(int i, atomic_t *v)
-{
-	register int w0 asm ("w0") = i;
-	register atomic_t *x1 asm ("x1") = v;
-
-	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC(andnot),
-	"	stclr	%w[i], %[v]\n")
-	: [i] "+r" (w0), [v] "+Q" (v->counter)
-	: "r" (x1)
-	: __LL_SC_CLOBBERS);
-}
-
-static inline void atomic_or(int i, atomic_t *v)
-{
-	register int w0 asm ("w0") = i;
-	register atomic_t *x1 asm ("x1") = v;
-
-	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC(or),
-	"	stset	%w[i], %[v]\n")
-	: [i] "+r" (w0), [v] "+Q" (v->counter)
-	: "r" (x1)
-	: __LL_SC_CLOBBERS);
+#define ATOMIC_OP(op, asm_op)						\
+static inline void atomic_##op(int i, atomic_t *v)			\
+{									\
+	register int w0 asm ("w0") = i;					\
+	register atomic_t *x1 asm ("x1") = v;				\
+									\
+	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC(op),		\
+"	" #asm_op "	%w[i], %[v]\n")					\
+	: [i] "+r" (w0), [v] "+Q" (v->counter)				\
+	: "r" (x1)							\
+	: __LL_SC_CLOBBERS);						\
 }
 
-static inline void atomic_xor(int i, atomic_t *v)
-{
-	register int w0 asm ("w0") = i;
-	register atomic_t *x1 asm ("x1") = v;
+ATOMIC_OP(andnot, stclr)
+ATOMIC_OP(or, stset)
+ATOMIC_OP(xor, steor)
+ATOMIC_OP(add, stadd)
 
-	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC(xor),
-	"	steor	%w[i], %[v]\n")
-	: [i] "+r" (w0), [v] "+Q" (v->counter)
-	: "r" (x1)
-	: __LL_SC_CLOBBERS);
-}
-
-static inline void atomic_add(int i, atomic_t *v)
-{
-	register int w0 asm ("w0") = i;
-	register atomic_t *x1 asm ("x1") = v;
-
-	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC(add),
-	"	stadd	%w[i], %[v]\n")
-	: [i] "+r" (w0), [v] "+Q" (v->counter)
-	: "r" (x1)
-	: __LL_SC_CLOBBERS);
-}
+#undef ATOMIC_OP
 
 #define ATOMIC_OP_ADD_RETURN(name, mb, cl...)				\
 static inline int atomic_add_return##name(int i, atomic_t *v)		\
@@ -167,54 +138,25 @@ ATOMIC_OP_SUB_RETURN(        , al, "memo
 #undef __LL_SC_ATOMIC
 
 #define __LL_SC_ATOMIC64(op)	__LL_SC_CALL(atomic64_##op)
-
-static inline void atomic64_andnot(long i, atomic64_t *v)
-{
-	register long x0 asm ("x0") = i;
-	register atomic64_t *x1 asm ("x1") = v;
-
-	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC64(andnot),
-	"	stclr	%[i], %[v]\n")
-	: [i] "+r" (x0), [v] "+Q" (v->counter)
-	: "r" (x1)
-	: __LL_SC_CLOBBERS);
-}
-
-static inline void atomic64_or(long i, atomic64_t *v)
-{
-	register long x0 asm ("x0") = i;
-	register atomic64_t *x1 asm ("x1") = v;
-
-	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC64(or),
-	"	stset	%[i], %[v]\n")
-	: [i] "+r" (x0), [v] "+Q" (v->counter)
-	: "r" (x1)
-	: __LL_SC_CLOBBERS);
-}
-
-static inline void atomic64_xor(long i, atomic64_t *v)
-{
-	register long x0 asm ("x0") = i;
-	register atomic64_t *x1 asm ("x1") = v;
-
-	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC64(xor),
-	"	steor	%[i], %[v]\n")
-	: [i] "+r" (x0), [v] "+Q" (v->counter)
-	: "r" (x1)
-	: __LL_SC_CLOBBERS);
+#define ATOMIC64_OP(op, asm_op)						\
+static inline void atomic64_##op(long i, atomic64_t *v)			\
+{									\
+	register long x0 asm ("x0") = i;				\
+	register atomic64_t *x1 asm ("x1") = v;				\
+									\
+	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC64(op),	\
+"	" #asm_op "	%[i], %[v]\n")					\
+	: [i] "+r" (x0), [v] "+Q" (v->counter)				\
+	: "r" (x1)							\
+	: __LL_SC_CLOBBERS);						\
 }
 
-static inline void atomic64_add(long i, atomic64_t *v)
-{
-	register long x0 asm ("x0") = i;
-	register atomic64_t *x1 asm ("x1") = v;
+ATOMIC64_OP(andnot, stclr)
+ATOMIC64_OP(or, stset)
+ATOMIC64_OP(xor, steor)
+ATOMIC64_OP(add, stadd)
 
-	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC64(add),
-	"	stadd	%[i], %[v]\n")
-	: [i] "+r" (x0), [v] "+Q" (v->counter)
-	: "r" (x1)
-	: __LL_SC_CLOBBERS);
-}
+#undef ATOMIC64_OP
 
 #define ATOMIC64_OP_ADD_RETURN(name, mb, cl...)				\
 static inline long atomic64_add_return##name(long i, atomic64_t *v)	\

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 06/33] locking,arm64: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() for LSE instructions
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (4 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 05/33] arm64: atomic: generate LSE non-return cases using common macros Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 07/33] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (28 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: will_deacon-lockingarm64-implement_atomic64_fetch_addsubandandnotorxor_relaxed_acquire_release_for_lse_instructions.patch --]
[-- Type: text/plain, Size: 7216 bytes --]

From: Will Deacon <will.deacon@arm.com>

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives, except they return the value of the
atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

This patch implements the LSE variants.
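
(Illustration only, not part of the patch: LSE provides LDCLR, a fetch that
clears mask bits, but no fetch-and-AND, and LDADD but no fetch-and-SUB; hence
the MVN and NEG preceding the LDCLR/LDADD in the fetch_and and fetch_sub
cases below.  This sketch shows the same trick at the C level, with made-up
helper names.)

static inline int sketch_fetch_and(int mask, atomic_t *v)
{
	/* x & mask == x & ~(~mask): AND is "clear the complement" */
	return atomic_fetch_andnot(~mask, v);	/* MVN + LDCLR */
}

static inline int sketch_fetch_sub(int i, atomic_t *v)
{
	/* SUB is ADD of the negation */
	return atomic_fetch_add(-i, v);		/* NEG + LDADD */
}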

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461344493-8262-2-git-send-email-will.deacon@arm.com
---
 arch/arm64/include/asm/atomic_lse.h |  172 ++++++++++++++++++++++++++++++++++++
 1 file changed, 172 insertions(+)

--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -46,6 +46,38 @@ ATOMIC_OP(add, stadd)
 
 #undef ATOMIC_OP
 
+#define ATOMIC_FETCH_OP(name, mb, op, asm_op, cl...)			\
+static inline int atomic_fetch_##op##name(int i, atomic_t *v)		\
+{									\
+	register int w0 asm ("w0") = i;					\
+	register atomic_t *x1 asm ("x1") = v;				\
+									\
+	asm volatile(ARM64_LSE_ATOMIC_INSN(				\
+	/* LL/SC */							\
+	__LL_SC_ATOMIC(fetch_##op##name),				\
+	/* LSE atomics */						\
+"	" #asm_op #mb "	%w[i], %w[i], %[v]")				\
+	: [i] "+r" (w0), [v] "+Q" (v->counter)				\
+	: "r" (x1)							\
+	: __LL_SC_CLOBBERS, ##cl);					\
+									\
+	return w0;							\
+}
+
+#define ATOMIC_FETCH_OPS(op, asm_op)					\
+	ATOMIC_FETCH_OP(_relaxed,   , op, asm_op)			\
+	ATOMIC_FETCH_OP(_acquire,  a, op, asm_op, "memory")		\
+	ATOMIC_FETCH_OP(_release,  l, op, asm_op, "memory")		\
+	ATOMIC_FETCH_OP(        , al, op, asm_op, "memory")
+
+ATOMIC_FETCH_OPS(andnot, ldclr)
+ATOMIC_FETCH_OPS(or, ldset)
+ATOMIC_FETCH_OPS(xor, ldeor)
+ATOMIC_FETCH_OPS(add, ldadd)
+
+#undef ATOMIC_FETCH_OP
+#undef ATOMIC_FETCH_OPS
+
 #define ATOMIC_OP_ADD_RETURN(name, mb, cl...)				\
 static inline int atomic_add_return##name(int i, atomic_t *v)		\
 {									\
@@ -90,6 +122,33 @@ static inline void atomic_and(int i, ato
 	: __LL_SC_CLOBBERS);
 }
 
+#define ATOMIC_FETCH_OP_AND(name, mb, cl...)				\
+static inline int atomic_fetch_and##name(int i, atomic_t *v)		\
+{									\
+	register int w0 asm ("w0") = i;					\
+	register atomic_t *x1 asm ("x1") = v;				\
+									\
+	asm volatile(ARM64_LSE_ATOMIC_INSN(				\
+	/* LL/SC */							\
+	"	nop\n"							\
+	__LL_SC_ATOMIC(fetch_and##name),				\
+	/* LSE atomics */						\
+	"	mvn	%w[i], %w[i]\n"					\
+	"	ldclr" #mb "	%w[i], %w[i], %[v]")			\
+	: [i] "+r" (w0), [v] "+Q" (v->counter)				\
+	: "r" (x1)							\
+	: __LL_SC_CLOBBERS, ##cl);					\
+									\
+	return w0;							\
+}
+
+ATOMIC_FETCH_OP_AND(_relaxed,   )
+ATOMIC_FETCH_OP_AND(_acquire,  a, "memory")
+ATOMIC_FETCH_OP_AND(_release,  l, "memory")
+ATOMIC_FETCH_OP_AND(        , al, "memory")
+
+#undef ATOMIC_FETCH_OP_AND
+
 static inline void atomic_sub(int i, atomic_t *v)
 {
 	register int w0 asm ("w0") = i;
@@ -135,6 +194,33 @@ ATOMIC_OP_SUB_RETURN(_release,  l, "memo
 ATOMIC_OP_SUB_RETURN(        , al, "memory")
 
 #undef ATOMIC_OP_SUB_RETURN
+
+#define ATOMIC_FETCH_OP_SUB(name, mb, cl...)				\
+static inline int atomic_fetch_sub##name(int i, atomic_t *v)		\
+{									\
+	register int w0 asm ("w0") = i;					\
+	register atomic_t *x1 asm ("x1") = v;				\
+									\
+	asm volatile(ARM64_LSE_ATOMIC_INSN(				\
+	/* LL/SC */							\
+	"	nop\n"							\
+	__LL_SC_ATOMIC(fetch_sub##name),				\
+	/* LSE atomics */						\
+	"	neg	%w[i], %w[i]\n"					\
+	"	ldadd" #mb "	%w[i], %w[i], %[v]")			\
+	: [i] "+r" (w0), [v] "+Q" (v->counter)				\
+	: "r" (x1)							\
+	: __LL_SC_CLOBBERS, ##cl);					\
+									\
+	return w0;							\
+}
+
+ATOMIC_FETCH_OP_SUB(_relaxed,   )
+ATOMIC_FETCH_OP_SUB(_acquire,  a, "memory")
+ATOMIC_FETCH_OP_SUB(_release,  l, "memory")
+ATOMIC_FETCH_OP_SUB(        , al, "memory")
+
+#undef ATOMIC_FETCH_OP_SUB
 #undef __LL_SC_ATOMIC
 
 #define __LL_SC_ATOMIC64(op)	__LL_SC_CALL(atomic64_##op)
@@ -158,6 +244,38 @@ ATOMIC64_OP(add, stadd)
 
 #undef ATOMIC64_OP
 
+#define ATOMIC64_FETCH_OP(name, mb, op, asm_op, cl...)			\
+static inline long atomic64_fetch_##op##name(long i, atomic64_t *v)	\
+{									\
+	register long x0 asm ("x0") = i;				\
+	register atomic64_t *x1 asm ("x1") = v;				\
+									\
+	asm volatile(ARM64_LSE_ATOMIC_INSN(				\
+	/* LL/SC */							\
+	__LL_SC_ATOMIC64(fetch_##op##name),				\
+	/* LSE atomics */						\
+"	" #asm_op #mb "	%[i], %[i], %[v]")				\
+	: [i] "+r" (x0), [v] "+Q" (v->counter)				\
+	: "r" (x1)							\
+	: __LL_SC_CLOBBERS, ##cl);					\
+									\
+	return x0;							\
+}
+
+#define ATOMIC64_FETCH_OPS(op, asm_op)					\
+	ATOMIC64_FETCH_OP(_relaxed,   , op, asm_op)			\
+	ATOMIC64_FETCH_OP(_acquire,  a, op, asm_op, "memory")		\
+	ATOMIC64_FETCH_OP(_release,  l, op, asm_op, "memory")		\
+	ATOMIC64_FETCH_OP(        , al, op, asm_op, "memory")
+
+ATOMIC64_FETCH_OPS(andnot, ldclr)
+ATOMIC64_FETCH_OPS(or, ldset)
+ATOMIC64_FETCH_OPS(xor, ldeor)
+ATOMIC64_FETCH_OPS(add, ldadd)
+
+#undef ATOMIC64_FETCH_OP
+#undef ATOMIC64_FETCH_OPS
+
 #define ATOMIC64_OP_ADD_RETURN(name, mb, cl...)				\
 static inline long atomic64_add_return##name(long i, atomic64_t *v)	\
 {									\
@@ -202,6 +320,33 @@ static inline void atomic64_and(long i,
 	: __LL_SC_CLOBBERS);
 }
 
+#define ATOMIC64_FETCH_OP_AND(name, mb, cl...)				\
+static inline long atomic64_fetch_and##name(long i, atomic64_t *v)	\
+{									\
+	register long x0 asm ("x0") = i;				\
+	register atomic64_t *x1 asm ("x1") = v;				\
+									\
+	asm volatile(ARM64_LSE_ATOMIC_INSN(				\
+	/* LL/SC */							\
+	"	nop\n"							\
+	__LL_SC_ATOMIC64(fetch_and##name),				\
+	/* LSE atomics */						\
+	"	mvn	%[i], %[i]\n"					\
+	"	ldclr" #mb "	%[i], %[i], %[v]")			\
+	: [i] "+r" (x0), [v] "+Q" (v->counter)				\
+	: "r" (x1)							\
+	: __LL_SC_CLOBBERS, ##cl);					\
+									\
+	return x0;							\
+}
+
+ATOMIC64_FETCH_OP_AND(_relaxed,   )
+ATOMIC64_FETCH_OP_AND(_acquire,  a, "memory")
+ATOMIC64_FETCH_OP_AND(_release,  l, "memory")
+ATOMIC64_FETCH_OP_AND(        , al, "memory")
+
+#undef ATOMIC64_FETCH_OP_AND
+
 static inline void atomic64_sub(long i, atomic64_t *v)
 {
 	register long x0 asm ("x0") = i;
@@ -248,6 +393,33 @@ ATOMIC64_OP_SUB_RETURN(        , al, "me
 
 #undef ATOMIC64_OP_SUB_RETURN
 
+#define ATOMIC64_FETCH_OP_SUB(name, mb, cl...)				\
+static inline long atomic64_fetch_sub##name(long i, atomic64_t *v)	\
+{									\
+	register long x0 asm ("x0") = i;				\
+	register atomic64_t *x1 asm ("x1") = v;				\
+									\
+	asm volatile(ARM64_LSE_ATOMIC_INSN(				\
+	/* LL/SC */							\
+	"	nop\n"							\
+	__LL_SC_ATOMIC64(fetch_sub##name),				\
+	/* LSE atomics */						\
+	"	neg	%[i], %[i]\n"					\
+	"	ldadd" #mb "	%[i], %[i], %[v]")			\
+	: [i] "+r" (x0), [v] "+Q" (v->counter)				\
+	: "r" (x1)							\
+	: __LL_SC_CLOBBERS, ##cl);					\
+									\
+	return x0;							\
+}
+
+ATOMIC64_FETCH_OP_SUB(_relaxed,   )
+ATOMIC64_FETCH_OP_SUB(_acquire,  a, "memory")
+ATOMIC64_FETCH_OP_SUB(_release,  l, "memory")
+ATOMIC64_FETCH_OP_SUB(        , al, "memory")
+
+#undef ATOMIC64_FETCH_OP_SUB
+
 static inline long atomic64_dec_if_positive(atomic64_t *v)
 {
 	register long x0 asm ("x0") = (long)v;

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 07/33] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (5 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 06/33] locking,arm64: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() for LSE instructions Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 08/33] locking,blackfin: " Peter Zijlstra
                   ` (27 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-avr32.patch --]
[-- Type: text/plain, Size: 2821 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives, except they return the value of the
atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/avr32/include/asm/atomic.h |   56 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 51 insertions(+), 5 deletions(-)

--- a/arch/avr32/include/asm/atomic.h
+++ b/arch/avr32/include/asm/atomic.h
@@ -41,21 +41,51 @@ static inline int __atomic_##op##_return
 	return result;							\
 }
 
+#define ATOMIC_FETCH_OP(op, asm_op, asm_con)				\
+static inline int __atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	int result, val;						\
+									\
+	asm volatile(							\
+		"/* atomic_fetch_" #op " */\n"				\
+		"1:	ssrf	5\n"					\
+		"	ld.w	%0, %3\n"				\
+		"	mov	%1, %0\n"				\
+		"	" #asm_op "	%1, %4\n"			\
+		"	stcond	%2, %1\n"				\
+		"	brne	1b"					\
+		: "=&r" (result), "=&r" (val), "=o" (v->counter)	\
+		: "m" (v->counter), #asm_con (i)			\
+		: "cc");						\
+									\
+	return result;							\
+}
+
 ATOMIC_OP_RETURN(sub, sub, rKs21)
 ATOMIC_OP_RETURN(add, add, r)
+ATOMIC_FETCH_OP (sub, sub, rKs21)
+ATOMIC_FETCH_OP (add, add, r)
 
-#define ATOMIC_OP(op, asm_op)						\
+#define atomic_fetch_or atomic_fetch_or
+
+#define ATOMIC_OPS(op, asm_op)						\
 ATOMIC_OP_RETURN(op, asm_op, r)						\
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
 	(void)__atomic_##op##_return(i, v);				\
+}									\
+ATOMIC_FETCH_OP(op, asm_op, r)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	return __atomic_fetch_##op(i, v);				\
 }
 
-ATOMIC_OP(and, and)
-ATOMIC_OP(or, or)
-ATOMIC_OP(xor, eor)
+ATOMIC_OPS(and, and)
+ATOMIC_OPS(or, or)
+ATOMIC_OPS(xor, eor)
 
-#undef ATOMIC_OP
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 
 /*
@@ -87,6 +117,14 @@ static inline int atomic_add_return(int
 	return __atomic_add_return(i, v);
 }
 
+static inline int atomic_fetch_add(int i, atomic_t *v)
+{
+	if (IS_21BIT_CONST(i))
+		return __atomic_fetch_sub(-i, v);
+
+	return __atomic_fetch_add(i, v);
+}
+
 /*
  * atomic_sub_return - subtract the atomic variable
  * @i: integer value to subtract
@@ -102,6 +140,14 @@ static inline int atomic_sub_return(int
 	return __atomic_add_return(-i, v);
 }
 
+static inline int atomic_fetch_sub(int i, atomic_t *v)
+{
+	if (IS_21BIT_CONST(i))
+		return __atomic_fetch_sub(i, v);
+
+	return __atomic_fetch_add(-i, v);
+}
+
 /*
  * __atomic_add_unless - add unless the number is a given value
  * @v: pointer of type atomic_t

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 08/33] locking,blackfin: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (6 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 07/33] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 09/33] locking,frv: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (26 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-blackfin.patch --]
[-- Type: text/plain, Size: 3585 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives, except they return the value of the
atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/blackfin/include/asm/atomic.h |    8 ++++++
 arch/blackfin/kernel/bfin_ksyms.c  |    1 
 arch/blackfin/mach-bf561/atomic.S  |   43 ++++++++++++++++++++++++++-----------
 3 files changed, 40 insertions(+), 12 deletions(-)

--- a/arch/blackfin/include/asm/atomic.h
+++ b/arch/blackfin/include/asm/atomic.h
@@ -17,6 +17,7 @@
 
 asmlinkage int __raw_uncached_fetch_asm(const volatile int *ptr);
 asmlinkage int __raw_atomic_add_asm(volatile int *ptr, int value);
+asmlinkage int __raw_atomic_xadd_asm(volatile int *ptr, int value);
 
 asmlinkage int __raw_atomic_and_asm(volatile int *ptr, int value);
 asmlinkage int __raw_atomic_or_asm(volatile int *ptr, int value);
@@ -28,10 +29,17 @@ asmlinkage int __raw_atomic_test_asm(con
 #define atomic_add_return(i, v) __raw_atomic_add_asm(&(v)->counter, i)
 #define atomic_sub_return(i, v) __raw_atomic_add_asm(&(v)->counter, -(i))
 
+#define atomic_fetch_add(i, v) __raw_atomic_xadd_asm(&(v)->counter, i)
+#define atomic_fetch_sub(i, v) __raw_atomic_xadd_asm(&(v)->counter, -(i))
+
 #define atomic_or(i, v)  (void)__raw_atomic_or_asm(&(v)->counter, i)
 #define atomic_and(i, v) (void)__raw_atomic_and_asm(&(v)->counter, i)
 #define atomic_xor(i, v) (void)__raw_atomic_xor_asm(&(v)->counter, i)
 
+#define atomic_fetch_or(i, v)  __raw_atomic_or_asm(&(v)->counter, i)
+#define atomic_fetch_and(i, v) __raw_atomic_and_asm(&(v)->counter, i)
+#define atomic_fetch_xor(i, v) __raw_atomic_xor_asm(&(v)->counter, i)
+
 #endif
 
 #include <asm-generic/atomic.h>
--- a/arch/blackfin/kernel/bfin_ksyms.c
+++ b/arch/blackfin/kernel/bfin_ksyms.c
@@ -84,6 +84,7 @@ EXPORT_SYMBOL(insl_16);
 
 #ifdef CONFIG_SMP
 EXPORT_SYMBOL(__raw_atomic_add_asm);
+EXPORT_SYMBOL(__raw_atomic_xadd_asm);
 EXPORT_SYMBOL(__raw_atomic_and_asm);
 EXPORT_SYMBOL(__raw_atomic_or_asm);
 EXPORT_SYMBOL(__raw_atomic_xor_asm);
--- a/arch/blackfin/mach-bf561/atomic.S
+++ b/arch/blackfin/mach-bf561/atomic.S
@@ -607,6 +607,28 @@ ENDPROC(___raw_atomic_add_asm)
 
 /*
  * r0 = ptr
+ * r1 = value
+ *
+ * ADD a signed value to a 32bit word and return the old value atomically.
+ * Clobbers: r3:0, p1:0
+ */
+ENTRY(___raw_atomic_xadd_asm)
+	p1 = r0;
+	r3 = r1;
+	[--sp] = rets;
+	call _get_core_lock;
+	r3 = [p1];
+	r2 = r3 + r2;
+	[p1] = r2;
+	r1 = p1;
+	call _put_core_lock;
+	r0 = r3;
+	rets = [sp++];
+	rts;
+ENDPROC(___raw_atomic_xadd_asm)
+
+/*
+ * r0 = ptr
  * r1 = mask
  *
  * AND the mask bits from a 32bit word and return the old 32bit value
@@ -618,10 +640,9 @@ ENTRY(___raw_atomic_and_asm)
 	r3 = r1;
 	[--sp] = rets;
 	call _get_core_lock;
-	r2 = [p1];
-	r3 = r2 & r3;
-	[p1] = r3;
-	r3 = r2;
+	r3 = [p1];
+	r2 = r2 & r3;
+	[p1] = r2;
 	r1 = p1;
 	call _put_core_lock;
 	r0 = r3;
@@ -642,10 +663,9 @@ ENTRY(___raw_atomic_or_asm)
 	r3 = r1;
 	[--sp] = rets;
 	call _get_core_lock;
-	r2 = [p1];
-	r3 = r2 | r3;
-	[p1] = r3;
-	r3 = r2;
+	r3 = [p1];
+	r2 = r2 | r3;
+	[p1] = r2;
 	r1 = p1;
 	call _put_core_lock;
 	r0 = r3;
@@ -666,10 +686,9 @@ ENTRY(___raw_atomic_xor_asm)
 	r3 = r1;
 	[--sp] = rets;
 	call _get_core_lock;
-	r2 = [p1];
-	r3 = r2 ^ r3;
-	[p1] = r3;
-	r3 = r2;
+	r3 = [p1];
+	r2 = r2 ^ r3;
+	[p1] = r2;
 	r1 = p1;
 	call _put_core_lock;
 	r0 = r3;

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 09/33] locking,frv: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (7 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 08/33] locking,blackfin: " Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 10/33] locking,h8300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (25 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-frv.patch --]
[-- Type: text/plain, Size: 2773 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives, except they return the value of the
atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/frv/include/asm/atomic.h      |   32 ++++++++++++--------------------
 arch/frv/include/asm/atomic_defs.h |    2 ++
 2 files changed, 14 insertions(+), 20 deletions(-)

--- a/arch/frv/include/asm/atomic.h
+++ b/arch/frv/include/asm/atomic.h
@@ -60,16 +60,6 @@ static inline int atomic_add_negative(in
 	return atomic_add_return(i, v) < 0;
 }
 
-static inline void atomic_add(int i, atomic_t *v)
-{
-	atomic_add_return(i, v);
-}
-
-static inline void atomic_sub(int i, atomic_t *v)
-{
-	atomic_sub_return(i, v);
-}
-
 static inline void atomic_inc(atomic_t *v)
 {
 	atomic_inc_return(v);
@@ -84,6 +74,8 @@ static inline void atomic_dec(atomic_t *
 #define atomic_dec_and_test(v)		(atomic_sub_return(1, (v)) == 0)
 #define atomic_inc_and_test(v)		(atomic_add_return(1, (v)) == 0)
 
+#define atomic_fetch_or atomic_fetch_or
+
 /*
  * 64-bit atomic ops
  */
@@ -136,16 +128,6 @@ static inline long long atomic64_add_neg
 	return atomic64_add_return(i, v) < 0;
 }
 
-static inline void atomic64_add(long long i, atomic64_t *v)
-{
-	atomic64_add_return(i, v);
-}
-
-static inline void atomic64_sub(long long i, atomic64_t *v)
-{
-	atomic64_sub_return(i, v);
-}
-
 static inline void atomic64_inc(atomic64_t *v)
 {
 	atomic64_inc_return(v);
@@ -182,11 +164,19 @@ static __inline__ int __atomic_add_unles
 }
 
 #define ATOMIC_OP(op)							\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	return __atomic32_fetch_##op(i, &v->counter);			\
+}									\
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
 	(void)__atomic32_fetch_##op(i, &v->counter);			\
 }									\
 									\
+static inline long long atomic64_fetch_##op(long long i, atomic64_t *v)	\
+{									\
+	return __atomic64_fetch_##op(i, &v->counter);			\
+}									\
 static inline void atomic64_##op(long long i, atomic64_t *v)		\
 {									\
 	(void)__atomic64_fetch_##op(i, &v->counter);			\
@@ -195,6 +185,8 @@ static inline void atomic64_##op(long lo
 ATOMIC_OP(or)
 ATOMIC_OP(and)
 ATOMIC_OP(xor)
+ATOMIC_OP(add)
+ATOMIC_OP(sub)
 
 #undef ATOMIC_OP
 
--- a/arch/frv/include/asm/atomic_defs.h
+++ b/arch/frv/include/asm/atomic_defs.h
@@ -162,6 +162,8 @@ ATOMIC_EXPORT(__atomic64_fetch_##op);
 ATOMIC_FETCH_OP(or)
 ATOMIC_FETCH_OP(and)
 ATOMIC_FETCH_OP(xor)
+ATOMIC_FETCH_OP(add)
+ATOMIC_FETCH_OP(sub)
 
 ATOMIC_OP_RETURN(add)
 ATOMIC_OP_RETURN(sub)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 10/33] locking,h8300: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (8 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 09/33] locking,frv: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 11/33] locking,hexagon: " Peter Zijlstra
                   ` (24 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-h8300.patch --]
[-- Type: text/plain, Size: 1911 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives, except they return the value of the
atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/h8300/include/asm/atomic.h |   31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

--- a/arch/h8300/include/asm/atomic.h
+++ b/arch/h8300/include/asm/atomic.h
@@ -28,6 +28,19 @@ static inline int atomic_##op##_return(i
 	return ret;						\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)		\
+{								\
+	h8300flags flags;					\
+	int ret;						\
+								\
+	flags = arch_local_irq_save();				\
+	ret = v->counter;					\
+	v->counter c_op i;					\
+	arch_local_irq_restore(flags);				\
+	return ret;						\
+}
+
 #define ATOMIC_OP(op, c_op)					\
 static inline void atomic_##op(int i, atomic_t *v)		\
 {								\
@@ -41,17 +54,23 @@ static inline void atomic_##op(int i, at
 ATOMIC_OP_RETURN(add, +=)
 ATOMIC_OP_RETURN(sub, -=)
 
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or,  |=)
-ATOMIC_OP(xor, ^=)
+#define atomic_fetch_or atomic_fetch_or
 
+#define ATOMIC_OPS(op, c_op)					\
+	ATOMIC_OP(op, c_op)					\
+	ATOMIC_FETCH_OP(op, c_op)
+
+ATOMIC_OPS(and, &=)
+ATOMIC_OPS(or,  |=)
+ATOMIC_OPS(xor, ^=)
+ATOMIC_OPS(add, +=)
+ATOMIC_OPS(sub, -=)
+
+#undef ATOMIC_OPS
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
-#define atomic_add(i, v)		(void)atomic_add_return(i, v)
 #define atomic_add_negative(a, v)	(atomic_add_return((a), (v)) < 0)
-
-#define atomic_sub(i, v)		(void)atomic_sub_return(i, v)
 #define atomic_sub_and_test(i, v)	(atomic_sub_return(i, v) == 0)
 
 #define atomic_inc_return(v)		atomic_add_return(1, v)
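
The seemingly redundant "#define atomic_fetch_or atomic_fetch_or" above
tells the generic header that the architecture now supplies its own
atomic_fetch_or(); <linux/atomic.h> only emits a cmpxchg()-based fallback
when that macro is absent, roughly along these lines (a sketch of the
convention, not the exact generic code):

	#ifndef atomic_fetch_or
	static inline int atomic_fetch_or(int mask, atomic_t *p)
	{
		int old, val = atomic_read(p);

		while ((old = atomic_cmpxchg(p, val, val | mask)) != val)
			val = old;

		return old;
	}
	#endif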

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 11/33] locking,hexagon: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (9 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 10/33] locking,h8300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 12/33] locking,ia64: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (23 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-hexagon.patch --]
[-- Type: text/plain, Size: 1915 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/hexagon/include/asm/atomic.h |   33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

--- a/arch/hexagon/include/asm/atomic.h
+++ b/arch/hexagon/include/asm/atomic.h
@@ -110,7 +110,7 @@ static inline void atomic_##op(int i, at
 	);								\
 }									\
 
-#define ATOMIC_OP_RETURN(op)							\
+#define ATOMIC_OP_RETURN(op)						\
 static inline int atomic_##op##_return(int i, atomic_t *v)		\
 {									\
 	int output;							\
@@ -127,16 +127,39 @@ static inline int atomic_##op##_return(i
 	return output;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int output, val;						\
+									\
+	__asm__ __volatile__ (						\
+		"1:	%0 = memw_locked(%2);\n"			\
+		"	%1 = "#op "(%0,%3);\n"				\
+		"	memw_locked(%2,P3)=%1;\n"			\
+		"	if !P3 jump 1b;\n"				\
+		: "=&r" (output), "=&r" (val)				\
+		: "r" (&v->counter), "r" (i)				\
+		: "memory", "p3"					\
+	);								\
+	return output;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 12/33] locking,ia64: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (10 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 11/33] locking,hexagon: " Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 13/33] locking,m32r: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-ia64.patch --]
[-- Type: text/plain, Size: 5705 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/ia64/include/asm/atomic.h |  134 +++++++++++++++++++++++++++++++++++------
 1 file changed, 116 insertions(+), 18 deletions(-)

--- a/arch/ia64/include/asm/atomic.h
+++ b/arch/ia64/include/asm/atomic.h
@@ -42,8 +42,27 @@ ia64_atomic_##op (int i, atomic_t *v)
 	return new;							\
 }
 
-ATOMIC_OP(add, +)
-ATOMIC_OP(sub, -)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static __inline__ int							\
+ia64_atomic_fetch_##op (int i, atomic_t *v)				\
+{									\
+	__s32 old, new;							\
+	CMPXCHG_BUGCHECK_DECL						\
+									\
+	do {								\
+		CMPXCHG_BUGCHECK(v);					\
+		old = atomic_read(v);					\
+		new = old c_op i;					\
+	} while (ia64_cmpxchg(acq, v, old, new, sizeof(atomic_t)) != old); \
+	return old;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_FETCH_OP(op, c_op)
+
+ATOMIC_OPS(add, +)
+ATOMIC_OPS(sub, -)
 
 #define atomic_add_return(i,v)						\
 ({									\
@@ -69,14 +88,44 @@ ATOMIC_OP(sub, -)
 		: ia64_atomic_sub(__ia64_asr_i, v);			\
 })
 
-ATOMIC_OP(and, &)
-ATOMIC_OP(or, |)
-ATOMIC_OP(xor, ^)
-
-#define atomic_and(i,v)	(void)ia64_atomic_and(i,v)
-#define atomic_or(i,v)	(void)ia64_atomic_or(i,v)
-#define atomic_xor(i,v)	(void)ia64_atomic_xor(i,v)
+#define atomic_fetch_add(i,v)						\
+({									\
+	int __ia64_aar_i = (i);						\
+	(__builtin_constant_p(i)					\
+	 && (   (__ia64_aar_i ==  1) || (__ia64_aar_i ==   4)		\
+	     || (__ia64_aar_i ==  8) || (__ia64_aar_i ==  16)		\
+	     || (__ia64_aar_i == -1) || (__ia64_aar_i ==  -4)		\
+	     || (__ia64_aar_i == -8) || (__ia64_aar_i == -16)))		\
+		? ia64_fetchadd(__ia64_aar_i, &(v)->counter, acq)	\
+		: ia64_atomic_fetch_add(__ia64_aar_i, v);		\
+})
+
+#define atomic_fetch_sub(i,v)						\
+({									\
+	int __ia64_asr_i = (i);						\
+	(__builtin_constant_p(i)					\
+	 && (   (__ia64_asr_i ==   1) || (__ia64_asr_i ==   4)		\
+	     || (__ia64_asr_i ==   8) || (__ia64_asr_i ==  16)		\
+	     || (__ia64_asr_i ==  -1) || (__ia64_asr_i ==  -4)		\
+	     || (__ia64_asr_i ==  -8) || (__ia64_asr_i == -16)))	\
+		? ia64_fetchadd(-__ia64_asr_i, &(v)->counter, acq)	\
+		: ia64_atomic_fetch_sub(__ia64_asr_i, v);		\
+})
 
+ATOMIC_FETCH_OP(and, &)
+ATOMIC_FETCH_OP(or, |)
+ATOMIC_FETCH_OP(xor, ^)
+
+#define atomic_and(i,v)	(void)ia64_atomic_fetch_and(i,v)
+#define atomic_or(i,v)	(void)ia64_atomic_fetch_or(i,v)
+#define atomic_xor(i,v)	(void)ia64_atomic_fetch_xor(i,v)
+
+#define atomic_fetch_and(i,v)	ia64_atomic_fetch_and(i,v)
+#define atomic_fetch_or(i,v)	ia64_atomic_fetch_or(i,v)
+#define atomic_fetch_xor(i,v)	ia64_atomic_fetch_xor(i,v)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP
 
 #define ATOMIC64_OP(op, c_op)						\
@@ -94,8 +143,27 @@ ia64_atomic64_##op (__s64 i, atomic64_t
 	return new;							\
 }
 
-ATOMIC64_OP(add, +)
-ATOMIC64_OP(sub, -)
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static __inline__ long							\
+ia64_atomic64_fetch_##op (__s64 i, atomic64_t *v)			\
+{									\
+	__s64 old, new;							\
+	CMPXCHG_BUGCHECK_DECL						\
+									\
+	do {								\
+		CMPXCHG_BUGCHECK(v);					\
+		old = atomic64_read(v);					\
+		new = old c_op i;					\
+	} while (ia64_cmpxchg(acq, v, old, new, sizeof(atomic64_t)) != old); \
+	return old;							\
+}
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(add, +)
+ATOMIC64_OPS(sub, -)
 
 #define atomic64_add_return(i,v)					\
 ({									\
@@ -121,14 +189,44 @@ ATOMIC64_OP(sub, -)
 		: ia64_atomic64_sub(__ia64_asr_i, v);			\
 })
 
-ATOMIC64_OP(and, &)
-ATOMIC64_OP(or, |)
-ATOMIC64_OP(xor, ^)
-
-#define atomic64_and(i,v)	(void)ia64_atomic64_and(i,v)
-#define atomic64_or(i,v)	(void)ia64_atomic64_or(i,v)
-#define atomic64_xor(i,v)	(void)ia64_atomic64_xor(i,v)
+#define atomic64_fetch_add(i,v)						\
+({									\
+	long __ia64_aar_i = (i);					\
+	(__builtin_constant_p(i)					\
+	 && (   (__ia64_aar_i ==  1) || (__ia64_aar_i ==   4)		\
+	     || (__ia64_aar_i ==  8) || (__ia64_aar_i ==  16)		\
+	     || (__ia64_aar_i == -1) || (__ia64_aar_i ==  -4)		\
+	     || (__ia64_aar_i == -8) || (__ia64_aar_i == -16)))		\
+		? ia64_fetchadd(__ia64_aar_i, &(v)->counter, acq)	\
+		: ia64_atomic64_fetch_add(__ia64_aar_i, v);		\
+})
+
+#define atomic64_fetch_sub(i,v)						\
+({									\
+	long __ia64_asr_i = (i);					\
+	(__builtin_constant_p(i)					\
+	 && (   (__ia64_asr_i ==   1) || (__ia64_asr_i ==   4)		\
+	     || (__ia64_asr_i ==   8) || (__ia64_asr_i ==  16)		\
+	     || (__ia64_asr_i ==  -1) || (__ia64_asr_i ==  -4)		\
+	     || (__ia64_asr_i ==  -8) || (__ia64_asr_i == -16)))	\
+		? ia64_fetchadd(-__ia64_asr_i, &(v)->counter, acq)	\
+		: ia64_atomic64_fetch_sub(__ia64_asr_i, v);		\
+})
+
+ATOMIC64_FETCH_OP(and, &)
+ATOMIC64_FETCH_OP(or, |)
+ATOMIC64_FETCH_OP(xor, ^)
+
+#define atomic64_and(i,v)	(void)ia64_atomic64_fetch_and(i,v)
+#define atomic64_or(i,v)	(void)ia64_atomic64_fetch_or(i,v)
+#define atomic64_xor(i,v)	(void)ia64_atomic64_fetch_xor(i,v)
+
+#define atomic64_fetch_and(i,v)	ia64_atomic64_fetch_and(i,v)
+#define atomic64_fetch_or(i,v)	ia64_atomic64_fetch_or(i,v)
+#define atomic64_fetch_xor(i,v)	ia64_atomic64_fetch_xor(i,v)
 
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP
 
 #define atomic_cmpxchg(v, old, new) (cmpxchg(&((v)->counter), old, new))
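
The __builtin_constant_p() checks above exist because the ia64 fetchadd
instruction can only encode the increments -16, -8, -4, -1, 1, 4, 8 and
16; anything else has to take the cmpxchg loop. Roughly how calls resolve
under those macros (an illustrative sketch with a hypothetical caller):

	static atomic_t counter = ATOMIC_INIT(0);

	void example(int n)
	{
		int old;

		/* constant and encodable: a single fetchadd4.acq */
		old = atomic_fetch_add(8, &counter);
		/* constant but not encodable: ia64_atomic_fetch_add() loop */
		old = atomic_fetch_add(7, &counter);
		/* not a compile-time constant: also the loop */
		old = atomic_fetch_add(n, &counter);
	}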

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 13/33] locking,m32r: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (11 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 12/33] locking,ia64: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 14/33] locking,m68k: " Peter Zijlstra
                   ` (21 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-m32r.patch --]
[-- Type: text/plain, Size: 1848 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/m32r/include/asm/atomic.h |   38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

--- a/arch/m32r/include/asm/atomic.h
+++ b/arch/m32r/include/asm/atomic.h
@@ -89,16 +89,46 @@ static __inline__ int atomic_##op##_retu
 	return result;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static __inline__ int atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	unsigned long flags;						\
+	int result, val;						\
+									\
+	local_irq_save(flags);						\
+	__asm__ __volatile__ (						\
+		"# atomic_fetch_" #op "		\n\t"			\
+		DCACHE_CLEAR("%0", "r4", "%2")				\
+		M32R_LOCK" %1, @%2;		\n\t"			\
+		"mv %0, %1			\n\t" 			\
+		#op " %1, %3;			\n\t"			\
+		M32R_UNLOCK" %1, @%2;		\n\t"			\
+		: "=&r" (result), "=&r" (val)				\
+		: "r" (&v->counter), "r" (i)				\
+		: "memory"						\
+		__ATOMIC_CLOBBER					\
+	);								\
+	local_irq_restore(flags);					\
+									\
+	return result;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (12 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 13/33] locking,m32r: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-06-16 10:08   ` Geert Uytterhoeven
  2016-05-31 10:19 ` [PATCH -v2 15/33] locking,metag: " Peter Zijlstra
                   ` (20 subsequent siblings)
  34 siblings, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-m68k.patch --]
[-- Type: text/plain, Size: 2717 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/m68k/include/asm/atomic.h |   53 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 49 insertions(+), 4 deletions(-)

--- a/arch/m68k/include/asm/atomic.h
+++ b/arch/m68k/include/asm/atomic.h
@@ -38,6 +38,13 @@ static inline void atomic_##op(int i, at
 
 #ifdef CONFIG_RMW_INSNS
 
+/*
+ * Am I reading these CAS loops right in that %2 is the old value and the first
+ * iteration uses an uninitialized value?
+ *
+ * Would it not make sense to add: tmp = atomic_read(v); to avoid this?
+ */
+
 #define ATOMIC_OP_RETURN(op, c_op, asm_op)				\
 static inline int atomic_##op##_return(int i, atomic_t *v)		\
 {									\
@@ -53,6 +60,21 @@ static inline int atomic_##op##_return(i
 	return t;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int t, tmp;							\
+									\
+	__asm__ __volatile__(						\
+			"1:	movel %2,%1\n"				\
+			"	" #asm_op "l %3,%1\n"			\
+			"	casl %2,%1,%0\n"			\
+			"	jne 1b"					\
+			: "+m" (*v), "=&d" (t), "=&d" (tmp)		\
+			: "g" (i), "2" (atomic_read(v)));		\
+	return tmp;							\
+}
+
 #else
 
 #define ATOMIC_OP_RETURN(op, c_op, asm_op)				\
@@ -68,20 +90,43 @@ static inline int atomic_##op##_return(i
 	return t;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t * v)		\
+{									\
+	unsigned long flags;						\
+	int t;								\
+									\
+	local_irq_save(flags);						\
+	t = v->counter;							\
+	v->counter c_op i;						\
+	local_irq_restore(flags);					\
+									\
+	return t;							\
+}
+
 #endif /* CONFIG_RMW_INSNS */
 
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, add)
 ATOMIC_OPS(sub, -=, sub)
 
-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(or, |=, or)
-ATOMIC_OP(xor, ^=, eor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					\
+	ATOMIC_OP(op, c_op, asm_op)					\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(or, |=, or)
+ATOMIC_OPS(xor, ^=, eor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
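
For reference, a portable atomic_cmpxchg()-based model of the CAS loop
that ATOMIC_FETCH_OP() generates above (a sketch, not m68k code); note
the "2" (atomic_read(v)) input, which seeds the compare value so that
the first casl already compares against a freshly read counter:

	static inline int atomic_fetch_add_model(int i, atomic_t *v)
	{
		int old = atomic_read(v);	/* the "2" constraint seed */
		int prev;

		for (;;) {
			prev = atomic_cmpxchg(v, old, old + i);
			if (prev == old)	/* casl hit: old is the pre-add value */
				return old;
			old = prev;		/* casl reloaded the compare register */
		}
	}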
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 15/33] locking,metag: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (13 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 14/33] locking,m68k: " Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 16/33] locking,mips: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (19 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-metag.patch --]
[-- Type: text/plain, Size: 3378 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/metag/include/asm/atomic.h        |    2 +
 arch/metag/include/asm/atomic_lnkget.h |   36 +++++++++++++++++++++++++++++----
 arch/metag/include/asm/atomic_lock1.h  |   33 ++++++++++++++++++++++++++----
 3 files changed, 63 insertions(+), 8 deletions(-)

--- a/arch/metag/include/asm/atomic.h
+++ b/arch/metag/include/asm/atomic.h
@@ -17,6 +17,8 @@
 #include <asm/atomic_lnkget.h>
 #endif
 
+#define atomic_fetch_or atomic_fetch_or
+
 #define atomic_add_negative(a, v)       (atomic_add_return((a), (v)) < 0)
 
 #define atomic_dec_return(v) atomic_sub_return(1, (v))
--- a/arch/metag/include/asm/atomic_lnkget.h
+++ b/arch/metag/include/asm/atomic_lnkget.h
@@ -69,16 +69,44 @@ static inline int atomic_##op##_return(i
 	return result;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int result, temp;						\
+									\
+	smp_mb();							\
+									\
+	asm volatile (							\
+		"1:	LNKGETD %1, [%2]\n"				\
+		"	" #op "	%0, %1, %3\n"				\
+		"	LNKSETD [%2], %0\n"				\
+		"	DEFR	%0, TXSTAT\n"				\
+		"	ANDT	%0, %0, #HI(0x3f000000)\n"		\
+		"	CMPT	%0, #HI(0x02000000)\n"			\
+		"	BNZ 1b\n"					\
+		: "=&d" (temp), "=&d" (result)				\
+		: "da" (&v->counter), "bd" (i)				\
+		: "cc");						\
+									\
+	smp_mb();							\
+									\
+	return result;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/metag/include/asm/atomic_lock1.h
+++ b/arch/metag/include/asm/atomic_lock1.h
@@ -64,15 +64,40 @@ static inline int atomic_##op##_return(i
 	return result;							\
 }
 
-#define ATOMIC_OPS(op, c_op) ATOMIC_OP(op, c_op) ATOMIC_OP_RETURN(op, c_op)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long result;						\
+	unsigned long flags;						\
+									\
+	__global_lock1(flags);						\
+	result = v->counter;						\
+	fence();							\
+	v->counter c_op i;						\
+	__global_unlock1(flags);					\
+									\
+	return result;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_OP_RETURN(op, c_op)					\
+	ATOMIC_FETCH_OP(op, c_op)
 
 ATOMIC_OPS(add, +=)
 ATOMIC_OPS(sub, -=)
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or, |=)
-ATOMIC_OP(xor, ^=)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_FETCH_OP(op, c_op)
+
+ATOMIC_OPS(and, &=)
+ATOMIC_OPS(or, |=)
+ATOMIC_OPS(xor, ^=)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 16/33] locking,mips: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (14 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 15/33] locking,metag: " Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 17/33] locking,mn10300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-mips.patch --]
[-- Type: text/plain, Size: 5948 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/mips/include/asm/atomic.h |  138 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 129 insertions(+), 9 deletions(-)

--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -66,7 +66,7 @@ static __inline__ void atomic_##op(int i
 			"	" #asm_op " %0, %2			\n"   \
 			"	sc	%0, %1				\n"   \
 			"	.set	mips0				\n"   \
-			: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)      \
+			: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)  \
 			: "Ir" (i));					      \
 		} while (unlikely(!temp));				      \
 	} else {							      \
@@ -130,18 +130,78 @@ static __inline__ int atomic_##op##_retu
 	return result;							      \
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				      \
+static __inline__ int atomic_fetch_##op(int i, atomic_t * v)		      \
+{									      \
+	int result;							      \
+									      \
+	smp_mb__before_llsc();						      \
+									      \
+	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+		int temp;						      \
+									      \
+		__asm__ __volatile__(					      \
+		"	.set	arch=r4000				\n"   \
+		"1:	ll	%1, %2		# atomic_fetch_" #op "	\n"   \
+		"	" #asm_op " %0, %1, %3				\n"   \
+		"	sc	%0, %2					\n"   \
+		"	beqzl	%0, 1b					\n"   \
+		"	move	%0, %1					\n"   \
+		"	.set	mips0					\n"   \
+		: "=&r" (result), "=&r" (temp),				      \
+		  "+" GCC_OFF_SMALL_ASM() (v->counter)			      \
+		: "Ir" (i));						      \
+	} else if (kernel_uses_llsc) {					      \
+		int temp;						      \
+									      \
+		do {							      \
+			__asm__ __volatile__(				      \
+			"	.set	"MIPS_ISA_LEVEL"		\n"   \
+			"	ll	%1, %2	# atomic_fetch_" #op "	\n"   \
+			"	" #asm_op " %0, %1, %3			\n"   \
+			"	sc	%0, %2				\n"   \
+			"	.set	mips0				\n"   \
+			: "=&r" (result), "=&r" (temp),			      \
+			  "+" GCC_OFF_SMALL_ASM() (v->counter)		      \
+			: "Ir" (i));					      \
+		} while (unlikely(!result));				      \
+									      \
+		result = temp;						      \
+	} else {							      \
+		unsigned long flags;					      \
+									      \
+		raw_local_irq_save(flags);				      \
+		result = v->counter;					      \
+		v->counter c_op i;					      \
+		raw_local_irq_restore(flags);				      \
+	}								      \
+									      \
+	smp_llsc_mb();							      \
+									      \
+	return result;							      \
+}
+
 #define ATOMIC_OPS(op, c_op, asm_op)					      \
 	ATOMIC_OP(op, c_op, asm_op)					      \
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				      \
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, addu)
 ATOMIC_OPS(sub, -=, subu)
 
-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(or, |=, or)
-ATOMIC_OP(xor, ^=, xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					      \
+	ATOMIC_OP(op, c_op, asm_op)					      \
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(or, |=, or)
+ATOMIC_OPS(xor, ^=, xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -414,17 +474,77 @@ static __inline__ long atomic64_##op##_r
 	return result;							      \
 }
 
+#define ATOMIC64_FETCH_OP(op, c_op, asm_op)				      \
+static __inline__ long atomic64_fetch_##op(long i, atomic64_t * v)	      \
+{									      \
+	long result;							      \
+									      \
+	smp_mb__before_llsc();						      \
+									      \
+	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+		long temp;						      \
+									      \
+		__asm__ __volatile__(					      \
+		"	.set	arch=r4000				\n"   \
+		"1:	lld	%1, %2		# atomic64_fetch_" #op "\n"   \
+		"	" #asm_op " %0, %1, %3				\n"   \
+		"	scd	%0, %2					\n"   \
+		"	beqzl	%0, 1b					\n"   \
+		"	move	%0, %1					\n"   \
+		"	.set	mips0					\n"   \
+		: "=&r" (result), "=&r" (temp),				      \
+		  "+" GCC_OFF_SMALL_ASM() (v->counter)			      \
+		: "Ir" (i));						      \
+	} else if (kernel_uses_llsc) {					      \
+		long temp;						      \
+									      \
+		do {							      \
+			__asm__ __volatile__(				      \
+			"	.set	"MIPS_ISA_LEVEL"		\n"   \
+			"	lld	%1, %2	# atomic64_fetch_" #op "\n"   \
+			"	" #asm_op " %0, %1, %3			\n"   \
+			"	scd	%0, %2				\n"   \
+			"	.set	mips0				\n"   \
+			: "=&r" (result), "=&r" (temp),			      \
+			  "=" GCC_OFF_SMALL_ASM() (v->counter)		      \
+			: "Ir" (i), GCC_OFF_SMALL_ASM() (v->counter)	      \
+			: "memory");					      \
+		} while (unlikely(!result));				      \
+									      \
+		result = temp;						      \
+	} else {							      \
+		unsigned long flags;					      \
+									      \
+		raw_local_irq_save(flags);				      \
+		result = v->counter;					      \
+		v->counter c_op i;					      \
+		raw_local_irq_restore(flags);				      \
+	}								      \
+									      \
+	smp_llsc_mb();							      \
+									      \
+	return result;							      \
+}
+
 #define ATOMIC64_OPS(op, c_op, asm_op)					      \
 	ATOMIC64_OP(op, c_op, asm_op)					      \
-	ATOMIC64_OP_RETURN(op, c_op, asm_op)
+	ATOMIC64_OP_RETURN(op, c_op, asm_op)				      \
+	ATOMIC64_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC64_OPS(add, +=, daddu)
 ATOMIC64_OPS(sub, -=, dsubu)
-ATOMIC64_OP(and, &=, and)
-ATOMIC64_OP(or, |=, or)
-ATOMIC64_OP(xor, ^=, xor)
 
 #undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, c_op, asm_op)					      \
+	ATOMIC64_OP(op, c_op, asm_op)					      \
+	ATOMIC64_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC64_OPS(and, &=, and)
+ATOMIC64_OPS(or, |=, or)
+ATOMIC64_OPS(xor, ^=, xor)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 17/33] locking,mn10300: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (15 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 16/33] locking,mips: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 18/33] locking,parisc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (17 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-mn10300.patch --]
[-- Type: text/plain, Size: 1798 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/mn10300/include/asm/atomic.h |   35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

--- a/arch/mn10300/include/asm/atomic.h
+++ b/arch/mn10300/include/asm/atomic.h
@@ -84,16 +84,43 @@ static inline int atomic_##op##_return(i
 	return retval;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int retval, status;						\
+									\
+	asm volatile(							\
+		"1:	mov	%4,(_AAR,%3)	\n"			\
+		"	mov	(_ADR,%3),%1	\n"			\
+		"	mov	%1,%0		\n"			\
+		"	" #op "	%5,%0		\n"			\
+		"	mov	%0,(_ADR,%3)	\n"			\
+		"	mov	(_ADR,%3),%0	\n"	/* flush */	\
+		"	mov	(_ASR,%3),%0	\n"			\
+		"	or	%0,%0		\n"			\
+		"	bne	1b		\n"			\
+		: "=&r"(status), "=&r"(retval), "=m"(v->counter)	\
+		: "a"(ATOMIC_OPS_BASE_ADDR), "r"(&v->counter), "r"(i)	\
+		: "memory", "cc");					\
+	return retval;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 18/33] locking,parisc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (16 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 17/33] locking,mn10300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 19/33] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-parisc.patch --]
[-- Type: text/plain, Size: 2780 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/parisc/include/asm/atomic.h |   65 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 57 insertions(+), 8 deletions(-)

--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
@@ -121,16 +121,41 @@ static __inline__ int atomic_##op##_retu
 	return ret;							\
 }
 
-#define ATOMIC_OPS(op, c_op) ATOMIC_OP(op, c_op) ATOMIC_OP_RETURN(op, c_op)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static __inline__ int atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	unsigned long flags;						\
+	int ret;							\
+									\
+	_atomic_spin_lock_irqsave(v, flags);				\
+	ret = v->counter;						\
+	v->counter c_op i;						\
+	_atomic_spin_unlock_irqrestore(v, flags);			\
+									\
+	return ret;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_OP_RETURN(op, c_op)					\
+	ATOMIC_FETCH_OP(op, c_op)
 
 ATOMIC_OPS(add, +=)
 ATOMIC_OPS(sub, -=)
 
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or, |=)
-ATOMIC_OP(xor, ^=)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_FETCH_OP(op, c_op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, &=)
+ATOMIC_OPS(or, |=)
+ATOMIC_OPS(xor, ^=)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -185,15 +210,39 @@ static __inline__ s64 atomic64_##op##_re
 	return ret;							\
 }
 
-#define ATOMIC64_OPS(op, c_op) ATOMIC64_OP(op, c_op) ATOMIC64_OP_RETURN(op, c_op)
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static __inline__ s64 atomic64_fetch_##op(s64 i, atomic64_t *v)		\
+{									\
+	unsigned long flags;						\
+	s64 ret;							\
+									\
+	_atomic_spin_lock_irqsave(v, flags);				\
+	ret = v->counter;						\
+	v->counter c_op i;						\
+	_atomic_spin_unlock_irqrestore(v, flags);			\
+									\
+	return ret;							\
+}
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_OP_RETURN(op, c_op)					\
+	ATOMIC64_FETCH_OP(op, c_op)
 
 ATOMIC64_OPS(add, +=)
 ATOMIC64_OPS(sub, -=)
-ATOMIC64_OP(and, &=)
-ATOMIC64_OP(or, |=)
-ATOMIC64_OP(xor, ^=)
 
 #undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &=)
+ATOMIC64_OPS(or, |=)
+ATOMIC64_OPS(xor, ^=)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
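
parisc has no atomic read-modify-write instructions (only ldcw), so
_atomic_spin_lock_irqsave() serialises these operations through a small
array of spinlocks indexed by the atomic_t's address; conceptually
something like this (hypothetical names, a sketch rather than the actual
parisc code):

	#define MY_NR_LOCKS	16
	static arch_spinlock_t my_atomic_locks[MY_NR_LOCKS];

	static inline arch_spinlock_t *my_atomic_hash(const void *addr)
	{
		/* one lock per bucket, picked by address */
		unsigned long idx = ((unsigned long)addr >> L1_CACHE_SHIFT) % MY_NR_LOCKS;

		return &my_atomic_locks[idx];
	}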
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 19/33] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (17 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 18/33] locking,parisc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-06-01  3:11   ` Boqun Feng
  2016-05-31 10:19 ` [PATCH -v2 20/33] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (15 subsequent siblings)
  34 siblings, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-powerpc.patch --]
[-- Type: text/plain, Size: 3916 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/powerpc/include/asm/atomic.h |   83 +++++++++++++++++++++++++++++++++-----
 1 file changed, 74 insertions(+), 9 deletions(-)

--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -78,21 +78,53 @@ static inline int atomic_##op##_return_r
 	return t;							\
 }
 
+#define ATOMIC_FETCH_OP_RELAXED(op, asm_op)				\
+static inline int atomic_fetch_##op##_relaxed(int a, atomic_t *v)	\
+{									\
+	int res, t;							\
+									\
+	__asm__ __volatile__(						\
+"1:	lwarx	%0,0,%4		# atomic_fetch_" #op "_relaxed\n"	\
+	#asm_op " %1,%3,%0\n"						\
+	PPC405_ERR77(0, %4)						\
+"	stwcx.	%1,0,%4\n"						\
+"	bne-	1b\n"							\
+	: "=&r" (res), "=&r" (t), "+m" (v->counter)			\
+	: "r" (a), "r" (&v->counter)					\
+	: "cc");							\
+									\
+	return res;							\
+}
+
 #define ATOMIC_OPS(op, asm_op)						\
 	ATOMIC_OP(op, asm_op)						\
-	ATOMIC_OP_RETURN_RELAXED(op, asm_op)
+	ATOMIC_OP_RETURN_RELAXED(op, asm_op)				\
+	ATOMIC_FETCH_OP_RELAXED(op, asm_op)
 
 ATOMIC_OPS(add, add)
 ATOMIC_OPS(sub, subf)
 
-ATOMIC_OP(and, and)
-ATOMIC_OP(or, or)
-ATOMIC_OP(xor, xor)
-
 #define atomic_add_return_relaxed atomic_add_return_relaxed
 #define atomic_sub_return_relaxed atomic_sub_return_relaxed
 
+#define atomic_fetch_add_relaxed atomic_fetch_add_relaxed
+#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, asm_op)						\
+	ATOMIC_OP(op, asm_op)						\
+	ATOMIC_FETCH_OP_RELAXED(op, asm_op)
+
+ATOMIC_OPS(and, and)
+ATOMIC_OPS(or, or)
+ATOMIC_OPS(xor, xor)
+
+#define atomic_fetch_and_relaxed atomic_fetch_and_relaxed
+#define atomic_fetch_or_relaxed  atomic_fetch_or_relaxed
+#define atomic_fetch_xor_relaxed atomic_fetch_xor_relaxed
+
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP_RELAXED
 #undef ATOMIC_OP_RETURN_RELAXED
 #undef ATOMIC_OP
 
@@ -329,20 +361,53 @@ atomic64_##op##_return_relaxed(long a, a
 	return t;							\
 }
 
+#define ATOMIC64_FETCH_OP_RELAXED(op, asm_op)				\
+static inline long							\
+atomic64_fetch_##op##_relaxed(long a, atomic64_t *v)			\
+{									\
+	long res, t;							\
+									\
+	__asm__ __volatile__(						\
+"1:	ldarx	%0,0,%4		# atomic64_fetch_" #op "_relaxed\n"	\
+	#asm_op " %1,%3,%0\n"						\
+"	stdcx.	%1,0,%4\n"						\
+"	bne-	1b\n"							\
+	: "=&r" (res), "=&r" (t), "+m" (v->counter)			\
+	: "r" (a), "r" (&v->counter)					\
+	: "cc");							\
+									\
+	return res;							\
+}
+
 #define ATOMIC64_OPS(op, asm_op)					\
 	ATOMIC64_OP(op, asm_op)						\
-	ATOMIC64_OP_RETURN_RELAXED(op, asm_op)
+	ATOMIC64_OP_RETURN_RELAXED(op, asm_op)				\
+	ATOMIC64_FETCH_OP_RELAXED(op, asm_op)
 
 ATOMIC64_OPS(add, add)
 ATOMIC64_OPS(sub, subf)
-ATOMIC64_OP(and, and)
-ATOMIC64_OP(or, or)
-ATOMIC64_OP(xor, xor)
 
 #define atomic64_add_return_relaxed atomic64_add_return_relaxed
 #define atomic64_sub_return_relaxed atomic64_sub_return_relaxed
 
+#define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed
+#define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed
+
+#undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, asm_op)					\
+	ATOMIC64_OP(op, asm_op)						\
+	ATOMIC64_FETCH_OP_RELAXED(op, asm_op)
+
+ATOMIC64_OPS(and, and)
+ATOMIC64_OPS(or, or)
+ATOMIC64_OPS(xor, xor)
+
+#define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed
+#define atomic64_fetch_or_relaxed  atomic64_fetch_or_relaxed
+#define atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed
+
 #undef ATOPIC64_OPS
+#undef ATOMIC64_FETCH_OP_RELAXED
 #undef ATOMIC64_OP_RETURN_RELAXED
 #undef ATOMIC64_OP
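
Only the _relaxed forms are provided here; the generic atomic code then
derives the fully ordered (and acquire/release) variants by bracketing
them with the appropriate barriers, along the lines of the sketch below
(illustrative, with a made-up macro name rather than the exact helper in
<linux/atomic.h>):

	#define my_atomic_op_fence(op, args...)				\
	({								\
		typeof(op##_relaxed(args)) __ret;			\
		smp_mb__before_atomic();				\
		__ret = op##_relaxed(args);				\
		smp_mb__after_atomic();					\
		__ret;							\
	})

	/* e.g. atomic64_fetch_add(i, v) then becomes
	 * my_atomic_op_fence(atomic64_fetch_add, i, v) */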
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 20/33] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (18 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 19/33] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 21/33] locking,sh: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (14 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-s390.patch --]
[-- Type: text/plain, Size: 3880 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/s390/include/asm/atomic.h |   42 +++++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)

--- a/arch/s390/include/asm/atomic.h
+++ b/arch/s390/include/asm/atomic.h
@@ -93,6 +93,11 @@ static inline int atomic_add_return(int
 	return __ATOMIC_LOOP(v, i, __ATOMIC_ADD, __ATOMIC_BARRIER) + i;
 }
 
+static inline int atomic_fetch_add(int i, atomic_t *v)
+{
+	return __ATOMIC_LOOP(v, i, __ATOMIC_ADD, __ATOMIC_BARRIER);
+}
+
 static inline void atomic_add(int i, atomic_t *v)
 {
 #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
@@ -114,22 +119,29 @@ static inline void atomic_add(int i, ato
 #define atomic_inc_and_test(_v)		(atomic_add_return(1, _v) == 0)
 #define atomic_sub(_i, _v)		atomic_add(-(int)(_i), _v)
 #define atomic_sub_return(_i, _v)	atomic_add_return(-(int)(_i), _v)
+#define atomic_fetch_sub(_i, _v)	atomic_fetch_add(-(int)(_i), _v)
 #define atomic_sub_and_test(_i, _v)	(atomic_sub_return(_i, _v) == 0)
 #define atomic_dec(_v)			atomic_sub(1, _v)
 #define atomic_dec_return(_v)		atomic_sub_return(1, _v)
 #define atomic_dec_and_test(_v)		(atomic_sub_return(1, _v) == 0)
 
-#define ATOMIC_OP(op, OP)						\
+#define ATOMIC_OPS(op, OP)						\
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
 	__ATOMIC_LOOP(v, i, __ATOMIC_##OP, __ATOMIC_NO_BARRIER);	\
+}									\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	return __ATOMIC_LOOP(v, i, __ATOMIC_##OP, __ATOMIC_BARRIER);	\
 }
 
-ATOMIC_OP(and, AND)
-ATOMIC_OP(or, OR)
-ATOMIC_OP(xor, XOR)
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, AND)
+ATOMIC_OPS(or, OR)
+ATOMIC_OPS(xor, XOR)
 
-#undef ATOMIC_OP
+#undef ATOMIC_OPS
 
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 
@@ -236,6 +248,11 @@ static inline long long atomic64_add_ret
 	return __ATOMIC64_LOOP(v, i, __ATOMIC64_ADD, __ATOMIC64_BARRIER) + i;
 }
 
+static inline long long atomic64_fetch_add(long long i, atomic64_t *v)
+{
+	return __ATOMIC64_LOOP(v, i, __ATOMIC64_ADD, __ATOMIC64_BARRIER);
+}
+
 static inline void atomic64_add(long long i, atomic64_t *v)
 {
 #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
@@ -264,17 +281,21 @@ static inline long long atomic64_cmpxchg
 	return old;
 }
 
-#define ATOMIC64_OP(op, OP)						\
+#define ATOMIC64_OPS(op, OP)						\
 static inline void atomic64_##op(long i, atomic64_t *v)			\
 {									\
 	__ATOMIC64_LOOP(v, i, __ATOMIC64_##OP, __ATOMIC64_NO_BARRIER);	\
+}									\
+static inline long atomic64_fetch_##op(long i, atomic64_t *v)		\
+{									\
+	return __ATOMIC64_LOOP(v, i, __ATOMIC64_##OP, __ATOMIC64_BARRIER); \
 }
 
-ATOMIC64_OP(and, AND)
-ATOMIC64_OP(or, OR)
-ATOMIC64_OP(xor, XOR)
+ATOMIC64_OPS(and, AND)
+ATOMIC64_OPS(or, OR)
+ATOMIC64_OPS(xor, XOR)
 
-#undef ATOMIC64_OP
+#undef ATOMIC64_OPS
 #undef __ATOMIC64_LOOP
 
 static inline int atomic64_add_unless(atomic64_t *v, long long i, long long u)
@@ -315,6 +336,7 @@ static inline long long atomic64_dec_if_
 #define atomic64_inc_return(_v)		atomic64_add_return(1, _v)
 #define atomic64_inc_and_test(_v)	(atomic64_add_return(1, _v) == 0)
 #define atomic64_sub_return(_i, _v)	atomic64_add_return(-(long long)(_i), _v)
+#define atomic64_fetch_sub(_i, _v)	atomic64_fetch_add(-(long long)(_i), _v)
 #define atomic64_sub(_i, _v)		atomic64_add(-(long long)(_i), _v)
 #define atomic64_sub_and_test(_i, _v)	(atomic64_sub_return(_i, _v) == 0)
 #define atomic64_dec(_v)		atomic64_sub(1, _v)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 21/33] locking,sh: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (19 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 20/33] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 22/33] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (13 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-sh.patch --]
[-- Type: text/plain, Size: 4684 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/sh/include/asm/atomic-grb.h  |   34 ++++++++++++++++++++++++++++++----
 arch/sh/include/asm/atomic-irq.h  |   31 +++++++++++++++++++++++++++----
 arch/sh/include/asm/atomic-llsc.h |   32 ++++++++++++++++++++++++++++----
 arch/sh/include/asm/atomic.h      |    2 ++
 4 files changed, 87 insertions(+), 12 deletions(-)

--- a/arch/sh/include/asm/atomic-grb.h
+++ b/arch/sh/include/asm/atomic-grb.h
@@ -43,16 +43,42 @@ static inline int atomic_##op##_return(i
 	return tmp;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int res, tmp;							\
+									\
+	__asm__ __volatile__ (						\
+		"   .align 2              \n\t"				\
+		"   mova    1f,   r0      \n\t" /* r0 = end point */	\
+		"   mov    r15,   r1      \n\t" /* r1 = saved sp */	\
+		"   mov    #-6,   r15     \n\t" /* LOGIN: r15 = size */	\
+		"   mov.l  @%2,   %0      \n\t" /* load old value */	\
+		"   mov     %0,   %1      \n\t" /* save old value */	\
+		" " #op "   %3,   %0      \n\t" /* $op */		\
+		"   mov.l   %0,   @%2     \n\t" /* store new value */	\
+		"1: mov     r1,   r15     \n\t" /* LOGOUT */		\
+		: "=&r" (tmp), "=&r" (res), "+r"  (v)			\
+		: "r"   (i)						\
+		: "memory" , "r0", "r1");				\
+									\
+	return res;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/sh/include/asm/atomic-irq.h
+++ b/arch/sh/include/asm/atomic-irq.h
@@ -33,15 +33,38 @@ static inline int atomic_##op##_return(i
 	return temp;							\
 }
 
-#define ATOMIC_OPS(op, c_op) ATOMIC_OP(op, c_op) ATOMIC_OP_RETURN(op, c_op)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long temp, flags;					\
+									\
+	raw_local_irq_save(flags);					\
+	temp = v->counter;						\
+	v->counter c_op i;						\
+	raw_local_irq_restore(flags);					\
+									\
+	return temp;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_OP_RETURN(op, c_op)					\
+	ATOMIC_FETCH_OP(op, c_op)
 
 ATOMIC_OPS(add, +=)
 ATOMIC_OPS(sub, -=)
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or, |=)
-ATOMIC_OP(xor, ^=)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_FETCH_OP(op, c_op)
+
+ATOMIC_OPS(and, &=)
+ATOMIC_OPS(or, |=)
+ATOMIC_OPS(xor, ^=)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/sh/include/asm/atomic-llsc.h
+++ b/arch/sh/include/asm/atomic-llsc.h
@@ -48,15 +48,39 @@ static inline int atomic_##op##_return(i
 	return temp;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long res, temp;					\
+									\
+	__asm__ __volatile__ (						\
+"1:	movli.l @%3, %0		! atomic_fetch_" #op "	\n"		\
+"	mov %0, %1					\n"		\
+"	" #op "	%2, %0					\n"		\
+"	movco.l	%0, @%3					\n"		\
+"	bf	1b					\n"		\
+"	synco						\n"		\
+	: "=&z" (temp), "=&z" (res)					\
+	: "r" (i), "r" (&v->counter)					\
+	: "t");								\
+									\
+	return res;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/sh/include/asm/atomic.h
+++ b/arch/sh/include/asm/atomic.h
@@ -25,6 +25,8 @@
 #include <asm/atomic-irq.h>
 #endif
 
+#define atomic_fetch_or atomic_fetch_or
+
 #define atomic_add_negative(a, v)	(atomic_add_return((a), (v)) < 0)
 #define atomic_dec_return(v)		atomic_sub_return(1, (v))
 #define atomic_inc_return(v)		atomic_add_return(1, (v))

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 22/33] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (20 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 21/33] locking,sh: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 17:50   ` David Miller
  2016-05-31 10:19 ` [PATCH -v2 23/33] locking,tile: " Peter Zijlstra
                   ` (12 subsequent siblings)
  34 siblings, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-sparc.patch --]
[-- Type: text/plain, Size: 7872 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/sparc/include/asm/atomic.h    |    1 
 arch/sparc/include/asm/atomic_32.h |   15 +++++++--
 arch/sparc/include/asm/atomic_64.h |   16 +++++++--
 arch/sparc/lib/atomic32.c          |   29 ++++++++++-------
 arch/sparc/lib/atomic_64.S         |   61 ++++++++++++++++++++++++++++++-------
 arch/sparc/lib/ksyms.c             |   17 +++++++---
 6 files changed, 105 insertions(+), 34 deletions(-)

--- a/arch/sparc/include/asm/atomic.h
+++ b/arch/sparc/include/asm/atomic.h
@@ -5,4 +5,5 @@
 #else
 #include <asm/atomic_32.h>
 #endif
+#define atomic_fetch_or atomic_fetch_or
 #endif
--- a/arch/sparc/include/asm/atomic_32.h
+++ b/arch/sparc/include/asm/atomic_32.h
@@ -20,9 +20,10 @@
 #define ATOMIC_INIT(i)  { (i) }
 
 int atomic_add_return(int, atomic_t *);
-void atomic_and(int, atomic_t *);
-void atomic_or(int, atomic_t *);
-void atomic_xor(int, atomic_t *);
+int atomic_fetch_add(int, atomic_t *);
+int atomic_fetch_and(int, atomic_t *);
+int atomic_fetch_or(int, atomic_t *);
+int atomic_fetch_xor(int, atomic_t *);
 int atomic_cmpxchg(atomic_t *, int, int);
 int atomic_xchg(atomic_t *, int);
 int __atomic_add_unless(atomic_t *, int, int);
@@ -35,7 +36,15 @@ void atomic_set(atomic_t *, int);
 #define atomic_inc(v)		((void)atomic_add_return(        1, (v)))
 #define atomic_dec(v)		((void)atomic_add_return(       -1, (v)))
 
+#define atomic_fetch_or	atomic_fetch_or
+
+#define atomic_and(i, v)	((void)atomic_fetch_and((i), (v)))
+#define atomic_or(i, v)		((void)atomic_fetch_or((i), (v)))
+#define atomic_xor(i, v)	((void)atomic_fetch_xor((i), (v)))
+
 #define atomic_sub_return(i, v)	(atomic_add_return(-(int)(i), (v)))
+#define atomic_fetch_sub(i, v)  (atomic_fetch_add (-(int)(i), (v)))
+
 #define atomic_inc_return(v)	(atomic_add_return(        1, (v)))
 #define atomic_dec_return(v)	(atomic_add_return(       -1, (v)))
 
--- a/arch/sparc/include/asm/atomic_64.h
+++ b/arch/sparc/include/asm/atomic_64.h
@@ -28,16 +28,24 @@ void atomic64_##op(long, atomic64_t *);
 int atomic_##op##_return(int, atomic_t *);				\
 long atomic64_##op##_return(long, atomic64_t *);
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+int atomic_fetch_##op(int, atomic_t *);					\
+long atomic64_fetch_##op(long, atomic64_t *);
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/sparc/lib/atomic32.c
+++ b/arch/sparc/lib/atomic32.c
@@ -27,39 +27,44 @@ static DEFINE_SPINLOCK(dummy);
 
 #endif /* SMP */
 
-#define ATOMIC_OP_RETURN(op, c_op)					\
-int atomic_##op##_return(int i, atomic_t *v)				\
+#define ATOMIC_FETCH_OP(op, c_op)					\
+int atomic_fetch_##op(int i, atomic_t *v)				\
 {									\
 	int ret;							\
 	unsigned long flags;						\
 	spin_lock_irqsave(ATOMIC_HASH(v), flags);			\
 									\
-	ret = (v->counter c_op i);					\
+	ret = v->counter;						\
+	v->counter c_op i;						\
 									\
 	spin_unlock_irqrestore(ATOMIC_HASH(v), flags);			\
 	return ret;							\
 }									\
-EXPORT_SYMBOL(atomic_##op##_return);
+EXPORT_SYMBOL(atomic_fetch_##op);
 
-#define ATOMIC_OP(op, c_op)						\
-void atomic_##op(int i, atomic_t *v)					\
+#define ATOMIC_OP_RETURN(op, c_op)					\
+int atomic_##op##_return(int i, atomic_t *v)				\
 {									\
+	int ret;							\
 	unsigned long flags;						\
 	spin_lock_irqsave(ATOMIC_HASH(v), flags);			\
 									\
-	v->counter c_op i;						\
+	ret = (v->counter c_op i);					\
 									\
 	spin_unlock_irqrestore(ATOMIC_HASH(v), flags);			\
+	return ret;							\
 }									\
-EXPORT_SYMBOL(atomic_##op);
+EXPORT_SYMBOL(atomic_##op##_return);
 
 ATOMIC_OP_RETURN(add, +=)
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or, |=)
-ATOMIC_OP(xor, ^=)
 
+ATOMIC_FETCH_OP(add, +=)
+ATOMIC_FETCH_OP(and, &=)
+ATOMIC_FETCH_OP(or, |=)
+ATOMIC_FETCH_OP(xor, ^=)
+
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
-#undef ATOMIC_OP
 
 int atomic_xchg(atomic_t *v, int new)
 {
--- a/arch/sparc/lib/atomic_64.S
+++ b/arch/sparc/lib/atomic_64.S
@@ -9,10 +9,11 @@
 
 	.text
 
-	/* Two versions of the atomic routines, one that
+	/* Three versions of the atomic routines, one that
 	 * does not return a value and does not perform
-	 * memory barriers, and a second which returns
-	 * a value and does the barriers.
+	 * memory barriers, and two which return
+	 * a value, the new and the old value respectively, and
+	 * do the barriers.
 	 */
 
 #define ATOMIC_OP(op)							\
@@ -43,15 +44,34 @@ ENTRY(atomic_##op##_return) /* %o0 = inc
 2:	BACKOFF_SPIN(%o2, %o3, 1b);					\
 ENDPROC(atomic_##op##_return);
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+ENTRY(atomic_fetch_##op) /* %o0 = increment, %o1 = atomic_ptr */	\
+	BACKOFF_SETUP(%o2);						\
+1:	lduw	[%o1], %g1;						\
+	op	%g1, %o0, %g7;						\
+	cas	[%o1], %g1, %g7;					\
+	cmp	%g1, %g7;						\
+	bne,pn	%icc, BACKOFF_LABEL(2f, 1b);				\
+	 nop;								\
+	retl;								\
+	 sra	%g1, 0, %o0;						\
+2:	BACKOFF_SPIN(%o2, %o3, 1b);					\
+ENDPROC(atomic_fetch_##op);
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -83,15 +103,34 @@ ENTRY(atomic64_##op##_return) /* %o0 = i
 2:	BACKOFF_SPIN(%o2, %o3, 1b);					\
 ENDPROC(atomic64_##op##_return);
 
-#define ATOMIC64_OPS(op) ATOMIC64_OP(op) ATOMIC64_OP_RETURN(op)
+#define ATOMIC64_FETCH_OP(op)						\
+ENTRY(atomic64_fetch_##op) /* %o0 = increment, %o1 = atomic_ptr */	\
+	BACKOFF_SETUP(%o2);						\
+1:	ldx	[%o1], %g1;						\
+	op	%g1, %o0, %g7;						\
+	casx	[%o1], %g1, %g7;					\
+	cmp	%g1, %g7;						\
+	bne,pn	%xcc, BACKOFF_LABEL(2f, 1b);				\
+	 nop;								\
+	retl;								\
+	 mov	%g1, %o0;						\
+2:	BACKOFF_SPIN(%o2, %o3, 1b);					\
+ENDPROC(atomic64_fetch_##op);
+
+#define ATOMIC64_OPS(op) ATOMIC64_OP(op) ATOMIC64_OP_RETURN(op) ATOMIC64_FETCH_OP(op)
 
 ATOMIC64_OPS(add)
 ATOMIC64_OPS(sub)
-ATOMIC64_OP(and)
-ATOMIC64_OP(or)
-ATOMIC64_OP(xor)
 
 #undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op) ATOMIC64_OP(op) ATOMIC64_FETCH_OP(op)
+
+ATOMIC64_OPS(and)
+ATOMIC64_OPS(or)
+ATOMIC64_OPS(xor)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 
--- a/arch/sparc/lib/ksyms.c
+++ b/arch/sparc/lib/ksyms.c
@@ -107,15 +107,24 @@ EXPORT_SYMBOL(atomic64_##op);
 EXPORT_SYMBOL(atomic_##op##_return);					\
 EXPORT_SYMBOL(atomic64_##op##_return);
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+EXPORT_SYMBOL(atomic_fetch_##op);					\
+EXPORT_SYMBOL(atomic64_fetch_##op);
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 23/33] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (21 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 22/33] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 24/33] locking,x86: " Peter Zijlstra
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-tile.patch --]
[-- Type: text/plain, Size: 16721 bytes --]

Implement FETCH-OP atomic primitives. These are very similar to the
existing OP-RETURN primitives, except that they return the value of
the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- because the prior state cannot be reconstructed from the
result of the operation.
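
As a concrete illustration (a sketch of mine, not part of this patch,
assuming only <linux/atomic.h>): a fetch primitive is what allows a
lockless test-and-set style claim, which an OP-RETURN primitive cannot
express because the returned (new) value always has the bit set:

/* Sketch only; try_claim() and its arguments are made-up names. */
static inline bool try_claim(atomic_t *flags, int mask)
{
	int old = atomic_fetch_or(mask, flags);	/* value before the OR */

	return !(old & mask);	/* true iff this caller set the bit(s) */
}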

Acked-by: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/tile/include/asm/atomic.h    |    4 +
 arch/tile/include/asm/atomic_32.h |   60 +++++++++++++------
 arch/tile/include/asm/atomic_64.h |  115 +++++++++++++++++++++++++-------------
 arch/tile/include/asm/bitops_32.h |   18 ++---
 arch/tile/lib/atomic_32.c         |   42 ++++++-------
 arch/tile/lib/atomic_asm_32.S     |   14 ++--
 6 files changed, 159 insertions(+), 94 deletions(-)

--- a/arch/tile/include/asm/atomic.h
+++ b/arch/tile/include/asm/atomic.h
@@ -46,6 +46,10 @@ static inline int atomic_read(const atom
  */
 #define atomic_sub_return(i, v)		atomic_add_return((int)(-(i)), (v))
 
+#define atomic_fetch_sub(i, v)		atomic_fetch_add(-(int)(i), (v))
+
+#define atomic_fetch_or atomic_fetch_or
+
 /**
  * atomic_sub - subtract integer from atomic variable
  * @i: integer value to subtract
--- a/arch/tile/include/asm/atomic_32.h
+++ b/arch/tile/include/asm/atomic_32.h
@@ -34,18 +34,29 @@ static inline void atomic_add(int i, ato
 	_atomic_xchg_add(&v->counter, i);
 }
 
-#define ATOMIC_OP(op)							\
-unsigned long _atomic_##op(volatile unsigned long *p, unsigned long mask); \
+#define ATOMIC_OPS(op)							\
+unsigned long _atomic_fetch_##op(volatile unsigned long *p, unsigned long mask); \
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
-	_atomic_##op((unsigned long *)&v->counter, i);			\
+	_atomic_fetch_##op((unsigned long *)&v->counter, i);		\
+}									\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	smp_mb();							\
+	return _atomic_fetch_##op((unsigned long *)&v->counter, i);	\
 }
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
+
+#undef ATOMIC_OPS
 
-#undef ATOMIC_OP
+static inline int atomic_fetch_add(int i, atomic_t *v)
+{
+	smp_mb();
+	return _atomic_xchg_add(&v->counter, i);
+}
 
 /**
  * atomic_add_return - add integer and return
@@ -126,17 +137,30 @@ static inline void atomic64_add(long lon
 	_atomic64_xchg_add(&v->counter, i);
 }
 
-#define ATOMIC64_OP(op)						\
-long long _atomic64_##op(long long *v, long long n);		\
+#define ATOMIC64_OPS(op)					\
+long long _atomic64_fetch_##op(long long *v, long long n);	\
+static inline void atomic64_##op(long long i, atomic64_t *v)	\
+{								\
+	_atomic64_fetch_##op(&v->counter, i);			\
+}								\
-static inline void atomic64_##op(long long i, atomic64_t *v)	\
+static inline long long atomic64_fetch_##op(long long i, atomic64_t *v) \
 {								\
-	_atomic64_##op(&v->counter, i);				\
+	smp_mb();						\
+	return _atomic64_fetch_##op(&v->counter, i);		\
 }
 
-ATOMIC64_OP(and)
-ATOMIC64_OP(or)
-ATOMIC64_OP(xor)
+ATOMIC64_OPS(and)
+ATOMIC64_OPS(or)
+ATOMIC64_OPS(xor)
 
+#undef ATOMIC64_OPS
+
+static inline long long atomic64_fetch_add(long long i, atomic64_t *v)
+{
+	smp_mb();
+	return _atomic64_xchg_add(&v->counter, i);
+}
+
 /**
  * atomic64_add_return - add integer and return
  * @v: pointer of type atomic64_t
@@ -186,6 +210,7 @@ static inline void atomic64_set(atomic64
 #define atomic64_inc_return(v)		atomic64_add_return(1LL, (v))
 #define atomic64_inc_and_test(v)	(atomic64_inc_return(v) == 0)
 #define atomic64_sub_return(i, v)	atomic64_add_return(-(i), (v))
+#define atomic64_fetch_sub(i, v)	atomic64_fetch_add(-(i), (v))
 #define atomic64_sub_and_test(a, v)	(atomic64_sub_return((a), (v)) == 0)
 #define atomic64_sub(i, v)		atomic64_add(-(i), (v))
 #define atomic64_dec(v)			atomic64_sub(1LL, (v))
@@ -193,7 +218,6 @@ static inline void atomic64_set(atomic64
 #define atomic64_dec_and_test(v)	(atomic64_dec_return((v)) == 0)
 #define atomic64_inc_not_zero(v)	atomic64_add_unless((v), 1LL, 0LL)
 
-
 #endif /* !__ASSEMBLY__ */
 
 /*
@@ -248,10 +272,10 @@ extern struct __get_user __atomic_xchg(v
 extern struct __get_user __atomic_xchg_add(volatile int *p, int *lock, int n);
 extern struct __get_user __atomic_xchg_add_unless(volatile int *p,
 						  int *lock, int o, int n);
-extern struct __get_user __atomic_or(volatile int *p, int *lock, int n);
-extern struct __get_user __atomic_and(volatile int *p, int *lock, int n);
-extern struct __get_user __atomic_andn(volatile int *p, int *lock, int n);
-extern struct __get_user __atomic_xor(volatile int *p, int *lock, int n);
+extern struct __get_user __atomic_fetch_or(volatile int *p, int *lock, int n);
+extern struct __get_user __atomic_fetch_and(volatile int *p, int *lock, int n);
+extern struct __get_user __atomic_fetch_andn(volatile int *p, int *lock, int n);
+extern struct __get_user __atomic_fetch_xor(volatile int *p, int *lock, int n);
 extern long long __atomic64_cmpxchg(volatile long long *p, int *lock,
 					long long o, long long n);
 extern long long __atomic64_xchg(volatile long long *p, int *lock, long long n);
@@ -259,9 +283,9 @@ extern long long __atomic64_xchg_add(vol
 					long long n);
 extern long long __atomic64_xchg_add_unless(volatile long long *p,
 					int *lock, long long o, long long n);
-extern long long __atomic64_and(volatile long long *p, int *lock, long long n);
-extern long long __atomic64_or(volatile long long *p, int *lock, long long n);
-extern long long __atomic64_xor(volatile long long *p, int *lock, long long n);
+extern long long __atomic64_fetch_and(volatile long long *p, int *lock, long long n);
+extern long long __atomic64_fetch_or(volatile long long *p, int *lock, long long n);
+extern long long __atomic64_fetch_xor(volatile long long *p, int *lock, long long n);
 
 /* Return failure from the atomic wrappers. */
 struct __get_user __atomic_bad_address(int __user *addr);
--- a/arch/tile/include/asm/atomic_64.h
+++ b/arch/tile/include/asm/atomic_64.h
@@ -32,11 +32,6 @@
  * on any routine which updates memory and returns a value.
  */
 
-static inline void atomic_add(int i, atomic_t *v)
-{
-	__insn_fetchadd4((void *)&v->counter, i);
-}
-
 /*
  * Note a subtlety of the locking here.  We are required to provide a
  * full memory barrier before and after the operation.  However, we
@@ -59,28 +54,39 @@ static inline int atomic_add_return(int
 	return val;
 }
 
-static inline int __atomic_add_unless(atomic_t *v, int a, int u)
+#define ATOMIC_OPS(op)							\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int val;							\
+	smp_mb();							\
+	val = __insn_fetch##op##4((void *)&v->counter, i);		\
+	smp_mb();							\
+	return val;							\
+}									\
+static inline void atomic_##op(int i, atomic_t *v)			\
+{									\
+	__insn_fetch##op##4((void *)&v->counter, i);			\
+}
+
+ATOMIC_OPS(add)
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+
+#undef ATOMIC_OPS
+
+static inline int atomic_fetch_xor(int i, atomic_t *v)
 {
 	int guess, oldval = v->counter;
+	smp_mb();
 	do {
-		if (oldval == u)
-			break;
 		guess = oldval;
-		oldval = cmpxchg(&v->counter, guess, guess + a);
+		__insn_mtspr(SPR_CMPEXCH_VALUE, guess);
+		oldval = __insn_cmpexch4(&v->counter, guess ^ i);
 	} while (guess != oldval);
+	smp_mb();
 	return oldval;
 }
 
-static inline void atomic_and(int i, atomic_t *v)
-{
-	__insn_fetchand4((void *)&v->counter, i);
-}
-
-static inline void atomic_or(int i, atomic_t *v)
-{
-	__insn_fetchor4((void *)&v->counter, i);
-}
-
 static inline void atomic_xor(int i, atomic_t *v)
 {
 	int guess, oldval = v->counter;
@@ -91,6 +97,18 @@ static inline void atomic_xor(int i, ato
 	} while (guess != oldval);
 }
 
+static inline int __atomic_add_unless(atomic_t *v, int a, int u)
+{
+	int guess, oldval = v->counter;
+	do {
+		if (oldval == u)
+			break;
+		guess = oldval;
+		oldval = cmpxchg(&v->counter, guess, guess + a);
+	} while (guess != oldval);
+	return oldval;
+}
+
 /* Now the true 64-bit operations. */
 
 #define ATOMIC64_INIT(i)	{ (i) }
@@ -98,11 +116,6 @@ static inline void atomic_xor(int i, ato
 #define atomic64_read(v)	READ_ONCE((v)->counter)
 #define atomic64_set(v, i)	WRITE_ONCE((v)->counter, (i))
 
-static inline void atomic64_add(long i, atomic64_t *v)
-{
-	__insn_fetchadd((void *)&v->counter, i);
-}
-
 static inline long atomic64_add_return(long i, atomic64_t *v)
 {
 	int val;
@@ -112,26 +125,37 @@ static inline long atomic64_add_return(l
 	return val;
 }
 
-static inline long atomic64_add_unless(atomic64_t *v, long a, long u)
+#define ATOMIC64_OPS(op)						\
+static inline long atomic64_fetch_##op(long i, atomic64_t *v)		\
+{									\
+	long val;							\
+	smp_mb();							\
+	val = __insn_fetch##op((void *)&v->counter, i);			\
+	smp_mb();							\
+	return val;							\
+}									\
+static inline void atomic64_##op(long i, atomic64_t *v)			\
+{									\
+	__insn_fetch##op((void *)&v->counter, i);			\
+}
+
+ATOMIC64_OPS(add)
+ATOMIC64_OPS(and)
+ATOMIC64_OPS(or)
+
+#undef ATOMIC64_OPS
+
+static inline long atomic64_fetch_xor(long i, atomic64_t *v)
 {
 	long guess, oldval = v->counter;
+	smp_mb();
 	do {
-		if (oldval == u)
-			break;
 		guess = oldval;
-		oldval = cmpxchg(&v->counter, guess, guess + a);
+		__insn_mtspr(SPR_CMPEXCH_VALUE, guess);
+		oldval = __insn_cmpexch(&v->counter, guess ^ i);
 	} while (guess != oldval);
-	return oldval != u;
-}
-
-static inline void atomic64_and(long i, atomic64_t *v)
-{
-	__insn_fetchand((void *)&v->counter, i);
-}
-
-static inline void atomic64_or(long i, atomic64_t *v)
-{
-	__insn_fetchor((void *)&v->counter, i);
+	smp_mb();
+	return oldval;
 }
 
 static inline void atomic64_xor(long i, atomic64_t *v)
@@ -144,7 +168,20 @@ static inline void atomic64_xor(long i,
 	} while (guess != oldval);
 }
 
+static inline long atomic64_add_unless(atomic64_t *v, long a, long u)
+{
+	long guess, oldval = v->counter;
+	do {
+		if (oldval == u)
+			break;
+		guess = oldval;
+		oldval = cmpxchg(&v->counter, guess, guess + a);
+	} while (guess != oldval);
+	return oldval != u;
+}
+
 #define atomic64_sub_return(i, v)	atomic64_add_return(-(i), (v))
+#define atomic64_fetch_sub(i, v)	atomic64_fetch_add(-(i), (v))
 #define atomic64_sub(i, v)		atomic64_add(-(i), (v))
 #define atomic64_inc_return(v)		atomic64_add_return(1, (v))
 #define atomic64_dec_return(v)		atomic64_sub_return(1, (v))
--- a/arch/tile/include/asm/bitops_32.h
+++ b/arch/tile/include/asm/bitops_32.h
@@ -19,9 +19,9 @@
 #include <asm/barrier.h>
 
 /* Tile-specific routines to support <asm/bitops.h>. */
-unsigned long _atomic_or(volatile unsigned long *p, unsigned long mask);
-unsigned long _atomic_andn(volatile unsigned long *p, unsigned long mask);
-unsigned long _atomic_xor(volatile unsigned long *p, unsigned long mask);
+unsigned long _atomic_fetch_or(volatile unsigned long *p, unsigned long mask);
+unsigned long _atomic_fetch_andn(volatile unsigned long *p, unsigned long mask);
+unsigned long _atomic_fetch_xor(volatile unsigned long *p, unsigned long mask);
 
 /**
  * set_bit - Atomically set a bit in memory
@@ -35,7 +35,7 @@ unsigned long _atomic_xor(volatile unsig
  */
 static inline void set_bit(unsigned nr, volatile unsigned long *addr)
 {
-	_atomic_or(addr + BIT_WORD(nr), BIT_MASK(nr));
+	_atomic_fetch_or(addr + BIT_WORD(nr), BIT_MASK(nr));
 }
 
 /**
@@ -54,7 +54,7 @@ static inline void set_bit(unsigned nr,
  */
 static inline void clear_bit(unsigned nr, volatile unsigned long *addr)
 {
-	_atomic_andn(addr + BIT_WORD(nr), BIT_MASK(nr));
+	_atomic_fetch_andn(addr + BIT_WORD(nr), BIT_MASK(nr));
 }
 
 /**
@@ -69,7 +69,7 @@ static inline void clear_bit(unsigned nr
  */
 static inline void change_bit(unsigned nr, volatile unsigned long *addr)
 {
-	_atomic_xor(addr + BIT_WORD(nr), BIT_MASK(nr));
+	_atomic_fetch_xor(addr + BIT_WORD(nr), BIT_MASK(nr));
 }
 
 /**
@@ -85,7 +85,7 @@ static inline int test_and_set_bit(unsig
 	unsigned long mask = BIT_MASK(nr);
 	addr += BIT_WORD(nr);
 	smp_mb();  /* barrier for proper semantics */
-	return (_atomic_or(addr, mask) & mask) != 0;
+	return (_atomic_fetch_or(addr, mask) & mask) != 0;
 }
 
 /**
@@ -101,7 +101,7 @@ static inline int test_and_clear_bit(uns
 	unsigned long mask = BIT_MASK(nr);
 	addr += BIT_WORD(nr);
 	smp_mb();  /* barrier for proper semantics */
-	return (_atomic_andn(addr, mask) & mask) != 0;
+	return (_atomic_fetch_andn(addr, mask) & mask) != 0;
 }
 
 /**
@@ -118,7 +118,7 @@ static inline int test_and_change_bit(un
 	unsigned long mask = BIT_MASK(nr);
 	addr += BIT_WORD(nr);
 	smp_mb();  /* barrier for proper semantics */
-	return (_atomic_xor(addr, mask) & mask) != 0;
+	return (_atomic_fetch_xor(addr, mask) & mask) != 0;
 }
 
 #include <asm-generic/bitops/ext2-atomic.h>
--- a/arch/tile/lib/atomic_32.c
+++ b/arch/tile/lib/atomic_32.c
@@ -88,29 +88,29 @@ int _atomic_cmpxchg(int *v, int o, int n
 }
 EXPORT_SYMBOL(_atomic_cmpxchg);
 
-unsigned long _atomic_or(volatile unsigned long *p, unsigned long mask)
+unsigned long _atomic_fetch_or(volatile unsigned long *p, unsigned long mask)
 {
-	return __atomic_or((int *)p, __atomic_setup(p), mask).val;
+	return __atomic_fetch_or((int *)p, __atomic_setup(p), mask).val;
 }
-EXPORT_SYMBOL(_atomic_or);
+EXPORT_SYMBOL(_atomic_fetch_or);
 
-unsigned long _atomic_and(volatile unsigned long *p, unsigned long mask)
+unsigned long _atomic_fetch_and(volatile unsigned long *p, unsigned long mask)
 {
-	return __atomic_and((int *)p, __atomic_setup(p), mask).val;
+	return __atomic_fetch_and((int *)p, __atomic_setup(p), mask).val;
 }
-EXPORT_SYMBOL(_atomic_and);
+EXPORT_SYMBOL(_atomic_fetch_and);
 
-unsigned long _atomic_andn(volatile unsigned long *p, unsigned long mask)
+unsigned long _atomic_fetch_andn(volatile unsigned long *p, unsigned long mask)
 {
-	return __atomic_andn((int *)p, __atomic_setup(p), mask).val;
+	return __atomic_fetch_andn((int *)p, __atomic_setup(p), mask).val;
 }
-EXPORT_SYMBOL(_atomic_andn);
+EXPORT_SYMBOL(_atomic_fetch_andn);
 
-unsigned long _atomic_xor(volatile unsigned long *p, unsigned long mask)
+unsigned long _atomic_fetch_xor(volatile unsigned long *p, unsigned long mask)
 {
-	return __atomic_xor((int *)p, __atomic_setup(p), mask).val;
+	return __atomic_fetch_xor((int *)p, __atomic_setup(p), mask).val;
 }
-EXPORT_SYMBOL(_atomic_xor);
+EXPORT_SYMBOL(_atomic_fetch_xor);
 
 
 long long _atomic64_xchg(long long *v, long long n)
@@ -142,23 +142,23 @@ long long _atomic64_cmpxchg(long long *v
 }
 EXPORT_SYMBOL(_atomic64_cmpxchg);
 
-long long _atomic64_and(long long *v, long long n)
+long long _atomic64_fetch_and(long long *v, long long n)
 {
-	return __atomic64_and(v, __atomic_setup(v), n);
+	return __atomic64_fetch_and(v, __atomic_setup(v), n);
 }
-EXPORT_SYMBOL(_atomic64_and);
+EXPORT_SYMBOL(_atomic64_fetch_and);
 
-long long _atomic64_or(long long *v, long long n)
+long long _atomic64_fetch_or(long long *v, long long n)
 {
-	return __atomic64_or(v, __atomic_setup(v), n);
+	return __atomic64_fetch_or(v, __atomic_setup(v), n);
 }
-EXPORT_SYMBOL(_atomic64_or);
+EXPORT_SYMBOL(_atomic64_fetch_or);
 
-long long _atomic64_xor(long long *v, long long n)
+long long _atomic64_fetch_xor(long long *v, long long n)
 {
-	return __atomic64_xor(v, __atomic_setup(v), n);
+	return __atomic64_fetch_xor(v, __atomic_setup(v), n);
 }
-EXPORT_SYMBOL(_atomic64_xor);
+EXPORT_SYMBOL(_atomic64_fetch_xor);
 
 /*
  * If any of the atomic or futex routines hit a bad address (not in
--- a/arch/tile/lib/atomic_asm_32.S
+++ b/arch/tile/lib/atomic_asm_32.S
@@ -177,10 +177,10 @@ atomic_op _xchg, 32, "move r24, r2"
 atomic_op _xchg_add, 32, "add r24, r22, r2"
 atomic_op _xchg_add_unless, 32, \
 	"sne r26, r22, r2; { bbns r26, 3f; add r24, r22, r3 }"
-atomic_op _or, 32, "or r24, r22, r2"
-atomic_op _and, 32, "and r24, r22, r2"
-atomic_op _andn, 32, "nor r2, r2, zero; and r24, r22, r2"
-atomic_op _xor, 32, "xor r24, r22, r2"
+atomic_op _fetch_or, 32, "or r24, r22, r2"
+atomic_op _fetch_and, 32, "and r24, r22, r2"
+atomic_op _fetch_andn, 32, "nor r2, r2, zero; and r24, r22, r2"
+atomic_op _fetch_xor, 32, "xor r24, r22, r2"
 
 atomic_op 64_cmpxchg, 64, "{ seq r26, r22, r2; seq r27, r23, r3 }; \
 	{ bbns r26, 3f; move r24, r4 }; { bbns r27, 3f; move r25, r5 }"
@@ -192,9 +192,9 @@ atomic_op 64_xchg_add_unless, 64, \
 	{ bbns r26, 3f; add r24, r22, r4 }; \
 	{ bbns r27, 3f; add r25, r23, r5 }; \
 	slt_u r26, r24, r22; add r25, r25, r26"
-atomic_op 64_or, 64, "{ or r24, r22, r2; or r25, r23, r3 }"
-atomic_op 64_and, 64, "{ and r24, r22, r2; and r25, r23, r3 }"
-atomic_op 64_xor, 64, "{ xor r24, r22, r2; xor r25, r23, r3 }"
+atomic_op 64_fetch_or, 64, "{ or r24, r22, r2; or r25, r23, r3 }"
+atomic_op 64_fetch_and, 64, "{ and r24, r22, r2; and r25, r23, r3 }"
+atomic_op 64_fetch_xor, 64, "{ xor r24, r22, r2; xor r25, r23, r3 }"
 
 	jrp     lr              /* happy backtracer */
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 24/33] locking,x86: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (22 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 23/33] locking,tile: " Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 25/33] locking,xtensa: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-x86.patch --]
[-- Type: text/plain, Size: 4134 bytes --]

Implement FETCH-OP atomic primitives. These are very similar to the
existing OP-RETURN primitives, except that they return the value of
the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- because the prior state cannot be reconstructed from the
result of the operation.
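
For a sense of the usage this enables (a sketch of mine, not from the
patch): on x86 the add/sub variants map to XADD, which hands back the
pre-add value -- exactly what a ticket-style index allocator wants:

/* Sketch only; reserve_slot() and its arguments are made-up names. */
static unsigned int reserve_slot(atomic_t *tail, unsigned int nr_slots)
{
	unsigned int idx = atomic_fetch_add(1, tail); /* value before the add */

	return idx % nr_slots;	/* atomic_add_return() would give idx + 1 */
}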

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/atomic.h      |   37 ++++++++++++++++++++++++++++++++++---
 arch/x86/include/asm/atomic64_32.h |   25 ++++++++++++++++++++++---
 arch/x86/include/asm/atomic64_64.h |   35 ++++++++++++++++++++++++++++++++---
 3 files changed, 88 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -171,6 +171,16 @@ static __always_inline int atomic_sub_re
 #define atomic_inc_return(v)  (atomic_add_return(1, v))
 #define atomic_dec_return(v)  (atomic_sub_return(1, v))
 
+static __always_inline int atomic_fetch_add(int i, atomic_t *v)
+{
+	return xadd(&v->counter, i);
+}
+
+static __always_inline int atomic_fetch_sub(int i, atomic_t *v)
+{
+	return xadd(&v->counter, -i);
+}
+
 static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
 {
 	return cmpxchg(&v->counter, old, new);
@@ -190,10 +200,31 @@ static inline void atomic_##op(int i, at
 			: "memory");					\
 }
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	int old, val = atomic_read(v);					\
+	for (;;) {							\
+		old = atomic_cmpxchg(v, val, val c_op i);		\
+		if (old == val)						\
+			break;						\
+		val = old;						\
+	}								\
+	return old;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op)							\
+	ATOMIC_FETCH_OP(op, c_op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, &)
+ATOMIC_OPS(or , |)
+ATOMIC_OPS(xor, ^)
 
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP
 
 /**
--- a/arch/x86/include/asm/atomic64_32.h
+++ b/arch/x86/include/asm/atomic64_32.h
@@ -320,10 +320,29 @@ static inline void atomic64_##op(long lo
 		c = old;						\
 }
 
-ATOMIC64_OP(and, &)
-ATOMIC64_OP(or, |)
-ATOMIC64_OP(xor, ^)
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static inline long long atomic64_fetch_##op(long long i, atomic64_t *v)	\
+{									\
+	long long old, c = 0;						\
+	while ((old = atomic64_cmpxchg(v, c, c c_op i)) != c)		\
+		c = old;						\
+	return old;							\
+}
+
+ATOMIC64_FETCH_OP(add, +)
+
+#define atomic64_fetch_sub(i, v)	atomic64_fetch_add(-(i), (v))
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &)
+ATOMIC64_OPS(or, |)
+ATOMIC64_OPS(xor, ^)
 
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP
 
 #endif /* _ASM_X86_ATOMIC64_32_H */
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -158,6 +158,16 @@ static inline long atomic64_sub_return(l
 	return atomic64_add_return(-i, v);
 }
 
+static inline long atomic64_fetch_add(long i, atomic64_t *v)
+{
+	return xadd(&v->counter, i);
+}
+
+static inline long atomic64_fetch_sub(long i, atomic64_t *v)
+{
+	return xadd(&v->counter, -i);
+}
+
 #define atomic64_inc_return(v)  (atomic64_add_return(1, (v)))
 #define atomic64_dec_return(v)  (atomic64_sub_return(1, (v)))
 
@@ -229,10 +239,29 @@ static inline void atomic64_##op(long i,
 			: "memory");					\
 }
 
-ATOMIC64_OP(and)
-ATOMIC64_OP(or)
-ATOMIC64_OP(xor)
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static inline long atomic64_fetch_##op(long i, atomic64_t *v)		\
+{									\
+	long old, val = atomic64_read(v);				\
+	for (;;) {							\
+		old = atomic64_cmpxchg(v, val, val c_op i);		\
+		if (old == val)						\
+			break;						\
+		val = old;						\
+	}								\
+	return old;							\
+}
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op)							\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &)
+ATOMIC64_OPS(or, |)
+ATOMIC64_OPS(xor, ^)
 
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP
 
 #endif /* _ASM_X86_ATOMIC64_64_H */

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 25/33] locking,xtensa: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (23 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 24/33] locking,x86: " Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 26/33] locking: Fix atomic64_relaxed bits Peter Zijlstra
                   ` (9 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-xtensa.patch --]
[-- Type: text/plain, Size: 2476 bytes --]

Implement FETCH-OP atomic primitives. These are very similar to the
existing OP-RETURN primitives, except that they return the value of
the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- because the prior state cannot be reconstructed from the
result of the operation.
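
To make the irreversibility point concrete (illustrative sketch of
mine, not part of this patch): for add the old value can still be
derived from the OP-RETURN result, but for and/or it cannot, so the
old value has to be fetched:

/* Sketch only; the function and variable names are arbitrary. */
static void fetch_vs_return(atomic_t *v, int i, int mask)
{
	int old;

	old = atomic_add_return(i, v) - i;	/* add is reversible */

	/*
	 * Bits cleared by the AND cannot be recomputed from the new
	 * value, so the primitive must return the old value itself.
	 */
	old = atomic_fetch_and(mask, v);
	(void)old;
}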

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/xtensa/include/asm/atomic.h |   54 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 4 deletions(-)

--- a/arch/xtensa/include/asm/atomic.h
+++ b/arch/xtensa/include/asm/atomic.h
@@ -98,6 +98,26 @@ static inline int atomic_##op##_return(i
 	return result;							\
 }
 
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t * v)		\
+{									\
+	unsigned long tmp;						\
+	int result;							\
+									\
+	__asm__ __volatile__(						\
+			"1:     l32i    %1, %3, 0\n"			\
+			"       wsr     %1, scompare1\n"		\
+			"       " #op " %0, %1, %2\n"			\
+			"       s32c1i  %0, %3, 0\n"			\
+			"       bne     %0, %1, 1b\n"			\
+			: "=&a" (result), "=&a" (tmp)			\
+			: "a" (i), "a" (v)				\
+			: "memory"					\
+			);						\
+									\
+	return result;							\
+}
+
 #else /* XCHAL_HAVE_S32C1I */
 
 #define ATOMIC_OP(op)							\
@@ -138,18 +158,44 @@ static inline int atomic_##op##_return(i
 	return vval;							\
 }
 
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t * v)		\
+{									\
+	unsigned int tmp, vval;						\
+									\
+	__asm__ __volatile__(						\
+			"       rsil    a15,"__stringify(TOPLEVEL)"\n"	\
+			"       l32i    %0, %3, 0\n"			\
+			"       " #op " %1, %0, %2\n"			\
+			"       s32i    %1, %3, 0\n"			\
+			"       wsr     a15, ps\n"			\
+			"       rsync\n"				\
+			: "=&a" (vval), "=&a" (tmp)			\
+			: "a" (i), "a" (v)				\
+			: "a15", "memory"				\
+			);						\
+									\
+	return vval;							\
+}
+
 #endif /* XCHAL_HAVE_S32C1I */
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op) ATOMIC_OP_RETURN(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 26/33] locking: Fix atomic64_relaxed bits
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (24 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 25/33] locking,xtensa: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 27/33] locking: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic64-relaxed-fix.patch --]
[-- Type: text/plain, Size: 10246 bytes --]

We should only expand the atomic64 relaxed bits once we've included
all relevant headers. So move them down until after we potentially
include asm-generic/atomic64.h.

In practice this has not made a difference so far, since the generic
bits do not define _relaxed versions.
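
To illustrate the ordering hazard (my sketch, not from the patch): the
fallback expansion only works if it runs after every header that might
supply a _relaxed flavour:

/* Sketch of the intended ordering in <linux/atomic.h>. */

/* 1) pull in whatever may define atomic64_add_return_relaxed ... */
#ifdef CONFIG_GENERIC_ATOMIC64
#include <asm-generic/atomic64.h>
#endif

/* 2) ... and only then expand the fallbacks */
#ifndef atomic64_add_return_relaxed
#define atomic64_add_return_relaxed	atomic64_add_return
#define atomic64_add_return_acquire	atomic64_add_return
#define atomic64_add_return_release	atomic64_add_return
#endif

/*
 * Done the other way around, the fallback would already have bound the
 * _relaxed/_acquire/_release names to the fully ordered op, and a
 * _relaxed implementation provided by the later header would be ignored.
 */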

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/atomic.h |  306 ++++++++++++++++++++++++-------------------------
 1 file changed, 153 insertions(+), 153 deletions(-)

--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -211,159 +211,6 @@
 #endif
 #endif /* atomic_cmpxchg_relaxed */
 
-#ifndef atomic64_read_acquire
-#define  atomic64_read_acquire(v)	smp_load_acquire(&(v)->counter)
-#endif
-
-#ifndef atomic64_set_release
-#define  atomic64_set_release(v, i)	smp_store_release(&(v)->counter, (i))
-#endif
-
-/* atomic64_add_return_relaxed */
-#ifndef atomic64_add_return_relaxed
-#define  atomic64_add_return_relaxed	atomic64_add_return
-#define  atomic64_add_return_acquire	atomic64_add_return
-#define  atomic64_add_return_release	atomic64_add_return
-
-#else /* atomic64_add_return_relaxed */
-
-#ifndef atomic64_add_return_acquire
-#define  atomic64_add_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_add_return_release
-#define  atomic64_add_return_release(...)				\
-	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_add_return
-#define  atomic64_add_return(...)					\
-	__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_add_return_relaxed */
-
-/* atomic64_inc_return_relaxed */
-#ifndef atomic64_inc_return_relaxed
-#define  atomic64_inc_return_relaxed	atomic64_inc_return
-#define  atomic64_inc_return_acquire	atomic64_inc_return
-#define  atomic64_inc_return_release	atomic64_inc_return
-
-#else /* atomic64_inc_return_relaxed */
-
-#ifndef atomic64_inc_return_acquire
-#define  atomic64_inc_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return_release
-#define  atomic64_inc_return_release(...)				\
-	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return
-#define  atomic64_inc_return(...)					\
-	__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_inc_return_relaxed */
-
-
-/* atomic64_sub_return_relaxed */
-#ifndef atomic64_sub_return_relaxed
-#define  atomic64_sub_return_relaxed	atomic64_sub_return
-#define  atomic64_sub_return_acquire	atomic64_sub_return
-#define  atomic64_sub_return_release	atomic64_sub_return
-
-#else /* atomic64_sub_return_relaxed */
-
-#ifndef atomic64_sub_return_acquire
-#define  atomic64_sub_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return_release
-#define  atomic64_sub_return_release(...)				\
-	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return
-#define  atomic64_sub_return(...)					\
-	__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_sub_return_relaxed */
-
-/* atomic64_dec_return_relaxed */
-#ifndef atomic64_dec_return_relaxed
-#define  atomic64_dec_return_relaxed	atomic64_dec_return
-#define  atomic64_dec_return_acquire	atomic64_dec_return
-#define  atomic64_dec_return_release	atomic64_dec_return
-
-#else /* atomic64_dec_return_relaxed */
-
-#ifndef atomic64_dec_return_acquire
-#define  atomic64_dec_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return_release
-#define  atomic64_dec_return_release(...)				\
-	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return
-#define  atomic64_dec_return(...)					\
-	__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_dec_return_relaxed */
-
-/* atomic64_xchg_relaxed */
-#ifndef atomic64_xchg_relaxed
-#define  atomic64_xchg_relaxed		atomic64_xchg
-#define  atomic64_xchg_acquire		atomic64_xchg
-#define  atomic64_xchg_release		atomic64_xchg
-
-#else /* atomic64_xchg_relaxed */
-
-#ifndef atomic64_xchg_acquire
-#define  atomic64_xchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_xchg_release
-#define  atomic64_xchg_release(...)					\
-	__atomic_op_release(atomic64_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_xchg
-#define  atomic64_xchg(...)						\
-	__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
-#endif
-#endif /* atomic64_xchg_relaxed */
-
-/* atomic64_cmpxchg_relaxed */
-#ifndef atomic64_cmpxchg_relaxed
-#define  atomic64_cmpxchg_relaxed	atomic64_cmpxchg
-#define  atomic64_cmpxchg_acquire	atomic64_cmpxchg
-#define  atomic64_cmpxchg_release	atomic64_cmpxchg
-
-#else /* atomic64_cmpxchg_relaxed */
-
-#ifndef atomic64_cmpxchg_acquire
-#define  atomic64_cmpxchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg_release
-#define  atomic64_cmpxchg_release(...)					\
-	__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg
-#define  atomic64_cmpxchg(...)						\
-	__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-#endif /* atomic64_cmpxchg_relaxed */
-
 /* cmpxchg_relaxed */
 #ifndef cmpxchg_relaxed
 #define  cmpxchg_relaxed		cmpxchg
@@ -583,6 +430,159 @@ static inline int atomic_fetch_or(atomic
 #include <asm-generic/atomic64.h>
 #endif
 
+#ifndef atomic64_read_acquire
+#define  atomic64_read_acquire(v)	smp_load_acquire(&(v)->counter)
+#endif
+
+#ifndef atomic64_set_release
+#define  atomic64_set_release(v, i)	smp_store_release(&(v)->counter, (i))
+#endif
+
+/* atomic64_add_return_relaxed */
+#ifndef atomic64_add_return_relaxed
+#define  atomic64_add_return_relaxed	atomic64_add_return
+#define  atomic64_add_return_acquire	atomic64_add_return
+#define  atomic64_add_return_release	atomic64_add_return
+
+#else /* atomic64_add_return_relaxed */
+
+#ifndef atomic64_add_return_acquire
+#define  atomic64_add_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_add_return_release
+#define  atomic64_add_return_release(...)				\
+	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_add_return
+#define  atomic64_add_return(...)					\
+	__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
+#endif
+#endif /* atomic64_add_return_relaxed */
+
+/* atomic64_inc_return_relaxed */
+#ifndef atomic64_inc_return_relaxed
+#define  atomic64_inc_return_relaxed	atomic64_inc_return
+#define  atomic64_inc_return_acquire	atomic64_inc_return
+#define  atomic64_inc_return_release	atomic64_inc_return
+
+#else /* atomic64_inc_return_relaxed */
+
+#ifndef atomic64_inc_return_acquire
+#define  atomic64_inc_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_inc_return_release
+#define  atomic64_inc_return_release(...)				\
+	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_inc_return
+#define  atomic64_inc_return(...)					\
+	__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
+#endif
+#endif /* atomic64_inc_return_relaxed */
+
+
+/* atomic64_sub_return_relaxed */
+#ifndef atomic64_sub_return_relaxed
+#define  atomic64_sub_return_relaxed	atomic64_sub_return
+#define  atomic64_sub_return_acquire	atomic64_sub_return
+#define  atomic64_sub_return_release	atomic64_sub_return
+
+#else /* atomic64_sub_return_relaxed */
+
+#ifndef atomic64_sub_return_acquire
+#define  atomic64_sub_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_sub_return_release
+#define  atomic64_sub_return_release(...)				\
+	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_sub_return
+#define  atomic64_sub_return(...)					\
+	__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
+#endif
+#endif /* atomic64_sub_return_relaxed */
+
+/* atomic64_dec_return_relaxed */
+#ifndef atomic64_dec_return_relaxed
+#define  atomic64_dec_return_relaxed	atomic64_dec_return
+#define  atomic64_dec_return_acquire	atomic64_dec_return
+#define  atomic64_dec_return_release	atomic64_dec_return
+
+#else /* atomic64_dec_return_relaxed */
+
+#ifndef atomic64_dec_return_acquire
+#define  atomic64_dec_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_dec_return_release
+#define  atomic64_dec_return_release(...)				\
+	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_dec_return
+#define  atomic64_dec_return(...)					\
+	__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
+#endif
+#endif /* atomic64_dec_return_relaxed */
+
+/* atomic64_xchg_relaxed */
+#ifndef atomic64_xchg_relaxed
+#define  atomic64_xchg_relaxed		atomic64_xchg
+#define  atomic64_xchg_acquire		atomic64_xchg
+#define  atomic64_xchg_release		atomic64_xchg
+
+#else /* atomic64_xchg_relaxed */
+
+#ifndef atomic64_xchg_acquire
+#define  atomic64_xchg_acquire(...)					\
+	__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_xchg_release
+#define  atomic64_xchg_release(...)					\
+	__atomic_op_release(atomic64_xchg, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_xchg
+#define  atomic64_xchg(...)						\
+	__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
+#endif
+#endif /* atomic64_xchg_relaxed */
+
+/* atomic64_cmpxchg_relaxed */
+#ifndef atomic64_cmpxchg_relaxed
+#define  atomic64_cmpxchg_relaxed	atomic64_cmpxchg
+#define  atomic64_cmpxchg_acquire	atomic64_cmpxchg
+#define  atomic64_cmpxchg_release	atomic64_cmpxchg
+
+#else /* atomic64_cmpxchg_relaxed */
+
+#ifndef atomic64_cmpxchg_acquire
+#define  atomic64_cmpxchg_acquire(...)					\
+	__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_cmpxchg_release
+#define  atomic64_cmpxchg_release(...)					\
+	__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_cmpxchg
+#define  atomic64_cmpxchg(...)						\
+	__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+#endif
+#endif /* atomic64_cmpxchg_relaxed */
+
 #ifndef atomic64_andnot
 static inline void atomic64_andnot(long long i, atomic64_t *v)
 {

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 27/33] locking: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (25 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 26/33] locking: Fix atomic64_relaxed bits Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 28/33] locking: Remove linux/atomic.h:atomic_fetch_or Peter Zijlstra
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-generic.patch --]
[-- Type: text/plain, Size: 18478 bytes --]

Now that all the architectures have implemented support for these new
atomic primitives, add the generic infrastructure to expose and use
them.
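
As a usage sketch (mine, not from the patch) of what this exposes:
callers can now pick the ordering they actually need, and the generic
layer fills in whichever variants the architecture does not provide by
combining the _relaxed op with the appropriate barriers:

/* Sketch only; put_ref_example() and 'refs' are made-up names. */
static bool put_ref_example(atomic_t *refs)
{
	/* orders all prior accesses before the decrement */
	int old = atomic_fetch_sub_release(1, refs);

	/* a real refcount would also need an acquire before freeing */
	return old == 1;
}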

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/atomic-long.h |   36 +++-
 include/asm-generic/atomic.h      |   49 +++++
 include/asm-generic/atomic64.h    |   15 +
 include/linux/atomic.h            |  336 ++++++++++++++++++++++++++++++++++++++
 lib/atomic64.c                    |   32 +++
 lib/atomic64_test.c               |   34 +++
 6 files changed, 493 insertions(+), 9 deletions(-)

--- a/include/asm-generic/atomic-long.h
+++ b/include/asm-generic/atomic-long.h
@@ -112,6 +112,40 @@ static __always_inline void atomic_long_
 	ATOMIC_LONG_PFX(_dec)(v);
 }
 
+#define ATOMIC_LONG_FETCH_OP(op, mo)					\
+static inline long							\
+atomic_long_fetch_##op##mo(long i, atomic_long_t *l)			\
+{									\
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;		\
+									\
+	return (long)ATOMIC_LONG_PFX(_fetch_##op##mo)(i, v);		\
+}
+
+ATOMIC_LONG_FETCH_OP(add, )
+ATOMIC_LONG_FETCH_OP(add, _relaxed)
+ATOMIC_LONG_FETCH_OP(add, _acquire)
+ATOMIC_LONG_FETCH_OP(add, _release)
+ATOMIC_LONG_FETCH_OP(sub, )
+ATOMIC_LONG_FETCH_OP(sub, _relaxed)
+ATOMIC_LONG_FETCH_OP(sub, _acquire)
+ATOMIC_LONG_FETCH_OP(sub, _release)
+ATOMIC_LONG_FETCH_OP(and, )
+ATOMIC_LONG_FETCH_OP(and, _relaxed)
+ATOMIC_LONG_FETCH_OP(and, _acquire)
+ATOMIC_LONG_FETCH_OP(and, _release)
+ATOMIC_LONG_FETCH_OP(andnot, )
+ATOMIC_LONG_FETCH_OP(andnot, _relaxed)
+ATOMIC_LONG_FETCH_OP(andnot, _acquire)
+ATOMIC_LONG_FETCH_OP(andnot, _release)
+ATOMIC_LONG_FETCH_OP(or, )
+ATOMIC_LONG_FETCH_OP(or, _relaxed)
+ATOMIC_LONG_FETCH_OP(or, _acquire)
+ATOMIC_LONG_FETCH_OP(or, _release)
+ATOMIC_LONG_FETCH_OP(xor, )
+ATOMIC_LONG_FETCH_OP(xor, _relaxed)
+ATOMIC_LONG_FETCH_OP(xor, _acquire)
+ATOMIC_LONG_FETCH_OP(xor, _release)
+
 #define ATOMIC_LONG_OP(op)						\
 static __always_inline void						\
 atomic_long_##op(long i, atomic_long_t *l)				\
@@ -124,9 +158,9 @@ atomic_long_##op(long i, atomic_long_t *
 ATOMIC_LONG_OP(add)
 ATOMIC_LONG_OP(sub)
 ATOMIC_LONG_OP(and)
+ATOMIC_LONG_OP(andnot)
 ATOMIC_LONG_OP(or)
 ATOMIC_LONG_OP(xor)
-ATOMIC_LONG_OP(andnot)
 
 #undef ATOMIC_LONG_OP
 
--- a/include/asm-generic/atomic.h
+++ b/include/asm-generic/atomic.h
@@ -61,6 +61,18 @@ static inline int atomic_##op##_return(i
 	return c c_op i;						\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int c, old;							\
+									\
+	c = v->counter;							\
+	while ((old = cmpxchg(&v->counter, c, c c_op i)) != c)		\
+		c = old;						\
+									\
+	return c;							\
+}
+
 #else
 
 #include <linux/irqflags.h>
@@ -88,6 +100,20 @@ static inline int atomic_##op##_return(i
 	return ret;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long flags;						\
+	int ret;							\
+									\
+	raw_local_irq_save(flags);					\
+	ret = v->counter;						\
+	v->counter = v->counter c_op i;					\
+	raw_local_irq_restore(flags);					\
+									\
+	return ret;							\
+}
+
 #endif /* CONFIG_SMP */
 
 #ifndef atomic_add_return
@@ -98,6 +124,28 @@ ATOMIC_OP_RETURN(add, +)
 ATOMIC_OP_RETURN(sub, -)
 #endif
 
+#ifndef atomic_fetch_add
+ATOMIC_FETCH_OP(add, +)
+#endif
+
+#ifndef atomic_fetch_sub
+ATOMIC_FETCH_OP(sub, -)
+#endif
+
+#ifndef atomic_fetch_and
+ATOMIC_FETCH_OP(and, &)
+#endif
+
+#ifndef atomic_fetch_or
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_FETCH_OP(or, |)
+#endif
+
+#ifndef atomic_fetch_xor
+ATOMIC_FETCH_OP(xor, ^)
+#endif
+
 #ifndef atomic_and
 ATOMIC_OP(and, &)
 #endif
@@ -110,6 +158,7 @@ ATOMIC_OP(or, |)
 ATOMIC_OP(xor, ^)
 #endif
 
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/include/asm-generic/atomic64.h
+++ b/include/asm-generic/atomic64.h
@@ -27,16 +27,23 @@ extern void	 atomic64_##op(long long a,
 #define ATOMIC64_OP_RETURN(op)						\
 extern long long atomic64_##op##_return(long long a, atomic64_t *v);
 
-#define ATOMIC64_OPS(op)	ATOMIC64_OP(op) ATOMIC64_OP_RETURN(op)
+#define ATOMIC64_FETCH_OP(op)						\
+extern long long atomic64_fetch_##op(long long a, atomic64_t *v);
+
+#define ATOMIC64_OPS(op)	ATOMIC64_OP(op) ATOMIC64_OP_RETURN(op) ATOMIC64_FETCH_OP(op)
 
 ATOMIC64_OPS(add)
 ATOMIC64_OPS(sub)
 
-ATOMIC64_OP(and)
-ATOMIC64_OP(or)
-ATOMIC64_OP(xor)
+#undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op)	ATOMIC64_OP(op) ATOMIC64_FETCH_OP(op)
+
+ATOMIC64_OPS(and)
+ATOMIC64_OPS(or)
+ATOMIC64_OPS(xor)
 
 #undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -163,6 +163,154 @@
 #endif
 #endif /* atomic_dec_return_relaxed */
 
+
+/* atomic_fetch_add_relaxed */
+#ifndef atomic_fetch_add_relaxed
+#define atomic_fetch_add_relaxed	atomic_fetch_add
+#define atomic_fetch_add_acquire	atomic_fetch_add
+#define atomic_fetch_add_release	atomic_fetch_add
+
+#else /* atomic_fetch_add_relaxed */
+
+#ifndef atomic_fetch_add_acquire
+#define atomic_fetch_add_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_add_release
+#define atomic_fetch_add_release(...)					\
+	__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_add
+#define atomic_fetch_add(...)						\
+	__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_add_relaxed */
+
+/* atomic_fetch_sub_relaxed */
+#ifndef atomic_fetch_sub_relaxed
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub
+#define atomic_fetch_sub_acquire	atomic_fetch_sub
+#define atomic_fetch_sub_release	atomic_fetch_sub
+
+#else /* atomic_fetch_sub_relaxed */
+
+#ifndef atomic_fetch_sub_acquire
+#define atomic_fetch_sub_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_sub_release
+#define atomic_fetch_sub_release(...)					\
+	__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_sub
+#define atomic_fetch_sub(...)						\
+	__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_sub_relaxed */
+
+/* atomic_fetch_or_relaxed */
+#ifndef atomic_fetch_or_relaxed
+#define atomic_fetch_or_relaxed	atomic_fetch_or
+#define atomic_fetch_or_acquire	atomic_fetch_or
+#define atomic_fetch_or_release	atomic_fetch_or
+
+#else /* atomic_fetch_or_relaxed */
+
+#ifndef atomic_fetch_or_acquire
+#define atomic_fetch_or_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_or_release
+#define atomic_fetch_or_release(...)					\
+	__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_or
+#define atomic_fetch_or(...)						\
+	__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_or_relaxed */
+
+/* atomic_fetch_and_relaxed */
+#ifndef atomic_fetch_and_relaxed
+#define atomic_fetch_and_relaxed	atomic_fetch_and
+#define atomic_fetch_and_acquire	atomic_fetch_and
+#define atomic_fetch_and_release	atomic_fetch_and
+
+#else /* atomic_fetch_and_relaxed */
+
+#ifndef atomic_fetch_and_acquire
+#define atomic_fetch_and_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_and_release
+#define atomic_fetch_and_release(...)					\
+	__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_and
+#define atomic_fetch_and(...)						\
+	__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_and_relaxed */
+
+#ifdef atomic_andnot
+/* atomic_fetch_andnot_relaxed */
+#ifndef atomic_fetch_andnot_relaxed
+#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot
+#define atomic_fetch_andnot_acquire	atomic_fetch_andnot
+#define atomic_fetch_andnot_release	atomic_fetch_andnot
+
+#else /* atomic_fetch_andnot_relaxed */
+
+#ifndef atomic_fetch_andnot_acquire
+#define atomic_fetch_andnot_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_andnot_release
+#define atomic_fetch_andnot_release(...)					\
+	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_andnot
+#define atomic_fetch_andnot(...)						\
+	__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_andnot_relaxed */
+#endif /* atomic_andnot */
+
+/* atomic_fetch_xor_relaxed */
+#ifndef atomic_fetch_xor_relaxed
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor
+#define atomic_fetch_xor_acquire	atomic_fetch_xor
+#define atomic_fetch_xor_release	atomic_fetch_xor
+
+#else /* atomic_fetch_xor_relaxed */
+
+#ifndef atomic_fetch_xor_acquire
+#define atomic_fetch_xor_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_xor_release
+#define atomic_fetch_xor_release(...)					\
+	__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_xor
+#define atomic_fetch_xor(...)						\
+	__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_xor_relaxed */
+
+
 /* atomic_xchg_relaxed */
 #ifndef atomic_xchg_relaxed
 #define  atomic_xchg_relaxed		atomic_xchg
@@ -310,6 +458,26 @@ static inline void atomic_andnot(int i,
 {
 	atomic_and(~i, v);
 }
+
+static inline int atomic_fetch_andnot(int i, atomic_t *v)
+{
+	return atomic_fetch_and(~i, v);
+}
+
+static inline int atomic_fetch_andnot_relaxed(int i, atomic_t *v)
+{
+	return atomic_fetch_and_relaxed(~i, v);
+}
+
+static inline int atomic_fetch_andnot_acquire(int i, atomic_t *v)
+{
+	return atomic_fetch_and_acquire(~i, v);
+}
+
+static inline int atomic_fetch_andnot_release(int i, atomic_t *v)
+{
+	return atomic_fetch_and_release(~i, v);
+}
 #endif
 
 static inline __deprecated void atomic_clear_mask(unsigned int mask, atomic_t *v)
@@ -535,6 +703,154 @@ static inline int atomic_fetch_or(atomic
 #endif
 #endif /* atomic64_dec_return_relaxed */
 
+
+/* atomic64_fetch_add_relaxed */
+#ifndef atomic64_fetch_add_relaxed
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add
+#define atomic64_fetch_add_acquire	atomic64_fetch_add
+#define atomic64_fetch_add_release	atomic64_fetch_add
+
+#else /* atomic64_fetch_add_relaxed */
+
+#ifndef atomic64_fetch_add_acquire
+#define atomic64_fetch_add_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_add_release
+#define atomic64_fetch_add_release(...)					\
+	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_add
+#define atomic64_fetch_add(...)						\
+	__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_add_relaxed */
+
+/* atomic64_fetch_sub_relaxed */
+#ifndef atomic64_fetch_sub_relaxed
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub
+#define atomic64_fetch_sub_acquire	atomic64_fetch_sub
+#define atomic64_fetch_sub_release	atomic64_fetch_sub
+
+#else /* atomic64_fetch_sub_relaxed */
+
+#ifndef atomic64_fetch_sub_acquire
+#define atomic64_fetch_sub_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_sub_release
+#define atomic64_fetch_sub_release(...)					\
+	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_sub
+#define atomic64_fetch_sub(...)						\
+	__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_sub_relaxed */
+
+/* atomic64_fetch_or_relaxed */
+#ifndef atomic64_fetch_or_relaxed
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or
+#define atomic64_fetch_or_acquire	atomic64_fetch_or
+#define atomic64_fetch_or_release	atomic64_fetch_or
+
+#else /* atomic64_fetch_or_relaxed */
+
+#ifndef atomic64_fetch_or_acquire
+#define atomic64_fetch_or_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_or_release
+#define atomic64_fetch_or_release(...)					\
+	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_or
+#define atomic64_fetch_or(...)						\
+	__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_or_relaxed */
+
+/* atomic64_fetch_and_relaxed */
+#ifndef atomic64_fetch_and_relaxed
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and
+#define atomic64_fetch_and_acquire	atomic64_fetch_and
+#define atomic64_fetch_and_release	atomic64_fetch_and
+
+#else /* atomic64_fetch_and_relaxed */
+
+#ifndef atomic64_fetch_and_acquire
+#define atomic64_fetch_and_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_and_release
+#define atomic64_fetch_and_release(...)					\
+	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_and
+#define atomic64_fetch_and(...)						\
+	__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_and_relaxed */
+
+#ifdef atomic64_andnot
+/* atomic64_fetch_andnot_relaxed */
+#ifndef atomic64_fetch_andnot_relaxed
+#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot
+#define atomic64_fetch_andnot_acquire	atomic64_fetch_andnot
+#define atomic64_fetch_andnot_release	atomic64_fetch_andnot
+
+#else /* atomic64_fetch_andnot_relaxed */
+
+#ifndef atomic64_fetch_andnot_acquire
+#define atomic64_fetch_andnot_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_andnot_release
+#define atomic64_fetch_andnot_release(...)					\
+	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_andnot
+#define atomic64_fetch_andnot(...)						\
+	__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_andnot_relaxed */
+#endif /* atomic64_andnot */
+
+/* atomic64_fetch_xor_relaxed */
+#ifndef atomic64_fetch_xor_relaxed
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor
+#define atomic64_fetch_xor_acquire	atomic64_fetch_xor
+#define atomic64_fetch_xor_release	atomic64_fetch_xor
+
+#else /* atomic64_fetch_xor_relaxed */
+
+#ifndef atomic64_fetch_xor_acquire
+#define atomic64_fetch_xor_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_xor_release
+#define atomic64_fetch_xor_release(...)					\
+	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_xor
+#define atomic64_fetch_xor(...)						\
+	__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_xor_relaxed */
+
+
 /* atomic64_xchg_relaxed */
 #ifndef atomic64_xchg_relaxed
 #define  atomic64_xchg_relaxed		atomic64_xchg
@@ -588,6 +904,26 @@ static inline void atomic64_andnot(long
 {
 	atomic64_and(~i, v);
 }
+
+static inline long long atomic64_fetch_andnot(long long i, atomic64_t *v)
+{
+	return atomic64_fetch_and(~i, v);
+}
+
+static inline long long atomic64_fetch_andnot_relaxed(long long i, atomic64_t *v)
+{
+	return atomic64_fetch_and_relaxed(~i, v);
+}
+
+static inline long long atomic64_fetch_andnot_acquire(long long i, atomic64_t *v)
+{
+	return atomic64_fetch_and_acquire(~i, v);
+}
+
+static inline long long atomic64_fetch_andnot_release(long long i, atomic64_t *v)
+{
+	return atomic64_fetch_and_release(~i, v);
+}
 #endif
 
 #include <asm-generic/atomic-long.h>
--- a/lib/atomic64.c
+++ b/lib/atomic64.c
@@ -96,17 +96,41 @@ long long atomic64_##op##_return(long lo
 }									\
 EXPORT_SYMBOL(atomic64_##op##_return);
 
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+long long atomic64_fetch_##op(long long a, atomic64_t *v)		\
+{									\
+	unsigned long flags;						\
+	raw_spinlock_t *lock = lock_addr(v);				\
+	long long val;							\
+									\
+	raw_spin_lock_irqsave(lock, flags);				\
+	val = v->counter;						\
+	v->counter c_op a;						\
+	raw_spin_unlock_irqrestore(lock, flags);			\
+	return val;							\
+}									\
+EXPORT_SYMBOL(atomic64_fetch_##op);
+
 #define ATOMIC64_OPS(op, c_op)						\
 	ATOMIC64_OP(op, c_op)						\
-	ATOMIC64_OP_RETURN(op, c_op)
+	ATOMIC64_OP_RETURN(op, c_op)					\
+	ATOMIC64_FETCH_OP(op, c_op)
 
 ATOMIC64_OPS(add, +=)
 ATOMIC64_OPS(sub, -=)
-ATOMIC64_OP(and, &=)
-ATOMIC64_OP(or, |=)
-ATOMIC64_OP(xor, ^=)
 
 #undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_OP_RETURN(op, c_op)					\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &=)
+ATOMIC64_OPS(or, |=)
+ATOMIC64_OPS(xor, ^=)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 
--- a/lib/atomic64_test.c
+++ b/lib/atomic64_test.c
@@ -53,11 +53,25 @@ do {								\
 	BUG_ON(atomic##bit##_read(&v) != r);			\
 } while (0)
 
+#define TEST_FETCH(bit, op, c_op, val)				\
+do {								\
+	atomic##bit##_set(&v, v0);				\
+	r = v0;							\
+	r c_op val;						\
+	BUG_ON(atomic##bit##_##op(val, &v) != v0);		\
+	BUG_ON(atomic##bit##_read(&v) != r);			\
+} while (0)
+
 #define RETURN_FAMILY_TEST(bit, op, c_op, val)			\
 do {								\
 	FAMILY_TEST(TEST_RETURN, bit, op, c_op, val);		\
 } while (0)
 
+#define FETCH_FAMILY_TEST(bit, op, c_op, val)			\
+do {								\
+	FAMILY_TEST(TEST_FETCH, bit, op, c_op, val);		\
+} while (0)
+
 #define TEST_ARGS(bit, op, init, ret, expect, args...)		\
 do {								\
 	atomic##bit##_set(&v, init);				\
@@ -114,6 +128,16 @@ static __init void test_atomic(void)
 	RETURN_FAMILY_TEST(, sub_return, -=, onestwos);
 	RETURN_FAMILY_TEST(, sub_return, -=, -one);
 
+	FETCH_FAMILY_TEST(, fetch_add, +=, onestwos);
+	FETCH_FAMILY_TEST(, fetch_add, +=, -one);
+	FETCH_FAMILY_TEST(, fetch_sub, -=, onestwos);
+	FETCH_FAMILY_TEST(, fetch_sub, -=, -one);
+
+	FETCH_FAMILY_TEST(, fetch_or,  |=, v1);
+	FETCH_FAMILY_TEST(, fetch_and, &=, v1);
+	FETCH_FAMILY_TEST(, fetch_andnot, &= ~, v1);
+	FETCH_FAMILY_TEST(, fetch_xor, ^=, v1);
+
 	INC_RETURN_FAMILY_TEST(, v0);
 	DEC_RETURN_FAMILY_TEST(, v0);
 
@@ -154,6 +178,16 @@ static __init void test_atomic64(void)
 	RETURN_FAMILY_TEST(64, sub_return, -=, onestwos);
 	RETURN_FAMILY_TEST(64, sub_return, -=, -one);
 
+	FETCH_FAMILY_TEST(64, fetch_add, +=, onestwos);
+	FETCH_FAMILY_TEST(64, fetch_add, +=, -one);
+	FETCH_FAMILY_TEST(64, fetch_sub, -=, onestwos);
+	FETCH_FAMILY_TEST(64, fetch_sub, -=, -one);
+
+	FETCH_FAMILY_TEST(64, fetch_or,  |=, v1);
+	FETCH_FAMILY_TEST(64, fetch_and, &=, v1);
+	FETCH_FAMILY_TEST(64, fetch_andnot, &= ~, v1);
+	FETCH_FAMILY_TEST(64, fetch_xor, ^=, v1);
+
 	INIT(v0);
 	atomic64_inc(&v);
 	r += one;

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 28/33] locking: Remove linux/atomic.h:atomic_fetch_or
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (26 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 27/33] locking: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 29/33] locking: Remove the deprecated atomic_{set,clear}_mask() functions Peter Zijlstra
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch_or-kill.patch --]
[-- Type: text/plain, Size: 8656 bytes --]

Since all architectures now implement this natively, remove this
now-dead code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/alpha/include/asm/atomic.h    |    2 --
 arch/arc/include/asm/atomic.h      |    2 --
 arch/arm/include/asm/atomic.h      |    2 --
 arch/arm64/include/asm/atomic.h    |    2 --
 arch/avr32/include/asm/atomic.h    |    2 --
 arch/frv/include/asm/atomic.h      |    2 --
 arch/h8300/include/asm/atomic.h    |    2 --
 arch/hexagon/include/asm/atomic.h  |    2 --
 arch/m32r/include/asm/atomic.h     |    2 --
 arch/m68k/include/asm/atomic.h     |    2 --
 arch/metag/include/asm/atomic.h    |    2 --
 arch/mips/include/asm/atomic.h     |    2 --
 arch/mn10300/include/asm/atomic.h  |    2 --
 arch/parisc/include/asm/atomic.h   |    2 --
 arch/s390/include/asm/atomic.h     |    2 --
 arch/sh/include/asm/atomic.h       |    2 --
 arch/sparc/include/asm/atomic.h    |    1 -
 arch/sparc/include/asm/atomic_32.h |    2 --
 arch/tile/include/asm/atomic.h     |    2 --
 arch/x86/include/asm/atomic.h      |    2 --
 arch/xtensa/include/asm/atomic.h   |    2 --
 include/asm-generic/atomic.h       |    2 --
 include/linux/atomic.h             |   21 ---------------------
 23 files changed, 64 deletions(-)

--- a/arch/alpha/include/asm/atomic.h
+++ b/arch/alpha/include/asm/atomic.h
@@ -153,8 +153,6 @@ ATOMIC_OPS(sub)
 #define atomic_andnot atomic_andnot
 #define atomic64_andnot atomic64_andnot
 
-#define atomic_fetch_or atomic_fetch_or
-
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op, asm)						\
 	ATOMIC_OP(op, asm)						\
--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -224,8 +224,6 @@ ATOMIC_OPS(sub, -=, sub)
 
 #define atomic_andnot atomic_andnot
 
-#define atomic_fetch_or atomic_fetch_or
-
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
--- a/arch/arm/include/asm/atomic.h
+++ b/arch/arm/include/asm/atomic.h
@@ -201,8 +201,6 @@ static inline int atomic_fetch_##op(int
 	return val;							\
 }
 
-#define atomic_fetch_or atomic_fetch_or
-
 static inline int atomic_cmpxchg(atomic_t *v, int old, int new)
 {
 	int ret;
--- a/arch/arm64/include/asm/atomic.h
+++ b/arch/arm64/include/asm/atomic.h
@@ -128,8 +128,6 @@
 #define __atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
 #define atomic_andnot			atomic_andnot
 
-#define atomic_fetch_or atomic_fetch_or
-
 /*
  * 64-bit atomic operations.
  */
--- a/arch/avr32/include/asm/atomic.h
+++ b/arch/avr32/include/asm/atomic.h
@@ -66,8 +66,6 @@ ATOMIC_OP_RETURN(add, add, r)
 ATOMIC_FETCH_OP (sub, sub, rKs21)
 ATOMIC_FETCH_OP (add, add, r)
 
-#define atomic_fetch_or atomic_fetch_or
-
 #define ATOMIC_OPS(op, asm_op)						\
 ATOMIC_OP_RETURN(op, asm_op, r)						\
 static inline void atomic_##op(int i, atomic_t *v)			\
--- a/arch/frv/include/asm/atomic.h
+++ b/arch/frv/include/asm/atomic.h
@@ -74,8 +74,6 @@ static inline void atomic_dec(atomic_t *
 #define atomic_dec_and_test(v)		(atomic_sub_return(1, (v)) == 0)
 #define atomic_inc_and_test(v)		(atomic_add_return(1, (v)) == 0)
 
-#define atomic_fetch_or atomic_fetch_or
-
 /*
  * 64-bit atomic ops
  */
--- a/arch/h8300/include/asm/atomic.h
+++ b/arch/h8300/include/asm/atomic.h
@@ -54,8 +54,6 @@ static inline void atomic_##op(int i, at
 ATOMIC_OP_RETURN(add, +=)
 ATOMIC_OP_RETURN(sub, -=)
 
-#define atomic_fetch_or atomic_fetch_or
-
 #define ATOMIC_OPS(op, c_op)					\
 	ATOMIC_OP(op, c_op)					\
 	ATOMIC_FETCH_OP(op, c_op)
--- a/arch/hexagon/include/asm/atomic.h
+++ b/arch/hexagon/include/asm/atomic.h
@@ -152,8 +152,6 @@ ATOMIC_OPS(sub)
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and)
 ATOMIC_OPS(or)
 ATOMIC_OPS(xor)
--- a/arch/m32r/include/asm/atomic.h
+++ b/arch/m32r/include/asm/atomic.h
@@ -121,8 +121,6 @@ ATOMIC_OPS(sub)
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and)
 ATOMIC_OPS(or)
 ATOMIC_OPS(xor)
--- a/arch/m68k/include/asm/atomic.h
+++ b/arch/m68k/include/asm/atomic.h
@@ -119,8 +119,6 @@ ATOMIC_OPS(sub, -=, sub)
 	ATOMIC_OP(op, c_op, asm_op)					\
 	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, &=, and)
 ATOMIC_OPS(or, |=, or)
 ATOMIC_OPS(xor, ^=, eor)
--- a/arch/metag/include/asm/atomic.h
+++ b/arch/metag/include/asm/atomic.h
@@ -17,8 +17,6 @@
 #include <asm/atomic_lnkget.h>
 #endif
 
-#define atomic_fetch_or atomic_fetch_or
-
 #define atomic_add_negative(a, v)       (atomic_add_return((a), (v)) < 0)
 
 #define atomic_dec_return(v) atomic_sub_return(1, (v))
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -194,8 +194,6 @@ ATOMIC_OPS(sub, -=, subu)
 	ATOMIC_OP(op, c_op, asm_op)					      \
 	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, &=, and)
 ATOMIC_OPS(or, |=, or)
 ATOMIC_OPS(xor, ^=, xor)
--- a/arch/mn10300/include/asm/atomic.h
+++ b/arch/mn10300/include/asm/atomic.h
@@ -113,8 +113,6 @@ ATOMIC_OPS(sub)
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and)
 ATOMIC_OPS(or)
 ATOMIC_OPS(xor)
--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
@@ -148,8 +148,6 @@ ATOMIC_OPS(sub, -=)
 	ATOMIC_OP(op, c_op)						\
 	ATOMIC_FETCH_OP(op, c_op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, &=)
 ATOMIC_OPS(or, |=)
 ATOMIC_OPS(xor, ^=)
--- a/arch/s390/include/asm/atomic.h
+++ b/arch/s390/include/asm/atomic.h
@@ -135,8 +135,6 @@ static inline int atomic_fetch_##op(int
 	return __ATOMIC_LOOP(v, i, __ATOMIC_##OP, __ATOMIC_BARRIER);	\
 }
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, AND)
 ATOMIC_OPS(or, OR)
 ATOMIC_OPS(xor, XOR)
--- a/arch/sh/include/asm/atomic.h
+++ b/arch/sh/include/asm/atomic.h
@@ -25,8 +25,6 @@
 #include <asm/atomic-irq.h>
 #endif
 
-#define atomic_fetch_or atomic_fetch_or
-
 #define atomic_add_negative(a, v)	(atomic_add_return((a), (v)) < 0)
 #define atomic_dec_return(v)		atomic_sub_return(1, (v))
 #define atomic_inc_return(v)		atomic_add_return(1, (v))
--- a/arch/sparc/include/asm/atomic.h
+++ b/arch/sparc/include/asm/atomic.h
@@ -5,5 +5,4 @@
 #else
 #include <asm/atomic_32.h>
 #endif
-#define atomic_fetch_or atomic_fetch_or
 #endif
--- a/arch/sparc/include/asm/atomic_32.h
+++ b/arch/sparc/include/asm/atomic_32.h
@@ -36,8 +36,6 @@ void atomic_set(atomic_t *, int);
 #define atomic_inc(v)		((void)atomic_add_return(        1, (v)))
 #define atomic_dec(v)		((void)atomic_add_return(       -1, (v)))
 
-#define atomic_fetch_or	atomic_fetch_or
-
 #define atomic_and(i, v)	((void)atomic_fetch_and((i), (v)))
 #define atomic_or(i, v)		((void)atomic_fetch_or((i), (v)))
 #define atomic_xor(i, v)	((void)atomic_fetch_xor((i), (v)))
--- a/arch/tile/include/asm/atomic.h
+++ b/arch/tile/include/asm/atomic.h
@@ -48,8 +48,6 @@ static inline int atomic_read(const atom
 
 #define atomic_fetch_sub(i, v)		atomic_fetch_add(-(int)(i), (v))
 
-#define atomic_fetch_or atomic_fetch_or
-
 /**
  * atomic_sub - subtract integer from atomic variable
  * @i: integer value to subtract
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -217,8 +217,6 @@ static inline int atomic_fetch_##op(int
 	ATOMIC_OP(op)							\
 	ATOMIC_FETCH_OP(op, c_op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, &)
 ATOMIC_OPS(or , |)
 ATOMIC_OPS(xor, ^)
--- a/arch/xtensa/include/asm/atomic.h
+++ b/arch/xtensa/include/asm/atomic.h
@@ -188,8 +188,6 @@ ATOMIC_OPS(sub)
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and)
 ATOMIC_OPS(or)
 ATOMIC_OPS(xor)
--- a/include/asm-generic/atomic.h
+++ b/include/asm-generic/atomic.h
@@ -137,8 +137,6 @@ ATOMIC_FETCH_OP(and, &)
 #endif
 
 #ifndef atomic_fetch_or
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_FETCH_OP(or, |)
 #endif
 
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -620,27 +620,6 @@ static inline int atomic_dec_if_positive
 }
 #endif
 
-/**
- * atomic_fetch_or - perform *p |= mask and return old value of *p
- * @mask: mask to OR on the atomic_t
- * @p: pointer to atomic_t
- */
-#ifndef atomic_fetch_or
-static inline int atomic_fetch_or(int mask, atomic_t *p)
-{
-	int old, val = atomic_read(p);
-
-	for (;;) {
-		old = atomic_cmpxchg(p, val, val | mask);
-		if (old == val)
-			break;
-		val = old;
-	}
-
-	return old;
-}
-#endif
-
 #ifdef CONFIG_GENERIC_ATOMIC64
 #include <asm-generic/atomic64.h>
 #endif

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 29/33] locking: Remove the deprecated atomic_{set,clear}_mask() functions
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (27 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 28/33] locking: Remove linux/atomic.h:atomic_fetch_or Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 30/33] locking,alpha: Convert to _relaxed atomics Peter Zijlstra
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-mask-kill.patch --]
[-- Type: text/plain, Size: 1363 bytes --]

These functions have been deprecated for a while and only one user is
left; convert it and remove the functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/atomic.h              |   10 ----------
 kernel/locking/qspinlock_paravirt.h |    4 ++--
 2 files changed, 2 insertions(+), 12 deletions(-)

--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -480,16 +480,6 @@ static inline int atomic_fetch_andnot_re
 }
 #endif
 
-static inline __deprecated void atomic_clear_mask(unsigned int mask, atomic_t *v)
-{
-	atomic_andnot(mask, v);
-}
-
-static inline __deprecated void atomic_set_mask(unsigned int mask, atomic_t *v)
-{
-	atomic_or(mask, v);
-}
-
 /**
  * atomic_inc_not_zero_hint - increment if not null
  * @v: pointer of type atomic_t
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -112,12 +112,12 @@ static __always_inline int trylock_clear
 #else /* _Q_PENDING_BITS == 8 */
 static __always_inline void set_pending(struct qspinlock *lock)
 {
-	atomic_set_mask(_Q_PENDING_VAL, &lock->val);
+	atomic_or(_Q_PENDING_VAL, &lock->val);
 }
 
 static __always_inline void clear_pending(struct qspinlock *lock)
 {
-	atomic_clear_mask(_Q_PENDING_VAL, &lock->val);
+	atomic_andnot(_Q_PENDING_VAL, &lock->val);
 }
 
 static __always_inline int trylock_clear_pending(struct qspinlock *lock)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 30/33] locking,alpha: Convert to _relaxed atomics
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (28 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 29/33] locking: Remove the deprecated atomic_{set,clear}_mask() functions Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 31/33] locking,mips: " Peter Zijlstra
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-alpha-relaxed.patch --]
[-- Type: text/plain, Size: 4179 bytes --]

Generic code will construct {,_acquire,_release} versions by adding the
required smp_mb__{before,after}_atomic() calls.
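
For reference, the generic construction looks roughly like the sketch
below; it mirrors the __atomic_op_{acquire,release,fence}() helpers used
elsewhere in this series and is illustrative rather than the verbatim
header:

	#define __atomic_op_acquire(op, args...)			\
	({								\
		typeof(op##_relaxed(args)) __ret = op##_relaxed(args);	\
		smp_mb__after_atomic();	/* order against later accesses */ \
		__ret;							\
	})

	#define __atomic_op_release(op, args...)			\
	({								\
		smp_mb__before_atomic(); /* order against earlier accesses */ \
		op##_relaxed(args);					\
	})

	#define __atomic_op_fence(op, args...)				\
	({								\
		typeof(op##_relaxed(args)) __ret;			\
		smp_mb__before_atomic();				\
		__ret = op##_relaxed(args);				\
		smp_mb__after_atomic();					\
		__ret;							\
	})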

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/alpha/include/asm/atomic.h |   36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

--- a/arch/alpha/include/asm/atomic.h
+++ b/arch/alpha/include/asm/atomic.h
@@ -46,10 +46,9 @@ static __inline__ void atomic_##op(int i
 }									\
 
 #define ATOMIC_OP_RETURN(op, asm_op)					\
-static inline int atomic_##op##_return(int i, atomic_t *v)		\
+static inline int atomic_##op##_return_relaxed(int i, atomic_t *v)	\
 {									\
 	long temp, result;						\
-	smp_mb();							\
 	__asm__ __volatile__(						\
 	"1:	ldl_l %0,%1\n"						\
 	"	" #asm_op " %0,%3,%2\n"					\
@@ -61,15 +60,13 @@ static inline int atomic_##op##_return(i
 	".previous"							\
 	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
 	:"Ir" (i), "m" (v->counter) : "memory");			\
-	smp_mb();							\
 	return result;							\
 }
 
 #define ATOMIC_FETCH_OP(op, asm_op)					\
-static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+static inline int atomic_fetch_##op##_relaxed(int i, atomic_t *v)	\
 {									\
 	long temp, result;						\
-	smp_mb();							\
 	__asm__ __volatile__(						\
 	"1:	ldl_l %2,%1\n"						\
 	"	" #asm_op " %2,%3,%0\n"					\
@@ -80,7 +77,6 @@ static inline int atomic_fetch_##op(int
 	".previous"							\
 	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
 	:"Ir" (i), "m" (v->counter) : "memory");			\
-	smp_mb();							\
 	return result;							\
 }
 
@@ -101,10 +97,9 @@ static __inline__ void atomic64_##op(lon
 }									\
 
 #define ATOMIC64_OP_RETURN(op, asm_op)					\
-static __inline__ long atomic64_##op##_return(long i, atomic64_t * v)	\
+static __inline__ long atomic64_##op##_return_relaxed(long i, atomic64_t * v)	\
 {									\
 	long temp, result;						\
-	smp_mb();							\
 	__asm__ __volatile__(						\
 	"1:	ldq_l %0,%1\n"						\
 	"	" #asm_op " %0,%3,%2\n"					\
@@ -116,15 +111,13 @@ static __inline__ long atomic64_##op##_r
 	".previous"							\
 	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
 	:"Ir" (i), "m" (v->counter) : "memory");			\
-	smp_mb();							\
 	return result;							\
 }
 
 #define ATOMIC64_FETCH_OP(op, asm_op)					\
-static __inline__ long atomic64_fetch_##op(long i, atomic64_t * v)	\
+static __inline__ long atomic64_fetch_##op##_relaxed(long i, atomic64_t * v)	\
 {									\
 	long temp, result;						\
-	smp_mb();							\
 	__asm__ __volatile__(						\
 	"1:	ldq_l %2,%1\n"						\
 	"	" #asm_op " %2,%3,%0\n"					\
@@ -135,7 +128,6 @@ static __inline__ long atomic64_fetch_##
 	".previous"							\
 	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
 	:"Ir" (i), "m" (v->counter) : "memory");			\
-	smp_mb();							\
 	return result;							\
 }
 
@@ -150,6 +142,16 @@ static __inline__ long atomic64_fetch_##
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
+#define atomic_add_return_relaxed	atomic_add_return_relaxed
+#define atomic_sub_return_relaxed	atomic_sub_return_relaxed
+#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+
+#define atomic64_add_return_relaxed	atomic64_add_return_relaxed
+#define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+
 #define atomic_andnot atomic_andnot
 #define atomic64_andnot atomic64_andnot
 
@@ -165,6 +167,16 @@ ATOMIC_OPS(andnot, bic)
 ATOMIC_OPS(or, bis)
 ATOMIC_OPS(xor, xor)
 
+#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
+#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot_relaxed
+#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
+
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
+#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot_relaxed
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
+
 #undef ATOMIC_OPS
 #undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 31/33] locking,mips: Convert to _relaxed atomics
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (29 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 30/33] locking,alpha: Convert to _relaxed atomics Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 32/33] locking,qrwlock: Employ atomic_fetch_add_acquire() Peter Zijlstra
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-mips-relaxed.patch --]
[-- Type: text/plain, Size: 5005 bytes --]

Generic code will construct {,_acquire,_release} versions by adding the
required smp_mb__{before,after}_atomic() calls.

XXX: if/when MIPS starts using its new SYNCxx instructions, it can
provide custom __atomic_op_{acquire,release}() macros as per the
powerpc example.
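
For illustration, such a powerpc-style override would look roughly like
this (assuming the powerpc barrier macros PPC_ACQUIRE_BARRIER /
PPC_RELEASE_BARRIER, which expand to lighter barriers than a full sync;
a sketch, not the actual header):

	#define __atomic_op_acquire(op, args...)			\
	({								\
		typeof(op##_relaxed(args)) __ret = op##_relaxed(args);	\
		__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory"); \
		__ret;							\
	})

	#define __atomic_op_release(op, args...)			\
	({								\
		__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory"); \
		op##_relaxed(args);					\
	})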

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/mips/include/asm/atomic.h |   42 +++++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 20 deletions(-)

--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -79,12 +79,10 @@ static __inline__ void atomic_##op(int i
 }
 
 #define ATOMIC_OP_RETURN(op, c_op, asm_op)				      \
-static __inline__ int atomic_##op##_return(int i, atomic_t * v)		      \
+static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v)	      \
 {									      \
 	int result;							      \
 									      \
-	smp_mb__before_llsc();						      \
-									      \
 	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
 		int temp;						      \
 									      \
@@ -125,18 +123,14 @@ static __inline__ int atomic_##op##_retu
 		raw_local_irq_restore(flags);				      \
 	}								      \
 									      \
-	smp_llsc_mb();							      \
-									      \
 	return result;							      \
 }
 
 #define ATOMIC_FETCH_OP(op, c_op, asm_op)				      \
-static __inline__ int atomic_fetch_##op(int i, atomic_t * v)		      \
+static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v)	      \
 {									      \
 	int result;							      \
 									      \
-	smp_mb__before_llsc();						      \
-									      \
 	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
 		int temp;						      \
 									      \
@@ -176,8 +170,6 @@ static __inline__ int atomic_fetch_##op(
 		raw_local_irq_restore(flags);				      \
 	}								      \
 									      \
-	smp_llsc_mb();							      \
-									      \
 	return result;							      \
 }
 
@@ -189,6 +181,11 @@ static __inline__ int atomic_fetch_##op(
 ATOMIC_OPS(add, +=, addu)
 ATOMIC_OPS(sub, -=, subu)
 
+#define atomic_add_return_relaxed	atomic_add_return_relaxed
+#define atomic_sub_return_relaxed	atomic_sub_return_relaxed
+#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op, c_op, asm_op)					      \
 	ATOMIC_OP(op, c_op, asm_op)					      \
@@ -198,6 +195,10 @@ ATOMIC_OPS(and, &=, and)
 ATOMIC_OPS(or, |=, or)
 ATOMIC_OPS(xor, ^=, xor)
 
+#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
+#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
+
 #undef ATOMIC_OPS
 #undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
@@ -420,12 +421,10 @@ static __inline__ void atomic64_##op(lon
 }
 
 #define ATOMIC64_OP_RETURN(op, c_op, asm_op)				      \
-static __inline__ long atomic64_##op##_return(long i, atomic64_t * v)	      \
+static __inline__ long atomic64_##op##_return_relaxed(long i, atomic64_t * v) \
 {									      \
 	long result;							      \
 									      \
-	smp_mb__before_llsc();						      \
-									      \
 	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
 		long temp;						      \
 									      \
@@ -467,18 +466,14 @@ static __inline__ long atomic64_##op##_r
 		raw_local_irq_restore(flags);				      \
 	}								      \
 									      \
-	smp_llsc_mb();							      \
-									      \
 	return result;							      \
 }
 
 #define ATOMIC64_FETCH_OP(op, c_op, asm_op)				      \
-static __inline__ long atomic64_fetch_##op(long i, atomic64_t * v)	      \
+static __inline__ long atomic64_fetch_##op##_relaxed(long i, atomic64_t * v)  \
 {									      \
 	long result;							      \
 									      \
-	smp_mb__before_llsc();						      \
-									      \
 	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
 		long temp;						      \
 									      \
@@ -519,8 +514,6 @@ static __inline__ long atomic64_fetch_##
 		raw_local_irq_restore(flags);				      \
 	}								      \
 									      \
-	smp_llsc_mb();							      \
-									      \
 	return result;							      \
 }
 
@@ -532,6 +525,11 @@ static __inline__ long atomic64_fetch_##
 ATOMIC64_OPS(add, +=, daddu)
 ATOMIC64_OPS(sub, -=, dsubu)
 
+#define atomic64_add_return_relaxed	atomic64_add_return_relaxed
+#define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+
 #undef ATOMIC64_OPS
 #define ATOMIC64_OPS(op, c_op, asm_op)					      \
 	ATOMIC64_OP(op, c_op, asm_op)					      \
@@ -541,6 +539,10 @@ ATOMIC64_OPS(and, &=, and)
 ATOMIC64_OPS(or, |=, or)
 ATOMIC64_OPS(xor, ^=, xor)
 
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
+
 #undef ATOMIC64_OPS
 #undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 32/33] locking,qrwlock: Employ atomic_fetch_add_acquire()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (30 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 31/33] locking,mips: " Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-05-31 10:19 ` [PATCH -v2 33/33] locking,rwsem: Employ atomic_long_fetch_add() Peter Zijlstra
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-locking-qspinlock-fetch-add.patch --]
[-- Type: text/plain, Size: 794 bytes --]

The only reason for the current code is to make GCC emit only the
"LOCK XADD" instruction on x86 (and not do a pointless extra ADD on
the result); atomic_fetch_add_acquire() achieves the same more cleanly.
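
In short, both forms yield the counter value from before the add, but
only the second maps directly onto the XADD result (illustrative sketch,
not part of the patch):

	/* old form: add the bias, then subtract it again from the result */
	cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts) - _QR_BIAS;

	/* new form: the pre-add value is what LOCK XADD already returns */
	cnts = atomic_fetch_add_acquire(_QR_BIAS, &lock->cnts);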

Acked-by: Waiman Long <waiman.long@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/locking/qrwlock.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -93,7 +93,7 @@ void queued_read_lock_slowpath(struct qr
 	 * that accesses can't leak upwards out of our subsequent critical
 	 * section in the case that the lock is currently held for write.
 	 */
-	cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts) - _QR_BIAS;
+	cnts = atomic_fetch_add_acquire(_QR_BIAS, &lock->cnts);
 	rspin_until_writer_unlock(lock, cnts);
 
 	/*

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH -v2 33/33] locking,rwsem: Employ atomic_long_fetch_add()
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (31 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 32/33] locking,qrwlock: Employ atomic_fetch_add_acquire() Peter Zijlstra
@ 2016-05-31 10:19 ` Peter Zijlstra
  2016-06-01 14:06 ` [PATCH -v2 00/33] implement atomic_fetch_$op Will Deacon
  2016-06-02  9:27 ` Vineet Gupta
  34 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-05-31 10:19 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu, Jason Low

[-- Attachment #1: peterz-locking-rwsem-fetch-add.patch --]
[-- Type: text/plain, Size: 723 bytes --]

Now that we have fetch_add(), we can stop open-coding it as add_return() - val.

Cc: Jason Low <jason.low2@hpe.com>
Cc: Waiman Long <waiman.long@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/locking/rwsem-xadd.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -153,7 +153,7 @@ __rwsem_mark_wake(struct rw_semaphore *s
 	if (wake_type != RWSEM_WAKE_READ_OWNED) {
 		adjustment = RWSEM_ACTIVE_READ_BIAS;
  try_reader_grant:
-		oldcount = atomic_long_add_return(adjustment, &sem->count) - adjustment;
+		oldcount = atomic_long_fetch_add(adjustment, &sem->count);
 
 		if (unlikely(oldcount < RWSEM_WAITING_BIAS)) {
 			/*

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 22/33] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 ` [PATCH -v2 22/33] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-05-31 17:50   ` David Miller
  0 siblings, 0 replies; 61+ messages in thread
From: David Miller @ 2016-05-31 17:50 UTC (permalink / raw)
  To: peterz
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, schwidefsky, dalias,
	cmetcalf, jcmvbkbc, arnd, dbueso, fengguang.wu

From: Peter Zijlstra <peterz@infradead.org>
Date: Tue, 31 May 2016 12:19:47 +0200

> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
> 
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 19/33] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}()
  2016-05-31 10:19 ` [PATCH -v2 19/33] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
@ 2016-06-01  3:11   ` Boqun Feng
  2016-06-01  6:10     ` Boqun Feng
  0 siblings, 1 reply; 61+ messages in thread
From: Boqun Feng @ 2016-06-01  3:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

[-- Attachment #1: Type: text/plain, Size: 919 bytes --]

Hi Peter,

On Tue, May 31, 2016 at 12:19:44PM +0200, Peter Zijlstra wrote:
[snip]
>  
> @@ -329,20 +361,53 @@ atomic64_##op##_return_relaxed(long a, a
>  	return t;							\
>  }
>  
> +#define ATOMIC64_FETCH_OP_RELAXED(op, asm_op)				\
> +static inline long							\
> +atomic64_fetch_##op##_relaxed(long a, atomic64_t *v)			\
> +{									\
> +	long res, t;							\
> +									\
> +	__asm__ __volatile__(						\
> +"1:	ldarx	%0,0,%4		# atomic64_fetch_" #op "_relaxed\n"	\
> +	#asm_op " %1,%3,%0\n"						\
> +"	stdcx.	%1,0,%4\n"						\
> +"	bne-	1b\n"							\
> +	: "=&r" (res), "=&r" (t), "+m" (v->counter)			\
> +	: "r" (a), "r" (&v->counter)					\
> +	: "cc");							\
> +									\
> +	return t;							\

Looks like I missed this one in v1, it should be
	
	return res;

because the primitives will return the values before modified by the
operations.
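
A tiny illustration of the intended semantics (not part of the patch):

	atomic64_t v = ATOMIC64_INIT(1);
	long long old, ret;

	old = atomic64_fetch_add(2, &v);	/* old == 1, v is now 3 */
	ret = atomic64_add_return(2, &v);	/* ret == 5, v is now 5 */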

Regards,
Boqun

> +}
> +

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 19/33] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}()
  2016-06-01  3:11   ` Boqun Feng
@ 2016-06-01  6:10     ` Boqun Feng
  2016-06-01  8:46       ` Peter Zijlstra
  0 siblings, 1 reply; 61+ messages in thread
From: Boqun Feng @ 2016-06-01  6:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

[-- Attachment #1: Type: text/plain, Size: 1444 bytes --]

On Wed, Jun 01, 2016 at 11:11:38AM +0800, Boqun Feng wrote:
> Hi Peter,
> 
> On Tue, May 31, 2016 at 12:19:44PM +0200, Peter Zijlstra wrote:
> [snip]
> >  
> > @@ -329,20 +361,53 @@ atomic64_##op##_return_relaxed(long a, a
> >  	return t;							\
> >  }
> >  
> > +#define ATOMIC64_FETCH_OP_RELAXED(op, asm_op)				\
> > +static inline long							\
> > +atomic64_fetch_##op##_relaxed(long a, atomic64_t *v)			\
> > +{									\
> > +	long res, t;							\
> > +									\
> > +	__asm__ __volatile__(						\
> > +"1:	ldarx	%0,0,%4		# atomic64_fetch_" #op "_relaxed\n"	\
> > +	#asm_op " %1,%3,%0\n"						\
> > +"	stdcx.	%1,0,%4\n"						\
> > +"	bne-	1b\n"							\
> > +	: "=&r" (res), "=&r" (t), "+m" (v->counter)			\
> > +	: "r" (a), "r" (&v->counter)					\
> > +	: "cc");							\
> > +									\
> > +	return t;							\
> 
> Looks like I missed this one in v1, it should be
> 	
> 	return res;
> 
> because the primitives will return the values before modified by the
> operations.
> 

FWIW, I tested on ppc with ATOMIC64_SELFTEST=y for the following branch:

git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git locking/atomic (afee54ef5b1f2b04)

without this modification, I can hit:

------------[ cut here ]------------
kernel BUG at lib/atomic64_test.c:181!
Oops: Exception in kernel mode, sig: 5 [#1]

with this modification, all the atomic selftests are passed ;-)

Regards,
Boqun

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 19/33] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}()
  2016-06-01  6:10     ` Boqun Feng
@ 2016-06-01  8:46       ` Peter Zijlstra
  0 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-01  8:46 UTC (permalink / raw)
  To: Boqun Feng
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Wed, Jun 01, 2016 at 02:10:45PM +0800, Boqun Feng wrote:
> On Wed, Jun 01, 2016 at 11:11:38AM +0800, Boqun Feng wrote:

> > Looks like I missed this one in v1, it should be
> > 	
> > 	return res;

Indeed.

> > because the primitives will return the values before modified by the
> > operations.
> > 
> 
> FWIW, I tested on ppc with ATOMIC64_SELFTEST=y for the following branch:

Thanks, I'll add a tested-by tag with your name on ;-)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 00/33] implement atomic_fetch_$op
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (32 preceding siblings ...)
  2016-05-31 10:19 ` [PATCH -v2 33/33] locking,rwsem: Employ atomic_long_fetch_add() Peter Zijlstra
@ 2016-06-01 14:06 ` Will Deacon
  2016-06-02  9:27 ` Vineet Gupta
  34 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-06-01 14:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, paulmck, boqun.feng, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Tue, May 31, 2016 at 12:19:25PM +0200, Peter Zijlstra wrote:
> As there have been a few requests for atomic_fetch_$op primitives and recently
> by Linus, I figured I'd go and implement the lot.
> 
> The atomic_fetch_$op differs from the existing atomic_$op_return we already
> have by returning the old value instead of the new value. This is especially
> useful when the operation is irreversible (like bitops), and allows for things
> like test-and-set.
> 
> This version incorporates all feedback from last time and is now complete thanks
> to Will implementing ARMv8.1-LSE versions.
> 
> No known build breakage from the build-bot.
> 
> Notes:
>  - arc asm/atomic.h is a bit of a mess after the eznps merge, I would
>    recommend a restructure or split of that file, but could not find
>    the will to do it.
>  - arc, metag and tile could convert to _relaxed.
> 
> I'm aiming to merge this for v4.8 which should get this a fair few weeks in -next.

Thanks Peter, the arm/arm64 bits all look good to me.

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 00/33] implement atomic_fetch_$op
  2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
                   ` (33 preceding siblings ...)
  2016-06-01 14:06 ` [PATCH -v2 00/33] implement atomic_fetch_$op Will Deacon
@ 2016-06-02  9:27 ` Vineet Gupta
  2016-06-02  9:33   ` Peter Zijlstra
  34 siblings, 1 reply; 61+ messages in thread
From: Vineet Gupta @ 2016-06-02  9:27 UTC (permalink / raw)
  To: Peter Zijlstra, torvalds, mingo, tglx, will.deacon, paulmck,
	boqun.feng, waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, linux, egtvedt, realmz6, ysato,
	rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb, mpe,
	schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd, dbueso,
	fengguang.wu

On Tuesday 31 May 2016 03:59 PM, Peter Zijlstra wrote:
> Notes:
>  - arc asm/atomic.h is a bit of a mess after the eznps merge, I would
>    recommend a restructure or split of that file, but could not find
>    the will to do it.
>  - arc, metag and tile could convert to _relaxed.

Yes that was bothering me too. The split is indeed in order. However good news is
even w/o it I have decent cleanup as all the backoff retry code can now be
removed. The hardware guys did their foo and production RTL with latest h/w
release doesn't suffer from the scond livelock problem. I'd disabled the config
option for some time and now the code can be removed as well.

What's ur merge plan - are u going to rebase/respin once more so I can push those
updates to Linus for 4.7-rc2. Or you could carry those ARC patches in ur tree -
ahead of ur series. I'd much rather prefer the revert / cleanup before adding new
code which extends the back off code only to be deleted later.

What say you ?

-Vineet

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 00/33] implement atomic_fetch_$op
  2016-06-02  9:27 ` Vineet Gupta
@ 2016-06-02  9:33   ` Peter Zijlstra
  2016-06-08 12:43     ` Peter Zijlstra
  0 siblings, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-02  9:33 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, linux,
	egtvedt, realmz6, ysato, rkuo, tony.luck, geert, james.hogan,
	ralf, dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Thu, Jun 02, 2016 at 09:27:36AM +0000, Vineet Gupta wrote:
> On Tuesday 31 May 2016 03:59 PM, Peter Zijlstra wrote:
> > Notes:
> >  - arc asm/atomic.h is a bit of a mess after the eznps merge, I would
> >    recommend a restructure or split of that file, but could not find
> >    the will to do it.
> >  - arc, metag and tile could convert to _relaxed.
> 
> Yes that was bothering me too. The split is indeed in order. However good news is
> even w/o it I have decent cleanup as all the backoff retry code can now be
> removed. The hardware guys did their foo and production RTL with latest h/w
> release doesn't suffer from the scond livelock problem. I'd disabled the config
> option for some time and now the code can be removed as well.
> 
> What's ur merge plan - are u going to rebase/respin once more so I can push those
> updates to Linus for 4.7-rc2. Or you could carry those ARC patches in ur tree -
> ahead of ur series. I'd much rather prefer the revert / cleanup before adding new
> code which extends the back off code only to be deleted later.

I was hoping to get these into tip for v4.8, I can rebase on whatever
changes you make in v4.7 no problem.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 00/33] implement atomic_fetch_$op
  2016-06-02  9:33   ` Peter Zijlstra
@ 2016-06-08 12:43     ` Peter Zijlstra
  2016-06-08 12:55       ` Ingo Molnar
  0 siblings, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-08 12:43 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, linux,
	egtvedt, realmz6, ysato, rkuo, tony.luck, geert, james.hogan,
	ralf, dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Thu, Jun 02, 2016 at 11:33:04AM +0200, Peter Zijlstra wrote:
> On Thu, Jun 02, 2016 at 09:27:36AM +0000, Vineet Gupta wrote:

> > What's ur merge plan - are u going to rebase/respin once more so I can push those
> > updates to Linus for 4.7-rc2. Or you could carry those ARC patches in ur tree -
> > ahead of ur series. I'd much rather prefer the revert / cleanup before adding new
> > code which extends the back off code only to be deleted later.
> 
> I was hoping to get these into tip for v4.8, I can rebase on whatever
> changes you make in v4.7 no problem.

-rc2 seems to have happened and I cannot seem to find changes to
arc/atomic.h, will you still be pushing those patches this window or
should I queue my patches?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 00/33] implement atomic_fetch_$op
  2016-06-08 12:43     ` Peter Zijlstra
@ 2016-06-08 12:55       ` Ingo Molnar
  2016-06-08 13:32         ` Peter Zijlstra
  0 siblings, 1 reply; 61+ messages in thread
From: Ingo Molnar @ 2016-06-08 12:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vineet Gupta, torvalds, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, linux,
	egtvedt, realmz6, ysato, rkuo, tony.luck, geert, james.hogan,
	ralf, dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Jun 02, 2016 at 11:33:04AM +0200, Peter Zijlstra wrote:
> > On Thu, Jun 02, 2016 at 09:27:36AM +0000, Vineet Gupta wrote:
> 
> > > What's ur merge plan - are u going to rebase/respin once more so I can push those
> > > updates to Linus for 4.7-rc2. Or you could carry those ARC patches in ur tree -
> > > ahead of ur series. I'd much rather prefer the revert / cleanup before adding new
> > > code which extends the back off code only to be deleted later.
> > 
> > I was hoping to get these into tip for v4.8, I can rebase on whatever
> > changes you make in v4.7 no problem.
> 
> -rc2 seems to have happened and I cannot seem to find changes to
> arc/atomic.h, will you still be pushing those patches this window or
> should I queue my patches?

I'd much prefer to have all of these in the locking tree (i.e. tip:locking/core), 
to make it less painful all around.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 00/33] implement atomic_fetch_$op
  2016-06-08 12:55       ` Ingo Molnar
@ 2016-06-08 13:32         ` Peter Zijlstra
  2016-06-08 14:24           ` Vineet Gupta
  0 siblings, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-08 13:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Vineet Gupta, torvalds, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, linux,
	egtvedt, realmz6, ysato, rkuo, tony.luck, geert, james.hogan,
	ralf, dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Wed, Jun 08, 2016 at 02:55:30PM +0200, Ingo Molnar wrote:

> I'd much prefer to have all of these in the locking tree (i.e. tip:locking/core), 
> to make it less painful all around.

All the fetch_op stuff, yes certainly. But Vineet wanted to munge
arch/arc/include/asm/atomic.h a bit in 4.7, which would make applying
the arc fetch_op patch tricky.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 00/33] implement atomic_fetch_$op
  2016-06-08 13:32         ` Peter Zijlstra
@ 2016-06-08 14:24           ` Vineet Gupta
  2016-06-08 14:38             ` Peter Zijlstra
  0 siblings, 1 reply; 61+ messages in thread
From: Vineet Gupta @ 2016-06-08 14:24 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: torvalds, tglx, will.deacon, paulmck, boqun.feng, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert@linux-m68k.org

On Wednesday 08 June 2016 07:02 PM, Peter Zijlstra wrote:
> On Wed, Jun 08, 2016 at 02:55:30PM +0200, Ingo Molnar wrote:
> 
>> I'd much prefer to have all of these in the locking tree (i.e. tip:locking/core), 
>> to make it less painful all around.
> 
> All the fetch_op stuff, yes certainly. But Vineet wanted to munge
> arch/arc/include/asm/atomic.h a bit in 4.7, which would make applying
> the arc fetch_op patch tricky.
> 

My patches are in linux-next already. I was hoping to squeeze in a different fix
before sending the pull request to Linus. But I don't want to stall you, will do
that first thing tomorrow morning. OK ?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 00/33] implement atomic_fetch_$op
  2016-06-08 14:24           ` Vineet Gupta
@ 2016-06-08 14:38             ` Peter Zijlstra
  0 siblings, 0 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-08 14:38 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Ingo Molnar, torvalds, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, linux,
	egtvedt, realmz6, ysato, rkuo, tony.luck, geert@linux-m68k.org

On Wed, Jun 08, 2016 at 07:54:31PM +0530, Vineet Gupta wrote:
> On Wednesday 08 June 2016 07:02 PM, Peter Zijlstra wrote:
> > On Wed, Jun 08, 2016 at 02:55:30PM +0200, Ingo Molnar wrote:
> > 
> >> I'd much prefer to have all of these in the locking tree (i.e. tip:locking/core), 
> >> to make it less painful all around.
> > 
> > All the fetch_op stuff, yes certainly. But Vineet wanted to munge
> > arch/arc/include/asm/atomic.h a bit in 4.7, which would make applying
> > the arc fetch_op patch tricky.
> > 
> 
> My patches are in linux-next already. I was hoping to squeeze in a different fix
> before sending the pull request to Linus. But I don't want to stall you, will do
> that first thing tomorrow morning. OK ?

Much thanks!

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-05-31 10:19 ` [PATCH -v2 14/33] locking,m68k: " Peter Zijlstra
@ 2016-06-16 10:08   ` Geert Uytterhoeven
  2016-06-16 10:13     ` Peter Zijlstra
  0 siblings, 1 reply; 61+ messages in thread
From: Geert Uytterhoeven @ 2016-06-16 10:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Will Deacon,
	Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

Hi Peter,

On Tue, May 31, 2016 at 12:19 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
>
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/m68k/include/asm/atomic.h |   53 +++++++++++++++++++++++++++++++++++++----
>  1 file changed, 49 insertions(+), 4 deletions(-)
>
> --- a/arch/m68k/include/asm/atomic.h
> +++ b/arch/m68k/include/asm/atomic.h
> @@ -38,6 +38,13 @@ static inline void atomic_##op(int i, at
>
>  #ifdef CONFIG_RMW_INSNS
>
> +/*
> + * Am I reading these CAS loops right in that %2 is the old value and the first
> + * iteration uses an uninitialized value?
> + *
> + * Would it not make sense to add: tmp = atomic_read(v); to avoid this?
> + */
> +
>  #define ATOMIC_OP_RETURN(op, c_op, asm_op)                             \
>  static inline int atomic_##op##_return(int i, atomic_t *v)             \
>  {                                                                      \

Do we want the above comment in the code?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 10:08   ` Geert Uytterhoeven
@ 2016-06-16 10:13     ` Peter Zijlstra
  2016-06-16 12:43       ` Andreas Schwab
  0 siblings, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-16 10:13 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Will Deacon,
	Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

On Thu, Jun 16, 2016 at 12:08:27PM +0200, Geert Uytterhoeven wrote:

> >  #ifdef CONFIG_RMW_INSNS
> >
> > +/*
> > + * Am I reading these CAS loops right in that %2 is the old value and the first
> > + * iteration uses an uninitialized value?
> > + *
> > + * Would it not make sense to add: tmp = atomic_read(v); to avoid this?
> > + */
> > +
> >  #define ATOMIC_OP_RETURN(op, c_op, asm_op)                             \
> >  static inline int atomic_##op##_return(int i, atomic_t *v)             \
> >  {                                                                      \
> 
> Do we want the above comment in the code?

I figured it would not hurt; is this indeed the case, do we want to fix
it? I can do a follow up patch clarifying the situation.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 10:13     ` Peter Zijlstra
@ 2016-06-16 12:43       ` Andreas Schwab
  2016-06-16 12:49         ` Peter Zijlstra
  2016-06-17 15:40         ` Peter Zijlstra
  0 siblings, 2 replies; 61+ messages in thread
From: Andreas Schwab @ 2016-06-16 12:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

Peter Zijlstra <peterz@infradead.org> writes:

> On Thu, Jun 16, 2016 at 12:08:27PM +0200, Geert Uytterhoeven wrote:
>
>> >  #ifdef CONFIG_RMW_INSNS
>> >
>> > +/*
>> > + * Am I reading these CAS loops right in that %2 is the old value and the first
>> > + * iteration uses an uninitialized value?
>> > + *
>> > + * Would it not make sense to add: tmp = atomic_read(v); to avoid this?
>> > + */
>> > +
>> >  #define ATOMIC_OP_RETURN(op, c_op, asm_op)                             \
>> >  static inline int atomic_##op##_return(int i, atomic_t *v)             \
>> >  {                                                                      \
>> 
>> Do we want the above comment in the code?
>
> I figured it would not hurt; is this indeed the case, do we want to fix
> it?

No, there is nothing to fix here.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 12:43       ` Andreas Schwab
@ 2016-06-16 12:49         ` Peter Zijlstra
  2016-06-16 12:53           ` Andreas Schwab
  2016-06-17 15:40         ` Peter Zijlstra
  1 sibling, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-16 12:49 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

On Thu, Jun 16, 2016 at 02:43:29PM +0200, Andreas Schwab wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> >> > +/*
> >> > + * Am I reading these CAS loops right in that %2 is the old value and the first
> >> > + * iteration uses an uninitialized value?
> >> > + *
> >> > + * Would it not make sense to add: tmp = atomic_read(v); to avoid this?
> >> > + */

> No, there is nothing to fix here.

OK, care to elucidate? Clearly I need help reading this.

I'm more than happy to remove the comment, but I would like to better
understand.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 12:49         ` Peter Zijlstra
@ 2016-06-16 12:53           ` Andreas Schwab
  2016-06-16 14:35             ` Peter Zijlstra
  0 siblings, 1 reply; 61+ messages in thread
From: Andreas Schwab @ 2016-06-16 12:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

Peter Zijlstra <peterz@infradead.org> writes:

> On Thu, Jun 16, 2016 at 02:43:29PM +0200, Andreas Schwab wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>> >> > +/*
>> >> > + * Am I reading these CAS loops right in that %2 is the old value and the first
>> >> > + * iteration uses an uninitialized value?
>> >> > + *
>> >> > + * Would it not make sense to add: tmp = atomic_read(v); to avoid this?
>> >> > + */
>
>> No, there is nothing to fix here.
>
> OK, care to elucidate? Clearly I need help reading this.

grep '2.*atomic_read'

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 12:53           ` Andreas Schwab
@ 2016-06-16 14:35             ` Peter Zijlstra
  2016-06-16 14:37               ` Andreas Schwab
  0 siblings, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-16 14:35 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

On Thu, Jun 16, 2016 at 02:53:09PM +0200, Andreas Schwab wrote:
> Peter Zijlstra <peterz@infradead.org> writes:

> > OK, care to elucidate? Clearly I need help reading this.
> 
> grep '2.*atomic_read'

Thanks to your detailed answer, I found yet another obscure inline asm
syntax 'feature'.

So the "2" input operand actually sets the value of "=&d" (tmp), how
creative...

I would find:

#define ATOMIC_OP_RETURN(op, c_op, asm_op)                              \
static inline int atomic_##op##_return(int i, atomic_t *v)              \
{                                                                       \
        int t, tmp = atomic_read(v);                                    \
                                                                        \
        __asm__ __volatile__(                                           \
                        "1:     movel %2,%1\n"                          \
                        "       " #asm_op "l %3,%1\n"                   \
                        "       casl %2,%1,%0\n"                        \
                        "       jne 1b"                                 \
                        : "+m" (*v), "=&d" (t), "+d" (tmp)              \
                        : "g" (i));                                     \
        return t;                                                       \
}

Much more obvious.
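
For anyone else who trips over this, here is a minimal stand-alone sketch
of a matching constraint (hypothetical user-space C, x86 mnemonics purely
for illustration -- not the kernel or m68k code): the digit in the input
list ties that input to the numbered output operand, so the asm starts
with the output register already seeded.

#include <stdio.h>

static inline int add_via_matching_constraint(int a, int b)
{
	int res;

	/* "0" reuses output operand 0, so %0 (res) enters the asm already
	 * holding 'a' -- the same trick the m68k code plays with its "2"
	 * input operand. */
	__asm__ ("addl %2,%0"
		 : "=r" (res)		/* %0: result              */
		 : "0" (a), "r" (b));	/* seed %0 with a, %2 is b */

	return res;
}

int main(void)
{
	printf("%d\n", add_via_matching_constraint(2, 3));	/* prints 5 */
	return 0;
}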

But you're right, it seems to be sorted. I'll queue a patch removing
that comment.

Thanks!

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 14:35             ` Peter Zijlstra
@ 2016-06-16 14:37               ` Andreas Schwab
  2016-06-16 14:56                 ` Peter Zijlstra
  0 siblings, 1 reply; 61+ messages in thread
From: Andreas Schwab @ 2016-06-16 14:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

Peter Zijlstra <peterz@infradead.org> writes:

> So the "2" input operand actually sets the value of "=&d" (tmp), how
> creative...

That was the only way to do it when this was written.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 14:37               ` Andreas Schwab
@ 2016-06-16 14:56                 ` Peter Zijlstra
  2016-06-16 15:04                   ` Andreas Schwab
  0 siblings, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-16 14:56 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

On Thu, Jun 16, 2016 at 04:37:36PM +0200, Andreas Schwab wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> 
> > So the "2" input operand actually sets the value of "=&d" (tmp), how
> > creative...
> 
> That was the only way to do it when this was written.

Fair enough; do you still support a compiler old enough to require this?
If not, do you want me to 'fix' this or just remove the comment?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 14:56                 ` Peter Zijlstra
@ 2016-06-16 15:04                   ` Andreas Schwab
  2016-06-16 17:44                     ` Peter Zijlstra
  0 siblings, 1 reply; 61+ messages in thread
From: Andreas Schwab @ 2016-06-16 15:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

Peter Zijlstra <peterz@infradead.org> writes:

> If not, do you want me to 'fix' this or just remove the comment?

It's not broken, so nothing to fix.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 15:04                   ` Andreas Schwab
@ 2016-06-16 17:44                     ` Peter Zijlstra
  2016-06-16 19:18                       ` Andreas Schwab
  2016-06-16 19:55                       ` Geert Uytterhoeven
  0 siblings, 2 replies; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-16 17:44 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

On Thu, Jun 16, 2016 at 05:04:24PM +0200, Andreas Schwab wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> 
> > If not, do you want me to 'fix' this or just remove the comment?
> 
> It's not broken, so nothing to fix.

It's non-obvious code; that's usually plenty of reason to change it.

Geert, you maintain this stuff, what say you? Is there still a good
reason (like supporting ancient compilers that don't do "+d" for
example) to keep the code as is?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 17:44                     ` Peter Zijlstra
@ 2016-06-16 19:18                       ` Andreas Schwab
  2016-06-16 19:55                       ` Geert Uytterhoeven
  1 sibling, 0 replies; 61+ messages in thread
From: Andreas Schwab @ 2016-06-16 19:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

Peter Zijlstra <peterz@infradead.org> writes:

> On Thu, Jun 16, 2016 at 05:04:24PM +0200, Andreas Schwab wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>> 
>> > If not, do you want me to 'fix' this or just remove the comment?
>> 
>> It's not broken, so nothing to fix.
>
> It's non-obvious code; that's usually plenty of reason to change it.

You appear to be the only one who has a problem with that documented
construct.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 17:44                     ` Peter Zijlstra
  2016-06-16 19:18                       ` Andreas Schwab
@ 2016-06-16 19:55                       ` Geert Uytterhoeven
  1 sibling, 0 replies; 61+ messages in thread
From: Geert Uytterhoeven @ 2016-06-16 19:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andreas Schwab, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

Hi Peter,

On Thu, Jun 16, 2016 at 7:44 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Jun 16, 2016 at 05:04:24PM +0200, Andreas Schwab wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>>
>> > If not, do you want me to 'fix' this or just remove the comment?
>>
>> It's not broken, so nothing to fix.
>
> It's non-obvious code; that's usually plenty of reason to change it.
>
> Geert, you maintain this stuff, what say you? Is there still a good
> reason (like supporting ancient compilers that don't do "+d" for
> example) to keep the code as is?

I don't know when support for "+d" was introduced.
But given people regularly use old compilers, I'm not inclined to change it,
unless there's a very good reason.

BTW, what's the failure mode if an old compiler not supporting "+d"
encounters it?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-16 12:43       ` Andreas Schwab
  2016-06-16 12:49         ` Peter Zijlstra
@ 2016-06-17 15:40         ` Peter Zijlstra
  2016-06-20 17:47           ` Andreas Schwab
  1 sibling, 1 reply; 61+ messages in thread
From: Peter Zijlstra @ 2016-06-17 15:40 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k


Could either of you comment on the below patch?

All atomic functions that return a value should imply full memory
barrier semantics -- this very much includes a compiler barrier / memory
clobber.
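
As a hedged user-space sketch of what the clobber buys us (hypothetical
code, not the kernel's; the kernel's barrier() is essentially the same
empty asm): without "memory" the compiler may assume the asm neither
reads nor writes memory and can keep shared values cached in registers
across it.

/* my_barrier() mirrors the kernel's barrier(): an empty asm whose only
 * effect is the "memory" clobber, i.e. a pure compiler barrier. */
#define my_barrier()	__asm__ __volatile__("" : : : "memory")

static int flag;
static int data;

int wait_for_data(void)
{
	/* The clobber forces the compiler to re-read 'flag' on every
	 * iteration; without it the compiler may legally keep 'flag'
	 * cached in a register and spin forever -- the kind of value
	 * reuse the value-returning atomics need to rule out. */
	while (!flag)
		my_barrier();

	return data;
}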



---

 arch/m68k/include/asm/atomic.h  | 19 ++++++++++++-------
 arch/m68k/include/asm/cmpxchg.h |  9 ++++++---
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/m68k/include/asm/atomic.h b/arch/m68k/include/asm/atomic.h
index 3e03de7ae33b..062a60417cb9 100644
--- a/arch/m68k/include/asm/atomic.h
+++ b/arch/m68k/include/asm/atomic.h
@@ -56,7 +56,8 @@ static inline int atomic_##op##_return(int i, atomic_t *v)		\
 			"	casl %2,%1,%0\n"			\
 			"	jne 1b"					\
 			: "+m" (*v), "=&d" (t), "=&d" (tmp)		\
-			: "g" (i), "2" (atomic_read(v)));		\
+			: "g" (i), "2" (atomic_read(v))			\
+			: "memory");					\
 	return t;							\
 }
 
@@ -71,7 +72,8 @@ static inline int atomic_fetch_##op(int i, atomic_t *v)			\
 			"	casl %2,%1,%0\n"			\
 			"	jne 1b"					\
 			: "+m" (*v), "=&d" (t), "=&d" (tmp)		\
-			: "g" (i), "2" (atomic_read(v)));		\
+			: "g" (i), "2" (atomic_read(v))			\
+			: "memory");					\
 	return tmp;							\
 }
 
@@ -141,7 +143,7 @@ static inline void atomic_dec(atomic_t *v)
 static inline int atomic_dec_and_test(atomic_t *v)
 {
 	char c;
-	__asm__ __volatile__("subql #1,%1; seq %0" : "=d" (c), "+m" (*v));
+	__asm__ __volatile__("subql #1,%1; seq %0" : "=d" (c), "+m" (*v) : : "memory");
 	return c != 0;
 }
 
@@ -151,14 +153,15 @@ static inline int atomic_dec_and_test_lt(atomic_t *v)
 	__asm__ __volatile__(
 		"subql #1,%1; slt %0"
 		: "=d" (c), "=m" (*v)
-		: "m" (*v));
+		: "m" (*v)
+		: "memory");
 	return c != 0;
 }
 
 static inline int atomic_inc_and_test(atomic_t *v)
 {
 	char c;
-	__asm__ __volatile__("addql #1,%1; seq %0" : "=d" (c), "+m" (*v));
+	__asm__ __volatile__("addql #1,%1; seq %0" : "=d" (c), "+m" (*v) : : "memory");
 	return c != 0;
 }
 
@@ -204,7 +207,8 @@ static inline int atomic_sub_and_test(int i, atomic_t *v)
 	char c;
 	__asm__ __volatile__("subl %2,%1; seq %0"
 			     : "=d" (c), "+m" (*v)
-			     : ASM_DI (i));
+			     : ASM_DI (i)
+			     : "memory");
 	return c != 0;
 }
 
@@ -213,7 +217,8 @@ static inline int atomic_add_negative(int i, atomic_t *v)
 	char c;
 	__asm__ __volatile__("addl %2,%1; smi %0"
 			     : "=d" (c), "+m" (*v)
-			     : ASM_DI (i));
+			     : ASM_DI (i)
+			     : "memory");
 	return c != 0;
 }
 
diff --git a/arch/m68k/include/asm/cmpxchg.h b/arch/m68k/include/asm/cmpxchg.h
index 83b1df80f0ac..d8b3d2b48785 100644
--- a/arch/m68k/include/asm/cmpxchg.h
+++ b/arch/m68k/include/asm/cmpxchg.h
@@ -98,17 +98,20 @@ static inline unsigned long __cmpxchg(volatile void *p, unsigned long old,
 	case 1:
 		__asm__ __volatile__ ("casb %0,%2,%1"
 				      : "=d" (old), "=m" (*(char *)p)
-				      : "d" (new), "0" (old), "m" (*(char *)p));
+				      : "d" (new), "0" (old), "m" (*(char *)p)
+				      : "memory");
 		break;
 	case 2:
 		__asm__ __volatile__ ("casw %0,%2,%1"
 				      : "=d" (old), "=m" (*(short *)p)
-				      : "d" (new), "0" (old), "m" (*(short *)p));
+				      : "d" (new), "0" (old), "m" (*(short *)p)
+				      : "memory");
 		break;
 	case 4:
 		__asm__ __volatile__ ("casl %0,%2,%1"
 				      : "=d" (old), "=m" (*(int *)p)
-				      : "d" (new), "0" (old), "m" (*(int *)p));
+				      : "d" (new), "0" (old), "m" (*(int *)p)
+				      : "memory");
 		break;
 	default:
 		old = __invalid_cmpxchg_size(p, old, new, size);

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-17 15:40         ` Peter Zijlstra
@ 2016-06-20 17:47           ` Andreas Schwab
  2016-06-21  4:27             ` Finn Thain
  0 siblings, 1 reply; 61+ messages in thread
From: Andreas Schwab @ 2016-06-20 17:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Geert Uytterhoeven, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Will Deacon, Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang, linux-m68k

Peter Zijlstra <peterz@infradead.org> writes:

> Could either of you comment on the below patch?
>
> All atomic functions that return a value should imply full memory
> barrier semantics -- this very much includes a compiler barrier / memory
> clobber.

I wonder if it is possible to find a case where this makes a real
difference, ie. where the compiler erroneously reused a value due to
the missing barrier.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH -v2 14/33] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-06-20 17:47           ` Andreas Schwab
@ 2016-06-21  4:27             ` Finn Thain
  0 siblings, 0 replies; 61+ messages in thread
From: Finn Thain @ 2016-06-21  4:27 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Peter Zijlstra, Geert Uytterhoeven, Linus Torvalds, Ingo Molnar,
	Thomas Gleixner, Will Deacon, Paul McKenney, boqun.feng,
	waiman.long, Frédéric Weisbecker,
	linux-kernel@vger.kernel.org, Linux-Arch, Richard Henderson,
	Vineet Gupta, Russell King, Hans-Christian Noren Egtvedt,
	Miao Steven, Yoshinori Sato, Richard Kuo, Tony Luck, James Hogan,
	Ralf Baechle, David Howells, James E.J. Bottomley,
	Michael Ellerman, Martin Schwidefsky, Rich Felker,
	David S. Miller, cmetcalf, Max Filippov, Arnd Bergmann, dbueso,
	Wu Fengguang, linux-m68k


On Mon, 20 Jun 2016, Andreas Schwab wrote:

> Peter Zijlstra <peterz@infradead.org> writes:
> 
> > Could either of you comment on the below patch?
> >
> > All atomic functions that return a value should imply full memory 
> > barrier semantics -- this very much includes a compiler barrier / 
> > memory clobber.
> 
> I wonder if it is possible to find a case where this makes a real 
> difference, ie. where the compiler erroneously reused a value due to the 
> missing barrier.

What the compiler does erroneously is a compiler bug by definition. But I 
think that was not what you meant.

Perhaps you're asking whether gcc in particular does what you expect, 
despite ambiguous source code. But what about other tools like static 
analyzers?

Ambiguous code is likely to keep attracting patches like this for as long
as it remains ambiguous. That's a waste of everyone's time, when a patch
like this could be written and reviewed just once.

-- 

> 
> Andreas.
> 
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2016-06-21  4:35 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-31 10:19 [PATCH -v2 00/33] implement atomic_fetch_$op Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 01/33] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 02/33] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 03/33] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 04/33] locking,arm64: " Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 05/33] arm64: atomic: generate LSE non-return cases using common macros Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 06/33] locking,arm64: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() for LSE instructions Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 07/33] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 08/33] locking,blackfin: " Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 09/33] locking,frv: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 10/33] locking,h8300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 11/33] locking,hexagon: " Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 12/33] locking,ia64: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 13/33] locking,m32r: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 14/33] locking,m68k: " Peter Zijlstra
2016-06-16 10:08   ` Geert Uytterhoeven
2016-06-16 10:13     ` Peter Zijlstra
2016-06-16 12:43       ` Andreas Schwab
2016-06-16 12:49         ` Peter Zijlstra
2016-06-16 12:53           ` Andreas Schwab
2016-06-16 14:35             ` Peter Zijlstra
2016-06-16 14:37               ` Andreas Schwab
2016-06-16 14:56                 ` Peter Zijlstra
2016-06-16 15:04                   ` Andreas Schwab
2016-06-16 17:44                     ` Peter Zijlstra
2016-06-16 19:18                       ` Andreas Schwab
2016-06-16 19:55                       ` Geert Uytterhoeven
2016-06-17 15:40         ` Peter Zijlstra
2016-06-20 17:47           ` Andreas Schwab
2016-06-21  4:27             ` Finn Thain
2016-05-31 10:19 ` [PATCH -v2 15/33] locking,metag: " Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 16/33] locking,mips: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 17/33] locking,mn10300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 18/33] locking,parisc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 19/33] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
2016-06-01  3:11   ` Boqun Feng
2016-06-01  6:10     ` Boqun Feng
2016-06-01  8:46       ` Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 20/33] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 21/33] locking,sh: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 22/33] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 17:50   ` David Miller
2016-05-31 10:19 ` [PATCH -v2 23/33] locking,tile: " Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 24/33] locking,x86: " Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 25/33] locking,xtensa: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 26/33] locking: Fix atomic64_relaxed bits Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 27/33] locking: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 28/33] locking: Remove linux/atomic.h:atomic_fetch_or Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 29/33] locking: Remove the deprecated atomic_{set,clear}_mask() functions Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 30/33] locking,alpha: Convert to _relaxed atomics Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 31/33] locking,mips: " Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 32/33] locking,qrwlock: Employ atomic_fetch_add_acquire() Peter Zijlstra
2016-05-31 10:19 ` [PATCH -v2 33/33] locking,rwsem: Employ atomic_long_fetch_add() Peter Zijlstra
2016-06-01 14:06 ` [PATCH -v2 00/33] implement atomic_fetch_$op Will Deacon
2016-06-02  9:27 ` Vineet Gupta
2016-06-02  9:33   ` Peter Zijlstra
2016-06-08 12:43     ` Peter Zijlstra
2016-06-08 12:55       ` Ingo Molnar
2016-06-08 13:32         ` Peter Zijlstra
2016-06-08 14:24           ` Vineet Gupta
2016-06-08 14:38             ` Peter Zijlstra
