* [RFC][PATCH 00/31] implement atomic_fetch_$op
@ 2016-04-22  9:04 Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or Peter Zijlstra
                   ` (31 more replies)
  0 siblings, 32 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

As there have been a few requests for atomic_fetch_$op primitives, most
recently by Linus, I figured I'd go and implement the lot.

The atomic_fetch_$op primitives differ from the existing atomic_$op_return
ones by returning the old value of the atomic variable instead of the new
value. This is especially useful when the operation is irreversible (like
bitops), and it allows for things like test-and-set.
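
For illustration only (not part of the series): with a fetch-op a caller can
set a flag bit and still learn whether it was already set, something a
new-value-returning OR cannot tell you once the bit has been merged in. This
sketch assumes the argument order the series settles on in patch 01 and uses
a made-up helper name:

	static inline bool example_test_and_set_flag(int bit, atomic_t *flags)
	{
		/* returns the value *before* the OR, with the bit now set */
		int old = atomic_fetch_or(BIT(bit), flags);

		return old & BIT(bit);	/* was the flag already set? */
	}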

I did these patches on the plane, without access to architecture documentation,
so mistakes are quite possible. However, I mostly started from the existing
atomic_$op_return implementations and modified them to return the old value,
which significantly decreases the creativity required.

The one I did not do is the ARMv8.1-LSE variant; I was hoping Will would help
out with that. Also, it looks like the 0-day build bot does not do arm64
builds; people might want to look into that.

No known build breakage on the build-bot, boot tested on x86_64.

Notes:
 - tile might have a barrier issue
 - there are a few more archs that can be converted to _relaxed if they so care:
   arc metag tile

---
 arch/alpha/include/asm/atomic.h        |  89 ++++-
 arch/arc/include/asm/atomic.h          |  67 +++-
 arch/arm/include/asm/atomic.h          | 106 +++++-
 arch/arm64/include/asm/atomic.h        |  30 ++
 arch/arm64/include/asm/atomic_ll_sc.h  | 108 ++++--
 arch/avr32/include/asm/atomic.h        |  54 ++-
 arch/blackfin/include/asm/atomic.h     |   8 +
 arch/blackfin/kernel/bfin_ksyms.c      |   1 +
 arch/blackfin/mach-bf561/atomic.S      |  43 ++-
 arch/frv/include/asm/atomic.h          |  30 +-
 arch/frv/include/asm/atomic_defs.h     |   2 +
 arch/h8300/include/asm/atomic.h        |  29 +-
 arch/hexagon/include/asm/atomic.h      |  31 +-
 arch/ia64/include/asm/atomic.h         | 130 ++++++-
 arch/m32r/include/asm/atomic.h         |  36 +-
 arch/m68k/include/asm/atomic.h         |  51 ++-
 arch/metag/include/asm/atomic_lnkget.h |  36 +-
 arch/metag/include/asm/atomic_lock1.h  |  33 +-
 arch/mips/include/asm/atomic.h         | 154 +++++++-
 arch/mn10300/include/asm/atomic.h      |  33 +-
 arch/parisc/include/asm/atomic.h       |  63 +++-
 arch/powerpc/include/asm/atomic.h      |  83 ++++-
 arch/s390/include/asm/atomic.h         |  40 ++-
 arch/sh/include/asm/atomic-grb.h       |  34 +-
 arch/sh/include/asm/atomic-irq.h       |  31 +-
 arch/sh/include/asm/atomic-llsc.h      |  32 +-
 arch/sparc/include/asm/atomic_32.h     |  13 +-
 arch/sparc/include/asm/atomic_64.h     |  16 +-
 arch/sparc/lib/atomic32.c              |  29 +-
 arch/sparc/lib/atomic_64.S             |  61 +++-
 arch/sparc/lib/ksyms.c                 |  17 +-
 arch/tile/include/asm/atomic.h         |   2 +
 arch/tile/include/asm/atomic_32.h      |  60 +++-
 arch/tile/include/asm/atomic_64.h      | 117 ++++--
 arch/tile/include/asm/bitops_32.h      |  18 +-
 arch/tile/lib/atomic_32.c              |  42 +--
 arch/tile/lib/atomic_asm_32.S          |  14 +-
 arch/x86/include/asm/atomic.h          |  35 +-
 arch/x86/include/asm/atomic64_32.h     |  25 +-
 arch/x86/include/asm/atomic64_64.h     |  35 +-
 arch/xtensa/include/asm/atomic.h       |  52 ++-
 include/asm-generic/atomic-long.h      |  36 +-
 include/asm-generic/atomic.h           |  47 +++
 include/asm-generic/atomic64.h         |  15 +-
 include/linux/atomic.h                 | 627 ++++++++++++++++++++++++---------
 kernel/locking/qrwlock.c               |   2 +-
 kernel/locking/qspinlock_paravirt.h    |   4 +-
 kernel/time/tick-sched.c               |   4 +-
 lib/atomic64.c                         |  32 +-
 lib/atomic64_test.c                    |  34 ++
 50 files changed, 2181 insertions(+), 510 deletions(-)


* [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22 10:54   ` Will Deacon
  2016-04-22 11:09     ` Geert Uytterhoeven
  2016-04-22  9:04 ` [RFC][PATCH 02/31] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
                   ` (30 subsequent siblings)
  31 siblings, 2 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch_or-flip-args.patch --]
[-- Type: text/plain, Size: 1357 bytes --]

All the other atomic operations take the operand first and the atomic_t
pointer last; atomic_fetch_or() currently has them the other way around.
Flip its arguments to make it consistent.
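
For reference, a sketch of the resulting convention (illustrative only):

	atomic_or(mask, &v);			/* existing ops: value, then pointer */
	old = atomic_fetch_or(mask, &v);	/* after this patch: same order */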

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/atomic.h   |    4 ++--
 kernel/time/tick-sched.c |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -560,11 +560,11 @@ static inline int atomic_dec_if_positive
 
 /**
  * atomic_fetch_or - perform *p |= mask and return old value of *p
- * @p: pointer to atomic_t
  * @mask: mask to OR on the atomic_t
+ * @p: pointer to atomic_t
  */
 #ifndef atomic_fetch_or
-static inline int atomic_fetch_or(atomic_t *p, int mask)
+static inline int atomic_fetch_or(int mask, atomic_t *p)
 {
 	int old, val = atomic_read(p);
 
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -262,7 +262,7 @@ static void tick_nohz_dep_set_all(atomic
 {
 	int prev;
 
-	prev = atomic_fetch_or(dep, BIT(bit));
+	prev = atomic_fetch_or(BIT(bit), dep);
 	if (!prev)
 		tick_nohz_full_kick_all();
 }
@@ -292,7 +292,7 @@ void tick_nohz_dep_set_cpu(int cpu, enum
 
 	ts = per_cpu_ptr(&tick_cpu_sched, cpu);
 
-	prev = atomic_fetch_or(&ts->tick_dep_mask, BIT(bit));
+	prev = atomic_fetch_or(BIT(bit), &ts->tick_dep_mask);
 	if (!prev) {
 		preempt_disable();
 		/* Perf needs local kick that is NMI safe */


* [RFC][PATCH 02/31] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22 16:57   ` Richard Henderson
  2016-04-22  9:04 ` [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
                   ` (29 subsequent siblings)
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-alpha.patch --]
[-- Type: text/plain, Size: 3116 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
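
As a rough C-level sketch of what the ATOMIC_FETCH_OP() macro below generates
(pseudocode only; load_locked()/store_conditional() are made-up names standing
in for the ldl_l/stl_c pair in the actual asm):

	static inline int atomic_fetch_add(int i, atomic_t *v)
	{
		int old, new;

		smp_mb();
		do {
			old = load_locked(&v->counter);		/* ldl_l */
			new = old + i;				/* addl  */
		} while (!store_conditional(&v->counter, new));	/* stl_c, retry on failure */
		smp_mb();

		return old;	/* the value before the addition, not after */
	}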

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/alpha/include/asm/atomic.h |   67 ++++++++++++++++++++++++++++++++++------
 1 file changed, 58 insertions(+), 9 deletions(-)

--- a/arch/alpha/include/asm/atomic.h
+++ b/arch/alpha/include/asm/atomic.h
@@ -65,6 +65,26 @@ static inline int atomic_##op##_return(i
 	return result;							\
 }
 
+#define ATOMIC_FETCH_OP(op, asm_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	long temp, result;						\
+	smp_mb();							\
+	__asm__ __volatile__(						\
+	"1:	ldl_l %0,%1\n"						\
+	"       mov %0,%2\n"						\
+	"	" #asm_op " %0,%3,%0\n"					\
+	"	stl_c %0,%1\n"						\
+	"	beq %0,2f\n"						\
+	".subsection 2\n"						\
+	"2:	br 1b\n"						\
+	".previous"							\
+	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
+	:"Ir" (i), "m" (v->counter) : "memory");			\
+	smp_mb();							\
+	return result;							\
+}
+
 #define ATOMIC64_OP(op, asm_op)						\
 static __inline__ void atomic64_##op(long i, atomic64_t * v)		\
 {									\
@@ -101,11 +121,33 @@ static __inline__ long atomic64_##op##_r
 	return result;							\
 }
 
+#define ATOMIC64_FETCH_OP(op, asm_op)					\
+static __inline__ long atomic64_fetch_##op(long i, atomic64_t * v)	\
+{									\
+	long temp, result;						\
+	smp_mb();							\
+	__asm__ __volatile__(						\
+	"1:	ldq_l %0,%1\n"						\
+	"	mov %0,%2\n"						\
+	"	" #asm_op " %0,%3,%0\n"					\
+	"	stq_c %0,%1\n"						\
+	"	beq %0,2f\n"						\
+	".subsection 2\n"						\
+	"2:	br 1b\n"						\
+	".previous"							\
+	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
+	:"Ir" (i), "m" (v->counter) : "memory");			\
+	smp_mb();							\
+	return result;							\
+}
+
 #define ATOMIC_OPS(op)							\
 	ATOMIC_OP(op, op##l)						\
 	ATOMIC_OP_RETURN(op, op##l)					\
+	ATOMIC_FETCH_OP(op, op##l)					\
 	ATOMIC64_OP(op, op##q)						\
-	ATOMIC64_OP_RETURN(op, op##q)
+	ATOMIC64_OP_RETURN(op, op##q)					\
+	ATOMIC64_FETCH_OP(op, op##q)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
@@ -113,18 +155,25 @@ ATOMIC_OPS(sub)
 #define atomic_andnot atomic_andnot
 #define atomic64_andnot atomic64_andnot
 
-ATOMIC_OP(and, and)
-ATOMIC_OP(andnot, bic)
-ATOMIC_OP(or, bis)
-ATOMIC_OP(xor, xor)
-ATOMIC64_OP(and, and)
-ATOMIC64_OP(andnot, bic)
-ATOMIC64_OP(or, bis)
-ATOMIC64_OP(xor, xor)
+#define atomic_fetch_or atomic_fetch_or
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, asm)						\
+	ATOMIC_OP(op, asm)						\
+	ATOMIC_FETCH_OP(op, asm)					\
+	ATOMIC64_OP(op, asm)						\
+	ATOMIC64_FETCH_OP(op, asm)
+
+ATOMIC_OPS(and, and)
+ATOMIC_OPS(andnot, bic)
+ATOMIC_OPS(or, bis)
+ATOMIC_OPS(xor, xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 


* [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 02/31] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22 10:50     ` Vineet Gupta
  2016-04-22  9:04 ` [RFC][PATCH 04/31] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
                   ` (28 subsequent siblings)
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-arc.patch --]
[-- Type: text/plain, Size: 3105 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arc/include/asm/atomic.h |   69 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 64 insertions(+), 5 deletions(-)

--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -102,6 +102,38 @@ static inline int atomic_##op##_return(i
 	return val;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned int val, result;			                \
+	SCOND_FAIL_RETRY_VAR_DEF                                        \
+									\
+	/*								\
+	 * Explicit full memory barrier needed before/after as		\
+	 * LLOCK/SCOND themselves don't provide any such semantics	\
+	 */								\
+	smp_mb();							\
+									\
+	__asm__ __volatile__(						\
+	"1:	llock   %[val], [%[ctr]]		\n"		\
+	"	mov %[result], %[val]			\n"		\
+	"	" #asm_op " %[val], %[val], %[i]	\n"		\
+	"	scond   %[val], [%[ctr]]		\n"		\
+	"						\n"		\
+	SCOND_FAIL_RETRY_ASM						\
+									\
+	: [val]	"=&r"	(val),						\
+	  [result] "=&r" (result)					\
+	  SCOND_FAIL_RETRY_VARS						\
+	: [ctr]	"r"	(&v->counter),					\
+	  [i]	"ir"	(i)						\
+	: "cc");							\
+									\
+	smp_mb();							\
+									\
+	return result;							\
+}
+
 #else	/* !CONFIG_ARC_HAS_LLSC */
 
 #ifndef CONFIG_SMP
@@ -164,23 +196,50 @@ static inline int atomic_##op##_return(i
 	return temp;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long flags;						\
+	unsigned long temp, result;					\
+									\
+	/*								\
+	 * spin lock/unlock provides the needed smp_mb() before/after	\
+	 */								\
+	atomic_ops_lock(flags);						\
+	result = temp = v->counter;					\
+	temp c_op i;							\
+	v->counter = temp;						\
+	atomic_ops_unlock(flags);					\
+									\
+	return result;							\
+}
+
 #endif /* !CONFIG_ARC_HAS_LLSC */
 
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, add)
 ATOMIC_OPS(sub, -=, sub)
 
 #define atomic_andnot atomic_andnot
 
-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(andnot, &= ~, bic)
-ATOMIC_OP(or, |=, or)
-ATOMIC_OP(xor, ^=, xor)
+#define atomic_fetch_or atomic_fetch_or
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					\
+	ATOMIC_OP(op, c_op, asm_op)					\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(andnot, &= ~, bic)
+ATOMIC_OPS(or, |=, or)
+ATOMIC_OPS(xor, ^=, xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 #undef SCOND_FAIL_RETRY_VAR_DEF


* [RFC][PATCH 04/31] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (2 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22 11:35   ` Will Deacon
  2016-04-22  9:04 ` [RFC][PATCH 05/31] locking,arm64: " Peter Zijlstra
                   ` (27 subsequent siblings)
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-arm.patch --]
[-- Type: text/plain, Size: 5268 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
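
Only the _relaxed flavours are provided for the LL/SC (SMP) side; the generic
include/linux/atomic.h machinery builds the fully ordered forms from them,
roughly along these lines (sketch, see include/linux/atomic.h for the real
definitions):

	#define __atomic_op_fence(op, args...)				\
	({								\
		typeof(op##_relaxed(args)) __ret;			\
		smp_mb__before_atomic();				\
		__ret = op##_relaxed(args);				\
		smp_mb__after_atomic();					\
		__ret;							\
	})

	/* e.g. atomic_fetch_add(...) ends up as
	 * __atomic_op_fence(atomic_fetch_add, ...) when only the _relaxed
	 * variant is defined by the architecture. */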

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arm/include/asm/atomic.h |  108 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 98 insertions(+), 10 deletions(-)

--- a/arch/arm/include/asm/atomic.h
+++ b/arch/arm/include/asm/atomic.h
@@ -77,8 +77,36 @@ static inline int atomic_##op##_return_r
 	return result;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op##_relaxed(int i, atomic_t *v)	\
+{									\
+	unsigned long tmp;						\
+	int result, val;						\
+									\
+	prefetchw(&v->counter);						\
+									\
+	__asm__ __volatile__("@ atomic_fetch_" #op "\n"			\
+"1:	ldrex	%0, [%4]\n"						\
+"	" #asm_op "	%1, %0, %5\n"					\
+"	strex	%2, %1, [%4]\n"						\
+"	teq	%2, #0\n"						\
+"	bne	1b"							\
+	: "=&r" (result), "=&r" (val), "=&r" (tmp), "+Qo" (v->counter)	\
+	: "r" (&v->counter), "Ir" (i)					\
+	: "cc");							\
+									\
+	return result;							\
+}
+
 #define atomic_add_return_relaxed	atomic_add_return_relaxed
 #define atomic_sub_return_relaxed	atomic_sub_return_relaxed
+#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+
+#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
+#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot_relaxed
+#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
 
 static inline int atomic_cmpxchg_relaxed(atomic_t *ptr, int old, int new)
 {
@@ -159,6 +187,20 @@ static inline int atomic_##op##_return(i
 	return val;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long flags;						\
+	int val;							\
+									\
+	raw_local_irq_save(flags);					\
+	val = v->counter;						\
+	v->counter c_op i;						\
+	raw_local_irq_restore(flags);					\
+									\
+	return val;							\
+}
+
 static inline int atomic_cmpxchg(atomic_t *v, int old, int new)
 {
 	int ret;
@@ -187,19 +229,28 @@ static inline int __atomic_add_unless(at
 
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, add)
 ATOMIC_OPS(sub, -=, sub)
 
 #define atomic_andnot atomic_andnot
 
-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(andnot, &= ~, bic)
-ATOMIC_OP(or,  |=, orr)
-ATOMIC_OP(xor, ^=, eor)
+#define atomic_fetch_or atomic_fetch_or
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					\
+	ATOMIC_OP(op, c_op, asm_op)					\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(andnot, &= ~, bic)
+ATOMIC_OPS(or,  |=, orr)
+ATOMIC_OPS(xor, ^=, eor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -317,24 +368,61 @@ atomic64_##op##_return_relaxed(long long
 	return result;							\
 }
 
+#define ATOMIC64_FETCH_OP(op, op1, op2)					\
+static inline long long							\
+atomic64_fetch_##op##_relaxed(long long i, atomic64_t *v)		\
+{									\
+	long long result, val;						\
+	unsigned long tmp;						\
+									\
+	prefetchw(&v->counter);						\
+									\
+	__asm__ __volatile__("@ atomic64_fetch_" #op "\n"		\
+"1:	ldrexd	%0, %H0, [%4]\n"					\
+"	" #op1 " %Q1, %Q0, %Q5\n"					\
+"	" #op2 " %R1, %R0, %R5\n"					\
+"	strexd	%2, %1, %H0, [%4]\n"					\
+"	teq	%2, #0\n"						\
+"	bne	1b"							\
+	: "=&r" (result), "=&r" (val), "=&r" (tmp), "+Qo" (v->counter)	\
+	: "r" (&v->counter), "r" (i)					\
+	: "cc");							\
+									\
+	return result;							\
+}
+
 #define ATOMIC64_OPS(op, op1, op2)					\
 	ATOMIC64_OP(op, op1, op2)					\
-	ATOMIC64_OP_RETURN(op, op1, op2)
+	ATOMIC64_OP_RETURN(op, op1, op2)				\
+	ATOMIC64_FETCH_OP(op, op1, op2)
 
 ATOMIC64_OPS(add, adds, adc)
 ATOMIC64_OPS(sub, subs, sbc)
 
 #define atomic64_add_return_relaxed	atomic64_add_return_relaxed
 #define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+
+#undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, op1, op2)					\
+	ATOMIC64_OP(op, op1, op2)					\
+	ATOMIC64_FETCH_OP(op, op1, op2)
 
 #define atomic64_andnot atomic64_andnot
 
-ATOMIC64_OP(and, and, and)
-ATOMIC64_OP(andnot, bic, bic)
-ATOMIC64_OP(or,  orr, orr)
-ATOMIC64_OP(xor, eor, eor)
+ATOMIC64_OPS(and, and, and)
+ATOMIC64_OPS(andnot, bic, bic)
+ATOMIC64_OPS(or,  orr, orr)
+ATOMIC64_OPS(xor, eor, eor)
+
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
+#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot_relaxed
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
 
 #undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 


* [RFC][PATCH 05/31] locking,arm64: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (3 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 04/31] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22 11:08   ` Will Deacon
  2016-04-22 14:23     ` Will Deacon
  2016-04-22  9:04 ` [RFC][PATCH 06/31] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (26 subsequent siblings)
  31 siblings, 2 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-arm64.patch --]
[-- Type: text/plain, Size: 7360 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

XXX lacking LSE bits
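
(For reference: the ARMv8.1 LSE LDADD/LDCLR/LDEOR/LDSET instructions return
the old value directly, so an LSE implementation would not need a retry loop
at all. A very rough, hypothetical sketch -- not part of this patch:)

	static inline int atomic_fetch_add_relaxed(int i, atomic_t *v)
	{
		int old;

		asm volatile("ldadd	%w[i], %w[old], %[v]"
			     : [old] "=&r" (old), [v] "+Q" (v->counter)
			     : [i] "r" (i));

		return old;
	}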


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arm64/include/asm/atomic.h       |   32 ++++++++++
 arch/arm64/include/asm/atomic_ll_sc.h |  108 ++++++++++++++++++++++++++--------
 2 files changed, 116 insertions(+), 24 deletions(-)

--- a/arch/arm64/include/asm/atomic.h
+++ b/arch/arm64/include/asm/atomic.h
@@ -76,6 +76,36 @@
 #define atomic_dec_return_release(v)	atomic_sub_return_release(1, (v))
 #define atomic_dec_return(v)		atomic_sub_return(1, (v))
 
+#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
+#define atomic_fetch_add_acquire	atomic_fetch_add_acquire
+#define atomic_fetch_add_release	atomic_fetch_add_release
+#define atomic_fetch_add		atomic_fetch_add
+
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+#define atomic_fetch_sub_acquire	atomic_fetch_sub_acquire
+#define atomic_fetch_sub_release	atomic_fetch_sub_release
+#define atomic_fetch_sub		atomic_fetch_sub
+
+#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
+#define atomic_fetch_and_acquire	atomic_fetch_and_acquire
+#define atomic_fetch_and_release	atomic_fetch_and_release
+#define atomic_fetch_and		atomic_fetch_and
+
+#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot_relaxed
+#define atomic_fetch_andnot_acquire	atomic_fetch_andnot_acquire
+#define atomic_fetch_andnot_release	atomic_fetch_andnot_release
+#define atomic_fetch_andnot		atomic_fetch_andnot
+
+#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
+#define atomic_fetch_or_acquire		atomic_fetch_or_acquire
+#define atomic_fetch_or_release		atomic_fetch_or_release
+#define atomic_fetch_or			atomic_fetch_or
+
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
+#define atomic_fetch_xor_acquire	atomic_fetch_xor_acquire
+#define atomic_fetch_xor_release	atomic_fetch_xor_release
+#define atomic_fetch_xor		atomic_fetch_xor
+
 #define atomic_xchg_relaxed(v, new)	xchg_relaxed(&((v)->counter), (new))
 #define atomic_xchg_acquire(v, new)	xchg_acquire(&((v)->counter), (new))
 #define atomic_xchg_release(v, new)	xchg_release(&((v)->counter), (new))
@@ -98,6 +128,8 @@
 #define __atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
 #define atomic_andnot			atomic_andnot
 
+#define atomic_fetch_or atomic_fetch_or
+
 /*
  * 64-bit atomic operations.
  */
--- a/arch/arm64/include/asm/atomic_ll_sc.h
+++ b/arch/arm64/include/asm/atomic_ll_sc.h
@@ -77,25 +77,55 @@ __LL_SC_PREFIX(atomic_##op##_return##nam
 }									\
 __LL_SC_EXPORT(atomic_##op##_return##name);
 
+#define ATOMIC_FETCH_OP(name, mb, acq, rel, cl, op, asm_op)		\
+__LL_SC_INLINE int							\
+__LL_SC_PREFIX(atomic_fetch_##op##name(int i, atomic_t *v))		\
+{									\
+	unsigned long tmp;						\
+	int val, result;						\
+									\
+	asm volatile("// atomic_fetch_" #op #name "\n"			\
+"	prfm	pstl1strm, %3\n"					\
+"1:	ld" #acq "xr	%w0, %3\n"					\
+"	" #asm_op "	%w1, %w0, %w4\n"				\
+"	st" #rel "xr	%w2, %w1, %3\n"					\
+"	cbnz	%w2, 1b\n"						\
+"	" #mb								\
+	: "=&r" (result), "=&r" (val), "=&r" (tmp), "+Q" (v->counter)	\
+	: "Ir" (i)							\
+	: cl);								\
+									\
+	return result;							\
+}									\
+__LL_SC_EXPORT(atomic_fetch_##op##name);
+
 #define ATOMIC_OPS(...)							\
 	ATOMIC_OP(__VA_ARGS__)						\
-	ATOMIC_OP_RETURN(        , dmb ish,  , l, "memory", __VA_ARGS__)
-
-#define ATOMIC_OPS_RLX(...)						\
-	ATOMIC_OPS(__VA_ARGS__)						\
+	ATOMIC_OP_RETURN(        , dmb ish,  , l, "memory", __VA_ARGS__)\
 	ATOMIC_OP_RETURN(_relaxed,        ,  ,  ,         , __VA_ARGS__)\
 	ATOMIC_OP_RETURN(_acquire,        , a,  , "memory", __VA_ARGS__)\
-	ATOMIC_OP_RETURN(_release,        ,  , l, "memory", __VA_ARGS__)
+	ATOMIC_OP_RETURN(_release,        ,  , l, "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (        , dmb ish,  , l, "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_relaxed,        ,  ,  ,         , __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_acquire,        , a,  , "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_release,        ,  , l, "memory", __VA_ARGS__)
 
-ATOMIC_OPS_RLX(add, add)
-ATOMIC_OPS_RLX(sub, sub)
+ATOMIC_OPS(add, add)
+ATOMIC_OPS(sub, sub)
 
-ATOMIC_OP(and, and)
-ATOMIC_OP(andnot, bic)
-ATOMIC_OP(or, orr)
-ATOMIC_OP(xor, eor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(...)							\
+	ATOMIC_OP(__VA_ARGS__)						\
+	ATOMIC_FETCH_OP (        , dmb ish,  , l, "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_relaxed,        ,  ,  ,         , __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_acquire,        , a,  , "memory", __VA_ARGS__)\
+	ATOMIC_FETCH_OP (_release,        ,  , l, "memory", __VA_ARGS__)
+
+ATOMIC_OPS(and, and)
+ATOMIC_OPS(andnot, bic)
+ATOMIC_OPS(or, orr)
+ATOMIC_OPS(xor, eor)
 
-#undef ATOMIC_OPS_RLX
 #undef ATOMIC_OPS
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
@@ -140,25 +170,55 @@ __LL_SC_PREFIX(atomic64_##op##_return##n
 }									\
 __LL_SC_EXPORT(atomic64_##op##_return##name);
 
+#define ATOMIC64_FETCH_OP(name, mb, acq, rel, cl, op, asm_op)		\
+__LL_SC_INLINE long							\
+__LL_SC_PREFIX(atomic64_fetch_##op##name(long i, atomic64_t *v))	\
+{									\
+	long result, val;						\
+	unsigned long tmp;						\
+									\
+	asm volatile("// atomic64_fetch_" #op #name "\n"		\
+"	prfm	pstl1strm, %3\n"					\
+"1:	ld" #acq "xr	%0, %3\n"					\
+"	" #asm_op "	%1, %0, %4\n"					\
+"	st" #rel "xr	%w2, %1, %3\n"					\
+"	cbnz	%w2, 1b\n"						\
+"	" #mb								\
+	: "=&r" (result), "=&r" (val), "=&r" (tmp), "+Q" (v->counter)	\
+	: "Ir" (i)							\
+	: cl);								\
+									\
+	return result;							\
+}									\
+__LL_SC_EXPORT(atomic64_fetch_##op##name);
+
 #define ATOMIC64_OPS(...)						\
 	ATOMIC64_OP(__VA_ARGS__)					\
-	ATOMIC64_OP_RETURN(, dmb ish,  , l, "memory", __VA_ARGS__)
-
-#define ATOMIC64_OPS_RLX(...)						\
-	ATOMIC64_OPS(__VA_ARGS__)					\
+	ATOMIC64_OP_RETURN(, dmb ish,  , l, "memory", __VA_ARGS__)	\
 	ATOMIC64_OP_RETURN(_relaxed,,  ,  ,         , __VA_ARGS__)	\
 	ATOMIC64_OP_RETURN(_acquire,, a,  , "memory", __VA_ARGS__)	\
-	ATOMIC64_OP_RETURN(_release,,  , l, "memory", __VA_ARGS__)
+	ATOMIC64_OP_RETURN(_release,,  , l, "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (, dmb ish,  , l, "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_relaxed,,  ,  ,         , __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_acquire,, a,  , "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_release,,  , l, "memory", __VA_ARGS__)
 
-ATOMIC64_OPS_RLX(add, add)
-ATOMIC64_OPS_RLX(sub, sub)
+ATOMIC64_OPS(add, add)
+ATOMIC64_OPS(sub, sub)
 
-ATOMIC64_OP(and, and)
-ATOMIC64_OP(andnot, bic)
-ATOMIC64_OP(or, orr)
-ATOMIC64_OP(xor, eor)
+#undef ATOMIC64_OPS
+#define ATOMIC64_OPS(...)						\
+	ATOMIC64_OP(__VA_ARGS__)					\
+	ATOMIC64_FETCH_OP (, dmb ish,  , l, "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_relaxed,,  ,  ,         , __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_acquire,, a,  , "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (_release,,  , l, "memory", __VA_ARGS__)
+
+ATOMIC64_OPS(and, and)
+ATOMIC64_OPS(andnot, bic)
+ATOMIC64_OPS(or, orr)
+ATOMIC64_OPS(xor, eor)
 
-#undef ATOMIC64_OPS_RLX
 #undef ATOMIC64_OPS
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP


* [RFC][PATCH 06/31] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (4 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 05/31] locking,arm64: " Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22 11:58   ` Hans-Christian Noren Egtvedt
  2016-04-22  9:04 ` [RFC][PATCH 07/31] locking,blackfin: " Peter Zijlstra
                   ` (25 subsequent siblings)
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-avr32.patch --]
[-- Type: text/plain, Size: 2761 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
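
As with the existing __atomic_add_return()/__atomic_sub_return() pair, the
avr32 sub instruction accepts a 21-bit signed immediate ("rKs21") while add
only takes a register ("r"), so constant additions are folded into a
subtraction of the negated constant. Illustrative behaviour (comments are
mine, not part of the patch):

	atomic_fetch_add(4, &v);	/* IS_21BIT_CONST(4): __atomic_fetch_sub(-4, &v) */
	atomic_fetch_add(i, &v);	/* variable amount:   __atomic_fetch_add(i, &v)  */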



Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/avr32/include/asm/atomic.h |   56 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 51 insertions(+), 5 deletions(-)

--- a/arch/avr32/include/asm/atomic.h
+++ b/arch/avr32/include/asm/atomic.h
@@ -41,21 +41,51 @@ static inline int __atomic_##op##_return
 	return result;							\
 }
 
+#define ATOMIC_FETCH_OP(op, asm_op, asm_con)				\
+static inline int __atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	int result, val;						\
+									\
+	asm volatile(							\
+		"/* atomic_fetch_" #op " */\n"				\
+		"1:	ssrf	5\n"					\
+		"	ld.w	%0, %3\n"				\
+		"	mov	%1, %0\n"				\
+		"	" #asm_op "	%1, %4\n"			\
+		"	stcond	%2, %1\n"				\
+		"	brne	1b"					\
+		: "=&r" (result), "=&r" (val), "=o" (v->counter)	\
+		: "m" (v->counter), #asm_con (i)			\
+		: "cc");						\
+									\
+	return result;							\
+}
+
 ATOMIC_OP_RETURN(sub, sub, rKs21)
 ATOMIC_OP_RETURN(add, add, r)
+ATOMIC_FETCH_OP (sub, sub, rKs21)
+ATOMIC_FETCH_OP (add, add, r)
 
-#define ATOMIC_OP(op, asm_op)						\
+#define atomic_fetch_or atomic_fetch_or
+
+#define ATOMIC_OPS(op, asm_op)						\
 ATOMIC_OP_RETURN(op, asm_op, r)						\
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
 	(void)__atomic_##op##_return(i, v);				\
+}									\
+ATOMIC_FETCH_OP(op, asm_op, r)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	return __atomic_fetch_##op(i, v);				\
 }
 
-ATOMIC_OP(and, and)
-ATOMIC_OP(or, or)
-ATOMIC_OP(xor, eor)
+ATOMIC_OPS(and, and)
+ATOMIC_OPS(or, or)
+ATOMIC_OPS(xor, eor)
 
-#undef ATOMIC_OP
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 
 /*
@@ -87,6 +117,14 @@ static inline int atomic_add_return(int
 	return __atomic_add_return(i, v);
 }
 
+static inline int atomic_fetch_add(int i, atomic_t *v)
+{
+	if (IS_21BIT_CONST(i))
+		return __atomic_fetch_sub(-i, v);
+
+	return __atomic_fetch_add(i, v);
+}
+
 /*
  * atomic_sub_return - subtract the atomic variable
  * @i: integer value to subtract
@@ -102,6 +140,14 @@ static inline int atomic_sub_return(int
 	return __atomic_add_return(-i, v);
 }
 
+static inline int atomic_fetch_sub(int i, atomic_t *v)
+{
+	if (IS_21BIT_CONST(i))
+		return __atomic_fetch_sub(i, v);
+
+	return __atomic_fetch_add(-i, v);
+}
+
 /*
  * __atomic_add_unless - add unless the number is a given value
  * @v: pointer of type atomic_t


* [RFC][PATCH 07/31] locking,blackfin: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (5 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 06/31] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 08/31] locking,frv: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-blackfin.patch --]
[-- Type: text/plain, Size: 3585 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/blackfin/include/asm/atomic.h |    8 ++++++
 arch/blackfin/kernel/bfin_ksyms.c  |    1 
 arch/blackfin/mach-bf561/atomic.S  |   43 ++++++++++++++++++++++++++-----------
 3 files changed, 40 insertions(+), 12 deletions(-)

--- a/arch/blackfin/include/asm/atomic.h
+++ b/arch/blackfin/include/asm/atomic.h
@@ -17,6 +17,7 @@
 
 asmlinkage int __raw_uncached_fetch_asm(const volatile int *ptr);
 asmlinkage int __raw_atomic_add_asm(volatile int *ptr, int value);
+asmlinkage int __raw_atomic_xadd_asm(volatile int *ptr, int value);
 
 asmlinkage int __raw_atomic_and_asm(volatile int *ptr, int value);
 asmlinkage int __raw_atomic_or_asm(volatile int *ptr, int value);
@@ -28,10 +29,17 @@ asmlinkage int __raw_atomic_test_asm(con
 #define atomic_add_return(i, v) __raw_atomic_add_asm(&(v)->counter, i)
 #define atomic_sub_return(i, v) __raw_atomic_add_asm(&(v)->counter, -(i))
 
+#define atomic_fetch_add(i, v) __raw_atomic_xadd_asm(&(v)->counter, i)
+#define atomic_fetch_sub(i, v) __raw_atomic_xadd_asm(&(v)->counter, -(i))
+
 #define atomic_or(i, v)  (void)__raw_atomic_or_asm(&(v)->counter, i)
 #define atomic_and(i, v) (void)__raw_atomic_and_asm(&(v)->counter, i)
 #define atomic_xor(i, v) (void)__raw_atomic_xor_asm(&(v)->counter, i)
 
+#define atomic_fetch_or(i, v)  __raw_atomic_or_asm(&(v)->counter, i)
+#define atomic_fetch_and(i, v) __raw_atomic_and_asm(&(v)->counter, i)
+#define atomic_fetch_xor(i, v) __raw_atomic_xor_asm(&(v)->counter, i)
+
 #endif
 
 #include <asm-generic/atomic.h>
--- a/arch/blackfin/kernel/bfin_ksyms.c
+++ b/arch/blackfin/kernel/bfin_ksyms.c
@@ -84,6 +84,7 @@ EXPORT_SYMBOL(insl_16);
 
 #ifdef CONFIG_SMP
 EXPORT_SYMBOL(__raw_atomic_add_asm);
+EXPORT_SYMBOL(__raw_atomic_xadd_asm);
 EXPORT_SYMBOL(__raw_atomic_and_asm);
 EXPORT_SYMBOL(__raw_atomic_or_asm);
 EXPORT_SYMBOL(__raw_atomic_xor_asm);
--- a/arch/blackfin/mach-bf561/atomic.S
+++ b/arch/blackfin/mach-bf561/atomic.S
@@ -607,6 +607,28 @@ ENDPROC(___raw_atomic_add_asm)
 
 /*
  * r0 = ptr
+ * r1 = value
+ *
+ * ADD a signed value to a 32bit word and return the old value atomically.
+ * Clobbers: r3:0, p1:0
+ */
+ENTRY(___raw_atomic_xadd_asm)
+	p1 = r0;
+	r3 = r1;
+	[--sp] = rets;
+	call _get_core_lock;
+	r3 = [p1];
+	r2 = r3 + r2;
+	[p1] = r2;
+	r1 = p1;
+	call _put_core_lock;
+	r0 = r3;
+	rets = [sp++];
+	rts;
+ENDPROC(___raw_atomic_xadd_asm)
+
+/*
+ * r0 = ptr
  * r1 = mask
  *
  * AND the mask bits from a 32bit word and return the old 32bit value
@@ -618,10 +640,9 @@ ENTRY(___raw_atomic_and_asm)
 	r3 = r1;
 	[--sp] = rets;
 	call _get_core_lock;
-	r2 = [p1];
-	r3 = r2 & r3;
-	[p1] = r3;
-	r3 = r2;
+	r3 = [p1];
+	r2 = r2 & r3;
+	[p1] = r2;
 	r1 = p1;
 	call _put_core_lock;
 	r0 = r3;
@@ -642,10 +663,9 @@ ENTRY(___raw_atomic_or_asm)
 	r3 = r1;
 	[--sp] = rets;
 	call _get_core_lock;
-	r2 = [p1];
-	r3 = r2 | r3;
-	[p1] = r3;
-	r3 = r2;
+	r3 = [p1];
+	r2 = r2 | r3;
+	[p1] = r2;
 	r1 = p1;
 	call _put_core_lock;
 	r0 = r3;
@@ -666,10 +686,9 @@ ENTRY(___raw_atomic_xor_asm)
 	r3 = r1;
 	[--sp] = rets;
 	call _get_core_lock;
-	r2 = [p1];
-	r3 = r2 ^ r3;
-	[p1] = r3;
-	r3 = r2;
+	r3 = [p1];
+	r2 = r2 ^ r3;
+	[p1] = r2;
 	r1 = p1;
 	call _put_core_lock;
 	r0 = r3;


* [RFC][PATCH 08/31] locking,frv: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (6 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 07/31] locking,blackfin: " Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 09/31] locking,h8300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-frv.patch --]
[-- Type: text/plain, Size: 2773 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/frv/include/asm/atomic.h      |   32 ++++++++++++--------------------
 arch/frv/include/asm/atomic_defs.h |    2 ++
 2 files changed, 14 insertions(+), 20 deletions(-)

--- a/arch/frv/include/asm/atomic.h
+++ b/arch/frv/include/asm/atomic.h
@@ -60,16 +60,6 @@ static inline int atomic_add_negative(in
 	return atomic_add_return(i, v) < 0;
 }
 
-static inline void atomic_add(int i, atomic_t *v)
-{
-	atomic_add_return(i, v);
-}
-
-static inline void atomic_sub(int i, atomic_t *v)
-{
-	atomic_sub_return(i, v);
-}
-
 static inline void atomic_inc(atomic_t *v)
 {
 	atomic_inc_return(v);
@@ -84,6 +74,8 @@ static inline void atomic_dec(atomic_t *
 #define atomic_dec_and_test(v)		(atomic_sub_return(1, (v)) == 0)
 #define atomic_inc_and_test(v)		(atomic_add_return(1, (v)) == 0)
 
+#define atomic_fetch_or atomic_fetch_or
+
 /*
  * 64-bit atomic ops
  */
@@ -136,16 +128,6 @@ static inline long long atomic64_add_neg
 	return atomic64_add_return(i, v) < 0;
 }
 
-static inline void atomic64_add(long long i, atomic64_t *v)
-{
-	atomic64_add_return(i, v);
-}
-
-static inline void atomic64_sub(long long i, atomic64_t *v)
-{
-	atomic64_sub_return(i, v);
-}
-
 static inline void atomic64_inc(atomic64_t *v)
 {
 	atomic64_inc_return(v);
@@ -182,11 +164,19 @@ static __inline__ int __atomic_add_unles
 }
 
 #define ATOMIC_OP(op)							\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	return __atomic32_fetch_##op(i, &v->counter);			\
+}									\
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
 	(void)__atomic32_fetch_##op(i, &v->counter);			\
 }									\
 									\
+static inline long long atomic64_fetch_##op(long long i, atomic64_t *v)	\
+{									\
+	return __atomic64_fetch_##op(i, &v->counter);			\
+}									\
 static inline void atomic64_##op(long long i, atomic64_t *v)		\
 {									\
 	(void)__atomic64_fetch_##op(i, &v->counter);			\
@@ -195,6 +185,8 @@ static inline void atomic64_##op(long lo
 ATOMIC_OP(or)
 ATOMIC_OP(and)
 ATOMIC_OP(xor)
+ATOMIC_OP(add)
+ATOMIC_OP(sub)
 
 #undef ATOMIC_OP
 
--- a/arch/frv/include/asm/atomic_defs.h
+++ b/arch/frv/include/asm/atomic_defs.h
@@ -162,6 +162,8 @@ ATOMIC_EXPORT(__atomic64_fetch_##op);
 ATOMIC_FETCH_OP(or)
 ATOMIC_FETCH_OP(and)
 ATOMIC_FETCH_OP(xor)
+ATOMIC_FETCH_OP(add)
+ATOMIC_FETCH_OP(sub)
 
 ATOMIC_OP_RETURN(add)
 ATOMIC_OP_RETURN(sub)


* [RFC][PATCH 09/31] locking,h8300: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (7 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 08/31] locking,frv: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 10/31] locking,hexagon: " Peter Zijlstra
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-h8300.patch --]
[-- Type: text/plain, Size: 1918 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/h8300/include/asm/atomic.h |   31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

--- a/arch/h8300/include/asm/atomic.h
+++ b/arch/h8300/include/asm/atomic.h
@@ -28,6 +28,19 @@ static inline int atomic_##op##_return(i
 	return ret;						\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)		\
+{								\
+	h8300flags flags;					\
+	int ret;						\
+								\
+	flags = arch_local_irq_save();				\
+	ret = v->counter;					\
+       	v->counter c_op i;					\
+	arch_local_irq_restore(flags);				\
+	return ret;						\
+}
+
 #define ATOMIC_OP(op, c_op)					\
 static inline void atomic_##op(int i, atomic_t *v)		\
 {								\
@@ -41,17 +54,23 @@ static inline void atomic_##op(int i, at
 ATOMIC_OP_RETURN(add, +=)
 ATOMIC_OP_RETURN(sub, -=)
 
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or,  |=)
-ATOMIC_OP(xor, ^=)
+#define atomic_fetch_or atomic_fetch_or
 
+#define ATOMIC_OPS(op, c_op)					\
+	ATOMIC_OP(op, c_op)					\
+	ATOMIC_FETCH_OP(op, c_op)
+
+ATOMIC_OPS(and, &=)
+ATOMIC_OPS(or,  |=)
+ATOMIC_OPS(xor, ^=)
+ATOMIC_OPS(add, +=)
+ATOMIC_OPS(sub, -=)
+
+#undef ATOMIC_OPS
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
-#define atomic_add(i, v)		(void)atomic_add_return(i, v)
 #define atomic_add_negative(a, v)	(atomic_add_return((a), (v)) < 0)
-
-#define atomic_sub(i, v)		(void)atomic_sub_return(i, v)
 #define atomic_sub_and_test(i, v)	(atomic_sub_return(i, v) == 0)
 
 #define atomic_inc_return(v)		atomic_add_return(1, v)


* [RFC][PATCH 10/31] locking,hexagon: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (8 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 09/31] locking,h8300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-23  2:16   ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 11/31] locking,ia64: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (21 subsequent siblings)
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-hexagon.patch --]
[-- Type: text/plain, Size: 1915 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/hexagon/include/asm/atomic.h |   33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

--- a/arch/hexagon/include/asm/atomic.h
+++ b/arch/hexagon/include/asm/atomic.h
@@ -110,7 +110,7 @@ static inline void atomic_##op(int i, at
 	);								\
 }									\
 
-#define ATOMIC_OP_RETURN(op)							\
+#define ATOMIC_OP_RETURN(op)						\
 static inline int atomic_##op##_return(int i, atomic_t *v)		\
 {									\
 	int output;							\
@@ -127,16 +127,39 @@ static inline int atomic_##op##_return(i
 	return output;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int output, val;						\
+									\
+	__asm__ __volatile__ (						\
+		"1:	%0 = memw_locked(%2);\n"			\
+		"	%1 = "#op "(%0,%3);\n"				\
+		"	memw_locked(%2,P3)=%0;\n"			\
+		"	if !P3 jump 1b;\n"				\
+		: "=&r" (output), "=&r" (val)				\
+		: "r" (&v->counter), "r" (i)				\
+		: "memory", "p3"					\
+	);								\
+	return output;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 


* [RFC][PATCH 11/31] locking,ia64: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (9 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 10/31] locking,hexagon: " Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 12/31] locking,m32r: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-ia64.patch --]
[-- Type: text/plain, Size: 5705 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
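
The ia64 fetchadd instruction only handles a handful of immediate increments,
hence the __builtin_constant_p() checks below. Illustrative behaviour
(comments are mine, not part of the patch):

	atomic_fetch_add(8, &v);	/* constant +-1, +-4, +-8 or +-16:
					 * a single ia64_fetchadd()		*/
	atomic_fetch_add(3, &v);	/* anything else: ia64_atomic_fetch_add()
					 * cmpxchg-loop fallback		*/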

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/ia64/include/asm/atomic.h |  134 +++++++++++++++++++++++++++++++++++------
 1 file changed, 116 insertions(+), 18 deletions(-)

--- a/arch/ia64/include/asm/atomic.h
+++ b/arch/ia64/include/asm/atomic.h
@@ -42,8 +42,27 @@ ia64_atomic_##op (int i, atomic_t *v)
 	return new;							\
 }
 
-ATOMIC_OP(add, +)
-ATOMIC_OP(sub, -)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static __inline__ int							\
+ia64_atomic_fetch_##op (int i, atomic_t *v)				\
+{									\
+	__s32 old, new;							\
+	CMPXCHG_BUGCHECK_DECL						\
+									\
+	do {								\
+		CMPXCHG_BUGCHECK(v);					\
+		old = atomic_read(v);					\
+		new = old c_op i;					\
+	} while (ia64_cmpxchg(acq, v, old, new, sizeof(atomic_t)) != old); \
+	return old;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_FETCH_OP(op, c_op)
+
+ATOMIC_OPS(add, +)
+ATOMIC_OPS(sub, -)
 
 #define atomic_add_return(i,v)						\
 ({									\
@@ -69,14 +88,44 @@ ATOMIC_OP(sub, -)
 		: ia64_atomic_sub(__ia64_asr_i, v);			\
 })
 
-ATOMIC_OP(and, &)
-ATOMIC_OP(or, |)
-ATOMIC_OP(xor, ^)
-
-#define atomic_and(i,v)	(void)ia64_atomic_and(i,v)
-#define atomic_or(i,v)	(void)ia64_atomic_or(i,v)
-#define atomic_xor(i,v)	(void)ia64_atomic_xor(i,v)
+#define atomic_fetch_add(i,v)						\
+({									\
+	int __ia64_aar_i = (i);						\
+	(__builtin_constant_p(i)					\
+	 && (   (__ia64_aar_i ==  1) || (__ia64_aar_i ==   4)		\
+	     || (__ia64_aar_i ==  8) || (__ia64_aar_i ==  16)		\
+	     || (__ia64_aar_i == -1) || (__ia64_aar_i ==  -4)		\
+	     || (__ia64_aar_i == -8) || (__ia64_aar_i == -16)))		\
+		? ia64_fetchadd(__ia64_aar_i, &(v)->counter, acq)	\
+		: ia64_atomic_fetch_add(__ia64_aar_i, v);		\
+})
+
+#define atomic_fetch_sub(i,v)						\
+({									\
+	int __ia64_asr_i = (i);						\
+	(__builtin_constant_p(i)					\
+	 && (   (__ia64_asr_i ==   1) || (__ia64_asr_i ==   4)		\
+	     || (__ia64_asr_i ==   8) || (__ia64_asr_i ==  16)		\
+	     || (__ia64_asr_i ==  -1) || (__ia64_asr_i ==  -4)		\
+	     || (__ia64_asr_i ==  -8) || (__ia64_asr_i == -16)))	\
+		? ia64_fetchadd(-__ia64_asr_i, &(v)->counter, acq)	\
+		: ia64_atomic_fetch_sub(__ia64_asr_i, v);		\
+})
 
+ATOMIC_FETCH_OP(and, &)
+ATOMIC_FETCH_OP(or, |)
+ATOMIC_FETCH_OP(xor, ^)
+
+#define atomic_and(i,v)	(void)ia64_atomic_fetch_and(i,v)
+#define atomic_or(i,v)	(void)ia64_atomic_fetch_or(i,v)
+#define atomic_xor(i,v)	(void)ia64_atomic_fetch_xor(i,v)
+
+#define atomic_fetch_and(i,v)	ia64_atomic_fetch_and(i,v)
+#define atomic_fetch_or(i,v)	ia64_atomic_fetch_or(i,v)
+#define atomic_fetch_xor(i,v)	ia64_atomic_fetch_xor(i,v)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP
 
 #define ATOMIC64_OP(op, c_op)						\
@@ -94,8 +143,27 @@ ia64_atomic64_##op (__s64 i, atomic64_t
 	return new;							\
 }
 
-ATOMIC64_OP(add, +)
-ATOMIC64_OP(sub, -)
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static __inline__ long							\
+ia64_atomic64_fetch_##op (__s64 i, atomic64_t *v)			\
+{									\
+	__s64 old, new;							\
+	CMPXCHG_BUGCHECK_DECL						\
+									\
+	do {								\
+		CMPXCHG_BUGCHECK(v);					\
+		old = atomic64_read(v);					\
+		new = old c_op i;					\
+	} while (ia64_cmpxchg(acq, v, old, new, sizeof(atomic64_t)) != old); \
+	return old;							\
+}
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(add, +)
+ATOMIC64_OPS(sub, -)
 
 #define atomic64_add_return(i,v)					\
 ({									\
@@ -121,14 +189,44 @@ ATOMIC64_OP(sub, -)
 		: ia64_atomic64_sub(__ia64_asr_i, v);			\
 })
 
-ATOMIC64_OP(and, &)
-ATOMIC64_OP(or, |)
-ATOMIC64_OP(xor, ^)
-
-#define atomic64_and(i,v)	(void)ia64_atomic64_and(i,v)
-#define atomic64_or(i,v)	(void)ia64_atomic64_or(i,v)
-#define atomic64_xor(i,v)	(void)ia64_atomic64_xor(i,v)
+#define atomic64_fetch_add(i,v)						\
+({									\
+	long __ia64_aar_i = (i);					\
+	(__builtin_constant_p(i)					\
+	 && (   (__ia64_aar_i ==  1) || (__ia64_aar_i ==   4)		\
+	     || (__ia64_aar_i ==  8) || (__ia64_aar_i ==  16)		\
+	     || (__ia64_aar_i == -1) || (__ia64_aar_i ==  -4)		\
+	     || (__ia64_aar_i == -8) || (__ia64_aar_i == -16)))		\
+		? ia64_fetchadd(__ia64_aar_i, &(v)->counter, acq)	\
+		: ia64_atomic64_fetch_add(__ia64_aar_i, v);		\
+})
+
+#define atomic64_fetch_sub(i,v)						\
+({									\
+	long __ia64_asr_i = (i);					\
+	(__builtin_constant_p(i)					\
+	 && (   (__ia64_asr_i ==   1) || (__ia64_asr_i ==   4)		\
+	     || (__ia64_asr_i ==   8) || (__ia64_asr_i ==  16)		\
+	     || (__ia64_asr_i ==  -1) || (__ia64_asr_i ==  -4)		\
+	     || (__ia64_asr_i ==  -8) || (__ia64_asr_i == -16)))	\
+		? ia64_fetchadd(-__ia64_asr_i, &(v)->counter, acq)	\
+		: ia64_atomic64_fetch_sub(__ia64_asr_i, v);		\
+})
+
+ATOMIC64_FETCH_OP(and, &)
+ATOMIC64_FETCH_OP(or, |)
+ATOMIC64_FETCH_OP(xor, ^)
+
+#define atomic64_and(i,v)	(void)ia64_atomic64_fetch_and(i,v)
+#define atomic64_or(i,v)	(void)ia64_atomic64_fetch_or(i,v)
+#define atomic64_xor(i,v)	(void)ia64_atomic64_fetch_xor(i,v)
+
+#define atomic64_fetch_and(i,v)	ia64_atomic64_fetch_and(i,v)
+#define atomic64_fetch_or(i,v)	ia64_atomic64_fetch_or(i,v)
+#define atomic64_fetch_xor(i,v)	ia64_atomic64_fetch_xor(i,v)
 
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP
 
 #define atomic_cmpxchg(v, old, new) (cmpxchg(&((v)->counter), old, new))


* [RFC][PATCH 12/31] locking,m32r: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (10 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 11/31] locking,ia64: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 13/31] locking,m68k: " Peter Zijlstra
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-m32r.patch --]
[-- Type: text/plain, Size: 1848 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
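
As an aside, a minimal userspace sketch of the difference (C11
<stdatomic.h>, not the kernel API, and not part of this patch):

	#include <stdatomic.h>
	#include <stdio.h>

	int main(void)
	{
		atomic_int v = 0x3;

		/* atomic_fetch_or() hands back the value *before* the OR... */
		int old = atomic_fetch_or(&v, 0x4);	/* old == 0x3 */

		/* ...whereas an or-and-return primitive would yield 0x7, from
		 * which the original 0x3 cannot be reconstructed. */
		printf("old=%#x new=%#x\n", old, atomic_load(&v));
		return 0;
	}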

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/m32r/include/asm/atomic.h |   38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

--- a/arch/m32r/include/asm/atomic.h
+++ b/arch/m32r/include/asm/atomic.h
@@ -89,16 +89,46 @@ static __inline__ int atomic_##op##_retu
 	return result;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static __inline__ int atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	unsigned long flags;						\
+	int result, val;						\
+									\
+	local_irq_save(flags);						\
+	__asm__ __volatile__ (						\
+		"# atomic_fetch_" #op "		\n\t"			\
+		DCACHE_CLEAR("%0", "r4", "%2")				\
+		M32R_LOCK" %1, @%2;		\n\t"			\
+		"mv %0, %1			\n\t" 			\
+		#op " %1, %3;			\n\t"			\
+		M32R_UNLOCK" %1, @%2;		\n\t"			\
+		: "=&r" (result), "=&r" (val)				\
+		: "r" (&v->counter), "r" (i)				\
+		: "memory"						\
+		__ATOMIC_CLOBBER					\
+	);								\
+	local_irq_restore(flags);					\
+									\
+	return result;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 13/31] locking,m68k: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (11 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 12/31] locking,m32r: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 14/31] locking,metag: " Peter Zijlstra
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-m68k.patch --]
[-- Type: text/plain, Size: 2717 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.
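
For reference, a hedged sketch (C11, not the casl asm below) of the same
CAS-loop shape, seeded with a plain read so the first compare starts
from an initialized expected value:

	#include <stdatomic.h>

	static int fetch_and_cas(atomic_int *v, int i)
	{
		int old = atomic_load_explicit(v, memory_order_relaxed);

		/* a failed CAS reloads 'old' with the current value */
		while (!atomic_compare_exchange_weak(v, &old, old & i))
			;
		return old;	/* value before the AND */
	}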

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/m68k/include/asm/atomic.h |   53 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 49 insertions(+), 4 deletions(-)

--- a/arch/m68k/include/asm/atomic.h
+++ b/arch/m68k/include/asm/atomic.h
@@ -38,6 +38,13 @@ static inline void atomic_##op(int i, at
 
 #ifdef CONFIG_RMW_INSNS
 
+/*
+ * Am I reading these CAS loops right in that %2 is the old value and the first
+ * iteration uses an uninitialized value?
+ *
+ * Would it not make sense to add: tmp = atomic_read(v); to avoid this?
+ */
+
 #define ATOMIC_OP_RETURN(op, c_op, asm_op)				\
 static inline int atomic_##op##_return(int i, atomic_t *v)		\
 {									\
@@ -53,6 +60,21 @@ static inline int atomic_##op##_return(i
 	return t;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int t, tmp;							\
+									\
+	__asm__ __volatile__(						\
+			"1:	movel %2,%1\n"				\
+			"	" #asm_op "l %3,%1\n"			\
+			"	casl %2,%1,%0\n"			\
+			"	jne 1b"					\
+			: "+m" (*v), "=&d" (t), "=&d" (tmp)		\
+			: "g" (i), "2" (atomic_read(v)));		\
+	return tmp;							\
+}
+
 #else
 
 #define ATOMIC_OP_RETURN(op, c_op, asm_op)				\
@@ -68,20 +90,43 @@ static inline int atomic_##op##_return(i
 	return t;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t * v)		\
+{									\
+	unsigned long flags;						\
+	int t;								\
+									\
+	local_irq_save(flags);						\
+	t = v->counter;							\
+	v->counter c_op i;						\
+	local_irq_restore(flags);					\
+									\
+	return t;							\
+}
+
 #endif /* CONFIG_RMW_INSNS */
 
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, add)
 ATOMIC_OPS(sub, -=, sub)
 
-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(or, |=, or)
-ATOMIC_OP(xor, ^=, eor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					\
+	ATOMIC_OP(op, c_op, asm_op)					\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(or, |=, or)
+ATOMIC_OPS(xor, ^=, eor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 14/31] locking,metag: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (12 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 13/31] locking,m68k: " Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-30  0:20     ` James Hogan
  2016-04-22  9:04 ` [RFC][PATCH 15/31] locking,mips: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (17 subsequent siblings)
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-metag.patch --]
[-- Type: text/plain, Size: 3332 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.
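
To illustrate the bitops angle, a hedged sketch (C11, with made-up
names; not part of this patch) of a test-and-set style bit lock, which
only works because the old value is returned:

	#include <stdatomic.h>
	#include <stdbool.h>

	#define LOCK_BIT 0x1u

	static bool bit_trylock(atomic_uint *word)
	{
		/* we own the lock iff the bit was clear before we set it */
		return !(atomic_fetch_or(word, LOCK_BIT) & LOCK_BIT);
	}

	static void bit_unlock(atomic_uint *word)
	{
		atomic_fetch_and(word, ~LOCK_BIT);
	}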

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/metag/include/asm/atomic.h        |    2 +
 arch/metag/include/asm/atomic_lnkget.h |   36 +++++++++++++++++++++++++++++----
 arch/metag/include/asm/atomic_lock1.h  |   33 ++++++++++++++++++++++++++----
 3 files changed, 63 insertions(+), 8 deletions(-)

--- a/arch/metag/include/asm/atomic.h
+++ b/arch/metag/include/asm/atomic.h
@@ -17,6 +17,8 @@
 #include <asm/atomic_lnkget.h>
 #endif
 
+#define atomic_fetch_or atomic_fetch_or
+
 #define atomic_add_negative(a, v)       (atomic_add_return((a), (v)) < 0)
 
 #define atomic_dec_return(v) atomic_sub_return(1, (v))
--- a/arch/metag/include/asm/atomic_lnkget.h
+++ b/arch/metag/include/asm/atomic_lnkget.h
@@ -69,16 +69,44 @@ static inline int atomic_##op##_return(i
 	return result;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int result, temp;						\
+									\
+	smp_mb();							\
+									\
+	asm volatile (							\
+		"1:	LNKGETD %1, [%2]\n"				\
+		"	" #op "	%0, %1, %3\n"				\
+		"	LNKSETD [%2], %0\n"				\
+		"	DEFR	%0, TXSTAT\n"				\
+		"	ANDT	%0, %0, #HI(0x3f000000)\n"		\
+		"	CMPT	%0, #HI(0x02000000)\n"			\
+		"	BNZ 1b\n"					\
+		: "=&d" (temp), "=&da" (result)				\
+		: "da" (&v->counter), "bd" (i)				\
+		: "cc");						\
+									\
+	smp_mb();							\
+									\
+	return result;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/metag/include/asm/atomic_lock1.h
+++ b/arch/metag/include/asm/atomic_lock1.h
@@ -64,15 +64,40 @@ static inline int atomic_##op##_return(i
 	return result;							\
 }
 
-#define ATOMIC_OPS(op, c_op) ATOMIC_OP(op, c_op) ATOMIC_OP_RETURN(op, c_op)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long result;						\
+	unsigned long flags;						\
+									\
+	__global_lock1(flags);						\
+	result = v->counter;						\
+	fence();							\
+	v->counter c_op i;						\
+	__global_unlock1(flags);					\
+									\
+	return result;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_OP_RETURN(op, c_op)					\
+	ATOMIC_FETCH_OP(op, c_op)
 
 ATOMIC_OPS(add, +=)
 ATOMIC_OPS(sub, -=)
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or, |=)
-ATOMIC_OP(xor, ^=)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_FETCH_OP(op, c_op)
+
+ATOMIC_OPS(and, &=)
+ATOMIC_OPS(or, |=)
+ATOMIC_OPS(xor, ^=)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 15/31] locking,mips: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (13 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 14/31] locking,metag: " Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 16/31] locking,mn10300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-mips.patch --]
[-- Type: text/plain, Size: 5948 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.
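
As a reference point (gcc builtins, not part of this patch), the
compiler draws the same old/new distinction:

	#include <stdio.h>

	int main(void)
	{
		long x = 40;

		/* value before the add */
		long old = __atomic_fetch_add(&x, 2, __ATOMIC_SEQ_CST);
		/* value after the add */
		long new = __atomic_add_fetch(&x, 2, __ATOMIC_SEQ_CST);

		printf("old=%ld new=%ld x=%ld\n", old, new, x); /* 40 44 44 */
		return 0;
	}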

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/mips/include/asm/atomic.h |  138 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 129 insertions(+), 9 deletions(-)

--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -66,7 +66,7 @@ static __inline__ void atomic_##op(int i
 			"	" #asm_op " %0, %2			\n"   \
 			"	sc	%0, %1				\n"   \
 			"	.set	mips0				\n"   \
-			: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)      \
+			: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)  \
 			: "Ir" (i));					      \
 		} while (unlikely(!temp));				      \
 	} else {							      \
@@ -130,18 +130,78 @@ static __inline__ int atomic_##op##_retu
 	return result;							      \
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				      \
+static __inline__ int atomic_fetch_##op(int i, atomic_t * v)		      \
+{									      \
+	int result;							      \
+									      \
+	smp_mb__before_llsc();						      \
+									      \
+	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+		int temp;						      \
+									      \
+		__asm__ __volatile__(					      \
+		"	.set	arch=r4000				\n"   \
+		"1:	ll	%1, %2		# atomic_fetch_" #op "	\n"   \
+		"	" #asm_op " %0, %1, %3				\n"   \
+		"	sc	%0, %2					\n"   \
+		"	beqzl	%0, 1b					\n"   \
+		"	move	%0, %1					\n"   \
+		"	.set	mips0					\n"   \
+		: "=&r" (result), "=&r" (temp),				      \
+		  "+" GCC_OFF_SMALL_ASM() (v->counter)			      \
+		: "Ir" (i));						      \
+	} else if (kernel_uses_llsc) {					      \
+		int temp;						      \
+									      \
+		do {							      \
+			__asm__ __volatile__(				      \
+			"	.set	"MIPS_ISA_LEVEL"		\n"   \
+			"	ll	%1, %2	# atomic_fetch_" #op "	\n"   \
+			"	" #asm_op " %0, %1, %3			\n"   \
+			"	sc	%0, %2				\n"   \
+			"	.set	mips0				\n"   \
+			: "=&r" (result), "=&r" (temp),			      \
+			  "+" GCC_OFF_SMALL_ASM() (v->counter)		      \
+			: "Ir" (i));					      \
+		} while (unlikely(!result));				      \
+									      \
+		result = temp;						      \
+	} else {							      \
+		unsigned long flags;					      \
+									      \
+		raw_local_irq_save(flags);				      \
+		result = v->counter;					      \
+		v->counter c_op i;					      \
+		raw_local_irq_restore(flags);				      \
+	}								      \
+									      \
+	smp_llsc_mb();							      \
+									      \
+	return result;							      \
+}
+
 #define ATOMIC_OPS(op, c_op, asm_op)					      \
 	ATOMIC_OP(op, c_op, asm_op)					      \
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				      \
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, addu)
 ATOMIC_OPS(sub, -=, subu)
 
-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(or, |=, or)
-ATOMIC_OP(xor, ^=, xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					      \
+	ATOMIC_OP(op, c_op, asm_op)					      \
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(or, |=, or)
+ATOMIC_OPS(xor, ^=, xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -414,17 +474,77 @@ static __inline__ long atomic64_##op##_r
 	return result;							      \
 }
 
+#define ATOMIC64_FETCH_OP(op, c_op, asm_op)				      \
+static __inline__ long atomic64_fetch_##op(long i, atomic64_t * v)	      \
+{									      \
+	long result;							      \
+									      \
+	smp_mb__before_llsc();						      \
+									      \
+	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+		long temp;						      \
+									      \
+		__asm__ __volatile__(					      \
+		"	.set	arch=r4000				\n"   \
+		"1:	lld	%1, %2		# atomic64_fetch_" #op "\n"   \
+		"	" #asm_op " %0, %1, %3				\n"   \
+		"	scd	%0, %2					\n"   \
+		"	beqzl	%0, 1b					\n"   \
+		"	move	%0, %1					\n"   \
+		"	.set	mips0					\n"   \
+		: "=&r" (result), "=&r" (temp),				      \
+		  "+" GCC_OFF_SMALL_ASM() (v->counter)			      \
+		: "Ir" (i));						      \
+	} else if (kernel_uses_llsc) {					      \
+		long temp;						      \
+									      \
+		do {							      \
+			__asm__ __volatile__(				      \
+			"	.set	"MIPS_ISA_LEVEL"		\n"   \
+			"	lld	%1, %2	# atomic64_fetch_" #op "\n"   \
+			"	" #asm_op " %0, %1, %3			\n"   \
+			"	scd	%0, %2				\n"   \
+			"	.set	mips0				\n"   \
+			: "=&r" (result), "=&r" (temp),			      \
+			  "=" GCC_OFF_SMALL_ASM() (v->counter)		      \
+			: "Ir" (i), GCC_OFF_SMALL_ASM() (v->counter)	      \
+			: "memory");					      \
+		} while (unlikely(!result));				      \
+									      \
+		result = temp;						      \
+	} else {							      \
+		unsigned long flags;					      \
+									      \
+		raw_local_irq_save(flags);				      \
+		result = v->counter;					      \
+		v->counter c_op i;					      \
+		raw_local_irq_restore(flags);				      \
+	}								      \
+									      \
+	smp_llsc_mb();							      \
+									      \
+	return result;							      \
+}
+
 #define ATOMIC64_OPS(op, c_op, asm_op)					      \
 	ATOMIC64_OP(op, c_op, asm_op)					      \
-	ATOMIC64_OP_RETURN(op, c_op, asm_op)
+	ATOMIC64_OP_RETURN(op, c_op, asm_op)				      \
+	ATOMIC64_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC64_OPS(add, +=, daddu)
 ATOMIC64_OPS(sub, -=, dsubu)
-ATOMIC64_OP(and, &=, and)
-ATOMIC64_OP(or, |=, or)
-ATOMIC64_OP(xor, ^=, xor)
 
 #undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, c_op, asm_op)					      \
+	ATOMIC64_OP(op, c_op, asm_op)					      \
+	ATOMIC64_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC64_OPS(and, &=, and)
+ATOMIC64_OPS(or, |=, or)
+ATOMIC64_OPS(xor, ^=, xor)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 16/31] locking,mn10300: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (14 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 15/31] locking,mips: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 17/31] locking,parisc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-mn10300.patch --]
[-- Type: text/plain, Size: 1798 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.
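
A hedged aside (C11 sketch, illustrative names): the fetch variant
subsumes the return variant, but not the other way around:

	#include <stdatomic.h>

	/* op-return is trivially built on top of fetch-op... */
	static int add_return_from_fetch(atomic_int *v, int i)
	{
		return atomic_fetch_add(v, i) + i;
	}
	/* ...but for an irreversible op like OR the old bits cannot be
	 * recovered from (old | i), hence the need for a real fetch_or. */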

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/mn10300/include/asm/atomic.h |   35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

--- a/arch/mn10300/include/asm/atomic.h
+++ b/arch/mn10300/include/asm/atomic.h
@@ -84,16 +84,43 @@ static inline int atomic_##op##_return(i
 	return retval;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int retval, status;						\
+									\
+	asm volatile(							\
+		"1:	mov	%4,(_AAR,%3)	\n"			\
+		"	mov	(_ADR,%3),%1	\n"			\
+		"	mov	%1,%0		\n"			\
+		"	" #op "	%5,%0		\n"			\
+		"	mov	%0,(_ADR,%3)	\n"			\
+		"	mov	(_ADR,%3),%0	\n"	/* flush */	\
+		"	mov	(_ASR,%3),%0	\n"			\
+		"	or	%0,%0		\n"			\
+		"	bne	1b		\n"			\
+		: "=&r"(status), "=&r"(retval), "=m"(v->counter)	\
+		: "a"(ATOMIC_OPS_BASE_ADDR), "r"(&v->counter), "r"(i)	\
+		: "memory", "cc");					\
+	return retval;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 17/31] locking,parisc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (15 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 16/31] locking,mn10300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 18/31] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-parisc.patch --]
[-- Type: text/plain, Size: 2780 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.
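
A hedged userspace sketch of the same lock-protected fallback shape
(pthread mutex standing in for the hashed atomic spinlock; names are
illustrative):

	#include <pthread.h>

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	static long counter;

	static long fetch_or_locked(long i)
	{
		long old;

		pthread_mutex_lock(&lock);
		old = counter;		/* snapshot before the modification */
		counter |= i;
		pthread_mutex_unlock(&lock);
		return old;
	}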

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/parisc/include/asm/atomic.h |   65 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 57 insertions(+), 8 deletions(-)

--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
@@ -121,16 +121,41 @@ static __inline__ int atomic_##op##_retu
 	return ret;							\
 }
 
-#define ATOMIC_OPS(op, c_op) ATOMIC_OP(op, c_op) ATOMIC_OP_RETURN(op, c_op)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static __inline__ int atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	unsigned long flags;						\
+	int ret;							\
+									\
+	_atomic_spin_lock_irqsave(v, flags);				\
+	ret = v->counter;						\
+	v->counter c_op i;						\
+	_atomic_spin_unlock_irqrestore(v, flags);			\
+									\
+	return ret;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_OP_RETURN(op, c_op)					\
+	ATOMIC_FETCH_OP(op, c_op)
 
 ATOMIC_OPS(add, +=)
 ATOMIC_OPS(sub, -=)
 
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or, |=)
-ATOMIC_OP(xor, ^=)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_FETCH_OP(op, c_op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, &=)
+ATOMIC_OPS(or, |=)
+ATOMIC_OPS(xor, ^=)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -185,15 +210,39 @@ static __inline__ s64 atomic64_##op##_re
 	return ret;							\
 }
 
-#define ATOMIC64_OPS(op, c_op) ATOMIC64_OP(op, c_op) ATOMIC64_OP_RETURN(op, c_op)
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static __inline__ s64 atomic64_fetch_##op(s64 i, atomic64_t *v)		\
+{									\
+	unsigned long flags;						\
+	s64 ret;							\
+									\
+	_atomic_spin_lock_irqsave(v, flags);				\
+	ret = v->counter;						\
+	v->counter c_op i;						\
+	_atomic_spin_unlock_irqrestore(v, flags);			\
+									\
+	return ret;							\
+}
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_OP_RETURN(op, c_op)					\
+	ATOMIC64_FETCH_OP(op, c_op)
 
 ATOMIC64_OPS(add, +=)
 ATOMIC64_OPS(sub, -=)
-ATOMIC64_OP(and, &=)
-ATOMIC64_OP(or, |=)
-ATOMIC64_OP(xor, ^=)
 
 #undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &=)
+ATOMIC64_OPS(or, |=)
+ATOMIC64_OPS(xor, ^=)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 18/31] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (16 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 17/31] locking,parisc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22 16:41   ` Boqun Feng
  2016-04-22  9:04 ` [RFC][PATCH 19/31] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (13 subsequent siblings)
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-powerpc.patch --]
[-- Type: text/plain, Size: 3917 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.
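
As a point of comparison (C11, not the kernel API or the ppc asm): the
_relaxed/_acquire/_release split in the subject maps onto the explicit
memory orders, with the fully ordered variant left to the generic code:

	#include <stdatomic.h>

	static int fetch_add_relaxed(atomic_int *v, int i)
	{
		return atomic_fetch_add_explicit(v, i, memory_order_relaxed);
	}

	static int fetch_add_acquire(atomic_int *v, int i)
	{
		return atomic_fetch_add_explicit(v, i, memory_order_acquire);
	}

	static int fetch_add_release(atomic_int *v, int i)
	{
		return atomic_fetch_add_explicit(v, i, memory_order_release);
	}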


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/powerpc/include/asm/atomic.h |   83 +++++++++++++++++++++++++++++++++-----
 1 file changed, 74 insertions(+), 9 deletions(-)

--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -78,21 +78,53 @@ static inline int atomic_##op##_return_r
 	return t;							\
 }
 
+#define ATOMIC_FETCH_OP_RELAXED(op, asm_op)				\
+static inline int atomic_fetch_##op##_relaxed(int a, atomic_t *v)	\
+{									\
+	int res, t;							\
+									\
+	__asm__ __volatile__(						\
+"1:	lwarx	%0,0,%4		# atomic_fetch_" #op "_relaxed\n"	\
+	#asm_op " %1,%2,%0\n"						\
+	PPC405_ERR77(0, %4)						\
+"	stwcx.	%1,0,%4\n"						\
+"	bne-	1b\n"							\
+	: "=&r" (res), "=&r" (t), "+m" (v->counter)			\
+	: "r" (a), "r" (&v->counter)					\
+	: "cc");							\
+									\
+	return res;							\
+}
+
 #define ATOMIC_OPS(op, asm_op)						\
 	ATOMIC_OP(op, asm_op)						\
-	ATOMIC_OP_RETURN_RELAXED(op, asm_op)
+	ATOMIC_OP_RETURN_RELAXED(op, asm_op)				\
+	ATOMIC_FETCH_OP_RELAXED(op, asm_op)
 
 ATOMIC_OPS(add, add)
 ATOMIC_OPS(sub, subf)
 
-ATOMIC_OP(and, and)
-ATOMIC_OP(or, or)
-ATOMIC_OP(xor, xor)
-
 #define atomic_add_return_relaxed atomic_add_return_relaxed
 #define atomic_sub_return_relaxed atomic_sub_return_relaxed
 
+#define atomic_fetch_add_relaxed atomic_fetch_add_relaxed
+#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, asm_op)						\
+	ATOMIC_OP(op, asm_op)						\
+	ATOMIC_FETCH_OP_RELAXED(op, asm_op)
+
+ATOMIC_OPS(and, and)
+ATOMIC_OPS(or, or)
+ATOMIC_OPS(xor, xor)
+
+#define atomic_fetch_and_relaxed atomic_fetch_and_relaxed
+#define atomic_fetch_or_relaxed  atomic_fetch_or_relaxed
+#define atomic_fetch_xor_relaxed atomic_fetch_xor_relaxed
+
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP_RELAXED
 #undef ATOMIC_OP_RETURN_RELAXED
 #undef ATOMIC_OP
 
@@ -329,20 +361,53 @@ atomic64_##op##_return_relaxed(long a, a
 	return t;							\
 }
 
+#define ATOMIC64_FETCH_OP_RELAXED(op, asm_op)				\
+static inline long							\
+atomic64_fetch_##op##_relaxed(long a, atomic64_t *v)			\
+{									\
+	long res, t;							\
+									\
+	__asm__ __volatile__(						\
+"1:	ldarx	%0,0,%4		# atomic64_fetch_" #op "_relaxed\n"	\
+	#asm_op " %1,%3,%0\n"						\
+"	stdcx.	%1,0,%4\n"						\
+"	bne-	1b\n"							\
+	: "=&r" (res), "=&r" (t), "+m" (v->counter)			\
+	: "r" (a), "r" (&v->counter)					\
+	: "cc");							\
+									\
+	return t;							\
+}
+
 #define ATOMIC64_OPS(op, asm_op)					\
 	ATOMIC64_OP(op, asm_op)						\
-	ATOMIC64_OP_RETURN_RELAXED(op, asm_op)
+	ATOMIC64_OP_RETURN_RELAXED(op, asm_op)				\
+	ATOMIC64_FETCH_OP_RELAXED(op, asm_op)
 
 ATOMIC64_OPS(add, add)
 ATOMIC64_OPS(sub, subf)
-ATOMIC64_OP(and, and)
-ATOMIC64_OP(or, or)
-ATOMIC64_OP(xor, xor)
 
 #define atomic64_add_return_relaxed atomic64_add_return_relaxed
 #define atomic64_sub_return_relaxed atomic64_sub_return_relaxed
 
+#define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed
+#define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed
+
+#undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, asm_op)					\
+	ATOMIC64_OP(op, asm_op)						\
+	ATOMIC64_FETCH_OP_RELAXED(op, asm_op)
+
+ATOMIC64_OPS(and, and)
+ATOMIC64_OPS(or, or)
+ATOMIC64_OPS(xor, xor)
+
+#define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed
+#define atomic64_fetch_or_relaxed  atomic64_fetch_or_relaxed
+#define atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed
+
 #undef ATOPIC64_OPS
+#undef ATOMIC64_FETCH_OP_RELAXED
 #undef ATOMIC64_OP_RETURN_RELAXED
 #undef ATOMIC64_OP
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 19/31] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (17 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 18/31] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-25  8:06   ` Martin Schwidefsky
  2016-04-22  9:04 ` [RFC][PATCH 20/31] locking,sh: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (12 subsequent siblings)
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-s390.patch --]
[-- Type: text/plain, Size: 3826 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.
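
A hedged example of the classic fetch_add use (C11 sketch, made-up
names): handing out unique tickets, where the caller needs the
pre-increment value:

	#include <stdatomic.h>

	static atomic_ulong next_id;

	static unsigned long get_ticket(void)
	{
		/* the value before the increment is this caller's ticket */
		return atomic_fetch_add(&next_id, 1);
	}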

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/s390/include/asm/atomic.h |   42 +++++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)

--- a/arch/s390/include/asm/atomic.h
+++ b/arch/s390/include/asm/atomic.h
@@ -93,6 +93,11 @@ static inline int atomic_add_return(int
 	return __ATOMIC_LOOP(v, i, __ATOMIC_ADD, __ATOMIC_BARRIER) + i;
 }
 
+static inline int atomic_fetch_add(int i, atomic_t *v)
+{
+	return __ATOMIC_LOOP(v, i, __ATOMIC_ADD, __ATOMIC_BARRIER);
+}
+
 static inline void atomic_add(int i, atomic_t *v)
 {
 #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
@@ -114,22 +119,29 @@ static inline void atomic_add(int i, ato
 #define atomic_inc_and_test(_v)		(atomic_add_return(1, _v) == 0)
 #define atomic_sub(_i, _v)		atomic_add(-(int)(_i), _v)
 #define atomic_sub_return(_i, _v)	atomic_add_return(-(int)(_i), _v)
+#define atomic_fetch_sub(_i, _v)	atomic_fetch_add(-(int)(_i), _v)
 #define atomic_sub_and_test(_i, _v)	(atomic_sub_return(_i, _v) == 0)
 #define atomic_dec(_v)			atomic_sub(1, _v)
 #define atomic_dec_return(_v)		atomic_sub_return(1, _v)
 #define atomic_dec_and_test(_v)		(atomic_sub_return(1, _v) == 0)
 
-#define ATOMIC_OP(op, OP)						\
+#define ATOMIC_OPS(op, OP)						\
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
 	__ATOMIC_LOOP(v, i, __ATOMIC_##OP, __ATOMIC_NO_BARRIER);	\
+}									\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	return __ATOMIC_LOOP(v, i, __ATOMIC_##OP, __ATOMIC_BARRIER);	\
 }
 
-ATOMIC_OP(and, AND)
-ATOMIC_OP(or, OR)
-ATOMIC_OP(xor, XOR)
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, AND)
+ATOMIC_OPS(or, OR)
+ATOMIC_OPS(xor, XOR)
 
-#undef ATOMIC_OP
+#undef ATOMIC_OPS
 
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 
@@ -236,6 +248,11 @@ static inline long long atomic64_add_ret
 	return __ATOMIC64_LOOP(v, i, __ATOMIC64_ADD, __ATOMIC64_BARRIER) + i;
 }
 
+static inline long long atomic64_fetch_add(long long i, atomic64_t *v)
+{
+	return __ATOMIC64_LOOP(v, i, __ATOMIC64_ADD, __ATOMIC64_BARRIER);
+}
+
 static inline void atomic64_add(long long i, atomic64_t *v)
 {
 #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
@@ -264,17 +281,21 @@ static inline long long atomic64_cmpxchg
 	return old;
 }
 
-#define ATOMIC64_OP(op, OP)						\
+#define ATOMIC64_OPS(op, OP)						\
 static inline void atomic64_##op(long i, atomic64_t *v)			\
 {									\
 	__ATOMIC64_LOOP(v, i, __ATOMIC64_##OP, __ATOMIC64_NO_BARRIER);	\
+}									\
+static inline long atomic64_fetch_##op(long i, atomic64_t *v)		\
+{									\
+	return __ATOMIC64_LOOP(v, i, __ATOMIC64_##OP, __ATOMIC64_BARRIER); \
 }
 
-ATOMIC64_OP(and, AND)
-ATOMIC64_OP(or, OR)
-ATOMIC64_OP(xor, XOR)
+ATOMIC64_OPS(and, AND)
+ATOMIC64_OPS(or, OR)
+ATOMIC64_OPS(xor, XOR)
 
-#undef ATOMIC64_OP
+#undef ATOMIC64_OPS
 #undef __ATOMIC64_LOOP
 
 static inline int atomic64_add_unless(atomic64_t *v, long long i, long long u)
@@ -315,6 +336,7 @@ static inline long long atomic64_dec_if_
 #define atomic64_inc_return(_v)		atomic64_add_return(1, _v)
 #define atomic64_inc_and_test(_v)	(atomic64_add_return(1, _v) == 0)
 #define atomic64_sub_return(_i, _v)	atomic64_add_return(-(long long)(_i), _v)
+#define atomic64_fetch_sub(_i, _v)	atomic64_fetch_add(-(long long)(_i), _v)
 #define atomic64_sub(_i, _v)		atomic64_add(-(long long)(_i), _v)
 #define atomic64_sub_and_test(_i, _v)	(atomic64_sub_return(_i, _v) == 0)
 #define atomic64_dec(_v)		atomic64_sub(1, _v)

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 20/31] locking,sh: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (18 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 19/31] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 21/31] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-sh.patch --]
[-- Type: text/plain, Size: 4685 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.
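
Another hedged illustration (C11, made-up names; not part of this
patch): a reference-count put that needs the pre-decrement value to
detect the final reference:

	#include <stdatomic.h>
	#include <stdbool.h>

	static bool obj_put(atomic_int *refs)
	{
		/* true when we just dropped the last reference */
		return atomic_fetch_sub(refs, 1) == 1;
	}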


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/sh/include/asm/atomic-grb.h  |   34 ++++++++++++++++++++++++++++++----
 arch/sh/include/asm/atomic-irq.h  |   31 +++++++++++++++++++++++++++----
 arch/sh/include/asm/atomic-llsc.h |   32 ++++++++++++++++++++++++++++----
 arch/sh/include/asm/atomic.h      |    2 ++
 4 files changed, 87 insertions(+), 12 deletions(-)

--- a/arch/sh/include/asm/atomic-grb.h
+++ b/arch/sh/include/asm/atomic-grb.h
@@ -43,16 +43,42 @@ static inline int atomic_##op##_return(i
 	return tmp;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int res, tmp;							\
+									\
+	__asm__ __volatile__ (						\
+		"   .align 2              \n\t"				\
+		"   mova    1f,   r0      \n\t" /* r0 = end point */	\
+		"   mov    r15,   r1      \n\t" /* r1 = saved sp */	\
+		"   mov    #-6,   r15     \n\t" /* LOGIN: r15 = size */	\
+		"   mov.l  @%2,   %0      \n\t" /* load old value */	\
+		"   mov     %0,   %1      \n\t" /* save old value */	\
+		" " #op "   %3,   %0      \n\t" /* $op */		\
+		"   mov.l   %0,   @%2     \n\t" /* store new value */	\
+		"1: mov     r1,   r15     \n\t" /* LOGOUT */		\
+		: "=&r" (tmp), "=&r" (res), "+r"  (v)			\
+		: "r"   (i)						\
+		: "memory" , "r0", "r1");				\
+									\
+	return res;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/sh/include/asm/atomic-irq.h
+++ b/arch/sh/include/asm/atomic-irq.h
@@ -33,15 +33,38 @@ static inline int atomic_##op##_return(i
 	return temp;							\
 }
 
-#define ATOMIC_OPS(op, c_op) ATOMIC_OP(op, c_op) ATOMIC_OP_RETURN(op, c_op)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long temp, flags;					\
+									\
+	raw_local_irq_save(flags);					\
+	temp = v->counter;						\
+	v->counter c_op i;						\
+	raw_local_irq_restore(flags);					\
+									\
+	return temp;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_OP_RETURN(op, c_op)					\
+	ATOMIC_FETCH_OP(op, c_op)
 
 ATOMIC_OPS(add, +=)
 ATOMIC_OPS(sub, -=)
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or, |=)
-ATOMIC_OP(xor, ^=)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op, c_op)						\
+	ATOMIC_FETCH_OP(op, c_op)
+
+ATOMIC_OPS(and, &=)
+ATOMIC_OPS(or, |=)
+ATOMIC_OPS(xor, ^=)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/sh/include/asm/atomic-llsc.h
+++ b/arch/sh/include/asm/atomic-llsc.h
@@ -48,15 +48,39 @@ static inline int atomic_##op##_return(i
 	return temp;							\
 }
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long res, temp;					\
+									\
+	__asm__ __volatile__ (						\
+"1:	movli.l @%3, %0		! atomic_fetch_" #op "	\n"		\
+"	mov %0, %1					\n"		\
+"	" #op "	%2, %0					\n"		\
+"	movco.l	%0, @%3					\n"		\
+"	bf	1b					\n"		\
+"	synco						\n"		\
+	: "=&z" (temp), "=&z" (res)					\
+	: "r" (i), "r" (&v->counter)					\
+	: "t");								\
+									\
+	return res;							\
+}
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/sh/include/asm/atomic.h
+++ b/arch/sh/include/asm/atomic.h
@@ -25,6 +25,8 @@
 #include <asm/atomic-irq.h>
 #endif
 
+#define atomic_fetch_or atomic_fetch_or
+
 #define atomic_add_negative(a, v)	(atomic_add_return((a), (v)) < 0)
 #define atomic_dec_return(v)		atomic_sub_return(1, (v))
 #define atomic_inc_return(v)		atomic_add_return(1, (v))

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 21/31] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (19 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 20/31] locking,sh: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 22/31] locking,tile: " Peter Zijlstra
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-sparc.patch --]
[-- Type: text/plain, Size: 7872 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.
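
As on several architectures in this series, fetch_sub can simply negate
the operand and reuse fetch_add, because the returned (old) value does
not depend on which operation was applied; a minimal C11 sketch:

	#include <stdatomic.h>

	static int fetch_sub_via_add(atomic_int *v, int i)
	{
		/* the old value is the same whether we add -i or subtract i */
		return atomic_fetch_add(v, -i);
	}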

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/sparc/include/asm/atomic.h    |    1 
 arch/sparc/include/asm/atomic_32.h |   15 +++++++--
 arch/sparc/include/asm/atomic_64.h |   16 +++++++--
 arch/sparc/lib/atomic32.c          |   29 ++++++++++-------
 arch/sparc/lib/atomic_64.S         |   61 ++++++++++++++++++++++++++++++-------
 arch/sparc/lib/ksyms.c             |   17 +++++++---
 6 files changed, 105 insertions(+), 34 deletions(-)

--- a/arch/sparc/include/asm/atomic.h
+++ b/arch/sparc/include/asm/atomic.h
@@ -5,4 +5,5 @@
 #else
 #include <asm/atomic_32.h>
 #endif
+#define atomic_fetch_or atomic_fetch_or
 #endif
--- a/arch/sparc/include/asm/atomic_32.h
+++ b/arch/sparc/include/asm/atomic_32.h
@@ -20,9 +20,10 @@
 #define ATOMIC_INIT(i)  { (i) }
 
 int atomic_add_return(int, atomic_t *);
-void atomic_and(int, atomic_t *);
-void atomic_or(int, atomic_t *);
-void atomic_xor(int, atomic_t *);
+int atomic_fetch_add(int, atomic_t *);
+int atomic_fetch_and(int, atomic_t *);
+int atomic_fetch_or(int, atomic_t *);
+int atomic_fetch_xor(int, atomic_t *);
 int atomic_cmpxchg(atomic_t *, int, int);
 int atomic_xchg(atomic_t *, int);
 int __atomic_add_unless(atomic_t *, int, int);
@@ -35,7 +36,15 @@ void atomic_set(atomic_t *, int);
 #define atomic_inc(v)		((void)atomic_add_return(        1, (v)))
 #define atomic_dec(v)		((void)atomic_add_return(       -1, (v)))
 
+#define atomic_fetch_or	atomic_fetch_or
+
+#define atomic_and(i, v)	((void)atomic_fetch_and((i), (v)))
+#define atomic_or(i, v)		((void)atomic_fetch_or((i), (v)))
+#define atomic_xor(i, v)	((void)atomic_fetch_xor((i), (v)))
+
 #define atomic_sub_return(i, v)	(atomic_add_return(-(int)(i), (v)))
+#define atomic_fetch_sub(i, v)  (atomic_fetch_add (-(int)(i), (v)))
+
 #define atomic_inc_return(v)	(atomic_add_return(        1, (v)))
 #define atomic_dec_return(v)	(atomic_add_return(       -1, (v)))
 
--- a/arch/sparc/include/asm/atomic_64.h
+++ b/arch/sparc/include/asm/atomic_64.h
@@ -28,16 +28,24 @@ void atomic64_##op(long, atomic64_t *);
 int atomic_##op##_return(int, atomic_t *);				\
 long atomic64_##op##_return(long, atomic64_t *);
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+int atomic_fetch_##op(int, atomic_t *);					\
+long atomic64_fetch_##op(long, atomic64_t *);
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/arch/sparc/lib/atomic32.c
+++ b/arch/sparc/lib/atomic32.c
@@ -27,39 +27,44 @@ static DEFINE_SPINLOCK(dummy);
 
 #endif /* SMP */
 
-#define ATOMIC_OP_RETURN(op, c_op)					\
-int atomic_##op##_return(int i, atomic_t *v)				\
+#define ATOMIC_FETCH_OP(op, c_op)					\
+int atomic_fetch_##op(int i, atomic_t *v)				\
 {									\
 	int ret;							\
 	unsigned long flags;						\
 	spin_lock_irqsave(ATOMIC_HASH(v), flags);			\
 									\
-	ret = (v->counter c_op i);					\
+	ret = v->counter;						\
+	v->counter c_op i;						\
 									\
 	spin_unlock_irqrestore(ATOMIC_HASH(v), flags);			\
 	return ret;							\
 }									\
-EXPORT_SYMBOL(atomic_##op##_return);
+EXPORT_SYMBOL(atomic_fetch_##op);
 
-#define ATOMIC_OP(op, c_op)						\
-void atomic_##op(int i, atomic_t *v)					\
+#define ATOMIC_OP_RETURN(op, c_op)					\
+int atomic_##op##_return(int i, atomic_t *v)				\
 {									\
+	int ret;							\
 	unsigned long flags;						\
 	spin_lock_irqsave(ATOMIC_HASH(v), flags);			\
 									\
-	v->counter c_op i;						\
+	ret = (v->counter c_op i);					\
 									\
 	spin_unlock_irqrestore(ATOMIC_HASH(v), flags);			\
+	return ret;							\
 }									\
-EXPORT_SYMBOL(atomic_##op);
+EXPORT_SYMBOL(atomic_##op##_return);
 
 ATOMIC_OP_RETURN(add, +=)
-ATOMIC_OP(and, &=)
-ATOMIC_OP(or, |=)
-ATOMIC_OP(xor, ^=)
 
+ATOMIC_FETCH_OP(add, +=)
+ATOMIC_FETCH_OP(and, &=)
+ATOMIC_FETCH_OP(or, |=)
+ATOMIC_FETCH_OP(xor, ^=)
+
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
-#undef ATOMIC_OP
 
 int atomic_xchg(atomic_t *v, int new)
 {
--- a/arch/sparc/lib/atomic_64.S
+++ b/arch/sparc/lib/atomic_64.S
@@ -9,10 +9,11 @@
 
 	.text
 
-	/* Two versions of the atomic routines, one that
+	/* Three versions of the atomic routines, one that
 	 * does not return a value and does not perform
-	 * memory barriers, and a second which returns
-	 * a value and does the barriers.
+	 * memory barriers, and two which return
+	 * a value (the new and the old value, respectively) and
+	 * do the barriers.
 	 */
 
 #define ATOMIC_OP(op)							\
@@ -43,15 +44,34 @@ ENTRY(atomic_##op##_return) /* %o0 = inc
 2:	BACKOFF_SPIN(%o2, %o3, 1b);					\
 ENDPROC(atomic_##op##_return);
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+ENTRY(atomic_fetch_##op) /* %o0 = increment, %o1 = atomic_ptr */	\
+	BACKOFF_SETUP(%o2);						\
+1:	lduw	[%o1], %g1;						\
+	op	%g1, %o0, %g7;						\
+	cas	[%o1], %g1, %g7;					\
+	cmp	%g1, %g7;						\
+	bne,pn	%icc, BACKOFF_LABEL(2f, 1b);				\
+	 nop;								\
+	retl;								\
+	 sra	%g1, 0, %o0;						\
+2:	BACKOFF_SPIN(%o2, %o3, 1b);					\
+ENDPROC(atomic_fetch_##op);
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -83,15 +103,34 @@ ENTRY(atomic64_##op##_return) /* %o0 = i
 2:	BACKOFF_SPIN(%o2, %o3, 1b);					\
 ENDPROC(atomic64_##op##_return);
 
-#define ATOMIC64_OPS(op) ATOMIC64_OP(op) ATOMIC64_OP_RETURN(op)
+#define ATOMIC64_FETCH_OP(op)						\
+ENTRY(atomic64_fetch_##op) /* %o0 = increment, %o1 = atomic_ptr */	\
+	BACKOFF_SETUP(%o2);						\
+1:	ldx	[%o1], %g1;						\
+	op	%g1, %o0, %g7;						\
+	casx	[%o1], %g1, %g7;					\
+	cmp	%g1, %g7;						\
+	bne,pn	%xcc, BACKOFF_LABEL(2f, 1b);				\
+	 nop;								\
+	retl;								\
+	 mov	%g1, %o0;						\
+2:	BACKOFF_SPIN(%o2, %o3, 1b);					\
+ENDPROC(atomic64_fetch_##op);
+
+#define ATOMIC64_OPS(op) ATOMIC64_OP(op) ATOMIC64_OP_RETURN(op) ATOMIC64_FETCH_OP(op)
 
 ATOMIC64_OPS(add)
 ATOMIC64_OPS(sub)
-ATOMIC64_OP(and)
-ATOMIC64_OP(or)
-ATOMIC64_OP(xor)
 
 #undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op) ATOMIC64_OP(op) ATOMIC64_FETCH_OP(op)
+
+ATOMIC64_OPS(and)
+ATOMIC64_OPS(or)
+ATOMIC64_OPS(xor)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 
--- a/arch/sparc/lib/ksyms.c
+++ b/arch/sparc/lib/ksyms.c
@@ -107,15 +107,24 @@ EXPORT_SYMBOL(atomic64_##op);
 EXPORT_SYMBOL(atomic_##op##_return);					\
 EXPORT_SYMBOL(atomic64_##op##_return);
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_FETCH_OP(op)						\
+EXPORT_SYMBOL(atomic_fetch_##op);					\
+EXPORT_SYMBOL(atomic64_fetch_##op);
+
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
 
 #undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 22/31] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (20 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 21/31] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-25 21:10     ` Chris Metcalf
       [not found]   ` <571E840A.8090703@mellanox.com>
  2016-04-22  9:04 ` [RFC][PATCH 23/31] locking,x86: " Peter Zijlstra
                   ` (9 subsequent siblings)
  31 siblings, 2 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-tile.patch --]
[-- Type: text/plain, Size: 16995 bytes --]

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except that they return
the value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops -- where the state prior to modification cannot be
reconstructed from the new value.

XXX please look at the tilegx (CONFIG_64BIT) atomics, I think we get
the barriers wrong (at the very least they're inconsistent).
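
For comparison, a hedged C11 sketch of the bracketing pattern the fetch
ops below use -- a relaxed RMW made fully ordered by fencing on both
sides (atomic_thread_fence() standing in for smp_mb()):

	#include <stdatomic.h>

	static int fetch_add_fenced(atomic_int *v, int i)
	{
		int old;

		atomic_thread_fence(memory_order_seq_cst);
		old = atomic_fetch_add_explicit(v, i, memory_order_relaxed);
		atomic_thread_fence(memory_order_seq_cst);
		return old;
	}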

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/tile/include/asm/atomic.h    |    4 +
 arch/tile/include/asm/atomic_32.h |   60 +++++++++++++------
 arch/tile/include/asm/atomic_64.h |  117 +++++++++++++++++++++++++-------------
 arch/tile/include/asm/bitops_32.h |   18 ++---
 arch/tile/lib/atomic_32.c         |   42 ++++++-------
 arch/tile/lib/atomic_asm_32.S     |   14 ++--
 6 files changed, 161 insertions(+), 94 deletions(-)

--- a/arch/tile/include/asm/atomic.h
+++ b/arch/tile/include/asm/atomic.h
@@ -46,6 +46,10 @@ static inline int atomic_read(const atom
  */
 #define atomic_sub_return(i, v)		atomic_add_return((int)(-(i)), (v))
 
+#define atomic_fetch_sub(i, v)		atomic_fetch_add(-(int)(i), (v))
+
+#define atomic_fetch_or atomic_fetch_or
+
 /**
  * atomic_sub - subtract integer from atomic variable
  * @i: integer value to subtract
--- a/arch/tile/include/asm/atomic_32.h
+++ b/arch/tile/include/asm/atomic_32.h
@@ -34,18 +34,29 @@ static inline void atomic_add(int i, ato
 	_atomic_xchg_add(&v->counter, i);
 }
 
-#define ATOMIC_OP(op)							\
-unsigned long _atomic_##op(volatile unsigned long *p, unsigned long mask); \
+#define ATOMIC_OPS(op)							\
+unsigned long _atomic_fetch_##op(volatile unsigned long *p, unsigned long mask); \
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
-	_atomic_##op((unsigned long *)&v->counter, i);			\
+	_atomic_fetch_##op((unsigned long *)&v->counter, i);		\
+}									\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	smp_mb();							\
+	return _atomic_fetch_##op((unsigned long *)&v->counter, i);	\
 }
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
+
+#undef ATOMIC_OPS
 
-#undef ATOMIC_OP
+static inline int atomic_fetch_add(int i, atomic_t *v)
+{
+	smp_mb();
+	return _atomic_xchg_add(&v->counter, i);
+}
 
 /**
  * atomic_add_return - add integer and return
@@ -126,17 +137,30 @@ static inline void atomic64_add(long lon
 	_atomic64_xchg_add(&v->counter, i);
 }
 
-#define ATOMIC64_OP(op)						\
-long long _atomic64_##op(long long *v, long long n);		\
+#define ATOMIC64_OPS(op)					\
+long long _atomic64_fetch_##op(long long *v, long long n);	\
+static inline void atomic64_##op(long long i, atomic64_t *v)	\
+{								\
+	_atomic64_fetch_##op(&v->counter, i);			\
+}								\
-static inline void atomic64_##op(long long i, atomic64_t *v)	\
-{								\
-	_atomic64_##op(&v->counter, i);				\
+static inline long long atomic64_fetch_##op(long long i, atomic64_t *v) \
+{								\
+	smp_mb();						\
+	return _atomic64_fetch_##op(&v->counter, i);		\
 }
 
-ATOMIC64_OP(and)
-ATOMIC64_OP(or)
-ATOMIC64_OP(xor)
+ATOMIC64_OPS(and)
+ATOMIC64_OPS(or)
+ATOMIC64_OPS(xor)
 
+#undef ATOMIC64_OPS
+
+static inline long long atomic64_fetch_add(long long i, atomic64_t *v)
+{
+	smp_mb();
+	return _atomic64_xchg_add(&v->counter, i);
+}
+
 /**
  * atomic64_add_return - add integer and return
  * @v: pointer of type atomic64_t
@@ -186,6 +210,7 @@ static inline void atomic64_set(atomic64
 #define atomic64_inc_return(v)		atomic64_add_return(1LL, (v))
 #define atomic64_inc_and_test(v)	(atomic64_inc_return(v) == 0)
 #define atomic64_sub_return(i, v)	atomic64_add_return(-(i), (v))
+#define atomic64_fetch_sub(i, v)	atomic64_fetch_add(-(i), (v))
 #define atomic64_sub_and_test(a, v)	(atomic64_sub_return((a), (v)) == 0)
 #define atomic64_sub(i, v)		atomic64_add(-(i), (v))
 #define atomic64_dec(v)			atomic64_sub(1LL, (v))
@@ -193,7 +218,6 @@ static inline void atomic64_set(atomic64
 #define atomic64_dec_and_test(v)	(atomic64_dec_return((v)) == 0)
 #define atomic64_inc_not_zero(v)	atomic64_add_unless((v), 1LL, 0LL)
 
-
 #endif /* !__ASSEMBLY__ */
 
 /*
@@ -248,10 +272,10 @@ extern struct __get_user __atomic_xchg(v
 extern struct __get_user __atomic_xchg_add(volatile int *p, int *lock, int n);
 extern struct __get_user __atomic_xchg_add_unless(volatile int *p,
 						  int *lock, int o, int n);
-extern struct __get_user __atomic_or(volatile int *p, int *lock, int n);
-extern struct __get_user __atomic_and(volatile int *p, int *lock, int n);
-extern struct __get_user __atomic_andn(volatile int *p, int *lock, int n);
-extern struct __get_user __atomic_xor(volatile int *p, int *lock, int n);
+extern struct __get_user __atomic_fetch_or(volatile int *p, int *lock, int n);
+extern struct __get_user __atomic_fetch_and(volatile int *p, int *lock, int n);
+extern struct __get_user __atomic_fetch_andn(volatile int *p, int *lock, int n);
+extern struct __get_user __atomic_fetch_xor(volatile int *p, int *lock, int n);
 extern long long __atomic64_cmpxchg(volatile long long *p, int *lock,
 					long long o, long long n);
 extern long long __atomic64_xchg(volatile long long *p, int *lock, long long n);
@@ -259,9 +283,9 @@ extern long long __atomic64_xchg_add(vol
 					long long n);
 extern long long __atomic64_xchg_add_unless(volatile long long *p,
 					int *lock, long long o, long long n);
-extern long long __atomic64_and(volatile long long *p, int *lock, long long n);
-extern long long __atomic64_or(volatile long long *p, int *lock, long long n);
-extern long long __atomic64_xor(volatile long long *p, int *lock, long long n);
+extern long long __atomic64_fetch_and(volatile long long *p, int *lock, long long n);
+extern long long __atomic64_fetch_or(volatile long long *p, int *lock, long long n);
+extern long long __atomic64_fetch_xor(volatile long long *p, int *lock, long long n);
 
 /* Return failure from the atomic wrappers. */
 struct __get_user __atomic_bad_address(int __user *addr);
--- a/arch/tile/include/asm/atomic_64.h
+++ b/arch/tile/include/asm/atomic_64.h
@@ -32,42 +32,49 @@
  * on any routine which updates memory and returns a value.
  */
 
-static inline void atomic_add(int i, atomic_t *v)
-{
-	__insn_fetchadd4((void *)&v->counter, i);
-}
-
 static inline int atomic_add_return(int i, atomic_t *v)
 {
 	int val;
 	smp_mb();  /* barrier for proper semantics */
 	val = __insn_fetchadd4((void *)&v->counter, i) + i;
 	barrier();  /* the "+ i" above will wait on memory */
+	/* XXX smp_mb() instead, as per cmpxchg() ? */
 	return val;
 }
 
-static inline int __atomic_add_unless(atomic_t *v, int a, int u)
+#define ATOMIC_OPS(op)							\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int val;							\
+	smp_mb();							\
+	val = __insn_fetch##op##4((void *)&v->counter, i);		\
+	smp_mb();							\
+	return val;							\
+}									\
+static inline void atomic_##op(int i, atomic_t *v)			\
+{									\
+	__insn_fetch##op##4((void *)&v->counter, i);			\
+}
+
+ATOMIC_OPS(add)
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+
+#undef ATOMIC_OPS
+
+static inline int atomic_fetch_xor(int i, atomic_t *v)
 {
 	int guess, oldval = v->counter;
+	smp_mb();
 	do {
-		if (oldval == u)
-			break;
 		guess = oldval;
-		oldval = cmpxchg(&v->counter, guess, guess + a);
+		__insn_mtspr(SPR_CMPEXCH_VALUE, guess);
+		oldval = __insn_cmpexch4(&v->counter, guess ^ i);
 	} while (guess != oldval);
+	smp_mb();
 	return oldval;
 }
 
-static inline void atomic_and(int i, atomic_t *v)
-{
-	__insn_fetchand4((void *)&v->counter, i);
-}
-
-static inline void atomic_or(int i, atomic_t *v)
-{
-	__insn_fetchor4((void *)&v->counter, i);
-}
-
 static inline void atomic_xor(int i, atomic_t *v)
 {
 	int guess, oldval = v->counter;
@@ -78,6 +85,18 @@ static inline void atomic_xor(int i, ato
 	} while (guess != oldval);
 }
 
+static inline int __atomic_add_unless(atomic_t *v, int a, int u)
+{
+	int guess, oldval = v->counter;
+	do {
+		if (oldval == u)
+			break;
+		guess = oldval;
+		oldval = cmpxchg(&v->counter, guess, guess + a);
+	} while (guess != oldval);
+	return oldval;
+}
+
 /* Now the true 64-bit operations. */
 
 #define ATOMIC64_INIT(i)	{ (i) }
@@ -85,40 +104,47 @@ static inline void atomic_xor(int i, ato
 #define atomic64_read(v)	READ_ONCE((v)->counter)
 #define atomic64_set(v, i)	WRITE_ONCE((v)->counter, (i))
 
-static inline void atomic64_add(long i, atomic64_t *v)
-{
-	__insn_fetchadd((void *)&v->counter, i);
-}
-
 static inline long atomic64_add_return(long i, atomic64_t *v)
 {
 	int val;
 	smp_mb();  /* barrier for proper semantics */
 	val = __insn_fetchadd((void *)&v->counter, i) + i;
 	barrier();  /* the "+ i" above will wait on memory */
+	/* XXX smp_mb() */
 	return val;
 }
 
-static inline long atomic64_add_unless(atomic64_t *v, long a, long u)
+#define ATOMIC64_OPS(op)						\
+static inline long atomic64_fetch_##op(long i, atomic64_t *v)		\
+{									\
+	long val;							\
+	smp_mb();							\
+	val = __insn_fetch##op((void *)&v->counter, i);			\
+	smp_mb();							\
+	return val;							\
+}									\
+static inline void atomic64_##op(long i, atomic64_t *v)			\
+{									\
+	__insn_fetch##op((void *)&v->counter, i);			\
+}
+
+ATOMIC64_OPS(add)
+ATOMIC64_OPS(and)
+ATOMIC64_OPS(or)
+
+#undef ATOMIC64_OPS
+
+static inline long atomic64_fetch_xor(long i, atomic64_t *v)
 {
 	long guess, oldval = v->counter;
+	smp_mb();
 	do {
-		if (oldval == u)
-			break;
 		guess = oldval;
-		oldval = cmpxchg(&v->counter, guess, guess + a);
+		__insn_mtspr(SPR_CMPEXCH_VALUE, guess);
+		oldval = __insn_cmpexch(&v->counter, guess ^ i);
 	} while (guess != oldval);
-	return oldval != u;
-}
-
-static inline void atomic64_and(long i, atomic64_t *v)
-{
-	__insn_fetchand((void *)&v->counter, i);
-}
-
-static inline void atomic64_or(long i, atomic64_t *v)
-{
-	__insn_fetchor((void *)&v->counter, i);
+	smp_mb();
+	return oldval;
 }
 
 static inline void atomic64_xor(long i, atomic64_t *v)
@@ -131,7 +157,20 @@ static inline void atomic64_xor(long i,
 	} while (guess != oldval);
 }
 
+static inline long atomic64_add_unless(atomic64_t *v, long a, long u)
+{
+	long guess, oldval = v->counter;
+	do {
+		if (oldval == u)
+			break;
+		guess = oldval;
+		oldval = cmpxchg(&v->counter, guess, guess + a);
+	} while (guess != oldval);
+	return oldval != u;
+}
+
 #define atomic64_sub_return(i, v)	atomic64_add_return(-(i), (v))
+#define atomic64_fetch_sub(i, v)	atomic64_fetch_add(-(i), (v))
 #define atomic64_sub(i, v)		atomic64_add(-(i), (v))
 #define atomic64_inc_return(v)		atomic64_add_return(1, (v))
 #define atomic64_dec_return(v)		atomic64_sub_return(1, (v))
--- a/arch/tile/include/asm/bitops_32.h
+++ b/arch/tile/include/asm/bitops_32.h
@@ -19,9 +19,9 @@
 #include <asm/barrier.h>
 
 /* Tile-specific routines to support <asm/bitops.h>. */
-unsigned long _atomic_or(volatile unsigned long *p, unsigned long mask);
-unsigned long _atomic_andn(volatile unsigned long *p, unsigned long mask);
-unsigned long _atomic_xor(volatile unsigned long *p, unsigned long mask);
+unsigned long _atomic_fetch_or(volatile unsigned long *p, unsigned long mask);
+unsigned long _atomic_fetch_andn(volatile unsigned long *p, unsigned long mask);
+unsigned long _atomic_fetch_xor(volatile unsigned long *p, unsigned long mask);
 
 /**
  * set_bit - Atomically set a bit in memory
@@ -35,7 +35,7 @@ unsigned long _atomic_xor(volatile unsig
  */
 static inline void set_bit(unsigned nr, volatile unsigned long *addr)
 {
-	_atomic_or(addr + BIT_WORD(nr), BIT_MASK(nr));
+	_atomic_fetch_or(addr + BIT_WORD(nr), BIT_MASK(nr));
 }
 
 /**
@@ -54,7 +54,7 @@ static inline void set_bit(unsigned nr,
  */
 static inline void clear_bit(unsigned nr, volatile unsigned long *addr)
 {
-	_atomic_andn(addr + BIT_WORD(nr), BIT_MASK(nr));
+	_atomic_fetch_andn(addr + BIT_WORD(nr), BIT_MASK(nr));
 }
 
 /**
@@ -69,7 +69,7 @@ static inline void clear_bit(unsigned nr
  */
 static inline void change_bit(unsigned nr, volatile unsigned long *addr)
 {
-	_atomic_xor(addr + BIT_WORD(nr), BIT_MASK(nr));
+	_atomic_fetch_xor(addr + BIT_WORD(nr), BIT_MASK(nr));
 }
 
 /**
@@ -85,7 +85,7 @@ static inline int test_and_set_bit(unsig
 	unsigned long mask = BIT_MASK(nr);
 	addr += BIT_WORD(nr);
 	smp_mb();  /* barrier for proper semantics */
-	return (_atomic_or(addr, mask) & mask) != 0;
+	return (_atomic_fetch_or(addr, mask) & mask) != 0;
 }
 
 /**
@@ -101,7 +101,7 @@ static inline int test_and_clear_bit(uns
 	unsigned long mask = BIT_MASK(nr);
 	addr += BIT_WORD(nr);
 	smp_mb();  /* barrier for proper semantics */
-	return (_atomic_andn(addr, mask) & mask) != 0;
+	return (_atomic_fetch_andn(addr, mask) & mask) != 0;
 }
 
 /**
@@ -118,7 +118,7 @@ static inline int test_and_change_bit(un
 	unsigned long mask = BIT_MASK(nr);
 	addr += BIT_WORD(nr);
 	smp_mb();  /* barrier for proper semantics */
-	return (_atomic_xor(addr, mask) & mask) != 0;
+	return (_atomic_fetch_xor(addr, mask) & mask) != 0;
 }
 
 #include <asm-generic/bitops/ext2-atomic.h>
--- a/arch/tile/lib/atomic_32.c
+++ b/arch/tile/lib/atomic_32.c
@@ -88,29 +88,29 @@ int _atomic_cmpxchg(int *v, int o, int n
 }
 EXPORT_SYMBOL(_atomic_cmpxchg);
 
-unsigned long _atomic_or(volatile unsigned long *p, unsigned long mask)
+unsigned long _atomic_fetch_or(volatile unsigned long *p, unsigned long mask)
 {
-	return __atomic_or((int *)p, __atomic_setup(p), mask).val;
+	return __atomic_fetch_or((int *)p, __atomic_setup(p), mask).val;
 }
-EXPORT_SYMBOL(_atomic_or);
+EXPORT_SYMBOL(_atomic_fetch_or);
 
-unsigned long _atomic_and(volatile unsigned long *p, unsigned long mask)
+unsigned long _atomic_fetch_and(volatile unsigned long *p, unsigned long mask)
 {
-	return __atomic_and((int *)p, __atomic_setup(p), mask).val;
+	return __atomic_fetch_and((int *)p, __atomic_setup(p), mask).val;
 }
-EXPORT_SYMBOL(_atomic_and);
+EXPORT_SYMBOL(_atomic_fetch_and);
 
-unsigned long _atomic_andn(volatile unsigned long *p, unsigned long mask)
+unsigned long _atomic_fetch_andn(volatile unsigned long *p, unsigned long mask)
 {
-	return __atomic_andn((int *)p, __atomic_setup(p), mask).val;
+	return __atomic_fetch_andn((int *)p, __atomic_setup(p), mask).val;
 }
-EXPORT_SYMBOL(_atomic_andn);
+EXPORT_SYMBOL(_atomic_fetch_andn);
 
-unsigned long _atomic_xor(volatile unsigned long *p, unsigned long mask)
+unsigned long _atomic_fetch_xor(volatile unsigned long *p, unsigned long mask)
 {
-	return __atomic_xor((int *)p, __atomic_setup(p), mask).val;
+	return __atomic_fetch_xor((int *)p, __atomic_setup(p), mask).val;
 }
-EXPORT_SYMBOL(_atomic_xor);
+EXPORT_SYMBOL(_atomic_fetch_xor);
 
 
 long long _atomic64_xchg(long long *v, long long n)
@@ -142,23 +142,23 @@ long long _atomic64_cmpxchg(long long *v
 }
 EXPORT_SYMBOL(_atomic64_cmpxchg);
 
-long long _atomic64_and(long long *v, long long n)
+long long _atomic64_fetch_and(long long *v, long long n)
 {
-	return __atomic64_and(v, __atomic_setup(v), n);
+	return __atomic64_fetch_and(v, __atomic_setup(v), n);
 }
-EXPORT_SYMBOL(_atomic64_and);
+EXPORT_SYMBOL(_atomic64_fetch_and);
 
-long long _atomic64_or(long long *v, long long n)
+long long _atomic64_fetch_or(long long *v, long long n)
 {
-	return __atomic64_or(v, __atomic_setup(v), n);
+	return __atomic64_fetch_or(v, __atomic_setup(v), n);
 }
-EXPORT_SYMBOL(_atomic64_or);
+EXPORT_SYMBOL(_atomic64_fetch_or);
 
-long long _atomic64_xor(long long *v, long long n)
+long long _atomic64_fetch_xor(long long *v, long long n)
 {
-	return __atomic64_xor(v, __atomic_setup(v), n);
+	return __atomic64_fetch_xor(v, __atomic_setup(v), n);
 }
-EXPORT_SYMBOL(_atomic64_xor);
+EXPORT_SYMBOL(_atomic64_fetch_xor);
 
 /*
  * If any of the atomic or futex routines hit a bad address (not in
--- a/arch/tile/lib/atomic_asm_32.S
+++ b/arch/tile/lib/atomic_asm_32.S
@@ -177,10 +177,10 @@ atomic_op _xchg, 32, "move r24, r2"
 atomic_op _xchg_add, 32, "add r24, r22, r2"
 atomic_op _xchg_add_unless, 32, \
 	"sne r26, r22, r2; { bbns r26, 3f; add r24, r22, r3 }"
-atomic_op _or, 32, "or r24, r22, r2"
-atomic_op _and, 32, "and r24, r22, r2"
-atomic_op _andn, 32, "nor r2, r2, zero; and r24, r22, r2"
-atomic_op _xor, 32, "xor r24, r22, r2"
+atomic_op _fetch_or, 32, "or r24, r22, r2"
+atomic_op _fetch_and, 32, "and r24, r22, r2"
+atomic_op _fetch_andn, 32, "nor r2, r2, zero; and r24, r22, r2"
+atomic_op _fetch_xor, 32, "xor r24, r22, r2"
 
 atomic_op 64_cmpxchg, 64, "{ seq r26, r22, r2; seq r27, r23, r3 }; \
 	{ bbns r26, 3f; move r24, r4 }; { bbns r27, 3f; move r25, r5 }"
@@ -192,9 +192,9 @@ atomic_op 64_xchg_add_unless, 64, \
 	{ bbns r26, 3f; add r24, r22, r4 }; \
 	{ bbns r27, 3f; add r25, r23, r5 }; \
 	slt_u r26, r24, r22; add r25, r25, r26"
-atomic_op 64_or, 64, "{ or r24, r22, r2; or r25, r23, r3 }"
-atomic_op 64_and, 64, "{ and r24, r22, r2; and r25, r23, r3 }"
-atomic_op 64_xor, 64, "{ xor r24, r22, r2; xor r25, r23, r3 }"
+atomic_op 64_fetch_or, 64, "{ or r24, r22, r2; or r25, r23, r3 }"
+atomic_op 64_fetch_and, 64, "{ and r24, r22, r2; and r25, r23, r3 }"
+atomic_op 64_fetch_xor, 64, "{ xor r24, r22, r2; xor r25, r23, r3 }"
 
 	jrp     lr              /* happy backtracer */
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 23/31] locking,x86: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (21 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 22/31] locking,tile: " Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 24/31] locking,xtensa: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-x86.patch --]
[-- Type: text/plain, Size: 4134 bytes --]

Implement the FETCH-OP atomic primitives. These are very similar to the
existing OP-RETURN primitives, except that they return the value of the
atomic variable _before_ the modification.

This is especially useful for irreversible operations -- such as
bitops -- where the new value alone does not allow the state prior to
the modification to be reconstructed.
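
As an illustration (not part of the patch), a minimal sketch of the
test-and-set pattern this enables; the helper name and the choice of
atomic_fetch_or() as the example are mine:

/*
 * Sketch only: atomically set a flag bit and report whether we were
 * the one to set it. The OP-RETURN form cannot express this, because
 * the returned new value has the bit set either way, so the prior
 * state of the bit is lost.
 */
static inline bool claim_flag(atomic_t *flags, int bit)
{
	int mask = 1 << bit;

	return !(atomic_fetch_or(mask, flags) & mask);
}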

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/atomic.h      |   37 ++++++++++++++++++++++++++++++++++---
 arch/x86/include/asm/atomic64_32.h |   25 ++++++++++++++++++++++---
 arch/x86/include/asm/atomic64_64.h |   35 ++++++++++++++++++++++++++++++++---
 3 files changed, 88 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -171,6 +171,16 @@ static __always_inline int atomic_sub_re
 #define atomic_inc_return(v)  (atomic_add_return(1, v))
 #define atomic_dec_return(v)  (atomic_sub_return(1, v))
 
+static __always_inline int atomic_fetch_add(int i, atomic_t *v)
+{
+	return xadd(&v->counter, i);
+}
+
+static __always_inline int atomic_fetch_sub(int i, atomic_t *v)
+{
+	return xadd(&v->counter, -i);
+}
+
 static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
 {
 	return cmpxchg(&v->counter, old, new);
@@ -190,10 +200,31 @@ static inline void atomic_##op(int i, at
 			: "memory");					\
 }
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)		\
+{									\
+	int old, val = atomic_read(v);					\
+	for (;;) {							\
+		old = atomic_cmpxchg(v, val, val c_op i);		\
+		if (old == val)						\
+			break;						\
+		val = old;						\
+	}								\
+	return old;							\
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op)							\
+	ATOMIC_FETCH_OP(op, c_op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and, &)
+ATOMIC_OPS(or , |)
+ATOMIC_OPS(xor, ^)
 
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP
 
 /**
--- a/arch/x86/include/asm/atomic64_32.h
+++ b/arch/x86/include/asm/atomic64_32.h
@@ -320,10 +320,29 @@ static inline void atomic64_##op(long lo
 		c = old;						\
 }
 
-ATOMIC64_OP(and, &)
-ATOMIC64_OP(or, |)
-ATOMIC64_OP(xor, ^)
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static inline long long atomic64_fetch_##op(long long i, atomic64_t *v)	\
+{									\
+	long long old, c = 0;						\
+	while ((old = atomic64_cmpxchg(v, c, c c_op i)) != c)		\
+		c = old;						\
+	return old;							\
+}
+
+ATOMIC64_FETCH_OP(add, +)
+
+#define atomic64_fetch_sub(i, v)	atomic64_fetch_add(-(i), (v))
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &)
+ATOMIC64_OPS(or, |)
+ATOMIC64_OPS(xor, ^)
 
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP
 
 #endif /* _ASM_X86_ATOMIC64_32_H */
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -158,6 +158,16 @@ static inline long atomic64_sub_return(l
 	return atomic64_add_return(-i, v);
 }
 
+static inline long atomic64_fetch_add(long i, atomic64_t *v)
+{
+	return xadd(&v->counter, i);
+}
+
+static inline long atomic64_fetch_sub(long i, atomic64_t *v)
+{
+	return xadd(&v->counter, -i);
+}
+
 #define atomic64_inc_return(v)  (atomic64_add_return(1, (v)))
 #define atomic64_dec_return(v)  (atomic64_sub_return(1, (v)))
 
@@ -229,10 +239,29 @@ static inline void atomic64_##op(long i,
 			: "memory");					\
 }
 
-ATOMIC64_OP(and)
-ATOMIC64_OP(or)
-ATOMIC64_OP(xor)
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static inline long atomic64_fetch_##op(long i, atomic64_t *v)		\
+{									\
+	long old, val = atomic64_read(v);				\
+	for (;;) {							\
+		old = atomic64_cmpxchg(v, val, val c_op i);		\
+		if (old == val)						\
+			break;						\
+		val = old;						\
+	}								\
+	return old;							\
+}
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op)							\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &)
+ATOMIC64_OPS(or, |)
+ATOMIC64_OPS(xor, ^)
 
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP
 
 #endif /* _ASM_X86_ATOMIC64_64_H */

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 24/31] locking,xtensa: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (22 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 23/31] locking,x86: " Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 25/31] locking: Fix atomic64_relaxed bits Peter Zijlstra
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-xtensa.patch --]
[-- Type: text/plain, Size: 2477 bytes --]

Implement the FETCH-OP atomic primitives. These are very similar to the
existing OP-RETURN primitives, except that they return the value of the
atomic variable _before_ the modification.

This is especially useful for irreversible operations -- such as
bitops -- where the new value alone does not allow the state prior to
the modification to be reconstructed.
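
As a reading aid only (not part of the patch), roughly what the
XCHAL_HAVE_S32C1I fetch-op loop below computes, expressed as a plain C
compare-and-swap loop with fetch_or as the example; the cmpxchg() here
stands in for the wsr-to-SCOMPARE1 plus s32c1i pair:

/* Sketch, not a replacement for the asm below. */
static inline int atomic_fetch_or_sketch(int i, atomic_t *v)
{
	int old, got;

	do {
		old = atomic_read(v);				/* l32i         */
		got = cmpxchg(&v->counter, old, old | i);	/* wsr + s32c1i */
	} while (got != old);					/* bne ... 1b   */

	return old;	/* value _before_ the modification */
}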


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/xtensa/include/asm/atomic.h |   54 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 4 deletions(-)

--- a/arch/xtensa/include/asm/atomic.h
+++ b/arch/xtensa/include/asm/atomic.h
@@ -98,6 +98,26 @@ static inline int atomic_##op##_return(i
 	return result;							\
 }
 
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t * v)		\
+{									\
+	unsigned long tmp;						\
+	int result;							\
+									\
+	__asm__ __volatile__(						\
+			"1:     l32i    %1, %3, 0\n"			\
+			"       wsr     %1, scompare1\n"		\
+			"       " #op " %0, %1, %2\n"			\
+			"       s32c1i  %0, %3, 0\n"			\
+			"       bne     %0, %1, 1b\n"			\
+			: "=&a" (result), "=&a" (tmp)			\
+			: "a" (i), "a" (v)				\
+			: "memory"					\
+			);						\
+									\
+	return result;							\
+}
+
 #else /* XCHAL_HAVE_S32C1I */
 
 #define ATOMIC_OP(op)							\
@@ -138,18 +158,44 @@ static inline int atomic_##op##_return(i
 	return vval;							\
 }
 
+#define ATOMIC_FETCH_OP(op)						\
+static inline int atomic_fetch_##op(int i, atomic_t * v)		\
+{									\
+	unsigned int tmp, vval;						\
+									\
+	__asm__ __volatile__(						\
+			"       rsil    a15,"__stringify(TOPLEVEL)"\n"	\
+			"       l32i    %0, %3, 0\n"			\
+			"       " #op " %1, %0, %2\n"			\
+			"       s32i    %1, %3, 0\n"			\
+			"       wsr     a15, ps\n"			\
+			"       rsync\n"				\
+			: "=&a" (vval), "=&a" (tmp)			\
+			: "a" (i), "a" (v)				\
+			: "a15", "memory"				\
+			);						\
+									\
+	return vval;							\
+}
+
 #endif /* XCHAL_HAVE_S32C1I */
 
-#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op) ATOMIC_OP_RETURN(op)
 
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
-ATOMIC_OP(and)
-ATOMIC_OP(or)
-ATOMIC_OP(xor)
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
+
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_OPS(and)
+ATOMIC_OPS(or)
+ATOMIC_OPS(xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 25/31] locking: Fix atomic64_relaxed bits
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (23 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 24/31] locking,xtensa: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 26/31] locking: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic64-relaxed-fix.patch --]
[-- Type: text/plain, Size: 10246 bytes --]

We should only expand the atomic64 relaxed bits once we've included
all relevant headers, so move them down until after we (potentially)
include asm-generic/atomic64.h.

In practice this has made no difference so far, since the generic
bits do not define _relaxed versions.
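
A hypothetical illustration of the ordering hazard being avoided: if the
fallback block is expanded before the header that could provide a
_relaxed implementation, the #ifndef test cannot see it and the real
implementation would never get wired up:

/* Hypothetical -- evaluated this early, the test below can never see a
 * _relaxed version that asm-generic/atomic64.h might provide later on. */
#ifndef atomic64_add_return_relaxed
#define atomic64_add_return_relaxed	atomic64_add_return
#define atomic64_add_return_acquire	atomic64_add_return
#define atomic64_add_return_release	atomic64_add_return
#endif

#ifdef CONFIG_GENERIC_ATOMIC64
#include <asm-generic/atomic64.h>	/* too late to be noticed above */
#endif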

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/atomic.h |  306 ++++++++++++++++++++++++-------------------------
 1 file changed, 153 insertions(+), 153 deletions(-)

--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -211,159 +211,6 @@
 #endif
 #endif /* atomic_cmpxchg_relaxed */
 
-#ifndef atomic64_read_acquire
-#define  atomic64_read_acquire(v)	smp_load_acquire(&(v)->counter)
-#endif
-
-#ifndef atomic64_set_release
-#define  atomic64_set_release(v, i)	smp_store_release(&(v)->counter, (i))
-#endif
-
-/* atomic64_add_return_relaxed */
-#ifndef atomic64_add_return_relaxed
-#define  atomic64_add_return_relaxed	atomic64_add_return
-#define  atomic64_add_return_acquire	atomic64_add_return
-#define  atomic64_add_return_release	atomic64_add_return
-
-#else /* atomic64_add_return_relaxed */
-
-#ifndef atomic64_add_return_acquire
-#define  atomic64_add_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_add_return_release
-#define  atomic64_add_return_release(...)				\
-	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_add_return
-#define  atomic64_add_return(...)					\
-	__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_add_return_relaxed */
-
-/* atomic64_inc_return_relaxed */
-#ifndef atomic64_inc_return_relaxed
-#define  atomic64_inc_return_relaxed	atomic64_inc_return
-#define  atomic64_inc_return_acquire	atomic64_inc_return
-#define  atomic64_inc_return_release	atomic64_inc_return
-
-#else /* atomic64_inc_return_relaxed */
-
-#ifndef atomic64_inc_return_acquire
-#define  atomic64_inc_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return_release
-#define  atomic64_inc_return_release(...)				\
-	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_inc_return
-#define  atomic64_inc_return(...)					\
-	__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_inc_return_relaxed */
-
-
-/* atomic64_sub_return_relaxed */
-#ifndef atomic64_sub_return_relaxed
-#define  atomic64_sub_return_relaxed	atomic64_sub_return
-#define  atomic64_sub_return_acquire	atomic64_sub_return
-#define  atomic64_sub_return_release	atomic64_sub_return
-
-#else /* atomic64_sub_return_relaxed */
-
-#ifndef atomic64_sub_return_acquire
-#define  atomic64_sub_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return_release
-#define  atomic64_sub_return_release(...)				\
-	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_sub_return
-#define  atomic64_sub_return(...)					\
-	__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_sub_return_relaxed */
-
-/* atomic64_dec_return_relaxed */
-#ifndef atomic64_dec_return_relaxed
-#define  atomic64_dec_return_relaxed	atomic64_dec_return
-#define  atomic64_dec_return_acquire	atomic64_dec_return
-#define  atomic64_dec_return_release	atomic64_dec_return
-
-#else /* atomic64_dec_return_relaxed */
-
-#ifndef atomic64_dec_return_acquire
-#define  atomic64_dec_return_acquire(...)				\
-	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return_release
-#define  atomic64_dec_return_release(...)				\
-	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_dec_return
-#define  atomic64_dec_return(...)					\
-	__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
-#endif
-#endif /* atomic64_dec_return_relaxed */
-
-/* atomic64_xchg_relaxed */
-#ifndef atomic64_xchg_relaxed
-#define  atomic64_xchg_relaxed		atomic64_xchg
-#define  atomic64_xchg_acquire		atomic64_xchg
-#define  atomic64_xchg_release		atomic64_xchg
-
-#else /* atomic64_xchg_relaxed */
-
-#ifndef atomic64_xchg_acquire
-#define  atomic64_xchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_xchg_release
-#define  atomic64_xchg_release(...)					\
-	__atomic_op_release(atomic64_xchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_xchg
-#define  atomic64_xchg(...)						\
-	__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
-#endif
-#endif /* atomic64_xchg_relaxed */
-
-/* atomic64_cmpxchg_relaxed */
-#ifndef atomic64_cmpxchg_relaxed
-#define  atomic64_cmpxchg_relaxed	atomic64_cmpxchg
-#define  atomic64_cmpxchg_acquire	atomic64_cmpxchg
-#define  atomic64_cmpxchg_release	atomic64_cmpxchg
-
-#else /* atomic64_cmpxchg_relaxed */
-
-#ifndef atomic64_cmpxchg_acquire
-#define  atomic64_cmpxchg_acquire(...)					\
-	__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg_release
-#define  atomic64_cmpxchg_release(...)					\
-	__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef atomic64_cmpxchg
-#define  atomic64_cmpxchg(...)						\
-	__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
-#endif
-#endif /* atomic64_cmpxchg_relaxed */
-
 /* cmpxchg_relaxed */
 #ifndef cmpxchg_relaxed
 #define  cmpxchg_relaxed		cmpxchg
@@ -583,6 +430,159 @@ static inline int atomic_fetch_or(atomic
 #include <asm-generic/atomic64.h>
 #endif
 
+#ifndef atomic64_read_acquire
+#define  atomic64_read_acquire(v)	smp_load_acquire(&(v)->counter)
+#endif
+
+#ifndef atomic64_set_release
+#define  atomic64_set_release(v, i)	smp_store_release(&(v)->counter, (i))
+#endif
+
+/* atomic64_add_return_relaxed */
+#ifndef atomic64_add_return_relaxed
+#define  atomic64_add_return_relaxed	atomic64_add_return
+#define  atomic64_add_return_acquire	atomic64_add_return
+#define  atomic64_add_return_release	atomic64_add_return
+
+#else /* atomic64_add_return_relaxed */
+
+#ifndef atomic64_add_return_acquire
+#define  atomic64_add_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_add_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_add_return_release
+#define  atomic64_add_return_release(...)				\
+	__atomic_op_release(atomic64_add_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_add_return
+#define  atomic64_add_return(...)					\
+	__atomic_op_fence(atomic64_add_return, __VA_ARGS__)
+#endif
+#endif /* atomic64_add_return_relaxed */
+
+/* atomic64_inc_return_relaxed */
+#ifndef atomic64_inc_return_relaxed
+#define  atomic64_inc_return_relaxed	atomic64_inc_return
+#define  atomic64_inc_return_acquire	atomic64_inc_return
+#define  atomic64_inc_return_release	atomic64_inc_return
+
+#else /* atomic64_inc_return_relaxed */
+
+#ifndef atomic64_inc_return_acquire
+#define  atomic64_inc_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_inc_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_inc_return_release
+#define  atomic64_inc_return_release(...)				\
+	__atomic_op_release(atomic64_inc_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_inc_return
+#define  atomic64_inc_return(...)					\
+	__atomic_op_fence(atomic64_inc_return, __VA_ARGS__)
+#endif
+#endif /* atomic64_inc_return_relaxed */
+
+
+/* atomic64_sub_return_relaxed */
+#ifndef atomic64_sub_return_relaxed
+#define  atomic64_sub_return_relaxed	atomic64_sub_return
+#define  atomic64_sub_return_acquire	atomic64_sub_return
+#define  atomic64_sub_return_release	atomic64_sub_return
+
+#else /* atomic64_sub_return_relaxed */
+
+#ifndef atomic64_sub_return_acquire
+#define  atomic64_sub_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_sub_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_sub_return_release
+#define  atomic64_sub_return_release(...)				\
+	__atomic_op_release(atomic64_sub_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_sub_return
+#define  atomic64_sub_return(...)					\
+	__atomic_op_fence(atomic64_sub_return, __VA_ARGS__)
+#endif
+#endif /* atomic64_sub_return_relaxed */
+
+/* atomic64_dec_return_relaxed */
+#ifndef atomic64_dec_return_relaxed
+#define  atomic64_dec_return_relaxed	atomic64_dec_return
+#define  atomic64_dec_return_acquire	atomic64_dec_return
+#define  atomic64_dec_return_release	atomic64_dec_return
+
+#else /* atomic64_dec_return_relaxed */
+
+#ifndef atomic64_dec_return_acquire
+#define  atomic64_dec_return_acquire(...)				\
+	__atomic_op_acquire(atomic64_dec_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_dec_return_release
+#define  atomic64_dec_return_release(...)				\
+	__atomic_op_release(atomic64_dec_return, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_dec_return
+#define  atomic64_dec_return(...)					\
+	__atomic_op_fence(atomic64_dec_return, __VA_ARGS__)
+#endif
+#endif /* atomic64_dec_return_relaxed */
+
+/* atomic64_xchg_relaxed */
+#ifndef atomic64_xchg_relaxed
+#define  atomic64_xchg_relaxed		atomic64_xchg
+#define  atomic64_xchg_acquire		atomic64_xchg
+#define  atomic64_xchg_release		atomic64_xchg
+
+#else /* atomic64_xchg_relaxed */
+
+#ifndef atomic64_xchg_acquire
+#define  atomic64_xchg_acquire(...)					\
+	__atomic_op_acquire(atomic64_xchg, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_xchg_release
+#define  atomic64_xchg_release(...)					\
+	__atomic_op_release(atomic64_xchg, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_xchg
+#define  atomic64_xchg(...)						\
+	__atomic_op_fence(atomic64_xchg, __VA_ARGS__)
+#endif
+#endif /* atomic64_xchg_relaxed */
+
+/* atomic64_cmpxchg_relaxed */
+#ifndef atomic64_cmpxchg_relaxed
+#define  atomic64_cmpxchg_relaxed	atomic64_cmpxchg
+#define  atomic64_cmpxchg_acquire	atomic64_cmpxchg
+#define  atomic64_cmpxchg_release	atomic64_cmpxchg
+
+#else /* atomic64_cmpxchg_relaxed */
+
+#ifndef atomic64_cmpxchg_acquire
+#define  atomic64_cmpxchg_acquire(...)					\
+	__atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_cmpxchg_release
+#define  atomic64_cmpxchg_release(...)					\
+	__atomic_op_release(atomic64_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_cmpxchg
+#define  atomic64_cmpxchg(...)						\
+	__atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__)
+#endif
+#endif /* atomic64_cmpxchg_relaxed */
+
 #ifndef atomic64_andnot
 static inline void atomic64_andnot(long long i, atomic64_t *v)
 {

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 26/31] locking: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (24 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 25/31] locking: Fix atomic64_relaxed bits Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 27/31] locking: Remove linux/atomic.h:atomic_fetch_or Peter Zijlstra
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch-generic.patch --]
[-- Type: text/plain, Size: 18478 bytes --]

Now that all architectures have implemented support for these new
atomic primitives, add the generic infrastructure to expose and use
them.
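
For reference, the _acquire/_release and fully ordered variants are
derived from the _relaxed ones by the pre-existing helpers in
<linux/atomic.h>; from memory they look roughly like this (see the
header for the authoritative definitions):

#define __atomic_op_acquire(op, args...)				\
({									\
	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
	smp_mb__after_atomic();						\
	__ret;								\
})

#define __atomic_op_release(op, args...)				\
({									\
	smp_mb__before_atomic();					\
	op##_relaxed(args);						\
})

#define __atomic_op_fence(op, args...)					\
({									\
	typeof(op##_relaxed(args)) __ret;				\
	smp_mb__before_atomic();					\
	__ret = op##_relaxed(args);					\
	smp_mb__after_atomic();						\
	__ret;								\
})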

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/atomic-long.h |   36 +++-
 include/asm-generic/atomic.h      |   49 +++++
 include/asm-generic/atomic64.h    |   15 +
 include/linux/atomic.h            |  336 ++++++++++++++++++++++++++++++++++++++
 lib/atomic64.c                    |   32 +++
 lib/atomic64_test.c               |   34 +++
 6 files changed, 493 insertions(+), 9 deletions(-)

--- a/include/asm-generic/atomic-long.h
+++ b/include/asm-generic/atomic-long.h
@@ -112,6 +112,40 @@ static __always_inline void atomic_long_
 	ATOMIC_LONG_PFX(_dec)(v);
 }
 
+#define ATOMIC_LONG_FETCH_OP(op, mo)					\
+static inline long							\
+atomic_long_fetch_##op##mo(long i, atomic_long_t *l)			\
+{									\
+	ATOMIC_LONG_PFX(_t) *v = (ATOMIC_LONG_PFX(_t) *)l;		\
+									\
+	return (long)ATOMIC_LONG_PFX(_fetch_##op##mo)(i, v);		\
+}
+
+ATOMIC_LONG_FETCH_OP(add, )
+ATOMIC_LONG_FETCH_OP(add, _relaxed)
+ATOMIC_LONG_FETCH_OP(add, _acquire)
+ATOMIC_LONG_FETCH_OP(add, _release)
+ATOMIC_LONG_FETCH_OP(sub, )
+ATOMIC_LONG_FETCH_OP(sub, _relaxed)
+ATOMIC_LONG_FETCH_OP(sub, _acquire)
+ATOMIC_LONG_FETCH_OP(sub, _release)
+ATOMIC_LONG_FETCH_OP(and, )
+ATOMIC_LONG_FETCH_OP(and, _relaxed)
+ATOMIC_LONG_FETCH_OP(and, _acquire)
+ATOMIC_LONG_FETCH_OP(and, _release)
+ATOMIC_LONG_FETCH_OP(andnot, )
+ATOMIC_LONG_FETCH_OP(andnot, _relaxed)
+ATOMIC_LONG_FETCH_OP(andnot, _acquire)
+ATOMIC_LONG_FETCH_OP(andnot, _release)
+ATOMIC_LONG_FETCH_OP(or, )
+ATOMIC_LONG_FETCH_OP(or, _relaxed)
+ATOMIC_LONG_FETCH_OP(or, _acquire)
+ATOMIC_LONG_FETCH_OP(or, _release)
+ATOMIC_LONG_FETCH_OP(xor, )
+ATOMIC_LONG_FETCH_OP(xor, _relaxed)
+ATOMIC_LONG_FETCH_OP(xor, _acquire)
+ATOMIC_LONG_FETCH_OP(xor, _release)
+
 #define ATOMIC_LONG_OP(op)						\
 static __always_inline void						\
 atomic_long_##op(long i, atomic_long_t *l)				\
@@ -124,9 +158,9 @@ atomic_long_##op(long i, atomic_long_t *
 ATOMIC_LONG_OP(add)
 ATOMIC_LONG_OP(sub)
 ATOMIC_LONG_OP(and)
+ATOMIC_LONG_OP(andnot)
 ATOMIC_LONG_OP(or)
 ATOMIC_LONG_OP(xor)
-ATOMIC_LONG_OP(andnot)
 
 #undef ATOMIC_LONG_OP
 
--- a/include/asm-generic/atomic.h
+++ b/include/asm-generic/atomic.h
@@ -61,6 +61,18 @@ static inline int atomic_##op##_return(i
 	return c c_op i;						\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	int c, old;							\
+									\
+	c = v->counter;							\
+	while ((old = cmpxchg(&v->counter, c, c c_op i)) != c)		\
+		c = old;						\
+									\
+	return c;							\
+}
+
 #else
 
 #include <linux/irqflags.h>
@@ -88,6 +100,20 @@ static inline int atomic_##op##_return(i
 	return ret;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long flags;						\
+	int ret;							\
+									\
+	raw_local_irq_save(flags);					\
+	ret = v->counter;						\
+	v->counter = v->counter c_op i;					\
+	raw_local_irq_restore(flags);					\
+									\
+	return ret;							\
+}
+
 #endif /* CONFIG_SMP */
 
 #ifndef atomic_add_return
@@ -98,6 +124,28 @@ ATOMIC_OP_RETURN(add, +)
 ATOMIC_OP_RETURN(sub, -)
 #endif
 
+#ifndef atomic_fetch_add
+ATOMIC_FETCH_OP(add, +)
+#endif
+
+#ifndef atomic_fetch_sub
+ATOMIC_FETCH_OP(sub, -)
+#endif
+
+#ifndef atomic_fetch_and
+ATOMIC_FETCH_OP(and, &)
+#endif
+
+#ifndef atomic_fetch_or
+#define atomic_fetch_or atomic_fetch_or
+
+ATOMIC_FETCH_OP(or, |)
+#endif
+
+#ifndef atomic_fetch_xor
+ATOMIC_FETCH_OP(xor, ^)
+#endif
+
 #ifndef atomic_and
 ATOMIC_OP(and, &)
 #endif
@@ -110,6 +158,7 @@ ATOMIC_OP(or, |)
 ATOMIC_OP(xor, ^)
 #endif
 
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
--- a/include/asm-generic/atomic64.h
+++ b/include/asm-generic/atomic64.h
@@ -27,16 +27,23 @@ extern void	 atomic64_##op(long long a,
 #define ATOMIC64_OP_RETURN(op)						\
 extern long long atomic64_##op##_return(long long a, atomic64_t *v);
 
-#define ATOMIC64_OPS(op)	ATOMIC64_OP(op) ATOMIC64_OP_RETURN(op)
+#define ATOMIC64_FETCH_OP(op)						\
+extern long long atomic64_fetch_##op(long long a, atomic64_t *v);
+
+#define ATOMIC64_OPS(op)	ATOMIC64_OP(op) ATOMIC64_OP_RETURN(op) ATOMIC64_FETCH_OP(op)
 
 ATOMIC64_OPS(add)
 ATOMIC64_OPS(sub)
 
-ATOMIC64_OP(and)
-ATOMIC64_OP(or)
-ATOMIC64_OP(xor)
+#undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op)	ATOMIC64_OP(op) ATOMIC64_FETCH_OP(op)
+
+ATOMIC64_OPS(and)
+ATOMIC64_OPS(or)
+ATOMIC64_OPS(xor)
 
 #undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -163,6 +163,154 @@
 #endif
 #endif /* atomic_dec_return_relaxed */
 
+
+/* atomic_fetch_add_relaxed */
+#ifndef atomic_fetch_add_relaxed
+#define atomic_fetch_add_relaxed	atomic_fetch_add
+#define atomic_fetch_add_acquire	atomic_fetch_add
+#define atomic_fetch_add_release	atomic_fetch_add
+
+#else /* atomic_fetch_add_relaxed */
+
+#ifndef atomic_fetch_add_acquire
+#define atomic_fetch_add_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_add, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_add_release
+#define atomic_fetch_add_release(...)					\
+	__atomic_op_release(atomic_fetch_add, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_add
+#define atomic_fetch_add(...)						\
+	__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_add_relaxed */
+
+/* atomic_fetch_sub_relaxed */
+#ifndef atomic_fetch_sub_relaxed
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub
+#define atomic_fetch_sub_acquire	atomic_fetch_sub
+#define atomic_fetch_sub_release	atomic_fetch_sub
+
+#else /* atomic_fetch_sub_relaxed */
+
+#ifndef atomic_fetch_sub_acquire
+#define atomic_fetch_sub_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_sub_release
+#define atomic_fetch_sub_release(...)					\
+	__atomic_op_release(atomic_fetch_sub, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_sub
+#define atomic_fetch_sub(...)						\
+	__atomic_op_fence(atomic_fetch_sub, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_sub_relaxed */
+
+/* atomic_fetch_or_relaxed */
+#ifndef atomic_fetch_or_relaxed
+#define atomic_fetch_or_relaxed	atomic_fetch_or
+#define atomic_fetch_or_acquire	atomic_fetch_or
+#define atomic_fetch_or_release	atomic_fetch_or
+
+#else /* atomic_fetch_or_relaxed */
+
+#ifndef atomic_fetch_or_acquire
+#define atomic_fetch_or_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_or, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_or_release
+#define atomic_fetch_or_release(...)					\
+	__atomic_op_release(atomic_fetch_or, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_or
+#define atomic_fetch_or(...)						\
+	__atomic_op_fence(atomic_fetch_or, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_or_relaxed */
+
+/* atomic_fetch_and_relaxed */
+#ifndef atomic_fetch_and_relaxed
+#define atomic_fetch_and_relaxed	atomic_fetch_and
+#define atomic_fetch_and_acquire	atomic_fetch_and
+#define atomic_fetch_and_release	atomic_fetch_and
+
+#else /* atomic_fetch_and_relaxed */
+
+#ifndef atomic_fetch_and_acquire
+#define atomic_fetch_and_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_and, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_and_release
+#define atomic_fetch_and_release(...)					\
+	__atomic_op_release(atomic_fetch_and, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_and
+#define atomic_fetch_and(...)						\
+	__atomic_op_fence(atomic_fetch_and, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_and_relaxed */
+
+#ifdef atomic_andnot
+/* atomic_fetch_andnot_relaxed */
+#ifndef atomic_fetch_andnot_relaxed
+#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot
+#define atomic_fetch_andnot_acquire	atomic_fetch_andnot
+#define atomic_fetch_andnot_release	atomic_fetch_andnot
+
+#else /* atomic_fetch_andnot_relaxed */
+
+#ifndef atomic_fetch_andnot_acquire
+#define atomic_fetch_andnot_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_andnot_release
+#define atomic_fetch_andnot_release(...)					\
+	__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_andnot
+#define atomic_fetch_andnot(...)						\
+	__atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_andnot_relaxed */
+#endif /* atomic_andnot */
+
+/* atomic_fetch_xor_relaxed */
+#ifndef atomic_fetch_xor_relaxed
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor
+#define atomic_fetch_xor_acquire	atomic_fetch_xor
+#define atomic_fetch_xor_release	atomic_fetch_xor
+
+#else /* atomic_fetch_xor_relaxed */
+
+#ifndef atomic_fetch_xor_acquire
+#define atomic_fetch_xor_acquire(...)					\
+	__atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_xor_release
+#define atomic_fetch_xor_release(...)					\
+	__atomic_op_release(atomic_fetch_xor, __VA_ARGS__)
+#endif
+
+#ifndef atomic_fetch_xor
+#define atomic_fetch_xor(...)						\
+	__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__)
+#endif
+#endif /* atomic_fetch_xor_relaxed */
+
+
 /* atomic_xchg_relaxed */
 #ifndef atomic_xchg_relaxed
 #define  atomic_xchg_relaxed		atomic_xchg
@@ -310,6 +458,26 @@ static inline void atomic_andnot(int i,
 {
 	atomic_and(~i, v);
 }
+
+static inline int atomic_fetch_andnot(int i, atomic_t *v)
+{
+	return atomic_fetch_and(~i, v);
+}
+
+static inline int atomic_fetch_andnot_relaxed(int i, atomic_t *v)
+{
+	return atomic_fetch_and_relaxed(~i, v);
+}
+
+static inline int atomic_fetch_andnot_acquire(int i, atomic_t *v)
+{
+	return atomic_fetch_and_acquire(~i, v);
+}
+
+static inline int atomic_fetch_andnot_release(int i, atomic_t *v)
+{
+	return atomic_fetch_and_release(~i, v);
+}
 #endif
 
 static inline __deprecated void atomic_clear_mask(unsigned int mask, atomic_t *v)
@@ -535,6 +703,154 @@ static inline int atomic_fetch_or(atomic
 #endif
 #endif /* atomic64_dec_return_relaxed */
 
+
+/* atomic64_fetch_add_relaxed */
+#ifndef atomic64_fetch_add_relaxed
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add
+#define atomic64_fetch_add_acquire	atomic64_fetch_add
+#define atomic64_fetch_add_release	atomic64_fetch_add
+
+#else /* atomic64_fetch_add_relaxed */
+
+#ifndef atomic64_fetch_add_acquire
+#define atomic64_fetch_add_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_add_release
+#define atomic64_fetch_add_release(...)					\
+	__atomic_op_release(atomic64_fetch_add, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_add
+#define atomic64_fetch_add(...)						\
+	__atomic_op_fence(atomic64_fetch_add, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_add_relaxed */
+
+/* atomic64_fetch_sub_relaxed */
+#ifndef atomic64_fetch_sub_relaxed
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub
+#define atomic64_fetch_sub_acquire	atomic64_fetch_sub
+#define atomic64_fetch_sub_release	atomic64_fetch_sub
+
+#else /* atomic64_fetch_sub_relaxed */
+
+#ifndef atomic64_fetch_sub_acquire
+#define atomic64_fetch_sub_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_sub_release
+#define atomic64_fetch_sub_release(...)					\
+	__atomic_op_release(atomic64_fetch_sub, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_sub
+#define atomic64_fetch_sub(...)						\
+	__atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_sub_relaxed */
+
+/* atomic64_fetch_or_relaxed */
+#ifndef atomic64_fetch_or_relaxed
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or
+#define atomic64_fetch_or_acquire	atomic64_fetch_or
+#define atomic64_fetch_or_release	atomic64_fetch_or
+
+#else /* atomic64_fetch_or_relaxed */
+
+#ifndef atomic64_fetch_or_acquire
+#define atomic64_fetch_or_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_or_release
+#define atomic64_fetch_or_release(...)					\
+	__atomic_op_release(atomic64_fetch_or, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_or
+#define atomic64_fetch_or(...)						\
+	__atomic_op_fence(atomic64_fetch_or, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_or_relaxed */
+
+/* atomic64_fetch_and_relaxed */
+#ifndef atomic64_fetch_and_relaxed
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and
+#define atomic64_fetch_and_acquire	atomic64_fetch_and
+#define atomic64_fetch_and_release	atomic64_fetch_and
+
+#else /* atomic64_fetch_and_relaxed */
+
+#ifndef atomic64_fetch_and_acquire
+#define atomic64_fetch_and_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_and_release
+#define atomic64_fetch_and_release(...)					\
+	__atomic_op_release(atomic64_fetch_and, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_and
+#define atomic64_fetch_and(...)						\
+	__atomic_op_fence(atomic64_fetch_and, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_and_relaxed */
+
+#ifdef atomic64_andnot
+/* atomic64_fetch_andnot_relaxed */
+#ifndef atomic64_fetch_andnot_relaxed
+#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot
+#define atomic64_fetch_andnot_acquire	atomic64_fetch_andnot
+#define atomic64_fetch_andnot_release	atomic64_fetch_andnot
+
+#else /* atomic64_fetch_andnot_relaxed */
+
+#ifndef atomic64_fetch_andnot_acquire
+#define atomic64_fetch_andnot_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_andnot_release
+#define atomic64_fetch_andnot_release(...)					\
+	__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_andnot
+#define atomic64_fetch_andnot(...)						\
+	__atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_andnot_relaxed */
+#endif /* atomic64_andnot */
+
+/* atomic64_fetch_xor_relaxed */
+#ifndef atomic64_fetch_xor_relaxed
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor
+#define atomic64_fetch_xor_acquire	atomic64_fetch_xor
+#define atomic64_fetch_xor_release	atomic64_fetch_xor
+
+#else /* atomic64_fetch_xor_relaxed */
+
+#ifndef atomic64_fetch_xor_acquire
+#define atomic64_fetch_xor_acquire(...)					\
+	__atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_xor_release
+#define atomic64_fetch_xor_release(...)					\
+	__atomic_op_release(atomic64_fetch_xor, __VA_ARGS__)
+#endif
+
+#ifndef atomic64_fetch_xor
+#define atomic64_fetch_xor(...)						\
+	__atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__)
+#endif
+#endif /* atomic64_fetch_xor_relaxed */
+
+
 /* atomic64_xchg_relaxed */
 #ifndef atomic64_xchg_relaxed
 #define  atomic64_xchg_relaxed		atomic64_xchg
@@ -588,6 +904,26 @@ static inline void atomic64_andnot(long
 {
 	atomic64_and(~i, v);
 }
+
+static inline long long atomic64_fetch_andnot(long long i, atomic64_t *v)
+{
+	return atomic64_fetch_and(~i, v);
+}
+
+static inline long long atomic64_fetch_andnot_relaxed(long long i, atomic64_t *v)
+{
+	return atomic64_fetch_and_relaxed(~i, v);
+}
+
+static inline long long atomic64_fetch_andnot_acquire(long long i, atomic64_t *v)
+{
+	return atomic64_fetch_and_acquire(~i, v);
+}
+
+static inline long long atomic64_fetch_andnot_release(long long i, atomic64_t *v)
+{
+	return atomic64_fetch_and_release(~i, v);
+}
 #endif
 
 #include <asm-generic/atomic-long.h>
--- a/lib/atomic64.c
+++ b/lib/atomic64.c
@@ -96,17 +96,41 @@ long long atomic64_##op##_return(long lo
 }									\
 EXPORT_SYMBOL(atomic64_##op##_return);
 
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+long long atomic64_fetch_##op(long long a, atomic64_t *v)		\
+{									\
+	unsigned long flags;						\
+	raw_spinlock_t *lock = lock_addr(v);				\
+	long long val;							\
+									\
+	raw_spin_lock_irqsave(lock, flags);				\
+	val = v->counter;						\
+	v->counter c_op a;						\
+	raw_spin_unlock_irqrestore(lock, flags);			\
+	return val;							\
+}									\
+EXPORT_SYMBOL(atomic64_fetch_##op);
+
 #define ATOMIC64_OPS(op, c_op)						\
 	ATOMIC64_OP(op, c_op)						\
-	ATOMIC64_OP_RETURN(op, c_op)
+	ATOMIC64_OP_RETURN(op, c_op)					\
+	ATOMIC64_FETCH_OP(op, c_op)
 
 ATOMIC64_OPS(add, +=)
 ATOMIC64_OPS(sub, -=)
-ATOMIC64_OP(and, &=)
-ATOMIC64_OP(or, |=)
-ATOMIC64_OP(xor, ^=)
 
 #undef ATOMIC64_OPS
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op, c_op)						\
+	ATOMIC64_OP_RETURN(op, c_op)					\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &=)
+ATOMIC64_OPS(or, |=)
+ATOMIC64_OPS(xor, ^=)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 
--- a/lib/atomic64_test.c
+++ b/lib/atomic64_test.c
@@ -53,11 +53,25 @@ do {								\
 	BUG_ON(atomic##bit##_read(&v) != r);			\
 } while (0)
 
+#define TEST_FETCH(bit, op, c_op, val)				\
+do {								\
+	atomic##bit##_set(&v, v0);				\
+	r = v0;							\
+	r c_op val;						\
+	BUG_ON(atomic##bit##_##op(val, &v) != v0);		\
+	BUG_ON(atomic##bit##_read(&v) != r);			\
+} while (0)
+
 #define RETURN_FAMILY_TEST(bit, op, c_op, val)			\
 do {								\
 	FAMILY_TEST(TEST_RETURN, bit, op, c_op, val);		\
 } while (0)
 
+#define FETCH_FAMILY_TEST(bit, op, c_op, val)			\
+do {								\
+	FAMILY_TEST(TEST_FETCH, bit, op, c_op, val);		\
+} while (0)
+
 #define TEST_ARGS(bit, op, init, ret, expect, args...)		\
 do {								\
 	atomic##bit##_set(&v, init);				\
@@ -114,6 +128,16 @@ static __init void test_atomic(void)
 	RETURN_FAMILY_TEST(, sub_return, -=, onestwos);
 	RETURN_FAMILY_TEST(, sub_return, -=, -one);
 
+	FETCH_FAMILY_TEST(, fetch_add, +=, onestwos);
+	FETCH_FAMILY_TEST(, fetch_add, +=, -one);
+	FETCH_FAMILY_TEST(, fetch_sub, -=, onestwos);
+	FETCH_FAMILY_TEST(, fetch_sub, -=, -one);
+
+	FETCH_FAMILY_TEST(, fetch_or,  |=, v1);
+	FETCH_FAMILY_TEST(, fetch_and, &=, v1);
+	FETCH_FAMILY_TEST(, fetch_andnot, &= ~, v1);
+	FETCH_FAMILY_TEST(, fetch_xor, ^=, v1);
+
 	INC_RETURN_FAMILY_TEST(, v0);
 	DEC_RETURN_FAMILY_TEST(, v0);
 
@@ -154,6 +178,16 @@ static __init void test_atomic64(void)
 	RETURN_FAMILY_TEST(64, sub_return, -=, onestwos);
 	RETURN_FAMILY_TEST(64, sub_return, -=, -one);
 
+	FETCH_FAMILY_TEST(64, fetch_add, +=, onestwos);
+	FETCH_FAMILY_TEST(64, fetch_add, +=, -one);
+	FETCH_FAMILY_TEST(64, fetch_sub, -=, onestwos);
+	FETCH_FAMILY_TEST(64, fetch_sub, -=, -one);
+
+	FETCH_FAMILY_TEST(64, fetch_or,  |=, v1);
+	FETCH_FAMILY_TEST(64, fetch_and, &=, v1);
+	FETCH_FAMILY_TEST(64, fetch_andnot, &= ~, v1);
+	FETCH_FAMILY_TEST(64, fetch_xor, ^=, v1);
+
 	INIT(v0);
 	atomic64_inc(&v);
 	r += one;

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 27/31] locking: Remove linux/atomic.h:atomic_fetch_or
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (25 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 26/31] locking: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22 13:02   ` Will Deacon
  2016-04-22  9:04 ` [RFC][PATCH 28/31] locking: Remove the deprecated atomic_{set,clear}_mask() functions Peter Zijlstra
                   ` (4 subsequent siblings)
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-fetch_or-kill.patch --]
[-- Type: text/plain, Size: 8679 bytes --]

Since all architectures now implement this natively, remove the
now-dead generic fallback and the per-architecture markers that
guarded it.
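
As a reading aid (not part of the patch): the per-architecture

	#define atomic_fetch_or atomic_fetch_or

markers deleted below existed only so the generic fallback, also removed
here, could tell whether an architecture already provides the function:

/* Detection idiom the markers served (sketch): */
#ifndef atomic_fetch_or		/* no arch marker -> supply a fallback */
static inline int atomic_fetch_or(int mask, atomic_t *p)
{
	int old, val = atomic_read(p);

	while ((old = atomic_cmpxchg(p, val, val | mask)) != val)
		val = old;

	return old;
}
#endif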





Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/alpha/include/asm/atomic.h    |    2 --
 arch/arc/include/asm/atomic.h      |    2 --
 arch/arm/include/asm/atomic.h      |    2 --
 arch/arm64/include/asm/atomic.h    |    2 --
 arch/avr32/include/asm/atomic.h    |    2 --
 arch/frv/include/asm/atomic.h      |    2 --
 arch/h8300/include/asm/atomic.h    |    2 --
 arch/hexagon/include/asm/atomic.h  |    2 --
 arch/m32r/include/asm/atomic.h     |    2 --
 arch/m68k/include/asm/atomic.h     |    2 --
 arch/metag/include/asm/atomic.h    |    2 --
 arch/mips/include/asm/atomic.h     |    2 --
 arch/mn10300/include/asm/atomic.h  |    2 --
 arch/parisc/include/asm/atomic.h   |    2 --
 arch/s390/include/asm/atomic.h     |    2 --
 arch/sh/include/asm/atomic.h       |    2 --
 arch/sparc/include/asm/atomic.h    |    1 -
 arch/sparc/include/asm/atomic_32.h |    2 --
 arch/tile/include/asm/atomic.h     |    2 --
 arch/x86/include/asm/atomic.h      |    2 --
 arch/xtensa/include/asm/atomic.h   |    2 --
 include/asm-generic/atomic.h       |    2 --
 include/linux/atomic.h             |   21 ---------------------
 23 files changed, 64 deletions(-)

--- a/arch/alpha/include/asm/atomic.h
+++ b/arch/alpha/include/asm/atomic.h
@@ -155,8 +155,6 @@ ATOMIC_OPS(sub)
 #define atomic_andnot atomic_andnot
 #define atomic64_andnot atomic64_andnot
 
-#define atomic_fetch_or atomic_fetch_or
-
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op, asm)						\
 	ATOMIC_OP(op, asm)						\
--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -226,8 +226,6 @@ ATOMIC_OPS(sub, -=, sub)
 
 #define atomic_andnot atomic_andnot
 
-#define atomic_fetch_or atomic_fetch_or
-
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
--- a/arch/arm/include/asm/atomic.h
+++ b/arch/arm/include/asm/atomic.h
@@ -237,8 +237,6 @@ ATOMIC_OPS(sub, -=, sub)
 
 #define atomic_andnot atomic_andnot
 
-#define atomic_fetch_or atomic_fetch_or
-
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
--- a/arch/arm64/include/asm/atomic.h
+++ b/arch/arm64/include/asm/atomic.h
@@ -128,8 +128,6 @@
 #define __atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
 #define atomic_andnot			atomic_andnot
 
-#define atomic_fetch_or atomic_fetch_or
-
 /*
  * 64-bit atomic operations.
  */
--- a/arch/avr32/include/asm/atomic.h
+++ b/arch/avr32/include/asm/atomic.h
@@ -66,8 +66,6 @@ ATOMIC_OP_RETURN(add, add, r)
 ATOMIC_FETCH_OP (sub, sub, rKs21)
 ATOMIC_FETCH_OP (add, add, r)
 
-#define atomic_fetch_or atomic_fetch_or
-
 #define ATOMIC_OPS(op, asm_op)						\
 ATOMIC_OP_RETURN(op, asm_op, r)						\
 static inline void atomic_##op(int i, atomic_t *v)			\
--- a/arch/frv/include/asm/atomic.h
+++ b/arch/frv/include/asm/atomic.h
@@ -74,8 +74,6 @@ static inline void atomic_dec(atomic_t *
 #define atomic_dec_and_test(v)		(atomic_sub_return(1, (v)) == 0)
 #define atomic_inc_and_test(v)		(atomic_add_return(1, (v)) == 0)
 
-#define atomic_fetch_or atomic_fetch_or
-
 /*
  * 64-bit atomic ops
  */
--- a/arch/h8300/include/asm/atomic.h
+++ b/arch/h8300/include/asm/atomic.h
@@ -54,8 +54,6 @@ static inline void atomic_##op(int i, at
 ATOMIC_OP_RETURN(add, +=)
 ATOMIC_OP_RETURN(sub, -=)
 
-#define atomic_fetch_or atomic_fetch_or
-
 #define ATOMIC_OPS(op, c_op)					\
 	ATOMIC_OP(op, c_op)					\
 	ATOMIC_FETCH_OP(op, c_op)
--- a/arch/hexagon/include/asm/atomic.h
+++ b/arch/hexagon/include/asm/atomic.h
@@ -152,8 +152,6 @@ ATOMIC_OPS(sub)
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and)
 ATOMIC_OPS(or)
 ATOMIC_OPS(xor)
--- a/arch/m32r/include/asm/atomic.h
+++ b/arch/m32r/include/asm/atomic.h
@@ -121,8 +121,6 @@ ATOMIC_OPS(sub)
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and)
 ATOMIC_OPS(or)
 ATOMIC_OPS(xor)
--- a/arch/m68k/include/asm/atomic.h
+++ b/arch/m68k/include/asm/atomic.h
@@ -119,8 +119,6 @@ ATOMIC_OPS(sub, -=, sub)
 	ATOMIC_OP(op, c_op, asm_op)					\
 	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, &=, and)
 ATOMIC_OPS(or, |=, or)
 ATOMIC_OPS(xor, ^=, eor)
--- a/arch/metag/include/asm/atomic.h
+++ b/arch/metag/include/asm/atomic.h
@@ -17,8 +17,6 @@
 #include <asm/atomic_lnkget.h>
 #endif
 
-#define atomic_fetch_or atomic_fetch_or
-
 #define atomic_add_negative(a, v)       (atomic_add_return((a), (v)) < 0)
 
 #define atomic_dec_return(v) atomic_sub_return(1, (v))
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -194,8 +194,6 @@ ATOMIC_OPS(sub, -=, subu)
 	ATOMIC_OP(op, c_op, asm_op)					      \
 	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, &=, and)
 ATOMIC_OPS(or, |=, or)
 ATOMIC_OPS(xor, ^=, xor)
--- a/arch/mn10300/include/asm/atomic.h
+++ b/arch/mn10300/include/asm/atomic.h
@@ -113,8 +113,6 @@ ATOMIC_OPS(sub)
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and)
 ATOMIC_OPS(or)
 ATOMIC_OPS(xor)
--- a/arch/parisc/include/asm/atomic.h
+++ b/arch/parisc/include/asm/atomic.h
@@ -148,8 +148,6 @@ ATOMIC_OPS(sub, -=)
 	ATOMIC_OP(op, c_op)						\
 	ATOMIC_FETCH_OP(op, c_op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, &=)
 ATOMIC_OPS(or, |=)
 ATOMIC_OPS(xor, ^=)
--- a/arch/s390/include/asm/atomic.h
+++ b/arch/s390/include/asm/atomic.h
@@ -135,8 +135,6 @@ static inline int atomic_fetch_##op(int
 	return __ATOMIC_LOOP(v, i, __ATOMIC_##OP, __ATOMIC_BARRIER);	\
 }
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, AND)
 ATOMIC_OPS(or, OR)
 ATOMIC_OPS(xor, XOR)
--- a/arch/sh/include/asm/atomic.h
+++ b/arch/sh/include/asm/atomic.h
@@ -25,8 +25,6 @@
 #include <asm/atomic-irq.h>
 #endif
 
-#define atomic_fetch_or atomic_fetch_or
-
 #define atomic_add_negative(a, v)	(atomic_add_return((a), (v)) < 0)
 #define atomic_dec_return(v)		atomic_sub_return(1, (v))
 #define atomic_inc_return(v)		atomic_add_return(1, (v))
--- a/arch/sparc/include/asm/atomic.h
+++ b/arch/sparc/include/asm/atomic.h
@@ -5,5 +5,4 @@
 #else
 #include <asm/atomic_32.h>
 #endif
-#define atomic_fetch_or atomic_fetch_or
 #endif
--- a/arch/sparc/include/asm/atomic_32.h
+++ b/arch/sparc/include/asm/atomic_32.h
@@ -36,8 +36,6 @@ void atomic_set(atomic_t *, int);
 #define atomic_inc(v)		((void)atomic_add_return(        1, (v)))
 #define atomic_dec(v)		((void)atomic_add_return(       -1, (v)))
 
-#define atomic_fetch_or	atomic_fetch_or
-
 #define atomic_and(i, v)	((void)atomic_fetch_and((i), (v)))
 #define atomic_or(i, v)		((void)atomic_fetch_or((i), (v)))
 #define atomic_xor(i, v)	((void)atomic_fetch_xor((i), (v)))
--- a/arch/tile/include/asm/atomic.h
+++ b/arch/tile/include/asm/atomic.h
@@ -48,8 +48,6 @@ static inline int atomic_read(const atom
 
 #define atomic_fetch_sub(i, v)		atomic_fetch_add(-(int)(i), (v))
 
-#define atomic_fetch_or atomic_fetch_or
-
 /**
  * atomic_sub - subtract integer from atomic variable
  * @i: integer value to subtract
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -217,8 +217,6 @@ static inline int atomic_fetch_##op(int
 	ATOMIC_OP(op)							\
 	ATOMIC_FETCH_OP(op, c_op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and, &)
 ATOMIC_OPS(or , |)
 ATOMIC_OPS(xor, ^)
--- a/arch/xtensa/include/asm/atomic.h
+++ b/arch/xtensa/include/asm/atomic.h
@@ -188,8 +188,6 @@ ATOMIC_OPS(sub)
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
 
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_OPS(and)
 ATOMIC_OPS(or)
 ATOMIC_OPS(xor)
--- a/include/asm-generic/atomic.h
+++ b/include/asm-generic/atomic.h
@@ -137,8 +137,6 @@ ATOMIC_FETCH_OP(and, &)
 #endif
 
 #ifndef atomic_fetch_or
-#define atomic_fetch_or atomic_fetch_or
-
 ATOMIC_FETCH_OP(or, |)
 #endif
 
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -573,27 +573,6 @@ static inline int atomic_dec_if_positive
 }
 #endif
 
-/**
- * atomic_fetch_or - perform *p |= mask and return old value of *p
- * @mask: mask to OR on the atomic_t
- * @p: pointer to atomic_t
- */
-#ifndef atomic_fetch_or
-static inline int atomic_fetch_or(int mask, atomic_t *p)
-{
-	int old, val = atomic_read(p);
-
-	for (;;) {
-		old = atomic_cmpxchg(p, val, val | mask);
-		if (old == val)
-			break;
-		val = old;
-	}
-
-	return old;
-}
-#endif
-
 #ifdef CONFIG_GENERIC_ATOMIC64
 #include <asm-generic/atomic64.h>
 #endif

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 28/31] locking: Remove the deprecated atomic_{set,clear}_mask() functions
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (26 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 27/31] locking: Remove linux/atomic.h:atomic_fetch_or Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 29/31] locking,alpha: Convert to _relaxed atomics Peter Zijlstra
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-mask-kill.patch --]
[-- Type: text/plain, Size: 1363 bytes --]

These functions have been deprecated for a while and there is only one
user left; convert it and kill the functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/atomic.h              |   10 ----------
 kernel/locking/qspinlock_paravirt.h |    4 ++--
 2 files changed, 2 insertions(+), 12 deletions(-)

--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -480,16 +480,6 @@ static inline int atomic_fetch_andnot_re
 }
 #endif
 
-static inline __deprecated void atomic_clear_mask(unsigned int mask, atomic_t *v)
-{
-	atomic_andnot(mask, v);
-}
-
-static inline __deprecated void atomic_set_mask(unsigned int mask, atomic_t *v)
-{
-	atomic_or(mask, v);
-}
-
 /**
  * atomic_inc_not_zero_hint - increment if not null
  * @v: pointer of type atomic_t
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -112,12 +112,12 @@ static __always_inline int trylock_clear
 #else /* _Q_PENDING_BITS == 8 */
 static __always_inline void set_pending(struct qspinlock *lock)
 {
-	atomic_set_mask(_Q_PENDING_VAL, &lock->val);
+	atomic_or(_Q_PENDING_VAL, &lock->val);
 }
 
 static __always_inline void clear_pending(struct qspinlock *lock)
 {
-	atomic_clear_mask(_Q_PENDING_VAL, &lock->val);
+	atomic_andnot(_Q_PENDING_VAL, &lock->val);
 }
 
 static __always_inline int trylock_clear_pending(struct qspinlock *lock)

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 29/31] locking,alpha: Convert to _relaxed atomics
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (27 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 28/31] locking: Remove the deprecated atomic_{set,clear}_mask() functions Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 30/31] locking,mips: " Peter Zijlstra
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-alpha-relaxed.patch --]
[-- Type: text/plain, Size: 4165 bytes --]

Generic code will construct {,_acquire,_release} versions by adding the
required smp_mb__{before,after}_atomic() calls.
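
For reference, a rough sketch of what that generic construction can look
like; the real macros live in include/linux/atomic.h and the details here
are from memory, so treat this as illustrative only:

#define __atomic_op_acquire(op, args...)				\
({									\
	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
	smp_mb__after_atomic();						\
	__ret;								\
})

#define __atomic_op_release(op, args...)				\
({									\
	smp_mb__before_atomic();					\
	op##_relaxed(args);						\
})

#define __atomic_op_fence(op, args...)					\
({									\
	typeof(op##_relaxed(args)) __ret;				\
	smp_mb__before_atomic();					\
	__ret = op##_relaxed(args);					\
	smp_mb__after_atomic();						\
	__ret;								\
})

With the architecture only supplying e.g. atomic_add_return_relaxed(),
the generic layer can then provide atomic_add_return(),
atomic_add_return_acquire() and atomic_add_return_release() in terms
of it.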

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/alpha/include/asm/atomic.h |   36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

--- a/arch/alpha/include/asm/atomic.h
+++ b/arch/alpha/include/asm/atomic.h
@@ -46,10 +46,9 @@ static __inline__ void atomic_##op(int i
 }									\
 
 #define ATOMIC_OP_RETURN(op, asm_op)					\
-static inline int atomic_##op##_return(int i, atomic_t *v)		\
+static inline int atomic_##op##_return_relaxed(int i, atomic_t *v)	\
 {									\
 	long temp, result;						\
-	smp_mb();							\
 	__asm__ __volatile__(						\
 	"1:	ldl_l %0,%1\n"						\
 	"	" #asm_op " %0,%3,%2\n"					\
@@ -61,15 +60,13 @@ static inline int atomic_##op##_return(i
 	".previous"							\
 	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
 	:"Ir" (i), "m" (v->counter) : "memory");			\
-	smp_mb();							\
 	return result;							\
 }
 
 #define ATOMIC_FETCH_OP(op, asm_op)					\
-static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+static inline int atomic_fetch_##op##_relaxed(int i, atomic_t *v)	\
 {									\
 	long temp, result;						\
-	smp_mb();							\
 	__asm__ __volatile__(						\
 	"1:	ldl_l %0,%1\n"						\
 	"       mov %0,%2\n"						\
@@ -81,7 +78,6 @@ static inline int atomic_fetch_##op(int
 	".previous"							\
 	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
 	:"Ir" (i), "m" (v->counter) : "memory");			\
-	smp_mb();							\
 	return result;							\
 }
 
@@ -102,10 +98,9 @@ static __inline__ void atomic64_##op(lon
 }									\
 
 #define ATOMIC64_OP_RETURN(op, asm_op)					\
-static __inline__ long atomic64_##op##_return(long i, atomic64_t * v)	\
+static __inline__ long atomic64_##op##_return_relaxed(long i, atomic64_t * v)	\
 {									\
 	long temp, result;						\
-	smp_mb();							\
 	__asm__ __volatile__(						\
 	"1:	ldq_l %0,%1\n"						\
 	"	" #asm_op " %0,%3,%2\n"					\
@@ -117,15 +112,13 @@ static __inline__ long atomic64_##op##_r
 	".previous"							\
 	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
 	:"Ir" (i), "m" (v->counter) : "memory");			\
-	smp_mb();							\
 	return result;							\
 }
 
 #define ATOMIC64_FETCH_OP(op, asm_op)					\
-static __inline__ long atomic64_fetch_##op(long i, atomic64_t * v)	\
+static __inline__ long atomic64_fetch_##op##_relaxed(long i, atomic64_t * v)	\
 {									\
 	long temp, result;						\
-	smp_mb();							\
 	__asm__ __volatile__(						\
 	"1:	ldq_l %0,%1\n"						\
 	"	mov %0,%2\n"						\
@@ -137,7 +130,6 @@ static __inline__ long atomic64_fetch_##
 	".previous"							\
 	:"=&r" (temp), "=m" (v->counter), "=&r" (result)		\
 	:"Ir" (i), "m" (v->counter) : "memory");			\
-	smp_mb();							\
 	return result;							\
 }
 
@@ -152,6 +144,16 @@ static __inline__ long atomic64_fetch_##
 ATOMIC_OPS(add)
 ATOMIC_OPS(sub)
 
+#define atomic_add_return_relaxed	atomic_add_return_relaxed
+#define atomic_sub_return_relaxed	atomic_sub_return_relaxed
+#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+
+#define atomic64_add_return_relaxed	atomic64_add_return_relaxed
+#define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+
 #define atomic_andnot atomic_andnot
 #define atomic64_andnot atomic64_andnot
 
@@ -167,6 +169,16 @@ ATOMIC_OPS(andnot, bic)
 ATOMIC_OPS(or, bis)
 ATOMIC_OPS(xor, xor)
 
+#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
+#define atomic_fetch_andnot_relaxed	atomic_fetch_andnot_relaxed
+#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
+
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
+#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot_relaxed
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
+
 #undef ATOMIC_OPS
 #undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 30/31] locking,mips: Convert to _relaxed atomics
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (28 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 29/31] locking,alpha: Convert to _relaxed atomics Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22  9:04 ` [RFC][PATCH 31/31] locking,qrwlock: Employ atomic_fetch_add_acquire() Peter Zijlstra
  2016-04-22  9:44 ` [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
  31 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-atomic-mips-relaxed.patch --]
[-- Type: text/plain, Size: 5006 bytes --]

Generic code will construct {,_acquire,_release} versions by adding the
required smp_mb__{before,after}_atomic() calls.

XXX: if/when MIPS starts using its new SYNCxx instructions, it can
provide custom __atomic_op_{acquire,release}() macros as per the
powerpc example.
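
To make that powerpc-style override concrete, the idea is roughly the
sketch below; arch_acquire_barrier()/arch_release_barrier() are
placeholders for whatever lighter SYNC-based barriers MIPS would
provide, so this is illustrative only, not a tested implementation:

#define __atomic_op_acquire(op, args...)				\
({									\
	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
	arch_acquire_barrier();	/* placeholder acquire barrier */	\
	__ret;								\
})

#define __atomic_op_release(op, args...)				\
({									\
	arch_release_barrier();	/* placeholder release barrier */	\
	op##_relaxed(args);						\
})

Defining these in the arch header makes the generic code use them in
preference to its default smp_mb__{before,after}_atomic() based
fallbacks.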


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/mips/include/asm/atomic.h |   42 +++++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 20 deletions(-)

--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -79,12 +79,10 @@ static __inline__ void atomic_##op(int i
 }
 
 #define ATOMIC_OP_RETURN(op, c_op, asm_op)				      \
-static __inline__ int atomic_##op##_return(int i, atomic_t * v)		      \
+static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v)	      \
 {									      \
 	int result;							      \
 									      \
-	smp_mb__before_llsc();						      \
-									      \
 	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
 		int temp;						      \
 									      \
@@ -125,18 +123,14 @@ static __inline__ int atomic_##op##_retu
 		raw_local_irq_restore(flags);				      \
 	}								      \
 									      \
-	smp_llsc_mb();							      \
-									      \
 	return result;							      \
 }
 
 #define ATOMIC_FETCH_OP(op, c_op, asm_op)				      \
-static __inline__ int atomic_fetch_##op(int i, atomic_t * v)		      \
+static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v)	      \
 {									      \
 	int result;							      \
 									      \
-	smp_mb__before_llsc();						      \
-									      \
 	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
 		int temp;						      \
 									      \
@@ -176,8 +170,6 @@ static __inline__ int atomic_fetch_##op(
 		raw_local_irq_restore(flags);				      \
 	}								      \
 									      \
-	smp_llsc_mb();							      \
-									      \
 	return result;							      \
 }
 
@@ -189,6 +181,11 @@ static __inline__ int atomic_fetch_##op(
 ATOMIC_OPS(add, +=, addu)
 ATOMIC_OPS(sub, -=, subu)
 
+#define atomic_add_return_relaxed	atomic_add_return_relaxed
+#define atomic_sub_return_relaxed	atomic_sub_return_relaxed
+#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed
+#define atomic_fetch_sub_relaxed	atomic_fetch_sub_relaxed
+
 #undef ATOMIC_OPS
 #define ATOMIC_OPS(op, c_op, asm_op)					      \
 	ATOMIC_OP(op, c_op, asm_op)					      \
@@ -198,6 +195,10 @@ ATOMIC_OPS(and, &=, and)
 ATOMIC_OPS(or, |=, or)
 ATOMIC_OPS(xor, ^=, xor)
 
+#define atomic_fetch_and_relaxed	atomic_fetch_and_relaxed
+#define atomic_fetch_or_relaxed		atomic_fetch_or_relaxed
+#define atomic_fetch_xor_relaxed	atomic_fetch_xor_relaxed
+
 #undef ATOMIC_OPS
 #undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
@@ -420,12 +421,10 @@ static __inline__ void atomic64_##op(lon
 }
 
 #define ATOMIC64_OP_RETURN(op, c_op, asm_op)				      \
-static __inline__ long atomic64_##op##_return(long i, atomic64_t * v)	      \
+static __inline__ long atomic64_##op##_return_relaxed(long i, atomic64_t * v) \
 {									      \
 	long result;							      \
 									      \
-	smp_mb__before_llsc();						      \
-									      \
 	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
 		long temp;						      \
 									      \
@@ -467,18 +466,14 @@ static __inline__ long atomic64_##op##_r
 		raw_local_irq_restore(flags);				      \
 	}								      \
 									      \
-	smp_llsc_mb();							      \
-									      \
 	return result;							      \
 }
 
 #define ATOMIC64_FETCH_OP(op, c_op, asm_op)				      \
-static __inline__ long atomic64_fetch_##op(long i, atomic64_t * v)	      \
+static __inline__ long atomic64_fetch_##op##_relaxed(long i, atomic64_t * v)  \
 {									      \
 	long result;							      \
 									      \
-	smp_mb__before_llsc();						      \
-									      \
 	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
 		long temp;						      \
 									      \
@@ -519,8 +514,6 @@ static __inline__ long atomic64_fetch_##
 		raw_local_irq_restore(flags);				      \
 	}								      \
 									      \
-	smp_llsc_mb();							      \
-									      \
 	return result;							      \
 }
 
@@ -532,6 +525,11 @@ static __inline__ long atomic64_fetch_##
 ATOMIC64_OPS(add, +=, daddu)
 ATOMIC64_OPS(sub, -=, dsubu)
 
+#define atomic64_add_return_relaxed	atomic64_add_return_relaxed
+#define atomic64_sub_return_relaxed	atomic64_sub_return_relaxed
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+
 #undef ATOMIC64_OPS
 #define ATOMIC64_OPS(op, c_op, asm_op)					      \
 	ATOMIC64_OP(op, c_op, asm_op)					      \
@@ -541,6 +539,10 @@ ATOMIC64_OPS(and, &=, and)
 ATOMIC64_OPS(or, |=, or)
 ATOMIC64_OPS(xor, ^=, xor)
 
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
+
 #undef ATOMIC64_OPS
 #undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [RFC][PATCH 31/31] locking,qrwlock: Employ atomic_fetch_add_acquire()
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (29 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 30/31] locking,mips: " Peter Zijlstra
@ 2016-04-22  9:04 ` Peter Zijlstra
  2016-04-22 14:25     ` Waiman Long
  2016-04-22  9:44 ` [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:04 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	peterz, dbueso, fengguang.wu

[-- Attachment #1: peterz-locking-qspinlock-fetch-add.patch --]
[-- Type: text/plain, Size: 750 bytes --]

The only reason for the current code is to make GCC emit only the
"LOCK XADD" instruction on x86 (and not a pointless extra ADD on the
result); atomic_fetch_add_acquire() achieves the same more cleanly.
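
For anyone wondering where the extra ADD comes from, here is a small
userspace sketch using the GCC/clang __atomic builtins (not the kernel
API): a fetch-style add returns the old value, which is exactly what
x86's XADD leaves behind, whereas a return-style add must compute
old + i afterwards (hence the "- _QR_BIAS" in the old code below).

#include <stdio.h>

int main(void)
{
	int cnts = 0;

	/* returns the OLD value; on x86 typically a single LOCK XADD */
	int old = __atomic_fetch_add(&cnts, 4, __ATOMIC_ACQUIRE);

	/* returns the NEW value; typically LOCK XADD plus a fix-up ADD */
	int newval = __atomic_add_fetch(&cnts, 4, __ATOMIC_ACQUIRE);

	printf("old=%d newval=%d cnts=%d\n", old, newval, cnts);
	return 0;
}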

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/locking/qrwlock.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -93,7 +93,7 @@ void queued_read_lock_slowpath(struct qr
 	 * that accesses can't leak upwards out of our subsequent critical
 	 * section in the case that the lock is currently held for write.
 	 */
-	cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts) - _QR_BIAS;
+	cnts = atomic_fetch_add_acquire(_QR_BIAS, &lock->cnts);
 	rspin_until_writer_unlock(lock, cnts);
 
 	/*

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 00/31] implement atomic_fetch_$op
  2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
                   ` (30 preceding siblings ...)
  2016-04-22  9:04 ` [RFC][PATCH 31/31] locking,qrwlock: Employ atomic_fetch_add_acquire() Peter Zijlstra
@ 2016-04-22  9:44 ` Peter Zijlstra
  2016-04-22 12:56   ` Fengguang Wu
  31 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22  9:44 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 11:04:13AM +0200, Peter Zijlstra wrote:
> The one that I did not do was ARMv8.1-LSE and I was hoping Will would help out
> with that. Also, it looks like the 0-day built bot does not do arm64 builds,
> people might want to look into that.

OK, weirdness. I received the "BUILD SUCCESS" email without any arm64
builds listed, but I just received a build bot email telling me the
arm64 build was borked (which I know it is).

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
  2016-04-22  9:04 ` [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
@ 2016-04-22 10:50     ` Vineet Gupta
  0 siblings, 0 replies; 79+ messages in thread
From: Vineet Gupta @ 2016-04-22 10:50 UTC (permalink / raw)
  To: Peter Zijlstra, torvalds, mingo, tglx, will.deacon, paulmck,
	boqun.feng, waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, Vineet.Gupta1, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Friday 22 April 2016 03:13 PM, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
>
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/arc/include/asm/atomic.h |   69 ++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 64 insertions(+), 5 deletions(-)
>
> --- a/arch/arc/include/asm/atomic.h
> +++ b/arch/arc/include/asm/atomic.h
> @@ -102,6 +102,38 @@ static inline int atomic_##op##_return(i
>  	return val;							\
>  }
>  
> +#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
> +static inline int atomic_fetch_##op(int i, atomic_t *v)			\
> +{									\
> +	unsigned int val, result;			                \
> +	SCOND_FAIL_RETRY_VAR_DEF                                        \
> +									\
> +	/*								\
> +	 * Explicit full memory barrier needed before/after as		\
> +	 * LLOCK/SCOND themselves don't provide any such semantics	\
> +	 */								\
> +	smp_mb();							\
> +									\
> +	__asm__ __volatile__(						\
> +	"1:	llock   %[val], [%[ctr]]		\n"		\
> +	"	mov %[result], %[val]			\n"		\

Calling it "result" could be a bit confusing; this is meant to be the "orig" value.
It is indeed the "result" of the API, but for the atomic operation it is the pristine value.

Also, we can optimize away that MOV, given there are plenty of regs, so

> +	"	" #asm_op " %[val], %[val], %[i]	\n"		\
> +	"	scond   %[val], [%[ctr]]		\n"		\

Instead have

+	"	" #asm_op " %[result], %[val], %[i]	\n"		\
+	"	scond   %[result], [%[ctr]]		\n"		\



> +	"						\n"		\
> +	SCOND_FAIL_RETRY_ASM						\
> +									\
> +	: [val]	"=&r"	(val),						\
> +	  [result] "=&r" (result)					\
> +	  SCOND_FAIL_RETRY_VARS						\
> +	: [ctr]	"r"	(&v->counter),					\
> +	  [i]	"ir"	(i)						\
> +	: "cc");							\
> +									\
> +	smp_mb();							\
> +									\
> +	return result;							\

This needs to be

+	return val;							\



> +}
> +
>  #else	/* !CONFIG_ARC_HAS_LLSC */
>  
>  #ifndef CONFIG_SMP
> @@ -164,23 +196,50 @@ static inline int atomic_##op##_return(i
>  	return temp;							\
>  }
>  
> +#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
> +static inline int atomic_fetch_##op(int i, atomic_t *v)			\
> +{									\
> +	unsigned long flags;						\
> +	unsigned long temp, result;					\

Same as above; I wouldn't call it "result" here!

Also, per your other comments/patches, converting ARC to _relaxed atomics sounds
trivial; I can provide a fixup patch once your series is stable-ish and you point me
to your git tree or some such.

Thx,
-Vineet

> +									\
> +	/*								\
> +	 * spin lock/unlock provides the needed smp_mb() before/after	\
> +	 */								\
> +	atomic_ops_lock(flags);						\
> +	result = temp = v->counter;					\
> +	temp c_op i;							\
> +	v->counter = temp;						\
> +	atomic_ops_unlock(flags);					\
> +									\
> +	return result;							\
> +}
> +
>  #endif /* !CONFIG_ARC_HAS_LLSC */
>  
>  #define ATOMIC_OPS(op, c_op, asm_op)					\
>  	ATOMIC_OP(op, c_op, asm_op)					\
> -	ATOMIC_OP_RETURN(op, c_op, asm_op)
> +	ATOMIC_OP_RETURN(op, c_op, asm_op)				\
> +	ATOMIC_FETCH_OP(op, c_op, asm_op)
>  
>  ATOMIC_OPS(add, +=, add)
>  ATOMIC_OPS(sub, -=, sub)
>  
>  #define atomic_andnot atomic_andnot
>  
> -ATOMIC_OP(and, &=, and)
> -ATOMIC_OP(andnot, &= ~, bic)
> -ATOMIC_OP(or, |=, or)
> -ATOMIC_OP(xor, ^=, xor)
> +#define atomic_fetch_or atomic_fetch_or
> +
> +#undef ATOMIC_OPS
> +#define ATOMIC_OPS(op, c_op, asm_op)					\
> +	ATOMIC_OP(op, c_op, asm_op)					\
> +	ATOMIC_FETCH_OP(op, c_op, asm_op)
> +
> +ATOMIC_OPS(and, &=, and)
> +ATOMIC_OPS(andnot, &= ~, bic)
> +ATOMIC_OPS(or, |=, or)
> +ATOMIC_OPS(xor, ^=, xor)
>  
>  #undef ATOMIC_OPS
> +#undef ATOMIC_FETCH_OP
>  #undef ATOMIC_OP_RETURN
>  #undef ATOMIC_OP
>  #undef SCOND_FAIL_RETRY_VAR_DEF

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or
  2016-04-22  9:04 ` [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or Peter Zijlstra
@ 2016-04-22 10:54   ` Will Deacon
  2016-04-22 11:09     ` Geert Uytterhoeven
  1 sibling, 0 replies; 79+ messages in thread
From: Will Deacon @ 2016-04-22 10:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, paulmck, boqun.feng, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 11:04:14AM +0200, Peter Zijlstra wrote:
> All the atomic operations have their arguments the wrong way around;
> make atomic_fetch_or() consistent and flip them.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  include/linux/atomic.h   |    4 ++--
>  kernel/time/tick-sched.c |    4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)

Makes sense:

Acked-by: Will Deacon <will.deacon@arm.com>

Will

> --- a/include/linux/atomic.h
> +++ b/include/linux/atomic.h
> @@ -560,11 +560,11 @@ static inline int atomic_dec_if_positive
>  
>  /**
>   * atomic_fetch_or - perform *p |= mask and return old value of *p
> - * @p: pointer to atomic_t
>   * @mask: mask to OR on the atomic_t
> + * @p: pointer to atomic_t
>   */
>  #ifndef atomic_fetch_or
> -static inline int atomic_fetch_or(atomic_t *p, int mask)
> +static inline int atomic_fetch_or(int mask, atomic_t *p)
>  {
>  	int old, val = atomic_read(p);
>  
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -262,7 +262,7 @@ static void tick_nohz_dep_set_all(atomic
>  {
>  	int prev;
>  
> -	prev = atomic_fetch_or(dep, BIT(bit));
> +	prev = atomic_fetch_or(BIT(bit), dep);
>  	if (!prev)
>  		tick_nohz_full_kick_all();
>  }
> @@ -292,7 +292,7 @@ void tick_nohz_dep_set_cpu(int cpu, enum
>  
>  	ts = per_cpu_ptr(&tick_cpu_sched, cpu);
>  
> -	prev = atomic_fetch_or(&ts->tick_dep_mask, BIT(bit));
> +	prev = atomic_fetch_or(BIT(bit), &ts->tick_dep_mask);
>  	if (!prev) {
>  		preempt_disable();
>  		/* Perf needs local kick that is NMI safe */
> 
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 05/31] locking,arm64: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  2016-04-22  9:04 ` [RFC][PATCH 05/31] locking,arm64: " Peter Zijlstra
@ 2016-04-22 11:08   ` Will Deacon
  2016-04-22 14:23     ` Will Deacon
  1 sibling, 0 replies; 79+ messages in thread
From: Will Deacon @ 2016-04-22 11:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, paulmck, boqun.feng, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 11:04:18AM +0200, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
> 
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
> 
> XXX lacking LSE bits

I'll cook a patch for this, but thanks for the series.

Will

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or
  2016-04-22  9:04 ` [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or Peter Zijlstra
@ 2016-04-22 11:09     ` Geert Uytterhoeven
  2016-04-22 11:09     ` Geert Uytterhoeven
  1 sibling, 0 replies; 79+ messages in thread
From: Geert Uytterhoeven @ 2016-04-22 11:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Will Deacon,
	Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang

On Fri, Apr 22, 2016 at 11:04 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> All the atomic operations have their arguments the wrong way around;

s/wrong/other/?

> make atomic_fetch_or() consistent and flip them.

BTW, there are a few other inconsistencies:

atomic_add_unless()
atomic_cmpxchg()
atomic_inc_not_zero_hint()
atomic_set()
atomic_xchg

git grep "\<atomic_.*atomic_t\>.*\<int\>"
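
For the record, the kind of mismatch meant here looks like this (the
prototypes are written from memory and may not match the tree exactly;
the point is only the argument order):

void atomic_add(int i, atomic_t *v);			/* value first   */
void atomic_set(atomic_t *v, int i);			/* pointer first */
int atomic_cmpxchg(atomic_t *v, int old, int new);	/* pointer first */
int atomic_xchg(atomic_t *v, int new);			/* pointer first */
int atomic_add_unless(atomic_t *v, int a, int u);	/* pointer first */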

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 04/31] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  2016-04-22  9:04 ` [RFC][PATCH 04/31] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
@ 2016-04-22 11:35   ` Will Deacon
  0 siblings, 0 replies; 79+ messages in thread
From: Will Deacon @ 2016-04-22 11:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, paulmck, boqun.feng, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 11:04:17AM +0200, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
> 
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/arm/include/asm/atomic.h |  108 ++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 98 insertions(+), 10 deletions(-)
> 
> --- a/arch/arm/include/asm/atomic.h
> +++ b/arch/arm/include/asm/atomic.h
> @@ -77,8 +77,36 @@ static inline int atomic_##op##_return_r
>  	return result;							\
>  }

[...]

> +static inline long long							\
> +atomic64_fetch_##op##_relaxed(long long i, atomic64_t *v)		\
> +{									\
> +	long long result, val;						\
> +	unsigned long tmp;						\
> +									\
> +	prefetchw(&v->counter);						\
> +									\
> +	__asm__ __volatile__("@ atomic64_fetch_" #op "\n"		\
> +"1:	ldrexd	%0, %H0, [%4]\n"					\
> +"	" #op1 " %Q1, %Q0, %Q5\n"					\
> +"	" #op2 " %R1, %R0, %R5\n"					\
> +"	strexd	%2, %1, %H0, [%4]\n"					\

You want %H1 here.

With that:

Acked-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 06/31] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 ` [RFC][PATCH 06/31] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-22 11:58   ` Hans-Christian Noren Egtvedt
  0 siblings, 0 replies; 79+ messages in thread
From: Hans-Christian Noren Egtvedt @ 2016-04-22 11:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

Around Fri 22 Apr 2016 11:04:19 +0200 or thereabout, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
> 
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
> 
> 
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Looks good.

Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>

> ---
>  arch/avr32/include/asm/atomic.h |   56 ++++++++++++++++++++++++++++++++++++----
>  1 file changed, 51 insertions(+), 5 deletions(-)
> 
> --- a/arch/avr32/include/asm/atomic.h
> +++ b/arch/avr32/include/asm/atomic.h
> @@ -41,21 +41,51 @@ static inline int __atomic_##op##_return
>  	return result;							\
>  }
>  
> +#define ATOMIC_FETCH_OP(op, asm_op, asm_con)				\
> +static inline int __atomic_fetch_##op(int i, atomic_t *v)		\
> +{									\
> +	int result, val;						\
> +									\
> +	asm volatile(							\
> +		"/* atomic_fetch_" #op " */\n"				\
> +		"1:	ssrf	5\n"					\
> +		"	ld.w	%0, %3\n"				\
> +		"	mov	%1, %0\n"				\
> +		"	" #asm_op "	%1, %4\n"			\
> +		"	stcond	%2, %1\n"				\
> +		"	brne	1b"					\
> +		: "=&r" (result), "=&r" (val), "=o" (v->counter)	\
> +		: "m" (v->counter), #asm_con (i)			\
> +		: "cc");						\
> +									\
> +	return result;							\
> +}
> +
>  ATOMIC_OP_RETURN(sub, sub, rKs21)
>  ATOMIC_OP_RETURN(add, add, r)
> +ATOMIC_FETCH_OP (sub, sub, rKs21)
> +ATOMIC_FETCH_OP (add, add, r)
>  
> -#define ATOMIC_OP(op, asm_op)						\
> +#define atomic_fetch_or atomic_fetch_or
> +
> +#define ATOMIC_OPS(op, asm_op)						\
>  ATOMIC_OP_RETURN(op, asm_op, r)						\
>  static inline void atomic_##op(int i, atomic_t *v)			\
>  {									\
>  	(void)__atomic_##op##_return(i, v);				\
> +}									\
> +ATOMIC_FETCH_OP(op, asm_op, r)						\
> +static inline int atomic_fetch_##op(int i, atomic_t *v)		\
> +{									\
> +	return __atomic_fetch_##op(i, v);				\
>  }
>  
> -ATOMIC_OP(and, and)
> -ATOMIC_OP(or, or)
> -ATOMIC_OP(xor, eor)
> +ATOMIC_OPS(and, and)
> +ATOMIC_OPS(or, or)
> +ATOMIC_OPS(xor, eor)
>  
> -#undef ATOMIC_OP
> +#undef ATOMIC_OPS
> +#undef ATOMIC_FETCH_OP
>  #undef ATOMIC_OP_RETURN
>  
>  /*
> @@ -87,6 +117,14 @@ static inline int atomic_add_return(int
>  	return __atomic_add_return(i, v);
>  }
>  
> +static inline int atomic_fetch_add(int i, atomic_t *v)
> +{
> +	if (IS_21BIT_CONST(i))
> +		return __atomic_fetch_sub(-i, v);
> +
> +	return __atomic_fetch_add(i, v);
> +}
> +
>  /*
>   * atomic_sub_return - subtract the atomic variable
>   * @i: integer value to subtract
> @@ -102,6 +140,14 @@ static inline int atomic_sub_return(int
>  	return __atomic_add_return(-i, v);
>  }
>  
> +static inline int atomic_fetch_sub(int i, atomic_t *v)
> +{
> +	if (IS_21BIT_CONST(i))
> +		return __atomic_fetch_sub(i, v);
> +
> +	return __atomic_fetch_add(-i, v);
> +}
> +
>  /*
>   * __atomic_add_unless - add unless the number is a given value
>   * @v: pointer of type atomic_t
-- 
mvh
Hans-Christian Noren Egtvedt

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 00/31] implement atomic_fetch_$op
  2016-04-22  9:44 ` [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
@ 2016-04-22 12:56   ` Fengguang Wu
  2016-04-22 13:03     ` Will Deacon
                       ` (2 more replies)
  0 siblings, 3 replies; 79+ messages in thread
From: Fengguang Wu @ 2016-04-22 12:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, schwidefsky, dalias,
	davem, cmetcalf, jcmvbkbc, arnd, dbueso

On Fri, Apr 22, 2016 at 11:44:55AM +0200, Peter Zijlstra wrote:
> On Fri, Apr 22, 2016 at 11:04:13AM +0200, Peter Zijlstra wrote:
> > The one that I did not do was ARMv8.1-LSE and I was hoping Will would help out
> > with that. Also, it looks like the 0-day built bot does not do arm64 builds,
> > people might want to look into that.
> 
> OK, weirdness. I received the "BUILD SUCCESS" email without any arm64
> builds listed, but I just received a build bot email telling me the
> arm64 build was borked (which I know it is).

Sorry, that can happen. Even though most errors are detected in the
first hour, before the BUILD SUCCESS/DONE notification, the
build/boot/performance tests for a particular branch may continue for
days, during which time test coverage keeps growing. That means it is
possible to receive a build failure report after the BUILD SUCCESS
notification.

In particular, the 0-day bot classifies 500+ kconfigs into 2 priority lists:

P1: 100+ realtime priority kconfigs which should be finished before sending
    out BUILD SUCCESS notification

P2: 400+ background priority kconfigs which may take hours to days to finish

That split is a tradeoff between timeliness and completeness. It turns
out to work well as long as we choose a suitable P1 list.

So the more accurate interpretation of "BUILD SUCCESS/DONE" would be:
the 0-day bot is working on your tree (no need to worry it has gone out
of service) and has reached a major milestone.

I'll add arm64-defconfig to P1 list to improve its coverage.

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 27/31] locking: Remove linux/atomic.h:atomic_fetch_or
  2016-04-22  9:04 ` [RFC][PATCH 27/31] locking: Remove linux/atomic.h:atomic_fetch_or Peter Zijlstra
@ 2016-04-22 13:02   ` Will Deacon
  2016-04-22 14:21     ` Peter Zijlstra
  0 siblings, 1 reply; 79+ messages in thread
From: Will Deacon @ 2016-04-22 13:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, paulmck, boqun.feng, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 11:04:40AM +0200, Peter Zijlstra wrote:
> Since all architectures have this implemented natively, remove this
> now dead code.
> 
> 
> 
> 
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/alpha/include/asm/atomic.h    |    2 --
>  arch/arc/include/asm/atomic.h      |    2 --
>  arch/arm/include/asm/atomic.h      |    2 --
>  arch/arm64/include/asm/atomic.h    |    2 --
>  arch/avr32/include/asm/atomic.h    |    2 --
>  arch/frv/include/asm/atomic.h      |    2 --
>  arch/h8300/include/asm/atomic.h    |    2 --
>  arch/hexagon/include/asm/atomic.h  |    2 --
>  arch/m32r/include/asm/atomic.h     |    2 --
>  arch/m68k/include/asm/atomic.h     |    2 --
>  arch/metag/include/asm/atomic.h    |    2 --
>  arch/mips/include/asm/atomic.h     |    2 --
>  arch/mn10300/include/asm/atomic.h  |    2 --
>  arch/parisc/include/asm/atomic.h   |    2 --
>  arch/s390/include/asm/atomic.h     |    2 --
>  arch/sh/include/asm/atomic.h       |    2 --
>  arch/sparc/include/asm/atomic.h    |    1 -
>  arch/sparc/include/asm/atomic_32.h |    2 --
>  arch/tile/include/asm/atomic.h     |    2 --
>  arch/x86/include/asm/atomic.h      |    2 --
>  arch/xtensa/include/asm/atomic.h   |    2 --
>  include/asm-generic/atomic.h       |    2 --
>  include/linux/atomic.h             |   21 ---------------------
>  23 files changed, 64 deletions(-)
> 
> --- a/arch/alpha/include/asm/atomic.h
> +++ b/arch/alpha/include/asm/atomic.h
> @@ -155,8 +155,6 @@ ATOMIC_OPS(sub)
>  #define atomic_andnot atomic_andnot
>  #define atomic64_andnot atomic64_andnot
>  
> -#define atomic_fetch_or atomic_fetch_or
> -
>  #undef ATOMIC_OPS
>  #define ATOMIC_OPS(op, asm)						\
>  	ATOMIC_OP(op, asm)						\
> --- a/arch/arc/include/asm/atomic.h
> +++ b/arch/arc/include/asm/atomic.h
> @@ -226,8 +226,6 @@ ATOMIC_OPS(sub, -=, sub)
>  
>  #define atomic_andnot atomic_andnot
>  
> -#define atomic_fetch_or atomic_fetch_or
> -
>  #undef ATOMIC_OPS
>  #define ATOMIC_OPS(op, c_op, asm_op)					\
>  	ATOMIC_OP(op, c_op, asm_op)					\
> --- a/arch/arm/include/asm/atomic.h
> +++ b/arch/arm/include/asm/atomic.h
> @@ -237,8 +237,6 @@ ATOMIC_OPS(sub, -=, sub)
>  
>  #define atomic_andnot atomic_andnot
>  
> -#define atomic_fetch_or atomic_fetch_or
> -
>  #undef ATOMIC_OPS
>  #define ATOMIC_OPS(op, c_op, asm_op)					\
>  	ATOMIC_OP(op, c_op, asm_op)					\
> --- a/arch/arm64/include/asm/atomic.h
> +++ b/arch/arm64/include/asm/atomic.h
> @@ -128,8 +128,6 @@
>  #define __atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
>  #define atomic_andnot			atomic_andnot
>  
> -#define atomic_fetch_or atomic_fetch_or

For some reason, you added this twice to our atomic.h, so there's still
one left after this patch.

Will

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 00/31] implement atomic_fetch_$op
  2016-04-22 12:56   ` Fengguang Wu
@ 2016-04-22 13:03     ` Will Deacon
  2016-04-22 14:23     ` Peter Zijlstra
  2016-04-22 18:35     ` Kalle Valo
  2 siblings, 0 replies; 79+ messages in thread
From: Will Deacon @ 2016-04-22 13:03 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Peter Zijlstra, torvalds, mingo, tglx, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, schwidefsky, dalias,
	davem, cmetcalf, jcmvbkbc, arnd, dbueso

On Fri, Apr 22, 2016 at 08:56:56PM +0800, Fengguang Wu wrote:
> On Fri, Apr 22, 2016 at 11:44:55AM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 22, 2016 at 11:04:13AM +0200, Peter Zijlstra wrote:
> > > The one that I did not do was ARMv8.1-LSE and I was hoping Will would help out
> > > with that. Also, it looks like the 0-day built bot does not do arm64 builds,
> > > people might want to look into that.
> > 
> > OK, weirdness. I received the "BUILD SUCCESS" email without any arm64
> > builds listed, but I just received a build bot email telling me the
> > arm64 build was borked (which I know it is).
> 
> Sorry, that may happen because even though most errors will be
> detected in the first hour or before the BUILD SUCCESS/DONE
> notification, the build/boot/performance tests for a particular branch
> may continue for days, during the time test coverage keeps growing.
> Which means it's possible to receive a build failure after receiving
> BUILD SUCCESS notification.
> 
> In particular, 0-day bot classify 500+ kconfigs into 2 priority lists:
> 
> P1: 100+ realtime priority kconfigs which should be finished before sending
>     out BUILD SUCCESS notification
> 
> P2: 400+ background priority kconfigs which may take hours to days to finish
> 
> That split is a tradeoff between timeliness and completeness. It turns
> out to work well as long as we choose the suitable P1 list.
> 
> So the more accurate interpretation of "BUILD SUCCESS/DONE" would be:
> 0day bot is working on your tree (no worry about out-of-service) and
> reached a major milestone.
> 
> I'll add arm64-defconfig to P1 list to improve its coverage.

That's good to hear, thanks Fengguang!

Will

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
  2016-04-22 10:50     ` Vineet Gupta
@ 2016-04-22 14:16       ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22 14:16 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, linux,
	egtvedt, realmz6, ysato, rkuo, tony.luck, geert, james.hogan,
	ralf, dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 10:50:41AM +0000, Vineet Gupta wrote:

> > +#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
> > +static inline int atomic_fetch_##op(int i, atomic_t *v)			\
> > +{									\
> > +	unsigned int val, result;			                \
> > +	SCOND_FAIL_RETRY_VAR_DEF                                        \
> > +									\
> > +	/*								\
> > +	 * Explicit full memory barrier needed before/after as		\
> > +	 * LLOCK/SCOND thmeselves don't provide any such semantics	\
> > +	 */								\
> > +	smp_mb();							\
> > +									\
> > +	__asm__ __volatile__(						\
> > +	"1:	llock   %[val], [%[ctr]]		\n"		\
> > +	"	mov %[result], %[val]			\n"		\
> 
> Calling it result could be a bit confusing, this is meant to be the "orig" value.
> So it indeed "result" of the API, but for atomic operation it is pristine value.
> 
> Also we can optimize away that MOV - given there are plenty of regs, so
> 
> > +	"	" #asm_op " %[val], %[val], %[i]	\n"		\
> > +	"	scond   %[val], [%[ctr]]		\n"		\
> 
> Instead have
> 
> +	"	" #asm_op " %[result], %[val], %[i]	\n"		\
> +	"	scond   %[result], [%[ctr]]		\n"		\
> 
> 

Indeed, how about something like so?

---
Subject: locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon Apr 18 01:16:09 CEST 2016

Implement FETCH-OP atomic primitives; these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
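
As a small userspace illustration of why the pre-modification value
matters (a sketch using the GCC/clang __atomic builtins, not the kernel
API), a fetch-or gives you test-and-set for free:

#include <stdio.h>

static int flags;	/* stand-in for an atomic_t */

int main(void)
{
	/*
	 * fetch-or returns the OLD value, so we can tell whether bit 0
	 * was already set, i.e. this is a test-and-set.
	 */
	int old = __atomic_fetch_or(&flags, 0x1, __ATOMIC_SEQ_CST);

	printf("bit 0 was %s set\n", (old & 0x1) ? "already" : "not yet");

	/*
	 * An or-return style primitive would hand back the NEW value,
	 * which always has bit 0 set here, so the prior state is lost.
	 */
	return 0;
}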

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/arc/include/asm/atomic.h |   69 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 64 insertions(+), 5 deletions(-)

--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -102,6 +102,37 @@ static inline int atomic_##op##_return(i
 	return val;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned int val, orig;						\
+	SCOND_FAIL_RETRY_VAR_DEF                                        \
+									\
+	/*								\
+	 * Explicit full memory barrier needed before/after as		\
	 * LLOCK/SCOND themselves don't provide any such semantics	\
+	 */								\
+	smp_mb();							\
+									\
+	__asm__ __volatile__(						\
+	"1:	llock   %[orig], [%[ctr]]		\n"		\
+	"	" #asm_op " %[val], %[orig], %[i]	\n"		\
+	"	scond   %[val], [%[ctr]]		\n"		\
+	"						\n"		\
+	SCOND_FAIL_RETRY_ASM						\
+									\
+	: [val]	"=&r"	(val),						\
+	  [orig] "=&r" (orig)						\
+	  SCOND_FAIL_RETRY_VARS						\
+	: [ctr]	"r"	(&v->counter),					\
+	  [i]	"ir"	(i)						\
+	: "cc");							\
+									\
+	smp_mb();							\
+									\
+	return orig;							\
+}
+
 #else	/* !CONFIG_ARC_HAS_LLSC */
 
 #ifndef CONFIG_SMP
@@ -164,23 +196,49 @@ static inline int atomic_##op##_return(i
 	return temp;							\
 }
 
+#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	unsigned long flags;						\
+	unsigned long orig;						\
+									\
+	/*								\
+	 * spin lock/unlock provides the needed smp_mb() before/after	\
+	 */								\
+	atomic_ops_lock(flags);						\
+	orig = v->counter;						\
+	v->counter c_op i;						\
+	atomic_ops_unlock(flags);					\
+									\
+	return orig;							\
+}
+
 #endif /* !CONFIG_ARC_HAS_LLSC */
 
 #define ATOMIC_OPS(op, c_op, asm_op)					\
 	ATOMIC_OP(op, c_op, asm_op)					\
-	ATOMIC_OP_RETURN(op, c_op, asm_op)
+	ATOMIC_OP_RETURN(op, c_op, asm_op)				\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
 
 ATOMIC_OPS(add, +=, add)
 ATOMIC_OPS(sub, -=, sub)
 
 #define atomic_andnot atomic_andnot
 
-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(andnot, &= ~, bic)
-ATOMIC_OP(or, |=, or)
-ATOMIC_OP(xor, ^=, xor)
+#define atomic_fetch_or atomic_fetch_or
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op)					\
+	ATOMIC_OP(op, c_op, asm_op)					\
+	ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(andnot, &= ~, bic)
+ATOMIC_OPS(or, |=, or)
+ATOMIC_OPS(xor, ^=, xor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 #undef SCOND_FAIL_RETRY_VAR_DEF

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or
  2016-04-22 11:09     ` Geert Uytterhoeven
@ 2016-04-22 14:18       ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22 14:18 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Will Deacon,
	Paul McKenney, boqun.feng, waiman.long,
	Frédéric Weisbecker, linux-kernel, Linux-Arch,
	Richard Henderson, Vineet Gupta, Russell King,
	Hans-Christian Noren Egtvedt, Miao Steven, Yoshinori Sato,
	Richard Kuo, Tony Luck, James Hogan, Ralf Baechle, David Howells,
	James E.J. Bottomley, Michael Ellerman, Martin Schwidefsky,
	Rich Felker, David S. Miller, cmetcalf, Max Filippov,
	Arnd Bergmann, dbueso, Wu Fengguang

On Fri, Apr 22, 2016 at 01:09:38PM +0200, Geert Uytterhoeven wrote:
> On Fri, Apr 22, 2016 at 11:04 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > All the atomic operations have their arguments the wrong way around;
> 
> s/wrong/other/?

Nah, I find they really are the wrong way around. I forever write:
atomic_add(&v, val); and then have the compiler yell at me.

> > make atomic_fetch_or() consistent and flip them.
> 
> BTW, there are a few other inconsistencies:
> 
> atomic_add_unless()
> atomic_cmpxchg()
> atomic_inc_not_zero_hint()
> atomic_set()
> atomic_xchg
> 
> git grep "\<atomic_.*atomic_t\>.*\<int\>"

Yes, but fixing those will be much more painful :/ atomic_fetch_or() was
freshly introduced and only has a few callers; furthermore, the
following patches would require it to be in line with the other
atomic_$op() because they are all generated from the same 'template'.
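
To make the convention concrete, here is a rough, purely illustrative
sketch (a generic cmpxchg() loop, not the per-arch code this series adds)
of the operand-first, pointer-second form that atomic_fetch_or() now
shares with the rest of the atomic_$op() family:

	/* illustrative fallback only; the series adds per-arch versions */
	static inline int atomic_fetch_or(int i, atomic_t *v)
	{
		int old, new;

		do {
			old = atomic_read(v);
			new = old | i;
		} while (atomic_cmpxchg(v, old, new) != old);

		return old;	/* value _before_ the OR */
	}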

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 27/31] locking: Remove linux/atomic.h:atomic_fetch_or
  2016-04-22 13:02   ` Will Deacon
@ 2016-04-22 14:21     ` Peter Zijlstra
  0 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22 14:21 UTC (permalink / raw)
  To: Will Deacon
  Cc: torvalds, mingo, tglx, paulmck, boqun.feng, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 02:02:56PM +0100, Will Deacon wrote:
> On Fri, Apr 22, 2016 at 11:04:40AM +0200, Peter Zijlstra wrote:
> > --- a/arch/arm64/include/asm/atomic.h
> > +++ b/arch/arm64/include/asm/atomic.h
> > @@ -128,8 +128,6 @@
> >  #define __atomic_add_unless(v, a, u)	___atomic_add_unless(v, a, u,)
> >  #define atomic_andnot			atomic_andnot
> >  
> > -#define atomic_fetch_or atomic_fetch_or
> 
> For some reason, you added this twice to our atomic.h, so there's still
> one left after this patch.

Ah, yes. One was because of the whole _relaxed generate business, this
one is because of this generic atomic_fetch_or() thing.

I went through the arch/*/include/asm/atomic*.h files pretty much
without thinking to add this one.

The end result after this patch should be good though, we have
atomic_fetch_or_relaxed and do not want to generate atomic_fetch_or()
using smp_mb__{before,after}_atomic() because arm64 is 'special' :-)
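
For reference, the kind of fallback the generic code would otherwise
spell out; this is a sketch from memory rather than the exact
<linux/atomic.h> text, and it is exactly what arm64 wants to avoid in
favour of its own acquire/release instructions:

	/* sketch: fully-ordered variant built from the _relaxed one */
	#ifndef atomic_fetch_or
	#define atomic_fetch_or(...)						\
	({									\
		typeof(atomic_fetch_or_relaxed(__VA_ARGS__)) __ret;		\
		smp_mb__before_atomic();					\
		__ret = atomic_fetch_or_relaxed(__VA_ARGS__);			\
		smp_mb__after_atomic();						\
		__ret;								\
	})
	#endif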

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 00/31] implement atomic_fetch_$op
  2016-04-22 12:56   ` Fengguang Wu
  2016-04-22 13:03     ` Will Deacon
@ 2016-04-22 14:23     ` Peter Zijlstra
  2016-04-23  1:59       ` Fengguang Wu
  2016-04-22 18:35     ` Kalle Valo
  2 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22 14:23 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, schwidefsky, dalias,
	davem, cmetcalf, jcmvbkbc, arnd, dbueso

On Fri, Apr 22, 2016 at 08:56:56PM +0800, Fengguang Wu wrote:
> I'll add arm64-defconfig to P1 list to improve its coverage.

Thanks; any more architectures missing from P1?

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 05/31] locking,arm64: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  2016-04-22  9:04 ` [RFC][PATCH 05/31] locking,arm64: " Peter Zijlstra
  2016-04-22 11:08   ` Will Deacon
@ 2016-04-22 14:23     ` Will Deacon
  1 sibling, 0 replies; 79+ messages in thread
From: Will Deacon @ 2016-04-22 14:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, paulmck, boqun.feng, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 11:04:18AM +0200, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
> 
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).

The LSE bits will take me some time, but you're also missing some stuff
for the LL/SC variants. Fixup below.

Will

--->8

>From ff2863445fb2a11dcd0cab4aaaeebe28aa5c9937 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Fri, 22 Apr 2016 14:30:54 +0100
Subject: [PATCH] fixup! locking,arm64: Implement
 atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()

Get the ll/sc stuff building and working

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/atomic.h       | 30 ++++++++++++++++++++++++++++++
 arch/arm64/include/asm/atomic_ll_sc.h |  8 ++++----
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/atomic.h b/arch/arm64/include/asm/atomic.h
index 83b74b67c04b..c0235e0ff849 100644
--- a/arch/arm64/include/asm/atomic.h
+++ b/arch/arm64/include/asm/atomic.h
@@ -155,6 +155,36 @@
 #define atomic64_dec_return_release(v)	atomic64_sub_return_release(1, (v))
 #define atomic64_dec_return(v)		atomic64_sub_return(1, (v))
 
+#define atomic64_fetch_add_relaxed	atomic64_fetch_add_relaxed
+#define atomic64_fetch_add_acquire	atomic64_fetch_add_acquire
+#define atomic64_fetch_add_release	atomic64_fetch_add_release
+#define atomic64_fetch_add		atomic64_fetch_add
+
+#define atomic64_fetch_sub_relaxed	atomic64_fetch_sub_relaxed
+#define atomic64_fetch_sub_acquire	atomic64_fetch_sub_acquire
+#define atomic64_fetch_sub_release	atomic64_fetch_sub_release
+#define atomic64_fetch_sub		atomic64_fetch_sub
+
+#define atomic64_fetch_and_relaxed	atomic64_fetch_and_relaxed
+#define atomic64_fetch_and_acquire	atomic64_fetch_and_acquire
+#define atomic64_fetch_and_release	atomic64_fetch_and_release
+#define atomic64_fetch_and		atomic64_fetch_and
+
+#define atomic64_fetch_andnot_relaxed	atomic64_fetch_andnot_relaxed
+#define atomic64_fetch_andnot_acquire	atomic64_fetch_andnot_acquire
+#define atomic64_fetch_andnot_release	atomic64_fetch_andnot_release
+#define atomic64_fetch_andnot		atomic64_fetch_andnot
+
+#define atomic64_fetch_or_relaxed	atomic64_fetch_or_relaxed
+#define atomic64_fetch_or_acquire	atomic64_fetch_or_acquire
+#define atomic64_fetch_or_release	atomic64_fetch_or_release
+#define atomic64_fetch_or		atomic64_fetch_or
+
+#define atomic64_fetch_xor_relaxed	atomic64_fetch_xor_relaxed
+#define atomic64_fetch_xor_acquire	atomic64_fetch_xor_acquire
+#define atomic64_fetch_xor_release	atomic64_fetch_xor_release
+#define atomic64_fetch_xor		atomic64_fetch_xor
+
 #define atomic64_xchg_relaxed		atomic_xchg_relaxed
 #define atomic64_xchg_acquire		atomic_xchg_acquire
 #define atomic64_xchg_release		atomic_xchg_release
diff --git a/arch/arm64/include/asm/atomic_ll_sc.h b/arch/arm64/include/asm/atomic_ll_sc.h
index f92806390c9a..2b29db9593c7 100644
--- a/arch/arm64/include/asm/atomic_ll_sc.h
+++ b/arch/arm64/include/asm/atomic_ll_sc.h
@@ -127,6 +127,7 @@ ATOMIC_OPS(or, orr)
 ATOMIC_OPS(xor, eor)
 
 #undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
 #undef ATOMIC_OP_RETURN
 #undef ATOMIC_OP
 
@@ -195,11 +196,10 @@ __LL_SC_EXPORT(atomic64_##op##_return##name);
 #define ATOMIC64_OPS(...)						\
 	ATOMIC64_OP(__VA_ARGS__)					\
 	ATOMIC64_OP_RETURN(, dmb ish,  , l, "memory", __VA_ARGS__)	\
-	ATOMIC64_FETCH_OP (, dmb ish,  , l, "memory", __VA_ARGS__)	\
-	ATOMIC64_OPS(__VA_ARGS__)					\
 	ATOMIC64_OP_RETURN(_relaxed,,  ,  ,         , __VA_ARGS__)	\
 	ATOMIC64_OP_RETURN(_acquire,, a,  , "memory", __VA_ARGS__)	\
 	ATOMIC64_OP_RETURN(_release,,  , l, "memory", __VA_ARGS__)	\
+	ATOMIC64_FETCH_OP (, dmb ish,  , l, "memory", __VA_ARGS__)	\
 	ATOMIC64_FETCH_OP (_relaxed,,  ,  ,         , __VA_ARGS__)	\
 	ATOMIC64_FETCH_OP (_acquire,, a,  , "memory", __VA_ARGS__)	\
 	ATOMIC64_FETCH_OP (_release,,  , l, "memory", __VA_ARGS__)
@@ -207,11 +207,10 @@ __LL_SC_EXPORT(atomic64_##op##_return##name);
 ATOMIC64_OPS(add, add)
 ATOMIC64_OPS(sub, sub)
 
-#undef ATOMIC_OPS
+#undef ATOMIC64_OPS
 #define ATOMIC64_OPS(...)						\
 	ATOMIC64_OP(__VA_ARGS__)					\
 	ATOMIC64_FETCH_OP (, dmb ish,  , l, "memory", __VA_ARGS__)	\
-	ATOMIC64_OPS(__VA_ARGS__)					\
 	ATOMIC64_FETCH_OP (_relaxed,,  ,  ,         , __VA_ARGS__)	\
 	ATOMIC64_FETCH_OP (_acquire,, a,  , "memory", __VA_ARGS__)	\
 	ATOMIC64_FETCH_OP (_release,,  , l, "memory", __VA_ARGS__)
@@ -222,6 +221,7 @@ ATOMIC64_OPS(or, orr)
 ATOMIC64_OPS(xor, eor)
 
 #undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 31/31] locking,qrwlock: Employ atomic_fetch_add_acquire()
  2016-04-22  9:04 ` [RFC][PATCH 31/31] locking,qrwlock: Employ atomic_fetch_add_acquire() Peter Zijlstra
@ 2016-04-22 14:25     ` Waiman Long
  0 siblings, 0 replies; 79+ messages in thread
From: Waiman Long @ 2016-04-22 14:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On 04/22/2016 05:04 AM, Peter Zijlstra wrote:
> The only reason for the current code is to make GCC emit only the
> "LOCK XADD" instruction on x86 (and not do a pointless extra ADD on
> the result), do so nicer.
>
> Signed-off-by: Peter Zijlstra (Intel)<peterz@infradead.org>
> ---
>   kernel/locking/qrwlock.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -93,7 +93,7 @@ void queued_read_lock_slowpath(struct qr
>   	 * that accesses can't leak upwards out of our subsequent critical
>   	 * section in the case that the lock is currently held for write.
>   	 */
> -	cnts = atomic_add_return_acquire(_QR_BIAS,&lock->cnts) - _QR_BIAS;
> +	cnts = atomic_fetch_add_acquire(_QR_BIAS,&lock->cnts);
>   	rspin_until_writer_unlock(lock, cnts);
>
>   	/*
>
>

Thanks for taking out this weirdness in the code.
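
For anyone reading along, the two forms compute the same cnts value; the
fetch variant simply hands back the pre-add value directly instead of
adding _QR_BIAS and then subtracting it again (illustration only, lifted
from the hunk above):

	cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts) - _QR_BIAS;	/* before */
	cnts = atomic_fetch_add_acquire(_QR_BIAS, &lock->cnts);		/* after  */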

Acked-by: Waiman Long <waiman.long@hpe.com>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
  2016-04-22 10:50     ` Vineet Gupta
@ 2016-04-22 14:26       ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-22 14:26 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, linux,
	egtvedt, realmz6, ysato, rkuo, tony.luck, geert, james.hogan,
	ralf, dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 10:50:41AM +0000, Vineet Gupta wrote:
> Also, per your other comments/patches, converting ARC to _relaxed atomics sounds
> trivial; I can provide a fixup patch once your series is stable-ish and you point me
> to your git tree or some such.

Yeah, that change is pretty simple; I ran out of steam and only did
alpha and mips in this series.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 18/31] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}()
  2016-04-22  9:04 ` [RFC][PATCH 18/31] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
@ 2016-04-22 16:41   ` Boqun Feng
  2016-04-23  2:31     ` Peter Zijlstra
  0 siblings, 1 reply; 79+ messages in thread
From: Boqun Feng @ 2016-04-22 16:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

[-- Attachment #1: Type: text/plain, Size: 4498 bytes --]

On Fri, Apr 22, 2016 at 11:04:31AM +0200, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
> 
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
> 
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/powerpc/include/asm/atomic.h |   83 +++++++++++++++++++++++++++++++++-----
>  1 file changed, 74 insertions(+), 9 deletions(-)
> 
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -78,21 +78,53 @@ static inline int atomic_##op##_return_r
>  	return t;							\
>  }
>  
> +#define ATOMIC_FETCH_OP_RELAXED(op, asm_op)				\
> +static inline int atomic_fetch_##op##_relaxed(int a, atomic_t *v)	\
> +{									\
> +	int res, t;							\
> +									\
> +	__asm__ __volatile__(						\
> +"1:	lwarx	%0,0,%4		# atomic_fetch_" #op "_relaxed\n"	\
> +	#asm_op " %1,%2,%0\n"						\

Should be

	#asm_op " %1,%3,%0\n"

right? Because %2 is v->counter and %3 is @a.
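
(That follows from the constraint list further down in the quoted hunk;
the operands number like so:)

	: "=&r" (res), "=&r" (t), "+m" (v->counter)	/* %0, %1, %2 */
	: "r" (a), "r" (&v->counter)			/* %3, %4 */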

Regards,
Boqun

> +	PPC405_ERR77(0, %4)						\
> +"	stwcx.	%1,0,%4\n"						\
> +"	bne-	1b\n"							\
> +	: "=&r" (res), "=&r" (t), "+m" (v->counter)			\
> +	: "r" (a), "r" (&v->counter)					\
> +	: "cc");							\
> +									\
> +	return res;							\
> +}
> +
>  #define ATOMIC_OPS(op, asm_op)						\
>  	ATOMIC_OP(op, asm_op)						\
> -	ATOMIC_OP_RETURN_RELAXED(op, asm_op)
> +	ATOMIC_OP_RETURN_RELAXED(op, asm_op)				\
> +	ATOMIC_FETCH_OP_RELAXED(op, asm_op)
>  
>  ATOMIC_OPS(add, add)
>  ATOMIC_OPS(sub, subf)
>  
> -ATOMIC_OP(and, and)
> -ATOMIC_OP(or, or)
> -ATOMIC_OP(xor, xor)
> -
>  #define atomic_add_return_relaxed atomic_add_return_relaxed
>  #define atomic_sub_return_relaxed atomic_sub_return_relaxed
>  
> +#define atomic_fetch_add_relaxed atomic_fetch_add_relaxed
> +#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed
> +
> +#undef ATOMIC_OPS
> +#define ATOMIC_OPS(op, asm_op)						\
> +	ATOMIC_OP(op, asm_op)						\
> +	ATOMIC_FETCH_OP_RELAXED(op, asm_op)
> +
> +ATOMIC_OPS(and, and)
> +ATOMIC_OPS(or, or)
> +ATOMIC_OPS(xor, xor)
> +
> +#define atomic_fetch_and_relaxed atomic_fetch_and_relaxed
> +#define atomic_fetch_or_relaxed  atomic_fetch_or_relaxed
> +#define atomic_fetch_xor_relaxed atomic_fetch_xor_relaxed
> +
>  #undef ATOMIC_OPS
> +#undef ATOMIC_FETCH_OP_RELAXED
>  #undef ATOMIC_OP_RETURN_RELAXED
>  #undef ATOMIC_OP
>  
> @@ -329,20 +361,53 @@ atomic64_##op##_return_relaxed(long a, a
>  	return t;							\
>  }
>  
> +#define ATOMIC64_FETCH_OP_RELAXED(op, asm_op)				\
> +static inline long							\
> +atomic64_fetch_##op##_relaxed(long a, atomic64_t *v)			\
> +{									\
> +	long res, t;							\
> +									\
> +	__asm__ __volatile__(						\
> +"1:	ldarx	%0,0,%4		# atomic64_fetch_" #op "_relaxed\n"	\
> +	#asm_op " %1,%3,%0\n"						\
> +"	stdcx.	%1,0,%4\n"						\
> +"	bne-	1b\n"							\
> +	: "=&r" (res), "=&r" (t), "+m" (v->counter)			\
> +	: "r" (a), "r" (&v->counter)					\
> +	: "cc");							\
> +									\
> +	return t;							\
> +}
> +
>  #define ATOMIC64_OPS(op, asm_op)					\
>  	ATOMIC64_OP(op, asm_op)						\
> -	ATOMIC64_OP_RETURN_RELAXED(op, asm_op)
> +	ATOMIC64_OP_RETURN_RELAXED(op, asm_op)				\
> +	ATOMIC64_FETCH_OP_RELAXED(op, asm_op)
>  
>  ATOMIC64_OPS(add, add)
>  ATOMIC64_OPS(sub, subf)
> -ATOMIC64_OP(and, and)
> -ATOMIC64_OP(or, or)
> -ATOMIC64_OP(xor, xor)
>  
>  #define atomic64_add_return_relaxed atomic64_add_return_relaxed
>  #define atomic64_sub_return_relaxed atomic64_sub_return_relaxed
>  
> +#define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed
> +#define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed
> +
> +#undef ATOMIC64_OPS
> +#define ATOMIC64_OPS(op, asm_op)					\
> +	ATOMIC64_OP(op, asm_op)						\
> +	ATOMIC64_FETCH_OP_RELAXED(op, asm_op)
> +
> +ATOMIC64_OPS(and, and)
> +ATOMIC64_OPS(or, or)
> +ATOMIC64_OPS(xor, xor)
> +
> +#define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed
> +#define atomic64_fetch_or_relaxed  atomic64_fetch_or_relaxed
> +#define atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed
> +
>  #undef ATOPIC64_OPS
> +#undef ATOMIC64_FETCH_OP_RELAXED
>  #undef ATOMIC64_OP_RETURN_RELAXED
>  #undef ATOMIC64_OP
>  
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 02/31] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}()
  2016-04-22  9:04 ` [RFC][PATCH 02/31] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
@ 2016-04-22 16:57   ` Richard Henderson
  2016-04-23  1:55     ` Peter Zijlstra
  0 siblings, 1 reply; 79+ messages in thread
From: Richard Henderson @ 2016-04-22 16:57 UTC (permalink / raw)
  To: Peter Zijlstra, torvalds, mingo, tglx, will.deacon, paulmck,
	boqun.feng, waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, vgupta, linux, egtvedt, realmz6, ysato,
	rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb, mpe,
	schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd, dbueso,
	fengguang.wu

On 04/22/2016 02:04 AM, Peter Zijlstra wrote:
> +	"1:	ldl_l %0,%1\n"						\
> +	"       mov %0,%2\n"						\
> +	"	" #asm_op " %0,%3,%0\n"					\
> +	"	stl_c %0,%1\n"						\

No need for the extra mov.

	ldl_l %2,%1
	asm_op %2,%3,%0
	stl_c %0,%1


r~

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 00/31] implement atomic_fetch_$op
  2016-04-22 12:56   ` Fengguang Wu
  2016-04-22 13:03     ` Will Deacon
  2016-04-22 14:23     ` Peter Zijlstra
@ 2016-04-22 18:35     ` Kalle Valo
  2016-04-23  3:23       ` Fengguang Wu
  2 siblings, 1 reply; 79+ messages in thread
From: Kalle Valo @ 2016-04-22 18:35 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Peter Zijlstra, torvalds, mingo, tglx, will.deacon, paulmck,
	boqun.feng, waiman.long, fweisbec, linux-kernel, linux-arch, rth,
	vgupta, linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, schwidefsky, dalias,
	davem, cmetcalf, jcmvbkbc, arnd, dbueso

Fengguang Wu <fengguang.wu@intel.com> writes:

>> OK, weirdness. I received the "BUILD SUCCESS" email without any arm64
>> builds listed, but I just received a build bot email telling me the
>> arm64 build was borked (which I know it is).
>
> Sorry, that may happen because even though most errors will be
> detected in the first hour or before the BUILD SUCCESS/DONE
> notification, the build/boot/performance tests for a particular branch
> may continue for days, during which time test coverage keeps growing.
> Which means it's possible to receive a build failure after receiving
> BUILD SUCCESS notification.
>
> In particular, 0-day bot classifies 500+ kconfigs into 2 priority lists:
>
> P1: 100+ realtime priority kconfigs which should be finished before sending
>     out BUILD SUCCESS notification
>
> P2: 400+ background priority kconfigs which may take hours to days to finish
>
> That split is a tradeoff between timeliness and completeness. It turns
> out to work well as long as we choose the suitable P1 list.
>
> So the more accurate interpretation of "BUILD SUCCESS/DONE" would be:
> 0day bot is working on your tree (no worry about out-of-service) and
> reached a major milestone.

Thanks, this is very useful information. But would it also be possible
to get a report about the P2 completion (or failure)?

-- 
Kalle Valo

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 02/31] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}()
  2016-04-22 16:57   ` Richard Henderson
@ 2016-04-23  1:55     ` Peter Zijlstra
  0 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-23  1:55 UTC (permalink / raw)
  To: Richard Henderson
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, vgupta, linux,
	egtvedt, realmz6, ysato, rkuo, tony.luck, geert, james.hogan,
	ralf, dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 09:57:04AM -0700, Richard Henderson wrote:
> On 04/22/2016 02:04 AM, Peter Zijlstra wrote:
> > +	"1:	ldl_l %0,%1\n"						\
> > +	"       mov %0,%2\n"						\
> > +	"	" #asm_op " %0,%3,%0\n"					\
> > +	"	stl_c %0,%1\n"						\
> 
> No need for the extra mov.
> 
> 	ldl_l %2,%1
> 	asm_op %2,%3,%0
> 	stl_c %0,%1

Indeed, I got my head stuck in two-operand (x86) asm. This is the second
such 'mistake'; I'll go have a look at the rest too.

Thanks!

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 00/31] implement atomic_fetch_$op
  2016-04-22 14:23     ` Peter Zijlstra
@ 2016-04-23  1:59       ` Fengguang Wu
  0 siblings, 0 replies; 79+ messages in thread
From: Fengguang Wu @ 2016-04-23  1:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, schwidefsky, dalias,
	davem, cmetcalf, jcmvbkbc, arnd, dbueso

On Fri, Apr 22, 2016 at 04:23:03PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 22, 2016 at 08:56:56PM +0800, Fengguang Wu wrote:
> > I'll add arm64-defconfig to P1 list to improve its coverage.
> 
> Thanks; any more architectures missing from P1?

Good question! I just double-checked and found s390 still missing.
All other supported archs have corresponding kconfigs in the P1 list:

alpha-defconfig
arm64-allnoconfig
arm64-defconfig
arm-allnoconfig
arm-at91_dt_defconfig
arm-efm32_defconfig
arm-exynos_defconfig
arm-multi_v5_defconfig
arm-multi_v7_defconfig
arm-shmobile_defconfig
arm-sunxi_defconfig
avr32-atngw100_defconfig
avr32-atstk1006_defconfig
blackfin-BF526-EZBRD_defconfig
blackfin-BF533-EZKIT_defconfig
blackfin-BF561-EZKIT-SMP_defconfig
blackfin-defconfig
blackfin-TCM-BF537_defconfig
cris-etrax-100lx_v2_defconfig
frv-defconfig
i386-alldefconfig
i386-allmodconfig
i386-allnoconfig
i386-allyesconfig
i386-defconfig
i386-tinyconfig
ia64-alldefconfig
ia64-allnoconfig
ia64-defconfig
m32r-m32104ut_defconfig
m32r-mappi3.smp_defconfig
m32r-opsput_defconfig
m32r-usrv_defconfig
m68k-m5475evb_defconfig
m68k-multi_defconfig
m68k-sun3_defconfig
microblaze-mmu_defconfig
microblaze-nommu_defconfig
mips-allnoconfig
mips-defconfig
mips-fuloong2e_defconfig
mips-jz4740
mips-txx9
mn10300-asb2364_defconfig
openrisc-or1ksim_defconfig
parisc-allnoconfig
parisc-b180_defconfig
parisc-c3000_defconfig
parisc-defconfig
powerpc-allnoconfig
powerpc-defconfig
powerpc-ppc64_defconfig
sh-allnoconfig
sh-defconfig
sh-rsk7269_defconfig
sh-sh7785lcr_32bit_defconfig
sh-titan_defconfig
sparc64-allnoconfig
sparc64-defconfig
sparc-defconfig
tile-tilegx_defconfig
um-i386_defconfig
um-x86_64_defconfig
x86_64-acpi-redef
x86_64-allmodconfig
x86_64-allyesdebian
x86_64-lkp
x86_64-nfsroot
x86_64-rhel
x86_64-rhel_gcov
xtensa-common_defconfig
xtensa-iss_defconfig

I'll add s390-default_defconfig to the above list.  Architecture
maintainers are welcome to suggest a more suitable list of configs!

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 10/31] locking,hexagon: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 ` [RFC][PATCH 10/31] locking,hexagon: " Peter Zijlstra
@ 2016-04-23  2:16   ` Peter Zijlstra
  2016-04-26  0:39     ` Richard Kuo
  0 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-23  2:16 UTC (permalink / raw)
  To: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, cmetcalf, jcmvbkbc, arnd,
	dbueso, fengguang.wu

On Fri, Apr 22, 2016 at 11:04:23AM +0200, Peter Zijlstra wrote:
> +#define ATOMIC_FETCH_OP(op)						\
> +static inline int atomic_fetch_##op(int i, atomic_t *v)			\
> +{									\
> +	int output, val;						\
> +									\
> +	__asm__ __volatile__ (						\
> +		"1:	%0 = memw_locked(%2);\n"			\
> +		"	%1 = "#op "(%0,%3);\n"				\
> +		"	memw_locked(%2,P3)=%0;\n"			\

I'm thinking that wants to be:

			memw_locked(%2,P3)=%1;

> +		"	if !P3 jump 1b;\n"				\
> +		: "=&r" (output), "=&r" (val)				\
> +		: "r" (&v->counter), "r" (i)				\
> +		: "memory", "p3"					\
> +	);								\
> +	return output;							\
> +}

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 18/31] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}()
  2016-04-22 16:41   ` Boqun Feng
@ 2016-04-23  2:31     ` Peter Zijlstra
  0 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-23  2:31 UTC (permalink / raw)
  To: Boqun Feng
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, waiman.long,
	fweisbec, linux-kernel, linux-arch, rth, vgupta, linux, egtvedt,
	realmz6, ysato, rkuo, tony.luck, geert, james.hogan, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Sat, Apr 23, 2016 at 12:41:57AM +0800, Boqun Feng wrote:
> > +#define ATOMIC_FETCH_OP_RELAXED(op, asm_op)				\
> > +static inline int atomic_fetch_##op##_relaxed(int a, atomic_t *v)	\
> > +{									\
> > +	int res, t;							\
> > +									\
> > +	__asm__ __volatile__(						\
> > +"1:	lwarx	%0,0,%4		# atomic_fetch_" #op "_relaxed\n"	\
> > +	#asm_op " %1,%2,%0\n"						\
> 
> Should be
> 
> 	#asm_op " %1,%3,%0\n"
> 
> right? Because %2 is v->counter and %3 is @a.

Indeed, thanks!

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 00/31] implement atomic_fetch_$op
  2016-04-22 18:35     ` Kalle Valo
@ 2016-04-23  3:23       ` Fengguang Wu
  0 siblings, 0 replies; 79+ messages in thread
From: Fengguang Wu @ 2016-04-23  3:23 UTC (permalink / raw)
  To: Kalle Valo
  Cc: Peter Zijlstra, torvalds, mingo, tglx, will.deacon, paulmck,
	boqun.feng, waiman.long, fweisbec, linux-kernel, linux-arch, rth,
	vgupta, linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, schwidefsky, dalias,
	davem, cmetcalf, jcmvbkbc, arnd, dbueso

On Fri, Apr 22, 2016 at 09:35:06PM +0300, Kalle Valo wrote:
> Fengguang Wu <fengguang.wu@intel.com> writes:
> 
> >> OK, weirdness. I received the "BUILD SUCCESS" email without any arm64
> >> builds listed, but I just received a build bot email telling me the
> >> arm64 build was borked (which I know it is).
> >
> > Sorry, that may happen because even though most errors will be
> > detected in the first hour or before the BUILD SUCCESS/DONE
> > notification, the build/boot/performance tests for a particular branch
> > may continue for days, during which time test coverage keeps growing.
> > Which means it's possible to receive a build failure after receiving
> > BUILD SUCCESS notification.
> >
> > In particular, 0-day bot classifies 500+ kconfigs into 2 priority lists:
> >
> > P1: 100+ realtime priority kconfigs which should be finished before sending
> >     out BUILD SUCCESS notification
> >
> > P2: 400+ background priority kconfigs which may take hours to days to finish
> >
> > That split is a tradeoff between timeliness and completeness. It turns
> > out to work well as long as we choose the suitable P1 list.
> >
> > So the more accurate interpretation of "BUILD SUCCESS/DONE" would be:
> > 0day bot is working on your tree (no worry about out-of-service) and
> > reached a major milestone.
> 
> Thanks, this is very useful information. But would it also be possible
> to get a report about the P2 completion (or failure)?

Good question! I'm not sure people would care about (or might even be
confused by) a report that arrives after days; however, based on some
statistical data we may find a suitable time to wait for possible error
reports.

Past reports show that about 60% of errors are reported within 2 hours,
90% within 24 hours, and about 1% only after 1 week.

So developers may reasonably wait for 1 day before sending out patches.

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
  2016-04-22 14:16       ` Peter Zijlstra
@ 2016-04-25  4:26         ` Vineet Gupta
  -1 siblings, 0 replies; 79+ messages in thread
From: Vineet Gupta @ 2016-04-25  4:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, linux,
	egtvedt, realmz6, ysato, rkuo, tony.luck, geert, james.hogan,
	ralf, dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Friday 22 April 2016 07:46 PM, Peter Zijlstra wrote:
> On Fri, Apr 22, 2016 at 10:50:41AM +0000, Vineet Gupta wrote:
>
>>> > > +#define ATOMIC_FETCH_OP(op, c_op, asm_op)				\
>>> > > +static inline int atomic_fetch_##op(int i, atomic_t *v)			\
>>> > > +{									\
>>> > > +	unsigned int val, result;			                \
>>> > > +	SCOND_FAIL_RETRY_VAR_DEF                                        \
>>> > > +									\
>>> > > +	/*								\
>>> > > +	 * Explicit full memory barrier needed before/after as		\
>>> > > +	 * LLOCK/SCOND themselves don't provide any such semantics	\
>>> > > +	 */								\
>>> > > +	smp_mb();							\
>>> > > +									\
>>> > > +	__asm__ __volatile__(						\
>>> > > +	"1:	llock   %[val], [%[ctr]]		\n"		\
>>> > > +	"	mov %[result], %[val]			\n"		\
>> > 
>> > Calling it "result" could be a bit confusing; this is meant to be the "orig" value.
>> > It is indeed the "result" of the API, but for the atomic operation it is the pristine value.
>> > 
>> > Also we can optimize away that MOV - given there are plenty of regs, so
>> > 
>>> > > +	"	" #asm_op " %[val], %[val], %[i]	\n"		\
>>> > > +	"	scond   %[val], [%[ctr]]		\n"		\
>> > 
>> > Instead have
>> > 
>> > +	"	" #asm_op " %[result], %[val], %[i]	\n"		\
>> > +	"	scond   %[result], [%[ctr]]		\n"		\
>> > 
>> > 
> Indeed, how about something like so?
>
> ---
> Subject: locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Mon Apr 18 01:16:09 CEST 2016
>
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
>
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Acked-by: Vineet Gupta <vgupta@synopsys.com>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 19/31] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 ` [RFC][PATCH 19/31] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-25  8:06   ` Martin Schwidefsky
  2016-04-25  8:26     ` Peter Zijlstra
  0 siblings, 1 reply; 79+ messages in thread
From: Martin Schwidefsky @ 2016-04-25  8:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Fri, 22 Apr 2016 11:04:32 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
> 
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/s390/include/asm/atomic.h |   42 +++++++++++++++++++++++++++++++----------
>  1 file changed, 32 insertions(+), 10 deletions(-)

That looks good, the code compiles and the functions are generated correctly.
We will know for sure if it works after the first user of these new functions
hits the kernel.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 19/31] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-25  8:06   ` Martin Schwidefsky
@ 2016-04-25  8:26     ` Peter Zijlstra
  0 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-25  8:26 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Mon, Apr 25, 2016 at 10:06:25AM +0200, Martin Schwidefsky wrote:
> On Fri, 22 Apr 2016 11:04:32 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > Implement FETCH-OP atomic primitives, these are very similar to the
> > existing OP-RETURN primitives we already have, except they return the
> > value of the atomic variable _before_ modification.
> > 
> > This is especially useful for irreversible operations -- such as
> > bitops (because it becomes impossible to reconstruct the state prior
> > to modification).
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  arch/s390/include/asm/atomic.h |   42 +++++++++++++++++++++++++++++++----------
> >  1 file changed, 32 insertions(+), 10 deletions(-)
> 
> That looks good, the code compiles and the functions are generated correctly.
> We will know for sure if it works after the first user of these new functions
> hits the kernel.

So we already have an atomic_fetch_or() user in the kernel, and this
series adds an atomic_fetch_add_acquire() user.

But yes, we'll undoubtedly grow more over time :-)
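
A typical caller pattern would be to claim a flag bit and learn whether
it was already set; the names below (MY_FLAG_PENDING, my_state, my_work)
are purely hypothetical, not an existing in-tree user:

	/* MY_FLAG_PENDING, my_state and my_work are made-up names */
	int old = atomic_fetch_or(MY_FLAG_PENDING, &my_state);

	if (!(old & MY_FLAG_PENDING))
		schedule_work(&my_work);	/* we were the first to set the flag */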

> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Thanks!

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 22/31] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 ` [RFC][PATCH 22/31] locking,tile: " Peter Zijlstra
@ 2016-04-25 21:10     ` Chris Metcalf
       [not found]   ` <571E840A.8090703@mellanox.com>
  1 sibling, 0 replies; 79+ messages in thread
From: Chris Metcalf @ 2016-04-25 21:10 UTC (permalink / raw)
  To: Peter Zijlstra, torvalds, mingo, tglx, will.deacon, paulmck,
	boqun.feng, waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, jcmvbkbc, arnd, dbueso,
	fengguang.wu

[Grr, resending as text/plain; I have no idea what inspired Thunderbird
to send this as multipart/mixed with HTML.]

On 4/22/2016 5:04 AM, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable_before_  modification.
>
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
>
> XXX please look at the tilegx (CONFIG_64BIT) atomics, I think we get
> the barriers wrong (at the very least they're inconsistent).
>
> Signed-off-by: Peter Zijlstra (Intel)<peterz@infradead.org>
> ---
>   arch/tile/include/asm/atomic.h    |    4 +
>   arch/tile/include/asm/atomic_32.h |   60 +++++++++++++------
>   arch/tile/include/asm/atomic_64.h |  117 +++++++++++++++++++++++++-------------
>   arch/tile/include/asm/bitops_32.h |   18 ++---
>   arch/tile/lib/atomic_32.c         |   42 ++++++-------
>   arch/tile/lib/atomic_asm_32.S     |   14 ++--
>   6 files changed, 161 insertions(+), 94 deletions(-)
>
> [...]
>   static inline int atomic_add_return(int i, atomic_t *v)
>   {
>   	int val;
>   	smp_mb();  /* barrier for proper semantics */
>   	val = __insn_fetchadd4((void *)&v->counter, i) + i;
>   	barrier();  /* the "+ i" above will wait on memory */
> +	/* XXX smp_mb() instead, as per cmpxchg() ? */
>   	return val;
>   }

The existing code is subtle but I'm pretty sure it's not a bug.

The tilegx architecture will take the "+ i" and generate an add instruction.
The compiler barrier will make sure the add instruction happens before
anything else that could touch memory, and the microarchitecture will make
sure that the result of the atomic fetchadd has been returned to the core
before any further instructions are issued.  (The memory architecture is
lazy, but when you feed a load through an arithmetic operation, we block
issuing any further instructions until the add's operands are available.)

This would not be an adequate memory barrier in general, since other loads
or stores might still be in flight, even if the "val" operand had made it
from memory to the core at this point.  However, we have issued no other
stores or loads since the previous memory barrier, so we know that there
can be no other loads or stores in flight, and thus the compiler barrier
plus arithmetic op is equivalent to a memory barrier here.
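
By contrast, a fetch-style variant returns the loaded value as-is, so
nothing feeds it through the ALU and an explicit barrier is needed after
the intrinsic; roughly (a sketch only, not necessarily the exact code in
the patch):

	static inline int atomic_fetch_add(int i, atomic_t *v)
	{
		int val;
		smp_mb();  /* barrier for proper semantics */
		val = __insn_fetchadd4((void *)&v->counter, i);
		smp_mb();  /* nothing consumes the load, so a full barrier is needed */
		return val;
	}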

In hindsight, perhaps a more substantial comment would have been helpful
here.  Unless you see something missing in my analysis, I'll plan to go
ahead and add a suitable comment here :-)

Otherwise, though just based on code inspection so far:

Acked-by: Chris Metcalf<cmetcalf@mellanox.com>  [for tile]

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 22/31] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
@ 2016-04-25 21:10     ` Chris Metcalf
  0 siblings, 0 replies; 79+ messages in thread
From: Chris Metcalf @ 2016-04-25 21:10 UTC (permalink / raw)
  To: Peter Zijlstra, torvalds, mingo, tglx, will.deacon, paulmck,
	boqun.feng, waiman.long, fweisbec
  Cc: linux-kernel, linux-arch, rth, vgupta, linux, egtvedt, realmz6,
	ysato, rkuo, tony.luck, geert, james.hogan, ralf, dhowells, jejb,
	mpe, schwidefsky, dalias, davem, jcmvbkbc, arnd, dbueso,
	fengguang.wu

[Grr, resending as text/plain; I have no idea what inspired Thunderbird
to send this as multipart/mixed with HTML.]

On 4/22/2016 5:04 AM, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable_before_  modification.
>
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
>
> XXX please look at the tilegx (CONFIG_64BIT) atomics, I think we get
> the barriers wrong (at the very least they're inconsistent).
>
> Signed-off-by: Peter Zijlstra (Intel)<peterz@infradead.org>
> ---
>   arch/tile/include/asm/atomic.h    |    4 +
>   arch/tile/include/asm/atomic_32.h |   60 +++++++++++++------
>   arch/tile/include/asm/atomic_64.h |  117 +++++++++++++++++++++++++-------------
>   arch/tile/include/asm/bitops_32.h |   18 ++---
>   arch/tile/lib/atomic_32.c         |   42 ++++++-------
>   arch/tile/lib/atomic_asm_32.S     |   14 ++--
>   6 files changed, 161 insertions(+), 94 deletions(-)
>
> [...]
>   static inline int atomic_add_return(int i, atomic_t *v)
>   {
>   	int val;
>   	smp_mb();  /* barrier for proper semantics */
>   	val = __insn_fetchadd4((void *)&v->counter, i) + i;
>   	barrier();  /* the "+ i" above will wait on memory */
> +	/* XXX smp_mb() instead, as per cmpxchg() ? */
>   	return val;
>   }

The existing code is subtle but I'm pretty sure it's not a bug.

The tilegx architecture will take the "+ i" and generate an add instruction.
The compiler barrier will make sure the add instruction happens before
anything else that could touch memory, and the microarchitecture will make
sure that the result of the atomic fetchadd has been returned to the core
before any further instructions are issued.  (The memory architecture is
lazy, but when you feed a load through an arithmetic operation, we block
issuing any further instructions until the add's operands are available.)

This would not be an adequate memory barrier in general, since other loads
or stores might still be in flight, even if the "val" operand had made it
from memory to the core at this point.  However, we have issued no other
stores or loads since the previous memory barrier, so we know that there
can be no other loads or stores in flight, and thus the compiler barrier
plus arithmetic op is equivalent to a memory barrier here.

In hindsight, perhaps a more substantial comment would have been helpful
here.  Unless you see something missing in my analysis, I'll plan to go
ahead and add a suitable comment here :-)

Otherwise, though just based on code inspection so far:

Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 10/31] locking,hexagon: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-23  2:16   ` Peter Zijlstra
@ 2016-04-26  0:39     ` Richard Kuo
  0 siblings, 0 replies; 79+ messages in thread
From: Richard Kuo @ 2016-04-26  0:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, tony.luck, geert, james.hogan,
	ralf, dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Sat, Apr 23, 2016 at 04:16:58AM +0200, Peter Zijlstra wrote:
> On Fri, Apr 22, 2016 at 11:04:23AM +0200, Peter Zijlstra wrote:
> > +#define ATOMIC_FETCH_OP(op)						\
> > +static inline int atomic_fetch_##op(int i, atomic_t *v)			\
> > +{									\
> > +	int output, val;						\
> > +									\
> > +	__asm__ __volatile__ (						\
> > +		"1:	%0 = memw_locked(%2);\n"			\
> > +		"	%1 = "#op "(%0,%3);\n"				\
> > +		"	memw_locked(%2,P3)=%0;\n"			\
> 
> I'm thinking that wants to be:
> 
> 			memw_locked(%2,P3)=%1;
> 
> > +		"	if !P3 jump 1b;\n"				\
> > +		: "=&r" (output), "=&r" (val)				\
> > +		: "r" (&v->counter), "r" (i)				\
> > +		: "memory", "p3"					\
> > +	);								\
> > +	return output;							\
> > +}

I think you are right.  With the above fix,

Acked-by: Richard Kuo <rkuo@codeaurora.org>
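
For reference, the macro with that one fix folded in would read roughly as
follows (a sketch of the agreed correction only, not a tested build): the
new value computed into %1 is what gets stored back, while the old value
in %0 is what the function returns.

#define ATOMIC_FETCH_OP(op)						\
static inline int atomic_fetch_##op(int i, atomic_t *v)		\
{									\
	int output, val;	/* output: old value; val: new value */	\
									\
	__asm__ __volatile__ (						\
		"1:	%0 = memw_locked(%2);\n"			\
		"	%1 = "#op "(%0,%3);\n"				\
		"	memw_locked(%2,P3)=%1;\n"			\
		"	if !P3 jump 1b;\n"				\
		: "=&r" (output), "=&r" (val)				\
		: "r" (&v->counter), "r" (i)				\
		: "memory", "p3"					\
	);								\
	return output;							\
}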


-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, 
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 79+ messages in thread

* [PATCH] tile: clarify barrier semantics of atomic_add_return
  2016-04-25 21:10     ` Chris Metcalf
  (?)
@ 2016-04-26 14:00     ` Chris Metcalf
  -1 siblings, 0 replies; 79+ messages in thread
From: Chris Metcalf @ 2016-04-26 14:00 UTC (permalink / raw)
  To: Peter Zijlstra, torvalds, mingo, tglx, will.deacon, paulmck,
	boqun.feng, waiman.long, fweisbec
  Cc: Chris Metcalf, linux-kernel

A recent discussion on LKML made it clear that the one-line
comment previously in atomic_add_return() was not clear enough:

https://lkml.kernel.org/r/571E87E2.3010306@mellanox.com

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
---
 arch/tile/include/asm/atomic_64.h | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/tile/include/asm/atomic_64.h b/arch/tile/include/asm/atomic_64.h
index 51cabc26e387..b0531a623653 100644
--- a/arch/tile/include/asm/atomic_64.h
+++ b/arch/tile/include/asm/atomic_64.h
@@ -37,12 +37,25 @@ static inline void atomic_add(int i, atomic_t *v)
 	__insn_fetchadd4((void *)&v->counter, i);
 }
 
+/*
+ * Note a subtlety of the locking here.  We are required to provide a
+ * full memory barrier before and after the operation.  However, we
+ * only provide an explicit mb before the operation.  After the
+ * operation, we use barrier() to get a full mb for free, because:
+ *
+ * (1) The barrier directive to the compiler prohibits any instructions
+ * being statically hoisted before the barrier;
+ * (2) the microarchitecture will not issue any further instructions
+ * until the fetchadd result is available for the "+ i" add instruction;
 + * (3) the smp_mb before the fetchadd ensures that no other memory
+ * operations are in flight at this point.
+ */
 static inline int atomic_add_return(int i, atomic_t *v)
 {
 	int val;
 	smp_mb();  /* barrier for proper semantics */
 	val = __insn_fetchadd4((void *)&v->counter, i) + i;
-	barrier();  /* the "+ i" above will wait on memory */
+	barrier();  /* equivalent to smp_mb(); see block comment above */
 	return val;
 }
 
@@ -95,7 +108,7 @@ static inline long atomic64_add_return(long i, atomic64_t *v)
 	int val;
 	smp_mb();  /* barrier for proper semantics */
 	val = __insn_fetchadd((void *)&v->counter, i) + i;
-	barrier();  /* the "+ i" above will wait on memory */
+	barrier();  /* equivalent to smp_mb; see atomic_add_return() */
 	return val;
 }
 
-- 
2.7.2

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 22/31] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
       [not found]   ` <571E840A.8090703@mellanox.com>
@ 2016-04-26 15:28     ` Peter Zijlstra
  2016-04-26 15:32         ` Chris Metcalf
  0 siblings, 1 reply; 79+ messages in thread
From: Peter Zijlstra @ 2016-04-26 15:28 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, schwidefsky, dalias,
	davem, jcmvbkbc, arnd, dbueso, fengguang.wu

On Mon, Apr 25, 2016 at 04:54:34PM -0400, Chris Metcalf wrote:
> On 4/22/2016 5:04 AM, Peter Zijlstra wrote:

> >  static inline int atomic_add_return(int i, atomic_t *v)
> >  {
> >  	int val;
> >  	smp_mb();  /* barrier for proper semantics */
> >  	val = __insn_fetchadd4((void *)&v->counter, i) + i;
> >  	barrier();  /* the "+ i" above will wait on memory */
> >+	/* XXX smp_mb() instead, as per cmpxchg() ? */
> >  	return val;
> >  }
> 
> The existing code is subtle but I'm pretty sure it's not a bug.
> 
> The tilegx architecture will take the "+ i" and generate an add instruction.
> The compiler barrier will make sure the add instruction happens before
> anything else that could touch memory, and the microarchitecture will make
> sure that the result of the atomic fetchadd has been returned to the core
> before any further instructions are issued.  (The memory architecture is
> lazy, but when you feed a load through an arithmetic operation, we block
> issuing any further instructions until the add's operands are available.)
> 
> This would not be an adequate memory barrier in general, since other loads
> or stores might still be in flight, even if the "val" operand had made it
> from memory to the core at this point.  However, we have issued no other
> stores or loads since the previous memory barrier, so we know that there
> can be no other loads or stores in flight, and thus the compiler barrier
> plus arithmetic op is equivalent to a memory barrier here.
> 
> In hindsight, perhaps a more substantial comment would have been helpful
> here.  Unless you see something missing in my analysis, I'll plan to go
> ahead and add a suitable comment here :-)
> 
> Otherwise, though just based on code inspection so far:
> 
> Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]

Thanks!

Just to verify: the new fetch-op thingies _do_ indeed need the extra
smp_mb() as per my patch, because there is no trailing instruction
depending on the completion of the load?

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 22/31] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  2016-04-26 15:28     ` [RFC][PATCH 22/31] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
@ 2016-04-26 15:32         ` Chris Metcalf
  0 siblings, 0 replies; 79+ messages in thread
From: Chris Metcalf @ 2016-04-26 15:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert,
	james.hogan, ralf, dhowells, jejb, mpe, schwidefsky, dalias,
	davem, jcmvbkbc, arnd, dbueso, fengguang.wu

On 4/26/2016 11:28 AM, Peter Zijlstra wrote:
> On Mon, Apr 25, 2016 at 04:54:34PM -0400, Chris Metcalf wrote:
>> Otherwise, though just based on code inspection so far:
>>
>> Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
> Thanks!
>
> Just to verify; the new fetch-op thingies _do_ indeed need the extra
> smp_mb() as per my patch, because there is no trailing instruction
> depending on the completion of the load?

Exactly.  I should have said so explicitly :-)
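
For concreteness, a minimal sketch of what that means for the tilegx
fetch-op (an illustration of the barrier placement being discussed here,
not the exact hunk from the series, and untested):

static inline int atomic_fetch_add(int i, atomic_t *v)
{
	int val;

	smp_mb();  /* full barrier before the atomic op, as in add_return */
	val = __insn_fetchadd4((void *)&v->counter, i);
	/* no dependent "+ i" consumes val here, so barrier() is not enough */
	smp_mb();
	return val;
}

The old value returned by such a fetch-op is what makes irreversible
patterns work; e.g. something like "atomic_fetch_or(1, v) & 1" tells the
caller whether the bit was already set, which atomic_or() alone cannot.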

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 14/31] locking,metag: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-22  9:04 ` [RFC][PATCH 14/31] locking,metag: " Peter Zijlstra
@ 2016-04-30  0:20     ` James Hogan
  0 siblings, 0 replies; 79+ messages in thread
From: James Hogan @ 2016-04-30  0:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

[-- Attachment #1: Type: text/plain, Size: 5840 bytes --]

Hi Peter,

On Fri, Apr 22, 2016 at 11:04:27AM +0200, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
> 
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/metag/include/asm/atomic.h        |    2 +
>  arch/metag/include/asm/atomic_lnkget.h |   36 +++++++++++++++++++++++++++++----
>  arch/metag/include/asm/atomic_lock1.h  |   33 ++++++++++++++++++++++++++----
>  3 files changed, 63 insertions(+), 8 deletions(-)
> 
> --- a/arch/metag/include/asm/atomic.h
> +++ b/arch/metag/include/asm/atomic.h
> @@ -17,6 +17,8 @@
>  #include <asm/atomic_lnkget.h>
>  #endif
>  
> +#define atomic_fetch_or atomic_fetch_or
> +
>  #define atomic_add_negative(a, v)       (atomic_add_return((a), (v)) < 0)
>  
>  #define atomic_dec_return(v) atomic_sub_return(1, (v))
> --- a/arch/metag/include/asm/atomic_lnkget.h
> +++ b/arch/metag/include/asm/atomic_lnkget.h
> @@ -69,16 +69,44 @@ static inline int atomic_##op##_return(i
>  	return result;							\
>  }
>  
> -#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op)
> +#define ATOMIC_FETCH_OP(op)						\
> +static inline int atomic_fetch_##op(int i, atomic_t *v)			\
> +{									\
> +	int result, temp;						\
> +									\
> +	smp_mb();							\
> +									\
> +	asm volatile (							\
> +		"1:	LNKGETD %1, [%2]\n"				\
> +		"	" #op "	%0, %1, %3\n"				\

I was hoping never to have to think about meta asm constraints again :-P

and/or/xor are only available in the data units, as determined by %1 in
this case, so the constraint for result shouldn't have "a" in it.

diff --git a/arch/metag/include/asm/atomic_lnkget.h b/arch/metag/include/asm/atomic_lnkget.h
index 50ad05050947..def2c642f053 100644
--- a/arch/metag/include/asm/atomic_lnkget.h
+++ b/arch/metag/include/asm/atomic_lnkget.h
@@ -84,7 +84,7 @@ static inline int atomic_fetch_##op(int i, atomic_t *v)			\
 		"	ANDT	%0, %0, #HI(0x3f000000)\n"		\
 		"	CMPT	%0, #HI(0x02000000)\n"			\
 		"	BNZ 1b\n"					\
-		: "=&d" (temp), "=&da" (result)				\
+		: "=&d" (temp), "=&d" (result)				\
 		: "da" (&v->counter), "bd" (i)				\
 		: "cc");						\

That also ensures the "bd" constraint for %3 (meaning "an op2 register
where op1 [%1 in this case] is a data unit register and the instruction
supports O2R") is consistent.

So with that change this patch looks good to me:
Acked-by: James Hogan <james.hogan@imgtec.com>


Note that for the ATOMIC_OP_RETURN() case (add/sub only) either address
or data units can be used (hence the "da" for %1), but then the "bd"
constraint on %3 is wrong as op1 [%1] may not be in data unit (sorry I
didn't spot that at the time). I'll queue a fix, something like below
probably ("br" means "An Op2 register and the instruction supports O2R",
i.e. op1/%1 doesn't have to be a data unit register):

diff --git a/arch/metag/include/asm/atomic_lnkget.h b/arch/metag/include/asm/atomic_lnkget.h
index 50ad05050947..def2c642f053 100644
--- a/arch/metag/include/asm/atomic_lnkget.h
+++ b/arch/metag/include/asm/atomic_lnkget.h
@@ -61,7 +61,7 @@ static inline int atomic_##op##_return(int i, atomic_t *v)		\
 		"	CMPT	%0, #HI(0x02000000)\n"			\
 		"	BNZ 1b\n"					\
 		: "=&d" (temp), "=&da" (result)				\
-		: "da" (&v->counter), "bd" (i)				\
+		: "da" (&v->counter), "br" (i)				\
 		: "cc");						\
 									\
 	smp_mb();							\
 									\

Thanks
James

> +		"	LNKSETD [%2], %0\n"				\
> +		"	DEFR	%0, TXSTAT\n"				\
> +		"	ANDT	%0, %0, #HI(0x3f000000)\n"		\
> +		"	CMPT	%0, #HI(0x02000000)\n"			\
> +		"	BNZ 1b\n"					\
> +		: "=&d" (temp), "=&da" (result)				\
> +		: "da" (&v->counter), "bd" (i)				\
> +		: "cc");						\
> +									\
> +	smp_mb();							\
> +									\
> +	return result;							\
> +}
> +
> +#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op)
>  
>  ATOMIC_OPS(add)
>  ATOMIC_OPS(sub)
>  
> -ATOMIC_OP(and)
> -ATOMIC_OP(or)
> -ATOMIC_OP(xor)
> +#undef ATOMIC_OPS
> +#define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_FETCH_OP(op)
> +
> +ATOMIC_OPS(and)
> +ATOMIC_OPS(or)
> +ATOMIC_OPS(xor)
>  
>  #undef ATOMIC_OPS
> +#undef ATOMIC_FETCH_OP
>  #undef ATOMIC_OP_RETURN
>  #undef ATOMIC_OP
>  
> --- a/arch/metag/include/asm/atomic_lock1.h
> +++ b/arch/metag/include/asm/atomic_lock1.h
> @@ -64,15 +64,40 @@ static inline int atomic_##op##_return(i
>  	return result;							\
>  }
>  
> -#define ATOMIC_OPS(op, c_op) ATOMIC_OP(op, c_op) ATOMIC_OP_RETURN(op, c_op)
> +#define ATOMIC_FETCH_OP(op, c_op)					\
> +static inline int atomic_fetch_##op(int i, atomic_t *v)			\
> +{									\
> +	unsigned long result;						\
> +	unsigned long flags;						\
> +									\
> +	__global_lock1(flags);						\
> +	result = v->counter;						\
> +	fence();							\
> +	v->counter c_op i;						\
> +	__global_unlock1(flags);					\
> +									\
> +	return result;							\
> +}
> +
> +#define ATOMIC_OPS(op, c_op)						\
> +	ATOMIC_OP(op, c_op)						\
> +	ATOMIC_OP_RETURN(op, c_op)					\
> +	ATOMIC_FETCH_OP(op, c_op)
>  
>  ATOMIC_OPS(add, +=)
>  ATOMIC_OPS(sub, -=)
> -ATOMIC_OP(and, &=)
> -ATOMIC_OP(or, |=)
> -ATOMIC_OP(xor, ^=)
>  
>  #undef ATOMIC_OPS
> +#define ATOMIC_OPS(op, c_op)						\
> +	ATOMIC_OP(op, c_op)						\
> +	ATOMIC_FETCH_OP(op, c_op)
> +
> +ATOMIC_OPS(and, &=)
> +ATOMIC_OPS(or, |=)
> +ATOMIC_OPS(xor, ^=)
> +
> +#undef ATOMIC_OPS
> +#undef ATOMIC_FETCH_OP
>  #undef ATOMIC_OP_RETURN
>  #undef ATOMIC_OP
>  
> 
> 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: [RFC][PATCH 14/31] locking,metag: Implement atomic_fetch_{add,sub,and,or,xor}()
  2016-04-30  0:20     ` James Hogan
  (?)
@ 2016-05-02  8:15     ` Peter Zijlstra
  -1 siblings, 0 replies; 79+ messages in thread
From: Peter Zijlstra @ 2016-05-02  8:15 UTC (permalink / raw)
  To: James Hogan
  Cc: torvalds, mingo, tglx, will.deacon, paulmck, boqun.feng,
	waiman.long, fweisbec, linux-kernel, linux-arch, rth, vgupta,
	linux, egtvedt, realmz6, ysato, rkuo, tony.luck, geert, ralf,
	dhowells, jejb, mpe, schwidefsky, dalias, davem, cmetcalf,
	jcmvbkbc, arnd, dbueso, fengguang.wu

On Sat, Apr 30, 2016 at 01:20:31AM +0100, James Hogan wrote:
> > +	asm volatile (							\
> > +		"1:	LNKGETD %1, [%2]\n"				\
> > +		"	" #op "	%0, %1, %3\n"				\
> 
> I was hoping never to have to think about meta asm constraints again :-P

There is a solution for that: rm -rf arch/metag :-)

> and/or/xor are only available in the data units, as determined by %1 in
> this case, so the constraint for result shouldn't have "a" in it.
> 
> diff --git a/arch/metag/include/asm/atomic_lnkget.h b/arch/metag/include/asm/atomic_lnkget.h
> index 50ad05050947..def2c642f053 100644
> --- a/arch/metag/include/asm/atomic_lnkget.h
> +++ b/arch/metag/include/asm/atomic_lnkget.h
> @@ -84,7 +84,7 @@ static inline int atomic_fetch_##op(int i, atomic_t *v)			\
>  		"	ANDT	%0, %0, #HI(0x3f000000)\n"		\
>  		"	CMPT	%0, #HI(0x02000000)\n"			\
>  		"	BNZ 1b\n"					\
> -		: "=&d" (temp), "=&da" (result)				\
> +		: "=&d" (temp), "=&d" (result)				\
>  		: "da" (&v->counter), "bd" (i)				\
>  		: "cc");						\
> 
> That also ensures the "bd" constraint for %3 (meaning "an op2 register
> where op1 [%1 in this case] is a data unit register and the instruction
> supports O2R") is consistent.
> 
> So with that change this patch looks good to me:

Right, so I'd _never_ have thought to look at that,

> Acked-by: James Hogan <james.hogan@imgtec.com>

Thanks!

> Note that for the ATOMIC_OP_RETURN() case (add/sub only) either address
> or data units can be used (hence the "da" for %1), but then the "bd"
> constraint on %3 is wrong as op1 [%1] may not be in data unit (sorry I
> didn't spot that at the time). I'll queue a fix, something like below
> probably ("br" means "An Op2 register and the instruction supports O2R",
> i.e. op1/%1 doesn't have to be a data unit register):
> 
> diff --git a/arch/metag/include/asm/atomic_lnkget.h b/arch/metag/include/asm/atomic_lnkget.h
> index 50ad05050947..def2c642f053 100644
> --- a/arch/metag/include/asm/atomic_lnkget.h
> +++ b/arch/metag/include/asm/atomic_lnkget.h
> @@ -61,7 +61,7 @@ static inline int atomic_##op##_return(int i, atomic_t *v)		\
>  		"	CMPT	%0, #HI(0x02000000)\n"			\
>  		"	BNZ 1b\n"					\
>  		: "=&d" (temp), "=&da" (result)				\
> -		: "da" (&v->counter), "bd" (i)				\
> +		: "da" (&v->counter), "br" (i)				\
>  		: "cc");						\
>  									\
>  	smp_mb();							\
>  									\

Thanks, again :-)
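
For reference, with the "=&d" (result) fix for the new fetch-op folded in,
the lnkget ATOMIC_FETCH_OP would read roughly as follows (a consolidation
of the quoted patch plus that constraint change; a sketch only, not
separately build-tested):

#define ATOMIC_FETCH_OP(op)						\
static inline int atomic_fetch_##op(int i, atomic_t *v)		\
{									\
	int result, temp;	/* result: old value; temp: new value, then status */ \
									\
	smp_mb();							\
									\
	asm volatile (							\
		"1:	LNKGETD %1, [%2]\n"				\
		"	" #op "	%0, %1, %3\n"				\
		"	LNKSETD [%2], %0\n"				\
		"	DEFR	%0, TXSTAT\n"				\
		"	ANDT	%0, %0, #HI(0x3f000000)\n"		\
		"	CMPT	%0, #HI(0x02000000)\n"			\
		"	BNZ 1b\n"					\
		: "=&d" (temp), "=&d" (result)				\
		: "da" (&v->counter), "bd" (i)				\
		: "cc");						\
									\
	smp_mb();							\
									\
	return result;							\
}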

^ permalink raw reply	[flat|nested] 79+ messages in thread

end of thread, other threads:[~2016-05-02  8:16 UTC | newest]

Thread overview: 79+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-22  9:04 [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 01/31] locking: Flip arguments to atomic_fetch_or Peter Zijlstra
2016-04-22 10:54   ` Will Deacon
2016-04-22 11:09   ` Geert Uytterhoeven
2016-04-22 11:09     ` Geert Uytterhoeven
2016-04-22 14:18     ` Peter Zijlstra
2016-04-22 14:18       ` Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 02/31] locking,alpha: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
2016-04-22 16:57   ` Richard Henderson
2016-04-23  1:55     ` Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}() Peter Zijlstra
2016-04-22 10:50   ` Vineet Gupta
2016-04-22 10:50     ` Vineet Gupta
2016-04-22 14:16     ` Peter Zijlstra
2016-04-22 14:16       ` Peter Zijlstra
2016-04-25  4:26       ` Vineet Gupta
2016-04-25  4:26         ` Vineet Gupta
2016-04-22 14:26     ` Peter Zijlstra
2016-04-22 14:26       ` Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 04/31] locking,arm: Implement atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
2016-04-22 11:35   ` Will Deacon
2016-04-22  9:04 ` [RFC][PATCH 05/31] locking,arm64: " Peter Zijlstra
2016-04-22 11:08   ` Will Deacon
2016-04-22 14:23   ` Will Deacon
2016-04-22 14:23     ` Will Deacon
2016-04-22 14:23     ` Will Deacon
2016-04-22  9:04 ` [RFC][PATCH 06/31] locking,avr32: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22 11:58   ` Hans-Christian Noren Egtvedt
2016-04-22  9:04 ` [RFC][PATCH 07/31] locking,blackfin: " Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 08/31] locking,frv: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 09/31] locking,h8300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 10/31] locking,hexagon: " Peter Zijlstra
2016-04-23  2:16   ` Peter Zijlstra
2016-04-26  0:39     ` Richard Kuo
2016-04-22  9:04 ` [RFC][PATCH 11/31] locking,ia64: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 12/31] locking,m32r: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 13/31] locking,m68k: " Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 14/31] locking,metag: " Peter Zijlstra
2016-04-30  0:20   ` James Hogan
2016-04-30  0:20     ` James Hogan
2016-05-02  8:15     ` Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 15/31] locking,mips: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 16/31] locking,mn10300: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 17/31] locking,parisc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 18/31] locking,powerpc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
2016-04-22 16:41   ` Boqun Feng
2016-04-23  2:31     ` Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 19/31] locking,s390: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-25  8:06   ` Martin Schwidefsky
2016-04-25  8:26     ` Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 20/31] locking,sh: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 21/31] locking,sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 22/31] locking,tile: " Peter Zijlstra
2016-04-25 21:10   ` Chris Metcalf
2016-04-25 21:10     ` Chris Metcalf
2016-04-26 14:00     ` [PATCH] tile: clarify barrier semantics of atomic_add_return Chris Metcalf
     [not found]   ` <571E840A.8090703@mellanox.com>
2016-04-26 15:28     ` [RFC][PATCH 22/31] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-26 15:32       ` Chris Metcalf
2016-04-26 15:32         ` Chris Metcalf
2016-04-22  9:04 ` [RFC][PATCH 23/31] locking,x86: " Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 24/31] locking,xtensa: Implement atomic_fetch_{add,sub,and,or,xor}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 25/31] locking: Fix atomic64_relaxed bits Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 26/31] locking: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 27/31] locking: Remove linux/atomic.h:atomic_fetch_or Peter Zijlstra
2016-04-22 13:02   ` Will Deacon
2016-04-22 14:21     ` Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 28/31] locking: Remove the deprecated atomic_{set,clear}_mask() functions Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 29/31] locking,alpha: Convert to _relaxed atomics Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 30/31] locking,mips: " Peter Zijlstra
2016-04-22  9:04 ` [RFC][PATCH 31/31] locking,qrwlock: Employ atomic_fetch_add_acquire() Peter Zijlstra
2016-04-22 14:25   ` Waiman Long
2016-04-22 14:25     ` Waiman Long
2016-04-22  9:44 ` [RFC][PATCH 00/31] implement atomic_fetch_$op Peter Zijlstra
2016-04-22 12:56   ` Fengguang Wu
2016-04-22 13:03     ` Will Deacon
2016-04-22 14:23     ` Peter Zijlstra
2016-04-23  1:59       ` Fengguang Wu
2016-04-22 18:35     ` Kalle Valo
2016-04-23  3:23       ` Fengguang Wu
