* [PATCH v3] MIPS: Cleanup R10000_LLSC_WAR logic in atomic.h
@ 2017-11-17 15:21 Joshua Kinard
  2017-11-17 17:45 ` Maciej W. Rozycki
From: Joshua Kinard @ 2017-11-17 15:21 UTC
  To: Ralf Baechle, James Hogan, Maciej W. Rozycki; +Cc: Paul Burton, Linux/MIPS

From: Joshua Kinard <kumba@gentoo.org>

This patch reduces down the conditionals in MIPS atomic code that deal
with a silicon bug in early R10000 cpus that required a workaround of
a branch-likely instruction following a store-conditional in order to
guarantee the whole ll/sc sequence is atomic.  As the only real
difference is a branch-likely instruction (beqzl) over a standard
branch (beqz), the conditional is reduced down to down to a single
preprocessor check at the top to pick the required instruction.

This requires writing the uses in assembler, thus we discard the
non-R10000 case that uses a mixture of a C do...while loop with
embedded assembler that was added back in commit 7837314d141c.  A note
found in the git log for empty commit 5999eca25c1f is also addressed
and all uses of the 'noreorder' directive are eliminated, allowing GAS
to handle scheduling of the assembler.

The macro definition for the branch instruction and the code comment
derives from a patch sent in earlier by Paul Burton for various cmpxchg
cleanups.
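
As an illustration, the unified loop now takes the following shape (a
minimal sketch of the ATOMIC_OP expansion for op=add; the function name is
made up, but the asm and constraints mirror the hunks below):

	#if R10000_LLSC_WAR
	# define __scbeqz "beqzl"
	#else
	# define __scbeqz "beqz"
	#endif

	static __inline__ void atomic_add_sketch(int i, atomic_t *v)
	{
		int temp;

		__asm__ __volatile__(
		"	.set	"MIPS_ISA_LEVEL"		\n"
		"1:	ll	%0, %1	# temp = v->counter	\n"
		"	addu	%0, %2	# temp += i		\n"
		"	sc	%0, %1	# try to store temp	\n"
		"\t" __scbeqz "	%0, 1b	# retry if sc failed	\n"
		"	.set	mips0				\n"
		: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)
		: "Ir" (i));
	}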

Cc: linux-mips@linux-mips.org
Cc: Paul Burton <paul.burton@mips.com>
Signed-off-by: Joshua Kinard <kumba@gentoo.org>
Tested-by: Joshua Kinard <kumba@gentoo.org>

---
Changes in v3:
- Make the result of subu/dsubu available to sc/scd in
  atomic_sub_if_positive and atomic64_sub_if_positive while still
  avoiding the use of 'noreorder' (see the annotated sketch below).

Changes in v2:
- Incorporate suggestions from upstream to atomic_sub_if_positive
  and atomic64_sub_if_positive to avoid memory barriers, as those
  are already implied and to eliminate the '.noreorder' directive
  and corner case of subu/dsubu's output not getting used in the
  branch-likely case due to being in the branch delay slot.
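
For clarity, the v3 sequence works because sc overwrites its source
register with the 0/1 success flag: copying the subu/dsubu result into the
scratch register first lets the store consume %1 while %0 still holds the
value to return.  An annotated sketch (function name made up, barriers
omitted for brevity):

	static __inline__ int atomic_sub_if_positive_sketch(int i, atomic_t *v)
	{
		int result, temp;

		__asm__ __volatile__(
		"	.set	"MIPS_ISA_LEVEL"		\n"
		"1:	ll	%1, %2	# temp = v->counter	\n"
		"	subu	%0, %1, %3	# result = temp - i	\n"
		"	move	%1, %0	# copy result; sc below	\n"
		"			# clobbers its source	\n"
		"	bltz	%0, 1f	# bail out if negative	\n"
		"	sc	%1, %2	# %1 becomes 0/1 flag	\n"
		"\t" __scbeqz "	%1, 1b	# retry if sc failed	\n"
		"1:						\n"
		"	.set	mips0				\n"
		: "=&r" (result), "=&r" (temp),
		  "+" GCC_OFF_SMALL_ASM() (v->counter)
		: "Ir" (i));

		return result;
	}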

---
 arch/mips/include/asm/atomic.h |  199 +++++--------------------------
 1 file changed, 38 insertions(+), 161 deletions(-)

diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 0ab176bdb8e8..629c4fca26a9 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -22,6 +22,17 @@
 #include <asm/cmpxchg.h>
 #include <asm/war.h>
 
+/*
+ * Using a branch-likely instruction to check the result of an sc instruction
+ * works around a bug present in R10000 CPUs prior to revision 3.0 that could
+ * cause ll-sc sequences to execute non-atomically.
+ */
+#if R10000_LLSC_WAR
+# define __scbeqz "beqzl"
+#else
+# define __scbeqz "beqz"
+#endif
+
 #define ATOMIC_INIT(i)	  { (i) }
 
 /*
@@ -44,31 +55,18 @@
 #define ATOMIC_OP(op, c_op, asm_op)					      \
 static __inline__ void atomic_##op(int i, atomic_t * v)			      \
 {									      \
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+	if (kernel_uses_llsc) {						      \
 		int temp;						      \
 									      \
 		__asm__ __volatile__(					      \
-		"	.set	arch=r4000				\n"   \
+		"	.set	"MIPS_ISA_LEVEL"			\n"   \
 		"1:	ll	%0, %1		# atomic_" #op "	\n"   \
 		"	" #asm_op " %0, %2				\n"   \
 		"	sc	%0, %1					\n"   \
-		"	beqzl	%0, 1b					\n"   \
+		"\t" __scbeqz "	%0, 1b					\n"   \
 		"	.set	mips0					\n"   \
 		: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)	      \
 		: "Ir" (i));						      \
-	} else if (kernel_uses_llsc) {					      \
-		int temp;						      \
-									      \
-		do {							      \
-			__asm__ __volatile__(				      \
-			"	.set	"MIPS_ISA_LEVEL"		\n"   \
-			"	ll	%0, %1		# atomic_" #op "\n"   \
-			"	" #asm_op " %0, %2			\n"   \
-			"	sc	%0, %1				\n"   \
-			"	.set	mips0				\n"   \
-			: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)  \
-			: "Ir" (i));					      \
-		} while (unlikely(!temp));				      \
 	} else {							      \
 		unsigned long flags;					      \
 									      \
@@ -83,36 +81,20 @@ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v)	      \
 {									      \
 	int result;							      \
 									      \
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+	if (kernel_uses_llsc) {						      \
 		int temp;						      \
 									      \
 		__asm__ __volatile__(					      \
-		"	.set	arch=r4000				\n"   \
+		"	.set	"MIPS_ISA_LEVEL"			\n"   \
 		"1:	ll	%1, %2		# atomic_" #op "_return	\n"   \
 		"	" #asm_op " %0, %1, %3				\n"   \
 		"	sc	%0, %2					\n"   \
-		"	beqzl	%0, 1b					\n"   \
+		"\t" __scbeqz "	%0, 1b					\n"   \
 		"	" #asm_op " %0, %1, %3				\n"   \
 		"	.set	mips0					\n"   \
 		: "=&r" (result), "=&r" (temp),				      \
 		  "+" GCC_OFF_SMALL_ASM() (v->counter)			      \
 		: "Ir" (i));						      \
-	} else if (kernel_uses_llsc) {					      \
-		int temp;						      \
-									      \
-		do {							      \
-			__asm__ __volatile__(				      \
-			"	.set	"MIPS_ISA_LEVEL"		\n"   \
-			"	ll	%1, %2	# atomic_" #op "_return	\n"   \
-			"	" #asm_op " %0, %1, %3			\n"   \
-			"	sc	%0, %2				\n"   \
-			"	.set	mips0				\n"   \
-			: "=&r" (result), "=&r" (temp),			      \
-			  "+" GCC_OFF_SMALL_ASM() (v->counter)		      \
-			: "Ir" (i));					      \
-		} while (unlikely(!result));				      \
-									      \
-		result = temp; result c_op i;				      \
 	} else {							      \
 		unsigned long flags;					      \
 									      \
@@ -131,36 +113,20 @@ static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v)	      \
 {									      \
 	int result;							      \
 									      \
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+	if (kernel_uses_llsc) {						      \
 		int temp;						      \
 									      \
 		__asm__ __volatile__(					      \
-		"	.set	arch=r4000				\n"   \
+		"	.set	"MIPS_ISA_LEVEL"			\n"   \
 		"1:	ll	%1, %2		# atomic_fetch_" #op "	\n"   \
 		"	" #asm_op " %0, %1, %3				\n"   \
 		"	sc	%0, %2					\n"   \
-		"	beqzl	%0, 1b					\n"   \
+		"\t" __scbeqz "	%0, 1b					\n"   \
 		"	move	%0, %1					\n"   \
 		"	.set	mips0					\n"   \
 		: "=&r" (result), "=&r" (temp),				      \
 		  "+" GCC_OFF_SMALL_ASM() (v->counter)			      \
 		: "Ir" (i));						      \
-	} else if (kernel_uses_llsc) {					      \
-		int temp;						      \
-									      \
-		do {							      \
-			__asm__ __volatile__(				      \
-			"	.set	"MIPS_ISA_LEVEL"		\n"   \
-			"	ll	%1, %2	# atomic_fetch_" #op "	\n"   \
-			"	" #asm_op " %0, %1, %3			\n"   \
-			"	sc	%0, %2				\n"   \
-			"	.set	mips0				\n"   \
-			: "=&r" (result), "=&r" (temp),			      \
-			  "+" GCC_OFF_SMALL_ASM() (v->counter)		      \
-			: "Ir" (i));					      \
-		} while (unlikely(!result));				      \
-									      \
-		result = temp;						      \
 	} else {							      \
 		unsigned long flags;					      \
 									      \
@@ -218,38 +184,17 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
 
 	smp_mb__before_llsc();
 
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {
-		int temp;
-
-		__asm__ __volatile__(
-		"	.set	arch=r4000				\n"
-		"1:	ll	%1, %2		# atomic_sub_if_positive\n"
-		"	subu	%0, %1, %3				\n"
-		"	bltz	%0, 1f					\n"
-		"	sc	%0, %2					\n"
-		"	.set	noreorder				\n"
-		"	beqzl	%0, 1b					\n"
-		"	 subu	%0, %1, %3				\n"
-		"	.set	reorder					\n"
-		"1:							\n"
-		"	.set	mips0					\n"
-		: "=&r" (result), "=&r" (temp),
-		  "+" GCC_OFF_SMALL_ASM() (v->counter)
-		: "Ir" (i), GCC_OFF_SMALL_ASM() (v->counter)
-		: "memory");
-	} else if (kernel_uses_llsc) {
+	if (kernel_uses_llsc) {
 		int temp;
 
 		__asm__ __volatile__(
 		"	.set	"MIPS_ISA_LEVEL"			\n"
 		"1:	ll	%1, %2		# atomic_sub_if_positive\n"
 		"	subu	%0, %1, %3				\n"
+		"	move	%1, %0					\n"
 		"	bltz	%0, 1f					\n"
-		"	sc	%0, %2					\n"
-		"	.set	noreorder				\n"
-		"	beqz	%0, 1b					\n"
-		"	 subu	%0, %1, %3				\n"
-		"	.set	reorder					\n"
+		"	sc	%1, %2					\n"
+		"\t" __scbeqz "	%1, 1b					\n"
 		"1:							\n"
 		"	.set	mips0					\n"
 		: "=&r" (result), "=&r" (temp),
@@ -386,31 +331,18 @@ static __inline__ int __atomic_add_unless(atomic_t *v, int a, int u)
 #define ATOMIC64_OP(op, c_op, asm_op)					      \
 static __inline__ void atomic64_##op(long i, atomic64_t * v)		      \
 {									      \
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+	if (kernel_uses_llsc) {						      \
 		long temp;						      \
 									      \
 		__asm__ __volatile__(					      \
-		"	.set	arch=r4000				\n"   \
+		"	.set	"MIPS_ISA_LEVEL"			\n"   \
 		"1:	lld	%0, %1		# atomic64_" #op "	\n"   \
 		"	" #asm_op " %0, %2				\n"   \
 		"	scd	%0, %1					\n"   \
-		"	beqzl	%0, 1b					\n"   \
+		"\t" __scbeqz "	%0, 1b					\n"   \
 		"	.set	mips0					\n"   \
 		: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)	      \
 		: "Ir" (i));						      \
-	} else if (kernel_uses_llsc) {					      \
-		long temp;						      \
-									      \
-		do {							      \
-			__asm__ __volatile__(				      \
-			"	.set	"MIPS_ISA_LEVEL"		\n"   \
-			"	lld	%0, %1		# atomic64_" #op "\n" \
-			"	" #asm_op " %0, %2			\n"   \
-			"	scd	%0, %1				\n"   \
-			"	.set	mips0				\n"   \
-			: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)      \
-			: "Ir" (i));					      \
-		} while (unlikely(!temp));				      \
 	} else {							      \
 		unsigned long flags;					      \
 									      \
@@ -425,37 +357,20 @@ static __inline__ long atomic64_##op##_return_relaxed(long i, atomic64_t * v) \
 {									      \
 	long result;							      \
 									      \
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+	if (kernel_uses_llsc) {						      \
 		long temp;						      \
 									      \
 		__asm__ __volatile__(					      \
-		"	.set	arch=r4000				\n"   \
+		"	.set	"MIPS_ISA_LEVEL"			\n"   \
 		"1:	lld	%1, %2		# atomic64_" #op "_return\n"  \
 		"	" #asm_op " %0, %1, %3				\n"   \
 		"	scd	%0, %2					\n"   \
-		"	beqzl	%0, 1b					\n"   \
+		"\t" __scbeqz "	%0, 1b					\n"   \
 		"	" #asm_op " %0, %1, %3				\n"   \
 		"	.set	mips0					\n"   \
 		: "=&r" (result), "=&r" (temp),				      \
 		  "+" GCC_OFF_SMALL_ASM() (v->counter)			      \
 		: "Ir" (i));						      \
-	} else if (kernel_uses_llsc) {					      \
-		long temp;						      \
-									      \
-		do {							      \
-			__asm__ __volatile__(				      \
-			"	.set	"MIPS_ISA_LEVEL"		\n"   \
-			"	lld	%1, %2	# atomic64_" #op "_return\n"  \
-			"	" #asm_op " %0, %1, %3			\n"   \
-			"	scd	%0, %2				\n"   \
-			"	.set	mips0				\n"   \
-			: "=&r" (result), "=&r" (temp),			      \
-			  "=" GCC_OFF_SMALL_ASM() (v->counter)		      \
-			: "Ir" (i), GCC_OFF_SMALL_ASM() (v->counter)	      \
-			: "memory");					      \
-		} while (unlikely(!result));				      \
-									      \
-		result = temp; result c_op i;				      \
 	} else {							      \
 		unsigned long flags;					      \
 									      \
@@ -474,37 +389,20 @@ static __inline__ long atomic64_fetch_##op##_relaxed(long i, atomic64_t * v)  \
 {									      \
 	long result;							      \
 									      \
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {			      \
+	if (kernel_uses_llsc) {						      \
 		long temp;						      \
 									      \
 		__asm__ __volatile__(					      \
-		"	.set	arch=r4000				\n"   \
+		"	.set	"MIPS_ISA_LEVEL"			\n"   \
 		"1:	lld	%1, %2		# atomic64_fetch_" #op "\n"   \
 		"	" #asm_op " %0, %1, %3				\n"   \
 		"	scd	%0, %2					\n"   \
-		"	beqzl	%0, 1b					\n"   \
+		"\t" __scbeqz "	%0, 1b					\n"   \
 		"	move	%0, %1					\n"   \
 		"	.set	mips0					\n"   \
 		: "=&r" (result), "=&r" (temp),				      \
 		  "+" GCC_OFF_SMALL_ASM() (v->counter)			      \
 		: "Ir" (i));						      \
-	} else if (kernel_uses_llsc) {					      \
-		long temp;						      \
-									      \
-		do {							      \
-			__asm__ __volatile__(				      \
-			"	.set	"MIPS_ISA_LEVEL"		\n"   \
-			"	lld	%1, %2	# atomic64_fetch_" #op "\n"   \
-			"	" #asm_op " %0, %1, %3			\n"   \
-			"	scd	%0, %2				\n"   \
-			"	.set	mips0				\n"   \
-			: "=&r" (result), "=&r" (temp),			      \
-			  "=" GCC_OFF_SMALL_ASM() (v->counter)		      \
-			: "Ir" (i), GCC_OFF_SMALL_ASM() (v->counter)	      \
-			: "memory");					      \
-		} while (unlikely(!result));				      \
-									      \
-		result = temp;						      \
 	} else {							      \
 		unsigned long flags;					      \
 									      \
@@ -563,42 +461,21 @@ static __inline__ long atomic64_sub_if_positive(long i, atomic64_t * v)
 
 	smp_mb__before_llsc();
 
-	if (kernel_uses_llsc && R10000_LLSC_WAR) {
-		long temp;
-
-		__asm__ __volatile__(
-		"	.set	arch=r4000				\n"
-		"1:	lld	%1, %2		# atomic64_sub_if_positive\n"
-		"	dsubu	%0, %1, %3				\n"
-		"	bltz	%0, 1f					\n"
-		"	scd	%0, %2					\n"
-		"	.set	noreorder				\n"
-		"	beqzl	%0, 1b					\n"
-		"	 dsubu	%0, %1, %3				\n"
-		"	.set	reorder					\n"
-		"1:							\n"
-		"	.set	mips0					\n"
-		: "=&r" (result), "=&r" (temp),
-		  "=" GCC_OFF_SMALL_ASM() (v->counter)
-		: "Ir" (i), GCC_OFF_SMALL_ASM() (v->counter)
-		: "memory");
-	} else if (kernel_uses_llsc) {
+	if (kernel_uses_llsc) {
 		long temp;
 
 		__asm__ __volatile__(
 		"	.set	"MIPS_ISA_LEVEL"			\n"
 		"1:	lld	%1, %2		# atomic64_sub_if_positive\n"
 		"	dsubu	%0, %1, %3				\n"
+		"	move	%1, %0					\n"
 		"	bltz	%0, 1f					\n"
-		"	scd	%0, %2					\n"
-		"	.set	noreorder				\n"
-		"	beqz	%0, 1b					\n"
-		"	 dsubu	%0, %1, %3				\n"
-		"	.set	reorder					\n"
+		"	scd	%1, %2					\n"
+		"\t" __scbeqz "	%1, 1b					\n"
 		"1:							\n"
 		"	.set	mips0					\n"
 		: "=&r" (result), "=&r" (temp),
-		  "+" GCC_OFF_SMALL_ASM() (v->counter)
+		  "=" GCC_OFF_SMALL_ASM() (v->counter)
 		: "Ir" (i));
 	} else {
 		unsigned long flags;

* Re: [PATCH v3] MIPS: Cleanup R10000_LLSC_WAR logic in atomic.h
  2017-11-17 15:21 [PATCH v3] MIPS: Cleanup R10000_LLSC_WAR logic in atomic.h Joshua Kinard
@ 2017-11-17 17:45 ` Maciej W. Rozycki
  2017-11-19  2:18   ` Joshua Kinard
From: Maciej W. Rozycki @ 2017-11-17 17:45 UTC
  To: Joshua Kinard
  Cc: Ralf Baechle, James Hogan, Maciej W. Rozycki, Paul Burton, Linux/MIPS

On Fri, 17 Nov 2017, Joshua Kinard wrote:

> This patch reduces down the conditionals in MIPS atomic code that deal
> with a silicon bug in early R10000 cpus that required a workaround of
> a branch-likely instruction following a store-conditional in order to
> guarantee the whole ll/sc sequence is atomic.  As the only real
> difference is a branch-likely instruction (beqzl) over a standard
> branch (beqz), the conditional is reduced down to down to a single

 s/down to down to/down to/

> preprocessor check at the top to pick the required instruction.
> 
> This requires writing the uses in assembler, thus we discard the
> non-R10000 case that uses a mixture of a C do...while loop with
> embedded assembler that was added back in commit 7837314d141c.  A note
> found in the git log for empty commit 5999eca25c1f is also addressed

 Please use `commit 7837314d141c ("MIPS: Get rid of branches to 
.subsections.")', `commit 5999eca25c1f ("[MIPS] Improve branch prediction 
in ll/sc atomic operations.")' style in commit descriptions.  Please also 
pass your changes through `scripts/checkpatch.pl', which would have 
pointed it out.

 Also why do you think commit 5999eca25c1f was empty?  It certainly does 
not show up as empty for me.  Did you quote the correct commit ID?  I 
think it would be good too if you clarified which note in that commit you 
have specifically referred to.

> ---
> Changes in v3:
> - Make the result of subu/dsubu available to sc/scd in
>   atomic_sub_if_positive and atomic64_sub_if_positive while still
>   avoiding the use of 'noreorder'.
> 
> Changes in v2:
> - Incorporate suggestions from upstream to atomic_sub_if_positive
>   and atomic64_sub_if_positive to avoid memory barriers, as those
>   are already implied and to eliminate the '.noreorder' directive
>   and corner case of subu/dsubu's output not getting used in the
>   branch-likely case due to being in the branch delay slot.

 I think the BEQZL delay slot bug fixes to `atomic_sub_if_positive' and 
`atomic64_sub_if_positive' should be split off and submitted separately as 
a preparatory patch, so that they can be backported to stable branches; 
please do so.  It'll also make your original clean up that follows 
clearer.

> @@ -563,42 +461,21 @@ static __inline__ long atomic64_sub_if_positive(long i, atomic64_t * v)
>  
>  	smp_mb__before_llsc();
>  
> -	if (kernel_uses_llsc && R10000_LLSC_WAR) {
> -		long temp;
> -
> -		__asm__ __volatile__(
> -		"	.set	arch=r4000				\n"
> -		"1:	lld	%1, %2		# atomic64_sub_if_positive\n"
> -		"	dsubu	%0, %1, %3				\n"
> -		"	bltz	%0, 1f					\n"
> -		"	scd	%0, %2					\n"
> -		"	.set	noreorder				\n"
> -		"	beqzl	%0, 1b					\n"
> -		"	 dsubu	%0, %1, %3				\n"
> -		"	.set	reorder					\n"
> -		"1:							\n"
> -		"	.set	mips0					\n"
> -		: "=&r" (result), "=&r" (temp),
> -		  "=" GCC_OFF_SMALL_ASM() (v->counter)
> -		: "Ir" (i), GCC_OFF_SMALL_ASM() (v->counter)
> -		: "memory");
> -	} else if (kernel_uses_llsc) {
> +	if (kernel_uses_llsc) {
>  		long temp;
>  
>  		__asm__ __volatile__(
>  		"	.set	"MIPS_ISA_LEVEL"			\n"
>  		"1:	lld	%1, %2		# atomic64_sub_if_positive\n"
>  		"	dsubu	%0, %1, %3				\n"
> +		"	move	%1, %0					\n"
>  		"	bltz	%0, 1f					\n"
> -		"	scd	%0, %2					\n"
> -		"	.set	noreorder				\n"
> -		"	beqz	%0, 1b					\n"
> -		"	 dsubu	%0, %1, %3				\n"
> -		"	.set	reorder					\n"
> +		"	scd	%1, %2					\n"
> +		"\t" __scbeqz "	%1, 1b					\n"
>  		"1:							\n"
>  		"	.set	mips0					\n"
>  		: "=&r" (result), "=&r" (temp),
> -		  "+" GCC_OFF_SMALL_ASM() (v->counter)
> +		  "=" GCC_OFF_SMALL_ASM() (v->counter)

 This is wrong, `v->counter' is both read and written, so `+' has to stay.

 Otherwise OK, I think.  Perhaps you can split off another patch, just to 
fix up the mess with constraints, which should be either:

		: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)
                : "Ir" (i));

if there is no result or:

		: "=&r" (result), "=&r" (temp),
		  "+" GCC_OFF_SMALL_ASM() (v->counter)
		: "Ir" (i));

if there is one.  This would be patch #2 in the series then and would make 
the final review easier I believe.

  Maciej

* Re: [PATCH v3] MIPS: Cleanup R10000_LLSC_WAR logic in atomic.h
  2017-11-17 17:45 ` Maciej W. Rozycki
@ 2017-11-19  2:18   ` Joshua Kinard
  2017-11-19  3:03     ` Joshua Kinard
From: Joshua Kinard @ 2017-11-19  2:18 UTC
  To: Maciej W. Rozycki
  Cc: Ralf Baechle, James Hogan, Maciej W. Rozycki, Paul Burton, Linux/MIPS

On 11/17/2017 12:45, Maciej W. Rozycki wrote:
> On Fri, 17 Nov 2017, Joshua Kinard wrote:
> 
>> This patch reduces down the conditionals in MIPS atomic code that deal
>> with a silicon bug in early R10000 cpus that required a workaround of
>> a branch-likely instruction following a store-conditional in order to
>> guarantee the whole ll/sc sequence is atomic.  As the only real
>> difference is a branch-likely instruction (beqzl) over a standard
>> branch (beqz), the conditional is reduced down to down to a single
> 
>  s/down to down to/down to/

Will be fixed in v4.


>> preprocessor check at the top to pick the required instruction.
>>
>> This requires writing the uses in assembler, thus we discard the
>> non-R10000 case that uses a mixture of a C do...while loop with
>> embedded assembler that was added back in commit 7837314d141c.  A note
>> found in the git log for empty commit 5999eca25c1f is also addressed
> 
>  Please use `commit 7837314d141c ("MIPS: Get rid of branches to 
> .subsections.")', `commit 5999eca25c1f ("[MIPS] Improve branch prediction 
> in ll/sc atomic operations.")' style in commit descriptions.  Please also 
> pass your changes through `scripts/checkpatch.pl', which would have 
> pointed it out.

I wasn't aware checkpatch actually looked at the commit descriptions.  I
thought it only dealt with the patch itself.  Learn something new every day!


>  Also why do you think commit 5999eca25c1f was empty?  It certainly does 
> not show up as empty for me.  Did you quote the correct commit ID?  I 
> think it would be good too if you clarified which note in that commit you 
> have specifically referred to.

I see what happened.  I accessed this URL:
https://git.linux-mips.org/cgit/ralf/linux.git/commit/arch/mips/include/asm/atomic.h?id=5999eca25c1fd4b9b9aca7833b04d10fe4bc877d

That URL refers to the correct commit ID, but the commit happened before the
move of include/asm-mips to arch/mips/include/asm, so git was showing me the
commit message, but no changes.  IMHO, that seems like a bug in cgit, which
should have redirected me to the right file path for the old commit, but then
again, git's never been the best at making it easy to find the history of
files after they have been moved.  I'll fix that bit in my commit message.


>> ---
>> Changes in v3:
>> - Make the result of subu/dsubu available to sc/scd in
>>   atomic_sub_if_positive and atomic64_sub_if_positive while still
>>   avoiding the use of 'noreorder'.
>>
>> Changes in v2:
>> - Incorporate suggestions from upstream to atomic_sub_if_positive
>>   and atomic64_sub_if_positive to avoid memory barriers, as those
>>   are already implied and to eliminate the '.noreorder' directive
>>   and corner case of subu/dsubu's output not getting used in the
>>   branch-likely case due to being in the branch delay slot.
> 
>  I think the BEQZL delay slot bug fixes to `atomic_sub_if_positive' and 
> `atomic64_sub_if_positive' should be split off and submitted separately as 
> a preparatory patch, so that they can be backported to stable branches; 
> please do so.  It'll also make your original clean up that follows 
> clearer.

I'll make it a stand-alone patch.  Will also CC stable for backporting.


>> @@ -563,42 +461,21 @@ static __inline__ long atomic64_sub_if_positive(long i, atomic64_t * v)
>>  
>>  	smp_mb__before_llsc();
>>  
>> -	if (kernel_uses_llsc && R10000_LLSC_WAR) {
>> -		long temp;
>> -
>> -		__asm__ __volatile__(
>> -		"	.set	arch=r4000				\n"
>> -		"1:	lld	%1, %2		# atomic64_sub_if_positive\n"
>> -		"	dsubu	%0, %1, %3				\n"
>> -		"	bltz	%0, 1f					\n"
>> -		"	scd	%0, %2					\n"
>> -		"	.set	noreorder				\n"
>> -		"	beqzl	%0, 1b					\n"
>> -		"	 dsubu	%0, %1, %3				\n"
>> -		"	.set	reorder					\n"
>> -		"1:							\n"
>> -		"	.set	mips0					\n"
>> -		: "=&r" (result), "=&r" (temp),
>> -		  "=" GCC_OFF_SMALL_ASM() (v->counter)
>> -		: "Ir" (i), GCC_OFF_SMALL_ASM() (v->counter)
>> -		: "memory");
>> -	} else if (kernel_uses_llsc) {
>> +	if (kernel_uses_llsc) {
>>  		long temp;
>>  
>>  		__asm__ __volatile__(
>>  		"	.set	"MIPS_ISA_LEVEL"			\n"
>>  		"1:	lld	%1, %2		# atomic64_sub_if_positive\n"
>>  		"	dsubu	%0, %1, %3				\n"
>> +		"	move	%1, %0					\n"
>>  		"	bltz	%0, 1f					\n"
>> -		"	scd	%0, %2					\n"
>> -		"	.set	noreorder				\n"
>> -		"	beqz	%0, 1b					\n"
>> -		"	 dsubu	%0, %1, %3				\n"
>> -		"	.set	reorder					\n"
>> +		"	scd	%1, %2					\n"
>> +		"\t" __scbeqz "	%1, 1b					\n"
>>  		"1:							\n"
>>  		"	.set	mips0					\n"
>>  		: "=&r" (result), "=&r" (temp),
>> -		  "+" GCC_OFF_SMALL_ASM() (v->counter)
>> +		  "=" GCC_OFF_SMALL_ASM() (v->counter)
> 
>  This is wrong, `v->counter' is both read and written, so `+' has to stay.
> 
>  Otherwise OK, I think.  Perhaps you can split off another patch, just to 
> fix up the mess with constraints, which should be either:
> 
> 		: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)
>                 : "Ir" (i));
> 
> if there is no result or:
> 
> 		: "=&r" (result), "=&r" (temp),
> 		  "+" GCC_OFF_SMALL_ASM() (v->counter)
> 		: "Ir" (i));
> 
> if there is one.  This would be patch #2 in the series then and would make 
> the final review easier I believe.

This is probably a mistake I made when eliminating the R10000_LLSC_WAR block;
I wasn't sure which bottom part to use.  In a few of the non-R10K cases, the
existing do...while loop uses "=" with the trailing "memory" bit, and I didn't
realize they went together.  I'll use the correct form in the main cleanup
patch.
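
To make the difference concrete, here is a minimal standalone sketch of the
two idioms (hypothetical helpers; a plain "m" constraint stands in for
GCC_OFF_SMALL_ASM(), which the real code uses so ll/sc get an encodable
offset):

	/* Legacy idiom: write-only "=m" output, a duplicate (unreferenced)
	 * "m" input to mark the location as read, and a "memory" clobber to
	 * keep GCC from caching the value across the asm; note the template
	 * reads through the output operand, which is exactly the mess in
	 * question. */
	static inline void inc_legacy(int *p)
	{
		int tmp;

		__asm__ __volatile__(
		"1:	ll	%0, %1		\n"
		"	addiu	%0, %0, 1	\n"
		"	sc	%0, %1		\n"
		"	beqz	%0, 1b		\n"
		: "=&r" (tmp), "=m" (*p)
		: "m" (*p)
		: "memory");
	}

	/* Preferred idiom: a single "+m" (read-write) operand says the same
	 * thing directly, with no duplicate input or "memory" clobber. */
	static inline void inc_rw(int *p)
	{
		int tmp;

		__asm__ __volatile__(
		"1:	ll	%0, %1		\n"
		"	addiu	%0, %0, 1	\n"
		"	sc	%0, %1		\n"
		"	beqz	%0, 1b		\n"
		: "=&r" (tmp), "+m" (*p));
	}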

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
6144R/F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic

* Re: [PATCH v3] MIPS: Cleanup R10000_LLSC_WAR logic in atomic.h
  2017-11-19  2:18   ` Joshua Kinard
@ 2017-11-19  3:03     ` Joshua Kinard
From: Joshua Kinard @ 2017-11-19  3:03 UTC
  To: Maciej W. Rozycki
  Cc: Ralf Baechle, James Hogan, Maciej W. Rozycki, Paul Burton, Linux/MIPS

On 11/18/2017 21:18, Joshua Kinard wrote:
> On 11/17/2017 12:45, Maciej W. Rozycki wrote:

[snip]

>>
>>  I think the BEQZL delay slot bug fixes to `atomic_sub_if_positive' and 
>> `atomic64_sub_if_positive' should be split off and submitted separately as 
>> a preparatory patch, so that they can be backported to stable branches; 
>> please do so.  It'll also make your original clean up that follows 
>> clearer.
> 
> I'll make it a stand-alone patch.  Will also CC stable for backporting.

Actually, thinking about it some, I think a separate patch strictly for stable
that only fixes the beqzl case, and leaves the beqz case alone, makes more
sense, because the subu/dsubu issue really only affects the beqzl case.  I am
still applying the same fix to the beqz case to make the follow-on patch that
collapses the R10000_LLSC_WAR cases clearer, but I don't think that follow-on
patch will need to be backported.

Or is a patch for stable really needed, since the branch-likely case really
only applies to R10K machines, which right now in mainline means IP27 and IP28?

I'll skip CC'ing stable right now so I can get both patches queued up.  We can
revisit if it's deemed to really be an issue.

-- 
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
6144R/F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.  And our
lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic
