linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>,
	Huacai Chen <chenhc@lemote.com>, Huang Pei <huangpei@loongson.cn>,
	Paul Burton <paul.burton@mips.com>,
	Sasha Levin <sashal@kernel.org>,
	linux-mips@vger.kernel.org
Subject: [PATCH AUTOSEL 5.3 10/49] mips/atomic: Fix loongson_llsc_mb() wreckage
Date: Sun, 29 Sep 2019 13:30:10 -0400	[thread overview]
Message-ID: <20190929173053.8400-10-sashal@kernel.org> (raw)
In-Reply-To: <20190929173053.8400-1-sashal@kernel.org>

From: Peter Zijlstra <peterz@infradead.org>

[ Upstream commit 1c6c1ca318585f1096d4d04bc722297c85e9fb8a ]

The comment describing the loongson_llsc_mb() reorder case doesn't
make any sense what so ever. Instruction re-ordering is not an SMP
artifact, but rather a CPU local phenomenon. Clarify the comment by
explaining that these issue cause a coherence fail.

For the branch speculation case; if futex_atomic_cmpxchg_inatomic()
needs one at the bne branch target, then surely the normal
__cmpxch_asm() implementation does too. We cannot rely on the
barriers from cmpxchg() because cmpxchg_local() is implemented with
the same macro, and branch prediction and speculation are, too, CPU
local.

Fixes: e02e07e3127d ("MIPS: Loongson: Introduce and use loongson_llsc_mb()")
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Huang Pei <huangpei@loongson.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/mips/include/asm/atomic.h  |  5 +++--
 arch/mips/include/asm/barrier.h | 32 ++++++++++++++++++--------------
 arch/mips/include/asm/bitops.h  |  5 +++++
 arch/mips/include/asm/cmpxchg.h |  5 +++++
 arch/mips/kernel/syscall.c      |  1 +
 5 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 9a82dd11c0e9e..190f1b6151221 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -193,6 +193,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
 	if (kernel_uses_llsc) {
 		int temp;
 
+		loongson_llsc_mb();
 		__asm__ __volatile__(
 		"	.set	push					\n"
 		"	.set	"MIPS_ISA_LEVEL"			\n"
@@ -200,12 +201,12 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
 		"	.set	pop					\n"
 		"	subu	%0, %1, %3				\n"
 		"	move	%1, %0					\n"
-		"	bltz	%0, 1f					\n"
+		"	bltz	%0, 2f					\n"
 		"	.set	push					\n"
 		"	.set	"MIPS_ISA_LEVEL"			\n"
 		"	sc	%1, %2					\n"
 		"\t" __scbeqz "	%1, 1b					\n"
-		"1:							\n"
+		"2:							\n"
 		"	.set	pop					\n"
 		: "=&r" (result), "=&r" (temp),
 		  "+" GCC_OFF_SMALL_ASM() (v->counter)
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index b865e317a14f7..f9a6da96aae12 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -238,36 +238,40 @@
 
 /*
  * Some Loongson 3 CPUs have a bug wherein execution of a memory access (load,
- * store or pref) in between an ll & sc can cause the sc instruction to
+ * store or prefetch) in between an LL & SC can cause the SC instruction to
  * erroneously succeed, breaking atomicity. Whilst it's unusual to write code
  * containing such sequences, this bug bites harder than we might otherwise
  * expect due to reordering & speculation:
  *
- * 1) A memory access appearing prior to the ll in program order may actually
- *    be executed after the ll - this is the reordering case.
+ * 1) A memory access appearing prior to the LL in program order may actually
+ *    be executed after the LL - this is the reordering case.
  *
- *    In order to avoid this we need to place a memory barrier (ie. a sync
- *    instruction) prior to every ll instruction, in between it & any earlier
- *    memory access instructions. Many of these cases are already covered by
- *    smp_mb__before_llsc() but for the remaining cases, typically ones in
- *    which multiple CPUs may operate on a memory location but ordering is not
- *    usually guaranteed, we use loongson_llsc_mb() below.
+ *    In order to avoid this we need to place a memory barrier (ie. a SYNC
+ *    instruction) prior to every LL instruction, in between it and any earlier
+ *    memory access instructions.
  *
  *    This reordering case is fixed by 3A R2 CPUs, ie. 3A2000 models and later.
  *
- * 2) If a conditional branch exists between an ll & sc with a target outside
- *    of the ll-sc loop, for example an exit upon value mismatch in cmpxchg()
+ * 2) If a conditional branch exists between an LL & SC with a target outside
+ *    of the LL-SC loop, for example an exit upon value mismatch in cmpxchg()
  *    or similar, then misprediction of the branch may allow speculative
- *    execution of memory accesses from outside of the ll-sc loop.
+ *    execution of memory accesses from outside of the LL-SC loop.
  *
- *    In order to avoid this we need a memory barrier (ie. a sync instruction)
+ *    In order to avoid this we need a memory barrier (ie. a SYNC instruction)
  *    at each affected branch target, for which we also use loongson_llsc_mb()
  *    defined below.
  *
  *    This case affects all current Loongson 3 CPUs.
+ *
+ * The above described cases cause an error in the cache coherence protocol;
+ * such that the Invalidate of a competing LL-SC goes 'missing' and SC
+ * erroneously observes its core still has Exclusive state and lets the SC
+ * proceed.
+ *
+ * Therefore the error only occurs on SMP systems.
  */
 #ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS /* Loongson-3's LLSC workaround */
-#define loongson_llsc_mb()	__asm__ __volatile__(__WEAK_LLSC_MB : : :"memory")
+#define loongson_llsc_mb()	__asm__ __volatile__("sync" : : :"memory")
 #else
 #define loongson_llsc_mb()	do { } while (0)
 #endif
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 9a466dde9b96a..7bd35e5e2a9e4 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -249,6 +249,7 @@ static inline int test_and_set_bit(unsigned long nr,
 		unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
 		unsigned long temp;
 
+		loongson_llsc_mb();
 		do {
 			__asm__ __volatile__(
 			"	.set	push				\n"
@@ -305,6 +306,7 @@ static inline int test_and_set_bit_lock(unsigned long nr,
 		unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
 		unsigned long temp;
 
+		loongson_llsc_mb();
 		do {
 			__asm__ __volatile__(
 			"	.set	push				\n"
@@ -364,6 +366,7 @@ static inline int test_and_clear_bit(unsigned long nr,
 		unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
 		unsigned long temp;
 
+		loongson_llsc_mb();
 		do {
 			__asm__ __volatile__(
 			"	" __LL	"%0, %1 # test_and_clear_bit	\n"
@@ -379,6 +382,7 @@ static inline int test_and_clear_bit(unsigned long nr,
 		unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
 		unsigned long temp;
 
+		loongson_llsc_mb();
 		do {
 			__asm__ __volatile__(
 			"	.set	push				\n"
@@ -438,6 +442,7 @@ static inline int test_and_change_bit(unsigned long nr,
 		unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
 		unsigned long temp;
 
+		loongson_llsc_mb();
 		do {
 			__asm__ __volatile__(
 			"	.set	push				\n"
diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h
index f345a873742d9..f5994a332673b 100644
--- a/arch/mips/include/asm/cmpxchg.h
+++ b/arch/mips/include/asm/cmpxchg.h
@@ -46,6 +46,7 @@ extern unsigned long __xchg_called_with_bad_pointer(void)
 	__typeof(*(m)) __ret;						\
 									\
 	if (kernel_uses_llsc) {						\
+		loongson_llsc_mb();					\
 		__asm__ __volatile__(					\
 		"	.set	push				\n"	\
 		"	.set	noat				\n"	\
@@ -117,6 +118,7 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
 	__typeof(*(m)) __ret;						\
 									\
 	if (kernel_uses_llsc) {						\
+		loongson_llsc_mb();					\
 		__asm__ __volatile__(					\
 		"	.set	push				\n"	\
 		"	.set	noat				\n"	\
@@ -134,6 +136,7 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
 		: "=&r" (__ret), "=" GCC_OFF_SMALL_ASM() (*m)		\
 		: GCC_OFF_SMALL_ASM() (*m), "Jr" (old), "Jr" (new)		\
 		: "memory");						\
+		loongson_llsc_mb();					\
 	} else {							\
 		unsigned long __flags;					\
 									\
@@ -229,6 +232,7 @@ static inline unsigned long __cmpxchg64(volatile void *ptr,
 	 */
 	local_irq_save(flags);
 
+	loongson_llsc_mb();
 	asm volatile(
 	"	.set	push				\n"
 	"	.set	" MIPS_ISA_ARCH_LEVEL "		\n"
@@ -274,6 +278,7 @@ static inline unsigned long __cmpxchg64(volatile void *ptr,
 	  "r" (old),
 	  "r" (new)
 	: "memory");
+	loongson_llsc_mb();
 
 	local_irq_restore(flags);
 	return ret;
diff --git a/arch/mips/kernel/syscall.c b/arch/mips/kernel/syscall.c
index b6dc78ad5d8c0..b0e25e913bdb9 100644
--- a/arch/mips/kernel/syscall.c
+++ b/arch/mips/kernel/syscall.c
@@ -132,6 +132,7 @@ static inline int mips_atomic_set(unsigned long addr, unsigned long new)
 		  [efault] "i" (-EFAULT)
 		: "memory");
 	} else if (cpu_has_llsc) {
+		loongson_llsc_mb();
 		__asm__ __volatile__ (
 		"	.set	push					\n"
 		"	.set	"MIPS_ISA_ARCH_LEVEL"			\n"
-- 
2.20.1


  parent reply	other threads:[~2019-09-29 17:31 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-29 17:30 [PATCH AUTOSEL 5.3 01/49] MIPS: Ingenic: Disable broken BTB lookup optimization Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 02/49] clk: jz4740: Add TCU clock Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 03/49] MIPS: Don't use bc_false uninitialized in __mm_isBranchInstr Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 04/49] MIPS: tlbex: Explicitly cast _PAGE_NO_EXEC to a boolean Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 05/49] i2c-cht-wc: Fix lockdep warning Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 06/49] mfd: intel-lpss: Remove D3cold delay Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 07/49] PCI: tegra: Fix OF node reference leak Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 08/49] HID: wacom: Fix several minor compiler warnings Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 09/49] rtc: bd70528: fix driver dependencies Sasha Levin
2019-09-29 17:30 ` Sasha Levin [this message]
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 11/49] PCI: pci-hyperv: Fix build errors on non-SYSFS config Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 12/49] PCI: layerscape: Add the bar_fixed_64bit property to the endpoint driver Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 13/49] livepatch: Nullify obj->mod in klp_module_coming()'s error path Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 14/49] mips/atomic: Fix smp_mb__{before,after}_atomic() Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 15/49] ARM: 8898/1: mm: Don't treat faults reported from cache maintenance as writes Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 16/49] soundwire: intel: fix channel number reported by hardware Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 17/49] PCI: mobiveil: Fix the CPU base address setup in inbound window Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 18/49] ARM: 8875/1: Kconfig: default to AEABI w/ Clang Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 19/49] MIPS: lantiq: update the clock alias' for the mainline PCIe PHY driver Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 20/49] firmware: bcm47xx_nvram: Correct size_t printf format Sasha Levin
2019-09-29 19:39   ` Florian Fainelli
2019-10-05 22:53     ` Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 21/49] rtc: snvs: fix possible race condition Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 22/49] rtc: pcf85363/pcf85263: fix regmap error in set_time Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 23/49] power: supply: register HWMON devices with valid names Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 24/49] selinux: fix residual uses of current_security() for the SELinux blob Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 25/49] PCI: Add pci_info_ratelimited() to ratelimit PCI separately Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 26/49] HID: apple: Fix stuck function keys when using FN Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 27/49] PCI: rockchip: Propagate errors for optional regulators Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 28/49] PCI: histb: " Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 29/49] PCI: imx6: " Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 30/49] PCI: exynos: Propagate errors for optional PHYs Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 31/49] security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb() Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 32/49] PCI: Use static const struct, not const static struct Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 33/49] ARM: 8905/1: Emit __gnu_mcount_nc when using Clang 10.0.0 or newer Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 34/49] ARM: 8903/1: ensure that usable memory in bank 0 starts from a PMD-aligned address Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 35/49] i2c: tegra: Move suspend handling to NOIRQ phase Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 36/49] block, bfq: push up injection only after setting service time Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 37/49] fat: work around race with userspace's read via blockdev while mounting Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 38/49] pktcdvd: remove warning on attempting to register non-passthrough dev Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 39/49] hypfs: Fix error number left in struct pointer member Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 40/49] tools/power/x86/intel-speed-select: Fix high priority core mask over count Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 41/49] crypto: hisilicon - Fix double free in sec_free_hw_sgl() Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 42/49] mm: add dummy can_do_mlock() helper Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 43/49] kbuild: clean compressed initramfs image Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 44/49] ocfs2: wait for recovering done after direct unlock request Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 45/49] kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16K Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 46/49] arm64: consider stack randomization for mmap base only when necessary Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 47/49] mips: properly account for stack randomization and stack guard gap Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 48/49] arm: " Sasha Levin
2019-09-29 17:30 ` [PATCH AUTOSEL 5.3 49/49] arm: use STACK_TOP when computing mmap base address Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190929173053.8400-10-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=chenhc@lemote.com \
    --cc=huangpei@loongson.cn \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=paul.burton@mips.com \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).