From: Peter Zijlstra <peterz@infradead.org>
To: stern@rowland.harvard.edu, akiyks@gmail.com,
	andrea.parri@amarulasolutions.com, boqun.feng@gmail.com,
	dlustig@nvidia.com, dhowells@redhat.com, j.alglave@ucl.ac.uk,
	luc.maranget@inria.fr, npiggin@gmail.com, paulmck@linux.ibm.com,
	peterz@infradead.org, will.deacon@arm.com
Cc: linux-kernel@vger.kernel.org, torvalds@linux-foundation.org
Subject: [RFC][PATCH 5/5] x86/atomic: Fix smp_mb__{before,after}_atomic()
Date: Wed, 24 Apr 2019 14:37:01 +0200
Message-ID: <20190424124421.808471451@infradead.org>
In-Reply-To: <20190424123656.484227701@infradead.org>

Recent probing at the Linux Kernel Memory Model uncovered a
'surprise': strongly ordered architectures such as x86, where the
atomic RmW primitives imply full memory ordering and
smp_mb__{before,after}_atomic() is a simple barrier(), fail for:

	*x = 1;
	atomic_inc(u);
	smp_mb__after_atomic();
	r0 = *y;

This fails because, while the atomic_inc() implies memory ordering,
it (surprisingly) does not provide a compiler barrier. This then
allows the compiler to re-order like so:

	atomic_inc(u);
	*x = 1;
	smp_mb__after_atomic();
	r0 = *y;

Which the CPU is then allowed to re-order (TSO permits delaying a
store past a later load to a different address) like:

	atomic_inc(u);
	r0 = *y;
	*x = 1;

And this very much was not intended; therefore strengthen the atomic
RmW ops to include a compiler barrier (a "memory" clobber on the
asm). With the ops themselves clobbering memory,
smp_mb__{before,after}_atomic() no longer need to be a barrier() and
become empty statements.
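
For reference, the failure has the classic store-buffering (SB)
shape. A litmus-test sketch of it (illustrative only; not necessarily
the exact test that was run against the LKMM):

	C SB+atomic_inc+mbs

	{}

	P0(int *x, int *y, atomic_t *u)
	{
		int r0;

		WRITE_ONCE(*x, 1);
		atomic_inc(u);
		smp_mb__after_atomic();
		r0 = READ_ONCE(*y);
	}

	P1(int *x, int *y)
	{
		int r1;

		WRITE_ONCE(*y, 1);
		smp_mb();
		r1 = READ_ONCE(*x);
	}

	exists (0:r0=0 /\ 1:r1=0)

With atomic_inc() + smp_mb__after_atomic() acting as a full barrier
the exists clause must be forbidden; the compiler re-ordering shown
above would make it observable.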

NOTE: atomic_{or,and,xor} and the bitops already had the compiler
barrier.
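
For comparison, a rough sketch (modulo whitespace) of what
arch_atomic_or() already looks like in arch/x86/include/asm/atomic.h:

	static __always_inline void arch_atomic_or(int i, atomic_t *v)
	{
		asm volatile(LOCK_PREFIX "orl %1,%0"
			     : "+m" (v->counter)
			     : "ir" (i)
			     : "memory");
	}

The "memory" clobber is what this patch adds to arch_atomic_add(),
arch_atomic_sub(), arch_atomic_inc(), arch_atomic_dec() and their
64bit counterparts.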

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 Documentation/atomic_t.txt         |    3 +++
 arch/x86/include/asm/atomic.h      |    8 ++++----
 arch/x86/include/asm/atomic64_64.h |    8 ++++----
 arch/x86/include/asm/barrier.h     |    4 ++--
 4 files changed, 13 insertions(+), 10 deletions(-)

--- a/Documentation/atomic_t.txt
+++ b/Documentation/atomic_t.txt
@@ -194,6 +194,9 @@ These helper barriers exist because arch
 ordering on their SMP atomic primitives. For example our TSO architectures
 provide full ordered atomics and these barriers are no-ops.
 
+NOTE: when the atomic RmW ops are fully ordered, they should also imply a
+compiler barrier.
+
 Thus:
 
   atomic_fetch_add();
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -54,7 +54,7 @@ static __always_inline void arch_atomic_
 {
 	asm volatile(LOCK_PREFIX "addl %1,%0"
 		     : "+m" (v->counter)
-		     : "ir" (i));
+		     : "ir" (i) : "memory");
 }
 
 /**
@@ -68,7 +68,7 @@ static __always_inline void arch_atomic_
 {
 	asm volatile(LOCK_PREFIX "subl %1,%0"
 		     : "+m" (v->counter)
-		     : "ir" (i));
+		     : "ir" (i) : "memory");
 }
 
 /**
@@ -95,7 +95,7 @@ static __always_inline bool arch_atomic_
 static __always_inline void arch_atomic_inc(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "incl %0"
-		     : "+m" (v->counter));
+		     : "+m" (v->counter) :: "memory");
 }
 #define arch_atomic_inc arch_atomic_inc
 
@@ -108,7 +108,7 @@ static __always_inline void arch_atomic_
 static __always_inline void arch_atomic_dec(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "decl %0"
-		     : "+m" (v->counter));
+		     : "+m" (v->counter) :: "memory");
 }
 #define arch_atomic_dec arch_atomic_dec
 
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -45,7 +45,7 @@ static __always_inline void arch_atomic6
 {
 	asm volatile(LOCK_PREFIX "addq %1,%0"
 		     : "=m" (v->counter)
-		     : "er" (i), "m" (v->counter));
+		     : "er" (i), "m" (v->counter) : "memory");
 }
 
 /**
@@ -59,7 +59,7 @@ static inline void arch_atomic64_sub(lon
 {
 	asm volatile(LOCK_PREFIX "subq %1,%0"
 		     : "=m" (v->counter)
-		     : "er" (i), "m" (v->counter));
+		     : "er" (i), "m" (v->counter) : "memory");
 }
 
 /**
@@ -87,7 +87,7 @@ static __always_inline void arch_atomic6
 {
 	asm volatile(LOCK_PREFIX "incq %0"
 		     : "=m" (v->counter)
-		     : "m" (v->counter));
+		     : "m" (v->counter) : "memory");
 }
 #define arch_atomic64_inc arch_atomic64_inc
 
@@ -101,7 +101,7 @@ static __always_inline void arch_atomic6
 {
 	asm volatile(LOCK_PREFIX "decq %0"
 		     : "=m" (v->counter)
-		     : "m" (v->counter));
+		     : "m" (v->counter) : "memory");
 }
 #define arch_atomic64_dec arch_atomic64_dec
 
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -80,8 +80,8 @@ do {									\
 })
 
 /* Atomic operations are already serializing on x86 */
-#define __smp_mb__before_atomic()	barrier()
-#define __smp_mb__after_atomic()	barrier()
+#define __smp_mb__before_atomic()	do { } while (0)
+#define __smp_mb__after_atomic()	do { } while (0)
 
 #include <asm-generic/barrier.h>
 


