From: "tip-bot for Paul E. McKenney" <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, mingo@kernel.org,
	torvalds@linux-foundation.org, a.p.zijlstra@chello.nl,
	paulmck@linux.vnet.ibm.com, akpm@linux-foundation.org,
	tglx@linutronix.de, linux-arch@vger.kernel.org
Subject: [tip:core/locking] Documentation/memory-barriers.txt: Downgrade UNLOCK+LOCK
Date: Mon, 16 Dec 2013 02:40:22 -0800
Message-ID: <tip-17eb88e068430014deb709e5af34197cdf2390c9@git.kernel.org>
In-Reply-To: <1386799151-2219-6-git-send-email-paulmck@linux.vnet.ibm.com>

Commit-ID:  17eb88e068430014deb709e5af34197cdf2390c9
Gitweb:     http://git.kernel.org/tip/17eb88e068430014deb709e5af34197cdf2390c9
Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
AuthorDate: Wed, 11 Dec 2013 13:59:09 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 16 Dec 2013 11:36:15 +0100

Documentation/memory-barriers.txt: Downgrade UNLOCK+LOCK

Historically, an UNLOCK+LOCK pair executed by one CPU, by one
task, or on a given lock variable has implied a full memory
barrier.  In a recent LKML thread, the wisdom of this historical
approach was called into question
(http://www.spinics.net/lists/linux-mm/msg65653.html), in part
due to the memory-order complexities of low-handoff-overhead
queued locks on x86 systems.

This patch therefore removes this guarantee from the
documentation, and further documents how to restore it via a new
smp_mb__after_unlock_lock() primitive.
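
For illustration, here is a minimal sketch of the intended usage,
modeled on the UNLOCK M / LOCK N example in the documentation hunks
below (the lock names m and n and the code around them are
hypothetical, not taken from this patch):

	spin_lock(&m);
	/* ... critical section for m ... */
	spin_unlock(&m);

	spin_lock(&n);
	smp_mb__after_unlock_lock();
	/* The UNLOCK of m followed by the LOCK of n now acts as a
	 * full memory barrier: accesses in this critical section are
	 * guaranteed to be seen by other CPUs as ordered after those
	 * in m's critical section above. */
	spin_unlock(&n);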

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-6-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 Documentation/memory-barriers.txt | 84 ++++++++++++++++++++++++++++++++-------
 1 file changed, 69 insertions(+), 15 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 919fd60..cb753c8 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -402,12 +402,18 @@ And a couple of implicit varieties:
      Memory operations that occur after an UNLOCK operation may appear to
      happen before it completes.
 
-     LOCK and UNLOCK operations are guaranteed to appear with respect to each
-     other strictly in the order specified.
-
      The use of LOCK and UNLOCK operations generally precludes the need for
      other sorts of memory barrier (but note the exceptions mentioned in the
-     subsection "MMIO write barrier").
+     subsection "MMIO write barrier").  In addition, an UNLOCK+LOCK pair
+     is -not- guaranteed to act as a full memory barrier.  However,
+     after a LOCK on a given lock variable, all memory accesses preceding any
+     prior UNLOCK on that same variable are guaranteed to be visible.
+     In other words, within a given lock variable's critical section,
+     all accesses of all previous critical sections for that lock variable
+     are guaranteed to have completed.
+
+     This means that LOCK acts as a minimal "acquire" operation and
+     UNLOCK acts as a minimal "release" operation.
 
 
 Memory barriers are only required where there's a possibility of interaction
@@ -1633,8 +1639,12 @@ for each construct.  These operations all imply certain barriers:
      Memory operations issued after the LOCK will be completed after the LOCK
      operation has completed.
 
-     Memory operations issued before the LOCK may be completed after the LOCK
-     operation has completed.
+     Memory operations issued before the LOCK may be completed after the
+     LOCK operation has completed.  An smp_mb__before_spinlock(), combined
+     with a following LOCK, orders prior loads against subsequent stores,
+     and prior stores against subsequent stores.  Note that this is
+     weaker than smp_mb()!  The smp_mb__before_spinlock() primitive is
+     free on many architectures.
 
  (2) UNLOCK operation implication:
 
@@ -1654,9 +1664,6 @@ for each construct.  These operations all imply certain barriers:
      All LOCK operations issued before an UNLOCK operation will be completed
      before the UNLOCK operation.
 
-     All UNLOCK operations issued before a LOCK operation will be completed
-     before the LOCK operation.
-
  (5) Failed conditional LOCK implication:
 
      Certain variants of the LOCK operation may fail, either due to being
@@ -1664,9 +1671,6 @@ for each construct.  These operations all imply certain barriers:
      signal whilst asleep waiting for the lock to become available.  Failed
      locks do not imply any sort of barrier.
 
-Therefore, from (1), (2) and (4) an UNLOCK followed by an unconditional LOCK is
-equivalent to a full barrier, but a LOCK followed by an UNLOCK is not.
-
 [!] Note: one of the consequences of LOCKs and UNLOCKs being only one-way
     barriers is that the effects of instructions outside of a critical section
     may seep into the inside of the critical section.
@@ -1677,13 +1681,57 @@ LOCK, and an access following the UNLOCK to happen before the UNLOCK, and the
 two accesses can themselves then cross:
 
 	*A = a;
-	LOCK
-	UNLOCK
+	LOCK M
+	UNLOCK M
 	*B = b;
 
 may occur as:
 
-	LOCK, STORE *B, STORE *A, UNLOCK
+	LOCK M, STORE *B, STORE *A, UNLOCK M
+
+This same reordering can of course occur if the LOCK and UNLOCK are
+to the same lock variable, but only from the perspective of another
+CPU not holding that lock.
+
+In short, an UNLOCK followed by a LOCK may -not- be assumed to be a full
+memory barrier because it is possible for a preceding UNLOCK to pass a
+later LOCK from the viewpoint of the CPU, but not from the viewpoint
+of the compiler.  Note that deadlocks cannot be introduced by this
+interchange because if such a deadlock threatened, the UNLOCK would
+simply complete.
+
+If it is necessary for an UNLOCK-LOCK pair to produce a full barrier,
+the LOCK can be followed by an smp_mb__after_unlock_lock() invocation.
+This will produce a full barrier if either (a) the UNLOCK and the LOCK
+are executed by the same CPU or task, or (b) the UNLOCK and LOCK act
+on the same lock variable.  The smp_mb__after_unlock_lock() primitive
+is free on many architectures.  Without smp_mb__after_unlock_lock(),
+the critical sections corresponding to the UNLOCK and the LOCK can cross:
+
+	*A = a;
+	UNLOCK M
+	LOCK N
+	*B = b;
+
+could occur as:
+
+	LOCK N, STORE *B, STORE *A, UNLOCK M
+
+With smp_mb__after_unlock_lock(), they cannot, so that:
+
+	*A = a;
+	UNLOCK M
+	LOCK N
+	smp_mb__after_unlock_lock();
+	*B = b;
+
+will always occur as either of the following:
+
+	STORE *A, UNLOCK M, LOCK N, STORE *B
+	STORE *A, LOCK N, UNLOCK M, STORE *B
+
+If the UNLOCK and LOCK were instead both operating on the same lock
+variable, only the first of these two alternatives can occur.
 
 Locks and semaphores may not provide any guarantee of ordering on UP compiled
 systems, and so cannot be counted on in such a situation to actually achieve
@@ -1911,6 +1959,7 @@ However, if the following occurs:
 	UNLOCK M	     [1]
 	ACCESS_ONCE(*D) = d;		ACCESS_ONCE(*E) = e;
 					LOCK M		     [2]
+					smp_mb__after_unlock_lock();
 					ACCESS_ONCE(*F) = f;
 					ACCESS_ONCE(*G) = g;
 					UNLOCK M	     [2]
@@ -1928,6 +1977,11 @@ But assuming CPU 1 gets the lock first, CPU 3 won't see any of:
 	*F, *G or *H preceding LOCK M [2]
 	*A, *B, *C, *E, *F or *G following UNLOCK M [2]
 
+Note that the smp_mb__after_unlock_lock() is critically important
+here: without it, CPU 3 might see some of the above orderings.
+Absent that barrier, the accesses are guaranteed to be seen in
+order only by CPUs that themselves hold lock M.
+
 
 LOCKS VS I/O ACCESSES
 ---------------------

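
As a further illustration, a sketch of the smp_mb__before_spinlock()
primitive documented in the first hunk above; it is placed
immediately before the LOCK it strengthens (the variable names here
are hypothetical; one early in-kernel user of this primitive was
try_to_wake_up()):

	ACCESS_ONCE(*X) = 1;		/* prior store */
	r1 = ACCESS_ONCE(*Y);		/* prior load  */
	smp_mb__before_spinlock();
	spin_lock(&lock);
	/* The prior load and prior store above are now ordered against
	 * the stores in this critical section, but not against its
	 * loads -- hence weaker than smp_mb(). */
	ACCESS_ONCE(*Z) = 1;
	spin_unlock(&lock);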

Thread overview: 21+ messages
2013-12-11 21:58 [PATCH v6 tip/core/locking] Memory-barrier documentation updates + smp_mb__after_unlock_lock() Paul E. McKenney
2013-12-11 21:59 ` [PATCH v6 tip/core/locking 1/8] Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt Paul E. McKenney
2013-12-11 21:59   ` [PATCH v6 tip/core/locking 2/8] Documentation/memory-barriers.txt: Add long atomic examples " Paul E. McKenney
2013-12-16 10:39     ` [tip:core/locking] " tip-bot for Paul E. McKenney
2013-12-11 21:59   ` [PATCH v6 tip/core/locking 3/8] Documentation/memory-barriers.txt: Prohibit speculative writes Paul E. McKenney
2013-12-12 13:17     ` Ingo Molnar
2013-12-12 15:08       ` Paul E. McKenney
2013-12-13 14:18         ` Peter Zijlstra
2013-12-16 10:39     ` [tip:core/locking] " tip-bot for Peter Zijlstra
2013-12-11 21:59   ` [PATCH v6 tip/core/locking 4/8] Documentation/memory-barriers.txt: Document ACCESS_ONCE() Paul E. McKenney
2013-12-16 10:40     ` [tip:core/locking] " tip-bot for Paul E. McKenney
2013-12-11 21:59   ` [PATCH v6 tip/core/locking 5/8] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+LOCK barrier Paul E. McKenney
2013-12-16 10:40     ` [tip:core/locking] locking: Add an smp_mb__after_unlock_lock() for UNLOCK+LOCK barrier tip-bot for Paul E. McKenney
2013-12-11 21:59   ` [PATCH v6 tip/core/locking 6/8] Documentation/memory-barriers.txt: Downgrade UNLOCK+LOCK Paul E. McKenney
2013-12-16 10:40     ` tip-bot for Paul E. McKenney [this message]
2013-12-11 21:59   ` [PATCH v6 tip/core/locking 7/8] rcu: Apply smp_mb__after_unlock_lock() to preserve grace periods Paul E. McKenney
2013-12-16 10:40     ` [tip:core/locking] " tip-bot for Paul E. McKenney
2013-12-11 21:59   ` [PATCH v6 tip/core/locking 8/8] powerpc: Full barrier for smp_mb__after_unlock_lock() Paul E. McKenney
2013-12-11 21:59     ` Paul E. McKenney
2013-12-16 10:40     ` [tip:core/locking] " tip-bot for Paul E. McKenney
2013-12-16 10:39   ` [tip:core/locking] Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt tip-bot for Paul E. McKenney
