Linux-Next Archive on lore.kernel.org
 help / color / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Linux-Next Mailing List <linux-next@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Nadav Amit <namit@vmware.com>,
	Linus <torvalds@linux-foundation.org>
Subject: Re: linux-next: manual merge of the akpm-current tree with the tip tree
Date: Fri, 11 Aug 2017 12:48:32 +0200
Message-ID: <20170811104832.cvunth3rbo75atzl@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20170811093449.w5wttpulmwfykjzm@hirez.programming.kicks-ass.net>

On Fri, Aug 11, 2017 at 11:34:49AM +0200, Peter Zijlstra wrote:
> On Fri, Aug 11, 2017 at 05:53:26PM +1000, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Today's linux-next merge of the akpm-current tree got conflicts in:
> > 
> >   include/linux/mm_types.h
> >   mm/huge_memory.c
> > 
> > between commit:
> > 
> >   8b1b436dd1cc ("mm, locking: Rework {set,clear,mm}_tlb_flush_pending()")
> > 
> > from the tip tree and commits:
> > 
> >   16af97dc5a89 ("mm: migrate: prevent racy access to tlb_flush_pending")
> >   a9b802500ebb ("Revert "mm: numa: defer TLB flush for THP migration as long as possible"")
> > 
> > from the akpm-current tree.
> > 
> > The latter 2 are now in Linus' tree as well (but were not when I started
> > the day).
> > 
> > The only way forward I could see was to revert
> > 
> >   8b1b436dd1cc ("mm, locking: Rework {set,clear,mm}_tlb_flush_pending()")
> > 
> > and the three following commits
> > 
> >   ff7a5fb0f1d5 ("overlayfs, locking: Remove smp_mb__before_spinlock() usage")
> >   d89e588ca408 ("locking: Introduce smp_mb__after_spinlock()")
> >   a9668cd6ee28 ("locking: Remove smp_mb__before_spinlock()")
> > 
> > before merging the akpm-current tree again.
> 
> Here's two patches that apply on top of tip.
> 


And here's one to fix the PPC ordering issue I found while doing those
patches.


---
Subject: mm: Fix barrier for inc_tlb_flush_pending() for PPC
From: Peter Zijlstra <peterz@infradead.org>
Date: Fri Aug 11 12:43:33 CEST 2017

When we have SPLIT_PTE_PTLOCKS and have RCpc locks (PPC) the UNLOCK of
one does not in fact order against the LOCK of another lock. Therefore
the documented scheme does not work.

Add an explicit smp_mb__after_atomic() to cure things.

Also update the comment to reflect the new inc/dec thing.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/mm_types.h |   34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -533,7 +533,7 @@ static inline bool mm_tlb_flush_pending(
 {
 	/*
 	 * Must be called with PTL held; such that our PTL acquire will have
-	 * observed the store from set_tlb_flush_pending().
+	 * observed the increment from inc_tlb_flush_pending().
 	 */
 	return atomic_read(&mm->tlb_flush_pending);
 }
@@ -547,13 +547,11 @@ static inline void inc_tlb_flush_pending
 {
 	atomic_inc(&mm->tlb_flush_pending);
 	/*
-	 * The only time this value is relevant is when there are indeed pages
-	 * to flush. And we'll only flush pages after changing them, which
-	 * requires the PTL.
-	 *
 	 * So the ordering here is:
 	 *
-	 *	mm->tlb_flush_pending = true;
+	 *	atomic_inc(&mm->tlb_flush_pending)
+	 *	smp_mb__after_atomic();
+	 *
 	 *	spin_lock(&ptl);
 	 *	...
 	 *	set_pte_at();
@@ -565,17 +563,33 @@ static inline void inc_tlb_flush_pending
 	 *				spin_unlock(&ptl);
 	 *
 	 *	flush_tlb_range();
-	 *	mm->tlb_flush_pending = false;
+	 *	atomic_dec(&mm->tlb_flush_pending);
 	 *
-	 * So the =true store is constrained by the PTL unlock, and the =false
-	 * store is constrained by the TLB invalidate.
+	 * Where we order the increment against the PTE modification with the
+	 * smp_mb__after_atomic(). It would appear that the spin_unlock(&ptl)
+	 * is sufficient to constrain the inc, because we only care about the
+	 * value if there is indeed a pending PTE modification. However with
+	 * SPLIT_PTE_PTLOCKS and RCpc locks (PPC) the UNLOCK of one lock does
+	 * not order against the LOCK of another lock.
+	 *
+	 * The decrement is ordered by the flush_tlb_range(), such that
+	 * mm_tlb_flush_pending() will not return false unless all flushes have
+	 * completed.
 	 */
+	smp_mb__after_atomic();
 }
 
 /* Clearing is done after a TLB flush, which also provides a barrier. */
 static inline void dec_tlb_flush_pending(struct mm_struct *mm)
 {
-	/* see set_tlb_flush_pending */
+	/*
+	 * See inc_tlb_flush_pending().
+	 *
+	 * This cannot be smp_mb__before_atomic() because smp_mb() simply does
+	 * not order against TLB invalidate completion, which is what we need.
+	 *
+	 * Therefore we must rely on tlb_flush_*() to guarantee order.
+	 */
 	atomic_dec(&mm->tlb_flush_pending);
 }
 #else

  reply index

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-11  7:53 Stephen Rothwell
2017-08-11  9:34 ` Peter Zijlstra
2017-08-11 10:48   ` Peter Zijlstra [this message]
2017-08-11 11:45   ` Stephen Rothwell
2017-08-11 11:56     ` Ingo Molnar
2017-08-11 12:17       ` Peter Zijlstra
2017-08-11 12:44         ` Ingo Molnar
2017-08-11 13:49           ` Stephen Rothwell
2017-08-11 14:04       ` Peter Zijlstra
2017-08-13  6:06         ` Nadav Amit
2017-08-13 12:50           ` Peter Zijlstra
2017-08-14  3:16             ` Minchan Kim
2017-08-14  5:07               ` Nadav Amit
2017-08-14  5:23                 ` Minchan Kim
2017-08-14  8:38                 ` Minchan Kim
2017-08-14 19:57                   ` Peter Zijlstra
2017-08-16  4:14                     ` Minchan Kim
2017-08-14 19:38                 ` Peter Zijlstra
2017-08-15  7:51                   ` Nadav Amit
2017-08-14  3:09         ` Minchan Kim
2017-08-14 18:54           ` Peter Zijlstra
  -- strict thread matches above, loose matches on Subject: below --
2019-10-31  5:43 Stephen Rothwell
2019-06-24 10:24 Stephen Rothwell
2019-05-01 11:10 Stephen Rothwell
2019-01-31  4:31 Stephen Rothwell
2018-08-20  4:32 Stephen Rothwell
2018-08-20 19:52 ` Andrew Morton
2018-03-23  5:59 Stephen Rothwell
2017-12-18  5:04 Stephen Rothwell
2017-11-10  4:33 Stephen Rothwell
2017-11-02  7:19 Stephen Rothwell
2017-08-22  6:57 Stephen Rothwell
2017-08-23  6:39 ` Vlastimil Babka
2017-04-12  6:46 Stephen Rothwell
2017-04-12 20:53 ` Vlastimil Babka
2017-04-20  2:17   ` NeilBrown
2017-03-24  5:25 Stephen Rothwell
2017-02-17  4:40 Stephen Rothwell
2016-11-14  6:08 Stephen Rothwell
2016-07-29  4:14 Stephen Rothwell
2016-06-15  5:23 Stephen Rothwell
2016-06-18 19:39 ` Manfred Spraul
2016-04-29  6:12 Stephen Rothwell
2016-04-29  6:26 ` Ingo Molnar
2016-03-02  5:40 Stephen Rothwell
2016-02-26  5:07 Stephen Rothwell
2016-02-26 21:35 ` Andrew Morton
2016-02-19  4:09 Stephen Rothwell
2016-02-19 15:26 ` Ard Biesheuvel
2015-12-07  8:06 Stephen Rothwell
2015-10-02  4:21 Stephen Rothwell
2015-07-28  6:00 Stephen Rothwell
2015-07-29 17:12 ` Andrea Arcangeli
2015-07-29 17:47   ` Andy Lutomirski
2015-07-29 18:46     ` Thomas Gleixner
2015-07-30 15:38       ` Andrea Arcangeli
2015-07-29 23:06   ` Stephen Rothwell
2015-07-29 23:07     ` Thomas Gleixner
2015-09-07 23:35   ` Stephen Rothwell
2015-09-08 18:11     ` Linus Torvalds
2015-09-08 22:56       ` Stephen Rothwell
2015-09-08 23:03         ` Linus Torvalds
2015-09-08 23:21           ` Andrew Morton
2015-09-16  6:58             ` Geert Uytterhoeven
2015-06-04 12:07 Stephen Rothwell
2015-04-08  8:28 Stephen Rothwell
2015-04-08  8:25 Stephen Rothwell
2014-03-17  9:31 Stephen Rothwell
2014-03-17  9:36 ` Peter Zijlstra
2014-03-19 23:27   ` Andrew Morton
2014-01-14  4:53 Stephen Rothwell
2014-01-14  5:04 ` Davidlohr Bueso
2014-01-14 12:51 ` Peter Zijlstra
2014-01-14 13:17   ` Geert Uytterhoeven
2014-01-14 13:33     ` Peter Zijlstra
2014-01-14 16:19     ` H. Peter Anvin
2014-01-14 15:15   ` H. Peter Anvin
2014-01-14 15:20     ` Geert Uytterhoeven
2014-01-14 15:41       ` Peter Zijlstra
2014-01-14 15:48         ` H. Peter Anvin
2014-01-07  6:00 Stephen Rothwell
2014-01-07  6:34 ` Tang Chen
2013-11-08  7:48 Stephen Rothwell
2013-11-08 18:58 ` Josh Triplett
2013-11-08 23:20   ` Stephen Rothwell
2013-11-09  0:19     ` Josh Triplett
2013-10-30  6:40 Stephen Rothwell

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170811104832.cvunth3rbo75atzl@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-next@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=namit@vmware.com \
    --cc=sfr@canb.auug.org.au \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Next Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-next/0 linux-next/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-next linux-next/ https://lore.kernel.org/linux-next \
		linux-next@vger.kernel.org
	public-inbox-index linux-next

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-next


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git