From: Peter Zijlstra <peterz@infradead.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, luto@amacapital.net,
	axboe@kernel.dk, keescook@chromium.org,
	torvalds@linux-foundation.org, jannh@google.com, will@kernel.org,
	hch@lst.de, npiggin@gmail.com, mathieu.desnoyers@efficios.com
Subject: Re: [PATCH v3] mm: Fix kthread_use_mm() vs TLB invalidate
Date: Wed, 22 Jul 2020 10:35:33 +0200
Message-ID: <20200722083533.GK10769@hirez.programming.kicks-ass.net>
In-Reply-To: <20200721140623.4e8ecc6ef5d5ff42115d68fc@linux-foundation.org>

On Tue, Jul 21, 2020 at 02:06:23PM -0700, Andrew Morton wrote:
> On Tue, 21 Jul 2020 17:41:06 +0200 Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > 
> > For SMP systems using IPI based TLB invalidation, looking at
> > current->active_mm is entirely reasonable. This then presents the
> > following race condition:
> > 
> > 
> >   CPU0			CPU1
> > 
> >   flush_tlb_mm(mm)	use_mm(mm)
> >     <send-IPI>
> > 			  tsk->active_mm = mm;
> > 			  <IPI>
> > 			    if (tsk->active_mm == mm)
> > 			      // flush TLBs
> > 			  </IPI>
> > 			  switch_mm(old_mm,mm,tsk);
> > 
> > 
> > Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
> > because the IPI lands before we actually switched.
> > 
> > Avoid this by disabling IRQs across changing ->active_mm and
> > switch_mm().
> > 
> > [ There are all sorts of reasons this might be harmless for various
> > architecture specific reasons, but best not leave the door open at
> > all. ]
> 
> Can we give the -stable maintainers (and others) more explanation of
> why they might choose to merge this?

Like so then?

---
Subject: mm: Fix kthread_use_mm() vs TLB invalidate
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue, 11 Feb 2020 10:25:19 +0100

For SMP systems using IPI-based TLB invalidation, looking at
current->active_mm is entirely reasonable. This then presents the
following race condition:


  CPU0			CPU1

  flush_tlb_mm(mm)	use_mm(mm)
    <send-IPI>
			  tsk->active_mm = mm;
			  <IPI>
			    if (tsk->active_mm == mm)
			      // flush TLBs
			  </IPI>
			  switch_mm(old_mm,mm,tsk);


Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
because the IPI lands before we actually switched.

Avoid this by disabling IRQs across changing ->active_mm and
switch_mm().

Of the (SMP) architectures that use IPI-based TLB invalidation:

  Alpha    - checks active_mm
  ARC      - ASID specific
  IA64     - checks active_mm
  MIPS     - ASID specific flush
  OpenRISC - shoots down world
  PARISC   - shoots down world
  SH       - ASID specific
  SPARC    - ASID specific
  x86      - N/A
  xtensa   - checks active_mm

So at the very least Alpha, IA64 and Xtensa are suspect.
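
To make "checks active_mm" concrete: on those architectures the flush
IPI handler follows roughly this pattern (a paraphrased sketch, not any
architecture's verbatim code; ipi_flush_tlb_mm() and
local_flush_tlb_mm() are stand-ins for the per-arch names):

  static void ipi_flush_tlb_mm(void *info)
  {
          struct mm_struct *mm = info;

          /*
           * If this IPI lands in the window between the tsk->active_mm
           * store and switch_mm() in the diagram above, the check below
           * passes while the CPU is still running on old_mm's page
           * tables, so the flush hits @old_mm's TLB entries, not @mm's.
           */
          if (current->active_mm == mm)
                  local_flush_tlb_mm(mm);
  }

with flush_tlb_mm() then doing something like
on_each_cpu(ipi_flush_tlb_mm, mm, 1).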

On top of this, for scheduler consistency we need preemption disabled
at least across changing tsk->mm and doing switch_mm(). Today that is
provided by task_lock(), but task_lock() is not sufficient on
PREEMPT_RT, where spinlocks are preemptible and no longer disable
preemption.
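
For the curious, a heavily simplified sketch of the scheduler side
(paraphrasing the lazy-TLB handling in context_switch() in
kernel/sched/core.c; the field accesses are real, the function shape
is not):

  static void context_switch_sketch(struct task_struct *prev,
                                    struct task_struct *next)
  {
          struct mm_struct *oldmm = prev->active_mm;

          if (!next->mm) {
                  /* Kernel thread: borrow the previous mm (lazy TLB). */
                  next->active_mm = oldmm;
                  if (prev->mm)
                          mmgrab(oldmm);
                  enter_lazy_tlb(oldmm, next);
          } else {
                  /* User task, or a kthread that adopted an mm. */
                  switch_mm_irqs_off(oldmm, next->mm, next);
          }
  }

If kthread_use_mm() could be preempted between its tsk->active_mm and
tsk->mm stores, this path would observe the two fields out of sync;
since IRQs-off implies preemption-off, the IRQ-off section added below
covers that case as well.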

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@kernel.org
---
 kernel/kthread.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1241,13 +1241,20 @@ void kthread_use_mm(struct mm_struct *mm
 	WARN_ON_ONCE(tsk->mm);
 
 	task_lock(tsk);
+	/*
+	 * Serialize the tsk->mm store and switch_mm() against TLB invalidation
+	 * IPIs. Also make sure we're non-preemptible on PREEMPT_RT to not race
+	 * against the scheduler writing to these variables.
+	 */
+	local_irq_disable();
 	active_mm = tsk->active_mm;
 	if (active_mm != mm) {
 		mmgrab(mm);
 		tsk->active_mm = mm;
 	}
 	tsk->mm = mm;
-	switch_mm(active_mm, mm, tsk);
+	switch_mm_irqs_off(active_mm, mm, tsk);
+	local_irq_enable();
 	task_unlock(tsk);
 #ifdef finish_arch_post_lock_switch
 	finish_arch_post_lock_switch();
@@ -1276,9 +1283,11 @@ void kthread_unuse_mm(struct mm_struct *
 
 	task_lock(tsk);
 	sync_mm_rss(mm);
+	local_irq_disable();
 	tsk->mm = NULL;
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
+	local_irq_enable();
 	task_unlock(tsk);
 }
 EXPORT_SYMBOL_GPL(kthread_unuse_mm);
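
For context, the API being fixed is used like this from a kernel
thread (a hypothetical sketch; kthread_use_mm(), kthread_unuse_mm()
and copy_from_user() are the real interfaces, the helper and its
arguments are made up):

  #include <linux/kthread.h>
  #include <linux/uaccess.h>

  /* Hypothetical helper: copy from @mm's userspace on behalf of a kthread. */
  static int kthread_copy_from_mm(struct mm_struct *mm, void *dst,
                                  const void __user *src, size_t len)
  {
          int err = 0;

          kthread_use_mm(mm);        /* adopt @mm under the IRQ-off protocol above */
          if (copy_from_user(dst, src, len))
                  err = -EFAULT;     /* @mm's mappings are valid only in this window */
          kthread_unuse_mm(mm);      /* drop @mm, back to lazy TLB */

          return err;
  }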
