From: Peter Zijlstra <peterz@infradead.org>
To: linux-kernel@vger.kernel.org, dave.hansen@linux.intel.com,
	riel@redhat.com, brgerst@gmail.com, akpm@linux-foundation.org,
	luto@amacapital.net, mingo@kernel.org, dvlasenk@redhat.com,
	hpa@zytor.com, tglx@linutronix.de, bp@alien8.de, luto@kernel.org,
	torvalds@linux-foundation.org
Cc: linux-tip-commits@vger.kernel.org
Subject: Re: [tip:x86/urgent] x86/mm: Add barriers and document switch_mm() -vs-flush synchronization
Date: Mon, 11 Jan 2016 19:25:48 +0100
Message-ID: <20160111182548.GF6344@twins.programming.kicks-ass.net>
In-Reply-To: <tip-71b3c126e61177eb693423f2e18a1914205b165e@git.kernel.org>

On Mon, Jan 11, 2016 at 03:42:40AM -0800, tip-bot for Andy Lutomirski wrote:
> --- a/arch/x86/include/asm/mmu_context.h
> +++ b/arch/x86/include/asm/mmu_context.h
> @@ -116,8 +116,34 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  #endif
>  		cpumask_set_cpu(cpu, mm_cpumask(next));
>  
> -		/* Re-load page tables */
> +		/*
> +		 * Re-load page tables.
> +		 *
> +		 * This logic has an ordering constraint:
> +		 *
> +		 *  CPU 0: Write to a PTE for 'next'
> +		 *  CPU 0: load bit 1 in mm_cpumask.  if nonzero, send IPI.
> +		 *  CPU 1: set bit 1 in next's mm_cpumask
> +		 *  CPU 1: load from the PTE that CPU 0 writes (implicit)
> +		 *
> +		 * We need to prevent an outcome in which CPU 1 observes
> +		 * the new PTE value and CPU 0 observes bit 1 clear in
> +		 * mm_cpumask.  (If that occurs, then the IPI will never
> +		 * be sent, and CPU 0's TLB will contain a stale entry.)
> +		 *
> +		 * The bad outcome can occur if either CPU's load is
> +		 * reordered before that CPU's store, so both CPUs much

s/much/must/ ?

> +		 * execute full barriers to prevent this from happening.
> +		 *
> +		 * Thus, switch_mm needs a full barrier between the
> +		 * store to mm_cpumask and any operation that could load
> +		 * from next->pgd.  This barrier synchronizes with
> +		 * remote TLB flushers.  Fortunately, load_cr3 is
> +		 * serializing and thus acts as a full barrier.
> +		 *
> +		 */
>  		load_cr3(next->pgd);
> +
>  		trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
>  
>  		/* Stop flush ipis for the previous mm */
> @@ -156,10 +182,15 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  			 * schedule, protecting us from simultaneous changes.
>  			 */
>  			cpumask_set_cpu(cpu, mm_cpumask(next));
> +
>  			/*
>  			 * We were in lazy tlb mode and leave_mm disabled
>  			 * tlb flush IPI delivery. We must reload CR3
>  			 * to make sure to use no freed page tables.
> +			 *
> +			 * As above, this is a barrier that forces
> +			 * TLB repopulation to be ordered after the
> +			 * store to mm_cpumask.

Somewhat confused by this comment: cpumask_set_cpu() is a LOCK BTS,
which is already fully ordered.
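
FWIW, the constraint that comment describes is the classic
store-buffering pattern. A quick user-space analogue (illustrative
only, not kernel code; the seq_cst fences stand in for smp_mb() and
the serializing load_cr3()):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int pte;		/* stands in for the PTE write        */
static atomic_int maskbit;	/* stands in for the mm_cpumask bit   */
static int r0, r1;

static void *flusher(void *arg)		/* CPU 0 in the comment */
{
	atomic_store_explicit(&pte, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* smp_mb()   */
	r0 = atomic_load_explicit(&maskbit, memory_order_relaxed);
	return NULL;	/* r0 == 0 means "don't send the IPI"         */
}

static void *switcher(void *arg)	/* CPU 1 in the comment */
{
	atomic_store_explicit(&maskbit, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* load_cr3() */
	r1 = atomic_load_explicit(&pte, memory_order_relaxed);
	return NULL;	/* r1 == 0 means "stale PTE observed"         */
}

int main(void)
{
	pthread_t t0, t1;

	pthread_create(&t0, NULL, flusher, NULL);
	pthread_create(&t1, NULL, switcher, NULL);
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);

	/* With both fences, r0 == 0 && r1 == 0 is forbidden. */
	printf("r0=%d r1=%d\n", r0, r1);
	return 0;
}

Drop either fence and r0 == r1 == 0 becomes a legal outcome, which is
exactly the missed-IPI / stale-TLB case the comment warns about.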

>  			 */
>  			load_cr3(next->pgd);
>  			trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 8ddb5d0..8f4cc3d 100644


> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c

> @@ -188,17 +191,29 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,

>  	if (!current->mm) {
>  		leave_mm(smp_processor_id());
> +
> +		/* Synchronize with switch_mm. */
> +		smp_mb();
> +
>  		goto out;
>  	}

> +		} else {
>  			leave_mm(smp_processor_id());
> +
> +			/* Synchronize with switch_mm. */
> +			smp_mb();
> +		}
>  	}

The alternative is to make leave_mm() unconditionally imply a full
barrier. I've not looked at the other sites using it, though.
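
Something like the below, perhaps (a rough sketch against the current
tlb.c, from memory and untested; the load_cr3() path is serializing
already, so only the early-out path needs the explicit barrier):

void leave_mm(int cpu)
{
	struct mm_struct *active_mm = this_cpu_read(cpu_tlbstate.active_mm);

	if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
		BUG();
	if (cpumask_test_cpu(cpu, mm_cpumask(active_mm))) {
		cpumask_clear_cpu(cpu, mm_cpumask(active_mm));
		/* load_cr3() is serializing; full barrier implied. */
		load_cr3(swapper_pg_dir);
		trace_tlb_flush(TLB_FLUSH_ON_CPU_STALE, TLB_FLUSH_ALL);
	} else {
		/*
		 * Still order against switch_mm() so that callers
		 * no longer need their own smp_mb() afterwards.
		 */
		smp_mb();
	}
}

That would let flush_tlb_mm_range() drop the explicit smp_mb() after
each leave_mm() call.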
