linux-mm.kvack.org archive mirror
From: Andy Lutomirski <luto@kernel.org>
To: Nadav Amit <nadav.amit@gmail.com>
Cc: Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	 Nadav Amit <namit@vmware.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	 Dave Hansen <dave.hansen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	 Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>, Yu Zhao <yuzhao@google.com>,
	 Nick Piggin <npiggin@gmail.com>, X86 ML <x86@kernel.org>
Subject: Re: [RFC 08/20] mm: store completed TLB generation
Date: Sun, 31 Jan 2021 12:32:15 -0800
Message-ID: <CALCETrUqoG9fhXLGbLomK-QrcSOhLDhJhQi5E=Y3FXNvYCzBcQ@mail.gmail.com>
In-Reply-To: <20210131001132.3368247-9-namit@vmware.com>

On Sat, Jan 30, 2021 at 4:16 PM Nadav Amit <nadav.amit@gmail.com> wrote:
>
> From: Nadav Amit <namit@vmware.com>
>
> To detect deferred TLB flushes in fine granularity, we need to keep
> track on the completed TLB flush generation for each mm.
>
> Add logic to track for each mm the tlb_gen_completed, which tracks the
> completed TLB generation. It is the arch responsibility to call
> mark_mm_tlb_gen_done() whenever a TLB flush is completed.
>
> Start the generation numbers from 1 instead of 0. This would allow later
> to detect whether flushes of a certain generation were completed.

Can you elaborate on how this helps?

I think you should document that tlb_gen_completed only means that no
outdated TLB entries will be observably used.  In the x86
implementation it's possible for older TLB entries to still exist,
unused, in the TLBs of CPUs running other mms.

How does this work with arch_tlbbatch_flush()?

>
> Signed-off-by: Nadav Amit <namit@vmware.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Will Deacon <will@kernel.org>
> Cc: Yu Zhao <yuzhao@google.com>
> Cc: Nick Piggin <npiggin@gmail.com>
> Cc: x86@kernel.org
> ---
>  arch/x86/mm/tlb.c         | 10 ++++++++++
>  include/asm-generic/tlb.h | 33 +++++++++++++++++++++++++++++++++
>  include/linux/mm_types.h  | 15 ++++++++++++++-
>  3 files changed, 57 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 7ab21430be41..d17b5575531e 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -14,6 +14,7 @@
>  #include <asm/nospec-branch.h>
>  #include <asm/cache.h>
>  #include <asm/apic.h>
> +#include <asm/tlb.h>
>
>  #include "mm_internal.h"
>
> @@ -915,6 +916,9 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
>         if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
>                 flush_tlb_others(mm_cpumask(mm), info);
>
> +       /* Update the completed generation */
> +       mark_mm_tlb_gen_done(mm, new_tlb_gen);
> +
>         put_flush_tlb_info();
>         put_cpu();
>  }
> @@ -1147,6 +1151,12 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>
>         cpumask_clear(&batch->cpumask);
>
> +       /*
> +        * We cannot call mark_mm_tlb_gen_done() since we do not know which
> +        * mm's should be flushed. This may lead to some unwarranted TLB
> +        * flushes, but not to correction problems.
> +        */
> +
>         put_cpu();
>  }
>
> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> index 517c89398c83..427bfcc6cdec 100644
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -513,6 +513,39 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
>  }
>  #endif
>
> +#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS
> +
> +/*
> + * Helper function to update a generation to have a new value, as long as new
> + * value is greater or equal to gen.
> + */

I read this a couple of times, and I don't understand it.  How about:

Helper function to atomically set *gen = max(*gen, new_gen)
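
A rough userspace sketch of that "atomic max" semantics, for
illustration only.  It assumes C11 atomics as a stand-in for the
kernel's atomic64_t, and gen_update_max() is a made-up name, not
something from the patch:

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Atomically set *gen = max(*gen, new_gen).
 *
 * Hypothetical userspace stand-in for the quoted kernel helper:
 * _Atomic uint64_t replaces atomic64_t, and the C11 compare-exchange
 * replaces atomic64_cmpxchg().
 */
static inline void gen_update_max(_Atomic uint64_t *gen, uint64_t new_gen)
{
	uint64_t cur = atomic_load(gen);

	/* Only try to store if the stored value is older than new_gen. */
	while (cur < new_gen) {
		/*
		 * On failure (including spurious failure), cur is
		 * refreshed with the value currently in *gen, so the
		 * loop condition is re-evaluated against fresh data.
		 */
		if (atomic_compare_exchange_weak(gen, &cur, new_gen))
			break;
	}
}
```

The value never moves backwards: a call with a new_gen that is not
greater than the stored value leaves it untouched.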

> +static inline void tlb_update_generation(atomic64_t *gen, u64 new_gen)
> +{
> +       u64 cur_gen = atomic64_read(gen);
> +
> +       while (cur_gen < new_gen) {
> +               u64 old_gen = atomic64_cmpxchg(gen, cur_gen, new_gen);
> +
> +               /* Check if we succeeded in the cmpxchg */
> +               if (likely(cur_gen == old_gen))
> +                       break;
> +
> +               cur_gen = old_gen;
> +       };
> +}
> +
> +
> +static inline void mark_mm_tlb_gen_done(struct mm_struct *mm, u64 gen)
> +{
> +       /*
> +        * Update the completed generation to the new generation if the new
> +        * generation is greater than the previous one.
> +        */
> +       tlb_update_generation(&mm->tlb_gen_completed, gen);
> +}
> +
> +#endif /* CONFIG_ARCH_HAS_TLB_GENERATIONS */
> +
>  /*
>   * tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end,
>   * and set corresponding cleared_*.
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 2035ac319c2b..8a5eb4bfac59 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -571,6 +571,13 @@ struct mm_struct {
>                  * This is not used on Xen PV.
>                  */
>                 atomic64_t tlb_gen;
> +
> +               /*
> +                * TLB generation which is guarnateed to be flushed, including

guaranteed

> +                * all the PTE changes that were performed before tlb_gen was
> +                * incremented.
> +                */

I will defer judgment to future patches before I believe that this isn't racy :)

> +               atomic64_t tlb_gen_completed;
>  #endif
>         } __randomize_layout;
>
> @@ -690,7 +697,13 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
>  #ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS
>  static inline void init_mm_tlb_gen(struct mm_struct *mm)
>  {
> -       atomic64_set(&mm->tlb_gen, 0);
> +       /*
> +        * Start from generation of 1, so default generation 0 will be
> +        * considered as flushed and would not be regarded as an outstanding
> +        * deferred invalidation.
> +        */

Aha, this makes sense.
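
To spell out the reasoning: with both counters starting at 1, a
zero-initialized (default) generation can never compare as newer than
the completed generation.  A minimal sketch, assuming the eventual
check is a plain comparison; the real check lives in later patches,
and gen_flush_pending() and GEN_INIT are hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed initial value of tlb_gen and tlb_gen_completed, per the hunk. */
#define GEN_INIT 1

/*
 * Hypothetical check: a recorded generation still has a deferred
 * flush outstanding only if it is newer than the completed generation.
 */
static inline bool gen_flush_pending(uint64_t completed, uint64_t gen)
{
	return gen > completed;
}
```

Since the completed generation is never below GEN_INIT, passing
gen == 0 always yields false, i.e. "already flushed".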

> +       atomic64_set(&mm->tlb_gen, 1);
> +       atomic64_set(&mm->tlb_gen_completed, 1);
>  }
>
>  static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
> --
> 2.25.1
>

