From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Andrea Arcangeli, Andrew Morton, Andy Lutomirski,
 Dave Hansen, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao,
 x86@kernel.org
Subject: [RFC 15/20] mm: detect deferred TLB flushes in vma granularity
Date: Sat, 30 Jan 2021 16:11:27 -0800
Message-Id: <20210131001132.3368247-16-namit@vmware.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210131001132.3368247-1-namit@vmware.com>
References: <20210131001132.3368247-1-namit@vmware.com>

From: Nadav Amit

Currently, deferred TLB flushes are detected at mm granularity: if there is
any deferred TLB flush in the entire address space due to NUMA migration,
pte_accessible() on x86 returns true, and ptep_clear_flush() requires a TLB
flush. This happens even if the PTE resides in a completely different vma.

Recent code changes and possible future enhancements might require detecting
deferred TLB flushes at a finer granularity. Finer-granularity detection can
also enable more aggressive TLB-flush deferring in the future.

Record in each vma the mm's TLB generation after the last deferred PTE/PMD
change, while the page-table lock is still held. Increase the mm's generation
before recording it, to indicate that a TLB flush is pending. Record in the
mmu_gather struct the mm's TLB generation at the time the last TLB flush was
deferred. Once the deferred TLB flush eventually takes place, use the
deferred TLB generation that is recorded in mmu_gather.

Detection of deferred TLB flushes is performed by checking whether the mm's
completed TLB generation is lower than or equal to the mm's TLB generation.
Architectures that use the TLB generation logic are required to perform a
full TLB flush if they detect that a new TLB flush request "skips" a
generation (as already done by the x86 code).

To indicate that a deferred TLB flush takes place, increase the mm's TLB
generation after updating the PTEs. However, avoid increasing the mm's
generation again after subsequent PTE updates, as increasing it again would
lead to a full TLB flush once the deferred TLB flushes are performed (due to
the "skipped" TLB generation). Therefore, if the mm's generation did not
change after a subsequent PTE update, reuse the previous generation.

As multiple updates of the vma generation can be performed concurrently, use
atomic operations to ensure that the TLB generation recorded in the vma is
always the most recent one.
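In the patch below, store_deferred_tlb_gen() relies on tlb_update_generation()
(which already exists at this point in the series) for that atomic update. A
minimal sketch of what such a monotonic update could look like, assuming a
cmpxchg-based implementation; this is an illustration only and is not taken
from the series:

	/*
	 * Sketch only: assumes a cmpxchg-based monotonic update; not the
	 * series' actual implementation of tlb_update_generation().
	 */
	static inline void tlb_update_generation(atomic64_t *gen, u64 new_gen)
	{
		u64 cur = atomic64_read(gen);

		/* Only move the recorded generation forward. */
		while (cur < new_gen) {
			u64 prev = atomic64_cmpxchg(gen, cur, new_gen);

			if (prev == cur)
				break;		/* new_gen was installed */
			cur = prev;		/* raced with a newer update */
		}
	}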
By the time a deferred TLB flush is eventually performed, it might already be
redundant: another TLB flush may have covered it by doing a full TLB flush
once the "skipped" generation was detected. This case can be detected by
checking whether the deferred TLB generation, as recorded in mmu_gather, was
already completed. However, deferred PUD/P4D flushes are not recorded, and
freeing page tables also requires a flush on cores in lazy TLB mode. In such
cases a TLB flush is needed even if the mm's completed TLB generation
indicates the flush was already "performed".

Signed-off-by: Nadav Amit
Cc: Andrea Arcangeli
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: x86@kernel.org
---
 arch/x86/include/asm/tlb.h      |  18 ++++--
 arch/x86/include/asm/tlbflush.h |   5 ++
 arch/x86/mm/tlb.c               |  14 ++++-
 include/asm-generic/tlb.h       | 104 ++++++++++++++++++++++++++++++--
 include/linux/mm_types.h        |  19 ++++++
 mm/mmap.c                       |   1 +
 mm/mmu_gather.c                 |   3 +
 7 files changed, 150 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636cdc257..ecf538e6c6d5 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -9,15 +9,23 @@ static inline void tlb_flush(struct mmu_gather *tlb);
 
 static inline void tlb_flush(struct mmu_gather *tlb)
 {
-	unsigned long start = 0UL, end = TLB_FLUSH_ALL;
 	unsigned int stride_shift = tlb_get_unmap_shift(tlb);
 
-	if (!tlb->fullmm && !tlb->need_flush_all) {
-		start = tlb->start;
-		end = tlb->end;
+	/* Perform full flush when needed */
+	if (tlb->fullmm || tlb->need_flush_all) {
+		flush_tlb_mm_range(tlb->mm, 0, TLB_FLUSH_ALL, stride_shift,
+				   tlb->freed_tables);
+		return;
 	}
 
-	flush_tlb_mm_range(tlb->mm, start, end, stride_shift, tlb->freed_tables);
+	/* Check if flush was already performed */
+	if (!tlb->freed_tables && !tlb->cleared_puds &&
+	    !tlb->cleared_p4ds &&
+	    atomic64_read(&tlb->mm->tlb_gen_completed) > tlb->defer_gen)
+		return;
+
+	flush_tlb_mm_range_gen(tlb->mm, tlb->start, tlb->end, stride_shift,
+			       tlb->freed_tables, tlb->defer_gen);
 }
 
 /*
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 2110b98026a7..296a00545056 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -225,6 +225,11 @@ void flush_tlb_others(const struct cpumask *cpumask,
 			: PAGE_SHIFT, false)
 
 extern void flush_tlb_all(void);
+
+extern void flush_tlb_mm_range_gen(struct mm_struct *mm, unsigned long start,
+				unsigned long end, unsigned int stride_shift,
+				bool freed_tables, u64 gen);
+
 extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned int stride_shift,
 				bool freed_tables);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index d17b5575531e..48f4b56fc4a7 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -883,12 +883,11 @@ static inline void put_flush_tlb_info(void)
 #endif
 }
 
-void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
+void flush_tlb_mm_range_gen(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned int stride_shift,
-				bool freed_tables)
+				bool freed_tables, u64 new_tlb_gen)
 {
 	struct flush_tlb_info *info;
-	u64 new_tlb_gen;
 	int cpu;
 
 	cpu = get_cpu();
@@ -923,6 +922,15 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	put_cpu();
 }
 
+void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
+				unsigned long end, unsigned int stride_shift,
+				bool freed_tables)
+{
+	u64 new_tlb_gen = inc_mm_tlb_gen(mm);
+
+	flush_tlb_mm_range_gen(mm, start, end, stride_shift, freed_tables,
+			       new_tlb_gen);
+}
 
 static void do_flush_tlb_all(void *info)
 {
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 10690763090a..f25d2d955076 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -295,6 +295,11 @@ struct mmu_gather {
 	unsigned int		cleared_puds : 1;
 	unsigned int		cleared_p4ds : 1;
 
+	/*
+	 * Whether a TLB flush was needed for PTEs in the current table
+	 */
+	unsigned int		cleared_ptes_in_table : 1;
+
 	unsigned int		batch_count;
 
 #ifndef CONFIG_MMU_GATHER_NO_GATHER
@@ -305,6 +310,10 @@ struct mmu_gather {
 #ifdef CONFIG_MMU_GATHER_PAGE_SIZE
 	unsigned int page_size;
 #endif
+
+#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS
+	u64 defer_gen;
+#endif
 #endif
 };
 
@@ -381,7 +390,8 @@ static inline void tlb_flush(struct mmu_gather *tlb)
 #endif
 
 #if __is_defined(tlb_flush) || \
-	IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING)
+	IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING) || \
+	IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS)
 
 static inline void tlb_update_vma(struct mmu_gather *tlb,
 				  struct vm_area_struct *vma)
@@ -472,7 +482,8 @@ static inline unsigned long tlb_get_unmap_size(struct mmu_gather *tlb)
  */
 static inline void tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
 {
-	if (tlb->fullmm)
+	if (IS_ENABLED(CONFIG_ARCH_WANT_AGGRESSIVE_TLB_FLUSH_BATCHING) &&
+	    tlb->fullmm)
 		return;
 
 	tlb_update_vma(tlb, vma);
@@ -530,16 +541,87 @@ static inline void mark_mm_tlb_gen_done(struct mm_struct *mm, u64 gen)
 	tlb_update_generation(&mm->tlb_gen_completed, gen);
 }
 
-#endif /* CONFIG_ARCH_HAS_TLB_GENERATIONS */
+static inline void read_defer_tlb_flush_gen(struct mmu_gather *tlb)
+{
+	struct mm_struct *mm = tlb->mm;
+	u64 mm_gen;
+
+	/*
+	 * Any change of a PTE before calling track_defer_tlb_flush() must be
+	 * performed using an RMW atomic operation that provides a memory
+	 * barrier, such as ptep_modify_prot_start(). The barrier ensures the
+	 * PTEs are written before the current generation is read,
+	 * synchronizing (implicitly) with flush_tlb_mm_range().
+	 */
+	smp_mb__after_atomic();
+
+	mm_gen = atomic64_read(&mm->tlb_gen);
+
+	/*
+	 * This condition checks both for the first deferred TLB flush and for
+	 * other pending or executed TLB flushes after the last table that we
+	 * updated. In the latter case, we are going to skip a generation,
+	 * which would lead to a full TLB flush. This should therefore not
+	 * cause correctness issues, and should not induce overheads, since in
+	 * TLB storms it is anyhow better to perform a full TLB flush.
+	 */
+	if (mm_gen != tlb->defer_gen) {
+		VM_BUG_ON(mm_gen < tlb->defer_gen);
+
+		tlb->defer_gen = inc_mm_tlb_gen(mm);
+	}
+}
+
+/*
+ * Store the deferred TLB generation in the VMA
+ */
+static inline void store_deferred_tlb_gen(struct mmu_gather *tlb)
+{
+	tlb_update_generation(&tlb->vma->defer_tlb_gen, tlb->defer_gen);
+}
+
+/*
+ * Track deferred TLB flushes for PTEs and PMDs to allow fine-granularity
+ * checks of whether a PTE is accessible. The TLB generation after the PTE
+ * is flushed is saved in the mmu_gather struct. Once a flush is performed,
+ * the generation is advanced.
+ */
+static inline void track_defer_tlb_flush(struct mmu_gather *tlb)
+{
+	if (tlb->fullmm)
+		return;
+
+	BUG_ON(!tlb->vma);
+
+	read_defer_tlb_flush_gen(tlb);
+	store_deferred_tlb_gen(tlb);
+}
+
+#define init_vma_tlb_generation(vma)				\
+	atomic64_set(&(vma)->defer_tlb_gen, 0)
+#else
+static inline void init_vma_tlb_generation(struct vm_area_struct *vma) { }
+#endif
 
 #define tlb_start_ptes(tlb)						\
 	do {								\
 		struct mmu_gather *_tlb = (tlb);			\
 									\
 		flush_tlb_batched_pending(_tlb->mm);			\
+		if (IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS))	\
+			_tlb->cleared_ptes_in_table = 0;		\
 	} while (0)
 
-static inline void tlb_end_ptes(struct mmu_gather *tlb) { }
+static inline void tlb_end_ptes(struct mmu_gather *tlb)
+{
+	if (!IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS))
+		return;
+
+	if (tlb->cleared_ptes_in_table)
+		track_defer_tlb_flush(tlb);
+
+	tlb->cleared_ptes_in_table = 0;
+}
 
 /*
  * tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and tlb->end,
@@ -550,15 +632,25 @@ static inline void tlb_flush_pte_range(struct mmu_gather *tlb,
 {
 	__tlb_adjust_range(tlb, address, size);
 	tlb->cleared_ptes = 1;
+
+	if (IS_ENABLED(CONFIG_ARCH_HAS_TLB_GENERATIONS))
+		tlb->cleared_ptes_in_table = 1;
 }
 
-static inline void tlb_flush_pmd_range(struct mmu_gather *tlb,
+static inline void __tlb_flush_pmd_range(struct mmu_gather *tlb,
 				     unsigned long address, unsigned long size)
 {
 	__tlb_adjust_range(tlb, address, size);
 	tlb->cleared_pmds = 1;
 }
 
+static inline void tlb_flush_pmd_range(struct mmu_gather *tlb,
+				     unsigned long address, unsigned long size)
+{
+	__tlb_flush_pmd_range(tlb, address, size);
+	track_defer_tlb_flush(tlb);
+}
+
 static inline void tlb_flush_pud_range(struct mmu_gather *tlb,
 				     unsigned long address, unsigned long size)
 {
@@ -649,7 +741,7 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 #ifndef pte_free_tlb
 #define pte_free_tlb(tlb, ptep, address)			\
 	do {							\
-		tlb_flush_pmd_range(tlb, address, PAGE_SIZE);	\
+		__tlb_flush_pmd_range(tlb, address, PAGE_SIZE);	\
 		tlb->freed_tables = 1;				\
 		__pte_free_tlb(tlb, ptep, address);		\
 	} while (0)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 676795dfd5d4..bbe5d4a422f7 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -367,6 +367,9 @@ struct vm_area_struct {
 #endif
 #ifdef CONFIG_NUMA
 	struct mempolicy *vm_policy;	/* NUMA policy for the VMA */
+#endif
+#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS
+	atomic64_t defer_tlb_gen;	/* Deferred TLB flushes generation */
 #endif
 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
 } __randomize_layout;
@@ -628,6 +631,21 @@ static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
 	return atomic_read(&mm->tlb_flush_pending);
 }
 
+#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS
+static inline bool pte_tlb_flush_pending(struct vm_area_struct *vma, pte_t *pte)
+{
+	struct mm_struct *mm = vma->vm_mm;
+
+	return atomic64_read(&vma->defer_tlb_gen) < atomic64_read(&mm->tlb_gen_completed);
+}
+
+static inline bool pmd_tlb_flush_pending(struct vm_area_struct *vma, pmd_t *pmd)
+{
+	struct mm_struct *mm = vma->vm_mm;
+
+	return atomic64_read(&vma->defer_tlb_gen) < atomic64_read(&mm->tlb_gen_completed);
+}
+#else /* CONFIG_ARCH_HAS_TLB_GENERATIONS */
 static inline bool pte_tlb_flush_pending(struct vm_area_struct *vma, pte_t *pte)
 {
 	return mm_tlb_flush_pending(vma->vm_mm);
@@ -637,6 +655,7 @@ static inline bool pmd_tlb_flush_pending(struct vm_area_struct *vma, pmd_t *pmd)
 {
 	return mm_tlb_flush_pending(vma->vm_mm);
 }
+#endif /* CONFIG_ARCH_HAS_TLB_GENERATIONS */
 
 static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
 {
diff --git a/mm/mmap.c b/mm/mmap.c
index 90673febce6a..a81ef902e296 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3337,6 +3337,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 			get_file(new_vma->vm_file);
 		if (new_vma->vm_ops && new_vma->vm_ops->open)
 			new_vma->vm_ops->open(new_vma);
+		init_vma_tlb_generation(new_vma);
 		vma_link(mm, new_vma, prev, rb_link, rb_parent);
 		*need_rmap_locks = false;
 	}
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 13338c096cc6..0d554f2f92ac 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -329,6 +329,9 @@ static void __tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
 #endif
 
 	tlb_table_init(tlb);
+#ifdef CONFIG_ARCH_HAS_TLB_GENERATIONS
+	tlb->defer_gen = 0;
+#endif
 #ifdef CONFIG_MMU_GATHER_PAGE_SIZE
 	tlb->page_size = 0;
 #endif
-- 
2.25.1
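Illustration only, not part of the patch: the "skipped generation" rule from
the commit message means that, on the flushing side, a CPU compares the
generation it is asked to reach with the generation it has already completed.
A minimal sketch of that decision, with hypothetical names and callbacks (the
real logic lives in x86's TLB flush IPI handler):

	/*
	 * Hypothetical sketch, not the kernel's actual code: catch a CPU up
	 * from the generation it has completed to the one it was asked for.
	 */
	static void catch_up_tlb_gen(u64 completed_gen, u64 asked_gen,
				     void (*flush_range)(void),
				     void (*flush_all)(void))
	{
		if (asked_gen <= completed_gen)
			return;			/* already flushed by someone else */

		if (asked_gen == completed_gen + 1)
			flush_range();		/* no generation was skipped */
		else
			flush_all();		/* a generation was skipped */
	}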