From: Yu Zhao <yuzhao@google.com>
To: Will Deacon <will@kernel.org>
Cc: linux-kernel@vger.kernel.org, kernel-team@android.com,
	Catalin Marinas <catalin.marinas@arm.com>,
	Minchan Kim <minchan@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 4/6] mm: proc: Invalidate TLB after clearing soft-dirty page state
Date: Fri, 20 Nov 2020 13:22:53 -0700
Message-ID: <20201120202253.GB1303870@google.com>
In-Reply-To: <20201120143557.6715-5-will@kernel.org>

On Fri, Nov 20, 2020 at 02:35:55PM +0000, Will Deacon wrote:
> Since commit 0758cd830494 ("asm-generic/tlb: avoid potential double flush"),
> TLB invalidation is elided in tlb_finish_mmu() if no entries were batched
> via the tlb_remove_*() functions. Consequently, the page-table modifications
> performed by clear_refs_write() in response to a write to
> /proc/<pid>/clear_refs do not perform TLB invalidation. Although this is
> fine when simply aging the ptes, in the case of clearing the "soft-dirty"
> state we can end up with entries where pte_write() is false, yet a
> writable mapping remains in the TLB.

I don't think we need a TLB flush in this context, for the same reason
we don't have one in copy_present_pte(), which uses ptep_set_wrprotect()
to write-protect a source PTE.

ptep_modify_prot_start/commit() and ptep_set_wrprotect() guarantee that
either the dirty bit is set (when a PTE is still writable) or a page
fault happens (when a PTE has become read-only) when the hardware page
table walker races with the kernel modifying a PTE via these two APIs.

> Fix this by calling tlb_remove_tlb_entry() for each entry being
> write-protected when clearing soft-dirty.
>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  fs/proc/task_mmu.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index cd03ab9087b0..3308292ee5c5 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1032,11 +1032,12 @@ enum clear_refs_types {
>
>  struct clear_refs_private {
>  	enum clear_refs_types type;
> +	struct mmu_gather *tlb;
>  };
>
>  #ifdef CONFIG_MEM_SOFT_DIRTY
>  static inline void clear_soft_dirty(struct vm_area_struct *vma,
> -		unsigned long addr, pte_t *pte)
> +		unsigned long addr, pte_t *pte, struct mmu_gather *tlb)
>  {
>  	/*
>  	 * The soft-dirty tracker uses #PF-s to catch writes
> @@ -1053,6 +1054,7 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
>  		ptent = pte_wrprotect(old_pte);
>  		ptent = pte_clear_soft_dirty(ptent);
>  		ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
> +		tlb_remove_tlb_entry(tlb, pte, addr);
>  	} else if (is_swap_pte(ptent)) {
>  		ptent = pte_swp_clear_soft_dirty(ptent);
>  		set_pte_at(vma->vm_mm, addr, pte, ptent);
> @@ -1060,14 +1062,14 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
>  }
>  #else
>  static inline void clear_soft_dirty(struct vm_area_struct *vma,
> -		unsigned long addr, pte_t *pte)
> +		unsigned long addr, pte_t *pte, struct mmu_gather *tlb)
>  {
>  }
>  #endif
>
>  #if defined(CONFIG_MEM_SOFT_DIRTY) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
>  static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
> -		unsigned long addr, pmd_t *pmdp)
> +		unsigned long addr, pmd_t *pmdp, struct mmu_gather *tlb)
>  {
>  	pmd_t old, pmd = *pmdp;
>
> @@ -1081,6 +1083,7 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
>
>  		pmd = pmd_wrprotect(pmd);
>  		pmd = pmd_clear_soft_dirty(pmd);
> +		tlb_remove_pmd_tlb_entry(tlb, pmdp, addr);
>
>  		set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
>  	} else if (is_migration_entry(pmd_to_swp_entry(pmd))) {
> @@ -1090,7 +1093,7 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
>  }
>  #else
>  static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
> -		unsigned long addr, pmd_t *pmdp)
> +		unsigned long addr, pmd_t *pmdp, struct mmu_gather *tlb)
>  {
>  }
>  #endif
> @@ -1107,7 +1110,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
>  	ptl = pmd_trans_huge_lock(pmd, vma);
>  	if (ptl) {
>  		if (cp->type == CLEAR_REFS_SOFT_DIRTY) {
> -			clear_soft_dirty_pmd(vma, addr, pmd);
> +			clear_soft_dirty_pmd(vma, addr, pmd, cp->tlb);
>  			goto out;
>  		}
>
> @@ -1133,7 +1136,7 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
>  		ptent = *pte;
>
>  		if (cp->type == CLEAR_REFS_SOFT_DIRTY) {
> -			clear_soft_dirty(vma, addr, pte);
> +			clear_soft_dirty(vma, addr, pte, cp->tlb);
>  			continue;
>  		}
>
> @@ -1212,7 +1215,8 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
>  	if (mm) {
>  		struct mmu_notifier_range range;
>  		struct clear_refs_private cp = {
> -			.type = type,
> +			.type = type,
> +			.tlb = &tlb,
>  		};
>
>  		if (type == CLEAR_REFS_MM_HIWATER_RSS) {
> --
> 2.29.2.454.gaff20da3a2-goog
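[Editor's aside, not part of the thread: the stale-TLB hazard described in the
commit message can be sketched with a toy model of a single CPU's TLB. All
names here (`PTE`, `CPU`, `clear_soft_dirty`) are invented for the sketch, and
it deliberately ignores the hardware dirty-bit behaviour Yu's reply relies on;
it only illustrates why write-protecting a PTE without a flush can hide a
write from the soft-dirty tracker.]

```python
class PTE:
    """A page-table entry with a write permission and a soft-dirty bit."""
    def __init__(self):
        self.writable = True
        self.soft_dirty = False

class CPU:
    """One CPU with a private TLB caching a single translation."""
    def __init__(self, pte):
        self.pte = pte
        self.tlb_writable = None  # None = no cached translation

    def write(self):
        """Attempt a store; returns True if a write-protect fault was taken."""
        if self.tlb_writable is None:      # TLB miss: walk the page table
            self.tlb_writable = self.pte.writable
        if self.tlb_writable:
            return False                   # store succeeds silently
        # Write-protect fault: the handler records the write and
        # re-enables the mapping, as the soft-dirty tracker does.
        self.pte.soft_dirty = True
        self.pte.writable = True
        self.tlb_writable = True
        return True

    def flush_tlb(self):
        self.tlb_writable = None

def clear_soft_dirty(cpu, flush):
    cpu.pte.writable = False               # cf. pte_wrprotect()
    cpu.pte.soft_dirty = False             # cf. pte_clear_soft_dirty()
    if flush:
        cpu.flush_tlb()                    # what batching via tlb_remove_tlb_entry() ensures

# Without a flush, a stale writable translation survives, so the next
# write neither faults nor re-marks the page soft-dirty: it is lost.
cpu = CPU(PTE())
cpu.write()                                # cache a writable translation
clear_soft_dirty(cpu, flush=False)
assert cpu.write() is False and not cpu.pte.soft_dirty

# With the flush, the next write re-walks the page table, faults, and
# is recorded by the tracker.
cpu = CPU(PTE())
cpu.write()
clear_soft_dirty(cpu, flush=True)
assert cpu.write() is True and cpu.pte.soft_dirty
```

In terms of the real interface, the sequence modelled above corresponds to
writing `4` to /proc/<pid>/clear_refs and then observing the soft-dirty bit
via /proc/<pid>/pagemap.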