From: Yang Shi <shy828301@gmail.com> To: Zi Yan <ziy@nvidia.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com>, Huang Ying <ying.huang@intel.com>, Andrew Morton <akpm@linux-foundation.org>, Linux MM <linux-mm@kvack.org>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Dan Carpenter <dan.carpenter@oracle.com>, Mel Gorman <mgorman@suse.de>, Gerald Schaefer <gerald.schaefer@linux.ibm.com>, Heiko Carstens <hca@linux.ibm.com>, Hugh Dickins <hughd@google.com>, Andrea Arcangeli <aarcange@redhat.com>, "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>, Michal Hocko <mhocko@suse.com>, Vasily Gorbik <gor@linux.ibm.com>, Paolo Bonzini <pbonzini@redhat.com>, kvm list <kvm@vger.kernel.org> Subject: Re: [PATCH] mm,do_huge_pmd_numa_page: remove unnecessary TLB flushing code Date: Tue, 20 Jul 2021 15:19:50 -0700 [thread overview] Message-ID: <CAHbLzkqZZEic7+H0ky9u+aKO5o_cF0N5xQ=JO2tMpc8jg8RcnQ@mail.gmail.com> (raw) In-Reply-To: <0D75A92F-E2AA-480C-9E9A-0B6EE7897757@nvidia.com> On Tue, Jul 20, 2021 at 2:04 PM Zi Yan <ziy@nvidia.com> wrote: > > On 20 Jul 2021, at 16:53, Yang Shi wrote: > > > On Tue, Jul 20, 2021 at 7:25 AM Christian Borntraeger > > <borntraeger@de.ibm.com> wrote: > >> > >> > >> > >> On 20.07.21 08:55, Huang Ying wrote: > >>> Before the commit c5b5a3dd2c1f ("mm: thp: refactor NUMA fault > >>> handling"), the TLB flushing is done in do_huge_pmd_numa_page() itself > >>> via flush_tlb_range(). > >>> > >>> But after commit c5b5a3dd2c1f ("mm: thp: refactor NUMA fault > >>> handling"), the TLB flushing is done in migrate_pages() as in the > >>> following code path anyway. > >>> > >>> do_huge_pmd_numa_page > >>> migrate_misplaced_page > >>> migrate_pages > >>> > >>> So now, the TLB flushing code in do_huge_pmd_numa_page() becomes > >>> unnecessary. So the code is deleted in this patch to simplify the > >>> code. This is only code cleanup, there's no visible performance > >>> difference. > >>> > >>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com> > >>> Cc: Yang Shi <shy828301@gmail.com> > >>> Cc: Dan Carpenter <dan.carpenter@oracle.com> > >>> Cc: Mel Gorman <mgorman@suse.de> > >>> Cc: Christian Borntraeger <borntraeger@de.ibm.com> > >>> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> > >>> Cc: Heiko Carstens <hca@linux.ibm.com> > >>> Cc: Hugh Dickins <hughd@google.com> > >>> Cc: Andrea Arcangeli <aarcange@redhat.com> > >>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > >>> Cc: Michal Hocko <mhocko@suse.com> > >>> Cc: Vasily Gorbik <gor@linux.ibm.com> > >>> Cc: Zi Yan <ziy@nvidia.com> > >>> --- > >>> mm/huge_memory.c | 26 -------------------------- > >>> 1 file changed, 26 deletions(-) > >>> > >>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c > >>> index afff3ac87067..9f21e44c9030 100644 > >>> --- a/mm/huge_memory.c > >>> +++ b/mm/huge_memory.c > >>> @@ -1440,32 +1440,6 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf) > >>> goto out; > >>> } > >>> > >>> - /* > >>> - * Since we took the NUMA fault, we must have observed the !accessible > >>> - * bit. Make sure all other CPUs agree with that, to avoid them > >>> - * modifying the page we're about to migrate. > >>> - * > >>> - * Must be done under PTL such that we'll observe the relevant > >>> - * inc_tlb_flush_pending(). > >>> - * > >>> - * We are not sure a pending tlb flush here is for a huge page > >>> - * mapping or not. Hence use the tlb range variant > >>> - */ > >>> - if (mm_tlb_flush_pending(vma->vm_mm)) { > >>> - flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE); > >>> - /* > >>> - * change_huge_pmd() released the pmd lock before > >>> - * invalidating the secondary MMUs sharing the primary > >>> - * MMU pagetables (with ->invalidate_range()). The > >>> - * mmu_notifier_invalidate_range_end() (which > >>> - * internally calls ->invalidate_range()) in > >>> - * change_pmd_range() will run after us, so we can't > >>> - * rely on it here and we need an explicit invalidate. > >>> - */ > >>> - mmu_notifier_invalidate_range(vma->vm_mm, haddr, > >>> - haddr + HPAGE_PMD_SIZE); > >>> - } > >>> CC Paolo/KVM list so we also remove the mmu notifier here. Do we need those > >> now in migrate_pages? I am not an expert in that code, but I cant find > >> an equivalent mmu_notifier in migrate_misplaced_pages. > >> I might be totally wrong, just something that I noticed. > > > > Do you mean the missed mmu notifier invalidate for the THP migration > > case? Yes, I noticed that too. But I'm not sure whether it is intended > > or just missed. > > From my understand of mmu_notifier document, mmu_notifier_invalidate_range() > is needed only if the PTE is updated to point to a new page or the page pointed > by the PTE is freed. Page migration does not fall into either case. > In addition, in migrate_pages(), more specifically try_to_migrate_one(), > there is a pair of mmu_notifier_invalidate_range_start() and > mmu_notifier_invalidate_range_end() around the PTE manipulation code, which should > be sufficient to notify secondary TLBs (including KVM) about the PTE change > for page migration. Correct me if I am wrong. Thanks, I think you are correct. By looking into commit 7066f0f933a1 ("mm: thp: fix mmu_notifier in migrate_misplaced_transhuge_page()"), the tlb flush and mmu notifier invalidate were needed since the old numa fault implementation didn't change PTE to migration entry so it may cause data corruption due to the writes from GPU secondary MMU. The refactor does use the generic migration code which converts PTE to migration entry before copying data to the new page. > > — > Best Regards, > Yan, Zi
next prev parent reply other threads:[~2021-07-20 22:22 UTC|newest] Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-07-20 6:55 Huang Ying 2021-07-20 13:36 ` Zi Yan 2021-07-20 14:25 ` Christian Borntraeger 2021-07-20 20:53 ` Yang Shi 2021-07-20 21:04 ` Zi Yan 2021-07-20 22:19 ` Yang Shi [this message] 2021-07-21 15:41 ` Sean Christopherson 2021-07-22 0:26 ` Huang, Ying 2021-07-22 7:36 ` Christian Borntraeger 2021-07-22 23:10 ` Huang, Ying 2021-07-23 0:03 ` Huang, Ying 2021-07-23 20:19 ` Andrew Morton 2021-07-20 20:48 ` Yang Shi 2021-07-20 22:21 ` Yang Shi
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAHbLzkqZZEic7+H0ky9u+aKO5o_cF0N5xQ=JO2tMpc8jg8RcnQ@mail.gmail.com' \ --to=shy828301@gmail.com \ --cc=aarcange@redhat.com \ --cc=akpm@linux-foundation.org \ --cc=borntraeger@de.ibm.com \ --cc=dan.carpenter@oracle.com \ --cc=gerald.schaefer@linux.ibm.com \ --cc=gor@linux.ibm.com \ --cc=hca@linux.ibm.com \ --cc=hughd@google.com \ --cc=kirill.shutemov@linux.intel.com \ --cc=kvm@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mgorman@suse.de \ --cc=mhocko@suse.com \ --cc=pbonzini@redhat.com \ --cc=ying.huang@intel.com \ --cc=ziy@nvidia.com \ --subject='Re: [PATCH] mm,do_huge_pmd_numa_page: remove unnecessary TLB flushing code' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).