From: Jason Gunthorpe <jgg@ziepe.ca> To: Ralph Campbell <rcampbell@nvidia.com> Cc: linux-rdma@vger.kernel.org, linux-mm@kvack.org, nouveau@lists.freedesktop.org, kvm-ppc@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, Jerome Glisse <jglisse@redhat.com>, John Hubbard <jhubbard@nvidia.com>, Christoph Hellwig <hch@lst.de>, Andrew Morton <akpm@linux-foundation.org>, Shuah Khan <shuah@kernel.org>, Ben Skeggs <bskeggs@redhat.com>, Bharata B Rao <bharata@linux.ibm.com> Subject: Re: [PATCH v4 6/6] mm/migrate: remove range invalidation in migrate_vma_pages() Date: Fri, 31 Jul 2020 16:15:43 -0300 [thread overview] Message-ID: <20200731191543.GJ24045@ziepe.ca> (raw) In-Reply-To: <7f947311-0034-9148-1dca-fb9b9a10abc4@nvidia.com> On Tue, Jul 28, 2020 at 03:04:07PM -0700, Ralph Campbell wrote: > > On 7/28/20 12:19 PM, Jason Gunthorpe wrote: > > On Thu, Jul 23, 2020 at 03:30:04PM -0700, Ralph Campbell wrote: > > > When migrating the special zero page, migrate_vma_pages() calls > > > mmu_notifier_invalidate_range_start() before replacing the zero page > > > PFN in the CPU page tables. This is unnecessary since the range was > > > invalidated in migrate_vma_setup() and the page table entry is checked > > > to be sure it hasn't changed between migrate_vma_setup() and > > > migrate_vma_pages(). Therefore, remove the redundant invalidation. > > > > I don't follow this logic, the purpose of the invalidation is also to > > clear out anything that may be mirroring this VA, and "the page hasn't > > changed" doesn't seem to rule out that case? > > > > I'm also not sure I follow where the zero page came from? > > The zero page comes from an anonymous private VMA that is read-only > and the user level CPU process tries to read the page data (or any > other read page fault). > > > Jason > > > > The overall migration process is: > > mmap_read_lock() > > migrate_vma_setup() > // invalidates range, locks/isolates pages, puts migration entry in page table > > <driver allocates destination pages and copies source to dest> > > migrate_vma_pages() > // moves source struct page info to destination struct page info. > // clears migration flag for pages that can't be migrated. > > <driver updates device page tables for pages still migrating, rollback pages not migrating> > > migrate_vma_finalize() > // replaces migration page table entry with destination page PFN. > > mmap_read_unlock() > > Since the address range is invalidated in the migrate_vma_setup() stage, > and the page is isolated from the LRU cache, locked, unmapped, and the page table > holds a migration entry (so the page can't be faulted and the CPU page table set > valid again), and there are no extra page references (pins), the page > "should not be modified". That is the physical page though, it doesn't prove nobody else is reading the PTE. > For pte_none()/is_zero_pfn() entries, migrate_vma_setup() leaves the > pte_none()/is_zero_pfn() entry in place but does still call > mmu_notifier_invalidate_range_start() for the whole range being migrated. Ok.. > In the migrate_vma_pages() step, the pte page table is locked and the > pte entry checked to be sure it is still pte_none/is_zero_pfn(). If not, > the new page isn't inserted. If it is still none/zero, the new device private > struct page is inserted into the page table, replacing the pte_none()/is_zero_pfn() > page table entry. The secondary MMUs were already invalidated in the migrate_vma_setup() > step and a pte_none() or zero page can't be modified so the only invalidation needed > is the CPU TLB(s) for clearing the special zero page PTE entry. No, the secondary MMU was invalidated but the invalidation start/end range was exited. That means a secondary MMU is immeidately able to reload the zero page into its MMU cache. When this code replaces the PTE that has a zero page it also has to invalidate again so that secondary MMU's are guaranteed to pick up the new PTE value. So, I still don't understand how this is safe? Jason
WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@ziepe.ca> To: Ralph Campbell <rcampbell@nvidia.com> Cc: linux-rdma@vger.kernel.org, linux-mm@kvack.org, nouveau@lists.freedesktop.org, kvm-ppc@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, Jerome Glisse <jglisse@redhat.com>, John Hubbard <jhubbard@nvidia.com>, Christoph Hellwig <hch@lst.de>, Andrew Morton <akpm@linux-foundation.org>, Shuah Khan <shuah@kernel.org>, Ben Skeggs <bskeggs@redhat.com>, Bharata B Rao <bharata@linux.ibm.com> Subject: Re: [PATCH v4 6/6] mm/migrate: remove range invalidation in migrate_vma_pages() Date: Fri, 31 Jul 2020 19:15:43 +0000 [thread overview] Message-ID: <20200731191543.GJ24045@ziepe.ca> (raw) In-Reply-To: <7f947311-0034-9148-1dca-fb9b9a10abc4@nvidia.com> On Tue, Jul 28, 2020 at 03:04:07PM -0700, Ralph Campbell wrote: > > On 7/28/20 12:19 PM, Jason Gunthorpe wrote: > > On Thu, Jul 23, 2020 at 03:30:04PM -0700, Ralph Campbell wrote: > > > When migrating the special zero page, migrate_vma_pages() calls > > > mmu_notifier_invalidate_range_start() before replacing the zero page > > > PFN in the CPU page tables. This is unnecessary since the range was > > > invalidated in migrate_vma_setup() and the page table entry is checked > > > to be sure it hasn't changed between migrate_vma_setup() and > > > migrate_vma_pages(). Therefore, remove the redundant invalidation. > > > > I don't follow this logic, the purpose of the invalidation is also to > > clear out anything that may be mirroring this VA, and "the page hasn't > > changed" doesn't seem to rule out that case? > > > > I'm also not sure I follow where the zero page came from? > > The zero page comes from an anonymous private VMA that is read-only > and the user level CPU process tries to read the page data (or any > other read page fault). > > > Jason > > > > The overall migration process is: > > mmap_read_lock() > > migrate_vma_setup() > // invalidates range, locks/isolates pages, puts migration entry in page table > > <driver allocates destination pages and copies source to dest> > > migrate_vma_pages() > // moves source struct page info to destination struct page info. > // clears migration flag for pages that can't be migrated. > > <driver updates device page tables for pages still migrating, rollback pages not migrating> > > migrate_vma_finalize() > // replaces migration page table entry with destination page PFN. > > mmap_read_unlock() > > Since the address range is invalidated in the migrate_vma_setup() stage, > and the page is isolated from the LRU cache, locked, unmapped, and the page table > holds a migration entry (so the page can't be faulted and the CPU page table set > valid again), and there are no extra page references (pins), the page > "should not be modified". That is the physical page though, it doesn't prove nobody else is reading the PTE. > For pte_none()/is_zero_pfn() entries, migrate_vma_setup() leaves the > pte_none()/is_zero_pfn() entry in place but does still call > mmu_notifier_invalidate_range_start() for the whole range being migrated. Ok.. > In the migrate_vma_pages() step, the pte page table is locked and the > pte entry checked to be sure it is still pte_none/is_zero_pfn(). If not, > the new page isn't inserted. If it is still none/zero, the new device private > struct page is inserted into the page table, replacing the pte_none()/is_zero_pfn() > page table entry. The secondary MMUs were already invalidated in the migrate_vma_setup() > step and a pte_none() or zero page can't be modified so the only invalidation needed > is the CPU TLB(s) for clearing the special zero page PTE entry. No, the secondary MMU was invalidated but the invalidation start/end range was exited. That means a secondary MMU is immeidately able to reload the zero page into its MMU cache. When this code replaces the PTE that has a zero page it also has to invalidate again so that secondary MMU's are guaranteed to pick up the new PTE value. So, I still don't understand how this is safe? Jason
next prev parent reply other threads:[~2020-07-31 19:15 UTC|newest] Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-07-23 22:29 [PATCH v4 0/6] mm/migrate: avoid device private invalidations Ralph Campbell 2020-07-23 22:29 ` Ralph Campbell 2020-07-23 22:29 ` Ralph Campbell 2020-07-23 22:29 ` [PATCH v4 1/6] nouveau: fix storing invalid ptes Ralph Campbell 2020-07-23 22:29 ` Ralph Campbell 2020-07-23 22:29 ` Ralph Campbell 2020-07-23 22:30 ` [PATCH v4 2/6] mm/migrate: add a flags parameter to migrate_vma Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-23 22:30 ` [PATCH v4 3/6] mm/notifier: add migration invalidation type Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-28 19:15 ` Jason Gunthorpe 2020-07-28 19:15 ` Jason Gunthorpe 2020-07-28 19:15 ` Jason Gunthorpe 2020-07-28 20:57 ` Ralph Campbell 2020-07-28 20:57 ` Ralph Campbell 2020-07-28 20:57 ` Ralph Campbell 2020-07-23 22:30 ` [PATCH v4 4/6] nouveau/svm: use the new migration invalidation Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-23 22:30 ` [PATCH v4 5/6] mm/hmm/test: " Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-23 22:30 ` [PATCH v4 6/6] mm/migrate: remove range invalidation in migrate_vma_pages() Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-23 22:30 ` Ralph Campbell 2020-07-28 19:19 ` Jason Gunthorpe 2020-07-28 19:19 ` Jason Gunthorpe 2020-07-28 19:19 ` Jason Gunthorpe 2020-07-28 22:04 ` Ralph Campbell 2020-07-28 22:04 ` Ralph Campbell 2020-07-28 22:04 ` Ralph Campbell 2020-07-31 19:15 ` Jason Gunthorpe [this message] 2020-07-31 19:15 ` Jason Gunthorpe 2020-07-31 19:31 ` Ralph Campbell 2020-07-31 19:31 ` Ralph Campbell 2020-07-31 19:31 ` Ralph Campbell 2020-07-28 19:22 ` [PATCH v4 0/6] mm/migrate: avoid device private invalidations Jason Gunthorpe 2020-07-28 19:22 ` Jason Gunthorpe 2020-07-28 19:22 ` Jason Gunthorpe
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200731191543.GJ24045@ziepe.ca \ --to=jgg@ziepe.ca \ --cc=akpm@linux-foundation.org \ --cc=bharata@linux.ibm.com \ --cc=bskeggs@redhat.com \ --cc=hch@lst.de \ --cc=jglisse@redhat.com \ --cc=jhubbard@nvidia.com \ --cc=kvm-ppc@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-kselftest@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=linux-rdma@vger.kernel.org \ --cc=nouveau@lists.freedesktop.org \ --cc=rcampbell@nvidia.com \ --cc=shuah@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.