linux-kernel.vger.kernel.org archive mirror
From: Ralph Campbell <rcampbell@nvidia.com>
To: Christoph Hellwig <hch@lst.de>
Cc: <nouveau@lists.freedesktop.org>, <linux-kernel@vger.kernel.org>,
	"Jerome Glisse" <jglisse@redhat.com>,
	John Hubbard <jhubbard@nvidia.com>,
	"Jason Gunthorpe" <jgg@mellanox.com>,
	Ben Skeggs <bskeggs@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
	Bharata B Rao <bharata@linux.ibm.com>
Subject: Re: [RESEND PATCH 2/3] nouveau: fix mixed normal and device private page migration
Date: Thu, 25 Jun 2020 10:25:38 -0700
Message-ID: <a9aba057-3786-8204-f782-6e8f3c290b35@nvidia.com>
In-Reply-To: <330f6a82-d01d-db97-1dec-69346f41e707@nvidia.com>

Making sure to include linux-mm and Bharata B Rao for IBM's
use of migrate_vma*().

On 6/24/20 11:10 AM, Ralph Campbell wrote:
> 
> On 6/24/20 12:23 AM, Christoph Hellwig wrote:
>> On Mon, Jun 22, 2020 at 04:38:53PM -0700, Ralph Campbell wrote:
>>> The OpenCL function clEnqueueSVMMigrateMem(), without any flags, will
>>> migrate memory in the given address range to device private memory. The
>>> source pages might already have been migrated to device private memory.
>>> In that case, the source struct page is not checked to see if it is
>>> a device private page and incorrectly computes the GPU's physical
>>> address of local memory leading to data corruption.
>>> Fix this by checking the source struct page and computing the correct
>>> physical address.
>>
>> I'm really worried about all this delicate code to fix the mixed
>> ranges.  Can't we make it clear at the migrate_vma_* level if we want
>> to migrate from or to device private memory, and then skip all the work
>> for regions of memory that already are in the right place?  This might be
>> a little more work initially, but I think it leads to a much better
>> API.
>>
> 
> The current code does encode the direction with src_owner != NULL meaning
> device private to system memory and src_owner == NULL meaning system
> memory to device private memory. This patch would obviously defeat that
> so perhaps a flag could be added to the struct migrate_vma to indicate the
> direction but I'm unclear how that makes things less delicate.
> Can you expand on what you are worried about?
> 
> The issue with invalidations might be better addressed by letting the device
> driver handle device private page TLB invalidations when migrating to
> system memory and changing migrate_vma_setup() to only invalidate CPU
> TLB entries for normal pages being migrated to device private memory.
> If a page isn't migrating, it seems inefficient to invalidate those TLB
> entries.
> 
> Any other suggestions?

After a night's sleep, I think this might work. What do others think?

1) Add a new MMU_NOTIFY_MIGRATE enum to mmu_notifier_event.

2) Change migrate_vma_collect() to use the new MMU_NOTIFY_MIGRATE event type.

3) Modify nouveau_svmm_invalidate_range_start() to simply return (no invalidations)
for MMU_NOTIFY_MIGRATE mmu notifier callbacks.

4) Leave the src_owner check in migrate_vma_collect_pmd() for normal pages, so that if
the device driver is migrating normal pages to device private memory, the driver would
set src_owner = NULL and already migrated device private pages would be skipped.
Since the mmu notifier callback did nothing, the device private entries remain valid
in the device's MMU. migrate_vma_collect_pmd() would still invalidate the CPU page
tables for migrated normal pages.
If the driver is migrating device private pages to system memory, it would set
src_owner != NULL and normal pages would be skipped, but now the device driver has to
invalidate device MMU mappings in the "alloc and copy" step before doing the copy.
This would be after migrate_vma_setup() returns, so the list of migrating device
pages is known to the driver.

The rest, migrate_vma_pages() and migrate_vma_finalize(), remains the same.
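
To make steps 1-4 concrete, here is a minimal user-space sketch of the proposed
flow. It is not kernel code and not a patch: every model_* name, the struct
layout, and the boolean "mapped" flags are assumptions made up for illustration.
It only models which pages get collected and who invalidates which mapping,
depending on whether src_owner is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the kernel's definitions. */
enum model_event { MODEL_NOTIFY_CLEAR, MODEL_NOTIFY_MIGRATE };
enum model_page_kind { MODEL_PAGE_NORMAL, MODEL_PAGE_DEVICE_PRIVATE };

struct model_page {
	enum model_page_kind kind;
	const void *owner;   /* owning device for device private pages, else NULL */
	bool cpu_mapped;     /* CPU page table entry is valid */
	bool dev_mapped;     /* device MMU entry is valid */
	bool collected;      /* selected for migration by the collect step */
};

/*
 * Step 3: the driver's invalidate callback simply returns for the
 * migrate event, leaving device MMU entries untouched.
 * Returns true if it invalidated anything.
 */
static bool model_driver_invalidate(enum model_event ev,
				    struct model_page *pages, size_t n)
{
	if (ev == MODEL_NOTIFY_MIGRATE)
		return false;
	for (size_t i = 0; i < n; i++)
		pages[i].dev_mapped = false;
	return true;
}

/*
 * Steps 2 and 4: collect pages for migration, using src_owner to encode
 * the direction. src_owner == NULL means system memory to device private;
 * src_owner != NULL means device private to system memory.
 * Returns the number of pages collected.
 */
static size_t model_collect(struct model_page *pages, size_t n,
			    const void *src_owner)
{
	size_t count = 0;

	/* Step 2: the notifier fires with the new migrate event type. */
	(void)model_driver_invalidate(MODEL_NOTIFY_MIGRATE, pages, n);

	for (size_t i = 0; i < n; i++) {
		struct model_page *p = &pages[i];

		p->collected = false;
		if (p->kind == MODEL_PAGE_DEVICE_PRIVATE) {
			/* Already on the device, or owned by someone else: skip. */
			if (src_owner == NULL || p->owner != src_owner)
				continue;
		} else {
			/* Normal pages are skipped when migrating to system memory. */
			if (src_owner != NULL)
				continue;
			/* The collect step still invalidates the CPU mapping. */
			p->cpu_mapped = false;
		}
		p->collected = true;
		count++;
	}
	return count;
}

/*
 * Step 4, second half: in its "alloc and copy" step the driver now has to
 * invalidate device MMU mappings for the collected device private pages
 * before copying. Returns the number of device mappings invalidated.
 */
static size_t model_driver_alloc_and_copy(struct model_page *pages, size_t n)
{
	size_t invalidated = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].collected ||
		    pages[i].kind != MODEL_PAGE_DEVICE_PRIVATE)
			continue;
		assert(pages[i].owner != NULL);
		pages[i].dev_mapped = false;
		invalidated++;
	}
	return invalidated;
}
```

With a mixed range of one normal page and one device private page, calling
model_collect() with src_owner == NULL collects only the normal page and leaves
the device mapping of the other page valid. Calling it with the owning device
collects only the device private page, and the device mapping stays valid until
model_driver_alloc_and_copy() runs.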
