linux-mm.kvack.org archive mirror
* ensure device private pages have an owner
@ 2020-03-16 17:52 Christoph Hellwig
  2020-03-16 17:52 ` [PATCH 1/2] mm: handle multiple owners of device private pages in migrate_vma Christoph Hellwig
  2020-03-16 17:52 ` [PATCH 2/2] mm: remove device private page support from hmm_range_fault Christoph Hellwig
  0 siblings, 2 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-03-16 17:52 UTC (permalink / raw)
  To: Jason Gunthorpe, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs
  Cc: Jerome Glisse, kvm-ppc, amd-gfx, dri-devel, nouveau, linux-mm

When acting on device private mappings a driver needs to know if the
device (or other entity, in the case of kvmppc) actually owns the
private mapping.  This series adds an owner field and converts the
migrate_vma code over to check it.  I looked into doing the same for
hmm_range_fault, but as far as I can tell that code has never been
wired up to actually work for device private memory, so instead of
trying to fix unused code the second patch just removes it.  We can
add it back once we have working and fully tested code, which should
then pass the expected owner in the hmm_range structure.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] mm: handle multiple owners of device private pages in migrate_vma
  2020-03-16 17:52 ensure device private pages have an owner Christoph Hellwig
@ 2020-03-16 17:52 ` Christoph Hellwig
  2020-03-16 18:17   ` Jason Gunthorpe
  2020-03-16 17:52 ` [PATCH 2/2] mm: remove device private page support from hmm_range_fault Christoph Hellwig
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2020-03-16 17:52 UTC (permalink / raw)
  To: Jason Gunthorpe, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs
  Cc: Jerome Glisse, kvm-ppc, amd-gfx, dri-devel, nouveau, linux-mm

Add a new opaque owner field to struct dev_pagemap, which will allow
the hmm and migrate_vma code to identify who owns ZONE_DEVICE memory,
and refuse to work on mappings not owned by the calling entity.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/powerpc/kvm/book3s_hv_uvmem.c     | 4 ++++
 drivers/gpu/drm/nouveau/nouveau_dmem.c | 3 +++
 include/linux/memremap.h               | 4 ++++
 include/linux/migrate.h                | 2 ++
 mm/migrate.c                           | 3 +++
 5 files changed, 16 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 79b1202b1c62..29ed52892d31 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -386,6 +386,7 @@ kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
 	mig.end = end;
 	mig.src = &src_pfn;
 	mig.dst = &dst_pfn;
+	mig.dev_private_owner = &kvmppc_uvmem_pgmap;
 
 	/*
 	 * We come here with mmap_sem write lock held just for
@@ -563,6 +564,7 @@ kvmppc_svm_page_out(struct vm_area_struct *vma, unsigned long start,
 	mig.end = end;
 	mig.src = &src_pfn;
 	mig.dst = &dst_pfn;
+	mig.dev_private_owner = &kvmppc_uvmem_pgmap;
 
 	mutex_lock(&kvm->arch.uvmem_lock);
 	/* The requested page is already paged-out, nothing to do */
@@ -779,6 +781,8 @@ int kvmppc_uvmem_init(void)
 	kvmppc_uvmem_pgmap.type = MEMORY_DEVICE_PRIVATE;
 	kvmppc_uvmem_pgmap.res = *res;
 	kvmppc_uvmem_pgmap.ops = &kvmppc_uvmem_ops;
+	/* just one global instance: */
+	kvmppc_uvmem_pgmap.owner = &kvmppc_uvmem_pgmap;
 	addr = memremap_pages(&kvmppc_uvmem_pgmap, NUMA_NO_NODE);
 	if (IS_ERR(addr)) {
 		ret = PTR_ERR(addr);
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 0ad5d87b5a8e..7605c4c48985 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -176,6 +176,7 @@ static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
 		.end		= vmf->address + PAGE_SIZE,
 		.src		= &src,
 		.dst		= &dst,
+		.dev_private_owner = drm->dev,
 	};
 
 	/*
@@ -526,6 +527,7 @@ nouveau_dmem_init(struct nouveau_drm *drm)
 	drm->dmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
 	drm->dmem->pagemap.res = *res;
 	drm->dmem->pagemap.ops = &nouveau_dmem_pagemap_ops;
+	drm->dmem->pagemap.owner = drm->dev;
 	if (IS_ERR(devm_memremap_pages(device, &drm->dmem->pagemap)))
 		goto out_free;
 
@@ -631,6 +633,7 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
 	struct migrate_vma args = {
 		.vma		= vma,
 		.start		= start,
+		.dev_private_owner = drm->dev,
 	};
 	unsigned long c, i;
 	int ret = -ENOMEM;
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 6fefb09af7c3..60d97e8fd3c0 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -103,6 +103,9 @@ struct dev_pagemap_ops {
  * @type: memory type: see MEMORY_* in memory_hotplug.h
  * @flags: PGMAP_* flags to specify defailed behavior
  * @ops: method table
+ * @owner: an opaque pointer identifying the entity that manages this
+ *	instance.  Used by various helpers to make sure that no
+ *	foreign ZONE_DEVICE memory is accessed.
  */
 struct dev_pagemap {
 	struct vmem_altmap altmap;
@@ -113,6 +116,7 @@ struct dev_pagemap {
 	enum memory_type type;
 	unsigned int flags;
 	const struct dev_pagemap_ops *ops;
+	void *owner;
 };
 
 static inline struct vmem_altmap *pgmap_altmap(struct dev_pagemap *pgmap)
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 72120061b7d4..4bbd8d732e53 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -196,6 +196,8 @@ struct migrate_vma {
 	unsigned long		npages;
 	unsigned long		start;
 	unsigned long		end;
+
+	void			*dev_private_owner;
 };
 
 int migrate_vma_setup(struct migrate_vma *args);
diff --git a/mm/migrate.c b/mm/migrate.c
index b1092876e537..201e8fa627e0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2267,6 +2267,9 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 				goto next;
 
 			page = device_private_entry_to_page(entry);
+			if (page->pgmap->owner != migrate->dev_private_owner)
+				goto next;
+
 			mpfn = migrate_pfn(page_to_pfn(page)) |
 					MIGRATE_PFN_MIGRATE;
 			if (is_write_device_private_entry(entry))
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 17:52 ensure device private pages have an owner Christoph Hellwig
  2020-03-16 17:52 ` [PATCH 1/2] mm: handle multiple owners of device private pages in migrate_vma Christoph Hellwig
@ 2020-03-16 17:52 ` Christoph Hellwig
  2020-03-16 18:42   ` Ralph Campbell
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2020-03-16 17:52 UTC (permalink / raw)
  To: Jason Gunthorpe, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs
  Cc: Jerome Glisse, kvm-ppc, amd-gfx, dri-devel, nouveau, linux-mm

No driver has ever properly wired up and supported this feature.
There is various code related to it in nouveau, but as far as I can tell
it never actually got turned on, and the only changes since the initial
commit are global cleanups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  1 -
 drivers/gpu/drm/nouveau/nouveau_dmem.c  | 37 -------------------------
 drivers/gpu/drm/nouveau/nouveau_dmem.h  |  2 --
 drivers/gpu/drm/nouveau/nouveau_svm.c   |  3 --
 include/linux/hmm.h                     |  2 --
 mm/hmm.c                                | 28 -------------------
 6 files changed, 73 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index dee446278417..90821ce5e6ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -776,7 +776,6 @@ struct amdgpu_ttm_tt {
 static const uint64_t hmm_range_flags[HMM_PFN_FLAG_MAX] = {
 	(1 << 0), /* HMM_PFN_VALID */
 	(1 << 1), /* HMM_PFN_WRITE */
-	0 /* HMM_PFN_DEVICE_PRIVATE */
 };
 
 static const uint64_t hmm_range_values[HMM_PFN_VALUE_MAX] = {
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 7605c4c48985..42808efceaf2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -671,40 +671,3 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
 out:
 	return ret;
 }
-
-static inline bool
-nouveau_dmem_page(struct nouveau_drm *drm, struct page *page)
-{
-	return is_device_private_page(page) && drm->dmem == page_to_dmem(page);
-}
-
-void
-nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
-			 struct hmm_range *range)
-{
-	unsigned long i, npages;
-
-	npages = (range->end - range->start) >> PAGE_SHIFT;
-	for (i = 0; i < npages; ++i) {
-		struct page *page;
-		uint64_t addr;
-
-		page = hmm_device_entry_to_page(range, range->pfns[i]);
-		if (page == NULL)
-			continue;
-
-		if (!(range->pfns[i] & range->flags[HMM_PFN_DEVICE_PRIVATE])) {
-			continue;
-		}
-
-		if (!nouveau_dmem_page(drm, page)) {
-			WARN(1, "Some unknown device memory !\n");
-			range->pfns[i] = 0;
-			continue;
-		}
-
-		addr = nouveau_dmem_page_addr(page);
-		range->pfns[i] &= ((1UL << range->pfn_shift) - 1);
-		range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift;
-	}
-}
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.h b/drivers/gpu/drm/nouveau/nouveau_dmem.h
index 92394be5d649..1ac620b3d4fb 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.h
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.h
@@ -38,8 +38,6 @@ int nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
 			     unsigned long start,
 			     unsigned long end);
 
-void nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
-			      struct hmm_range *range);
 #else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
 static inline void nouveau_dmem_init(struct nouveau_drm *drm) {}
 static inline void nouveau_dmem_fini(struct nouveau_drm *drm) {}
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index df9bf1fd1bc0..7e0376dca137 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -367,7 +367,6 @@ static const u64
 nouveau_svm_pfn_flags[HMM_PFN_FLAG_MAX] = {
 	[HMM_PFN_VALID         ] = NVIF_VMM_PFNMAP_V0_V,
 	[HMM_PFN_WRITE         ] = NVIF_VMM_PFNMAP_V0_W,
-	[HMM_PFN_DEVICE_PRIVATE] = NVIF_VMM_PFNMAP_V0_VRAM,
 };
 
 static const u64
@@ -558,8 +557,6 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
 		break;
 	}
 
-	nouveau_dmem_convert_pfn(drm, &range);
-
 	svmm->vmm->vmm.object.client->super = true;
 	ret = nvif_object_ioctl(&svmm->vmm->vmm.object, data, size, NULL);
 	svmm->vmm->vmm.object.client->super = false;
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 4bf8d6997b12..5e6034f105c3 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -74,7 +74,6 @@
  * Flags:
  * HMM_PFN_VALID: pfn is valid. It has, at least, read permission.
  * HMM_PFN_WRITE: CPU page table has write permission set
- * HMM_PFN_DEVICE_PRIVATE: private device memory (ZONE_DEVICE)
  *
  * The driver provides a flags array for mapping page protections to device
  * PTE bits. If the driver valid bit for an entry is bit 3,
@@ -86,7 +85,6 @@
 enum hmm_pfn_flag_e {
 	HMM_PFN_VALID = 0,
 	HMM_PFN_WRITE,
-	HMM_PFN_DEVICE_PRIVATE,
 	HMM_PFN_FLAG_MAX
 };
 
diff --git a/mm/hmm.c b/mm/hmm.c
index 180e398170b0..3d10485bf323 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -118,15 +118,6 @@ static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 	/* We aren't ask to do anything ... */
 	if (!(pfns & range->flags[HMM_PFN_VALID]))
 		return;
-	/* If this is device memory then only fault if explicitly requested */
-	if ((cpu_flags & range->flags[HMM_PFN_DEVICE_PRIVATE])) {
-		/* Do we fault on device memory ? */
-		if (pfns & range->flags[HMM_PFN_DEVICE_PRIVATE]) {
-			*write_fault = pfns & range->flags[HMM_PFN_WRITE];
-			*fault = true;
-		}
-		return;
-	}
 
 	/* If CPU page table is not valid then we need to fault */
 	*fault = !(cpu_flags & range->flags[HMM_PFN_VALID]);
@@ -259,25 +250,6 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	if (!pte_present(pte)) {
 		swp_entry_t entry = pte_to_swp_entry(pte);
 
-		/*
-		 * This is a special swap entry, ignore migration, use
-		 * device and report anything else as error.
-		 */
-		if (is_device_private_entry(entry)) {
-			cpu_flags = range->flags[HMM_PFN_VALID] |
-				range->flags[HMM_PFN_DEVICE_PRIVATE];
-			cpu_flags |= is_write_device_private_entry(entry) ?
-				range->flags[HMM_PFN_WRITE] : 0;
-			hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
-					   &fault, &write_fault);
-			if (fault || write_fault)
-				goto fault;
-			*pfn = hmm_device_entry_from_pfn(range,
-					    swp_offset(entry));
-			*pfn |= cpu_flags;
-			return 0;
-		}
-
 		hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0, &fault,
 				   &write_fault);
 		if (!fault && !write_fault)
-- 
2.24.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: handle multiple owners of device private pages in migrate_vma
  2020-03-16 17:52 ` [PATCH 1/2] mm: handle multiple owners of device private pages in migrate_vma Christoph Hellwig
@ 2020-03-16 18:17   ` Jason Gunthorpe
  2020-03-16 18:20     ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2020-03-16 18:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dan Williams, Bharata B Rao, Christian König, Ben Skeggs,
	Jerome Glisse, kvm-ppc, amd-gfx, dri-devel, nouveau, linux-mm

On Mon, Mar 16, 2020 at 06:52:58PM +0100, Christoph Hellwig wrote:
> Add a new opaque owner field to struct dev_pagemap, which will allow
> the hmm and migrate_vma code to identify who owns ZONE_DEVICE memory,
> and refuse to work on mappings not owned by the calling entity.

Using a pointer seems like a good solution to me.

Is this a bug fix? What is the downside of migrate acting on pages it
doesn't own? I'm not up to speed on migrate..

Jason


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] mm: handle multiple owners of device private pages in migrate_vma
  2020-03-16 18:17   ` Jason Gunthorpe
@ 2020-03-16 18:20     ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-03-16 18:20 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs, Jerome Glisse, kvm-ppc,
	amd-gfx, dri-devel, nouveau, linux-mm

On Mon, Mar 16, 2020 at 03:17:07PM -0300, Jason Gunthorpe wrote:
> On Mon, Mar 16, 2020 at 06:52:58PM +0100, Christoph Hellwig wrote:
> > Add a new opaque owner field to struct dev_pagemap, which will allow
> > the hmm and migrate_vma code to identify who owns ZONE_DEVICE memory,
> > and refuse to work on mappings not owned by the calling entity.
> 
> Using a pointer seems like a good solution to me.
> 
> Is this a bug fix? What is the downside for migrate on pages it
> doesn't work? I'm not up to speed on migrate..

Migrating private pages not owned by the driver simply won't work, as
the device can't access them.  Even inside the same driver, say,
GPU A can't just migrate GPU B's memory.  In that sense it is
a bug fix for the rather unlikely case of using the experimental
nouveau driver with multiple GPUs, or in a POWER secure VM (if that
is even possible).


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 17:52 ` [PATCH 2/2] mm: remove device private page support from hmm_range_fault Christoph Hellwig
@ 2020-03-16 18:42   ` Ralph Campbell
  2020-03-16 18:49     ` Christoph Hellwig
  2020-03-16 19:04     ` Jason Gunthorpe
  0 siblings, 2 replies; 15+ messages in thread
From: Ralph Campbell @ 2020-03-16 18:42 UTC (permalink / raw)
  To: Christoph Hellwig, Jason Gunthorpe, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs
  Cc: Jerome Glisse, kvm-ppc, amd-gfx, dri-devel, nouveau, linux-mm


On 3/16/20 10:52 AM, Christoph Hellwig wrote:
> No driver has actually used properly wire up and support this feature.
> There is various code related to it in nouveau, but as far as I can tell
> it never actually got turned on, and the only changes since the initial
> commit are global cleanups.

This is not actually true. OpenCL 2.x does support SVM with nouveau and
device private memory via clEnqueueSVMMigrateMem().
Also, Ben Skeggs has accepted a set of patches to map GPU memory after being
migrated and this change would conflict with that.


> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  1 -
>   drivers/gpu/drm/nouveau/nouveau_dmem.c  | 37 -------------------------
>   drivers/gpu/drm/nouveau/nouveau_dmem.h  |  2 --
>   drivers/gpu/drm/nouveau/nouveau_svm.c   |  3 --
>   include/linux/hmm.h                     |  2 --
>   mm/hmm.c                                | 28 -------------------
>   6 files changed, 73 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index dee446278417..90821ce5e6ca 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -776,7 +776,6 @@ struct amdgpu_ttm_tt {
>   static const uint64_t hmm_range_flags[HMM_PFN_FLAG_MAX] = {
>   	(1 << 0), /* HMM_PFN_VALID */
>   	(1 << 1), /* HMM_PFN_WRITE */
> -	0 /* HMM_PFN_DEVICE_PRIVATE */
>   };
>   
>   static const uint64_t hmm_range_values[HMM_PFN_VALUE_MAX] = {
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 7605c4c48985..42808efceaf2 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -671,40 +671,3 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
>   out:
>   	return ret;
>   }
> -
> -static inline bool
> -nouveau_dmem_page(struct nouveau_drm *drm, struct page *page)
> -{
> -	return is_device_private_page(page) && drm->dmem == page_to_dmem(page);
> -}
> -
> -void
> -nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
> -			 struct hmm_range *range)
> -{
> -	unsigned long i, npages;
> -
> -	npages = (range->end - range->start) >> PAGE_SHIFT;
> -	for (i = 0; i < npages; ++i) {
> -		struct page *page;
> -		uint64_t addr;
> -
> -		page = hmm_device_entry_to_page(range, range->pfns[i]);
> -		if (page == NULL)
> -			continue;
> -
> -		if (!(range->pfns[i] & range->flags[HMM_PFN_DEVICE_PRIVATE])) {
> -			continue;
> -		}
> -
> -		if (!nouveau_dmem_page(drm, page)) {
> -			WARN(1, "Some unknown device memory !\n");
> -			range->pfns[i] = 0;
> -			continue;
> -		}
> -
> -		addr = nouveau_dmem_page_addr(page);
> -		range->pfns[i] &= ((1UL << range->pfn_shift) - 1);
> -		range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift;
> -	}
> -}
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.h b/drivers/gpu/drm/nouveau/nouveau_dmem.h
> index 92394be5d649..1ac620b3d4fb 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.h
> @@ -38,8 +38,6 @@ int nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
>   			     unsigned long start,
>   			     unsigned long end);
>   
> -void nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
> -			      struct hmm_range *range);
>   #else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
>   static inline void nouveau_dmem_init(struct nouveau_drm *drm) {}
>   static inline void nouveau_dmem_fini(struct nouveau_drm *drm) {}
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index df9bf1fd1bc0..7e0376dca137 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -367,7 +367,6 @@ static const u64
>   nouveau_svm_pfn_flags[HMM_PFN_FLAG_MAX] = {
>   	[HMM_PFN_VALID         ] = NVIF_VMM_PFNMAP_V0_V,
>   	[HMM_PFN_WRITE         ] = NVIF_VMM_PFNMAP_V0_W,
> -	[HMM_PFN_DEVICE_PRIVATE] = NVIF_VMM_PFNMAP_V0_VRAM,
>   };
>   
>   static const u64
> @@ -558,8 +557,6 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
>   		break;
>   	}
>   
> -	nouveau_dmem_convert_pfn(drm, &range);
> -
>   	svmm->vmm->vmm.object.client->super = true;
>   	ret = nvif_object_ioctl(&svmm->vmm->vmm.object, data, size, NULL);
>   	svmm->vmm->vmm.object.client->super = false;
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 4bf8d6997b12..5e6034f105c3 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -74,7 +74,6 @@
>    * Flags:
>    * HMM_PFN_VALID: pfn is valid. It has, at least, read permission.
>    * HMM_PFN_WRITE: CPU page table has write permission set
> - * HMM_PFN_DEVICE_PRIVATE: private device memory (ZONE_DEVICE)
>    *
>    * The driver provides a flags array for mapping page protections to device
>    * PTE bits. If the driver valid bit for an entry is bit 3,
> @@ -86,7 +85,6 @@
>   enum hmm_pfn_flag_e {
>   	HMM_PFN_VALID = 0,
>   	HMM_PFN_WRITE,
> -	HMM_PFN_DEVICE_PRIVATE,
>   	HMM_PFN_FLAG_MAX
>   };
>   
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 180e398170b0..3d10485bf323 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -118,15 +118,6 @@ static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
>   	/* We aren't ask to do anything ... */
>   	if (!(pfns & range->flags[HMM_PFN_VALID]))
>   		return;
> -	/* If this is device memory then only fault if explicitly requested */
> -	if ((cpu_flags & range->flags[HMM_PFN_DEVICE_PRIVATE])) {
> -		/* Do we fault on device memory ? */
> -		if (pfns & range->flags[HMM_PFN_DEVICE_PRIVATE]) {
> -			*write_fault = pfns & range->flags[HMM_PFN_WRITE];
> -			*fault = true;
> -		}
> -		return;
> -	}
>   
>   	/* If CPU page table is not valid then we need to fault */
>   	*fault = !(cpu_flags & range->flags[HMM_PFN_VALID]);
> @@ -259,25 +250,6 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
>   	if (!pte_present(pte)) {
>   		swp_entry_t entry = pte_to_swp_entry(pte);
>   
> -		/*
> -		 * This is a special swap entry, ignore migration, use
> -		 * device and report anything else as error.
> -		 */
> -		if (is_device_private_entry(entry)) {
> -			cpu_flags = range->flags[HMM_PFN_VALID] |
> -				range->flags[HMM_PFN_DEVICE_PRIVATE];
> -			cpu_flags |= is_write_device_private_entry(entry) ?
> -				range->flags[HMM_PFN_WRITE] : 0;
> -			hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
> -					   &fault, &write_fault);
> -			if (fault || write_fault)
> -				goto fault;
> -			*pfn = hmm_device_entry_from_pfn(range,
> -					    swp_offset(entry));
> -			*pfn |= cpu_flags;
> -			return 0;
> -		}
> -
>   		hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0, &fault,
>   				   &write_fault);
>   		if (!fault && !write_fault)
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 18:42   ` Ralph Campbell
@ 2020-03-16 18:49     ` Christoph Hellwig
  2020-03-16 18:58       ` Christoph Hellwig
                         ` (2 more replies)
  2020-03-16 19:04     ` Jason Gunthorpe
  1 sibling, 3 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-03-16 18:49 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Christoph Hellwig, Jason Gunthorpe, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs, Jerome Glisse, kvm-ppc,
	amd-gfx, dri-devel, nouveau, linux-mm

On Mon, Mar 16, 2020 at 11:42:19AM -0700, Ralph Campbell wrote:
>
> On 3/16/20 10:52 AM, Christoph Hellwig wrote:
>> No driver has actually used properly wire up and support this feature.
>> There is various code related to it in nouveau, but as far as I can tell
>> it never actually got turned on, and the only changes since the initial
>> commit are global cleanups.
>
> This is not actually true. OpenCL 2.x does support SVM with nouveau and
> device private memory via clEnqueueSVMMigrateMem().
> Also, Ben Skeggs has accepted a set of patches to map GPU memory after being
> migrated and this change would conflict with that.

Can you explain me how we actually invoke this code?

For that we'd need the HMM_PFN_DEVICE_PRIVATE flag (NVIF_VMM_PFNMAP_V0_VRAM)
to be set in ->pfns before calling hmm_range_fault, which isn't happening.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 18:49     ` Christoph Hellwig
@ 2020-03-16 18:58       ` Christoph Hellwig
  2020-03-16 19:56       ` Ralph Campbell
  2020-03-16 20:09       ` Jason Gunthorpe
  2 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-03-16 18:58 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Christoph Hellwig, Jason Gunthorpe, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs, Jerome Glisse, kvm-ppc,
	amd-gfx, dri-devel, nouveau, linux-mm

On Mon, Mar 16, 2020 at 07:49:35PM +0100, Christoph Hellwig wrote:
> On Mon, Mar 16, 2020 at 11:42:19AM -0700, Ralph Campbell wrote:
> >
> > On 3/16/20 10:52 AM, Christoph Hellwig wrote:
> >> No driver has actually used properly wire up and support this feature.
> >> There is various code related to it in nouveau, but as far as I can tell
> >> it never actually got turned on, and the only changes since the initial
> >> commit are global cleanups.
> >
> > This is not actually true. OpenCL 2.x does support SVM with nouveau and
> > device private memory via clEnqueueSVMMigrateMem().
> > Also, Ben Skeggs has accepted a set of patches to map GPU memory after being
> > migrated and this change would conflict with that.
> 
> Can you explain me how we actually invoke this code?
> 
> For that we'd need HMM_PFN_DEVICE_PRIVATE NVIF_VMM_PFNMAP_V0_VRAM
> set in ->pfns before calling hmm_range_fault, which isn't happening.

Ok, looks like the fault side is unused, but we need to keep the
non-fault path in place.  I'll fix up the patch.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 18:42   ` Ralph Campbell
  2020-03-16 18:49     ` Christoph Hellwig
@ 2020-03-16 19:04     ` Jason Gunthorpe
  2020-03-16 19:07       ` Christoph Hellwig
  1 sibling, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2020-03-16 19:04 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Christoph Hellwig, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs, Jerome Glisse, kvm-ppc,
	amd-gfx, dri-devel, nouveau, linux-mm

On Mon, Mar 16, 2020 at 11:42:19AM -0700, Ralph Campbell wrote:
> 
> On 3/16/20 10:52 AM, Christoph Hellwig wrote:
> > No driver has actually used properly wire up and support this feature.
> > There is various code related to it in nouveau, but as far as I can tell
> > it never actually got turned on, and the only changes since the initial
> > commit are global cleanups.
> 
> This is not actually true. OpenCL 2.x does support SVM with nouveau and
> device private memory via clEnqueueSVMMigrateMem().
> Also, Ben Skeggs has accepted a set of patches to map GPU memory after being
> migrated and this change would conflict with that.

Other than the page_to_dmem() possibly doing container_of on the wrong
type pgmap, are there other bugs here to fix?

Something like this is probably close to the right thing to fix that
and work with Christoph's 1/2 patch - Ralph can you check, test, etc?

diff --git a/mm/hmm.c b/mm/hmm.c
index 9e8f68eb83287a..9fa4748da1b779 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -300,6 +300,20 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 				range->flags[HMM_PFN_DEVICE_PRIVATE];
 			cpu_flags |= is_write_device_private_entry(entry) ?
 				range->flags[HMM_PFN_WRITE] : 0;
+
+			/*
+			 * If the caller understands this kind of device_private
+			 * page, then leave it as is, otherwise fault it.
+			 */
+			hmm_vma_walk->pgmap = get_dev_pagemap(
+				device_private_entry_to_pfn(entry),
+				hmm_vma_walk->pgmap);
+			if (unlikely(!hmm_vma_walk->pgmap))
+				return -EBUSY;
+			if (hmm_vma_walk->pgmap->owner !=
+			    hmm_vma_walk->dev_private_owner)
+				cpu_flags = 0;
+
 			hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
 					   &fault, &write_fault);
 			if (fault || write_fault)

Jason


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 19:04     ` Jason Gunthorpe
@ 2020-03-16 19:07       ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-03-16 19:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Ralph Campbell, Christoph Hellwig, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs, Jerome Glisse, kvm-ppc,
	amd-gfx, dri-devel, nouveau, linux-mm

On Mon, Mar 16, 2020 at 04:04:43PM -0300, Jason Gunthorpe wrote:
> > This is not actually true. OpenCL 2.x does support SVM with nouveau and
> > device private memory via clEnqueueSVMMigrateMem().
> > Also, Ben Skeggs has accepted a set of patches to map GPU memory after being
> > migrated and this change would conflict with that.
> 
> Other than the page_to_dmem() possibly doing container_of on the wrong
> type pgmap, are there other bugs here to fix?

It fixes that bug, as well as making sure everyone gets this check
right by default.

> Something like this is probably close to the right thing to fix that
> and work with Christoph's 1/2 patch - Ralph can you check, test, etc?

I don't think we need the get_dev_pagemap.  For device private pages
we can derive the page using device_private_entry_to_page and just
use that.  Let me resend a proper series.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 18:49     ` Christoph Hellwig
  2020-03-16 18:58       ` Christoph Hellwig
@ 2020-03-16 19:56       ` Ralph Campbell
  2020-03-16 20:09       ` Jason Gunthorpe
  2 siblings, 0 replies; 15+ messages in thread
From: Ralph Campbell @ 2020-03-16 19:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jason Gunthorpe, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs, Jerome Glisse, kvm-ppc,
	amd-gfx, dri-devel, nouveau, linux-mm


On 3/16/20 11:49 AM, Christoph Hellwig wrote:
> On Mon, Mar 16, 2020 at 11:42:19AM -0700, Ralph Campbell wrote:
>>
>> On 3/16/20 10:52 AM, Christoph Hellwig wrote:
>>> No driver has actually used properly wire up and support this feature.
>>> There is various code related to it in nouveau, but as far as I can tell
>>> it never actually got turned on, and the only changes since the initial
>>> commit are global cleanups.
>>
>> This is not actually true. OpenCL 2.x does support SVM with nouveau and
>> device private memory via clEnqueueSVMMigrateMem().
>> Also, Ben Skeggs has accepted a set of patches to map GPU memory after being
>> migrated and this change would conflict with that.
> 
> Can you explain me how we actually invoke this code?

GPU memory is allocated when the device private memory "struct page" is
allocated. See where nouveau_dmem_chunk_alloc() calls nouveau_bo_new().
Then when a page is migrated to the GPU, the GPU memory physical address
is just the offset into the "fake" PFN range allocated by
devm_request_free_mem_region().

I'm looking into allocating GPU memory at the time of migration instead of when
the device private memory struct pages are allocated but that is a future
improvement.

System memory is migrated to GPU memory:
# mesa
clEnqueueSVMMigrateMem()
   svm_migrate_op()
     q.svm_migrate()
       pipe->svm_migrate() // really nvc0_svm_migrate()
         drmCommandWrite() // in libdrm
           drmIoctl()
             ioctl()
               nouveau_drm_ioctl() // nouveau_drm.c
                 drm_ioctl()
                   nouveau_svmm_bind()
                     nouveau_dmem_migrate_vma()
                       migrate_vma_setup()
                       nouveau_dmem_migrate_chunk()
                         nouveau_dmem_migrate_copy_one()
                           // allocate device private struct page
                           dpage = nouveau_dmem_page_alloc_locked()
                             dpage = nouveau_dmem_pages_alloc()
                             // Get GPU VRAM physical address
                             nouveau_dmem_page_addr(dpage)
                             // This does the DMA to GPU memory
                             drm->dmem->migrate.copy_func()
                       migrate_vma_pages()
                       migrate_vma_finalize()

Without my recent patch set, there is no GPU page table entry created for
this migrated memory so there will be a GPU fault which is handled in a
worker thread:
nouveau_svm_fault()
   // examine fault buffer entries and compute range of pages
   nouveau_range_fault()
     // This will fill in the pfns array with a device private entry PFN
     hmm_range_fault()
     // This sees the range->flags[HMM_PFN_DEVICE_PRIVATE] flag
     // and converts the HMM PFN to a GPU physical address
     nouveau_dmem_convert_pfn()
     // This sets up the GPU page tables
     nvif_object_ioctl()

> For that we'd need HMM_PFN_DEVICE_PRIVATE (i.e. NVIF_VMM_PFNMAP_V0_VRAM)
> set in ->pfns before calling hmm_range_fault, which isn't happening.
> 

It is set by hmm_range_fault() via the range->flags[HMM_PFN_DEVICE_PRIVATE] entry
when hmm_range_fault() sees a device private struct page. The call to
nouveau_dmem_convert_pfn() is just replacing the "fake" PFN with the real PFN
but not clearing/changing the read/write or VRAM/system memory PTE bits.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 18:49     ` Christoph Hellwig
  2020-03-16 18:58       ` Christoph Hellwig
  2020-03-16 19:56       ` Ralph Campbell
@ 2020-03-16 20:09       ` Jason Gunthorpe
  2020-03-16 20:24         ` Ralph Campbell
  2 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2020-03-16 20:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ralph Campbell, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs, Jerome Glisse, kvm-ppc,
	amd-gfx, dri-devel, nouveau, linux-mm

On Mon, Mar 16, 2020 at 07:49:35PM +0100, Christoph Hellwig wrote:
> On Mon, Mar 16, 2020 at 11:42:19AM -0700, Ralph Campbell wrote:
> >
> > On 3/16/20 10:52 AM, Christoph Hellwig wrote:
> >> No driver has actually properly wired up and supported this feature.
> >> There is various code related to it in nouveau, but as far as I can tell
> >> it never actually got turned on, and the only changes since the initial
> >> commit are global cleanups.
> >
> > This is not actually true. OpenCL 2.x does support SVM with nouveau and
> > device private memory via clEnqueueSVMMigrateMem().
> > Also, Ben Skeggs has accepted a set of patches to map GPU memory after being
> > migrated and this change would conflict with that.
> 
> Can you explain to me how we actually invoke this code?
> 
> For that we'd need HMM_PFN_DEVICE_PRIVATE (i.e. NVIF_VMM_PFNMAP_V0_VRAM)
> set in ->pfns before calling hmm_range_fault, which isn't happening.

Oh, I got tripped up on this too.

The logic is backwards from what you'd think: if you *set*
HMM_PFN_DEVICE_PRIVATE then this triggers:

hmm_pte_need_fault():
	if ((cpu_flags & range->flags[HMM_PFN_DEVICE_PRIVATE])) {
		/* Do we fault on device memory ? */
		if (pfns & range->flags[HMM_PFN_DEVICE_PRIVATE]) {
			*write_fault = pfns & range->flags[HMM_PFN_WRITE];
			*fault = true;
		}
		return;
	}

I.e. if the CPU page is DEVICE_PRIVATE and the caller sets
HMM_PFN_DEVICE_PRIVATE in the input flags (pfns) then it is always faulted
and HMM_PFN_DEVICE_PRIVATE is never set in the output flags.

So setting 0 enables device_private support, and nouveau is OK.

AMDGPU is broken because it can't handle device private and can't set
the flag to suppress it.

I was going to try and sort this out as part of getting rid of range->flags.

Jason



* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 20:09       ` Jason Gunthorpe
@ 2020-03-16 20:24         ` Ralph Campbell
  2020-03-17 11:56           ` Jason Gunthorpe
  0 siblings, 1 reply; 15+ messages in thread
From: Ralph Campbell @ 2020-03-16 20:24 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Dan Williams, Bharata B Rao, Christian König, Ben Skeggs,
	Jerome Glisse, kvm-ppc, amd-gfx, dri-devel, nouveau, linux-mm


On 3/16/20 1:09 PM, Jason Gunthorpe wrote:
> On Mon, Mar 16, 2020 at 07:49:35PM +0100, Christoph Hellwig wrote:
>> On Mon, Mar 16, 2020 at 11:42:19AM -0700, Ralph Campbell wrote:
>>>
>>> On 3/16/20 10:52 AM, Christoph Hellwig wrote:
>>>> No driver has actually properly wired up and supported this feature.
>>>> There is various code related to it in nouveau, but as far as I can tell
>>>> it never actually got turned on, and the only changes since the initial
>>>> commit are global cleanups.
>>>
>>> This is not actually true. OpenCL 2.x does support SVM with nouveau and
>>> device private memory via clEnqueueSVMMigrateMem().
>>> Also, Ben Skeggs has accepted a set of patches to map GPU memory after being
>>> migrated and this change would conflict with that.
>>
>> Can you explain to me how we actually invoke this code?
>>
>> For that we'd need HMM_PFN_DEVICE_PRIVATE (i.e. NVIF_VMM_PFNMAP_V0_VRAM)
>> set in ->pfns before calling hmm_range_fault, which isn't happening.
> 
> Oh, I got tripped up on this too.
> 
> The logic is backwards from what you'd think: if you *set*
> HMM_PFN_DEVICE_PRIVATE then this triggers:
> 
> hmm_pte_need_fault():
> 	if ((cpu_flags & range->flags[HMM_PFN_DEVICE_PRIVATE])) {
> 		/* Do we fault on device memory ? */
> 		if (pfns & range->flags[HMM_PFN_DEVICE_PRIVATE]) {
> 			*write_fault = pfns & range->flags[HMM_PFN_WRITE];
> 			*fault = true;
> 		}
> 		return;
> 	}
> 
> I.e. if the CPU page is DEVICE_PRIVATE and the caller sets
> HMM_PFN_DEVICE_PRIVATE in the input flags (pfns) then it is always faulted
> and HMM_PFN_DEVICE_PRIVATE is never set in the output flags.
> 
> So setting 0 enables device_private support, and nouveau is OK.
> 
> AMDGPU is broken because it can't handle device private and can't set
> the flag to suppress it.
> 
> I was going to try and sort this out as part of getting rid of range->flags.
> 
> Jason
> 

The reason for it being backwards is that "normally" a device doesn't want
the device private page to be faulted back to system memory; it wants to
get the device private struct page so it can update its page table to point
to the memory already in the device.
Also, a device like Nvidia's GPUs may have an alternate path for copying
one GPU's memory to another (NVLink) without going through system memory,
so getting a device private struct page/PFN from hmm_range_fault() that isn't
"owned" by the faulting GPU is useful.
I agree that the current code isn't well tested or thought out for multiple
devices (RDMA, NVMe drives, GPUs, etc.), but it also ties in with
peer-to-peer access via PCIe.



* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-16 20:24         ` Ralph Campbell
@ 2020-03-17 11:56           ` Jason Gunthorpe
  2020-03-17 22:46             ` Ralph Campbell
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2020-03-17 11:56 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Christoph Hellwig, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs, Jerome Glisse, kvm-ppc,
	amd-gfx, dri-devel, nouveau, linux-mm

On Mon, Mar 16, 2020 at 01:24:09PM -0700, Ralph Campbell wrote:

> The reason for it being backwards is that "normally" a device doesn't want
> the device private page to be faulted back to system memory; it wants to
> get the device private struct page so it can update its page table to point
> to the memory already in the device.

The "backwards" part is that you set the flag on input and never get it on
output; clear the flag on input and you may get it on output.

Compare with valid or write, which don't work that way.

> Also, a device like Nvidia's GPUs may have an alternate path for copying
> one GPU's memory to another (NVLink) without going through system memory,
> so getting a device private struct page/PFN from hmm_range_fault() that isn't
> "owned" by the faulting GPU is useful.
> I agree that the current code isn't well tested or thought out for multiple
> devices (RDMA, NVMe drives, GPUs, etc.), but it also ties in with
> peer-to-peer access via PCIe.

I think the series here is a big improvement. The GPU driver can set
owners that match the hidden cluster interconnect.

Jason



* Re: [PATCH 2/2] mm: remove device private page support from hmm_range_fault
  2020-03-17 11:56           ` Jason Gunthorpe
@ 2020-03-17 22:46             ` Ralph Campbell
  0 siblings, 0 replies; 15+ messages in thread
From: Ralph Campbell @ 2020-03-17 22:46 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Dan Williams, Bharata B Rao,
	Christian König, Ben Skeggs, Jerome Glisse, kvm-ppc,
	amd-gfx, dri-devel, nouveau, linux-mm


On 3/17/20 4:56 AM, Jason Gunthorpe wrote:
> On Mon, Mar 16, 2020 at 01:24:09PM -0700, Ralph Campbell wrote:
> 
>> The reason for it being backwards is that "normally" a device doesn't want
>> the device private page to be faulted back to system memory; it wants to
>> get the device private struct page so it can update its page table to point
>> to the memory already in the device.
> 
> The "backwards" part is that you set the flag on input and never get it on
> output; clear the flag on input and you may get it on output.
> 
> Compare with valid or write, which don't work that way.
> 
>> Also, a device like Nvidia's GPUs may have an alternate path for copying
>> one GPU's memory to another (NVLink) without going through system memory,
>> so getting a device private struct page/PFN from hmm_range_fault() that isn't
>> "owned" by the faulting GPU is useful.
>> I agree that the current code isn't well tested or thought out for multiple
>> devices (RDMA, NVMe drives, GPUs, etc.), but it also ties in with
>> peer-to-peer access via PCIe.
> 
> I think the series here is a big improvement. The GPU driver can set
> owners that match the hidden cluster interconnect.
> 
> Jason
> 

I agree this is an improvement. I was just thinking about possible future use cases.



end of thread, other threads:[~2020-03-17 22:46 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-16 17:52 ensure device private pages have an owner Christoph Hellwig
2020-03-16 17:52 ` [PATCH 1/2] mm: handle multiple owners of device private pages in migrate_vma Christoph Hellwig
2020-03-16 18:17   ` Jason Gunthorpe
2020-03-16 18:20     ` Christoph Hellwig
2020-03-16 17:52 ` [PATCH 2/2] mm: remove device private page support from hmm_range_fault Christoph Hellwig
2020-03-16 18:42   ` Ralph Campbell
2020-03-16 18:49     ` Christoph Hellwig
2020-03-16 18:58       ` Christoph Hellwig
2020-03-16 19:56       ` Ralph Campbell
2020-03-16 20:09       ` Jason Gunthorpe
2020-03-16 20:24         ` Ralph Campbell
2020-03-17 11:56           ` Jason Gunthorpe
2020-03-17 22:46             ` Ralph Campbell
2020-03-16 19:04     ` Jason Gunthorpe
2020-03-16 19:07       ` Christoph Hellwig
