All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] handle userptr corner cases with HMM path
@ 2019-03-05 18:09 Yang, Philip
       [not found] ` <20190305180915.15216-1-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Yang, Philip @ 2019-03-05 18:09 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Yang, Philip

Those corner cases are found by kfdtest.KFDIPCTest.

Philip Yang (3):
  drm/amdkfd: support concurrent userptr update for HMM
  drm/amdgpu: support userptr cross VMAs case with HMM
  drm/amdgpu: more descriptive message if HMM not enabled

 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  28 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h        |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       | 122 +++++++++++++-----
 3 files changed, 110 insertions(+), 42 deletions(-)

-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] drm/amdkfd: support concurrent userptr update for HMM
       [not found] ` <20190305180915.15216-1-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
@ 2019-03-05 18:09   ` Yang, Philip
       [not found]     ` <20190305180915.15216-2-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
  2019-03-05 18:09   ` [PATCH 2/3] drm/amdgpu: support userptr cross VMAs case with HMM Yang, Philip
  2019-03-05 18:09   ` [PATCH 3/3] drm/amdgpu: more descriptive message if HMM not enabled Yang, Philip
  2 siblings, 1 reply; 11+ messages in thread
From: Yang, Philip @ 2019-03-05 18:09 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Yang, Philip

Userptr restore may have concurrent userptr invalidation after
hmm_vma_fault adds the range to the hmm->ranges list, needs call
hmm_vma_range_done to remove the range from hmm->ranges list first,
then reschedule the restore worker. Otherwise hmm_vma_fault will add
same range to the list, this will cause loop in the list because
range->next point to range itself.

Add function untrack_invalid_user_pages to reduce code duplication.

Change-Id: I31407739dc10554f8e418c7a0e0415d3d95552f1
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 28 ++++++++++++++-----
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 314c048fcac6..783d760ccfe3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1731,6 +1731,23 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info,
 	return 0;
 }
 
+/* Remove invalid userptr BOs from hmm track list
+ *
+ * Stop HMM track the userptr update
+ */
+static void untrack_invalid_user_pages(struct amdkfd_process_info *process_info)
+{
+	struct kgd_mem *mem, *tmp_mem;
+	struct amdgpu_bo *bo;
+
+	list_for_each_entry_safe(mem, tmp_mem,
+				 &process_info->userptr_inval_list,
+				 validate_list.head) {
+		bo = mem->bo;
+		amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
+	}
+}
+
 /* Validate invalid userptr BOs
  *
  * Validates BOs on the userptr_inval_list, and moves them back to the
@@ -1848,12 +1865,7 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
 out_free:
 	kfree(pd_bo_list_entries);
 out_no_mem:
-	list_for_each_entry_safe(mem, tmp_mem,
-				 &process_info->userptr_inval_list,
-				 validate_list.head) {
-		bo = mem->bo;
-		amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
-	}
+	untrack_invalid_user_pages(process_info);
 
 	return ret;
 }
@@ -1897,8 +1909,10 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct work_struct *work)
 	 * and we can just restart the queues.
 	 */
 	if (!list_empty(&process_info->userptr_inval_list)) {
-		if (atomic_read(&process_info->evicted_bos) != evicted_bos)
+		if (atomic_read(&process_info->evicted_bos) != evicted_bos) {
+			untrack_invalid_user_pages(process_info);
 			goto unlock_out; /* Concurrent eviction, try again */
+		}
 
 		if (validate_invalid_user_pages(process_info))
 			goto unlock_out;
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] drm/amdgpu: support userptr cross VMAs case with HMM
       [not found] ` <20190305180915.15216-1-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
  2019-03-05 18:09   ` [PATCH 1/3] drm/amdkfd: support concurrent userptr update for HMM Yang, Philip
@ 2019-03-05 18:09   ` Yang, Philip
       [not found]     ` <20190305180915.15216-3-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
  2019-03-05 18:09   ` [PATCH 3/3] drm/amdgpu: more descriptive message if HMM not enabled Yang, Philip
  2 siblings, 1 reply; 11+ messages in thread
From: Yang, Philip @ 2019-03-05 18:09 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Yang, Philip

userptr may cross two VMAs if the forked child process (not call exec
after fork) malloc buffer, then free it, and then malloc larger size
buf, kerenl will create new VMA adjacent to old VMA which was cloned
from parent process, some pages of userptr are in the first VMA, the
rest pages are in the second VMA.

HMM expects range only have one VMA, loop over all VMAs in the address
range, create multiple ranges to handle this case. See
is_mergeable_anon_vma in mm/mmap.c for details.

Change-Id: I0ca8c77e28deabccc139906f9ffee04b7e383314
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 122 +++++++++++++++++-------
 1 file changed, 87 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index cd0ccfbbcb84..173bf4db5994 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -711,7 +711,8 @@ struct amdgpu_ttm_tt {
 	struct task_struct	*usertask;
 	uint32_t		userflags;
 #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
-	struct hmm_range	range;
+	struct hmm_range	*ranges;
+	int			nr_ranges;
 #endif
 };
 
@@ -723,62 +724,104 @@ struct amdgpu_ttm_tt {
  * once afterwards to stop HMM tracking
  */
 #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
+
+/* Support Userptr pages cross max 16 vmas */
+#define MAX_NR_VMAS	(16)
+
 int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages)
 {
 	struct amdgpu_ttm_tt *gtt = (void *)ttm;
 	struct mm_struct *mm = gtt->usertask->mm;
-	unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
-	struct hmm_range *range = &gtt->range;
-	int r = 0, i;
+	unsigned long start = gtt->userptr;
+	unsigned long end = start + ttm->num_pages * PAGE_SIZE;
+	struct hmm_range *ranges;
+	struct vm_area_struct *vma = NULL, *vmas[MAX_NR_VMAS];
+	uint64_t *pfns, f;
+	int r = 0, i, nr_pages;
 
 	if (!mm) /* Happens during process shutdown */
 		return -ESRCH;
 
-	amdgpu_hmm_init_range(range);
-
 	down_read(&mm->mmap_sem);
 
-	range->vma = find_vma(mm, gtt->userptr);
-	if (!range_in_vma(range->vma, gtt->userptr, end))
-		r = -EFAULT;
-	else if ((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
-		range->vma->vm_file)
+	/* user pages may cross multiple VMAs */
+	gtt->nr_ranges = 0;
+	do {
+		vma = find_vma(mm, vma ? vma->vm_end : start);
+		if (unlikely(!vma)) {
+			r = -EFAULT;
+			goto out;
+		}
+		vmas[gtt->nr_ranges++] = vma;
+		if (gtt->nr_ranges >= MAX_NR_VMAS) {
+			DRM_ERROR("invalid userptr range\n");
+			r = -EFAULT;
+			goto out;
+		}
+	} while (end > vma->vm_end);
+
+	DRM_DEBUG_DRIVER("0x%lx nr_ranges %d pages 0x%lx\n",
+		start, gtt->nr_ranges, ttm->num_pages);
+
+	if (unlikely((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
+		vmas[0]->vm_file)) {
 		r = -EPERM;
-	if (r)
 		goto out;
+	}
 
-	range->pfns = kvmalloc_array(ttm->num_pages, sizeof(uint64_t),
-				     GFP_KERNEL);
-	if (range->pfns == NULL) {
+	ranges = kvmalloc_array(gtt->nr_ranges, sizeof(*ranges), GFP_KERNEL);
+	if (unlikely(!ranges)) {
 		r = -ENOMEM;
 		goto out;
 	}
-	range->start = gtt->userptr;
-	range->end = end;
 
-	range->pfns[0] = range->flags[HMM_PFN_VALID];
-	range->pfns[0] |= amdgpu_ttm_tt_is_readonly(ttm) ?
-				0 : range->flags[HMM_PFN_WRITE];
-	for (i = 1; i < ttm->num_pages; i++)
-		range->pfns[i] = range->pfns[0];
+	pfns = kvmalloc_array(ttm->num_pages, sizeof(*pfns), GFP_KERNEL);
+	if (unlikely(!pfns)) {
+		r = -ENOMEM;
+		goto out_free_ranges;
+	}
+
+	for (i = 0; i < gtt->nr_ranges; i++)
+		amdgpu_hmm_init_range(&ranges[i]);
+
+	f = ranges[0].flags[HMM_PFN_VALID];
+	f |= amdgpu_ttm_tt_is_readonly(ttm) ?
+				0 : ranges[0].flags[HMM_PFN_WRITE];
+	memset64(pfns, f, ttm->num_pages);
+
+	for (nr_pages = 0, i = 0; i < gtt->nr_ranges; i++) {
+		ranges[i].vma = vmas[i];
+		ranges[i].start = max(start, vmas[i]->vm_start);
+		ranges[i].end = min(end, vmas[i]->vm_end);
+		ranges[i].pfns = pfns + nr_pages;
+		nr_pages += (ranges[i].end - ranges[i].start) / PAGE_SIZE;
+
+		r = hmm_vma_fault(&ranges[i], true);
+		if (unlikely(r))
+			break;
+	}
+	if (unlikely(r)) {
+		while (i--)
+			hmm_vma_range_done(&ranges[i]);
 
-	/* This may trigger page table update */
-	r = hmm_vma_fault(range, true);
-	if (r)
 		goto out_free_pfns;
+	}
 
 	up_read(&mm->mmap_sem);
 
 	for (i = 0; i < ttm->num_pages; i++)
-		pages[i] = hmm_pfn_to_page(range, range->pfns[i]);
+		pages[i] = hmm_pfn_to_page(&ranges[0], pfns[i]);
+	gtt->ranges = ranges;
 
 	return 0;
 
 out_free_pfns:
-	kvfree(range->pfns);
-	range->pfns = NULL;
+	kvfree(pfns);
+out_free_ranges:
+	kvfree(ranges);
 out:
 	up_read(&mm->mmap_sem);
+
 	return r;
 }
 
@@ -792,15 +835,23 @@ bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
 {
 	struct amdgpu_ttm_tt *gtt = (void *)ttm;
 	bool r = false;
+	int i;
 
 	if (!gtt || !gtt->userptr)
 		return false;
 
-	WARN_ONCE(!gtt->range.pfns, "No user pages to check\n");
-	if (gtt->range.pfns) {
-		r = hmm_vma_range_done(&gtt->range);
-		kvfree(gtt->range.pfns);
-		gtt->range.pfns = NULL;
+	DRM_DEBUG_DRIVER("user_pages_done 0x%llx nr_ranges %d pages 0x%lx\n",
+		gtt->userptr, gtt->nr_ranges, ttm->num_pages);
+
+	WARN_ONCE(!gtt->ranges || !gtt->ranges[0].pfns,
+		"No user pages to check\n");
+
+	if (gtt->ranges) {
+		for (i = 0; i < gtt->nr_ranges; i++)
+			r |= hmm_vma_range_done(&gtt->ranges[i]);
+		kvfree(gtt->ranges[0].pfns);
+		kvfree(gtt->ranges);
+		gtt->ranges = NULL;
 	}
 
 	return r;
@@ -884,8 +935,9 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
 	sg_free_table(ttm->sg);
 
 #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
-	if (gtt->range.pfns &&
-	    ttm->pages[0] == hmm_pfn_to_page(&gtt->range, gtt->range.pfns[0]))
+	if (gtt->ranges &&
+	    ttm->pages[0] == hmm_pfn_to_page(&gtt->ranges[0],
+					     gtt->ranges[0].pfns[0]))
 		WARN_ONCE(1, "Missing get_user_page_done\n");
 #endif
 }
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] drm/amdgpu: more descriptive message if HMM not enabled
       [not found] ` <20190305180915.15216-1-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
  2019-03-05 18:09   ` [PATCH 1/3] drm/amdkfd: support concurrent userptr update for HMM Yang, Philip
  2019-03-05 18:09   ` [PATCH 2/3] drm/amdgpu: support userptr cross VMAs case with HMM Yang, Philip
@ 2019-03-05 18:09   ` Yang, Philip
       [not found]     ` <20190305180915.15216-4-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
  2 siblings, 1 reply; 11+ messages in thread
From: Yang, Philip @ 2019-03-05 18:09 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Yang, Philip

If using old kernel config file, CONFIG_ZONE_DEVICE is not selected,
so CONFIG_HMM and CONFIG_HMM_MIRROR is not enabled, the current driver
error message "Failed to register MMU notifier" is not clear. Inform
user with more descriptive message on how to fix the missing kernel
config option.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109808

Change-Id: Idfebaeababa4c37c1ef093c2b91a26910a167585
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
index 4803e216e174..b155cac52aca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
@@ -53,6 +53,8 @@ static inline struct amdgpu_mn *amdgpu_mn_get(struct amdgpu_device *adev,
 }
 static inline int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
 {
+	DRM_ERROR("HMM_MIRROR kernel config option is not enabled\n");
+	DRM_ERROR("add CONFIG_ZONE_DEVICE=y in config file to fix this\n");
 	return -ENODEV;
 }
 static inline void amdgpu_mn_unregister(struct amdgpu_bo *bo) {}
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: more descriptive message if HMM not enabled
       [not found]     ` <20190305180915.15216-4-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
@ 2019-03-06  9:05       ` Michel Dänzer
       [not found]         ` <9ac0b1ab-891e-9fe6-667e-f77a244dd862-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Michel Dänzer @ 2019-03-06  9:05 UTC (permalink / raw)
  To: Yang, Philip; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2019-03-05 7:09 p.m., Yang, Philip wrote:
> If using old kernel config file, CONFIG_ZONE_DEVICE is not selected,
> so CONFIG_HMM and CONFIG_HMM_MIRROR is not enabled, the current driver
> error message "Failed to register MMU notifier" is not clear. Inform
> user with more descriptive message on how to fix the missing kernel
> config option.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109808
> 
> Change-Id: Idfebaeababa4c37c1ef093c2b91a26910a167585
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
> index 4803e216e174..b155cac52aca 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
> @@ -53,6 +53,8 @@ static inline struct amdgpu_mn *amdgpu_mn_get(struct amdgpu_device *adev,
>  }
>  static inline int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
>  {
> +	DRM_ERROR("HMM_MIRROR kernel config option is not enabled\n");
> +	DRM_ERROR("add CONFIG_ZONE_DEVICE=y in config file to fix this\n");
>  	return -ENODEV;
>  }
>  static inline void amdgpu_mn_unregister(struct amdgpu_bo *bo) {}
> 

amdgpu_gem_userptr_ioctl calls amdgpu_mn_register if userspace sets the
AMDGPU_GEM_USERPTR_REGISTER flag, so buggy/malicious userspace could
spam dmesg with these errors. I'm afraid a different solution is needed.


-- 
Earthling Michel Dänzer               |              https://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: more descriptive message if HMM not enabled
       [not found]         ` <9ac0b1ab-891e-9fe6-667e-f77a244dd862-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2019-03-06 14:42           ` Yang, Philip
       [not found]             ` <baa08fbf-8bb0-6ad7-9e62-b9ebdf88cf72-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Yang, Philip @ 2019-03-06 14:42 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 2019-03-06 4:05 a.m., Michel Dänzer wrote:
> On 2019-03-05 7:09 p.m., Yang, Philip wrote:
>> If using old kernel config file, CONFIG_ZONE_DEVICE is not selected,
>> so CONFIG_HMM and CONFIG_HMM_MIRROR is not enabled, the current driver
>> error message "Failed to register MMU notifier" is not clear. Inform
>> user with more descriptive message on how to fix the missing kernel
>> config option.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109808
>>
>> Change-Id: Idfebaeababa4c37c1ef093c2b91a26910a167585
>> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
>> index 4803e216e174..b155cac52aca 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
>> @@ -53,6 +53,8 @@ static inline struct amdgpu_mn *amdgpu_mn_get(struct amdgpu_device *adev,
>>   }
>>   static inline int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
>>   {
>> +	DRM_ERROR("HMM_MIRROR kernel config option is not enabled\n");
>> +	DRM_ERROR("add CONFIG_ZONE_DEVICE=y in config file to fix this\n");
>>   	return -ENODEV;
>>   }
>>   static inline void amdgpu_mn_unregister(struct amdgpu_bo *bo) {}
>>
> 
> amdgpu_gem_userptr_ioctl calls amdgpu_mn_register if userspace sets the
> AMDGPU_GEM_USERPTR_REGISTER flag, so buggy/malicious userspace could
> spam dmesg with these errors. I'm afraid a different solution is needed.
> 
> The error message is inside dummy function amdgpu_mn_register, it only 
shows if kernel config option HMM is not enabled. Driver does need HMM 
kernel option to support userptr. We want to provide obvious message to 
users to update kernel config file from old kernel, otherwise all 
userptr related functions are broken. I will change this to WARN_ONCE to 
avoid spam dmesg.

Regards,
Philip
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: more descriptive message if HMM not enabled
       [not found]             ` <baa08fbf-8bb0-6ad7-9e62-b9ebdf88cf72-5C7GfCeVMHo@public.gmane.org>
@ 2019-03-06 14:53               ` Michel Dänzer
  0 siblings, 0 replies; 11+ messages in thread
From: Michel Dänzer @ 2019-03-06 14:53 UTC (permalink / raw)
  To: Yang, Philip; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2019-03-06 3:42 p.m., Yang, Philip wrote:
> On 2019-03-06 4:05 a.m., Michel Dänzer wrote:
>> On 2019-03-05 7:09 p.m., Yang, Philip wrote:
>>> If using old kernel config file, CONFIG_ZONE_DEVICE is not selected,
>>> so CONFIG_HMM and CONFIG_HMM_MIRROR is not enabled, the current driver
>>> error message "Failed to register MMU notifier" is not clear. Inform
>>> user with more descriptive message on how to fix the missing kernel
>>> config option.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109808
>>>
>>> Change-Id: Idfebaeababa4c37c1ef093c2b91a26910a167585
>>> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
>>> index 4803e216e174..b155cac52aca 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h
>>> @@ -53,6 +53,8 @@ static inline struct amdgpu_mn *amdgpu_mn_get(struct amdgpu_device *adev,
>>>   }
>>>   static inline int amdgpu_mn_register(struct amdgpu_bo *bo, unsigned long addr)
>>>   {
>>> +	DRM_ERROR("HMM_MIRROR kernel config option is not enabled\n");
>>> +	DRM_ERROR("add CONFIG_ZONE_DEVICE=y in config file to fix this\n");
>>>   	return -ENODEV;
>>>   }
>>>   static inline void amdgpu_mn_unregister(struct amdgpu_bo *bo) {}
>>>
>>
>> amdgpu_gem_userptr_ioctl calls amdgpu_mn_register if userspace sets the
>> AMDGPU_GEM_USERPTR_REGISTER flag, so buggy/malicious userspace could
>> spam dmesg with these errors. I'm afraid a different solution is needed.
>>
> The error message is inside dummy function amdgpu_mn_register, it only 
> shows if kernel config option HMM is not enabled. Driver does need HMM 
> kernel option to support userptr. We want to provide obvious message to 
> users to update kernel config file from old kernel, otherwise all 
> userptr related functions are broken. I will change this to WARN_ONCE to 
> avoid spam dmesg.

I hope you mean DRM_WARN_ONCE. :) Plain WARN_ONCE is probably also too
noisy/scary for something that can be triggered by userspace.


-- 
Earthling Michel Dänzer               |              https://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] drm/amdkfd: support concurrent userptr update for HMM
       [not found]     ` <20190305180915.15216-2-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
@ 2019-03-06 20:01       ` Kuehling, Felix
       [not found]         ` <772b1109-b6db-05cc-4f36-6428724ae4a3-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Kuehling, Felix @ 2019-03-06 20:01 UTC (permalink / raw)
  To: Yang, Philip, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hmm, I'm not sure. This change probably fixes this issue, but there may 
be other similar corner cases in other situations where the restore 
worker fails and needs to retry. The better place to call untrack in  
amdgpu_amdkfd_restore_userptr_worker would be at the very end. Anything 
that's left in the userptr_inval_list at that point needs to be untracked.

For now as a quick fix for an urgent bug, this change is Reviewed-by: 
Felix Kuehling <Felix.Kuehling@amd.com>. But please revisit this and 
check if there are similar corner cases as I explained above.

Regards,
   Felix

On 3/5/2019 1:09 PM, Yang, Philip wrote:
> Userptr restore may have concurrent userptr invalidation after
> hmm_vma_fault adds the range to the hmm->ranges list, needs call
> hmm_vma_range_done to remove the range from hmm->ranges list first,
> then reschedule the restore worker. Otherwise hmm_vma_fault will add
> same range to the list, this will cause loop in the list because
> range->next point to range itself.
>
> Add function untrack_invalid_user_pages to reduce code duplication.
>
> Change-Id: I31407739dc10554f8e418c7a0e0415d3d95552f1
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 28 ++++++++++++++-----
>   1 file changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 314c048fcac6..783d760ccfe3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1731,6 +1731,23 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info,
>   	return 0;
>   }
>   
> +/* Remove invalid userptr BOs from hmm track list
> + *
> + * Stop HMM track the userptr update
> + */
> +static void untrack_invalid_user_pages(struct amdkfd_process_info *process_info)
> +{
> +	struct kgd_mem *mem, *tmp_mem;
> +	struct amdgpu_bo *bo;
> +
> +	list_for_each_entry_safe(mem, tmp_mem,
> +				 &process_info->userptr_inval_list,
> +				 validate_list.head) {
> +		bo = mem->bo;
> +		amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
> +	}
> +}
> +
>   /* Validate invalid userptr BOs
>    *
>    * Validates BOs on the userptr_inval_list, and moves them back to the
> @@ -1848,12 +1865,7 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
>   out_free:
>   	kfree(pd_bo_list_entries);
>   out_no_mem:
> -	list_for_each_entry_safe(mem, tmp_mem,
> -				 &process_info->userptr_inval_list,
> -				 validate_list.head) {
> -		bo = mem->bo;
> -		amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
> -	}
> +	untrack_invalid_user_pages(process_info);
>   
>   	return ret;
>   }
> @@ -1897,8 +1909,10 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct work_struct *work)
>   	 * and we can just restart the queues.
>   	 */
>   	if (!list_empty(&process_info->userptr_inval_list)) {
> -		if (atomic_read(&process_info->evicted_bos) != evicted_bos)
> +		if (atomic_read(&process_info->evicted_bos) != evicted_bos) {
> +			untrack_invalid_user_pages(process_info);
>   			goto unlock_out; /* Concurrent eviction, try again */
> +		}
>   
>   		if (validate_invalid_user_pages(process_info))
>   			goto unlock_out;
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] drm/amdgpu: support userptr cross VMAs case with HMM
       [not found]     ` <20190305180915.15216-3-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
@ 2019-03-06 20:11       ` Kuehling, Felix
       [not found]         ` <f51eaecd-8d50-0eea-aac6-2905f945a2c2-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Kuehling, Felix @ 2019-03-06 20:11 UTC (permalink / raw)
  To: Yang, Philip, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Some comments inline ...

On 3/5/2019 1:09 PM, Yang, Philip wrote:
> userptr may cross two VMAs if the forked child process (not call exec
> after fork) malloc buffer, then free it, and then malloc larger size
> buf, kerenl will create new VMA adjacent to old VMA which was cloned
> from parent process, some pages of userptr are in the first VMA, the
> rest pages are in the second VMA.
>
> HMM expects range only have one VMA, loop over all VMAs in the address
> range, create multiple ranges to handle this case. See
> is_mergeable_anon_vma in mm/mmap.c for details.
>
> Change-Id: I0ca8c77e28deabccc139906f9ffee04b7e383314
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 122 +++++++++++++++++-------
>   1 file changed, 87 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index cd0ccfbbcb84..173bf4db5994 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -711,7 +711,8 @@ struct amdgpu_ttm_tt {
>   	struct task_struct	*usertask;
>   	uint32_t		userflags;
>   #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
> -	struct hmm_range	range;
> +	struct hmm_range	*ranges;
> +	int			nr_ranges;
>   #endif
>   };
>   
> @@ -723,62 +724,104 @@ struct amdgpu_ttm_tt {
>    * once afterwards to stop HMM tracking
>    */
>   #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
> +
> +/* Support Userptr pages cross max 16 vmas */
> +#define MAX_NR_VMAS	(16)
> +
>   int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages)
>   {
>   	struct amdgpu_ttm_tt *gtt = (void *)ttm;
>   	struct mm_struct *mm = gtt->usertask->mm;
> -	unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
> -	struct hmm_range *range = &gtt->range;
> -	int r = 0, i;
> +	unsigned long start = gtt->userptr;
> +	unsigned long end = start + ttm->num_pages * PAGE_SIZE;
> +	struct hmm_range *ranges;
> +	struct vm_area_struct *vma = NULL, *vmas[MAX_NR_VMAS];
> +	uint64_t *pfns, f;
> +	int r = 0, i, nr_pages;
>   
>   	if (!mm) /* Happens during process shutdown */
>   		return -ESRCH;
>   
> -	amdgpu_hmm_init_range(range);
> -
>   	down_read(&mm->mmap_sem);
>   
> -	range->vma = find_vma(mm, gtt->userptr);
> -	if (!range_in_vma(range->vma, gtt->userptr, end))
> -		r = -EFAULT;
> -	else if ((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
> -		range->vma->vm_file)
> +	/* user pages may cross multiple VMAs */
> +	gtt->nr_ranges = 0;
> +	do {
> +		vma = find_vma(mm, vma ? vma->vm_end : start);
> +		if (unlikely(!vma)) {
> +			r = -EFAULT;
> +			goto out;
> +		}
> +		vmas[gtt->nr_ranges++] = vma;
> +		if (gtt->nr_ranges >= MAX_NR_VMAS) {

This will lead to a failure when you have exactly 16 VMAs. If you move 
the check to the start of the loop, it will only trigger when you exceed 
the limit not just after you reach it.


> +			DRM_ERROR("invalid userptr range\n");

The userptr range is not really invalid. It only exceeds some artificial 
limitation in this code. A message like "Too many VMAs in userptr range" 
would be more appropriate.


> +			r = -EFAULT;
> +			goto out;
> +		}
> +	} while (end > vma->vm_end);
> +
> +	DRM_DEBUG_DRIVER("0x%lx nr_ranges %d pages 0x%lx\n",
> +		start, gtt->nr_ranges, ttm->num_pages);
> +
> +	if (unlikely((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
> +		vmas[0]->vm_file)) {
>   		r = -EPERM;
> -	if (r)
>   		goto out;
> +	}
>   
> -	range->pfns = kvmalloc_array(ttm->num_pages, sizeof(uint64_t),
> -				     GFP_KERNEL);
> -	if (range->pfns == NULL) {
> +	ranges = kvmalloc_array(gtt->nr_ranges, sizeof(*ranges), GFP_KERNEL);
> +	if (unlikely(!ranges)) {
>   		r = -ENOMEM;
>   		goto out;
>   	}
> -	range->start = gtt->userptr;
> -	range->end = end;
>   
> -	range->pfns[0] = range->flags[HMM_PFN_VALID];
> -	range->pfns[0] |= amdgpu_ttm_tt_is_readonly(ttm) ?
> -				0 : range->flags[HMM_PFN_WRITE];
> -	for (i = 1; i < ttm->num_pages; i++)
> -		range->pfns[i] = range->pfns[0];
> +	pfns = kvmalloc_array(ttm->num_pages, sizeof(*pfns), GFP_KERNEL);
> +	if (unlikely(!pfns)) {
> +		r = -ENOMEM;
> +		goto out_free_ranges;
> +	}
> +
> +	for (i = 0; i < gtt->nr_ranges; i++)
> +		amdgpu_hmm_init_range(&ranges[i]);
> +
> +	f = ranges[0].flags[HMM_PFN_VALID];
> +	f |= amdgpu_ttm_tt_is_readonly(ttm) ?
> +				0 : ranges[0].flags[HMM_PFN_WRITE];
> +	memset64(pfns, f, ttm->num_pages);
> +
> +	for (nr_pages = 0, i = 0; i < gtt->nr_ranges; i++) {
> +		ranges[i].vma = vmas[i];
> +		ranges[i].start = max(start, vmas[i]->vm_start);
> +		ranges[i].end = min(end, vmas[i]->vm_end);
> +		ranges[i].pfns = pfns + nr_pages;
> +		nr_pages += (ranges[i].end - ranges[i].start) / PAGE_SIZE;
> +
> +		r = hmm_vma_fault(&ranges[i], true);
> +		if (unlikely(r))
> +			break;
> +	}
> +	if (unlikely(r)) {
> +		while (i--)
> +			hmm_vma_range_done(&ranges[i]);
>   
> -	/* This may trigger page table update */
> -	r = hmm_vma_fault(range, true);
> -	if (r)
>   		goto out_free_pfns;
> +	}
>   
>   	up_read(&mm->mmap_sem);
>   
>   	for (i = 0; i < ttm->num_pages; i++)
> -		pages[i] = hmm_pfn_to_page(range, range->pfns[i]);
> +		pages[i] = hmm_pfn_to_page(&ranges[0], pfns[i]);

This looks wrong. Don't you need to use the correct range here instead 
of range[0]?


> +	gtt->ranges = ranges;
>   
>   	return 0;
>   
>   out_free_pfns:
> -	kvfree(range->pfns);
> -	range->pfns = NULL;
> +	kvfree(pfns);
> +out_free_ranges:
> +	kvfree(ranges);
>   out:
>   	up_read(&mm->mmap_sem);
> +
>   	return r;
>   }
>   
> @@ -792,15 +835,23 @@ bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
>   {
>   	struct amdgpu_ttm_tt *gtt = (void *)ttm;
>   	bool r = false;
> +	int i;
>   
>   	if (!gtt || !gtt->userptr)
>   		return false;
>   
> -	WARN_ONCE(!gtt->range.pfns, "No user pages to check\n");
> -	if (gtt->range.pfns) {
> -		r = hmm_vma_range_done(&gtt->range);
> -		kvfree(gtt->range.pfns);
> -		gtt->range.pfns = NULL;
> +	DRM_DEBUG_DRIVER("user_pages_done 0x%llx nr_ranges %d pages 0x%lx\n",
> +		gtt->userptr, gtt->nr_ranges, ttm->num_pages);
> +
> +	WARN_ONCE(!gtt->ranges || !gtt->ranges[0].pfns,
> +		"No user pages to check\n");
> +
> +	if (gtt->ranges) {
> +		for (i = 0; i < gtt->nr_ranges; i++)
> +			r |= hmm_vma_range_done(&gtt->ranges[i]);
> +		kvfree(gtt->ranges[0].pfns);

Shouldn't this be in the loop and apply to gtt->ranges[i].pfns?

Regards,
   Felix

> +		kvfree(gtt->ranges);
> +		gtt->ranges = NULL;
>   	}
>   
>   	return r;
> @@ -884,8 +935,9 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
>   	sg_free_table(ttm->sg);
>   
>   #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
> -	if (gtt->range.pfns &&
> -	    ttm->pages[0] == hmm_pfn_to_page(&gtt->range, gtt->range.pfns[0]))
> +	if (gtt->ranges &&
> +	    ttm->pages[0] == hmm_pfn_to_page(&gtt->ranges[0],
> +					     gtt->ranges[0].pfns[0]))
>   		WARN_ONCE(1, "Missing get_user_page_done\n");
>   #endif
>   }
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] drm/amdkfd: support concurrent userptr update for HMM
       [not found]         ` <772b1109-b6db-05cc-4f36-6428724ae4a3-5C7GfCeVMHo@public.gmane.org>
@ 2019-03-07  0:54           ` Yang, Philip
  0 siblings, 0 replies; 11+ messages in thread
From: Yang, Philip @ 2019-03-07  0:54 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Hi Felix,

Thanks, there are other corner cases, call untrack at end of restore 
userptr worker is better place to cleanup. I will submit v2 patch, to 
fix this issue completely.

Philip

On 2019-03-06 3:01 p.m., Kuehling, Felix wrote:
> Hmm, I'm not sure. This change probably fixes this issue, but there may
> be other similar corner cases in other situations where the restore
> worker fails and needs to retry. The better place to call untrack in
> amdgpu_amdkfd_restore_userptr_worker would be at the very end. Anything
> that's left in the userptr_inval_list at that point needs to be untracked.
> 
> For now as a quick fix for an urgent bug, this change is Reviewed-by:
> Felix Kuehling <Felix.Kuehling@amd.com>. But please revisit this and
> check if there are similar corner cases as I explained above.
> 
> Regards,
>     Felix
> 
> On 3/5/2019 1:09 PM, Yang, Philip wrote:
>> Userptr restore may have concurrent userptr invalidation after
>> hmm_vma_fault adds the range to the hmm->ranges list, needs call
>> hmm_vma_range_done to remove the range from hmm->ranges list first,
>> then reschedule the restore worker. Otherwise hmm_vma_fault will add
>> same range to the list, this will cause loop in the list because
>> range->next point to range itself.
>>
>> Add function untrack_invalid_user_pages to reduce code duplication.
>>
>> Change-Id: I31407739dc10554f8e418c7a0e0415d3d95552f1
>> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>> ---
>>    .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 28 ++++++++++++++-----
>>    1 file changed, 21 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 314c048fcac6..783d760ccfe3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -1731,6 +1731,23 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info,
>>    	return 0;
>>    }
>>    
>> +/* Remove invalid userptr BOs from hmm track list
>> + *
>> + * Stop HMM track the userptr update
>> + */
>> +static void untrack_invalid_user_pages(struct amdkfd_process_info *process_info)
>> +{
>> +	struct kgd_mem *mem, *tmp_mem;
>> +	struct amdgpu_bo *bo;
>> +
>> +	list_for_each_entry_safe(mem, tmp_mem,
>> +				 &process_info->userptr_inval_list,
>> +				 validate_list.head) {
>> +		bo = mem->bo;
>> +		amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
>> +	}
>> +}
>> +
>>    /* Validate invalid userptr BOs
>>     *
>>     * Validates BOs on the userptr_inval_list, and moves them back to the
>> @@ -1848,12 +1865,7 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
>>    out_free:
>>    	kfree(pd_bo_list_entries);
>>    out_no_mem:
>> -	list_for_each_entry_safe(mem, tmp_mem,
>> -				 &process_info->userptr_inval_list,
>> -				 validate_list.head) {
>> -		bo = mem->bo;
>> -		amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
>> -	}
>> +	untrack_invalid_user_pages(process_info);
>>    
>>    	return ret;
>>    }
>> @@ -1897,8 +1909,10 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct work_struct *work)
>>    	 * and we can just restart the queues.
>>    	 */
>>    	if (!list_empty(&process_info->userptr_inval_list)) {
>> -		if (atomic_read(&process_info->evicted_bos) != evicted_bos)
>> +		if (atomic_read(&process_info->evicted_bos) != evicted_bos) {
>> +			untrack_invalid_user_pages(process_info);
>>    			goto unlock_out; /* Concurrent eviction, try again */
>> +		}
>>    
>>    		if (validate_invalid_user_pages(process_info))
>>    			goto unlock_out;
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] drm/amdgpu: support userptr cross VMAs case with HMM
       [not found]         ` <f51eaecd-8d50-0eea-aac6-2905f945a2c2-5C7GfCeVMHo@public.gmane.org>
@ 2019-03-07  1:12           ` Yang, Philip
  0 siblings, 0 replies; 11+ messages in thread
From: Yang, Philip @ 2019-03-07  1:12 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

I will submit v2 to fix those issues. Some comments inline...

On 2019-03-06 3:11 p.m., Kuehling, Felix wrote:
> Some comments inline ...
> 
> On 3/5/2019 1:09 PM, Yang, Philip wrote:
>> userptr may cross two VMAs if the forked child process (not call exec
>> after fork) malloc buffer, then free it, and then malloc larger size
>> buf, kerenl will create new VMA adjacent to old VMA which was cloned
>> from parent process, some pages of userptr are in the first VMA, the
>> rest pages are in the second VMA.
>>
>> HMM expects range only have one VMA, loop over all VMAs in the address
>> range, create multiple ranges to handle this case. See
>> is_mergeable_anon_vma in mm/mmap.c for details.
>>
>> Change-Id: I0ca8c77e28deabccc139906f9ffee04b7e383314
>> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 122 +++++++++++++++++-------
>>    1 file changed, 87 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index cd0ccfbbcb84..173bf4db5994 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -711,7 +711,8 @@ struct amdgpu_ttm_tt {
>>    	struct task_struct	*usertask;
>>    	uint32_t		userflags;
>>    #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
>> -	struct hmm_range	range;
>> +	struct hmm_range	*ranges;
>> +	int			nr_ranges;
>>    #endif
>>    };
>>    
>> @@ -723,62 +724,104 @@ struct amdgpu_ttm_tt {
>>     * once afterwards to stop HMM tracking
>>     */
>>    #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
>> +
>> +/* Support Userptr pages cross max 16 vmas */
>> +#define MAX_NR_VMAS	(16)
>> +
>>    int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages)
>>    {
>>    	struct amdgpu_ttm_tt *gtt = (void *)ttm;
>>    	struct mm_struct *mm = gtt->usertask->mm;
>> -	unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
>> -	struct hmm_range *range = &gtt->range;
>> -	int r = 0, i;
>> +	unsigned long start = gtt->userptr;
>> +	unsigned long end = start + ttm->num_pages * PAGE_SIZE;
>> +	struct hmm_range *ranges;
>> +	struct vm_area_struct *vma = NULL, *vmas[MAX_NR_VMAS];
>> +	uint64_t *pfns, f;
>> +	int r = 0, i, nr_pages;
>>    
>>    	if (!mm) /* Happens during process shutdown */
>>    		return -ESRCH;
>>    
>> -	amdgpu_hmm_init_range(range);
>> -
>>    	down_read(&mm->mmap_sem);
>>    
>> -	range->vma = find_vma(mm, gtt->userptr);
>> -	if (!range_in_vma(range->vma, gtt->userptr, end))
>> -		r = -EFAULT;
>> -	else if ((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
>> -		range->vma->vm_file)
>> +	/* user pages may cross multiple VMAs */
>> +	gtt->nr_ranges = 0;
>> +	do {
>> +		vma = find_vma(mm, vma ? vma->vm_end : start);
>> +		if (unlikely(!vma)) {
>> +			r = -EFAULT;
>> +			goto out;
>> +		}
>> +		vmas[gtt->nr_ranges++] = vma;
>> +		if (gtt->nr_ranges >= MAX_NR_VMAS) {
> 
> This will lead to a failure when you have exactly 16 VMAs. If you move
> the check to the start of the loop, it will only trigger when you exceed
> the limit not just after you reach it.
> 
Ok
> 
>> +			DRM_ERROR("invalid userptr range\n");
> 
> The userptr range is not really invalid. It only exceeds some artificial
> limitation in this code. A message like "Too many VMAs in userptr range"
> would be more appropriate.
> 
Ok
> 
>> +			r = -EFAULT;
>> +			goto out;
>> +		}
>> +	} while (end > vma->vm_end);
>> +
>> +	DRM_DEBUG_DRIVER("0x%lx nr_ranges %d pages 0x%lx\n",
>> +		start, gtt->nr_ranges, ttm->num_pages);
>> +
>> +	if (unlikely((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
>> +		vmas[0]->vm_file)) {
>>    		r = -EPERM;
>> -	if (r)
>>    		goto out;
>> +	}
>>    
>> -	range->pfns = kvmalloc_array(ttm->num_pages, sizeof(uint64_t),
>> -				     GFP_KERNEL);
>> -	if (range->pfns == NULL) {
>> +	ranges = kvmalloc_array(gtt->nr_ranges, sizeof(*ranges), GFP_KERNEL);
>> +	if (unlikely(!ranges)) {
>>    		r = -ENOMEM;
>>    		goto out;
>>    	}
>> -	range->start = gtt->userptr;
>> -	range->end = end;
>>    
>> -	range->pfns[0] = range->flags[HMM_PFN_VALID];
>> -	range->pfns[0] |= amdgpu_ttm_tt_is_readonly(ttm) ?
>> -				0 : range->flags[HMM_PFN_WRITE];
>> -	for (i = 1; i < ttm->num_pages; i++)
>> -		range->pfns[i] = range->pfns[0];
>> +	pfns = kvmalloc_array(ttm->num_pages, sizeof(*pfns), GFP_KERNEL);
>> +	if (unlikely(!pfns)) {
>> +		r = -ENOMEM;
>> +		goto out_free_ranges;
>> +	}
>> +
>> +	for (i = 0; i < gtt->nr_ranges; i++)
>> +		amdgpu_hmm_init_range(&ranges[i]);
>> +
>> +	f = ranges[0].flags[HMM_PFN_VALID];
>> +	f |= amdgpu_ttm_tt_is_readonly(ttm) ?
>> +				0 : ranges[0].flags[HMM_PFN_WRITE];
>> +	memset64(pfns, f, ttm->num_pages);
>> +
>> +	for (nr_pages = 0, i = 0; i < gtt->nr_ranges; i++) {
>> +		ranges[i].vma = vmas[i];
>> +		ranges[i].start = max(start, vmas[i]->vm_start);
>> +		ranges[i].end = min(end, vmas[i]->vm_end);
>> +		ranges[i].pfns = pfns + nr_pages;
>> +		nr_pages += (ranges[i].end - ranges[i].start) / PAGE_SIZE;
>> +
>> +		r = hmm_vma_fault(&ranges[i], true);
>> +		if (unlikely(r))
>> +			break;
>> +	}
>> +	if (unlikely(r)) {
>> +		while (i--)
>> +			hmm_vma_range_done(&ranges[i]);
>>    
>> -	/* This may trigger page table update */
>> -	r = hmm_vma_fault(range, true);
>> -	if (r)
>>    		goto out_free_pfns;
>> +	}
>>    
>>    	up_read(&mm->mmap_sem);
>>    
>>    	for (i = 0; i < ttm->num_pages; i++)
>> -		pages[i] = hmm_pfn_to_page(range, range->pfns[i]);
>> +		pages[i] = hmm_pfn_to_page(&ranges[0], pfns[i]);
> 
> This looks wrong. Don't you need to use the correct range here instead
> of range[0]?
> 
hmm_pfn_to_page uses flags HMM_PFN_ERROR etc and pfn_shift, those flags 
and pfn_shift are same for all ranges. To use range corresponding to the 
page, we need another calculation from page to range.
> 
>> +	gtt->ranges = ranges;
>>    
>>    	return 0;
>>    
>>    out_free_pfns:
>> -	kvfree(range->pfns);
>> -	range->pfns = NULL;
>> +	kvfree(pfns);
>> +out_free_ranges:
>> +	kvfree(ranges);
>>    out:
>>    	up_read(&mm->mmap_sem);
>> +
>>    	return r;
>>    }
>>    
>> @@ -792,15 +835,23 @@ bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
>>    {
>>    	struct amdgpu_ttm_tt *gtt = (void *)ttm;
>>    	bool r = false;
>> +	int i;
>>    
>>    	if (!gtt || !gtt->userptr)
>>    		return false;
>>    
>> -	WARN_ONCE(!gtt->range.pfns, "No user pages to check\n");
>> -	if (gtt->range.pfns) {
>> -		r = hmm_vma_range_done(&gtt->range);
>> -		kvfree(gtt->range.pfns);
>> -		gtt->range.pfns = NULL;
>> +	DRM_DEBUG_DRIVER("user_pages_done 0x%llx nr_ranges %d pages 0x%lx\n",
>> +		gtt->userptr, gtt->nr_ranges, ttm->num_pages);
>> +
>> +	WARN_ONCE(!gtt->ranges || !gtt->ranges[0].pfns,
>> +		"No user pages to check\n");
>> +
>> +	if (gtt->ranges) {
>> +		for (i = 0; i < gtt->nr_ranges; i++)
>> +			r |= hmm_vma_range_done(&gtt->ranges[i]);
>> +		kvfree(gtt->ranges[0].pfns);
> 
> Shouldn't this be in the loop and apply to gtt->ranges[i].pfns?
> 
pfns is allocated once based on ttm->num_pages, ranges[0].pfns is the 
pfns, other ranges pfns is pfns plus nr_pages offset. So only need to 
free ranges[0].pfns.

> Regards,
>     Felix
> 
>> +		kvfree(gtt->ranges);
>> +		gtt->ranges = NULL;
>>    	}
>>    
>>    	return r;
>> @@ -884,8 +935,9 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
>>    	sg_free_table(ttm->sg);
>>    
>>    #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
>> -	if (gtt->range.pfns &&
>> -	    ttm->pages[0] == hmm_pfn_to_page(&gtt->range, gtt->range.pfns[0]))
>> +	if (gtt->ranges &&
>> +	    ttm->pages[0] == hmm_pfn_to_page(&gtt->ranges[0],
>> +					     gtt->ranges[0].pfns[0]))
>>    		WARN_ONCE(1, "Missing get_user_page_done\n");
>>    #endif
>>    }
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-03-07  1:12 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-05 18:09 [PATCH 0/3] handle userptr corner cases with HMM path Yang, Philip
     [not found] ` <20190305180915.15216-1-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
2019-03-05 18:09   ` [PATCH 1/3] drm/amdkfd: support concurrent userptr update for HMM Yang, Philip
     [not found]     ` <20190305180915.15216-2-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
2019-03-06 20:01       ` Kuehling, Felix
     [not found]         ` <772b1109-b6db-05cc-4f36-6428724ae4a3-5C7GfCeVMHo@public.gmane.org>
2019-03-07  0:54           ` Yang, Philip
2019-03-05 18:09   ` [PATCH 2/3] drm/amdgpu: support userptr cross VMAs case with HMM Yang, Philip
     [not found]     ` <20190305180915.15216-3-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
2019-03-06 20:11       ` Kuehling, Felix
     [not found]         ` <f51eaecd-8d50-0eea-aac6-2905f945a2c2-5C7GfCeVMHo@public.gmane.org>
2019-03-07  1:12           ` Yang, Philip
2019-03-05 18:09   ` [PATCH 3/3] drm/amdgpu: more descriptive message if HMM not enabled Yang, Philip
     [not found]     ` <20190305180915.15216-4-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
2019-03-06  9:05       ` Michel Dänzer
     [not found]         ` <9ac0b1ab-891e-9fe6-667e-f77a244dd862-otUistvHUpPR7s880joybQ@public.gmane.org>
2019-03-06 14:42           ` Yang, Philip
     [not found]             ` <baa08fbf-8bb0-6ad7-9e62-b9ebdf88cf72-5C7GfCeVMHo@public.gmane.org>
2019-03-06 14:53               ` Michel Dänzer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.