amd-gfx.lists.freedesktop.org archive mirror
* [PATCH 1/3] drm/amdgpu: Allow TTM to evict svm bo from same process
@ 2022-07-25 12:23 Philip Yang
  2022-07-25 12:23 ` [PATCH 2/3] drm/amdkfd: Set svm range max pages Philip Yang
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Philip Yang @ 2022-07-25 12:23 UTC (permalink / raw)
  To: amd-gfx; +Cc: Philip Yang, felix.kuehling

To support SVM range VRAM overcommitment, TTM should be able to evict
SVM BOs belonging to the same process to system memory, to make room
for allocating a new SVM BO.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
index 1d0dbff87d3f..e8bb32f4ca14 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
@@ -159,11 +159,14 @@ static void amdkfd_fence_release(struct dma_fence *f)
 }
 
 /**
- * amdkfd_fence_check_mm - Check if @mm is same as that of the fence @f
- *  if same return TRUE else return FALSE.
+ * amdkfd_fence_check_mm
  *
  * @f: [IN] fence
  * @mm: [IN] mm that needs to be verified
+ *
+ * Check if @mm is the same as that of the fence @f; if so return TRUE,
+ * otherwise return FALSE.
+ * For SVM BOs, which support VRAM overcommitment, always return FALSE.
  */
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
 {
@@ -171,7 +174,7 @@ bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
 
 	if (!fence)
 		return false;
-	else if (fence->mm == mm)
+	else if (fence->mm == mm && !fence->svm_bo)
 		return true;
 
 	return false;
-- 
2.35.1


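A minimal, self-contained userspace model of the decision this patch changes
(struct fake_kfd_fence and blocks_eviction() are made-up stand-ins for
struct amdgpu_amdkfd_fence and amdkfd_fence_check_mm(); only the logic is
mirrored, none of the real driver types are used):

/* A fence blocks same-process eviction only if it is NOT an SVM BO fence. */
#include <stdbool.h>
#include <stdio.h>

struct fake_kfd_fence {
	void *mm;      /* owning process address space */
	void *svm_bo;  /* non-NULL for SVM BO fences */
};

/* Mirrors the intent of amdkfd_fence_check_mm() after this patch. */
static bool blocks_eviction(const struct fake_kfd_fence *fence, void *mm)
{
	if (!fence)
		return false;
	return fence->mm == mm && !fence->svm_bo;
}

int main(void)
{
	int proc_mm, bo;
	struct fake_kfd_fence regular = { .mm = &proc_mm, .svm_bo = NULL };
	struct fake_kfd_fence svm     = { .mm = &proc_mm, .svm_bo = &bo };

	/* A regular KFD BO of the same process still cannot be evicted ... */
	printf("regular bo blocks eviction: %d\n", blocks_eviction(&regular, &proc_mm));
	/* ... but an SVM BO of the same process may now be evicted to make
	 * room for a new SVM BO (VRAM overcommitment).
	 */
	printf("svm bo blocks eviction:     %d\n", blocks_eviction(&svm, &proc_mm));
	return 0;
}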

* [PATCH 2/3] drm/amdkfd: Set svm range max pages
  2022-07-25 12:23 [PATCH 1/3] drm/amdgpu: Allow TTM to evict svm bo from same process Philip Yang
@ 2022-07-25 12:23 ` Philip Yang
  2022-07-25 14:34   ` Felix Kuehling
  2022-07-25 12:23 ` [PATCH 3/3] drm/amdkfd: Split giant svm range Philip Yang
  2022-07-25 15:01 ` [PATCH 1/3] drm/amdgpu: Allow TTM to evict svm bo from same process Felix Kuehling
  2 siblings, 1 reply; 7+ messages in thread
From: Philip Yang @ 2022-07-25 12:23 UTC (permalink / raw)
  To: amd-gfx; +Cc: Philip Yang, felix.kuehling

This will be used to split giant SVM ranges into smaller ranges, to
support VRAM overcommitment for giant ranges and to improve GPU retry
fault recovery on giant ranges.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c     | 15 +++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h     |  3 +++
 3 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 9667015a6cbc..b1f87aa6138b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -1019,6 +1019,8 @@ int svm_migrate_init(struct amdgpu_device *adev)
 
 	amdgpu_amdkfd_reserve_system_mem(SVM_HMM_PAGE_STRUCT_SIZE(size));
 
+	svm_range_set_max_pages(adev);
+
 	pr_info("HMM registered %ldMB device memory\n", size >> 20);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index b592aee6d9d6..cf9565ddddf8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -46,6 +46,11 @@
  */
 #define AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING	(2UL * NSEC_PER_MSEC)
 
+/* Giant svm ranges are split into smaller ranges based on this value: the
+ * minimum of 1/32 VRAM size over all dGPUs/APUs, clamped to 2MB-1GB, 2MB aligned.
+ */
+uint64_t max_svm_range_pages;
+
 struct criu_svm_metadata {
 	struct list_head list;
 	struct kfd_criu_svm_range_priv_data data;
@@ -1869,6 +1874,16 @@ static struct svm_range *svm_range_clone(struct svm_range *old)
 
 	return new;
 }
+__init void svm_range_set_max_pages(struct amdgpu_device *adev)
+{
+	uint64_t pages;
+
+	/* 1/32 VRAM size in pages */
+	pages = adev->gmc.real_vram_size >> 17;
+	pages = clamp(pages, 1ULL << 9, 1ULL << 18);
+	max_svm_range_pages = min_not_zero(max_svm_range_pages, pages);
+	max_svm_range_pages = ALIGN(max_svm_range_pages, 1ULL << 9);
+}
 
 /**
  * svm_range_add - add svm range and handle overlap
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
index eab7f6d3b13c..346a41bf8dbf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
@@ -204,6 +204,9 @@ void svm_range_list_lock_and_flush_work(struct svm_range_list *svms, struct mm_s
 #define KFD_IS_SVM_API_SUPPORTED(dev) ((dev)->pgmap.type != 0)
 
 void svm_range_bo_unref_async(struct svm_range_bo *svm_bo);
+
+__init void svm_range_set_max_pages(struct amdgpu_device *adev);
+
 #else
 
 struct kfd_process;
-- 
2.35.1


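The limit computed by svm_range_set_max_pages() can be sanity-checked with a
small standalone program; CLAMP/ALIGN_UP/MIN_NOT_ZERO below are userspace
re-implementations of the kernel's clamp()/ALIGN()/min_not_zero(), and the
VRAM sizes in main() are arbitrary examples:

#include <stdint.h>
#include <stdio.h>

#define CLAMP(v, lo, hi)   ((v) < (lo) ? (lo) : (v) > (hi) ? (hi) : (v))
#define ALIGN_UP(v, a)     (((v) + (a) - 1) / (a) * (a))
#define MIN_NOT_ZERO(a, b) ((a) == 0 ? (b) : (a) < (b) ? (a) : (b))

static uint64_t max_svm_range_pages; /* 0 until the first GPU is probed */

static void set_max_pages(uint64_t real_vram_size)
{
	/* 1/32 of VRAM in 4KB pages: >> 12 for pages, >> 5 for the 1/32 */
	uint64_t pages = real_vram_size >> 17;

	/* clamp to 512 pages (2MB) .. 256K pages (1GB), keep the smallest GPU,
	 * and round up to a 2MB multiple
	 */
	pages = CLAMP(pages, 1ULL << 9, 1ULL << 18);
	max_svm_range_pages = MIN_NOT_ZERO(max_svm_range_pages, pages);
	max_svm_range_pages = ALIGN_UP(max_svm_range_pages, 1ULL << 9);
}

int main(void)
{
	set_max_pages(64ULL << 30);  /* 64GB GPU: 1/32 is 2GB, clamped to 1GB */
	printf("after 64GB GPU: 0x%llx pages (%llu MB)\n",
	       (unsigned long long)max_svm_range_pages,
	       (unsigned long long)(max_svm_range_pages >> 8));
	set_max_pages(4ULL << 30);   /* adding a 4GB GPU lowers the limit to 128MB */
	printf("after  4GB GPU: 0x%llx pages (%llu MB)\n",
	       (unsigned long long)max_svm_range_pages,
	       (unsigned long long)(max_svm_range_pages >> 8));
	return 0;
}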

* [PATCH 3/3] drm/amdkfd: Split giant svm range
  2022-07-25 12:23 [PATCH 1/3] drm/amdgpu: Allow TTM to evict svm bo from same process Philip Yang
  2022-07-25 12:23 ` [PATCH 2/3] drm/amdkfd: Set svm range max pages Philip Yang
@ 2022-07-25 12:23 ` Philip Yang
  2022-07-25 14:55   ` Felix Kuehling
  2022-07-25 15:01 ` [PATCH 1/3] drm/amdgpu: Allow TTM to evict svm bo from same process Felix Kuehling
  2 siblings, 1 reply; 7+ messages in thread
From: Philip Yang @ 2022-07-25 12:23 UTC (permalink / raw)
  To: amd-gfx; +Cc: Philip Yang, felix.kuehling

Split giant SVM ranges into smaller ranges, aligning the range start
address to max_svm_range_pages to improve MMU TLB usage.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 52 +++++++++++++++++++---------
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index cf9565ddddf8..044bb99f88ea 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1885,6 +1885,37 @@ __init void svm_range_set_max_pages(struct amdgpu_device *adev)
 	max_svm_range_pages = ALIGN(max_svm_range_pages, 1ULL << 9);
 }
 
+static int
+__svm_range_add(struct svm_range_list *svms, uint64_t start, uint64_t last,
+	       struct list_head *insert_list, struct list_head *update_list)
+{
+	struct svm_range *prange;
+	uint64_t l;
+
+	pr_debug("max_svm_range_pages 0x%llx adding [0x%llx 0x%llx]\n",
+		 max_svm_range_pages, start, last);
+
+	while (last >= start) {
+		if (last - start + 1 > max_svm_range_pages) {
+			if (start % max_svm_range_pages)
+				l = ALIGN(start, max_svm_range_pages) - 1;
+			else
+				l = start + max_svm_range_pages - 1;
+		} else {
+			l = last;
+		}
+
+		prange = svm_range_new(svms, start, l);
+		if (!prange)
+			return -ENOMEM;
+		list_add(&prange->list, insert_list);
+		list_add(&prange->update_list, update_list);
+
+		start = l + 1;
+	}
+	return 0;
+}
+
 /**
  * svm_range_add - add svm range and handle overlap
  * @p: the range add to this process svms
@@ -1987,14 +2018,10 @@ svm_range_add(struct kfd_process *p, uint64_t start, uint64_t size,
 
 		/* insert a new node if needed */
 		if (node->start > start) {
-			prange = svm_range_new(svms, start, node->start - 1);
-			if (!prange) {
-				r = -ENOMEM;
+			r = __svm_range_add(svms, start, node->start - 1,
+					    insert_list, update_list);
+			if (r)
 				goto out;
-			}
-
-			list_add(&prange->list, insert_list);
-			list_add(&prange->update_list, update_list);
 		}
 
 		node = next;
@@ -2002,15 +2029,8 @@ svm_range_add(struct kfd_process *p, uint64_t start, uint64_t size,
 	}
 
 	/* add a final range at the end if needed */
-	if (start <= last) {
-		prange = svm_range_new(svms, start, last);
-		if (!prange) {
-			r = -ENOMEM;
-			goto out;
-		}
-		list_add(&prange->list, insert_list);
-		list_add(&prange->update_list, update_list);
-	}
+	if (start <= last)
+		r = __svm_range_add(svms, start, last, insert_list, update_list);
 
 out:
 	if (r)
-- 
2.35.1


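A standalone sketch of the chunking performed by __svm_range_add(); page
numbers and MAX_PAGES are made-up example values, and ALIGN_UP stands in for
the kernel's ALIGN():

#include <stdint.h>
#include <stdio.h>

#define MAX_PAGES 0x200ULL             /* stand-in for max_svm_range_pages */
#define ALIGN_UP(v, a) (((v) + (a) - 1) / (a) * (a))

/* Cut [start, last] into chunks of at most MAX_PAGES; after an unaligned
 * head, every chunk starts on a MAX_PAGES boundary.
 */
static void split(uint64_t start, uint64_t last)
{
	while (last >= start) {
		uint64_t l;

		if (last - start + 1 > MAX_PAGES) {
			if (start % MAX_PAGES)
				l = ALIGN_UP(start, MAX_PAGES) - 1;  /* stop at next boundary */
			else
				l = start + MAX_PAGES - 1;
		} else {
			l = last;  /* final, possibly short, tail */
		}
		printf("  new range [0x%llx 0x%llx]\n",
		       (unsigned long long)start, (unsigned long long)l);
		start = l + 1;
	}
}

int main(void)
{
	/* prints [0x300 0x3ff] [0x400 0x5ff] [0x600 0x7ff] [0x800 0x900] */
	split(0x300, 0x900);
	return 0;
}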

* Re: [PATCH 2/3] drm/amdkfd: Set svm range max pages
  2022-07-25 12:23 ` [PATCH 2/3] drm/amdkfd: Set svm range max pages Philip Yang
@ 2022-07-25 14:34   ` Felix Kuehling
  2022-07-25 15:27     ` philip yang
  0 siblings, 1 reply; 7+ messages in thread
From: Felix Kuehling @ 2022-07-25 14:34 UTC (permalink / raw)
  To: Philip Yang, amd-gfx

On 2022-07-25 08:23, Philip Yang wrote:
> This will be used to split giant svm range into smaller ranges, to
> support VRAM overcommitment by giant range and improve GPU retry fault
> recover on giant range.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  2 ++
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c     | 15 +++++++++++++++
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.h     |  3 +++
>   3 files changed, 20 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> index 9667015a6cbc..b1f87aa6138b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> @@ -1019,6 +1019,8 @@ int svm_migrate_init(struct amdgpu_device *adev)
>   
>   	amdgpu_amdkfd_reserve_system_mem(SVM_HMM_PAGE_STRUCT_SIZE(size));
>   
> +	svm_range_set_max_pages(adev);
> +
>   	pr_info("HMM registered %ldMB device memory\n", size >> 20);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index b592aee6d9d6..cf9565ddddf8 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -46,6 +46,11 @@
>    */
>   #define AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING	(2UL * NSEC_PER_MSEC)
>   
> +/* Giant svm range split into smaller ranges based on this, it is decided using
> + * minimum of all dGPU/APU 1/32 VRAM size, between 2MB to 1GB and align to 2MB.
> + */
> +uint64_t max_svm_range_pages;
> +
>   struct criu_svm_metadata {
>   	struct list_head list;
>   	struct kfd_criu_svm_range_priv_data data;
> @@ -1869,6 +1874,16 @@ static struct svm_range *svm_range_clone(struct svm_range *old)
>   
>   	return new;
>   }
> +__init void svm_range_set_max_pages(struct amdgpu_device *adev)

Why is this marked as __init? This can run much later than module init.


> +{
> +	uint64_t pages;
> +
> +	/* 1/32 VRAM size in pages */
> +	pages = adev->gmc.real_vram_size >> 17;
> +	pages = clamp(pages, 1ULL << 9, 1ULL << 18);
> +	max_svm_range_pages = min_not_zero(max_svm_range_pages, pages);
> +	max_svm_range_pages = ALIGN(max_svm_range_pages, 1ULL << 9);

I'd recommend updating max_svm_range_pages with a single WRITE_ONCE to 
avoid race conditions with GPU hot-plug.

Regards,
   Felix


> +}
>   
>   /**
>    * svm_range_add - add svm range and handle overlap
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
> index eab7f6d3b13c..346a41bf8dbf 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
> @@ -204,6 +204,9 @@ void svm_range_list_lock_and_flush_work(struct svm_range_list *svms, struct mm_s
>   #define KFD_IS_SVM_API_SUPPORTED(dev) ((dev)->pgmap.type != 0)
>   
>   void svm_range_bo_unref_async(struct svm_range_bo *svm_bo);
> +
> +__init void svm_range_set_max_pages(struct amdgpu_device *adev);
> +
>   #else
>   
>   struct kfd_process;

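A rough sketch of how the WRITE_ONCE suggestion could be folded into the
function above (this is not an actual follow-up revision; clamp(),
min_not_zero(), ALIGN() and WRITE_ONCE() are the existing kernel helpers, the
rest is unchanged from the patch):

__init void svm_range_set_max_pages(struct amdgpu_device *adev)
{
	uint64_t max_pages;
	uint64_t pages;

	/* 1/32 VRAM size in pages */
	pages = adev->gmc.real_vram_size >> 17;
	pages = clamp(pages, 1ULL << 9, 1ULL << 18);

	/* compute in a local variable, then publish with a single store so
	 * concurrent readers never see the intermediate, unaligned value
	 */
	max_pages = min_not_zero(max_svm_range_pages, pages);
	max_pages = ALIGN(max_pages, 1ULL << 9);
	WRITE_ONCE(max_svm_range_pages, max_pages);
}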

* Re: [PATCH 3/3] drm/amdkfd: Split giant svm range
  2022-07-25 12:23 ` [PATCH 3/3] drm/amdkfd: Split giant svm range Philip Yang
@ 2022-07-25 14:55   ` Felix Kuehling
  0 siblings, 0 replies; 7+ messages in thread
From: Felix Kuehling @ 2022-07-25 14:55 UTC (permalink / raw)
  To: Philip Yang, amd-gfx

On 2022-07-25 08:23, Philip Yang wrote:
> Giant svm range split to smaller ranges, align the range start address
> to max svm range pages to improve MMU TLB usage.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 52 +++++++++++++++++++---------
>   1 file changed, 36 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index cf9565ddddf8..044bb99f88ea 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -1885,6 +1885,37 @@ __init void svm_range_set_max_pages(struct amdgpu_device *adev)
>   	max_svm_range_pages = ALIGN(max_svm_range_pages, 1ULL << 9);
>   }
>   
> +static int
> +__svm_range_add(struct svm_range_list *svms, uint64_t start, uint64_t last,
> +	       struct list_head *insert_list, struct list_head *update_list)

It would be nice to find a better name for this. Maybe 
svm_range_split_new. Maybe make the max size a parameter of the function 
for better clarity.


> +{
> +	struct svm_range *prange;
> +	uint64_t l;
> +
> +	pr_debug("max_svm_range_pages 0x%llx adding [0x%llx 0x%llx]\n",
> +		 max_svm_range_pages, start, last);
> +
> +	while (last >= start) {
> +		if (last - start + 1 > max_svm_range_pages) {

Use a single READ_ONCE in this function to read max_svm_range_pages into 
a local variable. This should avoid race conditions with GPU hotplug. If 
you make the max size a parameter of this function, that also works if 
the caller uses READ_ONCE.


> +			if (start % max_svm_range_pages)
> +				l = ALIGN(start, max_svm_range_pages) - 1;
> +			else
> +				l = start + max_svm_range_pages - 1;
> +		} else {
> +			l = last;

I think this whole if block could be written as

     l = min(last, ALIGN_DOWN(start + max_svm_range_pages, max_svm_range_pages) - 1);

Regards,
   Felix


> +		}
> +
> +		prange = svm_range_new(svms, start, l);
> +		if (!prange)
> +			return -ENOMEM;
> +		list_add(&prange->list, insert_list);
> +		list_add(&prange->update_list, update_list);
> +
> +		start = l + 1;
> +	}
> +	return 0;
> +}
> +
>   /**
>    * svm_range_add - add svm range and handle overlap
>    * @p: the range add to this process svms
> @@ -1987,14 +2018,10 @@ svm_range_add(struct kfd_process *p, uint64_t start, uint64_t size,
>   
>   		/* insert a new node if needed */
>   		if (node->start > start) {
> -			prange = svm_range_new(svms, start, node->start - 1);
> -			if (!prange) {
> -				r = -ENOMEM;
> +			r = __svm_range_add(svms, start, node->start - 1,
> +					    insert_list, update_list);
> +			if (r)
>   				goto out;
> -			}
> -
> -			list_add(&prange->list, insert_list);
> -			list_add(&prange->update_list, update_list);
>   		}
>   
>   		node = next;
> @@ -2002,15 +2029,8 @@ svm_range_add(struct kfd_process *p, uint64_t start, uint64_t size,
>   	}
>   
>   	/* add a final range at the end if needed */
> -	if (start <= last) {
> -		prange = svm_range_new(svms, start, last);
> -		if (!prange) {
> -			r = -ENOMEM;
> -			goto out;
> -		}
> -		list_add(&prange->list, insert_list);
> -		list_add(&prange->update_list, update_list);
> -	}
> +	if (start <= last)
> +		r = __svm_range_add(svms, start, last, insert_list, update_list);
>   
>   out:
>   	if (r)

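A standalone sketch of the same loop using the suggested one-liner, with
MAX_PAGES as a made-up stand-in for max_svm_range_pages and ALIGN_DOWN/MIN
re-implementing the kernel helpers for power-of-two sizes. One observable
difference from the if/else in the patch: this form also cuts at a MAX_PAGES
boundary when the remaining range fits under the limit but still straddles a
boundary (second example below), so every chunk after the head is
boundary-aligned:

#include <stdint.h>
#include <stdio.h>

#define MAX_PAGES 0x200ULL
#define ALIGN_DOWN(v, a) ((v) & ~((a) - 1))   /* power-of-two alignment only */
#define MIN(a, b)        ((a) < (b) ? (a) : (b))

static void split(uint64_t start, uint64_t last)
{
	printf("[0x%llx 0x%llx] ->", (unsigned long long)start, (unsigned long long)last);
	while (last >= start) {
		uint64_t l = MIN(last, ALIGN_DOWN(start + MAX_PAGES, MAX_PAGES) - 1);

		printf(" [0x%llx 0x%llx]", (unsigned long long)start, (unsigned long long)l);
		start = l + 1;
	}
	printf("\n");
}

int main(void)
{
	split(0x300, 0x900);  /* oversized range: same chunks as the if/else form */
	split(0x100, 0x2f0);  /* fits under MAX_PAGES but crosses a boundary */
	return 0;
}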

* Re: [PATCH 1/3] drm/amdgpu: Allow TTM to evict svm bo from same process
  2022-07-25 12:23 [PATCH 1/3] drm/amdgpu: Allow TTM to evict svm bo from same process Philip Yang
  2022-07-25 12:23 ` [PATCH 2/3] drm/amdkfd: Set svm range max pages Philip Yang
  2022-07-25 12:23 ` [PATCH 3/3] drm/amdkfd: Split giant svm range Philip Yang
@ 2022-07-25 15:01 ` Felix Kuehling
  2 siblings, 0 replies; 7+ messages in thread
From: Felix Kuehling @ 2022-07-25 15:01 UTC (permalink / raw)
  To: Philip Yang, amd-gfx

On 2022-07-25 08:23, Philip Yang wrote:
> To support SVM range VRAM overcommitment, TTM should be able to evict
> svm bo of same process to system memory, to get space to alloc new svm
> bo.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
> index 1d0dbff87d3f..e8bb32f4ca14 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
> @@ -159,11 +159,14 @@ static void amdkfd_fence_release(struct dma_fence *f)
>   }
>   
>   /**
> - * amdkfd_fence_check_mm - Check if @mm is same as that of the fence @f
> - *  if same return TRUE else return FALSE.
> + * amdkfd_fence_check_mm

I think we still need a brief description here. How about "Check whether 
to prevent eviction of @f by @mm".

With that fixed, the patch is

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>


>    *
>    * @f: [IN] fence
>    * @mm: [IN] mm that needs to be verified
> + *
> + * Check if @mm is same as that of the fence @f, if same return TRUE else
> + * return FALSE.
> + * For svm bo, which support vram overcommitment, always return FALSE.
>    */
>   bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
>   {
> @@ -171,7 +174,7 @@ bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
>   
>   	if (!fence)
>   		return false;
> -	else if (fence->mm == mm)
> +	else if (fence->mm == mm  && !fence->svm_bo)
>   		return true;
>   
>   	return false;


* Re: [PATCH 2/3] drm/amdkfd: Set svm range max pages
  2022-07-25 14:34   ` Felix Kuehling
@ 2022-07-25 15:27     ` philip yang
  0 siblings, 0 replies; 7+ messages in thread
From: philip yang @ 2022-07-25 15:27 UTC (permalink / raw)
  To: Felix Kuehling, Philip Yang, amd-gfx

[-- Attachment #1: Type: text/html, Size: 5850 bytes --]


end of thread, other threads:[~2022-07-25 15:27 UTC | newest]

Thread overview: 7+ messages
2022-07-25 12:23 [PATCH 1/3] drm/amdgpu: Allow TTM to evict svm bo from same process Philip Yang
2022-07-25 12:23 ` [PATCH 2/3] drm/amdkfd: Set svm range max pages Philip Yang
2022-07-25 14:34   ` Felix Kuehling
2022-07-25 15:27     ` philip yang
2022-07-25 12:23 ` [PATCH 3/3] drm/amdkfd: Split giant svm range Philip Yang
2022-07-25 14:55   ` Felix Kuehling
2022-07-25 15:01 ` [PATCH 1/3] drm/amdgpu: Allow TTM to evict svm bo from same process Felix Kuehling
