All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] drm/amdgpu: use moving fence instead of exclusive for VM updates
@ 2019-09-02 10:52 Christian König
       [not found] ` <20190902105219.2813-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Christian König @ 2019-09-02 10:52 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Make VM updates depend on the moving fence instead of the exclusive one.

Makes it less likely to actually have a dependency.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 189ad5699946..501e13420786 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1706,7 +1706,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 			ttm = container_of(bo->tbo.ttm, struct ttm_dma_tt, ttm);
 			pages_addr = ttm->dma_address;
 		}
-		exclusive = reservation_object_get_excl(bo->tbo.resv);
+		exclusive = bo->tbo.moving;
 	}
 
 	if (bo) {
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables
       [not found] ` <20190902105219.2813-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-02 10:52   ` Christian König
       [not found]     ` <20190902105219.2813-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2019-09-02 10:52   ` [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict Christian König
  2019-09-03 20:37   ` [PATCH 1/3] drm/amdgpu: use moving fence instead of exclusive for VM updates Kuehling, Felix
  2 siblings, 1 reply; 11+ messages in thread
From: Christian König @ 2019-09-02 10:52 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

This hopefully helps reduce the contention for page tables.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h       | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 9 +++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 2eda3a8c330d..3352a87b822e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -99,6 +99,9 @@ struct amdgpu_bo_list_entry;
 #define AMDGPU_VM_FAULT_STOP_FIRST	1
 #define AMDGPU_VM_FAULT_STOP_ALWAYS	2
 
+/* Reserve 4MB VRAM for page tables */
+#define AMDGPU_VM_RESERVED_VRAM		(4ULL << 20)
+
 /* max number of VMHUB */
 #define AMDGPU_MAX_VMHUBS			3
 #define AMDGPU_GFXHUB_0				0
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 1150e34bc28f..59440f71d304 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -24,6 +24,7 @@
 
 #include <drm/drmP.h>
 #include "amdgpu.h"
+#include "amdgpu_vm.h"
 
 struct amdgpu_vram_mgr {
 	struct drm_mm mm;
@@ -276,7 +277,7 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 	struct drm_mm_node *nodes;
 	enum drm_mm_insert_mode mode;
 	unsigned long lpfn, num_nodes, pages_per_node, pages_left;
-	uint64_t vis_usage = 0, mem_bytes;
+	uint64_t vis_usage = 0, mem_bytes, max_bytes;
 	unsigned i;
 	int r;
 
@@ -284,9 +285,13 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 	if (!lpfn)
 		lpfn = man->size;
 
+	max_bytes = adev->gmc.mc_vram_size;
+	if (tbo->type != ttm_bo_type_kernel)
+		max_bytes -= AMDGPU_VM_RESERVED_VRAM;
+
 	/* bail out quickly if there's likely not enough VRAM for this BO */
 	mem_bytes = (u64)mem->num_pages << PAGE_SHIFT;
-	if (atomic64_add_return(mem_bytes, &mgr->usage) > adev->gmc.mc_vram_size) {
+	if (atomic64_add_return(mem_bytes, &mgr->usage) > max_bytes) {
 		atomic64_sub(mem_bytes, &mgr->usage);
 		mem->mm_node = NULL;
 		return 0;
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict
       [not found] ` <20190902105219.2813-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2019-09-02 10:52   ` [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables Christian König
@ 2019-09-02 10:52   ` Christian König
       [not found]     ` <20190902105219.2813-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2019-09-03 20:37   ` [PATCH 1/3] drm/amdgpu: use moving fence instead of exclusive for VM updates Kuehling, Felix
  2 siblings, 1 reply; 11+ messages in thread
From: Christian König @ 2019-09-02 10:52 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Trying to evict things from the current working set doesn't work that
well anymore because of per VM BOs.

Rely on reserving VRAM for page tables to avoid contention.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 71 +-------------------------
 2 files changed, 1 insertion(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index a236213f8e8e..d1995156733e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -478,7 +478,6 @@ struct amdgpu_cs_parser {
 	uint64_t			bytes_moved_vis_threshold;
 	uint64_t			bytes_moved;
 	uint64_t			bytes_moved_vis;
-	struct amdgpu_bo_list_entry	*evictable;
 
 	/* user fence */
 	struct amdgpu_bo_list_entry	uf_entry;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index fd95b586b590..03182d968d3d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -447,75 +447,12 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 	return r;
 }
 
-/* Last resort, try to evict something from the current working set */
-static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
-				struct amdgpu_bo *validated)
-{
-	uint32_t domain = validated->allowed_domains;
-	struct ttm_operation_ctx ctx = { true, false };
-	int r;
-
-	if (!p->evictable)
-		return false;
-
-	for (;&p->evictable->tv.head != &p->validated;
-	     p->evictable = list_prev_entry(p->evictable, tv.head)) {
-
-		struct amdgpu_bo_list_entry *candidate = p->evictable;
-		struct amdgpu_bo *bo = ttm_to_amdgpu_bo(candidate->tv.bo);
-		struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
-		bool update_bytes_moved_vis;
-		uint32_t other;
-
-		/* If we reached our current BO we can forget it */
-		if (bo == validated)
-			break;
-
-		/* We can't move pinned BOs here */
-		if (bo->pin_count)
-			continue;
-
-		other = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
-
-		/* Check if this BO is in one of the domains we need space for */
-		if (!(other & domain))
-			continue;
-
-		/* Check if we can move this BO somewhere else */
-		other = bo->allowed_domains & ~domain;
-		if (!other)
-			continue;
-
-		/* Good we can try to move this BO somewhere else */
-		update_bytes_moved_vis =
-				!amdgpu_gmc_vram_full_visible(&adev->gmc) &&
-				amdgpu_bo_in_cpu_visible_vram(bo);
-		amdgpu_bo_placement_from_domain(bo, other);
-		r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
-		p->bytes_moved += ctx.bytes_moved;
-		if (update_bytes_moved_vis)
-			p->bytes_moved_vis += ctx.bytes_moved;
-
-		if (unlikely(r))
-			break;
-
-		p->evictable = list_prev_entry(p->evictable, tv.head);
-		list_move(&candidate->tv.head, &p->validated);
-
-		return true;
-	}
-
-	return false;
-}
-
 static int amdgpu_cs_validate(void *param, struct amdgpu_bo *bo)
 {
 	struct amdgpu_cs_parser *p = param;
 	int r;
 
-	do {
-		r = amdgpu_cs_bo_validate(p, bo);
-	} while (r == -ENOMEM && amdgpu_cs_try_evict(p, bo));
+	r = amdgpu_cs_bo_validate(p, bo);
 	if (r)
 		return r;
 
@@ -554,9 +491,6 @@ static int amdgpu_cs_list_validate(struct amdgpu_cs_parser *p,
 			binding_userptr = true;
 		}
 
-		if (p->evictable == lobj)
-			p->evictable = NULL;
-
 		r = amdgpu_cs_validate(p, bo);
 		if (r)
 			return r;
@@ -659,9 +593,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 					  &p->bytes_moved_vis_threshold);
 	p->bytes_moved = 0;
 	p->bytes_moved_vis = 0;
-	p->evictable = list_last_entry(&p->validated,
-				       struct amdgpu_bo_list_entry,
-				       tv.head);
 
 	r = amdgpu_vm_validate_pt_bos(p->adev, &fpriv->vm,
 				      amdgpu_cs_validate, p);
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict
       [not found]     ` <20190902105219.2813-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-02 14:07       ` Kenny Ho
       [not found]         ` <CAOWid-dhHxhuPRmz26POaFKnAo3LfvayGnEqA-wLCRn3kNsKsw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Kenny Ho @ 2019-09-02 14:07 UTC (permalink / raw)
  To: Christian König; +Cc: amd-gfx list

Hey Christian,

Can you go into details a bit more on the how and why this doesn't
work well anymore?  (such as its relationship with per VM BOs?)  I am
curious to learn more because I was reading into this chunk of code
earlier.  Is this something that the Shrinker API can help with?

Regards,
Kenny

On Mon, Sep 2, 2019 at 6:52 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Trying to evict things from the current working set doesn't work that
> well anymore because of per VM BOs.
>
> Rely on reserving VRAM for page tables to avoid contention.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 71 +-------------------------
>  2 files changed, 1 insertion(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index a236213f8e8e..d1995156733e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -478,7 +478,6 @@ struct amdgpu_cs_parser {
>         uint64_t                        bytes_moved_vis_threshold;
>         uint64_t                        bytes_moved;
>         uint64_t                        bytes_moved_vis;
> -       struct amdgpu_bo_list_entry     *evictable;
>
>         /* user fence */
>         struct amdgpu_bo_list_entry     uf_entry;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index fd95b586b590..03182d968d3d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -447,75 +447,12 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>         return r;
>  }
>
> -/* Last resort, try to evict something from the current working set */
> -static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
> -                               struct amdgpu_bo *validated)
> -{
> -       uint32_t domain = validated->allowed_domains;
> -       struct ttm_operation_ctx ctx = { true, false };
> -       int r;
> -
> -       if (!p->evictable)
> -               return false;
> -
> -       for (;&p->evictable->tv.head != &p->validated;
> -            p->evictable = list_prev_entry(p->evictable, tv.head)) {
> -
> -               struct amdgpu_bo_list_entry *candidate = p->evictable;
> -               struct amdgpu_bo *bo = ttm_to_amdgpu_bo(candidate->tv.bo);
> -               struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> -               bool update_bytes_moved_vis;
> -               uint32_t other;
> -
> -               /* If we reached our current BO we can forget it */
> -               if (bo == validated)
> -                       break;
> -
> -               /* We can't move pinned BOs here */
> -               if (bo->pin_count)
> -                       continue;
> -
> -               other = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
> -
> -               /* Check if this BO is in one of the domains we need space for */
> -               if (!(other & domain))
> -                       continue;
> -
> -               /* Check if we can move this BO somewhere else */
> -               other = bo->allowed_domains & ~domain;
> -               if (!other)
> -                       continue;
> -
> -               /* Good we can try to move this BO somewhere else */
> -               update_bytes_moved_vis =
> -                               !amdgpu_gmc_vram_full_visible(&adev->gmc) &&
> -                               amdgpu_bo_in_cpu_visible_vram(bo);
> -               amdgpu_bo_placement_from_domain(bo, other);
> -               r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> -               p->bytes_moved += ctx.bytes_moved;
> -               if (update_bytes_moved_vis)
> -                       p->bytes_moved_vis += ctx.bytes_moved;
> -
> -               if (unlikely(r))
> -                       break;
> -
> -               p->evictable = list_prev_entry(p->evictable, tv.head);
> -               list_move(&candidate->tv.head, &p->validated);
> -
> -               return true;
> -       }
> -
> -       return false;
> -}
> -
>  static int amdgpu_cs_validate(void *param, struct amdgpu_bo *bo)
>  {
>         struct amdgpu_cs_parser *p = param;
>         int r;
>
> -       do {
> -               r = amdgpu_cs_bo_validate(p, bo);
> -       } while (r == -ENOMEM && amdgpu_cs_try_evict(p, bo));
> +       r = amdgpu_cs_bo_validate(p, bo);
>         if (r)
>                 return r;
>
> @@ -554,9 +491,6 @@ static int amdgpu_cs_list_validate(struct amdgpu_cs_parser *p,
>                         binding_userptr = true;
>                 }
>
> -               if (p->evictable == lobj)
> -                       p->evictable = NULL;
> -
>                 r = amdgpu_cs_validate(p, bo);
>                 if (r)
>                         return r;
> @@ -659,9 +593,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>                                           &p->bytes_moved_vis_threshold);
>         p->bytes_moved = 0;
>         p->bytes_moved_vis = 0;
> -       p->evictable = list_last_entry(&p->validated,
> -                                      struct amdgpu_bo_list_entry,
> -                                      tv.head);
>
>         r = amdgpu_vm_validate_pt_bos(p->adev, &fpriv->vm,
>                                       amdgpu_cs_validate, p);
> --
> 2.17.1
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict
       [not found]         ` <CAOWid-dhHxhuPRmz26POaFKnAo3LfvayGnEqA-wLCRn3kNsKsw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-09-02 14:30           ` Christian König
       [not found]             ` <4e8e46ea-4492-480e-bcd3-58cc563e4520-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Christian König @ 2019-09-02 14:30 UTC (permalink / raw)
  To: Kenny Ho; +Cc: amd-gfx list

Hi Kenny,

When we do a CS we have a certain set of buffers which the submission is 
working with and are locked down while we prepare the submission.

This working set contains of the buffers in the BO list as well as the 
one in the VM plus one or two for CSA and user fences etc..

Now what can happen is that we find that we need to allocate some page 
tables during the CS and when a lot of BOs are locked down allocating a 
page table can fail because we can't evict other BOs.

What this code tries todo is to evict stuff from the BO list to make 
room for VM BOs, but since now much more BOs are bound to the VM this 
doesn't work any more.


The root of the problem is that it is really tricky to figure out how 
much memory you need for the page tables in the first place. See for a 
BO in VRAM we usually need only one PTE for each 2MB, but for a BO in 
system memory we need one PTE for each 4K of memory.

So what can happen is that you evict something from VRAM because you 
need room and that eviction in turn makes you need even more room.....

It can take a while until this reaches a stable point, so this patch set 
here switched from a dynamic approach to just assuming the worst and 
reserving some memory for page tables.

Regards,
Christian.

Am 02.09.19 um 16:07 schrieb Kenny Ho:
> Hey Christian,
>
> Can you go into details a bit more on the how and why this doesn't
> work well anymore?  (such as its relationship with per VM BOs?)  I am
> curious to learn more because I was reading into this chunk of code
> earlier.  Is this something that the Shrinker API can help with?
>
> Regards,
> Kenny
>
> On Mon, Sep 2, 2019 at 6:52 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Trying to evict things from the current working set doesn't work that
>> well anymore because of per VM BOs.
>>
>> Rely on reserving VRAM for page tables to avoid contention.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  1 -
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 71 +-------------------------
>>   2 files changed, 1 insertion(+), 71 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index a236213f8e8e..d1995156733e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -478,7 +478,6 @@ struct amdgpu_cs_parser {
>>          uint64_t                        bytes_moved_vis_threshold;
>>          uint64_t                        bytes_moved;
>>          uint64_t                        bytes_moved_vis;
>> -       struct amdgpu_bo_list_entry     *evictable;
>>
>>          /* user fence */
>>          struct amdgpu_bo_list_entry     uf_entry;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> index fd95b586b590..03182d968d3d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> @@ -447,75 +447,12 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>>          return r;
>>   }
>>
>> -/* Last resort, try to evict something from the current working set */
>> -static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
>> -                               struct amdgpu_bo *validated)
>> -{
>> -       uint32_t domain = validated->allowed_domains;
>> -       struct ttm_operation_ctx ctx = { true, false };
>> -       int r;
>> -
>> -       if (!p->evictable)
>> -               return false;
>> -
>> -       for (;&p->evictable->tv.head != &p->validated;
>> -            p->evictable = list_prev_entry(p->evictable, tv.head)) {
>> -
>> -               struct amdgpu_bo_list_entry *candidate = p->evictable;
>> -               struct amdgpu_bo *bo = ttm_to_amdgpu_bo(candidate->tv.bo);
>> -               struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>> -               bool update_bytes_moved_vis;
>> -               uint32_t other;
>> -
>> -               /* If we reached our current BO we can forget it */
>> -               if (bo == validated)
>> -                       break;
>> -
>> -               /* We can't move pinned BOs here */
>> -               if (bo->pin_count)
>> -                       continue;
>> -
>> -               other = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
>> -
>> -               /* Check if this BO is in one of the domains we need space for */
>> -               if (!(other & domain))
>> -                       continue;
>> -
>> -               /* Check if we can move this BO somewhere else */
>> -               other = bo->allowed_domains & ~domain;
>> -               if (!other)
>> -                       continue;
>> -
>> -               /* Good we can try to move this BO somewhere else */
>> -               update_bytes_moved_vis =
>> -                               !amdgpu_gmc_vram_full_visible(&adev->gmc) &&
>> -                               amdgpu_bo_in_cpu_visible_vram(bo);
>> -               amdgpu_bo_placement_from_domain(bo, other);
>> -               r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
>> -               p->bytes_moved += ctx.bytes_moved;
>> -               if (update_bytes_moved_vis)
>> -                       p->bytes_moved_vis += ctx.bytes_moved;
>> -
>> -               if (unlikely(r))
>> -                       break;
>> -
>> -               p->evictable = list_prev_entry(p->evictable, tv.head);
>> -               list_move(&candidate->tv.head, &p->validated);
>> -
>> -               return true;
>> -       }
>> -
>> -       return false;
>> -}
>> -
>>   static int amdgpu_cs_validate(void *param, struct amdgpu_bo *bo)
>>   {
>>          struct amdgpu_cs_parser *p = param;
>>          int r;
>>
>> -       do {
>> -               r = amdgpu_cs_bo_validate(p, bo);
>> -       } while (r == -ENOMEM && amdgpu_cs_try_evict(p, bo));
>> +       r = amdgpu_cs_bo_validate(p, bo);
>>          if (r)
>>                  return r;
>>
>> @@ -554,9 +491,6 @@ static int amdgpu_cs_list_validate(struct amdgpu_cs_parser *p,
>>                          binding_userptr = true;
>>                  }
>>
>> -               if (p->evictable == lobj)
>> -                       p->evictable = NULL;
>> -
>>                  r = amdgpu_cs_validate(p, bo);
>>                  if (r)
>>                          return r;
>> @@ -659,9 +593,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>                                            &p->bytes_moved_vis_threshold);
>>          p->bytes_moved = 0;
>>          p->bytes_moved_vis = 0;
>> -       p->evictable = list_last_entry(&p->validated,
>> -                                      struct amdgpu_bo_list_entry,
>> -                                      tv.head);
>>
>>          r = amdgpu_vm_validate_pt_bos(p->adev, &fpriv->vm,
>>                                        amdgpu_cs_validate, p);
>> --
>> 2.17.1
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict
       [not found]             ` <4e8e46ea-4492-480e-bcd3-58cc563e4520-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-09-02 14:47               ` Kenny Ho
       [not found]                 ` <CAOWid-cCsb3OifZhuKA3-uPxj=S9TS96HBYb5b-xM-7+yzNRXA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Kenny Ho @ 2019-09-02 14:47 UTC (permalink / raw)
  To: Christian König; +Cc: amd-gfx list

Ah ok, thanks for the explanation.  About the last bit, what is the
reason behind the differences in page size?  (I assume that's what you
meant by PTE?  Or is that something else?)

Regards,
Kenny

On Mon, Sep 2, 2019 at 10:31 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Hi Kenny,
>
> When we do a CS we have a certain set of buffers which the submission is
> working with and are locked down while we prepare the submission.
>
> This working set contains of the buffers in the BO list as well as the
> one in the VM plus one or two for CSA and user fences etc..
>
> Now what can happen is that we find that we need to allocate some page
> tables during the CS and when a lot of BOs are locked down allocating a
> page table can fail because we can't evict other BOs.
>
> What this code tries todo is to evict stuff from the BO list to make
> room for VM BOs, but since now much more BOs are bound to the VM this
> doesn't work any more.
>
>
> The root of the problem is that it is really tricky to figure out how
> much memory you need for the page tables in the first place. See for a
> BO in VRAM we usually need only one PTE for each 2MB, but for a BO in
> system memory we need one PTE for each 4K of memory.
>
> So what can happen is that you evict something from VRAM because you
> need room and that eviction in turn makes you need even more room.....
>
> It can take a while until this reaches a stable point, so this patch set
> here switched from a dynamic approach to just assuming the worst and
> reserving some memory for page tables.
>
> Regards,
> Christian.
>
> Am 02.09.19 um 16:07 schrieb Kenny Ho:
> > Hey Christian,
> >
> > Can you go into details a bit more on the how and why this doesn't
> > work well anymore?  (such as its relationship with per VM BOs?)  I am
> > curious to learn more because I was reading into this chunk of code
> > earlier.  Is this something that the Shrinker API can help with?
> >
> > Regards,
> > Kenny
> >
> > On Mon, Sep 2, 2019 at 6:52 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> Trying to evict things from the current working set doesn't work that
> >> well anymore because of per VM BOs.
> >>
> >> Rely on reserving VRAM for page tables to avoid contention.
> >>
> >> Signed-off-by: Christian König <christian.koenig@amd.com>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  1 -
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 71 +-------------------------
> >>   2 files changed, 1 insertion(+), 71 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> index a236213f8e8e..d1995156733e 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >> @@ -478,7 +478,6 @@ struct amdgpu_cs_parser {
> >>          uint64_t                        bytes_moved_vis_threshold;
> >>          uint64_t                        bytes_moved;
> >>          uint64_t                        bytes_moved_vis;
> >> -       struct amdgpu_bo_list_entry     *evictable;
> >>
> >>          /* user fence */
> >>          struct amdgpu_bo_list_entry     uf_entry;
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >> index fd95b586b590..03182d968d3d 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >> @@ -447,75 +447,12 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
> >>          return r;
> >>   }
> >>
> >> -/* Last resort, try to evict something from the current working set */
> >> -static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
> >> -                               struct amdgpu_bo *validated)
> >> -{
> >> -       uint32_t domain = validated->allowed_domains;
> >> -       struct ttm_operation_ctx ctx = { true, false };
> >> -       int r;
> >> -
> >> -       if (!p->evictable)
> >> -               return false;
> >> -
> >> -       for (;&p->evictable->tv.head != &p->validated;
> >> -            p->evictable = list_prev_entry(p->evictable, tv.head)) {
> >> -
> >> -               struct amdgpu_bo_list_entry *candidate = p->evictable;
> >> -               struct amdgpu_bo *bo = ttm_to_amdgpu_bo(candidate->tv.bo);
> >> -               struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> >> -               bool update_bytes_moved_vis;
> >> -               uint32_t other;
> >> -
> >> -               /* If we reached our current BO we can forget it */
> >> -               if (bo == validated)
> >> -                       break;
> >> -
> >> -               /* We can't move pinned BOs here */
> >> -               if (bo->pin_count)
> >> -                       continue;
> >> -
> >> -               other = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
> >> -
> >> -               /* Check if this BO is in one of the domains we need space for */
> >> -               if (!(other & domain))
> >> -                       continue;
> >> -
> >> -               /* Check if we can move this BO somewhere else */
> >> -               other = bo->allowed_domains & ~domain;
> >> -               if (!other)
> >> -                       continue;
> >> -
> >> -               /* Good we can try to move this BO somewhere else */
> >> -               update_bytes_moved_vis =
> >> -                               !amdgpu_gmc_vram_full_visible(&adev->gmc) &&
> >> -                               amdgpu_bo_in_cpu_visible_vram(bo);
> >> -               amdgpu_bo_placement_from_domain(bo, other);
> >> -               r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> >> -               p->bytes_moved += ctx.bytes_moved;
> >> -               if (update_bytes_moved_vis)
> >> -                       p->bytes_moved_vis += ctx.bytes_moved;
> >> -
> >> -               if (unlikely(r))
> >> -                       break;
> >> -
> >> -               p->evictable = list_prev_entry(p->evictable, tv.head);
> >> -               list_move(&candidate->tv.head, &p->validated);
> >> -
> >> -               return true;
> >> -       }
> >> -
> >> -       return false;
> >> -}
> >> -
> >>   static int amdgpu_cs_validate(void *param, struct amdgpu_bo *bo)
> >>   {
> >>          struct amdgpu_cs_parser *p = param;
> >>          int r;
> >>
> >> -       do {
> >> -               r = amdgpu_cs_bo_validate(p, bo);
> >> -       } while (r == -ENOMEM && amdgpu_cs_try_evict(p, bo));
> >> +       r = amdgpu_cs_bo_validate(p, bo);
> >>          if (r)
> >>                  return r;
> >>
> >> @@ -554,9 +491,6 @@ static int amdgpu_cs_list_validate(struct amdgpu_cs_parser *p,
> >>                          binding_userptr = true;
> >>                  }
> >>
> >> -               if (p->evictable == lobj)
> >> -                       p->evictable = NULL;
> >> -
> >>                  r = amdgpu_cs_validate(p, bo);
> >>                  if (r)
> >>                          return r;
> >> @@ -659,9 +593,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>                                            &p->bytes_moved_vis_threshold);
> >>          p->bytes_moved = 0;
> >>          p->bytes_moved_vis = 0;
> >> -       p->evictable = list_last_entry(&p->validated,
> >> -                                      struct amdgpu_bo_list_entry,
> >> -                                      tv.head);
> >>
> >>          r = amdgpu_vm_validate_pt_bos(p->adev, &fpriv->vm,
> >>                                        amdgpu_cs_validate, p);
> >> --
> >> 2.17.1
> >>
> >> _______________________________________________
> >> amd-gfx mailing list
> >> amd-gfx@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict
       [not found]                 ` <CAOWid-cCsb3OifZhuKA3-uPxj=S9TS96HBYb5b-xM-7+yzNRXA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-09-02 14:56                   ` Christian König
  0 siblings, 0 replies; 11+ messages in thread
From: Christian König @ 2019-09-02 14:56 UTC (permalink / raw)
  To: Kenny Ho, Christian König; +Cc: amd-gfx list

PTE means Page Table Entry.

Starting with Vega10 we have multi level page tables and when you 
continuously allocate 2MB (or even 1GB) you can set a bit in the 
hierarchy that the walker should stop and take the address and flags for 
the whole 2MB block.

The result is that with 4K pages you need an additional level in the 
hierarchy compared to 2MB pages and so more memory.

Regards,
Christian.

Am 02.09.19 um 16:47 schrieb Kenny Ho:
> Ah ok, thanks for the explanation.  About the last bit, what is the
> reason behind the differences in page size?  (I assume that's what you
> meant by PTE?  Or is that something else?)
>
> Regards,
> Kenny
>
> On Mon, Sep 2, 2019 at 10:31 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Hi Kenny,
>>
>> When we do a CS we have a certain set of buffers which the submission is
>> working with and are locked down while we prepare the submission.
>>
>> This working set contains of the buffers in the BO list as well as the
>> one in the VM plus one or two for CSA and user fences etc..
>>
>> Now what can happen is that we find that we need to allocate some page
>> tables during the CS and when a lot of BOs are locked down allocating a
>> page table can fail because we can't evict other BOs.
>>
>> What this code tries todo is to evict stuff from the BO list to make
>> room for VM BOs, but since now much more BOs are bound to the VM this
>> doesn't work any more.
>>
>>
>> The root of the problem is that it is really tricky to figure out how
>> much memory you need for the page tables in the first place. See for a
>> BO in VRAM we usually need only one PTE for each 2MB, but for a BO in
>> system memory we need one PTE for each 4K of memory.
>>
>> So what can happen is that you evict something from VRAM because you
>> need room and that eviction in turn makes you need even more room.....
>>
>> It can take a while until this reaches a stable point, so this patch set
>> here switched from a dynamic approach to just assuming the worst and
>> reserving some memory for page tables.
>>
>> Regards,
>> Christian.
>>
>> Am 02.09.19 um 16:07 schrieb Kenny Ho:
>>> Hey Christian,
>>>
>>> Can you go into details a bit more on the how and why this doesn't
>>> work well anymore?  (such as its relationship with per VM BOs?)  I am
>>> curious to learn more because I was reading into this chunk of code
>>> earlier.  Is this something that the Shrinker API can help with?
>>>
>>> Regards,
>>> Kenny
>>>
>>> On Mon, Sep 2, 2019 at 6:52 AM Christian König
>>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>>> Trying to evict things from the current working set doesn't work that
>>>> well anymore because of per VM BOs.
>>>>
>>>> Rely on reserving VRAM for page tables to avoid contention.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  1 -
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 71 +-------------------------
>>>>    2 files changed, 1 insertion(+), 71 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index a236213f8e8e..d1995156733e 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -478,7 +478,6 @@ struct amdgpu_cs_parser {
>>>>           uint64_t                        bytes_moved_vis_threshold;
>>>>           uint64_t                        bytes_moved;
>>>>           uint64_t                        bytes_moved_vis;
>>>> -       struct amdgpu_bo_list_entry     *evictable;
>>>>
>>>>           /* user fence */
>>>>           struct amdgpu_bo_list_entry     uf_entry;
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> index fd95b586b590..03182d968d3d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> @@ -447,75 +447,12 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>>>>           return r;
>>>>    }
>>>>
>>>> -/* Last resort, try to evict something from the current working set */
>>>> -static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
>>>> -                               struct amdgpu_bo *validated)
>>>> -{
>>>> -       uint32_t domain = validated->allowed_domains;
>>>> -       struct ttm_operation_ctx ctx = { true, false };
>>>> -       int r;
>>>> -
>>>> -       if (!p->evictable)
>>>> -               return false;
>>>> -
>>>> -       for (;&p->evictable->tv.head != &p->validated;
>>>> -            p->evictable = list_prev_entry(p->evictable, tv.head)) {
>>>> -
>>>> -               struct amdgpu_bo_list_entry *candidate = p->evictable;
>>>> -               struct amdgpu_bo *bo = ttm_to_amdgpu_bo(candidate->tv.bo);
>>>> -               struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>>> -               bool update_bytes_moved_vis;
>>>> -               uint32_t other;
>>>> -
>>>> -               /* If we reached our current BO we can forget it */
>>>> -               if (bo == validated)
>>>> -                       break;
>>>> -
>>>> -               /* We can't move pinned BOs here */
>>>> -               if (bo->pin_count)
>>>> -                       continue;
>>>> -
>>>> -               other = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
>>>> -
>>>> -               /* Check if this BO is in one of the domains we need space for */
>>>> -               if (!(other & domain))
>>>> -                       continue;
>>>> -
>>>> -               /* Check if we can move this BO somewhere else */
>>>> -               other = bo->allowed_domains & ~domain;
>>>> -               if (!other)
>>>> -                       continue;
>>>> -
>>>> -               /* Good we can try to move this BO somewhere else */
>>>> -               update_bytes_moved_vis =
>>>> -                               !amdgpu_gmc_vram_full_visible(&adev->gmc) &&
>>>> -                               amdgpu_bo_in_cpu_visible_vram(bo);
>>>> -               amdgpu_bo_placement_from_domain(bo, other);
>>>> -               r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
>>>> -               p->bytes_moved += ctx.bytes_moved;
>>>> -               if (update_bytes_moved_vis)
>>>> -                       p->bytes_moved_vis += ctx.bytes_moved;
>>>> -
>>>> -               if (unlikely(r))
>>>> -                       break;
>>>> -
>>>> -               p->evictable = list_prev_entry(p->evictable, tv.head);
>>>> -               list_move(&candidate->tv.head, &p->validated);
>>>> -
>>>> -               return true;
>>>> -       }
>>>> -
>>>> -       return false;
>>>> -}
>>>> -
>>>>    static int amdgpu_cs_validate(void *param, struct amdgpu_bo *bo)
>>>>    {
>>>>           struct amdgpu_cs_parser *p = param;
>>>>           int r;
>>>>
>>>> -       do {
>>>> -               r = amdgpu_cs_bo_validate(p, bo);
>>>> -       } while (r == -ENOMEM && amdgpu_cs_try_evict(p, bo));
>>>> +       r = amdgpu_cs_bo_validate(p, bo);
>>>>           if (r)
>>>>                   return r;
>>>>
>>>> @@ -554,9 +491,6 @@ static int amdgpu_cs_list_validate(struct amdgpu_cs_parser *p,
>>>>                           binding_userptr = true;
>>>>                   }
>>>>
>>>> -               if (p->evictable == lobj)
>>>> -                       p->evictable = NULL;
>>>> -
>>>>                   r = amdgpu_cs_validate(p, bo);
>>>>                   if (r)
>>>>                           return r;
>>>> @@ -659,9 +593,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>                                             &p->bytes_moved_vis_threshold);
>>>>           p->bytes_moved = 0;
>>>>           p->bytes_moved_vis = 0;
>>>> -       p->evictable = list_last_entry(&p->validated,
>>>> -                                      struct amdgpu_bo_list_entry,
>>>> -                                      tv.head);
>>>>
>>>>           r = amdgpu_vm_validate_pt_bos(p->adev, &fpriv->vm,
>>>>                                         amdgpu_cs_validate, p);
>>>> --
>>>> 2.17.1
>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables
       [not found]     ` <20190902105219.2813-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2019-09-03  6:52       ` Zhou, David(ChunMing)
       [not found]         ` <MN2PR12MB29104DDA4EBB320F67988981B4B90-rweVpJHSKTr1t3MqfsnKMAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Zhou, David(ChunMing) @ 2019-09-03  6:52 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Do you need update the vram size reported to UMD ?

-David

-----Original Message-----
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Christian König
Sent: Monday, September 2, 2019 6:52 PM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables

This hopefully helps reduce the contention for page tables.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h       | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 9 +++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 2eda3a8c330d..3352a87b822e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -99,6 +99,9 @@ struct amdgpu_bo_list_entry;
 #define AMDGPU_VM_FAULT_STOP_FIRST	1
 #define AMDGPU_VM_FAULT_STOP_ALWAYS	2
 
+/* Reserve 4MB VRAM for page tables */
+#define AMDGPU_VM_RESERVED_VRAM		(4ULL << 20)
+
 /* max number of VMHUB */
 #define AMDGPU_MAX_VMHUBS			3
 #define AMDGPU_GFXHUB_0				0
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 1150e34bc28f..59440f71d304 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -24,6 +24,7 @@
 
 #include <drm/drmP.h>
 #include "amdgpu.h"
+#include "amdgpu_vm.h"
 
 struct amdgpu_vram_mgr {
 	struct drm_mm mm;
@@ -276,7 +277,7 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 	struct drm_mm_node *nodes;
 	enum drm_mm_insert_mode mode;
 	unsigned long lpfn, num_nodes, pages_per_node, pages_left;
-	uint64_t vis_usage = 0, mem_bytes;
+	uint64_t vis_usage = 0, mem_bytes, max_bytes;
 	unsigned i;
 	int r;
 
@@ -284,9 +285,13 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 	if (!lpfn)
 		lpfn = man->size;
 
+	max_bytes = adev->gmc.mc_vram_size;
+	if (tbo->type != ttm_bo_type_kernel)
+		max_bytes -= AMDGPU_VM_RESERVED_VRAM;
+
 	/* bail out quickly if there's likely not enough VRAM for this BO */
 	mem_bytes = (u64)mem->num_pages << PAGE_SHIFT;
-	if (atomic64_add_return(mem_bytes, &mgr->usage) > adev->gmc.mc_vram_size) {
+	if (atomic64_add_return(mem_bytes, &mgr->usage) > max_bytes) {
 		atomic64_sub(mem_bytes, &mgr->usage);
 		mem->mm_node = NULL;
 		return 0;
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables
       [not found]         ` <MN2PR12MB29104DDA4EBB320F67988981B4B90-rweVpJHSKTr1t3MqfsnKMAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2019-09-03  8:47           ` Christian König
  0 siblings, 0 replies; 11+ messages in thread
From: Christian König @ 2019-09-03  8:47 UTC (permalink / raw)
  To: Zhou, David(ChunMing), amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Oh, good point! That is probably good idea.

Christian.

Am 03.09.19 um 08:52 schrieb Zhou, David(ChunMing):
> Do you need update the vram size reported to UMD ?
>
> -David
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Christian König
> Sent: Monday, September 2, 2019 6:52 PM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables
>
> This hopefully helps reduce the contention for page tables.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h       | 3 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 9 +++++++--
>   2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 2eda3a8c330d..3352a87b822e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -99,6 +99,9 @@ struct amdgpu_bo_list_entry;
>   #define AMDGPU_VM_FAULT_STOP_FIRST	1
>   #define AMDGPU_VM_FAULT_STOP_ALWAYS	2
>   
> +/* Reserve 4MB VRAM for page tables */
> +#define AMDGPU_VM_RESERVED_VRAM		(4ULL << 20)
> +
>   /* max number of VMHUB */
>   #define AMDGPU_MAX_VMHUBS			3
>   #define AMDGPU_GFXHUB_0				0
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 1150e34bc28f..59440f71d304 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -24,6 +24,7 @@
>   
>   #include <drm/drmP.h>
>   #include "amdgpu.h"
> +#include "amdgpu_vm.h"
>   
>   struct amdgpu_vram_mgr {
>   	struct drm_mm mm;
> @@ -276,7 +277,7 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
>   	struct drm_mm_node *nodes;
>   	enum drm_mm_insert_mode mode;
>   	unsigned long lpfn, num_nodes, pages_per_node, pages_left;
> -	uint64_t vis_usage = 0, mem_bytes;
> +	uint64_t vis_usage = 0, mem_bytes, max_bytes;
>   	unsigned i;
>   	int r;
>   
> @@ -284,9 +285,13 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
>   	if (!lpfn)
>   		lpfn = man->size;
>   
> +	max_bytes = adev->gmc.mc_vram_size;
> +	if (tbo->type != ttm_bo_type_kernel)
> +		max_bytes -= AMDGPU_VM_RESERVED_VRAM;
> +
>   	/* bail out quickly if there's likely not enough VRAM for this BO */
>   	mem_bytes = (u64)mem->num_pages << PAGE_SHIFT;
> -	if (atomic64_add_return(mem_bytes, &mgr->usage) > adev->gmc.mc_vram_size) {
> +	if (atomic64_add_return(mem_bytes, &mgr->usage) > max_bytes) {
>   		atomic64_sub(mem_bytes, &mgr->usage);
>   		mem->mm_node = NULL;
>   		return 0;

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] drm/amdgpu: use moving fence instead of exclusive for VM updates
       [not found] ` <20190902105219.2813-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2019-09-02 10:52   ` [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables Christian König
  2019-09-02 10:52   ` [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict Christian König
@ 2019-09-03 20:37   ` Kuehling, Felix
  2 siblings, 0 replies; 11+ messages in thread
From: Kuehling, Felix @ 2019-09-03 20:37 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2019-09-02 6:52 a.m., Christian König wrote:
> Make VM updates depend on the moving fence instead of the exclusive one.

In effect, this makes the page table update depend on the last move of 
the BO, rather than the last change of the buffer contents. Makes sense.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>


>
> Makes it less likely to actually have a dependency.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 189ad5699946..501e13420786 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1706,7 +1706,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
>   			ttm = container_of(bo->tbo.ttm, struct ttm_dma_tt, ttm);
>   			pages_addr = ttm->dma_address;
>   		}
> -		exclusive = reservation_object_get_excl(bo->tbo.resv);
> +		exclusive = bo->tbo.moving;
>   	}
>   
>   	if (bo) {
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] drm/amdgpu: use moving fence instead of exclusive for VM updates
@ 2019-09-03  9:09 Christian König
  0 siblings, 0 replies; 11+ messages in thread
From: Christian König @ 2019-09-03  9:09 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Make VM updates depend on the moving fence instead of the exclusive one.

Makes it less likely to actually have a dependency.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 189ad5699946..501e13420786 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1706,7 +1706,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 			ttm = container_of(bo->tbo.ttm, struct ttm_dma_tt, ttm);
 			pages_addr = ttm->dma_address;
 		}
-		exclusive = reservation_object_get_excl(bo->tbo.resv);
+		exclusive = bo->tbo.moving;
 	}
 
 	if (bo) {
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-09-03 20:37 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-02 10:52 [PATCH 1/3] drm/amdgpu: use moving fence instead of exclusive for VM updates Christian König
     [not found] ` <20190902105219.2813-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-09-02 10:52   ` [PATCH 2/3] drm/amdgpu: reserve at least 4MB of VRAM for page tables Christian König
     [not found]     ` <20190902105219.2813-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-09-03  6:52       ` Zhou, David(ChunMing)
     [not found]         ` <MN2PR12MB29104DDA4EBB320F67988981B4B90-rweVpJHSKTr1t3MqfsnKMAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-09-03  8:47           ` Christian König
2019-09-02 10:52   ` [PATCH 3/3] drm/amdgpu: remove amdgpu_cs_try_evict Christian König
     [not found]     ` <20190902105219.2813-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-09-02 14:07       ` Kenny Ho
     [not found]         ` <CAOWid-dhHxhuPRmz26POaFKnAo3LfvayGnEqA-wLCRn3kNsKsw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-09-02 14:30           ` Christian König
     [not found]             ` <4e8e46ea-4492-480e-bcd3-58cc563e4520-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-09-02 14:47               ` Kenny Ho
     [not found]                 ` <CAOWid-cCsb3OifZhuKA3-uPxj=S9TS96HBYb5b-xM-7+yzNRXA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-09-02 14:56                   ` Christian König
2019-09-03 20:37   ` [PATCH 1/3] drm/amdgpu: use moving fence instead of exclusive for VM updates Kuehling, Felix
2019-09-03  9:09 Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.