* drm/ttm: moving the LRU into the resource
@ 2022-01-24 12:25 Christian König
  2022-01-24 12:25 ` [PATCH 01/12] drm/ttm: add ttm_resource_fini Christian König
                   ` (11 more replies)
  0 siblings, 12 replies; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

Hi guys,

let's hope that I have fixed and documented everything now.

This fixes the fundamental problem in TTM that during a move a buffer
has resources allocated from two different domains at the same time.

In addition to that, it's a prerequisite for removing ghost objects and
allowing multiple resources to be allocated per BO.

Based on this set I've added an optimization for RADV which Bas pointed
out would be valuable.

Please review, test and/or comment,
Christian.




* [PATCH 01/12] drm/ttm: add ttm_resource_fini
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-25 16:26   ` Daniel Vetter
  2022-01-24 12:25 ` [PATCH 02/12] drm/ttm: add back a reference to the bdev to the res manager Christian König
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

Make sure we call the common cleanup function in all
implementations of the resource manager.
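
For illustration only (not part of the patch), the pattern every manager
now follows mirrors the system manager touched below: pair
ttm_resource_init() in the alloc path with the new ttm_resource_fini()
right before freeing. A minimal sketch with hypothetical my_man_*
callbacks:

static int my_man_alloc(struct ttm_resource_manager *man,
			struct ttm_buffer_object *bo,
			const struct ttm_place *place,
			struct ttm_resource **res)
{
	*res = kzalloc(sizeof(**res), GFP_KERNEL);
	if (!*res)
		return -ENOMEM;

	ttm_resource_init(bo, place, *res);	/* common init */
	return 0;
}

static void my_man_free(struct ttm_resource_manager *man,
			struct ttm_resource *res)
{
	ttm_resource_fini(man, res);		/* common cleanup, before kfree */
	kfree(res);
}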

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c     | 2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c    | 2 ++
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c   | 1 +
 drivers/gpu/drm/nouveau/nouveau_mem.c           | 3 ++-
 drivers/gpu/drm/nouveau/nouveau_mem.h           | 3 ++-
 drivers/gpu/drm/nouveau/nouveau_ttm.c           | 9 +++++----
 drivers/gpu/drm/ttm/ttm_range_manager.c         | 2 ++
 drivers/gpu/drm/ttm/ttm_resource.c              | 6 ++++++
 drivers/gpu/drm/ttm/ttm_sys_manager.c           | 1 +
 drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c   | 2 ++
 drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c  | 1 +
 include/drm/ttm/ttm_resource.h                  | 3 +++
 13 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 675a72ef305d..ea5470c8c921 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -169,6 +169,7 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager *man,
 	return 0;
 
 err_free:
+	ttm_resource_fini(man, &node->base.base);
 	kfree(node);
 
 err_out:
@@ -200,6 +201,7 @@ static void amdgpu_gtt_mgr_del(struct ttm_resource_manager *man,
 	if (!(res->placement & TTM_PL_FLAG_TEMPORARY))
 		atomic64_sub(res->num_pages, &mgr->used);
 
+	ttm_resource_fini(man, res);
 	kfree(node);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
index d02c8637f909..ffddec08e931 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
@@ -95,6 +95,7 @@ static void amdgpu_preempt_mgr_del(struct ttm_resource_manager *man,
 	struct amdgpu_preempt_mgr *mgr = to_preempt_mgr(man);
 
 	atomic64_sub(res->num_pages, &mgr->used);
+	ttm_resource_fini(man, res);
 	kfree(res);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 7b2b0980ec41..55d68408951d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -476,6 +476,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
 	while (i--)
 		drm_mm_remove_node(&node->mm_nodes[i]);
 	spin_unlock(&mgr->lock);
+	ttm_resource_fini(man, &node->base);
 	kvfree(node);
 
 error_sub:
@@ -515,6 +516,7 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager *man,
 	atomic64_sub(usage, &mgr->usage);
 	atomic64_sub(vis_usage, &mgr->vis_usage);
 
+	ttm_resource_fini(man, res);
 	kvfree(node);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index 6ba314f9836a..bf857703c797 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -124,6 +124,7 @@ static void i915_ttm_buddy_man_free(struct ttm_resource_manager *man,
 	drm_buddy_free_list(&bman->mm, &bman_res->blocks);
 	mutex_unlock(&bman->lock);
 
+	ttm_resource_fini(man, res);
 	kfree(bman_res);
 }
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_mem.c b/drivers/gpu/drm/nouveau/nouveau_mem.c
index 2ca3207c13fc..2e517cdc24c9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_mem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_mem.c
@@ -162,11 +162,12 @@ nouveau_mem_vram(struct ttm_resource *reg, bool contig, u8 page)
 }
 
 void
-nouveau_mem_del(struct ttm_resource *reg)
+nouveau_mem_del(struct ttm_resource_manager *man, struct ttm_resource *reg)
 {
 	struct nouveau_mem *mem = nouveau_mem(reg);
 
 	nouveau_mem_fini(mem);
+	ttm_resource_fini(man, reg);
 	kfree(mem);
 }
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_mem.h b/drivers/gpu/drm/nouveau/nouveau_mem.h
index 2c01166a90f2..325551eba5cd 100644
--- a/drivers/gpu/drm/nouveau/nouveau_mem.h
+++ b/drivers/gpu/drm/nouveau/nouveau_mem.h
@@ -23,7 +23,8 @@ nouveau_mem(struct ttm_resource *reg)
 
 int nouveau_mem_new(struct nouveau_cli *, u8 kind, u8 comp,
 		    struct ttm_resource **);
-void nouveau_mem_del(struct ttm_resource *);
+void nouveau_mem_del(struct ttm_resource_manager *man,
+		     struct ttm_resource *);
 int nouveau_mem_vram(struct ttm_resource *, bool contig, u8 page);
 int nouveau_mem_host(struct ttm_resource *, struct ttm_tt *);
 void nouveau_mem_fini(struct nouveau_mem *);
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index 2ca9d9a9e5d5..91ef33f8f22c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -36,9 +36,10 @@
 #include <core/tegra.h>
 
 static void
-nouveau_manager_del(struct ttm_resource_manager *man, struct ttm_resource *reg)
+nouveau_manager_del(struct ttm_resource_manager *man,
+		    struct ttm_resource *reg)
 {
-	nouveau_mem_del(reg);
+	nouveau_mem_del(man, reg);
 }
 
 static int
@@ -62,7 +63,7 @@ nouveau_vram_manager_new(struct ttm_resource_manager *man,
 
 	ret = nouveau_mem_vram(*res, nvbo->contig, nvbo->page);
 	if (ret) {
-		nouveau_mem_del(*res);
+		nouveau_mem_del(man, *res);
 		return ret;
 	}
 
@@ -118,7 +119,7 @@ nv04_gart_manager_new(struct ttm_resource_manager *man,
 	ret = nvif_vmm_get(&mem->cli->vmm.vmm, PTES, false, 12, 0,
 			   (long)(*res)->num_pages << PAGE_SHIFT, &mem->vma[0]);
 	if (ret) {
-		nouveau_mem_del(*res);
+		nouveau_mem_del(man, *res);
 		return ret;
 	}
 
diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
index 072e0baf2ab4..55a41355f684 100644
--- a/drivers/gpu/drm/ttm/ttm_range_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
@@ -89,6 +89,7 @@ static int ttm_range_man_alloc(struct ttm_resource_manager *man,
 	spin_unlock(&rman->lock);
 
 	if (unlikely(ret)) {
+		ttm_resource_fini(man, *res);
 		kfree(node);
 		return ret;
 	}
@@ -108,6 +109,7 @@ static void ttm_range_man_free(struct ttm_resource_manager *man,
 	drm_mm_remove_node(&node->mm_nodes[0]);
 	spin_unlock(&rman->lock);
 
+	ttm_resource_fini(man, res);
 	kfree(node);
 }
 
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 035d71332d18..89bcfe22a0ca 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -44,6 +44,12 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
 }
 EXPORT_SYMBOL(ttm_resource_init);
 
+void ttm_resource_fini(struct ttm_resource_manager *man,
+		       struct ttm_resource *res)
+{
+}
+EXPORT_SYMBOL(ttm_resource_fini);
+
 int ttm_resource_alloc(struct ttm_buffer_object *bo,
 		       const struct ttm_place *place,
 		       struct ttm_resource **res_ptr)
diff --git a/drivers/gpu/drm/ttm/ttm_sys_manager.c b/drivers/gpu/drm/ttm/ttm_sys_manager.c
index 63aca52f75e1..135394dcca95 100644
--- a/drivers/gpu/drm/ttm/ttm_sys_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_sys_manager.c
@@ -23,6 +23,7 @@ static int ttm_sys_man_alloc(struct ttm_resource_manager *man,
 static void ttm_sys_man_free(struct ttm_resource_manager *man,
 			     struct ttm_resource *res)
 {
+	ttm_resource_fini(man, res);
 	kfree(res);
 }
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
index ebb4505a31a3..99ccf690f8a1 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
@@ -117,6 +117,7 @@ static int vmw_gmrid_man_get_node(struct ttm_resource_manager *man,
 	gman->used_gmr_pages -= (*res)->num_pages;
 	spin_unlock(&gman->lock);
 	ida_free(&gman->gmr_ida, id);
+	ttm_resource_fini(man, *res);
 	kfree(*res);
 	return -ENOSPC;
 }
@@ -130,6 +131,7 @@ static void vmw_gmrid_man_put_node(struct ttm_resource_manager *man,
 	spin_lock(&gman->lock);
 	gman->used_gmr_pages -= res->num_pages;
 	spin_unlock(&gman->lock);
+	ttm_resource_fini(man, res);
 	kfree(res);
 }
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
index b0005b03a617..a64188c7268d 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
@@ -49,6 +49,7 @@ static int vmw_sys_man_alloc(struct ttm_resource_manager *man,
 static void vmw_sys_man_free(struct ttm_resource_manager *man,
 			     struct ttm_resource *res)
 {
+	ttm_resource_fini(man, res);
 	kfree(res);
 }
 
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 5952051091cd..df1f06b7b504 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -261,6 +261,9 @@ ttm_resource_manager_cleanup(struct ttm_resource_manager *man)
 void ttm_resource_init(struct ttm_buffer_object *bo,
                        const struct ttm_place *place,
                        struct ttm_resource *res);
+void ttm_resource_fini(struct ttm_resource_manager *man,
+		       struct ttm_resource *res);
+
 int ttm_resource_alloc(struct ttm_buffer_object *bo,
 		       const struct ttm_place *place,
 		       struct ttm_resource **res);
-- 
2.25.1



* [PATCH 02/12] drm/ttm: add back a reference to the bdev to the res manager
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
  2022-01-24 12:25 ` [PATCH 01/12] drm/ttm: add ttm_resource_fini Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-25 16:30   ` Daniel Vetter
  2022-01-24 12:25 ` [PATCH 03/12] drm/ttm: add a weak BO reference to the resource v3 Christian König
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

It is simply a lot cleaner to have this around instead of passing the
device through the whole call chain.
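
For illustration: once ttm_resource_manager_init(man, bdev, size) has
stored the back-pointer, a manager callback can resolve the device from
the manager itself instead of having it passed down as an extra
parameter. A hypothetical sketch:

static void my_man_debug(struct ttm_resource_manager *man,
			 struct drm_printer *p)
{
	struct ttm_device *bdev = man->bdev;	/* new back-pointer */

	drm_printf(p, "manager on bdev %p, size %llu pages\n",
		   bdev, (unsigned long long)man->size);
}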

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c     |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c    |  3 ++-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c   |  2 +-
 drivers/gpu/drm/nouveau/nouveau_ttm.c           |  4 ++--
 drivers/gpu/drm/ttm/ttm_range_manager.c         |  2 +-
 drivers/gpu/drm/ttm/ttm_resource.c              |  3 +++
 drivers/gpu/drm/ttm/ttm_sys_manager.c           |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c   |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c  |  2 +-
 include/drm/ttm/ttm_resource.h                  | 16 +++++++++-------
 11 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index ea5470c8c921..9e7685a4878c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -293,7 +293,8 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t gtt_size)
 	man->use_tt = true;
 	man->func = &amdgpu_gtt_mgr_func;
 
-	ttm_resource_manager_init(man, gtt_size >> PAGE_SHIFT);
+	ttm_resource_manager_init(man, &adev->mman.bdev,
+				  gtt_size >> PAGE_SHIFT);
 
 	start = AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS;
 	size = (adev->gmc.gart_size >> PAGE_SHIFT) - start;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
index ffddec08e931..6f7189d32f0a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
@@ -153,7 +153,7 @@ int amdgpu_preempt_mgr_init(struct amdgpu_device *adev)
 	man->use_tt = true;
 	man->func = &amdgpu_preempt_mgr_func;
 
-	ttm_resource_manager_init(man, (1 << 30));
+	ttm_resource_manager_init(man, &adev->mman.bdev, (1 << 30));
 
 	atomic64_set(&mgr->used, 0);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 55d68408951d..ddd0b6d74070 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -699,7 +699,8 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 	struct amdgpu_vram_mgr *mgr = &adev->mman.vram_mgr;
 	struct ttm_resource_manager *man = &mgr->manager;
 
-	ttm_resource_manager_init(man, adev->gmc.real_vram_size >> PAGE_SHIFT);
+	ttm_resource_manager_init(man, &adev->mman.bdev,
+				  adev->gmc.real_vram_size >> PAGE_SHIFT);
 
 	man->func = &amdgpu_vram_mgr_func;
 
diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index bf857703c797..79dc8c6f2833 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -204,7 +204,7 @@ int i915_ttm_buddy_man_init(struct ttm_device *bdev,
 	man = &bman->manager;
 	man->use_tt = use_tt;
 	man->func = &i915_ttm_buddy_manager_func;
-	ttm_resource_manager_init(man, bman->mm.size >> PAGE_SHIFT);
+	ttm_resource_manager_init(man, bdev, bman->mm.size >> PAGE_SHIFT);
 
 	ttm_resource_manager_set_used(man, true);
 	ttm_set_driver_manager(bdev, type, man);
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index 91ef33f8f22c..85f1f5a0fe5d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -164,7 +164,7 @@ nouveau_ttm_init_vram(struct nouveau_drm *drm)
 
 		man->func = &nouveau_vram_manager;
 
-		ttm_resource_manager_init(man,
+		ttm_resource_manager_init(man, &drm->ttm.bdev,
 					  drm->gem.vram_available >> PAGE_SHIFT);
 		ttm_set_driver_manager(&drm->ttm.bdev, TTM_PL_VRAM, man);
 		ttm_resource_manager_set_used(man, true);
@@ -211,7 +211,7 @@ nouveau_ttm_init_gtt(struct nouveau_drm *drm)
 
 	man->func = func;
 	man->use_tt = true;
-	ttm_resource_manager_init(man, size_pages);
+	ttm_resource_manager_init(man, &drm->ttm.bdev, size_pages);
 	ttm_set_driver_manager(&drm->ttm.bdev, TTM_PL_TT, man);
 	ttm_resource_manager_set_used(man, true);
 	return 0;
diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
index 55a41355f684..8cd4f3fb9f79 100644
--- a/drivers/gpu/drm/ttm/ttm_range_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
@@ -158,7 +158,7 @@ int ttm_range_man_init_nocheck(struct ttm_device *bdev,
 
 	man->func = &ttm_range_manager_func;
 
-	ttm_resource_manager_init(man, p_size);
+	ttm_resource_manager_init(man, bdev, p_size);
 
 	drm_mm_init(&rman->mm, 0, p_size);
 	spin_lock_init(&rman->lock);
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 89bcfe22a0ca..41e7bf195168 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -126,16 +126,19 @@ EXPORT_SYMBOL(ttm_resource_compat);
  * ttm_resource_manager_init
  *
  * @man: memory manager object to init
+ * @bdev: ttm device this manager belongs to
  * @p_size: size managed area in pages.
  *
  * Initialise core parts of a manager object.
  */
 void ttm_resource_manager_init(struct ttm_resource_manager *man,
+			       struct ttm_device *bdev,
 			       unsigned long p_size)
 {
 	unsigned i;
 
 	spin_lock_init(&man->move_lock);
+	man->bdev = bdev;
 	man->size = p_size;
 
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
diff --git a/drivers/gpu/drm/ttm/ttm_sys_manager.c b/drivers/gpu/drm/ttm/ttm_sys_manager.c
index 135394dcca95..2ced169513cb 100644
--- a/drivers/gpu/drm/ttm/ttm_sys_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_sys_manager.c
@@ -43,7 +43,7 @@ void ttm_sys_man_init(struct ttm_device *bdev)
 	man->use_tt = true;
 	man->func = &ttm_sys_manager_func;
 
-	ttm_resource_manager_init(man, 0);
+	ttm_resource_manager_init(man, bdev, 0);
 	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, man);
 	ttm_resource_manager_set_used(man, true);
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
index 99ccf690f8a1..60e3cc537f36 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
@@ -162,7 +162,7 @@ int vmw_gmrid_man_init(struct vmw_private *dev_priv, int type)
 
 	man->func = &vmw_gmrid_manager_func;
 	man->use_tt = true;
-	ttm_resource_manager_init(man, 0);
+	ttm_resource_manager_init(man, &dev_priv->bdev, 0);
 	spin_lock_init(&gman->lock);
 	gman->used_gmr_pages = 0;
 	ida_init(&gman->gmr_ida);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
index a64188c7268d..d3007bf1b8f5 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
@@ -70,7 +70,7 @@ int vmw_sys_man_init(struct vmw_private *dev_priv)
 	man->use_tt = true;
 	man->func = &vmw_sys_manager_func;
 
-	ttm_resource_manager_init(man, 0);
+	ttm_resource_manager_init(man, bdev, 0);
 	ttm_set_driver_manager(bdev, VMW_PL_SYSTEM, man);
 	ttm_resource_manager_set_used(man, true);
 	return 0;
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index df1f06b7b504..6bf37383002b 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -105,11 +105,11 @@ struct ttm_resource_manager_func {
  * @use_type: The memory type is enabled.
  * @use_tt: If a TT object should be used for the backing store.
  * @size: Size of the managed region.
+ * @bdev: ttm device this manager belongs to
  * @func: structure pointer implementing the range manager. See above
  * @move_lock: lock for move fence
- * static information. bdev::driver::io_mem_free is never used.
- * @lru: The lru list for this memory type.
  * @move: The fence of the last pipelined move operation.
+ * @lru: The lru list for this memory type.
  *
  * This structure is used to identify and manage memory types for a device.
  */
@@ -119,20 +119,21 @@ struct ttm_resource_manager {
 	 */
 	bool use_type;
 	bool use_tt;
+	struct ttm_device *bdev;
 	uint64_t size;
 	const struct ttm_resource_manager_func *func;
 	spinlock_t move_lock;
 
 	/*
-	 * Protected by the global->lru_lock.
+	 * Protected by @move_lock.
 	 */
-
-	struct list_head lru[TTM_MAX_BO_PRIORITY];
+	struct dma_fence *move;
 
 	/*
-	 * Protected by @move_lock.
+	 * Protected by the global->lru_lock.
 	 */
-	struct dma_fence *move;
+
+	struct list_head lru[TTM_MAX_BO_PRIORITY];
 };
 
 /**
@@ -272,6 +273,7 @@ bool ttm_resource_compat(struct ttm_resource *res,
 			 struct ttm_placement *placement);
 
 void ttm_resource_manager_init(struct ttm_resource_manager *man,
+			       struct ttm_device *bdev,
 			       unsigned long p_size);
 
 int ttm_resource_manager_evict_all(struct ttm_device *bdev,
-- 
2.25.1



* [PATCH 03/12] drm/ttm: add a weak BO reference to the resource v3
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
  2022-01-24 12:25 ` [PATCH 01/12] drm/ttm: add ttm_resource_fini Christian König
  2022-01-24 12:25 ` [PATCH 02/12] drm/ttm: add back a reference to the bdev to the res manager Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-24 12:25 ` [PATCH 04/12] drm/ttm: add common accounting to the resource mgr v2 Christian König
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

Keep track of which BO a resource was allocated for.
This is necessary to move the LRU handling into the resources.

A bit problematic is i915, since it tries to use the resource
interface without a BO, which is illegal from a conceptual
point of view.

v2: Document that this is a weak reference and add a workaround for i915
v3: further document that this is protected by ttm_device::lru_lock and
    clarify the i915 workaround
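
For illustration, a minimal sketch of how the weak reference is meant to
be used: res->bo may only be looked at while ttm_device::lru_lock is
held, and a reference has to be taken before the lock is dropped.
my_get_owner() is a hypothetical helper, not part of the patch:

static struct ttm_buffer_object *
my_get_owner(struct ttm_device *bdev, struct ttm_resource *res)
{
	struct ttm_buffer_object *bo;

	spin_lock(&bdev->lru_lock);
	bo = res->bo;				/* weak reference */
	if (bo && !ttm_bo_get_unless_zero(bo))	/* BO might be on its way out */
		bo = NULL;
	spin_unlock(&bdev->lru_lock);

	return bo;	/* caller now owns a reference, or got NULL */
}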

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c  | 7 +++++--
 drivers/gpu/drm/ttm/ttm_resource.c | 9 +++++++++
 include/drm/ttm/ttm_resource.h     | 4 ++++
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 72a94301bc95..544a84fa6589 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -241,6 +241,11 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	if (bo->type != ttm_bo_type_sg)
 		fbo->base.base.resv = &fbo->base.base._resv;
 
+	if (fbo->base.resource) {
+		ttm_resource_set_bo(fbo->base.resource, &fbo->base);
+		bo->resource = NULL;
+	}
+
 	dma_resv_init(&fbo->base.base._resv);
 	fbo->base.base.dev = NULL;
 	ret = dma_resv_trylock(&fbo->base.base._resv);
@@ -509,7 +514,6 @@ static int ttm_bo_move_to_ghost(struct ttm_buffer_object *bo,
 		ghost_obj->ttm = NULL;
 	else
 		bo->ttm = NULL;
-	bo->resource = NULL;
 
 	dma_resv_unlock(&ghost_obj->base._resv);
 	ttm_bo_put(ghost_obj);
@@ -637,7 +641,6 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
 	dma_resv_unlock(&ghost->base._resv);
 	ttm_bo_put(ghost);
 	bo->ttm = ttm;
-	bo->resource = NULL;
 	ttm_bo_assign_mem(bo, sys_res);
 	return 0;
 
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 41e7bf195168..7fdd58b53c61 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -41,6 +41,7 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
 	res->bus.offset = 0;
 	res->bus.is_iomem = false;
 	res->bus.caching = ttm_cached;
+	res->bo = bo;
 }
 EXPORT_SYMBOL(ttm_resource_init);
 
@@ -122,6 +123,14 @@ bool ttm_resource_compat(struct ttm_resource *res,
 }
 EXPORT_SYMBOL(ttm_resource_compat);
 
+void ttm_resource_set_bo(struct ttm_resource *res,
+			 struct ttm_buffer_object *bo)
+{
+	spin_lock(&bo->bdev->lru_lock);
+	res->bo = bo;
+	spin_unlock(&bo->bdev->lru_lock);
+}
+
 /**
  * ttm_resource_manager_init
  *
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 6bf37383002b..69eea9d6399b 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -161,6 +161,7 @@ struct ttm_bus_placement {
  * @mem_type: Resource type of the allocation.
  * @placement: Placement flags.
  * @bus: Placement on io bus accessible to the CPU
+ * @bo: weak reference to the BO, protected by ttm_device::lru_lock
  *
  * Structure indicating the placement and space resources used by a
  * buffer object.
@@ -171,6 +172,7 @@ struct ttm_resource {
 	uint32_t mem_type;
 	uint32_t placement;
 	struct ttm_bus_placement bus;
+	struct ttm_buffer_object *bo;
 };
 
 /**
@@ -271,6 +273,8 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo,
 void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res);
 bool ttm_resource_compat(struct ttm_resource *res,
 			 struct ttm_placement *placement);
+void ttm_resource_set_bo(struct ttm_resource *res,
+			 struct ttm_buffer_object *bo);
 
 void ttm_resource_manager_init(struct ttm_resource_manager *man,
 			       struct ttm_device *bdev,
-- 
2.25.1



* [PATCH 04/12] drm/ttm: add common accounting to the resource mgr v2
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
                   ` (2 preceding siblings ...)
  2022-01-24 12:25 ` [PATCH 03/12] drm/ttm: add a weak BO reference to the resource v3 Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-25 16:37   ` Daniel Vetter
  2022-01-24 12:25 ` [PATCH 05/12] drm/ttm: move the LRU into resource handling v2 Christian König
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

It makes sense to have this in the common manager for debugging and for
accounting how much of each resource domain is used.

v2: cleanup kerneldoc a bit
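
For illustration, a sketch of how a driver could read the new counter
through ttm_resource_manager_usage(); my_report_vram() is a hypothetical
helper and the counter is in bytes:

static void my_report_vram(struct ttm_device *bdev, struct drm_printer *p)
{
	struct ttm_resource_manager *man =
		ttm_manager_type(bdev, TTM_PL_VRAM);

	drm_printf(p, "VRAM usage: %llu bytes\n",
		   (unsigned long long)ttm_resource_manager_usage(man));
}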

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
---
 drivers/gpu/drm/ttm/ttm_resource.c |  8 ++++++++
 include/drm/ttm/ttm_resource.h     | 20 +++++++++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 7fdd58b53c61..b8362492980d 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -33,6 +33,8 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
                        const struct ttm_place *place,
                        struct ttm_resource *res)
 {
+	struct ttm_resource_manager *man;
+
 	res->start = 0;
 	res->num_pages = PFN_UP(bo->base.size);
 	res->mem_type = place->mem_type;
@@ -42,12 +44,16 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
 	res->bus.is_iomem = false;
 	res->bus.caching = ttm_cached;
 	res->bo = bo;
+
+	man = ttm_manager_type(bo->bdev, place->mem_type);
+	atomic64_add(bo->base.size, &man->usage);
 }
 EXPORT_SYMBOL(ttm_resource_init);
 
 void ttm_resource_fini(struct ttm_resource_manager *man,
 		       struct ttm_resource *res)
 {
+	atomic64_sub(res->bo->base.size, &man->usage);
 }
 EXPORT_SYMBOL(ttm_resource_fini);
 
@@ -149,6 +155,7 @@ void ttm_resource_manager_init(struct ttm_resource_manager *man,
 	spin_lock_init(&man->move_lock);
 	man->bdev = bdev;
 	man->size = p_size;
+	atomic64_set(&man->usage, 0);
 
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
 		INIT_LIST_HEAD(&man->lru[i]);
@@ -221,6 +228,7 @@ void ttm_resource_manager_debug(struct ttm_resource_manager *man,
 	drm_printf(p, "  use_type: %d\n", man->use_type);
 	drm_printf(p, "  use_tt: %d\n", man->use_tt);
 	drm_printf(p, "  size: %llu\n", man->size);
+	drm_printf(p, "  usage: %llu\n", atomic64_read(&man->usage));
 	if (man->func->debug)
 		man->func->debug(man, p);
 }
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 69eea9d6399b..3d391279b33f 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -27,6 +27,7 @@
 
 #include <linux/types.h>
 #include <linux/mutex.h>
+#include <linux/atomic.h>
 #include <linux/dma-buf-map.h>
 #include <linux/dma-fence.h>
 #include <drm/drm_print.h>
@@ -132,8 +133,12 @@ struct ttm_resource_manager {
 	/*
 	 * Protected by the global->lru_lock.
 	 */
-
 	struct list_head lru[TTM_MAX_BO_PRIORITY];
+
+	/**
+	 * @usage: How much of the region is used, has its own protection.
+	 */
+	atomic64_t usage;
 };
 
 /**
@@ -261,6 +266,19 @@ ttm_resource_manager_cleanup(struct ttm_resource_manager *man)
 	man->move = NULL;
 }
 
+/**
+ * ttm_resource_manager_usage
+ *
+ * @man: A memory manager object.
+ *
+ * Return how many resources are currently used.
+ */
+static inline uint64_t
+ttm_resource_manager_usage(struct ttm_resource_manager *man)
+{
+	return atomic64_read(&man->usage);
+}
+
 void ttm_resource_init(struct ttm_buffer_object *bo,
                        const struct ttm_place *place,
                        struct ttm_resource *res);
-- 
2.25.1



* [PATCH 05/12] drm/ttm: move the LRU into resource handling v2
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
                   ` (3 preceding siblings ...)
  2022-01-24 12:25 ` [PATCH 04/12] drm/ttm: add common accounting to the resource mgr v2 Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-25 16:52   ` Daniel Vetter
  2022-01-24 12:25 ` [PATCH 06/12] drm/ttm: add resource iterator Christian König
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

This way we finally fix the problem that new resources are
not immediately evictable after allocation.

That has caused numerous problems, including OOM on GDS handling
and not being able to use TTM as a general resource manager.

v2: stop assuming in ttm_resource_fini that res->bo is still valid.
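
For illustration, a sketch of the driver-facing flow (mirroring the
amdgpu_vm change below) with hypothetical my_vm_* helpers: the first
pass records the span of resources in the bulk move structure, later
passes just splice that span to the LRU tail. The splice is only valid
as long as the recorded resource order does not change:

static void my_vm_record_lru(struct ttm_device *bdev,
			     struct ttm_lru_bulk_move *bulk,
			     struct ttm_buffer_object **bos,
			     unsigned int count)
{
	unsigned int i;

	ttm_lru_bulk_move_init(bulk);		/* reset first/last markers */

	spin_lock(&bdev->lru_lock);
	for (i = 0; i < count; ++i)		/* each bo's resv must be held */
		ttm_bo_move_to_lru_tail(bos[i], bulk);
	spin_unlock(&bdev->lru_lock);
}

static void my_vm_bump_lru(struct ttm_device *bdev,
			   struct ttm_lru_bulk_move *bulk)
{
	spin_lock(&bdev->lru_lock);
	ttm_lru_bulk_move_tail(bulk);		/* splice the recorded span */
	spin_unlock(&bdev->lru_lock);
}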

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |   8 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c |   2 +-
 drivers/gpu/drm/ttm/ttm_bo.c            | 115 ++--------------------
 drivers/gpu/drm/ttm/ttm_bo_util.c       |   1 -
 drivers/gpu/drm/ttm/ttm_device.c        |  64 ++++++-------
 drivers/gpu/drm/ttm/ttm_resource.c      | 122 +++++++++++++++++++++++-
 include/drm/ttm/ttm_bo_api.h            |  16 ----
 include/drm/ttm/ttm_bo_driver.h         |  29 +-----
 include/drm/ttm/ttm_resource.h          |  35 +++++++
 9 files changed, 201 insertions(+), 191 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a96ae4c0e040..cdb489cdf27b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -683,12 +683,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
 
 	if (vm->bulk_moveable) {
 		spin_lock(&adev->mman.bdev.lru_lock);
-		ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
+		ttm_lru_bulk_move_tail(&vm->lru_bulk_move);
 		spin_unlock(&adev->mman.bdev.lru_lock);
 		return;
 	}
 
-	memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
+	ttm_lru_bulk_move_init(&vm->lru_bulk_move);
 
 	spin_lock(&adev->mman.bdev.lru_lock);
 	list_for_each_entry(bo_base, &vm->idle, vm_status) {
@@ -698,11 +698,9 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
 		if (!bo->parent)
 			continue;
 
-		ttm_bo_move_to_lru_tail(&bo->tbo, bo->tbo.resource,
-					&vm->lru_bulk_move);
+		ttm_bo_move_to_lru_tail(&bo->tbo, &vm->lru_bulk_move);
 		if (shadow)
 			ttm_bo_move_to_lru_tail(&shadow->tbo,
-						shadow->tbo.resource,
 						&vm->lru_bulk_move);
 	}
 	spin_unlock(&adev->mman.bdev.lru_lock);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 218a9b3037c7..cdf1ff87c88d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -835,7 +835,7 @@ void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 			bo->priority = I915_TTM_PRIO_NO_PAGES;
 	}
 
-	ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
+	ttm_bo_move_to_lru_tail(bo, NULL);
 	spin_unlock(&bo->bdev->lru_lock);
 }
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index db3dc7ef5382..cb0fa932d495 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -69,108 +69,16 @@ static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo,
 	}
 }
 
-static inline void ttm_bo_move_to_pinned(struct ttm_buffer_object *bo)
-{
-	struct ttm_device *bdev = bo->bdev;
-
-	list_move_tail(&bo->lru, &bdev->pinned);
-
-	if (bdev->funcs->del_from_lru_notify)
-		bdev->funcs->del_from_lru_notify(bo);
-}
-
-static inline void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
-{
-	struct ttm_device *bdev = bo->bdev;
-
-	list_del_init(&bo->lru);
-
-	if (bdev->funcs->del_from_lru_notify)
-		bdev->funcs->del_from_lru_notify(bo);
-}
-
-static void ttm_bo_bulk_move_set_pos(struct ttm_lru_bulk_move_pos *pos,
-				     struct ttm_buffer_object *bo)
-{
-	if (!pos->first)
-		pos->first = bo;
-	pos->last = bo;
-}
-
 void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
-			     struct ttm_resource *mem,
 			     struct ttm_lru_bulk_move *bulk)
 {
-	struct ttm_device *bdev = bo->bdev;
-	struct ttm_resource_manager *man;
-
-	if (!bo->deleted)
-		dma_resv_assert_held(bo->base.resv);
-
-	if (bo->pin_count) {
-		ttm_bo_move_to_pinned(bo);
-		return;
-	}
-
-	if (!mem)
-		return;
-
-	man = ttm_manager_type(bdev, mem->mem_type);
-	list_move_tail(&bo->lru, &man->lru[bo->priority]);
-
-	if (bdev->funcs->del_from_lru_notify)
-		bdev->funcs->del_from_lru_notify(bo);
-
-	if (bulk && !bo->pin_count) {
-		switch (bo->resource->mem_type) {
-		case TTM_PL_TT:
-			ttm_bo_bulk_move_set_pos(&bulk->tt[bo->priority], bo);
-			break;
+	dma_resv_assert_held(bo->base.resv);
 
-		case TTM_PL_VRAM:
-			ttm_bo_bulk_move_set_pos(&bulk->vram[bo->priority], bo);
-			break;
-		}
-	}
+	if (bo->resource)
+		ttm_resource_move_to_lru_tail(bo->resource, bulk);
 }
 EXPORT_SYMBOL(ttm_bo_move_to_lru_tail);
 
-void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move *bulk)
-{
-	unsigned i;
-
-	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		struct ttm_lru_bulk_move_pos *pos = &bulk->tt[i];
-		struct ttm_resource_manager *man;
-
-		if (!pos->first)
-			continue;
-
-		dma_resv_assert_held(pos->first->base.resv);
-		dma_resv_assert_held(pos->last->base.resv);
-
-		man = ttm_manager_type(pos->first->bdev, TTM_PL_TT);
-		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
-				    &pos->last->lru);
-	}
-
-	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		struct ttm_lru_bulk_move_pos *pos = &bulk->vram[i];
-		struct ttm_resource_manager *man;
-
-		if (!pos->first)
-			continue;
-
-		dma_resv_assert_held(pos->first->base.resv);
-		dma_resv_assert_held(pos->last->base.resv);
-
-		man = ttm_manager_type(pos->first->bdev, TTM_PL_VRAM);
-		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
-				    &pos->last->lru);
-	}
-}
-EXPORT_SYMBOL(ttm_bo_bulk_move_lru_tail);
-
 static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
 				  struct ttm_resource *mem, bool evict,
 				  struct ttm_operation_ctx *ctx,
@@ -344,7 +252,6 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 		return ret;
 	}
 
-	ttm_bo_move_to_pinned(bo);
 	list_del_init(&bo->ddestroy);
 	spin_unlock(&bo->bdev->lru_lock);
 	ttm_bo_cleanup_memtype_use(bo);
@@ -445,7 +352,7 @@ static void ttm_bo_release(struct kref *kref)
 		 */
 		if (bo->pin_count) {
 			bo->pin_count = 0;
-			ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
+			ttm_resource_move_to_lru_tail(bo->resource, NULL);
 		}
 
 		kref_init(&bo->kref);
@@ -458,7 +365,6 @@ static void ttm_bo_release(struct kref *kref)
 	}
 
 	spin_lock(&bo->bdev->lru_lock);
-	ttm_bo_del_from_lru(bo);
 	list_del(&bo->ddestroy);
 	spin_unlock(&bo->bdev->lru_lock);
 
@@ -673,15 +579,17 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
 			struct ww_acquire_ctx *ticket)
 {
 	struct ttm_buffer_object *bo = NULL, *busy_bo = NULL;
+	struct ttm_resource *res;
 	bool locked = false;
 	unsigned i;
 	int ret;
 
 	spin_lock(&bdev->lru_lock);
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		list_for_each_entry(bo, &man->lru[i], lru) {
+		list_for_each_entry(res, &man->lru[i], lru) {
 			bool busy;
 
+			bo = res->bo;
 			if (!ttm_bo_evict_swapout_allowable(bo, ctx, place,
 							    &locked, &busy)) {
 				if (busy && !busy_bo && ticket !=
@@ -699,7 +607,7 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
 		}
 
 		/* If the inner loop terminated early, we have our candidate */
-		if (&bo->lru != &man->lru[i])
+		if (&res->lru != &man->lru[i])
 			break;
 
 		bo = NULL;
@@ -875,9 +783,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
 	}
 
 error:
-	if (bo->resource->mem_type == TTM_PL_SYSTEM && !bo->pin_count)
-		ttm_bo_move_to_lru_tail_unlocked(bo);
-
 	return ret;
 }
 EXPORT_SYMBOL(ttm_bo_mem_space);
@@ -971,7 +876,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
 	bo->destroy = destroy ? destroy : ttm_bo_default_destroy;
 
 	kref_init(&bo->kref);
-	INIT_LIST_HEAD(&bo->lru);
 	INIT_LIST_HEAD(&bo->ddestroy);
 	bo->bdev = bdev;
 	bo->type = type;
@@ -1021,8 +925,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
 		return ret;
 	}
 
-	ttm_bo_move_to_lru_tail_unlocked(bo);
-
 	return ret;
 }
 EXPORT_SYMBOL(ttm_bo_init_reserved);
@@ -1123,7 +1025,6 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
 		return ret == -EBUSY ? -ENOSPC : ret;
 	}
 
-	ttm_bo_move_to_pinned(bo);
 	/* TODO: Cleanup the locking */
 	spin_unlock(&bo->bdev->lru_lock);
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 544a84fa6589..0163e92b57af 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -231,7 +231,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 
 	atomic_inc(&ttm_glob.bo_count);
 	INIT_LIST_HEAD(&fbo->base.ddestroy);
-	INIT_LIST_HEAD(&fbo->base.lru);
 	fbo->base.moving = NULL;
 	drm_vma_node_reset(&fbo->base.base.vma_node);
 
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index be24bb6cefd0..ba35887147ba 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -144,6 +144,7 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 {
 	struct ttm_resource_manager *man;
 	struct ttm_buffer_object *bo;
+	struct ttm_resource *res;
 	unsigned i, j;
 	int ret;
 
@@ -154,8 +155,11 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 			continue;
 
 		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
-			list_for_each_entry(bo, &man->lru[j], lru) {
-				uint32_t num_pages = PFN_UP(bo->base.size);
+			list_for_each_entry(res, &man->lru[j], lru) {
+				uint32_t num_pages;
+
+				bo = res->bo;
+				num_pages = PFN_UP(bo->base.size);
 
 				ret = ttm_bo_swapout(bo, ctx, gfp_flags);
 				/* ttm_bo_swapout has dropped the lru_lock */
@@ -259,49 +263,45 @@ void ttm_device_fini(struct ttm_device *bdev)
 }
 EXPORT_SYMBOL(ttm_device_fini);
 
-void ttm_device_clear_dma_mappings(struct ttm_device *bdev)
+static void ttm_device_clear_lru_dma_mappings(struct ttm_device *bdev,
+					      struct list_head *list)
 {
-	struct ttm_resource_manager *man;
-	struct ttm_buffer_object *bo;
-	unsigned int i, j;
+	struct ttm_resource *res;
 
 	spin_lock(&bdev->lru_lock);
-	while (!list_empty(&bdev->pinned)) {
-		bo = list_first_entry(&bdev->pinned, struct ttm_buffer_object, lru);
+	while ((res = list_first_entry_or_null(list, typeof(*res), lru))) {
+		struct ttm_buffer_object *bo = res->bo;
+
 		/* Take ref against racing releases once lru_lock is unlocked */
-		if (ttm_bo_get_unless_zero(bo)) {
-			list_del_init(&bo->lru);
-			spin_unlock(&bdev->lru_lock);
+		if (!ttm_bo_get_unless_zero(bo))
+			continue;
 
-			if (bo->ttm)
-				ttm_tt_unpopulate(bo->bdev, bo->ttm);
+		list_del_init(&res->lru);
+		spin_unlock(&bdev->lru_lock);
 
-			ttm_bo_put(bo);
-			spin_lock(&bdev->lru_lock);
-		}
+		if (bo->ttm)
+			ttm_tt_unpopulate(bo->bdev, bo->ttm);
+
+		ttm_bo_put(bo);
+		spin_lock(&bdev->lru_lock);
 	}
+	spin_unlock(&bdev->lru_lock);
+}
+
+void ttm_device_clear_dma_mappings(struct ttm_device *bdev)
+{
+	struct ttm_resource_manager *man;
+	unsigned int i, j;
+
+	ttm_device_clear_lru_dma_mappings(bdev, &bdev->pinned);
 
 	for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
 		man = ttm_manager_type(bdev, i);
 		if (!man || !man->use_tt)
 			continue;
 
-		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
-			while (!list_empty(&man->lru[j])) {
-				bo = list_first_entry(&man->lru[j], struct ttm_buffer_object, lru);
-				if (ttm_bo_get_unless_zero(bo)) {
-					list_del_init(&bo->lru);
-					spin_unlock(&bdev->lru_lock);
-
-					if (bo->ttm)
-						ttm_tt_unpopulate(bo->bdev, bo->ttm);
-
-					ttm_bo_put(bo);
-					spin_lock(&bdev->lru_lock);
-				}
-			}
-		}
+		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j)
+			ttm_device_clear_lru_dma_mappings(bdev, &man->lru[j]);
 	}
-	spin_unlock(&bdev->lru_lock);
 }
 EXPORT_SYMBOL(ttm_device_clear_dma_mappings);
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index b8362492980d..450e665c357b 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -29,6 +29,106 @@
 #include <drm/ttm/ttm_resource.h>
 #include <drm/ttm/ttm_bo_driver.h>
 
+/**
+ * ttm_lru_bulk_move_init - initialize a bulk move structure
+ * @bulk: the structure to init
+ *
+ * For now just memset the structure to zero.
+ */
+void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk)
+{
+	memset(bulk, 0, sizeof(*bulk));
+}
+EXPORT_SYMBOL(ttm_lru_bulk_move_init);
+
+/**
+ * ttm_lru_bulk_move_tail
+ *
+ * @bulk: bulk move structure
+ *
+ * Bulk move BOs to the LRU tail, only valid to use when driver makes sure that
+ * resource order never changes. Should be called with ttm_device::lru_lock held.
+ */
+void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk)
+{
+	unsigned i;
+
+	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
+		struct ttm_lru_bulk_move_pos *pos = &bulk->tt[i];
+		struct ttm_resource_manager *man;
+
+		if (!pos->first)
+			continue;
+
+		dma_resv_assert_held(pos->first->bo->base.resv);
+		dma_resv_assert_held(pos->last->bo->base.resv);
+
+		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_TT);
+		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
+				    &pos->last->lru);
+	}
+
+	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
+		struct ttm_lru_bulk_move_pos *pos = &bulk->vram[i];
+		struct ttm_resource_manager *man;
+
+		if (!pos->first)
+			continue;
+
+		dma_resv_assert_held(pos->first->bo->base.resv);
+		dma_resv_assert_held(pos->last->bo->base.resv);
+
+		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_VRAM);
+		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
+				    &pos->last->lru);
+	}
+}
+EXPORT_SYMBOL(ttm_lru_bulk_move_tail);
+
+/* Record a resource position in a bulk move structure */
+static void ttm_lru_bulk_move_set_pos(struct ttm_lru_bulk_move_pos *pos,
+				      struct ttm_resource *res)
+{
+	if (!pos->first)
+		pos->first = res;
+	pos->last = res;
+}
+
+/* Move a resource to the LRU tail and track the bulk position */
+void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
+				   struct ttm_lru_bulk_move *bulk)
+{
+	struct ttm_buffer_object *bo = res->bo;
+	struct ttm_device *bdev = bo->bdev;
+	struct ttm_resource_manager *man;
+
+	if (bo->pin_count) {
+		list_move_tail(&res->lru, &bdev->pinned);
+		if (bdev->funcs->del_from_lru_notify)
+			bdev->funcs->del_from_lru_notify(res->bo);
+		return;
+	}
+
+	man = ttm_manager_type(bdev, res->mem_type);
+	list_move_tail(&res->lru, &man->lru[bo->priority]);
+
+	if (bdev->funcs->del_from_lru_notify)
+		bdev->funcs->del_from_lru_notify(bo);
+
+	if (!bulk)
+		return;
+
+	switch (res->mem_type) {
+	case TTM_PL_TT:
+		ttm_lru_bulk_move_set_pos(&bulk->tt[bo->priority], res);
+		break;
+
+	case TTM_PL_VRAM:
+		ttm_lru_bulk_move_set_pos(&bulk->vram[bo->priority], res);
+		break;
+	}
+}
+
 void ttm_resource_init(struct ttm_buffer_object *bo,
                        const struct ttm_place *place,
                        struct ttm_resource *res)
@@ -44,16 +144,36 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
 	res->bus.is_iomem = false;
 	res->bus.caching = ttm_cached;
 	res->bo = bo;
+	INIT_LIST_HEAD(&res->lru);
 
 	man = ttm_manager_type(bo->bdev, place->mem_type);
 	atomic64_add(bo->base.size, &man->usage);
+
+	spin_lock(&bo->bdev->lru_lock);
+	ttm_resource_move_to_lru_tail(res, NULL);
+	spin_unlock(&bo->bdev->lru_lock);
 }
 EXPORT_SYMBOL(ttm_resource_init);
 
+/**
+ * ttm_resource_fini
+ *
+ * @res: the resource to clean up
+ *
+ * Make sure the resource is removed from the LRU before destruction.
+ */
 void ttm_resource_fini(struct ttm_resource_manager *man,
 		       struct ttm_resource *res)
 {
-	atomic64_sub(res->bo->base.size, &man->usage);
+	struct ttm_device *bdev = man->bdev;
+
+	spin_lock(&bdev->lru_lock);
+	list_del_init(&res->lru);
+	if (res->bo && bdev->funcs->del_from_lru_notify)
+		bdev->funcs->del_from_lru_notify(res->bo);
+	spin_unlock(&bdev->lru_lock);
+
+	atomic64_sub(res->num_pages << PAGE_SHIFT, &man->usage);
 }
 EXPORT_SYMBOL(ttm_resource_fini);
 
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index c17b2df9178b..3da77fc54552 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -55,8 +55,6 @@ struct ttm_placement;
 
 struct ttm_place;
 
-struct ttm_lru_bulk_move;
-
 /**
  * enum ttm_bo_type
  *
@@ -94,7 +92,6 @@ struct ttm_tt;
  * @ttm: TTM structure holding system pages.
  * @evicted: Whether the object was evicted without user-space knowing.
  * @deleted: True if the object is only a zombie and already deleted.
- * @lru: List head for the lru list.
  * @ddestroy: List head for the delayed destroy list.
  * @swap: List head for swap LRU list.
  * @moving: Fence set when BO is moving
@@ -143,7 +140,6 @@ struct ttm_buffer_object {
 	 * Members protected by the bdev::lru_lock.
 	 */
 
-	struct list_head lru;
 	struct list_head ddestroy;
 
 	/**
@@ -295,7 +291,6 @@ void ttm_bo_put(struct ttm_buffer_object *bo);
  * ttm_bo_move_to_lru_tail
  *
  * @bo: The buffer object.
- * @mem: Resource object.
  * @bulk: optional bulk move structure to remember BO positions
  *
  * Move this BO to the tail of all lru lists used to lookup and reserve an
@@ -303,19 +298,8 @@ void ttm_bo_put(struct ttm_buffer_object *bo);
  * held, and is used to make a BO less likely to be considered for eviction.
  */
 void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
-			     struct ttm_resource *mem,
 			     struct ttm_lru_bulk_move *bulk);
 
-/**
- * ttm_bo_bulk_move_lru_tail
- *
- * @bulk: bulk move structure
- *
- * Bulk move BOs to the LRU tail, only valid to use when driver makes sure that
- * BO order never changes. Should be called with ttm_global::lru_lock held.
- */
-void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move *bulk);
-
 /**
  * ttm_bo_lock_delayed_workqueue
  *
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 5f087575194b..6c7352e13708 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -45,33 +45,6 @@
 #include "ttm_tt.h"
 #include "ttm_pool.h"
 
-/**
- * struct ttm_lru_bulk_move_pos
- *
- * @first: first BO in the bulk move range
- * @last: last BO in the bulk move range
- *
- * Positions for a lru bulk move.
- */
-struct ttm_lru_bulk_move_pos {
-	struct ttm_buffer_object *first;
-	struct ttm_buffer_object *last;
-};
-
-/**
- * struct ttm_lru_bulk_move
- *
- * @tt: first/last lru entry for BOs in the TT domain
- * @vram: first/last lru entry for BOs in the VRAM domain
- * @swap: first/last lru entry for BOs on the swap list
- *
- * Helper structure for bulk moves on the LRU list.
- */
-struct ttm_lru_bulk_move {
-	struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
-	struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
-};
-
 /*
  * ttm_bo.c
  */
@@ -182,7 +155,7 @@ static inline void
 ttm_bo_move_to_lru_tail_unlocked(struct ttm_buffer_object *bo)
 {
 	spin_lock(&bo->bdev->lru_lock);
-	ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
+	ttm_bo_move_to_lru_tail(bo, NULL);
 	spin_unlock(&bo->bdev->lru_lock);
 }
 
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 3d391279b33f..a54d52517a30 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -26,10 +26,12 @@
 #define _TTM_RESOURCE_H_
 
 #include <linux/types.h>
+#include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/atomic.h>
 #include <linux/dma-buf-map.h>
 #include <linux/dma-fence.h>
+
 #include <drm/drm_print.h>
 #include <drm/ttm/ttm_caching.h>
 #include <drm/ttm/ttm_kmap_iter.h>
@@ -178,6 +180,33 @@ struct ttm_resource {
 	uint32_t placement;
 	struct ttm_bus_placement bus;
 	struct ttm_buffer_object *bo;
+	struct list_head lru;
+};
+
+/**
+ * struct ttm_lru_bulk_move_pos
+ *
+ * @first: first res in the bulk move range
+ * @last: last res in the bulk move range
+ *
+ * Positions for a lru bulk move.
+ */
+struct ttm_lru_bulk_move_pos {
+	struct ttm_resource *first;
+	struct ttm_resource *last;
+};
+
+/**
+ * struct ttm_lru_bulk_move
+ *
+ * @tt: first/last lru entry for resources in the TT domain
+ * @vram: first/last lru entry for resources in the VRAM domain
+ *
+ * Helper structure for bulk moves on the LRU list.
+ */
+struct ttm_lru_bulk_move {
+	struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
+	struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
 };
 
 /**
@@ -279,6 +308,12 @@ ttm_resource_manager_usage(struct ttm_resource_manager *man)
 	return atomic64_read(&man->usage);
 }
 
+void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk);
+void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk);
+
+void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
+				   struct ttm_lru_bulk_move *bulk);
+
 void ttm_resource_init(struct ttm_buffer_object *bo,
                        const struct ttm_place *place,
                        struct ttm_resource *res);
-- 
2.25.1



* [PATCH 06/12] drm/ttm: add resource iterator
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
                   ` (4 preceding siblings ...)
  2022-01-24 12:25 ` [PATCH 05/12] drm/ttm: move the LRU into resource handling v2 Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-25 16:56   ` Daniel Vetter
  2022-01-24 12:25 ` [PATCH 07/12] drm/radeon: remove resource accounting Christian König
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

Instead of duplicating that logic in different places, add an iterator over
all the resources in a resource manager.
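
For illustration, a sketch of using the new iterator; my_count_resources()
is a hypothetical caller, the LRU lock must be held and the macro walks
all priority lists of one manager:

static unsigned int my_count_resources(struct ttm_device *bdev,
				       struct ttm_resource_manager *man)
{
	struct ttm_resource_cursor cursor;
	struct ttm_resource *res;
	unsigned int count = 0;

	spin_lock(&bdev->lru_lock);
	ttm_resource_manager_for_each_res(man, &cursor, res)
		++count;
	spin_unlock(&bdev->lru_lock);

	return count;
}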

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c       | 41 +++++++++++----------------
 drivers/gpu/drm/ttm/ttm_device.c   | 26 ++++++++---------
 drivers/gpu/drm/ttm/ttm_resource.c | 45 ++++++++++++++++++++++++++++++
 include/drm/ttm/ttm_resource.h     | 23 +++++++++++++++
 4 files changed, 95 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index cb0fa932d495..599be3dda8a9 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -579,38 +579,29 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
 			struct ww_acquire_ctx *ticket)
 {
 	struct ttm_buffer_object *bo = NULL, *busy_bo = NULL;
+	struct ttm_resource_cursor cursor;
 	struct ttm_resource *res;
 	bool locked = false;
-	unsigned i;
 	int ret;
 
 	spin_lock(&bdev->lru_lock);
-	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		list_for_each_entry(res, &man->lru[i], lru) {
-			bool busy;
-
-			bo = res->bo;
-			if (!ttm_bo_evict_swapout_allowable(bo, ctx, place,
-							    &locked, &busy)) {
-				if (busy && !busy_bo && ticket !=
-				    dma_resv_locking_ctx(bo->base.resv))
-					busy_bo = bo;
-				continue;
-			}
-
-			if (!ttm_bo_get_unless_zero(bo)) {
-				if (locked)
-					dma_resv_unlock(bo->base.resv);
-				continue;
-			}
-			break;
+	ttm_resource_manager_for_each_res(man, &cursor, res) {
+		bool busy;
+
+		if (!ttm_bo_evict_swapout_allowable(res->bo, ctx, place,
+						    &locked, &busy)) {
+			if (busy && !busy_bo && ticket !=
+			    dma_resv_locking_ctx(bo->base.resv))
+				busy_bo = res->bo;
+			continue;
 		}
 
-		/* If the inner loop terminated early, we have our candidate */
-		if (&res->lru != &man->lru[i])
-			break;
-
-		bo = NULL;
+		if (!ttm_bo_get_unless_zero(res->bo)) {
+			if (locked)
+				dma_resv_unlock(res->bo->base.resv);
+			continue;
+		}
+		bo = res->bo;
 	}
 
 	if (!bo) {
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index ba35887147ba..a0562ab386f5 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -142,10 +142,10 @@ EXPORT_SYMBOL(ttm_global_swapout);
 int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 		       gfp_t gfp_flags)
 {
+	struct ttm_resource_cursor cursor;
 	struct ttm_resource_manager *man;
-	struct ttm_buffer_object *bo;
 	struct ttm_resource *res;
-	unsigned i, j;
+	unsigned i;
 	int ret;
 
 	spin_lock(&bdev->lru_lock);
@@ -154,20 +154,16 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 		if (!man || !man->use_tt)
 			continue;
 
-		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
-			list_for_each_entry(res, &man->lru[j], lru) {
-				uint32_t num_pages;
-
-				bo = res->bo;
-				num_pages = PFN_UP(bo->base.size);
+		ttm_resource_manager_for_each_res(man, &cursor, res) {
+			struct ttm_buffer_object *bo = res->bo;
+			uint32_t num_pages = PFN_UP(bo->base.size);
 
-				ret = ttm_bo_swapout(bo, ctx, gfp_flags);
-				/* ttm_bo_swapout has dropped the lru_lock */
-				if (!ret)
-					return num_pages;
-				if (ret != -EBUSY)
-					return ret;
-			}
+			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
+			/* ttm_bo_swapout has dropped the lru_lock */
+			if (!ret)
+				return num_pages;
+			if (ret != -EBUSY)
+				return ret;
 		}
 	}
 	spin_unlock(&bdev->lru_lock);
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 450e665c357b..9e68d36a1546 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -354,6 +354,51 @@ void ttm_resource_manager_debug(struct ttm_resource_manager *man,
 }
 EXPORT_SYMBOL(ttm_resource_manager_debug);
 
+/**
+ * ttm_resource_manager_first
+ *
+ * @man: resource manager to iterate over
+ * @cursor: cursor to record the position
+ *
+ * Returns the first resource from the resource manager.
+ */
+struct ttm_resource *
+ttm_resource_manager_first(struct ttm_resource_manager *man,
+			   struct ttm_resource_cursor *cursor)
+{
+	struct ttm_resource *res;
+
+	for (cursor->priority = 0; cursor->priority < TTM_MAX_BO_PRIORITY;
+	     ++cursor->priority)
+		list_for_each_entry(res, &man->lru[cursor->priority], lru)
+			return res;
+
+	return NULL;
+}
+
+/**
+ * ttm_resource_manager_next
+ *
+ * @man: resource manager to iterate over
+ * @cursor: cursor to record the position
+ *
+ * Returns the next resource from the resource manager.
+ */
+struct ttm_resource *
+ttm_resource_manager_next(struct ttm_resource_manager *man,
+			  struct ttm_resource_cursor *cursor,
+			  struct ttm_resource *res)
+{
+	list_for_each_entry_continue(res, &man->lru[cursor->priority], lru)
+		return res;
+
+	for (; cursor->priority < TTM_MAX_BO_PRIORITY; ++cursor->priority)
+		list_for_each_entry(res, &man->lru[cursor->priority], lru)
+			return res;
+
+	return NULL;
+}
+
 static void ttm_kmap_iter_iomap_map_local(struct ttm_kmap_iter *iter,
 					  struct dma_buf_map *dmap,
 					  pgoff_t i)
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index a54d52517a30..13da5e337350 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -183,6 +183,17 @@ struct ttm_resource {
 	struct list_head lru;
 };
 
+/**
+ * struct ttm_resource_cursor
+ *
+ * @priority: the current priority
+ *
+ * Cursor to iterate over the resources in a manager.
+ */
+struct ttm_resource_cursor {
+	unsigned int priority;
+};
+
 /**
  * struct ttm_lru_bulk_move_pos
  *
@@ -339,6 +350,18 @@ int ttm_resource_manager_evict_all(struct ttm_device *bdev,
 void ttm_resource_manager_debug(struct ttm_resource_manager *man,
 				struct drm_printer *p);
 
+struct ttm_resource *
+ttm_resource_manager_first(struct ttm_resource_manager *man,
+			   struct ttm_resource_cursor *cursor);
+struct ttm_resource *
+ttm_resource_manager_next(struct ttm_resource_manager *man,
+			  struct ttm_resource_cursor *cursor,
+			  struct ttm_resource *res);
+
+#define ttm_resource_manager_for_each_res(man, cursor, res)		\
+	for (res = ttm_resource_manager_first(man, cursor); res;	\
+	     res = ttm_resource_manager_next(man, cursor, res))
+
 struct ttm_kmap_iter *
 ttm_kmap_iter_iomap_init(struct ttm_kmap_iter_iomap *iter_io,
 			 struct io_mapping *iomap,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread
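
A rough illustration of how the new iterator is meant to be used, as a
minimal sketch (not part of the series; the helper name is made up, the
interfaces are the ones added above). The cursor only records the
current priority, so the whole walk has to run under bdev->lru_lock,
just like the swapout loop above:

#include <drm/ttm/ttm_resource.h>

/* Hypothetical helper: count the resources currently on a manager's
 * LRU lists. The caller must hold bdev->lru_lock for the whole walk.
 */
static unsigned int example_count_resources(struct ttm_resource_manager *man)
{
	struct ttm_resource_cursor cursor;
	struct ttm_resource *res;
	unsigned int count = 0;

	ttm_resource_manager_for_each_res(man, &cursor, res)
		++count;

	return count;
}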

* [PATCH 07/12] drm/radeon: remove resource accounting
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
                   ` (5 preceding siblings ...)
  2022-01-24 12:25 ` [PATCH 06/12] drm/ttm: add resource iterator Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-24 12:25 ` [PATCH 08/12] drm/amdgpu: remove GTT accounting Christian König
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

Use the resource accounting provided by TTM instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/radeon/radeon.h        |  2 --
 drivers/gpu/drm/radeon/radeon_kms.c    |  7 ++++--
 drivers/gpu/drm/radeon/radeon_object.c | 30 +++-----------------------
 drivers/gpu/drm/radeon/radeon_object.h |  1 -
 drivers/gpu/drm/radeon/radeon_ttm.c    | 18 ++--------------
 5 files changed, 10 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 895776c421d4..08f83bf2c330 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2462,8 +2462,6 @@ struct radeon_device {
 	struct radeon_vm_manager	vm_manager;
 	struct mutex			gpu_clock_mutex;
 	/* memory stats */
-	atomic64_t			vram_usage;
-	atomic64_t			gtt_usage;
 	atomic64_t			num_bytes_moved;
 	atomic_t			gpu_reset_counter;
 	/* ACPI interface */
diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
index 7afe28408085..3a9226d0cb05 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -241,6 +241,7 @@ int radeon_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 	struct drm_radeon_info *info = data;
 	struct radeon_mode_info *minfo = &rdev->mode_info;
 	uint32_t *value, value_tmp, *value_ptr, value_size;
+	struct ttm_resource_manager *man;
 	uint64_t value64;
 	struct drm_crtc *crtc;
 	int i, found;
@@ -550,12 +551,14 @@ int radeon_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 	case RADEON_INFO_VRAM_USAGE:
 		value = (uint32_t*)&value64;
 		value_size = sizeof(uint64_t);
-		value64 = atomic64_read(&rdev->vram_usage);
+		man = ttm_manager_type(&rdev->mman.bdev, TTM_PL_VRAM);
+		value64 = ttm_resource_manager_usage(man);
 		break;
 	case RADEON_INFO_GTT_USAGE:
 		value = (uint32_t*)&value64;
 		value_size = sizeof(uint64_t);
-		value64 = atomic64_read(&rdev->gtt_usage);
+		man = ttm_manager_type(&rdev->mman.bdev, TTM_PL_TT);
+		value64 = ttm_resource_manager_usage(man);
 		break;
 	case RADEON_INFO_ACTIVE_CU_COUNT:
 		if (rdev->family >= CHIP_BONAIRE)
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 56ede9d63b12..c9bbed2a25ad 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -49,27 +49,6 @@ static void radeon_bo_clear_surface_reg(struct radeon_bo *bo);
  * function are calling it.
  */
 
-static void radeon_update_memory_usage(struct ttm_buffer_object *bo,
-				       unsigned int mem_type, int sign)
-{
-	struct radeon_device *rdev = radeon_get_rdev(bo->bdev);
-
-	switch (mem_type) {
-	case TTM_PL_TT:
-		if (sign > 0)
-			atomic64_add(bo->base.size, &rdev->gtt_usage);
-		else
-			atomic64_sub(bo->base.size, &rdev->gtt_usage);
-		break;
-	case TTM_PL_VRAM:
-		if (sign > 0)
-			atomic64_add(bo->base.size, &rdev->vram_usage);
-		else
-			atomic64_sub(bo->base.size, &rdev->vram_usage);
-		break;
-	}
-}
-
 static void radeon_ttm_bo_destroy(struct ttm_buffer_object *tbo)
 {
 	struct radeon_bo *bo;
@@ -434,7 +413,9 @@ void radeon_bo_fini(struct radeon_device *rdev)
 static u64 radeon_bo_get_threshold_for_moves(struct radeon_device *rdev)
 {
 	u64 real_vram_size = rdev->mc.real_vram_size;
-	u64 vram_usage = atomic64_read(&rdev->vram_usage);
+	struct ttm_resource_manager *man =
+		ttm_manager_type(&rdev->mman.bdev, TTM_PL_VRAM);
+	u64 vram_usage = ttm_resource_manager_usage(man);
 
 	/* This function is based on the current VRAM usage.
 	 *
@@ -725,15 +706,10 @@ int radeon_bo_check_tiling(struct radeon_bo *bo, bool has_moved,
 }
 
 void radeon_bo_move_notify(struct ttm_buffer_object *bo,
-			   unsigned int old_type,
 			   struct ttm_resource *new_mem)
 {
 	struct radeon_bo *rbo;
 
-	radeon_update_memory_usage(bo, old_type, -1);
-	if (new_mem)
-		radeon_update_memory_usage(bo, new_mem->mem_type, 1);
-
 	if (!radeon_ttm_bo_is_radeon_bo(bo))
 		return;
 
diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index 1afc7992ef91..0b64e202577b 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -161,7 +161,6 @@ extern void radeon_bo_get_tiling_flags(struct radeon_bo *bo,
 extern int radeon_bo_check_tiling(struct radeon_bo *bo, bool has_moved,
 				bool force_drop);
 extern void radeon_bo_move_notify(struct ttm_buffer_object *bo,
-				  unsigned int old_type,
 				  struct ttm_resource *new_mem);
 extern vm_fault_t radeon_bo_fault_reserve_notify(struct ttm_buffer_object *bo);
 extern int radeon_bo_get_surface_reg(struct radeon_bo *bo);
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 0d1283cdc8fb..ae09a91a486a 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -199,7 +199,7 @@ static int radeon_bo_move(struct ttm_buffer_object *bo, bool evict,
 	struct ttm_resource *old_mem = bo->resource;
 	struct radeon_device *rdev;
 	struct radeon_bo *rbo;
-	int r, old_type;
+	int r;
 
 	if (new_mem->mem_type == TTM_PL_TT) {
 		r = radeon_ttm_tt_bind(bo->bdev, bo->ttm, new_mem);
@@ -216,9 +216,6 @@ static int radeon_bo_move(struct ttm_buffer_object *bo, bool evict,
 	if (WARN_ON_ONCE(rbo->tbo.pin_count > 0))
 		return -EINVAL;
 
-	/* Save old type for statistics update */
-	old_type = old_mem->mem_type;
-
 	rdev = radeon_get_rdev(bo->bdev);
 	if (old_mem->mem_type == TTM_PL_SYSTEM && bo->ttm == NULL) {
 		ttm_bo_move_null(bo, new_mem);
@@ -264,7 +261,7 @@ static int radeon_bo_move(struct ttm_buffer_object *bo, bool evict,
 out:
 	/* update statistics */
 	atomic64_add(bo->base.size, &rdev->num_bytes_moved);
-	radeon_bo_move_notify(bo, old_type, new_mem);
+	radeon_bo_move_notify(bo, new_mem);
 	return 0;
 }
 
@@ -679,16 +676,6 @@ bool radeon_ttm_tt_is_readonly(struct radeon_device *rdev,
 	return !!(gtt->userflags & RADEON_GEM_USERPTR_READONLY);
 }
 
-static void
-radeon_bo_delete_mem_notify(struct ttm_buffer_object *bo)
-{
-	unsigned int old_type = TTM_PL_SYSTEM;
-
-	if (bo->resource)
-		old_type = bo->resource->mem_type;
-	radeon_bo_move_notify(bo, old_type, NULL);
-}
-
 static struct ttm_device_funcs radeon_bo_driver = {
 	.ttm_tt_create = &radeon_ttm_tt_create,
 	.ttm_tt_populate = &radeon_ttm_tt_populate,
@@ -697,7 +684,6 @@ static struct ttm_device_funcs radeon_bo_driver = {
 	.eviction_valuable = ttm_bo_eviction_valuable,
 	.evict_flags = &radeon_evict_flags,
 	.move = &radeon_bo_move,
-	.delete_mem_notify = &radeon_bo_delete_mem_notify,
 	.io_mem_reserve = &radeon_ttm_io_mem_reserve,
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread
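
The pattern used throughout this conversion is small enough to sketch
in a few lines (illustrative only; ttm_manager_type() and
ttm_resource_manager_usage() are the interfaces relied on above, the
helper itself is made up):

#include <drm/ttm/ttm_device.h>
#include <drm/ttm/ttm_placement.h>
#include <drm/ttm/ttm_resource.h>

/* Hypothetical helper: query VRAM usage in bytes from the common TTM
 * accounting instead of a driver private atomic counter.
 */
static u64 example_vram_usage(struct ttm_device *bdev)
{
	struct ttm_resource_manager *man;

	man = ttm_manager_type(bdev, TTM_PL_VRAM);
	return man ? ttm_resource_manager_usage(man) : 0;
}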

* [PATCH 08/12] drm/amdgpu: remove GTT accounting
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
                   ` (6 preceding siblings ...)
  2022-01-24 12:25 ` [PATCH 07/12] drm/radeon: remove resource accounting Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-24 12:25 ` [PATCH 09/12] drm/amdgpu: remove VRAM accounting Christian König
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

This is provided by TTM now.

Also switch man->size to bytes instead of pages and fix the double
printing of size and usage in debugfs.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 50 +++++----------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c     |  5 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h     |  2 -
 3 files changed, 12 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 9e7685a4878c..ce5eeb3c1097 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -60,7 +60,7 @@ static ssize_t amdgpu_mem_info_gtt_total_show(struct device *dev,
 	struct ttm_resource_manager *man;
 
 	man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT);
-	return sysfs_emit(buf, "%llu\n", man->size * PAGE_SIZE);
+	return sysfs_emit(buf, "%llu\n", man->size);
 }
 
 /**
@@ -80,7 +80,7 @@ static ssize_t amdgpu_mem_info_gtt_used_show(struct device *dev,
 	struct ttm_resource_manager *man;
 
 	man = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT);
-	return sysfs_emit(buf, "%llu\n", amdgpu_gtt_mgr_usage(man));
+	return sysfs_emit(buf, "%llu\n", ttm_resource_manager_usage(man));
 }
 
 static DEVICE_ATTR(mem_info_gtt_total, S_IRUGO,
@@ -132,20 +132,17 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager *man,
 	struct amdgpu_gtt_node *node;
 	int r;
 
-	if (!(place->flags & TTM_PL_FLAG_TEMPORARY) &&
-	    atomic64_add_return(num_pages, &mgr->used) >  man->size) {
-		atomic64_sub(num_pages, &mgr->used);
-		return -ENOSPC;
-	}
-
 	node = kzalloc(struct_size(node, base.mm_nodes, 1), GFP_KERNEL);
-	if (!node) {
-		r = -ENOMEM;
-		goto err_out;
-	}
+	if (!node)
+		return -ENOMEM;
 
 	node->tbo = tbo;
 	ttm_resource_init(tbo, place, &node->base.base);
+	if (!(place->flags & TTM_PL_FLAG_TEMPORARY) &&
+	    ttm_resource_manager_usage(man) > man->size) {
+		r = -ENOSPC;
+		goto err_free;
+	}
 
 	if (place->lpfn) {
 		spin_lock(&mgr->lock);
@@ -171,11 +168,6 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager *man,
 err_free:
 	ttm_resource_fini(man, &node->base.base);
 	kfree(node);
-
-err_out:
-	if (!(place->flags & TTM_PL_FLAG_TEMPORARY))
-		atomic64_sub(num_pages, &mgr->used);
-
 	return r;
 }
 
@@ -198,27 +190,10 @@ static void amdgpu_gtt_mgr_del(struct ttm_resource_manager *man,
 		drm_mm_remove_node(&node->base.mm_nodes[0]);
 	spin_unlock(&mgr->lock);
 
-	if (!(res->placement & TTM_PL_FLAG_TEMPORARY))
-		atomic64_sub(res->num_pages, &mgr->used);
-
 	ttm_resource_fini(man, res);
 	kfree(node);
 }
 
-/**
- * amdgpu_gtt_mgr_usage - return usage of GTT domain
- *
- * @man: TTM memory type manager
- *
- * Return how many bytes are used in the GTT domain
- */
-uint64_t amdgpu_gtt_mgr_usage(struct ttm_resource_manager *man)
-{
-	struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
-
-	return atomic64_read(&mgr->used) * PAGE_SIZE;
-}
-
 /**
  * amdgpu_gtt_mgr_recover - re-init gart
  *
@@ -265,9 +240,6 @@ static void amdgpu_gtt_mgr_debug(struct ttm_resource_manager *man,
 	spin_lock(&mgr->lock);
 	drm_mm_print(&mgr->mm, printer);
 	spin_unlock(&mgr->lock);
-
-	drm_printf(printer, "man size:%llu pages,  gtt used:%llu pages\n",
-		   man->size, atomic64_read(&mgr->used));
 }
 
 static const struct ttm_resource_manager_func amdgpu_gtt_mgr_func = {
@@ -293,14 +265,12 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t gtt_size)
 	man->use_tt = true;
 	man->func = &amdgpu_gtt_mgr_func;
 
-	ttm_resource_manager_init(man, &adev->mman.bdev,
-				  gtt_size >> PAGE_SHIFT);
+	ttm_resource_manager_init(man, &adev->mman.bdev, gtt_size);
 
 	start = AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS;
 	size = (adev->gmc.gart_size >> PAGE_SHIFT) - start;
 	drm_mm_init(&mgr->mm, start, size);
 	spin_lock_init(&mgr->lock);
-	atomic64_set(&mgr->used, 0);
 
 	ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_TT, &mgr->manager);
 	ttm_resource_manager_set_used(man, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 651c7abfde03..8d5d56d9d26d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -678,7 +678,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 		ui64 = amdgpu_vram_mgr_vis_usage(ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM));
 		return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
 	case AMDGPU_INFO_GTT_USAGE:
-		ui64 = amdgpu_gtt_mgr_usage(ttm_manager_type(&adev->mman.bdev, TTM_PL_TT));
+		ui64 = ttm_resource_manager_usage(ttm_manager_type(&adev->mman.bdev, TTM_PL_TT));
 		return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
 	case AMDGPU_INFO_GDS_CONFIG: {
 		struct drm_amdgpu_info_gds gds_info;
@@ -737,8 +737,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 		mem.gtt.total_heap_size *= PAGE_SIZE;
 		mem.gtt.usable_heap_size = mem.gtt.total_heap_size -
 			atomic64_read(&adev->gart_pin_size);
-		mem.gtt.heap_usage =
-			amdgpu_gtt_mgr_usage(gtt_man);
+		mem.gtt.heap_usage = ttm_resource_manager_usage(gtt_man);
 		mem.gtt.max_allocation = mem.gtt.usable_heap_size * 3 / 4;
 
 		return copy_to_user(out, &mem,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 7346ecff4438..2ac332cd4f59 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -52,7 +52,6 @@ struct amdgpu_gtt_mgr {
 	struct ttm_resource_manager manager;
 	struct drm_mm mm;
 	spinlock_t lock;
-	atomic64_t used;
 };
 
 struct amdgpu_preempt_mgr {
@@ -114,7 +113,6 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev);
 void amdgpu_vram_mgr_fini(struct amdgpu_device *adev);
 
 bool amdgpu_gtt_mgr_has_gart_addr(struct ttm_resource *mem);
-uint64_t amdgpu_gtt_mgr_usage(struct ttm_resource_manager *man);
 int amdgpu_gtt_mgr_recover(struct ttm_resource_manager *man);
 
 uint64_t amdgpu_preempt_mgr_usage(struct ttm_resource_manager *man);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread
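
The reworked allocation path relies on ttm_resource_init() already
having added the new resource to the manager's usage before the limit
check, with ttm_resource_fini() undoing that accounting on failure. A
stripped-down sketch of that ordering (driver-specific drm_mm handling
elided, names illustrative):

#include <linux/slab.h>
#include <drm/ttm/ttm_range_manager.h>
#include <drm/ttm/ttm_resource.h>

/* Sketch only: accounting happens in init/fini, so the over-commit
 * check can simply compare usage against man->size (now in bytes).
 */
static int example_mgr_new(struct ttm_resource_manager *man,
			   struct ttm_buffer_object *tbo,
			   const struct ttm_place *place,
			   struct ttm_resource **res)
{
	struct ttm_range_mgr_node *node;
	int r;

	node = kzalloc(struct_size(node, mm_nodes, 1), GFP_KERNEL);
	if (!node)
		return -ENOMEM;

	ttm_resource_init(tbo, place, &node->base);
	if (ttm_resource_manager_usage(man) > man->size) {
		r = -ENOSPC;
		goto err_free;
	}

	/* driver specific address space allocation would go here */

	*res = &node->base;
	return 0;

err_free:
	ttm_resource_fini(man, &node->base);
	kfree(node);
	return r;
}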

* [PATCH 09/12] drm/amdgpu: remove VRAM accounting
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
                   ` (7 preceding siblings ...)
  2022-01-24 12:25 ` [PATCH 08/12] drm/amdgpu: remove GTT accounting Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-24 12:25 ` [PATCH 10/12] drm/amdgpu: drop amdgpu_gtt_node Christian König
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

This is provided by TTM now.

Also switch man->size to bytes instead of pages and fix the double
printing of size and usage in debugfs.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c   |  5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c      |  5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h      |  2 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 56 +++++++-------------
 7 files changed, 26 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 46cf48b3904a..5a3ecdea20eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -516,9 +516,10 @@ int amdgpu_amdkfd_get_dmabuf_info(struct amdgpu_device *adev, int dma_buf_fd,
 
 uint64_t amdgpu_amdkfd_get_vram_usage(struct amdgpu_device *adev)
 {
-	struct ttm_resource_manager *vram_man = ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM);
+	struct ttm_resource_manager *vram_man =
+		ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM);
 
-	return amdgpu_vram_mgr_usage(vram_man);
+	return ttm_resource_manager_usage(vram_man);
 }
 
 uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct amdgpu_device *dst,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 53e407ea4c89..936c1a7174f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -315,7 +315,7 @@ static void amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev,
 	}
 
 	total_vram = adev->gmc.real_vram_size - atomic64_read(&adev->vram_pin_size);
-	used_vram = amdgpu_vram_mgr_usage(vram_man);
+	used_vram = ttm_resource_manager_usage(vram_man);
 	free_vram = used_vram >= total_vram ? 0 : total_vram - used_vram;
 
 	spin_lock(&adev->mm_stats.lock);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 8d5d56d9d26d..6d21ee430f03 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -672,7 +672,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 		ui64 = atomic64_read(&adev->num_vram_cpu_page_faults);
 		return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
 	case AMDGPU_INFO_VRAM_USAGE:
-		ui64 = amdgpu_vram_mgr_usage(ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM));
+		ui64 = ttm_resource_manager_usage(ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM));
 		return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
 	case AMDGPU_INFO_VIS_VRAM_USAGE:
 		ui64 = amdgpu_vram_mgr_vis_usage(ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM));
@@ -718,8 +718,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 		mem.vram.usable_heap_size = adev->gmc.real_vram_size -
 			atomic64_read(&adev->vram_pin_size) -
 			AMDGPU_VM_RESERVED_VRAM;
-		mem.vram.heap_usage =
-			amdgpu_vram_mgr_usage(vram_man);
+		mem.vram.heap_usage = ttm_resource_manager_usage(vram_man);
 		mem.vram.max_allocation = mem.vram.usable_heap_size * 3 / 4;
 
 		mem.cpu_accessible_vram.total_heap_size =
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index eac2ff4647e3..f6b3695979a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1873,7 +1873,7 @@ void amdgpu_ttm_set_buffer_funcs_status(struct amdgpu_device *adev, bool enable)
 		size = adev->gmc.real_vram_size;
 	else
 		size = adev->gmc.visible_vram_size;
-	man->size = size >> PAGE_SHIFT;
+	man->size = size;
 	adev->mman.buffer_funcs_enabled = enable;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 2ac332cd4f59..f3eb4029bdcf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -44,7 +44,6 @@ struct amdgpu_vram_mgr {
 	spinlock_t lock;
 	struct list_head reservations_pending;
 	struct list_head reserved_pages;
-	atomic64_t usage;
 	atomic64_t vis_usage;
 };
 
@@ -127,7 +126,6 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev,
 void amdgpu_vram_mgr_free_sgt(struct device *dev,
 			      enum dma_data_direction dir,
 			      struct sg_table *sgt);
-uint64_t amdgpu_vram_mgr_usage(struct ttm_resource_manager *man);
 uint64_t amdgpu_vram_mgr_vis_usage(struct ttm_resource_manager *man);
 int amdgpu_vram_mgr_reserve_range(struct ttm_resource_manager *man,
 				  uint64_t start, uint64_t size);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 3fc49823f527..4a23dfe39556 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -576,7 +576,7 @@ static int amdgpu_virt_write_vf2pf_data(struct amdgpu_device *adev)
 	vf2pf_info->driver_cert = 0;
 	vf2pf_info->os_info.all = 0;
 
-	vf2pf_info->fb_usage = amdgpu_vram_mgr_usage(vram_man) >> 20;
+	vf2pf_info->fb_usage = ttm_resource_manager_usage(vram_man) >> 20;
 	vf2pf_info->fb_vis_usage = amdgpu_vram_mgr_vis_usage(vram_man) >> 20;
 	vf2pf_info->fb_size = adev->gmc.real_vram_size >> 20;
 	vf2pf_info->fb_vis_size = adev->gmc.visible_vram_size >> 20;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index ddd0b6d74070..862c0f3d2329 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -99,7 +99,7 @@ static ssize_t amdgpu_mem_info_vram_used_show(struct device *dev,
 	struct ttm_resource_manager *man;
 
 	man = ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM);
-	return sysfs_emit(buf, "%llu\n", amdgpu_vram_mgr_usage(man));
+	return sysfs_emit(buf, "%llu\n", ttm_resource_manager_usage(man));
 }
 
 /**
@@ -255,7 +255,7 @@ static void amdgpu_vram_mgr_do_reserve(struct ttm_resource_manager *man)
 
 		vis_usage = amdgpu_vram_mgr_vis_size(adev, &rsv->mm_node);
 		atomic64_add(vis_usage, &mgr->vis_usage);
-		atomic64_add(rsv->mm_node.size << PAGE_SHIFT, &mgr->usage);
+		atomic64_add(rsv->mm_node.size << PAGE_SHIFT, &man->usage);
 		list_move(&rsv->node, &mgr->reserved_pages);
 	}
 }
@@ -382,19 +382,13 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
 
 	lpfn = place->lpfn;
 	if (!lpfn)
-		lpfn = man->size;
+		lpfn = man->size >> PAGE_SHIFT;
 
 	max_bytes = adev->gmc.mc_vram_size;
 	if (tbo->type != ttm_bo_type_kernel)
 		max_bytes -= AMDGPU_VM_RESERVED_VRAM;
 
-	/* bail out quickly if there's likely not enough VRAM for this BO */
 	mem_bytes = tbo->base.size;
-	if (atomic64_add_return(mem_bytes, &mgr->usage) > max_bytes) {
-		r = -ENOSPC;
-		goto error_sub;
-	}
-
 	if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
 		pages_per_node = ~0ul;
 		num_nodes = 1;
@@ -412,13 +406,17 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
 
 	node = kvmalloc(struct_size(node, mm_nodes, num_nodes),
 			GFP_KERNEL | __GFP_ZERO);
-	if (!node) {
-		r = -ENOMEM;
-		goto error_sub;
-	}
+	if (!node)
+		return -ENOMEM;
 
 	ttm_resource_init(tbo, place, &node->base);
 
+	/* bail out quickly if there's likely not enough VRAM for this BO */
+	if (ttm_resource_manager_usage(man) > max_bytes) {
+		r = -ENOSPC;
+		goto error_fini;
+	}
+
 	mode = DRM_MM_INSERT_BEST;
 	if (place->flags & TTM_PL_FLAG_TOPDOWN)
 		mode = DRM_MM_INSERT_HIGH;
@@ -476,11 +474,10 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
 	while (i--)
 		drm_mm_remove_node(&node->mm_nodes[i]);
 	spin_unlock(&mgr->lock);
+error_fini:
 	ttm_resource_fini(man, &node->base);
 	kvfree(node);
 
-error_sub:
-	atomic64_sub(mem_bytes, &mgr->usage);
 	return r;
 }
 
@@ -498,7 +495,7 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager *man,
 	struct ttm_range_mgr_node *node = to_ttm_range_mgr_node(res);
 	struct amdgpu_vram_mgr *mgr = to_vram_mgr(man);
 	struct amdgpu_device *adev = to_amdgpu_device(mgr);
-	uint64_t usage = 0, vis_usage = 0;
+	uint64_t vis_usage = 0;
 	unsigned i, pages;
 
 	spin_lock(&mgr->lock);
@@ -507,13 +504,11 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager *man,
 		struct drm_mm_node *mm = &node->mm_nodes[i];
 
 		drm_mm_remove_node(mm);
-		usage += mm->size << PAGE_SHIFT;
 		vis_usage += amdgpu_vram_mgr_vis_size(adev, mm);
 	}
 	amdgpu_vram_mgr_do_reserve(man);
 	spin_unlock(&mgr->lock);
 
-	atomic64_sub(usage, &mgr->usage);
 	atomic64_sub(vis_usage, &mgr->vis_usage);
 
 	ttm_resource_fini(man, res);
@@ -631,20 +626,6 @@ void amdgpu_vram_mgr_free_sgt(struct device *dev,
 	kfree(sgt);
 }
 
-/**
- * amdgpu_vram_mgr_usage - how many bytes are used in this domain
- *
- * @man: TTM memory type manager
- *
- * Returns how many bytes are used in this domain.
- */
-uint64_t amdgpu_vram_mgr_usage(struct ttm_resource_manager *man)
-{
-	struct amdgpu_vram_mgr *mgr = to_vram_mgr(man);
-
-	return atomic64_read(&mgr->usage);
-}
-
 /**
  * amdgpu_vram_mgr_vis_usage - how many bytes are used in the visible part
  *
@@ -672,13 +653,12 @@ static void amdgpu_vram_mgr_debug(struct ttm_resource_manager *man,
 {
 	struct amdgpu_vram_mgr *mgr = to_vram_mgr(man);
 
+	drm_printf(printer, "  vis usage:%llu\n",
+		   amdgpu_vram_mgr_vis_usage(man));
+
 	spin_lock(&mgr->lock);
 	drm_mm_print(&mgr->mm, printer);
 	spin_unlock(&mgr->lock);
-
-	drm_printf(printer, "man size:%llu pages, ram usage:%lluMB, vis usage:%lluMB\n",
-		   man->size, amdgpu_vram_mgr_usage(man) >> 20,
-		   amdgpu_vram_mgr_vis_usage(man) >> 20);
 }
 
 static const struct ttm_resource_manager_func amdgpu_vram_mgr_func = {
@@ -700,11 +680,11 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 	struct ttm_resource_manager *man = &mgr->manager;
 
 	ttm_resource_manager_init(man, &adev->mman.bdev,
-				  adev->gmc.real_vram_size >> PAGE_SHIFT);
+				  adev->gmc.real_vram_size);
 
 	man->func = &amdgpu_vram_mgr_func;
 
-	drm_mm_init(&mgr->mm, 0, man->size);
+	drm_mm_init(&mgr->mm, 0, man->size >> PAGE_SHIFT);
 	spin_lock_init(&mgr->lock);
 	INIT_LIST_HEAD(&mgr->reservations_pending);
 	INIT_LIST_HEAD(&mgr->reserved_pages);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread
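
After this change the total VRAM usage comes from TTM while the
visible-VRAM split stays driver-side; a hedged sketch of reporting both
(the helper is made up, the two usage calls are the ones used and kept
by the patch):

#include "amdgpu.h"

/* Hypothetical helper: total VRAM usage is now tracked by TTM, the
 * CPU-visible portion is still tracked by amdgpu itself.
 */
static void example_report_vram(struct amdgpu_device *adev)
{
	struct ttm_resource_manager *man =
		ttm_manager_type(&adev->mman.bdev, TTM_PL_VRAM);

	pr_info("vram used: %llu bytes, visible: %llu bytes\n",
		ttm_resource_manager_usage(man),
		amdgpu_vram_mgr_vis_usage(man));
}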

* [PATCH 10/12] drm/amdgpu: drop amdgpu_gtt_node
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
                   ` (8 preceding siblings ...)
  2022-01-24 12:25 ` [PATCH 09/12] drm/amdgpu: remove VRAM accounting Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-24 12:25 ` [PATCH 11/12] drm/ttm: allow bulk moves for all domains Christian König
  2022-01-24 12:25 ` [PATCH 12/12] drm/ttm: rework bulk move handling Christian König
  11 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

We have the BO pointer in the base structure now as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 49 ++++++++-------------
 1 file changed, 18 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index ce5eeb3c1097..a55bbe1a154c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -26,23 +26,12 @@
 
 #include "amdgpu.h"
 
-struct amdgpu_gtt_node {
-	struct ttm_buffer_object *tbo;
-	struct ttm_range_mgr_node base;
-};
-
 static inline struct amdgpu_gtt_mgr *
 to_gtt_mgr(struct ttm_resource_manager *man)
 {
 	return container_of(man, struct amdgpu_gtt_mgr, manager);
 }
 
-static inline struct amdgpu_gtt_node *
-to_amdgpu_gtt_node(struct ttm_resource *res)
-{
-	return container_of(res, struct amdgpu_gtt_node, base.base);
-}
-
 /**
  * DOC: mem_info_gtt_total
  *
@@ -107,9 +96,9 @@ const struct attribute_group amdgpu_gtt_mgr_attr_group = {
  */
 bool amdgpu_gtt_mgr_has_gart_addr(struct ttm_resource *res)
 {
-	struct amdgpu_gtt_node *node = to_amdgpu_gtt_node(res);
+	struct ttm_range_mgr_node *node = to_ttm_range_mgr_node(res);
 
-	return drm_mm_node_allocated(&node->base.mm_nodes[0]);
+	return drm_mm_node_allocated(&node->mm_nodes[0]);
 }
 
 /**
@@ -129,15 +118,14 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager *man,
 {
 	struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
 	uint32_t num_pages = PFN_UP(tbo->base.size);
-	struct amdgpu_gtt_node *node;
+	struct ttm_range_mgr_node *node;
 	int r;
 
-	node = kzalloc(struct_size(node, base.mm_nodes, 1), GFP_KERNEL);
+	node = kzalloc(struct_size(node, mm_nodes, 1), GFP_KERNEL);
 	if (!node)
 		return -ENOMEM;
 
-	node->tbo = tbo;
-	ttm_resource_init(tbo, place, &node->base.base);
+	ttm_resource_init(tbo, place, &node->base);
 	if (!(place->flags & TTM_PL_FLAG_TEMPORARY) &&
 	    ttm_resource_manager_usage(man) > man->size) {
 		r = -ENOSPC;
@@ -146,8 +134,7 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager *man,
 
 	if (place->lpfn) {
 		spin_lock(&mgr->lock);
-		r = drm_mm_insert_node_in_range(&mgr->mm,
-						&node->base.mm_nodes[0],
+		r = drm_mm_insert_node_in_range(&mgr->mm, &node->mm_nodes[0],
 						num_pages, tbo->page_alignment,
 						0, place->fpfn, place->lpfn,
 						DRM_MM_INSERT_BEST);
@@ -155,18 +142,18 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager *man,
 		if (unlikely(r))
 			goto err_free;
 
-		node->base.base.start = node->base.mm_nodes[0].start;
+		node->base.start = node->mm_nodes[0].start;
 	} else {
-		node->base.mm_nodes[0].start = 0;
-		node->base.mm_nodes[0].size = node->base.base.num_pages;
-		node->base.base.start = AMDGPU_BO_INVALID_OFFSET;
+		node->mm_nodes[0].start = 0;
+		node->mm_nodes[0].size = node->base.num_pages;
+		node->base.start = AMDGPU_BO_INVALID_OFFSET;
 	}
 
-	*res = &node->base.base;
+	*res = &node->base;
 	return 0;
 
 err_free:
-	ttm_resource_fini(man, &node->base.base);
+	ttm_resource_fini(man, &node->base);
 	kfree(node);
 	return r;
 }
@@ -182,12 +169,12 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager *man,
 static void amdgpu_gtt_mgr_del(struct ttm_resource_manager *man,
 			       struct ttm_resource *res)
 {
-	struct amdgpu_gtt_node *node = to_amdgpu_gtt_node(res);
+	struct ttm_range_mgr_node *node = to_ttm_range_mgr_node(res);
 	struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
 
 	spin_lock(&mgr->lock);
-	if (drm_mm_node_allocated(&node->base.mm_nodes[0]))
-		drm_mm_remove_node(&node->base.mm_nodes[0]);
+	if (drm_mm_node_allocated(&node->mm_nodes[0]))
+		drm_mm_remove_node(&node->mm_nodes[0]);
 	spin_unlock(&mgr->lock);
 
 	ttm_resource_fini(man, res);
@@ -204,16 +191,16 @@ static void amdgpu_gtt_mgr_del(struct ttm_resource_manager *man,
 int amdgpu_gtt_mgr_recover(struct ttm_resource_manager *man)
 {
 	struct amdgpu_gtt_mgr *mgr = to_gtt_mgr(man);
+	struct ttm_range_mgr_node *node;
 	struct amdgpu_device *adev;
-	struct amdgpu_gtt_node *node;
 	struct drm_mm_node *mm_node;
 	int r = 0;
 
 	adev = container_of(mgr, typeof(*adev), mman.gtt_mgr);
 	spin_lock(&mgr->lock);
 	drm_mm_for_each_node(mm_node, &mgr->mm) {
-		node = container_of(mm_node, typeof(*node), base.mm_nodes[0]);
-		r = amdgpu_ttm_recover_gart(node->tbo);
+		node = container_of(mm_node, typeof(*node), mm_nodes[0]);
+		r = amdgpu_ttm_recover_gart(node->base.bo);
 		if (r)
 			break;
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread
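
The wrapper can go away because ttm_resource_init() already records the
owning BO in the base resource; a minimal sketch of the lookup now used
in the GART recovery path (helper name made up):

#include <drm/ttm/ttm_range_manager.h>

/* Sketch: map an embedded drm_mm_node back to its buffer object via
 * the generic ttm_range_mgr_node and the bo pointer in the resource.
 */
static struct ttm_buffer_object *
example_node_to_bo(struct drm_mm_node *mm_node)
{
	struct ttm_range_mgr_node *node =
		container_of(mm_node, typeof(*node), mm_nodes[0]);

	return node->base.bo;
}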

* [PATCH 11/12] drm/ttm: allow bulk moves for all domains
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
                   ` (9 preceding siblings ...)
  2022-01-24 12:25 ` [PATCH 10/12] drm/amdgpu: drop amdgpu_gtt_node Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-25 17:16   ` Daniel Vetter
  2022-01-24 12:25 ` [PATCH 12/12] drm/ttm: rework bulk move handling Christian König
  11 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

Allow bulk LRU moves for all memory domains, not just TT and VRAM.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_resource.c | 49 +++++++++---------------------
 include/drm/ttm/ttm_device.h       |  2 --
 include/drm/ttm/ttm_resource.h     |  4 +--
 3 files changed, 16 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 9e68d36a1546..2ea8fb83377e 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -51,36 +51,23 @@ EXPORT_SYMBOL(ttm_lru_bulk_move_init);
  */
 void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk)
 {
-	unsigned i;
-
-	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		struct ttm_lru_bulk_move_pos *pos = &bulk->tt[i];
-		struct ttm_resource_manager *man;
+	unsigned i, j;
 
-		if (!pos->first)
-			continue;
+	for (i = 0; i < TTM_NUM_MEM_TYPES; ++i) {
+		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
+			struct ttm_lru_bulk_move_pos *pos = &bulk->pos[i][j];
+			struct ttm_resource_manager *man;
 
-		dma_resv_assert_held(pos->first->bo->base.resv);
-		dma_resv_assert_held(pos->last->bo->base.resv);
+			if (!pos->first)
+				continue;
 
-		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_TT);
-		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
-				    &pos->last->lru);
-	}
-
-	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		struct ttm_lru_bulk_move_pos *pos = &bulk->vram[i];
-		struct ttm_resource_manager *man;
+			dma_resv_assert_held(pos->first->bo->base.resv);
+			dma_resv_assert_held(pos->last->bo->base.resv);
 
-		if (!pos->first)
-			continue;
-
-		dma_resv_assert_held(pos->first->bo->base.resv);
-		dma_resv_assert_held(pos->last->bo->base.resv);
-
-		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_VRAM);
-		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
-				    &pos->last->lru);
+			man = ttm_manager_type(pos->first->bo->bdev, i);
+			list_bulk_move_tail(&man->lru[j], &pos->first->lru,
+					    &pos->last->lru);
+		}
 	}
 }
 EXPORT_SYMBOL(ttm_lru_bulk_move_tail);
@@ -118,15 +105,7 @@ void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
 	if (!bulk)
 		return;
 
-	switch (res->mem_type) {
-	case TTM_PL_TT:
-		ttm_lru_bulk_move_set_pos(&bulk->tt[bo->priority], res);
-		break;
-
-	case TTM_PL_VRAM:
-		ttm_lru_bulk_move_set_pos(&bulk->vram[bo->priority], res);
-		break;
-	}
+	ttm_lru_bulk_move_set_pos(&bulk->pos[res->mem_type][bo->priority], res);
 }
 
 void ttm_resource_init(struct ttm_buffer_object *bo,
diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 0a4ddec78d8f..425150f35fbe 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -30,8 +30,6 @@
 #include <drm/ttm/ttm_resource.h>
 #include <drm/ttm/ttm_pool.h>
 
-#define TTM_NUM_MEM_TYPES 8
-
 struct ttm_device;
 struct ttm_placement;
 struct ttm_buffer_object;
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 13da5e337350..1556d1f62251 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -37,6 +37,7 @@
 #include <drm/ttm/ttm_kmap_iter.h>
 
 #define TTM_MAX_BO_PRIORITY	4U
+#define TTM_NUM_MEM_TYPES 8
 
 struct ttm_device;
 struct ttm_resource_manager;
@@ -216,8 +217,7 @@ struct ttm_lru_bulk_move_pos {
  * Helper structure for bulk moves on the LRU list.
  */
 struct ttm_lru_bulk_move {
-	struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
-	struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
+	struct ttm_lru_bulk_move_pos pos[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY];
 };
 
 /**
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread
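
With the per-domain array the position slot for any resource becomes a
plain two-dimensional lookup, bounded by TTM_NUM_MEM_TYPES and
TTM_MAX_BO_PRIORITY; this is essentially what the next patch factors
out as ttm_lru_bulk_move_pos() (sketch, helper name made up here):

#include <drm/ttm/ttm_resource.h>

/* Sketch: any memory type can now take part in a bulk move, the slot
 * is indexed by the resource's memory type and the BO priority.
 */
static struct ttm_lru_bulk_move_pos *
example_bulk_pos(struct ttm_lru_bulk_move *bulk, struct ttm_resource *res)
{
	return &bulk->pos[res->mem_type][res->bo->priority];
}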

* [PATCH 12/12] drm/ttm: rework bulk move handling
  2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
                   ` (10 preceding siblings ...)
  2022-01-24 12:25 ` [PATCH 11/12] drm/ttm: allow bulk moves for all domains Christian König
@ 2022-01-24 12:25 ` Christian König
  2022-01-25 17:12   ` Daniel Vetter
  11 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2022-01-24 12:25 UTC (permalink / raw)
  To: ray.huang, thomas.hellstrom, dri-devel, bas

Instead of providing the bulk move structure for each LRU update, set
it once as a property of the BO. This should avoid costly bulk move
rebuilds with some games under RADV.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 72 +++-----------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  3 -
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c |  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c            | 42 ++++++++++--
 drivers/gpu/drm/ttm/ttm_resource.c      | 87 ++++++++++++++++++-------
 include/drm/ttm/ttm_bo_api.h            | 18 ++---
 include/drm/ttm/ttm_bo_driver.h         |  2 +-
 include/drm/ttm/ttm_device.h            |  9 ---
 include/drm/ttm/ttm_resource.h          |  7 +-
 10 files changed, 124 insertions(+), 119 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index f6b3695979a6..7424a4a8d35c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1497,7 +1497,6 @@ static struct ttm_device_funcs amdgpu_bo_driver = {
 	.io_mem_reserve = &amdgpu_ttm_io_mem_reserve,
 	.io_mem_pfn = amdgpu_ttm_io_mem_pfn,
 	.access_memory = &amdgpu_ttm_access_memory,
-	.del_from_lru_notify = &amdgpu_vm_del_from_lru_notify
 };
 
 /*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index cdb489cdf27b..35ed849cfe25 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -375,7 +375,7 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
 	if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
 		return;
 
-	vm->bulk_moveable = false;
+	ttm_bo_set_bulk_move(&bo->tbo, &vm->lru_bulk_move);
 	if (bo->tbo.type == ttm_bo_type_kernel && bo->parent)
 		amdgpu_vm_bo_relocated(base);
 	else
@@ -637,36 +637,6 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
 	list_add(&entry->tv.head, validated);
 }
 
-/**
- * amdgpu_vm_del_from_lru_notify - update bulk_moveable flag
- *
- * @bo: BO which was removed from the LRU
- *
- * Make sure the bulk_moveable flag is updated when a BO is removed from the
- * LRU.
- */
-void amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo)
-{
-	struct amdgpu_bo *abo;
-	struct amdgpu_vm_bo_base *bo_base;
-
-	if (!amdgpu_bo_is_amdgpu_bo(bo))
-		return;
-
-	if (bo->pin_count)
-		return;
-
-	abo = ttm_to_amdgpu_bo(bo);
-	if (!abo->parent)
-		return;
-	for (bo_base = abo->vm_bo; bo_base; bo_base = bo_base->next) {
-		struct amdgpu_vm *vm = bo_base->vm;
-
-		if (abo->tbo.base.resv == vm->root.bo->tbo.base.resv)
-			vm->bulk_moveable = false;
-	}
-
-}
 /**
  * amdgpu_vm_move_to_lru_tail - move all BOs to the end of LRU
  *
@@ -679,33 +649,9 @@ void amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo)
 void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
 				struct amdgpu_vm *vm)
 {
-	struct amdgpu_vm_bo_base *bo_base;
-
-	if (vm->bulk_moveable) {
-		spin_lock(&adev->mman.bdev.lru_lock);
-		ttm_lru_bulk_move_tail(&vm->lru_bulk_move);
-		spin_unlock(&adev->mman.bdev.lru_lock);
-		return;
-	}
-
-	ttm_lru_bulk_move_init(&vm->lru_bulk_move);
-
 	spin_lock(&adev->mman.bdev.lru_lock);
-	list_for_each_entry(bo_base, &vm->idle, vm_status) {
-		struct amdgpu_bo *bo = bo_base->bo;
-		struct amdgpu_bo *shadow = amdgpu_bo_shadowed(bo);
-
-		if (!bo->parent)
-			continue;
-
-		ttm_bo_move_to_lru_tail(&bo->tbo, &vm->lru_bulk_move);
-		if (shadow)
-			ttm_bo_move_to_lru_tail(&shadow->tbo,
-						&vm->lru_bulk_move);
-	}
+	ttm_lru_bulk_move_tail(&vm->lru_bulk_move);
 	spin_unlock(&adev->mman.bdev.lru_lock);
-
-	vm->bulk_moveable = true;
 }
 
 /**
@@ -728,8 +674,6 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	struct amdgpu_vm_bo_base *bo_base, *tmp;
 	int r;
 
-	vm->bulk_moveable &= list_empty(&vm->evicted);
-
 	list_for_each_entry_safe(bo_base, tmp, &vm->evicted, vm_status) {
 		struct amdgpu_bo *bo = bo_base->bo;
 		struct amdgpu_bo *shadow = amdgpu_bo_shadowed(bo);
@@ -1047,10 +991,16 @@ static void amdgpu_vm_free_table(struct amdgpu_vm_bo_base *entry)
 
 	if (!entry->bo)
 		return;
+
 	shadow = amdgpu_bo_shadowed(entry->bo);
+	if (shadow) {
+		ttm_bo_set_bulk_move(&shadow->tbo, NULL);
+		amdgpu_bo_unref(&shadow);
+	}
+
+	ttm_bo_set_bulk_move(&entry->bo->tbo, NULL);
 	entry->bo->vm_bo = NULL;
 	list_del(&entry->vm_status);
-	amdgpu_bo_unref(&shadow);
 	amdgpu_bo_unref(&entry->bo);
 }
 
@@ -1070,8 +1020,6 @@ static void amdgpu_vm_free_pts(struct amdgpu_device *adev,
 	struct amdgpu_vm_pt_cursor cursor;
 	struct amdgpu_vm_bo_base *entry;
 
-	vm->bulk_moveable = false;
-
 	for_each_amdgpu_vm_pt_dfs_safe(adev, vm, start, cursor, entry)
 		amdgpu_vm_free_table(entry);
 
@@ -2651,7 +2599,7 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
 
 	if (bo) {
 		if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
-			vm->bulk_moveable = false;
+			ttm_bo_set_bulk_move(&bo->tbo, NULL);
 
 		for (base = &bo_va->base.bo->vm_bo; *base;
 		     base = &(*base)->next) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 85fcfb8c5efd..4d236682a118 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -317,8 +317,6 @@ struct amdgpu_vm {
 
 	/* Store positions of group of BOs */
 	struct ttm_lru_bulk_move lru_bulk_move;
-	/* mark whether can do the bulk move */
-	bool			bulk_moveable;
 	/* Flag to indicate if VM is used for compute */
 	bool			is_compute_context;
 };
@@ -454,7 +452,6 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
 
 void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
 				struct amdgpu_vm *vm);
-void amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo);
 void amdgpu_vm_get_memory(struct amdgpu_vm *vm, uint64_t *vram_mem,
 				uint64_t *gtt_mem, uint64_t *cpu_mem);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index cdf1ff87c88d..abd1bde7a2b8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -835,7 +835,7 @@ void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 			bo->priority = I915_TTM_PRIO_NO_PAGES;
 	}
 
-	ttm_bo_move_to_lru_tail(bo, NULL);
+	ttm_bo_move_to_lru_tail(bo);
 	spin_unlock(&bo->bdev->lru_lock);
 }
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 599be3dda8a9..7d1133bef33e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -69,16 +69,48 @@ static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo,
 	}
 }
 
-void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
-			     struct ttm_lru_bulk_move *bulk)
+/**
+ * ttm_bo_move_to_lru_tail
+ *
+ * @bo: The buffer object.
+ *
+ * Move this BO to the tail of all lru lists used to lookup and reserve an
+ * object. This function must be called with struct ttm_global::lru_lock
+ * held, and is used to make a BO less likely to be considered for eviction.
+ */
+void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo)
 {
 	dma_resv_assert_held(bo->base.resv);
 
 	if (bo->resource)
-		ttm_resource_move_to_lru_tail(bo->resource, bulk);
+		ttm_resource_move_to_lru_tail(bo->resource);
 }
 EXPORT_SYMBOL(ttm_bo_move_to_lru_tail);
 
+/**
+ * ttm_bo_set_bulk_move - update BOs bulk move object
+ *
+ * @bo: The buffer object.
+ *
+ * Update the BOs bulk move object, making sure that resources are added/removed
+ * as well.
+ */
+void ttm_bo_set_bulk_move(struct ttm_buffer_object *bo,
+			  struct ttm_lru_bulk_move *bulk)
+{
+	if (bo->bulk_move == bulk)
+		return;
+
+	spin_lock(&bo->bdev->lru_lock);
+	if (bo->bulk_move && bo->resource)
+		ttm_lru_bulk_move_rmv(bo->bulk_move, bo->resource);
+	bo->bulk_move = bulk;
+	if (bo->bulk_move && bo->resource)
+		ttm_lru_bulk_move_add(bo->bulk_move, bo->resource);
+	spin_unlock(&bo->bdev->lru_lock);
+}
+EXPORT_SYMBOL(ttm_bo_set_bulk_move);
+
 static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
 				  struct ttm_resource *mem, bool evict,
 				  struct ttm_operation_ctx *ctx,
@@ -316,6 +348,7 @@ static void ttm_bo_release(struct kref *kref)
 	int ret;
 
 	WARN_ON_ONCE(bo->pin_count);
+	WARN_ON_ONCE(bo->bulk_move);
 
 	if (!bo->deleted) {
 		ret = ttm_bo_individualize_resv(bo);
@@ -352,7 +385,7 @@ static void ttm_bo_release(struct kref *kref)
 		 */
 		if (bo->pin_count) {
 			bo->pin_count = 0;
-			ttm_resource_move_to_lru_tail(bo->resource, NULL);
+			ttm_resource_move_to_lru_tail(bo->resource);
 		}
 
 		kref_init(&bo->kref);
@@ -874,6 +907,7 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
 	bo->moving = NULL;
 	bo->pin_count = 0;
 	bo->sg = sg;
+	bo->bulk_move = NULL;
 	if (resv) {
 		bo->base.resv = resv;
 		dma_resv_assert_held(bo->base.resv);
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 2ea8fb83377e..b1068053434c 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -72,40 +72,75 @@ void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk)
 }
 EXPORT_SYMBOL(ttm_lru_bulk_move_tail);
 
-/* Record a resource position in a bulk move structure */
-static void ttm_lru_bulk_move_set_pos(struct ttm_lru_bulk_move_pos *pos,
-				      struct ttm_resource *res)
+/* Return the bulk move pos object for this resource */
+static struct ttm_lru_bulk_move_pos *
+ttm_lru_bulk_move_pos(struct ttm_lru_bulk_move *bulk, struct ttm_resource *res)
 {
-	if (!pos->first)
+	return &bulk->pos[res->mem_type][res->bo->priority];
+}
+
+/* Move the resource to the tail of the bulk move range */
+static void ttm_lru_bulk_move_pos_tail(struct ttm_lru_bulk_move_pos *pos,
+				       struct ttm_resource *res)
+{
+	if (pos->last != res) {
+		list_move(&res->lru, &pos->last->lru);
+		pos->last = res;
+	}
+}
+
+/* Add the resource to a bulk_move cursor */
+void ttm_lru_bulk_move_add(struct ttm_lru_bulk_move *bulk,
+			   struct ttm_resource *res)
+{
+	struct ttm_lru_bulk_move_pos *pos = ttm_lru_bulk_move_pos(bulk, res);
+
+	if (!pos->first) {
 		pos->first = res;
-	pos->last = res;
+		pos->last = res;
+	} else {
+		ttm_lru_bulk_move_pos_tail(pos, res);
+	}
+}
+
+/* Remove the resource from a bulk_move range */
+void ttm_lru_bulk_move_rmv(struct ttm_lru_bulk_move *bulk,
+			   struct ttm_resource *res)
+{
+	struct ttm_lru_bulk_move_pos *pos = ttm_lru_bulk_move_pos(bulk, res);
+
+	if (unlikely(pos->first == res && pos->last == res)) {
+		pos->first = NULL;
+		pos->last = NULL;
+	} else if (pos->first == res) {
+		pos->first = list_next_entry(res, lru);
+	} else if (pos->last == res) {
+		pos->last = list_prev_entry(res, lru);
+	} else {
+		list_move(&res->lru, &pos->last->lru);
+	}
 }
 
-/* Move a resource to the LRU tail and track the bulk position */
-void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
-				   struct ttm_lru_bulk_move *bulk)
+/* Move a resource to the LRU or bulk tail */
+void ttm_resource_move_to_lru_tail(struct ttm_resource *res)
 {
 	struct ttm_buffer_object *bo = res->bo;
 	struct ttm_device *bdev = bo->bdev;
-	struct ttm_resource_manager *man;
 
 	if (bo->pin_count) {
 		list_move_tail(&res->lru, &bdev->pinned);
-		if (bdev->funcs->del_from_lru_notify)
-			bdev->funcs->del_from_lru_notify(res->bo);
-		return;
-	}
 
-	man = ttm_manager_type(bdev, res->mem_type);
-	list_move_tail(&res->lru, &man->lru[bo->priority]);
+	} else	if (bo->bulk_move) {
+		struct ttm_lru_bulk_move_pos *pos =
+			ttm_lru_bulk_move_pos(bo->bulk_move, res);
 
-	if (bdev->funcs->del_from_lru_notify)
-		bdev->funcs->del_from_lru_notify(bo);
-
-	if (!bulk)
-		return;
+		ttm_lru_bulk_move_pos_tail(pos, res);
+	} else {
+		struct ttm_resource_manager *man;
 
-	ttm_lru_bulk_move_set_pos(&bulk->pos[res->mem_type][bo->priority], res);
+		man = ttm_manager_type(bdev, res->mem_type);
+		list_move_tail(&res->lru, &man->lru[bo->priority]);
+	}
 }
 
 void ttm_resource_init(struct ttm_buffer_object *bo,
@@ -129,7 +164,7 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
 	atomic64_add(bo->base.size, &man->usage);
 
 	spin_lock(&bo->bdev->lru_lock);
-	ttm_resource_move_to_lru_tail(res, NULL);
+	ttm_resource_move_to_lru_tail(res);
 	spin_unlock(&bo->bdev->lru_lock);
 }
 EXPORT_SYMBOL(ttm_resource_init);
@@ -148,8 +183,6 @@ void ttm_resource_fini(struct ttm_resource_manager *man,
 
 	spin_lock(&bdev->lru_lock);
 	list_del_init(&res->lru);
-	if (res->bo && bdev->funcs->del_from_lru_notify)
-		bdev->funcs->del_from_lru_notify(res->bo);
 	spin_unlock(&bdev->lru_lock);
 
 	atomic64_sub(res->num_pages << PAGE_SHIFT, &man->usage);
@@ -173,6 +206,12 @@ void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res)
 	if (!*res)
 		return;
 
+	if (bo->bulk_move) {
+		spin_lock(&bo->bdev->lru_lock);
+		ttm_lru_bulk_move_rmv(bo->bulk_move, *res);
+		spin_unlock(&bo->bdev->lru_lock);
+	}
+
 	man = ttm_manager_type(bo->bdev, (*res)->mem_type);
 	man->func->free(man, *res);
 	*res = NULL;
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 3da77fc54552..b40c28e884a2 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -135,6 +135,7 @@ struct ttm_buffer_object {
 	struct ttm_resource *resource;
 	struct ttm_tt *ttm;
 	bool deleted;
+	struct ttm_lru_bulk_move *bulk_move;
 
 	/**
 	 * Members protected by the bdev::lru_lock.
@@ -287,18 +288,11 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
  */
 void ttm_bo_put(struct ttm_buffer_object *bo);
 
-/**
- * ttm_bo_move_to_lru_tail
- *
- * @bo: The buffer object.
- * @bulk: optional bulk move structure to remember BO positions
- *
- * Move this BO to the tail of all lru lists used to lookup and reserve an
- * object. This function must be called with struct ttm_global::lru_lock
- * held, and is used to make a BO less likely to be considered for eviction.
- */
-void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
-			     struct ttm_lru_bulk_move *bulk);
+void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo);
+
+
+void ttm_bo_set_bulk_move(struct ttm_buffer_object *bo,
+			  struct ttm_lru_bulk_move *bulk);
 
 /**
  * ttm_bo_lock_delayed_workqueue
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 6c7352e13708..059a595e14e5 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -155,7 +155,7 @@ static inline void
 ttm_bo_move_to_lru_tail_unlocked(struct ttm_buffer_object *bo)
 {
 	spin_lock(&bo->bdev->lru_lock);
-	ttm_bo_move_to_lru_tail(bo, NULL);
+	ttm_bo_move_to_lru_tail(bo);
 	spin_unlock(&bo->bdev->lru_lock);
 }
 
diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 425150f35fbe..95b3c04b1ab9 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -198,15 +198,6 @@ struct ttm_device_funcs {
 	int (*access_memory)(struct ttm_buffer_object *bo, unsigned long offset,
 			     void *buf, int len, int write);
 
-	/**
-	 * struct ttm_bo_driver member del_from_lru_notify
-	 *
-	 * @bo: the buffer object deleted from lru
-	 *
-	 * notify driver that a BO was deleted from LRU.
-	 */
-	void (*del_from_lru_notify)(struct ttm_buffer_object *bo);
-
 	/**
 	 * Notify the driver that we're about to release a BO
 	 *
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 1556d1f62251..d938ee78f536 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -320,10 +320,13 @@ ttm_resource_manager_usage(struct ttm_resource_manager *man)
 }
 
 void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk);
+void ttm_lru_bulk_move_add(struct ttm_lru_bulk_move *bulk,
+			   struct ttm_resource *res);
+void ttm_lru_bulk_move_rmv(struct ttm_lru_bulk_move *bulk,
+			   struct ttm_resource *res);
 void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk);
 
-void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
-				   struct ttm_lru_bulk_move *bulk);
+void ttm_resource_move_to_lru_tail(struct ttm_resource *res);
 
 void ttm_resource_init(struct ttm_buffer_object *bo,
                        const struct ttm_place *place,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread
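
From the driver side the new model reduces to attaching the bulk move
once and detaching it again before the final unref; a hedged sketch of
that lifecycle (the amdgpu_vm field matches the diff above, the helpers
themselves are illustrative):

#include "amdgpu.h"

/* Sketch: TTM now keeps the bulk move ranges up to date on every LRU
 * update, so the driver no longer rebuilds them on each submission.
 */
static void example_vm_attach(struct amdgpu_vm *vm,
			      struct ttm_buffer_object *bo)
{
	ttm_bo_set_bulk_move(bo, &vm->lru_bulk_move);
}

static void example_vm_detach(struct ttm_buffer_object *bo)
{
	/* Must be done before the last reference is dropped,
	 * ttm_bo_release() warns if a bulk move is still set.
	 */
	ttm_bo_set_bulk_move(bo, NULL);
}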

* Re: [PATCH 01/12] drm/ttm: add ttm_resource_fini
  2022-01-24 12:25 ` [PATCH 01/12] drm/ttm: add ttm_resource_fini Christian König
@ 2022-01-25 16:26   ` Daniel Vetter
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2022-01-25 16:26 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, ray.huang, dri-devel

On Mon, Jan 24, 2022 at 01:25:03PM +0100, Christian König wrote:
> Make sure we call the common cleanup function in all
> implementations of the resource manager.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c     | 2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c    | 2 ++
>  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c   | 1 +
>  drivers/gpu/drm/nouveau/nouveau_mem.c           | 3 ++-
>  drivers/gpu/drm/nouveau/nouveau_mem.h           | 3 ++-
>  drivers/gpu/drm/nouveau/nouveau_ttm.c           | 9 +++++----
>  drivers/gpu/drm/ttm/ttm_range_manager.c         | 2 ++
>  drivers/gpu/drm/ttm/ttm_resource.c              | 6 ++++++
>  drivers/gpu/drm/ttm/ttm_sys_manager.c           | 1 +
>  drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c   | 2 ++
>  drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c  | 1 +
>  include/drm/ttm/ttm_resource.h                  | 3 +++
>  13 files changed, 30 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> index 675a72ef305d..ea5470c8c921 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> @@ -169,6 +169,7 @@ static int amdgpu_gtt_mgr_new(struct ttm_resource_manager *man,
>  	return 0;
>  
>  err_free:
> +	ttm_resource_fini(man, &node->base.base);
>  	kfree(node);
>  
>  err_out:
> @@ -200,6 +201,7 @@ static void amdgpu_gtt_mgr_del(struct ttm_resource_manager *man,
>  	if (!(res->placement & TTM_PL_FLAG_TEMPORARY))
>  		atomic64_sub(res->num_pages, &mgr->used);
>  
> +	ttm_resource_fini(man, res);
>  	kfree(node);
>  }
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
> index d02c8637f909..ffddec08e931 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
> @@ -95,6 +95,7 @@ static void amdgpu_preempt_mgr_del(struct ttm_resource_manager *man,
>  	struct amdgpu_preempt_mgr *mgr = to_preempt_mgr(man);
>  
>  	atomic64_sub(res->num_pages, &mgr->used);
> +	ttm_resource_fini(man, res);
>  	kfree(res);
>  }
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 7b2b0980ec41..55d68408951d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -476,6 +476,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
>  	while (i--)
>  		drm_mm_remove_node(&node->mm_nodes[i]);
>  	spin_unlock(&mgr->lock);
> +	ttm_resource_fini(man, &node->base);
>  	kvfree(node);
>  
>  error_sub:
> @@ -515,6 +516,7 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager *man,
>  	atomic64_sub(usage, &mgr->usage);
>  	atomic64_sub(vis_usage, &mgr->vis_usage);
>  
> +	ttm_resource_fini(man, res);
>  	kvfree(node);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
> index 6ba314f9836a..bf857703c797 100644
> --- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
> +++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
> @@ -124,6 +124,7 @@ static void i915_ttm_buddy_man_free(struct ttm_resource_manager *man,
>  	drm_buddy_free_list(&bman->mm, &bman_res->blocks);
>  	mutex_unlock(&bman->lock);
>  
> +	ttm_resource_fini(man, res);
>  	kfree(bman_res);
>  }

The corresponding ttm_resource_fini() call in the error unwind path of
i915_ttm_buddy_man_alloc() is missing.

>  
> diff --git a/drivers/gpu/drm/nouveau/nouveau_mem.c b/drivers/gpu/drm/nouveau/nouveau_mem.c
> index 2ca3207c13fc..2e517cdc24c9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_mem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_mem.c
> @@ -162,11 +162,12 @@ nouveau_mem_vram(struct ttm_resource *reg, bool contig, u8 page)
>  }
>  
>  void
> -nouveau_mem_del(struct ttm_resource *reg)
> +nouveau_mem_del(struct ttm_resource_manager *man, struct ttm_resource *reg)
>  {
>  	struct nouveau_mem *mem = nouveau_mem(reg);
>  
>  	nouveau_mem_fini(mem);
> +	ttm_resource_fini(man, reg);
>  	kfree(mem);
>  }
>  
> diff --git a/drivers/gpu/drm/nouveau/nouveau_mem.h b/drivers/gpu/drm/nouveau/nouveau_mem.h
> index 2c01166a90f2..325551eba5cd 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_mem.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_mem.h
> @@ -23,7 +23,8 @@ nouveau_mem(struct ttm_resource *reg)
>  
>  int nouveau_mem_new(struct nouveau_cli *, u8 kind, u8 comp,
>  		    struct ttm_resource **);
> -void nouveau_mem_del(struct ttm_resource *);
> +void nouveau_mem_del(struct ttm_resource_manager *man,
> +		     struct ttm_resource *);
>  int nouveau_mem_vram(struct ttm_resource *, bool contig, u8 page);
>  int nouveau_mem_host(struct ttm_resource *, struct ttm_tt *);
>  void nouveau_mem_fini(struct nouveau_mem *);
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> index 2ca9d9a9e5d5..91ef33f8f22c 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> @@ -36,9 +36,10 @@
>  #include <core/tegra.h>
>  
>  static void
> -nouveau_manager_del(struct ttm_resource_manager *man, struct ttm_resource *reg)
> +nouveau_manager_del(struct ttm_resource_manager *man,
> +		    struct ttm_resource *reg)
>  {
> -	nouveau_mem_del(reg);
> +	nouveau_mem_del(man, reg);
>  }
>  
>  static int
> @@ -62,7 +63,7 @@ nouveau_vram_manager_new(struct ttm_resource_manager *man,
>  
>  	ret = nouveau_mem_vram(*res, nvbo->contig, nvbo->page);
>  	if (ret) {
> -		nouveau_mem_del(*res);
> +		nouveau_mem_del(man, *res);
>  		return ret;
>  	}
>  
> @@ -118,7 +119,7 @@ nv04_gart_manager_new(struct ttm_resource_manager *man,
>  	ret = nvif_vmm_get(&mem->cli->vmm.vmm, PTES, false, 12, 0,
>  			   (long)(*res)->num_pages << PAGE_SHIFT, &mem->vma[0]);
>  	if (ret) {
> -		nouveau_mem_del(*res);
> +		nouveau_mem_del(man, *res);
>  		return ret;
>  	}
>  
> diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
> index 072e0baf2ab4..55a41355f684 100644
> --- a/drivers/gpu/drm/ttm/ttm_range_manager.c
> +++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
> @@ -89,6 +89,7 @@ static int ttm_range_man_alloc(struct ttm_resource_manager *man,
>  	spin_unlock(&rman->lock);
>  
>  	if (unlikely(ret)) {
> +		ttm_resource_fini(man, *res);
>  		kfree(node);
>  		return ret;
>  	}
> @@ -108,6 +109,7 @@ static void ttm_range_man_free(struct ttm_resource_manager *man,
>  	drm_mm_remove_node(&node->mm_nodes[0]);
>  	spin_unlock(&rman->lock);
>  
> +	ttm_resource_fini(man, res);
>  	kfree(node);
>  }
>  
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index 035d71332d18..89bcfe22a0ca 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -44,6 +44,12 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>  }
>  EXPORT_SYMBOL(ttm_resource_init);
>  

Please add kerneldoc here; at a minimum mention from which callback this
should usually be called. Also link from the kerneldoc of
ttm_resource_init() to this one and back for completeness. I just like to
link stuff as much as possible :-)

Which I guess means: please add kerneldoc for the _init() version too at
least, we've got to start somewhere ... Maybe even the entire file, it's
probably worth reviewing and updating all of it as part of the lru
extraction work here.
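
As an illustration only, something roughly like this (the wording is just
a sketch, adjust as you see fit):

/**
 * ttm_resource_fini - finalize a ttm resource
 * @man: the resource manager this resource belongs to
 * @res: the resource to clean up
 *
 * Counterpart of ttm_resource_init(). Should be called from the
 * ttm_resource_manager_func.free callback (including its error unwind
 * paths) before the memory backing the resource object is freed.
 */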

> +void ttm_resource_fini(struct ttm_resource_manager *man,
> +		       struct ttm_resource *res)
> +{
> +}
> +EXPORT_SYMBOL(ttm_resource_fini);
> +
>  int ttm_resource_alloc(struct ttm_buffer_object *bo,
>  		       const struct ttm_place *place,
>  		       struct ttm_resource **res_ptr)
> diff --git a/drivers/gpu/drm/ttm/ttm_sys_manager.c b/drivers/gpu/drm/ttm/ttm_sys_manager.c
> index 63aca52f75e1..135394dcca95 100644
> --- a/drivers/gpu/drm/ttm/ttm_sys_manager.c
> +++ b/drivers/gpu/drm/ttm/ttm_sys_manager.c
> @@ -23,6 +23,7 @@ static int ttm_sys_man_alloc(struct ttm_resource_manager *man,
>  static void ttm_sys_man_free(struct ttm_resource_manager *man,
>  			     struct ttm_resource *res)
>  {
> +	ttm_resource_fini(man, res);
>  	kfree(res);
>  }
>  
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
> index ebb4505a31a3..99ccf690f8a1 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
> @@ -117,6 +117,7 @@ static int vmw_gmrid_man_get_node(struct ttm_resource_manager *man,
>  	gman->used_gmr_pages -= (*res)->num_pages;
>  	spin_unlock(&gman->lock);
>  	ida_free(&gman->gmr_ida, id);
> +	ttm_resource_fini(man, *res);
>  	kfree(*res);
>  	return -ENOSPC;
>  }
> @@ -130,6 +131,7 @@ static void vmw_gmrid_man_put_node(struct ttm_resource_manager *man,
>  	spin_lock(&gman->lock);
>  	gman->used_gmr_pages -= res->num_pages;
>  	spin_unlock(&gman->lock);
> +	ttm_resource_fini(man, res);
>  	kfree(res);
>  }
>  
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
> index b0005b03a617..a64188c7268d 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
> @@ -49,6 +49,7 @@ static int vmw_sys_man_alloc(struct ttm_resource_manager *man,
>  static void vmw_sys_man_free(struct ttm_resource_manager *man,
>  			     struct ttm_resource *res)
>  {
> +	ttm_resource_fini(man, res);
>  	kfree(res);
>  }
>  
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index 5952051091cd..df1f06b7b504 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -261,6 +261,9 @@ ttm_resource_manager_cleanup(struct ttm_resource_manager *man)
>  void ttm_resource_init(struct ttm_buffer_object *bo,
>                         const struct ttm_place *place,
>                         struct ttm_resource *res);
> +void ttm_resource_fini(struct ttm_resource_manager *man,
> +		       struct ttm_resource *res);
> +
>  int ttm_resource_alloc(struct ttm_buffer_object *bo,
>  		       const struct ttm_place *place,
>  		       struct ttm_resource **res);

With the i915 code fix and kerneldoc added:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 02/12] drm/ttm: add back a reference to the bdev to the res manager
  2022-01-24 12:25 ` [PATCH 02/12] drm/ttm: add back a reference to the bdev to the res manager Christian König
@ 2022-01-25 16:30   ` Daniel Vetter
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2022-01-25 16:30 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, ray.huang, dri-devel

On Mon, Jan 24, 2022 at 01:25:04PM +0100, Christian König wrote:
> It is simply a lot cleaner to have this around instead of adding
> the device throughout the call chain.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Huang Rui <ray.huang@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c     |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c    |  3 ++-
>  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c   |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_ttm.c           |  4 ++--
>  drivers/gpu/drm/ttm/ttm_range_manager.c         |  2 +-
>  drivers/gpu/drm/ttm/ttm_resource.c              |  3 +++
>  drivers/gpu/drm/ttm/ttm_sys_manager.c           |  2 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c   |  2 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c  |  2 +-
>  include/drm/ttm/ttm_resource.h                  | 16 +++++++++-------
>  11 files changed, 24 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> index ea5470c8c921..9e7685a4878c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> @@ -293,7 +293,8 @@ int amdgpu_gtt_mgr_init(struct amdgpu_device *adev, uint64_t gtt_size)
>  	man->use_tt = true;
>  	man->func = &amdgpu_gtt_mgr_func;
>  
> -	ttm_resource_manager_init(man, gtt_size >> PAGE_SHIFT);
> +	ttm_resource_manager_init(man, &adev->mman.bdev,
> +				  gtt_size >> PAGE_SHIFT);
>  
>  	start = AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS;
>  	size = (adev->gmc.gart_size >> PAGE_SHIFT) - start;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
> index ffddec08e931..6f7189d32f0a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c
> @@ -153,7 +153,7 @@ int amdgpu_preempt_mgr_init(struct amdgpu_device *adev)
>  	man->use_tt = true;
>  	man->func = &amdgpu_preempt_mgr_func;
>  
> -	ttm_resource_manager_init(man, (1 << 30));
> +	ttm_resource_manager_init(man, &adev->mman.bdev, (1 << 30));
>  
>  	atomic64_set(&mgr->used, 0);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 55d68408951d..ddd0b6d74070 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -699,7 +699,8 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
>  	struct amdgpu_vram_mgr *mgr = &adev->mman.vram_mgr;
>  	struct ttm_resource_manager *man = &mgr->manager;
>  
> -	ttm_resource_manager_init(man, adev->gmc.real_vram_size >> PAGE_SHIFT);
> +	ttm_resource_manager_init(man, &adev->mman.bdev,
> +				  adev->gmc.real_vram_size >> PAGE_SHIFT);
>  
>  	man->func = &amdgpu_vram_mgr_func;
>  
> diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
> index bf857703c797..79dc8c6f2833 100644
> --- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
> +++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
> @@ -204,7 +204,7 @@ int i915_ttm_buddy_man_init(struct ttm_device *bdev,
>  	man = &bman->manager;
>  	man->use_tt = use_tt;
>  	man->func = &i915_ttm_buddy_manager_func;
> -	ttm_resource_manager_init(man, bman->mm.size >> PAGE_SHIFT);
> +	ttm_resource_manager_init(man, bdev, bman->mm.size >> PAGE_SHIFT);
>  
>  	ttm_resource_manager_set_used(man, true);
>  	ttm_set_driver_manager(bdev, type, man);
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> index 91ef33f8f22c..85f1f5a0fe5d 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> @@ -164,7 +164,7 @@ nouveau_ttm_init_vram(struct nouveau_drm *drm)
>  
>  		man->func = &nouveau_vram_manager;
>  
> -		ttm_resource_manager_init(man,
> +		ttm_resource_manager_init(man, &drm->ttm.bdev,
>  					  drm->gem.vram_available >> PAGE_SHIFT);
>  		ttm_set_driver_manager(&drm->ttm.bdev, TTM_PL_VRAM, man);
>  		ttm_resource_manager_set_used(man, true);
> @@ -211,7 +211,7 @@ nouveau_ttm_init_gtt(struct nouveau_drm *drm)
>  
>  	man->func = func;
>  	man->use_tt = true;
> -	ttm_resource_manager_init(man, size_pages);
> +	ttm_resource_manager_init(man, &drm->ttm.bdev, size_pages);
>  	ttm_set_driver_manager(&drm->ttm.bdev, TTM_PL_TT, man);
>  	ttm_resource_manager_set_used(man, true);
>  	return 0;
> diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
> index 55a41355f684..8cd4f3fb9f79 100644
> --- a/drivers/gpu/drm/ttm/ttm_range_manager.c
> +++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
> @@ -158,7 +158,7 @@ int ttm_range_man_init_nocheck(struct ttm_device *bdev,
>  
>  	man->func = &ttm_range_manager_func;
>  
> -	ttm_resource_manager_init(man, p_size);
> +	ttm_resource_manager_init(man, bdev, p_size);
>  
>  	drm_mm_init(&rman->mm, 0, p_size);
>  	spin_lock_init(&rman->lock);
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index 89bcfe22a0ca..41e7bf195168 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -126,16 +126,19 @@ EXPORT_SYMBOL(ttm_resource_compat);
>   * ttm_resource_manager_init
>   *
>   * @man: memory manager object to init
> + * @bdev: ttm device this manager belongs to
>   * @p_size: size managed area in pages.
>   *
>   * Initialise core parts of a manager object.
>   */
>  void ttm_resource_manager_init(struct ttm_resource_manager *man,
> +			       struct ttm_device *bdev,
>  			       unsigned long p_size)
>  {
>  	unsigned i;
>  
>  	spin_lock_init(&man->move_lock);
> +	man->bdev = bdev;
>  	man->size = p_size;
>  
>  	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
> diff --git a/drivers/gpu/drm/ttm/ttm_sys_manager.c b/drivers/gpu/drm/ttm/ttm_sys_manager.c
> index 135394dcca95..2ced169513cb 100644
> --- a/drivers/gpu/drm/ttm/ttm_sys_manager.c
> +++ b/drivers/gpu/drm/ttm/ttm_sys_manager.c
> @@ -43,7 +43,7 @@ void ttm_sys_man_init(struct ttm_device *bdev)
>  	man->use_tt = true;
>  	man->func = &ttm_sys_manager_func;
>  
> -	ttm_resource_manager_init(man, 0);
> +	ttm_resource_manager_init(man, bdev, 0);
>  	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, man);
>  	ttm_resource_manager_set_used(man, true);
>  }
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
> index 99ccf690f8a1..60e3cc537f36 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
> @@ -162,7 +162,7 @@ int vmw_gmrid_man_init(struct vmw_private *dev_priv, int type)
>  
>  	man->func = &vmw_gmrid_manager_func;
>  	man->use_tt = true;
> -	ttm_resource_manager_init(man, 0);
> +	ttm_resource_manager_init(man, &dev_priv->bdev, 0);
>  	spin_lock_init(&gman->lock);
>  	gman->used_gmr_pages = 0;
>  	ida_init(&gman->gmr_ida);
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
> index a64188c7268d..d3007bf1b8f5 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_system_manager.c
> @@ -70,7 +70,7 @@ int vmw_sys_man_init(struct vmw_private *dev_priv)
>  	man->use_tt = true;
>  	man->func = &vmw_sys_manager_func;
>  
> -	ttm_resource_manager_init(man, 0);
> +	ttm_resource_manager_init(man, bdev, 0);
>  	ttm_set_driver_manager(bdev, VMW_PL_SYSTEM, man);
>  	ttm_resource_manager_set_used(man, true);
>  	return 0;
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index df1f06b7b504..6bf37383002b 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -105,11 +105,11 @@ struct ttm_resource_manager_func {
>   * @use_type: The memory type is enabled.
>   * @use_tt: If a TT object should be used for the backing store.
>   * @size: Size of the managed region.
> + * @bdev: ttm device this manager belongs to
>   * @func: structure pointer implementing the range manager. See above
>   * @move_lock: lock for move fence
> - * static information. bdev::driver::io_mem_free is never used.
> - * @lru: The lru list for this memory type.
>   * @move: The fence of the last pipelined move operation.
> + * @lru: The lru list for this memory type.
>   *
>   * This structure is used to identify and manage memory types for a device.
>   */
> @@ -119,20 +119,21 @@ struct ttm_resource_manager {
>  	 */
>  	bool use_type;
>  	bool use_tt;
> +	struct ttm_device *bdev;
>  	uint64_t size;
>  	const struct ttm_resource_manager_func *func;
>  	spinlock_t move_lock;
>  
>  	/*
> -	 * Protected by the global->lru_lock.
> +	 * Protected by @move_lock.
>  	 */
> -
> -	struct list_head lru[TTM_MAX_BO_PRIORITY];
> +	struct dma_fence *move;
>  
>  	/*
> -	 * Protected by @move_lock.
> +	 * Protected by the global->lru_lock.

Since you edit these anyway, maybe move the kerneldoc from the struct
header comment to inline member comments. Having things split up doesn't
help anyone.
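
I.e. something in this direction (just a sketch of the style, not a
complete conversion):

	/**
	 * @move: The fence of the last pipelined move operation.
	 * Protected by @move_lock.
	 */
	struct dma_fence *move;

	/**
	 * @lru: The lru list for this memory type.
	 * Protected by the global lru_lock.
	 */
	struct list_head lru[TTM_MAX_BO_PRIORITY];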

Anyway I'll trust gcc on this one :-)

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  	 */
> -	struct dma_fence *move;
> +
> +	struct list_head lru[TTM_MAX_BO_PRIORITY];
>  };
>  
>  /**
> @@ -272,6 +273,7 @@ bool ttm_resource_compat(struct ttm_resource *res,
>  			 struct ttm_placement *placement);
>  
>  void ttm_resource_manager_init(struct ttm_resource_manager *man,
> +			       struct ttm_device *bdev,
>  			       unsigned long p_size);
>  
>  int ttm_resource_manager_evict_all(struct ttm_device *bdev,
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 04/12] drm/ttm: add common accounting to the resource mgr v2
  2022-01-24 12:25 ` [PATCH 04/12] drm/ttm: add common accounting to the resource mgr v2 Christian König
@ 2022-01-25 16:37   ` Daniel Vetter
  2022-01-26 14:42     ` Christian König
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel Vetter @ 2022-01-25 16:37 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, ray.huang, dri-devel

On Mon, Jan 24, 2022 at 01:25:06PM +0100, Christian König wrote:
> It makes sense to have this in the common manager for debugging and
> accounting of how much resources are used.
> 
> v2: cleanup kerneldoc a bit
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Huang Rui <ray.huang@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_resource.c |  8 ++++++++
>  include/drm/ttm/ttm_resource.h     | 20 +++++++++++++++++++-
>  2 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index 7fdd58b53c61..b8362492980d 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -33,6 +33,8 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>                         const struct ttm_place *place,
>                         struct ttm_resource *res)
>  {
> +	struct ttm_resource_manager *man;
> +
>  	res->start = 0;
>  	res->num_pages = PFN_UP(bo->base.size);
>  	res->mem_type = place->mem_type;
> @@ -42,12 +44,16 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>  	res->bus.is_iomem = false;
>  	res->bus.caching = ttm_cached;
>  	res->bo = bo;
> +
> +	man = ttm_manager_type(bo->bdev, place->mem_type);
> +	atomic64_add(bo->base.size, &man->usage);

Doing this with atomics doesn't make a lot of sense to me. Yes, with the
current organization it's the only thing drivers can do, but if we move
this into ttm there's no reason we can't track this together with the lru,
consistently with the lru, and under the same spinlock as the lru.

And at least spot-checking a few places the very next thing we generally
do is take the lru lock since there's really no other way to get the
resource into or out of the resource manager.

I think doing atomics for statistics when there's no need is not great,
because then people start using atomics for all kinds of other things, and
then get the barriers wrong. In i915 (simply due to the grotesque amount
of buggy overuse of atomics, both atomic_t and atomic bitfields) we've put
a hard block in place for any new atomic additions. So that's why I'm a
bit on a crusade, but I also genuinely don't see why we need them here.
All they do is cost more since we have to take the spinlock anyway; the
accounting just ends up in a slightly different (and imo more accurate)
place.
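
I.e. roughly this instead of the atomic64_add() (completely untested,
just to illustrate what I mean):

	spin_lock(&bo->bdev->lru_lock);
	man->usage += bo->base.size;	/* plain u64, protected by lru_lock */
	spin_unlock(&bo->bdev->lru_lock);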

Thoughts?

Cheers, Daniel

>  }
>  EXPORT_SYMBOL(ttm_resource_init);
>  
>  void ttm_resource_fini(struct ttm_resource_manager *man,
>  		       struct ttm_resource *res)
>  {
> +	atomic64_sub(res->bo->base.size, &man->usage);
>  }
>  EXPORT_SYMBOL(ttm_resource_fini);
>  
> @@ -149,6 +155,7 @@ void ttm_resource_manager_init(struct ttm_resource_manager *man,
>  	spin_lock_init(&man->move_lock);
>  	man->bdev = bdev;
>  	man->size = p_size;
> +	atomic64_set(&man->usage, 0);
>  
>  	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
>  		INIT_LIST_HEAD(&man->lru[i]);
> @@ -221,6 +228,7 @@ void ttm_resource_manager_debug(struct ttm_resource_manager *man,
>  	drm_printf(p, "  use_type: %d\n", man->use_type);
>  	drm_printf(p, "  use_tt: %d\n", man->use_tt);
>  	drm_printf(p, "  size: %llu\n", man->size);
> +	drm_printf(p, "  usage: %llu\n", atomic64_read(&man->usage));
>  	if (man->func->debug)
>  		man->func->debug(man, p);
>  }
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index 69eea9d6399b..3d391279b33f 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -27,6 +27,7 @@
>  
>  #include <linux/types.h>
>  #include <linux/mutex.h>
> +#include <linux/atomic.h>
>  #include <linux/dma-buf-map.h>
>  #include <linux/dma-fence.h>
>  #include <drm/drm_print.h>
> @@ -132,8 +133,12 @@ struct ttm_resource_manager {
>  	/*
>  	 * Protected by the global->lru_lock.
>  	 */
> -
>  	struct list_head lru[TTM_MAX_BO_PRIORITY];
> +
> +	/**
> +	 * @usage: How much of the region is used, has its own protection.
> +	 */
> +	atomic64_t usage;
>  };
>  
>  /**
> @@ -261,6 +266,19 @@ ttm_resource_manager_cleanup(struct ttm_resource_manager *man)
>  	man->move = NULL;
>  }
>  
> +/**
> + * ttm_resource_manager_usage
> + *
> + * @man: A memory manager object.
> + *
> + * Return how many resources are currently used.
> + */
> +static inline uint64_t
> +ttm_resource_manager_usage(struct ttm_resource_manager *man)
> +{
> +	return atomic64_read(&man->usage);
> +}
> +
>  void ttm_resource_init(struct ttm_buffer_object *bo,
>                         const struct ttm_place *place,
>                         struct ttm_resource *res);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 05/12] drm/ttm: move the LRU into resource handling v2
  2022-01-24 12:25 ` [PATCH 05/12] drm/ttm: move the LRU into resource handling v2 Christian König
@ 2022-01-25 16:52   ` Daniel Vetter
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2022-01-25 16:52 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, ray.huang, dri-devel

On Mon, Jan 24, 2022 at 01:25:07PM +0100, Christian König wrote:
> This way we finally fix the problem that new resource are
> not immediately evict-able after allocation.
> 
> That has caused numerous problems including OOM on GDS handling
> and not being able to use TTM as general resource manager.
> 
> v2: stop assuming in ttm_resource_fini that res->bo is still valid.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |   8 +-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c |   2 +-
>  drivers/gpu/drm/ttm/ttm_bo.c            | 115 ++--------------------
>  drivers/gpu/drm/ttm/ttm_bo_util.c       |   1 -
>  drivers/gpu/drm/ttm/ttm_device.c        |  64 ++++++-------
>  drivers/gpu/drm/ttm/ttm_resource.c      | 122 +++++++++++++++++++++++-
>  include/drm/ttm/ttm_bo_api.h            |  16 ----
>  include/drm/ttm/ttm_bo_driver.h         |  29 +-----
>  include/drm/ttm/ttm_resource.h          |  35 +++++++
>  9 files changed, 201 insertions(+), 191 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index a96ae4c0e040..cdb489cdf27b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -683,12 +683,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
>  
>  	if (vm->bulk_moveable) {
>  		spin_lock(&adev->mman.bdev.lru_lock);
> -		ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
> +		ttm_lru_bulk_move_tail(&vm->lru_bulk_move);
>  		spin_unlock(&adev->mman.bdev.lru_lock);
>  		return;
>  	}
>  
> -	memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
> +	ttm_lru_bulk_move_init(&vm->lru_bulk_move);
>  
>  	spin_lock(&adev->mman.bdev.lru_lock);
>  	list_for_each_entry(bo_base, &vm->idle, vm_status) {
> @@ -698,11 +698,9 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
>  		if (!bo->parent)
>  			continue;
>  
> -		ttm_bo_move_to_lru_tail(&bo->tbo, bo->tbo.resource,
> -					&vm->lru_bulk_move);
> +		ttm_bo_move_to_lru_tail(&bo->tbo, &vm->lru_bulk_move);
>  		if (shadow)
>  			ttm_bo_move_to_lru_tail(&shadow->tbo,
> -						shadow->tbo.resource,
>  						&vm->lru_bulk_move);
>  	}
>  	spin_unlock(&adev->mman.bdev.lru_lock);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 218a9b3037c7..cdf1ff87c88d 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -835,7 +835,7 @@ void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
>  			bo->priority = I915_TTM_PRIO_NO_PAGES;
>  	}
>  
> -	ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
> +	ttm_bo_move_to_lru_tail(bo, NULL);
>  	spin_unlock(&bo->bdev->lru_lock);
>  }
>  
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index db3dc7ef5382..cb0fa932d495 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -69,108 +69,16 @@ static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo,
>  	}
>  }
>  
> -static inline void ttm_bo_move_to_pinned(struct ttm_buffer_object *bo)
> -{
> -	struct ttm_device *bdev = bo->bdev;
> -
> -	list_move_tail(&bo->lru, &bdev->pinned);
> -
> -	if (bdev->funcs->del_from_lru_notify)
> -		bdev->funcs->del_from_lru_notify(bo);
> -}
> -
> -static inline void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
> -{
> -	struct ttm_device *bdev = bo->bdev;
> -
> -	list_del_init(&bo->lru);
> -
> -	if (bdev->funcs->del_from_lru_notify)
> -		bdev->funcs->del_from_lru_notify(bo);
> -}
> -
> -static void ttm_bo_bulk_move_set_pos(struct ttm_lru_bulk_move_pos *pos,
> -				     struct ttm_buffer_object *bo)
> -{
> -	if (!pos->first)
> -		pos->first = bo;
> -	pos->last = bo;
> -}
> -
>  void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
> -			     struct ttm_resource *mem,
>  			     struct ttm_lru_bulk_move *bulk)
>  {
> -	struct ttm_device *bdev = bo->bdev;
> -	struct ttm_resource_manager *man;
> -
> -	if (!bo->deleted)
> -		dma_resv_assert_held(bo->base.resv);
> -
> -	if (bo->pin_count) {
> -		ttm_bo_move_to_pinned(bo);
> -		return;
> -	}
> -
> -	if (!mem)
> -		return;
> -
> -	man = ttm_manager_type(bdev, mem->mem_type);
> -	list_move_tail(&bo->lru, &man->lru[bo->priority]);
> -
> -	if (bdev->funcs->del_from_lru_notify)
> -		bdev->funcs->del_from_lru_notify(bo);
> -
> -	if (bulk && !bo->pin_count) {
> -		switch (bo->resource->mem_type) {
> -		case TTM_PL_TT:
> -			ttm_bo_bulk_move_set_pos(&bulk->tt[bo->priority], bo);
> -			break;
> +	dma_resv_assert_held(bo->base.resv);
>  
> -		case TTM_PL_VRAM:
> -			ttm_bo_bulk_move_set_pos(&bulk->vram[bo->priority], bo);
> -			break;
> -		}
> -	}
> +	if (bo->resource)
> +		ttm_resource_move_to_lru_tail(bo->resource, bulk);
>  }
>  EXPORT_SYMBOL(ttm_bo_move_to_lru_tail);
>  
> -void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move *bulk)
> -{
> -	unsigned i;
> -
> -	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> -		struct ttm_lru_bulk_move_pos *pos = &bulk->tt[i];
> -		struct ttm_resource_manager *man;
> -
> -		if (!pos->first)
> -			continue;
> -
> -		dma_resv_assert_held(pos->first->base.resv);
> -		dma_resv_assert_held(pos->last->base.resv);
> -
> -		man = ttm_manager_type(pos->first->bdev, TTM_PL_TT);
> -		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
> -				    &pos->last->lru);
> -	}
> -
> -	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> -		struct ttm_lru_bulk_move_pos *pos = &bulk->vram[i];
> -		struct ttm_resource_manager *man;
> -
> -		if (!pos->first)
> -			continue;
> -
> -		dma_resv_assert_held(pos->first->base.resv);
> -		dma_resv_assert_held(pos->last->base.resv);
> -
> -		man = ttm_manager_type(pos->first->bdev, TTM_PL_VRAM);
> -		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
> -				    &pos->last->lru);
> -	}
> -}
> -EXPORT_SYMBOL(ttm_bo_bulk_move_lru_tail);
> -
>  static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>  				  struct ttm_resource *mem, bool evict,
>  				  struct ttm_operation_ctx *ctx,
> @@ -344,7 +252,6 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>  		return ret;
>  	}
>  
> -	ttm_bo_move_to_pinned(bo);
>  	list_del_init(&bo->ddestroy);
>  	spin_unlock(&bo->bdev->lru_lock);
>  	ttm_bo_cleanup_memtype_use(bo);
> @@ -445,7 +352,7 @@ static void ttm_bo_release(struct kref *kref)
>  		 */
>  		if (bo->pin_count) {
>  			bo->pin_count = 0;
> -			ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
> +			ttm_resource_move_to_lru_tail(bo->resource, NULL);
>  		}
>  
>  		kref_init(&bo->kref);
> @@ -458,7 +365,6 @@ static void ttm_bo_release(struct kref *kref)
>  	}
>  
>  	spin_lock(&bo->bdev->lru_lock);
> -	ttm_bo_del_from_lru(bo);
>  	list_del(&bo->ddestroy);
>  	spin_unlock(&bo->bdev->lru_lock);
>  
> @@ -673,15 +579,17 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
>  			struct ww_acquire_ctx *ticket)
>  {
>  	struct ttm_buffer_object *bo = NULL, *busy_bo = NULL;
> +	struct ttm_resource *res;
>  	bool locked = false;
>  	unsigned i;
>  	int ret;
>  
>  	spin_lock(&bdev->lru_lock);
>  	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> -		list_for_each_entry(bo, &man->lru[i], lru) {
> +		list_for_each_entry(res, &man->lru[i], lru) {
>  			bool busy;
>  
> +			bo = res->bo;
>  			if (!ttm_bo_evict_swapout_allowable(bo, ctx, place,
>  							    &locked, &busy)) {
>  				if (busy && !busy_bo && ticket !=
> @@ -699,7 +607,7 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
>  		}
>  
>  		/* If the inner loop terminated early, we have our candidate */
> -		if (&bo->lru != &man->lru[i])
> +		if (&res->lru != &man->lru[i])
>  			break;
>  
>  		bo = NULL;
> @@ -875,9 +783,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
>  	}
>  
>  error:
> -	if (bo->resource->mem_type == TTM_PL_SYSTEM && !bo->pin_count)
> -		ttm_bo_move_to_lru_tail_unlocked(bo);
> -
>  	return ret;
>  }
>  EXPORT_SYMBOL(ttm_bo_mem_space);
> @@ -971,7 +876,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
>  	bo->destroy = destroy ? destroy : ttm_bo_default_destroy;
>  
>  	kref_init(&bo->kref);
> -	INIT_LIST_HEAD(&bo->lru);
>  	INIT_LIST_HEAD(&bo->ddestroy);
>  	bo->bdev = bdev;
>  	bo->type = type;
> @@ -1021,8 +925,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
>  		return ret;
>  	}
>  
> -	ttm_bo_move_to_lru_tail_unlocked(bo);
> -
>  	return ret;
>  }
>  EXPORT_SYMBOL(ttm_bo_init_reserved);
> @@ -1123,7 +1025,6 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
>  		return ret == -EBUSY ? -ENOSPC : ret;
>  	}
>  
> -	ttm_bo_move_to_pinned(bo);
>  	/* TODO: Cleanup the locking */
>  	spin_unlock(&bo->bdev->lru_lock);
>  
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 544a84fa6589..0163e92b57af 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -231,7 +231,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>  
>  	atomic_inc(&ttm_glob.bo_count);
>  	INIT_LIST_HEAD(&fbo->base.ddestroy);
> -	INIT_LIST_HEAD(&fbo->base.lru);
>  	fbo->base.moving = NULL;
>  	drm_vma_node_reset(&fbo->base.base.vma_node);
>  
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index be24bb6cefd0..ba35887147ba 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -144,6 +144,7 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>  {
>  	struct ttm_resource_manager *man;
>  	struct ttm_buffer_object *bo;
> +	struct ttm_resource *res;
>  	unsigned i, j;
>  	int ret;
>  
> @@ -154,8 +155,11 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>  			continue;
>  
>  		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
> -			list_for_each_entry(bo, &man->lru[j], lru) {
> -				uint32_t num_pages = PFN_UP(bo->base.size);
> +			list_for_each_entry(res, &man->lru[j], lru) {
> +				uint32_t num_pages;
> +
> +				bo = res->bo;
> +				num_pages = PFN_UP(bo->base.size);
>  
>  				ret = ttm_bo_swapout(bo, ctx, gfp_flags);
>  				/* ttm_bo_swapout has dropped the lru_lock */
> @@ -259,49 +263,45 @@ void ttm_device_fini(struct ttm_device *bdev)
>  }
>  EXPORT_SYMBOL(ttm_device_fini);
>  
> -void ttm_device_clear_dma_mappings(struct ttm_device *bdev)
> +static void ttm_device_clear_lru_dma_mappings(struct ttm_device *bdev,
> +					      struct list_head *list)
>  {
> -	struct ttm_resource_manager *man;
> -	struct ttm_buffer_object *bo;
> -	unsigned int i, j;
> +	struct ttm_resource *res;
>  
>  	spin_lock(&bdev->lru_lock);
> -	while (!list_empty(&bdev->pinned)) {
> -		bo = list_first_entry(&bdev->pinned, struct ttm_buffer_object, lru);
> +	while ((res = list_first_entry_or_null(list, typeof(*res), lru))) {
> +		struct ttm_buffer_object *bo = res->bo;
> +
>  		/* Take ref against racing releases once lru_lock is unlocked */
> -		if (ttm_bo_get_unless_zero(bo)) {
> -			list_del_init(&bo->lru);
> -			spin_unlock(&bdev->lru_lock);
> +		if (!ttm_bo_get_unless_zero(bo))
> +			continue;
>  
> -			if (bo->ttm)
> -				ttm_tt_unpopulate(bo->bdev, bo->ttm);
> +		list_del_init(&res->lru);
> +		spin_unlock(&bdev->lru_lock);
>  
> -			ttm_bo_put(bo);
> -			spin_lock(&bdev->lru_lock);
> -		}
> +		if (bo->ttm)
> +			ttm_tt_unpopulate(bo->bdev, bo->ttm);
> +
> +		ttm_bo_put(bo);
> +		spin_lock(&bdev->lru_lock);
>  	}
> +	spin_unlock(&bdev->lru_lock);
> +}
> +
> +void ttm_device_clear_dma_mappings(struct ttm_device *bdev)
> +{
> +	struct ttm_resource_manager *man;
> +	unsigned int i, j;
> +
> +	ttm_device_clear_lru_dma_mappings(bdev, &bdev->pinned);
>  
>  	for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
>  		man = ttm_manager_type(bdev, i);
>  		if (!man || !man->use_tt)
>  			continue;
>  
> -		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
> -			while (!list_empty(&man->lru[j])) {
> -				bo = list_first_entry(&man->lru[j], struct ttm_buffer_object, lru);
> -				if (ttm_bo_get_unless_zero(bo)) {
> -					list_del_init(&bo->lru);
> -					spin_unlock(&bdev->lru_lock);
> -
> -					if (bo->ttm)
> -						ttm_tt_unpopulate(bo->bdev, bo->ttm);
> -
> -					ttm_bo_put(bo);
> -					spin_lock(&bdev->lru_lock);
> -				}
> -			}
> -		}
> +		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j)
> +			ttm_device_clear_lru_dma_mappings(bdev, &man->lru[j]);
>  	}
> -	spin_unlock(&bdev->lru_lock);
>  }
>  EXPORT_SYMBOL(ttm_device_clear_dma_mappings);
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index b8362492980d..450e665c357b 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -29,6 +29,106 @@
>  #include <drm/ttm/ttm_resource.h>
>  #include <drm/ttm/ttm_bo_driver.h>
>  
> +/**
> + * ttm_lru_bulk_move_init - initialize a bulk move structure
> + * @bulk: the structure to init
> + *
> + * For now just memset the structure to zero.
> + */
> +void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk)

I like the ttm_lru_bulk_move prefix bikeshed.

> +{
> +	memset(bulk, 0, sizeof(*bulk));
> +}
> +EXPORT_SYMBOL(ttm_lru_bulk_move_init);
> +
> +/**
> + * ttm_lru_bulk_move_tail
> + *
> + * @bulk: bulk move structure
> + *
> + * Bulk move BOs to the LRU tail, only valid to use when driver makes sure that
> + * resource order never changes. Should be called with ttm_device::lru_lock held.

							  ^
This is doxygen link syntax or something, definitely not sphinx.  Pls fix.

Also I think a bit more explanation of how exactly this works (e.g.
mention that you need to guarantee that _all_ the bos in your bulk lru
section are locked) and other things would be good here. I know this is
just moved, but given how much fun you guys had debugging corner cases I
think it's worth elaborating here a bit.
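
Maybe something in this direction (rough sketch only, please put the
actual rules in your own words):

/**
 * ttm_lru_bulk_move_tail - bulk move a range of resources to the LRU tail
 * @bulk: bulk move structure recording the first/last resource per domain
 *
 * Bulk move the recorded resources to the LRU tail. Only valid when the
 * driver guarantees that the order of the resources between first and
 * last never changes and that _all_ the BOs in that range are locked
 * while this is called. Must be called with the lru_lock held.
 */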

> + */
> +void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk)
> +{
> +	unsigned i;

Also lockdep_assert_held() here for the lru lock, given all the callers
all over the place.

> +
> +	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> +		struct ttm_lru_bulk_move_pos *pos = &bulk->tt[i];
> +		struct ttm_resource_manager *man;
> +
> +		if (!pos->first)
> +			continue;
> +
> +		dma_resv_assert_held(pos->first->bo->base.resv);

btw random idea for debug mode with lockdep enabled: Loop from first ->
last and check that they're _all_ locked? Also catches the case where
first/last aren't on the same list, and other screw-ups. And performance
with lockdep enabled is pretty bad anyway, so the full list walk here
shouldn't be a big disaster.

Or maybe only do that when we have the special dma_resv debug Kconfig
enabled.
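
Roughly like this maybe (untested sketch; whether to key it off
CONFIG_LOCKDEP or a dedicated debug Kconfig is up for debate):

		if (IS_ENABLED(CONFIG_LOCKDEP)) {
			struct ttm_resource *cur = pos->first;

			/* walk the whole bulk range and check every BO is locked */
			for (;;) {
				dma_resv_assert_held(cur->bo->base.resv);
				if (cur == pos->last)
					break;
				cur = list_next_entry(cur, lru);
			}
		}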

> +		dma_resv_assert_held(pos->last->bo->base.resv);
> +
> +		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_TT);
> +		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
> +				    &pos->last->lru);
> +	}
> +
> +	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> +		struct ttm_lru_bulk_move_pos *pos = &bulk->vram[i];
> +		struct ttm_resource_manager *man;
> +
> +		if (!pos->first)
> +			continue;
> +
> +		dma_resv_assert_held(pos->first->bo->base.resv);
> +		dma_resv_assert_held(pos->last->bo->base.resv);
> +
> +		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_VRAM);
> +		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
> +				    &pos->last->lru);
> +	}
> +}
> +EXPORT_SYMBOL(ttm_lru_bulk_move_tail);
> +
> +/* Record a resource position in a bulk move structure */
> +static void ttm_lru_bulk_move_set_pos(struct ttm_lru_bulk_move_pos *pos,
> +				      struct ttm_resource *res)
> +{
> +	if (!pos->first)
> +		pos->first = res;
> +	pos->last = res;
> +}
> +
> +/* Move a resource to the LRU tail and track the bulk position */
> +void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
> +				   struct ttm_lru_bulk_move *bulk)
> +{
> +	struct ttm_buffer_object *bo = res->bo;
> +	struct ttm_device *bdev = bo->bdev;
> +	struct ttm_resource_manager *man;

I think lockdep_assert_held here on the lru lock would be good, since
there's quite a few callers all over.

Also maybe double-check that the dma_resv is also held, like in the bulk
move. I get why the bulk move checks more since it's easier to screw that
up, but checking here is probably also good.

> +
> +	if (bo->pin_count) {
> +		list_move_tail(&res->lru, &bdev->pinned);
> +		if (bdev->funcs->del_from_lru_notify)
> +			bdev->funcs->del_from_lru_notify(res->bo);
> +		return;
> +	}
> +
> +	man = ttm_manager_type(bdev, res->mem_type);
> +	list_move_tail(&res->lru, &man->lru[bo->priority]);
> +
> +	if (bdev->funcs->del_from_lru_notify)
> +		bdev->funcs->del_from_lru_notify(bo);
> +
> +	if (!bulk)
> +		return;
> +
> +	switch (res->mem_type) {
> +	case TTM_PL_TT:
> +		ttm_lru_bulk_move_set_pos(&bulk->tt[bo->priority], res);
> +		break;
> +
> +	case TTM_PL_VRAM:
> +		ttm_lru_bulk_move_set_pos(&bulk->vram[bo->priority], res);
> +		break;
> +	}
> +}
> +
>  void ttm_resource_init(struct ttm_buffer_object *bo,
>                         const struct ttm_place *place,
>                         struct ttm_resource *res)
> @@ -44,16 +144,36 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>  	res->bus.is_iomem = false;
>  	res->bus.caching = ttm_cached;
>  	res->bo = bo;
> +	INIT_LIST_HEAD(&res->lru);
>  
>  	man = ttm_manager_type(bo->bdev, place->mem_type);
>  	atomic64_add(bo->base.size, &man->usage);
> +
> +	spin_lock(&bo->bdev->lru_lock);
> +	ttm_resource_move_to_lru_tail(res, NULL);
> +	spin_unlock(&bo->bdev->lru_lock);
>  }
>  EXPORT_SYMBOL(ttm_resource_init);
>  
> +/**
> + * ttm_resource_fini
> + *
> + * @res: the resource to clean up
> + *
> + * Make sure the resource is removed from the LRU before destruction.
> + */

Ah, the kerneldoc should probably be put into the earlier patch and
explain more about when driver authors need to call this, and not so much
about what it internally needs to guarantee.

>  void ttm_resource_fini(struct ttm_resource_manager *man,
>  		       struct ttm_resource *res)
>  {
> -	atomic64_sub(res->bo->base.size, &man->usage);
> +	struct ttm_device *bdev = man->bdev;
> +
> +	spin_lock(&bdev->lru_lock);
> +	list_del_init(&res->lru);
> +	if (res->bo && bdev->funcs->del_from_lru_notify)
> +		bdev->funcs->del_from_lru_notify(res->bo);
> +	spin_unlock(&bdev->lru_lock);
> +
> +	atomic64_sub(res->num_pages << PAGE_SHIFT, &man->usage);

Yeah, more reasons that you really don't need an atomic here. I think if
you reorder the patches to first move the lru and then add the accounting
it will be blatantly obvious that we don't need the resource tracking to
be atomic.


>  }
>  EXPORT_SYMBOL(ttm_resource_fini);
>  
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index c17b2df9178b..3da77fc54552 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -55,8 +55,6 @@ struct ttm_placement;
>  
>  struct ttm_place;
>  
> -struct ttm_lru_bulk_move;
> -
>  /**
>   * enum ttm_bo_type
>   *
> @@ -94,7 +92,6 @@ struct ttm_tt;
>   * @ttm: TTM structure holding system pages.
>   * @evicted: Whether the object was evicted without user-space knowing.
>   * @deleted: True if the object is only a zombie and already deleted.
> - * @lru: List head for the lru list.
>   * @ddestroy: List head for the delayed destroy list.
>   * @swap: List head for swap LRU list.
>   * @moving: Fence set when BO is moving
> @@ -143,7 +140,6 @@ struct ttm_buffer_object {
>  	 * Members protected by the bdev::lru_lock.
>  	 */
>  
> -	struct list_head lru;
>  	struct list_head ddestroy;
>  
>  	/**
> @@ -295,7 +291,6 @@ void ttm_bo_put(struct ttm_buffer_object *bo);
>   * ttm_bo_move_to_lru_tail
>   *
>   * @bo: The buffer object.
> - * @mem: Resource object.
>   * @bulk: optional bulk move structure to remember BO positions
>   *
>   * Move this BO to the tail of all lru lists used to lookup and reserve an
> @@ -303,19 +298,8 @@ void ttm_bo_put(struct ttm_buffer_object *bo);
>   * held, and is used to make a BO less likely to be considered for eviction.
>   */
>  void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
> -			     struct ttm_resource *mem,
>  			     struct ttm_lru_bulk_move *bulk);
>  
> -/**
> - * ttm_bo_bulk_move_lru_tail
> - *
> - * @bulk: bulk move structure
> - *
> - * Bulk move BOs to the LRU tail, only valid to use when driver makes sure that
> - * BO order never changes. Should be called with ttm_global::lru_lock held.
> - */
> -void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move *bulk);
> -
>  /**
>   * ttm_bo_lock_delayed_workqueue
>   *
> diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
> index 5f087575194b..6c7352e13708 100644
> --- a/include/drm/ttm/ttm_bo_driver.h
> +++ b/include/drm/ttm/ttm_bo_driver.h
> @@ -45,33 +45,6 @@
>  #include "ttm_tt.h"
>  #include "ttm_pool.h"
>  
> -/**
> - * struct ttm_lru_bulk_move_pos
> - *
> - * @first: first BO in the bulk move range
> - * @last: last BO in the bulk move range
> - *
> - * Positions for a lru bulk move.
> - */
> -struct ttm_lru_bulk_move_pos {
> -	struct ttm_buffer_object *first;
> -	struct ttm_buffer_object *last;
> -};
> -
> -/**
> - * struct ttm_lru_bulk_move
> - *
> - * @tt: first/last lru entry for BOs in the TT domain
> - * @vram: first/last lru entry for BOs in the VRAM domain
> - * @swap: first/last lru entry for BOs on the swap list
> - *
> - * Helper structure for bulk moves on the LRU list.
> - */
> -struct ttm_lru_bulk_move {
> -	struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
> -	struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
> -};
> -
>  /*
>   * ttm_bo.c
>   */
> @@ -182,7 +155,7 @@ static inline void
>  ttm_bo_move_to_lru_tail_unlocked(struct ttm_buffer_object *bo)
>  {
>  	spin_lock(&bo->bdev->lru_lock);
> -	ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
> +	ttm_bo_move_to_lru_tail(bo, NULL);
>  	spin_unlock(&bo->bdev->lru_lock);
>  }
>  
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index 3d391279b33f..a54d52517a30 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -26,10 +26,12 @@
>  #define _TTM_RESOURCE_H_
>  
>  #include <linux/types.h>
> +#include <linux/list.h>
>  #include <linux/mutex.h>
>  #include <linux/atomic.h>
>  #include <linux/dma-buf-map.h>
>  #include <linux/dma-fence.h>
> +
>  #include <drm/drm_print.h>
>  #include <drm/ttm/ttm_caching.h>
>  #include <drm/ttm/ttm_kmap_iter.h>
> @@ -178,6 +180,33 @@ struct ttm_resource {
>  	uint32_t placement;
>  	struct ttm_bus_placement bus;
>  	struct ttm_buffer_object *bo;

Kerneldoc missing here. Please use an in-line one which explains:
- locking
- refcounting
- anything else that ever matters really, like that we guarantee that
  statistics gathered are consistent when you hold the lru lock
- Maybe link to the bulk lru movement for added doc goodness?
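
For the locking side, maybe something like this inline style (sketch
only):

	/**
	 * @lru: Least recently used list, protected by the lru_lock of
	 * the ttm_device the resource belongs to. Statistics gathered
	 * while holding that lock are consistent with the list.
	 */
	struct list_head lru;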

> +	struct list_head lru;
> +};
> +
> +/**
> + * struct ttm_lru_bulk_move_pos
> + *
> + * @first: first res in the bulk move range
> + * @last: last res in the bulk move range
> + *
> + * Positions for a lru bulk move.
> + */
> +struct ttm_lru_bulk_move_pos {
> +	struct ttm_resource *first;
> +	struct ttm_resource *last;
> +};
> +
> +/**
> + * struct ttm_lru_bulk_move
> + *
> + * @tt: first/last lru entry for resources in the TT domain
> + * @vram: first/last lru entry for resources in the VRAM domain
> + *
> + * Helper structure for bulk moves on the LRU list.

I know it's just moved, but please also link to the bulk lru functions
here.

> + */
> +struct ttm_lru_bulk_move {
> +	struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
> +	struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
>  };
>  
>  /**
> @@ -279,6 +308,12 @@ ttm_resource_manager_usage(struct ttm_resource_manager *man)
>  	return atomic64_read(&man->usage);
>  }
>  
> +void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk);
> +void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk);
> +
> +void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
> +				   struct ttm_lru_bulk_move *bulk);
> +
>  void ttm_resource_init(struct ttm_buffer_object *bo,
>                         const struct ttm_place *place,
>                         struct ttm_resource *res);

I didn't check all the code movements in detail, but looks good with a few
spot checks. With the bikesheds addressed somehow:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 06/12] drm/ttm: add resource iterator
  2022-01-24 12:25 ` [PATCH 06/12] drm/ttm: add resource iterator Christian König
@ 2022-01-25 16:56   ` Daniel Vetter
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2022-01-25 16:56 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, ray.huang, dri-devel

On Mon, Jan 24, 2022 at 01:25:08PM +0100, Christian König wrote:
> Instead of duplicating that at different places add an iterator over all
> the resources in a resource manager.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c       | 41 +++++++++++----------------
>  drivers/gpu/drm/ttm/ttm_device.c   | 26 ++++++++---------
>  drivers/gpu/drm/ttm/ttm_resource.c | 45 ++++++++++++++++++++++++++++++
>  include/drm/ttm/ttm_resource.h     | 23 +++++++++++++++
>  4 files changed, 95 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index cb0fa932d495..599be3dda8a9 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -579,38 +579,29 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
>  			struct ww_acquire_ctx *ticket)
>  {
>  	struct ttm_buffer_object *bo = NULL, *busy_bo = NULL;
> +	struct ttm_resource_cursor cursor;
>  	struct ttm_resource *res;
>  	bool locked = false;
> -	unsigned i;
>  	int ret;
>  
>  	spin_lock(&bdev->lru_lock);
> -	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> -		list_for_each_entry(res, &man->lru[i], lru) {
> -			bool busy;
> -
> -			bo = res->bo;
> -			if (!ttm_bo_evict_swapout_allowable(bo, ctx, place,
> -							    &locked, &busy)) {
> -				if (busy && !busy_bo && ticket !=
> -				    dma_resv_locking_ctx(bo->base.resv))
> -					busy_bo = bo;
> -				continue;
> -			}
> -
> -			if (!ttm_bo_get_unless_zero(bo)) {
> -				if (locked)
> -					dma_resv_unlock(bo->base.resv);
> -				continue;
> -			}
> -			break;
> +	ttm_resource_manager_for_each_res(man, &cursor, res) {
> +		bool busy;
> +
> +		if (!ttm_bo_evict_swapout_allowable(res->bo, ctx, place,
> +						    &locked, &busy)) {
> +			if (busy && !busy_bo && ticket !=
> +			    dma_resv_locking_ctx(bo->base.resv))
> +				busy_bo = res->bo;
> +			continue;
>  		}
>  
> -		/* If the inner loop terminated early, we have our candidate */
> -		if (&res->lru != &man->lru[i])
> -			break;
> -
> -		bo = NULL;
> +		if (!ttm_bo_get_unless_zero(res->bo)) {
> +			if (locked)
> +				dma_resv_unlock(res->bo->base.resv);
> +			continue;
> +		}
> +		bo = res->bo;
>  	}
>  
>  	if (!bo) {
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index ba35887147ba..a0562ab386f5 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -142,10 +142,10 @@ EXPORT_SYMBOL(ttm_global_swapout);
>  int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>  		       gfp_t gfp_flags)
>  {
> +	struct ttm_resource_cursor cursor;
>  	struct ttm_resource_manager *man;
> -	struct ttm_buffer_object *bo;
>  	struct ttm_resource *res;
> -	unsigned i, j;
> +	unsigned i;
>  	int ret;
>  
>  	spin_lock(&bdev->lru_lock);
> @@ -154,20 +154,16 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>  		if (!man || !man->use_tt)
>  			continue;
>  
> -		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
> -			list_for_each_entry(res, &man->lru[j], lru) {
> -				uint32_t num_pages;
> -
> -				bo = res->bo;
> -				num_pages = PFN_UP(bo->base.size);
> +		ttm_resource_manager_for_each_res(man, &cursor, res) {
> +			struct ttm_buffer_object *bo = res->bo;
> +			uint32_t num_pages = PFN_UP(bo->base.size);
>  
> -				ret = ttm_bo_swapout(bo, ctx, gfp_flags);
> -				/* ttm_bo_swapout has dropped the lru_lock */
> -				if (!ret)
> -					return num_pages;
> -				if (ret != -EBUSY)
> -					return ret;
> -			}
> +			ret = ttm_bo_swapout(bo, ctx, gfp_flags);
> +			/* ttm_bo_swapout has dropped the lru_lock */
> +			if (!ret)
> +				return num_pages;
> +			if (ret != -EBUSY)
> +				return ret;
>  		}
>  	}
>  	spin_unlock(&bdev->lru_lock);
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index 450e665c357b..9e68d36a1546 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -354,6 +354,51 @@ void ttm_resource_manager_debug(struct ttm_resource_manager *man,
>  }
>  EXPORT_SYMBOL(ttm_resource_manager_debug);
>  
> +/**
> + * ttm_resource_manager_first
> + *
> + * @man: resource manager to iterate over
> + * @cursor: cursor to record the position
> + *
> + * Returns the first resource from the resource manager.
> + */
> +struct ttm_resource *
> +ttm_resource_manager_first(struct ttm_resource_manager *man,
> +			   struct ttm_resource_cursor *cursor)
> +{
> +	struct ttm_resource *res;

lockdep_assert_held() for the lru spinlock here and in the _next() one
pls, just to be on the safe side.
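
I.e. just (sketch):

	lockdep_assert_held(&man->bdev->lru_lock);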

> +
> +	for (cursor->priority = 0; cursor->priority < TTM_MAX_BO_PRIORITY;
> +	     ++cursor->priority)
> +		list_for_each_entry(res, &man->lru[cursor->priority], lru)
> +			return res;
> +
> +	return NULL;
> +}
> +
> +/**
> + * ttm_resource_manager_next
> + *
> + * @man: resource manager to iterate over
> + * @cursor: cursor to record the position
> + *
> + * Returns the next resource from the resource manager.
> + */
> +struct ttm_resource *
> +ttm_resource_manager_next(struct ttm_resource_manager *man,
> +			  struct ttm_resource_cursor *cursor,
> +			  struct ttm_resource *res)
> +{
> +	list_for_each_entry_continue(res, &man->lru[cursor->priority], lru)
> +		return res;
> +
> +	for (; cursor->priority < TTM_MAX_BO_PRIORITY; ++cursor->priority)
> +		list_for_each_entry(res, &man->lru[cursor->priority], lru)
> +			return res;
> +
> +	return NULL;
> +}
> +
>  static void ttm_kmap_iter_iomap_map_local(struct ttm_kmap_iter *iter,
>  					  struct dma_buf_map *dmap,
>  					  pgoff_t i)
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index a54d52517a30..13da5e337350 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -183,6 +183,17 @@ struct ttm_resource {
>  	struct list_head lru;
>  };
>  
> +/**
> + * struct ttm_resource_cursor
> + *
> + * @priority: the current priority
> + *
> + * Cursor to iterate over the resources in a manager.
> + */
> +struct ttm_resource_cursor {
> +	unsigned int priority;
> +};
> +
>  /**
>   * struct ttm_lru_bulk_move_pos
>   *
> @@ -339,6 +350,18 @@ int ttm_resource_manager_evict_all(struct ttm_device *bdev,
>  void ttm_resource_manager_debug(struct ttm_resource_manager *man,
>  				struct drm_printer *p);
>  
> +struct ttm_resource *
> +ttm_resource_manager_first(struct ttm_resource_manager *man,
> +			   struct ttm_resource_cursor *cursor);
> +struct ttm_resource *
> +ttm_resource_manager_next(struct ttm_resource_manager *man,
> +			  struct ttm_resource_cursor *cursor,
> +			  struct ttm_resource *res);
> +

Kerneldoc for this one would be nice too.
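
E.g. something like (sketch only):

/**
 * ttm_resource_manager_for_each_res - iterate over all resources
 * @man: the resource manager to iterate over
 * @cursor: struct ttm_resource_cursor recording the current position
 * @res: the current resource
 *
 * Iterate over all the resources in a resource manager. Must be called
 * with the lru lock held.
 */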

> +#define ttm_resource_manager_for_each_res(man, cursor, res)		\
> +	for (res = ttm_resource_manager_first(man, cursor); res;	\
> +	     res = ttm_resource_manager_next(man, cursor, res))
> +
>  struct ttm_kmap_iter *
>  ttm_kmap_iter_iomap_init(struct ttm_kmap_iter_iomap *iter_io,
>  			 struct io_mapping *iomap,

I really like this, looks neat and tidy. With the two nits addressed.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 12/12] drm/ttm: rework bulk move handling
  2022-01-24 12:25 ` [PATCH 12/12] drm/ttm: rework bulk move handling Christian König
@ 2022-01-25 17:12   ` Daniel Vetter
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2022-01-25 17:12 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, ray.huang, dri-devel

On Mon, Jan 24, 2022 at 01:25:14PM +0100, Christian König wrote:
> Instead of providing the bulk move structure for each LRU update set
> this as property of the BO. This should avoid costly bulk move rebuilds
> with some games under RADV.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>

I think I get what you're doing here (tbh kinda surprised it doesn't work
like this already), but first a bikeshed:

Pls do s/_rmv/_del/ over the entire thing (and maybe also the amd code).
I went pretty seriously "wtf is this read-modify-violate thing doing
here" :-)

So either spell out _remove or just go with the linux standard _add/_del
if you want to have the same length for neater alignment of both.

I also think if we move this into ttm then the kerneldoc should explain a
bit better what's going on (this patch kinda makes the neat kerneldoc you
added earlier in the same patch series a bit obsolete). Maybe even an
example of where you should embed the bulk move object, how the locking
works, stuff like that.
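
Something in this direction maybe (rough sketch only, please spell out
the exact rules yourself):

/**
 * struct ttm_lru_bulk_move
 *
 * Helper structure for bulk moves on the LRU list. Drivers should embed
 * one of these in the object owning a group of BOs that always move on
 * the LRU together (e.g. all the BOs of one VM), attach it to each BO
 * with ttm_bo_set_bulk_move() and then use ttm_lru_bulk_move_tail(),
 * with the lru_lock held and all the BOs locked, to move the whole
 * section to the LRU tail at once.
 */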

Also I think at least in debug mode we should validate this more, to make
sure it's always correct. Stuff like walking the full bulk section when
lockdep is enabled and checking they're _all_ locked.

One question about locking below, but aside from these nits I like the
idea of moving bulk lru handling into ttm!

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 72 +++-----------------
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  3 -
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c |  2 +-
>  drivers/gpu/drm/ttm/ttm_bo.c            | 42 ++++++++++--
>  drivers/gpu/drm/ttm/ttm_resource.c      | 87 ++++++++++++++++++-------
>  include/drm/ttm/ttm_bo_api.h            | 18 ++---
>  include/drm/ttm/ttm_bo_driver.h         |  2 +-
>  include/drm/ttm/ttm_device.h            |  9 ---
>  include/drm/ttm/ttm_resource.h          |  7 +-
>  10 files changed, 124 insertions(+), 119 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index f6b3695979a6..7424a4a8d35c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1497,7 +1497,6 @@ static struct ttm_device_funcs amdgpu_bo_driver = {
>  	.io_mem_reserve = &amdgpu_ttm_io_mem_reserve,
>  	.io_mem_pfn = amdgpu_ttm_io_mem_pfn,
>  	.access_memory = &amdgpu_ttm_access_memory,
> -	.del_from_lru_notify = &amdgpu_vm_del_from_lru_notify
>  };
>  
>  /*
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index cdb489cdf27b..35ed849cfe25 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -375,7 +375,7 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
>  	if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
>  		return;
>  
> -	vm->bulk_moveable = false;
> +	ttm_bo_set_bulk_move(&bo->tbo, &vm->lru_bulk_move);
>  	if (bo->tbo.type == ttm_bo_type_kernel && bo->parent)
>  		amdgpu_vm_bo_relocated(base);
>  	else
> @@ -637,36 +637,6 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
>  	list_add(&entry->tv.head, validated);
>  }
>  
> -/**
> - * amdgpu_vm_del_from_lru_notify - update bulk_moveable flag
> - *
> - * @bo: BO which was removed from the LRU
> - *
> - * Make sure the bulk_moveable flag is updated when a BO is removed from the
> - * LRU.
> - */
> -void amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo)
> -{
> -	struct amdgpu_bo *abo;
> -	struct amdgpu_vm_bo_base *bo_base;
> -
> -	if (!amdgpu_bo_is_amdgpu_bo(bo))
> -		return;
> -
> -	if (bo->pin_count)
> -		return;
> -
> -	abo = ttm_to_amdgpu_bo(bo);
> -	if (!abo->parent)
> -		return;
> -	for (bo_base = abo->vm_bo; bo_base; bo_base = bo_base->next) {
> -		struct amdgpu_vm *vm = bo_base->vm;
> -
> -		if (abo->tbo.base.resv == vm->root.bo->tbo.base.resv)
> -			vm->bulk_moveable = false;
> -	}
> -
> -}
>  /**
>   * amdgpu_vm_move_to_lru_tail - move all BOs to the end of LRU
>   *
> @@ -679,33 +649,9 @@ void amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo)
>  void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
>  				struct amdgpu_vm *vm)
>  {
> -	struct amdgpu_vm_bo_base *bo_base;
> -
> -	if (vm->bulk_moveable) {
> -		spin_lock(&adev->mman.bdev.lru_lock);
> -		ttm_lru_bulk_move_tail(&vm->lru_bulk_move);
> -		spin_unlock(&adev->mman.bdev.lru_lock);
> -		return;
> -	}
> -
> -	ttm_lru_bulk_move_init(&vm->lru_bulk_move);
> -
>  	spin_lock(&adev->mman.bdev.lru_lock);
> -	list_for_each_entry(bo_base, &vm->idle, vm_status) {
> -		struct amdgpu_bo *bo = bo_base->bo;
> -		struct amdgpu_bo *shadow = amdgpu_bo_shadowed(bo);
> -
> -		if (!bo->parent)
> -			continue;
> -
> -		ttm_bo_move_to_lru_tail(&bo->tbo, &vm->lru_bulk_move);
> -		if (shadow)
> -			ttm_bo_move_to_lru_tail(&shadow->tbo,
> -						&vm->lru_bulk_move);
> -	}
> +	ttm_lru_bulk_move_tail(&vm->lru_bulk_move);
>  	spin_unlock(&adev->mman.bdev.lru_lock);
> -
> -	vm->bulk_moveable = true;
>  }
>  
>  /**
> @@ -728,8 +674,6 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>  	struct amdgpu_vm_bo_base *bo_base, *tmp;
>  	int r;
>  
> -	vm->bulk_moveable &= list_empty(&vm->evicted);
> -
>  	list_for_each_entry_safe(bo_base, tmp, &vm->evicted, vm_status) {
>  		struct amdgpu_bo *bo = bo_base->bo;
>  		struct amdgpu_bo *shadow = amdgpu_bo_shadowed(bo);
> @@ -1047,10 +991,16 @@ static void amdgpu_vm_free_table(struct amdgpu_vm_bo_base *entry)
>  
>  	if (!entry->bo)
>  		return;
> +
>  	shadow = amdgpu_bo_shadowed(entry->bo);
> +	if (shadow) {
> +		ttm_bo_set_bulk_move(&shadow->tbo, NULL);
> +		amdgpu_bo_unref(&shadow);
> +	}
> +
> +	ttm_bo_set_bulk_move(&entry->bo->tbo, NULL);
>  	entry->bo->vm_bo = NULL;
>  	list_del(&entry->vm_status);
> -	amdgpu_bo_unref(&shadow);
>  	amdgpu_bo_unref(&entry->bo);
>  }
>  
> @@ -1070,8 +1020,6 @@ static void amdgpu_vm_free_pts(struct amdgpu_device *adev,
>  	struct amdgpu_vm_pt_cursor cursor;
>  	struct amdgpu_vm_bo_base *entry;
>  
> -	vm->bulk_moveable = false;
> -
>  	for_each_amdgpu_vm_pt_dfs_safe(adev, vm, start, cursor, entry)
>  		amdgpu_vm_free_table(entry);
>  
> @@ -2651,7 +2599,7 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
>  
>  	if (bo) {
>  		if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
> -			vm->bulk_moveable = false;
> +			ttm_bo_set_bulk_move(&bo->tbo, NULL);
>  
>  		for (base = &bo_va->base.bo->vm_bo; *base;
>  		     base = &(*base)->next) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 85fcfb8c5efd..4d236682a118 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -317,8 +317,6 @@ struct amdgpu_vm {
>  
>  	/* Store positions of group of BOs */
>  	struct ttm_lru_bulk_move lru_bulk_move;
> -	/* mark whether can do the bulk move */
> -	bool			bulk_moveable;
>  	/* Flag to indicate if VM is used for compute */
>  	bool			is_compute_context;
>  };
> @@ -454,7 +452,6 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>  
>  void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
>  				struct amdgpu_vm *vm);
> -void amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo);
>  void amdgpu_vm_get_memory(struct amdgpu_vm *vm, uint64_t *vram_mem,
>  				uint64_t *gtt_mem, uint64_t *cpu_mem);
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index cdf1ff87c88d..abd1bde7a2b8 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -835,7 +835,7 @@ void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
>  			bo->priority = I915_TTM_PRIO_NO_PAGES;
>  	}
>  
> -	ttm_bo_move_to_lru_tail(bo, NULL);
> +	ttm_bo_move_to_lru_tail(bo);
>  	spin_unlock(&bo->bdev->lru_lock);
>  }
>  
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 599be3dda8a9..7d1133bef33e 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -69,16 +69,48 @@ static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo,
>  	}
>  }
>  
> -void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
> -			     struct ttm_lru_bulk_move *bulk)
> +/**
> + * ttm_bo_move_to_lru_tail
> + *
> + * @bo: The buffer object.
> + *
> + * Move this BO to the tail of all lru lists used to lookup and reserve an
> + * object. This function must be called with struct ttm_global::lru_lock
> + * held, and is used to make a BO less likely to be considered for eviction.
> + */
> +void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo)
>  {
>  	dma_resv_assert_held(bo->base.resv);
>  
>  	if (bo->resource)
> -		ttm_resource_move_to_lru_tail(bo->resource, bulk);
> +		ttm_resource_move_to_lru_tail(bo->resource);
>  }
>  EXPORT_SYMBOL(ttm_bo_move_to_lru_tail);
>  
> +/**
> + * ttm_bo_set_bulk_move - update BOs bulk move object
> + *
> + * @bo: The buffer object.
> + *
> + * Update the BOs bulk move object, making sure that resources are added/removed
> + * as well.
> + */
> +void ttm_bo_set_bulk_move(struct ttm_buffer_object *bo,
> +			  struct ttm_lru_bulk_move *bulk)
> +{
> +	if (bo->bulk_move == bulk)
> +		return;

Shouldn't we have a dma_resv_assert_held here too? Otherwise I'm not sure
how this works ...
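
I.e. something like this at the top of the function (sketch only, the rest
staying as in the patch):

	void ttm_bo_set_bulk_move(struct ttm_buffer_object *bo,
				  struct ttm_lru_bulk_move *bulk)
	{
		dma_resv_assert_held(bo->base.resv);

		if (bo->bulk_move == bulk)
			return;

		/* rest as in the patch */
	}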

Cheers, Daniel


> +
> +	spin_lock(&bo->bdev->lru_lock);
> +	if (bo->bulk_move && bo->resource)
> +		ttm_lru_bulk_move_rmv(bo->bulk_move, bo->resource);
> +	bo->bulk_move = bulk;
> +	if (bo->bulk_move && bo->resource)
> +		ttm_lru_bulk_move_add(bo->bulk_move, bo->resource);
> +	spin_unlock(&bo->bdev->lru_lock);
> +}
> +EXPORT_SYMBOL(ttm_bo_set_bulk_move);
> +
>  static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>  				  struct ttm_resource *mem, bool evict,
>  				  struct ttm_operation_ctx *ctx,
> @@ -316,6 +348,7 @@ static void ttm_bo_release(struct kref *kref)
>  	int ret;
>  
>  	WARN_ON_ONCE(bo->pin_count);
> +	WARN_ON_ONCE(bo->bulk_move);
>  
>  	if (!bo->deleted) {
>  		ret = ttm_bo_individualize_resv(bo);
> @@ -352,7 +385,7 @@ static void ttm_bo_release(struct kref *kref)
>  		 */
>  		if (bo->pin_count) {
>  			bo->pin_count = 0;
> -			ttm_resource_move_to_lru_tail(bo->resource, NULL);
> +			ttm_resource_move_to_lru_tail(bo->resource);
>  		}
>  
>  		kref_init(&bo->kref);
> @@ -874,6 +907,7 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
>  	bo->moving = NULL;
>  	bo->pin_count = 0;
>  	bo->sg = sg;
> +	bo->bulk_move = NULL;
>  	if (resv) {
>  		bo->base.resv = resv;
>  		dma_resv_assert_held(bo->base.resv);
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index 2ea8fb83377e..b1068053434c 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -72,40 +72,75 @@ void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk)
>  }
>  EXPORT_SYMBOL(ttm_lru_bulk_move_tail);
>  
> -/* Record a resource position in a bulk move structure */
> -static void ttm_lru_bulk_move_set_pos(struct ttm_lru_bulk_move_pos *pos,
> -				      struct ttm_resource *res)
> +/* Return the bulk move pos object for this resource */
> +static struct ttm_lru_bulk_move_pos *
> +ttm_lru_bulk_move_pos(struct ttm_lru_bulk_move *bulk, struct ttm_resource *res)
>  {
> -	if (!pos->first)
> +	return &bulk->pos[res->mem_type][res->bo->priority];
> +}
> +
> +/* Move the resource to the tail of the bulk move range */
> +static void ttm_lru_bulk_move_pos_tail(struct ttm_lru_bulk_move_pos *pos,
> +				       struct ttm_resource *res)
> +{
> +	if (pos->last != res) {
> +		list_move(&res->lru, &pos->last->lru);
> +		pos->last = res;
> +	}
> +}
> +
> +/* Add the resource to a bulk_move cursor */
> +void ttm_lru_bulk_move_add(struct ttm_lru_bulk_move *bulk,
> +			   struct ttm_resource *res)
> +{
> +	struct ttm_lru_bulk_move_pos *pos = ttm_lru_bulk_move_pos(bulk, res);
> +
> +	if (!pos->first) {
>  		pos->first = res;
> -	pos->last = res;
> +		pos->last = res;
> +	} else {
> +		ttm_lru_bulk_move_pos_tail(pos, res);
> +	}
> +}
> +
> +/* Remove the resource from a bulk_move range */
> +void ttm_lru_bulk_move_rmv(struct ttm_lru_bulk_move *bulk,
> +			   struct ttm_resource *res)
> +{
> +	struct ttm_lru_bulk_move_pos *pos = ttm_lru_bulk_move_pos(bulk, res);
> +
> +	if (unlikely(pos->first == res && pos->last == res)) {
> +		pos->first = NULL;
> +		pos->last = NULL;
> +	} else if (pos->first == res) {
> +		pos->first = list_next_entry(res, lru);
> +	} else if (pos->last == res) {
> +		pos->last = list_prev_entry(res, lru);
> +	} else {
> +		list_move(&res->lru, &pos->last->lru);
> +	}
>  }
>  
> -/* Move a resource to the LRU tail and track the bulk position */
> -void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
> -				   struct ttm_lru_bulk_move *bulk)
> +/* Move a resource to the LRU or bulk tail */
> +void ttm_resource_move_to_lru_tail(struct ttm_resource *res)
>  {
>  	struct ttm_buffer_object *bo = res->bo;
>  	struct ttm_device *bdev = bo->bdev;
> -	struct ttm_resource_manager *man;
>  
>  	if (bo->pin_count) {
>  		list_move_tail(&res->lru, &bdev->pinned);
> -		if (bdev->funcs->del_from_lru_notify)
> -			bdev->funcs->del_from_lru_notify(res->bo);
> -		return;
> -	}
>  
> -	man = ttm_manager_type(bdev, res->mem_type);
> -	list_move_tail(&res->lru, &man->lru[bo->priority]);
> +	} else	if (bo->bulk_move) {
> +		struct ttm_lru_bulk_move_pos *pos =
> +			ttm_lru_bulk_move_pos(bo->bulk_move, res);
>  
> -	if (bdev->funcs->del_from_lru_notify)
> -		bdev->funcs->del_from_lru_notify(bo);
> -
> -	if (!bulk)
> -		return;
> +		ttm_lru_bulk_move_pos_tail(pos, res);
> +	} else {
> +		struct ttm_resource_manager *man;
>  
> -	ttm_lru_bulk_move_set_pos(&bulk->pos[res->mem_type][bo->priority], res);
> +		man = ttm_manager_type(bdev, res->mem_type);
> +		list_move_tail(&res->lru, &man->lru[bo->priority]);
> +	}
>  }
>  
>  void ttm_resource_init(struct ttm_buffer_object *bo,
> @@ -129,7 +164,7 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>  	atomic64_add(bo->base.size, &man->usage);
>  
>  	spin_lock(&bo->bdev->lru_lock);
> -	ttm_resource_move_to_lru_tail(res, NULL);
> +	ttm_resource_move_to_lru_tail(res);
>  	spin_unlock(&bo->bdev->lru_lock);
>  }
>  EXPORT_SYMBOL(ttm_resource_init);
> @@ -148,8 +183,6 @@ void ttm_resource_fini(struct ttm_resource_manager *man,
>  
>  	spin_lock(&bdev->lru_lock);
>  	list_del_init(&res->lru);
> -	if (res->bo && bdev->funcs->del_from_lru_notify)
> -		bdev->funcs->del_from_lru_notify(res->bo);
>  	spin_unlock(&bdev->lru_lock);
>  
>  	atomic64_sub(res->num_pages << PAGE_SHIFT, &man->usage);
> @@ -173,6 +206,12 @@ void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res)
>  	if (!*res)
>  		return;
>  
> +	if (bo->bulk_move) {
> +		spin_lock(&bo->bdev->lru_lock);
> +		ttm_lru_bulk_move_rmv(bo->bulk_move, *res);
> +		spin_unlock(&bo->bdev->lru_lock);
> +	}
> +
>  	man = ttm_manager_type(bo->bdev, (*res)->mem_type);
>  	man->func->free(man, *res);
>  	*res = NULL;
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index 3da77fc54552..b40c28e884a2 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -135,6 +135,7 @@ struct ttm_buffer_object {
>  	struct ttm_resource *resource;
>  	struct ttm_tt *ttm;
>  	bool deleted;
> +	struct ttm_lru_bulk_move *bulk_move;
>  
>  	/**
>  	 * Members protected by the bdev::lru_lock.
> @@ -287,18 +288,11 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
>   */
>  void ttm_bo_put(struct ttm_buffer_object *bo);
>  
> -/**
> - * ttm_bo_move_to_lru_tail
> - *
> - * @bo: The buffer object.
> - * @bulk: optional bulk move structure to remember BO positions
> - *
> - * Move this BO to the tail of all lru lists used to lookup and reserve an
> - * object. This function must be called with struct ttm_global::lru_lock
> - * held, and is used to make a BO less likely to be considered for eviction.
> - */
> -void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
> -			     struct ttm_lru_bulk_move *bulk);
> +void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo);
> +
> +
> +void ttm_bo_set_bulk_move(struct ttm_buffer_object *bo,
> +			  struct ttm_lru_bulk_move *bulk);
>  
>  /**
>   * ttm_bo_lock_delayed_workqueue
> diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
> index 6c7352e13708..059a595e14e5 100644
> --- a/include/drm/ttm/ttm_bo_driver.h
> +++ b/include/drm/ttm/ttm_bo_driver.h
> @@ -155,7 +155,7 @@ static inline void
>  ttm_bo_move_to_lru_tail_unlocked(struct ttm_buffer_object *bo)
>  {
>  	spin_lock(&bo->bdev->lru_lock);
> -	ttm_bo_move_to_lru_tail(bo, NULL);
> +	ttm_bo_move_to_lru_tail(bo);
>  	spin_unlock(&bo->bdev->lru_lock);
>  }
>  
> diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
> index 425150f35fbe..95b3c04b1ab9 100644
> --- a/include/drm/ttm/ttm_device.h
> +++ b/include/drm/ttm/ttm_device.h
> @@ -198,15 +198,6 @@ struct ttm_device_funcs {
>  	int (*access_memory)(struct ttm_buffer_object *bo, unsigned long offset,
>  			     void *buf, int len, int write);
>  
> -	/**
> -	 * struct ttm_bo_driver member del_from_lru_notify
> -	 *
> -	 * @bo: the buffer object deleted from lru
> -	 *
> -	 * notify driver that a BO was deleted from LRU.
> -	 */
> -	void (*del_from_lru_notify)(struct ttm_buffer_object *bo);
> -
>  	/**
>  	 * Notify the driver that we're about to release a BO
>  	 *
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index 1556d1f62251..d938ee78f536 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -320,10 +320,13 @@ ttm_resource_manager_usage(struct ttm_resource_manager *man)
>  }
>  
>  void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk);
> +void ttm_lru_bulk_move_add(struct ttm_lru_bulk_move *bulk,
> +			   struct ttm_resource *res);
> +void ttm_lru_bulk_move_rmv(struct ttm_lru_bulk_move *bulk,
> +			   struct ttm_resource *res);
>  void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk);
>  
> -void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
> -				   struct ttm_lru_bulk_move *bulk);
> +void ttm_resource_move_to_lru_tail(struct ttm_resource *res);
>  
>  void ttm_resource_init(struct ttm_buffer_object *bo,
>                         const struct ttm_place *place,
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 11/12] drm/ttm: allow bulk moves for all domains
  2022-01-24 12:25 ` [PATCH 11/12] drm/ttm: allow bulk moves for all domains Christian König
@ 2022-01-25 17:16   ` Daniel Vetter
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2022-01-25 17:16 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, ray.huang, dri-devel

On Mon, Jan 24, 2022 at 01:25:13PM +0100, Christian König wrote:
> Not just TT and VRAM.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_resource.c | 49 +++++++++---------------------
>  include/drm/ttm/ttm_device.h       |  2 --
>  include/drm/ttm/ttm_resource.h     |  4 +--
>  3 files changed, 16 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> index 9e68d36a1546..2ea8fb83377e 100644
> --- a/drivers/gpu/drm/ttm/ttm_resource.c
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> @@ -51,36 +51,23 @@ EXPORT_SYMBOL(ttm_lru_bulk_move_init);
>   */
>  void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk)
>  {
> -	unsigned i;
> -
> -	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> -		struct ttm_lru_bulk_move_pos *pos = &bulk->tt[i];
> -		struct ttm_resource_manager *man;
> +	unsigned i, j;
>  
> -		if (!pos->first)
> -			continue;
> +	for (i = 0; i < TTM_NUM_MEM_TYPES; ++i) {
> +		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
> +			struct ttm_lru_bulk_move_pos *pos = &bulk->pos[i][j];
> +			struct ttm_resource_manager *man;
>  
> -		dma_resv_assert_held(pos->first->bo->base.resv);
> -		dma_resv_assert_held(pos->last->bo->base.resv);
> +			if (!pos->first)
> +				continue;
>  
> -		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_TT);
> -		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
> -				    &pos->last->lru);
> -	}
> -
> -	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> -		struct ttm_lru_bulk_move_pos *pos = &bulk->vram[i];
> -		struct ttm_resource_manager *man;
> +			dma_resv_assert_held(pos->first->bo->base.resv);
> +			dma_resv_assert_held(pos->last->bo->base.resv);
>  
> -		if (!pos->first)
> -			continue;
> -
> -		dma_resv_assert_held(pos->first->bo->base.resv);
> -		dma_resv_assert_held(pos->last->bo->base.resv);
> -
> -		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_VRAM);
> -		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
> -				    &pos->last->lru);
> +			man = ttm_manager_type(pos->first->bo->bdev, i);
> +			list_bulk_move_tail(&man->lru[j], &pos->first->lru,
> +					    &pos->last->lru);
> +		}
>  	}
>  }
>  EXPORT_SYMBOL(ttm_lru_bulk_move_tail);
> @@ -118,15 +105,7 @@ void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
>  	if (!bulk)
>  		return;
>  
> -	switch (res->mem_type) {
> -	case TTM_PL_TT:
> -		ttm_lru_bulk_move_set_pos(&bulk->tt[bo->priority], res);
> -		break;
> -
> -	case TTM_PL_VRAM:
> -		ttm_lru_bulk_move_set_pos(&bulk->vram[bo->priority], res);
> -		break;
> -	}
> +	ttm_lru_bulk_move_set_pos(&bulk->pos[res->mem_type][bo->priority], res);
>  }
>  
>  void ttm_resource_init(struct ttm_buffer_object *bo,
> diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
> index 0a4ddec78d8f..425150f35fbe 100644
> --- a/include/drm/ttm/ttm_device.h
> +++ b/include/drm/ttm/ttm_device.h
> @@ -30,8 +30,6 @@
>  #include <drm/ttm/ttm_resource.h>
>  #include <drm/ttm/ttm_pool.h>
>  
> -#define TTM_NUM_MEM_TYPES 8
> -
>  struct ttm_device;
>  struct ttm_placement;
>  struct ttm_buffer_object;
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> index 13da5e337350..1556d1f62251 100644
> --- a/include/drm/ttm/ttm_resource.h
> +++ b/include/drm/ttm/ttm_resource.h
> @@ -37,6 +37,7 @@
>  #include <drm/ttm/ttm_kmap_iter.h>
>  
>  #define TTM_MAX_BO_PRIORITY	4U
> +#define TTM_NUM_MEM_TYPES 8
>  
>  struct ttm_device;
>  struct ttm_resource_manager;
> @@ -216,8 +217,7 @@ struct ttm_lru_bulk_move_pos {
>   * Helper structure for bulk moves on the LRU list.
>   */
>  struct ttm_lru_bulk_move {
> -	struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
> -	struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
> +	struct ttm_lru_bulk_move_pos pos[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY];

256 bytes per client vm for embedding this seems acceptable. And we can
make this smarter if there's ever a need. I didn't fully check your code
movement (dinner is calling!) but lgtm.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  };
>  
>  /**
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 04/12] drm/ttm: add common accounting to the resource mgr v2
  2022-01-25 16:37   ` Daniel Vetter
@ 2022-01-26 14:42     ` Christian König
  2022-01-27  8:48       ` Daniel Vetter
  0 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2022-01-26 14:42 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: thomas.hellstrom, ray.huang, dri-devel

On 25.01.22 at 17:37, Daniel Vetter wrote:
> On Mon, Jan 24, 2022 at 01:25:06PM +0100, Christian König wrote:
>> It makes sense to have this in the common manager for debugging and
>> accounting of how much of each resource is used.
>>
>> v2: cleanup kerneldoc a bit
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> Reviewed-by: Huang Rui <ray.huang@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_resource.c |  8 ++++++++
>>   include/drm/ttm/ttm_resource.h     | 20 +++++++++++++++++++-
>>   2 files changed, 27 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
>> index 7fdd58b53c61..b8362492980d 100644
>> --- a/drivers/gpu/drm/ttm/ttm_resource.c
>> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
>> @@ -33,6 +33,8 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>>                          const struct ttm_place *place,
>>                          struct ttm_resource *res)
>>   {
>> +	struct ttm_resource_manager *man;
>> +
>>   	res->start = 0;
>>   	res->num_pages = PFN_UP(bo->base.size);
>>   	res->mem_type = place->mem_type;
>> @@ -42,12 +44,16 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
>>   	res->bus.is_iomem = false;
>>   	res->bus.caching = ttm_cached;
>>   	res->bo = bo;
>> +
>> +	man = ttm_manager_type(bo->bdev, place->mem_type);
>> +	atomic64_add(bo->base.size, &man->usage);
> Doing this with atomics doesn't make a lot of sense to me. Yes, with the
> current organization it's the only thing to do in drivers, but if we move
> this into ttm there's no reason we can't track this together with the lru,
> consistently with the lru, and under the same spinlock as the lru.
>
> And at least spot-checking a few places the very next thing we generally
> do is take the lru lock since there's really no other way to get the
> resource into or out of the resource manager.
>
> I think doing atomics for statistics when there's no need is not great,
> because then people start using atomics for all kinds of other things, and
> then get the barriers wrong. In i915 (simply due to the grotesque amount
> of buggy overuse of atomics, both atomic_t and atomic bitfields) we've put
> a hard block in place for any atomic addition. So that's why I'm a bit on
> a crusade, but I also genuinely don't see why we need them here. All they
> do is cost more since we have to take the spinlock anyway; the accounting
> just ends up in a slightly different (and imo more accurate) place.
>
> Thoughts?

Well, it depends. We have two use cases for those statistics:
1. Early abort when there aren't enough free resources.
2. Debugging

For debugging it's completely irrelevant whether we grab the lock or not,
but for the early abort I'm not that sure.

Anyway, I will just put that under the lock instead for now; if we really
find that it is contended we can still switch back to an atomic.
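
Roughly what I'm thinking of (sketch only, with man->usage becoming a plain
uint64_t protected by the lru_lock instead of an atomic):

	/* in ttm_resource_init(), once res->num_pages is known */
	spin_lock(&bo->bdev->lru_lock);
	man->usage += res->num_pages << PAGE_SHIFT;
	spin_unlock(&bo->bdev->lru_lock);

	/* and the matching subtraction in ttm_resource_fini() */
	spin_lock(&man->bdev->lru_lock);
	man->usage -= res->num_pages << PAGE_SHIFT;
	spin_unlock(&man->bdev->lru_lock);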

Regards,
Christian.

>
> Cheers, Daniel
>
>>   }
>>   EXPORT_SYMBOL(ttm_resource_init);
>>   
>>   void ttm_resource_fini(struct ttm_resource_manager *man,
>>   		       struct ttm_resource *res)
>>   {
>> +	atomic64_sub(res->bo->base.size, &man->usage);
>>   }
>>   EXPORT_SYMBOL(ttm_resource_fini);
>>   
>> @@ -149,6 +155,7 @@ void ttm_resource_manager_init(struct ttm_resource_manager *man,
>>   	spin_lock_init(&man->move_lock);
>>   	man->bdev = bdev;
>>   	man->size = p_size;
>> +	atomic64_set(&man->usage, 0);
>>   
>>   	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
>>   		INIT_LIST_HEAD(&man->lru[i]);
>> @@ -221,6 +228,7 @@ void ttm_resource_manager_debug(struct ttm_resource_manager *man,
>>   	drm_printf(p, "  use_type: %d\n", man->use_type);
>>   	drm_printf(p, "  use_tt: %d\n", man->use_tt);
>>   	drm_printf(p, "  size: %llu\n", man->size);
>> +	drm_printf(p, "  usage: %llu\n", atomic64_read(&man->usage));
>>   	if (man->func->debug)
>>   		man->func->debug(man, p);
>>   }
>> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
>> index 69eea9d6399b..3d391279b33f 100644
>> --- a/include/drm/ttm/ttm_resource.h
>> +++ b/include/drm/ttm/ttm_resource.h
>> @@ -27,6 +27,7 @@
>>   
>>   #include <linux/types.h>
>>   #include <linux/mutex.h>
>> +#include <linux/atomic.h>
>>   #include <linux/dma-buf-map.h>
>>   #include <linux/dma-fence.h>
>>   #include <drm/drm_print.h>
>> @@ -132,8 +133,12 @@ struct ttm_resource_manager {
>>   	/*
>>   	 * Protected by the global->lru_lock.
>>   	 */
>> -
>>   	struct list_head lru[TTM_MAX_BO_PRIORITY];
>> +
>> +	/**
>> +	 * @usage: How much of the region is used, has its own protection.
>> +	 */
>> +	atomic64_t usage;
>>   };
>>   
>>   /**
>> @@ -261,6 +266,19 @@ ttm_resource_manager_cleanup(struct ttm_resource_manager *man)
>>   	man->move = NULL;
>>   }
>>   
>> +/**
>> + * ttm_resource_manager_usage
>> + *
>> + * @man: A memory manager object.
>> + *
>> + * Return how many resources are currently used.
>> + */
>> +static inline uint64_t
>> +ttm_resource_manager_usage(struct ttm_resource_manager *man)
>> +{
>> +	return atomic64_read(&man->usage);
>> +}
>> +
>>   void ttm_resource_init(struct ttm_buffer_object *bo,
>>                          const struct ttm_place *place,
>>                          struct ttm_resource *res);
>> -- 
>> 2.25.1
>>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 04/12] drm/ttm: add common accounting to the resource mgr v2
  2022-01-26 14:42     ` Christian König
@ 2022-01-27  8:48       ` Daniel Vetter
  0 siblings, 0 replies; 23+ messages in thread
From: Daniel Vetter @ 2022-01-27  8:48 UTC (permalink / raw)
  To: Christian König; +Cc: thomas.hellstrom, ray.huang, dri-devel

On Wed, Jan 26, 2022 at 03:42:21PM +0100, Christian König wrote:
> On 25.01.22 at 17:37, Daniel Vetter wrote:
> > On Mon, Jan 24, 2022 at 01:25:06PM +0100, Christian König wrote:
> > > It makes sense to have this in the common manager for debugging and
> > > accounting of how much of each resource is used.
> > > 
> > > v2: cleanup kerneldoc a bit
> > > 
> > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > Reviewed-by: Huang Rui <ray.huang@amd.com>
> > > ---
> > >   drivers/gpu/drm/ttm/ttm_resource.c |  8 ++++++++
> > >   include/drm/ttm/ttm_resource.h     | 20 +++++++++++++++++++-
> > >   2 files changed, 27 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
> > > index 7fdd58b53c61..b8362492980d 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_resource.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> > > @@ -33,6 +33,8 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
> > >                          const struct ttm_place *place,
> > >                          struct ttm_resource *res)
> > >   {
> > > +	struct ttm_resource_manager *man;
> > > +
> > >   	res->start = 0;
> > >   	res->num_pages = PFN_UP(bo->base.size);
> > >   	res->mem_type = place->mem_type;
> > > @@ -42,12 +44,16 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
> > >   	res->bus.is_iomem = false;
> > >   	res->bus.caching = ttm_cached;
> > >   	res->bo = bo;
> > > +
> > > +	man = ttm_manager_type(bo->bdev, place->mem_type);
> > > +	atomic64_add(bo->base.size, &man->usage);
> > Doing this with atomics doesn't make a lot of sense to me. Yes, with the
> > current organization it's the only thing to do in drivers, but if we move
> > this into ttm there's no reason we can't track this together with the lru,
> > consistently with the lru, and under the same spinlock as the lru.
> > 
> > And at least spot-checking a few places the very next thing we generally
> > do is take the lru lock since there's really no other way to get the
> > resource into or out of the resource manager.
> > 
> > I think doing atomics for statistics when there's no need is not great,
> > because then people start using atomics for all kinds of other things, and
> > then get the barriers wrong. In i915 (simply due to the grotesque amount
> > of buggy overuse of atomics, both atomic_t and atomic bitfields) we've put
> > a hard block in place for any atomic addition. So that's why I'm a bit on
> > a crusade, but I also genuinely don't see why we need them here. All they
> > do is cost more since we have to take the spinlock anyway; the accounting
> > just ends up in a slightly different (and imo more accurate) place.
> > 
> > Thoughts?
> 
> Well, it depends. We have two use cases for those statistics:
> 1. Early abort when there aren't enough free resources.
> 2. Debugging
> 
> For debugging it's completely irrelevant whether we grab the lock or not,
> but for the early abort I'm not that sure.
> 
> Anyway, I will just put that under the lock instead for now; if we really
> find that it is contended we can still switch back to an atomic.

To clarify, I don't mind using atomics for statistics; it's kinda the prime
example use case for them (and really the reason why they're unordered
atomics by default). It's just that if you already take a lock anyway you
might as well include the statistics in there too, because except in really
silly cases that should all be faster. Worst case you need to make sure
that the data structure and statistics (list_head and u64 here) are in the
same cacheline, and with that the statistics practically become free when
done under the spinlock.
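
E.g. roughly like this, trimmed down to the relevant fields (illustration
only):

	struct ttm_resource_manager {
		/* ... */

		/*
		 * Both protected by bdev->lru_lock; keeping the counter right
		 * next to the lru lists means it usually sits in a cacheline
		 * we're already touching under the lock anyway.
		 */
		struct list_head lru[TTM_MAX_BO_PRIORITY];
		uint64_t usage;
	};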

Anyway, just figured I'd drop a bit more of my thinking on this topic here;
we're already agreeing on what the code should look like anyway :-)

Cheers, Daniel

> 
> Regards,
> Christian.
> 
> > 
> > Cheers, Daniel
> > 
> > >   }
> > >   EXPORT_SYMBOL(ttm_resource_init);
> > >   void ttm_resource_fini(struct ttm_resource_manager *man,
> > >   		       struct ttm_resource *res)
> > >   {
> > > +	atomic64_sub(res->bo->base.size, &man->usage);
> > >   }
> > >   EXPORT_SYMBOL(ttm_resource_fini);
> > > @@ -149,6 +155,7 @@ void ttm_resource_manager_init(struct ttm_resource_manager *man,
> > >   	spin_lock_init(&man->move_lock);
> > >   	man->bdev = bdev;
> > >   	man->size = p_size;
> > > +	atomic64_set(&man->usage, 0);
> > >   	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
> > >   		INIT_LIST_HEAD(&man->lru[i]);
> > > @@ -221,6 +228,7 @@ void ttm_resource_manager_debug(struct ttm_resource_manager *man,
> > >   	drm_printf(p, "  use_type: %d\n", man->use_type);
> > >   	drm_printf(p, "  use_tt: %d\n", man->use_tt);
> > >   	drm_printf(p, "  size: %llu\n", man->size);
> > > +	drm_printf(p, "  usage: %llu\n", atomic64_read(&man->usage));
> > >   	if (man->func->debug)
> > >   		man->func->debug(man, p);
> > >   }
> > > diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
> > > index 69eea9d6399b..3d391279b33f 100644
> > > --- a/include/drm/ttm/ttm_resource.h
> > > +++ b/include/drm/ttm/ttm_resource.h
> > > @@ -27,6 +27,7 @@
> > >   #include <linux/types.h>
> > >   #include <linux/mutex.h>
> > > +#include <linux/atomic.h>
> > >   #include <linux/dma-buf-map.h>
> > >   #include <linux/dma-fence.h>
> > >   #include <drm/drm_print.h>
> > > @@ -132,8 +133,12 @@ struct ttm_resource_manager {
> > >   	/*
> > >   	 * Protected by the global->lru_lock.
> > >   	 */
> > > -
> > >   	struct list_head lru[TTM_MAX_BO_PRIORITY];
> > > +
> > > +	/**
> > > +	 * @usage: How much of the region is used, has its own protection.
> > > +	 */
> > > +	atomic64_t usage;
> > >   };
> > >   /**
> > > @@ -261,6 +266,19 @@ ttm_resource_manager_cleanup(struct ttm_resource_manager *man)
> > >   	man->move = NULL;
> > >   }
> > > +/**
> > > + * ttm_resource_manager_usage
> > > + *
> > > + * @man: A memory manager object.
> > > + *
> > > + * Return how many resources are currently used.
> > > + */
> > > +static inline uint64_t
> > > +ttm_resource_manager_usage(struct ttm_resource_manager *man)
> > > +{
> > > +	return atomic64_read(&man->usage);
> > > +}
> > > +
> > >   void ttm_resource_init(struct ttm_buffer_object *bo,
> > >                          const struct ttm_place *place,
> > >                          struct ttm_resource *res);
> > > -- 
> > > 2.25.1
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 05/12] drm/ttm: move the LRU into resource handling v2
  2021-11-24 12:44 drm/ttm: moving the LRU into the resource Christian König
@ 2021-11-24 12:44 ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2021-11-24 12:44 UTC (permalink / raw)
  To: daniel, thomas.hellstrom, ray.huang, dri-devel

This way we finally fix the problem that new resources are
not immediately evictable after allocation.

That has caused numerous problems, including OOM on GDS handling
and not being able to use TTM as a general resource manager.

v2: stop assuming in ttm_resource_fini that res->bo is still valid.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  |   8 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c |   2 +-
 drivers/gpu/drm/ttm/ttm_bo.c            | 115 ++--------------------
 drivers/gpu/drm/ttm/ttm_bo_util.c       |   1 -
 drivers/gpu/drm/ttm/ttm_device.c        |  64 ++++++-------
 drivers/gpu/drm/ttm/ttm_resource.c      | 122 +++++++++++++++++++++++-
 include/drm/ttm/ttm_bo_api.h            |  16 ----
 include/drm/ttm/ttm_bo_driver.h         |  29 +-----
 include/drm/ttm/ttm_resource.h          |  35 +++++++
 9 files changed, 201 insertions(+), 191 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 6e631254dc7f..d172ccb440e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -683,12 +683,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
 
 	if (vm->bulk_moveable) {
 		spin_lock(&adev->mman.bdev.lru_lock);
-		ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
+		ttm_lru_bulk_move_tail(&vm->lru_bulk_move);
 		spin_unlock(&adev->mman.bdev.lru_lock);
 		return;
 	}
 
-	memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
+	ttm_lru_bulk_move_init(&vm->lru_bulk_move);
 
 	spin_lock(&adev->mman.bdev.lru_lock);
 	list_for_each_entry(bo_base, &vm->idle, vm_status) {
@@ -698,11 +698,9 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
 		if (!bo->parent)
 			continue;
 
-		ttm_bo_move_to_lru_tail(&bo->tbo, bo->tbo.resource,
-					&vm->lru_bulk_move);
+		ttm_bo_move_to_lru_tail(&bo->tbo, &vm->lru_bulk_move);
 		if (shadow)
 			ttm_bo_move_to_lru_tail(&shadow->tbo,
-						shadow->tbo.resource,
 						&vm->lru_bulk_move);
 	}
 	spin_unlock(&adev->mman.bdev.lru_lock);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 74a1ffd0d7dd..5abe22aecc54 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -794,7 +794,7 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
 			bo->priority = I915_TTM_PRIO_NO_PAGES;
 	}
 
-	ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
+	ttm_bo_move_to_lru_tail(bo, NULL);
 	spin_unlock(&bo->bdev->lru_lock);
 }
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 5db5c9ba166c..5bf5a9442856 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -69,108 +69,16 @@ static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo,
 	}
 }
 
-static inline void ttm_bo_move_to_pinned(struct ttm_buffer_object *bo)
-{
-	struct ttm_device *bdev = bo->bdev;
-
-	list_move_tail(&bo->lru, &bdev->pinned);
-
-	if (bdev->funcs->del_from_lru_notify)
-		bdev->funcs->del_from_lru_notify(bo);
-}
-
-static inline void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
-{
-	struct ttm_device *bdev = bo->bdev;
-
-	list_del_init(&bo->lru);
-
-	if (bdev->funcs->del_from_lru_notify)
-		bdev->funcs->del_from_lru_notify(bo);
-}
-
-static void ttm_bo_bulk_move_set_pos(struct ttm_lru_bulk_move_pos *pos,
-				     struct ttm_buffer_object *bo)
-{
-	if (!pos->first)
-		pos->first = bo;
-	pos->last = bo;
-}
-
 void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
-			     struct ttm_resource *mem,
 			     struct ttm_lru_bulk_move *bulk)
 {
-	struct ttm_device *bdev = bo->bdev;
-	struct ttm_resource_manager *man;
-
-	if (!bo->deleted)
-		dma_resv_assert_held(bo->base.resv);
-
-	if (bo->pin_count) {
-		ttm_bo_move_to_pinned(bo);
-		return;
-	}
-
-	if (!mem)
-		return;
-
-	man = ttm_manager_type(bdev, mem->mem_type);
-	list_move_tail(&bo->lru, &man->lru[bo->priority]);
-
-	if (bdev->funcs->del_from_lru_notify)
-		bdev->funcs->del_from_lru_notify(bo);
-
-	if (bulk && !bo->pin_count) {
-		switch (bo->resource->mem_type) {
-		case TTM_PL_TT:
-			ttm_bo_bulk_move_set_pos(&bulk->tt[bo->priority], bo);
-			break;
+	dma_resv_assert_held(bo->base.resv);
 
-		case TTM_PL_VRAM:
-			ttm_bo_bulk_move_set_pos(&bulk->vram[bo->priority], bo);
-			break;
-		}
-	}
+	if (bo->resource)
+		ttm_resource_move_to_lru_tail(bo->resource, bulk);
 }
 EXPORT_SYMBOL(ttm_bo_move_to_lru_tail);
 
-void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move *bulk)
-{
-	unsigned i;
-
-	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		struct ttm_lru_bulk_move_pos *pos = &bulk->tt[i];
-		struct ttm_resource_manager *man;
-
-		if (!pos->first)
-			continue;
-
-		dma_resv_assert_held(pos->first->base.resv);
-		dma_resv_assert_held(pos->last->base.resv);
-
-		man = ttm_manager_type(pos->first->bdev, TTM_PL_TT);
-		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
-				    &pos->last->lru);
-	}
-
-	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		struct ttm_lru_bulk_move_pos *pos = &bulk->vram[i];
-		struct ttm_resource_manager *man;
-
-		if (!pos->first)
-			continue;
-
-		dma_resv_assert_held(pos->first->base.resv);
-		dma_resv_assert_held(pos->last->base.resv);
-
-		man = ttm_manager_type(pos->first->bdev, TTM_PL_VRAM);
-		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
-				    &pos->last->lru);
-	}
-}
-EXPORT_SYMBOL(ttm_bo_bulk_move_lru_tail);
-
 static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
 				  struct ttm_resource *mem, bool evict,
 				  struct ttm_operation_ctx *ctx,
@@ -345,7 +253,6 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 		return ret;
 	}
 
-	ttm_bo_move_to_pinned(bo);
 	list_del_init(&bo->ddestroy);
 	spin_unlock(&bo->bdev->lru_lock);
 	ttm_bo_cleanup_memtype_use(bo);
@@ -447,7 +354,7 @@ static void ttm_bo_release(struct kref *kref)
 		 */
 		if (bo->pin_count) {
 			bo->pin_count = 0;
-			ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
+			ttm_resource_move_to_lru_tail(bo->resource, NULL);
 		}
 
 		kref_init(&bo->kref);
@@ -460,7 +367,6 @@ static void ttm_bo_release(struct kref *kref)
 	}
 
 	spin_lock(&bo->bdev->lru_lock);
-	ttm_bo_del_from_lru(bo);
 	list_del(&bo->ddestroy);
 	spin_unlock(&bo->bdev->lru_lock);
 
@@ -674,15 +580,17 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
 			struct ww_acquire_ctx *ticket)
 {
 	struct ttm_buffer_object *bo = NULL, *busy_bo = NULL;
+	struct ttm_resource *res;
 	bool locked = false;
 	unsigned i;
 	int ret;
 
 	spin_lock(&bdev->lru_lock);
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
-		list_for_each_entry(bo, &man->lru[i], lru) {
+		list_for_each_entry(res, &man->lru[i], lru) {
 			bool busy;
 
+			bo = res->bo;
 			if (!ttm_bo_evict_swapout_allowable(bo, ctx, place,
 							    &locked, &busy)) {
 				if (busy && !busy_bo && ticket !=
@@ -700,7 +608,7 @@ int ttm_mem_evict_first(struct ttm_device *bdev,
 		}
 
 		/* If the inner loop terminated early, we have our candidate */
-		if (&bo->lru != &man->lru[i])
+		if (&res->lru != &man->lru[i])
 			break;
 
 		bo = NULL;
@@ -872,9 +780,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
 	}
 
 error:
-	if (bo->resource->mem_type == TTM_PL_SYSTEM && !bo->pin_count)
-		ttm_bo_move_to_lru_tail_unlocked(bo);
-
 	return ret;
 }
 EXPORT_SYMBOL(ttm_bo_mem_space);
@@ -968,7 +873,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
 	bo->destroy = destroy ? destroy : ttm_bo_default_destroy;
 
 	kref_init(&bo->kref);
-	INIT_LIST_HEAD(&bo->lru);
 	INIT_LIST_HEAD(&bo->ddestroy);
 	bo->bdev = bdev;
 	bo->type = type;
@@ -1017,8 +921,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
 		return ret;
 	}
 
-	ttm_bo_move_to_lru_tail_unlocked(bo);
-
 	return ret;
 }
 EXPORT_SYMBOL(ttm_bo_init_reserved);
@@ -1120,7 +1022,6 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
 		return ret == -EBUSY ? -ENOSPC : ret;
 	}
 
-	ttm_bo_move_to_pinned(bo);
 	/* TODO: Cleanup the locking */
 	spin_unlock(&bo->bdev->lru_lock);
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index a2e3a9626198..6c3d052f96ae 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -228,7 +228,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 
 	atomic_inc(&ttm_glob.bo_count);
 	INIT_LIST_HEAD(&fbo->base.ddestroy);
-	INIT_LIST_HEAD(&fbo->base.lru);
 	drm_vma_node_reset(&fbo->base.base.vma_node);
 
 	kref_init(&fbo->base.kref);
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index be24bb6cefd0..ba35887147ba 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -144,6 +144,7 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 {
 	struct ttm_resource_manager *man;
 	struct ttm_buffer_object *bo;
+	struct ttm_resource *res;
 	unsigned i, j;
 	int ret;
 
@@ -154,8 +155,11 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 			continue;
 
 		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
-			list_for_each_entry(bo, &man->lru[j], lru) {
-				uint32_t num_pages = PFN_UP(bo->base.size);
+			list_for_each_entry(res, &man->lru[j], lru) {
+				uint32_t num_pages;
+
+				bo = res->bo;
+				num_pages = PFN_UP(bo->base.size);
 
 				ret = ttm_bo_swapout(bo, ctx, gfp_flags);
 				/* ttm_bo_swapout has dropped the lru_lock */
@@ -259,49 +263,45 @@ void ttm_device_fini(struct ttm_device *bdev)
 }
 EXPORT_SYMBOL(ttm_device_fini);
 
-void ttm_device_clear_dma_mappings(struct ttm_device *bdev)
+static void ttm_device_clear_lru_dma_mappings(struct ttm_device *bdev,
+					      struct list_head *list)
 {
-	struct ttm_resource_manager *man;
-	struct ttm_buffer_object *bo;
-	unsigned int i, j;
+	struct ttm_resource *res;
 
 	spin_lock(&bdev->lru_lock);
-	while (!list_empty(&bdev->pinned)) {
-		bo = list_first_entry(&bdev->pinned, struct ttm_buffer_object, lru);
+	while ((res = list_first_entry_or_null(list, typeof(*res), lru))) {
+		struct ttm_buffer_object *bo = res->bo;
+
 		/* Take ref against racing releases once lru_lock is unlocked */
-		if (ttm_bo_get_unless_zero(bo)) {
-			list_del_init(&bo->lru);
-			spin_unlock(&bdev->lru_lock);
+		if (!ttm_bo_get_unless_zero(bo))
+			continue;
 
-			if (bo->ttm)
-				ttm_tt_unpopulate(bo->bdev, bo->ttm);
+		list_del_init(&res->lru);
+		spin_unlock(&bdev->lru_lock);
 
-			ttm_bo_put(bo);
-			spin_lock(&bdev->lru_lock);
-		}
+		if (bo->ttm)
+			ttm_tt_unpopulate(bo->bdev, bo->ttm);
+
+		ttm_bo_put(bo);
+		spin_lock(&bdev->lru_lock);
 	}
+	spin_unlock(&bdev->lru_lock);
+}
+
+void ttm_device_clear_dma_mappings(struct ttm_device *bdev)
+{
+	struct ttm_resource_manager *man;
+	unsigned int i, j;
+
+	ttm_device_clear_lru_dma_mappings(bdev, &bdev->pinned);
 
 	for (i = TTM_PL_SYSTEM; i < TTM_NUM_MEM_TYPES; ++i) {
 		man = ttm_manager_type(bdev, i);
 		if (!man || !man->use_tt)
 			continue;
 
-		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
-			while (!list_empty(&man->lru[j])) {
-				bo = list_first_entry(&man->lru[j], struct ttm_buffer_object, lru);
-				if (ttm_bo_get_unless_zero(bo)) {
-					list_del_init(&bo->lru);
-					spin_unlock(&bdev->lru_lock);
-
-					if (bo->ttm)
-						ttm_tt_unpopulate(bo->bdev, bo->ttm);
-
-					ttm_bo_put(bo);
-					spin_lock(&bdev->lru_lock);
-				}
-			}
-		}
+		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j)
+			ttm_device_clear_lru_dma_mappings(bdev, &man->lru[j]);
 	}
-	spin_unlock(&bdev->lru_lock);
 }
 EXPORT_SYMBOL(ttm_device_clear_dma_mappings);
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index b8362492980d..450e665c357b 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -29,6 +29,106 @@
 #include <drm/ttm/ttm_resource.h>
 #include <drm/ttm/ttm_bo_driver.h>
 
+/**
+ * ttm_lru_bulk_move_init - initialize a bulk move structure
+ * @bulk: the structure to init
+ *
+ * For now just memset the structure to zero.
+ */
+void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk)
+{
+	memset(bulk, 0, sizeof(*bulk));
+}
+EXPORT_SYMBOL(ttm_lru_bulk_move_init);
+
+/**
+ * ttm_lru_bulk_move_tail
+ *
+ * @bulk: bulk move structure
+ *
+ * Bulk move BOs to the LRU tail, only valid to use when driver makes sure that
+ * resource order never changes. Should be called with ttm_device::lru_lock held.
+ */
+void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk)
+{
+	unsigned i;
+
+	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
+		struct ttm_lru_bulk_move_pos *pos = &bulk->tt[i];
+		struct ttm_resource_manager *man;
+
+		if (!pos->first)
+			continue;
+
+		dma_resv_assert_held(pos->first->bo->base.resv);
+		dma_resv_assert_held(pos->last->bo->base.resv);
+
+		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_TT);
+		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
+				    &pos->last->lru);
+	}
+
+	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
+		struct ttm_lru_bulk_move_pos *pos = &bulk->vram[i];
+		struct ttm_resource_manager *man;
+
+		if (!pos->first)
+			continue;
+
+		dma_resv_assert_held(pos->first->bo->base.resv);
+		dma_resv_assert_held(pos->last->bo->base.resv);
+
+		man = ttm_manager_type(pos->first->bo->bdev, TTM_PL_VRAM);
+		list_bulk_move_tail(&man->lru[i], &pos->first->lru,
+				    &pos->last->lru);
+	}
+}
+EXPORT_SYMBOL(ttm_lru_bulk_move_tail);
+
+/* Record a resource position in a bulk move structure */
+static void ttm_lru_bulk_move_set_pos(struct ttm_lru_bulk_move_pos *pos,
+				      struct ttm_resource *res)
+{
+	if (!pos->first)
+		pos->first = res;
+	pos->last = res;
+}
+
+/* Move a resource to the LRU tail and track the bulk position */
+void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
+				   struct ttm_lru_bulk_move *bulk)
+{
+	struct ttm_buffer_object *bo = res->bo;
+	struct ttm_device *bdev = bo->bdev;
+	struct ttm_resource_manager *man;
+
+	if (bo->pin_count) {
+		list_move_tail(&res->lru, &bdev->pinned);
+		if (bdev->funcs->del_from_lru_notify)
+			bdev->funcs->del_from_lru_notify(res->bo);
+		return;
+	}
+
+	man = ttm_manager_type(bdev, res->mem_type);
+	list_move_tail(&res->lru, &man->lru[bo->priority]);
+
+	if (bdev->funcs->del_from_lru_notify)
+		bdev->funcs->del_from_lru_notify(bo);
+
+	if (!bulk)
+		return;
+
+	switch (res->mem_type) {
+	case TTM_PL_TT:
+		ttm_lru_bulk_move_set_pos(&bulk->tt[bo->priority], res);
+		break;
+
+	case TTM_PL_VRAM:
+		ttm_lru_bulk_move_set_pos(&bulk->vram[bo->priority], res);
+		break;
+	}
+}
+
 void ttm_resource_init(struct ttm_buffer_object *bo,
                        const struct ttm_place *place,
                        struct ttm_resource *res)
@@ -44,16 +144,36 @@ void ttm_resource_init(struct ttm_buffer_object *bo,
 	res->bus.is_iomem = false;
 	res->bus.caching = ttm_cached;
 	res->bo = bo;
+	INIT_LIST_HEAD(&res->lru);
 
 	man = ttm_manager_type(bo->bdev, place->mem_type);
 	atomic64_add(bo->base.size, &man->usage);
+
+	spin_lock(&bo->bdev->lru_lock);
+	ttm_resource_move_to_lru_tail(res, NULL);
+	spin_unlock(&bo->bdev->lru_lock);
 }
 EXPORT_SYMBOL(ttm_resource_init);
 
+/**
+ * ttm_resource_fini
+ *
+ * @res: the resource to clean up
+ *
+ * Make sure the resource is removed from the LRU before destruction.
+ */
 void ttm_resource_fini(struct ttm_resource_manager *man,
 		       struct ttm_resource *res)
 {
-	atomic64_sub(res->bo->base.size, &man->usage);
+	struct ttm_device *bdev = man->bdev;
+
+	spin_lock(&bdev->lru_lock);
+	list_del_init(&res->lru);
+	if (res->bo && bdev->funcs->del_from_lru_notify)
+		bdev->funcs->del_from_lru_notify(res->bo);
+	spin_unlock(&bdev->lru_lock);
+
+	atomic64_sub(res->num_pages << PAGE_SHIFT, &man->usage);
 }
 EXPORT_SYMBOL(ttm_resource_fini);
 
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 9798eb097c13..d239b97e8f71 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -56,8 +56,6 @@ struct ttm_placement;
 
 struct ttm_place;
 
-struct ttm_lru_bulk_move;
-
 /**
  * enum ttm_bo_type
  *
@@ -95,7 +93,6 @@ struct ttm_tt;
  * @ttm: TTM structure holding system pages.
  * @evicted: Whether the object was evicted without user-space knowing.
  * @deleted: True if the object is only a zombie and already deleted.
- * @lru: List head for the lru list.
  * @ddestroy: List head for the delayed destroy list.
  * @swap: List head for swap LRU list.
  * @offset: The current GPU offset, which can have different meanings
@@ -143,7 +140,6 @@ struct ttm_buffer_object {
 	 * Members protected by the bdev::lru_lock.
 	 */
 
-	struct list_head lru;
 	struct list_head ddestroy;
 
 	/**
@@ -294,7 +290,6 @@ void ttm_bo_put(struct ttm_buffer_object *bo);
  * ttm_bo_move_to_lru_tail
  *
  * @bo: The buffer object.
- * @mem: Resource object.
  * @bulk: optional bulk move structure to remember BO positions
  *
  * Move this BO to the tail of all lru lists used to lookup and reserve an
@@ -302,19 +297,8 @@ void ttm_bo_put(struct ttm_buffer_object *bo);
  * held, and is used to make a BO less likely to be considered for eviction.
  */
 void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
-			     struct ttm_resource *mem,
 			     struct ttm_lru_bulk_move *bulk);
 
-/**
- * ttm_bo_bulk_move_lru_tail
- *
- * @bulk: bulk move structure
- *
- * Bulk move BOs to the LRU tail, only valid to use when driver makes sure that
- * BO order never changes. Should be called with ttm_global::lru_lock held.
- */
-void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move *bulk);
-
 /**
  * ttm_bo_lock_delayed_workqueue
  *
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 5f087575194b..6c7352e13708 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -45,33 +45,6 @@
 #include "ttm_tt.h"
 #include "ttm_pool.h"
 
-/**
- * struct ttm_lru_bulk_move_pos
- *
- * @first: first BO in the bulk move range
- * @last: last BO in the bulk move range
- *
- * Positions for a lru bulk move.
- */
-struct ttm_lru_bulk_move_pos {
-	struct ttm_buffer_object *first;
-	struct ttm_buffer_object *last;
-};
-
-/**
- * struct ttm_lru_bulk_move
- *
- * @tt: first/last lru entry for BOs in the TT domain
- * @vram: first/last lru entry for BOs in the VRAM domain
- * @swap: first/last lru entry for BOs on the swap list
- *
- * Helper structure for bulk moves on the LRU list.
- */
-struct ttm_lru_bulk_move {
-	struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
-	struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
-};
-
 /*
  * ttm_bo.c
  */
@@ -182,7 +155,7 @@ static inline void
 ttm_bo_move_to_lru_tail_unlocked(struct ttm_buffer_object *bo)
 {
 	spin_lock(&bo->bdev->lru_lock);
-	ttm_bo_move_to_lru_tail(bo, bo->resource, NULL);
+	ttm_bo_move_to_lru_tail(bo, NULL);
 	spin_unlock(&bo->bdev->lru_lock);
 }
 
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 3d391279b33f..a54d52517a30 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -26,10 +26,12 @@
 #define _TTM_RESOURCE_H_
 
 #include <linux/types.h>
+#include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/atomic.h>
 #include <linux/dma-buf-map.h>
 #include <linux/dma-fence.h>
+
 #include <drm/drm_print.h>
 #include <drm/ttm/ttm_caching.h>
 #include <drm/ttm/ttm_kmap_iter.h>
@@ -178,6 +180,33 @@ struct ttm_resource {
 	uint32_t placement;
 	struct ttm_bus_placement bus;
 	struct ttm_buffer_object *bo;
+	struct list_head lru;
+};
+
+/**
+ * struct ttm_lru_bulk_move_pos
+ *
+ * @first: first res in the bulk move range
+ * @last: last res in the bulk move range
+ *
+ * Positions for a lru bulk move.
+ */
+struct ttm_lru_bulk_move_pos {
+	struct ttm_resource *first;
+	struct ttm_resource *last;
+};
+
+/**
+ * struct ttm_lru_bulk_move
+ *
+ * @tt: first/last lru entry for resources in the TT domain
+ * @vram: first/last lru entry for resources in the VRAM domain
+ *
+ * Helper structure for bulk moves on the LRU list.
+ */
+struct ttm_lru_bulk_move {
+	struct ttm_lru_bulk_move_pos tt[TTM_MAX_BO_PRIORITY];
+	struct ttm_lru_bulk_move_pos vram[TTM_MAX_BO_PRIORITY];
 };
 
 /**
@@ -279,6 +308,12 @@ ttm_resource_manager_usage(struct ttm_resource_manager *man)
 	return atomic64_read(&man->usage);
 }
 
+void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk);
+void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk);
+
+void ttm_resource_move_to_lru_tail(struct ttm_resource *res,
+				   struct ttm_lru_bulk_move *bulk);
+
 void ttm_resource_init(struct ttm_buffer_object *bo,
                        const struct ttm_place *place,
                        struct ttm_resource *res);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-01-27  8:48 UTC | newest]

Thread overview: 23+ messages
2022-01-24 12:25 drm/ttm: moving the LRU into the resource Christian König
2022-01-24 12:25 ` [PATCH 01/12] drm/ttm: add ttm_resource_fini Christian König
2022-01-25 16:26   ` Daniel Vetter
2022-01-24 12:25 ` [PATCH 02/12] drm/ttm: add back a reference to the bdev to the res manager Christian König
2022-01-25 16:30   ` Daniel Vetter
2022-01-24 12:25 ` [PATCH 03/12] drm/ttm: add a weak BO reference to the resource v3 Christian König
2022-01-24 12:25 ` [PATCH 04/12] drm/ttm: add common accounting to the resource mgr v2 Christian König
2022-01-25 16:37   ` Daniel Vetter
2022-01-26 14:42     ` Christian König
2022-01-27  8:48       ` Daniel Vetter
2022-01-24 12:25 ` [PATCH 05/12] drm/ttm: move the LRU into resource handling v2 Christian König
2022-01-25 16:52   ` Daniel Vetter
2022-01-24 12:25 ` [PATCH 06/12] drm/ttm: add resource iterator Christian König
2022-01-25 16:56   ` Daniel Vetter
2022-01-24 12:25 ` [PATCH 07/12] drm/radeon: remove resource accounting Christian König
2022-01-24 12:25 ` [PATCH 08/12] drm/amdgpu: remove GTT accounting Christian König
2022-01-24 12:25 ` [PATCH 09/12] drm/amdgpu: remove VRAM accounting Christian König
2022-01-24 12:25 ` [PATCH 10/12] drm/amdgpu: drop amdgpu_gtt_node Christian König
2022-01-24 12:25 ` [PATCH 11/12] drm/ttm: allow bulk moves for all domains Christian König
2022-01-25 17:16   ` Daniel Vetter
2022-01-24 12:25 ` [PATCH 12/12] drm/ttm: rework bulk move handling Christian König
2022-01-25 17:12   ` Daniel Vetter
  -- strict thread matches above, loose matches on Subject: below --
2021-11-24 12:44 drm/ttm: moving the LRU into the resource Christian König
2021-11-24 12:44 ` [PATCH 05/12] drm/ttm: move the LRU into resource handling v2 Christian König
