* Cleanup TTMs delayed delete handling
@ 2020-02-10 15:09 Christian König
  2020-02-10 15:09 ` [PATCH 1/6] drm/ttm: refine ghost BO resv criteria Christian König
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Christian König @ 2020-02-10 15:09 UTC (permalink / raw)
  To: Xinhui.Pan, amd-gfx, dri-devel

This series of patches cleans up TTM's delayed delete handling.

The core of the new handling is that we now have only a single reference counter instead of two, and use kref_get_unless_zero() to grab BOs from the LRU during eviction.

This reduces the overhead of LRU moves and allows us to properly individualize the BO's reservation object during deletion, so that fences for clearing memory, unmapping page tables etc. can be added to it.

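To illustrate the core pattern (a minimal sketch, not taken verbatim from the patches; it assumes ttm_bo_get_unless_zero() is a thin wrapper around kref_get_unless_zero() on the BO's kref):

	/* Try to acquire a reference on a BO found on an LRU list.
	 * Returns false if the refcount already dropped to zero,
	 * i.e. the BO is in its release path and must be skipped.
	 */
	static inline bool ttm_bo_get_unless_zero(struct ttm_buffer_object *bo)
	{
		return kref_get_unless_zero(&bo->kref);
	}

	/* Caller pattern during eviction, under ttm_bo_glob.lru_lock: */
	if (!ttm_bo_get_unless_zero(bo))
		continue; /* dying BO, leave it to its release path */
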
Please review and comment,
Christian.



* [PATCH 1/6] drm/ttm: refine ghost BO resv criteria
  2020-02-10 15:09 Cleanup TTMs delayed delete handling Christian König
@ 2020-02-10 15:09 ` Christian König
  2020-02-10 15:09 ` [PATCH 2/6] drm/ttm: cleanup ttm_buffer_object_transfer Christian König
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Christian König @ 2020-02-10 15:09 UTC (permalink / raw)
  To: Xinhui.Pan, amd-gfx, dri-devel

Ghost BOs need to stick with the origin's resv object only when the origin is imported. In code terms, imported BOs are the ttm_bo_type_sg ones, which share the exporter's reservation object; all other ghost BOs can use their own embedded _resv.

This is low-hanging fruit to avoid OOM situations on evictions.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 2b0e5a088da0..86d152472f38 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -511,7 +511,7 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	kref_init(&fbo->base.kref);
 	fbo->base.destroy = &ttm_transfered_destroy;
 	fbo->base.acc_size = 0;
-	if (bo->base.resv == &bo->base._resv)
+	if (bo->type != ttm_bo_type_sg)
 		fbo->base.base.resv = &fbo->base.base._resv;
 
 	dma_resv_init(&fbo->base.base._resv);
-- 
2.17.1


* [PATCH 2/6] drm/ttm: cleanup ttm_buffer_object_transfer
  2020-02-10 15:09 Cleanup TTMs delayed delete handling Christian König
  2020-02-10 15:09 ` [PATCH 1/6] drm/ttm: refine ghost BO resv criteria Christian König
@ 2020-02-10 15:09 ` Christian König
  2020-02-10 15:09 ` [PATCH 3/6] drm/ttm: use RCU in ttm_bo_flush_all_fences Christian König
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Christian König @ 2020-02-10 15:09 UTC (permalink / raw)
  To: Xinhui.Pan, amd-gfx, dri-devel

The function is always called with deleted BOs, so the reservation object to wait on is always the individualized &bo->base._resv.

While at it, clean up the indentation as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 1fbc36f05d89..c1104c8857b7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -522,14 +522,9 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 			       bool interruptible, bool no_wait_gpu,
 			       bool unlock_resv)
 {
-	struct dma_resv *resv;
+	struct dma_resv *resv = &bo->base._resv;
 	int ret;
 
-	if (unlikely(list_empty(&bo->ddestroy)))
-		resv = bo->base.resv;
-	else
-		resv = &bo->base._resv;
-
 	if (dma_resv_test_signaled_rcu(resv, true))
 		ret = 0;
 	else
@@ -542,9 +537,8 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 			dma_resv_unlock(bo->base.resv);
 		spin_unlock(&ttm_bo_glob.lru_lock);
 
-		lret = dma_resv_wait_timeout_rcu(resv, true,
-							   interruptible,
-							   30 * HZ);
+		lret = dma_resv_wait_timeout_rcu(resv, true, interruptible,
+						 30 * HZ);
 
 		if (lret < 0)
 			return lret;
-- 
2.17.1


* [PATCH 3/6] drm/ttm: use RCU in ttm_bo_flush_all_fences
  2020-02-10 15:09 Cleanup TTMs delayed delete handling Christian König
  2020-02-10 15:09 ` [PATCH 1/6] drm/ttm: refine ghost BO resv criteria Christian König
  2020-02-10 15:09 ` [PATCH 2/6] drm/ttm: cleanup ttm_buffer_object_transfer Christian König
@ 2020-02-10 15:09 ` Christian König
  2020-02-10 15:09 ` [PATCH 4/6] drm/ttm: rework BO delayed delete Christian König
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Christian König @ 2020-02-10 15:09 UTC (permalink / raw)
  To: Xinhui.Pan, amd-gfx, dri-devel

This allows calling the function without the reservation lock held; the fence lists are walked under rcu_read_lock() instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index c1104c8857b7..e12fc2c2d165 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -429,22 +429,24 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
 
 static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
 {
+	struct dma_resv *resv = &bo->base._resv;
 	struct dma_resv_list *fobj;
 	struct dma_fence *fence;
 	int i;
 
-	fobj = dma_resv_get_list(&bo->base._resv);
-	fence = dma_resv_get_excl(&bo->base._resv);
+	rcu_read_lock();
+	fobj = rcu_dereference(resv->fence);
+	fence = rcu_dereference(resv->fence_excl);
 	if (fence && !fence->ops->signaled)
 		dma_fence_enable_sw_signaling(fence);
 
 	for (i = 0; fobj && i < fobj->shared_count; ++i) {
-		fence = rcu_dereference_protected(fobj->shared[i],
-					dma_resv_held(bo->base.resv));
+		fence = rcu_dereference(fobj->shared[i]);
 
 		if (!fence->ops->signaled)
 			dma_fence_enable_sw_signaling(fence);
 	}
+	rcu_read_unlock();
 }
 
 static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
-- 
2.17.1


* [PATCH 4/6] drm/ttm: rework BO delayed delete.
  2020-02-10 15:09 Cleanup TTMs delayed delete handling Christian König
                   ` (2 preceding siblings ...)
  2020-02-10 15:09 ` [PATCH 3/6] drm/ttm: use RCU in ttm_bo_flush_all_fences Christian König
@ 2020-02-10 15:09 ` Christian König
  2020-02-11  5:26   ` Pan, Xinhui
  2020-02-10 15:09 ` [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2 Christian König
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2020-02-10 15:09 UTC (permalink / raw)
  To: Xinhui.Pan, amd-gfx, dri-devel

This patch reworks the whole delayed deletion of BOs which aren't idle.

Instead of keeping two reference counters in the BO structure, we
resurrect the BO (re-initialize its single kref) when we find that a
deleted BO is not idle yet.

This has many advantages, especially that we don't need to
increment/decrement the BO's reference counter any more when it
moves on the LRUs.

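Conceptually the new release path looks like this (simplified sketch; locking, resv individualization and the remaining teardown are omitted, the full version is in the hunk below):

	static void ttm_bo_release(struct kref *kref)
	{
		struct ttm_buffer_object *bo =
			container_of(kref, struct ttm_buffer_object, kref);

		if (!dma_resv_test_signaled_rcu(bo->base.resv, true)) {
			/* Not idle: resurrect the BO with one reference
			 * and queue it for delayed destruction.
			 */
			bo->deleted = true;
			kref_init(&bo->kref);
			list_add_tail(&bo->ddestroy, &bo->bdev->ddestroy);
			schedule_delayed_work(&bo->bdev->wq, HZ / 100);
			return;
		}

		/* Idle: tear everything down immediately. */
		ttm_bo_cleanup_memtype_use(bo);
		bo->destroy(bo);
	}
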
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c      | 217 +++++++++++++-----------------
 drivers/gpu/drm/ttm/ttm_bo_util.c |   1 -
 include/drm/ttm/ttm_bo_api.h      |  11 +-
 3 files changed, 97 insertions(+), 132 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index e12fc2c2d165..d0624685f5d2 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -145,26 +145,6 @@ static inline uint32_t ttm_bo_type_flags(unsigned type)
 	return 1 << (type);
 }
 
-static void ttm_bo_release_list(struct kref *list_kref)
-{
-	struct ttm_buffer_object *bo =
-	    container_of(list_kref, struct ttm_buffer_object, list_kref);
-	size_t acc_size = bo->acc_size;
-
-	BUG_ON(kref_read(&bo->list_kref));
-	BUG_ON(kref_read(&bo->kref));
-	BUG_ON(bo->mem.mm_node != NULL);
-	BUG_ON(!list_empty(&bo->lru));
-	BUG_ON(!list_empty(&bo->ddestroy));
-	ttm_tt_destroy(bo->ttm);
-	atomic_dec(&ttm_bo_glob.bo_count);
-	dma_fence_put(bo->moving);
-	if (!ttm_bo_uses_embedded_gem_object(bo))
-		dma_resv_fini(&bo->base._resv);
-	bo->destroy(bo);
-	ttm_mem_global_free(&ttm_mem_glob, acc_size);
-}
-
 static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
 				  struct ttm_mem_reg *mem)
 {
@@ -181,21 +161,14 @@ static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
 
 	man = &bdev->man[mem->mem_type];
 	list_add_tail(&bo->lru, &man->lru[bo->priority]);
-	kref_get(&bo->list_kref);
 
 	if (!(man->flags & TTM_MEMTYPE_FLAG_FIXED) && bo->ttm &&
 	    !(bo->ttm->page_flags & (TTM_PAGE_FLAG_SG |
 				     TTM_PAGE_FLAG_SWAPPED))) {
 		list_add_tail(&bo->swap, &ttm_bo_glob.swap_lru[bo->priority]);
-		kref_get(&bo->list_kref);
 	}
 }
 
-static void ttm_bo_ref_bug(struct kref *list_kref)
-{
-	BUG();
-}
-
 static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
 {
 	struct ttm_bo_device *bdev = bo->bdev;
@@ -203,12 +176,10 @@ static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
 
 	if (!list_empty(&bo->swap)) {
 		list_del_init(&bo->swap);
-		kref_put(&bo->list_kref, ttm_bo_ref_bug);
 		notify = true;
 	}
 	if (!list_empty(&bo->lru)) {
 		list_del_init(&bo->lru);
-		kref_put(&bo->list_kref, ttm_bo_ref_bug);
 		notify = true;
 	}
 
@@ -421,8 +392,7 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
 	BUG_ON(!dma_resv_trylock(&bo->base._resv));
 
 	r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
-	if (r)
-		dma_resv_unlock(&bo->base._resv);
+	dma_resv_unlock(&bo->base._resv);
 
 	return r;
 }
@@ -449,68 +419,10 @@ static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
 	rcu_read_unlock();
 }
 
-static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
-{
-	struct ttm_bo_device *bdev = bo->bdev;
-	int ret;
-
-	ret = ttm_bo_individualize_resv(bo);
-	if (ret) {
-		/* Last resort, if we fail to allocate memory for the
-		 * fences block for the BO to become idle
-		 */
-		dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
-						    30 * HZ);
-		spin_lock(&ttm_bo_glob.lru_lock);
-		goto error;
-	}
-
-	spin_lock(&ttm_bo_glob.lru_lock);
-	ret = dma_resv_trylock(bo->base.resv) ? 0 : -EBUSY;
-	if (!ret) {
-		if (dma_resv_test_signaled_rcu(&bo->base._resv, true)) {
-			ttm_bo_del_from_lru(bo);
-			spin_unlock(&ttm_bo_glob.lru_lock);
-			if (bo->base.resv != &bo->base._resv)
-				dma_resv_unlock(&bo->base._resv);
-
-			ttm_bo_cleanup_memtype_use(bo);
-			dma_resv_unlock(bo->base.resv);
-			return;
-		}
-
-		ttm_bo_flush_all_fences(bo);
-
-		/*
-		 * Make NO_EVICT bos immediately available to
-		 * shrinkers, now that they are queued for
-		 * destruction.
-		 */
-		if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
-			bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
-			ttm_bo_move_to_lru_tail(bo, NULL);
-		}
-
-		dma_resv_unlock(bo->base.resv);
-	}
-	if (bo->base.resv != &bo->base._resv) {
-		ttm_bo_flush_all_fences(bo);
-		dma_resv_unlock(&bo->base._resv);
-	}
-
-error:
-	kref_get(&bo->list_kref);
-	list_add_tail(&bo->ddestroy, &bdev->ddestroy);
-	spin_unlock(&ttm_bo_glob.lru_lock);
-
-	schedule_delayed_work(&bdev->wq,
-			      ((HZ / 100) < 1) ? 1 : HZ / 100);
-}
-
 /**
  * function ttm_bo_cleanup_refs
- * If bo idle, remove from delayed- and lru lists, and unref.
- * If not idle, do nothing.
+ * If bo idle, remove from lru lists, and unref.
+ * If not idle, block if possible.
  *
  * Must be called with lru_lock and reservation held, this function
  * will drop the lru lock and optionally the reservation lock before returning.
@@ -572,14 +484,14 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 
 	ttm_bo_del_from_lru(bo);
 	list_del_init(&bo->ddestroy);
-	kref_put(&bo->list_kref, ttm_bo_ref_bug);
-
 	spin_unlock(&ttm_bo_glob.lru_lock);
 	ttm_bo_cleanup_memtype_use(bo);
 
 	if (unlock_resv)
 		dma_resv_unlock(bo->base.resv);
 
+	ttm_bo_put(bo);
+
 	return 0;
 }
 
@@ -601,8 +513,9 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
 
 		bo = list_first_entry(&bdev->ddestroy, struct ttm_buffer_object,
 				      ddestroy);
-		kref_get(&bo->list_kref);
 		list_move_tail(&bo->ddestroy, &removed);
+		if (!ttm_bo_get_unless_zero(bo))
+			continue;
 
 		if (remove_all || bo->base.resv != &bo->base._resv) {
 			spin_unlock(&glob->lru_lock);
@@ -617,7 +530,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
 			spin_unlock(&glob->lru_lock);
 		}
 
-		kref_put(&bo->list_kref, ttm_bo_release_list);
+		ttm_bo_put(bo);
 		spin_lock(&glob->lru_lock);
 	}
 	list_splice_tail(&removed, &bdev->ddestroy);
@@ -643,16 +556,68 @@ static void ttm_bo_release(struct kref *kref)
 	    container_of(kref, struct ttm_buffer_object, kref);
 	struct ttm_bo_device *bdev = bo->bdev;
 	struct ttm_mem_type_manager *man = &bdev->man[bo->mem.mem_type];
+	size_t acc_size = bo->acc_size;
+	int ret;
 
-	if (bo->bdev->driver->release_notify)
-		bo->bdev->driver->release_notify(bo);
+	if (!bo->deleted) {
+		if (bo->bdev->driver->release_notify)
+			bo->bdev->driver->release_notify(bo);
 
-	drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
-	ttm_mem_io_lock(man, false);
-	ttm_mem_io_free_vm(bo);
-	ttm_mem_io_unlock(man);
-	ttm_bo_cleanup_refs_or_queue(bo);
-	kref_put(&bo->list_kref, ttm_bo_release_list);
+		drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
+		ttm_mem_io_lock(man, false);
+		ttm_mem_io_free_vm(bo);
+		ttm_mem_io_unlock(man);
+
+		ret = ttm_bo_individualize_resv(bo);
+		if (ret) {
+			/* Last resort, if we fail to allocate memory for the
+			 * fences block for the BO to become idle
+			 */
+			dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
+						  30 * HZ);
+		}
+	}
+
+	if (!dma_resv_test_signaled_rcu(bo->base.resv, true)) {
+		/* The BO is not idle, resurrect it for delayed destroy */
+		ttm_bo_flush_all_fences(bo);
+		bo->deleted = true;
+
+		/*
+		 * Make NO_EVICT bos immediately available to
+		 * shrinkers, now that they are queued for
+		 * destruction.
+		 */
+		if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
+			bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
+			ttm_bo_move_to_lru_tail(bo, NULL);
+		}
+
+		spin_lock(&ttm_bo_glob.lru_lock);
+		kref_init(&bo->kref);
+		list_add_tail(&bo->ddestroy, &bdev->ddestroy);
+		spin_unlock(&ttm_bo_glob.lru_lock);
+
+		schedule_delayed_work(&bdev->wq,
+				      ((HZ / 100) < 1) ? 1 : HZ / 100);
+		return;
+	}
+
+	spin_lock(&ttm_bo_glob.lru_lock);
+	ttm_bo_del_from_lru(bo);
+	list_del(&bo->ddestroy);
+	spin_unlock(&ttm_bo_glob.lru_lock);
+
+	ttm_bo_cleanup_memtype_use(bo);
+
+	BUG_ON(bo->mem.mm_node != NULL);
+	ttm_tt_destroy(bo->ttm);
+	atomic_dec(&ttm_bo_glob.bo_count);
+	dma_fence_put(bo->moving);
+	if (!ttm_bo_uses_embedded_gem_object(bo))
+		dma_resv_fini(&bo->base._resv);
+	bo->destroy(bo);
+	ttm_mem_global_free(&ttm_mem_glob, acc_size);
 }
 
 void ttm_bo_put(struct ttm_buffer_object *bo)
@@ -755,8 +720,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
 
 	if (bo->base.resv == ctx->resv) {
 		dma_resv_assert_held(bo->base.resv);
-		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
-		    || !list_empty(&bo->ddestroy))
+		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
 			ret = true;
 		*locked = false;
 		if (busy)
@@ -837,6 +801,11 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 					dma_resv_unlock(bo->base.resv);
 				continue;
 			}
+			if (!ttm_bo_get_unless_zero(bo)) {
+				if (locked)
+					dma_resv_unlock(bo->base.resv);
+				continue;
+			}
 			break;
 		}
 
@@ -848,21 +817,19 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 	}
 
 	if (!bo) {
-		if (busy_bo)
-			kref_get(&busy_bo->list_kref);
+		if (busy_bo && !ttm_bo_get_unless_zero(busy_bo))
+			busy_bo = NULL;
 		spin_unlock(&ttm_bo_glob.lru_lock);
 		ret = ttm_mem_evict_wait_busy(busy_bo, ctx, ticket);
 		if (busy_bo)
-			kref_put(&busy_bo->list_kref, ttm_bo_release_list);
+			ttm_bo_put(busy_bo);
 		return ret;
 	}
 
-	kref_get(&bo->list_kref);
-
-	if (!list_empty(&bo->ddestroy)) {
+	if (bo->deleted) {
 		ret = ttm_bo_cleanup_refs(bo, ctx->interruptible,
 					  ctx->no_wait_gpu, locked);
-		kref_put(&bo->list_kref, ttm_bo_release_list);
+		ttm_bo_put(bo);
 		return ret;
 	}
 
@@ -872,7 +839,7 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 	if (locked)
 		ttm_bo_unreserve(bo);
 
-	kref_put(&bo->list_kref, ttm_bo_release_list);
+	ttm_bo_put(bo);
 	return ret;
 }
 
@@ -1284,7 +1251,6 @@ int ttm_bo_init_reserved(struct ttm_bo_device *bdev,
 	bo->destroy = destroy ? destroy : ttm_bo_default_destroy;
 
 	kref_init(&bo->kref);
-	kref_init(&bo->list_kref);
 	INIT_LIST_HEAD(&bo->lru);
 	INIT_LIST_HEAD(&bo->ddestroy);
 	INIT_LIST_HEAD(&bo->swap);
@@ -1804,11 +1770,18 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
 	spin_lock(&glob->lru_lock);
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
 		list_for_each_entry(bo, &glob->swap_lru[i], swap) {
-			if (ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
-							   NULL)) {
-				ret = 0;
-				break;
+			if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
+							    NULL))
+				continue;
+
+			if (!ttm_bo_get_unless_zero(bo)) {
+				if (locked)
+					dma_resv_unlock(bo->base.resv);
+				continue;
 			}
+
+			ret = 0;
+			break;
 		}
 		if (!ret)
 			break;
@@ -1819,11 +1792,9 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
 		return ret;
 	}
 
-	kref_get(&bo->list_kref);
-
-	if (!list_empty(&bo->ddestroy)) {
+	if (bo->deleted) {
 		ret = ttm_bo_cleanup_refs(bo, false, false, locked);
-		kref_put(&bo->list_kref, ttm_bo_release_list);
+		ttm_bo_put(bo);
 		return ret;
 	}
 
@@ -1877,7 +1848,7 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
 	 */
 	if (locked)
 		dma_resv_unlock(bo->base.resv);
-	kref_put(&bo->list_kref, ttm_bo_release_list);
+	ttm_bo_put(bo);
 	return ret;
 }
 EXPORT_SYMBOL(ttm_bo_swapout);
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 86d152472f38..c8e359ded1df 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -507,7 +507,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	fbo->base.moving = NULL;
 	drm_vma_node_reset(&fbo->base.base.vma_node);
 
-	kref_init(&fbo->base.list_kref);
 	kref_init(&fbo->base.kref);
 	fbo->base.destroy = &ttm_transfered_destroy;
 	fbo->base.acc_size = 0;
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 66ca49db9633..b9bc1b00142e 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -135,18 +135,14 @@ struct ttm_tt;
  * @num_pages: Actual number of pages.
  * @acc_size: Accounted size for this object.
  * @kref: Reference count of this buffer object. When this refcount reaches
- * zero, the object is put on the delayed delete list.
- * @list_kref: List reference count of this buffer object. This member is
- * used to avoid destruction while the buffer object is still on a list.
- * Lru lists may keep one refcount, the delayed delete list, and kref != 0
- * keeps one refcount. When this refcount reaches zero,
- * the object is destroyed.
+ * zero, the object is destroyed or put on the delayed delete list.
  * @mem: structure describing current placement.
  * @persistent_swap_storage: Usually the swap storage is deleted for buffers
  * pinned in physical memory. If this behaviour is not desired, this member
  * holds a pointer to a persistent shmem object.
  * @ttm: TTM structure holding system pages.
  * @evicted: Whether the object was evicted without user-space knowing.
+ * @deleted: True if the object is only a zombie and already deleted.
  * @lru: List head for the lru list.
  * @ddestroy: List head for the delayed destroy list.
  * @swap: List head for swap LRU list.
@@ -183,9 +179,7 @@ struct ttm_buffer_object {
 	/**
 	* Members not needing protection.
 	*/
-
 	struct kref kref;
-	struct kref list_kref;
 
 	/**
 	 * Members protected by the bo::resv::reserved lock.
@@ -195,6 +189,7 @@ struct ttm_buffer_object {
 	struct file *persistent_swap_storage;
 	struct ttm_tt *ttm;
 	bool evicted;
+	bool deleted;
 
 	/**
 	 * Members protected by the bdev::lru_lock.
-- 
2.17.1


* [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2
  2020-02-10 15:09 Cleanup TTMs delayed delete handling Christian König
                   ` (3 preceding siblings ...)
  2020-02-10 15:09 ` [PATCH 4/6] drm/ttm: rework BO delayed delete Christian König
@ 2020-02-10 15:09 ` Christian König
  2020-02-11 14:14   ` Daniel Vetter
  2020-02-10 15:09 ` [PATCH 6/6] drm/ttm: individualize resv objects before calling release_notify Christian König
  2020-02-11 11:06 ` Cleanup TTMs delayed delete handling Pan, Xinhui
  6 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2020-02-10 15:09 UTC (permalink / raw)
  To: Xinhui.Pan, amd-gfx, dri-devel

When non-imported BOs are resurrected for delayed delete, we replace
the dma_resv object with the individualized one to allow easy
reclaiming of the resources.

v2: move that to ttm_bo_individualize_resv

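The pointer swap happens under the lru_lock because concurrent eviction walks read bo->base.resv while holding only that lock. A sketch of the reader side this has to stay consistent with (assuming the pattern used by ttm_mem_evict_first()):

	spin_lock(&ttm_bo_glob.lru_lock);
	list_for_each_entry(bo, &man->lru[i], lru) {
		/* bo->base.resv may be redirected to &bo->base._resv by a
		 * deleting thread; doing the swap under the lru_lock
		 * guarantees this read and the trylock agree on one pointer.
		 */
		if (!dma_resv_trylock(bo->base.resv))
			continue;

		/* ... evaluate the BO for eviction, unlock when done ... */
	}
	spin_unlock(&ttm_bo_glob.lru_lock);
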
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d0624685f5d2..4d161038de98 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -393,6 +393,14 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
 
 	r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
 	dma_resv_unlock(&bo->base._resv);
+	if (r)
+		return r;
+
+	if (bo->type != ttm_bo_type_sg) {
+		spin_lock(&ttm_bo_glob.lru_lock);
+		bo->base.resv = &bo->base._resv;
+		spin_unlock(&ttm_bo_glob.lru_lock);
+	}
 
 	return r;
 }
@@ -720,7 +728,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
 
 	if (bo->base.resv == ctx->resv) {
 		dma_resv_assert_held(bo->base.resv);
-		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
+		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT)
 			ret = true;
 		*locked = false;
 		if (busy)
-- 
2.17.1


* [PATCH 6/6] drm/ttm: individualize resv objects before calling release_notify
  2020-02-10 15:09 Cleanup TTMs delayed delete handling Christian König
                   ` (4 preceding siblings ...)
  2020-02-10 15:09 ` [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2 Christian König
@ 2020-02-10 15:09 ` Christian König
  2020-02-11 11:06 ` Cleanup TTMs delayed delete handling Pan, Xinhui
  6 siblings, 0 replies; 17+ messages in thread
From: Christian König @ 2020-02-10 15:09 UTC (permalink / raw)
  To: Xinhui.Pan, amd-gfx, dri-devel

This allows the driver's release_notify callback to add and remove fences on deleted objects.

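A hypothetical driver callback illustrating what this enables; my_driver_clear_memory() is a made-up placeholder, not a real API. Because the resv is already individualized at this point, fences added here only delay the destruction of this one BO:

	static void my_driver_release_notify(struct ttm_buffer_object *bo)
	{
		struct dma_fence *fence;

		if (!dma_resv_trylock(bo->base.resv))
			return;

		/* Queue a GPU job wiping the BO's memory and remember the
		 * resulting fence (placeholder for driver specific code).
		 */
		if (!my_driver_clear_memory(bo, &fence)) {
			dma_resv_add_excl_fence(bo->base.resv, fence);
			dma_fence_put(fence);
		}

		dma_resv_unlock(bo->base.resv);
	}
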
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 4d161038de98..42177ccd7035 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -568,14 +568,6 @@ static void ttm_bo_release(struct kref *kref)
 	int ret;
 
 	if (!bo->deleted) {
-		if (bo->bdev->driver->release_notify)
-			bo->bdev->driver->release_notify(bo);
-
-		drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
-		ttm_mem_io_lock(man, false);
-		ttm_mem_io_free_vm(bo);
-		ttm_mem_io_unlock(man);
-
 		ret = ttm_bo_individualize_resv(bo);
 		if (ret) {
 			/* Last resort, if we fail to allocate memory for the
@@ -584,6 +576,14 @@ static void ttm_bo_release(struct kref *kref)
 			dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
 						  30 * HZ);
 		}
+
+		if (bo->bdev->driver->release_notify)
+			bo->bdev->driver->release_notify(bo);
+
+		drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
+		ttm_mem_io_lock(man, false);
+		ttm_mem_io_free_vm(bo);
+		ttm_mem_io_unlock(man);
 	}
 
 	if (!dma_resv_test_signaled_rcu(bo->base.resv, true)) {
-- 
2.17.1


* Re: [PATCH 4/6] drm/ttm: rework BO delayed delete.
  2020-02-10 15:09 ` [PATCH 4/6] drm/ttm: rework BO delayed delete Christian König
@ 2020-02-11  5:26   ` Pan, Xinhui
  2020-02-11  5:43     ` Pan, Xinhui
  0 siblings, 1 reply; 17+ messages in thread
From: Pan, Xinhui @ 2020-02-11  5:26 UTC (permalink / raw)
  To: Christian König; +Cc: Pan, Xinhui, dri-devel, amd-gfx

Comments inline, marked with [xh].


> On 2020-02-10 23:09, Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
> 
> This patch reworks the whole delayed deletion of BOs which aren't idle.
> 
> Instead of keeping two reference counters in the BO structure, we
> resurrect the BO (re-initialize its single kref) when we find that a
> deleted BO is not idle yet.
> 
> This has many advantages, especially that we don't need to
> increment/decrement the BO's reference counter any more when it
> moves on the LRUs.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c      | 217 +++++++++++++-----------------
> drivers/gpu/drm/ttm/ttm_bo_util.c |   1 -
> include/drm/ttm/ttm_bo_api.h      |  11 +-
> 3 files changed, 97 insertions(+), 132 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index e12fc2c2d165..d0624685f5d2 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -145,26 +145,6 @@ static inline uint32_t ttm_bo_type_flags(unsigned type)
> 	return 1 << (type);
> }
> 
> -static void ttm_bo_release_list(struct kref *list_kref)
> -{
> -	struct ttm_buffer_object *bo =
> -	    container_of(list_kref, struct ttm_buffer_object, list_kref);
> -	size_t acc_size = bo->acc_size;
> -
> -	BUG_ON(kref_read(&bo->list_kref));
> -	BUG_ON(kref_read(&bo->kref));
> -	BUG_ON(bo->mem.mm_node != NULL);
> -	BUG_ON(!list_empty(&bo->lru));
> -	BUG_ON(!list_empty(&bo->ddestroy));
> -	ttm_tt_destroy(bo->ttm);
> -	atomic_dec(&ttm_bo_glob.bo_count);
> -	dma_fence_put(bo->moving);
> -	if (!ttm_bo_uses_embedded_gem_object(bo))
> -		dma_resv_fini(&bo->base._resv);
> -	bo->destroy(bo);
> -	ttm_mem_global_free(&ttm_mem_glob, acc_size);
> -}
> -
> static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
> 				  struct ttm_mem_reg *mem)
> {
> @@ -181,21 +161,14 @@ static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
> 
> 	man = &bdev->man[mem->mem_type];
> 	list_add_tail(&bo->lru, &man->lru[bo->priority]);
> -	kref_get(&bo->list_kref);
> 
> 	if (!(man->flags & TTM_MEMTYPE_FLAG_FIXED) && bo->ttm &&
> 	    !(bo->ttm->page_flags & (TTM_PAGE_FLAG_SG |
> 				     TTM_PAGE_FLAG_SWAPPED))) {
> 		list_add_tail(&bo->swap, &ttm_bo_glob.swap_lru[bo->priority]);
> -		kref_get(&bo->list_kref);
> 	}
> }
> 
> -static void ttm_bo_ref_bug(struct kref *list_kref)
> -{
> -	BUG();
> -}
> -
> static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
> {
> 	struct ttm_bo_device *bdev = bo->bdev;
> @@ -203,12 +176,10 @@ static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
> 
> 	if (!list_empty(&bo->swap)) {
> 		list_del_init(&bo->swap);
> -		kref_put(&bo->list_kref, ttm_bo_ref_bug);
> 		notify = true;
> 	}
> 	if (!list_empty(&bo->lru)) {
> 		list_del_init(&bo->lru);
> -		kref_put(&bo->list_kref, ttm_bo_ref_bug);
> 		notify = true;
> 	}
> 
> @@ -421,8 +392,7 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
> 	BUG_ON(!dma_resv_trylock(&bo->base._resv));
> 
> 	r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
> -	if (r)
> -		dma_resv_unlock(&bo->base._resv);
> +	dma_resv_unlock(&bo->base._resv);
> 
> 	return r;
> }
> @@ -449,68 +419,10 @@ static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
> 	rcu_read_unlock();
> }
> 
> -static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
> -{
> -	struct ttm_bo_device *bdev = bo->bdev;
> -	int ret;
> -
> -	ret = ttm_bo_individualize_resv(bo);
> -	if (ret) {
> -		/* Last resort, if we fail to allocate memory for the
> -		 * fences block for the BO to become idle
> -		 */
> -		dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
> -						    30 * HZ);
> -		spin_lock(&ttm_bo_glob.lru_lock);
> -		goto error;
> -	}
> -
> -	spin_lock(&ttm_bo_glob.lru_lock);
> -	ret = dma_resv_trylock(bo->base.resv) ? 0 : -EBUSY;
> -	if (!ret) {
> -		if (dma_resv_test_signaled_rcu(&bo->base._resv, true)) {
> -			ttm_bo_del_from_lru(bo);
> -			spin_unlock(&ttm_bo_glob.lru_lock);
> -			if (bo->base.resv != &bo->base._resv)
> -				dma_resv_unlock(&bo->base._resv);
> -
> -			ttm_bo_cleanup_memtype_use(bo);
> -			dma_resv_unlock(bo->base.resv);
> -			return;
> -		}
> -
> -		ttm_bo_flush_all_fences(bo);
> -
> -		/*
> -		 * Make NO_EVICT bos immediately available to
> -		 * shrinkers, now that they are queued for
> -		 * destruction.
> -		 */
> -		if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
> -			bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
> -			ttm_bo_move_to_lru_tail(bo, NULL);
> -		}
> -
> -		dma_resv_unlock(bo->base.resv);
> -	}
> -	if (bo->base.resv != &bo->base._resv) {
> -		ttm_bo_flush_all_fences(bo);
> -		dma_resv_unlock(&bo->base._resv);
> -	}
> -
> -error:
> -	kref_get(&bo->list_kref);
> -	list_add_tail(&bo->ddestroy, &bdev->ddestroy);
> -	spin_unlock(&ttm_bo_glob.lru_lock);
> -
> -	schedule_delayed_work(&bdev->wq,
> -			      ((HZ / 100) < 1) ? 1 : HZ / 100);
> -}
> -
> /**
>  * function ttm_bo_cleanup_refs
> - * If bo idle, remove from delayed- and lru lists, and unref.
> - * If not idle, do nothing.
> + * If bo idle, remove from lru lists, and unref.
> + * If not idle, block if possible.
>  *
>  * Must be called with lru_lock and reservation held, this function
>  * will drop the lru lock and optionally the reservation lock before returning.
> @@ -572,14 +484,14 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
> 
> 	ttm_bo_del_from_lru(bo);
> 	list_del_init(&bo->ddestroy);
> -	kref_put(&bo->list_kref, ttm_bo_ref_bug);
> -
> 	spin_unlock(&ttm_bo_glob.lru_lock);
> 	ttm_bo_cleanup_memtype_use(bo);
> 
> 	if (unlock_resv)
> 		dma_resv_unlock(bo->base.resv);
> 
> +	ttm_bo_put(bo);
> +
> 	return 0;
> }
> 
> @@ -601,8 +513,9 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
> 
> 		bo = list_first_entry(&bdev->ddestroy, struct ttm_buffer_object,
> 				      ddestroy);
> -		kref_get(&bo->list_kref);
> 		list_move_tail(&bo->ddestroy, &removed);
> +		if (!ttm_bo_get_unless_zero(bo))
> +			continue;
> 
> 		if (remove_all || bo->base.resv != &bo->base._resv) {
> 			spin_unlock(&glob->lru_lock);
> @@ -617,7 +530,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
> 			spin_unlock(&glob->lru_lock);
> 		}
> 
> -		kref_put(&bo->list_kref, ttm_bo_release_list);
> +		ttm_bo_put(bo);
> 		spin_lock(&glob->lru_lock);
> 	}
> 	list_splice_tail(&removed, &bdev->ddestroy);
> @@ -643,16 +556,68 @@ static void ttm_bo_release(struct kref *kref)
> 	    container_of(kref, struct ttm_buffer_object, kref);
> 	struct ttm_bo_device *bdev = bo->bdev;
> 	struct ttm_mem_type_manager *man = &bdev->man[bo->mem.mem_type];
> +	size_t acc_size = bo->acc_size;
> +	int ret;
> 
> -	if (bo->bdev->driver->release_notify)
> -		bo->bdev->driver->release_notify(bo);
> +	if (!bo->deleted) {
> +		if (bo->bdev->driver->release_notify)
> +			bo->bdev->driver->release_notify(bo);
> 
> -	drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
> -	ttm_mem_io_lock(man, false);
> -	ttm_mem_io_free_vm(bo);
> -	ttm_mem_io_unlock(man);
> -	ttm_bo_cleanup_refs_or_queue(bo);
> -	kref_put(&bo->list_kref, ttm_bo_release_list);
> +		drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
> +		ttm_mem_io_lock(man, false);
> +		ttm_mem_io_free_vm(bo);
> +		ttm_mem_io_unlock(man);
> +
> +		ret = ttm_bo_individualize_resv(bo);
> +		if (ret) {
> +			/* Last resort, if we fail to allocate memory for the
> +			 * fences block for the BO to become idle
> +			 */
> +			dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
> +						  30 * HZ);
> +		}
> +	}
> +
> +	if (!dma_resv_test_signaled_rcu(bo->base.resv, true)) {
> +		/* The BO is not idle, resurrect it for delayed destroy */
> +		ttm_bo_flush_all_fences(bo);
> +		bo->deleted = true;
> +
> +		/*
> +		 * Make NO_EVICT bos immediately available to
> +		 * shrinkers, now that they are queued for
> +		 * destruction.
> +		 */
> +		if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
> +			bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
> +			ttm_bo_move_to_lru_tail(bo, NULL);

[xh] This should be done under the lru lock.

> +		}
> +
> +		spin_lock(&ttm_bo_glob.lru_lock);
> +		kref_init(&bo->kref);
> +		list_add_tail(&bo->ddestroy, &bdev->ddestroy);
> +		spin_unlock(&ttm_bo_glob.lru_lock);
> +
> +		schedule_delayed_work(&bdev->wq,
> +				      ((HZ / 100) < 1) ? 1 : HZ / 100);
> +		return;
> +	}
> +
> +	spin_lock(&ttm_bo_glob.lru_lock);
> +	ttm_bo_del_from_lru(bo);
> +	list_del(&bo->ddestroy);
> +	spin_unlock(&ttm_bo_glob.lru_lock);
> +
> +	ttm_bo_cleanup_memtype_use(bo);
> +
> +	BUG_ON(bo->mem.mm_node != NULL);
> +	ttm_tt_destroy(bo->ttm);

[xh] ttm_tt_destroy() is already called in ttm_bo_cleanup_memtype_use().


> +	atomic_dec(&ttm_bo_glob.bo_count);
> +	dma_fence_put(bo->moving);
> +	if (!ttm_bo_uses_embedded_gem_object(bo))
> +		dma_resv_fini(&bo->base._resv);
> +	bo->destroy(bo);
> +	ttm_mem_global_free(&ttm_mem_glob, acc_size);
> }
> 
> void ttm_bo_put(struct ttm_buffer_object *bo)
> @@ -755,8 +720,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
> 
> 	if (bo->base.resv == ctx->resv) {
> 		dma_resv_assert_held(bo->base.resv);
> -		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
> -		    || !list_empty(&bo->ddestroy))
> +		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
> 			ret = true;
> 		*locked = false;
> 		if (busy)
> @@ -837,6 +801,11 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
> 					dma_resv_unlock(bo->base.resv);
> 				continue;
> 			}
> +			if (!ttm_bo_get_unless_zero(bo)) {
> +				if (locked)
> +					dma_resv_unlock(bo->base.resv);
> +				continue;
> +			}
> 			break;
> 		}
> 
> @@ -848,21 +817,19 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
> 	}
> 
> 	if (!bo) {
> -		if (busy_bo)
> -			kref_get(&busy_bo->list_kref);
> +		if (busy_bo && !ttm_bo_get_unless_zero(busy_bo))
> +			busy_bo = NULL;
> 		spin_unlock(&ttm_bo_glob.lru_lock);
> 		ret = ttm_mem_evict_wait_busy(busy_bo, ctx, ticket);
> 		if (busy_bo)
> -			kref_put(&busy_bo->list_kref, ttm_bo_release_list);
> +			ttm_bo_put(busy_bo);
> 		return ret;
> 	}
> 
> -	kref_get(&bo->list_kref);
> -
> -	if (!list_empty(&bo->ddestroy)) {
> +	if (bo->deleted) {
> 		ret = ttm_bo_cleanup_refs(bo, ctx->interruptible,
> 					  ctx->no_wait_gpu, locked);
> -		kref_put(&bo->list_kref, ttm_bo_release_list);
> +		ttm_bo_put(bo);
> 		return ret;
> 	}
> 
> @@ -872,7 +839,7 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
> 	if (locked)
> 		ttm_bo_unreserve(bo);
> 
> -	kref_put(&bo->list_kref, ttm_bo_release_list);
> +	ttm_bo_put(bo);
> 	return ret;
> }
> 
> @@ -1284,7 +1251,6 @@ int ttm_bo_init_reserved(struct ttm_bo_device *bdev,
> 	bo->destroy = destroy ? destroy : ttm_bo_default_destroy;
> 
> 	kref_init(&bo->kref);
> -	kref_init(&bo->list_kref);
> 	INIT_LIST_HEAD(&bo->lru);
> 	INIT_LIST_HEAD(&bo->ddestroy);
> 	INIT_LIST_HEAD(&bo->swap);
> @@ -1804,11 +1770,18 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
> 	spin_lock(&glob->lru_lock);
> 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
> 		list_for_each_entry(bo, &glob->swap_lru[i], swap) {
> -			if (ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
> -							   NULL)) {
> -				ret = 0;
> -				break;
> +			if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
> +							    NULL))
> +				continue;
> +
> +			if (!ttm_bo_get_unless_zero(bo)) {
> +				if (locked)
> +					dma_resv_unlock(bo->base.resv);
> +				continue;
> 			}

[xh] You get the bo here, but where do you put it?

You only put one bo just before the return, BUT you get bos inside the for loop. Am I missing something?

thanks
xinhui


> +
> +			ret = 0;
> +			break;
> 		}
> 		if (!ret)
> 			break;
> @@ -1819,11 +1792,9 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
> 		return ret;
> 	}
> 
> -	kref_get(&bo->list_kref);
> -
> -	if (!list_empty(&bo->ddestroy)) {
> +	if (bo->deleted) {
> 		ret = ttm_bo_cleanup_refs(bo, false, false, locked);
> -		kref_put(&bo->list_kref, ttm_bo_release_list);
> +		ttm_bo_put(bo);
> 		return ret;
> 	}
> 
> @@ -1877,7 +1848,7 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
> 	 */
> 	if (locked)
> 		dma_resv_unlock(bo->base.resv);
> -	kref_put(&bo->list_kref, ttm_bo_release_list);
> +	ttm_bo_put(bo);
> 	return ret;
> }
> EXPORT_SYMBOL(ttm_bo_swapout);
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 86d152472f38..c8e359ded1df 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -507,7 +507,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
> 	fbo->base.moving = NULL;
> 	drm_vma_node_reset(&fbo->base.base.vma_node);
> 
> -	kref_init(&fbo->base.list_kref);
> 	kref_init(&fbo->base.kref);
> 	fbo->base.destroy = &ttm_transfered_destroy;
> 	fbo->base.acc_size = 0;
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index 66ca49db9633..b9bc1b00142e 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -135,18 +135,14 @@ struct ttm_tt;
>  * @num_pages: Actual number of pages.
>  * @acc_size: Accounted size for this object.
>  * @kref: Reference count of this buffer object. When this refcount reaches
> - * zero, the object is put on the delayed delete list.
> - * @list_kref: List reference count of this buffer object. This member is
> - * used to avoid destruction while the buffer object is still on a list.
> - * Lru lists may keep one refcount, the delayed delete list, and kref != 0
> - * keeps one refcount. When this refcount reaches zero,
> - * the object is destroyed.
> + * zero, the object is destroyed or put on the delayed delete list.
>  * @mem: structure describing current placement.
>  * @persistent_swap_storage: Usually the swap storage is deleted for buffers
>  * pinned in physical memory. If this behaviour is not desired, this member
>  * holds a pointer to a persistent shmem object.
>  * @ttm: TTM structure holding system pages.
>  * @evicted: Whether the object was evicted without user-space knowing.
> + * @deleted: True if the object is only a zombie and already deleted.
>  * @lru: List head for the lru list.
>  * @ddestroy: List head for the delayed destroy list.
>  * @swap: List head for swap LRU list.
> @@ -183,9 +179,7 @@ struct ttm_buffer_object {
> 	/**
> 	* Members not needing protection.
> 	*/
> -
> 	struct kref kref;
> -	struct kref list_kref;
> 
> 	/**
> 	 * Members protected by the bo::resv::reserved lock.
> @@ -195,6 +189,7 @@ struct ttm_buffer_object {
> 	struct file *persistent_swap_storage;
> 	struct ttm_tt *ttm;
> 	bool evicted;
> +	bool deleted;
> 
> 	/**
> 	 * Members protected by the bdev::lru_lock.
> -- 
> 2.17.1
> 


* Re: [PATCH 4/6] drm/ttm: rework BO delayed delete.
  2020-02-11  5:26   ` Pan, Xinhui
@ 2020-02-11  5:43     ` Pan, Xinhui
  2020-02-11 13:41       ` Christian König
  0 siblings, 1 reply; 17+ messages in thread
From: Pan, Xinhui @ 2020-02-11  5:43 UTC (permalink / raw)
  To: Pan, Xinhui; +Cc: Christian König, Pan, Xinhui, dri-devel, amd-gfx


Comments inline.

> On 2020-02-11 13:26, Pan, Xinhui <Xinhui.Pan@amd.com> wrote:
> 
> Comments inline, marked with [xh].
> 
> 
>> On 2020-02-10 23:09, Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
>> 
>> This patch reworks the whole delayed deletion of BOs which aren't idle.
>> 
>> Instead of keeping two reference counters in the BO structure, we
>> resurrect the BO (re-initialize its single kref) when we find that a
>> deleted BO is not idle yet.
>> 
>> This has many advantages, especially that we don't need to
>> increment/decrement the BO's reference counter any more when it
>> moves on the LRUs.
>> 
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>> drivers/gpu/drm/ttm/ttm_bo.c      | 217 +++++++++++++-----------------
>> drivers/gpu/drm/ttm/ttm_bo_util.c |   1 -
>> include/drm/ttm/ttm_bo_api.h      |  11 +-
>> 3 files changed, 97 insertions(+), 132 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index e12fc2c2d165..d0624685f5d2 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -145,26 +145,6 @@ static inline uint32_t ttm_bo_type_flags(unsigned type)
>> 	return 1 << (type);
>> }
>> 
>> -static void ttm_bo_release_list(struct kref *list_kref)
>> -{
>> -	struct ttm_buffer_object *bo =
>> -	    container_of(list_kref, struct ttm_buffer_object, list_kref);
>> -	size_t acc_size = bo->acc_size;
>> -
>> -	BUG_ON(kref_read(&bo->list_kref));
>> -	BUG_ON(kref_read(&bo->kref));
>> -	BUG_ON(bo->mem.mm_node != NULL);
>> -	BUG_ON(!list_empty(&bo->lru));
>> -	BUG_ON(!list_empty(&bo->ddestroy));
>> -	ttm_tt_destroy(bo->ttm);
>> -	atomic_dec(&ttm_bo_glob.bo_count);
>> -	dma_fence_put(bo->moving);
>> -	if (!ttm_bo_uses_embedded_gem_object(bo))
>> -		dma_resv_fini(&bo->base._resv);
>> -	bo->destroy(bo);
>> -	ttm_mem_global_free(&ttm_mem_glob, acc_size);
>> -}
>> -
>> static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
>> 				  struct ttm_mem_reg *mem)
>> {
>> @@ -181,21 +161,14 @@ static void ttm_bo_add_mem_to_lru(struct ttm_buffer_object *bo,
>> 
>> 	man = &bdev->man[mem->mem_type];
>> 	list_add_tail(&bo->lru, &man->lru[bo->priority]);
>> -	kref_get(&bo->list_kref);
>> 
>> 	if (!(man->flags & TTM_MEMTYPE_FLAG_FIXED) && bo->ttm &&
>> 	    !(bo->ttm->page_flags & (TTM_PAGE_FLAG_SG |
>> 				     TTM_PAGE_FLAG_SWAPPED))) {
>> 		list_add_tail(&bo->swap, &ttm_bo_glob.swap_lru[bo->priority]);
>> -		kref_get(&bo->list_kref);
>> 	}
>> }
>> 
>> -static void ttm_bo_ref_bug(struct kref *list_kref)
>> -{
>> -	BUG();
>> -}
>> -
>> static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
>> {
>> 	struct ttm_bo_device *bdev = bo->bdev;
>> @@ -203,12 +176,10 @@ static void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
>> 
>> 	if (!list_empty(&bo->swap)) {
>> 		list_del_init(&bo->swap);
>> -		kref_put(&bo->list_kref, ttm_bo_ref_bug);
>> 		notify = true;
>> 	}
>> 	if (!list_empty(&bo->lru)) {
>> 		list_del_init(&bo->lru);
>> -		kref_put(&bo->list_kref, ttm_bo_ref_bug);
>> 		notify = true;
>> 	}
>> 
>> @@ -421,8 +392,7 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
>> 	BUG_ON(!dma_resv_trylock(&bo->base._resv));
>> 
>> 	r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>> -	if (r)
>> -		dma_resv_unlock(&bo->base._resv);
>> +	dma_resv_unlock(&bo->base._resv);
>> 
>> 	return r;
>> }
>> @@ -449,68 +419,10 @@ static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
>> 	rcu_read_unlock();
>> }
>> 
>> -static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
>> -{
>> -	struct ttm_bo_device *bdev = bo->bdev;
>> -	int ret;
>> -
>> -	ret = ttm_bo_individualize_resv(bo);
>> -	if (ret) {
>> -		/* Last resort, if we fail to allocate memory for the
>> -		 * fences block for the BO to become idle
>> -		 */
>> -		dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
>> -						    30 * HZ);
>> -		spin_lock(&ttm_bo_glob.lru_lock);
>> -		goto error;
>> -	}
>> -
>> -	spin_lock(&ttm_bo_glob.lru_lock);
>> -	ret = dma_resv_trylock(bo->base.resv) ? 0 : -EBUSY;
>> -	if (!ret) {
>> -		if (dma_resv_test_signaled_rcu(&bo->base._resv, true)) {
>> -			ttm_bo_del_from_lru(bo);
>> -			spin_unlock(&ttm_bo_glob.lru_lock);
>> -			if (bo->base.resv != &bo->base._resv)
>> -				dma_resv_unlock(&bo->base._resv);
>> -
>> -			ttm_bo_cleanup_memtype_use(bo);
>> -			dma_resv_unlock(bo->base.resv);
>> -			return;
>> -		}
>> -
>> -		ttm_bo_flush_all_fences(bo);
>> -
>> -		/*
>> -		 * Make NO_EVICT bos immediately available to
>> -		 * shrinkers, now that they are queued for
>> -		 * destruction.
>> -		 */
>> -		if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
>> -			bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
>> -			ttm_bo_move_to_lru_tail(bo, NULL);
>> -		}
>> -
>> -		dma_resv_unlock(bo->base.resv);
>> -	}
>> -	if (bo->base.resv != &bo->base._resv) {
>> -		ttm_bo_flush_all_fences(bo);
>> -		dma_resv_unlock(&bo->base._resv);
>> -	}
>> -
>> -error:
>> -	kref_get(&bo->list_kref);
>> -	list_add_tail(&bo->ddestroy, &bdev->ddestroy);
>> -	spin_unlock(&ttm_bo_glob.lru_lock);
>> -
>> -	schedule_delayed_work(&bdev->wq,
>> -			      ((HZ / 100) < 1) ? 1 : HZ / 100);
>> -}
>> -
>> /**
>> * function ttm_bo_cleanup_refs
>> - * If bo idle, remove from delayed- and lru lists, and unref.
>> - * If not idle, do nothing.
>> + * If bo idle, remove from lru lists, and unref.
>> + * If not idle, block if possible.
>> *
>> * Must be called with lru_lock and reservation held, this function
>> * will drop the lru lock and optionally the reservation lock before returning.
>> @@ -572,14 +484,14 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>> 
>> 	ttm_bo_del_from_lru(bo);
>> 	list_del_init(&bo->ddestroy);
>> -	kref_put(&bo->list_kref, ttm_bo_ref_bug);
>> -
>> 	spin_unlock(&ttm_bo_glob.lru_lock);
>> 	ttm_bo_cleanup_memtype_use(bo);
>> 
>> 	if (unlock_resv)
>> 		dma_resv_unlock(bo->base.resv);
>> 
>> +	ttm_bo_put(bo);
>> +
>> 	return 0;
>> }
>> 
>> @@ -601,8 +513,9 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>> 
>> 		bo = list_first_entry(&bdev->ddestroy, struct ttm_buffer_object,
>> 				      ddestroy);
>> -		kref_get(&bo->list_kref);
>> 		list_move_tail(&bo->ddestroy, &removed);
>> +		if (!ttm_bo_get_unless_zero(bo))
>> +			continue;
>> 
>> 		if (remove_all || bo->base.resv != &bo->base._resv) {
>> 			spin_unlock(&glob->lru_lock);
>> @@ -617,7 +530,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>> 			spin_unlock(&glob->lru_lock);
>> 		}
>> 
>> -		kref_put(&bo->list_kref, ttm_bo_release_list);
>> +		ttm_bo_put(bo);
>> 		spin_lock(&glob->lru_lock);
>> 	}
>> 	list_splice_tail(&removed, &bdev->ddestroy);
>> @@ -643,16 +556,68 @@ static void ttm_bo_release(struct kref *kref)
>> 	    container_of(kref, struct ttm_buffer_object, kref);
>> 	struct ttm_bo_device *bdev = bo->bdev;
>> 	struct ttm_mem_type_manager *man = &bdev->man[bo->mem.mem_type];
>> +	size_t acc_size = bo->acc_size;
>> +	int ret;
>> 
>> -	if (bo->bdev->driver->release_notify)
>> -		bo->bdev->driver->release_notify(bo);
>> +	if (!bo->deleted) {
>> +		if (bo->bdev->driver->release_notify)
>> +			bo->bdev->driver->release_notify(bo);
>> 
>> -	drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
>> -	ttm_mem_io_lock(man, false);
>> -	ttm_mem_io_free_vm(bo);
>> -	ttm_mem_io_unlock(man);
>> -	ttm_bo_cleanup_refs_or_queue(bo);
>> -	kref_put(&bo->list_kref, ttm_bo_release_list);
>> +		drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
>> +		ttm_mem_io_lock(man, false);
>> +		ttm_mem_io_free_vm(bo);
>> +		ttm_mem_io_unlock(man);
>> +
>> +		ret = ttm_bo_individualize_resv(bo);
>> +		if (ret) {
>> +			/* Last resort, if we fail to allocate memory for the
>> +			 * fences block for the BO to become idle
>> +			 */
>> +			dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
>> +						  30 * HZ);
>> +		}
>> +	}
>> +
>> +	if (!dma_resv_test_signaled_rcu(bo->base.resv, true)) {
>> +		/* The BO is not idle, resurrect it for delayed destroy */
>> +		ttm_bo_flush_all_fences(bo);
>> +		bo->deleted = true;
>> +
>> +		/*
>> +		 * Make NO_EVICT bos immediately available to
>> +		 * shrinkers, now that they are queued for
>> +		 * destruction.
>> +		 */
>> +		if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
>> +			bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
>> +			ttm_bo_move_to_lru_tail(bo, NULL);
> 
> [xh] This should be done under the lru lock.
> 
>> +		}
>> +
>> +		spin_lock(&ttm_bo_glob.lru_lock);
>> +		kref_init(&bo->kref);
>> +		list_add_tail(&bo->ddestroy, &bdev->ddestroy);
>> +		spin_unlock(&ttm_bo_glob.lru_lock);
>> +
>> +		schedule_delayed_work(&bdev->wq,
>> +				      ((HZ / 100) < 1) ? 1 : HZ / 100);
>> +		return;
>> +	}
>> +
>> +	spin_lock(&ttm_bo_glob.lru_lock);
>> +	ttm_bo_del_from_lru(bo);
>> +	list_del(&bo->ddestroy);
>> +	spin_unlock(&ttm_bo_glob.lru_lock);
>> +
>> +	ttm_bo_cleanup_memtype_use(bo);
>> +
>> +	BUG_ON(bo->mem.mm_node != NULL);
>> +	ttm_tt_destroy(bo->ttm);
> 
> [xh] ttm_tt_destroy() is already called in ttm_bo_cleanup_memtype_use().
> 
> 
>> +	atomic_dec(&ttm_bo_glob.bo_count);
>> +	dma_fence_put(bo->moving);
>> +	if (!ttm_bo_uses_embedded_gem_object(bo))
>> +		dma_resv_fini(&bo->base._resv);
>> +	bo->destroy(bo);
>> +	ttm_mem_global_free(&ttm_mem_glob, acc_size);
>> }
>> 
>> void ttm_bo_put(struct ttm_buffer_object *bo)
>> @@ -755,8 +720,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
>> 
>> 	if (bo->base.resv == ctx->resv) {
>> 		dma_resv_assert_held(bo->base.resv);
>> -		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
>> -		    || !list_empty(&bo->ddestroy))
>> +		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
>> 			ret = true;
>> 		*locked = false;
>> 		if (busy)
>> @@ -837,6 +801,11 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>> 					dma_resv_unlock(bo->base.resv);
>> 				continue;
>> 			}
>> +			if (!ttm_bo_get_unless_zero(bo)) {
>> +				if (locked)
>> +					dma_resv_unlock(bo->base.resv);
>> +				continue;
>> +			}
>> 			break;
>> 		}
>> 
>> @@ -848,21 +817,19 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>> 	}
>> 
>> 	if (!bo) {
>> -		if (busy_bo)
>> -			kref_get(&busy_bo->list_kref);
>> +		if (busy_bo && !ttm_bo_get_unless_zero(busy_bo))
>> +			busy_bo = NULL;
>> 		spin_unlock(&ttm_bo_glob.lru_lock);
>> 		ret = ttm_mem_evict_wait_busy(busy_bo, ctx, ticket);
>> 		if (busy_bo)
>> -			kref_put(&busy_bo->list_kref, ttm_bo_release_list);
>> +			ttm_bo_put(busy_bo);
>> 		return ret;
>> 	}
>> 
>> -	kref_get(&bo->list_kref);
>> -
>> -	if (!list_empty(&bo->ddestroy)) {
>> +	if (bo->deleted) {
>> 		ret = ttm_bo_cleanup_refs(bo, ctx->interruptible,
>> 					  ctx->no_wait_gpu, locked);
>> -		kref_put(&bo->list_kref, ttm_bo_release_list);
>> +		ttm_bo_put(bo);
>> 		return ret;
>> 	}
>> 
>> @@ -872,7 +839,7 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>> 	if (locked)
>> 		ttm_bo_unreserve(bo);
>> 
>> -	kref_put(&bo->list_kref, ttm_bo_release_list);
>> +	ttm_bo_put(bo);
>> 	return ret;
>> }
>> 
>> @@ -1284,7 +1251,6 @@ int ttm_bo_init_reserved(struct ttm_bo_device *bdev,
>> 	bo->destroy = destroy ? destroy : ttm_bo_default_destroy;
>> 
>> 	kref_init(&bo->kref);
>> -	kref_init(&bo->list_kref);
>> 	INIT_LIST_HEAD(&bo->lru);
>> 	INIT_LIST_HEAD(&bo->ddestroy);
>> 	INIT_LIST_HEAD(&bo->swap);
>> @@ -1804,11 +1770,18 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
>> 	spin_lock(&glob->lru_lock);
>> 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
>> 		list_for_each_entry(bo, &glob->swap_lru[i], swap) {
>> -			if (ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
>> -							   NULL)) {
>> -				ret = 0;
>> -				break;
>> +			if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
>> +							    NULL))
>> +				continue;
>> +
>> +			if (!ttm_bo_get_unless_zero(bo)) {
>> +				if (locked)
>> +					dma_resv_unlock(bo->base.resv);
>> +				continue;
>> 			}
> 
> [xh] You get the bo here, but where do you put it?
> 
> You only put one bo just before the return, BUT you get bos inside the for loop. Am I missing something?
> 
> thanks
> xinhui
> 
> 
[xh] Never mind.
I missed the break;

thanks
xinhui

>> +
>> +			ret = 0;
>> +			break;
>> 		}
>> 		if (!ret)
>> 			break;
>> @@ -1819,11 +1792,9 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
>> 		return ret;
>> 	}
>> 
>> -	kref_get(&bo->list_kref);
>> -
>> -	if (!list_empty(&bo->ddestroy)) {
>> +	if (bo->deleted) {
>> 		ret = ttm_bo_cleanup_refs(bo, false, false, locked);
>> -		kref_put(&bo->list_kref, ttm_bo_release_list);
>> +		ttm_bo_put(bo);
>> 		return ret;
>> 	}
>> 
>> @@ -1877,7 +1848,7 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
>> 	 */
>> 	if (locked)
>> 		dma_resv_unlock(bo->base.resv);
>> -	kref_put(&bo->list_kref, ttm_bo_release_list);
>> +	ttm_bo_put(bo);
>> 	return ret;
>> }
>> EXPORT_SYMBOL(ttm_bo_swapout);
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index 86d152472f38..c8e359ded1df 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -507,7 +507,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>> 	fbo->base.moving = NULL;
>> 	drm_vma_node_reset(&fbo->base.base.vma_node);
>> 
>> -	kref_init(&fbo->base.list_kref);
>> 	kref_init(&fbo->base.kref);
>> 	fbo->base.destroy = &ttm_transfered_destroy;
>> 	fbo->base.acc_size = 0;
>> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
>> index 66ca49db9633..b9bc1b00142e 100644
>> --- a/include/drm/ttm/ttm_bo_api.h
>> +++ b/include/drm/ttm/ttm_bo_api.h
>> @@ -135,18 +135,14 @@ struct ttm_tt;
>> * @num_pages: Actual number of pages.
>> * @acc_size: Accounted size for this object.
>> * @kref: Reference count of this buffer object. When this refcount reaches
>> - * zero, the object is put on the delayed delete list.
>> - * @list_kref: List reference count of this buffer object. This member is
>> - * used to avoid destruction while the buffer object is still on a list.
>> - * Lru lists may keep one refcount, the delayed delete list, and kref != 0
>> - * keeps one refcount. When this refcount reaches zero,
>> - * the object is destroyed.
>> + * zero, the object is destroyed or put on the delayed delete list.
>> * @mem: structure describing current placement.
>> * @persistent_swap_storage: Usually the swap storage is deleted for buffers
>> * pinned in physical memory. If this behaviour is not desired, this member
>> * holds a pointer to a persistent shmem object.
>> * @ttm: TTM structure holding system pages.
>> * @evicted: Whether the object was evicted without user-space knowing.
>> + * @deleted: True if the object is only a zombie and already deleted.
>> * @lru: List head for the lru list.
>> * @ddestroy: List head for the delayed destroy list.
>> * @swap: List head for swap LRU list.
>> @@ -183,9 +179,7 @@ struct ttm_buffer_object {
>> 	/**
>> 	* Members not needing protection.
>> 	*/
>> -
>> 	struct kref kref;
>> -	struct kref list_kref;
>> 
>> 	/**
>> 	 * Members protected by the bo::resv::reserved lock.
>> @@ -195,6 +189,7 @@ struct ttm_buffer_object {
>> 	struct file *persistent_swap_storage;
>> 	struct ttm_tt *ttm;
>> 	bool evicted;
>> +	bool deleted;
>> 
>> 	/**
>> 	 * Members protected by the bdev::lru_lock.
>> -- 
>> 2.17.1
>> 
> 


* Re: Cleanup TTMs delayed delete handling
  2020-02-10 15:09 Cleanup TTMs delayed delete handling Christian König
                   ` (5 preceding siblings ...)
  2020-02-10 15:09 ` [PATCH 6/6] drm/ttm: individualize resv objects before calling release_notify Christian König
@ 2020-02-11 11:06 ` Pan, Xinhui
  2020-02-11 11:13   ` Daniel Vetter
  6 siblings, 1 reply; 17+ messages in thread
From: Pan, Xinhui @ 2020-02-11 11:06 UTC (permalink / raw)
  To: Christian König, amd-gfx, dri-devel


[AMD Official Use Only - Internal Distribution Only]

For patch 1/2/3/5/6
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
________________________________
From: Christian König <ckoenig.leichtzumerken@gmail.com>
Sent: Monday, February 10, 2020 11:09:01 PM
To: Pan, Xinhui <Xinhui.Pan@amd.com>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>
Subject: Cleanup TTMs delayed delete handling

This series of patches cleans up TTMs delayed delete handling.

The core of the new handling is that we now only have a single reference counter instead of two and use kref_get_unless_zero() to grab BOs from the LRU during eviction.

This reduces the overhead of LRU moves and allows us to properly individualize the BO's reservation object during deletion to allow adding BOs for clearing memory, unmapping page tables etc.

Please review and comment,
Christian.




* Re: Cleanup TTMs delayed delete handling
  2020-02-11 11:06 ` Cleanup TTMs delayed delete handling Pan, Xinhui
@ 2020-02-11 11:13   ` Daniel Vetter
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Vetter @ 2020-02-11 11:13 UTC (permalink / raw)
  To: Pan, Xinhui; +Cc: Christian König, dri-devel, amd-gfx

On Tue, Feb 11, 2020 at 11:06:53AM +0000, Pan, Xinhui wrote:
> [AMD Official Use Only - Internal Distribution Only]

Uh might want to fix your email setup.
-Daniel

> 
> For patch 1/2/3/5/6
> Reviewed-by: xinhui pan <xinhui.pan@amd.com>
> ________________________________
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Monday, February 10, 2020 11:09:01 PM
> To: Pan, Xinhui <Xinhui.Pan@amd.com>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>
> Subject: Cleanup TTMs delayed delete handling
> 
> This series of patches cleans up TTMs delayed delete handling.
> 
> The core of the new handling is that we now only have a single reference counter instead of two and use kref_get_unless_zero() to grab BOs from the LRU during eviction.
> 
> This reduces the overhead of LRU moves and allows us to properly individualize the BO's reservation object during deletion to allow adding BOs for clearing memory, unmapping page tables etc.
> 
> Please review and comment,
> Christian.
> 
> 



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 4/6] drm/ttm: rework BO delayed delete.
  2020-02-11  5:43     ` Pan, Xinhui
@ 2020-02-11 13:41       ` Christian König
  0 siblings, 0 replies; 17+ messages in thread
From: Christian König @ 2020-02-11 13:41 UTC (permalink / raw)
  To: Pan, Xinhui; +Cc: dri-devel, amd-gfx

[SNIP]
>>> +		/*
>>> +		 * Make NO_EVICT bos immediately available to
>>> +		 * shrinkers, now that they are queued for
>>> +		 * destruction.
>>> +		 */
>>> +		if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
>>> +			bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
>>> +			ttm_bo_move_to_lru_tail(bo, NULL);
>> [xh] this should be done under the lru lock.

Ah, yes good point.
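
Something like this should do it (untested sketch; the lru_lock usage
matches the rest of the series):

	if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT) {
		bo->mem.placement &= ~TTM_PL_FLAG_NO_EVICT;
		/* the LRU lists may only be changed under the lru_lock */
		spin_lock(&ttm_bo_glob.lru_lock);
		ttm_bo_move_to_lru_tail(bo, NULL);
		spin_unlock(&ttm_bo_glob.lru_lock);
	}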

>>
>>> +	BUG_ON(bo->mem.mm_node != NULL);
>>> +	ttm_tt_destroy(bo->ttm);
>> [xh] we already destroy it in ttm_bo_cleanup_memtype_use.

Fixed as well.

Going to send that out with those two fixed in a minute.

Thanks,
Christian.

* Re: [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2
  2020-02-10 15:09 ` [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2 Christian König
@ 2020-02-11 14:14   ` Daniel Vetter
  2020-02-11 15:02     ` Pan, Xinhui
  2020-02-11 15:15     ` Christian König
  0 siblings, 2 replies; 17+ messages in thread
From: Daniel Vetter @ 2020-02-11 14:14 UTC (permalink / raw)
  To: Christian König; +Cc: Xinhui.Pan, dri-devel, amd-gfx

On Mon, Feb 10, 2020 at 04:09:06PM +0100, Christian König wrote:
> When non-imported BOs are resurrected for delayed delete we replace
> the dma_resv object to allow for easy reclaiming of the resources.
> 
> v2: move that to ttm_bo_individualize_resv
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index d0624685f5d2..4d161038de98 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -393,6 +393,14 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
>  
>  	r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>  	dma_resv_unlock(&bo->base._resv);
> +	if (r)
> +		return r;
> +
> +	if (bo->type != ttm_bo_type_sg) {
> +		spin_lock(&ttm_bo_glob.lru_lock);
> +		bo->base.resv = &bo->base._resv;

Having the dma_resv pointer be protected by the lru_lock for ttm internal
stuff, but invariant everywhere else is really confusing. Not sure that's
a great idea, I've just chased some ttm code around freaking out about
that.
-Daniel

> +		spin_unlock(&ttm_bo_glob.lru_lock);
> +	}
>  
>  	return r;
>  }
> @@ -720,7 +728,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
>  
>  	if (bo->base.resv == ctx->resv) {
>  		dma_resv_assert_held(bo->base.resv);
> -		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
> +		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT)
>  			ret = true;
>  		*locked = false;
>  		if (busy)
> -- 
> 2.17.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2
  2020-02-11 14:14   ` Daniel Vetter
@ 2020-02-11 15:02     ` Pan, Xinhui
  2020-02-11 15:23       ` Christian König
  2020-02-11 15:15     ` Christian König
  1 sibling, 1 reply; 17+ messages in thread
From: Pan, Xinhui @ 2020-02-11 15:02 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Christian König, Pan, Xinhui, dri-devel, amd-gfx



> On Feb 11, 2020, at 22:14, Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> On Mon, Feb 10, 2020 at 04:09:06PM +0100, Christian König wrote:
>> When non-imported BOs are resurrected for delayed delete we replace
>> the dma_resv object to allow for easy reclaiming of the resources.
>> 
>> v2: move that to ttm_bo_individualize_resv
>> 
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>> drivers/gpu/drm/ttm/ttm_bo.c | 10 +++++++++-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index d0624685f5d2..4d161038de98 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -393,6 +393,14 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
>> 
>> 	r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>> 	dma_resv_unlock(&bo->base._resv);
>> +	if (r)
>> +		return r;
>> +
>> +	if (bo->type != ttm_bo_type_sg) {
>> +		spin_lock(&ttm_bo_glob.lru_lock);
>> +		bo->base.resv = &bo->base._resv;
> 
> Having the dma_resv pointer be protected by the lru_lock for ttm internal
> stuff, but invariant everywhere else is really confusing. Not sure that's

I think this is a reader vs. writer problem.
To prevent any internal function from using the old resv, using an existing spin lock is acceptable.
Maybe RCU would be better? But that would need a lot of effort.
Anyway, ttm sucks. We HAVE done a lot of work on it to make it run better on modern systems.
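
Roughly the reader/writer split I have in mind (an untested sketch, not
the actual patch; it reuses the lru_lock exactly as in this series):

	/* writer side, only reached once the refcount dropped to zero */
	spin_lock(&ttm_bo_glob.lru_lock);
	bo->base.resv = &bo->base._resv;
	spin_unlock(&ttm_bo_glob.lru_lock);

	/* LRU-walking reader: the pointer can only change under the
	 * same lock, so it is stable for as long as we hold it */
	spin_lock(&ttm_bo_glob.lru_lock);
	locked = dma_resv_trylock(bo->base.resv);
	spin_unlock(&ttm_bo_glob.lru_lock);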


> a great idea, I've just chased some ttm code around freaking out about
> that.
> -Daniel
> 
>> +		spin_unlock(&ttm_bo_glob.lru_lock);
>> +	}
>> 
>> 	return r;
>> }
>> @@ -720,7 +728,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
>> 
>> 	if (bo->base.resv == ctx->resv) {
>> 		dma_resv_assert_held(bo->base.resv);
>> -		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
>> +		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT)
>> 			ret = true;
>> 		*locked = false;
>> 		if (busy)
>> -- 
>> 2.17.1
>> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


* Re: [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2
  2020-02-11 14:14   ` Daniel Vetter
  2020-02-11 15:02     ` Pan, Xinhui
@ 2020-02-11 15:15     ` Christian König
  1 sibling, 0 replies; 17+ messages in thread
From: Christian König @ 2020-02-11 15:15 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Xinhui.Pan, dri-devel, amd-gfx

On 11.02.20 15:14, Daniel Vetter wrote:
> On Mon, Feb 10, 2020 at 04:09:06PM +0100, Christian König wrote:
>> When non-imported BOs are resurrected for delayed delete we replace
>> the dma_resv object to allow for easy reclaiming of the resources.
>>
>> v2: move that to ttm_bo_individualize_resv
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index d0624685f5d2..4d161038de98 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -393,6 +393,14 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
>>   
>>   	r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>>   	dma_resv_unlock(&bo->base._resv);
>> +	if (r)
>> +		return r;
>> +
>> +	if (bo->type != ttm_bo_type_sg) {
>> +		spin_lock(&ttm_bo_glob.lru_lock);
>> +		bo->base.resv = &bo->base._resv;
> Having the dma_resv pointer be protected by the lru_lock for ttm internal
> stuff, but invariant everywhere else is really confusing. Not sure that's
> a great idea, I've just chased some ttm code around freaking out about
> that.

The key point is that the reference counter is zero at this moment.

Taking the LRU spinlock was just a precaution and might actually not
even be necessary here.

I will double check,
Christian.

> -Daniel
>
>> +		spin_unlock(&ttm_bo_glob.lru_lock);
>> +	}
>>   
>>   	return r;
>>   }
>> @@ -720,7 +728,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
>>   
>>   	if (bo->base.resv == ctx->resv) {
>>   		dma_resv_assert_held(bo->base.resv);
>> -		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
>> +		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT)
>>   			ret = true;
>>   		*locked = false;
>>   		if (busy)
>> -- 
>> 2.17.1
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2
  2020-02-11 15:02     ` Pan, Xinhui
@ 2020-02-11 15:23       ` Christian König
  2020-02-11 15:36         ` Daniel Vetter
  0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2020-02-11 15:23 UTC (permalink / raw)
  To: Pan, Xinhui, Daniel Vetter; +Cc: dri-devel, amd-gfx

On 11.02.20 16:02, Pan, Xinhui wrote:
>
>> On Feb 11, 2020, at 22:14, Daniel Vetter <daniel@ffwll.ch> wrote:
>>
>> On Mon, Feb 10, 2020 at 04:09:06PM +0100, Christian König wrote:
>>> When non-imported BOs are resurrected for delayed delete we replace
>>> the dma_resv object to allow for easy reclaiming of the resources.
>>>
>>> v2: move that to ttm_bo_individualize_resv
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>> drivers/gpu/drm/ttm/ttm_bo.c | 10 +++++++++-
>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index d0624685f5d2..4d161038de98 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -393,6 +393,14 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
>>>
>>> 	r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
>>> 	dma_resv_unlock(&bo->base._resv);
>>> +	if (r)
>>> +		return r;
>>> +
>>> +	if (bo->type != ttm_bo_type_sg) {
>>> +		spin_lock(&ttm_bo_glob.lru_lock);
>>> +		bo->base.resv = &bo->base._resv;
>> Having the dma_resv pointer be protected by the lru_lock for ttm internal
>> stuff, but invariant everywhere else is really confusing. Not sure that's
> I think this is a reader vs. writer problem.
> To prevent any internal function from using the old resv, using an existing spin lock is acceptable.
> Maybe RCU would be better? But that would need a lot of effort.
> Anyway, ttm sucks. We HAVE done a lot of work on it to make it run better on modern systems.

Yeah, that pretty much summarizes my recent presentation about TTM :)

I double-checked that, and the only reason we have the lock is that in
ttm_mem_evict_first() we trylock first and then grab a reference.

So we should probably rework that code as well and then we can also drop 
that lock here, but that should come later.
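
For reference, the pattern there looks roughly like this (heavily
simplified; the real code also deals with the busy and no-wait cases):

	spin_lock(&ttm_bo_glob.lru_lock);
	list_for_each_entry(bo, &man->lru[i], lru) {
		/* trylock first so we never block while walking the LRU */
		if (!dma_resv_trylock(bo->base.resv))
			continue;

		/* then try to resurrect the BO; this fails for zombies
		 * whose refcount already dropped to zero */
		if (!kref_get_unless_zero(&bo->kref)) {
			dma_resv_unlock(bo->base.resv);
			continue;
		}
		break;
	}
	spin_unlock(&ttm_bo_glob.lru_lock);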

Christian.

>
>
>> a great idea, I've just chased some ttm code around freaking out about
>> that.
>> -Daniel
>>
>>> +		spin_unlock(&ttm_bo_glob.lru_lock);
>>> +	}
>>>
>>> 	return r;
>>> }
>>> @@ -720,7 +728,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
>>>
>>> 	if (bo->base.resv == ctx->resv) {
>>> 		dma_resv_assert_held(bo->base.resv);
>>> -		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
>>> +		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT)
>>> 			ret = true;
>>> 		*locked = false;
>>> 		if (busy)
>>> -- 
>>> 2.17.1
>>>
>> -- 
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch


* Re: [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2
  2020-02-11 15:23       ` Christian König
@ 2020-02-11 15:36         ` Daniel Vetter
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Vetter @ 2020-02-11 15:36 UTC (permalink / raw)
  To: Christian König; +Cc: Pan, Xinhui, dri-devel, amd-gfx

On Tue, Feb 11, 2020 at 4:23 PM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> On 11.02.20 16:02, Pan, Xinhui wrote:
> >
> >> On Feb 11, 2020, at 22:14, Daniel Vetter <daniel@ffwll.ch> wrote:
> >>
> >> On Mon, Feb 10, 2020 at 04:09:06PM +0100, Christian König wrote:
> >>> When non-imported BOs are resurrected for delayed delete we replace
> >>> the dma_resv object to allow for easy reclaiming of the resources.
> >>>
> >>> v2: move that to ttm_bo_individualize_resv
> >>>
> >>> Signed-off-by: Christian König <christian.koenig@amd.com>
> >>> ---
> >>> drivers/gpu/drm/ttm/ttm_bo.c | 10 +++++++++-
> >>> 1 file changed, 9 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> >>> index d0624685f5d2..4d161038de98 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> >>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> >>> @@ -393,6 +393,14 @@ static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
> >>>
> >>>     r = dma_resv_copy_fences(&bo->base._resv, bo->base.resv);
> >>>     dma_resv_unlock(&bo->base._resv);
> >>> +   if (r)
> >>> +           return r;
> >>> +
> >>> +   if (bo->type != ttm_bo_type_sg) {
> >>> +           spin_lock(&ttm_bo_glob.lru_lock);
> >>> +           bo->base.resv = &bo->base._resv;
> >> Having the dma_resv pointer be protected by the lru_lock for ttm internal
> >> stuff, but invariant everywhere else is really confusing. Not sure that's
> > I think this is a reader vs. writer problem.
> > To prevent any internal function from using the old resv, using an existing spin lock is acceptable.
> > Maybe RCU would be better? But that would need a lot of effort.
> > Anyway, ttm sucks. We HAVE done a lot of work on it to make it run better on modern systems.
>
> Yeah, that pretty much summarizes my recent presentation about TTM :)
>
> I double-checked that, and the only reason we have the lock is that in
> ttm_mem_evict_first() we trylock first and then grab a reference.

Well, even with the refcount stuff going on, I think you're missing a
pile of barriers if you drop the spinlock. Refcounts are otherwise
unordered atomics (and the kref_get_unless_zero trick always needs
something else that guarantees there's at least a weak reference, i.e.
something which somehow causes a full synchronization with the
release/free code for that thing). So you would minimally need RCU, or,
like you do here, the LRU list spinlock.
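
In other words, something roughly like this (illustrative sketch only,
with the free path reduced to a bare kfree()):

	/* release side: the final unref takes the BO off the LRU under
	 * the lru_lock before the memory goes away */
	spin_lock(&ttm_bo_glob.lru_lock);
	list_del_init(&bo->lru);
	spin_unlock(&ttm_bo_glob.lru_lock);
	kfree(bo);

	/* lookup side: holding the lru_lock guarantees a BO still on
	 * the list hasn't been freed yet, which is exactly the weak
	 * reference kref_get_unless_zero() needs */
	spin_lock(&ttm_bo_glob.lru_lock);
	if (kref_get_unless_zero(&bo->kref)) {
		/* we now hold a full reference */
	}
	spin_unlock(&ttm_bo_glob.lru_lock);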

> So we should probably rework that code as well and then we can also drop
> that lock here, but that should come later.

_If_ the trylock in evict is the only one, then I agree this should be
safe. But it definitely needs a comment explaining what exactly is
going on. Imo at least :-)

Cheers, Daniel
>
> Christian.
>
> >
> >
> >> a great idea, I've just chased some ttm code around freaking out about
> >> that.
> >> -Daniel
> >>
> >>> +           spin_unlock(&ttm_bo_glob.lru_lock);
> >>> +   }
> >>>
> >>>     return r;
> >>> }
> >>> @@ -720,7 +728,7 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
> >>>
> >>>     if (bo->base.resv == ctx->resv) {
> >>>             dma_resv_assert_held(bo->base.resv);
> >>> -           if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT || bo->deleted)
> >>> +           if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT)
> >>>                     ret = true;
> >>>             *locked = false;
> >>>             if (busy)
> >>> --
> >>> 2.17.1
> >>>
> >> --
> >> Daniel Vetter
> >> Software Engineer, Intel Corporation
> >> http://blog.ffwll.ch
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

end of thread, other threads:[~2020-02-11 15:36 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-10 15:09 Cleanup TTMs delayed delete handling Christian König
2020-02-10 15:09 ` [PATCH 1/6] drm/ttm: refine ghost BO resv criteria Christian König
2020-02-10 15:09 ` [PATCH 2/6] drm/ttm: cleanup ttm_buffer_object_transfer Christian König
2020-02-10 15:09 ` [PATCH 3/6] drm/ttm: use RCU in ttm_bo_flush_all_fences Christian König
2020-02-10 15:09 ` [PATCH 4/6] drm/ttm: rework BO delayed delete Christian König
2020-02-11  5:26   ` Pan, Xinhui
2020-02-11  5:43     ` Pan, Xinhui
2020-02-11 13:41       ` Christian König
2020-02-10 15:09 ` [PATCH 5/6] drm/ttm: replace dma_resv object on deleted BOs v2 Christian König
2020-02-11 14:14   ` Daniel Vetter
2020-02-11 15:02     ` Pan, Xinhui
2020-02-11 15:23       ` Christian König
2020-02-11 15:36         ` Daniel Vetter
2020-02-11 15:15     ` Christian König
2020-02-10 15:09 ` [PATCH 6/6] drm/ttm: individualize resv objects before calling release_notify Christian König
2020-02-11 11:06 ` Cleanup TTMs delayed delete handling Pan, Xinhui
2020-02-11 11:13   ` Daniel Vetter
