dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed

* (no subject)
@ 2022-04-06  7:51 Christian König
  2022-04-06  7:51 ` [PATCH 01/16] dma-buf/drivers: make reserving a shared slot mandatory v4 Christian König
                   ` (16 more replies)
  0 siblings, 17 replies; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel

Hi Daniel,

rebased on top of all the changes in drm-misc-next now and hopefully
ready for 5.19.

I think I addressed all concern, but there was a bunch of rebase fallout
from i915, so better to double check that once more.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [PATCH 01/16] dma-buf/drivers: make reserving a shared slot mandatory v4
  2022-04-06  7:51 Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:21   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 02/16] dma-buf: add enum dma_resv_usage v4 Christian König
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

Audit all the users of dma_resv_add_excl_fence() and make sure they
reserve a shared slot also when only trying to add an exclusive fence.

This is the next step towards handling the exclusive fence like a
shared one.

v2: fix missed case in amdgpu
v3: and two more radeon, rename function
v4: add one more case to TTM, fix i915 after rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-resv.c                    | 10 +--
 drivers/dma-buf/st-dma-resv.c                 | 64 +++++++++----------
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  8 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c          |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |  8 +--
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |  3 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  |  6 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  5 +-
 drivers/gpu/drm/i915/i915_vma.c               | 10 ++-
 .../drm/i915/selftests/intel_memory_region.c  |  7 ++
 drivers/gpu/drm/lima/lima_gem.c               | 10 ++-
 drivers/gpu/drm/msm/msm_gem_submit.c          | 18 +++---
 drivers/gpu/drm/nouveau/nouveau_fence.c       |  8 +--
 drivers/gpu/drm/panfrost/panfrost_job.c       |  4 ++
 drivers/gpu/drm/qxl/qxl_release.c             |  2 +-
 drivers/gpu/drm/radeon/radeon_cs.c            |  4 ++
 drivers/gpu/drm/radeon/radeon_object.c        |  8 +++
 drivers/gpu/drm/radeon/radeon_vm.c            |  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  |  8 ++-
 drivers/gpu/drm/ttm/ttm_bo_util.c             | 12 +++-
 drivers/gpu/drm/ttm/ttm_execbuf_util.c        | 15 ++---
 drivers/gpu/drm/v3d/v3d_gem.c                 | 15 +++--
 drivers/gpu/drm/vc4/vc4_gem.c                 |  2 +-
 drivers/gpu/drm/vgem/vgem_fence.c             | 12 ++--
 drivers/gpu/drm/virtio/virtgpu_gem.c          |  9 +++
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            | 16 +++--
 include/linux/dma-resv.h                      |  4 +-
 30 files changed, 176 insertions(+), 114 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 15ffac35439d..8c650b96357a 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -152,7 +152,7 @@ static inline struct dma_resv_list *dma_resv_shared_list(struct dma_resv *obj)
 }
 
 /**
- * dma_resv_reserve_shared - Reserve space to add shared fences to
+ * dma_resv_reserve_fences - Reserve space to add shared fences to
  * a dma_resv.
  * @obj: reservation object
  * @num_fences: number of fences we want to add
@@ -167,7 +167,7 @@ static inline struct dma_resv_list *dma_resv_shared_list(struct dma_resv *obj)
  * RETURNS
  * Zero for success, or -errno
  */
-int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences)
+int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences)
 {
 	struct dma_resv_list *old, *new;
 	unsigned int i, j, k, max;
@@ -230,7 +230,7 @@ int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences)
 
 	return 0;
 }
-EXPORT_SYMBOL(dma_resv_reserve_shared);
+EXPORT_SYMBOL(dma_resv_reserve_fences);
 
 #ifdef CONFIG_DEBUG_MUTEXES
 /**
@@ -238,7 +238,7 @@ EXPORT_SYMBOL(dma_resv_reserve_shared);
  * @obj: the dma_resv object to reset
  *
  * Reset the number of pre-reserved shared slots to test that drivers do
- * correct slot allocation using dma_resv_reserve_shared(). See also
+ * correct slot allocation using dma_resv_reserve_fences(). See also
  * &dma_resv_list.shared_max.
  */
 void dma_resv_reset_shared_max(struct dma_resv *obj)
@@ -260,7 +260,7 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  * @fence: the shared fence to add
  *
  * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
- * dma_resv_reserve_shared() has been called.
+ * dma_resv_reserve_fences() has been called.
  *
  * See also &dma_resv.fence for a discussion of the semantics.
  */
diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
index cbe999c6e7a6..d2e61f6ae989 100644
--- a/drivers/dma-buf/st-dma-resv.c
+++ b/drivers/dma-buf/st-dma-resv.c
@@ -75,17 +75,16 @@ static int test_signaling(void *arg, bool shared)
 		goto err_free;
 	}
 
-	if (shared) {
-		r = dma_resv_reserve_shared(&resv, 1);
-		if (r) {
-			pr_err("Resv shared slot allocation failed\n");
-			goto err_unlock;
-		}
+	r = dma_resv_reserve_fences(&resv, 1);
+	if (r) {
+		pr_err("Resv shared slot allocation failed\n");
+		goto err_unlock;
+	}
 
+	if (shared)
 		dma_resv_add_shared_fence(&resv, f);
-	} else {
+	else
 		dma_resv_add_excl_fence(&resv, f);
-	}
 
 	if (dma_resv_test_signaled(&resv, shared)) {
 		pr_err("Resv unexpectedly signaled\n");
@@ -134,17 +133,16 @@ static int test_for_each(void *arg, bool shared)
 		goto err_free;
 	}
 
-	if (shared) {
-		r = dma_resv_reserve_shared(&resv, 1);
-		if (r) {
-			pr_err("Resv shared slot allocation failed\n");
-			goto err_unlock;
-		}
+	r = dma_resv_reserve_fences(&resv, 1);
+	if (r) {
+		pr_err("Resv shared slot allocation failed\n");
+		goto err_unlock;
+	}
 
+	if (shared)
 		dma_resv_add_shared_fence(&resv, f);
-	} else {
+	else
 		dma_resv_add_excl_fence(&resv, f);
-	}
 
 	r = -ENOENT;
 	dma_resv_for_each_fence(&cursor, &resv, shared, fence) {
@@ -206,18 +204,17 @@ static int test_for_each_unlocked(void *arg, bool shared)
 		goto err_free;
 	}
 
-	if (shared) {
-		r = dma_resv_reserve_shared(&resv, 1);
-		if (r) {
-			pr_err("Resv shared slot allocation failed\n");
-			dma_resv_unlock(&resv);
-			goto err_free;
-		}
+	r = dma_resv_reserve_fences(&resv, 1);
+	if (r) {
+		pr_err("Resv shared slot allocation failed\n");
+		dma_resv_unlock(&resv);
+		goto err_free;
+	}
 
+	if (shared)
 		dma_resv_add_shared_fence(&resv, f);
-	} else {
+	else
 		dma_resv_add_excl_fence(&resv, f);
-	}
 	dma_resv_unlock(&resv);
 
 	r = -ENOENT;
@@ -290,18 +287,17 @@ static int test_get_fences(void *arg, bool shared)
 		goto err_resv;
 	}
 
-	if (shared) {
-		r = dma_resv_reserve_shared(&resv, 1);
-		if (r) {
-			pr_err("Resv shared slot allocation failed\n");
-			dma_resv_unlock(&resv);
-			goto err_resv;
-		}
+	r = dma_resv_reserve_fences(&resv, 1);
+	if (r) {
+		pr_err("Resv shared slot allocation failed\n");
+		dma_resv_unlock(&resv);
+		goto err_resv;
+	}
 
+	if (shared)
 		dma_resv_add_shared_fence(&resv, f);
-	} else {
+	else
 		dma_resv_add_excl_fence(&resv, f);
-	}
 	dma_resv_unlock(&resv);
 
 	r = dma_resv_get_fences(&resv, shared, &i, &fences);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 900ed2a7483b..98b1736bb221 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1233,7 +1233,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void **process_info,
 				  AMDGPU_FENCE_OWNER_KFD, false);
 	if (ret)
 		goto wait_pd_fail;
-	ret = dma_resv_reserve_shared(vm->root.bo->tbo.base.resv, 1);
+	ret = dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1);
 	if (ret)
 		goto reserve_shared_fail;
 	amdgpu_bo_fence(vm->root.bo,
@@ -2571,7 +2571,7 @@ int amdgpu_amdkfd_add_gws_to_process(void *info, void *gws, struct kgd_mem **mem
 	 * Add process eviction fence to bo so they can
 	 * evict each other.
 	 */
-	ret = dma_resv_reserve_shared(gws_bo->tbo.base.resv, 1);
+	ret = dma_resv_reserve_fences(gws_bo->tbo.base.resv, 1);
 	if (ret)
 		goto reserve_shared_fail;
 	amdgpu_bo_fence(gws_bo, &process_info->eviction_fence->base, true);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 25731719c627..6f57a2fd5fe3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1388,6 +1388,14 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
 		     bool shared)
 {
 	struct dma_resv *resv = bo->tbo.base.resv;
+	int r;
+
+	r = dma_resv_reserve_fences(resv, 1);
+	if (r) {
+		/* As last resort on OOM we block for the fence */
+		dma_fence_wait(fence, false);
+		return;
+	}
 
 	if (shared)
 		dma_resv_add_shared_fence(resv, fence);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 5d11978c162e..b13451255e8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2926,7 +2926,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	if (r)
 		goto error_free_root;
 
-	r = dma_resv_reserve_shared(root_bo->tbo.base.resv, 1);
+	r = dma_resv_reserve_fences(root_bo->tbo.base.resv, 1);
 	if (r)
 		goto error_unreserve;
 
@@ -3369,7 +3369,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
 		value = 0;
 	}
 
-	r = dma_resv_reserve_shared(root->tbo.base.resv, 1);
+	r = dma_resv_reserve_fences(root->tbo.base.resv, 1);
 	if (r) {
 		pr_debug("failed %d to reserve fence slot\n", r);
 		goto error_unlock;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 3b8856b4cece..b3fc3e958227 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -548,7 +548,7 @@ svm_range_vram_node_new(struct amdgpu_device *adev, struct svm_range *prange,
 		goto reserve_bo_failed;
 	}
 
-	r = dma_resv_reserve_shared(bo->tbo.base.resv, 1);
+	r = dma_resv_reserve_fences(bo->tbo.base.resv, 1);
 	if (r) {
 		pr_debug("failed %d to reserve bo\n", r);
 		amdgpu_bo_unreserve(bo);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 5f502c49aec2..53f7c78628a4 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -179,11 +179,9 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
 		struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
 		struct dma_resv *robj = bo->obj->base.resv;
 
-		if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
-			ret = dma_resv_reserve_shared(robj, 1);
-			if (ret)
-				return ret;
-		}
+		ret = dma_resv_reserve_fences(robj, 1);
+		if (ret)
+			return ret;
 
 		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
 			continue;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index ce91b23385cf..1fd0cc9ca213 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -108,7 +108,8 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 	trace_i915_gem_object_clflush(obj);
 
 	clflush = NULL;
-	if (!(flags & I915_CLFLUSH_SYNC))
+	if (!(flags & I915_CLFLUSH_SYNC) &&
+	    dma_resv_reserve_fences(obj->base.resv, 1) == 0)
 		clflush = clflush_work_create(obj);
 	if (clflush) {
 		i915_sw_fence_await_reservation(&clflush->base.chain,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d42f437149c9..78f8797853ce 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -998,11 +998,9 @@ static int eb_validate_vmas(struct i915_execbuffer *eb)
 			}
 		}
 
-		if (!(ev->flags & EXEC_OBJECT_WRITE)) {
-			err = dma_resv_reserve_shared(vma->obj->base.resv, 1);
-			if (err)
-				return err;
-		}
+		err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
+		if (err)
+			return err;
 
 		GEM_BUG_ON(drm_mm_node_allocated(&vma->node) &&
 			   eb_vma_misplaced(&eb->exec[i], vma, ev->flags));
@@ -2303,7 +2301,7 @@ static int eb_parse(struct i915_execbuffer *eb)
 	if (IS_ERR(batch))
 		return PTR_ERR(batch);
 
-	err = dma_resv_reserve_shared(shadow->obj->base.resv, 1);
+	err = dma_resv_reserve_fences(shadow->obj->base.resv, 1);
 	if (err)
 		return err;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
index 1ebe6e4086a1..432ac74ff225 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
@@ -611,7 +611,11 @@ int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
 	assert_object_held(src);
 	i915_deps_init(&deps, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
 
-	ret = dma_resv_reserve_shared(src_bo->base.resv, 1);
+	ret = dma_resv_reserve_fences(src_bo->base.resv, 1);
+	if (ret)
+		return ret;
+
+	ret = dma_resv_reserve_fences(dst_bo->base.resv, 1);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
index d534141b2cf7..0e52eb87cd55 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
@@ -216,7 +216,10 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
 					  i915_gem_object_is_lmem(obj),
 					  0xdeadbeaf, &rq);
 		if (rq) {
-			dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
+			err = dma_resv_reserve_fences(obj->base.resv, 1);
+			if (!err)
+				dma_resv_add_excl_fence(obj->base.resv,
+							&rq->fence);
 			i915_gem_object_set_moving_fence(obj, &rq->fence);
 			i915_request_put(rq);
 		}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 94fcdb7bd21d..bae3423f58e8 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1819,6 +1819,12 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
 			intel_frontbuffer_put(front);
 		}
 
+		if (!(flags & __EXEC_OBJECT_NO_RESERVE)) {
+			err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
+			if (unlikely(err))
+				return err;
+		}
+
 		if (fence) {
 			dma_resv_add_excl_fence(vma->obj->base.resv, fence);
 			obj->write_domain = I915_GEM_DOMAIN_RENDER;
@@ -1826,7 +1832,7 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
 		}
 	} else {
 		if (!(flags & __EXEC_OBJECT_NO_RESERVE)) {
-			err = dma_resv_reserve_shared(vma->obj->base.resv, 1);
+			err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
 			if (unlikely(err))
 				return err;
 		}
@@ -2044,7 +2050,7 @@ int i915_vma_unbind_async(struct i915_vma *vma, bool trylock_vm)
 	if (!obj->mm.rsgt)
 		return -EBUSY;
 
-	err = dma_resv_reserve_shared(obj->base.resv, 1);
+	err = dma_resv_reserve_fences(obj->base.resv, 1);
 	if (err)
 		return -EBUSY;
 
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index ba32893e0873..6114e013092b 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -1043,6 +1043,13 @@ static int igt_lmem_write_cpu(void *arg)
 	}
 
 	i915_gem_object_lock(obj, NULL);
+
+	err = dma_resv_reserve_fences(obj->base.resv, 1);
+	if (err) {
+		i915_gem_object_unlock(obj);
+		goto out_put;
+	}
+
 	/* Put the pages into a known state -- from the gpu for added fun */
 	intel_engine_pm_get(engine);
 	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 55bb1ec3c4f7..e0a11ee0e86d 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -257,13 +257,11 @@ int lima_gem_get_info(struct drm_file *file, u32 handle, u32 *va, u64 *offset)
 static int lima_gem_sync_bo(struct lima_sched_task *task, struct lima_bo *bo,
 			    bool write, bool explicit)
 {
-	int err = 0;
+	int err;
 
-	if (!write) {
-		err = dma_resv_reserve_shared(lima_bo_resv(bo), 1);
-		if (err)
-			return err;
-	}
+	err = dma_resv_reserve_fences(lima_bo_resv(bo), 1);
+	if (err)
+		return err;
 
 	/* explicit sync use user passed dep fence */
 	if (explicit)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index c6d60c8d286d..3164db8be893 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -320,16 +320,14 @@ static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit)
 		struct drm_gem_object *obj = &submit->bos[i].obj->base;
 		bool write = submit->bos[i].flags & MSM_SUBMIT_BO_WRITE;
 
-		if (!write) {
-			/* NOTE: _reserve_shared() must happen before
-			 * _add_shared_fence(), which makes this a slightly
-			 * strange place to call it.  OTOH this is a
-			 * convenient can-fail point to hook it in.
-			 */
-			ret = dma_resv_reserve_shared(obj->resv, 1);
-			if (ret)
-				return ret;
-		}
+		/* NOTE: _reserve_shared() must happen before
+		 * _add_shared_fence(), which makes this a slightly
+		 * strange place to call it.  OTOH this is a
+		 * convenient can-fail point to hook it in.
+		 */
+		ret = dma_resv_reserve_fences(obj->resv, 1);
+		if (ret)
+			return ret;
 
 		/* exclusive fences must be ordered */
 		if (no_implicit && !write)
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index a3a04e0d76ec..0268259e97eb 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -346,11 +346,9 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
 	struct dma_resv *resv = nvbo->bo.base.resv;
 	int i, ret;
 
-	if (!exclusive) {
-		ret = dma_resv_reserve_shared(resv, 1);
-		if (ret)
-			return ret;
-	}
+	ret = dma_resv_reserve_fences(resv, 1);
+	if (ret)
+		return ret;
 
 	/* Waiting for the exclusive fence first causes performance regressions
 	 * under some circumstances. So manually wait for the shared ones first.
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index a6925dbb6224..c34114560e49 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -247,6 +247,10 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
 	int i, ret;
 
 	for (i = 0; i < bo_count; i++) {
+		ret = dma_resv_reserve_fences(bos[i]->resv, 1);
+		if (ret)
+			return ret;
+
 		/* panfrost always uses write mode in its current uapi */
 		ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
 							      true);
diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
index 469979cd0341..cde1e8ddaeaa 100644
--- a/drivers/gpu/drm/qxl/qxl_release.c
+++ b/drivers/gpu/drm/qxl/qxl_release.c
@@ -200,7 +200,7 @@ static int qxl_release_validate_bo(struct qxl_bo *bo)
 			return ret;
 	}
 
-	ret = dma_resv_reserve_shared(bo->tbo.base.resv, 1);
+	ret = dma_resv_reserve_fences(bo->tbo.base.resv, 1);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 9ed2b2700e0a..446f7bae54c4 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -535,6 +535,10 @@ static int radeon_bo_vm_update_pte(struct radeon_cs_parser *p,
 			return r;
 
 		radeon_sync_fence(&p->ib.sync, bo_va->last_pt_update);
+
+		r = dma_resv_reserve_fences(bo->tbo.base.resv, 1);
+		if (r)
+			return r;
 	}
 
 	return radeon_vm_clear_invalids(rdev, vm);
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 91a72cd14304..7ffd2e90f325 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -782,6 +782,14 @@ void radeon_bo_fence(struct radeon_bo *bo, struct radeon_fence *fence,
 		     bool shared)
 {
 	struct dma_resv *resv = bo->tbo.base.resv;
+	int r;
+
+	r = dma_resv_reserve_fences(resv, 1);
+	if (r) {
+		/* As last resort on OOM we block for the fence */
+		dma_fence_wait(&fence->base, false);
+		return;
+	}
 
 	if (shared)
 		dma_resv_add_shared_fence(resv, &fence->base);
diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
index bb53016f3138..987cabbf1318 100644
--- a/drivers/gpu/drm/radeon/radeon_vm.c
+++ b/drivers/gpu/drm/radeon/radeon_vm.c
@@ -831,7 +831,7 @@ static int radeon_vm_update_ptes(struct radeon_device *rdev,
 		int r;
 
 		radeon_sync_resv(rdev, &ib->sync, pt->tbo.base.resv, true);
-		r = dma_resv_reserve_shared(pt->tbo.base.resv, 1);
+		r = dma_resv_reserve_fences(pt->tbo.base.resv, 1);
 		if (r)
 			return r;
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index e5fd0f2c0299..c49996cf25d0 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -151,6 +151,10 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
 		}
 	}
 
+	ret = dma_resv_reserve_fences(bo->base.resv, 1);
+	if (ret)
+		goto out_err;
+
 	ret = bdev->funcs->move(bo, evict, ctx, mem, hop);
 	if (ret) {
 		if (ret == -EMULTIHOP)
@@ -735,7 +739,7 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
 
 	dma_resv_add_shared_fence(bo->base.resv, fence);
 
-	ret = dma_resv_reserve_shared(bo->base.resv, 1);
+	ret = dma_resv_reserve_fences(bo->base.resv, 1);
 	if (unlikely(ret)) {
 		dma_fence_put(fence);
 		return ret;
@@ -794,7 +798,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
 	bool type_found = false;
 	int i, ret;
 
-	ret = dma_resv_reserve_shared(bo->base.resv, 1);
+	ret = dma_resv_reserve_fences(bo->base.resv, 1);
 	if (unlikely(ret))
 		return ret;
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 219dd81bbeab..1b96b91bf81b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -221,9 +221,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 
 	fbo->base = *bo;
 
-	ttm_bo_get(bo);
-	fbo->bo = bo;
-
 	/**
 	 * Fix up members that we shouldn't copy directly:
 	 * TODO: Explicit member copy would probably be better here.
@@ -250,6 +247,15 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 	ret = dma_resv_trylock(&fbo->base.base._resv);
 	WARN_ON(!ret);
 
+	ret = dma_resv_reserve_fences(&fbo->base.base._resv, 1);
+	if (ret) {
+		kfree(fbo);
+		return ret;
+	}
+
+	ttm_bo_get(bo);
+	fbo->bo = bo;
+
 	ttm_bo_move_to_lru_tail_unlocked(&fbo->base);
 
 	*new_obj = &fbo->base;
diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index 071c48d672c6..789c645f004e 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -90,6 +90,7 @@ int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
 
 	list_for_each_entry(entry, list, head) {
 		struct ttm_buffer_object *bo = entry->bo;
+		unsigned int num_fences;
 
 		ret = ttm_bo_reserve(bo, intr, (ticket == NULL), ticket);
 		if (ret == -EALREADY && dups) {
@@ -100,12 +101,10 @@ int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
 			continue;
 		}
 
+		num_fences = min(entry->num_shared, 1u);
 		if (!ret) {
-			if (!entry->num_shared)
-				continue;
-
-			ret = dma_resv_reserve_shared(bo->base.resv,
-								entry->num_shared);
+			ret = dma_resv_reserve_fences(bo->base.resv,
+						      num_fences);
 			if (!ret)
 				continue;
 		}
@@ -120,9 +119,9 @@ int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
 			ret = ttm_bo_reserve_slowpath(bo, intr, ticket);
 		}
 
-		if (!ret && entry->num_shared)
-			ret = dma_resv_reserve_shared(bo->base.resv,
-								entry->num_shared);
+		if (!ret)
+			ret = dma_resv_reserve_fences(bo->base.resv,
+						      num_fences);
 
 		if (unlikely(ret != 0)) {
 			if (ticket) {
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 92bc0faee84f..961812d33827 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -259,16 +259,21 @@ v3d_lock_bo_reservations(struct v3d_job *job,
 		return ret;
 
 	for (i = 0; i < job->bo_count; i++) {
+		ret = dma_resv_reserve_fences(job->bo[i]->resv, 1);
+		if (ret)
+			goto fail;
+
 		ret = drm_sched_job_add_implicit_dependencies(&job->base,
 							      job->bo[i], true);
-		if (ret) {
-			drm_gem_unlock_reservations(job->bo, job->bo_count,
-						    acquire_ctx);
-			return ret;
-		}
+		if (ret)
+			goto fail;
 	}
 
 	return 0;
+
+fail:
+	drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
+	return ret;
 }
 
 /**
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index 4abf10b66fe8..594bd6bb00d2 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -644,7 +644,7 @@ vc4_lock_bo_reservations(struct drm_device *dev,
 	for (i = 0; i < exec->bo_count; i++) {
 		bo = &exec->bo[i]->base;
 
-		ret = dma_resv_reserve_shared(bo->resv, 1);
+		ret = dma_resv_reserve_fences(bo->resv, 1);
 		if (ret) {
 			vc4_unlock_bo_reservations(dev, exec, acquire_ctx);
 			return ret;
diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
index bd6f75285fd9..2ddbebca87d9 100644
--- a/drivers/gpu/drm/vgem/vgem_fence.c
+++ b/drivers/gpu/drm/vgem/vgem_fence.c
@@ -157,12 +157,14 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
 	}
 
 	/* Expose the fence via the dma-buf */
-	ret = 0;
 	dma_resv_lock(resv, NULL);
-	if (arg->flags & VGEM_FENCE_WRITE)
-		dma_resv_add_excl_fence(resv, fence);
-	else if ((ret = dma_resv_reserve_shared(resv, 1)) == 0)
-		dma_resv_add_shared_fence(resv, fence);
+	ret = dma_resv_reserve_fences(resv, 1);
+	if (!ret) {
+		if (arg->flags & VGEM_FENCE_WRITE)
+			dma_resv_add_excl_fence(resv, fence);
+		else
+			dma_resv_add_shared_fence(resv, fence);
+	}
 	dma_resv_unlock(resv);
 
 	/* Record the fence in our idr for later signaling */
diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
index 48d3c9955f0d..1820ca6cf673 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -214,6 +214,7 @@ void virtio_gpu_array_add_obj(struct virtio_gpu_object_array *objs,
 
 int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
 {
+	unsigned int i;
 	int ret;
 
 	if (objs->nents == 1) {
@@ -222,6 +223,14 @@ int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
 		ret = drm_gem_lock_reservations(objs->objs, objs->nents,
 						&objs->ticket);
 	}
+	if (ret)
+		return ret;
+
+	for (i = 0; i < objs->nents; ++i) {
+		ret = dma_resv_reserve_fences(objs->objs[i]->resv, 1);
+		if (ret)
+			return ret;
+	}
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index 31aecc46624b..fe13aa8b4a64 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -747,16 +747,22 @@ void vmw_bo_fence_single(struct ttm_buffer_object *bo,
 			 struct vmw_fence_obj *fence)
 {
 	struct ttm_device *bdev = bo->bdev;
-
 	struct vmw_private *dev_priv =
 		container_of(bdev, struct vmw_private, bdev);
+	int ret;
 
-	if (fence == NULL) {
+	if (fence == NULL)
 		vmw_execbuf_fence_commands(NULL, dev_priv, &fence, NULL);
+	else
+		dma_fence_get(&fence->base);
+
+	ret = dma_resv_reserve_fences(bo->base.resv, 1);
+	if (!ret)
 		dma_resv_add_excl_fence(bo->base.resv, &fence->base);
-		dma_fence_put(&fence->base);
-	} else
-		dma_resv_add_excl_fence(bo->base.resv, &fence->base);
+	else
+		/* Last resort fallback when we are OOM */
+		dma_fence_wait(&fence->base, false);
+	dma_fence_put(&fence->base);
 }
 
 
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index ecb697d4d861..5fa04d0fccad 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -117,7 +117,7 @@ struct dma_resv {
 	 * A new fence is added by calling dma_resv_add_shared_fence(). Since
 	 * this often needs to be done past the point of no return in command
 	 * submission it cannot fail, and therefore sufficient slots need to be
-	 * reserved by calling dma_resv_reserve_shared().
+	 * reserved by calling dma_resv_reserve_fences().
 	 *
 	 * Note that actual semantics of what an exclusive or shared fence mean
 	 * is defined by the user, for reservation objects shared across drivers
@@ -413,7 +413,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
 
 void dma_resv_init(struct dma_resv *obj);
 void dma_resv_fini(struct dma_resv *obj);
-int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences);
+int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences);
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
 void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
 			     struct dma_fence *fence);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 02/16] dma-buf: add enum dma_resv_usage v4
  2022-04-06  7:51 Christian König
  2022-04-06  7:51 ` [PATCH 01/16] dma-buf/drivers: make reserving a shared slot mandatory v4 Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06  7:51 ` [PATCH 03/16] dma-buf: specify usage while adding fences to dma_resv obj v6 Christian König
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

This change adds the dma_resv_usage enum and allows us to specify why a
dma_resv object is queried for its containing fences.

Additional to that a dma_resv_usage_rw() helper function is added to aid
retrieving the fences for a read or write userspace submission.

This is then deployed to the different query functions of the dma_resv
object and all of their users. When the write paratermer was previously
true we now use DMA_RESV_USAGE_WRITE and DMA_RESV_USAGE_READ otherwise.

v2: add KERNEL/OTHER in separate patch
v3: some kerneldoc suggestions by Daniel
v4: some more kerneldoc suggestions by Daniel, fix missing cases lost in
    the rebase pointed out by Bas.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/dma-buf/dma-buf.c                     |  6 +-
 drivers/dma-buf/dma-resv.c                    | 35 +++++----
 drivers/dma-buf/st-dma-resv.c                 | 48 ++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c       |  5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c        |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c      |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  7 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  3 +-
 drivers/gpu/drm/drm_gem.c                     |  3 +-
 drivers/gpu/drm/drm_gem_atomic_helper.c       |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem.c         |  6 +-
 .../gpu/drm/i915/display/intel_atomic_plane.c |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_busy.c      |  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c      |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_wait.c      |  6 +-
 .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |  3 +-
 drivers/gpu/drm/i915/i915_request.c           |  3 +-
 drivers/gpu/drm/i915/i915_sw_fence.c          |  2 +-
 drivers/gpu/drm/msm/msm_gem.c                 |  3 +-
 drivers/gpu/drm/nouveau/dispnv50/wndw.c       |  3 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c          |  8 +-
 drivers/gpu/drm/nouveau/nouveau_fence.c       |  8 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c         |  3 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c       |  3 +-
 drivers/gpu/drm/qxl/qxl_debugfs.c             |  3 +-
 drivers/gpu/drm/radeon/radeon_display.c       |  3 +-
 drivers/gpu/drm/radeon/radeon_gem.c           |  9 ++-
 drivers/gpu/drm/radeon/radeon_mn.c            |  4 +-
 drivers/gpu/drm/radeon/radeon_sync.c          |  2 +-
 drivers/gpu/drm/radeon/radeon_uvd.c           |  4 +-
 drivers/gpu/drm/scheduler/sched_main.c        |  3 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  | 18 +++--
 drivers/gpu/drm/vgem/vgem_fence.c             |  4 +-
 drivers/gpu/drm/virtio/virtgpu_ioctl.c        |  5 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            |  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c      |  3 +-
 drivers/infiniband/core/umem_dmabuf.c         |  3 +-
 include/linux/dma-buf.h                       |  8 +-
 include/linux/dma-resv.h                      | 73 +++++++++++++++----
 46 files changed, 215 insertions(+), 126 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 775d3afb4169..1cddb65eafda 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -216,7 +216,8 @@ static bool dma_buf_poll_add_cb(struct dma_resv *resv, bool write,
 	struct dma_fence *fence;
 	int r;
 
-	dma_resv_for_each_fence(&cursor, resv, write, fence) {
+	dma_resv_for_each_fence(&cursor, resv, dma_resv_usage_rw(write),
+				fence) {
 		dma_fence_get(fence);
 		r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
 		if (!r)
@@ -1124,7 +1125,8 @@ static int __dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
 	long ret;
 
 	/* Wait on any implicit rendering fences */
-	ret = dma_resv_wait_timeout(resv, write, true, MAX_SCHEDULE_TIMEOUT);
+	ret = dma_resv_wait_timeout(resv, dma_resv_usage_rw(write),
+				    true, MAX_SCHEDULE_TIMEOUT);
 	if (ret < 0)
 		return ret;
 
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 8c650b96357a..17237e6ee30c 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -384,7 +384,7 @@ static void dma_resv_iter_restart_unlocked(struct dma_resv_iter *cursor)
 	cursor->seq = read_seqcount_begin(&cursor->obj->seq);
 	cursor->index = -1;
 	cursor->shared_count = 0;
-	if (cursor->all_fences) {
+	if (cursor->usage >= DMA_RESV_USAGE_READ) {
 		cursor->fences = dma_resv_shared_list(cursor->obj);
 		if (cursor->fences)
 			cursor->shared_count = cursor->fences->shared_count;
@@ -496,7 +496,7 @@ struct dma_fence *dma_resv_iter_first(struct dma_resv_iter *cursor)
 	dma_resv_assert_held(cursor->obj);
 
 	cursor->index = 0;
-	if (cursor->all_fences)
+	if (cursor->usage >= DMA_RESV_USAGE_READ)
 		cursor->fences = dma_resv_shared_list(cursor->obj);
 	else
 		cursor->fences = NULL;
@@ -551,7 +551,7 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
 	list = NULL;
 	excl = NULL;
 
-	dma_resv_iter_begin(&cursor, src, true);
+	dma_resv_iter_begin(&cursor, src, DMA_RESV_USAGE_READ);
 	dma_resv_for_each_fence_unlocked(&cursor, f) {
 
 		if (dma_resv_iter_is_restarted(&cursor)) {
@@ -597,7 +597,7 @@ EXPORT_SYMBOL(dma_resv_copy_fences);
  * dma_resv_get_fences - Get an object's shared and exclusive
  * fences without update side lock held
  * @obj: the reservation object
- * @write: true if we should return all fences
+ * @usage: controls which fences to include, see enum dma_resv_usage.
  * @num_fences: the number of fences returned
  * @fences: the array of fence ptrs returned (array is krealloc'd to the
  * required size, and must be freed by caller)
@@ -605,7 +605,7 @@ EXPORT_SYMBOL(dma_resv_copy_fences);
  * Retrieve all fences from the reservation object.
  * Returns either zero or -ENOMEM.
  */
-int dma_resv_get_fences(struct dma_resv *obj, bool write,
+int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
 			unsigned int *num_fences, struct dma_fence ***fences)
 {
 	struct dma_resv_iter cursor;
@@ -614,7 +614,7 @@ int dma_resv_get_fences(struct dma_resv *obj, bool write,
 	*num_fences = 0;
 	*fences = NULL;
 
-	dma_resv_iter_begin(&cursor, obj, write);
+	dma_resv_iter_begin(&cursor, obj, usage);
 	dma_resv_for_each_fence_unlocked(&cursor, fence) {
 
 		if (dma_resv_iter_is_restarted(&cursor)) {
@@ -646,7 +646,7 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences);
 /**
  * dma_resv_get_singleton - Get a single fence for all the fences
  * @obj: the reservation object
- * @write: true if we should return all fences
+ * @usage: controls which fences to include, see enum dma_resv_usage.
  * @fence: the resulting fence
  *
  * Get a single fence representing all the fences inside the resv object.
@@ -658,7 +658,7 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences);
  *
  * Returns 0 on success and negative error values on failure.
  */
-int dma_resv_get_singleton(struct dma_resv *obj, bool write,
+int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
 			   struct dma_fence **fence)
 {
 	struct dma_fence_array *array;
@@ -666,7 +666,7 @@ int dma_resv_get_singleton(struct dma_resv *obj, bool write,
 	unsigned count;
 	int r;
 
-	r = dma_resv_get_fences(obj, write, &count, &fences);
+	r = dma_resv_get_fences(obj, usage, &count, &fences);
         if (r)
 		return r;
 
@@ -700,7 +700,7 @@ EXPORT_SYMBOL_GPL(dma_resv_get_singleton);
  * dma_resv_wait_timeout - Wait on reservation's objects
  * shared and/or exclusive fences.
  * @obj: the reservation object
- * @wait_all: if true, wait on all fences, else wait on just exclusive fence
+ * @usage: controls which fences to include, see enum dma_resv_usage.
  * @intr: if true, do interruptible wait
  * @timeout: timeout value in jiffies or zero to return immediately
  *
@@ -710,14 +710,14 @@ EXPORT_SYMBOL_GPL(dma_resv_get_singleton);
  * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
  * greater than zer on success.
  */
-long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr,
-			   unsigned long timeout)
+long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
+			   bool intr, unsigned long timeout)
 {
 	long ret = timeout ? timeout : 1;
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
-	dma_resv_iter_begin(&cursor, obj, wait_all);
+	dma_resv_iter_begin(&cursor, obj, usage);
 	dma_resv_for_each_fence_unlocked(&cursor, fence) {
 
 		ret = dma_fence_wait_timeout(fence, intr, ret);
@@ -737,8 +737,7 @@ EXPORT_SYMBOL_GPL(dma_resv_wait_timeout);
  * dma_resv_test_signaled - Test if a reservation object's fences have been
  * signaled.
  * @obj: the reservation object
- * @test_all: if true, test all fences, otherwise only test the exclusive
- * fence
+ * @usage: controls which fences to include, see enum dma_resv_usage.
  *
  * Callers are not required to hold specific locks, but maybe hold
  * dma_resv_lock() already.
@@ -747,12 +746,12 @@ EXPORT_SYMBOL_GPL(dma_resv_wait_timeout);
  *
  * True if all fences signaled, else false.
  */
-bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
+bool dma_resv_test_signaled(struct dma_resv *obj, enum dma_resv_usage usage)
 {
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
-	dma_resv_iter_begin(&cursor, obj, test_all);
+	dma_resv_iter_begin(&cursor, obj, usage);
 	dma_resv_for_each_fence_unlocked(&cursor, fence) {
 		dma_resv_iter_end(&cursor);
 		return false;
@@ -775,7 +774,7 @@ void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
-	dma_resv_for_each_fence(&cursor, obj, true, fence) {
+	dma_resv_for_each_fence(&cursor, obj, DMA_RESV_USAGE_READ, fence) {
 		seq_printf(seq, "\t%s fence:",
 			   dma_resv_iter_is_exclusive(&cursor) ?
 				"Exclusive" : "Shared");
diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
index d2e61f6ae989..d097981061b1 100644
--- a/drivers/dma-buf/st-dma-resv.c
+++ b/drivers/dma-buf/st-dma-resv.c
@@ -58,7 +58,7 @@ static int sanitycheck(void *arg)
 	return r;
 }
 
-static int test_signaling(void *arg, bool shared)
+static int test_signaling(void *arg, enum dma_resv_usage usage)
 {
 	struct dma_resv resv;
 	struct dma_fence *f;
@@ -81,18 +81,18 @@ static int test_signaling(void *arg, bool shared)
 		goto err_unlock;
 	}
 
-	if (shared)
+	if (usage >= DMA_RESV_USAGE_READ)
 		dma_resv_add_shared_fence(&resv, f);
 	else
 		dma_resv_add_excl_fence(&resv, f);
 
-	if (dma_resv_test_signaled(&resv, shared)) {
+	if (dma_resv_test_signaled(&resv, usage)) {
 		pr_err("Resv unexpectedly signaled\n");
 		r = -EINVAL;
 		goto err_unlock;
 	}
 	dma_fence_signal(f);
-	if (!dma_resv_test_signaled(&resv, shared)) {
+	if (!dma_resv_test_signaled(&resv, usage)) {
 		pr_err("Resv not reporting signaled\n");
 		r = -EINVAL;
 		goto err_unlock;
@@ -107,15 +107,15 @@ static int test_signaling(void *arg, bool shared)
 
 static int test_excl_signaling(void *arg)
 {
-	return test_signaling(arg, false);
+	return test_signaling(arg, DMA_RESV_USAGE_WRITE);
 }
 
 static int test_shared_signaling(void *arg)
 {
-	return test_signaling(arg, true);
+	return test_signaling(arg, DMA_RESV_USAGE_READ);
 }
 
-static int test_for_each(void *arg, bool shared)
+static int test_for_each(void *arg, enum dma_resv_usage usage)
 {
 	struct dma_resv_iter cursor;
 	struct dma_fence *f, *fence;
@@ -139,13 +139,13 @@ static int test_for_each(void *arg, bool shared)
 		goto err_unlock;
 	}
 
-	if (shared)
+	if (usage >= DMA_RESV_USAGE_READ)
 		dma_resv_add_shared_fence(&resv, f);
 	else
 		dma_resv_add_excl_fence(&resv, f);
 
 	r = -ENOENT;
-	dma_resv_for_each_fence(&cursor, &resv, shared, fence) {
+	dma_resv_for_each_fence(&cursor, &resv, usage, fence) {
 		if (!r) {
 			pr_err("More than one fence found\n");
 			r = -EINVAL;
@@ -156,7 +156,8 @@ static int test_for_each(void *arg, bool shared)
 			r = -EINVAL;
 			goto err_unlock;
 		}
-		if (dma_resv_iter_is_exclusive(&cursor) != !shared) {
+		if (dma_resv_iter_is_exclusive(&cursor) !=
+		    (usage >= DMA_RESV_USAGE_READ)) {
 			pr_err("Unexpected fence usage\n");
 			r = -EINVAL;
 			goto err_unlock;
@@ -178,15 +179,15 @@ static int test_for_each(void *arg, bool shared)
 
 static int test_excl_for_each(void *arg)
 {
-	return test_for_each(arg, false);
+	return test_for_each(arg, DMA_RESV_USAGE_WRITE);
 }
 
 static int test_shared_for_each(void *arg)
 {
-	return test_for_each(arg, true);
+	return test_for_each(arg, DMA_RESV_USAGE_READ);
 }
 
-static int test_for_each_unlocked(void *arg, bool shared)
+static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
 {
 	struct dma_resv_iter cursor;
 	struct dma_fence *f, *fence;
@@ -211,14 +212,14 @@ static int test_for_each_unlocked(void *arg, bool shared)
 		goto err_free;
 	}
 
-	if (shared)
+	if (usage >= DMA_RESV_USAGE_READ)
 		dma_resv_add_shared_fence(&resv, f);
 	else
 		dma_resv_add_excl_fence(&resv, f);
 	dma_resv_unlock(&resv);
 
 	r = -ENOENT;
-	dma_resv_iter_begin(&cursor, &resv, shared);
+	dma_resv_iter_begin(&cursor, &resv, usage);
 	dma_resv_for_each_fence_unlocked(&cursor, fence) {
 		if (!r) {
 			pr_err("More than one fence found\n");
@@ -234,7 +235,8 @@ static int test_for_each_unlocked(void *arg, bool shared)
 			r = -EINVAL;
 			goto err_iter_end;
 		}
-		if (dma_resv_iter_is_exclusive(&cursor) != !shared) {
+		if (dma_resv_iter_is_exclusive(&cursor) !=
+		    (usage >= DMA_RESV_USAGE_READ)) {
 			pr_err("Unexpected fence usage\n");
 			r = -EINVAL;
 			goto err_iter_end;
@@ -262,15 +264,15 @@ static int test_for_each_unlocked(void *arg, bool shared)
 
 static int test_excl_for_each_unlocked(void *arg)
 {
-	return test_for_each_unlocked(arg, false);
+	return test_for_each_unlocked(arg, DMA_RESV_USAGE_WRITE);
 }
 
 static int test_shared_for_each_unlocked(void *arg)
 {
-	return test_for_each_unlocked(arg, true);
+	return test_for_each_unlocked(arg, DMA_RESV_USAGE_READ);
 }
 
-static int test_get_fences(void *arg, bool shared)
+static int test_get_fences(void *arg, enum dma_resv_usage usage)
 {
 	struct dma_fence *f, **fences = NULL;
 	struct dma_resv resv;
@@ -294,13 +296,13 @@ static int test_get_fences(void *arg, bool shared)
 		goto err_resv;
 	}
 
-	if (shared)
+	if (usage >= DMA_RESV_USAGE_READ)
 		dma_resv_add_shared_fence(&resv, f);
 	else
 		dma_resv_add_excl_fence(&resv, f);
 	dma_resv_unlock(&resv);
 
-	r = dma_resv_get_fences(&resv, shared, &i, &fences);
+	r = dma_resv_get_fences(&resv, usage, &i, &fences);
 	if (r) {
 		pr_err("get_fences failed\n");
 		goto err_free;
@@ -324,12 +326,12 @@ static int test_get_fences(void *arg, bool shared)
 
 static int test_excl_get_fences(void *arg)
 {
-	return test_get_fences(arg, false);
+	return test_get_fences(arg, DMA_RESV_USAGE_WRITE);
 }
 
 static int test_shared_get_fences(void *arg)
 {
-	return test_get_fences(arg, true);
+	return test_get_fences(arg, DMA_RESV_USAGE_READ);
 }
 
 int dma_resv(void)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index e85e347eb670..413f32c3fd63 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1288,7 +1288,9 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 		 *
 		 * TODO: Remove together with dma_resv rework.
 		 */
-		dma_resv_for_each_fence(&cursor, resv, false, fence) {
+		dma_resv_for_each_fence(&cursor, resv,
+					DMA_RESV_USAGE_WRITE,
+					fence) {
 			break;
 		}
 		dma_fence_chain_init(chain, fence, dma_fence_get(p->fence), 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index fae5c1debfad..7a6908d71820 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -200,8 +200,7 @@ int amdgpu_display_crtc_page_flip_target(struct drm_crtc *crtc,
 		goto unpin;
 	}
 
-	/* TODO: Unify this with other drivers */
-	r = dma_resv_get_fences(new_abo->tbo.base.resv, true,
+	r = dma_resv_get_fences(new_abo->tbo.base.resv, DMA_RESV_USAGE_WRITE,
 				&work->shared_count,
 				&work->shared);
 	if (unlikely(r != 0)) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 57b74d35052f..84a53758e18e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -526,7 +526,8 @@ int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 	}
 	robj = gem_to_amdgpu_bo(gobj);
-	ret = dma_resv_wait_timeout(robj->tbo.base.resv, true, true, timeout);
+	ret = dma_resv_wait_timeout(robj->tbo.base.resv, DMA_RESV_USAGE_READ,
+				    true, timeout);
 
 	/* ret == 0 means not signaled,
 	 * ret > 0 means signaled
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index 81207737c716..65998cbcd7f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -111,7 +111,7 @@ void amdgpu_pasid_free_delayed(struct dma_resv *resv,
 	struct dma_fence *fence;
 	int r;
 
-	r = dma_resv_get_singleton(resv, true, &fence);
+	r = dma_resv_get_singleton(resv, DMA_RESV_USAGE_READ, &fence);
 	if (r)
 		goto fallback;
 
@@ -139,7 +139,8 @@ void amdgpu_pasid_free_delayed(struct dma_resv *resv,
 	/* Not enough memory for the delayed delete, as last resort
 	 * block for all the fences to complete.
 	 */
-	dma_resv_wait_timeout(resv, true, false, MAX_SCHEDULE_TIMEOUT);
+	dma_resv_wait_timeout(resv, DMA_RESV_USAGE_READ,
+			      false, MAX_SCHEDULE_TIMEOUT);
 	amdgpu_pasid_free(pasid);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 4b153daf283d..86f5248676b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -75,8 +75,8 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_interval_notifier *mni,
 
 	mmu_interval_set_seq(mni, cur_seq);
 
-	r = dma_resv_wait_timeout(bo->tbo.base.resv, true, false,
-				  MAX_SCHEDULE_TIMEOUT);
+	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_READ,
+				  false, MAX_SCHEDULE_TIMEOUT);
 	mutex_unlock(&adev->notifier_lock);
 	if (r <= 0)
 		DRM_ERROR("(%ld) failed to wait for user bo\n", r);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 6f57a2fd5fe3..a7f39f8ab7be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -768,8 +768,8 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
 		return 0;
 	}
 
-	r = dma_resv_wait_timeout(bo->tbo.base.resv, false, false,
-				  MAX_SCHEDULE_TIMEOUT);
+	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_WRITE,
+				  false, MAX_SCHEDULE_TIMEOUT);
 	if (r < 0)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 40e06745fae9..744e144e5fc2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -259,7 +259,8 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
 	if (resv == NULL)
 		return -EINVAL;
 
-	dma_resv_for_each_fence(&cursor, resv, true, f) {
+	/* TODO: Use DMA_RESV_USAGE_READ here */
+	dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_READ, f) {
 		dma_fence_chain_for_each(f, f) {
 			struct dma_fence *tmp = dma_fence_chain_contained(f);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index f7f149588432..5db5066e74b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1344,7 +1344,8 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 	 * If true, then return false as any KFD process needs all its BOs to
 	 * be resident to run successfully
 	 */
-	dma_resv_for_each_fence(&resv_cursor, bo->base.resv, true, f) {
+	dma_resv_for_each_fence(&resv_cursor, bo->base.resv,
+				DMA_RESV_USAGE_READ, f) {
 		if (amdkfd_fence_check_mm(f, current->mm))
 			return false;
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 39c74d9fa7cc..3654326219e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -1163,7 +1163,8 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
 	ib->length_dw = 16;
 
 	if (direct) {
-		r = dma_resv_wait_timeout(bo->tbo.base.resv, true, false,
+		r = dma_resv_wait_timeout(bo->tbo.base.resv,
+					  DMA_RESV_USAGE_WRITE, false,
 					  msecs_to_jiffies(10));
 		if (r == 0)
 			r = -ETIMEDOUT;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index b13451255e8b..a0376fd36a82 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2059,7 +2059,7 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
-	dma_resv_for_each_fence(&cursor, resv, true, fence) {
+	dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_READ, fence) {
 		/* Add a callback for each fence in the reservation object */
 		amdgpu_vm_prt_get(adev);
 		amdgpu_vm_add_prt_cb(adev, fence);
@@ -2665,7 +2665,7 @@ bool amdgpu_vm_evictable(struct amdgpu_bo *bo)
 		return true;
 
 	/* Don't evict VM page tables while they are busy */
-	if (!dma_resv_test_signaled(bo->tbo.base.resv, true))
+	if (!dma_resv_test_signaled(bo->tbo.base.resv, DMA_RESV_USAGE_READ))
 		return false;
 
 	/* Try to block ongoing updates */
@@ -2845,7 +2845,8 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
  */
 long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout)
 {
-	timeout = dma_resv_wait_timeout(vm->root.bo->tbo.base.resv, true,
+	timeout = dma_resv_wait_timeout(vm->root.bo->tbo.base.resv,
+					DMA_RESV_USAGE_READ,
 					true, timeout);
 	if (timeout <= 0)
 		return timeout;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index b30656959fd8..9e24b1e616af 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -9236,7 +9236,8 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 		 * deadlock during GPU reset when this fence will not signal
 		 * but we hold reservation lock for the BO.
 		 */
-		r = dma_resv_wait_timeout(abo->tbo.base.resv, true, false,
+		r = dma_resv_wait_timeout(abo->tbo.base.resv,
+					  DMA_RESV_USAGE_WRITE, false,
 					  msecs_to_jiffies(5000));
 		if (unlikely(r <= 0))
 			DRM_ERROR("Waiting for fences timed out!");
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 133dfae06fab..eb0c2d041f13 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -771,7 +771,8 @@ long drm_gem_dma_resv_wait(struct drm_file *filep, u32 handle,
 		return -EINVAL;
 	}
 
-	ret = dma_resv_wait_timeout(obj->resv, wait_all, true, timeout);
+	ret = dma_resv_wait_timeout(obj->resv, dma_resv_usage_rw(wait_all),
+				    true, timeout);
 	if (ret == 0)
 		ret = -ETIME;
 	else if (ret > 0)
diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
index 9338ddb7edff..a6d89aed0bda 100644
--- a/drivers/gpu/drm/drm_gem_atomic_helper.c
+++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
@@ -151,7 +151,7 @@ int drm_gem_plane_helper_prepare_fb(struct drm_plane *plane, struct drm_plane_st
 		return 0;
 
 	obj = drm_gem_fb_get_obj(state->fb, 0);
-	ret = dma_resv_get_singleton(obj->resv, false, &fence);
+	ret = dma_resv_get_singleton(obj->resv, DMA_RESV_USAGE_WRITE, &fence);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index d5314aa28ff7..507172e2780b 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -380,12 +380,14 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
 	}
 
 	if (op & ETNA_PREP_NOSYNC) {
-		if (!dma_resv_test_signaled(obj->resv, write))
+		if (!dma_resv_test_signaled(obj->resv,
+					    dma_resv_usage_rw(write)))
 			return -EBUSY;
 	} else {
 		unsigned long remain = etnaviv_timeout_to_jiffies(timeout);
 
-		ret = dma_resv_wait_timeout(obj->resv, write, true, remain);
+		ret = dma_resv_wait_timeout(obj->resv, dma_resv_usage_rw(write),
+					    true, remain);
 		if (ret <= 0)
 			return ret == 0 ? -ETIMEDOUT : ret;
 	}
diff --git a/drivers/gpu/drm/i915/display/intel_atomic_plane.c b/drivers/gpu/drm/i915/display/intel_atomic_plane.c
index 5712688232fb..03e86e836a17 100644
--- a/drivers/gpu/drm/i915/display/intel_atomic_plane.c
+++ b/drivers/gpu/drm/i915/display/intel_atomic_plane.c
@@ -997,7 +997,8 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 		if (ret < 0)
 			goto unpin_fb;
 
-		dma_resv_iter_begin(&cursor, obj->base.resv, false);
+		dma_resv_iter_begin(&cursor, obj->base.resv,
+				    DMA_RESV_USAGE_WRITE);
 		dma_resv_for_each_fence_unlocked(&cursor, fence) {
 			add_rps_boost_after_vblank(new_plane_state->hw.crtc,
 						   fence);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
index 470fdfd61a0f..14a1c0ad8c3c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -138,12 +138,12 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 	 * Alternatively, we can trade that extra information on read/write
 	 * activity with
 	 *	args->busy =
-	 *		!dma_resv_test_signaled(obj->resv, true);
+	 *		!dma_resv_test_signaled(obj->resv, DMA_RESV_USAGE_READ);
 	 * to report the overall busyness. This is what the wait-ioctl does.
 	 *
 	 */
 	args->busy = 0;
-	dma_resv_iter_begin(&cursor, obj->base.resv, true);
+	dma_resv_iter_begin(&cursor, obj->base.resv, DMA_RESV_USAGE_READ);
 	dma_resv_for_each_fence_unlocked(&cursor, fence) {
 		if (dma_resv_iter_is_restarted(&cursor))
 			args->busy = 0;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 444f8268b9c5..a200d3e66573 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -66,7 +66,7 @@ bool __i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 	struct intel_memory_region *mr = READ_ONCE(obj->mm.region);
 
 #ifdef CONFIG_LOCKDEP
-	GEM_WARN_ON(dma_resv_test_signaled(obj->base.resv, true) &&
+	GEM_WARN_ON(dma_resv_test_signaled(obj->base.resv, DMA_RESV_USAGE_READ) &&
 		    i915_gem_object_evictable(obj));
 #endif
 	return mr && (mr->type == INTEL_MEMORY_LOCAL ||
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 6d1a71d6404c..644fe237601c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -86,7 +86,7 @@ static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
 		return true;
 
 	/* we will unbind on next submission, still have userptr pins */
-	r = dma_resv_wait_timeout(obj->base.resv, true, false,
+	r = dma_resv_wait_timeout(obj->base.resv, DMA_RESV_USAGE_READ, false,
 				  MAX_SCHEDULE_TIMEOUT);
 	if (r <= 0)
 		drm_err(&i915->drm, "(%ld) failed to wait for idle\n", r);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index dab3d30c09a0..319936f91ac5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -40,7 +40,8 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
 	struct dma_fence *fence;
 	long ret = timeout ?: 1;
 
-	dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL);
+	dma_resv_iter_begin(&cursor, resv,
+			    dma_resv_usage_rw(flags & I915_WAIT_ALL));
 	dma_resv_for_each_fence_unlocked(&cursor, fence) {
 		ret = i915_gem_object_wait_fence(fence, flags, timeout);
 		if (ret <= 0)
@@ -117,7 +118,8 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
-	dma_resv_iter_begin(&cursor, obj->base.resv, flags & I915_WAIT_ALL);
+	dma_resv_iter_begin(&cursor, obj->base.resv,
+			    dma_resv_usage_rw(flags & I915_WAIT_ALL));
 	dma_resv_for_each_fence_unlocked(&cursor, fence)
 		i915_gem_fence_wait_priority(fence, attr);
 	dma_resv_iter_end(&cursor);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
index b071a58dd6da..b4275b55e5b8 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
@@ -219,7 +219,8 @@ static int igt_dmabuf_import_same_driver(struct drm_i915_private *i915,
 		goto out_detach;
 	}
 
-	timeout = dma_resv_wait_timeout(dmabuf->resv, false, true, 5 * HZ);
+	timeout = dma_resv_wait_timeout(dmabuf->resv, DMA_RESV_USAGE_WRITE,
+					true, 5 * HZ);
 	if (!timeout) {
 		pr_err("dmabuf wait for exclusive fence timed out.\n");
 		timeout = -ETIME;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 582770360ad1..73d5195146b0 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1598,7 +1598,8 @@ i915_request_await_object(struct i915_request *to,
 	struct dma_fence *fence;
 	int ret = 0;
 
-	dma_resv_for_each_fence(&cursor, obj->base.resv, write, fence) {
+	dma_resv_for_each_fence(&cursor, obj->base.resv,
+				dma_resv_usage_rw(write), fence) {
 		ret = i915_request_await_dma_fence(to, fence);
 		if (ret)
 			break;
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 2a74a9a1cafe..ae984c66c48a 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -585,7 +585,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 	debug_fence_assert(fence);
 	might_sleep_if(gfpflags_allow_blocking(gfp));
 
-	dma_resv_iter_begin(&cursor, resv, write);
+	dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write));
 	dma_resv_for_each_fence_unlocked(&cursor, f) {
 		pending = i915_sw_fence_await_dma_fence(fence, f, timeout,
 							gfp);
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 02b9ae65a96a..01bbb5f2d462 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -848,7 +848,8 @@ int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout)
 		op & MSM_PREP_NOSYNC ? 0 : timeout_to_jiffies(timeout);
 	long ret;
 
-	ret = dma_resv_wait_timeout(obj->resv, write, true,  remain);
+	ret = dma_resv_wait_timeout(obj->resv, dma_resv_usage_rw(write),
+				    true,  remain);
 	if (ret == 0)
 		return remain == 0 ? -EBUSY : -ETIMEDOUT;
 	else if (ret < 0)
diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
index e2faf92e4831..8642b84ea20c 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
@@ -558,7 +558,8 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct drm_plane_state *state)
 			asyw->image.handle[0] = ctxdma->object.handle;
 	}
 
-	ret = dma_resv_get_singleton(nvbo->bo.base.resv, false,
+	ret = dma_resv_get_singleton(nvbo->bo.base.resv,
+				     DMA_RESV_USAGE_WRITE,
 				     &asyw->state.fence);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 74f8652d2bd3..c6bb4dbcd735 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -962,11 +962,11 @@ nouveau_bo_vm_cleanup(struct ttm_buffer_object *bo,
 	struct dma_fence *fence;
 	int ret;
 
-	/* TODO: This is actually a memory management dependency */
-	ret = dma_resv_get_singleton(bo->base.resv, false, &fence);
+	ret = dma_resv_get_singleton(bo->base.resv, DMA_RESV_USAGE_WRITE,
+				     &fence);
 	if (ret)
-		dma_resv_wait_timeout(bo->base.resv, false, false,
-				      MAX_SCHEDULE_TIMEOUT);
+		dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_WRITE,
+				      false, MAX_SCHEDULE_TIMEOUT);
 
 	nv10_bo_put_tile_region(dev, *old_tile, fence);
 	*old_tile = new_tile;
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 0268259e97eb..d5e81ccee01c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -350,14 +350,16 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
 	if (ret)
 		return ret;
 
-	/* Waiting for the exclusive fence first causes performance regressions
-	 * under some circumstances. So manually wait for the shared ones first.
+	/* Waiting for the writes first causes performance regressions
+	 * under some circumstances. So manually wait for the reads first.
 	 */
 	for (i = 0; i < 2; ++i) {
 		struct dma_resv_iter cursor;
 		struct dma_fence *fence;
 
-		dma_resv_for_each_fence(&cursor, resv, exclusive, fence) {
+		dma_resv_for_each_fence(&cursor, resv,
+					dma_resv_usage_rw(exclusive),
+					fence) {
 			struct nouveau_fence *f;
 
 			if (i == 0 && dma_resv_iter_is_exclusive(&cursor))
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 9416bee92141..fab542a758ff 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -962,7 +962,8 @@ nouveau_gem_ioctl_cpu_prep(struct drm_device *dev, void *data,
 		return -ENOENT;
 	nvbo = nouveau_gem_object(gem);
 
-	lret = dma_resv_wait_timeout(nvbo->bo.base.resv, write, true,
+	lret = dma_resv_wait_timeout(nvbo->bo.base.resv,
+				     dma_resv_usage_rw(write), true,
 				     no_wait ? 0 : 30 * HZ);
 	if (!lret)
 		ret = -EBUSY;
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 94b6f0a19c83..7fcbc2a5b6cd 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -316,7 +316,8 @@ panfrost_ioctl_wait_bo(struct drm_device *dev, void *data,
 	if (!gem_obj)
 		return -ENOENT;
 
-	ret = dma_resv_wait_timeout(gem_obj->resv, true, true, timeout);
+	ret = dma_resv_wait_timeout(gem_obj->resv, DMA_RESV_USAGE_READ,
+				    true, timeout);
 	if (!ret)
 		ret = timeout ? -ETIMEDOUT : -EBUSY;
 
diff --git a/drivers/gpu/drm/qxl/qxl_debugfs.c b/drivers/gpu/drm/qxl/qxl_debugfs.c
index 6a36b0fd845c..33e5889d6608 100644
--- a/drivers/gpu/drm/qxl/qxl_debugfs.c
+++ b/drivers/gpu/drm/qxl/qxl_debugfs.c
@@ -61,7 +61,8 @@ qxl_debugfs_buffers_info(struct seq_file *m, void *data)
 		struct dma_fence *fence;
 		int rel = 0;
 
-		dma_resv_iter_begin(&cursor, bo->tbo.base.resv, true);
+		dma_resv_iter_begin(&cursor, bo->tbo.base.resv,
+				    DMA_RESV_USAGE_READ);
 		dma_resv_for_each_fence_unlocked(&cursor, fence) {
 			if (dma_resv_iter_is_restarted(&cursor))
 				rel = 0;
diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index f60e826cd292..57ff2b723c87 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -533,7 +533,8 @@ static int radeon_crtc_page_flip_target(struct drm_crtc *crtc,
 		DRM_ERROR("failed to pin new rbo buffer before flip\n");
 		goto cleanup;
 	}
-	r = dma_resv_get_singleton(new_rbo->tbo.base.resv, false, &work->fence);
+	r = dma_resv_get_singleton(new_rbo->tbo.base.resv, DMA_RESV_USAGE_WRITE,
+				   &work->fence);
 	if (r) {
 		radeon_bo_unreserve(new_rbo);
 		DRM_ERROR("failed to get new rbo buffer fences\n");
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index f563284a7fac..6616a828f40b 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -162,7 +162,9 @@ static int radeon_gem_set_domain(struct drm_gem_object *gobj,
 	}
 	if (domain == RADEON_GEM_DOMAIN_CPU) {
 		/* Asking for cpu access wait for object idle */
-		r = dma_resv_wait_timeout(robj->tbo.base.resv, true, true, 30 * HZ);
+		r = dma_resv_wait_timeout(robj->tbo.base.resv,
+					  DMA_RESV_USAGE_READ,
+					  true, 30 * HZ);
 		if (!r)
 			r = -EBUSY;
 
@@ -524,7 +526,7 @@ int radeon_gem_busy_ioctl(struct drm_device *dev, void *data,
 	}
 	robj = gem_to_radeon_bo(gobj);
 
-	r = dma_resv_test_signaled(robj->tbo.base.resv, true);
+	r = dma_resv_test_signaled(robj->tbo.base.resv, DMA_RESV_USAGE_READ);
 	if (r == 0)
 		r = -EBUSY;
 	else
@@ -553,7 +555,8 @@ int radeon_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
 	}
 	robj = gem_to_radeon_bo(gobj);
 
-	ret = dma_resv_wait_timeout(robj->tbo.base.resv, true, true, 30 * HZ);
+	ret = dma_resv_wait_timeout(robj->tbo.base.resv, DMA_RESV_USAGE_READ,
+				    true, 30 * HZ);
 	if (ret == 0)
 		r = -EBUSY;
 	else if (ret < 0)
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index 9fa88549c89e..68ebeb1bdfff 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -66,8 +66,8 @@ static bool radeon_mn_invalidate(struct mmu_interval_notifier *mn,
 		return true;
 	}
 
-	r = dma_resv_wait_timeout(bo->tbo.base.resv, true, false,
-				  MAX_SCHEDULE_TIMEOUT);
+	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_READ,
+				  false, MAX_SCHEDULE_TIMEOUT);
 	if (r <= 0)
 		DRM_ERROR("(%ld) failed to wait for user bo\n", r);
 
diff --git a/drivers/gpu/drm/radeon/radeon_sync.c b/drivers/gpu/drm/radeon/radeon_sync.c
index b991ba1bcd51..49bbb2266c0f 100644
--- a/drivers/gpu/drm/radeon/radeon_sync.c
+++ b/drivers/gpu/drm/radeon/radeon_sync.c
@@ -96,7 +96,7 @@ int radeon_sync_resv(struct radeon_device *rdev,
 	struct dma_fence *f;
 	int r = 0;
 
-	dma_resv_for_each_fence(&cursor, resv, shared, f) {
+	dma_resv_for_each_fence(&cursor, resv, dma_resv_usage_rw(shared), f) {
 		fence = to_radeon_fence(f);
 		if (fence && fence->rdev == rdev)
 			radeon_sync_fence(sync, fence);
diff --git a/drivers/gpu/drm/radeon/radeon_uvd.c b/drivers/gpu/drm/radeon/radeon_uvd.c
index bc0f44299bb9..a50750740ab0 100644
--- a/drivers/gpu/drm/radeon/radeon_uvd.c
+++ b/drivers/gpu/drm/radeon/radeon_uvd.c
@@ -478,8 +478,8 @@ static int radeon_uvd_cs_msg(struct radeon_cs_parser *p, struct radeon_bo *bo,
 		return -EINVAL;
 	}
 
-	r = dma_resv_wait_timeout(bo->tbo.base.resv, false, false,
-				  MAX_SCHEDULE_TIMEOUT);
+	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_WRITE,
+				  false, MAX_SCHEDULE_TIMEOUT);
 	if (r <= 0) {
 		DRM_ERROR("Failed waiting for UVD message (%ld)!\n", r);
 		return r ? r : -ETIME;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index c5660b066554..76fd2904c7c6 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -705,7 +705,8 @@ int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
 
 	dma_resv_assert_held(obj->resv);
 
-	dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
+	dma_resv_for_each_fence(&cursor, obj->resv, dma_resv_usage_rw(write),
+				fence) {
 		/* Make sure to grab an additional ref on the added fence */
 		dma_fence_get(fence);
 		ret = drm_sched_job_add_dependency(job, fence);
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index c49996cf25d0..cff05b62f3f7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -223,7 +223,7 @@ static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
-	dma_resv_iter_begin(&cursor, resv, true);
+	dma_resv_iter_begin(&cursor, resv, DMA_RESV_USAGE_READ);
 	dma_resv_for_each_fence_unlocked(&cursor, fence) {
 		if (!fence->ops->signaled)
 			dma_fence_enable_sw_signaling(fence);
@@ -252,7 +252,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 	struct dma_resv *resv = &bo->base._resv;
 	int ret;
 
-	if (dma_resv_test_signaled(resv, true))
+	if (dma_resv_test_signaled(resv, DMA_RESV_USAGE_READ))
 		ret = 0;
 	else
 		ret = -EBUSY;
@@ -264,7 +264,8 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 			dma_resv_unlock(bo->base.resv);
 		spin_unlock(&bo->bdev->lru_lock);
 
-		lret = dma_resv_wait_timeout(resv, true, interruptible,
+		lret = dma_resv_wait_timeout(resv, DMA_RESV_USAGE_READ,
+					     interruptible,
 					     30 * HZ);
 
 		if (lret < 0)
@@ -367,7 +368,8 @@ static void ttm_bo_release(struct kref *kref)
 			/* Last resort, if we fail to allocate memory for the
 			 * fences block for the BO to become idle
 			 */
-			dma_resv_wait_timeout(bo->base.resv, true, false,
+			dma_resv_wait_timeout(bo->base.resv,
+					      DMA_RESV_USAGE_READ, false,
 					      30 * HZ);
 		}
 
@@ -378,7 +380,7 @@ static void ttm_bo_release(struct kref *kref)
 		ttm_mem_io_free(bdev, bo->resource);
 	}
 
-	if (!dma_resv_test_signaled(bo->base.resv, true) ||
+	if (!dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_READ) ||
 	    !dma_resv_trylock(bo->base.resv)) {
 		/* The BO is not idle, resurrect it for delayed destroy */
 		ttm_bo_flush_all_fences(bo);
@@ -1044,14 +1046,14 @@ int ttm_bo_wait(struct ttm_buffer_object *bo,
 	long timeout = 15 * HZ;
 
 	if (no_wait) {
-		if (dma_resv_test_signaled(bo->base.resv, true))
+		if (dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_READ))
 			return 0;
 		else
 			return -EBUSY;
 	}
 
-	timeout = dma_resv_wait_timeout(bo->base.resv, true, interruptible,
-					timeout);
+	timeout = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_READ,
+					interruptible, timeout);
 	if (timeout < 0)
 		return timeout;
 
diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
index 2ddbebca87d9..91fc4940c65a 100644
--- a/drivers/gpu/drm/vgem/vgem_fence.c
+++ b/drivers/gpu/drm/vgem/vgem_fence.c
@@ -130,6 +130,7 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
 	struct vgem_file *vfile = file->driver_priv;
 	struct dma_resv *resv;
 	struct drm_gem_object *obj;
+	enum dma_resv_usage usage;
 	struct dma_fence *fence;
 	int ret;
 
@@ -151,7 +152,8 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
 
 	/* Check for a conflicting fence */
 	resv = obj->resv;
-	if (!dma_resv_test_signaled(resv, arg->flags & VGEM_FENCE_WRITE)) {
+	usage = dma_resv_usage_rw(arg->flags & VGEM_FENCE_WRITE);
+	if (!dma_resv_test_signaled(resv, usage)) {
 		ret = -EBUSY;
 		goto err_fence;
 	}
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index 77743fd2c61a..f8d83358d2a0 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -518,9 +518,10 @@ static int virtio_gpu_wait_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 
 	if (args->flags & VIRTGPU_WAIT_NOWAIT) {
-		ret = dma_resv_test_signaled(obj->resv, true);
+		ret = dma_resv_test_signaled(obj->resv, DMA_RESV_USAGE_READ);
 	} else {
-		ret = dma_resv_wait_timeout(obj->resv, true, true, timeout);
+		ret = dma_resv_wait_timeout(obj->resv, DMA_RESV_USAGE_READ,
+					    true, timeout);
 	}
 	if (ret == 0)
 		ret = -EBUSY;
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index fe13aa8b4a64..b96884f7d03d 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -528,8 +528,8 @@ static int vmw_user_bo_synccpu_grab(struct vmw_buffer_object *vmw_bo,
 	if (flags & drm_vmw_synccpu_allow_cs) {
 		long lret;
 
-		lret = dma_resv_wait_timeout(bo->base.resv, true, true,
-					     nonblock ? 0 :
+		lret = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_READ,
+					     true, nonblock ? 0 :
 					     MAX_SCHEDULE_TIMEOUT);
 		if (!lret)
 			return -EBUSY;
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index 626067104751..a84d1d5628d0 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -1164,7 +1164,8 @@ int vmw_resources_clean(struct vmw_buffer_object *vbo, pgoff_t start,
 		if (bo->moving)
 			dma_fence_put(bo->moving);
 
-		return dma_resv_get_singleton(bo->base.resv, false,
+		return dma_resv_get_singleton(bo->base.resv,
+					      DMA_RESV_USAGE_WRITE,
 					      &bo->moving);
 	}
 
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index d32cd7538835..f9901d273b8e 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -67,7 +67,8 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf)
 	 * may be not up-to-date. Wait for the exporter to finish
 	 * the migration.
 	 */
-	return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv, false,
+	return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv,
+				     DMA_RESV_USAGE_WRITE,
 				     false, MAX_SCHEDULE_TIMEOUT);
 }
 EXPORT_SYMBOL(ib_umem_dmabuf_map_pages);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 6fb91956ab8d..a297397743a2 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -408,6 +408,9 @@ struct dma_buf {
 	 *   pipelining across drivers. These do not set any fences for their
 	 *   access. An example here is v4l.
 	 *
+	 * - Driver should use dma_resv_usage_rw() when retrieving fences as
+	 *   dependency for implicit synchronization.
+	 *
 	 * DYNAMIC IMPORTER RULES:
 	 *
 	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
@@ -423,8 +426,9 @@ struct dma_buf {
 	 *
 	 * IMPORTANT:
 	 *
-	 * All drivers must obey the struct dma_resv rules, specifically the
-	 * rules for updating and obeying fences.
+	 * All drivers and memory management related functions must obey the
+	 * struct dma_resv rules, specifically the rules for updating and
+	 * obeying fences. See enum dma_resv_usage for further descriptions.
 	 */
 	struct dma_resv *resv;
 
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 5fa04d0fccad..92cd8023980f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -49,6 +49,53 @@ extern struct ww_class reservation_ww_class;
 
 struct dma_resv_list;
 
+/**
+ * enum dma_resv_usage - how the fences from a dma_resv obj are used
+ *
+ * This enum describes the different use cases for a dma_resv object and
+ * controls which fences are returned when queried.
+ *
+ * An important fact is that there is the order WRITE<READ and when the
+ * dma_resv object is asked for fences for one use case the fences for the
+ * lower use case are returned as well.
+ */
+enum dma_resv_usage {
+	/**
+	 * @DMA_RESV_USAGE_WRITE: Implicit write synchronization.
+	 *
+	 * This should only be used for userspace command submissions which add
+	 * an implicit write dependency.
+	 */
+	DMA_RESV_USAGE_WRITE,
+
+	/**
+	 * @DMA_RESV_USAGE_READ: Implicit read synchronization.
+	 *
+	 * This should only be used for userspace command submissions which add
+	 * an implicit read dependency.
+	 */
+	DMA_RESV_USAGE_READ,
+};
+
+/**
+ * dma_resv_usage_rw - helper for implicit sync
+ * @write: true if we create a new implicit sync write
+ *
+ * This returns the implicit synchronization usage for write or read accesses,
+ * see enum dma_resv_usage and &dma_buf.resv.
+ */
+static inline enum dma_resv_usage dma_resv_usage_rw(bool write)
+{
+	/* This looks confusing at first sight, but is indeed correct.
+	 *
+	 * The rational is that new write operations needs to wait for the
+	 * existing read and write operations to finish.
+	 * But a new read operation only needs to wait for the existing write
+	 * operations to finish.
+	 */
+	return write ? DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE;
+}
+
 /**
  * struct dma_resv - a reservation object manages fences for a buffer
  *
@@ -142,8 +189,8 @@ struct dma_resv_iter {
 	/** @obj: The dma_resv object we iterate over */
 	struct dma_resv *obj;
 
-	/** @all_fences: If all fences should be returned */
-	bool all_fences;
+	/** @usage: Return fences with this usage or lower. */
+	enum dma_resv_usage usage;
 
 	/** @fence: the currently handled fence */
 	struct dma_fence *fence;
@@ -173,14 +220,14 @@ struct dma_fence *dma_resv_iter_next(struct dma_resv_iter *cursor);
  * dma_resv_iter_begin - initialize a dma_resv_iter object
  * @cursor: The dma_resv_iter object to initialize
  * @obj: The dma_resv object which we want to iterate over
- * @all_fences: If all fences should be returned or just the exclusive one
+ * @usage: controls which fences to include, see enum dma_resv_usage.
  */
 static inline void dma_resv_iter_begin(struct dma_resv_iter *cursor,
 				       struct dma_resv *obj,
-				       bool all_fences)
+				       enum dma_resv_usage usage)
 {
 	cursor->obj = obj;
-	cursor->all_fences = all_fences;
+	cursor->usage = usage;
 	cursor->fence = NULL;
 }
 
@@ -241,7 +288,7 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
  * dma_resv_for_each_fence - fence iterator
  * @cursor: a struct dma_resv_iter pointer
  * @obj: a dma_resv object pointer
- * @all_fences: true if all fences should be returned
+ * @usage: controls which fences to return
  * @fence: the current fence
  *
  * Iterate over the fences in a struct dma_resv object while holding the
@@ -250,8 +297,8 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
  * valid as long as the lock is held and so no extra reference to the fence is
  * taken.
  */
-#define dma_resv_for_each_fence(cursor, obj, all_fences, fence)	\
-	for (dma_resv_iter_begin(cursor, obj, all_fences),	\
+#define dma_resv_for_each_fence(cursor, obj, usage, fence)	\
+	for (dma_resv_iter_begin(cursor, obj, usage),	\
 	     fence = dma_resv_iter_first(cursor); fence;	\
 	     fence = dma_resv_iter_next(cursor))
 
@@ -418,14 +465,14 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
 void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
 			     struct dma_fence *fence);
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
-int dma_resv_get_fences(struct dma_resv *obj, bool write,
+int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
 			unsigned int *num_fences, struct dma_fence ***fences);
-int dma_resv_get_singleton(struct dma_resv *obj, bool write,
+int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
 			   struct dma_fence **fence);
 int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
-long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr,
-			   unsigned long timeout);
-bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all);
+long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
+			   bool intr, unsigned long timeout);
+bool dma_resv_test_signaled(struct dma_resv *obj, enum dma_resv_usage usage);
 void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq);
 
 #endif /* _LINUX_RESERVATION_H */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 03/16] dma-buf: specify usage while adding fences to dma_resv obj v6
  2022-04-06  7:51 Christian König
  2022-04-06  7:51 ` [PATCH 01/16] dma-buf/drivers: make reserving a shared slot mandatory v4 Christian König
  2022-04-06  7:51 ` [PATCH 02/16] dma-buf: add enum dma_resv_usage v4 Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:32   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 04/16] dma-buf & drm/amdgpu: remove dma_resv workaround Christian König
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

Instead of distingting between shared and exclusive fences specify
the fence usage while adding fences.

Rework all drivers to use this interface instead and deprecate the old one.

v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporary disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, avoids to
    disable warning temporary, rebase on upstream changes

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-resv.c                    |  48 +++++++--
 drivers/dma-buf/st-dma-resv.c                 | 101 +++++-------------
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |   6 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_busy.c      |  13 +--
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  |   5 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |   4 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c    |   3 +-
 drivers/gpu/drm/i915/i915_vma.c               |   8 +-
 .../drm/i915/selftests/intel_memory_region.c  |   3 +-
 drivers/gpu/drm/lima/lima_gem.c               |   2 +-
 drivers/gpu/drm/msm/msm_gem_submit.c          |   6 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c          |   9 +-
 drivers/gpu/drm/nouveau/nouveau_fence.c       |   4 +-
 drivers/gpu/drm/panfrost/panfrost_job.c       |   2 +-
 drivers/gpu/drm/qxl/qxl_release.c             |   3 +-
 drivers/gpu/drm/radeon/radeon_object.c        |   6 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  |   2 +-
 drivers/gpu/drm/ttm/ttm_bo_util.c             |   5 +-
 drivers/gpu/drm/ttm/ttm_execbuf_util.c        |   6 +-
 drivers/gpu/drm/v3d/v3d_gem.c                 |   4 +-
 drivers/gpu/drm/vc4/vc4_gem.c                 |   2 +-
 drivers/gpu/drm/vgem/vgem_fence.c             |   9 +-
 drivers/gpu/drm/virtio/virtgpu_gem.c          |   3 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            |   3 +-
 include/linux/dma-buf.h                       |  16 +--
 include/linux/dma-resv.h                      |  25 +++--
 30 files changed, 151 insertions(+), 166 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 17237e6ee30c..543dae6566d2 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -234,14 +234,14 @@ EXPORT_SYMBOL(dma_resv_reserve_fences);
 
 #ifdef CONFIG_DEBUG_MUTEXES
 /**
- * dma_resv_reset_shared_max - reset shared fences for debugging
+ * dma_resv_reset_max_fences - reset shared fences for debugging
  * @obj: the dma_resv object to reset
  *
  * Reset the number of pre-reserved shared slots to test that drivers do
  * correct slot allocation using dma_resv_reserve_fences(). See also
  * &dma_resv_list.shared_max.
  */
-void dma_resv_reset_shared_max(struct dma_resv *obj)
+void dma_resv_reset_max_fences(struct dma_resv *obj)
 {
 	struct dma_resv_list *fences = dma_resv_shared_list(obj);
 
@@ -251,7 +251,7 @@ void dma_resv_reset_shared_max(struct dma_resv *obj)
 	if (fences)
 		fences->shared_max = fences->shared_count;
 }
-EXPORT_SYMBOL(dma_resv_reset_shared_max);
+EXPORT_SYMBOL(dma_resv_reset_max_fences);
 #endif
 
 /**
@@ -264,7 +264,8 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  *
  * See also &dma_resv.fence for a discussion of the semantics.
  */
-void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
+static void dma_resv_add_shared_fence(struct dma_resv *obj,
+				      struct dma_fence *fence)
 {
 	struct dma_resv_list *fobj;
 	struct dma_fence *old;
@@ -305,13 +306,13 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 	write_seqcount_end(&obj->seq);
 	dma_fence_put(old);
 }
-EXPORT_SYMBOL(dma_resv_add_shared_fence);
 
 /**
  * dma_resv_replace_fences - replace fences in the dma_resv obj
  * @obj: the reservation object
  * @context: the context of the fences to replace
  * @replacement: the new fence to use instead
+ * @usage: how the new fence is used, see enum dma_resv_usage
  *
  * Replace fences with a specified context with a new fence. Only valid if the
  * operation represented by the original fence has no longer access to the
@@ -321,12 +322,16 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
  * update fence which makes the resource inaccessible.
  */
 void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
-			     struct dma_fence *replacement)
+			     struct dma_fence *replacement,
+			     enum dma_resv_usage usage)
 {
 	struct dma_resv_list *list;
 	struct dma_fence *old;
 	unsigned int i;
 
+	/* Only readers supported for now */
+	WARN_ON(usage != DMA_RESV_USAGE_READ);
+
 	dma_resv_assert_held(obj);
 
 	write_seqcount_begin(&obj->seq);
@@ -360,7 +365,8 @@ EXPORT_SYMBOL(dma_resv_replace_fences);
  * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
  * See also &dma_resv.fence_excl for a discussion of the semantics.
  */
-void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
+static void dma_resv_add_excl_fence(struct dma_resv *obj,
+				    struct dma_fence *fence)
 {
 	struct dma_fence *old_fence = dma_resv_excl_fence(obj);
 
@@ -375,7 +381,27 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
 
 	dma_fence_put(old_fence);
 }
-EXPORT_SYMBOL(dma_resv_add_excl_fence);
+
+/**
+ * dma_resv_add_fence - Add a fence to the dma_resv obj
+ * @obj: the reservation object
+ * @fence: the fence to add
+ * @usage: how the fence is used, see enum dma_resv_usage
+ *
+ * Add a fence to a slot, @obj must be locked with dma_resv_lock(), and
+ * dma_resv_reserve_fences() has been called.
+ *
+ * See also &dma_resv.fence for a discussion of the semantics.
+ */
+void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
+			enum dma_resv_usage usage)
+{
+	if (usage == DMA_RESV_USAGE_WRITE)
+		dma_resv_add_excl_fence(obj, fence);
+	else
+		dma_resv_add_shared_fence(obj, fence);
+}
+EXPORT_SYMBOL(dma_resv_add_fence);
 
 /* Restart the iterator by initializing all the necessary fields, but not the
  * relation to the dma_resv object. */
@@ -574,7 +600,7 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
 		}
 
 		dma_fence_get(f);
-		if (dma_resv_iter_is_exclusive(&cursor))
+		if (dma_resv_iter_usage(&cursor) == DMA_RESV_USAGE_WRITE)
 			excl = f;
 		else
 			RCU_INIT_POINTER(list->shared[list->shared_count++], f);
@@ -771,13 +797,13 @@ EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
  */
 void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
 {
+	static const char *usage[] = { "write", "read" };
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
 	dma_resv_for_each_fence(&cursor, obj, DMA_RESV_USAGE_READ, fence) {
 		seq_printf(seq, "\t%s fence:",
-			   dma_resv_iter_is_exclusive(&cursor) ?
-				"Exclusive" : "Shared");
+			   usage[dma_resv_iter_usage(&cursor)]);
 		dma_fence_describe(fence, seq);
 	}
 }
diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
index d097981061b1..d0f7c2bfd4f0 100644
--- a/drivers/dma-buf/st-dma-resv.c
+++ b/drivers/dma-buf/st-dma-resv.c
@@ -58,8 +58,9 @@ static int sanitycheck(void *arg)
 	return r;
 }
 
-static int test_signaling(void *arg, enum dma_resv_usage usage)
+static int test_signaling(void *arg)
 {
+	enum dma_resv_usage usage = (unsigned long)arg;
 	struct dma_resv resv;
 	struct dma_fence *f;
 	int r;
@@ -81,11 +82,7 @@ static int test_signaling(void *arg, enum dma_resv_usage usage)
 		goto err_unlock;
 	}
 
-	if (usage >= DMA_RESV_USAGE_READ)
-		dma_resv_add_shared_fence(&resv, f);
-	else
-		dma_resv_add_excl_fence(&resv, f);
-
+	dma_resv_add_fence(&resv, f, usage);
 	if (dma_resv_test_signaled(&resv, usage)) {
 		pr_err("Resv unexpectedly signaled\n");
 		r = -EINVAL;
@@ -105,18 +102,9 @@ static int test_signaling(void *arg, enum dma_resv_usage usage)
 	return r;
 }
 
-static int test_excl_signaling(void *arg)
-{
-	return test_signaling(arg, DMA_RESV_USAGE_WRITE);
-}
-
-static int test_shared_signaling(void *arg)
-{
-	return test_signaling(arg, DMA_RESV_USAGE_READ);
-}
-
-static int test_for_each(void *arg, enum dma_resv_usage usage)
+static int test_for_each(void *arg)
 {
+	enum dma_resv_usage usage = (unsigned long)arg;
 	struct dma_resv_iter cursor;
 	struct dma_fence *f, *fence;
 	struct dma_resv resv;
@@ -139,10 +127,7 @@ static int test_for_each(void *arg, enum dma_resv_usage usage)
 		goto err_unlock;
 	}
 
-	if (usage >= DMA_RESV_USAGE_READ)
-		dma_resv_add_shared_fence(&resv, f);
-	else
-		dma_resv_add_excl_fence(&resv, f);
+	dma_resv_add_fence(&resv, f, usage);
 
 	r = -ENOENT;
 	dma_resv_for_each_fence(&cursor, &resv, usage, fence) {
@@ -156,8 +141,7 @@ static int test_for_each(void *arg, enum dma_resv_usage usage)
 			r = -EINVAL;
 			goto err_unlock;
 		}
-		if (dma_resv_iter_is_exclusive(&cursor) !=
-		    (usage >= DMA_RESV_USAGE_READ)) {
+		if (dma_resv_iter_usage(&cursor) != usage) {
 			pr_err("Unexpected fence usage\n");
 			r = -EINVAL;
 			goto err_unlock;
@@ -177,18 +161,9 @@ static int test_for_each(void *arg, enum dma_resv_usage usage)
 	return r;
 }
 
-static int test_excl_for_each(void *arg)
-{
-	return test_for_each(arg, DMA_RESV_USAGE_WRITE);
-}
-
-static int test_shared_for_each(void *arg)
-{
-	return test_for_each(arg, DMA_RESV_USAGE_READ);
-}
-
-static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
+static int test_for_each_unlocked(void *arg)
 {
+	enum dma_resv_usage usage = (unsigned long)arg;
 	struct dma_resv_iter cursor;
 	struct dma_fence *f, *fence;
 	struct dma_resv resv;
@@ -212,10 +187,7 @@ static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
 		goto err_free;
 	}
 
-	if (usage >= DMA_RESV_USAGE_READ)
-		dma_resv_add_shared_fence(&resv, f);
-	else
-		dma_resv_add_excl_fence(&resv, f);
+	dma_resv_add_fence(&resv, f, usage);
 	dma_resv_unlock(&resv);
 
 	r = -ENOENT;
@@ -235,8 +207,7 @@ static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
 			r = -EINVAL;
 			goto err_iter_end;
 		}
-		if (dma_resv_iter_is_exclusive(&cursor) !=
-		    (usage >= DMA_RESV_USAGE_READ)) {
+		if (dma_resv_iter_usage(&cursor) != usage) {
 			pr_err("Unexpected fence usage\n");
 			r = -EINVAL;
 			goto err_iter_end;
@@ -262,18 +233,9 @@ static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
 	return r;
 }
 
-static int test_excl_for_each_unlocked(void *arg)
-{
-	return test_for_each_unlocked(arg, DMA_RESV_USAGE_WRITE);
-}
-
-static int test_shared_for_each_unlocked(void *arg)
-{
-	return test_for_each_unlocked(arg, DMA_RESV_USAGE_READ);
-}
-
-static int test_get_fences(void *arg, enum dma_resv_usage usage)
+static int test_get_fences(void *arg)
 {
+	enum dma_resv_usage usage = (unsigned long)arg;
 	struct dma_fence *f, **fences = NULL;
 	struct dma_resv resv;
 	int r, i;
@@ -296,10 +258,7 @@ static int test_get_fences(void *arg, enum dma_resv_usage usage)
 		goto err_resv;
 	}
 
-	if (usage >= DMA_RESV_USAGE_READ)
-		dma_resv_add_shared_fence(&resv, f);
-	else
-		dma_resv_add_excl_fence(&resv, f);
+	dma_resv_add_fence(&resv, f, usage);
 	dma_resv_unlock(&resv);
 
 	r = dma_resv_get_fences(&resv, usage, &i, &fences);
@@ -324,30 +283,24 @@ static int test_get_fences(void *arg, enum dma_resv_usage usage)
 	return r;
 }
 
-static int test_excl_get_fences(void *arg)
-{
-	return test_get_fences(arg, DMA_RESV_USAGE_WRITE);
-}
-
-static int test_shared_get_fences(void *arg)
-{
-	return test_get_fences(arg, DMA_RESV_USAGE_READ);
-}
-
 int dma_resv(void)
 {
 	static const struct subtest tests[] = {
 		SUBTEST(sanitycheck),
-		SUBTEST(test_excl_signaling),
-		SUBTEST(test_shared_signaling),
-		SUBTEST(test_excl_for_each),
-		SUBTEST(test_shared_for_each),
-		SUBTEST(test_excl_for_each_unlocked),
-		SUBTEST(test_shared_for_each_unlocked),
-		SUBTEST(test_excl_get_fences),
-		SUBTEST(test_shared_get_fences),
+		SUBTEST(test_signaling),
+		SUBTEST(test_for_each),
+		SUBTEST(test_for_each_unlocked),
+		SUBTEST(test_get_fences),
 	};
+	enum dma_resv_usage usage;
+	int r;
 
 	spin_lock_init(&fence_lock);
-	return subtests(tests, NULL);
+	for (usage = DMA_RESV_USAGE_WRITE; usage <= DMA_RESV_USAGE_READ;
+	     ++usage) {
+		r = subtests(tests, (void *)(unsigned long)usage);
+		if (r)
+			return r;
+	}
+	return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 98b1736bb221..5031e26e6716 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -263,7 +263,7 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
 	 */
 	replacement = dma_fence_get_stub();
 	dma_resv_replace_fences(bo->tbo.base.resv, ef->base.context,
-				replacement);
+				replacement, DMA_RESV_USAGE_READ);
 	dma_fence_put(replacement);
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 413f32c3fd63..76fd916424d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -55,8 +55,8 @@ static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
 	bo = amdgpu_bo_ref(gem_to_amdgpu_bo(gobj));
 	p->uf_entry.priority = 0;
 	p->uf_entry.tv.bo = &bo->tbo;
-	/* One for TTM and one for the CS job */
-	p->uf_entry.tv.num_shared = 2;
+	/* One for TTM and two for the CS job */
+	p->uf_entry.tv.num_shared = 3;
 
 	drm_gem_object_put(gobj);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index a7f39f8ab7be..a3cdf8a24377 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1397,10 +1397,8 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
 		return;
 	}
 
-	if (shared)
-		dma_resv_add_shared_fence(resv, fence);
-	else
-		dma_resv_add_excl_fence(resv, fence);
+	dma_resv_add_fence(resv, fence, shared ? DMA_RESV_USAGE_READ :
+			   DMA_RESV_USAGE_WRITE);
 }
 
 /**
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 53f7c78628a4..98bb5c9239de 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -202,14 +202,10 @@ static void submit_attach_object_fences(struct etnaviv_gem_submit *submit)
 
 	for (i = 0; i < submit->nr_bos; i++) {
 		struct drm_gem_object *obj = &submit->bos[i].obj->base;
+		bool write = submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE;
 
-		if (submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE)
-			dma_resv_add_excl_fence(obj->resv,
-							  submit->out_fence);
-		else
-			dma_resv_add_shared_fence(obj->resv,
-							    submit->out_fence);
-
+		dma_resv_add_fence(obj->resv, submit->out_fence, write ?
+				   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
 		submit_unlock_object(submit, i);
 	}
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
index 14a1c0ad8c3c..e7ae94ee1b44 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -148,12 +148,13 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 		if (dma_resv_iter_is_restarted(&cursor))
 			args->busy = 0;
 
-		if (dma_resv_iter_is_exclusive(&cursor))
-			/* Translate the exclusive fence to the READ *and* WRITE engine */
-			args->busy |= busy_check_writer(fence);
-		else
-			/* Translate shared fences to READ set of engines */
-			args->busy |= busy_check_reader(fence);
+		/* Translate read fences to READ set of engines */
+		args->busy |= busy_check_reader(fence);
+	}
+	dma_resv_iter_begin(&cursor, obj->base.resv, DMA_RESV_USAGE_WRITE);
+	dma_resv_for_each_fence_unlocked(&cursor, fence) {
+		/* Translate the write fences to the READ *and* WRITE engine */
+		args->busy |= busy_check_writer(fence);
 	}
 	dma_resv_iter_end(&cursor);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index 1fd0cc9ca213..f5f2b8b115ea 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -116,7 +116,8 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 						obj->base.resv, NULL, true,
 						i915_fence_timeout(i915),
 						I915_FENCE_GFP);
-		dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
+		dma_resv_add_fence(obj->base.resv, &clflush->base.dma,
+				   DMA_RESV_USAGE_WRITE);
 		dma_fence_work_commit(&clflush->base);
 		/*
 		 * We must have successfully populated the pages(since we are
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
index 432ac74ff225..438b8a95b3d1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
@@ -637,9 +637,8 @@ int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
 	if (IS_ERR_OR_NULL(copy_fence))
 		return PTR_ERR_OR_ZERO(copy_fence);
 
-	dma_resv_add_excl_fence(dst_bo->base.resv, copy_fence);
-	dma_resv_add_shared_fence(src_bo->base.resv, copy_fence);
-
+	dma_resv_add_fence(dst_bo->base.resv, copy_fence, DMA_RESV_USAGE_WRITE);
+	dma_resv_add_fence(src_bo->base.resv, copy_fence, DMA_RESV_USAGE_READ);
 	dma_fence_put(copy_fence);
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
index 0e52eb87cd55..4997ed18b6e4 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
@@ -218,8 +218,8 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
 		if (rq) {
 			err = dma_resv_reserve_fences(obj->base.resv, 1);
 			if (!err)
-				dma_resv_add_excl_fence(obj->base.resv,
-							&rq->fence);
+				dma_resv_add_fence(obj->base.resv, &rq->fence,
+						   DMA_RESV_USAGE_WRITE);
 			i915_gem_object_set_moving_fence(obj, &rq->fence);
 			i915_request_put(rq);
 		}
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index a132e241c3ee..3a6e3f6d239f 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -1220,7 +1220,8 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
 					  expand32(POISON_INUSE), &rq);
 	i915_gem_object_unpin_pages(obj);
 	if (rq) {
-		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
+		dma_resv_add_fence(obj->base.resv, &rq->fence,
+				   DMA_RESV_USAGE_WRITE);
 		i915_gem_object_set_moving_fence(obj, &rq->fence);
 		i915_request_put(rq);
 	}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index bae3423f58e8..524477d8939e 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1826,7 +1826,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
 		}
 
 		if (fence) {
-			dma_resv_add_excl_fence(vma->obj->base.resv, fence);
+			dma_resv_add_fence(vma->obj->base.resv, fence,
+					   DMA_RESV_USAGE_WRITE);
 			obj->write_domain = I915_GEM_DOMAIN_RENDER;
 			obj->read_domains = 0;
 		}
@@ -1838,7 +1839,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
 		}
 
 		if (fence) {
-			dma_resv_add_shared_fence(vma->obj->base.resv, fence);
+			dma_resv_add_fence(vma->obj->base.resv, fence,
+					   DMA_RESV_USAGE_READ);
 			obj->write_domain = 0;
 		}
 	}
@@ -2078,7 +2080,7 @@ int i915_vma_unbind_async(struct i915_vma *vma, bool trylock_vm)
 		goto out_rpm;
 	}
 
-	dma_resv_add_shared_fence(obj->base.resv, fence);
+	dma_resv_add_fence(obj->base.resv, fence, DMA_RESV_USAGE_READ);
 	dma_fence_put(fence);
 
 out_rpm:
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 6114e013092b..73eb53edb8de 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -1056,7 +1056,8 @@ static int igt_lmem_write_cpu(void *arg)
 					  obj->mm.pages->sgl, I915_CACHE_NONE,
 					  true, 0xdeadbeaf, &rq);
 	if (rq) {
-		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
+		dma_resv_add_fence(obj->base.resv, &rq->fence,
+				   DMA_RESV_USAGE_WRITE);
 		i915_request_put(rq);
 	}
 
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index e0a11ee0e86d..cb3bfccc930f 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -367,7 +367,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
 		if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
 			dma_resv_add_excl_fence(lima_bo_resv(bos[i]), fence);
 		else
-			dma_resv_add_shared_fence(lima_bo_resv(bos[i]), fence);
+			dma_resv_add_fence(lima_bo_resv(bos[i]), fence);
 	}
 
 	drm_gem_unlock_reservations((struct drm_gem_object **)bos,
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 3164db8be893..8d1eef914ba8 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -395,9 +395,11 @@ static void submit_attach_object_fences(struct msm_gem_submit *submit)
 		struct drm_gem_object *obj = &submit->bos[i].obj->base;
 
 		if (submit->bos[i].flags & MSM_SUBMIT_BO_WRITE)
-			dma_resv_add_excl_fence(obj->resv, submit->user_fence);
+			dma_resv_add_fence(obj->resv, submit->user_fence,
+					   DMA_RESV_USAGE_WRITE);
 		else if (submit->bos[i].flags & MSM_SUBMIT_BO_READ)
-			dma_resv_add_shared_fence(obj->resv, submit->user_fence);
+			dma_resv_add_fence(obj->resv, submit->user_fence,
+					   DMA_RESV_USAGE_READ);
 	}
 }
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index c6bb4dbcd735..05076e530e7d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1308,10 +1308,11 @@ nouveau_bo_fence(struct nouveau_bo *nvbo, struct nouveau_fence *fence, bool excl
 {
 	struct dma_resv *resv = nvbo->bo.base.resv;
 
-	if (exclusive)
-		dma_resv_add_excl_fence(resv, &fence->base);
-	else if (fence)
-		dma_resv_add_shared_fence(resv, &fence->base);
+	if (!fence)
+		return;
+
+	dma_resv_add_fence(resv, &fence->base, exclusive ?
+			   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
 }
 
 static void
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index d5e81ccee01c..7f01dcf81fab 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -360,9 +360,11 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
 		dma_resv_for_each_fence(&cursor, resv,
 					dma_resv_usage_rw(exclusive),
 					fence) {
+			enum dma_resv_usage usage;
 			struct nouveau_fence *f;
 
-			if (i == 0 && dma_resv_iter_is_exclusive(&cursor))
+			usage = dma_resv_iter_usage(&cursor);
+			if (i == 0 && usage == DMA_RESV_USAGE_WRITE)
 				continue;
 
 			f = nouveau_local_fence(fence, chan->drm);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index c34114560e49..fda5871aebe3 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -268,7 +268,7 @@ static void panfrost_attach_object_fences(struct drm_gem_object **bos,
 	int i;
 
 	for (i = 0; i < bo_count; i++)
-		dma_resv_add_excl_fence(bos[i]->resv, fence);
+		dma_resv_add_fence(bos[i]->resv, fence, DMA_RESV_USAGE_WRITE);
 }
 
 int panfrost_job_push(struct panfrost_job *job)
diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
index cde1e8ddaeaa..368d26da0d6a 100644
--- a/drivers/gpu/drm/qxl/qxl_release.c
+++ b/drivers/gpu/drm/qxl/qxl_release.c
@@ -429,7 +429,8 @@ void qxl_release_fence_buffer_objects(struct qxl_release *release)
 	list_for_each_entry(entry, &release->bos, head) {
 		bo = entry->bo;
 
-		dma_resv_add_shared_fence(bo->base.resv, &release->base);
+		dma_resv_add_fence(bo->base.resv, &release->base,
+				   DMA_RESV_USAGE_READ);
 		ttm_bo_move_to_lru_tail_unlocked(bo);
 		dma_resv_unlock(bo->base.resv);
 	}
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 7ffd2e90f325..cb5c4aa45cef 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -791,8 +791,6 @@ void radeon_bo_fence(struct radeon_bo *bo, struct radeon_fence *fence,
 		return;
 	}
 
-	if (shared)
-		dma_resv_add_shared_fence(resv, &fence->base);
-	else
-		dma_resv_add_excl_fence(resv, &fence->base);
+	dma_resv_add_fence(resv, &fence->base, shared ?
+			   DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE);
 }
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index cff05b62f3f7..d74f9eea855e 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -739,7 +739,7 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
 		return ret;
 	}
 
-	dma_resv_add_shared_fence(bo->base.resv, fence);
+	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
 
 	ret = dma_resv_reserve_fences(bo->base.resv, 1);
 	if (unlikely(ret)) {
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 1b96b91bf81b..7a96a1db13a7 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -507,7 +507,8 @@ static int ttm_bo_move_to_ghost(struct ttm_buffer_object *bo,
 	if (ret)
 		return ret;
 
-	dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
+	dma_resv_add_fence(&ghost_obj->base._resv, fence,
+			   DMA_RESV_USAGE_WRITE);
 
 	/**
 	 * If we're not moving to fixed memory, the TTM object
@@ -561,7 +562,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
 	struct ttm_resource_manager *man = ttm_manager_type(bdev, new_mem->mem_type);
 	int ret = 0;
 
-	dma_resv_add_excl_fence(bo->base.resv, fence);
+	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
 	if (!evict)
 		ret = ttm_bo_move_to_ghost(bo, fence, man->use_tt);
 	else if (!from->use_tt && pipeline)
diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index 789c645f004e..0eb995d25df1 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -154,10 +154,8 @@ void ttm_eu_fence_buffer_objects(struct ww_acquire_ctx *ticket,
 	list_for_each_entry(entry, list, head) {
 		struct ttm_buffer_object *bo = entry->bo;
 
-		if (entry->num_shared)
-			dma_resv_add_shared_fence(bo->base.resv, fence);
-		else
-			dma_resv_add_excl_fence(bo->base.resv, fence);
+		dma_resv_add_fence(bo->base.resv, fence, entry->num_shared ?
+				   DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE);
 		ttm_bo_move_to_lru_tail_unlocked(bo);
 		dma_resv_unlock(bo->base.resv);
 	}
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 961812d33827..2352e9640922 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -550,8 +550,8 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
 
 	for (i = 0; i < job->bo_count; i++) {
 		/* XXX: Use shared fences for read-only objects. */
-		dma_resv_add_excl_fence(job->bo[i]->resv,
-					job->done_fence);
+		dma_resv_add_fence(job->bo[i]->resv, job->done_fence,
+				   DMA_RESV_USAGE_WRITE);
 	}
 
 	drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index 594bd6bb00d2..38550317e025 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -546,7 +546,7 @@ vc4_update_bo_seqnos(struct vc4_exec_info *exec, uint64_t seqno)
 		bo = to_vc4_bo(&exec->bo[i]->base);
 		bo->seqno = seqno;
 
-		dma_resv_add_shared_fence(bo->base.base.resv, exec->fence);
+		dma_resv_add_fence(bo->base.base.resv, exec->fence);
 	}
 
 	list_for_each_entry(bo, &exec->unref_list, unref_head) {
diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
index 91fc4940c65a..c2a879734d40 100644
--- a/drivers/gpu/drm/vgem/vgem_fence.c
+++ b/drivers/gpu/drm/vgem/vgem_fence.c
@@ -161,12 +161,9 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
 	/* Expose the fence via the dma-buf */
 	dma_resv_lock(resv, NULL);
 	ret = dma_resv_reserve_fences(resv, 1);
-	if (!ret) {
-		if (arg->flags & VGEM_FENCE_WRITE)
-			dma_resv_add_excl_fence(resv, fence);
-		else
-			dma_resv_add_shared_fence(resv, fence);
-	}
+	if (!ret)
+		dma_resv_add_fence(resv, fence, arg->flags & VGEM_FENCE_WRITE ?
+				   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
 	dma_resv_unlock(resv);
 
 	/* Record the fence in our idr for later signaling */
diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
index 1820ca6cf673..580a78809836 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -250,7 +250,8 @@ void virtio_gpu_array_add_fence(struct virtio_gpu_object_array *objs,
 	int i;
 
 	for (i = 0; i < objs->nents; i++)
-		dma_resv_add_excl_fence(objs->objs[i]->resv, fence);
+		dma_resv_add_fence(objs->objs[i]->resv, fence,
+				   DMA_RESV_USAGE_WRITE);
 }
 
 void virtio_gpu_array_put_free(struct virtio_gpu_object_array *objs)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index b96884f7d03d..bec50223efe5 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -758,7 +758,8 @@ void vmw_bo_fence_single(struct ttm_buffer_object *bo,
 
 	ret = dma_resv_reserve_fences(bo->base.resv, 1);
 	if (!ret)
-		dma_resv_add_excl_fence(bo->base.resv, &fence->base);
+		dma_resv_add_fence(bo->base.resv, &fence->base,
+				   DMA_RESV_USAGE_WRITE);
 	else
 		/* Last resort fallback when we are OOM */
 		dma_fence_wait(&fence->base, false);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index a297397743a2..71731796c8c3 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -393,15 +393,15 @@ struct dma_buf {
 	 * e.g. exposed in `Implicit Fence Poll Support`_ must follow the
 	 * below rules.
 	 *
-	 * - Drivers must add a shared fence through dma_resv_add_shared_fence()
-	 *   for anything the userspace API considers a read access. This highly
-	 *   depends upon the API and window system.
+	 * - Drivers must add a read fence through dma_resv_add_fence() with the
+	 *   DMA_RESV_USAGE_READ flag for anything the userspace API considers a
+	 *   read access. This highly depends upon the API and window system.
 	 *
-	 * - Similarly drivers must set the exclusive fence through
-	 *   dma_resv_add_excl_fence() for anything the userspace API considers
-	 *   write access.
+	 * - Similarly drivers must add a write fence through
+	 *   dma_resv_add_fence() with the DMA_RESV_USAGE_WRITE flag for
+	 *   anything the userspace API considers write access.
 	 *
-	 * - Drivers may just always set the exclusive fence, since that only
+	 * - Drivers may just always add a write fence, since that only
 	 *   causes unecessarily synchronization, but no correctness issues.
 	 *
 	 * - Some drivers only expose a synchronous userspace API with no
@@ -416,7 +416,7 @@ struct dma_buf {
 	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
 	 * additional constraints on how they set up fences:
 	 *
-	 * - Dynamic importers must obey the exclusive fence and wait for it to
+	 * - Dynamic importers must obey the write fences and wait for them to
 	 *   signal before allowing access to the buffer's underlying storage
 	 *   through the device.
 	 *
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 92cd8023980f..98dc5234b487 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -195,6 +195,9 @@ struct dma_resv_iter {
 	/** @fence: the currently handled fence */
 	struct dma_fence *fence;
 
+	/** @fence_usage: the usage of the current fence */
+	enum dma_resv_usage fence_usage;
+
 	/** @seq: sequence number to check for modifications */
 	unsigned int seq;
 
@@ -244,14 +247,15 @@ static inline void dma_resv_iter_end(struct dma_resv_iter *cursor)
 }
 
 /**
- * dma_resv_iter_is_exclusive - test if the current fence is the exclusive one
+ * dma_resv_iter_usage - Return the usage of the current fence
  * @cursor: the cursor of the current position
  *
- * Returns true if the currently returned fence is the exclusive one.
+ * Returns the usage of the currently processed fence.
  */
-static inline bool dma_resv_iter_is_exclusive(struct dma_resv_iter *cursor)
+static inline enum dma_resv_usage
+dma_resv_iter_usage(struct dma_resv_iter *cursor)
 {
-	return cursor->index == 0;
+	return cursor->fence_usage;
 }
 
 /**
@@ -306,9 +310,9 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
 #define dma_resv_assert_held(obj) lockdep_assert_held(&(obj)->lock.base)
 
 #ifdef CONFIG_DEBUG_MUTEXES
-void dma_resv_reset_shared_max(struct dma_resv *obj);
+void dma_resv_reset_max_fences(struct dma_resv *obj);
 #else
-static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
+static inline void dma_resv_reset_max_fences(struct dma_resv *obj) {}
 #endif
 
 /**
@@ -454,17 +458,18 @@ static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
  */
 static inline void dma_resv_unlock(struct dma_resv *obj)
 {
-	dma_resv_reset_shared_max(obj);
+	dma_resv_reset_max_fences(obj);
 	ww_mutex_unlock(&obj->lock);
 }
 
 void dma_resv_init(struct dma_resv *obj);
 void dma_resv_fini(struct dma_resv *obj);
 int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences);
-void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
+void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
+			enum dma_resv_usage usage);
 void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
-			     struct dma_fence *fence);
-void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
+			     struct dma_fence *fence,
+			     enum dma_resv_usage usage);
 int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
 			unsigned int *num_fences, struct dma_fence ***fences);
 int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 04/16] dma-buf & drm/amdgpu: remove dma_resv workaround
  2022-04-06  7:51 Christian König
                   ` (2 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 03/16] dma-buf: specify usage while adding fences to dma_resv obj v6 Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:39   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 05/16] dma-buf: add DMA_RESV_USAGE_KERNEL v3 Christian König
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König, amd-gfx

Rework the internals of the dma_resv object to allow adding more than one
write fence and remember for each fence what purpose it had.

This allows removing the workaround from amdgpu which used a container for
this instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
---
 drivers/dma-buf/dma-resv.c                  | 353 ++++++++------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h |   1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |  53 +--
 include/linux/dma-resv.h                    |  47 +--
 4 files changed, 157 insertions(+), 297 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 543dae6566d2..378d47e1cfea 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -44,12 +44,12 @@
 /**
  * DOC: Reservation Object Overview
  *
- * The reservation object provides a mechanism to manage shared and
- * exclusive fences associated with a buffer.  A reservation object
- * can have attached one exclusive fence (normally associated with
- * write operations) or N shared fences (read operations).  The RCU
- * mechanism is used to protect read access to fences from locked
- * write-side updates.
+ * The reservation object provides a mechanism to manage a container of
+ * dma_fence object associated with a resource. A reservation object
+ * can have any number of fences attaches to it. Each fence carries an usage
+ * parameter determining how the operation represented by the fence is using the
+ * resource. The RCU mechanism is used to protect read access to fences from
+ * locked write-side updates.
  *
  * See struct dma_resv for more details.
  */
@@ -57,39 +57,59 @@
 DEFINE_WD_CLASS(reservation_ww_class);
 EXPORT_SYMBOL(reservation_ww_class);
 
+/* Mask for the lower fence pointer bits */
+#define DMA_RESV_LIST_MASK	0x3
+
 struct dma_resv_list {
 	struct rcu_head rcu;
-	u32 shared_count, shared_max;
-	struct dma_fence __rcu *shared[];
+	u32 num_fences, max_fences;
+	struct dma_fence __rcu *table[];
 };
 
-/**
- * dma_resv_list_alloc - allocate fence list
- * @shared_max: number of fences we need space for
- *
+/* Extract the fence and usage flags from an RCU protected entry in the list. */
+static void dma_resv_list_entry(struct dma_resv_list *list, unsigned int index,
+				struct dma_resv *resv, struct dma_fence **fence,
+				enum dma_resv_usage *usage)
+{
+	long tmp;
+
+	tmp = (long)rcu_dereference_check(list->table[index],
+					  resv ? dma_resv_held(resv) : true);
+	*fence = (struct dma_fence *)(tmp & ~DMA_RESV_LIST_MASK);
+	if (usage)
+		*usage = tmp & DMA_RESV_LIST_MASK;
+}
+
+/* Set the fence and usage flags at the specific index in the list. */
+static void dma_resv_list_set(struct dma_resv_list *list,
+			      unsigned int index,
+			      struct dma_fence *fence,
+			      enum dma_resv_usage usage)
+{
+	long tmp = ((long)fence) | usage;
+
+	RCU_INIT_POINTER(list->table[index], (struct dma_fence *)tmp);
+}
+
+/*
  * Allocate a new dma_resv_list and make sure to correctly initialize
- * shared_max.
+ * max_fences.
  */
-static struct dma_resv_list *dma_resv_list_alloc(unsigned int shared_max)
+static struct dma_resv_list *dma_resv_list_alloc(unsigned int max_fences)
 {
 	struct dma_resv_list *list;
 
-	list = kmalloc(struct_size(list, shared, shared_max), GFP_KERNEL);
+	list = kmalloc(struct_size(list, table, max_fences), GFP_KERNEL);
 	if (!list)
 		return NULL;
 
-	list->shared_max = (ksize(list) - offsetof(typeof(*list), shared)) /
-		sizeof(*list->shared);
+	list->max_fences = (ksize(list) - offsetof(typeof(*list), table)) /
+		sizeof(*list->table);
 
 	return list;
 }
 
-/**
- * dma_resv_list_free - free fence list
- * @list: list to free
- *
- * Free a dma_resv_list and make sure to drop all references.
- */
+/* Free a dma_resv_list and make sure to drop all references. */
 static void dma_resv_list_free(struct dma_resv_list *list)
 {
 	unsigned int i;
@@ -97,9 +117,12 @@ static void dma_resv_list_free(struct dma_resv_list *list)
 	if (!list)
 		return;
 
-	for (i = 0; i < list->shared_count; ++i)
-		dma_fence_put(rcu_dereference_protected(list->shared[i], true));
+	for (i = 0; i < list->num_fences; ++i) {
+		struct dma_fence *fence;
 
+		dma_resv_list_entry(list, i, NULL, &fence, NULL);
+		dma_fence_put(fence);
+	}
 	kfree_rcu(list, rcu);
 }
 
@@ -112,8 +135,7 @@ void dma_resv_init(struct dma_resv *obj)
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
 	seqcount_ww_mutex_init(&obj->seq, &obj->lock);
 
-	RCU_INIT_POINTER(obj->fence, NULL);
-	RCU_INIT_POINTER(obj->fence_excl, NULL);
+	RCU_INIT_POINTER(obj->fences, NULL);
 }
 EXPORT_SYMBOL(dma_resv_init);
 
@@ -123,46 +145,32 @@ EXPORT_SYMBOL(dma_resv_init);
  */
 void dma_resv_fini(struct dma_resv *obj)
 {
-	struct dma_resv_list *fobj;
-	struct dma_fence *excl;
-
 	/*
 	 * This object should be dead and all references must have
 	 * been released to it, so no need to be protected with rcu.
 	 */
-	excl = rcu_dereference_protected(obj->fence_excl, 1);
-	if (excl)
-		dma_fence_put(excl);
-
-	fobj = rcu_dereference_protected(obj->fence, 1);
-	dma_resv_list_free(fobj);
+	dma_resv_list_free(rcu_dereference_protected(obj->fences, true));
 	ww_mutex_destroy(&obj->lock);
 }
 EXPORT_SYMBOL(dma_resv_fini);
 
-static inline struct dma_fence *
-dma_resv_excl_fence(struct dma_resv *obj)
-{
-       return rcu_dereference_check(obj->fence_excl, dma_resv_held(obj));
-}
-
-static inline struct dma_resv_list *dma_resv_shared_list(struct dma_resv *obj)
+/* Dereference the fences while ensuring RCU rules */
+static inline struct dma_resv_list *dma_resv_fences_list(struct dma_resv *obj)
 {
-	return rcu_dereference_check(obj->fence, dma_resv_held(obj));
+	return rcu_dereference_check(obj->fences, dma_resv_held(obj));
 }
 
 /**
- * dma_resv_reserve_fences - Reserve space to add shared fences to
- * a dma_resv.
+ * dma_resv_reserve_fences - Reserve space to add fences to a dma_resv object.
  * @obj: reservation object
  * @num_fences: number of fences we want to add
  *
- * Should be called before dma_resv_add_shared_fence().  Must
- * be called with @obj locked through dma_resv_lock().
+ * Should be called before dma_resv_add_fence().  Must be called with @obj
+ * locked through dma_resv_lock().
  *
  * Note that the preallocated slots need to be re-reserved if @obj is unlocked
- * at any time before calling dma_resv_add_shared_fence(). This is validated
- * when CONFIG_DEBUG_MUTEXES is enabled.
+ * at any time before calling dma_resv_add_fence(). This is validated when
+ * CONFIG_DEBUG_MUTEXES is enabled.
  *
  * RETURNS
  * Zero for success, or -errno
@@ -174,11 +182,11 @@ int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences)
 
 	dma_resv_assert_held(obj);
 
-	old = dma_resv_shared_list(obj);
-	if (old && old->shared_max) {
-		if ((old->shared_count + num_fences) <= old->shared_max)
+	old = dma_resv_fences_list(obj);
+	if (old && old->max_fences) {
+		if ((old->num_fences + num_fences) <= old->max_fences)
 			return 0;
-		max = max(old->shared_count + num_fences, old->shared_max * 2);
+		max = max(old->num_fences + num_fences, old->max_fences * 2);
 	} else {
 		max = max(4ul, roundup_pow_of_two(num_fences));
 	}
@@ -193,27 +201,27 @@ int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences)
 	 * references from the old struct are carried over to
 	 * the new.
 	 */
-	for (i = 0, j = 0, k = max; i < (old ? old->shared_count : 0); ++i) {
+	for (i = 0, j = 0, k = max; i < (old ? old->num_fences : 0); ++i) {
+		enum dma_resv_usage usage;
 		struct dma_fence *fence;
 
-		fence = rcu_dereference_protected(old->shared[i],
-						  dma_resv_held(obj));
+		dma_resv_list_entry(old, i, obj, &fence, &usage);
 		if (dma_fence_is_signaled(fence))
-			RCU_INIT_POINTER(new->shared[--k], fence);
+			RCU_INIT_POINTER(new->table[--k], fence);
 		else
-			RCU_INIT_POINTER(new->shared[j++], fence);
+			dma_resv_list_set(new, j++, fence, usage);
 	}
-	new->shared_count = j;
+	new->num_fences = j;
 
 	/*
 	 * We are not changing the effective set of fences here so can
 	 * merely update the pointer to the new array; both existing
 	 * readers and new readers will see exactly the same set of
-	 * active (unsignaled) shared fences. Individual fences and the
+	 * active (unsignaled) fences. Individual fences and the
 	 * old array are protected by RCU and so will not vanish under
 	 * the gaze of the rcu_read_lock() readers.
 	 */
-	rcu_assign_pointer(obj->fence, new);
+	rcu_assign_pointer(obj->fences, new);
 
 	if (!old)
 		return 0;
@@ -222,7 +230,7 @@ int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences)
 	for (i = k; i < max; ++i) {
 		struct dma_fence *fence;
 
-		fence = rcu_dereference_protected(new->shared[i],
+		fence = rcu_dereference_protected(new->table[i],
 						  dma_resv_held(obj));
 		dma_fence_put(fence);
 	}
@@ -234,38 +242,39 @@ EXPORT_SYMBOL(dma_resv_reserve_fences);
 
 #ifdef CONFIG_DEBUG_MUTEXES
 /**
- * dma_resv_reset_max_fences - reset shared fences for debugging
+ * dma_resv_reset_max_fences - reset fences for debugging
  * @obj: the dma_resv object to reset
  *
- * Reset the number of pre-reserved shared slots to test that drivers do
+ * Reset the number of pre-reserved fence slots to test that drivers do
  * correct slot allocation using dma_resv_reserve_fences(). See also
- * &dma_resv_list.shared_max.
+ * &dma_resv_list.max_fences.
  */
 void dma_resv_reset_max_fences(struct dma_resv *obj)
 {
-	struct dma_resv_list *fences = dma_resv_shared_list(obj);
+	struct dma_resv_list *fences = dma_resv_fences_list(obj);
 
 	dma_resv_assert_held(obj);
 
-	/* Test shared fence slot reservation */
+	/* Test fence slot reservation */
 	if (fences)
-		fences->shared_max = fences->shared_count;
+		fences->max_fences = fences->num_fences;
 }
 EXPORT_SYMBOL(dma_resv_reset_max_fences);
 #endif
 
 /**
- * dma_resv_add_shared_fence - Add a fence to a shared slot
+ * dma_resv_add_fence - Add a fence to the dma_resv obj
  * @obj: the reservation object
- * @fence: the shared fence to add
+ * @fence: the fence to add
+ * @usage: how the fence is used, see enum dma_resv_usage
  *
- * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
+ * Add a fence to a slot, @obj must be locked with dma_resv_lock(), and
  * dma_resv_reserve_fences() has been called.
  *
  * See also &dma_resv.fence for a discussion of the semantics.
  */
-static void dma_resv_add_shared_fence(struct dma_resv *obj,
-				      struct dma_fence *fence)
+void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
+			enum dma_resv_usage usage)
 {
 	struct dma_resv_list *fobj;
 	struct dma_fence *old;
@@ -280,32 +289,33 @@ static void dma_resv_add_shared_fence(struct dma_resv *obj,
 	 */
 	WARN_ON(dma_fence_is_container(fence));
 
-	fobj = dma_resv_shared_list(obj);
-	count = fobj->shared_count;
+	fobj = dma_resv_fences_list(obj);
+	count = fobj->num_fences;
 
 	write_seqcount_begin(&obj->seq);
 
 	for (i = 0; i < count; ++i) {
+		enum dma_resv_usage old_usage;
 
-		old = rcu_dereference_protected(fobj->shared[i],
-						dma_resv_held(obj));
-		if (old->context == fence->context ||
+		dma_resv_list_entry(fobj, i, obj, &old, &old_usage);
+		if ((old->context == fence->context && old_usage >= usage) ||
 		    dma_fence_is_signaled(old))
 			goto replace;
 	}
 
-	BUG_ON(fobj->shared_count >= fobj->shared_max);
+	BUG_ON(fobj->num_fences >= fobj->max_fences);
 	old = NULL;
 	count++;
 
 replace:
-	RCU_INIT_POINTER(fobj->shared[i], fence);
-	/* pointer update must be visible before we extend the shared_count */
-	smp_store_mb(fobj->shared_count, count);
+	dma_resv_list_set(fobj, i, fence, usage);
+	/* pointer update must be visible before we extend the num_fences */
+	smp_store_mb(fobj->num_fences, count);
 
 	write_seqcount_end(&obj->seq);
 	dma_fence_put(old);
 }
+EXPORT_SYMBOL(dma_resv_add_fence);
 
 /**
  * dma_resv_replace_fences - replace fences in the dma_resv obj
@@ -326,128 +336,63 @@ void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
 			     enum dma_resv_usage usage)
 {
 	struct dma_resv_list *list;
-	struct dma_fence *old;
 	unsigned int i;
 
-	/* Only readers supported for now */
-	WARN_ON(usage != DMA_RESV_USAGE_READ);
-
 	dma_resv_assert_held(obj);
 
+	list = dma_resv_fences_list(obj);
 	write_seqcount_begin(&obj->seq);
+	for (i = 0; list && i < list->num_fences; ++i) {
+		struct dma_fence *old;
 
-	old = dma_resv_excl_fence(obj);
-	if (old->context == context) {
-		RCU_INIT_POINTER(obj->fence_excl, dma_fence_get(replacement));
-		dma_fence_put(old);
-	}
-
-	list = dma_resv_shared_list(obj);
-	for (i = 0; list && i < list->shared_count; ++i) {
-		old = rcu_dereference_protected(list->shared[i],
-						dma_resv_held(obj));
+		dma_resv_list_entry(list, i, obj, &old, NULL);
 		if (old->context != context)
 			continue;
 
-		rcu_assign_pointer(list->shared[i], dma_fence_get(replacement));
+		dma_resv_list_set(list, i, replacement, usage);
 		dma_fence_put(old);
 	}
-
 	write_seqcount_end(&obj->seq);
 }
 EXPORT_SYMBOL(dma_resv_replace_fences);
 
-/**
- * dma_resv_add_excl_fence - Add an exclusive fence.
- * @obj: the reservation object
- * @fence: the exclusive fence to add
- *
- * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
- * See also &dma_resv.fence_excl for a discussion of the semantics.
- */
-static void dma_resv_add_excl_fence(struct dma_resv *obj,
-				    struct dma_fence *fence)
-{
-	struct dma_fence *old_fence = dma_resv_excl_fence(obj);
-
-	dma_resv_assert_held(obj);
-
-	dma_fence_get(fence);
-
-	write_seqcount_begin(&obj->seq);
-	/* write_seqcount_begin provides the necessary memory barrier */
-	RCU_INIT_POINTER(obj->fence_excl, fence);
-	write_seqcount_end(&obj->seq);
-
-	dma_fence_put(old_fence);
-}
-
-/**
- * dma_resv_add_fence - Add a fence to the dma_resv obj
- * @obj: the reservation object
- * @fence: the fence to add
- * @usage: how the fence is used, see enum dma_resv_usage
- *
- * Add a fence to a slot, @obj must be locked with dma_resv_lock(), and
- * dma_resv_reserve_fences() has been called.
- *
- * See also &dma_resv.fence for a discussion of the semantics.
- */
-void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
-			enum dma_resv_usage usage)
-{
-	if (usage == DMA_RESV_USAGE_WRITE)
-		dma_resv_add_excl_fence(obj, fence);
-	else
-		dma_resv_add_shared_fence(obj, fence);
-}
-EXPORT_SYMBOL(dma_resv_add_fence);
-
-/* Restart the iterator by initializing all the necessary fields, but not the
- * relation to the dma_resv object. */
+/* Restart the unlocked iteration by initializing the cursor object. */
 static void dma_resv_iter_restart_unlocked(struct dma_resv_iter *cursor)
 {
 	cursor->seq = read_seqcount_begin(&cursor->obj->seq);
-	cursor->index = -1;
-	cursor->shared_count = 0;
-	if (cursor->usage >= DMA_RESV_USAGE_READ) {
-		cursor->fences = dma_resv_shared_list(cursor->obj);
-		if (cursor->fences)
-			cursor->shared_count = cursor->fences->shared_count;
-	} else {
-		cursor->fences = NULL;
-	}
+	cursor->index = 0;
+	cursor->num_fences = 0;
+	cursor->fences = dma_resv_fences_list(cursor->obj);
+	if (cursor->fences)
+		cursor->num_fences = cursor->fences->num_fences;
 	cursor->is_restarted = true;
 }
 
 /* Walk to the next not signaled fence and grab a reference to it */
 static void dma_resv_iter_walk_unlocked(struct dma_resv_iter *cursor)
 {
-	struct dma_resv *obj = cursor->obj;
+	if (!cursor->fences)
+		return;
 
 	do {
 		/* Drop the reference from the previous round */
 		dma_fence_put(cursor->fence);
 
-		if (cursor->index == -1) {
-			cursor->fence = dma_resv_excl_fence(obj);
-			cursor->index++;
-			if (!cursor->fence)
-				continue;
-
-		} else if (!cursor->fences ||
-			   cursor->index >= cursor->shared_count) {
+		if (cursor->index >= cursor->num_fences) {
 			cursor->fence = NULL;
 			break;
 
-		} else {
-			struct dma_resv_list *fences = cursor->fences;
-			unsigned int idx = cursor->index++;
-
-			cursor->fence = rcu_dereference(fences->shared[idx]);
 		}
+
+		dma_resv_list_entry(cursor->fences, cursor->index++,
+				    cursor->obj, &cursor->fence,
+				    &cursor->fence_usage);
 		cursor->fence = dma_fence_get_rcu(cursor->fence);
-		if (!cursor->fence || !dma_fence_is_signaled(cursor->fence))
+		if (!cursor->fence)
+			break;
+
+		if (!dma_fence_is_signaled(cursor->fence) &&
+		    cursor->usage >= cursor->fence_usage)
 			break;
 	} while (true);
 }
@@ -522,15 +467,9 @@ struct dma_fence *dma_resv_iter_first(struct dma_resv_iter *cursor)
 	dma_resv_assert_held(cursor->obj);
 
 	cursor->index = 0;
-	if (cursor->usage >= DMA_RESV_USAGE_READ)
-		cursor->fences = dma_resv_shared_list(cursor->obj);
-	else
-		cursor->fences = NULL;
-
-	fence = dma_resv_excl_fence(cursor->obj);
-	if (!fence)
-		fence = dma_resv_iter_next(cursor);
+	cursor->fences = dma_resv_fences_list(cursor->obj);
 
+	fence = dma_resv_iter_next(cursor);
 	cursor->is_restarted = true;
 	return fence;
 }
@@ -545,17 +484,22 @@ EXPORT_SYMBOL_GPL(dma_resv_iter_first);
  */
 struct dma_fence *dma_resv_iter_next(struct dma_resv_iter *cursor)
 {
-	unsigned int idx;
+	struct dma_fence *fence;
 
 	dma_resv_assert_held(cursor->obj);
 
 	cursor->is_restarted = false;
-	if (!cursor->fences || cursor->index >= cursor->fences->shared_count)
-		return NULL;
 
-	idx = cursor->index++;
-	return rcu_dereference_protected(cursor->fences->shared[idx],
-					 dma_resv_held(cursor->obj));
+	do {
+		if (!cursor->fences ||
+		    cursor->index >= cursor->fences->num_fences)
+			return NULL;
+
+		dma_resv_list_entry(cursor->fences, cursor->index++,
+				    cursor->obj, &fence, &cursor->fence_usage);
+	} while (cursor->fence_usage > cursor->usage);
+
+	return fence;
 }
 EXPORT_SYMBOL_GPL(dma_resv_iter_next);
 
@@ -570,57 +514,43 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
 {
 	struct dma_resv_iter cursor;
 	struct dma_resv_list *list;
-	struct dma_fence *f, *excl;
+	struct dma_fence *f;
 
 	dma_resv_assert_held(dst);
 
 	list = NULL;
-	excl = NULL;
 
 	dma_resv_iter_begin(&cursor, src, DMA_RESV_USAGE_READ);
 	dma_resv_for_each_fence_unlocked(&cursor, f) {
 
 		if (dma_resv_iter_is_restarted(&cursor)) {
 			dma_resv_list_free(list);
-			dma_fence_put(excl);
-
-			if (cursor.shared_count) {
-				list = dma_resv_list_alloc(cursor.shared_count);
-				if (!list) {
-					dma_resv_iter_end(&cursor);
-					return -ENOMEM;
-				}
 
-				list->shared_count = 0;
-
-			} else {
-				list = NULL;
+			list = dma_resv_list_alloc(cursor.num_fences);
+			if (!list) {
+				dma_resv_iter_end(&cursor);
+				return -ENOMEM;
 			}
-			excl = NULL;
+			list->num_fences = 0;
 		}
 
 		dma_fence_get(f);
-		if (dma_resv_iter_usage(&cursor) == DMA_RESV_USAGE_WRITE)
-			excl = f;
-		else
-			RCU_INIT_POINTER(list->shared[list->shared_count++], f);
+		dma_resv_list_set(list, list->num_fences++, f,
+				  dma_resv_iter_usage(&cursor));
 	}
 	dma_resv_iter_end(&cursor);
 
 	write_seqcount_begin(&dst->seq);
-	excl = rcu_replace_pointer(dst->fence_excl, excl, dma_resv_held(dst));
-	list = rcu_replace_pointer(dst->fence, list, dma_resv_held(dst));
+	list = rcu_replace_pointer(dst->fences, list, dma_resv_held(dst));
 	write_seqcount_end(&dst->seq);
 
 	dma_resv_list_free(list);
-	dma_fence_put(excl);
-
 	return 0;
 }
 EXPORT_SYMBOL(dma_resv_copy_fences);
 
 /**
- * dma_resv_get_fences - Get an object's shared and exclusive
+ * dma_resv_get_fences - Get an object's fences
  * fences without update side lock held
  * @obj: the reservation object
  * @usage: controls which fences to include, see enum dma_resv_usage.
@@ -649,7 +579,7 @@ int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
 			while (*num_fences)
 				dma_fence_put((*fences)[--(*num_fences)]);
 
-			count = cursor.shared_count + 1;
+			count = cursor.num_fences + 1;
 
 			/* Eventually re-allocate the array */
 			*fences = krealloc_array(*fences, count,
@@ -723,8 +653,7 @@ int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
 EXPORT_SYMBOL_GPL(dma_resv_get_singleton);
 
 /**
- * dma_resv_wait_timeout - Wait on reservation's objects
- * shared and/or exclusive fences.
+ * dma_resv_wait_timeout - Wait on reservation's objects fences
  * @obj: the reservation object
  * @usage: controls which fences to include, see enum dma_resv_usage.
  * @intr: if true, do interruptible wait
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
index 044b41f0bfd9..529d52a204cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
@@ -34,7 +34,6 @@ struct amdgpu_fpriv;
 struct amdgpu_bo_list_entry {
 	struct ttm_validate_buffer	tv;
 	struct amdgpu_bo_va		*bo_va;
-	struct dma_fence_chain		*chain;
 	uint32_t			priority;
 	struct page			**user_pages;
 	bool				user_invalidated;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 76fd916424d6..8de283997769 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -574,14 +574,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 		struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
 
 		e->bo_va = amdgpu_vm_bo_find(vm, bo);
-
-		if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
-			e->chain = dma_fence_chain_alloc();
-			if (!e->chain) {
-				r = -ENOMEM;
-				goto error_validate;
-			}
-		}
 	}
 
 	/* Move fence waiting after getting reservation lock of
@@ -642,13 +634,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	}
 
 error_validate:
-	if (r) {
-		amdgpu_bo_list_for_each_entry(e, p->bo_list) {
-			dma_fence_chain_free(e->chain);
-			e->chain = NULL;
-		}
+	if (r)
 		ttm_eu_backoff_reservation(&p->ticket, &p->validated);
-	}
 out:
 	return r;
 }
@@ -688,17 +675,9 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error,
 {
 	unsigned i;
 
-	if (error && backoff) {
-		struct amdgpu_bo_list_entry *e;
-
-		amdgpu_bo_list_for_each_entry(e, parser->bo_list) {
-			dma_fence_chain_free(e->chain);
-			e->chain = NULL;
-		}
-
+	if (error && backoff)
 		ttm_eu_backoff_reservation(&parser->ticket,
 					   &parser->validated);
-	}
 
 	for (i = 0; i < parser->num_post_deps; i++) {
 		drm_syncobj_put(parser->post_deps[i].syncobj);
@@ -1272,31 +1251,9 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 
 	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
 
-	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
-		struct dma_resv *resv = e->tv.bo->base.resv;
-		struct dma_fence_chain *chain = e->chain;
-		struct dma_resv_iter cursor;
-		struct dma_fence *fence;
-
-		if (!chain)
-			continue;
-
-		/*
-		 * Temporary workaround dma_resv shortcommings by wrapping up
-		 * the submission in a dma_fence_chain and add it as exclusive
-		 * fence.
-		 *
-		 * TODO: Remove together with dma_resv rework.
-		 */
-		dma_resv_for_each_fence(&cursor, resv,
-					DMA_RESV_USAGE_WRITE,
-					fence) {
-			break;
-		}
-		dma_fence_chain_init(chain, fence, dma_fence_get(p->fence), 1);
-		rcu_assign_pointer(resv->fence_excl, &chain->base);
-		e->chain = NULL;
-	}
+	/* Make sure all BOs are remembered as writers */
+	amdgpu_bo_list_for_each_entry(e, p->bo_list)
+		e->tv.num_shared = 0;
 
 	ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
 	mutex_unlock(&p->adev->notifier_lock);
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 98dc5234b487..7bb7e7edbb6f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -99,8 +99,8 @@ static inline enum dma_resv_usage dma_resv_usage_rw(bool write)
 /**
  * struct dma_resv - a reservation object manages fences for a buffer
  *
- * There are multiple uses for this, with sometimes slightly different rules in
- * how the fence slots are used.
+ * This is a container for dma_fence objects which needs to handle multiple use
+ * cases.
  *
  * One use is to synchronize cross-driver access to a struct dma_buf, either for
  * dynamic buffer management or just to handle implicit synchronization between
@@ -130,47 +130,22 @@ struct dma_resv {
 	 * @seq:
 	 *
 	 * Sequence count for managing RCU read-side synchronization, allows
-	 * read-only access to @fence_excl and @fence while ensuring we take a
-	 * consistent snapshot.
+	 * read-only access to @fences while ensuring we take a consistent
+	 * snapshot.
 	 */
 	seqcount_ww_mutex_t seq;
 
 	/**
-	 * @fence_excl:
+	 * @fences:
 	 *
-	 * The exclusive fence, if there is one currently.
+	 * Array of fences which where added to the dma_resv object
 	 *
-	 * To guarantee that no fences are lost, this new fence must signal
-	 * only after the previous exclusive fence has signalled. If
-	 * semantically only a new access is added without actually treating the
-	 * previous one as a dependency the exclusive fences can be strung
-	 * together using struct dma_fence_chain.
-	 *
-	 * Note that actual semantics of what an exclusive or shared fence mean
-	 * is defined by the user, for reservation objects shared across drivers
-	 * see &dma_buf.resv.
-	 */
-	struct dma_fence __rcu *fence_excl;
-
-	/**
-	 * @fence:
-	 *
-	 * List of current shared fences.
-	 *
-	 * There are no ordering constraints of shared fences against the
-	 * exclusive fence slot. If a waiter needs to wait for all access, it
-	 * has to wait for both sets of fences to signal.
-	 *
-	 * A new fence is added by calling dma_resv_add_shared_fence(). Since
-	 * this often needs to be done past the point of no return in command
+	 * A new fence is added by calling dma_resv_add_fence(). Since this
+	 * often needs to be done past the point of no return in command
 	 * submission it cannot fail, and therefore sufficient slots need to be
 	 * reserved by calling dma_resv_reserve_fences().
-	 *
-	 * Note that actual semantics of what an exclusive or shared fence mean
-	 * is defined by the user, for reservation objects shared across drivers
-	 * see &dma_buf.resv.
 	 */
-	struct dma_resv_list __rcu *fence;
+	struct dma_resv_list __rcu *fences;
 };
 
 /**
@@ -207,8 +182,8 @@ struct dma_resv_iter {
 	/** @fences: the shared fences; private, *MUST* not dereference  */
 	struct dma_resv_list *fences;
 
-	/** @shared_count: number of shared fences */
-	unsigned int shared_count;
+	/** @num_fences: number of fences */
+	unsigned int num_fences;
 
 	/** @is_restarted: true if this is the first returned fence */
 	bool is_restarted;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 05/16] dma-buf: add DMA_RESV_USAGE_KERNEL v3
  2022-04-06  7:51 Christian König
                   ` (3 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 04/16] dma-buf & drm/amdgpu: remove dma_resv workaround Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:41   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 06/16] drm/amdgpu: use DMA_RESV_USAGE_KERNEL Christian König
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

Add an usage for kernel submissions. Waiting for those are mandatory for
dynamic DMA-bufs.

As a precaution this patch also changes all occurrences where fences are
added as part of memory management in TTM, VMWGFX and i915 to use the
new value because it now becomes possible for drivers to ignore fences
with the WRITE usage.

v2: use "must" in documentation, fix whitespaces
v3: separate out some driver changes and better document why some
    changes should still be part of this patch.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-resv.c                  |  2 +-
 drivers/dma-buf/st-dma-resv.c               |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c |  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c                |  2 +-
 drivers/gpu/drm/ttm/ttm_bo_util.c           |  4 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c          |  2 +-
 include/linux/dma-resv.h                    | 24 ++++++++++++++++++---
 7 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 378d47e1cfea..f4860e5f2d8b 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -726,7 +726,7 @@ EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
  */
 void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
 {
-	static const char *usage[] = { "write", "read" };
+	static const char *usage[] = { "kernel", "write", "read" };
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
index d0f7c2bfd4f0..062b57d63fa6 100644
--- a/drivers/dma-buf/st-dma-resv.c
+++ b/drivers/dma-buf/st-dma-resv.c
@@ -296,7 +296,7 @@ int dma_resv(void)
 	int r;
 
 	spin_lock_init(&fence_lock);
-	for (usage = DMA_RESV_USAGE_WRITE; usage <= DMA_RESV_USAGE_READ;
+	for (usage = DMA_RESV_USAGE_KERNEL; usage <= DMA_RESV_USAGE_READ;
 	     ++usage) {
 		r = subtests(tests, (void *)(unsigned long)usage);
 		if (r)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index f5f2b8b115ea..0512afdd20d8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -117,7 +117,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 						i915_fence_timeout(i915),
 						I915_FENCE_GFP);
 		dma_resv_add_fence(obj->base.resv, &clflush->base.dma,
-				   DMA_RESV_USAGE_WRITE);
+				   DMA_RESV_USAGE_KERNEL);
 		dma_fence_work_commit(&clflush->base);
 		/*
 		 * We must have successfully populated the pages(since we are
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d74f9eea855e..6bf3fb1c8045 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -739,7 +739,7 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
 		return ret;
 	}
 
-	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
+	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_KERNEL);
 
 	ret = dma_resv_reserve_fences(bo->base.resv, 1);
 	if (unlikely(ret)) {
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 7a96a1db13a7..99deb45894f4 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -508,7 +508,7 @@ static int ttm_bo_move_to_ghost(struct ttm_buffer_object *bo,
 		return ret;
 
 	dma_resv_add_fence(&ghost_obj->base._resv, fence,
-			   DMA_RESV_USAGE_WRITE);
+			   DMA_RESV_USAGE_KERNEL);
 
 	/**
 	 * If we're not moving to fixed memory, the TTM object
@@ -562,7 +562,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
 	struct ttm_resource_manager *man = ttm_manager_type(bdev, new_mem->mem_type);
 	int ret = 0;
 
-	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
+	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_KERNEL);
 	if (!evict)
 		ret = ttm_bo_move_to_ghost(bo, fence, man->use_tt);
 	else if (!from->use_tt && pipeline)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index bec50223efe5..408ede1f967f 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -759,7 +759,7 @@ void vmw_bo_fence_single(struct ttm_buffer_object *bo,
 	ret = dma_resv_reserve_fences(bo->base.resv, 1);
 	if (!ret)
 		dma_resv_add_fence(bo->base.resv, &fence->base,
-				   DMA_RESV_USAGE_WRITE);
+				   DMA_RESV_USAGE_KERNEL);
 	else
 		/* Last resort fallback when we are OOM */
 		dma_fence_wait(&fence->base, false);
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 7bb7e7edbb6f..a749f229ae91 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -55,11 +55,29 @@ struct dma_resv_list;
  * This enum describes the different use cases for a dma_resv object and
  * controls which fences are returned when queried.
  *
- * An important fact is that there is the order WRITE<READ and when the
- * dma_resv object is asked for fences for one use case the fences for the
- * lower use case are returned as well.
+ * An important fact is that there is the order KERNEL<WRITE<READ and
+ * when the dma_resv object is asked for fences for one use case the fences
+ * for the lower use case are returned as well.
+ *
+ * For example when asking for WRITE fences then the KERNEL fences are returned
+ * as well. Similar when asked for READ fences then both WRITE and KERNEL
+ * fences are returned as well.
  */
 enum dma_resv_usage {
+	/**
+	 * @DMA_RESV_USAGE_KERNEL: For in kernel memory management only.
+	 *
+	 * This should only be used for things like copying or clearing memory
+	 * with a DMA hardware engine for the purpose of kernel memory
+	 * management.
+	 *
+	 * Drivers *always* must wait for those fences before accessing the
+	 * resource protected by the dma_resv object. The only exception for
+	 * that is when the resource is known to be locked down in place by
+	 * pinning it previously.
+	 */
+	DMA_RESV_USAGE_KERNEL,
+
 	/**
 	 * @DMA_RESV_USAGE_WRITE: Implicit write synchronization.
 	 *
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 06/16] drm/amdgpu: use DMA_RESV_USAGE_KERNEL
  2022-04-06  7:51 Christian König
                   ` (4 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 05/16] dma-buf: add DMA_RESV_USAGE_KERNEL v3 Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:42   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 07/16] drm/radeon: " Christian König
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

Wait only for kernel fences before kmap or UVD direct submission.

This also makes sure that we always wait in amdgpu_bo_kmap() even when
returning a cached pointer.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c    |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index a3cdf8a24377..5832c05ab10d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -761,6 +761,11 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
 	if (bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)
 		return -EPERM;
 
+	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_KERNEL,
+				  false, MAX_SCHEDULE_TIMEOUT);
+	if (r < 0)
+		return r;
+
 	kptr = amdgpu_bo_kptr(bo);
 	if (kptr) {
 		if (ptr)
@@ -768,11 +773,6 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
 		return 0;
 	}
 
-	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_WRITE,
-				  false, MAX_SCHEDULE_TIMEOUT);
-	if (r < 0)
-		return r;
-
 	r = ttm_bo_kmap(&bo->tbo, 0, bo->tbo.resource->num_pages, &bo->kmap);
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 3654326219e0..6eac649499d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -1164,7 +1164,7 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
 
 	if (direct) {
 		r = dma_resv_wait_timeout(bo->tbo.base.resv,
-					  DMA_RESV_USAGE_WRITE, false,
+					  DMA_RESV_USAGE_KERNEL, false,
 					  msecs_to_jiffies(10));
 		if (r == 0)
 			r = -ETIMEDOUT;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 07/16] drm/radeon: use DMA_RESV_USAGE_KERNEL
  2022-04-06  7:51 Christian König
                   ` (5 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 06/16] drm/amdgpu: use DMA_RESV_USAGE_KERNEL Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:43   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 08/16] drm/etnaviv: always wait for kernel fences Christian König
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

Always wait for kernel fences before kmap and not only for UVD kmaps.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/radeon/radeon_object.c |  7 ++++++-
 drivers/gpu/drm/radeon/radeon_uvd.c    | 12 ++----------
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index cb5c4aa45cef..6c4a6802ca96 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -219,7 +219,12 @@ int radeon_bo_create(struct radeon_device *rdev,
 int radeon_bo_kmap(struct radeon_bo *bo, void **ptr)
 {
 	bool is_iomem;
-	int r;
+	long r;
+
+	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_KERNEL,
+				  false, MAX_SCHEDULE_TIMEOUT);
+	if (r < 0)
+		return r;
 
 	if (bo->kptr) {
 		if (ptr) {
diff --git a/drivers/gpu/drm/radeon/radeon_uvd.c b/drivers/gpu/drm/radeon/radeon_uvd.c
index a50750740ab0..a2cda184b2b2 100644
--- a/drivers/gpu/drm/radeon/radeon_uvd.c
+++ b/drivers/gpu/drm/radeon/radeon_uvd.c
@@ -470,24 +470,16 @@ static int radeon_uvd_cs_msg(struct radeon_cs_parser *p, struct radeon_bo *bo,
 	int32_t *msg, msg_type, handle;
 	unsigned img_size = 0;
 	void *ptr;
-	long r;
-	int i;
+	int i, r;
 
 	if (offset & 0x3F) {
 		DRM_ERROR("UVD messages must be 64 byte aligned!\n");
 		return -EINVAL;
 	}
 
-	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_WRITE,
-				  false, MAX_SCHEDULE_TIMEOUT);
-	if (r <= 0) {
-		DRM_ERROR("Failed waiting for UVD message (%ld)!\n", r);
-		return r ? r : -ETIME;
-	}
-
 	r = radeon_bo_kmap(bo, &ptr);
 	if (r) {
-		DRM_ERROR("Failed mapping the UVD message (%ld)!\n", r);
+		DRM_ERROR("Failed mapping the UVD message (%d)!\n", r);
 		return r;
 	}
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 08/16] drm/etnaviv: always wait for kernel fences
  2022-04-06  7:51 Christian König
                   ` (6 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 07/16] drm/radeon: " Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:46   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 09/16] drm/nouveau: only wait for kernel fences in nouveau_bo_vm_cleanup Christian König
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

Even for explicit synchronization we should wait for kernel fences.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 27 ++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 98bb5c9239de..3fedd29732d5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -171,6 +171,26 @@ static int submit_lock_objects(struct etnaviv_gem_submit *submit,
 	return ret;
 }
 
+/* TODO: This should be moved into the GPU scheduler if others need it */
+static int submit_fence_kernel_sync(struct etnaviv_gem_submit *submit,
+				    struct dma_resv *resv)
+{
+	struct dma_resv_iter cursor;
+	struct dma_fence *fence;
+	int ret;
+
+	dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_KERNEL, fence) {
+		/* Make sure to grab an additional ref on the added fence */
+		dma_fence_get(fence);
+		ret = drm_sched_job_add_dependency(&submit->sched_job, fence);
+		if (ret) {
+			dma_fence_put(fence);
+			return ret;
+		}
+	}
+	return 0;
+}
+
 static int submit_fence_sync(struct etnaviv_gem_submit *submit)
 {
 	int i, ret = 0;
@@ -183,8 +203,11 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
 		if (ret)
 			return ret;
 
-		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
-			continue;
+		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT) {
+			ret = submit_fence_kernel_sync(submit, robj);
+			if (ret)
+				return ret;
+		}
 
 		ret = drm_sched_job_add_implicit_dependencies(&submit->sched_job,
 							      &bo->obj->base,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 09/16] drm/nouveau: only wait for kernel fences in nouveau_bo_vm_cleanup
  2022-04-06  7:51 Christian König
                   ` (7 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 08/16] drm/etnaviv: always wait for kernel fences Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:47   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 10/16] RDMA: use DMA_RESV_USAGE_KERNEL Christian König
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

Don't wait for user space submissions. I'm not 100% sure if that is
correct, but it seems to match what the code initially intended.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 05076e530e7d..13deb6c70ba6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -962,10 +962,10 @@ nouveau_bo_vm_cleanup(struct ttm_buffer_object *bo,
 	struct dma_fence *fence;
 	int ret;
 
-	ret = dma_resv_get_singleton(bo->base.resv, DMA_RESV_USAGE_WRITE,
+	ret = dma_resv_get_singleton(bo->base.resv, DMA_RESV_USAGE_KERNEL,
 				     &fence);
 	if (ret)
-		dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_WRITE,
+		dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_KERNEL,
 				      false, MAX_SCHEDULE_TIMEOUT);
 
 	nv10_bo_put_tile_region(dev, *old_tile, fence);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 10/16] RDMA: use DMA_RESV_USAGE_KERNEL
  2022-04-06  7:51 Christian König
                   ` (8 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 09/16] drm/nouveau: only wait for kernel fences in nouveau_bo_vm_cleanup Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:48   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 11/16] dma-buf: add DMA_RESV_USAGE_BOOKKEEP v3 Christian König
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

We only need to wait for kernel submissions here.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/infiniband/core/umem_dmabuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index f9901d273b8e..fce80a4a5147 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -68,7 +68,7 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf)
 	 * the migration.
 	 */
 	return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv,
-				     DMA_RESV_USAGE_WRITE,
+				     DMA_RESV_USAGE_KERNEL,
 				     false, MAX_SCHEDULE_TIMEOUT);
 }
 EXPORT_SYMBOL(ib_umem_dmabuf_map_pages);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 11/16] dma-buf: add DMA_RESV_USAGE_BOOKKEEP v3
  2022-04-06  7:51 Christian König
                   ` (9 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 10/16] RDMA: use DMA_RESV_USAGE_KERNEL Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06  7:51 ` [PATCH 12/16] drm/amdgpu: use DMA_RESV_USAGE_BOOKKEEP Christian König
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

Add an usage for submissions independent of implicit sync but still
interesting for memory management.

v2: cleanup the kerneldoc a bit
v3: separate amdgpu changes from this

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/dma-buf/dma-resv.c                  |  4 ++--
 drivers/dma-buf/st-dma-resv.c               |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c     |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c    |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c      |  6 +++---
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c    |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  2 +-
 drivers/gpu/drm/qxl/qxl_debugfs.c           |  2 +-
 drivers/gpu/drm/radeon/radeon_gem.c         |  2 +-
 drivers/gpu/drm/radeon/radeon_mn.c          |  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c                | 14 +++++++-------
 include/linux/dma-resv.h                    | 13 ++++++++++++-
 14 files changed, 35 insertions(+), 24 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index f4860e5f2d8b..5b64aa554c36 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -520,7 +520,7 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
 
 	list = NULL;
 
-	dma_resv_iter_begin(&cursor, src, DMA_RESV_USAGE_READ);
+	dma_resv_iter_begin(&cursor, src, DMA_RESV_USAGE_BOOKKEEP);
 	dma_resv_for_each_fence_unlocked(&cursor, f) {
 
 		if (dma_resv_iter_is_restarted(&cursor)) {
@@ -726,7 +726,7 @@ EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
  */
 void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
 {
-	static const char *usage[] = { "kernel", "write", "read" };
+	static const char *usage[] = { "kernel", "write", "read", "bookkeep" };
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
index 062b57d63fa6..8ace9e84c845 100644
--- a/drivers/dma-buf/st-dma-resv.c
+++ b/drivers/dma-buf/st-dma-resv.c
@@ -296,7 +296,7 @@ int dma_resv(void)
 	int r;
 
 	spin_lock_init(&fence_lock);
-	for (usage = DMA_RESV_USAGE_KERNEL; usage <= DMA_RESV_USAGE_READ;
+	for (usage = DMA_RESV_USAGE_KERNEL; usage <= DMA_RESV_USAGE_BOOKKEEP;
 	     ++usage) {
 		r = subtests(tests, (void *)(unsigned long)usage);
 		if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index 65998cbcd7f7..4ba4b54092f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -111,7 +111,7 @@ void amdgpu_pasid_free_delayed(struct dma_resv *resv,
 	struct dma_fence *fence;
 	int r;
 
-	r = dma_resv_get_singleton(resv, DMA_RESV_USAGE_READ, &fence);
+	r = dma_resv_get_singleton(resv, DMA_RESV_USAGE_BOOKKEEP, &fence);
 	if (r)
 		goto fallback;
 
@@ -139,7 +139,7 @@ void amdgpu_pasid_free_delayed(struct dma_resv *resv,
 	/* Not enough memory for the delayed delete, as last resort
 	 * block for all the fences to complete.
 	 */
-	dma_resv_wait_timeout(resv, DMA_RESV_USAGE_READ,
+	dma_resv_wait_timeout(resv, DMA_RESV_USAGE_BOOKKEEP,
 			      false, MAX_SCHEDULE_TIMEOUT);
 	amdgpu_pasid_free(pasid);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 86f5248676b0..b86c0b8252a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -75,7 +75,7 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_interval_notifier *mni,
 
 	mmu_interval_set_seq(mni, cur_seq);
 
-	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_READ,
+	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_BOOKKEEP,
 				  false, MAX_SCHEDULE_TIMEOUT);
 	mutex_unlock(&adev->notifier_lock);
 	if (r <= 0)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 744e144e5fc2..11c46b3e4c60 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -260,7 +260,7 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
 		return -EINVAL;
 
 	/* TODO: Use DMA_RESV_USAGE_READ here */
-	dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_READ, f) {
+	dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_BOOKKEEP, f) {
 		dma_fence_chain_for_each(f, f) {
 			struct dma_fence *tmp = dma_fence_chain_contained(f);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 5db5066e74b4..49ffad312d5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1345,7 +1345,7 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 	 * be resident to run successfully
 	 */
 	dma_resv_for_each_fence(&resv_cursor, bo->base.resv,
-				DMA_RESV_USAGE_READ, f) {
+				DMA_RESV_USAGE_BOOKKEEP, f) {
 		if (amdkfd_fence_check_mm(f, current->mm))
 			return false;
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a0376fd36a82..5277c10d901d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2059,7 +2059,7 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
-	dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_READ, fence) {
+	dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_BOOKKEEP, fence) {
 		/* Add a callback for each fence in the reservation object */
 		amdgpu_vm_prt_get(adev);
 		amdgpu_vm_add_prt_cb(adev, fence);
@@ -2665,7 +2665,7 @@ bool amdgpu_vm_evictable(struct amdgpu_bo *bo)
 		return true;
 
 	/* Don't evict VM page tables while they are busy */
-	if (!dma_resv_test_signaled(bo->tbo.base.resv, DMA_RESV_USAGE_READ))
+	if (!dma_resv_test_signaled(bo->tbo.base.resv, DMA_RESV_USAGE_BOOKKEEP))
 		return false;
 
 	/* Try to block ongoing updates */
@@ -2846,7 +2846,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
 long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout)
 {
 	timeout = dma_resv_wait_timeout(vm->root.bo->tbo.base.resv,
-					DMA_RESV_USAGE_READ,
+					DMA_RESV_USAGE_BOOKKEEP,
 					true, timeout);
 	if (timeout <= 0)
 		return timeout;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index a200d3e66573..4115a222a853 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -66,7 +66,7 @@ bool __i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 	struct intel_memory_region *mr = READ_ONCE(obj->mm.region);
 
 #ifdef CONFIG_LOCKDEP
-	GEM_WARN_ON(dma_resv_test_signaled(obj->base.resv, DMA_RESV_USAGE_READ) &&
+	GEM_WARN_ON(dma_resv_test_signaled(obj->base.resv, DMA_RESV_USAGE_BOOKKEEP) &&
 		    i915_gem_object_evictable(obj));
 #endif
 	return mr && (mr->type == INTEL_MEMORY_LOCAL ||
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 644fe237601c..094f06b4ce33 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -86,7 +86,7 @@ static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
 		return true;
 
 	/* we will unbind on next submission, still have userptr pins */
-	r = dma_resv_wait_timeout(obj->base.resv, DMA_RESV_USAGE_READ, false,
+	r = dma_resv_wait_timeout(obj->base.resv, DMA_RESV_USAGE_BOOKKEEP, false,
 				  MAX_SCHEDULE_TIMEOUT);
 	if (r <= 0)
 		drm_err(&i915->drm, "(%ld) failed to wait for idle\n", r);
diff --git a/drivers/gpu/drm/qxl/qxl_debugfs.c b/drivers/gpu/drm/qxl/qxl_debugfs.c
index 33e5889d6608..2d9ed3b94574 100644
--- a/drivers/gpu/drm/qxl/qxl_debugfs.c
+++ b/drivers/gpu/drm/qxl/qxl_debugfs.c
@@ -62,7 +62,7 @@ qxl_debugfs_buffers_info(struct seq_file *m, void *data)
 		int rel = 0;
 
 		dma_resv_iter_begin(&cursor, bo->tbo.base.resv,
-				    DMA_RESV_USAGE_READ);
+				    DMA_RESV_USAGE_BOOKKEEP);
 		dma_resv_for_each_fence_unlocked(&cursor, fence) {
 			if (dma_resv_iter_is_restarted(&cursor))
 				rel = 0;
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 6616a828f40b..8c01a7f0e027 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -163,7 +163,7 @@ static int radeon_gem_set_domain(struct drm_gem_object *gobj,
 	if (domain == RADEON_GEM_DOMAIN_CPU) {
 		/* Asking for cpu access wait for object idle */
 		r = dma_resv_wait_timeout(robj->tbo.base.resv,
-					  DMA_RESV_USAGE_READ,
+					  DMA_RESV_USAGE_BOOKKEEP,
 					  true, 30 * HZ);
 		if (!r)
 			r = -EBUSY;
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index 68ebeb1bdfff..29fe8423bd90 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -66,7 +66,7 @@ static bool radeon_mn_invalidate(struct mmu_interval_notifier *mn,
 		return true;
 	}
 
-	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_READ,
+	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_BOOKKEEP,
 				  false, MAX_SCHEDULE_TIMEOUT);
 	if (r <= 0)
 		DRM_ERROR("(%ld) failed to wait for user bo\n", r);
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 6bf3fb1c8045..360f980c7e10 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -223,7 +223,7 @@ static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
 	struct dma_resv_iter cursor;
 	struct dma_fence *fence;
 
-	dma_resv_iter_begin(&cursor, resv, DMA_RESV_USAGE_READ);
+	dma_resv_iter_begin(&cursor, resv, DMA_RESV_USAGE_BOOKKEEP);
 	dma_resv_for_each_fence_unlocked(&cursor, fence) {
 		if (!fence->ops->signaled)
 			dma_fence_enable_sw_signaling(fence);
@@ -252,7 +252,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 	struct dma_resv *resv = &bo->base._resv;
 	int ret;
 
-	if (dma_resv_test_signaled(resv, DMA_RESV_USAGE_READ))
+	if (dma_resv_test_signaled(resv, DMA_RESV_USAGE_BOOKKEEP))
 		ret = 0;
 	else
 		ret = -EBUSY;
@@ -264,7 +264,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 			dma_resv_unlock(bo->base.resv);
 		spin_unlock(&bo->bdev->lru_lock);
 
-		lret = dma_resv_wait_timeout(resv, DMA_RESV_USAGE_READ,
+		lret = dma_resv_wait_timeout(resv, DMA_RESV_USAGE_BOOKKEEP,
 					     interruptible,
 					     30 * HZ);
 
@@ -369,7 +369,7 @@ static void ttm_bo_release(struct kref *kref)
 			 * fences block for the BO to become idle
 			 */
 			dma_resv_wait_timeout(bo->base.resv,
-					      DMA_RESV_USAGE_READ, false,
+					      DMA_RESV_USAGE_BOOKKEEP, false,
 					      30 * HZ);
 		}
 
@@ -380,7 +380,7 @@ static void ttm_bo_release(struct kref *kref)
 		ttm_mem_io_free(bdev, bo->resource);
 	}
 
-	if (!dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_READ) ||
+	if (!dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP) ||
 	    !dma_resv_trylock(bo->base.resv)) {
 		/* The BO is not idle, resurrect it for delayed destroy */
 		ttm_bo_flush_all_fences(bo);
@@ -1046,13 +1046,13 @@ int ttm_bo_wait(struct ttm_buffer_object *bo,
 	long timeout = 15 * HZ;
 
 	if (no_wait) {
-		if (dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_READ))
+		if (dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP))
 			return 0;
 		else
 			return -EBUSY;
 	}
 
-	timeout = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_READ,
+	timeout = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP,
 					interruptible, timeout);
 	if (timeout < 0)
 		return timeout;
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index a749f229ae91..1db759eacc98 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -55,7 +55,7 @@ struct dma_resv_list;
  * This enum describes the different use cases for a dma_resv object and
  * controls which fences are returned when queried.
  *
- * An important fact is that there is the order KERNEL<WRITE<READ and
+ * An important fact is that there is the order KERNEL<WRITE<READ<BOOKKEEP and
  * when the dma_resv object is asked for fences for one use case the fences
  * for the lower use case are returned as well.
  *
@@ -93,6 +93,17 @@ enum dma_resv_usage {
 	 * an implicit read dependency.
 	 */
 	DMA_RESV_USAGE_READ,
+
+	/**
+	 * @DMA_RESV_USAGE_BOOKKEEP: No implicit sync.
+	 *
+	 * This should be used by submissions which don't want to participate in
+	 * implicit synchronization.
+	 *
+	 * The most common case are preemption fences as well as page table
+	 * updates and their TLB flushes.
+	 */
+	DMA_RESV_USAGE_BOOKKEEP
 };
 
 /**
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 12/16] drm/amdgpu: use DMA_RESV_USAGE_BOOKKEEP
  2022-04-06  7:51 Christian König
                   ` (10 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 11/16] dma-buf: add DMA_RESV_USAGE_BOOKKEEP v3 Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06  7:51 ` [PATCH 13/16] dma-buf: wait for map to complete for static attachments Christian König
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

Use DMA_RESV_USAGE_BOOKKEEP for VM page table updates and KFD preemption fence.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c      | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 5031e26e6716..808e21dcb517 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -263,7 +263,7 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
 	 */
 	replacement = dma_fence_get_stub();
 	dma_resv_replace_fences(bo->tbo.base.resv, ef->base.context,
-				replacement, DMA_RESV_USAGE_READ);
+				replacement, DMA_RESV_USAGE_BOOKKEEP);
 	dma_fence_put(replacement);
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index dbb551762805..9485b541947e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -112,7 +112,8 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,
 		swap(p->vm->last_unlocked, f);
 		dma_fence_put(tmp);
 	} else {
-		amdgpu_bo_fence(p->vm->root.bo, f, true);
+		dma_resv_add_fence(p->vm->root.bo->tbo.base.resv, f,
+				   DMA_RESV_USAGE_BOOKKEEP);
 	}
 
 	if (fence && !p->immediate)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 13/16] dma-buf: wait for map to complete for static attachments
  2022-04-06  7:51 Christian König
                   ` (11 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 12/16] drm/amdgpu: use DMA_RESV_USAGE_BOOKKEEP Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06  7:51 ` [PATCH 14/16] drm/i915: drop bo->moving dependency Christian König
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

We have previously done that in the individual drivers but it is
more defensive to move that into the common code.

Dynamic attachments should wait for map operations to complete by themselves.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/dma-buf/dma-buf.c                   | 18 +++++++++++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 14 +-------------
 drivers/gpu/drm/nouveau/nouveau_prime.c     | 17 +----------------
 drivers/gpu/drm/radeon/radeon_prime.c       | 16 +++-------------
 4 files changed, 20 insertions(+), 45 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 1cddb65eafda..79795857be3e 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -661,12 +661,24 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach,
 				       enum dma_data_direction direction)
 {
 	struct sg_table *sg_table;
+	signed long ret;
 
 	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
+	if (IS_ERR_OR_NULL(sg_table))
+		return sg_table;
+
+	if (!dma_buf_attachment_is_dynamic(attach)) {
+		ret = dma_resv_wait_timeout(attach->dmabuf->resv,
+					    DMA_RESV_USAGE_KERNEL, true,
+					    MAX_SCHEDULE_TIMEOUT);
+		if (ret < 0) {
+			attach->dmabuf->ops->unmap_dma_buf(attach, sg_table,
+							   direction);
+			return ERR_PTR(ret);
+		}
+	}
 
-	if (!IS_ERR_OR_NULL(sg_table))
-		mangle_sg_table(sg_table);
-
+	mangle_sg_table(sg_table);
 	return sg_table;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 579adfafe4d0..782cbca37538 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -102,21 +102,9 @@ static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach)
 {
 	struct drm_gem_object *obj = attach->dmabuf->priv;
 	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
-	int r;
 
 	/* pin buffer into GTT */
-	r = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT);
-	if (r)
-		return r;
-
-	if (bo->tbo.moving) {
-		r = dma_fence_wait(bo->tbo.moving, true);
-		if (r) {
-			amdgpu_bo_unpin(bo);
-			return r;
-		}
-	}
-	return 0;
+	return amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT);
 }
 
 /**
diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c b/drivers/gpu/drm/nouveau/nouveau_prime.c
index 60019d0532fc..347488685f74 100644
--- a/drivers/gpu/drm/nouveau/nouveau_prime.c
+++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
@@ -93,22 +93,7 @@ int nouveau_gem_prime_pin(struct drm_gem_object *obj)
 	if (ret)
 		return -EINVAL;
 
-	ret = ttm_bo_reserve(&nvbo->bo, false, false, NULL);
-	if (ret)
-		goto error;
-
-	if (nvbo->bo.moving)
-		ret = dma_fence_wait(nvbo->bo.moving, true);
-
-	ttm_bo_unreserve(&nvbo->bo);
-	if (ret)
-		goto error;
-
-	return ret;
-
-error:
-	nouveau_bo_unpin(nvbo);
-	return ret;
+	return 0;
 }
 
 void nouveau_gem_prime_unpin(struct drm_gem_object *obj)
diff --git a/drivers/gpu/drm/radeon/radeon_prime.c b/drivers/gpu/drm/radeon/radeon_prime.c
index 4a90807351e7..42a87948e28c 100644
--- a/drivers/gpu/drm/radeon/radeon_prime.c
+++ b/drivers/gpu/drm/radeon/radeon_prime.c
@@ -77,19 +77,9 @@ int radeon_gem_prime_pin(struct drm_gem_object *obj)
 
 	/* pin buffer into GTT */
 	ret = radeon_bo_pin(bo, RADEON_GEM_DOMAIN_GTT, NULL);
-	if (unlikely(ret))
-		goto error;
-
-	if (bo->tbo.moving) {
-		ret = dma_fence_wait(bo->tbo.moving, false);
-		if (unlikely(ret)) {
-			radeon_bo_unpin(bo);
-			goto error;
-		}
-	}
-
-	bo->prime_shared_count++;
-error:
+	if (likely(ret == 0))
+		bo->prime_shared_count++;
+
 	radeon_bo_unreserve(bo);
 	return ret;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 14/16] drm/i915: drop bo->moving dependency
  2022-04-06  7:51 Christian König
                   ` (12 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 13/16] dma-buf: wait for map to complete for static attachments Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 13:24   ` Matthew Auld
  2022-04-06  7:51 ` [PATCH 15/16] drm/ttm: remove bo->moving Christian König
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: intel-gfx, Christian König

That should now be handled by the common dma_resv framework.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 41 ++++---------------
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  8 +---
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 15 +------
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  3 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  3 +-
 drivers/gpu/drm/i915/i915_vma.c               |  9 +++-
 6 files changed, 21 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 372bc220faeb..ffde7bc0a95d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -741,30 +741,19 @@ static const struct drm_gem_object_funcs i915_gem_object_funcs = {
 /**
  * i915_gem_object_get_moving_fence - Get the object's moving fence if any
  * @obj: The object whose moving fence to get.
+ * @fence: The resulting fence
  *
  * A non-signaled moving fence means that there is an async operation
  * pending on the object that needs to be waited on before setting up
  * any GPU- or CPU PTEs to the object's pages.
  *
- * Return: A refcounted pointer to the object's moving fence if any,
- * NULL otherwise.
+ * Return: Negative error code or 0 for success.
  */
-struct dma_fence *
-i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj)
+int i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj,
+				     struct dma_fence **fence)
 {
-	return dma_fence_get(i915_gem_to_ttm(obj)->moving);
-}
-
-void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
-				      struct dma_fence *fence)
-{
-	struct dma_fence **moving = &i915_gem_to_ttm(obj)->moving;
-
-	if (*moving == fence)
-		return;
-
-	dma_fence_put(*moving);
-	*moving = dma_fence_get(fence);
+	return dma_resv_get_singleton(obj->base.resv, DMA_RESV_USAGE_KERNEL,
+				      fence);
 }
 
 /**
@@ -782,23 +771,9 @@ void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
 int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,
 				      bool intr)
 {
-	struct dma_fence *fence = i915_gem_to_ttm(obj)->moving;
-	int ret;
-
 	assert_object_held(obj);
-	if (!fence)
-		return 0;
-
-	ret = dma_fence_wait(fence, intr);
-	if (ret)
-		return ret;
-
-	if (fence->error)
-		return fence->error;
-
-	i915_gem_to_ttm(obj)->moving = NULL;
-	dma_fence_put(fence);
-	return 0;
+	return dma_resv_wait_timeout(obj->base. resv, DMA_RESV_USAGE_KERNEL,
+				     intr, MAX_SCHEDULE_TIMEOUT);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 02c37fe4a535..e11d82a9f7c3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -520,12 +520,8 @@ i915_gem_object_finish_access(struct drm_i915_gem_object *obj)
 	i915_gem_object_unpin_pages(obj);
 }
 
-struct dma_fence *
-i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj);
-
-void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
-				      struct dma_fence *fence);
-
+int i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj,
+				     struct dma_fence **fence);
 int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,
 				      bool intr);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
index 438b8a95b3d1..a10716f4e717 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
@@ -467,19 +467,6 @@ __i915_ttm_move(struct ttm_buffer_object *bo,
 	return fence;
 }
 
-static int
-prev_deps(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
-	  struct i915_deps *deps)
-{
-	int ret;
-
-	ret = i915_deps_add_dependency(deps, bo->moving, ctx);
-	if (!ret)
-		ret = i915_deps_add_resv(deps, bo->base.resv, ctx);
-
-	return ret;
-}
-
 /**
  * i915_ttm_move - The TTM move callback used by i915.
  * @bo: The buffer object.
@@ -534,7 +521,7 @@ int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
 		struct i915_deps deps;
 
 		i915_deps_init(&deps, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
-		ret = prev_deps(bo, ctx, &deps);
+		ret = i915_deps_add_resv(&deps, bo->base.resv, ctx);
 		if (ret) {
 			i915_refct_sgt_put(dst_rsgt);
 			return ret;
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
index 4997ed18b6e4..0ad443a90c8b 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
@@ -219,8 +219,7 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
 			err = dma_resv_reserve_fences(obj->base.resv, 1);
 			if (!err)
 				dma_resv_add_fence(obj->base.resv, &rq->fence,
-						   DMA_RESV_USAGE_WRITE);
-			i915_gem_object_set_moving_fence(obj, &rq->fence);
+						   DMA_RESV_USAGE_KERNEL);
 			i915_request_put(rq);
 		}
 		if (err)
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 3a6e3f6d239f..dfc34cc2ef8c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -1221,8 +1221,7 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
 	i915_gem_object_unpin_pages(obj);
 	if (rq) {
 		dma_resv_add_fence(obj->base.resv, &rq->fence,
-				   DMA_RESV_USAGE_WRITE);
-		i915_gem_object_set_moving_fence(obj, &rq->fence);
+				   DMA_RESV_USAGE_KERNEL);
 		i915_request_put(rq);
 	}
 	i915_gem_object_unlock(obj);
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 524477d8939e..d077f7b9eaad 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1357,10 +1357,17 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	if (err)
 		return err;
 
+	if (vma->obj) {
+		err = i915_gem_object_get_moving_fence(vma->obj, &moving);
+		if (err)
+			return err;
+	} else {
+		moving = NULL;
+	}
+
 	if (flags & PIN_GLOBAL)
 		wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
 
-	moving = vma->obj ? i915_gem_object_get_moving_fence(vma->obj) : NULL;
 	if (flags & vma->vm->bind_async_flags || moving) {
 		/* lock VM */
 		err = i915_vm_lock_objects(vma->vm, ww);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 15/16] drm/ttm: remove bo->moving
  2022-04-06  7:51 Christian König
                   ` (13 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 14/16] drm/i915: drop bo->moving dependency Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 12:52   ` Daniel Vetter
  2022-04-06  7:51 ` [PATCH 16/16] dma-buf: drop seq count based update Christian König
  2022-04-06 12:59 ` Daniel Vetter
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

This is now handled by the DMA-buf framework in the dma_resv obj.

Also remove the workaround inside VMWGFX to update the moving fence.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 13 ++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  5 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c    | 11 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c   | 11 ++++--
 drivers/gpu/drm/ttm/ttm_bo.c                  | 10 ++----
 drivers/gpu/drm/ttm/ttm_bo_util.c             |  7 ----
 drivers/gpu/drm/ttm/ttm_bo_vm.c               | 34 +++++++------------
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c      |  6 ----
 include/drm/ttm/ttm_bo_api.h                  |  2 --
 9 files changed, 39 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 808e21dcb517..a4955ef76cfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2447,6 +2447,8 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
 		struct amdgpu_bo *bo = mem->bo;
 		uint32_t domain = mem->domain;
 		struct kfd_mem_attachment *attachment;
+		struct dma_resv_iter cursor;
+		struct dma_fence *fence;
 
 		total_size += amdgpu_bo_size(bo);
 
@@ -2461,10 +2463,13 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
 				goto validate_map_fail;
 			}
 		}
-		ret = amdgpu_sync_fence(&sync_obj, bo->tbo.moving);
-		if (ret) {
-			pr_debug("Memory eviction: Sync BO fence failed. Try again\n");
-			goto validate_map_fail;
+		dma_resv_for_each_fence(&cursor, bo->tbo.base.resv,
+					DMA_RESV_USAGE_KERNEL, fence) {
+			ret = amdgpu_sync_fence(&sync_obj, fence);
+			if (ret) {
+				pr_debug("Memory eviction: Sync BO fence failed. Try again\n");
+				goto validate_map_fail;
+			}
 		}
 		list_for_each_entry(attachment, &mem->attachments, list) {
 			if (!attachment->is_mapped)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 5832c05ab10d..ef93abec13b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -612,9 +612,8 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 		if (unlikely(r))
 			goto fail_unreserve;
 
-		amdgpu_bo_fence(bo, fence, false);
-		dma_fence_put(bo->tbo.moving);
-		bo->tbo.moving = dma_fence_get(fence);
+		dma_resv_add_fence(bo->tbo.base.resv, fence,
+				   DMA_RESV_USAGE_KERNEL);
 		dma_fence_put(fence);
 	}
 	if (!bp->resv)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
index e3fbf0f10add..31913ae86de6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
@@ -74,13 +74,12 @@ static int amdgpu_vm_cpu_update(struct amdgpu_vm_update_params *p,
 {
 	unsigned int i;
 	uint64_t value;
-	int r;
+	long r;
 
-	if (vmbo->bo.tbo.moving) {
-		r = dma_fence_wait(vmbo->bo.tbo.moving, true);
-		if (r)
-			return r;
-	}
+	r = dma_resv_wait_timeout(vmbo->bo.tbo.base.resv, DMA_RESV_USAGE_KERNEL,
+				  true, MAX_SCHEDULE_TIMEOUT);
+	if (r < 0)
+		return r;
 
 	pe += (unsigned long)amdgpu_bo_kptr(&vmbo->bo);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index 9485b541947e..9cd6f41896c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -205,14 +205,19 @@ static int amdgpu_vm_sdma_update(struct amdgpu_vm_update_params *p,
 	struct amdgpu_bo *bo = &vmbo->bo;
 	enum amdgpu_ib_pool_type pool = p->immediate ? AMDGPU_IB_POOL_IMMEDIATE
 		: AMDGPU_IB_POOL_DELAYED;
+	struct dma_resv_iter cursor;
 	unsigned int i, ndw, nptes;
+	struct dma_fence *fence;
 	uint64_t *pte;
 	int r;
 
 	/* Wait for PD/PT moves to be completed */
-	r = amdgpu_sync_fence(&p->job->sync, bo->tbo.moving);
-	if (r)
-		return r;
+	dma_resv_for_each_fence(&cursor, bo->tbo.base.resv,
+				DMA_RESV_USAGE_KERNEL, fence) {
+		r = amdgpu_sync_fence(&p->job->sync, fence);
+		if (r)
+			return r;
+	}
 
 	do {
 		ndw = p->num_dw_left;
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 360f980c7e10..015a94f766de 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -418,7 +418,6 @@ static void ttm_bo_release(struct kref *kref)
 	dma_resv_unlock(bo->base.resv);
 
 	atomic_dec(&ttm_glob.bo_count);
-	dma_fence_put(bo->moving);
 	bo->destroy(bo);
 }
 
@@ -714,9 +713,8 @@ void ttm_bo_unpin(struct ttm_buffer_object *bo)
 EXPORT_SYMBOL(ttm_bo_unpin);
 
 /*
- * Add the last move fence to the BO and reserve a new shared slot. We only use
- * a shared slot to avoid unecessary sync and rely on the subsequent bo move to
- * either stall or use an exclusive fence respectively set bo->moving.
+ * Add the last move fence to the BO as kernel dependency and reserve a new
+ * fence slot.
  */
 static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
 				 struct ttm_resource_manager *man,
@@ -746,9 +744,6 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
 		dma_fence_put(fence);
 		return ret;
 	}
-
-	dma_fence_put(bo->moving);
-	bo->moving = fence;
 	return 0;
 }
 
@@ -951,7 +946,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
 	bo->bdev = bdev;
 	bo->type = type;
 	bo->page_alignment = page_alignment;
-	bo->moving = NULL;
 	bo->pin_count = 0;
 	bo->sg = sg;
 	bo->bulk_move = NULL;
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 99deb45894f4..bc5190340b9c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -228,7 +228,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
 
 	atomic_inc(&ttm_glob.bo_count);
 	INIT_LIST_HEAD(&fbo->base.ddestroy);
-	fbo->base.moving = NULL;
 	drm_vma_node_reset(&fbo->base.base.vma_node);
 
 	kref_init(&fbo->base.kref);
@@ -500,9 +499,6 @@ static int ttm_bo_move_to_ghost(struct ttm_buffer_object *bo,
 	 * operation has completed.
 	 */
 
-	dma_fence_put(bo->moving);
-	bo->moving = dma_fence_get(fence);
-
 	ret = ttm_buffer_object_transfer(bo, &ghost_obj);
 	if (ret)
 		return ret;
@@ -546,9 +542,6 @@ static void ttm_bo_move_pipeline_evict(struct ttm_buffer_object *bo,
 	spin_unlock(&from->move_lock);
 
 	ttm_resource_free(bo, &bo->resource);
-
-	dma_fence_put(bo->moving);
-	bo->moving = dma_fence_get(fence);
 }
 
 int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 08ba083a80d2..5b324f245265 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -46,17 +46,13 @@
 static vm_fault_t ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo,
 				struct vm_fault *vmf)
 {
-	vm_fault_t ret = 0;
-	int err = 0;
-
-	if (likely(!bo->moving))
-		goto out_unlock;
+	long err = 0;
 
 	/*
 	 * Quick non-stalling check for idle.
 	 */
-	if (dma_fence_is_signaled(bo->moving))
-		goto out_clear;
+	if (dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_KERNEL))
+		return 0;
 
 	/*
 	 * If possible, avoid waiting for GPU with mmap_lock
@@ -64,34 +60,30 @@ static vm_fault_t ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo,
 	 * is the first attempt.
 	 */
 	if (fault_flag_allow_retry_first(vmf->flags)) {
-		ret = VM_FAULT_RETRY;
 		if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
-			goto out_unlock;
+			return VM_FAULT_RETRY;
 
 		ttm_bo_get(bo);
 		mmap_read_unlock(vmf->vma->vm_mm);
-		(void) dma_fence_wait(bo->moving, true);
+		(void)dma_resv_wait_timeout(bo->base.resv,
+					    DMA_RESV_USAGE_KERNEL, true,
+					    MAX_SCHEDULE_TIMEOUT);
 		dma_resv_unlock(bo->base.resv);
 		ttm_bo_put(bo);
-		goto out_unlock;
+		return VM_FAULT_RETRY;
 	}
 
 	/*
 	 * Ordinary wait.
 	 */
-	err = dma_fence_wait(bo->moving, true);
-	if (unlikely(err != 0)) {
-		ret = (err != -ERESTARTSYS) ? VM_FAULT_SIGBUS :
+	err = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_KERNEL, true,
+				    MAX_SCHEDULE_TIMEOUT);
+	if (unlikely(err < 0)) {
+		return (err != -ERESTARTSYS) ? VM_FAULT_SIGBUS :
 			VM_FAULT_NOPAGE;
-		goto out_unlock;
 	}
 
-out_clear:
-	dma_fence_put(bo->moving);
-	bo->moving = NULL;
-
-out_unlock:
-	return ret;
+	return 0;
 }
 
 static unsigned long ttm_bo_io_mem_pfn(struct ttm_buffer_object *bo,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index a84d1d5628d0..a7d62a4eb47b 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -1161,12 +1161,6 @@ int vmw_resources_clean(struct vmw_buffer_object *vbo, pgoff_t start,
 		*num_prefault = __KERNEL_DIV_ROUND_UP(last_cleaned - res_start,
 						      PAGE_SIZE);
 		vmw_bo_fence_single(bo, NULL);
-		if (bo->moving)
-			dma_fence_put(bo->moving);
-
-		return dma_resv_get_singleton(bo->base.resv,
-					      DMA_RESV_USAGE_WRITE,
-					      &bo->moving);
 	}
 
 	return 0;
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index c76932b68a33..2d524f8b0802 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -94,7 +94,6 @@ struct ttm_tt;
  * @deleted: True if the object is only a zombie and already deleted.
  * @ddestroy: List head for the delayed destroy list.
  * @swap: List head for swap LRU list.
- * @moving: Fence set when BO is moving
  * @offset: The current GPU offset, which can have different meanings
  * depending on the memory type. For SYSTEM type memory, it should be 0.
  * @cur_placement: Hint of current placement.
@@ -147,7 +146,6 @@ struct ttm_buffer_object {
 	 * Members protected by a bo reservation.
 	 */
 
-	struct dma_fence *moving;
 	unsigned priority;
 	unsigned pin_count;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* [PATCH 16/16] dma-buf: drop seq count based update
  2022-04-06  7:51 Christian König
                   ` (14 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 15/16] drm/ttm: remove bo->moving Christian König
@ 2022-04-06  7:51 ` Christian König
  2022-04-06 13:00   ` Daniel Vetter
  2022-04-06 12:59 ` Daniel Vetter
  16 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-06  7:51 UTC (permalink / raw)
  To: daniel.vetter, dri-devel; +Cc: Christian König

This should be possible now since we don't have the distinction
between exclusive and shared fences any more.

The only possible pitfall is that a dma_fence would be reused during the
RCU grace period, but even that could be handled with a single extra check.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-resv.c    | 33 ++++++++++++---------------------
 drivers/dma-buf/st-dma-resv.c |  2 +-
 include/linux/dma-resv.h      | 12 ------------
 3 files changed, 13 insertions(+), 34 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 5b64aa554c36..0cce6e4ec946 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -133,7 +133,6 @@ static void dma_resv_list_free(struct dma_resv_list *list)
 void dma_resv_init(struct dma_resv *obj)
 {
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
-	seqcount_ww_mutex_init(&obj->seq, &obj->lock);
 
 	RCU_INIT_POINTER(obj->fences, NULL);
 }
@@ -292,28 +291,24 @@ void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
 	fobj = dma_resv_fences_list(obj);
 	count = fobj->num_fences;
 
-	write_seqcount_begin(&obj->seq);
-
 	for (i = 0; i < count; ++i) {
 		enum dma_resv_usage old_usage;
 
 		dma_resv_list_entry(fobj, i, obj, &old, &old_usage);
 		if ((old->context == fence->context && old_usage >= usage) ||
-		    dma_fence_is_signaled(old))
-			goto replace;
+		    dma_fence_is_signaled(old)) {
+			dma_resv_list_set(fobj, i, fence, usage);
+			dma_fence_put(old);
+			return;
+		}
 	}
 
 	BUG_ON(fobj->num_fences >= fobj->max_fences);
-	old = NULL;
 	count++;
 
-replace:
 	dma_resv_list_set(fobj, i, fence, usage);
 	/* pointer update must be visible before we extend the num_fences */
 	smp_store_mb(fobj->num_fences, count);
-
-	write_seqcount_end(&obj->seq);
-	dma_fence_put(old);
 }
 EXPORT_SYMBOL(dma_resv_add_fence);
 
@@ -341,7 +336,6 @@ void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
 	dma_resv_assert_held(obj);
 
 	list = dma_resv_fences_list(obj);
-	write_seqcount_begin(&obj->seq);
 	for (i = 0; list && i < list->num_fences; ++i) {
 		struct dma_fence *old;
 
@@ -352,14 +346,12 @@ void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
 		dma_resv_list_set(list, i, replacement, usage);
 		dma_fence_put(old);
 	}
-	write_seqcount_end(&obj->seq);
 }
 EXPORT_SYMBOL(dma_resv_replace_fences);
 
 /* Restart the unlocked iteration by initializing the cursor object. */
 static void dma_resv_iter_restart_unlocked(struct dma_resv_iter *cursor)
 {
-	cursor->seq = read_seqcount_begin(&cursor->obj->seq);
 	cursor->index = 0;
 	cursor->num_fences = 0;
 	cursor->fences = dma_resv_fences_list(cursor->obj);
@@ -388,8 +380,10 @@ static void dma_resv_iter_walk_unlocked(struct dma_resv_iter *cursor)
 				    cursor->obj, &cursor->fence,
 				    &cursor->fence_usage);
 		cursor->fence = dma_fence_get_rcu(cursor->fence);
-		if (!cursor->fence)
-			break;
+		if (!cursor->fence) {
+			dma_resv_iter_restart_unlocked(cursor);
+			continue;
+		}
 
 		if (!dma_fence_is_signaled(cursor->fence) &&
 		    cursor->usage >= cursor->fence_usage)
@@ -415,7 +409,7 @@ struct dma_fence *dma_resv_iter_first_unlocked(struct dma_resv_iter *cursor)
 	do {
 		dma_resv_iter_restart_unlocked(cursor);
 		dma_resv_iter_walk_unlocked(cursor);
-	} while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
+	} while (dma_resv_fences_list(cursor->obj) != cursor->fences);
 	rcu_read_unlock();
 
 	return cursor->fence;
@@ -438,13 +432,13 @@ struct dma_fence *dma_resv_iter_next_unlocked(struct dma_resv_iter *cursor)
 
 	rcu_read_lock();
 	cursor->is_restarted = false;
-	restart = read_seqcount_retry(&cursor->obj->seq, cursor->seq);
+	restart = dma_resv_fences_list(cursor->obj) != cursor->fences;
 	do {
 		if (restart)
 			dma_resv_iter_restart_unlocked(cursor);
 		dma_resv_iter_walk_unlocked(cursor);
 		restart = true;
-	} while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
+	} while (dma_resv_fences_list(cursor->obj) != cursor->fences);
 	rcu_read_unlock();
 
 	return cursor->fence;
@@ -540,10 +534,7 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
 	}
 	dma_resv_iter_end(&cursor);
 
-	write_seqcount_begin(&dst->seq);
 	list = rcu_replace_pointer(dst->fences, list, dma_resv_held(dst));
-	write_seqcount_end(&dst->seq);
-
 	dma_resv_list_free(list);
 	return 0;
 }
diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
index 8ace9e84c845..813779e3c9be 100644
--- a/drivers/dma-buf/st-dma-resv.c
+++ b/drivers/dma-buf/st-dma-resv.c
@@ -217,7 +217,7 @@ static int test_for_each_unlocked(void *arg)
 		if (r == -ENOENT) {
 			r = -EINVAL;
 			/* That should trigger an restart */
-			cursor.seq--;
+			cursor.fences = (void*)~0;
 		} else if (r == -EINVAL) {
 			r = 0;
 		}
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 1db759eacc98..c8ccbc94d5d2 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -155,15 +155,6 @@ struct dma_resv {
 	 */
 	struct ww_mutex lock;
 
-	/**
-	 * @seq:
-	 *
-	 * Sequence count for managing RCU read-side synchronization, allows
-	 * read-only access to @fences while ensuring we take a consistent
-	 * snapshot.
-	 */
-	seqcount_ww_mutex_t seq;
-
 	/**
 	 * @fences:
 	 *
@@ -202,9 +193,6 @@ struct dma_resv_iter {
 	/** @fence_usage: the usage of the current fence */
 	enum dma_resv_usage fence_usage;
 
-	/** @seq: sequence number to check for modifications */
-	unsigned int seq;
-
 	/** @index: index into the shared fences */
 	unsigned int index;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 65+ messages in thread

* Re: [PATCH 01/16] dma-buf/drivers: make reserving a shared slot mandatory v4
  2022-04-06  7:51 ` [PATCH 01/16] dma-buf/drivers: make reserving a shared slot mandatory v4 Christian König
@ 2022-04-06 12:21   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:21 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:17AM +0200, Christian König wrote:
> Audit all the users of dma_resv_add_excl_fence() and make sure they
> reserve a shared slot also when only trying to add an exclusive fence.
> 
> This is the next step towards handling the exclusive fence like a
> shared one.
> 
> v2: fix missed case in amdgpu
> v3: and two more radeon, rename function
> v4: add one more case to TTM, fix i915 after rebase
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>

Ok I think it looks all reasonable now and complete afaict. But it's
tricky stuff.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/dma-buf/dma-resv.c                    | 10 +--
>  drivers/dma-buf/st-dma-resv.c                 | 64 +++++++++----------
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  4 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  8 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  4 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c          |  2 +-
>  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |  8 +--
>  drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |  3 +-
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 10 ++-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  |  6 +-
>  .../drm/i915/gem/selftests/i915_gem_migrate.c |  5 +-
>  drivers/gpu/drm/i915/i915_vma.c               | 10 ++-
>  .../drm/i915/selftests/intel_memory_region.c  |  7 ++
>  drivers/gpu/drm/lima/lima_gem.c               | 10 ++-
>  drivers/gpu/drm/msm/msm_gem_submit.c          | 18 +++---
>  drivers/gpu/drm/nouveau/nouveau_fence.c       |  8 +--
>  drivers/gpu/drm/panfrost/panfrost_job.c       |  4 ++
>  drivers/gpu/drm/qxl/qxl_release.c             |  2 +-
>  drivers/gpu/drm/radeon/radeon_cs.c            |  4 ++
>  drivers/gpu/drm/radeon/radeon_object.c        |  8 +++
>  drivers/gpu/drm/radeon/radeon_vm.c            |  2 +-
>  drivers/gpu/drm/ttm/ttm_bo.c                  |  8 ++-
>  drivers/gpu/drm/ttm/ttm_bo_util.c             | 12 +++-
>  drivers/gpu/drm/ttm/ttm_execbuf_util.c        | 15 ++---
>  drivers/gpu/drm/v3d/v3d_gem.c                 | 15 +++--
>  drivers/gpu/drm/vc4/vc4_gem.c                 |  2 +-
>  drivers/gpu/drm/vgem/vgem_fence.c             | 12 ++--
>  drivers/gpu/drm/virtio/virtgpu_gem.c          |  9 +++
>  drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            | 16 +++--
>  include/linux/dma-resv.h                      |  4 +-
>  30 files changed, 176 insertions(+), 114 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 15ffac35439d..8c650b96357a 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -152,7 +152,7 @@ static inline struct dma_resv_list *dma_resv_shared_list(struct dma_resv *obj)
>  }
>  
>  /**
> - * dma_resv_reserve_shared - Reserve space to add shared fences to
> + * dma_resv_reserve_fences - Reserve space to add shared fences to
>   * a dma_resv.
>   * @obj: reservation object
>   * @num_fences: number of fences we want to add
> @@ -167,7 +167,7 @@ static inline struct dma_resv_list *dma_resv_shared_list(struct dma_resv *obj)
>   * RETURNS
>   * Zero for success, or -errno
>   */
> -int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences)
> +int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences)
>  {
>  	struct dma_resv_list *old, *new;
>  	unsigned int i, j, k, max;
> @@ -230,7 +230,7 @@ int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences)
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(dma_resv_reserve_shared);
> +EXPORT_SYMBOL(dma_resv_reserve_fences);
>  
>  #ifdef CONFIG_DEBUG_MUTEXES
>  /**
> @@ -238,7 +238,7 @@ EXPORT_SYMBOL(dma_resv_reserve_shared);
>   * @obj: the dma_resv object to reset
>   *
>   * Reset the number of pre-reserved shared slots to test that drivers do
> - * correct slot allocation using dma_resv_reserve_shared(). See also
> + * correct slot allocation using dma_resv_reserve_fences(). See also
>   * &dma_resv_list.shared_max.
>   */
>  void dma_resv_reset_shared_max(struct dma_resv *obj)
> @@ -260,7 +260,7 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
>   * @fence: the shared fence to add
>   *
>   * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
> - * dma_resv_reserve_shared() has been called.
> + * dma_resv_reserve_fences() has been called.
>   *
>   * See also &dma_resv.fence for a discussion of the semantics.
>   */
> diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
> index cbe999c6e7a6..d2e61f6ae989 100644
> --- a/drivers/dma-buf/st-dma-resv.c
> +++ b/drivers/dma-buf/st-dma-resv.c
> @@ -75,17 +75,16 @@ static int test_signaling(void *arg, bool shared)
>  		goto err_free;
>  	}
>  
> -	if (shared) {
> -		r = dma_resv_reserve_shared(&resv, 1);
> -		if (r) {
> -			pr_err("Resv shared slot allocation failed\n");
> -			goto err_unlock;
> -		}
> +	r = dma_resv_reserve_fences(&resv, 1);
> +	if (r) {
> +		pr_err("Resv shared slot allocation failed\n");
> +		goto err_unlock;
> +	}
>  
> +	if (shared)
>  		dma_resv_add_shared_fence(&resv, f);
> -	} else {
> +	else
>  		dma_resv_add_excl_fence(&resv, f);
> -	}
>  
>  	if (dma_resv_test_signaled(&resv, shared)) {
>  		pr_err("Resv unexpectedly signaled\n");
> @@ -134,17 +133,16 @@ static int test_for_each(void *arg, bool shared)
>  		goto err_free;
>  	}
>  
> -	if (shared) {
> -		r = dma_resv_reserve_shared(&resv, 1);
> -		if (r) {
> -			pr_err("Resv shared slot allocation failed\n");
> -			goto err_unlock;
> -		}
> +	r = dma_resv_reserve_fences(&resv, 1);
> +	if (r) {
> +		pr_err("Resv shared slot allocation failed\n");
> +		goto err_unlock;
> +	}
>  
> +	if (shared)
>  		dma_resv_add_shared_fence(&resv, f);
> -	} else {
> +	else
>  		dma_resv_add_excl_fence(&resv, f);
> -	}
>  
>  	r = -ENOENT;
>  	dma_resv_for_each_fence(&cursor, &resv, shared, fence) {
> @@ -206,18 +204,17 @@ static int test_for_each_unlocked(void *arg, bool shared)
>  		goto err_free;
>  	}
>  
> -	if (shared) {
> -		r = dma_resv_reserve_shared(&resv, 1);
> -		if (r) {
> -			pr_err("Resv shared slot allocation failed\n");
> -			dma_resv_unlock(&resv);
> -			goto err_free;
> -		}
> +	r = dma_resv_reserve_fences(&resv, 1);
> +	if (r) {
> +		pr_err("Resv shared slot allocation failed\n");
> +		dma_resv_unlock(&resv);
> +		goto err_free;
> +	}
>  
> +	if (shared)
>  		dma_resv_add_shared_fence(&resv, f);
> -	} else {
> +	else
>  		dma_resv_add_excl_fence(&resv, f);
> -	}
>  	dma_resv_unlock(&resv);
>  
>  	r = -ENOENT;
> @@ -290,18 +287,17 @@ static int test_get_fences(void *arg, bool shared)
>  		goto err_resv;
>  	}
>  
> -	if (shared) {
> -		r = dma_resv_reserve_shared(&resv, 1);
> -		if (r) {
> -			pr_err("Resv shared slot allocation failed\n");
> -			dma_resv_unlock(&resv);
> -			goto err_resv;
> -		}
> +	r = dma_resv_reserve_fences(&resv, 1);
> +	if (r) {
> +		pr_err("Resv shared slot allocation failed\n");
> +		dma_resv_unlock(&resv);
> +		goto err_resv;
> +	}
>  
> +	if (shared)
>  		dma_resv_add_shared_fence(&resv, f);
> -	} else {
> +	else
>  		dma_resv_add_excl_fence(&resv, f);
> -	}
>  	dma_resv_unlock(&resv);
>  
>  	r = dma_resv_get_fences(&resv, shared, &i, &fences);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 900ed2a7483b..98b1736bb221 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1233,7 +1233,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void **process_info,
>  				  AMDGPU_FENCE_OWNER_KFD, false);
>  	if (ret)
>  		goto wait_pd_fail;
> -	ret = dma_resv_reserve_shared(vm->root.bo->tbo.base.resv, 1);
> +	ret = dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1);
>  	if (ret)
>  		goto reserve_shared_fail;
>  	amdgpu_bo_fence(vm->root.bo,
> @@ -2571,7 +2571,7 @@ int amdgpu_amdkfd_add_gws_to_process(void *info, void *gws, struct kgd_mem **mem
>  	 * Add process eviction fence to bo so they can
>  	 * evict each other.
>  	 */
> -	ret = dma_resv_reserve_shared(gws_bo->tbo.base.resv, 1);
> +	ret = dma_resv_reserve_fences(gws_bo->tbo.base.resv, 1);
>  	if (ret)
>  		goto reserve_shared_fail;
>  	amdgpu_bo_fence(gws_bo, &process_info->eviction_fence->base, true);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 25731719c627..6f57a2fd5fe3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1388,6 +1388,14 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
>  		     bool shared)
>  {
>  	struct dma_resv *resv = bo->tbo.base.resv;
> +	int r;
> +
> +	r = dma_resv_reserve_fences(resv, 1);
> +	if (r) {
> +		/* As last resort on OOM we block for the fence */
> +		dma_fence_wait(fence, false);
> +		return;
> +	}
>  
>  	if (shared)
>  		dma_resv_add_shared_fence(resv, fence);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 5d11978c162e..b13451255e8b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2926,7 +2926,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>  	if (r)
>  		goto error_free_root;
>  
> -	r = dma_resv_reserve_shared(root_bo->tbo.base.resv, 1);
> +	r = dma_resv_reserve_fences(root_bo->tbo.base.resv, 1);
>  	if (r)
>  		goto error_unreserve;
>  
> @@ -3369,7 +3369,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid,
>  		value = 0;
>  	}
>  
> -	r = dma_resv_reserve_shared(root->tbo.base.resv, 1);
> +	r = dma_resv_reserve_fences(root->tbo.base.resv, 1);
>  	if (r) {
>  		pr_debug("failed %d to reserve fence slot\n", r);
>  		goto error_unlock;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 3b8856b4cece..b3fc3e958227 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -548,7 +548,7 @@ svm_range_vram_node_new(struct amdgpu_device *adev, struct svm_range *prange,
>  		goto reserve_bo_failed;
>  	}
>  
> -	r = dma_resv_reserve_shared(bo->tbo.base.resv, 1);
> +	r = dma_resv_reserve_fences(bo->tbo.base.resv, 1);
>  	if (r) {
>  		pr_debug("failed %d to reserve bo\n", r);
>  		amdgpu_bo_unreserve(bo);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 5f502c49aec2..53f7c78628a4 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -179,11 +179,9 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
>  		struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
>  		struct dma_resv *robj = bo->obj->base.resv;
>  
> -		if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
> -			ret = dma_resv_reserve_shared(robj, 1);
> -			if (ret)
> -				return ret;
> -		}
> +		ret = dma_resv_reserve_fences(robj, 1);
> +		if (ret)
> +			return ret;
>  
>  		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
>  			continue;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> index ce91b23385cf..1fd0cc9ca213 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> @@ -108,7 +108,8 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
>  	trace_i915_gem_object_clflush(obj);
>  
>  	clflush = NULL;
> -	if (!(flags & I915_CLFLUSH_SYNC))
> +	if (!(flags & I915_CLFLUSH_SYNC) &&
> +	    dma_resv_reserve_fences(obj->base.resv, 1) == 0)
>  		clflush = clflush_work_create(obj);
>  	if (clflush) {
>  		i915_sw_fence_await_reservation(&clflush->base.chain,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d42f437149c9..78f8797853ce 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -998,11 +998,9 @@ static int eb_validate_vmas(struct i915_execbuffer *eb)
>  			}
>  		}
>  
> -		if (!(ev->flags & EXEC_OBJECT_WRITE)) {
> -			err = dma_resv_reserve_shared(vma->obj->base.resv, 1);
> -			if (err)
> -				return err;
> -		}
> +		err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
> +		if (err)
> +			return err;
>  
>  		GEM_BUG_ON(drm_mm_node_allocated(&vma->node) &&
>  			   eb_vma_misplaced(&eb->exec[i], vma, ev->flags));
> @@ -2303,7 +2301,7 @@ static int eb_parse(struct i915_execbuffer *eb)
>  	if (IS_ERR(batch))
>  		return PTR_ERR(batch);
>  
> -	err = dma_resv_reserve_shared(shadow->obj->base.resv, 1);
> +	err = dma_resv_reserve_fences(shadow->obj->base.resv, 1);
>  	if (err)
>  		return err;
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> index 1ebe6e4086a1..432ac74ff225 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> @@ -611,7 +611,11 @@ int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
>  	assert_object_held(src);
>  	i915_deps_init(&deps, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
>  
> -	ret = dma_resv_reserve_shared(src_bo->base.resv, 1);
> +	ret = dma_resv_reserve_fences(src_bo->base.resv, 1);
> +	if (ret)
> +		return ret;
> +
> +	ret = dma_resv_reserve_fences(dst_bo->base.resv, 1);
>  	if (ret)
>  		return ret;
>  
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> index d534141b2cf7..0e52eb87cd55 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> @@ -216,7 +216,10 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
>  					  i915_gem_object_is_lmem(obj),
>  					  0xdeadbeaf, &rq);
>  		if (rq) {
> -			dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
> +			err = dma_resv_reserve_fences(obj->base.resv, 1);
> +			if (!err)
> +				dma_resv_add_excl_fence(obj->base.resv,
> +							&rq->fence);
>  			i915_gem_object_set_moving_fence(obj, &rq->fence);
>  			i915_request_put(rq);
>  		}
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 94fcdb7bd21d..bae3423f58e8 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -1819,6 +1819,12 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
>  			intel_frontbuffer_put(front);
>  		}
>  
> +		if (!(flags & __EXEC_OBJECT_NO_RESERVE)) {
> +			err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
> +			if (unlikely(err))
> +				return err;
> +		}
> +
>  		if (fence) {
>  			dma_resv_add_excl_fence(vma->obj->base.resv, fence);
>  			obj->write_domain = I915_GEM_DOMAIN_RENDER;
> @@ -1826,7 +1832,7 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
>  		}
>  	} else {
>  		if (!(flags & __EXEC_OBJECT_NO_RESERVE)) {
> -			err = dma_resv_reserve_shared(vma->obj->base.resv, 1);
> +			err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
>  			if (unlikely(err))
>  				return err;
>  		}
> @@ -2044,7 +2050,7 @@ int i915_vma_unbind_async(struct i915_vma *vma, bool trylock_vm)
>  	if (!obj->mm.rsgt)
>  		return -EBUSY;
>  
> -	err = dma_resv_reserve_shared(obj->base.resv, 1);
> +	err = dma_resv_reserve_fences(obj->base.resv, 1);
>  	if (err)
>  		return -EBUSY;
>  
> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> index ba32893e0873..6114e013092b 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> @@ -1043,6 +1043,13 @@ static int igt_lmem_write_cpu(void *arg)
>  	}
>  
>  	i915_gem_object_lock(obj, NULL);
> +
> +	err = dma_resv_reserve_fences(obj->base.resv, 1);
> +	if (err) {
> +		i915_gem_object_unlock(obj);
> +		goto out_put;
> +	}
> +
>  	/* Put the pages into a known state -- from the gpu for added fun */
>  	intel_engine_pm_get(engine);
>  	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 55bb1ec3c4f7..e0a11ee0e86d 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -257,13 +257,11 @@ int lima_gem_get_info(struct drm_file *file, u32 handle, u32 *va, u64 *offset)
>  static int lima_gem_sync_bo(struct lima_sched_task *task, struct lima_bo *bo,
>  			    bool write, bool explicit)
>  {
> -	int err = 0;
> +	int err;
>  
> -	if (!write) {
> -		err = dma_resv_reserve_shared(lima_bo_resv(bo), 1);
> -		if (err)
> -			return err;
> -	}
> +	err = dma_resv_reserve_fences(lima_bo_resv(bo), 1);
> +	if (err)
> +		return err;
>  
>  	/* explicit sync use user passed dep fence */
>  	if (explicit)
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> index c6d60c8d286d..3164db8be893 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -320,16 +320,14 @@ static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit)
>  		struct drm_gem_object *obj = &submit->bos[i].obj->base;
>  		bool write = submit->bos[i].flags & MSM_SUBMIT_BO_WRITE;
>  
> -		if (!write) {
> -			/* NOTE: _reserve_shared() must happen before
> -			 * _add_shared_fence(), which makes this a slightly
> -			 * strange place to call it.  OTOH this is a
> -			 * convenient can-fail point to hook it in.
> -			 */
> -			ret = dma_resv_reserve_shared(obj->resv, 1);
> -			if (ret)
> -				return ret;
> -		}
> +		/* NOTE: _reserve_shared() must happen before
> +		 * _add_shared_fence(), which makes this a slightly
> +		 * strange place to call it.  OTOH this is a
> +		 * convenient can-fail point to hook it in.
> +		 */
> +		ret = dma_resv_reserve_fences(obj->resv, 1);
> +		if (ret)
> +			return ret;
>  
>  		/* exclusive fences must be ordered */
>  		if (no_implicit && !write)
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index a3a04e0d76ec..0268259e97eb 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -346,11 +346,9 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
>  	struct dma_resv *resv = nvbo->bo.base.resv;
>  	int i, ret;
>  
> -	if (!exclusive) {
> -		ret = dma_resv_reserve_shared(resv, 1);
> -		if (ret)
> -			return ret;
> -	}
> +	ret = dma_resv_reserve_fences(resv, 1);
> +	if (ret)
> +		return ret;
>  
>  	/* Waiting for the exclusive fence first causes performance regressions
>  	 * under some circumstances. So manually wait for the shared ones first.
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index a6925dbb6224..c34114560e49 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -247,6 +247,10 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
>  	int i, ret;
>  
>  	for (i = 0; i < bo_count; i++) {
> +		ret = dma_resv_reserve_fences(bos[i]->resv, 1);
> +		if (ret)
> +			return ret;
> +
>  		/* panfrost always uses write mode in its current uapi */
>  		ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
>  							      true);
> diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
> index 469979cd0341..cde1e8ddaeaa 100644
> --- a/drivers/gpu/drm/qxl/qxl_release.c
> +++ b/drivers/gpu/drm/qxl/qxl_release.c
> @@ -200,7 +200,7 @@ static int qxl_release_validate_bo(struct qxl_bo *bo)
>  			return ret;
>  	}
>  
> -	ret = dma_resv_reserve_shared(bo->tbo.base.resv, 1);
> +	ret = dma_resv_reserve_fences(bo->tbo.base.resv, 1);
>  	if (ret)
>  		return ret;
>  
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
> index 9ed2b2700e0a..446f7bae54c4 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -535,6 +535,10 @@ static int radeon_bo_vm_update_pte(struct radeon_cs_parser *p,
>  			return r;
>  
>  		radeon_sync_fence(&p->ib.sync, bo_va->last_pt_update);
> +
> +		r = dma_resv_reserve_fences(bo->tbo.base.resv, 1);
> +		if (r)
> +			return r;
>  	}
>  
>  	return radeon_vm_clear_invalids(rdev, vm);
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
> index 91a72cd14304..7ffd2e90f325 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -782,6 +782,14 @@ void radeon_bo_fence(struct radeon_bo *bo, struct radeon_fence *fence,
>  		     bool shared)
>  {
>  	struct dma_resv *resv = bo->tbo.base.resv;
> +	int r;
> +
> +	r = dma_resv_reserve_fences(resv, 1);
> +	if (r) {
> +		/* As last resort on OOM we block for the fence */
> +		dma_fence_wait(&fence->base, false);
> +		return;
> +	}
>  
>  	if (shared)
>  		dma_resv_add_shared_fence(resv, &fence->base);
> diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
> index bb53016f3138..987cabbf1318 100644
> --- a/drivers/gpu/drm/radeon/radeon_vm.c
> +++ b/drivers/gpu/drm/radeon/radeon_vm.c
> @@ -831,7 +831,7 @@ static int radeon_vm_update_ptes(struct radeon_device *rdev,
>  		int r;
>  
>  		radeon_sync_resv(rdev, &ib->sync, pt->tbo.base.resv, true);
> -		r = dma_resv_reserve_shared(pt->tbo.base.resv, 1);
> +		r = dma_resv_reserve_fences(pt->tbo.base.resv, 1);
>  		if (r)
>  			return r;
>  
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index e5fd0f2c0299..c49996cf25d0 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -151,6 +151,10 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>  		}
>  	}
>  
> +	ret = dma_resv_reserve_fences(bo->base.resv, 1);
> +	if (ret)
> +		goto out_err;
> +
>  	ret = bdev->funcs->move(bo, evict, ctx, mem, hop);
>  	if (ret) {
>  		if (ret == -EMULTIHOP)
> @@ -735,7 +739,7 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>  
>  	dma_resv_add_shared_fence(bo->base.resv, fence);
>  
> -	ret = dma_resv_reserve_shared(bo->base.resv, 1);
> +	ret = dma_resv_reserve_fences(bo->base.resv, 1);
>  	if (unlikely(ret)) {
>  		dma_fence_put(fence);
>  		return ret;
> @@ -794,7 +798,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
>  	bool type_found = false;
>  	int i, ret;
>  
> -	ret = dma_resv_reserve_shared(bo->base.resv, 1);
> +	ret = dma_resv_reserve_fences(bo->base.resv, 1);
>  	if (unlikely(ret))
>  		return ret;
>  
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 219dd81bbeab..1b96b91bf81b 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -221,9 +221,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>  
>  	fbo->base = *bo;
>  
> -	ttm_bo_get(bo);
> -	fbo->bo = bo;
> -
>  	/**
>  	 * Fix up members that we shouldn't copy directly:
>  	 * TODO: Explicit member copy would probably be better here.
> @@ -250,6 +247,15 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>  	ret = dma_resv_trylock(&fbo->base.base._resv);
>  	WARN_ON(!ret);
>  
> +	ret = dma_resv_reserve_fences(&fbo->base.base._resv, 1);
> +	if (ret) {
> +		kfree(fbo);
> +		return ret;
> +	}
> +
> +	ttm_bo_get(bo);
> +	fbo->bo = bo;
> +
>  	ttm_bo_move_to_lru_tail_unlocked(&fbo->base);
>  
>  	*new_obj = &fbo->base;
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> index 071c48d672c6..789c645f004e 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> @@ -90,6 +90,7 @@ int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
>  
>  	list_for_each_entry(entry, list, head) {
>  		struct ttm_buffer_object *bo = entry->bo;
> +		unsigned int num_fences;
>  
>  		ret = ttm_bo_reserve(bo, intr, (ticket == NULL), ticket);
>  		if (ret == -EALREADY && dups) {
> @@ -100,12 +101,10 @@ int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
>  			continue;
>  		}
>  
> +		num_fences = min(entry->num_shared, 1u);
>  		if (!ret) {
> -			if (!entry->num_shared)
> -				continue;
> -
> -			ret = dma_resv_reserve_shared(bo->base.resv,
> -								entry->num_shared);
> +			ret = dma_resv_reserve_fences(bo->base.resv,
> +						      num_fences);
>  			if (!ret)
>  				continue;
>  		}
> @@ -120,9 +119,9 @@ int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
>  			ret = ttm_bo_reserve_slowpath(bo, intr, ticket);
>  		}
>  
> -		if (!ret && entry->num_shared)
> -			ret = dma_resv_reserve_shared(bo->base.resv,
> -								entry->num_shared);
> +		if (!ret)
> +			ret = dma_resv_reserve_fences(bo->base.resv,
> +						      num_fences);
>  
>  		if (unlikely(ret != 0)) {
>  			if (ticket) {
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index 92bc0faee84f..961812d33827 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -259,16 +259,21 @@ v3d_lock_bo_reservations(struct v3d_job *job,
>  		return ret;
>  
>  	for (i = 0; i < job->bo_count; i++) {
> +		ret = dma_resv_reserve_fences(job->bo[i]->resv, 1);
> +		if (ret)
> +			goto fail;
> +
>  		ret = drm_sched_job_add_implicit_dependencies(&job->base,
>  							      job->bo[i], true);
> -		if (ret) {
> -			drm_gem_unlock_reservations(job->bo, job->bo_count,
> -						    acquire_ctx);
> -			return ret;
> -		}
> +		if (ret)
> +			goto fail;
>  	}
>  
>  	return 0;
> +
> +fail:
> +	drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
> +	return ret;
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
> index 4abf10b66fe8..594bd6bb00d2 100644
> --- a/drivers/gpu/drm/vc4/vc4_gem.c
> +++ b/drivers/gpu/drm/vc4/vc4_gem.c
> @@ -644,7 +644,7 @@ vc4_lock_bo_reservations(struct drm_device *dev,
>  	for (i = 0; i < exec->bo_count; i++) {
>  		bo = &exec->bo[i]->base;
>  
> -		ret = dma_resv_reserve_shared(bo->resv, 1);
> +		ret = dma_resv_reserve_fences(bo->resv, 1);
>  		if (ret) {
>  			vc4_unlock_bo_reservations(dev, exec, acquire_ctx);
>  			return ret;
> diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
> index bd6f75285fd9..2ddbebca87d9 100644
> --- a/drivers/gpu/drm/vgem/vgem_fence.c
> +++ b/drivers/gpu/drm/vgem/vgem_fence.c
> @@ -157,12 +157,14 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
>  	}
>  
>  	/* Expose the fence via the dma-buf */
> -	ret = 0;
>  	dma_resv_lock(resv, NULL);
> -	if (arg->flags & VGEM_FENCE_WRITE)
> -		dma_resv_add_excl_fence(resv, fence);
> -	else if ((ret = dma_resv_reserve_shared(resv, 1)) == 0)
> -		dma_resv_add_shared_fence(resv, fence);
> +	ret = dma_resv_reserve_fences(resv, 1);
> +	if (!ret) {
> +		if (arg->flags & VGEM_FENCE_WRITE)
> +			dma_resv_add_excl_fence(resv, fence);
> +		else
> +			dma_resv_add_shared_fence(resv, fence);
> +	}
>  	dma_resv_unlock(resv);
>  
>  	/* Record the fence in our idr for later signaling */
> diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
> index 48d3c9955f0d..1820ca6cf673 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_gem.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
> @@ -214,6 +214,7 @@ void virtio_gpu_array_add_obj(struct virtio_gpu_object_array *objs,
>  
>  int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
>  {
> +	unsigned int i;
>  	int ret;
>  
>  	if (objs->nents == 1) {
> @@ -222,6 +223,14 @@ int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs)
>  		ret = drm_gem_lock_reservations(objs->objs, objs->nents,
>  						&objs->ticket);
>  	}
> +	if (ret)
> +		return ret;
> +
> +	for (i = 0; i < objs->nents; ++i) {
> +		ret = dma_resv_reserve_fences(objs->objs[i]->resv, 1);
> +		if (ret)
> +			return ret;
> +	}
>  	return ret;
>  }
>  
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> index 31aecc46624b..fe13aa8b4a64 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> @@ -747,16 +747,22 @@ void vmw_bo_fence_single(struct ttm_buffer_object *bo,
>  			 struct vmw_fence_obj *fence)
>  {
>  	struct ttm_device *bdev = bo->bdev;
> -
>  	struct vmw_private *dev_priv =
>  		container_of(bdev, struct vmw_private, bdev);
> +	int ret;
>  
> -	if (fence == NULL) {
> +	if (fence == NULL)
>  		vmw_execbuf_fence_commands(NULL, dev_priv, &fence, NULL);
> +	else
> +		dma_fence_get(&fence->base);
> +
> +	ret = dma_resv_reserve_fences(bo->base.resv, 1);
> +	if (!ret)
>  		dma_resv_add_excl_fence(bo->base.resv, &fence->base);
> -		dma_fence_put(&fence->base);
> -	} else
> -		dma_resv_add_excl_fence(bo->base.resv, &fence->base);
> +	else
> +		/* Last resort fallback when we are OOM */
> +		dma_fence_wait(&fence->base, false);
> +	dma_fence_put(&fence->base);
>  }
>  
>  
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index ecb697d4d861..5fa04d0fccad 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -117,7 +117,7 @@ struct dma_resv {
>  	 * A new fence is added by calling dma_resv_add_shared_fence(). Since
>  	 * this often needs to be done past the point of no return in command
>  	 * submission it cannot fail, and therefore sufficient slots need to be
> -	 * reserved by calling dma_resv_reserve_shared().
> +	 * reserved by calling dma_resv_reserve_fences().
>  	 *
>  	 * Note that actual semantics of what an exclusive or shared fence mean
>  	 * is defined by the user, for reservation objects shared across drivers
> @@ -413,7 +413,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
>  
>  void dma_resv_init(struct dma_resv *obj);
>  void dma_resv_fini(struct dma_resv *obj);
> -int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences);
> +int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences);
>  void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
>  void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
>  			     struct dma_fence *fence);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 03/16] dma-buf: specify usage while adding fences to dma_resv obj v6
  2022-04-06  7:51 ` [PATCH 03/16] dma-buf: specify usage while adding fences to dma_resv obj v6 Christian König
@ 2022-04-06 12:32   ` Daniel Vetter
  2022-04-06 12:35     ` Daniel Vetter
  0 siblings, 1 reply; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:32 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:19AM +0200, Christian König wrote:
> Instead of distingting between shared and exclusive fences specify
> the fence usage while adding fences.
> 
> Rework all drivers to use this interface instead and deprecate the old one.
> 
> v2: some kerneldoc comments suggested by Daniel
> v3: fix a missing case in radeon
> v4: rebase on nouveau changes, fix lockdep and temporary disable warning
> v5: more documentation updates
> v6: separate internal dma_resv changes from this patch, avoids to
>     disable warning temporary, rebase on upstream changes
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/dma-resv.c                    |  48 +++++++--
>  drivers/dma-buf/st-dma-resv.c                 | 101 +++++-------------
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   4 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |   6 +-
>  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |  10 +-
>  drivers/gpu/drm/i915/gem/i915_gem_busy.c      |  13 +--
>  drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   3 +-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  |   5 +-
>  .../drm/i915/gem/selftests/i915_gem_migrate.c |   4 +-
>  .../drm/i915/gem/selftests/i915_gem_mman.c    |   3 +-
>  drivers/gpu/drm/i915/i915_vma.c               |   8 +-
>  .../drm/i915/selftests/intel_memory_region.c  |   3 +-
>  drivers/gpu/drm/lima/lima_gem.c               |   2 +-
>  drivers/gpu/drm/msm/msm_gem_submit.c          |   6 +-
>  drivers/gpu/drm/nouveau/nouveau_bo.c          |   9 +-
>  drivers/gpu/drm/nouveau/nouveau_fence.c       |   4 +-
>  drivers/gpu/drm/panfrost/panfrost_job.c       |   2 +-
>  drivers/gpu/drm/qxl/qxl_release.c             |   3 +-
>  drivers/gpu/drm/radeon/radeon_object.c        |   6 +-
>  drivers/gpu/drm/ttm/ttm_bo.c                  |   2 +-
>  drivers/gpu/drm/ttm/ttm_bo_util.c             |   5 +-
>  drivers/gpu/drm/ttm/ttm_execbuf_util.c        |   6 +-
>  drivers/gpu/drm/v3d/v3d_gem.c                 |   4 +-
>  drivers/gpu/drm/vc4/vc4_gem.c                 |   2 +-
>  drivers/gpu/drm/vgem/vgem_fence.c             |   9 +-
>  drivers/gpu/drm/virtio/virtgpu_gem.c          |   3 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            |   3 +-
>  include/linux/dma-buf.h                       |  16 +--
>  include/linux/dma-resv.h                      |  25 +++--
>  30 files changed, 151 insertions(+), 166 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 17237e6ee30c..543dae6566d2 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -234,14 +234,14 @@ EXPORT_SYMBOL(dma_resv_reserve_fences);
>  
>  #ifdef CONFIG_DEBUG_MUTEXES
>  /**
> - * dma_resv_reset_shared_max - reset shared fences for debugging
> + * dma_resv_reset_max_fences - reset shared fences for debugging
>   * @obj: the dma_resv object to reset
>   *
>   * Reset the number of pre-reserved shared slots to test that drivers do
>   * correct slot allocation using dma_resv_reserve_fences(). See also
>   * &dma_resv_list.shared_max.
>   */
> -void dma_resv_reset_shared_max(struct dma_resv *obj)
> +void dma_resv_reset_max_fences(struct dma_resv *obj)
>  {
>  	struct dma_resv_list *fences = dma_resv_shared_list(obj);
>  
> @@ -251,7 +251,7 @@ void dma_resv_reset_shared_max(struct dma_resv *obj)
>  	if (fences)
>  		fences->shared_max = fences->shared_count;
>  }
> -EXPORT_SYMBOL(dma_resv_reset_shared_max);
> +EXPORT_SYMBOL(dma_resv_reset_max_fences);
>  #endif
>  
>  /**
> @@ -264,7 +264,8 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
>   *
>   * See also &dma_resv.fence for a discussion of the semantics.
>   */
> -void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
> +static void dma_resv_add_shared_fence(struct dma_resv *obj,
> +				      struct dma_fence *fence)
>  {
>  	struct dma_resv_list *fobj;
>  	struct dma_fence *old;
> @@ -305,13 +306,13 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
>  	write_seqcount_end(&obj->seq);
>  	dma_fence_put(old);
>  }
> -EXPORT_SYMBOL(dma_resv_add_shared_fence);
>  
>  /**
>   * dma_resv_replace_fences - replace fences in the dma_resv obj
>   * @obj: the reservation object
>   * @context: the context of the fences to replace
>   * @replacement: the new fence to use instead
> + * @usage: how the new fence is used, see enum dma_resv_usage
>   *
>   * Replace fences with a specified context with a new fence. Only valid if the
>   * operation represented by the original fence has no longer access to the
> @@ -321,12 +322,16 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
>   * update fence which makes the resource inaccessible.
>   */
>  void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
> -			     struct dma_fence *replacement)
> +			     struct dma_fence *replacement,
> +			     enum dma_resv_usage usage)
>  {
>  	struct dma_resv_list *list;
>  	struct dma_fence *old;
>  	unsigned int i;
>  
> +	/* Only readers supported for now */
> +	WARN_ON(usage != DMA_RESV_USAGE_READ);
> +
>  	dma_resv_assert_held(obj);
>  
>  	write_seqcount_begin(&obj->seq);
> @@ -360,7 +365,8 @@ EXPORT_SYMBOL(dma_resv_replace_fences);
>   * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
>   * See also &dma_resv.fence_excl for a discussion of the semantics.
>   */
> -void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
> +static void dma_resv_add_excl_fence(struct dma_resv *obj,
> +				    struct dma_fence *fence)
>  {
>  	struct dma_fence *old_fence = dma_resv_excl_fence(obj);
>  
> @@ -375,7 +381,27 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
>  
>  	dma_fence_put(old_fence);
>  }
> -EXPORT_SYMBOL(dma_resv_add_excl_fence);
> +
> +/**
> + * dma_resv_add_fence - Add a fence to the dma_resv obj
> + * @obj: the reservation object
> + * @fence: the fence to add
> + * @usage: how the fence is used, see enum dma_resv_usage
> + *
> + * Add a fence to a slot, @obj must be locked with dma_resv_lock(), and
> + * dma_resv_reserve_fences() has been called.
> + *
> + * See also &dma_resv.fence for a discussion of the semantics.
> + */
> +void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
> +			enum dma_resv_usage usage)
> +{
> +	if (usage == DMA_RESV_USAGE_WRITE)
> +		dma_resv_add_excl_fence(obj, fence);
> +	else
> +		dma_resv_add_shared_fence(obj, fence);
> +}
> +EXPORT_SYMBOL(dma_resv_add_fence);
>  
>  /* Restart the iterator by initializing all the necessary fields, but not the
>   * relation to the dma_resv object. */
> @@ -574,7 +600,7 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
>  		}
>  
>  		dma_fence_get(f);
> -		if (dma_resv_iter_is_exclusive(&cursor))
> +		if (dma_resv_iter_usage(&cursor) == DMA_RESV_USAGE_WRITE)
>  			excl = f;
>  		else
>  			RCU_INIT_POINTER(list->shared[list->shared_count++], f);
> @@ -771,13 +797,13 @@ EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
>   */
>  void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
>  {
> +	static const char *usage[] = { "write", "read" };
>  	struct dma_resv_iter cursor;
>  	struct dma_fence *fence;
>  
>  	dma_resv_for_each_fence(&cursor, obj, DMA_RESV_USAGE_READ, fence) {
>  		seq_printf(seq, "\t%s fence:",
> -			   dma_resv_iter_is_exclusive(&cursor) ?
> -				"Exclusive" : "Shared");
> +			   usage[dma_resv_iter_usage(&cursor)]);
>  		dma_fence_describe(fence, seq);
>  	}
>  }
> diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
> index d097981061b1..d0f7c2bfd4f0 100644
> --- a/drivers/dma-buf/st-dma-resv.c
> +++ b/drivers/dma-buf/st-dma-resv.c
> @@ -58,8 +58,9 @@ static int sanitycheck(void *arg)
>  	return r;
>  }
>  
> -static int test_signaling(void *arg, enum dma_resv_usage usage)
> +static int test_signaling(void *arg)
>  {
> +	enum dma_resv_usage usage = (unsigned long)arg;
>  	struct dma_resv resv;
>  	struct dma_fence *f;
>  	int r;
> @@ -81,11 +82,7 @@ static int test_signaling(void *arg, enum dma_resv_usage usage)
>  		goto err_unlock;
>  	}
>  
> -	if (usage >= DMA_RESV_USAGE_READ)
> -		dma_resv_add_shared_fence(&resv, f);
> -	else
> -		dma_resv_add_excl_fence(&resv, f);
> -
> +	dma_resv_add_fence(&resv, f, usage);
>  	if (dma_resv_test_signaled(&resv, usage)) {
>  		pr_err("Resv unexpectedly signaled\n");
>  		r = -EINVAL;
> @@ -105,18 +102,9 @@ static int test_signaling(void *arg, enum dma_resv_usage usage)
>  	return r;
>  }
>  
> -static int test_excl_signaling(void *arg)
> -{
> -	return test_signaling(arg, DMA_RESV_USAGE_WRITE);
> -}
> -
> -static int test_shared_signaling(void *arg)
> -{
> -	return test_signaling(arg, DMA_RESV_USAGE_READ);
> -}
> -
> -static int test_for_each(void *arg, enum dma_resv_usage usage)
> +static int test_for_each(void *arg)
>  {
> +	enum dma_resv_usage usage = (unsigned long)arg;
>  	struct dma_resv_iter cursor;
>  	struct dma_fence *f, *fence;
>  	struct dma_resv resv;
> @@ -139,10 +127,7 @@ static int test_for_each(void *arg, enum dma_resv_usage usage)
>  		goto err_unlock;
>  	}
>  
> -	if (usage >= DMA_RESV_USAGE_READ)
> -		dma_resv_add_shared_fence(&resv, f);
> -	else
> -		dma_resv_add_excl_fence(&resv, f);
> +	dma_resv_add_fence(&resv, f, usage);
>  
>  	r = -ENOENT;
>  	dma_resv_for_each_fence(&cursor, &resv, usage, fence) {
> @@ -156,8 +141,7 @@ static int test_for_each(void *arg, enum dma_resv_usage usage)
>  			r = -EINVAL;
>  			goto err_unlock;
>  		}
> -		if (dma_resv_iter_is_exclusive(&cursor) !=
> -		    (usage >= DMA_RESV_USAGE_READ)) {
> +		if (dma_resv_iter_usage(&cursor) != usage) {
>  			pr_err("Unexpected fence usage\n");
>  			r = -EINVAL;
>  			goto err_unlock;
> @@ -177,18 +161,9 @@ static int test_for_each(void *arg, enum dma_resv_usage usage)
>  	return r;
>  }
>  
> -static int test_excl_for_each(void *arg)
> -{
> -	return test_for_each(arg, DMA_RESV_USAGE_WRITE);
> -}
> -
> -static int test_shared_for_each(void *arg)
> -{
> -	return test_for_each(arg, DMA_RESV_USAGE_READ);
> -}
> -
> -static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
> +static int test_for_each_unlocked(void *arg)
>  {
> +	enum dma_resv_usage usage = (unsigned long)arg;
>  	struct dma_resv_iter cursor;
>  	struct dma_fence *f, *fence;
>  	struct dma_resv resv;
> @@ -212,10 +187,7 @@ static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
>  		goto err_free;
>  	}
>  
> -	if (usage >= DMA_RESV_USAGE_READ)
> -		dma_resv_add_shared_fence(&resv, f);
> -	else
> -		dma_resv_add_excl_fence(&resv, f);
> +	dma_resv_add_fence(&resv, f, usage);
>  	dma_resv_unlock(&resv);
>  
>  	r = -ENOENT;
> @@ -235,8 +207,7 @@ static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
>  			r = -EINVAL;
>  			goto err_iter_end;
>  		}
> -		if (dma_resv_iter_is_exclusive(&cursor) !=
> -		    (usage >= DMA_RESV_USAGE_READ)) {
> +		if (dma_resv_iter_usage(&cursor) != usage) {
>  			pr_err("Unexpected fence usage\n");
>  			r = -EINVAL;
>  			goto err_iter_end;
> @@ -262,18 +233,9 @@ static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
>  	return r;
>  }
>  
> -static int test_excl_for_each_unlocked(void *arg)
> -{
> -	return test_for_each_unlocked(arg, DMA_RESV_USAGE_WRITE);
> -}
> -
> -static int test_shared_for_each_unlocked(void *arg)
> -{
> -	return test_for_each_unlocked(arg, DMA_RESV_USAGE_READ);
> -}
> -
> -static int test_get_fences(void *arg, enum dma_resv_usage usage)
> +static int test_get_fences(void *arg)
>  {
> +	enum dma_resv_usage usage = (unsigned long)arg;
>  	struct dma_fence *f, **fences = NULL;
>  	struct dma_resv resv;
>  	int r, i;
> @@ -296,10 +258,7 @@ static int test_get_fences(void *arg, enum dma_resv_usage usage)
>  		goto err_resv;
>  	}
>  
> -	if (usage >= DMA_RESV_USAGE_READ)
> -		dma_resv_add_shared_fence(&resv, f);
> -	else
> -		dma_resv_add_excl_fence(&resv, f);
> +	dma_resv_add_fence(&resv, f, usage);
>  	dma_resv_unlock(&resv);
>  
>  	r = dma_resv_get_fences(&resv, usage, &i, &fences);
> @@ -324,30 +283,24 @@ static int test_get_fences(void *arg, enum dma_resv_usage usage)
>  	return r;
>  }
>  
> -static int test_excl_get_fences(void *arg)
> -{
> -	return test_get_fences(arg, DMA_RESV_USAGE_WRITE);
> -}
> -
> -static int test_shared_get_fences(void *arg)
> -{
> -	return test_get_fences(arg, DMA_RESV_USAGE_READ);
> -}
> -
>  int dma_resv(void)
>  {
>  	static const struct subtest tests[] = {
>  		SUBTEST(sanitycheck),
> -		SUBTEST(test_excl_signaling),
> -		SUBTEST(test_shared_signaling),
> -		SUBTEST(test_excl_for_each),
> -		SUBTEST(test_shared_for_each),
> -		SUBTEST(test_excl_for_each_unlocked),
> -		SUBTEST(test_shared_for_each_unlocked),
> -		SUBTEST(test_excl_get_fences),
> -		SUBTEST(test_shared_get_fences),
> +		SUBTEST(test_signaling),
> +		SUBTEST(test_for_each),
> +		SUBTEST(test_for_each_unlocked),
> +		SUBTEST(test_get_fences),
>  	};
> +	enum dma_resv_usage usage;
> +	int r;
>  
>  	spin_lock_init(&fence_lock);
> -	return subtests(tests, NULL);
> +	for (usage = DMA_RESV_USAGE_WRITE; usage <= DMA_RESV_USAGE_READ;
> +	     ++usage) {
> +		r = subtests(tests, (void *)(unsigned long)usage);
> +		if (r)
> +			return r;
> +	}
> +	return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 98b1736bb221..5031e26e6716 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -263,7 +263,7 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
>  	 */
>  	replacement = dma_fence_get_stub();
>  	dma_resv_replace_fences(bo->tbo.base.resv, ef->base.context,
> -				replacement);
> +				replacement, DMA_RESV_USAGE_READ);
>  	dma_fence_put(replacement);
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 413f32c3fd63..76fd916424d6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -55,8 +55,8 @@ static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
>  	bo = amdgpu_bo_ref(gem_to_amdgpu_bo(gobj));
>  	p->uf_entry.priority = 0;
>  	p->uf_entry.tv.bo = &bo->tbo;
> -	/* One for TTM and one for the CS job */
> -	p->uf_entry.tv.num_shared = 2;
> +	/* One for TTM and two for the CS job */
> +	p->uf_entry.tv.num_shared = 3;
>  
>  	drm_gem_object_put(gobj);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index a7f39f8ab7be..a3cdf8a24377 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1397,10 +1397,8 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
>  		return;
>  	}
>  
> -	if (shared)
> -		dma_resv_add_shared_fence(resv, fence);
> -	else
> -		dma_resv_add_excl_fence(resv, fence);
> +	dma_resv_add_fence(resv, fence, shared ? DMA_RESV_USAGE_READ :
> +			   DMA_RESV_USAGE_WRITE);
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 53f7c78628a4..98bb5c9239de 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -202,14 +202,10 @@ static void submit_attach_object_fences(struct etnaviv_gem_submit *submit)
>  
>  	for (i = 0; i < submit->nr_bos; i++) {
>  		struct drm_gem_object *obj = &submit->bos[i].obj->base;
> +		bool write = submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE;
>  
> -		if (submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE)
> -			dma_resv_add_excl_fence(obj->resv,
> -							  submit->out_fence);
> -		else
> -			dma_resv_add_shared_fence(obj->resv,
> -							    submit->out_fence);
> -
> +		dma_resv_add_fence(obj->resv, submit->out_fence, write ?
> +				   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);

Iirc I had some suggestions to use dma_resv_usage_rw here and above. Do
these happen in later patches? There's also a few more of these later on.

>  		submit_unlock_object(submit, i);
>  	}
>  }
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> index 14a1c0ad8c3c..e7ae94ee1b44 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> @@ -148,12 +148,13 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>  		if (dma_resv_iter_is_restarted(&cursor))
>  			args->busy = 0;
>  
> -		if (dma_resv_iter_is_exclusive(&cursor))
> -			/* Translate the exclusive fence to the READ *and* WRITE engine */
> -			args->busy |= busy_check_writer(fence);
> -		else
> -			/* Translate shared fences to READ set of engines */
> -			args->busy |= busy_check_reader(fence);
> +		/* Translate read fences to READ set of engines */
> +		args->busy |= busy_check_reader(fence);
> +	}
> +	dma_resv_iter_begin(&cursor, obj->base.resv, DMA_RESV_USAGE_WRITE);
> +	dma_resv_for_each_fence_unlocked(&cursor, fence) {

Two loops is a bit much but also meh.

> +		/* Translate the write fences to the READ *and* WRITE engine */
> +		args->busy |= busy_check_writer(fence);
>  	}
>  	dma_resv_iter_end(&cursor);
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> index 1fd0cc9ca213..f5f2b8b115ea 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> @@ -116,7 +116,8 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
>  						obj->base.resv, NULL, true,
>  						i915_fence_timeout(i915),
>  						I915_FENCE_GFP);
> -		dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
> +		dma_resv_add_fence(obj->base.resv, &clflush->base.dma,
> +				   DMA_RESV_USAGE_WRITE);
>  		dma_fence_work_commit(&clflush->base);
>  		/*
>  		 * We must have successfully populated the pages(since we are
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> index 432ac74ff225..438b8a95b3d1 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> @@ -637,9 +637,8 @@ int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
>  	if (IS_ERR_OR_NULL(copy_fence))
>  		return PTR_ERR_OR_ZERO(copy_fence);
>  
> -	dma_resv_add_excl_fence(dst_bo->base.resv, copy_fence);
> -	dma_resv_add_shared_fence(src_bo->base.resv, copy_fence);
> -
> +	dma_resv_add_fence(dst_bo->base.resv, copy_fence, DMA_RESV_USAGE_WRITE);
> +	dma_resv_add_fence(src_bo->base.resv, copy_fence, DMA_RESV_USAGE_READ);
>  	dma_fence_put(copy_fence);
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> index 0e52eb87cd55..4997ed18b6e4 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> @@ -218,8 +218,8 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
>  		if (rq) {
>  			err = dma_resv_reserve_fences(obj->base.resv, 1);
>  			if (!err)
> -				dma_resv_add_excl_fence(obj->base.resv,
> -							&rq->fence);
> +				dma_resv_add_fence(obj->base.resv, &rq->fence,
> +						   DMA_RESV_USAGE_WRITE);
>  			i915_gem_object_set_moving_fence(obj, &rq->fence);
>  			i915_request_put(rq);
>  		}
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> index a132e241c3ee..3a6e3f6d239f 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> @@ -1220,7 +1220,8 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
>  					  expand32(POISON_INUSE), &rq);
>  	i915_gem_object_unpin_pages(obj);
>  	if (rq) {
> -		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
> +		dma_resv_add_fence(obj->base.resv, &rq->fence,
> +				   DMA_RESV_USAGE_WRITE);
>  		i915_gem_object_set_moving_fence(obj, &rq->fence);
>  		i915_request_put(rq);
>  	}
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index bae3423f58e8..524477d8939e 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -1826,7 +1826,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
>  		}
>  
>  		if (fence) {
> -			dma_resv_add_excl_fence(vma->obj->base.resv, fence);
> +			dma_resv_add_fence(vma->obj->base.resv, fence,
> +					   DMA_RESV_USAGE_WRITE);
>  			obj->write_domain = I915_GEM_DOMAIN_RENDER;
>  			obj->read_domains = 0;
>  		}
> @@ -1838,7 +1839,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
>  		}
>  
>  		if (fence) {
> -			dma_resv_add_shared_fence(vma->obj->base.resv, fence);
> +			dma_resv_add_fence(vma->obj->base.resv, fence,
> +					   DMA_RESV_USAGE_READ);
>  			obj->write_domain = 0;
>  		}
>  	}
> @@ -2078,7 +2080,7 @@ int i915_vma_unbind_async(struct i915_vma *vma, bool trylock_vm)
>  		goto out_rpm;
>  	}
>  
> -	dma_resv_add_shared_fence(obj->base.resv, fence);
> +	dma_resv_add_fence(obj->base.resv, fence, DMA_RESV_USAGE_READ);
>  	dma_fence_put(fence);
>  
>  out_rpm:
> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> index 6114e013092b..73eb53edb8de 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> @@ -1056,7 +1056,8 @@ static int igt_lmem_write_cpu(void *arg)
>  					  obj->mm.pages->sgl, I915_CACHE_NONE,
>  					  true, 0xdeadbeaf, &rq);
>  	if (rq) {
> -		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
> +		dma_resv_add_fence(obj->base.resv, &rq->fence,
> +				   DMA_RESV_USAGE_WRITE);
>  		i915_request_put(rq);
>  	}
>  
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index e0a11ee0e86d..cb3bfccc930f 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -367,7 +367,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
>  		if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
>  			dma_resv_add_excl_fence(lima_bo_resv(bos[i]), fence);
>  		else
> -			dma_resv_add_shared_fence(lima_bo_resv(bos[i]), fence);
> +			dma_resv_add_fence(lima_bo_resv(bos[i]), fence);
>  	}
>  
>  	drm_gem_unlock_reservations((struct drm_gem_object **)bos,
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> index 3164db8be893..8d1eef914ba8 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -395,9 +395,11 @@ static void submit_attach_object_fences(struct msm_gem_submit *submit)
>  		struct drm_gem_object *obj = &submit->bos[i].obj->base;
>  
>  		if (submit->bos[i].flags & MSM_SUBMIT_BO_WRITE)
> -			dma_resv_add_excl_fence(obj->resv, submit->user_fence);
> +			dma_resv_add_fence(obj->resv, submit->user_fence,
> +					   DMA_RESV_USAGE_WRITE);
>  		else if (submit->bos[i].flags & MSM_SUBMIT_BO_READ)
> -			dma_resv_add_shared_fence(obj->resv, submit->user_fence);
> +			dma_resv_add_fence(obj->resv, submit->user_fence,
> +					   DMA_RESV_USAGE_READ);
>  	}
>  }
>  
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index c6bb4dbcd735..05076e530e7d 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -1308,10 +1308,11 @@ nouveau_bo_fence(struct nouveau_bo *nvbo, struct nouveau_fence *fence, bool excl
>  {
>  	struct dma_resv *resv = nvbo->bo.base.resv;
>  
> -	if (exclusive)
> -		dma_resv_add_excl_fence(resv, &fence->base);
> -	else if (fence)
> -		dma_resv_add_shared_fence(resv, &fence->base);
> +	if (!fence)
> +		return;
> +
> +	dma_resv_add_fence(resv, &fence->base, exclusive ?
> +			   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
>  }
>  
>  static void
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index d5e81ccee01c..7f01dcf81fab 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -360,9 +360,11 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
>  		dma_resv_for_each_fence(&cursor, resv,
>  					dma_resv_usage_rw(exclusive),
>  					fence) {
> +			enum dma_resv_usage usage;
>  			struct nouveau_fence *f;
>  
> -			if (i == 0 && dma_resv_iter_is_exclusive(&cursor))
> +			usage = dma_resv_iter_usage(&cursor);
> +			if (i == 0 && usage == DMA_RESV_USAGE_WRITE)
>  				continue;
>  
>  			f = nouveau_local_fence(fence, chan->drm);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index c34114560e49..fda5871aebe3 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -268,7 +268,7 @@ static void panfrost_attach_object_fences(struct drm_gem_object **bos,
>  	int i;
>  
>  	for (i = 0; i < bo_count; i++)
> -		dma_resv_add_excl_fence(bos[i]->resv, fence);
> +		dma_resv_add_fence(bos[i]->resv, fence, DMA_RESV_USAGE_WRITE);
>  }
>  
>  int panfrost_job_push(struct panfrost_job *job)
> diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
> index cde1e8ddaeaa..368d26da0d6a 100644
> --- a/drivers/gpu/drm/qxl/qxl_release.c
> +++ b/drivers/gpu/drm/qxl/qxl_release.c
> @@ -429,7 +429,8 @@ void qxl_release_fence_buffer_objects(struct qxl_release *release)
>  	list_for_each_entry(entry, &release->bos, head) {
>  		bo = entry->bo;
>  
> -		dma_resv_add_shared_fence(bo->base.resv, &release->base);
> +		dma_resv_add_fence(bo->base.resv, &release->base,
> +				   DMA_RESV_USAGE_READ);
>  		ttm_bo_move_to_lru_tail_unlocked(bo);
>  		dma_resv_unlock(bo->base.resv);
>  	}
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
> index 7ffd2e90f325..cb5c4aa45cef 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -791,8 +791,6 @@ void radeon_bo_fence(struct radeon_bo *bo, struct radeon_fence *fence,
>  		return;
>  	}
>  
> -	if (shared)
> -		dma_resv_add_shared_fence(resv, &fence->base);
> -	else
> -		dma_resv_add_excl_fence(resv, &fence->base);
> +	dma_resv_add_fence(resv, &fence->base, shared ?
> +			   DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE);
>  }
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index cff05b62f3f7..d74f9eea855e 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -739,7 +739,7 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>  		return ret;
>  	}
>  
> -	dma_resv_add_shared_fence(bo->base.resv, fence);
> +	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
>  
>  	ret = dma_resv_reserve_fences(bo->base.resv, 1);
>  	if (unlikely(ret)) {
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 1b96b91bf81b..7a96a1db13a7 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -507,7 +507,8 @@ static int ttm_bo_move_to_ghost(struct ttm_buffer_object *bo,
>  	if (ret)
>  		return ret;
>  
> -	dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> +	dma_resv_add_fence(&ghost_obj->base._resv, fence,
> +			   DMA_RESV_USAGE_WRITE);
>  
>  	/**
>  	 * If we're not moving to fixed memory, the TTM object
> @@ -561,7 +562,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>  	struct ttm_resource_manager *man = ttm_manager_type(bdev, new_mem->mem_type);
>  	int ret = 0;
>  
> -	dma_resv_add_excl_fence(bo->base.resv, fence);
> +	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
>  	if (!evict)
>  		ret = ttm_bo_move_to_ghost(bo, fence, man->use_tt);
>  	else if (!from->use_tt && pipeline)
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> index 789c645f004e..0eb995d25df1 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> @@ -154,10 +154,8 @@ void ttm_eu_fence_buffer_objects(struct ww_acquire_ctx *ticket,
>  	list_for_each_entry(entry, list, head) {
>  		struct ttm_buffer_object *bo = entry->bo;
>  
> -		if (entry->num_shared)
> -			dma_resv_add_shared_fence(bo->base.resv, fence);
> -		else
> -			dma_resv_add_excl_fence(bo->base.resv, fence);
> +		dma_resv_add_fence(bo->base.resv, fence, entry->num_shared ?
> +				   DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE);
>  		ttm_bo_move_to_lru_tail_unlocked(bo);
>  		dma_resv_unlock(bo->base.resv);
>  	}
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index 961812d33827..2352e9640922 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -550,8 +550,8 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
>  
>  	for (i = 0; i < job->bo_count; i++) {
>  		/* XXX: Use shared fences for read-only objects. */
> -		dma_resv_add_excl_fence(job->bo[i]->resv,
> -					job->done_fence);
> +		dma_resv_add_fence(job->bo[i]->resv, job->done_fence,
> +				   DMA_RESV_USAGE_WRITE);
>  	}
>  
>  	drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
> diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
> index 594bd6bb00d2..38550317e025 100644
> --- a/drivers/gpu/drm/vc4/vc4_gem.c
> +++ b/drivers/gpu/drm/vc4/vc4_gem.c
> @@ -546,7 +546,7 @@ vc4_update_bo_seqnos(struct vc4_exec_info *exec, uint64_t seqno)
>  		bo = to_vc4_bo(&exec->bo[i]->base);
>  		bo->seqno = seqno;
>  
> -		dma_resv_add_shared_fence(bo->base.base.resv, exec->fence);
> +		dma_resv_add_fence(bo->base.base.resv, exec->fence);
>  	}
>  
>  	list_for_each_entry(bo, &exec->unref_list, unref_head) {
> diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
> index 91fc4940c65a..c2a879734d40 100644
> --- a/drivers/gpu/drm/vgem/vgem_fence.c
> +++ b/drivers/gpu/drm/vgem/vgem_fence.c
> @@ -161,12 +161,9 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
>  	/* Expose the fence via the dma-buf */
>  	dma_resv_lock(resv, NULL);
>  	ret = dma_resv_reserve_fences(resv, 1);
> -	if (!ret) {
> -		if (arg->flags & VGEM_FENCE_WRITE)
> -			dma_resv_add_excl_fence(resv, fence);
> -		else
> -			dma_resv_add_shared_fence(resv, fence);
> -	}
> +	if (!ret)
> +		dma_resv_add_fence(resv, fence, arg->flags & VGEM_FENCE_WRITE ?
> +				   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
>  	dma_resv_unlock(resv);
>  
>  	/* Record the fence in our idr for later signaling */
> diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
> index 1820ca6cf673..580a78809836 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_gem.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
> @@ -250,7 +250,8 @@ void virtio_gpu_array_add_fence(struct virtio_gpu_object_array *objs,
>  	int i;
>  
>  	for (i = 0; i < objs->nents; i++)
> -		dma_resv_add_excl_fence(objs->objs[i]->resv, fence);
> +		dma_resv_add_fence(objs->objs[i]->resv, fence,
> +				   DMA_RESV_USAGE_WRITE);
>  }
>  
>  void virtio_gpu_array_put_free(struct virtio_gpu_object_array *objs)
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> index b96884f7d03d..bec50223efe5 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> @@ -758,7 +758,8 @@ void vmw_bo_fence_single(struct ttm_buffer_object *bo,
>  
>  	ret = dma_resv_reserve_fences(bo->base.resv, 1);
>  	if (!ret)
> -		dma_resv_add_excl_fence(bo->base.resv, &fence->base);
> +		dma_resv_add_fence(bo->base.resv, &fence->base,
> +				   DMA_RESV_USAGE_WRITE);
>  	else
>  		/* Last resort fallback when we are OOM */
>  		dma_fence_wait(&fence->base, false);
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index a297397743a2..71731796c8c3 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -393,15 +393,15 @@ struct dma_buf {
>  	 * e.g. exposed in `Implicit Fence Poll Support`_ must follow the
>  	 * below rules.
>  	 *
> -	 * - Drivers must add a shared fence through dma_resv_add_shared_fence()
> -	 *   for anything the userspace API considers a read access. This highly
> -	 *   depends upon the API and window system.
> +	 * - Drivers must add a read fence through dma_resv_add_fence() with the
> +	 *   DMA_RESV_USAGE_READ flag for anything the userspace API considers a
> +	 *   read access. This highly depends upon the API and window system.
>  	 *
> -	 * - Similarly drivers must set the exclusive fence through
> -	 *   dma_resv_add_excl_fence() for anything the userspace API considers
> -	 *   write access.
> +	 * - Similarly drivers must add a write fence through
> +	 *   dma_resv_add_fence() with the DMA_RESV_USAGE_WRITE flag for
> +	 *   anything the userspace API considers write access.
>  	 *
> -	 * - Drivers may just always set the exclusive fence, since that only
> +	 * - Drivers may just always add a write fence, since that only
>  	 *   causes unecessarily synchronization, but no correctness issues.
>  	 *
>  	 * - Some drivers only expose a synchronous userspace API with no
> @@ -416,7 +416,7 @@ struct dma_buf {
>  	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
>  	 * additional constraints on how they set up fences:
>  	 *
> -	 * - Dynamic importers must obey the exclusive fence and wait for it to
> +	 * - Dynamic importers must obey the write fences and wait for them to
>  	 *   signal before allowing access to the buffer's underlying storage
>  	 *   through the device.
>  	 *
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 92cd8023980f..98dc5234b487 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -195,6 +195,9 @@ struct dma_resv_iter {
>  	/** @fence: the currently handled fence */
>  	struct dma_fence *fence;
>  
> +	/** @fence_usage: the usage of the current fence */
> +	enum dma_resv_usage fence_usage;
> +
>  	/** @seq: sequence number to check for modifications */
>  	unsigned int seq;
>  
> @@ -244,14 +247,15 @@ static inline void dma_resv_iter_end(struct dma_resv_iter *cursor)
>  }
>  
>  /**
> - * dma_resv_iter_is_exclusive - test if the current fence is the exclusive one
> + * dma_resv_iter_usage - Return the usage of the current fence
>   * @cursor: the cursor of the current position
>   *
> - * Returns true if the currently returned fence is the exclusive one.
> + * Returns the usage of the currently processed fence.
>   */
> -static inline bool dma_resv_iter_is_exclusive(struct dma_resv_iter *cursor)
> +static inline enum dma_resv_usage
> +dma_resv_iter_usage(struct dma_resv_iter *cursor)
>  {
> -	return cursor->index == 0;
> +	return cursor->fence_usage;
>  }
>  
>  /**
> @@ -306,9 +310,9 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
>  #define dma_resv_assert_held(obj) lockdep_assert_held(&(obj)->lock.base)
>  
>  #ifdef CONFIG_DEBUG_MUTEXES
> -void dma_resv_reset_shared_max(struct dma_resv *obj);
> +void dma_resv_reset_max_fences(struct dma_resv *obj);
>  #else
> -static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
> +static inline void dma_resv_reset_max_fences(struct dma_resv *obj) {}
>  #endif
>  
>  /**
> @@ -454,17 +458,18 @@ static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
>   */
>  static inline void dma_resv_unlock(struct dma_resv *obj)
>  {
> -	dma_resv_reset_shared_max(obj);
> +	dma_resv_reset_max_fences(obj);
>  	ww_mutex_unlock(&obj->lock);
>  }
>  
>  void dma_resv_init(struct dma_resv *obj);
>  void dma_resv_fini(struct dma_resv *obj);
>  int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences);
> -void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
> +void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
> +			enum dma_resv_usage usage);
>  void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
> -			     struct dma_fence *fence);
> -void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
> +			     struct dma_fence *fence,
> +			     enum dma_resv_usage usage);
>  int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
>  			unsigned int *num_fences, struct dma_fence ***fences);
>  int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,

Really only bikesheds left. I think ... I guess better to merge this
sooner and test more than later.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 03/16] dma-buf: specify usage while adding fences to dma_resv obj v6
  2022-04-06 12:32   ` Daniel Vetter
@ 2022-04-06 12:35     ` Daniel Vetter
  2022-04-07  8:01       ` Christian König
  0 siblings, 1 reply; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:35 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 02:32:22PM +0200, Daniel Vetter wrote:
> On Wed, Apr 06, 2022 at 09:51:19AM +0200, Christian König wrote:
> > Instead of distingting between shared and exclusive fences specify
> > the fence usage while adding fences.
> > 
> > Rework all drivers to use this interface instead and deprecate the old one.
> > 
> > v2: some kerneldoc comments suggested by Daniel
> > v3: fix a missing case in radeon
> > v4: rebase on nouveau changes, fix lockdep and temporary disable warning
> > v5: more documentation updates
> > v6: separate internal dma_resv changes from this patch, avoids to
> >     disable warning temporary, rebase on upstream changes
> > 
> > Signed-off-by: Christian König <christian.koenig@amd.com>
> > ---
> >  drivers/dma-buf/dma-resv.c                    |  48 +++++++--
> >  drivers/dma-buf/st-dma-resv.c                 | 101 +++++-------------
> >  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   2 +-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   4 +-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |   6 +-
> >  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |  10 +-
> >  drivers/gpu/drm/i915/gem/i915_gem_busy.c      |  13 +--
> >  drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   3 +-
> >  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  |   5 +-
> >  .../drm/i915/gem/selftests/i915_gem_migrate.c |   4 +-
> >  .../drm/i915/gem/selftests/i915_gem_mman.c    |   3 +-
> >  drivers/gpu/drm/i915/i915_vma.c               |   8 +-
> >  .../drm/i915/selftests/intel_memory_region.c  |   3 +-
> >  drivers/gpu/drm/lima/lima_gem.c               |   2 +-
> >  drivers/gpu/drm/msm/msm_gem_submit.c          |   6 +-
> >  drivers/gpu/drm/nouveau/nouveau_bo.c          |   9 +-
> >  drivers/gpu/drm/nouveau/nouveau_fence.c       |   4 +-
> >  drivers/gpu/drm/panfrost/panfrost_job.c       |   2 +-
> >  drivers/gpu/drm/qxl/qxl_release.c             |   3 +-
> >  drivers/gpu/drm/radeon/radeon_object.c        |   6 +-
> >  drivers/gpu/drm/ttm/ttm_bo.c                  |   2 +-
> >  drivers/gpu/drm/ttm/ttm_bo_util.c             |   5 +-
> >  drivers/gpu/drm/ttm/ttm_execbuf_util.c        |   6 +-
> >  drivers/gpu/drm/v3d/v3d_gem.c                 |   4 +-
> >  drivers/gpu/drm/vc4/vc4_gem.c                 |   2 +-
> >  drivers/gpu/drm/vgem/vgem_fence.c             |   9 +-
> >  drivers/gpu/drm/virtio/virtgpu_gem.c          |   3 +-
> >  drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            |   3 +-
> >  include/linux/dma-buf.h                       |  16 +--
> >  include/linux/dma-resv.h                      |  25 +++--
> >  30 files changed, 151 insertions(+), 166 deletions(-)
> > 
> > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> > index 17237e6ee30c..543dae6566d2 100644
> > --- a/drivers/dma-buf/dma-resv.c
> > +++ b/drivers/dma-buf/dma-resv.c
> > @@ -234,14 +234,14 @@ EXPORT_SYMBOL(dma_resv_reserve_fences);
> >  
> >  #ifdef CONFIG_DEBUG_MUTEXES
> >  /**
> > - * dma_resv_reset_shared_max - reset shared fences for debugging
> > + * dma_resv_reset_max_fences - reset shared fences for debugging
> >   * @obj: the dma_resv object to reset
> >   *
> >   * Reset the number of pre-reserved shared slots to test that drivers do
> >   * correct slot allocation using dma_resv_reserve_fences(). See also
> >   * &dma_resv_list.shared_max.
> >   */
> > -void dma_resv_reset_shared_max(struct dma_resv *obj)
> > +void dma_resv_reset_max_fences(struct dma_resv *obj)
> >  {
> >  	struct dma_resv_list *fences = dma_resv_shared_list(obj);
> >  
> > @@ -251,7 +251,7 @@ void dma_resv_reset_shared_max(struct dma_resv *obj)
> >  	if (fences)
> >  		fences->shared_max = fences->shared_count;
> >  }
> > -EXPORT_SYMBOL(dma_resv_reset_shared_max);
> > +EXPORT_SYMBOL(dma_resv_reset_max_fences);
> >  #endif
> >  
> >  /**
> > @@ -264,7 +264,8 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
> >   *
> >   * See also &dma_resv.fence for a discussion of the semantics.
> >   */
> > -void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
> > +static void dma_resv_add_shared_fence(struct dma_resv *obj,
> > +				      struct dma_fence *fence)
> >  {
> >  	struct dma_resv_list *fobj;
> >  	struct dma_fence *old;
> > @@ -305,13 +306,13 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
> >  	write_seqcount_end(&obj->seq);
> >  	dma_fence_put(old);
> >  }
> > -EXPORT_SYMBOL(dma_resv_add_shared_fence);
> >  
> >  /**
> >   * dma_resv_replace_fences - replace fences in the dma_resv obj
> >   * @obj: the reservation object
> >   * @context: the context of the fences to replace
> >   * @replacement: the new fence to use instead
> > + * @usage: how the new fence is used, see enum dma_resv_usage
> >   *
> >   * Replace fences with a specified context with a new fence. Only valid if the
> >   * operation represented by the original fence has no longer access to the
> > @@ -321,12 +322,16 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
> >   * update fence which makes the resource inaccessible.
> >   */
> >  void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
> > -			     struct dma_fence *replacement)
> > +			     struct dma_fence *replacement,
> > +			     enum dma_resv_usage usage)
> >  {
> >  	struct dma_resv_list *list;
> >  	struct dma_fence *old;
> >  	unsigned int i;
> >  
> > +	/* Only readers supported for now */
> > +	WARN_ON(usage != DMA_RESV_USAGE_READ);
> > +
> >  	dma_resv_assert_held(obj);
> >  
> >  	write_seqcount_begin(&obj->seq);
> > @@ -360,7 +365,8 @@ EXPORT_SYMBOL(dma_resv_replace_fences);
> >   * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
> >   * See also &dma_resv.fence_excl for a discussion of the semantics.
> >   */
> > -void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
> > +static void dma_resv_add_excl_fence(struct dma_resv *obj,
> > +				    struct dma_fence *fence)
> >  {
> >  	struct dma_fence *old_fence = dma_resv_excl_fence(obj);
> >  
> > @@ -375,7 +381,27 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
> >  
> >  	dma_fence_put(old_fence);
> >  }
> > -EXPORT_SYMBOL(dma_resv_add_excl_fence);
> > +
> > +/**
> > + * dma_resv_add_fence - Add a fence to the dma_resv obj
> > + * @obj: the reservation object
> > + * @fence: the fence to add
> > + * @usage: how the fence is used, see enum dma_resv_usage
> > + *
> > + * Add a fence to a slot, @obj must be locked with dma_resv_lock(), and
> > + * dma_resv_reserve_fences() has been called.
> > + *
> > + * See also &dma_resv.fence for a discussion of the semantics.
> > + */
> > +void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
> > +			enum dma_resv_usage usage)
> > +{
> > +	if (usage == DMA_RESV_USAGE_WRITE)
> > +		dma_resv_add_excl_fence(obj, fence);
> > +	else
> > +		dma_resv_add_shared_fence(obj, fence);
> > +}
> > +EXPORT_SYMBOL(dma_resv_add_fence);
> >  
> >  /* Restart the iterator by initializing all the necessary fields, but not the
> >   * relation to the dma_resv object. */
> > @@ -574,7 +600,7 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
> >  		}
> >  
> >  		dma_fence_get(f);
> > -		if (dma_resv_iter_is_exclusive(&cursor))
> > +		if (dma_resv_iter_usage(&cursor) == DMA_RESV_USAGE_WRITE)
> >  			excl = f;
> >  		else
> >  			RCU_INIT_POINTER(list->shared[list->shared_count++], f);
> > @@ -771,13 +797,13 @@ EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
> >   */
> >  void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
> >  {
> > +	static const char *usage[] = { "write", "read" };
> >  	struct dma_resv_iter cursor;
> >  	struct dma_fence *fence;
> >  
> >  	dma_resv_for_each_fence(&cursor, obj, DMA_RESV_USAGE_READ, fence) {
> >  		seq_printf(seq, "\t%s fence:",
> > -			   dma_resv_iter_is_exclusive(&cursor) ?
> > -				"Exclusive" : "Shared");
> > +			   usage[dma_resv_iter_usage(&cursor)]);
> >  		dma_fence_describe(fence, seq);
> >  	}
> >  }
> > diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
> > index d097981061b1..d0f7c2bfd4f0 100644
> > --- a/drivers/dma-buf/st-dma-resv.c
> > +++ b/drivers/dma-buf/st-dma-resv.c
> > @@ -58,8 +58,9 @@ static int sanitycheck(void *arg)
> >  	return r;
> >  }
> >  
> > -static int test_signaling(void *arg, enum dma_resv_usage usage)
> > +static int test_signaling(void *arg)
> >  {
> > +	enum dma_resv_usage usage = (unsigned long)arg;
> >  	struct dma_resv resv;
> >  	struct dma_fence *f;
> >  	int r;
> > @@ -81,11 +82,7 @@ static int test_signaling(void *arg, enum dma_resv_usage usage)
> >  		goto err_unlock;
> >  	}
> >  
> > -	if (usage >= DMA_RESV_USAGE_READ)
> > -		dma_resv_add_shared_fence(&resv, f);
> > -	else
> > -		dma_resv_add_excl_fence(&resv, f);
> > -
> > +	dma_resv_add_fence(&resv, f, usage);
> >  	if (dma_resv_test_signaled(&resv, usage)) {
> >  		pr_err("Resv unexpectedly signaled\n");
> >  		r = -EINVAL;
> > @@ -105,18 +102,9 @@ static int test_signaling(void *arg, enum dma_resv_usage usage)
> >  	return r;
> >  }
> >  
> > -static int test_excl_signaling(void *arg)
> > -{
> > -	return test_signaling(arg, DMA_RESV_USAGE_WRITE);
> > -}
> > -
> > -static int test_shared_signaling(void *arg)
> > -{
> > -	return test_signaling(arg, DMA_RESV_USAGE_READ);
> > -}
> > -
> > -static int test_for_each(void *arg, enum dma_resv_usage usage)
> > +static int test_for_each(void *arg)
> >  {
> > +	enum dma_resv_usage usage = (unsigned long)arg;
> >  	struct dma_resv_iter cursor;
> >  	struct dma_fence *f, *fence;
> >  	struct dma_resv resv;
> > @@ -139,10 +127,7 @@ static int test_for_each(void *arg, enum dma_resv_usage usage)
> >  		goto err_unlock;
> >  	}
> >  
> > -	if (usage >= DMA_RESV_USAGE_READ)
> > -		dma_resv_add_shared_fence(&resv, f);
> > -	else
> > -		dma_resv_add_excl_fence(&resv, f);
> > +	dma_resv_add_fence(&resv, f, usage);
> >  
> >  	r = -ENOENT;
> >  	dma_resv_for_each_fence(&cursor, &resv, usage, fence) {
> > @@ -156,8 +141,7 @@ static int test_for_each(void *arg, enum dma_resv_usage usage)
> >  			r = -EINVAL;
> >  			goto err_unlock;
> >  		}
> > -		if (dma_resv_iter_is_exclusive(&cursor) !=
> > -		    (usage >= DMA_RESV_USAGE_READ)) {
> > +		if (dma_resv_iter_usage(&cursor) != usage) {
> >  			pr_err("Unexpected fence usage\n");
> >  			r = -EINVAL;
> >  			goto err_unlock;
> > @@ -177,18 +161,9 @@ static int test_for_each(void *arg, enum dma_resv_usage usage)
> >  	return r;
> >  }
> >  
> > -static int test_excl_for_each(void *arg)
> > -{
> > -	return test_for_each(arg, DMA_RESV_USAGE_WRITE);
> > -}
> > -
> > -static int test_shared_for_each(void *arg)
> > -{
> > -	return test_for_each(arg, DMA_RESV_USAGE_READ);
> > -}
> > -
> > -static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
> > +static int test_for_each_unlocked(void *arg)
> >  {
> > +	enum dma_resv_usage usage = (unsigned long)arg;
> >  	struct dma_resv_iter cursor;
> >  	struct dma_fence *f, *fence;
> >  	struct dma_resv resv;
> > @@ -212,10 +187,7 @@ static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
> >  		goto err_free;
> >  	}
> >  
> > -	if (usage >= DMA_RESV_USAGE_READ)
> > -		dma_resv_add_shared_fence(&resv, f);
> > -	else
> > -		dma_resv_add_excl_fence(&resv, f);
> > +	dma_resv_add_fence(&resv, f, usage);
> >  	dma_resv_unlock(&resv);
> >  
> >  	r = -ENOENT;
> > @@ -235,8 +207,7 @@ static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
> >  			r = -EINVAL;
> >  			goto err_iter_end;
> >  		}
> > -		if (dma_resv_iter_is_exclusive(&cursor) !=
> > -		    (usage >= DMA_RESV_USAGE_READ)) {
> > +		if (dma_resv_iter_usage(&cursor) != usage) {
> >  			pr_err("Unexpected fence usage\n");
> >  			r = -EINVAL;
> >  			goto err_iter_end;
> > @@ -262,18 +233,9 @@ static int test_for_each_unlocked(void *arg, enum dma_resv_usage usage)
> >  	return r;
> >  }
> >  
> > -static int test_excl_for_each_unlocked(void *arg)
> > -{
> > -	return test_for_each_unlocked(arg, DMA_RESV_USAGE_WRITE);
> > -}
> > -
> > -static int test_shared_for_each_unlocked(void *arg)
> > -{
> > -	return test_for_each_unlocked(arg, DMA_RESV_USAGE_READ);
> > -}
> > -
> > -static int test_get_fences(void *arg, enum dma_resv_usage usage)
> > +static int test_get_fences(void *arg)
> >  {
> > +	enum dma_resv_usage usage = (unsigned long)arg;
> >  	struct dma_fence *f, **fences = NULL;
> >  	struct dma_resv resv;
> >  	int r, i;
> > @@ -296,10 +258,7 @@ static int test_get_fences(void *arg, enum dma_resv_usage usage)
> >  		goto err_resv;
> >  	}
> >  
> > -	if (usage >= DMA_RESV_USAGE_READ)
> > -		dma_resv_add_shared_fence(&resv, f);
> > -	else
> > -		dma_resv_add_excl_fence(&resv, f);
> > +	dma_resv_add_fence(&resv, f, usage);
> >  	dma_resv_unlock(&resv);
> >  
> >  	r = dma_resv_get_fences(&resv, usage, &i, &fences);
> > @@ -324,30 +283,24 @@ static int test_get_fences(void *arg, enum dma_resv_usage usage)
> >  	return r;
> >  }
> >  
> > -static int test_excl_get_fences(void *arg)
> > -{
> > -	return test_get_fences(arg, DMA_RESV_USAGE_WRITE);
> > -}
> > -
> > -static int test_shared_get_fences(void *arg)
> > -{
> > -	return test_get_fences(arg, DMA_RESV_USAGE_READ);
> > -}
> > -
> >  int dma_resv(void)
> >  {
> >  	static const struct subtest tests[] = {
> >  		SUBTEST(sanitycheck),
> > -		SUBTEST(test_excl_signaling),
> > -		SUBTEST(test_shared_signaling),
> > -		SUBTEST(test_excl_for_each),
> > -		SUBTEST(test_shared_for_each),
> > -		SUBTEST(test_excl_for_each_unlocked),
> > -		SUBTEST(test_shared_for_each_unlocked),
> > -		SUBTEST(test_excl_get_fences),
> > -		SUBTEST(test_shared_get_fences),
> > +		SUBTEST(test_signaling),
> > +		SUBTEST(test_for_each),
> > +		SUBTEST(test_for_each_unlocked),
> > +		SUBTEST(test_get_fences),
> >  	};
> > +	enum dma_resv_usage usage;
> > +	int r;
> >  
> >  	spin_lock_init(&fence_lock);
> > -	return subtests(tests, NULL);
> > +	for (usage = DMA_RESV_USAGE_WRITE; usage <= DMA_RESV_USAGE_READ;
> > +	     ++usage) {
> > +		r = subtests(tests, (void *)(unsigned long)usage);
> > +		if (r)
> > +			return r;
> > +	}
> > +	return 0;
> >  }
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index 98b1736bb221..5031e26e6716 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -263,7 +263,7 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
> >  	 */
> >  	replacement = dma_fence_get_stub();
> >  	dma_resv_replace_fences(bo->tbo.base.resv, ef->base.context,
> > -				replacement);
> > +				replacement, DMA_RESV_USAGE_READ);
> >  	dma_fence_put(replacement);
> >  	return 0;
> >  }
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 413f32c3fd63..76fd916424d6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -55,8 +55,8 @@ static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
> >  	bo = amdgpu_bo_ref(gem_to_amdgpu_bo(gobj));
> >  	p->uf_entry.priority = 0;
> >  	p->uf_entry.tv.bo = &bo->tbo;
> > -	/* One for TTM and one for the CS job */
> > -	p->uf_entry.tv.num_shared = 2;
> > +	/* One for TTM and two for the CS job */
> > +	p->uf_entry.tv.num_shared = 3;
> >  
> >  	drm_gem_object_put(gobj);
> >  
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > index a7f39f8ab7be..a3cdf8a24377 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > @@ -1397,10 +1397,8 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
> >  		return;
> >  	}
> >  
> > -	if (shared)
> > -		dma_resv_add_shared_fence(resv, fence);
> > -	else
> > -		dma_resv_add_excl_fence(resv, fence);
> > +	dma_resv_add_fence(resv, fence, shared ? DMA_RESV_USAGE_READ :
> > +			   DMA_RESV_USAGE_WRITE);
> >  }
> >  
> >  /**
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > index 53f7c78628a4..98bb5c9239de 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > @@ -202,14 +202,10 @@ static void submit_attach_object_fences(struct etnaviv_gem_submit *submit)
> >  
> >  	for (i = 0; i < submit->nr_bos; i++) {
> >  		struct drm_gem_object *obj = &submit->bos[i].obj->base;
> > +		bool write = submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE;
> >  
> > -		if (submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE)
> > -			dma_resv_add_excl_fence(obj->resv,
> > -							  submit->out_fence);
> > -		else
> > -			dma_resv_add_shared_fence(obj->resv,
> > -							    submit->out_fence);
> > -
> > +		dma_resv_add_fence(obj->resv, submit->out_fence, write ?
> > +				   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
> 
> Iirc I had some suggestions to use dma_resv_usage_rw here and above. Do
> these happen in later patches? There's also a few more of these later on.
> 
> >  		submit_unlock_object(submit, i);
> >  	}
> >  }
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > index 14a1c0ad8c3c..e7ae94ee1b44 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > @@ -148,12 +148,13 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
> >  		if (dma_resv_iter_is_restarted(&cursor))
> >  			args->busy = 0;
> >  
> > -		if (dma_resv_iter_is_exclusive(&cursor))
> > -			/* Translate the exclusive fence to the READ *and* WRITE engine */
> > -			args->busy |= busy_check_writer(fence);
> > -		else
> > -			/* Translate shared fences to READ set of engines */
> > -			args->busy |= busy_check_reader(fence);
> > +		/* Translate read fences to READ set of engines */
> > +		args->busy |= busy_check_reader(fence);
> > +	}
> > +	dma_resv_iter_begin(&cursor, obj->base.resv, DMA_RESV_USAGE_WRITE);
> > +	dma_resv_for_each_fence_unlocked(&cursor, fence) {
> 
> Two loops is a bit much but also meh.
> 
> > +		/* Translate the write fences to the READ *and* WRITE engine */
> > +		args->busy |= busy_check_writer(fence);
> >  	}
> >  	dma_resv_iter_end(&cursor);
> >  
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> > index 1fd0cc9ca213..f5f2b8b115ea 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> > @@ -116,7 +116,8 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
> >  						obj->base.resv, NULL, true,
> >  						i915_fence_timeout(i915),
> >  						I915_FENCE_GFP);
> > -		dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
> > +		dma_resv_add_fence(obj->base.resv, &clflush->base.dma,
> > +				   DMA_RESV_USAGE_WRITE);
> >  		dma_fence_work_commit(&clflush->base);
> >  		/*
> >  		 * We must have successfully populated the pages(since we are
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> > index 432ac74ff225..438b8a95b3d1 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> > @@ -637,9 +637,8 @@ int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
> >  	if (IS_ERR_OR_NULL(copy_fence))
> >  		return PTR_ERR_OR_ZERO(copy_fence);
> >  
> > -	dma_resv_add_excl_fence(dst_bo->base.resv, copy_fence);
> > -	dma_resv_add_shared_fence(src_bo->base.resv, copy_fence);
> > -
> > +	dma_resv_add_fence(dst_bo->base.resv, copy_fence, DMA_RESV_USAGE_WRITE);
> > +	dma_resv_add_fence(src_bo->base.resv, copy_fence, DMA_RESV_USAGE_READ);
> >  	dma_fence_put(copy_fence);
> >  
> >  	return 0;
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> > index 0e52eb87cd55..4997ed18b6e4 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> > @@ -218,8 +218,8 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
> >  		if (rq) {
> >  			err = dma_resv_reserve_fences(obj->base.resv, 1);
> >  			if (!err)
> > -				dma_resv_add_excl_fence(obj->base.resv,
> > -							&rq->fence);
> > +				dma_resv_add_fence(obj->base.resv, &rq->fence,
> > +						   DMA_RESV_USAGE_WRITE);
> >  			i915_gem_object_set_moving_fence(obj, &rq->fence);
> >  			i915_request_put(rq);
> >  		}
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> > index a132e241c3ee..3a6e3f6d239f 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> > @@ -1220,7 +1220,8 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
> >  					  expand32(POISON_INUSE), &rq);
> >  	i915_gem_object_unpin_pages(obj);
> >  	if (rq) {
> > -		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
> > +		dma_resv_add_fence(obj->base.resv, &rq->fence,
> > +				   DMA_RESV_USAGE_WRITE);
> >  		i915_gem_object_set_moving_fence(obj, &rq->fence);
> >  		i915_request_put(rq);
> >  	}
> > diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> > index bae3423f58e8..524477d8939e 100644
> > --- a/drivers/gpu/drm/i915/i915_vma.c
> > +++ b/drivers/gpu/drm/i915/i915_vma.c
> > @@ -1826,7 +1826,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
> >  		}
> >  
> >  		if (fence) {
> > -			dma_resv_add_excl_fence(vma->obj->base.resv, fence);
> > +			dma_resv_add_fence(vma->obj->base.resv, fence,
> > +					   DMA_RESV_USAGE_WRITE);
> >  			obj->write_domain = I915_GEM_DOMAIN_RENDER;
> >  			obj->read_domains = 0;
> >  		}
> > @@ -1838,7 +1839,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
> >  		}
> >  
> >  		if (fence) {
> > -			dma_resv_add_shared_fence(vma->obj->base.resv, fence);
> > +			dma_resv_add_fence(vma->obj->base.resv, fence,
> > +					   DMA_RESV_USAGE_READ);
> >  			obj->write_domain = 0;
> >  		}
> >  	}
> > @@ -2078,7 +2080,7 @@ int i915_vma_unbind_async(struct i915_vma *vma, bool trylock_vm)
> >  		goto out_rpm;
> >  	}
> >  
> > -	dma_resv_add_shared_fence(obj->base.resv, fence);
> > +	dma_resv_add_fence(obj->base.resv, fence, DMA_RESV_USAGE_READ);
> >  	dma_fence_put(fence);
> >  
> >  out_rpm:
> > diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> > index 6114e013092b..73eb53edb8de 100644
> > --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> > +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> > @@ -1056,7 +1056,8 @@ static int igt_lmem_write_cpu(void *arg)
> >  					  obj->mm.pages->sgl, I915_CACHE_NONE,
> >  					  true, 0xdeadbeaf, &rq);
> >  	if (rq) {
> > -		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
> > +		dma_resv_add_fence(obj->base.resv, &rq->fence,
> > +				   DMA_RESV_USAGE_WRITE);
> >  		i915_request_put(rq);
> >  	}
> >  
> > diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> > index e0a11ee0e86d..cb3bfccc930f 100644
> > --- a/drivers/gpu/drm/lima/lima_gem.c
> > +++ b/drivers/gpu/drm/lima/lima_gem.c
> > @@ -367,7 +367,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
> >  		if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
> >  			dma_resv_add_excl_fence(lima_bo_resv(bos[i]), fence);
> >  		else
> > -			dma_resv_add_shared_fence(lima_bo_resv(bos[i]), fence);
> > +			dma_resv_add_fence(lima_bo_resv(bos[i]), fence);


Correction on the r-b, I'm still pretty sure that this won't compile at
all.
-Daniel

> >  	}
> >  
> >  	drm_gem_unlock_reservations((struct drm_gem_object **)bos,
> > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> > index 3164db8be893..8d1eef914ba8 100644
> > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > @@ -395,9 +395,11 @@ static void submit_attach_object_fences(struct msm_gem_submit *submit)
> >  		struct drm_gem_object *obj = &submit->bos[i].obj->base;
> >  
> >  		if (submit->bos[i].flags & MSM_SUBMIT_BO_WRITE)
> > -			dma_resv_add_excl_fence(obj->resv, submit->user_fence);
> > +			dma_resv_add_fence(obj->resv, submit->user_fence,
> > +					   DMA_RESV_USAGE_WRITE);
> >  		else if (submit->bos[i].flags & MSM_SUBMIT_BO_READ)
> > -			dma_resv_add_shared_fence(obj->resv, submit->user_fence);
> > +			dma_resv_add_fence(obj->resv, submit->user_fence,
> > +					   DMA_RESV_USAGE_READ);
> >  	}
> >  }
> >  
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> > index c6bb4dbcd735..05076e530e7d 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> > @@ -1308,10 +1308,11 @@ nouveau_bo_fence(struct nouveau_bo *nvbo, struct nouveau_fence *fence, bool excl
> >  {
> >  	struct dma_resv *resv = nvbo->bo.base.resv;
> >  
> > -	if (exclusive)
> > -		dma_resv_add_excl_fence(resv, &fence->base);
> > -	else if (fence)
> > -		dma_resv_add_shared_fence(resv, &fence->base);
> > +	if (!fence)
> > +		return;
> > +
> > +	dma_resv_add_fence(resv, &fence->base, exclusive ?
> > +			   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
> >  }
> >  
> >  static void
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > index d5e81ccee01c..7f01dcf81fab 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > @@ -360,9 +360,11 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
> >  		dma_resv_for_each_fence(&cursor, resv,
> >  					dma_resv_usage_rw(exclusive),
> >  					fence) {
> > +			enum dma_resv_usage usage;
> >  			struct nouveau_fence *f;
> >  
> > -			if (i == 0 && dma_resv_iter_is_exclusive(&cursor))
> > +			usage = dma_resv_iter_usage(&cursor);
> > +			if (i == 0 && usage == DMA_RESV_USAGE_WRITE)
> >  				continue;
> >  
> >  			f = nouveau_local_fence(fence, chan->drm);
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index c34114560e49..fda5871aebe3 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -268,7 +268,7 @@ static void panfrost_attach_object_fences(struct drm_gem_object **bos,
> >  	int i;
> >  
> >  	for (i = 0; i < bo_count; i++)
> > -		dma_resv_add_excl_fence(bos[i]->resv, fence);
> > +		dma_resv_add_fence(bos[i]->resv, fence, DMA_RESV_USAGE_WRITE);
> >  }
> >  
> >  int panfrost_job_push(struct panfrost_job *job)
> > diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
> > index cde1e8ddaeaa..368d26da0d6a 100644
> > --- a/drivers/gpu/drm/qxl/qxl_release.c
> > +++ b/drivers/gpu/drm/qxl/qxl_release.c
> > @@ -429,7 +429,8 @@ void qxl_release_fence_buffer_objects(struct qxl_release *release)
> >  	list_for_each_entry(entry, &release->bos, head) {
> >  		bo = entry->bo;
> >  
> > -		dma_resv_add_shared_fence(bo->base.resv, &release->base);
> > +		dma_resv_add_fence(bo->base.resv, &release->base,
> > +				   DMA_RESV_USAGE_READ);
> >  		ttm_bo_move_to_lru_tail_unlocked(bo);
> >  		dma_resv_unlock(bo->base.resv);
> >  	}
> > diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
> > index 7ffd2e90f325..cb5c4aa45cef 100644
> > --- a/drivers/gpu/drm/radeon/radeon_object.c
> > +++ b/drivers/gpu/drm/radeon/radeon_object.c
> > @@ -791,8 +791,6 @@ void radeon_bo_fence(struct radeon_bo *bo, struct radeon_fence *fence,
> >  		return;
> >  	}
> >  
> > -	if (shared)
> > -		dma_resv_add_shared_fence(resv, &fence->base);
> > -	else
> > -		dma_resv_add_excl_fence(resv, &fence->base);
> > +	dma_resv_add_fence(resv, &fence->base, shared ?
> > +			   DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE);
> >  }
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> > index cff05b62f3f7..d74f9eea855e 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -739,7 +739,7 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
> >  		return ret;
> >  	}
> >  
> > -	dma_resv_add_shared_fence(bo->base.resv, fence);
> > +	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
> >  
> >  	ret = dma_resv_reserve_fences(bo->base.resv, 1);
> >  	if (unlikely(ret)) {
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > index 1b96b91bf81b..7a96a1db13a7 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > @@ -507,7 +507,8 @@ static int ttm_bo_move_to_ghost(struct ttm_buffer_object *bo,
> >  	if (ret)
> >  		return ret;
> >  
> > -	dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> > +	dma_resv_add_fence(&ghost_obj->base._resv, fence,
> > +			   DMA_RESV_USAGE_WRITE);
> >  
> >  	/**
> >  	 * If we're not moving to fixed memory, the TTM object
> > @@ -561,7 +562,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> >  	struct ttm_resource_manager *man = ttm_manager_type(bdev, new_mem->mem_type);
> >  	int ret = 0;
> >  
> > -	dma_resv_add_excl_fence(bo->base.resv, fence);
> > +	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
> >  	if (!evict)
> >  		ret = ttm_bo_move_to_ghost(bo, fence, man->use_tt);
> >  	else if (!from->use_tt && pipeline)
> > diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > index 789c645f004e..0eb995d25df1 100644
> > --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > @@ -154,10 +154,8 @@ void ttm_eu_fence_buffer_objects(struct ww_acquire_ctx *ticket,
> >  	list_for_each_entry(entry, list, head) {
> >  		struct ttm_buffer_object *bo = entry->bo;
> >  
> > -		if (entry->num_shared)
> > -			dma_resv_add_shared_fence(bo->base.resv, fence);
> > -		else
> > -			dma_resv_add_excl_fence(bo->base.resv, fence);
> > +		dma_resv_add_fence(bo->base.resv, fence, entry->num_shared ?
> > +				   DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE);
> >  		ttm_bo_move_to_lru_tail_unlocked(bo);
> >  		dma_resv_unlock(bo->base.resv);
> >  	}
> > diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> > index 961812d33827..2352e9640922 100644
> > --- a/drivers/gpu/drm/v3d/v3d_gem.c
> > +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> > @@ -550,8 +550,8 @@ v3d_attach_fences_and_unlock_reservation(struct drm_file *file_priv,
> >  
> >  	for (i = 0; i < job->bo_count; i++) {
> >  		/* XXX: Use shared fences for read-only objects. */
> > -		dma_resv_add_excl_fence(job->bo[i]->resv,
> > -					job->done_fence);
> > +		dma_resv_add_fence(job->bo[i]->resv, job->done_fence,
> > +				   DMA_RESV_USAGE_WRITE);
> >  	}
> >  
> >  	drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx);
> > diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
> > index 594bd6bb00d2..38550317e025 100644
> > --- a/drivers/gpu/drm/vc4/vc4_gem.c
> > +++ b/drivers/gpu/drm/vc4/vc4_gem.c
> > @@ -546,7 +546,7 @@ vc4_update_bo_seqnos(struct vc4_exec_info *exec, uint64_t seqno)
> >  		bo = to_vc4_bo(&exec->bo[i]->base);
> >  		bo->seqno = seqno;
> >  
> > -		dma_resv_add_shared_fence(bo->base.base.resv, exec->fence);
> > +		dma_resv_add_fence(bo->base.base.resv, exec->fence);
> >  	}
> >  
> >  	list_for_each_entry(bo, &exec->unref_list, unref_head) {
> > diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
> > index 91fc4940c65a..c2a879734d40 100644
> > --- a/drivers/gpu/drm/vgem/vgem_fence.c
> > +++ b/drivers/gpu/drm/vgem/vgem_fence.c
> > @@ -161,12 +161,9 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
> >  	/* Expose the fence via the dma-buf */
> >  	dma_resv_lock(resv, NULL);
> >  	ret = dma_resv_reserve_fences(resv, 1);
> > -	if (!ret) {
> > -		if (arg->flags & VGEM_FENCE_WRITE)
> > -			dma_resv_add_excl_fence(resv, fence);
> > -		else
> > -			dma_resv_add_shared_fence(resv, fence);
> > -	}
> > +	if (!ret)
> > +		dma_resv_add_fence(resv, fence, arg->flags & VGEM_FENCE_WRITE ?
> > +				   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
> >  	dma_resv_unlock(resv);
> >  
> >  	/* Record the fence in our idr for later signaling */
> > diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
> > index 1820ca6cf673..580a78809836 100644
> > --- a/drivers/gpu/drm/virtio/virtgpu_gem.c
> > +++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
> > @@ -250,7 +250,8 @@ void virtio_gpu_array_add_fence(struct virtio_gpu_object_array *objs,
> >  	int i;
> >  
> >  	for (i = 0; i < objs->nents; i++)
> > -		dma_resv_add_excl_fence(objs->objs[i]->resv, fence);
> > +		dma_resv_add_fence(objs->objs[i]->resv, fence,
> > +				   DMA_RESV_USAGE_WRITE);
> >  }
> >  
> >  void virtio_gpu_array_put_free(struct virtio_gpu_object_array *objs)
> > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> > index b96884f7d03d..bec50223efe5 100644
> > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> > @@ -758,7 +758,8 @@ void vmw_bo_fence_single(struct ttm_buffer_object *bo,
> >  
> >  	ret = dma_resv_reserve_fences(bo->base.resv, 1);
> >  	if (!ret)
> > -		dma_resv_add_excl_fence(bo->base.resv, &fence->base);
> > +		dma_resv_add_fence(bo->base.resv, &fence->base,
> > +				   DMA_RESV_USAGE_WRITE);
> >  	else
> >  		/* Last resort fallback when we are OOM */
> >  		dma_fence_wait(&fence->base, false);
> > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > index a297397743a2..71731796c8c3 100644
> > --- a/include/linux/dma-buf.h
> > +++ b/include/linux/dma-buf.h
> > @@ -393,15 +393,15 @@ struct dma_buf {
> >  	 * e.g. exposed in `Implicit Fence Poll Support`_ must follow the
> >  	 * below rules.
> >  	 *
> > -	 * - Drivers must add a shared fence through dma_resv_add_shared_fence()
> > -	 *   for anything the userspace API considers a read access. This highly
> > -	 *   depends upon the API and window system.
> > +	 * - Drivers must add a read fence through dma_resv_add_fence() with the
> > +	 *   DMA_RESV_USAGE_READ flag for anything the userspace API considers a
> > +	 *   read access. This highly depends upon the API and window system.
> >  	 *
> > -	 * - Similarly drivers must set the exclusive fence through
> > -	 *   dma_resv_add_excl_fence() for anything the userspace API considers
> > -	 *   write access.
> > +	 * - Similarly drivers must add a write fence through
> > +	 *   dma_resv_add_fence() with the DMA_RESV_USAGE_WRITE flag for
> > +	 *   anything the userspace API considers write access.
> >  	 *
> > -	 * - Drivers may just always set the exclusive fence, since that only
> > +	 * - Drivers may just always add a write fence, since that only
> >  	 *   causes unecessarily synchronization, but no correctness issues.
> >  	 *
> >  	 * - Some drivers only expose a synchronous userspace API with no
> > @@ -416,7 +416,7 @@ struct dma_buf {
> >  	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
> >  	 * additional constraints on how they set up fences:
> >  	 *
> > -	 * - Dynamic importers must obey the exclusive fence and wait for it to
> > +	 * - Dynamic importers must obey the write fences and wait for them to
> >  	 *   signal before allowing access to the buffer's underlying storage
> >  	 *   through the device.
> >  	 *
> > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> > index 92cd8023980f..98dc5234b487 100644
> > --- a/include/linux/dma-resv.h
> > +++ b/include/linux/dma-resv.h
> > @@ -195,6 +195,9 @@ struct dma_resv_iter {
> >  	/** @fence: the currently handled fence */
> >  	struct dma_fence *fence;
> >  
> > +	/** @fence_usage: the usage of the current fence */
> > +	enum dma_resv_usage fence_usage;
> > +
> >  	/** @seq: sequence number to check for modifications */
> >  	unsigned int seq;
> >  
> > @@ -244,14 +247,15 @@ static inline void dma_resv_iter_end(struct dma_resv_iter *cursor)
> >  }
> >  
> >  /**
> > - * dma_resv_iter_is_exclusive - test if the current fence is the exclusive one
> > + * dma_resv_iter_usage - Return the usage of the current fence
> >   * @cursor: the cursor of the current position
> >   *
> > - * Returns true if the currently returned fence is the exclusive one.
> > + * Returns the usage of the currently processed fence.
> >   */
> > -static inline bool dma_resv_iter_is_exclusive(struct dma_resv_iter *cursor)
> > +static inline enum dma_resv_usage
> > +dma_resv_iter_usage(struct dma_resv_iter *cursor)
> >  {
> > -	return cursor->index == 0;
> > +	return cursor->fence_usage;
> >  }
> >  
> >  /**
> > @@ -306,9 +310,9 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
> >  #define dma_resv_assert_held(obj) lockdep_assert_held(&(obj)->lock.base)
> >  
> >  #ifdef CONFIG_DEBUG_MUTEXES
> > -void dma_resv_reset_shared_max(struct dma_resv *obj);
> > +void dma_resv_reset_max_fences(struct dma_resv *obj);
> >  #else
> > -static inline void dma_resv_reset_shared_max(struct dma_resv *obj) {}
> > +static inline void dma_resv_reset_max_fences(struct dma_resv *obj) {}
> >  #endif
> >  
> >  /**
> > @@ -454,17 +458,18 @@ static inline struct ww_acquire_ctx *dma_resv_locking_ctx(struct dma_resv *obj)
> >   */
> >  static inline void dma_resv_unlock(struct dma_resv *obj)
> >  {
> > -	dma_resv_reset_shared_max(obj);
> > +	dma_resv_reset_max_fences(obj);
> >  	ww_mutex_unlock(&obj->lock);
> >  }
> >  
> >  void dma_resv_init(struct dma_resv *obj);
> >  void dma_resv_fini(struct dma_resv *obj);
> >  int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences);
> > -void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
> > +void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
> > +			enum dma_resv_usage usage);
> >  void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
> > -			     struct dma_fence *fence);
> > -void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
> > +			     struct dma_fence *fence,
> > +			     enum dma_resv_usage usage);
> >  int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
> >  			unsigned int *num_fences, struct dma_fence ***fences);
> >  int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
> 
> Really only bikesheds left. I think ... I guess better to merge this
> sooner and test more than later.
> 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> 
> > -- 
> > 2.25.1
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 04/16] dma-buf & drm/amdgpu: remove dma_resv workaround
  2022-04-06  7:51 ` [PATCH 04/16] dma-buf & drm/amdgpu: remove dma_resv workaround Christian König
@ 2022-04-06 12:39   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:39 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, amd-gfx, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:20AM +0200, Christian König wrote:
> Rework the internals of the dma_resv object to allow adding more than one
> write fence and remember for each fence what purpose it had.
> 
> This allows removing the workaround from amdgpu which used a container for
> this instead.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Cc: amd-gfx@lists.freedesktop.org

It is honestly all getting rather blurry, I think when it's all landed I
need to audit the entire tree and see what we missed. Anyway:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/dma-buf/dma-resv.c                  | 353 ++++++++------------
>  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h |   1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |  53 +--
>  include/linux/dma-resv.h                    |  47 +--
>  4 files changed, 157 insertions(+), 297 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 543dae6566d2..378d47e1cfea 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -44,12 +44,12 @@
>  /**
>   * DOC: Reservation Object Overview
>   *
> - * The reservation object provides a mechanism to manage shared and
> - * exclusive fences associated with a buffer.  A reservation object
> - * can have attached one exclusive fence (normally associated with
> - * write operations) or N shared fences (read operations).  The RCU
> - * mechanism is used to protect read access to fences from locked
> - * write-side updates.
> + * The reservation object provides a mechanism to manage a container of
> + * dma_fence object associated with a resource. A reservation object
> + * can have any number of fences attaches to it. Each fence carries an usage
> + * parameter determining how the operation represented by the fence is using the
> + * resource. The RCU mechanism is used to protect read access to fences from
> + * locked write-side updates.
>   *
>   * See struct dma_resv for more details.
>   */
> @@ -57,39 +57,59 @@
>  DEFINE_WD_CLASS(reservation_ww_class);
>  EXPORT_SYMBOL(reservation_ww_class);
>  
> +/* Mask for the lower fence pointer bits */
> +#define DMA_RESV_LIST_MASK	0x3
> +
>  struct dma_resv_list {
>  	struct rcu_head rcu;
> -	u32 shared_count, shared_max;
> -	struct dma_fence __rcu *shared[];
> +	u32 num_fences, max_fences;
> +	struct dma_fence __rcu *table[];
>  };
>  
> -/**
> - * dma_resv_list_alloc - allocate fence list
> - * @shared_max: number of fences we need space for
> - *
> +/* Extract the fence and usage flags from an RCU protected entry in the list. */
> +static void dma_resv_list_entry(struct dma_resv_list *list, unsigned int index,
> +				struct dma_resv *resv, struct dma_fence **fence,
> +				enum dma_resv_usage *usage)
> +{
> +	long tmp;
> +
> +	tmp = (long)rcu_dereference_check(list->table[index],
> +					  resv ? dma_resv_held(resv) : true);
> +	*fence = (struct dma_fence *)(tmp & ~DMA_RESV_LIST_MASK);
> +	if (usage)
> +		*usage = tmp & DMA_RESV_LIST_MASK;
> +}
> +
> +/* Set the fence and usage flags at the specific index in the list. */
> +static void dma_resv_list_set(struct dma_resv_list *list,
> +			      unsigned int index,
> +			      struct dma_fence *fence,
> +			      enum dma_resv_usage usage)
> +{
> +	long tmp = ((long)fence) | usage;
> +
> +	RCU_INIT_POINTER(list->table[index], (struct dma_fence *)tmp);
> +}
> +
> +/*
>   * Allocate a new dma_resv_list and make sure to correctly initialize
> - * shared_max.
> + * max_fences.
>   */
> -static struct dma_resv_list *dma_resv_list_alloc(unsigned int shared_max)
> +static struct dma_resv_list *dma_resv_list_alloc(unsigned int max_fences)
>  {
>  	struct dma_resv_list *list;
>  
> -	list = kmalloc(struct_size(list, shared, shared_max), GFP_KERNEL);
> +	list = kmalloc(struct_size(list, table, max_fences), GFP_KERNEL);
>  	if (!list)
>  		return NULL;
>  
> -	list->shared_max = (ksize(list) - offsetof(typeof(*list), shared)) /
> -		sizeof(*list->shared);
> +	list->max_fences = (ksize(list) - offsetof(typeof(*list), table)) /
> +		sizeof(*list->table);
>  
>  	return list;
>  }
>  
> -/**
> - * dma_resv_list_free - free fence list
> - * @list: list to free
> - *
> - * Free a dma_resv_list and make sure to drop all references.
> - */
> +/* Free a dma_resv_list and make sure to drop all references. */
>  static void dma_resv_list_free(struct dma_resv_list *list)
>  {
>  	unsigned int i;
> @@ -97,9 +117,12 @@ static void dma_resv_list_free(struct dma_resv_list *list)
>  	if (!list)
>  		return;
>  
> -	for (i = 0; i < list->shared_count; ++i)
> -		dma_fence_put(rcu_dereference_protected(list->shared[i], true));
> +	for (i = 0; i < list->num_fences; ++i) {
> +		struct dma_fence *fence;
>  
> +		dma_resv_list_entry(list, i, NULL, &fence, NULL);
> +		dma_fence_put(fence);
> +	}
>  	kfree_rcu(list, rcu);
>  }
>  
> @@ -112,8 +135,7 @@ void dma_resv_init(struct dma_resv *obj)
>  	ww_mutex_init(&obj->lock, &reservation_ww_class);
>  	seqcount_ww_mutex_init(&obj->seq, &obj->lock);
>  
> -	RCU_INIT_POINTER(obj->fence, NULL);
> -	RCU_INIT_POINTER(obj->fence_excl, NULL);
> +	RCU_INIT_POINTER(obj->fences, NULL);
>  }
>  EXPORT_SYMBOL(dma_resv_init);
>  
> @@ -123,46 +145,32 @@ EXPORT_SYMBOL(dma_resv_init);
>   */
>  void dma_resv_fini(struct dma_resv *obj)
>  {
> -	struct dma_resv_list *fobj;
> -	struct dma_fence *excl;
> -
>  	/*
>  	 * This object should be dead and all references must have
>  	 * been released to it, so no need to be protected with rcu.
>  	 */
> -	excl = rcu_dereference_protected(obj->fence_excl, 1);
> -	if (excl)
> -		dma_fence_put(excl);
> -
> -	fobj = rcu_dereference_protected(obj->fence, 1);
> -	dma_resv_list_free(fobj);
> +	dma_resv_list_free(rcu_dereference_protected(obj->fences, true));
>  	ww_mutex_destroy(&obj->lock);
>  }
>  EXPORT_SYMBOL(dma_resv_fini);
>  
> -static inline struct dma_fence *
> -dma_resv_excl_fence(struct dma_resv *obj)
> -{
> -       return rcu_dereference_check(obj->fence_excl, dma_resv_held(obj));
> -}
> -
> -static inline struct dma_resv_list *dma_resv_shared_list(struct dma_resv *obj)
> +/* Dereference the fences while ensuring RCU rules */
> +static inline struct dma_resv_list *dma_resv_fences_list(struct dma_resv *obj)
>  {
> -	return rcu_dereference_check(obj->fence, dma_resv_held(obj));
> +	return rcu_dereference_check(obj->fences, dma_resv_held(obj));
>  }
>  
>  /**
> - * dma_resv_reserve_fences - Reserve space to add shared fences to
> - * a dma_resv.
> + * dma_resv_reserve_fences - Reserve space to add fences to a dma_resv object.
>   * @obj: reservation object
>   * @num_fences: number of fences we want to add
>   *
> - * Should be called before dma_resv_add_shared_fence().  Must
> - * be called with @obj locked through dma_resv_lock().
> + * Should be called before dma_resv_add_fence().  Must be called with @obj
> + * locked through dma_resv_lock().
>   *
>   * Note that the preallocated slots need to be re-reserved if @obj is unlocked
> - * at any time before calling dma_resv_add_shared_fence(). This is validated
> - * when CONFIG_DEBUG_MUTEXES is enabled.
> + * at any time before calling dma_resv_add_fence(). This is validated when
> + * CONFIG_DEBUG_MUTEXES is enabled.
>   *
>   * RETURNS
>   * Zero for success, or -errno
> @@ -174,11 +182,11 @@ int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences)
>  
>  	dma_resv_assert_held(obj);
>  
> -	old = dma_resv_shared_list(obj);
> -	if (old && old->shared_max) {
> -		if ((old->shared_count + num_fences) <= old->shared_max)
> +	old = dma_resv_fences_list(obj);
> +	if (old && old->max_fences) {
> +		if ((old->num_fences + num_fences) <= old->max_fences)
>  			return 0;
> -		max = max(old->shared_count + num_fences, old->shared_max * 2);
> +		max = max(old->num_fences + num_fences, old->max_fences * 2);
>  	} else {
>  		max = max(4ul, roundup_pow_of_two(num_fences));
>  	}
> @@ -193,27 +201,27 @@ int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences)
>  	 * references from the old struct are carried over to
>  	 * the new.
>  	 */
> -	for (i = 0, j = 0, k = max; i < (old ? old->shared_count : 0); ++i) {
> +	for (i = 0, j = 0, k = max; i < (old ? old->num_fences : 0); ++i) {
> +		enum dma_resv_usage usage;
>  		struct dma_fence *fence;
>  
> -		fence = rcu_dereference_protected(old->shared[i],
> -						  dma_resv_held(obj));
> +		dma_resv_list_entry(old, i, obj, &fence, &usage);
>  		if (dma_fence_is_signaled(fence))
> -			RCU_INIT_POINTER(new->shared[--k], fence);
> +			RCU_INIT_POINTER(new->table[--k], fence);
>  		else
> -			RCU_INIT_POINTER(new->shared[j++], fence);
> +			dma_resv_list_set(new, j++, fence, usage);
>  	}
> -	new->shared_count = j;
> +	new->num_fences = j;
>  
>  	/*
>  	 * We are not changing the effective set of fences here so can
>  	 * merely update the pointer to the new array; both existing
>  	 * readers and new readers will see exactly the same set of
> -	 * active (unsignaled) shared fences. Individual fences and the
> +	 * active (unsignaled) fences. Individual fences and the
>  	 * old array are protected by RCU and so will not vanish under
>  	 * the gaze of the rcu_read_lock() readers.
>  	 */
> -	rcu_assign_pointer(obj->fence, new);
> +	rcu_assign_pointer(obj->fences, new);
>  
>  	if (!old)
>  		return 0;
> @@ -222,7 +230,7 @@ int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences)
>  	for (i = k; i < max; ++i) {
>  		struct dma_fence *fence;
>  
> -		fence = rcu_dereference_protected(new->shared[i],
> +		fence = rcu_dereference_protected(new->table[i],
>  						  dma_resv_held(obj));
>  		dma_fence_put(fence);
>  	}
> @@ -234,38 +242,39 @@ EXPORT_SYMBOL(dma_resv_reserve_fences);
>  
>  #ifdef CONFIG_DEBUG_MUTEXES
>  /**
> - * dma_resv_reset_max_fences - reset shared fences for debugging
> + * dma_resv_reset_max_fences - reset fences for debugging
>   * @obj: the dma_resv object to reset
>   *
> - * Reset the number of pre-reserved shared slots to test that drivers do
> + * Reset the number of pre-reserved fence slots to test that drivers do
>   * correct slot allocation using dma_resv_reserve_fences(). See also
> - * &dma_resv_list.shared_max.
> + * &dma_resv_list.max_fences.
>   */
>  void dma_resv_reset_max_fences(struct dma_resv *obj)
>  {
> -	struct dma_resv_list *fences = dma_resv_shared_list(obj);
> +	struct dma_resv_list *fences = dma_resv_fences_list(obj);
>  
>  	dma_resv_assert_held(obj);
>  
> -	/* Test shared fence slot reservation */
> +	/* Test fence slot reservation */
>  	if (fences)
> -		fences->shared_max = fences->shared_count;
> +		fences->max_fences = fences->num_fences;
>  }
>  EXPORT_SYMBOL(dma_resv_reset_max_fences);
>  #endif
>  
>  /**
> - * dma_resv_add_shared_fence - Add a fence to a shared slot
> + * dma_resv_add_fence - Add a fence to the dma_resv obj
>   * @obj: the reservation object
> - * @fence: the shared fence to add
> + * @fence: the fence to add
> + * @usage: how the fence is used, see enum dma_resv_usage
>   *
> - * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
> + * Add a fence to a slot, @obj must be locked with dma_resv_lock(), and
>   * dma_resv_reserve_fences() has been called.
>   *
>   * See also &dma_resv.fence for a discussion of the semantics.
>   */
> -static void dma_resv_add_shared_fence(struct dma_resv *obj,
> -				      struct dma_fence *fence)
> +void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
> +			enum dma_resv_usage usage)
>  {
>  	struct dma_resv_list *fobj;
>  	struct dma_fence *old;
> @@ -280,32 +289,33 @@ static void dma_resv_add_shared_fence(struct dma_resv *obj,
>  	 */
>  	WARN_ON(dma_fence_is_container(fence));
>  
> -	fobj = dma_resv_shared_list(obj);
> -	count = fobj->shared_count;
> +	fobj = dma_resv_fences_list(obj);
> +	count = fobj->num_fences;
>  
>  	write_seqcount_begin(&obj->seq);
>  
>  	for (i = 0; i < count; ++i) {
> +		enum dma_resv_usage old_usage;
>  
> -		old = rcu_dereference_protected(fobj->shared[i],
> -						dma_resv_held(obj));
> -		if (old->context == fence->context ||
> +		dma_resv_list_entry(fobj, i, obj, &old, &old_usage);
> +		if ((old->context == fence->context && old_usage >= usage) ||
>  		    dma_fence_is_signaled(old))
>  			goto replace;
>  	}
>  
> -	BUG_ON(fobj->shared_count >= fobj->shared_max);
> +	BUG_ON(fobj->num_fences >= fobj->max_fences);
>  	old = NULL;
>  	count++;
>  
>  replace:
> -	RCU_INIT_POINTER(fobj->shared[i], fence);
> -	/* pointer update must be visible before we extend the shared_count */
> -	smp_store_mb(fobj->shared_count, count);
> +	dma_resv_list_set(fobj, i, fence, usage);
> +	/* pointer update must be visible before we extend the num_fences */
> +	smp_store_mb(fobj->num_fences, count);
>  
>  	write_seqcount_end(&obj->seq);
>  	dma_fence_put(old);
>  }
> +EXPORT_SYMBOL(dma_resv_add_fence);
>  
>  /**
>   * dma_resv_replace_fences - replace fences in the dma_resv obj
> @@ -326,128 +336,63 @@ void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
>  			     enum dma_resv_usage usage)
>  {
>  	struct dma_resv_list *list;
> -	struct dma_fence *old;
>  	unsigned int i;
>  
> -	/* Only readers supported for now */
> -	WARN_ON(usage != DMA_RESV_USAGE_READ);
> -
>  	dma_resv_assert_held(obj);
>  
> +	list = dma_resv_fences_list(obj);
>  	write_seqcount_begin(&obj->seq);
> +	for (i = 0; list && i < list->num_fences; ++i) {
> +		struct dma_fence *old;
>  
> -	old = dma_resv_excl_fence(obj);
> -	if (old->context == context) {
> -		RCU_INIT_POINTER(obj->fence_excl, dma_fence_get(replacement));
> -		dma_fence_put(old);
> -	}
> -
> -	list = dma_resv_shared_list(obj);
> -	for (i = 0; list && i < list->shared_count; ++i) {
> -		old = rcu_dereference_protected(list->shared[i],
> -						dma_resv_held(obj));
> +		dma_resv_list_entry(list, i, obj, &old, NULL);
>  		if (old->context != context)
>  			continue;
>  
> -		rcu_assign_pointer(list->shared[i], dma_fence_get(replacement));
> +		dma_resv_list_set(list, i, replacement, usage);
>  		dma_fence_put(old);
>  	}
> -
>  	write_seqcount_end(&obj->seq);
>  }
>  EXPORT_SYMBOL(dma_resv_replace_fences);
>  
> -/**
> - * dma_resv_add_excl_fence - Add an exclusive fence.
> - * @obj: the reservation object
> - * @fence: the exclusive fence to add
> - *
> - * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
> - * See also &dma_resv.fence_excl for a discussion of the semantics.
> - */
> -static void dma_resv_add_excl_fence(struct dma_resv *obj,
> -				    struct dma_fence *fence)
> -{
> -	struct dma_fence *old_fence = dma_resv_excl_fence(obj);
> -
> -	dma_resv_assert_held(obj);
> -
> -	dma_fence_get(fence);
> -
> -	write_seqcount_begin(&obj->seq);
> -	/* write_seqcount_begin provides the necessary memory barrier */
> -	RCU_INIT_POINTER(obj->fence_excl, fence);
> -	write_seqcount_end(&obj->seq);
> -
> -	dma_fence_put(old_fence);
> -}
> -
> -/**
> - * dma_resv_add_fence - Add a fence to the dma_resv obj
> - * @obj: the reservation object
> - * @fence: the fence to add
> - * @usage: how the fence is used, see enum dma_resv_usage
> - *
> - * Add a fence to a slot, @obj must be locked with dma_resv_lock(), and
> - * dma_resv_reserve_fences() has been called.
> - *
> - * See also &dma_resv.fence for a discussion of the semantics.
> - */
> -void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
> -			enum dma_resv_usage usage)
> -{
> -	if (usage == DMA_RESV_USAGE_WRITE)
> -		dma_resv_add_excl_fence(obj, fence);
> -	else
> -		dma_resv_add_shared_fence(obj, fence);
> -}
> -EXPORT_SYMBOL(dma_resv_add_fence);
> -
> -/* Restart the iterator by initializing all the necessary fields, but not the
> - * relation to the dma_resv object. */
> +/* Restart the unlocked iteration by initializing the cursor object. */
>  static void dma_resv_iter_restart_unlocked(struct dma_resv_iter *cursor)
>  {
>  	cursor->seq = read_seqcount_begin(&cursor->obj->seq);
> -	cursor->index = -1;
> -	cursor->shared_count = 0;
> -	if (cursor->usage >= DMA_RESV_USAGE_READ) {
> -		cursor->fences = dma_resv_shared_list(cursor->obj);
> -		if (cursor->fences)
> -			cursor->shared_count = cursor->fences->shared_count;
> -	} else {
> -		cursor->fences = NULL;
> -	}
> +	cursor->index = 0;
> +	cursor->num_fences = 0;
> +	cursor->fences = dma_resv_fences_list(cursor->obj);
> +	if (cursor->fences)
> +		cursor->num_fences = cursor->fences->num_fences;
>  	cursor->is_restarted = true;
>  }
>  
>  /* Walk to the next not signaled fence and grab a reference to it */
>  static void dma_resv_iter_walk_unlocked(struct dma_resv_iter *cursor)
>  {
> -	struct dma_resv *obj = cursor->obj;
> +	if (!cursor->fences)
> +		return;
>  
>  	do {
>  		/* Drop the reference from the previous round */
>  		dma_fence_put(cursor->fence);
>  
> -		if (cursor->index == -1) {
> -			cursor->fence = dma_resv_excl_fence(obj);
> -			cursor->index++;
> -			if (!cursor->fence)
> -				continue;
> -
> -		} else if (!cursor->fences ||
> -			   cursor->index >= cursor->shared_count) {
> +		if (cursor->index >= cursor->num_fences) {
>  			cursor->fence = NULL;
>  			break;
>  
> -		} else {
> -			struct dma_resv_list *fences = cursor->fences;
> -			unsigned int idx = cursor->index++;
> -
> -			cursor->fence = rcu_dereference(fences->shared[idx]);
>  		}
> +
> +		dma_resv_list_entry(cursor->fences, cursor->index++,
> +				    cursor->obj, &cursor->fence,
> +				    &cursor->fence_usage);
>  		cursor->fence = dma_fence_get_rcu(cursor->fence);
> -		if (!cursor->fence || !dma_fence_is_signaled(cursor->fence))
> +		if (!cursor->fence)
> +			break;
> +
> +		if (!dma_fence_is_signaled(cursor->fence) &&
> +		    cursor->usage >= cursor->fence_usage)
>  			break;
>  	} while (true);
>  }
> @@ -522,15 +467,9 @@ struct dma_fence *dma_resv_iter_first(struct dma_resv_iter *cursor)
>  	dma_resv_assert_held(cursor->obj);
>  
>  	cursor->index = 0;
> -	if (cursor->usage >= DMA_RESV_USAGE_READ)
> -		cursor->fences = dma_resv_shared_list(cursor->obj);
> -	else
> -		cursor->fences = NULL;
> -
> -	fence = dma_resv_excl_fence(cursor->obj);
> -	if (!fence)
> -		fence = dma_resv_iter_next(cursor);
> +	cursor->fences = dma_resv_fences_list(cursor->obj);
>  
> +	fence = dma_resv_iter_next(cursor);
>  	cursor->is_restarted = true;
>  	return fence;
>  }
> @@ -545,17 +484,22 @@ EXPORT_SYMBOL_GPL(dma_resv_iter_first);
>   */
>  struct dma_fence *dma_resv_iter_next(struct dma_resv_iter *cursor)
>  {
> -	unsigned int idx;
> +	struct dma_fence *fence;
>  
>  	dma_resv_assert_held(cursor->obj);
>  
>  	cursor->is_restarted = false;
> -	if (!cursor->fences || cursor->index >= cursor->fences->shared_count)
> -		return NULL;
>  
> -	idx = cursor->index++;
> -	return rcu_dereference_protected(cursor->fences->shared[idx],
> -					 dma_resv_held(cursor->obj));
> +	do {
> +		if (!cursor->fences ||
> +		    cursor->index >= cursor->fences->num_fences)
> +			return NULL;
> +
> +		dma_resv_list_entry(cursor->fences, cursor->index++,
> +				    cursor->obj, &fence, &cursor->fence_usage);
> +	} while (cursor->fence_usage > cursor->usage);
> +
> +	return fence;
>  }
>  EXPORT_SYMBOL_GPL(dma_resv_iter_next);
>  
> @@ -570,57 +514,43 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
>  {
>  	struct dma_resv_iter cursor;
>  	struct dma_resv_list *list;
> -	struct dma_fence *f, *excl;
> +	struct dma_fence *f;
>  
>  	dma_resv_assert_held(dst);
>  
>  	list = NULL;
> -	excl = NULL;
>  
>  	dma_resv_iter_begin(&cursor, src, DMA_RESV_USAGE_READ);
>  	dma_resv_for_each_fence_unlocked(&cursor, f) {
>  
>  		if (dma_resv_iter_is_restarted(&cursor)) {
>  			dma_resv_list_free(list);
> -			dma_fence_put(excl);
> -
> -			if (cursor.shared_count) {
> -				list = dma_resv_list_alloc(cursor.shared_count);
> -				if (!list) {
> -					dma_resv_iter_end(&cursor);
> -					return -ENOMEM;
> -				}
>  
> -				list->shared_count = 0;
> -
> -			} else {
> -				list = NULL;
> +			list = dma_resv_list_alloc(cursor.num_fences);
> +			if (!list) {
> +				dma_resv_iter_end(&cursor);
> +				return -ENOMEM;
>  			}
> -			excl = NULL;
> +			list->num_fences = 0;
>  		}
>  
>  		dma_fence_get(f);
> -		if (dma_resv_iter_usage(&cursor) == DMA_RESV_USAGE_WRITE)
> -			excl = f;
> -		else
> -			RCU_INIT_POINTER(list->shared[list->shared_count++], f);
> +		dma_resv_list_set(list, list->num_fences++, f,
> +				  dma_resv_iter_usage(&cursor));
>  	}
>  	dma_resv_iter_end(&cursor);
>  
>  	write_seqcount_begin(&dst->seq);
> -	excl = rcu_replace_pointer(dst->fence_excl, excl, dma_resv_held(dst));
> -	list = rcu_replace_pointer(dst->fence, list, dma_resv_held(dst));
> +	list = rcu_replace_pointer(dst->fences, list, dma_resv_held(dst));
>  	write_seqcount_end(&dst->seq);
>  
>  	dma_resv_list_free(list);
> -	dma_fence_put(excl);
> -
>  	return 0;
>  }
>  EXPORT_SYMBOL(dma_resv_copy_fences);
>  
>  /**
> - * dma_resv_get_fences - Get an object's shared and exclusive
> + * dma_resv_get_fences - Get an object's fences
>   * fences without update side lock held
>   * @obj: the reservation object
>   * @usage: controls which fences to include, see enum dma_resv_usage.
> @@ -649,7 +579,7 @@ int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
>  			while (*num_fences)
>  				dma_fence_put((*fences)[--(*num_fences)]);
>  
> -			count = cursor.shared_count + 1;
> +			count = cursor.num_fences + 1;
>  
>  			/* Eventually re-allocate the array */
>  			*fences = krealloc_array(*fences, count,
> @@ -723,8 +653,7 @@ int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
>  EXPORT_SYMBOL_GPL(dma_resv_get_singleton);
>  
>  /**
> - * dma_resv_wait_timeout - Wait on reservation's objects
> - * shared and/or exclusive fences.
> + * dma_resv_wait_timeout - Wait on reservation's objects fences
>   * @obj: the reservation object
>   * @usage: controls which fences to include, see enum dma_resv_usage.
>   * @intr: if true, do interruptible wait
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
> index 044b41f0bfd9..529d52a204cf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.h
> @@ -34,7 +34,6 @@ struct amdgpu_fpriv;
>  struct amdgpu_bo_list_entry {
>  	struct ttm_validate_buffer	tv;
>  	struct amdgpu_bo_va		*bo_va;
> -	struct dma_fence_chain		*chain;
>  	uint32_t			priority;
>  	struct page			**user_pages;
>  	bool				user_invalidated;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 76fd916424d6..8de283997769 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -574,14 +574,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>  		struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
>  
>  		e->bo_va = amdgpu_vm_bo_find(vm, bo);
> -
> -		if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> -			e->chain = dma_fence_chain_alloc();
> -			if (!e->chain) {
> -				r = -ENOMEM;
> -				goto error_validate;
> -			}
> -		}
>  	}
>  
>  	/* Move fence waiting after getting reservation lock of
> @@ -642,13 +634,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>  	}
>  
>  error_validate:
> -	if (r) {
> -		amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> -			dma_fence_chain_free(e->chain);
> -			e->chain = NULL;
> -		}
> +	if (r)
>  		ttm_eu_backoff_reservation(&p->ticket, &p->validated);
> -	}
>  out:
>  	return r;
>  }
> @@ -688,17 +675,9 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error,
>  {
>  	unsigned i;
>  
> -	if (error && backoff) {
> -		struct amdgpu_bo_list_entry *e;
> -
> -		amdgpu_bo_list_for_each_entry(e, parser->bo_list) {
> -			dma_fence_chain_free(e->chain);
> -			e->chain = NULL;
> -		}
> -
> +	if (error && backoff)
>  		ttm_eu_backoff_reservation(&parser->ticket,
>  					   &parser->validated);
> -	}
>  
>  	for (i = 0; i < parser->num_post_deps; i++) {
>  		drm_syncobj_put(parser->post_deps[i].syncobj);
> @@ -1272,31 +1251,9 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>  
>  	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
>  
> -	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> -		struct dma_resv *resv = e->tv.bo->base.resv;
> -		struct dma_fence_chain *chain = e->chain;
> -		struct dma_resv_iter cursor;
> -		struct dma_fence *fence;
> -
> -		if (!chain)
> -			continue;
> -
> -		/*
> -		 * Temporary workaround dma_resv shortcommings by wrapping up
> -		 * the submission in a dma_fence_chain and add it as exclusive
> -		 * fence.
> -		 *
> -		 * TODO: Remove together with dma_resv rework.
> -		 */
> -		dma_resv_for_each_fence(&cursor, resv,
> -					DMA_RESV_USAGE_WRITE,
> -					fence) {
> -			break;
> -		}
> -		dma_fence_chain_init(chain, fence, dma_fence_get(p->fence), 1);
> -		rcu_assign_pointer(resv->fence_excl, &chain->base);
> -		e->chain = NULL;
> -	}
> +	/* Make sure all BOs are remembered as writers */
> +	amdgpu_bo_list_for_each_entry(e, p->bo_list)
> +		e->tv.num_shared = 0;
>  
>  	ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>  	mutex_unlock(&p->adev->notifier_lock);
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 98dc5234b487..7bb7e7edbb6f 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -99,8 +99,8 @@ static inline enum dma_resv_usage dma_resv_usage_rw(bool write)
>  /**
>   * struct dma_resv - a reservation object manages fences for a buffer
>   *
> - * There are multiple uses for this, with sometimes slightly different rules in
> - * how the fence slots are used.
> + * This is a container for dma_fence objects which needs to handle multiple use
> + * cases.
>   *
>   * One use is to synchronize cross-driver access to a struct dma_buf, either for
>   * dynamic buffer management or just to handle implicit synchronization between
> @@ -130,47 +130,22 @@ struct dma_resv {
>  	 * @seq:
>  	 *
>  	 * Sequence count for managing RCU read-side synchronization, allows
> -	 * read-only access to @fence_excl and @fence while ensuring we take a
> -	 * consistent snapshot.
> +	 * read-only access to @fences while ensuring we take a consistent
> +	 * snapshot.
>  	 */
>  	seqcount_ww_mutex_t seq;
>  
>  	/**
> -	 * @fence_excl:
> +	 * @fences:
>  	 *
> -	 * The exclusive fence, if there is one currently.
> +	 * Array of fences which where added to the dma_resv object
>  	 *
> -	 * To guarantee that no fences are lost, this new fence must signal
> -	 * only after the previous exclusive fence has signalled. If
> -	 * semantically only a new access is added without actually treating the
> -	 * previous one as a dependency the exclusive fences can be strung
> -	 * together using struct dma_fence_chain.
> -	 *
> -	 * Note that actual semantics of what an exclusive or shared fence mean
> -	 * is defined by the user, for reservation objects shared across drivers
> -	 * see &dma_buf.resv.
> -	 */
> -	struct dma_fence __rcu *fence_excl;
> -
> -	/**
> -	 * @fence:
> -	 *
> -	 * List of current shared fences.
> -	 *
> -	 * There are no ordering constraints of shared fences against the
> -	 * exclusive fence slot. If a waiter needs to wait for all access, it
> -	 * has to wait for both sets of fences to signal.
> -	 *
> -	 * A new fence is added by calling dma_resv_add_shared_fence(). Since
> -	 * this often needs to be done past the point of no return in command
> +	 * A new fence is added by calling dma_resv_add_fence(). Since this
> +	 * often needs to be done past the point of no return in command
>  	 * submission it cannot fail, and therefore sufficient slots need to be
>  	 * reserved by calling dma_resv_reserve_fences().
> -	 *
> -	 * Note that actual semantics of what an exclusive or shared fence mean
> -	 * is defined by the user, for reservation objects shared across drivers
> -	 * see &dma_buf.resv.
>  	 */
> -	struct dma_resv_list __rcu *fence;
> +	struct dma_resv_list __rcu *fences;
>  };
>  
>  /**
> @@ -207,8 +182,8 @@ struct dma_resv_iter {
>  	/** @fences: the shared fences; private, *MUST* not dereference  */
>  	struct dma_resv_list *fences;
>  
> -	/** @shared_count: number of shared fences */
> -	unsigned int shared_count;
> +	/** @num_fences: number of fences */
> +	unsigned int num_fences;
>  
>  	/** @is_restarted: true if this is the first returned fence */
>  	bool is_restarted;
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 05/16] dma-buf: add DMA_RESV_USAGE_KERNEL v3
  2022-04-06  7:51 ` [PATCH 05/16] dma-buf: add DMA_RESV_USAGE_KERNEL v3 Christian König
@ 2022-04-06 12:41   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:41 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:21AM +0200, Christian König wrote:
> Add an usage for kernel submissions. Waiting for those are mandatory for
> dynamic DMA-bufs.
> 
> As a precaution this patch also changes all occurrences where fences are
> added as part of memory management in TTM, VMWGFX and i915 to use the
> new value because it now becomes possible for drivers to ignore fences
> with the WRITE usage.
> 
> v2: use "must" in documentation, fix whitespaces
> v3: separate out some driver changes and better document why some
>     changes should still be part of this patch.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/dma-resv.c                  |  2 +-
>  drivers/dma-buf/st-dma-resv.c               |  2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_clflush.c |  2 +-
>  drivers/gpu/drm/ttm/ttm_bo.c                |  2 +-
>  drivers/gpu/drm/ttm/ttm_bo_util.c           |  4 ++--
>  drivers/gpu/drm/vmwgfx/vmwgfx_bo.c          |  2 +-
>  include/linux/dma-resv.h                    | 24 ++++++++++++++++++---
>  7 files changed, 28 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 378d47e1cfea..f4860e5f2d8b 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -726,7 +726,7 @@ EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
>   */
>  void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq)
>  {
> -	static const char *usage[] = { "write", "read" };
> +	static const char *usage[] = { "kernel", "write", "read" };
>  	struct dma_resv_iter cursor;
>  	struct dma_fence *fence;
>  
> diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
> index d0f7c2bfd4f0..062b57d63fa6 100644
> --- a/drivers/dma-buf/st-dma-resv.c
> +++ b/drivers/dma-buf/st-dma-resv.c
> @@ -296,7 +296,7 @@ int dma_resv(void)
>  	int r;
>  
>  	spin_lock_init(&fence_lock);
> -	for (usage = DMA_RESV_USAGE_WRITE; usage <= DMA_RESV_USAGE_READ;
> +	for (usage = DMA_RESV_USAGE_KERNEL; usage <= DMA_RESV_USAGE_READ;
>  	     ++usage) {
>  		r = subtests(tests, (void *)(unsigned long)usage);
>  		if (r)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> index f5f2b8b115ea..0512afdd20d8 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> @@ -117,7 +117,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
>  						i915_fence_timeout(i915),
>  						I915_FENCE_GFP);
>  		dma_resv_add_fence(obj->base.resv, &clflush->base.dma,
> -				   DMA_RESV_USAGE_WRITE);
> +				   DMA_RESV_USAGE_KERNEL);
>  		dma_fence_work_commit(&clflush->base);
>  		/*
>  		 * We must have successfully populated the pages(since we are
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index d74f9eea855e..6bf3fb1c8045 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -739,7 +739,7 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>  		return ret;
>  	}
>  
> -	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
> +	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_KERNEL);
>  
>  	ret = dma_resv_reserve_fences(bo->base.resv, 1);
>  	if (unlikely(ret)) {
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 7a96a1db13a7..99deb45894f4 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -508,7 +508,7 @@ static int ttm_bo_move_to_ghost(struct ttm_buffer_object *bo,
>  		return ret;
>  
>  	dma_resv_add_fence(&ghost_obj->base._resv, fence,
> -			   DMA_RESV_USAGE_WRITE);
> +			   DMA_RESV_USAGE_KERNEL);
>  
>  	/**
>  	 * If we're not moving to fixed memory, the TTM object
> @@ -562,7 +562,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
>  	struct ttm_resource_manager *man = ttm_manager_type(bdev, new_mem->mem_type);
>  	int ret = 0;
>  
> -	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);
> +	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_KERNEL);
>  	if (!evict)
>  		ret = ttm_bo_move_to_ghost(bo, fence, man->use_tt);
>  	else if (!from->use_tt && pipeline)
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> index bec50223efe5..408ede1f967f 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> @@ -759,7 +759,7 @@ void vmw_bo_fence_single(struct ttm_buffer_object *bo,
>  	ret = dma_resv_reserve_fences(bo->base.resv, 1);
>  	if (!ret)
>  		dma_resv_add_fence(bo->base.resv, &fence->base,
> -				   DMA_RESV_USAGE_WRITE);
> +				   DMA_RESV_USAGE_KERNEL);
>  	else
>  		/* Last resort fallback when we are OOM */
>  		dma_fence_wait(&fence->base, false);
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 7bb7e7edbb6f..a749f229ae91 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -55,11 +55,29 @@ struct dma_resv_list;
>   * This enum describes the different use cases for a dma_resv object and
>   * controls which fences are returned when queried.
>   *
> - * An important fact is that there is the order WRITE<READ and when the
> - * dma_resv object is asked for fences for one use case the fences for the
> - * lower use case are returned as well.
> + * An important fact is that there is the order KERNEL<WRITE<READ and
> + * when the dma_resv object is asked for fences for one use case the fences
> + * for the lower use case are returned as well.
> + *
> + * For example when asking for WRITE fences then the KERNEL fences are returned
> + * as well. Similar when asked for READ fences then both WRITE and KERNEL
> + * fences are returned as well.
>   */
>  enum dma_resv_usage {
> +	/**
> +	 * @DMA_RESV_USAGE_KERNEL: For in kernel memory management only.
> +	 *
> +	 * This should only be used for things like copying or clearing memory
> +	 * with a DMA hardware engine for the purpose of kernel memory
> +	 * management.
> +	 *
> +	 * Drivers *always* must wait for those fences before accessing the
> +	 * resource protected by the dma_resv object. The only exception for
> +	 * that is when the resource is known to be locked down in place by
> +	 * pinning it previously.
> +	 */
> +	DMA_RESV_USAGE_KERNEL,
> +
>  	/**
>  	 * @DMA_RESV_USAGE_WRITE: Implicit write synchronization.
>  	 *
> -- 
> 2.25.1

All the functional changes in drivers make sense to me now.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 06/16] drm/amdgpu: use DMA_RESV_USAGE_KERNEL
  2022-04-06  7:51 ` [PATCH 06/16] drm/amdgpu: use DMA_RESV_USAGE_KERNEL Christian König
@ 2022-04-06 12:42   ` Daniel Vetter
  2022-04-06 14:54     ` Christian König
  0 siblings, 1 reply; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:42 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:22AM +0200, Christian König wrote:
> Wait only for kernel fences before kmap or UVD direct submission.
> 
> This also makes sure that we always wait in amdgpu_bo_kmap() even when
> returning a cached pointer.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++-----
>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c    |  2 +-
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index a3cdf8a24377..5832c05ab10d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -761,6 +761,11 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
>  	if (bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)
>  		return -EPERM;
>  
> +	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_KERNEL,
> +				  false, MAX_SCHEDULE_TIMEOUT);
> +	if (r < 0)
> +		return r;
> +
>  	kptr = amdgpu_bo_kptr(bo);
>  	if (kptr) {
>  		if (ptr)
> @@ -768,11 +773,6 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
>  		return 0;
>  	}
>  
> -	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_WRITE,
> -				  false, MAX_SCHEDULE_TIMEOUT);
> -	if (r < 0)
> -		return r;
> -
>  	r = ttm_bo_kmap(&bo->tbo, 0, bo->tbo.resource->num_pages, &bo->kmap);

I wonder whether waiting for kernel fences shouldn't be ttm's duty here.
Anyway patch makes some sense to me.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>  	if (r)
>  		return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index 3654326219e0..6eac649499d3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -1164,7 +1164,7 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
>  
>  	if (direct) {
>  		r = dma_resv_wait_timeout(bo->tbo.base.resv,
> -					  DMA_RESV_USAGE_WRITE, false,
> +					  DMA_RESV_USAGE_KERNEL, false,
>  					  msecs_to_jiffies(10));
>  		if (r == 0)
>  			r = -ETIMEDOUT;
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 07/16] drm/radeon: use DMA_RESV_USAGE_KERNEL
  2022-04-06  7:51 ` [PATCH 07/16] drm/radeon: " Christian König
@ 2022-04-06 12:43   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:43 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:23AM +0200, Christian König wrote:
> Always wait for kernel fences before kmap and not only for UVD kmaps.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/radeon/radeon_object.c |  7 ++++++-
>  drivers/gpu/drm/radeon/radeon_uvd.c    | 12 ++----------
>  2 files changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
> index cb5c4aa45cef..6c4a6802ca96 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -219,7 +219,12 @@ int radeon_bo_create(struct radeon_device *rdev,
>  int radeon_bo_kmap(struct radeon_bo *bo, void **ptr)
>  {
>  	bool is_iomem;
> -	int r;
> +	long r;
> +
> +	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_KERNEL,
> +				  false, MAX_SCHEDULE_TIMEOUT);

Maybe another reason why we should push this wait into ttm kmap helpers?

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> +	if (r < 0)
> +		return r;
>  
>  	if (bo->kptr) {
>  		if (ptr) {
> diff --git a/drivers/gpu/drm/radeon/radeon_uvd.c b/drivers/gpu/drm/radeon/radeon_uvd.c
> index a50750740ab0..a2cda184b2b2 100644
> --- a/drivers/gpu/drm/radeon/radeon_uvd.c
> +++ b/drivers/gpu/drm/radeon/radeon_uvd.c
> @@ -470,24 +470,16 @@ static int radeon_uvd_cs_msg(struct radeon_cs_parser *p, struct radeon_bo *bo,
>  	int32_t *msg, msg_type, handle;
>  	unsigned img_size = 0;
>  	void *ptr;
> -	long r;
> -	int i;
> +	int i, r;
>  
>  	if (offset & 0x3F) {
>  		DRM_ERROR("UVD messages must be 64 byte aligned!\n");
>  		return -EINVAL;
>  	}
>  
> -	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_WRITE,
> -				  false, MAX_SCHEDULE_TIMEOUT);
> -	if (r <= 0) {
> -		DRM_ERROR("Failed waiting for UVD message (%ld)!\n", r);
> -		return r ? r : -ETIME;
> -	}
> -
>  	r = radeon_bo_kmap(bo, &ptr);
>  	if (r) {
> -		DRM_ERROR("Failed mapping the UVD message (%ld)!\n", r);
> +		DRM_ERROR("Failed mapping the UVD message (%d)!\n", r);
>  		return r;
>  	}
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 08/16] drm/etnaviv: always wait for kernel fences
  2022-04-06  7:51 ` [PATCH 08/16] drm/etnaviv: always wait for kernel fences Christian König
@ 2022-04-06 12:46   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:46 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:24AM +0200, Christian König wrote:
> Even for explicit synchronization we should wait for kernel fences.

Yeah I don't think this patch makes much sense, because aside from etnaviv
there's also msm and lima which allow you to ignore all dma_resv fences
completely.

But it's also not an issue because these drivers don't move buffers, don't
have any other kernel fences and also don't do dynamic importing. I think
the real fix is replacing the write argument to
drm_sched_job_add_implicit_dependencies with dma_resv_usage and rolling
that out.

I'd just drop this for now, seems like a detour.
-Daniel

> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 27 ++++++++++++++++++--
>  1 file changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 98bb5c9239de..3fedd29732d5 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -171,6 +171,26 @@ static int submit_lock_objects(struct etnaviv_gem_submit *submit,
>  	return ret;
>  }
>  
> +/* TODO: This should be moved into the GPU scheduler if others need it */
> +static int submit_fence_kernel_sync(struct etnaviv_gem_submit *submit,
> +				    struct dma_resv *resv)
> +{
> +	struct dma_resv_iter cursor;
> +	struct dma_fence *fence;
> +	int ret;
> +
> +	dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_KERNEL, fence) {
> +		/* Make sure to grab an additional ref on the added fence */
> +		dma_fence_get(fence);
> +		ret = drm_sched_job_add_dependency(&submit->sched_job, fence);
> +		if (ret) {
> +			dma_fence_put(fence);
> +			return ret;
> +		}
> +	}
> +	return 0;
> +}
> +
>  static int submit_fence_sync(struct etnaviv_gem_submit *submit)
>  {
>  	int i, ret = 0;
> @@ -183,8 +203,11 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
>  		if (ret)
>  			return ret;
>  
> -		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
> -			continue;
> +		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT) {
> +			ret = submit_fence_kernel_sync(submit, robj);
> +			if (ret)
> +				return ret;
> +		}
>  
>  		ret = drm_sched_job_add_implicit_dependencies(&submit->sched_job,
>  							      &bo->obj->base,
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 09/16] drm/nouveau: only wait for kernel fences in nouveau_bo_vm_cleanup
  2022-04-06  7:51 ` [PATCH 09/16] drm/nouveau: only wait for kernel fences in nouveau_bo_vm_cleanup Christian König
@ 2022-04-06 12:47   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:47 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:25AM +0200, Christian König wrote:
> Don't wait for user space submissions. I'm not 100% sure if that is
> correct, but it seems to match what the code initially intended.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>

I'll let nouveau folks review/test this one.
-Daniel
> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 05076e530e7d..13deb6c70ba6 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -962,10 +962,10 @@ nouveau_bo_vm_cleanup(struct ttm_buffer_object *bo,
>  	struct dma_fence *fence;
>  	int ret;
>  
> -	ret = dma_resv_get_singleton(bo->base.resv, DMA_RESV_USAGE_WRITE,
> +	ret = dma_resv_get_singleton(bo->base.resv, DMA_RESV_USAGE_KERNEL,
>  				     &fence);
>  	if (ret)
> -		dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_WRITE,
> +		dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_KERNEL,
>  				      false, MAX_SCHEDULE_TIMEOUT);
>  
>  	nv10_bo_put_tile_region(dev, *old_tile, fence);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 10/16] RDMA: use DMA_RESV_USAGE_KERNEL
  2022-04-06  7:51 ` [PATCH 10/16] RDMA: use DMA_RESV_USAGE_KERNEL Christian König
@ 2022-04-06 12:48   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:48 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:26AM +0200, Christian König wrote:
> We only need to wait for kernel submissions here.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>

I think we had an implied ack from Jason on this one. Not quite enough cc
to regrab it.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/infiniband/core/umem_dmabuf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
> index f9901d273b8e..fce80a4a5147 100644
> --- a/drivers/infiniband/core/umem_dmabuf.c
> +++ b/drivers/infiniband/core/umem_dmabuf.c
> @@ -68,7 +68,7 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf)
>  	 * the migration.
>  	 */
>  	return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv,
> -				     DMA_RESV_USAGE_WRITE,
> +				     DMA_RESV_USAGE_KERNEL,
>  				     false, MAX_SCHEDULE_TIMEOUT);
>  }
>  EXPORT_SYMBOL(ib_umem_dmabuf_map_pages);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 15/16] drm/ttm: remove bo->moving
  2022-04-06  7:51 ` [PATCH 15/16] drm/ttm: remove bo->moving Christian König
@ 2022-04-06 12:52   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:52 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:31AM +0200, Christian König wrote:
> This is now handled by the DMA-buf framework in the dma_resv obj.
> 
> Also remove the workaround inside VMWGFX to update the moving fence.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>

Looks all reasonable.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 13 ++++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  5 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c    | 11 +++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c   | 11 ++++--
>  drivers/gpu/drm/ttm/ttm_bo.c                  | 10 ++----
>  drivers/gpu/drm/ttm/ttm_bo_util.c             |  7 ----
>  drivers/gpu/drm/ttm/ttm_bo_vm.c               | 34 +++++++------------
>  drivers/gpu/drm/vmwgfx/vmwgfx_resource.c      |  6 ----
>  include/drm/ttm/ttm_bo_api.h                  |  2 --
>  9 files changed, 39 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 808e21dcb517..a4955ef76cfc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -2447,6 +2447,8 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
>  		struct amdgpu_bo *bo = mem->bo;
>  		uint32_t domain = mem->domain;
>  		struct kfd_mem_attachment *attachment;
> +		struct dma_resv_iter cursor;
> +		struct dma_fence *fence;
>  
>  		total_size += amdgpu_bo_size(bo);
>  
> @@ -2461,10 +2463,13 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
>  				goto validate_map_fail;
>  			}
>  		}
> -		ret = amdgpu_sync_fence(&sync_obj, bo->tbo.moving);
> -		if (ret) {
> -			pr_debug("Memory eviction: Sync BO fence failed. Try again\n");
> -			goto validate_map_fail;
> +		dma_resv_for_each_fence(&cursor, bo->tbo.base.resv,
> +					DMA_RESV_USAGE_KERNEL, fence) {
> +			ret = amdgpu_sync_fence(&sync_obj, fence);
> +			if (ret) {
> +				pr_debug("Memory eviction: Sync BO fence failed. Try again\n");
> +				goto validate_map_fail;
> +			}
>  		}
>  		list_for_each_entry(attachment, &mem->attachments, list) {
>  			if (!attachment->is_mapped)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 5832c05ab10d..ef93abec13b9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -612,9 +612,8 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>  		if (unlikely(r))
>  			goto fail_unreserve;
>  
> -		amdgpu_bo_fence(bo, fence, false);
> -		dma_fence_put(bo->tbo.moving);
> -		bo->tbo.moving = dma_fence_get(fence);
> +		dma_resv_add_fence(bo->tbo.base.resv, fence,
> +				   DMA_RESV_USAGE_KERNEL);
>  		dma_fence_put(fence);
>  	}
>  	if (!bp->resv)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
> index e3fbf0f10add..31913ae86de6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
> @@ -74,13 +74,12 @@ static int amdgpu_vm_cpu_update(struct amdgpu_vm_update_params *p,
>  {
>  	unsigned int i;
>  	uint64_t value;
> -	int r;
> +	long r;
>  
> -	if (vmbo->bo.tbo.moving) {
> -		r = dma_fence_wait(vmbo->bo.tbo.moving, true);
> -		if (r)
> -			return r;
> -	}
> +	r = dma_resv_wait_timeout(vmbo->bo.tbo.base.resv, DMA_RESV_USAGE_KERNEL,
> +				  true, MAX_SCHEDULE_TIMEOUT);
> +	if (r < 0)
> +		return r;
>  
>  	pe += (unsigned long)amdgpu_bo_kptr(&vmbo->bo);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index 9485b541947e..9cd6f41896c0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -205,14 +205,19 @@ static int amdgpu_vm_sdma_update(struct amdgpu_vm_update_params *p,
>  	struct amdgpu_bo *bo = &vmbo->bo;
>  	enum amdgpu_ib_pool_type pool = p->immediate ? AMDGPU_IB_POOL_IMMEDIATE
>  		: AMDGPU_IB_POOL_DELAYED;
> +	struct dma_resv_iter cursor;
>  	unsigned int i, ndw, nptes;
> +	struct dma_fence *fence;
>  	uint64_t *pte;
>  	int r;
>  
>  	/* Wait for PD/PT moves to be completed */
> -	r = amdgpu_sync_fence(&p->job->sync, bo->tbo.moving);
> -	if (r)
> -		return r;
> +	dma_resv_for_each_fence(&cursor, bo->tbo.base.resv,
> +				DMA_RESV_USAGE_KERNEL, fence) {
> +		r = amdgpu_sync_fence(&p->job->sync, fence);
> +		if (r)
> +			return r;
> +	}
>  
>  	do {
>  		ndw = p->num_dw_left;
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 360f980c7e10..015a94f766de 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -418,7 +418,6 @@ static void ttm_bo_release(struct kref *kref)
>  	dma_resv_unlock(bo->base.resv);
>  
>  	atomic_dec(&ttm_glob.bo_count);
> -	dma_fence_put(bo->moving);
>  	bo->destroy(bo);
>  }
>  
> @@ -714,9 +713,8 @@ void ttm_bo_unpin(struct ttm_buffer_object *bo)
>  EXPORT_SYMBOL(ttm_bo_unpin);
>  
>  /*
> - * Add the last move fence to the BO and reserve a new shared slot. We only use
> - * a shared slot to avoid unecessary sync and rely on the subsequent bo move to
> - * either stall or use an exclusive fence respectively set bo->moving.
> + * Add the last move fence to the BO as kernel dependency and reserve a new
> + * fence slot.
>   */
>  static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>  				 struct ttm_resource_manager *man,
> @@ -746,9 +744,6 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>  		dma_fence_put(fence);
>  		return ret;
>  	}
> -
> -	dma_fence_put(bo->moving);
> -	bo->moving = fence;
>  	return 0;
>  }
>  
> @@ -951,7 +946,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev,
>  	bo->bdev = bdev;
>  	bo->type = type;
>  	bo->page_alignment = page_alignment;
> -	bo->moving = NULL;
>  	bo->pin_count = 0;
>  	bo->sg = sg;
>  	bo->bulk_move = NULL;
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index 99deb45894f4..bc5190340b9c 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -228,7 +228,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>  
>  	atomic_inc(&ttm_glob.bo_count);
>  	INIT_LIST_HEAD(&fbo->base.ddestroy);
> -	fbo->base.moving = NULL;
>  	drm_vma_node_reset(&fbo->base.base.vma_node);
>  
>  	kref_init(&fbo->base.kref);
> @@ -500,9 +499,6 @@ static int ttm_bo_move_to_ghost(struct ttm_buffer_object *bo,
>  	 * operation has completed.
>  	 */
>  
> -	dma_fence_put(bo->moving);
> -	bo->moving = dma_fence_get(fence);
> -
>  	ret = ttm_buffer_object_transfer(bo, &ghost_obj);
>  	if (ret)
>  		return ret;
> @@ -546,9 +542,6 @@ static void ttm_bo_move_pipeline_evict(struct ttm_buffer_object *bo,
>  	spin_unlock(&from->move_lock);
>  
>  	ttm_resource_free(bo, &bo->resource);
> -
> -	dma_fence_put(bo->moving);
> -	bo->moving = dma_fence_get(fence);
>  }
>  
>  int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index 08ba083a80d2..5b324f245265 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -46,17 +46,13 @@
>  static vm_fault_t ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo,
>  				struct vm_fault *vmf)
>  {
> -	vm_fault_t ret = 0;
> -	int err = 0;
> -
> -	if (likely(!bo->moving))
> -		goto out_unlock;
> +	long err = 0;
>  
>  	/*
>  	 * Quick non-stalling check for idle.
>  	 */
> -	if (dma_fence_is_signaled(bo->moving))
> -		goto out_clear;
> +	if (dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_KERNEL))
> +		return 0;
>  
>  	/*
>  	 * If possible, avoid waiting for GPU with mmap_lock
> @@ -64,34 +60,30 @@ static vm_fault_t ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo,
>  	 * is the first attempt.
>  	 */
>  	if (fault_flag_allow_retry_first(vmf->flags)) {
> -		ret = VM_FAULT_RETRY;
>  		if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
> -			goto out_unlock;
> +			return VM_FAULT_RETRY;
>  
>  		ttm_bo_get(bo);
>  		mmap_read_unlock(vmf->vma->vm_mm);
> -		(void) dma_fence_wait(bo->moving, true);
> +		(void)dma_resv_wait_timeout(bo->base.resv,
> +					    DMA_RESV_USAGE_KERNEL, true,
> +					    MAX_SCHEDULE_TIMEOUT);
>  		dma_resv_unlock(bo->base.resv);
>  		ttm_bo_put(bo);
> -		goto out_unlock;
> +		return VM_FAULT_RETRY;
>  	}
>  
>  	/*
>  	 * Ordinary wait.
>  	 */
> -	err = dma_fence_wait(bo->moving, true);
> -	if (unlikely(err != 0)) {
> -		ret = (err != -ERESTARTSYS) ? VM_FAULT_SIGBUS :
> +	err = dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_KERNEL, true,
> +				    MAX_SCHEDULE_TIMEOUT);
> +	if (unlikely(err < 0)) {
> +		return (err != -ERESTARTSYS) ? VM_FAULT_SIGBUS :
>  			VM_FAULT_NOPAGE;
> -		goto out_unlock;
>  	}
>  
> -out_clear:
> -	dma_fence_put(bo->moving);
> -	bo->moving = NULL;
> -
> -out_unlock:
> -	return ret;
> +	return 0;
>  }
>  
>  static unsigned long ttm_bo_io_mem_pfn(struct ttm_buffer_object *bo,
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> index a84d1d5628d0..a7d62a4eb47b 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> @@ -1161,12 +1161,6 @@ int vmw_resources_clean(struct vmw_buffer_object *vbo, pgoff_t start,
>  		*num_prefault = __KERNEL_DIV_ROUND_UP(last_cleaned - res_start,
>  						      PAGE_SIZE);
>  		vmw_bo_fence_single(bo, NULL);
> -		if (bo->moving)
> -			dma_fence_put(bo->moving);
> -
> -		return dma_resv_get_singleton(bo->base.resv,
> -					      DMA_RESV_USAGE_WRITE,
> -					      &bo->moving);
>  	}
>  
>  	return 0;
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index c76932b68a33..2d524f8b0802 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -94,7 +94,6 @@ struct ttm_tt;
>   * @deleted: True if the object is only a zombie and already deleted.
>   * @ddestroy: List head for the delayed destroy list.
>   * @swap: List head for swap LRU list.
> - * @moving: Fence set when BO is moving
>   * @offset: The current GPU offset, which can have different meanings
>   * depending on the memory type. For SYSTEM type memory, it should be 0.
>   * @cur_placement: Hint of current placement.
> @@ -147,7 +146,6 @@ struct ttm_buffer_object {
>  	 * Members protected by a bo reservation.
>  	 */
>  
> -	struct dma_fence *moving;
>  	unsigned priority;
>  	unsigned pin_count;
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2022-04-06  7:51 Christian König
                   ` (15 preceding siblings ...)
  2022-04-06  7:51 ` [PATCH 16/16] dma-buf: drop seq count based update Christian König
@ 2022-04-06 12:59 ` Daniel Vetter
  16 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:59 UTC (permalink / raw)
  To: DMA-resvusage; +Cc: daniel.vetter, dri-devel

On Wed, Apr 06, 2022 at 09:51:16AM +0200, Christian König wrote:
> Hi Daniel,
> 
> rebased on top of all the changes in drm-misc-next now and hopefully
> ready for 5.19.
> 
> I think I addressed all concern, but there was a bunch of rebase fallout
> from i915, so better to double check that once more.

No idea what you managed to do with this series, but
- cover letter isn't showing up in archives
- you have Reply-To: DMA-resvusage sprinkled all over, which means my
  replies are bouncing in funny ways

Please fix for next time around.

Also the split up patches lack a bit cc.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 16/16] dma-buf: drop seq count based update
  2022-04-06  7:51 ` [PATCH 16/16] dma-buf: drop seq count based update Christian König
@ 2022-04-06 13:00   ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-06 13:00 UTC (permalink / raw)
  To: Christian König; +Cc: daniel.vetter, Christian König, dri-devel

On Wed, Apr 06, 2022 at 09:51:32AM +0200, Christian König wrote:
> This should be possible now since we don't have the distinction
> between exclusive and shared fences any more.
> 
> The only possible pitfall is that a dma_fence would be reused during the
> RCU grace period, but even that could be handled with a single extra check.

At worst this means we wait a bit longer than a perfect snapshot. For
anything where this would have resulted in dependency loops you need to
take the ww_mutex anyway.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/dma-resv.c    | 33 ++++++++++++---------------------
>  drivers/dma-buf/st-dma-resv.c |  2 +-
>  include/linux/dma-resv.h      | 12 ------------
>  3 files changed, 13 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 5b64aa554c36..0cce6e4ec946 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -133,7 +133,6 @@ static void dma_resv_list_free(struct dma_resv_list *list)
>  void dma_resv_init(struct dma_resv *obj)
>  {
>  	ww_mutex_init(&obj->lock, &reservation_ww_class);
> -	seqcount_ww_mutex_init(&obj->seq, &obj->lock);

This removes the last user, and I don't think we'll add one. Please also
add a patch to remove the seqcount_ww_mutex stuff and cc locking/rt folks
on this.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  
>  	RCU_INIT_POINTER(obj->fences, NULL);
>  }
> @@ -292,28 +291,24 @@ void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
>  	fobj = dma_resv_fences_list(obj);
>  	count = fobj->num_fences;
>  
> -	write_seqcount_begin(&obj->seq);
> -
>  	for (i = 0; i < count; ++i) {
>  		enum dma_resv_usage old_usage;
>  
>  		dma_resv_list_entry(fobj, i, obj, &old, &old_usage);
>  		if ((old->context == fence->context && old_usage >= usage) ||
> -		    dma_fence_is_signaled(old))
> -			goto replace;
> +		    dma_fence_is_signaled(old)) {
> +			dma_resv_list_set(fobj, i, fence, usage);
> +			dma_fence_put(old);
> +			return;
> +		}
>  	}
>  
>  	BUG_ON(fobj->num_fences >= fobj->max_fences);
> -	old = NULL;
>  	count++;
>  
> -replace:
>  	dma_resv_list_set(fobj, i, fence, usage);
>  	/* pointer update must be visible before we extend the num_fences */
>  	smp_store_mb(fobj->num_fences, count);
> -
> -	write_seqcount_end(&obj->seq);
> -	dma_fence_put(old);
>  }
>  EXPORT_SYMBOL(dma_resv_add_fence);
>  
> @@ -341,7 +336,6 @@ void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
>  	dma_resv_assert_held(obj);
>  
>  	list = dma_resv_fences_list(obj);
> -	write_seqcount_begin(&obj->seq);
>  	for (i = 0; list && i < list->num_fences; ++i) {
>  		struct dma_fence *old;
>  
> @@ -352,14 +346,12 @@ void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
>  		dma_resv_list_set(list, i, replacement, usage);
>  		dma_fence_put(old);
>  	}
> -	write_seqcount_end(&obj->seq);
>  }
>  EXPORT_SYMBOL(dma_resv_replace_fences);
>  
>  /* Restart the unlocked iteration by initializing the cursor object. */
>  static void dma_resv_iter_restart_unlocked(struct dma_resv_iter *cursor)
>  {
> -	cursor->seq = read_seqcount_begin(&cursor->obj->seq);
>  	cursor->index = 0;
>  	cursor->num_fences = 0;
>  	cursor->fences = dma_resv_fences_list(cursor->obj);
> @@ -388,8 +380,10 @@ static void dma_resv_iter_walk_unlocked(struct dma_resv_iter *cursor)
>  				    cursor->obj, &cursor->fence,
>  				    &cursor->fence_usage);
>  		cursor->fence = dma_fence_get_rcu(cursor->fence);
> -		if (!cursor->fence)
> -			break;
> +		if (!cursor->fence) {
> +			dma_resv_iter_restart_unlocked(cursor);
> +			continue;
> +		}
>  
>  		if (!dma_fence_is_signaled(cursor->fence) &&
>  		    cursor->usage >= cursor->fence_usage)
> @@ -415,7 +409,7 @@ struct dma_fence *dma_resv_iter_first_unlocked(struct dma_resv_iter *cursor)
>  	do {
>  		dma_resv_iter_restart_unlocked(cursor);
>  		dma_resv_iter_walk_unlocked(cursor);
> -	} while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
> +	} while (dma_resv_fences_list(cursor->obj) != cursor->fences);
>  	rcu_read_unlock();
>  
>  	return cursor->fence;
> @@ -438,13 +432,13 @@ struct dma_fence *dma_resv_iter_next_unlocked(struct dma_resv_iter *cursor)
>  
>  	rcu_read_lock();
>  	cursor->is_restarted = false;
> -	restart = read_seqcount_retry(&cursor->obj->seq, cursor->seq);
> +	restart = dma_resv_fences_list(cursor->obj) != cursor->fences;
>  	do {
>  		if (restart)
>  			dma_resv_iter_restart_unlocked(cursor);
>  		dma_resv_iter_walk_unlocked(cursor);
>  		restart = true;
> -	} while (read_seqcount_retry(&cursor->obj->seq, cursor->seq));
> +	} while (dma_resv_fences_list(cursor->obj) != cursor->fences);
>  	rcu_read_unlock();
>  
>  	return cursor->fence;
> @@ -540,10 +534,7 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
>  	}
>  	dma_resv_iter_end(&cursor);
>  
> -	write_seqcount_begin(&dst->seq);
>  	list = rcu_replace_pointer(dst->fences, list, dma_resv_held(dst));
> -	write_seqcount_end(&dst->seq);
> -
>  	dma_resv_list_free(list);
>  	return 0;
>  }
> diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
> index 8ace9e84c845..813779e3c9be 100644
> --- a/drivers/dma-buf/st-dma-resv.c
> +++ b/drivers/dma-buf/st-dma-resv.c
> @@ -217,7 +217,7 @@ static int test_for_each_unlocked(void *arg)
>  		if (r == -ENOENT) {
>  			r = -EINVAL;
>  			/* That should trigger an restart */
> -			cursor.seq--;
> +			cursor.fences = (void*)~0;
>  		} else if (r == -EINVAL) {
>  			r = 0;
>  		}
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 1db759eacc98..c8ccbc94d5d2 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -155,15 +155,6 @@ struct dma_resv {
>  	 */
>  	struct ww_mutex lock;
>  
> -	/**
> -	 * @seq:
> -	 *
> -	 * Sequence count for managing RCU read-side synchronization, allows
> -	 * read-only access to @fences while ensuring we take a consistent
> -	 * snapshot.
> -	 */
> -	seqcount_ww_mutex_t seq;
> -
>  	/**
>  	 * @fences:
>  	 *
> @@ -202,9 +193,6 @@ struct dma_resv_iter {
>  	/** @fence_usage: the usage of the current fence */
>  	enum dma_resv_usage fence_usage;
>  
> -	/** @seq: sequence number to check for modifications */
> -	unsigned int seq;
> -
>  	/** @index: index into the shared fences */
>  	unsigned int index;
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 14/16] drm/i915: drop bo->moving dependency
  2022-04-06  7:51 ` [PATCH 14/16] drm/i915: drop bo->moving dependency Christian König
@ 2022-04-06 13:24   ` Matthew Auld
  0 siblings, 0 replies; 65+ messages in thread
From: Matthew Auld @ 2022-04-06 13:24 UTC (permalink / raw)
  To: DMA-resv
  Cc: Daniel Vetter, Intel Graphics Development, Christian König,
	ML dri-devel

On Wed, 6 Apr 2022 at 08:52, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> That should now be handled by the common dma_resv framework.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: intel-gfx@lists.freedesktop.org
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.c    | 41 ++++---------------
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |  8 +---
>  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 15 +------
>  .../drm/i915/gem/selftests/i915_gem_migrate.c |  3 +-
>  .../drm/i915/gem/selftests/i915_gem_mman.c    |  3 +-
>  drivers/gpu/drm/i915/i915_vma.c               |  9 +++-
>  6 files changed, 21 insertions(+), 58 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> index 372bc220faeb..ffde7bc0a95d 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -741,30 +741,19 @@ static const struct drm_gem_object_funcs i915_gem_object_funcs = {
>  /**
>   * i915_gem_object_get_moving_fence - Get the object's moving fence if any
>   * @obj: The object whose moving fence to get.
> + * @fence: The resulting fence
>   *
>   * A non-signaled moving fence means that there is an async operation
>   * pending on the object that needs to be waited on before setting up
>   * any GPU- or CPU PTEs to the object's pages.
>   *
> - * Return: A refcounted pointer to the object's moving fence if any,
> - * NULL otherwise.
> + * Return: Negative error code or 0 for success.
>   */
> -struct dma_fence *
> -i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj)
> +int i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj,
> +                                    struct dma_fence **fence)
>  {
> -       return dma_fence_get(i915_gem_to_ttm(obj)->moving);
> -}
> -
> -void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
> -                                     struct dma_fence *fence)
> -{
> -       struct dma_fence **moving = &i915_gem_to_ttm(obj)->moving;
> -
> -       if (*moving == fence)
> -               return;
> -
> -       dma_fence_put(*moving);
> -       *moving = dma_fence_get(fence);
> +       return dma_resv_get_singleton(obj->base.resv, DMA_RESV_USAGE_KERNEL,
> +                                     fence);
>  }
>
>  /**
> @@ -782,23 +771,9 @@ void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
>  int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,
>                                       bool intr)
>  {
> -       struct dma_fence *fence = i915_gem_to_ttm(obj)->moving;
> -       int ret;
> -
>         assert_object_held(obj);
> -       if (!fence)
> -               return 0;
> -
> -       ret = dma_fence_wait(fence, intr);
> -       if (ret)
> -               return ret;
> -
> -       if (fence->error)
> -               return fence->error;
> -
> -       i915_gem_to_ttm(obj)->moving = NULL;
> -       dma_fence_put(fence);
> -       return 0;
> +       return dma_resv_wait_timeout(obj->base. resv, DMA_RESV_USAGE_KERNEL,
> +                                    intr, MAX_SCHEDULE_TIMEOUT);
>  }
>
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 02c37fe4a535..e11d82a9f7c3 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -520,12 +520,8 @@ i915_gem_object_finish_access(struct drm_i915_gem_object *obj)
>         i915_gem_object_unpin_pages(obj);
>  }
>
> -struct dma_fence *
> -i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj);
> -
> -void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
> -                                     struct dma_fence *fence);
> -
> +int i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj,
> +                                    struct dma_fence **fence);
>  int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,
>                                       bool intr);
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> index 438b8a95b3d1..a10716f4e717 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> @@ -467,19 +467,6 @@ __i915_ttm_move(struct ttm_buffer_object *bo,
>         return fence;
>  }
>
> -static int
> -prev_deps(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
> -         struct i915_deps *deps)
> -{
> -       int ret;
> -
> -       ret = i915_deps_add_dependency(deps, bo->moving, ctx);
> -       if (!ret)
> -               ret = i915_deps_add_resv(deps, bo->base.resv, ctx);
> -
> -       return ret;
> -}
> -
>  /**
>   * i915_ttm_move - The TTM move callback used by i915.
>   * @bo: The buffer object.
> @@ -534,7 +521,7 @@ int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
>                 struct i915_deps deps;
>
>                 i915_deps_init(&deps, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
> -               ret = prev_deps(bo, ctx, &deps);
> +               ret = i915_deps_add_resv(&deps, bo->base.resv, ctx);
>                 if (ret) {
>                         i915_refct_sgt_put(dst_rsgt);
>                         return ret;
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> index 4997ed18b6e4..0ad443a90c8b 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
> @@ -219,8 +219,7 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
>                         err = dma_resv_reserve_fences(obj->base.resv, 1);
>                         if (!err)
>                                 dma_resv_add_fence(obj->base.resv, &rq->fence,
> -                                                  DMA_RESV_USAGE_WRITE);
> -                       i915_gem_object_set_moving_fence(obj, &rq->fence);
> +                                                  DMA_RESV_USAGE_KERNEL);
>                         i915_request_put(rq);
>                 }
>                 if (err)
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> index 3a6e3f6d239f..dfc34cc2ef8c 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> @@ -1221,8 +1221,7 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
>         i915_gem_object_unpin_pages(obj);
>         if (rq) {
>                 dma_resv_add_fence(obj->base.resv, &rq->fence,
> -                                  DMA_RESV_USAGE_WRITE);
> -               i915_gem_object_set_moving_fence(obj, &rq->fence);
> +                                  DMA_RESV_USAGE_KERNEL);
>                 i915_request_put(rq);
>         }
>         i915_gem_object_unlock(obj);
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 524477d8939e..d077f7b9eaad 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -1357,10 +1357,17 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>         if (err)
>                 return err;
>
> +       if (vma->obj) {
> +               err = i915_gem_object_get_moving_fence(vma->obj, &moving);
> +               if (err)

goto err_put_pages;

> +                       return err;
> +       } else {
> +               moving = NULL;

It looks like moving is already initialised with NULL further up.

> +       }
> +
>         if (flags & PIN_GLOBAL)
>                 wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
>
> -       moving = vma->obj ? i915_gem_object_get_moving_fence(vma->obj) : NULL;

Just fyi, this patch will conflict slightly with the following in gt-next:

e4b3ee71ec2a drm/i915: stop checking for NULL vma->obj
833124a0d169 drm/i915: limit the async bind to bind_async_flags

>         if (flags & vma->vm->bind_async_flags || moving) {
>                 /* lock VM */
>                 err = i915_vm_lock_objects(vma->vm, ww);
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 06/16] drm/amdgpu: use DMA_RESV_USAGE_KERNEL
  2022-04-06 12:42   ` Daniel Vetter
@ 2022-04-06 14:54     ` Christian König
  0 siblings, 0 replies; 65+ messages in thread
From: Christian König @ 2022-04-06 14:54 UTC (permalink / raw)
  To: Daniel Vetter, DMA-resvusage; +Cc: daniel.vetter, dri-devel

Am 06.04.22 um 14:42 schrieb Daniel Vetter:
> On Wed, Apr 06, 2022 at 09:51:22AM +0200, Christian König wrote:
>> Wait only for kernel fences before kmap or UVD direct submission.
>>
>> This also makes sure that we always wait in amdgpu_bo_kmap() even when
>> returning a cached pointer.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++-----
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c    |  2 +-
>>   2 files changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index a3cdf8a24377..5832c05ab10d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -761,6 +761,11 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
>>   	if (bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)
>>   		return -EPERM;
>>   
>> +	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_KERNEL,
>> +				  false, MAX_SCHEDULE_TIMEOUT);
>> +	if (r < 0)
>> +		return r;
>> +
>>   	kptr = amdgpu_bo_kptr(bo);
>>   	if (kptr) {
>>   		if (ptr)
>> @@ -768,11 +773,6 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
>>   		return 0;
>>   	}
>>   
>> -	r = dma_resv_wait_timeout(bo->tbo.base.resv, DMA_RESV_USAGE_WRITE,
>> -				  false, MAX_SCHEDULE_TIMEOUT);
>> -	if (r < 0)
>> -		return r;
>> -
>>   	r = ttm_bo_kmap(&bo->tbo, 0, bo->tbo.resource->num_pages, &bo->kmap);
> I wonder whether waiting for kernel fences shouldn't be ttm's duty here.
> Anyway patch makes some sense to me.

I was thinking the same and already had it halve implemented until I 
realized that this won't work because of the ptr caching.

Need to move that around as well and rework the whole handling.

Christian.

>
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>   	if (r)
>>   		return r;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> index 3654326219e0..6eac649499d3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
>> @@ -1164,7 +1164,7 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
>>   
>>   	if (direct) {
>>   		r = dma_resv_wait_timeout(bo->tbo.base.resv,
>> -					  DMA_RESV_USAGE_WRITE, false,
>> +					  DMA_RESV_USAGE_KERNEL, false,
>>   					  msecs_to_jiffies(10));
>>   		if (r == 0)
>>   			r = -ETIMEDOUT;
>> -- 
>> 2.25.1
>>


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 03/16] dma-buf: specify usage while adding fences to dma_resv obj v6
  2022-04-06 12:35     ` Daniel Vetter
@ 2022-04-07  8:01       ` Christian König
  2022-04-07  9:26         ` Daniel Vetter
  0 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2022-04-07  8:01 UTC (permalink / raw)
  To: Daniel Vetter, DMA-resvusage; +Cc: daniel.vetter, dri-devel

Am 06.04.22 um 14:35 schrieb Daniel Vetter:
> On Wed, Apr 06, 2022 at 02:32:22PM +0200, Daniel Vetter wrote:
>> On Wed, Apr 06, 2022 at 09:51:19AM +0200, Christian König wrote:
>>> [SNIP]
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> index 53f7c78628a4..98bb5c9239de 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> @@ -202,14 +202,10 @@ static void submit_attach_object_fences(struct etnaviv_gem_submit *submit)
>>>   
>>>   	for (i = 0; i < submit->nr_bos; i++) {
>>>   		struct drm_gem_object *obj = &submit->bos[i].obj->base;
>>> +		bool write = submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE;
>>>   
>>> -		if (submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE)
>>> -			dma_resv_add_excl_fence(obj->resv,
>>> -							  submit->out_fence);
>>> -		else
>>> -			dma_resv_add_shared_fence(obj->resv,
>>> -							    submit->out_fence);
>>> -
>>> +		dma_resv_add_fence(obj->resv, submit->out_fence, write ?
>>> +				   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
>> Iirc I had some suggestions to use dma_resv_usage_rw here and above. Do
>> these happen in later patches? There's also a few more of these later on.

That won't work. dma_resv_usage_rw() returns the usage as necessary for 
dependencies. In other words write return DMA_RESV_USAGE_READ and read 
return DMA_RESV_USAGE_WRITE.

What we could do is to add a dma_resv_add_fence_rw() wrapper which does 
the necessary ?: in a central place.

>>>   
>>> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
>>> index e0a11ee0e86d..cb3bfccc930f 100644
>>> --- a/drivers/gpu/drm/lima/lima_gem.c
>>> +++ b/drivers/gpu/drm/lima/lima_gem.c
>>> @@ -367,7 +367,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
>>>   		if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
>>>   			dma_resv_add_excl_fence(lima_bo_resv(bos[i]), fence);
>>>   		else
>>> -			dma_resv_add_shared_fence(lima_bo_resv(bos[i]), fence);
>>> +			dma_resv_add_fence(lima_bo_resv(bos[i]), fence);
> Correction on the r-b, I'm still pretty sure that this won't compile at
> all.

Grrr, I've forgot to add CONFIG_OF to my compile build config.

BTW: Do we have a tool for compile test coverage of patches? E.g. 
something which figures out if a build created an o file for each c file 
a patch touched?

Christian.

> -Daniel
>


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [PATCH 03/16] dma-buf: specify usage while adding fences to dma_resv obj v6
  2022-04-07  8:01       ` Christian König
@ 2022-04-07  9:26         ` Daniel Vetter
  0 siblings, 0 replies; 65+ messages in thread
From: Daniel Vetter @ 2022-04-07  9:26 UTC (permalink / raw)
  To: Christian König; +Cc: daniel.vetter, dri-devel, DMA-resvusage

On Thu, Apr 07, 2022 at 10:01:52AM +0200, Christian König wrote:
> Am 06.04.22 um 14:35 schrieb Daniel Vetter:
> > On Wed, Apr 06, 2022 at 02:32:22PM +0200, Daniel Vetter wrote:
> > > On Wed, Apr 06, 2022 at 09:51:19AM +0200, Christian König wrote:
> > > > [SNIP]
> > > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > > > index 53f7c78628a4..98bb5c9239de 100644
> > > > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > > > @@ -202,14 +202,10 @@ static void submit_attach_object_fences(struct etnaviv_gem_submit *submit)
> > > >   	for (i = 0; i < submit->nr_bos; i++) {
> > > >   		struct drm_gem_object *obj = &submit->bos[i].obj->base;
> > > > +		bool write = submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE;
> > > > -		if (submit->bos[i].flags & ETNA_SUBMIT_BO_WRITE)
> > > > -			dma_resv_add_excl_fence(obj->resv,
> > > > -							  submit->out_fence);
> > > > -		else
> > > > -			dma_resv_add_shared_fence(obj->resv,
> > > > -							    submit->out_fence);
> > > > -
> > > > +		dma_resv_add_fence(obj->resv, submit->out_fence, write ?
> > > > +				   DMA_RESV_USAGE_WRITE : DMA_RESV_USAGE_READ);
> > > Iirc I had some suggestions to use dma_resv_usage_rw here and above. Do
> > > these happen in later patches? There's also a few more of these later on.
> 
> That won't work. dma_resv_usage_rw() returns the usage as necessary for
> dependencies. In other words write return DMA_RESV_USAGE_READ and read
> return DMA_RESV_USAGE_WRITE.

Hm right, that's a bit annoying due to the asymetry in dependencies and
adding fences.
> 
> What we could do is to add a dma_resv_add_fence_rw() wrapper which does the
> necessary ?: in a central place.

I'm not sure it's overkill, but what about something like this:

enum drm_sync_mode {
	DRM_SYNC_NO_IMPLICIT,
	DRM_SYNC_WRITE,
	DRM_SYNC_READ,
}

And then two functions, on in the drm/sched which replaces the current
add_implicit_dependencies, and the other which would be in the glorious
future eu utils shared between ttm and gem drivers, which adds the fence
with the right usage. And they would take care of the right mapping in
each case.

And then all we'd still have in driver code is mapping from random
bonghits driver flags to drm_sync_mode, and all the confusion would be in
shared code. And see above, at least for me it's confusing as heck :-)

> 
> > > > diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> > > > index e0a11ee0e86d..cb3bfccc930f 100644
> > > > --- a/drivers/gpu/drm/lima/lima_gem.c
> > > > +++ b/drivers/gpu/drm/lima/lima_gem.c
> > > > @@ -367,7 +367,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
> > > >   		if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
> > > >   			dma_resv_add_excl_fence(lima_bo_resv(bos[i]), fence);
> > > >   		else
> > > > -			dma_resv_add_shared_fence(lima_bo_resv(bos[i]), fence);
> > > > +			dma_resv_add_fence(lima_bo_resv(bos[i]), fence);
> > Correction on the r-b, I'm still pretty sure that this won't compile at
> > all.
> 
> Grrr, I've forgot to add CONFIG_OF to my compile build config.
> 
> BTW: Do we have a tool for compile test coverage of patches? E.g. something
> which figures out if a build created an o file for each c file a patch
> touched?

Just regrets when I screw up :-/
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2023-11-11  4:21 Andrew Worsley
@ 2023-11-11  8:22 ` Javier Martinez Canillas
  0 siblings, 0 replies; 65+ messages in thread
From: Javier Martinez Canillas @ 2023-11-11  8:22 UTC (permalink / raw)
  To: Andrew Worsley, Thomas Zimmermann, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Daniel Vetter,
	open list:DRM DRIVER FOR FIRMWARE FRAMEBUFFERS, open list

Andrew Worsley <amworsley@gmail.com> writes:

Hello Andrew,

>    This patch fix's the failure of the frame buffer driver on my Asahi kernel
> which prevented X11 from starting on my Apple M1 laptop. It seems like a straight
> forward failure to follow the procedure described in drivers/video/aperture.c
> to remove the ealier driver. This patch is very simple and minimal. Very likely
> there may be better ways to fix this and very like there may be other drivers
> which have the same problem but I submit this so at least there is
> an interim fix for my problem.
>

Which partch? I think you forgot to include in your email?

>     Thanks
>
>     Andrew Worsley
>

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2022-05-19 10:50 ` Matthew Auld
@ 2022-05-20  7:11   ` Christian König
  0 siblings, 0 replies; 65+ messages in thread
From: Christian König @ 2022-05-20  7:11 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel

Am 19.05.22 um 12:50 schrieb Matthew Auld:
> On Thu, 19 May 2022 at 10:55, Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Just sending that out once more to intel-gfx to let the CI systems take
>> a look.
> If all went well it should normally appear at [1][2], if CI was able
> to pick up the series.
>
> Since it's not currently there, I assume it's temporarily stuck in the
> moderation queue, assuming you are not subscribed to intel-gfx ml?

Ah! Well I am subscribed, just not with the e-Mail address I use to send 
out those patches.

Going to fix that ASAP!

Thanks,
Christian.

>   If
> so, perhaps consider subscribing at [3] and then disable receiving any
> mail from the ml, so you get full use of CI without getting spammed.
>
> [1] https://intel-gfx-ci.01.org/queue/index.html
> [2] https://patchwork.freedesktop.org/project/intel-gfx/series/
> [3] https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>> No functional change compared to the last version.
>>
>> Christian.
>>
>>


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2022-05-19  9:54 Christian König
@ 2022-05-19 10:50 ` Matthew Auld
  2022-05-20  7:11   ` Re: Christian König
  0 siblings, 1 reply; 65+ messages in thread
From: Matthew Auld @ 2022-05-19 10:50 UTC (permalink / raw)
  To: Christian König; +Cc: Intel Graphics Development, ML dri-devel

On Thu, 19 May 2022 at 10:55, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Just sending that out once more to intel-gfx to let the CI systems take
> a look.

If all went well it should normally appear at [1][2], if CI was able
to pick up the series.

Since it's not currently there, I assume it's temporarily stuck in the
moderation queue, assuming you are not subscribed to intel-gfx ml? If
so, perhaps consider subscribing at [3] and then disable receiving any
mail from the ml, so you get full use of CI without getting spammed.

[1] https://intel-gfx-ci.01.org/queue/index.html
[2] https://patchwork.freedesktop.org/project/intel-gfx/series/
[3] https://lists.freedesktop.org/mailman/listinfo/intel-gfx

>
> No functional change compared to the last version.
>
> Christian.
>
>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
       [not found] <CAGsV3ysM+p_HAq+LgOe4db09e+zRtvELHUQzCjF8FVE2UF+3Ow@mail.gmail.com>
@ 2021-06-29 13:52 ` Alex Deucher
  0 siblings, 0 replies; 65+ messages in thread
From: Alex Deucher @ 2021-06-29 13:52 UTC (permalink / raw)
  To: shashank singh; +Cc: Maling list - DRI developers

Yes, please see this page for more information:
https://www.x.org/wiki/XorgEVoC/

Alex

On Mon, Jun 21, 2021 at 2:26 PM shashank singh
<shashanksingh819@gmail.com> wrote:
>
> Hello everyone, my name is Shashank Singh. I hope this is the right platform to reach out to the 'X.org' community. I was curious about the X.org Endless Vacation of Code. Is this program still active?
>
>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2021-05-15 22:57 Dmitry Baryshkov
@ 2021-06-02 21:45 ` Dmitry Baryshkov
  0 siblings, 0 replies; 65+ messages in thread
From: Dmitry Baryshkov @ 2021-06-02 21:45 UTC (permalink / raw)
  To: Bjorn Andersson, Rob Clark, Sean Paul, Abhinav Kumar
  Cc: Jonathan Marek, Stephen Boyd, linux-arm-msm, dri-devel,
	David Airlie, freedreno

On 16/05/2021 01:57, Dmitry Baryshkov wrote:
>  From Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # This line is ignored.
> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Reply-To:
> Subject: [PATCH v2 0/6] drm/msm/dpu: simplify RM code
> In-Reply-To:
> 
> There is no need to request most of hardware blocks through the resource
> manager (RM), since typically there is 1:1 or N:1 relationship between
> corresponding blocks. Each LM is tied to the single PP. Each MERGE_3D
> can be used by the specified pair of PPs.  Each DSPP is also tied to
> single LM. So instead of allocating them through the RM, get them via
> static configuration.
> 
> Depends on: https://lore.kernel.org/linux-arm-msm/20210515190909.1809050-1-dmitry.baryshkov@linaro.org
> 
> Changes since v1:
>   - Split into separate patch series to ease review.

Another gracious ping, now for this series.

I want to send next version with minor changes, but I'd like to hear 
your overall opinion before doing that.

> 
> ----------------------------------------------------------------
> Dmitry Baryshkov (6):
>        drm/msm/dpu: get DSPP blocks directly rather than through RM
>        drm/msm/dpu: get MERGE_3D blocks directly rather than through RM
>        drm/msm/dpu: get PINGPONG blocks directly rather than through RM
>        drm/msm/dpu: get INTF blocks directly rather than through RM
>        drm/msm/dpu: drop unused lm_max_width from RM
>        drm/msm/dpu: simplify peer LM handling
> 
>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c        |  54 +---
>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h        |   8 -
>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h   |   5 -
>   .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c   |   8 -
>   .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c   |   8 -
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c     |   2 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h     |   4 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.c          |  14 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h          |   7 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c    |   7 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h    |   4 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c            |  53 +++-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h            |   5 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c             | 310 ++-------------------
>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h             |  18 +-
>   drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h          |   9 +-
>   16 files changed, 115 insertions(+), 401 deletions(-)
> 
> 


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2020-10-09  7:51       ` Re: Thomas Zimmermann
@ 2020-10-09 15:48         ` Alex Deucher
  0 siblings, 0 replies; 65+ messages in thread
From: Alex Deucher @ 2020-10-09 15:48 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: Deucher, Alexander, Sandeep Raghuraman, Maling list - DRI developers

On Fri, Oct 9, 2020 at 3:51 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
> Hi
>
> Am 09.10.20 um 09:38 schrieb Sandeep Raghuraman:
> >
> >
> > On 10/9/20 12:44 PM, Thomas Zimmermann wrote:
> >> Hi
> >>
> >> Am 09.10.20 um 08:47 schrieb Thomas Zimmermann:
> >>> NACK for the entire lack of any form of commit description.
> >>
> >> Please see the documentation at [1] on how to describe the changes and
> >> getting your patches merged.
> >
> > Yes, I tried to use git send-email to send patches this time, and it resulted in this disaster. I'll stick to sending them through Thunderbird.
>
> What's the problem with send-email?
>
> A typical call for your patchset would look like
>
>   git send-mail <upstream-branch>...HEAD --cover-letter --annotate
> --to=... --cc=...
>
> That allows you to write the cover letter and have it sent out. IIRC you
> need ot set $EDITOR to your favorite text editor; and configure the SMTP
> server in ~/.gitconfig, under [sendemail].
>

You can also do `git format-patch -3 --cover-letter` and manually edit
the coverletter as needed then send them with git send-email.

Alex

> Best regards
> Thomas
>
> >
> >>
> >> Best regards
> >> Thomas
> >>
> >> [1]
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2020-10-09  7:38     ` Re: Sandeep Raghuraman
@ 2020-10-09  7:51       ` Thomas Zimmermann
  2020-10-09 15:48         ` Re: Alex Deucher
  0 siblings, 1 reply; 65+ messages in thread
From: Thomas Zimmermann @ 2020-10-09  7:51 UTC (permalink / raw)
  To: Sandeep Raghuraman, alexander.deucher; +Cc: dri-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 1328 bytes --]

Hi

Am 09.10.20 um 09:38 schrieb Sandeep Raghuraman:
> 
> 
> On 10/9/20 12:44 PM, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 09.10.20 um 08:47 schrieb Thomas Zimmermann:
>>> NACK for the entire lack of any form of commit description.
>>
>> Please see the documentation at [1] on how to describe the changes and
>> getting your patches merged.
> 
> Yes, I tried to use git send-email to send patches this time, and it resulted in this disaster. I'll stick to sending them through Thunderbird.

What's the problem with send-email?

A typical call for your patchset would look like

  git send-mail <upstream-branch>...HEAD --cover-letter --annotate
--to=... --cc=...

That allows you to write the cover letter and have it sent out. IIRC you
need ot set $EDITOR to your favorite text editor; and configure the SMTP
server in ~/.gitconfig, under [sendemail].

Best regards
Thomas

> 
>>
>> Best regards
>> Thomas
>>
>> [1]
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 516 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2020-10-09  7:14   ` Re: Thomas Zimmermann
@ 2020-10-09  7:38     ` Sandeep Raghuraman
  2020-10-09  7:51       ` Re: Thomas Zimmermann
  0 siblings, 1 reply; 65+ messages in thread
From: Sandeep Raghuraman @ 2020-10-09  7:38 UTC (permalink / raw)
  To: Thomas Zimmermann, alexander.deucher; +Cc: dri-devel



On 10/9/20 12:44 PM, Thomas Zimmermann wrote:
> Hi
> 
> Am 09.10.20 um 08:47 schrieb Thomas Zimmermann:
>> NACK for the entire lack of any form of commit description.
> 
> Please see the documentation at [1] on how to describe the changes and
> getting your patches merged.

Yes, I tried to use git send-email to send patches this time, and it resulted in this disaster. I'll stick to sending them through Thunderbird.

> 
> Best regards
> Thomas
> 
> [1]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Re:
  2020-10-09  6:47 ` Re: Thomas Zimmermann
@ 2020-10-09  7:14   ` Thomas Zimmermann
  2020-10-09  7:38     ` Re: Sandeep Raghuraman
  0 siblings, 1 reply; 65+ messages in thread
From: Thomas Zimmermann @ 2020-10-09  7:14 UTC (permalink / raw)
  To: sandy.8925, alexander.deucher; +Cc: dri-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 1038 bytes --]

Hi

Am 09.10.20 um 08:47 schrieb Thomas Zimmermann:
> NACK for the entire lack of any form of commit description.

Please see the documentation at [1] on how to describe the changes and
getting your patches merged.

Best regards
Thomas

[1]
https://dri.freedesktop.org/docs/drm/process/submitting-patches.html#describe-your-changes

> 
> Am 08.10.20 um 20:16 schrieb sandy.8925@gmail.com:
>> Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
>>
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
> 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 516 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
       [not found] <20201008181606.460499-1-sandy.8925@gmail.com>
@ 2020-10-09  6:47 ` Thomas Zimmermann
  2020-10-09  7:14   ` Re: Thomas Zimmermann
  0 siblings, 1 reply; 65+ messages in thread
From: Thomas Zimmermann @ 2020-10-09  6:47 UTC (permalink / raw)
  To: sandy.8925, alexander.deucher; +Cc: dri-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 559 bytes --]

NACK for the entire lack of any form of commit description.

Am 08.10.20 um 20:16 schrieb sandy.8925@gmail.com:
> Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
> 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 516 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2020-09-15  2:40 Dave Airlie
@ 2020-09-15  7:53 ` Christian König
  0 siblings, 0 replies; 65+ messages in thread
From: Christian König @ 2020-09-15  7:53 UTC (permalink / raw)
  To: dri-devel; +Cc: bskeggs

Reviewed-by: Christian König <christian.koenig@amd.com> for patches #1, 
#3 and #5-#7.

Minor comments on the other two.

Christian.

Am 15.09.20 um 04:40 schrieb Dave Airlie:
> The goal here is to make the ttm_tt object just represent a
> memory backing store, and now whether the store is bound to a
> global translation table. It moves binding up to the bo level.
>
> There's a lot more work on removing the global TT from the core
> of TTM, but this seems like a good start.
>
> Dave.
>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2020-02-26 14:56     ` Re: Linus Walleij
@ 2020-02-26 15:08       ` Ville Syrjälä
  0 siblings, 0 replies; 65+ messages in thread
From: Ville Syrjälä @ 2020-02-26 15:08 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, nouveau,
	Guido Günther, Paul Kocialkowski,
	open list:DRM PANEL DRIVERS, Gustaf Lindström,
	Andrzej Hajda, Thierry Reding, Laurent Pinchart, Sam Ravnborg,
	Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
	Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
	Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
	Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
	intel-gfx, Alexandre Courbot, open list:ARM/Amlogic Meson...,
	Vincent Abriou, Andreas Pretzsch, Jernej Skrabec, Alex Gonzalez,
	Purism Kernel Team, Boris Brezillon, Seung-Woo Kim,
	Christoph Fritz, Kyungmin Park, Heiko Stuebner, Eugen Hristev,
	Giulio Benetti

On Wed, Feb 26, 2020 at 03:56:36PM +0100, Linus Walleij wrote:
> On Wed, Feb 26, 2020 at 3:34 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> > > On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> > > <ville.syrjala@linux.intel.com> wrote:
> > > > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> > >
> > > > > I have long suspected that a whole bunch of the "simple" displays
> > > > > are not simple but contains a display controller and memory.
> > > > > That means that the speed over the link to the display and
> > > > > actual refresh rate on the actual display is asymmetric because
> > > > > well we are just updating a RAM, the resolution just limits how
> > > > > much data we are sending, the clock limits the speed on the
> > > > > bus over to the RAM on the other side.
> > > >
> > > > IMO even in command mode mode->clock should probably be the actual
> > > > dotclock used by the display. If there's another clock for the bus
> > > > speed/etc. it should be stored somewhere else.
> > >
> > > Good point. For the DSI panels we have the field hs_rate
> > > for the HS clock in struct mipi_dsi_device which is based
> > > on exactly this reasoning. And that is what I actually use for
> > > setting the HS clock.
> > >
> > > The problem is however that we in many cases have so
> > > substandard documentation of these panels that we have
> > > absolutely no idea about the dotclock. Maybe we should
> > > just set it to 0 in these cases?
> >
> > Don't you always have a TE interrupt or something like that
> > available? Could just measure it from that if no better
> > information is available?
> 
> Yes and I did exactly that, so that is why this comment is in
> the driver:
> 
> static const struct drm_display_mode sony_acx424akp_cmd_mode = {
> (...)
>         /*
>          * Some desired refresh rate, experiments at the maximum "pixel"
>          * clock speed (HS clock 420 MHz) yields around 117Hz.
>          */
>         .vrefresh = 60,
> 
> I got a review comment at the time saying 117 Hz was weird.
> We didn't reach a proper conclusion on this:
> https://lore.kernel.org/dri-devel/CACRpkdYW3YNPSNMY3A44GQn8DqK-n9TLvr7uipF7LM_DHZ5=Lg@mail.gmail.com/
> 
> Thierry wasn't sure if 60Hz was good or not, so I just had to
> go with something.
> 
> We could calculate the resulting pixel clock for ~117 Hz with
> this resolution and put that in the clock field but ... don't know
> what is the best?

I would vote for that approach.

-- 
Ville Syrjälä
Intel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2020-02-26 14:34   ` Re: Ville Syrjälä
@ 2020-02-26 14:56     ` Linus Walleij
  2020-02-26 15:08       ` Re: Ville Syrjälä
  0 siblings, 1 reply; 65+ messages in thread
From: Linus Walleij @ 2020-02-26 14:56 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, nouveau,
	Guido Günther, Paul Kocialkowski,
	open list:DRM PANEL DRIVERS, Gustaf Lindström,
	Andrzej Hajda, Thierry Reding, Laurent Pinchart, Sam Ravnborg,
	Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
	Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
	Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
	Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
	intel-gfx, Alexandre Courbot, open list:ARM/Amlogic Meson...,
	Vincent Abriou, Andreas Pretzsch, Jernej Skrabec, Alex Gonzalez,
	Purism Kernel Team, Boris Brezillon, Seung-Woo Kim,
	Christoph Fritz, Kyungmin Park, Heiko Stuebner, Eugen Hristev,
	Giulio Benetti

On Wed, Feb 26, 2020 at 3:34 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> > On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> > <ville.syrjala@linux.intel.com> wrote:
> > > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> >
> > > > I have long suspected that a whole bunch of the "simple" displays
> > > > are not simple but contains a display controller and memory.
> > > > That means that the speed over the link to the display and
> > > > actual refresh rate on the actual display is asymmetric because
> > > > well we are just updating a RAM, the resolution just limits how
> > > > much data we are sending, the clock limits the speed on the
> > > > bus over to the RAM on the other side.
> > >
> > > IMO even in command mode mode->clock should probably be the actual
> > > dotclock used by the display. If there's another clock for the bus
> > > speed/etc. it should be stored somewhere else.
> >
> > Good point. For the DSI panels we have the field hs_rate
> > for the HS clock in struct mipi_dsi_device which is based
> > on exactly this reasoning. And that is what I actually use for
> > setting the HS clock.
> >
> > The problem is however that we in many cases have so
> > substandard documentation of these panels that we have
> > absolutely no idea about the dotclock. Maybe we should
> > just set it to 0 in these cases?
>
> Don't you always have a TE interrupt or something like that
> available? Could just measure it from that if no better
> information is available?

Yes and I did exactly that, so that is why this comment is in
the driver:

static const struct drm_display_mode sony_acx424akp_cmd_mode = {
(...)
        /*
         * Some desired refresh rate, experiments at the maximum "pixel"
         * clock speed (HS clock 420 MHz) yields around 117Hz.
         */
        .vrefresh = 60,

I got a review comment at the time saying 117 Hz was weird.
We didn't reach a proper conclusion on this:
https://lore.kernel.org/dri-devel/CACRpkdYW3YNPSNMY3A44GQn8DqK-n9TLvr7uipF7LM_DHZ5=Lg@mail.gmail.com/

Thierry wasn't sure if 60Hz was good or not, so I just had to
go with something.

We could calculate the resulting pixel clock for ~117 Hz with
this resolution and put that in the clock field but ... don't know
what is the best?

Yours,
Linus Walleij
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2020-02-26 12:08 ` Re: Linus Walleij
@ 2020-02-26 14:34   ` Ville Syrjälä
  2020-02-26 14:56     ` Re: Linus Walleij
  0 siblings, 1 reply; 65+ messages in thread
From: Ville Syrjälä @ 2020-02-26 14:34 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, nouveau,
	Guido Günther, Paul Kocialkowski,
	open list:DRM PANEL DRIVERS, Gustaf Lindström,
	Andrzej Hajda, Thierry Reding, Laurent Pinchart, Sam Ravnborg,
	Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
	Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
	Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
	Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
	intel-gfx, Alexandre Courbot, open list:ARM/Amlogic Meson...,
	Vincent Abriou, Andreas Pretzsch, Jernej Skrabec, Alex Gonzalez,
	Purism Kernel Team, Boris Brezillon, Seung-Woo Kim,
	Christoph Fritz, Kyungmin Park, Heiko Stuebner, Eugen Hristev,
	Giulio Benetti

On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> 
> > > I have long suspected that a whole bunch of the "simple" displays
> > > are not simple but contains a display controller and memory.
> > > That means that the speed over the link to the display and
> > > actual refresh rate on the actual display is asymmetric because
> > > well we are just updating a RAM, the resolution just limits how
> > > much data we are sending, the clock limits the speed on the
> > > bus over to the RAM on the other side.
> >
> > IMO even in command mode mode->clock should probably be the actual
> > dotclock used by the display. If there's another clock for the bus
> > speed/etc. it should be stored somewhere else.
> 
> Good point. For the DSI panels we have the field hs_rate
> for the HS clock in struct mipi_dsi_device which is based
> on exactly this reasoning. And that is what I actually use for
> setting the HS clock.
> 
> The problem is however that we in many cases have so
> substandard documentation of these panels that we have
> absolutely no idea about the dotclock. Maybe we should
> just set it to 0 in these cases?

Don't you always have a TE interrupt or something like that
available? Could just measure it from that if no better
information is available?

-- 
Ville Syrjälä
Intel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
       [not found] <86d0ec$ae4ffc@fmsmga001.fm.intel.com>
@ 2020-02-26 12:08 ` Linus Walleij
  2020-02-26 14:34   ` Re: Ville Syrjälä
  0 siblings, 1 reply; 65+ messages in thread
From: Linus Walleij @ 2020-02-26 12:08 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, nouveau,
	Guido Günther, Paul Kocialkowski,
	open list:DRM PANEL DRIVERS, Gustaf Lindström,
	Andrzej Hajda, Thierry Reding, Laurent Pinchart, Sam Ravnborg,
	Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
	Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
	Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
	Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
	intel-gfx, Alexandre Courbot, open list:ARM/Amlogic Meson...,
	Vincent Abriou, Andreas Pretzsch, Jernej Skrabec, Alex Gonzalez,
	Purism Kernel Team, Boris Brezillon, Seung-Woo Kim,
	Christoph Fritz, Kyungmin Park, Heiko Stuebner, Eugen Hristev,
	Giulio Benetti

On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:

> > I have long suspected that a whole bunch of the "simple" displays
> > are not simple but contains a display controller and memory.
> > That means that the speed over the link to the display and
> > actual refresh rate on the actual display is asymmetric because
> > well we are just updating a RAM, the resolution just limits how
> > much data we are sending, the clock limits the speed on the
> > bus over to the RAM on the other side.
>
> IMO even in command mode mode->clock should probably be the actual
> dotclock used by the display. If there's another clock for the bus
> speed/etc. it should be stored somewhere else.

Good point. For the DSI panels we have the field hs_rate
for the HS clock in struct mipi_dsi_device which is based
on exactly this reasoning. And that is what I actually use for
setting the HS clock.

The problem is however that we in many cases have so
substandard documentation of these panels that we have
absolutely no idea about the dotclock. Maybe we should
just set it to 0 in these cases?

Yours,
Linus Walleij
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2018-10-23  1:47       ` Re: Michael Tirado
@ 2018-10-23  6:23         ` Dave Airlie
  0 siblings, 0 replies; 65+ messages in thread
From: Dave Airlie @ 2018-10-23  6:23 UTC (permalink / raw)
  To: mtirado418; +Cc: LKML, dri-devel

On Tue, 23 Oct 2018 at 16:13, Michael Tirado <mtirado418@gmail.com> wrote:
>
> That preprocessor define worked but I'm still confused about this
> DRM_FILE_PAGE_OFFSET thing.  Check out drivers/gpu/drm/drm_gem.c
> right above drm_gem_init.
>
> ---
>
> /*
>  * We make up offsets for buffer objects so we can recognize them at
>  * mmap time.
>  */
>
> /* pgoff in mmap is an unsigned long, so we need to make sure that
>  * the faked up offset will fit
>  */
>
> #if BITS_PER_LONG == 64
> #define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFFUL >> PAGE_SHIFT) + 1)
> #define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFFUL >> PAGE_SHIFT) * 16)
> #else
> #define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFUL >> PAGE_SHIFT) + 1)
> #define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFUL >> PAGE_SHIFT) * 16)
> #endif
>
>
> ---
>
> Why is having a 64-bit file offsets critical, causing -EINVAL on mmap?
> What problems might be associated with using (0x10000000UL >>
> PAGE_SHIFT) ?

a) it finds people not using the correct userspace defines. mostly
libdrm should handle this,
and possibly mesa.

b) there used to be legacy maps below that address on older drivers,
so we decided to never put stuff in the first 32-bit range that they
could clash with.

Dave.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2018-10-22  1:50     ` Re: Dave Airlie
  2018-10-21 22:20       ` Re: Michael Tirado
@ 2018-10-23  1:47       ` Michael Tirado
  2018-10-23  6:23         ` Re: Dave Airlie
  1 sibling, 1 reply; 65+ messages in thread
From: Michael Tirado @ 2018-10-23  1:47 UTC (permalink / raw)
  To: Dave Airlie, LKML, dri-devel

That preprocessor define worked but I'm still confused about this
DRM_FILE_PAGE_OFFSET thing.  Check out drivers/gpu/drm/drm_gem.c
right above drm_gem_init.

---

/*
 * We make up offsets for buffer objects so we can recognize them at
 * mmap time.
 */

/* pgoff in mmap is an unsigned long, so we need to make sure that
 * the faked up offset will fit
 */

#if BITS_PER_LONG == 64
#define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFFUL >> PAGE_SHIFT) + 1)
#define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFFUL >> PAGE_SHIFT) * 16)
#else
#define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFUL >> PAGE_SHIFT) + 1)
#define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFUL >> PAGE_SHIFT) * 16)
#endif


---

Why is having a 64-bit file offsets critical, causing -EINVAL on mmap?
What problems might be associated with using (0x10000000UL >>
PAGE_SHIFT) ?
On Mon, Oct 22, 2018 at 1:50 AM Dave Airlie <airlied@gmail.com> wrote:
>
> On Mon, 22 Oct 2018 at 10:49, Michael Tirado <mtirado418@gmail.com> wrote:
> >
> > On Mon, Oct 22, 2018 at 12:26 AM Dave Airlie <airlied@gmail.com> wrote:
> > >
> > > This shouldn't be necessary, did someone misbackport the mmap changes without:
> > >
> > > drm: set FMODE_UNSIGNED_OFFSET for drm files
> > >
> > > Dave.
> >
> > The latest kernel I have had to patch was a 4.18-rc6.  I'll try with a
> > newer 4.19 and let you know if it decides to work.  If not I'll
> > prepare a test case for demonstration on qemu-system-i386.
>
> If you have custom userspace software, make sure it's using
> AC_SYS_LARGEFILE or whatever the equivalant is in your build system.
>
> 64-bit file offsets are important.
>
> Dave.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2018-10-21 20:23   ` Re: Michael Tirado
@ 2018-10-22  1:50     ` Dave Airlie
  2018-10-21 22:20       ` Re: Michael Tirado
  2018-10-23  1:47       ` Re: Michael Tirado
  0 siblings, 2 replies; 65+ messages in thread
From: Dave Airlie @ 2018-10-22  1:50 UTC (permalink / raw)
  To: mtirado418
  Cc: Dave Airlie, LKML, dri-devel, Hongbo.He, Gerd Hoffmann, Deucher,
	Alexander, Sean Paul, Koenig, Christian

On Mon, 22 Oct 2018 at 10:49, Michael Tirado <mtirado418@gmail.com> wrote:
>
> On Mon, Oct 22, 2018 at 12:26 AM Dave Airlie <airlied@gmail.com> wrote:
> >
> > This shouldn't be necessary, did someone misbackport the mmap changes without:
> >
> > drm: set FMODE_UNSIGNED_OFFSET for drm files
> >
> > Dave.
>
> The latest kernel I have had to patch was a 4.18-rc6.  I'll try with a
> newer 4.19 and let you know if it decides to work.  If not I'll
> prepare a test case for demonstration on qemu-system-i386.

If you have custom userspace software, make sure it's using
AC_SYS_LARGEFILE or whatever the equivalant is in your build system.

64-bit file offsets are important.

Dave.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2018-10-21 16:25 (unknown), Michael Tirado
@ 2018-10-22  0:26 ` Dave Airlie
  2018-10-21 20:23   ` Re: Michael Tirado
  0 siblings, 1 reply; 65+ messages in thread
From: Dave Airlie @ 2018-10-22  0:26 UTC (permalink / raw)
  To: mtirado418
  Cc: Dave Airlie, LKML, dri-devel, Hongbo.He, Gerd Hoffmann, Deucher,
	Alexander, Sean Paul, Koenig, Christian

On Mon, 22 Oct 2018 at 07:22, Michael Tirado <mtirado418@gmail.com> wrote:
>
> Mapping a drm "dumb" buffer fails on 32-bit system (i686) from what
> appears to be a truncated memory address that has been copied
> throughout several files. The bug manifests as an -EINVAL when calling
> mmap with the offset gathered from DRM_IOCTL_MODE_MAP_DUMB <--
> DRM_IOCTL_MODE_ADDFB <-- DRM_IOCTL_MODE_CREATE_DUMB.  I can provide
> test code if needed.
>
> The following patch will apply to 4.18 though I've only been able to
> test through qemu bochs driver and nouveau. Intel driver worked
> without any issues.  I'm not sure if everyone is going to want to
> share a constant, and the whitespace is screwed up from gmail's awful
> javascript client, so let me know if I should resend this with any
> specific changes.  I have also attached the file with preserved
> whitespace.
>

This shouldn't be necessary, did someone misbackport the mmap changes without:

drm: set FMODE_UNSIGNED_OFFSET for drm files

Dave.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2018-10-22  1:50     ` Re: Dave Airlie
@ 2018-10-21 22:20       ` Michael Tirado
  2018-10-23  1:47       ` Re: Michael Tirado
  1 sibling, 0 replies; 65+ messages in thread
From: Michael Tirado @ 2018-10-21 22:20 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Airlied, dri-devel, LKML, kraxel, alexander.deucher,
	christian.koenig, David1.zhou, Hongbo.He, Sean Paul, Gustavo,
	maarten.lankhorst

[-- Attachment #1: Type: text/plain, Size: 837 bytes --]

On Mon, Oct 22, 2018 at 1:50 AM Dave Airlie <airlied@gmail.com> wrote:
>
> On Mon, 22 Oct 2018 at 10:49, Michael Tirado <mtirado418@gmail.com> wrote:
> >
> > On Mon, Oct 22, 2018 at 12:26 AM Dave Airlie <airlied@gmail.com> wrote:
> > >
> > > This shouldn't be necessary, did someone misbackport the mmap changes without:
> If you have custom userspace software, make sure it's using
> AC_SYS_LARGEFILE or whatever the equivalant is in your build system.
>
> 64-bit file offsets are important.
>

That fixed it! -D_FILE_OFFSET_BITS=64 is the pre-processor define
needed. It's a bit more than unintuitive but I'm glad I don't need
this stupid patch anymore, Thanks.

In case anyone is further interested I have attached test program
since I spent the last hour or so chopping it up anyway :S   [ gcc -o
kms -D_FILE_OFFSET_BITS=64 main.c ]

[-- Attachment #2: main.c --]
[-- Type: application/octet-stream, Size: 17153 bytes --]

/* Copyright (C) 2017 Michael R. Tirado <mtirado418@gmail.com> -- GPLv3+
 *
 * This program is libre software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details. You should have
 * received a copy of the GNU General Public License version 3
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */


#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <malloc.h>
#include <signal.h>
#include <stdlib.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/ioctl.h>
#include <drm/drm.h>
#include <drm/drm_mode.h>

#define STRERR strerror(errno)

#ifndef PTRBITCOUNT
	#define PTRBITCOUNT 32
#endif
/* kernel structs use __u64 for pointer types */
#if (PTRBITCOUNT == 32)
	#define ptr_from_krn(ptr) ((void *)(uint32_t)(ptr))
	#define ptr_to_krn(ptr)   ((uint32_t)(ptr))
#elif (PTRBITCOUNT == 64)
	#define ptr_from_krn(ptr) ((void *)(uint64_t)(ptr))
	#define ptr_to_krn(ptr)   ((uint64_t)(ptr))
#else
	#error "PTRBITCOUNT is undefined"
#endif
#ifndef MAX_FBS
	#define MAX_FBS   12
#endif
#ifndef MAX_CRTCS
	#define MAX_CRTCS 12
#endif
#ifndef MAX_CONNECTORS
	#define MAX_CONNECTORS 12
#endif
#ifndef MAX_ENCODERS
	#define MAX_ENCODERS 12
#endif
#ifndef MAX_PROPS
	#define MAX_PROPS 256
#endif
#ifndef MAX_MODES
	#define MAX_MODES 256
#endif
#if (PTRBITCOUNT == 32)
	#define drm_to_ptr(ptr)   ((void *)(uint32_t)(ptr))
	#define drm_from_ptr(ptr) ((uint32_t)(ptr))
#elif (PTRBITCOUNT == 64)
	#define drm_to_ptr(ptr)   ((void *)(uint64_t)(ptr))
	#define drm_from_ptr(ptr) ((uint64_t)(ptr))
#else
	#error "PTRBITCOUNT is undefined"
#endif
#define drm_alloc(size) (drm_from_ptr(calloc(1,size)))

struct drm_buffer
{
	uint32_t drm_id;
	uint32_t fb_id;
	uint32_t pitch;
	uint32_t width;
	uint32_t height;
	uint32_t depth;
	uint32_t bpp;
	char    *addr;
	size_t   size;
};
struct drm_display
{
	struct drm_mode_get_encoder encoder;
	struct drm_mode_crtc crtc;
	struct drm_mode_get_connector *conn; /* do we need array for multi-screen? */
	struct drm_mode_modeinfo *modes; /* these both point to conn's mode array */
	struct drm_mode_modeinfo *cur_mode;
	uint32_t cur_mode_idx;
	uint32_t mode_count;
	uint32_t conn_id;
};
struct drm_kms
{
	struct drm_display display;
	struct drm_buffer *sfb;
	struct drm_mode_card_res *res;
	int card_fd;
};

/* get id out of drm_id_ptr */
static uint32_t drm_get_id(uint64_t addr, uint32_t idx)
{
	return ((uint32_t *)drm_to_ptr(addr))[idx];
}

static int free_mode_card_res(struct drm_mode_card_res *res)
{
	if (!res)
		return -1;
	if (res->fb_id_ptr)
		free(drm_to_ptr(res->fb_id_ptr));
	if (res->crtc_id_ptr)
		free(drm_to_ptr(res->crtc_id_ptr));
	if (res->encoder_id_ptr)
		free(drm_to_ptr(res->encoder_id_ptr));
	if (res->connector_id_ptr)
		free(drm_to_ptr(res->connector_id_ptr));
	free(res);
	return 0;
}

static struct drm_mode_card_res *alloc_mode_card_res(int fd)
{
	struct drm_mode_card_res res;
	struct drm_mode_card_res *ret;
	uint32_t count_fbs, count_crtcs, count_connectors, count_encoders;

	memset(&res, 0, sizeof(struct drm_mode_card_res));
	if (ioctl(fd, DRM_IOCTL_MODE_GETRESOURCES, &res)) {
		printf("ioctl(DRM_IOCTL_MODE_GETRESOURCES, &res): %s\n", STRERR);
		return NULL;
	}
	if (res.count_fbs > MAX_FBS
			|| res.count_crtcs > MAX_CRTCS
			|| res.count_encoders > MAX_ENCODERS
			|| res.count_connectors > MAX_CONNECTORS) {
		printf("resource limit reached, see defines.h\n");
		return NULL;
	}
	if (res.count_fbs) {
		res.fb_id_ptr = drm_alloc(sizeof(uint32_t)*res.count_fbs);
		if (!res.fb_id_ptr)
			goto alloc_err;
	}
	if (res.count_crtcs) {
		res.crtc_id_ptr = drm_alloc(sizeof(uint32_t)*res.count_crtcs);
		if (!res.crtc_id_ptr)
			goto alloc_err;
	}
	if (res.count_encoders) {
		res.encoder_id_ptr = drm_alloc(sizeof(uint32_t)*res.count_encoders);
		if (!res.encoder_id_ptr)
			goto alloc_err;
	}
	if (res.count_connectors) {
		res.connector_id_ptr = drm_alloc(sizeof(uint32_t)*res.count_connectors);
		if (!res.connector_id_ptr)
			goto alloc_err;
	}
	count_fbs = res.count_fbs;
	count_crtcs = res.count_crtcs;
	count_encoders = res.count_encoders;
	count_connectors = res.count_connectors;

	if (ioctl(fd, DRM_IOCTL_MODE_GETRESOURCES, &res) == -1) {
		printf("ioctl(DRM_IOCTL_MODE_GETRESOURCES, &res): %s\n", STRERR);
		goto free_err;
	}

	if (count_fbs != res.count_fbs
			|| count_crtcs != res.count_crtcs
			|| count_encoders != res.count_encoders
			|| count_connectors != res.count_connectors) {
		errno = EAGAIN;
		goto free_err;
	}

	ret = calloc(1, sizeof(struct drm_mode_card_res));
	if (!ret)
		goto alloc_err;

	memcpy(ret, &res, sizeof(struct drm_mode_card_res));
	return ret;

alloc_err:
	errno = ENOMEM;
free_err:
	free(drm_to_ptr(res.fb_id_ptr));
	free(drm_to_ptr(res.crtc_id_ptr));
	free(drm_to_ptr(res.connector_id_ptr));
	free(drm_to_ptr(res.encoder_id_ptr));
	return NULL;
}


static struct drm_mode_get_connector *alloc_connector(int fd, uint32_t conn_id)
{
	struct drm_mode_get_connector conn;
	struct drm_mode_get_connector *ret;
	uint32_t count_modes, count_props, count_encoders;

	memset(&conn, 0, sizeof(struct drm_mode_get_connector));
	conn.connector_id = conn_id;

	if (ioctl(fd, DRM_IOCTL_MODE_GETCONNECTOR, &conn) == -1) {
		printf("ioctl(DRM_IOCTL_MODE_GETCONNECTOR, &conn): %s\n", STRERR);
		return NULL;
	}
	if (conn.count_modes > MAX_MODES
			|| conn.count_props > MAX_PROPS
			|| conn.count_encoders > MAX_ENCODERS) {
		printf("resource limit reached, see defines.h\n");
		return NULL;
	}
	if (conn.count_modes) {
		conn.modes_ptr = drm_alloc(sizeof(struct drm_mode_modeinfo)
					   * conn.count_modes);
		if (!conn.modes_ptr)
			goto alloc_err;
	}
	if (conn.count_props) {
		conn.props_ptr = drm_alloc(sizeof(uint32_t)*conn.count_props);
		if (!conn.props_ptr)
			goto alloc_err;
		conn.prop_values_ptr = drm_alloc(sizeof(uint64_t)*conn.count_props);
		if (!conn.prop_values_ptr)
			goto alloc_err;
	}
	if (conn.count_encoders) {
		conn.encoders_ptr = drm_alloc(sizeof(uint32_t)*conn.count_encoders);
		if (!conn.encoders_ptr)
			goto alloc_err;
	}
	count_modes = conn.count_modes;
	count_props = conn.count_props;
	count_encoders = conn.count_encoders;

	if (ioctl(fd, DRM_IOCTL_MODE_GETCONNECTOR, &conn) == -1) {
		printf("ioctl(DRM_IOCTL_MODE_GETCONNECTOR, &conn): %s\n", STRERR);
		goto free_err;
	}

	if (count_modes != conn.count_modes
			|| count_props != conn.count_props
			|| count_encoders != conn.count_encoders) {
		errno = EAGAIN;
		goto free_err;
	}

	ret = calloc(1, sizeof(struct drm_mode_get_connector));
	if (!ret)
		goto alloc_err;

	memcpy(ret, &conn, sizeof(struct drm_mode_get_connector));
	return ret;

alloc_err:
	errno = ENOMEM;
free_err:
	free(drm_to_ptr(conn.modes_ptr));
	free(drm_to_ptr(conn.props_ptr));
	free(drm_to_ptr(conn.encoders_ptr));
	free(drm_to_ptr(conn.prop_values_ptr));
	return NULL;
}

static struct drm_mode_modeinfo *get_connector_modeinfo(struct drm_mode_get_connector *conn,  uint32_t *count)
{
	if (!conn || !count)
		return NULL;
	*count = conn->count_modes;
	return drm_to_ptr(conn->modes_ptr);

}

static int free_connector(struct drm_mode_get_connector *conn)
{
	if (!conn)
		return -1;
	if (conn->modes_ptr)
		free(drm_to_ptr(conn->modes_ptr));
	if (conn->props_ptr)
		free(drm_to_ptr(conn->props_ptr));
	if (conn->encoders_ptr)
		free(drm_to_ptr(conn->encoders_ptr));
	if (conn->prop_values_ptr)
		free(drm_to_ptr(conn->prop_values_ptr));
	free(conn);
	return 0;
}


static int drm_kms_connect_sfb(struct drm_kms *self)
{
	struct drm_display *display = &self->display;
	struct drm_mode_get_connector *conn;
	struct drm_mode_modeinfo *cur_mode;
	struct drm_mode_get_encoder *encoder;
	struct drm_mode_crtc *crtc;
	if (!display || !display->conn || !display->cur_mode || !self->sfb)
		return -1;

	conn = display->conn;
	cur_mode = display->cur_mode;
	encoder = &self->display.encoder;
	crtc = &self->display.crtc;
	memset(crtc, 0, sizeof(struct drm_mode_crtc));
	memset(encoder,  0, sizeof(struct drm_mode_get_encoder));


	/* XXX: there can be multiple encoders, have not investigated this much */
	if (conn->encoder_id == 0) {
		printf("conn->encoder_id was 0, defaulting to encoder[0]\n");
		conn->encoder_id = ((uint32_t *)drm_to_ptr(conn->encoders_ptr))[0];
	}
	encoder->encoder_id = conn->encoder_id;

	if (ioctl(self->card_fd, DRM_IOCTL_MODE_GETENCODER, encoder) == -1) {
		printf("ioctl(DRM_IOCTL_MODE_GETENCODER): %s\n", STRERR);
		return -1;
	}

	if (encoder->crtc_id == 0) {
		printf("encoder->crtc_id was 0, defaulting to crtc[0]\n");
		encoder->crtc_id = ((uint32_t *)drm_to_ptr(self->res->crtc_id_ptr))[0];
	}
	crtc->crtc_id = encoder->crtc_id;

	if (ioctl(self->card_fd, DRM_IOCTL_MODE_GETCRTC, crtc) == -1) {
		printf("ioctl(DRM_IOCTL_MODE_GETCRTC): %s\n", STRERR);
		return -1;
	}

	/* set crtc mode */
	crtc->fb_id = self->sfb->fb_id;
	crtc->set_connectors_ptr = drm_from_ptr((void *)&conn->connector_id);
	crtc->count_connectors = 1;
	crtc->mode = *cur_mode;
	/*printf("\nsetting mode:\n\n");
	print_mode_modeinfo(cur_mode);*/
	crtc->mode_valid = 1;
	if (ioctl(self->card_fd, DRM_IOCTL_MODE_SETCRTC, crtc) == -1) {
		printf("ioctl(DRM_IOCTL_MODE_SETCRTC): %s\n", STRERR);
		return -1;
	}
	return 0;
}

/* stupid frame buffer */
static struct drm_buffer *alloc_sfb(int card_fd,
			     uint32_t width,
			     uint32_t height,
			     uint32_t depth,
			     uint32_t bpp)
{
	struct drm_mode_create_dumb cdumb;
	struct drm_mode_map_dumb    moff;
	struct drm_mode_fb_cmd      cmd;
	struct drm_buffer *ret;
	void  *fbmap;

	memset(&cdumb, 0, sizeof(cdumb));
	memset(&moff,  0, sizeof(moff));
	memset(&cmd,   0, sizeof(cmd));

	/* create dumb buffer */
	cdumb.width  = width;
	cdumb.height = height;
	cdumb.bpp    = bpp;
	cdumb.flags  = 0;
	cdumb.pitch  = 0;
	cdumb.size   = 0;
	cdumb.handle = 0;
	if (ioctl(card_fd, DRM_IOCTL_MODE_CREATE_DUMB, &cdumb) == -1) {
		printf("ioctl(DRM_IOCTL_MODE_CREATE_DUMB): %s\n", STRERR);
		return NULL;
	}
	/* add framebuffer object */
	cmd.width  = cdumb.width;
	cmd.height = cdumb.height;
	cmd.bpp    = cdumb.bpp;
	cmd.pitch  = cdumb.pitch;
	cmd.depth  = depth;
	cmd.handle = cdumb.handle;
	if (ioctl(card_fd, DRM_IOCTL_MODE_ADDFB, &cmd) == -1) {
		printf("ioctl(DRM_IOCTL_MODE_ADDFB): %s\n", STRERR);
		ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &cdumb.handle);
		return NULL;
	}
	/* get mmap offset */
	moff.handle = cdumb.handle;
	if (ioctl(card_fd, DRM_IOCTL_MODE_MAP_DUMB, &moff) == -1) {
		printf("ioctl(DRM_IOCTL_MODE_MAP_DUMB): %s\n", STRERR);
		ioctl(card_fd, DRM_IOCTL_MODE_RMFB, &cmd.fb_id);
		ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &cdumb.handle);
		return NULL;
	}
	/* XXX this is probably better off as MAP_PRIVATE, we can't prime
	 * the main framebuffer if it's "dumb", AFAIK */
	fbmap = mmap(0, (size_t)cdumb.size, PROT_READ|PROT_WRITE,
			MAP_SHARED, card_fd, (off_t)moff.offset);
	if (fbmap == MAP_FAILED) {
		printf("framebuffer mmap failed: %s\n", STRERR);
		ioctl(card_fd, DRM_IOCTL_MODE_RMFB, &cmd.fb_id);
		ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &cdumb.handle);
		return NULL;
	}

	ret = calloc(1, sizeof(struct drm_buffer));
	if (!ret) {
		printf("-ENOMEM\n");
		munmap(fbmap, cdumb.size);
		ioctl(card_fd, DRM_IOCTL_MODE_RMFB, &cmd.fb_id);
		ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &cdumb.handle);
		return NULL;
	}
	ret->addr     = fbmap;
	ret->size     = cdumb.size;
	ret->pitch    = cdumb.pitch;
	ret->width    = cdumb.width;
	ret->height   = cdumb.height;
	ret->bpp      = cdumb.bpp;
	ret->depth    = cmd.depth;
	ret->fb_id    = cmd.fb_id;
	ret->drm_id   = cdumb.handle;
	memset(fbmap, 0x27, cdumb.size);
	return ret;
}

static int destroy_sfb(int card_fd, struct drm_buffer *sfb)
{
	if (!sfb)
		return -1;

	if (munmap(sfb->addr, sfb->size) == -1)
		printf("munmap: %s\n", STRERR);
	if (ioctl(card_fd, DRM_IOCTL_MODE_RMFB, &sfb->fb_id))
		printf("ioctl(DRM_IOCTL_MODE_RMFB): %s\n", STRERR);
	if (ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &sfb->drm_id))
		printf("ioctl(DRM_IOCTL_MODE_DESTROY_DUMB): %s\n", STRERR);
	free(sfb);
	return 0;
}
static int card_set_master(int card_fd)
{
	if (ioctl(card_fd, DRM_IOCTL_SET_MASTER, 0)) {
		printf("ioctl(DRM_IOCTL_SET_MASTER, 0): %s\n", STRERR);
		return -1;
	}
	return 0;
}
static int card_drop_master(int card_fd)
{
	if (ioctl(card_fd, DRM_IOCTL_DROP_MASTER, 0)) {
		printf("ioctl(DRM_IOCTL_DROP_MASTER, 0): %s\n", STRERR);
		return -1;
	}
	return 0;
}
static int drm_display_destroy(struct drm_display *display)
{
	if (display->conn)
		free_connector(display->conn);
	memset(display, 0, sizeof(struct drm_display));
	return 0;
}
int drm_kms_destroy(struct drm_kms *self)
{
	if (self->sfb)
		destroy_sfb(self->card_fd, self->sfb);
	if (self->res)
		free_mode_card_res(self->res);
	drm_display_destroy(&self->display);

	close(self->card_fd);
	memset(self, 0, sizeof(struct drm_kms));
	free(self);
	return 0;
}
static int get_mode_idx(struct drm_mode_modeinfo *modes,
			uint16_t count,
			uint16_t width,
			uint16_t height,
			uint16_t refresh)
{
	int i;
	int pick = -1;
	if (width == 0)
		width = 0xffff;
	if (height == 0)
		height = 0xffff;
	for (i = 0; i < count; ++i)
	{
		if (modes[i].hdisplay > width || modes[i].vdisplay > height)
			continue;
		/* pretend these radical modes don't exist for now */
		if (modes[i].hdisplay % 16 == 0) {
			if (pick < 0) {
				pick = i;
				continue;
			}
			if (modes[i].hdisplay > modes[pick].hdisplay)
				pick = i;
			else if (modes[i].vdisplay > modes[pick].vdisplay)
				pick = i;
			else if (modes[i].hdisplay == modes[pick].hdisplay
					&& modes[i].vdisplay == modes[pick].vdisplay) {
				if (abs(refresh - modes[i].vrefresh)
					  < abs(refresh - modes[pick].vrefresh)) {
					pick = i;
				}
			}
		}
	}
	if (pick < 0) {
		printf("could not find any usable modes for (%dx%d@%dhz)\n",
				width, height, refresh);
		return -1;
	}
	return pick;
}
/* TODO handle hotplugging */
static int drm_display_load(struct drm_kms *self,
		     uint16_t req_width,
		     uint16_t req_height,
		     uint16_t req_refresh,
		     struct drm_display *out)
{
	uint32_t conn_id;
	int idx = -1;

	/* FIXME uses primary connector? "0" */
	conn_id = drm_get_id(self->res->connector_id_ptr, 0);
	out->conn = alloc_connector(self->card_fd, conn_id);
	if (!out->conn) {
		printf("unable to create drm connector structure\n");
		return -1;
	}

	out->conn_id = conn_id;
	out->modes = get_connector_modeinfo(out->conn, &out->mode_count);
	idx = get_mode_idx(out->modes, out->mode_count,
			   req_width, req_height, req_refresh);
	if (idx < 0)
		goto free_err;

	out->cur_mode_idx = (uint32_t)idx;
	out->cur_mode = &out->modes[out->cur_mode_idx];
	return 0;
free_err:
	drm_display_destroy(out);
	return -1;
}
struct drm_kms *drm_mode_create(char *devname,
				int no_connect,
				uint16_t req_width,
				uint16_t req_height,
				uint16_t req_refresh)
{
	char devpath[128];
	struct drm_kms *self;
	struct drm_mode_modeinfo *cur_mode;
	int card_fd;

	snprintf(devpath, sizeof(devpath), "/dev/dri/%s", devname);
	card_fd = open(devpath, O_RDWR|O_CLOEXEC);
	if (card_fd == -1) {
		printf("open(%s): %s\n", devpath, STRERR);
		return NULL;
	}
	if (card_set_master(card_fd)) {
		printf("card_set_master failed\n");
		return NULL;
	}

	self = calloc(1, sizeof(struct drm_kms));
	if (!self)
		return NULL;

	self->card_fd = card_fd;
	self->res = alloc_mode_card_res(card_fd);
	if (!self->res) {
		printf("unable to create drm structure\n");
		goto free_err;
	}

	if (drm_display_load(self, req_width, req_height, req_refresh, &self->display)) {
		printf("drm_display_load failed\n");
		goto free_err;
	}
	cur_mode = self->display.cur_mode;
	printf("connector(%d) using mode[%d] (%dx%d@%dhz)\n",
				self->display.conn_id,
				self->display.cur_mode_idx,
				cur_mode->hdisplay,
				cur_mode->vdisplay,
				cur_mode->vrefresh);

	/* buffer pitch must divide evenly by 16,
	 * TODO check against bpp here when that is variable instead of 32 */
	self->sfb = alloc_sfb(card_fd, cur_mode->hdisplay, cur_mode->vdisplay, 24, 32);
	if (!self->sfb) {
		printf("alloc_sfb failed\n");
		goto free_err;
	}

	if (!no_connect && drm_kms_connect_sfb(self)) {
		printf("drm_kms_connect_sfb failed\n");
		goto free_err;
	}
	return self;

free_err:
	drm_kms_destroy(self);
	return NULL;
}


int main(int argc, char *argv[])
{
	int ret = -1;
	struct drm_kms *card0;
	/*card0 = drm_mode_create("card0", g_srv_opts.inactive_vt,
					   g_srv_opts.request_width,
					   g_srv_opts.request_height,
					   g_srv_opts.request_refresh);*/
	/* do not connect to vt */
	card0 = drm_mode_create("card0", 1, 640, 480, 60);
	if (card0 == NULL) {
		printf("drm_mode_create failed\n");
		return -1;
	}


	drm_kms_destroy(card0);

	printf("looks ok, returning 0\n");
	return 0;
}

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2018-10-22  0:26 ` Dave Airlie
@ 2018-10-21 20:23   ` Michael Tirado
  2018-10-22  1:50     ` Re: Dave Airlie
  0 siblings, 1 reply; 65+ messages in thread
From: Michael Tirado @ 2018-10-21 20:23 UTC (permalink / raw)
  To: airlied
  Cc: Airlied, dri-devel, LKML, kraxel, alexander.deucher,
	christian.koenig, David1.zhou, Hongbo.He, Sean Paul, Gustavo,
	maarten.lankhorst

On Mon, Oct 22, 2018 at 12:26 AM Dave Airlie <airlied@gmail.com> wrote:
>
> This shouldn't be necessary, did someone misbackport the mmap changes without:
>
> drm: set FMODE_UNSIGNED_OFFSET for drm files
>
> Dave.

The latest kernel I have had to patch was a 4.18-rc6.  I'll try with a
newer 4.19 and let you know if it decides to work.  If not I'll
prepare a test case for demonstration on qemu-system-i386.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2018-03-05 17:06 (unknown) Meghana Madhyastha
@ 2018-03-05 19:24 ` Noralf Trønnes
  0 siblings, 0 replies; 65+ messages in thread
From: Noralf Trønnes @ 2018-03-05 19:24 UTC (permalink / raw)
  To: Meghana Madhyastha, Daniel Vetter, dri-devel


Den 05.03.2018 18.06, skrev Meghana Madhyastha:
> linux-spi@vger.kernel.org,Noralf Trønnes <noralf@tronnes.org>,Sean Paul <seanpaul@chromium.org>,kernel@martin.sperl.org
> Cc:
> Bcc:
> Subject: Re: [PATCH v2 0/2] Chunk splitting of spi transfers
> Reply-To:
> In-Reply-To: <f6dbf3ca-4c1b-90cc-c4af-8889f7407180@tronnes.org>
>
> On Sun, Mar 04, 2018 at 06:38:42PM +0100, Noralf Trønnes wrote:
>> Den 02.03.2018 12.11, skrev Meghana Madhyastha:
>>> On Sun, Feb 25, 2018 at 02:19:10PM +0100, Lukas Wunner wrote:
>>>> [cc += linux-rpi-kernel@lists.infradead.org]
>>>>
>>>> On Sat, Feb 24, 2018 at 06:15:59PM +0000, Meghana Madhyastha wrote:
>>>>> I've added bcm2835_spi_transfer_one_message in spi-bcm2835. This calls
>>>>> spi_split_transfers_maxsize to split large chunks for spi dma transfers.
>>>>> I then removed chunk splitting in the tinydrm spi helper (as now the core
>>>>> is handling the chunk splitting). However, although the SPI HW should be
>>>>> able to accomodate up to 65535 bytes for dma transfers, the splitting of
>>>>> chunks to 65535 bytes results in a dma transfer time out error. However,
>>>>> when the chunks are split to < 64 bytes it seems to work fine.
>>>> Hm, that is really odd, how did you test this exactly, what did you
>>>> use as SPI slave?  It contradicts our own experience, we're using
>>>> Micrel KSZ8851 Ethernet chips as SPI slave on spi0 of a BCM2837
>>>> and can send/receive messages via DMA to the tune of several hundred
>>>> bytes without any issues.  In fact, for messages < 96 bytes, DMA is
>>>> not used at all, so you've probably been using interrupt mode,
>>>> see the BCM2835_SPI_DMA_MIN_LENGTH macro in spi-bcm2835.c.
>>> Hi Lukas,
>>>
>>> I think you are right. I checked it and its not using the DMA mode which
>>> is why its working with 64 bytes.
>>> Noralf, that leaves us back to the
>>> initial time out problem. I've tried doing the message splitting in
>>> spi_sync as well as spi_pump_messages. Martin had explained that DMA
>>> will wait for
>>> the SPI HW to set the send_more_data line, but the SPI-HW itself will
>>> stop triggering it when SPI_LEN is 0 causing DMA to wait forever. I
>>> thought if we split it before itself, the SPI_LEN will not go to zero
>>> thus preventing this problem, however it didn't work and started
>>> hanging. So I'm a little uncertain as to how to proceed and debug what
>>> exactly has caused the time out due to the asynchronous methods.
>> I did a quick test and at least this is working:
>>
>> int tinydrm_spi_transfer(struct spi_device *spi, u32 speed_hz,
>>               struct spi_transfer *header, u8 bpw, const void *buf,
>>               size_t len)
>> {
>>      struct spi_transfer tr = {
>>          .bits_per_word = bpw,
>>          .speed_hz = speed_hz,
>>          .tx_buf = buf,
>>          .len = len,
>>      };
>>      struct spi_message m;
>>      size_t maxsize;
>>      int ret;
>>
>>      maxsize = tinydrm_spi_max_transfer_size(spi, 0);
>>
>>      if (drm_debug & DRM_UT_DRIVER)
>>          pr_debug("[drm:%s] bpw=%u, maxsize=%zu, transfers:\n",
>>               __func__, bpw, maxsize);
>>
>>      spi_message_init(&m);
>>      m.spi = spi;
>>      if (header)
>>          spi_message_add_tail(header, &m);
>>      spi_message_add_tail(&tr, &m);
>>
>>      ret = spi_split_transfers_maxsize(spi->controller, &m, maxsize,
>> GFP_KERNEL);
>>      if (ret)
>>          return ret;
>>
>>      tinydrm_dbg_spi_message(spi, &m);
>>
>>      return spi_sync(spi, &m);
>> }
>> EXPORT_SYMBOL(tinydrm_spi_transfer);
>>
>>
>> Log:
>> [   39.015644] [drm:mipi_dbi_fb_dirty [mipi_dbi]] Flushing [FB:36] x1=0,
>> x2=320, y1=0, y2=240
>>
>> [   39.018079] [drm:mipi_dbi_typec3_command [mipi_dbi]] cmd=2a, par=00 00 01
>> 3f
>> [   39.018129] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [   39.018152]     tr(0): speed=10MHz, bpw=8, len=1, tx_buf=[2a]
>> [   39.018231] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [   39.018248]     tr(0): speed=10MHz, bpw=8, len=4, tx_buf=[00 00 01 3f]
>>
>> [   39.018330] [drm:mipi_dbi_typec3_command [mipi_dbi]] cmd=2b, par=00 00 00
>> ef
>> [   39.018347] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [   39.018362]     tr(0): speed=10MHz, bpw=8, len=1, tx_buf=[2b]
>> [   39.018396] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [   39.018428]     tr(0): speed=10MHz, bpw=8, len=4, tx_buf=[00 00 00 ef]
>>
>> [   39.018487] [drm:mipi_dbi_typec3_command [mipi_dbi]] cmd=2c, len=153600
>> [   39.018502] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [   39.018517]     tr(0): speed=10MHz, bpw=8, len=1, tx_buf=[2c]
>> [   39.018565] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [   39.018594]     tr(0): speed=48MHz, bpw=8, len=65532, tx_buf=[c6 18 c6 18
>> c6 18 c6 18 c6 18 c6 18 c6 18 c6 18 ...]
>> [   39.018608]     tr(1): speed=48MHz, bpw=8, len=65532, tx_buf=[06 18 06 18
>> 06 18 06 18 06 18 06 18 06 18 06 18 ...]
>> [   39.018621]     tr(2): speed=48MHz, bpw=8, len=22536, tx_buf=[10 82 10 82
>> 10 82 10 82 10 82 10 82 18 e3 18 e3 ...]
> Hi Noralf,
>
> Yes this works but splitting in the spi subsystem doesn't seem to work.
> So this means that spi_split_transfers_maxsize is working.
> Should I just send in a patch with splitting done here in tinydrm? (I
> had thought we wanted to avoid splitting in the tinydrm helper).

Oh, I assumed you didn't get this to work in any way.
Yes, I prefer splitting without the client's knowledge.

Looking at the code the splitting has to happen before spi_map_msg() is
called. Have you tried to do it in the prepare_message callback?

static void __spi_pump_messages(struct spi_controller *ctlr, bool 
in_kthread)
{
<...>
     if (ctlr->prepare_message) {
         ret = ctlr->prepare_message(ctlr, ctlr->cur_msg);
<...>
     ret = spi_map_msg(ctlr, ctlr->cur_msg);
<...>
     ret = ctlr->transfer_one_message(ctlr, ctlr->cur_msg);
<...>
}

There was something wrong with this email, it was missing subject and
several recipients.

Noralf.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2014-09-08  3:47       ` Re: Alex Deucher
@ 2014-09-08  7:13         ` Markus Trippelsdorf
  0 siblings, 0 replies; 65+ messages in thread
From: Markus Trippelsdorf @ 2014-09-08  7:13 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Maling list - DRI developers

On 2014.09.07 at 23:47 -0400, Alex Deucher wrote:
> On Sun, Sep 7, 2014 at 9:24 AM, Markus Trippelsdorf
> <markus@trippelsdorf.de> wrote:
> > On 2014.08.25 at 11:10 +0200, Christian König wrote:
> >> Let me know if it works for you, cause we don't really have any hardware
> >> any more to test it.
> >
> > I've tested your patch series today (using drm-next-3.18 from
> > ~agd5f/linux) on a RS780D/Radeon HD 3300 system with a couple of H264
> > videos. While it sometimes works as expected, it stalled the GPU far too
> > often to be usable. The stalls are not recoverable and the machine ends
> > up with a black sreen, but still accepts SysRq keyboard inputs.
> 
> 
> Does it work any better if dpm is disabled?

Unfortunately no. The symptoms are unchanged.

-- 
Markus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2014-09-07 13:24     ` Re: Markus Trippelsdorf
@ 2014-09-08  3:47       ` Alex Deucher
  2014-09-08  7:13         ` Re: Markus Trippelsdorf
  0 siblings, 1 reply; 65+ messages in thread
From: Alex Deucher @ 2014-09-08  3:47 UTC (permalink / raw)
  To: Markus Trippelsdorf; +Cc: Maling list - DRI developers

On Sun, Sep 7, 2014 at 9:24 AM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
> On 2014.08.25 at 11:10 +0200, Christian König wrote:
>> Let me know if it works for you, cause we don't really have any hardware
>> any more to test it.
>
> I've tested your patch series today (using drm-next-3.18 from
> ~agd5f/linux) on a RS780D/Radeon HD 3300 system with a couple of H264
> videos. While it sometimes works as expected, it stalled the GPU far too
> often to be usable. The stalls are not recoverable and the machine ends
> up with a black sreen, but still accepts SysRq keyboard inputs.


Does it work any better if dpm is disabled?

Alex

>
> Here are some logs:
>
> vdpauinfo:
> display: :0   screen: 0
> API version: 1
> Information string: G3DVL VDPAU Driver Shared Library version 1.0
>
> Video surface:
>
> name   width height types
> -------------------------------------------
> 420     8192  8192  NV12 YV12
> 422     8192  8192  UYVY YUYV
> 444     8192  8192  Y8U8V8A8 V8U8Y8A8
>
> Decoder capabilities:
>
> name               level macbs width height
> -------------------------------------------
> MPEG1                 0  9216  2048  1152
> MPEG2_SIMPLE          3  9216  2048  1152
> MPEG2_MAIN            3  9216  2048  1152
> H264_BASELINE        41  9216  2048  1152
> H264_MAIN            41  9216  2048  1152
> H264_HIGH            41  9216  2048  1152
> VC1_ADVANCED          4  9216  2048  1152
>
> Output surface:
>
> name              width height nat types
> ----------------------------------------------------
> B8G8R8A8          8192  8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
> R8G8B8A8          8192  8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
> R10G10B10A2       8192  8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
> B10G10R10A2       8192  8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
>
> Bitmap surface:
>
> name              width height
> ------------------------------
> B8G8R8A8          8192  8192
> R8G8B8A8          8192  8192
> R10G10B10A2       8192  8192
> B10G10R10A2       8192  8192
> A8                8192  8192
>
> Video mixer:
>
> feature name                    sup
> ------------------------------------
> DEINTERLACE_TEMPORAL             y
> DEINTERLACE_TEMPORAL_SPATIAL     -
> INVERSE_TELECINE                 -
> NOISE_REDUCTION                  y
> SHARPNESS                        y
> LUMA_KEY                         -
> HIGH QUALITY SCALING - L1        -
> HIGH QUALITY SCALING - L2        -
> HIGH QUALITY SCALING - L3        -
> HIGH QUALITY SCALING - L4        -
> HIGH QUALITY SCALING - L5        -
> HIGH QUALITY SCALING - L6        -
> HIGH QUALITY SCALING - L7        -
> HIGH QUALITY SCALING - L8        -
> HIGH QUALITY SCALING - L9        -
>
> parameter name                  sup      min      max
> -----------------------------------------------------
> VIDEO_SURFACE_WIDTH              y        48     2048
> VIDEO_SURFACE_HEIGHT             y        48     1152
> CHROMA_TYPE                      y
> LAYERS                           y         0        4
>
> attribute name                  sup      min      max
> -----------------------------------------------------
> BACKGROUND_COLOR                 y
> CSC_MATRIX                       y
> NOISE_REDUCTION_LEVEL            y      0.00     1.00
> SHARPNESS_LEVEL                  y     -1.00     1.00
> LUMA_KEY_MIN_LUMA                y
> LUMA_KEY_MAX_LUMA                y
>
>
> Sep  7 14:03:45 x4 kernel: [drm] Initialized drm 1.1.0 20060810
> Sep  7 14:03:45 x4 kernel: [drm] radeon kernel modesetting enabled.
> Sep  7 14:03:45 x4 kernel: [drm] initializing kernel modesetting (RS780 0x1002:0x9614 0x1043:0x834D).
> Sep  7 14:03:45 x4 kernel: [drm] register mmio base: 0xFBEE0000
> Sep  7 14:03:45 x4 kernel: [drm] register mmio size: 65536
> Sep  7 14:03:45 x4 kernel: ATOM BIOS: 113
> Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: VRAM: 128M 0x00000000C0000000 - 0x00000000C7FFFFFF (128M used)
> Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: GTT: 512M 0x00000000A0000000 - 0x00000000BFFFFFFF
> Sep  7 14:03:45 x4 kernel: [drm] Detected VRAM RAM=128M, BAR=128M
> Sep  7 14:03:45 x4 kernel: [drm] RAM width 32bits DDR
> Sep  7 14:03:45 x4 kernel: [TTM] Zone  kernel: Available graphics memory: 4083350 kiB
> Sep  7 14:03:45 x4 kernel: [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
> Sep  7 14:03:45 x4 kernel: [TTM] Initializing pool allocator
> Sep  7 14:03:45 x4 kernel: [TTM] Initializing DMA pool allocator
> Sep  7 14:03:45 x4 kernel: [drm] radeon: 128M of VRAM memory ready
> Sep  7 14:03:45 x4 kernel: [drm] radeon: 512M of GTT memory ready.
> Sep  7 14:03:45 x4 kernel: [drm] Loading RS780 Microcode
> Sep  7 14:03:45 x4 kernel: == power state 0 ==
> Sep  7 14:03:45 x4 kernel:      ui class: none
> Sep  7 14:03:45 x4 kernel:      internal class: boot
> Sep  7 14:03:45 x4 kernel:      caps: video
> Sep  7 14:03:45 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:03:45 x4 kernel:              power level 0    sclk: 50000 vddc_index: 2
> Sep  7 14:03:45 x4 kernel:              power level 1    sclk: 50000 vddc_index: 2
> Sep  7 14:03:45 x4 kernel:      status: c r b
> Sep  7 14:03:45 x4 kernel: == power state 1 ==
> Sep  7 14:03:45 x4 kernel:      ui class: performance
> Sep  7 14:03:45 x4 kernel:      internal class: none
> Sep  7 14:03:45 x4 kernel:      caps: video
> Sep  7 14:03:45 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:03:45 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:03:45 x4 kernel:              power level 1    sclk: 70000 vddc_index: 2
> Sep  7 14:03:45 x4 kernel:      status:
> Sep  7 14:03:45 x4 kernel: == power state 2 ==
> Sep  7 14:03:45 x4 kernel:      ui class: none
> Sep  7 14:03:45 x4 kernel:      internal class: uvd
> Sep  7 14:03:45 x4 kernel:      caps: video
> Sep  7 14:03:45 x4 kernel:      uvd    vclk: 53300 dclk: 40000
> Sep  7 14:03:45 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:03:45 x4 kernel:              power level 1    sclk: 50000 vddc_index: 1
> Sep  7 14:03:45 x4 kernel:      status:
> Sep  7 14:03:45 x4 kernel: [drm] radeon: dpm initialized
> Sep  7 14:03:45 x4 kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
> Sep  7 14:03:45 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
> Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: WB enabled
> Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
> Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
> Sep  7 14:03:45 x4 kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> Sep  7 14:03:45 x4 kernel: [drm] Driver supports precise vblank timestamp query.
> Sep  7 14:03:45 x4 kernel: [drm] radeon: irq initialized.
> Sep  7 14:03:45 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
> Sep  7 14:03:45 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
> Sep  7 14:03:45 x4 kernel: [drm] UVD initialized successfully.
> Sep  7 14:03:45 x4 kernel: [drm] ib test on ring 0 succeeded in 0 usecs
> Sep  7 14:03:45 x4 kernel: [drm] ib test on ring 5 succeeded
> Sep  7 14:03:45 x4 kernel: [drm] Radeon Display Connectors
> Sep  7 14:03:45 x4 kernel: [drm] Connector 0:
> Sep  7 14:03:45 x4 kernel: [drm]   VGA-1
> Sep  7 14:03:45 x4 kernel: [drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
> Sep  7 14:03:45 x4 kernel: [drm]   Encoders:
> Sep  7 14:03:45 x4 kernel: [drm]     CRT1: INTERNAL_KLDSCP_DAC1
> Sep  7 14:03:45 x4 kernel: [drm] Connector 1:
> Sep  7 14:03:45 x4 kernel: [drm]   DVI-D-1
> Sep  7 14:03:45 x4 kernel: [drm]   HPD3
> Sep  7 14:03:45 x4 kernel: [drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
> Sep  7 14:03:45 x4 kernel: [drm]   Encoders:
> Sep  7 14:03:45 x4 kernel: [drm]     DFP3: INTERNAL_KLDSCP_LVTMA
> Sep  7 14:03:45 x4 kernel: switching from power state:
> Sep  7 14:03:45 x4 kernel:      ui class: none
> Sep  7 14:03:45 x4 kernel:      internal class: boot
> Sep  7 14:03:45 x4 kernel:      caps: video
> Sep  7 14:03:45 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:03:45 x4 kernel:              power level 0    sclk: 50000 vddc_index: 2
> Sep  7 14:03:45 x4 kernel:              power level 1    sclk: 50000 vddc_index: 2
> Sep  7 14:03:45 x4 kernel:      status: c b
> Sep  7 14:03:45 x4 kernel: switching to power state:
> Sep  7 14:03:45 x4 kernel:      ui class: performance
> Sep  7 14:03:45 x4 kernel:      internal class: none
> Sep  7 14:03:45 x4 kernel:      caps: video
> Sep  7 14:03:45 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:03:45 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:03:45 x4 kernel:              power level 1    sclk: 70000 vddc_index: 2
> Sep  7 14:03:45 x4 kernel:      status: r
> Sep  7 14:03:45 x4 kernel: [drm] fb mappable at 0xF0359000
> Sep  7 14:03:45 x4 kernel: [drm] vram apper at 0xF0000000
> Sep  7 14:03:45 x4 kernel: [drm] size 7299072
> Sep  7 14:03:45 x4 kernel: [drm] fb depth is 24
> Sep  7 14:03:45 x4 kernel: [drm]    pitch is 6912
> Sep  7 14:03:45 x4 kernel: fbcon: radeondrmfb (fb0) is primary device
> Sep  7 14:03:45 x4 kernel: Console: switching to colour frame buffer device 131x105
> Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: fb0: radeondrmfb frame buffer device
> Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: registered panic notifier
> Sep  7 14:03:45 x4 kernel: tsc: Refined TSC clocksource calibration: 3210.826 MHz
> Sep  7 14:03:45 x4 kernel: [drm] Initialized radeon 2.40.0 20080528 for 0000:01:05.0 on minor 0
> ...
> Sep  7 14:20:37 x4 kernel: switching to power state:
> Sep  7 14:20:37 x4 kernel:      ui class: none
> Sep  7 14:20:37 x4 kernel:      internal class: uvd
> Sep  7 14:20:37 x4 kernel:      caps: video
> Sep  7 14:20:37 x4 kernel:      uvd    vclk: 53300 dclk: 40000
> Sep  7 14:20:37 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:20:37 x4 kernel:              power level 1    sclk: 50000 vddc_index: 1
> Sep  7 14:20:37 x4 kernel:      status: r
> Sep  7 14:20:54 x4 kernel: switching from power state:
> Sep  7 14:20:54 x4 kernel:      ui class: none
> Sep  7 14:20:54 x4 kernel:      internal class: uvd
> Sep  7 14:20:54 x4 kernel:      caps: video
> Sep  7 14:20:54 x4 kernel:      uvd    vclk: 53300 dclk: 40000
> Sep  7 14:20:54 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:20:54 x4 kernel:              power level 1    sclk: 50000 vddc_index: 1
> Sep  7 14:20:54 x4 kernel:      status: c
> Sep  7 14:20:54 x4 kernel: switching to power state:
> Sep  7 14:20:54 x4 kernel:      ui class: performance
> Sep  7 14:20:54 x4 kernel:      internal class: none
> Sep  7 14:20:54 x4 kernel:      caps: video
> Sep  7 14:20:54 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:20:54 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:20:54 x4 kernel:              power level 1    sclk: 70000 vddc_index: 2
> Sep  7 14:20:54 x4 kernel:      status: r
> Sep  7 14:21:02 x4 kernel: switching from power state:
> Sep  7 14:21:02 x4 kernel:      ui class: performance
> Sep  7 14:21:02 x4 kernel:      internal class: none
> Sep  7 14:21:02 x4 kernel:      caps: video
> Sep  7 14:21:02 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:21:02 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:21:02 x4 kernel:              power level 1    sclk: 70000 vddc_index: 2
> Sep  7 14:21:02 x4 kernel:      status: c
> Sep  7 14:21:02 x4 kernel: switching to power state:
> Sep  7 14:21:02 x4 kernel:      ui class: none
> Sep  7 14:21:02 x4 kernel:      internal class: uvd
> Sep  7 14:21:02 x4 kernel:      caps: video
> Sep  7 14:21:02 x4 kernel:      uvd    vclk: 53300 dclk: 40000
> Sep  7 14:21:02 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:21:02 x4 kernel:              power level 1    sclk: 50000 vddc_index: 1
> Sep  7 14:21:02 x4 kernel:      status: r
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10106msec
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000072ac last fence id 0x00000000000072b4 on ring 0)
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: Saved 377 dwords of commands on ring 0.
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000099
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA00034AF
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20044040
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000004
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000002
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00005087
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80098645
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA0003030
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20048040
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80100000
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
> Sep  7 14:21:13 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: WB enabled
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
> Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
> Sep  7 14:21:13 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
> Sep  7 14:21:13 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
> Sep  7 14:21:13 x4 kernel: [drm] UVD initialized successfully.
> Sep  7 14:21:13 x4 kernel: switching from power state:
> Sep  7 14:21:13 x4 kernel:      ui class: none
> Sep  7 14:21:13 x4 kernel:      internal class: boot
> Sep  7 14:21:13 x4 kernel:      caps: video
> Sep  7 14:21:13 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:21:13 x4 kernel:              power level 0    sclk: 50000 vddc_index: 2
> Sep  7 14:21:13 x4 kernel:              power level 1    sclk: 50000 vddc_index: 2
> Sep  7 14:21:13 x4 kernel:      status: c b
> Sep  7 14:21:13 x4 kernel: switching to power state:
> Sep  7 14:21:13 x4 kernel:      ui class: none
> Sep  7 14:21:13 x4 kernel:      internal class: uvd
> Sep  7 14:21:13 x4 kernel:      caps: video
> Sep  7 14:21:13 x4 kernel:      uvd    vclk: 53300 dclk: 40000
> Sep  7 14:21:13 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:21:13 x4 kernel:              power level 1    sclk: 50000 vddc_index: 1
> Sep  7 14:21:13 x4 kernel:      status: r
> Sep  7 14:21:20 x4 kernel: SysRq : Emergency Sync
> (new boot)
> Sep  7 14:44:49 x4 kernel: switching from power state:
> Sep  7 14:44:49 x4 kernel:      ui class: performance
> Sep  7 14:44:49 x4 kernel:      internal class: none
> Sep  7 14:44:49 x4 kernel:      caps: video
> Sep  7 14:44:49 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:44:49 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:44:49 x4 kernel:              power level 1    sclk: 70000 vddc_index: 2
> Sep  7 14:44:49 x4 kernel:      status: c
> Sep  7 14:44:49 x4 kernel: switching to power state:
> Sep  7 14:44:49 x4 kernel:      ui class: none
> Sep  7 14:44:49 x4 kernel:      internal class: uvd
> Sep  7 14:44:49 x4 kernel:      caps: video
> Sep  7 14:44:49 x4 kernel:      uvd    vclk: 53300 dclk: 40000
> Sep  7 14:44:49 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:44:49 x4 kernel:              power level 1    sclk: 50000 vddc_index: 1
> Sep  7 14:44:49 x4 kernel:      status: r
> Sep  7 14:44:59 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10000msec
> Sep  7 14:44:59 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
> Sep  7 14:44:59 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10466msec
> Sep  7 14:44:59 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
> Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10500msec
> Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
> Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10966msec
> Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
> Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 11000msec
> ...
> Sep  7 14:45:17 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
> Sep  7 14:45:17 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 28466msec
> Sep  7 14:45:17 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
> Sep  7 14:45:18 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 28500msec
> Sep  7 14:45:18 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
> Sep  7 14:45:18 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 28966msec
> Sep  7 14:45:18 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
> Sep  7 14:45:18 x4 kernel: SysRq : Emergency Sync
> (new boot)
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10000msec
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10000msec
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000001ed last fence id 0x00000000000001f6 on ring 0)
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x000000000000000b last fence id 0x000000000000000f on ring 5)
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: Saved 409 dwords of commands on ring 0.
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000099
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA00034E0
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20045040
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x01000004
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00200002
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00005087
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x808D8645
> Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA0003030
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20048040
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80100000
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
> Sep  7 14:48:15 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: WB enabled
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
> Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
> Sep  7 14:48:15 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
> Sep  7 14:48:15 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
> Sep  7 14:48:15 x4 kernel: [drm] UVD initialized successfully.
> Sep  7 14:48:15 x4 kernel: switching from power state:
> Sep  7 14:48:15 x4 kernel:      ui class: none
> Sep  7 14:48:15 x4 kernel:      internal class: boot
> Sep  7 14:48:15 x4 kernel:      caps: video
> Sep  7 14:48:15 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:48:15 x4 kernel:              power level 0    sclk: 50000 vddc_index: 2
> Sep  7 14:48:15 x4 kernel:              power level 1    sclk: 50000 vddc_index: 2
> Sep  7 14:48:15 x4 kernel:      status: c b
> Sep  7 14:48:15 x4 kernel: switching to power state:
> Sep  7 14:48:15 x4 kernel:      ui class: none
> Sep  7 14:48:15 x4 kernel:      internal class: uvd
> Sep  7 14:48:15 x4 kernel:      caps: video
> Sep  7 14:48:15 x4 kernel:      uvd    vclk: 53300 dclk: 40000
> Sep  7 14:48:15 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:48:15 x4 kernel:              power level 1    sclk: 50000 vddc_index: 1
> Sep  7 14:48:15 x4 kernel:      status: r
> Sep  7 14:48:15 x4 kernel: SysRq : Emergency Sync
> (new boot)
> Sep  7 14:53:20 x4 kernel: switching to power state:
> Sep  7 14:53:20 x4 kernel:      ui class: none
> Sep  7 14:53:20 x4 kernel:      internal class: uvd
> Sep  7 14:53:20 x4 kernel:      caps: video
> Sep  7 14:53:20 x4 kernel:      uvd    vclk: 53300 dclk: 40000
> Sep  7 14:53:20 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:53:20 x4 kernel:              power level 1    sclk: 50000 vddc_index: 1
> Sep  7 14:53:20 x4 kernel:      status: r
> Sep  7 14:53:30 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10050msec
> Sep  7 14:53:30 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000596 last fence id 0x00000000000005a3 on ring 0)
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: Saved 601 dwords of commands on ring 0.
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000088
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA00030B0
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20045040
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000004
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000002
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00005087
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80098645
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00004001
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA0003030
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20048040
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80100000
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
> Sep  7 14:53:31 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: WB enabled
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
> Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
> Sep  7 14:53:31 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
> Sep  7 14:53:31 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
> Sep  7 14:53:31 x4 kernel: [drm] UVD initialized successfully.
> Sep  7 14:53:31 x4 kernel: switching from power state:
> Sep  7 14:53:31 x4 kernel:      ui class: none
> Sep  7 14:53:31 x4 kernel:      internal class: boot
> Sep  7 14:53:31 x4 kernel:      caps: video
> Sep  7 14:53:31 x4 kernel:      uvd    vclk: 0 dclk: 0
> Sep  7 14:53:31 x4 kernel:              power level 0    sclk: 50000 vddc_index: 2
> Sep  7 14:53:31 x4 kernel:              power level 1    sclk: 50000 vddc_index: 2
> Sep  7 14:53:31 x4 kernel:      status: c b
> Sep  7 14:53:31 x4 kernel: switching to power state:
> Sep  7 14:53:31 x4 kernel:      ui class: none
> Sep  7 14:53:31 x4 kernel:      internal class: uvd
> Sep  7 14:53:31 x4 kernel:      caps: video
> Sep  7 14:53:31 x4 kernel:      uvd    vclk: 53300 dclk: 40000
> Sep  7 14:53:31 x4 kernel:              power level 0    sclk: 50000 vddc_index: 1
> Sep  7 14:53:31 x4 kernel:              power level 1    sclk: 50000 vddc_index: 1
> Sep  7 14:53:31 x4 kernel:      status: r
> Sep  7 14:53:39 x4 kernel: SysRq : Emergency Sync
>
> --
> Markus
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2014-08-25  9:10   ` Re: Christian König
@ 2014-09-07 13:24     ` Markus Trippelsdorf
  2014-09-08  3:47       ` Re: Alex Deucher
  0 siblings, 1 reply; 65+ messages in thread
From: Markus Trippelsdorf @ 2014-09-07 13:24 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

On 2014.08.25 at 11:10 +0200, Christian König wrote:
> Let me know if it works for you, cause we don't really have any hardware 
> any more to test it.

I've tested your patch series today (using drm-next-3.18 from
~agd5f/linux) on a RS780D/Radeon HD 3300 system with a couple of H264
videos. While it sometimes works as expected, it stalled the GPU far too
often to be usable. The stalls are not recoverable and the machine ends
up with a black sreen, but still accepts SysRq keyboard inputs.

Here are some logs:

vdpauinfo:
display: :0   screen: 0
API version: 1
Information string: G3DVL VDPAU Driver Shared Library version 1.0

Video surface:

name   width height types
-------------------------------------------
420     8192  8192  NV12 YV12 
422     8192  8192  UYVY YUYV 
444     8192  8192  Y8U8V8A8 V8U8Y8A8 

Decoder capabilities:

name               level macbs width height
-------------------------------------------
MPEG1                 0  9216  2048  1152
MPEG2_SIMPLE          3  9216  2048  1152
MPEG2_MAIN            3  9216  2048  1152
H264_BASELINE        41  9216  2048  1152
H264_MAIN            41  9216  2048  1152
H264_HIGH            41  9216  2048  1152
VC1_ADVANCED          4  9216  2048  1152

Output surface:

name              width height nat types
----------------------------------------------------
B8G8R8A8          8192  8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 
R8G8B8A8          8192  8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 
R10G10B10A2       8192  8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 
B10G10R10A2       8192  8192    y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 

Bitmap surface:

name              width height
------------------------------
B8G8R8A8          8192  8192
R8G8B8A8          8192  8192
R10G10B10A2       8192  8192
B10G10R10A2       8192  8192
A8                8192  8192

Video mixer:

feature name                    sup
------------------------------------
DEINTERLACE_TEMPORAL             y
DEINTERLACE_TEMPORAL_SPATIAL     -
INVERSE_TELECINE                 -
NOISE_REDUCTION                  y
SHARPNESS                        y
LUMA_KEY                         -
HIGH QUALITY SCALING - L1        -
HIGH QUALITY SCALING - L2        -
HIGH QUALITY SCALING - L3        -
HIGH QUALITY SCALING - L4        -
HIGH QUALITY SCALING - L5        -
HIGH QUALITY SCALING - L6        -
HIGH QUALITY SCALING - L7        -
HIGH QUALITY SCALING - L8        -
HIGH QUALITY SCALING - L9        -

parameter name                  sup      min      max
-----------------------------------------------------
VIDEO_SURFACE_WIDTH              y        48     2048
VIDEO_SURFACE_HEIGHT             y        48     1152
CHROMA_TYPE                      y  
LAYERS                           y         0        4

attribute name                  sup      min      max
-----------------------------------------------------
BACKGROUND_COLOR                 y  
CSC_MATRIX                       y  
NOISE_REDUCTION_LEVEL            y      0.00     1.00
SHARPNESS_LEVEL                  y     -1.00     1.00
LUMA_KEY_MIN_LUMA                y  
LUMA_KEY_MAX_LUMA                y  


Sep  7 14:03:45 x4 kernel: [drm] Initialized drm 1.1.0 20060810
Sep  7 14:03:45 x4 kernel: [drm] radeon kernel modesetting enabled.
Sep  7 14:03:45 x4 kernel: [drm] initializing kernel modesetting (RS780 0x1002:0x9614 0x1043:0x834D).
Sep  7 14:03:45 x4 kernel: [drm] register mmio base: 0xFBEE0000
Sep  7 14:03:45 x4 kernel: [drm] register mmio size: 65536
Sep  7 14:03:45 x4 kernel: ATOM BIOS: 113
Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: VRAM: 128M 0x00000000C0000000 - 0x00000000C7FFFFFF (128M used)
Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: GTT: 512M 0x00000000A0000000 - 0x00000000BFFFFFFF
Sep  7 14:03:45 x4 kernel: [drm] Detected VRAM RAM=128M, BAR=128M
Sep  7 14:03:45 x4 kernel: [drm] RAM width 32bits DDR
Sep  7 14:03:45 x4 kernel: [TTM] Zone  kernel: Available graphics memory: 4083350 kiB
Sep  7 14:03:45 x4 kernel: [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
Sep  7 14:03:45 x4 kernel: [TTM] Initializing pool allocator
Sep  7 14:03:45 x4 kernel: [TTM] Initializing DMA pool allocator
Sep  7 14:03:45 x4 kernel: [drm] radeon: 128M of VRAM memory ready
Sep  7 14:03:45 x4 kernel: [drm] radeon: 512M of GTT memory ready.
Sep  7 14:03:45 x4 kernel: [drm] Loading RS780 Microcode
Sep  7 14:03:45 x4 kernel: == power state 0 ==
Sep  7 14:03:45 x4 kernel: 	ui class: none
Sep  7 14:03:45 x4 kernel: 	internal class: boot 
Sep  7 14:03:45 x4 kernel: 	caps: video 
Sep  7 14:03:45 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:03:45 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 2
Sep  7 14:03:45 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 2
Sep  7 14:03:45 x4 kernel: 	status: c r b 
Sep  7 14:03:45 x4 kernel: == power state 1 ==
Sep  7 14:03:45 x4 kernel: 	ui class: performance
Sep  7 14:03:45 x4 kernel: 	internal class: none
Sep  7 14:03:45 x4 kernel: 	caps: video 
Sep  7 14:03:45 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:03:45 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:03:45 x4 kernel: 		power level 1    sclk: 70000 vddc_index: 2
Sep  7 14:03:45 x4 kernel: 	status: 
Sep  7 14:03:45 x4 kernel: == power state 2 ==
Sep  7 14:03:45 x4 kernel: 	ui class: none
Sep  7 14:03:45 x4 kernel: 	internal class: uvd 
Sep  7 14:03:45 x4 kernel: 	caps: video 
Sep  7 14:03:45 x4 kernel: 	uvd    vclk: 53300 dclk: 40000
Sep  7 14:03:45 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:03:45 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 1
Sep  7 14:03:45 x4 kernel: 	status: 
Sep  7 14:03:45 x4 kernel: [drm] radeon: dpm initialized
Sep  7 14:03:45 x4 kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
Sep  7 14:03:45 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: WB enabled
Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
Sep  7 14:03:45 x4 kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Sep  7 14:03:45 x4 kernel: [drm] Driver supports precise vblank timestamp query.
Sep  7 14:03:45 x4 kernel: [drm] radeon: irq initialized.
Sep  7 14:03:45 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Sep  7 14:03:45 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
Sep  7 14:03:45 x4 kernel: [drm] UVD initialized successfully.
Sep  7 14:03:45 x4 kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Sep  7 14:03:45 x4 kernel: [drm] ib test on ring 5 succeeded
Sep  7 14:03:45 x4 kernel: [drm] Radeon Display Connectors
Sep  7 14:03:45 x4 kernel: [drm] Connector 0:
Sep  7 14:03:45 x4 kernel: [drm]   VGA-1
Sep  7 14:03:45 x4 kernel: [drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
Sep  7 14:03:45 x4 kernel: [drm]   Encoders:
Sep  7 14:03:45 x4 kernel: [drm]     CRT1: INTERNAL_KLDSCP_DAC1
Sep  7 14:03:45 x4 kernel: [drm] Connector 1:
Sep  7 14:03:45 x4 kernel: [drm]   DVI-D-1
Sep  7 14:03:45 x4 kernel: [drm]   HPD3
Sep  7 14:03:45 x4 kernel: [drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
Sep  7 14:03:45 x4 kernel: [drm]   Encoders:
Sep  7 14:03:45 x4 kernel: [drm]     DFP3: INTERNAL_KLDSCP_LVTMA
Sep  7 14:03:45 x4 kernel: switching from power state:
Sep  7 14:03:45 x4 kernel: 	ui class: none
Sep  7 14:03:45 x4 kernel: 	internal class: boot 
Sep  7 14:03:45 x4 kernel: 	caps: video 
Sep  7 14:03:45 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:03:45 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 2
Sep  7 14:03:45 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 2
Sep  7 14:03:45 x4 kernel: 	status: c b 
Sep  7 14:03:45 x4 kernel: switching to power state:
Sep  7 14:03:45 x4 kernel: 	ui class: performance
Sep  7 14:03:45 x4 kernel: 	internal class: none
Sep  7 14:03:45 x4 kernel: 	caps: video 
Sep  7 14:03:45 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:03:45 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:03:45 x4 kernel: 		power level 1    sclk: 70000 vddc_index: 2
Sep  7 14:03:45 x4 kernel: 	status: r 
Sep  7 14:03:45 x4 kernel: [drm] fb mappable at 0xF0359000
Sep  7 14:03:45 x4 kernel: [drm] vram apper at 0xF0000000
Sep  7 14:03:45 x4 kernel: [drm] size 7299072
Sep  7 14:03:45 x4 kernel: [drm] fb depth is 24
Sep  7 14:03:45 x4 kernel: [drm]    pitch is 6912
Sep  7 14:03:45 x4 kernel: fbcon: radeondrmfb (fb0) is primary device
Sep  7 14:03:45 x4 kernel: Console: switching to colour frame buffer device 131x105
Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: fb0: radeondrmfb frame buffer device
Sep  7 14:03:45 x4 kernel: radeon 0000:01:05.0: registered panic notifier
Sep  7 14:03:45 x4 kernel: tsc: Refined TSC clocksource calibration: 3210.826 MHz
Sep  7 14:03:45 x4 kernel: [drm] Initialized radeon 2.40.0 20080528 for 0000:01:05.0 on minor 0
...
Sep  7 14:20:37 x4 kernel: switching to power state:
Sep  7 14:20:37 x4 kernel: 	ui class: none
Sep  7 14:20:37 x4 kernel: 	internal class: uvd 
Sep  7 14:20:37 x4 kernel: 	caps: video 
Sep  7 14:20:37 x4 kernel: 	uvd    vclk: 53300 dclk: 40000
Sep  7 14:20:37 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:20:37 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 1
Sep  7 14:20:37 x4 kernel: 	status: r 
Sep  7 14:20:54 x4 kernel: switching from power state:
Sep  7 14:20:54 x4 kernel: 	ui class: none
Sep  7 14:20:54 x4 kernel: 	internal class: uvd 
Sep  7 14:20:54 x4 kernel: 	caps: video 
Sep  7 14:20:54 x4 kernel: 	uvd    vclk: 53300 dclk: 40000
Sep  7 14:20:54 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:20:54 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 1
Sep  7 14:20:54 x4 kernel: 	status: c 
Sep  7 14:20:54 x4 kernel: switching to power state:
Sep  7 14:20:54 x4 kernel: 	ui class: performance
Sep  7 14:20:54 x4 kernel: 	internal class: none
Sep  7 14:20:54 x4 kernel: 	caps: video 
Sep  7 14:20:54 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:20:54 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:20:54 x4 kernel: 		power level 1    sclk: 70000 vddc_index: 2
Sep  7 14:20:54 x4 kernel: 	status: r 
Sep  7 14:21:02 x4 kernel: switching from power state:
Sep  7 14:21:02 x4 kernel: 	ui class: performance
Sep  7 14:21:02 x4 kernel: 	internal class: none
Sep  7 14:21:02 x4 kernel: 	caps: video 
Sep  7 14:21:02 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:21:02 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:21:02 x4 kernel: 		power level 1    sclk: 70000 vddc_index: 2
Sep  7 14:21:02 x4 kernel: 	status: c 
Sep  7 14:21:02 x4 kernel: switching to power state:
Sep  7 14:21:02 x4 kernel: 	ui class: none
Sep  7 14:21:02 x4 kernel: 	internal class: uvd 
Sep  7 14:21:02 x4 kernel: 	caps: video 
Sep  7 14:21:02 x4 kernel: 	uvd    vclk: 53300 dclk: 40000
Sep  7 14:21:02 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:21:02 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 1
Sep  7 14:21:02 x4 kernel: 	status: r 
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10106msec
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000072ac last fence id 0x00000000000072b4 on ring 0)
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: Saved 377 dwords of commands on ring 0.
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000099
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA00034AF
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20044040
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000004
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000002
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00005087
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80098645
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA0003030
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20048040
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80100000
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
Sep  7 14:21:13 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: WB enabled
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
Sep  7 14:21:13 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
Sep  7 14:21:13 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Sep  7 14:21:13 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
Sep  7 14:21:13 x4 kernel: [drm] UVD initialized successfully.
Sep  7 14:21:13 x4 kernel: switching from power state:
Sep  7 14:21:13 x4 kernel: 	ui class: none
Sep  7 14:21:13 x4 kernel: 	internal class: boot 
Sep  7 14:21:13 x4 kernel: 	caps: video 
Sep  7 14:21:13 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:21:13 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 2
Sep  7 14:21:13 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 2
Sep  7 14:21:13 x4 kernel: 	status: c b 
Sep  7 14:21:13 x4 kernel: switching to power state:
Sep  7 14:21:13 x4 kernel: 	ui class: none
Sep  7 14:21:13 x4 kernel: 	internal class: uvd 
Sep  7 14:21:13 x4 kernel: 	caps: video 
Sep  7 14:21:13 x4 kernel: 	uvd    vclk: 53300 dclk: 40000
Sep  7 14:21:13 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:21:13 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 1
Sep  7 14:21:13 x4 kernel: 	status: r 
Sep  7 14:21:20 x4 kernel: SysRq : Emergency Sync
(new boot)
Sep  7 14:44:49 x4 kernel: switching from power state:
Sep  7 14:44:49 x4 kernel: 	ui class: performance
Sep  7 14:44:49 x4 kernel: 	internal class: none
Sep  7 14:44:49 x4 kernel: 	caps: video 
Sep  7 14:44:49 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:44:49 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:44:49 x4 kernel: 		power level 1    sclk: 70000 vddc_index: 2
Sep  7 14:44:49 x4 kernel: 	status: c 
Sep  7 14:44:49 x4 kernel: switching to power state:
Sep  7 14:44:49 x4 kernel: 	ui class: none
Sep  7 14:44:49 x4 kernel: 	internal class: uvd 
Sep  7 14:44:49 x4 kernel: 	caps: video 
Sep  7 14:44:49 x4 kernel: 	uvd    vclk: 53300 dclk: 40000
Sep  7 14:44:49 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:44:49 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 1
Sep  7 14:44:49 x4 kernel: 	status: r 
Sep  7 14:44:59 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10000msec
Sep  7 14:44:59 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
Sep  7 14:44:59 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10466msec
Sep  7 14:44:59 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10500msec
Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10966msec
Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
Sep  7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 11000msec
...
Sep  7 14:45:17 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
Sep  7 14:45:17 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 28466msec
Sep  7 14:45:17 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
Sep  7 14:45:18 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 28500msec
Sep  7 14:45:18 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
Sep  7 14:45:18 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 28966msec
Sep  7 14:45:18 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
Sep  7 14:45:18 x4 kernel: SysRq : Emergency Sync
(new boot)
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10000msec
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10000msec
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000001ed last fence id 0x00000000000001f6 on ring 0)
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x000000000000000b last fence id 0x000000000000000f on ring 5)
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: Saved 409 dwords of commands on ring 0.
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000099
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA00034E0
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20045040
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x01000004
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00200002
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00005087
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x808D8645
Sep  7 14:48:14 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA0003030
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20048040
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80100000
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
Sep  7 14:48:15 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: WB enabled
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
Sep  7 14:48:15 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
Sep  7 14:48:15 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Sep  7 14:48:15 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
Sep  7 14:48:15 x4 kernel: [drm] UVD initialized successfully.
Sep  7 14:48:15 x4 kernel: switching from power state:
Sep  7 14:48:15 x4 kernel: 	ui class: none
Sep  7 14:48:15 x4 kernel: 	internal class: boot 
Sep  7 14:48:15 x4 kernel: 	caps: video 
Sep  7 14:48:15 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:48:15 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 2
Sep  7 14:48:15 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 2
Sep  7 14:48:15 x4 kernel: 	status: c b 
Sep  7 14:48:15 x4 kernel: switching to power state:
Sep  7 14:48:15 x4 kernel: 	ui class: none
Sep  7 14:48:15 x4 kernel: 	internal class: uvd 
Sep  7 14:48:15 x4 kernel: 	caps: video 
Sep  7 14:48:15 x4 kernel: 	uvd    vclk: 53300 dclk: 40000
Sep  7 14:48:15 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:48:15 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 1
Sep  7 14:48:15 x4 kernel: 	status: r 
Sep  7 14:48:15 x4 kernel: SysRq : Emergency Sync
(new boot)
Sep  7 14:53:20 x4 kernel: switching to power state:
Sep  7 14:53:20 x4 kernel: 	ui class: none
Sep  7 14:53:20 x4 kernel: 	internal class: uvd 
Sep  7 14:53:20 x4 kernel: 	caps: video 
Sep  7 14:53:20 x4 kernel: 	uvd    vclk: 53300 dclk: 40000
Sep  7 14:53:20 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:53:20 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 1
Sep  7 14:53:20 x4 kernel: 	status: r 
Sep  7 14:53:30 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10050msec
Sep  7 14:53:30 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000596 last fence id 0x00000000000005a3 on ring 0)
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: Saved 601 dwords of commands on ring 0.
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000088
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA00030B0
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20045040
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000004
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000002
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00005087
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80098645
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00004001
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008010_GRBM_STATUS      = 0xA0003030
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008014_GRBM_STATUS2     = 0x00000003
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_000E50_SRBM_STATUS      = 0x20048040
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_008680_CP_STAT          = 0x80100000
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
Sep  7 14:53:31 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: WB enabled
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
Sep  7 14:53:31 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
Sep  7 14:53:31 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Sep  7 14:53:31 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
Sep  7 14:53:31 x4 kernel: [drm] UVD initialized successfully.
Sep  7 14:53:31 x4 kernel: switching from power state:
Sep  7 14:53:31 x4 kernel: 	ui class: none
Sep  7 14:53:31 x4 kernel: 	internal class: boot 
Sep  7 14:53:31 x4 kernel: 	caps: video 
Sep  7 14:53:31 x4 kernel: 	uvd    vclk: 0 dclk: 0
Sep  7 14:53:31 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 2
Sep  7 14:53:31 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 2
Sep  7 14:53:31 x4 kernel: 	status: c b 
Sep  7 14:53:31 x4 kernel: switching to power state:
Sep  7 14:53:31 x4 kernel: 	ui class: none
Sep  7 14:53:31 x4 kernel: 	internal class: uvd 
Sep  7 14:53:31 x4 kernel: 	caps: video 
Sep  7 14:53:31 x4 kernel: 	uvd    vclk: 53300 dclk: 40000
Sep  7 14:53:31 x4 kernel: 		power level 0    sclk: 50000 vddc_index: 1
Sep  7 14:53:31 x4 kernel: 		power level 1    sclk: 50000 vddc_index: 1
Sep  7 14:53:31 x4 kernel: 	status: r 
Sep  7 14:53:39 x4 kernel: SysRq : Emergency Sync

-- 
Markus

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2014-08-24 13:34 ` Mike Lothian
@ 2014-08-25  9:10   ` Christian König
  2014-09-07 13:24     ` Re: Markus Trippelsdorf
  0 siblings, 1 reply; 65+ messages in thread
From: Christian König @ 2014-08-25  9:10 UTC (permalink / raw)
  To: Mike Lothian; +Cc: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1550 bytes --]

Let me know if it works for you, cause we don't really have any hardware 
any more to test it.

Christian.

Am 24.08.2014 um 15:34 schrieb Mike Lothian:
>
> Thanks for this
>
> Good work
>
> On 24 Aug 2014 14:15, "Christian König" <deathsimple@vodafone.de 
> <mailto:deathsimple@vodafone.de>> wrote:
>
>     Hello everyone,
>
>     the following patches add UVD support for older ASICs (RV6xx,
>     RS[78]80, RV7[79]0). For everybody wanting to test it I've also
>     uploaded a branch to FDO:
>     http://cgit.freedesktop.org/~deathsimple/linux/log/?h=uvd-r600-release
>     <http://cgit.freedesktop.org/%7Edeathsimple/linux/log/?h=uvd-r600-release>
>
>     Additionally to the patches you need UVD firmware as well, which
>     can be found at the usual location:
>     http://people.freedesktop.org/~agd5f/radeon_ucode/
>     <http://people.freedesktop.org/%7Eagd5f/radeon_ucode/>
>
>     A small Mesa patch is needed as well, cause the older hardware
>     doesn't support field based output of video frames. So
>     unfortunately VDPAU/OpenGL interop won't work either.
>
>     We can only provide best effort support for those older ASICs, but
>     at least on my RS[78]80 based laptop it seems to work perfectly fine.
>
>     Happy testing,
>     Christian.
>
>     _______________________________________________
>     dri-devel mailing list
>     dri-devel@lists.freedesktop.org
>     <mailto:dri-devel@lists.freedesktop.org>
>     http://lists.freedesktop.org/mailman/listinfo/dri-devel
>


[-- Attachment #1.2: Type: text/html, Size: 2761 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2014-08-24 13:14 (unknown), Christian König
@ 2014-08-24 13:34 ` Mike Lothian
  2014-08-25  9:10   ` Re: Christian König
  0 siblings, 1 reply; 65+ messages in thread
From: Mike Lothian @ 2014-08-24 13:34 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1070 bytes --]

Thanks for this

Good work
On 24 Aug 2014 14:15, "Christian König" <deathsimple@vodafone.de> wrote:

> Hello everyone,
>
> the following patches add UVD support for older ASICs (RV6xx, RS[78]80,
> RV7[79]0). For everybody wanting to test it I've also uploaded a branch to
> FDO:
> http://cgit.freedesktop.org/~deathsimple/linux/log/?h=uvd-r600-release
>
> Additionally to the patches you need UVD firmware as well, which can be
> found at the usual location:
> http://people.freedesktop.org/~agd5f/radeon_ucode/
>
> A small Mesa patch is needed as well, cause the older hardware doesn't
> support field based output of video frames. So unfortunately VDPAU/OpenGL
> interop won't work either.
>
> We can only provide best effort support for those older ASICs, but at
> least on my RS[78]80 based laptop it seems to work perfectly fine.
>
> Happy testing,
> Christian.
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>

[-- Attachment #1.2: Type: text/html, Size: 1682 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2013-08-13  9:56 (unknown), Christian König
@ 2013-08-13 14:47 ` Alex Deucher
  0 siblings, 0 replies; 65+ messages in thread
From: Alex Deucher @ 2013-08-13 14:47 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel

On Tue, Aug 13, 2013 at 5:56 AM, Christian König
<deathsimple@vodafone.de> wrote:
> Hey Alex,
>
> here are my patches for reworking the ring function pointers and separating out the UVD and DMA rings.
>
> Everything is rebased on your drm-next-3.12-wip branch, please review and add them to your branch.

Patches look good to me.  I've added them to my 3.12 tree.

Alex

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2012-05-22 14:06 ` Lars-Peter Clausen
  2012-05-23  8:12   ` Re: Sascha Hauer
@ 2012-05-24  6:31   ` Sascha Hauer
  1 sibling, 0 replies; 65+ messages in thread
From: Sascha Hauer @ 2012-05-24  6:31 UTC (permalink / raw)
  To: Lars-Peter Clausen; +Cc: linux-arm-kernel, dri-devel

On Tue, May 22, 2012 at 04:06:41PM +0200, Lars-Peter Clausen wrote:
> On 05/18/2012 02:27 PM, Sascha Hauer wrote:
> > Hi All,
> > 
> > The following adds a drm/kms driver for the Freescale i.MX LCDC
> > controller. Most notable change to the last SDRM based version is that
> > the SDRM layer has been removed and the driver now is purely i.MX
> > specific. I hope that this is more acceptable now.
> > 
> > Another change is that the probe is now devicetree based. For now I
> > took the easy way out and only put an edid blob into the devicetree.
> > I haven't documented the binding yet, I would add that when the rest
> > is considered ok.
> > 
> > Comments very welcome.
> > 
> 
> Hi,
> 
> I really liked the sdrm layer. At least some bits of it. I've been working
> on a "simple" DRM driver as well. The hardware has no fancy acceleration
> features, just a simple buffer and some scanout logic. I'm basically using
> the same gem buffer structure and the buffer is also allocated using
> dma_alloc_writecombine, which means we can probably share all of the GEM
> handling code and probably also most of the fbdev code. I also started with
> the Exynos GEM code as a template, but reworked it later to be more like the
> UDL code, which made it a bit more compact. I think it would be a good idea
> to put at least the GEM handling in some common code as I expect that we'll
> see more similar "simple" DRM drivers pop up.

Ok, I'll try to put the GEM stuff into helper functions. Would you care
to review/test it? I have something else to do right now but I hope I'll
be there next week.

Sascha

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2012-05-22 14:06 ` Lars-Peter Clausen
@ 2012-05-23  8:12   ` Sascha Hauer
  2012-05-24  6:31   ` Re: Sascha Hauer
  1 sibling, 0 replies; 65+ messages in thread
From: Sascha Hauer @ 2012-05-23  8:12 UTC (permalink / raw)
  To: Lars-Peter Clausen; +Cc: linux-arm-kernel, dri-devel

On Tue, May 22, 2012 at 04:06:41PM +0200, Lars-Peter Clausen wrote:
> On 05/18/2012 02:27 PM, Sascha Hauer wrote:
> > Hi All,
> > 
> > The following adds a drm/kms driver for the Freescale i.MX LCDC
> > controller. Most notable change to the last SDRM based version is that
> > the SDRM layer has been removed and the driver now is purely i.MX
> > specific. I hope that this is more acceptable now.
> > 
> > Another change is that the probe is now devicetree based. For now I
> > took the easy way out and only put an edid blob into the devicetree.
> > I haven't documented the binding yet, I would add that when the rest
> > is considered ok.
> > 
> > Comments very welcome.
> > 
> 
> Hi,
> 
> I really liked the sdrm layer. At least some bits of it. I've been working
> on a "simple" DRM driver as well. The hardware has no fancy acceleration
> features, just a simple buffer and some scanout logic. I'm basically using
> the same gem buffer structure and the buffer is also allocated using
> dma_alloc_writecombine, which means we can probably share all of the GEM
> handling code and probably also most of the fbdev code. I also started with
> the Exynos GEM code as a template, but reworked it later to be more like the
> UDL code, which made it a bit more compact. I think it would be a good idea
> to put at least the GEM handling in some common code as I expect that we'll
> see more similar "simple" DRM drivers pop up.

I totally agree. Having to track other drivers for bug fixes to apply
them on the own driver is not very convenient. As answered to Rob I do
not really have a clue how to accomplish this.

> 
> The code in question can be found at
> https://github.com/lclausen-adi/linux-2.6/commit/87a8fd6b98eeee317c7a486846cc8405d0bd68d8
> 
> Btw. the imx-drm.h is missing in your patch.

Oops, here it is for reference, will include it in the next round.


#ifndef _IMX_DRM_H_
#define _IMX_DRM_H_

/**
 * User-desired buffer creation information structure.
 *
 * @size: requested size for the object.
 *	- this size value would be page-aligned internally.
 * @flags: user request for setting memory type or cache attributes.
 * @handle: returned handle for the object.
 */
struct imx_drm_gem_create {
	unsigned int size;
	unsigned int flags;
	unsigned int handle;
};

struct imx_drm_device;
struct imx_drm_crtc;

struct imx_drm_crtc_helper_funcs {
	int (*enable_vblank)(struct drm_crtc *crtc);
	void (*disable_vblank)(struct drm_crtc *crtc);
};

int imx_drm_add_crtc(struct drm_crtc *crtc,
		struct imx_drm_crtc **new_crtc,
		const struct drm_crtc_funcs *crtc_funcs,
		const struct drm_crtc_helper_funcs *crtc_helper_funcs,
		const struct imx_drm_crtc_helper_funcs *ec_helper_funcs,
		struct module *owner);
int imx_drm_remove_crtc(struct imx_drm_crtc *);
int imx_drm_init_drm(struct platform_device *pdev,
		int preferred_bpp);
int imx_drm_exit_drm(void);

int imx_drm_crtc_vblank_get(struct imx_drm_crtc *imx_drm_crtc);
void imx_drm_crtc_vblank_put(struct imx_drm_crtc *imx_drm_crtc);
void imx_drm_handle_vblank(struct imx_drm_crtc *imx_drm_crtc);

/*
 * imx drm buffer entry structure.
 *
 * @paddr: physical address of allocated memory.
 * @vaddr: kernel virtual address of allocated memory.
 * @size: size of allocated memory.
 */
struct imx_drm_buf_entry {
	dma_addr_t paddr;
	void __iomem *vaddr;
	unsigned int size;
};

/* get physical memory information of a drm framebuffer. */
struct imx_drm_buf_entry *imx_drm_fb_get_buf(struct drm_framebuffer *fb);

struct imx_drm_encoder;
int imx_drm_add_encoder(struct drm_encoder *encoder,
		struct imx_drm_encoder **new_enc,
		struct module *owner);
int imx_drm_remove_encoder(struct imx_drm_encoder *);

struct imx_drm_connector;
int imx_drm_add_connector(struct drm_connector *connector,
		struct imx_drm_connector **new_con,
		struct module *owner);
int imx_drm_remove_connector(struct imx_drm_connector *);

void imx_drm_mode_config_init(struct drm_device *drm);

#define to_imx_drm_gem_obj(x)	container_of(x,\
			struct imx_drm_gem_obj, base)

struct imx_drm_gem_obj {
	struct drm_gem_object base;
	struct imx_drm_buf_entry *entry;
};

/* unmap a buffer from user space. */
int imx_drm_gem_munmap_ioctl(struct drm_device *drm, void *data,
		struct drm_file *file_priv);

/* initialize gem object. */
int imx_drm_gem_init_object(struct drm_gem_object *obj);

/* free gem object. */
void imx_drm_gem_free_object(struct drm_gem_object *gem_obj);

/* create memory region for drm framebuffer. */
int imx_drm_gem_dumb_create(struct drm_file *file_priv,
		struct drm_device *drm, struct drm_mode_create_dumb *args);

/* map memory region for drm framebuffer to user space. */
int imx_drm_gem_dumb_map_offset(struct drm_file *file_priv,
		struct drm_device *drm, uint32_t handle, uint64_t *offset);

/* page fault handler and mmap fault address(virtual) to physical memory. */
int imx_drm_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);

/* set vm_flags and we can change the vm attribute to other one at here. */
int imx_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma);

/*
 * destroy memory region allocated.
 *	- a gem handle and physical memory region pointed by a gem object
 *	would be released by drm_gem_handle_delete().
 */
int imx_drm_gem_dumb_destroy(struct drm_file *file_priv,
		struct drm_device *drm, unsigned int handle);

/* allocate physical memory. */
struct imx_drm_buf_entry *imx_drm_buf_create(struct drm_device *drm,
		unsigned int size);

/* remove allocated physical memory. */
void imx_drm_buf_destroy(struct drm_device *drm, struct imx_drm_buf_entry *entry);

struct drm_device *imx_drm_device_get(void);
void imx_drm_device_put(void);

#endif /* _IMX_DRM_H_ */
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re:
  2012-05-18 12:27 (unknown), Sascha Hauer
@ 2012-05-22 14:06 ` Lars-Peter Clausen
  2012-05-23  8:12   ` Re: Sascha Hauer
  2012-05-24  6:31   ` Re: Sascha Hauer
  0 siblings, 2 replies; 65+ messages in thread
From: Lars-Peter Clausen @ 2012-05-22 14:06 UTC (permalink / raw)
  To: Sascha Hauer; +Cc: linux-arm-kernel, dri-devel

On 05/18/2012 02:27 PM, Sascha Hauer wrote:
> Hi All,
> 
> The following adds a drm/kms driver for the Freescale i.MX LCDC
> controller. Most notable change to the last SDRM based version is that
> the SDRM layer has been removed and the driver now is purely i.MX
> specific. I hope that this is more acceptable now.
> 
> Another change is that the probe is now devicetree based. For now I
> took the easy way out and only put an edid blob into the devicetree.
> I haven't documented the binding yet, I would add that when the rest
> is considered ok.
> 
> Comments very welcome.
> 

Hi,

I really liked the sdrm layer. At least some bits of it. I've been working
on a "simple" DRM driver as well. The hardware has no fancy acceleration
features, just a simple buffer and some scanout logic. I'm basically using
the same gem buffer structure and the buffer is also allocated using
dma_alloc_writecombine, which means we can probably share all of the GEM
handling code and probably also most of the fbdev code. I also started with
the Exynos GEM code as a template, but reworked it later to be more like the
UDL code, which made it a bit more compact. I think it would be a good idea
to put at least the GEM handling in some common code as I expect that we'll
see more similar "simple" DRM drivers pop up.

The code in question can be found at
https://github.com/lclausen-adi/linux-2.6/commit/87a8fd6b98eeee317c7a486846cc8405d0bd68d8

Btw. the imx-drm.h is missing in your patch.

- Lars

> Thanks
>  Sascha
> 
> ----------------------------------------------------------------
> Sascha Hauer (2):
>       DRM: add Freescale i.MX LCDC driver
>       pcm038 lcdc support
> 
>  arch/arm/boot/dts/imx27-phytec-phycore.dts |   39 ++
>  arch/arm/boot/dts/imx27.dtsi               |    7 +
>  arch/arm/mach-imx/clock-imx27.c            |    1 +
>  drivers/gpu/drm/Kconfig                    |    2 +
>  drivers/gpu/drm/Makefile                   |    1 +
>  drivers/gpu/drm/imx/Kconfig                |   18 +
>  drivers/gpu/drm/imx/Makefile               |    8 +
>  drivers/gpu/drm/imx/imx-drm-core.c         |  745 ++++++++++++++++++++++++++++
>  drivers/gpu/drm/imx/imx-fb.c               |  179 +++++++
>  drivers/gpu/drm/imx/imx-fbdev.c            |  275 ++++++++++
>  drivers/gpu/drm/imx/imx-gem.c              |  343 +++++++++++++
>  drivers/gpu/drm/imx/imx-lcdc-crtc.c        |  517 +++++++++++++++++++
>  drivers/gpu/drm/imx/imx-parallel-display.c |  228 +++++++++
>  13 files changed, 2363 insertions(+)
>  create mode 100644 drivers/gpu/drm/imx/Kconfig
>  create mode 100644 drivers/gpu/drm/imx/Makefile
>  create mode 100644 drivers/gpu/drm/imx/imx-drm-core.c
>  create mode 100644 drivers/gpu/drm/imx/imx-fb.c
>  create mode 100644 drivers/gpu/drm/imx/imx-fbdev.c
>  create mode 100644 drivers/gpu/drm/imx/imx-gem.c
>  create mode 100644 drivers/gpu/drm/imx/imx-lcdc-crtc.c
>  create mode 100644 drivers/gpu/drm/imx/imx-parallel-display.c
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2023-11-11  8:23 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-06  7:51 Christian König
2022-04-06  7:51 ` [PATCH 01/16] dma-buf/drivers: make reserving a shared slot mandatory v4 Christian König
2022-04-06 12:21   ` Daniel Vetter
2022-04-06  7:51 ` [PATCH 02/16] dma-buf: add enum dma_resv_usage v4 Christian König
2022-04-06  7:51 ` [PATCH 03/16] dma-buf: specify usage while adding fences to dma_resv obj v6 Christian König
2022-04-06 12:32   ` Daniel Vetter
2022-04-06 12:35     ` Daniel Vetter
2022-04-07  8:01       ` Christian König
2022-04-07  9:26         ` Daniel Vetter
2022-04-06  7:51 ` [PATCH 04/16] dma-buf & drm/amdgpu: remove dma_resv workaround Christian König
2022-04-06 12:39   ` Daniel Vetter
2022-04-06  7:51 ` [PATCH 05/16] dma-buf: add DMA_RESV_USAGE_KERNEL v3 Christian König
2022-04-06 12:41   ` Daniel Vetter
2022-04-06  7:51 ` [PATCH 06/16] drm/amdgpu: use DMA_RESV_USAGE_KERNEL Christian König
2022-04-06 12:42   ` Daniel Vetter
2022-04-06 14:54     ` Christian König
2022-04-06  7:51 ` [PATCH 07/16] drm/radeon: " Christian König
2022-04-06 12:43   ` Daniel Vetter
2022-04-06  7:51 ` [PATCH 08/16] drm/etnaviv: always wait for kernel fences Christian König
2022-04-06 12:46   ` Daniel Vetter
2022-04-06  7:51 ` [PATCH 09/16] drm/nouveau: only wait for kernel fences in nouveau_bo_vm_cleanup Christian König
2022-04-06 12:47   ` Daniel Vetter
2022-04-06  7:51 ` [PATCH 10/16] RDMA: use DMA_RESV_USAGE_KERNEL Christian König
2022-04-06 12:48   ` Daniel Vetter
2022-04-06  7:51 ` [PATCH 11/16] dma-buf: add DMA_RESV_USAGE_BOOKKEEP v3 Christian König
2022-04-06  7:51 ` [PATCH 12/16] drm/amdgpu: use DMA_RESV_USAGE_BOOKKEEP Christian König
2022-04-06  7:51 ` [PATCH 13/16] dma-buf: wait for map to complete for static attachments Christian König
2022-04-06  7:51 ` [PATCH 14/16] drm/i915: drop bo->moving dependency Christian König
2022-04-06 13:24   ` Matthew Auld
2022-04-06  7:51 ` [PATCH 15/16] drm/ttm: remove bo->moving Christian König
2022-04-06 12:52   ` Daniel Vetter
2022-04-06  7:51 ` [PATCH 16/16] dma-buf: drop seq count based update Christian König
2022-04-06 13:00   ` Daniel Vetter
2022-04-06 12:59 ` Daniel Vetter
  -- strict thread matches above, loose matches on Subject: below --
2023-11-11  4:21 Andrew Worsley
2023-11-11  8:22 ` Javier Martinez Canillas
2022-05-19  9:54 Christian König
2022-05-19 10:50 ` Matthew Auld
2022-05-20  7:11   ` Re: Christian König
     [not found] <CAGsV3ysM+p_HAq+LgOe4db09e+zRtvELHUQzCjF8FVE2UF+3Ow@mail.gmail.com>
2021-06-29 13:52 ` Re: Alex Deucher
2021-05-15 22:57 Dmitry Baryshkov
2021-06-02 21:45 ` Dmitry Baryshkov
     [not found] <20201008181606.460499-1-sandy.8925@gmail.com>
2020-10-09  6:47 ` Re: Thomas Zimmermann
2020-10-09  7:14   ` Re: Thomas Zimmermann
2020-10-09  7:38     ` Re: Sandeep Raghuraman
2020-10-09  7:51       ` Re: Thomas Zimmermann
2020-10-09 15:48         ` Re: Alex Deucher
2020-09-15  2:40 Dave Airlie
2020-09-15  7:53 ` Christian König
     [not found] <86d0ec$ae4ffc@fmsmga001.fm.intel.com>
2020-02-26 12:08 ` Re: Linus Walleij
2020-02-26 14:34   ` Re: Ville Syrjälä
2020-02-26 14:56     ` Re: Linus Walleij
2020-02-26 15:08       ` Re: Ville Syrjälä
2018-10-21 16:25 (unknown), Michael Tirado
2018-10-22  0:26 ` Dave Airlie
2018-10-21 20:23   ` Re: Michael Tirado
2018-10-22  1:50     ` Re: Dave Airlie
2018-10-21 22:20       ` Re: Michael Tirado
2018-10-23  1:47       ` Re: Michael Tirado
2018-10-23  6:23         ` Re: Dave Airlie
2018-03-05 17:06 (unknown) Meghana Madhyastha
2018-03-05 19:24 ` Noralf Trønnes
2014-08-24 13:14 (unknown), Christian König
2014-08-24 13:34 ` Mike Lothian
2014-08-25  9:10   ` Re: Christian König
2014-09-07 13:24     ` Re: Markus Trippelsdorf
2014-09-08  3:47       ` Re: Alex Deucher
2014-09-08  7:13         ` Re: Markus Trippelsdorf
2013-08-13  9:56 (unknown), Christian König
2013-08-13 14:47 ` Alex Deucher
2012-05-18 12:27 (unknown), Sascha Hauer
2012-05-22 14:06 ` Lars-Peter Clausen
2012-05-23  8:12   ` Re: Sascha Hauer
2012-05-24  6:31   ` Re: Sascha Hauer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).