* [PATCH 0/6] dma-buf: Add an API for exporting sync files (v10)
@ 2021-05-24 20:59 Jason Ekstrand
  2021-05-24 20:59 ` [PATCH 1/6] dma-buf: add dma_fence_array_for_each (v2) Jason Ekstrand
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Jason Ekstrand @ 2021-05-24 20:59 UTC (permalink / raw)
  To: dri-devel, intel-gfx
  Cc: Daniel Stone, Michel Dänzer, wayland-devel, Jason Ekstrand,
	Dave Airlie, mesa-dev, Christian König

Modern userspace APIs like Vulkan are built on an explicit
synchronization model.  This doesn't always play nicely with the
implicit synchronization used in the kernel and assumed by X11 and
Wayland.  The client -> compositor half of the synchronization isn't too
bad, at least on intel, because we can control whether or not i915
synchronizes on the buffer and whether or not it's considered written.

The harder part is the compositor -> client synchronization when we get
the buffer back from the compositor.  We're required to be able to
provide the client with a VkSemaphore and VkFence representing the point
in time where the window system (compositor and/or display) finished
using the buffer.  With current APIs, it's very hard to do this in such
a way that we don't get confused by the Vulkan driver's access of the
buffer.  In particular, once we tell the kernel that we're rendering to
the buffer again, any CPU waits on the buffer or GPU dependencies will
wait on some of the client rendering and not just the compositor.

This new IOCTL solves this problem by allowing us to get a snapshot of
the implicit synchronization state of a given dma-buf in the form of a
sync file.  It's effectively the same as a poll() or I915_GEM_WAIT
except that, instead of CPU-waiting directly, it encapsulates the wait
operation, at the current moment in time, in a sync_file so we can
check/wait on it later.  As long as the Vulkan driver does the
sync_file export from the dma-buf before we re-introduce it for
rendering, the sync_file will only contain fences from the compositor
or display.  This allows us to accurately turn it into a VkFence or
VkSemaphore without any over-synchronization.
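
For illustration only (not part of the series), the intended userspace
flow with the uapi added in patch 4 looks roughly like the sketch
below; the helper name is made up and error handling is trimmed:

  #include <linux/dma-buf.h>
  #include <sys/ioctl.h>

  /* Hypothetical helper: snapshot every fence currently attached to a
   * dma-buf as a sync_file.  DMA_BUF_SYNC_WRITE asks for all fences
   * (readers and writers); DMA_BUF_SYNC_READ would return only the
   * exclusive (write) fence.
   */
  static int export_dma_buf_fences(int dmabuf_fd)
  {
          struct dma_buf_sync_file arg = { .flags = DMA_BUF_SYNC_WRITE };

          if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &arg) < 0)
                  return -1;

          /* Fences added to the dma-buf after this point are not
           * picked up; turn arg.fd into a VkFence/VkSemaphore or
           * poll() it later.
           */
          return arg.fd;
  }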

This patch series actually contains two new ioctls.  There is the export
one mentioned above as well as an RFC for an import ioctl which provides
the other half.  The intention is to land the export ioctl since it seems
like there's no real disagreement on that one.  The import ioctl, however,
has a lot of debate around it so it's intended to be RFC-only for now.

Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4037
IGT tests: https://patchwork.freedesktop.org/series/90490/

v10 (Jason Ekstrand, Daniel Vetter):
 - Add reviews/acks
 - Add a patch to rename _rcu to _unlocked
 - Split things better so import is clearly RFC status

Cc: Christian König <christian.koenig@amd.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Daniel Stone <daniels@collabora.com>
Cc: mesa-dev@lists.freedesktop.org
Cc: wayland-devel@lists.freedesktop.org
Test-with: 20210524205225.872316-1-jason@jlekstrand.net

Christian König (1):
  dma-buf: add dma_fence_array_for_each (v2)

Jason Ekstrand (5):
  dma-buf: Rename dma_resv helpers from _rcu to _unlocked
  dma-buf: add dma_resv_get_singleton_unlocked (v4)
  dma-buf: Add an API for exporting sync files (v9)
  RFC: dma-buf: Add an extra fence to dma_resv_get_singleton_unlocked
  RFC: dma-buf: Add an API for importing sync files (v6)

 drivers/dma-buf/dma-buf.c                     | 100 ++++++++++++-
 drivers/dma-buf/dma-fence-array.c             |  27 ++++
 drivers/dma-buf/dma-resv.c                    | 140 ++++++++++++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c       |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c        |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |   8 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   2 +-
 drivers/gpu/drm/drm_gem.c                     |   6 +-
 drivers/gpu/drm/drm_gem_atomic_helper.c       |   2 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem.c         |   4 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |   4 +-
 drivers/gpu/drm/i915/display/intel_display.c  |   2 +-
 drivers/gpu/drm/i915/dma_resv_utils.c         |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_busy.c      |   2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_wait.c      |  10 +-
 drivers/gpu/drm/i915/i915_request.c           |   4 +-
 drivers/gpu/drm/i915/i915_sw_fence.c          |   4 +-
 drivers/gpu/drm/msm/msm_gem.c                 |   2 +-
 drivers/gpu/drm/nouveau/dispnv50/wndw.c       |   2 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c         |   2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c       |   2 +-
 drivers/gpu/drm/panfrost/panfrost_job.c       |   2 +-
 drivers/gpu/drm/radeon/radeon_gem.c           |   6 +-
 drivers/gpu/drm/radeon/radeon_mn.c            |   2 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  |  12 +-
 drivers/gpu/drm/vgem/vgem_fence.c             |   2 +-
 drivers/gpu/drm/virtio/virtgpu_ioctl.c        |   4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            |   2 +-
 include/linux/dma-fence-array.h               |  17 +++
 include/linux/dma-resv.h                      |  21 +--
 include/uapi/linux/dma-buf.h                  |  25 ++++
 39 files changed, 361 insertions(+), 79 deletions(-)

-- 
2.31.1



* [PATCH 1/6] dma-buf: add dma_fence_array_for_each (v2)
  2021-05-24 20:59 [PATCH 0/6] dma-buf: Add an API for exporting sync files (v10) Jason Ekstrand
@ 2021-05-24 20:59 ` Jason Ekstrand
  2021-05-24 20:59 ` [PATCH 2/6] dma-buf: Rename dma_resv helpers from _rcu to _unlocked Jason Ekstrand
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Jason Ekstrand @ 2021-05-24 20:59 UTC (permalink / raw)
  To: dri-devel, intel-gfx
  Cc: Christian König, Jason Ekstrand

From: Christian König <ckoenig.leichtzumerken@gmail.com>

Add a helper to iterate over all fences in a dma_fence_array object.

v2 (Jason Ekstrand)
 - Return NULL from dma_fence_array_first if head == NULL.  This matches
   the iterator behavior of dma_fence_chain_for_each in that it iterates
   zero times if head == NULL.
 - Return NULL from dma_fence_array_next if index >= array->num_fences.
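
For illustration only (not part of the patch), the iterator flattens a
possible dma_fence_array transparently; this sketch assumes
<linux/dma-fence-array.h> is included:

  static void log_pending_fences(struct dma_fence *f)
  {
          struct dma_fence *fence;
          unsigned int index;

          /* Visits array->fences[] if @f is a dma_fence_array,
           * otherwise visits @f exactly once; iterates zero times if
           * @f is NULL.
           */
          dma_fence_array_for_each(fence, index, f) {
                  if (!dma_fence_is_signaled(fence))
                          pr_info("fence %llu#%llu still pending\n",
                                  fence->context, fence->seqno);
          }
  }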

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-fence-array.c | 27 +++++++++++++++++++++++++++
 include/linux/dma-fence-array.h   | 17 +++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
index d3fbd950be944..2ac1afc697d0f 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -201,3 +201,30 @@ bool dma_fence_match_context(struct dma_fence *fence, u64 context)
 	return true;
 }
 EXPORT_SYMBOL(dma_fence_match_context);
+
+struct dma_fence *dma_fence_array_first(struct dma_fence *head)
+{
+	struct dma_fence_array *array;
+
+	if (!head)
+		return NULL;
+
+	array = to_dma_fence_array(head);
+	if (!array)
+		return head;
+
+	return array->fences[0];
+}
+EXPORT_SYMBOL(dma_fence_array_first);
+
+struct dma_fence *dma_fence_array_next(struct dma_fence *head,
+				       unsigned int index)
+{
+	struct dma_fence_array *array = to_dma_fence_array(head);
+
+	if (!array || index >= array->num_fences)
+		return NULL;
+
+	return array->fences[index];
+}
+EXPORT_SYMBOL(dma_fence_array_next);
diff --git a/include/linux/dma-fence-array.h b/include/linux/dma-fence-array.h
index 303dd712220fd..588ac8089dd61 100644
--- a/include/linux/dma-fence-array.h
+++ b/include/linux/dma-fence-array.h
@@ -74,6 +74,19 @@ to_dma_fence_array(struct dma_fence *fence)
 	return container_of(fence, struct dma_fence_array, base);
 }
 
+/**
+ * dma_fence_array_for_each - iterate over all fences in array
+ * @fence: current fence
+ * @index: index into the array
+ * @head: potential dma_fence_array object
+ *
+ * Test if @head is a dma_fence_array object and if yes iterate over all fences
+ * in the array. If not, just iterate over the fence in @head itself.
+ */
+#define dma_fence_array_for_each(fence, index, head)			\
+	for (index = 0, fence = dma_fence_array_first(head); fence;	\
+	     ++(index), fence = dma_fence_array_next(head, index))
+
 struct dma_fence_array *dma_fence_array_create(int num_fences,
 					       struct dma_fence **fences,
 					       u64 context, unsigned seqno,
@@ -81,4 +94,8 @@ struct dma_fence_array *dma_fence_array_create(int num_fences,
 
 bool dma_fence_match_context(struct dma_fence *fence, u64 context);
 
+struct dma_fence *dma_fence_array_first(struct dma_fence *head);
+struct dma_fence *dma_fence_array_next(struct dma_fence *head,
+				       unsigned int index);
+
 #endif /* __LINUX_DMA_FENCE_ARRAY_H */
-- 
2.31.1



* [PATCH 2/6] dma-buf: Rename dma_resv helpers from _rcu to _unlocked
  2021-05-24 20:59 [PATCH 0/6] dma-buf: Add an API for exporting sync files (v10) Jason Ekstrand
  2021-05-24 20:59 ` [PATCH 1/6] dma-buf: add dma_fence_array_for_each (v2) Jason Ekstrand
@ 2021-05-24 20:59 ` Jason Ekstrand
  2021-05-25 14:57   ` Daniel Vetter
  2021-05-24 20:59 ` [PATCH 3/6] dma-buf: add dma_resv_get_singleton_unlocked (v4) Jason Ekstrand
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Jason Ekstrand @ 2021-05-24 20:59 UTC (permalink / raw)
  To: dri-devel, intel-gfx; +Cc: Daniel Vetter, Jason Ekstrand

None of these helpers actually leak any RCU details to the caller.  They
all assume you have a genuine reference, take the RCU read lock, and
retry if needed.  Naming them with an _rcu suffix is likely to cause
callers more panic than needed.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/dma-buf/dma-buf.c                     |  2 +-
 drivers/dma-buf/dma-resv.c                    | 28 +++++++++----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c       |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c        |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  8 +++---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 +-
 drivers/gpu/drm/drm_gem.c                     |  6 ++--
 drivers/gpu/drm/drm_gem_atomic_helper.c       |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem.c         |  4 +--
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |  4 +--
 drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
 drivers/gpu/drm/i915/dma_resv_utils.c         |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_busy.c      |  2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_wait.c      | 10 +++----
 drivers/gpu/drm/i915/i915_request.c           |  4 +--
 drivers/gpu/drm/i915/i915_sw_fence.c          |  4 +--
 drivers/gpu/drm/msm/msm_gem.c                 |  2 +-
 drivers/gpu/drm/nouveau/dispnv50/wndw.c       |  2 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c         |  2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c       |  2 +-
 drivers/gpu/drm/panfrost/panfrost_job.c       |  2 +-
 drivers/gpu/drm/radeon/radeon_gem.c           |  6 ++--
 drivers/gpu/drm/radeon/radeon_mn.c            |  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  | 12 ++++----
 drivers/gpu/drm/vgem/vgem_fence.c             |  2 +-
 drivers/gpu/drm/virtio/virtgpu_ioctl.c        |  4 +--
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            |  2 +-
 include/linux/dma-resv.h                      | 18 ++++++------
 36 files changed, 79 insertions(+), 79 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index f264b70c383eb..d4529aa9d1a5a 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1147,7 +1147,7 @@ static int __dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
 	long ret;
 
 	/* Wait on any implicit rendering fences */
-	ret = dma_resv_wait_timeout_rcu(resv, write, true,
+	ret = dma_resv_wait_timeout_unlocked(resv, write, true,
 						  MAX_SCHEDULE_TIMEOUT);
 	if (ret < 0)
 		return ret;
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 6ddbeb5dfbf65..d6f1ed4cd4d55 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -417,7 +417,7 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
 EXPORT_SYMBOL(dma_resv_copy_fences);
 
 /**
- * dma_resv_get_fences_rcu - Get an object's shared and exclusive
+ * dma_resv_get_fences_unlocked - Get an object's shared and exclusive
  * fences without update side lock held
  * @obj: the reservation object
  * @pfence_excl: the returned exclusive fence (or NULL)
@@ -429,10 +429,10 @@ EXPORT_SYMBOL(dma_resv_copy_fences);
  * exclusive fence is not specified the fence is put into the array of the
  * shared fences as well. Returns either zero or -ENOMEM.
  */
-int dma_resv_get_fences_rcu(struct dma_resv *obj,
-			    struct dma_fence **pfence_excl,
-			    unsigned *pshared_count,
-			    struct dma_fence ***pshared)
+int dma_resv_get_fences_unlocked(struct dma_resv *obj,
+				 struct dma_fence **pfence_excl,
+				 unsigned *pshared_count,
+				 struct dma_fence ***pshared)
 {
 	struct dma_fence **shared = NULL;
 	struct dma_fence *fence_excl;
@@ -515,10 +515,10 @@ int dma_resv_get_fences_rcu(struct dma_resv *obj,
 	*pshared = shared;
 	return ret;
 }
-EXPORT_SYMBOL_GPL(dma_resv_get_fences_rcu);
+EXPORT_SYMBOL_GPL(dma_resv_get_fences_unlocked);
 
 /**
- * dma_resv_wait_timeout_rcu - Wait on reservation's objects
+ * dma_resv_wait_timeout_unlocked - Wait on reservation's objects
  * shared and/or exclusive fences.
  * @obj: the reservation object
  * @wait_all: if true, wait on all fences, else wait on just exclusive fence
@@ -529,9 +529,9 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences_rcu);
  * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
  * greater than zero on success.
  */
-long dma_resv_wait_timeout_rcu(struct dma_resv *obj,
-			       bool wait_all, bool intr,
-			       unsigned long timeout)
+long dma_resv_wait_timeout_unlocked(struct dma_resv *obj,
+				    bool wait_all, bool intr,
+				    unsigned long timeout)
 {
 	struct dma_fence *fence;
 	unsigned seq, shared_count;
@@ -602,7 +602,7 @@ long dma_resv_wait_timeout_rcu(struct dma_resv *obj,
 	rcu_read_unlock();
 	goto retry;
 }
-EXPORT_SYMBOL_GPL(dma_resv_wait_timeout_rcu);
+EXPORT_SYMBOL_GPL(dma_resv_wait_timeout_unlocked);
 
 
 static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
@@ -622,7 +622,7 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
 }
 
 /**
- * dma_resv_test_signaled_rcu - Test if a reservation object's
+ * dma_resv_test_signaled_unlocked - Test if a reservation object's
  * fences have been signaled.
  * @obj: the reservation object
  * @test_all: if true, test all fences, otherwise only test the exclusive
@@ -631,7 +631,7 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
  * RETURNS
  * true if all fences signaled, else false
  */
-bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
+bool dma_resv_test_signaled_unlocked(struct dma_resv *obj, bool test_all)
 {
 	unsigned seq, shared_count;
 	int ret;
@@ -680,4 +680,4 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
 	rcu_read_unlock();
 	return ret;
 }
-EXPORT_SYMBOL_GPL(dma_resv_test_signaled_rcu);
+EXPORT_SYMBOL_GPL(dma_resv_test_signaled_unlocked);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 8a1fb8b6606e5..3b0df434e0ca3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -203,7 +203,7 @@ int amdgpu_display_crtc_page_flip_target(struct drm_crtc *crtc,
 		goto unpin;
 	}
 
-	r = dma_resv_get_fences_rcu(new_abo->tbo.base.resv, &work->excl,
+	r = dma_resv_get_fences_unlocked(new_abo->tbo.base.resv, &work->excl,
 					      &work->shared_count,
 					      &work->shared);
 	if (unlikely(r != 0)) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index baa980a477d94..0d0319bc51577 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -98,7 +98,7 @@ __dma_resv_make_exclusive(struct dma_resv *obj)
 	if (!dma_resv_get_list(obj)) /* no shared fences to convert */
 		return 0;
 
-	r = dma_resv_get_fences_rcu(obj, NULL, &count, &fences);
+	r = dma_resv_get_fences_unlocked(obj, NULL, &count, &fences);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 18974bd081f00..a71f98ae1d72f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -471,7 +471,7 @@ int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 	}
 	robj = gem_to_amdgpu_bo(gobj);
-	ret = dma_resv_wait_timeout_rcu(robj->tbo.base.resv, true, true,
+	ret = dma_resv_wait_timeout_unlocked(robj->tbo.base.resv, true, true,
 						  timeout);
 
 	/* ret == 0 means not signaled,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index b4971e90b98cf..7045b104a33ae 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -112,7 +112,7 @@ void amdgpu_pasid_free_delayed(struct dma_resv *resv,
 	unsigned count;
 	int r;
 
-	r = dma_resv_get_fences_rcu(resv, NULL, &count, &fences);
+	r = dma_resv_get_fences_unlocked(resv, NULL, &count, &fences);
 	if (r)
 		goto fallback;
 
@@ -156,7 +156,7 @@ void amdgpu_pasid_free_delayed(struct dma_resv *resv,
 	/* Not enough memory for the delayed delete, as last resort
 	 * block for all the fences to complete.
 	 */
-	dma_resv_wait_timeout_rcu(resv, true, false,
+	dma_resv_wait_timeout_unlocked(resv, true, false,
 					    MAX_SCHEDULE_TIMEOUT);
 	amdgpu_pasid_free(pasid);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 828b5167ff128..58fb1de81c0c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -75,7 +75,7 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_interval_notifier *mni,
 
 	mmu_interval_set_seq(mni, cur_seq);
 
-	r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv, true, false,
+	r = dma_resv_wait_timeout_unlocked(bo->tbo.base.resv, true, false,
 				      MAX_SCHEDULE_TIMEOUT);
 	mutex_unlock(&adev->notifier_lock);
 	if (r <= 0)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 0adffcace3263..81db9ea391c1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -791,7 +791,7 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
 		return 0;
 	}
 
-	r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv, false, false,
+	r = dma_resv_wait_timeout_unlocked(bo->tbo.base.resv, false, false,
 						MAX_SCHEDULE_TIMEOUT);
 	if (r < 0)
 		return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index c6dbc08016045..af7b667d3226d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -1115,7 +1115,7 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
 	ib->length_dw = 16;
 
 	if (direct) {
-		r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv,
+		r = dma_resv_wait_timeout_unlocked(bo->tbo.base.resv,
 							true, false,
 							msecs_to_jiffies(10));
 		if (r == 0)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 4a3e3f72e1277..33dbe3fcaf706 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2007,13 +2007,13 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	unsigned i, shared_count;
 	int r;
 
-	r = dma_resv_get_fences_rcu(resv, &excl,
+	r = dma_resv_get_fences_unlocked(resv, &excl,
 					      &shared_count, &shared);
 	if (r) {
 		/* Not enough memory to grab the fence list, as last resort
 		 * block for all the fences to complete.
 		 */
-		dma_resv_wait_timeout_rcu(resv, true, false,
+		dma_resv_wait_timeout_unlocked(resv, true, false,
 						    MAX_SCHEDULE_TIMEOUT);
 		return;
 	}
@@ -2625,7 +2625,7 @@ bool amdgpu_vm_evictable(struct amdgpu_bo *bo)
 		return true;
 
 	/* Don't evict VM page tables while they are busy */
-	if (!dma_resv_test_signaled_rcu(bo->tbo.base.resv, true))
+	if (!dma_resv_test_signaled_unlocked(bo->tbo.base.resv, true))
 		return false;
 
 	/* Try to block ongoing updates */
@@ -2805,7 +2805,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
  */
 long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout)
 {
-	timeout = dma_resv_wait_timeout_rcu(vm->root.base.bo->tbo.base.resv,
+	timeout = dma_resv_wait_timeout_unlocked(vm->root.base.bo->tbo.base.resv,
 					    true, true, timeout);
 	if (timeout <= 0)
 		return timeout;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 9ca517b658546..e74fef044b301 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8276,7 +8276,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 		 * deadlock during GPU reset when this fence will not signal
 		 * but we hold reservation lock for the BO.
 		 */
-		r = dma_resv_wait_timeout_rcu(abo->tbo.base.resv, true,
+		r = dma_resv_wait_timeout_unlocked(abo->tbo.base.resv, true,
 							false,
 							msecs_to_jiffies(5000));
 		if (unlikely(r <= 0))
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 9989425e9875a..42a432708c2fe 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -770,7 +770,7 @@ long drm_gem_dma_resv_wait(struct drm_file *filep, u32 handle,
 		return -EINVAL;
 	}
 
-	ret = dma_resv_wait_timeout_rcu(obj->resv, wait_all,
+	ret = dma_resv_wait_timeout_unlocked(obj->resv, wait_all,
 						  true, timeout);
 	if (ret == 0)
 		ret = -ETIME;
@@ -1375,12 +1375,12 @@ int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
 
 	if (!write) {
 		struct dma_fence *fence =
-			dma_resv_get_excl_rcu(obj->resv);
+			dma_resv_get_excl_unlocked(obj->resv);
 
 		return drm_gem_fence_array_add(fence_array, fence);
 	}
 
-	ret = dma_resv_get_fences_rcu(obj->resv, NULL,
+	ret = dma_resv_get_fences_unlocked(obj->resv, NULL,
 						&fence_count, &fences);
 	if (ret || !fence_count)
 		return ret;
diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
index a005c5a0ba46a..a27135084ae5c 100644
--- a/drivers/gpu/drm/drm_gem_atomic_helper.c
+++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
@@ -147,7 +147,7 @@ int drm_gem_plane_helper_prepare_fb(struct drm_plane *plane, struct drm_plane_st
 		return 0;
 
 	obj = drm_gem_fb_get_obj(state->fb, 0);
-	fence = dma_resv_get_excl_rcu(obj->resv);
+	fence = dma_resv_get_excl_unlocked(obj->resv);
 	drm_atomic_set_fence_for_plane(state, fence);
 
 	return 0;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index db69f19ab5bca..b271e00480246 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -390,13 +390,13 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
 	}
 
 	if (op & ETNA_PREP_NOSYNC) {
-		if (!dma_resv_test_signaled_rcu(obj->resv,
+		if (!dma_resv_test_signaled_unlocked(obj->resv,
 							  write))
 			return -EBUSY;
 	} else {
 		unsigned long remain = etnaviv_timeout_to_jiffies(timeout);
 
-		ret = dma_resv_wait_timeout_rcu(obj->resv,
+		ret = dma_resv_wait_timeout_unlocked(obj->resv,
 							  write, true, remain);
 		if (ret <= 0)
 			return ret == 0 ? -ETIMEDOUT : ret;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index d05c359945799..b4ac4c7ab144d 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -189,13 +189,13 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
 			continue;
 
 		if (bo->flags & ETNA_SUBMIT_BO_WRITE) {
-			ret = dma_resv_get_fences_rcu(robj, &bo->excl,
+			ret = dma_resv_get_fences_unlocked(robj, &bo->excl,
 								&bo->nr_shared,
 								&bo->shared);
 			if (ret)
 				return ret;
 		} else {
-			bo->excl = dma_resv_get_excl_rcu(robj);
+			bo->excl = dma_resv_get_excl_unlocked(robj);
 		}
 
 	}
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 422b59ebf6dce..5f0b85a102159 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -11040,7 +11040,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 		if (ret < 0)
 			goto unpin_fb;
 
-		fence = dma_resv_get_excl_rcu(obj->base.resv);
+		fence = dma_resv_get_excl_unlocked(obj->base.resv);
 		if (fence) {
 			add_rps_boost_after_vblank(new_plane_state->hw.crtc,
 						   fence);
diff --git a/drivers/gpu/drm/i915/dma_resv_utils.c b/drivers/gpu/drm/i915/dma_resv_utils.c
index 9e508e7d4629f..bdfc6bf16a4e9 100644
--- a/drivers/gpu/drm/i915/dma_resv_utils.c
+++ b/drivers/gpu/drm/i915/dma_resv_utils.c
@@ -10,7 +10,7 @@
 void dma_resv_prune(struct dma_resv *resv)
 {
 	if (dma_resv_trylock(resv)) {
-		if (dma_resv_test_signaled_rcu(resv, true))
+		if (dma_resv_test_signaled_unlocked(resv, true))
 			dma_resv_add_excl_fence(resv, NULL);
 		dma_resv_unlock(resv);
 	}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
index 25235ef630c10..754ad6d1bace9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -105,7 +105,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 	 * Alternatively, we can trade that extra information on read/write
 	 * activity with
 	 *	args->busy =
-	 *		!dma_resv_test_signaled_rcu(obj->resv, true);
+	 *		!dma_resv_test_signaled_unlocked(obj->resv, true);
 	 * to report the overall busyness. This is what the wait-ioctl does.
 	 *
 	 */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 297143511f99b..e8f323564e57b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1481,7 +1481,7 @@ static inline bool use_reloc_gpu(struct i915_vma *vma)
 	if (DBG_FORCE_RELOC)
 		return false;
 
-	return !dma_resv_test_signaled_rcu(vma->resv, true);
+	return !dma_resv_test_signaled_unlocked(vma->resv, true);
 }
 
 static unsigned long vma_phys_addr(struct i915_vma *vma, u32 offset)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 2ebd79537aea9..7c0eb425cb3b3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -500,7 +500,7 @@ i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj)
 	struct dma_fence *fence;
 
 	rcu_read_lock();
-	fence = dma_resv_get_excl_rcu(obj->base.resv);
+	fence = dma_resv_get_excl_unlocked(obj->base.resv);
 	rcu_read_unlock();
 
 	if (fence && dma_fence_is_i915(fence) && !dma_fence_is_signaled(fence))
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index a657b99ec7606..bb5f44ed932aa 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -85,7 +85,7 @@ static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
 		return true;
 
 	/* we will unbind on next submission, still have userptr pins */
-	r = dma_resv_wait_timeout_rcu(obj->base.resv, true, false,
+	r = dma_resv_wait_timeout_unlocked(obj->base.resv, true, false,
 				      MAX_SCHEDULE_TIMEOUT);
 	if (r <= 0)
 		drm_err(&i915->drm, "(%ld) failed to wait for idle\n", r);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index 4b9856d5ba14f..5b6c52659ad4d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -45,7 +45,7 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
 		unsigned int count, i;
 		int ret;
 
-		ret = dma_resv_get_fences_rcu(resv, &excl, &count, &shared);
+		ret = dma_resv_get_fences_unlocked(resv, &excl, &count, &shared);
 		if (ret)
 			return ret;
 
@@ -73,7 +73,7 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
 		 */
 		prune_fences = count && timeout >= 0;
 	} else {
-		excl = dma_resv_get_excl_rcu(resv);
+		excl = dma_resv_get_excl_unlocked(resv);
 	}
 
 	if (excl && timeout >= 0)
@@ -158,8 +158,8 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 		unsigned int count, i;
 		int ret;
 
-		ret = dma_resv_get_fences_rcu(obj->base.resv,
-					      &excl, &count, &shared);
+		ret = dma_resv_get_fences_unlocked(obj->base.resv,
+						   &excl, &count, &shared);
 		if (ret)
 			return ret;
 
@@ -170,7 +170,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
 
 		kfree(shared);
 	} else {
-		excl = dma_resv_get_excl_rcu(obj->base.resv);
+		excl = dma_resv_get_excl_unlocked(obj->base.resv);
 	}
 
 	if (excl) {
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 970d8f4986bbe..d101d702fbadc 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1594,7 +1594,7 @@ i915_request_await_object(struct i915_request *to,
 		struct dma_fence **shared;
 		unsigned int count, i;
 
-		ret = dma_resv_get_fences_rcu(obj->base.resv,
+		ret = dma_resv_get_fences_unlocked(obj->base.resv,
 							&excl, &count, &shared);
 		if (ret)
 			return ret;
@@ -1611,7 +1611,7 @@ i915_request_await_object(struct i915_request *to,
 			dma_fence_put(shared[i]);
 		kfree(shared);
 	} else {
-		excl = dma_resv_get_excl_rcu(obj->base.resv);
+		excl = dma_resv_get_excl_unlocked(obj->base.resv);
 	}
 
 	if (excl) {
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 2744558f30507..0bcb7ea44201e 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -582,7 +582,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 		struct dma_fence **shared;
 		unsigned int count, i;
 
-		ret = dma_resv_get_fences_rcu(resv, &excl, &count, &shared);
+		ret = dma_resv_get_fences_unlocked(resv, &excl, &count, &shared);
 		if (ret)
 			return ret;
 
@@ -606,7 +606,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 			dma_fence_put(shared[i]);
 		kfree(shared);
 	} else {
-		excl = dma_resv_get_excl_rcu(resv);
+		excl = dma_resv_get_excl_unlocked(resv);
 	}
 
 	if (ret >= 0 && excl && excl->ops != exclude) {
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 56df86e5f7400..f8afb081770e4 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -915,7 +915,7 @@ int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout)
 		op & MSM_PREP_NOSYNC ? 0 : timeout_to_jiffies(timeout);
 	long ret;
 
-	ret = dma_resv_wait_timeout_rcu(obj->resv, write,
+	ret = dma_resv_wait_timeout_unlocked(obj->resv, write,
 						  true,  remain);
 	if (ret == 0)
 		return remain == 0 ? -EBUSY : -ETIMEDOUT;
diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
index 0cb1f9d848d3e..8d048bacd6f02 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
@@ -561,7 +561,7 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct drm_plane_state *state)
 			asyw->image.handle[0] = ctxdma->object.handle;
 	}
 
-	asyw->state.fence = dma_resv_get_excl_rcu(nvbo->bo.base.resv);
+	asyw->state.fence = dma_resv_get_excl_unlocked(nvbo->bo.base.resv);
 	asyw->image.offset[0] = nvbo->offset;
 
 	if (wndw->func->prepare) {
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index a70e82413fa75..06ea1fed02467 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -928,7 +928,7 @@ nouveau_gem_ioctl_cpu_prep(struct drm_device *dev, void *data,
 		return -ENOENT;
 	nvbo = nouveau_gem_object(gem);
 
-	lret = dma_resv_wait_timeout_rcu(nvbo->bo.base.resv, write, true,
+	lret = dma_resv_wait_timeout_unlocked(nvbo->bo.base.resv, write, true,
 						   no_wait ? 0 : 30 * HZ);
 	if (!lret)
 		ret = -EBUSY;
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index ca07098a61419..53e1842fe8bf8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -311,7 +311,7 @@ panfrost_ioctl_wait_bo(struct drm_device *dev, void *data,
 	if (!gem_obj)
 		return -ENOENT;
 
-	ret = dma_resv_wait_timeout_rcu(gem_obj->resv, true,
+	ret = dma_resv_wait_timeout_unlocked(gem_obj->resv, true,
 						  true, timeout);
 	if (!ret)
 		ret = timeout ? -ETIMEDOUT : -EBUSY;
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 6003cfeb13221..2df3e999a38d0 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -203,7 +203,7 @@ static void panfrost_acquire_object_fences(struct drm_gem_object **bos,
 	int i;
 
 	for (i = 0; i < bo_count; i++)
-		implicit_fences[i] = dma_resv_get_excl_rcu(bos[i]->resv);
+		implicit_fences[i] = dma_resv_get_excl_unlocked(bos[i]->resv);
 }
 
 static void panfrost_attach_object_fences(struct drm_gem_object **bos,
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 05ea2f39f6261..1a38b0bf36d11 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -125,7 +125,7 @@ static int radeon_gem_set_domain(struct drm_gem_object *gobj,
 	}
 	if (domain == RADEON_GEM_DOMAIN_CPU) {
 		/* Asking for cpu access wait for object idle */
-		r = dma_resv_wait_timeout_rcu(robj->tbo.base.resv, true, true, 30 * HZ);
+		r = dma_resv_wait_timeout_unlocked(robj->tbo.base.resv, true, true, 30 * HZ);
 		if (!r)
 			r = -EBUSY;
 
@@ -474,7 +474,7 @@ int radeon_gem_busy_ioctl(struct drm_device *dev, void *data,
 	}
 	robj = gem_to_radeon_bo(gobj);
 
-	r = dma_resv_test_signaled_rcu(robj->tbo.base.resv, true);
+	r = dma_resv_test_signaled_unlocked(robj->tbo.base.resv, true);
 	if (r == 0)
 		r = -EBUSY;
 	else
@@ -503,7 +503,7 @@ int radeon_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
 	}
 	robj = gem_to_radeon_bo(gobj);
 
-	ret = dma_resv_wait_timeout_rcu(robj->tbo.base.resv, true, true, 30 * HZ);
+	ret = dma_resv_wait_timeout_unlocked(robj->tbo.base.resv, true, true, 30 * HZ);
 	if (ret == 0)
 		r = -EBUSY;
 	else if (ret < 0)
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index e37c9a57a7c36..8016c9e568684 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -66,7 +66,7 @@ static bool radeon_mn_invalidate(struct mmu_interval_notifier *mn,
 		return true;
 	}
 
-	r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv, true, false,
+	r = dma_resv_wait_timeout_unlocked(bo->tbo.base.resv, true, false,
 				      MAX_SCHEDULE_TIMEOUT);
 	if (r <= 0)
 		DRM_ERROR("(%ld) failed to wait for user bo\n", r);
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index ca1b098b6a561..6925de3f179e8 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -294,7 +294,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 	struct dma_resv *resv = &bo->base._resv;
 	int ret;
 
-	if (dma_resv_test_signaled_rcu(resv, true))
+	if (dma_resv_test_signaled_unlocked(resv, true))
 		ret = 0;
 	else
 		ret = -EBUSY;
@@ -306,7 +306,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
 			dma_resv_unlock(bo->base.resv);
 		spin_unlock(&bo->bdev->lru_lock);
 
-		lret = dma_resv_wait_timeout_rcu(resv, true, interruptible,
+		lret = dma_resv_wait_timeout_unlocked(resv, true, interruptible,
 						 30 * HZ);
 
 		if (lret < 0)
@@ -409,7 +409,7 @@ static void ttm_bo_release(struct kref *kref)
 			/* Last resort, if we fail to allocate memory for the
 			 * fences block for the BO to become idle
 			 */
-			dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
+			dma_resv_wait_timeout_unlocked(bo->base.resv, true, false,
 						  30 * HZ);
 		}
 
@@ -420,7 +420,7 @@ static void ttm_bo_release(struct kref *kref)
 		ttm_mem_io_free(bdev, &bo->mem);
 	}
 
-	if (!dma_resv_test_signaled_rcu(bo->base.resv, true) ||
+	if (!dma_resv_test_signaled_unlocked(bo->base.resv, true) ||
 	    !dma_resv_trylock(bo->base.resv)) {
 		/* The BO is not idle, resurrect it for delayed destroy */
 		ttm_bo_flush_all_fences(bo);
@@ -1116,13 +1116,13 @@ int ttm_bo_wait(struct ttm_buffer_object *bo,
 	long timeout = 15 * HZ;
 
 	if (no_wait) {
-		if (dma_resv_test_signaled_rcu(bo->base.resv, true))
+		if (dma_resv_test_signaled_unlocked(bo->base.resv, true))
 			return 0;
 		else
 			return -EBUSY;
 	}
 
-	timeout = dma_resv_wait_timeout_rcu(bo->base.resv, true,
+	timeout = dma_resv_wait_timeout_unlocked(bo->base.resv, true,
 						      interruptible, timeout);
 	if (timeout < 0)
 		return timeout;
diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
index 2902dc6e64faf..110927edb9df7 100644
--- a/drivers/gpu/drm/vgem/vgem_fence.c
+++ b/drivers/gpu/drm/vgem/vgem_fence.c
@@ -151,7 +151,7 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
 
 	/* Check for a conflicting fence */
 	resv = obj->resv;
-	if (!dma_resv_test_signaled_rcu(resv,
+	if (!dma_resv_test_signaled_unlocked(resv,
 						  arg->flags & VGEM_FENCE_WRITE)) {
 		ret = -EBUSY;
 		goto err_fence;
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index 669f2ee395154..763a51686819c 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -451,9 +451,9 @@ static int virtio_gpu_wait_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 
 	if (args->flags & VIRTGPU_WAIT_NOWAIT) {
-		ret = dma_resv_test_signaled_rcu(obj->resv, true);
+		ret = dma_resv_test_signaled_unlocked(obj->resv, true);
 	} else {
-		ret = dma_resv_wait_timeout_rcu(obj->resv, true, true,
+		ret = dma_resv_wait_timeout_unlocked(obj->resv, true, true,
 						timeout);
 	}
 	if (ret == 0)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index 04dd49c4c2572..19e1ce23842a9 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -743,7 +743,7 @@ static int vmw_user_bo_synccpu_grab(struct vmw_user_buffer_object *user_bo,
 	if (flags & drm_vmw_synccpu_allow_cs) {
 		long lret;
 
-		lret = dma_resv_wait_timeout_rcu
+		lret = dma_resv_wait_timeout_unlocked
 			(bo->base.resv, true, true,
 			 nonblock ? 0 : MAX_SCHEDULE_TIMEOUT);
 		if (!lret)
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index d44a77e8a7e34..99926680c3964 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -246,7 +246,7 @@ dma_resv_get_excl(struct dma_resv *obj)
 }
 
 /**
- * dma_resv_get_excl_rcu - get the reservation object's
+ * dma_resv_get_excl_unlocked - get the reservation object's
  * exclusive fence, without lock held.
  * @obj: the reservation object
  *
@@ -257,7 +257,7 @@ dma_resv_get_excl(struct dma_resv *obj)
  * The exclusive fence or NULL if none
  */
 static inline struct dma_fence *
-dma_resv_get_excl_rcu(struct dma_resv *obj)
+dma_resv_get_excl_unlocked(struct dma_resv *obj)
 {
 	struct dma_fence *fence;
 
@@ -278,16 +278,16 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
 
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
 
-int dma_resv_get_fences_rcu(struct dma_resv *obj,
-			    struct dma_fence **pfence_excl,
-			    unsigned *pshared_count,
-			    struct dma_fence ***pshared);
+int dma_resv_get_fences_unlocked(struct dma_resv *obj,
+				 struct dma_fence **pfence_excl,
+				 unsigned *pshared_count,
+				 struct dma_fence ***pshared);
 
 int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
 
-long dma_resv_wait_timeout_rcu(struct dma_resv *obj, bool wait_all, bool intr,
-			       unsigned long timeout);
+long dma_resv_wait_timeout_unlocked(struct dma_resv *obj, bool wait_all, bool intr,
+				    unsigned long timeout);
 
-bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all);
+bool dma_resv_test_signaled_unlocked(struct dma_resv *obj, bool test_all);
 
 #endif /* _LINUX_RESERVATION_H */
-- 
2.31.1



* [PATCH 3/6] dma-buf: add dma_resv_get_singleton_unlocked (v4)
  2021-05-24 20:59 [PATCH 0/6] dma-buf: Add an API for exporting sync files (v10) Jason Ekstrand
  2021-05-24 20:59 ` [PATCH 1/6] dma-buf: add dma_fence_array_for_each (v2) Jason Ekstrand
  2021-05-24 20:59 ` [PATCH 2/6] dma-buf: Rename dma_resv helpers from _rcu to _unlocked Jason Ekstrand
@ 2021-05-24 20:59 ` Jason Ekstrand
  2021-05-24 20:59 ` [PATCH 4/6] dma-buf: Add an API for exporting sync files (v9) Jason Ekstrand
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Jason Ekstrand @ 2021-05-24 20:59 UTC (permalink / raw)
  To: dri-devel, intel-gfx; +Cc: Daniel Vetter, Jason Ekstrand

Add a helper function to get a single fence representing
all fences in a dma_resv object.

This fence is either the only one in the object, or all of the
object's unsignaled fences flattened out into a dma_fence_array.
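
For illustration only (not part of the patch), a caller can use it to
collapse a dma_resv into one waitable fence; the return value of
dma_fence_wait() is ignored for brevity:

  static int wait_resv_idle(struct dma_resv *resv)
  {
          struct dma_fence *fence;

          fence = dma_resv_get_singleton_unlocked(resv);
          if (IS_ERR(fence))
                  return PTR_ERR(fence);

          if (fence) {
                  /* One fence (possibly a dma_fence_array) covering
                   * everything unsignaled at snapshot time.
                   */
                  dma_fence_wait(fence, true);
                  dma_fence_put(fence);
          }
          return 0;
  }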

v2 (Jason Ekstrand):
 - Take reference of fences both for creating the dma_fence_array and in
   the case where we return one fence.
 - Handle the case where dma_resv_get_list() returns NULL

v3 (Jason Ekstrand):
 - Add an _rcu suffix because it is read-only
 - Rewrite to use dma_resv_get_fences_rcu so it's RCU-safe
 - Add an EXPORT_SYMBOL_GPL declaration
 - Re-author the patch to Jason since very little is left of Christian
   König's original patch
 - Remove the extra fence argument

v4 (Jason Ekstrand):
 - Restore the extra fence argument

v5 (Daniel Vetter):
 - Rename from _rcu to _unlocked since it doesn't leak RCU details to
   the caller
 - Fix docs
 - Use ERR_PTR for error handling rather than an output dma_fence**

v6 (Jason Ekstrand):
 - Drop the extra fence param and leave that to a separate patch

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/dma-buf/dma-resv.c | 93 ++++++++++++++++++++++++++++++++++++++
 include/linux/dma-resv.h   |  2 +
 2 files changed, 95 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index d6f1ed4cd4d55..312a3a59dac6a 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -33,6 +33,8 @@
  */
 
 #include <linux/dma-resv.h>
+#include <linux/dma-fence-chain.h>
+#include <linux/dma-fence-array.h>
 #include <linux/export.h>
 #include <linux/mm.h>
 #include <linux/sched/mm.h>
@@ -49,6 +51,11 @@
  * write-side updates.
  */
 
+/* deep dive into the fence containers */
+#define dma_fence_deep_dive_for_each(fence, chain, index, head)	\
+	dma_fence_chain_for_each(chain, head)			\
+		dma_fence_array_for_each(fence, index, chain)
+
 DEFINE_WD_CLASS(reservation_ww_class);
 EXPORT_SYMBOL(reservation_ww_class);
 
@@ -517,6 +524,92 @@ int dma_resv_get_fences_unlocked(struct dma_resv *obj,
 }
 EXPORT_SYMBOL_GPL(dma_resv_get_fences_unlocked);
 
+/**
+ * dma_resv_get_singleton_unlocked - get a single fence for the dma_resv object
+ * @obj: the reservation object
+ *
+ * Get a single fence representing all unsignaled fences in the dma_resv
+ * object.  If we got only one fence return a new reference to that,
+ * otherwise return a dma_fence_array object.
+ *
+ * RETURNS
+ * The fence, NULL if there is nothing to wait on, or an ERR_PTR on error.
+ */
+struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj)
+{
+	struct dma_fence *result, **resv_fences, *fence, *chain, **fences;
+	struct dma_fence_array *array;
+	unsigned int num_resv_fences, num_fences;
+	unsigned int i, j;
+	int err;
+
+	err = dma_resv_get_fences_unlocked(obj, NULL, &num_resv_fences, &resv_fences);
+	if (err)
+		return ERR_PTR(err);
+
+	if (num_resv_fences == 0)
+		return NULL;
+
+	num_fences = 0;
+	result = NULL;
+
+	for (i = 0; i < num_resv_fences; ++i) {
+		dma_fence_deep_dive_for_each(fence, chain, j, resv_fences[i]) {
+			if (dma_fence_is_signaled(fence))
+				continue;
+
+			result = fence;
+			++num_fences;
+		}
+	}
+
+	if (num_fences <= 1) {
+		result = dma_fence_get(result);
+		goto put_resv_fences;
+	}
+
+	fences = kmalloc_array(num_fences, sizeof(struct dma_fence*),
+			       GFP_KERNEL);
+	if (!fences) {
+		result = ERR_PTR(-ENOMEM);
+		goto put_resv_fences;
+	}
+
+	num_fences = 0;
+	for (i = 0; i < num_resv_fences; ++i) {
+		dma_fence_deep_dive_for_each(fence, chain, j, resv_fences[i]) {
+			if (!dma_fence_is_signaled(fence))
+				fences[num_fences++] = dma_fence_get(fence);
+		}
+	}
+
+	if (num_fences <= 1) {
+		result = num_fences ? fences[0] : NULL;
+		kfree(fences);
+		goto put_resv_fences;
+	}
+
+	array = dma_fence_array_create(num_fences, fences,
+				       dma_fence_context_alloc(1),
+				       1, false);
+	if (array) {
+		result = &array->base;
+	} else {
+		result = ERR_PTR(-ENOMEM);
+		while (num_fences--)
+			dma_fence_put(fences[num_fences]);
+		kfree(fences);
+	}
+
+put_resv_fences:
+	while (num_resv_fences--)
+		dma_fence_put(resv_fences[num_resv_fences]);
+	kfree(resv_fences);
+
+	return result;
+}
+EXPORT_SYMBOL_GPL(dma_resv_get_singleton_unlocked);
+
 /**
  * dma_resv_wait_timeout_unlocked - Wait on reservation's objects
  * shared and/or exclusive fences.
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 99926680c3964..c529ccee94bc5 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -285,6 +285,8 @@ int dma_resv_get_fences_unlocked(struct dma_resv *obj,
 
 int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
 
+struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj);
+
 long dma_resv_wait_timeout_unlocked(struct dma_resv *obj, bool wait_all, bool intr,
 				    unsigned long timeout);
 
-- 
2.31.1



* [PATCH 4/6] dma-buf: Add an API for exporting sync files (v9)
  2021-05-24 20:59 [PATCH 0/6] dma-buf: Add an API for exporting sync files (v10) Jason Ekstrand
                   ` (2 preceding siblings ...)
  2021-05-24 20:59 ` [PATCH 3/6] dma-buf: add dma_resv_get_singleton_unlocked (v4) Jason Ekstrand
@ 2021-05-24 20:59 ` Jason Ekstrand
  2021-05-25 15:08   ` Daniel Vetter
  2021-05-24 20:59 ` [PATCH 5/6] RFC: dma-buf: Add an extra fence to dma_resv_get_singleton_unlocked Jason Ekstrand
  2021-05-24 20:59 ` [PATCH 6/6] RFC: dma-buf: Add an API for importing sync files (v6) Jason Ekstrand
  5 siblings, 1 reply; 13+ messages in thread
From: Jason Ekstrand @ 2021-05-24 20:59 UTC (permalink / raw)
  To: dri-devel, intel-gfx; +Cc: Christian König, Jason Ekstrand

Modern userspace APIs like Vulkan are built on an explicit
synchronization model.  This doesn't always play nicely with the
implicit synchronization used in the kernel and assumed by X11 and
Wayland.  The client -> compositor half of the synchronization isn't too
bad, at least on intel, because we can control whether or not i915
synchronizes on the buffer and whether or not it's considered written.

The harder part is the compositor -> client synchronization when we get
the buffer back from the compositor.  We're required to be able to
provide the client with a VkSemaphore and VkFence representing the point
in time where the window system (compositor and/or display) finished
using the buffer.  With current APIs, it's very hard to do this in such
a way that we don't get confused by the Vulkan driver's access of the
buffer.  In particular, once we tell the kernel that we're rendering to
the buffer again, any CPU waits on the buffer or GPU dependencies will
wait on some of the client rendering and not just the compositor.

This new IOCTL solves this problem by allowing us to get a snapshot of
the implicit synchronization state of a given dma-buf in the form of a
sync file.  It's effectively the same as a poll() or I915_GEM_WAIT
except that, instead of CPU-waiting directly, it encapsulates the wait
operation, at the current moment in time, in a sync_file so we can
check/wait on it later.  As long as the Vulkan driver does the
sync_file export from the dma-buf before we re-introduce it for
rendering, the sync_file will only contain fences from the compositor
or display.  This allows us to accurately turn it into a VkFence or
VkSemaphore without any over-synchronization.
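
For illustration only (not part of the patch), userspace can snapshot
the fences and CPU-wait on that snapshot later, unaffected by rendering
submitted in between; error handling is trimmed:

  #include <linux/dma-buf.h>
  #include <poll.h>
  #include <sys/ioctl.h>
  #include <unistd.h>

  static void wait_on_snapshot(int dmabuf_fd)
  {
          /* READ snapshots just the exclusive (write) fence. */
          struct dma_buf_sync_file arg = { .flags = DMA_BUF_SYNC_READ };
          struct pollfd pfd;

          ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &arg);

          /* ... queue new rendering against the dma-buf here ... */

          /* Waits only on the snapshot, not on the new rendering. */
          pfd = (struct pollfd){ .fd = arg.fd, .events = POLLIN };
          poll(&pfd, 1, -1);
          close(arg.fd);
  }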

v2 (Jason Ekstrand):
 - Use a wrapper dma_fence_array of all fences including the new one
   when importing an exclusive fence.

v3 (Jason Ekstrand):
 - Lock around setting shared fences as well as exclusive
 - Mark SIGNAL_SYNC_FILE as a read-write ioctl.
 - Initialize ret to 0 in dma_buf_wait_sync_file

v4 (Jason Ekstrand):
 - Use the new dma_resv_get_singleton helper

v5 (Jason Ekstrand):
 - Rename the IOCTLs to import/export rather than wait/signal
 - Drop the WRITE flag and always get/set the exclusive fence

v6 (Jason Ekstrand):
 - Drop the sync_file import as it was all-around sketchy and not nearly
   as useful as export.
 - Re-introduce READ/WRITE flag support for export
 - Rework the commit message

v7 (Jason Ekstrand):
 - Require at least one sync flag
 - Fix a refcounting bug: dma_resv_get_excl() doesn't take a reference
 - Use _rcu helpers since we're accessing the dma_resv read-only

v8 (Jason Ekstrand):
 - Return -ENOMEM if the sync_file_create fails
 - Predicate support on IS_ENABLED(CONFIG_SYNC_FILE)

v9 (Jason Ekstrand):
 - Add documentation for the new ioctl

v10 (Jason Ekstrand):
 - Go back to dma_buf_sync_file as the ioctl struct name

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Simon Ser <contact@emersion.fr>
Acked-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-buf.c    | 64 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/dma-buf.h | 24 ++++++++++++++
 2 files changed, 88 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d4529aa9d1a5a..86efe71c0db96 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -20,6 +20,7 @@
 #include <linux/debugfs.h>
 #include <linux/module.h>
 #include <linux/seq_file.h>
+#include <linux/sync_file.h>
 #include <linux/poll.h>
 #include <linux/dma-resv.h>
 #include <linux/mm.h>
@@ -362,6 +363,64 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
 	return ret;
 }
 
+#if IS_ENABLED(CONFIG_SYNC_FILE)
+static long dma_buf_export_sync_file(struct dma_buf *dmabuf,
+				     void __user *user_data)
+{
+	struct dma_buf_sync_file arg;
+	struct dma_fence *fence = NULL;
+	struct sync_file *sync_file;
+	int fd, ret;
+
+	if (copy_from_user(&arg, user_data, sizeof(arg)))
+		return -EFAULT;
+
+	if (arg.flags & ~DMA_BUF_SYNC_RW)
+		return -EINVAL;
+
+	if ((arg.flags & DMA_BUF_SYNC_RW) == 0)
+		return -EINVAL;
+
+	fd = get_unused_fd_flags(O_CLOEXEC);
+	if (fd < 0)
+		return fd;
+
+	if (arg.flags & DMA_BUF_SYNC_WRITE) {
+		fence = dma_resv_get_singleton_unlocked(dmabuf->resv);
+		if (IS_ERR(fence)) {
+			ret = PTR_ERR(fence);
+			goto err_put_fd;
+		}
+	} else if (arg.flags & DMA_BUF_SYNC_READ) {
+		fence = dma_resv_get_excl_unlocked(dmabuf->resv);
+	}
+
+	if (!fence)
+		fence = dma_fence_get_stub();
+
+	sync_file = sync_file_create(fence);
+
+	dma_fence_put(fence);
+
+	if (!sync_file) {
+		ret = -ENOMEM;
+		goto err_put_fd;
+	}
+
+	fd_install(fd, sync_file->file);
+
+	arg.fd = fd;
+	if (copy_to_user(user_data, &arg, sizeof(arg)))
+		return -EFAULT;
+
+	return 0;
+
+err_put_fd:
+	put_unused_fd(fd);
+	return ret;
+}
+#endif
+
 static long dma_buf_ioctl(struct file *file,
 			  unsigned int cmd, unsigned long arg)
 {
@@ -405,6 +464,11 @@ static long dma_buf_ioctl(struct file *file,
 	case DMA_BUF_SET_NAME_B:
 		return dma_buf_set_name(dmabuf, (const char __user *)arg);
 
+#if IS_ENABLED(CONFIG_SYNC_FILE)
+	case DMA_BUF_IOCTL_EXPORT_SYNC_FILE:
+		return dma_buf_export_sync_file(dmabuf, (void __user *)arg);
+#endif
+
 	default:
 		return -ENOTTY;
 	}
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
index 7f30393b92c3b..f902cadcbdb56 100644
--- a/include/uapi/linux/dma-buf.h
+++ b/include/uapi/linux/dma-buf.h
@@ -37,6 +37,29 @@ struct dma_buf_sync {
 
 #define DMA_BUF_NAME_LEN	32
 
+/**
+ * struct dma_buf_sync_file - Get a sync_file from a dma-buf
+ *
+ * Userspace can perform a DMA_BUF_IOCTL_EXPORT_SYNC_FILE to retrieve the
+ * current set of fences on a dma-buf file descriptor as a sync_file.  CPU
+ * waits via poll() or other driver-specific mechanisms typically wait on
+ * whatever fences are on the dma-buf at the time the wait begins.  This
+ * is similar except that it takes a snapshot of the current fences on the
+ * dma-buf for waiting later instead of waiting immediately.  This is
+ * useful for modern graphics APIs such as Vulkan which assume an explicit
+ * synchronization model but still need to inter-operate with dma-buf.
+ */
+struct dma_buf_sync_file {
+	/**
+	 * @flags: Read/write flags
+	 *
+	 * Must be DMA_BUF_SYNC_READ, DMA_BUF_SYNC_WRITE, or both.
+	 */
+	__u32 flags;
+	/** @fd: Sync file file descriptor */
+	__s32 fd;
+};
+
 #define DMA_BUF_BASE		'b'
 #define DMA_BUF_IOCTL_SYNC	_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
 
@@ -46,5 +69,6 @@ struct dma_buf_sync {
 #define DMA_BUF_SET_NAME	_IOW(DMA_BUF_BASE, 1, const char *)
 #define DMA_BUF_SET_NAME_A	_IOW(DMA_BUF_BASE, 1, u32)
 #define DMA_BUF_SET_NAME_B	_IOW(DMA_BUF_BASE, 1, u64)
+#define DMA_BUF_IOCTL_EXPORT_SYNC_FILE	_IOWR(DMA_BUF_BASE, 2, struct dma_buf_sync_file)
 
 #endif
-- 
2.31.1



* [PATCH 5/6] RFC: dma-buf: Add an extra fence to dma_resv_get_singleton_unlocked
  2021-05-24 20:59 [PATCH 0/6] dma-buf: Add an API for exporting sync files (v10) Jason Ekstrand
                   ` (3 preceding siblings ...)
  2021-05-24 20:59 ` [PATCH 4/6] dma-buf: Add an API for exporting sync files (v9) Jason Ekstrand
@ 2021-05-24 20:59 ` Jason Ekstrand
  2021-05-25 15:25   ` [Intel-gfx] " Daniel Vetter
  2021-05-24 20:59 ` [PATCH 6/6] RFC: dma-buf: Add an API for importing sync files (v6) Jason Ekstrand
  5 siblings, 1 reply; 13+ messages in thread
From: Jason Ekstrand @ 2021-05-24 20:59 UTC (permalink / raw)
  To: dri-devel, intel-gfx; +Cc: Jason Ekstrand

For dma-buf sync_file import, we want to get all the fences on a
dma_resv plus one more.  We could wrap the fence we get back in an array
fence or we could make dma_resv_get_singleton_unlocked take "one more"
to make this case easier.
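
For illustration only (not part of the patch), the import path can then
be written as roughly the following, where import_fence is a
placeholder for the fence taken out of the incoming sync_file:

  struct dma_fence *singleton;

  /* All currently unsignaled fences on the resv, flattened together
   * with the fence we are about to import.
   */
  singleton = dma_resv_get_singleton_unlocked(dmabuf->resv, import_fence);
  if (IS_ERR(singleton))
          return PTR_ERR(singleton);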

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/dma-buf/dma-buf.c  |  2 +-
 drivers/dma-buf/dma-resv.c | 23 +++++++++++++++++++++--
 include/linux/dma-resv.h   |  3 ++-
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 86efe71c0db96..f23d939e0e833 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -386,7 +386,7 @@ static long dma_buf_export_sync_file(struct dma_buf *dmabuf,
 		return fd;
 
 	if (arg.flags & DMA_BUF_SYNC_WRITE) {
-		fence = dma_resv_get_singleton_unlocked(dmabuf->resv);
+		fence = dma_resv_get_singleton_unlocked(dmabuf->resv, NULL);
 		if (IS_ERR(fence)) {
 			ret = PTR_ERR(fence);
 			goto err_put_fd;
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 312a3a59dac6a..3c0ef8d0f599b 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -527,6 +527,7 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences_unlocked);
 /**
  * dma_resv_get_singleton_unlocked - get a single fence for the dma_resv object
  * @obj: the reservation object
+ * @extra: extra fence to add to the resulting array
  * @result: resulting dma_fence
  *
  * Get a single fence representing all unsignaled fences in the dma_resv object
@@ -536,7 +537,8 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences_unlocked);
  * RETURNS
  * Returns ERR_PTR(-ENOMEM) if allocations fail, otherwise the fence (or NULL).
  */
-struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj)
+struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj,
+						  struct dma_fence *extra)
 {
 	struct dma_fence *result, **resv_fences, *fence, *chain, **fences;
 	struct dma_fence_array *array;
@@ -547,7 +549,7 @@ struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj)
 	if (err)
 		return ERR_PTR(err);
 
-	if (num_resv_fences == 0)
+	if (num_resv_fences == 0 && !extra)
 		return NULL;
 
 	num_fences = 0;
@@ -563,6 +565,16 @@ struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj)
 		}
 	}
 
+	if (extra) {
+		dma_fence_deep_dive_for_each(fence, chain, j, extra) {
+			if (dma_fence_is_signaled(fence))
+				continue;
+
+			result = fence;
+			++num_fences;
+		}
+	}
+
 	if (num_fences <= 1) {
 		result = dma_fence_get(result);
 		goto put_resv_fences;
@@ -583,6 +595,13 @@ struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj)
 		}
 	}
 
+	if (extra) {
+		dma_fence_deep_dive_for_each(fence, chain, j, extra) {
+			if (!dma_fence_is_signaled(fence))
+				fences[num_fences++] = dma_fence_get(fence);
+		}
+	}
+
 	if (num_fences <= 1) {
 		result = num_fences ? fences[0] : NULL;
 		kfree(fences);
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index c529ccee94bc5..156d989e95ab4 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -285,7 +285,8 @@ int dma_resv_get_fences_unlocked(struct dma_resv *obj,
 
 int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
 
-struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj);
+struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj,
+						  struct dma_fence *extra);
 
 long dma_resv_wait_timeout_unlocked(struct dma_resv *obj, bool wait_all, bool intr,
 				    unsigned long timeout);
-- 
2.31.1
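
To make the intended calling pattern concrete, here is a hedged
kernel-side sketch.  merge_and_set_excl() is a hypothetical helper, not
part of this series; it mirrors what the import ioctl in the next patch
does with the extra argument (note the dma_fence_put():
dma_resv_add_excl_fence() takes its own reference):

#include <linux/dma-fence.h>
#include <linux/dma-resv.h>

/* Collapse every fence currently on @resv, plus @new_fence, into a
 * single fence and install it as the new exclusive fence.  The new
 * exclusive fence can then never signal before anything it replaced
 * would have.
 */
static int merge_and_set_excl(struct dma_resv *resv,
			      struct dma_fence *new_fence)
{
	struct dma_fence *singleton;
	int ret = 0;

	dma_resv_lock(resv, NULL);

	singleton = dma_resv_get_singleton_unlocked(resv, new_fence);
	if (IS_ERR(singleton)) {
		ret = PTR_ERR(singleton);
	} else if (singleton) {
		dma_resv_add_excl_fence(resv, singleton);
		dma_fence_put(singleton);
	}

	dma_resv_unlock(resv);
	return ret;
}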


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/6] RFC: dma-buf: Add an API for importing sync files (v6)
  2021-05-24 20:59 [PATCH 0/6] dma-buf: Add an API for exporting sync files (v10) Jason Ekstrand
                   ` (4 preceding siblings ...)
  2021-05-24 20:59 ` [PATCH 5/6] RFC: dma-buf: Add an extra fence to dma_resv_get_singleton_unlocked Jason Ekstrand
@ 2021-05-24 20:59 ` Jason Ekstrand
  2021-05-25 15:37   ` Daniel Vetter
  5 siblings, 1 reply; 13+ messages in thread
From: Jason Ekstrand @ 2021-05-24 20:59 UTC (permalink / raw)
  To: dri-devel, intel-gfx; +Cc: Jason Ekstrand

This patch is analogous to the previous sync file export patch in that
it allows you to import a sync_file into a dma-buf.  Unlike the previous
patch, however, this does add genuinely new functionality to dma-buf.
Without this, the only way to attach a sync_file to a dma-buf is to
submit a batch to your driver of choice which waits on the sync_file and
claims to write to the dma-buf.  Even if said batch is a no-op, a submit
is typically way more overhead than just attaching a fence.  A submit
may also imply extra synchronization with other work because it happens
on a hardware queue.

In the Vulkan world, this is useful for dealing with the out-fence from
vkQueuePresent.  Current Linux window-systems (X11, Wayland, etc.) all
rely on dma-buf implicit sync.  Since Vulkan is an explicit sync API, we
get a set of fences (VkSemaphores) in vkQueuePresent and have to stash
those as an exclusive (write) fence on the dma-buf.  We handle it in
Mesa today with the above mentioned dummy submit trick.  This ioctl
would allow us to set it directly without the dummy submit.

This may also open up possibilities for GPU drivers to move away from
implicit sync for their kernel driver uAPI and instead provide sync
files and rely on dma-buf import/export for communicating with other
implicit sync clients.

We make the explicit choice here to only allow setting RW fences which
translates to an exclusive fence on the dma_resv.  There's no use for
read-only fences for communicating with other implicit sync userspace
and any such attempts are likely to be racy at best.  When we go to
insert the RW fence, the actual fence we set as the new exclusive fence
is a combination of the sync_file provided by the user and all the other
fences on the dma_resv.  This ensures that the newly added exclusive
fence will never signal before the old one would have and ensures that
we don't break any dma_resv contracts.  We require userspace to specify
RW in the flags for symmetry with the export ioctl and in case we ever
want to support read fences in the future.

There is one downside here that's worth documenting:  If two clients
writing to the same dma-buf using this API race with each other, their
actions on the dma-buf may happen in parallel or in an undefined order.
Both with and without this API, the pattern is the same:  Collect all
the fences on dma-buf, submit work which depends on said fences, and
then set a new exclusive (write) fence on the dma-buf which depends on
said work.  The difference is that, when it's all handled by the GPU
driver's submit ioctl, the three operations happen atomically under the
dma_resv lock.  If two userspace submits race, one will happen before
the other.  You aren't guaranteed which but you are guaranteed that
they're strictly ordered.  If userspace manages the fences itself, then
these three operations happen separately and the two render operations
may happen genuinely in parallel or get interleaved.  However, this is a
case of userspace racing with itself.  As long as we ensure userspace
can't back the kernel into a corner, it should be fine.

v2 (Jason Ekstrand):
 - Use a wrapper dma_fence_array of all fences including the new one
   when importing an exclusive fence.

v3 (Jason Ekstrand):
 - Lock around setting shared fences as well as exclusive
 - Mark SIGNAL_SYNC_FILE as a read-write ioctl.
 - Initialize ret to 0 in dma_buf_wait_sync_file

v4 (Jason Ekstrand):
 - Use the new dma_resv_get_singleton helper

v5 (Jason Ekstrand):
 - Rename the IOCTLs to import/export rather than wait/signal
 - Drop the WRITE flag and always get/set the exclusive fence

v6 (Jason Ekstrand):
 - Split import and export into separate patches
 - New commit message

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/dma-buf/dma-buf.c    | 34 ++++++++++++++++++++++++++++++++++
 include/uapi/linux/dma-buf.h |  1 +
 2 files changed, 35 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index f23d939e0e833..0a50c19dcf015 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -419,6 +419,38 @@ static long dma_buf_export_sync_file(struct dma_buf *dmabuf,
 	put_unused_fd(fd);
 	return ret;
 }
+
+static long dma_buf_import_sync_file(struct dma_buf *dmabuf,
+				     const void __user *user_data)
+{
+	struct dma_buf_sync_file arg;
+	struct dma_fence *fence, *singleton = NULL;
+	int ret = 0;
+
+	if (copy_from_user(&arg, user_data, sizeof(arg)))
+		return -EFAULT;
+
+	if (arg.flags != DMA_BUF_SYNC_RW)
+		return -EINVAL;
+
+	fence = sync_file_get_fence(arg.fd);
+	if (!fence)
+		return -EINVAL;
+
+	dma_resv_lock(dmabuf->resv, NULL);
+
+	singleton = dma_resv_get_singleton_unlocked(dmabuf->resv, fence);
+	if (IS_ERR(singleton))
+		ret = PTR_ERR(singleton);
+	else if (singleton)
+		dma_resv_add_excl_fence(dmabuf->resv, singleton);
+
+	dma_resv_unlock(dmabuf->resv);
+
+	dma_fence_put(fence);
+
+	return ret;
+}
 #endif
 
 static long dma_buf_ioctl(struct file *file,
@@ -467,6 +499,8 @@ static long dma_buf_ioctl(struct file *file,
 #if IS_ENABLED(CONFIG_SYNC_FILE)
 	case DMA_BUF_IOCTL_EXPORT_SYNC_FILE:
 		return dma_buf_export_sync_file(dmabuf, (void __user *)arg);
+	case DMA_BUF_IOCTL_IMPORT_SYNC_FILE:
+		return dma_buf_import_sync_file(dmabuf, (const void __user *)arg);
 #endif
 
 	default:
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
index f902cadcbdb56..75fdde4800267 100644
--- a/include/uapi/linux/dma-buf.h
+++ b/include/uapi/linux/dma-buf.h
@@ -70,5 +70,6 @@ struct dma_buf_sync_file {
 #define DMA_BUF_SET_NAME_A	_IOW(DMA_BUF_BASE, 1, u32)
 #define DMA_BUF_SET_NAME_B	_IOW(DMA_BUF_BASE, 1, u64)
 #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE	_IOWR(DMA_BUF_BASE, 2, struct dma_buf_sync_file)
+#define DMA_BUF_IOCTL_IMPORT_SYNC_FILE	_IOW(DMA_BUF_BASE, 3, struct dma_buf_sync_file)
 
 #endif
-- 
2.31.1
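
For illustration, the userspace side of the import might look like the
following sketch.  sync_file_fd would be, e.g., a vkQueuePresent
out-fence exported as a sync FD; import_sync_file() and its error
handling are illustrative only:

#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Attach the fences in a sync_file to a dma-buf as its new exclusive
 * (write) fence.  The kernel merges the sync_file with all fences
 * already on the dma_resv, so the new exclusive fence never signals
 * before the old fences would have.
 *
 * Returns 0 on success, -1 on error.
 */
static int import_sync_file(int dmabuf_fd, int sync_file_fd)
{
	struct dma_buf_sync_file arg;

	memset(&arg, 0, sizeof(arg));
	arg.flags = DMA_BUF_SYNC_RW;	/* this RFC requires exactly RW */
	arg.fd = sync_file_fd;

	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &arg) < 0) {
		perror("DMA_BUF_IOCTL_IMPORT_SYNC_FILE");
		return -1;
	}

	return 0;
}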


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] dma-buf: Rename dma_resv helpers from _rcu to _unlocked
  2021-05-24 20:59 ` [PATCH 2/6] dma-buf: Rename dma_resv helpers from _rcu to _unlocked Jason Ekstrand
@ 2021-05-25 14:57   ` Daniel Vetter
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Vetter @ 2021-05-25 14:57 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Daniel Vetter, intel-gfx, dri-devel

On Mon, May 24, 2021 at 03:59:50PM -0500, Jason Ekstrand wrote:
> None of these helpers actually leak any RCU details to the caller.  They
> all assume you have a genuine reference, take the RCU read lock, and
> retry if needed.  Naming them with an _rcu is likely to cause callers
> more panic than needed.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>

scripts/get_maintainer.pl gives you ideas whom to send this to. Somewhat
applies to most patches here since I don't think you've cc'ed Christian on
the entire pile.
-Daniel

> ---
>  drivers/dma-buf/dma-buf.c                     |  2 +-
>  drivers/dma-buf/dma-resv.c                    | 28 +++++++++----------
>  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c   |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c       |  4 +--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c        |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  8 +++---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 +-
>  drivers/gpu/drm/drm_gem.c                     |  6 ++--
>  drivers/gpu/drm/drm_gem_atomic_helper.c       |  2 +-
>  drivers/gpu/drm/etnaviv/etnaviv_gem.c         |  4 +--
>  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |  4 +--
>  drivers/gpu/drm/i915/display/intel_display.c  |  2 +-
>  drivers/gpu/drm/i915/dma_resv_utils.c         |  2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_busy.c      |  2 +-
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |  2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_wait.c      | 10 +++----
>  drivers/gpu/drm/i915/i915_request.c           |  4 +--
>  drivers/gpu/drm/i915/i915_sw_fence.c          |  4 +--
>  drivers/gpu/drm/msm/msm_gem.c                 |  2 +-
>  drivers/gpu/drm/nouveau/dispnv50/wndw.c       |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_gem.c         |  2 +-
>  drivers/gpu/drm/panfrost/panfrost_drv.c       |  2 +-
>  drivers/gpu/drm/panfrost/panfrost_job.c       |  2 +-
>  drivers/gpu/drm/radeon/radeon_gem.c           |  6 ++--
>  drivers/gpu/drm/radeon/radeon_mn.c            |  2 +-
>  drivers/gpu/drm/ttm/ttm_bo.c                  | 12 ++++----
>  drivers/gpu/drm/vgem/vgem_fence.c             |  2 +-
>  drivers/gpu/drm/virtio/virtgpu_ioctl.c        |  4 +--
>  drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            |  2 +-
>  include/linux/dma-resv.h                      | 18 ++++++------
>  36 files changed, 79 insertions(+), 79 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index f264b70c383eb..d4529aa9d1a5a 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -1147,7 +1147,7 @@ static int __dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>  	long ret;
>  
>  	/* Wait on any implicit rendering fences */
> -	ret = dma_resv_wait_timeout_rcu(resv, write, true,
> +	ret = dma_resv_wait_timeout_unlocked(resv, write, true,
>  						  MAX_SCHEDULE_TIMEOUT);
>  	if (ret < 0)
>  		return ret;
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 6ddbeb5dfbf65..d6f1ed4cd4d55 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -417,7 +417,7 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
>  EXPORT_SYMBOL(dma_resv_copy_fences);
>  
>  /**
> - * dma_resv_get_fences_rcu - Get an object's shared and exclusive
> + * dma_resv_get_fences_unlocked - Get an object's shared and exclusive
>   * fences without update side lock held
>   * @obj: the reservation object
>   * @pfence_excl: the returned exclusive fence (or NULL)
> @@ -429,10 +429,10 @@ EXPORT_SYMBOL(dma_resv_copy_fences);
>   * exclusive fence is not specified the fence is put into the array of the
>   * shared fences as well. Returns either zero or -ENOMEM.
>   */
> -int dma_resv_get_fences_rcu(struct dma_resv *obj,
> -			    struct dma_fence **pfence_excl,
> -			    unsigned *pshared_count,
> -			    struct dma_fence ***pshared)
> +int dma_resv_get_fences_unlocked(struct dma_resv *obj,
> +				 struct dma_fence **pfence_excl,
> +				 unsigned *pshared_count,
> +				 struct dma_fence ***pshared)
>  {
>  	struct dma_fence **shared = NULL;
>  	struct dma_fence *fence_excl;
> @@ -515,10 +515,10 @@ int dma_resv_get_fences_rcu(struct dma_resv *obj,
>  	*pshared = shared;
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(dma_resv_get_fences_rcu);
> +EXPORT_SYMBOL_GPL(dma_resv_get_fences_unlocked);
>  
>  /**
> - * dma_resv_wait_timeout_rcu - Wait on reservation's objects
> + * dma_resv_wait_timeout_unlocked - Wait on reservation's objects
>   * shared and/or exclusive fences.
>   * @obj: the reservation object
>   * @wait_all: if true, wait on all fences, else wait on just exclusive fence
> @@ -529,9 +529,9 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences_rcu);
>   * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
>   * greater than zero on success.
>   */
> -long dma_resv_wait_timeout_rcu(struct dma_resv *obj,
> -			       bool wait_all, bool intr,
> -			       unsigned long timeout)
> +long dma_resv_wait_timeout_unlocked(struct dma_resv *obj,
> +				    bool wait_all, bool intr,
> +				    unsigned long timeout)
>  {
>  	struct dma_fence *fence;
>  	unsigned seq, shared_count;
> @@ -602,7 +602,7 @@ long dma_resv_wait_timeout_rcu(struct dma_resv *obj,
>  	rcu_read_unlock();
>  	goto retry;
>  }
> -EXPORT_SYMBOL_GPL(dma_resv_wait_timeout_rcu);
> +EXPORT_SYMBOL_GPL(dma_resv_wait_timeout_unlocked);
>  
>  
>  static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
> @@ -622,7 +622,7 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
>  }
>  
>  /**
> - * dma_resv_test_signaled_rcu - Test if a reservation object's
> + * dma_resv_test_signaled_unlocked - Test if a reservation object's
>   * fences have been signaled.
>   * @obj: the reservation object
>   * @test_all: if true, test all fences, otherwise only test the exclusive
> @@ -631,7 +631,7 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
>   * RETURNS
>   * true if all fences signaled, else false
>   */
> -bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
> +bool dma_resv_test_signaled_unlocked(struct dma_resv *obj, bool test_all)
>  {
>  	unsigned seq, shared_count;
>  	int ret;
> @@ -680,4 +680,4 @@ bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all)
>  	rcu_read_unlock();
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(dma_resv_test_signaled_rcu);
> +EXPORT_SYMBOL_GPL(dma_resv_test_signaled_unlocked);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> index 8a1fb8b6606e5..3b0df434e0ca3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
> @@ -203,7 +203,7 @@ int amdgpu_display_crtc_page_flip_target(struct drm_crtc *crtc,
>  		goto unpin;
>  	}
>  
> -	r = dma_resv_get_fences_rcu(new_abo->tbo.base.resv, &work->excl,
> +	r = dma_resv_get_fences_unlocked(new_abo->tbo.base.resv, &work->excl,
>  					      &work->shared_count,
>  					      &work->shared);
>  	if (unlikely(r != 0)) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index baa980a477d94..0d0319bc51577 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -98,7 +98,7 @@ __dma_resv_make_exclusive(struct dma_resv *obj)
>  	if (!dma_resv_get_list(obj)) /* no shared fences to convert */
>  		return 0;
>  
> -	r = dma_resv_get_fences_rcu(obj, NULL, &count, &fences);
> +	r = dma_resv_get_fences_unlocked(obj, NULL, &count, &fences);
>  	if (r)
>  		return r;
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 18974bd081f00..a71f98ae1d72f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -471,7 +471,7 @@ int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
>  		return -ENOENT;
>  	}
>  	robj = gem_to_amdgpu_bo(gobj);
> -	ret = dma_resv_wait_timeout_rcu(robj->tbo.base.resv, true, true,
> +	ret = dma_resv_wait_timeout_unlocked(robj->tbo.base.resv, true, true,
>  						  timeout);
>  
>  	/* ret == 0 means not signaled,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> index b4971e90b98cf..7045b104a33ae 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> @@ -112,7 +112,7 @@ void amdgpu_pasid_free_delayed(struct dma_resv *resv,
>  	unsigned count;
>  	int r;
>  
> -	r = dma_resv_get_fences_rcu(resv, NULL, &count, &fences);
> +	r = dma_resv_get_fences_unlocked(resv, NULL, &count, &fences);
>  	if (r)
>  		goto fallback;
>  
> @@ -156,7 +156,7 @@ void amdgpu_pasid_free_delayed(struct dma_resv *resv,
>  	/* Not enough memory for the delayed delete, as last resort
>  	 * block for all the fences to complete.
>  	 */
> -	dma_resv_wait_timeout_rcu(resv, true, false,
> +	dma_resv_wait_timeout_unlocked(resv, true, false,
>  					    MAX_SCHEDULE_TIMEOUT);
>  	amdgpu_pasid_free(pasid);
>  }
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> index 828b5167ff128..58fb1de81c0c5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
> @@ -75,7 +75,7 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_interval_notifier *mni,
>  
>  	mmu_interval_set_seq(mni, cur_seq);
>  
> -	r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv, true, false,
> +	r = dma_resv_wait_timeout_unlocked(bo->tbo.base.resv, true, false,
>  				      MAX_SCHEDULE_TIMEOUT);
>  	mutex_unlock(&adev->notifier_lock);
>  	if (r <= 0)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 0adffcace3263..81db9ea391c1c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -791,7 +791,7 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
>  		return 0;
>  	}
>  
> -	r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv, false, false,
> +	r = dma_resv_wait_timeout_unlocked(bo->tbo.base.resv, false, false,
>  						MAX_SCHEDULE_TIMEOUT);
>  	if (r < 0)
>  		return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> index c6dbc08016045..af7b667d3226d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
> @@ -1115,7 +1115,7 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
>  	ib->length_dw = 16;
>  
>  	if (direct) {
> -		r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv,
> +		r = dma_resv_wait_timeout_unlocked(bo->tbo.base.resv,
>  							true, false,
>  							msecs_to_jiffies(10));
>  		if (r == 0)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 4a3e3f72e1277..33dbe3fcaf706 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2007,13 +2007,13 @@ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>  	unsigned i, shared_count;
>  	int r;
>  
> -	r = dma_resv_get_fences_rcu(resv, &excl,
> +	r = dma_resv_get_fences_unlocked(resv, &excl,
>  					      &shared_count, &shared);
>  	if (r) {
>  		/* Not enough memory to grab the fence list, as last resort
>  		 * block for all the fences to complete.
>  		 */
> -		dma_resv_wait_timeout_rcu(resv, true, false,
> +		dma_resv_wait_timeout_unlocked(resv, true, false,
>  						    MAX_SCHEDULE_TIMEOUT);
>  		return;
>  	}
> @@ -2625,7 +2625,7 @@ bool amdgpu_vm_evictable(struct amdgpu_bo *bo)
>  		return true;
>  
>  	/* Don't evict VM page tables while they are busy */
> -	if (!dma_resv_test_signaled_rcu(bo->tbo.base.resv, true))
> +	if (!dma_resv_test_signaled_unlocked(bo->tbo.base.resv, true))
>  		return false;
>  
>  	/* Try to block ongoing updates */
> @@ -2805,7 +2805,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size,
>   */
>  long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout)
>  {
> -	timeout = dma_resv_wait_timeout_rcu(vm->root.base.bo->tbo.base.resv,
> +	timeout = dma_resv_wait_timeout_unlocked(vm->root.base.bo->tbo.base.resv,
>  					    true, true, timeout);
>  	if (timeout <= 0)
>  		return timeout;
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 9ca517b658546..e74fef044b301 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -8276,7 +8276,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>  		 * deadlock during GPU reset when this fence will not signal
>  		 * but we hold reservation lock for the BO.
>  		 */
> -		r = dma_resv_wait_timeout_rcu(abo->tbo.base.resv, true,
> +		r = dma_resv_wait_timeout_unlocked(abo->tbo.base.resv, true,
>  							false,
>  							msecs_to_jiffies(5000));
>  		if (unlikely(r <= 0))
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 9989425e9875a..42a432708c2fe 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -770,7 +770,7 @@ long drm_gem_dma_resv_wait(struct drm_file *filep, u32 handle,
>  		return -EINVAL;
>  	}
>  
> -	ret = dma_resv_wait_timeout_rcu(obj->resv, wait_all,
> +	ret = dma_resv_wait_timeout_unlocked(obj->resv, wait_all,
>  						  true, timeout);
>  	if (ret == 0)
>  		ret = -ETIME;
> @@ -1375,12 +1375,12 @@ int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
>  
>  	if (!write) {
>  		struct dma_fence *fence =
> -			dma_resv_get_excl_rcu(obj->resv);
> +			dma_resv_get_excl_unlocked(obj->resv);
>  
>  		return drm_gem_fence_array_add(fence_array, fence);
>  	}
>  
> -	ret = dma_resv_get_fences_rcu(obj->resv, NULL,
> +	ret = dma_resv_get_fences_unlocked(obj->resv, NULL,
>  						&fence_count, &fences);
>  	if (ret || !fence_count)
>  		return ret;
> diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
> index a005c5a0ba46a..a27135084ae5c 100644
> --- a/drivers/gpu/drm/drm_gem_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
> @@ -147,7 +147,7 @@ int drm_gem_plane_helper_prepare_fb(struct drm_plane *plane, struct drm_plane_st
>  		return 0;
>  
>  	obj = drm_gem_fb_get_obj(state->fb, 0);
> -	fence = dma_resv_get_excl_rcu(obj->resv);
> +	fence = dma_resv_get_excl_unlocked(obj->resv);
>  	drm_atomic_set_fence_for_plane(state, fence);
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> index db69f19ab5bca..b271e00480246 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> @@ -390,13 +390,13 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
>  	}
>  
>  	if (op & ETNA_PREP_NOSYNC) {
> -		if (!dma_resv_test_signaled_rcu(obj->resv,
> +		if (!dma_resv_test_signaled_unlocked(obj->resv,
>  							  write))
>  			return -EBUSY;
>  	} else {
>  		unsigned long remain = etnaviv_timeout_to_jiffies(timeout);
>  
> -		ret = dma_resv_wait_timeout_rcu(obj->resv,
> +		ret = dma_resv_wait_timeout_unlocked(obj->resv,
>  							  write, true, remain);
>  		if (ret <= 0)
>  			return ret == 0 ? -ETIMEDOUT : ret;
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index d05c359945799..b4ac4c7ab144d 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -189,13 +189,13 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
>  			continue;
>  
>  		if (bo->flags & ETNA_SUBMIT_BO_WRITE) {
> -			ret = dma_resv_get_fences_rcu(robj, &bo->excl,
> +			ret = dma_resv_get_fences_unlocked(robj, &bo->excl,
>  								&bo->nr_shared,
>  								&bo->shared);
>  			if (ret)
>  				return ret;
>  		} else {
> -			bo->excl = dma_resv_get_excl_rcu(robj);
> +			bo->excl = dma_resv_get_excl_unlocked(robj);
>  		}
>  
>  	}
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 422b59ebf6dce..5f0b85a102159 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -11040,7 +11040,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
>  		if (ret < 0)
>  			goto unpin_fb;
>  
> -		fence = dma_resv_get_excl_rcu(obj->base.resv);
> +		fence = dma_resv_get_excl_unlocked(obj->base.resv);
>  		if (fence) {
>  			add_rps_boost_after_vblank(new_plane_state->hw.crtc,
>  						   fence);
> diff --git a/drivers/gpu/drm/i915/dma_resv_utils.c b/drivers/gpu/drm/i915/dma_resv_utils.c
> index 9e508e7d4629f..bdfc6bf16a4e9 100644
> --- a/drivers/gpu/drm/i915/dma_resv_utils.c
> +++ b/drivers/gpu/drm/i915/dma_resv_utils.c
> @@ -10,7 +10,7 @@
>  void dma_resv_prune(struct dma_resv *resv)
>  {
>  	if (dma_resv_trylock(resv)) {
> -		if (dma_resv_test_signaled_rcu(resv, true))
> +		if (dma_resv_test_signaled_unlocked(resv, true))
>  			dma_resv_add_excl_fence(resv, NULL);
>  		dma_resv_unlock(resv);
>  	}
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> index 25235ef630c10..754ad6d1bace9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> @@ -105,7 +105,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>  	 * Alternatively, we can trade that extra information on read/write
>  	 * activity with
>  	 *	args->busy =
> -	 *		!dma_resv_test_signaled_rcu(obj->resv, true);
> +	 *		!dma_resv_test_signaled_unlocked(obj->resv, true);
>  	 * to report the overall busyness. This is what the wait-ioctl does.
>  	 *
>  	 */
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 297143511f99b..e8f323564e57b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -1481,7 +1481,7 @@ static inline bool use_reloc_gpu(struct i915_vma *vma)
>  	if (DBG_FORCE_RELOC)
>  		return false;
>  
> -	return !dma_resv_test_signaled_rcu(vma->resv, true);
> +	return !dma_resv_test_signaled_unlocked(vma->resv, true);
>  }
>  
>  static unsigned long vma_phys_addr(struct i915_vma *vma, u32 offset)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 2ebd79537aea9..7c0eb425cb3b3 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -500,7 +500,7 @@ i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj)
>  	struct dma_fence *fence;
>  
>  	rcu_read_lock();
> -	fence = dma_resv_get_excl_rcu(obj->base.resv);
> +	fence = dma_resv_get_excl_unlocked(obj->base.resv);
>  	rcu_read_unlock();
>  
>  	if (fence && dma_fence_is_i915(fence) && !dma_fence_is_signaled(fence))
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> index a657b99ec7606..bb5f44ed932aa 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> @@ -85,7 +85,7 @@ static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
>  		return true;
>  
>  	/* we will unbind on next submission, still have userptr pins */
> -	r = dma_resv_wait_timeout_rcu(obj->base.resv, true, false,
> +	r = dma_resv_wait_timeout_unlocked(obj->base.resv, true, false,
>  				      MAX_SCHEDULE_TIMEOUT);
>  	if (r <= 0)
>  		drm_err(&i915->drm, "(%ld) failed to wait for idle\n", r);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> index 4b9856d5ba14f..5b6c52659ad4d 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> @@ -45,7 +45,7 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
>  		unsigned int count, i;
>  		int ret;
>  
> -		ret = dma_resv_get_fences_rcu(resv, &excl, &count, &shared);
> +		ret = dma_resv_get_fences_unlocked(resv, &excl, &count, &shared);
>  		if (ret)
>  			return ret;
>  
> @@ -73,7 +73,7 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
>  		 */
>  		prune_fences = count && timeout >= 0;
>  	} else {
> -		excl = dma_resv_get_excl_rcu(resv);
> +		excl = dma_resv_get_excl_unlocked(resv);
>  	}
>  
>  	if (excl && timeout >= 0)
> @@ -158,8 +158,8 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>  		unsigned int count, i;
>  		int ret;
>  
> -		ret = dma_resv_get_fences_rcu(obj->base.resv,
> -					      &excl, &count, &shared);
> +		ret = dma_resv_get_fences_unlocked(obj->base.resv,
> +						   &excl, &count, &shared);
>  		if (ret)
>  			return ret;
>  
> @@ -170,7 +170,7 @@ i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
>  
>  		kfree(shared);
>  	} else {
> -		excl = dma_resv_get_excl_rcu(obj->base.resv);
> +		excl = dma_resv_get_excl_unlocked(obj->base.resv);
>  	}
>  
>  	if (excl) {
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 970d8f4986bbe..d101d702fbadc 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1594,7 +1594,7 @@ i915_request_await_object(struct i915_request *to,
>  		struct dma_fence **shared;
>  		unsigned int count, i;
>  
> -		ret = dma_resv_get_fences_rcu(obj->base.resv,
> +		ret = dma_resv_get_fences_unlocked(obj->base.resv,
>  							&excl, &count, &shared);
>  		if (ret)
>  			return ret;
> @@ -1611,7 +1611,7 @@ i915_request_await_object(struct i915_request *to,
>  			dma_fence_put(shared[i]);
>  		kfree(shared);
>  	} else {
> -		excl = dma_resv_get_excl_rcu(obj->base.resv);
> +		excl = dma_resv_get_excl_unlocked(obj->base.resv);
>  	}
>  
>  	if (excl) {
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
> index 2744558f30507..0bcb7ea44201e 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> @@ -582,7 +582,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
>  		struct dma_fence **shared;
>  		unsigned int count, i;
>  
> -		ret = dma_resv_get_fences_rcu(resv, &excl, &count, &shared);
> +		ret = dma_resv_get_fences_unlocked(resv, &excl, &count, &shared);
>  		if (ret)
>  			return ret;
>  
> @@ -606,7 +606,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
>  			dma_fence_put(shared[i]);
>  		kfree(shared);
>  	} else {
> -		excl = dma_resv_get_excl_rcu(resv);
> +		excl = dma_resv_get_excl_unlocked(resv);
>  	}
>  
>  	if (ret >= 0 && excl && excl->ops != exclude) {
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index 56df86e5f7400..f8afb081770e4 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -915,7 +915,7 @@ int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout)
>  		op & MSM_PREP_NOSYNC ? 0 : timeout_to_jiffies(timeout);
>  	long ret;
>  
> -	ret = dma_resv_wait_timeout_rcu(obj->resv, write,
> +	ret = dma_resv_wait_timeout_unlocked(obj->resv, write,
>  						  true,  remain);
>  	if (ret == 0)
>  		return remain == 0 ? -EBUSY : -ETIMEDOUT;
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
> index 0cb1f9d848d3e..8d048bacd6f02 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
> @@ -561,7 +561,7 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct drm_plane_state *state)
>  			asyw->image.handle[0] = ctxdma->object.handle;
>  	}
>  
> -	asyw->state.fence = dma_resv_get_excl_rcu(nvbo->bo.base.resv);
> +	asyw->state.fence = dma_resv_get_excl_unlocked(nvbo->bo.base.resv);
>  	asyw->image.offset[0] = nvbo->offset;
>  
>  	if (wndw->func->prepare) {
> diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
> index a70e82413fa75..06ea1fed02467 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> @@ -928,7 +928,7 @@ nouveau_gem_ioctl_cpu_prep(struct drm_device *dev, void *data,
>  		return -ENOENT;
>  	nvbo = nouveau_gem_object(gem);
>  
> -	lret = dma_resv_wait_timeout_rcu(nvbo->bo.base.resv, write, true,
> +	lret = dma_resv_wait_timeout_unlocked(nvbo->bo.base.resv, write, true,
>  						   no_wait ? 0 : 30 * HZ);
>  	if (!lret)
>  		ret = -EBUSY;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index ca07098a61419..53e1842fe8bf8 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -311,7 +311,7 @@ panfrost_ioctl_wait_bo(struct drm_device *dev, void *data,
>  	if (!gem_obj)
>  		return -ENOENT;
>  
> -	ret = dma_resv_wait_timeout_rcu(gem_obj->resv, true,
> +	ret = dma_resv_wait_timeout_unlocked(gem_obj->resv, true,
>  						  true, timeout);
>  	if (!ret)
>  		ret = timeout ? -ETIMEDOUT : -EBUSY;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 6003cfeb13221..2df3e999a38d0 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -203,7 +203,7 @@ static void panfrost_acquire_object_fences(struct drm_gem_object **bos,
>  	int i;
>  
>  	for (i = 0; i < bo_count; i++)
> -		implicit_fences[i] = dma_resv_get_excl_rcu(bos[i]->resv);
> +		implicit_fences[i] = dma_resv_get_excl_unlocked(bos[i]->resv);
>  }
>  
>  static void panfrost_attach_object_fences(struct drm_gem_object **bos,
> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
> index 05ea2f39f6261..1a38b0bf36d11 100644
> --- a/drivers/gpu/drm/radeon/radeon_gem.c
> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> @@ -125,7 +125,7 @@ static int radeon_gem_set_domain(struct drm_gem_object *gobj,
>  	}
>  	if (domain == RADEON_GEM_DOMAIN_CPU) {
>  		/* Asking for cpu access wait for object idle */
> -		r = dma_resv_wait_timeout_rcu(robj->tbo.base.resv, true, true, 30 * HZ);
> +		r = dma_resv_wait_timeout_unlocked(robj->tbo.base.resv, true, true, 30 * HZ);
>  		if (!r)
>  			r = -EBUSY;
>  
> @@ -474,7 +474,7 @@ int radeon_gem_busy_ioctl(struct drm_device *dev, void *data,
>  	}
>  	robj = gem_to_radeon_bo(gobj);
>  
> -	r = dma_resv_test_signaled_rcu(robj->tbo.base.resv, true);
> +	r = dma_resv_test_signaled_unlocked(robj->tbo.base.resv, true);
>  	if (r == 0)
>  		r = -EBUSY;
>  	else
> @@ -503,7 +503,7 @@ int radeon_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
>  	}
>  	robj = gem_to_radeon_bo(gobj);
>  
> -	ret = dma_resv_wait_timeout_rcu(robj->tbo.base.resv, true, true, 30 * HZ);
> +	ret = dma_resv_wait_timeout_unlocked(robj->tbo.base.resv, true, true, 30 * HZ);
>  	if (ret == 0)
>  		r = -EBUSY;
>  	else if (ret < 0)
> diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
> index e37c9a57a7c36..8016c9e568684 100644
> --- a/drivers/gpu/drm/radeon/radeon_mn.c
> +++ b/drivers/gpu/drm/radeon/radeon_mn.c
> @@ -66,7 +66,7 @@ static bool radeon_mn_invalidate(struct mmu_interval_notifier *mn,
>  		return true;
>  	}
>  
> -	r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv, true, false,
> +	r = dma_resv_wait_timeout_unlocked(bo->tbo.base.resv, true, false,
>  				      MAX_SCHEDULE_TIMEOUT);
>  	if (r <= 0)
>  		DRM_ERROR("(%ld) failed to wait for user bo\n", r);
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index ca1b098b6a561..6925de3f179e8 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -294,7 +294,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>  	struct dma_resv *resv = &bo->base._resv;
>  	int ret;
>  
> -	if (dma_resv_test_signaled_rcu(resv, true))
> +	if (dma_resv_test_signaled_unlocked(resv, true))
>  		ret = 0;
>  	else
>  		ret = -EBUSY;
> @@ -306,7 +306,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>  			dma_resv_unlock(bo->base.resv);
>  		spin_unlock(&bo->bdev->lru_lock);
>  
> -		lret = dma_resv_wait_timeout_rcu(resv, true, interruptible,
> +		lret = dma_resv_wait_timeout_unlocked(resv, true, interruptible,
>  						 30 * HZ);
>  
>  		if (lret < 0)
> @@ -409,7 +409,7 @@ static void ttm_bo_release(struct kref *kref)
>  			/* Last resort, if we fail to allocate memory for the
>  			 * fences block for the BO to become idle
>  			 */
> -			dma_resv_wait_timeout_rcu(bo->base.resv, true, false,
> +			dma_resv_wait_timeout_unlocked(bo->base.resv, true, false,
>  						  30 * HZ);
>  		}
>  
> @@ -420,7 +420,7 @@ static void ttm_bo_release(struct kref *kref)
>  		ttm_mem_io_free(bdev, &bo->mem);
>  	}
>  
> -	if (!dma_resv_test_signaled_rcu(bo->base.resv, true) ||
> +	if (!dma_resv_test_signaled_unlocked(bo->base.resv, true) ||
>  	    !dma_resv_trylock(bo->base.resv)) {
>  		/* The BO is not idle, resurrect it for delayed destroy */
>  		ttm_bo_flush_all_fences(bo);
> @@ -1116,13 +1116,13 @@ int ttm_bo_wait(struct ttm_buffer_object *bo,
>  	long timeout = 15 * HZ;
>  
>  	if (no_wait) {
> -		if (dma_resv_test_signaled_rcu(bo->base.resv, true))
> +		if (dma_resv_test_signaled_unlocked(bo->base.resv, true))
>  			return 0;
>  		else
>  			return -EBUSY;
>  	}
>  
> -	timeout = dma_resv_wait_timeout_rcu(bo->base.resv, true,
> +	timeout = dma_resv_wait_timeout_unlocked(bo->base.resv, true,
>  						      interruptible, timeout);
>  	if (timeout < 0)
>  		return timeout;
> diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
> index 2902dc6e64faf..110927edb9df7 100644
> --- a/drivers/gpu/drm/vgem/vgem_fence.c
> +++ b/drivers/gpu/drm/vgem/vgem_fence.c
> @@ -151,7 +151,7 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
>  
>  	/* Check for a conflicting fence */
>  	resv = obj->resv;
> -	if (!dma_resv_test_signaled_rcu(resv,
> +	if (!dma_resv_test_signaled_unlocked(resv,
>  						  arg->flags & VGEM_FENCE_WRITE)) {
>  		ret = -EBUSY;
>  		goto err_fence;
> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> index 669f2ee395154..763a51686819c 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> @@ -451,9 +451,9 @@ static int virtio_gpu_wait_ioctl(struct drm_device *dev, void *data,
>  		return -ENOENT;
>  
>  	if (args->flags & VIRTGPU_WAIT_NOWAIT) {
> -		ret = dma_resv_test_signaled_rcu(obj->resv, true);
> +		ret = dma_resv_test_signaled_unlocked(obj->resv, true);
>  	} else {
> -		ret = dma_resv_wait_timeout_rcu(obj->resv, true, true,
> +		ret = dma_resv_wait_timeout_unlocked(obj->resv, true, true,
>  						timeout);
>  	}
>  	if (ret == 0)
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> index 04dd49c4c2572..19e1ce23842a9 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
> @@ -743,7 +743,7 @@ static int vmw_user_bo_synccpu_grab(struct vmw_user_buffer_object *user_bo,
>  	if (flags & drm_vmw_synccpu_allow_cs) {
>  		long lret;
>  
> -		lret = dma_resv_wait_timeout_rcu
> +		lret = dma_resv_wait_timeout_unlocked
>  			(bo->base.resv, true, true,
>  			 nonblock ? 0 : MAX_SCHEDULE_TIMEOUT);
>  		if (!lret)
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index d44a77e8a7e34..99926680c3964 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -246,7 +246,7 @@ dma_resv_get_excl(struct dma_resv *obj)
>  }
>  
>  /**
> - * dma_resv_get_excl_rcu - get the reservation object's
> + * dma_resv_get_excl_unlocked - get the reservation object's
>   * exclusive fence, without lock held.
>   * @obj: the reservation object
>   *
> @@ -257,7 +257,7 @@ dma_resv_get_excl(struct dma_resv *obj)
>   * The exclusive fence or NULL if none
>   */
>  static inline struct dma_fence *
> -dma_resv_get_excl_rcu(struct dma_resv *obj)
> +dma_resv_get_excl_unlocked(struct dma_resv *obj)
>  {
>  	struct dma_fence *fence;
>  
> @@ -278,16 +278,16 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
>  
>  void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
>  
> -int dma_resv_get_fences_rcu(struct dma_resv *obj,
> -			    struct dma_fence **pfence_excl,
> -			    unsigned *pshared_count,
> -			    struct dma_fence ***pshared);
> +int dma_resv_get_fences_unlocked(struct dma_resv *obj,
> +				 struct dma_fence **pfence_excl,
> +				 unsigned *pshared_count,
> +				 struct dma_fence ***pshared);
>  
>  int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
>  
> -long dma_resv_wait_timeout_rcu(struct dma_resv *obj, bool wait_all, bool intr,
> -			       unsigned long timeout);
> +long dma_resv_wait_timeout_unlocked(struct dma_resv *obj, bool wait_all, bool intr,
> +				    unsigned long timeout);
>  
> -bool dma_resv_test_signaled_rcu(struct dma_resv *obj, bool test_all);
> +bool dma_resv_test_signaled_unlocked(struct dma_resv *obj, bool test_all);
>  
>  #endif /* _LINUX_RESERVATION_H */
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/6] dma-buf: Add an API for exporting sync files (v9)
  2021-05-24 20:59 ` [PATCH 4/6] dma-buf: Add an API for exporting sync files (v9) Jason Ekstrand
@ 2021-05-25 15:08   ` Daniel Vetter
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Vetter @ 2021-05-25 15:08 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, Christian König, dri-devel

On Mon, May 24, 2021 at 03:59:52PM -0500, Jason Ekstrand wrote:
> Modern userspace APIs like Vulkan are built on an explicit
> synchronization model.  This doesn't always play nicely with the
> implicit synchronization used in the kernel and assumed by X11 and
> Wayland.  The client -> compositor half of the synchronization isn't too
> bad, at least on intel, because we can control whether or not i915
> synchronizes on the buffer and whether or not it's considered written.
> 
> The harder part is the compositor -> client synchronization when we get
> the buffer back from the compositor.  We're required to be able to
> provide the client with a VkSemaphore and VkFence representing the point
> in time where the window system (compositor and/or display) finished
> using the buffer.  With current APIs, it's very hard to do this in such
> a way that we don't get confused by the Vulkan driver's access of the
> buffer.  In particular, once we tell the kernel that we're rendering to
> the buffer again, any CPU waits on the buffer or GPU dependencies will
> wait on some of the client rendering and not just the compositor.
> 
> This new IOCTL solves this problem by allowing us to get a snapshot of
> the implicit synchronization state of a given dma-buf in the form of a
> sync file.  It's effectively the same as a poll() or I915_GEM_WAIT only,
> instead of CPU waiting directly, it encapsulates the wait operation, at
> the current moment in time, in a sync_file so we can check/wait on it
> later.  As long as the Vulkan driver does the sync_file export from the
> dma-buf before we re-introduce it for rendering, it will only contain
> fences from the compositor or display.  This allows to accurately turn
> it into a VkFence or VkSemaphore without any over- synchronization.
> 
> v2 (Jason Ekstrand):
>  - Use a wrapper dma_fence_array of all fences including the new one
>    when importing an exclusive fence.
> 
> v3 (Jason Ekstrand):
>  - Lock around setting shared fences as well as exclusive
>  - Mark SIGNAL_SYNC_FILE as a read-write ioctl.
>  - Initialize ret to 0 in dma_buf_wait_sync_file
> 
> v4 (Jason Ekstrand):
>  - Use the new dma_resv_get_singleton helper
> 
> v5 (Jason Ekstrand):
>  - Rename the IOCTLs to import/export rather than wait/signal
>  - Drop the WRITE flag and always get/set the exclusive fence
> 
> v6 (Jason Ekstrand):
>  - Drop the sync_file import as it was all-around sketchy and not nearly
>    as useful as import.
>  - Re-introduce READ/WRITE flag support for export
>  - Rework the commit message
> 
> v7 (Jason Ekstrand):
>  - Require at least one sync flag
>  - Fix a refcounting bug: dma_resv_get_excl() doesn't take a reference
>  - Use _rcu helpers since we're accessing the dma_resv read-only
> 
> v8 (Jason Ekstrand):
>  - Return -ENOMEM if the sync_file_create fails
>  - Predicate support on IS_ENABLED(CONFIG_SYNC_FILE)
> 
> v9 (Jason Ekstrand):
>  - Add documentation for the new ioctl
> 
> v10 (Jason Ekstrand):
>  - Go back to dma_buf_sync_file as the ioctl struct name
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> Acked-by: Simon Ser <contact@emersion.fr>
> Acked-by: Christian König <christian.koenig@amd.com>

So one thing we need to capture here is that for amdgpu currently this
misreports, because amdgpu doesn't store write fences in the exclusive
slot. So that's a bit annoying.

If userspace only uses this sync_file to avoid stalls, then I think that's
all fine; we just lie about the stall that will still happen, and maybe
there's more judder than necessary.

More annoying is when this is used in e.g. a vulkan based compositor. But
with current amdgpu userspace the kernel forces synchronization, even with
vulkan. So again no problem.

The only thing we really need to be careful about is that when we
add more fine-grained implicit sync support to amdgpu (which needs more
than these patches, you need a per-file opt-out of the kernel's automagic
sync), then we also must fix the amdgpu use of the exclusive slot. But
that's doable I think.

I couldn't poke holes in your argument checking.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/dma-buf/dma-buf.c    | 64 ++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/dma-buf.h | 24 ++++++++++++++
>  2 files changed, 88 insertions(+)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index d4529aa9d1a5a..86efe71c0db96 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -20,6 +20,7 @@
>  #include <linux/debugfs.h>
>  #include <linux/module.h>
>  #include <linux/seq_file.h>
> +#include <linux/sync_file.h>
>  #include <linux/poll.h>
>  #include <linux/dma-resv.h>
>  #include <linux/mm.h>
> @@ -362,6 +363,64 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
>  	return ret;
>  }
>  
> +#if IS_ENABLED(CONFIG_SYNC_FILE)
> +static long dma_buf_export_sync_file(struct dma_buf *dmabuf,
> +				     void __user *user_data)
> +{
> +	struct dma_buf_sync_file arg;
> +	struct dma_fence *fence = NULL;
> +	struct sync_file *sync_file;
> +	int fd, ret;
> +
> +	if (copy_from_user(&arg, user_data, sizeof(arg)))
> +		return -EFAULT;
> +
> +	if (arg.flags & ~DMA_BUF_SYNC_RW)
> +		return -EINVAL;
> +
> +	if ((arg.flags & DMA_BUF_SYNC_RW) == 0)
> +		return -EINVAL;
> +
> +	fd = get_unused_fd_flags(O_CLOEXEC);
> +	if (fd < 0)
> +		return fd;
> +
> +	if (arg.flags & DMA_BUF_SYNC_WRITE) {
> +		fence = dma_resv_get_singleton_unlocked(dmabuf->resv);
> +		if (IS_ERR(fence)) {
> +			ret = PTR_ERR(fence);
> +			goto err_put_fd;
> +		}
> +	} else if (arg.flags & DMA_BUF_SYNC_READ) {
> +		fence = dma_resv_get_excl_unlocked(dmabuf->resv);
> +	}
> +
> +	if (!fence)
> +		fence = dma_fence_get_stub();
> +
> +	sync_file = sync_file_create(fence);
> +
> +	dma_fence_put(fence);
> +
> +	if (!sync_file) {
> +		ret = -ENOMEM;
> +		goto err_put_fd;
> +	}
> +
> +	fd_install(fd, sync_file->file);
> +
> +	arg.fd = fd;
> +	if (copy_to_user(user_data, &arg, sizeof(arg)))
> +		return -EFAULT;
> +
> +	return 0;
> +
> +err_put_fd:
> +	put_unused_fd(fd);
> +	return ret;
> +}
> +#endif
> +
>  static long dma_buf_ioctl(struct file *file,
>  			  unsigned int cmd, unsigned long arg)
>  {
> @@ -405,6 +464,11 @@ static long dma_buf_ioctl(struct file *file,
>  	case DMA_BUF_SET_NAME_B:
>  		return dma_buf_set_name(dmabuf, (const char __user *)arg);
>  
> +#if IS_ENABLED(CONFIG_SYNC_FILE)
> +	case DMA_BUF_IOCTL_EXPORT_SYNC_FILE:
> +		return dma_buf_export_sync_file(dmabuf, (void __user *)arg);
> +#endif
> +
>  	default:
>  		return -ENOTTY;
>  	}
> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> index 7f30393b92c3b..f902cadcbdb56 100644
> --- a/include/uapi/linux/dma-buf.h
> +++ b/include/uapi/linux/dma-buf.h
> @@ -37,6 +37,29 @@ struct dma_buf_sync {
>  
>  #define DMA_BUF_NAME_LEN	32
>  
> +/**

Pulling the ioctl stuff into our docs (I'd put it right after the poll
support chapter) would be really nice. Bonus if you document the 2 simple
existing ones already there too ...

> + * struct dma_buf_export_sync_file - Get a sync_file from a dma-buf
> + *
> + * Userspace can perform a DMA_BUF_IOCTL_EXPORT_SYNC_FILE to retrieve the
> + * current set of fences on a dma-buf file descriptor as a sync_file.  CPU
> + * waits via poll() or other driver-specific mechanisms typically wait on
> + * whatever fences are on the dma-buf at the time the wait begins.  This
> + * is similar except that it takes a snapshot of the current fences on the
> + * dma-buf for waiting later instead of waiting immediately.  This is
> + * useful for modern graphics APIs such as Vulkan which assume an explicit
> + * synchronization model but still need to inter-operate with dma-buf.
> + */
> +struct dma_buf_sync_file {
> +	/**
> +	 * @flags: Read/write flags
> +	 *
> +	 * Must be DMA_BUF_SYNC_READ, DMA_BUF_SYNC_WRITE, or both.
> +	 */
> +	__u32 flags;

Maybe spec what actually happens for each flag ... We've had epic
bikesheds about what each one needs.
-Daniel

> +	/** @fd: Sync file file descriptor */
> +	__s32 fd;
> +};
> +
>  #define DMA_BUF_BASE		'b'
>  #define DMA_BUF_IOCTL_SYNC	_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
>  
> @@ -46,5 +69,6 @@ struct dma_buf_sync {
>  #define DMA_BUF_SET_NAME	_IOW(DMA_BUF_BASE, 1, const char *)
>  #define DMA_BUF_SET_NAME_A	_IOW(DMA_BUF_BASE, 1, u32)
>  #define DMA_BUF_SET_NAME_B	_IOW(DMA_BUF_BASE, 1, u64)
> +#define DMA_BUF_IOCTL_EXPORT_SYNC_FILE	_IOWR(DMA_BUF_BASE, 2, struct dma_buf_sync_file)
>  
>  #endif
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Intel-gfx] [PATCH 5/6] RFC: dma-buf: Add an extra fence to dma_resv_get_singleton_unlocked
  2021-05-24 20:59 ` [PATCH 5/6] RFC: dma-buf: Add an extra fence to dma_resv_get_singleton_unlocked Jason Ekstrand
@ 2021-05-25 15:25   ` Daniel Vetter
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Vetter @ 2021-05-25 15:25 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Mon, May 24, 2021 at 03:59:53PM -0500, Jason Ekstrand wrote:
> For dma-buf sync_file import, we want to get all the fences on a
> dma_resv plus one more.  We could wrap the fence we get back in an array
> fence or we could make dma_resv_get_singleton_unlocked take "one more"
> to make this case easier.
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

Ah, now it makes very obvious sense.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/dma-buf/dma-buf.c  |  2 +-
>  drivers/dma-buf/dma-resv.c | 23 +++++++++++++++++++++--
>  include/linux/dma-resv.h   |  3 ++-
>  3 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 86efe71c0db96..f23d939e0e833 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -386,7 +386,7 @@ static long dma_buf_export_sync_file(struct dma_buf *dmabuf,
>  		return fd;
>  
>  	if (arg.flags & DMA_BUF_SYNC_WRITE) {
> -		fence = dma_resv_get_singleton_unlocked(dmabuf->resv);
> +		fence = dma_resv_get_singleton_unlocked(dmabuf->resv, NULL);
>  		if (IS_ERR(fence)) {
>  			ret = PTR_ERR(fence);
>  			goto err_put_fd;
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index 312a3a59dac6a..3c0ef8d0f599b 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -527,6 +527,7 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences_unlocked);
>  /**
>   * dma_resv_get_singleton_unlocked - get a single fence for the dma_resv object
>   * @obj: the reservation object
> + * @extra: extra fence to add to the resulting array
>   * @result: resulting dma_fence
>   *
>   * Get a single fence representing all unsignaled fences in the dma_resv object
> @@ -536,7 +537,8 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences_unlocked);
>   * RETURNS
>   * Returns ERR_PTR(-ENOMEM) if allocations fail, otherwise the fence (or NULL).
>   */
> -struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj)
> +struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj,
> +						  struct dma_fence *extra)
>  {
>  	struct dma_fence *result, **resv_fences, *fence, *chain, **fences;
>  	struct dma_fence_array *array;
> @@ -547,7 +549,7 @@ struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj)
>  	if (err)
>  		return ERR_PTR(err);
>  
> -	if (num_resv_fences == 0)
> +	if (num_resv_fences == 0 && !extra)
>  		return NULL;
>  
>  	num_fences = 0;
> @@ -563,6 +565,16 @@ struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj)
>  		}
>  	}
>  
> +	if (extra) {
> +		dma_fence_deep_dive_for_each(fence, chain, j, extra) {
> +			if (dma_fence_is_signaled(fence))
> +				continue;
> +
> +			result = fence;
> +			++num_fences;
> +		}
> +	}
> +
>  	if (num_fences <= 1) {
>  		result = dma_fence_get(result);
>  		goto put_resv_fences;
> @@ -583,6 +595,13 @@ struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj)
>  		}
>  	}
>  
> +	if (extra) {
> +		dma_fence_deep_dive_for_each(fence, chain, j, extra) {
> +			if (!dma_fence_is_signaled(fence))
> +				fences[num_fences++] = dma_fence_get(fence);
> +		}
> +	}
> +
>  	if (num_fences <= 1) {
>  		result = num_fences ? fences[0] : NULL;
>  		kfree(fences);
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index c529ccee94bc5..156d989e95ab4 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -285,7 +285,8 @@ int dma_resv_get_fences_unlocked(struct dma_resv *obj,
>  
>  int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
>  
> -struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj);
> +struct dma_fence *dma_resv_get_singleton_unlocked(struct dma_resv *obj,
> +						  struct dma_fence *extra);
>  
>  long dma_resv_wait_timeout_unlocked(struct dma_resv *obj, bool wait_all, bool intr,
>  				    unsigned long timeout);
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 6/6] RFC: dma-buf: Add an API for importing sync files (v6)
  2021-05-24 20:59 ` [PATCH 6/6] RFC: dma-buf: Add an API for importing sync files (v6) Jason Ekstrand
@ 2021-05-25 15:37   ` Daniel Vetter
  2021-05-25 19:19     ` Jason Ekstrand
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Vetter @ 2021-05-25 15:37 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: intel-gfx, dri-devel

On Mon, May 24, 2021 at 03:59:54PM -0500, Jason Ekstrand wrote:
> This patch is analogous to the previous sync file export patch in that
> it allows you to import a sync_file into a dma-buf.  Unlike the previous
> patch, however, this does add genuinely new functionality to dma-buf.
> Without this, the only way to attach a sync_file to a dma-buf is to
> submit a batch to your driver of choice which waits on the sync_file and
> claims to write to the dma-buf.  Even if said batch is a no-op, a submit
> is typically way more overhead than just attaching a fence.  A submit
> may also imply extra synchronization with other work because it happens
> on a hardware queue.
> 
> In the Vulkan world, this is useful for dealing with the out-fence from
> vkQueuePresent.  Current Linux window-systems (X11, Wayland, etc.) all
> rely on dma-buf implicit sync.  Since Vulkan is an explicit sync API, we
> get a set of fences (VkSemaphores) in vkQueuePresent and have to stash
> those as an exclusive (write) fence on the dma-buf.  We handle it in
> Mesa today with the above-mentioned dummy submit trick.  This ioctl
> would allow us to set it directly without the dummy submit.
> 
> This may also open up possibilities for GPU drivers to move away from
> implicit sync for their kernel driver uAPI and instead provide sync
> files and rely on dma-buf import/export for communicating with other
> implicit sync clients.
> 
> We make the explicit choice here to only allow setting RW fences which
> translates to an exclusive fence on the dma_resv.  There's no use for
> read-only fences for communicating with other implicit sync userspace
> and any such attempts are likely to be racy at best.  When we go to
> insert the RW fence, the actual fence we set as the new exclusive fence
> is a combination of the sync_file provided by the user and all the other
> fences on the dma_resv.  This ensures that the newly added exclusive
> fence will never signal before the old one would have and ensures that
> we don't break any dma_resv contracts.  We require userspace to specify
> RW in the flags for symmetry with the export ioctl and in case we ever
> want to support read fences in the future.
> 
> There is one downside here that's worth documenting:  If two clients
> writing to the same dma-buf using this API race with each other, their
> actions on the dma-buf may happen in parallel or in an undefined order.
> Both with and without this API, the pattern is the same:  Collect all
> the fences on dma-buf, submit work which depends on said fences, and
> then set a new exclusive (write) fence on the dma-buf which depends on
> said work.  The difference is that, when it's all handled by the GPU
> driver's submit ioctl, the three operations happen atomically under the
> dma_resv lock.  If two userspace submits race, one will happen before
> the other.  You aren't guaranteed which but you are guaranteed that
> they're strictly ordered.  If userspace manages the fences itself, then
> these three operations happen separately and the two render operations
> may happen genuinely in parallel or get interleaved.  However, this is a
> case of userspace racing with itself.  As long as we ensure userspace
> can't back the kernel into a corner, it should be fine.
> 
> v2 (Jason Ekstrand):
>  - Use a wrapper dma_fence_array of all fences including the new one
>    when importing an exclusive fence.
> 
> v3 (Jason Ekstrand):
>  - Lock around setting shared fences as well as exclusive
>  - Mark SIGNAL_SYNC_FILE as a read-write ioctl.
>  - Initialize ret to 0 in dma_buf_wait_sync_file
> 
> v4 (Jason Ekstrand):
>  - Use the new dma_resv_get_singleton helper
> 
> v5 (Jason Ekstrand):
>  - Rename the IOCTLs to import/export rather than wait/signal
>  - Drop the WRITE flag and always get/set the exclusive fence
> 
> v6 (Jason Ekstrand):
>  - Split import and export into separate patches
>  - New commit message
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  drivers/dma-buf/dma-buf.c    | 34 ++++++++++++++++++++++++++++++++++
>  include/uapi/linux/dma-buf.h |  1 +
>  2 files changed, 35 insertions(+)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index f23d939e0e833..0a50c19dcf015 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -419,6 +419,38 @@ static long dma_buf_export_sync_file(struct dma_buf *dmabuf,
>  	put_unused_fd(fd);
>  	return ret;
>  }
> +
> +static long dma_buf_import_sync_file(struct dma_buf *dmabuf,
> +				     const void __user *user_data)
> +{
> +	struct dma_buf_sync_file arg;
> +	struct dma_fence *fence, *singleton = NULL;
> +	int ret = 0;
> +
> +	if (copy_from_user(&arg, user_data, sizeof(arg)))
> +		return -EFAULT;
> +
> +	if (arg.flags != DMA_BUF_SYNC_RW)
> +		return -EINVAL;
> +
> +	fence = sync_file_get_fence(arg.fd);
> +	if (!fence)
> +		return -EINVAL;
> +
> +	dma_resv_lock(dmabuf->resv, NULL);
> +
> +	singleton = dma_resv_get_singleton_unlocked(dmabuf->resv, fence);
> +	if (IS_ERR(singleton))
> +		ret = PTR_ERR(singleton);
> +	else if (singleton)
> +		dma_resv_add_excl_fence(dmabuf->resv, singleton);

We also need to add the new fence to the shared slots, to make sure that
the collective sum of shared fences still retires after the exclusive one.
Not upholding that will pretty surely allow userspace to pull a bunch of
ttm-based drivers over the table, I think.

Note that with dma-buf shared buffers there shouldn't be a problem here,
since as long as the buffer is in use by the other driver (which might
break the contract here) it's pinned. So nothing bad can happen.
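
Untested sketch of what I mean, still under the dma_resv_lock as above,
and assuming the usual dma_resv_reserve_shared() and
dma_resv_add_shared_fence() helpers:

	singleton = dma_resv_get_singleton_unlocked(dmabuf->resv, fence);
	if (IS_ERR(singleton)) {
		ret = PTR_ERR(singleton);
	} else if (singleton) {
		/* make sure there's room for one more shared fence */
		ret = dma_resv_reserve_shared(dmabuf->resv, 1);
		if (!ret) {
			/* set the new exclusive fence first, since that
			 * drops the current shared fences ... */
			dma_resv_add_excl_fence(dmabuf->resv, singleton);
			/* ... then mirror it into the shared slots so the
			 * shared fences as a whole still retire after the
			 * exclusive one */
			dma_resv_add_shared_fence(dmabuf->resv, singleton);
		}
	}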


Aside: The read-only version of this would just add the new fence and the
exclusive fence to the shared array. I think that would be useful to have,
if just for completeness. I need to pester you about how external images
work here with vulkan ...
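
For context, the userspace flow this would replace the dummy submit with
is roughly the following (illustrative sketch only; render_done_fd is a
hypothetical sync_file for the wait semaphores handed to vkQueuePresent,
and the struct layout is the same flags/fd pair as on the export side):

	struct dma_buf_sync_file arg = {
		.flags = DMA_BUF_SYNC_RW,
		.fd = render_done_fd,
	};

	/* stash the client's rendering-done fences as the implicit
	 * exclusive fence, so the compositor's implicit sync waits on
	 * them without any dummy submit */
	ret = ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &arg);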

> +
> +	dma_resv_unlock(dmabuf->resv);
> +
> +	dma_fence_put(fence);
> +
> +	return ret;
> +}
>  #endif
>  
>  static long dma_buf_ioctl(struct file *file,
> @@ -467,6 +499,8 @@ static long dma_buf_ioctl(struct file *file,
>  #if IS_ENABLED(CONFIG_SYNC_FILE)
>  	case DMA_BUF_IOCTL_EXPORT_SYNC_FILE:
>  		return dma_buf_export_sync_file(dmabuf, (void __user *)arg);
> +	case DMA_BUF_IOCTL_IMPORT_SYNC_FILE:
> +		return dma_buf_import_sync_file(dmabuf, (const void __user *)arg);
>  #endif
>  
>  	default:
> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> index f902cadcbdb56..75fdde4800267 100644
> --- a/include/uapi/linux/dma-buf.h
> +++ b/include/uapi/linux/dma-buf.h
> @@ -70,5 +70,6 @@ struct dma_buf_sync_file {
>  #define DMA_BUF_SET_NAME_A	_IOW(DMA_BUF_BASE, 1, u32)
>  #define DMA_BUF_SET_NAME_B	_IOW(DMA_BUF_BASE, 1, u64)
>  #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE	_IOWR(DMA_BUF_BASE, 2, struct dma_buf_sync_file)
> +#define DMA_BUF_IOCTL_IMPORT_SYNC_FILE	_IOW(DMA_BUF_BASE, 3, struct dma_buf_sync)

Uh wrong struct here. Not good :-)

Also more kerneldoc would be really nice, plus on 2nd thought I'm not
really sure saving the few bytes in storage is such a bright idea, and
maybe we should have distinct dma_buf_export/import_sync_file structures,
each with their appropriate kerneldoc and no confusion.
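
Strawman, with entirely made-up kerneldoc, just to illustrate:

	/**
	 * struct dma_buf_export_sync_file - get a sync_file from a dma-buf
	 * @flags: DMA_BUF_SYNC_READ, DMA_BUF_SYNC_WRITE or DMA_BUF_SYNC_RW,
	 *	   selecting which implicit fences to snapshot
	 * @fd: returned sync_file fd
	 */
	struct dma_buf_export_sync_file {
		__u32 flags;
		__s32 fd;
	};

	/**
	 * struct dma_buf_import_sync_file - attach a sync_file to a dma-buf
	 * @flags: must be DMA_BUF_SYNC_RW for now
	 * @fd: sync_file fd to fold into the implicit sync state
	 */
	struct dma_buf_import_sync_file {
		__u32 flags;
		__s32 fd;
	};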

Aside from these I think this looks good. And as long as we keep up the
rule that "shared fences in their entirety complete after the exclusive
fence if both are present", I think we'll be fine.
-Daniel




-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 6/6] RFC: dma-buf: Add an API for importing sync files (v6)
  2021-05-25 15:37   ` Daniel Vetter
@ 2021-05-25 19:19     ` Jason Ekstrand
  2021-05-25 19:33       ` Daniel Vetter
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Ekstrand @ 2021-05-25 19:19 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX, Maling list - DRI developers

On Tue, May 25, 2021 at 10:37 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Mon, May 24, 2021 at 03:59:54PM -0500, Jason Ekstrand wrote:
> > [snip]
> >
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index f23d939e0e833..0a50c19dcf015 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -419,6 +419,38 @@ static long dma_buf_export_sync_file(struct dma_buf *dmabuf,
> >       put_unused_fd(fd);
> >       return ret;
> >  }
> > +
> > +static long dma_buf_import_sync_file(struct dma_buf *dmabuf,
> > +                                  const void __user *user_data)
> > +{
> > +     struct dma_buf_sync_file arg;
> > +     struct dma_fence *fence, *singleton = NULL;
> > +     int ret = 0;
> > +
> > +     if (copy_from_user(&arg, user_data, sizeof(arg)))
> > +             return -EFAULT;
> > +
> > +     if (arg.flags != DMA_BUF_SYNC_RW)
> > +             return -EINVAL;
> > +
> > +     fence = sync_file_get_fence(arg.fd);
> > +     if (!fence)
> > +             return -EINVAL;
> > +
> > +     dma_resv_lock(dmabuf->resv, NULL);
> > +
> > +     singleton = dma_resv_get_singleton_unlocked(dmabuf->resv, fence);
> > +     if (IS_ERR(singleton))
> > +             ret = PTR_ERR(singleton);
> > +     else if (singleton)
> > +             dma_resv_add_excl_fence(dmabuf->resv, singleton);
>
> We also need to add the new fence to the shared slots, to make sure that
> the collective sum of shared fences still retires after the exclusive one.
> Not upholding that will pretty surely allow userspace to pull a bunch of
> ttm-based drivers over the table, I think.

Ok, will fix.

> Note that with dma-buf shared buffers there shouldn't be a problem here,
> since as long as the buffer is in use by the other driver (which might
> break the contract here) it's pinned. So nothing bad can happen.
>
>
> Aside: The read-only version of this would just add the new fence and the
> exclusive fence to the shared array. I think that would be useful to have,
> if just for completeness. I need to pester you about how external images
> work here with vulkan ...

As discussed on IRC, let's leave that out until we can figure out how
it works. :-)

> > +
> > +     dma_resv_unlock(dmabuf->resv);
> > +
> > +     dma_fence_put(fence);
> > +
> > +     return ret;
> > +}
> >  #endif
> >
> >  static long dma_buf_ioctl(struct file *file,
> > @@ -467,6 +499,8 @@ static long dma_buf_ioctl(struct file *file,
> >  #if IS_ENABLED(CONFIG_SYNC_FILE)
> >       case DMA_BUF_IOCTL_EXPORT_SYNC_FILE:
> >               return dma_buf_export_sync_file(dmabuf, (void __user *)arg);
> > +     case DMA_BUF_IOCTL_IMPORT_SYNC_FILE:
> > +             return dma_buf_import_sync_file(dmabuf, (const void __user *)arg);
> >  #endif
> >
> >       default:
> > diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> > index f902cadcbdb56..75fdde4800267 100644
> > --- a/include/uapi/linux/dma-buf.h
> > +++ b/include/uapi/linux/dma-buf.h
> > @@ -70,5 +70,6 @@ struct dma_buf_sync_file {
> >  #define DMA_BUF_SET_NAME_A   _IOW(DMA_BUF_BASE, 1, u32)
> >  #define DMA_BUF_SET_NAME_B   _IOW(DMA_BUF_BASE, 1, u64)
> >  #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE       _IOWR(DMA_BUF_BASE, 2, struct dma_buf_sync_file)
> > +#define DMA_BUF_IOCTL_IMPORT_SYNC_FILE       _IOW(DMA_BUF_BASE, 3, struct dma_buf_sync)
>
> Uh wrong struct here. Not good :-)
>
> Also more kerneldoc would be really nice, plus on 2nd thought I'm not
> really sure saving the few bytes in storage

Not sure what storage you're talking about.  Kernel headers?

> is such a bright idea, and
> maybe we should have distinct dma_buf_export/import_sync_file structures,
> each with their appropriate kerneldoc and no confusion.

Sure. I can do that.

> Aside from these I think this looks good. And as long as we keep up the
> rule that "shared fences in their entirety complete after the exclusive
> fence if both are present", I think we'll be fine.
> -Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 6/6] RFC: dma-buf: Add an API for importing sync files (v6)
  2021-05-25 19:19     ` Jason Ekstrand
@ 2021-05-25 19:33       ` Daniel Vetter
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Vetter @ 2021-05-25 19:33 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Intel GFX, Maling list - DRI developers

On Tue, May 25, 2021 at 9:19 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> On Tue, May 25, 2021 at 10:37 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Mon, May 24, 2021 at 03:59:54PM -0500, Jason Ekstrand wrote:
> > > [snip]
> > > +static long dma_buf_import_sync_file(struct dma_buf *dmabuf,
> > > +                                  const void __user *user_data)
> > > +{
> > > +     struct dma_buf_sync_file arg;
> > > +     struct dma_fence *fence, *singleton = NULL;
> > > +     int ret = 0;
> > > +
> > > +     if (copy_from_user(&arg, user_data, sizeof(arg)))
> > > +             return -EFAULT;
> > > +
> > > +     if (arg.flags != DMA_BUF_SYNC_RW)
> > > +             return -EINVAL;
> > > +
> > > +     fence = sync_file_get_fence(arg.fd);
> > > +     if (!fence)
> > > +             return -EINVAL;
> > > +
> > > +     dma_resv_lock(dmabuf->resv, NULL);
> > > +
> > > +     singleton = dma_resv_get_singleton_unlocked(dmabuf->resv, fence);
> > > +     if (IS_ERR(singleton))
> > > +             ret = PTR_ERR(singleton);
> > > +     else if (singleton)
> > > +             dma_resv_add_excl_fence(dmabuf->resv, singleton);
> >
> > We also need to add the new fence to the shared slots, to make sure that
> > the collective sum of shared fences still retires after the exclusive one.
> > Not upholding that will pretty surely allow userspace to pull a bunch of
> > ttm-based drivers over the table, I think.
>
> Ok, will fix.
>
> > Note that with dma-buf shared buffers there shouldn't be a problem here,
> > since as long as the buffer is in use by the other driver (which might
> > break the contract here) it's pinned. So nothing bad can happen.
> >
> >
> > Aside: The read-only version of this would just add the new fence and the
> > exclusive fence to the shared array. I think that would be useful to have,
> > if just for completeness. I need to pester you about how external images
> > work here with vulkan ...
>
> As discussed on IRC, let's leave that out until we can figure out how
> it works. :-)

Yup, there's a bunch more things we need to clarify first.

> > > [snip]
> > > diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> > > index f902cadcbdb56..75fdde4800267 100644
> > > --- a/include/uapi/linux/dma-buf.h
> > > +++ b/include/uapi/linux/dma-buf.h
> > > @@ -70,5 +70,6 @@ struct dma_buf_sync_file {
> > >  #define DMA_BUF_SET_NAME_A   _IOW(DMA_BUF_BASE, 1, u32)
> > >  #define DMA_BUF_SET_NAME_B   _IOW(DMA_BUF_BASE, 1, u64)
> > >  #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE       _IOWR(DMA_BUF_BASE, 2, struct dma_buf_sync_file)
> > > +#define DMA_BUF_IOCTL_IMPORT_SYNC_FILE       _IOW(DMA_BUF_BASE, 3, struct dma_buf_sync)
> >
> > Uh wrong struct here. Not good :-)
> >
> > Also more kerneldoc would be really nice, plus on 2nd thought I'm not
> > really sure saving the few bytes in storage
>
> Not sure what storage you're talking about.  Kernel headers?

Yeah, the disk space waste on developers' machines and all that :-)
-Daniel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-05-25 19:33 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-24 20:59 [PATCH 0/6] dma-buf: Add an API for exporting sync files (v10) Jason Ekstrand
2021-05-24 20:59 ` [PATCH 1/6] dma-buf: add dma_fence_array_for_each (v2) Jason Ekstrand
2021-05-24 20:59 ` [PATCH 2/6] dma-buf: Rename dma_resv helpers from _rcu to _unlocked Jason Ekstrand
2021-05-25 14:57   ` Daniel Vetter
2021-05-24 20:59 ` [PATCH 3/6] dma-buf: add dma_resv_get_singleton_unlocked (v4) Jason Ekstrand
2021-05-24 20:59 ` [PATCH 4/6] dma-buf: Add an API for exporting sync files (v9) Jason Ekstrand
2021-05-25 15:08   ` Daniel Vetter
2021-05-24 20:59 ` [PATCH 5/6] RFC: dma-buf: Add an extra fence to dma_resv_get_singleton_unlocked Jason Ekstrand
2021-05-25 15:25   ` [Intel-gfx] " Daniel Vetter
2021-05-24 20:59 ` [PATCH 6/6] RFC: dma-buf: Add an API for importing sync files (v6) Jason Ekstrand
2021-05-25 15:37   ` Daniel Vetter
2021-05-25 19:19     ` Jason Ekstrand
2021-05-25 19:33       ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).