All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] drm/syncobj: Add full-featured wait support
@ 2017-08-08 22:46 Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence Jason Ekstrand
                   ` (9 more replies)
  0 siblings, 10 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Dave Airlie

This series contains the drm and i915 bits required to implement the
VK_KHR_external_semaphore and VK_KHR_external_fence extensions on top of
DRM sync objects.

Patches 1-3 are cleanups and fixes for the core syncobj APIs.

Patch 4 is the i915 patch that adds support for syncobj in the i915 driver.
I'm re-sending it because I've rebased the latest version on patches 1-3.

Patch 6 is technically not required by the end of the series but is needed
for i915 if we don't have patches 7-9.  I think it's a good and possibly
useful clean-up, but I'm more than willing to drop it in favor of 7-9.

Patch 7 adds a reset ioctl which allows the caller to reset the syncobj by
setting the fence to NULL.  This is needed to implement vkResetFences.

The real meat of the series is patches 8 and 9.  The previous wait
implementation would throw -EINVAL if you ever tried to wait on a syncobj
which did not yet have a dma_fence.  Patch 9 adds a new flag to the syncobj
which makes it so that, if it does not yet have a fence, it first waits for
a fence to show up and then waits on the fence.  This is needed to properly
implement VkWaitForFences behavior.

This series can be found as a branch here:

https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syncobj-wait-submit-v1

Patches to the Intel Vulkan driver to implement VK_KHR_external_fence on
top of this kernel interface can be found here:

https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-external-fence

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>

Dave Airlie (1):
  drm/syncobj: add sync obj wait interface. (v8)

Jason Ekstrand (8):
  drm/syncobj: Rename fence_get to find_fence
  drm/syncobj: Lock around drm_syncobj::fence
  drm/syncobj: Remove file_private from replace_fence
  i915: Add support for drm syncobjs
  dma-buf/dma-fence: Allow wait_any_timeout without default_wait
  drm/syncobj: Add a reset ioctl
  drm/syncobj: Add a callback mechanism for replace_fence
  drm/syncobj: Allow wait for submit and signal behavior

 drivers/dma-buf/dma-fence.c                |   5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   5 +-
 drivers/gpu/drm/drm_internal.h             |   4 +
 drivers/gpu/drm/drm_ioctl.c                |   4 +
 drivers/gpu/drm/drm_syncobj.c              | 522 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drv.c            |   3 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 +++++++-
 include/drm/drm_syncobj.h                  |  46 ++-
 include/uapi/drm/drm.h                     |  19 ++
 include/uapi/drm/i915_drm.h                |  30 +-
 10 files changed, 752 insertions(+), 32 deletions(-)

-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
@ 2017-08-08 22:46 ` Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 2/9] drm/syncobj: Lock around drm_syncobj::fence Jason Ekstrand
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

The function has far more in common with drm_syncobj_find than with
any in the get/put functions.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 +-
 drivers/gpu/drm/drm_syncobj.c          | 10 +++++-----
 include/drm/drm_syncobj.h              |  6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index aeee684..0aa33ae 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -984,7 +984,7 @@ static int amdgpu_syncobj_lookup_and_add_to_sync(struct amdgpu_cs_parser *p,
 {
 	int r;
 	struct dma_fence *fence;
-	r = drm_syncobj_fence_get(p->filp, handle, &fence);
+	r = drm_syncobj_find_fence(p->filp, handle, &fence);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 89441bc..6f2a6041 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -97,9 +97,9 @@ void drm_syncobj_replace_fence(struct drm_file *file_private,
 }
 EXPORT_SYMBOL(drm_syncobj_replace_fence);
 
-int drm_syncobj_fence_get(struct drm_file *file_private,
-			  u32 handle,
-			  struct dma_fence **fence)
+int drm_syncobj_find_fence(struct drm_file *file_private,
+			   u32 handle,
+			   struct dma_fence **fence)
 {
 	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
 	int ret = 0;
@@ -114,7 +114,7 @@ int drm_syncobj_fence_get(struct drm_file *file_private,
 	drm_syncobj_put(syncobj);
 	return ret;
 }
-EXPORT_SYMBOL(drm_syncobj_fence_get);
+EXPORT_SYMBOL(drm_syncobj_find_fence);
 
 /**
  * drm_syncobj_free - free a sync object.
@@ -309,7 +309,7 @@ int drm_syncobj_export_sync_file(struct drm_file *file_private,
 	if (fd < 0)
 		return fd;
 
-	ret = drm_syncobj_fence_get(file_private, handle, &fence);
+	ret = drm_syncobj_find_fence(file_private, handle, &fence);
 	if (ret)
 		goto err_put_fd;
 
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 2c3610a..9ffff22 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -82,9 +82,9 @@ struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 void drm_syncobj_replace_fence(struct drm_file *file_private,
 			       struct drm_syncobj *syncobj,
 			       struct dma_fence *fence);
-int drm_syncobj_fence_get(struct drm_file *file_private,
-			  u32 handle,
-			  struct dma_fence **fence);
+int drm_syncobj_find_fence(struct drm_file *file_private,
+			   u32 handle,
+			   struct dma_fence **fence);
 void drm_syncobj_free(struct kref *kref);
 
 #endif
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 2/9] drm/syncobj: Lock around drm_syncobj::fence
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence Jason Ekstrand
@ 2017-08-08 22:46 ` Jason Ekstrand
  2017-08-09 21:21   ` Chris Wilson
  2017-08-08 22:46 ` [PATCH 3/9] drm/syncobj: Remove file_private from replace_fence Jason Ekstrand
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

The atomic exchange operation we were doing before in replace_fence was
sufficient for the case where it raced with itself.  However, if you
have a race between a replace_fence and dma_fence_get(syncobj->fence),
you may end up with the entire replace_fence happening between the point
in time where the one thread gets the syncobj->fence pointer and when it
calls dma_fence_get() on it.  If this happens, then the reference may be
dropped before we get a chance to get a new one.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/drm_syncobj.c | 25 +++++++++++++++++++++++--
 include/drm/drm_syncobj.h     |  6 ++++++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 6f2a6041..dafca83 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -75,6 +75,22 @@ struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 }
 EXPORT_SYMBOL(drm_syncobj_find);
 
+struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj)
+{
+	struct dma_fence *fence;
+
+	/* Lock to prevent someone from replacing the fence and dropping
+	 * the syncobj's reference between when we get the dma_fence
+	 * pointer and wehen we have a chance to incref.
+	 */
+	spin_lock(&syncobj->lock);
+	fence = dma_fence_get(syncobj->fence);
+	spin_unlock(&syncobj->lock);
+
+	return fence;
+}
+EXPORT_SYMBOL(drm_syncobj_fence_get);
+
 /**
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @file_private: drm file private pointer.
@@ -91,7 +107,11 @@ void drm_syncobj_replace_fence(struct drm_file *file_private,
 
 	if (fence)
 		dma_fence_get(fence);
-	old_fence = xchg(&syncobj->fence, fence);
+
+	spin_lock(&syncobj->lock);
+	old_fence = syncobj->fence;
+	syncobj->fence = fence;
+	spin_unlock(&syncobj->lock);
 
 	dma_fence_put(old_fence);
 }
@@ -107,7 +127,7 @@ int drm_syncobj_find_fence(struct drm_file *file_private,
 	if (!syncobj)
 		return -ENOENT;
 
-	*fence = dma_fence_get(syncobj->fence);
+	*fence = drm_syncobj_fence_get(syncobj);
 	if (!*fence) {
 		ret = -EINVAL;
 	}
@@ -143,6 +163,7 @@ static int drm_syncobj_create(struct drm_file *file_private,
 		return -ENOMEM;
 
 	kref_init(&syncobj->refcount);
+	spin_lock_init(&syncobj->lock);
 
 	idr_preload(GFP_KERNEL);
 	spin_lock(&file_private->syncobj_table_lock);
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 9ffff22..422d631 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -46,6 +46,11 @@ struct drm_syncobj {
 	 */
 	struct dma_fence *fence;
 	/**
+	 * @spinlock:
+	 * locks fence.
+	 */
+	spinlock_t lock;
+	/**
 	 * @file:
 	 * a file backing for this syncobj.
 	 */
@@ -79,6 +84,7 @@ drm_syncobj_put(struct drm_syncobj *obj)
 
 struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 				     u32 handle);
+struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj);
 void drm_syncobj_replace_fence(struct drm_file *file_private,
 			       struct drm_syncobj *syncobj,
 			       struct dma_fence *fence);
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 3/9] drm/syncobj: Remove file_private from replace_fence
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 2/9] drm/syncobj: Lock around drm_syncobj::fence Jason Ekstrand
@ 2017-08-08 22:46 ` Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 4/9] i915: Add support for drm syncobjs Jason Ekstrand
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

It's completely unused.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 +--
 drivers/gpu/drm/drm_syncobj.c          | 5 ++---
 include/drm/drm_syncobj.h              | 3 +--
 3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 0aa33ae..3314702 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1070,8 +1070,7 @@ static void amdgpu_cs_post_dependencies(struct amdgpu_cs_parser *p)
 	int i;
 
 	for (i = 0; i < p->num_post_dep_syncobjs; ++i) {
-		drm_syncobj_replace_fence(p->filp, p->post_dep_syncobjs[i],
-					  p->fence);
+		drm_syncobj_replace_fence(p->post_dep_syncobjs[i], p->fence);
 	}
 }
 
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index dafca83..7fdf7e8 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -99,8 +99,7 @@ EXPORT_SYMBOL(drm_syncobj_fence_get);
  *
  * This replaces the fence on a sync object.
  */
-void drm_syncobj_replace_fence(struct drm_file *file_private,
-			       struct drm_syncobj *syncobj,
+void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence)
 {
 	struct dma_fence *old_fence = NULL;
@@ -313,7 +312,7 @@ int drm_syncobj_import_sync_file_fence(struct drm_file *file_private,
 		return -ENOENT;
 	}
 
-	drm_syncobj_replace_fence(file_private, syncobj, fence);
+	drm_syncobj_replace_fence(syncobj, fence);
 	dma_fence_put(fence);
 	drm_syncobj_put(syncobj);
 	return 0;
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 422d631..1c81658 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -85,8 +85,7 @@ drm_syncobj_put(struct drm_syncobj *obj)
 struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 				     u32 handle);
 struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj);
-void drm_syncobj_replace_fence(struct drm_file *file_private,
-			       struct drm_syncobj *syncobj,
+void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence);
 int drm_syncobj_find_fence(struct drm_file *file_private,
 			   u32 handle,
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 4/9] i915: Add support for drm syncobjs
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
                   ` (2 preceding siblings ...)
  2017-08-08 22:46 ` [PATCH 3/9] drm/syncobj: Remove file_private from replace_fence Jason Ekstrand
@ 2017-08-08 22:46 ` Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 5/9] drm/syncobj: add sync obj wait interface. (v8) Jason Ekstrand
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

This commit adds support for waiting on or signaling DRM syncobjs as
part of execbuf.  It does so by hijacking the currently unused cliprects
pointer to instead point to an array of i915_gem_exec_fence structs
which containe a DRM syncobj and a flags parameter which specifies
whether to wait on it or to signal it.  This implementation
theoretically allows for both flags to be set in which case it waits on
the dma_fence that was in the syncobj and then immediately replaces it
with the dma_fence from the current execbuf.

v2:
 - Rebase on new syncobj API
v3:
 - Pull everything out into helpers
 - Do all allocation in gem_execbuffer2
 - Pack the flags in the bottom 2 bits of the drm_syncobj*
v4:
 - Prevent a potential race on syncobj->fence

Testcase: igt/gem_exec_fence/syncobj*
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c            |   3 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++++++++++++++++-
 include/uapi/drm/i915_drm.h                |  30 +++++-
 3 files changed, 171 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6a22f0e..f8f3d45 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -384,6 +384,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_FENCE:
 	case I915_PARAM_HAS_EXEC_CAPTURE:
 	case I915_PARAM_HAS_EXEC_BATCH_FIRST:
+	case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
 		/* For the time being all of these are always true;
 		 * if some supported hardware does not have one of these
 		 * features this value needs to be provided from
@@ -2733,7 +2734,7 @@ static struct drm_driver driver = {
 	 */
 	.driver_features =
 	    DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM | DRIVER_PRIME |
-	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC,
+	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_SYNCOBJ,
 	.release = i915_driver_release,
 	.open = i915_driver_open,
 	.lastclose = i915_driver_lastclose,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 26eedb1..936fb0b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -33,6 +33,7 @@
 
 #include <drm/drmP.h>
 #include <drm/i915_drm.h>
+#include <drm/drm_syncobj.h>
 
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
@@ -1800,8 +1801,10 @@ static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
 		return false;
 
 	/* Kernel clipping was a DRI1 misfeature */
-	if (exec->num_cliprects || exec->cliprects_ptr)
-		return false;
+	if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
+		if (exec->num_cliprects || exec->cliprects_ptr)
+			return false;
+	}
 
 	if (exec->DR4 == 0xffffffff) {
 		DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
@@ -2029,11 +2032,125 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 	return engine;
 }
 
+static void __free_fence_array(struct drm_syncobj **fences, unsigned int n)
+{
+	while (n--)
+		drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+	kvfree(fences);
+}
+
+static struct drm_syncobj **get_fence_array(struct drm_i915_gem_execbuffer2 *args,
+					    struct drm_file *file)
+{
+	const unsigned int nfences = args->num_cliprects;
+	struct drm_i915_gem_exec_fence __user *user;
+	struct drm_syncobj **fences;
+	unsigned int n;
+	int err;
+
+	if (!(args->flags & I915_EXEC_FENCE_ARRAY))
+		return NULL;
+
+	if (nfences > SIZE_MAX / sizeof(*fences))
+		return ERR_PTR(-EINVAL);
+
+	user = u64_to_user_ptr(args->cliprects_ptr);
+	if (!access_ok(VERIFY_READ, user, nfences * 2 * sizeof(u32)))
+		return ERR_PTR(-EFAULT);
+
+	fences = kvmalloc_array(args->num_cliprects, sizeof(*fences),
+				__GFP_NOWARN | GFP_TEMPORARY);
+	if (!fences)
+		return ERR_PTR(-ENOMEM);
+
+	for (n = 0; n < nfences; n++) {
+		struct drm_i915_gem_exec_fence fence;
+		struct drm_syncobj *syncobj;
+
+		if (__copy_from_user(&fence, user++, sizeof(fence))) {
+			err = -EFAULT;
+			goto err;
+		}
+
+		syncobj = drm_syncobj_find(file, fence.handle);
+		if (!syncobj) {
+			DRM_DEBUG("Invalid syncobj handle provided\n");
+			err = -EINVAL;
+			goto err;
+		}
+
+		fences[n] = ptr_pack_bits(syncobj, fence.flags, 2);
+	}
+
+	return fences;
+
+err:
+	__free_fence_array(fences, n);
+	return ERR_PTR(err);
+}
+
+static void put_fence_array(struct drm_i915_gem_execbuffer2 *args,
+			    struct drm_syncobj **fences)
+{
+	if (!fences)
+		return;
+	__free_fence_array(fences, args->num_cliprects);
+}
+
+static int await_fence_array(struct i915_execbuffer *eb,
+			     struct drm_syncobj **fences)
+{
+	const unsigned int nfences = eb->args->num_cliprects;
+	unsigned int n;
+	int err;
+
+	for (n = 0; n < nfences; n++) {
+		struct drm_syncobj *syncobj;
+		struct dma_fence *fence;
+		unsigned int flags;
+
+		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		if (!(flags & I915_EXEC_FENCE_WAIT))
+			continue;
+
+		fence = drm_syncobj_fence_get(syncobj);
+		if (!fence)
+		    return -EINVAL;
+
+		err = i915_gem_request_await_dma_fence(eb->request, fence);
+		dma_fence_put(fence);
+		if (err < 0)
+			return err;
+	}
+
+	return 0;
+}
+
+static void signal_fence_array(struct i915_execbuffer *eb,
+			       struct drm_syncobj **fences)
+{
+	const unsigned int nfences = eb->args->num_cliprects;
+	struct dma_fence * const fence = &eb->request->fence;
+	unsigned int n;
+
+	for (n = 0; n < nfences; n++) {
+		struct drm_syncobj *syncobj;
+		unsigned int flags;
+
+		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		if (!(flags & I915_EXEC_FENCE_SIGNAL))
+			continue;
+
+		drm_syncobj_replace_fence(syncobj, fence);
+	}
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
-		       struct drm_i915_gem_exec_object2 *exec)
+		       struct drm_i915_gem_exec_object2 *exec,
+		       struct drm_syncobj **fences)
 {
 	struct i915_execbuffer eb;
 	struct dma_fence *in_fence = NULL;
@@ -2221,6 +2338,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			goto err_request;
 	}
 
+	if (fences) {
+		err = await_fence_array(&eb, fences);
+		if (err < 0)
+			goto err_request;
+	}
+
 	if (out_fence_fd != -1) {
 		out_fence = sync_file_create(&eb.request->fence);
 		if (!out_fence) {
@@ -2244,6 +2367,9 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	__i915_add_request(eb.request, err == 0);
 	add_to_client(eb.request, file);
 
+	if (fences)
+		signal_fence_array(&eb, fences);
+
 	if (out_fence) {
 		if (err == 0) {
 			fd_install(out_fence_fd, out_fence->file);
@@ -2347,7 +2473,7 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 			exec2_list[i].flags = 0;
 	}
 
-	err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list);
+	err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list, NULL);
 	if (exec2.flags & __EXEC_HAS_RELOC) {
 		struct drm_i915_gem_exec_object __user *user_exec_list =
 			u64_to_user_ptr(args->buffers_ptr);
@@ -2381,6 +2507,7 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 			   sizeof(unsigned int));
 	struct drm_i915_gem_execbuffer2 *args = data;
 	struct drm_i915_gem_exec_object2 *exec2_list;
+	struct drm_syncobj **fences = NULL;
 	int err;
 
 	if (args->buffer_count < 1 || args->buffer_count > SIZE_MAX / sz - 1) {
@@ -2407,7 +2534,15 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EFAULT;
 	}
 
-	err = i915_gem_do_execbuffer(dev, file, args, exec2_list);
+	if (args->flags & I915_EXEC_FENCE_ARRAY) {
+		fences = get_fence_array(args, file);
+		if (IS_ERR(fences)) {
+			kvfree(exec2_list);
+			return PTR_ERR(fences);
+		}
+	}
+
+	err = i915_gem_do_execbuffer(dev, file, args, exec2_list, fences);
 
 	/*
 	 * Now that we have begun execution of the batchbuffer, we ignore
@@ -2438,5 +2573,6 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 
 	args->flags &= ~__I915_EXEC_UNKNOWN_FLAGS;
 	kvfree(exec2_list);
+	put_fence_array(args, fences);
 	return err;
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7650e1d..3a0b61a 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -475,6 +475,11 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_BATCH_FIRST	 48
 
+/* Query whether DRM_I915_GEM_EXECBUFFER2 supports supplying an array of
+ * drm_i915_gem_exec_fence structures.  See I915_EXEC_FENCE_ARRAY.
+ */
+#define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
+
 typedef struct drm_i915_getparam {
 	__s32 param;
 	/*
@@ -856,6 +861,17 @@ struct drm_i915_gem_exec_object2 {
 	__u64 rsvd2;
 };
 
+struct drm_i915_gem_exec_fence {
+	/**
+	 * User's handle for a dma-fence to wait on or signal.
+	 */
+	__u32 handle;
+
+#define I915_EXEC_FENCE_WAIT            (1<<0)
+#define I915_EXEC_FENCE_SIGNAL          (1<<1)
+	__u32 flags;
+};
+
 struct drm_i915_gem_execbuffer2 {
 	/**
 	 * List of gem_exec_object2 structs
@@ -870,7 +886,10 @@ struct drm_i915_gem_execbuffer2 {
 	__u32 DR1;
 	__u32 DR4;
 	__u32 num_cliprects;
-	/** This is a struct drm_clip_rect *cliprects */
+	/** This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
+     * is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
+     * struct drm_i915_gem_exec_fence *fences.
+     */
 	__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK              (7<<0)
 #define I915_EXEC_DEFAULT                (0<<0)
@@ -971,7 +990,14 @@ struct drm_i915_gem_execbuffer2 {
  * element).
  */
 #define I915_EXEC_BATCH_FIRST		(1<<18)
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_BATCH_FIRST<<1))
+
+/* Setting I915_FENCE_ARRAY implies that num_cliprects and cliprects_ptr
+ * define an array of i915_gem_exec_fence structures which specify a set of
+ * dma fences to wait upon or signal.
+ */
+#define I915_EXEC_FENCE_ARRAY   (1<<19)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 5/9] drm/syncobj: add sync obj wait interface. (v8)
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
                   ` (3 preceding siblings ...)
  2017-08-08 22:46 ` [PATCH 4/9] i915: Add support for drm syncobjs Jason Ekstrand
@ 2017-08-08 22:46 ` Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 6/9] dma-buf/dma-fence: Allow wait_any_timeout without default_wait Jason Ekstrand
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Dave Airlie

From: Dave Airlie <airlied@redhat.com>

This interface will allow sync object to be used to back
Vulkan fences. This API is pretty much the vulkan fence waiting
API, and I've ported the code from amdgpu.

v2: accept relative timeout, pass remaining time back
to userspace.
v3: return to absolute timeouts.
v4: absolute zero = poll,
    rewrite any/all code to have same operation for arrays
    return -EINVAL for 0 fences.
v4.1: fixup fences allocation check, use u64_to_user_ptr
v5: move to sec/nsec, and use timespec64 for calcs.
v6: use -ETIME and drop the out status flag. (-ETIME
is suggested by ickle, I can feel a shed painting)
v7: talked to Daniel/Arnd, use ktime and ns everywhere.
v8: be more careful in the timeout calculations
    use uint32_t for counter variables so we don't overflow
    graciously handle -ENOINT being returned from dma_fence_wait_timeout

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/drm_internal.h |   2 +
 drivers/gpu/drm/drm_ioctl.c    |   2 +
 drivers/gpu/drm/drm_syncobj.c  | 142 +++++++++++++++++++++++++++++++++++++++++
 include/uapi/drm/drm.h         |  12 ++++
 4 files changed, 158 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 068b685..f3c4e57 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -163,3 +163,5 @@ int drm_syncobj_handle_to_fd_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file_private);
 int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file_private);
+int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file_private);
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index f7f150e..310f3ec 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -657,6 +657,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, drm_syncobj_fd_to_handle_ioctl,
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
+		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 };
 
 #define DRM_CORE_IOCTL_COUNT	ARRAY_SIZE( drm_ioctls )
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 7fdf7e8..2f14e38 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1,5 +1,7 @@
 /*
  * Copyright 2017 Red Hat
+ * Parts ported from amdgpu (fence wait code).
+ * Copyright 2016 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -31,6 +33,9 @@
  * that contain an optional fence. The fence can be updated with a new
  * fence, or be NULL.
  *
+ * syncobj's can be waited upon, where it will wait for the underlying
+ * fence.
+ *
  * syncobj's can be export to fd's and back, these fd's are opaque and
  * have no other use case, except passing the syncobj between processes.
  *
@@ -471,3 +476,140 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 	return drm_syncobj_fd_to_handle(file_private, args->fd,
 					&args->handle);
 }
+
+/**
+ * drm_timeout_abs_to_jiffies - calculate jiffies timeout from absolute value
+ *
+ * @timeout_nsec: timeout nsec component in ns, 0 for poll
+ *
+ * Calculate the timeout in jiffies from an absolute time in sec/nsec.
+ */
+static signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec)
+{
+	ktime_t abs_timeout, now;
+	u64 timeout_ns, timeout_jiffies64;
+
+	/* make 0 timeout means poll - absolute 0 doesn't seem valid */
+	if (timeout_nsec == 0)
+		return 0;
+
+	abs_timeout = ns_to_ktime(timeout_nsec);
+	now = ktime_get();
+
+	if (!ktime_after(abs_timeout, now))
+		return 0;
+
+	timeout_ns = ktime_to_ns(ktime_sub(abs_timeout, now));
+
+	timeout_jiffies64 = nsecs_to_jiffies64(timeout_ns);
+	/*  clamp timeout to avoid infinite timeout */
+	if (timeout_jiffies64 >= MAX_SCHEDULE_TIMEOUT - 1)
+		return MAX_SCHEDULE_TIMEOUT - 1;
+
+	return timeout_jiffies64 + 1;
+}
+
+static int drm_syncobj_wait_fences(struct drm_device *dev,
+				   struct drm_file *file_private,
+				   struct drm_syncobj_wait *wait,
+				   struct dma_fence **fences)
+{
+	signed long timeout = drm_timeout_abs_to_jiffies(wait->timeout_nsec);
+	signed long ret = 0;
+	uint32_t first = ~0;
+
+	if (wait->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
+		uint32_t i;
+		for (i = 0; i < wait->count_handles; i++) {
+			ret = dma_fence_wait_timeout(fences[i], true, timeout);
+
+			/* Various dma_fence wait callbacks will return
+			 * ENOENT to indicate that the fence has already
+			 * been signaled.  We need to sanitize this to 0 so
+			 * we don't return early and the client doesn't see
+			 * an unexpected error.
+			 */
+			if (ret == -ENOENT)
+				ret = 0;
+
+			if (ret < 0)
+				return ret;
+			if (ret == 0)
+				break;
+			timeout = ret;
+		}
+		first = 0;
+	} else {
+		ret = dma_fence_wait_any_timeout(fences,
+						 wait->count_handles,
+						 true, timeout,
+						 &first);
+	}
+
+	if (ret < 0)
+		return ret;
+
+	wait->first_signaled = first;
+	if (ret == 0)
+		return -ETIME;
+	return 0;
+}
+
+int
+drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
+		       struct drm_file *file_private)
+{
+	struct drm_syncobj_wait *args = data;
+	uint32_t *handles;
+	struct dma_fence **fences;
+	int ret = 0;
+	uint32_t i;
+
+	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+		return -ENODEV;
+
+	if (args->flags != 0 && args->flags != DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
+		return -EINVAL;
+
+	if (args->count_handles == 0)
+		return -EINVAL;
+
+	/* Get the handles from userspace */
+	handles = kmalloc_array(args->count_handles, sizeof(uint32_t),
+				GFP_KERNEL);
+	if (handles == NULL)
+		return -ENOMEM;
+
+	if (copy_from_user(handles,
+			   u64_to_user_ptr(args->handles),
+			   sizeof(uint32_t) * args->count_handles)) {
+		ret = -EFAULT;
+		goto err_free_handles;
+	}
+
+	fences = kcalloc(args->count_handles,
+			 sizeof(struct dma_fence *), GFP_KERNEL);
+	if (!fences) {
+		ret = -ENOMEM;
+		goto err_free_handles;
+	}
+
+	for (i = 0; i < args->count_handles; i++) {
+		ret = drm_syncobj_find_fence(file_private, handles[i],
+					     &fences[i]);
+		if (ret)
+			goto err_free_fence_array;
+	}
+
+	ret = drm_syncobj_wait_fences(dev, file_private,
+				      args, fences);
+
+err_free_fence_array:
+	for (i = 0; i < args->count_handles; i++)
+		dma_fence_put(fences[i]);
+	kfree(fences);
+err_free_handles:
+	kfree(handles);
+
+	return ret;
+}
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 101593a..0757c1a 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -718,6 +718,17 @@ struct drm_syncobj_handle {
 	__u32 pad;
 };
 
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+struct drm_syncobj_wait {
+	__u64 handles;
+	/* absolute timeout */
+	__s64 timeout_nsec;
+	__u32 count_handles;
+	__u32 flags;
+	__u32 first_signaled; /* only valid when not waiting all */
+	__u32 pad;
+};
+
 #if defined(__cplusplus)
 }
 #endif
@@ -840,6 +851,7 @@ extern "C" {
 #define DRM_IOCTL_SYNCOBJ_DESTROY	DRM_IOWR(0xC0, struct drm_syncobj_destroy)
 #define DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD	DRM_IOWR(0xC1, struct drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE	DRM_IOWR(0xC2, struct drm_syncobj_handle)
+#define DRM_IOCTL_SYNCOBJ_WAIT		DRM_IOWR(0xC3, struct drm_syncobj_wait)
 
 /**
  * Device specific ioctls should only be in their respective headers
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 6/9] dma-buf/dma-fence: Allow wait_any_timeout without default_wait
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
                   ` (4 preceding siblings ...)
  2017-08-08 22:46 ` [PATCH 5/9] drm/syncobj: add sync obj wait interface. (v8) Jason Ekstrand
@ 2017-08-08 22:46 ` Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 7/9] drm/syncobj: Add a reset ioctl Jason Ekstrand
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

dma_fence_wait_any_timeout only relies on two things to work correctly:

 1) The SIGNALED flag to be set upon fence signal
 2) The callback list to get called upon fence signal

Both of these are part of the core dma-fence API so it should work fine
even with a non-default wait function.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/dma-buf/dma-fence.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 57da14c..1218692 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -501,11 +501,6 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 	for (i = 0; i < count; ++i) {
 		struct dma_fence *fence = fences[i];
 
-		if (fence->ops->wait != dma_fence_default_wait) {
-			ret = -EINVAL;
-			goto fence_rm_cb;
-		}
-
 		cb[i].task = current;
 		if (dma_fence_add_callback(fence, &cb[i].base,
 					   dma_fence_default_wait_cb)) {
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 7/9] drm/syncobj: Add a reset ioctl
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
                   ` (5 preceding siblings ...)
  2017-08-08 22:46 ` [PATCH 6/9] dma-buf/dma-fence: Allow wait_any_timeout without default_wait Jason Ekstrand
@ 2017-08-08 22:46 ` Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 8/9] drm/syncobj: Add a callback mechanism for replace_fence Jason Ekstrand
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

This just resets the dma_fence to NULL so it looks like it's never been
signaled.  This will be useful once we add the new wait API for allowing
wait on "submit and signal" behavior.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/drm_internal.h |  2 ++
 drivers/gpu/drm/drm_ioctl.c    |  2 ++
 drivers/gpu/drm/drm_syncobj.c  | 24 ++++++++++++++++++++++++
 include/uapi/drm/drm.h         |  6 ++++++
 4 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index f3c4e57..6e8bc4d 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -165,3 +165,5 @@ int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file_private);
 int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file_private);
+int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file_private);
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 310f3ec..1acf6ac 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -659,6 +659,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl,
+		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 };
 
 #define DRM_CORE_IOCTL_COUNT	ARRAY_SIZE( drm_ioctls )
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 2f14e38..25807f5 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -613,3 +613,27 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 
 	return ret;
 }
+
+int
+drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
+			struct drm_file *file_private)
+{
+	struct drm_syncobj_reset *args = data;
+	struct drm_syncobj *syncobj;
+
+	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+		return -ENODEV;
+
+	if (args->flags != 0)
+		return -EINVAL;
+
+	syncobj = drm_syncobj_find(file_private, args->handle);
+	if (!syncobj)
+		return -ENOENT;
+
+	drm_syncobj_replace_fence(syncobj, NULL);
+
+	drm_syncobj_put(syncobj);
+
+	return 0;
+}
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 0757c1a..4b301b4 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -729,6 +729,11 @@ struct drm_syncobj_wait {
 	__u32 pad;
 };
 
+struct drm_syncobj_reset {
+	__u32 handle;
+	__u32 flags;
+};
+
 #if defined(__cplusplus)
 }
 #endif
@@ -852,6 +857,7 @@ extern "C" {
 #define DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD	DRM_IOWR(0xC1, struct drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE	DRM_IOWR(0xC2, struct drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_WAIT		DRM_IOWR(0xC3, struct drm_syncobj_wait)
+#define DRM_IOCTL_SYNCOBJ_RESET		DRM_IOWR(0xC4, struct drm_syncobj_reset)
 
 /**
  * Device specific ioctls should only be in their respective headers
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 8/9] drm/syncobj: Add a callback mechanism for replace_fence
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
                   ` (6 preceding siblings ...)
  2017-08-08 22:46 ` [PATCH 7/9] drm/syncobj: Add a reset ioctl Jason Ekstrand
@ 2017-08-08 22:46 ` Jason Ekstrand
  2017-08-08 22:46 ` [PATCH 9/9] drm/syncobj: Allow wait for submit and signal behavior Jason Ekstrand
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

It is useful in certain circumstances to know when the fence is replaced
in a syncobj.  Specifically, it may be useful to know when the fence
goes from NULL to something valid.  This does make syncobj_replace_fence
a little more expensive because it has to take a lock but, in the common
case where there is no callback list, it spends a very short amount of
time inside the lock.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/drm_syncobj.c | 51 ++++++++++++++++++++++++++++++++++++++++++-
 include/drm/drm_syncobj.h     | 33 +++++++++++++++++++++++++++-
 2 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 25807f5..510dfc2 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -96,6 +96,46 @@ struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj)
 }
 EXPORT_SYMBOL(drm_syncobj_fence_get);
 
+static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
+					    struct drm_syncobj_cb *cb,
+					    drm_syncobj_func_t func)
+{
+	cb->func = func;
+	list_add_tail(&cb->node, &syncobj->cb_list);
+}
+
+/**
+ * drm_syncobj_add_callback - adds a callback to syncobj::cb_list
+ * @syncobj: Sync object to which to add the callback
+ * @cb: Callback to add
+ * @func: Func to use when initializing the drm_syncobj_cb struct
+ *
+ * This adds a callback to be called next time the fence is replaced
+ */
+void drm_syncobj_add_callback(struct drm_syncobj *syncobj,
+			      struct drm_syncobj_cb *cb,
+			      drm_syncobj_func_t func)
+{
+	spin_lock(&syncobj->lock);
+	drm_syncobj_add_callback_locked(syncobj, cb, func);
+	spin_unlock(&syncobj->lock);
+}
+EXPORT_SYMBOL(drm_syncobj_add_callback);
+
+/**
+ * drm_syncobj_add_callback - removes a callback to syncobj::cb_list
+ * @syncobj: Sync object from which to remove the callback
+ * @cb: Callback to remove
+ */
+void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
+				 struct drm_syncobj_cb *cb)
+{
+	spin_lock(&syncobj->lock);
+	list_del_init(&cb->node);
+	spin_unlock(&syncobj->lock);
+}
+EXPORT_SYMBOL(drm_syncobj_remove_callback);
+
 /**
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @file_private: drm file private pointer.
@@ -108,13 +148,21 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence)
 {
 	struct dma_fence *old_fence = NULL;
+	struct drm_syncobj_cb *cur, *tmp;
 
 	if (fence)
 		dma_fence_get(fence);
 
 	spin_lock(&syncobj->lock);
+
 	old_fence = syncobj->fence;
 	syncobj->fence = fence;
+
+	list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
+		list_del_init(&cur->node);
+		cur->func(syncobj, cur);
+	}
+
 	spin_unlock(&syncobj->lock);
 
 	dma_fence_put(old_fence);
@@ -151,7 +199,7 @@ void drm_syncobj_free(struct kref *kref)
 	struct drm_syncobj *syncobj = container_of(kref,
 						   struct drm_syncobj,
 						   refcount);
-	dma_fence_put(syncobj->fence);
+	drm_syncobj_replace_fence(syncobj, NULL);
 	kfree(syncobj);
 }
 EXPORT_SYMBOL(drm_syncobj_free);
@@ -167,6 +215,7 @@ static int drm_syncobj_create(struct drm_file *file_private,
 		return -ENOMEM;
 
 	kref_init(&syncobj->refcount);
+	INIT_LIST_HEAD(&syncobj->cb_list);
 	spin_lock_init(&syncobj->lock);
 
 	idr_preload(GFP_KERNEL);
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 1c81658..a81af76 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -28,6 +28,8 @@
 
 #include "linux/dma-fence.h"
 
+struct drm_syncobj_cb;
+
 /**
  * struct drm_syncobj - sync object.
  *
@@ -46,8 +48,13 @@ struct drm_syncobj {
 	 */
 	struct dma_fence *fence;
 	/**
+	 * @list
+	 * List of callbaks to call when the fence gets replaced
+	 */
+	struct list_head cb_list;
+	/**
 	 * @spinlock:
-	 * locks fence.
+	 * locks fence and cb_list.
 	 */
 	spinlock_t lock;
 	/**
@@ -57,6 +64,25 @@ struct drm_syncobj {
 	struct file *file;
 };
 
+typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
+				   struct drm_syncobj_cb *cb);
+
+/**
+ * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
+ * @node: used by drm_syncob_add_callback to append this struct to
+ *	  syncobj::cb_list
+ * @func: drm_syncobj_func_t to call
+ *
+ * This struct will be initialized by drm_syncobj_add_callback, additional
+ * data can be passed along by embedding drm_syncobj_cb in another struct.
+ * The callback will get called the next time drm_syncobj_replace_fence is
+ * called.
+ */
+struct drm_syncobj_cb {
+	struct list_head node;
+	drm_syncobj_func_t func;
+};
+
 void drm_syncobj_free(struct kref *kref);
 
 /**
@@ -85,6 +111,11 @@ drm_syncobj_put(struct drm_syncobj *obj)
 struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 				     u32 handle);
 struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj);
+void drm_syncobj_add_callback(struct drm_syncobj *syncobj,
+			      struct drm_syncobj_cb *cb,
+			      drm_syncobj_func_t func);
+void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
+				 struct drm_syncobj_cb *cb);
 void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence);
 int drm_syncobj_find_fence(struct drm_file *file_private,
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 9/9] drm/syncobj: Allow wait for submit and signal behavior
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
                   ` (7 preceding siblings ...)
  2017-08-08 22:46 ` [PATCH 8/9] drm/syncobj: Add a callback mechanism for replace_fence Jason Ekstrand
@ 2017-08-08 22:46 ` Jason Ekstrand
  2017-08-09 17:00   ` [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2) Jason Ekstrand
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
  9 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-08 22:46 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

Vulkan VkFence semantics require that the application be able to perform
a CPU wait on work which may not yet have been submitted.  This is
perfectly safe because the CPU wait has a timeout which will get
triggered eventually if no work is ever submitted.  This behavior is
advantageous for multi-threaded workloads because, so long as all of the
threads agree on what fences to use up-front, you don't have the extra
cross-thread synchronization cost of thread A telling thread B that it
has submitted its dependent work and thread B is now free to wait.

Within a single process, this can be implemented in the userspace driver
by doing exactly the same kind of tracking the app would have to do
using posix condition variables or similar.  However, in order for this
to work cross-process (as is required by VK_KHR_external_fence), we need
to handle this in the kernel.

This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
instructs the IOCTL to wait for the syncobj to have a non-null fence and
then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
easily get the Vulkan behavior.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/drm_syncobj.c | 319 ++++++++++++++++++++++++++++++++++++++----
 include/uapi/drm/drm.h        |   1 +
 2 files changed, 293 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 510dfc2..ecfc43b 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -51,6 +51,7 @@
 #include <linux/fs.h>
 #include <linux/anon_inodes.h>
 #include <linux/sync_file.h>
+#include <linux/sched/signal.h>
 
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
@@ -104,6 +105,27 @@ static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
 	list_add_tail(&cb->node, &syncobj->cb_list);
 }
 
+static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
+						 struct dma_fence **fence,
+						 struct drm_syncobj_cb *cb,
+						 drm_syncobj_func_t func)
+{
+	int ret;
+
+	spin_lock(&syncobj->lock);
+	if (syncobj->fence) {
+		*fence = dma_fence_get(syncobj->fence);
+		ret = 1;
+	} else {
+		*fence = NULL;
+		drm_syncobj_add_callback_locked(syncobj, cb, func);
+		ret = 0;
+	}
+	spin_unlock(&syncobj->lock);
+
+	return ret;
+}
+
 /**
  * drm_syncobj_add_callback - adds a callback to syncobj::cb_list
  * @syncobj: Sync object to which to add the callback
@@ -526,6 +548,255 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 					&args->handle);
 }
 
+static int drm_syncobj_signaled(struct drm_syncobj *syncobj,
+				uint32_t wait_flags)
+{
+	struct dma_fence *fence;
+	int ret;
+
+	fence = drm_syncobj_fence_get(syncobj);
+	if (!fence) {
+		if (wait_flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
+			return 0;
+		else
+			return -EINVAL;
+	}
+
+	ret = dma_fence_is_signaled(fence) ? 1 : 0;
+
+	dma_fence_put(fence);
+
+	return ret;
+}
+
+struct syncobj_get_wait_cb {
+	struct drm_syncobj_cb base;
+	struct task_struct *task;
+	struct dma_fence **fence;
+};
+
+static void syncobj_get_wait_cb_func(struct drm_syncobj *syncobj,
+				     struct drm_syncobj_cb *wait_cb)
+{
+	struct syncobj_get_wait_cb *wait =
+		container_of(wait_cb, struct syncobj_get_wait_cb, base);
+
+	/* This happens inside the syncobj lock */
+	*wait->fence = dma_fence_get(syncobj->fence);
+	wake_up_process(wait->task);
+}
+
+static signed long
+drm_syncobj_get_fence_wait_timeout(struct drm_syncobj *syncobj,
+				   struct dma_fence **fence,
+				   signed long timeout)
+{
+	struct syncobj_get_wait_cb cb;
+	signed long ret;
+
+	if (timeout == 0) {
+		*fence = drm_syncobj_fence_get(syncobj);
+		if (*fence)
+			return 1;
+		return 0;
+	}
+
+	cb.task = current;
+	cb.fence = fence;
+
+	if (drm_syncobj_fence_get_or_add_callback(syncobj, fence, &cb.base,
+						  syncobj_get_wait_cb_func)) {
+		/* We got the fence immediately. No callback was installed. */
+		return timeout;
+	}
+
+	ret = timeout;
+	while (ret > 0) {
+		set_current_state(TASK_INTERRUPTIBLE);
+
+		if (*fence)
+			break;
+
+		ret = schedule_timeout(ret);
+
+		if (ret > 0 && signal_pending(current))
+			ret = -ERESTARTSYS;
+	}
+
+	__set_current_state(TASK_RUNNING);
+
+	drm_syncobj_remove_callback(syncobj, &cb.base);
+
+	/* It's possible that we got the fence after timing out or getting
+	 * interrupted but before removing the callback.
+	 */
+	if (ret <= 0 && *fence)
+	    ret = 1;
+
+	return ret;
+}
+
+static signed long drm_syncobj_wait_timeout(struct drm_syncobj *syncobj,
+					    uint32_t flags,
+					    signed long timeout)
+{
+	struct dma_fence *fence;
+	signed long ret;
+
+	if (timeout == 0)
+		return drm_syncobj_signaled(syncobj, flags);
+
+	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+		ret = drm_syncobj_get_fence_wait_timeout(syncobj, &fence,
+							 timeout);
+		if (ret <= 0)
+			return ret;
+		timeout = ret;
+	} else {
+		fence = drm_syncobj_fence_get(syncobj);
+	}
+
+	if (!fence)
+		return -EINVAL;
+
+	ret = dma_fence_wait_timeout(fence, true, timeout);
+
+	/* Various dma_fence wait callbacks will return ENOENT to indicate
+	 * that the fence has already been signaled.  We sanitize this as
+	 * returning with all of the time left.
+	 */
+	if (ret == -ENOENT)
+		ret = timeout;
+
+	dma_fence_put(fence);
+
+	return ret;
+}
+
+struct syncobj_wait_any_entry {
+	struct task_struct *task;
+	struct dma_fence *fence;
+	struct dma_fence_cb fence_cb;
+	struct drm_syncobj_cb syncobj_cb;
+};
+
+static void syncobj_wait_any_fence_func(struct dma_fence *fence,
+					struct dma_fence_cb *cb)
+{
+	struct syncobj_wait_any_entry *wait =
+		container_of(cb, struct syncobj_wait_any_entry, fence_cb);
+
+	wake_up_process(wait->task);
+}
+
+static void syncobj_wait_any_syncobj_func(struct drm_syncobj *syncobj,
+					  struct drm_syncobj_cb *cb)
+{
+	struct syncobj_wait_any_entry *wait =
+		container_of(cb, struct syncobj_wait_any_entry, syncobj_cb);
+
+	/* This happens inside the syncobj lock */
+	wait->fence = dma_fence_get(syncobj->fence);
+	wake_up_process(wait->task);
+}
+
+static signed long drm_syncobj_wait_any_timeout(struct drm_syncobj **syncobjs,
+						uint32_t count,
+						uint32_t flags,
+						signed long timeout,
+						uint32_t *idx)
+{
+	struct syncobj_wait_any_entry *entries;
+	struct dma_fence *fence;
+	signed long ret;
+	uint32_t i;
+
+	if (timeout == 0) {
+		for (i = 0; i < count; ++i) {
+			ret = drm_syncobj_signaled(syncobjs[i], flags);
+			if (ret < 0)
+				return ret;
+			if (ret == 0)
+				continue;
+			if (idx)
+				*idx = i;
+			return 1;
+		}
+
+		return 0;
+	}
+
+	entries = kcalloc(count, sizeof(*entries), GFP_KERNEL);
+	if (!entries)
+		return -ENOMEM;
+
+	for (i = 0; i < count; ++i) {
+		entries[i].task = current;
+
+		if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+			drm_syncobj_fence_get_or_add_callback(syncobjs[i],
+							      &entries[i].fence,
+							      &entries[i].syncobj_cb,
+							      syncobj_wait_any_syncobj_func);
+		} else {
+			entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
+			if (!entries[i].fence) {
+				ret = -EINVAL;
+				goto err_cleanup_entries;
+			}
+		}
+	}
+
+	ret = timeout;
+	while (ret > 0) {
+		for (i = 0; i < count; ++i) {
+			fence = entries[i].fence;
+			if (!fence)
+				continue;
+
+			if (dma_fence_is_signaled(fence)) {
+				if (idx)
+					*idx = i;
+				goto done_waiting;
+			}
+
+			if (!entries[i].fence_cb.func &&
+			    dma_fence_add_callback(fence,
+						   &entries[i].fence_cb,
+						   syncobj_wait_any_fence_func)) {
+				/* The fence is already signaled */
+				if (idx)
+					*idx = i;
+				goto done_waiting;
+			}
+		}
+
+		set_current_state(TASK_INTERRUPTIBLE);
+
+		ret = schedule_timeout(ret);
+
+		if (ret > 0 && signal_pending(current))
+			ret = -ERESTARTSYS;
+	}
+
+done_waiting:
+	__set_current_state(TASK_RUNNING);
+
+err_cleanup_entries:
+	for (i = 0; i < count; ++i) {
+		if (entries[i].syncobj_cb.func)
+			drm_syncobj_remove_callback(syncobjs[i],
+						    &entries[i].syncobj_cb);
+		if (entries[i].fence_cb.func)
+			dma_fence_remove_callback(entries[i].fence,
+						  &entries[i].fence_cb);
+		dma_fence_put(entries[i].fence);
+	}
+	kfree(entries);
+
+	return ret;
+}
+
 /**
  * drm_timeout_abs_to_jiffies - calculate jiffies timeout from absolute value
  *
@@ -561,7 +832,7 @@ static signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec)
 static int drm_syncobj_wait_fences(struct drm_device *dev,
 				   struct drm_file *file_private,
 				   struct drm_syncobj_wait *wait,
-				   struct dma_fence **fences)
+				   struct drm_syncobj **syncobjs)
 {
 	signed long timeout = drm_timeout_abs_to_jiffies(wait->timeout_nsec);
 	signed long ret = 0;
@@ -570,17 +841,9 @@ static int drm_syncobj_wait_fences(struct drm_device *dev,
 	if (wait->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
 		uint32_t i;
 		for (i = 0; i < wait->count_handles; i++) {
-			ret = dma_fence_wait_timeout(fences[i], true, timeout);
-
-			/* Various dma_fence wait callbacks will return
-			 * ENOENT to indicate that the fence has already
-			 * been signaled.  We need to sanitize this to 0 so
-			 * we don't return early and the client doesn't see
-			 * an unexpected error.
-			 */
-			if (ret == -ENOENT)
-				ret = 0;
-
+			ret = drm_syncobj_wait_timeout(syncobjs[i],
+						       wait->flags,
+						       timeout);
 			if (ret < 0)
 				return ret;
 			if (ret == 0)
@@ -589,10 +852,10 @@ static int drm_syncobj_wait_fences(struct drm_device *dev,
 		}
 		first = 0;
 	} else {
-		ret = dma_fence_wait_any_timeout(fences,
-						 wait->count_handles,
-						 true, timeout,
-						 &first);
+		ret = drm_syncobj_wait_any_timeout(syncobjs,
+						   wait->count_handles,
+						   wait->flags,
+						   timeout, &first);
 	}
 
 	if (ret < 0)
@@ -610,14 +873,15 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_syncobj_wait *args = data;
 	uint32_t *handles;
-	struct dma_fence **fences;
+	struct drm_syncobj **syncobjs;
 	int ret = 0;
 	uint32_t i;
 
 	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
 		return -ENODEV;
 
-	if (args->flags != 0 && args->flags != DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
+	if (args->flags & ~(DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL |
+			    DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT))
 		return -EINVAL;
 
 	if (args->count_handles == 0)
@@ -636,27 +900,28 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 		goto err_free_handles;
 	}
 
-	fences = kcalloc(args->count_handles,
-			 sizeof(struct dma_fence *), GFP_KERNEL);
-	if (!fences) {
+	syncobjs = kcalloc(args->count_handles,
+			   sizeof(struct drm_syncobj *), GFP_KERNEL);
+	if (!syncobjs) {
 		ret = -ENOMEM;
 		goto err_free_handles;
 	}
 
 	for (i = 0; i < args->count_handles; i++) {
-		ret = drm_syncobj_find_fence(file_private, handles[i],
-					     &fences[i]);
-		if (ret)
+		syncobjs[i] = drm_syncobj_find(file_private, handles[i]);
+		if (!syncobjs[i]) {
+			ret = -ENOENT;
 			goto err_free_fence_array;
+		}
 	}
 
 	ret = drm_syncobj_wait_fences(dev, file_private,
-				      args, fences);
+				      args, syncobjs);
 
 err_free_fence_array:
 	for (i = 0; i < args->count_handles; i++)
-		dma_fence_put(fences[i]);
-	kfree(fences);
+		drm_syncobj_put(syncobjs[i]);
+	kfree(syncobjs);
 err_free_handles:
 	kfree(handles);
 
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 4b301b4..f8ec8fe 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -719,6 +719,7 @@ struct drm_syncobj_handle {
 };
 
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
 struct drm_syncobj_wait {
 	__u64 handles;
 	/* absolute timeout */
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-08 22:46 ` [PATCH 9/9] drm/syncobj: Allow wait for submit and signal behavior Jason Ekstrand
@ 2017-08-09 17:00   ` Jason Ekstrand
  2017-08-09 17:57     ` Chris Wilson
                       ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-09 17:00 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Dave Airlie, Jason Ekstrand

Vulkan VkFence semantics require that the application be able to perform
a CPU wait on work which may not yet have been submitted.  This is
perfectly safe because the CPU wait has a timeout which will get
triggered eventually if no work is ever submitted.  This behavior is
advantageous for multi-threaded workloads because, so long as all of the
threads agree on what fences to use up-front, you don't have the extra
cross-thread synchronization cost of thread A telling thread B that it
has submitted its dependent work and thread B is now free to wait.

Within a single process, this can be implemented in the userspace driver
by doing exactly the same kind of tracking the app would have to do
using posix condition variables or similar.  However, in order for this
to work cross-process (as is required by VK_KHR_external_fence), we need
to handle this in the kernel.

This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
instructs the IOCTL to wait for the syncobj to have a non-null fence and
then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
easily get the Vulkan behavior.

v2:
 - Fix a bug in the invalid syncobj error path
 - Unify the wait-all and wait-any cases

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@redhat.com>
---

I realized today (this is what happens when you sleep) that it takes almost
no work to make the wait-any path also handle wait-all.  By unifying the
two, I deleted over a hundred lines of code from the implementation.

 drivers/gpu/drm/drm_syncobj.c | 243 ++++++++++++++++++++++++++++++++++--------
 include/uapi/drm/drm.h        |   1 +
 2 files changed, 199 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 510dfc2..5e7f654 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -51,6 +51,7 @@
 #include <linux/fs.h>
 #include <linux/anon_inodes.h>
 #include <linux/sync_file.h>
+#include <linux/sched/signal.h>
 
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
@@ -104,6 +105,27 @@ static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
 	list_add_tail(&cb->node, &syncobj->cb_list);
 }
 
+static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
+						 struct dma_fence **fence,
+						 struct drm_syncobj_cb *cb,
+						 drm_syncobj_func_t func)
+{
+	int ret;
+
+	spin_lock(&syncobj->lock);
+	if (syncobj->fence) {
+		*fence = dma_fence_get(syncobj->fence);
+		ret = 1;
+	} else {
+		*fence = NULL;
+		drm_syncobj_add_callback_locked(syncobj, cb, func);
+		ret = 0;
+	}
+	spin_unlock(&syncobj->lock);
+
+	return ret;
+}
+
 /**
  * drm_syncobj_add_callback - adds a callback to syncobj::cb_list
  * @syncobj: Sync object to which to add the callback
@@ -526,6 +548,159 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 					&args->handle);
 }
 
+static int drm_syncobj_signaled(struct drm_syncobj *syncobj,
+				uint32_t wait_flags)
+{
+	struct dma_fence *fence;
+	int ret;
+
+	fence = drm_syncobj_fence_get(syncobj);
+	if (!fence) {
+		if (wait_flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
+			return 0;
+		else
+			return -EINVAL;
+	}
+
+	ret = dma_fence_is_signaled(fence) ? 1 : 0;
+
+	dma_fence_put(fence);
+
+	return ret;
+}
+
+struct syncobj_wait_entry {
+	struct task_struct *task;
+	struct dma_fence *fence;
+	struct dma_fence_cb fence_cb;
+	struct drm_syncobj_cb syncobj_cb;
+};
+
+static void syncobj_wait_fence_func(struct dma_fence *fence,
+				    struct dma_fence_cb *cb)
+{
+	struct syncobj_wait_entry *wait =
+		container_of(cb, struct syncobj_wait_entry, fence_cb);
+
+	wake_up_process(wait->task);
+}
+
+static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
+				      struct drm_syncobj_cb *cb)
+{
+	struct syncobj_wait_entry *wait =
+		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
+
+	/* This happens inside the syncobj lock */
+	wait->fence = dma_fence_get(syncobj->fence);
+	wake_up_process(wait->task);
+}
+
+static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
+						  uint32_t count,
+						  uint32_t flags,
+						  signed long timeout,
+						  uint32_t *idx)
+{
+	struct syncobj_wait_entry *entries;
+	struct dma_fence *fence;
+	signed long ret;
+	uint32_t signaled_count, i;
+
+	if (timeout == 0) {
+		signaled_count = 0;
+		for (i = 0; i < count; ++i) {
+			ret = drm_syncobj_signaled(syncobjs[i], flags);
+			if (ret < 0)
+				return ret;
+			if (ret == 0)
+				continue;
+			if (signaled_count == 0 && idx)
+				*idx = i;
+			signaled_count++;
+		}
+
+		if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
+			return signaled_count == count ? 1 : 0;
+		else
+			return signaled_count > 0 ? 1 : 0;
+	}
+
+	entries = kcalloc(count, sizeof(*entries), GFP_KERNEL);
+	if (!entries)
+		return -ENOMEM;
+
+	for (i = 0; i < count; ++i) {
+		entries[i].task = current;
+
+		if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+			drm_syncobj_fence_get_or_add_callback(syncobjs[i],
+							      &entries[i].fence,
+							      &entries[i].syncobj_cb,
+							      syncobj_wait_syncobj_func);
+		} else {
+			entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
+			if (!entries[i].fence) {
+				ret = -EINVAL;
+				goto err_cleanup_entries;
+			}
+		}
+	}
+
+	ret = timeout;
+	while (ret > 0) {
+		signaled_count = 0;
+		for (i = 0; i < count; ++i) {
+			fence = entries[i].fence;
+			if (!fence)
+				continue;
+
+			if (dma_fence_is_signaled(fence) ||
+			    (!entries[i].fence_cb.func &&
+			     dma_fence_add_callback(fence,
+						    &entries[i].fence_cb,
+						    syncobj_wait_fence_func))) {
+				/* The fence has been signaled */
+				if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
+					signaled_count++;
+				} else {
+					if (idx)
+						*idx = i;
+					goto done_waiting;
+				}
+			}
+		}
+
+		if ((flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) &&
+		    signaled_count == count)
+			goto done_waiting;
+
+		set_current_state(TASK_INTERRUPTIBLE);
+
+		ret = schedule_timeout(ret);
+
+		if (ret > 0 && signal_pending(current))
+			ret = -ERESTARTSYS;
+	}
+
+done_waiting:
+	__set_current_state(TASK_RUNNING);
+
+err_cleanup_entries:
+	for (i = 0; i < count; ++i) {
+		if (entries[i].syncobj_cb.func)
+			drm_syncobj_remove_callback(syncobjs[i],
+						    &entries[i].syncobj_cb);
+		if (entries[i].fence_cb.func)
+			dma_fence_remove_callback(entries[i].fence,
+						  &entries[i].fence_cb);
+		dma_fence_put(entries[i].fence);
+	}
+	kfree(entries);
+
+	return ret;
+}
+
 /**
  * drm_timeout_abs_to_jiffies - calculate jiffies timeout from absolute value
  *
@@ -558,43 +733,19 @@ static signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec)
 	return timeout_jiffies64 + 1;
 }
 
-static int drm_syncobj_wait_fences(struct drm_device *dev,
-				   struct drm_file *file_private,
-				   struct drm_syncobj_wait *wait,
-				   struct dma_fence **fences)
+static int drm_syncobj_array_wait(struct drm_device *dev,
+				  struct drm_file *file_private,
+				  struct drm_syncobj_wait *wait,
+				  struct drm_syncobj **syncobjs)
 {
 	signed long timeout = drm_timeout_abs_to_jiffies(wait->timeout_nsec);
 	signed long ret = 0;
 	uint32_t first = ~0;
 
-	if (wait->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
-		uint32_t i;
-		for (i = 0; i < wait->count_handles; i++) {
-			ret = dma_fence_wait_timeout(fences[i], true, timeout);
-
-			/* Various dma_fence wait callbacks will return
-			 * ENOENT to indicate that the fence has already
-			 * been signaled.  We need to sanitize this to 0 so
-			 * we don't return early and the client doesn't see
-			 * an unexpected error.
-			 */
-			if (ret == -ENOENT)
-				ret = 0;
-
-			if (ret < 0)
-				return ret;
-			if (ret == 0)
-				break;
-			timeout = ret;
-		}
-		first = 0;
-	} else {
-		ret = dma_fence_wait_any_timeout(fences,
-						 wait->count_handles,
-						 true, timeout,
-						 &first);
-	}
-
+	ret = drm_syncobj_array_wait_timeout(syncobjs,
+					     wait->count_handles,
+					     wait->flags,
+					     timeout, &first);
 	if (ret < 0)
 		return ret;
 
@@ -610,14 +761,15 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_syncobj_wait *args = data;
 	uint32_t *handles;
-	struct dma_fence **fences;
+	struct drm_syncobj **syncobjs;
 	int ret = 0;
 	uint32_t i;
 
 	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
 		return -ENODEV;
 
-	if (args->flags != 0 && args->flags != DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
+	if (args->flags & ~(DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL |
+			    DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT))
 		return -EINVAL;
 
 	if (args->count_handles == 0)
@@ -636,27 +788,28 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 		goto err_free_handles;
 	}
 
-	fences = kcalloc(args->count_handles,
-			 sizeof(struct dma_fence *), GFP_KERNEL);
-	if (!fences) {
+	syncobjs = kcalloc(args->count_handles,
+			   sizeof(struct drm_syncobj *), GFP_KERNEL);
+	if (!syncobjs) {
 		ret = -ENOMEM;
 		goto err_free_handles;
 	}
 
 	for (i = 0; i < args->count_handles; i++) {
-		ret = drm_syncobj_find_fence(file_private, handles[i],
-					     &fences[i]);
-		if (ret)
+		syncobjs[i] = drm_syncobj_find(file_private, handles[i]);
+		if (!syncobjs[i]) {
+			ret = -ENOENT;
 			goto err_free_fence_array;
+		}
 	}
 
-	ret = drm_syncobj_wait_fences(dev, file_private,
-				      args, fences);
+	ret = drm_syncobj_array_wait(dev, file_private,
+				     args, syncobjs);
 
 err_free_fence_array:
-	for (i = 0; i < args->count_handles; i++)
-		dma_fence_put(fences[i]);
-	kfree(fences);
+	while (i-- > 0)
+		drm_syncobj_put(syncobjs[i]);
+	kfree(syncobjs);
 err_free_handles:
 	kfree(handles);
 
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 4b301b4..f8ec8fe 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -719,6 +719,7 @@ struct drm_syncobj_handle {
 };
 
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
 struct drm_syncobj_wait {
 	__u64 handles;
 	/* absolute timeout */
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 17:00   ` [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2) Jason Ekstrand
@ 2017-08-09 17:57     ` Chris Wilson
  2017-08-09 18:25       ` Christian König
  2017-08-09 22:41       ` Chris Wilson
  2017-08-09 21:31     ` Chris Wilson
  2017-08-10  1:35     ` [PATCH v3 9/9] drm/syncobj: Allow wait for submit and signal behavior (v3) Jason Ekstrand
  2 siblings, 2 replies; 55+ messages in thread
From: Chris Wilson @ 2017-08-09 17:57 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Dave Airlie, Jason Ekstrand

Quoting Jason Ekstrand (2017-08-09 18:00:54)
> Vulkan VkFence semantics require that the application be able to perform
> a CPU wait on work which may not yet have been submitted.  This is
> perfectly safe because the CPU wait has a timeout which will get
> triggered eventually if no work is ever submitted.  This behavior is
> advantageous for multi-threaded workloads because, so long as all of the
> threads agree on what fences to use up-front, you don't have the extra
> cross-thread synchronization cost of thread A telling thread B that it
> has submitted its dependent work and thread B is now free to wait.
> 
> Within a single process, this can be implemented in the userspace driver
> by doing exactly the same kind of tracking the app would have to do
> using posix condition variables or similar.  However, in order for this
> to work cross-process (as is required by VK_KHR_external_fence), we need
> to handle this in the kernel.
> 
> This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
> instructs the IOCTL to wait for the syncobj to have a non-null fence and
> then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
> easily get the Vulkan behavior.
> 
> v2:
>  - Fix a bug in the invalid syncobj error path
>  - Unify the wait-all and wait-any cases
> 
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Dave Airlie <airlied@redhat.com>
> ---
> 
> I realized today (this is what happens when you sleep) that it takes almost
> no work to make the wait-any path also handle wait-all.  By unifying the
> two, I deleted over a hundred lines of code from the implementation.
> 
>  drivers/gpu/drm/drm_syncobj.c | 243 ++++++++++++++++++++++++++++++++++--------
>  include/uapi/drm/drm.h        |   1 +
>  2 files changed, 199 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index 510dfc2..5e7f654 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -51,6 +51,7 @@
>  #include <linux/fs.h>
>  #include <linux/anon_inodes.h>
>  #include <linux/sync_file.h>
> +#include <linux/sched/signal.h>
>  
>  #include "drm_internal.h"
>  #include <drm/drm_syncobj.h>
> @@ -104,6 +105,27 @@ static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,

This is a bit of the puzzle that is missing.

>         list_add_tail(&cb->node, &syncobj->cb_list);
>  }
>  
> +static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
> +                                                struct dma_fence **fence,
> +                                                struct drm_syncobj_cb *cb,
> +                                                drm_syncobj_func_t func)
> +{
> +       int ret;
> +
> +       spin_lock(&syncobj->lock);
> +       if (syncobj->fence) {
> +               *fence = dma_fence_get(syncobj->fence);
> +               ret = 1;
> +       } else {
> +               *fence = NULL;
> +               drm_syncobj_add_callback_locked(syncobj, cb, func);
> +               ret = 0;
> +       }
> +       spin_unlock(&syncobj->lock);
> +
> +       return ret;
> +}
> +
>  /**
>   * drm_syncobj_add_callback - adds a callback to syncobj::cb_list
>   * @syncobj: Sync object to which to add the callback
> @@ -526,6 +548,159 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
>                                         &args->handle);
>  }
>  
> +static int drm_syncobj_signaled(struct drm_syncobj *syncobj,
> +                               uint32_t wait_flags)
> +{
> +       struct dma_fence *fence;
> +       int ret;
> +
> +       fence = drm_syncobj_fence_get(syncobj);
> +       if (!fence) {
> +               if (wait_flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
> +                       return 0;
> +               else
> +                       return -EINVAL;
> +       }
> +
> +       ret = dma_fence_is_signaled(fence) ? 1 : 0;
> +
> +       dma_fence_put(fence);
> +
> +       return ret;
> +}
> +
> +struct syncobj_wait_entry {
> +       struct task_struct *task;
> +       struct dma_fence *fence;
> +       struct dma_fence_cb fence_cb;
> +       struct drm_syncobj_cb syncobj_cb;
> +};
> +
> +static void syncobj_wait_fence_func(struct dma_fence *fence,
> +                                   struct dma_fence_cb *cb)
> +{
> +       struct syncobj_wait_entry *wait =
> +               container_of(cb, struct syncobj_wait_entry, fence_cb);
> +
> +       wake_up_process(wait->task);
> +}
> +
> +static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
> +                                     struct drm_syncobj_cb *cb)
> +{
> +       struct syncobj_wait_entry *wait =
> +               container_of(cb, struct syncobj_wait_entry, syncobj_cb);
> +
> +       /* This happens inside the syncobj lock */
> +       wait->fence = dma_fence_get(syncobj->fence);
> +       wake_up_process(wait->task);
> +}
> +
> +static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
> +                                                 uint32_t count,
> +                                                 uint32_t flags,
> +                                                 signed long timeout,
> +                                                 uint32_t *idx)
> +{
> +       struct syncobj_wait_entry *entries;
> +       struct dma_fence *fence;
> +       signed long ret;
> +       uint32_t signaled_count, i;
> +
> +       if (timeout == 0) {
> +               signaled_count = 0;
> +               for (i = 0; i < count; ++i) {
> +                       ret = drm_syncobj_signaled(syncobjs[i], flags);
> +                       if (ret < 0)
> +                               return ret;
> +                       if (ret == 0)
> +                               continue;
> +                       if (signaled_count == 0 && idx)
> +                               *idx = i;
> +                       signaled_count++;
> +               }
> +
> +               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
> +                       return signaled_count == count ? 1 : 0;
> +               else
> +                       return signaled_count > 0 ? 1 : 0;
> +       }
> +
> +       entries = kcalloc(count, sizeof(*entries), GFP_KERNEL);
> +       if (!entries)
> +               return -ENOMEM;
> +
> +       for (i = 0; i < count; ++i) {
> +               entries[i].task = current;
> +
> +               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
> +                       drm_syncobj_fence_get_or_add_callback(syncobjs[i],
> +                                                             &entries[i].fence,
> +                                                             &entries[i].syncobj_cb,
> +                                                             syncobj_wait_syncobj_func);
> +               } else {
> +                       entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
> +                       if (!entries[i].fence) {
> +                               ret = -EINVAL;
> +                               goto err_cleanup_entries;
> +                       }

So we are taking a snapshot here. It looks like this could have been
done using a dma_fence_array + dma_fence_proxy for capturing the future
fence.

> +       ret = timeout;
> +       while (ret > 0) {
> +               signaled_count = 0;
> +               for (i = 0; i < count; ++i) {
> +                       fence = entries[i].fence;
> +                       if (!fence)
> +                               continue;
> +
> +                       if (dma_fence_is_signaled(fence) ||
> +                           (!entries[i].fence_cb.func &&
> +                            dma_fence_add_callback(fence,
> +                                                   &entries[i].fence_cb,
> +                                                   syncobj_wait_fence_func))) {
> +                               /* The fence has been signaled */
> +                               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
> +                                       signaled_count++;
> +                               } else {
> +                                       if (idx)
> +                                               *idx = i;
> +                                       goto done_waiting;
> +                               }
> +                       }
> +               }
> +
> +               if ((flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) &&
> +                   signaled_count == count)
> +                       goto done_waiting;
> +
> +               set_current_state(TASK_INTERRUPTIBLE);

This is the first step in the wait-loop, you set TASK_INTERRUPTIBLE
before doing any checks. So that if any state changes whilst you are in
the middle of those checks, the schedule_timeout is a nop and you can
check again.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 17:57     ` Chris Wilson
@ 2017-08-09 18:25       ` Christian König
  2017-08-09 21:09         ` Jason Ekstrand
  2017-08-09 22:41       ` Chris Wilson
  1 sibling, 1 reply; 55+ messages in thread
From: Christian König @ 2017-08-09 18:25 UTC (permalink / raw)
  To: Chris Wilson, Jason Ekstrand, dri-devel; +Cc: Jason Ekstrand, Dave Airlie

Am 09.08.2017 um 19:57 schrieb Chris Wilson:
> Quoting Jason Ekstrand (2017-08-09 18:00:54)
>> Vulkan VkFence semantics require that the application be able to perform
>> a CPU wait on work which may not yet have been submitted.  This is
>> perfectly safe because the CPU wait has a timeout which will get
>> triggered eventually if no work is ever submitted.  This behavior is
>> advantageous for multi-threaded workloads because, so long as all of the
>> threads agree on what fences to use up-front, you don't have the extra
>> cross-thread synchronization cost of thread A telling thread B that it
>> has submitted its dependent work and thread B is now free to wait.
>>
>> Within a single process, this can be implemented in the userspace driver
>> by doing exactly the same kind of tracking the app would have to do
>> using posix condition variables or similar.  However, in order for this
>> to work cross-process (as is required by VK_KHR_external_fence), we need
>> to handle this in the kernel.
>>
>> This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
>> instructs the IOCTL to wait for the syncobj to have a non-null fence and
>> then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
>> easily get the Vulkan behavior.
>>
>> v2:
>>   - Fix a bug in the invalid syncobj error path
>>   - Unify the wait-all and wait-any cases
>>
>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>> Cc: Dave Airlie <airlied@redhat.com>
>> ---
>>
>> I realized today (this is what happens when you sleep) that it takes almost
>> no work to make the wait-any path also handle wait-all.  By unifying the
>> two, I deleted over a hundred lines of code from the implementation.
>>
>>   drivers/gpu/drm/drm_syncobj.c | 243 ++++++++++++++++++++++++++++++++++--------
>>   include/uapi/drm/drm.h        |   1 +
>>   2 files changed, 199 insertions(+), 45 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
>> index 510dfc2..5e7f654 100644
>> --- a/drivers/gpu/drm/drm_syncobj.c
>> +++ b/drivers/gpu/drm/drm_syncobj.c
>> @@ -51,6 +51,7 @@
>>   #include <linux/fs.h>
>>   #include <linux/anon_inodes.h>
>>   #include <linux/sync_file.h>
>> +#include <linux/sched/signal.h>
>>   
>>   #include "drm_internal.h"
>>   #include <drm/drm_syncobj.h>
>> @@ -104,6 +105,27 @@ static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
> This is a bit of the puzzle that is missing.
>
>>          list_add_tail(&cb->node, &syncobj->cb_list);
>>   }
>>   
>> +static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>> +                                                struct dma_fence **fence,
>> +                                                struct drm_syncobj_cb *cb,
>> +                                                drm_syncobj_func_t func)
>> +{
>> +       int ret;
>> +
>> +       spin_lock(&syncobj->lock);
>> +       if (syncobj->fence) {
>> +               *fence = dma_fence_get(syncobj->fence);
>> +               ret = 1;
>> +       } else {
>> +               *fence = NULL;
>> +               drm_syncobj_add_callback_locked(syncobj, cb, func);
>> +               ret = 0;
>> +       }
>> +       spin_unlock(&syncobj->lock);
>> +
>> +       return ret;
>> +}
>> +
>>   /**
>>    * drm_syncobj_add_callback - adds a callback to syncobj::cb_list
>>    * @syncobj: Sync object to which to add the callback
>> @@ -526,6 +548,159 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
>>                                          &args->handle);
>>   }
>>   
>> +static int drm_syncobj_signaled(struct drm_syncobj *syncobj,
>> +                               uint32_t wait_flags)
>> +{
>> +       struct dma_fence *fence;
>> +       int ret;
>> +
>> +       fence = drm_syncobj_fence_get(syncobj);
>> +       if (!fence) {
>> +               if (wait_flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
>> +                       return 0;
>> +               else
>> +                       return -EINVAL;
>> +       }
>> +
>> +       ret = dma_fence_is_signaled(fence) ? 1 : 0;
>> +
>> +       dma_fence_put(fence);
>> +
>> +       return ret;
>> +}
>> +
>> +struct syncobj_wait_entry {
>> +       struct task_struct *task;
>> +       struct dma_fence *fence;
>> +       struct dma_fence_cb fence_cb;
>> +       struct drm_syncobj_cb syncobj_cb;
>> +};
>> +
>> +static void syncobj_wait_fence_func(struct dma_fence *fence,
>> +                                   struct dma_fence_cb *cb)
>> +{
>> +       struct syncobj_wait_entry *wait =
>> +               container_of(cb, struct syncobj_wait_entry, fence_cb);
>> +
>> +       wake_up_process(wait->task);
>> +}
>> +
>> +static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
>> +                                     struct drm_syncobj_cb *cb)
>> +{
>> +       struct syncobj_wait_entry *wait =
>> +               container_of(cb, struct syncobj_wait_entry, syncobj_cb);
>> +
>> +       /* This happens inside the syncobj lock */
>> +       wait->fence = dma_fence_get(syncobj->fence);
>> +       wake_up_process(wait->task);
>> +}
>> +
>> +static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>> +                                                 uint32_t count,
>> +                                                 uint32_t flags,
>> +                                                 signed long timeout,
>> +                                                 uint32_t *idx)
>> +{
>> +       struct syncobj_wait_entry *entries;
>> +       struct dma_fence *fence;
>> +       signed long ret;
>> +       uint32_t signaled_count, i;
>> +
>> +       if (timeout == 0) {
>> +               signaled_count = 0;
>> +               for (i = 0; i < count; ++i) {
>> +                       ret = drm_syncobj_signaled(syncobjs[i], flags);
>> +                       if (ret < 0)
>> +                               return ret;
>> +                       if (ret == 0)
>> +                               continue;
>> +                       if (signaled_count == 0 && idx)
>> +                               *idx = i;
>> +                       signaled_count++;
>> +               }
>> +
>> +               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
>> +                       return signaled_count == count ? 1 : 0;
>> +               else
>> +                       return signaled_count > 0 ? 1 : 0;
>> +       }
>> +
>> +       entries = kcalloc(count, sizeof(*entries), GFP_KERNEL);
>> +       if (!entries)
>> +               return -ENOMEM;
>> +
>> +       for (i = 0; i < count; ++i) {
>> +               entries[i].task = current;
>> +
>> +               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>> +                       drm_syncobj_fence_get_or_add_callback(syncobjs[i],
>> +                                                             &entries[i].fence,
>> +                                                             &entries[i].syncobj_cb,
>> +                                                             syncobj_wait_syncobj_func);
>> +               } else {
>> +                       entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
>> +                       if (!entries[i].fence) {
>> +                               ret = -EINVAL;
>> +                               goto err_cleanup_entries;
>> +                       }
> So we are taking a snapshot here. It looks like this could have been
> done using a dma_fence_array + dma_fence_proxy for capturing the future
> fence.
>
>> +       ret = timeout;
>> +       while (ret > 0) {
>> +               signaled_count = 0;
>> +               for (i = 0; i < count; ++i) {
>> +                       fence = entries[i].fence;
>> +                       if (!fence)
>> +                               continue;
>> +
>> +                       if (dma_fence_is_signaled(fence) ||
>> +                           (!entries[i].fence_cb.func &&
>> +                            dma_fence_add_callback(fence,
>> +                                                   &entries[i].fence_cb,
>> +                                                   syncobj_wait_fence_func))) {
>> +                               /* The fence has been signaled */
>> +                               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
>> +                                       signaled_count++;
>> +                               } else {
>> +                                       if (idx)
>> +                                               *idx = i;
>> +                                       goto done_waiting;
>> +                               }
>> +                       }
>> +               }
>> +
>> +               if ((flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) &&
>> +                   signaled_count == count)
>> +                       goto done_waiting;
>> +
>> +               set_current_state(TASK_INTERRUPTIBLE);
> This is the first step in the wait-loop, you set TASK_INTERRUPTIBLE
> before doing any checks. So that if any state changes whilst you are in
> the middle of those checks, the schedule_timeout is a nop and you can
> check again.

Yeah, completely agree.

I would rather drop the approach with the custom wait and try to use 
wait_event_interruptible here.

As far as I can see that should make the whole patch much cleaner in 
general.

Regards,
Christian.

> -Chris
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 18:25       ` Christian König
@ 2017-08-09 21:09         ` Jason Ekstrand
  0 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-09 21:09 UTC (permalink / raw)
  To: Christian König
  Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 9934 bytes --]

On Wed, Aug 9, 2017 at 11:25 AM, Christian König <deathsimple@vodafone.de>
wrote:

> Am 09.08.2017 um 19:57 schrieb Chris Wilson:
>
>> Quoting Jason Ekstrand (2017-08-09 18:00:54)
>>
>>> Vulkan VkFence semantics require that the application be able to perform
>>> a CPU wait on work which may not yet have been submitted.  This is
>>> perfectly safe because the CPU wait has a timeout which will get
>>> triggered eventually if no work is ever submitted.  This behavior is
>>> advantageous for multi-threaded workloads because, so long as all of the
>>> threads agree on what fences to use up-front, you don't have the extra
>>> cross-thread synchronization cost of thread A telling thread B that it
>>> has submitted its dependent work and thread B is now free to wait.
>>>
>>> Within a single process, this can be implemented in the userspace driver
>>> by doing exactly the same kind of tracking the app would have to do
>>> using posix condition variables or similar.  However, in order for this
>>> to work cross-process (as is required by VK_KHR_external_fence), we need
>>> to handle this in the kernel.
>>>
>>> This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
>>> instructs the IOCTL to wait for the syncobj to have a non-null fence and
>>> then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
>>> easily get the Vulkan behavior.
>>>
>>> v2:
>>>   - Fix a bug in the invalid syncobj error path
>>>   - Unify the wait-all and wait-any cases
>>>
>>> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
>>> Cc: Dave Airlie <airlied@redhat.com>
>>> ---
>>>
>>> I realized today (this is what happens when you sleep) that it takes
>>> almost
>>> no work to make the wait-any path also handle wait-all.  By unifying the
>>> two, I deleted over a hundred lines of code from the implementation.
>>>
>>>   drivers/gpu/drm/drm_syncobj.c | 243 ++++++++++++++++++++++++++++++
>>> ++++--------
>>>   include/uapi/drm/drm.h        |   1 +
>>>   2 files changed, 199 insertions(+), 45 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>> b/drivers/gpu/drm/drm_syncobj.c
>>> index 510dfc2..5e7f654 100644
>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>> @@ -51,6 +51,7 @@
>>>   #include <linux/fs.h>
>>>   #include <linux/anon_inodes.h>
>>>   #include <linux/sync_file.h>
>>> +#include <linux/sched/signal.h>
>>>     #include "drm_internal.h"
>>>   #include <drm/drm_syncobj.h>
>>> @@ -104,6 +105,27 @@ static void drm_syncobj_add_callback_locked(struct
>>> drm_syncobj *syncobj,
>>>
>> This is a bit of the puzzle that is missing.
>>
>
?  It's in the previous patch.


>          list_add_tail(&cb->node, &syncobj->cb_list);
>>>   }
>>>   +static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj
>>> *syncobj,
>>> +                                                struct dma_fence
>>> **fence,
>>> +                                                struct drm_syncobj_cb
>>> *cb,
>>> +                                                drm_syncobj_func_t func)
>>> +{
>>> +       int ret;
>>> +
>>> +       spin_lock(&syncobj->lock);
>>> +       if (syncobj->fence) {
>>> +               *fence = dma_fence_get(syncobj->fence);
>>> +               ret = 1;
>>> +       } else {
>>> +               *fence = NULL;
>>> +               drm_syncobj_add_callback_locked(syncobj, cb, func);
>>> +               ret = 0;
>>> +       }
>>> +       spin_unlock(&syncobj->lock);
>>> +
>>> +       return ret;
>>> +}
>>> +
>>>   /**
>>>    * drm_syncobj_add_callback - adds a callback to syncobj::cb_list
>>>    * @syncobj: Sync object to which to add the callback
>>> @@ -526,6 +548,159 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device
>>> *dev, void *data,
>>>                                          &args->handle);
>>>   }
>>>   +static int drm_syncobj_signaled(struct drm_syncobj *syncobj,
>>> +                               uint32_t wait_flags)
>>> +{
>>> +       struct dma_fence *fence;
>>> +       int ret;
>>> +
>>> +       fence = drm_syncobj_fence_get(syncobj);
>>> +       if (!fence) {
>>> +               if (wait_flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
>>> +                       return 0;
>>> +               else
>>> +                       return -EINVAL;
>>> +       }
>>> +
>>> +       ret = dma_fence_is_signaled(fence) ? 1 : 0;
>>> +
>>> +       dma_fence_put(fence);
>>> +
>>> +       return ret;
>>> +}
>>> +
>>> +struct syncobj_wait_entry {
>>> +       struct task_struct *task;
>>> +       struct dma_fence *fence;
>>> +       struct dma_fence_cb fence_cb;
>>> +       struct drm_syncobj_cb syncobj_cb;
>>> +};
>>> +
>>> +static void syncobj_wait_fence_func(struct dma_fence *fence,
>>> +                                   struct dma_fence_cb *cb)
>>> +{
>>> +       struct syncobj_wait_entry *wait =
>>> +               container_of(cb, struct syncobj_wait_entry, fence_cb);
>>> +
>>> +       wake_up_process(wait->task);
>>> +}
>>> +
>>> +static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
>>> +                                     struct drm_syncobj_cb *cb)
>>> +{
>>> +       struct syncobj_wait_entry *wait =
>>> +               container_of(cb, struct syncobj_wait_entry, syncobj_cb);
>>> +
>>> +       /* This happens inside the syncobj lock */
>>> +       wait->fence = dma_fence_get(syncobj->fence);
>>> +       wake_up_process(wait->task);
>>> +}
>>> +
>>> +static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj
>>> **syncobjs,
>>> +                                                 uint32_t count,
>>> +                                                 uint32_t flags,
>>> +                                                 signed long timeout,
>>> +                                                 uint32_t *idx)
>>> +{
>>> +       struct syncobj_wait_entry *entries;
>>> +       struct dma_fence *fence;
>>> +       signed long ret;
>>> +       uint32_t signaled_count, i;
>>> +
>>> +       if (timeout == 0) {
>>> +               signaled_count = 0;
>>> +               for (i = 0; i < count; ++i) {
>>> +                       ret = drm_syncobj_signaled(syncobjs[i], flags);
>>> +                       if (ret < 0)
>>> +                               return ret;
>>> +                       if (ret == 0)
>>> +                               continue;
>>> +                       if (signaled_count == 0 && idx)
>>> +                               *idx = i;
>>> +                       signaled_count++;
>>> +               }
>>> +
>>> +               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
>>> +                       return signaled_count == count ? 1 : 0;
>>> +               else
>>> +                       return signaled_count > 0 ? 1 : 0;
>>> +       }
>>> +
>>> +       entries = kcalloc(count, sizeof(*entries), GFP_KERNEL);
>>> +       if (!entries)
>>> +               return -ENOMEM;
>>> +
>>> +       for (i = 0; i < count; ++i) {
>>> +               entries[i].task = current;
>>> +
>>> +               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>> +                       drm_syncobj_fence_get_or_add_
>>> callback(syncobjs[i],
>>> +
>>>  &entries[i].fence,
>>> +
>>>  &entries[i].syncobj_cb,
>>> +
>>>  syncobj_wait_syncobj_func);
>>> +               } else {
>>> +                       entries[i].fence = drm_syncobj_fence_get(syncobjs
>>> [i]);
>>> +                       if (!entries[i].fence) {
>>> +                               ret = -EINVAL;
>>> +                               goto err_cleanup_entries;
>>> +                       }
>>>
>> So we are taking a snapshot here. It looks like this could have been
>> done using a dma_fence_array + dma_fence_proxy for capturing the future
>> fence.
>>
>
I'm not sure what you mean.  Is that something in the future fence series
that I should be looking at?


> +       ret = timeout;
>>> +       while (ret > 0) {
>>> +               signaled_count = 0;
>>> +               for (i = 0; i < count; ++i) {
>>> +                       fence = entries[i].fence;
>>> +                       if (!fence)
>>> +                               continue;
>>> +
>>> +                       if (dma_fence_is_signaled(fence) ||
>>> +                           (!entries[i].fence_cb.func &&
>>> +                            dma_fence_add_callback(fence,
>>> +                                                   &entries[i].fence_cb,
>>> +
>>>  syncobj_wait_fence_func))) {
>>> +                               /* The fence has been signaled */
>>> +                               if (flags &
>>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
>>> +                                       signaled_count++;
>>> +                               } else {
>>> +                                       if (idx)
>>> +                                               *idx = i;
>>> +                                       goto done_waiting;
>>> +                               }
>>> +                       }
>>> +               }
>>> +
>>> +               if ((flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) &&
>>> +                   signaled_count == count)
>>> +                       goto done_waiting;
>>> +
>>> +               set_current_state(TASK_INTERRUPTIBLE);
>>>
>> This is the first step in the wait-loop, you set TASK_INTERRUPTIBLE
>> before doing any checks. So that if any state changes whilst you are in
>> the middle of those checks, the schedule_timeout is a nop and you can
>> check again.
>>
>
> Yeah, completely agree.
>
> I would rather drop the approach with the custom wait and try to use
> wait_event_interruptible here.
>
> As far as I can see that should make the whole patch much cleaner in
> general.
>

Sounds good to me.

--Jason

[-- Attachment #1.2: Type: text/html, Size: 13224 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 2/9] drm/syncobj: Lock around drm_syncobj::fence
  2017-08-08 22:46 ` [PATCH 2/9] drm/syncobj: Lock around drm_syncobj::fence Jason Ekstrand
@ 2017-08-09 21:21   ` Chris Wilson
  2017-08-10  0:31     ` Jason Ekstrand
  0 siblings, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2017-08-09 21:21 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

Quoting Jason Ekstrand (2017-08-08 23:46:02)
> The atomic exchange operation we were doing before in replace_fence was
> sufficient for the case where it raced with itself.  However, if you
> have a race between a replace_fence and dma_fence_get(syncobj->fence),
> you may end up with the entire replace_fence happening between the point
> in time where the one thread gets the syncobj->fence pointer and when it
> calls dma_fence_get() on it.  If this happens, then the reference may be
> dropped before we get a chance to get a new one.

This doesn't require a spinlock, just dma_fence_get_rcu_safe(). The
argument for keeping this patch lies in the merit of later patches..
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 17:00   ` [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2) Jason Ekstrand
  2017-08-09 17:57     ` Chris Wilson
@ 2017-08-09 21:31     ` Chris Wilson
  2017-08-09 21:54       ` Jason Ekstrand
  2017-08-10 12:26       ` Christian König
  2017-08-10  1:35     ` [PATCH v3 9/9] drm/syncobj: Allow wait for submit and signal behavior (v3) Jason Ekstrand
  2 siblings, 2 replies; 55+ messages in thread
From: Chris Wilson @ 2017-08-09 21:31 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Dave Airlie, Jason Ekstrand

Quoting Jason Ekstrand (2017-08-09 18:00:54)
> +static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
> +                                                 uint32_t count,
> +                                                 uint32_t flags,
> +                                                 signed long timeout,
> +                                                 uint32_t *idx)
> +{
> +       struct syncobj_wait_entry *entries;
> +       struct dma_fence *fence;
> +       signed long ret;
> +       uint32_t signaled_count, i;
> +
> +       if (timeout == 0) {
> +               signaled_count = 0;
> +               for (i = 0; i < count; ++i) {
> +                       ret = drm_syncobj_signaled(syncobjs[i], flags);
> +                       if (ret < 0)
> +                               return ret;
> +                       if (ret == 0)
> +                               continue;
> +                       if (signaled_count == 0 && idx)
> +                               *idx = i;
> +                       signaled_count++;
> +               }
> +
> +               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
> +                       return signaled_count == count ? 1 : 0;
> +               else
> +                       return signaled_count > 0 ? 1 : 0;

There's a very annoying laxness in the dma_fence API here, in that
backends are not required to automatically report when a fence is
signaled prior to fence->ops->enable_signaling() being called.
So here if we fail to match signaled_count, we need to fallthough and
try a 0 timeout wait!

Christian, dma_fence_wait_any_timeout() has this same bug you told me off
for, e.g. commit 698c0f7ff216 ("dma-buf/fence: revert "don't wait when
specified timeout is zero" (v2)")!
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 21:31     ` Chris Wilson
@ 2017-08-09 21:54       ` Jason Ekstrand
  2017-08-10 12:26       ` Christian König
  1 sibling, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-09 21:54 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 2008 bytes --]

On Wed, Aug 9, 2017 at 2:31 PM, Chris Wilson <chris@chris-wilson.co.uk>
wrote:

> Quoting Jason Ekstrand (2017-08-09 18:00:54)
> > +static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj
> **syncobjs,
> > +                                                 uint32_t count,
> > +                                                 uint32_t flags,
> > +                                                 signed long timeout,
> > +                                                 uint32_t *idx)
> > +{
> > +       struct syncobj_wait_entry *entries;
> > +       struct dma_fence *fence;
> > +       signed long ret;
> > +       uint32_t signaled_count, i;
> > +
> > +       if (timeout == 0) {
> > +               signaled_count = 0;
> > +               for (i = 0; i < count; ++i) {
> > +                       ret = drm_syncobj_signaled(syncobjs[i], flags);
> > +                       if (ret < 0)
> > +                               return ret;
> > +                       if (ret == 0)
> > +                               continue;
> > +                       if (signaled_count == 0 && idx)
> > +                               *idx = i;
> > +                       signaled_count++;
> > +               }
> > +
> > +               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
> > +                       return signaled_count == count ? 1 : 0;
> > +               else
> > +                       return signaled_count > 0 ? 1 : 0;
>
> There's a very annoying laxness in the dma_fence API here, in that
> backends are not required to automatically report when a fence is
> signaled prior to fence->ops->enable_signaling() being called.
> So here if we fail to match signaled_count, we need to fallthough and
> try a 0 timeout wait!
>

That is very annoying!  I'll see how bad the fall-through is...


> Christian, dma_fence_wait_any_timeout() has this same bug you told me off
> for, e.g. commit 698c0f7ff216 ("dma-buf/fence: revert "don't wait when
> specified timeout is zero" (v2)")!
> -Chris
>

[-- Attachment #1.2: Type: text/html, Size: 3134 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 17:57     ` Chris Wilson
  2017-08-09 18:25       ` Christian König
@ 2017-08-09 22:41       ` Chris Wilson
  2017-08-09 23:53         ` Jason Ekstrand
  1 sibling, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2017-08-09 22:41 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Dave Airlie, Jason Ekstrand

Quoting Chris Wilson (2017-08-09 18:57:44)
> So we are taking a snapshot here. It looks like this could have been
> done using a dma_fence_array + dma_fence_proxy for capturing the future
> fence.

A quick sketch of this idea looks like:

 void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
                               struct dma_fence *fence)
 {
-       struct dma_fence *old_fence;
+       unsigned long flags;
 
-       if (fence)
-               dma_fence_get(fence);
-       old_fence = xchg(&syncobj->fence, fence);
-
-       dma_fence_put(old_fence);
+       spin_lock_irqsave(&syncobj->lock, flags);
+       dma_fence_replace_proxy(&syncobj->fence, fence);
+       spin_unlock_irqrestore(&syncobj->lock, flags);
 }

+int
+drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
+                      struct drm_file *file_private)
+{
+       struct drm_syncobj_wait *args = data;
+       struct dma_fence **fences;
+       struct dma_fence_array *array;
+       unsigned long timeout;
+       unsigned int count;
+       u32 *handles;
+       int ret = 0;
+       u32 i;
+
+       BUILD_BUG_ON(sizeof(*fences) < sizeof(*handles));
+
+       if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+               return -ENODEV;
+
+       if (args->flags != 0 && args->flags != DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
+               return -EINVAL;
+
+       count = args->count_handles;
+       if (!count)
+               return -EINVAL;
+
+       /* Get the handles from userspace */
+       fences = kmalloc_array(count,
+                              sizeof(struct dma_fence *),
+                              __GFP_NOWARN | GFP_KERNEL);
+       if (!fences)
+               return -ENOMEM;
+
+       handles = (void *)fences + count * (sizeof(*fences) - sizeof(*handles));
+       if (copy_from_user(handles,
+                          u64_to_user_ptr(args->handles),
+                          sizeof(u32) * count)) {
+               ret = -EFAULT;
+               goto err;
+       }
+
+       for (i = 0; i < count; i++) {
+               struct drm_syncobj *s;
+
+               ret = -ENOENT;
+               s = drm_syncobj_find(file_private, handles[i]);
+               if (s) {
+                       ret = 0;
+                       spin_lock_irq(&s->lock);
+                       if (!s->fence) {
+                               if (args->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
+                                       s->fence = dma_fence_create_proxy();
+                               else
+                                       ret = -EINVAL;
+                       }
+                       if (s->fence)
+                               fences[i] = dma_fence_get(s->fence);
+                       spin_unlock_irq(&s->lock);
+               }
+               if (ret) {
+                       count = i;
+                       goto err_fences;
+               }
+       }
+
+       array = dma_fence_array_create(count, fences, 0, 0,
+                                      !(args->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL));
+       if (!array) {
+               ret = -ENOMEM;
+               goto err_fences;
+       }
+
+       timeout = drm_timeout_abs_to_jiffies(args->timeout_nsec);
+       timeout = dma_fence_wait_timeout(&array->base, true, timeout);
+       args->first_signaled = array->first_signaled;
+       dma_fence_put(&array->base);
+
+       return timeout < 0 ? timeout : 0;
+
+err_fences:
+       for (i = 0; i < count; i++)
+               dma_fence_put(fences[i]);
+err:
+       kfree(fences);
+       return ret;
+}

The key advantage is that this translate the ioctl into a dma-fence-array
which already has to deal with the mess, sharing the burden. (But it
does require a trivial patch to dma-fence-array to record the first
signaled fence.)

However, this installs the proxy into syncobj->fence with the result
that any concurrent wait also become a WAIT_FOR_SUBMIT. The behaviour
of drm_syncobj is then quite inconsistent, sometimes it will wait for a
future fence, sometimes it will report an error.

Furthermore, do we want future-fences in the execbuf API? etc.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 22:41       ` Chris Wilson
@ 2017-08-09 23:53         ` Jason Ekstrand
  2017-08-10 11:00           ` Chris Wilson
  2017-08-10 12:26           ` Christian König
  0 siblings, 2 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-09 23:53 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 4751 bytes --]

On Wed, Aug 9, 2017 at 3:41 PM, Chris Wilson <chris@chris-wilson.co.uk>
wrote:

> Quoting Chris Wilson (2017-08-09 18:57:44)
> > So we are taking a snapshot here. It looks like this could have been
> > done using a dma_fence_array + dma_fence_proxy for capturing the future
> > fence.
>
> A quick sketch of this idea looks like:
>
>  void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
>                                struct dma_fence *fence)
>  {
> -       struct dma_fence *old_fence;
> +       unsigned long flags;
>
> -       if (fence)
> -               dma_fence_get(fence);
> -       old_fence = xchg(&syncobj->fence, fence);
> -
> -       dma_fence_put(old_fence);
> +       spin_lock_irqsave(&syncobj->lock, flags);
> +       dma_fence_replace_proxy(&syncobj->fence, fence);
> +       spin_unlock_irqrestore(&syncobj->lock, flags);
>  }
>
> +int
> +drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
> +                      struct drm_file *file_private)
> +{
> +       struct drm_syncobj_wait *args = data;
> +       struct dma_fence **fences;
> +       struct dma_fence_array *array;
> +       unsigned long timeout;
> +       unsigned int count;
> +       u32 *handles;
> +       int ret = 0;
> +       u32 i;
> +
> +       BUILD_BUG_ON(sizeof(*fences) < sizeof(*handles));
> +
> +       if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
> +               return -ENODEV;
> +
> +       if (args->flags != 0 && args->flags != DRM_SYNCOBJ_WAIT_FLAGS_WAIT_
> ALL)
> +               return -EINVAL;
> +
> +       count = args->count_handles;
> +       if (!count)
> +               return -EINVAL;
> +
> +       /* Get the handles from userspace */
> +       fences = kmalloc_array(count,
> +                              sizeof(struct dma_fence *),
> +                              __GFP_NOWARN | GFP_KERNEL);
> +       if (!fences)
> +               return -ENOMEM;
> +
> +       handles = (void *)fences + count * (sizeof(*fences) -
> sizeof(*handles));
> +       if (copy_from_user(handles,
> +                          u64_to_user_ptr(args->handles),
> +                          sizeof(u32) * count)) {
> +               ret = -EFAULT;
> +               goto err;
> +       }
> +
> +       for (i = 0; i < count; i++) {
> +               struct drm_syncobj *s;
> +
> +               ret = -ENOENT;
> +               s = drm_syncobj_find(file_private, handles[i]);
> +               if (s) {
> +                       ret = 0;
> +                       spin_lock_irq(&s->lock);
> +                       if (!s->fence) {
> +                               if (args->flags &
> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
> +                                       s->fence =
> dma_fence_create_proxy();
> +                               else
> +                                       ret = -EINVAL;
> +                       }
> +                       if (s->fence)
> +                               fences[i] = dma_fence_get(s->fence);
> +                       spin_unlock_irq(&s->lock);
> +               }
> +               if (ret) {
> +                       count = i;
> +                       goto err_fences;
> +               }
> +       }
> +
> +       array = dma_fence_array_create(count, fences, 0, 0,
> +                                      !(args->flags &
> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL));
> +       if (!array) {
> +               ret = -ENOMEM;
> +               goto err_fences;
> +       }
> +
> +       timeout = drm_timeout_abs_to_jiffies(args->timeout_nsec);
> +       timeout = dma_fence_wait_timeout(&array->base, true, timeout);
> +       args->first_signaled = array->first_signaled;
> +       dma_fence_put(&array->base);
> +
> +       return timeout < 0 ? timeout : 0;
> +
> +err_fences:
> +       for (i = 0; i < count; i++)
> +               dma_fence_put(fences[i]);
> +err:
> +       kfree(fences);
> +       return ret;
> +}
>
> The key advantage is that this translate the ioctl into a dma-fence-array
> which already has to deal with the mess, sharing the burden. (But it
> does require a trivial patch to dma-fence-array to record the first
> signaled fence.)
>
> However, this installs the proxy into syncobj->fence with the result
> that any concurrent wait also become a WAIT_FOR_SUBMIT. The behaviour
> of drm_syncobj is then quite inconsistent, sometimes it will wait for a
> future fence, sometimes it will report an error.
>

Yeah, that's not good.  I thought about a variety of solutions to try and
re-use more core dma_fence code.  Ultimately I chose the current one
because it takes a snapshot of the syncobjs and then, from that point
forward, it's consistent with its snapshot.  Nothing I was able to come up
with based on core dma_fence wait routines does that.

--Jason

[-- Attachment #1.2: Type: text/html, Size: 6347 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 2/9] drm/syncobj: Lock around drm_syncobj::fence
  2017-08-09 21:21   ` Chris Wilson
@ 2017-08-10  0:31     ` Jason Ekstrand
  2017-08-10 10:55       ` Chris Wilson
  2017-08-10 19:10       ` Chris Wilson
  0 siblings, 2 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-10  0:31 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Jason Ekstrand, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 935 bytes --]

On Wed, Aug 9, 2017 at 2:21 PM, Chris Wilson <chris@chris-wilson.co.uk>
wrote:

> Quoting Jason Ekstrand (2017-08-08 23:46:02)
> > The atomic exchange operation we were doing before in replace_fence was
> > sufficient for the case where it raced with itself.  However, if you
> > have a race between a replace_fence and dma_fence_get(syncobj->fence),
> > you may end up with the entire replace_fence happening between the point
> > in time where the one thread gets the syncobj->fence pointer and when it
> > calls dma_fence_get() on it.  If this happens, then the reference may be
> > dropped before we get a chance to get a new one.
>
> This doesn't require a spinlock, just dma_fence_get_rcu_safe(). The
> argument for keeping this patch lies in the merit of later patches..
>

Doesn't that also require that we start using an RCU for syncobj?  That was
my interpretation of the hieroglyphics above the definition of
get_rcu_safe()

[-- Attachment #1.2: Type: text/html, Size: 1415 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v3 9/9] drm/syncobj: Allow wait for submit and signal behavior (v3)
  2017-08-09 17:00   ` [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2) Jason Ekstrand
  2017-08-09 17:57     ` Chris Wilson
  2017-08-09 21:31     ` Chris Wilson
@ 2017-08-10  1:35     ` Jason Ekstrand
  2 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-10  1:35 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Dave Airlie, Jason Ekstrand

Vulkan VkFence semantics require that the application be able to perform
a CPU wait on work which may not yet have been submitted.  This is
perfectly safe because the CPU wait has a timeout which will get
triggered eventually if no work is ever submitted.  This behavior is
advantageous for multi-threaded workloads because, so long as all of the
threads agree on what fences to use up-front, you don't have the extra
cross-thread synchronization cost of thread A telling thread B that it
has submitted its dependent work and thread B is now free to wait.

Within a single process, this can be implemented in the userspace driver
by doing exactly the same kind of tracking the app would have to do
using posix condition variables or similar.  However, in order for this
to work cross-process (as is required by VK_KHR_external_fence), we need
to handle this in the kernel.

This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
instructs the IOCTL to wait for the syncobj to have a non-null fence and
then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
easily get the Vulkan behavior.

v2:
 - Fix a bug in the invalid syncobj error path
 - Unify the wait-all and wait-any cases
v3:
 - Unify the timeout == 0 case a bit with the timeout > 0 case
 - use wait_event_interruptible_timeout

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@redhat.com>
---
 drivers/gpu/drm/drm_syncobj.c | 239 ++++++++++++++++++++++++++++++++++--------
 include/uapi/drm/drm.h        |   1 +
 2 files changed, 195 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 510dfc2..6e6fb86 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -51,6 +51,7 @@
 #include <linux/fs.h>
 #include <linux/anon_inodes.h>
 #include <linux/sync_file.h>
+#include <linux/sched/signal.h>
 
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
@@ -104,6 +105,27 @@ static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
 	list_add_tail(&cb->node, &syncobj->cb_list);
 }
 
+static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
+						 struct dma_fence **fence,
+						 struct drm_syncobj_cb *cb,
+						 drm_syncobj_func_t func)
+{
+	int ret;
+
+	spin_lock(&syncobj->lock);
+	if (syncobj->fence) {
+		*fence = dma_fence_get(syncobj->fence);
+		ret = 1;
+	} else {
+		*fence = NULL;
+		drm_syncobj_add_callback_locked(syncobj, cb, func);
+		ret = 0;
+	}
+	spin_unlock(&syncobj->lock);
+
+	return ret;
+}
+
 /**
  * drm_syncobj_add_callback - adds a callback to syncobj::cb_list
  * @syncobj: Sync object to which to add the callback
@@ -526,6 +548,155 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 					&args->handle);
 }
 
+struct syncobj_wait_entry {
+	wait_queue_head_t *wq;
+	struct dma_fence *fence;
+	struct dma_fence_cb fence_cb;
+	struct drm_syncobj_cb syncobj_cb;
+};
+
+static void syncobj_wait_fence_func(struct dma_fence *fence,
+				    struct dma_fence_cb *cb)
+{
+	struct syncobj_wait_entry *wait =
+		container_of(cb, struct syncobj_wait_entry, fence_cb);
+
+	wake_up(wait->wq);
+}
+
+static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
+				      struct drm_syncobj_cb *cb)
+{
+	struct syncobj_wait_entry *wait =
+		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
+
+	/* This happens inside the syncobj lock */
+	wait->fence = dma_fence_get(syncobj->fence);
+	wake_up(wait->wq);
+}
+
+static bool drm_syncobj_array_wait_complete(struct syncobj_wait_entry *entries,
+					    uint32_t count,
+					    uint32_t flags,
+					    uint32_t *idx)
+{
+	struct dma_fence *fence;
+	uint32_t signaled_count;
+	int i;
+
+	signaled_count = 0;
+	for (i = 0; i < count; ++i) {
+		fence = entries[i].fence;
+		if (!fence)
+			continue;
+
+		if (dma_fence_is_signaled(fence) ||
+		    (!entries[i].fence_cb.func &&
+		     dma_fence_add_callback(fence,
+					    &entries[i].fence_cb,
+					    syncobj_wait_fence_func))) {
+			/* The fence has been signaled */
+			if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
+				signaled_count++;
+			} else {
+				if (idx)
+					*idx = i;
+				return true;
+			}
+		}
+	}
+
+	if (signaled_count == count)
+		return true;
+
+	return false;
+}
+
+static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
+						  uint32_t count,
+						  uint32_t flags,
+						  signed long timeout,
+						  uint32_t *idx)
+{
+	wait_queue_head_t syncobj_wait_queue;
+	struct syncobj_wait_entry *entries;
+	signed long ret;
+	uint32_t signaled_count, i;
+
+	entries = kcalloc(count, sizeof(*entries), GFP_KERNEL);
+	if (!entries)
+		return -ENOMEM;
+
+	/* Walk the list of sync objects and initialize entries.  We do
+	 * this up-front so that we can properly return -EINVAL if there is
+	 * a syncobj with a missing fence and then never have the chance of
+	 * returning -EINVAL again.
+	 */
+	signaled_count = 0;
+	for (i = 0; i < count; ++i) {
+		entries[i].wq = &syncobj_wait_queue;
+		entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
+		if (!entries[i].fence) {
+			if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+				continue;
+			} else {
+				ret = -EINVAL;
+				goto cleanup_entries;
+			}
+		}
+
+		if (dma_fence_is_signaled(entries[i].fence)) {
+			if (signaled_count == 0 && idx)
+				*idx = i;
+			signaled_count++;
+		}
+	}
+
+	if (signaled_count == count ||
+	    (signaled_count > 0 &&
+	     !(flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL))) {
+		ret = max_t(signed long, timeout, 1);
+		goto cleanup_entries;
+	}
+
+	/* There's a very annoying laxness in the dma_fence API here, in
+	 * that backends are not required to automatically report when a
+	 * fence is signaled prior to fence->ops->enable_signaling() being
+	 * called.  So here if we fail to match signaled_count, we need to
+	 * fallthough and try a 0 timeout wait!
+	 */
+
+	init_waitqueue_head(&syncobj_wait_queue);
+
+	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+		for (i = 0; i < count; ++i) {
+			drm_syncobj_fence_get_or_add_callback(syncobjs[i],
+							      &entries[i].fence,
+							      &entries[i].syncobj_cb,
+							      syncobj_wait_syncobj_func);
+		}
+	}
+
+	ret = wait_event_interruptible_timeout(
+		syncobj_wait_queue,
+		drm_syncobj_array_wait_complete(entries, count, flags, idx),
+		timeout);
+
+cleanup_entries:
+	for (i = 0; i < count; ++i) {
+		if (entries[i].syncobj_cb.func)
+			drm_syncobj_remove_callback(syncobjs[i],
+						    &entries[i].syncobj_cb);
+		if (entries[i].fence_cb.func)
+			dma_fence_remove_callback(entries[i].fence,
+						  &entries[i].fence_cb);
+		dma_fence_put(entries[i].fence);
+	}
+	kfree(entries);
+
+	return ret;
+}
+
 /**
  * drm_timeout_abs_to_jiffies - calculate jiffies timeout from absolute value
  *
@@ -558,43 +729,19 @@ static signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec)
 	return timeout_jiffies64 + 1;
 }
 
-static int drm_syncobj_wait_fences(struct drm_device *dev,
-				   struct drm_file *file_private,
-				   struct drm_syncobj_wait *wait,
-				   struct dma_fence **fences)
+static int drm_syncobj_array_wait(struct drm_device *dev,
+				  struct drm_file *file_private,
+				  struct drm_syncobj_wait *wait,
+				  struct drm_syncobj **syncobjs)
 {
 	signed long timeout = drm_timeout_abs_to_jiffies(wait->timeout_nsec);
 	signed long ret = 0;
 	uint32_t first = ~0;
 
-	if (wait->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
-		uint32_t i;
-		for (i = 0; i < wait->count_handles; i++) {
-			ret = dma_fence_wait_timeout(fences[i], true, timeout);
-
-			/* Various dma_fence wait callbacks will return
-			 * ENOENT to indicate that the fence has already
-			 * been signaled.  We need to sanitize this to 0 so
-			 * we don't return early and the client doesn't see
-			 * an unexpected error.
-			 */
-			if (ret == -ENOENT)
-				ret = 0;
-
-			if (ret < 0)
-				return ret;
-			if (ret == 0)
-				break;
-			timeout = ret;
-		}
-		first = 0;
-	} else {
-		ret = dma_fence_wait_any_timeout(fences,
-						 wait->count_handles,
-						 true, timeout,
-						 &first);
-	}
-
+	ret = drm_syncobj_array_wait_timeout(syncobjs,
+					     wait->count_handles,
+					     wait->flags,
+					     timeout, &first);
 	if (ret < 0)
 		return ret;
 
@@ -610,14 +757,15 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_syncobj_wait *args = data;
 	uint32_t *handles;
-	struct dma_fence **fences;
+	struct drm_syncobj **syncobjs;
 	int ret = 0;
 	uint32_t i;
 
 	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
 		return -ENODEV;
 
-	if (args->flags != 0 && args->flags != DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
+	if (args->flags & ~(DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL |
+			    DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT))
 		return -EINVAL;
 
 	if (args->count_handles == 0)
@@ -636,27 +784,28 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 		goto err_free_handles;
 	}
 
-	fences = kcalloc(args->count_handles,
-			 sizeof(struct dma_fence *), GFP_KERNEL);
-	if (!fences) {
+	syncobjs = kcalloc(args->count_handles,
+			   sizeof(struct drm_syncobj *), GFP_KERNEL);
+	if (!syncobjs) {
 		ret = -ENOMEM;
 		goto err_free_handles;
 	}
 
 	for (i = 0; i < args->count_handles; i++) {
-		ret = drm_syncobj_find_fence(file_private, handles[i],
-					     &fences[i]);
-		if (ret)
+		syncobjs[i] = drm_syncobj_find(file_private, handles[i]);
+		if (!syncobjs[i]) {
+			ret = -ENOENT;
 			goto err_free_fence_array;
+		}
 	}
 
-	ret = drm_syncobj_wait_fences(dev, file_private,
-				      args, fences);
+	ret = drm_syncobj_array_wait(dev, file_private,
+				     args, syncobjs);
 
 err_free_fence_array:
-	for (i = 0; i < args->count_handles; i++)
-		dma_fence_put(fences[i]);
-	kfree(fences);
+	while (i-- > 0)
+		drm_syncobj_put(syncobjs[i]);
+	kfree(syncobjs);
 err_free_handles:
 	kfree(handles);
 
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 4b301b4..f8ec8fe 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -719,6 +719,7 @@ struct drm_syncobj_handle {
 };
 
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
 struct drm_syncobj_wait {
 	__u64 handles;
 	/* absolute timeout */
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 2/9] drm/syncobj: Lock around drm_syncobj::fence
  2017-08-10  0:31     ` Jason Ekstrand
@ 2017-08-10 10:55       ` Chris Wilson
  2017-08-10 19:10       ` Chris Wilson
  1 sibling, 0 replies; 55+ messages in thread
From: Chris Wilson @ 2017-08-10 10:55 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Jason Ekstrand, Maling list - DRI developers

Quoting Jason Ekstrand (2017-08-10 01:31:52)
> 
> 
> On Wed, Aug 9, 2017 at 2:21 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> 
>     Quoting Jason Ekstrand (2017-08-08 23:46:02)
>     > The atomic exchange operation we were doing before in replace_fence was
>     > sufficient for the case where it raced with itself.  However, if you
>     > have a race between a replace_fence and dma_fence_get(syncobj->fence),
>     > you may end up with the entire replace_fence happening between the point
>     > in time where the one thread gets the syncobj->fence pointer and when it
>     > calls dma_fence_get() on it.  If this happens, then the reference may be
>     > dropped before we get a chance to get a new one.
> 
>     This doesn't require a spinlock, just dma_fence_get_rcu_safe(). The
>     argument for keeping this patch lies in the merit of later patches..
> 
> 
> Doesn't that also require that we start using an RCU for syncobj?  That was my
> interpretation of the hieroglyphics above the definition of get_rcu_safe()

That we use RCU for the fence, which we do. i.e. so that the fence
pointer remains valid during the atomic_inc_unless_zero. The caller is
responsible for making sure that the syncobj remains valid across the
call (which we do using a ref, but it too could use RCU if that was ever
desired).
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 23:53         ` Jason Ekstrand
@ 2017-08-10 11:00           ` Chris Wilson
  2017-08-10 14:42             ` Jason Ekstrand
  2017-08-10 12:26           ` Christian König
  1 sibling, 1 reply; 55+ messages in thread
From: Chris Wilson @ 2017-08-10 11:00 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers

Quoting Jason Ekstrand (2017-08-10 00:53:00)
> On Wed, Aug 9, 2017 at 3:41 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>     However, this installs the proxy into syncobj->fence with the result
>     that any concurrent wait also become a WAIT_FOR_SUBMIT. The behaviour
>     of drm_syncobj is then quite inconsistent, sometimes it will wait for a
>     future fence, sometimes it will report an error.
> 
> 
> Yeah, that's not good.  I thought about a variety of solutions to try and
> re-use more core dma_fence code.  Ultimately I chose the current one because it
> takes a snapshot of the syncobjs and then, from that point forward, it's
> consistent with its snapshot.  Nothing I was able to come up with based on core
> dma_fence wait routines does that.

So if we choose to keep the proxy local to the fence-array and not replace
syncobj->fence, we can still reduce this to a plain dma-fence-array +
wait.

Pertinent details:

+static void syncobj_notify(struct drm_syncobj *syncobj, struct dma_fence *fence)
+{
+       struct drm_syncobj_cb *cb, *cn;
+       unsigned long flags;
+
+       /* This is just an opencoded waitqueue; FIXME! */
+       spin_lock_irqsave(&syncobj->lock, flags);
+       list_for_each_entry_safe(cb, cn, &syncobj->cb_list, link)
+               cb->func(cb, fence);
+       INIT_LIST_HEAD(&syncobj->cb_list);
+       spin_unlock_irqrestore(&syncobj->lock, flags);
+}
+
 /**
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @syncobj: Sync object to replace fence in
@@ -89,7 +109,10 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 
        if (fence)
                dma_fence_get(fence);
+
        old_fence = xchg(&syncobj->fence, fence);
+       if (!old_fence && fence)
+               syncobj_notify(syncobj, fence);
 
        dma_fence_put(old_fence);
 }
@@ -124,6 +147,9 @@ void drm_syncobj_free(struct kref *kref)
        struct drm_syncobj *syncobj = container_of(kref,
                                                   struct drm_syncobj,
                                                   refcount);
+
+       syncobj_notify(syncobj, NULL);
+
        dma_fence_put(syncobj->fence);
        kfree(syncobj);
 }
@@ -140,6 +166,8 @@ static int drm_syncobj_create(struct drm_file *file_private,
                return -ENOMEM;
 
        kref_init(&syncobj->refcount);
+       spin_lock_init(&syncobj->lock);
+       INIT_LIST_HEAD(&syncobj->cb_list);
 
        idr_preload(GFP_KERNEL);
        spin_lock(&file_private->syncobj_table_lock);

struct future_fence {
+       struct drm_syncobj_cb base;
+       struct dma_fence **slot;
+};
+
+static void future_fence_cb(struct drm_syncobj_cb *cb, struct dma_fence *fence)
+{
+       struct future_fence *ff = container_of(cb, typeof(*ff), base);
+
+       if (fence)
+               dma_fence_replace_proxy(ff->slot, fence);
+       else
+               dma_fence_signal(*ff->slot);
+
+       kfree(ff);
+}
+
+int
+drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
+                      struct drm_file *file_private)
+{
+       struct drm_syncobj_wait *args = data;
+       struct dma_fence_array *array;
+       struct dma_fence **fences;
+       unsigned int count, i;
+       long timeout;
+       u32 *handles;
+       int ret;
+
+       BUILD_BUG_ON(sizeof(*fences) < sizeof(*handles));
+
+       if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+               return -ENODEV;
+
+       if (args->flags & ~(DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL |
+                           DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT))
+               return -EINVAL;
+
+       count = args->count_handles;
+       if (!count)
+               return -EINVAL;
+
+       /* Get the handles from userspace */
+       fences = kmalloc_array(count,
+                              sizeof(struct dma_fence *),
+                              __GFP_NOWARN | GFP_KERNEL);
+       if (!fences)
+               return -ENOMEM;
+
+       handles = (void *)fences + count * (sizeof(*fences) - sizeof(*handles));
+       if (copy_from_user(handles,
+                          u64_to_user_ptr(args->handles),
+                          sizeof(u32) * count)) {
+               ret = -EFAULT;
+               goto err;
+       }
+
+       for (i = 0; i < count; i++) {
+               struct drm_syncobj *s;
+
+               ret = -ENOENT;
+               s = drm_syncobj_find(file_private, handles[i]);
+               if (s) {
+                       ret = 0;
+                       spin_lock_irq(&s->lock);
+                       if (s->fence) {
+                               fences[i] = dma_fence_get(s->fence);
+                       } else if (args->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+                               struct future_fence *cb;
+
+                               cb = kmalloc(sizeof(*cb), GFP_KERNEL);
+                               fences[i] = dma_fence_create_proxy();
+                               if (cb && fences[i]) {
+                                       cb->slot = &fences[i];
+                                       cb->base.func = future_fence_cb;
+                                       list_add(&cb->base.link, &s->cb_list);
+                               } else {
+                                       kfree(cb);
+                                       dma_fence_put(fences[i]);
+                                       ret = -ENOMEM;
+                               }
+                       } else {
+                               ret = -EINVAL;
+                       }
+                       spin_unlock_irq(&s->lock);
+               }
+               if (ret) {
+                       count = i;
+                       goto err_fences;
+               }
+       }
+
+       array = dma_fence_array_create(count, fences, 0, 0,
+                                      !(args->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL));
+       if (!array) {
+               ret = -ENOMEM;
+               goto err_fences;
+       }
+
+       timeout = drm_timeout_abs_to_jiffies(args->timeout_nsec);
+       timeout = dma_fence_wait_timeout(&array->base, true, timeout);
+       args->first_signaled = array->first_signaled;
+       dma_fence_put(&array->base);
+
+       return timeout < 0 ? timeout : timeout == 0 ? -ETIME : 0; /* Gah. */
+
+err_fences:
+       for (i = 0; i < count; i++)
+               dma_fence_put(fences[i]);
+err:
+       kfree(fences);
+       return ret;
+}

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 23:53         ` Jason Ekstrand
  2017-08-10 11:00           ` Chris Wilson
@ 2017-08-10 12:26           ` Christian König
  2017-08-10 14:32             ` Jason Ekstrand
  1 sibling, 1 reply; 55+ messages in thread
From: Christian König @ 2017-08-10 12:26 UTC (permalink / raw)
  To: Jason Ekstrand, Chris Wilson
  Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 6194 bytes --]

Am 10.08.2017 um 01:53 schrieb Jason Ekstrand:
> On Wed, Aug 9, 2017 at 3:41 PM, Chris Wilson <chris@chris-wilson.co.uk 
> <mailto:chris@chris-wilson.co.uk>> wrote:
>
>     Quoting Chris Wilson (2017-08-09 18:57:44)
>     > So we are taking a snapshot here. It looks like this could have been
>     > done using a dma_fence_array + dma_fence_proxy for capturing the
>     future
>     > fence.
>
>     A quick sketch of this idea looks like:
>
>      void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
>                                    struct dma_fence *fence)
>      {
>     -       struct dma_fence *old_fence;
>     +       unsigned long flags;
>
>     -       if (fence)
>     -               dma_fence_get(fence);
>     -       old_fence = xchg(&syncobj->fence, fence);
>     -
>     -       dma_fence_put(old_fence);
>     +       spin_lock_irqsave(&syncobj->lock, flags);
>     +       dma_fence_replace_proxy(&syncobj->fence, fence);
>     +       spin_unlock_irqrestore(&syncobj->lock, flags);
>      }
>
>     +int
>     +drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
>     +                      struct drm_file *file_private)
>     +{
>     +       struct drm_syncobj_wait *args = data;
>     +       struct dma_fence **fences;
>     +       struct dma_fence_array *array;
>     +       unsigned long timeout;
>     +       unsigned int count;
>     +       u32 *handles;
>     +       int ret = 0;
>     +       u32 i;
>     +
>     +       BUILD_BUG_ON(sizeof(*fences) < sizeof(*handles));
>     +
>     +       if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
>     +               return -ENODEV;
>     +
>     +       if (args->flags != 0 && args->flags !=
>     DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
>     +               return -EINVAL;
>     +
>     +       count = args->count_handles;
>     +       if (!count)
>     +               return -EINVAL;
>     +
>     +       /* Get the handles from userspace */
>     +       fences = kmalloc_array(count,
>     +                              sizeof(struct dma_fence *),
>     +                              __GFP_NOWARN | GFP_KERNEL);
>     +       if (!fences)
>     +               return -ENOMEM;
>     +
>     +       handles = (void *)fences + count * (sizeof(*fences) -
>     sizeof(*handles));
>     +       if (copy_from_user(handles,
>     + u64_to_user_ptr(args->handles),
>     +                          sizeof(u32) * count)) {
>     +               ret = -EFAULT;
>     +               goto err;
>     +       }
>     +
>     +       for (i = 0; i < count; i++) {
>     +               struct drm_syncobj *s;
>     +
>     +               ret = -ENOENT;
>     +               s = drm_syncobj_find(file_private, handles[i]);
>     +               if (s) {
>     +                       ret = 0;
>     +                       spin_lock_irq(&s->lock);
>     +                       if (!s->fence) {
>     +                               if (args->flags &
>     DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
>     +                                       s->fence =
>     dma_fence_create_proxy();
>     +                               else
>     +                                       ret = -EINVAL;
>     +                       }
>     +                       if (s->fence)
>     +                               fences[i] = dma_fence_get(s->fence);
>     +                       spin_unlock_irq(&s->lock);
>     +               }
>     +               if (ret) {
>     +                       count = i;
>     +                       goto err_fences;
>     +               }
>     +       }
>     +
>     +       array = dma_fence_array_create(count, fences, 0, 0,
>     +                                      !(args->flags &
>     DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL));
>     +       if (!array) {
>     +               ret = -ENOMEM;
>     +               goto err_fences;
>     +       }
>     +
>     +       timeout = drm_timeout_abs_to_jiffies(args->timeout_nsec);
>     +       timeout = dma_fence_wait_timeout(&array->base, true, timeout);
>     +       args->first_signaled = array->first_signaled;
>     +       dma_fence_put(&array->base);
>     +
>     +       return timeout < 0 ? timeout : 0;
>     +
>     +err_fences:
>     +       for (i = 0; i < count; i++)
>     +               dma_fence_put(fences[i]);
>     +err:
>     +       kfree(fences);
>     +       return ret;
>     +}
>
>     The key advantage is that this translate the ioctl into a
>     dma-fence-array
>     which already has to deal with the mess, sharing the burden. (But it
>     does require a trivial patch to dma-fence-array to record the first
>     signaled fence.)
>
>     However, this installs the proxy into syncobj->fence with the result
>     that any concurrent wait also become a WAIT_FOR_SUBMIT. The behaviour
>     of drm_syncobj is then quite inconsistent, sometimes it will wait
>     for a
>     future fence, sometimes it will report an error.
>
>
> Yeah, that's not good.  I thought about a variety of solutions to try 
> and re-use more core dma_fence code.  Ultimately I chose the current 
> one because it takes a snapshot of the syncobjs and then, from that 
> point forward, it's consistent with its snapshot.  Nothing I was able 
> to come up with based on core dma_fence wait routines does that.

As Chris pointed out, that's really not a good idea.

Most of the time we need the behavior of reporting an error and only 
when the flag is given wait until some fence shows up.

In general I suggest that we only support this use case in the form of a 
wait_event_interruptible() on setting the first fence on an object.

Waiting on the first one of multiple objects wouldn't be possible (you 
would always wait till all objects have fences), but I think that this 
is acceptable.

The big advantage of the wait_event_interruptible() interface is that 
your condition checking and process handling is bullet prove, as far as 
I have seen so far your code is a bit buggy in that direction.

Christian.

>
> --Jason
>
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



[-- Attachment #1.2: Type: text/html, Size: 10438 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-09 21:31     ` Chris Wilson
  2017-08-09 21:54       ` Jason Ekstrand
@ 2017-08-10 12:26       ` Christian König
  1 sibling, 0 replies; 55+ messages in thread
From: Christian König @ 2017-08-10 12:26 UTC (permalink / raw)
  To: Chris Wilson, Jason Ekstrand, dri-devel; +Cc: Jason Ekstrand, Dave Airlie

Am 09.08.2017 um 23:31 schrieb Chris Wilson:
> Quoting Jason Ekstrand (2017-08-09 18:00:54)
>> +static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>> +                                                 uint32_t count,
>> +                                                 uint32_t flags,
>> +                                                 signed long timeout,
>> +                                                 uint32_t *idx)
>> +{
>> +       struct syncobj_wait_entry *entries;
>> +       struct dma_fence *fence;
>> +       signed long ret;
>> +       uint32_t signaled_count, i;
>> +
>> +       if (timeout == 0) {
>> +               signaled_count = 0;
>> +               for (i = 0; i < count; ++i) {
>> +                       ret = drm_syncobj_signaled(syncobjs[i], flags);
>> +                       if (ret < 0)
>> +                               return ret;
>> +                       if (ret == 0)
>> +                               continue;
>> +                       if (signaled_count == 0 && idx)
>> +                               *idx = i;
>> +                       signaled_count++;
>> +               }
>> +
>> +               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
>> +                       return signaled_count == count ? 1 : 0;
>> +               else
>> +                       return signaled_count > 0 ? 1 : 0;
> There's a very annoying laxness in the dma_fence API here, in that
> backends are not required to automatically report when a fence is
> signaled prior to fence->ops->enable_signaling() being called.
> So here if we fail to match signaled_count, we need to fallthough and
> try a 0 timeout wait!
>
> Christian, dma_fence_wait_any_timeout() has this same bug you told me off
> for, e.g. commit 698c0f7ff216 ("dma-buf/fence: revert "don't wait when
> specified timeout is zero" (v2)")!

Thanks for pointing this out, going to take care of this issue.

Christian.

> -Chris
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-10 12:26           ` Christian König
@ 2017-08-10 14:32             ` Jason Ekstrand
  2017-08-10 14:41               ` Christian König
  0 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-10 14:32 UTC (permalink / raw)
  To: Christian König
  Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 6663 bytes --]

On Thu, Aug 10, 2017 at 5:26 AM, Christian König <deathsimple@vodafone.de>
wrote:

> Am 10.08.2017 um 01:53 schrieb Jason Ekstrand:
>
> On Wed, Aug 9, 2017 at 3:41 PM, Chris Wilson <chris@chris-wilson.co.uk>
> wrote:
>
>> Quoting Chris Wilson (2017-08-09 18:57:44)
>> > So we are taking a snapshot here. It looks like this could have been
>> > done using a dma_fence_array + dma_fence_proxy for capturing the future
>> > fence.
>>
>> A quick sketch of this idea looks like:
>>
>>  void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
>>                                struct dma_fence *fence)
>>  {
>> -       struct dma_fence *old_fence;
>> +       unsigned long flags;
>>
>> -       if (fence)
>> -               dma_fence_get(fence);
>> -       old_fence = xchg(&syncobj->fence, fence);
>> -
>> -       dma_fence_put(old_fence);
>> +       spin_lock_irqsave(&syncobj->lock, flags);
>> +       dma_fence_replace_proxy(&syncobj->fence, fence);
>> +       spin_unlock_irqrestore(&syncobj->lock, flags);
>>  }
>>
>> +int
>> +drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
>> +                      struct drm_file *file_private)
>> +{
>> +       struct drm_syncobj_wait *args = data;
>> +       struct dma_fence **fences;
>> +       struct dma_fence_array *array;
>> +       unsigned long timeout;
>> +       unsigned int count;
>> +       u32 *handles;
>> +       int ret = 0;
>> +       u32 i;
>> +
>> +       BUILD_BUG_ON(sizeof(*fences) < sizeof(*handles));
>> +
>> +       if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
>> +               return -ENODEV;
>> +
>> +       if (args->flags != 0 && args->flags !=
>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
>> +               return -EINVAL;
>> +
>> +       count = args->count_handles;
>> +       if (!count)
>> +               return -EINVAL;
>> +
>> +       /* Get the handles from userspace */
>> +       fences = kmalloc_array(count,
>> +                              sizeof(struct dma_fence *),
>> +                              __GFP_NOWARN | GFP_KERNEL);
>> +       if (!fences)
>> +               return -ENOMEM;
>> +
>> +       handles = (void *)fences + count * (sizeof(*fences) -
>> sizeof(*handles));
>> +       if (copy_from_user(handles,
>> +                          u64_to_user_ptr(args->handles),
>> +                          sizeof(u32) * count)) {
>> +               ret = -EFAULT;
>> +               goto err;
>> +       }
>> +
>> +       for (i = 0; i < count; i++) {
>> +               struct drm_syncobj *s;
>> +
>> +               ret = -ENOENT;
>> +               s = drm_syncobj_find(file_private, handles[i]);
>> +               if (s) {
>> +                       ret = 0;
>> +                       spin_lock_irq(&s->lock);
>> +                       if (!s->fence) {
>> +                               if (args->flags &
>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
>> +                                       s->fence =
>> dma_fence_create_proxy();
>> +                               else
>> +                                       ret = -EINVAL;
>> +                       }
>> +                       if (s->fence)
>> +                               fences[i] = dma_fence_get(s->fence);
>> +                       spin_unlock_irq(&s->lock);
>> +               }
>> +               if (ret) {
>> +                       count = i;
>> +                       goto err_fences;
>> +               }
>> +       }
>> +
>> +       array = dma_fence_array_create(count, fences, 0, 0,
>> +                                      !(args->flags &
>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL));
>> +       if (!array) {
>> +               ret = -ENOMEM;
>> +               goto err_fences;
>> +       }
>> +
>> +       timeout = drm_timeout_abs_to_jiffies(args->timeout_nsec);
>> +       timeout = dma_fence_wait_timeout(&array->base, true, timeout);
>> +       args->first_signaled = array->first_signaled;
>> +       dma_fence_put(&array->base);
>> +
>> +       return timeout < 0 ? timeout : 0;
>> +
>> +err_fences:
>> +       for (i = 0; i < count; i++)
>> +               dma_fence_put(fences[i]);
>> +err:
>> +       kfree(fences);
>> +       return ret;
>> +}
>>
>> The key advantage is that this translate the ioctl into a dma-fence-array
>> which already has to deal with the mess, sharing the burden. (But it
>> does require a trivial patch to dma-fence-array to record the first
>> signaled fence.)
>>
>> However, this installs the proxy into syncobj->fence with the result
>> that any concurrent wait also become a WAIT_FOR_SUBMIT. The behaviour
>> of drm_syncobj is then quite inconsistent, sometimes it will wait for a
>> future fence, sometimes it will report an error.
>>
>
> Yeah, that's not good.  I thought about a variety of solutions to try and
> re-use more core dma_fence code.  Ultimately I chose the current one
> because it takes a snapshot of the syncobjs and then, from that point
> forward, it's consistent with its snapshot.  Nothing I was able to come up
> with based on core dma_fence wait routines does that.
>
>
> As Chris pointed out, that's really not a good idea.
>

What isn't a good idea?


> Most of the time we need the behavior of reporting an error and only when
> the flag is given wait until some fence shows up.
>
> In general I suggest that we only support this use case in the form of a
> wait_event_interruptible() on setting the first fence on an object.
>
> Waiting on the first one of multiple objects wouldn't be possible (you
> would always wait till all objects have fences), but I think that this is
> acceptable.
>

Not really.  If you're doing a wait-any, then you want it to return as soon
as you have a signaled fence even if half your sync objects never get
fences.  At least that's what's required for implementing vkWaitForFences.
The idea is that, if the WAIT_FOR_SUBMIT flag is set, then a syncobj
without a fence and a syncobj with an untriggered fence are identical as
far as waiting is concerned:  You wait until it has a signaled fence.


> The big advantage of the wait_event_interruptible() interface is that your
> condition checking and process handling is bullet prove, as far as I have
> seen so far your code is a bit buggy in that direction.
>

In v3, I used wait_event_interruptible.  Does it have those same bugs?

--Jason


> Christian.
>
>
> --Jason
>
>
> _______________________________________________
> dri-devel mailing listdri-devel@lists.freedesktop.orghttps://lists.freedesktop.org/mailman/listinfo/dri-devel
>
>
>

[-- Attachment #1.2: Type: text/html, Size: 12041 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-10 14:32             ` Jason Ekstrand
@ 2017-08-10 14:41               ` Christian König
  0 siblings, 0 replies; 55+ messages in thread
From: Christian König @ 2017-08-10 14:41 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 8142 bytes --]

Am 10.08.2017 um 16:32 schrieb Jason Ekstrand:
> On Thu, Aug 10, 2017 at 5:26 AM, Christian König 
> <deathsimple@vodafone.de <mailto:deathsimple@vodafone.de>> wrote:
>
>     Am 10.08.2017 um 01:53 schrieb Jason Ekstrand:
>>     On Wed, Aug 9, 2017 at 3:41 PM, Chris Wilson
>>     <chris@chris-wilson.co.uk <mailto:chris@chris-wilson.co.uk>> wrote:
>>
>>         Quoting Chris Wilson (2017-08-09 18:57:44)
>>         > So we are taking a snapshot here. It looks like this could have been
>>         > done using a dma_fence_array + dma_fence_proxy for
>>         capturing the future
>>         > fence.
>>
>>         A quick sketch of this idea looks like:
>>
>>          void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
>>                                        struct dma_fence *fence)
>>          {
>>         -       struct dma_fence *old_fence;
>>         +       unsigned long flags;
>>
>>         -       if (fence)
>>         -               dma_fence_get(fence);
>>         -       old_fence = xchg(&syncobj->fence, fence);
>>         -
>>         -       dma_fence_put(old_fence);
>>         +  spin_lock_irqsave(&syncobj->lock, flags);
>>         +       dma_fence_replace_proxy(&syncobj->fence, fence);
>>         +       spin_unlock_irqrestore(&syncobj->lock, flags);
>>          }
>>
>>         +int
>>         +drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
>>         +                      struct drm_file *file_private)
>>         +{
>>         +       struct drm_syncobj_wait *args = data;
>>         +       struct dma_fence **fences;
>>         +       struct dma_fence_array *array;
>>         +       unsigned long timeout;
>>         +       unsigned int count;
>>         +       u32 *handles;
>>         +       int ret = 0;
>>         +       u32 i;
>>         +
>>         +       BUILD_BUG_ON(sizeof(*fences) < sizeof(*handles));
>>         +
>>         +       if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
>>         +               return -ENODEV;
>>         +
>>         +       if (args->flags != 0 && args->flags !=
>>         DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
>>         +               return -EINVAL;
>>         +
>>         +       count = args->count_handles;
>>         +       if (!count)
>>         +               return -EINVAL;
>>         +
>>         +       /* Get the handles from userspace */
>>         +       fences = kmalloc_array(count,
>>         + sizeof(struct dma_fence *),
>>         + __GFP_NOWARN | GFP_KERNEL);
>>         +       if (!fences)
>>         +               return -ENOMEM;
>>         +
>>         +       handles = (void *)fences + count * (sizeof(*fences) -
>>         sizeof(*handles));
>>         +       if (copy_from_user(handles,
>>         + u64_to_user_ptr(args->handles),
>>         +                          sizeof(u32) * count)) {
>>         +               ret = -EFAULT;
>>         +               goto err;
>>         +       }
>>         +
>>         +       for (i = 0; i < count; i++) {
>>         +               struct drm_syncobj *s;
>>         +
>>         +               ret = -ENOENT;
>>         +               s = drm_syncobj_find(file_private, handles[i]);
>>         +               if (s) {
>>         +                       ret = 0;
>>         +  spin_lock_irq(&s->lock);
>>         +                       if (!s->fence) {
>>         +                               if (args->flags &
>>         DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
>>         +  s->fence = dma_fence_create_proxy();
>>         +                               else
>>         +    ret = -EINVAL;
>>         +                       }
>>         +                       if (s->fence)
>>         +                               fences[i] =
>>         dma_fence_get(s->fence);
>>         +  spin_unlock_irq(&s->lock);
>>         +               }
>>         +               if (ret) {
>>         +                       count = i;
>>         +                       goto err_fences;
>>         +               }
>>         +       }
>>         +
>>         +       array = dma_fence_array_create(count, fences, 0, 0,
>>         + !(args->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL));
>>         +       if (!array) {
>>         +               ret = -ENOMEM;
>>         +               goto err_fences;
>>         +       }
>>         +
>>         +       timeout = drm_timeout_abs_to_jiffies(args->timeout_nsec);
>>         +       timeout = dma_fence_wait_timeout(&array->base, true,
>>         timeout);
>>         +       args->first_signaled = array->first_signaled;
>>         +  dma_fence_put(&array->base);
>>         +
>>         +       return timeout < 0 ? timeout : 0;
>>         +
>>         +err_fences:
>>         +       for (i = 0; i < count; i++)
>>         +               dma_fence_put(fences[i]);
>>         +err:
>>         +       kfree(fences);
>>         +       return ret;
>>         +}
>>
>>         The key advantage is that this translate the ioctl into a
>>         dma-fence-array
>>         which already has to deal with the mess, sharing the burden.
>>         (But it
>>         does require a trivial patch to dma-fence-array to record the
>>         first
>>         signaled fence.)
>>
>>         However, this installs the proxy into syncobj->fence with the
>>         result
>>         that any concurrent wait also become a WAIT_FOR_SUBMIT. The
>>         behaviour
>>         of drm_syncobj is then quite inconsistent, sometimes it will
>>         wait for a
>>         future fence, sometimes it will report an error.
>>
>>
>>     Yeah, that's not good. I thought about a variety of solutions to
>>     try and re-use more core dma_fence code. Ultimately I chose the
>>     current one because it takes a snapshot of the syncobjs and then,
>>     from that point forward, it's consistent with its snapshot. 
>>     Nothing I was able to come up with based on core dma_fence wait
>>     routines does that.
>
>     As Chris pointed out, that's really not a good idea.
>
>
> What isn't a good idea?

> However, this installs the proxy into syncobj->fence with the result
> that any concurrent wait also become a WAIT_FOR_SUBMIT.

Installing that proxy. I'm always happy if someone reuses the code 
(after all I wrote a good part of it), but I think we should rather try 
to make this as clean as possible.

>     Most of the time we need the behavior of reporting an error and
>     only when the flag is given wait until some fence shows up.
>
>     In general I suggest that we only support this use case in the
>     form of a wait_event_interruptible() on setting the first fence on
>     an object.
>
>     Waiting on the first one of multiple objects wouldn't be possible
>     (you would always wait till all objects have fences), but I think
>     that this is acceptable.
>
>
> Not really.  If you're doing a wait-any, then you want it to return as 
> soon as you have a signaled fence even if half your sync objects never 
> get fences.  At least that's what's required for implementing 
> vkWaitForFences.  The idea is that, if the WAIT_FOR_SUBMIT flag is 
> set, then a syncobj without a fence and a syncobj with an untriggered 
> fence are identical as far as waiting is concerned:  You wait until it 
> has a signaled fence.

Crap, that makes the whole thing much more harder to implement.

>     The big advantage of the wait_event_interruptible() interface is
>     that your condition checking and process handling is bullet prove,
>     as far as I have seen so far your code is a bit buggy in that
>     direction.
>
>
> In v3, I used wait_event_interruptible.  Does it have those same bugs?

I haven't seen that yet, but when you now use wait_even_interruptible() 
the problem should be gone.

Regards,
Christian.

>
> --Jason
>
>     Christian.
>
>>
>>     --Jason
>>
>>
>>     _______________________________________________
>>     dri-devel mailing list
>>     dri-devel@lists.freedesktop.org
>>     <mailto:dri-devel@lists.freedesktop.org>
>>     https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>     <https://lists.freedesktop.org/mailman/listinfo/dri-devel>
>
>
>


[-- Attachment #1.2: Type: text/html, Size: 18664 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2)
  2017-08-10 11:00           ` Chris Wilson
@ 2017-08-10 14:42             ` Jason Ekstrand
  0 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-10 14:42 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 7039 bytes --]

On Thu, Aug 10, 2017 at 4:00 AM, Chris Wilson <chris@chris-wilson.co.uk>
wrote:

> Quoting Jason Ekstrand (2017-08-10 00:53:00)
> > On Wed, Aug 9, 2017 at 3:41 PM, Chris Wilson <chris@chris-wilson.co.uk>
> wrote:
> >     However, this installs the proxy into syncobj->fence with the result
> >     that any concurrent wait also become a WAIT_FOR_SUBMIT. The behaviour
> >     of drm_syncobj is then quite inconsistent, sometimes it will wait
> for a
> >     future fence, sometimes it will report an error.
> >
> >
> > Yeah, that's not good.  I thought about a variety of solutions to try and
> > re-use more core dma_fence code.  Ultimately I chose the current one
> because it
> > takes a snapshot of the syncobjs and then, from that point forward, it's
> > consistent with its snapshot.  Nothing I was able to come up with based
> on core
> > dma_fence wait routines does that.
>
> So if we choose to keep the proxy local to the fence-array and not replace
> syncobj->fence, we can still reduce this to a plain dma-fence-array +
> wait.
>
> Pertinent details:
>
> +static void syncobj_notify(struct drm_syncobj *syncobj, struct dma_fence
> *fence)
> +{
> +       struct drm_syncobj_cb *cb, *cn;
> +       unsigned long flags;
> +
> +       /* This is just an opencoded waitqueue; FIXME! */
> +       spin_lock_irqsave(&syncobj->lock, flags);
> +       list_for_each_entry_safe(cb, cn, &syncobj->cb_list, link)
> +               cb->func(cb, fence);
> +       INIT_LIST_HEAD(&syncobj->cb_list);
> +       spin_unlock_irqrestore(&syncobj->lock, flags);
> +}
> +
>  /**
>   * drm_syncobj_replace_fence - replace fence in a sync object.
>   * @syncobj: Sync object to replace fence in
> @@ -89,7 +109,10 @@ void drm_syncobj_replace_fence(struct drm_syncobj
> *syncobj,
>
>         if (fence)
>                 dma_fence_get(fence);
> +
>         old_fence = xchg(&syncobj->fence, fence);
> +       if (!old_fence && fence)
> +               syncobj_notify(syncobj, fence);
>
>         dma_fence_put(old_fence);
>  }
> @@ -124,6 +147,9 @@ void drm_syncobj_free(struct kref *kref)
>         struct drm_syncobj *syncobj = container_of(kref,
>                                                    struct drm_syncobj,
>                                                    refcount);
> +
> +       syncobj_notify(syncobj, NULL);
> +
>         dma_fence_put(syncobj->fence);
>         kfree(syncobj);
>  }
> @@ -140,6 +166,8 @@ static int drm_syncobj_create(struct drm_file
> *file_private,
>                 return -ENOMEM;
>
>         kref_init(&syncobj->refcount);
> +       spin_lock_init(&syncobj->lock);
> +       INIT_LIST_HEAD(&syncobj->cb_list);
>
>         idr_preload(GFP_KERNEL);
>         spin_lock(&file_private->syncobj_table_lock);
>
> struct future_fence {
> +       struct drm_syncobj_cb base;
> +       struct dma_fence **slot;
> +};
> +
> +static void future_fence_cb(struct drm_syncobj_cb *cb, struct dma_fence
> *fence)
> +{
> +       struct future_fence *ff = container_of(cb, typeof(*ff), base);
> +
> +       if (fence)
> +               dma_fence_replace_proxy(ff->slot, fence);
>

Where does this come from?  I can't find it on dri-devel and "proxy"
doesn't show up anywhere in the dam-buf sources.  What does it do?


> +       else
> +               dma_fence_signal(*ff->slot);
> +
> +       kfree(ff);
> +}
> +
> +int
> +drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
> +                      struct drm_file *file_private)
> +{
> +       struct drm_syncobj_wait *args = data;
> +       struct dma_fence_array *array;
> +       struct dma_fence **fences;
> +       unsigned int count, i;
> +       long timeout;
> +       u32 *handles;
> +       int ret;
> +
> +       BUILD_BUG_ON(sizeof(*fences) < sizeof(*handles));
> +
> +       if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
> +               return -ENODEV;
> +
> +       if (args->flags & ~(DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL |
> +                           DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT))
> +               return -EINVAL;
> +
> +       count = args->count_handles;
> +       if (!count)
> +               return -EINVAL;
> +
> +       /* Get the handles from userspace */
> +       fences = kmalloc_array(count,
> +                              sizeof(struct dma_fence *),
> +                              __GFP_NOWARN | GFP_KERNEL);
> +       if (!fences)
> +               return -ENOMEM;
> +
> +       handles = (void *)fences + count * (sizeof(*fences) -
> sizeof(*handles));
> +       if (copy_from_user(handles,
> +                          u64_to_user_ptr(args->handles),
> +                          sizeof(u32) * count)) {
> +               ret = -EFAULT;
> +               goto err;
> +       }
> +
> +       for (i = 0; i < count; i++) {
> +               struct drm_syncobj *s;
> +
> +               ret = -ENOENT;
> +               s = drm_syncobj_find(file_private, handles[i]);
> +               if (s) {
> +                       ret = 0;
> +                       spin_lock_irq(&s->lock);
> +                       if (s->fence) {
> +                               fences[i] = dma_fence_get(s->fence);
> +                       } else if (args->flags &
> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
> +                               struct future_fence *cb;
> +
> +                               cb = kmalloc(sizeof(*cb), GFP_KERNEL);
> +                               fences[i] = dma_fence_create_proxy();
> +                               if (cb && fences[i]) {
> +                                       cb->slot = &fences[i];
> +                                       cb->base.func = future_fence_cb;
> +                                       list_add(&cb->base.link,
> &s->cb_list);
> +                               } else {
> +                                       kfree(cb);
> +                                       dma_fence_put(fences[i]);
> +                                       ret = -ENOMEM;
> +                               }
> +                       } else {
> +                               ret = -EINVAL;
> +                       }
> +                       spin_unlock_irq(&s->lock);
> +               }
> +               if (ret) {
> +                       count = i;
> +                       goto err_fences;
> +               }
> +       }
> +
> +       array = dma_fence_array_create(count, fences, 0, 0,
> +                                      !(args->flags &
> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL));
> +       if (!array) {
> +               ret = -ENOMEM;
> +               goto err_fences;
> +       }
> +
> +       timeout = drm_timeout_abs_to_jiffies(args->timeout_nsec);
> +       timeout = dma_fence_wait_timeout(&array->base, true, timeout);
> +       args->first_signaled = array->first_signaled;
> +       dma_fence_put(&array->base);
> +
> +       return timeout < 0 ? timeout : timeout == 0 ? -ETIME : 0; /* Gah.
> */
> +
> +err_fences:
> +       for (i = 0; i < count; i++)
> +               dma_fence_put(fences[i]);
> +err:
> +       kfree(fences);
> +       return ret;
> +}
>
>

[-- Attachment #1.2: Type: text/html, Size: 9707 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 2/9] drm/syncobj: Lock around drm_syncobj::fence
  2017-08-10  0:31     ` Jason Ekstrand
  2017-08-10 10:55       ` Chris Wilson
@ 2017-08-10 19:10       ` Chris Wilson
  1 sibling, 0 replies; 55+ messages in thread
From: Chris Wilson @ 2017-08-10 19:10 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Jason Ekstrand, Maling list - DRI developers

Quoting Jason Ekstrand (2017-08-10 01:31:52)
> 
> 
> On Wed, Aug 9, 2017 at 2:21 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> 
>     Quoting Jason Ekstrand (2017-08-08 23:46:02)
>     > The atomic exchange operation we were doing before in replace_fence was
>     > sufficient for the case where it raced with itself.  However, if you
>     > have a race between a replace_fence and dma_fence_get(syncobj->fence),
>     > you may end up with the entire replace_fence happening between the point
>     > in time where the one thread gets the syncobj->fence pointer and when it
>     > calls dma_fence_get() on it.  If this happens, then the reference may be
>     > dropped before we get a chance to get a new one.
> 
>     This doesn't require a spinlock, just dma_fence_get_rcu_safe(). The
>     argument for keeping this patch lies in the merit of later patches..
> 
> 
> Doesn't that also require that we start using an RCU for syncobj?  That was my
> interpretation of the hieroglyphics above the definition of get_rcu_safe()

Interesting you mention RCUing syncobj... The spinlock in
drm_syncobj_find() is the first contention point we hit. If we do make
syncobj RCU'ed we can avoid that lock.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
                   ` (8 preceding siblings ...)
  2017-08-08 22:46 ` [PATCH 9/9] drm/syncobj: Allow wait for submit and signal behavior Jason Ekstrand
@ 2017-08-11 22:39 ` Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence Jason Ekstrand
                     ` (9 more replies)
  9 siblings, 10 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Dave Airlie, Christian König

This series does the same thing as my earlier series in that it adds a sync
object wait interface complete with WAIT_FOR_SUBMIT flag.  While the uapi
remains unchanged, the guts look a bit different.  Instead of adding a
callback mechanism to drm_syncobj that fired whenever replace_fence was
called, it's now using proxy fences.  The drm_syncobj_fence_get still
returns NULL whenever the sync object is in an unsubmitted state but there
is a new drm_syncobj_fence_proxy_get which returns either the real fence or
a proxy fence that will be triggered the next time replace_fence is called
with a non-NULL replacement.  This does make both drm_syncobj_fence_get and
drm_syncobj_replace_fence a tiny bit more expensive, but it lets us do it
all without locking.

This series can be found as a branch here:

https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syncobj-wait-submit-v4

IGT tests for DRM_IOCTL_SYNCOBJ_WAIT and DRM_IOCTL_SYNCOBJ_RESET can be
found on patchwork here:

https://patchwork.freedesktop.org/series/28666/

Patches to the Intel Vulkan driver to implement VK_KHR_external_fence on
top of this kernel interface can be found here:

https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-external-fence

Cc: Dave Airlie <airlied@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christian König <christian.koenig@amd.com>

Chris Wilson (2):
  dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
  dma-buf/dma-fence: Add a mechanism for proxy fences

Dave Airlie (1):
  drm/syncobj: add sync obj wait interface. (v8)

Jason Ekstrand (6):
  drm/syncobj: Rename fence_get to find_fence
  drm/syncobj: Add a race-free drm_syncobj_fence_get helper
  i915: Add support for drm syncobjs
  dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2)
  drm/syncobj: Add a reset ioctl
  drm/syncobj: Allow wait for submit and signal behavior (v4)

 drivers/dma-buf/Makefile                   |   4 +-
 drivers/dma-buf/dma-fence-proxy.c          | 186 +++++++++++++++++++
 drivers/dma-buf/dma-fence.c                |  34 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   2 +-
 drivers/gpu/drm/drm_internal.h             |   4 +
 drivers/gpu/drm/drm_ioctl.c                |   4 +
 drivers/gpu/drm/drm_syncobj.c              | 275 +++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_drv.c            |   3 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++-
 include/drm/drm_syncobj.h                  |  15 +-
 include/linux/dma-fence-proxy.h            |  25 +++
 include/uapi/drm/drm.h                     |  19 ++
 include/uapi/drm/i915_drm.h                |  30 +++-
 13 files changed, 710 insertions(+), 37 deletions(-)
 create mode 100644 drivers/dma-buf/dma-fence-proxy.c
 create mode 100644 include/linux/dma-fence-proxy.h

-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
@ 2017-08-11 22:39   ` Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 2/9] drm/syncobj: Add a race-free drm_syncobj_fence_get helper Jason Ekstrand
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

The function has far more in common with drm_syncobj_find than with
any in the get/put functions.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 +-
 drivers/gpu/drm/drm_syncobj.c          | 10 +++++-----
 include/drm/drm_syncobj.h              |  6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 3378951..dcfb056 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1038,7 +1038,7 @@ static int amdgpu_syncobj_lookup_and_add_to_sync(struct amdgpu_cs_parser *p,
 {
 	int r;
 	struct dma_fence *fence;
-	r = drm_syncobj_fence_get(p->filp, handle, &fence);
+	r = drm_syncobj_find_fence(p->filp, handle, &fence);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index a5b38a8..0412b0b 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -95,9 +95,9 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 }
 EXPORT_SYMBOL(drm_syncobj_replace_fence);
 
-int drm_syncobj_fence_get(struct drm_file *file_private,
-			  u32 handle,
-			  struct dma_fence **fence)
+int drm_syncobj_find_fence(struct drm_file *file_private,
+			   u32 handle,
+			   struct dma_fence **fence)
 {
 	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
 	int ret = 0;
@@ -112,7 +112,7 @@ int drm_syncobj_fence_get(struct drm_file *file_private,
 	drm_syncobj_put(syncobj);
 	return ret;
 }
-EXPORT_SYMBOL(drm_syncobj_fence_get);
+EXPORT_SYMBOL(drm_syncobj_find_fence);
 
 /**
  * drm_syncobj_free - free a sync object.
@@ -307,7 +307,7 @@ int drm_syncobj_export_sync_file(struct drm_file *file_private,
 	if (fd < 0)
 		return fd;
 
-	ret = drm_syncobj_fence_get(file_private, handle, &fence);
+	ret = drm_syncobj_find_fence(file_private, handle, &fence);
 	if (ret)
 		goto err_put_fd;
 
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 89976da..7d4ad77 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -81,9 +81,9 @@ struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 				     u32 handle);
 void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence);
-int drm_syncobj_fence_get(struct drm_file *file_private,
-			  u32 handle,
-			  struct dma_fence **fence);
+int drm_syncobj_find_fence(struct drm_file *file_private,
+			   u32 handle,
+			   struct dma_fence **fence);
 void drm_syncobj_free(struct kref *kref);
 
 #endif
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 2/9] drm/syncobj: Add a race-free drm_syncobj_fence_get helper
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence Jason Ekstrand
@ 2017-08-11 22:39   ` Jason Ekstrand
  2017-08-14  2:03     ` kbuild test robot
  2017-08-11 22:39   ` [PATCH 3/9] i915: Add support for drm syncobjs Jason Ekstrand
                     ` (7 subsequent siblings)
  9 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

The atomic exchange operation in drm_syncobj_replace_fence is sufficient
for the case where it races with itself.  However, if you have a race
between a replace_fence and dma_fence_get(syncobj->fence), you may end
up with the entire replace_fence happening between the point in time
where the one thread gets the syncobj->fence pointer and when it calls
dma_fence_get() on it.  If this happens, then the reference may be
dropped before we get a chance to get a new one.  The new helper uses
dma_fence_get_rcu_safe to get rid of the race.

This is also needed because it allows us to do a bit more than just get
a reference in drm_syncobj_fence_get should we wish to do so.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/drm_syncobj.c | 12 +++++++++---
 include/drm/drm_syncobj.h     |  6 +++++-
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 0412b0b..0190ec2 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -75,6 +75,12 @@ struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 }
 EXPORT_SYMBOL(drm_syncobj_find);
 
+struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj)
+{
+	return dma_fence_get_rcu_safe(&syncobj->_fence);
+}
+EXPORT_SYMBOL(drm_syncobj_fence_get);
+
 /**
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @syncobj: Sync object to replace fence in
@@ -89,7 +95,7 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 
 	if (fence)
 		dma_fence_get(fence);
-	old_fence = xchg(&syncobj->fence, fence);
+	old_fence = xchg(&syncobj->_fence, fence);
 
 	dma_fence_put(old_fence);
 }
@@ -105,7 +111,7 @@ int drm_syncobj_find_fence(struct drm_file *file_private,
 	if (!syncobj)
 		return -ENOENT;
 
-	*fence = dma_fence_get(syncobj->fence);
+	*fence = drm_syncobj_fence_get(syncobj);
 	if (!*fence) {
 		ret = -EINVAL;
 	}
@@ -125,7 +131,7 @@ void drm_syncobj_free(struct kref *kref)
 	struct drm_syncobj *syncobj = container_of(kref,
 						   struct drm_syncobj,
 						   refcount);
-	dma_fence_put(syncobj->fence);
+	dma_fence_put(syncobj->_fence);
 	kfree(syncobj);
 }
 EXPORT_SYMBOL(drm_syncobj_free);
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 7d4ad77..c06a441 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -43,8 +43,11 @@ struct drm_syncobj {
 	/**
 	 * @fence:
 	 * NULL or a pointer to the fence bound to this object.
+	 *
+	 * This pointer should not be accessed directly.  Instead, use
+	 * drm_syncobj_fence_get or drm_syncobj_replace_fence.
 	 */
-	struct dma_fence *fence;
+	struct dma_fence *_fence;
 	/**
 	 * @file:
 	 * a file backing for this syncobj.
@@ -79,6 +82,7 @@ drm_syncobj_put(struct drm_syncobj *obj)
 
 struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 				     u32 handle);
+struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj);
 void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence);
 int drm_syncobj_find_fence(struct drm_file *file_private,
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 3/9] i915: Add support for drm syncobjs
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 2/9] drm/syncobj: Add a race-free drm_syncobj_fence_get helper Jason Ekstrand
@ 2017-08-11 22:39   ` Jason Ekstrand
  2017-08-14  2:58     ` Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 4/9] drm/syncobj: add sync obj wait interface. (v8) Jason Ekstrand
                     ` (6 subsequent siblings)
  9 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

This commit adds support for waiting on or signaling DRM syncobjs as
part of execbuf.  It does so by hijacking the currently unused cliprects
pointer to instead point to an array of i915_gem_exec_fence structs
which containe a DRM syncobj and a flags parameter which specifies
whether to wait on it or to signal it.  This implementation
theoretically allows for both flags to be set in which case it waits on
the dma_fence that was in the syncobj and then immediately replaces it
with the dma_fence from the current execbuf.

v2:
 - Rebase on new syncobj API
v3:
 - Pull everything out into helpers
 - Do all allocation in gem_execbuffer2
 - Pack the flags in the bottom 2 bits of the drm_syncobj*
v4:
 - Prevent a potential race on syncobj->fence

Testcase: igt/gem_exec_fence/syncobj*
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c            |   3 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++++++++++++++++-
 include/uapi/drm/i915_drm.h                |  30 +++++-
 3 files changed, 171 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 4c96a72..50db490 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -388,6 +388,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_FENCE:
 	case I915_PARAM_HAS_EXEC_CAPTURE:
 	case I915_PARAM_HAS_EXEC_BATCH_FIRST:
+	case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
 		/* For the time being all of these are always true;
 		 * if some supported hardware does not have one of these
 		 * features this value needs to be provided from
@@ -2738,7 +2739,7 @@ static struct drm_driver driver = {
 	 */
 	.driver_features =
 	    DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM | DRIVER_PRIME |
-	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC,
+	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_SYNCOBJ,
 	.release = i915_driver_release,
 	.open = i915_driver_open,
 	.lastclose = i915_driver_lastclose,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 929f275..8df845b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -33,6 +33,7 @@
 
 #include <drm/drmP.h>
 #include <drm/i915_drm.h>
+#include <drm/drm_syncobj.h>
 
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
@@ -1882,8 +1883,10 @@ static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
 		return false;
 
 	/* Kernel clipping was a DRI1 misfeature */
-	if (exec->num_cliprects || exec->cliprects_ptr)
-		return false;
+	if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
+		if (exec->num_cliprects || exec->cliprects_ptr)
+			return false;
+	}
 
 	if (exec->DR4 == 0xffffffff) {
 		DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
@@ -2114,11 +2117,125 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 	return engine;
 }
 
+static void __free_fence_array(struct drm_syncobj **fences, unsigned int n)
+{
+	while (n--)
+		drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+	kvfree(fences);
+}
+
+static struct drm_syncobj **get_fence_array(struct drm_i915_gem_execbuffer2 *args,
+					    struct drm_file *file)
+{
+	const unsigned int nfences = args->num_cliprects;
+	struct drm_i915_gem_exec_fence __user *user;
+	struct drm_syncobj **fences;
+	unsigned int n;
+	int err;
+
+	if (!(args->flags & I915_EXEC_FENCE_ARRAY))
+		return NULL;
+
+	if (nfences > SIZE_MAX / sizeof(*fences))
+		return ERR_PTR(-EINVAL);
+
+	user = u64_to_user_ptr(args->cliprects_ptr);
+	if (!access_ok(VERIFY_READ, user, nfences * 2 * sizeof(u32)))
+		return ERR_PTR(-EFAULT);
+
+	fences = kvmalloc_array(args->num_cliprects, sizeof(*fences),
+				__GFP_NOWARN | GFP_TEMPORARY);
+	if (!fences)
+		return ERR_PTR(-ENOMEM);
+
+	for (n = 0; n < nfences; n++) {
+		struct drm_i915_gem_exec_fence fence;
+		struct drm_syncobj *syncobj;
+
+		if (__copy_from_user(&fence, user++, sizeof(fence))) {
+			err = -EFAULT;
+			goto err;
+		}
+
+		syncobj = drm_syncobj_find(file, fence.handle);
+		if (!syncobj) {
+			DRM_DEBUG("Invalid syncobj handle provided\n");
+			err = -EINVAL;
+			goto err;
+		}
+
+		fences[n] = ptr_pack_bits(syncobj, fence.flags, 2);
+	}
+
+	return fences;
+
+err:
+	__free_fence_array(fences, n);
+	return ERR_PTR(err);
+}
+
+static void put_fence_array(struct drm_i915_gem_execbuffer2 *args,
+			    struct drm_syncobj **fences)
+{
+	if (!fences)
+		return;
+	__free_fence_array(fences, args->num_cliprects);
+}
+
+static int await_fence_array(struct i915_execbuffer *eb,
+			     struct drm_syncobj **fences)
+{
+	const unsigned int nfences = eb->args->num_cliprects;
+	unsigned int n;
+	int err;
+
+	for (n = 0; n < nfences; n++) {
+		struct drm_syncobj *syncobj;
+		struct dma_fence *fence;
+		unsigned int flags;
+
+		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		if (!(flags & I915_EXEC_FENCE_WAIT))
+			continue;
+
+		fence = drm_syncobj_fence_get(syncobj);
+		if (!fence)
+		    return -EINVAL;
+
+		err = i915_gem_request_await_dma_fence(eb->request, fence);
+		dma_fence_put(fence);
+		if (err < 0)
+			return err;
+	}
+
+	return 0;
+}
+
+static void signal_fence_array(struct i915_execbuffer *eb,
+			       struct drm_syncobj **fences)
+{
+	const unsigned int nfences = eb->args->num_cliprects;
+	struct dma_fence * const fence = &eb->request->fence;
+	unsigned int n;
+
+	for (n = 0; n < nfences; n++) {
+		struct drm_syncobj *syncobj;
+		unsigned int flags;
+
+		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		if (!(flags & I915_EXEC_FENCE_SIGNAL))
+			continue;
+
+		drm_syncobj_replace_fence(syncobj, fence);
+	}
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
-		       struct drm_i915_gem_exec_object2 *exec)
+		       struct drm_i915_gem_exec_object2 *exec,
+		       struct drm_syncobj **fences)
 {
 	struct i915_execbuffer eb;
 	struct dma_fence *in_fence = NULL;
@@ -2304,6 +2421,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			goto err_request;
 	}
 
+	if (fences) {
+		err = await_fence_array(&eb, fences);
+		if (err < 0)
+			goto err_request;
+	}
+
 	if (out_fence_fd != -1) {
 		out_fence = sync_file_create(&eb.request->fence);
 		if (!out_fence) {
@@ -2327,6 +2450,9 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	__i915_add_request(eb.request, err == 0);
 	add_to_client(eb.request, file);
 
+	if (fences)
+		signal_fence_array(&eb, fences);
+
 	if (out_fence) {
 		if (err == 0) {
 			fd_install(out_fence_fd, out_fence->file);
@@ -2428,7 +2554,7 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 			exec2_list[i].flags = 0;
 	}
 
-	err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list);
+	err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list, NULL);
 	if (exec2.flags & __EXEC_HAS_RELOC) {
 		struct drm_i915_gem_exec_object __user *user_exec_list =
 			u64_to_user_ptr(args->buffers_ptr);
@@ -2460,6 +2586,7 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 	const size_t sz = sizeof(struct drm_i915_gem_exec_object2);
 	struct drm_i915_gem_execbuffer2 *args = data;
 	struct drm_i915_gem_exec_object2 *exec2_list;
+	struct drm_syncobj **fences = NULL;
 	int err;
 
 	if (args->buffer_count < 1 || args->buffer_count > SIZE_MAX / sz - 1) {
@@ -2486,7 +2613,15 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EFAULT;
 	}
 
-	err = i915_gem_do_execbuffer(dev, file, args, exec2_list);
+	if (args->flags & I915_EXEC_FENCE_ARRAY) {
+		fences = get_fence_array(args, file);
+		if (IS_ERR(fences)) {
+			kvfree(exec2_list);
+			return PTR_ERR(fences);
+		}
+	}
+
+	err = i915_gem_do_execbuffer(dev, file, args, exec2_list, fences);
 
 	/*
 	 * Now that we have begun execution of the batchbuffer, we ignore
@@ -2517,5 +2652,6 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 
 	args->flags &= ~__I915_EXEC_UNKNOWN_FLAGS;
 	kvfree(exec2_list);
+	put_fence_array(args, fences);
 	return err;
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7ccbd6a..bb3fb49 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -431,6 +431,11 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_BATCH_FIRST	 48
 
+/* Query whether DRM_I915_GEM_EXECBUFFER2 supports supplying an array of
+ * drm_i915_gem_exec_fence structures.  See I915_EXEC_FENCE_ARRAY.
+ */
+#define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
+
 typedef struct drm_i915_getparam {
 	__s32 param;
 	/*
@@ -812,6 +817,17 @@ struct drm_i915_gem_exec_object2 {
 	__u64 rsvd2;
 };
 
+struct drm_i915_gem_exec_fence {
+	/**
+	 * User's handle for a dma-fence to wait on or signal.
+	 */
+	__u32 handle;
+
+#define I915_EXEC_FENCE_WAIT            (1<<0)
+#define I915_EXEC_FENCE_SIGNAL          (1<<1)
+	__u32 flags;
+};
+
 struct drm_i915_gem_execbuffer2 {
 	/**
 	 * List of gem_exec_object2 structs
@@ -826,7 +842,10 @@ struct drm_i915_gem_execbuffer2 {
 	__u32 DR1;
 	__u32 DR4;
 	__u32 num_cliprects;
-	/** This is a struct drm_clip_rect *cliprects */
+	/** This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
+	 * is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
+	 * struct drm_i915_gem_exec_fence *fences.
+	 */
 	__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK              (7<<0)
 #define I915_EXEC_DEFAULT                (0<<0)
@@ -927,7 +946,14 @@ struct drm_i915_gem_execbuffer2 {
  * element).
  */
 #define I915_EXEC_BATCH_FIRST		(1<<18)
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_BATCH_FIRST<<1))
+
+/* Setting I915_FENCE_ARRAY implies that num_cliprects and cliprects_ptr
+ * define an array of i915_gem_exec_fence structures which specify a set of
+ * dma fences to wait upon or signal.
+ */
+#define I915_EXEC_FENCE_ARRAY   (1<<19)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 4/9] drm/syncobj: add sync obj wait interface. (v8)
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
                     ` (2 preceding siblings ...)
  2017-08-11 22:39   ` [PATCH 3/9] i915: Add support for drm syncobjs Jason Ekstrand
@ 2017-08-11 22:39   ` Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 5/9] dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2) Jason Ekstrand
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel; +Cc: Dave Airlie

From: Dave Airlie <airlied@redhat.com>

This interface will allow sync object to be used to back
Vulkan fences. This API is pretty much the vulkan fence waiting
API, and I've ported the code from amdgpu.

v2: accept relative timeout, pass remaining time back
to userspace.
v3: return to absolute timeouts.
v4: absolute zero = poll,
    rewrite any/all code to have same operation for arrays
    return -EINVAL for 0 fences.
v4.1: fixup fences allocation check, use u64_to_user_ptr
v5: move to sec/nsec, and use timespec64 for calcs.
v6: use -ETIME and drop the out status flag. (-ETIME
is suggested by ickle, I can feel a shed painting)
v7: talked to Daniel/Arnd, use ktime and ns everywhere.
v8: be more careful in the timeout calculations
    use uint32_t for counter variables so we don't overflow
    graciously handle -ENOINT being returned from dma_fence_wait_timeout

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/drm_internal.h |   2 +
 drivers/gpu/drm/drm_ioctl.c    |   2 +
 drivers/gpu/drm/drm_syncobj.c  | 142 +++++++++++++++++++++++++++++++++++++++++
 include/uapi/drm/drm.h         |  12 ++++
 4 files changed, 158 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 4e906b8..534e5ac 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -167,3 +167,5 @@ int drm_syncobj_handle_to_fd_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file_private);
 int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file_private);
+int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file_private);
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 8bfeb32..2ab0ff90 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -657,6 +657,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, drm_syncobj_fd_to_handle_ioctl,
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
+		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 };
 
 #define DRM_CORE_IOCTL_COUNT	ARRAY_SIZE( drm_ioctls )
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 0190ec2..e1f96b9 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1,5 +1,7 @@
 /*
  * Copyright 2017 Red Hat
+ * Parts ported from amdgpu (fence wait code).
+ * Copyright 2016 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -31,6 +33,9 @@
  * that contain an optional fence. The fence can be updated with a new
  * fence, or be NULL.
  *
+ * syncobj's can be waited upon, where it will wait for the underlying
+ * fence.
+ *
  * syncobj's can be export to fd's and back, these fd's are opaque and
  * have no other use case, except passing the syncobj between processes.
  *
@@ -453,3 +458,140 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 	return drm_syncobj_fd_to_handle(file_private, args->fd,
 					&args->handle);
 }
+
+/**
+ * drm_timeout_abs_to_jiffies - calculate jiffies timeout from absolute value
+ *
+ * @timeout_nsec: timeout nsec component in ns, 0 for poll
+ *
+ * Calculate the timeout in jiffies from an absolute time in sec/nsec.
+ */
+static signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec)
+{
+	ktime_t abs_timeout, now;
+	u64 timeout_ns, timeout_jiffies64;
+
+	/* make 0 timeout means poll - absolute 0 doesn't seem valid */
+	if (timeout_nsec == 0)
+		return 0;
+
+	abs_timeout = ns_to_ktime(timeout_nsec);
+	now = ktime_get();
+
+	if (!ktime_after(abs_timeout, now))
+		return 0;
+
+	timeout_ns = ktime_to_ns(ktime_sub(abs_timeout, now));
+
+	timeout_jiffies64 = nsecs_to_jiffies64(timeout_ns);
+	/*  clamp timeout to avoid infinite timeout */
+	if (timeout_jiffies64 >= MAX_SCHEDULE_TIMEOUT - 1)
+		return MAX_SCHEDULE_TIMEOUT - 1;
+
+	return timeout_jiffies64 + 1;
+}
+
+static int drm_syncobj_wait_fences(struct drm_device *dev,
+				   struct drm_file *file_private,
+				   struct drm_syncobj_wait *wait,
+				   struct dma_fence **fences)
+{
+	signed long timeout = drm_timeout_abs_to_jiffies(wait->timeout_nsec);
+	signed long ret = 0;
+	uint32_t first = ~0;
+
+	if (wait->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
+		uint32_t i;
+		for (i = 0; i < wait->count_handles; i++) {
+			ret = dma_fence_wait_timeout(fences[i], true, timeout);
+
+			/* Various dma_fence wait callbacks will return
+			 * ENOENT to indicate that the fence has already
+			 * been signaled.  We need to sanitize this to 0 so
+			 * we don't return early and the client doesn't see
+			 * an unexpected error.
+			 */
+			if (ret == -ENOENT)
+				ret = 0;
+
+			if (ret < 0)
+				return ret;
+			if (ret == 0)
+				break;
+			timeout = ret;
+		}
+		first = 0;
+	} else {
+		ret = dma_fence_wait_any_timeout(fences,
+						 wait->count_handles,
+						 true, timeout,
+						 &first);
+	}
+
+	if (ret < 0)
+		return ret;
+
+	wait->first_signaled = first;
+	if (ret == 0)
+		return -ETIME;
+	return 0;
+}
+
+int
+drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
+		       struct drm_file *file_private)
+{
+	struct drm_syncobj_wait *args = data;
+	uint32_t *handles;
+	struct dma_fence **fences;
+	int ret = 0;
+	uint32_t i;
+
+	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+		return -ENODEV;
+
+	if (args->flags != 0 && args->flags != DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
+		return -EINVAL;
+
+	if (args->count_handles == 0)
+		return -EINVAL;
+
+	/* Get the handles from userspace */
+	handles = kmalloc_array(args->count_handles, sizeof(uint32_t),
+				GFP_KERNEL);
+	if (handles == NULL)
+		return -ENOMEM;
+
+	if (copy_from_user(handles,
+			   u64_to_user_ptr(args->handles),
+			   sizeof(uint32_t) * args->count_handles)) {
+		ret = -EFAULT;
+		goto err_free_handles;
+	}
+
+	fences = kcalloc(args->count_handles,
+			 sizeof(struct dma_fence *), GFP_KERNEL);
+	if (!fences) {
+		ret = -ENOMEM;
+		goto err_free_handles;
+	}
+
+	for (i = 0; i < args->count_handles; i++) {
+		ret = drm_syncobj_find_fence(file_private, handles[i],
+					     &fences[i]);
+		if (ret)
+			goto err_free_fence_array;
+	}
+
+	ret = drm_syncobj_wait_fences(dev, file_private,
+				      args, fences);
+
+err_free_fence_array:
+	for (i = 0; i < args->count_handles; i++)
+		dma_fence_put(fences[i]);
+	kfree(fences);
+err_free_handles:
+	kfree(handles);
+
+	return ret;
+}
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 101593a..0757c1a 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -718,6 +718,17 @@ struct drm_syncobj_handle {
 	__u32 pad;
 };
 
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+struct drm_syncobj_wait {
+	__u64 handles;
+	/* absolute timeout */
+	__s64 timeout_nsec;
+	__u32 count_handles;
+	__u32 flags;
+	__u32 first_signaled; /* only valid when not waiting all */
+	__u32 pad;
+};
+
 #if defined(__cplusplus)
 }
 #endif
@@ -840,6 +851,7 @@ extern "C" {
 #define DRM_IOCTL_SYNCOBJ_DESTROY	DRM_IOWR(0xC0, struct drm_syncobj_destroy)
 #define DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD	DRM_IOWR(0xC1, struct drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE	DRM_IOWR(0xC2, struct drm_syncobj_handle)
+#define DRM_IOCTL_SYNCOBJ_WAIT		DRM_IOWR(0xC3, struct drm_syncobj_wait)
 
 /**
  * Device specific ioctls should only be in their respective headers
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 5/9] dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2)
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
                     ` (3 preceding siblings ...)
  2017-08-11 22:39   ` [PATCH 4/9] drm/syncobj: add sync obj wait interface. (v8) Jason Ekstrand
@ 2017-08-11 22:39   ` Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 6/9] drm/syncobj: Add a reset ioctl Jason Ekstrand
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Christian König, Jason Ekstrand

dma_fence_wait_any_timeout only relies on two things to work correctly:

 1) The callback list to get called upon fence signal
 2) The SIGNALED flag to be set upon fence signal

The first if these is part of the core dma-fence API.  The second is
only mostly part of the core dma-fence API with the one issue that the
flag may not be set and dma_fence_is_signaled may never return true if
fence->ops->enable_signaling() has not been called.  It's easy enough to
work around that by always using dma_fence_is_signaled instead of
manually checking the bit and falling through to a zero-timeout wait if
none of the fences report that they are signaled.

v2:
 - Use dma_fence_is_signaled in test_signaled_any
 - Fall through to a zero-timeout wait

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/dma-buf/dma-fence.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 9a30279..0cac367 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -437,8 +437,7 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count,
 	int i;
 
 	for (i = 0; i < count; ++i) {
-		struct dma_fence *fence = fences[i];
-		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
+		if (dma_fence_is_signaled(fences[i])) {
 			if (idx)
 				*idx = i;
 			return true;
@@ -484,7 +483,13 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 				return 1;
 			}
 
-		return 0;
+		/* There's a very annoying laxness in the dma_fence API
+		 * here, in that backends are not required to automatically
+		 * report when a fence is signaled prior to
+		 * fence->ops->enable_signaling() being called.  So here if
+		 * we fail to match signaled_count, we need to fallthough
+		 * and try a 0 timeout wait!
+		 */
 	}
 
 	cb = kcalloc(count, sizeof(struct default_wait_cb), GFP_KERNEL);
@@ -496,11 +501,6 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 	for (i = 0; i < count; ++i) {
 		struct dma_fence *fence = fences[i];
 
-		if (fence->ops->wait != dma_fence_default_wait) {
-			ret = -EINVAL;
-			goto fence_rm_cb;
-		}
-
 		cb[i].task = current;
 		if (dma_fence_add_callback(fence, &cb[i].base,
 					   dma_fence_default_wait_cb)) {
@@ -511,7 +511,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 		}
 	}
 
-	while (ret > 0) {
+	do {
 		if (intr)
 			set_current_state(TASK_INTERRUPTIBLE);
 		else
@@ -524,7 +524,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 
 		if (ret > 0 && intr && signal_pending(current))
 			ret = -ERESTARTSYS;
-	}
+	} while (ret > 0);
 
 	__set_current_state(TASK_RUNNING);
 
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 6/9] drm/syncobj: Add a reset ioctl
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
                     ` (4 preceding siblings ...)
  2017-08-11 22:39   ` [PATCH 5/9] dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2) Jason Ekstrand
@ 2017-08-11 22:39   ` Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 7/9] dma-buf/dma-fence: Signal all callbacks from dma_fence_release() Jason Ekstrand
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Jason Ekstrand

This just resets the dma_fence to NULL so it looks like it's never been
signaled.  This will be useful once we add the new wait API for allowing
wait on "submit and signal" behavior.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/gpu/drm/drm_internal.h |  2 ++
 drivers/gpu/drm/drm_ioctl.c    |  2 ++
 drivers/gpu/drm/drm_syncobj.c  | 24 ++++++++++++++++++++++++
 include/uapi/drm/drm.h         |  6 ++++++
 4 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 534e5ac..83f1615 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -169,3 +169,5 @@ int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file_private);
 int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file_private);
+int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file_private);
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 2ab0ff90..4185483 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -659,6 +659,8 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl,
+		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 };
 
 #define DRM_CORE_IOCTL_COUNT	ARRAY_SIZE( drm_ioctls )
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e1f96b9..226f1e7 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -595,3 +595,27 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 
 	return ret;
 }
+
+int
+drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
+			struct drm_file *file_private)
+{
+	struct drm_syncobj_reset *args = data;
+	struct drm_syncobj *syncobj;
+
+	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+		return -ENODEV;
+
+	if (args->flags != 0)
+		return -EINVAL;
+
+	syncobj = drm_syncobj_find(file_private, args->handle);
+	if (!syncobj)
+		return -ENOENT;
+
+	drm_syncobj_replace_fence(syncobj, NULL);
+
+	drm_syncobj_put(syncobj);
+
+	return 0;
+}
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 0757c1a..4b301b4 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -729,6 +729,11 @@ struct drm_syncobj_wait {
 	__u32 pad;
 };
 
+struct drm_syncobj_reset {
+	__u32 handle;
+	__u32 flags;
+};
+
 #if defined(__cplusplus)
 }
 #endif
@@ -852,6 +857,7 @@ extern "C" {
 #define DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD	DRM_IOWR(0xC1, struct drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE	DRM_IOWR(0xC2, struct drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_WAIT		DRM_IOWR(0xC3, struct drm_syncobj_wait)
+#define DRM_IOCTL_SYNCOBJ_RESET		DRM_IOWR(0xC4, struct drm_syncobj_reset)
 
 /**
  * Device specific ioctls should only be in their respective headers
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 7/9] dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
                     ` (5 preceding siblings ...)
  2017-08-11 22:39   ` [PATCH 6/9] drm/syncobj: Add a reset ioctl Jason Ekstrand
@ 2017-08-11 22:39   ` Jason Ekstrand
  2018-01-31 12:32     ` Gustavo Padovan
  2017-08-11 22:39   ` [PATCH 8/9] dma-buf/dma-fence: Add a mechanism for proxy fences Jason Ekstrand
                     ` (2 subsequent siblings)
  9 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel

From: Chris Wilson <chris@chris-wilson.co.uk>

This is an illegal scenario, to free the fence whilst there are pending
callbacks. Currently, we emit a WARN and then cast aside the callbacks
leaving them dangling. Alternatively, we could set an error on the fence
and then signal fence so that any dependency chains from the fence can
be tidied up, and if they care they can check for the error.

The question is whether or not the cure is worse than the disease
(premature fence signaling is never pretty).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/dma-buf/dma-fence.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0cac367..4062708 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -172,7 +172,19 @@ void dma_fence_release(struct kref *kref)
 
 	trace_dma_fence_destroy(fence);
 
-	WARN_ON(!list_empty(&fence->cb_list));
+	if (WARN_ON(!list_empty(&fence->cb_list))) {
+		unsigned long flags;
+
+		/*
+		 * This should never happen, but if it does make sure that we
+		 * don't leave chains dangling. We set the error flag first
+		 * so that the callbacks know this signal is due to an error.
+		 */
+		spin_lock_irqsave(fence->lock, flags);
+		fence->error = -EDEADLK;
+		dma_fence_signal_locked(fence);
+		spin_unlock_irqrestore(fence->lock, flags);
+	}
 
 	if (fence->ops->release)
 		fence->ops->release(fence);
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 8/9] dma-buf/dma-fence: Add a mechanism for proxy fences
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
                     ` (6 preceding siblings ...)
  2017-08-11 22:39   ` [PATCH 7/9] dma-buf/dma-fence: Signal all callbacks from dma_fence_release() Jason Ekstrand
@ 2017-08-11 22:39   ` Jason Ekstrand
  2017-08-11 22:39   ` [PATCH 9/9] drm/syncobj: Allow wait for submit and signal behavior (v4) Jason Ekstrand
  2017-08-13 13:19   ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Christian König
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand

From: Chris Wilson <chris@chris-wilson.co.uk>

Proxy fences allow you to create a place-holder fence which you can
later assign to the "real" fence at which point the two look
indistinguishable to anyone other than the driver which created the
real fence.  These should be used with care as it is the responsibility
of proxy fences creator to ensure that it eventually gets assigned a
real fence so it gets signaled.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
---
 drivers/dma-buf/Makefile          |   4 +-
 drivers/dma-buf/dma-fence-proxy.c | 186 ++++++++++++++++++++++++++++++++++++++
 include/linux/dma-fence-proxy.h   |  25 +++++
 3 files changed, 214 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma-buf/dma-fence-proxy.c
 create mode 100644 include/linux/dma-fence-proxy.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index c33bf88..e468215 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,3 +1,5 @@
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o
+obj-y := dma-buf.o \
+	dma-fence.o dma-fence-array.o dma-fence-proxy.o \
+	reservation.o seqno-fence.o
 obj-$(CONFIG_SYNC_FILE)		+= sync_file.o
 obj-$(CONFIG_SW_SYNC)		+= sw_sync.o sync_debug.o
diff --git a/drivers/dma-buf/dma-fence-proxy.c b/drivers/dma-buf/dma-fence-proxy.c
new file mode 100644
index 0000000..61bf8e5
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-proxy.c
@@ -0,0 +1,186 @@
+/*
+ * dma-fence-proxy: placeholder unsignaled fence
+ *
+ * Copyright (C) 2017 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/dma-fence.h>
+#include <linux/export.h>
+#include <linux/irq_work.h>
+#include <linux/slab.h>
+
+struct dma_fence_proxy {
+	struct dma_fence base;
+	spinlock_t lock;
+
+	const char *driver_name;
+	void *tag;
+
+	struct dma_fence *real;
+	struct dma_fence_cb cb;
+	struct irq_work work;
+};
+
+static const char *proxy_get_driver_name(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+
+	return p->real ? p->real->ops->get_driver_name(p->real) : p->driver_name;
+}
+
+static const char *proxy_get_timeline_name(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+
+	return p->real ? p->real->ops->get_timeline_name(p->real) : "unset";
+}
+
+static void proxy_irq_work(struct irq_work *work)
+{
+	struct dma_fence_proxy *p = container_of(work, typeof(*p), work);
+
+	dma_fence_signal(&p->base);
+	dma_fence_put(&p->base);
+}
+
+static void proxy_callback(struct dma_fence *fence, struct dma_fence_cb *cb)
+{
+	struct dma_fence_proxy *p = container_of(cb, typeof(*p), cb);
+
+	/* beware the alleged spinlock inversion */
+	irq_work_queue(&p->work);
+}
+
+static bool proxy_enable_signaling(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+
+	if (!p->real)
+		return true;
+
+	if (dma_fence_add_callback(p->real, &p->cb, proxy_callback))
+		return false;
+
+	dma_fence_get(fence);
+	return true;
+}
+
+static bool proxy_signaled(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+
+	return p->real ? dma_fence_is_signaled(p->real) : false;
+}
+
+static void proxy_release(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+
+	if (!p->real)
+		dma_fence_signal(&p->base);
+
+	dma_fence_put(p->real);
+	dma_fence_free(&p->base);
+}
+
+static const struct dma_fence_ops dma_fence_proxy_ops = {
+	.get_driver_name = proxy_get_driver_name,
+	.get_timeline_name = proxy_get_timeline_name,
+	.enable_signaling = proxy_enable_signaling,
+	.signaled = proxy_signaled,
+	.wait = dma_fence_default_wait,
+	.release = proxy_release,
+};
+
+/**
+ * dma_fence_proy_create - Create an unset proxy dma-fence
+ * @driver_name: The driver name to report; must outlive the fence
+ * @tag: A pointer which uniquely identifies the creator
+ */
+struct dma_fence *dma_fence_create_proxy(const char *driver_name, void *tag)
+{
+	struct dma_fence_proxy *p;
+
+	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return NULL;
+
+	p->driver_name = driver_name;
+	p->tag = tag;
+	spin_lock_init(&p->lock);
+	dma_fence_init(&p->base, &dma_fence_proxy_ops, &p->lock, 0, 0);
+	init_irq_work(&p->work, proxy_irq_work);
+
+	return &p->base;
+}
+EXPORT_SYMBOL(dma_fence_create_proxy);
+
+static bool dma_fence_is_proxy(struct dma_fence *fence)
+{
+	return fence->ops == &dma_fence_proxy_ops;
+}
+
+/**
+ * dma_fence_is_proxy_tagged - identify a proxy fence
+ * @fence: The fence to identify
+ * @tag: The tag pointer provided to dma_fence_create_proxy
+ *
+ * This returns true if this is a proxy fence tag is the same pointer as
+ * the tag provided to dma_fence_create_proxy.
+ */
+bool dma_fence_is_proxy_tagged(struct dma_fence *fence, void *tag)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+
+	if (!dma_fence_is_proxy(fence))
+		return false;
+
+	return p->tag == tag;
+}
+EXPORT_SYMBOL(dma_fence_is_proxy_tagged);
+
+/**
+ * dma_fence_proxy_assign - assign a fence to a proxy fence
+ * @proxy: The proxy fence
+ * @real: The real fence to assign to proxy
+ *
+ * This assigns the given real fence to the proxy fence.  From this point
+ * forward, the proxy fence will be almost indistinguishable from the real
+ * fence.  It will report the same driver and timeline names and will
+ * signal when the real fence signals.  If the real fence is already
+ * signaled when this function is called, it will signal as soon as it has
+ * any listeners, possibly immediately.
+ */
+void dma_fence_proxy_assign(struct dma_fence *proxy, struct dma_fence *real)
+{
+	struct dma_fence_proxy *p = container_of(proxy, typeof(*p), base);
+	unsigned long flags;
+
+	BUG_ON(!dma_fence_is_proxy(proxy));
+	BUG_ON(p->real);
+
+	spin_lock_irqsave(p->base.lock, flags);
+
+	p->real = dma_fence_get(real);
+
+	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &p->base.flags)) {
+		if (dma_fence_add_callback(real, &p->cb, proxy_callback))
+			dma_fence_signal_locked(&p->base);
+		else
+			dma_fence_get(&p->base);
+	} else if (dma_fence_is_signaled(real)) {
+		dma_fence_signal_locked(&p->base);
+	}
+
+	spin_unlock_irqrestore(p->base.lock, flags);
+}
+EXPORT_SYMBOL(dma_fence_proxy_assign);
diff --git a/include/linux/dma-fence-proxy.h b/include/linux/dma-fence-proxy.h
new file mode 100644
index 0000000..177a5165
--- /dev/null
+++ b/include/linux/dma-fence-proxy.h
@@ -0,0 +1,25 @@
+/*
+ * dma-fence-proxy: allows waiting upon unset fences
+ *
+ * Copyright (C) 2017 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef __LINUX_DMA_FENCE_PROXY_H
+#define __LINUX_DMA_FENCE_PROXY_H
+
+#include <linux/dma-fence.h>
+
+struct dma_fence *dma_fence_create_proxy(const char *driver_name, void *tag);
+bool dma_fence_is_proxy_tagged(struct dma_fence *fence, void *tag);
+void dma_fence_proxy_assign(struct dma_fence *proxy, struct dma_fence *real);
+
+#endif /* __LINUX_DMA_FENCE_PROXY_H */
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 9/9] drm/syncobj: Allow wait for submit and signal behavior (v4)
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
                     ` (7 preceding siblings ...)
  2017-08-11 22:39   ` [PATCH 8/9] dma-buf/dma-fence: Add a mechanism for proxy fences Jason Ekstrand
@ 2017-08-11 22:39   ` Jason Ekstrand
  2017-08-13 13:19   ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Christian König
  9 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-11 22:39 UTC (permalink / raw)
  To: dri-devel; +Cc: Jason Ekstrand, Dave Airlie, Jason Ekstrand

Vulkan VkFence semantics require that the application be able to perform
a CPU wait on work which may not yet have been submitted.  This is
perfectly safe because the CPU wait has a timeout which will get
triggered eventually if no work is ever submitted.  This behavior is
advantageous for multi-threaded workloads because, so long as all of the
threads agree on what fences to use up-front, you don't have the extra
cross-thread synchronization cost of thread A telling thread B that it
has submitted its dependent work and thread B is now free to wait.

Within a single process, this can be implemented in the userspace driver
by doing exactly the same kind of tracking the app would have to do
using posix condition variables or similar.  However, in order for this
to work cross-process (as is required by VK_KHR_external_fence), we need
to handle this in the kernel.

This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
instructs the IOCTL to wait for the syncobj to have a non-null fence and
then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
easily get the Vulkan behavior.

v2:
 - Fix a bug in the invalid syncobj error path
 - Unify the wait-all and wait-any cases
v3:
 - Unify the timeout == 0 case a bit with the timeout > 0 case
 - use wait_event_interruptible_timeout
v4:
 - Use proxy fence

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@redhat.com>
---
 drivers/gpu/drm/drm_syncobj.c | 95 +++++++++++++++++++++++++++++++++++++++----
 include/drm/drm_syncobj.h     |  3 +-
 include/uapi/drm/drm.h        |  1 +
 3 files changed, 91 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 226f1e7..4f7c5e5 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -51,6 +51,7 @@
 #include <linux/fs.h>
 #include <linux/anon_inodes.h>
 #include <linux/sync_file.h>
+#include <linux/dma-fence-proxy.h>
 
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
@@ -82,10 +83,45 @@ EXPORT_SYMBOL(drm_syncobj_find);
 
 struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj)
 {
-	return dma_fence_get_rcu_safe(&syncobj->_fence);
+	struct dma_fence *fence = dma_fence_get_rcu_safe(&syncobj->_fence);
+
+	/* Don't hand out our internal proxy fence.  Proxy fences are only used
+	 * for implementing WAIT_FLAGS_WAIT_FOR_SUBMIT behavior and should not
+	 * be handed out to random drivers unless they are prepared to deal
+	 * with the possibility that the fence will never get signaled.
+	 */
+	if (fence && dma_fence_is_proxy_tagged(fence, drm_syncobj_free)) {
+		dma_fence_put(fence);
+		return NULL;
+	}
+
+	return fence;
 }
 EXPORT_SYMBOL(drm_syncobj_fence_get);
 
+static struct dma_fence *drm_syncobj_fence_proxy_get(struct drm_syncobj *syncobj)
+{
+	struct dma_fence *fence, *proxy;
+
+	do {
+		fence = dma_fence_get_rcu_safe(&syncobj->_fence);
+		if (fence)
+			return fence;
+
+		proxy = dma_fence_create_proxy("drm_syncobj", drm_syncobj_free);
+		if (!proxy)
+			return NULL;
+
+		fence = cmpxchg(&syncobj->_fence, NULL, proxy);
+		if (!fence)
+			return dma_fence_get(proxy);
+
+		dma_fence_put(proxy);
+	} while(1);
+
+	return proxy;
+}
+
 /**
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @syncobj: Sync object to replace fence in
@@ -97,10 +133,40 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence)
 {
 	struct dma_fence *old_fence;
+	bool old_fence_is_proxy;
 
-	if (fence)
-		dma_fence_get(fence);
-	old_fence = xchg(&syncobj->_fence, fence);
+	dma_fence_get(fence);
+
+	do {
+		old_fence = dma_fence_get_rcu_safe(&syncobj->_fence);
+		old_fence_is_proxy = old_fence &&
+			dma_fence_is_proxy_tagged(old_fence, drm_syncobj_free);
+		dma_fence_put(old_fence);
+
+		if (!fence && old_fence_is_proxy) {
+			/* If we're replacing a proxy with NULL, just leave
+			 * the proxy.
+			 */
+			return;
+		}
+
+		if (cmpxchg(&syncobj->_fence, old_fence, fence) != old_fence)
+			continue;
+	} while(0);
+
+	if (fence && old_fence_is_proxy) {
+		/* If we just replaced a proxy fence with a real fence,
+		 * assign the real fence to the proxy so that it gets
+		 * triggered when the real fence triggers.  If we are
+		 * replacing a proxy with NULL (such as through
+		 * DRM_IOCTL_SYNCOBJ_RESET), we drop the fence on the floor
+		 * and it will never get signaled.  This is ok because the
+		 * only code which waits on our proxy fences is
+		 * SYNCOBJ_WAIT with WAIT_FLAGS_WAIT_FOR_SUBMIT set and the
+		 * wait ioctl has a timeout which will eventually trigger.
+		 */
+		dma_fence_proxy_assign(old_fence, fence);
+	}
 
 	dma_fence_put(old_fence);
 }
@@ -544,13 +610,15 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 	struct drm_syncobj_wait *args = data;
 	uint32_t *handles;
 	struct dma_fence **fences;
+	struct drm_syncobj *syncobj;
 	int ret = 0;
 	uint32_t i;
 
 	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
 		return -ENODEV;
 
-	if (args->flags != 0 && args->flags != DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)
+	if (args->flags & ~(DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL |
+			    DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT))
 		return -EINVAL;
 
 	if (args->count_handles == 0)
@@ -577,8 +645,21 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
 	}
 
 	for (i = 0; i < args->count_handles; i++) {
-		ret = drm_syncobj_find_fence(file_private, handles[i],
-					     &fences[i]);
+		syncobj = drm_syncobj_find(file_private, handles[i]);
+		if (!syncobj) {
+			ret = -ENOENT;
+			goto err_free_fence_array;
+		}
+
+		if (args->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+			fences[i] = drm_syncobj_fence_proxy_get(syncobj);
+			if (!fences[i])
+				ret = -ENOMEM;
+		} else {
+			fences[i] = drm_syncobj_fence_get(syncobj);
+			if (!fences[i])
+				ret = -EINVAL;
+		}
 		if (ret)
 			goto err_free_fence_array;
 	}
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index c06a441..29037d5 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -42,7 +42,8 @@ struct drm_syncobj {
 	struct kref refcount;
 	/**
 	 * @fence:
-	 * NULL or a pointer to the fence bound to this object.
+	 * NULL or a pointer to the fence bound to this object or a pointer
+	 * to a proxy fence which will be assigned to the next bound fence.
 	 *
 	 * This pointer should not be accessed directly.  Instead, use
 	 * drm_syncobj_fence_get or drm_syncobj_replace_fence.
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 4b301b4..f8ec8fe 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -719,6 +719,7 @@ struct drm_syncobj_handle {
 };
 
 #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
 struct drm_syncobj_wait {
 	__u64 handles;
 	/* absolute timeout */
-- 
2.5.0.400.gff86faf

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
                     ` (8 preceding siblings ...)
  2017-08-11 22:39   ` [PATCH 9/9] drm/syncobj: Allow wait for submit and signal behavior (v4) Jason Ekstrand
@ 2017-08-13 13:19   ` Christian König
  2017-08-13 15:26     ` Jason Ekstrand
  9 siblings, 1 reply; 55+ messages in thread
From: Christian König @ 2017-08-13 13:19 UTC (permalink / raw)
  To: Jason Ekstrand, dri-devel; +Cc: Jason Ekstrand, Dave Airlie

Patches #1-#4 are Acked-by: Christian König <christian.koenig@amd.com>.

Patch #5: NAK, that will break radeon.

On radeon we need the non-default wait or otherwise we can run into a 
situation where we never signal a fence.

The general question is why do you need this?

Patch #6: Yes, please. Patch is Reviewed-by: Christian König 
<christian.koenig@amd.com>.

Patch #7: Already gave my rb on the patch Chris send out earlier.

Patch #8: NAK to the whole approach.

IIRC we discussed a very similar thing during the initial fence bringup 
and also during the fence_array development.

The problem is that you can easily build ring dependencies and so 
deadlocks with it.

I would really prefer an approach which is completely contained inside 
the syncobj code base.

Regards,
Christian.

Am 12.08.2017 um 00:39 schrieb Jason Ekstrand:
> This series does the same thing as my earlier series in that it adds a sync
> object wait interface complete with WAIT_FOR_SUBMIT flag.  While the uapi
> remains unchanged, the guts look a bit different.  Instead of adding a
> callback mechanism to drm_syncobj that fired whenever replace_fence was
> called, it's now using proxy fences.  The drm_syncobj_fence_get still
> returns NULL whenever the sync object is in an unsubmitted state but there
> is a new drm_syncobj_fence_proxy_get which returns either the real fence or
> a proxy fence that will be triggered the next time replace_fence is called
> with a non-NULL replacement.  This does make both drm_syncobj_fence_get and
> drm_syncobj_replace_fence a tiny bit more expensive, but it lets us do it
> all without locking.
>
> This series can be found as a branch here:
>
> https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syncobj-wait-submit-v4
>
> IGT tests for DRM_IOCTL_SYNCOBJ_WAIT and DRM_IOCTL_SYNCOBJ_RESET can be
> found on patchwork here:
>
> https://patchwork.freedesktop.org/series/28666/
>
> Patches to the Intel Vulkan driver to implement VK_KHR_external_fence on
> top of this kernel interface can be found here:
>
> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-external-fence
>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Christian König <christian.koenig@amd.com>
>
> Chris Wilson (2):
>    dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
>    dma-buf/dma-fence: Add a mechanism for proxy fences
>
> Dave Airlie (1):
>    drm/syncobj: add sync obj wait interface. (v8)
>
> Jason Ekstrand (6):
>    drm/syncobj: Rename fence_get to find_fence
>    drm/syncobj: Add a race-free drm_syncobj_fence_get helper
>    i915: Add support for drm syncobjs
>    dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2)
>    drm/syncobj: Add a reset ioctl
>    drm/syncobj: Allow wait for submit and signal behavior (v4)
>
>   drivers/dma-buf/Makefile                   |   4 +-
>   drivers/dma-buf/dma-fence-proxy.c          | 186 +++++++++++++++++++
>   drivers/dma-buf/dma-fence.c                |  34 ++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   2 +-
>   drivers/gpu/drm/drm_internal.h             |   4 +
>   drivers/gpu/drm/drm_ioctl.c                |   4 +
>   drivers/gpu/drm/drm_syncobj.c              | 275 +++++++++++++++++++++++++++--
>   drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++-
>   include/drm/drm_syncobj.h                  |  15 +-
>   include/linux/dma-fence-proxy.h            |  25 +++
>   include/uapi/drm/drm.h                     |  19 ++
>   include/uapi/drm/i915_drm.h                |  30 +++-
>   13 files changed, 710 insertions(+), 37 deletions(-)
>   create mode 100644 drivers/dma-buf/dma-fence-proxy.c
>   create mode 100644 include/linux/dma-fence-proxy.h
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-13 13:19   ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Christian König
@ 2017-08-13 15:26     ` Jason Ekstrand
  2017-08-13 15:52       ` Christian König
  0 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-13 15:26 UTC (permalink / raw)
  To: Christian König, dri-devel; +Cc: Jason Ekstrand, Dave Airlie

On August 13, 2017 6:19:53 AM Christian König <christian.koenig@amd.com> wrote:

> Patches #1-#4 are Acked-by: Christian König <christian.koenig@amd.com>.
>
> Patch #5: NAK, that will break radeon.
>
> On radeon we need the non-default wait or otherwise we can run into a
> situation where we never signal a fence.
>
> The general question is why do you need this?

Because i915 sets a non-default wait function so calling wait_any just 
bails with fences from i915 immediately bails with -EINVAL.  This makes it 
work even with non-default waits.

> Patch #6: Yes, please. Patch is Reviewed-by: Christian König
> <christian.koenig@amd.com>.
>
> Patch #7: Already gave my rb on the patch Chris send out earlier.
>
> Patch #8: NAK to the whole approach.
>
> IIRC we discussed a very similar thing during the initial fence bringup
> and also during the fence_array development.
>
> The problem is that you can easily build ring dependencies and so
> deadlocks with it.
>
> I would really prefer an approach which is completely contained inside
> the syncobj code base.

Are you use to the approach of internally making a proxy so long as all the 
proxy code is inside syncobj?  Is also be happy to go back to the original 
approach with v3 of the last patch.

> Regards,
> Christian.
>
> Am 12.08.2017 um 00:39 schrieb Jason Ekstrand:
>> This series does the same thing as my earlier series in that it adds a sync
>> object wait interface complete with WAIT_FOR_SUBMIT flag.  While the uapi
>> remains unchanged, the guts look a bit different.  Instead of adding a
>> callback mechanism to drm_syncobj that fired whenever replace_fence was
>> called, it's now using proxy fences.  The drm_syncobj_fence_get still
>> returns NULL whenever the sync object is in an unsubmitted state but there
>> is a new drm_syncobj_fence_proxy_get which returns either the real fence or
>> a proxy fence that will be triggered the next time replace_fence is called
>> with a non-NULL replacement.  This does make both drm_syncobj_fence_get and
>> drm_syncobj_replace_fence a tiny bit more expensive, but it lets us do it
>> all without locking.
>>
>> This series can be found as a branch here:
>>
>> https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syncobj-wait-submit-v4
>>
>> IGT tests for DRM_IOCTL_SYNCOBJ_WAIT and DRM_IOCTL_SYNCOBJ_RESET can be
>> found on patchwork here:
>>
>> https://patchwork.freedesktop.org/series/28666/
>>
>> Patches to the Intel Vulkan driver to implement VK_KHR_external_fence on
>> top of this kernel interface can be found here:
>>
>> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-external-fence
>>
>> Cc: Dave Airlie <airlied@redhat.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Christian König <christian.koenig@amd.com>
>>
>> Chris Wilson (2):
>>    dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
>>    dma-buf/dma-fence: Add a mechanism for proxy fences
>>
>> Dave Airlie (1):
>>    drm/syncobj: add sync obj wait interface. (v8)
>>
>> Jason Ekstrand (6):
>>    drm/syncobj: Rename fence_get to find_fence
>>    drm/syncobj: Add a race-free drm_syncobj_fence_get helper
>>    i915: Add support for drm syncobjs
>>    dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2)
>>    drm/syncobj: Add a reset ioctl
>>    drm/syncobj: Allow wait for submit and signal behavior (v4)
>>
>>   drivers/dma-buf/Makefile                   |   4 +-
>>   drivers/dma-buf/dma-fence-proxy.c          | 186 +++++++++++++++++++
>>   drivers/dma-buf/dma-fence.c                |  34 ++--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   2 +-
>>   drivers/gpu/drm/drm_internal.h             |   4 +
>>   drivers/gpu/drm/drm_ioctl.c                |   4 +
>>   drivers/gpu/drm/drm_syncobj.c              | 275 +++++++++++++++++++++++++++--
>>   drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++-
>>   include/drm/drm_syncobj.h                  |  15 +-
>>   include/linux/dma-fence-proxy.h            |  25 +++
>>   include/uapi/drm/drm.h                     |  19 ++
>>   include/uapi/drm/i915_drm.h                |  30 +++-
>>   13 files changed, 710 insertions(+), 37 deletions(-)
>>   create mode 100644 drivers/dma-buf/dma-fence-proxy.c
>>   create mode 100644 include/linux/dma-fence-proxy.h
>>
>


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-13 15:26     ` Jason Ekstrand
@ 2017-08-13 15:52       ` Christian König
  2017-08-13 23:14         ` Jason Ekstrand
  0 siblings, 1 reply; 55+ messages in thread
From: Christian König @ 2017-08-13 15:52 UTC (permalink / raw)
  To: Jason Ekstrand, dri-devel; +Cc: Jason Ekstrand, Dave Airlie

Am 13.08.2017 um 17:26 schrieb Jason Ekstrand:
> On August 13, 2017 6:19:53 AM Christian König 
> <christian.koenig@amd.com> wrote:
>
>> Patches #1-#4 are Acked-by: Christian König <christian.koenig@amd.com>.
>>
>> Patch #5: NAK, that will break radeon.
>>
>> On radeon we need the non-default wait or otherwise we can run into a
>> situation where we never signal a fence.
>>
>> The general question is why do you need this?
>
> Because i915 sets a non-default wait function so calling wait_any just 
> bails with fences from i915 immediately bails with -EINVAL. This makes 
> it work even with non-default waits.

Ok well, let me refine the question: Why does i915 sets a non-default 
wait function?

In radeon we have it because we need to handle 10+ years of different 
hardware generation, each which a bunch of separate bugs in their fence 
handling (and some even not solved by today).

>
>> Patch #6: Yes, please. Patch is Reviewed-by: Christian König
>> <christian.koenig@amd.com>.
>>
>> Patch #7: Already gave my rb on the patch Chris send out earlier.
>>
>> Patch #8: NAK to the whole approach.
>>
>> IIRC we discussed a very similar thing during the initial fence bringup
>> and also during the fence_array development.
>>
>> The problem is that you can easily build ring dependencies and so
>> deadlocks with it.
>>
>> I would really prefer an approach which is completely contained inside
>> the syncobj code base.
>
> Are you use to the approach of internally making a proxy so long as 
> all the proxy code is inside syncobj?
Yes, that would be a start.

In general if possible I would rather like to avoid the whole handling 
with the proxy/callback altogether, but that possible only works with 
wait for any if the waitqueue is global and that wouldn't be ideal either.

> Is also be happy to go back to the original approach with v3 of the 
> last patch.
v3 looked like it should work as well, I would just drop abusing the 
fence callback structure for the signaling.

Ideally we would finally come up with an interface to wait for multiple 
waitqueue at the same time, but that probably goes a bit to far.

For now just use a single linked list to start all processes waiting for 
a fence to arrive or something like this.

Regards,
Christian.

>
>> Regards,
>> Christian.
>>
>> Am 12.08.2017 um 00:39 schrieb Jason Ekstrand:
>>> This series does the same thing as my earlier series in that it adds 
>>> a sync
>>> object wait interface complete with WAIT_FOR_SUBMIT flag. While the 
>>> uapi
>>> remains unchanged, the guts look a bit different.  Instead of adding a
>>> callback mechanism to drm_syncobj that fired whenever replace_fence was
>>> called, it's now using proxy fences.  The drm_syncobj_fence_get still
>>> returns NULL whenever the sync object is in an unsubmitted state but 
>>> there
>>> is a new drm_syncobj_fence_proxy_get which returns either the real 
>>> fence or
>>> a proxy fence that will be triggered the next time replace_fence is 
>>> called
>>> with a non-NULL replacement.  This does make both 
>>> drm_syncobj_fence_get and
>>> drm_syncobj_replace_fence a tiny bit more expensive, but it lets us 
>>> do it
>>> all without locking.
>>>
>>> This series can be found as a branch here:
>>>
>>> https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syncobj-wait-submit-v4 
>>>
>>>
>>> IGT tests for DRM_IOCTL_SYNCOBJ_WAIT and DRM_IOCTL_SYNCOBJ_RESET can be
>>> found on patchwork here:
>>>
>>> https://patchwork.freedesktop.org/series/28666/
>>>
>>> Patches to the Intel Vulkan driver to implement 
>>> VK_KHR_external_fence on
>>> top of this kernel interface can be found here:
>>>
>>> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-external-fence 
>>>
>>>
>>> Cc: Dave Airlie <airlied@redhat.com>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Christian König <christian.koenig@amd.com>
>>>
>>> Chris Wilson (2):
>>>    dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
>>>    dma-buf/dma-fence: Add a mechanism for proxy fences
>>>
>>> Dave Airlie (1):
>>>    drm/syncobj: add sync obj wait interface. (v8)
>>>
>>> Jason Ekstrand (6):
>>>    drm/syncobj: Rename fence_get to find_fence
>>>    drm/syncobj: Add a race-free drm_syncobj_fence_get helper
>>>    i915: Add support for drm syncobjs
>>>    dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2)
>>>    drm/syncobj: Add a reset ioctl
>>>    drm/syncobj: Allow wait for submit and signal behavior (v4)
>>>
>>>   drivers/dma-buf/Makefile                   |   4 +-
>>>   drivers/dma-buf/dma-fence-proxy.c          | 186 +++++++++++++++++++
>>>   drivers/dma-buf/dma-fence.c                |  34 ++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   2 +-
>>>   drivers/gpu/drm/drm_internal.h             |   4 +
>>>   drivers/gpu/drm/drm_ioctl.c                |   4 +
>>>   drivers/gpu/drm/drm_syncobj.c              | 275 
>>> +++++++++++++++++++++++++++--
>>>   drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++-
>>>   include/drm/drm_syncobj.h                  |  15 +-
>>>   include/linux/dma-fence-proxy.h            |  25 +++
>>>   include/uapi/drm/drm.h                     |  19 ++
>>>   include/uapi/drm/i915_drm.h                |  30 +++-
>>>   13 files changed, 710 insertions(+), 37 deletions(-)
>>>   create mode 100644 drivers/dma-buf/dma-fence-proxy.c
>>>   create mode 100644 include/linux/dma-fence-proxy.h
>>>
>>
>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-13 15:52       ` Christian König
@ 2017-08-13 23:14         ` Jason Ekstrand
  2017-08-14  5:49           ` Jason Ekstrand
  2017-08-14  7:36           ` Christian König
  0 siblings, 2 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-13 23:14 UTC (permalink / raw)
  To: Christian König, dri-devel; +Cc: Jason Ekstrand, Dave Airlie

On August 13, 2017 8:52:21 AM Christian König <christian.koenig@amd.com> wrote:

> Am 13.08.2017 um 17:26 schrieb Jason Ekstrand:
>> On August 13, 2017 6:19:53 AM Christian König
>> <christian.koenig@amd.com> wrote:
>>
>>> Patches #1-#4 are Acked-by: Christian König <christian.koenig@amd.com>.
>>>
>>> Patch #5: NAK, that will break radeon.
>>>
>>> On radeon we need the non-default wait or otherwise we can run into a
>>> situation where we never signal a fence.
>>>
>>> The general question is why do you need this?
>>
>> Because i915 sets a non-default wait function so calling wait_any just
>> bails with fences from i915 immediately bails with -EINVAL. This makes
>> it work even with non-default waits.
>
> Ok well, let me refine the question: Why does i915 sets a non-default
> wait function?

I have no idea.

> In radeon we have it because we need to handle 10+ years of different
> hardware generation, each which a bunch of separate bugs in their fence
> handling (and some even not solved by today).

So how does wait_any returning -EINVAL for non-default waits help radeon?

>>> Patch #6: Yes, please. Patch is Reviewed-by: Christian König
>>> <christian.koenig@amd.com>.
>>>
>>> Patch #7: Already gave my rb on the patch Chris send out earlier.
>>>
>>> Patch #8: NAK to the whole approach.
>>>
>>> IIRC we discussed a very similar thing during the initial fence bringup
>>> and also during the fence_array development.
>>>
>>> The problem is that you can easily build ring dependencies and so
>>> deadlocks with it.
>>>
>>> I would really prefer an approach which is completely contained inside
>>> the syncobj code base.
>>
>> Are you use to the approach of internally making a proxy so long as
>> all the proxy code is inside syncobj?
> Yes, that would be a start.
>
> In general if possible I would rather like to avoid the whole handling
> with the proxy/callback altogether, but that possible only works with
> wait for any if the waitqueue is global and that wouldn't be ideal either.

I'm happy to get rid of the proxies.  They did work nicely but I'm not 
really attached to them

>> Is also be happy to go back to the original approach with v3 of the
>> last patch.
> v3 looked like it should work as well, I would just drop abusing the
> fence callback structure for the signaling.
>
> Ideally we would finally come up with an interface to wait for multiple
> waitqueue at the same time, but that probably goes a bit to far.
>
> For now just use a single linked list to start all processes waiting for
> a fence to arrive or something like this.

I don't really know what you're suggesting.  Patch v3 has a single 
waitqueue per process.  Are you suggesting one per fence?

--Jason

> Regards,
> Christian.
>
>>
>>> Regards,
>>> Christian.
>>>
>>> Am 12.08.2017 um 00:39 schrieb Jason Ekstrand:
>>>> This series does the same thing as my earlier series in that it adds
>>>> a sync
>>>> object wait interface complete with WAIT_FOR_SUBMIT flag. While the
>>>> uapi
>>>> remains unchanged, the guts look a bit different.  Instead of adding a
>>>> callback mechanism to drm_syncobj that fired whenever replace_fence was
>>>> called, it's now using proxy fences.  The drm_syncobj_fence_get still
>>>> returns NULL whenever the sync object is in an unsubmitted state but
>>>> there
>>>> is a new drm_syncobj_fence_proxy_get which returns either the real
>>>> fence or
>>>> a proxy fence that will be triggered the next time replace_fence is
>>>> called
>>>> with a non-NULL replacement.  This does make both
>>>> drm_syncobj_fence_get and
>>>> drm_syncobj_replace_fence a tiny bit more expensive, but it lets us
>>>> do it
>>>> all without locking.
>>>>
>>>> This series can be found as a branch here:
>>>>
>>>> https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syncobj-wait-submit-v4
>>>>
>>>>
>>>> IGT tests for DRM_IOCTL_SYNCOBJ_WAIT and DRM_IOCTL_SYNCOBJ_RESET can be
>>>> found on patchwork here:
>>>>
>>>> https://patchwork.freedesktop.org/series/28666/
>>>>
>>>> Patches to the Intel Vulkan driver to implement
>>>> VK_KHR_external_fence on
>>>> top of this kernel interface can be found here:
>>>>
>>>> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-external-fence
>>>>
>>>>
>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>
>>>> Chris Wilson (2):
>>>>    dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
>>>>    dma-buf/dma-fence: Add a mechanism for proxy fences
>>>>
>>>> Dave Airlie (1):
>>>>    drm/syncobj: add sync obj wait interface. (v8)
>>>>
>>>> Jason Ekstrand (6):
>>>>    drm/syncobj: Rename fence_get to find_fence
>>>>    drm/syncobj: Add a race-free drm_syncobj_fence_get helper
>>>>    i915: Add support for drm syncobjs
>>>>    dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2)
>>>>    drm/syncobj: Add a reset ioctl
>>>>    drm/syncobj: Allow wait for submit and signal behavior (v4)
>>>>
>>>>   drivers/dma-buf/Makefile                   |   4 +-
>>>>   drivers/dma-buf/dma-fence-proxy.c          | 186 +++++++++++++++++++
>>>>   drivers/dma-buf/dma-fence.c                |  34 ++--
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   2 +-
>>>>   drivers/gpu/drm/drm_internal.h             |   4 +
>>>>   drivers/gpu/drm/drm_ioctl.c                |   4 +
>>>>   drivers/gpu/drm/drm_syncobj.c              | 275
>>>> +++++++++++++++++++++++++++--
>>>>   drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++-
>>>>   include/drm/drm_syncobj.h                  |  15 +-
>>>>   include/linux/dma-fence-proxy.h            |  25 +++
>>>>   include/uapi/drm/drm.h                     |  19 ++
>>>>   include/uapi/drm/i915_drm.h                |  30 +++-
>>>>   13 files changed, 710 insertions(+), 37 deletions(-)
>>>>   create mode 100644 drivers/dma-buf/dma-fence-proxy.c
>>>>   create mode 100644 include/linux/dma-fence-proxy.h
>>>>
>>>
>>
>>
>


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 2/9] drm/syncobj: Add a race-free drm_syncobj_fence_get helper
  2017-08-11 22:39   ` [PATCH 2/9] drm/syncobj: Add a race-free drm_syncobj_fence_get helper Jason Ekstrand
@ 2017-08-14  2:03     ` kbuild test robot
  0 siblings, 0 replies; 55+ messages in thread
From: kbuild test robot @ 2017-08-14  2:03 UTC (permalink / raw)
  Cc: Jason Ekstrand, Jason Ekstrand, kbuild-all, dri-devel

[-- Attachment #1: Type: text/plain, Size: 20135 bytes --]

Hi Jason,

[auto build test WARNING on drm/drm-next]
[also build test WARNING on v4.13-rc5 next-20170811]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jason-Ekstrand/drm-syncobj-Add-full-featured-wait-support-v2/20170814-074233
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   WARNING: convert(1) not found, for SVG to PDF conversion install ImageMagick (https://www.imagemagick.org)
   include/linux/init.h:1: warning: no structured comments found
   include/linux/mod_devicetable.h:687: warning: Excess struct/union/enum/typedef member 'ver_major' description in 'fsl_mc_device_id'
   include/linux/mod_devicetable.h:687: warning: Excess struct/union/enum/typedef member 'ver_minor' description in 'fsl_mc_device_id'
   kernel/sched/core.c:2080: warning: No description found for parameter 'rf'
   kernel/sched/core.c:2080: warning: Excess function parameter 'cookie' description in 'try_to_wake_up_local'
   include/linux/wait.h:555: warning: No description found for parameter 'wq'
   include/linux/wait.h:555: warning: Excess function parameter 'wq_head' description in 'wait_event_interruptible_hrtimeout'
   include/linux/wait.h:759: warning: No description found for parameter 'wq_head'
   include/linux/wait.h:759: warning: Excess function parameter 'wq' description in 'wait_event_killable'
   include/linux/kthread.h:26: warning: Excess function parameter '...' description in 'kthread_create'
   kernel/sys.c:1: warning: no structured comments found
   include/linux/device.h:968: warning: No description found for parameter 'dma_ops'
   drivers/dma-buf/seqno-fence.c:1: warning: no structured comments found
   include/linux/sync_file.h:51: warning: No description found for parameter 'flags'
   include/linux/iio/iio.h:603: warning: No description found for parameter 'trig_readonly'
   include/linux/iio/trigger.h:151: warning: No description found for parameter 'indio_dev'
   include/linux/iio/trigger.h:151: warning: No description found for parameter 'trig'
   include/linux/device.h:969: warning: No description found for parameter 'dma_ops'
   drivers/ata/libata-eh.c:1449: warning: No description found for parameter 'link'
   drivers/ata/libata-eh.c:1449: warning: Excess function parameter 'ap' description in 'ata_eh_done'
   drivers/ata/libata-eh.c:1590: warning: No description found for parameter 'qc'
   drivers/ata/libata-eh.c:1590: warning: Excess function parameter 'dev' description in 'ata_eh_request_sense'
   drivers/mtd/nand/nand_base.c:2751: warning: Excess function parameter 'cached' description in 'nand_write_page'
   drivers/mtd/nand/nand_base.c:2751: warning: Excess function parameter 'cached' description in 'nand_write_page'
   arch/s390/include/asm/cmb.h:1: warning: no structured comments found
   drivers/scsi/scsi_lib.c:1116: warning: No description found for parameter 'rq'
   drivers/scsi/constants.c:1: warning: no structured comments found
   include/linux/usb/gadget.h:230: warning: No description found for parameter 'claimed'
   include/linux/usb/gadget.h:230: warning: No description found for parameter 'enabled'
   include/linux/usb/gadget.h:412: warning: No description found for parameter 'quirk_altset_not_supp'
   include/linux/usb/gadget.h:412: warning: No description found for parameter 'quirk_stall_not_supp'
   include/linux/usb/gadget.h:412: warning: No description found for parameter 'quirk_zlp_not_supp'
   fs/inode.c:1666: warning: No description found for parameter 'rcu'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_transaction'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_next_transaction'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_list'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_vfs_inode'
   include/linux/jbd2.h:443: warning: No description found for parameter 'i_flags'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_rsv_handle'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_reserved'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_type'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_line_no'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_start_jiffies'
   include/linux/jbd2.h:497: warning: No description found for parameter 'h_requested_credits'
   include/linux/jbd2.h:497: warning: No description found for parameter 'saved_alloc_context'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_chkpt_bhs'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_devname'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_average_commit_time'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_min_batch_time'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_max_batch_time'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_commit_callback'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_failed_commit'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_chksum_driver'
   include/linux/jbd2.h:1050: warning: No description found for parameter 'j_csum_seed'
   fs/jbd2/transaction.c:511: warning: No description found for parameter 'type'
   fs/jbd2/transaction.c:511: warning: No description found for parameter 'line_no'
   fs/jbd2/transaction.c:641: warning: No description found for parameter 'gfp_mask'
   include/drm/drm_drv.h:593: warning: No description found for parameter 'gem_prime_pin'
   include/drm/drm_drv.h:593: warning: No description found for parameter 'gem_prime_unpin'
   include/drm/drm_drv.h:593: warning: No description found for parameter 'gem_prime_res_obj'
   include/drm/drm_drv.h:593: warning: No description found for parameter 'gem_prime_get_sg_table'
   include/drm/drm_drv.h:593: warning: No description found for parameter 'gem_prime_import_sg_table'
   include/drm/drm_drv.h:593: warning: No description found for parameter 'gem_prime_vmap'
   include/drm/drm_drv.h:593: warning: No description found for parameter 'gem_prime_vunmap'
   include/drm/drm_drv.h:593: warning: No description found for parameter 'gem_prime_mmap'
   include/drm/drm_mode_config.h:771: warning: No description found for parameter 'modifiers_property'
   include/drm/drm_mode_config.h:771: warning: Excess struct/union/enum/typedef member 'modifiers' description in 'drm_mode_config'
   include/drm/drm_plane.h:544: warning: No description found for parameter 'modifiers'
   include/drm/drm_plane.h:544: warning: No description found for parameter 'modifier_count'
>> include/drm/drm_syncobj.h:57: warning: No description found for parameter '_fence'
>> include/drm/drm_syncobj.h:57: warning: Excess struct/union/enum/typedef member 'fence' description in 'drm_syncobj'
   Documentation/doc-guide/sphinx.rst:121: ERROR: Unknown target name: "sphinx c domain".
   kernel/sched/fair.c:7584: WARNING: Inline emphasis start-string without end-string.
   kernel/time/timer.c:1200: ERROR: Unexpected indentation.
   kernel/time/timer.c:1202: ERROR: Unexpected indentation.
   kernel/time/timer.c:1203: WARNING: Block quote ends without a blank line; unexpected unindent.
   include/linux/wait.h:108: WARNING: Block quote ends without a blank line; unexpected unindent.
   include/linux/wait.h:111: ERROR: Unexpected indentation.
   include/linux/wait.h:113: WARNING: Block quote ends without a blank line; unexpected unindent.
   kernel/time/hrtimer.c:991: WARNING: Block quote ends without a blank line; unexpected unindent.
   kernel/signal.c:323: WARNING: Inline literal start-string without end-string.
   kernel/rcu/tree.c:3187: ERROR: Unexpected indentation.
   kernel/rcu/tree.c:3214: ERROR: Unexpected indentation.
   kernel/rcu/tree.c:3215: WARNING: Bullet list ends without a blank line; unexpected unindent.
   include/linux/iio/iio.h:219: ERROR: Unexpected indentation.
   include/linux/iio/iio.h:220: WARNING: Block quote ends without a blank line; unexpected unindent.
   include/linux/iio/iio.h:226: WARNING: Definition list ends without a blank line; unexpected unindent.
   drivers/iio/industrialio-core.c:633: ERROR: Unknown target name: "iio_val".
   drivers/iio/industrialio-core.c:640: ERROR: Unknown target name: "iio_val".
   drivers/ata/libata-core.c:5906: ERROR: Unknown target name: "hw".
   drivers/message/fusion/mptbase.c:5051: WARNING: Definition list ends without a blank line; unexpected unindent.
   drivers/tty/serial/serial_core.c:1897: WARNING: Definition list ends without a blank line; unexpected unindent.
   drivers/pci/pci.c:3470: ERROR: Unexpected indentation.
   include/linux/regulator/driver.h:271: ERROR: Unknown target name: "regulator_regmap_x_voltage".
   include/linux/spi/spi.h:373: ERROR: Unexpected indentation.
   drivers/w1/w1_io.c:196: WARNING: Definition list ends without a blank line; unexpected unindent.
   block/bio.c:404: ERROR: Unknown target name: "gfp".
   sound/soc/soc-core.c:2703: ERROR: Unknown target name: "snd_soc_daifmt".
   sound/core/jack.c:312: ERROR: Unknown target name: "snd_jack_btn".
   Documentation/media/v4l-drivers/imx.rst:: WARNING: document isn't included in any toctree
   Documentation/virtual/kvm/vcpu-requests.rst:: WARNING: document isn't included in any toctree
   Documentation/dev-tools/kselftest.rst:15: WARNING: Could not lex literal_block as "c". Highlighting skipped.
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 43: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 56: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 69: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 82: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 96: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 109: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 122: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 133: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 164: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 193: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 43: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 56: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 69: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 82: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 96: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 109: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 122: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 133: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 164: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 193: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 43: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 56: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 69: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 82: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 96: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 109: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 122: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 133: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 164: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 193: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 43: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 56: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 69: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 82: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 96: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 109: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 122: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 133: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 164: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 193: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 43: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 56: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 69: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 82: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 96: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 109: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 122: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 133: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 164: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 193: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 43: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 56: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 69: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 82: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 96: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 109: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 122: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 133: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 164: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "~/.fonts.conf", line 193: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 43: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 56: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 69: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 82: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 96: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 109: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 122: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 133: Having multiple values in <test> isn't supported and may not work as expected
   Fontconfig warning: "/home/kbuild/.config/fontconfig/fonts.conf", line 164: Having multiple values in <test> isn't supported and may not work as expected

vim +/_fence +57 include/drm/drm_syncobj.h

e9083420 Dave Airlie 2017-04-04 @57  

:::::: The code at line 57 was first introduced by commit
:::::: e9083420bbacce27e43d418064d0d2dfb4b37aaa drm: introduce sync objects (v4)

:::::: TO: Dave Airlie <airlied@redhat.com>
:::::: CC: Dave Airlie <airlied@redhat.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 6735 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 3/9] i915: Add support for drm syncobjs
  2017-08-11 22:39   ` [PATCH 3/9] i915: Add support for drm syncobjs Jason Ekstrand
@ 2017-08-14  2:58     ` Jason Ekstrand
  0 siblings, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-14  2:58 UTC (permalink / raw)
  To: Maling list - DRI developers; +Cc: Jason Ekstrand, Jason Ekstrand


[-- Attachment #1.1: Type: text/plain, Size: 12767 bytes --]

We now have reviewed userspace for this:

https://lists.freedesktop.org/archives/mesa-dev/2017-August/166200.html

On Fri, Aug 11, 2017 at 3:39 PM, Jason Ekstrand <jason@jlekstrand.net>
wrote:

> This commit adds support for waiting on or signaling DRM syncobjs as
> part of execbuf.  It does so by hijacking the currently unused cliprects
> pointer to instead point to an array of i915_gem_exec_fence structs
> which containe a DRM syncobj and a flags parameter which specifies
> whether to wait on it or to signal it.  This implementation
> theoretically allows for both flags to be set in which case it waits on
> the dma_fence that was in the syncobj and then immediately replaces it
> with the dma_fence from the current execbuf.
>
> v2:
>  - Rebase on new syncobj API
> v3:
>  - Pull everything out into helpers
>  - Do all allocation in gem_execbuffer2
>  - Pack the flags in the bottom 2 bits of the drm_syncobj*
> v4:
>  - Prevent a potential race on syncobj->fence
>
> Testcase: igt/gem_exec_fence/syncobj*
> Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146
> ++++++++++++++++++++++++++++-
>  include/uapi/drm/i915_drm.h                |  30 +++++-
>  3 files changed, 171 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_
> drv.c
> index 4c96a72..50db490 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -388,6 +388,7 @@ static int i915_getparam(struct drm_device *dev, void
> *data,
>         case I915_PARAM_HAS_EXEC_FENCE:
>         case I915_PARAM_HAS_EXEC_CAPTURE:
>         case I915_PARAM_HAS_EXEC_BATCH_FIRST:
> +       case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
>                 /* For the time being all of these are always true;
>                  * if some supported hardware does not have one of these
>                  * features this value needs to be provided from
> @@ -2738,7 +2739,7 @@ static struct drm_driver driver = {
>          */
>         .driver_features =
>             DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM |
> DRIVER_PRIME |
> -           DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC,
> +           DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC |
> DRIVER_SYNCOBJ,
>         .release = i915_driver_release,
>         .open = i915_driver_open,
>         .lastclose = i915_driver_lastclose,
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 929f275..8df845b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -33,6 +33,7 @@
>
>  #include <drm/drmP.h>
>  #include <drm/i915_drm.h>
> +#include <drm/drm_syncobj.h>
>
>  #include "i915_drv.h"
>  #include "i915_gem_clflush.h"
> @@ -1882,8 +1883,10 @@ static bool i915_gem_check_execbuffer(struct
> drm_i915_gem_execbuffer2 *exec)
>                 return false;
>
>         /* Kernel clipping was a DRI1 misfeature */
> -       if (exec->num_cliprects || exec->cliprects_ptr)
> -               return false;
> +       if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
> +               if (exec->num_cliprects || exec->cliprects_ptr)
> +                       return false;
> +       }
>
>         if (exec->DR4 == 0xffffffff) {
>                 DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
> @@ -2114,11 +2117,125 @@ eb_select_engine(struct drm_i915_private
> *dev_priv,
>         return engine;
>  }
>
> +static void __free_fence_array(struct drm_syncobj **fences, unsigned int
> n)
> +{
> +       while (n--)
> +               drm_syncobj_put(ptr_mask_bits(fences[n], 2));
> +       kvfree(fences);
> +}
> +
> +static struct drm_syncobj **get_fence_array(struct
> drm_i915_gem_execbuffer2 *args,
> +                                           struct drm_file *file)
> +{
> +       const unsigned int nfences = args->num_cliprects;
> +       struct drm_i915_gem_exec_fence __user *user;
> +       struct drm_syncobj **fences;
> +       unsigned int n;
> +       int err;
> +
> +       if (!(args->flags & I915_EXEC_FENCE_ARRAY))
> +               return NULL;
> +
> +       if (nfences > SIZE_MAX / sizeof(*fences))
> +               return ERR_PTR(-EINVAL);
> +
> +       user = u64_to_user_ptr(args->cliprects_ptr);
> +       if (!access_ok(VERIFY_READ, user, nfences * 2 * sizeof(u32)))
> +               return ERR_PTR(-EFAULT);
> +
> +       fences = kvmalloc_array(args->num_cliprects, sizeof(*fences),
> +                               __GFP_NOWARN | GFP_TEMPORARY);
> +       if (!fences)
> +               return ERR_PTR(-ENOMEM);
> +
> +       for (n = 0; n < nfences; n++) {
> +               struct drm_i915_gem_exec_fence fence;
> +               struct drm_syncobj *syncobj;
> +
> +               if (__copy_from_user(&fence, user++, sizeof(fence))) {
> +                       err = -EFAULT;
> +                       goto err;
> +               }
> +
> +               syncobj = drm_syncobj_find(file, fence.handle);
> +               if (!syncobj) {
> +                       DRM_DEBUG("Invalid syncobj handle provided\n");
> +                       err = -EINVAL;
> +                       goto err;
> +               }
> +
> +               fences[n] = ptr_pack_bits(syncobj, fence.flags, 2);
> +       }
> +
> +       return fences;
> +
> +err:
> +       __free_fence_array(fences, n);
> +       return ERR_PTR(err);
> +}
> +
> +static void put_fence_array(struct drm_i915_gem_execbuffer2 *args,
> +                           struct drm_syncobj **fences)
> +{
> +       if (!fences)
> +               return;
> +       __free_fence_array(fences, args->num_cliprects);
> +}
> +
> +static int await_fence_array(struct i915_execbuffer *eb,
> +                            struct drm_syncobj **fences)
> +{
> +       const unsigned int nfences = eb->args->num_cliprects;
> +       unsigned int n;
> +       int err;
> +
> +       for (n = 0; n < nfences; n++) {
> +               struct drm_syncobj *syncobj;
> +               struct dma_fence *fence;
> +               unsigned int flags;
> +
> +               syncobj = ptr_unpack_bits(fences[n], &flags, 2);
> +               if (!(flags & I915_EXEC_FENCE_WAIT))
> +                       continue;
> +
> +               fence = drm_syncobj_fence_get(syncobj);
> +               if (!fence)
> +                   return -EINVAL;
> +
> +               err = i915_gem_request_await_dma_fence(eb->request,
> fence);
> +               dma_fence_put(fence);
> +               if (err < 0)
> +                       return err;
> +       }
> +
> +       return 0;
> +}
> +
> +static void signal_fence_array(struct i915_execbuffer *eb,
> +                              struct drm_syncobj **fences)
> +{
> +       const unsigned int nfences = eb->args->num_cliprects;
> +       struct dma_fence * const fence = &eb->request->fence;
> +       unsigned int n;
> +
> +       for (n = 0; n < nfences; n++) {
> +               struct drm_syncobj *syncobj;
> +               unsigned int flags;
> +
> +               syncobj = ptr_unpack_bits(fences[n], &flags, 2);
> +               if (!(flags & I915_EXEC_FENCE_SIGNAL))
> +                       continue;
> +
> +               drm_syncobj_replace_fence(syncobj, fence);
> +       }
> +}
> +
>  static int
>  i915_gem_do_execbuffer(struct drm_device *dev,
>                        struct drm_file *file,
>                        struct drm_i915_gem_execbuffer2 *args,
> -                      struct drm_i915_gem_exec_object2 *exec)
> +                      struct drm_i915_gem_exec_object2 *exec,
> +                      struct drm_syncobj **fences)
>  {
>         struct i915_execbuffer eb;
>         struct dma_fence *in_fence = NULL;
> @@ -2304,6 +2421,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>                         goto err_request;
>         }
>
> +       if (fences) {
> +               err = await_fence_array(&eb, fences);
> +               if (err < 0)
> +                       goto err_request;
> +       }
> +
>         if (out_fence_fd != -1) {
>                 out_fence = sync_file_create(&eb.request->fence);
>                 if (!out_fence) {
> @@ -2327,6 +2450,9 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>         __i915_add_request(eb.request, err == 0);
>         add_to_client(eb.request, file);
>
> +       if (fences)
> +               signal_fence_array(&eb, fences);
> +
>         if (out_fence) {
>                 if (err == 0) {
>                         fd_install(out_fence_fd, out_fence->file);
> @@ -2428,7 +2554,7 @@ i915_gem_execbuffer(struct drm_device *dev, void
> *data,
>                         exec2_list[i].flags = 0;
>         }
>
> -       err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list);
> +       err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list, NULL);
>         if (exec2.flags & __EXEC_HAS_RELOC) {
>                 struct drm_i915_gem_exec_object __user *user_exec_list =
>                         u64_to_user_ptr(args->buffers_ptr);
> @@ -2460,6 +2586,7 @@ i915_gem_execbuffer2(struct drm_device *dev, void
> *data,
>         const size_t sz = sizeof(struct drm_i915_gem_exec_object2);
>         struct drm_i915_gem_execbuffer2 *args = data;
>         struct drm_i915_gem_exec_object2 *exec2_list;
> +       struct drm_syncobj **fences = NULL;
>         int err;
>
>         if (args->buffer_count < 1 || args->buffer_count > SIZE_MAX / sz -
> 1) {
> @@ -2486,7 +2613,15 @@ i915_gem_execbuffer2(struct drm_device *dev, void
> *data,
>                 return -EFAULT;
>         }
>
> -       err = i915_gem_do_execbuffer(dev, file, args, exec2_list);
> +       if (args->flags & I915_EXEC_FENCE_ARRAY) {
> +               fences = get_fence_array(args, file);
> +               if (IS_ERR(fences)) {
> +                       kvfree(exec2_list);
> +                       return PTR_ERR(fences);
> +               }
> +       }
> +
> +       err = i915_gem_do_execbuffer(dev, file, args, exec2_list, fences);
>
>         /*
>          * Now that we have begun execution of the batchbuffer, we ignore
> @@ -2517,5 +2652,6 @@ i915_gem_execbuffer2(struct drm_device *dev, void
> *data,
>
>         args->flags &= ~__I915_EXEC_UNKNOWN_FLAGS;
>         kvfree(exec2_list);
> +       put_fence_array(args, fences);
>         return err;
>  }
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 7ccbd6a..bb3fb49 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -431,6 +431,11 @@ typedef struct drm_i915_irq_wait {
>   */
>  #define I915_PARAM_HAS_EXEC_BATCH_FIRST         48
>
> +/* Query whether DRM_I915_GEM_EXECBUFFER2 supports supplying an array of
> + * drm_i915_gem_exec_fence structures.  See I915_EXEC_FENCE_ARRAY.
> + */
> +#define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
> +
>  typedef struct drm_i915_getparam {
>         __s32 param;
>         /*
> @@ -812,6 +817,17 @@ struct drm_i915_gem_exec_object2 {
>         __u64 rsvd2;
>  };
>
> +struct drm_i915_gem_exec_fence {
> +       /**
> +        * User's handle for a dma-fence to wait on or signal.
> +        */
> +       __u32 handle;
> +
> +#define I915_EXEC_FENCE_WAIT            (1<<0)
> +#define I915_EXEC_FENCE_SIGNAL          (1<<1)
> +       __u32 flags;
> +};
> +
>  struct drm_i915_gem_execbuffer2 {
>         /**
>          * List of gem_exec_object2 structs
> @@ -826,7 +842,10 @@ struct drm_i915_gem_execbuffer2 {
>         __u32 DR1;
>         __u32 DR4;
>         __u32 num_cliprects;
> -       /** This is a struct drm_clip_rect *cliprects */
> +       /** This is a struct drm_clip_rect *cliprects if
> I915_EXEC_FENCE_ARRAY
> +        * is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
> +        * struct drm_i915_gem_exec_fence *fences.
> +        */
>         __u64 cliprects_ptr;
>  #define I915_EXEC_RING_MASK              (7<<0)
>  #define I915_EXEC_DEFAULT                (0<<0)
> @@ -927,7 +946,14 @@ struct drm_i915_gem_execbuffer2 {
>   * element).
>   */
>  #define I915_EXEC_BATCH_FIRST          (1<<18)
> -#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_BATCH_FIRST<<1))
> +
> +/* Setting I915_FENCE_ARRAY implies that num_cliprects and cliprects_ptr
> + * define an array of i915_gem_exec_fence structures which specify a set
> of
> + * dma fences to wait upon or signal.
> + */
> +#define I915_EXEC_FENCE_ARRAY   (1<<19)
> +
> +#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
>
>  #define I915_EXEC_CONTEXT_ID_MASK      (0xffffffff)
>  #define i915_execbuffer2_set_context_id(eb2, context) \
> --
> 2.5.0.400.gff86faf
>
>

[-- Attachment #1.2: Type: text/html, Size: 16285 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-13 23:14         ` Jason Ekstrand
@ 2017-08-14  5:49           ` Jason Ekstrand
  2017-08-14  7:36           ` Christian König
  1 sibling, 0 replies; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-14  5:49 UTC (permalink / raw)
  To: Christian König, dri-devel; +Cc: Jason Ekstrand, Dave Airlie

On August 13, 2017 4:14:45 PM Jason Ekstrand <jason@jlekstrand.net> wrote:

> On August 13, 2017 8:52:21 AM Christian König <christian.koenig@amd.com> wrote:
>
>> Am 13.08.2017 um 17:26 schrieb Jason Ekstrand:
>>> On August 13, 2017 6:19:53 AM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>
>>>> Patches #1-#4 are Acked-by: Christian König <christian.koenig@amd.com>.
>>>>
>>>> Patch #5: NAK, that will break radeon.
>>>>
>>>> On radeon we need the non-default wait or otherwise we can run into a
>>>> situation where we never signal a fence.
>>>>
>>>> The general question is why do you need this?
>>>
>>> Because i915 sets a non-default wait function so calling wait_any just
>>> bails with fences from i915 immediately bails with -EINVAL. This makes
>>> it work even with non-default waits.
>>
>> Ok well, let me refine the question: Why does i915 sets a non-default
>> wait function?
>
> I have no idea.
>
>> In radeon we have it because we need to handle 10+ years of different
>> hardware generation, each which a bunch of separate bugs in their fence
>> handling (and some even not solved by today).
>
> So how does wait_any returning -EINVAL for non-default waits help radeon?
>
>>>> Patch #6: Yes, please. Patch is Reviewed-by: Christian König
>>>> <christian.koenig@amd.com>.
>>>>
>>>> Patch #7: Already gave my rb on the patch Chris send out earlier.
>>>>
>>>> Patch #8: NAK to the whole approach.
>>>>
>>>> IIRC we discussed a very similar thing during the initial fence bringup
>>>> and also during the fence_array development.
>>>>
>>>> The problem is that you can easily build ring dependencies and so
>>>> deadlocks with it.
>>>>
>>>> I would really prefer an approach which is completely contained inside
>>>> the syncobj code base.
>>>
>>> Are you use to the approach of internally making a proxy so long as
>>> all the proxy code is inside syncobj?
>> Yes, that would be a start.
>>
>> In general if possible I would rather like to avoid the whole handling
>> with the proxy/callback altogether, but that possible only works with
>> wait for any if the waitqueue is global and that wouldn't be ideal either.
>
> I'm happy to get rid of the proxies.  They did work nicely but I'm not
> really attached to them
>
>>> Is also be happy to go back to the original approach with v3 of the
>>> last patch.
>> v3 looked like it should work as well, I would just drop abusing the
>> fence callback structure for the signaling.
>>
>> Ideally we would finally come up with an interface to wait for multiple
>> waitqueue at the same time, but that probably goes a bit to far.
>>
>> For now just use a single linked list to start all processes waiting for
>> a fence to arrive or something like this.
>
> I don't really know what you're suggesting.  Patch v3 has a single
> waitqueue per process.  Are you suggesting one per fence?

I apologize for my plethora of questions.  I'm still very new to this stuff 
and don't know what all is out there.

> --Jason
>
>> Regards,
>> Christian.
>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 12.08.2017 um 00:39 schrieb Jason Ekstrand:
>>>>> This series does the same thing as my earlier series in that it adds
>>>>> a sync
>>>>> object wait interface complete with WAIT_FOR_SUBMIT flag. While the
>>>>> uapi
>>>>> remains unchanged, the guts look a bit different.  Instead of adding a
>>>>> callback mechanism to drm_syncobj that fired whenever replace_fence was
>>>>> called, it's now using proxy fences.  The drm_syncobj_fence_get still
>>>>> returns NULL whenever the sync object is in an unsubmitted state but
>>>>> there
>>>>> is a new drm_syncobj_fence_proxy_get which returns either the real
>>>>> fence or
>>>>> a proxy fence that will be triggered the next time replace_fence is
>>>>> called
>>>>> with a non-NULL replacement.  This does make both
>>>>> drm_syncobj_fence_get and
>>>>> drm_syncobj_replace_fence a tiny bit more expensive, but it lets us
>>>>> do it
>>>>> all without locking.
>>>>>
>>>>> This series can be found as a branch here:
>>>>>
>>>>> https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syncobj-wait-submit-v4
>>>>>
>>>>>
>>>>> IGT tests for DRM_IOCTL_SYNCOBJ_WAIT and DRM_IOCTL_SYNCOBJ_RESET can be
>>>>> found on patchwork here:
>>>>>
>>>>> https://patchwork.freedesktop.org/series/28666/
>>>>>
>>>>> Patches to the Intel Vulkan driver to implement
>>>>> VK_KHR_external_fence on
>>>>> top of this kernel interface can be found here:
>>>>>
>>>>> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-external-fence
>>>>>
>>>>>
>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>>
>>>>> Chris Wilson (2):
>>>>>    dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
>>>>>    dma-buf/dma-fence: Add a mechanism for proxy fences
>>>>>
>>>>> Dave Airlie (1):
>>>>>    drm/syncobj: add sync obj wait interface. (v8)
>>>>>
>>>>> Jason Ekstrand (6):
>>>>>    drm/syncobj: Rename fence_get to find_fence
>>>>>    drm/syncobj: Add a race-free drm_syncobj_fence_get helper
>>>>>    i915: Add support for drm syncobjs
>>>>>    dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2)
>>>>>    drm/syncobj: Add a reset ioctl
>>>>>    drm/syncobj: Allow wait for submit and signal behavior (v4)
>>>>>
>>>>>   drivers/dma-buf/Makefile                   |   4 +-
>>>>>   drivers/dma-buf/dma-fence-proxy.c          | 186 +++++++++++++++++++
>>>>>   drivers/dma-buf/dma-fence.c                |  34 ++--
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   2 +-
>>>>>   drivers/gpu/drm/drm_internal.h             |   4 +
>>>>>   drivers/gpu/drm/drm_ioctl.c                |   4 +
>>>>>   drivers/gpu/drm/drm_syncobj.c              | 275
>>>>> +++++++++++++++++++++++++++--
>>>>>   drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>>>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++-
>>>>>   include/drm/drm_syncobj.h                  |  15 +-
>>>>>   include/linux/dma-fence-proxy.h            |  25 +++
>>>>>   include/uapi/drm/drm.h                     |  19 ++
>>>>>   include/uapi/drm/i915_drm.h                |  30 +++-
>>>>>   13 files changed, 710 insertions(+), 37 deletions(-)
>>>>>   create mode 100644 drivers/dma-buf/dma-fence-proxy.c
>>>>>   create mode 100644 include/linux/dma-fence-proxy.h
>>>>>
>>>>
>>>
>>>
>>
>
>


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-13 23:14         ` Jason Ekstrand
  2017-08-14  5:49           ` Jason Ekstrand
@ 2017-08-14  7:36           ` Christian König
  2017-08-14 15:08             ` Jason Ekstrand
  1 sibling, 1 reply; 55+ messages in thread
From: Christian König @ 2017-08-14  7:36 UTC (permalink / raw)
  To: Jason Ekstrand, dri-devel; +Cc: Jason Ekstrand, Dave Airlie

Am 14.08.2017 um 01:14 schrieb Jason Ekstrand:
> On August 13, 2017 8:52:21 AM Christian König 
> <christian.koenig@amd.com> wrote:
>
>> Am 13.08.2017 um 17:26 schrieb Jason Ekstrand:
>>> On August 13, 2017 6:19:53 AM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>
>>>> Patches #1-#4 are Acked-by: Christian König 
>>>> <christian.koenig@amd.com>.
>>>>
>>>> Patch #5: NAK, that will break radeon.
>>>>
>>>> On radeon we need the non-default wait or otherwise we can run into a
>>>> situation where we never signal a fence.
>>>>
>>>> The general question is why do you need this?
>>>
>>> Because i915 sets a non-default wait function so calling wait_any just
>>> bails with fences from i915 immediately bails with -EINVAL. This makes
>>> it work even with non-default waits.
>>
>> Ok well, let me refine the question: Why does i915 sets a non-default
>> wait function?
>
> I have no idea.

Can you figure that out? I'm not completely against removing that 
limitation, but it would be a lot cleaner if we can just fix i915 to not 
set a non-default wait function.

>> In radeon we have it because we need to handle 10+ years of different
>> hardware generation, each which a bunch of separate bugs in their fence
>> handling (and some even not solved by today).
>
> So how does wait_any returning -EINVAL for non-default waits help radeon?

When the wait function detects a problem it reports -EDEADLK to the 
caller to signal that the hardware is stuck.

The Caller then goes up the call chain and ultimately reports this to 
the IOCTL where the error is handled with a hardware reset. And yeah, I 
perfectly know that this design sucks badly.

The point is we should try to prevent that the wait for any function is 
even used together with a radeon fence.

>
>
>>>> Patch #6: Yes, please. Patch is Reviewed-by: Christian König
>>>> <christian.koenig@amd.com>.
>>>>
>>>> Patch #7: Already gave my rb on the patch Chris send out earlier.
>>>>
>>>> Patch #8: NAK to the whole approach.
>>>>
>>>> IIRC we discussed a very similar thing during the initial fence 
>>>> bringup
>>>> and also during the fence_array development.
>>>>
>>>> The problem is that you can easily build ring dependencies and so
>>>> deadlocks with it.
>>>>
>>>> I would really prefer an approach which is completely contained inside
>>>> the syncobj code base.
>>>
>>> Are you use to the approach of internally making a proxy so long as
>>> all the proxy code is inside syncobj?
>> Yes, that would be a start.
>>
>> In general if possible I would rather like to avoid the whole handling
>> with the proxy/callback altogether, but that possible only works with
>> wait for any if the waitqueue is global and that wouldn't be ideal 
>> either.
>
> I'm happy to get rid of the proxies.  They did work nicely but I'm not 
> really attached to them
>
>>> Is also be happy to go back to the original approach with v3 of the
>>> last patch.
>> v3 looked like it should work as well, I would just drop abusing the
>> fence callback structure for the signaling.
>>
>> Ideally we would finally come up with an interface to wait for multiple
>> waitqueue at the same time, but that probably goes a bit to far.
>>
>> For now just use a single linked list to start all processes waiting for
>> a fence to arrive or something like this.
>
> I don't really know what you're suggesting.  Patch v3 has a single 
> waitqueue per process.  Are you suggesting one per fence?

Yeah, more or less. What I'm suggesting is to use one wait_queue_head_t 
per drm_syncobj.

See a wait_queue is a callback mechanism anyway, so you are wrapping a 
callback mechanism inside another callback mechanism and that makes not 
really much sense.

The problem is that we don't have a wait_event_* variant which can wait 
for multiple events. Somebody should really add something like that :)

Regards,
Christian.

>
> --Jason
>
>> Regards,
>> Christian.
>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 12.08.2017 um 00:39 schrieb Jason Ekstrand:
>>>>> This series does the same thing as my earlier series in that it adds
>>>>> a sync
>>>>> object wait interface complete with WAIT_FOR_SUBMIT flag. While the
>>>>> uapi
>>>>> remains unchanged, the guts look a bit different.  Instead of 
>>>>> adding a
>>>>> callback mechanism to drm_syncobj that fired whenever 
>>>>> replace_fence was
>>>>> called, it's now using proxy fences.  The drm_syncobj_fence_get still
>>>>> returns NULL whenever the sync object is in an unsubmitted state but
>>>>> there
>>>>> is a new drm_syncobj_fence_proxy_get which returns either the real
>>>>> fence or
>>>>> a proxy fence that will be triggered the next time replace_fence is
>>>>> called
>>>>> with a non-NULL replacement.  This does make both
>>>>> drm_syncobj_fence_get and
>>>>> drm_syncobj_replace_fence a tiny bit more expensive, but it lets us
>>>>> do it
>>>>> all without locking.
>>>>>
>>>>> This series can be found as a branch here:
>>>>>
>>>>> https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syncobj-wait-submit-v4 
>>>>>
>>>>>
>>>>>
>>>>> IGT tests for DRM_IOCTL_SYNCOBJ_WAIT and DRM_IOCTL_SYNCOBJ_RESET 
>>>>> can be
>>>>> found on patchwork here:
>>>>>
>>>>> https://patchwork.freedesktop.org/series/28666/
>>>>>
>>>>> Patches to the Intel Vulkan driver to implement
>>>>> VK_KHR_external_fence on
>>>>> top of this kernel interface can be found here:
>>>>>
>>>>> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-external-fence 
>>>>>
>>>>>
>>>>>
>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>>
>>>>> Chris Wilson (2):
>>>>>    dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
>>>>>    dma-buf/dma-fence: Add a mechanism for proxy fences
>>>>>
>>>>> Dave Airlie (1):
>>>>>    drm/syncobj: add sync obj wait interface. (v8)
>>>>>
>>>>> Jason Ekstrand (6):
>>>>>    drm/syncobj: Rename fence_get to find_fence
>>>>>    drm/syncobj: Add a race-free drm_syncobj_fence_get helper
>>>>>    i915: Add support for drm syncobjs
>>>>>    dma-buf/dma-fence: Allow wait_any_timeout without default_wait 
>>>>> (v2)
>>>>>    drm/syncobj: Add a reset ioctl
>>>>>    drm/syncobj: Allow wait for submit and signal behavior (v4)
>>>>>
>>>>>   drivers/dma-buf/Makefile                   |   4 +-
>>>>>   drivers/dma-buf/dma-fence-proxy.c          | 186 
>>>>> +++++++++++++++++++
>>>>>   drivers/dma-buf/dma-fence.c                |  34 ++--
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   2 +-
>>>>>   drivers/gpu/drm/drm_internal.h             |   4 +
>>>>>   drivers/gpu/drm/drm_ioctl.c                |   4 +
>>>>>   drivers/gpu/drm/drm_syncobj.c              | 275
>>>>> +++++++++++++++++++++++++++--
>>>>>   drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>>>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++-
>>>>>   include/drm/drm_syncobj.h                  |  15 +-
>>>>>   include/linux/dma-fence-proxy.h            |  25 +++
>>>>>   include/uapi/drm/drm.h                     |  19 ++
>>>>>   include/uapi/drm/i915_drm.h                |  30 +++-
>>>>>   13 files changed, 710 insertions(+), 37 deletions(-)
>>>>>   create mode 100644 drivers/dma-buf/dma-fence-proxy.c
>>>>>   create mode 100644 include/linux/dma-fence-proxy.h
>>>>>
>>>>
>>>
>>>
>>
>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-14  7:36           ` Christian König
@ 2017-08-14 15:08             ` Jason Ekstrand
  2017-08-16 15:52               ` Jason Ekstrand
  0 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-14 15:08 UTC (permalink / raw)
  To: Christian König
  Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 8772 bytes --]

On Mon, Aug 14, 2017 at 12:36 AM, Christian König <christian.koenig@amd.com>
wrote:

> Am 14.08.2017 um 01:14 schrieb Jason Ekstrand:
>
>> On August 13, 2017 8:52:21 AM Christian König <christian.koenig@amd.com>
>> wrote:
>>
>> Am 13.08.2017 um 17:26 schrieb Jason Ekstrand:
>>>
>>>> On August 13, 2017 6:19:53 AM Christian König
>>>> <christian.koenig@amd.com> wrote:
>>>>
>>>> Patches #1-#4 are Acked-by: Christian König <christian.koenig@amd.com>.
>>>>>
>>>>> Patch #5: NAK, that will break radeon.
>>>>>
>>>>> On radeon we need the non-default wait or otherwise we can run into a
>>>>> situation where we never signal a fence.
>>>>>
>>>>> The general question is why do you need this?
>>>>>
>>>>
>>>> Because i915 sets a non-default wait function so calling wait_any just
>>>> bails with fences from i915 immediately bails with -EINVAL. This makes
>>>> it work even with non-default waits.
>>>>
>>>
>>> Ok well, let me refine the question: Why does i915 sets a non-default
>>> wait function?
>>>
>>
>> I have no idea.
>>
>
> Can you figure that out? I'm not completely against removing that
> limitation, but it would be a lot cleaner if we can just fix i915 to not
> set a non-default wait function.
>

I asked on IRC how bad it would be and chris replied with "a lot of work".
He didn't say it's impossible but, apparently, it is a pile of work.


> In radeon we have it because we need to handle 10+ years of different
>>> hardware generation, each which a bunch of separate bugs in their fence
>>> handling (and some even not solved by today).
>>>
>>
>> So how does wait_any returning -EINVAL for non-default waits help radeon?
>>
>
> When the wait function detects a problem it reports -EDEADLK to the caller
> to signal that the hardware is stuck.
>
> The Caller then goes up the call chain and ultimately reports this to the
> IOCTL where the error is handled with a hardware reset. And yeah, I
> perfectly know that this design sucks badly.
>

Could we use dma_fence::err for this?  i.e., could the radeon driver set
the error bit and then have wait_any check for errors on wakeup and report
the first one it sees?


> The point is we should try to prevent that the wait for any function is
> even used together with a radeon fence.
>

Does radeon not support sync_file?  Because this is the same mechanism it
uses for poll.


>
>>
>> Patch #6: Yes, please. Patch is Reviewed-by: Christian König
>>>>> <christian.koenig@amd.com>.
>>>>>
>>>>> Patch #7: Already gave my rb on the patch Chris send out earlier.
>>>>>
>>>>> Patch #8: NAK to the whole approach.
>>>>>
>>>>> IIRC we discussed a very similar thing during the initial fence bringup
>>>>> and also during the fence_array development.
>>>>>
>>>>> The problem is that you can easily build ring dependencies and so
>>>>> deadlocks with it.
>>>>>
>>>>> I would really prefer an approach which is completely contained inside
>>>>> the syncobj code base.
>>>>>
>>>>
>>>> Are you use to the approach of internally making a proxy so long as
>>>> all the proxy code is inside syncobj?
>>>>
>>> Yes, that would be a start.
>>>
>>> In general if possible I would rather like to avoid the whole handling
>>> with the proxy/callback altogether, but that possible only works with
>>> wait for any if the waitqueue is global and that wouldn't be ideal
>>> either.
>>>
>>
>> I'm happy to get rid of the proxies.  They did work nicely but I'm not
>> really attached to them
>>
>> Is also be happy to go back to the original approach with v3 of the
>>>> last patch.
>>>>
>>> v3 looked like it should work as well, I would just drop abusing the
>>> fence callback structure for the signaling.
>>>
>>> Ideally we would finally come up with an interface to wait for multiple
>>> waitqueue at the same time, but that probably goes a bit to far.
>>>
>>> For now just use a single linked list to start all processes waiting for
>>> a fence to arrive or something like this.
>>>
>>
>> I don't really know what you're suggesting.  Patch v3 has a single
>> waitqueue per process.  Are you suggesting one per fence?
>>
>
> Yeah, more or less. What I'm suggesting is to use one wait_queue_head_t
> per drm_syncobj.
>
> See a wait_queue is a callback mechanism anyway, so you are wrapping a
> callback mechanism inside another callback mechanism and that makes not
> really much sense.
>

Fair enough.  There is one little snag though:  We need to wait on sync
objects and fences at the same time in order for WAIT_ANY | WAIT_FOR_SUBMIT
to work.  I see two options here:

 1) Convert dma-fence to use waitqueue instead of its callback mechanism
and add a wait_queue_any.  A quick grep for dma_fence_add_callback says
that this would affect four drivers.

 2) Drop waitqueues and go back and fix patch v2 so that it does the wait
correctly.

--Jason


> The problem is that we don't have a wait_event_* variant which can wait
> for multiple events. Somebody should really add something like that :)
>
> Regards,
> Christian.
>
>
>
>> --Jason
>>
>> Regards,
>>> Christian.
>>>
>>>
>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> Am 12.08.2017 um 00:39 schrieb Jason Ekstrand:
>>>>>
>>>>>> This series does the same thing as my earlier series in that it adds
>>>>>> a sync
>>>>>> object wait interface complete with WAIT_FOR_SUBMIT flag. While the
>>>>>> uapi
>>>>>> remains unchanged, the guts look a bit different.  Instead of adding a
>>>>>> callback mechanism to drm_syncobj that fired whenever replace_fence
>>>>>> was
>>>>>> called, it's now using proxy fences.  The drm_syncobj_fence_get still
>>>>>> returns NULL whenever the sync object is in an unsubmitted state but
>>>>>> there
>>>>>> is a new drm_syncobj_fence_proxy_get which returns either the real
>>>>>> fence or
>>>>>> a proxy fence that will be triggered the next time replace_fence is
>>>>>> called
>>>>>> with a non-NULL replacement.  This does make both
>>>>>> drm_syncobj_fence_get and
>>>>>> drm_syncobj_replace_fence a tiny bit more expensive, but it lets us
>>>>>> do it
>>>>>> all without locking.
>>>>>>
>>>>>> This series can be found as a branch here:
>>>>>>
>>>>>> https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syn
>>>>>> cobj-wait-submit-v4
>>>>>>
>>>>>>
>>>>>> IGT tests for DRM_IOCTL_SYNCOBJ_WAIT and DRM_IOCTL_SYNCOBJ_RESET can
>>>>>> be
>>>>>> found on patchwork here:
>>>>>>
>>>>>> https://patchwork.freedesktop.org/series/28666/
>>>>>>
>>>>>> Patches to the Intel Vulkan driver to implement
>>>>>> VK_KHR_external_fence on
>>>>>> top of this kernel interface can be found here:
>>>>>>
>>>>>> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-
>>>>>> external-fence
>>>>>>
>>>>>>
>>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>>>
>>>>>> Chris Wilson (2):
>>>>>>    dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
>>>>>>    dma-buf/dma-fence: Add a mechanism for proxy fences
>>>>>>
>>>>>> Dave Airlie (1):
>>>>>>    drm/syncobj: add sync obj wait interface. (v8)
>>>>>>
>>>>>> Jason Ekstrand (6):
>>>>>>    drm/syncobj: Rename fence_get to find_fence
>>>>>>    drm/syncobj: Add a race-free drm_syncobj_fence_get helper
>>>>>>    i915: Add support for drm syncobjs
>>>>>>    dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2)
>>>>>>    drm/syncobj: Add a reset ioctl
>>>>>>    drm/syncobj: Allow wait for submit and signal behavior (v4)
>>>>>>
>>>>>>   drivers/dma-buf/Makefile                   |   4 +-
>>>>>>   drivers/dma-buf/dma-fence-proxy.c          | 186
>>>>>> +++++++++++++++++++
>>>>>>   drivers/dma-buf/dma-fence.c                |  34 ++--
>>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   2 +-
>>>>>>   drivers/gpu/drm/drm_internal.h             |   4 +
>>>>>>   drivers/gpu/drm/drm_ioctl.c                |   4 +
>>>>>>   drivers/gpu/drm/drm_syncobj.c              | 275
>>>>>> +++++++++++++++++++++++++++--
>>>>>>   drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>>>>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++-
>>>>>>   include/drm/drm_syncobj.h                  |  15 +-
>>>>>>   include/linux/dma-fence-proxy.h            |  25 +++
>>>>>>   include/uapi/drm/drm.h                     |  19 ++
>>>>>>   include/uapi/drm/i915_drm.h                |  30 +++-
>>>>>>   13 files changed, 710 insertions(+), 37 deletions(-)
>>>>>>   create mode 100644 drivers/dma-buf/dma-fence-proxy.c
>>>>>>   create mode 100644 include/linux/dma-fence-proxy.h
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>

[-- Attachment #1.2: Type: text/html, Size: 12595 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-14 15:08             ` Jason Ekstrand
@ 2017-08-16 15:52               ` Jason Ekstrand
  2017-08-16 16:53                 ` Christian König
  0 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-16 15:52 UTC (permalink / raw)
  To: Christian König
  Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 10065 bytes --]

On Mon, Aug 14, 2017 at 8:08 AM, Jason Ekstrand <jason@jlekstrand.net>
wrote:

> On Mon, Aug 14, 2017 at 12:36 AM, Christian König <
> christian.koenig@amd.com> wrote:
>
>> Am 14.08.2017 um 01:14 schrieb Jason Ekstrand:
>>
>>> On August 13, 2017 8:52:21 AM Christian König <christian.koenig@amd.com>
>>> wrote:
>>>
>>> Am 13.08.2017 um 17:26 schrieb Jason Ekstrand:
>>>>
>>>>> On August 13, 2017 6:19:53 AM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>
>>>>> Patches #1-#4 are Acked-by: Christian König <christian.koenig@amd.com
>>>>>> >.
>>>>>>
>>>>>> Patch #5: NAK, that will break radeon.
>>>>>>
>>>>>> On radeon we need the non-default wait or otherwise we can run into a
>>>>>> situation where we never signal a fence.
>>>>>>
>>>>>> The general question is why do you need this?
>>>>>>
>>>>>
>>>>> Because i915 sets a non-default wait function so calling wait_any just
>>>>> bails with fences from i915 immediately bails with -EINVAL. This makes
>>>>> it work even with non-default waits.
>>>>>
>>>>
>>>> Ok well, let me refine the question: Why does i915 sets a non-default
>>>> wait function?
>>>>
>>>
>>> I have no idea.
>>>
>>
>> Can you figure that out? I'm not completely against removing that
>> limitation, but it would be a lot cleaner if we can just fix i915 to not
>> set a non-default wait function.
>>
>
> I asked on IRC how bad it would be and chris replied with "a lot of
> work".  He didn't say it's impossible but, apparently, it is a pile of work.
>
>
>> In radeon we have it because we need to handle 10+ years of different
>>>> hardware generation, each which a bunch of separate bugs in their fence
>>>> handling (and some even not solved by today).
>>>>
>>>
>>> So how does wait_any returning -EINVAL for non-default waits help radeon?
>>>
>>
>> When the wait function detects a problem it reports -EDEADLK to the
>> caller to signal that the hardware is stuck.
>>
>> The Caller then goes up the call chain and ultimately reports this to the
>> IOCTL where the error is handled with a hardware reset. And yeah, I
>> perfectly know that this design sucks badly.
>>
>
> Could we use dma_fence::err for this?  i.e., could the radeon driver set
> the error bit and then have wait_any check for errors on wakeup and report
> the first one it sees?
>
>
>> The point is we should try to prevent that the wait for any function is
>> even used together with a radeon fence.
>>
>
> Does radeon not support sync_file?  Because this is the same mechanism it
> uses for poll.
>
>
>>
>>>
>>> Patch #6: Yes, please. Patch is Reviewed-by: Christian König
>>>>>> <christian.koenig@amd.com>.
>>>>>>
>>>>>> Patch #7: Already gave my rb on the patch Chris send out earlier.
>>>>>>
>>>>>> Patch #8: NAK to the whole approach.
>>>>>>
>>>>>> IIRC we discussed a very similar thing during the initial fence
>>>>>> bringup
>>>>>> and also during the fence_array development.
>>>>>>
>>>>>> The problem is that you can easily build ring dependencies and so
>>>>>> deadlocks with it.
>>>>>>
>>>>>
For what it's worth, my proxy-based implementation does not suffer from the
same issues as the one suggested by Chris.  I was very careful to ensure
that drm_syncobj_fence_get will never return a proxy - if the current fence
is a proxy, it returns NULL.  It should look no different to drivers than
any of the other interfaces I've proposed from that perspective.


> I would really prefer an approach which is completely contained inside
>>>>>> the syncobj code base.
>>>>>>
>>>>>
>>>>> Are you use to the approach of internally making a proxy so long as
>>>>> all the proxy code is inside syncobj?
>>>>>
>>>> Yes, that would be a start.
>>>>
>>>> In general if possible I would rather like to avoid the whole handling
>>>> with the proxy/callback altogether, but that possible only works with
>>>> wait for any if the waitqueue is global and that wouldn't be ideal
>>>> either.
>>>>
>>>
>>> I'm happy to get rid of the proxies.  They did work nicely but I'm not
>>> really attached to them
>>>
>>> Is also be happy to go back to the original approach with v3 of the
>>>>> last patch.
>>>>>
>>>> v3 looked like it should work as well, I would just drop abusing the
>>>> fence callback structure for the signaling.
>>>>
>>>> Ideally we would finally come up with an interface to wait for multiple
>>>> waitqueue at the same time, but that probably goes a bit to far.
>>>>
>>>> For now just use a single linked list to start all processes waiting for
>>>> a fence to arrive or something like this.
>>>>
>>>
>>> I don't really know what you're suggesting.  Patch v3 has a single
>>> waitqueue per process.  Are you suggesting one per fence?
>>>
>>
>> Yeah, more or less. What I'm suggesting is to use one wait_queue_head_t
>> per drm_syncobj.
>>
>> See a wait_queue is a callback mechanism anyway, so you are wrapping a
>> callback mechanism inside another callback mechanism and that makes not
>> really much sense.
>>
>
> Fair enough.  There is one little snag though:  We need to wait on sync
> objects and fences at the same time in order for WAIT_ANY | WAIT_FOR_SUBMIT
> to work.  I see two options here:
>
>  1) Convert dma-fence to use waitqueue instead of its callback mechanism
> and add a wait_queue_any.  A quick grep for dma_fence_add_callback says
> that this would affect four drivers.
>

The more I think about it, the less sense using waitqueues makes.  The
fundamental problem here is that the event we are waiting on is actually
the concatenation of two events: submit and signal.  Since we are waiting
on several of these pairs of concatenated events simultaneously, the only
two options we have are to either combine them into one event (the proxy
approach) or to implement a wait which is capable of handling both at the
same time.  I don't see a way to do the latter with wait queues.

--Jason


>  2) Drop waitqueues and go back and fix patch v2 so that it does the wait
> correctly.
>
> --Jason
>
>
>> The problem is that we don't have a wait_event_* variant which can wait
>> for multiple events. Somebody should really add something like that :)
>>
>> Regards,
>> Christian.
>>
>>
>>
>>> --Jason
>>>
>>> Regards,
>>>> Christian.
>>>>
>>>>
>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> Am 12.08.2017 um 00:39 schrieb Jason Ekstrand:
>>>>>>
>>>>>>> This series does the same thing as my earlier series in that it adds
>>>>>>> a sync
>>>>>>> object wait interface complete with WAIT_FOR_SUBMIT flag. While the
>>>>>>> uapi
>>>>>>> remains unchanged, the guts look a bit different.  Instead of adding
>>>>>>> a
>>>>>>> callback mechanism to drm_syncobj that fired whenever replace_fence
>>>>>>> was
>>>>>>> called, it's now using proxy fences.  The drm_syncobj_fence_get still
>>>>>>> returns NULL whenever the sync object is in an unsubmitted state but
>>>>>>> there
>>>>>>> is a new drm_syncobj_fence_proxy_get which returns either the real
>>>>>>> fence or
>>>>>>> a proxy fence that will be triggered the next time replace_fence is
>>>>>>> called
>>>>>>> with a non-NULL replacement.  This does make both
>>>>>>> drm_syncobj_fence_get and
>>>>>>> drm_syncobj_replace_fence a tiny bit more expensive, but it lets us
>>>>>>> do it
>>>>>>> all without locking.
>>>>>>>
>>>>>>> This series can be found as a branch here:
>>>>>>>
>>>>>>> https://cgit.freedesktop.org/~jekstrand/linux/log/?h=drm-syn
>>>>>>> cobj-wait-submit-v4
>>>>>>>
>>>>>>>
>>>>>>> IGT tests for DRM_IOCTL_SYNCOBJ_WAIT and DRM_IOCTL_SYNCOBJ_RESET can
>>>>>>> be
>>>>>>> found on patchwork here:
>>>>>>>
>>>>>>> https://patchwork.freedesktop.org/series/28666/
>>>>>>>
>>>>>>> Patches to the Intel Vulkan driver to implement
>>>>>>> VK_KHR_external_fence on
>>>>>>> top of this kernel interface can be found here:
>>>>>>>
>>>>>>> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/anv-
>>>>>>> external-fence
>>>>>>>
>>>>>>>
>>>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>>>>
>>>>>>> Chris Wilson (2):
>>>>>>>    dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
>>>>>>>    dma-buf/dma-fence: Add a mechanism for proxy fences
>>>>>>>
>>>>>>> Dave Airlie (1):
>>>>>>>    drm/syncobj: add sync obj wait interface. (v8)
>>>>>>>
>>>>>>> Jason Ekstrand (6):
>>>>>>>    drm/syncobj: Rename fence_get to find_fence
>>>>>>>    drm/syncobj: Add a race-free drm_syncobj_fence_get helper
>>>>>>>    i915: Add support for drm syncobjs
>>>>>>>    dma-buf/dma-fence: Allow wait_any_timeout without default_wait
>>>>>>> (v2)
>>>>>>>    drm/syncobj: Add a reset ioctl
>>>>>>>    drm/syncobj: Allow wait for submit and signal behavior (v4)
>>>>>>>
>>>>>>>   drivers/dma-buf/Makefile                   |   4 +-
>>>>>>>   drivers/dma-buf/dma-fence-proxy.c          | 186
>>>>>>> +++++++++++++++++++
>>>>>>>   drivers/dma-buf/dma-fence.c                |  34 ++--
>>>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |   2 +-
>>>>>>>   drivers/gpu/drm/drm_internal.h             |   4 +
>>>>>>>   drivers/gpu/drm/drm_ioctl.c                |   4 +
>>>>>>>   drivers/gpu/drm/drm_syncobj.c              | 275
>>>>>>> +++++++++++++++++++++++++++--
>>>>>>>   drivers/gpu/drm/i915/i915_drv.c            |   3 +-
>>>>>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 146 ++++++++++++++-
>>>>>>>   include/drm/drm_syncobj.h                  |  15 +-
>>>>>>>   include/linux/dma-fence-proxy.h            |  25 +++
>>>>>>>   include/uapi/drm/drm.h                     |  19 ++
>>>>>>>   include/uapi/drm/i915_drm.h                |  30 +++-
>>>>>>>   13 files changed, 710 insertions(+), 37 deletions(-)
>>>>>>>   create mode 100644 drivers/dma-buf/dma-fence-proxy.c
>>>>>>>   create mode 100644 include/linux/dma-fence-proxy.h
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>

[-- Attachment #1.2: Type: text/html, Size: 15303 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-16 15:52               ` Jason Ekstrand
@ 2017-08-16 16:53                 ` Christian König
  2017-08-16 20:10                   ` Jason Ekstrand
  0 siblings, 1 reply; 55+ messages in thread
From: Christian König @ 2017-08-16 16:53 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 1606 bytes --]

> [SNIP]
>
>         See a wait_queue is a callback mechanism anyway, so you are
>         wrapping a callback mechanism inside another callback
>         mechanism and that makes not really much sense.
>
>
>     Fair enough.  There is one little snag though:  We need to wait on
>     sync objects and fences at the same time in order for WAIT_ANY |
>     WAIT_FOR_SUBMIT to work. I see two options here:
>
>      1) Convert dma-fence to use waitqueue instead of its callback
>     mechanism and add a wait_queue_any.  A quick grep for
>     dma_fence_add_callback says that this would affect four drivers.
>
>
> The more I think about it, the less sense using waitqueues makes.  The 
> fundamental problem here is that the event we are waiting on is 
> actually the concatenation of two events: submit and signal.  Since we 
> are waiting on several of these pairs of concatenated events 
> simultaneously, the only two options we have are to either combine 
> them into one event (the proxy approach) or to implement a wait which 
> is capable of handling both at the same time.  I don't see a way to do 
> the latter with wait queues.

Agree completely.

Essentially we would need to enable wait_event_* to wait for multiple 
events and then convert all the fence callback stuff to wait_event 
structures.

But that is certainly outside the scope of this patchset, so feel free 
to go ahead with the approach of waiting manually (but please without 
the bugs).

Well if you got a student/interim with free time that would certainly be 
a nice cleanup task to start on kernel work.

Regards,
Christian.

[-- Attachment #1.2: Type: text/html, Size: 3011 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-16 16:53                 ` Christian König
@ 2017-08-16 20:10                   ` Jason Ekstrand
  2017-08-21 21:42                     ` Jason Ekstrand
  0 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-16 20:10 UTC (permalink / raw)
  To: Christian König
  Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 1786 bytes --]

On Wed, Aug 16, 2017 at 9:53 AM, Christian König <christian.koenig@amd.com>
wrote:

> [SNIP]
>
>> See a wait_queue is a callback mechanism anyway, so you are wrapping a
>>> callback mechanism inside another callback mechanism and that makes not
>>> really much sense.
>>>
>>
>> Fair enough.  There is one little snag though:  We need to wait on sync
>> objects and fences at the same time in order for WAIT_ANY | WAIT_FOR_SUBMIT
>> to work.  I see two options here:
>>
>>  1) Convert dma-fence to use waitqueue instead of its callback mechanism
>> and add a wait_queue_any.  A quick grep for dma_fence_add_callback says
>> that this would affect four drivers.
>>
>
> The more I think about it, the less sense using waitqueues makes.  The
> fundamental problem here is that the event we are waiting on is actually
> the concatenation of two events: submit and signal.  Since we are waiting
> on several of these pairs of concatenated events simultaneously, the only
> two options we have are to either combine them into one event (the proxy
> approach) or to implement a wait which is capable of handling both at the
> same time.  I don't see a way to do the latter with wait queues.
>
>
> Agree completely.
>
> Essentially we would need to enable wait_event_* to wait for multiple
> events and then convert all the fence callback stuff to wait_event
> structures.
>
> But that is certainly outside the scope of this patchset, so feel free to
> go ahead with the approach of waiting manually (but please without the
> bugs).
>

Well, the patches I sent last night should do just that.  It's mostly the
original approach but with the bugfixes from versions 3 and 4.  Modulo
finding additional bugs, I think they should be good to go.

--Jason

[-- Attachment #1.2: Type: text/html, Size: 3518 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-16 20:10                   ` Jason Ekstrand
@ 2017-08-21 21:42                     ` Jason Ekstrand
  2017-08-22  8:30                       ` Christian König
  0 siblings, 1 reply; 55+ messages in thread
From: Jason Ekstrand @ 2017-08-21 21:42 UTC (permalink / raw)
  To: Christian König
  Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 1917 bytes --]

On Wed, Aug 16, 2017 at 1:10 PM, Jason Ekstrand <jason@jlekstrand.net>
wrote:

> On Wed, Aug 16, 2017 at 9:53 AM, Christian König <christian.koenig@amd.com
> > wrote:
>
>> [SNIP]
>>
>>> See a wait_queue is a callback mechanism anyway, so you are wrapping a
>>>> callback mechanism inside another callback mechanism and that makes not
>>>> really much sense.
>>>>
>>>
>>> Fair enough.  There is one little snag though:  We need to wait on sync
>>> objects and fences at the same time in order for WAIT_ANY | WAIT_FOR_SUBMIT
>>> to work.  I see two options here:
>>>
>>>  1) Convert dma-fence to use waitqueue instead of its callback mechanism
>>> and add a wait_queue_any.  A quick grep for dma_fence_add_callback says
>>> that this would affect four drivers.
>>>
>>
>> The more I think about it, the less sense using waitqueues makes.  The
>> fundamental problem here is that the event we are waiting on is actually
>> the concatenation of two events: submit and signal.  Since we are waiting
>> on several of these pairs of concatenated events simultaneously, the only
>> two options we have are to either combine them into one event (the proxy
>> approach) or to implement a wait which is capable of handling both at the
>> same time.  I don't see a way to do the latter with wait queues.
>>
>>
>> Agree completely.
>>
>> Essentially we would need to enable wait_event_* to wait for multiple
>> events and then convert all the fence callback stuff to wait_event
>> structures.
>>
>> But that is certainly outside the scope of this patchset, so feel free to
>> go ahead with the approach of waiting manually (but please without the
>> bugs).
>>
>
> Well, the patches I sent last night should do just that.  It's mostly the
> original approach but with the bugfixes from versions 3 and 4.  Modulo
> finding additional bugs, I think they should be good to go.
>

ping?

[-- Attachment #1.2: Type: text/html, Size: 3903 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2)
  2017-08-21 21:42                     ` Jason Ekstrand
@ 2017-08-22  8:30                       ` Christian König
  0 siblings, 0 replies; 55+ messages in thread
From: Christian König @ 2017-08-22  8:30 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: Jason Ekstrand, Dave Airlie, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 2520 bytes --]

Am 21.08.2017 um 23:42 schrieb Jason Ekstrand:
>
>
> On Wed, Aug 16, 2017 at 1:10 PM, Jason Ekstrand <jason@jlekstrand.net 
> <mailto:jason@jlekstrand.net>> wrote:
>
>     On Wed, Aug 16, 2017 at 9:53 AM, Christian König
>     <christian.koenig@amd.com <mailto:christian.koenig@amd.com>> wrote:
>
>>         [SNIP]
>>
>>                 See a wait_queue is a callback mechanism anyway, so
>>                 you are wrapping a callback mechanism inside another
>>                 callback mechanism and that makes not really much sense.
>>
>>
>>             Fair enough.  There is one little snag though:  We need
>>             to wait on sync objects and fences at the same time in
>>             order for WAIT_ANY | WAIT_FOR_SUBMIT to work.  I see two
>>             options here:
>>
>>              1) Convert dma-fence to use waitqueue instead of its
>>             callback mechanism and add a wait_queue_any.  A quick
>>             grep for dma_fence_add_callback says that this would
>>             affect four drivers.
>>
>>
>>         The more I think about it, the less sense using waitqueues
>>         makes.  The fundamental problem here is that the event we are
>>         waiting on is actually the concatenation of two events:
>>         submit and signal.  Since we are waiting on several of these
>>         pairs of concatenated events simultaneously, the only two
>>         options we have are to either combine them into one event
>>         (the proxy approach) or to implement a wait which is capable
>>         of handling both at the same time.  I don't see a way to do
>>         the latter with wait queues.
>
>         Agree completely.
>
>         Essentially we would need to enable wait_event_* to wait for
>         multiple events and then convert all the fence callback stuff
>         to wait_event structures.
>
>         But that is certainly outside the scope of this patchset, so
>         feel free to go ahead with the approach of waiting manually
>         (but please without the bugs).
>
>
>     Well, the patches I sent last night should do just that.  It's
>     mostly the original approach but with the bugfixes from versions 3
>     and 4. Modulo finding additional bugs, I think they should be good
>     to go.
>
> ping?
I just way to much to do at the moment (as usually). Feel free to add an 
Acked-by: Christian König <christian.koenig@amd.com> to the patches in 
the meantime, but an detailed review would have to wait a bit.

Sorry for the delay,
Christian.

[-- Attachment #1.2: Type: text/html, Size: 7645 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 7/9] dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
  2017-08-11 22:39   ` [PATCH 7/9] dma-buf/dma-fence: Signal all callbacks from dma_fence_release() Jason Ekstrand
@ 2018-01-31 12:32     ` Gustavo Padovan
  2018-01-31 15:53       ` Chris Wilson
  0 siblings, 1 reply; 55+ messages in thread
From: Gustavo Padovan @ 2018-01-31 12:32 UTC (permalink / raw)
  To: Jason Ekstrand; +Cc: dri-devel

Hi,

2017-08-11 Jason Ekstrand <jason@jlekstrand.net>:

> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> This is an illegal scenario, to free the fence whilst there are pending
> callbacks. Currently, we emit a WARN and then cast aside the callbacks
> leaving them dangling. Alternatively, we could set an error on the fence
> and then signal fence so that any dependency chains from the fence can
> be tidied up, and if they care they can check for the error.
> 
> The question is whether or not the cure is worse than the disease
> (premature fence signaling is never pretty).
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/dma-buf/dma-fence.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 0cac367..4062708 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -172,7 +172,19 @@ void dma_fence_release(struct kref *kref)
>  
>  	trace_dma_fence_destroy(fence);
>  
> -	WARN_ON(!list_empty(&fence->cb_list));
> +	if (WARN_ON(!list_empty(&fence->cb_list))) {
> +		unsigned long flags;
> +
> +		/*
> +		 * This should never happen, but if it does make sure that we
> +		 * don't leave chains dangling. We set the error flag first
> +		 * so that the callbacks know this signal is due to an error.
> +		 */
> +		spin_lock_irqsave(fence->lock, flags);
> +		fence->error = -EDEADLK;
> +		dma_fence_signal_locked(fence);
> +		spin_unlock_irqrestore(fence->lock, flags);
> +	}

Thanks for the patch!

Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com>

>  
>  	if (fence->ops->release)
>  		fence->ops->release(fence);
> -- 
> 2.5.0.400.gff86faf
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 7/9] dma-buf/dma-fence: Signal all callbacks from dma_fence_release()
  2018-01-31 12:32     ` Gustavo Padovan
@ 2018-01-31 15:53       ` Chris Wilson
  0 siblings, 0 replies; 55+ messages in thread
From: Chris Wilson @ 2018-01-31 15:53 UTC (permalink / raw)
  To: Gustavo Padovan, Jason Ekstrand; +Cc: dri-devel

Quoting Gustavo Padovan (2018-01-31 12:32:21)
> Hi,
> 
> 2017-08-11 Jason Ekstrand <jason@jlekstrand.net>:
> 
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > This is an illegal scenario, to free the fence whilst there are pending
> > callbacks. Currently, we emit a WARN and then cast aside the callbacks
> > leaving them dangling. Alternatively, we could set an error on the fence
> > and then signal fence so that any dependency chains from the fence can
> > be tidied up, and if they care they can check for the error.
> > 
> > The question is whether or not the cure is worse than the disease
> > (premature fence signaling is never pretty).
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/dma-buf/dma-fence.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index 0cac367..4062708 100644
> > --- a/drivers/dma-buf/dma-fence.c
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -172,7 +172,19 @@ void dma_fence_release(struct kref *kref)
> >  
> >       trace_dma_fence_destroy(fence);
> >  
> > -     WARN_ON(!list_empty(&fence->cb_list));
> > +     if (WARN_ON(!list_empty(&fence->cb_list))) {
> > +             unsigned long flags;
> > +
> > +             /*
> > +              * This should never happen, but if it does make sure that we
> > +              * don't leave chains dangling. We set the error flag first
> > +              * so that the callbacks know this signal is due to an error.
> > +              */
> > +             spin_lock_irqsave(fence->lock, flags);
> > +             fence->error = -EDEADLK;
> > +             dma_fence_signal_locked(fence);
> > +             spin_unlock_irqrestore(fence->lock, flags);
> > +     }
> 
> Thanks for the patch!
> 
> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com>

If you are picking up this version, please also include
a Reviewed-by: Christian König <christian.koenig@amd.com> given to an
earlier posting, which included a discussion on the pitiful lack of
EHUMAN.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2018-01-31 15:54 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-08 22:46 [PATCH 0/9] drm/syncobj: Add full-featured wait support Jason Ekstrand
2017-08-08 22:46 ` [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence Jason Ekstrand
2017-08-08 22:46 ` [PATCH 2/9] drm/syncobj: Lock around drm_syncobj::fence Jason Ekstrand
2017-08-09 21:21   ` Chris Wilson
2017-08-10  0:31     ` Jason Ekstrand
2017-08-10 10:55       ` Chris Wilson
2017-08-10 19:10       ` Chris Wilson
2017-08-08 22:46 ` [PATCH 3/9] drm/syncobj: Remove file_private from replace_fence Jason Ekstrand
2017-08-08 22:46 ` [PATCH 4/9] i915: Add support for drm syncobjs Jason Ekstrand
2017-08-08 22:46 ` [PATCH 5/9] drm/syncobj: add sync obj wait interface. (v8) Jason Ekstrand
2017-08-08 22:46 ` [PATCH 6/9] dma-buf/dma-fence: Allow wait_any_timeout without default_wait Jason Ekstrand
2017-08-08 22:46 ` [PATCH 7/9] drm/syncobj: Add a reset ioctl Jason Ekstrand
2017-08-08 22:46 ` [PATCH 8/9] drm/syncobj: Add a callback mechanism for replace_fence Jason Ekstrand
2017-08-08 22:46 ` [PATCH 9/9] drm/syncobj: Allow wait for submit and signal behavior Jason Ekstrand
2017-08-09 17:00   ` [PATCH] drm/syncobj: Allow wait for submit and signal behavior (v2) Jason Ekstrand
2017-08-09 17:57     ` Chris Wilson
2017-08-09 18:25       ` Christian König
2017-08-09 21:09         ` Jason Ekstrand
2017-08-09 22:41       ` Chris Wilson
2017-08-09 23:53         ` Jason Ekstrand
2017-08-10 11:00           ` Chris Wilson
2017-08-10 14:42             ` Jason Ekstrand
2017-08-10 12:26           ` Christian König
2017-08-10 14:32             ` Jason Ekstrand
2017-08-10 14:41               ` Christian König
2017-08-09 21:31     ` Chris Wilson
2017-08-09 21:54       ` Jason Ekstrand
2017-08-10 12:26       ` Christian König
2017-08-10  1:35     ` [PATCH v3 9/9] drm/syncobj: Allow wait for submit and signal behavior (v3) Jason Ekstrand
2017-08-11 22:39 ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Jason Ekstrand
2017-08-11 22:39   ` [PATCH 1/9] drm/syncobj: Rename fence_get to find_fence Jason Ekstrand
2017-08-11 22:39   ` [PATCH 2/9] drm/syncobj: Add a race-free drm_syncobj_fence_get helper Jason Ekstrand
2017-08-14  2:03     ` kbuild test robot
2017-08-11 22:39   ` [PATCH 3/9] i915: Add support for drm syncobjs Jason Ekstrand
2017-08-14  2:58     ` Jason Ekstrand
2017-08-11 22:39   ` [PATCH 4/9] drm/syncobj: add sync obj wait interface. (v8) Jason Ekstrand
2017-08-11 22:39   ` [PATCH 5/9] dma-buf/dma-fence: Allow wait_any_timeout without default_wait (v2) Jason Ekstrand
2017-08-11 22:39   ` [PATCH 6/9] drm/syncobj: Add a reset ioctl Jason Ekstrand
2017-08-11 22:39   ` [PATCH 7/9] dma-buf/dma-fence: Signal all callbacks from dma_fence_release() Jason Ekstrand
2018-01-31 12:32     ` Gustavo Padovan
2018-01-31 15:53       ` Chris Wilson
2017-08-11 22:39   ` [PATCH 8/9] dma-buf/dma-fence: Add a mechanism for proxy fences Jason Ekstrand
2017-08-11 22:39   ` [PATCH 9/9] drm/syncobj: Allow wait for submit and signal behavior (v4) Jason Ekstrand
2017-08-13 13:19   ` [PATCH 0/9] drm/syncobj: Add full-featured wait support (v2) Christian König
2017-08-13 15:26     ` Jason Ekstrand
2017-08-13 15:52       ` Christian König
2017-08-13 23:14         ` Jason Ekstrand
2017-08-14  5:49           ` Jason Ekstrand
2017-08-14  7:36           ` Christian König
2017-08-14 15:08             ` Jason Ekstrand
2017-08-16 15:52               ` Jason Ekstrand
2017-08-16 16:53                 ` Christian König
2017-08-16 20:10                   ` Jason Ekstrand
2017-08-21 21:42                     ` Jason Ekstrand
2017-08-22  8:30                       ` Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.